
Hi,
 
I have field indexes that look something like this:
 
Row Id: <date>-<UUID>
CF: fi||<type>||<value>
CQ: <date>-<UUID>
 
For example: 

20130814-550e8400-e29b-41d4-a716-446655440000 fi||verb||run 20130814-550e8400-e29b-41d4-a716-446655440000
20130814-550e8400-e29b-41d4-a716-446655440000 page||58 line||16 "the boy can run up the hill"
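
To make the layout above unambiguous, here is a minimal sketch of how the key parts are composed. It uses plain Java strings only, not the Accumulo API, and the class and method names (FieldIndexKeySketch, docId, fieldIndexCf) are made up purely for illustration.

```java
import java.util.UUID;

// Illustrative only: plain-string sketch of the key layout described above.
// FieldIndexKeySketch and its methods are hypothetical names, not Accumulo API.
final class FieldIndexKeySketch {

    // Row ID and column qualifier share the form <date>-<UUID>.
    static String docId(String date, UUID uuid) {
        return date + "-" + uuid;
    }

    // Column family of a field index entry: fi||<type>||<value>.
    static String fieldIndexCf(String type, String value) {
        return "fi||" + type + "||" + value;
    }

    public static void main(String[] args) {
        UUID uuid = UUID.fromString("550e8400-e29b-41d4-a716-446655440000");
        String id = docId("20130814", uuid);
        // Prints the row, column family and column qualifier of one index entry.
        System.out.println(id + " " + fieldIndexCf("verb", "run") + " " + id);
    }
}
```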

From what I could determine from the doco and API, I am executing the following code to perform an intersecting query on two values:

Set<Range> shards = new HashSet<Range>();

Text[] terms = {new Text("fi||<type>||<value>"), new Text("fi||<type>||<value>")};

BatchScanner bs = conn.createBatchScanner(table, auths, 20);
bs.setTimeout(360, TimeUnit.SECONDS);

IteratorSetting iter = new IteratorSetting(20, "ii", IntersectingIterator.class);
IntersectingIterator.setColumnFamilies(iter, terms);
bs.addScanIterator(iter);

bs.setRanges(Collections.singleton(new Range()));

for (Entry<Key,Value> entry : bs) {
    shards.add(new Range(entry.getKey().getColumnQualifier()));
}

I then perform a second batch scan using the set of ranges returned by the 
above to get my actual results.
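
To make the intent of the two steps concrete, here is a minimal in-memory sketch of the result I am after: the set of document IDs (column qualifiers) that appear under every queried field index term. It uses only java.util collections and says nothing about how IntersectingIterator works internally. The class name IntersectionSketch, the second term fi||noun||hill and the second document ID are made up for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustrative only: in-memory stand-in for the intended query semantics.
// IntersectionSketch is a hypothetical name, not part of Accumulo.
final class IntersectionSketch {

    // Returns the document IDs that are indexed under every one of the terms.
    static Set<String> intersect(Map<String, Set<String>> index, List<String> terms) {
        if (terms.isEmpty()) {
            return new TreeSet<String>();
        }
        Set<String> result = new TreeSet<String>(
                index.getOrDefault(terms.get(0), Collections.<String>emptySet()));
        for (String term : terms.subList(1, terms.size())) {
            // Keep only the documents that also contain this term.
            result.retainAll(index.getOrDefault(term, Collections.<String>emptySet()));
        }
        return result;
    }

    public static void main(String[] args) {
        String docA = "20130814-550e8400-e29b-41d4-a716-446655440000";
        String docB = "20130814-00000000-0000-0000-0000-000000000001"; // made-up ID

        // Map of column family (term) to the document IDs stored as column qualifiers.
        Map<String, Set<String>> index = new HashMap<String, Set<String>>();
        index.put("fi||verb||run", new HashSet<String>(Arrays.asList(docA, docB)));
        index.put("fi||noun||hill", new HashSet<String>(Arrays.asList(docA)));

        // Only docA is indexed under both terms.
        System.out.println(intersect(index, Arrays.asList("fi||verb||run", "fi||noun||hill")));
    }
}
```

Each ID in the returned set corresponds to one Range passed to the second batch scan.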

My issue is that the intersecting query takes several minutes to return, if it 
returns at all (in some cases it times out). Is this expected? Is there some way 
to improve performance? Is there a better way to do this sort of query?

Any guidance would be much appreciated.

Thanks

Luke

