RE: Intersecting Iterators [SEC=UNCLASSIFIED]

Williamson, Luke MR 1 Tue, 13 Aug 2013 21:52:01 -0700

UNCLASSIFIED

I have tried increasing the number of threads, and that seems to guarantee that 
the scan returns before it hits the timeout, but it still takes approx. 7 
minutes to complete. Looking at the Accumulo manager page, it appears that all 
the tablet servers get hit equally (around 16 per node) and start to return, 
but a couple of tablet servers take longer than the others. The documentation 
indicated that this behaviour could happen, but I was hoping it wouldn't take 
this long.


________________________________

From: David Medinets [mailto:[email protected]]
Sent: Wednesday, 14 August 2013 12:45
To: accumulo-user
Subject: Re: Intersecting Iterators [SEC=UNCLASSIFIED]


I'm wondering about the 20 threads in the BatchScanner. Have you played with 
increasing it? I've seen that number go above 15 per accumulo node. Are you 
seeing the scans in the Accumulo monitor? Are the scans progressing through the 
Accumulo nodes?


On Tue, Aug 13, 2013 at 9:58 PM, Williamson, Luke MR 1 
<[email protected]> wrote:


        UNCLASSIFIED
        
        Hi,
        
        I have field indexes that look something like
        
        Row Id: <date>-<UUID>
        CF: fi||<type>||<value>
        CQ: <date>-<UUID>
        
        For example:
        
        20130814-550e8400-e29b-41d4-a716-446655440000 fi||verb||run 20130814-550e8400-e29b-41d4-a716-446655440000
        20130814-550e8400-e29b-41d4-a716-446655440000 page||58 line||16 "the boy can run up the hill"
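
The key layout above can be sketched as plain string composition. This is
only an illustration in standard Java, with no Accumulo classes involved; the
class and method names (FieldIndexKey, docId, fieldIndexFamily) are
hypothetical and do not come from the original message.

```java
import java.util.UUID;

public class FieldIndexKey {
    // Hypothetical helper: builds "<date>-<UUID>", the value used as both
    // the row ID and the column qualifier in the layout above.
    static String docId(String date, UUID uuid) {
        return date + "-" + uuid;
    }

    // Hypothetical helper: builds the column family "fi||<type>||<value>".
    static String fieldIndexFamily(String type, String value) {
        return "fi||" + type + "||" + value;
    }

    public static void main(String[] args) {
        String id = docId("20130814",
                UUID.fromString("550e8400-e29b-41d4-a716-446655440000"));
        // Prints row ID, column family, and column qualifier, space separated.
        System.out.println(id + " " + fieldIndexFamily("verb", "run") + " " + id);
    }
}
```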
        
        From what I could determine from the documentation and API, I am
        executing the following code to perform an intersecting query on two
        values...
        
        Set<Range> shards = new HashSet<Range>();
        
        Text[] terms = {new Text("fi||<type>||<value>"), new Text("fi||<type>||<value>")};
        
        BatchScanner bs = conn.createBatchScanner(table, auths, 20);
        bs.setTimeout(360, TimeUnit.SECONDS);
        
        IteratorSetting iter = new IteratorSetting(20, "ii", IntersectingIterator.class);
        IntersectingIterator.setColumnFamilies(iter, terms);
        bs.addScanIterator(iter);
        
        bs.setRanges(Collections.singleton(new Range()));
        
        for (Entry<Key,Value> entry : bs) {
            shards.add(new Range(entry.getKey().getColumnQualifier()));
        }
        
        I then perform a second batch scan using the set of ranges returned by
        the above to get my actual results.
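
Conceptually, the first step of this two-step query finds the IDs that are
listed under every requested term, and the second step fetches those rows. A
minimal in-memory sketch of that first step follows, using plain Java
collections rather than Accumulo. The class name, method name, and sample
data are hypothetical, and this is not how IntersectingIterator itself is
implemented.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

public class TermIntersection {
    // Returns the IDs that appear under every term in 'terms'.
    // 'index' maps a term (e.g. "fi||verb||run") to the IDs listed under it.
    static SortedSet<String> intersect(Map<String, Set<String>> index,
                                       List<String> terms) {
        SortedSet<String> result = new TreeSet<>();
        boolean first = true;
        for (String term : terms) {
            Set<String> ids = index.getOrDefault(term, Collections.emptySet());
            if (first) {
                result.addAll(ids);    // seed with the first term's IDs
                first = false;
            } else {
                result.retainAll(ids); // keep only IDs shared with this term
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical sample data: two terms, one ID in common.
        Map<String, Set<String>> index = new HashMap<>();
        index.put("fi||verb||run", new HashSet<>(Arrays.asList("doc-1", "doc-2")));
        index.put("fi||noun||hill", new HashSet<>(Arrays.asList("doc-2", "doc-3")));
        System.out.println(
                intersect(index, Arrays.asList("fi||verb||run", "fi||noun||hill")));
    }
}
```

Each ID in the result would then become one range for the second scan.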
        
        My issue is that the intersecting query takes several minutes to
        return, if it returns at all (in some cases it times out). Is this
        expected? Is there some way to improve performance? Is there a better
        way to do this sort of query?
        
        Any guidance would be much appreciated.
        
        Thanks
        
        Luke
        
        
        IMPORTANT: This email remains the property of the Department of Defence
        and is subject to the jurisdiction of section 70 of the Crimes Act 1914.
        If you have received this email in error, you are requested to contact
        the sender and delete the email.
        



