[ 
https://issues.apache.org/jira/browse/ACCUMULO-4066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15120803#comment-15120803
 ] 

Josh Elser commented on ACCUMULO-4066:
--------------------------------------

This one slipped in under my radar. Thought I'd give your changes a glance. A 3x 
speedup is awesome!

{code}
-    for (Condition cond : cm.getConditions()) {
+    // sort conditions inorder to get better lookup performance. Sort on client side so tserver does not have to do it.
+    Condition[] ca = cm.getConditions().toArray(new Condition[cm.getConditions().size()]);
+    Arrays.sort(ca, CONDITION_COMPARATOR);
{code}

To confirm, the server doesn't rely on the sorted order; it just hopes for it 
for performance reasons?
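
For reference, here is a minimal, self-contained sketch of the client-side sort shown in the diff above. The Cond class is a hypothetical stand-in for Condition, and the family-then-qualifier ordering is an assumption for illustration; it is not necessarily how CONDITION_COMPARATOR orders conditions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class ConditionSortSketch {

  // Hypothetical stand-in for Accumulo's Condition: only family and qualifier.
  static final class Cond {
    final String family;
    final String qualifier;

    Cond(String family, String qualifier) {
      this.family = family;
      this.qualifier = qualifier;
    }

    @Override
    public String toString() {
      return family + ":" + qualifier;
    }
  }

  // Assumed ordering: column family first, then column qualifier.
  static final Comparator<Cond> COND_COMPARATOR = new Comparator<Cond>() {
    @Override
    public int compare(Cond a, Cond b) {
      int c = a.family.compareTo(b.family);
      return c != 0 ? c : a.qualifier.compareTo(b.qualifier);
    }
  };

  // Copies the conditions into an array and sorts the copy, mirroring the
  // toArray + Arrays.sort pattern in the diff. The input list is not modified.
  static Cond[] sortedCopy(List<Cond> conditions) {
    Cond[] ca = conditions.toArray(new Cond[conditions.size()]);
    Arrays.sort(ca, COND_COMPARATOR);
    return ca;
  }

  public static void main(String[] args) {
    List<Cond> conds = new ArrayList<Cond>();
    conds.add(new Cond("meta", "owner"));
    conds.add(new Cond("data", "v2"));
    conds.add(new Cond("data", "v1"));
    // prints [data:v1, data:v2, meta:owner]
    System.out.println(Arrays.toString(sortedCopy(conds)));
  }
}
```

If the server only benefits from the order rather than depending on it, an unsorted batch from an older client would still be evaluated correctly, just with less efficient lookups.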

I see a lot of changes in IteratorUtil (I assume to your point about loading 
iterators from the table config). How did this work before? You had lots of 
new tests added for the other cases -- do we have good coverage for 
IteratorUtil already?

> Conditional mutation processing performance could be improved.
> --------------------------------------------------------------
>
>                 Key: ACCUMULO-4066
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4066
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: tserver
>    Affects Versions: 1.6.4, 1.7.0
>            Reporter: Keith Turner
>            Assignee: Keith Turner
>             Fix For: 1.6.5, 1.7.1, 1.8.0
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When processing conditional mutations, tablet reads are done. The way the 
> current implementation does tablet reads has a lot of overhead. For each 
> condition the following is done:
>  * Opens and reserves iterators files.
>  * Parse table iterators from table config (involves scanning and filtering 
> entire table config)
>  * Merges condition iterators and table iterators
>  * Constructs iterator stack.
> I created a branch where these operations (except for constructing iterator 
> stack) are done per tablet and/or per batch of conditional mutations.   Doing 
> this I am seeing a 3x speed up in conditional mutation processing rates when 
> data is cached.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
