[jira] [Assigned] (KUDU-686) Delta apply optimizations

Adar Dembo (JIRA) Wed, 05 Sep 2018 15:09:46 -0700


     [ 
https://issues.apache.org/jira/browse/KUDU-686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Adar Dembo reassigned KUDU-686:
-------------------------------

    Assignee: Adar Dembo

I'm going to fix this using a different tack: port the {{PrepareBatch}}-heavy 
approach from {{DMSIterator}} to {{DeltaFileIterator}}. The idea is to do a lot 
more work in {{PrepareBatch}} such that there's no need for any IO in the 
per-column {{ApplyUpdates}} calls later on.

With that in place, there's no longer any need to switch to the "multi-pass" 
approach outlined earlier in this bug report.
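
To illustrate the shape of that change, here is a minimal sketch, not the actual Kudu code. {{Delta}}, {{SketchDeltaIterator}}, and {{io_passes}} are hypothetical names invented for this example, and the "file" is just an in-memory vector standing in for on-disk delta blocks. {{PrepareBatch}} makes one pass over the deltas and buckets them by column, so the later per-column {{ApplyUpdates}} calls only read prepared, in-memory state.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for one decoded delta record.
struct Delta {
  size_t row_idx;     // row ordinal within the delta file
  size_t col_idx;     // column that the update targets
  int64_t new_value;  // updated cell value
};

// Hypothetical iterator: all "IO" happens in PrepareBatch.
class SketchDeltaIterator {
 public:
  explicit SketchDeltaIterator(std::vector<Delta> file_deltas)
      : file_deltas_(std::move(file_deltas)) {}

  // Scans the (simulated) delta file once and buckets every delta that
  // falls in [start_row, start_row + nrows) by its target column.
  void PrepareBatch(size_t start_row, size_t nrows) {
    batch_start_ = start_row;
    batch_rows_ = nrows;
    prepared_.clear();
    ++io_passes_;  // the only place the "file" is read
    for (const Delta& d : file_deltas_) {
      if (d.row_idx >= start_row && d.row_idx < start_row + nrows) {
        prepared_[d.col_idx].push_back(d);
      }
    }
  }

  // Applies the prepared deltas for one column to dst, which holds the
  // batch's base values for that column. Touches only in-memory state.
  void ApplyUpdates(size_t col_idx, std::vector<int64_t>* dst) const {
    assert(dst->size() == batch_rows_);
    auto it = prepared_.find(col_idx);
    if (it == prepared_.end()) return;  // no deltas for this column
    for (const Delta& d : it->second) {
      (*dst)[d.row_idx - batch_start_] = d.new_value;
    }
  }

  // Number of passes made over the simulated file so far.
  int io_passes() const { return io_passes_; }

 private:
  std::vector<Delta> file_deltas_;                  // simulated on-disk deltas
  std::map<size_t, std::vector<Delta>> prepared_;   // per-column batch deltas
  size_t batch_start_ = 0;
  size_t batch_rows_ = 0;
  int io_passes_ = 0;
};
```

In this sketch, calling {{ApplyUpdates}} for N columns after a single {{PrepareBatch}} leaves {{io_passes()}} at 1, whereas a design that re-reads the file per column would need N passes. The trade-off is that the prepared batch has to fit in memory.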


> Delta apply optimizations
> -------------------------
>
>                 Key: KUDU-686
>                 URL: https://issues.apache.org/jira/browse/KUDU-686
>             Project: Kudu
>          Issue Type: Improvement
>          Components: perf, tablet
>    Affects Versions: M4.5
>            Reporter: David Alves
>            Assignee: Adar Dembo
>            Priority: Trivial
>
> We currently iterate over each delta file several times: once for deletes 
> and then once for each column.
> When selecting all the columns, it seems it would be more efficient to 
> apply the deltas to all columns at the same time. This might or might not be 
> advantageous depending on the number of columns projected. Todd also suggests 
> that whether this is an advantage depends on whether there are 
> predicates being pushed down.
> We could likely also merge the updates and deletes into a single iteration, or 
> at least avoid applying the mutations if the row will end up deleted (right 
> now we still apply the updates even when we find that the row will be 
> deleted).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
