[ 
https://issues.apache.org/jira/browse/PHOENIX-5494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16973586#comment-16973586
 ] 

Lars Hofhansl commented on PHOENIX-5494:
----------------------------------------

Do you have some cycles to look into this [~comnetwork]?
The work involved would roughly be:
1. Invert the loop in IndexBuildManager.getIndexUpdates(...), i.e. loop over 
IndexMaintainers, and then get all previous state in a single scan.
2. Convert the list of keys to retrieve into a skip scan... 
2a. By creating key ranges from the keys and then constructing a SkipScanFilter 
for that.

The details would be tricky. The SkipScanFilter is designed to work with 
columns, not entire keys, and inverting the loop requires refactoring a bunch 
of methods.
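
To illustrate the idea only (this is not Phoenix code): a rough sketch of 
fetching the previous state for a whole batch of row keys in one forward pass, 
rather than setting up a new scanner per row. A sorted in-memory map stands in 
for the region, and SkipScanSketch and fetchInOnePass are made-up names for 
this example. The real change would have to go through SkipScanFilter and the 
region scanner, which is where the tricky details above come in.

```java
import java.util.Collection;
import java.util.Map;
import java.util.NavigableMap;
import java.util.NavigableSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Illustrative stand-in only: a sorted map plays the role of a region.
// None of these names exist in Phoenix or HBase.
final class SkipScanSketch {

    /**
     * Returns the current value of every requested row key that exists in the
     * region, using a single forward pass that jumps from one wanted key to
     * the next instead of starting a fresh lookup per key.
     */
    static Map<String, String> fetchInOnePass(NavigableMap<String, String> region,
                                              Collection<String> keys) {
        // Sort and de-duplicate the keys; each one acts as a point key range.
        NavigableSet<String> wanted = new TreeSet<>(keys);
        Map<String, String> found = new TreeMap<>();
        if (wanted.isEmpty()) {
            return found;
        }

        // Position the cursor at the first row that could match.
        Map.Entry<String, String> cursor = region.ceilingEntry(wanted.first());
        while (cursor != null) {
            String row = cursor.getKey();
            // Smallest wanted key at or after the current row.
            String target = wanted.ceiling(row);
            if (target == null) {
                break; // no wanted keys remain past this row
            }
            if (target.equals(row)) {
                // Row is one of the wanted keys: keep it and step to the next row.
                found.put(row, cursor.getValue());
                cursor = region.higherEntry(row);
            } else {
                // Row is not wanted: skip ahead to the next wanted key.
                cursor = region.ceilingEntry(target);
            }
        }
        return found;
    }
}
```

The cursor only ever moves forward, so the per-row scanner setup is paid once 
per batch. Keys that are absent from the region are simply skipped.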

> Batched, mutable Index updates are unnecessarily run one-by-one
> ---------------------------------------------------------------
>
>                 Key: PHOENIX-5494
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5494
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Lars Hofhansl
>            Assignee: Kadir OZDEMIR
>            Priority: Major
>              Labels: performance
>         Attachments: Screenshot_20191110_160243.png, 
> Screenshot_20191110_160351.png, Screenshot_20191110_161453.png
>
>
> I just noticed that index updates on mutable tables retrieve their deletes 
> (to invalidate the old index entry) one-by-one.
> For batches, this can be *the* major time spent during an index update. The 
> cost is mostly incurred by the repeated setup (and seeking) of the new region 
> scanner (for each row).
> We can instead do a skip scan and get all updates in a single scan per region.
> (Logically that is simple, but it will require some refactoring)
> I won't be getting to this, but recording it here in case someone feels 
> inclined.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
