[jira] [Commented] (PHOENIX-5494) Batched, mutable Index updates are unnecessarily run one-by-one

Kadir OZDEMIR (Jira) Fri, 15 Nov 2019 04:54:06 -0800


    [ 
https://issues.apache.org/jira/browse/PHOENIX-5494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16975073#comment-16975073
 ]


Kadir OZDEMIR commented on PHOENIX-5494:
----------------------------------------

[~comnetwork], thank you for the suggestions and the patch. I agree with most 
of your points. However, I do not think we can apply this optimization to 
replay writes. Replay writes can have cells with different timestamps and can 
have multiple mutations for the same row key within a batch. Regarding the 
concurrency issues for regular writes, they are taken care of by the existing 
row locks and the use of ConcurrentHashSet. 

> Batched, mutable Index updates are unnecessarily run one-by-one
> ---------------------------------------------------------------
>
>                 Key: PHOENIX-5494
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5494
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Lars Hofhansl
>            Assignee: Kadir OZDEMIR
>            Priority: Major
>              Labels: performance
>         Attachments: 5494-4.x-HBase-1.5.txt, 
> PHOENIX-5494-4.x-HBase-1.4.patch, PHOENIX-5494.master.001.patch, 
> PHOENIX-5494.master.002.patch, PHOENIX-5494.master.003.patch, 
> Screenshot_20191110_160243.png, Screenshot_20191110_160351.png, 
> Screenshot_20191110_161453.png
>
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> I just noticed that index updates on mutable tables retrieve their deletes 
> (to invalidate the old index entry) one-by-one.
> For batches, this can be *the* major time spent during an index update. The 
> cost is mostly incurred by the repeated setup (and seeking) of the new region 
> scanner (for each row).
> We can instead do a skip scan and get all updates in a single scan per region.
> (Logically that is simple, but it will require some refactoring)
> I won't be getting to this, but recording it here in case someone feels 
> inclined.
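
To make the cost difference described in the issue concrete, here is a minimal, 
self-contained sketch. It is not Phoenix or HBase code: a TreeMap stands in for 
a region's sorted rows, and the class name, method names, and the 
scannersOpened counter are all hypothetical, introduced only for illustration. 
It contrasts opening one "scanner" per row with a single forward pass that 
hops from key to key, which is roughly the idea behind a skip scan.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical stand-in for a region: a sorted map from row key to row state.
class SkipScanSketch {
    // Counts simulated scanner setups so the per-row overhead is visible.
    static int scannersOpened = 0;

    // One-by-one: a new "scanner" is set up and seeked for every row in the batch.
    static List<String> lookupOneByOne(NavigableMap<String, String> region,
                                       List<String> batchKeys) {
        List<String> found = new ArrayList<>();
        for (String key : batchKeys) {
            scannersOpened++; // per-row setup and seek
            Map.Entry<String, String> e = region.ceilingEntry(key);
            if (e != null && e.getKey().equals(key)) {
                found.add(e.getValue());
            }
        }
        return found;
    }

    // Skip-scan style: sort the keys once, open one "scanner", and only move
    // forward, skipping the rows that lie between the requested keys.
    static List<String> lookupWithSingleScan(NavigableMap<String, String> region,
                                             List<String> batchKeys) {
        List<String> found = new ArrayList<>();
        SortedSet<String> sortedKeys = new TreeSet<>(batchKeys);
        if (sortedKeys.isEmpty()) {
            return found;
        }
        scannersOpened++; // single setup for the whole batch
        NavigableMap<String, String> cursor = region.tailMap(sortedKeys.first(), true);
        for (String key : sortedKeys) {
            cursor = cursor.tailMap(key, true); // hop forward to the next key
            Map.Entry<String, String> e = cursor.firstEntry();
            if (e != null && e.getKey().equals(key)) {
                found.add(e.getValue());
            }
        }
        return found;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> region = new TreeMap<>();
        region.put("r1", "a");
        region.put("r2", "b");
        region.put("r4", "d");
        region.put("r7", "g");
        List<String> batch = List.of("r2", "r3", "r7");

        System.out.println(lookupOneByOne(region, batch) + " scanners=" + scannersOpened);
        scannersOpened = 0;
        System.out.println(lookupWithSingleScan(region, batch) + " scanners=" + scannersOpened);
    }
}
```

Note that the single-scan variant collapses duplicate keys and returns rows in 
key order. That is acceptable for regular writes in this sketch, but it also 
hints at why a batch that carries several mutations for the same row key needs 
separate treatment.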



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
