[ 
https://issues.apache.org/jira/browse/PHOENIX-5688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated PHOENIX-5688:
-----------------------------------
    Summary: Investigate better client/server work pacing  (was: Investigate 
better server work pacing)

> Investigate better client/server work pacing
> --------------------------------------------
>
>                 Key: PHOENIX-5688
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5688
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>            Priority: Major
>
> [~kozdemir] shared an intriguing idea that he used for the server-side index 
> repair tool, and which would apply equally well to server-side deletes and 
> server-side UPSERT/SELECT.
> The main problem with the current implementation is that we basically send a 
> predicate to the server: DELETE FROM <table> WHERE <condition>. The 
> server(s) then go off and, region by region, evaluate the condition and 
> delete whatever matches it, all in a tight server-side loop.
> The downside is that (a) a server thread is held up arbitrarily long, (b) 
> the server has no way to do any fair queuing because the loop has to 
> finish, and (c) if the server takes too long the client will simply time out.
> The alternative used to be to do the work on the client instead: Issue a scan 
> with the condition to the server, retrieve the IDs to the client, and then 
> issue nice chunks of deletes back to the server.
> The downside here is the extra communication overhead between the server and 
> client (which might be especially taxing for UPSERT/SELECT).
> Kadir's approach is a middle ground:
>  # Issue a scan from the client, and send along a chunk size (N rows), when 
> getting the scanner.
>  # The server will do N rows worth of work, then return.
>  # The client keeps the scanner open, and calls next.
>  # Repeat from step 2 until the scanner is exhausted.
> This way we get the benefits of both approaches: (1) the work happens close 
> to where the data is, and (2) the client can pace the work while the server 
> gets a chance to schedule other work between chunks.
>  
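
The paced loop described above can be sketched roughly as follows. This is an in-memory simulation in Python, not Phoenix or HBase code: `ChunkedDeleteScanner`, `paced_delete`, and the `between_chunks` hook are hypothetical names introduced only for illustration, and the sketch assumes that "N rows worth of work" means N rows scanned per call (rather than N rows matched).

```python
from typing import Callable, Dict, Optional, Tuple


class ChunkedDeleteScanner:
    """Hypothetical stand-in for a server-side scanner that does bounded work per call."""

    def __init__(self, table: Dict[int, int],
                 predicate: Callable[[int], bool], chunk_size: int) -> None:
        if chunk_size <= 0:
            raise ValueError("chunk_size must be positive")
        self.table = table
        self.predicate = predicate
        self.chunk_size = chunk_size
        # Snapshot the row keys in scan order when the scanner is opened.
        self._keys = sorted(table)
        self._pos = 0

    def next(self) -> Tuple[int, bool]:
        """Scan at most chunk_size rows, delete the matches, then return.

        Returns (rows_deleted_in_this_chunk, more_rows_remaining).
        """
        end = min(self._pos + self.chunk_size, len(self._keys))
        deleted = 0
        for key in self._keys[self._pos:end]:
            if self.predicate(self.table[key]):
                del self.table[key]
                deleted += 1
        self._pos = end
        return deleted, self._pos < len(self._keys)


def paced_delete(table: Dict[int, int],
                 predicate: Callable[[int], bool],
                 chunk_size: int,
                 between_chunks: Optional[Callable[[], None]] = None) -> Tuple[int, int]:
    """Client side: keep the scanner open and call next() until it is exhausted.

    Returns (total_rows_deleted, number_of_next_calls).
    """
    scanner = ChunkedDeleteScanner(table, predicate, chunk_size)
    total, calls, more = 0, 0, True
    while more:
        deleted, more = scanner.next()
        total += deleted
        calls += 1
        # Assumption: the client may pause or throttle here to pace the work.
        if more and between_chunks is not None:
            between_chunks()
    return total, calls


if __name__ == "__main__":
    rows = {i: i for i in range(10)}
    print(paced_delete(rows, lambda v: v % 2 == 0, chunk_size=3))
    print(sorted(rows))
```

Each `next()` call returns after a bounded amount of work, so no single call can hold a server thread for longer than one chunk, and the client decides how quickly to issue the following call.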



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
