[jira] [Commented] (PHOENIX-4344) MapReduce Delete Support

Lars Hofhansl (JIRA) Wed, 24 Oct 2018 10:52:43 -0700


    [ 
https://issues.apache.org/jira/browse/PHOENIX-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16662617#comment-16662617
 ]


Lars Hofhansl commented on PHOENIX-4344:
----------------------------------------

We just had a discussion about this. Can we do the following?
 # Create the input splits as we do now. No change there.
 # In the map function, upon the _first row_, issue the equivalent of DELETE 
FROM <table> WHERE <pk> >= split_start AND <pk> < split_end AND <whatever select 
predicate was specified>.
 # Finish the map task after the first row.
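The range DELETE from the second step could be assembled roughly as follows. This is only a sketch under assumptions that are not part of the proposal: it assumes a single-column primary key, that the split boundaries are bound later as JDBC parameters (the `?` placeholders), and that the first and last splits may be unbounded on one side. The class and method names are hypothetical and are not Phoenix APIs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Phoenix: builds the per-split range DELETE text.
final class SplitDeleteSketch {
    private SplitDeleteSketch() {}

    /**
     * Builds "DELETE FROM <table> WHERE <pk> >= ? AND <pk> < ? AND (<predicate>)".
     * A bound is omitted when the split is open on that side (assumed to be the
     * case for the first and last splits). The caller binds the "?" values.
     */
    static String buildRangeDelete(String table, String pkColumn,
                                   boolean hasLowerBound, boolean hasUpperBound,
                                   String selectPredicate) {
        List<String> conditions = new ArrayList<>();
        if (hasLowerBound) {
            conditions.add(pkColumn + " >= ?");   // inclusive split start
        }
        if (hasUpperBound) {
            conditions.add(pkColumn + " < ?");    // exclusive split end
        }
        if (selectPredicate != null && !selectPredicate.trim().isEmpty()) {
            // Parenthesize so an OR inside the user predicate cannot escape the range.
            conditions.add("(" + selectPredicate.trim() + ")");
        }
        StringBuilder sql = new StringBuilder("DELETE FROM ").append(table);
        if (!conditions.isEmpty()) {
            sql.append(" WHERE ").append(String.join(" AND ", conditions));
        }
        return sql.toString();
    }
}
```

A mapper would call something like this once, on its first row, execute the statement with the split's start and end keys bound, and then stop reading input. A composite primary key would need a different comparison than the single-column one assumed here.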

Now Phoenix can push the DELETE down into the region and be an order of 
magnitude or two faster compared to issuing point deletes.

A nice side effect is that if a region has no matching data, the mapper never 
sees a first row, so we won't issue any work for it at all.

I think that's what James was saying in the first comment.

[~gjacoby], [~jisaac]

> MapReduce Delete Support
> ------------------------
>
>                 Key: PHOENIX-4344
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4344
>             Project: Phoenix
>          Issue Type: New Feature
>    Affects Versions: 4.12.0
>            Reporter: Geoffrey Jacoby
>            Assignee: Geoffrey Jacoby
>            Priority: Major
>
> Phoenix already has the ability to use MapReduce for asynchronous handling of 
> long-running SELECTs. It would be really useful to have this capability for 
> long-running DELETEs, particularly of tables with indexes where using HBase's 
> own MapReduce integration would be prohibitively complicated. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
