[jira] [Commented] (PHOENIX-1590) Add an Asynchronous/Deferred Delete Option

Jan Fernando (JIRA) Mon, 26 Jan 2015 11:23:03 -0800

    [ 
https://issues.apache.org/jira/browse/PHOENIX-1590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292272#comment-14292272
 ]


Jan Fernando commented on PHOENIX-1590:
---------------------------------------

[~jamestaylor] Re the corner cases:

#1 I think it is okay to allow the creation of a new VIEW with a different 
name but with the same or overlapping predicate. I think some of the 
responsibility is on the caller of the DDL statements not to issue conflicting 
statements; it's hard to prevent people from shooting themselves in the foot 
if they really want to. Ordering the DDL operations so that the data and 
application behavior stay consistent is the responsibility of the caller. I 
could see scenarios with overlapping views where changes are made across 
release boundaries, e.g. version 1 of the software uses viewX and version 2 
uses viewY, both need to coexist during a deployment, and you want to delete 
data based on viewX after the deployment. I think there are a lot of 
combinations based on unique needs, and application developers are better 
equipped than Phoenix to order the DDL operations in these cases.

#2 I think we should allow a view with the exact same name to be created only 
after all the data is deleted. This seems the easiest to reason about and 
forces you to think about a migration strategy.

> Add an Asynchronous/Deferred Delete Option
> ------------------------------------------
>
>                 Key: PHOENIX-1590
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1590
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: Jan Fernando
>
> For use cases where we need to delete very large amounts of data from Phoenix 
> tables, running a synchronous delete can be problematic. To guarantee that 
> the delete completes, handle failure scenarios, and ensure it doesn't put too 
> much load on the HBase cluster and crowd out other running queries, we need 
> to build tooling around the longer-running delete operations to chunk them 
> up, provide retries in the event of failures, and throttle delete load if 
> the Region Servers get hot.
> It would be really great if Phoenix offered a way to invoke a resilient 
> delete that was processed asynchronously and put minimal load on the cluster. 
> One idea mentioned for implementing this is to add a DEFERRED keyword to the 
> DELETE operation and have such a delete remove the data at compaction time.
> For our use cases, ideally, we would like to set delete filters that are 
> based on the first 2 elements of the row key (a multi-tenant id and the next 
> item).
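
For illustration only, here is a minimal sketch of the kind of client-side 
tooling the description refers to: it walks the delete in chunks, retries a 
failed chunk, and pauses between chunks as a crude throttle. Everything named 
here is an assumption made for the example, not part of Phoenix: 
chunked_delete, delete_chunk, and the (tenant_id, key_prefix) chunking are 
hypothetical, and delete_chunk stands in for whatever code actually issues 
the DELETE for one chunk.

```python
import time
from typing import Callable, Iterable, Tuple


def chunked_delete(
    chunks: Iterable[Tuple[str, str]],
    delete_chunk: Callable[[str, str], int],
    max_retries: int = 3,
    pause_seconds: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Delete one (tenant_id, key_prefix) chunk at a time.

    delete_chunk is a hypothetical callable that deletes the rows for one
    chunk and returns how many rows it removed. Returns the total number of
    rows deleted; re-raises the last error if a chunk still fails after
    max_retries attempts.
    """
    total = 0
    for tenant_id, key_prefix in chunks:
        for attempt in range(1, max_retries + 1):
            try:
                total += delete_chunk(tenant_id, key_prefix)
                break
            except Exception:
                if attempt == max_retries:
                    raise
                # Linear backoff before retrying the same chunk.
                sleep(pause_seconds * attempt)
        # Fixed pause between chunks to limit load on the cluster.
        sleep(pause_seconds)
    return total
```

A fixed pause is the simplest possible throttle; tooling that reacts to 
Region Server load would need a signal from the cluster, which this sketch 
does not attempt.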



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
