[
https://issues.apache.org/jira/browse/PHOENIX-1590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292272#comment-14292272
]
Jan Fernando commented on PHOENIX-1590:
---------------------------------------
[~jamestaylor] Re the corner cases:
#1 I think it is okay to allow the creation of a new VIEW with a different
name but with the same or overlapping predicate. I think some of the
responsibility is on the caller of the DDL statements to not issue conflicting
statements. I think it's hard to prevent people from shooting themselves in the
foot if they really want to. I think making sure the order of DDL operations
guarantees the data and application behavior is consistent is the
responsibility of the caller. I could see scenarios with overlapping views
where changes are made across release boundaries, e.g. version 1 of the software
uses viewX and version 2 uses viewY, both need to coexist during a
deployment, and you want to delete data based on viewX after the deployment. I
think that there are a lot of combinations based on unique needs where application
developers are best equipped to order the DDL operations, as
opposed to pushing this responsibility to Phoenix.
#2 I think we should allow a view with the exact same name to be created only
after all the data is deleted. This seems the easiest to reason about and
forces you to think about a migration strategy.
> Add an Asynchronous/Deferred Delete Option
> ------------------------------------------
>
> Key: PHOENIX-1590
> URL: https://issues.apache.org/jira/browse/PHOENIX-1590
> Project: Phoenix
> Issue Type: New Feature
> Reporter: Jan Fernando
>
> For use cases where we need to delete very large amounts of data from Phoenix
> tables, running a synchronous delete can be problematic. To guarantee
> that the delete completes, handle failure scenarios, and ensure it doesn't
> put too much load on the HBase cluster and crowd out other running queries, we
> need to build tooling around the longer-running delete operations to chunk
> them up, provide retries in the event of failures, and have ways to throttle
> delete load if the Region Servers get hot.
> It would be really great if Phoenix offered a way to invoke a resilient
> delete that was processed asynchronously and had minimal load on the cluster.
> One idea mentioned for implementing this is to add a DEFERRED keyword to the
> DELETE operation, so that such a delete removes the data at compaction time.
> For our use cases, ideally, we would like to set delete filters that are
> based on the first 2 elements of the row key (a multi-tenant id and the next
> item).
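
For illustration only, the client-side tooling the description mentions (chunking, retries, throttling) might look roughly like the sketch below. It is not Phoenix code and not the reporter's implementation: `chunked_delete`, `delete_chunk`, `max_retries`, and `pause_seconds` are all hypothetical names, and `delete_chunk` stands in for whatever issues one bounded delete against the cluster and reports how many rows it removed.

```python
import time


def chunked_delete(delete_chunk, max_retries=3, pause_seconds=1.0, sleep=time.sleep):
    """Call delete_chunk() repeatedly until it reports that no rows are left.

    delete_chunk is a hypothetical callable (an assumption of this sketch):
    it deletes at most one chunk of rows and returns the number removed.
    """
    total = 0
    while True:
        for attempt in range(1, max_retries + 1):
            try:
                deleted = delete_chunk()
                break
            except Exception:
                if attempt == max_retries:
                    raise  # give up after the last allowed attempt
                sleep(pause_seconds * attempt)  # linear backoff before retrying
        if deleted == 0:
            return total  # nothing left to delete
        total += deleted
        sleep(pause_seconds)  # throttle between chunks to limit cluster load
```

A caller would pass a function that runs one size-limited delete and commits it. The fixed pause is the simplest possible throttle; reacting to Region Server load would need a signal this sketch does not model.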