[
https://issues.apache.org/jira/browse/PHOENIX-5090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16805978#comment-16805978
]
Lars Hofhansl commented on PHOENIX-5090:
----------------------------------------
Here's another observation on this: If I leave autocommit on I can insert much
larger batches. *And* it is still transactionally correct. I.e. I can execute a
huge upsert as long as autocommit is on, and in another client the
changes are not visible until the entire upsert has finished.
When I turn autocommit off the upsert fails due to the mutation size.
So the bug is clearly in Phoenix!
When we have transactions Phoenix should flush batches to HBase even when
autocommit is OFF. The semantics have changed.
Without transactions: Phoenix has to batch up all edits on the heap.
With transactions: Phoenix only has to make sure that changes are not visible to
other transactions.
I'll try to provide a patch.
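
To make the difference concrete, here is a minimal, self-contained sketch of the idea. It is not Phoenix code: FakeStore, TxWriter, flushThreshold and every other name are hypothetical stand-ins, and the store simply counts cells instead of talking to HBase. It only illustrates that a client could flush uncommitted data in bounded batches while holding on to nothing but the changed row keys until commit.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration only; none of these names are Phoenix APIs. */
public class FlushingTxSketch {

    /** Stand-in for HBase: just counts cells received as uncommitted data. */
    static final class FakeStore {
        int uncommittedCells = 0;

        void write(List<String[]> batch) {
            uncommittedCells += batch.size();
        }
    }

    /** Client-side transaction state with a bounded mutation buffer. */
    static final class TxWriter {
        private final FakeStore store;
        private final int flushThreshold;
        // Pending mutations as {rowKey, value}; dropped after every flush.
        private final List<String[]> pending = new ArrayList<>();
        // Changed row keys; kept until commit for conflict detection.
        private final Set<String> changedRows = new HashSet<>();
        int maxBuffered = 0;

        TxWriter(FakeStore store, int flushThreshold) {
            this.store = store;
            this.flushThreshold = flushThreshold;
        }

        void upsert(String rowKey, String value) {
            pending.add(new String[] {rowKey, value});
            changedRows.add(rowKey);
            maxBuffered = Math.max(maxBuffered, pending.size());
            if (pending.size() >= flushThreshold) {
                flush();
            }
        }

        /** Sends buffered mutations to the store and frees them on the client. */
        void flush() {
            if (pending.isEmpty()) {
                return;
            }
            store.write(new ArrayList<>(pending));
            pending.clear();
        }

        /** Flushes the remainder and returns the change set a commit would check. */
        Set<String> commit() {
            flush();
            return changedRows;
        }
    }

    public static void main(String[] args) {
        FakeStore store = new FakeStore();
        TxWriter tx = new TxWriter(store, 100);
        for (int i = 0; i < 1000; i++) {
            tx.upsert("row" + i, "v" + i);
        }
        Set<String> changeSet = tx.commit();
        System.out.println(store.uncommittedCells + " cells written, "
                + changeSet.size() + " row keys tracked, max buffered " + tx.maxBuffered);
    }
}
```

In this sketch the buffered mutations never exceed flushThreshold, no matter how large the upsert is; only the set of row keys grows with the transaction. Whether the transaction manager hides the flushed data from other clients is assumed here and not modeled.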
> Discuss: Allow transactional writes without buffering the entire transaction
> on the client.
> -------------------------------------------------------------------------------------------
>
> Key: PHOENIX-5090
> URL: https://issues.apache.org/jira/browse/PHOENIX-5090
> Project: Phoenix
> Issue Type: Wish
> Reporter: Lars Hofhansl
> Priority: Major
>
> Currently it is not possible to execute transactions in Phoenix that are too
> large to be buffered entirely on the client.
> Both Tephra and Omid support writing uncommitted data to HBase immediately
> and at full speed. The client still needs to keep track of the rows changed
> for:
> # Conflict detection
> # (for Omid) writing the shadow cells
> I'd like to do some brainstorming here.
> * It should *always* be enough to hold on to the changed rows (and
> columns?) only for _conflict resolution_ and free the rest from the client as
> soon as the uncommitted data is written to HBase.
> * For the shadow cells we need only keep the rows changed, right?
> * There are situations where we can avoid the client-side buffering entirely
> (perhaps only for Tephra) when we declare a table or upsert not to
> participate in conflict resolution.
> [~tdsilva], [~ohads], [~yonigo], [~jamestaylor], [~vincentpoon], more, better
> ideas?