[ 
https://issues.apache.org/jira/browse/PHOENIX-5090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16742178#comment-16742178
 ] 

Ohad Shacham commented on PHOENIX-5090:
---------------------------------------

>  Is it fair to say that Omid is designed for very many, small transactions, 
>and not for extremely large transactions?

An extremely large transaction requires storing a lot of metadata, so I 
assume the answer is yes.

 

> What would need to change in HBase so you can use those?

If we could ask HBase to delete all of a row's columns whose version is equal 
to (not smaller than) the transaction version, then we could keep only the row 
information (in row-level conflict analysis) and use this function to delete 
in case of an abort.
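
To make the requested semantics concrete, here is a minimal in-memory sketch 
of an "exact version" row delete. It does not use the HBase client API; the 
class ExactVersionDelete and its methods are hypothetical names chosen for 
illustration, and a row is modelled simply as column -> (version -> value). 
The point is that an abort removes only the cells written at the transaction 
version and leaves older committed versions untouched.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory model of a single row; NOT the HBase API. */
public class ExactVersionDelete {
    // column name -> (version -> value), versions sorted ascending
    private final Map<String, TreeMap<Long, String>> columns = new TreeMap<>();

    /** Writes a non-null value for a column at the given version. */
    public void put(String column, long version, String value) {
        columns.computeIfAbsent(column, c -> new TreeMap<>()).put(version, value);
    }

    /**
     * Removes, in every column of the row, only the cell whose version is
     * exactly txVersion. Older versions are kept. Returns the number of
     * cells removed.
     */
    public int deleteExactVersion(long txVersion) {
        int removed = 0;
        Iterator<TreeMap<Long, String>> it = columns.values().iterator();
        while (it.hasNext()) {
            TreeMap<Long, String> versions = it.next();
            if (versions.remove(txVersion) != null) {
                removed++;
            }
            if (versions.isEmpty()) {
                it.remove(); // drop columns that no longer hold any cell
            }
        }
        return removed;
    }

    /** Returns the newest value of a column, or null if the column is absent. */
    public String latest(String column) {
        TreeMap<Long, String> versions = columns.get(column);
        return versions == null ? null : versions.lastEntry().getValue();
    }

    public static void main(String[] args) {
        ExactVersionDelete row = new ExactVersionDelete();
        row.put("a", 5L, "committed-a");   // written by an earlier transaction
        row.put("a", 9L, "uncommitted-a"); // written by transaction 9
        row.put("b", 9L, "uncommitted-b"); // written by transaction 9
        int removed = row.deleteExactVersion(9L); // transaction 9 aborts
        System.out.println(removed + " " + row.latest("a") + " " + row.latest("b"));
        // prints "2 committed-a null"
    }
}
```

With such an operation the client would only need to remember which rows a 
transaction touched, not which columns, in order to clean up after an abort.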

 

>   Is that done on the server in the context of scan operations? 

Assuming the filtering is done on the server side (as we do for Phoenix), 
then yes :(.

 

> Perhaps good to take offline and have a quick chat?

Sounds great, we are 10 hours ahead so meeting at 11am your time would be a 
good match for us. Can Tuesday or Wednesday work?

 

Overall, for row-level conflict analysis it would be great to add a row-level 
shadow cell. This would significantly increase scalability for large 
transactions and would solve the issues above. Adding an HBase delete that 
deletes all columns with the exact version would let Omid delete efficiently 
in case of an abort. Otherwise, removing these cells during the regular GC is 
also possible. 
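
As a rough illustration of why row-level tracking helps large transactions, 
the sketch below compares how many entries a client-side write set holds when 
it records every (row, column) pair versus only the distinct rows. 
WriteSetSketch is a hypothetical class for illustration, not Omid's or 
Phoenix's actual data structure.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical client-side write set; not Omid's actual implementation. */
public class WriteSetSketch {
    private final boolean rowLevel;
    private final Set<String> entries = new HashSet<>();

    /** rowLevel = true keeps one entry per row; false keeps one per cell. */
    public WriteSetSketch(boolean rowLevel) {
        this.rowLevel = rowLevel;
    }

    /** Records a write; in row-level mode the column is ignored. */
    public void recordWrite(String rowKey, String column) {
        entries.add(rowLevel ? rowKey : rowKey + "\u0000" + column);
    }

    /** Number of entries the client has to keep until commit or abort. */
    public int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        WriteSetSketch cellLevel = new WriteSetSketch(false);
        WriteSetSketch rowLevelSet = new WriteSetSketch(true);
        for (String row : new String[] {"r1", "r2"}) {
            for (String col : new String[] {"c1", "c2", "c3"}) {
                cellLevel.recordWrite(row, col);
                rowLevelSet.recordWrite(row, col);
            }
        }
        System.out.println(cellLevel.size() + " vs " + rowLevelSet.size());
        // prints "6 vs 2"
    }
}
```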

 

> Discuss: Allow transactional writes without buffering the entire transaction 
> on the client.
> -------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-5090
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5090
>             Project: Phoenix
>          Issue Type: Wish
>            Reporter: Lars Hofhansl
>            Priority: Major
>
> Currently it is not possible to execute transactions in Phoenix that are too 
> large to be buffered entirely on the client.
> Both Tephra and Omid support writing uncommitted data to HBase immediately 
> and at full speed. The client still needs to keep track of the row changes 
> for:
> # Conflict detection
> # (for Omid) writing the shadow cells
> I'd like to do some brainstorming here.
> * It should *always* be enough to hold on to only the changed rows (and 
> columns?) for _conflict resolution_ and free the rest from the client as 
> soon as the uncommitted data is written to HBase.
> * For the shadow cells we need only keep the rows changed, right?
> * There are situations where we can avoid the client-side buffering entirely 
> (perhaps only for Tephra) when we declare a table or upsert not to 
> participate in conflict resolution.
> [~tdsilva], [~ohads], [~yonigo], [~jamestaylor], [~vincentpoon], more, better 
> ideas?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
