[ 
https://issues.apache.org/jira/browse/HBASE-18703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190119#comment-16190119
 ] 

Umesh Agashe commented on HBASE-18703:
--------------------------------------

Thanks for your comments, [~anoop.hbase]! The use case that [~chia7712] has 
specified is interesting and will be supported.

bq. So u propose we dont give RowProcessor kind of way for customized 
processing?

I am proposing that coprocessors be used for customized processing instead 
of RowProcessor. Currently this can be done either with a RowProcessor, by 
calling Region.processRowsWithLocks(), or with coprocessors, by calling 
Region.batchMutate(). The intended difference between these two APIs is that 
Region.batchMutate() performs only PUT and DELETE operations, whereas 
Region.processRowsWithLocks() can perform any of GET, PUT, DELETE, 
CheckAndMutate, etc.

bq. Use has to do multi row compare and based on that do a mutate op on another 
row.. I see adding a new API which takes rowsToLock. But here in this issue, 
user may have to take write locks on certain rows and read lock on another. 
Also has to do a complex compare op on many row:columns. What is the alternate 
we can give?

The sequence of steps for Region.processRowsWithLocks() and 
Region.batchMutate() can be roughly described as follows.

Region.processRowsWithLocks():
* Build empty WALEdit
* Call RowProcessor.preProcess(WALEdit)
* Lock all user specified rows
* Call RowProcessor.process(WALEdit) and get mutations[] to apply to MemStore
* Call RowProcessor.preBatchMutate(WALEdit)
* Append WALEdit processed by RowProcessor to WAL
* Apply mutations to MemStore
* Call RowProcessor.postBatchMutate()
* Release locks
* Call RowProcessor.postProcess()
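
The ordering above can be illustrated with a toy model. This is only a sketch 
and not HBase code: cells are plain strings, the class and method names are 
hypothetical, and the preBatchMutate argument stands in for a coprocessor hook 
that may edit the mutations. It assumes, as the issue description says, that 
the WALEdit is filled in process(), before preBatchMutate runs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

/** Toy model of the processRowsWithLocks() ordering; hypothetical, not HBase code. */
final class ProcessRowsWithLocksSketch {

    /** What reached the WAL and what reached the MemStore in one run. */
    static final class Result {
        final List<String> wal;
        final List<String> memStore;

        Result(List<String> wal, List<String> memStore) {
            this.wal = wal;
            this.memStore = memStore;
        }
    }

    /**
     * Runs the steps in the listed order (preProcess, locking and the post
     * hooks are omitted because they do not affect the outcome here).
     */
    static Result run(List<String> cellsFromProcessor,
                      Consumer<List<String>> preBatchMutate) {
        // Build empty WALEdit.
        List<String> walEdit = new ArrayList<>();
        // process(): fills the WALEdit and returns the mutations to apply.
        List<String> mutations = new ArrayList<>(cellsFromProcessor);
        walEdit.addAll(mutations);
        // preBatchMutate(): runs after the WALEdit is built and can only
        // change the mutations, not the WALEdit.
        preBatchMutate.accept(mutations);
        // Append WALEdit to WAL, then apply mutations to MemStore.
        List<String> wal = new ArrayList<>(walEdit);
        List<String> memStore = new ArrayList<>(mutations);
        return new Result(wal, memStore);
    }

    public static void main(String[] args) {
        Result r = run(Arrays.asList("row1:a", "row1:b"),
                       m -> m.remove("row1:b"));
        System.out.println("WAL=" + r.wal + " MemStore=" + r.memStore);
    }
}
```

In this model, a hook that removes a cell leaves that cell in the WAL but not 
in the MemStore, which is the inconsistency this issue is about.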

Region.batchMutate(Mutation[]):
* Take mutations (PUTs and DELETEs only) as an input
* Prepare empty WALEdit
* Call cp.prePut(WALEdit) or cp.preDelete(WALEdit) and store WALEdits for each 
mutation into BatchOperation
* Lock as many rows corresponding to mutations as possible
* For mutations for which rows can be locked, call cp.preBatchMutate(Mutation[])
* Get Mutations from CP.
* For each mutation returned by CP: lock rows and merge them with input list of 
mutations
* Build new WALEdit by applying these merged mutations (input + from cp)
* Apply WALEdits from CP (previously stored in BatchOperation after calling 
prePut/ preDelete) to WALEdit built in previous step
* Append the merged WALEdit to WAL
* Apply merged mutations to MemStore
* Call cp.postBatchMutate()
* Release locks
* Call cp.postPut()/ cp.postDelete()
* Call cp.postBatchMutateIndispensably()
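
The batchMutate() ordering can be modelled the same way. Again this is a 
sketch under assumptions, not HBase code: cells are plain strings, all names 
are hypothetical, and the preBatchMutate argument is a stand-in hook that may 
edit the input mutations in place and returns any extra mutations from the CP. 
Locking and the post hooks are left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

/** Toy model of the batchMutate() ordering; hypothetical, not HBase code. */
final class BatchMutateSketch {

    /** What reached the WAL and what reached the MemStore in one run. */
    static final class Result {
        final List<String> wal;
        final List<String> memStore;

        Result(List<String> wal, List<String> memStore) {
            this.wal = wal;
            this.memStore = memStore;
        }
    }

    /**
     * @param input          mutations passed in by the caller
     * @param preBatchMutate stand-in hook: may edit the mutations in place
     *                       and returns extra mutations from the CP
     * @param cpWalEdits     WALEdit entries stored earlier by prePut/preDelete
     */
    static Result run(List<String> input,
                      Function<List<String>, List<String>> preBatchMutate,
                      List<String> cpWalEdits) {
        List<String> merged = new ArrayList<>(input);
        // preBatchMutate(): runs before any WALEdit is built.
        List<String> fromCp = preBatchMutate.apply(merged);
        // Merge the mutations returned by the CP with the input mutations.
        merged.addAll(fromCp);
        // Build the WALEdit from the merged mutations, then add the
        // WALEdits stored from prePut/preDelete.
        List<String> walEdit = new ArrayList<>(merged);
        walEdit.addAll(cpWalEdits);
        // Append the WALEdit to the WAL and apply merged mutations to MemStore.
        return new Result(walEdit, merged);
    }

    public static void main(String[] args) {
        Result r = run(Arrays.asList("row1:a", "row1:b"),
                       m -> { m.remove("row1:b"); return Arrays.asList("row2:c"); },
                       Collections.<String>emptyList());
        System.out.println("WAL=" + r.wal + " MemStore=" + r.memStore);
    }
}
```

Because the WALEdit is built after the hook has run, a cell removed or added 
by the hook is reflected in both the WAL and the MemStore in this model.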

Comparing the lists of steps for the two methods, we can see that most hooks 
correspond to each other, except for RowProcessor.process().

Currently, processRowsWithLocks() will still be supported, but it will not take 
a RowProcessor as an argument. Users can still specify custom rows to lock, and 
those rows will be locked by batchMutate(). For customized processing, users 
can write their own coprocessor. What do you think?

> Inconsistent behavior for preBatchMutate in doMiniBatchMutate and 
> processRowsWithLocks
> --------------------------------------------------------------------------------------
>
>                 Key: HBASE-18703
>                 URL: https://issues.apache.org/jira/browse/HBASE-18703
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Coprocessors
>            Reporter: Duo Zhang
>            Assignee: Umesh Agashe
>            Priority: Critical
>             Fix For: 2.0.0-alpha-4
>
>         Attachments: hbase-18703.master.001.patch, 
> hbase-18703.master.002.patch, hbase-18703.master.003.patch, 
> hbase-18703.master.004.patch, hbase-18703.master.005.patch, 
> hbase-18703.master.005.patch
>
>
> In doMiniBatchMutate, the preBatchMutate is called before building WAL, but 
> in processRowsWithLocks, we suggest the RowProcessor implementation to build 
> WAL in process  method, which is ahead of preBatchMutate.
> If a CP modifies the mutations, especially if it removes some cells from the 
> mutations, then the behavior of processRowsWithLocks is broken. The changes 
> applied to memstore and WAL will be different. And there is no way to remove 
> entries from a WALEdit through CP. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
