cuijianwei created HBASE-10999:
----------------------------------

             Summary: Cross-row Transaction : Implement Percolator Algorithm on 
HBase
                 Key: HBASE-10999
                 URL: https://issues.apache.org/jira/browse/HBASE-10999
             Project: HBase
          Issue Type: New Feature
          Components: Transactions/MVCC
    Affects Versions: 0.94.18
            Reporter: cuijianwei


Cross-row transactions are a desirable feature for a database. It is not easy 
to preserve the ACID properties of cross-row transactions in distributed 
databases such as HBase, because the data touched by one transaction may reside 
on different machines. In the paper http://research.google.com/pubs/pub36726.html, 
Google presents an algorithm (named Percolator) that implements cross-row 
transactions on BigTable. After analyzing the algorithm, we found that 
Percolator could also be used to provide cross-row transactions on HBase. The 
reasons include:
1. Percolator preserves the ACID properties of cross-row transactions, as 
described in Google's paper. It depends on a Global Incremental Timestamp 
Service to define the order of transactions, which is essential to those 
guarantees.
2. The Percolator algorithm can be implemented entirely on the client side, so 
the server-side logic does not need to change. Users can include Percolator in 
their client and use its APIs only when they need cross-row transactions.
3. Percolator is a general algorithm that can be built on any database that 
provides single-row transactions. HBase provides them, so it is feasible to 
implement Percolator on HBase.
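
To make reasons 1 and 3 concrete, here is a minimal in-memory sketch of 
the prewrite/commit protocol from Google's paper, written in Python. It is 
not our HBase implementation. All names (TimestampOracle, Store, Cell, 
Transaction) are hypothetical, a per-cell mutex stands in for a single-row 
transaction, and lock cleanup for failed clients is left out.

```python
import itertools
import threading


class TimestampOracle:
    """Stand-in for the global timestamp service: strictly increasing integers."""

    def __init__(self):
        self._counter = itertools.count(1)
        self._mutex = threading.Lock()

    def next_ts(self):
        with self._mutex:
            return next(self._counter)


class Cell:
    """One column of one row, with the three sub-columns Percolator keeps."""

    def __init__(self):
        self.data = {}   # start_ts -> value (possibly uncommitted)
        self.lock = {}   # start_ts -> key of the primary cell
        self.write = {}  # commit_ts -> start_ts of the committed data
        self.mutex = threading.Lock()  # models a single-row transaction


class Store:
    def __init__(self):
        self._cells = {}
        self._mutex = threading.Lock()

    def cell(self, key):
        with self._mutex:
            return self._cells.setdefault(key, Cell())


class Transaction:
    def __init__(self, store, oracle):
        self.store, self.oracle = store, oracle
        self.start_ts = oracle.next_ts()
        self.buffered = {}  # writes are buffered until commit()

    def put(self, key, value):
        self.buffered[key] = value

    def get(self, key):
        cell = self.store.cell(key)
        with cell.mutex:
            # A lock at or below our snapshot means a writer may still commit there.
            if any(ts <= self.start_ts for ts in cell.lock):
                raise RuntimeError("cell is locked by a pending transaction")
            commits = [ts for ts in cell.write if ts <= self.start_ts]
            return cell.data[cell.write[max(commits)]] if commits else None

    def _prewrite(self, key, primary):
        cell = self.store.cell(key)
        with cell.mutex:
            # Conflict: someone holds a lock, or committed after we started.
            if cell.lock or any(ts >= self.start_ts for ts in cell.write):
                return False
            cell.data[self.start_ts] = self.buffered[key]
            cell.lock[self.start_ts] = primary
            return True

    def _undo(self, keys):
        for key in keys:
            cell = self.store.cell(key)
            with cell.mutex:
                cell.lock.pop(self.start_ts, None)
                cell.data.pop(self.start_ts, None)

    def commit(self):
        keys = list(self.buffered)
        if not keys:
            return True
        primary = keys[0]
        for i, key in enumerate(keys):  # phase 1: prewrite every cell
            if not self._prewrite(key, primary):
                self._undo(keys[:i])
                return False
        commit_ts = self.oracle.next_ts()
        for key in keys:  # phase 2: primary first, then secondaries
            cell = self.store.cell(key)
            with cell.mutex:
                if key == primary and self.start_ts not in cell.lock:
                    return False  # our lock was removed; the commit is void
                cell.write[commit_ts] = self.start_ts
                cell.lock.pop(self.start_ts, None)
        return True


# Two overlapping transactions write "bob"; only the first one commits.
oracle, store = TimestampOracle(), Store()
setup = Transaction(store, oracle)
setup.put("bob", 10)
setup.put("joe", 2)
setup_ok = setup.commit()

first = Transaction(store, oracle)
second = Transaction(store, oracle)
first.put("bob", 7)
first.put("joe", 5)
second.put("bob", 0)
first_ok = first.commit()
second_ok = second.commit()
```

In this sketch the second transaction fails at prewrite because a write 
with a commit timestamp later than its start timestamp already exists. 
That is the write-write conflict check the paper describes.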
Over the last few months, we have implemented Percolator on HBase, validated 
its correctness, tested its performance, and deployed it in our production 
environment. Our work includes:
1. An implementation of the Percolator algorithm on HBase. The current 
implementation includes:
    a) a Transaction module that provides put/delete/get/scan interfaces for 
cross-row/cross-table transactions.
    b) a Global Incremental Timestamp Server that provides globally, 
monotonically increasing timestamps for transactions.
    c) a LockCleaner module that resolves conflicts when concurrent 
transactions mutate the same column.
    d) an internal module that implements the prewrite/commit/get/scan logic 
of Percolator.
   Although the Percolator logic could be implemented entirely on the client 
side, our implementation uses the coprocessor framework of HBase. Coprocessors 
let us expose Percolator-specific RPC interfaces such as prewrite/commit, which 
reduces the number of RPC round trips and improves efficiency. Another reason 
to use coprocessors is that we want to decouple Percolator's code from HBase, 
so that users who do not need cross-row transactions get unmodified HBase code. 
In the future, we will also explore using the concurrent execution of 
coprocessors to perform cross-row mutations more efficiently.
2. An AccountTransfer simulation program that validates the correctness of the 
implementation. The program distributes initial values across different 
tables, rows and columns in HBase; each column represents an account. A 
configured number of client threads then run concurrently. Each thread reads a 
number of account values from different tables and rows using Percolator's get, 
then randomly transfers values among these accounts while keeping their sum 
unchanged, which simulates concurrent cross-table/cross-row transactions. To 
check the correctness of the transactions, a checker thread periodically scans 
the account values in all columns and verifies that the current total equals 
the initial total. We ran this validation program throughout development, and 
it helped us find and fix implementation errors.
3. Performance evaluation under various test conditions. We compared 
Percolator's APIs with HBase's at different data sizes and client thread counts 
for single-column transactions, which are the worst case for Percolator. The 
results are:
    a) For reads, the performance of Percolator is 90% of HBase's;
    b) For writes, the performance of Percolator is 23% of HBase's.
The drop comes from the overhead of the Percolator logic; the result is similar 
to the one reported in Google's paper.
4. Performance improvement. Relative to HBase, Percolator's write performance 
drops much more than its read performance. This is because a Percolator write 
must first read data to check for write conflicts and then needs two RPCs, one 
for prewriting and one for committing. We are investigating ways to improve 
write performance.
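
The AccountTransfer check from item 2 can be sketched in a few lines of 
Python. This is only an illustration of the invariant being tested, not 
our test program. The function name and parameters are hypothetical, and a 
single mutex stands in for a Percolator cross-row transaction so that the 
example runs without HBase.

```python
import random
import threading


def run_account_transfer(num_accounts=8, num_workers=4,
                         transfers_per_worker=500, initial=100, seed=0):
    """Run concurrent random transfers and record the totals a checker sees."""
    accounts = [initial] * num_accounts
    txn = threading.Lock()  # assumption: stands in for a cross-row transaction
    expected_total = initial * num_accounts
    observed_totals = []
    stop = threading.Event()

    def worker(worker_id):
        rng = random.Random(seed + worker_id)
        for _ in range(transfers_per_worker):
            src, dst = rng.sample(range(num_accounts), 2)
            with txn:  # read both accounts and write both atomically
                amount = rng.randint(0, accounts[src])
                accounts[src] -= amount
                accounts[dst] += amount

    def checker():
        while not stop.is_set():
            with txn:  # consistent scan of every account
                observed_totals.append(sum(accounts))
            stop.wait(0.001)

    checker_thread = threading.Thread(target=checker)
    workers = [threading.Thread(target=worker, args=(i,))
               for i in range(num_workers)]
    checker_thread.start()
    for thread in workers:
        thread.start()
    for thread in workers:
        thread.join()
    stop.set()
    checker_thread.join()
    observed_totals.append(sum(accounts))  # final check after all transfers
    return expected_total, observed_totals


expected, totals = run_account_transfer()
violations = [total for total in totals if total != expected]
```

If the transfers were not atomic, the checker could observe a total taken 
between the debit and the credit, and violations would be non-empty. The 
real program performs the same comparison against scans of HBase.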
We are glad to share the current Percolator implementation and hope it offers 
an option for users who want cross-row transactions, since it requires no 
change to the code or logic of the original HBase. Comments and discussion are 
welcome.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
