[
https://issues.apache.org/jira/browse/HBASE-10999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010679#comment-14010679
]
cuijianwei commented on HBASE-10999:
------------------------------------
[~stack], thanks for your suggestion, and sorry for the late reply. The write
performance of themis in the current version is not good enough. In recent
weeks, we optimized multi-row transactions by running prewrite/commit
concurrently, which improved their performance significantly. We are now
trying to optimize single-row transactions and will update the performance
report; then we will post a note on the dev list with the new results :)
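
For readers unfamiliar with the idea, concurrent prewrite can be sketched
roughly as follows. This is only an illustration, not the themis code: the
names prewrite_row and concurrent_prewrite, the in-memory locks dict, and the
rollback policy are all assumptions made for the example.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

_guard = threading.Lock()  # stands in for the store's single-row atomicity

def prewrite_row(row, value, start_ts, locks):
    """Hypothetical per-row prewrite: take the row lock unless it is already held."""
    with _guard:
        if row in locks:
            return False
        locks[row] = start_ts
        return True

def concurrent_prewrite(mutations, start_ts, locks, max_workers=4):
    """Issue one prewrite per row in parallel; succeed only if every row locks."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(prewrite_row, row, value, start_ts, locks)
                   for row, value in mutations.items()]
        results = [f.result() for f in futures]
    if all(results):
        return True
    # Roll back only the locks this transaction managed to acquire.
    with _guard:
        for row in mutations:
            if locks.get(row) == start_ts:
                del locks[row]
    return False
```

The point of the sketch is that the per-row prewrites are independent, so
their latencies overlap instead of adding up; the commit phase could be
parallelized in the same way.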
> Cross-row Transaction : Implement Percolator Algorithm on HBase
> ---------------------------------------------------------------
>
> Key: HBASE-10999
> URL: https://issues.apache.org/jira/browse/HBASE-10999
> Project: HBase
> Issue Type: New Feature
> Components: Transactions/MVCC
> Affects Versions: 0.99.0
> Reporter: cuijianwei
> Assignee: cuijianwei
>
> Cross-row transactions are a desirable feature for a database. It is not easy
> to keep the ACID properties of cross-row transactions in distributed
> databases such as HBase, because the data of a cross-row transaction might be
> located on different machines. In the paper
> http://research.google.com/pubs/pub36726.html, Google presents an algorithm
> (named percolator) to implement cross-row transactions on BigTable. After
> analyzing the algorithm, we found that percolator might also be a choice for
> providing cross-row transactions on HBase. The reasons include:
> 1. Percolator could keep the ACID properties of cross-row transactions as
> described in Google's paper. Percolator depends on a Global Incremental
> Timestamp Service to define the order of transactions, which is important for
> keeping transactions ACID.
> 2. The percolator algorithm could be implemented entirely on the client side.
> This means we do not need to change the server-side logic. Users could easily
> include percolator in their client and adopt percolator APIs only when they
> want cross-row transactions.
> 3. Percolator is a general algorithm which could be implemented on top of any
> database providing single-row transactions. Therefore, it is feasible to
> implement percolator on HBase.
> In the last few months, we have implemented percolator on HBase, validated
> its correctness, tested its performance and finally applied the algorithm
> successfully in our production environment. Our work includes:
> 1. percolator algorithm implementation on HBase. The current implementation
> includes:
> a). a Transaction module that provides put/delete/get/scan interfaces to do
> cross-row/cross-table transactions.
> b). a Global Incremental Timestamp Server to provide globally
> monotonically increasing timestamps for transactions.
> c). a LockCleaner module to resolve conflicts when concurrent transactions
> mutate the same column.
> d). an internal module to implement the prewrite/commit/get/scan logic of
> percolator.
> Although the percolator logic could be implemented entirely on the client
> side, we use the coprocessor framework of HBase in our implementation. This
> is because coprocessors can provide percolator-specific RPC interfaces such
> as prewrite/commit to reduce RPC round trips and improve efficiency. Another
> reason to use coprocessors is that we want to decouple percolator's code from
> HBase so that users will get clean HBase code if they don't need cross-row
> transactions. In the future, we will also explore the concurrency of
> coprocessors to do cross-row mutations more efficiently.
> 2. an AccountTransfer simulation program to validate the correctness of the
> implementation. This program distributes initial values across different
> tables, rows and columns in HBase. Each column represents an account. Then, a
> configured number of client threads are started concurrently to read a number
> of account values from different tables and rows by percolator's get; after
> this, clients randomly transfer values among these accounts while
> keeping the sum unchanged, which simulates concurrent cross-table/cross-row
> transactions. To check the correctness of transactions, a checker thread
> periodically scans account values from all columns and makes sure the current
> total value is the same as the initial total value. We ran this validation
> program during development, which helped us correct errors in the
> implementation.
> 3. performance evaluation under various test situations. We compared
> percolator's APIs with HBase's under different data sizes and client thread
> counts for single-column transactions, which represent the worst performance
> case for percolator. The comparison results are:
> a) For read, the performance of percolator is 90% of HBase's;
> b) For write, the performance of percolator is 23% of HBase's.
> The drop derives from the overhead of the percolator logic; the test result
> is similar to the result reported in Google's paper.
> 4. Performance improvement. Relative to HBase, percolator's write performance
> drops more than its read performance. This is because a percolator write
> needs to read data out to check for write conflicts and needs two RPCs, which
> do the prewrite and the commit respectively. We are investigating ways to
> improve the write performance.
> We are glad to share the current percolator implementation and hope it could
> provide a choice for users who want cross-row transactions, because it does
> not need to change the code and logic of the original HBase. Comments and
> discussions are welcome.
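
To make the prewrite/commit flow and the AccountTransfer check described above
more concrete, here is a minimal single-threaded sketch in Python. It is not
the implementation discussed in this issue: the class names (TimestampOracle,
ToyStore, Transaction), the in-memory dicts standing in for the data, lock and
write columns, and the choice to raise ConflictError instead of waiting or
cleaning locks are all assumptions made for illustration. A real
implementation would rely on HBase's single-row atomicity for each step.

```python
import itertools

class ConflictError(Exception):
    """Raised when a transaction hits a lock or a newer committed write."""

class TimestampOracle:
    """Toy stand-in for the Global Incremental Timestamp Server."""
    def __init__(self):
        self._counter = itertools.count(1)
    def next(self):
        return next(self._counter)

class ToyStore:
    """In-memory stand-in for the data, lock and write columns of each cell."""
    def __init__(self):
        self.data = {}   # (cell, start_ts) -> value
        self.lock = {}   # cell -> (start_ts, primary_cell)
        self.write = {}  # (cell, commit_ts) -> start_ts of the committed data

class Transaction:
    def __init__(self, store, oracle):
        self.store, self.oracle = store, oracle
        self.start_ts = oracle.next()
        self.writes = {}  # buffered mutations, applied at commit time

    def get(self, cell):
        s = self.store
        # A lock at or before our snapshot may hide a pending commit. A real
        # client would wait or clean a stale lock; this sketch just fails.
        if cell in s.lock and s.lock[cell][0] <= self.start_ts:
            raise ConflictError(cell)
        commits = [c for (k, c) in s.write if k == cell and c <= self.start_ts]
        if not commits:
            return None
        return s.data[(cell, s.write[(cell, max(commits))])]

    def put(self, cell, value):
        self.writes[cell] = value

    def _prewrite(self, cell, value, primary):
        s = self.store
        if cell in s.lock:
            raise ConflictError(cell)  # another transaction holds the lock
        if any(k == cell and c >= self.start_ts for (k, c) in s.write):
            raise ConflictError(cell)  # someone committed after our snapshot
        s.data[(cell, self.start_ts)] = value
        s.lock[cell] = (self.start_ts, primary)

    def commit(self):
        cells = list(self.writes)
        if not cells:
            return
        primary = cells[0]
        try:
            for cell in cells:
                self._prewrite(cell, self.writes[cell], primary)
        except ConflictError:
            # Undo only the prewrites that belong to this transaction.
            for cell in cells:
                if self.store.lock.get(cell, (None,))[0] == self.start_ts:
                    del self.store.lock[cell]
                    self.store.data.pop((cell, self.start_ts), None)
            raise
        commit_ts = self.oracle.next()
        for cell in cells:  # the primary is first in the list
            self.store.write[(cell, commit_ts)] = self.start_ts
            del self.store.lock[cell]

def transfer(store, oracle, src, dst, amount):
    """Move `amount` between two accounts in one cross-row transaction."""
    txn = Transaction(store, oracle)
    txn.put(src, txn.get(src) - amount)
    txn.put(dst, txn.get(dst) + amount)
    txn.commit()

def total(store, oracle, cells):
    """Checker: read all accounts in one snapshot and sum them."""
    txn = Transaction(store, oracle)
    return sum(txn.get(c) for c in cells)

# Usage: three accounts spread over two hypothetical tables.
oracle, store = TimestampOracle(), ToyStore()
accounts = [("t1", "r1", "acct"), ("t1", "r2", "acct"), ("t2", "r1", "acct")]
init = Transaction(store, oracle)
for account in accounts:
    init.put(account, 100)
init.commit()
transfer(store, oracle, accounts[0], accounts[2], 30)
assert total(store, oracle, accounts) == 300  # the sum is preserved
```

The write path shows why a write costs more than a read: each cell is first
read to detect conflicts, then touched twice (prewrite and commit). The sum
check at the end mirrors what the checker thread does in the AccountTransfer
program.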