Peter Vary created HIVE-21506:
---------------------------------

             Summary: Memory based TxnHandler implementation
                 Key: HIVE-21506
                 URL: https://issues.apache.org/jira/browse/HIVE-21506
             Project: Hive
          Issue Type: New Feature
          Components: Transactions
            Reporter: Peter Vary


The current TxnHandler implementations use the backend RDBMS to store all
Hive lock and transaction data, so multiple TxnHandler instances can run
simultaneously and serve requests. The continuous communication and locking
done on the RDBMS side puts serious load on the backend databases and also
restricts the achievable throughput.

If it is possible to have only a single active TxnHandler instance (with the
current design, a single HMS instance), then we can provide much better
performance by using only Java-based locking. We still have to store the
committed write transactions in the RDBMS (or later in some other persistent
storage), but other lock and transaction operations could remain memory only.
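
To make the idea more concrete, here is a minimal sketch of how such a handler
could look. It is not the existing TxnHandler API: the class
InMemoryTxnSketch, the CommitLog interface and all method names are
hypothetical, and the sketch assumes only exclusive, table-level locks. Open
transactions and locks live in concurrent maps, and only a committed
transaction that wrote something is handed to the persistent store.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of a memory-based handler; not the Hive TxnHandler API. */
public class InMemoryTxnSketch {

    /** Stand-in for the RDBMS: only committed write transactions reach it. */
    public interface CommitLog {
        void persist(long txnId, Set<String> writtenTables);
    }

    private final AtomicLong nextTxnId = new AtomicLong(1);
    // Open transaction id -> tables it holds an exclusive lock on (memory only).
    private final Map<Long, Set<String>> openTxns = new ConcurrentHashMap<>();
    // Table name -> id of the transaction holding the exclusive lock (memory only).
    private final Map<String, Long> exclusiveLocks = new ConcurrentHashMap<>();
    private final CommitLog commitLog;

    public InMemoryTxnSketch(CommitLog commitLog) {
        this.commitLog = commitLog;
    }

    /** Opens a transaction without touching the persistent store. */
    public long openTxn() {
        long id = nextTxnId.getAndIncrement();
        openTxns.put(id, ConcurrentHashMap.<String>newKeySet());
        return id;
    }

    /** Tries to take an exclusive lock on a table; returns false if another txn holds it. */
    public boolean tryLockExclusive(long txnId, String table) {
        Set<String> held = openTxns.get(txnId);
        if (held == null) {
            throw new IllegalStateException("Unknown or closed transaction: " + txnId);
        }
        Long owner = exclusiveLocks.putIfAbsent(table, txnId);
        if (owner == null || owner == txnId) {
            held.add(table);
            return true;
        }
        return false;
    }

    /** Commits: persists the transaction only if it locked (wrote) something. */
    public void commit(long txnId) {
        Set<String> held = openTxns.remove(txnId);
        if (held == null) {
            throw new IllegalStateException("Unknown or closed transaction: " + txnId);
        }
        if (!held.isEmpty()) {
            commitLog.persist(txnId, new HashSet<>(held));
        }
        release(txnId, held);
    }

    /** Aborts: nothing is persisted, the locks are simply dropped. */
    public void abort(long txnId) {
        Set<String> held = openTxns.remove(txnId);
        if (held != null) {
            release(txnId, held);
        }
    }

    private void release(long txnId, Set<String> held) {
        for (String table : held) {
            // Remove the lock only if this transaction still owns it.
            exclusiveLocks.remove(table, txnId);
        }
    }

    public static void main(String[] args) {
        List<Long> persisted = new ArrayList<>();
        InMemoryTxnSketch handler =
            new InMemoryTxnSketch((id, tables) -> persisted.add(id));
        long writer = handler.openTxn();
        long reader = handler.openTxn();
        handler.tryLockExclusive(writer, "db.tbl");
        handler.commit(writer);   // written to the persistent store
        handler.commit(reader);   // held no locks, stays memory only
        System.out.println("Persisted transactions: " + persisted);
    }
}
```

In this sketch a failure of the process loses the two maps, so every open
transaction is implicitly terminated, while the committed write transactions
remain available from the persistent store.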

The most important drawbacks of this solution are that we definitely lose
scalability once a single TxnHandler instance is no longer able to serve the
requests (see NameNode), and fault tolerance, in the sense that the ongoing
transactions have to be terminated when the TxnHandler fails. If these
drawbacks are acceptable in certain situations, then we can provide better
throughput for the users.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
