nkeywal created HBASE-6295: ------------------------------ Summary: Possible performance improvement in client batch operations: presplit and send in background Key: HBASE-6295 URL: https://issues.apache.org/jira/browse/HBASE-6295 Project: HBase Issue Type: Improvement Components: client Affects Versions: 0.96.0 Reporter: nkeywal
today batch algo is: {noformat} for Operation o: List<Op>{ add o to todolist if todolist > maxsize or o last in list split todolist per location send split lists to region servers clear todolist wait } {noformat} We could: - create immediately the final object instead of an intermediate array - split per location immediately - instead of sending when the list as a whole is full, send it when there is enough data for a single location It would be: {noformat} for Operation o: List<Op>{ get location add o to todo location.todolist if (location.todolist > maxLocationSize) send location.todolist to region server clear location.todolist // don't wait, continue the loop } send remaining wait {noformat} It's not trivial to write if you add error management: retried list must be shared with the operations added in the todolist. But it's doable. It's interesting mainly for 'big' writes -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira