Jason Bray created HBASE-8762:
---------------------------------
Summary: Performance/operational penalty when calling HTable.get
with a list of one Get
Key: HBASE-8762
URL: https://issues.apache.org/jira/browse/HBASE-8762
Project: HBase
Issue Type: Bug
Components: Client
Reporter: Jason Bray
Priority: Minor
Calling HTable.get with a list containing a single Get has two consequences:
1. The overhead of processBatch is paid unnecessarily, and that overhead is
not insignificant.
2. The request shows up as a 'multi' when reviewing RPC handlers, when it
should appear as a single Get. It likely shows up as a multi in other places
in the logs and UI as well.
To give some context for the overhead, here are some timings collected by a
member of our team.
In a very simple test, we read the same key 100 times, recorded the elapsed
time, and repeated this 10 times (1000 gets in total). The first iteration is
excluded because it included considerable warm-up time while the JVM
established its HBase connections. The times are as follows:
||Iteration||Batch (in ms)||Single (in ms)||
|1|2255|815|
|2|1545|823|
|3|1427|742|
|4|1451|721|
|5|1480|775|
|6|1379|735|
|7|1657|775|
|8|1392|804|
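
To put the table in perspective, the averages can be worked out directly from the rows above. The sketch below only does that arithmetic; the class and method names are made up for illustration, and the per-get figure assumes each row covers 100 gets, as described in the test setup. On these numbers the batch path averages about 1573 ms per 100 gets against about 774 ms for the single path, roughly a factor of two.

```java
public class GetTimingSummary {
    // Timings in ms per iteration of 100 gets, copied from the table above.
    static final int[] BATCH_MS  = {2255, 1545, 1427, 1451, 1480, 1379, 1657, 1392};
    static final int[] SINGLE_MS = {815, 823, 742, 721, 775, 735, 775, 804};

    // Arithmetic mean of the given timings.
    static double mean(int[] values) {
        long sum = 0;
        for (int v : values) {
            sum += v;
        }
        return (double) sum / values.length;
    }

    public static void main(String[] args) {
        double batch = mean(BATCH_MS);
        double single = mean(SINGLE_MS);
        System.out.println("batch mean (ms per 100 gets):  " + batch);
        System.out.println("single mean (ms per 100 gets): " + single);
        // Assumes 100 gets per iteration, as stated in the test description.
        System.out.println("batch ms per get:  " + batch / 100);
        System.out.println("single ms per get: " + single / 100);
        System.out.println("batch/single ratio: " + batch / single);
    }
}
```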
While I can see the argument that callers should use the single-Get method
signature, the cost is somewhat surprising, and it is very easy to be smart
in this case: HTable.get(List<Get>) simply needs to delegate to
HTable.get(Get) when the list contains exactly one Get.
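
A minimal sketch of the proposed delegation follows. It is not the HTable source: Get, Result, Table, and the path field are hypothetical stand-ins so the example runs without an HBase cluster, and the processBatch stub only marks which code path was taken. An actual patch would need to fit HTable's real signatures and exception handling.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SingleGetDelegation {
    // Hypothetical stand-in for HBase's Get: carries only a row key.
    static final class Get {
        final String row;
        Get(String row) { this.row = row; }
    }

    // Hypothetical stand-in for HBase's Result: records which path served it.
    static final class Result {
        final String row;
        final String path;
        Result(String row, String path) { this.row = row; this.path = path; }
    }

    // Hypothetical stand-in for HTable, showing only the two get overloads.
    static class Table {
        // Single-Get path: one plain get request.
        Result get(Get get) {
            return new Result(get.row, "single");
        }

        // List path: delegate to the single-Get overload when there is
        // exactly one Get, otherwise fall through to the batch machinery.
        Result[] get(List<Get> gets) {
            if (gets.size() == 1) {
                return new Result[] { get(gets.get(0)) };
            }
            return processBatch(gets);
        }

        // Stub for the batch path, which would surface as a 'multi' request.
        Result[] processBatch(List<Get> gets) {
            List<Result> results = new ArrayList<>();
            for (Get g : gets) {
                results.add(new Result(g.row, "multi"));
            }
            return results.toArray(new Result[0]);
        }
    }

    public static void main(String[] args) {
        Table table = new Table();
        Result[] one = table.get(Collections.singletonList(new Get("row1")));
        Result[] two = table.get(Arrays.asList(new Get("row1"), new Get("row2")));
        System.out.println("one Get served via:  " + one[0].path);
        System.out.println("two Gets served via: " + two[0].path);
    }
}
```

The size check keeps the behavior for empty and multi-element lists unchanged, so only the one-element case takes the cheaper route.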