[
https://issues.apache.org/jira/browse/HBASE-17408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15796238#comment-15796238
]
Enis Soztutar commented on HBASE-17408:
---------------------------------------
Ted, giving more context definitely helps others to follow along.
I have seen applications sending very big "batches" in two different settings.
In the first instance, the application was sending multi-get requests, but each
RPC would end up accumulating about 5K get requests (because the application
would call HTable.get(List) with default settings and a list of that size). The
server would have to buffer all the responses, causing an OOM on the server
side.
In the second instance (which is the same reason Ted logged this issue), the
application was doing multi requests with Increment. In this case, it was doing
15K increments in a single RPC, which would take longer than the RPC timeout.
The following RPCs would then get ~15K exceptions due to nonce collisions
(because the previous RPC does in fact succeed regardless of the timeout on the
client side). The resulting RPC response is multi-GB in size, causing OOMs.
I have not checked the recent status of the AP, especially with HBASE-16224,
but even with that, we should have both a heap size limit and a limit on the
number of actions per batch. With default settings, calling HTable.get(List)
with 10K gets in the list should not cause an OOM on the server side (same with
HTable.batch()). If we already have limits for these (sorry, I did not check
yet), then we should at least lower the default values.
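Until such limits exist in the client, an application can cap its own batches.
The following is only a minimal sketch of that idea, not HBase code:
BatchChunker, chunk(), maxActions, maxBytes and sizeOf are hypothetical names
made up for illustration, and the size estimate is whatever the caller
supplies. It splits a list of actions so that each chunk respects both an
action-count limit and an estimated-size limit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToLongFunction;

/** Hypothetical application-side helper; not part of the HBase client API. */
final class BatchChunker {
    private BatchChunker() {}

    /**
     * Splits actions into consecutive chunks so that each chunk holds at most
     * maxActions items and at most maxBytes of estimated size. A single item
     * whose estimated size exceeds maxBytes still gets a chunk of its own.
     */
    static <T> List<List<T>> chunk(List<T> actions, int maxActions, long maxBytes,
                                   ToLongFunction<? super T> sizeOf) {
        if (maxActions <= 0 || maxBytes <= 0) {
            throw new IllegalArgumentException("limits must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        List<T> current = new ArrayList<>();
        long currentBytes = 0;
        for (T action : actions) {
            long size = sizeOf.applyAsLong(action);
            // Close the current chunk if adding this action would break a limit.
            boolean full = current.size() >= maxActions
                || (!current.isEmpty() && currentBytes + size > maxBytes);
            if (full) {
                chunks.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(action);
            currentBytes += size;
        }
        if (!current.isEmpty()) {
            chunks.add(current);
        }
        return chunks;
    }
}
```

The application would then issue one HTable.get(List) or HTable.batch() call
per chunk instead of one call for the whole list. Smaller chunks are also more
likely to finish within the RPC timeout, although chunking by itself does not
change what happens when a timed-out Increment batch is retried.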
> Introduce per request limit by number of mutations
> --------------------------------------------------
>
> Key: HBASE-17408
> URL: https://issues.apache.org/jira/browse/HBASE-17408
> Project: HBase
> Issue Type: Improvement
> Reporter: Ted Yu
>
> HBASE-16224 introduced hbase.client.max.perrequest.heapsize to limit the
> amount of data sent from client.
> We should consider adding per request limit through the number of mutations
> in a batch.
> In recent troubleshooting sessions, a customer had to do this in their
> application code to avoid OOME on the server side.