[
https://issues.apache.org/jira/browse/HBASE-18023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16057674#comment-16057674
]
Josh Elser commented on HBASE-18023:
------------------------------------
{code}
@@ -1363,6 +1363,10 @@ public final class HConstants {
public static final String DEFAULT_SNAPSHOT_RESTORE_FAILSAFE_NAME =
"hbase-failsafe-{snapshot.name}-{restore.timestamp}";
+ public static final String BATCH_ROWS_THRESHOLD_NAME =
"hbase.rpc.rows.warning.threshold";
+
+ public static final int BATCH_ROWS_THRESHOLD_DEFAULT = 1000;
{code}
We try to avoid placing constants into {{HConstants}} unless they're used in a
number of places. Can you move this to RSRpcServices, please?
{code}
+ private static LogDelegate DEFAULT_LOG_DELEGATE = new LogDelegate(){
+ @Override
+ public void logBatchWarning(int sum, int rowSizeWarnThreshold) {
+ LOG.warn("Large batch operation detected (greater than
"+rowSizeWarnThreshold+") (HBASE-18023)."
+ + " Requested Number of Rows: "+sum
+ + " Client:
"+RpcServer.getRequestUserName()+"/"+RpcServer.getRemoteAddress());
+ }};
{code}
Please wrap the {{LOG.warn(..)}} call in a {{if (LOG.isWarnEnabled())}}
conditional. This is a relatively "hot" code path -- we want to avoid the
intermediate string generation if we aren't going to log it.
Can you make sure the indentation is consistently to 2-spaces in your changes?
There are also a couple of long-lines as well which you could wrap which would
be great!
One final request, could you add a test that verifies the warning happens when
there is one RegionAction in the MultiRequest which exceeds the threshold?
Other than the above, I think this looks good. Just minor things at this point
:)
> Log multi-* requests for more than threshold number of rows
> -----------------------------------------------------------
>
> Key: HBASE-18023
> URL: https://issues.apache.org/jira/browse/HBASE-18023
> Project: HBase
> Issue Type: Improvement
> Components: regionserver
> Reporter: Clay B.
> Assignee: David Harju
> Priority: Minor
> Attachments: HBASE-18023.master.001.patch,
> HBASE-18023.master.002.patch
>
>
> Today, if a user happens to do something like a large multi-put, they can get
> through request throttling (e.g. it is one request) but still crash a region
> server with a garbage storm. We have seen regionservers hit this issue and it
> is silent and deadly. The RS will report nothing more than a mysterious
> garbage collection and exit out.
> Ideally, we could report a large multi-* request before starting it, in case
> it happens to be deadly. Knowing the client, user and how many rows are
> affected would be a good start to tracking down painful users.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)