[
https://issues.apache.org/jira/browse/HBASE-19715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16314915#comment-16314915
]
Chia-Ping Tsai commented on HBASE-19715:
----------------------------------------
Seems the cost of retry is prohibitive. I ran
TestMultiRespectsLimits#testMultiLimits with jmc. The observation, which
differs from yours, is that building the exception creates lots of char
arrays. The {{sizeIOE}} doesn't change after it is initialized, so we can cache
the {{pair}} to avoid building the proto exception repeatedly.
{code:title=RSRpcServices#doNonAtomicRegionMutation}
// We're storing the exception since the exception and reason string won't
// change after the response size limit is reached.
if (sizeIOE == null) {
  // We don't need the stack un-winding so don't throw the exception.
  // Throwing will kill the JVM's JIT.
  //
  // Instead just create the exception and then store it.
  sizeIOE = new MultiActionResultTooLarge("Max size exceeded"
      + " CellSize: " + context.getResponseCellSize()
      + " BlockSize: " + context.getResponseBlockSize());
  // Only report the exception once since there's only one request that
  // caused the exception. Otherwise this number will dominate the
  // exceptions count.
  rpcServer.getMetrics().exception(sizeIOE);
}
// Now that an exception is known to be created
// use it for the response.
//
// This will create a copy in the builder.
hasResultOrException = true;
NameBytesPair pair = ResponseConverter.buildException(sizeIOE);
{code}
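A minimal, self-contained sketch of the caching idea, not the actual patch. The {{Pair}} class and the {{buildException}} method below are hypothetical stand-ins for {{NameBytesPair}} and {{ResponseConverter.buildException}}, and the loop assumes every action hits the size limit. It only shows that the pair is built once and then reused for each remaining action.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class CachedExceptionPairSketch {

    /** Hypothetical stand-in for the protobuf NameBytesPair. */
    public static final class Pair {
        public final String name;
        public final String value;

        Pair(String name, String value) {
            this.name = name;
            this.value = value;
        }
    }

    /** Counts how many times the (assumed expensive) conversion runs. */
    public static int buildCount = 0;

    /** Hypothetical stand-in for ResponseConverter.buildException. */
    static Pair buildException(Exception e) {
        buildCount++;
        return new Pair(e.getClass().getName(), String.valueOf(e.getMessage()));
    }

    /**
     * Simulates answering 'actions' actions after the size limit is reached.
     * The exception and its converted pair are created once, then reused.
     */
    public static List<Pair> respond(int actions) {
        IOException sizeIOE = null;
        Pair cachedPair = null;
        List<Pair> results = new ArrayList<>();
        for (int i = 0; i < actions; i++) {
            if (sizeIOE == null) {
                // Create the exception without throwing it, as in the quoted code.
                sizeIOE = new IOException("Max size exceeded");
                // Convert once; the exception never changes afterwards.
                cachedPair = buildException(sizeIOE);
            }
            results.add(cachedPair);
        }
        return results;
    }

    public static void main(String[] args) {
        List<Pair> results = respond(1000);
        System.out.println(results.size() + " results, " + buildCount + " build(s)");
    }
}
```

Whether sharing one pair instance across results is safe in the real code depends on how the response builder copies it, which this sketch does not model.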
> Fix timing out test TestMultiRespectsLimits
> -------------------------------------------
>
> Key: HBASE-19715
> URL: https://issues.apache.org/jira/browse/HBASE-19715
> Project: HBase
> Issue Type: Bug
> Reporter: Appy
> Assignee: Appy
> Attachments: failued.txt, passed.txt, screenshot-1.png,
> screenshot-2.png, screenshot-3.png, screenshot-4.png
>
>
> !screenshot-1.png|width=800px!
> Attached logs for both cases, when it passes and fails.
> Link (temporary) to logs:
> passed:
> http://104.198.223.121:8080/job/HBase-Flaky-Tests/33449/artifact/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.client.TestMultiRespectsLimits-output.txt/*view*/
> failed:
> http://104.198.223.121:8080/job/HBase-Flaky-Tests/33455/artifact/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.client.TestMultiRespectsLimits-output.txt/*view*/
> Correlating across more runs, whenever the test passes, it does so within
> 10-30sec of the 3min deadline for medium tests.
> So I think we can make it pass by just increasing the timeout.
> But I'm a bit skeptical after seeing all those long GC pauses (10sec +) in
> the log. The test code doesn't seem to be doing anything that intensive. Are
> we mismanaging the memory somewhere?