[ 
https://issues.apache.org/jira/browse/HBASE-19715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16314915#comment-16314915
 ] 

Chia-Ping Tsai commented on HBASE-19715:
----------------------------------------

It seems the cost of retrying is prohibitive. I ran 
TestMultiRespectsLimits#testMultiLimits with jmc. My observation, which 
differs from yours, is that building the exception creates lots of char 
arrays. {{sizeIOE}} doesn't change after it is initialized, so we can cache the 
{{pair}} to avoid building the proto exception repeatedly.
{code:title=RsRpcService#doNonAtomicRegionMutation}
          // We're storing the exception since the exception and reason string won't
          // change after the response size limit is reached.
          if (sizeIOE == null ) {
            // We don't need the stack un-winding so don't throw the exception.
            // Throwing will kill the JVM's JIT.
            //
            // Instead just create the exception and then store it.
            sizeIOE = new MultiActionResultTooLarge("Max size exceeded"
                + " CellSize: " + context.getResponseCellSize()
                + " BlockSize: " + context.getResponseBlockSize());

            // Only report the exception once since there's only one request that
            // caused the exception. Otherwise this number will dominate the
            // exceptions count.
            rpcServer.getMetrics().exception(sizeIOE);
          }

          // Now that an exception is known to have been created,
          // use it for the response.
          //
          // This will create a copy in the builder.
          hasResultOrException = true;
          NameBytesPair pair = ResponseConverter.buildException(sizeIOE);
{code}
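
A rough, self-contained sketch of the caching idea, not a patch against the real class. {{Pair}}, {{buildException}} and {{respond}} here are hypothetical stand-ins (the real code uses {{NameBytesPair}} and {{ResponseConverter.buildException}}), and the stand-in conversion simply stringifies the stack trace to mimic string-heavy work. The point is only that the converted pair is built once, next to {{sizeIOE}}, and then reused for every remaining action.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class CachedExceptionPairSketch {

  /** Hypothetical stand-in for the protobuf NameBytesPair; not the HBase class. */
  static final class Pair {
    final String name;
    final byte[] value;

    Pair(String name, byte[] value) {
      this.name = name;
      this.value = value;
    }
  }

  /** Counts how many times the expensive conversion runs. */
  static int buildCount = 0;

  /** Stand-in conversion (assumption): stringifying the stack trace allocates many char arrays. */
  static Pair buildException(Throwable t) {
    buildCount++;
    StringWriter sw = new StringWriter();
    t.printStackTrace(new PrintWriter(sw, true));
    return new Pair(t.getClass().getName(), sw.toString().getBytes(StandardCharsets.UTF_8));
  }

  /** Processes {@code actions} actions; every action past {@code limit} gets the size exception. */
  static List<Pair> respond(int actions, int limit) {
    List<Pair> results = new ArrayList<>();
    IOException sizeIOE = null;
    Pair sizePair = null; // cached alongside sizeIOE, since neither changes once set
    for (int i = 0; i < actions; i++) {
      if (i >= limit) {
        if (sizeIOE == null) {
          sizeIOE = new IOException("Max size exceeded");
          sizePair = buildException(sizeIOE); // converted exactly once
        }
        results.add(sizePair); // reuse the cached pair instead of rebuilding it
        continue;
      }
      results.add(new Pair("ok", new byte[0]));
    }
    return results;
  }

  public static void main(String[] args) {
    List<Pair> results = respond(1000, 10);
    System.out.println("results=" + results.size() + " conversions=" + buildCount);
  }
}
```

With 1000 actions and a limit of 10, the conversion runs once instead of 990 times. Sharing one pair should be safe in the real code only if the response builder copies it, which the existing comment ("This will create a copy in the builder") suggests.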

> Fix timing out test TestMultiRespectsLimits
> -------------------------------------------
>
>                 Key: HBASE-19715
>                 URL: https://issues.apache.org/jira/browse/HBASE-19715
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Appy
>            Assignee: Appy
>         Attachments: failued.txt, passed.txt, screenshot-1.png, 
> screenshot-2.png, screenshot-3.png, screenshot-4.png
>
>
> !screenshot-1.png|width=800px!
> Attached logs for both cases, when it passes and fails.
> Link (temporary) to logs:
> passed: 
> http://104.198.223.121:8080/job/HBase-Flaky-Tests/33449/artifact/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.client.TestMultiRespectsLimits-output.txt/*view*/
> failed: 
> http://104.198.223.121:8080/job/HBase-Flaky-Tests/33455/artifact/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.client.TestMultiRespectsLimits-output.txt/*view*/
> Correlating across more runs, whenever the test passes, it does so within 
> 10-30sec of the 3min deadline for medium tests.
> So I think we can make it pass by just increasing the timeout.
> But I'm a bit skeptical after seeing all those long GC pauses (10sec +) in 
> the log. Test code doesn't seem to be doing anything that intensive. Are we 
> mismanaging the memory somewhere? 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
