[ 
https://issues.apache.org/jira/browse/HBASE-23977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17061438#comment-17061438
 ] 

Viraj Jasani commented on HBASE-23977:
--------------------------------------

[~stack] [~ndimiduk] sorry for one more quick follow-up while you are both 
quite busy.

The patch makes the test wait until the ring buffer's consumer has processed 
the events in order, rather than just sleeping for a fixed amount of time.

For instance, if we send \{1,2,3,4,5,6,7} to a RingBuffer of size 8, all of 
them are consumed in the same order. If we then send \{8,9,10,11}, we expect 
the final output from the RingBuffer consumer to be \{9,10,11,4,5,6,7,8} 
(1, 2 and 3 are overwritten). The test flakes because, at the moment we check 
for that output, the consumer might not yet have consumed, say, 10 and 11, so 
the actual output is \{9,2,3,4,5,6,7,8}. With waitFor(), we now wait until we 
see the ordered output \{9,10,11,4,5,6,7,8}, that is, until all of 8, 9, 10 
and 11 have been consumed and put in the queue in the correct order, so the 
test no longer flakes.
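
To make the example above concrete, here is a minimal standalone sketch. It is 
not the actual patch: the class RingBufferWaitSketch, its consume() and 
snapshot() methods and the waitFor() helper are hypothetical stand-ins, not the 
HBase Waiter or the real RingBuffer API, and it submits all eleven events at 
once instead of in two batches. The 100 ms polling interval is also an 
assumption. It only shows the idea of polling until the consumer has produced 
the expected ordered contents instead of sleeping for a fixed time.

```java
import java.util.Arrays;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical sketch: a fixed-size buffer filled by a slow consumer thread.
public class RingBufferWaitSketch {
  static final int SIZE = 8;

  // Slot i holds the latest consumed event whose sequence number maps to i.
  private final int[] slots = new int[SIZE];
  private long consumed = 0;

  // Called by the consumer thread; older entries are overwritten on wrap-around.
  synchronized void consume(int event) {
    slots[(int) (consumed % SIZE)] = event;
    consumed++;
  }

  synchronized int[] snapshot() {
    return slots.clone();
  }

  // Polls the condition until it holds or the timeout expires.
  static boolean waitFor(long timeoutMs, long intervalMs, BooleanSupplier condition)
      throws InterruptedException {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
    while (System.nanoTime() < deadline) {
      if (condition.getAsBoolean()) {
        return true;
      }
      Thread.sleep(intervalMs);
    }
    return condition.getAsBoolean();
  }

  static int[] runScenario() throws InterruptedException {
    RingBufferWaitSketch buffer = new RingBufferWaitSketch();
    // A single consumer thread keeps the events in publication order.
    ExecutorService consumer = Executors.newSingleThreadExecutor();
    try {
      for (int i = 1; i <= 11; i++) {
        final int event = i;
        consumer.submit(() -> {
          try {
            Thread.sleep(5); // simulate a consumer that lags behind the test
          } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
          }
          buffer.consume(event);
        });
      }
      int[] expected = {9, 10, 11, 4, 5, 6, 7, 8};
      // Asserting right here could still observe a partial state; wait instead.
      boolean ok = waitFor(3000, 100, () -> Arrays.equals(expected, buffer.snapshot()));
      if (!ok) {
        throw new IllegalStateException("consumer did not catch up in time");
      }
      return buffer.snapshot();
    } finally {
      consumer.shutdownNow();
    }
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println(Arrays.toString(runScenario()));
  }
}
```

Checking the buffer immediately after the submit loop, without the waitFor() 
call, would reproduce the kind of partial state described above.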

> [Flakey Tests]  
> TestSlowLogRecorder.testOnlieSlowLogConsumption:178->confirmPayloadParams:97 
> expected:<client_1[0]> but was:<client_1[4]>
> -----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-23977
>                 URL: https://issues.apache.org/jira/browse/HBASE-23977
>             Project: HBase
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 3.0.0, 2.3.0
>            Reporter: Michael Stack
>            Assignee: Viraj Jasani
>            Priority: Major
>             Fix For: 3.0.0, 2.3.0
>
>
> I see this occasionally on a Linux VM. [~vjasani], you might have an idea 
> why this is happening (this is your test, I believe, and a recent addition). 
> If you have anything I can try, shout.
> Here is the failure:
> {code}
> org.junit.ComparisonFailure: expected:<client_1[0]> but was:<client_1[4]>
>    at org.apache.hadoop.hbase.regionserver.slowlog.TestSlowLogRecorder.confirmPayloadParams(TestSlowLogRecorder.java:97)
>    at org.apache.hadoop.hbase.regionserver.slowlog.TestSlowLogRecorder.testOnlieSlowLogConsumption(TestSlowLogRecorder.java:178)
> {code}
> Here is log:
> {code}
> 2020-03-12 17:02:06,266 INFO  [Time-limited test] hbase.ResourceChecker(179): 
> after: 
> regionserver.slowlog.TestSlowLogRecorder#testOnlineSlowLogWithDisableConfig 
> Thread=7 (was   7), OpenFileDescriptor=226 (was 226), 
> MaxFileDescriptor=131072 (was 131072), SystemLoadAverage=353 (was 353), 
> ProcessCount=195 (was 195), AvailableMemoryMB=7037 (was 7027) - 
> AvailableMemoryMB LEAK? -
>  2020-03-12 17:02:06,281 INFO  [Time-limited test] 
> hbase.ResourceChecker(151): before: 
> regionserver.slowlog.TestSlowLogRecorder#testOnlieSlowLogConsumption 
> Thread=7, OpenFileDescriptor=226,              MaxFileDescriptor=131072, 
> SystemLoadAverage=353, ProcessCount=195, AvailableMemoryMB=7036
>  2020-03-12 17:02:06,317 DEBUG [Time-limited test] 
> slowlog.TestSlowLogRecorder(111): Initially ringbuffer of Slow Log records is 
> empty
>  2020-03-12 17:02:06,326 INFO  [Time-limited test] hbase.Waiter(183): Waiting 
> up to [3,000] milli-secs(wait.for.ratio=[1])
>  2020-03-12 17:02:06,528 INFO  [Time-limited test] hbase.Waiter(183): Waiting 
> up to [3,000] milli-secs(wait.for.ratio=[1])
>  2020-03-12 17:02:06,630 INFO  [Time-limited test] hbase.Waiter(183): Waiting 
> up to [3,000] milli-secs(wait.for.ratio=[1])
>  2020-03-12 17:02:06,732 INFO  [Time-limited test] hbase.Waiter(183): Waiting 
> up to [3,000] milli-secs(wait.for.ratio=[1])
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
