[ 
https://issues.apache.org/jira/browse/HADOOP-17158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183556#comment-17183556
 ] 

Mehakmeet Singh commented on HADOOP-17158:
------------------------------------------

Thanks for checking. Since it never fails for me, I could only rely on your 
traces.
Yes, we should. But the test times out because the condition of the while 
loop is never fulfilled, so it becomes an infinite loop. The reason is that at 
least one of the two counters never reaches its expected value.
We are checking 2 counters, readAheadBytesRead and remoteBytesRead. Most 
probably, in your case the readAhead buffer, which is filled by background 
threads, has not been filled by the time the data is read. The reads are then 
served as remote reads instead of readAhead reads, so the expected counter 
values are never reached.
I just need to confirm the actual counter values in your setup (I suspect 
readAheadBytesRead could be 0). I was expecting the value to be greater than or 
equal to the block size (I set it to 4KB).
But nevertheless, I will write a better test, as the values can be quite 
arbitrary (race conditions are involved) without sleeping.
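
To illustrate the loop problem described above, here is a minimal sketch of a bounded wait that gives up at a deadline and lets the caller report the observed counter values, instead of looping until the JUnit timeout fires. It is not the actual test code: the class name BoundedCounterWait, the method awaitCounters and its parameters are hypothetical, and the counters are modelled as plain LongSupplier values rather than the real ABFS stream statistics. Note that it still sleeps between polls.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

public class BoundedCounterWait {

  /**
   * Polls both counters until each has reached its minimum, or until the
   * deadline passes. Returns true if both thresholds were met, false on
   * timeout or interruption. All names here are hypothetical.
   */
  public static boolean awaitCounters(LongSupplier readAheadBytesRead,
      long minReadAhead, LongSupplier remoteBytesRead, long minRemote,
      long timeoutMillis, long pollMillis) {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    while (true) {
      if (readAheadBytesRead.getAsLong() >= minReadAhead
          && remoteBytesRead.getAsLong() >= minRemote) {
        return true;
      }
      if (System.nanoTime() - deadline >= 0) {
        return false; // give up so the caller can report the actual values
      }
      try {
        Thread.sleep(pollMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt flag
        return false;
      }
    }
  }

  public static void main(String[] args) {
    // Simulates the suspected failing setup: readAhead stays at 0.
    AtomicLong readAhead = new AtomicLong(0);
    AtomicLong remote = new AtomicLong(32 * 1024);
    boolean met = awaitCounters(readAhead::get, 4 * 1024,
        remote::get, 32 * 1024, 200, 10);
    System.out.println("thresholds met: " + met
        + ", readAheadBytesRead=" + readAhead.get()
        + ", remoteBytesRead=" + remote.get());
  }
}
```

With a wait like this, a test could fail with a message containing both counter values, which would also confirm whether readAheadBytesRead really stays at 0 in the failing setup.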

> Intermittent test timeout for 
> ITestAbfsInputStreamStatistics#testReadAheadCounters
> ----------------------------------------------------------------------------------
>
>                 Key: HADOOP-17158
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17158
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/azure
>    Affects Versions: 3.3.0
>            Reporter: Mehakmeet Singh
>            Assignee: Mehakmeet Singh
>            Priority: Major
>
> Intermittent test timeout for 
> ITestAbfsInputStreamStatistics#testReadAheadCounters happening due to race 
> conditions in readAhead threads.
> Test error:
> {code:java}
> [ERROR] testReadAheadCounters(org.apache.hadoop.fs.azurebfs.ITestAbfsInputStreamStatistics)  Time elapsed: 30.723 s  <<< ERROR!
> org.junit.runners.model.TestTimedOutException: test timed out after 30000 milliseconds
>         at java.lang.Thread.sleep(Native Method)
>         at org.apache.hadoop.fs.azurebfs.ITestAbfsInputStreamStatistics.testReadAheadCounters(ITestAbfsInputStreamStatistics.java:346)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>         at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>         at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>         at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>         at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
>         at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.lang.Thread.run(Thread.java:748)
> {code}
> Possible Reasoning:
> - The readAhead queue doesn't get completed, and hence the counter values are 
> not reached within the 30-second timeout on some systems.
> - The condition that the readAheadBytesRead and remoteBytesRead counter values 
> must be greater than or equal to 4KB and 32KB respectively is not met on 
> some machines. Sometimes, instead of reading from the readAhead buffer, 
> remote reads are performed because the threads that fill that buffer are 
> still in the readAhead queue. As a result, one of the 2 counter values never 
> satisfies the condition, the loop never exits, and the test eventually times 
> out.
> Possible Fixes:
> - Write a better test (one that would pass under all conditions).
> - Maybe a UT instead of an IT?
> Improving the test would be the preferable fix, with a UT as the last resort.
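
The second reasoning point in the description can be illustrated with a toy model. This is only a sketch under stated assumptions and not the ABFS implementation: the class ReadAheadRaceModel and its read method are hypothetical, and the background prefetch is modelled as a CompletableFuture. It shows how readAheadBytesRead can stay at 0 when every read arrives before its prefetch has completed.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicLong;

public class ReadAheadRaceModel {
  // Counter names follow the issue text; the model itself is hypothetical.
  public final AtomicLong readAheadBytesRead = new AtomicLong();
  public final AtomicLong remoteBytesRead = new AtomicLong();

  /**
   * Serves one read of the given length. If the prefetch has already
   * finished, the read counts as a readAhead read; otherwise it falls
   * back to a remote read.
   */
  public void read(CompletableFuture<byte[]> prefetch, int length) {
    if (prefetch.isDone() && !prefetch.isCompletedExceptionally()) {
      readAheadBytesRead.addAndGet(length);
    } else {
      remoteBytesRead.addAndGet(length);
    }
  }

  public static void main(String[] args) {
    ReadAheadRaceModel model = new ReadAheadRaceModel();
    // A prefetch that is never completed, standing in for a worker that
    // is still waiting in the readAhead queue.
    CompletableFuture<byte[]> pending = new CompletableFuture<>();
    for (int i = 0; i < 8; i++) {
      model.read(pending, 4 * 1024);
    }
    System.out.println("readAheadBytesRead=" + model.readAheadBytesRead.get()
        + ", remoteBytesRead=" + model.remoteBytesRead.get());
  }
}
```

In this model, a loop that waits for readAheadBytesRead to reach 4KB never exits when the reads always win the race, which matches the timeout described in the issue.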


