[ 
https://issues.apache.org/jira/browse/HADOOP-15688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16589371#comment-16589371
 ] 

Thomas Marquardt commented on HADOOP-15688:
-------------------------------------------

[^HADOOP-15688-HADOOP-15407-002.patch]

I noticed that ABFS has the same double-wrapping problem with output streams.  
I checked with Da, and we think this was an accident introduced during an 
earlier refactor, as there is no need to wrap a stream twice with 
FSDataInputStream or FSDataOutputStream.
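
To illustrate why the extra wrapper matters, here is a minimal, self-contained 
sketch of the effect described in the issue. It does not use the real Hadoop or 
Parquet classes: BufferReadable, DataStream, and looksBufferReadable are 
hypothetical stand-ins for ByteBufferReadable, FSDataInputStream, and a check 
that inspects only the directly wrapped stream. It assumes the wrapper class 
always declares the capability, regardless of what it wraps.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.InputStream;

class DoubleWrapDemo {
    /** Hypothetical stand-in for the ByteBufferReadable marker interface. */
    interface BufferReadable {}

    /** Hypothetical stand-in for FSDataInputStream: it always declares the
     *  capability, whether or not the stream it wraps supports it. */
    static class DataStream extends FilterInputStream implements BufferReadable {
        DataStream(InputStream in) { super(in); }
        InputStream getWrapped() { return in; }
    }

    /** Shallow capability check: looks only at the directly wrapped stream. */
    static boolean looksBufferReadable(DataStream s) {
        return s.getWrapped() instanceof BufferReadable;
    }

    public static void main(String[] args) {
        // A raw stream that does NOT support buffer reads.
        InputStream raw = new ByteArrayInputStream(new byte[] {1, 2, 3});

        DataStream single = new DataStream(raw);
        DataStream doubled = new DataStream(new DataStream(raw));

        System.out.println("single wrap: " + looksBufferReadable(single));  // false
        System.out.println("double wrap: " + looksBufferReadable(doubled)); // true
    }
}
```

With a single wrapper the check sees the raw stream and reports false. With the 
double wrapper it sees the inner wrapper and reports true, even though the 
underlying stream is unchanged.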

I have attached patch 002 which fixes this for all streams. All tests pass 
against my US storage account:

*Tests run: 265, Failures: 0, Errors: 0, Skipped: 11*
*Tests run: 1, Failures: 0, Errors: 0, Skipped: 0*
*Tests run: 861, Failures: 0, Errors: 0, Skipped: 262*
*Tests run: 186, Failures: 0, Errors: 0, Skipped: 10*

Regarding the timeout issue, I do use an Azure VM to run the tests, which helps 
reduce latency, but we should make the tests pass regardless.  The test cases 
have various timeouts which we can increase.  Just let us know which ones are 
causing trouble. We are also working to improve parallelization of the tests, 
which will reduce total run time.

> ABFS: InputStream wrapped in FSDataInputStream twice
> ----------------------------------------------------
>
>                 Key: HADOOP-15688
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15688
>             Project: Hadoop Common
>          Issue Type: Sub-task
>            Reporter: Sean Mackrory
>            Assignee: Sean Mackrory
>            Priority: Major
>         Attachments: HADOOP-15688-HADOOP-15407-002.patch, 
> HADOOP-15688.001.patch
>
>
> I can't read Parquet files from ABFS. Parquet has 2 different implementations 
> for reading seekable streams, and it'll use the one that uses ByteBuffer reads 
> if it can. It currently decides to use the ByteBuffer read implementation 
> because the FSDataInputStream it gets back wraps another FSDataInputStream, 
> which implements ByteBufferReadable.
> That's not the most robust way to check that ByteBuffer reads are supported by 
> the ultimately underlying InputStream, but it's unnecessary and probably a 
> mistake to double-wrap the InputStream, so let's not.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)