[
https://issues.apache.org/jira/browse/HDFS-7873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14358592#comment-14358592
]
Benoit Perroud commented on HDFS-7873:
--------------------------------------
When a folder contains a lot of files, the output sent back to the client is
split into several packets. Because the {{channel.write}} call is asynchronous,
it returns immediately without waiting for all the data to be sent, which might
take some time.
Now the problem is that {{MessageEvent.getFuture()}} returns a future which
is already completed, because the header has been sent properly before the data
(there are two calls to {{channel.write}}, one for the header, one for the
content), so {{ChannelFutureListener.CLOSE}} is called immediately,
potentially before all the packets are sent across the channel. This premature
close of the channel leaves the client with an incomplete response.
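To make the ordering problem concrete, here is a minimal sketch that models it with plain JDK classes. It is not the patch and does not use Netty: {{ToyChannel}}, {{respond}} and the packet names are hypothetical, and {{CompletableFuture}} only stands in for Netty's {{ChannelFuture}}. It assumes, as described above, that the header write has already completed by the time the close listener is attached.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class PrematureCloseSketch {

    /** Hypothetical stand-in for a channel: writes are queued, then flushed later. */
    static final class ToyChannel {
        private final List<String> delivered = new ArrayList<>();
        private final List<Runnable> pending = new ArrayList<>();
        private boolean closed = false;

        /** Queues a write and returns at once, like an asynchronous write call. */
        CompletableFuture<Void> write(String packet) {
            CompletableFuture<Void> done = new CompletableFuture<>();
            pending.add(() -> {
                if (closed) {
                    done.completeExceptionally(new IllegalStateException("channel closed"));
                } else {
                    delivered.add(packet);
                    done.complete(null);
                }
            });
            return done;
        }

        /** Performs the next queued write, as an I/O thread would. */
        void flushOne() {
            if (!pending.isEmpty()) {
                pending.remove(0).run();
            }
        }

        void flushAll() {
            while (!pending.isEmpty()) {
                flushOne();
            }
        }

        void close() {
            closed = true;
        }

        List<String> delivered() {
            return delivered;
        }
    }

    /** Sends a header plus three content packets and returns what the client received. */
    static List<String> respond(boolean closeOnHeaderFuture) {
        ToyChannel ch = new ToyChannel();
        CompletableFuture<Void> headerFuture = ch.write("header");
        ch.flushOne(); // the header is on the wire, so headerFuture is now complete

        CompletableFuture<Void> lastWrite = null;
        for (int i = 0; i < 3; i++) {
            lastWrite = ch.write("packet-" + i); // returns before the data is sent
        }

        // A listener attached to an already completed future runs immediately.
        CompletableFuture<Void> trigger = closeOnHeaderFuture ? headerFuture : lastWrite;
        trigger.whenComplete((v, err) -> ch.close());

        ch.flushAll();
        return ch.delivered();
    }

    public static void main(String[] args) {
        System.out.println(respond(true));  // [header]
        System.out.println(respond(false)); // [header, packet-0, packet-1, packet-2]
    }
}
```

In this model, closing on the header's future truncates the response, while closing on the future of the last content write lets every packet through. Whether the attached patch takes exactly this approach is not stated here.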
The test launches a MiniDFSCluster and creates 10000 files in a folder because
with this number, I was able to reproduce the issue reliably. The FSImage is
then generated and loaded in OIV. Finally the content of the "big" folder is
listed, and the output is asserted. Without the patch, the exception initially
reported appears here.
I hope this will help.
> OIV webhdfs premature close channel issue
> -----------------------------------------
>
> Key: HDFS-7873
> URL: https://issues.apache.org/jira/browse/HDFS-7873
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: tools
> Affects Versions: 2.6.0, 2.5.2
> Reporter: Benoit Perroud
> Priority: Minor
> Attachments: HDFS-7873-v1.txt, HDFS-7873-v2.txt
>
>
> The new Offline Image Viewer (OIV) can load the FSImage and _emulate_
> a webhdfs server to explore the image without touching the NN.
> This webhdfs server is not working with folders holding a significant number
> of children (files or other folders):
> {quote}
> $ hadoop fs -ls webhdfs://127.0.0.1:5978/a/big/folder
> 15/03/03 04:28:19 WARN ssl.FileBasedKeyStoresFactory: The property
> 'ssl.client.truststore.location' has not been set, no TrustStore will be
> loaded
> 15/03/03 04:28:21 WARN security.UserGroupInformation:
> PriviledgedActionException as:bperroud (auth:SIMPLE)
> cause:java.io.IOException: Response decoding failure:
> java.lang.IllegalStateException: Expected one of '"}'
> ls: Response decoding failure: java.lang.IllegalStateException: Expected one
> of '"}'
> {quote}
> The error comes from an inappropriate usage of Netty.
> {{e.getFuture().addListener(ChannelFutureListener.CLOSE)}} closes the
> channel too early: the future attached to the event completes once the
> header has been sent, so the I/O operation is already reported as successful.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)