[ 
https://issues.apache.org/jira/browse/HDFS-15408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133284#comment-17133284
 ] 

Stephen O'Donnell commented on HDFS-15408:
------------------------------------------

This started happening after HDFS-2538, which went into 3.0.0. The reason is 
that fsck no longer shows progress by default (it used to print a dot per 
file), so on a large cluster the client's read timeout can expire before any 
output arrives. At Cloudera we have seen a lot of customers frustrated by this 
since they moved to a 3.x release.
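
To illustrate the mechanism, here is a minimal, self-contained sketch. It is 
not the NameNode or DFSck code: the local test server, the class name 
ReadTimeoutDemo and all the timing constants are assumptions made up for the 
demonstration. A handler that stays silent while it "scans" should trip the 
client's read timeout, while one that emits a dot at regular intervals should 
keep the read alive.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.SocketTimeoutException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.Executors;

public class ReadTimeoutDemo {
    // All values below are arbitrary demo settings, not Hadoop defaults.
    static final int WORK_MS = 900;          // simulated duration of the scan
    static final int DOT_EVERY_MS = 50;      // interval between progress dots
    static final int READ_TIMEOUT_MS = 300;  // client-side read timeout

    static void pause(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Starts a local stand-in server; with showProgress it streams dots while working. */
    static HttpServer startServer(boolean showProgress) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        // Daemon handler threads so a still-sleeping handler never blocks JVM exit.
        server.setExecutor(Executors.newCachedThreadPool(r -> {
            Thread t = new Thread(r);
            t.setDaemon(true);
            return t;
        }));
        server.createContext("/fsck", exchange -> {
            if (!showProgress) {
                pause(WORK_MS);                   // silent scan: not a single byte is sent
            }
            exchange.sendResponseHeaders(200, 0); // length 0 selects chunked streaming
            try (OutputStream out = exchange.getResponseBody()) {
                if (showProgress) {
                    for (int t = 0; t < WORK_MS; t += DOT_EVERY_MS) {
                        pause(DOT_EVERY_MS);
                        out.write('.');           // each dot gives the client something to read
                        out.flush();
                    }
                }
                out.write("\nscan finished\n".getBytes(StandardCharsets.UTF_8));
            }
        });
        server.start();
        return server;
    }

    /** Returns true if the whole response was read, false on a read timeout. */
    static boolean fetch(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setConnectTimeout(READ_TIMEOUT_MS);
        conn.setReadTimeout(READ_TIMEOUT_MS);
        try (InputStream in = conn.getInputStream()) {
            byte[] buf = new byte[256];
            while (in.read(buf) != -1) {
                // drain the response; every read is subject to the read timeout
            }
            return true;
        } catch (SocketTimeoutException e) {
            return false;
        } finally {
            conn.disconnect();
        }
    }

    /** Runs one request against a fresh server and reports whether it completed. */
    static boolean run(boolean showProgress) throws IOException {
        HttpServer server = startServer(showProgress);
        try {
            return fetch("http://127.0.0.1:" + server.getAddress().getPort() + "/fsck");
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("silent scan completed:   " + run(false));
        System.out.println("scan with dots completed: " + run(true));
    }
}
```

The read timeout applies to each individual read, not to the request as a 
whole, which is why a trickle of output is enough to keep a long scan from 
failing.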

There were lots of ideas to fix this on HDFS-7175, but in the end I just turned 
the dots back on and printed fewer of them. 

You can work around this issue by using the -showprogress switch when running 
fsck.

> Failed execution caused by SocketTimeoutException
> -------------------------------------------------
>
>                 Key: HDFS-15408
>                 URL: https://issues.apache.org/jira/browse/HDFS-15408
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 3.1.1
>            Reporter: echohlne
>            Priority: Major
>
> When I execute the command: hdfs fsck / 
> in the Hadoop cluster to check the health of the cluster, it always reports 
> an execution failure like the one below:
> {code}
> Connecting to namenode via http://hadoop20:50070/fsck?ugi=hdfs&path=%2F
> Exception in thread "main" java.net.SocketTimeoutException: Read timed out
>       at java.net.SocketInputStream.socketRead0(Native Method)
>       at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
>       at java.net.SocketInputStream.read(SocketInputStream.java:171)
>       at java.net.SocketInputStream.read(SocketInputStream.java:141)
>       at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
>       at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
>       at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
>       at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:735)
>       at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:678)
>       at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1587)
>       at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1492)
>       at org.apache.hadoop.hdfs.tools.DFSck.doWork(DFSck.java:359)
>       at org.apache.hadoop.hdfs.tools.DFSck.access$000(DFSck.java:72)
>       at org.apache.hadoop.hdfs.tools.DFSck$1.run(DFSck.java:159)
>       at org.apache.hadoop.hdfs.tools.DFSck$1.run(DFSck.java:156)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:422)
>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>       at org.apache.hadoop.hdfs.tools.DFSck.run(DFSck.java:155)
>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>       at org.apache.hadoop.hdfs.tools.DFSck.main(DFSck.java:402)
> {code}
> We are trying to solve this problem by adding a new parameter, 
> {color:#de350b}*dfs.fsck.http.timeout.ms*{color}, to control the 
> connectionTimeout and the readTimeout of the HttpConnection in DFSck.java. 
> Could you please check whether this is the right way to solve the problem? 
> Thanks a lot!
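
The proposal quoted above could look roughly like the following sketch. Only 
the key name dfs.fsck.http.timeout.ms comes from the issue description. The 
class name, the use of java.util.Properties as a stand-in for Hadoop's 
Configuration, and the 60-second default are assumptions for illustration, and 
this is not the actual patch.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLConnection;
import java.util.Properties;

public class FsckTimeoutSketch {
    // Key name taken from the issue description.
    static final String TIMEOUT_KEY = "dfs.fsck.http.timeout.ms";
    // Assumed fallback for this sketch only.
    static final int DEFAULT_TIMEOUT_MS = 60_000;

    /** Opens (without connecting) a URLConnection whose timeouts come from conf. */
    static URLConnection open(String url, Properties conf) throws IOException {
        int timeoutMs = Integer.parseInt(
                conf.getProperty(TIMEOUT_KEY, String.valueOf(DEFAULT_TIMEOUT_MS)));
        URLConnection conn = URI.create(url).toURL().openConnection();
        conn.setConnectTimeout(timeoutMs); // limit on establishing the connection
        conn.setReadTimeout(timeoutMs);    // limit on each blocking read
        return conn;
    }

    public static void main(String[] args) throws IOException {
        Properties conf = new Properties();
        conf.setProperty(TIMEOUT_KEY, "600000"); // ten minutes, an arbitrary example value
        URLConnection conn = open("http://hadoop20:50070/fsck?ugi=hdfs&path=%2F", conf);
        System.out.println(conn.getConnectTimeout() + " " + conn.getReadTimeout());
    }
}
```

A larger timeout only moves the limit further out: a scan that stays silent for 
longer than the configured value would still fail, whereas periodic progress 
output avoids the failure regardless of how long the scan takes.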



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
