Re: Question about dead datanode

2014-02-26 Thread Jack Levin
Submitted JIRA patch (with test): https://issues.apache.org/jira/browse/HDFS-6022
On Mon, Feb 24, 2014 at 12:16 PM, Jack Levin magn...@gmail.com wrote: I will do that. -Jack
On Mon, Feb 24, 2014 at 6:23 AM, Steve Loughran ste...@hortonworks.com wrote: that's a very old version of

Re: Question about dead datanode

2014-02-24 Thread Steve Loughran
that's a very old version of cloudera's branch you are working with there; patching that is not a good way to go, as you are on the slippery slope of having your own private branch and all the costs of it. It looks like dead node logic has - DFSInputStream, where it is still instance-specific:

Re: Question about dead datanode

2014-02-24 Thread Jack Levin
I will do that. -Jack On Mon, Feb 24, 2014 at 6:23 AM, Steve Loughran ste...@hortonworks.com wrote: that's a very old version of cloudera's branch you are working with there; patching that is not a good way to go, as you are on the slippery slope of having your own private branch and all the

Re: Question about dead datanode

2014-02-23 Thread Jack Levin
I can submit a JIRA for this if you feel that's appropriate. On Feb 18, 2014 8:49 PM, Stack st...@duboce.net wrote: On Sat, Feb 15, 2014 at 8:01 PM, Jack Levin magn...@gmail.com wrote: Looks like I patched it in DFSClient.java, here is the patch: https://gist.github.com/anonymous/9028934

Re: Question about dead datanode

2014-02-18 Thread Stack
On Sat, Feb 15, 2014 at 8:01 PM, Jack Levin magn...@gmail.com wrote: Looks like I patched it in DFSClient.java, here is the patch: https://gist.github.com/anonymous/9028934 I moved the 'deadNodes' list out into a global field that is accessible by all running threads, so at any point
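
For readers following the thread, here is a rough sketch of the idea described above: one dead-node list shared by all reader threads, instead of one per open stream. This is not the code from the gist or from HDFS-6022. The class name SharedDeadNodes, its methods, and the use of "host:port" strings as node keys are all assumptions made for illustration.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical illustration only: NOT the DFSClient code and NOT the patch
 * from the thread. It shows a process-wide dead-node set that every reader
 * thread consults, rather than a per-stream list.
 */
final class SharedDeadNodes {
    // Assumption: datanodes are identified by a "host:port" string.
    // A concurrent set lets many reader threads add and check without locking.
    private static final Set<String> DEAD = ConcurrentHashMap.newKeySet();

    private SharedDeadNodes() {}

    /** Record a node that a read just failed to connect to. */
    static void markDead(String node) {
        DEAD.add(node);
    }

    /** True if any thread has already marked this node as dead. */
    static boolean isDead(String node) {
        return DEAD.contains(node);
    }

    /** Forget all dead nodes, e.g. so that recovered nodes get retried. */
    static void clear() {
        DEAD.clear();
    }

    /** Return the first replica location not known to be dead, if any. */
    static Optional<String> chooseReplica(List<String> replicas) {
        for (String replica : replicas) {
            if (!DEAD.contains(replica)) {
                return Optional.of(replica);
            }
        }
        return Optional.empty();
    }
}
```

With something like this, once one thread fails to connect to /10.101.5.5:50010 and calls markDead, other threads reading through long-lived open files would skip that replica instead of each timing out on it. A real implementation would also need a policy for clearing entries, since a node that comes back would otherwise stay excluded.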

Re: Question about dead datanode

2014-02-14 Thread Jack Levin
0.20.2-cdh3u2 -- add to deadNodes and continue would solve this issue. For some reason it's not getting into this code path. If it's a matter of adding a quick line of code to make this work, then we would rather recompile with that and upgrade later when we have a better backup. -Jack On Thu,

Re: Question about dead datanode

2014-02-14 Thread Jack Levin
I found the code path that does not work and patched it. Will report if it fixes the problem. On Feb 14, 2014 8:19 AM, Jack Levin magn...@gmail.com wrote: 0.20.2-cdh3u2 -- add to deadNodes and continue would solve this issue. For some reason it's not getting into this code path. If it's a matter

Question about dead datanode

2014-02-13 Thread Jack Levin
Good morning -- I have a question. We had a datanode go down, and it's been down for a few days; however, HBase is still trying to talk to that dead datanode: 2014-02-13 08:57:23,073 WARN org.apache.hadoop.hdfs.DFSClient: Failed to connect to /10.101.5.5:50010 for file

Re: Question about dead datanode

2014-02-13 Thread Jack Levin
I meant it's in the 'dead' list on the HDFS namenode page. Running hadoop fsck / shows no issues. On Thu, Feb 13, 2014 at 10:38 AM, Jack Levin magn...@gmail.com wrote: Good morning -- I had a question, we have had a datanode go down, and it's been down for a few days, however hbase is trying to talk to

Re: Question about dead datanode

2014-02-13 Thread Stack
The RS opens files and then keeps them open as long as the RS is alive. We're failing the read of this replica and then we succeed in getting the block elsewhere? You get that exception every time? What Hadoop version, Jack? You have short-circuit reads on? St.Ack On Thu, Feb 13, 2014 at 10:41 AM, Jack

Re: Question about dead datanode

2014-02-13 Thread Jack Levin
As far as I can tell I am hitting this issue: http://grepcode.com/search/usages?type=methodid=repository.cloudera.com%24content%24repositories%24releases@com.cloudera.hadoop%24hadoop-core@0.20.2-320@org%24apache%24hadoop%24hdfs%24protocol@LocatedBlocks@findBlock%28long%29k=u 1581

Re: Question about dead datanode

2014-02-13 Thread Jack Levin
This might be related: http://hadoop.6.n7.nabble.com/Question-on-opening-file-info-from-namenode-in-DFSClient-td6679.html In HBase, we open the file once and keep it open. The file is shared amongst all clients. Does that mean it's perma-cached even if the datanode is dead? -Jack On Thu, Feb 13, 2014 at

Re: Question about dead datanode

2014-02-13 Thread Stack
Can you upgrade, Jack? This stuff is better in later versions (DFSClient keeps a running list of bad datanodes...) St.Ack On Thu, Feb 13, 2014 at 1:41 PM, Jack Levin magn...@gmail.com wrote: As far as I can tell I am hitting this issue:

Re: Question about dead datanode

2014-02-13 Thread Jack Levin
Can upgrade now but I would take suggestions on how to deal with this On Feb 13, 2014 2:02 PM, Stack st...@duboce.net wrote: Can you upgrade Jack? This stuff is better in later versions (dfsclient keeps running list of bad datanodes...) St.Ack On Thu, Feb 13, 2014 at 1:41 PM, Jack Levin

Re: Question about dead datanode

2014-02-13 Thread Jack Levin
I meant to say, I can't upgrade now, it's a petabyte storage system. A little hard to keep a copy of something like that. On Thu, Feb 13, 2014 at 3:20 PM, Jack Levin magn...@gmail.com wrote: Can upgrade now but I would take suggestions on how to deal with this On Feb 13, 2014 2:02 PM, Stack

Re: Question about dead datanode

2014-02-13 Thread Jack Levin
One other question, we get this: 2014-02-13 02:46:12,768 WARN org.apache.hadoop.hdfs.DFSClient: Failed to connect to /10.101.5.5:50010 for file /hbase/img32/b97657bfcbf922045d96315a4ada0782/att/4890606694307129591 for block -9099107892773428976:java.net.SocketTimeoutException: 6 millis

Re: Question about dead datanode

2014-02-13 Thread Stack
On Thu, Feb 13, 2014 at 8:55 PM, Jack Levin magn...@gmail.com wrote: I meant to say, I can't upgrade now, it's a petabyte storage system. A little hard to keep a copy of something like that. You could upgrade in situ but, yeah, you'd need to be careful.

Re: Question about dead datanode

2014-02-13 Thread Stack
On Thu, Feb 13, 2014 at 9:18 PM, Jack Levin magn...@gmail.com wrote: One other question, we get this: 2014-02-13 02:46:12,768 WARN org.apache.hadoop.hdfs.DFSClient: Failed to connect to /10.101.5.5:50010 for file /hbase/img32/b97657bfcbf922045d96315a4ada0782/att/4890606694307129591 for