There are several things in this area, yes. A big one would be to split the connect timeout from the read timeout. Todd created a jira for this in HDFS, and it could make a huge difference imho. A read timeout is not really clear: it can come from GC pauses, maybe bugs, network issues, and there is a risk in retrying because you don't really know what has already been done. But a failed connect means a dead process or a serious network issue only; even under crazy GC the socket gets connected, and if the connect fails, you can retry safely. So splitting the two would be useful: we could move on to the next DN quickly. But to be really useful, we will need the DFSClient to do it (I still have my doc on dfs timeouts to send, will do it soon).

I also need to review some cases on the pure functional split in HBase; there is some existing stuff already there that I need to understand.
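To illustrate the distinction (not the actual DFSClient code, just a minimal java.net sketch of what such a split would rely on):

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Sketch of the connect-vs-read timeout split. The timeout values here
// are illustrative assumptions, not HDFS defaults; a local listener is
// used so the example is self-contained.
public class TimeoutSplit {
    public static void main(String[] args) throws IOException {
        try (ServerSocket server = new ServerSocket(0)) {
            Socket s = new Socket();
            // Connect timeout: short, because a connect only fails for a
            // dead process or a serious network issue -- safe to retry
            // quickly against another datanode.
            s.connect(new InetSocketAddress("127.0.0.1", server.getLocalPort()), 2000);
            // Read timeout: longer, because a slow read may just be a GC
            // pause on the remote side, and retrying a read is risky
            // since we don't know what has already been done.
            s.setSoTimeout(60000);
            System.out.println(s.isConnected());
            s.close();
        }
    }
}
```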
On Mon, Jul 23, 2012 at 6:59 PM, Stack <[email protected]> wrote:
> On Mon, Jul 23, 2012 at 1:15 PM, N Keywal <[email protected]> wrote:
>> Hi,
>>
>> FYI, I created a set of jiras in HDFS, related to HBase MTTR or recovery
>> alone.
>>
>> HDFS-3706: Add the possibility to mark a node as 'low priority' for
>> reads in the DFSClient
>> HDFS-3705: Add the possibility to mark a node as 'low priority' for
>> writes in the DFSClient
>> HDFS-3704: In the DFSClient, add the node to the dead list when the
>> ipc.Client call fails
>> HDFS-3703: Decrease the datanode failure detection time
>> HDFS-3702: Add an option for NOT writing the blocks locally if there
>> is a datanode on the same box as the client
>> HDFS-3701: HDFS may miss the final block when reading a file opened
>> for writing if one of the datanodes is dead
>>
>
> Thanks for doing the above.
>
> What about your idea where you'd like to have different socket
> timeouts depending on what we're doing? Or Todd's idea of being able
> to try a DN replica and, if it's taking too long to read, move on to the
> next one quickly? To do this, do you think we'd need to get our fingers
> into the DFSClient in other areas?
>
> Good stuff,
> St.Ack
