Hi Inder, I didn't see a relevant JIRA on this yet, so I went ahead and filed one at https://issues.apache.org/jira/browse/HDFS-4246, since it seems to affect HBase WALs too (when their block sizes are configured large, they create a scenario similar to yours), on small clusters or in specific scenarios on large ones.
I think a timed-life cache of excludeNodesList would be preferable to a static one, but we can keep it optional (or default it to an infinite lifetime) so as not to change the existing behavior.

On Tue, Nov 20, 2012 at 11:49 PM, Harsh J <ha...@cloudera.com> wrote:
> The excludeNode list is initialized for each output stream created
> under a DFSClient instance. That is, it is empty for every new
> FS.create() returned DFSOutputStream initially and is maintained
> separately for each file created under a common DFSClient.
>
> However, this could indeed be a problem for a long-running single-file
> client, which I assume is a continuously alive and hflush()-ing one.
>
> Can you search for and file a JIRA to address this, with any discussion
> taken there? Please put up your thoughts there as well.
>
> On Mon, Nov 19, 2012 at 3:25 PM, Inder Pall <inder.p...@gmail.com> wrote:
>> Folks,
>>
>> I was wondering if there is any mechanism/logic to move a node back from
>> the excludedNodeList to the live nodes to be tried for new block creation.
>>
>> In the current DFSClient code I do not see this. The use case is: if the
>> write timeout is reduced, certain nodes get aggressively added to the
>> excludedNodeList, and if the application caches the DFSClient, those
>> excluded nodes never get tried again for the lifetime of that application.
>>
>>
>> --
>> - Inder
>> "You are the average of the 5 people you spend the most time with"
>>
>
>
> --
> Harsh J

--
Harsh J
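For illustration, the timed-life cache idea could be sketched roughly as below. This is a minimal standalone sketch, not HDFS's actual implementation; the class name, method names, and the per-node timestamp map are all hypothetical. The key behavior is that an excluded datanode becomes eligible again once its exclusion window expires, while a non-positive expiry preserves the existing "excluded forever" semantics.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a timed-life exclude list: a node stays excluded
// only for a configurable window, after which it may be tried again for
// new block creation. Not the real DFSClient code.
public class TimedExcludeList {
    private final long expiryMillis; // <= 0 means never expire (current behavior)
    private final Map<String, Long> excludedAt = new ConcurrentHashMap<>();

    public TimedExcludeList(long expiryMillis) {
        this.expiryMillis = expiryMillis;
    }

    /** Record a datanode (e.g. after a write timeout) as excluded now. */
    public void exclude(String datanode) {
        excludedAt.put(datanode, System.currentTimeMillis());
    }

    /** True if the node is still within its exclusion window. */
    public boolean isExcluded(String datanode) {
        Long when = excludedAt.get(datanode);
        if (when == null) {
            return false;
        }
        if (expiryMillis <= 0) {
            return true; // infinite lifetime, matching the existing behavior
        }
        if (System.currentTimeMillis() - when >= expiryMillis) {
            excludedAt.remove(datanode); // expired: node may be tried again
            return false;
        }
        return true;
    }
}
```

With an expiry configured, a node aggressively excluded by a short write timeout would rejoin the candidate set after the window elapses instead of being lost for the process's lifetime.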