I have seen this happen before. I suspect it is caused by a loss of state in the NameNode when the HDFS cluster is restarted, which invalidates the existing client leases. My best guess is that you'd need to restart Flume after you restart the HDFS cluster; this is likely an HDFS client API limitation rather than a Flume bug. Flume can handle most other HDFS failure scenarios.
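If you want to reproduce this outside of Flume, here is a minimal sketch (not Flume's actual sink code) of why restarting the client helps: the cached FileSystem handle carries the stale lease state, and discarding it and opening a fresh one, which is roughly what a Flume restart does, recovers. The path and the retry-on-first-failure logic are hypothetical, just for illustration:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import java.io.IOException;

public class HdfsReopenSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Cached handle; its client-side lease state can go stale
        // if the NameNode is restarted underneath it.
        FileSystem fs = FileSystem.get(conf);
        Path p = new Path("/flume/events/demo.tmp"); // hypothetical path
        try {
            fs.create(p).close();
        } catch (IOException e) {
            // After a NameNode restart, the old handle may keep failing.
            // Drop the cached FileSystem instances and open a fresh client,
            // which is effectively what restarting the Flume agent does.
            FileSystem.closeAll();
            fs = FileSystem.newInstance(conf);
            fs.create(p).close();
        }
    }
}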
Thanks,
Hari

On Thu, Dec 11, 2014 at 11:17 PM, kevin.hu <[email protected]> wrote:
> I saw many "HDFS IO Error" entries in the Flume log when the Hadoop
> process was restarted. Flume never recovers, even after Hadoop restarts
> successfully. The only way we can recover the whole setup is to restart
> Flume and the Flume client.
> So my question is: how does Flume handle this kind of failure, such as
> Hadoop terminating unexpectedly? From my perspective, it does nothing but
> report "HDFS IO Error".
> Is my understanding wrong?
> Thanks
> Daiqian
