Re: Failed RC-10 yarn-cluster job for FS closed error when cleaning up staging directory

2014-05-22 Thread Kevin Markey
Update: Partly user error, but still getting the FS closed error. Yes, we are running plain vanilla Hadoop 2.3.0, but it probably doesn't matter. 1. Tried Colin McCabe's suggestion to patch with pull 850 (https://issues.apache.org/jira/browse/SPARK-1898). No…

Failed RC-10 yarn-cluster job for FS closed error when cleaning up staging directory

2014-05-21 Thread Kevin Markey
I tested an application on RC-10 and Hadoop 2.3.0 in yarn-cluster mode that had run successfully with Spark 0.9.1 and Hadoop 2.3 or 2.2. The application ran to conclusion, but the job was ultimately reported as failed. There were 2 anomalies… 1. ASM reported only…

Re: Failed RC-10 yarn-cluster job for FS closed error when cleaning up staging directory

2014-05-21 Thread Tom Graves
It sounds like something is closing the hdfs filesystem before everyone is really done with it. The filesystem gets cached and is shared, so if someone closes it while other threads are still using it you run into this error. Is your application closing the filesystem? Are you using the…
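The failure mode Tom describes comes from Hadoop's FileSystem cache: `FileSystem.get()` hands every caller with the same URI and user the *same* cached instance, so one caller's `close()` invalidates the handle for all other threads. Below is a simplified, illustrative model of that behavior in Python (this is not the real `org.apache.hadoop.fs` API; class and method names mimic it only loosely):

```python
# Simplified model of Hadoop's FileSystem cache (illustrative sketch, not the
# real org.apache.hadoop.fs.FileSystem API).

class FileSystem:
    _cache = {}  # one shared instance per URI, like Hadoop's CACHE keyed by (scheme, authority, ugi)

    def __init__(self, uri):
        self.uri = uri
        self.closed = False

    @classmethod
    def get(cls, uri):
        # All callers asking for the same URI share one cached instance.
        if uri not in cls._cache:
            cls._cache[uri] = cls(uri)
        return cls._cache[uri]

    def close(self):
        # Closes the handle for EVERY holder of this cached instance.
        self.closed = True

    def delete(self, path):
        # Mirrors the "Filesystem closed" IOException raised when cleanup
        # (e.g. of the .sparkStaging directory) runs after another thread
        # has already closed the shared filesystem.
        if self.closed:
            raise IOError("Filesystem closed")
        return True


fs1 = FileSystem.get("hdfs://nn:8020")   # e.g. application code
fs2 = FileSystem.get("hdfs://nn:8020")   # e.g. framework cleanup code
assert fs1 is fs2                        # same cached handle

fs1.close()                              # application closes "its" filesystem...
try:
    fs2.delete("/user/spark/.sparkStaging/app_01")
except IOError as e:
    print(e)                             # ...so cleanup elsewhere fails
```

In real Hadoop, application code that genuinely needs a private handle it can close can use `FileSystem.newInstance(conf)` (which bypasses the cache) instead of `FileSystem.get(conf)`; otherwise the safest pattern is to never call `close()` on a handle obtained from `get()`.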

Re: Failed RC-10 yarn-cluster job for FS closed error when cleaning up staging directory

2014-05-21 Thread Tathagata Das
Are you running a vanilla Hadoop 2.3.0, or the one that comes with CDH5 / HDP? We may be able to reproduce this in that case. TD On Wed, May 21, 2014 at 8:35 PM, Tom Graves tgraves...@yahoo.com wrote: It sounds like something is closing the hdfs filesystem before everyone is really done…