Raja,

Is the application master killing this operator repeatedly? If it is, the most likely cause is that the operator is blocking the windows from moving forward.

When the HDFS output operator restarts, it resets the file to the byte length recorded at the last checkpoint. To do this, it copies the content of the file into a temp file, up to the checkpointed byte length. For a large file, this copy can take a long time in setup, and the windows are blocked while it runs. That may have caused the app master to kill the operator on a timeout. Can you check the app master logs to see whether this is happening?
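To make the cost concrete, here is a minimal sketch of that kind of recovery step. It is only an illustration, not the operator's actual code: it uses `java.nio` on a local file rather than the HDFS API, and the class name `CheckpointRestoreSketch`, the method `restoreToCheckpoint`, and the `.tmp` suffix are all hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration only; the real operator works against HDFS.
final class CheckpointRestoreSketch {

    // Copies the first checkpointedLength bytes of `file` into a sibling
    // temp file, then replaces `file` with that temp file.
    static void restoreToCheckpoint(Path file, long checkpointedLength) throws IOException {
        Path tmp = file.resolveSibling(file.getFileName() + ".tmp");
        byte[] buffer = new byte[8192];
        long remaining = checkpointedLength;
        try (InputStream in = Files.newInputStream(file);
             OutputStream out = Files.newOutputStream(tmp)) {
            while (remaining > 0) {
                // Never read past the checkpointed length.
                int n = in.read(buffer, 0, (int) Math.min(buffer.length, remaining));
                if (n < 0) {
                    throw new IOException("file is shorter than the checkpointed length");
                }
                out.write(buffer, 0, n);
                remaining -= n;
            }
        }
        Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

Every checkpointed byte has to be read and rewritten before the operator can start processing, so the time spent in setup grows with the size of the file.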
For this reason, it is better to rotate the file so recovery can be faster.

Regards,
Ashwin.

On Fri, Aug 5, 2016 at 8:02 PM, Raja.Aravapalli <[email protected]> wrote:

> Thanks for the response Ashwin!!
>
> Once I moved the hdfs file that is reported as corrupted in the log files
> to a different location on hdfs, application was able to launch a new
> container successfully and process went fine. I still need to check the
> first container of the operator, what caused the file to actually corrupt…!!
>
> Having a single hdfs file to collect approx 15gbs of data without any
> rotation set would cause any issues ?? Also, is it a okay/best practice to
> not set any rotation on file ?
>
> Thanks for the response again!
>
> Regards,
> Raja.
>
> From: Ashwin Chandra Putta <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Friday, August 5, 2016 at 5:45 PM
> To: "[email protected]" <[email protected]>
> Subject: Re: HDFS Write File Operator struggling to start
>
> Can you check in the logs the first time the issue occurred for what
> triggered it? Look for the first container that failed.
>
> Regards,
> Ashwin
>
> On Fri, Aug 5, 2016 at 12:58 PM, Raja.Aravapalli <[email protected]> wrote:
>
>> Hi Ashwin,
>>
>> It happens when the application is running!!
>>
>> Regards,
>> Raja.
>>
>> From: Ashwin Chandra Putta <[email protected]>
>> Reply-To: "[email protected]" <[email protected]>
>> Date: Thursday, August 4, 2016 at 4:42 PM
>> To: "[email protected]" <[email protected]>
>> Subject: Re: HDFS Write File Operator struggling to start
>>
>> Raja,
>>
>> When do you see this issue? Does it happen while the application is
>> running? Does this happen while restarting a failed application? Or does
>> this happen while starting a new application?
>>
>> Regards,
>> Ashwin.
>>
>> On Thu, Aug 4, 2016 at 11:25 AM, Samba Surabhi <[email protected]> wrote:
>>
>>> If it is the issue with file size, you can rotate the output file.
>>>
>>> writer.setAlwaysWriteToTmp(true);
>>>
>>> writer.setRotationWindows(240);
>>>
>>> Thanks,
>>>
>>> Samba Surabhi.
>>>
>>> ------------------------------
>>> From: [email protected]
>>> To: [email protected]
>>> Subject: HDFS Write File Operator struggling to start
>>> Date: Thu, 4 Aug 2016 14:49:16 +0000
>>>
>>> Hi
>>>
>>> I have a HDFS file write operator in my DAG, which is failing to start a
>>> new operator and keep on trying to start one!!
>>>
>>> It created approx. 800 temporary files in the destination hdfs
>>> directory!! How can I fix this issue…. And debug the root cause…. ?
>>>
>>> All I can see in container log is, File corrupted message…!!
>>>
>>> Can someone please help me fix this ?
>>>
>>> Regards,
>>> Raja.
>>
>> --
>>
>> Regards,
>> Ashwin.
>
> --
>
> Regards,
> Ashwin.

--
Regards,
Ashwin.
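
P.S. For reference, the rotation settings Samba suggested could be applied along these lines. This is a sketch rather than a tested configuration: the helper class `RotationSettings` and its method are hypothetical, `writer` stands for whatever file output operator the DAG already uses, and the `setMaxLength` call with a 128 MB cap is an assumption that is not from the thread. It needs the Malhar library on the classpath.

```java
import com.datatorrent.lib.io.fs.AbstractFileOutputOperator;

// Hypothetical helper; call it on the output operator when building the DAG.
final class RotationSettings {

    static void enableRotation(AbstractFileOutputOperator<?> writer) {
        // Both values below are the ones suggested earlier in this thread.
        writer.setAlwaysWriteToTmp(true);
        writer.setRotationWindows(240);

        // Assumption, not from the thread: also cap each file by size so a
        // single part never grows to many gigabytes. Check that the Malhar
        // version in use exposes this setter before relying on it.
        writer.setMaxLength(128L * 1024 * 1024);
    }
}
```

With rotation on, a restart only has to restore the file that was open at the last checkpoint, not the whole 15 GB of output.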
