[ 
https://issues.apache.org/jira/browse/PIG-5273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16135644#comment-16135644
 ] 

Rohini Palaniswamy edited comment on PIG-5273 at 8/21/17 7:27 PM:
------------------------------------------------------------------

Comments:
1) Rename PIG_FILEOUTPUTCOMMITTER_MARKSUCCESSFULJOBS to 
PIG_FILEOUTPUTCOMMITTER_MARKSUCCESSFULJOBS_ATEND. The javadoc should say: "If set 
to true, Pig will create a _SUCCESS file in the output directories after 
completion of all jobs at the end of the script. If there are exec or fs 
commands in the middle of the script, the _SUCCESS file will be created for 
those jobs before the fs or exec command executes, not at the end of the 
script." Also add 
PIG_FILEOUTPUTCOMMITTER_MARKSUCCESSFULJOBS_ATEND_DEFAULT, which should be false.
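To make the rename concrete, a hedged sketch of how the two constants might look; the property key string and the enclosing class name are assumptions for illustration, not taken from the attached patch:

```java
// Sketch only: key value and class placement are assumptions, not from the patch.
public class PigConfigurationSketch {
    /**
     * If set to true, Pig will create a _SUCCESS file in the output
     * directories after completion of all jobs at the end of the script.
     * If there are exec or fs commands in the middle of the script, the
     * _SUCCESS file will be created for those jobs before the fs or exec
     * command executes, not at the end of the script.
     */
    public static final String PIG_FILEOUTPUTCOMMITTER_MARKSUCCESSFULJOBS_ATEND =
            "pig.fileoutputcommitter.marksuccessfuljobs.atend"; // key name assumed

    // Default requested in the review: feature off unless explicitly enabled.
    public static final boolean PIG_FILEOUTPUTCOMMITTER_MARKSUCCESSFULJOBS_ATEND_DEFAULT = false;
}
```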
2) markSuccess method
    - Rename to markSuccessAtEnd.
    - Check the boolean conditions first, before getting the OutputFormat and 
checking it.
    - System.out.println should be LOG.info("Creating _SUCCESS at " + 
location);
    - Use FileOutputCommitter.SUCCEEDED_FILE_NAME instead of the literal "_SUCCESS".
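A minimal sketch of the ordering asked for in point 2, under stated assumptions: the method signature is illustrative, and java.nio.file stands in for Hadoop's FileSystem API (the real method would use FileOutputCommitter.SUCCEEDED_FILE_NAME and a commons-logging LOG.info call):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class MarkSuccessSketch {
    // Stands in for FileOutputCommitter.SUCCEEDED_FILE_NAME; avoids scattering
    // a bare "_SUCCESS" literal through the code.
    static final String SUCCEEDED_FILE_NAME = "_SUCCESS";

    // Illustrative signature: the real method would take the job/store context.
    static void markSuccessAtEnd(boolean featureEnabled, boolean jobSucceeded,
                                 Path location) throws IOException {
        // Cheap boolean checks first, before any OutputFormat lookup.
        if (!featureEnabled || !jobSucceeded) {
            return;
        }
        // In the real patch the store's OutputFormat would be inspected here;
        // only file-based outputs get a marker.
        // Review asks for LOG.info rather than System.out.println; stand-in:
        System.out.println("Creating " + SUCCEEDED_FILE_NAME + " at " + location);
        Files.createFile(location.resolve(SUCCEEDED_FILE_NAME));
    }
}
```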

3) Call markSuccessAtEnd only after all cleanupOnSuccess calls have completed.
4) Implement this for MR and Tez as well. Create a common method in Launcher and 
call it from the MR, Tez, and Spark launcher classes.
5) The patch is missing the most important piece: setting 
FileOutputCommitter.SUCCESSFUL_JOB_OUTPUT_DIR_MARKER to false in the jobconf 
before launching jobs when this feature is set to true. Tests should have 
FILEOUTPUTCOMMITTER_MARKSUCCESSFULJOBS set to true, not false.
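For point 5, a hedged sketch of suppressing the per-job marker before launch. Here java.util.Properties stands in for the Hadoop JobConf and configureJob is an illustrative name; the key string is the value of Hadoop's FileOutputCommitter.SUCCESSFUL_JOB_OUTPUT_DIR_MARKER constant:

```java
import java.util.Properties;

public class SuppressMarkerSketch {
    // Value of Hadoop's FileOutputCommitter.SUCCESSFUL_JOB_OUTPUT_DIR_MARKER.
    static final String SUCCESSFUL_JOB_OUTPUT_DIR_MARKER =
            "mapreduce.fileoutputcommitter.marksuccessfuljobs";

    // Illustrative helper: Properties stands in for the Hadoop JobConf.
    static void configureJob(Properties jobConf, boolean markAtEnd) {
        if (markAtEnd) {
            // Per-job committers must not create _SUCCESS themselves; the
            // launcher creates the markers once, at the end of the script.
            jobConf.setProperty(SUCCESSFUL_JOB_OUTPUT_DIR_MARKER, "false");
        }
    }
}
```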



> _SUCCESS file should be created at the end of the job
> -----------------------------------------------------
>
>                 Key: PIG-5273
>                 URL: https://issues.apache.org/jira/browse/PIG-5273
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Satish Subhashrao Saley
>            Assignee: Satish Subhashrao Saley
>         Attachments: PIG-5273-1.patch
>
>
> One of the users ran into issues because the _SUCCESS file was created by 
> FileOutputCommitter.commitJob(), and storeCleanup(), called after that in 
> PigOutputCommitter, failed to store the schema due to a network outage. abortJob 
> was then called, and the StoreFunc.cleanupOnFailure method in it deleted the 
> output directory. Downstream jobs that started because of the _SUCCESS file ran 
> with empty data.
> Possible solutions:
> 1) Move storeCleanup before commit. The order was reversed in 
> https://issues.apache.org/jira/browse/PIG-2642, probably due to 
> FileOutputCommitter version 1, and might not be a problem with 
> FileOutputCommitter version 2. This still would not help when there are 
> multiple outputs, as the main problem is cleanupOnFailure in abortJob deleting 
> directories.
> 2) We can change cleanupOnFailure to not delete output directories. It still 
> does not help: the Oozie action retry might kick in and delete the directory 
> while the downstream job has already started running because of the _SUCCESS 
> file.
> 3) It cannot be done in the OutputCommitter at all, as multiple output 
> committers are called in parallel in Tez. We can have Pig suppress _SUCCESS 
> creation and create the markers all at the end in TezLauncher, if the job has 
> succeeded, before calling cleanupOnSuccess. Can probably add it as a 
> configurable setting and turn it on by default in our clusters. This is 
> probably the best possible solution.
> Thank you [~rohini] for finding out the issue and providing solution.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
