Travis Crawford created PIG-2723:
------------------------------------

             Summary: Pig fails when pig.jar is removed during a job
                 Key: PIG-2723
                 URL: https://issues.apache.org/jira/browse/PIG-2723
             Project: Pig
          Issue Type: Bug
            Reporter: Travis Crawford


Many pig scripts are executed as a series of MR jobs. When submitting each MR 
job to the cluster, pig makes a single fat job.jar that contains pig itself, 
along with all registered jars (UDFs and their dependencies).

This becomes problematic when pig is upgraded during a long-running job. For 
example, going from {{pig-0.9.2+1.jar}} to {{pig-0.9.2+2.jar}}. When creating 
the next job.jar pig will fail because the expected pig jar is no longer 
available.

A common case where this happens is deploying a new pig RPM.

Pig should handle the case where its jar is removed while executing a script.

DISCUSSED OPTIONS THAT SEEM PROBLEMATIC:

* Creating a single pig.jar symlink that points at the installed pig version 
could cause MR jobs to use different pig versions during the same script. This 
could lead to very difficult to debug issues, and potential correctness issues.

* Extracting pig.jar once for the whole job could be problematic if /tmp is 
used and something like tmpwatch runs.

POSSIBLE SOLUTION:

Pig could put pig.jar in the distributed cache once at reuse that jar on HDFS 
for all launched jobs.

WHY ARE YOU DELETING PIG.JAR DURING THE JOB!?!?

Allowing RPM upgrades mid-pig-job means the machine does not need to be drained 
for maintenance, reducing the impact of upgrades. Having just one pig version 
installed simplifies packaging and for users to choose the right version. 
Overall it just keeps things simple, which is a feature itself.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to