[ https://issues.apache.org/jira/browse/PIG-2587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13241007#comment-13241007 ]
Bill Graham commented on PIG-2587:
----------------------------------
I agree that if cosmetic changes are made to the script, all bets are off and
you'll get a different signature.
I also agree that the 3 items are out of scope here. Including the versions of
registered jars would be ugly, since transitive dependencies could change
without being detected.
> Compute LogicalPlan signature and store in job conf
> ---------------------------------------------------
>
> Key: PIG-2587
> URL: https://issues.apache.org/jira/browse/PIG-2587
> Project: Pig
> Issue Type: Improvement
> Reporter: Bill Graham
> Assignee: Bill Graham
> Labels: 0.10_blocker
> Fix For: 0.10, 0.11
>
> Attachments: pig-2587_1.patch
>
>
> We'd like to be able to uniquely identify a re-executed script (possibly with
> different inputs/outputs) by creating a signature of the {{LogicalPlan}}.
> Here's the proposal:
> # Add a new method {{LogicalPlan.getSignature()}} that returns a hash of its
> {{LogicalPlanPrinter}} output.
> # In {{PigServer.execute()}} set the signature on the job conf after the LP
> is compiled, but before it's executed.
> (1) would allow an impl of
> {{PigProgressNotificationListener.setScriptPlan()}} to save the LP signature
> with the script metadata. Upon subsequent runs (2) would allow an impl of
> {{PigReducerEstimator}} (see PIG-2574) to retrieve the current LP signature
> and fetch the historical data for the script. It could then use the previous
> run data to better estimate the number of reducers.
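
To make the two proposed steps concrete, here is a rough sketch, not the
attached patch. It assumes the printed plan is available as a String, uses
SHA-256 as the hash (the issue does not name an algorithm), and uses a plain
Map in place of the real job conf. The class name, the conf key, and the
history lookup are all hypothetical and exist only for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;

class PlanSignatureSketch {
    // Hypothetical property name; the issue does not specify the key.
    static final String SIGNATURE_KEY = "example.logical.plan.signature";

    // Step 1 (assumed shape): hash the printed plan text and return it as hex.
    // Stands in for the proposed LogicalPlan.getSignature().
    static String signatureOf(String printedPlan) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(printedPlan.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
    }

    // Step 2 (assumed shape): record the signature in the conf after the plan
    // is compiled and before it runs. A Map stands in for the job conf.
    static void storeSignature(Map<String, String> conf, String printedPlan) {
        conf.put(SIGNATURE_KEY, signatureOf(printedPlan));
    }

    // Estimator side (hypothetical): look up the reducer count recorded for a
    // previous run with the same signature; returns null if none is known.
    static Integer previousReducers(Map<String, String> conf,
                                    Map<String, Integer> history) {
        return history.get(conf.get(SIGNATURE_KEY));
    }
}
```

Because the hash covers the printed text as-is, any difference in that text,
however cosmetic, yields a different signature and therefore a history miss.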
--
This message is automatically generated by JIRA.