[
https://issues.apache.org/jira/browse/PIG-4549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mohit Sabharwal updated PIG-4549:
---------------------------------
Attachment: PIG-4549.1.patch
> Set CROSS operation parallelism for Spark engine
> ------------------------------------------------
>
> Key: PIG-4549
> URL: https://issues.apache.org/jira/browse/PIG-4549
> Project: Pig
> Issue Type: Sub-task
> Components: spark
> Affects Versions: spark-branch
> Reporter: Mohit Sabharwal
> Assignee: Mohit Sabharwal
> Fix For: spark-branch
>
> Attachments: PIG-4549.1.patch, PIG-4549.patch
>
>
> The Spark engine should set the parallelism that the GFCross UDF uses for the
> CROSS operation.
> If it is not set, GFCross throws an exception:
> {code}
> String s = cfg.get(PigImplConstants.PIG_CROSS_PARALLELISM + "." + crossKey);
> if (s == null) {
>     throw new IOException("Unable to get parallelism hint from job conf");
> }
> {code}
> Estimating parallelism for the Spark engine is still a TBD item. Until that is
> done, we should use GFCross's default parallelism value so that CROSS works.
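To make the proposed behavior concrete, here is a minimal sketch of one possible reading of it: the lookup falls back to a default instead of throwing. It is not the attached patch. A plain Map stands in for the job conf, and the class name, the key literal "pig.cross.parallelism", and the default of 96 are assumptions made for illustration only.

```java
import java.io.IOException;
import java.util.Map;

// Hypothetical helper, not part of Pig; a Map stands in for the job conf.
final class CrossParallelismSketch {
    // Stand-in for PigImplConstants.PIG_CROSS_PARALLELISM (literal is an assumption).
    static final String PIG_CROSS_PARALLELISM = "pig.cross.parallelism";

    // Illustrative fallback; the default actually used by GFCross may differ.
    static final int DEFAULT_PARALLELISM = 96;

    // Behavior described in the issue: a missing hint is an error.
    static int strictParallelism(Map<String, String> cfg, String crossKey)
            throws IOException {
        String s = cfg.get(PIG_CROSS_PARALLELISM + "." + crossKey);
        if (s == null) {
            throw new IOException("Unable to get parallelism hint from job conf");
        }
        return Integer.parseInt(s);
    }

    // Proposed interim behavior: a missing hint falls back to the default.
    static int parallelismOrDefault(Map<String, String> cfg, String crossKey) {
        String s = cfg.get(PIG_CROSS_PARALLELISM + "." + crossKey);
        return (s == null) ? DEFAULT_PARALLELISM : Integer.parseInt(s);
    }
}
```

With an empty map, parallelismOrDefault returns the default while strictParallelism throws; once a hint is stored under the per-CROSS key, both return the stored value.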