[ https://issues.apache.org/jira/browse/CRUNCH-670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16936067#comment-16936067 ]
David Ortiz edited comment on CRUNCH-670 at 9/23/19 5:38 PM:
-------------------------------------------------------------

[~jwills] here is a patch file combining your update and the updates I had to make to AvroParquetPathPerKeyOutputFormat and AvroPathPerKeyOutputFormat to get our stuff running on Spark. [^CRUNCH-670-pt2.patch]

was (Author: dortiz):
[~jwills] Finally got approval to post this from my employer. Here is a patch file combining your update and the updates I had to make to AvroParquetPathPerKeyOutputFormat and AvroPathPerKeyOutputFormat to get our stuff running on Spark. [^CRUNCH-670-pt2.patch]

> Make the AvroPathPerKeyTarget work with the SparkRuntime
> --------------------------------------------------------
>
>                 Key: CRUNCH-670
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-670
>             Project: Crunch
>          Issue Type: Improvement
>            Reporter: Josh Wills
>            Assignee: Josh Wills
>            Priority: Major
>         Attachments: CRUNCH-670-pt2.patch, CRUNCH-670.patch
>
>
> There is an issue where the AvroPathPerKeyTarget won't properly copy the
> output of a Spark pipeline from the temp directory to the target directory
> because it assumes it will always get a valid Crunch output index (0, 1, 2,
> ...), whereas the SparkRuntime passes -1 to the Target's output handler
> method (to signal that it's the only output for the job). I _think_ the right
> move is to have AvroPathPerKeyTarget rewrite a -1 index to 0 so as not to
> break any other implementations that depend on the SparkRuntime's behavior.
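
For readers skimming the thread, the index rewrite proposed in the description can be sketched roughly as below. This is not the attached patch: the class name, the method names, the sentinel constant, and the "out" + index naming of the temp subdirectory are all hypothetical, chosen only to illustrate mapping -1 to 0 inside the target while leaving the SparkRuntime's behavior untouched.

```java
// Hypothetical sketch; none of these names are taken from the Crunch patch.
final class OutputIndexSketch {

    // Assumed sentinel: the value the SparkRuntime passes when the job has a single output.
    static final int SINGLE_OUTPUT_SENTINEL = -1;

    // Map the sentinel to 0; ordinary output indices (0, 1, 2, ...) pass through unchanged.
    static int normalizeOutputIndex(int index) {
        return index == SINGLE_OUTPUT_SENTINEL ? 0 : index;
    }

    // Assumed naming scheme for the per-output temp subdirectory ("out" + index).
    static String tempSubdir(int index) {
        return "out" + normalizeOutputIndex(index);
    }

    public static void main(String[] args) {
        // Print the subdirectory chosen for the sentinel and for a regular index.
        System.out.println(tempSubdir(SINGLE_OUTPUT_SENTINEL));
        System.out.println(tempSubdir(2));
    }
}
```

Doing the normalization inside the target, rather than changing the value the runtime passes, matches the concern in the description about other implementations that may rely on receiving -1.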