[ https://issues.apache.org/jira/browse/CRUNCH-552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Josh Wills updated CRUNCH-552:
------------------------------
Attachment: CRUNCH-552.patch
Attached is the patch for this, which does four things:
1) Makes Crunch's custom OutputFormat for Parquet public so Spark can access it,
2) Moves some of the Avro test classes (Employee and Person) to the crunch-test
module so that they can be used by both crunch-core and crunch-spark,
3) Adds Avro/Parquet tests for Spark, and
4) Notes that crunch.namedoutput should be set to "out0" in Crunch-on-Spark so
that the Avro Parquet implementation will work properly.
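
As a rough illustration of item 4, the sketch below sets the crunch.namedoutput property to "out0" on a configuration object. It is only a sketch under stated assumptions: java.util.Properties stands in for Hadoop's Configuration so the snippet runs without Hadoop on the classpath, and the class and helper names (NamedOutputSketch, withSparkNamedOutput) are hypothetical and not taken from the patch. With a real org.apache.hadoop.conf.Configuration the equivalent call would presumably be conf.set("crunch.namedoutput", "out0"); where the patch itself applies the setting is not shown in this message.

```java
import java.util.Properties;

public class NamedOutputSketch {
    // Property key and value as quoted in item 4 of the patch notes.
    static final String NAMED_OUTPUT_KEY = "crunch.namedoutput";
    static final String SPARK_NAMED_OUTPUT = "out0";

    // Hypothetical helper: returns a copy of the given settings with the
    // named output pinned to "out0". Properties is an assumed stand-in for
    // Hadoop's Configuration, used only to keep the example self-contained.
    static Properties withSparkNamedOutput(Properties base) {
        Properties conf = new Properties();
        conf.putAll(base);
        conf.setProperty(NAMED_OUTPUT_KEY, SPARK_NAMED_OUTPUT);
        return conf;
    }

    public static void main(String[] args) {
        Properties conf = withSparkNamedOutput(new Properties());
        System.out.println(NAMED_OUTPUT_KEY + "=" + conf.getProperty(NAMED_OUTPUT_KEY));
    }
}
```

The helper copies the incoming settings rather than mutating them, so a caller can keep an unmodified configuration around for non-Spark pipelines.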
> Enable AvroParquet to work with Crunch-on-Spark
> -----------------------------------------------
>
> Key: CRUNCH-552
> URL: https://issues.apache.org/jira/browse/CRUNCH-552
> Project: Crunch
> Issue Type: Bug
> Components: Core, IO
> Affects Versions: 0.12.0
> Reporter: Josh Wills
> Assignee: Josh Wills
> Fix For: 0.13.0
>
> Attachments: CRUNCH-552.patch
>
>
> Via the mailing list, we got a bug report that Crunch's Parquet target
> classes did not work with Crunch-on-Spark. The most obvious problem was Spark
> not being able to access the OutputFormat class that Crunch was using for
> writing Avro records to Parquet files, but there were a couple of other
> smaller issues that needed to be fixed as well.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)