Hmm, thanks for the pointer. Unfortunately, I was not using any Avros and was not using PCollection#cache when this occurred. The most recent runs I've kicked off didn't exhibit the behavior, but I will keep looking.
Thanks,
Jeff

On Fri, Nov 6, 2015 at 3:13 PM, Josh Wills <[email protected]> wrote:

> I think there was a bug w/the caching that Micah noticed:
> https://issues.apache.org/jira/browse/CRUNCH-569
>
> Maybe related?
>
> On Fri, Nov 6, 2015 at 12:14 PM, Jeff Quinn <[email protected]> wrote:
>
>> Hello,
>>
>> Are there any known issues with using PCollection#materialize with
>> SparkPipeline? I am trying to use it in my pipeline and I am seeing
>> interesting errors occur sometimes when the materialization is attempted,
>> such as:
>>
>> java.lang.IllegalArgumentException: Unknown codec:
>> che.hadoop.io.compress.SnappyCodec^@^@^@^@??^V??fi?8?lU?????????^V??fi?8?lU???^A
>> ^@^@^@^C^@^@^@^E^C^H?^A??^AP^@^@^A?^@
>>
>> SeqFileReaderFactory: Could not read seqfile at path:
>> hdfs://ip-10-0-17-226.ec2.internal:8020/tmp/crunch-300241792/p5/part-r-00001
>>
>> java.io.IOException: Invalid size: -2062707543 for file metadata object
>>
>> This is with Crunch 0.13.0 / Spark 1.5.0. Anyone have any ideas?
>>
>> Thanks!
>>
>> Jeff
>>
>> *DISCLAIMER:* The contents of this email, including any attachments, may
>> contain information that is confidential, proprietary in nature, protected
>> health information (PHI), or otherwise protected by law from disclosure,
>> and is solely for the use of the intended recipient(s). If you are not the
>> intended recipient, you are hereby notified that any use, disclosure or
>> copying of this email, including any attachments, is unauthorized and
>> strictly prohibited. If you have received this email in error, please
>> notify the sender of this email. Please delete this and all copies of this
>> email from your system. Any opinions either expressed or implied in this
>> email and all attachments, are those of its author only, and do not
>> necessarily reflect those of Nuna Health, Inc.
