hrm, that sounds like something is wrong with the commit operation on the Spark side; let me take a look at it this evening!
J

On Thu, May 10, 2018 at 8:56 AM, David Ortiz <[email protected]> wrote:
> Hello,
>
> Are there any known issues with the AvroParquetPathPerKeyTarget when
> running a Spark pipeline? When I run my pipeline with mapreduce, I get
> output, and when I run with spark, the step before where I list my
> partition keys out (because we use them to add partitions to hive) lists
> data being present, but the output directory remains empty. This behavior
> is occurring targeting both HDFS and S3 directly.
>
> Thanks,
> Dave
