Yup, sure, why not.
2016-11-18 2:41 GMT+08:00 Josh Wills <[email protected]>:
> oh, lame. I can imagine why that would happen-- by design, Shard.shard takes
> a PCollection<T>, so it's mucking with the PTable keys here when it does the
> random distribution. We can add a PTable-specific version of Shard.shard w/o
> too much trouble- would you mind filing a JIRA?
> https://issues.apache.org/jira/browse/CRUNCH
>
> On Thu, Nov 17, 2016 at 3:48 AM, wu lihu <[email protected]> wrote:
>>
>> Hi Everyone
>> I have a job to work with parquet file output,
>> Shard.shard(outTable,10).write(new
>> AvroParquetFileTarget(tempOut+path), Target.WriteMode.OVERWRITE);
>>
>> However, the output looks like below
>> 3.0.3.1.2.CH24_RELEASE 2
>> 3.0.3.1.2.CH24_RELEASEE 1
>> 3.0.3.1.2.CH24_RELEASEEA 1
>> 3.0.3.1.2.CH24_RELEASEEAS 1
>> 3.0.3.1.2.CH24_RELEASEEASE 29
>> 3.0.3.1.2.CH24_RELEASEEASES 160
>> 3.0.3.1.2.CH24_RELEASEEASESE 85
>> 3.0.3.1.2.CH24_RELEASEEASESEE 14
>> 3.0.3.1.2.CH24_RELEASEEASESEEE 4
>> 3.0.3.1.2.CH24_RELEASEEASESEEES 1
>> there is extra suffix added to the key of the PTable, all of them
>> should be RELEASE but not the RELEASEEASE bra bra
>>
>> If I remove the Shard, and keeps all the same, the output looks like
>> normal
>> 3.0.0.1.2.CH.1.4_RELEASE 1
>> 3.0.1.1.2.CH22_RELEASE 1622
>> 3.0.1.1.2.CH23_RELEASE 10607
>> 3.0.14.1.2.CH.1.3_RELEASE 18080
>> 3.0.19.1.2.TC21_RELEASE 5
>> 3.0.2.1.2.CH11_RELEASE 3
>> 3.0.2.1.2.TC21_RELEASE 4
>> 3.0.20.1.2.TC21_RELEASE 247
>> 3.0.20.7.2.SX.1.2A_RELEASE 2
>> 3.0.20.8.2.SX.1.3A_RELEASE 1
>>
>>
>> Any thoughts ???
