vinothchandar commented on issue #672: [HUDI-113]: Use Tuple2 over # delimited string
URL: https://github.com/apache/incubator-hudi/pull/672#issuecomment-491938373

```
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 40.0 failed 1 times, most recent failure: Lost task 1.0 in stage 40.0 (TID 56, localhost, executor driver): java.lang.ClassCastException: scala.Tuple2 cannot be cast to java.lang.String
    at com.uber.hoodie.index.bloom.BucketizedBloomCheckPartitioner.getPartition(BucketizedBloomCheckPartitioner.java:142)
    at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:152)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
```

I am seeing some errors due to a casting mismatch: `BucketizedBloomCheckPartitioner.getPartition` still casts the key to `String`, but it now receives a `Tuple2`.

High-level question: I see that sorting a `List<Tuple2>` already orders records by the first element and then by the second (in that order). So can we just go back to `Tuple2<String, HoodieKey>`? Or, if we have to use a custom comparator, can it just work off that?
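
To make the question concrete, here is a minimal sketch of the two pieces involved, written against plain JDK types only. Everything in it is an assumption for illustration: `Map.Entry<String, String>` stands in for `Tuple2<String, HoodieKey>`, and `PairOrdering`, `sortPairs`, and `partitionFor` are hypothetical names, not code from this PR or from `BucketizedBloomCheckPartitioner`.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

class PairOrdering {
    // Orders pairs by the first element, then by the second.
    // Map.Entry<String, String> is a stand-in for Tuple2<String, HoodieKey>.
    static final Comparator<Map.Entry<String, String>> BY_FIRST_THEN_SECOND =
        Comparator.comparing((Map.Entry<String, String> e) -> e.getKey())
                  .thenComparing(e -> e.getValue());

    // Returns a sorted copy; the input list is left untouched.
    static List<Map.Entry<String, String>> sortPairs(List<Map.Entry<String, String>> pairs) {
        List<Map.Entry<String, String>> out = new ArrayList<>(pairs);
        out.sort(BY_FIRST_THEN_SECOND);
        return out;
    }

    // Partitioner-style lookup (hypothetical): reads the first element of the
    // pair instead of casting the whole key to String, which is the kind of
    // cast that fails in the stack trace above.
    static int partitionFor(Object key, int numPartitions) {
        Map.Entry<?, ?> pair = (Map.Entry<?, ?>) key;
        return Math.floorMod(pair.getKey().hashCode(), numPartitions);
    }
}
```

With this shape, the comparator and the partitioner both read the same pair key, so no `#`-delimited string has to be built or parsed.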
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services
