[GitHub] [incubator-hudi] vinothchandar commented on issue #672: [HUDI-113]: Use Tuple2 over # delimited string

GitBox Mon, 13 May 2019 11:40:34 -0700

vinothchandar commented on issue #672: [HUDI-113]: Use Tuple2 over # delimited string
URL: https://github.com/apache/incubator-hudi/pull/672#issuecomment-491938373
 
 
   ```
   Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 40.0 failed 1 times, most recent failure: Lost task 1.0 in stage 40.0 (TID 56, localhost, executor driver): java.lang.ClassCastException: scala.Tuple2 cannot be cast to java.lang.String
        at com.uber.hoodie.index.bloom.BucketizedBloomCheckPartitioner.getPartition(BucketizedBloomCheckPartitioner.java:142)
        at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:152)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
   ```
   
   I see some errors due to a cast mismatch: the partitioner still casts the key to `String`, but the key is now a `Tuple2`.
   
   High-level question: I see that sorting a `List<Tuple2>` already orders 
records by the first and then the second element (in that order). So, can we 
just go back to `Tuple2<String, HoodieKey>`? Or, if we have to use a custom 
comparator, can it just work off that?
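
To make the question concrete, here is a minimal sketch in plain Java, without Spark or Hudi on the classpath. `Pair`, `BY_FIRST_THEN_SECOND`, `sorted` and `partitionFor` are hypothetical stand-ins (both elements are `String` here, whereas the real key would carry a `HoodieKey`); this is not the code in `BucketizedBloomCheckPartitioner`. It only shows a comparator that works off the two tuple elements, and a partitioner-style method that takes the key as `Object` (as Spark's `Partitioner.getPartition` does) and reads the tuple instead of casting to `String`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class PairKeyOrdering {

    // Hypothetical stand-in for a (fileId, recordKey) tuple key.
    // Not a Hudi or Scala class; both elements are Strings for simplicity.
    static final class Pair {
        final String first;
        final String second;

        Pair(String first, String second) {
            this.first = first;
            this.second = second;
        }

        @Override
        public String toString() {
            return "(" + first + "," + second + ")";
        }
    }

    // Custom comparator that works off the tuple: first element, then second.
    static final Comparator<Pair> BY_FIRST_THEN_SECOND =
        Comparator.comparing((Pair p) -> p.first).thenComparing(p -> p.second);

    // Returns a sorted copy, leaving the input list untouched.
    static List<Pair> sorted(List<Pair> input) {
        List<Pair> copy = new ArrayList<>(input);
        copy.sort(BY_FIRST_THEN_SECOND);
        return copy;
    }

    // Partitioner-style method: the key arrives untyped, so check the type
    // explicitly instead of casting to String (the cast that fails in the trace).
    // Assumption: partitioning depends only on the first element.
    static int partitionFor(Object key, int numPartitions) {
        if (!(key instanceof Pair)) {
            throw new IllegalArgumentException("Expected a Pair key, got: " + key);
        }
        Pair pair = (Pair) key;
        return Math.floorMod(pair.first.hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        List<Pair> keys = Arrays.asList(
            new Pair("f2", "a"), new Pair("f1", "b"), new Pair("f1", "a"));
        System.out.println(sorted(keys));
        System.out.println(partitionFor(keys.get(0), 4));
    }
}
```

If the second element is not naturally comparable, the same comparator shape should still apply by extracting a comparable field from it (for example a record key string), which is an assumption about the key type rather than something confirmed here.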


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
