[jira] [Commented] (SPARK-3274) Spark Streaming Java API reports java.lang.ClassCastException when calling collectAsMap on JavaPairDStream
[ https://issues.apache.org/jira/browse/SPARK-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14151432#comment-14151432 ]

Sean Owen commented on SPARK-3274:
--

I don't think that's the same thing. It is just saying you are reading a {{SequenceFile}} of {{Text}} and then pretending they are Strings. Are you sure the first {{return}} statement works? They will both work as expected if you just call {{.toString()}} on the {{Text}} objects you are actually operating on.

Spark Streaming Java API reports java.lang.ClassCastException when calling collectAsMap on JavaPairDStream
--

                 Key: SPARK-3274
                 URL: https://issues.apache.org/jira/browse/SPARK-3274
             Project: Spark
          Issue Type: Bug
          Components: Java API
    Affects Versions: 1.0.2
            Reporter: Jack Hu

Reproduce code:

    scontext
        .socketTextStream("localhost", 1)
        .mapToPair(new PairFunction<String, String, String>() {
            public Tuple2<String, String> call(String arg0) throws Exception {
                return new Tuple2<String, String>("1", arg0);
            }
        })
        .foreachRDD(new Function2<JavaPairRDD<String, String>, Time, Void>() {
            public Void call(JavaPairRDD<String, String> v1, Time v2) throws Exception {
                System.out.println(v2.toString() + ": " + v1.collectAsMap().toString());
                return null;
            }
        });

Exception:

    java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to [Lscala.Tuple2;
        at org.apache.spark.rdd.PairRDDFunctions.collectAsMap(PairRDDFunctions.scala:447)
        at org.apache.spark.api.java.JavaPairRDD.collectAsMap(JavaPairRDD.scala:464)
        at tuk.usecase.failedcall.FailedCall$1.call(FailedCall.java:90)
        at tuk.usecase.failedcall.FailedCall$1.call(FailedCall.java:88)
        at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$2.apply(JavaDStreamLike.scala:282)
        at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$2.apply(JavaDStreamLike.scala:282)
        at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:41)
        at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
        at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
        at scala.util.Try$.apply(Try.scala:161)
        at org.apache.spark.streaming.scheduler.Job.run(Job.scala:32)
        at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobS

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
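[Editorial note] The {{[Ljava.lang.Object; cannot be cast to [Lscala.Tuple2;}} message in the report is the array form of ClassCastException: an {{Object[]}} can never be cast to a more specific array type, even when every element would fit, because the JVM checks the runtime class of the array itself. A minimal stand-alone sketch of the same failure, using {{String[]}} as a stand-in for {{Tuple2[]}} (class and method names here are illustrative, not from the issue):

```java
public class ArrayCastDemo {
    // Returns the ClassCastException's message, or null if the cast
    // unexpectedly succeeds.
    static String castObjectArray() {
        Object[] boxed = new Object[] { "a", "b" };  // runtime class is Object[]
        try {
            // Fails even though both elements are Strings: the JVM checks
            // the array's own class, not its contents.
            String[] strings = (String[]) boxed;
            return null;
        } catch (ClassCastException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(castObjectArray());
    }
}
```

This is exactly why a {{collect()}} that returns {{Object[]}} cannot simply be cast to {{Tuple2[]}} inside {{collectAsMap}}.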
[ https://issues.apache.org/jira/browse/SPARK-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14151447#comment-14151447 ]

Pulkit Bhuwalka commented on SPARK-3274:

[~sowen] - you are right. I was making the mistake of reading the sequence file as String instead of Text. Adding toString fixed the problem. Thanks a lot for your help.
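[Editorial note] The fix described above, sketched without the Spark/Hadoop dependencies: when a record's runtime class is {{Text}}, convert it explicitly with {{.toString()}} instead of casting it to {{String}}. In this stand-alone sketch, {{StringBuilder}} stands in for {{org.apache.hadoop.io.Text}} (both are non-String classes whose usable {{String}} comes only from {{toString()}}); the names are illustrative:

```java
public class ToStringFixDemo {
    // Broken approach: pretend the value already is a String.
    static String viaCast(Object value) {
        return (String) value;    // throws ClassCastException for non-Strings
    }

    // Working approach: explicit conversion, per the .toString() suggestion.
    static String viaToString(Object value) {
        return value.toString();  // works for Text, StringBuilder, anything
    }

    public static void main(String[] args) {
        Object record = new StringBuilder("some record");  // stand-in for Text
        System.out.println(viaToString(record));           // fine
        try {
            viaCast(record);                               // blows up
        } catch (ClassCastException e) {
            System.out.println("cast failed: " + e.getMessage());
        }
    }
}
```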
[ https://issues.apache.org/jira/browse/SPARK-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14151233#comment-14151233 ]

Pulkit Bhuwalka commented on SPARK-3274:

    SparkConf sparkConf = new SparkConf().setAppName("Page Rank").setMaster("local[4]");
    JavaSparkContext context = new JavaSparkContext(sparkConf);
    JavaPairRDD<String, String> transformedLinkMap = context
        .sequenceFile(pageRankOptions.getFileLocation(), String.class, String.class, 1)
        .mapToPair(new PairFunction<Tuple2<String, String>, String, String>() {
            @Override
            public Tuple2<String, String> call(Tuple2<String, String> urlAndLinks) throws Exception {
                //return new Tuple2<String, String>(urlAndLinks._1(), urlAndLinks._2());
                return new Tuple2<String, String>(
                    urlAndLinks._1(),
                    new LinkDetails(1.0, new LinkParser().parse(urlAndLinks._2())).toString()
                );
            }
        });

When I use the commented line above, which simply returns the strings, it works. However, when I use the code after that with LinkDetails, which simply parses the string into an object, the code fails with a ClassCastException.

    java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to java.lang.String
        at io.pulkit.cmu.acc.project1.phase2.PageRankSparkJob$1.call(PageRankSparkJob.java:28)
        at io.pulkit.cmu.acc.project1.phase2.PageRankSparkJob$1.call(PageRankSparkJob.java:24)
        at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:926)
        at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:926)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
        at org.apache.spark.util.Utils$.getIteratorSize(Utils.scala:1167)
        at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:904)
        at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:904)
        at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1121)
        at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1121)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
        at org.apache.spark.scheduler.Task.run(Task.scala:54)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
        at java.lang.Thread.run(Thread.java:695)

I looked at the other link mentioned. However, the pull request link on that does not work, and it is marked as resolved in 0.9, while I'm using 1.1.0. Thanks a lot.
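[Editorial note] Why the commented-out line above appears to work: Java generics are erased at runtime, so {{Tuple2<String, String>}} pairs that actually hold {{Text}} objects flow through the pipeline untouched; the compiler-inserted checkcast fires only where a value is first used as a {{String}} (here, {{PageRankSparkJob.java:28}} in the trace). A stand-alone sketch of the same effect, with {{StringBuilder}} as a stand-in for {{Text}}:

```java
import java.util.ArrayList;
import java.util.List;

public class ErasureDemo {
    // Smuggle a non-String into a List<String> through a raw reference,
    // mirroring how Text values flow through a JavaPairRDD<String, String>.
    static List<String> smuggle() {
        List<String> strings = new ArrayList<>();
        @SuppressWarnings({"rawtypes", "unchecked"})
        List raw = strings;
        raw.add(new StringBuilder("not really a String"));  // stand-in for Text
        return strings;
    }

    public static void main(String[] args) {
        List<String> strings = smuggle();
        Object ok = ((List) strings).get(0);  // no cast needed: no exception
        System.out.println(ok.getClass().getSimpleName());
        try {
            String s = strings.get(0);        // compiler-inserted checkcast fails
            System.out.println(s);
        } catch (ClassCastException e) {
            System.out.println("ClassCastException only at first use as String");
        }
    }
}
```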
[ https://issues.apache.org/jira/browse/SPARK-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14113702#comment-14113702 ]

Sean Owen commented on SPARK-3274:
--

Same as the problem and solution in https://issues.apache.org/jira/browse/SPARK-1040

--
This message was sent by Atlassian JIRA (v6.2#6252)
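[Editorial note] One way to avoid the array-cast failure class that SPARK-1040 covers: when a collection comes back typed as {{Object[]}}, cast the elements, never the array. Whether this matches the actual SPARK-1040 patch is not shown in this thread; the sketch below is an illustrative pattern only, with {{Map.Entry}} as a stand-in for {{Tuple2}}:

```java
import java.util.AbstractMap;
import java.util.HashMap;
import java.util.Map;

public class CollectAsMapDemo {
    // Builds a Map from an Object[] of pairs without ever casting the
    // array itself (the cast that produced "[Ljava.lang.Object; cannot
    // be cast to [Lscala.Tuple2;" in the report).
    static Map<String, String> toMap(Object[] collected) {
        Map<String, String> result = new HashMap<>();
        for (Object o : collected) {
            Map.Entry<?, ?> pair = (Map.Entry<?, ?>) o;  // per-element cast is fine
            result.put((String) pair.getKey(), (String) pair.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Object[] collected = new Object[] {
            new AbstractMap.SimpleEntry<>("1", "first line"),
            new AbstractMap.SimpleEntry<>("2", "second line"),
        };
        System.out.println(toMap(collected));
    }
}
```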