[jira] [Commented] (SPARK-3274) Spark Streaming Java API reports java.lang.ClassCastException when calling collectAsMap on JavaPairDStream

2014-09-29 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14151432#comment-14151432
 ] 

Sean Owen commented on SPARK-3274:
--

I don't think that's the same thing. It is just saying you are reading a 
{{SequenceFile}} of {{Text}} and then pretending they are Strings. Are you sure 
the first {{return}} statement works? 

They will both work as expected if you just call {{.toString()}} on the 
{{Text}} objects you are actually operating on.
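The root cause can be sketched outside Spark with plain Java (a hypothetical stand-in: `StringBuilder` plays the role of Hadoop's `Text`; because generic types are erased at runtime, declaring the wrong element type compiles cleanly and only fails at the first real use — just as passing `String.class` to `sequenceFile()` does for a file that actually holds `Text`):

```java
import java.util.ArrayList;
import java.util.List;

public class TextAsStringDemo {
    public static void main(String[] args) {
        // Stand-in for values deserialized from a SequenceFile: at runtime the
        // objects are Text-like (StringBuilder here), not String.
        List<Object> values = new ArrayList<>();
        values.add(new StringBuilder("http://example.org"));

        // Declaring the wrong element type compiles, because generics are erased,
        // much like passing String.class for a file that really contains Text.
        @SuppressWarnings("unchecked")
        List<String> pretendStrings = (List<String>) (List<?>) values;

        boolean castFailed = false;
        try {
            // The ClassCastException surfaces only at the first real use of the
            // element, which is why the job fails inside call(), not at the read.
            String s = pretendStrings.get(0);
            System.out.println(s);
        } catch (ClassCastException e) {
            castFailed = true;
        }
        System.out.println("cast failed: " + castFailed);

        // The fix suggested above: convert explicitly, as with .toString() on Text.
        String fixed = values.get(0).toString();
        System.out.println("fixed: " + fixed);
    }
}
```

Running this prints `cast failed: true` followed by `fixed: http://example.org`, mirroring the job that compiles fine but throws inside `call()`.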

 Spark Streaming Java API reports java.lang.ClassCastException when calling 
 collectAsMap on JavaPairDStream
 --

 Key: SPARK-3274
 URL: https://issues.apache.org/jira/browse/SPARK-3274
 Project: Spark
  Issue Type: Bug
  Components: Java API
Affects Versions: 1.0.2
Reporter: Jack Hu

 Reproduce code:
 scontext
   .socketTextStream("localhost", 1)
   .mapToPair(new PairFunction<String, String, String>() {
     public Tuple2<String, String> call(String arg0) throws Exception {
       return new Tuple2<String, String>("1", arg0);
     }
   })
   .foreachRDD(new Function2<JavaPairRDD<String, String>, Time, Void>() {
     public Void call(JavaPairRDD<String, String> v1, Time v2) throws Exception {
       System.out.println(v2.toString() + ": " + v1.collectAsMap().toString());
       return null;
     }
   });
 Exception:
 java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to [Lscala.Tuple2;
 at org.apache.spark.rdd.PairRDDFunctions.collectAsMap(PairRDDFunctions.scala:447)
 at org.apache.spark.api.java.JavaPairRDD.collectAsMap(JavaPairRDD.scala:464)
 at tuk.usecase.failedcall.FailedCall$1.call(FailedCall.java:90)
 at tuk.usecase.failedcall.FailedCall$1.call(FailedCall.java:88)
 at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$2.apply(JavaDStreamLike.scala:282)
 at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$2.apply(JavaDStreamLike.scala:282)
 at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:41)
 at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
 at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
 at scala.util.Try$.apply(Try.scala:161)
 at org.apache.spark.streaming.scheduler.Job.run(Job.scala:32)
 at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobS



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-3274) Spark Streaming Java API reports java.lang.ClassCastException when calling collectAsMap on JavaPairDStream

2014-09-29 Thread Pulkit Bhuwalka (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14151447#comment-14151447
 ] 

Pulkit Bhuwalka commented on SPARK-3274:


[~sowen] - you are right. I was making the mistake of reading the sequence file 
as String instead of Text. Adding toString() fixed the problem. Thanks a lot for 
your help.




[jira] [Commented] (SPARK-3274) Spark Streaming Java API reports java.lang.ClassCastException when calling collectAsMap on JavaPairDStream

2014-09-28 Thread Pulkit Bhuwalka (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14151233#comment-14151233
 ] 

Pulkit Bhuwalka commented on SPARK-3274:


SparkConf sparkConf = new SparkConf().setAppName("Page Rank").setMaster("local[4]");
JavaSparkContext context = new JavaSparkContext(sparkConf);

JavaPairRDD<String, String> transformedLinkMap =
    context.sequenceFile(pageRankOptions.getFileLocation(), String.class, String.class, 1)
        .mapToPair(new PairFunction<Tuple2<String, String>, String, String>() {
            @Override
            public Tuple2<String, String> call(Tuple2<String, String> urlAndLinks) throws Exception {
                // return new Tuple2<String, String>(urlAndLinks._1(), urlAndLinks._2());
                return new Tuple2<String, String>(
                    urlAndLinks._1(),
                    new LinkDetails(1.0, new LinkParser().parse(urlAndLinks._2())).toString()
                );
            }
        });

When I use the commented line above, which simply returns the strings, it 
works. However, when I use the code after it, where LinkDetails simply parses 
the string into an object, the job fails with a ClassCastException.

java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to java.lang.String
at io.pulkit.cmu.acc.project1.phase2.PageRankSparkJob$1.call(PageRankSparkJob.java:28)
at io.pulkit.cmu.acc.project1.phase2.PageRankSparkJob$1.call(PageRankSparkJob.java:24)
at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:926)
at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:926)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at org.apache.spark.util.Utils$.getIteratorSize(Utils.scala:1167)
at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:904)
at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:904)
at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1121)
at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1121)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
at org.apache.spark.scheduler.Task.run(Task.scala:54)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:695)

I looked at the other issue mentioned. However, the pull request link there 
does not work, and it is marked as resolved in 0.9, while I'm using 1.1.0.

Thanks a lot.
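For the record, the corrected read would look roughly like the following. This is a sketch only, not compiled here; it keeps the pageRankOptions, LinkDetails, and LinkParser helpers from the snippet above, which are the reporter's own classes, and applies the Text-plus-toString() fix from the comments:

```java
// Read the SequenceFile with its real on-disk types (Hadoop Text),
// then convert to String explicitly via toString().
JavaPairRDD<String, String> transformedLinkMap =
    context.sequenceFile(pageRankOptions.getFileLocation(), Text.class, Text.class, 1)
        .mapToPair(new PairFunction<Tuple2<Text, Text>, String, String>() {
            @Override
            public Tuple2<String, String> call(Tuple2<Text, Text> urlAndLinks) throws Exception {
                return new Tuple2<String, String>(
                    urlAndLinks._1().toString(),
                    new LinkDetails(1.0, new LinkParser().parse(urlAndLinks._2().toString())).toString());
            }
        });
```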




[jira] [Commented] (SPARK-3274) Spark Streaming Java API reports java.lang.ClassCastException when calling collectAsMap on JavaPairDStream

2014-08-28 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14113702#comment-14113702
 ] 

Sean Owen commented on SPARK-3274:
--

Same as the problem and solution in 
https://issues.apache.org/jira/browse/SPARK-1040
