[
https://issues.apache.org/jira/browse/SPARK-12750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sean Owen resolved SPARK-12750.
-------------------------------
Resolution: Not A Problem
The problem is as it says. You've written Java code such that your Function
retains a reference to a non-serializable object.
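The serialization stack in the report shows why: the anonymous Function class
(anbv.qwe.myrddd$1) carries an implicit this$0 field pointing at the enclosing
myrddd instance, which is not serializable. As a minimal sketch of one common
fix (reusing the reporter's names, otherwise hypothetical): define the Function
as a static nested class so it holds no reference to the outer object, and pass
in only the values it needs.

    // Hypothetical rewrite: a static nested class has no implicit this$0
    // field, so only the int[] and int it stores must be serializable.
    static class SelectColumns implements Function<LabeledPoint, LabeledPoint> {
        private final int[] asdf;
        private final int b;

        SelectColumns(int[] asdf, int b) {
            this.asdf = asdf;
            this.b = b;
        }

        public LabeledPoint call(LabeledPoint a) {
            double[] v = new double[b];
            for (int i = 0; i < b; i++) {
                v[i] = a.features().toArray()[asdf[i]];
            }
            return new LabeledPoint(a.label(), Vectors.dense(v));
        }
    }

    public JavaRDD<LabeledPoint> abcdf(int[] asdf, int b) {
        // No anonymous inner class here, so nothing captures 'this'.
        return abcd.map(new SelectColumns(asdf, b));
    }

Making myrddd implement java.io.Serializable would also make the error go away,
but it serializes the whole outer object with every task, which is usually
unnecessary.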
> Java class method doesn't work properly
> ---------------------------------------
>
> Key: SPARK-12750
> URL: https://issues.apache.org/jira/browse/SPARK-12750
> Project: Spark
> Issue Type: Question
> Reporter: Gramce
>
> I am using Spark's Java API to transform LabeledPoint data.
> I want to select several columns from a JavaRDD<LabeledPoint>, for example
> the first three columns.
> So I wrote this:
> int[] ad = {1, 2, 3};
> int b = ad.length;
> JavaRDD<LabeledPoint> ggd = parsedData.map(
>     new Function<LabeledPoint, LabeledPoint>() {
>         public LabeledPoint call(LabeledPoint a) {
>             double[] v = new double[b];
>             for (int i = 0; i < b; i++) {
>                 v[i] = a.features().toArray()[ad[i]];
>             }
>             return new LabeledPoint(a.label(), Vectors.dense(v));
>         }
>     });
> where parsedData is a JavaRDD<LabeledPoint>.
> Now I want to turn this into a method, so the code looks like this:
> class myrddd {
>     public JavaRDD<LabeledPoint> abcd;
>
>     public myrddd(JavaRDD<LabeledPoint> deff) {
>         abcd = deff;
>     }
>
>     public JavaRDD<LabeledPoint> abcdf(int[] asdf, int b) {
>         JavaRDD<LabeledPoint> bcd = abcd;
>         JavaRDD<LabeledPoint> mms = bcd.map(
>             new Function<LabeledPoint, LabeledPoint>() {
>                 public LabeledPoint call(LabeledPoint a) {
>                     double[] v = new double[b];
>                     for (int i = 0; i < b; i++) {
>                         v[i] = a.features().toArray()[asdf[i]];
>                     }
>                     return new LabeledPoint(a.label(), Vectors.dense(v));
>                 }
>             });
>         return mms;
>     }
> }
> And:
> myrddd ndfs = new myrddd(parsedData);
> JavaRDD<LabeledPoint> ggdf = ndfs.abcdf(ad, b);
> But this doesn't work. The following is the error:
> Exception in thread "main" org.apache.spark.SparkException: Task not serializable
>     at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:304)
>     at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:294)
>     at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:122)
>     at org.apache.spark.SparkContext.clean(SparkContext.scala:2032)
>     at org.apache.spark.rdd.RDD$$anonfun$map$1.apply(RDD.scala:318)
>     at org.apache.spark.rdd.RDD$$anonfun$map$1.apply(RDD.scala:317)
>     at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
>     at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
>     at org.apache.spark.rdd.RDD.withScope(RDD.scala:310)
>     at org.apache.spark.rdd.RDD.map(RDD.scala:317)
>     at org.apache.spark.api.java.JavaRDDLike$class.map(JavaRDDLike.scala:93)
>     at org.apache.spark.api.java.AbstractJavaRDDLike.map(JavaRDDLike.scala:47)
>     at anbv.qwe.myrddd.abcdf(dfa.java:53)
>     at anbv.qwe.dfa.main(dfa.java:42)
> Caused by: java.io.NotSerializableException: anbv.qwe.myrddd
> Serialization stack:
>     - object not serializable (class: anbv.qwe.myrddd, value: anbv.qwe.myrddd@310aee0b)
>     - field (class: anbv.qwe.myrddd$1, name: this$0, type: class anbv.qwe.myrddd)
>     - object (class anbv.qwe.myrddd$1, anbv.qwe.myrddd$1@4b76aa5a)
>     - field (class: org.apache.spark.api.java.JavaPairRDD$$anonfun$toScalaFunction$1, name: fun$1, type: interface org.apache.spark.api.java.function.Function)
>     - object (class org.apache.spark.api.java.JavaPairRDD$$anonfun$toScalaFunction$1, <function1>)
>     at org.apache.spark.serializer.SerializationDebugger$.improveException(SerializationDebugger.scala:40)
>     at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:47)
>     at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:84)
>     at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:301)
>     ... 13 more
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]