Gramce created SPARK-12750:
------------------------------

             Summary: Java class method doesn't work properly
                 Key: SPARK-12750
                 URL: https://issues.apache.org/jira/browse/SPARK-12750
             Project: Spark
          Issue Type: Question
            Reporter: Gramce


I am using the Spark Java API to transform LabeledPoint data.
I want to select several columns from a JavaRDD<LabeledPoint>, for example
the first three columns.
So I wrote this:
int[] ad = {1, 2, 3};
int b = ad.length;
JavaRDD<LabeledPoint> ggd = parsedData.map(
        new Function<LabeledPoint, LabeledPoint>() {
            public LabeledPoint call(LabeledPoint a) {
                double[] v = new double[b];
                for (int i = 0; i < b; i++) {
                    v[i] = a.features().toArray()[ad[i]];
                }
                return new LabeledPoint(a.label(), Vectors.dense(v));
            }
        });

where parsedData is a JavaRDD<LabeledPoint>.
Now I want to turn this into a method, so the code looks like this:

class myrddd {
        public JavaRDD<LabeledPoint> abcd;

        public myrddd(JavaRDD<LabeledPoint> deff) {
                abcd = deff;
        }

        public JavaRDD<LabeledPoint> abcdf(int[] asdf, int b) {
                JavaRDD<LabeledPoint> bcd = abcd;
                JavaRDD<LabeledPoint> mms = bcd.map(
                        new Function<LabeledPoint, LabeledPoint>() {
                            public LabeledPoint call(LabeledPoint a) {
                                double[] v = new double[b];
                                for (int i = 0; i < b; i++) {
                                    v[i] = a.features().toArray()[asdf[i]];
                                }
                                return new LabeledPoint(a.label(), Vectors.dense(v));
                            }
                        });
                return mms;
        }
}

And then:

myrddd ndfs = new myrddd(parsedData);
JavaRDD<LabeledPoint> ggdf = ndfs.abcdf(ad, b);

But this doesn't work. The following is the error:

Exception in thread "main" org.apache.spark.SparkException: Task not serializable
        at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:304)
        at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:294)
        at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:122)
        at org.apache.spark.SparkContext.clean(SparkContext.scala:2032)
        at org.apache.spark.rdd.RDD$$anonfun$map$1.apply(RDD.scala:318)
        at org.apache.spark.rdd.RDD$$anonfun$map$1.apply(RDD.scala:317)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
        at org.apache.spark.rdd.RDD.withScope(RDD.scala:310)
        at org.apache.spark.rdd.RDD.map(RDD.scala:317)
        at org.apache.spark.api.java.JavaRDDLike$class.map(JavaRDDLike.scala:93)
        at org.apache.spark.api.java.AbstractJavaRDDLike.map(JavaRDDLike.scala:47)
        at anbv.qwe.myrddd.abcdf(dfa.java:53)
        at anbv.qwe.dfa.main(dfa.java:42)
Caused by: java.io.NotSerializableException: anbv.qwe.myrddd
Serialization stack:
        - object not serializable (class: anbv.qwe.myrddd, value: anbv.qwe.myrddd@310aee0b)
        - field (class: anbv.qwe.myrddd$1, name: this$0, type: class anbv.qwe.myrddd)
        - object (class anbv.qwe.myrddd$1, anbv.qwe.myrddd$1@4b76aa5a)
        - field (class: org.apache.spark.api.java.JavaPairRDD$$anonfun$toScalaFunction$1, name: fun$1, type: interface org.apache.spark.api.java.function.Function)
        - object (class org.apache.spark.api.java.JavaPairRDD$$anonfun$toScalaFunction$1, <function1>)
        at org.apache.spark.serializer.SerializationDebugger$.improveException(SerializationDebugger.scala:40)
        at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:47)
        at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:84)
        at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:301)
        ... 13 more
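The serialization stack above points at the cause: the anonymous Function is an inner class of myrddd, so it carries an implicit this$0 field referencing the enclosing myrddd instance, and myrddd does not implement java.io.Serializable. The inline version works because there the anonymous class is defined in a context Spark's closure cleaner can handle. Below is a minimal, Spark-free sketch of the same capture problem using only plain Java serialization; all class names (CaptureDemo, Outer, StandaloneTask) are illustrative, not part of the reported code:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class CaptureDemo {

    // A Serializable task type, standing in for Spark's Function interface.
    interface SerializableRunnable extends Runnable, Serializable {}

    // Not Serializable, like myrddd in the report.
    static class Outer {
        // An anonymous inner class created in an instance method implicitly
        // holds a this$0 reference to the enclosing Outer instance.
        SerializableRunnable inner() {
            return new SerializableRunnable() {
                public void run() {}
            };
        }
    }

    // A static nested class captures no enclosing instance, so it serializes.
    static class StandaloneTask implements SerializableRunnable {
        public void run() {}
    }

    // Returns true if the object can be Java-serialized, false if the
    // attempt fails with NotSerializableException.
    static boolean serializes(Object o) {
        try (ObjectOutputStream out =
                     new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false; // the captured Outer instance is not serializable
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serializes(new Outer().inner()));   // false
        System.out.println(serializes(new StandaloneTask()));  // true
    }
}
```

Under that assumption, the simplest fixes for the reported code would be either to declare the class Serializable (class myrddd implements java.io.Serializable), or to define the Function as a static nested or top-level class so it captures no myrddd instance, passing it the index array and length through its constructor.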



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
