[
https://issues.apache.org/jira/browse/SPARK-2075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14251507#comment-14251507
]
Shixiong Zhu commented on SPARK-2075:
-------------------------------------
I dug deeper and found something odd.
If I compile with `mvn -Dhadoop.version=1.2.1 -DskipTests clean package -pl core -am`,
`saveAsTextFile` compiles to:
{noformat}
public void saveAsTextFile(java.lang.String);
Code:
0: aload_0
1: new #1577; //class org/apache/spark/rdd/RDD$$anonfun$27
4: dup
5: aload_0
6: invokespecial #1578; //Method
org/apache/spark/rdd/RDD$$anonfun$27."<init>":(Lorg/apache/spark/rdd/RDD;)V
9: getstatic #439; //Field
scala/reflect/ClassTag$.MODULE$:Lscala/reflect/ClassTag$;
12: ldc_w #441; //class scala/Tuple2
15: invokevirtual #445; //Method
scala/reflect/ClassTag$.apply:(Ljava/lang/Class;)Lscala/reflect/ClassTag;
18: invokevirtual #447; //Method
map:(Lscala/Function1;Lscala/reflect/ClassTag;)Lorg/apache/spark/rdd/RDD;
21: astore_2
22: getstatic #439; //Field
scala/reflect/ClassTag$.MODULE$:Lscala/reflect/ClassTag$;
25: ldc_w #1580; //class org/apache/hadoop/io/NullWritable
28: invokevirtual #445; //Method
scala/reflect/ClassTag$.apply:(Ljava/lang/Class;)Lscala/reflect/ClassTag;
31: astore_3
32: getstatic #439; //Field
scala/reflect/ClassTag$.MODULE$:Lscala/reflect/ClassTag$;
35: ldc_w #1582; //class org/apache/hadoop/io/Text
38: invokevirtual #445; //Method
scala/reflect/ClassTag$.apply:(Ljava/lang/Class;)Lscala/reflect/ClassTag;
41: astore 4
43: getstatic #21; //Field
org/apache/spark/rdd/RDD$.MODULE$:Lorg/apache/spark/rdd/RDD$;
46: aload_2
47: invokevirtual #23; //Method
org/apache/spark/rdd/RDD$.rddToPairRDDFunctions$default$4:(Lorg/apache/spark/rdd/RDD;)Lscala/runtime/Null$;
50: astore 5
52: getstatic #21; //Field
org/apache/spark/rdd/RDD$.MODULE$:Lorg/apache/spark/rdd/RDD$;
55: aload_2
56: aload_3
57: aload 4
59: aload 5
61: pop
62: aconst_null
63: invokevirtual #47; //Method
org/apache/spark/rdd/RDD$.rddToPairRDDFunctions:(Lorg/apache/spark/rdd/RDD;Lscala/reflect/ClassTag;Lscala/reflect/ClassTag;Lscala/math/Ordering;)Lorg/apache/spark/rdd/PairRDDFunctions;
66: aload_1
67: getstatic #439; //Field
scala/reflect/ClassTag$.MODULE$:Lscala/reflect/ClassTag$;
70: ldc_w #1584; //class org/apache/hadoop/mapred/TextOutputFormat
73: invokevirtual #445; //Method
scala/reflect/ClassTag$.apply:(Ljava/lang/Class;)Lscala/reflect/ClassTag;
76: invokevirtual #1588; //Method
org/apache/spark/rdd/PairRDDFunctions.saveAsHadoopFile:(Ljava/lang/String;Lscala/reflect/ClassTag;)V
79: return
{noformat}
If I compile with `mvn -Pyarn -Phadoop-2.2 -Dhadoop.version=2.2.0 -DskipTests clean
package -pl core -am`, the bytecode for `saveAsTextFile` is different:
{noformat}
public void saveAsTextFile(java.lang.String);
Code:
0: getstatic #21; //Field
org/apache/spark/rdd/RDD$.MODULE$:Lorg/apache/spark/rdd/RDD$;
3: aload_0
4: new #1577; //class
org/apache/spark/rdd/RDD$$anonfun$saveAsTextFile$1
7: dup
8: aload_0
9: invokespecial #1578; //Method
org/apache/spark/rdd/RDD$$anonfun$saveAsTextFile$1."<init>":(Lorg/apache/spark/rdd/RDD;)V
12: getstatic #439; //Field
scala/reflect/ClassTag$.MODULE$:Lscala/reflect/ClassTag$;
15: ldc_w #441; //class scala/Tuple2
18: invokevirtual #445; //Method
scala/reflect/ClassTag$.apply:(Ljava/lang/Class;)Lscala/reflect/ClassTag;
21: invokevirtual #447; //Method
map:(Lscala/Function1;Lscala/reflect/ClassTag;)Lorg/apache/spark/rdd/RDD;
24: getstatic #439; //Field
scala/reflect/ClassTag$.MODULE$:Lscala/reflect/ClassTag$;
27: ldc_w #1580; //class org/apache/hadoop/io/NullWritable
30: invokevirtual #445; //Method
scala/reflect/ClassTag$.apply:(Ljava/lang/Class;)Lscala/reflect/ClassTag;
33: getstatic #439; //Field
scala/reflect/ClassTag$.MODULE$:Lscala/reflect/ClassTag$;
36: ldc_w #1582; //class org/apache/hadoop/io/Text
39: invokevirtual #445; //Method
scala/reflect/ClassTag$.apply:(Ljava/lang/Class;)Lscala/reflect/ClassTag;
42: getstatic #1587; //Field
scala/math/Ordering$.MODULE$:Lscala/math/Ordering$;
45: getstatic #471; //Field scala/Predef$.MODULE$:Lscala/Predef$;
48: invokevirtual #1591; //Method
scala/Predef$.conforms:()Lscala/Predef$$less$colon$less;
51: invokevirtual #1595; //Method
scala/math/Ordering$.ordered:(Lscala/Function1;)Lscala/math/Ordering;
54: invokevirtual #47; //Method
org/apache/spark/rdd/RDD$.rddToPairRDDFunctions:(Lorg/apache/spark/rdd/RDD;Lscala/reflect/ClassTag;Lscala/reflect/ClassTag;Lscala/math/Ordering;)Lorg/apache/spark/rdd/PairRDDFunctions;
57: aload_1
58: getstatic #439; //Field
scala/reflect/ClassTag$.MODULE$:Lscala/reflect/ClassTag$;
61: ldc_w #1597; //class org/apache/hadoop/mapred/TextOutputFormat
64: invokevirtual #445; //Method
scala/reflect/ClassTag$.apply:(Ljava/lang/Class;)Lscala/reflect/ClassTag;
67: invokevirtual #1601; //Method
org/apache/spark/rdd/PairRDDFunctions.saveAsHadoopFile:(Ljava/lang/String;Lscala/reflect/ClassTag;)V
70: return
{noformat}
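Note: in the Hadoop 1.2.1 build, `saveAsTextFile` uses the default `Ordering`
value, `null`, while in the Hadoop 2.2.0 build it calls `Ordering.ordered` to
create a new `Ordering`. The anonymous function class is also named differently:
`RDD$$anonfun$27` in the first build and `RDD$$anonfun$saveAsTextFile$1` in the
second.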
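
The two call shapes in the bytecode (a `$default$4` getter in one build, `Ordering.ordered` in the other) look like what the Scala compiler emits for an implicit parameter that also has a default value. Below is a minimal sketch of that mechanism, not Spark's code: `describe`, `TypedKey` and `PlainKey` are hypothetical names, and the assumption is only that the key type is `Comparable` of itself in one build and not in the other.

```scala
object ImplicitDefaultDemo {
  // Hypothetical stand-in for a method shaped like rddToPairRDDFunctions:
  // the last parameter is implicit AND has a default value of null.
  def describe[K](key: K)(implicit ord: Ordering[K] = null): String =
    if (ord == null) "default null" else "implicit Ordering found"

  // Comparable to itself, so the compiler can derive an Ordering
  // through the implicit Ordering.ordered.
  final class TypedKey(val id: Int) extends Comparable[TypedKey] {
    override def compareTo(other: TypedKey): Int =
      Integer.compare(id, other.id)
  }

  // No Comparable and no Ordering in implicit scope:
  // the compiler falls back to the default argument.
  final class PlainKey

  def main(args: Array[String]): Unit = {
    println(describe(new TypedKey(1))) // implicit Ordering found
    println(describe(new PlainKey))    // default null
  }
}
```

The same source line therefore compiles to different bytecode depending on whether an `Ordering` for the key type can be resolved against the classes on the compile classpath.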
> Anonymous classes are missing from Spark distribution
> -----------------------------------------------------
>
> Key: SPARK-2075
> URL: https://issues.apache.org/jira/browse/SPARK-2075
> Project: Spark
> Issue Type: Bug
> Components: Build, Spark Core
> Affects Versions: 1.0.0
> Reporter: Paul R. Brown
> Priority: Critical
>
> Running a job built against the Maven dep for 1.0.0 and the hadoop1
> distribution produces:
> {code}
> java.lang.ClassNotFoundException:
> org.apache.spark.rdd.RDD$$anonfun$saveAsTextFile$1
> {code}
> Here's what's in the Maven dep as of 1.0.0:
> {code}
> jar tvf
> ~/.m2/repository/org/apache/spark/spark-core_2.10/1.0.0/spark-core_2.10-1.0.0.jar
> | grep 'rdd/RDD' | grep 'saveAs'
> 1519 Mon May 26 13:57:58 PDT 2014
> org/apache/spark/rdd/RDD$anonfun$saveAsTextFile$1.class
> 1560 Mon May 26 13:57:58 PDT 2014
> org/apache/spark/rdd/RDD$anonfun$saveAsTextFile$2.class
> {code}
> And here's what's in the hadoop1 distribution:
> {code}
> jar tvf spark-assembly-1.0.0-hadoop1.0.4.jar| grep 'rdd/RDD' | grep 'saveAs'
> {code}
> I.e., it's not there. It is in the hadoop2 distribution:
> {code}
> jar tvf spark-assembly-1.0.0-hadoop2.2.0.jar| grep 'rdd/RDD' | grep 'saveAs'
> 1519 Mon May 26 07:29:54 PDT 2014
> org/apache/spark/rdd/RDD$anonfun$saveAsTextFile$1.class
> 1560 Mon May 26 07:29:54 PDT 2014
> org/apache/spark/rdd/RDD$anonfun$saveAsTextFile$2.class
> {code}