Nira Amit created SPARK-18945:
---------------------------------
Summary: java.lang.ClassCastException when Tuple2 field is an array
Key: SPARK-18945
URL: https://issues.apache.org/jira/browse/SPARK-18945
Project: Spark
Issue Type: Bug
Components: Java API
Affects Versions: 2.0.2
Reporter: Nira Amit
The following code throws a java.lang.ClassCastException at runtime:
{code}
private static PairFunction<String, String, String> keyData =
    new PairFunction<String, String, String>() {
        public Tuple2<String, String> call(String x) {
            return new Tuple2(x.split(" ")[0], x.split(" "));
        }
    };

public void testPairRdd() throws Exception {
    JavaRDD<String> lines = sc.parallelize(Arrays.asList("This is one line",
            "And another line",
            "Why not one more line"));
    JavaPairRDD<String, String> pairs = lines.mapToPair(keyData);
    Tuple2<String, String> firstPair = pairs.first();
    System.out.println("Got object of type: " + firstPair.getClass());
    System.out.println("First element is of type: " +
            firstPair._1().getClass());
    System.out.println("Second element is of type: " +
            firstPair._2().getClass());
}
{code}
The exception is thrown by the last print statement. The console output is:
{code}
16/12/20 13:42:12 INFO DAGScheduler: ResultStage 0 (first at
RetentionOutputFormatterTest.java:166) finished in 0.148 s
Got object of type: class scala.Tuple2
First element is of type: class java.lang.String
java.lang.ClassCastException: [Ljava.lang.String; cannot be cast to
java.lang.String
{code}
If the second field of the Tuple2 actually holds a String rather than a
String[], the code works and no exception is thrown.
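One possible reading of the behaviour above, offered as an assumption rather than a confirmed diagnosis: the raw `new Tuple2(...)` call lets a String[] be stored behind a Tuple2<String, String> declaration, and the cast the compiler inserts at the point of use then fails. The sketch below reproduces that pattern without Spark. `Pair` is a hypothetical stand-in for scala.Tuple2, and `RawTupleDemo` and its method names are illustrative only.

```java
final class RawTupleDemo {
    // Hypothetical stand-in for scala.Tuple2 so the sketch runs without Spark.
    static final class Pair<A, B> {
        private final A first;
        private final B second;

        Pair(A first, B second) {
            this.first = first;
            this.second = second;
        }

        A _1() { return first; }
        B _2() { return second; }
    }

    // Mirrors the reported code: the raw constructor call compiles with only an
    // unchecked warning, so a String[] ends up behind a Pair<String, String>.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static Pair<String, String> keyDataRaw(String x) {
        return new Pair(x.split(" ")[0], x.split(" "));
    }

    // Declaring the real element type lets the compiler check the assignment.
    static Pair<String, String[]> keyDataTyped(String x) {
        String[] words = x.split(" ");
        return new Pair<>(words[0], words);
    }

    // Returns the runtime class name of the second element, or the literal
    // "ClassCastException" if reading it as a String fails.
    static String secondTypeRaw(String line) {
        Pair<String, String> pair = keyDataRaw(line);
        try {
            // javac inserts a cast to String here; it fails on the String[].
            String second = pair._2();
            return second.getClass().getName();
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    // Same read, but through the correctly parameterized pair.
    static String secondTypeTyped(String line) {
        String[] second = keyDataTyped(line)._2();
        return second.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println(secondTypeRaw("This is one line"));
        System.out.println(secondTypeTyped("This is one line"));
    }
}
```

If this reading applies to the reported code, the analogous change would be to declare the function as PairFunction<String, String, String[]> and the RDD as JavaPairRDD<String, String[]>, so that the element type matches what `x.split(" ")` returns.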
This is the relevant part of my pom file:
{code:xml}
<properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <jdk.version>1.8</jdk.version>
</properties>
<dependencies>
    <dependency>
        <groupId>log4j</groupId>
        <artifactId>log4j</artifactId>
        <version>1.2.17</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>2.0.1</version>
    </dependency>
    <dependency>
        <groupId>com.amazonaws</groupId>
        <artifactId>aws-java-sdk</artifactId>
        <version>1.9.21</version>
        <exclusions>
            <exclusion>
                <groupId>com.fasterxml.jackson.core</groupId>
                <artifactId>jackson-annotations</artifactId>
            </exclusion>
            <exclusion>
                <groupId>net.java.dev.jets3t</groupId>
                <artifactId>jets3t</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
    <dependency>
        <groupId>junit</groupId>
        <artifactId>junit</artifactId>
        <version>3.8.1</version>
        <scope>test</scope>
    </dependency>
</dependencies>
<build>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-compiler-plugin</artifactId>
            <version>2.3.2</version>
            <configuration>
                <source>1.8</source>
                <target>1.8</target>
            </configuration>
        </plugin>
    </plugins>
</build>
{code}