Jean Georges Perrin created SPARK-26146:
-------------------------------------------

             Summary: CSV wouldn't be ingested in Spark 2.4.0 with Scala 2.12
                 Key: SPARK-26146
                 URL: https://issues.apache.org/jira/browse/SPARK-26146
             Project: Spark
          Issue Type: Bug
          Components: Input/Output
    Affects Versions: 2.4.0
            Reporter: Jean Georges Perrin


When running a simple CSV ingestion like:

{code:java}
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

    // Creates a session on a local master
    SparkSession spark = SparkSession.builder()
        .appName("CSV to Dataset")
        .master("local")
        .getOrCreate();

    // Reads a CSV file with a header, called books.csv, and stores it in a dataframe
    Dataset<Row> df = spark.read().format("csv")
        .option("header", "true")
        .load("data/books.csv");
{code}

With Scala 2.12, I get:

{code:java}
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 10582
at com.thoughtworks.paranamer.BytecodeReadingParanamer$ClassReader.accept(BytecodeReadingParanamer.java:563)
at com.thoughtworks.paranamer.BytecodeReadingParanamer$ClassReader.access$200(BytecodeReadingParanamer.java:338)
at com.thoughtworks.paranamer.BytecodeReadingParanamer.lookupParameterNames(BytecodeReadingParanamer.java:103)
at com.thoughtworks.paranamer.CachingParanamer.lookupParameterNames(CachingParanamer.java:90)
at com.fasterxml.jackson.module.scala.introspect.BeanIntrospector$.getCtorParams(BeanIntrospector.scala:44)
at com.fasterxml.jackson.module.scala.introspect.BeanIntrospector$.$anonfun$apply$1(BeanIntrospector.scala:58)
at com.fasterxml.jackson.module.scala.introspect.BeanIntrospector$.$anonfun$apply$1$adapted(BeanIntrospector.scala:58)
at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:240)
...
at net.jgp.books.sparkWithJava.ch01.CsvToDataframeApp.start(CsvToDataframeApp.java:37)
at net.jgp.books.sparkWithJava.ch01.CsvToDataframeApp.main(CsvToDataframeApp.java:21)
{code}

It works smoothly if I switch back to Scala 2.11.
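
For what it's worth, the stack trace dies inside paranamer's bytecode reader, so a workaround worth trying (an assumption on my part, not a confirmed root cause) is to pin paranamer 2.8, which understands Scala 2.12 bytecode, as a direct dependency in pom.xml so it wins over any older copy arriving transitively:
{code:xml}
<!-- Hypothetical workaround, not a confirmed fix: declare paranamer 2.8
     directly so Maven's nearest-wins mediation prefers it over an older
     transitive copy (e.g. the 2.7 pulled in via jackson-module-scala). -->
<dependency>
  <groupId>com.thoughtworks.paranamer</groupId>
  <artifactId>paranamer</artifactId>
  <version>2.8</version>
</dependency>
{code}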

The full example is available at 
[https://github.com/jgperrin/net.jgp.books.sparkWithJava.ch01]. You can modify 
pom.xml to easily change the Scala version in the properties section:
{code:xml}
<properties>
  <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  <java.version>1.8</java.version>
  <scala.version>2.11</scala.version>
  <spark.version>2.4.0</spark.version>
</properties>
{code}
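
For reference, Spark artifacts are suffixed with the Scala binary version, so that property presumably feeds the artifact IDs along these lines (an illustrative sketch; the actual pom.xml is in the repository linked above):
{code:xml}
<!-- Illustrative sketch; see the linked repository for the real pom.xml.
     Switching scala.version between 2.11 and 2.12 switches the artifact
     between spark-sql_2.11 and spark-sql_2.12. -->
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-sql_${scala.version}</artifactId>
  <version>${spark.version}</version>
</dependency>
{code}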
 

(P.S. This is my first bug report, so I hope I did not mess up too much; please 
be tolerant if I did.)

 


