GitHub user smurakozi opened a pull request:
https://github.com/apache/spark/pull/19906
[SPARK-22516][SQL] Bump up Univocity version to 2.5.9
## What changes were proposed in this pull request?
A bug in the Univocity parser causes the issue reported in SPARK-22516.
Upgrading the library from version 2.5.4 to 2.5.9 fixes it:
**Executing**
```
spark.read
  .option("header", "true").option("inferSchema", "true")
  .option("multiLine", "true").option("comment", "g")
  .csv("test_file_without_eof_char.csv")
  .show()
```
**Before**
```
ERROR Executor: Exception in task 0.0 in stage 6.0 (TID 6)
com.univocity.parsers.common.TextParsingException: java.lang.IllegalArgumentException - Unable to skip 1 lines from line 2. End of input reached
...
Internal state when error was thrown: line=3, column=0, record=2, charIndex=31
  at com.univocity.parsers.common.AbstractParser.handleException(AbstractParser.java:339)
  at com.univocity.parsers.common.AbstractParser.parseNext(AbstractParser.java:475)
  at org.apache.spark.sql.execution.datasources.csv.UnivocityParser$$anon$1.next(UnivocityParser.scala:281)
  at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
```
**After**
```
+-------+-------+
|column1|column2|
+-------+-------+
|    abc|    def|
+-------+-------+
```
## How was this patch tested?
The existing `CSVSuite.commented lines in CSV data` test was extended to
also parse the file in multiline mode, and the test input file was
modified to include a comment on its last line.
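
A standalone check along the same lines might look like the sketch below. It is not the code from `CSVSuite`. The object name, the temp-file handling and the input contents are assumptions made for illustration; the input was chosen so that its last line starts with the comment character `g` and has no trailing newline, which is consistent with `charIndex=31` in the error above but is not confirmed to be the original test file. It needs `spark-sql` on the classpath, and the assertion reflects the **After** output, so it is only expected to hold in multiline mode on a build that includes the upgraded parser.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}

import org.apache.spark.sql.{Row, SparkSession}

object CommentedLastLineCheck {
  // Hypothetical input: the last line begins with the comment character 'g'
  // and the file does not end with a newline.
  val input: String = "column1,column2\nabc,def\nghi,jkl"

  // Reads the CSV file with 'g' as the comment character,
  // with multiLine mode switched on or off.
  def readRows(spark: SparkSession, path: Path, multiLine: Boolean): Seq[Row] =
    spark.read
      .option("header", "true")
      .option("comment", "g")
      .option("multiLine", multiLine.toString)
      .csv(path.toString)
      .collect()
      .toSeq

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("SPARK-22516-check")
      .getOrCreate()
    val path = Files.createTempFile("spark-22516-", ".csv")
    try {
      Files.write(path, input.getBytes(StandardCharsets.UTF_8))
      // Both modes should skip the commented last line and return one row.
      for (multiLine <- Seq(false, true)) {
        val rows = readRows(spark, path, multiLine)
        assert(rows == Seq(Row("abc", "def")), s"multiLine=$multiLine returned $rows")
      }
    } finally {
      Files.deleteIfExists(path)
      spark.stop()
    }
  }
}
```

Running both modes in one loop keeps the non-multiline path as a baseline, so a failure points at the multiline code path rather than at the input file.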
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/smurakozi/spark SPARK-22516
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/19906.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #19906
----
commit 8bc6a9ce9f6eeb854261d26dabaf04052eb8b5b2
Author: smurakozi <[email protected]>
Date: 2017-11-27T08:30:25Z
[SPARK-22516][SQL] Bump up Univocity version to 2.5.9
----