[ https://issues.apache.org/jira/browse/SPARK-20497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988089#comment-15988089 ]
Hyukjin Kwon commented on SPARK-20497:
--------------------------------------
I can't reproduce this on Windows as below:
- Existing path
{code}
val lines =
  """
    |1 1:1.0 3:2.0 5:3.0
    |0
    |0 2:4.0 4:5.0 6:6.0
  """.stripMargin
val loc = "C:\\...\\foo"
val file = new java.io.File(loc)
com.google.common.io.Files.write(lines, file, java.nio.charset.StandardCharsets.UTF_8)
spark.read.format("libsvm").load(loc).show()
{code}
prints
{code}
+-----+--------------------+
|label|            features|
+-----+--------------------+
|  1.0|(6,[0,2,4],[1.0,2...|
|  0.0|           (6,[],[])|
|  0.0|(6,[1,3,5],[4.0,5...|
+-----+--------------------+
{code}
- Non-existing path
{code}
spark.read.format("libsvm").load("/NON_EXISTS").show()
{code}
produces
{code}
org.apache.spark.sql.AnalysisException: Path does not exist: file:/NON_EXISTS;
  at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$14.apply(DataSource.scala:354)
  at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$14.apply(DataSource.scala:342)
  at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
  at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
  at scala.collection.immutable.List.foreach(List.scala:381)
  at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
  at scala.collection.immutable.List.flatMap(List.scala:344)
  at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:342)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:178)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:156)
  ... 48 elided
{code}
This does not look like a problem within Spark. I would suggest closing this
if no one is able to reproduce it within Spark.
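To rule out a path problem before the reader is involved, the location can be checked with plain JVM calls first. The following is only a sketch: `checkPath` is a hypothetical helper, not part of Spark, and it assumes the data sits on the local filesystem. Note that the path in the report is written as `C:Users\\brand\\...`, with no separator after the drive letter; whether that is related to the reported NullPointerException is an assumption and has not been verified here.

```scala
import java.nio.file.{Files, Path, Paths}

// Hypothetical helper (not a Spark API): resolve a path string to an absolute,
// normalized path and report whether it exists on the local filesystem.
def checkPath(loc: String): Either[String, Path] = {
  val p = Paths.get(loc).toAbsolutePath.normalize()
  if (Files.exists(p)) Right(p)
  else Left(s"Path does not exist: $p")
}

// Print how the JVM actually resolved the string before passing it to a reader.
checkPath("/NON_EXISTS") match {
  case Right(p)  => println(s"Found: $p")
  case Left(msg) => println(msg)
}
```

Pasted into spark-shell or a Scala REPL, this shows the absolute path the JVM derived from the string, which makes a missing separator or a path resolved relative to the working directory visible before `spark.read.format("libsvm").load(...)` is called.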
> Unhelpful error messages when trying to load data from file.
> ------------------------------------------------------------
>
> Key: SPARK-20497
> URL: https://issues.apache.org/jira/browse/SPARK-20497
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.1.0
> Reporter: Brandon Barker
> Priority: Minor
>
> I'm attempting to do the simple task of reproducing the results from the
> linear regression example in Spark. I'm using Windows 10.
> val training = spark.read.format("libsvm")
> .load("C:Users\\brand\\Documents\\GitHub\\sample_linear_regression_data.txt")
> Although the file is definitely at the specified location, I just get a
> java.lang.NullPointerException at this line. The documentation at
> http://spark.apache.org/docs/latest/sql-programming-guide.html#generic-loadsave-functions
> doesn't seem to clear things up. The associated javadocs do not seem any
> better.
> In my view, such a simple operation should not be troublesome, but perhaps
> I've missed some critical documentation - if so, I apologize.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)