[ 
https://issues.apache.org/jira/browse/SPARK-20497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988089#comment-15988089
 ] 

Hyukjin Kwon commented on SPARK-20497:
--------------------------------------

I can't reproduce this on Windows, as shown below:

- Existing path

{code}
val lines =
  """
    |1 1:1.0 3:2.0 5:3.0
    |0
    |0 2:4.0 4:5.0 6:6.0
  """.stripMargin

val loc = "C:\\...\\foo"
val file = new java.io.File(loc)
com.google.common.io.Files.write(lines, file, java.nio.charset.StandardCharsets.UTF_8)
spark.read.format("libsvm").load(loc).show()
{code}

prints

{code}
+-----+--------------------+
|label|            features|
+-----+--------------------+
|  1.0|(6,[0,2,4],[1.0,2...|
|  0.0|           (6,[],[])|
|  0.0|(6,[1,3,5],[4.0,5...|
+-----+--------------------+
{code}

- Non-existing path

{code}
spark.read.format("libsvm").load("/NON_EXISTS").show()
{code}

produces

{code}
org.apache.spark.sql.AnalysisException: Path does not exist: file:/NON_EXISTS;
  at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$14.apply(DataSource.scala:354)
  at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$14.apply(DataSource.scala:342)
  at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
  at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
  at scala.collection.immutable.List.foreach(List.scala:381)
  at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
  at scala.collection.immutable.List.flatMap(List.scala:344)
  at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:342)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:178)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:156)
  ... 48 elided
{code}

This does not look like a problem within Spark. I would suggest closing this 
issue if no one is able to reproduce it within Spark.
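
To help narrow down whether a failure like this comes from the path itself
rather than from Spark, the path can be checked with plain JVM calls before it
is passed to load(). The following is only a minimal sketch: describePath is a
hypothetical helper name (not part of Spark), and it reports what
java.nio.file sees for the given string, nothing more.

{code}
import java.nio.file.{Files, Paths}

// Hypothetical helper, not a Spark API: report how the JVM interprets a
// path string before it is handed to spark.read.format(...).load(...).
def describePath(loc: String): String = {
  val path = Paths.get(loc)
  // Resolve against the current working directory and drop "." / ".." parts.
  val resolved = path.toAbsolutePath.normalize
  s"absolute=${path.isAbsolute} exists=${Files.exists(resolved)} resolved=$resolved"
}

// Example: a path that is assumed not to exist on this machine.
println(describePath("/NON_EXISTS"))
{code}

If the helper reports exists=false or a resolved location other than the
intended one, the problem is likely in the path string and not in the libsvm
reader.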

> Unhelpful error messages when trying to load data from file.
> ------------------------------------------------------------
>
>                 Key: SPARK-20497
>                 URL: https://issues.apache.org/jira/browse/SPARK-20497
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.1.0
>            Reporter: Brandon Barker
>            Priority: Minor
>
> I'm attempting to do the simple task of reproducing the results from the 
> linear regression example in Spark. I'm using Windows 10.
>   val training = spark.read.format("libsvm")
>  .load("C:Users\\brand\\Documents\\GitHub\\sample_linear_regression_data.txt")
> Although the file is definitely at the specified location, I just get a 
> java.lang.NullPointerException at this line. The documentation at 
> http://spark.apache.org/docs/latest/sql-programming-guide.html#generic-loadsave-functions
>  doesn't seem to clear things up. The associated javadocs do not seem any 
> better.
> In my view, such a simple operation should not be troublesome, but perhaps 
> I've missed some critical documentation - if so, I apologize. 


