On Thu, Sep 8, 2016 at 9:03 AM, Fred Reiss <freiss....@gmail.com> wrote:
> I suppose the type-inference-time check for the presence of the input
> directory could be moved to the FileStreamSource's initialization. But if
> the directory isn't there when the source is being created, it probably
> won't be there when the source is instantiated.

Hi Fred,

Thanks for your prompt response.

Isn't that the opposite of sc.textFile? The source might not be available
until load, and there's no reason it should be. Yet it is definitely not
against the "contract" of DataFrameReader.textFile, and perhaps it's
implicitly assumed in SQL.

scala> spark.read.textFile("whatever")
org.apache.spark.sql.AnalysisException: Path does not exist: file:/Users/jacek/dev/oss/spark/whatever;
  at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$12.apply(DataSource.scala:371)
  at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$12.apply(DataSource.scala:360)
  at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
  at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
  at scala.collection.immutable.List.foreach(List.scala:381)
  at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
  at scala.collection.immutable.List.flatMap(List.scala:344)
  at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:360)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:149)
  at org.apache.spark.sql.DataFrameReader.text(DataFrameReader.scala:500)
  at org.apache.spark.sql.DataFrameReader.textFile(DataFrameReader.scala:536)
  at org.apache.spark.sql.DataFrameReader.textFile(DataFrameReader.scala:509)
  ... 48 elided

I thought it might've been due to schema inference but...

scala> spark.read.schema(StructType(Seq())).textFile("whatever")
org.apache.spark.sql.AnalysisException: User specified schema not supported with `textFile`;
  at org.apache.spark.sql.DataFrameReader.textFile(DataFrameReader.scala:534)
  at org.apache.spark.sql.DataFrameReader.textFile(DataFrameReader.scala:509)
  ... 50 elided

(which also confuses me, but I don't want to drag this thread in multiple
directions)

I definitely need some help to understand the rationale behind this eager
behaviour.

Thanks!

Regards,
Jacek