dataframe json schema scan

Alex Nastetsky Thu, 20 Aug 2015 12:36:18 -0700

The doc for the DataFrameReader#json(RDD[String]) method says:

"Unless the schema is specified using schema function, this function goes
through the input once to determine the input schema."


https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.DataFrameReader
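
To make the quoted behaviour concrete, here is a toy sketch in plain Python of what a separate schema pass might look like. It is not Spark code and not how Spark implements it; the names infer_schema and to_rows and the sample records are made up for illustration.

```python
import json

def infer_schema(lines):
    """Pass 1: collect every field name and the Python type names seen for it."""
    schema = {}
    for line in lines:
        for key, value in json.loads(line).items():
            schema.setdefault(key, set()).add(type(value).__name__)
    return schema

def to_rows(lines, columns):
    """Pass 2: build fixed-width rows, filling missing fields with None."""
    return [tuple(json.loads(line).get(c) for c in columns) for line in lines]

# Hypothetical input: the second record has a field the first one lacks.
records = ['{"a": 1}', '{"a": 2, "b": "x"}']

schema = infer_schema(records)   # first pass over the input
columns = sorted(schema)         # column order fixed only after the scan
rows = to_rows(records, columns) # second pass builds the rows
```

In this toy, the first record alone suggests a single column a; field b only shows up in the second record, so the full column list is not known until the whole input has been read.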

Why is this necessary? Why can't it create the DataFrame at the same time
as it's determining the schema?

Thanks.
