[GitHub] [iceberg] rdblue commented on pull request #1843: Support for file paths in SparkCatalogs via HadoopTables

GitBox Mon, 07 Dec 2020 10:37:42 -0800


rdblue commented on pull request #1843:
URL: https://github.com/apache/iceberg/pull/1843#issuecomment-740102113



   > Should we also respect spark.sql.runSQLOnFiles in Spark? I am not sure
   > whether we should match built-in sources here.
   
   I don't think so. That setting is for v1 sources. Instead, Spark needs to
   define how it will pass path-based tables to catalogs and what the behavior
   requirements are for those tables. So far, no one has done the work to find
   out what the v1 behavior is or to build consensus in the Spark community
   about what it should be. I think that means Iceberg should avoid Spark's
   path-based syntax for now, and that this PR should focus on the narrower
   case of passing identifiers for paths used in `DataFrameReader`.
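
   To make the narrower case concrete, here is a rough sketch of how a string
   handed to `DataFrameReader.load` might be split into either a path-based
   table or a catalog identifier. This is not code from the PR or from Spark.
   The names `PathTable`, `CatalogTable`, and `resolve_load_argument` are
   hypothetical, and the rule "anything containing a slash is a path" is an
   assumption made only for illustration.

   ```python
   from dataclasses import dataclass
   from typing import Tuple, Union


   @dataclass(frozen=True)
   class PathTable:
       """Hypothetical: a table addressed by its storage location."""
       location: str


   @dataclass(frozen=True)
   class CatalogTable:
       """Hypothetical: a table addressed by namespace and name."""
       namespace: Tuple[str, ...]
       name: str


   def resolve_load_argument(arg: str) -> Union[PathTable, CatalogTable]:
       # Assumed rule: a string containing "/" is treated as a location.
       if "/" in arg:
           return PathTable(arg)
       # Otherwise treat it as a dot-separated identifier; the last part
       # is the table name and the rest is the namespace.
       parts = tuple(arg.split("."))
       return CatalogTable(parts[:-1], parts[-1])
   ```

   Under that assumption, `resolve_load_argument("db.t")` yields a catalog
   identifier, while a string such as `"hdfs://nn/warehouse/db/t"` is kept
   whole as a location. SQL path syntax is deliberately left out of the sketch.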




