Re: DataSourceV2: support for named tables

2018-02-02 Thread Ryan Blue
I don’t have a good answer for that yet. My initial motivation here is mainly to get consensus around this:
- DSv2 should support table names through SQL and the API, and
- It should use the existing classes in the logical plan (i.e., TableIdentifier)
To contrast, I think Wenchen is

Re: DataSourceV2: support for named tables

2018-02-02 Thread Michael Armbrust
I am definitely in favor of first-class / consistent support for tables and data sources. One thing that is not clear to me from this proposal is exactly what the interfaces are between:
- Spark
- A (The?) metastore
- A data source
If we pass in the table identifier is the data source then

DataSourceV2: support for named tables

2018-02-02 Thread Ryan Blue
There are two main ways to load tables in Spark: by name (db.table) and by path. Unfortunately, the integration for DataSourceV2 has no support for identifying tables by name. I propose supporting the use of TableIdentifier, which is the standard way to pass around table names. The reason I