Re: SQL DDL statements with replacing default catalog with custom catalog

2020-10-06 Thread Jungtaek Lim
The logical plan for the parsed statement gets converted either to the old (v1) path or to v2, and the v1 path keeps using the external catalog (Hive) - so replacing the default session catalog with a custom one and trying to use it as if it were the external catalog doesn't work, which destroys the

Re: SQL DDL statements with replacing default catalog with custom catalog

2020-10-06 Thread Ryan Blue
I've hit this with `DROP TABLE` commands that should be passed to a registered v2 session catalog, but are handled by v1. I think that's the only case we hit in our downstream test suites, but we haven't been exploring the use of a session catalog for fallback. We use v2 for everything now, which
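
A minimal way to reproduce the case described here, assuming a custom v2 catalog has already been registered as the session catalog (see the configuration sketch under the original post further down) and using a placeholder table name:

    // Assumes spark.sql.catalog.spark_catalog points at a custom v2 catalog
    // (hypothetical setup; see the configuration sketch under the original post below).
    // The behavior reported here is that this statement is resolved by the v1 code
    // path instead of being routed to the registered v2 session catalog.
    spark.sql("DROP TABLE IF EXISTS db1.t1")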

SQL DDL statements with replacing default catalog with custom catalog

2020-10-06 Thread Jungtaek Lim
Hi devs, I'm not sure whether it's addressed in Spark 3.1, but at least as of Spark 3.0.1, many SQL DDL statements don't seem to go through the custom catalog when I replace the default catalog with a custom catalog and only provide 'dbName.tableName' as the table identifier. I'm not an expert in this
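
A rough sketch of the setup being described, in Scala. The catalog class name com.example.MyCatalog is a hypothetical placeholder; spark.sql.catalog.spark_catalog is the setting used in Spark 3.x to replace the default session catalog with a custom implementation:

    import org.apache.spark.sql.SparkSession

    // Replace the default session catalog ("spark_catalog") with a custom
    // implementation (com.example.MyCatalog is a hypothetical class name).
    val spark = SparkSession.builder()
      .master("local[*]")
      .config("spark.sql.catalog.spark_catalog", "com.example.MyCatalog")
      .getOrCreate()

    // DDL issued with a plain dbName.tableName identifier; the report above is
    // that statements like these do not reach the custom catalog.
    spark.sql("CREATE TABLE db1.t1 (id BIGINT, data STRING) USING parquet")
    spark.sql("ALTER TABLE db1.t1 ADD COLUMNS (extra INT)")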

Re: Official support of CREATE EXTERNAL TABLE

2020-10-06 Thread Russell Spitzer
I don't feel differently than I did on the thread linked above, I think treating "External" as a table option is still the safest way to go about things. For the Cassandra catalog this option wouldn't appear on our whitelist of allowed options, the same as "path" and other options that don't apply
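
To illustrate the "treat EXTERNAL as a table option" idea, here is a hypothetical fragment of a custom v2 TableCatalog that rejects options it doesn't support; the property key "external" and the whitelist contents are assumptions for illustration, not existing Spark behavior:

    import java.util
    import org.apache.spark.sql.connector.catalog.{Identifier, Table, TableCatalog}
    import org.apache.spark.sql.connector.expressions.Transform
    import org.apache.spark.sql.types.StructType

    // Hypothetical catalog fragment: EXTERNAL arrives as an ordinary table
    // property, and properties that don't apply to this catalog (e.g. "external",
    // "path") are simply rejected rather than given special treatment by Spark.
    abstract class ExampleCatalog extends TableCatalog {
      private val unsupportedOptions = Set("external", "path")

      override def createTable(
          ident: Identifier,
          schema: StructType,
          partitions: Array[Transform],
          properties: util.Map[String, String]): Table = {
        val rejected = unsupportedOptions.filter(k => properties.containsKey(k))
        if (rejected.nonEmpty) {
          throw new IllegalArgumentException(
            s"Options not supported by this catalog: ${rejected.mkString(", ")}")
        }
        doCreate(ident, schema, partitions, properties)
      }

      // Catalog-specific table creation (not shown).
      protected def doCreate(
          ident: Identifier,
          schema: StructType,
          partitions: Array[Transform],
          properties: util.Map[String, String]): Table
    }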

Re: Official support of CREATE EXTERNAL TABLE

2020-10-06 Thread Holden Karau
As someone who's had the job of porting different SQL dialects to Spark, I'm also very much in favor of keeping EXTERNAL, and I think Ryan's suggestion of leaving it up to the catalogs on how to handle this makes sense.

Re: Official support of CREATE EXTERNAL TABLE

2020-10-06 Thread Ryan Blue
I would summarize both the problem and the current state differently. Currently, Spark parses the EXTERNAL keyword for compatibility with Hive SQL, but Spark’s built-in catalog doesn’t allow creating a table with EXTERNAL unless LOCATION is also present. This “hidden feature” breaks
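
For context, a minimal illustration of the current behavior with the built-in catalog (assuming a session with Hive support enabled; table names and the path are placeholders):

    // Without LOCATION, the built-in catalog rejects this with an
    // "operation not allowed" style error in current Spark versions.
    spark.sql("CREATE EXTERNAL TABLE db1.ext_t1 (id BIGINT) STORED AS PARQUET")

    // With LOCATION, the statement is accepted and the table is created
    // as an external table pointing at the given path.
    spark.sql(
      "CREATE EXTERNAL TABLE db1.ext_t1 (id BIGINT) STORED AS PARQUET " +
      "LOCATION '/tmp/ext_t1'")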

Official support of CREATE EXTERNAL TABLE

2020-10-06 Thread Wenchen Fan
Hi all, I'd like to start a discussion thread about this topic, as it blocks an important feature that we're targeting for Spark 3.1: unifying the CREATE TABLE SQL syntax. A bit more background on CREATE EXTERNAL TABLE: it's kind of a hidden feature in Spark for Hive compatibility. When you write