If it's by design but not ready yet, then IMHO replacing the default session catalog is better restricted until things are sorted out, as it causes a lot of confusion and has known bugs. Actually there's another bug/limitation in the default session catalog regarding the length of identifiers, so things that work with a custom catalog no longer work once it replaces the default session catalog.
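To make the scenario concrete, here's a minimal sketch of the setup I'm describing (the catalog class name com.example.MyV2Catalog is just a placeholder, not a real implementation):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .master("local[*]")
  // replace the built-in session catalog with a custom v2 catalog implementation
  .config("spark.sql.catalog.spark_catalog", "com.example.MyV2Catalog")
  .getOrCreate()

// a plain two-part identifier resolves against spark_catalog, so this should be
// routed to the custom catalog, but it can be classified as a v1 table and
// handled by the v1 code path instead
spark.sql("DROP TABLE dbName.tableName")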
On Wed, Oct 7, 2020 at 6:05 PM Wenchen Fan <cloud0...@gmail.com> wrote:

> Ah, this is by design. V1 tables should still go through the v1 session
> catalog. I think we can remove this restriction when we are confident about
> the new v2 DDL commands that work with v2 catalog APIs.
>
> On Wed, Oct 7, 2020 at 5:00 PM Jungtaek Lim <kabhwan.opensou...@gmail.com>
> wrote:
>
>> My case is DROP TABLE and DROP TABLE supports both v1 and v2 (as it
>> simply works when I use custom catalog without replacing the default
>> catalog).
>>
>> It just fails on v2 when the "default catalog" is replaced (say I replace
>> 'spark_catalog'), because TempViewOrV1Table is providing value even with v2
>> table, and then the catalyst goes with v1 exec. I guess all commands
>> leveraging TempViewOrV1Table to determine whether the table is v1 vs v2
>> would all suffer from this issue.
>>
>> On Wed, Oct 7, 2020 at 5:45 PM Wenchen Fan <cloud0...@gmail.com> wrote:
>>
>>> Not all the DDL commands support v2 catalog APIs (e.g. CREATE TABLE
>>> LIKE), so it's possible that some commands still go through the v1 session
>>> catalog although you configured a custom v2 session catalog.
>>>
>>> Can you create JIRA tickets if you hit any DDL commands that don't
>>> support v2 catalog? We should fix them.
>>>
>>> On Wed, Oct 7, 2020 at 9:15 AM Jungtaek Lim <
>>> kabhwan.opensou...@gmail.com> wrote:
>>>
>>>> The logical plan for the parsed statement is getting converted either
>>>> for old one or v2, and for the former one it keeps using an external
>>>> catalog (Hive) - so replacing default session catalog with custom one and
>>>> trying to use it like it is in external catalog doesn't work, which
>>>> destroys the purpose of replacing the default session catalog.
>>>>
>>>> Btw I see one approach: in TempViewOrV1Table, if it matches
>>>> with SessionCatalogAndIdentifier where the catalog is TableCatalog, call
>>>> loadTable in catalog and see whether it's V1 table or not. Not sure it's a
>>>> viable approach though, as it requires loading a table during resolution of
>>>> the table identifier.
>>>>
>>>> On Wed, Oct 7, 2020 at 10:04 AM Ryan Blue <rb...@netflix.com> wrote:
>>>>
>>>>> I've hit this with `DROP TABLE` commands that should be passed to a
>>>>> registered v2 session catalog, but are handled by v1. I think that's the
>>>>> only case we hit in our downstream test suites, but we haven't been
>>>>> exploring the use of a session catalog for fallback. We use v2 for
>>>>> everything now, which avoids the problem and comes with multi-catalog
>>>>> support.
>>>>>
>>>>> On Tue, Oct 6, 2020 at 5:55 PM Jungtaek Lim <
>>>>> kabhwan.opensou...@gmail.com> wrote:
>>>>>
>>>>>> Hi devs,
>>>>>>
>>>>>> I'm not sure whether it's addressed in Spark 3.1, but at least from
>>>>>> Spark 3.0.1, many SQL DDL statements don't seem to go through the custom
>>>>>> catalog when I replace default catalog with custom catalog and only
>>>>>> provide 'dbName.tableName' as table identifier.
>>>>>>
>>>>>> I'm not an expert in this area, but after skimming the code I feel
>>>>>> TempViewOrV1Table looks to be broken for the case, as it can still be a
>>>>>> V2 table. Classifying the table identifier to either V2 table or "temp
>>>>>> view or v1 table" looks to be mandatory, as former and latter have
>>>>>> different code paths and different catalog interfaces.
>>>>>>
>>>>>> That sounds to me as being stuck and the only "clear" approach seems
>>>>>> to disallow default catalog with custom one. Am I missing something?
>>>>>>
>>>>>> Thanks,
>>>>>> Jungtaek Lim (HeartSaVioR)
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Ryan Blue
>>>>> Software Engineer
>>>>> Netflix
>>>>>
>>>>
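
For the record, the loadTable-based idea from the quoted reply above could look roughly like this. It's only an illustration, written as if it sat next to Spark's own analysis rules (V1Table is internal to Spark), and the helper name isV1TableInSessionCatalog is made up, not the real TempViewOrV1Table extractor:

import org.apache.spark.sql.catalyst.analysis.NoSuchTableException
import org.apache.spark.sql.connector.catalog.{Identifier, TableCatalog, V1Table}

def isV1TableInSessionCatalog(catalog: TableCatalog, ident: Identifier): Boolean = {
  try {
    catalog.loadTable(ident) match {
      case _: V1Table => true  // still backed by the old (Hive) session catalog
      case _          => false // a genuine v2 table; take the v2 code path
    }
  } catch {
    // the table may not exist yet (e.g. during CREATE), so classification has
    // to fall back to some default, which is part of why this is questionable
    case _: NoSuchTableException => true
  }
}

As noted above, the obvious downside is that it forces a table lookup during identifier resolution, and it still needs a fallback when the table doesn't exist yet.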