Not all the DDL commands support v2 catalog APIs (e.g. CREATE TABLE LIKE), so it's possible that some commands still go through the v1 session catalog although you configured a custom v2 session catalog.
Can you create JIRA tickets if you hit any DDL commands that don't support v2 catalog? We should fix them.

On Wed, Oct 7, 2020 at 9:15 AM Jungtaek Lim <[email protected]> wrote:

> The logical plan for the parsed statement is getting converted either for
> old one or v2, and for the former one it keeps using an external catalog
> (Hive) - so replacing default session catalog with custom one and trying to
> use it like it is in external catalog doesn't work, which destroys the
> purpose of replacing the default session catalog.
>
> Btw I see one approach: in TempViewOrV1Table, if it matches
> with SessionCatalogAndIdentifier where the catalog is TableCatalog, call
> loadTable in catalog and see whether it's V1 table or not. Not sure it's a
> viable approach though, as it requires loading a table during resolution of
> the table identifier.
>
> On Wed, Oct 7, 2020 at 10:04 AM Ryan Blue <[email protected]> wrote:
>
>> I've hit this with `DROP TABLE` commands that should be passed to a
>> registered v2 session catalog, but are handled by v1. I think that's the
>> only case we hit in our downstream test suites, but we haven't been
>> exploring the use of a session catalog for fallback. We use v2 for
>> everything now, which avoids the problem and comes with multi-catalog
>> support.
>>
>> On Tue, Oct 6, 2020 at 5:55 PM Jungtaek Lim <[email protected]>
>> wrote:
>>
>>> Hi devs,
>>>
>>> I'm not sure whether it's addressed in Spark 3.1, but at least from
>>> Spark 3.0.1, many SQL DDL statements don't seem to go through the custom
>>> catalog when I replace default catalog with custom catalog and only provide
>>> 'dbName.tableName' as table identifier.
>>>
>>> I'm not an expert in this area, but after skimming the code I feel
>>> TempViewOrV1Table looks to be broken for the case, as it can still be a V2
>>> table. Classifying the table identifier to either V2 table or "temp view or
>>> v1 table" looks to be mandatory, as former and latter have different code
>>> paths and different catalog interfaces.
>>>
>>> That sounds to me as being stuck and the only "clear" approach seems to
>>> disallow default catalog with custom one. Am I missing something?
>>>
>>> Thanks,
>>> Jungtaek Lim (HeartSaVioR)
>>>
>>
>>
>> --
>> Ryan Blue
>> Software Engineer
>> Netflix
>>
>
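To make the approach floated in the thread concrete (load the table from the session catalog during identifier resolution and check whether it is a V1 table), here is a minimal toy sketch. It is plain Python, not Spark code: `ToyCatalog`, `V1Table`, `V2Table`, and `classify` are hypothetical names invented for illustration, and the real analyzer rules may differ considerably.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set, Union


@dataclass(frozen=True)
class V1Table:
    """Hypothetical marker for a table served by the v1 (external catalog) path."""
    name: str


@dataclass(frozen=True)
class V2Table:
    """Hypothetical marker for a table served by the v2 catalog API."""
    name: str


Table = Union[V1Table, V2Table]


class ToyCatalog:
    """Hypothetical stand-in for a custom session catalog; not a Spark class."""

    def __init__(self, tables: Iterable[Table]) -> None:
        self._tables = {t.name: t for t in tables}
        self.load_count = 0  # counts how often resolution had to load a table

    def load_table(self, name: str) -> Optional[Table]:
        self.load_count += 1
        return self._tables.get(name)


def classify(catalog: ToyCatalog, temp_views: Set[str], name: str) -> str:
    """Decide which code path an identifier would take in this toy model."""
    # Temp views are matched by name alone, so no table load is needed.
    if name in temp_views:
        return "temp_view_or_v1"
    # Otherwise the table must be loaded to learn its kind; this is the
    # extra cost during resolution that the thread worries about.
    table = catalog.load_table(name)
    if table is None:
        return "not_found"
    if isinstance(table, V1Table):
        return "temp_view_or_v1"
    return "v2"


catalog = ToyCatalog([V1Table("db.hive_tbl"), V2Table("db.custom_tbl")])
for ident in ("tmp", "db.hive_tbl", "db.custom_tbl", "db.missing"):
    print(ident, "->", classify(catalog, {"tmp"}, ident))
```

The `load_count` field is there only to make the trade-off visible: every identifier that is not a temp view costs one catalog lookup before the command's code path can be chosen.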
