[GitHub] [spark] cloud-fan commented on issue #26684: [WIP][SPARK-30001][SQL] ResolveRelations should handle both V1 and V2 tables.

GitBox Tue, 26 Nov 2019 23:07:29 -0800

cloud-fan commented on issue #26684: [WIP][SPARK-30001][SQL] ResolveRelations 
should handle both V1 and V2 tables.
URL: https://github.com/apache/spark/pull/26684#issuecomment-558959828
 
 
   This seems like a hard problem. What we need is:
   1. Access Hive metadata only once when resolving a table.
   2. Allow a catalog name in the table name for v1 tables.
   
   Two goals conflict:
   1. We want to make fewer changes to the v1 code path, so we still want to 
get the v1 table through `SessionCatalog.lookupRelation`.
   2. We want to know whether a table from the session catalog is v1 or v2, 
through `V2SessionCatalog.loadTable`.
   
   To do these 2 things together with one Hive metastore access, we have 3 
options:
   1. In `ResolveTables`, if we see a `V1Table`, we return a v1 relation 
instead of skipping it. This requires refactoring view resolution, so that we 
don't need to resolve views and tables recursively in the single rule 
`ResolveRelations`.
   2. In `ResolveRelations`, we look up the table using the v2 API 
`V2SessionCatalog.loadTable`.
   3. Introduce a cache. This needs to be carefully designed, so that the cache 
only takes effect between `ResolveTables` and `ResolveRelations`.
   
   I think option 2 is the easiest to do at the current stage.
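
   To illustrate the idea behind option 2, here is a toy model in Python, not 
Spark code. `CountingSessionCatalog`, `load_table`, and `resolve_relation` are 
hypothetical names, and the `V1Table`/`V2Table` classes are stand-ins that only 
borrow the names used above. The sketch assumes one `load_table` call costs one 
metastore access, and shows the v1/v2 decision being made from the result of 
that single call.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class V1Table:
    """Hypothetical stand-in for a v1 table returned by the session catalog."""
    name: str


@dataclass(frozen=True)
class V2Table:
    """Hypothetical stand-in for a native v2 table."""
    name: str


class CountingSessionCatalog:
    """Toy session catalog that counts simulated metastore round trips."""

    def __init__(self, tables: Dict[str, Union[V1Table, V2Table]]):
        self._tables = tables
        self.metastore_calls = 0

    def load_table(self, name: str) -> Union[V1Table, V2Table]:
        # Assumption: each call models exactly one Hive metastore access.
        self.metastore_calls += 1
        return self._tables[name]


def resolve_relation(catalog: CountingSessionCatalog, name: str) -> str:
    """Resolve a name with a single lookup, then branch on the table kind."""
    table = catalog.load_table(name)
    if isinstance(table, V1Table):
        # Build the v1 relation from the table that was already loaded,
        # instead of asking the catalog a second time.
        return f"v1 relation for {table.name}"
    return f"v2 relation for {table.name}"


catalog = CountingSessionCatalog({"t1": V1Table("t1"), "t2": V2Table("t2")})
v1_plan = resolve_relation(catalog, "t1")
v2_plan = resolve_relation(catalog, "t2")
```

   In this model each resolved table costs one lookup regardless of whether it 
turns out to be v1 or v2, which is the property the first requirement asks for.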
   
   cc @rdblue @brkyvz

