GitHub user tejasapatil commented on the issue:
https://github.com/apache/spark/pull/16868
There are two main uses of EXTERNAL tables I am aware of (both sketched in the snippet below):
1. Ingest data from non-Hive locations into Hive tables. This can be covered by adding a test case that reads from an external table created using the command this PR enables.
2. Create a logical "pointer" to an existing Hive table / partition (without creating multiple copies of the underlying data). Testing that the destination table can have the same location as the source table will cover this.
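For illustration, a minimal sketch of the two scenarios in SQL; the table names (`ext_src`, `ext_alias`) and the path `/data/raw/events` are hypothetical:

```sql
-- Use case 1: an external table over data that lives outside the Hive warehouse
CREATE EXTERNAL TABLE ext_src (id INT, name STRING)
STORED AS PARQUET
LOCATION '/data/raw/events';

-- Use case 2: a second table pointing at the same location; no data is copied
CREATE EXTERNAL TABLE ext_alias (id INT, name STRING)
STORED AS PARQUET
LOCATION '/data/raw/events';
```

Dropping either table with `DROP TABLE` removes only the metadata; the files under the location are left untouched, which is what makes the "pointer" semantics safe.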
I don't think Spark's interpretation of external tables differs from Hive's, so it's OK to support both.
BTW: if you only support the 1st use case, one can still mimic the behavior of the 2nd by creating an external table with a fake location and later issuing an `ALTER TABLE SET LOCATION` command to make it point to an existing table's location (sketched below). There is really no mechanism in Spark to prevent an EXTERNAL table from pointing at an existing table / partition's location. So both use cases were already possible in Spark.
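A sketch of that workaround, again with hypothetical names and paths:

```sql
-- Create the external table against a throwaway location first
CREATE EXTERNAL TABLE ext_alias (id INT, name STRING)
STORED AS PARQUET
LOCATION '/tmp/fake_location';

-- Then repoint it at the existing table's data
ALTER TABLE ext_alias SET LOCATION '/data/raw/events';
```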