[GitHub] [spark] cloud-fan edited a comment on issue #25822: [SPARK-29127][SQL] Support partitioning and bucketing through DataFrameWriter.save for V2 Tables

GitBox Wed, 18 Sep 2019 10:18:29 -0700

URL: https://github.com/apache/spark/pull/25822#issuecomment-532781291
 
 
   It's worthwhile to discuss the usefulness of `TableProvider`. So far I see 2 
use cases:
   1. `DataFrameReader.load()`, and `DataFrameWriter.save()` with only the 
append/overwrite save modes. This restriction comes from a previous decision. If 
we revisit it and want to support all save modes, `TableProvider` can't be used 
here.
   2. CREATE TABLE USING with the session catalog (similar to Hive 
EXTERNAL/MANAGED TABLE): The core idea is to keep metadata in Spark and keep data 
externally. `TableProvider` is a good fit because we don't need to 
create/alter/drop tables in the external systems; we only register external data 
as tables in Spark. This is the major use case of DS v1 and many users are 
familiar with it, so I think it's better to support it in DS v2 as well.
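
To make the two use cases concrete, here is a toy model in plain Python. It is only a sketch and not the Spark API: `ToyTableProvider`, `ToySessionCatalog`, `save`, and the `/data/events` path are hypothetical names invented for illustration. It assumes the provider exposes nothing but a "give me the table for these options" call, which is one reading of why only append/overwrite fit use case 1, and it shows the session catalog in use case 2 holding metadata only.

```python
from dataclasses import dataclass, field
from enum import Enum


class SaveMode(Enum):
    APPEND = "append"
    OVERWRITE = "overwrite"
    ERROR_IF_EXISTS = "errorifexists"
    IGNORE = "ignore"


@dataclass
class Table:
    rows: list = field(default_factory=list)


class ToyTableProvider:
    """Hypothetical stand-in: it can only hand back a table for some options.

    Assumption: there is no way to ask whether the table exists, and no
    create/alter/drop operation.
    """

    def __init__(self):
        self._external = {}  # path -> Table, stands in for external storage

    def get_table(self, options):
        return self._external.setdefault(options["path"], Table())


def save(provider, options, rows, mode):
    """Use case 1: write rows through a provider that only has get_table()."""
    if mode in (SaveMode.ERROR_IF_EXISTS, SaveMode.IGNORE):
        # Both modes branch on table existence, which this provider cannot report.
        raise NotImplementedError(f"{mode.value} needs an existence check")
    table = provider.get_table(options)
    if mode is SaveMode.OVERWRITE:
        table.rows.clear()
    table.rows.extend(rows)
    return table


class ToySessionCatalog:
    """Use case 2: metadata lives in the catalog, data stays external."""

    def __init__(self, provider):
        self._provider = provider
        self._metadata = {}  # table name -> provider options

    def create_table_using(self, name, options):
        # Registers existing external data; nothing is created externally.
        self._metadata[name] = dict(options)

    def drop_table(self, name):
        # Removes the registration only; the external data is left untouched.
        self._metadata.pop(name)

    def load_table(self, name):
        return self._provider.get_table(self._metadata[name])
```

In this sketch, `save(provider, {"path": "/data/events"}, rows, SaveMode.APPEND)` works, while `SaveMode.IGNORE` raises, and dropping a table from `ToySessionCatalog` leaves the rows reachable through the provider.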


