cloud-fan commented on a change in pull request #26868: [SPARK-29665][SQL] refine the TableProvider interface
URL: https://github.com/apache/spark/pull/26868#discussion_r361185075
##########
File path:
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/TableProvider.java
##########
@@ -36,26 +39,50 @@
public interface TableProvider {
/**
- * Return a {@link Table} instance to do read/write with user-specified options.
+ * Infer the schema of the table identified by the given options.
+ *
+ * @param options an immutable case-insensitive string-to-string map that can identify a table,
+ * e.g. file path, Kafka topic name, etc.
+ */
+ StructType inferSchema(CaseInsensitiveStringMap options);
+
+ /**
+ * Infer the partitioning of the table identified by the given options.
+ * <p>
+ * By default this method returns empty partitioning, please override it if this source support
+ * partitioning.
+ *
+ * @param options an immutable case-insensitive string-to-string map that can identify a table,
+ * e.g. file path, Kafka topic name, etc.
+ */
+ default Transform[] inferPartitioning(CaseInsensitiveStringMap options) {
+ return new Transform[0];
+ }
+
+ /**
+ * Return a {@link Table} instance with the specified table schema, partitioning and properties
+ * to do read/write. The returned table should report the same schema and partitioning with the
+ * specified ones, or Spark may fail the operation.
*
- * @param options the user-specified options that can identify a table, e.g. file path, Kafka
- * topic name, etc. It's an immutable case-insensitive string-to-string map.
+ * @param schema The specified table schema.
+ * @param partitioning The specified table partitioning.
+ * @param properties The specified table properties. It's case preserving (contains exactly what
+ * users specified) and implementations are free to use it case sensitively or
+ * insensitively. It should be able to identify a table, e.g. file path, Kafka
+ * topic name, etc.
*/
- Table getTable(CaseInsensitiveStringMap options);
+ Table getTable(StructType schema, Transform[] partitioning, Map<String, String> properties);
Review comment:
Shall we use `CaseInsensitiveStringMap` for the table properties? Callers can
still get the original case-sensitive map easily via `asCaseSensitiveMap`.
cc @rdblue @brkyvz
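
To illustrate the idea behind the suggestion, here is a minimal sketch of a map wrapper that answers lookups case-insensitively while still keeping the keys exactly as the user wrote them. The class `CaseInsensitiveOptionsSketch` is a hypothetical stand-in written only for this example; it is not Spark's `CaseInsensitiveStringMap`, and the real class may behave differently in its details.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for illustration only; NOT Spark's CaseInsensitiveStringMap.
final class CaseInsensitiveOptionsSketch {
    // Keys exactly as the user specified them (case preserving).
    private final Map<String, String> original;
    // Same entries with lower-cased keys, used for case-insensitive lookup.
    private final Map<String, String> lowerCased;

    CaseInsensitiveOptionsSketch(Map<String, String> userOptions) {
        this.original = Collections.unmodifiableMap(new HashMap<>(userOptions));
        Map<String, String> normalized = new HashMap<>();
        for (Map.Entry<String, String> e : userOptions.entrySet()) {
            normalized.put(e.getKey().toLowerCase(Locale.ROOT), e.getValue());
        }
        this.lowerCased = Collections.unmodifiableMap(normalized);
    }

    // Case-insensitive lookup: "path", "Path" and "PATH" resolve to the same entry.
    String get(String key) {
        return lowerCased.get(key.toLowerCase(Locale.ROOT));
    }

    // Returns the user's original, case-preserving view of the options.
    Map<String, String> asCaseSensitiveMap() {
        return original;
    }

    public static void main(String[] args) {
        Map<String, String> user = new HashMap<>();
        user.put("Path", "/tmp/data");
        CaseInsensitiveOptionsSketch opts = new CaseInsensitiveOptionsSketch(user);
        System.out.println(opts.get("PATH"));
        System.out.println(opts.asCaseSensitiveMap().keySet());
    }
}
```

With a wrapper of this shape, an implementation that wants case-insensitive access uses `get`, and one that needs the user's exact keys calls `asCaseSensitiveMap`, so a single parameter type could serve both styles.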