Github user gengliangwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/22009#discussion_r208114465
--- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/SupportsPushDownRequiredColumns.java ---
@@ -21,22 +21,25 @@
 import org.apache.spark.sql.types.StructType;

 /**
- * A mix-in interface for {@link DataSourceReader}. Data source readers can implement this
+ * A mix-in interface for {@link ScanConfigBuilder}. Data sources can implement this
  * interface to push down required columns to the data source and only read these columns during
  * scan to reduce the size of the data to be read.
  */
 @InterfaceStability.Evolving
-public interface SupportsPushDownRequiredColumns extends DataSourceReader {
+public interface SupportsPushDownRequiredColumns extends ScanConfigBuilder {

   /**
    * Applies column pruning w.r.t. the given requiredSchema.
    *
    * Implementation should try its best to prune the unnecessary columns or nested fields, but it's
    * also OK to do the pruning partially, e.g., a data source may not be able to prune nested
    * fields, and only prune top-level columns.
-   *
-   * Note that, data source readers should update {@link DataSourceReader#readSchema()} after
-   * applying column pruning.
    */
   void pruneColumns(StructType requiredSchema);
--- End diff --
Since we now have a new method `prunedSchema`, should we rename this to `pruneSchema`, given that the parameter is also a schema?
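For context on the contract being discussed, here is a minimal sketch of how a builder might implement `pruneColumns` for top-level columns only, as the javadoc permits. The `SimpleStructType` and `InMemoryScanBuilder` types below are hypothetical stand-ins, not Spark's actual `StructType` or `ScanConfigBuilder` classes:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Spark's StructType: an ordered list of
// top-level column names (no nested fields, for simplicity).
class SimpleStructType {
    final List<String> fieldNames;
    SimpleStructType(List<String> fieldNames) { this.fieldNames = fieldNames; }
}

// Hypothetical builder that supports pruning top-level columns only,
// mirroring the interface's "partial pruning is OK" contract.
class InMemoryScanBuilder {
    private final SimpleStructType fullSchema;
    private SimpleStructType prunedSchema;

    InMemoryScanBuilder(SimpleStructType fullSchema) {
        this.fullSchema = fullSchema;
        this.prunedSchema = fullSchema; // default: read every column
    }

    // Keep only the columns that exist in the source and are required by
    // the query, preserving the source's column order.
    void pruneColumns(SimpleStructType requiredSchema) {
        List<String> kept = new ArrayList<>();
        for (String name : fullSchema.fieldNames) {
            if (requiredSchema.fieldNames.contains(name)) {
                kept.add(name);
            }
        }
        this.prunedSchema = new SimpleStructType(kept);
    }

    // Schema actually read during the scan, after any pruning.
    SimpleStructType prunedSchema() { return prunedSchema; }
}
```

A scan over columns `(a, b, c)` asked to prune to `(c, a)` would then report a pruned schema of `(a, c)`, in source order.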
---