Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/20397#discussion_r164425992
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/DataReaderFactory.java
---
@@ -22,21 +22,23 @@
import org.apache.spark.annotation.InterfaceStability;
/**
- * A read task returned by {@link DataSourceV2Reader#createReadTasks()} and is responsible for
- * creating the actual data reader. The relationship between {@link ReadTask} and {@link DataReader}
+ * A reader factory returned by {@link DataSourceV2Reader#createDataReaderFactories()} and is
+ * responsible for creating the actual data reader. The relationship between
+ * {@link DataReaderFactory} and {@link DataReader}
* is similar to the relationship between {@link Iterable} and {@link java.util.Iterator}.
*
- * Note that, the read task will be serialized and sent to executors, then the data reader will be
- * created on executors and do the actual reading. So {@link ReadTask} must be serializable and
- * {@link DataReader} doesn't need to be.
+ * Note that, the reader factory will be serialized and sent to executors, then the data reader
+ * will be created on executors and do the actual reading. So {@link DataReaderFactory} must be
+ * serializable and {@link DataReader} doesn't need to be.
*/
@InterfaceStability.Evolving
-public interface ReadTask<T> extends Serializable {
+public interface DataReaderFactory<T> extends Serializable {
/**
- * The preferred locations where this read task can run faster, but Spark does not guarantee that
- * this task will always run on these locations. The implementations should make sure that it can
- * be run on any location. The location is a string representing the host name.
+ * The preferred locations where this data reader returned by this reader factory can run faster,
+ * but Spark does not guarantee that this task will always run on these locations.
--- End diff --
Suggested wording: `not guarantee to always run the data reader on these locations.`
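
For context, here is a minimal sketch of the factory/reader split that the Javadoc in this diff describes: the factory is serialized and shipped, and the reader is only created afterwards on the receiving side. It assumes simplified stand-in interfaces (`SimpleDataReaderFactory`, `SimpleDataReader`) and a toy `RangeReaderFactory`; all of these names are hypothetical and are not the Spark API. The Java-serialization round trip only imitates sending the factory to an executor.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Closeable;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class ReaderFactorySketch {

  // Hypothetical stand-in for a data reader: does the actual reading and is NOT Serializable.
  interface SimpleDataReader<T> extends Closeable {
    boolean next() throws IOException;
    T get();
  }

  // Hypothetical stand-in for a reader factory: Serializable, so it can be shipped.
  interface SimpleDataReaderFactory<T> extends Serializable {
    // Host names where reading may be faster; only a hint, so any host must work.
    default String[] preferredLocations() {
      return new String[0];
    }

    SimpleDataReader<T> createDataReader();
  }

  // Toy reader producing the integers in [start, end).
  static final class RangeReader implements SimpleDataReader<Integer> {
    private final int end;
    private int current;

    RangeReader(int start, int end) {
      this.current = start - 1;
      this.end = end;
    }

    @Override
    public boolean next() {
      current++;
      return current < end;
    }

    @Override
    public Integer get() {
      return current;
    }

    @Override
    public void close() {
      // Nothing to release in this toy reader.
    }
  }

  // Toy factory: carries only the serializable description of what to read.
  static final class RangeReaderFactory implements SimpleDataReaderFactory<Integer> {
    private static final long serialVersionUID = 1L;
    private final int start;
    private final int end;

    RangeReaderFactory(int start, int end) {
      this.start = start;
      this.end = end;
    }

    @Override
    public SimpleDataReader<Integer> createDataReader() {
      return new RangeReader(start, end);
    }
  }

  // Imitates shipping the factory to another process via Java serialization.
  @SuppressWarnings("unchecked")
  static <F extends Serializable> F roundTrip(F factory)
      throws IOException, ClassNotFoundException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(factory);
    }
    try (ObjectInputStream in =
        new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
      return (F) in.readObject();
    }
  }

  // Creates the reader from the (already shipped) factory and drains it.
  static List<Integer> readAll(SimpleDataReaderFactory<Integer> factory) throws IOException {
    List<Integer> rows = new ArrayList<>();
    try (SimpleDataReader<Integer> reader = factory.createDataReader()) {
      while (reader.next()) {
        rows.add(reader.get());
      }
    }
    return rows;
  }

  public static void main(String[] args) throws Exception {
    SimpleDataReaderFactory<Integer> shipped = roundTrip(new RangeReaderFactory(0, 3));
    System.out.println(readAll(shipped));
  }
}
```

In this sketch only the factory crosses the serialization boundary, which is why the reader can hold non-serializable state such as a cursor or an open connection. It also suggests why the doc should talk about running the data reader, not "this task", on the preferred locations.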
---