GitHub user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20397#discussion_r164425827
  
    --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/DataReaderFactory.java ---
    @@ -22,21 +22,23 @@
     import org.apache.spark.annotation.InterfaceStability;
     
     /**
    - * A read task returned by {@link DataSourceV2Reader#createReadTasks()} and is responsible for
    - * creating the actual data reader. The relationship between {@link ReadTask} and {@link DataReader}
    + * A reader factory returned by {@link DataSourceV2Reader#createDataReaderFactories()} and is
    + * responsible for creating the actual data reader. The relationship between
    + * {@link DataReaderFactory} and {@link DataReader}
      * is similar to the relationship between {@link Iterable} and {@link java.util.Iterator}.
      *
    - * Note that, the read task will be serialized and sent to executors, then the data reader will be
    - * created on executors and do the actual reading. So {@link ReadTask} must be serializable and
    - * {@link DataReader} doesn't need to be.
    + * Note that, the reader factory will be serialized and sent to executors, then the data reader
    + * will be created on executors and do the actual reading. So {@link DataReaderFactory} must be
    + * serializable and {@link DataReader} doesn't need to be.
      */
     @InterfaceStability.Evolving
    -public interface ReadTask<T> extends Serializable {
    +public interface DataReaderFactory<T> extends Serializable {
     
       /**
    -   * The preferred locations where this read task can run faster, but Spark does not guarantee that
    -   * this task will always run on these locations. The implementations should make sure that it can
    -   * be run on any location. The location is a string representing the host name.
    +   * The preferred locations where this data reader returned by this reader factory can run faster,
    --- End diff --
    
    `this data reader` -> `the data reader`

