GitHub user arunmahadevan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21145#discussion_r186538150
  
    --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/ReadTask.java ---
    @@ -22,20 +22,20 @@
     import org.apache.spark.annotation.InterfaceStability;
     
     /**
    - * A reader factory returned by {@link DataSourceReader#createDataReaderFactories()} and is
    + * A read task returned by {@link DataSourceReader#createReadTasks()} and is
    --- End diff --
    
    `InputPartition` sounds fine, but is it OK for it to have a method like
    `createDataReader`? Won't that create confusion when an `inputPartition` is a
    member of other classes such as `DataSourceRDDPartition`?
    
    It appears that `DataReaderFactory` is a kind of wrapper for the reader, so
    that the reader itself need not be serializable. I am also OK with leaving it
    as is (though technically it may not be a factory).
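
    To make the wrapper point concrete, here is a minimal sketch of the pattern
    as I understand it. It is not the actual Spark API: `RangePartition`,
    `RangeReader`, `roundTrip` and `readAll` are hypothetical names, and plain
    Java serialization stands in for whatever ships the object to executors. The
    serializable object carries only the description of the split, and the
    non-serializable reader is created after the object has been shipped.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class SerializableFactorySketch {

    // Hypothetical reader: holds mutable cursor state and is NOT Serializable.
    static final class RangeReader {
        private int current;
        private final int end;

        RangeReader(int start, int end) {
            this.current = start;
            this.end = end;
        }

        boolean hasNext() {
            return current < end;
        }

        int next() {
            return current++;
        }
    }

    // Hypothetical serializable wrapper: describes the split and creates the reader lazily.
    static final class RangePartition implements Serializable {
        private static final long serialVersionUID = 1L;
        private final int start;
        private final int end;

        RangePartition(int start, int end) {
            this.start = start;
            this.end = end;
        }

        // Called on the receiving side, after deserialization.
        RangeReader createReader() {
            return new RangeReader(start, end);
        }
    }

    // Simulates shipping the wrapper to another JVM via Java serialization.
    static RangePartition roundTrip(RangePartition partition) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(partition);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (RangePartition) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("serialization round trip failed", e);
        }
    }

    // Drains a freshly created reader into a list.
    static List<Integer> readAll(RangePartition partition) {
        RangeReader reader = partition.createReader();
        List<Integer> values = new ArrayList<>();
        while (reader.hasNext()) {
            values.add(reader.next());
        }
        return values;
    }

    public static void main(String[] args) {
        RangePartition shipped = roundTrip(new RangePartition(0, 3));
        System.out.println(readAll(shipped)); // prints [0, 1, 2]
    }
}
```

    In this sketch the object that crosses the wire is closer to a description
    of a partition than to a factory, which is why either name can be argued
    for.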

