[jira] [Commented] (SPARK-24073) DataSourceV2: Rename DataReaderFactory back to ReadTask.

Apache Spark (JIRA) Tue, 24 Apr 2018 13:00:30 -0700

    [ 
https://issues.apache.org/jira/browse/SPARK-24073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16450443#comment-16450443
 ]


Apache Spark commented on SPARK-24073:
--------------------------------------

User 'rdblue' has created a pull request for this issue:
https://github.com/apache/spark/pull/21145

> DataSourceV2: Rename DataReaderFactory back to ReadTask.
> --------------------------------------------------------
>
>                 Key: SPARK-24073
>                 URL: https://issues.apache.org/jira/browse/SPARK-24073
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 2.3.0
>            Reporter: Ryan Blue
>            Priority: Major
>             Fix For: 2.4.0
>
>
> Just before 2.3.0, SPARK-23219 renamed ReadTask to DataReaderFactory. The 
> intent was to make the read and write API match (write side uses 
> DataWriterFactory), but the underlying problem is that the two classes are 
> not equivalent.
> ReadTask/DataReader function as Iterable/Iterator. A ReadTask is specific to 
> a single read task, in contrast to DataWriterFactory, where the same factory 
> instance is used in all write tasks. ReadTask's purpose is to manage the 
> lifecycle of DataReader, with an explicit create operation to mirror the 
> close operation. This is no longer clear from the API: DataReaderFactory 
> appears to be more generic than it is, and it isn't clear why a set of them 
> is produced for a read.
> We should rename DataReaderFactory back to ReadTask, which correctly conveys 
> the purpose and use of the class.
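
The Iterable/Iterator analogy in the description can be sketched roughly as
follows. This is a minimal illustration only: the interfaces below are
simplified stand-ins, not Spark's actual DataSourceV2 classes, and
ListReadTask and runTask are hypothetical names introduced for the example.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified stand-ins for illustration; NOT Spark's actual interfaces.
public class ReadTaskSketch {

    /** Plays the role of Iterator: reads one split and must be closed. */
    interface DataReader<T> extends Closeable {
        boolean next();
        T get();
        @Override
        void close(); // narrowed: no checked exception in this sketch
    }

    /** Plays the role of Iterable: one instance per read task. */
    interface ReadTask<T> {
        DataReader<T> createDataReader(); // explicit create, mirrors close()
    }

    /** Hypothetical task that serves a fixed in-memory list of rows. */
    static final class ListReadTask<T> implements ReadTask<T> {
        private final List<T> rows;

        ListReadTask(List<T> rows) {
            this.rows = rows;
        }

        @Override
        public DataReader<T> createDataReader() {
            return new DataReader<T>() {
                private int index = -1;

                @Override
                public boolean next() {
                    return ++index < rows.size();
                }

                @Override
                public T get() {
                    return rows.get(index);
                }

                @Override
                public void close() {
                    // nothing to release for an in-memory list
                }
            };
        }
    }

    /** Runs one task: create the reader, drain it, then close it. */
    static <T> List<T> runTask(ReadTask<T> task) {
        List<T> out = new ArrayList<>();
        try (DataReader<T> reader = task.createDataReader()) {
            while (reader.next()) {
                out.add(reader.get());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // A read produces a set of tasks, one per split of the data.
        List<ReadTask<String>> tasks = Arrays.asList(
            new ListReadTask<String>(Arrays.asList("a", "b")),
            new ListReadTask<String>(Arrays.asList("c")));
        for (ReadTask<String> task : tasks) {
            System.out.println(runTask(task));
        }
    }
}
```

In this sketch each task owns exactly one slice of the input and hands out a
reader for it, which is why a read yields a set of tasks, whereas a single
writer factory can be shared by every write task.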



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
