GitHub user gengliangwang commented on the issue:
https://github.com/apache/spark/pull/21145
Neither name is perfect.
`ReadTask` is not a real task, and it has a method named `createDataReader`,
while there is `createDataWriter` in `DataWriterFactory`.
`DataReaderFactory` is not a factory (in the design-pattern sense). I renamed
`ReadTask` to `DataReaderFactory` to make the read and write APIs consistent.
It wasn't as misleading as expected, since the API in `DataSourceReader` is
`List<DataReaderFactory<Row>> createDataReaderFactories();`.
I regret not coming up with a better name at the time.
But **partially** changing the name to `ReadTask` now only makes things worse.
If there is a name better than both, let's use it. Otherwise, I
prefer `DataReaderFactory` to `ReadTask`.