GitHub user rdblue commented on the issue:
https://github.com/apache/spark/pull/20397
@cloud-fan, thanks for pinging me on this.
-1: I don't see a compelling benefit that justifies this change, and it
makes the API more confusing. I think we should revert this.
This class doesn't actually behave as a factory and is used more like an
Iterable: it is only used to instantiate one DataReader and carries no explicit
guarantee that it can be reused. In addition, the piece of work that each one
represents is a task, which becomes an actual task when the stage runs. I would
much rather keep the ReadTask name to make that connection clear.
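To illustrate the usage pattern described above, here is a minimal sketch. The
interfaces are simplified, hypothetical stand-ins and not the actual Spark
signatures; `ListReadTask`, `ReadTaskSketch`, and `runTask` are names made up
for this example. It assumes each task covers one slice of the data and is
asked for a reader exactly once when it runs.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical, simplified stand-ins for the interfaces under discussion;
// these are assumptions for illustration, not the real Spark API.
interface DataReader<T> extends AutoCloseable {
    boolean next();

    T get();

    @Override
    void close();
}

interface ReadTask<T> {
    // Called once per task in this sketch; nothing here promises reuse.
    DataReader<T> createDataReader();
}

// One task describes one slice of the data (hypothetical in-memory example).
final class ListReadTask implements ReadTask<Integer> {
    private final List<Integer> slice;

    ListReadTask(List<Integer> slice) {
        this.slice = slice;
    }

    @Override
    public DataReader<Integer> createDataReader() {
        Iterator<Integer> it = slice.iterator();
        return new DataReader<Integer>() {
            private Integer current;

            @Override
            public boolean next() {
                if (!it.hasNext()) {
                    return false;
                }
                current = it.next();
                return true;
            }

            @Override
            public Integer get() {
                return current;
            }

            @Override
            public void close() {
                // Nothing to release for an in-memory slice.
            }
        };
    }
}

final class ReadTaskSketch {
    // Runs one task the way described above: create one reader, drain it, close it.
    static <T> List<T> runTask(ReadTask<T> task) {
        List<T> rows = new ArrayList<>();
        try (DataReader<T> reader = task.createDataReader()) {
            while (reader.next()) {
                rows.add(reader.get());
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // Two slices of data become two tasks, each run exactly once.
        List<ReadTask<Integer>> tasks = new ArrayList<>();
        tasks.add(new ListReadTask(List.of(1, 2)));
        tasks.add(new ListReadTask(List.of(3)));
        for (ReadTask<Integer> task : tasks) {
            System.out.println(runTask(task));
        }
    }
}
```

In this sketch every object in `tasks` maps to one unit of work and produces
one reader, which is the sense in which it reads as a task rather than a
factory.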
The write side does behave like a factory, so the name is appropriate
there. There is little value in uniform names if they make the API more
confusing.