[GitHub] spark issue #20397: [SPARK-23219][SQL]Rename ReadTask to DataReaderFactory i...

rdblue Mon, 29 Jan 2018 09:09:42 -0800

Github user rdblue commented on the issue:

    https://github.com/apache/spark/pull/20397
  
    @cloud-fan, thanks for pinging me on this.
    
    -1: I don't think there's a compelling benefit to justify this change, and 
I think it makes the API more confusing. I think we should revert this.
    
    This class doesn't actually behave as a factory and is used more like an 
Iterable: it is only used to instantiate one DataReader and carries no explicit 
guarantee that it can be reused. In addition, the piece of work that each one 
represents is a task, which becomes an actual task when the stage runs. I would 
much rather keep the ReadTask name to make that connection clear.
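    
    A minimal sketch of the usage pattern described above. The interfaces 
here are simplified stand-ins written for illustration, not Spark's actual 
classes, and RangeReadTask and ReadTaskSketch are hypothetical names. It 
shows one task object creating exactly one reader, which is then drained:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for illustration only; not Spark's actual interfaces.
interface DataReader<T> extends AutoCloseable {
    boolean next();   // advance to the next record; false when exhausted
    T get();          // the current record
    @Override void close();
}

// One unit of work for one partition, named ReadTask as the comment prefers.
interface ReadTask<T> {
    DataReader<T> createDataReader();
}

// Hypothetical task that reads the integers in [start, end).
final class RangeReadTask implements ReadTask<Integer> {
    private final int start;
    private final int end;

    RangeReadTask(int start, int end) {
        this.start = start;
        this.end = end;
    }

    @Override
    public DataReader<Integer> createDataReader() {
        return new DataReader<Integer>() {
            private int current = start - 1;
            @Override public boolean next() { return ++current < end; }
            @Override public Integer get() { return current; }
            @Override public void close() { }
        };
    }
}

final class ReadTaskSketch {
    // Assumed consumption pattern: createDataReader() is called exactly once
    // per task, and the resulting reader is drained and then closed.
    static <T> List<T> runTask(ReadTask<T> task) {
        List<T> rows = new ArrayList<>();
        try (DataReader<T> reader = task.createDataReader()) {
            while (reader.next()) {
                rows.add(reader.get());
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(runTask(new RangeReadTask(0, 3)));
    }
}
```

    Nothing in this sketch calls createDataReader() a second time on the 
same object, which is why the object reads as a task rather than a factory.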
    
    The write side does behave like a factory, so the name is appropriate 
there. Uniform names have little value if they make the API more 
confusing.



