Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/20397
  
    About the renaming: a lot of people, including @rxin, have complained to 
me that the names are not consistent. I named it `ReadTask` at the beginning 
because it really works like a task. But I believe that after 2.3 more and more 
people will complain about the naming inconsistency, because the difference 
between `ReadTask` and `DataWriterFactory` is too subtle: both are 
responsible for serializing information and initializing the actual 
reader/writer on the executor side. The only difference is that we get a single 
`DataWriterFactory`, serialize it, and send it to all partitions, which means we 
implicitly "copy" the writer factory to every partition. For `ReadTask`, 
we get many of them and send each one to its corresponding partition, so 
there is no "copy". I think the renaming is worth it to avoid future 
confusion.
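
To make the difference concrete, here is a rough, self-contained sketch. The interfaces `SketchReadTask` and `SketchDataWriterFactory`, the two implementing classes, and the `ship` helper are hypothetical stand-ins invented for illustration; they are simplified and are not the actual Spark interfaces. Java serialization is assumed here only as a way to simulate sending an object to an executor.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NamingSketch {
    // Hypothetical stand-in for ReadTask: the driver creates one per partition.
    interface SketchReadTask extends Serializable {
        String createDataReader();
    }

    // Hypothetical stand-in for DataWriterFactory: the driver creates only one.
    interface SketchDataWriterFactory extends Serializable {
        String createDataWriter(int partitionId);
    }

    static final class RangeReadTask implements SketchReadTask {
        private final int start;
        private final int end;
        RangeReadTask(int start, int end) { this.start = start; this.end = end; }
        public String createDataReader() { return "reader[" + start + "," + end + ")"; }
    }

    static final class PathWriterFactory implements SketchDataWriterFactory {
        private final String path;
        PathWriterFactory(String path) { this.path = path; }
        public String createDataWriter(int partitionId) {
            return "writer(" + path + "/part-" + partitionId + ")";
        }
    }

    // Simulates shipping an object to an executor with a serialization round trip.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T ship(T obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Read side: many tasks, each one sent to its own partition (no "copy").
    static List<String> runRead(List<SketchReadTask> tasks) {
        List<String> readers = new ArrayList<>();
        for (SketchReadTask task : tasks) {
            readers.add(ship(task).createDataReader());
        }
        return readers;
    }

    // Write side: one factory, implicitly "copied" to every partition.
    static List<String> runWrite(SketchDataWriterFactory factory, int numPartitions) {
        List<String> writers = new ArrayList<>();
        for (int p = 0; p < numPartitions; p++) {
            writers.add(ship(factory).createDataWriter(p));
        }
        return writers;
    }

    public static void main(String[] args) {
        System.out.println(runRead(Arrays.asList(
            new RangeReadTask(0, 10), new RangeReadTask(10, 20))));
        System.out.println(runWrite(new PathWriterFactory("/tmp/out"), 2));
    }
}
```

In this sketch both sides do the same job (carry serialized information and create the real reader/writer after shipping), and the only visible difference is whether the driver holds a list of objects or a single one.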

