viirya commented on pull request #31296:
URL: https://github.com/apache/spark/pull/31296#issuecomment-765879767


   > Map doesn't expose it because object of type T is just passed into the map function.
   > 
   > Pipe would need to serialize T in order to pass it to the different process, wouldn't it? I think that's what @HeartSaVioR was trying to say.
   
   No. As I have mentioned several times above, Dataset.pipe works like
   RDD.pipe. RDD.pipe has a parameter `printRDDElement: (T, String => Unit) =>
   Unit`, which is similar to the function passed to map in that it takes the
   domain object T. The function receives the object T and uses its second
   parameter, a print function, to write out the string data. So it is
   basically like Dataset.map. That is why I'm confused that @HeartSaVioR
   keeps raising that point.
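
To make the shape of `printRDDElement` concrete, here is a minimal sketch in plain Scala, without Spark on the classpath. `Person`, `printPerson`, and `linesFor` are hypothetical names used only for illustration; `linesFor` is a simplified stand-in for what pipe does per element, not Spark's actual implementation.

```scala
// Hypothetical domain type, for illustration only.
final case class Person(name: String, age: Int)

object PrintElementSketch {
  // Same shape as RDD.pipe's printRDDElement parameter:
  // (T, String => Unit) => Unit. It receives the domain object and
  // decides which string lines to emit through the callback.
  val printPerson: (Person, String => Unit) => Unit =
    (p, printLine) => printLine(s"${p.name},${p.age}")

  // Simplified stand-in (an assumption, not Spark code): call the user
  // function for each element and collect the lines that would be
  // written to the external process's stdin.
  def linesFor[T](elems: Seq[T])(printElem: (T, String => Unit) => Unit): Seq[String] = {
    val out = scala.collection.mutable.ArrayBuffer.empty[String]
    elems.foreach(e => printElem(e, line => { out += line; () }))
    out.toSeq
  }

  def main(args: Array[String]): Unit = {
    val lines = linesFor(Seq(Person("a", 1), Person("b", 2)))(printPerson)
    lines.foreach(println)
  }
}
```

With Spark available, a function of the same shape could be passed to an `RDD[Person]` as `rdd.pipe(Seq("cat"), printRDDElement = printPerson)`. The user function turns T into strings itself, in the same way a function given to map handles T directly.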
   
   
   




