HeartSaVioR commented on pull request #34642:
URL: https://github.com/apache/spark/pull/34642#issuecomment-981386950


   The rationale seems to make sense; supporting row read and columnar 
read are not mutually exclusive, and it may not always be true that `columnar 
read + columnar-to-row` is better than `row read` (if we always favor 
columnar read), or vice versa. Even if a source only supports columnar read by 
nature, it might be able to convert its columnar form to rows more 
efficiently than by relying on Spark's columnar-to-row conversion. I'm not 
referring to any specific data source, just thinking out loud about the 
possibility.
   
   That said, for a leaf node (each physical node probably has its own 
requirement on row vs. columnar), it sounds valid to allow supporting both and 
let Spark pick one, depending on the downstream node, or on preference if both 
are acceptable without additional conversion. (+1 on @attilapiros's idea, 
though we may want to approach this carefully to avoid requiring lots of 
changes to existing code.)
   
   If there's a case where `columnar read + columnar-to-row` is always better 
than `row read`, simply mark the source as `don't support row read`, and Spark 
will take care of the conversion.
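   
   To make the selection above concrete, here is a rough sketch of how the 
choice could work. It is a standalone Python model, not Spark code: `LeafScan`, 
`Plan`, `plan_read`, and the `preference` parameter are all hypothetical names 
made up for illustration, and the real planner would need to consider more than 
this.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Format(Enum):
    ROW = "row"
    COLUMNAR = "columnar"


@dataclass(frozen=True)
class LeafScan:
    # Hypothetical capability flags declared by a leaf node; not Spark's API.
    supports_row: bool
    supports_columnar: bool


@dataclass(frozen=True)
class Plan:
    read_format: Format
    needs_conversion: bool  # True if the planner must insert a conversion


def plan_read(
    scan: LeafScan,
    downstream_wants: Optional[Format],
    preference: Format = Format.COLUMNAR,
) -> Plan:
    """Pick the read format for a leaf node.

    downstream_wants is None when the downstream node accepts either format.
    """
    supported = set()
    if scan.supports_row:
        supported.add(Format.ROW)
    if scan.supports_columnar:
        supported.add(Format.COLUMNAR)
    if not supported:
        raise ValueError("leaf node must support at least one read format")

    if downstream_wants is None:
        # Both formats are acceptable downstream: use the preference if the
        # scan supports it, otherwise the only format it does support.
        chosen = preference if preference in supported else next(iter(supported))
        return Plan(chosen, needs_conversion=False)

    if downstream_wants in supported:
        # The scan can produce what downstream needs directly.
        return Plan(downstream_wants, needs_conversion=False)

    # The scan supports only the other format, so a conversion is required.
    (only_format,) = supported
    return Plan(only_format, needs_conversion=True)
```

   In this model, a source that marks itself as not supporting row read 
(`LeafScan(supports_row=False, supports_columnar=True)`) feeding a row-based 
downstream node yields a columnar read with `needs_conversion=True`, which 
corresponds to leaving the columnar-to-row conversion to Spark.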


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


