HeartSaVioR commented on pull request #34642: URL: https://github.com/apache/spark/pull/34642#issuecomment-981386950
The rationale seems to make sense: supporting row reads and columnar reads is not mutually exclusive, and it is not always true that `columnar read + columnar-to-row` is better than `row read` (if we always favor columnar reads), or vice versa. Even if a source only supports columnar reads by nature, there is a possibility it can convert its columnar form to rows more efficiently than relying on Spark's columnar-to-row conversion. I'm not referring to any particular data source, just thinking out loud about the possibility.

That said, for the leaf node (each physical node probably has its own requirement on row vs. columnar), it sounds valid to allow supporting both and let Spark take one of them, depending on the downstream node, or on preference if both are acceptable without an additional conversion. (+1 on @attilapiros' idea, though we may want to take this carefully to avoid requiring lots of changes to existing code.)

If there's a case where `columnar read + columnar-to-row` is always better than `row read`, the source can simply be marked as `doesn't support row read` and Spark will take care of the conversion.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
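The selection logic discussed in the comment — take whichever read path avoids an extra conversion, and fall back to inserting a conversion only when the source doesn't support the downstream node's preferred format — could be modeled roughly as follows. This is a minimal sketch: the type names and the downstream-preference flag are hypothetical illustrations, loosely inspired by DSv2's `PartitionReaderFactory.supportColumnarReads`, not Spark's actual planner code.

```python
from dataclasses import dataclass

# Illustrative model only; names do not mirror Spark's internal APIs.

@dataclass
class ScanCapabilities:
    supports_row: bool       # source can produce rows directly
    supports_columnar: bool  # source can produce columnar batches

def choose_read_path(caps: ScanCapabilities, downstream_wants_columnar: bool) -> str:
    """Pick the read path that avoids an extra conversion when possible."""
    if downstream_wants_columnar:
        if caps.supports_columnar:
            return "columnar read"                    # no conversion needed
        if caps.supports_row:
            return "row read + row-to-columnar"
    else:
        if caps.supports_row:
            return "row read"                         # no conversion needed
        if caps.supports_columnar:
            return "columnar read + columnar-to-row"
    raise ValueError("source supports neither read path")

# A source that only supports columnar reads, feeding a row-based operator,
# gets a conversion inserted:
print(choose_read_path(ScanCapabilities(False, True), downstream_wants_columnar=False))
# → columnar read + columnar-to-row

# A source supporting both, feeding a row-based operator, keeps the row path:
print(choose_read_path(ScanCapabilities(True, True), downstream_wants_columnar=False))
# → row read
```

The point of the model is the comment's argument in miniature: when a source declares both capabilities, Spark can match the downstream node's format directly; when it declares only one, the conversion is Spark's responsibility.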