Github user marmbrus commented on the pull request:
https://github.com/apache/spark/pull/4434#issuecomment-103632877
I think that Spark SQL is perhaps somewhat misleadingly named (as I
discussed [at the last Spark
Summit](http://www.slideshare.net/databricks/spark-sqlsse2015public)). You can
always call `.rdd` on any DataFrame to get the underlying RDD if you don't want
to use the higher-level DataFrame/SQL operations.
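For example, in `spark-shell` (a minimal sketch assuming the Spark 1.4+ `DataFrameReader` API; `people.json` is a hypothetical newline-delimited JSON file with a `name` field):

```scala
// In spark-shell, sqlContext is predefined.
val df = sqlContext.read.json("people.json")

// .rdd exposes the DataFrame's contents as an RDD[Row] for lower-level work.
val names = df.rdd.map(_.getAs[String]("name"))
names.collect().foreach(println)
```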
The Data Sources API is the preferred way to read data in various
formats: it is more concise, it can perform optimizations like column pruning
automatically, and it works the same in Scala/Java/Python/R, obviating the need
for specific examples for every format/language combination.
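A hedged sketch of what that looks like, again assuming Spark 1.4+ in `spark-shell` and a hypothetical `users.parquet` file:

```scala
// format(...) selects the data source; the same call shape works for json,
// jdbc, or third-party sources, and identically from Java/Python/R.
val users = sqlContext.read.format("parquet").load("users.parquet")

// Selecting one column lets the planner prune the rest at read time
// (Parquet is columnar, so unneeded columns are never deserialized).
users.select("name").show()
```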
So, while I appreciate the work you have done here, I don't think it's worth
the maintenance burden to add this specific example. It would probably be
better as a gist or a blog post somewhere.