Hey Pedro,
The SQL programming guide is being updated. Here's the PR, though it
hasn't been merged yet: https://github.com/apache/spark/pull/13592
Cheng
On 6/17/16 9:13 PM, Pedro Rodriguez wrote:
Hi All,
At my workplace we are starting to use Datasets in 1.6.1, and even more
so with Spark 2.0, in place of DataFrames. I looked at the 1.6.1
documentation and then the 2.0 documentation, and it looks like not much
time has been spent writing a Dataset guide/tutorial.
Preview Docs:
https://home.apache.org/~pwendell/spark-releases/spark-2.0.0-preview-docs/sql-programming-guide.html#creating-datasets
Spark master docs:
https://github.com/apache/spark/blob/master/docs/sql-programming-guide.md
I would like to spend the time to contribute an improvement to those
docs with more in-depth examples of creating and using Datasets (e.g.
using $ to select columns). Is this of value, and if so, what should my
next step be to get this going (create a JIRA, etc.)?
--
Pedro Rodriguez
PhD Student in Distributed Machine Learning | CU Boulder
R&D Data Science Intern at Oracle Data Cloud
UC Berkeley AMPLab Alumni
ski.rodrig...@gmail.com | pedrorodriguez.io | 909-353-4423
GitHub: github.com/EntilZha | LinkedIn:
https://www.linkedin.com/in/pedrorodriguezscience