I'd be happy to review a PR. At the minute, I'm still learning Spark
SQL, so writing documentation might be a bit of a stretch, but reviewing
would be fine.
Thanks!
On 12/16/2016 08:39 AM, Thakrar, Jayesh wrote:
Yes, that sounds good, Anton. I can work on documenting the window
functions.
*From: *Anton Okolnychyi <[email protected]>
*Date: *Thursday, December 15, 2016 at 4:34 PM
*To: *Conversant <[email protected]>
*Cc: *Michael Armbrust <[email protected]>, Jim Hughes
<[email protected]>, "[email protected]" <[email protected]>
*Subject: *Re: Expand the Spark SQL programming guide?
I think it would make sense to show a sample implementation of
UserDefinedAggregateFunction for DataFrames, and an example of the
Aggregator API for typed Datasets.
Jim, what if I submit a PR and you join the review process? I also do
not mind splitting this if you want, but that seems like overkill for
this part.
Jayesh, shall I skip the window functions part since you are going to
work on that?
2016-12-15 22:48 GMT+01:00 Thakrar, Jayesh
<[email protected]>:
I too am interested in expanding the documentation for Spark SQL.
For my work I needed some info/examples/guidance on window
functions and have been using this post:
https://databricks.com/blog/2015/07/15/introducing-window-functions-in-spark-sql.html
How about divide and conquer?
*From: *Michael Armbrust <[email protected]>
*Date: *Thursday, December 15, 2016 at 3:21 PM
*To: *Jim Hughes <[email protected]>
*Cc: *"[email protected]" <[email protected]>
*Subject: *Re: Expand the Spark SQL programming guide?
Pull requests would be welcome for any major missing features in
the guide:
https://github.com/apache/spark/blob/master/docs/sql-programming-guide.md
On Thu, Dec 15, 2016 at 11:48 AM, Jim Hughes <[email protected]> wrote:
Hi Anton,
I'd like to see this as well. I've been working on
implementing geospatial user-defined types and functions.
Having examples of aggregations and window functions would be
awesome!
I did test out implementing a distributed convex hull as a
UserDefinedAggregateFunction, and that seemed to work sensibly.
Cheers,
Jim
On 12/15/2016 03:28 AM, Anton Okolnychyi wrote:
Hi,
I am wondering whether it makes sense to expand the Spark
SQL programming guide with examples of aggregations
(including user-defined ones via the Aggregator API) and window
functions. For instance, there might be a separate
subsection under "Getting Started" for each feature.
SPARK-16046 seems to be related, but there has been no activity
on it for more than 4 months.
Best regards,
Anton