[jira] [Commented] (SPARK-17895) Improve documentation of "rowsBetween" and "rangeBetween"

2018-10-03 Thread Antonio Pedro de Sousa Vieira (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-17895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16637464#comment-16637464
 ] 

Antonio Pedro de Sousa Vieira commented on SPARK-17895:
---

These changes seem to have been applied only to the Scala docs; the SparkR and 
PySpark docs are still identical for the two methods. Should this be reopened?

> Improve documentation of "rowsBetween" and "rangeBetween"
> -
>
> Key: SPARK-17895
> URL: https://issues.apache.org/jira/browse/SPARK-17895
> Project: Spark
>  Issue Type: Documentation
>  Components: PySpark, SparkR, SQL
>Reporter: Weiluo Ren
>Assignee: Weiluo Ren
>Priority: Minor
> Fix For: 2.1.0
>
>
> This issue was found by [~junyangq] while he was fixing the SparkR docs.
> WindowSpec has two methods, "rangeBetween" and "rowsBetween" (see 
> [https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/expressions/WindowSpec.scala#L82]).
> However, the description of "rangeBetween" does not clearly differentiate it 
> from "rowsBetween". Although 
> [https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala#L109]
> has a pretty nice description of "RangeFrame" and "RowFrame", which are 
> used by "rangeBetween" and "rowsBetween", I cannot find those descriptions in 
> the online Spark Scala API docs. 
> We could add small examples to the descriptions of "rangeBetween" and 
> "rowsBetween", such as:
> {code}
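> import org.apache.spark.sql.expressions.Window
> import org.apache.spark.sql.functions.sum
> import spark.implicits._ // assumes a SparkSession named spark, as in spark-shell
> val df = Seq(1,1,2).toDF("id")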
> df.withColumn("sum", sum('id) over Window.orderBy('id).rangeBetween(0,1)).show
> /**
>  * It shows
>  * +---+---+
>  * | id|sum|
>  * +---+---+
>  * |  1|  4|
>  * |  1|  4|
>  * |  2|  2|
>  * +---+---+
> */
> df.withColumn("sum", sum('id) over Window.orderBy('id).rowsBetween(0,1)).show
> /**
>  * It shows
>  * +---+---+
>  * | id|sum|
>  * +---+---+
>  * |  1|  2|
>  * |  1|  3|
>  * |  2|  2|
>  * +---+---+
> */
> {code}
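
To make the difference in the two tables above concrete, here is a minimal 
plain-Scala sketch of the two frame types. It is only a model, not Spark's 
implementation: the object and function names (FrameSketch, rowsBetweenSum, 
rangeBetweenSum) are made up for illustration, the input is assumed to be 
already sorted ascending, and nulls, partitions and descending order are 
ignored.

```scala
// Plain-Scala model of the two frame types (no Spark dependency).
// All names here are hypothetical; input must be sorted ascending.
object FrameSketch {
  // ROWS frame: start/end are offsets in physical row positions
  // relative to the current row, clipped to the bounds of the data.
  def rowsBetweenSum(sorted: Seq[Int], start: Int, end: Int): Seq[Int] =
    sorted.indices.map { i =>
      val lo = math.max(i + start, 0)
      val hi = math.min(i + end, sorted.length - 1)
      (lo to hi).map(sorted).sum
    }

  // RANGE frame: start/end are offsets added to the current row's
  // ordering value v; every row whose value lies in
  // [v + start, v + end] is included, so ties share one frame.
  def rangeBetweenSum(sorted: Seq[Int], start: Int, end: Int): Seq[Int] =
    sorted.map { v =>
      sorted.filter(x => x >= v + start && x <= v + end).sum
    }

  def main(args: Array[String]): Unit = {
    val ids = Seq(1, 1, 2)
    // Sums are 4, 4, 2: both rows with id 1 see every value in [1, 2].
    println(rangeBetweenSum(ids, 0, 1).mkString(", "))
    // Sums are 2, 3, 2: each row sees itself and the next physical row.
    println(rowsBetweenSum(ids, 0, 1).mkString(", "))
  }
}
```

The two rows with id 1 get the same sum under the range frame because the 
frame is defined by value, while under the row frame they differ because 
each row looks at a different physical neighbour.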



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-17895) Improve documentation of "rowsBetween" and "rangeBetween"

2016-11-01 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15627597#comment-15627597
 ] 

Apache Spark commented on SPARK-17895:
--

User 'david-weiluo-ren' has created a pull request for this issue:
https://github.com/apache/spark/pull/15727




[jira] [Commented] (SPARK-17895) Improve documentation of "rowsBetween" and "rangeBetween"

2016-10-13 Thread Weiluo Ren (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15572726#comment-15572726
 ] 

Weiluo Ren commented on SPARK-17895:


[~junyangq] Could you please help fix the SparkR doc accordingly?




[jira] [Commented] (SPARK-17895) Improve documentation of "rowsBetween" and "rangeBetween"

2016-10-13 Thread Weiluo Ren (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15572721#comment-15572721
 ] 

Weiluo Ren commented on SPARK-17895:


Sure. I just want to collect some comments here on the example to be added to 
the description. Alternatively, I can create a PR first and gather comments 
when people review it.




[jira] [Commented] (SPARK-17895) Improve documentation of "rowsBetween" and "rangeBetween"

2016-10-13 Thread Felix Cheung (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15572602#comment-15572602
 ] 

Felix Cheung commented on SPARK-17895:
--

Would you like to fix this?
