Weiluo Ren created SPARK-17895:
----------------------------------

             Summary: Improve documentation of "rowsBetween" and "rangeBetween"
                 Key: SPARK-17895
                 URL: https://issues.apache.org/jira/browse/SPARK-17895
             Project: Spark
          Issue Type: Documentation
          Components: PySpark, SparkR, SQL
            Reporter: Weiluo Ren
            Priority: Minor

This is an issue found by [~junyangq] when he was fixing the SparkR docs.

In WindowSpec we have two methods, "rangeBetween" and "rowsBetween" (see [https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/expressions/WindowSpec.scala#L82]). However, the description of "rangeBetween" does not clearly differentiate it from "rowsBetween". There are quite clear descriptions of "RangeFrame" and "RowFrame", which are used by "rangeBetween" and "rowsBetween", in [https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala#L109], but I cannot find them in the online Spark Scala API docs.

We could add small examples to the descriptions of "rangeBetween" and "rowsBetween", like

{code}
// In spark-shell, spark.implicits._ (needed for toDF and 'id) is already in scope.
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.sum

val df = Seq(1,1,2).toDF("id")

// Range frame: the bounds are offsets on the value of "id".
df.withColumn("sum", sum('id) over Window.orderBy('id).rangeBetween(0,1)).show
/**
 * It shows
 * +---+---+
 * | id|sum|
 * +---+---+
 * |  1|  4|
 * |  1|  4|
 * |  2|  2|
 * +---+---+
 */

// Row frame: the bounds are offsets on the row position.
df.withColumn("sum", sum('id) over Window.orderBy('id).rowsBetween(0,1)).show
/**
 * It shows
 * +---+---+
 * | id|sum|
 * +---+---+
 * |  1|  2|
 * |  1|  3|
 * |  2|  2|
 * +---+---+
 */
{code}
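
To spell out why the two calls give different sums, here is a rough plain-Scala model of the two frame types with no Spark dependency. It is only a sketch of the semantics as I understand them, not how Spark implements frames. The names FrameSketch, rowsBetweenSum and rangeBetweenSum are made up for illustration and are not part of any Spark API. It assumes the input is already sorted by the ordering value. An empty frame sums to 0 here, which may differ from what Spark returns.

```scala
object FrameSketch {
  // Row frame: start and end are offsets on the position of the current row.
  def rowsBetweenSum(sorted: Seq[Long], start: Int, end: Int): Seq[Long] =
    sorted.indices.map { i =>
      val lo = math.max(0, i + start)
      val hi = math.min(sorted.length - 1, i + end)
      (lo to hi).map(sorted).sum // an empty range (lo > hi) sums to 0
    }

  // Range frame: start and end are offsets on the ordering value of the
  // current row; every row with a value in [v + start, v + end] is included.
  def rangeBetweenSum(sorted: Seq[Long], start: Long, end: Long): Seq[Long] =
    sorted.map { v =>
      sorted.filter(x => x >= v + start && x <= v + end).sum
    }

  def main(args: Array[String]): Unit = {
    val ids = Seq(1L, 1L, 2L)
    println(rangeBetweenSum(ids, 0, 1)) // sums: 4, 4, 2
    println(rowsBetweenSum(ids, 0, 1))  // sums: 2, 3, 2
  }
}
```

The two rows with id 1 are peers under the range frame, so they see the same set of rows and get the same sum. Under the row frame each of them gets its own positional window.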