[jira] [Updated] (SPARK-7322) Add DataFrame DSL for window function support in Scala/Java

2015-05-22 Thread Reynold Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-7322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reynold Xin updated SPARK-7322:
---
Summary: Add DataFrame DSL for window function support in Scala/Java  (was: 
Add DataFrame DSL for window function support)

 Add DataFrame DSL for window function support in Scala/Java
 ---

 Key: SPARK-7322
 URL: https://issues.apache.org/jira/browse/SPARK-7322
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Reporter: Reynold Xin
Assignee: Cheng Hao
  Labels: DataFrame

 Here's a proposal for supporting window functions in the DataFrame DSL:
 1. Add an over function to Column:
 {code}
 class Column {
   ...
   def over(window: Window): Column
   ...
 }
 {code}
 2. Window:
 {code}
 object Window {
   def partitionBy(...): Window
   def orderBy(...): Window
   object Frame {
     def unbounded: Frame
     def currentRow: Frame
     def preceding(n: Long): Frame
     def following(n: Long): Frame
   }
   class Frame
 }
 class Window {
   def orderBy(...): Window
   def rowsBetween(start: Frame, end: Frame): Window
   def rangeBetween(start: Frame, end: Frame): Window  // maybe add this later
 }
 {code}
 Here's an example to use it:
 {code}
 df.select(
   avg("age").over(
     Window.partitionBy("..", "..").orderBy("..", "..")
       .rowsBetween(Frame.unbounded, Frame.currentRow))
 )
 df.select(
   avg("age").over(
     Window.partitionBy("..", "..").orderBy("..", "..")
       .rowsBetween(Frame.preceding(50), Frame.following(10)))
 )
 {code}
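 For reference, here is a minimal, self-contained sketch of the kind of query this DSL targets, written against the org.apache.spark.sql.expressions.Window API found in released Spark versions. The SparkSession setup, the employees DataFrame, and its dept/name/age columns are illustrative only, and the Window.unboundedPreceding / Window.currentRow frame constants assume a newer Spark release.
 {code}
 import org.apache.spark.sql.SparkSession
 import org.apache.spark.sql.expressions.Window
 import org.apache.spark.sql.functions.avg

 object WindowDslSketch {
   def main(args: Array[String]): Unit = {
     val spark = SparkSession.builder()
       .appName("window-dsl-sketch")
       .master("local[*]")
       .getOrCreate()

     // Hypothetical input: one row per employee.
     val employees = spark.createDataFrame(Seq(
       ("sales", "alice", 34),
       ("sales", "bob", 45),
       ("sales", "carol", 51),
       ("eng", "dave", 29)
     )).toDF("dept", "name", "age")

     // Running average of age per department, ordered by name; the frame
     // UNBOUNDED PRECEDING .. CURRENT ROW mirrors
     // rowsBetween(Frame.unbounded, Frame.currentRow) in the proposal above.
     val w = Window.partitionBy("dept").orderBy("name")
       .rowsBetween(Window.unboundedPreceding, Window.currentRow)

     employees
       .select(employees("dept"), employees("name"), employees("age"),
         avg("age").over(w).as("running_avg_age"))
       .show()

     spark.stop()
   }
 }
 {code}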



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-7322) Add DataFrame DSL for window function support

2015-05-21 Thread Reynold Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-7322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reynold Xin updated SPARK-7322:
---
Description: 
Here's a proposal for supporting window functions in the DataFrame DSL:

1. Add an over function to Column:
{code}
class Column {
  ...
  def over(window: Window): Column
  ...
}
{code}

2. Window:
{code}
object Window {
  def partitionBy(...): Window
  def orderBy(...): Window

  object Frame {
    def unbounded: Frame
    def currentRow: Frame
    def preceding(n: Long): Frame
    def following(n: Long): Frame
  }

  class Frame
}

class Window {
  def orderBy(...): Window
  def rowsBetween(start: Frame, end: Frame): Window
  def rangeBetween(start: Frame, end: Frame): Window  // maybe add this later
}
{code}

Here's an example to use it:
{code}
df.select(
  avg("age").over(
    Window.partitionBy("..", "..").orderBy("..", "..")
      .rowsBetween(Frame.unbounded, Frame.currentRow))
)

df.select(
  avg("age").over(
    Window.partitionBy("..", "..").orderBy("..", "..")
      .rowsBetween(Frame.preceding(50), Frame.following(10)))
)
{code}

  was:
Here's a proposal for supporting window functions in the DataFrame DSL:

1. Add an over function to Column:
{code}
class Column {
  ...
  def over(): WindowFunctionSpec
  ...
}
{code}

2. WindowFunctionSpec:
{code}
// By default frame = full partition
class WindowFunctionSpec {
  def partitionBy(cols: Column*): WindowFunctionSpec
  def orderBy(cols: Column*): WindowFunctionSpec

  // restrict frame beginning from current row - n position
  def rowsPreceding(n: Int): WindowFunctionSpec

  // restrict frame ending from current row + n position
  def rowsFollowing(n: Int): WindowFunctionSpec

  def rangePreceding(n: Int): WindowFunctionSpec
  def rangeFollowing(n: Int): WindowFunctionSpec
}
{code}

Here's an example to use it:
{code}
df.select(
  df.store,
  df.date,
  df.sales,
  avg(df.sales).over.partitionBy(df.store)
    .orderBy(df.store)
    .rowsFollowing(0)  // this means from unbounded preceding to current row
)
{code}
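To make the frame semantics of this earlier WindowFunctionSpec sketch concrete, here is a hypothetical usage example. It assumes the proposed Column.over / WindowFunctionSpec API above (a sketch only, not a shipped interface), and the df DataFrame with store, date, and sales columns is illustrative; each spec is annotated with the SQL frame clause it would correspond to.
{code}
// Illustrative only: assumes the proposed over/partitionBy/orderBy/rowsPreceding/rowsFollowing API sketched above.

// Default frame is the full partition:
//   ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
val fullPartitionAvg = avg(df.sales).over.partitionBy(df.store)

// rowsFollowing(0) caps the frame at the current row:
//   ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW  (a running average)
val runningAvg = avg(df.sales).over.partitionBy(df.store).orderBy(df.date).rowsFollowing(0)

// rowsPreceding(6) plus rowsFollowing(0) gives a trailing seven-row window:
//   ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
val trailingAvg = avg(df.sales).over.partitionBy(df.store).orderBy(df.date)
  .rowsPreceding(6).rowsFollowing(0)

df.select(df.store, df.date, df.sales, fullPartitionAvg, runningAvg, trailingAvg)
{code}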


 Add DataFrame DSL for window function support
 -

 Key: SPARK-7322
 URL: https://issues.apache.org/jira/browse/SPARK-7322
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Reporter: Reynold Xin
Assignee: Cheng Hao
  Labels: DataFrame

 Here's a proposal for supporting window functions in the DataFrame DSL:
 1. Add an over function to Column:
 {code}
 class Column {
   ...
   def over(window: Window): Column
   ...
 }
 {code}
 2. Window:
 {code}
 object Window {
   def partitionBy(...): Window
   def orderBy(...): Window
   object Frame {
     def unbounded: Frame
     def currentRow: Frame
     def preceding(n: Long): Frame
     def following(n: Long): Frame
   }
   class Frame
 }
 class Window {
   def orderBy(...): Window
   def rowsBetween(start: Frame, end: Frame): Window
   def rangeBetween(start: Frame, end: Frame): Window  // maybe add this later
 }
 {code}
 Here's an example to use it:
 {code}
 df.select(
   avg("age").over(
     Window.partitionBy("..", "..").orderBy("..", "..")
       .rowsBetween(Frame.unbounded, Frame.currentRow))
 )
 df.select(
   avg("age").over(
     Window.partitionBy("..", "..").orderBy("..", "..")
       .rowsBetween(Frame.preceding(50), Frame.following(10)))
 )
 {code}






[jira] [Updated] (SPARK-7322) Add DataFrame DSL for window function support

2015-05-11 Thread Cheng Lian (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-7322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheng Lian updated SPARK-7322:
--
Assignee: Cheng Hao

 Add DataFrame DSL for window function support
 -

 Key: SPARK-7322
 URL: https://issues.apache.org/jira/browse/SPARK-7322
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Reporter: Reynold Xin
Assignee: Cheng Hao
  Labels: DataFrame

 Here's a proposal for supporting window functions in the DataFrame DSL:
 1. Add an over function to Column:
 {code}
 class Column {
   ...
   def over(): WindowFunctionSpec
   ...
 }
 {code}
 2. WindowFunctionSpec:
 {code}
 // By default frame = full partition
 class WindowFunctionSpec {
   def partitionBy(cols: Column*): WindowFunctionSpec
   def orderBy(cols: Column*): WindowFunctionSpec
   // restrict frame beginning from current row - n position
   def rowsPreceding(n: Int): WindowFunctionSpec
   // restrict frame ending from current row + n position
   def rowsFollowing(n: Int): WindowFunctionSpec
   def rangePreceding(n: Int): WindowFunctionSpec
   def rangeFollowing(n: Int): WindowFunctionSpec
 }
 {code}
 Here's an example to use it:
 {code}
 df.select(
   df.store,
   df.date,
   df.sales,
   avg(df.sales).over.partitionBy(df.store)
     .orderBy(df.store)
     .rowsFollowing(0)  // this means from unbounded preceding to current row
 )
 {code}






[jira] [Updated] (SPARK-7322) Add DataFrame DSL for window function support

2015-05-06 Thread Olivier Girardot (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-7322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olivier Girardot updated SPARK-7322:

Labels: DataFrame  (was: )

 Add DataFrame DSL for window function support
 -

 Key: SPARK-7322
 URL: https://issues.apache.org/jira/browse/SPARK-7322
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Reporter: Reynold Xin
  Labels: DataFrame

 Here's a proposal for supporting window functions in the DataFrame DSL:
 1. Add an over function to Column:
 {code}
 class Column {
   ...
   def over(): WindowFunctionSpec
   ...
 }
 {code}
 2. WindowFunctionSpec:
 {code}
 // By default frame = full partition
 class WindowFunctionSpec {
   def partitionBy(cols: Column*): WindowFunctionSpec
   def orderBy(cols: Column*): WindowFunctionSpec
   // restrict frame beginning from current row - n position
   def rowsPreceding(n: Int): WindowFunctionSpec
   // restrict frame ending from current row + n position
   def rowsFollowing(n: Int): WindowFunctionSpec
   def rangePreceding(n: Int): WindowFunctionSpec
   def rangeFollowing(n: Int): WindowFunctionSpec
 }
 {code}
 Here's an example to use it:
 {code}
 df.select(
   df.store,
   df.date,
   df.sales,
   avg(df.sales).over.partitionBy(df.store)
     .orderBy(df.store)
     .rowsFollowing(0)  // this means from unbounded preceding to current row
 )
 {code}






[jira] [Updated] (SPARK-7322) Add DataFrame DSL for window function support

2015-05-04 Thread Reynold Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-7322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reynold Xin updated SPARK-7322:
---
Description: 


class Column {
  ...
  def over(): WindowFunctionSpec
  ...
}

// By default frame = full partition
class WindowFunctionSpec {
  def partitionBy(cols: Column*): WindowFunctionSpec
  def orderBy(cols: Column*): WindowFunctionSpec

  // restrict frame beginning from current row - n position
  def rowsPreceding(n: Int): WindowFunctionSpec

  // restrict frame ending from current row + n position
  def rowsFollowing(n: Int): WindowFunctionSpec

  def rangePreceding(n: Int): WindowFunctionSpec
  def rangeFollowing(n: Int): WindowFunctionSpec
}


df.select(
  df.store,
  df.date,
  df.sales,
  avg(df.sales).over.partitionBy(df.store)
    .orderBy(df.store)
    .rowsFollowing(0)  // this means from unbounded preceding to current row
)

  was:A good reference implementation: 
http://www.jooq.org/doc/3.6/manual-single-page/#window-functions


 Add DataFrame DSL for window function support
 -

 Key: SPARK-7322
 URL: https://issues.apache.org/jira/browse/SPARK-7322
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Reporter: Reynold Xin

 class Column {
   ...
   def over(): WindowFunctionSpec
   ...
 }
 // By default frame = full partition
 class WindowFunctionSpec {
   def partitionBy(cols: Column*): WindowFunctionSpec
   def orderBy(cols: Column*): WindowFunctionSpec
   // restrict frame beginning from current row - n position
   def rowsPreceding(n: Int): WindowFunctionSpec
   // restrict frame ending from current row + n position
   def rowsFollowing(n: Int): WindowFunctionSpec
   def rangePreceding(n: Int): WindowFunctionSpec
   def rangeFollowing(n: Int): WindowFunctionSpec
 }
 df.select(
   df.store,
   df.date,
   df.sales,
   avg(df.sales).over.partitionBy(df.store)
     .orderBy(df.store)
     .rowsFollowing(0)  // this means from unbounded preceding to current row
 )






[jira] [Updated] (SPARK-7322) Add DataFrame DSL for window function support

2015-05-04 Thread Reynold Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-7322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reynold Xin updated SPARK-7322:
---
Description: 
Here's a proposal for supporting window functions in the DataFrame DSL:

1. Add an over function to Column:
{code}
class Column {
  ...
  def over(): WindowFunctionSpec
  ...
}
{code}

2. WindowFunctionSpec:
{code}
// By default frame = full partition
class WindowFunctionSpec {
  def partitionBy(cols: Column*): WindowFunctionSpec
  def orderBy(cols: Column*): WindowFunctionSpec

  // restrict frame beginning from current row - n position
  def rowsPreceding(n: Int): WindowFunctionSpec

  // restrict frame ending from current row + n position
  def rowsFollowing(n: Int): WindowFunctionSpec

  def rangePreceding(n: Int): WindowFunctionSpec
  def rangeFollowing(n: Int): WindowFunctionSpec
}
{code}

Here's an example to use it:
{code}
df.select(
  df.store,
  df.date,
  df.sales,
  avg(df.sales).over.partitionBy(df.store)
    .orderBy(df.store)
    .rowsFollowing(0)  // this means from unbounded preceding to current row
)
{code}

  was:


class Column {
  ...
  def over(): WindowFunctionSpec
  ...
}

// By default frame = full partition
class WindowFunctionSpec {
  def partitionBy(cols: Column*): WindowFunctionSpec
  def orderBy(cols: Column*): WindowFunctionSpec

  // restrict frame beginning from current row - n position
  def rowsPreceding(n: Int): WindowFunctionSpec

  // restrict frame ending from current row + n position
  def rowsFollowing(n: Int): WindowFunctionSpec

  def rangePreceding(n: Int): WindowFunctionSpec
  def rangeFollowing(n: Int): WindowFunctionSpec
}


df.select(
  df.store,
  df.date,
  df.sales,
  avg(df.sales).over.partitionBy(df.store)
    .orderBy(df.store)
    .rowsFollowing(0)  // this means from unbounded preceding to current row
)


 Add DataFrame DSL for window function support
 -

 Key: SPARK-7322
 URL: https://issues.apache.org/jira/browse/SPARK-7322
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Reporter: Reynold Xin

 Here's a proposal for supporting window functions in the DataFrame DSL:
 1. Add an over function to Column:
 {code}
 class Column {
   ...
   def over(): WindowFunctionSpec
   ...
 }
 {code}
 2. WindowFunctionSpec:
 {code}
 // By default frame = full partition
 class WindowFunctionSpec {
   def partitionBy(cols: Column*): WindowFunctionSpec
   def orderBy(cols: Column*): WindowFunctionSpec
   // restrict frame beginning from current row - n position
   def rowsPreceding(n: Int): WindowFunctionSpec
   // restrict frame ending from current row + n position
   def rowsFollowing(n: Int): WindowFunctionSpec
   def rangePreceding(n: Int): WindowFunctionSpec
   def rangeFollowing(n: Int): WindowFunctionSpec
 }
 {code}
 Here's an example to use it:
 {code}
 df.select(
   df.store,
   df.date,
   df.sales,
   avg(df.sales).over.partitionBy(df.store)
     .orderBy(df.store)
     .rowsFollowing(0)  // this means from unbounded preceding to current row
 )
 {code}


