Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/12008#discussion_r57620239
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
@@ -1585,3 +1586,110 @@ object ResolveUpCast extends Rule[LogicalPlan] {
}
}
}
+
+/**
+ * Maps a time column to multiple time windows using the Expand operator. Since it's non-trivial to
+ * figure out how many windows a time column can map to, we over-estimate the number of windows and
+ * filter out the rows where the time column is not inside the time window.
+ */
+object TimeWindowing extends Rule[LogicalPlan] {
+
+ /**
+ * Depending on the operation, the TimeWindow expression may be wrapped in an Alias (in case of
+ * projections) or be simply by itself (in case of groupBy).
+ * @param f The function that we want to apply on the TimeWindow expression
+ * @return The user defined function applied on the TimeWindow expression
+ */
+ private def getWindowExpr[E](f: TimeWindow => E): PartialFunction[Expression, E] = {
--- End diff --
Usually we would do this as an extractor (an unapply method). I think it
looks a little less magical that way.
However, I'm worried that we are losing the alias. Can you add a test like:
```scala
df.groupBy(window(...).as("time")).agg(count("*")).select($"time")
```
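For illustration, here is a minimal, self-contained sketch of the extractor approach being suggested. The class and object names are hypothetical stand-ins, not Spark's actual Catalyst classes; the point is that an `unapply` can match a `TimeWindow` whether or not it is wrapped in an `Alias`, and surface the alias so the rewrite rule does not drop it:

```scala
// Hypothetical, simplified stand-ins for Catalyst expressions (illustration only).
sealed trait Expression
case class TimeWindow(column: String, windowDuration: Long) extends Expression
case class Alias(child: Expression, name: String) extends Expression
case class Literal(value: Any) extends Expression

object WindowExpression {
  // Extractor: returns the TimeWindow together with the optional enclosing
  // alias name, so a rule can re-attach the alias instead of losing it.
  def unapply(e: Expression): Option[(TimeWindow, Option[String])] = e match {
    case tw: TimeWindow              => Some((tw, None))
    case Alias(tw: TimeWindow, name) => Some((tw, Some(name)))
    case _                           => None
  }
}

// Usage: pattern matching reads declaratively, instead of threading a
// higher-order function through a PartialFunction.
def describe(e: Expression): String = e match {
  case WindowExpression(tw, Some(name)) => s"window on ${tw.column} aliased as $name"
  case WindowExpression(tw, None)       => s"bare window on ${tw.column}"
  case _                                => "not a window"
}
```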