Tudor Miu created SPARK-21641:
---------------------------------

             Summary: Combining windowing (groupBy) and mapGroupsWithState (groupByKey) in Spark Structured Streaming
                 Key: SPARK-21641
                 URL: https://issues.apache.org/jira/browse/SPARK-21641
             Project: Spark
          Issue Type: Improvement
          Components: Structured Streaming
    Affects Versions: 2.2.0
            Reporter: Tudor Miu

Given a stream of timestamped data with watermarking, there seems to be no way to combine (1) the {{groupBy}} operation, which provides windowing by the timestamp field together with other grouping criteria, with (2) the {{groupByKey}} operation, which is needed to apply {{mapGroupsWithState}} to the groups for custom sessionization.

For context:
- calling {{groupBy}}, which supports windowing, on a Dataset returns a {{RelationalGroupedDataset}}, which does not have {{mapGroupsWithState}}.
- calling {{groupByKey}}, which supports {{mapGroupsWithState}}, returns a {{KeyValueGroupedDataset}}, but that has no support for windowing.

The suggestion is to _somehow_ unify the two APIs.
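The two code paths described in this issue can be sketched in Scala roughly as follows. This is only an illustration: the {{Event}}, {{RunningCount}} and {{UserCount}} types, the column names, the window and watermark durations, and the counting logic are all hypothetical and not taken from the report. The point is only that each grouping call returns a different type with a different set of operations.

```scala
import java.sql.Timestamp

import org.apache.spark.sql.{DataFrame, Dataset}
import org.apache.spark.sql.functions.{col, window}
import org.apache.spark.sql.streaming.{GroupState, GroupStateTimeout}

// Hypothetical record types, used only for this illustration.
case class Event(userId: String, ts: Timestamp, value: Long)
case class RunningCount(count: Long)
case class UserCount(userId: String, count: Long)

object GroupingPaths {

  // Path (1): groupBy with window(...) returns a RelationalGroupedDataset.
  // Built-in aggregations such as count() are available on it,
  // but mapGroupsWithState is not.
  def windowedCounts(events: Dataset[Event]): DataFrame =
    events
      .withWatermark("ts", "10 minutes")
      .groupBy(window(col("ts"), "5 minutes"), col("userId"))
      .count()

  // Path (2): groupByKey returns a KeyValueGroupedDataset, which offers
  // mapGroupsWithState. The key comes from a plain function on the record,
  // so the window(...) column expression cannot be used here.
  def perUserCounts(events: Dataset[Event]): Dataset[UserCount] = {
    // Encoders for the key, state and output types.
    import events.sparkSession.implicits._
    events
      .withWatermark("ts", "10 minutes")
      .groupByKey(_.userId)
      .mapGroupsWithState[RunningCount, UserCount](GroupStateTimeout.NoTimeout) {
        (userId: String, rows: Iterator[Event], state: GroupState[RunningCount]) =>
          // Add the rows seen in this trigger to the count kept in state.
          val previous = state.getOption.map(_.count).getOrElse(0L)
          val updated = RunningCount(previous + rows.size)
          state.update(updated)
          UserCount(userId, updated.count)
      }
  }
}
```

Neither function can be extended to do the other's job directly: {{windowedCounts}} has no hook for user-defined state, and {{perUserCounts}} has no built-in notion of a time window, which is the gap this issue asks to close.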