[
https://issues.apache.org/jira/browse/FLINK-2666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14936933#comment-14936933
]
ASF GitHub Bot commented on FLINK-2666:
---------------------------------------
GitHub user aljoscha opened a pull request:
https://github.com/apache/flink/pull/1201
[FLINK-2666] Add timestamp extraction operator
This adds a user function TimestampExtractor and an operator
ExtractTimestampsOperator that can be used to extract timestamps and
attach them to elements to do event-time windowing.
Users can either use an AscendingTimestampExtractor that assumes that
timestamps are monotonically increasing. (This allows it to derive the
watermark very easily.) Or they use a TimestampExtractor, where they
also have to provide the watermark.
The ExtractTimestampOperator periodically (on the auto watermark
interval) calls the extractor to get the current watermark and forwards
it.
This also adds an ITCase for this behaviour.
(Ignore the other two commits, I'm just basing my work on top of these)
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/aljoscha/flink timestamp-extractor
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/1201.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #1201
----
commit 95eabefc7526aeb7248c30a09782253ac4e9ad99
Author: Aljoscha Krettek <[email protected]>
Date: 2015-09-29T18:22:11Z
[hotfix] Simplify new windowing API
Before, there would be three different window() methods on
KeyedDataStream: one that takes two policies, one that takes one policy
and one that takes a window assigner.
Now, there is only one window() method that takes a window assigner and
creates a KeyedWindowDataStream.
For conveniece, there are two methods timeWindows() that take either one
argument (tumbling windows) or two arguments (sliding windows). These
create a KeyedWindowDataStream with either a SlidingWindows or
TumblingWindows assigner.
When the window operator is created we pick the optimized aligned time
windows operator if the combination of window assigner/trigger/evictor
allows it.
All of this behaviour is verified in tests.
commit e42a6a92cedea7d2c8f12855f8eab6609bd1ba60
Author: Aljoscha Krettek <[email protected]>
Date: 2015-09-29T18:29:09Z
[FLINK-2778] Add API for non-parallel non-keyed Windows
This adds two new operators for non-keyed windows: Regular trigger
operator and evicting trigger operator.
This also adds the API calls nonParallelWindow(...) on DataStream and
the API class NonParallelWindowDataStream for representing these
operations.
This also adds tests for both the operators and the translation from API
to operators.
commit 81385e68a2a9ac454eaaca63609516e36c5e0af1
Author: Aljoscha Krettek <[email protected]>
Date: 2015-09-30T13:05:13Z
[FLINK-2666] Add timestamp extraction operator
This adds a user function TimestampExtractor and an operator
ExtractTimestampsOperator that can be used to extract timestamps and
attach them to elements to do event-time windowing.
Users can either use an AscendingTimestampExtractor that assumes that
timestamps are monotonically increasing. (This allows it to derive the
watermark very easily.) Or they use a TimestampExtractor, where they
also have to provide the watermark.
The ExtractTimestampOperator periodically (on the auto watermark
interval) calls the extractor to get the current watermark and forwards
it.
This also adds an ITCase for this behaviour.
----
> Allow custom Timestamp extractors for Flink sources
> ---------------------------------------------------
>
> Key: FLINK-2666
> URL: https://issues.apache.org/jira/browse/FLINK-2666
> Project: Flink
> Issue Type: Improvement
> Components: Streaming
> Reporter: Gyula Fora
> Assignee: Aljoscha Krettek
> Priority: Minor
>
> When record timestamps are turned on users currently have 2 ways of
> specifying record timestamps.
> They can either chose to automatically attach ingress timestamps (and send
> watermarks), or custom implement a sourcefunction to manually assign
> timestamps and emit watermarks.
> It would be good if users could define a Timestamp extractor function that
> will attach a timestamp for every record generated using any of the current
> Flink sources. Also watermarks for these records should be automatically
> generated based on the extracted event time (assuming monotonicity per
> source) periodically.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)