Hi,
the PR was merged to master and a few follow-up issues were created,
mainly [1] and [2]. I didn't find any reference to SortedMapState in
JIRA; is there a tracking issue for it that I can link to? I also
added a link to the design document here [3].
[1] https://issues.apache.org/jira/browse/BEAM-9256
[2] https://issues.apache.org/jira/browse/BEAM-9257
[3] https://cwiki.apache.org/confluence/display/BEAM/Design+Documents
On 1/30/20 1:39 PM, Jan Lukavský wrote:
Hi,
PR [1] (issue [2]) went through code review and, according to [3], seems
to me to be ready to merge. The current state of the implementation is
that it is supported only for the direct runner, the legacy Flink runner
(batch and streaming) and the legacy Spark runner (batch). It could be
supported by all other (streaming) runners using StatefulDoFnRunner,
provided the runner can make guarantees about the ordering of timer
firings (which is unfortunately the case only for the legacy Flink and
direct runners, at least for now - related issues are mentioned multiple
times on other threads). Implementation for other batch runners should
be as straightforward as adding sorting by event timestamp before the
stateful DoFn. Where the runner already sorts (e.g. Dataflow), the
annotation can simply be ignored, hence support for batch Dataflow
seems to be a no-op.
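To illustrate what "sorting by event timestamp before the stateful DoFn"
could look like for a batch runner, here is a minimal sketch in plain
Python. It does not use any Beam API; Element, run_stateful_time_sorted
and emit_deltas are hypothetical names made up for this illustration, and
a real runner would do the grouping and sorting in its own shuffle.

```python
from typing import Callable, Dict, Iterable, List, NamedTuple, Tuple

# Hypothetical element type: a keyed value with an event-time timestamp.
class Element(NamedTuple):
    key: str
    value: int
    timestamp: int  # event time, e.g. millis since epoch (assumption)

Output = Tuple[str, int, int]

def run_stateful_time_sorted(
    elements: Iterable[Element],
    process: Callable[[Element, Dict], List[Output]],
) -> List[Output]:
    """Group by key, sort each group by event timestamp, then feed the
    elements one by one to a stateful function with per-key state."""
    by_key: Dict[str, List[Element]] = {}
    for element in elements:
        by_key.setdefault(element.key, []).append(element)

    outputs: List[Output] = []
    for group in by_key.values():
        state: Dict = {}  # per-key state, visible only to this key
        for element in sorted(group, key=lambda e: e.timestamp):
            outputs.extend(process(element, state))
    return outputs

def emit_deltas(element: Element, state: Dict) -> List[Output]:
    """Order-sensitive stateful logic: emit the difference between the
    current value and the previous one (in event time) for the same key."""
    previous = state.get("prev")
    state["prev"] = element.value
    if previous is None:
        return []
    return [(element.key, element.timestamp, element.value - previous)]

# Input arrives out of order; the sort makes the result deterministic.
events = [
    Element("a", 5, 30),
    Element("a", 1, 10),
    Element("b", 7, 20),
    Element("a", 3, 20),
]
result = run_stateful_time_sorted(events, emit_deltas)
```

A runner that already delivers each key's elements in timestamp order
could skip the sorted() call, which is the sense in which the annotation
would be a no-op there.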
There has been some slight controversy about this feature, but the
current guidelines for proposing and implementing features do not cover
how to resolve such disagreements, so I'm using this opportunity to let
the community know that the plan is to merge this feature in the second
half of next week, unless there is a veto (in that case, please provide
specific reasons for it).
Thanks,
Jan
[1] https://github.com/apache/beam/pull/8774
[2] https://issues.apache.org/jira/browse/BEAM-8550
[3] https://beam.apache.org/contribute/committer-guide/