[ https://issues.apache.org/jira/browse/BEAM-91?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15345476#comment-15345476 ]
Matt Pouttu-Clarke commented on BEAM-91:
----------------------------------------
Yes, agreed: it is not yet clear from the docs how this relates directly to Beam.
However, this is mainly a terminology issue from my perspective. The bespoke
systems I have built over the last few years to unify batch and stream
processing all rely on data versioning to ensure point-in-session consistency
(watermarks) across streams and all data derived from streams, such as
aggregates, transforms, splits, and replicas.
There is no hard dependency on a configuration service, but it is critical to
keep the current watermark and all historical watermarks in a system of record.
This could be as simple as a shared file system or as complex as etcd.
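To make the "shared file system" end of that range concrete, here is a minimal
sketch of a watermark system of record. It is an illustration only, not code
from any of the systems mentioned above: the names WatermarkLog and version_at,
the JSON-lines file layout, and the integer version and watermark values are
all assumptions made for the example.

```python
import json
import os
import tempfile
from pathlib import Path


class WatermarkLog:
    """Append-only watermark record for one stream (hypothetical helper)."""

    def __init__(self, path):
        # Assumption: the path lives on a file system shared by all readers.
        self.path = Path(path)

    def history(self):
        """Return every recorded watermark, oldest first."""
        if not self.path.exists():
            return []
        with self.path.open() as f:
            return [json.loads(line) for line in f if line.strip()]

    def current(self):
        """Return the latest record, or None if nothing has been recorded."""
        records = self.history()
        return records[-1] if records else None

    def advance(self, version, watermark):
        """Append a new record; the watermark must not move backwards."""
        latest = self.current()
        if latest is not None and watermark < latest["watermark"]:
            raise ValueError("watermark may not move backwards")
        record = {"version": version, "watermark": watermark}
        with self.path.open("a") as f:
            f.write(json.dumps(record) + "\n")


def version_at(log, timestamp):
    """Return the newest data version whose watermark is <= timestamp."""
    match = None
    for record in log.history():
        if record["watermark"] <= timestamp:
            match = record["version"]
    return match


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log = WatermarkLog(os.path.join(tmp, "stream-a.jsonl"))
        log.advance(version=1, watermark=100)
        log.advance(version=2, watermark=250)
        print(log.current())
        print(version_at(log, 180))
```

Because the history is kept rather than overwritten, a derived dataset such as
an aggregate can be tied to the data version that was complete at a given
point in time. A real deployment would also need to handle concurrent writers,
which this sketch does not attempt.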
That aside, the versioning model I put forward for FlatBuffers is an example
using more recent technologies. I have done the same with relational tables and
Avro in the past.
I'll work on examples of how the versioning model feeds aggregate refresh,
and hopefully it will become clearer.
> Retractions
> -----------
>
> Key: BEAM-91
> URL: https://issues.apache.org/jira/browse/BEAM-91
> Project: Beam
> Issue Type: New Feature
> Components: beam-model
> Reporter: Tyler Akidau
> Assignee: Frances Perry
> Original Estimate: 672h
> Remaining Estimate: 672h
>
> We still haven't added retractions to Beam, even though they're a core part
> of the model. We should document all the necessary aspects (uncombine,
> reverting DoFn output with DoOvers, sink integration, source-level
> retractions, etc), and then implement them.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)