wilsonwang371 commented on issue #18479: URL: https://github.com/apache/beam/issues/18479#issuecomment-1340369856
> There are several requirements that must be met. Essentially:
> a) per-partition order of records must be preserved (i.e. records emitted in order from one distributed producer must not overtake each other when consumed)
> b) the producer must be able to enqueue output records for a specific consumer (e.g. by assigning a key to an output record; all records with the same key must then be consumed by the same instance of the downstream consumer)
> c) the producer must be able to send a record to all downstream consumers (i.e. the producer must know how many consumers there may be)
> d) there must be some kind of support for state commit, whether at the end of a bundle, during bundle commit (Dataflow model), or as a flowing checkpoint barrier (Flink model); there must be a way to safely store state in distributed, fault-tolerant storage and to restore the complete state from that committed state
>
> Given these conditions, I think it should be possible (though quite hard) to implement a Beam runner on top of it. Kafka definitely has all four (even without Kafka Streams).

Thank you so much for the reply.
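
For illustration only, requirements a) to c) from the quoted list can be sketched with a toy in-memory partitioned log. This is a minimal sketch, not Beam or Kafka code: the class `ToyPartitionedLog`, its methods, and the sample keys and values are all hypothetical names chosen for this example, and requirement d) (fault-tolerant state commit) is deliberately left out.

```python
import zlib


class ToyPartitionedLog:
    """In-memory stand-in for a partitioned queue (hypothetical; not a Beam or Kafka API)."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        # One append-only list per partition; append order is read order (requirement a).
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, key):
        # Stable hash, so the same key always maps to the same partition (requirement b).
        return zlib.crc32(key.encode("utf-8")) % self.num_partitions

    def send(self, key, value):
        # Route a keyed record to exactly one partition and report which one.
        p = self.partition_for(key)
        self.partitions[p].append((key, value))
        return p

    def broadcast(self, value):
        # The producer knows the partition count, so it can reach every consumer (requirement c).
        for partition in self.partitions:
            partition.append((None, value))

    def read(self, partition, offset=0):
        # A consumer reads one partition from a given offset, in append order.
        return self.partitions[partition][offset:]


log = ToyPartitionedLog(num_partitions=3)
for i in range(4):
    log.send("user-1", i)
log.send("user-2", "x")
log.broadcast("marker")  # sample broadcast value, assumed for the example

user1_partition = log.partition_for("user-1")
user1_values = [v for k, v in log.read(user1_partition) if k == "user-1"]
```

In this sketch, ordering holds only within a partition, which matches the "per-partition" wording of requirement a); nothing is guaranteed about the relative order of records in different partitions.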
