Hello Beam folks,

We are evaluating a new solution to unify our streaming and batch data pipelines, from storage and compute engine to programming model. The idea is basically to implement the Kappa architecture: DistributedLog as a unified stream store for both streaming and batch, Flink or Spark (still under debate) as the processing engine, and Beam as the programming model.
We'd like to contribute an IO connector for DistributedLog (both bounded source/sink and unbounded source/sink). Are there any special instructions or best practices for adding a new IO connector? Any suggestions would be much appreciated.

The JIRA is here: https://issues.apache.org/jira/browse/BEAM-607

Also, /cc the DistributedLog team for any help.

KN