Thanks !
I will do the review today.
Regards
JB
On 11/30/2016 01:04 PM, Khurrum Nasim wrote:
I just sent a pull request for adding a bounded source to Beam for reading
distributedlog streams - https://github.com/apache/incubator-beam/pull/1464
Appreciate any review comments.
- KN
On Wed, Aug 31, 2016 at 2:10 AM, Jean-Baptiste Onofré <[email protected]>
wrote:
Hi Khurrum,
I already replied in the Jira this morning.
To write the IO, the first question is bounded or unbounded and which
features you want to provide.
An IO could be a wrapper to a simple DoFn.
If you want provide advanced features like:
- watermark/skew management for unbounded source
- estimated size and split for bounded source
then you can use the Source API.
You can take a look on the existing IO:
- JMS, Kafka, PubSub for unbounded
- Bigtable, MongoDB for bounded
We are preparing some documentation on the Beam website about that.
In the mean time, you can take a look on the Dataflow Custom IO
documentation:
https://cloud.google.com/dataflow/model/custom-io-java
It's basically the same as in Beam.
Anyway, please, let me know, I would be more than happy to help you on
this !
I'm looking forward working with you on this !
Regards
JB
On 08/31/2016 11:02 AM, Khurrum Nasim wrote:
Hello beam folks,
We are evaluating a new solution to unify our streaming and batching data
pipeline, from storage, computing engine to programming model. The idea is
basically to implement the Kappa architecture, using DistributedLog as a
unified stream store for both streaming and batching, using Flink or Spark
(still debating) as the process engine, and using Beam as the programming
model.
We'd like to contribute an IO connector to DistributedLog (both bounded
source/sink and unbounded source/sink).
Is there any special instructions or best practise to add a new IO
connector? Any suggestion is very appreciated.
The jira is here: https://issues.apache.org/jira/browse/BEAM-607
Also, /cc the distributed log team for any helps.
KN
--
Jean-Baptiste Onofré
[email protected]
http://blog.nanthrax.net
Talend - http://www.talend.com
--
Jean-Baptiste Onofré
[email protected]
http://blog.nanthrax.net
Talend - http://www.talend.com