Oh, missed your question on which one is better.... it really depends on
your use case.
If the data is homogenous, and you want to write to the same IO, I don't
see a reason not to Flatten them into one PCollection.
If you want to write files-to-files and Kafka-to-Kafka you might be better
off with two separate pipelines, batch and streaming. And to make things
even more elegant you could "compact" your (common) series of
transformations into a single composite transform such that you end-up with
something like:

*lines.apply(MyComposite)*
*moreLines.apply(MyComposite)*

Composite transforms programming guide is still under construction, should
be available here once ready :
https://beam.apache.org/documentation/programming-guide/#transforms-composite


On Wed, Feb 15, 2017 at 10:28 AM Amit Sela <[email protected]> wrote:

> You can write one pipeline and simply replace the IO, for example:
>
> To read from (text) files you can use:
> *PCollection<String> lines =
> p.apply(TextIO.Read.from("file://some/inputData.txt"));    *
>
> and from Kafka (I'm adding a generic key here because Kafka messages are
> keyed):
> *PCollection<KV<K, String>> moreLines = p,apply(*
> *    KafkaIO.<K, String>read()*
> *        .withBootstrapServers("brokers.list")*
> *        .withTopics("topic-list")*
> *        .withKeyCoder(Coder<K>)*
> *        .withValueCoder(StringUtf8Coder.of()));*
>
> Now you can apply the same code to both PCollections, or (as you
> mentioned) you can Flatten the together into one PCollection (after
> removing the keys from Kafka-read PCollection) and apply the
> transformations you want.
>
> You might find the IO section in the programming guide useful:
> https://beam.apache.org/documentation/programming-guide/#io
>
>
> On Wed, Feb 15, 2017 at 10:13 AM ankit beohar <[email protected]>
> wrote:
>
> Hi All,
>
> I have a use case where I have kafka and flat files so can I write one code
> and run for both or I have to create two different pipelines or use
> pipeline join in a one pipeline.
>
> Which one is better?
>
> Best Regards,
> ANKIT BEOHAR
>
>

Reply via email to