Oh, I missed your question about which one is better. It really depends on your use case. If the data is homogeneous and you want to write to the same IO, I don't see a reason not to Flatten them into one PCollection. If you want to write files-to-files and Kafka-to-Kafka, you might be better off with two separate pipelines, batch and streaming. And to make things even more elegant, you could "compact" your (common) series of transformations into a single composite transform, so that you end up with something like:

    lines.apply(MyComposite)
    moreLines.apply(MyComposite)

The composite transforms section of the programming guide is still under construction; it should be available here once ready:
https://beam.apache.org/documentation/programming-guide/#transforms-composite

On Wed, Feb 15, 2017 at 10:28 AM Amit Sela <[email protected]> wrote:

> You can write one pipeline and simply replace the IO, for example:
>
> To read from (text) files you can use:
>
>     PCollection<String> lines =
>         p.apply(TextIO.Read.from("file://some/inputData.txt"));
>
> and from Kafka (I'm adding a generic key here because Kafka messages are
> keyed):
>
>     PCollection<KV<K, String>> moreLines = p.apply(
>         KafkaIO.<K, String>read()
>             .withBootstrapServers("brokers.list")
>             .withTopics("topic-list")
>             .withKeyCoder(Coder<K>)
>             .withValueCoder(StringUtf8Coder.of()));
>
> Now you can apply the same code to both PCollections, or (as you
> mentioned) you can Flatten them together into one PCollection (after
> removing the keys from the Kafka-read PCollection) and apply the
> transformations you want.
>
> You might find the IO section in the programming guide useful:
> https://beam.apache.org/documentation/programming-guide/#io
>
>
> On Wed, Feb 15, 2017 at 10:13 AM ankit beohar <[email protected]>
> wrote:
>
> Hi All,
>
> I have a use case where I have kafka and flat files so can I write one code
> and run for both or I have to create two different pipelines or use
> pipeline join in a one pipeline.
>
> Which one is better?
>
> Best Regards,
> ANKIT BEOHAR
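
To make the two options in this thread concrete, here is a rough plain-Java analogy using only java.util streams. It is not Beam code. MY_COMPOSITE stands in for the composite transform, dropKeys for removing the Kafka keys, and Stream.concat for Flatten. The class name, the helper names and the trim/filter/upper-case steps are all hypothetical placeholders for whatever your common transformations are.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class CompositeSketch {

    // Stand-in for "MyComposite": the common series of steps bundled into
    // one reusable unit. The individual steps are placeholders.
    static final Function<Stream<String>, Stream<String>> MY_COMPOSITE =
        in -> in.map(String::trim)
                .filter(s -> !s.isEmpty())
                .map(String::toUpperCase);

    // Stand-in for removing the keys from the Kafka-read collection.
    static Stream<String> dropKeys(List<Map.Entry<String, String>> keyed) {
        return keyed.stream().map(Map.Entry::getValue);
    }

    // Option 1: apply the same composite to each source separately.
    static List<String> runOn(Stream<String> source) {
        return MY_COMPOSITE.apply(source).collect(Collectors.toList());
    }

    // Option 2: "flatten" both sources into one, then apply the composite once.
    static List<String> runFlattened(List<String> lines,
                                     List<Map.Entry<String, String>> moreLines) {
        Stream<String> merged = Stream.concat(lines.stream(), dropKeys(moreLines));
        return MY_COMPOSITE.apply(merged).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for file lines and Kafka records.
        List<String> lines = Arrays.asList(" first line ", "");
        List<Map.Entry<String, String>> moreLines = new ArrayList<>();
        moreLines.add(new SimpleEntry<>("k1", "kafka message"));

        System.out.println(runOn(lines.stream()));
        System.out.println(runOn(dropKeys(moreLines)));
        System.out.println(runFlattened(lines, moreLines));
    }
}
```

Either way the transformation logic lives in one place. The only difference is whether the sources are merged before or after it runs, which mirrors the choice between one flattened pipeline and two separate ones.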
