Threading/parallelism is up to the runner and does not map 1-1 to the java memory model since most runners will execute in a distributed manner. In general, runners will attempt to break up the work as evenly as possible and schedule work across multiple machines / cpu cores at all times to maximize the throughput / minimize time for execution of the job This is abstracted away much by getting users to write DoFns that apply with ParDo. Please take a look at this explanation about ParDo ( https://cloud.google.com/dataflow/model/par-do) to get a better understanding of its usage and as a place to look at some examples.
On Mon, Jun 20, 2016 at 12:44 PM, amir bahmanyari <[email protected]> wrote: > Thanks JB. > I am executing FlinkPipelineRunner...& later will experirnt the same with > SparkRunner....any examples pls? > Cheers > > ------------------------------ > *From:* Jean-Baptiste Onofré <[email protected]> > *To:* [email protected] > *Sent:* Monday, June 20, 2016 12:35 PM > *Subject:* Re: Multi-threading implementation equivalence in Beam > > Hi Amir, > > the DirectPipelineRunner uses multi-thread to achieve ParDo execution > for instance. > > You mean example of Beam pipeline code ? Or example of runner ? > > Regards > JB > > On 06/20/2016 09:25 PM, amir bahmanyari wrote: > > Hi Colleagues, > > Hope you all had a great weekend. Another novice question :-) > > Is there a pipeline parallelism/threading model provide by Beam that > > equates the multi-threading model in Java for instance? > > Any examples if so? > > Thanks again, > > Amir > > > -- > Jean-Baptiste Onofré > [email protected] > http://blog.nanthrax.net > Talend - http://www.talend.com > > > >
