Hello,
I have a job configured the following way:
for (String path : paths) {
    PCollection<String> col = pipeline.readTextFile(path);
    col.parallelDo(new MyDoFn(path), Writables.strings())
       .write(To.textFile("out/" + path), Target.WriteMode.APPEND);
}
pipeline.done();
It results in one Spark job for each path, and the jobs run in sequence even
though there are no dependencies between them. Is it possible to have the jobs
run in parallel?
Thanks,
Ben
