Hello,

I have a job configured the following way:
for (String path : paths) {
    PCollection<String> col = pipeline.readTextFile(path);
    col.parallelDo(new MyDoFn(path), Writables.strings())
       .write(To.textFile("out/" + path), Target.WriteMode.APPEND);
}
pipeline.done();
It results in one Spark job for each path, and the jobs run in sequence even
though there are no dependencies between them. Is it possible to have the jobs
run in parallel?
Thanks,
Ben
