I'd just like to close the loop. Josh, did you get an answer/guidance on how to proceed with your pipeline? Or maybe we'll need a new thread to figure that out :)
Best
-P.

On Fri, Mar 9, 2018 at 1:39 PM Josh Ferge <josh.fe...@bounceexchange.com> wrote:

> Hello all:
>
> Our team has a pipeline that makes external network calls. These pipelines
> are currently super slow, and the hypothesis is that they are slow because
> we are not threading for our network calls. The GitHub issue below provides
> some discussion around this:
>
> https://github.com/apache/beam/pull/957
>
> In Beam 1.0, there was IntraBundleParallelization, which helped with this.
> However, this was removed because it didn't comply with a few Beam
> paradigms.
>
> Questions going forward:
>
> What is advised for jobs that make blocking network calls? It seems
> bundling the elements into groups of size X prior to passing to the DoFn,
> and managing the threading within the function might work. Thoughts?
> Are these types of jobs even suitable for Beam?
> Are there any plans to develop features that help with this?
>
> Thanks
>
--
Got feedback? go/pabloem-feedback
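
For reference, the approach floated in the quoted message (group elements into batches of size X, then manage the threading inside the function) could look roughly like the sketch below. It uses only the Python standard library and is not Beam code: `batched`, `process_batch`, and `fake_network_call` are hypothetical names, and `fake_network_call` stands in for whatever blocking request the pipeline makes. In a real pipeline the body of `process_batch` would presumably sit inside the DoFn's process method, with the batching done upstream.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
import time


def batched(elements, size):
    """Yield successive lists of at most `size` elements."""
    it = iter(elements)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def fake_network_call(element):
    """Hypothetical stand-in for a blocking external call."""
    time.sleep(0.01)  # simulate network latency
    return element * 2


def process_batch(batch, call=fake_network_call, max_workers=8):
    """Run `call` on every element of one batch using a thread pool.

    Results come back in input order; an exception raised by any
    call propagates to the caller.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(call, batch))


if __name__ == "__main__":
    for batch in batched(range(10), 4):
        print(process_batch(batch))
```

The pool is created and shut down per batch, so no threads outlive the call that started them. That keeps the sketch simple, at the cost of some thread start-up overhead on every batch.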