As a bonus, here is a simplified diagram view of the use-case:

Cheers,
Pascal

On Thu, Aug 16, 2018 at 3:12 PM, Pascal Gula <[email protected]> wrote:

> Hello,
> I am currently evaluating Apache Beam (to be executed later on Google
> Dataflow), and for the first use case I am working on, I have a design
> question, in case any of you has already faced a similar one.
> Namely, we have a DB describing dashboard views, and for each view, we
> would like to perform an aggregation transform.
> My first approach would be to create a higher-level pipeline that
> fetches all view configurations from our MongoDB (BTW, we released a
> MongoDB IO connector here: https://pypi.org/project/beam-extended/).
> With this PCollection of views, the idea is to have a ParDo with a DoFn
> that creates a sub-pipeline to perform the aggregation on data from our
> plant database, with a query derived from the view configuration.
> Afterwards, the idea is to save, in the higher-level pipeline, some
> performance/data metrics related to the execution of the array of
> sub-pipelines.
> The main question is: are nested pipelines supported by the runner?
> I hope that my description was clear enough. I will work on a diagram
> view in the meantime.
> Very best regards,
> Pascal
>
> --
> Pascal Gula
> Senior Data Engineer / Scientist
> +49 (0)176 34232684
> www.plantix.net <http://plantix.net/>
> PEAT GmbH
> Kastanienallee 4
> 10435 Berlin // Germany
> Download the App! <https://play.google.com/store/apps/details?id=com.peat.GartenBank>
