Thanks for your advice. However, I'm using batch processing. Does anyone have a solution for the batch processing case?
Best,
Rico

On 19.11.2018 at 09:43, Magnus Nilsson wrote:

> I had the same requirements. As far as I know the only way is to
> extend the ForeachWriter, cache the micro-batch result, and write to
> each output.
>
> https://docs.databricks.com/spark/latest/structured-streaming/foreach.html
>
> Unfortunately it seems as if you have to make a new connection per
> batch instead of creating one long-lasting connection for the
> pipeline as such, i.e. you might have to implement some sort of
> connection pooling yourself, depending on the sink.
>
> Regards,
>
> Magnus
>
> On Mon, Nov 19, 2018 at 9:13 AM Dipl.-Inf. Rico Bergmann
> <i...@ricobergmann.de> wrote:
>
> Hi!
>
> I have a Spark SQL program with one input and six outputs (writes).
> When executing this program, every call to write(...) executes the
> plan. My problem is that I want all these writes to happen in
> parallel (inside one execution plan), because all writes share a
> common and compute-intensive subpart that could be shared by all
> plans. Is there a way to do this? (Caching is not a solution because
> the input dataset is way too large...)
>
> Hoping for advice ...
>
> Best,
> Rico B.
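For reference, my understanding of the streaming-side approach Magnus describes is roughly the following sketch, using foreachBatch (available since Spark 2.4) rather than a hand-rolled ForeachWriter to cache each micro-batch once and write it to several sinks. The Kafka source, the shared transformation, and the output paths are placeholders, not details from the actual job:

// Sketch only: source, columns, and paths below are hypothetical.
import org.apache.spark.sql.{DataFrame, SparkSession}

val spark = SparkSession.builder().appName("multi-sink-example").getOrCreate()

// Placeholder streaming source
val input: DataFrame = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "host:9092")
  .option("subscribe", "events")
  .load()

// The expensive transformation shared by all outputs
val shared: DataFrame = input.selectExpr("CAST(value AS STRING) AS value")

val query = shared.writeStream
  .foreachBatch { (batch: DataFrame, batchId: Long) =>
    batch.persist()                                    // materialize the shared part once per micro-batch
    batch.write.mode("append").parquet("/sinks/out1")  // first output
    batch.write.mode("append").parquet("/sinks/out2")  // second output (repeat for the remaining sinks)
    batch.unpersist()
  }
  .option("checkpointLocation", "/tmp/checkpoints/multi-sink")
  .start()

query.awaitTermination()

The persist/unpersist pair makes the shared transformation run only once per micro-batch instead of once per sink. What I'm still missing is the equivalent for a plain batch job.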