Thanks for your advice. But I'm using batch processing. Does anyone have
a solution for the batch processing case?
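
For context, the shape of the job is roughly like the sketch below (paths,
column names and the aggregation are placeholders, not the actual code); the
comments mark where persisting the shared intermediate result to disk would
avoid recomputing it for every write, if materializing it once is acceptable:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._
import org.apache.spark.storage.StorageLevel

object MultiOutputBatch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("multi-output-batch").getOrCreate()
    import spark.implicits._

    // Placeholder input; the real input is far too large to cache in memory.
    val input = spark.read.parquet("/data/input")

    // Compute-intensive common subpart shared by all six outputs (placeholder aggregation).
    val shared = input.groupBy("key").agg(sum("value").as("total"))

    // Persisting the (smaller) shared result to disk would avoid re-running
    // the aggregation for every write, if materializing it once is acceptable:
    // shared.persist(StorageLevel.DISK_ONLY)

    // Each write() is a separate action, so each one executes the full plan,
    // including the aggregation above.
    shared.filter($"total" > 0).write.mode("overwrite").parquet("/out/a")
    shared.filter($"total" <= 0).write.mode("overwrite").parquet("/out/b")
    // ... four more writes in the real job

    spark.stop()
  }
}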

Best,

Rico.


On 19.11.2018 at 09:43, Magnus Nilsson wrote:
>
> I had the same requirements. As far as I know, the only way is to
> extend the ForeachWriter, cache the micro-batch result, and write to
> each output.
>
> https://docs.databricks.com/spark/latest/structured-streaming/foreach.html
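>
> A rough sketch of the kind of writer I mean (the sink client type and the
> URLs are placeholders; ForeachWriter is invoked per partition and per epoch,
> so the connections here are short-lived):
>
> import org.apache.spark.sql.{ForeachWriter, Row}
>
> // Placeholder client standing in for a real sink connection.
> class SinkConnection(url: String) {
>   def write(row: Row): Unit = println(s"$url <- $row")
>   def close(): Unit = ()
> }
>
> // Writes every row of each micro-batch to two outputs.
> class MultiSinkWriter extends ForeachWriter[Row] {
>   private var sinkA: SinkConnection = _
>   private var sinkB: SinkConnection = _
>
>   override def open(partitionId: Long, epochId: Long): Boolean = {
>     // Called once per partition per epoch, hence "per batch" connections.
>     sinkA = new SinkConnection("jdbc://sink-a")   // placeholder URLs
>     sinkB = new SinkConnection("jdbc://sink-b")
>     true
>   }
>
>   override def process(row: Row): Unit = {
>     sinkA.write(row)
>     sinkB.write(row)
>   }
>
>   override def close(errorOrNull: Throwable): Unit = {
>     if (sinkA != null) sinkA.close()
>     if (sinkB != null) sinkB.close()
>   }
> }
>
> // Usage: df.writeStream.foreach(new MultiSinkWriter).start()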
>
> Unfortunately it seems as if you have to make a new connection "per
> batch" instead of creating one long-lasting connection for the pipeline
> as such, i.e. you might have to implement some sort of connection
> pooling yourself, depending on the sink.
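>
> A very rough sketch of the kind of per-executor pooling I mean (again, the
> client type and URL are placeholders):
>
> // One lazily-created, long-lived client per executor JVM, reused across
> // partitions and micro-batches instead of reconnecting in every open().
> class PooledClient(url: String) {
>   def send(s: String): Unit = println(s"$url <- $s")
> }
>
> object SinkPool {
>   lazy val client: PooledClient = new PooledClient("jdbc://sink-a") // placeholder URL
> }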
>
> Regards,
>
> Magnus
>
>
> On Mon, Nov 19, 2018 at 9:13 AM Dipl.-Inf. Rico Bergmann
> <i...@ricobergmann.de> wrote:
>
>     Hi!
>
>     I have a Spark SQL program with one input and 6 outputs (writes).
>     When executing this program, every call to write() executes the
>     plan. My problem is that I want all these writes to happen in
>     parallel (inside one execution plan), because all writes have a
>     common, compute-intensive subpart that can be shared by all plans.
>     Is there a possibility to do this? (Caching is not a solution
>     because the input dataset is way too large...)
>
>     Hoping for advice ...
>
>     Best, Rico B.
>
>
>     ---
>     Diese E-Mail wurde von Avast Antivirus-Software auf Viren geprüft.
>     https://www.avast.com/antivirus
>
>
>     ---------------------------------------------------------------------
>     To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>     <mailto:user-unsubscr...@spark.apache.org>
>


---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscr...@spark.apache.org
