Re: Reducing database connection with JdbcIO

Jean-Baptiste Onofré Wed, 14 Mar 2018 11:53:28 -0700

Agreed, especially since the current JdbcIO implementation creates its
connection in @Setup. Or does it mean that @Teardown is never called?
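
To make the two possibilities concrete, here is a toy model in plain Python. It is not Beam code: `FakeDatabase`, `WriterFn`, and `simulate` are hypothetical names, and the figures of 4 writer instances per worker and 124 instance replacements are assumptions chosen only to illustrate the arithmetic. It assumes one connection per writer instance, opened in setup and released in teardown.

```python
class FakeDatabase:
    """Stands in for the database; it only counts open connections."""

    def __init__(self):
        self.open_connections = 0


class WriterFn:
    """Hypothetical stand-in for a JDBC writer (not Beam's API)."""

    def __init__(self, db):
        self.db = db
        self.connected = False

    def setup(self):
        # One connection per instance, opened once at setup time.
        self.db.open_connections += 1
        self.connected = True

    def teardown(self):
        # Releases the connection, but only if teardown is actually invoked.
        if self.connected:
            self.db.open_connections -= 1
            self.connected = False


def simulate(workers, instances_per_worker, recycles=0, call_teardown=True):
    """Return the open connection count after the given instance churn."""
    db = FakeDatabase()
    for _ in range(workers * instances_per_worker):
        live = WriterFn(db)
        live.setup()
        for _ in range(recycles):
            # The runner replaces the instance; the old one may or may not
            # get its teardown call.
            if call_teardown:
                live.teardown()
            live = WriterFn(db)
            live.setup()
    return db.open_connections


# Steady state with teardown honoured: one connection per live instance.
print(simulate(6, 4, recycles=124, call_teardown=True))   # 24
# Same churn, teardown never called: every past instance leaks a connection.
print(simulate(6, 4, recycles=124, call_teardown=False))  # 3000
# Without leaks, 3000 connections would need 500 live instances per worker.
print(simulate(6, 500))                                    # 3000
```

Under these assumptions, 3000 connections on 6 workers means either about 500 live writer instances per worker or connections that outlive their instances, which is why how the number was measured matters.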


Regards
JB

On 14 March 2018 at 11:40, Eugene Kirpichov <[email protected]> wrote:
>Hi Derek - could you explain where the "3000 connections" number comes
>from, i.e. how did you measure it? It's weird that 5-6 workers would use
>3000 connections.
>
>On Wed, Mar 14, 2018 at 3:50 AM Derek Chan <[email protected]> wrote:
>
>> Hi,
>>
>> We are new to Beam and need some help.
>>
>> We are working on a flow that ingests events and writes the aggregated
>> counts to a database. The input rate is rather low (~2000 messages per
>> sec), but the processing is relatively heavy, so we need to scale out
>> to 5~6 nodes. The output (via JDBC) is aggregated, so the volume is also
>> low. But because of the number of workers, it keeps 3000 connections to
>> the database and keeps hitting the database connection limits.
>>
>> Is there a way that we can reduce the concurrency only at the output
>> stage? (In Spark we would have done a repartition/coalesce).
>>
>> And, if it matters, we are using Apache Beam 2.2 via Scio, on Google
>> Dataflow.
>>
>> Thank you in advance!
>>
>>
>>
>>
