Agreed, especially with the current JdbcIO implementation, which creates the connection in @Setup. Or does it mean that @Teardown is never called?

Regards
JB

On 14 March 2018 at 11:40, Eugene Kirpichov <[email protected]> wrote:
> Hi Derek - could you explain where does the "3000 connections" number
> come from, i.e. how did you measure it? It's weird that 5-6 workers
> would use 3000 connections.
>
> On Wed, Mar 14, 2018 at 3:50 AM Derek Chan <[email protected]> wrote:
>
>> Hi,
>>
>> We are new to Beam and need some help.
>>
>> We are working on a flow to ingest events and writes the aggregated
>> counts to a database. The input rate is rather low (~2000 message per
>> sec), but the processing is relatively heavy, that we need to scale out
>> to 5~6 nodes. The output (via JDBC) is aggregated, so the volume is also
>> low. But because of the number of workers, it keeps 3000 connections to
>> the database and it keeps hitting the database connection limits.
>>
>> Is there a way that we can reduce the concurrency only at the output
>> stage? (In Spark we would have done a repartition/coalesce).
>>
>> And, if it matters, we are using Apache Beam 2.2 via Scio, on Google
>> Dataflow.
>>
>> Thank you in advance!
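
For readers following the @Setup/@Teardown point at the top of this thread, here is a rough plain-Java sketch of why the lifecycle matters. It does not use Beam or JDBC at all. FakeDatabase, WriterFn and openAfterRun are hypothetical names made up for illustration, and the figure of 500 writer instances per worker is an assumption chosen only so that the product matches the 3000 mentioned in the thread; it is not a measurement. The only idea it models is "one connection opened per writer instance in setup, released only in teardown".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class ConnectionLifecycleSketch {

    /** Hypothetical stand-in for a database that counts open connections. */
    static final class FakeDatabase {
        private final AtomicInteger open = new AtomicInteger();

        void connect() { open.incrementAndGet(); }
        void disconnect() { open.decrementAndGet(); }
        int openConnections() { return open.get(); }
    }

    /** Hypothetical writer: holds one connection from setup() until teardown(). */
    static final class WriterFn {
        private final FakeDatabase db;
        private boolean connected;

        WriterFn(FakeDatabase db) { this.db = db; }

        void setup() {
            if (!connected) { db.connect(); connected = true; }
        }

        void teardown() {
            if (connected) { db.disconnect(); connected = false; }
        }
    }

    /**
     * Creates workers * instancesPerWorker writers, sets each one up, and
     * optionally tears them all down. Returns the connections still open.
     */
    static int openAfterRun(int workers, int instancesPerWorker, boolean callTeardown) {
        FakeDatabase db = new FakeDatabase();
        List<WriterFn> fns = new ArrayList<>();
        for (int i = 0; i < workers * instancesPerWorker; i++) {
            WriterFn fn = new WriterFn(db);
            fn.setup();
            fns.add(fn);
        }
        if (callTeardown) {
            for (WriterFn fn : fns) {
                fn.teardown();
            }
        }
        return db.openConnections();
    }

    public static void main(String[] args) {
        // Assumed numbers: 6 workers, 500 writer instances each (illustrative only).
        System.out.println(openAfterRun(6, 500, false)); // prints 3000
        System.out.println(openAfterRun(6, 500, true));  // prints 0
    }
}
```

In this toy model the open-connection count is simply the number of writer instances that have been set up and not yet torn down, which is why both the number of instances per worker and whether teardown actually runs would be worth checking in the real pipeline.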
