Ok, thanks. Yes, there is a problem with over-subscribing pools. If your pool is set to 4, you can get 15 active tasks and another 20 waiting. This is still true in 1.7.0.
Lance

On Thu, May 19, 2016 at 5:21 PM, Chris Riccomini <[email protected]> wrote:
> We do the same as well. BigQuery limits UDF usage to 6, so any DAG that
> uses a UDF goes in a pool (the 'udf' pool), which has a max of 6.
>
> On Thu, May 19, 2016 at 4:27 PM, siddharth anand <[email protected]> wrote:
>
> > Hi Lance!
> > Yes, we do the same. Specifically, we have multiple DAGs that share access
> > to a Spark cluster through the use of Pools. By setting the pool size to
> > say 4, we remove the possibility of some backfill swamping the Spark
> > cluster. BTW, there were some bugs with over-subscription of pools. It's
> > not a common occurrence, but it has been reported.
> >
> > -s
> >
> > On Thu, May 19, 2016 at 9:37 PM, Lance Norskog <[email protected]> wrote:
> >
> > > How should we use pools in our dags?
> > >
> > > We do a lot of analytics queries and copying between databases. I've set
> > > up pools for each database instance so that we avoid overloading instances
> > > with queries. Is this the right approach?
> > >
> > > Thanks,
> > >
> > > --
> > > Lance Norskog
> > > [email protected]
> > > Redwood City, CA

--
Lance Norskog
[email protected]
Redwood City, CA
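For anyone following along: the intended semantics of a pool can be modeled as a fixed number of slots guarded by a counting semaphore — a task assigned to the pool must hold a slot while it runs, so at most `size` tasks execute at once and the rest wait. Here is a minimal stdlib sketch of that model (the pool name "spark" and size 4 mirror Sid's example; this is a conceptual illustration, not Airflow's scheduler code — the over-subscription bug mentioned above is exactly this cap failing to hold):

```python
import threading
import time

class Pool:
    """Conceptual model of an Airflow-style pool: a named set of slots."""
    def __init__(self, name, size):
        self.name = name
        self.slots = threading.BoundedSemaphore(size)
        self.peak = 0        # highest concurrency actually observed
        self._active = 0
        self._lock = threading.Lock()

    def run(self, task):
        with self.slots:     # block until a slot is free
            with self._lock:
                self._active += 1
                self.peak = max(self.peak, self._active)
            try:
                task()
            finally:
                with self._lock:
                    self._active -= 1

# Hypothetical pool matching the example in the thread.
spark_pool = Pool("spark", size=4)

def query():
    time.sleep(0.01)         # stand-in for a Spark/BigQuery job

# 20 tasks contend for 4 slots; with the cap working, peak stays <= 4.
threads = [threading.Thread(target=spark_pool.run, args=(query,))
           for _ in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(spark_pool.peak)
```

In real Airflow usage you would instead create the pool in the UI (Admin → Pools) and pass `pool='spark'` to each operator; the sketch just shows why a correctly enforced pool of 4 cannot produce 15 active tasks.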
