We do the same. BigQuery limits concurrent queries that contain UDFs to 6,
so any DAG task that uses a UDF is assigned to a shared pool (the 'udf'
pool) with a maximum of 6 slots.
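
For anyone new to pools, here is a minimal sketch of what that assignment
looks like in a DAG file, assuming the 'udf' pool has already been created
with 6 slots (Admin -> Pools in the UI). The DAG id, task id, and bash
command below are placeholders, and import paths vary a bit across Airflow
versions:

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash_operator import BashOperator

    dag = DAG(
        dag_id='bq_udf_example',  # placeholder DAG id
        start_date=datetime(2016, 5, 1),
        schedule_interval='@daily',
    )

    # Tasks assigned to the 'udf' pool share its 6 slots across ALL
    # DAGs, so the scheduler never runs more than 6 of them at once,
    # matching BigQuery's limit on concurrent UDF queries.
    run_udf_query = BashOperator(
        task_id='run_udf_query',
        bash_command='bq query "$UDF_QUERY"',  # placeholder command
        pool='udf',
        dag=dag,
    )

The nice part is that the limit is enforced globally by the scheduler, not
per DAG, so a backfill can't blow past it either.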

On Thu, May 19, 2016 at 4:27 PM, siddharth anand <[email protected]> wrote:

> Hi Lance!
> Yes, we do the same. Specifically, we have multiple DAGs that share access
> to a Spark cluster through the use of Pools. By setting the pool size to,
> say, 4, we remove the possibility of a backfill swamping the Spark
> cluster. BTW, there have been some bugs around over-subscription of pools;
> it's not a common occurrence, but it has been reported.
>
> -s
>
> On Thu, May 19, 2016 at 9:37 PM, Lance Norskog <[email protected]>
> wrote:
>
> > How should we use pools in our DAGs?
> >
> > We do a lot of analytics queries and copying between databases. I've set
> > up pools for each database instance so that we avoid overloading
> > instances with queries. Is this the right approach?
> >
> > Thanks,
> >
> > --
> > Lance Norskog
> > [email protected]
> > Redwood City, CA
> >
>