Re: Slot pools correct usage

2018-04-07 Thread Brian Greene
So what’s it doing (your config)?  Does it work if you don’t use pools?  What 
about if the pool is of size 2?  What if just one dag runs?  Have you ever 
seen this query work, or is it just since you started messing with pools that 
it stopped working?

I use 1 pool, no priority (I don’t care about sequence), and it “throttles” 
fine...

Which executor are you using?  I’m not familiar enough with the intricacies to 
know if the pool settings are honored with different executors, but I’m using 
CeleryExecutor with success.

B

Sent from a device with less than stellar autocorrect

> On Apr 6, 2018, at 10:40 PM, Manish Trivedi  wrote:
> 
> Hi Brian,
> 
> Really appreciate your quick reply. Just to be clear, I did not intend to
> run them in a particular order. As a matter of fact, these are expensive db
> queries that I can't afford to run in parallel.
> I think I have set up the tasks correctly to use the pool, but I may not be
> setting priority_weight correctly. I'd appreciate it if you could share your
> configs so I can check whether I am missing something simple.
> 
> thanks much,
> Manish
> 
> On Fri, Apr 6, 2018 at 6:18 PM, Brian Greene <
> br...@heisenbergwoodworking.com> wrote:
> 
>> To be clear, you’re hoping that setting the slots to 1 will cause the
>> tasks across distinct dags to run in order, based on the assumption that
>> they’ll queue up and then execute off the pool?
>> 
>> I don’t think it will quite work that way - there’s no guarantee the
>> scheduler will execute your tasks across dags in any particular sequence,
>> and if one is “faster” than the other, they certainly won’t “line up”.
>> Thus, there’s no way to ensure they’ll queue in the right order.
>> 
>> I successfully use pools across many dags to limit access to an expensive
>> resource and it works really well, but my design doesn’t require that they
>> execute in any particular order; each is idempotent.
>> 
>> I’m curious as to your design/constraints - could you elaborate?
>> 
>> Brian
>> 
>> Sent from a device with less than stellar autocorrect
>> 
>>> On Apr 6, 2018, at 3:46 PM, Manish Trivedi  wrote:
>>> 
>>> Hi Airflow devs,
>>> 
>>> I have a use case to limit the # of calls to a certain database. I am
>>> using the pool along with priority weight to schedule the tasks to the
>>> slot pool. I have around 5 operators that I need to execute in serial
>>> order across different dags.
>>> 
>>> Slot pool is created with "1" slot to ensure sequential execution. I am
>>> not able to achieve the desired behavior with the current setup.
>> 


Re: Slot pools correct usage

2018-04-06 Thread Manish Trivedi
Hi Brian,

Really appreciate your quick reply. Just to be clear, I did not intend to
run them in a particular order. As a matter of fact, these are expensive db
queries that I can't afford to run in parallel.
I think I have set up the tasks correctly to use the pool, but I may not be
setting priority_weight correctly. I'd appreciate it if you could share your
configs so I can check whether I am missing something simple.
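
To make the priority_weight question concrete, here is a rough sketch of how a
one-slot pool and a priority weight could interact. It is a toy model in plain
Python, not Airflow's scheduler code: the function name, the tuple layout and
the sample task names are all made up for illustration. The only assumption
carried over from Airflow is that, among tasks already waiting for a slot, the
one with the higher priority_weight goes first.

```python
import heapq


def run_single_slot(arrivals):
    """Toy model of a pool with exactly one slot.

    arrivals: list of (arrival_time, priority_weight, name, duration).
    Returns task names in the order they obtained the slot.
    Hypothetical helper for illustration; not part of Airflow.
    """
    pending = sorted(arrivals)  # ordered by arrival time
    waiting = []                # heap keyed on negative weight (highest first)
    clock = 0
    order = []
    i = 0
    while i < len(pending) or waiting:
        # Everything that has arrived by now joins the waiting set.
        while i < len(pending) and pending[i][0] <= clock:
            arrival, weight, name, duration = pending[i]
            heapq.heappush(waiting, (-weight, arrival, name, duration))
            i += 1
        if not waiting:
            # Slot is idle: jump ahead to the next arrival.
            clock = pending[i][0]
            continue
        # The slot goes to the highest-weight task that is waiting right now.
        _, _, name, duration = heapq.heappop(waiting)
        order.append(name)
        clock += duration  # the slot is held until the task finishes
    return order


# All three tasks are already waiting when the slot frees up.
all_queued = [(0, 1, "low", 5), (0, 5, "mid", 5), (0, 10, "high", 5)]
# Same weights, but the tasks become runnable at different times.
staggered = [(0, 1, "low", 5), (1, 10, "high", 5), (2, 5, "mid", 5)]

print(run_single_slot(all_queued))
print(run_single_slot(staggered))
```

In this model the weight only decides between tasks that are waiting at the
same moment. A high-weight task that becomes runnable later cannot jump ahead
of a low-weight task that already holds the slot, so the weights alone do not
pin down an order across dags that start at different times.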

thanks much,
Manish

On Fri, Apr 6, 2018 at 6:18 PM, Brian Greene <
br...@heisenbergwoodworking.com> wrote:

> To be clear, you’re hoping that setting the slots to 1 will cause the
> tasks across distinct dags to run in order, based on the assumption that
> they’ll queue up and then execute off the pool?
>
> I don’t think it will quite work that way - there’s no guarantee the
> scheduler will execute your tasks across dags in any particular sequence,
> and if one is “faster” than the other, they certainly won’t “line up”.
> Thus, there’s no way to ensure they’ll queue in the right order.
>
> I successfully use pools across many dags to limit access to an expensive
> resource and it works really well, but my design doesn’t require that they
> execute in any particular order; each is idempotent.
>
> I’m curious as to your design/constraints - could you elaborate?
>
> Brian
>
> Sent from a device with less than stellar autocorrect
>
> > On Apr 6, 2018, at 3:46 PM, Manish Trivedi  wrote:
> >
> > Hi Airflow devs,
> >
> > I have a use case to limit the # of calls to a certain database. I am
> > using the pool along with priority weight to schedule the tasks to the
> > slot pool. I have around 5 operators that I need to execute in serial
> > order across different dags.
> >
> > Slot pool is created with "1" slot to ensure sequential execution. I am
> > not able to achieve the desired behavior with the current setup.
>


Re: Slot pools correct usage

2018-04-06 Thread Brian Greene
To be clear, you’re hoping that setting the slots to 1 will cause the tasks 
across distinct dags to run in order, based on the assumption that they’ll 
queue up and then execute off the pool?

I don’t think it will quite work that way - there’s no guarantee the scheduler 
will execute your tasks across dags in any particular sequence, and if one is 
“faster” than the other, they certainly won’t “line up”.  Thus, there’s no way 
to ensure they’ll queue in the right order.

I successfully use pools across many dags to limit access to an expensive 
resource and it works really well, but my design doesn’t require that they 
execute in any particular order; each is idempotent.
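
As a rough illustration of the difference between "one at a time" and "in
order", here is a small plain-Python sketch. It is only an analogy, not how
Airflow implements pools: the SlotPool class, the task names and the sleep
times are all invented for the example, and a semaphore stands in for the pool.

```python
import threading
import time


class SlotPool:
    """Toy stand-in for a slot pool: at most `slots` tasks run at once."""

    def __init__(self, slots):
        self._sem = threading.Semaphore(slots)
        self._lock = threading.Lock()
        self.running = 0
        self.max_running = 0
        self.finished = []  # completion order; the pool does not control it

    def run(self, name, work_seconds):
        with self._sem:  # wait for a free slot
            with self._lock:
                self.running += 1
                self.max_running = max(self.max_running, self.running)
            time.sleep(work_seconds)  # stand-in for the expensive resource
            with self._lock:
                self.running -= 1
                self.finished.append(name)


def simulate(slots=1):
    pool = SlotPool(slots)
    # Hypothetical tasks from three different dags sharing one pool.
    tasks = [("dag_a.query", 0.02), ("dag_b.query", 0.01), ("dag_c.query", 0.03)]
    threads = [threading.Thread(target=pool.run, args=t) for t in tasks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return pool


pool = simulate(slots=1)
print("max concurrent:", pool.max_running)
print("finish order:", pool.finished)
```

With one slot the tasks never overlap, which is the throttling part. Which task
gets the slot next depends on who happens to be waiting, so the finish order
can differ from run to run. That is fine when each task is idempotent and
independent of the others.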

I’m curious as to your design/constraints - could you elaborate?

Brian

Sent from a device with less than stellar autocorrect

> On Apr 6, 2018, at 3:46 PM, Manish Trivedi  wrote:
> 
> Hi Airflow devs,
> 
> I have a use case to limit the # of calls to a certain database. I am using
> the pool along with priority weight to schedule the tasks to the slot pool.
> I have around 5 operators that I need to execute in serial order across
> different dags.
> 
> Slot pool is created with "1" slot to ensure sequential execution. I am not
> able to achieve the desired behavior with the current setup.