Re: outstanding offers

2016-11-03 Thread Benjamin Mahler
Yes, if you re-register with the master, this will invalidate all
outstanding offers.
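
On the scheduler side, this suggests dropping any locally cached offers whenever the reregistered callback fires. Below is a minimal sketch of that bookkeeping. It does not use the Mesos bindings; the class OfferTracker and its method names are hypothetical, and offers are modelled as plain (ID, payload) pairs purely for illustration.

```python
class OfferTracker:
    """Local view of the offers this framework believes it holds (hypothetical helper)."""

    def __init__(self):
        self.outstanding = {}  # offer ID -> offer payload

    def resource_offers(self, offers):
        # Intended to be called from the scheduler's resourceOffers callback.
        for offer_id, offer in offers:
            self.outstanding[offer_id] = offer

    def offer_used(self, offer_id):
        # Intended to be called after the framework accepts or declines an offer.
        self.outstanding.pop(offer_id, None)

    def offer_rescinded(self, offer_id):
        # Intended to be called from the scheduler's offerRescinded callback.
        self.outstanding.pop(offer_id, None)

    def reregistered(self):
        # Re-registration invalidates all outstanding offers, so forget them locally.
        dropped = list(self.outstanding)
        self.outstanding.clear()
        return dropped


tracker = OfferTracker()
tracker.resource_offers([("offer-1", {"cpus": 2}), ("offer-2", {"cpus": 4})])
tracker.offer_used("offer-1")
dropped = tracker.reregistered()
print(dropped)  # ['offer-2']
```

Returning the dropped IDs makes it easy to log which offers were discarded, which may help when comparing against the master's view.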

On Mon, Oct 31, 2016 at 2:28 PM, Hendrik Haddorp <hendrik.hadd...@gmx.net>
wrote:

> Right, I have written my own scheduler and sometimes end up in a state
> where Mesos believes that there are outstanding offers for my framework
> but I don't seem to have received them. The normal Mesos trace is not
> showing the IDs when it offers resources, only when they get declined or
> used. I'll look into using that trace.
>
> Besides that, the question is how one can get back to a state where there
> are no outstanding offers. For tasks I can call "reconcileTasks" to check
> with Mesos on the task state. But there does not seem to be an equivalent
> for offers, which is odd given that offers don't time out by default. Thus I
> was wondering what happens if there are communication problems and Mesos
> sends out an offer that I never receive. And what happens if my framework
> gets reregistered with Mesos: do outstanding offers get reset
> automatically or not?
>
> On 31.10.2016 18:49, Vinod Kone wrote:
>
>> Are you running a custom framework?
>>
>> Can you see in scheduler logs which offers you are receiving? Am I
>> understanding your question correctly that Mesos thinks offers are being
>> sent to your framework but (you think) your framework hasn't received them?
>>
>> Note that you can increase logging on the framework (driver) and Mesos
>> master by setting GLOG_v=1 in the environment.
>>
>> On Mon, Oct 31, 2016 at 12:42 AM, Hendrik Haddorp
>> <hendrik.hadd...@gmx.net> wrote:
>>
>> Hi,
>>
>> I have a Mesos 0.28.2 system and generally things seem to run
>> fine. The "Outstanding Offers" page normally shows nothing, which I
>> believe is normal. However, at some point my framework gets
>> disconnected for some odd reason, possibly due to high load.
>> A few seconds later I receive a reregistered call from
>> Mesos. Around this time, offers start to get
>> listed on the "Outstanding Offers" page. Even more strangely, no
>> Mesos log file contains any information for the offer IDs shown.
>> Unfortunately, the default logging does not show which offer IDs are
>> being sent out, although it shows the IDs that are being declined or
>> accepted. So I don't know when these offers actually got sent out.
>>
>> How can I deal with such a situation?
>> - Should I stop the SchedulerDriver when I get disconnected instead of
>>   waiting for a reregistered call?
>> - Is it advised to set --offer_timeout to recover from such a
>>   situation?
>> - Is there any way to reconcile offers like one can do for tasks?
>>
>> thanks,
>> Hendrik
>>
>>
>>
>


Re: outstanding offers

2016-10-31 Thread Hendrik Haddorp
Right, I have written my own scheduler and sometimes end up in a state
where Mesos believes that there are outstanding offers for my framework
but I don't seem to have received them. The normal Mesos trace is not
showing the IDs when it offers resources, only when they get declined or
used. I'll look into using that trace.


Besides that, the question is how one can get back to a state where there
are no outstanding offers. For tasks I can call "reconcileTasks" to
check with Mesos on the task state. But there does not seem to be an
equivalent for offers, which is odd given that offers don't time out by
default. Thus I was wondering what happens if there are communication
problems and Mesos sends out an offer that I never receive. And what
happens if my framework gets reregistered with Mesos: do outstanding
offers get reset automatically or not?
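
To make the lost-offer scenario concrete, here is a toy model of the master's bookkeeping. It is not Mesos code: ToyMaster and its methods are hypothetical, times are plain numbers in seconds, and it assumes that the master's --offer_timeout flag rescinds offers once they are older than the configured duration, with no expiry when the flag is unset.

```python
class ToyMaster:
    """Toy model of outstanding-offer tracking (not the real Mesos master)."""

    def __init__(self, offer_timeout=None):
        self.offer_timeout = offer_timeout  # seconds; None models "no timeout"
        self.outstanding = {}  # offer ID -> time the offer was sent

    def send_offer(self, offer_id, now):
        # The master records the offer whether or not the framework receives it.
        self.outstanding[offer_id] = now

    def rescind_expired(self, now):
        # Without a timeout, nothing ever expires on its own.
        if self.offer_timeout is None:
            return []
        expired = [oid for oid, sent in self.outstanding.items()
                   if now - sent >= self.offer_timeout]
        for oid in expired:
            del self.outstanding[oid]
        return expired


no_timeout = ToyMaster()
no_timeout.send_offer("lost-offer", now=0)
no_timeout.rescind_expired(now=3600)
print(sorted(no_timeout.outstanding))  # ['lost-offer']

with_timeout = ToyMaster(offer_timeout=30)
with_timeout.send_offer("lost-offer", now=0)
print(with_timeout.rescind_expired(now=30))  # ['lost-offer']
```

In this model an offer that never reaches the framework stays outstanding indefinitely unless a timeout is configured, which matches the symptom described above.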


On 31.10.2016 18:49, Vinod Kone wrote:

> Are you running a custom framework?
>
> Can you see in scheduler logs which offers you are receiving? Am I
> understanding your question correctly that Mesos thinks offers are
> being sent to your framework but (you think) your framework hasn't
> received them?
>
> Note that you can increase logging on the framework (driver) and Mesos
> master by setting GLOG_v=1 in the environment.
>
> On Mon, Oct 31, 2016 at 12:42 AM, Hendrik Haddorp
> <hendrik.hadd...@gmx.net> wrote:
>
>> Hi,
>>
>> I have a Mesos 0.28.2 system and generally things seem to run
>> fine. The "Outstanding Offers" page normally shows nothing, which I
>> believe is normal. However, at some point my framework gets
>> disconnected for some odd reason, possibly due to high load.
>> A few seconds later I receive a reregistered call from
>> Mesos. Around this time, offers start to get
>> listed on the "Outstanding Offers" page. Even more strangely, no
>> Mesos log file contains any information for the offer IDs shown.
>> Unfortunately, the default logging does not show which offer IDs are
>> being sent out, although it shows the IDs that are being declined or
>> accepted. So I don't know when these offers actually got sent out.
>>
>> How can I deal with such a situation?
>> - Should I stop the SchedulerDriver when I get disconnected instead of
>>   waiting for a reregistered call?
>> - Is it advised to set --offer_timeout to recover from such a
>>   situation?
>> - Is there any way to reconcile offers like one can do for tasks?
>>
>> thanks,
>> Hendrik






Re: outstanding offers

2016-10-31 Thread Vinod Kone
Are you running a custom framework?

Can you see in scheduler logs which offers you are receiving? Am I
understanding your question correctly that Mesos thinks offers are being
sent to your framework but (you think) your framework hasn't received them?

Note that you can increase logging on the framework (driver) and Mesos
master by setting GLOG_v=1 in the environment.
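
If the scheduler is started from a wrapper script, the variable can be set for just that child process. The following is a minimal sketch; the helper name glog_env is hypothetical, and the child process here is only a stand-in that echoes the variable, to be replaced with the actual scheduler command.

```python
import os
import subprocess
import sys


def glog_env(verbosity=1, base=None):
    """Return a copy of the environment with GLOG_v set (hypothetical helper)."""
    env = dict(os.environ if base is None else base)
    env["GLOG_v"] = str(verbosity)
    return env


# Stand-in child process that just prints the variable it received.
result = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['GLOG_v'])"],
    env=glog_env(1),
    capture_output=True,
    text=True,
)
print(result.stdout.strip())  # 1
```

Copying the environment rather than mutating os.environ keeps the extra verbosity limited to the process being debugged.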

On Mon, Oct 31, 2016 at 12:42 AM, Hendrik Haddorp <hendrik.hadd...@gmx.net>
wrote:

> Hi,
>
> I have a Mesos 0.28.2 system and generally things seem to run fine. The
> "Outstanding Offers" page normally shows nothing, which I believe is normal.
> However, at some point my framework gets disconnected for some odd reason,
> possibly due to high load. A few seconds later I receive a
> reregistered call from Mesos. Around this time, offers
> start to get listed on the "Outstanding Offers" page. Even more strangely, no
> Mesos log file contains any information for the offer IDs shown.
> Unfortunately, the default logging does not show which offer IDs are being
> sent out, although it shows the IDs that are being declined or accepted. So
> I don't know when these offers actually got sent out.
>
> How can I deal with such a situation?
> - Should I stop the SchedulerDriver when I get disconnected instead of
>   waiting for a reregistered call?
> - Is it advised to set --offer_timeout to recover from such a situation?
> - Is there any way to reconcile offers like one can do for tasks?
>
> thanks,
> Hendrik
>


outstanding offers

2016-10-31 Thread Hendrik Haddorp

Hi,

I have a Mesos 0.28.2 system and generally things seem to run fine. The
"Outstanding Offers" page normally shows nothing, which I believe is normal.
However, at some point my framework gets disconnected for some odd
reason, possibly due to high load. A few seconds later I
receive a reregistered call from Mesos. Around
this time, offers start to get listed on the "Outstanding Offers" page.
Even more strangely, no Mesos log file contains any information for the
offer IDs shown. Unfortunately, the default logging does not show which
offer IDs are being sent out, although it shows the IDs that are being
declined or accepted. So I don't know when these offers actually got
sent out.


How can I deal with such a situation?
- Should I stop the SchedulerDriver when I get disconnected instead of
  waiting for a reregistered call?
- Is it advised to set --offer_timeout to recover from such a situation?
- Is there any way to reconcile offers like one can do for tasks?

thanks,
Hendrik