Tomas, in your case the framework keeps disconnecting from and reconnecting
to the master (likely a networking issue). Since each reconnect happens
within the failover timeout (1 week), the master doesn't remove it. What IPs
are your framework and master using?
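
To illustrate why a flapping framework is never removed, here is a minimal
sketch of the failover-timeout behaviour described above. It is a toy model
in Python, not Mesos code: ToyMaster, Framework, and the method names are
hypothetical, and the one-week value is taken from the "1weeks" in your log.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

WEEK_SECS = 7 * 24 * 60 * 60  # assumed from the "1weeks to failover" log line


@dataclass
class Framework:
    failover_timeout: float
    disconnected_at: Optional[float] = None  # None while connected


@dataclass
class ToyMaster:
    """Hypothetical stand-in for the master's bookkeeping, not the real thing."""
    frameworks: Dict[str, Framework] = field(default_factory=dict)

    def register(self, fid: str, failover_timeout: float) -> None:
        self.frameworks[fid] = Framework(failover_timeout)

    def disconnected(self, fid: str, now: float) -> None:
        # Start the failover clock for this framework.
        self.frameworks[fid].disconnected_at = now

    def reconnected(self, fid: str) -> None:
        # A reconnect ("failed over" in the log) cancels the pending removal.
        self.frameworks[fid].disconnected_at = None

    def tick(self, now: float) -> List[str]:
        # Remove frameworks that stayed disconnected for the whole timeout.
        expired = [
            fid for fid, fw in self.frameworks.items()
            if fw.disconnected_at is not None
            and now - fw.disconnected_at >= fw.failover_timeout
        ]
        for fid in expired:
            del self.frameworks[fid]
        return expired


def demo():
    master = ToyMaster()
    master.register("flapping", WEEK_SECS)
    master.register("dead", WEEK_SECS)
    master.disconnected("dead", now=0)  # never comes back

    removed = []
    hour = 3600
    for t in range(0, 10 * 24 * hour, hour):  # simulate ten days
        master.disconnected("flapping", now=t)
        master.reconnected("flapping")  # reconnects well within the timeout
        removed += master.tick(now=t + 2)
    return removed, sorted(master.frameworks)


if __name__ == "__main__":
    print(demo())
```

In this model only the framework that stays disconnected for the full
timeout is removed; the one that keeps reconnecting resets its clock every
time, which matches the disconnect/failover loop in your log.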


On Wed, May 28, 2014 at 7:46 AM, Tomas Barton <[email protected]> wrote:

> Hi,
>
> I have a similar issue. Mesos is trying to keep alive a framework that is
> crashing:
>
> I0528 16:42:52.487659  6009 master.cpp:929] Framework
> 20140528-054038-316558480-5050-17117-0003 failed over
> I0528 16:42:52.487927  6009 hierarchical_allocator_process.hpp:378]
> Activated framework 20140528-054038-316558480-5050-17117-0003
> I0528 16:42:52.488483  6009 master.cpp:2282] Sending 2 offers to framework
> 20140528-054038-316558480-5050-17117-0003
> I0528 16:42:52.488873  6009 master.cpp:592] Framework
> 20140528-054038-316558480-5050-17117-0003 disconnected
> I0528 16:42:52.488914  6009 master.cpp:1076] Deactivating framework
> 20140528-054038-316558480-5050-17117-0003
> I0528 16:42:52.489202  6009 master.cpp:614] Giving framework
> 20140528-054038-316558480-5050-17117-0003 1weeks to failover
> I0528 16:42:52.489279  6009 hierarchical_allocator_process.hpp:408]
> Deactivated framework 20140528-054038-316558480-5050-17117-0003
>
> It tries to recover the framework a few times per second. Is there
> currently a way to remove that framework?
>
> Perhaps by deleting the framework state from ZooKeeper?
>
> Tomas
>
>
> On 28 May 2014 05:56, Manivannan <[email protected]> wrote:
>
>> Hi Vinod,
>>
>> Thanks for your reply. Please see inline.
>>
>> Thanks,
>> Mani
>>
>>
>> On Wed, May 28, 2014 at 3:57 AM, Vinod Kone <[email protected]> wrote:
>>
>>> Hi Mani,
>>>
>>> What do you mean by "stuck" framework? If the framework disconnects from
>>> the master and the failover timeout (configurable) has passed, the master
>>> should remove the framework. - *I have a Mesos cluster and a lot of
>>> Jenkins instances talking to the cluster to provision slaves. Although I
>>> have killed the Jenkins instances, I still see them listed as frameworks
>>> in Mesos (that is what I meant by stuck frameworks). What is the
>>> default failover timeout?*
>>>
>>
>>
>>>
>>> Also, there is currently work in progress to give operators the ability
>>> to force-remove a framework. See:
>>> https://issues.apache.org/jira/browse/MESOS-1390 - *I believe this fix
>>> would help me out.*
>>>
>>
>>
>>>
>>>
>>> On Tue, May 27, 2014 at 5:01 AM, Manivannan <[email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>> My issue is similar to:
>>>> https://issues.apache.org/jira/browse/MESOS-108
>>>> A couple of frameworks are stuck forever in my Mesos cluster. Is there
>>>> a way to kill those frameworks?
>>>>
>>>> Thanks,
>>>> Mani
>>>>
>>>
>>>
>>
>
