Tomas, in your case the framework keeps disconnecting from and reconnecting to the master (likely a networking issue). Since each reconnect happens within the failover timeout (1 week), the master doesn't remove it. What IPs are your framework and master using?
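
As a rough illustration of the timeout behaviour described above, here is a toy model in Python. It is not the actual master code: `ToyFramework`, `simulate_flapping`, and their fields are made-up names, and the only value taken from the log is the 1 week timeout. It just shows that a framework which reconnects before the deadline never gets removed, however often it flaps.

```python
from dataclasses import dataclass
from typing import Optional

WEEK_SECS = 7 * 24 * 60 * 60  # the "1weeks" failover timeout seen in the log


@dataclass
class ToyFramework:
    """Hypothetical stand-in for the master's per-framework bookkeeping."""
    failover_timeout: float = WEEK_SECS
    disconnected_at: Optional[float] = None  # None while connected
    removed: bool = False

    def disconnect(self, now: float) -> None:
        # The framework is deactivated and the failover clock starts.
        self.disconnected_at = now

    def tick(self, now: float) -> None:
        # Remove the framework only if it stayed disconnected for the
        # whole failover timeout.
        if (self.disconnected_at is not None
                and now - self.disconnected_at >= self.failover_timeout):
            self.removed = True

    def reconnect(self, now: float) -> None:
        # A reconnect before the deadline cancels the pending removal.
        self.tick(now)
        if not self.removed:
            self.disconnected_at = None


def simulate_flapping(cycles: int, gap_secs: float) -> ToyFramework:
    """Disconnect and reconnect `cycles` times, `gap_secs` apart."""
    fw = ToyFramework()
    now = 0.0
    for _ in range(cycles):
        fw.disconnect(now)
        now += gap_secs
        fw.reconnect(now)
    return fw
```

In this model, `simulate_flapping(1000, 0.5)` leaves the framework registered, while a framework that disconnects once and stays away for the full timeout is removed on the next `tick`.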

On Wed, May 28, 2014 at 7:46 AM, Tomas Barton <[email protected]> wrote:

> Hi,
>
> I have similar issue, Mesos is trying to keep alive a framework that is
> crashing:
>
> I0528 16:42:52.487659 6009 master.cpp:929] Framework 20140528-054038-316558480-5050-17117-0003 failed over
> I0528 16:42:52.487927 6009 hierarchical_allocator_process.hpp:378] Activated framework 20140528-054038-316558480-5050-17117-0003
> I0528 16:42:52.488483 6009 master.cpp:2282] Sending 2 offers to framework 20140528-054038-316558480-5050-17117-0003
> I0528 16:42:52.488873 6009 master.cpp:592] Framework 20140528-054038-316558480-5050-17117-0003 disconnected
> I0528 16:42:52.488914 6009 master.cpp:1076] Deactivating framework 20140528-054038-316558480-5050-17117-0003
> I0528 16:42:52.489202 6009 master.cpp:614] Giving framework 20140528-054038-316558480-5050-17117-0003 1weeks to failover
> I0528 16:42:52.489279 6009 hierarchical_allocator_process.hpp:408] Deactivated framework 20140528-054038-316558480-5050-17117-0003
>
> it's trying to recover the framework few times per second. Is there
> currently a way how to remove that framework?
>
> Probably delete framework state from zookeeper?
>
> Tomas
>
>
> On 28 May 2014 05:56, Manivannan <[email protected]> wrote:
>
>> Hi Vinod,
>>
>> Thanks for your reply. Please see inline.
>>
>> Thanks,
>> Mani
>>
>>
>> On Wed, May 28, 2014 at 3:57 AM, Vinod Kone <[email protected]> wrote:
>>
>>> Hi Mani,
>>>
>>> What do you mean by "stuck" framework? If the framework disconnects from
>>> master and the failover timeout (configurable) has passed master should
>>> remove the framework.
>>>
>>> - *I have a Mesos cluster and lot of Jenkins instances talking to the
>>> cluster to provision slaves. Although I have killed the Jenkins
>>> instances, I still see that they are listed as frameworks in Mesos
>>> (that is what I mentioned as stuck frameworks). What is the default
>>> fail over timeout ?*
>>>
>>> Also, there is currently work in progress to give operators the ability
>>> to force remove a framework. See :
>>> https://issues.apache.org/jira/browse/MESOS-1390
>>>
>>> - *I believe this fix would help me out.*
>>>
>>>
>>> On Tue, May 27, 2014 at 5:01 AM, Manivannan <[email protected]> wrote:
>>>
>>>> Hi ,
>>>>
>>>> My issue is similar to :
>>>> https://issues.apache.org/jira/browse/MESOS-108
>>>> Couple of frameworks were stuck forever in my Mesos cluster. Is there
>>>> a way to kill those frameworks ?
>>>>
>>>> Thanks,
>>>> Mani

