> Is there a specific reason why the slave does not first send a TERM
> signal, and if that does not help after a certain timeout, send a KILL
> signal?
> That would give us a chance to clean up consul registrations (and other
> cleanup).
I think that flow might be more complex. How about registering a KILL signal
listener to clean up the consul registration?


On Fri, Feb 12, 2016 at 6:12 PM, Harry Metske <[email protected]>
wrote:

> Hi,
>
> we have a Mesos (0.27) cluster running with (here relevant) slave options:
> --cgroups_enable_cfs=true
> --cgroups_limit_swap=true
> --isolation=cgroups/cpu,cgroups/mem
>
> What we see happening is that people are running Tasks (Java applications)
> and specify a memory resource limit that is too low, which causes these
> tasks to be terminated, see logs below.
> That's all fine, after all you should specify reasonable memory limits.
> It looks like the slave sends a KILL signal when the limit is reached, so
> the application has no chance to shut down gracefully, which (in our
> case) results in consul registrations not being cleaned up.
> Is there a specific reason why the slave does not first send a TERM
> signal, and if that does not help after a certain timeout, send a KILL
> signal?
> That would give us a chance to clean up consul registrations (and other
> cleanup).
>
> kind regards,
> Harry
>
>
> I0212 09:27:49.238371 11062 containerizer.cpp:1460] Container
> bed2585a-c361-4c66-afd9-69e70e748ae2 has reached its limit for resource
> mem(*):160 and will be terminated
>
> I0212 09:27:49.238418 11062 containerizer.cpp:1227] Destroying container
> 'bed2585a-c361-4c66-afd9-69e70e748ae2'
>
> I0212 09:27:49.240932 11062 cgroups.cpp:2427] Freezing cgroup
> /sys/fs/cgroup/freezer/mesos/bed2585a-c361-4c66-afd9-69e70e748ae2
>
> I0212 09:27:49.345171 11062 cgroups.cpp:1409] Successfully froze cgroup
> /sys/fs/cgroup/freezer/mesos/bed2585a-c361-4c66-afd9-69e70e748ae2 after
> 104.21376ms
>
> I0212 09:27:49.347303 11062 cgroups.cpp:2445] Thawing cgroup
> /sys/fs/cgroup/freezer/mesos/bed2585a-c361-4c66-afd9-69e70e748ae2
>
> I0212 09:27:49.349453 11062 cgroups.cpp:1438] Successfullly thawed cgroup
> /sys/fs/cgroup/freezer/mesos/bed2585a-c361-4c66-afd9-69e70e748ae2 after
> 2.123008ms
>
> I0212 09:27:49.359627 11062 slave.cpp:3481] executor(1)@
> 10.239.204.142:43950 exited
>
> I0212 09:27:49.381942 11062 containerizer.cpp:1443] Executor for container
> 'bed2585a-c361-4c66-afd9-69e70e748ae2' has exited
>
> I0212 09:27:49.389766 11062 provisioner.cpp:306] Ignoring destroy request
> for unknown container bed2585a-c361-4c66-afd9-69e70e748ae2
>
> I0212 09:27:49.389853 11062 slave.cpp:3816] Executor
> 'fulltest02.6cd29bd8-d162-11e5-a4df-005056aa67df' of framework
> 7baec9af-018f-4a4c-822a-117d61187471-0001 terminated with signal Killed
>



-- 
Best Regards,
Haosdent Huang
