Re: Agent Error with Large Docker Images
Found it in the log: the provisioner is using the aufs backend. How about changing the backend filesystem to overlay?

2017-07-18 19:49 GMT+08:00:
> Hi,
>
> We are experiencing a bug on the Mesos agent (1.3.0) when trying to start large Docker images inside a Mesos container. I have tried images of multiple sizes, and the threshold seems to lie somewhere around 4.5 GB. We have experienced this bug using both a custom framework (deep-mesos) and Marathon. Here is a log of what is happening on the agent. This does not happen with smaller images.
>
> Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.784018 30042 master.cpp:9320] Adding task git-default.033d2193-0c3c-4878-a63c-6bbfb24df6e0-O0 with resources cpus(*)(allocated: *):4; mem(*)(allocated: *):25000; gpus(*)(allocated: *):1; ports(*)(allocated: *):[31000-31000] on agent 816e697d-62d2-465a-bf7c-7b79901e07a3-S4 at slave(1)@130.92.124.103:5051 (otpc103.unibe.ch)
> Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.784235 30042 master.cpp:4531] Launching task git-default.033d2193-0c3c-4878-a63c-6bbfb24df6e0-O0 of framework c7161dd3-0bbc-4032-92c2-5477082d2c08-0014 (Deep Mesos) with resources cpus(*)(allocated: *):4; mem(*)(allocated: *):25000; gpus(*)(allocated: *):1; ports(*)(allocated: *):[31000-31000] on agent 816e697d-62d2-465a-bf7c-7b79901e07a3-S4 at slave(1)@130.92.124.103:5051 (otpc103.unibe.ch)
> Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.785534 30023 slave.cpp:1613] Got assigned task 'git-default.033d2193-0c3c-4878-a63c-6bbfb24df6e0-O0' for framework c7161dd3-0bbc-4032-92c2-5477082d2c08-0014
> Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.786010 30038 hierarchical.cpp:850] Updated allocation of framework c7161dd3-0bbc-4032-92c2-5477082d2c08-0014 on agent 816e697d-62d2-465a-bf7c-7b79901e07a3-S4 from gpus(*)(allocated: *):1; cpus(*)(allocated: *):8; mem(*)(allocated: *):31099; disk(*)(allocated: *):56156; ports(*)(allocated: *):[31000-32000] to gpus(*)(allocated: *):1; cpus(*)(allocated: *):8; mem(*)(allocated: *):31099; disk(*)(allocated: *):56156; ports(*)(allocated: *):[31000-32000]
> Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.786223 30023 gc.cpp:83] Unscheduling '/var/lib/mesos/agent/slaves/816e697d-62d2-465a-bf7c-7b79901e07a3-S4/frameworks/c7161dd3-0bbc-4032-92c2-5477082d2c08-0014' from gc
> Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.786487 30023 slave.cpp:1894] Authorizing task 'git-default.033d2193-0c3c-4878-a63c-6bbfb24df6e0-O0' for framework c7161dd3-0bbc-4032-92c2-5477082d2c08-0014
> Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.787127 30029 slave.cpp:2081] Launching task 'git-default.033d2193-0c3c-4878-a63c-6bbfb24df6e0-O0' for framework c7161dd3-0bbc-4032-92c2-5477082d2c08-0014
> Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.789391 30029 paths.cpp:573] Trying to chown '/var/lib/mesos/agent/slaves/816e697d-62d2-465a-bf7c-7b79901e07a3-S4/frameworks/c7161dd3-0bbc-4032-92c2-5477082d2c08-0014/executors/git-default.033d2193-0c3c-4878-a63c-6bbfb24df6e0-O0/runs/c2343739-4252-4778-8902-9bedd514c3cd' to user 'root'
> Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.789891 30029 slave.cpp:6933] Launching executor 'git-default.033d2193-0c3c-4878-a63c-6bbfb24df6e0-O0' of framework c7161dd3-0bbc-4032-92c2-5477082d2c08-0014 with resources cpus(*)(allocated: *):0.1; mem(*)(allocated: *):32 in work directory '/var/lib/mesos/agent/slaves/816e697d-62d2-465a-bf7c-7b79901e07a3-S4/frameworks/c7161dd3-0bbc-4032-92c2-5477082d2c08-0014/executors/git-default.033d2193-0c3c-4878-a63c-6bbfb24df6e0-O0/runs/c2343739-4252-4778-8902-9bedd514c3cd'
> Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.790630 30029 slave.cpp:2310] Queued task 'git-default.033d2193-0c3c-4878-a63c-6bbfb24df6e0-O0' for executor 'git-default.033d2193-0c3c-4878-a63c-6bbfb24df6e0-O0' of framework c7161dd3-0bbc-4032-92c2-5477082d2c08-0014
> Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.790971 30022 docker.cpp:1148] Skipping non-docker container
> Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.791677 30028 containerizer.cpp:1001] Starting container c2343739-4252-4778-8902-9bedd514c3cd for executor 'git-default.033d2193-0c3c-4878-a63c-6bbfb24df6e0-O0' of framework c7161dd3-0bbc-4032-92c2-5477082d2c08-0014
> Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.799257 30028 provisioner.cpp:453] Provisioning image rootfs '/var/lib/mesos/agent/provisioner/containers/c2343739-4252-4778-8902-9bedd514c3cd/backends/aufs/rootfses/2eed6b86-66f1-46a0-9fc3-1c8b22bff399' for container c2343739-4252-4778-8902-9bedd514c3cd using aufs backend
> Jul 18 13:30:33 otpc103 kernel: [673973.912396] general protection fault: [#2] SMP
> Jul 18 13:30:33 otpc103 kernel: [673973.912403] Modules linked in: veth ipt_MASQUERADE nf_nat_masquerade_ipv4
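To confirm which provisioner backend an agent is actually using before and after a change, the agent log can be scanned for the "using <name> backend" message shown above. The following is only a minimal sketch: the function name `find_provisioner_backends` is made up for illustration, and the pattern is based solely on the single provisioner.cpp line quoted in this thread, so other Mesos versions may word the message differently. As far as I know the backend is selected with the agent flag `--image_provisioner_backend` (for example `overlay`), but please check `mesos-agent --help` on your version, and note that overlay needs kernel support for overlayfs.

```python
import re

# Assumption: the message wording is taken from the one provisioner.cpp
# line quoted in this thread ("... using aufs backend"). \s+ tolerates
# log lines that were wrapped by a mail client.
BACKEND_RE = re.compile(r"\busing\s+(\w+)\s+backend\b")


def find_provisioner_backends(log_text):
    """Return the set of backend names mentioned in the given log text."""
    return set(BACKEND_RE.findall(log_text))


if __name__ == "__main__":
    sample = (
        "I0718 13:30:33.799257 30028 provisioner.cpp:453] Provisioning image "
        "rootfs '/var/lib/mesos/agent/provisioner/containers/"
        "c2343739-4252-4778-8902-9bedd514c3cd/backends/aufs/rootfses/"
        "2eed6b86-66f1-46a0-9fc3-1c8b22bff399' for container "
        "c2343739-4252-4778-8902-9bedd514c3cd using aufs backend"
    )
    print(find_provisioner_backends(sample))
```

If the set still contains `aufs` after restarting the agent with a different backend, the flag was probably not picked up.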
Re: Libraries
Hi,

I built a small proxy for that. The required Mesos API is quite small, so I just created my own SchedulerDriver implementation that sends the calls to a proxy component running in a Docker container, where I can easily have the native dependency. A proxy scheduler runs in that container and forwards the calls coming from Mesos back to the proxy SchedulerDriver. For the communication I used WebSockets. Since the API uses protocol buffers, you can quite easily send the messages over the wire. This way I can run my code even on Windows.

regards,
Hendrik

On 10.07.2017 13:48, Oeg Bizz wrote:

All,

I am continuing to develop a Java framework using the Java API rather than the HTTP API (long story here). The ultimate goal is to run the framework in its own container, but in the meantime I am constantly making updates to it. So, I would like to be able to run the framework on my CentOS 7 node (no containers) until it is ready. I have a mesos-master and a mesos-slave running in different containers.

How can I run the framework WITHOUT installing Mesos on my computer? I want to use the libraries contained in the Mesos Docker containers; is that possible? I tried mounting a volume, copying libmesos from the container to that volume, and setting MESOS_NATIVE_JAVA_LIBRARY accordingly, but it complains about not finding the libsvn_delta library. Is there a list somewhere of all the .so files I need to expose? Is what I am trying to do absurd, and should I just install Mesos? I know installing Mesos is the easy answer, but it would be nice to run Mesos in containers rather than on the computer.

Thanks for your help,
Oscar
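On the question of which .so files have to be exposed: I am not aware of a published list, but one way to enumerate them is to run `ldd` against the copied libmesos and look for entries marked "not found". The sketch below assumes a Linux host with `ldd` on the PATH and Python 3.7 or later; the function names `parse_ldd` and `missing_libraries` and the example path are made up for illustration.

```python
import subprocess


def parse_ldd(output):
    """Map each library named in `ldd` output to its resolved path.

    Libraries the dynamic loader cannot resolve map to None.
    """
    deps = {}
    for line in output.splitlines():
        line = line.strip()
        if "=>" not in line:
            continue  # e.g. the dynamic loader itself, which has no "=>"
        name, _, target = line.partition("=>")
        name, target = name.strip(), target.strip()
        if target.startswith("("):
            continue  # virtual objects such as linux-vdso have no path
        if target.startswith("not found"):
            deps[name] = None
        else:
            # Drop the trailing load address, e.g. " (0x00007f...)".
            deps[name] = target.split(" (")[0]
    return deps


def missing_libraries(library_path):
    """Run `ldd` on library_path and return the unresolved library names."""
    result = subprocess.run(
        ["ldd", library_path], capture_output=True, text=True, check=True
    )
    return sorted(n for n, p in parse_ldd(result.stdout).items() if p is None)


if __name__ == "__main__":
    # Hypothetical path: point this at wherever libmesos was copied to.
    print(missing_libraries("/path/to/libmesos.so"))
```

Running the same check inside the Mesos container shows the resolved paths, which gives the set of files to copy or mount alongside libmesos. Transitive dependencies can surface new "not found" entries, so it may take more than one pass.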
Agent Error with Large Docker Images
Hi,

We are experiencing a bug on the Mesos agent (1.3.0) when trying to start large Docker images inside a Mesos container. I have tried images of multiple sizes, and the threshold seems to lie somewhere around 4.5 GB. We have experienced this bug using both a custom framework (deep-mesos) and Marathon. Here is a log of what is happening on the agent. This does not happen with smaller images.

Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.784018 30042 master.cpp:9320] Adding task git-default.033d2193-0c3c-4878-a63c-6bbfb24df6e0-O0 with resources cpus(*)(allocated: *):4; mem(*)(allocated: *):25000; gpus(*)(allocated: *):1; ports(*)(allocated: *):[31000-31000] on agent 816e697d-62d2-465a-bf7c-7b79901e07a3-S4 at slave(1)@130.92.124.103:5051 (otpc103.unibe.ch)
Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.784235 30042 master.cpp:4531] Launching task git-default.033d2193-0c3c-4878-a63c-6bbfb24df6e0-O0 of framework c7161dd3-0bbc-4032-92c2-5477082d2c08-0014 (Deep Mesos) with resources cpus(*)(allocated: *):4; mem(*)(allocated: *):25000; gpus(*)(allocated: *):1; ports(*)(allocated: *):[31000-31000] on agent 816e697d-62d2-465a-bf7c-7b79901e07a3-S4 at slave(1)@130.92.124.103:5051 (otpc103.unibe.ch)
Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.785534 30023 slave.cpp:1613] Got assigned task 'git-default.033d2193-0c3c-4878-a63c-6bbfb24df6e0-O0' for framework c7161dd3-0bbc-4032-92c2-5477082d2c08-0014
Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.786010 30038 hierarchical.cpp:850] Updated allocation of framework c7161dd3-0bbc-4032-92c2-5477082d2c08-0014 on agent 816e697d-62d2-465a-bf7c-7b79901e07a3-S4 from gpus(*)(allocated: *):1; cpus(*)(allocated: *):8; mem(*)(allocated: *):31099; disk(*)(allocated: *):56156; ports(*)(allocated: *):[31000-32000] to gpus(*)(allocated: *):1; cpus(*)(allocated: *):8; mem(*)(allocated: *):31099; disk(*)(allocated: *):56156; ports(*)(allocated: *):[31000-32000]
Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.786223 30023 gc.cpp:83] Unscheduling '/var/lib/mesos/agent/slaves/816e697d-62d2-465a-bf7c-7b79901e07a3-S4/frameworks/c7161dd3-0bbc-4032-92c2-5477082d2c08-0014' from gc
Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.786487 30023 slave.cpp:1894] Authorizing task 'git-default.033d2193-0c3c-4878-a63c-6bbfb24df6e0-O0' for framework c7161dd3-0bbc-4032-92c2-5477082d2c08-0014
Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.787127 30029 slave.cpp:2081] Launching task 'git-default.033d2193-0c3c-4878-a63c-6bbfb24df6e0-O0' for framework c7161dd3-0bbc-4032-92c2-5477082d2c08-0014
Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.789391 30029 paths.cpp:573] Trying to chown '/var/lib/mesos/agent/slaves/816e697d-62d2-465a-bf7c-7b79901e07a3-S4/frameworks/c7161dd3-0bbc-4032-92c2-5477082d2c08-0014/executors/git-default.033d2193-0c3c-4878-a63c-6bbfb24df6e0-O0/runs/c2343739-4252-4778-8902-9bedd514c3cd' to user 'root'
Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.789891 30029 slave.cpp:6933] Launching executor 'git-default.033d2193-0c3c-4878-a63c-6bbfb24df6e0-O0' of framework c7161dd3-0bbc-4032-92c2-5477082d2c08-0014 with resources cpus(*)(allocated: *):0.1; mem(*)(allocated: *):32 in work directory '/var/lib/mesos/agent/slaves/816e697d-62d2-465a-bf7c-7b79901e07a3-S4/frameworks/c7161dd3-0bbc-4032-92c2-5477082d2c08-0014/executors/git-default.033d2193-0c3c-4878-a63c-6bbfb24df6e0-O0/runs/c2343739-4252-4778-8902-9bedd514c3cd'
Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.790630 30029 slave.cpp:2310] Queued task 'git-default.033d2193-0c3c-4878-a63c-6bbfb24df6e0-O0' for executor 'git-default.033d2193-0c3c-4878-a63c-6bbfb24df6e0-O0' of framework c7161dd3-0bbc-4032-92c2-5477082d2c08-0014
Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.790971 30022 docker.cpp:1148] Skipping non-docker container
Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.791677 30028 containerizer.cpp:1001] Starting container c2343739-4252-4778-8902-9bedd514c3cd for executor 'git-default.033d2193-0c3c-4878-a63c-6bbfb24df6e0-O0' of framework c7161dd3-0bbc-4032-92c2-5477082d2c08-0014
Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.799257 30028 provisioner.cpp:453] Provisioning image rootfs '/var/lib/mesos/agent/provisioner/containers/c2343739-4252-4778-8902-9bedd514c3cd/backends/aufs/rootfses/2eed6b86-66f1-46a0-9fc3-1c8b22bff399' for container c2343739-4252-4778-8902-9bedd514c3cd using aufs backend
Jul 18 13:30:33 otpc103 kernel: [673973.912396] general protection fault: [#2] SMP
Jul 18 13:30:33 otpc103 kernel: [673973.912403] Modules linked in: veth ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype iptable_filter ip_tables xt_conntrack x_tables nf_nat nf_conntrack br_netfilter bridge stp llc aufs nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache nvidia_uvm(POE)