Re: Agent Error with Large Docker Images

2017-07-18 Thread tommy xiao
I found this in the log:
using aufs backend

So how about changing the backend fs to overlay?
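
Before switching, it may help to confirm that the kernel on the agent host actually lists overlay. The following is only a minimal sketch: the helper names are made up for illustration, and it assumes a Linux host where /proc/filesystems is readable. A filesystem whose module is not loaded yet will not show up there, so a False result is a hint rather than proof.

```python
from pathlib import Path


def supported_filesystems(proc_text: str) -> set:
    """Parse the text of /proc/filesystems into a set of filesystem names."""
    names = set()
    for line in proc_text.splitlines():
        fields = line.split()
        if fields:
            # Lines look like "nodev\toverlay" or "\text4"; the name is last.
            names.add(fields[-1])
    return names


def kernel_supports(fs_name: str, proc_path: str = "/proc/filesystems") -> bool:
    """Return True if the running kernel currently lists fs_name."""
    try:
        text = Path(proc_path).read_text()
    except OSError:
        # Not Linux, or the file is unreadable: treat as "unknown/unsupported".
        return False
    return fs_name in supported_filesystems(text)


if __name__ == "__main__":
    for name in ("aufs", "overlay"):
        print(name, kernel_supports(name))
```

If overlay is available, the backend is, as far as I know, selected on the agent with the --image_provisioner_backend flag (for example --image_provisioner_backend=overlay); please check the agent configuration docs for your Mesos version.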

2017-07-18 19:49 GMT+08:00 :

> Hi,
>
> We are experiencing a bug on the mesos agent (1.3.0) when trying to
> start large docker images inside a mesos container. I have tried with
> multiple sizes of images and the threshold seems to lie somewhere
> around 4.5 GB. We have experienced this bug using both a custom
> framework (deep-mesos) and marathon. Here is a log of what is happening
> with the agent. This is not happening on smaller images.
>
> Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.784018 30042
> master.cpp:9320] Adding task git-default.033d2193-0c3c-4878-a63c-
> 6bbfb24df6e0-O0 with resources cpus(*)(allocated: *):4;
> mem(*)(allocated: *):25000; gpus(*)(allocated: *):1;
> ports(*)(allocated: *):[31000-31000] on agent 816e697d-62d2-465a-bf7c-
> 7b79901e07a3-S4 at slave(1)@130.92.124.103:5051 (otpc103.unibe.ch)
> Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.784235 30042
> master.cpp:4531] Launching task git-default.033d2193-0c3c-4878-a63c-
> 6bbfb24df6e0-O0 of framework c7161dd3-0bbc-4032-92c2-5477082d2c08-0014
> (Deep Mesos) with resources cpus(*)(allocated: *):4; mem(*)(allocated:
> *):25000; gpus(*)(allocated: *):1; ports(*)(allocated: *):[31000-31000]
> on agent 816e697d-62d2-465a-bf7c-7b79901e07a3-S4 at
> slave(1)@130.92.124.103:5051 (otpc103.unibe.ch)
> Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.785534 30023
> slave.cpp:1613] Got assigned task 'git-default.033d2193-0c3c-4878-a63c-
> 6bbfb24df6e0-O0' for framework c7161dd3-0bbc-4032-92c2-5477082d2c08-
> 0014
> Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.786010 30038
> hierarchical.cpp:850] Updated allocation of framework c7161dd3-0bbc-
> 4032-92c2-5477082d2c08-0014 on agent 816e697d-62d2-465a-bf7c-
> 7b79901e07a3-S4 from gpus(*)(allocated: *):1; cpus(*)(allocated: *):8;
> mem(*)(allocated: *):31099; disk(*)(allocated: *):56156;
> ports(*)(allocated: *):[31000-32000] to gpus(*)(allocated: *):1;
> cpus(*)(allocated: *):8; mem(*)(allocated: *):31099; disk(*)(allocated:
> *):56156; ports(*)(allocated: *):[31000-32000]
> Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.786223 30023
> gc.cpp:83] Unscheduling '/var/lib/mesos/agent/slaves/816e697d-62d2-
> 465a-bf7c-7b79901e07a3-S4/frameworks/c7161dd3-0bbc-4032-92c2-
> 5477082d2c08-0014' from gc
> Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.786487 30023
> slave.cpp:1894] Authorizing task 'git-default.033d2193-0c3c-4878-a63c-
> 6bbfb24df6e0-O0' for framework c7161dd3-0bbc-4032-92c2-5477082d2c08-
> 0014
> Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.787127 30029
> slave.cpp:2081] Launching task 'git-default.033d2193-0c3c-4878-a63c-
> 6bbfb24df6e0-O0' for framework c7161dd3-0bbc-4032-92c2-5477082d2c08-
> 0014
> Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.789391 30029
> paths.cpp:573] Trying to chown '/var/lib/mesos/agent/slaves/816e697d-
> 62d2-465a-bf7c-7b79901e07a3-S4/frameworks/c7161dd3-0bbc-4032-92c2-
> 5477082d2c08-0014/executors/git-default.033d2193-0c3c-4878-a63c-
> 6bbfb24df6e0-O0/runs/c2343739-4252-4778-8902-9bedd514c3cd' to user
> 'root'
> Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.789891 30029
> slave.cpp:6933] Launching executor 'git-default.033d2193-0c3c-4878-
> a63c-6bbfb24df6e0-O0' of framework c7161dd3-0bbc-4032-92c2-
> 5477082d2c08-0014 with resources cpus(*)(allocated: *):0.1;
> mem(*)(allocated: *):32 in work directory
> '/var/lib/mesos/agent/slaves/816e697d-62d2-465a-bf7c-7b79901e07a3-
> S4/frameworks/c7161dd3-0bbc-4032-92c2-5477082d2c08-0014/executors/git-
> default.033d2193-0c3c-4878-a63c-6bbfb24df6e0-O0/runs/c2343739-4252-
> 4778-8902-9bedd514c3cd'
> Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.790630 30029
> slave.cpp:2310] Queued task 'git-default.033d2193-0c3c-4878-a63c-
> 6bbfb24df6e0-O0' for executor 'git-default.033d2193-0c3c-4878-a63c-
> 6bbfb24df6e0-O0' of framework c7161dd3-0bbc-4032-92c2-5477082d2c08-0014
> Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.790971 30022
> docker.cpp:1148] Skipping non-docker container
> Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.791677 30028
> containerizer.cpp:1001] Starting container c2343739-4252-4778-8902-
> 9bedd514c3cd for executor 'git-default.033d2193-0c3c-4878-a63c-
> 6bbfb24df6e0-O0' of framework c7161dd3-0bbc-4032-92c2-5477082d2c08-0014
> Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.799257 30028
> provisioner.cpp:453] Provisioning image rootfs
> '/var/lib/mesos/agent/provisioner/containers/c2343739-4252-4778-8902-
> 9bedd514c3cd/backends/aufs/rootfses/2eed6b86-66f1-46a0-9fc3-
> 1c8b22bff399' for container c2343739-4252-4778-8902-9bedd514c3cd using
> aufs backend
> Jul 18 13:30:33 otpc103 kernel: [673973.912396] general protection
> fault:  [#2] SMP
> Jul 18 13:30:33 otpc103 kernel: [673973.912403] Modules linked in: veth
> ipt_MASQUERADE nf_nat_masquerade_ipv4 

Re: Libraries

2017-07-18 Thread Hendrik Haddorp

Hi,

I built a small proxy for that. The required Mesos API is quite small, so 
I just created my own SchedulerDriver implementation that sends the calls to a 
proxy component running in a Docker container. In there I can 
easily have the native dependency. So a proxy scheduler is running in 
there and forwards the calls coming from Mesos to the proxy 
SchedulerDriver. For the communication I used websockets. As the API 
uses protocol buffers, you can quite easily send them over the wire. This 
way I can run my code even on Windows.
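
To illustrate the idea of shipping the calls over the wire, here is a minimal sketch of one possible message envelope. It is not the format used in the proxy described above; pack_call and unpack_call are hypothetical names, and the payload is assumed to be the already serialized protocol buffer bytes. Each packed message would be sent as a single binary websocket frame.

```python
import struct


def pack_call(method: str, payload: bytes) -> bytes:
    """Build one binary message: 2-byte name length, name, then payload."""
    name = method.encode("utf-8")
    # "!H" is an unsigned 16-bit integer in network byte order.
    return struct.pack("!H", len(name)) + name + payload


def unpack_call(message: bytes):
    """Split a message produced by pack_call back into (method, payload)."""
    (name_len,) = struct.unpack_from("!H", message, 0)
    name = message[2:2 + name_len].decode("utf-8")
    return name, message[2 + name_len:]


if __name__ == "__main__":
    # The payload here is placeholder bytes standing in for a serialized message.
    frame = pack_call("resourceOffers", b"\x08\x01")
    print(unpack_call(frame))
```

The receiving side looks up the method name and parses the payload with the matching protocol buffer class, so neither end needs to know anything about the transport beyond "one frame, one call".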


regards,
Hendrik

On 10.07.2017 13:48, Oeg Bizz wrote:

All,
I am continuing to develop a Java framework using the Java API 
rather than the HTTP API (long story here). The ultimate goal is to 
run the framework in its own container, but in the meantime I am 
constantly making updates to it. So, I would like to be able to run 
the framework on my CentOS 7 node (no containers) until it is ready. I 
have a mesos-master and a mesos-slave running in different containers. 
How can I run the framework WITHOUT installing Mesos on my computer? 
I want to use the libraries contained in the Mesos Docker containers; 
is that possible? I tried mounting a volume, copying libmesos 
from the container to that volume, and setting 
MESOS_NATIVE_JAVA_LIBRARY accordingly, but it complains about not 
finding the libsvn_delta library. Is there a list somewhere of all 
the .so files I need to expose? Is what I am trying to do absurd, and 
should I just install Mesos? I know installing Mesos is the easy 
answer, but it would be nice to run Mesos in containers rather than on 
the computer. Thanks for your help,


Oscar
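
On the question of which .so files are needed: running ldd against the copied library prints every shared object it depends on and marks the ones the loader cannot resolve as "not found". A minimal sketch that collects those entries follows; the function names and the example path are hypothetical, and it assumes ldd is available on the host.

```python
import subprocess


def missing_libraries(ldd_output: str) -> list:
    """Return the library names that ldd reported as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # Unresolved entries look like "\tlibfoo.so.1 => not found".
        if "=> not found" in line:
            missing.append(line.split("=>")[0].strip())
    return missing


def check_library(path: str) -> list:
    """Run ldd on path and return its unresolved dependencies."""
    result = subprocess.run(
        ["ldd", path], capture_output=True, text=True, check=False
    )
    return missing_libraries(result.stdout)


if __name__ == "__main__":
    # Hypothetical location of the library copied out of the container.
    print(check_library("/mnt/mesos-libs/libmesos.so"))
```

Each name it returns would also have to be copied out of the container (or installed on the host) and made visible to the loader, for example through LD_LIBRARY_PATH, and the check repeated until the list is empty.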




Agent Error with Large Docker Images

2017-07-18 Thread thomas.kurmann
Hi,

We are experiencing a bug on the mesos agent (1.3.0) when trying to
start large docker images inside a mesos container. I have tried with
multiple sizes of images and the threshold seems to lie somewhere
around 4.5 GB. We have experienced this bug using both a custom
framework (deep-mesos) and marathon. Here is a log of what is happening
with the agent. This is not happening on smaller images. 

Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.784018 30042
master.cpp:9320] Adding task git-default.033d2193-0c3c-4878-a63c-
6bbfb24df6e0-O0 with resources cpus(*)(allocated: *):4;
mem(*)(allocated: *):25000; gpus(*)(allocated: *):1;
ports(*)(allocated: *):[31000-31000] on agent 816e697d-62d2-465a-bf7c-
7b79901e07a3-S4 at slave(1)@130.92.124.103:5051 (otpc103.unibe.ch)
Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.784235 30042
master.cpp:4531] Launching task git-default.033d2193-0c3c-4878-a63c-
6bbfb24df6e0-O0 of framework c7161dd3-0bbc-4032-92c2-5477082d2c08-0014
(Deep Mesos) with resources cpus(*)(allocated: *):4; mem(*)(allocated:
*):25000; gpus(*)(allocated: *):1; ports(*)(allocated: *):[31000-31000] 
on agent 816e697d-62d2-465a-bf7c-7b79901e07a3-S4 at
slave(1)@130.92.124.103:5051 (otpc103.unibe.ch)
Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.785534 30023
slave.cpp:1613] Got assigned task 'git-default.033d2193-0c3c-4878-a63c-
6bbfb24df6e0-O0' for framework c7161dd3-0bbc-4032-92c2-5477082d2c08-
0014
Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.786010 30038
hierarchical.cpp:850] Updated allocation of framework c7161dd3-0bbc-
4032-92c2-5477082d2c08-0014 on agent 816e697d-62d2-465a-bf7c-
7b79901e07a3-S4 from gpus(*)(allocated: *):1; cpus(*)(allocated: *):8;
mem(*)(allocated: *):31099; disk(*)(allocated: *):56156;
ports(*)(allocated: *):[31000-32000] to gpus(*)(allocated: *):1;
cpus(*)(allocated: *):8; mem(*)(allocated: *):31099; disk(*)(allocated:
*):56156; ports(*)(allocated: *):[31000-32000]
Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.786223 30023
gc.cpp:83] Unscheduling '/var/lib/mesos/agent/slaves/816e697d-62d2-
465a-bf7c-7b79901e07a3-S4/frameworks/c7161dd3-0bbc-4032-92c2-
5477082d2c08-0014' from gc
Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.786487 30023
slave.cpp:1894] Authorizing task 'git-default.033d2193-0c3c-4878-a63c-
6bbfb24df6e0-O0' for framework c7161dd3-0bbc-4032-92c2-5477082d2c08-
0014
Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.787127 30029
slave.cpp:2081] Launching task 'git-default.033d2193-0c3c-4878-a63c-
6bbfb24df6e0-O0' for framework c7161dd3-0bbc-4032-92c2-5477082d2c08-
0014
Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.789391 30029
paths.cpp:573] Trying to chown '/var/lib/mesos/agent/slaves/816e697d-
62d2-465a-bf7c-7b79901e07a3-S4/frameworks/c7161dd3-0bbc-4032-92c2-
5477082d2c08-0014/executors/git-default.033d2193-0c3c-4878-a63c-
6bbfb24df6e0-O0/runs/c2343739-4252-4778-8902-9bedd514c3cd' to user
'root'
Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.789891 30029
slave.cpp:6933] Launching executor 'git-default.033d2193-0c3c-4878-
a63c-6bbfb24df6e0-O0' of framework c7161dd3-0bbc-4032-92c2-
5477082d2c08-0014 with resources cpus(*)(allocated: *):0.1;
mem(*)(allocated: *):32 in work directory
'/var/lib/mesos/agent/slaves/816e697d-62d2-465a-bf7c-7b79901e07a3-
S4/frameworks/c7161dd3-0bbc-4032-92c2-5477082d2c08-0014/executors/git-
default.033d2193-0c3c-4878-a63c-6bbfb24df6e0-O0/runs/c2343739-4252-
4778-8902-9bedd514c3cd'
Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.790630 30029
slave.cpp:2310] Queued task 'git-default.033d2193-0c3c-4878-a63c-
6bbfb24df6e0-O0' for executor 'git-default.033d2193-0c3c-4878-a63c-
6bbfb24df6e0-O0' of framework c7161dd3-0bbc-4032-92c2-5477082d2c08-0014
Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.790971 30022
docker.cpp:1148] Skipping non-docker container
Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.791677 30028
containerizer.cpp:1001] Starting container c2343739-4252-4778-8902-
9bedd514c3cd for executor 'git-default.033d2193-0c3c-4878-a63c-
6bbfb24df6e0-O0' of framework c7161dd3-0bbc-4032-92c2-5477082d2c08-0014
Jul 18 13:30:33 otpc103 rc.local[29950]: I0718 13:30:33.799257 30028
provisioner.cpp:453] Provisioning image rootfs
'/var/lib/mesos/agent/provisioner/containers/c2343739-4252-4778-8902-
9bedd514c3cd/backends/aufs/rootfses/2eed6b86-66f1-46a0-9fc3-
1c8b22bff399' for container c2343739-4252-4778-8902-9bedd514c3cd using
aufs backend
Jul 18 13:30:33 otpc103 kernel: [673973.912396] general protection
fault:  [#2] SMP 
Jul 18 13:30:33 otpc103 kernel: [673973.912403] Modules linked in: veth
ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink nfnetlink
xfrm_user xfrm_algo iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4
nf_nat_ipv4 xt_addrtype iptable_filter ip_tables xt_conntrack x_tables
nf_nat nf_conntrack br_netfilter bridge stp llc aufs nfsv3 nfs_acl
rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache
nvidia_uvm(POE)