Here is the log:
I0829 14:24:21.727960 2679 main.cpp:223] Build: 2016-08-28 13:39:46 by root
I0829 14:24:21.728159 2679 main.cpp:225] Version: 0.28.2
I0829 14:24:21.733256 2679 containerizer.cpp:149] Using isolation: posix/cpu,posix/mem,filesystem/posix
I0829 14:24:21.738895 2679 linux_launcher.cpp:101] Using /sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
I0829 14:24:21.748019 2679 main.cpp:328] Starting Mesos slave
I0829 14:24:21.750063 2679 slave.cpp:193] Slave started on 1)@ 128.226.116.69:8082
I0829 14:24:21.750114 2679 slave.cpp:194] Flags at startup: --advertise_ip="128.226.116.69" --appc_simple_discovery_uri_prefix="http://" --appc_store_dir="/tmp/mesos/store/appc" --authenticatee="crammd5" --cgroups_cpu_enable_pids_and_tids_count="false" --cgroups_enable_cfs="false" --cgroups_hierarchy="/sys/fs/cgroup" --cgroups_limit_swap="false" --cgroups_root="mesos" --container_disk_watch_interval="15secs" --containerizers="mesos" --default_role="*" --disk_watch_interval="1mins" --docker="docker" --docker_kill_orphans="true" --docker_registry="https://registry-1.docker.io" --docker_remove_delay="6hrs" --docker_socket="/var/run/docker.sock" --docker_stop_timeout="0ns" --docker_store_dir="/tmp/mesos/store/docker" --enforce_container_disk_quota="false" --executor_registration_timeout="1mins" --executor_shutdown_grace_period="5secs" --fetcher_cache_dir="/tmp/mesos/fetch" --fetcher_cache_size="2GB" --frameworks_home="" --gc_delay="1weeks" --gc_disk_headroom="0.1" --hadoop_home="" --help="false" --hostname_lookup="true" --image_provisioner_backend="copy" --initialize_driver_logging="true" --isolation="posix/cpu,posix/mem" --launcher_dir="/home/pankaj/mesos-0.28.2/build/src" --logbufsecs="0" --logging_level="INFO" --master="129.114.110.143:5050" --oversubscribed_resources_interval="15secs" --perf_duration="10secs" --perf_interval="1mins" --port="8082" --qos_correction_interval_min="0ns" --quiet="false" --recover="reconnect" --recovery_timeout="15mins" --registration_backoff_factor="1secs" --revocable_cpu_low_priority="true" --sandbox_directory="/mnt/mesos/sandbox" --strict="true" --switch_user="true" --systemd_enable_support="true" --systemd_runtime_directory="/run/systemd/system" --version="false" --work_dir="/tmp/mesos"
I0829 14:24:21.753572 2679 slave.cpp:464] Slave resources: cpus(*):2; mem(*):2855; disk(*):84691; ports(*):[8081-8081]
I0829 14:24:21.753706 2679 slave.cpp:472] Slave attributes: [ ]
I0829 14:24:21.753762 2679 slave.cpp:477] Slave hostname: venom.cs.binghamton.edu
I0829 14:24:21.770992 2696 state.cpp:58] Recovering state from '/tmp/mesos/meta'
I0829 14:24:21.771304 2696 state.cpp:698] No checkpointed resources found at '/tmp/mesos/meta/resources/resources.info'
I0829 14:24:21.771644 2696 state.cpp:101] Failed to find the latest slave from '/tmp/mesos/meta'
I0829 14:24:21.772583 2696 status_update_manager.cpp:200] Recovering status update manager
I0829 14:24:21.773082 2698 containerizer.cpp:407] Recovering containerizer
I0829 14:24:21.777489 2702 provisioner.cpp:245] Provisioner recovery complete
I0829 14:24:21.778149 2699 slave.cpp:4550] Finished recovery
I0829 14:24:21.779564 2699 slave.cpp:796] New master detected at [email protected]:5050
I0829 14:24:21.779742 2697 status_update_manager.cpp:174] Pausing sending status updates
I0829 14:24:21.780607 2699 slave.cpp:821] No credentials provided. Attempting to register without authentication
I0829 14:24:21.781394 2699 slave.cpp:832] Detecting new master
I0829 14:24:22.698812 2702 slave.cpp:971] Registered with master [email protected]:5050; given slave ID d6f0e3e2-d144-4275-9d38-82327408622b-S12
I0829 14:24:22.699113 2698 status_update_manager.cpp:181] Resuming sending status updates
I0829 14:24:22.700258 2702 slave.cpp:1030] Forwarding total oversubscribed resources
I0829 14:24:43.638958 2695 http.cpp:190] HTTP GET for /slave(1)/state from 128.226.119.78:59261 with User-Agent='Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.86 Safari/537.36'
I0829 14:25:21.764268 2702 slave.cpp:4359] Current disk usage 9.67%. Max allowed age: 5.622868987169502days
I0829 14:26:21.778849 2695 slave.cpp:4359] Current disk usage 9.67%. Max allowed age: 5.622860462326585days
I0829 14:26:38.271085 2698 slave.cpp:1361] Got assigned task test.1fb85a35-6e16-11e6-bec9-c27afc834a0c for framework c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000
I0829 14:26:38.311063 2698 slave.cpp:1480] Launching task test.1fb85a35-6e16-11e6-bec9-c27afc834a0c for framework c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000
I0829 14:26:38.314755 2698 paths.cpp:528] Trying to chown '/tmp/mesos/slaves/d6f0e3e2-d144-4275-9d38-82327408622b-S12/frameworks/c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000/executors/test.1fb85a35-6e16-11e6-bec9-c27afc834a0c/runs/dff399f0-beb1-4c49-bd8e-c19621de2f71' to user 'root'
I0829 14:26:38.320300 2698 slave.cpp:5352] Launching executor test.1fb85a35-6e16-11e6-bec9-c27afc834a0c of framework c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000 with resources cpus(*):0.1; mem(*):32 in work directory '/tmp/mesos/slaves/d6f0e3e2-d144-4275-9d38-82327408622b-S12/frameworks/c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000/executors/test.1fb85a35-6e16-11e6-bec9-c27afc834a0c/runs/dff399f0-beb1-4c49-bd8e-c19621de2f71'
I0829 14:26:38.321523 2702 containerizer.cpp:666] Starting container 'dff399f0-beb1-4c49-bd8e-c19621de2f71' for executor 'test.1fb85a35-6e16-11e6-bec9-c27afc834a0c' of framework 'c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000'
I0829 14:26:38.322588 2698 slave.cpp:1698] Queuing task 'test.1fb85a35-6e16-11e6-bec9-c27afc834a0c' for executor 'test.1fb85a35-6e16-11e6-bec9-c27afc834a0c' of framework c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000
I0829 14:26:38.358906 2702 linux_launcher.cpp:304] Cloning child process with flags =
I0829 14:26:38.366492 2702 containerizer.cpp:1179] Checkpointing executor's forked pid 2758 to '/tmp/mesos/meta/slaves/d6f0e3e2-d144-4275-9d38-82327408622b-S12/frameworks/c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000/executors/test.1fb85a35-6e16-11e6-bec9-c27afc834a0c/runs/dff399f0-beb1-4c49-bd8e-c19621de2f71/pids/forked.pid'
I0829 14:27:21.779755 2701 slave.cpp:4359] Current disk usage 9.67%. Max allowed age: 5.622850415190289days
I0829 14:27:38.322805 2700 slave.cpp:4307] *Terminating executor ''test.1fb85a35-6e16-11e6-bec9-c27afc834a0c' of framework c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000' because it did not register within 1mins
I0829 14:27:38.323226 2700 containerizer.cpp:1453] Destroying container 'dff399f0-beb1-4c49-bd8e-c19621de2f71'*
I0829 14:27:38.329186 2702 cgroups.cpp:2427] Freezing cgroup /sys/fs/cgroup/freezer/mesos/dff399f0-beb1-4c49-bd8e-c19621de2f71
I0829 14:27:38.331509 2699 cgroups.cpp:1409] Successfully froze cgroup /sys/fs/cgroup/freezer/mesos/dff399f0-beb1-4c49-bd8e-c19621de2f71 after 2.19392ms
I0829 14:27:38.334520 2698 cgroups.cpp:2445] Thawing cgroup /sys/fs/cgroup/freezer/mesos/dff399f0-beb1-4c49-bd8e-c19621de2f71
I0829 14:27:38.337821 2698 cgroups.cpp:1438] Successfullly thawed cgroup /sys/fs/cgroup/freezer/mesos/dff399f0-beb1-4c49-bd8e-c19621de2f71 after 3.194112ms
I0829 14:27:38.435214 2696 containerizer.cpp:1689] Executor for container 'dff399f0-beb1-4c49-bd8e-c19621de2f71' has exited
I0829 14:27:38.441556 2695 provisioner.cpp:306] Ignoring destroy request for unknown container dff399f0-beb1-4c49-bd8e-c19621de2f71
I0829 14:27:38.442186 2695 slave.cpp:3871] Executor 'test.1fb85a35-6e16-11e6-bec9-c27afc834a0c' of framework c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000 terminated with signal Killed
I0829 14:27:38.445689 2695 slave.cpp:3012] Handling status update TASK_FAILED (UUID: 154a67ea-f6d5-4e9c-ad3d-9b09161ba34d) for task test.1fb85a35-6e16-11e6-bec9-c27afc834a0c of framework c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000 from @0.0.0.0:0
W0829 14:27:38.447599 2702 containerizer.cpp:1295] Ignoring update for unknown container: dff399f0-beb1-4c49-bd8e-c19621de2f71
I0829 14:27:38.448391 2702 status_update_manager.cpp:320] Received status update TASK_FAILED (UUID: 154a67ea-f6d5-4e9c-ad3d-9b09161ba34d) for task test.1fb85a35-6e16-11e6-bec9-c27afc834a0c of framework c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000
I0829 14:27:38.449525 2702 status_update_manager.cpp:824] Checkpointing UPDATE for status update TASK_FAILED (UUID: 154a67ea-f6d5-4e9c-ad3d-9b09161ba34d) for task test.1fb85a35-6e16-11e6-bec9-c27afc834a0c of framework c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000
I0829 14:27:38.523027 2696 slave.cpp:3410] Forwarding the update TASK_FAILED (UUID: 154a67ea-f6d5-4e9c-ad3d-9b09161ba34d) for task test.1fb85a35-6e16-11e6-bec9-c27afc834a0c of framework c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000 to [email protected]:5050
I0829 14:27:38.627722 2698 status_update_manager.cpp:392] Received status update acknowledgement (UUID: 154a67ea-f6d5-4e9c-ad3d-9b09161ba34d) for task test.1fb85a35-6e16-11e6-bec9-c27afc834a0c of framework c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000
I0829 14:27:38.627943 2698 status_update_manager.cpp:824] Checkpointing ACK for status update TASK_FAILED (UUID: 154a67ea-f6d5-4e9c-ad3d-9b09161ba34d) for task test.1fb85a35-6e16-11e6-bec9-c27afc834a0c of framework c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000
I0829 14:27:38.698822 2701 slave.cpp:3975] Cleaning up executor 'test.1fb85a35-6e16-11e6-bec9-c27afc834a0c' of framework c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000
I0829 14:27:38.699582 2698 gc.cpp:55] Scheduling '/tmp/mesos/slaves/d6f0e3e2-d144-4275-9d38-82327408622b-S12/frameworks/c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000/executors/test.1fb85a35-6e16-11e6-bec9-c27afc834a0c/runs/dff399f0-beb1-4c49-bd8e-c19621de2f71' for gc 6.99999190486222days in the future
I0829 14:27:38.700202 2698 gc.cpp:55] Scheduling '/tmp/mesos/slaves/d6f0e3e2-d144-4275-9d38-82327408622b-S12/frameworks/c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000/executors/test.1fb85a35-6e16-11e6-bec9-c27afc834a0c' for gc 6.99999190029037days in the future
I0829 14:27:38.700382 2701 slave.cpp:4063] Cleaning up framework c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000
I0829 14:27:38.700443 2698 gc.cpp:55] Scheduling '/tmp/mesos/meta/slaves/d6f0e3e2-d144-4275-9d38-82327408622b-S12/frameworks/c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000/executors/test.1fb85a35-6e16-11e6-bec9-c27afc834a0c/runs/dff399f0-beb1-4c49-bd8e-c19621de2f71' for gc 6.99999189796148days in the future
I0829 14:27:38.700649 2698 gc.cpp:55] Scheduling '/tmp/mesos/meta/slaves/d6f0e3e2-d144-4275-9d38-82327408622b-S12/frameworks/c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000/executors/test.1fb85a35-6e16-11e6-bec9-c27afc834a0c' for gc 6.99999189622815days in the future
I0829 14:27:38.700845 2698 gc.cpp:55] Scheduling '/tmp/mesos/slaves/d6f0e3e2-d144-4275-9d38-82327408622b-S12/frameworks/c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000' for gc 6.99999189143704days in the future
I0829 14:27:38.701015 2701 status_update_manager.cpp:282] Closing status update streams for framework c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000
I0829 14:27:38.701161 2698 gc.cpp:55] Scheduling '/tmp/mesos/meta/slaves/d6f0e3e2-d144-4275-9d38-82327408622b-S12/frameworks/c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000' for gc 6.9999918900237days in the future
I0829 14:27:39.651463 2697 slave.cpp:1361] Got assigned task test.445696e6-6e16-11e6-bec9-c27afc834a0c for framework c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000
I0829 14:27:39.655815 2696 gc.cpp:83] Unscheduling '/tmp/mesos/slaves/d6f0e3e2-d144-4275-9d38-82327408622b-S12/frameworks/c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000' from gc
I0829 14:27:39.656445 2696 gc.cpp:83] Unscheduling '/tmp/mesos/meta/slaves/d6f0e3e2-d144-4275-9d38-82327408622b-S12/frameworks/c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000' from gc
I0829 14:27:39.656855 2702 slave.cpp:1480] Launching task test.445696e6-6e16-11e6-bec9-c27afc834a0c for framework c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000
I0829 14:27:39.660585 2702 paths.cpp:528] Trying to chown '/tmp/mesos/slaves/d6f0e3e2-d144-4275-9d38-82327408622b-S12/frameworks/c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000/executors/test.445696e6-6e16-11e6-bec9-c27afc834a0c/runs/ea676570-0a2a-49c3-a75c-14e045eb842b' to user 'root'
I0829 14:27:39.666008 2702 slave.cpp:5352] Launching executor test.445696e6-6e16-11e6-bec9-c27afc834a0c of framework c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000 with resources cpus(*):0.1; mem(*):32 in work directory '/tmp/mesos/slaves/d6f0e3e2-d144-4275-9d38-82327408622b-S12/frameworks/c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000/executors/test.445696e6-6e16-11e6-bec9-c27afc834a0c/runs/ea676570-0a2a-49c3-a75c-14e045eb842b'
I0829 14:27:39.667603 2702 slave.cpp:1698] Queuing task 'test.445696e6-6e16-11e6-bec9-c27afc834a0c' for executor 'test.445696e6-6e16-11e6-bec9-c27afc834a0c' of framework c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000
I0829 14:27:39.668207 2702 containerizer.cpp:666] Starting container 'ea676570-0a2a-49c3-a75c-14e045eb842b' for executor 'test.445696e6-6e16-11e6-bec9-c27afc834a0c' of framework 'c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000'
I0829 14:27:39.678665 2702 linux_launcher.cpp:304] Cloning child process with flags =
I0829 14:27:39.681824 2702 containerizer.cpp:1179] Checkpointing executor's forked pid 2799 to '/tmp/mesos/meta/slaves/d6f0e3e2-d144-4275-9d38-82327408622b-S12/frameworks/c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000/executors/test.445696e6-6e16-11e6-bec9-c27afc834a0c/runs/ea676570-0a2a-49c3-a75c-14e045eb842b/pids/forked.pid'

Thanks
Pankaj

On Mon, Aug 29, 2016 at 7:25 AM, haosdent <[email protected]> wrote:

> Hi, @Pankaj, could you provide the logs from when "the job is getting restarted
> and a new container is created with a new process id."? The logs you
> provided look normal.
>
> On Mon, Aug 29, 2016 at 5:26 AM, Pankaj Saha <[email protected]>
> wrote:
>
> > Hi
> > I am facing an issue with jobs launched on my Mesos agents. I am trying
> > to launch a job through the Marathon framework, and the job stays in the
> > staged state and never runs.
> > I could see the log messages below at the agent console:
> >
> > Scheduling '/var/lib/mesos-8082/meta/slaves/d6f0e3e2-d144-4275-9d38-82327408622b-S8/frameworks/c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000' for gc 6.99999884239407days in the future
> > I0828 16:20:36.053483 28512 slave.cpp:1361] *Got assigned task test-crixus*.eb66a42b-6d5c-11e6-bec9-c27afc834a0c for framework c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000
> > I0828 16:20:36.056224 28510 gc.cpp:83] Unscheduling '/var/lib/mesos-8082/slaves/d6f0e3e2-d144-4275-9d38-82327408622b-S8/frameworks/c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000' from gc
> > I0828 16:20:36.056715 28510 gc.cpp:83] Unscheduling '/var/lib/mesos-8082/meta/slaves/d6f0e3e2-d144-4275-9d38-82327408622b-S8/frameworks/c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000' from gc
> > I0828 16:20:36.057231 28509 slave.cpp:1480] *Launching task test-crixus*.eb66a42b-6d5c-11e6-bec9-c27afc834a0c for framework c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000
> > I0828 16:20:36.058661 28509 paths.cpp:528]* Trying to chown* '/var/lib/mesos-8082/slaves/d6f0e3e2-d144-4275-9d38-82327408622b-S8/frameworks/c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000/executors/test-crixus.eb66a42b-6d5c-11e6-bec9-c27afc834a0c/runs/99620406-87b5-406c-a88b-13adb145c12d' to user 'root'
> > I0828 16:20:36.067807 28509 slave.cpp:5352]* Launching executor test-crixus*.eb66a42b-6d5c-11e6-bec9-c27afc834a0c of framework c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000 with resources cpus(*):0.1; mem(*):32 in work directory '/var/lib/mesos-8082/slaves/d6f0e3e2-d144-4275-9d38-82327408622b-S8/frameworks/c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000/executors/test-crixus.eb66a42b-6d5c-11e6-bec9-c27afc834a0c/runs/99620406-87b5-406c-a88b-13adb145c12d'
> > I0828 16:20:36.069314 28509 slave.cpp:1698] *Queuing task 'test-crixus.*eb66a42b-6d5c-11e6-bec9-c27afc834a0c' for executor 'test-crixus.eb66a42b-6d5c-11e6-bec9-c27afc834a0c' of framework c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000
> > I0828 16:20:36.069902 28509 containerizer.cpp:666] *Starting container* '99620406-87b5-406c-a88b-13adb145c12d' for executor 'test-crixus.eb66a42b-6d5c-11e6-bec9-c27afc834a0c' of framework 'c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000'
> > I0828 16:20:36.080713 28509 linux_launcher.cpp:304] *Cloning child process* with flags =
> > I0828 16:20:36.084738 28509 containerizer.cpp:1179] *Checkpointing executor's forked pid 29629* to '/var/lib/mesos-8082/meta/slaves/d6f0e3e2-d144-4275-9d38-82327408622b-S8/frameworks/c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000/executors/test-crixus.eb66a42b-6d5c-11e6-bec9-c27afc834a0c/runs/99620406-87b5-406c-a88b-13adb145c12d/pids/forked.pid'
> >
> >
> > But after that, the job is getting restarted and a new container is created
> > with a new process id. This repeats indefinitely, which keeps the job in
> > the staged state on the Mesos master.
> >
> > This job is nothing but a simple echo "hello world" kind of shell command.
> > Can anyone please point out where it's failing or what I am doing wrong?
> >
> >
> >
> > Thanks
> > Pankaj
> >
> >
>
> --
> Best Regards,
> Haosdent Huang
>
