[ 
https://issues.apache.org/jira/browse/MESOS-10227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17420360#comment-17420360
 ] 

Charles Natali commented on MESOS-10227:
----------------------------------------

Hi [~barrylee],

Sorry for the delay.
Is this still a problem?
The log you provided is truncated; it would be useful to get:
- the agent logs from when the task is started
- the executor log
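
For reference, a minimal sketch of how those two sets of files could be located on the agent host. It assumes the `--log_dir=/var/log/mesos` and `--work_dir=/var/lib/mesos` values from the agent command line quoted below, that the agent log files are named `mesos-agent.*`, and that the executor's `stdout`/`stderr` live in its sandbox run directories (the layout visible in the gc log lines). The helper names `agent_log_files` and `executor_log_files` are hypothetical, and the sandbox may already have been garbage-collected.

```python
from pathlib import Path
from typing import List

# Assumptions: both paths are taken from the --log_dir and --work_dir flags
# in the quoted agent command line; adjust them for other setups.
LOG_DIR = Path("/var/log/mesos")
WORK_DIR = Path("/var/lib/mesos")


def agent_log_files(log_dir: Path) -> List[Path]:
    """Return files directly under log_dir whose name starts with 'mesos-agent.'.

    The 'mesos-agent.*' prefix is an assumption about how the agent names
    its log files.
    """
    return sorted(p for p in log_dir.glob("mesos-agent.*") if p.is_file())


def executor_log_files(work_dir: Path, executor_id: str) -> List[Path]:
    """Return the stdout/stderr files from every sandbox run of executor_id.

    The directory layout mirrors the sandbox paths shown in the agent log:
    slaves/<agent>/frameworks/<framework>/executors/<executor>/runs/<container>.
    A symlinked run directory, if present, may list the same files twice.
    """
    pattern = f"slaves/*/frameworks/*/executors/{executor_id}/runs/*/std*"
    return sorted(
        p for p in work_dir.glob(pattern)
        if p.name in ("stdout", "stderr") and p.is_file()
    )


if __name__ == "__main__":
    # 'gpu-test' is the executor name from the quoted mesos-execute command.
    for path in agent_log_files(LOG_DIR) + executor_log_files(WORK_DIR, "gpu-test"):
        print(path)
```

Running it on the agent host prints the candidate files to attach; an empty result only means nothing matched the assumed layout.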



> After mesos-agent starts, mesos-execute fails to be executed using the GPU
> --------------------------------------------------------------------------
>
>                 Key: MESOS-10227
>                 URL: https://issues.apache.org/jira/browse/MESOS-10227
>             Project: Mesos
>          Issue Type: Task
>          Components: agent
>    Affects Versions: 1.11.0
>         Environment: mesos-agent \
> --master=zk://192.168.10.191:2181,192.168.10.192:2181,192.168.10.193:2181/mesos \
> --log_dir=/var/log/mesos --containerizers=docker,mesos \
> --executor_registration_timeout=5mins \
> --hostname=192.168.10.19 \
> --ip=192.168.10.19 \
> --port=5051 \
> --work_dir=/var/lib/mesos \
> --image_providers=docker \
> --executor_environment_variables="{}" \
> --isolation="docker/runtime,filesystem/linux,cgroups/devices,gpu/nvidia"
>  
>  
> mesos-execute \
> --master=zk://192.168.10.191:2181,192.168.10.192:2181,192.168.10.193:2181/mesos \
> --name=gpu-test \
> --docker_image=nvidia/cuda \
> --command="nvidia-smi" \
> --framework_capabilities="GPU_RESOURCES" \
> --resources="gpus:1"
>  
>            Reporter: barry lee
>            Priority: Major
>             Fix For: 1.11.0
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> I0819 18:14:26.088129 9337 containerizer.cpp:3414] Transitioning the state of 
> container fab468e6-bcbd-499c-9c24-ccd572c8317b from PROVISIONING to 
> DESTROYING after 2.207289088secs
> I0819 18:14:26.089609 9339 slave.cpp:7100] Executor 'gpu-test' of framework 
> d5cb56f3-1f2f-49e6-b63b-a401e445104d-0027 has terminated with unknown status
> I0819 18:14:26.091435 9339 slave.cpp:5981] Handling status update TASK_FAILED 
> (Status UUID: 0abd4e4b-59a6-4610-b624-05762ab9fc17) for task gpu-test of 
> framework d5cb56f3-1f2f-49e6-b63b-a401e445104d-0027 from @0.0.0.0:0
> E0819 18:14:26.092530 9346 slave.cpp:6357] Failed to update resources for 
> container fab468e6-bcbd-499c-9c24-ccd572c8317b of executor 'gpu-test' running 
> task gpu-test on status update for terminal task, destroying container: 
> Container not found
> W0819 18:14:26.092737 9341 composing.cpp:614] Attempted to destroy unknown 
> container fab468e6-bcbd-499c-9c24-ccd572c8317b
> I0819 18:14:26.092895 9331 task_status_update_manager.cpp:328] Received task 
> status update TASK_FAILED (Status UUID: 0abd4e4b-59a6-4610-b624-05762ab9fc17) 
> for task gpu-test of framework d5cb56f3-1f2f-49e6-b63b-a401e445104d-0027
> I0819 18:14:26.093626 9333 slave.cpp:6527] Forwarding the update TASK_FAILED 
> (Status UUID: 0abd4e4b-59a6-4610-b624-05762ab9fc17) for task gpu-test of 
> framework d5cb56f3-1f2f-49e6-b63b-a401e445104d-0027 to 
> master@192.168.10.192:5050
> I0819 18:14:26.102195 9342 slave.cpp:4310] Shutting down framework 
> d5cb56f3-1f2f-49e6-b63b-a401e445104d-0027
> I0819 18:14:26.102257 9342 slave.cpp:7218] Cleaning up executor 'gpu-test' of 
> framework d5cb56f3-1f2f-49e6-b63b-a401e445104d-0027
> I0819 18:14:26.102448 9332 gc.cpp:95] Scheduling 
> '/var/lib/mesos/slaves/d5cb56f3-1f2f-49e6-b63b-a401e445104d-S125/frameworks/d5cb56f3-1f2f-49e6-b63b-a401e445104d-0027/executors/gpu-test/runs/fab468e6-bcbd-499c-9c24-ccd572c8317b'
>  for gc 6.9999988156days in the future
> I0819 18:14:26.102600 9332 gc.cpp:95] Scheduling 
> '/var/lib/mesos/slaves/d5cb56f3-1f2f-49e6-b63b-a401e445104d-S125/frameworks/d5cb56f3-1f2f-49e6-b63b-a401e445104d-0027/executors/gpu-test'
>  for gc 6.99999881303111days in the future
> I0819 18:14:26.102725 9342 slave.cpp:7347] Cleaning up framework 
> d5cb56f3-1f2f-49e6-b63b-a401e445104d-0027
> I0819 18:14:26.102805 9335 task_status_update_manager.cpp:289] Closing task 
> status update streams for framework d5cb56f3-1f2f-49e6-b63b-a401e445104d-0027
> I0819 18:14:26.102901 9342 gc.cpp:95] Scheduling 
> '/var/lib/mesos/slaves/d5cb56f3-1f2f-49e6-b63b-a401e445104d-S125/frameworks/d5cb56f3-1f2f-49e6-b63b-a401e445104d-0027'
>  for gc 6.99999881020741days in the future
> I0819 18:14:34.385221 9334 http.cpp:1436] HTTP GET for 
> /files/browse?path=%2Fvar%2Flib%2Fmesos%2Fslaves%2Fd5cb56f3-1f2f-49e6-b63b-a401e445104d-S125&jsonp=angular.callbacks._67
>  from 192.168.110.142:11640 with User-Agent='Mozilla/5.0 (Windows NT 10.0; 
> Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 
> Safari/537.36'
> I0819 18:14:45.385519 9344 http.cpp:1436] HTTP GET for 
> /files/browse?path=%2Fvar%2Flib%2Fmesos%2Fslaves%2Fd5cb56f3-1f2f-49e6-b63b-a401e445104d-S125&jsonp=angular.callbacks._6a
>  from 192.168.110.142:11690 with User-Agent='Mozilla/5.0 (Windows NT 10.0; 
> Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 
> Safari/537.36'
> I0819 18:14:56.381196 9334 http.cpp:1436] HTTP GET for 
> /files/browse?path=%2Fvar%2Flib%2Fmesos%2Fslaves%2Fd5cb56f3-1f2f-49e6-b63b-a401e445104d-S125&jsonp=angular.callbacks._6d
>  from 192.168.110.142:11716 with User-Agent='Mozilla/5.0 (Windows NT 10.0; 
> Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 
> Safari/537.36'
> I0819 18:15:07.385897 9340 http.cpp:1436] HTTP GET for 
> /files/browse?path=%2Fvar%2Flib%2Fmesos%2Fslaves%2Fd5cb56f3-1f2f-49e6-b63b-a401e445104d-S125&jsonp=angular.callbacks._6g
>  from 192.168.110.142:11745 with User-Agent='Mozilla/5.0 (Windows NT 10.0; 
> Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 
> Safari/537.36'
> I0819 18:15:18.397059 9343 http.cpp:1436] HTTP GET for 
> /files/browse?path=%2Fvar%2Flib%2Fmesos%2Fslaves%2Fd5cb56f3-1f2f-49e6-b63b-a401e445104d-S125&jsonp=angular.callbacks._6j
>  from 192.168.110.142:11774 with User-Agent='Mozilla/5.0 (Windows NT 10.0; 
> Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 
> Safari/537.36'
> I0819 18:15:20.797320 9331 slave.cpp:7657] Current disk usage 3.77%. Max 
> allowed age: 6.036056697613576days
> I0819 18:15:29.377502 9341 http.cpp:1436] HTTP GET for 
> /files/browse?path=%2Fvar%2Flib%2Fmesos%2Fslaves%2Fd5cb56f3-1f2f-49e6-b63b-a401e445104d-S125&jsonp=angular.callbacks._6m
>  from 192.168.110.142:13466 with User-Agent='Mozilla/5.0 (Windows NT 10.0; 
> Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 
> Safari/537.36'
> I0819 18:15:40.386363 9335 http.cpp:1436] HTTP GET for 
> /files/browse?path=%2Fvar%2Flib%2Fmesos%2Fslaves%2Fd5cb56f3-1f2f-49e6-b63b-a401e445104d-S125&jsonp=angular.callbacks._6p
>  from 192.168.110.142:13490 with User-Agent='Mozilla/5.0 (Windows NT 10.0; 
> Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 
> Safari/537.36'
> I0819 18:15:51.388419 9341 http.cpp:1436] HTTP GET for 
> /files/browse?path=%2Fvar%2Flib%2Fmesos%2Fslaves%2Fd5cb56f3-1f2f-49e6-b63b-a401e445104d-S125&jsonp=angular.callbacks._6s
>  from 192.168.110.142:13515 with User-Agent='Mozilla/5.0 (Windows NT 10.0; 
> Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 
> Safari/537.36'
> I0819 18:16:02.377324 9336 http.cpp:1436] HTTP GET for 
> /files/browse?path=%2Fvar%2Flib%2Fmesos%2Fslaves%2Fd5cb56f3-1f2f-49e6-b63b-a401e445104d-S125&jsonp=angular.callbacks._6v
>  from 192.168.110.142:13543 with User-Agent='Mozilla/5.0 (Windows NT 10.0; 
> Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 
> Safari/537.36'
> I0819 18:16:13.391608 9346 http.cpp:1436] HTTP GET for 
> /files/browse?path=%2Fvar%2Flib%2Fmesos%2Fslaves%2Fd5cb56f3-1f2f-49e6-b63b-a401e445104d-S125&jsonp=angular.callbacks._6y
>  from 192.168.110.142:13571 with User-Agent='Mozilla/5.0 (Windows NT 10.0; 
> Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 
> Safari/537.36'
> I0819 18:16:20.798060 9340 slave.cpp:7657] Current disk usage 3.77%. Max 
> allowed age: 6.036056697613576days
> I0819 18:16:24.390466 9345 http.cpp:1436] HTTP GET for 
> /files/browse?path=%2Fvar%2Flib%2Fmesos%2Fslaves%2Fd5cb56f3-1f2f-49e6-b63b-a401e445104d-S125&jsonp=angular.callbacks._71
>  from 192.168.110.142:13593 with User-Agent='Mozilla/5.0 (Windows NT 10.0; 
> Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 
> Safari/537.36'
> I0819 18:16:35.390462 9337 http.cpp:1436] HTTP GET for 
> /files/browse?path=%2Fvar%2Flib%2Fmesos%2Fslaves%2Fd5cb56f3-1f2f-49e6-b63b-a401e445104d-S125&jsonp=angular.callbacks._74
>  from 192.168.110.142:13612 with User-Agent='Mozilla/5.0 (Windows NT 10.0; 
> Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 
> Safari/537.36'
> I0819 18:16:46.374727 9345 http.cpp:1436] HTTP GET for 
> /files/browse?path=%2Fvar%2Flib%2Fmesos%2Fslaves%2Fd5cb56f3-1f2f-49e6-b63b-a401e445104d-S125&jsonp=angular.callbacks._77
>  from 192.168.110.142:13631 with User-Agent='Mozilla/5.0 (Windows NT 10.0; 
> Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 
> Safari/537.36'



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
