Hi!

It turns out I was starting and restarting mesos-slave through systemd
unit files, and they do not pick up the proxy environment variables set in
my shell. When I started the agent manually with /usr/sbin/mesos-agent ...
it was able to use the proxy values.

Now I need to find the proper way to set up the systemd service units...
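
For anyone hitting the same thing, here is a minimal sketch of one way to do
it with a systemd drop-in. It assumes the unit is named mesos-slave.service;
the file name proxy.conf and the proxy URL are placeholders, not values from
my setup.

```ini
# /etc/systemd/system/mesos-slave.service.d/proxy.conf
# Assumption: the agent unit is mesos-slave.service (adjust if yours differs).
# Assumption: proxy.example.com:3128 is a placeholder for the real proxy.
[Service]
Environment="http_proxy=http://proxy.example.com:3128"
Environment="https_proxy=http://proxy.example.com:3128"
```

After creating the file, running "systemctl daemon-reload" followed by
"systemctl restart mesos-slave" should apply it, and
"systemctl show mesos-slave --property=Environment" should list the variables
the service actually receives.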

Thanks for your attention.

On Thu, Jul 20, 2017 at 6:19 PM, Jie Yu <yujie....@gmail.com> wrote:

> Looks like some issue with the proxy setting.
>
> Can you print your agent environment variables? (HTTP_PROXY and
> HTTPS_PROXY)
>
> - Jie
>
> On Thu, Jul 20, 2017 at 1:16 PM, William Markito Oliveira <
> mark...@apache.org> wrote:
>
> > I'm reading MESOS-6010 [1], which seems very similar to the problem I'm
> > having, but I do have the variables in all lower case and I'm assuming
> > that fix is already part of 1.3.0...
> >
> >
> > [1] https://issues.apache.org/jira/browse/MESOS-6010
> >
> > On Thu, Jul 20, 2017 at 3:08 PM, William Markito Oliveira <
> > william.mark...@gmail.com> wrote:
> >
> > > Hi folks,
> > >
> > > I'm trying to set up a Mesos cluster and submit the "hello world" GPU
> > > example from the documentation page:
> > >
> > > mesos-execute --master=10.120.59.5:5050
> > > --name=gpu-test       --docker_image="nvidia/cuda"
> > > --command="nvidia-smi"       --framework_capabilities="GPU_RESOURCES"
> > > --resources="gpus:1"
> > >
> > >
> > > This returns:
> > >
> > > I0720 16:02:26.039414 102623 scheduler.cpp:184] Version: 1.3.0
> > > I0720 16:02:26.040221 102619 scheduler.cpp:470] New master detected at master@10.120.59.5:5050
> > > Subscribed with ID 4d3b5156-85df-4ffd-a8cd-9e0ecaa90e39-0015
> > > Submitted task 'gpu-test' to agent 'b2d906e8-1207-4ceb-aeb0-42be1150cff8-S4'
> > > Received status update TASK_FAILED for task 'gpu-test'
> > >   message: 'Failed to launch container: Failed to perform 'curl': curl: (7) Failed to connect to registry-1.docker.io port 443: Connection refused
> > > '
> > >   source: SOURCE_AGENT
> > >   reason: REASON_CONTAINER_LAUNCH_FAILED
> > >
> > > ---
> > >
> > > Now, when specifying --containerizer=docker, I receive the following
> > > output:
> > >
> > > mesos-execute --containerizer=docker      --master=10.120.59.5:5050
> > > --name=gpu-test       --docker_image="nvidia/cuda"
> > > --command="nvidia-smi"       --framework_capabilities="GPU_RESOURCES"
> > >   --resources="gpus:1"
> > > I0720 16:02:59.589102 102719 scheduler.cpp:184] Version: 1.3.0
> > > I0720 16:02:59.589792 102727 scheduler.cpp:470] New master detected at master@10.120.59.5:5050
> > > Subscribed with ID 4d3b5156-85df-4ffd-a8cd-9e0ecaa90e39-0016
> > > Submitted task 'gpu-test' to agent 'b2d906e8-1207-4ceb-aeb0-42be1150cff8-S3'
> > > Received status update TASK_RUNNING for task 'gpu-test'
> > >   source: SOURCE_EXECUTOR
> > > Received status update TASK_FAILED for task 'gpu-test'
> > >   message: 'Container exited with status 127'
> > >   source: SOURCE_EXECUTOR
> > >
> > > So still no success, but a different error.
> > >
> > > My environment does have http_proxy and https_proxy variables with
> > > proper values and I've set them before starting the agents.
> > >
> > > Both docker pull and nvidia-docker pull work fine and can download the
> > > images.
> > >
> > > Any thoughts on how to fix or troubleshoot this?
> > >
> > > Thank you!
> > >
> > > Version: Mesos 1.3.0
> > > OS: Ubuntu 16.04
> > > Docker: 17.06.0-ce (+ NVIDIA-Docker)
> > >
> > >
> > > --
> > > ~/William
> > >
> >
>



-- 
~/William
