Re: mesos-slave Failed to initialize: Failed to bind on 0.0.0.0:0: Address already in use: Address already in use [98]

2018-05-03 Thread Benjamin Mahler
From the man page for bind:

*EADDRINUSE*
  (Internet domain sockets) The port number was specified as
  zero in the socket address structure, but, upon attempting to
  bind to an ephemeral port, it was determined that all port
  numbers in the ephemeral port range are currently in use.  See
  the discussion of */proc/sys/net/ipv4/ip_local_port_range* in
  ip(7).
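
A minimal sketch of what that paragraph describes, assuming Linux and Python 3: binding to port 0 asks the kernel to pick a free port from the ephemeral range, which is why no port "0" ever shows up in netstat. The helper names `parse_port_range` and `bind_to_port_zero` are made up for this illustration and are not part of Mesos or libprocess.

```python
import socket


def parse_port_range(text):
    """Parse the two integers stored in ip_local_port_range (low, then high)."""
    low, high = (int(field) for field in text.split())
    return low, high


def bind_to_port_zero(host="0.0.0.0"):
    """Bind a TCP socket to port 0 and return the port the kernel assigned."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        return sock.getsockname()[1]


if __name__ == "__main__":
    try:
        with open("/proc/sys/net/ipv4/ip_local_port_range") as f:
            low, high = parse_port_range(f.read())
        print(f"ephemeral range: {low}-{high} ({high - low + 1} ports)")
    except OSError as err:
        print(f"could not read the port range: {err}")
    try:
        print(f"kernel assigned port {bind_to_port_zero()}")
    except OSError as err:
        # On Linux, errno 98 (EADDRINUSE) here is the failure in the reported log.
        print(f"bind failed: {err}")
```

If the second step fails with errno 98 on the affected host, that would point at the ephemeral range rather than at any single conflicting port.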



Re: mesos-slave Failed to initialize: Failed to bind on 0.0.0.0:0: Address already in use: Address already in use [98]

2018-05-03 Thread Srikanth Viswanathan
fwiw, I've seen this type of error in the past when the system runs out of
ephemeral ports. Not saying this is definitely the same issue, but I suggest
checking to see if you have ephemeral ports available.
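
One rough way to run that check, sketched in Python 3 and assuming Linux: `netstat -nlp` lists only listening sockets, so established or TIME_WAIT connections that also hold ephemeral ports do not appear in it. The sketch below instead counts distinct local TCP ports from `/proc/net/tcp`-style text. The function names are hypothetical, and the count is only an approximation of what the kernel treats as in use (TCP only, current network namespace).

```python
def local_ports(proc_net_tcp_text):
    """Return the set of local ports found in /proc/net/tcp-style text."""
    ports = set()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 2:
            continue
        # fields[1] is local_address, written as HEXIP:HEXPORT
        ports.add(int(fields[1].rsplit(":", 1)[1], 16))
    return ports


def count_in_range(ports, low, high):
    """Count ports that fall inside the inclusive range [low, high]."""
    return sum(1 for port in ports if low <= port <= high)


# Sample rows for local ports 22, 37786, 5051 and 46690 (hex-encoded).
SAMPLE = """\
  sl  local_address rem_address   st
   0: 00000000:0016 00000000:0000 0A
   1: 00000000:939A 00000000:0000 0A
   2: 00000000:13BB 00000000:0000 0A
   3: 00000000:B662 00000000:0000 0A
"""

if __name__ == "__main__":
    try:
        with open("/proc/sys/net/ipv4/ip_local_port_range") as f:
            low, high = (int(x) for x in f.read().split())
        used = set()
        for path in ("/proc/net/tcp", "/proc/net/tcp6"):
            with open(path) as f:
                used |= local_ports(f.read())
        print(f"{count_in_range(used, low, high)} of {high - low + 1} "
              f"ephemeral ports have a TCP socket")
    except OSError as err:
        print(f"could not read /proc: {err}")
```

A count close to the size of the range would support the exhaustion theory; a small count would suggest looking elsewhere.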


Re: mesos-slave Failed to initialize: Failed to bind on 0.0.0.0:0: Address already in use: Address already in use [98]

2018-05-03 Thread Zhitao Li
Can you paste the command line of how you started the Mesos agent process?

On Wed, May 2, 2018 at 9:21 PM, Luke Adolph wrote:

> Hi all:
> When the Mesos slave runs a task, the stderr file shows:
> I0503 04:01:20.488590  9110 logging.cpp:188] INFO level logging started!
> I0503 04:01:20.489073  9110 fetcher.cpp:424] Fetcher Info: {"cache_directory":"\/tmp\/mesos\/fetch\/slaves\/2bcc032f-950b-4c36-bff4-b5552c193dc9-S1\/root","items":[{"action":"BYPASS_CACHE","uri":{"extract":true,"value":"file:\/\/\/etc\/.dockercfg"}}],"sandbox_directory":"\/tmp\/mesos\/slaves\/2bcc032f-950b-4c36-bff4-b5552c193dc9-S1\/docker\/links\/b4eabcbb-5769-49f0-9324-b25c3cda8b8c","user":"root"}
> I0503 04:01:20.491297  9110 fetcher.cpp:379] Fetching URI 'file:///etc/.dockercfg'
> I0503 04:01:20.491325  9110 fetcher.cpp:250] Fetching directly into the sandbox directory
> I0503 04:01:20.491348  9110 fetcher.cpp:187] Fetching URI 'file:///etc/.dockercfg'
> I0503 04:01:20.491367  9110 fetcher.cpp:167] Copying resource with command:cp '/etc/.dockercfg' '/tmp/mesos/slaves/2bcc032f-950b-4c36-bff4-b5552c193dc9-S1/docker/links/b4eabcbb-5769-49f0-9324-b25c3cda8b8c/.dockercfg'
> W0503 04:01:20.495400  9110 fetcher.cpp:272] Copying instead of extracting resource from URI with 'extract' flag, because it does not seem to be an archive: file:///etc/.dockercfg
> I0503 04:01:20.495728  9110 fetcher.cpp:456] Fetched 'file:///etc/.dockercfg' to '/tmp/mesos/slaves/2bcc032f-950b-4c36-bff4-b5552c193dc9-S1/docker/links/b4eabcbb-5769-49f0-9324-b25c3cda8b8c/.dockercfg'
> F0503 04:01:21.990416  9202 process.cpp:889] Failed to initialize: Failed to bind on 0.0.0.0:0: Address already in use: Address already in use [98]
> *** Check failure stack trace: ***
> @ 0x7f95fc6ef86d  google::LogMessage::Fail()
> @ 0x7f95fc6f169d  google::LogMessage::SendToLog()
> @ 0x7f95fc6ef45c  google::LogMessage::Flush()
> @ 0x7f95fc6ef669  google::LogMessage::~LogMessage()
> @ 0x7f95fc6f05d2  google::ErrnoLogMessage::~ErrnoLogMessage()
> @ 0x7f95fc6955d9  process::initialize()
> @ 0x7f95fc696be2  process::ProcessBase::ProcessBase()
> @   0x430e9a  mesos::internal::docker::DockerExecutorProcess::DockerExecutorProcess()
> @   0x41916b  main
> @ 0x7f95fa60ff45  (unknown)
> @   0x419c77  (unknown)
>
> When the Mesos slave initializes, it fails with "Failed to bind on 0.0.0.0:0:
> Address already in use". I ran `netstat -nlp`, but no port "0" is in use. The
> full output is:
> root@10:~# netstat -nlp
> Active Internet connections (only servers)
> Proto Recv-Q Send-Q Local Address        Foreign Address  State   PID/Program name
> tcp        0      0 0.0.0.0:22           0.0.0.0:*        LISTEN  1153/sshd
> tcp        0      0 0.0.0.0:37786        0.0.0.0:*        LISTEN  20042/mesos-docker-
> tcp        0      0 0.0.0.0:5051         0.0.0.0:*        LISTEN  12701/mesos-slave
> tcp        0      0 0.0.0.0:37084        0.0.0.0:*        LISTEN  19765/mesos-docker-
> tcp        0      0 0.0.0.0:24220        0.0.0.0:*        LISTEN  28584/ruby
> tcp        0      0 0.0.0.0:8765         0.0.0.0:*        LISTEN  28353/nginx
> tcp        0      0 0.0.0.0:24224        0.0.0.0:*        LISTEN  28584/ruby
> tcp        0      0 127.0.0.1:24225      0.0.0.0:*        LISTEN  28584/ruby
> tcp        0      0 0.0.0.0:46690        0.0.0.0:*        LISTEN  28932/mesos-docker-
> tcp        0      0 0.0.0.0:42437        0.0.0.0:*        LISTEN  32184/mesos-docker-
> tcp        0      0 0.0.0.0:34695        0.0.0.0:*        LISTEN  25862/mesos-docker-
> tcp        0      0 0.0.0.0:37039        0.0.0.0:*        LISTEN  21273/mesos-docker-
> tcp        0      0 0.0.0.0:46001        0.0.0.0:*        LISTEN  710/mesos-docker-ex
> tcp6       0      0 :::31765             :::*             LISTEN  20160/docker-proxy
> tcp6       0      0 :::31605             :::*             LISTEN  20149/docker-proxy
> tcp6       0      0 :::31327             :::*             LISTEN  820/docker-proxy
> tcp6       0      0 :::31008             :::*             LISTEN  32291/docker-proxy
> tcp6       0      0 :::2375              :::*             LISTEN  28305/node
> tcp6       0      0 :::31690             :::*             LISTEN  25966/docker-proxy
> tcp6       0      0 :::31211             :::*             LISTEN  21379/docker-proxy
> tcp6       0      0 :::31245             :::*             LISTEN  19988/docker-proxy
> tcp6       0      0 :::31121             :::*             LISTEN  29037/docker-proxy
> udp        0      0 0.0.0.0:24224        0.0.0.0:*                28584/ruby
> udp        0      0 192.168.0.1:123      0.0.0.0:*                1348/ntpd
> udp        0      0 59.110.24.56:123     0.0.0.0:*                1348/ntpd
> udp        0      0 10.25.141.251:123    0.0.0.0:*                1348/ntpd
> udp        0      0 127.0.0.1:123