Re: mesos-slave Failed to initialize: Failed to bind on 0.0.0.0:0: Address already in use: Address already in use [98]

2018-05-03 Thread Benjamin Mahler
From the man page for bind:

*EADDRINUSE*
  (Internet domain sockets) The port number was specified as
  zero in the socket address structure, but, upon attempting to
  bind to an ephemeral port, it was determined that all port
  numbers in the ephemeral port range are currently in use.  See
  the discussion of */proc/sys/net/ipv4/ip_local_port_range*
  ip(7).
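
To make the port-zero case concrete, here is a minimal Python sketch (not Mesos code) that does the same kind of bind the failing process attempts: it binds a TCP socket to port 0 and reports which port the kernel picked. The helper name try_ephemeral_bind is made up for illustration. If the call returns None on the affected host, that would be consistent with the EADDRINUSE description above.

```python
import errno
import socket


def try_ephemeral_bind(host="0.0.0.0"):
    """Bind a TCP socket to port 0 and return the port the kernel assigned.

    Returns None when bind() fails with EADDRINUSE, which for a port-0 bind
    means the kernel could not find a free ephemeral port.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, 0))  # port 0 asks the kernel to choose a port
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return None
            raise
        return sock.getsockname()[1]


if __name__ == "__main__":
    port = try_ephemeral_bind()
    if port is None:
        print("bind to port 0 failed: no ephemeral port available")
    else:
        print(f"kernel assigned ephemeral port {port}")
```

The socket is closed again right away, so this only probes whether a port can be allocated at that moment.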


On Thu, May 3, 2018 at 10:11 AM Srikanth Viswanathan wrote:

> fwiw, I've seen this type of error in the past when the system runs out of
> ephemeral ports. Not saying this is definitely the same issue, but I suggest
> checking to see if you have ephemeral ports available.
>
> On Thu, May 3, 2018 at 8:57 AM, Zhitao Li  wrote:
>
>> Can you paste the command line of how you started the Mesos agent process?

Re: mesos-slave Failed to initialize: Failed to bind on 0.0.0.0:0: Address already in use: Address already in use [98]

2018-05-03 Thread Srikanth Viswanathan
fwiw, I've seen this type of error in the past when the system runs out of
ephemeral ports. Not saying this is definitely the same issue, but I suggest
checking to see if you have ephemeral ports available.
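
One rough way to do that check on Linux, sketched in Python: read the configured range from /proc/sys/net/ipv4/ip_local_port_range and count the distinct local ports inside it that appear in /proc/net/tcp and /proc/net/tcp6. The function names are made up for illustration, and the count is only an approximation of what the kernel considers unavailable; it covers TCP sockets in any state, not just listeners.

```python
def parse_port_range(text):
    """Parse the contents of ip_local_port_range, e.g. '32768\t60999'."""
    low, high = text.split()
    return int(low), int(high)


def ports_in_use(proc_net_text, low, high):
    """Return the distinct local ports in [low, high] found in
    /proc/net/tcp-style text (first line is a header)."""
    used = set()
    for line in proc_net_text.splitlines()[1:]:
        fields = line.split()
        if len(fields) < 2:
            continue
        # The second field is the local address as HEXIP:HEXPORT.
        port = int(fields[1].rsplit(":", 1)[1], 16)
        if low <= port <= high:
            used.add(port)
    return used


if __name__ == "__main__":
    with open("/proc/sys/net/ipv4/ip_local_port_range") as f:
        low, high = parse_port_range(f.read())
    used = set()
    for path in ("/proc/net/tcp", "/proc/net/tcp6"):
        try:
            with open(path) as f:
                used |= ports_in_use(f.read(), low, high)
        except FileNotFoundError:
            pass  # e.g. IPv6 disabled
    total = high - low + 1
    print(f"ephemeral range {low}-{high}: {len(used)} of {total} ports seen in use")
```

If the number in use is close to the size of the range, port exhaustion becomes a plausible explanation.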

On Thu, May 3, 2018 at 8:57 AM, Zhitao Li  wrote:

> Can you paste the command line of how you started the Mesos agent process?

Re: mesos-slave Failed to initialize: Failed to bind on 0.0.0.0:0: Address already in use: Address already in use [98]

2018-05-03 Thread Zhitao Li
Can you paste the command line of how you started the Mesos agent process?


mesos-slave Failed to initialize: Failed to bind on 0.0.0.0:0: Address already in use: Address already in use [98]

2018-05-02 Thread Luke Adolph
Hi all:
When the Mesos slave runs a task, the stderr file shows:
I0503 04:01:20.488590  9110 logging.cpp:188] INFO level logging started!
I0503 04:01:20.489073  9110 fetcher.cpp:424] Fetcher Info: {"cache_directory":"\/tmp\/mesos\/fetch\/slaves\/2bcc032f-950b-4c36-bff4-b5552c193dc9-S1\/root","items":[{"action":"BYPASS_CACHE","uri":{"extract":true,"value":"file:\/\/\/etc\/.dockercfg"}}],"sandbox_directory":"\/tmp\/mesos\/slaves\/2bcc032f-950b-4c36-bff4-b5552c193dc9-S1\/docker\/links\/b4eabcbb-5769-49f0-9324-b25c3cda8b8c","user":"root"}
I0503 04:01:20.491297  9110 fetcher.cpp:379] Fetching URI 'file:///etc/.dockercfg'
I0503 04:01:20.491325  9110 fetcher.cpp:250] Fetching directly into the sandbox directory
I0503 04:01:20.491348  9110 fetcher.cpp:187] Fetching URI 'file:///etc/.dockercfg'
I0503 04:01:20.491367  9110 fetcher.cpp:167] Copying resource with command:cp '/etc/.dockercfg' '/tmp/mesos/slaves/2bcc032f-950b-4c36-bff4-b5552c193dc9-S1/docker/links/b4eabcbb-5769-49f0-9324-b25c3cda8b8c/.dockercfg'
W0503 04:01:20.495400  9110 fetcher.cpp:272] Copying instead of extracting resource from URI with 'extract' flag, because it does not seem to be an archive: file:///etc/.dockercfg
I0503 04:01:20.495728  9110 fetcher.cpp:456] Fetched 'file:///etc/.dockercfg' to '/tmp/mesos/slaves/2bcc032f-950b-4c36-bff4-b5552c193dc9-S1/docker/links/b4eabcbb-5769-49f0-9324-b25c3cda8b8c/.dockercfg'
F0503 04:01:21.990416  9202 process.cpp:889] Failed to initialize: Failed to bind on 0.0.0.0:0: Address already in use: Address already in use [98]
*** Check failure stack trace: ***
@ 0x7f95fc6ef86d  google::LogMessage::Fail()
@ 0x7f95fc6f169d  google::LogMessage::SendToLog()
@ 0x7f95fc6ef45c  google::LogMessage::Flush()
@ 0x7f95fc6ef669  google::LogMessage::~LogMessage()
@ 0x7f95fc6f05d2  google::ErrnoLogMessage::~ErrnoLogMessage()
@ 0x7f95fc6955d9  process::initialize()
@ 0x7f95fc696be2  process::ProcessBase::ProcessBase()
@   0x430e9a  mesos::internal::docker::DockerExecutorProcess::DockerExecutorProcess()
@   0x41916b  main
@ 0x7f95fa60ff45  (unknown)
@   0x419c77  (unknown)

When the Mesos slave initializes, it runs into "Failed to bind on 0.0.0.0:0:
Address already in use". I ran `netstat -nlp`, but nothing is using port "0".
The full output is:
root@10:~# netstat -nlp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      1153/sshd
tcp        0      0 0.0.0.0:37786           0.0.0.0:*               LISTEN      20042/mesos-docker-
tcp        0      0 0.0.0.0:5051            0.0.0.0:*               LISTEN      12701/mesos-slave
tcp        0      0 0.0.0.0:37084           0.0.0.0:*               LISTEN      19765/mesos-docker-
tcp        0      0 0.0.0.0:24220           0.0.0.0:*               LISTEN      28584/ruby
tcp        0      0 0.0.0.0:8765            0.0.0.0:*               LISTEN      28353/nginx
tcp        0      0 0.0.0.0:24224           0.0.0.0:*               LISTEN      28584/ruby
tcp        0      0 127.0.0.1:24225         0.0.0.0:*               LISTEN      28584/ruby
tcp        0      0 0.0.0.0:46690           0.0.0.0:*               LISTEN      28932/mesos-docker-
tcp        0      0 0.0.0.0:42437           0.0.0.0:*               LISTEN      32184/mesos-docker-
tcp        0      0 0.0.0.0:34695           0.0.0.0:*               LISTEN      25862/mesos-docker-
tcp        0      0 0.0.0.0:37039           0.0.0.0:*               LISTEN      21273/mesos-docker-
tcp        0      0 0.0.0.0:46001           0.0.0.0:*               LISTEN      710/mesos-docker-ex
tcp6       0      0 :::31765                :::*                    LISTEN      20160/docker-proxy
tcp6       0      0 :::31605                :::*                    LISTEN      20149/docker-proxy
tcp6       0      0 :::31327                :::*                    LISTEN      820/docker-proxy
tcp6       0      0 :::31008                :::*                    LISTEN      32291/docker-proxy
tcp6       0      0 :::2375                 :::*                    LISTEN      28305/node
tcp6       0      0 :::31690                :::*                    LISTEN      25966/docker-proxy
tcp6       0      0 :::31211                :::*                    LISTEN      21379/docker-proxy
tcp6       0      0 :::31245                :::*                    LISTEN      19988/docker-proxy
tcp6       0      0 :::31121                :::*                    LISTEN      29037/docker-proxy
udp        0      0 0.0.0.0:24224           0.0.0.0:*                           28584/ruby
udp        0      0 192.168.0.1:123         0.0.0.0:*                           1348/ntpd
udp        0      0 59.110.24.56:123        0.0.0.0:*                           1348/ntpd
udp        0      0 10.25.141.251:123       0.0.0.0:*                           1348/ntpd
udp        0      0 127.0.0.1:123           0.0.0.0:*                           1348/ntpd
udp        0      0