Hi Avinash,

Sorry for being unclear!

The root cause is not in k8s itself, but in the CentOS host that k8s is running on:
it is an iptables issue. After executing "iptables -F", it works!
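
For anyone hitting the same thing, this is roughly what I did; the narrower
ACCEPT rule at the end is only a sketch of a less drastic fix, not something
I have verified:

    # inspect the current rules first
    iptables -L -n -v

    # quick-and-dirty fix: flush all rules (this is what I actually ran)
    iptables -F

    # narrower alternative (sketch): allow traffic from the Mesos master host
    # by inserting an ACCEPT rule at the top of the INPUT chain
    iptables -I INPUT -s 15.242.100.56 -j ACCEPT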

Best Regards
Nan Xiao


On Wed, Dec 30, 2015 at 11:41 PM, Avinash Sridharan
<[email protected]> wrote:
> Thanks for the update, Nan. k8s enabling firewall rules that would block
> traffic to the master seems a bit odd. It looks like a bug in the head of
> the branch to me. If you are able to reproduce it consistently, could you
> file an issue against kubernetes-mesos?
>
> regards,
> Avinash
>
> On Tue, Dec 29, 2015 at 11:04 PM, Nan Xiao <[email protected]> wrote:
>>
>> Hi Avinash,
>>
>> Thanks very much for your reply!
>>
>> The root cause has been found: the k8s server has iptables enabled, which
>> blocks connections from the Mesos master; after disabling it, it works!
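>>
>> For reference, "disable it" means roughly the following on the k8s host;
>> which command applies depends on the CentOS version, so treat this as a sketch:
>>
>>     # CentOS 7: stop the firewalld service
>>     systemctl stop firewalld
>>
>>     # CentOS 6: stop the iptables service
>>     service iptables stop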
>>
>> Best Regards
>> Nan Xiao
>>
>>
>> On Wed, Dec 30, 2015 at 3:22 AM, Avinash Sridharan
>> <[email protected]> wrote:
>> > The lsof command will show only actively open file descriptors, so if you
>> > ran the command after seeing the error logs on the master, the master had
>> > most probably already closed this fd. Just throwing out a few other things
>> > to look at that might give some more insight.
>> >
>> > * Run the "netstat -na" and "netstat -nt" commands on the master and the
>> > kubernetes master node to make sure that the master is listening on the
>> > right port, and the k8s scheduler is trying to connect to the right port
>> > (see the sketch after this list).
>> > From the logs it does look like the master is receiving the registration
>> > request, so there shouldn't be a network configuration issue here.
>> > * Make sure there are no firewall rules getting turned on in your
>> > cluster
>> > since it looks like the k8s scheduler is not able to connect to the
>> > master
>> > (though it was able to register the first time).
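>> >
>> > A concrete sketch of those two checks, using the ports from this setup
>> > (adjust as needed):
>> >
>> >     # on the Mesos master: is it listening on 5050?
>> >     netstat -nat | grep 5050
>> >
>> >     # on the k8s node: are there established connections to the master?
>> >     netstat -nt | grep 15.242.100.56:5050
>> >
>> >     # on both: look for firewall rules that might drop the traffic
>> >     iptables -L -n -v | grep -i -E 'drop|reject'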
>> >
>> > On Tue, Dec 29, 2015 at 1:37 AM, Nan Xiao <[email protected]>
>> > wrote:
>> >>
>> >> BTW, running the "lsof" command shows there are only 16 file descriptors;
>> >> I don't know why the Mesos master tries to close "fd 17".
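>> >>
>> >> In case it matters, this is roughly how I counted them on the master host
>> >> (a sketch; it assumes a single mesos-master process and the exact process
>> >> name may differ):
>> >>
>> >>     # list the mesos-master's open file descriptors
>> >>     lsof -p $(pgrep -f mesos-master)
>> >>
>> >>     # or count them directly from /proc
>> >>     ls /proc/$(pgrep -f mesos-master)/fd | wc -l
>> >>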
>> >> Best Regards
>> >> Nan Xiao
>> >>
>> >>
>> >> On Tue, Dec 29, 2015 at 11:32 AM, Nan Xiao <[email protected]>
>> >> wrote:
>> >> > Hi Klaus,
>> >> >
>> >> > Firstly, thanks very much for your answer!
>> >> >
>> >> > The km processes are all running:
>> >> > root     129474 128024  2 22:26 pts/0    00:00:00 km apiserver
>> >> > --address=15.242.100.60 --etcd-servers=http://15.242.100.60:4001
>> >> > --service-cluster-ip-range=10.10.10.0/24 --port=8888
>> >> > --cloud-provider=mesos --cloud-config=mesos-cloud.conf
>> >> > --secure-port=0
>> >> > --v=1
>> >> > root     129509 128024  2 22:26 pts/0    00:00:00 km
>> >> > controller-manager --master=15.242.100.60:8888 --cloud-provider=mesos
>> >> > --cloud-config=./mesos-cloud.conf --v=1
>> >> > root     129538 128024  0 22:26 pts/0    00:00:00 km scheduler
>> >> > --address=15.242.100.60 --mesos-master=15.242.100.56:5050
>> >> > --etcd-servers=http://15.242.100.60:4001 --mesos-user=root
>> >> > --api-servers=15.242.100.60:8888 --cluster-dns=10.10.10.10
>> >> > --cluster-domain=cluster.local --v=2
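>> >> >
>> >> > (As an extra sanity check, the apiserver can also be probed directly;
>> >> > the /healthz path here is an assumption based on the standard
>> >> > kube-apiserver endpoint on the insecure port:
>> >> >
>> >> >     curl http://15.242.100.60:8888/healthz
>> >> >
>> >> > It should answer "ok" if the apiserver is healthy.)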
>> >> >
>> >> > All the logs also seem OK, except the logs from scheduler.log:
>> >> > ......
>> >> > I1228 22:26:37.883092  129538 messenger.go:381] Receiving message
>> >> > mesos.internal.InternalMasterChangeDetected from
>> >> > scheduler(1)@15.242.100.60:33077
>> >> > I1228 22:26:37.883225  129538 scheduler.go:374] New master
>> >> > [email protected]:5050 detected
>> >> > I1228 22:26:37.883268  129538 scheduler.go:435] No credentials were
>> >> > provided. Attempting to register scheduler without authentication.
>> >> > I1228 22:26:37.883356  129538 scheduler.go:928] Registering with
>> >> > master: [email protected]:5050
>> >> > I1228 22:26:37.883460  129538 messenger.go:187] Sending message
>> >> > mesos.internal.RegisterFrameworkMessage to [email protected]:5050
>> >> > I1228 22:26:37.883504  129538 scheduler.go:881] will retry
>> >> > registration in 1.209320575s if necessary
>> >> > I1228 22:26:37.883758  129538 http_transporter.go:193] Sending
>> >> > message
>> >> > to [email protected]:5050 via http
>> >> > I1228 22:26:37.883873  129538 http_transporter.go:587] libproc target
>> >> > URL
>> >> >
>> >> > http://15.242.100.56:5050/master/mesos.internal.RegisterFrameworkMessage
>> >> > I1228 22:26:39.093560  129538 scheduler.go:928] Registering with
>> >> > master: [email protected]:5050
>> >> > I1228 22:26:39.093659  129538 messenger.go:187] Sending message
>> >> > mesos.internal.RegisterFrameworkMessage to [email protected]:5050
>> >> > I1228 22:26:39.093702  129538 scheduler.go:881] will retry
>> >> > registration in 3.762036352s if necessary
>> >> > I1228 22:26:39.093765  129538 http_transporter.go:193] Sending
>> >> > message
>> >> > to [email protected]:5050 via http
>> >> > I1228 22:26:39.093847  129538 http_transporter.go:587] libproc target
>> >> > URL
>> >> >
>> >> > http://15.242.100.56:5050/master/mesos.internal.RegisterFrameworkMessage
>> >> > ......
>> >> >
>> >> > From the log, the Mesos master rejected the k8s registration, and
>> >> > k8s keeps retrying constantly.
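>> >> >
>> >> > If it helps, these are the connectivity checks I plan to run next (just
>> >> > a sketch; the scheduler port below is taken from the log above and
>> >> > changes on every restart):
>> >> >
>> >> >     # from the k8s host: can the scheduler reach the master?
>> >> >     nc -zv 15.242.100.56 5050
>> >> >
>> >> >     # from the Mesos master host: can it connect back to the scheduler?
>> >> >     nc -zv 15.242.100.60 33077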
>> >> >
>> >> > Have you encountered this issue before? Thanks very much in advance!
>> >> > Best Regards
>> >> > Nan Xiao
>> >> >
>> >> >
>> >> > On Mon, Dec 28, 2015 at 7:26 PM, Klaus Ma <[email protected]>
>> >> > wrote:
>> >> >> It seems Kubernetes is down; would you help check the status of the
>> >> >> Kubernetes (km) processes?
>> >> >>
>> >> >> ----
>> >> >> Da (Klaus), Ma (马达) | PMP® | Advisory Software Engineer
>> >> >> Platform Symphony/DCOS Development & Support, STG, IBM GCG
>> >> >> +86-10-8245 4084 | [email protected] | http://k82.me
>> >> >>
>> >> >> On Mon, Dec 28, 2015 at 6:35 PM, Nan Xiao <[email protected]>
>> >> >> wrote:
>> >> >>>
>> >> >>> Hi all,
>> >> >>>
>> >> >>> Greetings from me!
>> >> >>>
>> >> >>> I am trying to follow this tutorial
>> >> >>> (https://github.com/kubernetes/kubernetes/blob/master/docs/getting-started-guides/mesos.md)
>> >> >>> to deploy "k8s on Mesos" on local machines: k8s is built from the latest
>> >> >>> master branch, and Mesos is version 0.26.
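>> >> >>>
>> >> >>> For completeness, the Mesos components are started roughly as below
>> >> >>> (the work_dir path is just an example); the km commands follow the guide:
>> >> >>>
>> >> >>>     mesos-master --ip=15.242.100.56 --work_dir=/var/lib/mesos
>> >> >>>     mesos-slave --master=15.242.100.56:5050 --ip=15.242.100.16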
>> >> >>>
>> >> >>> After starting the Mesos master (IP: 15.242.100.56), the Mesos
>> >> >>> slave (IP: 15.242.100.16), and k8s (IP: 15.242.100.60), I can see the
>> >> >>> following logs from the Mesos master:
>> >> >>>
>> >> >>> ......
>> >> >>> I1227 22:52:34.494478  8069 master.cpp:4269] Received update of
>> >> >>> slave
>> >> >>> 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-S0 at
>> >> >>> slave(1)@15.242.100.16:5051
>> >> >>> (pqsfc016.ftc.rdlabs.hpecorp.net) with total oversubscribed
>> >> >>> resources
>> >> >>> I1227 22:52:34.494940  8065 hierarchical.cpp:400] Slave
>> >> >>> 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-S0
>> >> >>> (pqsfc016.ftc.rdlabs.hpecorp.net) updated with oversubscribed
>> >> >>> resources  (total: cpus(*):32; mem(*):127878; disk(*):4336;
>> >> >>> ports(*):[31000-32000], allocated: )
>> >> >>> I1227 22:53:06.740757 8053 http.cpp:334] HTTP GET for
>> >> >>> /master/state.json from 15.242.100.60:56219 with
>> >> >>> User-Agent='Go-http-client/1.1'
>> >> >>> I1227 22:53:07.736419 8065 http.cpp:334] HTTP GET for
>> >> >>> /master/state.json from 15.242.100.60:56241 with
>> >> >>> User-Agent='Go-http-client/1.1'
>> >> >>> I1227 22:53:07.767196  8070 http.cpp:334] HTTP GET for
>> >> >>> /master/state.json from 15.242.100.60:56252 with
>> >> >>> User-Agent='Go-http-client/1.1'
>> >> >>> I1227 22:53:08.808171  8053 http.cpp:334] HTTP GET for
>> >> >>> /master/state.json from 15.242.100.60:56272 with
>> >> >>> User-Agent='Go-http-client/1.1'
>> >> >>> I1227 22:53:08.815811 8060 master.cpp:2176] Received SUBSCRIBE call
>> >> >>> for framework 'Kubernetes' at scheduler(1)@15.242.100.60:59488
>> >> >>> I1227 22:53:08.816182 8060 master.cpp:2247] Subscribing framework
>> >> >>> Kubernetes with checkpointing enabled and capabilities [  ]
>> >> >>> I1227 22:53:08.817294  8052 hierarchical.cpp:195] Added framework
>> >> >>> 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-0000
>> >> >>> I1227 22:53:08.817464  8050 master.cpp:1122] Framework
>> >> >>> 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-0000 (Kubernetes) at
>> >> >>> scheduler(1)@15.242.100.60:59488 disconnected
>> >> >>> E1227 22:53:08.817497 8073 process.cpp:1911] Failed to shutdown
>> >> >>> socket with fd 17: Transport endpoint is not connected
>> >> >>> I1227 22:53:08.817533  8050 master.cpp:2472] Disconnecting
>> >> >>> framework
>> >> >>> 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-0000 (Kubernetes) at
>> >> >>> scheduler(1)@15.242.100.60:59488
>> >> >>> I1227 22:53:08.817595 8050 master.cpp:2496] Deactivating framework
>> >> >>> 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-0000 (Kubernetes) at
>> >> >>> scheduler(1)@15.242.100.60:59488
>> >> >>> I1227 22:53:08.817797 8050 master.cpp:1146] Giving framework
>> >> >>> 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-0000 (Kubernetes) at
>> >> >>> scheduler(1)@15.242.100.60:59488 7625.14222623576weeks to failover
>> >> >>> W1227 22:53:08.818389 8062 master.cpp:4840] Master returning
>> >> >>> resources offered to framework
>> >> >>> 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-0000 because the framework has
>> >> >>> terminated or is inactive
>> >> >>> I1227 22:53:08.818397  8052 hierarchical.cpp:273] Deactivated
>> >> >>> framework 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-0000
>> >> >>> I1227 22:53:08.819046  8066 hierarchical.cpp:744] Recovered
>> >> >>> cpus(*):32; mem(*):127878; disk(*):4336; ports(*):[31000-32000]
>> >> >>> (total: cpus(*):32; mem(*):127878; disk(*):4336;
>> >> >>> ports(*):[31000-32000], allocated: ) on slave
>> >> >>> 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-S0 from framework
>> >> >>> 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-0000
>> >> >>> ......
>> >> >>>
>> >> >>> I can't figure out why the Mesos master complains "Failed to shutdown
>> >> >>> socket with fd 17: Transport endpoint is not connected".
>> >> >>> Could someone give me some clues about this issue?
>> >> >>>
>> >> >>> Thanks very much in advance!
>> >> >>>
>> >> >>> Best Regards
>> >> >>> Nan Xiao
>> >> >>
>> >> >>
>> >
>> >
>> >
>> >
>> > --
>> > Avinash Sridharan, Mesosphere
>> > +1 (323) 702 5245
>
>
>
>
> --
> Avinash Sridharan, Mesosphere
> +1 (323) 702 5245
