Hi Avinash,

Sorry for being unclear earlier!
The root cause is not related to k8s itself, but to the CentOS host that k8s is running on: it is an iptables issue. After executing "iptables -F", it works!

Best Regards
Nan Xiao

On Wed, Dec 30, 2015 at 11:41 PM, Avinash Sridharan <[email protected]> wrote:
> Thanks for the update, Nan. k8s enabling firewall rules that would block traffic to the master seems a bit odd. It looks like a bug to me, in the head of the branch. If you are able to reproduce it consistently, could you file an issue against kubernetes-mesos?
>
> regards,
> Avinash
>
> On Tue, Dec 29, 2015 at 11:04 PM, Nan Xiao <[email protected]> wrote:
>>
>> Hi Avinash,
>>
>> Thanks very much for your reply!
>>
>> The root cause has been found: the k8s server has iptables enabled, which blocks connections from the Mesos master; after disabling it, it works!
>>
>> Best Regards
>> Nan Xiao
>>
>> On Wed, Dec 30, 2015 at 3:22 AM, Avinash Sridharan <[email protected]> wrote:
>> > lsof will only show actively opened file descriptors, so if you ran the command after seeing the error logs on the master, the master had most probably already closed this fd. Just throwing out a few other things to look at that might give some more insight:
>> >
>> > * Run the "netstat -na" and "netstat -nt" commands on the master and the Kubernetes master node to make sure that the master is listening on the right port and that the k8s scheduler is trying to connect to the right port. From the logs it does look like the master is receiving the registration request, so there shouldn't be a network configuration issue here.
>> > * Make sure there are no firewall rules getting turned on in your cluster, since it looks like the k8s scheduler is not able to connect to the master (though it was able to register the first time).
>> >
>> > On Tue, Dec 29, 2015 at 1:37 AM, Nan Xiao <[email protected]> wrote:
>> >>
>> >> BTW, the "lsof" command shows there are only 16 file descriptors, so I don't know why the Mesos master tries to close "fd 17".
>> >>
>> >> Best Regards
>> >> Nan Xiao
>> >>
>> >> On Tue, Dec 29, 2015 at 11:32 AM, Nan Xiao <[email protected]> wrote:
>> >> > Hi Klaus,
>> >> >
>> >> > Firstly, thanks very much for your answer!
>> >> >
>> >> > The km processes are all alive:
>> >> > root 129474 128024 2 22:26 pts/0 00:00:00 km apiserver --address=15.242.100.60 --etcd-servers=http://15.242.100.60:4001 --service-cluster-ip-range=10.10.10.0/24 --port=8888 --cloud-provider=mesos --cloud-config=mesos-cloud.conf --secure-port=0 --v=1
>> >> > root 129509 128024 2 22:26 pts/0 00:00:00 km controller-manager --master=15.242.100.60:8888 --cloud-provider=mesos --cloud-config=./mesos-cloud.conf --v=1
>> >> > root 129538 128024 0 22:26 pts/0 00:00:00 km scheduler --address=15.242.100.60 --mesos-master=15.242.100.56:5050 --etcd-servers=http://15.242.100.60:4001 --mesos-user=root --api-servers=15.242.100.60:8888 --cluster-dns=10.10.10.10 --cluster-domain=cluster.local --v=2
>> >> >
>> >> > All the logs also seem OK, except the ones from scheduler.log:
>> >> > ......
>> >> > I1228 22:26:37.883092 129538 messenger.go:381] Receiving message mesos.internal.InternalMasterChangeDetected from scheduler(1)@15.242.100.60:33077
>> >> > I1228 22:26:37.883225 129538 scheduler.go:374] New master [email protected]:5050 detected
>> >> > I1228 22:26:37.883268 129538 scheduler.go:435] No credentials were provided. Attempting to register scheduler without authentication.
>> >> > I1228 22:26:37.883356 129538 scheduler.go:928] Registering with master: [email protected]:5050
>> >> > I1228 22:26:37.883460 129538 messenger.go:187] Sending message mesos.internal.RegisterFrameworkMessage to [email protected]:5050
>> >> > I1228 22:26:37.883504 129538 scheduler.go:881] will retry registration in 1.209320575s if necessary
>> >> > I1228 22:26:37.883758 129538 http_transporter.go:193] Sending message to [email protected]:5050 via http
>> >> > I1228 22:26:37.883873 129538 http_transporter.go:587] libproc target URL http://15.242.100.56:5050/master/mesos.internal.RegisterFrameworkMessage
>> >> > I1228 22:26:39.093560 129538 scheduler.go:928] Registering with master: [email protected]:5050
>> >> > I1228 22:26:39.093659 129538 messenger.go:187] Sending message mesos.internal.RegisterFrameworkMessage to [email protected]:5050
>> >> > I1228 22:26:39.093702 129538 scheduler.go:881] will retry registration in 3.762036352s if necessary
>> >> > I1228 22:26:39.093765 129538 http_transporter.go:193] Sending message to [email protected]:5050 via http
>> >> > I1228 22:26:39.093847 129538 http_transporter.go:587] libproc target URL http://15.242.100.56:5050/master/mesos.internal.RegisterFrameworkMessage
>> >> > ......
>> >> >
>> >> > From the log, the Mesos master rejected k8s's registration, and k8s retries constantly.
>> >> >
>> >> > Have you met this issue before? Thanks very much in advance!
>> >> >
>> >> > Best Regards
>> >> > Nan Xiao
>> >> >
>> >> > On Mon, Dec 28, 2015 at 7:26 PM, Klaus Ma <[email protected]> wrote:
>> >> >> It seems Kubernetes is down; would you help to check Kubernetes's status (km)?
>> >> >>
>> >> >> ----
>> >> >> Da (Klaus), Ma (马达) | PMP® | Advisory Software Engineer
>> >> >> Platform Symphony/DCOS Development & Support, STG, IBM GCG
>> >> >> +86-10-8245 4084 | [email protected] | http://k82.me
>> >> >>
>> >> >> On Mon, Dec 28, 2015 at 6:35 PM, Nan Xiao <[email protected]> wrote:
>> >> >>>
>> >> >>> Hi all,
>> >> >>>
>> >> >>> Greetings from me!
>> >> >>>
>> >> >>> I am trying to follow this tutorial (https://github.com/kubernetes/kubernetes/blob/master/docs/getting-started-guides/mesos.md) to deploy "k8s on Mesos" on local machines: k8s is the newest master branch, and Mesos is the 0.26 release.
>> >> >>>
>> >> >>> After running the Mesos master (IP: 15.242.100.56), the Mesos slave (IP: 15.242.100.16), and k8s (IP: 15.242.100.60), I can see the following logs from the Mesos master:
>> >> >>>
>> >> >>> ......
>> >> >>> I1227 22:52:34.494478 8069 master.cpp:4269] Received update of slave 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-S0 at slave(1)@15.242.100.16:5051 (pqsfc016.ftc.rdlabs.hpecorp.net) with total oversubscribed resources
>> >> >>> I1227 22:52:34.494940 8065 hierarchical.cpp:400] Slave 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-S0 (pqsfc016.ftc.rdlabs.hpecorp.net) updated with oversubscribed resources (total: cpus(*):32; mem(*):127878; disk(*):4336; ports(*):[31000-32000], allocated: )
>> >> >>> I1227 22:53:06.740757 8053 http.cpp:334] HTTP GET for /master/state.json from 15.242.100.60:56219 with User-Agent='Go-http-client/1.1'
>> >> >>> I1227 22:53:07.736419 8065 http.cpp:334] HTTP GET for /master/state.json from 15.242.100.60:56241 with User-Agent='Go-http-client/1.1'
>> >> >>> I1227 22:53:07.767196 8070 http.cpp:334] HTTP GET for /master/state.json from 15.242.100.60:56252 with User-Agent='Go-http-client/1.1'
>> >> >>> I1227 22:53:08.808171 8053 http.cpp:334] HTTP GET for /master/state.json from 15.242.100.60:56272 with User-Agent='Go-http-client/1.1'
>> >> >>> I1227 22:53:08.815811 8060 master.cpp:2176] Received SUBSCRIBE call for framework 'Kubernetes' at scheduler(1)@15.242.100.60:59488
>> >> >>> I1227 22:53:08.816182 8060 master.cpp:2247] Subscribing framework Kubernetes with checkpointing enabled and capabilities [ ]
>> >> >>> I1227 22:53:08.817294 8052 hierarchical.cpp:195] Added framework 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-0000
>> >> >>> I1227 22:53:08.817464 8050 master.cpp:1122] Framework 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-0000 (Kubernetes) at scheduler(1)@15.242.100.60:59488 disconnected
>> >> >>> E1227 22:53:08.817497 8073 process.cpp:1911] Failed to shutdown socket with fd 17: Transport endpoint is not connected
>> >> >>> I1227 22:53:08.817533 8050 master.cpp:2472] Disconnecting framework 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-0000 (Kubernetes) at scheduler(1)@15.242.100.60:59488
>> >> >>> I1227 22:53:08.817595 8050 master.cpp:2496] Deactivating framework 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-0000 (Kubernetes) at scheduler(1)@15.242.100.60:59488
>> >> >>> I1227 22:53:08.817797 8050 master.cpp:1146] Giving framework 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-0000 (Kubernetes) at scheduler(1)@15.242.100.60:59488 7625.14222623576weeks to failover
>> >> >>> W1227 22:53:08.818389 8062 master.cpp:4840] Master returning resources offered to framework 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-0000 because the framework has terminated or is inactive
>> >> >>> I1227 22:53:08.818397 8052 hierarchical.cpp:273] Deactivated framework 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-0000
>> >> >>> I1227 22:53:08.819046 8066 hierarchical.cpp:744] Recovered cpus(*):32; mem(*):127878; disk(*):4336; ports(*):[31000-32000] (total: cpus(*):32; mem(*):127878; disk(*):4336; ports(*):[31000-32000], allocated: ) on slave 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-S0 from framework 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-0000
>> >> >>> ......
>> >> >>>
>> >> >>> I can't figure out why the Mesos master complains "Failed to shutdown socket with fd 17: Transport endpoint is not connected". Could someone give some clues on this issue?
>> >> >>>
>> >> >>> Thanks very much in advance!
>> >> >>>
>> >> >>> Best Regards
>> >> >>> Nan Xiao
>> >
>> > --
>> > Avinash Sridharan, Mesosphere
>> > +1 (323) 702 5245
>
> --
> Avinash Sridharan, Mesosphere
> +1 (323) 702 5245
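For anyone who lands on this thread with the same symptom, here is a minimal sketch of how the firewall diagnosis and a narrower fix than a blanket "iptables -F" might look on the CentOS k8s node. The addresses are taken from the thread; whether the host runs the classic iptables service or firewalld is not stated there, so the persistence step below is only an assumption.

# On the k8s node (15.242.100.60): list the active rules and look for
# REJECT/DROP entries that would stop the Mesos master from connecting
# back to the scheduler's libprocess port.
iptables -L -n -v --line-numbers

# Narrower alternative to "iptables -F" (which flushes every rule and is
# lost on reboot): accept TCP only from the Mesos master host.
iptables -I INPUT -s 15.242.100.56 -p tcp -j ACCEPT

# Persist the rule if the classic iptables service is in use (CentOS 6
# style); on a firewalld-based host the equivalent would be done with
# firewall-cmd instead.
service iptables save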
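Avinash's netstat suggestion can also be extended into a quick check of both directions of the scheduler/master connection. This is only a sketch built from the addresses in the thread; the scheduler's libprocess port (59488 in the master log) is ephemeral and will differ on every run.

# On the Mesos master (15.242.100.56): confirm it is listening on 5050 and
# see whether the scheduler's connections ever reach ESTABLISHED.
netstat -na | grep 5050

# From the k8s node (15.242.100.60): the scheduler talks to the master over
# HTTP, so a plain GET against the master is a quick reachability test.
curl -v http://15.242.100.56:5050/master/state.json > /dev/null

# The reverse direction matters too: the master connects back to the
# scheduler's libprocess endpoint (scheduler(1)@15.242.100.60:<port> in the
# logs). From the master, check that this port is reachable while the
# scheduler is running, for example:
telnet 15.242.100.60 59488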

