Hi,

Can you share:
- zone type
- network type
- number of control nodes


-Wei

On Thu, 15 Feb 2024 at 08:52, Wally B <[email protected]> wrote:

> So,
>
> Recreating the Sec Storage VM fixed the cert issue, and I was able to
> install the K8s 1.28.4 binaries. Thanks, Wei ZHOU!
>
>
> I'm still getting
>
> [FAILED] Failed to start Execute cloud user/final scripts.
>
> on 1 control and 1 worker.
>
> *Control 1 --  pz-dev-k8s-ncus-00001-control-18dabaf66c1  --    :* No
> errors at the CLI
>
> kubectl get nodes
> NAME                                        STATUS   ROLES           AGE     VERSION
> pz-dev-k8s-ncus-00001-control-18dabaf0edb   Ready    control-plane   5m2s    v1.28.4
> pz-dev-k8s-ncus-00001-control-18dabaf66c1   Ready    control-plane   4m44s   v1.28.4
> pz-dev-k8s-ncus-00001-node-18dabafb0bd      Ready    <none>          4m47s   v1.28.4
> pz-dev-k8s-ncus-00001-node-18dabb006bc      Ready    <none>          4m47s   v1.28.4
>
>
> kubectl get pods --all-namespaces
> NAMESPACE              NAME                                                                READY   STATUS    RESTARTS        AGE
> kube-system            coredns-5dd5756b68-295gb                                            1/1     Running   0               5m32s
> kube-system            coredns-5dd5756b68-cdwvw                                            1/1     Running   0               5m33s
> kube-system            etcd-pz-dev-k8s-ncus-00001-control-18dabaf0edb                      1/1     Running   0               5m36s
> kube-system            etcd-pz-dev-k8s-ncus-00001-control-18dabaf66c1                      1/1     Running   0               5m23s
> kube-system            kube-apiserver-pz-dev-k8s-ncus-00001-control-18dabaf0edb            1/1     Running   0               5m36s
> kube-system            kube-apiserver-pz-dev-k8s-ncus-00001-control-18dabaf66c1            1/1     Running   0               5m23s
> kube-system            kube-controller-manager-pz-dev-k8s-ncus-00001-control-18dabaf0edb   1/1     Running   1 (5m13s ago)   5m36s
> kube-system            kube-controller-manager-pz-dev-k8s-ncus-00001-control-18dabaf66c1   1/1     Running   0               5m23s
> kube-system            kube-proxy-2m8zb                                                    1/1     Running   0               5m26s
> kube-system            kube-proxy-cwpjg                                                    1/1     Running   0               5m33s
> kube-system            kube-proxy-l2vbf                                                    1/1     Running   0               5m26s
> kube-system            kube-proxy-qhlqt                                                    1/1     Running   0               5m23s
> kube-system            kube-scheduler-pz-dev-k8s-ncus-00001-control-18dabaf0edb            1/1     Running   1 (5m8s ago)    5m36s
> kube-system            kube-scheduler-pz-dev-k8s-ncus-00001-control-18dabaf66c1            1/1     Running   0               5m23s
> kube-system            weave-net-5cs26                                                     2/2     Running   1 (5m9s ago)    5m26s
> kube-system            weave-net-9zqrw                                                     2/2     Running   1 (5m28s ago)   5m33s
> kube-system            weave-net-fcwtr                                                     2/2     Running   0               5m23s
> kube-system            weave-net-lh2dh                                                     2/2     Running   1 (4m41s ago)   5m26s
> kubernetes-dashboard   dashboard-metrics-scraper-5657497c4c-r284t                          1/1     Running   0               5m32s
> kubernetes-dashboard   kubernetes-dashboard-5b749d9495-vtwdd                               1/1     Running   0               5m32s
>
>
>
> *Control 2 ---  pz-dev-k8s-ncus-00001-control-18dabaf66c1   :*  [FAILED]
> Failed to start Execute cloud user/final scripts.
>
> kubectl get nodes
> E0215 07:38:33.314561    2643 memcache.go:265] couldn't get current server
> API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp
> 127.0.0.1:8080: connect: connection refused
> E0215 07:38:33.316751    2643 memcache.go:265] couldn't get current server
> API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp
> 127.0.0.1:8080: connect: connection refused
> E0215 07:38:33.317754    2643 memcache.go:265] couldn't get current server
> API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp
> 127.0.0.1:8080: connect: connection refused
> E0215 07:38:33.319181    2643 memcache.go:265] couldn't get current server
> API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp
> 127.0.0.1:8080: connect: connection refused
> E0215 07:38:33.319975    2643 memcache.go:265] couldn't get current server
> API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp
> 127.0.0.1:8080: connect: connection refused
> The connection to the server localhost:8080 was refused - did you specify
> the right host or port?
>
>
> kubectl get pods --all-namespaces
> E0215 07:42:23.786704    2700 memcache.go:265] couldn't get current server
> API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp
> 127.0.0.1:8080: connect: connection refused
> E0215 07:42:23.787455    2700 memcache.go:265] couldn't get current server
> API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp
> 127.0.0.1:8080: connect: connection refused
> E0215 07:42:23.789529    2700 memcache.go:265] couldn't get current server
> API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp
> 127.0.0.1:8080: connect: connection refused
> E0215 07:42:23.790051    2700 memcache.go:265] couldn't get current server
> API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp
> 127.0.0.1:8080: connect: connection refused
> E0215 07:42:23.791742    2700 memcache.go:265] couldn't get current server
> API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp
> 127.0.0.1:8080: connect: connection refused
> The connection to the server localhost:8080 was refused - did you specify
> the right host or port?
>
>
> */var/log/daemon.log*
>
> https://docs.google.com/document/d/1KuIx0jI4TuAXPgACY3rJQz3L2B8AjeqOL0Fm5r4YF5M/edit?usp=sharing
>
> */var/log/messages*
>
> https://docs.google.com/document/d/15xet6kxI9rdgi4RkIHqtn-Wywph4h1Coyt_cyrJYkv4/edit?usp=sharing
>
> On Thu, Feb 15, 2024 at 1:21 AM Wei ZHOU <[email protected]> wrote:
>
> > Can you destroy the SSVM and retry once the new SSVM is Up?
> >
> > -Wei
> >
> > On Thursday, February 15, 2024, Wally B <[email protected]> wrote:
> >
> > > Super weird. I have two other versions added successfully, but now when
> > > I try to add an ISO/version I get the following on the management host.
> > > This is the first time I've tried adding a K8s version since 4.18.0.
> > >
> > >
> > > tail -f /var/log/cloudstack/management/management-server.log | grep ERROR
> > >
> > > 2024-02-15 06:26:18,900 DEBUG [c.c.a.t.Request]
> > > (AgentManager-Handler-5:null) (logid:) Seq 48-6373437897659383816:
> > > Processing:  { Ans: , MgmtId: 15643723020152, via: 48, Ver: v1, Flags: 10,
> > > [{"com.cloud.agent.api.storage.DownloadAnswer":{
> > > "jobId":"39d72d08-ab48-47dd-b09a-eee3ed816f4d",
> > > "downloadPct":"0",
> > > "errorString":"PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target",
> > > "downloadStatus":"DOWNLOAD_ERROR",
> > > "downloadPath":"/mnt/SecStorage/73075a0a-38a1-3631-8170-8887c04f6073/template/tmpl/1/223/dnld9180711723601784047tmp_",
> > > "installPath":"template/tmpl/1/223",
> > > "templateSize":"(0 bytes) 0","templatePhySicalSize":"(0 bytes) 0",
> > > "checkSum":"4dfb9d8be2191bc8bc4b89d78795a5b","result":"true",
> > > "details":"PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target",
> > > "wait":"0","bypassHostMaintenance":"false"}}] }
> > >
> > > 2024-02-15 06:26:18,937 ERROR [o.a.c.s.i.BaseImageStoreDriverImpl]
> > > (RemoteHostEndPoint-5:ctx-55063062) (logid:e21177cb) Failed to register
> > > template: b6e79c5a-38d4-4cf5-8606-e6f209b6b4c2 with error: PKIX path
> > > building failed:
> > > sun.security.provider.certpath.SunCertPathBuilderException: unable to
> > > find valid certification path to requested target
> > >
> > >
> > >
> > >
> > > On Wed, Feb 14, 2024 at 11:27 PM Wei ZHOU <[email protected]> wrote:
> > >
> > > > Can you try 1.27.8 or 1.28.4 from https://download.cloudstack.org/cks/ ?
> > > >
> > > >
> > > > -Wei
> > > >
> > > > On Thursday, February 15, 2024, Wally B <[email protected]> wrote:
> > > >
> > > > > Hello Everyone!
> > > > >
> > > > > We are currently attempting to deploy k8s clusters and are running
> > > > > into issues with the deployment.
> > > > >
> > > > >
> > > > > Current CS Environment:
> > > > >
> > > > > CloudStack Version: 4.19.0 (same issue before we upgraded from 4.18.1).
> > > > > Hypervisor Type: Ubuntu 20.04.03 KVM
> > > > > Attempted K8s Bins: 1.23.3, 1.27.3
> > > > >
> > > > >
> > > > >
> > > > > ======== ISSUE =========
> > > > >
> > > > > For some reason, when we attempt cluster provisioning, all of the
> > > > > VMs start up and SSH keys are installed, but then on at least 1,
> > > > > sometimes 2, of the VMs (control and/or worker) we get:
> > > > >
> > > > > [FAILED] Failed to start deploy-kube-system.service.
> > > > > [FAILED] Failed to start Execute cloud user/final scripts.
> > > > >
> > > > > The CloudStack UI just says:
> > > > > Create Kubernetes cluster test-cluster in progress
> > > > > for about an hour (I assume this is the 3600-second timeout) and
> > > > > then fails.
> > > > >
> > > > > In the user's event log it stays on:
> > > > > INFO KUBERNETES.CLUSTER.CREATE
> > > > > Scheduled
> > > > > Creating Kubernetes cluster. Cluster Id: XXX
> > > > >
> > > > >
> > > > >
> > > > > I can SSH into the VMs with their assigned private keys. I
> > > > > attempted to run the deploy-kube-system script, but it just says
> > > > > already provisioned! I'm not sure how I would Execute cloud
> > > > > user/final scripts. If I attempt to stop the cluster and start it
> > > > > again, nothing seems to change.
> > > > >
> > > > >
> > > > >
> > > > > Any help would be appreciated. I can provide any details as they
> > > > > are needed!
> > > > >
> > > > > Thanks!
> > > > > Wally
> > > > >
> > > >
> > >
> >
>
