Hi Ashwanth,

We have seen this with an older version of EKS; we stayed on 1.23 for too
long.

We faced this issue as well as loss of connectivity.

We tested connections manually to various endpoints other than the GoCD
server and saw that the node would lose connectivity to them.
We connected over AWS Session Manager, not over SSH.
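
A minimal sketch of that kind of manual check, assuming Python 3 is
available on the node. The helper names and the endpoints listed are
hypothetical placeholders, not the ones we actually probed; it only tests
whether a TCP connection can be opened within a timeout.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False


def probe(endpoints, timeout=3.0):
    """Map each (host, port) pair to whether it was reachable."""
    return {(h, p): can_connect(h, p, timeout) for h, p in endpoints}


if __name__ == "__main__":
    # Hypothetical endpoints; substitute the ones your agents depend on.
    endpoints = [
        ("gocd-server.example.internal", 8154),
        ("sts.amazonaws.com", 443),
    ]
    for (host, port), ok in probe(endpoints).items():
        print(f"{host}:{port} {'reachable' if ok else 'UNREACHABLE'}")
```

Running it from a Session Manager shell on a suspect node and on a healthy
node makes it easier to tell a node-level network problem from a problem
with a single endpoint.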

Restarting the EC2 worker node was the only way to get connectivity back.

We were able to use these two issues to urge that the EKS upgrade be
prioritised. (We are helping the client on their journey to achieve
Continuous Delivery and have added the IaC for EKS to our work backlog.)

— Sriram

On Mon, 23 Dec 2024 at 8:15 PM, 'Ashwanth Kumar' via go-cd <
go-cd@googlegroups.com> wrote:

> Hello,
>
> I'm running GoCD: 24.3.0 with an elastic agent profile running on AWS EKS
> cluster. My pipelines run properly most of the time, but sometimes certain
> runs get into some limbo state with pod logs as below. When it does, the
> pipeline is just stuck waiting on assigning agents.
>
> Is there anything obvious for anyone who has seen this error before?
> Should I just upgrade to 24.5.0 and see? The only way I get out of this is
> by terminating my node on EC2 forcefully. Killing the pod also doesn't
> help, because all newer pods have the same issue.
>
> $ sudo /run-docker-daemon.sh
> Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the
> docker daemon running?
> Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the
> docker daemon running?
> Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the
> docker daemon running?
> /run-docker-daemon.sh: line 23:     9 Killed                  $(which
> dind) dockerd --host=unix:///var/run/docker.sock
> ${DOCKERD_ADDITIONAL_ARGS:-'--host=tcp://localhost:2375'} >
> /var/log/dockerd.log 2>&1
> Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the
> docker daemon running?
> Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the
> docker daemon running?
> Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the
> docker daemon running?
> Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the
> docker daemon running?
> Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the
> docker daemon running?
> Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the
> docker daemon running?
> Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the
> docker daemon running?
> Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the
> docker daemon running?
> Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the
> docker daemon running?
> Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the
> docker daemon running?
> Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the
> docker daemon running?
> Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the
> docker daemon running?
> Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the
> docker daemon running?
> Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the
> docker daemon running?
> Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the
> docker daemon running?
> Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the
> docker daemon running?
> Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the
> docker daemon running?
> Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the
> docker daemon running?
> Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the
> docker daemon running?
> Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the
> docker daemon running?
> Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the
> docker daemon running?
> Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the
> docker daemon running?
> Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the
> docker daemon running?
> Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the
> docker daemon running?
> Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the
> docker daemon running?
> Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the
> docker daemon running?
> Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the
> docker daemon running?
> Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the
> docker daemon running?
> dockerd startup failed...
> time="2024-12-23T14:35:58.923528879Z" level=info msg="Starting up"
> time="2024-12-23T14:35:58.924393339Z" level=warning msg="Binding to IP
> address without --tlsverify is insecure and gives root access on this
> machine to everyone who has access to your network."
> host="tcp://localhost:2375"
> time="2024-12-23T14:35:58.924414660Z" level=warning msg="Binding to an IP
> address, even on localhost, can also give access to scripts run in a
> browser. Be safe out there!" host="tcp://localhost:2375"
> time="2024-12-23T14:35:58.924442010Z" level=warning msg="[DEPRECATION
> NOTICE] In future versions this will be a hard failure preventing the
> daemon from starting! Learn more at:
> https://docs.docker.com/go/api-security/" host="tcp://localhost:2375"
> time="2024-12-23T14:35:59.925995985Z" level=info msg="containerd not
> running, starting managed containerd"
> time="2024-12-23T14:35:59.928570903Z" level=info msg="started new
> containerd process" address=/var/run/docker/containerd/containerd.sock
> module=libcontainerd pid=141
> time="2024-12-23T14:35:59.956936540Z" level=info msg="starting containerd"
> revision=8fc6bcff51318944179630522a095cc9dbf9f353 version=v1.7.20
> time="2024-12-23T14:36:00.000700964Z" level=info msg="loading plugin
> \"io.containerd.event.v1.exchange\"..." type=io.containerd.event.v1
> time="2024-12-23T14:36:00.000778006Z" level=info msg="loading plugin
> \"io.containerd.internal.v1.opt\"..." type=io.containerd.internal.v1
> time="2024-12-23T14:36:00.001422560Z" level=info msg="loading plugin
> \"io.containerd.warning.v1.deprecations\"..." type=io.containerd.warning.v1
> time="2024-12-23T14:36:00.001454791Z" level=info msg="loading plugin
> \"io.containerd.snapshotter.v1.blockfile\"..."
> type=io.containerd.snapshotter.v1
> time="2024-12-23T14:36:00.001564513Z" level=info msg="skip loading plugin
> \"io.containerd.snapshotter.v1.blockfile\"..." error="no scratch file
> generator: skip plugin" type=io.containerd.snapshotter.v1
> time="2024-12-23T14:36:00.001590374Z" level=info msg="loading plugin
> \"io.containerd.snapshotter.v1.devmapper\"..."
> type=io.containerd.snapshotter.v1
> time="2024-12-23T14:36:00.001613715Z" level=info msg="skip loading plugin
> \"io.containerd.snapshotter.v1.devmapper\"..." error="devmapper not
> configured: skip plugin" type=io.containerd.snapshotter.v1
> time="2024-12-23T14:36:00.001631625Z" level=info msg="loading plugin
> \"io.containerd.snapshotter.v1.native\"..."
> type=io.containerd.snapshotter.v1
> time="2024-12-23T14:36:00.001791238Z" level=info msg="loading plugin
> \"io.containerd.snapshotter.v1.overlayfs\"..."
> type=io.containerd.snapshotter.v1
> time="2024-12-23T14:36:00.002375252Z" level=info msg="loading plugin
> \"io.containerd.snapshotter.v1.aufs\"..." type=io.containerd.snapshotter.v1
> time="2024-12-23T14:36:00.018509514Z" level=info msg="skip loading plugin
> \"io.containerd.snapshotter.v1.aufs\"..." error="aufs is not supported
> (modprobe aufs failed: exit status 1 \"ip: can't find device
> 'aufs'\\nmodprobe: can't change directory to '/lib/modules': No such file
> or directory\\n\"): skip plugin" type=io.containerd.snapshotter.v1
> time="2024-12-23T14:36:00.018690338Z" level=info msg="loading plugin
> \"io.containerd.snapshotter.v1.zfs\"..." type=io.containerd.snapshotter.v1
> time="2024-12-23T14:36:00.019063176Z" level=info msg="skip loading plugin
> \"io.containerd.snapshotter.v1.zfs\"..." error="path
> /var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.zfs must be
> a zfs filesystem to be used with the zfs snapshotter: skip plugin"
> type=io.containerd.snapshotter.v1
> time="2024-12-23T14:36:00.019096327Z" level=info msg="loading plugin
> \"io.containerd.content.v1.content\"..." type=io.containerd.content.v1
> time="2024-12-23T14:36:00.019363883Z" level=info msg="loading plugin
> \"io.containerd.metadata.v1.bolt\"..." type=io.containerd.metadata.v1
> time="2024-12-23T14:36:00.019483946Z" level=info msg="metadata content
> store policy set" policy=shared
> time="2024-12-23T14:36:00.025371189Z" level=info msg="loading plugin
> \"io.containerd.gc.v1.scheduler\"..." type=io.containerd.gc.v1
> time="2024-12-23T14:36:00.025466381Z" level=info msg="loading plugin
> \"io.containerd.differ.v1.walking\"..." type=io.containerd.differ.v1
> time="2024-12-23T14:36:00.025513112Z" level=info msg="loading plugin
> \"io.containerd.lease.v1.manager\"..." type=io.containerd.lease.v1
> time="2024-12-23T14:36:00.025568054Z" level=info msg="loading plugin
> \"io.containerd.streaming.v1.manager\"..." type=io.containerd.streaming.v1
> time="2024-12-23T14:36:00.025598394Z" level=info msg="loading plugin
> \"io.containerd.runtime.v1.linux\"..." type=io.containerd.runtime.v1
> time="2024-12-23T14:36:00.025975922Z" level=info msg="loading plugin
> \"io.containerd.monitor.v1.cgroups\"..." type=io.containerd.monitor.v1
> time="2024-12-23T14:36:00.026418103Z" level=info msg="loading plugin
> \"io.containerd.runtime.v2.task\"..." type=io.containerd.runtime.v2
> time="2024-12-23T14:36:00.026879433Z" level=info msg="loading plugin
> \"io.containerd.runtime.v2.shim\"..." type=io.containerd.runtime.v2
> time="2024-12-23T14:36:00.026911204Z" level=info msg="loading plugin
> \"io.containerd.sandbox.store.v1.local\"..."
> type=io.containerd.sandbox.store.v1
> time="2024-12-23T14:36:00.026934074Z" level=info msg="loading plugin
> \"io.containerd.sandbox.controller.v1.local\"..."
> type=io.containerd.sandbox.controller.v1
> time="2024-12-23T14:36:00.026959075Z" level=info msg="loading plugin
> \"io.containerd.service.v1.containers-service\"..."
> type=io.containerd.service.v1
> time="2024-12-23T14:36:00.026981415Z" level=info msg="loading plugin
> \"io.containerd.service.v1.content-service\"..."
> type=io.containerd.service.v1
> time="2024-12-23T14:36:00.027001746Z" level=info msg="loading plugin
> \"io.containerd.service.v1.diff-service\"..." type=io.containerd.service.v1
> time="2024-12-23T14:36:00.027134208Z" level=info msg="loading plugin
> \"io.containerd.service.v1.images-service\"..."
> type=io.containerd.service.v1
> time="2024-12-23T14:36:00.027172850Z" level=info msg="loading plugin
> \"io.containerd.service.v1.introspection-service\"..."
> type=io.containerd.service.v1
> time="2024-12-23T14:36:00.027207840Z" level=info msg="loading plugin
> \"io.containerd.service.v1.namespaces-service\"..."
> type=io.containerd.service.v1
> time="2024-12-23T14:36:00.027228851Z" level=info msg="loading plugin
> \"io.containerd.service.v1.snapshots-service\"..."
> type=io.containerd.service.v1
> time="2024-12-23T14:36:00.027250042Z" level=info msg="loading plugin
> \"io.containerd.service.v1.tasks-service\"..." type=io.containerd.service.v1
> time="2024-12-23T14:36:00.027280312Z" level=info msg="loading plugin
> \"io.containerd.grpc.v1.containers\"..." type=io.containerd.grpc.v1
> time="2024-12-23T14:36:00.027302052Z" level=info msg="loading plugin
> \"io.containerd.grpc.v1.content\"..." type=io.containerd.grpc.v1
> time="2024-12-23T14:36:00.027322173Z" level=info msg="loading plugin
> \"io.containerd.grpc.v1.diff\"..." type=io.containerd.grpc.v1
> time="2024-12-23T14:36:00.027343804Z" level=info msg="loading plugin
> \"io.containerd.grpc.v1.events\"..." type=io.containerd.grpc.v1
> time="2024-12-23T14:36:00.027364884Z" level=info msg="loading plugin
> \"io.containerd.grpc.v1.images\"..." type=io.containerd.grpc.v1
> time="2024-12-23T14:36:00.027385374Z" level=info msg="loading plugin
> \"io.containerd.grpc.v1.introspection\"..." type=io.containerd.grpc.v1
> time="2024-12-23T14:36:00.027486097Z" level=info msg="loading plugin
> \"io.containerd.grpc.v1.leases\"..." type=io.containerd.grpc.v1
> time="2024-12-23T14:36:00.027533828Z" level=info msg="loading plugin
> \"io.containerd.grpc.v1.namespaces\"..." type=io.containerd.grpc.v1
> time="2024-12-23T14:36:00.027563219Z" level=info msg="loading plugin
> \"io.containerd.grpc.v1.sandbox-controllers\"..." type=io.containerd.grpc.v1
> time="2024-12-23T14:36:00.027596579Z" level=info msg="loading plugin
> \"io.containerd.grpc.v1.sandboxes\"..." type=io.containerd.grpc.v1
> time="2024-12-23T14:36:00.027616510Z" level=info msg="loading plugin
> \"io.containerd.grpc.v1.snapshots\"..." type=io.containerd.grpc.v1
> time="2024-12-23T14:36:00.027644821Z" level=info msg="loading plugin
> \"io.containerd.grpc.v1.streaming\"..." type=io.containerd.grpc.v1
> time="2024-12-23T14:36:00.027668330Z" level=info msg="loading plugin
> \"io.containerd.grpc.v1.tasks\"..." type=io.containerd.grpc.v1
> time="2024-12-23T14:36:00.027701042Z" level=info msg="loading plugin
> \"io.containerd.transfer.v1.local\"..." type=io.containerd.transfer.v1
> time="2024-12-23T14:36:00.027736933Z" level=info msg="loading plugin
> \"io.containerd.grpc.v1.transfer\"..." type=io.containerd.grpc.v1
> time="2024-12-23T14:36:00.027756793Z" level=info msg="loading plugin
> \"io.containerd.grpc.v1.version\"..." type=io.containerd.grpc.v1
> time="2024-12-23T14:36:00.027778393Z" level=info msg="loading plugin
> \"io.containerd.internal.v1.restart\"..." type=io.containerd.internal.v1
> time="2024-12-23T14:36:00.027988048Z" level=info msg="loading plugin
> \"io.containerd.tracing.processor.v1.otlp\"..."
> type=io.containerd.tracing.processor.v1
> time="2024-12-23T14:36:00.028029879Z" level=info msg="skip loading plugin
> \"io.containerd.tracing.processor.v1.otlp\"..." error="skip plugin: tracing
> endpoint not configured" type=io.containerd.tracing.processor.v1
> time="2024-12-23T14:36:00.028055519Z" level=info msg="loading plugin
> \"io.containerd.internal.v1.tracing\"..." type=io.containerd.internal.v1
> time="2024-12-23T14:36:00.028086000Z" level=info msg="skip loading plugin
> \"io.containerd.internal.v1.tracing\"..." error="skip plugin: tracing
> endpoint not configured" type=io.containerd.internal.v1
> time="2024-12-23T14:36:00.028110501Z" level=info msg="loading plugin
> \"io.containerd.grpc.v1.healthcheck\"..." type=io.containerd.grpc.v1
> time="2024-12-23T14:36:00.028163852Z" level=info msg="loading plugin
> \"io.containerd.nri.v1.nri\"..." type=io.containerd.nri.v1
> time="2024-12-23T14:36:00.028196603Z" level=info msg="NRI interface is
> disabled by configuration."
> time="2024-12-23T14:36:00.028682933Z" level=info msg=serving...
> address=/var/run/docker/containerd/containerd-debug.sock
> time="2024-12-23T14:36:00.028929999Z" level=info msg=serving...
> address=/var/run/docker/containerd/containerd.sock.ttrpc
> time="2024-12-23T14:36:00.029152474Z" level=info msg=serving...
> address=/var/run/docker/containerd/containerd.sock
> time="2024-12-23T14:36:00.029203025Z" level=info msg="containerd
> successfully booted in 0.073659s"
> time="2024-12-23T14:36:01.003481197Z" level=info msg="Loading containers:
> start."
> time="2024-12-23T14:36:02.301133968Z" level=info msg="Default bridge
> (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option
> --bip can be used to set a preferred IP address"
> time="2024-12-23T14:36:02.600676172Z" level=info msg="Loading containers:
> done."
> time="2024-12-23T14:36:02.635377282Z" level=warning msg="[DEPRECATION
> NOTICE]: API is accessible on http://localhost:2375 without encryption.\n
>         Access to the remote API is equivalent to root access on the host.
> Refer\n         to the 'Docker daemon attack surface' section in the
> documentation for\n         more information:
> https://docs.docker.com/go/attack-surface/\nIn future versions this will
> be a hard failure preventing the daemon from starting! Learn more at:
> https://docs.docker.com/go/api-security/"
> time="2024-12-23T14:36:02.635459914Z" level=info msg="Docker daemon"
> commit=cc13f95 containerd-snapshotter=false storage-driver=overlay2
> version=27.1.1
> time="2024-12-23T14:36:02.635735160Z" level=info msg="Daemon has completed
> initialization"
> /docker-entrypoint.sh: cannot sudo /run-docker-daemon.sh
>
>
> --
>
> Ashwanth Kumar / ashwanthkumar.in
>
> --
> You received this message because you are subscribed to the Google Groups
> "go-cd" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to go-cd+unsubscr...@googlegroups.com.
> To view this discussion visit
> https://groups.google.com/d/msgid/go-cd/CAD9m7Cz8R3sZtuNL_T49g_SW%2BqjdGjt087c-6NrbbiqSgh4QYg%40mail.gmail.com
> <https://groups.google.com/d/msgid/go-cd/CAD9m7Cz8R3sZtuNL_T49g_SW%2BqjdGjt087c-6NrbbiqSgh4QYg%40mail.gmail.com?utm_medium=email&utm_source=footer>
> .
>

