Hi Evgeniy,

From the following exception message:

        at org.apache.flink.shaded.netty4.io.netty.bootstrap.Bootstrap.connect(Bootstrap.java:123)
        at org.apache.flink.runtime.rest.RestClient.submitRequest(RestClient.java:469)
        at org.apache.flink.runtime.rest.RestClient.sendRequest(RestClient.java:392)
        at org.apache.flink.runtime.rest.RestClient.sendRequest(RestClient.java:306)
        at org.apache.flink.client.program.rest.RestClusterClient.lambda$null$37(RestClusterClient.java:931)
        at java.base/java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1072)

It seems that a request the client sent to the Flink cluster through the
REST API failed. Could you provide more information, such as the Kubernetes
configuration for the job, so the community can better help analyze the
problem?
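
For context, the two exception types in the quoted log can be reproduced
with plain JDK classes. The sketch below is only an illustration under
assumptions: it does not use Flink's RestClient or the shaded Netty classes,
and the class and method names are made up for this example. It shows that
an executor rejects new tasks once it has been shut down, and that a timed
get() on a future nobody completes ends in a TimeoutException. Whether this
is what actually happens inside the operator cannot be confirmed from the
log alone.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration class; not part of Flink or the operator.
final class TerminatedExecutorSketch {

    // Submits a task to an executor that was already shut down and
    // reports which exception type (if any) the submission raised.
    static String submitAfterShutdown() {
        ExecutorService eventLoop = Executors.newSingleThreadExecutor();
        eventLoop.shutdown(); // stands in for an event loop that has terminated
        try {
            eventLoop.execute(() -> { });
            return "accepted";
        } catch (RejectedExecutionException e) {
            return e.getClass().getSimpleName();
        }
    }

    // Waits with a timeout on a future that is never completed and
    // reports which exception type (if any) ended the wait.
    static String waitOnNeverCompletedFuture(long timeoutMillis) {
        CompletableFuture<String> response = new CompletableFuture<>();
        try {
            return response.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return e.getClass().getSimpleName();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(submitAfterShutdown());           // RejectedExecutionException
        System.out.println(waitOnNeverCompletedFuture(100)); // TimeoutException
    }
}
```

If the operator log shows the same pair of exceptions at nearly the same
timestamp, it may be worth checking whether the REST client was closed
while a request was still pending, but that is only a guess from this
pattern.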


Best,
Shammon FY

On Wed, Jun 7, 2023 at 11:35 PM Evgeniy Lyutikov <eblyuti...@avito.ru>
wrote:

> Hello.
> We use Kubernetes operator 1.4.0. The operator serves about 50 jobs, but
> sometimes errors appear in the logs and are reflected in the metrics
> (FlinkDeployment.JmDeploymentStatus.READY.Count). What causes these
> errors?
>
>
> 2023-06-07 15:28:27,601 o.a.f.k.o.c.FlinkDeploymentController [INFO ][job-name/job-name] Starting reconciliation
> 2023-06-07 15:28:27,602 o.a.f.k.o.s.FlinkResourceContextFactory [INFO ][job-name/job-name] Getting service for job-name
> 2023-06-07 15:28:27,602 o.a.f.k.o.o.JobStatusObserver  [INFO ][job-name/job-name] Observing job status
> 2023-06-07 15:28:39,623 o.a.f.s.n.i.n.c.AbstractChannel [WARN ] Force-closing a channel whose registration task was not accepted by an event loop: [id: 0xd494f516]
> java.util.concurrent.RejectedExecutionException: event executor terminated
>         at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor.reject(SingleThreadEventExecutor.java:923)
>         at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor.offerTask(SingleThreadEventExecutor.java:350)
>         at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor.addTask(SingleThreadEventExecutor.java:343)
>         at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor.execute(SingleThreadEventExecutor.java:825)
>         at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor.execute(SingleThreadEventExecutor.java:815)
>         at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannel$AbstractUnsafe.register(AbstractChannel.java:483)
>         at org.apache.flink.shaded.netty4.io.netty.channel.SingleThreadEventLoop.register(SingleThreadEventLoop.java:87)
>         at org.apache.flink.shaded.netty4.io.netty.channel.SingleThreadEventLoop.register(SingleThreadEventLoop.java:81)
>         at org.apache.flink.shaded.netty4.io.netty.channel.MultithreadEventLoopGroup.register(MultithreadEventLoopGroup.java:86)
>         at org.apache.flink.shaded.netty4.io.netty.bootstrap.AbstractBootstrap.initAndRegister(AbstractBootstrap.java:323)
>         at org.apache.flink.shaded.netty4.io.netty.bootstrap.Bootstrap.doResolveAndConnect(Bootstrap.java:155)
>         at org.apache.flink.shaded.netty4.io.netty.bootstrap.Bootstrap.connect(Bootstrap.java:139)
>         at org.apache.flink.shaded.netty4.io.netty.bootstrap.Bootstrap.connect(Bootstrap.java:123)
>         at org.apache.flink.runtime.rest.RestClient.submitRequest(RestClient.java:469)
>         at org.apache.flink.runtime.rest.RestClient.sendRequest(RestClient.java:392)
>         at org.apache.flink.runtime.rest.RestClient.sendRequest(RestClient.java:306)
>         at org.apache.flink.client.program.rest.RestClusterClient.lambda$null$37(RestClusterClient.java:931)
>         at java.base/java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1072)
>         at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
>         at java.base/java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:610)
>         at java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:649)
>         at java.base/java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:478)
>         at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>         at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>         at java.base/java.lang.Thread.run(Thread.java:829)
> 2023-06-07 15:28:39,624 o.a.f.s.n.i.n.u.c.D.rejectedExecution [ERROR] Failed to submit a listener notification task. Event loop shut down?
> java.util.concurrent.RejectedExecutionException: event executor terminated
>         at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor.reject(SingleThreadEventExecutor.java:923)
>         at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor.offerTask(SingleThreadEventExecutor.java:350)
>         at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor.addTask(SingleThreadEventExecutor.java:343)
>         at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor.execute(SingleThreadEventExecutor.java:825)
>         at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor.execute(SingleThreadEventExecutor.java:815)
>         at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.safeExecute(DefaultPromise.java:841)
>         at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:499)
>         at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.addListener(DefaultPromise.java:184)
>         at org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPromise.addListener(DefaultChannelPromise.java:95)
>         at org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPromise.addListener(DefaultChannelPromise.java:30)
>         at org.apache.flink.runtime.rest.RestClient.submitRequest(RestClient.java:473)
>         at org.apache.flink.runtime.rest.RestClient.sendRequest(RestClient.java:392)
>         at org.apache.flink.runtime.rest.RestClient.sendRequest(RestClient.java:306)
>         at org.apache.flink.client.program.rest.RestClusterClient.lambda$null$37(RestClusterClient.java:931)
>         at java.base/java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1072)
>         at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
>         at java.base/java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:610)
>         at java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:649)
>         at java.base/java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:478)
>         at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>         at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>         at java.base/java.lang.Thread.run(Thread.java:829)
> 2023-06-07 15:28:39,624 o.a.f.k.o.o.JobStatusObserver  [WARN ][job-name/job-name] Exception while listing jobs
> java.util.concurrent.TimeoutException
>         at java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1886)
>         at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2021)
>         at org.apache.flink.kubernetes.operator.service.AbstractFlinkService.listJobs(AbstractFlinkService.java:241)
>         at org.apache.flink.kubernetes.operator.observer.JobStatusObserver.observe(JobStatusObserver.java:70)
>         at org.apache.flink.kubernetes.operator.observer.deployment.ApplicationObserver.observeFlinkCluster(ApplicationObserver.java:58)
>         at org.apache.flink.kubernetes.operator.observer.deployment.AbstractFlinkDeploymentObserver.observeInternal(AbstractFlinkDeploymentObserver.java:73)
>         at org.apache.flink.kubernetes.operator.observer.AbstractFlinkResourceObserver.observe(AbstractFlinkResourceObserver.java:53)
>         at org.apache.flink.kubernetes.operator.controller.FlinkDeploymentController.reconcile(FlinkDeploymentController.java:120)
>         at org.apache.flink.kubernetes.operator.controller.FlinkDeploymentController.reconcile(FlinkDeploymentController.java:56)
>         at io.javaoperatorsdk.operator.processing.Controller$1.execute(Controller.java:145)
>         at io.javaoperatorsdk.operator.processing.Controller$1.execute(Controller.java:103)
>         at org.apache.flink.kubernetes.operator.metrics.OperatorJosdkMetrics.timeControllerExecution(OperatorJosdkMetrics.java:80)
>         at io.javaoperatorsdk.operator.processing.Controller.reconcile(Controller.java:102)
>         at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.reconcileExecution(ReconciliationDispatcher.java:139)
>         at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleReconcile(ReconciliationDispatcher.java:119)
>         at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleDispatch(ReconciliationDispatcher.java:89)
>         at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleExecution(ReconciliationDispatcher.java:62)
>         at io.javaoperatorsdk.operator.processing.event.EventProcessor$ReconcilerExecutor.run(EventProcessor.java:406)
>         at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>         at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>         at java.base/java.lang.Thread.run(Thread.java:829)
> 2023-06-07 15:28:39,624 o.a.f.k.o.o.d.ApplicationObserver [INFO ][job-name/job-name] Observing JobManager deployment. Previous status: READY
> 2023-06-07 15:28:39,652 o.a.f.k.o.o.d.ApplicationObserver [INFO ][job-name/job-name] JobManager is being deployed
> 2023-06-07 15:28:39,723 o.a.f.k.o.l.AuditUtils         [INFO ][job-name/job-name] >>> Status | Info    | STABLE          | The resource deployment is considered to be stable and won’t be rolled back
> 2023-06-07 15:28:39,724 o.a.f.k.o.a.JobAutoScalerImpl  [INFO ][job-name/job-name] Job autoscaler is disabled
> 2023-06-07 15:28:39,724 o.a.f.k.o.r.d.AbstractFlinkResourceReconciler [INFO ][job-name/job-name] Resource fully reconciled, nothing to do...
> 2023-06-07 15:28:39,724 o.a.f.k.o.c.FlinkDeploymentController [INFO ][job-name/job-name] End of reconciliation
