gabriel-farache opened a new issue, #395: URL: https://github.com/apache/incubator-kie-kogito-serverless-operator/issues/395
### Describe the bug

Related to https://github.com/apache/incubator-kie-kogito-serverless-operator/issues/361

When the Data Index (DI) and a workflow are started at the same time, the workflow does not register in the DI.

The workflow logs show:

```
2024-02-15 08:28:04,460 INFO [io.sma.health] (executor-thread-1) SRHCK01001: Reporting health down status: {"status":"DOWN","checks":[{"name":"Data Index Availability - startup check","status":"DOWN","data":{"error":"[unknown] - io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: sonataflow-platform-data-index-service.sonataflow-infra/172.31.200.9:80"}},{"name":"SmallRye Reactive Messaging - startup check","status":"UP"}]}
2024-02-15 08:28:19,371 INFO [io.sma.health] (executor-thread-1) SRHCK01001: Reporting health down status: {"status":"DOWN","checks":[{"name":"Data Index Availability - startup check","status":"DOWN","data":{"error":"[unknown] - io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: sonataflow-platform-data-index-service.sonataflow-infra/172.31.200.9:80"}},{"name":"SmallRye Reactive Messaging - startup check","status":"UP"}]}
2024-02-15 08:28:34,375 INFO [io.sma.health] (executor-thread-1) SRHCK01001: Reporting health down status: {"status":"DOWN","checks":[{"name":"Data Index Availability - startup check","status":"DOWN","data":{"error":"[unknown] - io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: sonataflow-platform-data-index-service.sonataflow-infra/172.31.200.9:80"}},{"name":"SmallRye Reactive Messaging - startup check","status":"UP"}]}
```

No restarts:

```
oc -n sonataflow-infra get pods
NAME                                                      READY   STATUS    RESTARTS   AGE
greeting-64c66ccdb7-ldmdr                                 1/1     Running   0          7m42s
sonataflow-platform-data-index-service-6676f74b48-258wf   1/1     Running   0          7m42s
sonataflow-platform-jobs-service-d9455b6f7-2v8c9          1/1     Running   0          7m42s
sonataflow-psql-postgresql-0                              1/1     Running   0          10m
```

and no greetings in our UI, which reads from the Data Index.

Here is the dump of the DB: [DI_dump.zip](https://github.com/apache/incubator-kie-kogito-serverless-operator/files/14293721/DI_dump.zip)

From what I see and understand from the describe output, the startupProbe

```
startupProbe:
  failureThreshold: 5
  httpGet:
    path: /q/health/started
    port: 8080
    scheme: HTTP
  initialDelaySeconds: 10
  periodSeconds: 15
  successThreshold: 1
  timeoutSeconds: 3
```

means that the pod will only restart after 5 failures, and here we only have 3.

From the full log (see below), there are 2 errors related to publishing events to the DI when the workflow starts. So it seems that the workflow registers itself only at startup and never afterwards: no restart, no registration.

I tried deleting the DI pod to see whether anything changes after it is re-created, but nothing does: the greeting workflow still does not appear, while other workflows created after the DI started are there.

Full log of greeting:

```
oc -n sonataflow-infra logs greeting-64c66ccdb7-ldmdr
Starting the Java application using /opt/jboss/container/java/run/run-java.sh ...
INFO exec -a "java" java -Dquarkus.http.host=0.0.0.0 -Djava.util.logging.manager=org.jboss.logmanager.LogManager -cp "."
-jar /deployments/quarkus-run.jar
INFO running in /deployments
__  ____  __  _____   ___  __ ____  ______
 --/ __ \/ / / / _ | / _ \/ //_/ / / / __/
 -/ /_/ / /_/ / __ |/ , _/ ,< / /_/ /\ \
--\___\_\____/_/ |_/_/|_/_/|_|\____/___/
2024-02-15 08:27:48,983 WARN [io.qua.config] (main) Unrecognized configuration key "kogito.data-index.health-enabled" was provided; it will be ignored; verify that the dependency extension for this configuration is set or that you did not make a typo
2024-02-15 08:27:48,984 WARN [io.qua.config] (main) Unrecognized configuration key "kogito.jobs-service.health-enabled" was provided; it will be ignored; verify that the dependency extension for this configuration is set or that you did not make a typo
2024-02-15 08:27:48,984 WARN [io.qua.config] (main) Unrecognized configuration key "kogito.data-index.url" was provided; it will be ignored; verify that the dependency extension for this configuration is set or that you did not make a typo
2024-02-15 08:27:48,984 WARN [io.qua.config] (main) Unrecognized configuration key "kogito.jobs-service.url" was provided; it will be ignored; verify that the dependency extension for this configuration is set or that you did not make a typo
2024-02-15 08:27:49,846 WARN [org.kie.kog.add.qua.kna.eve.KnativeEventingConfigSourceFactory] (main) K_SINK variable is empty or doesn't exist. Please make sure that this service is a Knative Source or has a SinkBinding bound to it.
2024-02-15 08:27:49,941 WARN [io.qua.run.con.ConfigRecorder] (main) Build time property cannot be changed at runtime: - quarkus.devservices.enabled is set to 'false' but it is build time fixed to 'true'. Did you change the property quarkus.devservices.enabled after building the application?
2024-02-15 08:27:50,623 INFO [org.kie.kog.add.qua.mes.com.QuarkusKogitoExtensionInitializer] (main) Registered Kogito CloudEvent extension
2024-02-15 08:27:50,673 INFO [io.quarkus] (main) serverless-workflow-project 1.0.0-SNAPSHOT on JVM (powered by Quarkus 3.2.9.Final) started in 2.157s. Listening on: http://0.0.0.0:8080
2024-02-15 08:27:50,673 INFO [io.quarkus] (main) Profile prod activated.
2024-02-15 08:27:50,673 INFO [io.quarkus] (main) Installed features: [cache, cdi, jackson-jq, kogito-addon-events-process-extension, kogito-addon-jobs-knative-eventing-extension, kogito-addon-knative-eventing-extension, kogito-addon-kubernetes-extension, kogito-addon-messaging-extension, kogito-addon-microprofile-config-service-catalog-extension, kogito-addon-process-management-extension, kogito-addon-source-files-extension, kogito-addons-quarkus-knative-serving, kogito-serverless-workflow, kubernetes, kubernetes-client, qute, reactive-routes, rest-client, rest-client-jackson, resteasy, resteasy-jackson, security, security-properties-file, smallrye-context-propagation, smallrye-health, smallrye-openapi, smallrye-reactive-messaging, smallrye-reactive-messaging-http, vertx]
2024-02-15 08:27:50,675 WARN [io.sma.rea.mes.provider] (vert.x-eventloop-thread-7) SRMSG00234: Failed to emit a Message to the channel: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: sonataflow-platform-data-index-service.sonataflow-infra/172.31.200.9:80
Caused by: java.net.ConnectException: Connection refused
    at java.base/sun.nio.ch.Net.pollConnect(Native Method)
    at java.base/sun.nio.ch.Net.pollConnectNow(Net.java:672)
    at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:946)
    at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:337)
    at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:776)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.base/java.lang.Thread.run(Thread.java:840)
2024-02-15 08:27:50,676 ERROR [org.kie.kog.eve.pro.ReactiveMessagingEventPublisher] (vert.x-eventloop-thread-7) Error while publishing message org.eclipse.microprofile.reactive.messaging.Message$8@7f469c1a: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: sonataflow-platform-data-index-service.sonataflow-infra/172.31.200.9:80
Caused by: java.net.ConnectException: Connection refused
    at java.base/sun.nio.ch.Net.pollConnect(Native Method)
    at java.base/sun.nio.ch.Net.pollConnectNow(Net.java:672)
    at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:946)
    at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:337)
    at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:776)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.base/java.lang.Thread.run(Thread.java:840)
2024-02-15 08:28:04,460 INFO [io.sma.health] (executor-thread-1) SRHCK01001: Reporting health down status: {"status":"DOWN","checks":[{"name":"Data Index Availability - startup check","status":"DOWN","data":{"error":"[unknown] - io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: sonataflow-platform-data-index-service.sonataflow-infra/172.31.200.9:80"}},{"name":"SmallRye Reactive Messaging - startup check","status":"UP"}]}
2024-02-15 08:28:19,371 INFO [io.sma.health] (executor-thread-1) SRHCK01001: Reporting health down status: {"status":"DOWN","checks":[{"name":"Data Index Availability - startup check","status":"DOWN","data":{"error":"[unknown] - io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: sonataflow-platform-data-index-service.sonataflow-infra/172.31.200.9:80"}},{"name":"SmallRye Reactive Messaging - startup check","status":"UP"}]}
2024-02-15 08:28:34,375 INFO [io.sma.health] (executor-thread-1) SRHCK01001: Reporting health down status: {"status":"DOWN","checks":[{"name":"Data Index Availability - startup check","status":"DOWN","data":{"error":"[unknown] - io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: sonataflow-platform-data-index-service.sonataflow-infra/172.31.200.9:80"}},{"name":"SmallRye Reactive Messaging - startup check","status":"UP"}]}
```

### Expected behavior

I expect the workflow to register itself once the DI is available.

### Actual behavior

The workflow does not register with the DI once the DI is ready and reachable.

### How to Reproduce?

_No response_

### Output of `uname -a` or `ver`

_No response_

### Golang version

_No response_

### Operator-sdk version

_No response_

### SonataFlow Operator version or git rev

_No response_

### Additional information

_No response_

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
