gianm commented on a change in pull request #12233:
URL: https://github.com/apache/druid/pull/12233#discussion_r818274025
##########
File path:
extensions-core/kubernetes-extensions/src/main/java/org/apache/druid/k8s/discovery/K8sDruidNodeDiscoveryProvider.java
##########
@@ -282,7 +282,10 @@ private void keepWatching(String namespace, String
labelSelector, String resourc
nextResourceVersion = item.object.getResourceVersion();
} else {
- LOGGER.error("WTH! item or item.type is NULL");
+ // Try again by starting the watch from the beginning. This
can happen if the
+ // watch goes bad.
+ LOGGER.warn("Received NULL item. Restarting watch");
Review comment:
Couple things about the log message:
- From your description in the PR, it sounds like this is something that can
happen "normally" and so INFO is better than WARN. The WARN level should be for
something that a user might need to look into.
- Would it be useful to include `nodeRole` here? (Up to you.)
##########
File path:
extensions-core/kubernetes-extensions/src/main/java/org/apache/druid/k8s/discovery/DefaultK8sApiClient.java
##########
@@ -132,12 +132,22 @@ public boolean hasNext() throws SocketTimeoutException
while (watch.hasNext()) {
Watch.Response<V1Pod> item = watch.next();
if (item != null && item.type != null) {
+ DiscoveryDruidNodeAndResourceVersion result = null;
+ if (item.object != null) {
+ result = new DiscoveryDruidNodeAndResourceVersion(
+ item.object.getMetadata().getResourceVersion(),
+ getDiscoveryDruidNodeFromPodDef(nodeRole, item.object)
+ );
+ } else {
+ // The item's object can be null in some cases -- likely due
to a blip
+ // in the k8s watch. Handle that by passing the null
upwards. The caller
+ // needs to know that the object can be null.
+ LOGGER.warn("item of type " + item.type + " was NULL");
Review comment:
Similar to the other comment about logging — it sounds like this is
something that can happen "normally", and we don't expect users to look into it
when this message appears. If that's true, INFO or even DEBUG is better than
WARN.
##########
File path:
extensions-core/kubernetes-extensions/src/test/java/org/apache/druid/k8s/discovery/K8sDruidNodeDiscoveryProviderTest.java
##########
@@ -162,6 +163,140 @@ public void testGetForNodeRole() throws Exception
discoveryProvider.stop();
}
+ @Test(timeout = 10_000)
+ public void testNodeRoleWatcherHandlesNullFromAPIByRestarting() throws
Exception
Review comment:
Thanks for adding a test!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]