paul-rogers commented on a change in pull request #12222:
URL: https://github.com/apache/druid/pull/12222#discussion_r834683367
##########
File path:
processing/src/main/java/org/apache/druid/query/metadata/metadata/NoneColumnIncluderator.java
##########
@@ -21,6 +21,8 @@
/**
*/
+// Needed for IntelliJ checks
+@SuppressWarnings("unused")
Review comment:
Not sure. Probably added this to get the IntelliJ checks to pass. I can
try removing it again to see if things still fail.
##########
File path:
server/src/main/java/org/apache/druid/discovery/LocalDruidNodeDiscoveryProvider.java
##########
@@ -0,0 +1,160 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.discovery;
+
+import org.apache.druid.server.DruidNode;
+
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.function.BooleanSupplier;
+
+/**
+ * Local node discovery. In this mode, Druid may run a number of
+ * logical servers within a single process, so that node discovery
+ * is all within the same process, but matches a true distributed
+ * node discovery.
+ * <p>
+ * Concurrency is at the level of the entire provider, which is
+ * necessary to coordinate the active state with ongoing node
+ * discovery and listener registrations. Such a "global" lock
+ * is fine because the actions don't occur frequently, nor do the
+ * actions take much time.
+ */
+public class LocalDruidNodeDiscoveryProvider extends
DruidNodeDiscoveryProvider implements DruidNodeAnnouncer
+{
+ private class RoleEntry implements DruidNodeDiscovery
+ {
+ @SuppressWarnings("unused")
+ private final NodeRole role;
+ private final Map<DruidNode, DiscoveryDruidNode> nodes = new HashMap<>();
Review comment:
It _seems_ to be that we announce nodes in ZK as (host, port) pairs with
an associated JSON payload the deserializes to a `DiscoveryDruidNode`. That
class has a map of roles. Since ZK can't have two nodes with the same key, I
_think_ this code is mimicking the way ZK works.
##########
File path: docs/operations/api-reference.md
##########
@@ -935,6 +935,12 @@ Returns segment information lists including server
locations for the given query
#### GET
+* `/druid/v2/router/cluster`
+
+Returns a list of the servers registered within the cluster. Similar to
Review comment:
As the docs state, the preferred solution is to query the system tables.
Yet, as I tinker with clusters, I find it very easy to screw things up so that
the the cluster is too broken for SQL. This API is meant to be a light layer on
top of ZK to diagnose such issues without having to fire up the ZK client and
come up with a way to decode node payloads. Reworded the docs to highlight this
idea.
There are slight differences between the formats of the two endpoints. The
Coordinator one appears to be tailored to the needs of the Druid Console
(maybe?)
The coordinator one is more heavily formatted to put services in some
preferred order:
```python
{'coordinator': [{'service': 'druid/coordinator',
'plaintextPort': 8081,
'host': 'coordinator-one'},
{'service': 'druid/coordinator',
'plaintextPort': 8081,
'host': 'coordinator-two'}],
...
'broker': [{'service': 'druid/broker',
'plaintextPort': 8082,
'host': 'broker'}],
'historical': [],
```
This one lists services alphabetically, including only those services
actually running:
```python
{'broker': [{'service': 'druid/broker',
'host': 'broker',
'plaintextPort': 8082}],
'coordinator': [{'service': 'druid/coordinator',
'host': 'coordinator-one',
'plaintextPort': 8081},
{'service': 'druid/coordinator',
'host': 'coordinator-two',
'plaintextPort': 8081}],
...
```
##########
File path:
sql/src/main/java/org/apache/druid/sql/calcite/schema/SystemSchema.java
##########
@@ -674,9 +684,9 @@ private static DruidServer toDruidServer(DiscoveryDruidNode
discoveryDruidNode)
}
}
- private static Iterator<DiscoveryDruidNode>
getDruidServers(DruidNodeDiscoveryProvider druidNodeDiscoveryProvider)
+ private Iterator<DiscoveryDruidNode>
getDruidServers(DruidNodeDiscoveryProvider druidNodeDiscoveryProvider)
{
- return Arrays.stream(NodeRole.values())
+ return allNodeRoles.stream()
Review comment:
A better design here would be to include in the system table all
services registered in ZK.
Instead, we start with a pre-defined set of node roles which used to be
hard-coded, is now in defined in Guice. You are right that each extension has
to define a module that will insert its node role in the "global" list. The IT
"custom node role" has an added "client" module which does the trick.
In terms of compatibility, an extension needs to compile with a version of
Druid (or newer) to be able to include the custom node role in Guice. In that
case, the Broker is also of this version (or newer), and so will understand the
Guice collection.
If a newer custom node role is dropped into an older Druid, then link
failures will occur.
If a rolling upgrade happens, and the custom role is upgraded before the
Broker, the Broker will continue to not know about the custom node role. Only
when the Broker is also upgraded will the custom node role appear in the system
table.
Finally, if the custom node role wants to remain anonymous, it simply does
not provide a module to register its node role. In this case, ZK will have the
role, but the Druid REST APIs and system tables will ignore it.
Did update the `NodeRole` Javadoc.
Is there anything else we should be thinking about here?
##########
File path: server/src/test/java/org/apache/druid/server/NodeTest.java
##########
@@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.server;
+
+import org.junit.Test;
+
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertNotEquals;
+import static org.junit.Assert.assertNull;
+
+public class NodeTest
Review comment:
Cool! Ended up not being able to use this since the revised class now
takes a DruidNode as input to the constructor.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]