paul-rogers commented on code in PR #12845:
URL: https://github.com/apache/druid/pull/12845#discussion_r935990516
##########
sql/src/main/java/org/apache/druid/sql/SqlQueryPlus.java:
##########
@@ -66,6 +66,25 @@ public SqlQueryPlus(final String sql, final AuthenticationResult authResult)
this(sql, (QueryContext) null, null, authResult);
}
+ public SqlQueryPlus(
Review Comment:
I suppose we could. Four parameters is on the border: not so many that a
builder is clearly necessary, but there is enough variation that a builder
might be handy. Let me try adding one to see if that improves the flow.
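For illustration, a builder along those lines might look like the sketch below. The field names mirror the four-argument constructor above, but the nested stand-in types and method names are assumptions, not the real Druid classes.

```java
// Hypothetical sketch of a builder for a four-argument value class.
// Field names follow the SqlQueryPlus constructor; types are stand-ins.
public class SqlQueryPlusBuilderSketch
{
  // Minimal stand-ins so the example is self-contained.
  static class QueryContext { }
  static class AuthenticationResult { }

  static class SqlQueryPlus
  {
    final String sql;
    final QueryContext context;
    final java.util.List<Object> parameters;
    final AuthenticationResult authResult;

    private SqlQueryPlus(Builder b)
    {
      this.sql = b.sql;
      this.context = b.context;
      this.parameters = b.parameters;
      this.authResult = b.authResult;
    }

    static Builder builder(String sql)
    {
      return new Builder(sql);
    }

    static class Builder
    {
      private final String sql;                   // required up front
      private QueryContext context;               // optional
      private java.util.List<Object> parameters;  // optional
      private AuthenticationResult authResult;    // required at build time

      Builder(String sql)
      {
        this.sql = sql;
      }

      Builder context(QueryContext c)
      {
        this.context = c;
        return this;
      }

      Builder parameters(java.util.List<Object> p)
      {
        this.parameters = p;
        return this;
      }

      Builder auth(AuthenticationResult a)
      {
        this.authResult = a;
        return this;
      }

      SqlQueryPlus build()
      {
        if (authResult == null) {
          throw new IllegalStateException("authResult is required");
        }
        return new SqlQueryPlus(this);
      }
    }
  }

  public static void main(String[] args)
  {
    SqlQueryPlus q = SqlQueryPlus.builder("SELECT 1")
        .auth(new AuthenticationResult())
        .build();
    System.out.println(q.sql);
  }
}
```

The appeal here is that the two optional arguments (context, parameters) no longer force `null`-passing overloads; the trade-off is the extra boilerplate, which is why four parameters sits on the border.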
##########
sql/src/main/java/org/apache/druid/sql/DirectStatement.java:
##########
@@ -0,0 +1,179 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.sql;
+
+import org.apache.druid.java.util.common.StringUtils;
+import org.apache.druid.java.util.common.guava.Sequence;
+import org.apache.druid.java.util.common.logger.Logger;
+import org.apache.druid.query.QueryInterruptedException;
+import org.apache.druid.sql.SqlLifecycleManager.Cancellable;
+import org.apache.druid.sql.calcite.planner.DruidPlanner;
+import org.apache.druid.sql.calcite.planner.PlannerResult;
+import org.apache.druid.sql.calcite.planner.PrepareResult;
+
+import java.util.concurrent.CopyOnWriteArrayList;
+
+/**
+ * Lifecycle for direct SQL statement execution, which means that the query
+ * is planned and executed in a single step, with no "prepare" step.
+ * Callers need call only:
+ * <ul>
+ * <li>{@link #execute()} to execute the query. The caller must close
+ * the returned {@code Sequence}.</li>
+ * <li>{@link #close()} to report metrics, or {@link #closeQuietly()}
+ * otherwise.</li>
+ * </ul>
+ * <p>
+ * The {@link #cancel()} method may be called from any thread and cancels
+ * the query.
+ * <p>
+ * All other methods are optional and are generally for introspection.
+ * <p>
+ * The class supports two threading models. In the simple case, the same
+ * thread creates this object and executes the query. In the split model,
+ * a request thread creates this object and plans the query. A separate
+ * response thread consumes results and performs any desired logging, etc.
+ * The object is transferred between threads, with no overlapping access.
Review Comment:
Another great question. In the broad scheme of things, the general rule for
a SQL interface is that it is single-threaded per statement. The reason is
simple: there is nothing at the API level of a SQL statement that can be
meaningfully parallelized. (Execution can be parallelized, but that happens
below the API level.)
The more nuanced statement is that access to a SQL statement is sequentially
single-threaded: ownership can transfer, but only one thread can access the
statement at a time.
There are only two cases where this arises in Druid. The first is in JDBC.
As a statement executes, the caller will request pages of data. Each request
comes in on a separate thread, but there can only be one active at a time.
Locking in JDBC ensures this pattern.
The other case is in HTTP: the request thread does all the planning. Only
when planning is successful does it turn ownership over to a response thread
which reads the results and, eventually, closes the statement.
Because of the "sequential" model, the `volatile` keyword should not be
necessary: there is no race condition between readers and writers.
For `prepareResult`, the "plan" thread must exit from the `execute()` method
before it is legal for another thread to access the statement. This protocol is
what prevents memory access (or worse, planner access) race conditions. The
ownership-transfer protocol is why the statement itself doesn't do locking.
Does this answer the question? More to the point, are there concurrency
corner cases that the above explanation overlooks?
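To make the "sequentially single-threaded" claim concrete, here is a small sketch of the HTTP case under the assumptions above: the class and member names are hypothetical, not the real `DirectStatement` internals. The key point is that handing the statement to a response thread through an `ExecutorService` establishes a happens-before edge, so plain (non-`volatile`) fields are safe.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch of the sequential ownership-transfer model: the request thread
// plans, then hands the statement to a response thread with no overlap.
public class OwnershipTransferSketch
{
  static class Statement
  {
    // Plain field, no volatile: written only by the request thread,
    // read only after ownership transfers to the response thread.
    String plan;

    void planQuery(String sql)
    {
      this.plan = "PLAN(" + sql + ")";
    }

    String readResults()
    {
      return plan + " -> rows";
    }
  }

  public static void main(String[] args) throws Exception
  {
    Statement stmt = new Statement();

    // Request thread: plan the query. Planning must complete before
    // any other thread may touch the statement.
    stmt.planQuery("SELECT 1");

    // Transfer ownership. ExecutorService.submit() happens-before the
    // task runs, so the response thread sees all of the planner's writes
    // without volatile or locks.
    ExecutorService responseThread = Executors.newSingleThreadExecutor();
    Future<String> rows = responseThread.submit(stmt::readResults);
    System.out.println(rows.get());
    responseThread.shutdown();
  }
}
```

This is exactly why the statement itself does no locking: the protocol (one owner at a time, transfer only at defined points) supplies the memory-visibility guarantee.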
##########
sql/src/main/java/org/apache/druid/sql/SqlExecutionReporter.java:
##########
@@ -0,0 +1,146 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.sql;
+
+import org.apache.druid.java.util.common.DateTimes;
+import org.apache.druid.java.util.common.StringUtils;
+import org.apache.druid.java.util.common.logger.Logger;
+import org.apache.druid.java.util.emitter.service.ServiceEmitter;
+import org.apache.druid.java.util.emitter.service.ServiceMetricEvent;
+import org.apache.druid.query.QueryContext;
+import org.apache.druid.query.QueryInterruptedException;
+import org.apache.druid.query.QueryTimeoutException;
+import org.apache.druid.server.QueryStats;
+import org.apache.druid.server.RequestLogLine;
+import org.apache.druid.sql.calcite.planner.PlannerContext;
+
+import java.util.LinkedHashMap;
+import java.util.Map;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+
+/**
+ * Side-car class which reports logs and metrics for an
+ * {@link HttpStatement}. This separate class cleanly separates the logic
+ * for running a query from the logic for reporting on that run. A query
+ * can end either with a success or error. This object is created in
+ * the request thread, with the remaining method called either from the
+ * request or response thread, but not both.
+ */
+public class SqlExecutionReporter
+{
+ private static final Logger log = new Logger(SqlExecutionReporter.class);
+
+ private final AbstractStatement stmt;
+ private final String remoteAddress;
+ private final long startMs;
+ private final long startNs;
+ private Throwable e;
+ private long bytesWritten;
+
+ public SqlExecutionReporter(
+ final AbstractStatement stmt,
+ final String remoteAddress
+ )
+ {
+ this.stmt = stmt;
+ this.remoteAddress = remoteAddress;
+ this.startMs = System.currentTimeMillis();
+ this.startNs = System.nanoTime();
+ }
+
+ public void failed(Throwable e)
+ {
+ this.e = e;
+ }
+
+ public void succeeded(final long bytesWritten)
+ {
+ this.bytesWritten = bytesWritten;
+ }
+
+ public void emit()
+ {
+ final boolean success = e == null;
+ final long queryTimeNs = System.nanoTime() - startNs;
+
+ ServiceEmitter emitter = stmt.sqlToolbox.emitter;
+ PlannerContext plannerContext = stmt.plannerContext;
+ try {
+ ServiceMetricEvent.Builder metricBuilder = ServiceMetricEvent.builder();
+ if (plannerContext != null) {
+ metricBuilder.setDimension("id", plannerContext.getSqlQueryId());
+ metricBuilder.setDimension("nativeQueryIds", plannerContext.getNativeQueryIds().toString());
+ }
+ if (stmt.resourceActions != null) {
+ metricBuilder.setDimension(
+ "dataSource",
+ stmt.resourceActions
+ .stream()
+ .map(action -> action.getResource().getName())
+ .collect(Collectors.toList())
+ .toString()
+ );
+ }
+ metricBuilder.setDimension("remoteAddress", StringUtils.nullToEmptyNonDruidDataString(remoteAddress));
+ metricBuilder.setDimension("success", String.valueOf(success));
+ emitter.emit(metricBuilder.build("sqlQuery/time", TimeUnit.NANOSECONDS.toMillis(queryTimeNs)));
+ if (bytesWritten >= 0) {
+ emitter.emit(metricBuilder.build("sqlQuery/bytes", bytesWritten));
+ }
+
+ final Map<String, Object> statsMap = new LinkedHashMap<>();
+ statsMap.put("sqlQuery/time", TimeUnit.NANOSECONDS.toMillis(queryTimeNs));
+ statsMap.put("sqlQuery/bytes", bytesWritten);
+ statsMap.put("success", success);
+ QueryContext queryContext;
+ if (plannerContext == null) {
+ queryContext = stmt.queryPlus.context();
+ } else {
+ statsMap.put("identity", plannerContext.getAuthenticationResult().getIdentity());
+ queryContext = stmt.queryPlus.context();
+ queryContext.addSystemParam("nativeQueryIds", plannerContext.getNativeQueryIds().toString());
+ }
+ final Map<String, Object> context = queryContext.getMergedParams();
+ statsMap.put("context", context);
+ if (e != null) {
+ statsMap.put("exception", e.toString());
+
+ if (e instanceof QueryInterruptedException || e instanceof QueryTimeoutException) {
+ statsMap.put("interrupted", true);
+ statsMap.put("reason", e.toString());
Review Comment:
To be honest, I didn't pay much attention to this code. If you were to
diff this function against the original copy, you'd see that only the field
names changed: the rest of the code is pretty much as I found it.
That said, yes, it does appear that this code evolved over time and is
perhaps not optimally structured.
BTW: this is one big hole in the otherwise great GitHub diff mechanism: there
is no easy way to compare code when it moves from one file to another.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]