paul-rogers commented on code in PR #13564:
URL: https://github.com/apache/druid/pull/13564#discussion_r1051287738
##########
sql/src/main/java/org/apache/druid/sql/SqlPlanningException.java:
##########
@@ -32,11 +32,12 @@
*/
public class SqlPlanningException extends BadQueryException
{
+
public enum PlanningError
{
- SQL_PARSE_ERROR("SQL parse failed", SqlParseException.class.getName()),
- VALIDATION_ERROR("Plan validation failed",
ValidationException.class.getName()),
- UNSUPPORTED_SQL_ERROR("SQL query is unsupported",
RelOptPlanner.CannotPlanException.class.getName());
+ SQL_PARSE_ERROR(SQL_PARSE_FAILED_ERROR_CODE,
SqlParseException.class.getName()),
+ VALIDATION_ERROR(PLAN_VALIDATION_FAILED_ERROR_CODE,
ValidationException.class.getName()),
+ UNSUPPORTED_SQL_ERROR(SQL_QUERY_UNSUPPORTED_ERROR_CODE,
RelOptPlanner.CannotPlanException.class.getName());
Review Comment:
If we standardize on error codes, it would be a huge help to use (or at
least include) the standard [SQLSTATE
errors](https://en.wikipedia.org/wiki/SQLSTATE), where applicable. This is
particularly helpful to JDBC users. At present, we don't return any meaningful
errors except for the unauthorized error.
##########
core/src/main/java/org/apache/druid/query/QueryException.java:
##########
@@ -30,12 +29,129 @@
import java.util.function.Function;
/**
- * Base serializable error response
- *
+ * Base serializable error response.
+ * <p>
+ * The Object Model that QueryException follows is a little non-intuitive as
the primary way that a QueryException is
+ * generated is through a child class. However, those child classes are *not*
equivalent to a QueryException, instead
+ * they act as a Factory of QueryException objects. This can be seen in two
different places.
+ * <p>
+ * 1. When sanitize() is called, the response is a QueryException without any
indication of which original exception
+ * occurred.
+ * 2. When these objects get serialized across the wire the recipient
deserializes a QueryException. The client is
+ * never expected, and fundamentally is not allowed to, ever deserialize a
child class of QueryException.
+ * <p>
+ * For this reason, QueryException must contain all potential state that any
of its child classes could ever want to
+ * push across the wire. Additionally, any catch clauses expecting one of the
child Exceptions must know that it is
+ * running inside of code where the exception has not traveled across the
wire. If there is a chance that the
+ * exception could have been serialized across the wire, the code must catch a
QueryException and check the errorCode
+ * instead.
+ * <p>
+ * As a corollary, adding new state or adjusting the logic of this class must
always be done in a backwards-compatible
+ * fashion across all child classes of QueryException.
+ * <p>
+ * If there is any need to do different logic based on the type of error that
has happened, the only reliable method
+ * of discerning the type of the error is to look at the errorCode String.
Because these Strings are considered part
+ * of the API, they are not allowed to change and must maintain their same
semantics. The known errorCode Strings
+ * are pulled together as public static fields on this class in order to make
it more clear what options exist.
+ * <p>
* QueryResource and SqlResource are expected to emit the JSON form of this
object when errors happen.
*/
public class QueryException extends RuntimeException implements
SanitizableException
{
+ /**
+ * Error codes
+ */
+ public static final String JSON_PARSE_ERROR_CODE = "Json parse failed";
+ public static final String BAD_QUERY_CONTEXT_ERROR_CODE = "Query context
parse failed";
+ public static final String QUERY_CAPACITY_EXCEEDED_ERROR_CODE = "Query
capacity exceeded";
+ public static final String QUERY_INTERRUPTED_ERROR_CODE = "Query
interrupted";
+ // Note: the proper spelling is with a single "l", but the version with
+ // two "l"s is documented, we can't change the text of the message.
+ public static final String QUERY_CANCELED_ERROR_CODE = "Query cancelled";
+ public static final String UNAUTHORIZED_ERROR_CODE = "Unauthorized request";
+ public static final String UNSUPPORTED_OPERATION_ERROR_CODE = "Unsupported
operation";
+ public static final String TRUNCATED_RESPONSE_CONTEXT_ERROR_CODE =
"Truncated response context";
+ public static final String UNKNOWN_EXCEPTION_ERROR_CODE = "Unknown
exception";
+ public static final String QUERY_TIMEOUT_ERROR_CODE = "Query timeout";
+ public static final String QUERY_UNSUPPORTED_ERROR_CODE = "Unsupported
query";
+ public static final String RESOURCE_LIMIT_EXCEEDED_ERROR_CODE = "Resource
limit exceeded";
+ public static final String SQL_PARSE_FAILED_ERROR_CODE = "SQL parse failed";
+ public static final String PLAN_VALIDATION_FAILED_ERROR_CODE = "Plan
validation failed";
+ public static final String SQL_QUERY_UNSUPPORTED_ERROR_CODE = "SQL query is
unsupported";
+
+ public enum FailType
+ {
+ USER_ERROR(400),
+ UNAUTHORIZED(401),
+ CAPACITY_EXCEEDED(429),
+ UNKNOWN(500),
+ SERVER_ERROR(500),
+ CANCELED(500),
+ QUERY_RUNTIME_FAILURE(501),
+ UNSUPPORTED(501),
+ TIMEOUT(504);
+
+ private final int expectedStatus;
+
+ FailType(int expectedStatus)
+ {
+ this.expectedStatus = expectedStatus;
+ }
+
+ public int getExpectedStatus()
+ {
+ return expectedStatus;
+ }
+ }
+
+ public static FailType fromErrorCode(String errorCode)
Review Comment:
This is a good idea in general. I wonder if it is a scalable solution in
particular. Another approach is suggested
[here](https://github.com/apache/druid/issues/13123). we may find that a global
list of errors becomes hard to manage. Since any error from storage on up could
terminate a query, we'd want the entire system to use a single unified set of
errors. That gets to be hard to manage.
Also, it can be hard for a developer working in a lower level, to know the
proper HTTP status to send. For which API? REST? JSON? gRPC?
It might be better to make the system more modular. Define a set of error
categories, then define rules for mapping categories at each API. See [this
proposal](https://github.com/apache/druid/issues/13123) for a suggestion of the
categories.
Our problem is also one of mapping from the specific code issue, "symbol
'foo' not found" to what this means to the user, "The table function foo(), at
line 3, position 27, is not valid for the arguments a, b, c."
##########
core/src/main/java/org/apache/druid/query/QueryException.java:
##########
@@ -30,12 +29,126 @@
import java.util.function.Function;
/**
- * Base serializable error response
- *
+ * Base serializable error response.
+ * <p>
+ * The Object Model that QueryException follows is a little non-intuitive as
the primary way that a QueryException is
+ * generated is through a child class. However, those child classes are *not*
equivalent to a QueryException, instead
+ * they act as a Factory of QueryException objects. This can be seen in two
different places.
+ * <p>
+ * 1. When sanitize() is called, the response is a QueryException without any
indication of which original exception
+ * occurred.
+ * 2. When these objects get serialized across the wire the recipient
deserializes a QueryException. The client is
+ * never expected, and fundamentally is not allowed to, ever deserialize a
child class of QueryException.
+ * <p>
+ * For this reason, QueryException must contain all potential state that any
of its child classes could ever want to
+ * push across the wire. Additionally, any catch clauses expecting one of the
child Exceptions must know that it is
+ * running inside of code where the exception has not traveled across the
wire. If there is a chance that the
+ * exception could have been serialized across the wire, the code must catch a
QueryException and check the errorCode
+ * instead.
+ * <p>
+ * As a correlary, adding new state or adjusting the logic of this class must
always be done in a backwards-compatible
+ * fashion across all child classes of QueryException.
+ * <p>
+ * If there is any need to do different logic based on the type of error that
has happened, the only reliable method
+ * of discerning the type of the error is to look at the errorCode String.
Because these Strings are considered part
+ * of the API, they are not allowed to change and must maintain their same
semantics. The known errorCode Strings
+ * are pulled together as public static fields on this class in order to make
it more clear what options exist.
+ * <p>
* QueryResource and SqlResource are expected to emit the JSON form of this
object when errors happen.
*/
public class QueryException extends RuntimeException implements
SanitizableException
{
+ /**
+ * Error codes
+ */
+ public static final String JSON_PARSE_ERROR_CODE = "Json parse failed";
+ public static final String BAD_QUERY_CONTEXT_ERROR_CODE = "Query context
parse failed";
+ public static final String QUERY_CAPACITY_EXCEEDED_ERROR_CODE = "Query
capacity exceeded";
+ public static final String QUERY_INTERRUPTED_ERROR_CODE = "Query
interrupted";
+ // Note: the proper spelling is with a single "l", but the version with
Review Comment:
I left this comment because Gian pointed out the correct "cancelled" (2 l's)
spelling, but we noted we can't change the constant because it is part of the
public API.
But, this PR changes the constant (by moving it from one place to another),
so we might as well fix the spelling of the constant name while we're breaking
things.
##########
server/src/test/java/org/apache/druid/server/mocks/MockHttpServletRequest.java:
##########
@@ -0,0 +1,504 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.server.mocks;
+
+import javax.servlet.AsyncContext;
+import javax.servlet.DispatcherType;
+import javax.servlet.RequestDispatcher;
+import javax.servlet.ServletContext;
+import javax.servlet.ServletInputStream;
+import javax.servlet.ServletRequest;
+import javax.servlet.ServletResponse;
+import javax.servlet.http.Cookie;
+import javax.servlet.http.HttpServletRequest;
+import javax.servlet.http.HttpServletResponse;
+import javax.servlet.http.HttpSession;
+import javax.servlet.http.HttpUpgradeHandler;
+import javax.servlet.http.Part;
+import java.io.BufferedReader;
+import java.security.Principal;
+import java.util.Collection;
+import java.util.Enumeration;
+import java.util.LinkedHashMap;
+import java.util.Locale;
+import java.util.Map;
+import java.util.function.Supplier;
+
+public class MockHttpServletRequest implements HttpServletRequest
Review Comment:
There is already a similar implementation
[here](https://github.com/apache/druid/blob/master/extensions-core/druid-catalog/src/test/java/org/apache/druid/server/http/catalog/DummyRequest.java).
Perhaps we can combine them and move them to a common location.
##########
processing/src/main/java/org/apache/druid/query/context/ResponseContext.java:
##########
@@ -334,14 +335,15 @@ public Object mergeValues(Object oldValue, Object
newValue)
*/
public static final Key UNCOVERED_INTERVALS_OVERFLOWED = new BooleanKey(
"uncoveredIntervalsOverflowed",
- true);
+ true
+ );
/**
* Map of most relevant query ID to remaining number of responses from
query nodes.
* The value is initialized in {@code CachingClusteredClient} when it
initializes the connection to the query nodes,
* and is updated whenever they respond (@code DirectDruidClient). {@code
RetryQueryRunner} uses this value to
* check if the {@link #MISSING_SEGMENTS} is valid.
- *
+ * <p>
Review Comment:
Thank you for adding good formatting. I often use the IDE Javadoc preview.
Without proper formatting, the preview is a bit of a muddle.
##########
core/src/test/java/org/apache/druid/query/QueryExceptionTest.java:
##########
@@ -38,36 +33,72 @@
private static final String ERROR_MESSAGE_ORIGINAL = "aaaa";
private static final String ERROR_MESSAGE_TRANSFORMED = "bbbb";
- @Mock
- private Function<String, String> trasformFunction;
-
@Test
public void testSanitizeWithTransformFunctionReturningNull()
{
-
Mockito.when(trasformFunction.apply(ArgumentMatchers.eq(ERROR_MESSAGE_ORIGINAL))).thenReturn(null);
QueryException queryException = new QueryException(ERROR_CODE,
ERROR_MESSAGE_ORIGINAL, ERROR_CLASS, HOST);
- QueryException actual = queryException.sanitize(trasformFunction);
+
+ AtomicLong callCount = new AtomicLong(0);
+ QueryException actual = queryException.sanitize(s -> {
+ callCount.incrementAndGet();
+ Assert.assertEquals(ERROR_MESSAGE_ORIGINAL, s);
+ return null;
+ });
+
Assert.assertNotNull(actual);
Assert.assertEquals(actual.getErrorCode(), ERROR_CODE);
Assert.assertNull(actual.getMessage());
Assert.assertNull(actual.getHost());
Assert.assertNull(actual.getErrorClass());
-
Mockito.verify(trasformFunction).apply(ArgumentMatchers.eq(ERROR_MESSAGE_ORIGINAL));
- Mockito.verifyNoMoreInteractions(trasformFunction);
+ Assert.assertEquals(1, callCount.get());
}
@Test
public void testSanitizeWithTransformFunctionReturningNewString()
{
-
Mockito.when(trasformFunction.apply(ArgumentMatchers.eq(ERROR_MESSAGE_ORIGINAL))).thenReturn(ERROR_MESSAGE_TRANSFORMED);
QueryException queryException = new QueryException(ERROR_CODE,
ERROR_MESSAGE_ORIGINAL, ERROR_CLASS, HOST);
- QueryException actual = queryException.sanitize(trasformFunction);
+
+ AtomicLong callCount = new AtomicLong(0);
+ QueryException actual = queryException.sanitize(s -> {
Review Comment:
If we're defining a new error system, it would be ideal if sanitizing
consists of stripping out the stack trace: the rest of the error was already
well designed.
Else, it becomes quite hard to sanitize a message in general. The only
general solution is "We're sorry, it didn't work out this time." GIGO and all
that. Let's not put information into an error that we have to strip back out.
Or, if we must (to aid debugging), do so using properties that are easy to
discard. Again, [see this
proposal](https://github.com/apache/druid/issues/13123).
##########
core/src/main/java/org/apache/druid/query/QueryException.java:
##########
@@ -30,12 +29,129 @@
import java.util.function.Function;
/**
- * Base serializable error response
- *
+ * Base serializable error response.
+ * <p>
+ * The Object Model that QueryException follows is a little non-intuitive as
the primary way that a QueryException is
+ * generated is through a child class. However, those child classes are *not*
equivalent to a QueryException, instead
+ * they act as a Factory of QueryException objects. This can be seen in two
different places.
+ * <p>
+ * 1. When sanitize() is called, the response is a QueryException without any
indication of which original exception
+ * occurred.
+ * 2. When these objects get serialized across the wire the recipient
deserializes a QueryException. The client is
+ * never expected, and fundamentally is not allowed to, ever deserialize a
child class of QueryException.
+ * <p>
+ * For this reason, QueryException must contain all potential state that any
of its child classes could ever want to
+ * push across the wire. Additionally, any catch clauses expecting one of the
child Exceptions must know that it is
+ * running inside of code where the exception has not traveled across the
wire. If there is a chance that the
+ * exception could have been serialized across the wire, the code must catch a
QueryException and check the errorCode
+ * instead.
+ * <p>
+ * As a corollary, adding new state or adjusting the logic of this class must
always be done in a backwards-compatible
+ * fashion across all child classes of QueryException.
+ * <p>
+ * If there is any need to do different logic based on the type of error that
has happened, the only reliable method
+ * of discerning the type of the error is to look at the errorCode String.
Because these Strings are considered part
+ * of the API, they are not allowed to change and must maintain their same
semantics. The known errorCode Strings
+ * are pulled together as public static fields on this class in order to make
it more clear what options exist.
+ * <p>
* QueryResource and SqlResource are expected to emit the JSON form of this
object when errors happen.
*/
public class QueryException extends RuntimeException implements
SanitizableException
{
+ /**
+ * Error codes
+ */
+ public static final String JSON_PARSE_ERROR_CODE = "Json parse failed";
+ public static final String BAD_QUERY_CONTEXT_ERROR_CODE = "Query context
parse failed";
+ public static final String QUERY_CAPACITY_EXCEEDED_ERROR_CODE = "Query
capacity exceeded";
+ public static final String QUERY_INTERRUPTED_ERROR_CODE = "Query
interrupted";
+ // Note: the proper spelling is with a single "l", but the version with
+ // two "l"s is documented, we can't change the text of the message.
+ public static final String QUERY_CANCELED_ERROR_CODE = "Query cancelled";
+ public static final String UNAUTHORIZED_ERROR_CODE = "Unauthorized request";
+ public static final String UNSUPPORTED_OPERATION_ERROR_CODE = "Unsupported
operation";
+ public static final String TRUNCATED_RESPONSE_CONTEXT_ERROR_CODE =
"Truncated response context";
+ public static final String UNKNOWN_EXCEPTION_ERROR_CODE = "Unknown
exception";
+ public static final String QUERY_TIMEOUT_ERROR_CODE = "Query timeout";
+ public static final String QUERY_UNSUPPORTED_ERROR_CODE = "Unsupported
query";
+ public static final String RESOURCE_LIMIT_EXCEEDED_ERROR_CODE = "Resource
limit exceeded";
+ public static final String SQL_PARSE_FAILED_ERROR_CODE = "SQL parse failed";
+ public static final String PLAN_VALIDATION_FAILED_ERROR_CODE = "Plan
validation failed";
Review Comment:
These are all quite lame messages. If I was a user, and was told "Plan
validation failed", I'd have no idea where to start to set things right. Is
this a user error? If so, what did I do wrong? Or, did the system run out of
resources? Is there a bug in Druid: the plan should have worked, but didn't.
We really gotta put some effort into improving our messages on top of just
making the lousy messages into nice constants.
##########
sql/src/main/java/org/apache/druid/sql/http/SqlResource.java:
##########
@@ -117,171 +108,41 @@
@POST
@Produces(MediaType.APPLICATION_JSON)
@Consumes(MediaType.APPLICATION_JSON)
+ @Nullable
public Response doPost(
final SqlQuery sqlQuery,
@Context final HttpServletRequest req
- ) throws IOException
+ )
Review Comment:
Thank you for pulling the response out of the body of this method. That's
been something that's been bugging me for a while.
This refactoring will be helpful to the gRPC API project: the HTTP request
handler is now just a few lines of code which is easy to replicate for the gRPC
handler.
##########
sql/src/main/java/org/apache/druid/sql/http/SqlResource.java:
##########
@@ -320,4 +181,187 @@ public Response cancelQuery(
return Response.status(Status.FORBIDDEN).build();
}
}
+
+ /**
+ * The SqlResource only generates metrics and doesn't keep track of
aggregate counts of successful/failed/interrupted
+ * queries, so this implementation is effectively just a noop.
+ */
+ private static class SqlResourceQueryMetricCounter implements
QueryResource.QueryMetricCounter
+ {
+ @Override
+ public void incrementSuccess()
+ {
+
+ }
+
+ @Override
+ public void incrementFailed()
+ {
+
+ }
+
+ @Override
+ public void incrementInterrupted()
+ {
+
+ }
+
+ @Override
+ public void incrementTimedOut()
+ {
+
+ }
+ }
+
+ private class SqlResourceQueryResultPusher extends QueryResultPusher
+ {
+ private final String sqlQueryId;
+ private final HttpStatement stmt;
+ private final SqlQuery sqlQuery;
+
+ public SqlResourceQueryResultPusher(
+ AsyncContext asyncContext,
+ String sqlQueryId,
+ HttpStatement stmt,
+ SqlQuery sqlQuery
+ )
+ {
+ super(
+ (HttpServletResponse) asyncContext.getResponse(),
+ SqlResource.this.jsonMapper,
+ SqlResource.this.responseContextConfig,
+ SqlResource.this.selfNode,
+ SqlResource.QUERY_METRIC_COUNTER,
+ sqlQueryId,
+ MediaType.APPLICATION_JSON_TYPE
+ );
+ this.sqlQueryId = sqlQueryId;
+ this.stmt = stmt;
+ this.sqlQuery = sqlQuery;
+ }
+
+ @Override
+ public ResultsWriter start()
+ {
+ return new ResultsWriter()
+ {
+ private ResultSet thePlan;
+
+ @Override
+ @Nullable
+ @SuppressWarnings({"unchecked", "rawtypes"})
+ public QueryResponse<Object> start(HttpServletResponse response)
+ {
+ response.setHeader(SQL_QUERY_ID_RESPONSE_HEADER, sqlQueryId);
+
+ final QueryResponse<Object[]> retVal;
+ try {
+ thePlan = stmt.plan();
+ retVal = thePlan.run();
Review Comment:
This is OK... But I worry that we create an opportunity for a naive user to
try to use the statement both here and in the request. Nothing in the SQL
planner is multi-threaded: each operation must be called by only a single
thread at a time. (It is obviously OK for the thread ownership to change, as
here.)
Also, in JDBC, there is an elaborate mechanism because the thread that
starts the query (metrics?) must be the one that finishes them. I don't know if
that restriction is still valid. If it is, this solution should be better than
the original code.
I wonder if it is better to do the planing in the request thread, and just
return an error response if things go south. Only if the plan is good do we
move onto the execution being done here.
##########
sql/src/main/java/org/apache/druid/sql/http/SqlResource.java:
##########
@@ -275,7 +275,10 @@ public QueryResponse<Object> start(HttpServletResponse
response)
// AssertionErrors are coming from and do something to ensure that
they don't actually make it out of Calcite
catch (AssertionError e) {
log.warn(e, "AssertionError killed query: %s", sqlQuery);
- final QueryInterruptedException wrappedEx =
QueryInterruptedException.wrapIfNeeded(e);
+
+ // We wrap the exception here so that we get the sanitization.
java.lang.AssertionError apparently
+ // doesn't implement
org.apache.druid.common.exception.SanitizableException.
Review Comment:
We really should clean up our error system rather than adding more hacks...
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]