clintropolis opened a new pull request, #13361:
URL: https://github.com/apache/druid/pull/13361
### Description
This PR fixes an accidental kill switch introduced by #13332, where by
exposing the ability to construct sketches via expressions we also expose the
ability to crash any jvm process running them by feeding them bad input.
For example, the query `SELECT COMPLEX_DECODE_BASE64('HLLSketch', 'AgEH')`
which is a truncated base64 encoded sketch object, will immediately crash the
broker which evaluates it when planning the SQL query with an error of the form:
```
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x000000010424616c, pid=77140, tid=0x0000000000010303
#
# JRE version: OpenJDK Runtime Environment (8.0_332-b08) (build
1.8.0_332-b08)
# Java VM: OpenJDK 64-Bit Server VM (25.332-b08 mixed mode bsd-aarch64
compressed oops)
# Problematic frame:
# j
org.apache.datasketches.hll.PreambleUtil.insertInt(Lorg/apache/datasketches/memory/WritableMemory;JI)V+3
#
```
After the changes in this PR, we just get a regular out of bounds exception:
```
2022-11-12T12:48:10,545 WARN [sql[a54c6476-ed24-4e00-a87b-f01c90cdae2c]]
org.apache.druid.sql.http.SqlResource - Failed to handle query:
SqlQuery{query='SELECT COMPLEX_DECODE_BASE64('HLLSketch', 'AgEH')',
resultFormat=array, header=true, typesHeader=true, sqlTypesHeader=true,
context={sqlOuterLimit=1001, sqlQueryId=a54c6476-ed24-4e00-a87b-f01c90cdae2c,
queryId=a54c6476-ed24-4e00-a87b-f01c90cdae2c}, parameters=[]}
java.lang.IndexOutOfBoundsException: null
at java.nio.Buffer.checkIndex(Buffer.java:545) ~[?:1.8.0_332]
at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:142) ~[?:1.8.0_332]
at
org.apache.druid.segment.data.SafeMemoryBase.getByte(SafeMemoryBase.java:65)
~[classes/:?]
at
org.apache.datasketches.hll.PreambleUtil.extractLgK(PreambleUtil.java:280)
~[datasketches-java-3.2.0.jar:?]
at org.apache.datasketches.hll.HllSketch.wrap(HllSketch.java:247)
~[datasketches-java-3.2.0.jar:?]
at
org.apache.druid.query.aggregation.datasketches.hll.HllSketchObjectStrategy.fromByteBufferSafe(HllSketchObjectStrategy.java:64)
~[?:?]
at
org.apache.druid.query.aggregation.datasketches.hll.HllSketchObjectStrategy.fromByteBufferSafe(HllSketchObjectStrategy.java:31)
~[?:?]
at
org.apache.druid.segment.column.ObjectStrategyComplexTypeStrategy.fromBytes(ObjectStrategyComplexTypeStrategy.java:93)
~[classes/:?]
at
org.apache.druid.math.expr.BuiltInExprMacros$ComplexDecodeBase64ExprMacro$ComplexDecodeBase64Expression.eval(BuiltInExprMacros.java:110)
~[classes/:?]
at org.apache.druid.math.expr.Parser.lambda$flatten$1(Parser.java:181)
~[classes/:?]
at
org.apache.druid.math.expr.BuiltInExprMacros$ComplexDecodeBase64ExprMacro$ComplexDecodeBase64Expression.visit(BuiltInExprMacros.java:117)
~[classes/:?]
at org.apache.druid.math.expr.Parser.flatten(Parser.java:152)
~[classes/:?]
at org.apache.druid.math.expr.Parser.parse(Parser.java:144)
~[classes/:?]
at org.apache.druid.math.expr.Parser.parse(Parser.java:125)
~[classes/:?]
at
org.apache.druid.sql.calcite.planner.DruidRexExecutor.reduce(DruidRexExecutor.java:74)
~[classes/:?]
at
org.apache.calcite.rel.rules.ReduceExpressionsRule.reduceExpressionsInternal(ReduceExpressionsRule.java:695)
~[calcite-core-1.21.0.jar:1.21.0]
```
In fairness, this problem existed prior to #13332, since the underlying
problem has to do with anything using `Memory` on untrusted input bytes for
anything that gets instructions of where to read or write from those bytes. The
reason is that `Memory` for performance reasons uses `Unsafe` for all read and
write operations, even when wrapping `ByteBuffer`, so if the code tries to read
out of bounds locations it can crash the jvm process and do who knows what else.
I think this behavior is absolutely correct for `Memory` for most uses, but
to solve our problem in a bit of an unconventional way, this PR introduces
`SafeWritableMemory` and `SafeWritableBuffer` that delegate _all_ operations to
an underlying plain java `ByteBuffer` so that it gets the bounds checking that
goes with that, which we can use for anything that uses `Memory` when loading
from an untrusted source. When reading from segments we still use standard
`Memory` calls.
Fortunately due to a bug that wasn't fixed until #13332, i wasn't able to
trigger this exact failure in older versions of Druid with the native
`complex_decode_base64` expression because the complex type mapping wasn't
correctly wired up to the function, but i believe it is still possible when
ingesting pre-aggregated sketches that use `Memory`.
To wire up to the `complex_decode_base64` function, I've added a new method
to `ObjectStrategy`,
```
@Nullable
default T fromByteBufferSafe(ByteBuffer buffer, int numBytes)
{
return fromByteBuffer(buffer, numBytes);
}
```
which should be implemented by anything which uses `Unsafe` or other
potentially dangerous operations. I debated having a default implementation but
ultimately decided to add one since this interface is marked as an
`@ExtensionPoint`. This is erring on the side of danger in favor of
compatibility, so I welcome discussion here.
<hr>
This PR has:
- [ ] been self-reviewed.
- [ ] using the [concurrency
checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md)
(Remove this item if the PR doesn't have any relation to concurrency.)
- [ ] added documentation for new or modified features or behaviors.
- [ ] a release note entry in the PR description.
- [ ] added Javadocs for most classes and all non-trivial methods. Linked
related entities via Javadoc links.
- [ ] added or updated version, license, or notice information in
[licenses.yaml](https://github.com/apache/druid/blob/master/dev/license.md)
- [ ] added comments explaining the "why" and the intent of the code
wherever would not be obvious for an unfamiliar reader.
- [ ] added unit tests or modified existing tests to cover new code paths,
ensuring the threshold for [code
coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md)
is met.
- [ ] added integration tests.
- [ ] been tested in a test Druid cluster.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]