gianm commented on PR #15552:
URL: https://github.com/apache/druid/pull/15552#issuecomment-1879849582
> create the ExprEval early for ConstantExpr -s (except the one for
BigInteger which seem to have some odd contract)
This part is problematic since `ExprEval` is not thread-safe. For example,
`NestedDataExprEval#asDouble` calls `computeNumber` which has race problems if
called from two threads simultaneously:
```
private void computeNumber()
{
if (!computedNumber && value != null) {
computedNumber = true;
Object val = StructuredData.unwrap(value);
if (val instanceof Number) {
number = (Number) val;
} else if (val instanceof Boolean) {
number = Evals.asLong((Boolean) val);
} else if (val instanceof String) {
number = ExprEval.computeNumber((String) val);
}
}
}
```
To safely cache the `ExprEval` we have a couple of options.
One is to ensure that the `ExprEval` stashed in `ConstantExpr` are
immutable. Perhaps by adding a method on `ExprEval` that is used to ensure all
the caches for `asDouble()`, `asString()`, etc, are populated. Once those are
all populated then the object becomes thread-safe.
Another is to add a `singleThreaded()` method to make a thread-unsafe +
optimized version, like the technique used by `GenericIndexed#singleThreaded`.
The equivalent here would be something like `Expr#singleThreaded`, and _that_
would be the one that creates the `ExprEval` (in a copy of the `Expr`). Callers
that are about to call `expr.eval(bindings)` in a hot loop would then first do
`singleThreadedExpr = expr.singleThreaded()`, then do
`singleThreadedExpr.eval(bindings)`. Callers that _don't_ do this will still
function properly, but they won't get the stashed eval optimization. Sketch of
the approach:
```java
class DoubleExpr extends ConstantExpr<Double>
{
@Nullable
private final ExprEval<?> eval;
// constructor used by Parser
DoubleExpr(Double value)
{
super(ExpressionType.DOUBLE, Preconditions.checkNotNull(value, "value"));
}
// constructor used by singleThreaded
DoubleExpr(ExprEval<?> eval)
{
super(ExpressionType.DOUBLE, Preconditions.checkNotNull(eval.value(),
"value"));
}
@Override // overrides method from Expr
public Expr singleThreaded()
{
return new DoubleExpr(ExprEval.ofDouble(value));
}
@Override
public ExprEval eval(ObjectBindings bindings)
{
// thread-safe no matter what, but more optimized if someone had called
singleThreaded()
return eval != null ? eval : ExprEval.ofDouble(value);
}
}
```
I'm partial to the second one since I am thinking we might want to add a
`Utf8ExprEval` in the future that has a `ByteBuffer` value, and that would
benefit from the second approach (we can call `ByteBuffer#duplicate` in
`singleThreaded()` and avoid calling it during `eval`) but would not benefit
from the first approach (we would need to call `ByteBuffer#duplicate` in
`eval`, since ByteBuffer cannot ever be shared across threads, due to the
builtin `position` being changed by various operations).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]