GitHub user Iskander14yo created a discussion: Adding Gluten (on Velox) to 
ClickBench

Hi!

I recently opened a PR (https://github.com/ClickHouse/ClickBench/pull/575) to add 
Gluten (on Velox) to [ClickBench](https://benchmark.clickhouse.com/), one of 
the popular benchmarks for analytical workloads.

I thought it might be useful to let the Velox/Gluten community know about this PR. 
While the results have yet to be measured, I'd also appreciate any feedback on 
whether my configuration and setup are correct.

My notes (besides those in the README):

- I couldn’t get Gluten to run with pyspark==3.5.5 as the docs recommend; I hit 
warnings/errors (referenced in the README).

- I didn’t find a single, reliable grep-able pattern to detect the 
operators/functions where Gluten falls back to Spark, because the warnings vary 
significantly. For example, below are the logs for the last query:
```
25/08/17 00:42:50 WARN ProjectExecTransformer: 
 - Native validation failed: 
   |- Validation failed due to exception caught at 
file:SubstraitToVeloxPlanValidator.cc line:1387 function:validate, thrown from 
file:ExprCompiler.cpp line:472 function:compileRewrittenExpression, 
reason:Scalar function name not registered: date_trunc, called with arguments: 
(VARCHAR, TIMESTAMP, VARCHAR).
25/08/17 00:42:50 WARN ProjectExecTransformer: 
 - Native validation failed: 
   |- Validation failed due to exception caught at 
file:SubstraitToVeloxPlanValidator.cc line:1387 function:validate, thrown from 
file:ExprCompiler.cpp line:472 function:compileRewrittenExpression, 
reason:Scalar function name not registered: date_trunc, called with arguments: 
(VARCHAR, TIMESTAMP, VARCHAR).
25/08/17 00:42:50 WARN GlutenFallbackReporter: Validation failed for plan: 
Project[QueryId=2], due to: 
 - Native validation failed: 
   |- Validation failed due to exception caught at 
file:SubstraitToVeloxPlanValidator.cc line:1387 function:validate, thrown from 
file:ExprCompiler.cpp line:472 function:compileRewrittenExpression, 
reason:Scalar function name not registered: date_trunc, called with arguments: 
(VARCHAR, TIMESTAMP, VARCHAR).
25/08/17 00:42:52 WARN ProjectExecTransformer:                                  
 - Validation failed with exception from: ProjectExecTransformer, reason: Not 
supported to map spark function name to substrait function name: 
toprettystring(M#433, Some(Europe/Moscow)), class name: ToPrettyString.
25/08/17 00:42:52 WARN ColumnarCollectLimitExec: 
 - Columnar collect-limit is unsupported under the current Spark version
25/08/17 00:42:52 WARN ProjectExecTransformer: 
 - Validation failed with exception from: ProjectExecTransformer, reason: Not 
supported to map spark function name to substrait function name: 
toprettystring(M#433, Some(Europe/Moscow)), class name: ToPrettyString.
25/08/17 00:42:52 WARN GlutenFallbackReporter: Validation failed for plan: 
Project[QueryId=2], due to: 
 - Validation failed with exception from: ProjectExecTransformer, reason: Not 
supported to map spark function name to substrait function name: 
toprettystring(M#433, Some(Europe/Moscow)), class name: ToPrettyString.
25/08/17 00:42:52 WARN GlutenFallbackReporter: Validation failed for plan: 
CollectLimit[QueryId=2], due to: 
 - Columnar collect-limit is unsupported under the current Spark version
E20250817 00:42:52.213160 134615 Exceptions.h:66] Line: 
/root/src/apache/incubator-gluten/ep/build-velox/build/velox_ep/velox/exec/Task.cpp:2049,
 Function:terminate, Expression:  Cancelled, Source: RUNTIME, ErrorCode: 
INVALID_STATE
```
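
As a starting point, here is a minimal sketch of how the warnings above could be summarized with a few regular expressions instead of one grep pattern. The three patterns are taken only from the wording in this log excerpt; they are not an official Gluten log format, and other versions or operators may phrase their warnings differently. The function name `summarize_fallbacks` and the category labels are hypothetical names chosen for illustration.

```python
import re
from collections import Counter

# Patterns derived only from the log excerpt above (an assumption, not a
# documented Gluten format). Each captures the function or operator name.
FALLBACK_PATTERNS = [
    ("unregistered_function",
     re.compile(r"Scalar function name not registered: (\w+)")),
    ("unmapped_function",
     re.compile(r"Not supported to map spark function name "
                r"to substrait function name: (\w+)")),
    ("unsupported_operator",
     re.compile(r"(Columnar [\w-]+) is unsupported")),
]


def summarize_fallbacks(log_text: str) -> Counter:
    """Count fallback reasons found in a Spark driver log."""
    # Collapse hard line wraps so a phrase split across lines still matches.
    flat = re.sub(r"\s+", " ", log_text)
    counts = Counter()
    for kind, pattern in FALLBACK_PATTERNS:
        for match in pattern.finditer(flat):
            counts[(kind, match.group(1))] += 1
    return counts
```

Because the same reason is logged by both the transformer and `GlutenFallbackReporter`, the counts reflect log occurrences rather than distinct plan nodes. Any warning whose wording is not covered by these patterns is silently missed, so the list would need to grow as new messages show up.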

GitHub link: https://github.com/apache/incubator-gluten/discussions/10465
