GitHub user Iskander14yo created a discussion: Adding Gluten (on Velox) to ClickBench
Hi! I recently opened a PR (https://github.com/ClickHouse/ClickBench/pull/575) to add Gluten (on Velox) to [ClickBench](https://benchmark.clickhouse.com/), one of the popular benchmarks for analytical workloads. I thought it might be useful to let the Velox/Gluten community know about this PR. While the results are yet to be measured, I'd also appreciate any feedback on whether my configuration and setup are correct.

My notes (besides those in the README):

- I couldn’t get Gluten to run with pyspark==3.5.5 as the docs recommend; I hit warnings/errors (referenced in the README).
- I didn’t find a reliable, single grep-able pattern to detect operators/functions where Gluten falls back to Spark, because the warnings vary significantly. For example, below are the logs for the last query:

```
25/08/17 00:42:50 WARN ProjectExecTransformer: - Native validation failed: |- Validation failed due to exception caught at file:SubstraitToVeloxPlanValidator.cc line:1387 function:validate, thrown from file:ExprCompiler.cpp line:472 function:compileRewrittenExpression, reason:Scalar function name not registered: date_trunc, called with arguments: (VARCHAR, TIMESTAMP, VARCHAR).
25/08/17 00:42:50 WARN ProjectExecTransformer: - Native validation failed: |- Validation failed due to exception caught at file:SubstraitToVeloxPlanValidator.cc line:1387 function:validate, thrown from file:ExprCompiler.cpp line:472 function:compileRewrittenExpression, reason:Scalar function name not registered: date_trunc, called with arguments: (VARCHAR, TIMESTAMP, VARCHAR).
25/08/17 00:42:50 WARN GlutenFallbackReporter: Validation failed for plan: Project[QueryId=2], due to: - Native validation failed: |- Validation failed due to exception caught at file:SubstraitToVeloxPlanValidator.cc line:1387 function:validate, thrown from file:ExprCompiler.cpp line:472 function:compileRewrittenExpression, reason:Scalar function name not registered: date_trunc, called with arguments: (VARCHAR, TIMESTAMP, VARCHAR).
25/08/17 00:42:52 WARN ProjectExecTransformer: - Validation failed with exception from: ProjectExecTransformer, reason: Not supported to map spark function name to substrait function name: toprettystring(M#433, Some(Europe/Moscow)), class name: ToPrettyString.
25/08/17 00:42:52 WARN ColumnarCollectLimitExec: - Columnar collect-limit is unsupported under the current Spark version
25/08/17 00:42:52 WARN ProjectExecTransformer: - Validation failed with exception from: ProjectExecTransformer, reason: Not supported to map spark function name to substrait function name: toprettystring(M#433, Some(Europe/Moscow)), class name: ToPrettyString.
25/08/17 00:42:52 WARN GlutenFallbackReporter: Validation failed for plan: Project[QueryId=2], due to: - Validation failed with exception from: ProjectExecTransformer, reason: Not supported to map spark function name to substrait function name: toprettystring(M#433, Some(Europe/Moscow)), class name: ToPrettyString.
25/08/17 00:42:52 WARN GlutenFallbackReporter: Validation failed for plan: CollectLimit[QueryId=2], due to: - Columnar collect-limit is unsupported under the current Spark version
E20250817 00:42:52.213160 134615 Exceptions.h:66] Line: /root/src/apache/incubator-gluten/ep/build-velox/build/velox_ep/velox/exec/Task.cpp:2049, Function:terminate, Expression: Cancelled, Source: RUNTIME, ErrorCode: INVALID_STATE
```

GitHub link: https://github.com/apache/incubator-gluten/discussions/10465
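
In the log excerpt above, the `GlutenFallbackReporter` warnings do share a common prefix (`Validation failed for plan: <Operator>[QueryId=N], due to: ...`), even though the per-operator warnings differ. A rough Python sketch of how one might tally fallbacks from that prefix follows. The regex is inferred only from this excerpt and is an assumption, not a documented Gluten log format. It also assumes each log entry sits on a single line, and `summarize_fallbacks` is a hypothetical helper name.

```python
import re
from collections import Counter

# Pattern inferred from the log excerpt in this post only (an assumption,
# not a documented Gluten log format; other versions may log differently).
FALLBACK_RE = re.compile(
    r"WARN GlutenFallbackReporter: Validation failed for plan: "
    r"(?P<op>\w+)\[QueryId=(?P<qid>\d+)\], due to: (?P<reason>.*)"
)


def summarize_fallbacks(lines):
    """Count (operator, reason) pairs reported by GlutenFallbackReporter.

    Assumes every log entry is on one line; lines that do not match the
    inferred pattern are ignored.
    """
    counts = Counter()
    for line in lines:
        match = FALLBACK_RE.search(line)
        if match:
            key = (match.group("op"), match.group("reason").strip())
            counts[key] += 1
    return counts


# Two lines copied from the excerpt: only the first comes from
# GlutenFallbackReporter, so only it is counted.
sample = [
    "25/08/17 00:42:52 WARN GlutenFallbackReporter: Validation failed for plan: "
    "CollectLimit[QueryId=2], due to: - Columnar collect-limit is unsupported "
    "under the current Spark version",
    "25/08/17 00:42:52 WARN ColumnarCollectLimitExec: - Columnar collect-limit "
    "is unsupported under the current Spark version",
]

for (op, reason), n in summarize_fallbacks(sample).items():
    print(f"{n}x {op}: {reason}")
```

Filtering on the reporter lines alone avoids double-counting the per-operator warnings (for example, `ProjectExecTransformer` logs the same reason twice in the excerpt), but it would miss any fallback that the reporter does not log.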
