Sorry, let me correct a typo in my message below: Gluten currently uses
Task::next(), not Task::start().
At 2025-05-13 08:43:45, "YONG" <[email protected]> wrote:
Hi all,
Happy to be here! I am new to Spark and Gluten, and I have two questions
about Gluten.
The first question is about the task execution mode in Gluten.
From Velox's source code, it seems that Velox supports two execution modes
[velox/exec/Task.h, enum class ExecutionMode]:
Serial execution mode: processes the task on a single thread;
the API is Task::next().
Parallel execution mode: processes the task on multiple threads;
the API is Task::start().
In Gluten's code [WholeStageResultIterator::next()], only Velox's
serial execution mode [Task::next()] is used at present.
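To make the difference between the two calling styles concrete, here is a toy
C++ sketch. ToyTask, Batch and drainSerial are hypothetical names made up for
illustration; they are not Velox or Gluten classes, and the real Task API takes
different arguments. The only point is that next() is pulled by the caller's
own thread one result at a time, while start() fans the work out to worker
threads.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <numeric>
#include <utility>
#include <vector>

// Toy stand-in for a batch of rows. This is NOT velox::RowVector.
using Batch = std::vector<int>;

// Hypothetical toy task. This is NOT velox::exec::Task; it only mimics
// the pull-style and fan-out-style calling patterns.
class ToyTask {
 public:
  explicit ToyTask(std::vector<Batch> splits) : splits_(std::move(splits)) {}

  // Serial style: the calling thread does the work and pulls one result
  // per call. Returns false once every split has been consumed.
  bool next(long& out) {
    if (cursor_ >= splits_.size()) {
      return false;
    }
    out = sum(splits_[cursor_++]);
    return true;
  }

  // Parallel style: each split is processed on its own worker thread and
  // the results are gathered in split order.
  std::vector<long> start() const {
    std::vector<std::future<long>> pending;
    for (const Batch& split : splits_) {
      const Batch* p = &split;
      pending.push_back(
          std::async(std::launch::async, [p] { return sum(*p); }));
    }
    std::vector<long> results;
    for (auto& f : pending) {
      results.push_back(f.get());
    }
    return results;
  }

 private:
  // Toy "operator": sums the values of one split.
  static long sum(const Batch& b) {
    return std::accumulate(b.begin(), b.end(), 0L);
  }

  std::vector<Batch> splits_;
  std::size_t cursor_ = 0;
};

// Drives the serial style from a single thread until the task is exhausted.
std::vector<long> drainSerial(ToyTask& task) {
  std::vector<long> results;
  long value = 0;
  while (task.next(value)) {
    results.push_back(value);
  }
  return results;
}
```

In this toy, drainSerial(task) and task.start() return the same per-split
sums; only the threading differs, which is the part my question is about.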
I guess Velox was originally developed by Meta to replace Presto's engine,
and a Presto task can run on multiple threads. In Spark, however, a task
is expected to run on a single thread, which corresponds to one core of one
executor. I am not sure how much effort it would take to support Velox's
parallel mode in Gluten.
My question is whether there is a plan to support parallel mode in the future
and, if it is on the feature list, when it is expected.
The second question is about profiling tools.
I want to collect the C++ hotspots and a flame graph for a single query, to see
which Velox functions are on the critical path in my case. So far I have only
found the guide for the ClickHouse backend
(incubator-gluten/docs/developers/UsingGperftoolsInCH.md at main ·
apache/incubator-gluten · GitHub). Is there a similar guide that I can follow
for the Velox backend?
Thanks a lot
Best Regards
Pan Yong