924060929 commented on code in PR #63366:
URL: https://github.com/apache/doris/pull/63366#discussion_r3413517236
##########
be/src/exec/pipeline/pipeline_fragment_context.cpp:
##########
@@ -290,6 +290,32 @@ Status
PipelineFragmentContext::_build_and_prepare_full_pipeline(ThreadPool* thr
RETURN_IF_ERROR(_build_pipelines(_runtime_state->obj_pool(),
*_query_ctx->desc_tbl,
&_root_op, root_pipeline));
+ // Propagate _num_instances from LOCAL_EXCHANGE pipelines to ancestor pipelines
+ // that inherited reduced num_tasks from a serial operator.
+ _propagate_local_exchange_num_tasks();
+
+ // Create deferred local exchangers now that all pipelines have final num_tasks.
+ RETURN_IF_ERROR(_create_deferred_local_exchangers());
Review Comment:
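Yes. These two steps are not there for gradual rollout or rollback; they are the **current compromise** in having FE take over local exchange, not a one-step final design.

For FE to also plan "the parallelism of every operator" and send it down directly, FE would essentially have to split pipelines itself and control the parallelism of each pipeline (num_tasks is per-pipeline and only changes at pipeline boundaries, i.e. at a local exchange / pipeline breaker), and BE would no longer plan pipelines at all. That amounts to moving BE's entire pipeline construction to FE (`_build_pipeline`, pipeline-breaker detection, num_tasks derivation, dependency wiring, while also accounting for spill / runtime filter / shared state) and changing the thrift contract to carry the full pipeline structure. The scope is too broad and the risk too high. This approximately equivalent FE implementation has already taken more than half a year and nearly 10,000 lines of code, so it cannot all be done in one go.

So the current state is: FE plans the **position/type/distribution** of each LE (this determines plan correctness, and is exactly where the BE/FE divergence bugs used to come from). When `enable_local_shuffle_planner=true`, BE's `_plan_local_exchange` is skipped entirely. Each pipeline's num_tasks, however, is still derived topologically by BE's `_propagate_local_exchange_num_tasks`, and `_create_deferred_local_exchangers` creates the exchangers late because an exchanger's `num_partitions` cannot be built until num_tasks is settled. The end state is that FE also computes num_tasks and sends it down, with BE's propagate step reduced to an assert (validation only). That is a separate, large follow-up change.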
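
To make the ordering constraint concrete, here is a toy sketch of the two phases. It is not the Doris implementation: `ToyPipeline`, `ToyExchanger`, the function names, and the propagation rule itself (a local-exchange pipeline runs at full `num_instances`, any other pipeline is raised to the largest num_tasks among its upstream pipelines) are all assumptions made for illustration. It also assumes `num_partitions` equals the num_tasks of the pipeline that reads from the exchanger.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a pipeline; NOT the Doris Pipeline class.
struct ToyPipeline {
    int num_tasks = 1;                     // parallelism of this pipeline
    bool source_is_local_exchange = false; // pipeline starts at a local exchange
    std::vector<std::size_t> upstream;     // indices of pipelines feeding this one
};

// Toy stand-in for a local exchanger; num_partitions is fixed once created.
struct ToyExchanger {
    std::size_t pipeline; // index of the pipeline reading from this exchanger
    int num_partitions;
};

// Phase 1 (assumed rule). The vector must be in topological order, upstream
// pipelines first, so every upstream value is final before it is read.
void propagate_num_tasks(std::vector<ToyPipeline>& pipelines, int num_instances) {
    for (ToyPipeline& p : pipelines) {
        if (p.source_is_local_exchange) {
            p.num_tasks = num_instances;
            continue;
        }
        for (std::size_t u : p.upstream) {
            if (pipelines[u].num_tasks > p.num_tasks) {
                p.num_tasks = pipelines[u].num_tasks;
            }
        }
    }
}

// Phase 2. Must run after phase 1: num_partitions is copied from num_tasks,
// so creating exchangers earlier would freeze the reduced value.
std::vector<ToyExchanger> create_deferred_exchangers(
        const std::vector<ToyPipeline>& pipelines) {
    std::vector<ToyExchanger> exchangers;
    for (std::size_t i = 0; i < pipelines.size(); ++i) {
        if (pipelines[i].source_is_local_exchange) {
            exchangers.push_back(ToyExchanger{i, pipelines[i].num_tasks});
        }
    }
    return exchangers;
}

// Possible end state: the planner already sent final values, so propagation
// only validates. Returns true when re-running phase 1 changes nothing.
bool num_tasks_already_final(std::vector<ToyPipeline> pipelines, int num_instances) {
    std::vector<int> before;
    for (const ToyPipeline& p : pipelines) before.push_back(p.num_tasks);
    propagate_num_tasks(pipelines, num_instances);
    for (std::size_t i = 0; i < pipelines.size(); ++i) {
        if (pipelines[i].num_tasks != before[i]) return false;
    }
    return true;
}
```

With three chained pipelines (a serial source, a local-exchange pipeline, and a consumer) that all start at num_tasks = 1 and `num_instances = 4`, creating the exchanger before phase 1 would give it 1 partition, while creating it afterwards gives it 4. That is the reason for the deferred creation in this toy model.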
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]