milenkovicm commented on PR #18192:
URL: https://github.com/apache/datafusion/pull/18192#issuecomment-3446916555
One simple question: if the `TaskContext` and `DecodeContext` are reused to
decode `protos` from multiple clients or even from one client sending plan
after plan, is there a chance of id collision?
I believe the current implementation does not prevent the reuse of
`DecodeContext` for two different plans.
```rust
let task_ctx = ctx.task_ctx();
let decode_ctx = DecodeContext::new(&task_ctx);
let result_exec_plan: Arc<dyn ExecutionPlan> = proto
.try_into_physical_plan(&decode_ctx, codec)
.expect("from proto");
```
Two subsequent encoded plans coming to the same `DecodeContext` (one after
the other, even from a same client) may have identical IDs for different
expressions (we were unlucky to have one arc doped and another arc created at
the same location for two different queries coming one after the other. I know
we have to be very, very unlucky, but we have no guarantee it wont).
Hence, we need to consume `DecodeContext` after decoding to prevent its
re-use for the next decode in all public interfaces, preventing its re-use.
I can't really claim but current approach use of arc address may be safe for
referencing expressions within single plan.
If `DecodeContext` can't be re-used, how many times cache will be hit, and
would it really save that much resources to add additional moving part, and
changing public interface again?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]