milenkovicm commented on PR #18192:
URL: https://github.com/apache/datafusion/pull/18192#issuecomment-3446916555

   One simple question: if the `TaskContext` and `DecodeContext` are reused to 
decode `protos` from multiple clients or even from one client sending plan 
after plan, is there a chance of id collision? 
   
   I believe the current implementation does not prevent the reuse of 
`DecodeContext` for two different plans.
   
   ```rust
   let task_ctx = ctx.task_ctx();
       let decode_ctx = DecodeContext::new(&task_ctx);
       let result_exec_plan: Arc<dyn ExecutionPlan> = proto
           .try_into_physical_plan(&decode_ctx, codec)
           .expect("from proto");
   ```
   
   Two subsequent encoded plans coming to the same `DecodeContext`  (one after 
the other, even from a same client) may have identical IDs for different 
expressions (we were unlucky to have one arc doped and another arc created at 
the same location for two different queries coming one after the other. I know 
we have to be very, very unlucky, but we have no guarantee it wont). 
   Hence, we need to consume `DecodeContext` after decoding to prevent its 
re-use for the next decode in all public interfaces, preventing its re-use. 
   
   I can't really claim but current approach use of arc address may be safe for 
referencing expressions within single plan.
   
   If `DecodeContext` can't be re-used, how many times cache will be hit, and 
would it really save that much resources to add additional moving part, and 
changing public interface again?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to