potiuk commented on issue #37131:
URL: https://github.com/apache/airflow/issues/37131#issuecomment-1947843819

   It's just another way to achieve isolation (heavy-handed, so maybe the 
side-effects will be too big and it's a bad idea). 
   
   Basically, the reason we are doing a deep copy here is that we want to have 
objects in memory that we can modify while keeping the original object intact. 
That is what we really achieve with deepcopy. And apparently the objects we 
deepcopy are so complex that it takes a couple of seconds to do (which I find 
quite strange on its own, but I trust the original diagnosis - it would likely 
have to be many hundreds of objects in a rather complex structure to get a few 
seconds here). But yeah, those objects are likely complex.
   
   So if we are talking about out-of-the-box thinking and saving the cost of 
deepcopying - forking the process does the same thing. It gives you a separate 
process with the same object in memory, which is effectively a copy, and the 
memory segments holding the object on the heap get duplicated only the moment 
you write to them (which should be much faster than copying a complex object 
structure in Python, because the affected memory pages are copied using 
low-level processor instructions). The effect is the same - you have a copy of 
the object that you can modify without affecting the original object.
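   The copy-on-write behavior can be sketched like this (POSIX-only, since `os.fork` is not available on Windows; the object here is a hypothetical stand-in):

```python
import os

# Parent builds the expensive-to-copy object once.
state = {"retries": 3}

pid = os.fork()
if pid == 0:
    # Child: shares the parent's memory pages until it writes to them;
    # only then are the touched pages duplicated by the OS.
    state["retries"] = 0
    os._exit(0)

# Parent: wait for the child, then observe its own copy is untouched.
os.waitpid(pid, 0)
assert state["retries"] == 3
```

   One caveat: CPython's reference counting writes to every object's header, so in practice more pages get duplicated than a pure read-only workload would suggest.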
   
   So my proposal is just one solution to "let's get an isolated copy of an 
object that takes a long time to copy".
   
   It's very remote and out-of-the-box and can have other side effects - but 
maybe it's something we are looking for.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
