potiuk commented on issue #37131: URL: https://github.com/apache/airflow/issues/37131#issuecomment-1947843819
It's just another way to achieve isolation (heavy-handed, so maybe the side effects will be too big and it's a bad idea). Basically, the reason we do a deep copy here is that we want objects in memory that we can modify while keeping the original object intact. That is what deepcopy really gives us. And apparently the objects we deepcopy are so complex that copying takes a couple of seconds (which I find quite strange on its own, but I trust the original diagnosis: it would likely take many hundreds of objects in a rather complex structure to get a few seconds here). But yeah, those objects are likely complex.

So if we are talking about out-of-the-box thinking and saving the cost of deepcopying: forking a process does the same thing. It gives you a separate process with the same object in memory, which is effectively a copy, and the memory pages holding the object on the heap get duplicated only at the moment you write to them (which should be much faster than copying a complex object structure in Python, since the affected pages are copied with low-level processor instructions). The effect is the same: you get a copy of the object that you can modify without affecting the original.

So my proposal is just one of the solutions to "let's get an isolated copy of an object that is expensive to copy". Very remote and out-of-the-box, and it can have other side effects, but maybe it's something we are looking for.
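To illustrate the idea, here is a minimal sketch (the large dict here is a hypothetical stand-in for whatever object we currently deepcopy): after `os.fork()`, the child sees a copy-on-write snapshot of the parent's heap, so it can freely mutate the object while the parent's copy stays intact, without ever calling `copy.deepcopy`.

```python
import os

# Hypothetical stand-in for the large, expensive-to-deepcopy object.
big_object = {"tasks": [{"id": i, "state": "queued"} for i in range(100_000)]}

read_fd, write_fd = os.pipe()
pid = os.fork()

if pid == 0:
    # Child process: its view of big_object is a copy-on-write snapshot.
    # Mutating it duplicates only the touched memory pages.
    os.close(read_fd)
    for task in big_object["tasks"]:
        task["state"] = "running"
    with os.fdopen(write_fd, "w") as w:
        w.write(big_object["tasks"][0]["state"])
    os._exit(0)  # exit the child without running atexit handlers

# Parent process: collect the child's result, then confirm the
# original object was never modified on this side of the fork.
os.close(write_fd)
with os.fdopen(read_fd) as r:
    child_state = r.read()
os.waitpid(pid, 0)

print(child_state)                      # state as seen inside the child
print(big_object["tasks"][0]["state"])  # parent's copy, untouched
```

Note this is Unix-only (`os.fork` is unavailable on Windows), and results have to be shipped back explicitly (here via a pipe), which is exactly the kind of side effect that might make this a bad fit in practice.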