potiuk commented on issue #41641:
URL: https://github.com/apache/airflow/issues/41641#issuecomment-2441075120

   :eyes:  :eyes:  `rust` project :) ... 
   
   Me :heart: it (but I doubt we want to invest in it, as it might be 
difficult to maintain unless we find quite a few committers who are 
proficient enough in Rust to at least be able to review the code). But it's 
tempting, I must admit.
   
   But to be honest - while I'd love to finally get a serious Rust project, 
I think it's not worth it. We are talking about a one-time migration: even for 
10,000 DAGs it will take single minutes at most, and with Rust we could maybe 
get that to under one minute - so not a big gain for a lot of pain :) . Or at 
least this is what my intuition tells me.
   
   I think parallelism will do the job nicely. My intuition tells me (but this 
is just intuition and an understanding of the limits and speed of certain 
operations) that we will get from multiple tens of minutes down to single 
minutes when we allow the migration to run in parallel on multiple 
processors - even with Python and libcst. This task is really suitable for 
such parallelisation because each file is a complete, independent task that 
can be run in complete isolation from all other tasks - so spawning multiple 
parallel interpreters, ideally forking them right after all the imports and 
common code are loaded so that they share memory for those - this **should** 
do the job nicely (at least intuitively).
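
   To make the per-file parallelism idea concrete, here is a minimal sketch 
using only the standard library. It is an illustration under assumptions, not 
a proposed implementation: `migrate_source`, `migrate_file` and `migrate_tree` 
are hypothetical names, and the string replacement is only a stand-in for a 
real libcst-based rewrite. The `fork` start method is only available on 
POSIX-like systems, so the sketch falls back to the platform default 
elsewhere.

```python
import multiprocessing
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path
from typing import Optional


def migrate_source(source: str) -> str:
    """Hypothetical stand-in for the real rewrite (e.g. a libcst transformer)."""
    return source.replace("old_name", "new_name")


def migrate_file(path: str) -> bool:
    """Rewrite one file in place; return True if its content changed."""
    file = Path(path)
    original = file.read_text(encoding="utf-8")
    updated = migrate_source(original)
    if updated != original:
        file.write_text(updated, encoding="utf-8")
        return True
    return False


def migrate_tree(root: str, workers: Optional[int] = None) -> int:
    """Migrate every .py file under root in parallel; return how many changed."""
    paths = sorted(str(p) for p in Path(root).rglob("*.py"))
    # Prefer "fork" so workers inherit modules that are already imported,
    # sharing that memory copy-on-write instead of re-importing per worker.
    methods = multiprocessing.get_all_start_methods()
    context = multiprocessing.get_context("fork" if "fork" in methods else None)
    with ProcessPoolExecutor(max_workers=workers, mp_context=context) as pool:
        # Each file is an independent task, so a plain map is enough.
        return sum(pool.map(migrate_file, paths, chunksize=16))
```

   Because no state is shared between files, the only tuning knobs in this 
sketch are the number of workers and the chunk size.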
   
   Using Rust for that might be classic premature optimisation - we likely 
don't need it :). But it would be worth making some calculations and getting 
some "numbers" for a big installation - i.e. how many DAGs of what size are 
out there, and how long it takes to parse them all with libcst and write them 
back (even unmodified, or with a simple modification). I presume that parsing 
and writing back will be the bulk of the job - and modifications will add very 
little overhead, as they will mostly operate on in-memory data structures.
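
   One way to get such "numbers" could look like the sketch below. It times a 
parse-and-write-back pass over a directory of DAG files. The names 
`time_roundtrip` and `libcst_roundtrip` are hypothetical; the libcst call 
(`libcst.parse_module(source).code`) is imported lazily so that the timing 
helper itself needs only the standard library. It rewrites files in place, so 
it should be pointed at a copy of the DAG folder.

```python
import time
from pathlib import Path
from typing import Callable, Dict


def libcst_roundtrip(source: str) -> str:
    """Parse with libcst and regenerate the code without modifying it."""
    import libcst  # third-party; imported lazily so the sketch loads without it

    return libcst.parse_module(source).code


def time_roundtrip(root: str, roundtrip: Callable[[str], str]) -> Dict[str, float]:
    """Read, round-trip and write back every .py file under root, timing it."""
    # Collect paths first so the walk is not affected by the writes.
    paths = list(Path(root).rglob("*.py"))
    total_bytes = 0
    start = time.perf_counter()
    for path in paths:
        source = path.read_text(encoding="utf-8")
        path.write_text(roundtrip(source), encoding="utf-8")
        total_bytes += len(source.encode("utf-8"))
    elapsed = time.perf_counter() - start
    return {"files": len(paths), "bytes": total_bytes, "seconds": elapsed}
```

   Calling `time_roundtrip("/path/to/dags_copy", libcst_roundtrip)` (the path 
is a placeholder) on a representative installation would give a 
single-process baseline to compare against a parallel run.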
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
