leoitcode opened a new issue, #30061:
URL: https://github.com/apache/airflow/issues/30061

   ### Apache Airflow version
   
   2.5.1
   
   ### What happened
   
   I wasn't able to find a way to run a CUDA application using the torch 
libraries in Airflow, due to the way Gunicorn manages parent and child 
processes with fork instead of spawn.
   I'd like to know if there is a way to run an application that uses the GPU 
inside an Apache Airflow DAG.
   
   What I've tried:
   - Setting the `execute_tasks_new_python_interpreter` configuration option to True
   - Reducing the number of workers in the Gunicorn settings
   - Setting the torch.multiprocessing start method to spawn:
     `mp.set_start_method('spawn', force=True)`
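
For reference, the third attempt looks roughly like the sketch below. It uses the standard-library `multiprocessing` module, on the assumption that `torch.multiprocessing` (documented as a drop-in replacement) exposes the same `set_start_method` / `get_start_method` calls. As far as I understand, the setting only affects child processes created afterwards through `multiprocessing`; it cannot change how the already-running task process was started, which may be why it had no effect here.

```python
import multiprocessing as mp
import os

# Switch the default start method for processes created from here on.
mp.set_start_method("spawn", force=True)

start_method = mp.get_start_method()
current_pid = os.getpid()

# The setting is applied, but the current process is still the same one
# that was started earlier (possibly via fork); nothing is restarted.
print(f"start method: {start_method}, pid: {current_pid}")
```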
   
   ### What you think should happen instead
   
   _No response_
   
   ### How to reproduce
   
   To reproduce the problem, simply try to load any model onto the CUDA 
device using torch.
   
   ```python
   from datetime import datetime

   import torch
   from airflow import DAG
   from airflow.decorators import task
   from transformers import AutoTokenizer, RobertaModel

   with DAG(
       dag_id='dag_temp',
       description='Dag temp',
       start_date=datetime(2022, 8, 18),
       catchup=False,
   ) as dag:

       @task()
       def load_models():
           # Evaluates to 'cuda' on the affected worker
           device = 'cuda' if torch.cuda.is_available() else 'cpu'

           tokenizer = AutoTokenizer.from_pretrained("roberta-base")
           model = RobertaModel.from_pretrained("roberta-base")
           model.to(device)

       # Register the task in the DAG
       load_models()
   ```
   
   I got the error:
   RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA 
with multiprocessing, you must use the 'spawn' start method
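
Since the error is about CUDA being initialized in a forked process, one direction that may be worth testing is to run the GPU work in a brand-new interpreter launched from inside the task. The sketch below is only an illustration of that idea, not a confirmed fix: `run_in_fresh_interpreter` and `CHILD_CODE` are hypothetical names, and the child here just prints a message where a real script would import torch and move the model to the CUDA device.

```python
import subprocess
import sys

# Hypothetical stand-in for the real GPU script; a real one would import
# torch and load the model onto CUDA inside this fresh process.
CHILD_CODE = "print('child started in a fresh interpreter')"


def run_in_fresh_interpreter(code: str) -> str:
    """Run `code` in a new Python process and return its stripped stdout."""
    completed = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the child exits non-zero
    )
    return completed.stdout.strip()


output = run_in_fresh_interpreter(CHILD_CODE)
print(output)
```

Inside a task, the same helper could be called from the `@task` function, with results passed back through stdout or a file. Whether this is acceptable depends on how much data has to move between the task and the child process.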
   
   ### Operating System
   
   Ubuntu 22.04
   
   ### Versions of Apache Airflow Providers
   
   2.5.1, installed through Poetry.
   
   ### Deployment
   
   Official Apache Airflow Helm Chart
   
   ### Deployment details
   
   _No response_
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
