leoitcode opened a new issue, #30061:
URL: https://github.com/apache/airflow/issues/30061
### Apache Airflow version
2.5.1
### What happened
I wasn't able to find a way to run a CUDA application that uses the torch
libraries in Airflow, because Gunicorn manages its parent and child processes
with fork instead of spawn.
I'd like to know if there is a way to run an application that uses the GPU
inside an Apache Airflow DAG.
What I've tried:
- Setting `execute_tasks_new_python_interpreter` to `True` in the configuration.
- Reducing the number of workers in the Gunicorn settings.
- Setting the `torch.multiprocessing` start method to spawn:
  `mp.set_start_method('spawn', force=True)`
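
For reference, a minimal sketch of the spawn idea from the last item, assuming plain standard-library `multiprocessing` instead of `torch.multiprocessing`. The names `gpu_work` and `run_in_spawned_child` are hypothetical, and the body of `gpu_work` is only a placeholder for the torch code. Note that `set_start_method` only affects child processes created afterwards from the current process; it cannot change how the current process itself was started. Whether this pattern works inside an Airflow task is not verified here.

```python
import multiprocessing as mp


def gpu_work(x):
    # Placeholder for code that would touch CUDA (e.g. model.to("cuda")).
    # It must live at module level so a spawned child can import it.
    return x * 2


def run_in_spawned_child(func, arg):
    """Run func(arg) in a child process started with 'spawn'."""
    # get_context() returns a context object bound to one start method
    # without changing the global default for the whole process.
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=1) as pool:
        return pool.apply(func, (arg,))
```

In a standalone script, `run_in_spawned_child(gpu_work, 21)` would be called from inside an `if __name__ == "__main__":` guard, which the spawn start method requires.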
### What you think should happen instead
_No response_
### How to reproduce
To reproduce the problem, simply try to load any kind of model onto the CUDA
device using torch.
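```python
from datetime import datetime

from airflow import DAG
from airflow.decorators import task

with DAG(
    dag_id='dag_temp',
    description='Dag temp',
    start_date=datetime(2022, 8, 18),
    catchup=False
) as dag:
    from transformers import AutoTokenizer, RobertaModel
    import torch

    @task()
    def load_models():
        # Pick the GPU when available; in this environment device == 'cuda'.
        device = 'cuda' if torch.cuda.is_available() else 'cpu'
        tokenizer = AutoTokenizer.from_pretrained("roberta-base")
        model = RobertaModel.from_pretrained("roberta-base")
        model.to(device)  # moving the model to CUDA is where it fails

    # Instantiate the task so it is registered with the DAG.
    load_models()
```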
I got the error:
RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA
with multiprocessing, you must use the 'spawn' start method
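
As the message says, the failure is tied to CUDA being used in a forked subprocess. One possible workaround, sketched here under assumptions and not confirmed in this issue, is to run the CUDA code in a brand-new Python interpreter started with `subprocess`, so that no CUDA state is inherited from a forked parent. The helper name `run_in_fresh_interpreter` and the `GPU_SNIPPET` payload are hypothetical and only for illustration.

```python
import subprocess
import sys


def run_in_fresh_interpreter(code: str, timeout: float = 600.0) -> str:
    """Run `code` in a new Python process and return its stripped stdout."""
    result = subprocess.run(
        [sys.executable, "-c", code],  # same interpreter, fresh process
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.strip()


# Hypothetical payload: reports which device torch would pick.
# The try/except keeps the snippet usable where torch is not installed.
GPU_SNIPPET = """
try:
    import torch
    print("cuda" if torch.cuda.is_available() else "cpu")
except ImportError:
    print("torch-missing")
"""
```

Inside a task, `run_in_fresh_interpreter(GPU_SNIPPET)` would return the device name as a string. The trade-off is that inputs and results have to cross the process boundary as text or files instead of Python objects.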
### Operating System
Ubuntu 22.04
### Versions of Apache Airflow Providers
2.5.1, installed through Poetry.
### Deployment
Official Apache Airflow Helm Chart
### Deployment details
_No response_
### Anything else
_No response_
### Are you willing to submit PR?
- [X] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)