nanohanno opened a new issue, #24779:
URL: https://github.com/apache/airflow/issues/24779

   ### Apache Airflow version
   
   2.3.0
   
   ### What happened
   
   When the callable of a `PythonVirtualenvOperator` returns an object from a package that is installed in the virtualenv but not on the Airflow host system, the following exception is raised:
   ```
   [2022-06-30, 16:10:12 UTC] {taskinstance.py:1889} ERROR - Task failed with exception
   Traceback (most recent call last):
     File "/home/hanno/.pyenv/versions/3.8.12/envs/datamart-py38/lib/python3.8/site-packages/airflow/decorators/base.py", line 179, in execute
       return_value = super().execute(context)
     File "/home/hanno/.pyenv/versions/3.8.12/envs/datamart-py38/lib/python3.8/site-packages/airflow/operators/python.py", line 423, in execute
       return super().execute(context=serializable_context)
     File "/home/hanno/.pyenv/versions/3.8.12/envs/datamart-py38/lib/python3.8/site-packages/airflow/operators/python.py", line 171, in execute
       return_value = self.execute_callable()
     File "/home/hanno/.pyenv/versions/3.8.12/envs/datamart-py38/lib/python3.8/site-packages/airflow/operators/python.py", line 483, in execute_callable
       return self._read_result(output_filename)
     File "/home/hanno/.pyenv/versions/3.8.12/envs/datamart-py38/lib/python3.8/site-packages/airflow/operators/python.py", line 514, in _read_result
       return self.pickling_library.load(file)
   ModuleNotFoundError: No module named 'name_of_custom_module'
   ```
   As far as I understand, the returned result is unpickled outside the virtualenv in https://github.com/apache/airflow/blob/main/airflow/operators/python.py#L484, which raises the exception because the custom module is not installed outside the virtualenv.
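
The failure mode can be reproduced without Airflow. The sketch below is only an illustration of the suspected cause, not the operator's actual code path: `venv_only_module` and `Payload` are hypothetical names standing in for a package that exists only inside the virtualenv. An object is pickled while the module is importable, and unpickling is then attempted after the module has been made unavailable.

```python
import importlib
import pickle
import sys
import tempfile
from pathlib import Path


def demo_unpickle_without_module() -> str:
    """Pickle an object, then unpickle it where its module cannot be imported."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical module that plays the role of a virtualenv-only package.
        Path(tmp, "venv_only_module.py").write_text("class Payload:\n    pass\n")
        sys.path.insert(0, tmp)
        try:
            module = importlib.import_module("venv_only_module")
            # "Inside the virtualenv": the module is importable, pickling works.
            data = pickle.dumps(module.Payload())
        finally:
            # "Back on the host": the module is no longer importable.
            sys.path.remove(tmp)
            sys.modules.pop("venv_only_module", None)
    importlib.invalidate_caches()
    try:
        pickle.loads(data)
    except ModuleNotFoundError as exc:
        return str(exc)
    return "unpickled without error"


message = demo_unpickle_without_module()
print(message)
```

The pickle stream stores only a reference to the class (module name plus class name), so loading it requires importing that module in the process that unpickles.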
   
   
   ### What you think should happen instead
   
   It should be possible to easily pass pickled objects from one virtualenv task to another when both have the necessary package installed. Alternatively, the documentation should describe the limitations of virtualenv operators in this respect.
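
Until then, a possible stopgap is to return only built-in types from the virtualenv task, since those can be unpickled anywhere. The sketch below is an assumption about how that could look, not an Airflow feature: `producer_task` and `consumer_task` are hypothetical plain functions standing in for the decorated tasks, and a plain dict stands in for the DataFrame (with pandas, `df.to_json()` would produce the string).

```python
import json
import pickle


def producer_task() -> str:
    # In the real task this would be something like df.to_json();
    # a plain dict stands in for the DataFrame here.
    records = {"column_a": [1, 2, 3], "column_b": ["x", "y", "z"]}
    return json.dumps(records)


def consumer_task(payload: str) -> dict:
    # The receiving task rebuilds its own object from the JSON string.
    return json.loads(payload)


# Simulate the hand-off: the return value is pickled and later unpickled
# in a process that does not have the third-party package installed.
transferred = pickle.loads(pickle.dumps(producer_task()))
restored = consumer_task(transferred)
print(restored)
```

Because the transferred value is a `str`, unpickling it needs no third-party import; the cost is an explicit serialization step in both tasks.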
   
   ### How to reproduce
   
   ```
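   from airflow.decorators import task

   # pandas is installed only in the task's virtualenv, not on the Airflow host
   @task.virtualenv(task_id="task0", requirements="pandas")
   def pandas_task():
       import pandas as pd
       df = pd.DataFrame()
       # Returning the DataFrame makes the host unpickle a pandas object
       return df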
   ```
   
   ### Operating System
   
   Ubuntu 20.04
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Deployment
   
   Official Apache Airflow Helm Chart
   
   ### Deployment details
   
   _No response_
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

