stroykova commented on issue #14672:
URL: https://github.com/apache/airflow/issues/14672#issuecomment-943508653


I use Airflow in a data science project with a lot of multiprocessing, and I hit the same SIGTERM issue. Here is a code example showing that using multiprocessing causes the spawned processes to receive SIGTERM:
   
```python
from pebble import ProcessPool
import multiprocessing
import signal


def function(foo, bar=0):
    # Print the worker's PID so each spawned process is visible.
    print(multiprocessing.current_process().pid)


def task_done(future):
    future.result()  # blocks until the result is ready and re-raises worker errors


def process():
    def signal_handler(signum, frame):
        raise ValueError("received SIGTERM signal")

    # Install a SIGTERM handler in the parent process; the spawned
    # workers inherit it and the ValueError is raised in each of them.
    signal.signal(signal.SIGTERM, signal_handler)

    with ProcessPool(max_workers=5, max_tasks=10) as pool:
        for i in range(10):
            future = pool.schedule(function, args=[i], timeout=10)
            future.add_done_callback(task_done)


if __name__ == "__main__":
    process()
```
   
I added a SIGTERM handler and it fires in every spawned process. I do not completely understand why this happens; my guess is that the workers are forked from the parent and inherit its signal handlers, but I have not verified this. Airflow also listens for SIGTERM and stops the task when it receives one, so I am not able to use PythonOperator with multiprocessing. It is quite annoying.
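
For what it's worth, the handler inheritance alone can be reproduced without pebble or Airflow. A minimal sketch, assuming the fork start method (the default on Linux); the function names here are just for illustration:

```python
import multiprocessing
import signal


def child():
    # Under the fork start method the child inherits the parent's signal
    # table, so this prints the handler installed in the parent below.
    print("child SIGTERM handler:", signal.getsignal(signal.SIGTERM))


if __name__ == "__main__":
    def handler(signum, frame):
        raise ValueError("received SIGTERM signal")

    signal.signal(signal.SIGTERM, handler)
    multiprocessing.set_start_method("fork")  # explicit for clarity
    p = multiprocessing.Process(target=child)
    p.start()
    p.join()
```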
   
The workaround is to wrap the Python code that uses multiprocessing in a BashOperator. Hope this helps someone.
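
Roughly like this; a minimal sketch, where the DAG id and script path are made up for illustration:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="multiprocessing_via_bash",  # hypothetical DAG id
    start_date=datetime(2021, 1, 1),
    schedule_interval=None,
) as dag:
    run_pool = BashOperator(
        task_id="run_pool",
        # The script runs in its own process tree, so Airflow's SIGTERM
        # handling in the task runner is not inherited by the pool workers.
        bash_command="python /path/to/pool_script.py",  # hypothetical path
    )
```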
   
It would be great to be able to use PythonOperator for such workloads. I hope this helps with investigating the issue; I think it should be discussed and maybe reopened.

