potiuk commented on issue #12969:
URL: https://github.com/apache/airflow/issues/12969#issuecomment-744085439


   No, #13048 does not seem to help (I tried the S3 fix). It really looks 
like the "fork + multiprocessing" error. When I hard-code `CAN_FORK = False`, 
`close()` does get called, but then the "NoneType" error appears. Applying the 
S3TaskHandler fix from #13048 does not help either. I added some debug code to 
know for sure when the `close()` method is called - and it is not called 
at all when `CAN_FORK` evaluates to true:
   
   ```
       def close(self):
           """Close and upload local log file to remote storage S3."""
           # When application exit, system shuts down all handlers by
           # calling close method. Here we check if logger is already
           # closed to prevent uploading the log to remote storage multiple
           # times when `logging.shutdown` is called.
            with open("/tmp/out.txt", "at") as f:
               f.write("Closing\n")
               f.write(self.log_relative_path + "\n")
               f.write(str(self.upload_on_close) + "\n")
               f.write(self.local_base + "\n")
               f.write(self.remote_base + "\n")
               f.write(self.log_relative_path + "\n")
           if self.closed:
               return
           super().close()
   
           if not self.upload_on_close:
               return
           local_loc = os.path.join(self.local_base, self.log_relative_path)
           remote_loc = os.path.join(self.remote_base, self.log_relative_path)
           if os.path.exists(local_loc):
               # read log and remove old logs to get just the latest additions
               with open(local_loc) as logfile:
                   log = logfile.read()
               self.s3_write(log, remote_loc)
   
   ```
   
   I see this `close()` method being called once from the main scheduler 
process, appending this to the debug file:
   
   ```
   Closing
   
   True
   /root/airflow/logs
   s3://test-amazon-logging/airflowlogs
   ```
   
   It does not add anything to S3 because `/root/airflow/logs` is just a 
folder, not a file.
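
   A side note on why that first entry resolves to the folder: `log_relative_path` is empty in the scheduler process (the blank lines in the output above), and on POSIX `os.path.join` with an empty relative component just returns the base directory. A minimal sketch in plain Python (not Airflow code):

```python
import os

# The debug output above shows an empty log_relative_path in the
# scheduler process. Joining the base with an empty relative component
# yields the base directory itself, so close() resolves the "log file"
# to a folder and finds no file to upload.
local_base = "/root/airflow/logs"
log_relative_path = ""  # empty, as in the first debug entry

local_loc = os.path.join(local_base, log_relative_path)
print(local_loc)  # '/root/airflow/logs/' -- a directory, not a log file
```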
   
   If I hard-code `CAN_FORK = False`, I get another entry there:
   
   ```
   Closing
   example_bash_operator/runme_0/2020-12-13T22:51:17.397960+00:00/1.log
   True
   /root/airflow/logs
   s3://test-amazon-logging/airflowlogs
   example_bash_operator/runme_0/2020-12-13T22:51:17.397960+00:00/1.log
   ```
   
   Which is much closer to what I would expect, but then I get the "'NoneType' 
object is not callable" exception mentioned above.
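
   The fork hypothesis is easy to reproduce in isolation: a forked child that exits via `os._exit()` skips atexit hooks, which is the mechanism the `logging` module uses to run `logging.shutdown()` and call `Handler.close()`. A minimal, self-contained sketch (POSIX only, hypothetical marker file, not Airflow code):

```python
import atexit
import os

# Hypothetical stand-in for logging.shutdown() -> Handler.close():
# a hook registered with atexit, the same mechanism the logging module
# uses to close handlers at interpreter exit.
MARKER = "/tmp/close_called.txt"

def fake_close():
    with open(MARKER, "w") as f:
        f.write("closed\n")

if os.path.exists(MARKER):
    os.remove(MARKER)

pid = os.fork()  # POSIX only
if pid == 0:
    # Child: register the hook, then exit the way a forked task runner
    # does. os._exit() terminates immediately, skipping atexit hooks,
    # so fake_close() -- like Handler.close() -- never runs here.
    atexit.register(fake_close)
    os._exit(0)

os.waitpid(pid, 0)
print(os.path.exists(MARKER))  # False: the "close" hook never fired
```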
   
   
   
   
   
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

