larryzhu2018 commented on a change in pull request #7141: [AIRFLOW-6544] add 
log_id to end-of-file mark and also add an index config for logs
URL: https://github.com/apache/airflow/pull/7141#discussion_r366647600
 
 

 ##########
 File path: airflow/utils/log/es_task_handler.py
 ##########
 @@ -255,7 +256,9 @@ def close(self):
 
         # Mark the end of file using end of log mark,
         # so we know where to stop while auto-tailing.
-        self.handler.stream.write(self.end_of_log_mark)
+        if self.write_stdout:
+            print()
 
 Review comment:
   It is important for us to be able to find the log_id in the log line that 
carries "end_of_log_mark" in the Elasticsearch cluster. I observed that the 
end-of-log mark can end up on the same line as the previous log output, which 
happens when something prints to the console without a trailing newline, and 
in those cases we cannot find the end-of-log mark. I am adding an obnoxious 
newline (print()) to guarantee that the end-of-log mark is a separate log 
record. For any other log line it is actually benign if two log lines get 
combined into one line in Elasticsearch. Only the end-of-log mark absolutely 
needs to be on its own line. This just makes the solution more robust and 
decoupled from the rest of the log lines. I understand this is a fix for 
reliability and it is probably not very clean.
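
To illustrate the failure mode described above, here is a minimal sketch, not the handler code itself. It assumes a line-based log shipper that turns each non-empty line of stdout into one record; `END_OF_LOG_MARK`, `close_stream`, `records`, and `simulate` are hypothetical names used only for this example.

```python
import io

# Hypothetical marker value; the real one comes from the Airflow configuration.
END_OF_LOG_MARK = "end_of_log"


def close_stream(stream, guard_newline):
    """Write the end-of-log mark, optionally preceded by a bare newline."""
    if guard_newline:
        stream.write("\n")  # same effect as print() when the stream is stdout
    stream.write(END_OF_LOG_MARK + "\n")


def records(stream):
    """Split captured output into records, assuming a line-based shipper
    that drops empty lines."""
    return [line for line in stream.getvalue().split("\n") if line]


def simulate(guard_newline):
    """Emit a stray console print with no trailing newline, then close."""
    stream = io.StringIO()
    stream.write("task output without trailing newline")
    close_stream(stream, guard_newline)
    return records(stream)
```

Without the guard, the mark is glued onto the previous output and never appears as its own record; with the guard, it always does, at the cost of an occasional empty line.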

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
