jscheffl commented on code in PR #43954:
URL: https://github.com/apache/airflow/pull/43954#discussion_r1840296255


##########
providers/src/airflow/providers/edge/cli/edge_command.py:
##########
@@ -248,19 +248,23 @@ def check_running_jobs(self) -> None:
                     logger.error("Job failed: %s", job.edge_job)
                     EdgeJob.set_state(job.edge_job.key, TaskInstanceState.FAILED)
             if job.logfile.exists() and job.logfile.stat().st_size > job.logsize:
-                with job.logfile.open("r") as logfile:
+                with job.logfile.open("rb") as logfile:
                     push_log_chunk_size = conf.getint("edge", "push_log_chunk_size")
                     logfile.seek(job.logsize, os.SEEK_SET)
+                    read_data = logfile.read()
+                    job.logsize += len(read_data)
+                    log_data = read_data.decode("utf-8", "backslashreplace")

Review Comment:
   Are you sure the logs are written as backslash-escaped bytes and need this 
additional decoding? I doubt it:
   ```suggestion
                       log_data = read_data.decode("utf-8")
   ```
   Otherwise, do we know which encoding the worker uses to write the logs? Is 
UTF-8 hard-coded, or is it currently UTF-8 "by accident" because that is the 
system default? If it is the system default, we should decode with the system 
default as well:
   ```suggestion
                       log_data = read_data.decode()
   ```
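
For reference while weighing the two suggestions, here is a minimal sketch of how these `decode` calls behave in Python 3. The sample bytes are made up for illustration and are not taken from a real worker log. As far as I know, `"backslashreplace"` is an error handler rather than an encoding, and `bytes.decode()` without arguments defaults to UTF-8 rather than the locale encoding. `locale.getpreferredencoding(False)` is one way to look up the system default, assuming that is what the worker actually uses when writing.

```python
import locale

# Hypothetical log chunk: valid UTF-8 text followed by one invalid byte (0xff).
read_data = "Job done: ✓".encode("utf-8") + b"\xff"

# Strict decoding (the default error handler) raises on the invalid byte.
try:
    read_data.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# With the "backslashreplace" error handler, valid bytes decode normally and
# only the undecodable byte is replaced by a literal backslash escape.
log_data = read_data.decode("utf-8", "backslashreplace")

# bytes.decode() with no arguments uses UTF-8, independent of the locale.
default_decoded = "é".encode("utf-8").decode()

# The locale's preferred encoding has to be requested explicitly.
system_encoding = locale.getpreferredencoding(False)
locale_decoded = b"plain ascii".decode(system_encoding, "backslashreplace")
```

If this holds, the strict variant can raise `UnicodeDecodeError` when a chunk contains invalid bytes or ends in the middle of a multi-byte character, so the choice of error handler is a separate question from the choice of encoding.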
   


