jonstacks commented on issue #10292:
URL: https://github.com/apache/airflow/issues/10292#issuecomment-683378076


   I'm also seeing this in our EKS clusters(version 1.15.11) with airflow 
1.10.11 and 1.10.12 using the Kubernetes Executor. Everything else seems to be 
working as expected with the exception of logs.
   
   I think I've traced it back to how the hostname gets set in the DB. I think 
this call to get_hostname is the issue:
   
   
https://github.com/apache/airflow/blob/4e3799fec4c23d0f43603a0489c5a6158aeba035/airflow/utils/net.py#L31-L36
   
   If you exec into a pod with a long pod name(> 63 chars) and  and run `echo 
-n $(hostname) | wc -c` you should get the truncated name you are seeing in the 
database(length 63 characters). At least that is happening for me. I think it 
has to do with the maximum length of a label in a fqdn being 63 characters.
   
   It looks like airflow has an environment variable for this callable and we 
could do something like this: https://stackoverflow.com/a/62905570. If there is 
an easy way to expose the log port on the executor pods, I think we could 
easily override this with the environment variable to set it to the IP. I 
didn't see anything in the documentation for it, except maybe a pod template 
file, but I think that overrides the whole POD.
   
   It looks like it shouldn't be a problem from the side that looks at the logs:
   
   
https://github.com/apache/airflow/blob/4e3799fec4c23d0f43603a0489c5a6158aeba035/airflow/utils/log/file_task_handler.py#L109-L123
   
   I am new to managing airflow for our company, but I'm wondering if it would 
make sense to, by default, inject a `POD_NAME` environment variable using the 
downward API by default into the container and if that environment variable is 
set, return that for the hostname, so it gets set in the DB. Hopefully it would 
make the default configuration handle this case where the pod name is really 
long.
   
   Something like:
   ```yaml
   - name: MY_POD_NAME
     valueFrom:
       fieldRef:
         fieldPath: metadata.name
   ```
   
   We are currently trying to get this working so that we can use the 
KubernetesExecutor in production instead of the CeleryExecutor. If anyone knows 
a good way around this, I would be great full to learn the best way to handle 
this.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to