nicknezis commented on issue #3542:
URL: 
https://github.com/apache/incubator-heron/issues/3542#issuecomment-650497830


   I've discussed this with @windhamwong and have come up with a proposed 
approach to handle Heron topologies in Kubernetes. We found that it does work, 
but there are some edge cases that can cause the topology StatefulSet to fail.
   
   1. TMaster looks for `/.dockerenv` to determine whether it is running in a 
container. I have found situations in which the pod does not have this file 
(e.g. Kind and K3s). [TMaster 
code](https://github.com/apache/incubator-heron/blob/cc815d85305dc0b665a2ccb42113cf7a49b1eb0a/heron/executor/src/python/heron_executor.py#L232)
   2. If TMaster does find `/.dockerenv`, it will try to use the `HOST` 
environment variable. I have found some cases in which the pod does not 
have this set (e.g. Kind).
   3. If both of these work, then the TMaster and Stmgr processes will use the 
pod's IP address. If either fails, then the `socket.gethostname()` call will 
return the pod name, which is not registered in the Kubernetes cluster DNS.
   4. To enable the use of the hostname, we need to have a Headless Service 
registered.
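
The lookup described in points 1-3 can be approximated by the sketch below. It is not the actual `heron_executor.py` code: the function name `resolve_host` and its parameters are hypothetical and exist only to make the fallback order explicit.

```python
import os
import socket


def resolve_host(env=None, dockerenv_path="/.dockerenv"):
    """Hypothetical helper that mirrors the behaviour described above."""
    env = os.environ if env is None else env

    # Point 1: the marker file is used as the "am I in a container?" test.
    in_container = os.path.exists(dockerenv_path)

    # Point 2: HOST is only consulted when the marker file was found.
    host = env.get("HOST")

    if in_container and host:
        # Point 3, happy path: both checks passed, so HOST (the pod IP) is used.
        return host

    # Point 3, fallback: either check failed, so the hostname (the pod name
    # in Kubernetes) is returned, and that name may not resolve via DNS.
    return socket.gethostname()
```

On Kind or K3s, where the marker file or `HOST` may be missing, this falls through to the last line, which is the failure case described above.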
   
   The proposal:
   1. Update the Kubernetes Scheduler code to create a matching Headless 
Service for each topology created.
   2. Update the Kubernetes Scheduler code to add a custom ENV variable on the 
StatefulSet (e.g. `HERON_HOSTNAME`).
   3. Update the TMaster logic that checks for `/.dockerenv` to first check 
for the `HERON_HOSTNAME` variable instead.
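
A minimal sketch of what step 3 of the proposal could look like follows. The function name `resolve_host_proposed` is hypothetical, and it assumes that the existing `/.dockerenv` and `HOST` checks stay in place as a fallback when `HERON_HOSTNAME` is not set; the proposal does not spell that part out.

```python
import os
import socket


def resolve_host_proposed(env=None, dockerenv_path="/.dockerenv"):
    """Hypothetical helper: prefer HERON_HOSTNAME, then the current checks."""
    env = os.environ if env is None else env

    # Proposed first check: a value injected by the scheduler on the
    # StatefulSet wins over everything else.
    heron_hostname = env.get("HERON_HOSTNAME")
    if heron_hostname:
        return heron_hostname

    # Assumed fallback: the existing container detection and HOST lookup.
    host = env.get("HOST")
    if os.path.exists(dockerenv_path) and host:
        return host

    # Last resort: the plain hostname, as today.
    return socket.gethostname()
```

With this ordering, a pod on Kind or K3s that lacks `/.dockerenv` or `HOST` would still get a usable name as long as the scheduler sets `HERON_HOSTNAME` to something the Headless Service makes resolvable.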
   
   If we make these changes, this issue would be resolved.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

