[ https://issues.apache.org/jira/browse/IGNITE-16568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alexandr Shapkin updated IGNITE-16568:
--------------------------------------
    External issue URL: https://stackoverflow.com/questions/71118869/openshift-k8s-issue-with-project-pods-not-joining-same-grid-but-rather-create-m
                Labels: Kubernetes  (was: )

> Kubernetes cluster might split apart on initialization
> ------------------------------------------------------
>
>                 Key: IGNITE-16568
>                 URL: https://issues.apache.org/jira/browse/IGNITE-16568
>             Project: Ignite
>          Issue Type: Bug
>          Components: networking
>            Reporter: Alexandr Shapkin
>            Priority: Major
>              Labels: Kubernetes
>
> The issue mostly concerns Kubernetes/OpenShift deployments but could also
> apply to other scenarios that rely on external services (AWS?).
> Consider the following case: multiple nodes (pods) are started
> simultaneously, and each of them tries to find out whether other nodes are
> available using TcpDiscoveryKubernetesIpFinder, which just returns the set
> of registered IPs. Since there is no delay or retry attempt, every node
> could receive an empty IP list and decide to become the coordinator, i.e.
> the nodes start multiple independent grids.
>
> Proposed changes: extend TcpDiscoveryKubernetesIpFinder with either a
> configurable delay or a retry counter, so that it checks again for a
> non-empty list of available IPs.

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
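
The proposed change (retry the address lookup a bounded number of times, with a delay, before accepting an empty result) could look roughly like the sketch below. It is only an illustration of the idea, not Ignite code: the class RetryingAddressLookup, the method lookupWithRetry, and the parameters maxAttempts and delayMillis are hypothetical names, and the lookup is modelled as a plain Supplier rather than the real IP finder.

```java
import java.net.InetSocketAddress;
import java.util.Collection;
import java.util.Collections;
import java.util.function.Supplier;

/** Hypothetical helper for illustration only; not part of Apache Ignite. */
public class RetryingAddressLookup {
    /**
     * Calls the lookup up to maxAttempts times, sleeping delayMillis between
     * attempts, and returns the first non-empty result. If every attempt
     * comes back empty (or the thread is interrupted), returns an empty list.
     */
    public static Collection<InetSocketAddress> lookupWithRetry(
            Supplier<Collection<InetSocketAddress>> lookup,
            int maxAttempts,
            long delayMillis) {
        if (maxAttempts < 1)
            throw new IllegalArgumentException("maxAttempts must be >= 1");

        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Collection<InetSocketAddress> addrs = lookup.get();

            if (addrs != null && !addrs.isEmpty())
                return addrs;

            // Do not sleep after the final attempt.
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMillis);
                }
                catch (InterruptedException e) {
                    // Preserve the interrupt status and give up early.
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }

        return Collections.emptyList();
    }
}
```

With this shape, a node would only fall back to treating itself as the first node after all attempts have returned nothing. The retry narrows the window in which simultaneously started nodes see an empty list; it does not by itself guarantee that they end up in a single grid.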