[
https://issues.apache.org/jira/browse/CASSANDRA-15823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17200199#comment-17200199
]
Roman Chernobelskiy edited comment on CASSANDRA-15823 at 9/22/20, 4:38 PM:
---------------------------------------------------------------------------
One way to address this in kubernetes without Cassandra changes is with a
sidecar that encodes the pod names into virtual IP addresses, thereby giving
each node a stable IP, and then looks up the real IP based on the encoded
hostname on outgoing connections.
was (Author: rchernobelskiy):
One way to address this in kubernetes is with a sidecar that encodes the pod
names into virtual IP addresses, thereby giving each node a stable ip and then
looking up the ip based on the encoded hostname on outgoing connections.
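The sidecar idea above can be sketched roughly as follows. This is an illustrative assumption, not the commenter's actual implementation: the class name `VirtualIpMapper`, the `cassandra-N` StatefulSet naming, and the `127.1.x.y` virtual range are all hypothetical choices made for the example.

```java
// Hypothetical sketch of the sidecar's mapping: derive a stable virtual IP
// from a StatefulSet pod name such as "cassandra-0", "cassandra-1", ..., and
// invert it on outgoing connections so the pod's current real IP can be
// looked up via DNS. The 127.1.x.y range is an illustrative assumption.
public final class VirtualIpMapper {
    // "cassandra-17" -> "127.1.0.18" (ordinal + 1, so the last octet is never 0)
    public static String toVirtualIp(String podName) {
        int ordinal = Integer.parseInt(podName.substring(podName.lastIndexOf('-') + 1));
        int host = ordinal + 1;
        return "127.1." + (host / 256) + "." + (host % 256);
    }

    // "127.1.0.18" -> "cassandra-17": invert the encoding so the sidecar can
    // resolve the pod's current real IP by hostname on outgoing connections.
    public static String toPodName(String virtualIp, String podPrefix) {
        String[] octets = virtualIp.split("\\.");
        int host = Integer.parseInt(octets[2]) * 256 + Integer.parseInt(octets[3]);
        return podPrefix + "-" + (host - 1);
    }
}
```

Cassandra itself only ever sees the stable virtual addresses; the sidecar translates them to whatever IP the pod holds at the moment.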
> Support for networking via identity instead of IP
> -------------------------------------------------
>
> Key: CASSANDRA-15823
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15823
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Christopher Bradford
> Priority: Normal
> Labels: kubernetes
> Attachments: consul-mesh-gateways.png,
> istio-multicluster-with-gateways.svg, linkerd-service-mirroring.svg
>
>
> TL;DR: Instead of mapping host ids to IPs, use hostnames. This allows
> resolution to different IP addresses per DC that may then be forwarded to
> nodes on remote networks without requiring node to node IP connectivity for
> cross-dc links.
>
> This approach should not affect existing deployments as those could continue
> to use IPs as the hostname and skip resolution.
> ----
> With orchestration platforms like Kubernetes and the usage of ephemeral
> containers in environments today we should consider some changes to how we
> handle the tracking of nodes and their network location. Currently we
> maintain a mapping between host ids and IP addresses.
>
> With traditional infrastructure, if a node goes down, it usually comes back
> up with the same IP. In some environments this contract may be explicit, with
> virtual IPs that can move between hosts. In newer deployments, like on
> Kubernetes, this contract does not hold. Pods (analogous to nodes) are
> assigned an IP address at start time. Should the pod be restarted or
> scheduled on a different host, there is no guarantee it would have the same
> IP. Cassandra is protected here, as we already have logic in place to update
> peers when a node comes up with the same host id but a different IP address.
>
> There are ways to get Kubernetes to assign a specific IP per Pod. Most
> recommendations involve the use of a service per pod. Communication with the
> fixed service IP would automatically forward to the associated pod,
> regardless of address. We _could_ use this approach, but it would needlessly
> create a number of extra resources in our k8s cluster to get around the
> problem. To be fair, this doesn't seem like much of a problem given the
> aforementioned mitigations built into C*.
>
> So what is the _actual_ problem? *Cross-region, cross-cloud,
> hybrid-deployment connectivity between pods is a pain.* This can be solved
> with significant investment by those who want to deploy these types of
> topologies. You can certainly configure connectivity between clouds over
> dedicated connections or VPN tunnels, and with a big chunk of time ensure
> that pod-to-pod connectivity just works even if those pods are managed by
> separate control planes, but that requires time and talent. There are a
> number of edge cases to support, arising from the ever so slight, but very
> important, differences between cloud vendor networks.
>
> Recently there have been a number of innovations that aid in the deployment
> and operation of these types of applications on Kubernetes. Service meshes
> support distributed microservices running across multiple k8s cluster control
> planes in disparate networks. Instead of connecting directly to the IP
> addresses of remote services, they use a hostname. With this approach,
> hostname traffic may be routed to a proxy that sends traffic over the WAN
> (sometimes with mTLS) to another proxy pod in the remote cluster, which then
> forwards the data along to the correct pod in that network. (See attached
> diagrams.)
>
> Which brings us to the point of this ticket. Instead of mapping host ids to
> IPs, use hostnames (and update the underlying address periodically instead of
> caching indefinitely). This allows resolution to different IP addresses per
> DC (k8s cluster) that may then be forwarded to nodes (pods) on remote
> networks (k8s clusters) without requiring node to node (pod to pod) IP
> connectivity between them. Traditional deployments can still function as
> they do today (even if operators opt to keep using IPs as identifiers
> instead of hostnames), while proxy approaches like those we see in service
> meshes become possible.
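> The "update the underlying address periodically instead of caching
> indefinitely" behavior could look roughly like the following. This is a
> minimal sketch under stated assumptions: the class name `PeerResolver`, the
> 30-second TTL, and the cache shape are illustrative, not C* internals.
>
```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: map a peer's stable identity (hostname) to its current IP,
// re-resolving once the cached answer is older than TTL_MILLIS so a pod
// rescheduled with a new IP is picked up without a node restart.
public final class PeerResolver {
    private static final long TTL_MILLIS = 30_000;

    private record Entry(InetAddress address, long resolvedAt) {}
    private final Map<String, Entry> cache = new ConcurrentHashMap<>();

    public InetAddress resolve(String hostname) throws UnknownHostException {
        long now = System.currentTimeMillis();
        Entry e = cache.get(hostname);
        if (e == null || now - e.resolvedAt() > TTL_MILLIS) {
            // Stale or missing: ask DNS again rather than trusting an old IP.
            e = new Entry(InetAddress.getByName(hostname), now);
            cache.put(hostname, e);
        }
        return e.address();
    }
}
```
> In each DC, the same hostname would resolve to a locally routable address
> (possibly a mesh proxy), which is exactly what makes the cross-cluster
> forwarding transparent to the node.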
>
> _Notes_
> C* already has the concept of broadcast addresses vs those which are bound on
> the node. This approach _could_ be leveraged to provide the behavior we're
> looking for, but then the broadcast values would need to be pre-computed
> _*and match*_ across all k8s control planes. By using hostnames the
> underlying IP address does not matter and will most likely be different in
> each cluster.
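> For reference, the broadcast mechanism mentioned above is configured in
> cassandra.yaml roughly as below (addresses are illustrative). This is what
> would have to be pre-computed and match across all k8s control planes:
>
```yaml
# cassandra.yaml (illustrative values): the address the node binds locally
# versus the address it advertises to peers. For the pre-computed approach
# described above, broadcast_address must be known up front and agree
# across every k8s control plane.
listen_address: 10.0.1.15        # pod-local address, bound on the node
broadcast_address: 203.0.113.20  # externally routable address advertised to peers
```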
>
> I recognize the title may be a bit misleading, as we would obviously still
> communicate over TCP/IP, but it concisely conveys the point.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)