[ 
https://issues.apache.org/jira/browse/CASSANDRA-15823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17112584#comment-17112584
 ] 

Jeremiah Jordan edited comment on CASSANDRA-15823 at 5/20/20, 8:24 PM:
-----------------------------------------------------------------------

bq. Cassandra is protected here as we already have logic in place to update 
peers when we come up with the same host id, but a different IP address.
bq. This definitely isn’t true / strictly safe. In fact it’s trivial to violate 
consistency / lose data by swapping the IP of two Pods/instances on the same 
host.

Right.  We are OK right now as long as you don't actually "swap" IPs with 
another node.  That is: Node A goes down and loses its IP, Node B goes down 
and loses its IP, and Node A comes back up with the IP Node B previously had.  
"Bad things" will happen in this case, as there are a bunch of race conditions 
around the change in token ownership that just occurred.

If a node comes up with a completely brand new IP that was never part of the 
cluster before, then we do not have any issues that I know of.  Do you know of 
any problems that can happen in that case, [~jjirsa]?
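
One way to picture the difference between the two cases above is a toy model 
in which a coordinator still holds a stale "host id -> IP" view while the 
nodes restart.  This is only a sketch under that assumption: the function 
deliver, the host ids and the addresses are all invented for illustration, 
and none of it is Cassandra's gossip or ring code.

```python
# Hypothetical, simplified model for illustration; this is not Cassandra's
# gossip or ring code, and all names and addresses are invented.

def deliver(view, network, token_owner):
    """Send to the IP recorded for token_owner and return the host id that is
    actually listening on that IP, or None if nothing is listening there."""
    ip = view[token_owner]
    return network.get(ip)

# A coordinator's cached view: host id owning a token range -> last known IP.
stale_view = {"host-A": "10.0.0.1", "host-B": "10.0.0.2"}

# Which host id is really listening on each IP before any restart.
before = {"10.0.0.1": "host-A", "10.0.0.2": "host-B"}

# Reuse case: both nodes restart, A comes back on B's old IP, B gets a new one.
reused = {"10.0.0.2": "host-A", "10.0.0.3": "host-B"}

# Fresh-IP case: A comes back on an address the cluster has never seen.
fresh = {"10.0.0.9": "host-A", "10.0.0.2": "host-B"}

print(deliver(stale_view, before, "host-B"))  # host-B (correct)
print(deliver(stale_view, reused, "host-B"))  # host-A (reaches the wrong node)
print(deliver(stale_view, fresh, "host-A"))   # None (fails, nothing misrouted)
```

In the reuse case a message meant for B's ranges silently lands on A until 
the stale view is corrected; in the fresh-IP case the stale entry simply 
points at nothing.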

bq. We really need everything to be based on UUIDs, not ip or port or host 
name. And we really really really shouldn’t assume that dns is universally 
available or correct (because that’s just not always true, even in 2020).

While I agree it would be best to have all membership based on UUIDs, I think 
we need to allow people to have hostnames be the contact point, and have those 
re-resolve on every "connect".  While I agree that "IPs are best, DNS is the 
devil" (I have seen bad DNS take down clusters), there are many systems being 
created right now where hostnames are the invariant, not IPs, and we need 
Cassandra to be able to play in those environments.
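
As a rough illustration of "re-resolve on every connect", the sketch below 
looks the hostname up again each time a connection is opened instead of 
remembering an earlier answer.  It uses Python's standard socket module; the 
function names resolve_now and connect_by_name and the injectable resolver 
parameter are assumptions made for this example, not anything from Cassandra.

```python
import socket

def resolve_now(hostname, port, resolver=socket.getaddrinfo):
    """Resolve hostname afresh and return a list of (ip, port) candidates."""
    infos = resolver(hostname, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr starts with (ip, port) for both IPv4 and IPv6.
    return [(info[4][0], info[4][1]) for info in infos]

def connect_by_name(hostname, port, resolver=socket.getaddrinfo, timeout=5.0):
    """Open a TCP connection, resolving the name on every call (no caching)."""
    last_error = None
    for ip, resolved_port in resolve_now(hostname, port, resolver):
        try:
            return socket.create_connection((ip, resolved_port), timeout=timeout)
        except OSError as exc:
            last_error = exc  # try the next candidate address
    if last_error is None:
        last_error = OSError("no addresses found for " + hostname)
    raise last_error
```

Because nothing is cached here, a peer that restarts behind the same name on 
a different address is picked up on the next connect.  The cost is that every 
connect depends on the resolver answering correctly at that moment.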


was (Author: jjordan):
bq. Cassandra is protected here as we already have logic in place to update 
peers when we come up with the same host id, but a different IP address.
bq. This definitely isn’t true / strictly safe. In fact it’s trivial to violate 
consistency / lose data by swapping the IP of two Pods/instances on the same 
host.

Right.  We are OK right now as long as you don't actually "swap" its with 
another node.  Aka Node A goes down loses its IP, Node B goes down loses its 
IP, Node A comes back up with the IP node B previously had.  "bad things" will 
happen in this case as there are a bunch of race conditions for the change in 
token ownership that just occurred.

If a node comes up with a completely brand new IP that was never part of the 
cluster before, then we do not have any issues that I know of.  Do you know of 
any problems that can happen for that case [~jjirsa] ?

bq. We really need everything to be based on UUIDs, not ip or port or host 
name. And we really really really shouldn’t assume that dns is universally 
available or correct (because that’s just not always true, even in 2020).

While I agree it would be best to have all membership based on UUID's, I think 
we need to allow people to have hostnames be the contact point, and have those 
re-resolve on every "connect".  While I agree "ips are best, dns is the devil", 
 I have seen bad DNS take down clusters, there are many systems being created 
right now where hostnames are the invariant, not ips, and we need Cassandra to 
be able to play in those environments.

> Support for networking via identity instead of IP
> -------------------------------------------------
>
>                 Key: CASSANDRA-15823
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15823
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Christopher Bradford
>            Priority: Normal
>         Attachments: consul-mesh-gateways.png, 
> istio-multicluster-with-gateways.svg, linkerd-service-mirroring.svg
>
>
> TL;DR: Instead of mapping host ids to IPs, use hostnames. This allows 
> resolution to different IP addresses per DC that may then be forwarded to 
> nodes on remote networks without requiring node to node IP connectivity for 
> cross-dc links.
>  
> This approach should not affect existing deployments as those could continue 
> to use IPs as the hostname and skip resolution.
> ----
> With orchestration platforms like Kubernetes and the usage of ephemeral 
> containers in environments today we should consider some changes to how we 
> handle the tracking of nodes and their network location. Currently we 
> maintain a mapping between host ids and IP addresses.
>  
> With traditional infrastructure, if a node goes down, it usually comes back 
> up with the same IP. In some environments this contract may be explicit with 
> virtual IPs that may move between hosts. In newer deployments, like on 
> Kubernetes, this contract is not possible. Pods (analogous to nodes) are 
> assigned an IP address at start time. Should the pod be restarted or 
> scheduled on a different host there is no guarantee we would have the same 
> IP. Cassandra is protected here as we already have logic in place to update 
> peers when we come up with the same host id, but a different IP address.
>  
> There are ways to get Kubernetes to assign a specific IP per Pod. Most 
> recommendations involve the use of a service per pod. Communication with the 
> fixed service IP would automatically forward to the associated pod, 
> regardless of address. We _could_ use this approach, but it seems like this 
> would needlessly create a number of extra resources in our k8s cluster to get 
> around the problem. Which, to be fair, doesn't seem like much of a problem 
> with the aforementioned mitigations built into C*.
>  
> So what is the _actual_ problem? *Cross-region, cross-cloud, 
> hybrid-deployment connectivity between pods is a pain.* This can be solved 
> with significant investment by those who want to deploy these types of 
> topologies. You can definitely configure connectivity between clouds over 
> dedicated connections or VPN tunnels, and spend a big chunk of time ensuring 
> that pod-to-pod connectivity just works even if those pods are managed by 
> separate control planes, but that again requires time and talent. There are 
> a number of edge cases to support because of the ever so slight, but very 
> important, differences between cloud vendor networks.
>  
> Recently there have been a number of innovations that aid in the deployment 
> and operation of these types of applications on Kubernetes. Service meshes 
> support distributed microservices running across multiple k8s cluster control 
> planes in disparate networks. Instead of directly connecting to the IP 
> addresses of remote services, they use a hostname. With this approach, hostname 
> traffic may then be routed to a proxy that sends traffic over the WAN 
> (sometimes with mTLS) to another proxy pod in the remote cluster which then 
> forwards the data along to the correct pod in that network. (See attached 
> diagrams)
>  
> Which brings us to the point of this ticket. Instead of mapping host ids to 
> IPs, use hostnames (and update the underlying address periodically instead of 
> caching indefinitely). This allows resolution to different IP addresses per 
> DC (k8s cluster) that may then be forwarded to nodes (pods) on remote 
> networks (k8s clusters) without requiring node to node (pod to pod) IP 
> connectivity between them. Traditional deployments can still function like 
> they do today (even if operators opt to keep using IPs as identifiers instead 
> of hostnames). This in turn enables a proxy approach like those we see in 
> service meshes.
>  
> _Notes_
> C* already has the concept of broadcast addresses vs those which are bound on 
> the node. This approach _could_ be leveraged to provide the behavior we're 
> looking for, but then the broadcast values would need to be pre-computed 
> _*and match*_ across all k8s control planes. By using hostnames the 
> underlying IP address does not matter and will most likely be different in 
> each cluster.
>  
> I recognize the title may be a bit misleading as we would obviously still 
> communicate over TCP/IP, but it concisely conveys the point.
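
The proposal in the description above (map host ids to hostnames, refresh the 
underlying address periodically, and skip resolution when the "hostname" is 
already an IP) could be sketched roughly as follows.  The class 
PeerAddressBook, its method names and the 30-second refresh interval are all 
hypothetical choices for illustration and do not reflect how Cassandra stores 
peers.

```python
import ipaddress
import socket
import time

def is_ip_literal(value):
    """Return True if value is already an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False

class PeerAddressBook:
    """Hypothetical host id -> hostname map with time-limited address caching."""

    def __init__(self, ttl_seconds=30.0, resolver=socket.gethostbyname,
                 clock=time.monotonic):
        self._ttl = ttl_seconds
        self._resolver = resolver   # default resolver returns IPv4 only
        self._clock = clock
        self._hostnames = {}        # host id -> hostname (or literal IP)
        self._cache = {}            # host id -> (ip, time resolved)

    def register(self, host_id, hostname):
        self._hostnames[host_id] = hostname
        self._cache.pop(host_id, None)  # forget any address for the old name

    def address_of(self, host_id):
        hostname = self._hostnames[host_id]
        if is_ip_literal(hostname):
            return hostname             # existing deployments: no resolution
        now = self._clock()
        cached = self._cache.get(host_id)
        if cached is not None and now - cached[1] < self._ttl:
            return cached[0]            # still fresh, reuse it
        ip = self._resolver(hostname)   # stale or missing, resolve again
        self._cache[host_id] = (ip, now)
        return ip

# Usage with a literal IP, which never touches DNS.
book = PeerAddressBook()
book.register("host-B", "192.168.1.5")
print(book.address_of("host-B"))  # 192.168.1.5
```

Keying everything on the host id means each DC (k8s cluster) can resolve the 
same hostname to a different local address without the map itself changing.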



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

