I think this solution would solve one of the problems that Aiven currently has with node replacement, though TCM will probably help as well.
On Mon, Apr 15, 2024 at 11:47 PM German Eichberger via dev <dev@cassandra.apache.org> wrote:

> Thanks for the proposal. I second Jordan that we need more abstraction in (1), e.g. most cloud providers allow for disk snapshots and starting nodes from a snapshot, which would be a good mechanism if you find yourself there.
>
> German

> ------------------------------
> *From:* Jordan West <jorda...@gmail.com>
> *Sent:* Sunday, April 14, 2024 12:27 PM
> *To:* dev@cassandra.apache.org <dev@cassandra.apache.org>
> *Subject:* [EXTERNAL] Re: [DISCUSS] CEP-40: Data Transfer Using Cassandra Sidecar for Live Migrating Instances
>
> Thanks for proposing this CEP! We have something like this internally, so I have some familiarity with the approach and the challenges. After reading the CEP, a couple of things come to mind:
>
> 1. I would like to see more abstraction of how the files get moved / put in place, with the proposed solution being the default implementation. That would allow others to plug in alternative means of data movement, like pulling down backups from S3, rsync, etc.
>
> 2. I do agree with Jon's last email that the lifecycle / orchestration portion is the more challenging aspect. It would be nice to address that as well, so we don't end up with something like repair, where the building blocks are there but the hard parts are left to the operator. I do, however, see that portion being done in a follow-on CEP to limit the scope of CEP-40 and give it a higher chance of success by incrementally adding these features.
>
> Jordan
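As a rough illustration of the abstraction Jordan asks for in point (1), a pluggable data-movement provider might look something like the sketch below. The names are hypothetical, not an existing Sidecar or CEP-40 API: the Sidecar file-streaming transfer would be the default implementation, with S3 restore, rsync, or provider disk snapshots (per German's note) as alternative implementations.

    // Hypothetical sketch only -- not part of CEP-40 or the Sidecar codebase.
    import java.nio.file.Path;
    import java.util.List;

    public interface DataMovementProvider
    {
        // Human-readable name, e.g. "sidecar-streaming", "s3-restore", "rsync", "disk-snapshot".
        String name();

        // Copy the given SSTable components from the source instance into destDir.
        void transfer(InstanceEndpoint source, List<Path> sstableComponents, Path destDir) throws Exception;

        // Check that what landed in destDir matches the source (e.g. by digest, size, mtime).
        boolean verify(InstanceEndpoint source, Path destDir) throws Exception;
    }

    // Minimal placeholder for how a source instance might be addressed.
    record InstanceEndpoint(String host, int sidecarPort) {}

With an interface like this, a snapshot-based approach becomes just another implementation selected by configuration rather than a separate migration path.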
> On Thu, Apr 11, 2024 at 12:31 Jon Haddad <j...@jonhaddad.com> wrote:
>
> First off, let me apologize for my initial reply; it came off harsher than I had intended.
>
> I know I didn't say it initially, but I like the idea of making it easier to replace a node. I think it's probably not obvious to folks that you can use rsync (with stunnel, or alternatively rclone), and for a lot of teams it's intimidating to do so. Whether it actually is easy to do with rsync is irrelevant. Having tooling that does it right is better than duct-taping things together.
>
> So with that said, if you're looking to get feedback on how to make the CEP more generally useful, I have a couple of thoughts.
>
> > Managing the Cassandra processes like bringing them up or down while migrating the instances.
>
> Maybe I missed this, but I thought we already had support for managing the C* lifecycle with the sidecar? Maybe I'm misremembering. It seems to me that adding the ability to make this entire workflow self-managed would be the biggest win, because having a live migrate *feature* instead of what's essentially a runbook would be far more useful.
>
> > To verify whether the desired file set matches with source, only file path and size is considered at the moment. Strict binary level verification is deferred for later.
>
> Scott already mentioned this is a problem and I agree, we cannot simply rely on file path and size.
>
> TL;DR: I like the intention of the CEP. I think it would be better if it managed the entire lifecycle of the migration, but you might not have an appetite to implement all that.
>
> Jon

> On Thu, Apr 11, 2024 at 10:01 AM Venkata Hari Krishna Nukala <n.v.harikrishna.apa...@gmail.com> wrote:
>
> Thanks Jon & Scott for taking the time to go through this CEP and providing inputs.
>
> I am completely with what Scott had mentioned earlier (I would have added more details into the CEP). Adding a few more points to the same.
>
> Having a solution with Sidecar can make the migration easy without depending on rsync. At least in the cases I have seen, rsync is not enabled by default, and most operators want to run OS/images with as few extra requirements as possible. Installing rsync requires admin privileges, and syncing data is a manual operation. If an API is provided with Sidecar, then tooling can be built around it, reducing the scope for manual errors.
>
> Performance-wise, at least in the cases I have seen, the File Streaming API in Sidecar performs a lot better. To give an idea of the performance, I would like to quote "up to 7 Gbps/instance writes (depending on hardware)" from CEP-28, as this CEP proposes to leverage the same.
>
> For:
>
> > When enabled for LCS, single sstable uplevel will mutate only the level of an SSTable in its stats metadata component, which wouldn't alter the filename and may not alter the length of the stats metadata component. A change to the level of an SSTable on the source via single sstable uplevel may not be caught by a digest based only on filename and length.
>
> In this case the file size may not change, but the last-modified timestamp would change, right? It is addressed in section MIGRATING ONE INSTANCE, point 2.b.ii, which says "If a file is present at the destination but did not match (by size or timestamp) with the source file, then local file is deleted and added to list of files to download." And after the download by the final data copy task, the file should match the source.

> On Thu, Apr 11, 2024 at 7:30 AM C. Scott Andreas <sc...@paradoxica.net> wrote:
>
> Oh, one note on this item:
>
> > The operator can ensure that files in the destination matches with the source. In the first iteration of this feature, an API is introduced to calculate digest for the list of file names and their lengths to identify any mismatches. It does not validate the file contents at the binary level, but, such feature can be added at a later point of time.
>
> When enabled for LCS, single sstable uplevel will mutate only the level of an SSTable in its stats metadata component, which wouldn't alter the filename and may not alter the length of the stats metadata component. A change to the level of an SSTable on the source via single sstable uplevel may not be caught by a digest based only on filename and length.
>
> Including the file's modification timestamp would address this without requiring a deep hash of the data. This would be good to include to ensure SSTables aren't downleveled unexpectedly during migration.
>
> - Scott
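To make the verification discussion concrete, here is a minimal, illustrative sketch (not the actual Sidecar digest API; class and method names are hypothetical) of a manifest digest computed over each file's relative path, length, and last-modified time. Because the last-modified time is included, a single-SSTable uplevel that only rewrites the stats metadata component still changes the digest, without requiring a deep hash of the file contents.

    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.security.MessageDigest;
    import java.security.NoSuchAlgorithmException;
    import java.util.HexFormat;
    import java.util.stream.Stream;

    public final class ManifestDigest
    {
        public static String compute(Path dataDir) throws IOException, NoSuchAlgorithmException
        {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            try (Stream<Path> files = Files.walk(dataDir))
            {
                files.filter(Files::isRegularFile)
                     .sorted() // stable ordering so source and destination digests are comparable
                     .forEach(p -> {
                         try
                         {
                             String entry = dataDir.relativize(p) + "|"
                                            + Files.size(p) + "|"
                                            + Files.getLastModifiedTime(p).toMillis();
                             md.update(entry.getBytes(StandardCharsets.UTF_8));
                         }
                         catch (IOException e)
                         {
                             throw new RuntimeException(e);
                         }
                     });
            }
            return HexFormat.of().formatHex(md.digest());
        }
    }

Comparing this digest between source and destination would catch the LCS downleveling case Scott describes; strict binary verification could still be layered on later, as the CEP notes.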
> On Apr 8, 2024, at 2:15 PM, C. Scott Andreas <sc...@paradoxica.net> wrote:
>
> Hi Jon,
>
> Thanks for taking the time to read and reply to this proposal. I would encourage you to approach it from an attitude of seeking understanding on the part of the first-time CEP author, as this reply casts it off pretty quickly as NIH.
>
> The proposal isn't mine, but I'll offer a few notes on where I see this as valuable:
>
> – It's valuable for Cassandra to have an ecosystem-native mechanism of migrating data between physical/virtual instances outside the standard streaming path. As Hari mentions, the current ecosystem-native approach of executing repairs, decommissions, and bootstraps is time-consuming and cumbersome.
>
> – An ecosystem-native solution is safer than a bunch of bash and rsync. Defining a safe protocol to migrate data between instances via rsync without downtime is surprisingly difficult, and even more so to do safely and repeatedly at scale. Enabling this process to be orchestrated by a control plane mechanizing official endpoints of the database and sidecar, rather than trying to move data around behind its back, is much safer than hoping one has cobbled together the right set of scripts to move data in a way that won't violate strong / transactional consistency guarantees. This complexity is exemplified by the "Migrating One Instance" section of the doc and its state machine diagram, which illustrates an approach to solving that problem.
>
> – An ecosystem-native approach poses fewer security concerns than rsync. mTLS-authenticated endpoints in the sidecar for data movement eliminate the requirement for orchestration to occur via (typically) high-privilege SSH, which often allows for code execution of some form or requires complex efforts to scope the SSH privileges of particular users; and they eliminate the need to manage and secure rsyncd processes on each instance if not going via SSH.
>
> – An ecosystem-native approach is more instrumentable and measurable than rsync. Support for data migration endpoints in the sidecar would allow for metrics reporting, stats collection, and alerting via mature and modern mechanisms rather than monitoring the output of a shell script.
>
> I'll yield to Hari to share more, though today is a public holiday in India.
>
> I do see this CEP as solving an important problem.
>
> Thanks,
>
> – Scott

> On Apr 8, 2024, at 10:23 AM, Jon Haddad <j...@jonhaddad.com> wrote:
>
> This seems like a lot of work to create an rsync alternative. I can't really say I see the point. I noticed your "rejected alternatives" section mentions it with this note:
>
> > - However, it might not be permitted by the administrator or available in various environments such as Kubernetes or virtual instances like EC2. Enabling data transfer through a sidecar facilitates smooth instance migration.
>
> This feels more like NIH than solving a real problem, as what you've listed is a hypothetical, and one that's easily addressed.
>
> Jon

> On Fri, Apr 5, 2024 at 3:47 AM Venkata Hari Krishna Nukala <n.v.harikrishna.apa...@gmail.com> wrote:
>
> Hi all,
>
> I have filed CEP-40 [1] for live migrating Cassandra instances using the Cassandra Sidecar.
>
> When someone needs to move all or a portion of the Cassandra nodes belonging to a cluster to different hosts, the traditional approach of Cassandra node replacement can be time-consuming due to repairs and the bootstrapping of new nodes. Depending on the volume of the storage service load, replacements (repair + bootstrap) may take anywhere from a few hours to days.
>
> I am proposing a Sidecar-based solution to address these challenges. It transfers data from the old host (source) to the new host (destination) and then brings up the Cassandra process at the destination, enabling fast instance migration. This approach helps minimise node downtime, as it relies on the Sidecar for data transfer and avoids repairs and bootstrap.
>
> Looking forward to the discussions.
>
> [1] https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-40%3A+Data+Transfer+Using+Cassandra+Sidecar+for+Live+Migrating+Instances
>
> Thanks!
> Hari
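Finally, to make the lifecycle/orchestration concern raised by Jon and Jordan concrete: one plausible shape for the per-instance flow is sketched below. The CEP's "Migrating One Instance" section and its state machine define the actual behaviour; the names and the exact ordering here are hypothetical.

    // Hypothetical sketch of the per-instance migration lifecycle being discussed;
    // CEP-40's "Migrating One Instance" section defines the real state machine.
    public abstract class LiveMigrationFlow
    {
        public final void migrateOneInstance() throws Exception
        {
            copyData();                  // bulk copy while the source Cassandra still serves traffic
            stopSourceCassandra();       // quiesce the source so no new or changed SSTables appear
            copyData();                  // final delta copy: re-fetch files that differ by size or timestamp
            if (!verifyFileSet())        // e.g. manifest digest over path, length, last-modified time
                throw new IllegalStateException("Destination file set does not match source");
            startDestinationCassandra(); // same identity/tokens, so no bootstrap or repair is needed
        }

        protected abstract void copyData() throws Exception;
        protected abstract void stopSourceCassandra() throws Exception;
        protected abstract boolean verifyFileSet() throws Exception;
        protected abstract void startDestinationCassandra() throws Exception;
    }

An orchestration layer, whether in a follow-on CEP as Jordan suggests or in external tooling, would wrap these steps with things like retries, progress reporting, and rollback.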