I weakly prefer contrib.

On Thu, Aug 22, 2019 at 12:09 PM Marcus Eriksson <marc...@apache.org> wrote:

> Hi, we are about to open source our tooling for comparing two Cassandra
> clusters and want to get some feedback on where to push it. I think the
> options are: (name bike-shedding welcome)
>
> 1. create repos/asf/cassandra-diff.git
> 2. create a generic repos/asf/cassandra-contrib.git where we can add more
> contributed tools in the future
>
> Temporary location: https://github.com/krummas/cassandra-diff
>
> Cassandra-diff is a Spark job that compares the data in two clusters - it
> pages through all partitions and reads all rows for those partitions in
> both clusters to make sure they are identical. Based on the configuration
> variable “reverse_read_probability” the rows are either read forward or in
> reverse order.
>
> Our main use case for cassandra-diff has been to set up two identical
> clusters, transfer a snapshot from the cluster we want to test to these
> clusters and upgrade one side. When that is done we run this tool to make
> sure that 2.1 and 3.0 give the same results. A few examples of the bugs we
> have found using this tool:
>
> * CASSANDRA-14823: Legacy sstables with range tombstones spanning multiple
> index blocks create invalid bound sequences on 3.0+
> * CASSANDRA-14803: Rows that cross index block boundaries can cause
> incomplete reverse reads in some cases
> * CASSANDRA-15178: Skipping illegal legacy cells can break reverse
> iteration of indexed partitions
>
> /Marcus
>

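For anyone skimming the thread, the comparison described in the quoted message can be sketched roughly as below. This is only an illustration, not code from cassandra-diff: the real tool is a Spark job reading from live clusters, whereas here two in-memory dicts stand in for the clusters, and the names diff_clusters and read_partition are hypothetical. Only the "reverse_read_probability" setting comes from the original description.

```python
import random


def read_partition(cluster, key, reverse):
    """Return one partition's rows as (clustering_key, value) pairs.

    `cluster` is a stand-in for a real cluster: a dict mapping a
    partition key to a dict of clustering key -> value.
    """
    return sorted(cluster.get(key, {}).items(), reverse=reverse)


def diff_clusters(source, target, reverse_read_probability=0.5, rng=None):
    """Return the partition keys whose rows differ between the two clusters."""
    rng = rng or random.Random()
    mismatches = []
    # Walk every partition present on either side.
    for key in sorted(set(source) | set(target)):
        # Pick the read direction once per partition and use it on both
        # sides, so the two reads are directly comparable.
        reverse = rng.random() < reverse_read_probability
        if read_partition(source, key, reverse) != read_partition(target, key, reverse):
            mismatches.append(key)
    return mismatches
```

Calling diff_clusters(old_side, upgraded_side) on two such dicts returns the list of partition keys that do not match. In this toy version the read direction cannot change the outcome, since both sides are plain dicts; against real clusters, reverse reads go through a different read path, which is presumably why the tool randomizes the direction.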