Re: dtests to reproduce the schema disagreement

2022-08-18 Thread Alex Petrov
I could not quickly find a test that does specifically this, but off the top of
my head you could induce schema disagreements with dtests in two ways:

1. dtests with verb filters that disable schema mutations
2. executing schema statements with NODE_LOCAL / executeInternal on the node
rather than going through the coordinated path
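
As a rough illustration of the two approaches listed above, here is a toy
Python model, not an actual dtest. The Node and Network classes, the
"SCHEMA_PUSH" label and both helper functions are made up for this sketch and
are not Cassandra or dtest APIs. It only shows that filtering out the schema
message, or applying a change on one node only, leaves the nodes with
different schemas.

```python
class Node:
    """Toy node: its 'schema' is just a set of table names."""
    def __init__(self):
        self.tables = set()


class Network:
    """Toy message bus with a verb filter (hypothetical, for illustration)."""
    def __init__(self, nodes):
        self.nodes = nodes
        self.dropped_verbs = set()

    def send(self, verb, payload, sender):
        if verb in self.dropped_verbs:
            return  # message filtered out, peers never see it
        for node in self.nodes:
            if node is not sender:
                node.tables.add(payload)


def coordinated_create(net, node, table):
    # Apply locally, then push the change to the other nodes.
    node.tables.add(table)
    net.send("SCHEMA_PUSH", table, sender=node)


def local_only_create(node, table):
    # Apply on this node only, bypassing the coordinated path.
    node.tables.add(table)


n1, n2 = Node(), Node()
net = Network([n1, n2])

# Way 1: filter the schema push, so n2 never learns about ks.a.
net.dropped_verbs.add("SCHEMA_PUSH")
coordinated_create(net, n1, "ks.a")

# Way 2: node-local execution, so n1 never learns about ks.b.
local_only_create(n2, "ks.b")

print("schemas agree:", n1.tables == n2.tables)
```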

Hope this helps 

On Fri, Aug 12, 2022, at 8:30 PM, Cheng Wang via dev wrote:
> Thank you, Jeff, now it works for me and I can reproduce a schema 
> disagreement. 
> The testing logic is like:
> On node 1:
> start the node 1
> create table
> shutdown the node 1
> 
> On node 2:
> start the node 2
> create table 
> shutdown the node 2
> 
> Then startup the two nodes and check the schema version Id
> 
> Now it seems that with the deterministic table id, the schema versions on the 
> two nodes were different initially, even though table ids are the same, and 
> they reached an agreement at some point. Is it as expected? 
> 
> I am just wondering, is there any other way to re-sync the schema versions 
> without restarting the node, cause I feel that shutdown/start is quite 
> expensive and  flaky since the schema migration is a background task?
> 
> Thanks
> Cheng
> 
> On Tue, Aug 9, 2022 at 12:12 PM Jeff Jirsa  wrote:
>> Stop node 1 before you start node 2, essentially mocking a full network 
>> partition. 
>> 
>> 
>> 
>> On Tue, Aug 9, 2022 at 11:57 AM Cheng Wang via dev 
>>  wrote:
>>> Thank you, Aleksey, 
>>> Yes, I have tried this approach, the problem is there is a timing window 
>>> that node 1 runs the CREATE TABLE while node 2 is down, and then we bring 
>>> up the node 2 and it may receive the gossip from node 1 at startup, and the 
>>> CREATE TABLE will fail on node 2 since the table already exists?
>>> 
>>> 
>>> 
>>> On Tue, Aug 9, 2022 at 4:48 AM Aleksey Yeshchenko  wrote:
>>>> The absolute easiest way would be to down one of the two nodes first,
>>>> run CREATE TABLE on the live node, shut it down, get the other one up,
>>>> and run the same CREATE TABLE there, then bring up the down node.
>>>>
>>>> > On 9 Aug 2022, at 07:48, Konstantin Osipov via dev wrote:
>>>> >
>>>> > * Cheng Wang via dev  [22/08/09 09:43]:
>>>> >
>>>> >> I am working on improving the schema disagreement issue. I need some
>>>> >> dtests which can reproduce the schema disagreement.  Anyone know if
>>>> >> there are any existing tests for that? Or something similar?
>>>> >
>>>> > cassandra-10250 is a good start.
>>>> >
>>>> > --
>>>> > Konstantin Osipov, Moscow, Russia
>>>>


Re: dtests to reproduce the schema disagreement

2022-08-12 Thread Cheng Wang via dev
Thank you, Jeff, it works for me now and I can reproduce a schema
disagreement.
The testing logic is:
On node 1:
start node 1
create table
shut down node 1

On node 2:
start node 2
create table
shut down node 2

Then start up the two nodes and check the schema version ID.
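
The sequence above can be modelled with a small toy simulation in Python. This
is only a sketch: Node, create_table and the hash-based version() are
inventions for illustration, and a real schema version is computed by
Cassandra itself, not like this. A random uuid4 stands in for the
non-deterministic table id.

```python
import hashlib
import uuid


class Node:
    """Toy node holding a table-name -> table-id map (hypothetical model)."""
    def __init__(self, name):
        self.name = name
        self.up = False
        self.tables = {}

    def version(self):
        # Toy schema version: a digest over the sorted (table, id) pairs.
        h = hashlib.sha256()
        for table, table_id in sorted(self.tables.items()):
            h.update(f"{table}:{table_id}".encode())
        return h.hexdigest()[:12]


def create_table(coordinator, peers, table):
    # The coordinator picks a random id; only peers that are up hear about it.
    assert coordinator.up, "coordinator must be running"
    table_id = uuid.uuid4()
    coordinator.tables[table] = table_id
    for peer in peers:
        if peer.up:
            peer.tables.setdefault(table, table_id)


node1, node2 = Node("node1"), Node("node2")

# Node 1: start, create table, shut down.
node1.up = True
create_table(node1, [node2], "ks.t")
node1.up = False

# Node 2: start, create the same table, shut down.
node2.up = True
create_table(node2, [node1], "ks.t")
node2.up = False

# Start both and compare the toy schema versions.
node1.up = node2.up = True
print(node1.name, node1.version())
print(node2.name, node2.version())
```

Because the two nodes are never up at the same time, neither sees the other's
CREATE TABLE, which is what makes the disagreement reproducible rather than
timing dependent.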

Now it seems that with the deterministic table id, the schema versions on
the two nodes were different initially, even though the table ids are the
same, and they reached agreement at some point. Is that expected?

I am just wondering, is there any other way to re-sync the schema versions
without restarting the node? I feel that shutdown/start is quite
expensive and flaky, since the schema migration is a background task.

Thanks
Cheng

On Tue, Aug 9, 2022 at 12:12 PM Jeff Jirsa  wrote:

> Stop node 1 before you start node 2, essentially mocking a full network
> partition.
>
>
>
> On Tue, Aug 9, 2022 at 11:57 AM Cheng Wang via dev <
> dev@cassandra.apache.org> wrote:
>
>> Thank you, Aleksey,
>> Yes, I have tried this approach, the problem is there is a timing window
>> that node 1 runs the CREATE TABLE while node 2 is down, and then we bring
>> up the node 2 and it may receive the gossip from node 1 at startup, and the
>> CREATE TABLE will fail on node 2 since the table already exists?
>>
>>
>>
>> On Tue, Aug 9, 2022 at 4:48 AM Aleksey Yeshchenko 
>> wrote:
>>
>>> The absolute easiest way would be to down one of the two nodes first,
>>> run CREATE TABLE on the live node, shut it down, get the other one up,
>>> and run the same CREATE TABLE there, then bring up the down node.
>>>
>>> > On 9 Aug 2022, at 07:48, Konstantin Osipov via dev <
>>> > dev@cassandra.apache.org> wrote:
>>> >
>>> > * Cheng Wang via dev  [22/08/09 09:43]:
>>> >
>>> >> I am working on improving the schema disagreement issue. I need some
>>> >> dtests which can reproduce the schema disagreement.  Anyone know if
>>> >> there are any existing tests for that? Or something similar?
>>> >
>>> > cassandra-10250 is a good start.
>>> >
>>> > --
>>> > Konstantin Osipov, Moscow, Russia
>>>
>>>


Re: dtests to reproduce the schema disagreement

2022-08-09 Thread Jeff Jirsa
Stop node 1 before you start node 2, essentially mocking a full network
partition.



On Tue, Aug 9, 2022 at 11:57 AM Cheng Wang via dev 
wrote:

> Thank you, Aleksey,
> Yes, I have tried this approach, the problem is there is a timing window
> that node 1 runs the CREATE TABLE while node 2 is down, and then we bring
> up the node 2 and it may receive the gossip from node 1 at startup, and the
> CREATE TABLE will fail on node 2 since the table already exists?
>
>
>
> On Tue, Aug 9, 2022 at 4:48 AM Aleksey Yeshchenko 
> wrote:
>
>> The absolute easiest way would be to down one of the two nodes first,
>> run CREATE TABLE on the live node, shut it down, get the other one up,
>> and run the same CREATE TABLE there, then bring up the down node.
>>
>> > On 9 Aug 2022, at 07:48, Konstantin Osipov via dev <
>> > dev@cassandra.apache.org> wrote:
>> >
>> > * Cheng Wang via dev  [22/08/09 09:43]:
>> >
>> >> I am working on improving the schema disagreement issue. I need some
>> >> dtests which can reproduce the schema disagreement.  Anyone know if
>> >> there are any existing tests for that? Or something similar?
>> >
>> > cassandra-10250 is a good start.
>> >
>> > --
>> > Konstantin Osipov, Moscow, Russia
>>
>>


Re: dtests to reproduce the schema disagreement

2022-08-09 Thread Cheng Wang via dev
Thank you, Aleksey,
Yes, I have tried this approach. The problem is that there is a timing window:
node 1 runs the CREATE TABLE while node 2 is down, and when we then bring
node 2 up it may receive the gossip from node 1 at startup, so won't the
CREATE TABLE fail on node 2 since the table already exists?



On Tue, Aug 9, 2022 at 4:48 AM Aleksey Yeshchenko  wrote:

> The absolute easiest way would be to down one of the two nodes first,
> run CREATE TABLE on the live node, shut it down, get the other one up,
> and run the same CREATE TABLE there, then bring up the down node.
>
> > On 9 Aug 2022, at 07:48, Konstantin Osipov via dev <
> > dev@cassandra.apache.org> wrote:
> >
> > * Cheng Wang via dev  [22/08/09 09:43]:
> >
> >> I am working on improving the schema disagreement issue. I need some
> >> dtests which can reproduce the schema disagreement.  Anyone know if
> >> there are any existing tests for that? Or something similar?
> >
> > cassandra-10250 is a good start.
> >
> > --
> > Konstantin Osipov, Moscow, Russia
>
>


Re: dtests to reproduce the schema disagreement

2022-08-09 Thread Aleksey Yeshchenko
The absolute easiest way would be to down one of the two nodes first,
run CREATE TABLE on the live node, shut it down, get the other one up,
and run the same CREATE TABLE there, then bring up the down node.

> On 9 Aug 2022, at 07:48, Konstantin Osipov via dev  
> wrote:
> 
> * Cheng Wang via dev  [22/08/09 09:43]:
> 
>> I am working on improving the schema disagreement issue. I need some dtests
>> which can reproduce the schema disagreement.  Anyone know if there are any
>> existing tests for that? Or something similar?
> 
> cassandra-10250 is a good start.
> 
> -- 
> Konstantin Osipov, Moscow, Russia



Re: dtests to reproduce the schema disagreement

2022-08-09 Thread Konstantin Osipov via dev
* Cheng Wang via dev  [22/08/09 09:43]:

> I am working on improving the schema disagreement issue. I need some dtests
> which can reproduce the schema disagreement.  Anyone know if there are any
> existing tests for that? Or something similar?

cassandra-10250 is a good start.

-- 
Konstantin Osipov, Moscow, Russia


Re: dtests to reproduce the schema disagreement

2022-08-08 Thread Cheng Wang via dev
Hi Jeff,

Thank you for your reply! Yes, we are working on generating a deterministic
CFID at table creation time. We will also most likely block the drop-and-create
pattern to avoid the data resurrection issue, once we have identified all
the potential risks of the deterministic id.
That's why I asked about dtests that reproduce the schema
disagreement issue and show that the deterministic table id avoids it.

Thanks
Cheng

On Mon, Aug 8, 2022 at 4:46 PM Jeff Jirsa  wrote:

> I see. Then yes, make a cluster with at least 2 hosts, run the CREATE
> TABLE on them at the same time. If you use the pause injection framework,
> you can probably pause threads after the CFID is generated but before it's
> broadcast.
>
> If you make the CFID deterministic, you can avoid the race, but can run
> into problems if you create/drop/create (a node that was down during the
> drop may resurrect data)
>
> If you leave the CFID non-deterministic, the only way you're going to get
> safety is a global ordering or transactional system, which more or less
> reduces down to https://issues.apache.org/jira/browse/CASSANDRA-10699
>
> Now, there are some things you can do to minimize risk along the way - you
> could try to hunt down all of the possible races where in-memory state and
> on-disk state diverge, create signals/log messages / warnings to make it
> easier to detect, etc. But I'd be worried that any partial fixes will
> complicate 10699 (either make the merge worse, or be outright removed
> later), so it may be worth floating your proposed fix before you invest a
> ton of time on it.
>
>
>
>
>
>
>
>
> On Mon, Aug 8, 2022 at 3:57 PM Cheng Wang  wrote:
>
>> Jeff,
>>
>> The issue I was trying to address is when there are two CREATE TABLE
>> queries running on two coordinator nodes concurrently, it might end up with
>> 2 schema versions and they would never get resolved automatically because
>> table id is random TimeUUID.
>>
>>
>>
>> On Mon, Aug 8, 2022 at 3:54 PM Jeff Jirsa  wrote:
>>
>>> Which (of the many) schema disagreement issue(s)?
>>>
>>>
>>>
>>> On Mon, Aug 8, 2022 at 3:29 PM Cheng Wang via dev <
>>> dev@cassandra.apache.org> wrote:
>>>
>>>> Thank you for the reply, Brandon! It is helpful!
>>>>
>>>> I was thinking of creating a cluster with 2 nodes and having two
>>>> concurrent CREATE TABLE statements running. But the test will be flaky as
>>>> there is no guarantee that the query runs before the schema agreement has
>>>> been reached.
>>>> Any ideas for that?
>>>>
>>>> Thanks,
>>>> Cheng
>>>>
>>>> On Mon, Aug 8, 2022 at 3:19 PM Brandon Williams
>>>> wrote:
>>>>
>>>>> If you simply do a lot of schema changes quickly without waiting for
>>>>> agreement, that should get you there.
>>>>>
>>>>> Kind Regards,
>>>>> Brandon
>>>>>
>>>>> On Mon, Aug 8, 2022 at 5:08 PM Cheng Wang via dev
>>>>> wrote:
>>>>> >
>>>>> > Hello,
>>>>> >
>>>>> > I am working on improving the schema disagreement issue. I need some
>>>>> > dtests which can reproduce the schema disagreement.  Anyone know if
>>>>> > there are any existing tests for that? Or something similar?
>>>>> >
>>>>> > Thanks
>>>>> > Cheng
>>>>>



Re: dtests to reproduce the schema disagreement

2022-08-08 Thread Jeff Jirsa
I see. Then yes, make a cluster with at least 2 hosts, run the CREATE TABLE
on them at the same time. If you use the pause injection framework, you can
probably pause threads after the CFID is generated but before it's
broadcast.
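
To illustrate the idea of pausing after the id is generated but before it is
broadcast, here is a toy Python sketch. It does not use Cassandra's pause
injection framework; a threading.Barrier plays the role of the pause point,
and the coordinator function and the generated dict are hypothetical names
for this example only. The barrier guarantees that both "coordinators" have
picked their own id before either one publishes it, which removes the timing
dependence from the race.

```python
import threading
import uuid

barrier = threading.Barrier(2)   # stand-in for the injected pause point
lock = threading.Lock()
generated = {}                   # what each coordinator ends up "broadcasting"


def coordinator(name):
    table_id = uuid.uuid4()      # id generated locally (random stand-in)
    barrier.wait(timeout=5)      # pause until the other coordinator has an id too
    with lock:
        generated[name] = table_id   # "broadcast" happens only after the pause


threads = [threading.Thread(target=coordinator, args=(n,))
           for n in ("node1", "node2")]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("ids differ:", generated["node1"] != generated["node2"])
```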

If you make the CFID deterministic, you can avoid the race, but can run
into problems if you create/drop/create (a node that was down during the
drop may resurrect data)
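
A toy Python model of the create/drop/create problem with a deterministic id
is sketched below. Everything in it is an assumption made for illustration:
the Node class, the use of uuid5 over "keyspace.table" as the deterministic
id, and the dict standing in for on-disk data. It is not how Cassandra derives
or stores anything.

```python
import uuid


def deterministic_id(keyspace, table):
    # Hypothetical deterministic id: depends only on the names.
    return uuid.uuid5(uuid.NAMESPACE_OID, f"{keyspace}.{table}")


class Node:
    """Toy node: maps table id -> rows kept 'on disk'."""
    def __init__(self):
        self.data = {}

    def create(self, table_id):
        self.data.setdefault(table_id, [])

    def drop(self, table_id):
        self.data.pop(table_id, None)


tid = deterministic_id("ks", "t")
a, b = Node(), Node()

# Both nodes create the table and store a row.
for node in (a, b):
    node.create(tid)
    node.data[tid].append("old row")

# Node b is down during the DROP, so only node a applies it.
a.drop(tid)

# The table is re-created; with a deterministic id it gets the same id again.
a.create(tid)
b.create(tid)   # b still has the old row filed under that same id

print("node a rows:", a.data[tid])
print("node b rows:", b.data[tid])
```

With a fresh random id on the second CREATE, the old row on node b would sit
under the old id instead of showing up in the new table.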

If you leave the CFID non-deterministic, the only way you're going to get
safety is a global ordering or transactional system, which more or less
reduces down to https://issues.apache.org/jira/browse/CASSANDRA-10699

Now, there are some things you can do to minimize risk along the way - you
could try to hunt down all of the possible races where in-memory state and
on-disk state diverge, create signals/log messages / warnings to make it
easier to detect, etc. But I'd be worried that any partial fixes will
complicate 10699 (either make the merge worse, or be outright removed
later), so it may be worth floating your proposed fix before you invest a
ton of time on it.








On Mon, Aug 8, 2022 at 3:57 PM Cheng Wang  wrote:

> Jeff,
>
> The issue I was trying to address is when there are two CREATE TABLE
> queries running on two coordinator nodes concurrently, it might end up with
> 2 schema versions and they would never get resolved automatically because
> table id is random TimeUUID.
>
>
>
> On Mon, Aug 8, 2022 at 3:54 PM Jeff Jirsa  wrote:
>
>> Which (of the many) schema disagreement issue(s)?
>>
>>
>>
>> On Mon, Aug 8, 2022 at 3:29 PM Cheng Wang via dev <
>> dev@cassandra.apache.org> wrote:
>>
>>> Thank you for the reply, Brandon! It is helpful!
>>>
>>> I was thinking of creating a cluster with 2 nodes and having two
>>> concurrent CREATE TABLE statements running. But the test will be flaky as
>>> there is no guarantee that the query runs before the schema agreement has
>>> been reached.
>>> Any ideas for that?
>>>
>>> Thanks,
>>> Cheng
>>>
>>> On Mon, Aug 8, 2022 at 3:19 PM Brandon Williams 
>>> wrote:
>>>
>>>> If you simply do a lot of schema changes quickly without waiting for
>>>> agreement, that should get you there.
>>>>
>>>> Kind Regards,
>>>> Brandon
>>>>
>>>> On Mon, Aug 8, 2022 at 5:08 PM Cheng Wang via dev
>>>> wrote:
>>>> >
>>>> > Hello,
>>>> >
>>>> > I am working on improving the schema disagreement issue. I need some
>>>> > dtests which can reproduce the schema disagreement.  Anyone know if
>>>> > there are any existing tests for that? Or something similar?
>>>> >
>>>> > Thanks
>>>> > Cheng
>>>>
>>>


Re: dtests to reproduce the schema disagreement

2022-08-08 Thread Cheng Wang via dev
Jeff,

The issue I am trying to address is this: when two CREATE TABLE queries
run on two coordinator nodes concurrently, the cluster might end up with
2 schema versions that never get resolved automatically, because the
table id is a random TimeUUID.
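
A minimal Python sketch of why the random id matters follows. It is a toy
model under stated assumptions: schema_version is a made-up digest over
(table, id) pairs, uuid4 stands in for the random TimeUUID, and uuid5 over the
table name stands in for a deterministic id. A real schema version depends on
more than the table id, so this only isolates the effect of the id itself.

```python
import hashlib
import uuid


def schema_version(tables):
    """Toy schema version: digest over sorted (name, id) pairs."""
    h = hashlib.sha256()
    for name, table_id in sorted(tables.items()):
        h.update(f"{name}:{table_id}".encode())
    return uuid.UUID(bytes=h.digest()[:16])


def random_id(keyspace, table):
    # Stand-in for a random time-based id: differs on every coordinator.
    return uuid.uuid4()


def deterministic_id(keyspace, table):
    # Hypothetical deterministic id derived only from the names.
    return uuid.uuid5(uuid.NAMESPACE_OID, f"{keyspace}.{table}")


def concurrent_create(make_id):
    # Each coordinator generates its own id before hearing from the other.
    node1 = {"ks.t": make_id("ks", "t")}
    node2 = {"ks.t": make_id("ks", "t")}
    return schema_version(node1), schema_version(node2)


r1, r2 = concurrent_create(random_id)
d1, d2 = concurrent_create(deterministic_id)
print("random ids, versions agree:", r1 == r2)
print("deterministic ids, versions agree:", d1 == d2)
```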



On Mon, Aug 8, 2022 at 3:54 PM Jeff Jirsa  wrote:

> Which (of the many) schema disagreement issue(s)?
>
>
>
> On Mon, Aug 8, 2022 at 3:29 PM Cheng Wang via dev <
> dev@cassandra.apache.org> wrote:
>
>> Thank you for the reply, Brandon! It is helpful!
>>
>> I was thinking of creating a cluster with 2 nodes and having two
>> concurrent CREATE TABLE statements running. But the test will be flaky as
>> there is no guarantee that the query runs before the schema agreement has
>> been reached.
>> Any ideas for that?
>>
>> Thanks,
>> Cheng
>>
>> On Mon, Aug 8, 2022 at 3:19 PM Brandon Williams  wrote:
>>
>>> If you simply do a lot of schema changes quickly without waiting for
>>> agreement, that should get you there.
>>>
>>> Kind Regards,
>>> Brandon
>>>
>>> On Mon, Aug 8, 2022 at 5:08 PM Cheng Wang via dev
>>>  wrote:
>>> >
>>> > Hello,
>>> >
>>> > I am working on improving the schema disagreement issue. I need some
>>> dtests which can reproduce the schema disagreement.  Anyone know if there
>>> are any existing tests for that? Or something similar?
>>> >
>>> > Thanks
>>> > Cheng
>>>
>>


Re: dtests to reproduce the schema disagreement

2022-08-08 Thread Jeff Jirsa
Which (of the many) schema disagreement issue(s)?



On Mon, Aug 8, 2022 at 3:29 PM Cheng Wang via dev 
wrote:

> Thank you for the reply, Brandon! It is helpful!
>
> I was thinking of creating a cluster with 2 nodes and having two
> concurrent CREATE TABLE statements running. But the test will be flaky as
> there is no guarantee that the query runs before the schema agreement has
> been reached.
> Any ideas for that?
>
> Thanks,
> Cheng
>
> On Mon, Aug 8, 2022 at 3:19 PM Brandon Williams  wrote:
>
>> If you simply do a lot of schema changes quickly without waiting for
>> agreement, that should get you there.
>>
>> Kind Regards,
>> Brandon
>>
>> On Mon, Aug 8, 2022 at 5:08 PM Cheng Wang via dev
>>  wrote:
>> >
>> > Hello,
>> >
>> > I am working on improving the schema disagreement issue. I need some
>> dtests which can reproduce the schema disagreement.  Anyone know if there
>> are any existing tests for that? Or something similar?
>> >
>> > Thanks
>> > Cheng
>>
>


Re: dtests to reproduce the schema disagreement

2022-08-08 Thread Cheng Wang via dev
Thank you for the reply, Brandon! It is helpful!

I was thinking of creating a cluster with 2 nodes and having two concurrent
CREATE TABLE statements running. But the test will be flaky, as there is no
guarantee that both queries run before schema agreement has been reached.
Any ideas for that?

Thanks,
Cheng

On Mon, Aug 8, 2022 at 3:19 PM Brandon Williams  wrote:

> If you simply do a lot of schema changes quickly without waiting for
> agreement, that should get you there.
>
> Kind Regards,
> Brandon
>
> On Mon, Aug 8, 2022 at 5:08 PM Cheng Wang via dev
>  wrote:
> >
> > Hello,
> >
> > I am working on improving the schema disagreement issue. I need some
> dtests which can reproduce the schema disagreement.  Anyone know if there
> are any existing tests for that? Or something similar?
> >
> > Thanks
> > Cheng
>


Re: dtests to reproduce the schema disagreement

2022-08-08 Thread Brandon Williams
If you simply do a lot of schema changes quickly without waiting for
agreement, that should get you there.

Kind Regards,
Brandon

On Mon, Aug 8, 2022 at 5:08 PM Cheng Wang via dev
 wrote:
>
> Hello,
>
> I am working on improving the schema disagreement issue. I need some dtests 
> which can reproduce the schema disagreement.  Anyone know if there are any 
> existing tests for that? Or something similar?
>
> Thanks
> Cheng