Re: [VOTE] CEP-21 Transactional Cluster Metadata

2023-02-06 Thread scott
+1 nb

> On Feb 6, 2023, at 9:05 PM, Ariel Weisberg  wrote:
> 
> +1
> 
> On Mon, Feb 6, 2023, at 11:15 AM, Sam Tunnicliffe wrote:
>> Hi everyone,
>> 
>> I would like to start a vote on this CEP.
>> 
>> Proposal:
>> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-21%3A+Transactional+Cluster+Metadata
>> 
>> Discussion:
>> https://lists.apache.org/thread/h25skwkbdztz9hj2pxtgh39rnjfzckk7
>> 
>> The vote will be open for 72 hours.
>> A vote passes if there are at least three binding +1s and no binding vetoes.
>> 
>> Thanks,
>> Sam



Re: [VOTE] CEP-21 Transactional Cluster Metadata

2023-02-06 Thread Ariel Weisberg
+1

On Mon, Feb 6, 2023, at 11:15 AM, Sam Tunnicliffe wrote:
> Hi everyone,
> 
> I would like to start a vote on this CEP.
> 
> Proposal:
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-21%3A+Transactional+Cluster+Metadata
> 
> Discussion:
> https://lists.apache.org/thread/h25skwkbdztz9hj2pxtgh39rnjfzckk7
> 
> The vote will be open for 72 hours.
> A vote passes if there are at least three binding +1s and no binding vetoes.
> 
> Thanks,
> Sam


Re: Welcome Patrick McFadin as Cassandra Committer

2023-02-06 Thread Miles Garnsey
Congrats Patrick!

> On 7 Feb 2023, at 11:59 am, Anthony Grasso  wrote:
> 
> Congratulations, Patrick!
> 
> Well deserved given all the work you do for the community.
> 
> Kind regards,
> 
> On Mon, 6 Feb 2023 at 11:39, Paulo Motta wrote:
>> Congratulations for this well deserved recognition, Patrick! Thank you for 
>> all the energy you inject into this community! :-)
>> 
>> On Sun, Feb 5, 2023 at 7:17 PM Patrick McFadin wrote:
>>> Thank you everyone for all the well wishes here and in other parts of the 
>>> interwebs. It's always a privilege to work with the people in our community.
>>> 
>>> Patrick
>>> 
>>> On Fri, Feb 3, 2023 at 11:24 AM C. Scott Andreas wrote:
 Congratulations, Patrick!
 
> On Feb 2, 2023, at 9:46 PM, Berenguer Blasi wrote:
> 
> 
> Welcome!
> 
> On 3/2/23 4:09, Vinay Chella wrote:
>> Well deserved one, Congratulations, Patrick. 
>> 
>> On Fri, Feb 3, 2023 at 4:01 AM Josh McKenzie wrote:
>>> Congrats Patrick! Well deserved.
>>> 
>>> On Thu, Feb 2, 2023, at 5:25 PM, Molly Monroy wrote:
 Congrats, Patrick... much deserved!
 
 On Thu, Feb 2, 2023 at 1:59 PM Derek Chen-Becker wrote:
 Congrats!
 
 On Thu, Feb 2, 2023 at 10:58 AM Benjamin Lerer wrote:
 The PMC members are pleased to announce that Patrick McFadin has 
 accepted
 the invitation to become committer today.
 
 Thanks a lot, Patrick, for everything you have done for this project 
 and its community through the years.
 
 Congratulations and welcome!
 
 The Apache Cassandra PMC members
 
 
 --
 +---+
 | Derek Chen-Becker |
 | GPG Key available at https://keybase.io/dchenbecker and   |
 | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
 | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
 +---+
 
>>> 
>> 
>> --
>> 
>> 
>> Thanks,
>> Vinay Chella



Re: Welcome Patrick McFadin as Cassandra Committer

2023-02-06 Thread Anthony Grasso
Congratulations, Patrick!

Well deserved given all the work you do for the community.

Kind regards,

On Mon, 6 Feb 2023 at 11:39, Paulo Motta  wrote:

> Congratulations for this well deserved recognition, Patrick! Thank you for
> all the energy you inject into this community! :-)
>
> On Sun, Feb 5, 2023 at 7:17 PM Patrick McFadin  wrote:
>
>> Thank you everyone for all the well wishes here and in other parts of the
>> interwebs. It's always a privilege to work with the people in our community.
>>
>> Patrick
>>
>> On Fri, Feb 3, 2023 at 11:24 AM C. Scott Andreas 
>> wrote:
>>
>>> Congratulations, Patrick!
>>>
>>> On Feb 2, 2023, at 9:46 PM, Berenguer Blasi 
>>> wrote:
>>>
>>>
>>> Welcome!
>>> On 3/2/23 4:09, Vinay Chella wrote:
>>>
>>> Well deserved one, Congratulations, Patrick.
>>>
>>> On Fri, Feb 3, 2023 at 4:01 AM Josh McKenzie 
>>> wrote:
>>>
 Congrats Patrick! Well deserved.

 On Thu, Feb 2, 2023, at 5:25 PM, Molly Monroy wrote:

 Congrats, Patrick... much deserved!

 On Thu, Feb 2, 2023 at 1:59 PM Derek Chen-Becker 
 wrote:

 Congrats!

 On Thu, Feb 2, 2023 at 10:58 AM Benjamin Lerer 
 wrote:

 The PMC members are pleased to announce that Patrick McFadin has
 accepted
 the invitation to become committer today.

 Thanks a lot, Patrick, for everything you have done for this project
 and its community through the years.

 Congratulations and welcome!

 The Apache Cassandra PMC members



 --
 +---+
 | Derek Chen-Becker |
 | GPG Key available at https://keybase.io/dchenbecker and   |
 | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
 | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
 +---+


 --
>>>
>>>
>>> Thanks,
>>> Vinay Chella
>>>
>>>


Re: Announcement: Performance testing for Cassandra

2023-02-06 Thread Henrik Ingo
Thanks Marianne, and Matt

Ever since I joined this ecosystem I've wanted to see the day that there
are end-to-end full scale performance tests running nightly directly on
upstream Cassandra. Thank you so much for your work towards that!

Come to think of it, thank you to everyone and anyone who worked on open
sourcing the multiple tools used in running those tests.

<3

henrik

On Mon, Feb 6, 2023 at 6:02 PM Marianne Lyne Manaog <
marianne.man...@ieee.org> wrote:

> Hi everyone,
>
> Matt and I have created a public repository that contains performance
> tests for Cassandra using the open-source Fallout tool that the community
> can benefit from. Fallout is an open-source tool for running large scale
> remote-based distributed correctness, verification and performance tests
> for Apache Cassandra. All components for running the performance tests are
> all open-source and use Google Kubernetes Engine.
>
> At the moment, the repository contains 6 performance tests for lwt. We are
> still working on porting a few more tests into the repository. Everyone is
> welcome to contribute their own performance tests to the repository. Here
> is the link to the fallout-tests repository.
>
> One thing to note is that, at the moment, it is not possible to run
> performance tests on trunk but rather on specific versions of Cassandra.
> However, we are currently working on it.
>
> Marianne
>


-- 

Henrik Ingo

c. +358 40 569 7354

w. www.datastax.com

  
  


Re: [VOTE] CEP-21 Transactional Cluster Metadata

2023-02-06 Thread Josh McKenzie
+1

On Mon, Feb 6, 2023, at 2:53 PM, Dinesh Joshi wrote:
> +1
> 
>> 
>> On Feb 6, 2023, at 8:16 AM, Sam Tunnicliffe  wrote:
>> 
>> Hi everyone,
>> 
>> I would like to start a vote on this CEP.
>> 
>> Proposal:
>> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-21%3A+Transactional+Cluster+Metadata
>> 
>> Discussion:
>> https://lists.apache.org/thread/h25skwkbdztz9hj2pxtgh39rnjfzckk7
>> 
>> The vote will be open for 72 hours.
>> A vote passes if there are at least three binding +1s and no binding vetoes.
>> 
>> Thanks,
>> Sam

Re: [VOTE] CEP-21 Transactional Cluster Metadata

2023-02-06 Thread Dinesh Joshi
+1

> 
> On Feb 6, 2023, at 8:16 AM, Sam Tunnicliffe  wrote:
> 
> 
> Hi everyone,
> 
> I would like to start a vote on this CEP.
> 
> Proposal:
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-21%3A+Transactional+Cluster+Metadata
> 
> Discussion:
> https://lists.apache.org/thread/h25skwkbdztz9hj2pxtgh39rnjfzckk7
> 
> The vote will be open for 72 hours.
> A vote passes if there are at least three binding +1s and no binding vetoes.
> 
> Thanks,
> Sam


Re: [VOTE] CEP-21 Transactional Cluster Metadata

2023-02-06 Thread Ekaterina Dimitrova
+1

On Mon, 6 Feb 2023 at 13:02, Patrick McFadin  wrote:

> No more nodetool createepochunsafe! +1
>
> This is going to be another big merge. Just bookmarking the discussions
> last week on CEP-15.
>
> On Mon, Feb 6, 2023 at 9:57 AM Jeff Jirsa  wrote:
>
>> +1
>>
>>
>> On Mon, Feb 6, 2023 at 8:16 AM Sam Tunnicliffe  wrote:
>>
>>> Hi everyone,
>>>
>>> I would like to start a vote on this CEP.
>>>
>>> Proposal:
>>>
>>> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-21%3A+Transactional+Cluster+Metadata
>>>
>>> Discussion:
>>> https://lists.apache.org/thread/h25skwkbdztz9hj2pxtgh39rnjfzckk7
>>>
>>> The vote will be open for 72 hours.
>>> A vote passes if there are at least three binding +1s and no binding
>>> vetoes.
>>>
>>> Thanks,
>>> Sam
>>>
>>


Re: [VOTE] CEP-21 Transactional Cluster Metadata

2023-02-06 Thread Patrick McFadin
No more nodetool createepochunsafe! +1

This is going to be another big merge. Just bookmarking the discussions
last week on CEP-15.

On Mon, Feb 6, 2023 at 9:57 AM Jeff Jirsa  wrote:

> +1
>
>
> On Mon, Feb 6, 2023 at 8:16 AM Sam Tunnicliffe  wrote:
>
>> Hi everyone,
>>
>> I would like to start a vote on this CEP.
>>
>> Proposal:
>>
>> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-21%3A+Transactional+Cluster+Metadata
>>
>> Discussion:
>> https://lists.apache.org/thread/h25skwkbdztz9hj2pxtgh39rnjfzckk7
>>
>> The vote will be open for 72 hours.
>> A vote passes if there are at least three binding +1s and no binding
>> vetoes.
>>
>> Thanks,
>> Sam
>>
>


Re: [VOTE] CEP-21 Transactional Cluster Metadata

2023-02-06 Thread Jeff Jirsa
+1


On Mon, Feb 6, 2023 at 8:16 AM Sam Tunnicliffe  wrote:

> Hi everyone,
>
> I would like to start a vote on this CEP.
>
> Proposal:
>
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-21%3A+Transactional+Cluster+Metadata
>
> Discussion:
> https://lists.apache.org/thread/h25skwkbdztz9hj2pxtgh39rnjfzckk7
>
> The vote will be open for 72 hours.
> A vote passes if there are at least three binding +1s and no binding
> vetoes.
>
> Thanks,
> Sam
>


Re: [VOTE] CEP-21 Transactional Cluster Metadata

2023-02-06 Thread Aleksey Yeshchenko
+1

> On 6 Feb 2023, at 17:24, Benedict  wrote:
> 
> +1
> 
>> On 6 Feb 2023, at 16:17, Brandon Williams  wrote:
>> 
>> 
>> +1
>> 
>> On Mon, Feb 6, 2023, 10:15 AM Sam Tunnicliffe wrote:
>>> Hi everyone,
>>> 
>>> I would like to start a vote on this CEP.
>>> 
>>> Proposal:
>>> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-21%3A+Transactional+Cluster+Metadata
>>> 
>>> Discussion:
>>> https://lists.apache.org/thread/h25skwkbdztz9hj2pxtgh39rnjfzckk7
>>> 
>>> The vote will be open for 72 hours.
>>> A vote passes if there are at least three binding +1s and no binding vetoes.
>>> 
>>> Thanks,
>>> Sam



Re: [VOTE] CEP-21 Transactional Cluster Metadata

2023-02-06 Thread Benedict
+1

> On 6 Feb 2023, at 16:17, Brandon Williams wrote:
> 
> +1
> 
> On Mon, Feb 6, 2023, 10:15 AM Sam Tunnicliffe wrote:
>> Hi everyone,
>> 
>> I would like to start a vote on this CEP.
>> 
>> Proposal:
>> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-21%3A+Transactional+Cluster+Metadata
>> 
>> Discussion:
>> https://lists.apache.org/thread/h25skwkbdztz9hj2pxtgh39rnjfzckk7
>> 
>> The vote will be open for 72 hours.
>> A vote passes if there are at least three binding +1s and no binding vetoes.
>> 
>> Thanks,
>> Sam


Re: [VOTE] CEP-21 Transactional Cluster Metadata

2023-02-06 Thread Brandon Williams
+1

On Mon, Feb 6, 2023, 10:15 AM Sam Tunnicliffe  wrote:

> Hi everyone,
>
> I would like to start a vote on this CEP.
>
> Proposal:
>
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-21%3A+Transactional+Cluster+Metadata
>
> Discussion:
> https://lists.apache.org/thread/h25skwkbdztz9hj2pxtgh39rnjfzckk7
>
> The vote will be open for 72 hours.
> A vote passes if there are at least three binding +1s and no binding
> vetoes.
>
> Thanks,
> Sam
>


[VOTE] CEP-21 Transactional Cluster Metadata

2023-02-06 Thread Sam Tunnicliffe
Hi everyone,

I would like to start a vote on this CEP.

Proposal:
https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-21%3A+Transactional+Cluster+Metadata

Discussion:
https://lists.apache.org/thread/h25skwkbdztz9hj2pxtgh39rnjfzckk7

The vote will be open for 72 hours.
A vote passes if there are at least three binding +1s and no binding vetoes.

Thanks,
Sam

Announcement: Performance testing for Cassandra

2023-02-06 Thread Marianne Lyne Manaog
Hi everyone,

Matt and I have created a public repository that contains performance tests
for Cassandra using the open-source Fallout tool that the community can
benefit from. Fallout is an open-source tool for running large scale
remote-based distributed correctness, verification and performance tests
for Apache Cassandra. All components for running the performance tests are
all open-source and use Google Kubernetes Engine.

At the moment, the repository contains 6 performance tests for lwt. We are
still working on porting a few more tests into the repository. Everyone is
welcome to contribute their own performance tests to the repository. Here
is the link to the fallout-tests repository.

One thing to note is that, at the moment, it is not possible to run
performance tests on trunk but rather on specific versions of Cassandra.
However, we are currently working on it.

Marianne


Re: Implicitly enabling ALLOW FILTERING on virtual tables

2023-02-06 Thread Miklosovic, Stefan
Thanks everybody for the input. I created the ticket (1) to track the work (2).

Let's move further discussion there.

(1) https://issues.apache.org/jira/browse/CASSANDRA-18238
(2) https://github.com/apache/cassandra/pull/2142/files


From: Aleksey Yeshchenko 
Sent: Monday, February 6, 2023 12:11
To: dev@cassandra.apache.org
Subject: Re: Implicitly enabling ALLOW FILTERING on virtual tables

Just make virtual table implementations decide?

Add a method to VirtualTable interface to indicate if this is desirable, and 
call it a day?

On 6 Feb 2023, at 09:41, Benjamin Lerer  wrote:

Making ALLOW FILTERING a table option means giving the person creating the 
table the ability to change the way the server behaves for that table, which 
might not be something that every C* operator wants. Of course, we can allow 
operators to control that through the ALLOW FILTERING guardrail. At that point 
we would also need a default setting for the entire database.

On Fri, 3 Feb 2023 at 23:44, Miklosovic, Stefan 
<stefan.mikloso...@netapp.com> wrote:
This is the draft for FILTERING ON|OFF in shell.

I would say this is the simplest solution.

We may still consider a table option, but what do you think about simply 
setting it via the shell?

https://github.com/apache/cassandra/pull/2141/files


From: Josh McKenzie <jmcken...@apache.org>
Sent: Friday, February 3, 2023 23:39
To: dev
Subject: Re: Implicitly enabling ALLOW FILTERING on virtual tables

they would start to set ALLOW FILTERING here and there in order to not think 
twice about their data model so they can just call it a day.
Setting this on a per-table basis or having users set this on specific queries 
that hit tables and forgetting they set it are 6 of one and half-a-dozen of 
another.

I like the table property idea personally. That communicates an intent about 
the data model and expectation of the size and usage of data in the modeling of 
the schema that embeds some context and intent there's currently no mechanism 
to communicate.

On Fri, Feb 3, 2023, at 5:00 PM, Miklosovic, Stefan wrote:
Yes, there would be a discrepancy. I do not like that either. If it were only 
about "normal tables vs virtual tables", I could live with that. But the fact 
that there are going to be differences among vtables themselves starts to be a 
little bit messy. Then we would need to let operators know which tables are 
always allowed to be filtered on and which are not, and that just complicates 
it. Putting that information in a comment so it is visible in DESCRIBE is a 
nice idea.

That flag we talk about ... that flag would be used purely internally; it would 
not be in the schema to be gossiped.

Also, I am starting to like the suggestion to have something like ALLOW 
FILTERING ON in CQLSH so it would be turned on for the whole CQL session. That 
leaves tables as they are and it should not be a big deal for operators to set. 
We would have to make sure to add an "ALLOW FILTERING" clause to every SELECT 
statement (to virtual tables only?) a user submits. I am not sure if this is 
doable yet though.


From: David Capwell <dcapw...@apple.com>
Sent: Friday, February 3, 2023 22:42
To: dev
Cc: Maxim Muzafarov
Subject: Re: Implicitly enabling ALLOW FILTERING on virtual tables

I don't think the assumption that "virtual tables will always be small and 
always fit in memory" is a safe one.

Agree, there is a repair ticket to have the coordinating node do network 
queries to peers to resolve the table (rather than operator querying 
everything, allow the coordinator node to do it for you)… so this assumption 
may not be true down the line.

I could be open to a table property that says ALLOW FILTERING on by default or 
not… then we can pick and choose vtables (or have vtables opt-out)…. I kinda 
like like the lack of consistency with this approach though

On Feb 3, 2023, at 11:24 AM, C. Scott Andreas <sc...@paradoxica.net> wrote:

There are some ideas that development community members have kicked around that 
may falsify the assumption that "virtual tables are tiny and will fit in 
memory."

One example is CASSANDRA-14629: Abstract Virtual Table for very large result 
sets
https://issues.apache.org/jira/browse/CASSANDRA-14629

Chris's proposal 

Re: Implicitly enabling ALLOW FILTERING on virtual tables

2023-02-06 Thread Aleksey Yeshchenko
Just make virtual table implementations decide?

Add a method to VirtualTable interface to indicate if this is desirable, and 
call it a day? 
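
A minimal sketch of what such a method could look like, written as a
self-contained example rather than against Cassandra's real classes. The
interface and class names (VirtualTableSketch, SettingsTableSketch,
SSTableDumpSketch, FilteringCheck) and the method name
allowFilteringImplicitly() are hypothetical, and defaulting to false is an
assumption chosen here to keep standard CQL semantics unless a table opts in:

```java
// Hypothetical stand-in for a virtual table abstraction; this is NOT
// Cassandra's actual VirtualTable interface.
interface VirtualTableSketch {
    String name();

    // Each implementation decides whether filtering may be applied without an
    // explicit ALLOW FILTERING clause. The false default is an assumption.
    default boolean allowFilteringImplicitly() {
        return false;
    }
}

// A small, memory-resident table that opts in to implicit filtering.
final class SettingsTableSketch implements VirtualTableSketch {
    @Override
    public String name() {
        return "settings";
    }

    @Override
    public boolean allowFilteringImplicitly() {
        return true;
    }
}

// A potentially very large table that keeps the default (no implicit filtering).
final class SSTableDumpSketch implements VirtualTableSketch {
    @Override
    public String name() {
        return "sstable_dump";
    }
}

final class FilteringCheck {
    // A query that requires filtering is accepted if the clause is present
    // in the statement or if the table itself opts in.
    static boolean isPermitted(VirtualTableSketch table, boolean hasAllowFilteringClause) {
        return hasAllowFilteringClause || table.allowFilteringImplicitly();
    }
}
```

With this shape, a table that may grow large simply does not override the
method, so queries against it still need an explicit ALLOW FILTERING clause.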

> On 6 Feb 2023, at 09:41, Benjamin Lerer  wrote:
> 
> Making ALLOW FILTERING a table option means giving the person creating the 
> table the ability to change the way the server behaves for that table, which 
> might not be something that every C* operator wants. Of course, we can allow 
> operators to control that through the ALLOW FILTERING guardrail. At that 
> point we would also need a default setting for the entire database.
> 
> On Fri, 3 Feb 2023 at 23:44, Miklosovic, Stefan 
> <stefan.mikloso...@netapp.com> wrote:
>> This is the draft for FILTERING ON|OFF in shell.
>> 
>> I would say this is the simplest solution.
>> 
>> We may still consider a table option, but what do you think about simply 
>> setting it via the shell?
>> 
>> https://github.com/apache/cassandra/pull/2141/files
>> 
>> 
>> From: Josh McKenzie <jmcken...@apache.org>
>> Sent: Friday, February 3, 2023 23:39
>> To: dev
>> Subject: Re: Implicitly enabling ALLOW FILTERING on virtual tables
>> 
>> they would start to set ALLOW FILTERING here and there in order to not think 
>> twice about their data model so they can just call it a day.
>> Setting this on a per-table basis or having users set this on specific 
>> queries that hit tables and forgetting they set it are 6 of one and 
>> half-a-dozen of another.
>> 
>> I like the table property idea personally. That communicates an intent about 
>> the data model and expectation of the size and usage of data in the modeling 
>> of the schema that embeds some context and intent there's currently no 
>> mechanism to communicate.
>> 
>> On Fri, Feb 3, 2023, at 5:00 PM, Miklosovic, Stefan wrote:
>> Yes, there would be a discrepancy. I do not like that either. If it were 
>> only about "normal tables vs virtual tables", I could live with that. But 
>> the fact that there are going to be differences among vtables themselves 
>> starts to be a little bit messy. Then we would need to let operators know 
>> which tables are always allowed to be filtered on and which are not, and 
>> that just complicates it. Putting that information in a comment so it is 
>> visible in DESCRIBE is a nice idea.
>> 
>> That flag we talk about ... that flag would be used purely internally; it 
>> would not be in the schema to be gossiped.
>> 
>> Also, I am starting to like the suggestion to have something like ALLOW 
>> FILTERING ON in CQLSH so it would be turned on for the whole CQL session. 
>> That leaves tables as they are and it should not be a big deal for operators 
>> to set. We would have to make sure to add an "ALLOW FILTERING" clause to 
>> every SELECT statement (to virtual tables only?) a user submits. I am not 
>> sure if this is doable yet though.
>> 
>> 
>> From: David Capwell
>> Sent: Friday, February 3, 2023 22:42
>> To: dev
>> Cc: Maxim Muzafarov
>> Subject: Re: Implicitly enabling ALLOW FILTERING on virtual tables
>> 
>> I don't think the assumption that "virtual tables will always be small and 
>> always fit in memory" is a safe one.
>> 
>> Agree, there is a repair ticket to have the coordinating node do network 
>> queries to peers to resolve the table (rather than operator querying 
>> everything, allow the coordinator node to do it for you)… so this assumption 
>> may not be true down the line.
>> 
>> I could be open to a table property that says ALLOW FILTERING on by default 
>> or not… then we can pick and choose vtables (or have vtables opt-out)…. I 
>> kinda like like the lack of consistency with this approach though
>> 
>> On Feb 3, 2023, at 11:24 AM, C. Scott Andreas wrote:
>> 
>> There are some ideas that development community members have kicked around 
>> that may falsify the assumption that "virtual tables are tiny and will fit 
>> in memory."
>> 
>> One example is CASSANDRA-14629: Abstract Virtual Table for very large result 
>> sets
>> https://issues.apache.org/jira/browse/CASSANDRA-14629
>> 
>> Chris's proposal here is to enable query results from virtual tables to be 
>> streamed to the client rather than being fully materialized. There are some 
>> neat possibilities suggested in this ticket, such as debug functionality to 
>> dump the contents of a raw SSTable via the CQL interface, or the 

Re: Implicitly enabling ALLOW FILTERING on virtual tables

2023-02-06 Thread Benjamin Lerer
Making ALLOW FILTERING a table option means giving the person creating the
table the ability to change the way the server behaves for that table, which
might not be something that every C* operator wants. Of course, we can allow
operators to control that through the ALLOW FILTERING guardrail. At that
point we would also need a default setting for the entire database.

On Fri, 3 Feb 2023 at 23:44, Miklosovic, Stefan <
stefan.mikloso...@netapp.com> wrote:

> This is the draft for FILTERING ON|OFF in shell.
>
> I would say this is the simplest solution.
>
> We may still consider a table option, but what do you think about simply
> setting it via the shell?
>
> https://github.com/apache/cassandra/pull/2141/files
>
> 
> From: Josh McKenzie 
> Sent: Friday, February 3, 2023 23:39
> To: dev
> Subject: Re: Implicitly enabling ALLOW FILTERING on virtual tables
>
> they would start to set ALLOW FILTERING here and there in order to not
> think twice about their data model so they can just call it a day.
> Setting this on a per-table basis or having users set this on specific
> queries that hit tables and forgetting they set it are 6 of one and
> half-a-dozen of another.
>
> I like the table property idea personally. That communicates an intent
> about the data model and expectation of the size and usage of data in the
> modeling of the schema that embeds some context and intent there's
> currently no mechanism to communicate.
>
> On Fri, Feb 3, 2023, at 5:00 PM, Miklosovic, Stefan wrote:
> Yes, there would be a discrepancy. I do not like that either. If it were
> only about "normal tables vs virtual tables", I could live with that. But
> the fact that there are going to be differences among vtables themselves
> starts to be a little bit messy. Then we would need to let operators know
> which tables are always allowed to be filtered on and which are not, and
> that just complicates it. Putting that information in a comment so it is
> visible in DESCRIBE is a nice idea.
>
> That flag we talk about ... that flag would be used purely internally; it
> would not be in the schema to be gossiped.
>
> Also, I am starting to like the suggestion to have something like ALLOW
> FILTERING ON in CQLSH so it would be turned on for the whole CQL session.
> That leaves tables as they are and it should not be a big deal for operators
> to set. We would have to make sure to add an "ALLOW FILTERING" clause to
> every SELECT statement (to virtual tables only?) a user submits. I am not
> sure if this is doable yet though.
>
> 
> From: David Capwell <dcapw...@apple.com>
> Sent: Friday, February 3, 2023 22:42
> To: dev
> Cc: Maxim Muzafarov
> Subject: Re: Implicitly enabling ALLOW FILTERING on virtual tables
>
> I don't think the assumption that "virtual tables will always be small and
> always fit in memory" is a safe one.
>
> Agree, there is a repair ticket to have the coordinating node do network
> queries to peers to resolve the table (rather than operator querying
> everything, allow the coordinator node to do it for you)… so this
> assumption may not be true down the line.
>
> I could be open to a table property that says ALLOW FILTERING on by
> default or not… then we can pick and choose vtables (or have vtables
> opt-out)…. I kinda like like the lack of consistency with this approach
> though
>
> On Feb 3, 2023, at 11:24 AM, C. Scott Andreas wrote:
>
> There are some ideas that development community members have kicked around
> that may falsify the assumption that "virtual tables are tiny and will fit
> in memory."
>
> One example is CASSANDRA-14629: Abstract Virtual Table for very large
> result sets
> https://issues.apache.org/jira/browse/CASSANDRA-14629
>
> Chris's proposal here is to enable query results from virtual tables to be
> streamed to the client rather than being fully materialized. There are some
> neat possibilities suggested in this ticket, such as debug functionality to
> dump the contents of a raw SSTable via the CQL interface, or the contents
> of the database's internal caches. One could also imagine a feature like
> this providing functionality similar to a foreign data wrapper in other
> databases.
>
> I don't think the assumption that "virtual tables will always be small and
> always fit in memory" is a safe one.
>
> I don't think we should implicitly add "ALLOW FILTERING" to all queries
> against virtual tables because of this, in addition to concern with
> departing from standard CQL semantics for a type of tables deemed special.
>
> – Scott
>
> On Feb