[ 
https://issues.apache.org/jira/browse/CASSANDRA-15803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17715740#comment-17715740
 ] 

Stefan Miklosovic commented on CASSANDRA-15803:
-----------------------------------------------

When it comes to a practical usage, when we have dozens of tables, how would a 
guardrail look like? We can have tables for which allowing the filtering within 
a partition is OK because we "know" that there is not going to be a lot of rows 
within a partition so we are safe to assume that it would be quick. So, for 
some tables it would be ok, for some it would not. Unless we model a guardrail 
to be able to make a difference, per table, I would go with WITHIN PARTITION.

I would go with a guardrail only if we do not want to make it granular per 
table.

[~jeromatron]  [~adelapena]  what do you think?

> Separate out allow filtering scanning through a partition versus scanning 
> over the table
> ----------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15803
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15803
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: CQL/Syntax
>            Reporter: Jeremy Hanna
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>
> Currently allow filtering can mean two things in the spirit of "avoid 
> operations that don't seek to a specific row or sequential rows of data."  
> First, it can mean scanning across the entire table to meet the criteria of 
> the query.  That's almost always a bad thing and should be discouraged or 
> disabled (see CASSANDRA-8303).  Second, it can mean filtering within a 
> specific partition.  For example, in a query you could specify the full 
> partition key and if you specify a criterion on a non-key field, it requires 
> allow filtering.
> The second reason to require allow filtering is significantly less work to 
> scan through a partition.  It is still extra work over seeking to a specific 
> row and getting N sequential rows though.  So while an application developer 
> and/or operator needs to be cautious about this second type, it's not 
> necessarily a bad thing, depending on the table and the use case.
> I propose that we separate the way to specify allow filtering across an 
> entire table from specifying allow filtering across a partition in a 
> backwards compatible way.  One idea that was brought up in Slack in the 
> cassandra-dev room was to have allow filtering mean the superset - scanning 
> across the table.  Then if you want to specify that you *only* want to scan 
> within a partition you would use something like
> {{ALLOW FILTERING [WITHIN PARTITION]}}
> So it will succeed if you specify non-key criteria within a single partition, 
> but fail with a message to say it requires the full allow filtering.  This 
> would allow for a backwards compatible full allow filtering while allowing a 
> user to specify that they want to just scan within a partition, but error out 
> if trying to scan a full table.
> This is potentially also related to the capability limitation framework by 
> which operators could more granularly specify what features are allowed or 
> disallowed per user, discussed in CASSANDRA-8303.  This way an operator could 
> disallow the more general allow filtering while allowing the partition scan 
> (or disallow them both at their discretion).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to