[ 
https://issues.apache.org/jira/browse/CASSANDRA-10471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14958561#comment-14958561
 ] 

Sylvain Lebresne commented on CASSANDRA-10471:
----------------------------------------------

bq. As a reviewer I didn't figure out how to verify that this statement is true 
or why. I mucked about with StatementRestrictions and family and found where IN 
restrictions are expressed, but it's all pretty big in scope. Do you have any 
pointers?

The query that triggers the problem is
{noformat}
SELECT v FROM test_compact WHERE k1 = 0 AND k2 IN ()
{noformat}
That query explicitly requests no row (it's a query by name with an empty list 
of names). One would expect such valid but somewhat uninteresting query to be 
dealt with at the CQL layer (by returning nothing), yielding no internal query, 
and that is what happens on 3.0 (see 
{{SelectStatement.makeClusteringIndexFilter}}, in the case of a query by names, 
the code checks if {{getRequestedRows}} returns an empty list and return 
{{null}} if so which is code for "we know the query return nothing"). That's 
the reason why I initially made {{ColumnFilter.Builder}} reject the case where 
nothing was selected, but that was misguided especially since the code is 
perfectly fine dealing with an empty {{ColumnFilter}}.

And it happens that this optimization isn't done in 2.2. Or rather, it used to 
be done but was "broken" by CASSANDRA-7981. If you look at the equivalent code 
on 2.2, in {{SelectStatement.makeFilter()}}, it assumes a {{IN ()}} would 
result in {{getRequestedColumns}} returning {{null}}, but it's easy to see it's 
not the case anymore (and it's equally easy to see that CASSANDRA-7981 is the 
culprit for that). So on the 2.2 node, the query simply generates an empty list 
in {{SelectStatement.getRequestedColumns()}} (because 
{{SingleColumnRestriction.InWithValues.getValues()}} does nothing special if 
its list of terms is empty, is just returns an empty list) and queries with 
that. That's where, during an upgrade, the 3.0 node receives a query by name 
with an empty list of names, and that triggers my misguided assertion.

I'll note that I'm fine "fixing" that broken optimization in 2.2 and I've 
pushed a very trivial fix to do so 
[here|https://github.com/pcmanus/cassandra/commits/10471-2.2], but I don't 
really care whether we do it or not since 1) it's pretty inconsequential for 
2.2 users and 2) even if we do commit that 2.2 patch, we still need to fix 3.0 
for users who might upgrade from a 2.2 version that don't have that fix.

bq. As far as I can tell null isn't used as a signal anywhere

Well it does. It signals we don't want to skip any value when {{isFetchAll}} is 
set (paraphrasing the comment on the declaration of {{selection}} here). If 
{{isFetchAll == true}}, {{selection}} is the subset of columns for which we 
want to include the values (the ones whose values are not skipped). As the case 
where we don't skip any value is common, we use {{null}} to signal it. The 
equivalent if we were to not use {{null}} would be to have {{selection}} be all 
the columns, but that would mean a slightly less efficient {{canSkipValue}} in 
that common case.
I'll note that I feel all this is reasonably well explained in the class 
javadoc and the comments around the class field declarations, but I'm open to 
improvement suggestions.

bq. canSkipValue might have depended on it before this change

Why only before this change? {{selection == null}} only matter in 
{{canSkipValue}} if {{isFetchAll == true}}, and 
{{ColumnFilter.Builder.build()}} explicitly only force 
{{PartitionColumns.NONE}} if {{!isFetchAll}}.

bq. There is no unit test for ColumnFilter

There isn't and I've created CASSANDRA-10531 to change that.



> fix flapping empty_in_test dtest
> --------------------------------
>
>                 Key: CASSANDRA-10471
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10471
>             Project: Cassandra
>          Issue Type: Sub-task
>            Reporter: Jim Witschey
>            Assignee: Sylvain Lebresne
>             Fix For: 3.0.0 rc2
>
>
> {{upgrade_tests/cql_tests.py:TestCQL.empty_in_test}} fails about half the 
> time on the upgrade path from 2.2 to 3.0:
> http://cassci.datastax.com/view/Upgrades/job/storage_engine_upgrade_dtest-22_tarball-30_HEAD/42/testReport/upgrade_tests.cql_tests/TestCQL/empty_in_test/history/
> Once [this dtest PR|https://github.com/riptano/cassandra-dtest/pull/586] is 
> merged, these tests should also run with this upgrade path on normal 3.0 
> jobs. Until then, you can run it with the following command:
> {code}
> SKIP=false CASSANDRA_VERSION=binary:2.2.0 UPGRADE_TO=git:cassandra-3.0 
> nosetests 2>&1 upgrade_tests/cql_tests.py:TestCQL.empty_in_test
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to