[
https://issues.apache.org/jira/browse/CASSANDRA-10471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14958561#comment-14958561
]
Sylvain Lebresne commented on CASSANDRA-10471:
----------------------------------------------
bq. As a reviewer I didn't figure out how to verify that this statement is true
or why. I mucked about with StatementRestrictions and family and found where IN
restrictions are expressed, but it's all pretty big in scope. Do you have any
pointers?
The query that triggers the problem is
{noformat}
SELECT v FROM test_compact WHERE k1 = 0 AND k2 IN ()
{noformat}
That query explicitly requests no row (it's a query by name with an empty list
of names). One would expect such valid but somewhat uninteresting query to be
dealt with at the CQL layer (by returning nothing), yielding no internal query,
and that is what happens on 3.0 (see
{{SelectStatement.makeClusteringIndexFilter}}, in the case of a query by names,
the code checks if {{getRequestedRows}} returns an empty list and return
{{null}} if so which is code for "we know the query return nothing"). That's
the reason why I initially made {{ColumnFilter.Builder}} reject the case where
nothing was selected, but that was misguided especially since the code is
perfectly fine dealing with an empty {{ColumnFilter}}.
And it happens that this optimization isn't done in 2.2. Or rather, it used to
be done but was "broken" by CASSANDRA-7981. If you look at the equivalent code
on 2.2, in {{SelectStatement.makeFilter()}}, it assumes a {{IN ()}} would
result in {{getRequestedColumns}} returning {{null}}, but it's easy to see it's
not the case anymore (and it's equally easy to see that CASSANDRA-7981 is the
culprit for that). So on the 2.2 node, the query simply generates an empty list
in {{SelectStatement.getRequestedColumns()}} (because
{{SingleColumnRestriction.InWithValues.getValues()}} does nothing special if
its list of terms is empty, is just returns an empty list) and queries with
that. That's where, during an upgrade, the 3.0 node receives a query by name
with an empty list of names, and that triggers my misguided assertion.
I'll note that I'm fine "fixing" that broken optimization in 2.2 and I've
pushed a very trivial fix to do so
[here|https://github.com/pcmanus/cassandra/commits/10471-2.2], but I don't
really care whether we do it or not since 1) it's pretty inconsequential for
2.2 users and 2) even if we do commit that 2.2 patch, we still need to fix 3.0
for users who might upgrade from a 2.2 version that don't have that fix.
bq. As far as I can tell null isn't used as a signal anywhere
Well it does. It signals we don't want to skip any value when {{isFetchAll}} is
set (paraphrasing the comment on the declaration of {{selection}} here). If
{{isFetchAll == true}}, {{selection}} is the subset of columns for which we
want to include the values (the ones whose values are not skipped). As the case
where we don't skip any value is common, we use {{null}} to signal it. The
equivalent if we were to not use {{null}} would be to have {{selection}} be all
the columns, but that would mean a slightly less efficient {{canSkipValue}} in
that common case.
I'll note that I feel all this is reasonably well explained in the class
javadoc and the comments around the class field declarations, but I'm open to
improvement suggestions.
bq. canSkipValue might have depended on it before this change
Why only before this change? {{selection == null}} only matter in
{{canSkipValue}} if {{isFetchAll == true}}, and
{{ColumnFilter.Builder.build()}} explicitly only force
{{PartitionColumns.NONE}} if {{!isFetchAll}}.
bq. There is no unit test for ColumnFilter
There isn't and I've created CASSANDRA-10531 to change that.
> fix flapping empty_in_test dtest
> --------------------------------
>
> Key: CASSANDRA-10471
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10471
> Project: Cassandra
> Issue Type: Sub-task
> Reporter: Jim Witschey
> Assignee: Sylvain Lebresne
> Fix For: 3.0.0 rc2
>
>
> {{upgrade_tests/cql_tests.py:TestCQL.empty_in_test}} fails about half the
> time on the upgrade path from 2.2 to 3.0:
> http://cassci.datastax.com/view/Upgrades/job/storage_engine_upgrade_dtest-22_tarball-30_HEAD/42/testReport/upgrade_tests.cql_tests/TestCQL/empty_in_test/history/
> Once [this dtest PR|https://github.com/riptano/cassandra-dtest/pull/586] is
> merged, these tests should also run with this upgrade path on normal 3.0
> jobs. Until then, you can run it with the following command:
> {code}
> SKIP=false CASSANDRA_VERSION=binary:2.2.0 UPGRADE_TO=git:cassandra-3.0
> nosetests 2>&1 upgrade_tests/cql_tests.py:TestCQL.empty_in_test
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)