[ 
https://issues.apache.org/jira/browse/CASSANDRA-8103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15098023#comment-15098023
 ] 

Sam Tunnicliffe commented on CASSANDRA-8103:
--------------------------------------------

bq. Any feedback here?

Yes, apologies for the delay, the patch looks pretty good, thanks! 
I had intended to post the review sooner, but in reviewing it discovered an 
earlier regression which I ended up digging into (CASSANDRA-11013).

I have a couple of concerns about the new implementation of 
{{CompositesSearcher::filterStaleEntries}}. Firstly, when an index entry for a 
static column is stale the iterator returned includes the actual static row 
retrieved from the base table. This means that that static row is unnecessarily 
included in all local and remote read responses adding serialization and 
network overhead. As we've identified that the index entry is no longer valid 
for the partition it points to, the returned iter be completely empty (i.e. it 
should include {{Rows.EMPTY_STATIC_ROW}}). 
I'm also not overly fond of moving the call to {{deleteStaleEntries}} into 
{{filterStaleEntries}} itself. In the static case this is fair enough as we're 
going to simply return an empty iterator, but for non-statics I'd prefer to 
keep it as something which happens on close of the results iterator (modulo 
CASSANDRA-11013). There are a few other minor things with this method, so for 
simplicity I've pushed a commit 
[here|https://github.com/beobal/cassandra/commit/b4df631b945cfc87d8ee86016e6d61790513bf64]
 to fix it up, let me know if you agree with those changes. Also, I haven't 
added it, but a test to verify the stale entry cleanup behaviour would be 
useful (see 
{{o.a.c.db.SecondaryIndexTest::testDeleteOfInconsistentValuesFromCompositeIndex}}).
 

One additional thing I'm not wild about is that due to the way we convert a row 
in the index table to an {{IndexEntry}} in the {{CassandraIndex::decodeEntry}} 
implementations, we lose the fact that the indexed value is static. At write 
time the clustering value, which is always {{Clustering.STATIC_CLUSTERING}}, is 
correctly used to build the index entry. However, at query time the 
{{decodeEntry}} implementations are not aware of the static nature and so 
attempt to build a regular clustering from the index table row. The difference 
here is minor; {{STATIC_CLUSTERING}} consists of an empty array of values, 
whereas the one that gets reconstructed is of variable size (depending on the 
number of clustering columns in the base table), but all its values will be 
null. This doesn't actually break anything (yet) as 
{{IndexEntry.indexedClustering}} is unused in the static case thanks to the 
short circuit in {{CompositesSearcher}} but it is a bit messy, so if you 
wouldn't mind I think we should fix the decoding in {{RegularColumnIndex}}, 
{{CollectionValueIndex}} & {{CollectionKeyIndexBase}}. 

Contrary to the commit comment in the patch, non-regular static columns are 
supported by this solution (given the test covering static collections, I 
assume you know this though and the comment is simply out of date). In 
addition, *frozen* static collections and UDTs can also be indexed just fine - 
these work in the same way as their non-static counterparts by being processed 
as regular columns (which they are really, being just opaque blobs at the 
storage engine level). There aren't any tests covering static frozen types 
here, so adding some would be good.

So I think the things to address here are:

* Adding tests for static frozen columns, covering collections and UDTs if 
possible
* Review {{CompositesSearcher::filterStaleEntries}} as noted above
* Fix the decoding of index entries such that they have {{indexedClustering}} 
set to {{STATIC_CLUSTERING}}
* The new {{SinglePartitionReadCommand}} in {{CompositesSearcher}} could re-use 
{{command.columnFilter()}} as the existing one does. Right now, this is 
equivalent to what the patch currently does, as CQL queries need to read all 
columns, but changing this has been mooted before (see CASSANDRA-6588 & 
CASSANDRA-7055) and it would be easy to overlook this if it does happen at some 
point.
* Remove the {{println}} from {{CompositesSearcher}} (line #136)

At some point I'd like to take a run a reorganising the various index related 
tests as at the moment, they're spread over a number of different 
classes/packages. Consolidating them, at least into a single package, would 
make it easier to identify relevant tests. I'm not suggesting we do that here 
though, I'll leave that to a separate ticket.


> Secondary Indices for Static Columns
> ------------------------------------
>
>                 Key: CASSANDRA-8103
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8103
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: CQL
>            Reporter: Ron Cohen
>             Fix For: 3.x
>
>         Attachments: 8103.patch, in_progress.patch, smoke-test.cql
>
>
> We should add secondary index support for static columns.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to