[jira] [Comment Edited] (CASSANDRA-11990) Address rows rather than partitions in SASI

Alex Petrov (JIRA) Mon, 27 Aug 2018 08:34:26 -0700


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-11990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16593803#comment-16593803
 ]


Alex Petrov edited comment on CASSANDRA-11990 at 8/27/18 3:33 PM:
------------------------------------------------------------------

[~jrwest] thank you for the patch! 

I haven't reviewed the patch fully just yet but I have multiple comments: 

  * since we now store clusterings in the index, index sizes got much bigger. 
In case of prefix we get 1,5G for 175M of data (old version yields 660M). 
  * I realize there might be no hard limit on the amount of clusterings but a) 
using {{Integer.BYTES}} seems a lot of bytes even though it's written only once 
per partition and b) we should be able to infer clustering amount from schema, 
so writing it into the index might be redundant. I've tried to naively discard 
the value and replace {{buffer.getInt}} with {{clusteringComparator.size()}} 
and it seems to be ok (however, cases with static clustering, static tables etc 
should be double-checked)
  * I might be mistaken, but we can infer whether or not to read an entire 
partition from the query rather than going through all clusterings like 
[here|https://github.com/jrwest/cassandra/compare/deec448d8c1adee8663281708dc5e0060d0fa250...5d63aae5ffba7569811fad6a818163fcf3d26c6f#diff-df913f108e1290500a9afb8cd4638189R114]
  * I've tried ingesting a relatively small SSTable (~300Mb) and old SASI 
didn't seem to have any issues while new version struggles with GC. I'd have to 
do a lot more testing and verification to say what might be the cause.
  * it's be great to have at least a rough description of the index file 
format. We didn't have it before, I know, but if we could make it while at it, 
might be very useful for the future work.
  * given we're fetching clustering (therefore can say which rows are "same" 
across partitions), and data reconciliation is done by controller that goes 
through the "normal" read path, do we need to do re-filter the data in cases of 
non-static clusterings?

Could you also describe the motivation for embedding rows rather than row 
offsets into the index? I thought that it could improve some cases with 
range-iteration and eliminate additional filtering. But since we're making a 
seek to the data file in order to materialize the key, we can also fetch 
clusterings from there as well. 


was (Author: ifesdjeen):
[~jrwest] thank you for the patch! 

I haven't reviewed the patch fully just yet but I have multiple comments: 

  * since we now store clusterings in the index, index sizes got much bigger. 
In case of prefix we get 1,5G for 175M of data (old version yields 660M). 
  * I realize there might be no hard limit on the amount of clusterings but a) 
using {{Integer.BYTES}} seems a lot of bytes even though it's written only once 
per partition and b) we should be able to infer clustering amount from schema, 
so writing it into the index might be redundant. I've tried to naively discard 
the value and replace {{buffer.getInt}} with {{clusteringComparator.size()}} 
and it seems to be ok (however, cases with static clustering, static tables etc 
should be double-checked)
  * I might be mistaken, but we can infer whether or not to read an entire 
partition from the query rather than going through all clusterings like 
[here|https://github.com/jrwest/cassandra/compare/deec448d8c1adee8663281708dc5e0060d0fa250...5d63aae5ffba7569811fad6a818163fcf3d26c6f#diff-df913f108e1290500a9afb8cd4638189R114]
  * I've tried ingesting a relatively small SSTable (~300Mb) and old SASI 
didn't seem to have any issues while new version struggles with GC. I'd have to 
do a lot more testing and verification to say what might be the cause.
  * it's be great to have at least a rough description of the index file 
format. We didn't have it before, I know, but if we could make it while at it, 
might be very useful for the future work.
  * given we're fetching clustering (therefore can say which rows are "same" 
across partitions), and data reconciliation is done by controller that goes 
through the "normal" read path, do we need to do re-filter the data in cases of 
non-static clusterings?

> Address rows rather than partitions in SASI
> -------------------------------------------
>
>                 Key: CASSANDRA-11990
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11990
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: CQL, sasi
>            Reporter: Alex Petrov
>            Assignee: Jordan West
>            Priority: Major
>             Fix For: 4.x
>
>         Attachments: perf.pdf, size_comparison.png
>
>
> Currently, the lookup in SASI index would return the key position of the 
> partition. After the partition lookup, the rows are iterated and the 
> operators are applied in order to filter out ones that do not match.
> bq. TokenTree which accepts variable size keys (such would enable different 
> partitioners, collections support, primary key indexing etc.), 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Comment Edited] (CASSANDRA-11990) Address rows rather than partitions in SASI

Reply via email to