[ 
https://issues.apache.org/jira/browse/CASSANDRA-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stu Hood updated CASSANDRA-1472:
--------------------------------

    Attachment: 0.7-1472-v6.tgz

> I renamed KEYS_BITMAP to just BITMAP, fixed some spots that could leak files, 
> and fixed a compaction bug related to 1916 with testcase.
I incorporated your changes into the latest tarball as 0018, and fixed some 
silliness in 0019 and 0020.

> There are some changes in here that seem to be bug fixes for other issues, 
> specifically the changes to CFMetaData.java
Dropped from this patch and moved to CASSANDRA-1962.

> I see in SSTableWriter that BMT will fail on secondary indexed CFs now. Why 
> fail though? Can't they just be built on restart?
Yes, probably. But the naive approach is not very elegant: by the time we see 
the first BMT append, the secondary indexes are already open, so we would need 
to null them out. A better approach would tell the SSTW constructor/factory up 
front that we intend to write without certain component types. I think this 
can go in another ticket?
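
To make the "tell the factory up front" idea concrete, here is a rough sketch 
of what such a factory might look like. Every name in it (the Component enum, 
forFlush, forBulkLoad) is made up for illustration and does not correspond to 
anything in the patch or in trunk.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical names throughout; none of these types come from the Cassandra source.
class ComponentAwareWriterSketch {
    // Illustrative component types an sstable writer might produce.
    enum Component { DATA, PRIMARY_INDEX, FILTER, BITMAP_INDEX }

    private final Set<Component> components;

    private ComponentAwareWriterSketch(Set<Component> components) {
        this.components = components;
    }

    // Normal write path: every component is written.
    public static ComponentAwareWriterSketch forFlush() {
        return new ComponentAwareWriterSketch(EnumSet.allOf(Component.class));
    }

    // Bulk-load path: the secondary index component is excluded at construction,
    // so nothing has to be nulled out when the first append arrives.
    public static ComponentAwareWriterSketch forBulkLoad() {
        return new ComponentAwareWriterSketch(
            EnumSet.complementOf(EnumSet.of(Component.BITMAP_INDEX)));
    }

    // True if this writer was configured to produce the given component.
    public boolean writes(Component c) {
        return components.contains(c);
    }
}
```

With something along these lines, the missing index components could be 
rebuilt later, rather than failing the BMT write outright.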

> The whole BitmapIndexWriter Scratch space has me slightly concerned.
There is an alternative to the layout I've implemented here, but it is slower 
for the most common query type (equality on one bucket), and only slightly 
faster for extremely general index queries (LT/GT involving most/all of the 
buckets). We can measure the actual overhead on a single sstable if you'd like. 
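
For anyone following along, the tradeoff can be illustrated with a toy 
in-memory model that keeps one bitmap per bucket. This is only a sketch under 
that assumption, using java.util.BitSet: it is not the on-disk layout from the 
tarball, and the class and method names are invented for the example.

```java
import java.util.BitSet;

// Hypothetical illustration; not the on-disk layout from the attached patch.
class BucketedBitmapSketch {
    // One bitmap per bucket; bit i is set when row i falls in that bucket.
    private final BitSet[] buckets;
    private final int rows;

    public BucketedBitmapSketch(int[] bucketOfRow, int bucketCount) {
        rows = bucketOfRow.length;
        buckets = new BitSet[bucketCount];
        for (int b = 0; b < bucketCount; b++) {
            buckets[b] = new BitSet(rows);
        }
        for (int row = 0; row < rows; row++) {
            buckets[bucketOfRow[row]].set(row);
        }
    }

    // Equality on one bucket: reads exactly one bitmap.
    public BitSet eq(int bucket) {
        return (BitSet) buckets[bucket].clone();
    }

    // "Less than": ORs together every bucket below the bound,
    // so the cost grows with the number of buckets involved.
    public BitSet lt(int bound) {
        BitSet result = new BitSet(rows);
        for (int b = 0; b < bound && b < buckets.length; b++) {
            result.or(buckets[b]);
        }
        return result;
    }

    public static void main(String[] args) {
        int[] bucketOfRow = {2, 0, 1, 2, 0, 3};
        BucketedBitmapSketch index = new BucketedBitmapSketch(bucketOfRow, 4);
        System.out.println(index.eq(2)); // prints {0, 3}
        System.out.println(index.lt(2)); // prints {1, 2, 4}
    }
}
```

In this model an equality query touches a single bitmap, while a wide LT/GT 
query has to combine one bitmap per bucket in the range.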

> AVRO, I don't see the value here. [...] The value of using our BRAF is you 
> have all the work to avoid polluting the page cache
I could go either way on this point: on one hand, this is an extremely simple 
structure. On the other hand, we get large benefits from compression here, and 
I'm fairly certain we should use Avro for the rest of the sstable.

Also, it's very simple to use our FileDataInput implementations here via Avro's 
SeekableInput interface, so we don't necessarily need to throw away any effort. 
See 
https://github.com/stuhood/cassandra/commit/1a5c9115cb1410519eff15dd3089772b1e550ae7

> I mentioned above that on the fly indexes should be allowed, however this can 
> happen in a subsequent ticket if you prefer.
Yes, I'd prefer that. It will likely be the highest priority of the 4-5 tickets 
we need to create if/when this issue goes in.

> As Nick mentioned it would be nice to have some stats on the index available 
> in JMX, for a subsequent ticket.
Agreed.

> I think this implementation should probably be the only secondary index 
> format we support (What's the value of keeping KEYS over this?)
Agreed, pending the optimizations mentioned in previous comments.

> Add bitmap secondary indexes
> ----------------------------
>
>                 Key: CASSANDRA-1472
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1472
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>             Fix For: 0.7.1
>
>         Attachments: 0.7-1472-v5.tgz, 0.7-1472-v6.tgz, 
> 0019-Rename-bugfixes-and-fileclose.txt, 1472-v3.tgz, 1472-v4.tgz, 
> 1472-v5.tgz, anatomy.png, v4-bench-c32.txt
>
>
> Bitmap indexes are a very efficient structure for dealing with immutable 
> data. We can take advantage of the fact that SSTables are immutable by 
> attaching them directly to SSTables as a new component (supported by 
> CASSANDRA-1471).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.