[jira] Commented: (CASSANDRA-1258) rebuild indexes after streaming
[ https://issues.apache.org/jira/browse/CASSANDRA-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12894247#action_12894247 ]

Nate McCall commented on CASSANDRA-1258:

SSTableReader.makeColumnFamily was still going through DD in a way that kept Indexed column CFs invisible - this came out through SSTableSliceIterator. This diff on SSTR will fix it:

    561c561
    return ColumnFamily.create(metadata);
    ---
    return ColumnFamily.create(getTableName(), getColumnFamilyName());

which would have been caught a lot easier with SSTableWriterTest :-)

rebuild indexes after streaming
---
Key: CASSANDRA-1258
URL: https://issues.apache.org/jira/browse/CASSANDRA-1258
Project: Cassandra
Issue Type: Sub-task
Components: Core
Reporter: Jonathan Ellis
Assignee: Nate McCall
Fix For: 0.7 beta 1
Attachments: 1258-v4.txt, 1258-v5.txt, 1258-v6.txt, 1258-v7.txt, 1258-v8.txt, trunk-1258-src-2.txt, trunk-1258-src-3.txt

since index CFSes are private, they won't be streamed with other sstables. which is good, because the normal partitioner logic wouldn't stream the right parts anyway. seems like the right solution is to extend SSTW.maybeRecover to rebuild indexes as well. (this has the added benefit of being able to use streaming as a relatively straightforward bulk loader.)

--
This message is automatically generated by JIRA.
- You can reply to this email to add a comment to the issue online.
[jira] Commented: (CASSANDRA-1258) rebuild indexes after streaming
[ https://issues.apache.org/jira/browse/CASSANDRA-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12893606#action_12893606 ]

Jonathan Ellis commented on CASSANDRA-1258:

thanks but now it occurs to me, since the root of the problem is SSTableReader, shouldn't we push the comparator in there, instead of using if statements to avoid calling SSTR.getComparator? sorry for the run-around...
[jira] Commented: (CASSANDRA-1258) rebuild indexes after streaming
[ https://issues.apache.org/jira/browse/CASSANDRA-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12893712#action_12893712 ]

Nate McCall commented on CASSANDRA-1258:

Initially I was hesitant to mess with the IO api since most stuff there went through DD to get information. Having a member for the comparator (CFMD as well?) on SSTR would make the sstable plumbing underneath a lot cleaner.
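
A minimal sketch of the idea in this comment, assuming hypothetical stand-in types (`CfMeta`, `Reader`) rather than Cassandra's actual CFMetaData and SSTableReader classes: the reader is handed its metadata at construction time, so its comparator no longer depends on a global DatabaseDescriptor lookup.

```java
import java.util.Comparator;
import java.util.Objects;

// Hypothetical stand-ins for illustration; these are not Cassandra's real classes.
final class ReaderComparatorSketch {

    /** Minimal metadata: a column family name plus its column comparator. */
    static final class CfMeta {
        final String cfName;
        final Comparator<String> comparator;

        CfMeta(String cfName, Comparator<String> comparator) {
            this.cfName = Objects.requireNonNull(cfName);
            this.comparator = Objects.requireNonNull(comparator);
        }
    }

    /** Reader that carries its own metadata instead of asking a global registry. */
    static final class Reader {
        private final CfMeta metadata;

        Reader(CfMeta metadata) {
            this.metadata = Objects.requireNonNull(metadata);
        }

        String getColumnFamilyName() {
            return metadata.cfName;
        }

        Comparator<String> getComparator() {
            return metadata.comparator;
        }
    }

    public static void main(String[] args) {
        // A private index CF works like any other CF, because nothing is looked up by name.
        CfMeta indexMeta = new CfMeta("Standard1.birthdate_idx", Comparator.reverseOrder());
        Reader reader = new Reader(indexMeta);
        System.out.println(reader.getColumnFamilyName());
        System.out.println(reader.getComparator().compare("a", "b") > 0); // reverse order
    }
}
```

The trade-off is that every call site creating a reader must have the metadata in hand, which is why a change like this tends to spread through the code that opens sstables.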
[jira] Commented: (CASSANDRA-1258) rebuild indexes after streaming
[ https://issues.apache.org/jira/browse/CASSANDRA-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12893768#action_12893768 ]

Jonathan Ellis commented on CASSANDRA-1258:

going through DD is a wart, not a feature :)
[jira] Commented: (CASSANDRA-1258) rebuild indexes after streaming
[ https://issues.apache.org/jira/browse/CASSANDRA-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12893870#action_12893870 ]

Nate McCall commented on CASSANDRA-1258:

I would like to stick with the approach in v7 for the scope of this ticket. I just took a stab at the above (passing CFMD at SSTR creation time to provide the comparator and CFMD). The changes are starting to reach into a lot of places: Memtable/BinaryMem., CompactionMgr (for SSTW.closeAndReopenReader), SST export, etc. I tried an initial set of changes with a fallback to DD when no CFMD was provided and got into a weird race condition that hung junit/ant.

I'd like to put in a new ticket for 'SSTable initialization cleanup to avoid DD usage' for the above if you're cool with that, primarily so I can stick a fork in this and knock out the thrift update.
[jira] Commented: (CASSANDRA-1258) rebuild indexes after streaming
[ https://issues.apache.org/jira/browse/CASSANDRA-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892879#action_12892879 ]

Jonathan Ellis commented on CASSANDRA-1258:

(I goofed and did not include SSTWT in v4, but it's unchanged from v3.)
[jira] Commented: (CASSANDRA-1258) rebuild indexes after streaming
[ https://issues.apache.org/jira/browse/CASSANDRA-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892910#action_12892910 ]

Nate McCall commented on CASSANDRA-1258:

So the flush, by putting this on disk in an sstable, triggers the loading of IColumnIterators via line 952 on ColumnFamilyStore. Without a flush, no SSTRs are present. The issue with this is that DatabaseDescriptor (via getComparator(), called from the line above) does not know about the private indexed CFs.

Given the above, I don't think this has ever worked outside of a test harness of some sort (i.e. after an indexed CF is flushed and the call stack for CFS.scan is invoked). Should DatabaseDescriptor look into the metadata to see if this is an indexed column and return the comparator that way?
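
To illustrate the failure described above, here is a hedged sketch using hypothetical stand-in types (`Registry`, `ParentMeta`, `resolve`), not Cassandra's real DatabaseDescriptor API. It assumes a global registry that only knows publicly declared column families, so a lookup for a private index CF misses, and it shows the kind of fallback the closing question asks about: consulting the parent's metadata when the global lookup finds nothing.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-ins for illustration; these are not Cassandra's real classes.
final class GlobalLookupSketch {

    /** Global registry that only knows publicly declared column families. */
    static final class Registry {
        private final Map<String, Comparator<String>> publicCfs = new HashMap<>();

        void register(String cfName, Comparator<String> comparator) {
            publicCfs.put(cfName, comparator);
        }

        Optional<Comparator<String>> getComparator(String cfName) {
            return Optional.ofNullable(publicCfs.get(cfName));
        }
    }

    /** Parent CF metadata that privately owns the comparators of its index CFs. */
    static final class ParentMeta {
        private final Map<String, Comparator<String>> indexComparators = new HashMap<>();

        void addIndex(String indexCfName, Comparator<String> comparator) {
            indexComparators.put(indexCfName, comparator);
        }

        Optional<Comparator<String>> indexComparator(String indexCfName) {
            return Optional.ofNullable(indexComparators.get(indexCfName));
        }
    }

    /** Try the global registry first, then fall back to the parent's index metadata. */
    static Optional<Comparator<String>> resolve(Registry registry, ParentMeta parent, String cfName) {
        Optional<Comparator<String>> direct = registry.getComparator(cfName);
        return direct.isPresent() ? direct : parent.indexComparator(cfName);
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        registry.register("Standard1", Comparator.naturalOrder());

        ParentMeta parent = new ParentMeta();
        parent.addIndex("Standard1.birthdate_idx", Comparator.naturalOrder());

        // The plain global lookup cannot see the private index CF.
        System.out.println(registry.getComparator("Standard1.birthdate_idx").isPresent()); // false
        // The metadata-aware fallback can.
        System.out.println(resolve(registry, parent, "Standard1.birthdate_idx").isPresent()); // true
    }
}
```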
[jira] Commented: (CASSANDRA-1258) rebuild indexes after streaming
[ https://issues.apache.org/jira/browse/CASSANDRA-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891679#action_12891679 ]

Jonathan Ellis commented on CASSANDRA-1258:

Seems like the best place to put this code is in SSTR.recoverAndOpen. no?

What is the point of the refactoring to CFS constructor? If it's not necessary for the feature, let's keep refactoring and new-feature-code in separate patches.
[jira] Commented: (CASSANDRA-1258) rebuild indexes after streaming
[ https://issues.apache.org/jira/browse/CASSANDRA-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891683#action_12891683 ]

Nate McCall commented on CASSANDRA-1258:

I had started adding this on SSTR initially, but there wasn't any other query stuff going on there, so it seemed out of place - not a very compelling argument, but this was my first time with the plumbing. I have no problem moving it to SSTR - so let me know.

I had forgotten I left the constructor in there (I had started messing around with creating indexes after the fact and meant to take it out). Let me know about SSTR and I'll rebase and get rid of the constructor.
[jira] Commented: (CASSANDRA-1258) rebuild indexes after streaming
[ https://issues.apache.org/jira/browse/CASSANDRA-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891688#action_12891688 ]

Jonathan Ellis commented on CASSANDRA-1258:

give the SSTR approach a try, see if it works out.