[jira] Commented: (CASSANDRA-1258) rebuild indexes after streaming

2010-07-30 Thread Nate McCall (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12894247#action_12894247
 ] 

Nate McCall commented on CASSANDRA-1258:


SSTableReader.makeColumnFamily was still going through DD (DatabaseDescriptor) 
in a way that kept indexed column CFs invisible; this came out through 
SSTableSliceIterator. This diff on SSTR will fix it:

561c561
< return ColumnFamily.create(metadata);
---
> return ColumnFamily.create(getTableName(), getColumnFamilyName());

which would have been caught a lot more easily with SSTableWriterTest :-)




 rebuild indexes after streaming
 ---

 Key: CASSANDRA-1258
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1258
 Project: Cassandra
  Issue Type: Sub-task
  Components: Core
Reporter: Jonathan Ellis
Assignee: Nate McCall
 Fix For: 0.7 beta 1

 Attachments: 1258-v4.txt, 1258-v5.txt, 1258-v6.txt, 1258-v7.txt, 
 1258-v8.txt, trunk-1258-src-2.txt, trunk-1258-src-3.txt


 since index CFSes are private, they won't be streamed with other sstables.  
 which is good, because the normal partitioner logic wouldn't stream the right 
 parts anyway.
 seems like the right solution is to extend SSTW.maybeRecover to rebuild 
 indexes as well.  (this has the added benefit of being able to use streaming 
 as a relatively straightforward bulk loader.)
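
A minimal sketch of the idea in the description above, using a toy data model: once a streamed data file has arrived, the receiving node scans its rows and regenerates the index entries locally instead of expecting index sstables to be streamed. Every name here (Row, IndexRebuildSketch, rebuildIndex) is hypothetical and is not Cassandra's API; the hook actually proposed in the ticket is SSTW.maybeRecover.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical toy model, not Cassandra's API: a row is a key plus column name/value pairs.
final class Row {
    final String key;
    final Map<String, String> columns;

    Row(String key, Map<String, String> columns) {
        this.key = key;
        this.columns = columns;
    }
}

final class IndexRebuildSketch {
    // Build a "value -> row keys" index for one indexed column by scanning the
    // rows of a newly received data file. Rows without the column are skipped.
    static Map<String, List<String>> rebuildIndex(List<Row> streamedRows, String indexedColumn) {
        Map<String, List<String>> index = new TreeMap<>();
        for (Row row : streamedRows) {
            String value = row.columns.get(indexedColumn);
            if (value == null) {
                continue;
            }
            index.computeIfAbsent(value, v -> new ArrayList<>()).add(row.key);
        }
        return index;
    }

    public static void main(String[] args) {
        List<Row> rows = List.of(
            new Row("k1", Map.of("state", "TX")),
            new Row("k2", Map.of("state", "CA")),
            new Row("k3", Map.of("state", "TX", "age", "30")),
            new Row("k4", Map.of("age", "41")));
        System.out.println(rebuildIndex(rows, "state"));
    }
}
```

Because the index is derived purely from the received rows, the same pass would also work for data files produced by an external tool, which is the bulk-loader benefit the description mentions.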

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (CASSANDRA-1258) rebuild indexes after streaming

2010-07-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12893606#action_12893606
 ] 

Jonathan Ellis commented on CASSANDRA-1258:
---

thanks

but now it occurs to me, since the root of the problem is SSTableReader, 
shouldn't we push the comparator in there, instead of using if statements to 
avoid calling SSTR.getComparator?

sorry for the run-around...
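
To illustrate the suggestion, here is a minimal sketch in which the reader is handed its comparator when it is opened, so callers need no conditional around a global lookup. The names (GlobalConfig, ReaderSketch, comparatorFor) are hypothetical stand-ins and are not the actual SSTableReader or DatabaseDescriptor code.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a global config registry (the role DD plays in this thread).
final class GlobalConfig {
    private static final Map<String, Comparator<String>> COMPARATORS = new HashMap<>();

    static void register(String cfName, Comparator<String> comparator) {
        COMPARATORS.put(cfName, comparator);
    }

    // Returns null for column families that were never registered (e.g. private ones).
    static Comparator<String> comparatorFor(String cfName) {
        return COMPARATORS.get(cfName);
    }
}

// Hypothetical reader that owns its comparator instead of asking the registry.
final class ReaderSketch {
    private final String cfName;
    private final Comparator<String> comparator;

    ReaderSketch(String cfName, Comparator<String> comparator) {
        this.cfName = cfName;
        this.comparator = comparator;
    }

    String getColumnFamilyName() {
        return cfName;
    }

    // No lookup and no special case: the value was supplied when the reader was opened.
    Comparator<String> getComparator() {
        return comparator;
    }
}

final class ReaderSketchDemo {
    public static void main(String[] args) {
        GlobalConfig.register("Users", Comparator.naturalOrder());
        // The private index CF is never registered globally, yet its reader still works.
        ReaderSketch indexReader = new ReaderSketch("Users.state_idx", Comparator.reverseOrder());
        System.out.println(GlobalConfig.comparatorFor("Users.state_idx") == null);
        System.out.println(indexReader.getComparator().compare("a", "b") > 0);
    }
}
```

The trade-off is that every code path that opens a reader must now supply the comparator, which widens the change to each of those call sites.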




[jira] Commented: (CASSANDRA-1258) rebuild indexes after streaming

2010-07-29 Thread Nate McCall (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12893712#action_12893712
 ] 

Nate McCall commented on CASSANDRA-1258:


Initially I was hesitant to mess with the IO API since most stuff there went 
through DD to get information. Having a member for the comparator (CFMD as 
well?) on SSTR would make the sstable plumbing underneath a lot cleaner. 




[jira] Commented: (CASSANDRA-1258) rebuild indexes after streaming

2010-07-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12893768#action_12893768
 ] 

Jonathan Ellis commented on CASSANDRA-1258:
---

going through DD is a wart, not a feature :)




[jira] Commented: (CASSANDRA-1258) rebuild indexes after streaming

2010-07-29 Thread Nate McCall (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12893870#action_12893870
 ] 

Nate McCall commented on CASSANDRA-1258:


I would like to stick with the approach in v7 for the scope of this ticket. I 
just took a stab at the above (CFMD at SSTR creation time to provide the 
comparator and CFMD).

The changes are starting to reach into a lot of places: Memtable/BinaryMem., 
CompactionMgr (for SSTW.closeAndReopenReader), SST export, etc. 

I tried an initial set of changes with a fallback to DD when no CFMD was 
provided and got into a weird race condition that hung junit/ant. 

I'd like to put in a new ticket for 'SSTable initialization cleanup to avoid DD 
usage' for the above if you're cool with that, primarily so I can stick a fork in 
this and knock out the thrift update. 




[jira] Commented: (CASSANDRA-1258) rebuild indexes after streaming

2010-07-27 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892879#action_12892879
 ] 

Jonathan Ellis commented on CASSANDRA-1258:
---

(I goofed and did not include SSTWT in v4, but it's unchanged from v3.)




[jira] Commented: (CASSANDRA-1258) rebuild indexes after streaming

2010-07-27 Thread Nate McCall (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892910#action_12892910
 ] 

Nate McCall commented on CASSANDRA-1258:


So the flush, by putting this on disk in an sstable, triggers the loading of 
IColumnIterators via line 952 on ColumnFamilyStore. Without a flush, no SSTRs 
are present.

The issue with this is that DatabaseDescriptor (via getComparator() called via 
the line above) does not know about the private indexed CFs. 

Given the above, I don't think this has ever worked outside of a test harness of 
some sort (i.e. after an indexed CF is flushed and the call stack for CFS.scan is 
invoked). 

Should DatabaseDescriptor look into the metadata to see if this is an indexed 
column and return the comparator that way?
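
A rough sketch of the question above, under assumptions: a descriptor that registers only public column families misses a private index CF, while a lookup that also consults the parent CF's metadata can still find a comparator. All names (CfMetaSketch, DescriptorSketch, comparatorWithIndexLookup) and the "<parent>.<column>" naming convention are hypothetical, chosen for illustration only.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

// Hypothetical metadata for a column family: its own comparator plus one per indexed column.
final class CfMetaSketch {
    final Comparator<String> comparator;
    final Map<String, Comparator<String>> indexComparators = new HashMap<>();

    CfMetaSketch(Comparator<String> comparator) {
        this.comparator = comparator;
    }
}

// Hypothetical stand-in for the descriptor; only public column families are registered.
final class DescriptorSketch {
    private final Map<String, CfMetaSketch> publicCfs = new HashMap<>();

    void register(String cfName, CfMetaSketch meta) {
        publicCfs.put(cfName, meta);
    }

    // Naive lookup: private index CFs are not registered, so this returns null for them.
    Comparator<String> naiveComparator(String cfName) {
        CfMetaSketch meta = publicCfs.get(cfName);
        return meta == null ? null : meta.comparator;
    }

    // Lookup that also checks the parent's metadata. Assumes, for illustration only,
    // that an index CF is named "<parent>.<column>".
    Comparator<String> comparatorWithIndexLookup(String cfName) {
        Comparator<String> direct = naiveComparator(cfName);
        if (direct != null) {
            return direct;
        }
        int dot = cfName.indexOf('.');
        if (dot < 0) {
            return null;
        }
        CfMetaSketch parent = publicCfs.get(cfName.substring(0, dot));
        return parent == null ? null : parent.indexComparators.get(cfName.substring(dot + 1));
    }
}

final class DescriptorSketchDemo {
    public static void main(String[] args) {
        DescriptorSketch dd = new DescriptorSketch();
        CfMetaSketch users = new CfMetaSketch(Comparator.naturalOrder());
        users.indexComparators.put("state", Comparator.reverseOrder());
        dd.register("Users", users);
        System.out.println(dd.naiveComparator("Users.state") == null);
        System.out.println(dd.comparatorWithIndexLookup("Users.state") != null);
    }
}
```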




[jira] Commented: (CASSANDRA-1258) rebuild indexes after streaming

2010-07-23 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891679#action_12891679
 ] 

Jonathan Ellis commented on CASSANDRA-1258:
---

Seems like the best place to put this code is in SSTR.recoverAndOpen.  no?

What is the point of the refactoring to CFS constructor?  If it's not necessary 
for the feature, let's keep refactoring and new-feature-code in separate 
patches.




[jira] Commented: (CASSANDRA-1258) rebuild indexes after streaming

2010-07-23 Thread Nate McCall (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891683#action_12891683
 ] 

Nate McCall commented on CASSANDRA-1258:


I had started adding this on SSTR initially, but there wasn't any other query 
stuff going on there, so it seemed out of place - not a very compelling 
argument, but this was my first time with the plumbing. I have no problem 
moving it to SSTR - so let me know.

I had forgotten I left the constructor in there (I had started messing around 
with creating indexes after the fact and meant to take it out). Let me know 
about SSTR and I'll rebase and get rid of the constructor.



 




[jira] Commented: (CASSANDRA-1258) rebuild indexes after streaming

2010-07-23 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891688#action_12891688
 ] 

Jonathan Ellis commented on CASSANDRA-1258:
---

give the SSTR approach a try, see if it works out.
