[jira] [Updated] (CASSANDRA-1608) Redesigned Compaction

2011-06-17 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-1608:
--

Attachment: 1608-v2.txt

Thanks, Ben. This is promising!

I pretty much concentrated on the Manifest, which I moved to a top-level 
class.  (Can you summarize what is different in LDBCompactionTask?)

I don't think trying to build levels out of non-leveled data is useful.  Even 
if you tried all permutations, the odds of ending up with something useful are 
infinitesimally small.  I'd suggest adding a startup hook instead to 
CompactionStrategy, and if we start up w/ unleveled SSTables we level them 
before doing anything else.  (This will take a while, but not as long as 
leveling everything naively would, since we can just do a single 
compaction-of-everything, spitting out non-overlapping sstables of the desired 
size, and set those to the appropriate level.)

Updated DataTracker to add streamed sstables to level 0.  DataTracker public 
API probably needs a more thorough look though to see if we're missing 
anything. (Speaking of streaming, I think we do need to go by data size not 
sstable count b/c streamed sstables from repair can be arbitrarily large or 
small.)

In promote, do we need to check for all the removed ones being on the same 
level?  I can't think of a scenario where we're not merging from multiple 
levels.  If so I'd change that to an assert.  (In fact there should be exactly 
two levels involved, right?)

Did some surgery on getCompactionCandidates.  Generally renamed things to be 
more succinct. Feels like getCompactionCandidates should do lower levels 
before doing higher levels?

We'll also need to think about which parts of the strategy/manifest need to be 
threadsafe. (All of them?)  Should definitely document this in AbstractCS.


 Redesigned Compaction
 -

 Key: CASSANDRA-1608
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1608
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Chris Goffinet
 Attachments: 0001-leveldb-style-compaction.patch, 1608-v2.txt


 After seeing the I/O issues in CASSANDRA-1470, I've been doing some more 
 thinking on this subject that I wanted to lay out.
 I propose we redo the concept of how compaction works in Cassandra. At the 
 moment, compaction is kicked off based on a write access pattern, not read 
 access pattern. In most cases, you want the opposite. You want to be able to 
 track how well each SSTable is performing in the system. If we were to keep 
 statistics in-memory of each SSTable, prioritize them based on most accessed, 
 and bloom filter hit/miss ratios, we could intelligently group sstables that 
 are being read most often and schedule them for compaction. We could also 
 schedule lower-priority maintenance on SSTables that are not often accessed.
 I also propose we limit each SSTable to a fixed size; that gives 
 us the ability to better utilize our bloom filters in a predictable manner. 
 At the moment, after a certain size, the bloom filters become less reliable. 
 This would also allow us to group the most-accessed data. Currently the size of 
 an SSTable can grow to a point where large portions of the data might not 
 actually be accessed as often.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Assigned] (CASSANDRA-2785) should export JAVA variable in the bin/cassandra and use that in the cassandra-env.sh when check for the java version

2011-06-17 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis reassigned CASSANDRA-2785:
-

Assignee: paul cannon

 should export JAVA variable in the bin/cassandra and use that in the 
 cassandra-env.sh when check for the java version
 -

 Key: CASSANDRA-2785
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2785
 Project: Cassandra
  Issue Type: Bug
Reporter: Jackson Chung
Assignee: paul cannon

 I forget which JIRA added this java -version check to cassandra-env.sh 
 (for adding jamm as the javaagent), but we should probably use the variable 
 JAVA set in bin/cassandra (it will need to be exported) and use $JAVA instead 
 of java in cassandra-env.sh.
 In a situation where JAVA_HOME has been properly set to Sun's java 
 but the PATH still has OpenJDK's java in front, the check will fail to 
 add the jamm.jar, even though the Cassandra JVM is properly started via 
 Sun's java.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2589) row deletes do not remove columns

2011-06-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13050905#comment-13050905
 ] 

Jonathan Ellis commented on CASSANDRA-2589:
---

bq. I'm passing Integer.MIN_VALUE for the gcBefore so I thought it would only 
remove columns if they were under a CF tombstone. 

Ah, you're right.

bq. One of the issues I ran into is that while it seems technically correct 
to purge a tombstone after GCGraceSeconds, if it is not written into an SSTable 
it's lost. 

Not sure what you mean. Yes, it's lost, but why would we not want to lose a 
tombstone-older-than-GCGrace?

 row deletes do not remove columns
 -

 Key: CASSANDRA-2589
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2589
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Aaron Morton
Assignee: Aaron Morton
Priority: Minor
 Fix For: 0.8.2

 Attachments: 
 0001-remove-deleted-columns-before-flushing-memtable-v07.patch, 
 0001-remove-deleted-columns-before-flushing-memtable-v08.patch


 When a row delete is issued, CF.delete() sets the localDeletionTime and 
 markedForDeleteAt values but does not remove columns which have a lower 
 timestamp. As a result:
 # Memory which could be freed is held on to (prob not too bad as it's already 
 counted)
 # The deleted columns are serialised to disk, along with the CF info to say 
 they are no longer valid. 
 # NamesQueryFilter and SliceQueryFilter have to do more work as they filter 
 out the irrelevant columns using QueryFilter.isRelevant()
 # Also columns written with a lower timestamp after the deletion are added 
 to the CF without checking markedForDeleteAt.
 This can cause RR to fail, will create another ticket for that and link. This 
 ticket is for a fix to removing the columns. 
 Two options I could think of:
 # Check for deletion when serialising to SSTable and ignore columns if they 
 have a lower timestamp. Otherwise leave as is so dead columns stay in memory. 
 # Ensure at all times if the CF is deleted all columns it contains have a 
 higher timestamp. 
 ## I *think* this would include all column types (DeletedColumn as well) as 
 the CF deletion has the same effect. But not sure.
 ## Deleting (potentially) all columns in delete() will take time. Could track 
 the highest timestamp in the CF so the normal case of deleting all cols does 
 not need to iterate. 
  

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-1608) Redesigned Compaction

2011-06-17 Thread Benjamin Coverston (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13050908#comment-13050908
 ] 

Benjamin Coverston commented on CASSANDRA-1608:
---

The LDBCompaction task was changed to limit the size of the SSTables that are 
output by the compaction itself. Once the size of the rows compacted exceeds 
the default size in MB, it creates a new SSTable:



if (position > cfs.metadata.getMemtableThroughputInMb() * 1024 * 1024
    || nni.hasNext() == false)
{
    // the current output sstable has reached the target size, or the input is
    // exhausted: close this writer and start a new SSTable


It feels like a bit of a hack because an optimal flush size may not always be 
an optimal storage size, but my goal was to try to keep the SSTable size in a 
reasonably small range to make compactions into level 1 fast.

I'll make some more modifications to the manifest s.t. there is a single path 
for getting new SSTables (flushed and streamed) into the manifest. I found a 
bug on the plane today where they were getting added to the manifest, but they 
weren't being added to the queue that I was adding flushed SSTables to. I'll 
get that into my next revision.


In promote, do we need to check for all the removed ones being on the same 
level? I can't think of a scenario where we're not merging from multiple 
levels. If so I'd change that to an assert. (In fact there should be exactly 
two levels involved, right?)


I considered this. There are some boundary cases where every SSTable that gets 
compacted will be in the same level. Most of them have to do with L+1 being 
empty. Also sending the SSTables through the same compaction path will evict 
expired tombstones before they end up in the next level where compactions 
become increasingly unlikely.


Did some surgery on getCompactionCandidates. Generally renamed things to be 
more succinct. Feels like getCompactionCandidates should do lower levels 
before doing higher levels?


Let's just say my naming conventions have been shaped by different influences 
:) I wouldn't object to any of the new names you chose however.

RE: the order, it does feel like we should do lower levels before higher 
levels, however one thing that we have to do is make sure that level-1 stays at 
10 SSTables. The algorithm dictates that all of the level-0 candidates get 
compacted with all of the candidates at level-1. This means that you need to 
promote out of level-1 so that it is ~10 SSTables before you schedule a 
compaction for level-0 promotion. Right now, tuning this so that it is 
performant is the biggest hurdle. I have made some improvements by watching the 
CompactionExecutor, but I have a feeling that making this work is going to 
require some subtle manipulation of the way the CompactionExecutor handles 
tasks.



 Redesigned Compaction
 -

 Key: CASSANDRA-1608
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1608
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Chris Goffinet
 Attachments: 0001-leveldb-style-compaction.patch, 1608-v2.txt


 After seeing the I/O issues in CASSANDRA-1470, I've been doing some more 
 thinking on this subject that I wanted to lay out.
 I propose we redo the concept of how compaction works in Cassandra. At the 
 moment, compaction is kicked off based on a write access pattern, not read 
 access pattern. In most cases, you want the opposite. You want to be able to 
 track how well each SSTable is performing in the system. If we were to keep 
 statistics in-memory of each SSTable, prioritize them based on most accessed, 
 and bloom filter hit/miss ratios, we could intelligently group sstables that 
 are being read most often and schedule them for compaction. We could also 
 schedule lower-priority maintenance on SSTables that are not often accessed.
 I also propose we limit each SSTable to a fixed size; that gives 
 us the ability to better utilize our bloom filters in a predictable manner. 
 At the moment, after a certain size, the bloom filters become less reliable. 
 This would also allow us to group the most-accessed data. Currently the size of 
 an SSTable can grow to a point where large portions of the data might not 
 actually be accessed as often.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2780) sstable2json needs to escape quotes

2011-06-17 Thread Timo Nentwig (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13050923#comment-13050923
 ] 

Timo Nentwig commented on CASSANDRA-2780:
-

The patch will replace existing \ with \\ which may not be the desired 
behaviour.

Alternative (regex should be precompiled, of course):
String.format(\%s\, val.replaceAll((?!)\, \));
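For illustration, a minimal sketch of the escaping approach described above, with the 
pattern precompiled as suggested. The class name and the exact lookbehind regex are 
assumptions for the sketch, not the attached patch:

{code}
import java.util.regex.Pattern;

public class QuoteEscaper
{
    // Matches a double quote that is not already preceded by a backslash,
    // so quotes that were escaped in the source value are left alone.
    private static final Pattern UNESCAPED_QUOTE = Pattern.compile("(?<!\\\\)\"");

    public static String quoteAsJson(String val)
    {
        // Escape only the unescaped quotes, then wrap the value in quotes.
        String escaped = UNESCAPED_QUOTE.matcher(val).replaceAll("\\\\\"");
        return String.format("\"%s\"", escaped);
    }

    public static void main(String[] args)
    {
        // {"foo":"bar"}  ->  "{\"foo\":\"bar\"}"
        System.out.println(quoteAsJson("{\"foo\":\"bar\"}"));
    }
}
{code}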
 

 sstable2json needs to escape quotes
 ---

 Key: CASSANDRA-2780
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2780
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Timo Nentwig
Assignee: Pavel Yaskevich
Priority: Minor
 Fix For: 0.8.2

 Attachments: CASSANDRA-2780.patch


 [default@foo] set transactions[test][data]='{foo:bar}'; 
 $ cat /tmp/json
 {
 74657374: [[data, {foo:bar}, 1308209845388000]]
 }
 $ ./json2sstable -s -c transactions -K foo /tmp/json /tmp/ss-g-1-Data.db
 Counting keys to import, please wait... (NOTE: to skip this use -n num_keys)
 org.codehaus.jackson.JsonParseException: Unexpected character ('f' (code 
 102)): was expecting comma to separate ARRAY entries
  at [Source: /tmp/json2; line: 2, column: 27]
   at org.codehaus.jackson.JsonParser._constructError(JsonParser.java:929)
   at 
 org.codehaus.jackson.impl.JsonParserBase._reportError(JsonParserBase.java:632)
   at 
 org.codehaus.jackson.impl.JsonParserBase._reportUnexpectedChar(JsonParserBase.java:565)
   at 
 org.codehaus.jackson.impl.Utf8StreamParser.nextToken(Utf8StreamParser.java:128)
   at 
 org.codehaus.jackson.impl.JsonParserBase.skipChildren(JsonParserBase.java:263)
   at 
 org.apache.cassandra.tools.SSTableImport.importSorted(SSTableImport.java:328)
   at 
 org.apache.cassandra.tools.SSTableImport.importJson(SSTableImport.java:252)
   at org.apache.cassandra.tools.SSTableImport.main(SSTableImport.java:476)
 ERROR: Unexpected character ('f' (code 102)): was expecting comma to separate 
 ARRAY entries
  at [Source: /tmp/json2; line: 2, column: 27]
 http://www.mail-archive.com/user@cassandra.apache.org/msg14257.html

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-2780) sstable2json needs to escape quotes

2011-06-17 Thread Timo Nentwig (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13050923#comment-13050923
 ] 

Timo Nentwig edited comment on CASSANDRA-2780 at 6/17/11 8:10 AM:
--

The patch will replace existing \ with  which may not be the desired 
behaviour.

Alternative (regex should be precompiled, of course):
String.format(\%s\, val.replaceAll((?!)\, \));
 

  was (Author: tcn):
The patch will replace existing \ with \\ which may not be the desired 
behaviour.

Alternative (regex should be precompiled, of course):
String.format(\%s\, val.replaceAll((?!)\, \));
 
  
 sstable2json needs to escape quotes
 ---

 Key: CASSANDRA-2780
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2780
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Timo Nentwig
Assignee: Pavel Yaskevich
Priority: Minor
 Fix For: 0.8.2

 Attachments: CASSANDRA-2780.patch


 [default@foo] set transactions[test][data]='{foo:bar}'; 
 $ cat /tmp/json
 {
 74657374: [[data, {foo:bar}, 1308209845388000]]
 }
 $ ./json2sstable -s -c transactions -K foo /tmp/json /tmp/ss-g-1-Data.db
 Counting keys to import, please wait... (NOTE: to skip this use -n num_keys)
 org.codehaus.jackson.JsonParseException: Unexpected character ('f' (code 
 102)): was expecting comma to separate ARRAY entries
  at [Source: /tmp/json2; line: 2, column: 27]
   at org.codehaus.jackson.JsonParser._constructError(JsonParser.java:929)
   at 
 org.codehaus.jackson.impl.JsonParserBase._reportError(JsonParserBase.java:632)
   at 
 org.codehaus.jackson.impl.JsonParserBase._reportUnexpectedChar(JsonParserBase.java:565)
   at 
 org.codehaus.jackson.impl.Utf8StreamParser.nextToken(Utf8StreamParser.java:128)
   at 
 org.codehaus.jackson.impl.JsonParserBase.skipChildren(JsonParserBase.java:263)
   at 
 org.apache.cassandra.tools.SSTableImport.importSorted(SSTableImport.java:328)
   at 
 org.apache.cassandra.tools.SSTableImport.importJson(SSTableImport.java:252)
   at org.apache.cassandra.tools.SSTableImport.main(SSTableImport.java:476)
 ERROR: Unexpected character ('f' (code 102)): was expecting comma to separate 
 ARRAY entries
  at [Source: /tmp/json2; line: 2, column: 27]
 http://www.mail-archive.com/user@cassandra.apache.org/msg14257.html

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-2780) sstable2json needs to escape quotes

2011-06-17 Thread Timo Nentwig (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13050923#comment-13050923
 ] 

Timo Nentwig edited comment on CASSANDRA-2780 at 6/17/11 8:10 AM:
--

The patch will replace existing \ with \\\ which may not be the desired 
behaviour.

Alternative (regex should be precompiled, of course):
String.format(\%s\, val.replaceAll((?!)\, \));
 

  was (Author: tcn):
The patch will replace existing \ with \\ which may not be the desired 
behaviour.

Alternative (regex should be precompiled, of course):
String.format(\%s\, val.replaceAll((?!)\, \));
 
  
 sstable2json needs to escape quotes
 ---

 Key: CASSANDRA-2780
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2780
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Timo Nentwig
Assignee: Pavel Yaskevich
Priority: Minor
 Fix For: 0.8.2

 Attachments: CASSANDRA-2780.patch


 [default@foo] set transactions[test][data]='{foo:bar}'; 
 $ cat /tmp/json
 {
 74657374: [[data, {foo:bar}, 1308209845388000]]
 }
 $ ./json2sstable -s -c transactions -K foo /tmp/json /tmp/ss-g-1-Data.db
 Counting keys to import, please wait... (NOTE: to skip this use -n num_keys)
 org.codehaus.jackson.JsonParseException: Unexpected character ('f' (code 
 102)): was expecting comma to separate ARRAY entries
  at [Source: /tmp/json2; line: 2, column: 27]
   at org.codehaus.jackson.JsonParser._constructError(JsonParser.java:929)
   at 
 org.codehaus.jackson.impl.JsonParserBase._reportError(JsonParserBase.java:632)
   at 
 org.codehaus.jackson.impl.JsonParserBase._reportUnexpectedChar(JsonParserBase.java:565)
   at 
 org.codehaus.jackson.impl.Utf8StreamParser.nextToken(Utf8StreamParser.java:128)
   at 
 org.codehaus.jackson.impl.JsonParserBase.skipChildren(JsonParserBase.java:263)
   at 
 org.apache.cassandra.tools.SSTableImport.importSorted(SSTableImport.java:328)
   at 
 org.apache.cassandra.tools.SSTableImport.importJson(SSTableImport.java:252)
   at org.apache.cassandra.tools.SSTableImport.main(SSTableImport.java:476)
 ERROR: Unexpected character ('f' (code 102)): was expecting comma to separate 
 ARRAY entries
  at [Source: /tmp/json2; line: 2, column: 27]
 http://www.mail-archive.com/user@cassandra.apache.org/msg14257.html

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-2780) sstable2json needs to escape quotes

2011-06-17 Thread Timo Nentwig (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13050923#comment-13050923
 ] 

Timo Nentwig edited comment on CASSANDRA-2780 at 6/17/11 8:10 AM:
--

The patch will replace existing \ with \\ which may not be the desired 
behaviour.

Alternative (regex should be precompiled, of course):
String.format(\%s\, val.replaceAll((?!)\, \));
 

  was (Author: tcn):
The patch will replace existing \ with  which may not be the desired 
behaviour.

Alternative (regex should be precompiled, of course):
String.format(\%s\, val.replaceAll((?!)\, \));
 
  
 sstable2json needs to escape quotes
 ---

 Key: CASSANDRA-2780
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2780
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Timo Nentwig
Assignee: Pavel Yaskevich
Priority: Minor
 Fix For: 0.8.2

 Attachments: CASSANDRA-2780.patch


 [default@foo] set transactions[test][data]='{foo:bar}'; 
 $ cat /tmp/json
 {
 74657374: [[data, {foo:bar}, 1308209845388000]]
 }
 $ ./json2sstable -s -c transactions -K foo /tmp/json /tmp/ss-g-1-Data.db
 Counting keys to import, please wait... (NOTE: to skip this use -n num_keys)
 org.codehaus.jackson.JsonParseException: Unexpected character ('f' (code 
 102)): was expecting comma to separate ARRAY entries
  at [Source: /tmp/json2; line: 2, column: 27]
   at org.codehaus.jackson.JsonParser._constructError(JsonParser.java:929)
   at 
 org.codehaus.jackson.impl.JsonParserBase._reportError(JsonParserBase.java:632)
   at 
 org.codehaus.jackson.impl.JsonParserBase._reportUnexpectedChar(JsonParserBase.java:565)
   at 
 org.codehaus.jackson.impl.Utf8StreamParser.nextToken(Utf8StreamParser.java:128)
   at 
 org.codehaus.jackson.impl.JsonParserBase.skipChildren(JsonParserBase.java:263)
   at 
 org.apache.cassandra.tools.SSTableImport.importSorted(SSTableImport.java:328)
   at 
 org.apache.cassandra.tools.SSTableImport.importJson(SSTableImport.java:252)
   at org.apache.cassandra.tools.SSTableImport.main(SSTableImport.java:476)
 ERROR: Unexpected character ('f' (code 102)): was expecting comma to separate 
 ARRAY entries
  at [Source: /tmp/json2; line: 2, column: 27]
 http://www.mail-archive.com/user@cassandra.apache.org/msg14257.html

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-2780) sstable2json needs to escape quotes

2011-06-17 Thread Timo Nentwig (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13050923#comment-13050923
 ] 

Timo Nentwig edited comment on CASSANDRA-2780 at 6/17/11 8:10 AM:
--

The patch will replace existing \ with \\ (backslash backslash quote) which 
may not be the desired behaviour.

Alternative (regex should be precompiled, of course):
String.format(\%s\, val.replaceAll((?!)\, \));
 

  was (Author: tcn):
The patch will replace existing \ with \\\ which may not be the desired 
behaviour.

Alternative (regex should be precompiled, of course):
String.format(\%s\, val.replaceAll((?!)\, \));
 
  
 sstable2json needs to escape quotes
 ---

 Key: CASSANDRA-2780
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2780
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Timo Nentwig
Assignee: Pavel Yaskevich
Priority: Minor
 Fix For: 0.8.2

 Attachments: CASSANDRA-2780.patch


 [default@foo] set transactions[test][data]='{foo:bar}'; 
 $ cat /tmp/json
 {
 74657374: [[data, {foo:bar}, 1308209845388000]]
 }
 $ ./json2sstable -s -c transactions -K foo /tmp/json /tmp/ss-g-1-Data.db
 Counting keys to import, please wait... (NOTE: to skip this use -n num_keys)
 org.codehaus.jackson.JsonParseException: Unexpected character ('f' (code 
 102)): was expecting comma to separate ARRAY entries
  at [Source: /tmp/json2; line: 2, column: 27]
   at org.codehaus.jackson.JsonParser._constructError(JsonParser.java:929)
   at 
 org.codehaus.jackson.impl.JsonParserBase._reportError(JsonParserBase.java:632)
   at 
 org.codehaus.jackson.impl.JsonParserBase._reportUnexpectedChar(JsonParserBase.java:565)
   at 
 org.codehaus.jackson.impl.Utf8StreamParser.nextToken(Utf8StreamParser.java:128)
   at 
 org.codehaus.jackson.impl.JsonParserBase.skipChildren(JsonParserBase.java:263)
   at 
 org.apache.cassandra.tools.SSTableImport.importSorted(SSTableImport.java:328)
   at 
 org.apache.cassandra.tools.SSTableImport.importJson(SSTableImport.java:252)
   at org.apache.cassandra.tools.SSTableImport.main(SSTableImport.java:476)
 ERROR: Unexpected character ('f' (code 102)): was expecting comma to separate 
 ARRAY entries
  at [Source: /tmp/json2; line: 2, column: 27]
 http://www.mail-archive.com/user@cassandra.apache.org/msg14257.html

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2732) StringIndexOutOfBoundsException when specifying JDBC connection string without user and password

2011-06-17 Thread Vivek Mishra (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13050929#comment-13050929
 ] 

Vivek Mishra commented on CASSANDRA-2732:
-

By specifying cassandra.properties, I mean defining Cassandra-specific 
properties and loading them implicitly, just like the jdbc.properties any SQL 
client holds, or the hibernate.properties held by Hibernate.



 StringIndexOutOfBoundsException when specifying JDBC connection string 
 without user and password
 

 Key: CASSANDRA-2732
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2732
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8 beta 1
Reporter: Cathy Daw
Assignee: Rick Shaw
Priority: Trivial
  Labels: cql

 *PASS: specify connection string with user and password*
 _connection = 
 DriverManager.getConnection("jdbc:cassandra:root/root@localhost:9170/default")_
 *FAIL: specify connection string without user and password*
 _connection = 
 DriverManager.getConnection("jdbc:cassandra://localhost:9170/default")_
 {code}
 [junit] String index out of range: -1
 [junit] java.lang.StringIndexOutOfBoundsException: String index out of range: 
 -1
 [junit] at java.lang.String.substring(String.java:1937)
 [junit] at 
 org.apache.cassandra.cql.jdbc.CassandraConnection.init(CassandraConnection.java:74)
 [junit] at 
 org.apache.cassandra.cql.jdbc.CassandraConnection.init(CassandraConnection.java:74)
 [junit] at 
 org.apache.cassandra.cql.jdbc.CassandraDriver.connect(CassandraDriver.java:86)
 [junit] at java.sql.DriverManager.getConnection(DriverManager.java:582)
 [junit] at java.sql.DriverManager.getConnection(DriverManager.java:207)
 [junit] at 
 com.datastax.cql.runJDBCSmokeTest.setUpBeforeClass(runJDBCSmokeTest.java:45)
 {code}
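For illustration only, a hedged sketch of parsing optional credentials out of such a 
URL without hitting the substring(-1) failure shown above; the class, method name, and 
splitting logic are assumptions, not the driver's actual code:

{code}
public class ConnectionUrlParsing
{
    // Hypothetical helper: split "user/pass@host:port/keyspace" into a credentials
    // part and a host part, tolerating the absence of credentials entirely.
    static String[] userInfoAndHost(String authority)
    {
        int at = authority.lastIndexOf('@');
        if (at < 0)
            return new String[] { null, authority };   // no user/password in the URL
        return new String[] { authority.substring(0, at), authority.substring(at + 1) };
    }
}
{code}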

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2720) The current implementation of CassandraConnection does not always follow documented semantics for a JDBC Connection interface

2011-06-17 Thread Vivek Mishra (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13050931#comment-13050931
 ] 

Vivek Mishra commented on CASSANDRA-2720:
-

SQLFeatureNotSupportedException : should it be CQLFeatureNotSupportedException ?



 The current implementation of CassandraConnection does not always follow 
 documented semantics for a JDBC Connection interface
 -

 Key: CASSANDRA-2720
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2720
 Project: Cassandra
  Issue Type: Sub-task
  Components: API
Affects Versions: 0.8.0 beta 2
Reporter: Rick Shaw
Assignee: Rick Shaw
Priority: Minor
  Labels: cql, drivers, jdbc
 Fix For: 0.8.2

 Attachments: Cleanup-semantics for-a JDBC-Connection-v1.txt, 
 Cleanup-semantics-for-a-JDBC-Connection-v2.txt


 While the current implementations of many of the classes in the JDBC driver 
 are practical enough to get the driver to work, they do not always 
 obey the documented semantics for the associated interfaces. I am proposing 
 making a pass over the involved implementation members to start the 
 tightening process that will need to happen to use this driver in other 
 tools and programs that expect stricter adherence than is currently present.
 Broad areas of attention are:
 - Use of {{SQLFeatureNotSupportedException}}, not 
 {{UnsupportedOperationException}}, for methods that the Cassandra 
 implementation does not support.
 - Checking in appropriate methods for the prescribed throwing of 
 {{SQLException}} if the method is called on a closed connection.
 - Providing method implementations for all methods that are not optional, even 
 if it is only to return NULL (as prescribed in the interface description).
 I will cut additional JIRA tickets for other components in the suite.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-2780) sstable2json needs to escape quotes

2011-06-17 Thread Pavel Yaskevich (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Yaskevich updated CASSANDRA-2780:
---

Attachment: CASSANDRA-2780-v2.patch

changed replace with pattern/matcher with your regex in the escapeQuotes 
method. thanks!

 sstable2json needs to escape quotes
 ---

 Key: CASSANDRA-2780
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2780
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Timo Nentwig
Assignee: Pavel Yaskevich
Priority: Minor
 Fix For: 0.8.2

 Attachments: CASSANDRA-2780-v2.patch, CASSANDRA-2780.patch


 [default@foo] set transactions[test][data]='{foo:bar}'; 
 $ cat /tmp/json
 {
 74657374: [[data, {foo:bar}, 1308209845388000]]
 }
 $ ./json2sstable -s -c transactions -K foo /tmp/json /tmp/ss-g-1-Data.db
 Counting keys to import, please wait... (NOTE: to skip this use -n num_keys)
 org.codehaus.jackson.JsonParseException: Unexpected character ('f' (code 
 102)): was expecting comma to separate ARRAY entries
  at [Source: /tmp/json2; line: 2, column: 27]
   at org.codehaus.jackson.JsonParser._constructError(JsonParser.java:929)
   at 
 org.codehaus.jackson.impl.JsonParserBase._reportError(JsonParserBase.java:632)
   at 
 org.codehaus.jackson.impl.JsonParserBase._reportUnexpectedChar(JsonParserBase.java:565)
   at 
 org.codehaus.jackson.impl.Utf8StreamParser.nextToken(Utf8StreamParser.java:128)
   at 
 org.codehaus.jackson.impl.JsonParserBase.skipChildren(JsonParserBase.java:263)
   at 
 org.apache.cassandra.tools.SSTableImport.importSorted(SSTableImport.java:328)
   at 
 org.apache.cassandra.tools.SSTableImport.importJson(SSTableImport.java:252)
   at org.apache.cassandra.tools.SSTableImport.main(SSTableImport.java:476)
 ERROR: Unexpected character ('f' (code 102)): was expecting comma to separate 
 ARRAY entries
  at [Source: /tmp/json2; line: 2, column: 27]
 http://www.mail-archive.com/user@cassandra.apache.org/msg14257.html

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-2769) Cannot Create Duplicate Compaction Marker

2011-06-17 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-2769:


Attachment: 
0002-Only-compact-what-has-been-succesfully-marked-as-com-v2.patch
0001-Do-compact-only-smallerSSTables-v2.patch

bq. For trunk patches, I'm not comfortable w/ 0001 reassigning the sstables 
field on general principles either. We could have the compaction proceed using 
smallerSSTables as a simpler alternative, but in general this organization 
feels like negative progress from the 0.8 
doCompaction/doCompactionWithoutSizeEstimation.

Attaching v2 that doesn't reassign the sstables field.

bq. I think Alan has a good point. I don't think it's an appropriate role of 
the data tracker to modify the set of sstables to be compacted in a task.

I do not disagree with that. However, I'd like us to fix trunk as a first 
priority. It's a pain to work on other issues (CASSANDRA-2521 for instance) 
while it is broken (and the goal must be to do our best to always have a 
working trunk). The attached patches don't really change any behavior; they 
just fix the bugs, so let's get that in first before thinking about 
refactoring.


 Cannot Create Duplicate Compaction Marker
 -

 Key: CASSANDRA-2769
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2769
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Benjamin Coverston
Assignee: Sylvain Lebresne
 Fix For: 0.8.2

 Attachments: 
 0001-0.8.0-Remove-useless-unmarkCompacting-in-doCleanup.patch, 
 0001-Do-compact-only-smallerSSTables-v2.patch, 
 0001-Do-compact-only-smallerSSTables.patch, 
 0002-Only-compact-what-has-been-succesfully-marked-as-com-v2.patch, 
 0002-Only-compact-what-has-been-succesfully-marked-as-com.patch


 Concurrent compaction can trigger the following exception when two threads 
 compact the same sstable. DataTracker attempts to prevent this but apparently 
 not successfully.
 java.io.IOError: java.io.IOException: Unable to create compaction marker
   at 
 org.apache.cassandra.io.sstable.SSTableReader.markCompacted(SSTableReader.java:638)
   at 
 org.apache.cassandra.db.DataTracker.removeOldSSTablesSize(DataTracker.java:321)
   at org.apache.cassandra.db.DataTracker.replace(DataTracker.java:294)
   at 
 org.apache.cassandra.db.DataTracker.replaceCompactedSSTables(DataTracker.java:255)
   at 
 org.apache.cassandra.db.ColumnFamilyStore.replaceCompactedSSTables(ColumnFamilyStore.java:932)
   at 
 org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:173)
   at 
 org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:119)
   at 
 org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:102)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:680)
 Caused by: java.io.IOException: Unable to create compaction marker
   at 
 org.apache.cassandra.io.sstable.SSTableReader.markCompacted(SSTableReader.java:634)
   ... 12 more

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-2521) Move away from Phantom References for Compaction/Memtable

2011-06-17 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-2521:


Attachment: 0002-Force-unmapping-files-before-deletion-v2.patch

0001-Use-reference-counting-to-decide-when-a-sstable-can-v2.patch

Attaching rebased first patch and a second patch to implement the Cleaner 
trick.

I have confirmed on an example that, at least on linux, it does force the 
unmapping: the jvm crashes if you try to access the buffer after the unmapping.

This is the biggest drawback of this approach imho. If we screw up the 
reference counting and some thread does access the mapping, we won't get a nice 
exception; the JVM will simply crash (with the headache of having to figure out 
whether it is a bug on our side or a JVM bug). But for the quick testing I've 
done, it seems to work correctly.
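For reference, a minimal sketch of the kind of Cleaner trick being described, assuming 
a Sun/Oracle JVM where direct buffers expose a cleaner() method via reflection; this is 
an illustration, not the attached patch:

{code}
import java.lang.reflect.Method;
import java.nio.MappedByteBuffer;

public class Unmapper
{
    // Force-unmap a MappedByteBuffer instead of waiting for GC. After this call,
    // any access to the buffer can crash the JVM, which is why the reference
    // counting around it has to be airtight.
    public static void tryUnmap(MappedByteBuffer buffer)
    {
        try
        {
            Method cleanerMethod = buffer.getClass().getMethod("cleaner");
            cleanerMethod.setAccessible(true);
            Object cleaner = cleanerMethod.invoke(buffer);
            if (cleaner != null)
                cleaner.getClass().getMethod("clean").invoke(cleaner);
        }
        catch (Exception e)
        {
            // Not a Sun/Oracle JVM, or reflection is restricted: fall back to
            // relying on GC/finalization to release the mapping.
        }
    }
}
{code}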

 Move away from Phantom References for Compaction/Memtable
 -

 Key: CASSANDRA-2521
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2521
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Chris Goffinet
Assignee: Sylvain Lebresne
 Fix For: 1.0

 Attachments: 
 0001-Use-reference-counting-to-decide-when-a-sstable-can-.patch, 
 0001-Use-reference-counting-to-decide-when-a-sstable-can-v2.patch, 
 0002-Force-unmapping-files-before-deletion-v2.patch


 http://wiki.apache.org/cassandra/MemtableSSTable
 Let's move to using reference counting instead of relying on GC to be called 
 in StorageService.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (CASSANDRA-2786) After a minor compaction, deleted key-slices are visible again

2011-06-17 Thread rene kochen (JIRA)
After a minor compaction, deleted key-slices are visible again
--

 Key: CASSANDRA-2786
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2786
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.8.0
 Environment: Single node with empty database
Reporter: rene kochen


After a minor compaction, deleted key-slices are visible again.

Steps to reproduce:

1) Insert a row named test.
2) Insert 50 rows. During this step, test is included in a major 
compaction.
3) Delete row named test.
4) Insert 50 rows. During this step, test is included in a minor 
compaction.

After step 4, row test is live again.

Test environment:

Single node with empty database.

Standard configured super-column-family (I see this behavior with several 
gc_grace settings, big and small values):
create column family Customers with column_type = 'Super' and comparator = 
'BytesType';

In Cassandra 0.7.6 I observe the expected behavior, i.e. after step 4, the row 
is still deleted.

I've included a .NET program to reproduce the problem. I will add a Java 
version later on.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2720) The current implementation of CassandraConnection does not always follow documented semantics for a JDBC Connection interface

2011-06-17 Thread Rick Shaw (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13051022#comment-13051022
 ] 

Rick Shaw commented on CASSANDRA-2720:
--

I will readily admit it was an unfortunate decision on the part of the JDBC 
standard to put SQL in the name; however it is pretty clear in the spec as to 
what the name of the Exception is to be, to be able to be conformant (or to 
take your best shot at it).

We could wrap Cassandra (with CQL in the name) exceptions with SQLException 
(and its subclasses) if we felt strongly, which we do already for many that do 
not have CQL in the name, but that seems like a lot of extra trouble to make a 
point.

 The current implementation of CassandraConnection does not always follow 
 documented semantics for a JDBC Connection interface
 -

 Key: CASSANDRA-2720
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2720
 Project: Cassandra
  Issue Type: Sub-task
  Components: API
Affects Versions: 0.8.0 beta 2
Reporter: Rick Shaw
Assignee: Rick Shaw
Priority: Minor
  Labels: cql, drivers, jdbc
 Fix For: 0.8.2

 Attachments: Cleanup-semantics for-a JDBC-Connection-v1.txt, 
 Cleanup-semantics-for-a-JDBC-Connection-v2.txt


 While the current implementations of many of the classes in the JDBC driver 
 are practical enough to get the driver to work, they do not always 
 obey the documented semantics for the associated interfaces. I am proposing 
 making a pass over the involved implementation members to start the 
 tightening process that will need to happen to use this driver in other 
 tools and programs that expect stricter adherence than is currently present.
 Broad areas of attention are:
 - Use of {{SQLFeatureNotSupportedException}}, not 
 {{UnsupportedOperationException}}, for methods that the Cassandra 
 implementation does not support.
 - Checking in appropriate methods for the prescribed throwing of 
 {{SQLException}} if the method is called on a closed connection.
 - Providing method implementations for all methods that are not optional, even 
 if it is only to return NULL (as prescribed in the interface description).
 I will cut additional JIRA tickets for other components in the suite.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-2786) After a minor compaction, deleted key-slices are visible again

2011-06-17 Thread rene kochen (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

rene kochen updated CASSANDRA-2786:
---

Attachment: CassandraIssue.zip

 After a minor compaction, deleted key-slices are visible again
 --

 Key: CASSANDRA-2786
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2786
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.8.0
 Environment: Single node with empty database
Reporter: rene kochen
 Attachments: CassandraIssue.zip


 After a minor compaction, deleted key-slices are visible again.
 Steps to reproduce:
 1) Insert a row named test.
 2) Insert 50 rows. During this step, test is included in a major 
 compaction.
 3) Delete row named test.
 4) Insert 50 rows. During this step, test is included in a minor 
 compaction.
 After step 4, row test is live again.
 Test environment:
 Single node with empty database.
 Standard configured super-column-family (I see this behavior with several 
 gc_grace settings, big and small values):
 create column family Customers with column_type = 'Super' and comparator = 
 'BytesType';
 In Cassandra 0.7.6 I observe the expected behavior, i.e. after step 4, the 
 row is still deleted.
 I've included a .NET program to reproduce the problem. I will add a Java 
 version later on.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (CASSANDRA-2787) java agent option missing in cassandra.bat file

2011-06-17 Thread rene kochen (JIRA)
java agent option missing in cassandra.bat file
---

 Key: CASSANDRA-2787
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2787
 Project: Cassandra
  Issue Type: Bug
  Components: Packaging
Affects Versions: 0.8.0
Reporter: rene kochen
Priority: Minor


This option must be included in cassandra.bat:

-javaagent:%CASSANDRA_HOME%/lib/jamm-0.2.2.jar

Otherwise you see the following warnings in cassandra log:

WARN 12:02:32,478 MemoryMeter uninitialized (jamm not specified as java agent); 
assuming liveRatio of 10.0. Usually this means cassandra-env.sh disabled jamm 
because you are using a buggy JRE; upgrade to the Sun JRE instead


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2769) Cannot Create Duplicate Compaction Marker

2011-06-17 Thread Benjamin Coverston (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13051086#comment-13051086
 ] 

Benjamin Coverston commented on CASSANDRA-2769:
---

I'm sorry I didn't mean to imply it should be fixed _here_. I'll find a more 
appropriate venue to vent these frustrations :)

 Cannot Create Duplicate Compaction Marker
 -

 Key: CASSANDRA-2769
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2769
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Benjamin Coverston
Assignee: Sylvain Lebresne
 Fix For: 0.8.2

 Attachments: 
 0001-0.8.0-Remove-useless-unmarkCompacting-in-doCleanup.patch, 
 0001-Do-compact-only-smallerSSTables-v2.patch, 
 0001-Do-compact-only-smallerSSTables.patch, 
 0002-Only-compact-what-has-been-succesfully-marked-as-com-v2.patch, 
 0002-Only-compact-what-has-been-succesfully-marked-as-com.patch


 Concurrent compaction can trigger the following exception when two threads 
 compact the same sstable. DataTracker attempts to prevent this but apparently 
 not successfully.
 java.io.IOError: java.io.IOException: Unable to create compaction marker
   at 
 org.apache.cassandra.io.sstable.SSTableReader.markCompacted(SSTableReader.java:638)
   at 
 org.apache.cassandra.db.DataTracker.removeOldSSTablesSize(DataTracker.java:321)
   at org.apache.cassandra.db.DataTracker.replace(DataTracker.java:294)
   at 
 org.apache.cassandra.db.DataTracker.replaceCompactedSSTables(DataTracker.java:255)
   at 
 org.apache.cassandra.db.ColumnFamilyStore.replaceCompactedSSTables(ColumnFamilyStore.java:932)
   at 
 org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:173)
   at 
 org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:119)
   at 
 org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:102)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:680)
 Caused by: java.io.IOException: Unable to create compaction marker
   at 
 org.apache.cassandra.io.sstable.SSTableReader.markCompacted(SSTableReader.java:634)
   ... 12 more

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-1608) Redesigned Compaction

2011-06-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13051090#comment-13051090
 ] 

Jonathan Ellis commented on CASSANDRA-1608:
---

bq. The LDBCompaction task was changed to limit the size of the SSTables that 
are output by the compaction itself.

Ah, totally makes sense.  Wonder if we can refactor some more to avoid so much 
duplicate code.

bq. I'll make some more modifications to the manifest s.t. there is a single 
path for getting new SSTables (flushed and streamed) into the manifest. I found 
a bug on the plane today where they were getting added to the manifest, but 
they weren't being added to the queue

I think I fixed that by getting rid of the queue.  It was basically just L0 
anyway.

I like Manifest.add() [to L0] being The Single Path, feels pretty foolproof 
to me.

bq. There are some boundary cases where every SSTable that gets compacted will 
be in the same level. Most of them have to do with L+1 being empty.

Also makes sense.

bq. RE: the order, it does feel like we should do lower levels before higher 
levels, however one thing that we have to do is make sure that level-1 stays at 
10 SSTables. The algorithm dictates that all of the level-0 candidates get 
compacted with all of the candidates at level-1.

Well, all the overlapping ones.  Which is usually going to be all of them, but 
it's easy enough to check that we might as well on the off chance that we get 
to save some i/o.
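A tiny sketch of that overlap check, with hypothetical types standing in for the real 
sstable metadata: take the key span covered by the chosen L0 files and pull in only the 
L1 files that intersect it (usually, but not necessarily, all of them):

{code}
import java.util.ArrayList;
import java.util.List;

class KeyRange
{
    final long first, last;
    KeyRange(long first, long last) { this.first = first; this.last = last; }
    boolean overlaps(KeyRange other) { return first <= other.last && other.first <= last; }
}

class CandidateSelection
{
    // Assumes level0 is non-empty; returns the L0 files plus overlapping L1 files.
    static List<KeyRange> l0CompactionCandidates(List<KeyRange> level0, List<KeyRange> level1)
    {
        // Span of keys covered by all L0 sstables (they may overlap each other).
        long min = Long.MAX_VALUE, max = Long.MIN_VALUE;
        for (KeyRange r : level0)
        {
            min = Math.min(min, r.first);
            max = Math.max(max, r.last);
        }
        KeyRange span = new KeyRange(min, max);

        List<KeyRange> candidates = new ArrayList<KeyRange>(level0);
        for (KeyRange r : level1)
            if (r.overlaps(span))       // usually every L1 sstable, but not always
                candidates.add(r);
        return candidates;
    }
}
{code}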

bq. This means that you need to promote out of level-1 so that it is ~10 
SSTables before you schedule a compaction for level-0 promotion.

I'm not sure that necessarily follows.  Compacting lower levels first means 
less duplicate recompaction from L+1 later.  L0 is particularly important since 
lots of sstables in L0 means (potentially) lots of merging by readers.

In any case, the comments in gCC talked about prioritizing L1 but the code 
actually prioritized L0 so I went with that. :)

 Redesigned Compaction
 -

 Key: CASSANDRA-1608
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1608
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Chris Goffinet
 Attachments: 0001-leveldb-style-compaction.patch, 1608-v2.txt


 After seeing the I/O issues in CASSANDRA-1470, I've been doing some more 
 thinking on this subject that I wanted to lay out.
 I propose we redo the concept of how compaction works in Cassandra. At the 
 moment, compaction is kicked off based on a write access pattern, not read 
 access pattern. In most cases, you want the opposite. You want to be able to 
 track how well each SSTable is performing in the system. If we were to keep 
 statistics in-memory of each SSTable, prioritize them based on most accessed, 
 and bloom filter hit/miss ratios, we could intelligently group sstables that 
 are being read most often and schedule them for compaction. We could also 
 schedule lower-priority maintenance on SSTables that are not often accessed.
 I also propose we limit each SSTable to a fixed size; that gives 
 us the ability to better utilize our bloom filters in a predictable manner. 
 At the moment, after a certain size, the bloom filters become less reliable. 
 This would also allow us to group the most-accessed data. Currently the size of 
 an SSTable can grow to a point where large portions of the data might not 
 actually be accessed as often.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-2788) Add startup option renew the NodeId (for counters)

2011-06-17 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-2788:


Attachment: 0001-Option-to-renew-the-NodeId-on-startup.patch

 Add startup option renew the NodeId (for counters)
 --

 Key: CASSANDRA-2788
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2788
 Project: Cassandra
  Issue Type: Improvement
Affects Versions: 0.8.0
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Priority: Minor
  Labels: counters
 Fix For: 0.8.2

 Attachments: 0001-Option-to-renew-the-NodeId-on-startup.patch


 If an sstable of a counter column family is corrupted, the only safe solution 
 a user has right now is to:
 # Remove the NodeId system table to force the node to regenerate a new NodeId 
 (and thus stop incrementing on its previous, corrupted, subcount)
 # Remove all the sstables for that column family on that node (this is 
 important because otherwise the node will never get repaired for its 
 previous subcount)
 This is far from ideal, but I think this is the price we pay for 
 avoiding the read-before-write. In any case, the first step (removing the 
 NodeId system table) happens to remove the list of old NodeIds this node 
 has, which could prevent us from merging the other potential previous NodeIds. 
 This is ok but sub-optimal. This ticket proposes to add a new startup flag to 
 make the node renew its NodeId, thus replacing this first step.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (CASSANDRA-2788) Add startup option renew the NodeId (for counters)

2011-06-17 Thread Sylvain Lebresne (JIRA)
Add startup option renew the NodeId (for counters)
--

 Key: CASSANDRA-2788
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2788
 Project: Cassandra
  Issue Type: Improvement
Affects Versions: 0.8.0
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Priority: Minor
 Fix For: 0.8.2
 Attachments: 0001-Option-to-renew-the-NodeId-on-startup.patch

If an sstable of a counter column family is corrupted, the only safe solution a 
user has right now is to:
# Remove the NodeId system table to force the node to regenerate a new NodeId 
(and thus stop incrementing on its previous, corrupted, subcount)
# Remove all the sstables for that column family on that node (this is 
important because otherwise the node will never get repaired for its 
previous subcount)

This is far from ideal, but I think this is the price we pay for avoiding 
the read-before-write. In any case, the first step (removing the NodeId system 
table) happens to remove the list of old NodeIds this node has, which could 
prevent us from merging the other potential previous NodeIds. This is ok but 
sub-optimal. This ticket proposes to add a new startup flag to make the node 
renew its NodeId, thus replacing this first step.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-1608) Redesigned Compaction

2011-06-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13051097#comment-13051097
 ] 

Jonathan Ellis commented on CASSANDRA-1608:
---

I checked what leveldb actually does: 
http://www.google.com/codesearch#mHLldehqYMA/trunk/db/version_set.cc, methods 
Finalize and PickCompaction.

What it does is compute a score for each level, as the ratio of bytes in that 
level to desired bytes.  For level 0, it computes files / desired files 
instead.  (Apparently leveldb doesn't have row-level bloom filters, so merging 
on reads is extra painful.*) The level with the highest score is compacted.
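A small sketch of that scoring rule, assuming leveldb's usual 10x growth between level 
targets; the thresholds below are placeholders, not leveldb's actual constants:

{code}
public class LevelScoring
{
    static final int L0_TARGET_FILE_COUNT = 4;                 // assumed placeholder
    static final long L1_TARGET_BYTES = 10L * 1024 * 1024;     // assumed placeholder

    // Level 0 is scored by file count; higher levels by bytes relative to the
    // per-level target, which grows by 10x per level.
    static double score(int level, int fileCount, long bytesInLevel)
    {
        if (level == 0)
            return fileCount / (double) L0_TARGET_FILE_COUNT;
        long target = L1_TARGET_BYTES;
        for (int i = 1; i < level; i++)
            target *= 10;
        return bytesInLevel / (double) target;
    }

    // The level with the highest score is the one picked for compaction.
    static int pickLevel(double[] scores)
    {
        int best = 0;
        for (int i = 1; i < scores.length; i++)
            if (scores[i] > scores[best])
                best = i;
        return best;
    }
}
{code}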

When compacting L0, the only special casing done by leveldb is that after 
picking the primary L0 file to compact, it will check other L0 files for 
overlapping-ness too.  (Again, we can expect this to usually if not always be 
all L0 files, but it's not much more code than an "always compact all L0 
files" special case would be, so why not avoid some i/o if we can.)

*I'm pretty sure that (a) we don't need to special case for this reason and (b) 
we should standardize on bytes instead of file count: the latter is too subject 
to inaccuracy from streamed files, as mentioned, and on later levels to the fact 
that compaction results are not going to be clean -- if we merge one sstable of 
size S from L with two of size S from L+1, odds are poor we'll end up with 
merged bytes divisible by S or even very close to it.  The overwhelming 
likelihood is that you end up with two of size S and one of size 0 < size < S.  Do 
enough of these and using sstable count as an approximation for size gets 
pretty inaccurate. Fortunately a method to sum SSTableReader.length() would be 
easy enough to write instead.
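A sketch of that summing helper; SSTableReader.length() is the method referenced above, 
but the wrapper class and its stand-in interface here are illustrative only:

{code}
import java.util.Collection;

public class SSTableSizes
{
    // Stand-in for org.apache.cassandra.io.sstable.SSTableReader.
    interface SSTableReader { long length(); }

    // Total on-disk bytes for a set of sstables: a better measure for level
    // sizing than file count, since streamed or partially-filled sstables can
    // be arbitrarily large or small.
    static long totalBytes(Collection<? extends SSTableReader> sstables)
    {
        long sum = 0;
        for (SSTableReader sstable : sstables)
            sum += sstable.length();
        return sum;
    }
}
{code}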


 Redesigned Compaction
 -

 Key: CASSANDRA-1608
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1608
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Chris Goffinet
 Attachments: 0001-leveldb-style-compaction.patch, 1608-v2.txt


 After seeing the I/O issues in CASSANDRA-1470, I've been doing some more 
 thinking on this subject that I wanted to lay out.
 I propose we redo the concept of how compaction works in Cassandra. At the 
 moment, compaction is kicked off based on a write access pattern, not read 
 access pattern. In most cases, you want the opposite. You want to be able to 
 track how well each SSTable is performing in the system. If we were to keep 
 statistics in-memory of each SSTable, prioritize them based on most accessed, 
 and bloom filter hit/miss ratios, we could intelligently group sstables that 
 are being read most often and schedule them for compaction. We could also 
 schedule lower-priority maintenance on SSTables that are not often accessed.
 I also propose we limit each SSTable to a fixed size; that gives 
 us the ability to better utilize our bloom filters in a predictable manner. 
 At the moment, after a certain size, the bloom filters become less reliable. 
 This would also allow us to group the most-accessed data. Currently the size of 
 an SSTable can grow to a point where large portions of the data might not 
 actually be accessed as often.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2720) The current implementation of CassandraConnection does not always follow documented semantics for a JDBC Connection interface

2011-06-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13051108#comment-13051108
 ] 

Jonathan Ellis commented on CASSANDRA-2720:
---

bq. it is pretty clear in the spec as to what the name of the Exception is to be

Agreed, no reason to complicate things further.

 The current implementation of CassandraConnection does not always follow 
 documented semantics for a JDBC Connection interface
 -

 Key: CASSANDRA-2720
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2720
 Project: Cassandra
  Issue Type: Sub-task
  Components: API
Affects Versions: 0.8.0 beta 2
Reporter: Rick Shaw
Assignee: Rick Shaw
Priority: Minor
  Labels: cql, drivers, jdbc
 Fix For: 0.8.2

 Attachments: Cleanup-semantics for-a JDBC-Connection-v1.txt, 
 Cleanup-semantics-for-a-JDBC-Connection-v2.txt


 While the current implementations of many of the classes in the JDBC driver 
 are practical enough to get the driver to work, they do not always obey the 
 documented semantics for the associated interfaces. I am proposing making a 
 pass over the involved implementation members to start the tightening process 
 that will need to happen to use this driver in other tools and programs that 
 expect stricter adherence than is currently present.
 Broad areas of attention are:
 - Use of {{SQLFeatureNotSupportedException}}, not 
 {{UnsupportedOperationException}}, for methods that the Cassandra 
 implementation does not support.
 - Checking in appropriate methods for the prescribed throwing of 
 {{SQLException}} if the method is called on a closed connection.
 - Providing method implementations for all methods that are not optional, even 
 if it is to return NULL (as prescribed in the interface description).
 I will cut additional JIRA tickets for other components in the suite.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2786) After a minor compaction, deleted key-slices are visible again

2011-06-17 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13051110#comment-13051110
 ] 

Sylvain Lebresne commented on CASSANDRA-2786:
-

The Java version would be really cool :)

 After a minor compaction, deleted key-slices are visible again
 --

 Key: CASSANDRA-2786
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2786
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.8.0
 Environment: Single node with empty database
Reporter: rene kochen
 Attachments: CassandraIssue.zip


 After a minor compaction, deleted key-slices are visible again.
 Steps to reproduce:
 1) Insert a row named test.
 2) Insert 50 rows. During this step, test is included in a major 
 compaction.
 3) Delete row named test.
 4) Insert 50 rows. During this step, test is included in a minor 
 compaction.
 After step 4, row test is live again.
 Test environment:
 Single node with empty database.
 A standard configured super column family (I see this behavior with several 
 gc_grace settings, big and small values):
 create column family Customers with column_type = 'Super' and comparator = 
 'BytesType';
 In Cassandra 0.7.6 I observe the expected behavior, i.e. after step 4, the 
 row is still deleted.
 I've included a .NET program to reproduce the problem. I will add a Java 
 version later on.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2780) sstable2json needs to escape quotes

2011-06-17 Thread Timo Nentwig (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13051109#comment-13051109
 ] 

Timo Nentwig commented on CASSANDRA-2780:
-

D'oh, actually an existing \ must be replaced with \\\ :-\

 sstable2json needs to escape quotes
 ---

 Key: CASSANDRA-2780
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2780
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Timo Nentwig
Assignee: Pavel Yaskevich
Priority: Minor
 Fix For: 0.8.2

 Attachments: CASSANDRA-2780-v2.patch, CASSANDRA-2780.patch


 [default@foo] set transactions[test][data]='{foo:bar}'; 
 $ cat /tmp/json
 {
 74657374: [[data, {foo:bar}, 1308209845388000]]
 }
 $ ./json2sstable -s -c transactions -K foo /tmp/json /tmp/ss-g-1-Data.db
 Counting keys to import, please wait... (NOTE: to skip this use -n num_keys)
 org.codehaus.jackson.JsonParseException: Unexpected character ('f' (code 
 102)): was expecting comma to separate ARRAY entries
  at [Source: /tmp/json2; line: 2, column: 27]
   at org.codehaus.jackson.JsonParser._constructError(JsonParser.java:929)
   at 
 org.codehaus.jackson.impl.JsonParserBase._reportError(JsonParserBase.java:632)
   at 
 org.codehaus.jackson.impl.JsonParserBase._reportUnexpectedChar(JsonParserBase.java:565)
   at 
 org.codehaus.jackson.impl.Utf8StreamParser.nextToken(Utf8StreamParser.java:128)
   at 
 org.codehaus.jackson.impl.JsonParserBase.skipChildren(JsonParserBase.java:263)
   at 
 org.apache.cassandra.tools.SSTableImport.importSorted(SSTableImport.java:328)
   at 
 org.apache.cassandra.tools.SSTableImport.importJson(SSTableImport.java:252)
   at org.apache.cassandra.tools.SSTableImport.main(SSTableImport.java:476)
 ERROR: Unexpected character ('f' (code 102)): was expecting comma to separate 
 ARRAY entries
  at [Source: /tmp/json2; line: 2, column: 27]
 http://www.mail-archive.com/user@cassandra.apache.org/msg14257.html

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2769) Cannot Create Duplicate Compaction Marker

2011-06-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13051113#comment-13051113
 ] 

Jonathan Ellis commented on CASSANDRA-2769:
---

+1 v2

 Cannot Create Duplicate Compaction Marker
 -

 Key: CASSANDRA-2769
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2769
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Benjamin Coverston
Assignee: Sylvain Lebresne
 Fix For: 0.8.2

 Attachments: 
 0001-0.8.0-Remove-useless-unmarkCompacting-in-doCleanup.patch, 
 0001-Do-compact-only-smallerSSTables-v2.patch, 
 0001-Do-compact-only-smallerSSTables.patch, 
 0002-Only-compact-what-has-been-succesfully-marked-as-com-v2.patch, 
 0002-Only-compact-what-has-been-succesfully-marked-as-com.patch


 Concurrent compaction can trigger the following exception when two threads 
 compact the same sstable. DataTracker attempts to prevent this but apparently 
 not successfully.
 java.io.IOError: java.io.IOException: Unable to create compaction marker
   at 
 org.apache.cassandra.io.sstable.SSTableReader.markCompacted(SSTableReader.java:638)
   at 
 org.apache.cassandra.db.DataTracker.removeOldSSTablesSize(DataTracker.java:321)
   at org.apache.cassandra.db.DataTracker.replace(DataTracker.java:294)
   at 
 org.apache.cassandra.db.DataTracker.replaceCompactedSSTables(DataTracker.java:255)
   at 
 org.apache.cassandra.db.ColumnFamilyStore.replaceCompactedSSTables(ColumnFamilyStore.java:932)
   at 
 org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:173)
   at 
 org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:119)
   at 
 org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:102)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:680)
 Caused by: java.io.IOException: Unable to create compaction marker
   at 
 org.apache.cassandra.io.sstable.SSTableReader.markCompacted(SSTableReader.java:634)
   ... 12 more

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2780) sstable2json needs to escape quotes

2011-06-17 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13051117#comment-13051117
 ] 

Pavel Yaskevich commented on CASSANDRA-2780:


Can you please post a correct regex so I can include it in the patch?

 sstable2json needs to escape quotes
 ---

 Key: CASSANDRA-2780
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2780
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Timo Nentwig
Assignee: Pavel Yaskevich
Priority: Minor
 Fix For: 0.8.2

 Attachments: CASSANDRA-2780-v2.patch, CASSANDRA-2780.patch


 [default@foo] set transactions[test][data]='{foo:bar}'; 
 $ cat /tmp/json
 {
 74657374: [[data, {foo:bar}, 1308209845388000]]
 }
 $ ./json2sstable -s -c transactions -K foo /tmp/json /tmp/ss-g-1-Data.db
 Counting keys to import, please wait... (NOTE: to skip this use -n num_keys)
 org.codehaus.jackson.JsonParseException: Unexpected character ('f' (code 
 102)): was expecting comma to separate ARRAY entries
  at [Source: /tmp/json2; line: 2, column: 27]
   at org.codehaus.jackson.JsonParser._constructError(JsonParser.java:929)
   at 
 org.codehaus.jackson.impl.JsonParserBase._reportError(JsonParserBase.java:632)
   at 
 org.codehaus.jackson.impl.JsonParserBase._reportUnexpectedChar(JsonParserBase.java:565)
   at 
 org.codehaus.jackson.impl.Utf8StreamParser.nextToken(Utf8StreamParser.java:128)
   at 
 org.codehaus.jackson.impl.JsonParserBase.skipChildren(JsonParserBase.java:263)
   at 
 org.apache.cassandra.tools.SSTableImport.importSorted(SSTableImport.java:328)
   at 
 org.apache.cassandra.tools.SSTableImport.importJson(SSTableImport.java:252)
   at org.apache.cassandra.tools.SSTableImport.main(SSTableImport.java:476)
 ERROR: Unexpected character ('f' (code 102)): was expecting comma to separate 
 ARRAY entries
  at [Source: /tmp/json2; line: 2, column: 27]
 http://www.mail-archive.com/user@cassandra.apache.org/msg14257.html

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




svn commit: r1136904 - in /cassandra/trunk/src/java/org/apache/cassandra/db: DataTracker.java compaction/AbstractCompactionTask.java compaction/CompactionManager.java compaction/CompactionTask.java

2011-06-17 Thread slebresne
Author: slebresne
Date: Fri Jun 17 15:00:21 2011
New Revision: 1136904

URL: http://svn.apache.org/viewvc?rev=1136904view=rev
Log:
Fix compaction of the same sstable by multiple thread
patch by slebresne; reviewed by jbellis for CASSANDRA-2769

Modified:
cassandra/trunk/src/java/org/apache/cassandra/db/DataTracker.java

cassandra/trunk/src/java/org/apache/cassandra/db/compaction/AbstractCompactionTask.java

cassandra/trunk/src/java/org/apache/cassandra/db/compaction/CompactionManager.java

cassandra/trunk/src/java/org/apache/cassandra/db/compaction/CompactionTask.java

Modified: cassandra/trunk/src/java/org/apache/cassandra/db/DataTracker.java
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/db/DataTracker.java?rev=1136904r1=1136903r2=1136904view=diff
==
--- cassandra/trunk/src/java/org/apache/cassandra/db/DataTracker.java (original)
+++ cassandra/trunk/src/java/org/apache/cassandra/db/DataTracker.java Fri Jun 
17 15:00:21 2011
@@ -33,7 +33,6 @@ import org.slf4j.LoggerFactory;
 
 import org.apache.cassandra.cache.AutoSavingCache;
 import org.apache.cassandra.config.DatabaseDescriptor;
-import org.apache.cassandra.db.compaction.AbstractCompactionTask;
 import org.apache.cassandra.io.sstable.Descriptor;
 import org.apache.cassandra.io.sstable.SSTableReader;
 import org.apache.cassandra.utils.Pair;
@@ -163,6 +162,9 @@ public class DataTracker
 {
 if (max < min || max < 1)
 return null;
+if (tomark == null || tomark.isEmpty())
+return null;
+
 View currentView, newView;
 Set<SSTableReader> subset = null;
 // order preserving set copy of the input
@@ -190,41 +192,6 @@ public class DataTracker
 return subset;
 }
 
-public boolean markCompacting(AbstractCompactionTask task)
-{
-ColumnFamilyStore cfs = task.getColumnFamilyStore();
-return markCompacting(task, cfs.getMinimumCompactionThreshold(), 
cfs.getMaximumCompactionThreshold());
-}
-
-public boolean markCompacting(AbstractCompactionTask task, int min, int 
max)
-{
-Collection<SSTableReader> sstablesToMark = task.getSSTables();
-if (sstablesToMark == null || sstablesToMark.isEmpty())
-return false;
-
-if (max < min || max < 1)
-return false;
-
-View currentView, newView;
-// order preserving set copy of the input
-Set<SSTableReader> remaining = new LinkedHashSet<SSTableReader>(sstablesToMark);
-do
-{
-currentView = view.get();
-
-// find the subset that is active and not already compacting
-remaining.removeAll(currentView.compacting);
-remaining.retainAll(currentView.sstables);
-if (remaining.size() < min || remaining.size() > max)
-// cannot meet the min and max threshold
-return false;
-
-newView = currentView.markCompacting(sstablesToMark);
-}
-while (!view.compareAndSet(currentView, newView));
-return true;
-}
-
 /**
  * Removes files from compacting status: this is different from 
'markCompacted'
  * because it should be run regardless of whether a compaction succeeded.
@@ -240,11 +207,6 @@ public classDataTracker
 while (!view.compareAndSet(currentView, newView));
 }
 
-public void unmarkCompacting(AbstractCompactionTask task)
-{
-unmarkCompacting(task.getSSTables());
-}
-
 public void markCompacted(Collection<SSTableReader> sstables)
 {
 replace(sstables, Collections.<SSTableReader>emptyList());

Modified: 
cassandra/trunk/src/java/org/apache/cassandra/db/compaction/AbstractCompactionTask.java
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/db/compaction/AbstractCompactionTask.java?rev=1136904r1=1136903r2=1136904view=diff
==
--- 
cassandra/trunk/src/java/org/apache/cassandra/db/compaction/AbstractCompactionTask.java
 (original)
+++ 
cassandra/trunk/src/java/org/apache/cassandra/db/compaction/AbstractCompactionTask.java
 Fri Jun 17 15:00:21 2011
@@ -19,6 +19,7 @@
 package org.apache.cassandra.db.compaction;
 
 import java.util.Collection;
+import java.util.Set;
 import java.io.IOException;
 
 import org.apache.cassandra.io.sstable.SSTableReader;
@@ -47,4 +48,33 @@ public abstract class AbstractCompaction
 {
 return sstables;
 }
+
+/**
+ * Try to mark the sstable to compact as compacting.
+ * It returns true if some sstables have been marked for compaction, false
+ * otherwise.
+ * This *must* be called before calling execute(). Moreover,
+ * unmarkSSTables *must* always be called after execute() if this
+ * method returns true.
+ */
+public boolean markSSTablesForCompaction()
+{
+
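
For context, the contract documented in that javadoc amounts to the following 
call pattern (a self-contained stand-in; the Task interface here is 
illustrative, not the real AbstractCompactionTask):

{code}
public class MarkUnmarkPattern
{
    interface Task
    {
        boolean markSSTablesForCompaction();
        void execute();
        void unmarkSSTables();
    }

    static void runIfMarkable(Task task)
    {
        // Mark before execute(); if marking succeeded, always unmark afterwards,
        // even if execute() throws.
        if (!task.markSSTablesForCompaction())
            return;
        try
        {
            task.execute();
        }
        finally
        {
            task.unmarkSSTables();
        }
    }
}
{code}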

[jira] [Commented] (CASSANDRA-2780) sstable2json needs to escape quotes

2011-06-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13051124#comment-13051124
 ] 

Jonathan Ellis commented on CASSANDRA-2780:
---

Why are we talking about regexes instead of taking Tatu's advice?

 sstable2json needs to escape quotes
 ---

 Key: CASSANDRA-2780
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2780
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Timo Nentwig
Assignee: Pavel Yaskevich
Priority: Minor
 Fix For: 0.8.2

 Attachments: CASSANDRA-2780-v2.patch, CASSANDRA-2780.patch


 [default@foo] set transactions[test][data]='{foo:bar}'; 
 $ cat /tmp/json
 {
 74657374: [[data, {foo:bar}, 1308209845388000]]
 }
 $ ./json2sstable -s -c transactions -K foo /tmp/json /tmp/ss-g-1-Data.db
 Counting keys to import, please wait... (NOTE: to skip this use -n num_keys)
 org.codehaus.jackson.JsonParseException: Unexpected character ('f' (code 
 102)): was expecting comma to separate ARRAY entries
  at [Source: /tmp/json2; line: 2, column: 27]
   at org.codehaus.jackson.JsonParser._constructError(JsonParser.java:929)
   at 
 org.codehaus.jackson.impl.JsonParserBase._reportError(JsonParserBase.java:632)
   at 
 org.codehaus.jackson.impl.JsonParserBase._reportUnexpectedChar(JsonParserBase.java:565)
   at 
 org.codehaus.jackson.impl.Utf8StreamParser.nextToken(Utf8StreamParser.java:128)
   at 
 org.codehaus.jackson.impl.JsonParserBase.skipChildren(JsonParserBase.java:263)
   at 
 org.apache.cassandra.tools.SSTableImport.importSorted(SSTableImport.java:328)
   at 
 org.apache.cassandra.tools.SSTableImport.importJson(SSTableImport.java:252)
   at org.apache.cassandra.tools.SSTableImport.main(SSTableImport.java:476)
 ERROR: Unexpected character ('f' (code 102)): was expecting comma to separate 
 ARRAY entries
  at [Source: /tmp/json2; line: 2, column: 27]
 http://www.mail-archive.com/user@cassandra.apache.org/msg14257.html

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-2727) examples/hadoop_word_count reducer to cassandra doesn't output into the output_words cf

2011-06-17 Thread T Jake Luciani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

T Jake Luciani updated CASSANDRA-2727:
--

Attachment: v1-0001-CASSANDRA-2727-fix-for-word-count-reducer.txt

 examples/hadoop_word_count reducer to cassandra doesn't output into the 
 output_words cf
 ---

 Key: CASSANDRA-2727
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2727
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.0 beta 2
Reporter: Jeremy Hanna
Priority: Minor
  Labels: hadoop
 Attachments: v1-0001-CASSANDRA-2727-fix-for-word-count-reducer.txt


 I tried the examples/hadoop_word_count example and could output to the 
 filesystem but when I output to cassandra (the default), nothing shows up in 
 output_words.  I can output to cassandra using pig so I think the problem is 
 isolated to this example.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2727) examples/hadoop_word_count reducer to cassandra doesn't output into the output_words cf

2011-06-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13051127#comment-13051127
 ] 

Jonathan Ellis commented on CASSANDRA-2727:
---

+1

 examples/hadoop_word_count reducer to cassandra doesn't output into the 
 output_words cf
 ---

 Key: CASSANDRA-2727
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2727
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.0 beta 2
Reporter: Jeremy Hanna
Priority: Minor
  Labels: hadoop
 Attachments: v1-0001-CASSANDRA-2727-fix-for-word-count-reducer.txt


 I tried the examples/hadoop_word_count example and could output to the 
 filesystem but when I output to cassandra (the default), nothing shows up in 
 output_words.  I can output to cassandra using pig so I think the problem is 
 isolated to this example.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2780) sstable2json needs to escape quotes

2011-06-17 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13051126#comment-13051126
 ] 

Pavel Yaskevich commented on CASSANDRA-2780:


Agreed, I see now that would be the best option. Will be working in that 
direction.
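
If the idea is to let the JSON library do the escaping rather than a regex, 
here is a minimal sketch using the org.codehaus.jackson classes json2sstable 
already depends on (the class and method names are illustrative, not part of 
any patch):

{code}
import java.io.IOException;
import java.io.StringWriter;

import org.codehaus.jackson.JsonFactory;
import org.codehaus.jackson.JsonGenerator;

public final class JsonEscapeExample
{
    // Writes a raw string as a JSON string literal; quotes, backslashes and
    // control characters are escaped by Jackson, no hand-rolled regex needed.
    public static String writeEscaped(String raw) throws IOException
    {
        StringWriter out = new StringWriter();
        JsonGenerator gen = new JsonFactory().createJsonGenerator(out);
        gen.writeString(raw);
        gen.close();
        return out.toString(); // includes the surrounding double quotes
    }

    public static void main(String[] args) throws IOException
    {
        System.out.println(writeEscaped("{\"foo\":\"bar\"} and a backslash: \\"));
    }
}
{code}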

 sstable2json needs to escape quotes
 ---

 Key: CASSANDRA-2780
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2780
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Timo Nentwig
Assignee: Pavel Yaskevich
Priority: Minor
 Fix For: 0.8.2

 Attachments: CASSANDRA-2780-v2.patch, CASSANDRA-2780.patch


 [default@foo] set transactions[test][data]='{foo:bar}'; 
 $ cat /tmp/json
 {
 74657374: [[data, {foo:bar}, 1308209845388000]]
 }
 $ ./json2sstable -s -c transactions -K foo /tmp/json /tmp/ss-g-1-Data.db
 Counting keys to import, please wait... (NOTE: to skip this use -n num_keys)
 org.codehaus.jackson.JsonParseException: Unexpected character ('f' (code 
 102)): was expecting comma to separate ARRAY entries
  at [Source: /tmp/json2; line: 2, column: 27]
   at org.codehaus.jackson.JsonParser._constructError(JsonParser.java:929)
   at 
 org.codehaus.jackson.impl.JsonParserBase._reportError(JsonParserBase.java:632)
   at 
 org.codehaus.jackson.impl.JsonParserBase._reportUnexpectedChar(JsonParserBase.java:565)
   at 
 org.codehaus.jackson.impl.Utf8StreamParser.nextToken(Utf8StreamParser.java:128)
   at 
 org.codehaus.jackson.impl.JsonParserBase.skipChildren(JsonParserBase.java:263)
   at 
 org.apache.cassandra.tools.SSTableImport.importSorted(SSTableImport.java:328)
   at 
 org.apache.cassandra.tools.SSTableImport.importJson(SSTableImport.java:252)
   at org.apache.cassandra.tools.SSTableImport.main(SSTableImport.java:476)
 ERROR: Unexpected character ('f' (code 102)): was expecting comma to separate 
 ARRAY entries
  at [Source: /tmp/json2; line: 2, column: 27]
 http://www.mail-archive.com/user@cassandra.apache.org/msg14257.html

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2530) Additional AbstractType data type definitions to enrich CQL

2011-06-17 Thread Rick Shaw (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13051128#comment-13051128
 ] 

Rick Shaw commented on CASSANDRA-2530:
--

This appeared to get the ok on 2011-06-01? Is there more work required to get 
this into trunk for 0.8.2?

 Additional AbstractType data type definitions to enrich CQL
 ---

 Key: CASSANDRA-2530
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2530
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.8.0 beta 2
Reporter: Rick Shaw
Priority: Trivial
  Labels: cql
 Attachments: patch-to-add-4-new-AbstractTypes-and-CQL-support-v4.txt, 
 patch-to-add-4-new-AbstractTypes-and-CQL-support-v5.txt


 Provide 5 additional Datatypes: ByteType, DateType, BooleanType, FloatType, 
 DoubleType.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2769) Cannot Create Duplicate Compaction Marker

2011-06-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13051130#comment-13051130
 ] 

Hudson commented on CASSANDRA-2769:
---

Integrated in Cassandra #931 (See 
[https://builds.apache.org/job/Cassandra/931/])
Fix compaction of the same sstable by multiple thread
patch by slebresne; reviewed by jbellis for CASSANDRA-2769

slebresne : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1136904
Files : 
* 
/cassandra/trunk/src/java/org/apache/cassandra/db/compaction/CompactionTask.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/DataTracker.java
* 
/cassandra/trunk/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
* 
/cassandra/trunk/src/java/org/apache/cassandra/db/compaction/AbstractCompactionTask.java


 Cannot Create Duplicate Compaction Marker
 -

 Key: CASSANDRA-2769
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2769
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Benjamin Coverston
Assignee: Sylvain Lebresne
 Fix For: 0.8.2, 1.0

 Attachments: 
 0001-0.8.0-Remove-useless-unmarkCompacting-in-doCleanup.patch, 
 0001-Do-compact-only-smallerSSTables-v2.patch, 
 0001-Do-compact-only-smallerSSTables.patch, 
 0002-Only-compact-what-has-been-succesfully-marked-as-com-v2.patch, 
 0002-Only-compact-what-has-been-succesfully-marked-as-com.patch


 Concurrent compaction can trigger the following exception when two threads 
 compact the same sstable. DataTracker attempts to prevent this but apparently 
 not successfully.
 java.io.IOError: java.io.IOException: Unable to create compaction marker
   at 
 org.apache.cassandra.io.sstable.SSTableReader.markCompacted(SSTableReader.java:638)
   at 
 org.apache.cassandra.db.DataTracker.removeOldSSTablesSize(DataTracker.java:321)
   at org.apache.cassandra.db.DataTracker.replace(DataTracker.java:294)
   at 
 org.apache.cassandra.db.DataTracker.replaceCompactedSSTables(DataTracker.java:255)
   at 
 org.apache.cassandra.db.ColumnFamilyStore.replaceCompactedSSTables(ColumnFamilyStore.java:932)
   at 
 org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:173)
   at 
 org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:119)
   at 
 org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:102)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:680)
 Caused by: java.io.IOException: Unable to create compaction marker
   at 
 org.apache.cassandra.io.sstable.SSTableReader.markCompacted(SSTableReader.java:634)
   ... 12 more

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (CASSANDRA-2727) examples/hadoop_word_count reducer to cassandra doesn't output into the output_words cf

2011-06-17 Thread T Jake Luciani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

T Jake Luciani resolved CASSANDRA-2727.
---

   Resolution: Fixed
Fix Version/s: 0.8.2
 Assignee: T Jake Luciani

 examples/hadoop_word_count reducer to cassandra doesn't output into the 
 output_words cf
 ---

 Key: CASSANDRA-2727
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2727
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.0 beta 2
Reporter: Jeremy Hanna
Assignee: T Jake Luciani
Priority: Minor
  Labels: hadoop
 Fix For: 0.8.2

 Attachments: v1-0001-CASSANDRA-2727-fix-for-word-count-reducer.txt


 I tried the examples/hadoop_word_count example and could output to the 
 filesystem but when I output to cassandra (the default), nothing shows up in 
 output_words.  I can output to cassandra using pig so I think the problem is 
 isolated to this example.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2761) JDBC driver does not build

2011-06-17 Thread Eric Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13051138#comment-13051138
 ] 

Eric Evans commented on CASSANDRA-2761:
---

{quote}
The point is there are lots and they are scattered all over the various 
packages; It will be very difficult to manage when they change from the driver 
package (client side), which is supposed to be able to change independent of 
the server code. If a subset of the server code is to be a dependency then that 
subset (jar/s) must be managed in the main build not the driver build.
{quote}

Right, I was curious to see the list of classes (that list is fantastic btw, 
thanks for that), to see if there was one point in the graph where breaking a 
dependency would drastically change the scope of the problem.  It looks like 
the answer is Yes, and the dependency is {{o.a.c.config.CFMetaData}} (needed 
by {{ColumnDecoder}}).

Just skimming through the code, I don't think it would be hard to either 
re-implement the needed parts of CFMetaData, or refactor CFMetaData to limit 
what it pulls in.

 JDBC driver does not build
 --

 Key: CASSANDRA-2761
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2761
 Project: Cassandra
  Issue Type: Bug
  Components: API
Affects Versions: 1.0
Reporter: Jonathan Ellis
Assignee: Rick Shaw
 Fix For: 1.0

 Attachments: jdbc-driver-build-v1.txt


 Need a way to build (and run tests for) the Java driver.
 Also: still some vestigial references to drivers/ in trunk build.xml.
 Should we remove drivers/ from the 0.8 branch as well?

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2530) Additional AbstractType data type definitions to enrich CQL

2011-06-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13051139#comment-13051139
 ] 

Jonathan Ellis commented on CASSANDRA-2530:
---

Timestamp is not less broken than it was then, but we could commit the others.

 Additional AbstractType data type definitions to enrich CQL
 ---

 Key: CASSANDRA-2530
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2530
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.8.0 beta 2
Reporter: Rick Shaw
Priority: Trivial
  Labels: cql
 Attachments: patch-to-add-4-new-AbstractTypes-and-CQL-support-v4.txt, 
 patch-to-add-4-new-AbstractTypes-and-CQL-support-v5.txt


 Provide 5 additional Datatypes: ByteType, DateType, BooleanType, FloatType, 
 DoubleType.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2530) Additional AbstractType data type definitions to enrich CQL

2011-06-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13051145#comment-13051145
 ] 

Jonathan Ellis commented on CASSANDRA-2530:
---

(I meant Date.)

I did some more reading and it looks like what you propose is mainstream 
behavior for JDBC drivers.  
(http://stackoverflow.com/questions/4078426/to-which-java-data-types-can-i-map-timestamp-with-time-zone-or-timestamp-with-loc,
 http://postgresql.1045698.n5.nabble.com/Timestamp-confusion-td2174087.html, 
http://download.oracle.com/docs/cd/E13222_01/wls/docs81/jdbc_drivers/oracle.html).

I'll get this committed.

 Additional AbstractType data type definitions to enrich CQL
 ---

 Key: CASSANDRA-2530
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2530
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.8.0 beta 2
Reporter: Rick Shaw
Priority: Trivial
  Labels: cql
 Attachments: patch-to-add-4-new-AbstractTypes-and-CQL-support-v4.txt, 
 patch-to-add-4-new-AbstractTypes-and-CQL-support-v5.txt


 Provide 5 additional Datatypes: ByteType, DateType, BooleanType, FloatType, 
 DoubleType.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2788) Add startup option renew the NodeId (for counters)

2011-06-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13051158#comment-13051158
 ] 

Jonathan Ellis commented on CASSANDRA-2788:
---

+1

 Add startup option renew the NodeId (for counters)
 --

 Key: CASSANDRA-2788
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2788
 Project: Cassandra
  Issue Type: Improvement
Affects Versions: 0.8.0
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Priority: Minor
  Labels: counters
 Fix For: 0.8.2

 Attachments: 0001-Option-to-renew-the-NodeId-on-startup.patch


 If an sstable of a counter column family is corrupted, the only safe solution 
 a user has right now is to:
 # Remove the NodeId system table to force the node to regenerate a new NodeId 
 (and thus stop incrementing on its previous, corrupted, subcount)
 # Remove all the sstables for that column family on that node (this is 
 important because otherwise the node will never get repaired for its 
 previous subcount)
 This is far from being ideal, but I think this is the price we pay for 
 avoiding the read-before-write. In any case, the first step (removing the 
 NodeId system table) happens to remove the list of the old NodeIds this node 
 has, which could prevent us from merging the other potential previous NodeIds. 
 This is ok but sub-optimal. This ticket proposes to add a new startup flag to 
 make the node renew its NodeId, thus replacing this first step.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-2788) Add startup option renew the NodeId (for counters)

2011-06-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13051159#comment-13051159
 ] 

Jonathan Ellis edited comment on CASSANDRA-2788 at 6/17/11 4:11 PM:


Pasting Sylvain's explanation from IRC:

{quote}
Let me take a small example: suppose two nodes, A and B. Initially their 
node_ids will be respectively A1 and B1. Each counter will thus have two 
components, A1 and B1.

Now suppose you renew the node_id of A -> A2 because of a corruption. Soon 
enough, the counters will have 3 components: A1, A2 and B1. Renew that yet 
another time and the counter context will be A1, A2, A3 and B1. It grows, which 
is not cool.
But because we know that nobody will ever increment A1 and A2 anymore (A3 is 
the active node id for A), we can merge them (we have to wait for gc_grace and 
stuff for that to be correct etc... but we do it).

So basically we try to keep the context as small as can be. If you nuke 
NodeIdInfo, right now the code won't be able to do that anymore and you will 
stay with a bigger than necessary context for all the counters.

So just renewing is more efficient in that sense. But nuking the system table 
is still 'correct' as far as returning the correct count is involved.
{quote}

  was (Author: jbellis):
Pasting Sylvain's explanation from IRC:

{quote}
Let's me take a small example: Suppose two node A and B. Initially their 
node_id will be respectively A1 and B1. Each counter will thus have two 
components, A1 and B1.

Now suppose you renew the node_id of A - A2 because of a corruption. Soon 
enough, the counters will have 3 components A1, A2 and B1. Renew that yet 
another time and the counter context will be A1, A2, A3 and B1. It grows, which 
is not cool.
But because we know that nobody will ever increment A1 and A2 anymore (A3 is 
the active node id for A), we can merge them (we have to wait for gc_grace and 
stuff for that be correct etc... but we do it)

So basically we try to keep the context as small as can be. If you nuke 
NodeIdInfo, right now the code won't be able to do that anymore and you will 
stay with a bigger that necessary context for all the counters.

So just renewing is more efficient in that sense. But nuking the system table 
is still 'correct' as far as returning the correct count is involved.
{quoted}
  
 Add startup option renew the NodeId (for counters)
 --

 Key: CASSANDRA-2788
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2788
 Project: Cassandra
  Issue Type: Improvement
Affects Versions: 0.8.0
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Priority: Minor
  Labels: counters
 Fix For: 0.8.2

 Attachments: 0001-Option-to-renew-the-NodeId-on-startup.patch


 If an sstable of a counter column family is corrupted, the only safe solution 
 a user has right now is to:
 # Remove the NodeId system table to force the node to regenerate a new NodeId 
 (and thus stop incrementing on its previous, corrupted, subcount)
 # Remove all the sstables for that column family on that node (this is 
 important because otherwise the node will never get repaired for its 
 previous subcount)
 This is far from being ideal, but I think this is the price we pay for 
 avoiding the read-before-write. In any case, the first step (removing the 
 NodeId system table) happens to remove the list of the old NodeIds this node 
 has, which could prevent us from merging the other potential previous NodeIds. 
 This is ok but sub-optimal. This ticket proposes to add a new startup flag to 
 make the node renew its NodeId, thus replacing this first step.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2788) Add startup option renew the NodeId (for counters)

2011-06-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13051159#comment-13051159
 ] 

Jonathan Ellis commented on CASSANDRA-2788:
---

Pasting Sylvain's explanation from IRC:

{quote}
Let me take a small example: suppose two nodes, A and B. Initially their 
node_ids will be respectively A1 and B1. Each counter will thus have two 
components, A1 and B1.

Now suppose you renew the node_id of A -> A2 because of a corruption. Soon 
enough, the counters will have 3 components: A1, A2 and B1. Renew that yet 
another time and the counter context will be A1, A2, A3 and B1. It grows, which 
is not cool.
But because we know that nobody will ever increment A1 and A2 anymore (A3 is 
the active node id for A), we can merge them (we have to wait for gc_grace and 
stuff for that to be correct etc... but we do it).

So basically we try to keep the context as small as can be. If you nuke 
NodeIdInfo, right now the code won't be able to do that anymore and you will 
stay with a bigger than necessary context for all the counters.

So just renewing is more efficient in that sense. But nuking the system table 
is still 'correct' as far as returning the correct count is involved.
{quote}

 Add startup option renew the NodeId (for counters)
 --

 Key: CASSANDRA-2788
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2788
 Project: Cassandra
  Issue Type: Improvement
Affects Versions: 0.8.0
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Priority: Minor
  Labels: counters
 Fix For: 0.8.2

 Attachments: 0001-Option-to-renew-the-NodeId-on-startup.patch


 If an sstable of a counter column family is corrupted, the only safe solution 
 a user has right now is to:
 # Remove the NodeId system table to force the node to regenerate a new NodeId 
 (and thus stop incrementing on its previous, corrupted, subcount)
 # Remove all the sstables for that column family on that node (this is 
 important because otherwise the node will never get repaired for its 
 previous subcount)
 This is far from being ideal, but I think this is the price we pay for 
 avoiding the read-before-write. In any case, the first step (removing the 
 NodeId system table) happens to remove the list of the old NodeIds this node 
 has, which could prevent us from merging the other potential previous NodeIds. 
 This is ok but sub-optimal. This ticket proposes to add a new startup flag to 
 make the node renew its NodeId, thus replacing this first step.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2727) examples/hadoop_word_count reducer to cassandra doesn't output into the output_words cf

2011-06-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13051166#comment-13051166
 ] 

Hudson commented on CASSANDRA-2727:
---

Integrated in Cassandra-0.8 #174 (See 
[https://builds.apache.org/job/Cassandra-0.8/174/])
fix cassandra reducer example

Patch by tjake; reviewed by jbellis for CASSANDRA-2727

jake : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1136910
Files : 
* 
/cassandra/branches/cassandra-0.8/examples/hadoop_word_count/src/WordCount.java


 examples/hadoop_word_count reducer to cassandra doesn't output into the 
 output_words cf
 ---

 Key: CASSANDRA-2727
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2727
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.0 beta 2
Reporter: Jeremy Hanna
Assignee: T Jake Luciani
Priority: Minor
  Labels: hadoop
 Fix For: 0.8.2

 Attachments: v1-0001-CASSANDRA-2727-fix-for-word-count-reducer.txt


 I tried the examples/hadoop_word_count example and could output to the 
 filesystem but when I output to cassandra (the default), nothing shows up in 
 output_words.  I can output to cassandra using pig so I think the problem is 
 isolated to this example.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-2789) NPE on nodetool -h localhost -p 7199 info

2011-06-17 Thread Vijay (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-2789:
-

Issue Type: Sub-task  (was: Bug)
Parent: CASSANDRA-2491

 NPE on nodetool -h localhost -p 7199 info
 -

 Key: CASSANDRA-2789
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2789
 Project: Cassandra
  Issue Type: Sub-task
  Components: Tools
Affects Versions: 0.8.0
 Environment: JVM, Unix
Reporter: Vijay
Assignee: Vijay
Priority: Trivial

 We are used to running nodetool -h localhost as it is easy to create an alias 
 on the images (which are ok to break)... but this definitely breaks once the 
 broadcast patch is committed (CASSANDRA-2491)... JMX will not be able to 
 listen using the BA (NAT).
 Stack Trace:
 [ajami_mr_cassandratest@ajami_mr_cassandra--useast1a-i-e985c587 ~]$ 
 /apps/nfcassandra_server/bin/nodetool -h localhost -p 7501 info
 85070591730234615865843651857942052863
 Gossip active: true
 Load : 13.41 KB
 Generation No: 1308310669
 Uptime (seconds) : 17420
 Heap Memory (MB) : 272.52 / 12083.25
 Exception in thread main java.lang.NullPointerException
   at 
 org.apache.cassandra.locator.Ec2Snitch.getDatacenter(Ec2Snitch.java:93)
   at 
 org.apache.cassandra.locator.DynamicEndpointSnitch.getDatacenter(DynamicEndpointSnitch.java:122)
   at 
 org.apache.cassandra.locator.EndpointSnitchInfo.getDatacenter(EndpointSnitchInfo.java:49)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93)
   at 
 com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27)
   at 
 com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208)
   at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:120)
   at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:262)
   at 
 com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:836)
   at 
 com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1427)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
   at 
 javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:788)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
   at sun.rmi.transport.Transport$1.run(Transport.java:159)
   at java.security.AccessController.doPrivileged(Native Method)
   at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
   at 
 sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
   at 
 sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
   at 
 sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:662)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (CASSANDRA-2789) NPE on nodetool -h localhost -p 7199 info

2011-06-17 Thread Vijay (JIRA)
NPE on nodetool -h localhost -p 7199 info
-

 Key: CASSANDRA-2789
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2789
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Affects Versions: 0.8.0
 Environment: JVM, Unix
Reporter: Vijay
Assignee: Vijay
Priority: Trivial


We are used to running nodetool -h localhost as it is easy to create an alias 
on the images (which are ok to break)... but this definitely breaks once the 
broadcast patch is committed (CASSANDRA-2491)... JMX will not be able to listen 
using the BA (NAT).

Stack Trace:
[ajami_mr_cassandratest@ajami_mr_cassandra--useast1a-i-e985c587 ~]$ 
/apps/nfcassandra_server/bin/nodetool -h localhost -p 7501 info
85070591730234615865843651857942052863
Gossip active: true
Load : 13.41 KB
Generation No: 1308310669
Uptime (seconds) : 17420
Heap Memory (MB) : 272.52 / 12083.25
Exception in thread main java.lang.NullPointerException
at 
org.apache.cassandra.locator.Ec2Snitch.getDatacenter(Ec2Snitch.java:93)
at 
org.apache.cassandra.locator.DynamicEndpointSnitch.getDatacenter(DynamicEndpointSnitch.java:122)
at 
org.apache.cassandra.locator.EndpointSnitchInfo.getDatacenter(EndpointSnitchInfo.java:49)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93)
at 
com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27)
at 
com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208)
at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:120)
at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:262)
at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:836)
at 
com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761)
at 
javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1427)
at 
javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
at 
javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265)
at 
javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360)
at 
javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:788)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
at sun.rmi.transport.Transport$1.run(Transport.java:159)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
at 
sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
at 
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
at 
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-2789) NPE on nodetool -h localhost -p 7199 info

2011-06-17 Thread Vijay (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-2789:
-

Issue Type: Bug  (was: Sub-task)
Parent: (was: CASSANDRA-2491)

 NPE on nodetool -h localhost -p 7199 info
 -

 Key: CASSANDRA-2789
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2789
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Affects Versions: 0.8.0
 Environment: JVM, Unix
Reporter: Vijay
Assignee: Vijay
Priority: Trivial

 We are used to running nodetool -h localhost as it is easy to create an alias 
 on the images (which are ok to break)... but this definitely breaks once the 
 broadcast patch is committed (CASSANDRA-2491)... JMX will not be able to 
 listen using the BA (NAT).
 Stack Trace:
 [ajami_mr_cassandratest@ajami_mr_cassandra--useast1a-i-e985c587 ~]$ 
 /apps/nfcassandra_server/bin/nodetool -h localhost -p 7501 info
 85070591730234615865843651857942052863
 Gossip active: true
 Load : 13.41 KB
 Generation No: 1308310669
 Uptime (seconds) : 17420
 Heap Memory (MB) : 272.52 / 12083.25
 Exception in thread main java.lang.NullPointerException
   at 
 org.apache.cassandra.locator.Ec2Snitch.getDatacenter(Ec2Snitch.java:93)
   at 
 org.apache.cassandra.locator.DynamicEndpointSnitch.getDatacenter(DynamicEndpointSnitch.java:122)
   at 
 org.apache.cassandra.locator.EndpointSnitchInfo.getDatacenter(EndpointSnitchInfo.java:49)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93)
   at 
 com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27)
   at 
 com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208)
   at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:120)
   at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:262)
   at 
 com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:836)
   at 
 com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1427)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
   at 
 javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:788)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
   at sun.rmi.transport.Transport$1.run(Transport.java:159)
   at java.security.AccessController.doPrivileged(Native Method)
   at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
   at 
 sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
   at 
 sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
   at 
 sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:662)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-2780) sstable2json needs to escape quotes

2011-06-17 Thread Pavel Yaskevich (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Yaskevich updated CASSANDRA-2780:
---

Attachment: (was: CASSANDRA-2780-v2.patch)

 sstable2json needs to escape quotes
 ---

 Key: CASSANDRA-2780
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2780
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Timo Nentwig
Assignee: Pavel Yaskevich
Priority: Minor
 Fix For: 0.8.2

 Attachments: CASSANDRA-2780.patch


 [default@foo] set transactions[test][data]='{foo:bar}'; 
 $ cat /tmp/json
 {
 74657374: [[data, {foo:bar}, 1308209845388000]]
 }
 $ ./json2sstable -s -c transactions -K foo /tmp/json /tmp/ss-g-1-Data.db
 Counting keys to import, please wait... (NOTE: to skip this use -n num_keys)
 org.codehaus.jackson.JsonParseException: Unexpected character ('f' (code 
 102)): was expecting comma to separate ARRAY entries
  at [Source: /tmp/json2; line: 2, column: 27]
   at org.codehaus.jackson.JsonParser._constructError(JsonParser.java:929)
   at 
 org.codehaus.jackson.impl.JsonParserBase._reportError(JsonParserBase.java:632)
   at 
 org.codehaus.jackson.impl.JsonParserBase._reportUnexpectedChar(JsonParserBase.java:565)
   at 
 org.codehaus.jackson.impl.Utf8StreamParser.nextToken(Utf8StreamParser.java:128)
   at 
 org.codehaus.jackson.impl.JsonParserBase.skipChildren(JsonParserBase.java:263)
   at 
 org.apache.cassandra.tools.SSTableImport.importSorted(SSTableImport.java:328)
   at 
 org.apache.cassandra.tools.SSTableImport.importJson(SSTableImport.java:252)
   at org.apache.cassandra.tools.SSTableImport.main(SSTableImport.java:476)
 ERROR: Unexpected character ('f' (code 102)): was expecting comma to separate 
 ARRAY entries
  at [Source: /tmp/json2; line: 2, column: 27]
 http://www.mail-archive.com/user@cassandra.apache.org/msg14257.html

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (CASSANDRA-2790) SimpleStrategy enforces endpoints >= replicas when reading with ConsistencyLevel.ONE

2011-06-17 Thread Ivan Gorgiev (JIRA)
SimpleStrategy enforces endpoints >= replicas when reading with 
ConsistencyLevel.ONE


 Key: CASSANDRA-2790
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2790
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.6
 Environment: Linux 2.6.32-31-generic #61-Ubuntu SMP / Java HotSpot(TM) 
64-Bit Server VM (build 14.2-b01, mixed mode)
Reporter: Ivan Gorgiev


We use replication factor of 3 across our system, but in one case, during 
application bootstrap, we read a stored value with a local (in-process) call to 
StorageProxy.read(commands, ConsistencyLevel.ONE). This results in the 
following exception from SimpleStrategy: replication factor 3 exceeds number 
of endpoints 1. 

Shouldn't such a read operation always succeed as there is a guaranteed single 
Cassandra endpoint - the one processing the request? 

This code used to work with Cassandra 0.6.1 before we upgraded to 0.7.6.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2790) SimpleStrategy enforces endpoints >= replicas when reading with ConsistencyLevel.ONE

2011-06-17 Thread Ivan Gorgiev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13051201#comment-13051201
 ] 

Ivan Gorgiev commented on CASSANDRA-2790:
-

I should add that during the bootstrap of the node producing the exception (the 
seed node) all other nodes are down.

 SimpleStrategy enforces endpoints >= replicas when reading with 
 ConsistencyLevel.ONE
 

 Key: CASSANDRA-2790
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2790
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.6
 Environment: Linux 2.6.32-31-generic #61-Ubuntu SMP / Java 
 HotSpot(TM) 64-Bit Server VM (build 14.2-b01, mixed mode)
Reporter: Ivan Gorgiev

 We use replication factor of 3 across our system, but in one case, during 
 application bootstrap, we read a stored value with a local (in-process) call 
 to StorageProxy.read(commands, ConsistencyLevel.ONE). This results in the 
 following exception from SimpleStrategy: replication factor 3 exceeds number 
 of endpoints 1. 
 Shouldn't such a read operation always succeed as there is a guaranteed 
 single Cassandra endpoint - the one processing the request? 
 This code used to work with Cassandra 0.6.1 before we upgraded to 0.7.6.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-2773) Index manager cannot support deleting and inserting into a row in the same mutation

2011-06-17 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2773:
--

 Reviewer: slebresne
 Priority: Minor  (was: Major)
Affects Version/s: (was: 0.8.0)
   0.7.0
Fix Version/s: 0.8.2
  Summary: Index manager cannot support deleting and inserting 
into a row in the same mutation  (was: after restart cassandra, seeing Index 
manager cannot support deleting and inserting into a row in the same mutation.)

 Index manager cannot support deleting and inserting into a row in the same 
 mutation
 -

 Key: CASSANDRA-2773
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2773
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.7.0
Reporter: Boris Yen
Assignee: Jonathan Ellis
Priority: Minor
 Fix For: 0.8.2


 I use hector 0.8.0-1 and cassandra 0.8.
 1. create mutator by using hector api, 
 2. Insert a few columns into the mutator for key key1, cf standard. 
 3. add a deletion to the mutator to delete the record of key1, cf 
 standard.
 4. repeat 2 and 3
 5. execute the mutator.
 The result: the connection seems to be held by the server forever; it never 
 returns. When I tried to restart cassandra I saw an UnsupportedOperationException: 
 Index manager cannot support deleting and inserting into a row in the same 
 mutation, and cassandra stays dead until I delete the commitlog. 
 I would expect to get an exception when I execute the mutator, not after I 
 restart cassandra.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-2773) Index manager cannot support deleting and inserting into a row in the same mutation

2011-06-17 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2773:
--

Attachment: 2773.txt

We could add some validation logic, but it looks like it's almost as easy to 
just remove this limitation. Patch to do so is attached.

 Index manager cannot support deleting and inserting into a row in the same 
 mutation
 -

 Key: CASSANDRA-2773
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2773
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.7.0
Reporter: Boris Yen
Assignee: Jonathan Ellis
Priority: Minor
 Fix For: 0.8.2

 Attachments: 2773.txt


 I use hector 0.8.0-1 and cassandra 0.8.
 1. create mutator by using hector api, 
 2. Insert a few columns into the mutator for key key1, cf standard. 
 3. add a deletion to the mutator to delete the record of key1, cf 
 standard.
 4. repeat 2 and 3
 5. execute the mutator.
 the result: the connection seems to be held by the sever forever, it never 
 returns. when I tried to restart the cassandra I saw unsupportedexception : 
 Index manager cannot support deleting and inserting into a row in the same 
 mutation. and the cassandra is dead forever, unless I delete the commitlog. 
 I would expect to get an exception when I execute the mutator, not after I 
 restart the cassandra.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2477) CQL support for describing keyspaces / column familes

2011-06-17 Thread Rick Shaw (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051209#comment-13051209
 ] 

Rick Shaw commented on CASSANDRA-2477:
--

So from the client side you will need to know the name of the Keyspace and the 
names of the CFs where this data will be stored and updated by the server. And 
the server side will need to fully document the current schema description of 
the CF(s) to do the SELECT on, and keep the CFs updated with any additions and 
updates to the internal KS and CF metadata. But with that info in hand the 
client could just issue a SELECT on the involved CF to get the metadata that is 
currently held in the internal server metadata structures represented by the 
associated KS/CFs. Is that the plan? 

 CQL support for describing keyspaces / column familes
 -

 Key: CASSANDRA-2477
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2477
 Project: Cassandra
  Issue Type: Sub-task
  Components: API, Core
Reporter: Eric Evans
  Labels: cql
 Fix For: 0.8.2

 Attachments: 2477-virtual-cfs-false-start.txt




--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-1473) Implement a Cassandra aware Hadoop mapreduce.Partitioner

2011-06-17 Thread Patricio Echague (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051214#comment-13051214
 ] 

Patricio Echague commented on CASSANDRA-1473:
-

Stu, is this partitioner different from, for instance, 
BytesOrderedPartitioner?

Where should it be plugged in?

 Implement a Cassandra aware Hadoop mapreduce.Partitioner
 

 Key: CASSANDRA-1473
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1473
 Project: Cassandra
  Issue Type: Improvement
  Components: Hadoop
Reporter: Stu Hood
Assignee: Patricio Echague
 Fix For: 1.0


 When using a IPartitioner that does not sort data in byte order 
 (RandomPartitioner for example) with Cassandra's Hadoop integration, Hadoop 
 is unaware of the output order of the data.
 We can make Hadoop aware of the proper order of the output data by 
 implementing Hadoop's mapreduce.Partitioner interface: then Hadoop will 
 handle sorting all of the data according to Cassandra's IPartitioner, and the 
 writing clients will be able to connect to smaller numbers of Cassandra nodes.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-1473) Implement a Cassandra aware Hadoop mapreduce.Partitioner

2011-06-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051231#comment-13051231
 ] 

Jonathan Ellis commented on CASSANDRA-1473:
---

This is a Hadoop partitioner, not a Cassandra partitioner.

http://hadoop.apache.org/common/docs/r0.20.0/api/org/apache/hadoop/mapreduce/Partitioner.html
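
For anyone who wants a concrete picture, here is a minimal sketch of what such a 
Hadoop-side partitioner could look like. The class name, the boundary-token setup, 
and the use of RandomPartitioner are assumptions for illustration only; this is not 
the eventual patch.

{noformat}
import java.nio.ByteBuffer;
import java.util.Arrays;

import org.apache.cassandra.dht.IPartitioner;
import org.apache.cassandra.dht.RandomPartitioner;
import org.apache.cassandra.dht.Token;
import org.apache.hadoop.mapreduce.Partitioner;

// Sketch only: bucket map output keys by Cassandra token range so that each
// reducer ends up writing a contiguous slice of the ring.
public class CassandraOrderedPartitioner extends Partitioner<ByteBuffer, ByteBuffer>
{
    private final IPartitioner partitioner = new RandomPartitioner();
    // hypothetical: one upper-bound token per reducer, precomputed from the ring
    private final Token[] boundaries = loadBoundaries();

    @Override
    public int getPartition(ByteBuffer key, ByteBuffer value, int numReduceTasks)
    {
        Token t = partitioner.getToken(key);
        int idx = Arrays.binarySearch(boundaries, t);
        int bucket = idx >= 0 ? idx : -idx - 1;      // first boundary >= token
        return Math.min(bucket, numReduceTasks - 1);
    }

    private static Token[] loadBoundaries()
    {
        return new Token[0];                          // placeholder for a ring lookup
    }
}
{noformat}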

 Implement a Cassandra aware Hadoop mapreduce.Partitioner
 

 Key: CASSANDRA-1473
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1473
 Project: Cassandra
  Issue Type: Improvement
  Components: Hadoop
Reporter: Stu Hood
Assignee: Patricio Echague
 Fix For: 1.0


 When using a IPartitioner that does not sort data in byte order 
 (RandomPartitioner for example) with Cassandra's Hadoop integration, Hadoop 
 is unaware of the output order of the data.
 We can make Hadoop aware of the proper order of the output data by 
 implementing Hadoop's mapreduce.Partitioner interface: then Hadoop will 
 handle sorting all of the data according to Cassandra's IPartitioner, and the 
 writing clients will be able to connect to smaller numbers of Cassandra nodes.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-2789) NPE on nodetool -h localhost -p 7199 info

2011-06-17 Thread Vijay (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-2789:
-

Attachment: 2789-SVN-Patch.patch

Simple check of the IP's validity, switching to the broadcast IP when it is not valid.
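
In case it helps to visualize the change, here is a rough, self-contained sketch of 
the idea; the map, method, and addresses below are invented for illustration and are 
not the attached patch. If the address JMX hands the snitch has no metadata entry, 
fall back to the broadcast address before answering.

{noformat}
import java.net.InetAddress;
import java.util.HashMap;
import java.util.Map;

public class SnitchFallbackSketch
{
    // hypothetical stand-in for the snitch's per-endpoint metadata
    static Map<InetAddress, String> datacenterByEndpoint = new HashMap<InetAddress, String>();

    static String getDatacenter(InetAddress endpoint, InetAddress broadcastAddress)
    {
        String dc = datacenterByEndpoint.get(endpoint);
        if (dc == null)                      // e.g. 127.0.0.1 coming in via JMX behind NAT
            dc = datacenterByEndpoint.get(broadcastAddress);
        return dc;
    }

    public static void main(String[] args) throws Exception
    {
        InetAddress broadcast = InetAddress.getByName("198.51.100.7");   // invented public IP
        datacenterByEndpoint.put(broadcast, "us-east");
        // looking up localhost no longer NPEs; it resolves through the broadcast IP
        System.out.println(getDatacenter(InetAddress.getByName("127.0.0.1"), broadcast));
    }
}
{noformat}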

 NPE on nodetool -h localhost -p 7199 info
 -

 Key: CASSANDRA-2789
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2789
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Affects Versions: 0.8.0
 Environment: JVM, Unix
Reporter: Vijay
Assignee: Vijay
Priority: Trivial
 Attachments: 2789-SVN-Patch.patch


 We are used to running nodetool -h localhost as it is easy to create an alias on 
 the images (which are ok to break)... but this definitely breaks once the broadcast 
 patch is committed (CASSANDRA-2491)... JMX will not be able to listen using 
 the BA (NAT)
 Stack Trace:
 [ajami_mr_cassandratest@ajami_mr_cassandra--useast1a-i-e985c587 ~]$ 
 /apps/nfcassandra_server/bin/nodetool -h localhost -p 7501 info
 85070591730234615865843651857942052863
 Gossip active: true
 Load : 13.41 KB
 Generation No: 1308310669
 Uptime (seconds) : 17420
 Heap Memory (MB) : 272.52 / 12083.25
 Exception in thread "main" java.lang.NullPointerException
   at 
 org.apache.cassandra.locator.Ec2Snitch.getDatacenter(Ec2Snitch.java:93)
   at 
 org.apache.cassandra.locator.DynamicEndpointSnitch.getDatacenter(DynamicEndpointSnitch.java:122)
   at 
 org.apache.cassandra.locator.EndpointSnitchInfo.getDatacenter(EndpointSnitchInfo.java:49)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93)
   at 
 com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27)
   at 
 com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208)
   at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:120)
   at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:262)
   at 
 com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:836)
   at 
 com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1427)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
   at 
 javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:788)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
   at sun.rmi.transport.Transport$1.run(Transport.java:159)
   at java.security.AccessController.doPrivileged(Native Method)
   at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
   at 
 sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
   at 
 sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
   at 
 sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:662)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (CASSANDRA-2791) Redhat spec file needs some enhancements for 0.8 and beyond

2011-06-17 Thread Nate McCall (JIRA)
Redhat spec file needs some enhancements for 0.8 and beyond
---

 Key: CASSANDRA-2791
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2791
 Project: Cassandra
  Issue Type: Improvement
  Components: Packaging
Affects Versions: 0.8.0
Reporter: Nate McCall
 Fix For: 0.8.2, 1.0


Version and Release need to be brought up to date. Also need to account for 
multiple 'apache-cassandra' jars. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-2780) sstable2json needs to escape quotes

2011-06-17 Thread Pavel Yaskevich (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Yaskevich updated CASSANDRA-2780:
---

Attachment: CASSANDRA-2780-v2.patch

Uses ObjectMapper.writeValue(PrintStream, Object) to serialize keys and columns 
to JSON instead of using regexes.
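
As a rough illustration of what that buys us (a made-up, standalone example, not the 
patch itself), Jackson handles all of the quote escaping that the old regex-based 
helpers tried to do by hand:

{noformat}
import java.io.PrintStream;
import java.util.Arrays;
import java.util.List;

import org.codehaus.jackson.JsonGenerator;
import org.codehaus.jackson.map.ObjectMapper;

public class JsonQuotingDemo
{
    public static void main(String[] args) throws Exception
    {
        ObjectMapper mapper = new ObjectMapper();
        // keep System.out open between writeValue() calls, as the patch does
        mapper.configure(JsonGenerator.Feature.AUTO_CLOSE_TARGET, false);

        PrintStream out = System.out;
        // a column whose value contains quotes; Jackson escapes them for us
        List<Object> column = Arrays.<Object>asList("data", "{\"foo\":\"bar\"}", 1308209845388000L);
        mapper.writeValue(out, column);
        out.println();   // prints ["data","{\"foo\":\"bar\"}",1308209845388000]
    }
}
{noformat}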

 sstable2json needs to escape quotes
 ---

 Key: CASSANDRA-2780
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2780
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Timo Nentwig
Assignee: Pavel Yaskevich
Priority: Minor
 Fix For: 0.8.2

 Attachments: CASSANDRA-2780-v2.patch, CASSANDRA-2780.patch


 [default@foo] set transactions[test][data]='{foo:bar}'; 
 $ cat /tmp/json
 {
 74657374: [[data, {foo:bar}, 1308209845388000]]
 }
 $ ./json2sstable -s -c transactions -K foo /tmp/json /tmp/ss-g-1-Data.db
 Counting keys to import, please wait... (NOTE: to skip this use -n num_keys)
 org.codehaus.jackson.JsonParseException: Unexpected character ('f' (code 
 102)): was expecting comma to separate ARRAY entries
  at [Source: /tmp/json2; line: 2, column: 27]
   at org.codehaus.jackson.JsonParser._constructError(JsonParser.java:929)
   at 
 org.codehaus.jackson.impl.JsonParserBase._reportError(JsonParserBase.java:632)
   at 
 org.codehaus.jackson.impl.JsonParserBase._reportUnexpectedChar(JsonParserBase.java:565)
   at 
 org.codehaus.jackson.impl.Utf8StreamParser.nextToken(Utf8StreamParser.java:128)
   at 
 org.codehaus.jackson.impl.JsonParserBase.skipChildren(JsonParserBase.java:263)
   at 
 org.apache.cassandra.tools.SSTableImport.importSorted(SSTableImport.java:328)
   at 
 org.apache.cassandra.tools.SSTableImport.importJson(SSTableImport.java:252)
   at org.apache.cassandra.tools.SSTableImport.main(SSTableImport.java:476)
 ERROR: Unexpected character ('f' (code 102)): was expecting comma to separate 
 ARRAY entries
  at [Source: /tmp/json2; line: 2, column: 27]
 http://www.mail-archive.com/user@cassandra.apache.org/msg14257.html

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




svn commit: r1136991 - /cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/tools/SSTableExport.java

2011-06-17 Thread brandonwilliams
Author: brandonwilliams
Date: Fri Jun 17 19:20:18 2011
New Revision: 1136991

URL: http://svn.apache.org/viewvc?rev=1136991&view=rev
Log:
Use jackson's ObjectMapper instead of hand-crafting json in sstable2json.
Patch by Pavel Yaskevich, reviewed by brandonwilliams for CASSANDRA-2780

Modified:

cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/tools/SSTableExport.java

Modified: 
cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/tools/SSTableExport.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/tools/SSTableExport.java?rev=1136991&r1=1136990&r2=1136991&view=diff
==
--- 
cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/tools/SSTableExport.java
 (original)
+++ 
cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/tools/SSTableExport.java
 Fri Jun 17 19:20:18 2011
@@ -25,6 +25,7 @@ import java.nio.ByteBuffer;
 import java.util.*;
 
 import org.apache.cassandra.config.CFMetaData;
+import org.apache.cassandra.config.DatabaseDescriptor;
 import org.apache.cassandra.db.*;
 import org.apache.cassandra.db.marshal.AbstractType;
 import org.apache.cassandra.io.util.BufferedRandomAccessFile;
@@ -33,11 +34,11 @@ import org.apache.cassandra.service.Stor
 import org.apache.commons.cli.*;
 
 import org.apache.cassandra.config.ConfigurationException;
-import org.apache.cassandra.config.DatabaseDescriptor;
 import org.apache.cassandra.dht.IPartitioner;
 import org.apache.cassandra.io.sstable.*;
 import org.apache.cassandra.utils.ByteBufferUtil;
-import org.apache.cassandra.utils.Pair;
+import org.codehaus.jackson.JsonGenerator;
+import org.codehaus.jackson.map.ObjectMapper;
 
 import static org.apache.cassandra.utils.ByteBufferUtil.bytesToHex;
 import static org.apache.cassandra.utils.ByteBufferUtil.hexToBytes;
@@ -47,12 +48,12 @@ import static org.apache.cassandra.utils
  */
 public class SSTableExport
 {
-// size of the columns page
-private static final int PAGE_SIZE = 1000;
+private static ObjectMapper jsonMapper = new ObjectMapper();
 
 private static final String KEY_OPTION = "k";
 private static final String EXCLUDEKEY_OPTION = "x";
 private static final String ENUMERATEKEYS_OPTION = "e";
+
 private static Options options;
 private static CommandLine cmd;
 
@@ -72,47 +73,36 @@ public class SSTableExport
 
 Option optEnumerate = new Option(ENUMERATEKEYS_OPTION, false, 
"enumerate keys only");
 options.addOption(optEnumerate);
-}
 
-/**
- * Wraps given string into quotes
- * @param val string to quote
- * @return quoted string
- */
-private static String quote(String val)
-{
-return String.format("\"%s\"", escapeQuotes(val));
-}
-
-private static String escapeQuotes(String val)
-{
-return val.replace("\"", "\\\"");
+// disabling auto close of the stream
+jsonMapper.configure(JsonGenerator.Feature.AUTO_CLOSE_TARGET, false);
 }
 
 /**
  * JSON Hash Key serializer
- * @param val value to set as a key
- * @return JSON Hash key
+ *
+ * @param out The output steam to write data
+ * @param value value to set as a key
  */
-private static String asKey(String val)
+private static void writeKey(PrintStream out, String value)
 {
-return String.format("%s: ", quote(val));
+writeJSON(out, value);
+out.print(": ");
 }
 
 /**
  * Serialize columns using given column iterator
+ *
  * @param columns column iterator
  * @param out output stream
  * @param comparator columns comparator
  * @param cfMetaData Column Family metadata (to get validator)
- * @return pair of (number of columns serialized, last column serialized)
  */
 private static void serializeColumns(Iterator<IColumn> columns, 
PrintStream out, AbstractType comparator, CFMetaData cfMetaData)
 {
 while (columns.hasNext())
 {
-IColumn column = columns.next();
-serializeColumn(column, out, comparator, cfMetaData);
+writeJSON(out, serializeColumn(columns.next(), comparator, 
cfMetaData));
 
 if (columns.hasNext())
 out.print(", ");
@@ -121,47 +111,42 @@ public class SSTableExport
 
 /**
  * Serialize a given column to the JSON format
+ *
  * @param column column presentation
- * @param out output stream
  * @param comparator columns comparator
  * @param cfMetaData Column Family metadata (to get validator)
+ *
+ * @return column as serialized list
  */
-private static void serializeColumn(IColumn column, PrintStream out, 
AbstractType comparator, CFMetaData cfMetaData)
+private static List<Object> serializeColumn(IColumn column, AbstractType 
comparator, CFMetaData cfMetaData)
 {
+ArrayList<Object> serializedColumn = new ArrayList<Object>();
+
 

[jira] [Closed] (CASSANDRA-2780) sstable2json needs to escape quotes

2011-06-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams closed CASSANDRA-2780.
---


Committed v2, thanks!

 sstable2json needs to escape quotes
 ---

 Key: CASSANDRA-2780
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2780
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Timo Nentwig
Assignee: Pavel Yaskevich
Priority: Minor
 Fix For: 0.8.2

 Attachments: CASSANDRA-2780-v2.patch, CASSANDRA-2780.patch


 [default@foo] set transactions[test][data]='{foo:bar}'; 
 $ cat /tmp/json
 {
 74657374: [[data, {foo:bar}, 1308209845388000]]
 }
 $ ./json2sstable -s -c transactions -K foo /tmp/json /tmp/ss-g-1-Data.db
 Counting keys to import, please wait... (NOTE: to skip this use -n num_keys)
 org.codehaus.jackson.JsonParseException: Unexpected character ('f' (code 
 102)): was expecting comma to separate ARRAY entries
  at [Source: /tmp/json2; line: 2, column: 27]
   at org.codehaus.jackson.JsonParser._constructError(JsonParser.java:929)
   at 
 org.codehaus.jackson.impl.JsonParserBase._reportError(JsonParserBase.java:632)
   at 
 org.codehaus.jackson.impl.JsonParserBase._reportUnexpectedChar(JsonParserBase.java:565)
   at 
 org.codehaus.jackson.impl.Utf8StreamParser.nextToken(Utf8StreamParser.java:128)
   at 
 org.codehaus.jackson.impl.JsonParserBase.skipChildren(JsonParserBase.java:263)
   at 
 org.apache.cassandra.tools.SSTableImport.importSorted(SSTableImport.java:328)
   at 
 org.apache.cassandra.tools.SSTableImport.importJson(SSTableImport.java:252)
   at org.apache.cassandra.tools.SSTableImport.main(SSTableImport.java:476)
 ERROR: Unexpected character ('f' (code 102)): was expecting comma to separate 
 ARRAY entries
  at [Source: /tmp/json2; line: 2, column: 27]
 http://www.mail-archive.com/user@cassandra.apache.org/msg14257.html

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2780) sstable2json needs to escape quotes

2011-06-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051311#comment-13051311
 ] 

Hudson commented on CASSANDRA-2780:
---

Integrated in Cassandra-0.8 #175 (See 
[https://builds.apache.org/job/Cassandra-0.8/175/])
Use jackson's ObjectMapper instead of hand-crafting json in sstable2json.
Patch by Pavel Yaskevich, reviewed by brandonwilliams for CASSANDRA-2780

brandonwilliams : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1136991
Files : 
* 
/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/tools/SSTableExport.java


 sstable2json needs to escape quotes
 ---

 Key: CASSANDRA-2780
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2780
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Timo Nentwig
Assignee: Pavel Yaskevich
Priority: Minor
 Fix For: 0.8.2

 Attachments: CASSANDRA-2780-v2.patch, CASSANDRA-2780.patch


 [default@foo] set transactions[test][data]='{foo:bar}'; 
 $ cat /tmp/json
 {
 74657374: [[data, {foo:bar}, 1308209845388000]]
 }
 $ ./json2sstable -s -c transactions -K foo /tmp/json /tmp/ss-g-1-Data.db
 Counting keys to import, please wait... (NOTE: to skip this use -n num_keys)
 org.codehaus.jackson.JsonParseException: Unexpected character ('f' (code 
 102)): was expecting comma to separate ARRAY entries
  at [Source: /tmp/json2; line: 2, column: 27]
   at org.codehaus.jackson.JsonParser._constructError(JsonParser.java:929)
   at 
 org.codehaus.jackson.impl.JsonParserBase._reportError(JsonParserBase.java:632)
   at 
 org.codehaus.jackson.impl.JsonParserBase._reportUnexpectedChar(JsonParserBase.java:565)
   at 
 org.codehaus.jackson.impl.Utf8StreamParser.nextToken(Utf8StreamParser.java:128)
   at 
 org.codehaus.jackson.impl.JsonParserBase.skipChildren(JsonParserBase.java:263)
   at 
 org.apache.cassandra.tools.SSTableImport.importSorted(SSTableImport.java:328)
   at 
 org.apache.cassandra.tools.SSTableImport.importJson(SSTableImport.java:252)
   at org.apache.cassandra.tools.SSTableImport.main(SSTableImport.java:476)
 ERROR: Unexpected character ('f' (code 102)): was expecting comma to separate 
 ARRAY entries
  at [Source: /tmp/json2; line: 2, column: 27]
 http://www.mail-archive.com/user@cassandra.apache.org/msg14257.html

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-2452) New EC2 Snitch to use public ip and hence natively support for EC2 mult-region's.

2011-06-17 Thread Vijay (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-2452:
-

Attachment: (was: 2452-Ec2Multi-Region-v3.rtf)

 New EC2 Snitch to use public ip and hence natively support for EC2 
 mult-region's.
 -

 Key: CASSANDRA-2452
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2452
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Affects Versions: 0.8 beta 1
 Environment: JVM
Reporter: Vijay
Assignee: Vijay
Priority: Minor
 Fix For: 0.8.2

 Attachments: 2452-EC2Snitch-Changes.txt, 
 2452-Ec2Multi-Region-v3.patch, 2452-Ec2Multi-Region.patch, 
 2452-Intro-EC2MultiRegionSnitch-V2.txt, 2452-OutboundTCPConnection.patch


 Make cassandra identify itself using the public ip (to avoid any future 
 conflicts of private ips).
 1) Split the logic of identification vs listen Address in the code.
 2) Move the logic to assign IP address to the node into EndPointSnitch.
 3) Make EC2 Snitch query for its public ip and use it for identification.
 4) Make EC2 snitch to use InetAddress.getLocal to listen to the private ip.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-2452) New EC2 Snitch to use public ip and hence natively support for EC2 mult-region's.

2011-06-17 Thread Vijay (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-2452:
-

Attachment: 2452-Ec2Multi-Region-v3.patch

 New EC2 Snitch to use public ip and hence natively support for EC2 
 mult-region's.
 -

 Key: CASSANDRA-2452
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2452
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Affects Versions: 0.8 beta 1
 Environment: JVM
Reporter: Vijay
Assignee: Vijay
Priority: Minor
 Fix For: 0.8.2

 Attachments: 2452-EC2Snitch-Changes.txt, 
 2452-Ec2Multi-Region-v3.patch, 2452-Ec2Multi-Region.patch, 
 2452-Intro-EC2MultiRegionSnitch-V2.txt, 2452-OutboundTCPConnection.patch


 Make cassandra talk identify itself using the public ip (To avoid any future 
 conflicts of private ips).
 1) Split the logic of identification vs listen Address in the code.
 2) Move the logic to assign IP address to the node into EndPointSnitch.
 3) Make EC2 Snitch query for its public ip and use it for identification.
 4) Make EC2 snitch to use InetAddress.getLocal to listen to the private ip.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-2452) New EC2 Snitch to use public ip and hence natively support for EC2 mult-region's.

2011-06-17 Thread Vijay (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-2452:
-

Attachment: 2452-Ec2Multi-Region-v3.patch

 New EC2 Snitch to use public ip and hence natively support for EC2 
 mult-region's.
 -

 Key: CASSANDRA-2452
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2452
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Affects Versions: 0.8 beta 1
 Environment: JVM
Reporter: Vijay
Assignee: Vijay
Priority: Minor
 Fix For: 0.8.2

 Attachments: 2452-EC2Snitch-Changes.txt, 
 2452-Ec2Multi-Region-v3.patch, 2452-Ec2Multi-Region.patch, 
 2452-Intro-EC2MultiRegionSnitch-V2.txt, 2452-OutboundTCPConnection.patch


 Make cassandra talk identify itself using the public ip (To avoid any future 
 conflicts of private ips).
 1) Split the logic of identification vs listen Address in the code.
 2) Move the logic to assign IP address to the node into EndPointSnitch.
 3) Make EC2 Snitch query for its public ip and use it for identification.
 4) Make EC2 snitch to use InetAddress.getLocal to listen to the private ip.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-2452) New EC2 Snitch to use public ip and hence natively support for EC2 mult-region's.

2011-06-17 Thread Vijay (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-2452:
-

Attachment: (was: 2452-Ec2Multi-Region-v3.patch)

 New EC2 Snitch to use public ip and hence natively support for EC2 
 mult-region's.
 -

 Key: CASSANDRA-2452
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2452
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Affects Versions: 0.8 beta 1
 Environment: JVM
Reporter: Vijay
Assignee: Vijay
Priority: Minor
 Fix For: 0.8.2

 Attachments: 2452-EC2Snitch-Changes.txt, 
 2452-Ec2Multi-Region-v3.patch, 2452-Ec2Multi-Region.patch, 
 2452-Intro-EC2MultiRegionSnitch-V2.txt, 2452-OutboundTCPConnection.patch


 Make cassandra talk identify itself using the public ip (To avoid any future 
 conflicts of private ips).
 1) Split the logic of identification vs listen Address in the code.
 2) Move the logic to assign IP address to the node into EndPointSnitch.
 3) Make EC2 Snitch query for its public ip and use it for identification.
 4) Make EC2 snitch to use InetAddress.getLocal to listen to the private ip.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2491) A new config parameter, broadcast_address

2011-06-17 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051324#comment-13051324
 ] 

Brandon Williams commented on CASSANDRA-2491:
-

I'm getting EOF exceptions on startup from ITC with this patch.

{noformat}
ERROR 21:06:10,912 Fatal exception in thread Thread[Thread-6,5,main]
java.io.IOError: java.io.EOFException
at 
org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:112)
Caused by: java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:375)
at 
org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:98)
{noformat}

One for every node in the cluster, per machine.

 A new config parameter, broadcast_address
 -

 Key: CASSANDRA-2491
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2491
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: x86_64 GNU/Linux
Reporter: Khee Chin
Assignee: Khee Chin
Priority: Trivial
  Labels: patch
 Fix For: 1.0

 Attachments: 2491_broadcast_address.patch, 
 2491_broadcast_address_v2.patch, 2491_broadcast_address_v3.patch, 
 2491_broadcast_address_v4.patch, 2491_broadcast_address_v5.patch

   Original Estimate: 336h
  Remaining Estimate: 336h

 A new config parameter, broadcast_address
 In a cluster setup where one or more nodes is behind a firewall and has a 
 private ip address, listen_address does not allow the hosts behind the 
 firewalls to be discovered by other nodes.
 Attached is a patch that introduces a new config parameter broadcast_address 
 which allows Cassandra nodes to explicitly specify their external ip address. 
 In addition, this allows listen_address to be set to 0.0.0.0 on the already 
 firewalled node.
 broadcast_address falls back to listen_address when it is not stated.
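
A minimal sketch of that fallback, with invented stand-ins for the parsed yaml 
values (not the actual patch code):

{noformat}
import java.net.InetAddress;

public class BroadcastAddressFallback
{
    // hypothetical stand-ins for values read from cassandra.yaml
    static String listenAddress = "10.0.0.5";   // private / firewalled address
    static String broadcastAddress = null;      // not stated

    public static void main(String[] args) throws Exception
    {
        InetAddress listen = InetAddress.getByName(listenAddress);
        // broadcast_address falls back to listen_address when it is not stated
        InetAddress broadcast = broadcastAddress == null
                              ? listen
                              : InetAddress.getByName(broadcastAddress);
        System.out.println("announce to other nodes as: " + broadcast.getHostAddress());
    }
}
{noformat}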

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-2103) expiring counter columns

2011-06-17 Thread Yang Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051329#comment-13051329
 ] 

Yang Yang edited comment on CASSANDRA-2103 at 6/17/11 9:22 PM:
---

there could be a problem with trying to rely on forcing compaction order:

if you base the intended order on max timestamp of each sstable, the timestamp 
is not trustworthy, because a single malicious client request can bump up its 
timestamp to the future, and arbitrarily change the order of compaction, thus 
rendering the approach in 2735 useless.

  was (Author: yangyangyyy):
there could be a problem with trying to relying on forcing compaction order:

if you base the intended order on max timestamp of each sstable, the timestamp 
is not trustworthy, because a single malicious client request can bump up its 
timestamp to the future, and arbitrarily change the order of compaction, thus 
rendering the approach in 2735 useless.
  
 expiring counter columns
 

 Key: CASSANDRA-2103
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2103
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Affects Versions: 0.8 beta 1
Reporter: Kelvin Kakugawa
 Attachments: 0001-CASSANDRA-2103-expiring-counters-logic-tests.patch


 add ttl functionality to counter columns.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2103) expiring counter columns

2011-06-17 Thread Yang Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051329#comment-13051329
 ] 

Yang Yang commented on CASSANDRA-2103:
--

there could be a problem with trying to relying on forcing compaction order:

if you base the intended order on max timestamp of each sstable, the timestamp 
is not trustworthy, because a single malicious client request can bump up its 
timestamp to the future, and arbitrarily change the order of compaction, thus 
rendering the approach in 2735 useless.

 expiring counter columns
 

 Key: CASSANDRA-2103
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2103
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Affects Versions: 0.8 beta 1
Reporter: Kelvin Kakugawa
 Attachments: 0001-CASSANDRA-2103-expiring-counters-logic-tests.patch


 add ttl functionality to counter columns.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-2103) expiring counter columns

2011-06-17 Thread Yang Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051329#comment-13051329
 ] 

Yang Yang edited comment on CASSANDRA-2103 at 6/17/11 9:27 PM:
---

there could be a problem with trying to rely on forcing compaction order:

if you base the intended order on max timestamp of each sstable, the timestamp 
is not trustworthy, because a single malicious client request can bump up its 
timestamp to the future, and arbitrarily change the order of compaction, thus 
rendering the approach in 2735 useless.

you can't base the order on the physical sstable flush time either, since 
different nodes have different flush times.

overall I think trying to fix the compaction order is not the correction 
direction to attack this problem: the issue here is due to the changing order 
between *individual counter adds/deletes(auto-expire is same as delete)*, this 
order can be different between different counters, so you have to fix the order 
between the updates within each counter, *not the order between ensembles of 
counters*. such ensembles of counters do not guarantee any orders at all, due 
to randomness in flushing time, or message delivery (they have similar effects)



  was (Author: yangyangyyy):
there could be a problem with trying to rely on forcing compaction order:

if you base the intended order on max timestamp of each sstable, the timestamp 
is not trustworthy, because a single malicious client request can bump up its 
timestamp to the future, and arbitrarily change the order of compaction, thus 
rendering the approach in 2735 useless.

you can't base the order on the physical sstable flush time either, since 
different nodes have different flush times.

overall I think trying to fix the compaction order is not the correction 
direction to attack this problem: the issue here is due to the changing order 
between *individual counter adds*, this order can be different between 
different counters, so you have to fix the order between the updates within 
each counter, *not the order between ensembles of counters*. such ensembles of 
counters do not guarantee any orders at all, due to randomness in flushing 
time, or message delivery (they have similar effects)


  
 expiring counter columns
 

 Key: CASSANDRA-2103
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2103
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Affects Versions: 0.8 beta 1
Reporter: Kelvin Kakugawa
 Attachments: 0001-CASSANDRA-2103-expiring-counters-logic-tests.patch


 add ttl functionality to counter columns.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-2103) expiring counter columns

2011-06-17 Thread Yang Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051329#comment-13051329
 ] 

Yang Yang edited comment on CASSANDRA-2103 at 6/17/11 9:27 PM:
---

there could be a problem with trying to rely on forcing compaction order:

if you base the intended order on max timestamp of each sstable, the timestamp 
is not trustworthy, because a single malicious client request can bump up its 
timestamp to the future, and arbitrarily change the order of compaction, thus 
rendering the approach in 2735 useless.

you can't base the order on the physical sstable flush time either, since 
different nodes have different flush times.

overall I think trying to fix the compaction order is not the correction 
direction to attack this problem: the issue here is due to the changing order 
between *individual counter adds/deletes* (auto-expire is same as delete), this 
order can be different between different counters, so you have to fix the order 
between the updates within each counter, *not the order between ensembles of 
counters*. such ensembles of counters do not guarantee any orders at all, due 
to randomness in flushing time, or message delivery (they have similar effects)



  was (Author: yangyangyyy):
there could be a problem with trying to rely on forcing compaction order:

if you base the intended order on max timestamp of each sstable, the timestamp 
is not trustworthy, because a single malicious client request can bump up its 
timestamp to the future, and arbitrarily change the order of compaction, thus 
rendering the approach in 2735 useless.

you can't base the order on the physical sstable flush time either, since 
different nodes have different flush times.

overall I think trying to fix the compaction order is not the correction 
direction to attack this problem: the issue here is due to the changing order 
between *individual counter adds/deletes(auto-expire is same as delete)*, this 
order can be different between different counters, so you have to fix the order 
between the updates within each counter, *not the order between ensembles of 
counters*. such ensembles of counters do not guarantee any orders at all, due 
to randomness in flushing time, or message delivery (they have similar effects)


  
 expiring counter columns
 

 Key: CASSANDRA-2103
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2103
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Affects Versions: 0.8 beta 1
Reporter: Kelvin Kakugawa
 Attachments: 0001-CASSANDRA-2103-expiring-counters-logic-tests.patch


 add ttl functionality to counter columns.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-2103) expiring counter columns

2011-06-17 Thread Yang Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051329#comment-13051329
 ] 

Yang Yang edited comment on CASSANDRA-2103 at 6/17/11 9:26 PM:
---

there could be a problem with trying to rely on forcing compaction order:

if you base the intended order on max timestamp of each sstable, the timestamp 
is not trustworthy, because a single malicious client request can bump up its 
timestamp to the future, and arbitrarily change the order of compaction, thus 
rendering the approach in 2735 useless.

you can't base the order on the physical sstable flush time either, since 
different nodes have different flush times.

overall I think trying to fix the compaction order is not the correction 
direction to attack this problem: the issue here is due to the changing order 
between *individual counter adds*, this order can be different between 
different counters, so you have to fix the order between the updates within 
each counter, *not the order between ensembles of counters*. such ensembles of 
counters do not guarantee any orders at all, due to randomness in flushing 
time, or message delivery (they have similar effects)



  was (Author: yangyangyyy):
there could be a problem with trying to rely on forcing compaction order:

if you base the intended order on max timestamp of each sstable, the timestamp 
is not trustworthy, because a single malicious client request can bump up its 
timestamp to the future, and arbitrarily change the order of compaction, thus 
rendering the approach in 2735 useless.
  
 expiring counter columns
 

 Key: CASSANDRA-2103
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2103
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Affects Versions: 0.8 beta 1
Reporter: Kelvin Kakugawa
 Attachments: 0001-CASSANDRA-2103-expiring-counters-logic-tests.patch


 add ttl functionality to counter columns.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-2103) expiring counter columns

2011-06-17 Thread Yang Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051329#comment-13051329
 ] 

Yang Yang edited comment on CASSANDRA-2103 at 6/17/11 9:40 PM:
---

there could be a problem with trying to rely on forcing compaction order:

if you base the intended order on max timestamp of each sstable, the timestamp 
is not trustworthy, because a single malicious client request can bump up its 
timestamp to the future, and arbitrarily change the order of compaction, thus 
rendering the approach in 2735 useless.

you can't base the order on the physical sstable flush time either, since 
different nodes have different flush times.

overall I think trying to fix the compaction order is not the correct direction 
to attack this problem: the issue here is due to the changing order between 
*individual counter adds/deletes* (auto-expire is same as delete), this order 
can be different between different counters, so you have to fix the order 
between the updates within each counter, *not the order between ensembles of 
counters*. such ensembles of counters do not guarantee any orders at all, due 
to randomness in flushing time, or message delivery (they have similar effects)



  was (Author: yangyangyyy):
there could be a problem with trying to rely on forcing compaction order:

if you base the intended order on max timestamp of each sstable, the timestamp 
is not trustworthy, because a single malicious client request can bump up its 
timestamp to the future, and arbitrarily change the order of compaction, thus 
rendering the approach in 2735 useless.

you can't base the order on the physical sstable flush time either, since 
different nodes have different flush times.

overall I think trying to fix the compaction order is not the correction 
direction to attack this problem: the issue here is due to the changing order 
between *individual counter adds/deletes* (auto-expire is same as delete), this 
order can be different between different counters, so you have to fix the order 
between the updates within each counter, *not the order between ensembles of 
counters*. such ensembles of counters do not guarantee any orders at all, due 
to randomness in flushing time, or message delivery (they have similar effects)


  
 expiring counter columns
 

 Key: CASSANDRA-2103
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2103
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Affects Versions: 0.8 beta 1
Reporter: Kelvin Kakugawa
 Attachments: 0001-CASSANDRA-2103-expiring-counters-logic-tests.patch


 add ttl functionality to counter columns.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2735) Timestamp Based Compaction Strategy

2011-06-17 Thread Yang Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051345#comment-13051345
 ] 

Yang Yang commented on CASSANDRA-2735:
--

there could be a problem with trying to rely on forcing compaction order to 
make counter expiration work:

if you base the intended order on max timestamp of each sstable, the timestamp 
is not trustworthy, because a single malicious client request can bump up its 
timestamp to the future, and arbitrarily change the order of compaction, thus 
rendering the approach in 2735 useless.

you can't base the order on the physical sstable flush time either, since 
different nodes have different flush times.

overall I think trying to fix the compaction order is not the correct direction 
to attack this problem: the issue here is due to the changing order between 
individual counter adds/deletes (auto-expire is same as delete), this order can 
be different between different counters, so you have to fix the order between 
the updates within each counter, not the order between ensembles of counters. 
such ensembles of counters do not guarantee any orders at all, due to 
randomness in flushing time, or message delivery (they have similar effects)


 Timestamp Based Compaction Strategy
 ---

 Key: CASSANDRA-2735
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2735
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Alan Liang
Assignee: Alan Liang
Priority: Minor
  Labels: compaction
 Attachments: 0004-timestamp-bucketed-compaction-strategy.patch


 Compaction strategy implementation based on max timestamp ordering of the 
 sstables while satisfying max sstable size, min and max compaction 
 thresholds. It also handles expiration of sstables based on a timestamp.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-2735) Timestamp Based Compaction Strategy

2011-06-17 Thread Yang Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051345#comment-13051345
 ] 

Yang Yang edited comment on CASSANDRA-2735 at 6/17/11 9:47 PM:
---

there could be a problem with trying to rely on forcing compaction order to 
make counter expiration work:

if you base the intended order on max timestamp of each sstable, the timestamp 
is not trustworthy, because a single malicious client request can bump up its 
timestamp to the future, and arbitrarily change the order of compaction, thus 
rendering the approach in 2735 useless.

you can't base the order on the physical sstable flush time either, since 
different nodes have different flush times.

overall I think trying to fix the compaction order is not the correct direction 
to attack this problem: the issue here is due to the changing order between 
*individual* counter adds/deletes (auto-expire is same as delete), this order 
can be different between different counters, so you have to fix the order 
between the updates within each counter, not the order between *ensembles of 
counters*. such ensembles of counters do not guarantee any orders at all, due 
to randomness in flushing time, or message delivery (they have similar effects)


  was (Author: yangyangyyy):
there could be a problem with trying to rely on forcing compaction order to 
make counter expiration work:

if you base the intended order on max timestamp of each sstable, the timestamp 
is not trustworthy, because a single malicious client request can bump up its 
timestamp to the future, and arbitrarily change the order of compaction, thus 
rendering the approach in 2735 useless.

you can't base the order on the physical sstable flush time either, since 
different nodes have different flush times.

overall I think trying to fix the compaction order is not the correct direction 
to attack this problem: the issue here is due to the changing order between 
individual counter adds/deletes (auto-expire is same as delete), this order can 
be different between different counters, so you have to fix the order between 
the updates within each counter, not the order between ensembles of counters. 
such ensembles of counters do not guarantee any orders at all, due to 
randomness in flushing time, or message delivery (they have similar effects)

  
 Timestamp Based Compaction Strategy
 ---

 Key: CASSANDRA-2735
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2735
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Alan Liang
Assignee: Alan Liang
Priority: Minor
  Labels: compaction
 Attachments: 0004-timestamp-bucketed-compaction-strategy.patch


 Compaction strategy implementation based on max timestamp ordering of the 
 sstables while satisfying max sstable size, min and max compaction 
 thresholds. It also handles expiration of sstables based on a timestamp.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-2103) expiring counter columns

2011-06-17 Thread Yang Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yang updated CASSANDRA-2103:
-

Comment: was deleted

(was: there could be a problem with trying to rely on forcing compaction order:

if you base the intended order on max timestamp of each sstable, the timestamp 
is not trustworthy, because a single malicious client request can bump up its 
timestamp to the future, and arbitrarily change the order of compaction, thus 
rendering the approach in 2735 useless.

you can't base the order on the physical sstable flush time either, since 
different nodes have different flush times.

overall I think trying to fix the compaction order is not the correct direction 
to attack this problem: the issue here is due to the changing order between 
*individual counter adds/deletes* (auto-expire is same as delete), this order 
can be different between different counters, so you have to fix the order 
between the updates within each counter, *not the order between ensembles of 
counters*. such ensembles of counters do not guarantee any orders at all, due 
to randomness in flushing time, or message delivery (they have similar effects)

)

 expiring counter columns
 

 Key: CASSANDRA-2103
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2103
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Affects Versions: 0.8 beta 1
Reporter: Kelvin Kakugawa
 Attachments: 0001-CASSANDRA-2103-expiring-counters-logic-tests.patch


 add ttl functionality to counter columns.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2735) Timestamp Based Compaction Strategy

2011-06-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051347#comment-13051347
 ] 

Jonathan Ellis commented on CASSANDRA-2735:
---

I don't think that's worth worrying about; there's lots worse things malicious 
clients can do, than prevent you from throwing away data that's expired. :)

 Timestamp Based Compaction Strategy
 ---

 Key: CASSANDRA-2735
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2735
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Alan Liang
Assignee: Alan Liang
Priority: Minor
  Labels: compaction
 Attachments: 0004-timestamp-bucketed-compaction-strategy.patch


 Compaction strategy implementation based on max timestamp ordering of the 
 sstables while satisfying max sstable size, min and max compaction 
 thresholds. It also handles expiration of sstables based on a timestamp.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-2735) Timestamp Based Compaction Strategy

2011-06-17 Thread Yang Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051345#comment-13051345
 ] 

Yang Yang edited comment on CASSANDRA-2735 at 6/17/11 9:52 PM:
---

there could be a problem with trying to rely on forcing compaction order to 
make counter expiration work:

if you base the intended order on max timestamp of each sstable, the timestamp 
is not trustworthy, because a single malicious client request can bump up its 
timestamp to the future, and arbitrarily change the order of compaction, thus 
rendering the approach in 2735 useless.

you can't base the order on the physical sstable flush time either, since 
different nodes have different flush times.

overall I think trying to fix the compaction order is not the correct direction 
to attack this problem: the issue here is due to the changing order between 
*individual* counter adds/deletes (auto-expire is same as delete), this order 
can be different between different counters, so you have to fix the order 
between the updates within each counter, not the order between *ensembles of 
counters*. such ensembles of counters do not guarantee any orders at all, due 
to randomness in flushing time, or message delivery (they have similar effects)

the problem with the current counter+delete implementation is that counters use 
timestamp() to represent their order, but when they are merged, they lose their 
*individual order* and retain a max timestamp(), which supposedly represents 
the order of the ensemble; this is meaningless because the order of the 
ensemble is different from the true order.
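
To make the loss of order concrete, here is a hypothetical sequence (the numbers 
are invented purely for illustration):

{noformat}
add A   (ts=1, +5)
delete  (ts=2)        -- should discard A
add B   (ts=3, +2)    -- expected final value: 2

if A and B are merged first, the merged shard keeps max ts=3 and value 7;
the delete at ts=2 now looks older than the whole shard and suppresses
nothing, so the counter reads 7 instead of 2.
{noformat}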



  was (Author: yangyangyyy):
there could be a problem with trying to rely on forcing compaction order to 
make counter expiration work:

if you base the intended order on max timestamp of each sstable, the timestamp 
is not trustworthy, because a single malicious client request can bump up its 
timestamp to the future, and arbitrarily change the order of compaction, thus 
rendering the approach in 2735 useless.

you can't base the order on the physical sstable flush time either, since 
different nodes have different flush times.

overall I think trying to fix the compaction order is not the correct direction 
to attack this problem: the issue here is due to the changing order between 
*individual* counter adds/deletes (auto-expire is same as delete), this order 
can be different between different counters, so you have to fix the order 
between the updates within each counter, not the order between *ensembles of 
counters*. such ensembles of counters do not guarantee any orders at all, due 
to randomness in flushing time, or message delivery (they have similar effects)

  
 Timestamp Based Compaction Strategy
 ---

 Key: CASSANDRA-2735
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2735
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Alan Liang
Assignee: Alan Liang
Priority: Minor
  Labels: compaction
 Attachments: 0004-timestamp-bucketed-compaction-strategy.patch


 Compaction strategy implementation based on max timestamp ordering of the 
 sstables while satisfying max sstable size, min and max compaction 
 thresholds. It also handles expiration of sstables based on a timestamp.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2735) Timestamp Based Compaction Strategy

2011-06-17 Thread Yang Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13051350#comment-13051350
 ] 

Yang Yang commented on CASSANDRA-2735:
--

true, I don't really care about expired data; I'm happy as long as we have an 
expiring counter that mostly works, or works with certain cautions.

but it seems that the change in order can come not only from compaction (which 
is fixed for realistic scenarios here), but also from the effects of message 
drops.

compact( compact (Add1, delete), Add2)  is the same as receiving Add1, delete, 
Add2 in messages.

but we know that messages can easily be dropped. 
so let's say the delete is dropped and we replay it later (through repair, for 
example); the same issue appears. 
I think the latter issue can be fixed by changing the TTL reconcile rule so 
that the reconciled death time is the older death time, not timestamp+new_TTL.

anyhow I think we users of the counters API need to understand that placing a 
delete shortly after your last update, or an update shortly after a delete, is 
most likely not going to work. this patch fixes half of the issue, but the 
rest remains.
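
A minimal Java sketch of the reconcile rule suggested above; the class and 
method names are hypothetical and only illustrate keeping the older death time 
instead of recomputing it as timestamp + new TTL:

{code}
// Hedged sketch: when two tombstones/expirations for the same counter meet,
// keep the older (earlier) death time rather than re-deriving it as
// timestamp + new TTL, so a replayed delete cannot extend the counter's life.
// Names are illustrative, not from the Cassandra codebase.
public final class CounterExpirationSketch
{
    public static long reconcileDeathTime(long existingDeathTimeMillis, long incomingDeathTimeMillis)
    {
        return Math.min(existingDeathTimeMillis, incomingDeathTimeMillis);
    }
}
{code}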

 Timestamp Based Compaction Strategy
 ---

 Key: CASSANDRA-2735
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2735
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Alan Liang
Assignee: Alan Liang
Priority: Minor
  Labels: compaction
 Attachments: 0004-timestamp-bucketed-compaction-strategy.patch


 Compaction strategy implementation based on max timestamp ordering of the 
 sstables while satisfying max sstable size, min and max compaction 
 thresholds. It also handles expiration of sstables based on a timestamp.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-2735) Timestamp Based Compaction Strategy

2011-06-17 Thread Yang Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13051350#comment-13051350
 ] 

Yang Yang edited comment on CASSANDRA-2735 at 6/17/11 10:06 PM:


true, I don't really care about expired data; I'm happy as long as we have an 
expiring counter that mostly works, or works with certain cautions.

but it seems that the change in order can come not only from compaction (which 
is fixed for realistic scenarios here), but also from the effects of message 
drops.

compact( compact (Add1, delete), Add2)  is the same as receiving Add1, delete, 
Add2 in messages.

but we know that messages can easily be dropped. 
so let's say the delete is dropped and we replay it later (through repair, for 
example), so we get Add1, Add2, delete; the same issue appears. 
I think the latter issue can be fixed by changing the TTL reconcile rule so 
that the reconciled death time is the older death time, not timestamp+new_TTL.

anyhow I think we users of the counters API need to understand that placing a 
delete shortly after your last update, or an update shortly after a delete, is 
most likely not going to work. this patch fixes half of the issue, but the 
rest remains.

  was (Author: yangyangyyy):
true, I don't really care about expired data, I'm happy as long as we have 
an expiring counter that mostly works or works with certain cautions.

but it seems that the changing order can come not only from compaction (which 
is fixed for realistic scenarios here),
but also from effects of message drops.

compact( compact (Add1, delete), Add2)  is the same as receiving Add1, delete 
, Add2 in messages.

but we know that messages can be easily dropped. 
so let's say the delete is dropped, then we replay it later (through repair for 
example), the same issue appears. 
I think the latter issue can be fixed by changing the TTL reconcile rule so 
that reconciled death time is the older death time, not timestamp+new_TTL.

anyhow I think we users of the counters api need to understand that placing a 
delete shortly after your last update, or place an update shortly after delete 
is most likely not going to work.  this patch fixes half of the issue, but it 
still remains. 
  
 Timestamp Based Compaction Strategy
 ---

 Key: CASSANDRA-2735
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2735
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Alan Liang
Assignee: Alan Liang
Priority: Minor
  Labels: compaction
 Attachments: 0004-timestamp-bucketed-compaction-strategy.patch


 Compaction strategy implementation based on max timestamp ordering of the 
 sstables while satisfying max sstable size, min and max compaction 
 thresholds. It also handles expiration of sstables based on a timestamp.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2491) A new config parameter, broadcast_address

2011-06-17 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13051358#comment-13051358
 ] 

Vijay commented on CASSANDRA-2491:
--

Hi Brandon,
I doubt that's because of this patch. Are you running different versions on the 
same cluster?... I am sure this existed in the old version too. I can skip 
bytes if you want, or create a different ticket to track it... 
(http://www.datastax.com/support-forums/topic/eofexception-on-startup -- 
ignore the answer to the question there though :), it just looks similar).

 A new config parameter, broadcast_address
 -

 Key: CASSANDRA-2491
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2491
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: x86_64 GNU/Linux
Reporter: Khee Chin
Assignee: Khee Chin
Priority: Trivial
  Labels: patch
 Fix For: 1.0

 Attachments: 2491_broadcast_address.patch, 
 2491_broadcast_address_v2.patch, 2491_broadcast_address_v3.patch, 
 2491_broadcast_address_v4.patch, 2491_broadcast_address_v5.patch

   Original Estimate: 336h
  Remaining Estimate: 336h

 A new config parameter, broadcast_address
 In a cluster setup where one or more nodes is behind a firewall and has a 
 private ip address, listen_address does not allow the hosts behind the 
 firewalls to be discovered by other nodes.
 Attached is a patch that introduces a new config parameter broadcast_address 
 which allows Cassandra nodes to explicitly specify their external ip address. 
 In addition, this allows listen_address to be set to 0.0.0.0 on the already 
 firewalled node.
 broadcast_address falls back to listen_address when it is not set.
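
For illustration, a hedged sketch of what this could look like in the node's 
configuration (cassandra.yaml-style; the external address below is made up):

{code}
# listen locally on all interfaces; allowed once broadcast_address is set
listen_address: 0.0.0.0
# external IP that other nodes should use to reach this node
broadcast_address: 203.0.113.10
{code}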

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2491) A new config parameter, broadcast_address

2011-06-17 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13051360#comment-13051360
 ] 

Brandon Williams commented on CASSANDRA-2491:
-

It goes away when the patch is removed and is 100% reproducible with the patch. 
 All nodes are running the same version (but that shouldn't be a problem 
anyway.)

 A new config parameter, broadcast_address
 -

 Key: CASSANDRA-2491
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2491
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: x86_64 GNU/Linux
Reporter: Khee Chin
Assignee: Khee Chin
Priority: Trivial
  Labels: patch
 Fix For: 1.0

 Attachments: 2491_broadcast_address.patch, 
 2491_broadcast_address_v2.patch, 2491_broadcast_address_v3.patch, 
 2491_broadcast_address_v4.patch, 2491_broadcast_address_v5.patch

   Original Estimate: 336h
  Remaining Estimate: 336h

 A new config parameter, broadcast_address
 In a cluster setup where one or more nodes is behind a firewall and has a 
 private ip address, listen_address does not allow the hosts behind the 
 firewalls to be discovered by other nodes.
 Attached is a patch that introduces a new config parameter broadcast_address 
 which allows Cassandra nodes to explicitly specify their external ip address. 
 In addition, this allows listen_address to be set to 0.0.0.0 on the already 
 firewalled node.
 broadcast_address falls back to listen_address when it is not set.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[Cassandra Wiki] Trivial Update of ThriftExamples by ChrisLarsen

2011-06-17 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Cassandra Wiki for 
change notification.

The ThriftExamples page has been changed by ChrisLarsen:
http://wiki.apache.org/cassandra/ThriftExamples?action=diff&rev1=86&rev2=87

Comment:
Added note about TFramedTransport for .07

  To generate the bindings for a particular language, first find out if Thrift 
supports that language.  If it does, you can run thrift --gen XYZ 
interface/cassandra.thrift for whatever XYZ you fancy.
  
  These examples are for Cassandra 0.5 and 0.6.
+ 
+ As of 0.7 and later, the default Thrift transport is TFramedTransport, so if 
you get an error such as No more data to read, switch transports.
  
  == PHP ==
  Before working with Cassandra and PHP make sure that your 
[[http://us.php.net/manual/en/function.phpinfo.php|PHP installation]] has 
[[http://us.php.net/apc/|APC]] enabled. If it does not please re-compile 
[[http://us.php.net/manual/en/install.php|PHP]] and then recompile 
[[http://incubator.apache.org/thrift/|Thrift]]. [[http://us.php.net/apc/|APC]] 
drastically increases the performance of 
[[http://incubator.apache.org/thrift/|Thrift Interface]].
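
Regarding the TFramedTransport note above, a minimal Java sketch of framing the 
client connection; host, port, and keyspace are placeholders, and the class 
names are from the Thrift and Cassandra 0.7 client libraries:

{code}
import org.apache.cassandra.thrift.Cassandra;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;

public class FramedExample
{
    public static void main(String[] args) throws Exception
    {
        // 0.7+ defaults to framed transport on the server side, so the client
        // must frame as well or reads fail with "No more data to read".
        TTransport transport = new TFramedTransport(new TSocket("localhost", 9160));
        Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));
        transport.open();
        client.set_keyspace("Keyspace1"); // placeholder keyspace
        transport.close();
    }
}
{code}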


[jira] [Issue Comment Edited] (CASSANDRA-2491) A new config parameter, broadcast_address

2011-06-17 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13051391#comment-13051391
 ] 

Vijay edited comment on CASSANDRA-2491 at 6/17/11 11:08 PM:


Earlier we were not throwing the exception; in v5 I was (because of the 
refactor)... v6 doesn't throw it... 

  was (Author: vijay2...@yahoo.com):
Earlier we were not throwing the exception; in v5 I was (because of the 
refactor)... v6 does throw it for now... 
  
 A new config parameter, broadcast_address
 -

 Key: CASSANDRA-2491
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2491
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: x86_64 GNU/Linux
Reporter: Khee Chin
Assignee: Khee Chin
Priority: Trivial
  Labels: patch
 Fix For: 1.0

 Attachments: 2491_broadcast_address.patch, 
 2491_broadcast_address_v2.patch, 2491_broadcast_address_v3.patch, 
 2491_broadcast_address_v4.patch, 2491_broadcast_address_v5.patch, 
 2491_broadcast_address_v6.patch

   Original Estimate: 336h
  Remaining Estimate: 336h

 A new config parameter, broadcast_address
 In a cluster setup where one or more nodes is behind a firewall and has a 
 private ip address, listen_address does not allow the hosts behind the 
 firewalls to be discovered by other nodes.
 Attached is a patch that introduces a new config parameter broadcast_address 
 which allows Cassandra nodes to explicitly specify their external ip address. 
 In addition, this allows listen_address to be set to 0.0.0.0 on the already 
 firewalled node.
 broadcast_address falls back to listen_address when it is not set.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-2491) A new config parameter, broadcast_address

2011-06-17 Thread Vijay (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-2491:
-

Attachment: 2491_broadcast_address_v6.patch

Earlier we were not throwing the exception; in v5 I was (because of the 
refactor)... v6 does throw it for now... 

 A new config parameter, broadcast_address
 -

 Key: CASSANDRA-2491
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2491
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: x86_64 GNU/Linux
Reporter: Khee Chin
Assignee: Khee Chin
Priority: Trivial
  Labels: patch
 Fix For: 1.0

 Attachments: 2491_broadcast_address.patch, 
 2491_broadcast_address_v2.patch, 2491_broadcast_address_v3.patch, 
 2491_broadcast_address_v4.patch, 2491_broadcast_address_v5.patch, 
 2491_broadcast_address_v6.patch

   Original Estimate: 336h
  Remaining Estimate: 336h

 A new config parameter, broadcast_address
 In a cluster setup where one or more nodes is behind a firewall and has a 
 private ip address, listen_address does not allow the hosts behind the 
 firewalls to be discovered by other nodes.
 Attached is a patch that introduces a new config parameter broadcast_address 
 which allows Cassandra nodes to explicitly specify their external ip address. 
 In addition, this allows listen_address to be set to 0.0.0.0 on the already 
 firewalled node.
 broadcast_address falls back to listen_address when it is not set.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-1602) Commit Log archivation and rolling forward utility (AKA Retaining commit logs)

2011-06-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13051432#comment-13051432
 ] 

Jonathan Ellis commented on CASSANDRA-1602:
---

Thanks for the update Tamara.  It looks good for the most part.  A few minor 
points:

- our modern hard link creation utility is CLibrary.createHardLink
- I think Oleg is right: we want to create the link when a segment is complete, 
not when it's created
- indentation standard is four spaces 
(http://wiki.apache.org/cassandra/CodeStyle)
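
A hedged sketch of that hard-link hook, assuming CLibrary.createHardLink(File, 
File); everything else (class, method, and path handling) is illustrative only:

{code}
import java.io.File;

import org.apache.cassandra.utils.CLibrary;

// Hedged sketch: hard-link a finished commit log segment into the archive
// directory, along the lines discussed above. The class, method, and path
// handling are hypothetical.
public final class SegmentArchiver
{
    private final File archiveDir;

    public SegmentArchiver(File archiveDir)
    {
        this.archiveDir = archiveDir;
    }

    public void onSegmentComplete(File segment)
    {
        // same-volume hard link, so no data is copied
        CLibrary.createHardLink(segment, new File(archiveDir, segment.getName()));
    }
}
{code}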

 Commit Log archivation and rolling forward utility (AKA Retaining commit logs)
 --

 Key: CASSANDRA-1602
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1602
 Project: Cassandra
  Issue Type: New Feature
  Components: Core, Tools
Reporter: Oleg Anastasyev
 Attachments: 1602-0.6.4.txt, 1602-cassandra0.6.txt, 1602-v2.txt, 
 1602-v3.txt


 As a couple of people on the mailing list suggested I share this patch (see 
 discussion at http://comments.gmane.org/gmane.comp.db.cassandra.user/9423), it 
 retains (archives) commit logs generated by Cassandra and restores data by 
 rolling forward commit logs onto previously backed-up or snapshotted data 
 files.
 Here is an instruction on how to use it, which I extracted from our internal 
 wiki:
 We rely on the Cassandra replication factor for disaster recovery.
 But there is another problem: there are bugs in data manipulation logic which 
 can destroy data on the whole cluster, and the freshest backup to restore from 
 is the last snapshot, which can be up to 24h old.
 To address this, we decided to implement a snapshot + log archive backup 
 strategy, i.e. we collect commit logs and snapshotted data files. On data 
 loss, whether due to hardware failure or a logical bug, we restore the last 
 snapshot and roll forward all logs collected since the snapshot was taken.
 Cassandra does not support log archiving out of the box, so I implemented it 
 myself.
 The idea is simple:
 # As soon as a commit log file is no longer needed by Cassandra, a hardlink 
 (unix command ln $1 $2) is created from the just-closed commit log file into 
 the commit log archive directory. Both the commit log and the commit log 
 archive are on the same volume, of course.
 # Some script (whose authoring I left to the admin) then takes files from the 
 commit log archive dir and copies them over the network to a backup location. 
 As soon as a file is transferred, it can be safely deleted from the commit log 
 archive dir.
 # Don't forget there must be some script (also authored by admins) which takes 
 data snapshots from time to time using the nodetool snapshot command, 
 available in the standard Cassandra distribution, and copies the snapshot 
 files to the backup location.
 ## Creating a snapshot is a very light operation for Cassandra - under the 
 hood it just hardlinks the currently existing files into a 
 snapshot/timestamp-millis directory. So the frequency is limited only by our 
 ability to pull snapshotted data files over the network.
 # To restore data, the admin must:
 ## stop the Cassandra instance
 ## remove all corrupted data files from the /data directory, leaving only the 
 commit logs you want to roll forward in the /commitlog dir
 ## copy the last snapshot's data files to /data
 ## copy all archived commit log files to /commitlog (only files not older than 
 the snapshot data files need to be copied; copying older ones does no harm, 
 but extends processing time)
 ## with the Cassandra instance *not started*, run the roll-forward utility 
 (bin/logreplay) with option -forced or -forcedcompaction and wait for its 
 completion
 ## then start the Cassandra node instance as usual.
 Log archive logic is activated using the CommitLogArchive directive in 
 storage-conf.xml:
 {code:xml}
 <CommitLogDirectory>/commitlog</CommitLogDirectory>
 <CommitLogArchive>true</CommitLogArchive>
 <DataFileDirectories>
 <DataFileDirectory>/data</DataFileDirectory>
 </DataFileDirectories>
 {code}
 Log files will be archived to the .archive directory under the commit log 
 directory, i.e. in the example above, to /commitlog/.archive.
 The archived log replay process is launched by running 
 org.apache.cassandra.tools.ReplayLogs with option -forced locally on the node 
 (use option -forcedcompact to do a major compaction right after the log 
 roll-forward completes). I also made a script named 
 cassandra-dir/bin/logreplay.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2790) SimpleStrategy enforces endpoints = replicas when reading with ConsistencyLevel.ONE

2011-06-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13051439#comment-13051439
 ] 

Jonathan Ellis commented on CASSANDRA-2790:
---

So does nodetool ring (against the node being queried) show just the one node, 
or does it show the other nodes in DOWN state?


 SimpleStrategy enforces endpoints = replicas when reading with 
 ConsistencyLevel.ONE
 

 Key: CASSANDRA-2790
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2790
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.6
 Environment: Linux 2.6.32-31-generic #61-Ubuntu SMP / Java 
 HotSpot(TM) 64-Bit Server VM (build 14.2-b01, mixed mode)
Reporter: Ivan Gorgiev

 We use a replication factor of 3 across our system, but in one case during 
 application bootstrap we read a stored value with a local (in-process) call 
 to StorageProxy.read(commands, ConsistencyLevel.ONE). This results in the 
 following exception from SimpleStrategy: replication factor 3 exceeds number 
 of endpoints 1. 
 Shouldn't such a read operation always succeed as there is a guaranteed 
 single Cassandra endpoint - the one processing the request? 
 This code used to work with Cassandra 0.6.1 before we upgraded to 0.7.6.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2530) Additional AbstractType data type definitions to enrich CQL

2011-06-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13051446#comment-13051446
 ] 

Jonathan Ellis commented on CASSANDRA-2530:
---

can you rebase the Cql.g and CCFS.java conflicts?

 Additional AbstractType data type definitions to enrich CQL
 ---

 Key: CASSANDRA-2530
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2530
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.8.0 beta 2
Reporter: Rick Shaw
Priority: Trivial
  Labels: cql
 Attachments: patch-to-add-4-new-AbstractTypes-and-CQL-support-v4.txt, 
 patch-to-add-4-new-AbstractTypes-and-CQL-support-v5.txt


 Provide 5 additional Datatypes: ByteType, DateType, BooleanType, FloatType, 
 DoubleType.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira