[jira] [Commented] (CASSANDRA-2659) Improve forceDeserialize/getCompactedRow encapsulation
[ https://issues.apache.org/jira/browse/CASSANDRA-2659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034708#comment-13034708 ]
Sylvain Lebresne commented on CASSANDRA-2659:
Nitpicks:
* We could remove the descriptor argument of the first getCompactedRow() and call needDeserialize() for the EchoedRow case.
* We could use that first getCompactedRow() in SSTableWriter (it's really only cosmetic, as we forceDeserialize).
* The comment on that first getCompactedRow() method is not completely correct: the method may purge data (either if the sstable is of an old format or if forceDeserialize is set), while the comment suggests it never does.
But those are nitpicks, so +1 with or without them.
Improve forceDeserialize/getCompactedRow encapsulation
Key: CASSANDRA-2659 URL: https://issues.apache.org/jira/browse/CASSANDRA-2659 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Jonathan Ellis Assignee: Jonathan Ellis Priority: Minor Fix For: 0.8.1 Attachments: 2659.txt
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-2433) Failed Streams Break Repair
[ https://issues.apache.org/jira/browse/CASSANDRA-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sylvain Lebresne updated CASSANDRA-2433:
Attachment: 0004-Reports-validation-compaction-errors-back-to-repair-v2.patch
            0003-Report-streaming-errors-back-to-repair-v2.patch
            0002-Register-in-gossip-to-handle-node-failures-v2.patch
            0001-Put-repair-session-on-a-Stage-and-add-a-method-to-re-v2.patch
Attaching rebased patch (against 0.8.1). It also changes the behavior a little so that repair does not fail right away if a problem occurs (it still throws an exception at the end if any problem occurred). It turns out to be slightly simpler that way, especially for CASSANDRA-1610.
Failed Streams Break Repair
Key: CASSANDRA-2433 URL: https://issues.apache.org/jira/browse/CASSANDRA-2433 Project: Cassandra Issue Type: Bug Affects Versions: 0.7.4 Reporter: Benjamin Coverston Assignee: Sylvain Lebresne Labels: repair Fix For: 0.8.1 Attachments: 0001-Put-repair-session-on-a-Stage-and-add-a-method-to-re-v2.patch, 0001-Put-repair-session-on-a-Stage-and-add-a-method-to-re.patch, 0002-Register-in-gossip-to-handle-node-failures-v2.patch, 0002-Register-in-gossip-to-handle-node-failures.patch, 0003-Report-streaming-errors-back-to-repair-v2.patch, 0003-Report-streaming-errors-back-to-repair.patch, 0004-Reports-validation-compaction-errors-back-to-repair-v2.patch, 0004-Reports-validation-compaction-errors-back-to-repair.patch
When running repair in cases where a stream fails, we are seeing multiple problems:
1. Although a retry is initiated and completes, the old stream doesn't seem to clean itself up, and repair hangs.
2. The temp files are left behind, and multiple failures can end up filling the data partition.
These issues together are making repair very difficult for nearly everyone running repair on a non-trivially sized data set. This issue is also being worked on w.r.t. CASSANDRA-2088; however, that was moved to 0.8 for a few reasons. This ticket is to fix the immediate issues that we are seeing in 0.7.
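The change described in the CASSANDRA-2433 update (do not fail right away, but still throw at the end if anything went wrong) can be sketched as follows. This is only an illustration of the pattern, not the patch itself: the class name `DeferredFailureSketch` and the use of plain `Runnable` steps are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; the real repair code is structured differently.
class DeferredFailureSketch {
    /**
     * Runs every step, remembering failures instead of aborting on the first one.
     * Returns the number of steps that completed, or throws once all steps have run
     * if at least one of them failed.
     */
    static int runAll(List<Runnable> steps) {
        List<RuntimeException> failures = new ArrayList<>();
        int completed = 0;
        for (Runnable step : steps) {
            try {
                step.run();
                completed++;
            } catch (RuntimeException e) {
                failures.add(e); // keep going; report at the end
            }
        }
        if (!failures.isEmpty()) {
            RuntimeException summary =
                new RuntimeException(failures.size() + " step(s) failed");
            for (RuntimeException f : failures) {
                summary.addSuppressed(f); // preserve every individual cause
            }
            throw summary;
        }
        return completed;
    }
}
```

With this shape, one failed stream does not prevent the remaining steps from running, and the caller still sees an exception once everything has been attempted.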
[jira] [Updated] (CASSANDRA-2610) Have the repair of a range repair *all* the replica for that range
[ https://issues.apache.org/jira/browse/CASSANDRA-2610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sylvain Lebresne updated CASSANDRA-2610:
Attachment: 0001-Make-repair-repair-all-hosts.patch
Patch against 0.8.1. It applies on top of CASSANDRA-2433 because it changes enough common code that I don't want to deal with rebasing back and forth (and it actually reuses some of the refactoring from CASSANDRA-2433 anyway).
Have the repair of a range repair *all* the replica for that range
Key: CASSANDRA-2610 URL: https://issues.apache.org/jira/browse/CASSANDRA-2610 Project: Cassandra Issue Type: Improvement Components: Core Affects Versions: 0.8 beta 1 Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Priority: Minor Fix For: 0.8.1 Attachments: 0001-Make-repair-repair-all-hosts.patch Original Estimate: 8h Remaining Estimate: 8h
Say you have a range R whose replicas are A, B and C. If you run repair on node A for that range R, when the repair ends you only know that A is fully repaired. B and C are not. That is, B and C are up to date with A as it was before the repair, but are not up to date with one another. This makes it a pain to schedule optimal cluster repairs, that is, repairing a full cluster without doing work twice (because you would still have to run a repair on B or C, which will make A, B and C redo a validation compaction on R, and with more replicas it's even more annoying). However, it is fairly easy during the first repair on A to have it compare all the merkle trees, i.e. also the ones for B and C with each other, and ask B and C to stream to each other whatever differences they have.
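The idea in the CASSANDRA-2610 description, comparing the merkle trees of every pair of replicas rather than only the pairs that involve the coordinating node, amounts to enumerating all unordered pairs. A minimal sketch, assuming replicas are identified by plain strings (the class and method names are hypothetical, not taken from the patch):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of "repair all replicas": enumerate every pair to compare.
class AllPairsRepairSketch {
    /** Returns every unordered pair of replicas whose merkle trees should be compared. */
    static List<String[]> pairsToCompare(List<String> replicas) {
        List<String[]> pairs = new ArrayList<>();
        for (int i = 0; i < replicas.size(); i++) {
            for (int j = i + 1; j < replicas.size(); j++) {
                pairs.add(new String[] { replicas.get(i), replicas.get(j) });
            }
        }
        return pairs;
    }
}
```

For replicas A, B and C this yields (A, B), (A, C) and (B, C); the last pair is the one that a repair comparing only against A would miss.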
[jira] [Reopened] (CASSANDRA-2481) C* .deb installs C* init.d scripts such that C* comes up before mdadm and related
[ https://issues.apache.org/jira/browse/CASSANDRA-2481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sylvain Lebresne reopened CASSANDRA-2481:
When installing the Debian package for 0.7.6 and 0.8.0-rc1 on Ubuntu 11.04 (natty), I get
{noformat}
Installing new version of config file /etc/init.d/cassandra ...
update-rc.d: error: start|stop arguments not terminated by .
usage: update-rc.d [-n] [-f] basename remove
       update-rc.d [-n] basename defaults [NN | SS KK]
       update-rc.d [-n] basename start|stop NN runlvl [runlvl] [...] .
       update-rc.d [-n] basename disable|enable [S|2|3|4|5]
		-n: not really
		-f: force
{noformat}
Given that it works like a charm with 0.7.5, I strongly suspect this is this patch's doing.
C* .deb installs C* init.d scripts such that C* comes up before mdadm and related
Key: CASSANDRA-2481 URL: https://issues.apache.org/jira/browse/CASSANDRA-2481 Project: Cassandra Issue Type: Bug Components: Packaging Reporter: Matthew F. Dennis Assignee: paul cannon Priority: Minor Fix For: 0.7.6, 0.8.0 Attachments: 2481.txt
The C* .deb packages install the init.d scripts at S20, which is before mdadm and various other services. This means that when a node reboots, C* is started before the RAID sets are up and mounted, causing C* to think it has no data and to attempt bootstrapping again.
[jira] [Updated] (CASSANDRA-1278) Make bulk loading into Cassandra less crappy, more pluggable
[ https://issues.apache.org/jira/browse/CASSANDRA-1278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sylvain Lebresne updated CASSANDRA-1278:
Attachment: 0001-Add-bulk-loader-utility.patch
Attaching a patch that implements the simpler idea. It provides a new utility, 'sstableloader' (basically a fat client), that given one or more sstables will stream the relevant parts of each sstable to the relevant nodes. The tool tries to be self-documenting, but basically you must have an sstable with -Data and -Index components (we really need the -Index component to be able to do anything) in a directory dir whose name is the keyspace, and call 'sstableloader dir'. Alternatively, if dir sits on one of the machines of the cluster, you can simply use a JMX call with the path to dir as its argument.
Make bulk loading into Cassandra less crappy, more pluggable
Key: CASSANDRA-1278 URL: https://issues.apache.org/jira/browse/CASSANDRA-1278 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Jeremy Hanna Assignee: Sylvain Lebresne Fix For: 0.8.1 Attachments: 0001-Add-bulk-loader-utility.patch, 1278-cassandra-0.7-v2.txt, 1278-cassandra-0.7.1.txt, 1278-cassandra-0.7.txt Original Estimate: 40h Time Spent: 40h 40m Remaining Estimate: 0h
Currently bulk loading into Cassandra is a black art. People are either directed to just do it responsibly with thrift or a higher level client, or they have to explore the contrib/bmt example - http://wiki.apache.org/cassandra/BinaryMemtable. That contrib module requires delving into the code to find out how it works and then applying it to the given problem. Using either method, the user also needs to keep in mind that overloading the cluster is possible, which will hopefully be addressed in CASSANDRA-685. This improvement would be to create a contrib module or set of documents dealing with bulk loading. Perhaps it could include code in the Core to make it more pluggable for external clients of different types. Bulk loading data is simply something that many people who are new to Cassandra need to do.
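The requirement stated in the update above, that an sstable is only loadable when both its -Data and -Index components are present, can be sketched as a simple filename check. This is an illustration only: the class name is hypothetical, and the `-Data.db` / `-Index.db` suffixes and the `Standard1-g-1` style base name are assumptions about the on-disk naming rather than anything taken from the patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical pre-flight check; sstableloader itself may validate differently.
class LoadableSSTablesSketch {
    private static final String DATA_SUFFIX = "-Data.db";   // assumed component suffix
    private static final String INDEX_SUFFIX = "-Index.db"; // assumed component suffix

    /** Returns the base names (sorted) of sstables that have both a Data and an Index file. */
    static List<String> loadable(Set<String> fileNames) {
        List<String> result = new ArrayList<>();
        for (String name : new TreeSet<>(fileNames)) {
            if (!name.endsWith(DATA_SUFFIX)) {
                continue;
            }
            String base = name.substring(0, name.length() - DATA_SUFFIX.length());
            if (fileNames.contains(base + INDEX_SUFFIX)) {
                result.add(base);
            }
        }
        return result;
    }
}
```

A Data file without its Index counterpart is skipped here instead of being reported as an error; a real tool would probably want to warn about it.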
[jira] [Commented] (CASSANDRA-1278) Make bulk loading into Cassandra less crappy, more pluggable
[ https://issues.apache.org/jira/browse/CASSANDRA-1278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036231#comment-13036231 ]
Sylvain Lebresne commented on CASSANDRA-1278:
I'd love to, but as it turns out it is fairly heavily hardwired in Descriptor that the keyspace name is the directory where the file sits. By hardwired I mean that even if you add a constructor to Descriptor to decouple the ksname field from the directory argument, this doesn't work, because streaming only transmits the name of the file (including the directory), not the ksname field, and the receiving side would thus get the wrong name. That is, I don't think we can do that without adding a new argument to the stream header, which felt like a bit of overkill at first (it's probably doable though).
Make bulk loading into Cassandra less crappy, more pluggable
Key: CASSANDRA-1278 URL: https://issues.apache.org/jira/browse/CASSANDRA-1278 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Jeremy Hanna Assignee: Sylvain Lebresne Fix For: 0.8.1 Attachments: 0001-Add-bulk-loader-utility.patch, 1278-cassandra-0.7-v2.txt, 1278-cassandra-0.7.1.txt, 1278-cassandra-0.7.txt Original Estimate: 40h Time Spent: 40h 40m Remaining Estimate: 0h
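The coupling described in this comment, where the keyspace name is whatever directory the file sits in, can be illustrated with a small sketch. It does not reproduce Descriptor; `KeyspaceFromPathSketch` and `keyspaceOf` are hypothetical names, and the paths used with it are made up for the example.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical illustration of deriving the keyspace from the file location.
class KeyspaceFromPathSketch {
    /** Returns the name of the directory that directly contains the sstable file. */
    static String keyspaceOf(String sstablePath) {
        Path parent = Paths.get(sstablePath).getParent();
        if (parent == null || parent.getFileName() == null) {
            throw new IllegalArgumentException("no parent directory in: " + sstablePath);
        }
        return parent.getFileName().toString();
    }
}
```

Under this rule a file at `/var/lib/cassandra/data/Keyspace1/Standard1-g-1-Data.db` maps to `Keyspace1`, while the same file staged under `/tmp/staging/` maps to `staging`. Since only the file name (with its directory) is transmitted, the receiver has no other field from which to recover the intended keyspace.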
[jira] [Commented] (CASSANDRA-1278) Make bulk loading into Cassandra less crappy, more pluggable
[ https://issues.apache.org/jira/browse/CASSANDRA-1278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036277#comment-13036277 ]
Sylvain Lebresne commented on CASSANDRA-1278:
I didn't do it because, if I'm correct, the tools stuff doesn't go into releases (which I believe is the reason why we don't have cli, sstable2json, ... in tools). I figured this is not necessarily something we want users to have to grab the source to get. But I suppose we can if we want (at least the script + BulkLoader.java; I'd be in favor of leaving SSTableLoader where it is).
Make bulk loading into Cassandra less crappy, more pluggable
Key: CASSANDRA-1278 URL: https://issues.apache.org/jira/browse/CASSANDRA-1278 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Jeremy Hanna Assignee: Sylvain Lebresne Fix For: 0.8.1 Attachments: 0001-Add-bulk-loader-utility.patch, 1278-cassandra-0.7-v2.txt, 1278-cassandra-0.7.1.txt, 1278-cassandra-0.7.txt Original Estimate: 40h Time Spent: 40h 40m Remaining Estimate: 0h
[jira] [Commented] (CASSANDRA-1278) Make bulk loading into Cassandra less crappy, more pluggable
[ https://issues.apache.org/jira/browse/CASSANDRA-1278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036278#comment-13036278 ]
Sylvain Lebresne commented on CASSANDRA-1278:
{noformat}
+        outputHandler.output("Starting client and waiting 15 seconds for gossip ...");
+        try
+        {
+            // Init gossip
+            StorageService.instance.initClient();
{noformat}
It is in client-only mode as far as I can tell. Maybe client-only mode is screwed up though, I don't know.
Make bulk loading into Cassandra less crappy, more pluggable
Key: CASSANDRA-1278 URL: https://issues.apache.org/jira/browse/CASSANDRA-1278 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Jeremy Hanna Assignee: Sylvain Lebresne Fix For: 0.8.1 Attachments: 0001-Add-bulk-loader-utility.patch, 1278-cassandra-0.7-v2.txt, 1278-cassandra-0.7.1.txt, 1278-cassandra-0.7.txt Original Estimate: 40h Time Spent: 40h 40m Remaining Estimate: 0h
[jira] [Updated] (CASSANDRA-1278) Make bulk loading into Cassandra less crappy, more pluggable
[ https://issues.apache.org/jira/browse/CASSANDRA-1278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sylvain Lebresne updated CASSANDRA-1278:
Attachment: 0001-Add-bulk-loader-utility-v2.patch
bq. It'd be nice if it printed the filename and the time it took for each time, since just having the percentages reset is a bit confusing.
The fact that the percentages reset is really just a bug (I tested at first with only one sstable, my bad). Anyway, that's fixed. I also agree with Jonathan's objection about printing the filename. And in general I'm not sure giving too much information is really necessary.
bq. Also, this should respect SS.RING_DELAY
Yes, I think it is the fat client that wasn't respecting it: it was waiting for a hardcoded time of 5 seconds, which is almost always not enough. I've updated SS.initClient() to use RING_DELAY instead.
Attaching v2 that:
* uses RING_DELAY
* updates the progress indication so that the percentage works. It also adds, for each host, the number of files that should be transferred to it and how many already have been. Lastly, it adds a total percentage as well as approximate transfer rate information.
Make bulk loading into Cassandra less crappy, more pluggable
Key: CASSANDRA-1278 URL: https://issues.apache.org/jira/browse/CASSANDRA-1278 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Jeremy Hanna Assignee: Sylvain Lebresne Fix For: 0.8.1 Attachments: 0001-Add-bulk-loader-utility-v2.patch, 0001-Add-bulk-loader-utility.patch, 1278-cassandra-0.7-v2.txt, 1278-cassandra-0.7.1.txt, 1278-cassandra-0.7.txt Original Estimate: 40h Time Spent: 40h 40m Remaining Estimate: 0h
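The progress information described for v2 (files done out of files expected per host, a total percentage, and an approximate transfer rate) could be tracked along these lines. This is a sketch under assumptions, not the code from the patch: the class, its methods, and the choice to update only when a whole file completes are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical progress bookkeeping for a multi-host bulk load.
class TransferProgressSketch {
    // host -> {filesDone, filesTotal}
    private final Map<String, long[]> perHost = new LinkedHashMap<>();
    private long bytesDone;
    private long bytesTotal;

    /** Registers that the given number of files and bytes will be sent to a host. */
    void expect(String host, int files, long bytes) {
        long[] counts = perHost.computeIfAbsent(host, h -> new long[2]);
        counts[1] += files;
        bytesTotal += bytes;
    }

    /** Records one completed file of the given size for an already registered host. */
    void fileCompleted(String host, long bytes) {
        perHost.get(host)[0]++;
        bytesDone += bytes;
    }

    /** Overall completion in percent, computed over bytes so large files weigh more. */
    int totalPercent() {
        return bytesTotal == 0 ? 100 : (int) (bytesDone * 100 / bytesTotal);
    }

    /** Approximate transfer rate in MB/s over the elapsed time. */
    double rateMBPerSec(long elapsedMillis) {
        if (elapsedMillis <= 0) {
            return 0.0;
        }
        return (bytesDone / (1024.0 * 1024.0)) / (elapsedMillis / 1000.0);
    }

    /** Formats the per-host counter as "host done/total". */
    String hostStatus(String host) {
        long[] counts = perHost.get(host);
        return host + " " + counts[0] + "/" + counts[1];
    }
}
```

Keeping a single running total across all files is what prevents the percentage from resetting when a new sstable starts.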
[jira] [Commented] (CASSANDRA-2280) Request specific column families using StreamIn
[ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037826#comment-13037826 ]
Sylvain Lebresne commented on CASSANDRA-2280:
* If we're going to put that in 0.8.1 (which we should), we cannot rely on MessagingService.VERSION_07. We must bump the version for 0.8.0. It turns out CASSANDRA-2433 already has this problem, so I suggest we introduce an MS.VERSION_080 and stick to that (as a side note, when that's done, we should be careful with StreamRequestMessage as it will have a 0.7 part and a 0.8.0 part, i.e. we shouldn't blindly s/VERSION_07/VERSION_080 in there).
* In StreamHeader and StreamRequestMessage, Iterables.size() is used. Is there a reason for that? Though google collections are probably smart enough not to do a full iteration to compute the size when possible, in theory we can't really be sure, so I don't see why we wouldn't use .size() (and use a Collection instead of an Iterable in StreamHeader, although see the next point).
* Why are we sending the cfs in StreamHeader at all? They are never used and I don't see why they should be (StreamInSession will know what it receives with each file; there is no reason why it should know upfront what request initiated the streaming).
Request specific column families using StreamIn
Key: CASSANDRA-2280 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Stu Hood Assignee: Jonathan Ellis Fix For: 0.8.1 Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 2280-v3.txt
StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache.
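The first two review points (gate the new field on a version newer than 0.8.0, and take the size from a Collection directly) can be sketched as below. Everything here is a stand-in: the numeric version constants, `VERSION_081`, the class name, and the wire layout are assumptions for illustration and do not reflect the actual MessagingService or StreamRequestMessage code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical version-gated message; constants and layout are illustrative only.
class VersionedStreamRequestSketch {
    static final int VERSION_07 = 1;
    static final int VERSION_080 = 2;
    static final int VERSION_081 = 3; // assumed name for "the version that adds the cf list"

    /** Writes the keyspace always, and the column family list only for peers newer than 0.8.0. */
    static byte[] serialize(String keyspace, Collection<String> columnFamilies, int version) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeUTF(keyspace);
            if (version > VERSION_080) {
                out.writeInt(columnFamilies.size()); // Collection.size(), no iteration needed
                for (String cf : columnFamilies) {
                    out.writeUTF(cf);
                }
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads back the column family list; older versions carry none, so the result is empty. */
    static List<String> deserializeColumnFamilies(byte[] data, int version) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            in.readUTF(); // keyspace, not needed for this sketch
            List<String> cfs = new ArrayList<>();
            if (version > VERSION_080) {
                int count = in.readInt();
                for (int i = 0; i < count; i++) {
                    cfs.add(in.readUTF());
                }
            }
            return cfs;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Both sides must test the same version boundary; reading the new field from a message written at an older version would run past the end of the payload.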
[jira] [Commented] (CASSANDRA-1278) Make bulk loading into Cassandra less crappy, more pluggable
[ https://issues.apache.org/jira/browse/CASSANDRA-1278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038007#comment-13038007 ]
Sylvain Lebresne commented on CASSANDRA-1278:
Note that I'm marking this resolved since it has been committed. However, as it stands, sstableloader doesn't handle failures very well (because streaming doesn't). Once CASSANDRA-2433 is committed, this can be easily improved.
Make bulk loading into Cassandra less crappy, more pluggable
Key: CASSANDRA-1278 URL: https://issues.apache.org/jira/browse/CASSANDRA-1278 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Jeremy Hanna Assignee: Sylvain Lebresne Fix For: 0.8.1 Attachments: 0001-Add-bulk-loader-utility-v2.patch, 0001-Add-bulk-loader-utility.patch, 1278-cassandra-0.7-v2.txt, 1278-cassandra-0.7.1.txt, 1278-cassandra-0.7.txt Original Estimate: 40h Time Spent: 40h 40m Remaining Estimate: 0h
[jira] [Commented] (CASSANDRA-2690) Make the release build fail if the publish to central repository also fails
[ https://issues.apache.org/jira/browse/CASSANDRA-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13038124#comment-13038124 ] Sylvain Lebresne commented on CASSANDRA-2690: - +1 Make the release build fail if the publish to central repository also fails --- Key: CASSANDRA-2690 URL: https://issues.apache.org/jira/browse/CASSANDRA-2690 Project: Cassandra Issue Type: Improvement Components: Packaging Affects Versions: 0.7.7, 0.8.0 Reporter: Stephen Connolly Attachments: CASSANDRA-2690-v-trunk.patch, CASSANDRA-2690-v0.7.patch, CASSANDRA-2690-v0.8.patch If the publish to Central fails for one artifact that failure is not picked up. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-1978) get_range_slices: allow key and token to be interoperable
[ https://issues.apache.org/jira/browse/CASSANDRA-1978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038420#comment-13038420 ]
Sylvain Lebresne commented on CASSANDRA-1978:
What about computing the token from the key for CASSANDRA-2003? Right now, and until CASSANDRA-1034, that is a legitimate way to walk all keys. After CASSANDRA-1034 it will risk missing some keys, but we'll then have this problem with hadoop too. But then I'm not sure the fix attached here is the right one. I think the right fix will be to only allow keys in range_slices, but to allow specifying whether we're asking for a bound or a range. That is, changing KeyRange to something like:
{noformat}
struct KeyRange {
    1: required binary start_key,
    2: required binary end_key,
    3: required boolean start_inclusive,
    5: required i32 count=100
}
{noformat}
Anyway, we should probably defer that to later, but unless I'm missing something this shouldn't block CASSANDRA-2003.
get_range_slices: allow key and token to be interoperable
Key: CASSANDRA-1978 URL: https://issues.apache.org/jira/browse/CASSANDRA-1978 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Kelvin Kakugawa Assignee: Kelvin Kakugawa Priority: Minor Fix For: 0.8.1 Attachments: 0001-CASSANDRA-1978-allow-key-token-to-be-interoperable-i.patch
Problem: get_range_slices requires two keys or two tokens, so we can't walk a randomly partitioned cluster by token.
Solution: allow keys and tokens to be mixed. However, if one side is a token, promote the bounds to a dht.Range instead of a dht.Bounds.
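The effect of the proposed start_inclusive flag can be sketched with a small membership test: with the flag set the interval behaves like inclusive bounds, and with it cleared the start is excluded, as in a start-exclusive range. The sketch is a simplification under stated assumptions: plain `long` positions stand in for binary keys and their tokens, ring wrap-around is ignored, and the class name is hypothetical.

```java
// Hypothetical model of a key range with a configurable start bound.
class KeyRangeSketch {
    private final long start;
    private final long end;
    private final boolean startInclusive;

    KeyRangeSketch(long start, long end, boolean startInclusive) {
        this.start = start;
        this.end = end;
        this.startInclusive = startInclusive;
    }

    /** True if the position falls in the interval; the end is always inclusive. */
    boolean contains(long position) {
        boolean afterStart = startInclusive ? position >= start : position > start;
        return afterStart && position <= end;
    }
}
```

A client paging through results would issue the first request with the start included and later requests with the start excluded, so the last row of one page is not returned again as the first row of the next.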
[jira] [Updated] (CASSANDRA-2675) java.io.IOError: java.io.EOFException with version 0.7.6
[ https://issues.apache.org/jira/browse/CASSANDRA-2675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sylvain Lebresne updated CASSANDRA-2675:
Attachment: 0002-Avoid-modifying-super-column-in-memtable-being-flush.patch
            0001-Don-t-remove-columns-from-super-columns-in-memtable.patch
I was able to reproduce, thanks for the Java version.
I think the problem is that reads can remove subcolumns from a super column that happens to be in a memtable being flushed. If a subcolumn becomes gc-able between the time the super column's column count is written to disk and the time the subcolumn itself is written, we won't write it and will end up with a short super column (hence the EOFException). Note that this should not happen with a reasonable gc_grace value (one such that nothing that gets flushed will be gc-able). The first attached patch fixes this by making reads copy the super column before modifying it (0.7 patch).
I think there is a related second bug: when we reduce super columns (in QueryFilter), if we merge multiple super columns with the same name, we merge them into the first super column. That is, we may end up adding subcolumns to a super column that is in an in-memory memtable. Most of the time this will be harmless, apart from some useless data duplication. But if that happens for a super column (in a memtable) being flushed and, as above, between the write of the column count and the actual column writes, we may end up with a super column that is too long. This could result in unreachable columns (i.e. effectively data loss) and quite probably some weird corruption during a compaction. The second patch fixes this second problem.
I haven't been able to reproduce with the 2 attached patches applied, and the test has been running for more than an hour.
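The first failure mode described above (the count is written, a read then purges a subcolumn in place, and fewer subcolumns than announced reach the disk) can be reproduced deterministically in a single thread. The sketch below is a model, not Cassandra code: the class and method names are hypothetical, subcolumns are reduced to integer name/value pairs, and the "concurrent" read is a callback invoked between the two write phases.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.NavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical single-threaded model of the flush/read interleaving.
class ShortSuperColumnSketch {
    /** Simulates a read purging a gc-able subcolumn, in place or on a private copy. */
    static void readAndPurge(NavigableMap<Integer, Integer> superColumn,
                             int gcableName, boolean copyFirst) {
        NavigableMap<Integer, Integer> target =
            copyFirst ? new ConcurrentSkipListMap<>(superColumn) : superColumn;
        target.remove(gcableName);
    }

    /** Writes the subcolumn count, lets the simulated read run, then writes the subcolumns. */
    static byte[] flush(NavigableMap<Integer, Integer> superColumn, Runnable concurrentRead) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(superColumn.size());
            concurrentRead.run(); // the interleaving that triggers the bug
            for (Map.Entry<Integer, Integer> e : superColumn.entrySet()) {
                out.writeInt(e.getKey());
                out.writeInt(e.getValue());
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** True if reading the announced number of subcolumns runs off the end of the data. */
    static boolean hitsEof(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                in.readInt();
                in.readInt();
            }
            return false;
        } catch (EOFException e) {
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

When the read mutates the shared map, the serialized form announces three subcolumns but contains two, and reading it back ends in an EOFException. When the read works on its own copy, as the first patch is described as doing, the flushed data stays consistent.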
java.io.IOError: java.io.EOFException with version 0.7.6 - Key: CASSANDRA-2675 URL: https://issues.apache.org/jira/browse/CASSANDRA-2675 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.7.6 Environment: Reproduced on single Cassandra node (CentOS 5.5) Reproduced on single Cassandra node (Windows Server 2008) Reporter: rene kochen Assignee: Sylvain Lebresne Fix For: 0.7.7 Attachments: 0001-Don-t-remove-columns-from-super-columns-in-memtable.patch, 0002-Avoid-modifying-super-column-in-memtable-being-flush.patch, CassandraIssue.zip, CassandraIssueJava.zip I use the following data-model column_metadata: [] name: Customers column_type: Super gc_grace_seconds: 60 I have a super-column-family with a single row. Within this row I have a single super-column. Within this super-column, I concurrently create, read and delete columns. I have three threads: - Do in a loop: add a column to the super-column. - Do in a loop: delete a random column from the super-column. - Do in a loop: read the super-column (with all columns). 
After running the above threads concurrently, I always receive one of the following errors: ERROR 17:09:57,036 Fatal exception in thread Thread[ReadStage:81,5,main] java.io.IOError: java.io.EOFException at org.apache.cassandra.io.util.ColumnIterator.deserializeNext(ColumnSortedMap.java:252) at org.apache.cassandra.io.util.ColumnIterator.next(ColumnSortedMap.java:268) at org.apache.cassandra.io.util.ColumnIterator.next(ColumnSortedMap.java:227) at java.util.concurrent.ConcurrentSkipListMap.buildFromSorted(Unknown Source) at java.util.concurrent.ConcurrentSkipListMap.init(Unknown Source) at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:379) at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:362) at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:322) at org.apache.cassandra.db.columniterator.SimpleSliceReader.computeNext(SimpleSliceReader.java:79) at org.apache.cassandra.db.columniterator.SimpleSliceReader.computeNext(SimpleSliceReader.java:40) at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136) at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131) at org.apache.cassandra.db.columniterator.SSTableSliceIterator.hasNext(SSTableSliceIterator.java:108) at org.apache.commons.collections.iterators.CollatingIterator.set(CollatingIterator.java:283) at org.apache.commons.collections.iterators.CollatingIterator.least(CollatingIterator.java:326) at org.apache.commons.collections.iterators.CollatingIterator.next(CollatingIterator.java:230) at org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:69)
[jira] [Created] (CASSANDRA-2698) Instrument repair to be able to assess it's efficiency (precision)
Instrument repair to be able to assess it's efficiency (precision)
Key: CASSANDRA-2698 URL: https://issues.apache.org/jira/browse/CASSANDRA-2698 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Priority: Minor
Some reports indicate that repair sometimes transfers huge amounts of data. One hypothesis is that the merkle tree precision may deteriorate too much at some data size. To check this hypothesis, it would be reasonable to gather statistics, while the merkle tree is being built, on how many rows each merkle tree range accounts for (and the size that this represents). It is probably an interesting statistic to have anyway.
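The statistic proposed in CASSANDRA-2698, rows and bytes per merkle tree range, could be gathered with a small accumulator like the one below. It is a sketch under simplifying assumptions: tokens are plain `long` values, ranges are of equal fixed width (real merkle tree ranges need not be), and the class and method names are hypothetical.

```java
import java.util.TreeMap;

// Hypothetical per-range accumulator; real merkle tree ranges are not fixed-width.
class RangeStatsSketch {
    private final long rangeWidth;
    // range index -> {rowCount, totalBytes}
    private final TreeMap<Long, long[]> stats = new TreeMap<>();

    RangeStatsSketch(long rangeWidth) {
        this.rangeWidth = rangeWidth;
    }

    /** Records one row of the given serialized size under the range that holds its token. */
    void addRow(long token, long sizeBytes) {
        long[] s = stats.computeIfAbsent(Math.floorDiv(token, rangeWidth), k -> new long[2]);
        s[0]++;
        s[1] += sizeBytes;
    }

    long rows(long rangeIndex) {
        long[] s = stats.get(rangeIndex);
        return s == null ? 0 : s[0];
    }

    long bytes(long rangeIndex) {
        long[] s = stats.get(rangeIndex);
        return s == null ? 0 : s[1];
    }

    /** Largest row count in any single range; a high value suggests coarse precision. */
    long maxRowsPerRange() {
        long max = 0;
        for (long[] s : stats.values()) {
            max = Math.max(max, s[0]);
        }
        return max;
    }
}
```

If a single range covers many rows, one differing row makes the whole range mismatch and be streamed, which is the loss of precision the ticket wants to measure.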
[jira] [Commented] (CASSANDRA-2675) java.io.IOError: java.io.EOFException with version 0.7.6
[ https://issues.apache.org/jira/browse/CASSANDRA-2675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13038571#comment-13038571 ] Sylvain Lebresne commented on CASSANDRA-2675: - Yes I agree, patch 2 is actually enough. java.io.IOError: java.io.EOFException with version 0.7.6 - Key: CASSANDRA-2675 URL: https://issues.apache.org/jira/browse/CASSANDRA-2675 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.7.6 Environment: Reproduced on single Cassandra node (CentOS 5.5) Reproduced on single Cassandra node (Windows Server 2008) Reporter: rene kochen Assignee: Sylvain Lebresne Fix For: 0.7.7 Attachments: 0001-Don-t-remove-columns-from-super-columns-in-memtable.patch, 0002-Avoid-modifying-super-column-in-memtable-being-flush.patch, CassandraIssue.zip, CassandraIssueJava.zip I use the following data-model column_metadata: [] name: Customers column_type: Super gc_grace_seconds: 60 I have a super-column-family with a single row. Within this row I have a single super-column. Within this super-column, I concurrently create, read and delete columns. I have three threads: - Do in a loop: add a column to the super-column. - Do in a loop: delete a random column from the super-column. - Do in a loop: read the super-column (with all columns). 
After running the above threads concurrently, I always receive one of the following errors: ERROR 17:09:57,036 Fatal exception in thread Thread[ReadStage:81,5,main] java.io.IOError: java.io.EOFException at org.apache.cassandra.io.util.ColumnIterator.deserializeNext(ColumnSortedMap.java:252) at org.apache.cassandra.io.util.ColumnIterator.next(ColumnSortedMap.java:268) at org.apache.cassandra.io.util.ColumnIterator.next(ColumnSortedMap.java:227) at java.util.concurrent.ConcurrentSkipListMap.buildFromSorted(Unknown Source) at java.util.concurrent.ConcurrentSkipListMap.init(Unknown Source) at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:379) at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:362) at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:322) at org.apache.cassandra.db.columniterator.SimpleSliceReader.computeNext(SimpleSliceReader.java:79) at org.apache.cassandra.db.columniterator.SimpleSliceReader.computeNext(SimpleSliceReader.java:40) at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136) at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131) at org.apache.cassandra.db.columniterator.SSTableSliceIterator.hasNext(SSTableSliceIterator.java:108) at org.apache.commons.collections.iterators.CollatingIterator.set(CollatingIterator.java:283) at org.apache.commons.collections.iterators.CollatingIterator.least(CollatingIterator.java:326) at org.apache.commons.collections.iterators.CollatingIterator.next(CollatingIterator.java:230) at org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:69) at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136) at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131) at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:116) at 
org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(QueryFilter.java:130) at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1390) at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1267) at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1195) at org.apache.cassandra.db.Table.getRow(Table.java:324) at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:63) at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:451) at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source) Caused by: java.io.EOFException at java.io.RandomAccessFile.readByte(Unknown Source) at org.apache.cassandra.utils.ByteBufferUtil.readShortLength(ByteBufferUtil.java:324) at
[jira] [Commented] (CASSANDRA-2280) Request specific column families using StreamIn
[ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13038600#comment-13038600 ] Sylvain Lebresne commented on CASSANDRA-2280: - * In SSTableLoader, calling Table.open() isn't really neat: in the case of the 'external' bulk loader, it's a fat client, so that will imply creating directories, etc. for no good reason (I haven't tested, but I would be surprised if it actually threw an exception). We'd better give an empty list. Or even better (in my opinion), see my next point. * I don't find it very logical for streamOutSession to take a collection of cfs. The coupling seems unnecessary. The problem we're solving is to ask another node to transfer us some range for some CF. So what about having the list of CFs only in StreamRequestMessage and adding the list of cfs to use as an argument to StreamOut.transferRanges() ? We don't need it anywhere else. * In StreamRequestMessage, we should write the operation type even if version is VERSION_080 (same for deserialization). Nitpick: couldn't we use the cf ids instead of the names ? * In StreamRequestMessage, the field is a Collection but we're still using Iterables.size() inside. Pretty sure that doesn't leave much option :) I mean, my remark was more about asking why we should add something that may make people wonder for no reason, since that's not something that is widespread in the code. Anyway, just saying, I don't care. * I suppose the bump of MessagingService from 2 to 81 was on purpose ? 
(I don't mind, just pointing out to make sure) Request specific column families using StreamIn --- Key: CASSANDRA-2280 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Stu Hood Assignee: Jonathan Ellis Fix For: 0.8.1 Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 2280-v3.txt, 2280-v4.txt StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-2675) java.io.IOError: java.io.EOFException with version 0.7.6
[ https://issues.apache.org/jira/browse/CASSANDRA-2675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-2675: Attachment: 0002-Avoid-modifying-super-column-in-memtable-being-flush-v2.patch bq. Is it? Don't you still have the problem of a tombstone cleanup modifying things mid-flush? Patch 2 makes sure that the cf returned by getTopLevelColumns() doesn't contain any super column that is an alias of a super column in some memtable. So we don't care what consumers of the result of getTopLevelColumns() do: even if they remove columns, the 'being flushed' super column won't be affected. The idea of not always copying in the first patch was to avoid incurring the copy in all the parts of the code that don't care (mainly compaction). But anyway, I do think that patch 2 is enough. Attaching v2 of patch 2 to use isEmpty. java.io.IOError: java.io.EOFException with version 0.7.6 - Key: CASSANDRA-2675 URL: https://issues.apache.org/jira/browse/CASSANDRA-2675 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.7.6 Environment: Reproduced on single Cassandra node (CentOS 5.5) Reproduced on single Cassandra node (Windows Server 2008) Reporter: rene kochen Assignee: Sylvain Lebresne Fix For: 0.7.7 Attachments: 0001-Don-t-remove-columns-from-super-columns-in-memtable.patch, 0002-Avoid-modifying-super-column-in-memtable-being-flush-v2.patch, 0002-Avoid-modifying-super-column-in-memtable-being-flush.patch, CassandraIssue.zip, CassandraIssueJava.zip I use the following data-model column_metadata: [] name: Customers column_type: Super gc_grace_seconds: 60 I have a super-column-family with a single row. Within this row I have a single super-column. Within this super-column, I concurrently create, read and delete columns. I have three threads: - Do in a loop: add a column to the super-column. - Do in a loop: delete a random column from the super-column. - Do in a loop: read the super-column (with all columns). 
After running the above threads concurrently, I always receive one of the following errors: ERROR 17:09:57,036 Fatal exception in thread Thread[ReadStage:81,5,main] java.io.IOError: java.io.EOFException at org.apache.cassandra.io.util.ColumnIterator.deserializeNext(ColumnSortedMap.java:252) at org.apache.cassandra.io.util.ColumnIterator.next(ColumnSortedMap.java:268) at org.apache.cassandra.io.util.ColumnIterator.next(ColumnSortedMap.java:227) at java.util.concurrent.ConcurrentSkipListMap.buildFromSorted(Unknown Source) at java.util.concurrent.ConcurrentSkipListMap.init(Unknown Source) at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:379) at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:362) at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:322) at org.apache.cassandra.db.columniterator.SimpleSliceReader.computeNext(SimpleSliceReader.java:79) at org.apache.cassandra.db.columniterator.SimpleSliceReader.computeNext(SimpleSliceReader.java:40) at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136) at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131) at org.apache.cassandra.db.columniterator.SSTableSliceIterator.hasNext(SSTableSliceIterator.java:108) at org.apache.commons.collections.iterators.CollatingIterator.set(CollatingIterator.java:283) at org.apache.commons.collections.iterators.CollatingIterator.least(CollatingIterator.java:326) at org.apache.commons.collections.iterators.CollatingIterator.next(CollatingIterator.java:230) at org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:69) at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136) at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131) at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:116) at 
org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(QueryFilter.java:130) at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1390) at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1267) at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1195) at org.apache.cassandra.db.Table.getRow(Table.java:324) at
[jira] [Commented] (CASSANDRA-2669) Scrub does not close files
[ https://issues.apache.org/jira/browse/CASSANDRA-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13039213#comment-13039213 ] Sylvain Lebresne commented on CASSANDRA-2669: - +1 Scrub does not close files -- Key: CASSANDRA-2669 URL: https://issues.apache.org/jira/browse/CASSANDRA-2669 Project: Cassandra Issue Type: Bug Components: Tools Affects Versions: 0.7.3 Reporter: Daniel Doubleday Assignee: Jonathan Ellis Priority: Minor Fix For: 0.7.7, 0.8.1 Attachments: 2669.txt After scrubbing I find that cassandra process still holds file handles to the deleted sstables: {noformat} root@blnrzh047:/mnt/cassandra# jps 6932 Jps 32359 CassandraDaemon 32398 CassandraJmxHttpServer root@blnrzh047:/mnt/cassandra# du -sh . 315G . root@blnrzh047:/mnt/cassandra# df -h . FilesystemSize Used Avail Use% Mounted on /dev/md0 1.1T 626G 420G 60% /mnt/cassandra root@blnrzh047:/mnt/cassandra# lsof | grep /mnt java 32359root 356r REG9,0 24 4194599 /mnt/cassandra/data/system/Migrations-f-13-Index.db (deleted) java 32359root 357r REG9,0 329451 4194547 /mnt/cassandra/data/system/HintsColumnFamily-f-588-Data.db (deleted) java 32359root 358r REG9,0 22 4194546 /mnt/cassandra/data/system/HintsColumnFamily-f-588-Index.db (deleted) java 32359root 359r REG9,0 313225 4194534 /mnt/cassandra/data/system/HintsColumnFamily-f-587-Data.db (deleted) java 32359root 360r REG9,0 22 4194494 /mnt/cassandra/data/system/HintsColumnFamily-f-587-Index.db (deleted) java 32359root 361r REG9,030452 4194636 /mnt/cassandra/data/system/Schema-f-13-Data.db (deleted) java 32359root 362r REG9,0 484 4194635 /mnt/cassandra/data/system/Schema-f-13-Index.db (deleted) {noformat} I guess there's a missing dataFile.close() in CompactionManager:648 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
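The leak reported in CASSANDRA-2669 is the usual consequence of a read path that never closes its file. A minimal sketch of the safer shape, assuming Java 7+ try-with-resources; only the variable name `dataFile` comes from the report, while the class `ScrubCloseSketch` and the method `scanLength` are hypothetical and do not reproduce the CompactionManager code.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

public class ScrubCloseSketch {
    /**
     * Opens the data file, does some work on it (here: just reads its length),
     * and always releases the file handle, even if the work throws.
     */
    public static long scanLength(File file) throws IOException {
        try (RandomAccessFile dataFile = new RandomAccessFile(file, "r")) {
            return dataFile.length();
        } // dataFile.close() runs here on both the normal and the exceptional path
    }
}
```

With the handle released, deleting the sstable actually frees the disk space, so entries marked `(deleted)` should stop accumulating in the `lsof` output.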
[jira] [Commented] (CASSANDRA-2716) avoid allocating a new serializer per ColumnFamily (row)
[ https://issues.apache.org/jira/browse/CASSANDRA-2716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13040133#comment-13040133 ] Sylvain Lebresne commented on CASSANDRA-2716: - +1 avoid allocating a new serializer per ColumnFamily (row) Key: CASSANDRA-2716 URL: https://issues.apache.org/jira/browse/CASSANDRA-2716 Project: Cassandra Issue Type: Improvement Reporter: Jonathan Ellis Assignee: Jonathan Ellis Priority: Trivial Fix For: 0.7.7, 0.8.1 Attachments: 2716.txt Column.serializer and Supercolumn.serializer both allocate new objects with each call. The most frequent offender is the ColumnFamily constructor. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
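The idea behind CASSANDRA-2716 can be sketched as follows, assuming the serializer is stateless so that a single shared instance is safe to hand out. `SerializerSketch` and its nested `ColumnSerializer` are hypothetical stand-ins, not the actual Cassandra classes, and the attached patch may differ.

```java
public class SerializerSketch {
    /** Hypothetical stateless serializer: it has no fields, so one instance can be shared. */
    public static final class ColumnSerializer {
        public byte[] serialize(String name) {
            return name.getBytes(java.nio.charset.StandardCharsets.UTF_8);
        }
    }

    // Allocated once when the class is loaded instead of once per call.
    private static final ColumnSerializer SERIALIZER = new ColumnSerializer();

    /** Before the change this kind of accessor would do "return new ColumnSerializer();". */
    public static ColumnSerializer serializer() {
        return SERIALIZER;
    }
}
```

Because the accessor sits on a hot path (the ColumnFamily constructor, per the ticket), removing one allocation per call saves one short-lived object per row.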
[jira] [Commented] (CASSANDRA-2718) NPE in SSTableWriter when no ReplayPosition availible
[ https://issues.apache.org/jira/browse/CASSANDRA-2718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13040236#comment-13040236 ] Sylvain Lebresne commented on CASSANDRA-2718: - +1 NPE in SSTableWriter when no ReplayPosition availible - Key: CASSANDRA-2718 URL: https://issues.apache.org/jira/browse/CASSANDRA-2718 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.8.1 Reporter: T Jake Luciani Assignee: T Jake Luciani Priority: Trivial Fix For: 0.8.1 Attachments: v1-0001-CASSANDRA-2718-avoide-NPE-when-bypassing-commitlog.txt The following NPE occurs when durable_writes is set to false {noformat} ERROR 09:20:30,378 Fatal exception in thread Thread[FlushWriter:11,5,main] java.lang.RuntimeException: java.lang.NullPointerException at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:619) Caused by: java.lang.NullPointerException at org.apache.cassandra.db.commitlog.ReplayPosition$ReplayPositionSerializer.serialize(ReplayPosition.java:127) at org.apache.cassandra.io.sstable.SSTableWriter.writeMetadata(SSTableWriter.java:209) at org.apache.cassandra.io.sstable.SSTableWriter.closeAndOpenReader(SSTableWriter.java:187) at org.apache.cassandra.io.sstable.SSTableWriter.closeAndOpenReader(SSTableWriter.java:173) at org.apache.cassandra.db.Memtable.writeSortedContents(Memtable.java:253) at org.apache.cassandra.db.Memtable.access$400(Memtable.java:49) at org.apache.cassandra.db.Memtable$3.runMayThrow(Memtable.java:270) at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30) ... 3 more {noformat} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-2641) AbstractBounds.normalize should deal with overlapping ranges
[ https://issues.apache.org/jira/browse/CASSANDRA-2641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-2641: Attachment: 0001-Make-normalize-deoverlap-ranges.patch AbstractBounds.normalize should deal with overlapping ranges Key: CASSANDRA-2641 URL: https://issues.apache.org/jira/browse/CASSANDRA-2641 Project: Cassandra Issue Type: Test Components: Core Reporter: Stu Hood Assignee: Stu Hood Priority: Minor Fix For: 0.8.1 Attachments: 0001-Assert-non-overlapping-ranges-in-normalize.txt, 0001-Make-normalize-deoverlap-ranges.patch, 0002-Don-t-use-overlapping-ranges-in-tests.txt Apparently no consumers have encountered it in production, but AbstractBounds.normalize does not handle overlapping ranges. If given overlapping ranges, the output will be sorted but still overlapping, for which SSTableReader.getPositionsForRanges will choose ranges in an SSTable that may overlap. We should either add an assert in normalize(), or in getPositionsForRanges() to ensure that this never bites us in production. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Reopened] (CASSANDRA-2641) AbstractBounds.normalize should deal with overlapping ranges
[ https://issues.apache.org/jira/browse/CASSANDRA-2641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne reopened CASSANDRA-2641: - Sorry, I was much too quick in reviewing this. The patch has two problems: * It works only for Bounds, not Range. It will say that (1, 2] and (2, 3] are overlapping, but they are not. * It does the check on the unsorted input list, which is another reason why it will incorrectly report overlaps. Because I'm stubborn, I'm attaching a patch that takes the approach of making normalize() deoverlap overlapping ranges. It also adds a number of unit tests for normalize. AbstractBounds.normalize should deal with overlapping ranges Key: CASSANDRA-2641 URL: https://issues.apache.org/jira/browse/CASSANDRA-2641 Project: Cassandra Issue Type: Test Components: Core Reporter: Stu Hood Assignee: Stu Hood Priority: Minor Fix For: 0.8.1 Attachments: 0001-Assert-non-overlapping-ranges-in-normalize.txt, 0001-Make-normalize-deoverlap-ranges.patch, 0002-Don-t-use-overlapping-ranges-in-tests.txt Apparently no consumers have encountered it in production, but AbstractBounds.normalize does not handle overlapping ranges. If given overlapping ranges, the output will be sorted but still overlapping, for which SSTableReader.getPositionsForRanges will choose ranges in an SSTable that may overlap. We should either add an assert in normalize(), or in getPositionsForRanges() to ensure that this never bites us in production. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
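The two problems called out in the reopening comment can be illustrated with a small sketch. It assumes long tokens and half-open ranges (left, right] that do not wrap around the ring; `NormalizeSketch` and `Interval` are hypothetical names, and this is not the code of the attached patch.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class NormalizeSketch {
    /** Half-open interval (left, right]; a simplified, non-wrapping stand-in for a token range. */
    public static final class Interval {
        final long left;
        final long right;

        public Interval(long left, long right) {
            if (left >= right)
                throw new IllegalArgumentException("left must be smaller than right");
            this.left = left;
            this.right = right;
        }

        @Override
        public String toString() {
            return "(" + left + "," + right + "]";
        }
    }

    /** (1,2] and (2,3] share no token, so the comparisons must be strict. */
    public static boolean overlaps(Interval a, Interval b) {
        return a.left < b.right && b.left < a.right;
    }

    /** Sorts first, then merges every interval that overlaps the last one kept. */
    public static List<Interval> normalize(List<Interval> input) {
        List<Interval> sorted = new ArrayList<>(input);
        sorted.sort(Comparator.comparingLong((Interval i) -> i.left));

        List<Interval> output = new ArrayList<>();
        for (Interval current : sorted) {
            int lastIndex = output.size() - 1;
            if (lastIndex >= 0 && overlaps(output.get(lastIndex), current)) {
                Interval last = output.remove(lastIndex);
                // Sorting guarantees last.left <= current.left, so only the right bound can grow.
                output.add(new Interval(last.left, Math.max(last.right, current.right)));
            } else {
                output.add(current);
            }
        }
        return output;
    }
}
```

Sorting before comparing is what makes it sufficient to look only at the last interval kept. Adjacent intervals such as (1,4] and (4,5] are left as two entries here, since they do not overlap; whether to coalesce them is a separate design choice.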
[jira] [Resolved] (CASSANDRA-2709) sstableloader throws an exception when RF1
[ https://issues.apache.org/jira/browse/CASSANDRA-2709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne resolved CASSANDRA-2709. - Resolution: Duplicate It's CASSANDRA-2641 that is buggy. As one can see in this message, those ranges do not overlap. I've reopened CASSANDRA-2641 to fix it, so I'm closing this one as a duplicate. sstableloader throws an exception when RF1 --- Key: CASSANDRA-2709 URL: https://issues.apache.org/jira/browse/CASSANDRA-2709 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.8.1 Reporter: Brandon Williams Assignee: Sylvain Lebresne Priority: Minor Fix For: 0.8.1 {noformat} Exception in thread main java.lang.AssertionError: Overlapping ranges passed to normalize: see CASSANDRA-2461: (113427455640312821154458202477256070484,170141183460469231731687303715884105726] and [(56713727820156410577229101238628035242,113427455640312821154458202477256070484]] at org.apache.cassandra.dht.AbstractBounds.normalize(AbstractBounds.java:104) at org.apache.cassandra.io.sstable.SSTableReader.getPositionsForRanges(SSTableReader.java:497) at org.apache.cassandra.streaming.StreamOut.createPendingFiles(StreamOut.java:168) at org.apache.cassandra.streaming.StreamOut.transferSSTables(StreamOut.java:148) at org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:128) at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:61) {noformat} However, it does appear to keep streaming files. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2709) sstableloader throws an exception when RF1
[ https://issues.apache.org/jira/browse/CASSANDRA-2709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13040256#comment-13040256 ] Sylvain Lebresne commented on CASSANDRA-2709: - I've also reverted the buggy assertion, so this should not be a problem anymore. sstableloader throws an exception when RF1 --- Key: CASSANDRA-2709 URL: https://issues.apache.org/jira/browse/CASSANDRA-2709 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.8.1 Reporter: Brandon Williams Assignee: Sylvain Lebresne Priority: Minor Fix For: 0.8.1 {noformat} Exception in thread main java.lang.AssertionError: Overlapping ranges passed to normalize: see CASSANDRA-2461: (113427455640312821154458202477256070484,170141183460469231731687303715884105726] and [(56713727820156410577229101238628035242,113427455640312821154458202477256070484]] at org.apache.cassandra.dht.AbstractBounds.normalize(AbstractBounds.java:104) at org.apache.cassandra.io.sstable.SSTableReader.getPositionsForRanges(SSTableReader.java:497) at org.apache.cassandra.streaming.StreamOut.createPendingFiles(StreamOut.java:168) at org.apache.cassandra.streaming.StreamOut.transferSSTables(StreamOut.java:148) at org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:128) at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:61) {noformat} However, it does appear to keep streaming files. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (CASSANDRA-2719) Super Column Counters Increment on Read
[ https://issues.apache.org/jira/browse/CASSANDRA-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne resolved CASSANDRA-2719. - Resolution: Duplicate Fix Version/s: 0.8.0 This turns out to be a duplicate of CASSANDRA-2675. I've thus committed it to 0.8.0 too (it was already committed on the 0.7 and 0.8 branches). Super Column Counters Increment on Read --- Key: CASSANDRA-2719 URL: https://issues.apache.org/jira/browse/CASSANDRA-2719 Project: Cassandra Issue Type: Bug Affects Versions: 0.8 beta 1 Environment: Tested on 0.8.0-rc1 on both a 3 node cluster and single instance. Reporter: Greg Hinkle Fix For: 0.8.0 Attachments: SuperCountTest.java Running a large number of batch increments on a set of counters in a super CF seems to put some of the counters into a strange state where they increment every time you read from them. Including just doing a list or get from the cli. Will attach test that reproduces problem. For example, after running the test (and it completing and the process stopping). [default@Chires] get CountTest[01]; = (super_column=1306512590369, (counter=n, value=25625)) Returned 1 results. [default@Chires] get CountTest[01]; = (super_column=1306512590369, (counter=n, value=26610)) Returned 1 results. From debug logs at the same time: DEBUG 12:42:13,899 get_slice DEBUG 12:42:13,899 Command/ConsistencyLevel is SliceFromReadCommand(table='Chires', key='01', column_parent='QueryPath(columnFamilyName='CountTest', superColumnName='null', columnName='null')', start='', finish='', reversed=false, count=100)/ONE DEBUG 12:42:13,899 Blockfor/repair is 1/true; setting up requests to /127.0.0.2 DEBUG 12:42:13,899 reading data from /127.0.0.2 DEBUG 12:42:13,900 Processing response on a callback from 210570@/127.0.0.2 DEBUG 12:42:13,900 Preprocessed data response DEBUG 12:42:13,900 Read: 1 ms. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2721) nodetool statusthrift exception while node starts up
[ https://issues.apache.org/jira/browse/CASSANDRA-2721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13041031#comment-13041031 ] Sylvain Lebresne commented on CASSANDRA-2721: - I don't think nodetool statusthrift exists, but yeah, makes sense, +1. But let's just put it in 0.8.1, for the sake of making Eric's job easier when he re-rolls 0.8.0 (in the meantime, any hypothetical implementation of a nodetool statusthrift could just catch the IllegalStateException). nodetool statusthrift exception while node starts up Key: CASSANDRA-2721 URL: https://issues.apache.org/jira/browse/CASSANDRA-2721 Project: Cassandra Issue Type: Bug Reporter: Chris Goffinet Assignee: Chris Goffinet Priority: Trivial Fix For: 0.8.0 Attachments: 0001-If-RPCServer-isn-t-started-just-return-false-instead.patch We noticed that calling nodetool statusthrift while a node is starting up throws an exception. I think the proper behavior should be to just return false, instead of throwing an exception, if the RPC server hasn't started yet. That way this stack trace won't have to be thrown in nodetool: Exception in thread main java.lang.IllegalStateException: No configured RPC daemon -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2722) nodetool statusthrift
[ https://issues.apache.org/jira/browse/CASSANDRA-2722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13041057#comment-13041057 ] Sylvain Lebresne commented on CASSANDRA-2722: - printisThriftServerRunning() should have the 'i' of 'is' in caps. Also, printing just 'true' or 'false' is maybe a bit harsh (maybe 'Running' or 'Not running' would be slightly more friendly and not much harder to use in a script). But apart from those nitpicks, +1. nodetool statusthrift - Key: CASSANDRA-2722 URL: https://issues.apache.org/jira/browse/CASSANDRA-2722 Project: Cassandra Issue Type: Improvement Reporter: Chris Goffinet Assignee: Chris Goffinet Priority: Trivial Fix For: 0.8.1 Attachments: 0001-Added-the-ability-to-check-thrift-status-in-nodetool.patch Provide the status of thrift server, if it's running or not. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
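The behaviour discussed in the two statusthrift tickets (CASSANDRA-2721 and CASSANDRA-2722) might look roughly like this. It is a sketch under assumptions: `ThriftStatusSketch`, `RpcDaemon`, `isThriftServerRunning()` and `describe()` are hypothetical names, and only the 'Running' / 'Not running' wording is taken from the review comment.

```java
public class ThriftStatusSketch {
    /** Hypothetical stand-in for the RPC daemon; there is none until the node has set it up. */
    public static final class RpcDaemon {
        private volatile boolean running;

        public void start() { running = true; }
        public void stop() { running = false; }
        public boolean isRunning() { return running; }
    }

    // Stays null while the node is still starting up.
    private volatile RpcDaemon daemon;

    public void configure(RpcDaemon daemon) {
        this.daemon = daemon;
    }

    /** Returns false instead of throwing IllegalStateException when no daemon is configured yet. */
    public boolean isThriftServerRunning() {
        RpcDaemon current = daemon;
        return current != null && current.isRunning();
    }

    /** Friendlier than printing a bare boolean, and still easy to match in a script. */
    public static String describe(boolean running) {
        return running ? "Running" : "Not running";
    }

    public static void main(String[] args) {
        ThriftStatusSketch node = new ThriftStatusSketch();
        System.out.println(describe(node.isThriftServerRunning()));
    }
}
```

Reading the field into a local variable before the null check keeps the method safe if another thread configures or clears the daemon concurrently.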
[jira] [Assigned] (CASSANDRA-2653) index scan errors out when zero columns are requested
[ https://issues.apache.org/jira/browse/CASSANDRA-2653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne reassigned CASSANDRA-2653: --- Assignee: Sylvain Lebresne (was: Jonathan Ellis) index scan errors out when zero columns are requested - Key: CASSANDRA-2653 URL: https://issues.apache.org/jira/browse/CASSANDRA-2653 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.8.0 beta 2 Reporter: Jonathan Ellis Assignee: Sylvain Lebresne Priority: Minor Fix For: 0.7.7 Attachments: v1-0001-CASSANDRA-2653-reproduce-regression.txt As reported by Tyler Hobbs as an addendum to CASSANDRA-2401, {noformat} ERROR 16:13:38,864 Fatal exception in thread Thread[ReadStage:16,5,main] java.lang.AssertionError: No data found for SliceQueryFilter(start=java.nio.HeapByteBuffer[pos=10 lim=10 cap=30], finish=java.nio.HeapByteBuffer[pos=17 lim=17 cap=30], reversed=false, count=0] in DecoratedKey(81509516161424251288255223397843705139, 6b657931):QueryPath(columnFamilyName='cf', superColumnName='null', columnName='null') (original filter SliceQueryFilter(start=java.nio.HeapByteBuffer[pos=10 lim=10 cap=30], finish=java.nio.HeapByteBuffer[pos=17 lim=17 cap=30], reversed=false, count=0]) from expression 'cf.626972746864617465 EQ 1' at org.apache.cassandra.db.ColumnFamilyStore.scan(ColumnFamilyStore.java:1517) at org.apache.cassandra.service.IndexScanVerbHandler.doVerb(IndexScanVerbHandler.java:42) at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) {noformat} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-2653) index scan errors out when zero columns are requested
[ https://issues.apache.org/jira/browse/CASSANDRA-2653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-2653: Attachment: 0001-Reset-SSTII-in-EchoedRow-constructor.patch This is indeed compaction related (but not related to secondary indexing at all). The problem is that compaction may lose some rows. Because of the way the ReducingIterator works, when we create a new {Pre|Lazy|Echoed}CompactedRow, we have already decoded the next row key and the file pointer is after that next row key. Both PreCompactedRow and LazyCompactedRow handle this correctly by resetting their SSTableIdentityIterator before reading (SSTII.getColumnFamilyWithColumns() does it for PreCompactedRow and LazilyCompactedRow calls SSTII.reset() directly). But EchoedRow doesn't handle this correctly. Hence when EchoedRow.isEmpty() is called, it will call SSTII.hasNext(), which will compare the current file pointer to the finishedAt value of the iterator. The pointer being on the next row, this test will always fail and the row will be skipped. Attaching a patch against 0.8 with a (smaller) unit test. Note that luckily this doesn't affect 0.7, because it only uses EchoedRow for cleanup compactions and cleanup compactions do not use ReducingIterator (and thus, the underlying SSTII won't have changed when the EchoedRow is built). I would still be in favor of committing the patch there too, just to make sure we don't hit this later. 
index scan errors out when zero columns are requested - Key: CASSANDRA-2653 URL: https://issues.apache.org/jira/browse/CASSANDRA-2653 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.8.0 beta 2 Reporter: Jonathan Ellis Assignee: Sylvain Lebresne Priority: Minor Fix For: 0.7.7 Attachments: 0001-Reset-SSTII-in-EchoedRow-constructor.patch, v1-0001-CASSANDRA-2653-reproduce-regression.txt As reported by Tyler Hobbs as an addendum to CASSANDRA-2401, {noformat} ERROR 16:13:38,864 Fatal exception in thread Thread[ReadStage:16,5,main] java.lang.AssertionError: No data found for SliceQueryFilter(start=java.nio.HeapByteBuffer[pos=10 lim=10 cap=30], finish=java.nio.HeapByteBuffer[pos=17 lim=17 cap=30], reversed=false, count=0] in DecoratedKey(81509516161424251288255223397843705139, 6b657931):QueryPath(columnFamilyName='cf', superColumnName='null', columnName='null') (original filter SliceQueryFilter(start=java.nio.HeapByteBuffer[pos=10 lim=10 cap=30], finish=java.nio.HeapByteBuffer[pos=17 lim=17 cap=30], reversed=false, count=0]) from expression 'cf.626972746864617465 EQ 1' at org.apache.cassandra.db.ColumnFamilyStore.scan(ColumnFamilyStore.java:1517) at org.apache.cassandra.service.IndexScanVerbHandler.doVerb(IndexScanVerbHandler.java:42) at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) {noformat} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
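The failure mode described in the comment on CASSANDRA-2653 can be reproduced in miniature. This sketch assumes a shared cursor standing in for the sstable file pointer; `RowIteratorResetSketch`, `Cursor` and `RowIterator` are hypothetical and much simpler than SSTableIdentityIterator and EchoedRow.

```java
public class RowIteratorResetSketch {
    /** Shared position, standing in for the file pointer of the underlying sstable file. */
    public static final class Cursor {
        long position;
    }

    /** Iterates over one row stored between dataStart and finishedAt in the shared file. */
    public static final class RowIterator {
        private final Cursor file;
        private final long dataStart;
        private final long finishedAt;

        public RowIterator(Cursor file, long dataStart, long finishedAt) {
            this.file = file;
            this.dataStart = dataStart;
            this.finishedAt = finishedAt;
        }

        /** Moves the shared pointer back to the beginning of this row's data. */
        public void reset() {
            file.position = dataStart;
        }

        /** Only meaningful if the shared pointer is currently inside this row. */
        public boolean hasNext() {
            return file.position < finishedAt;
        }
    }

    /** Buggy variant: trusts wherever the shared pointer happens to be. */
    public static boolean isEmptyWithoutReset(RowIterator row) {
        return !row.hasNext();
    }

    /** Fixed variant: resets the iterator before asking whether the row has data. */
    public static boolean isEmptyWithReset(RowIterator row) {
        row.reset();
        return !row.hasNext();
    }
}
```

If the caller has already advanced the cursor past `finishedAt` while decoding the next row key, the first variant reports a non-empty row as empty, which matches the skipped rows described above; the second variant is unaffected.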
[jira] [Commented] (CASSANDRA-2405) should expose 'time since last successful repair' for easier aes monitoring
[ https://issues.apache.org/jira/browse/CASSANDRA-2405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13042175#comment-13042175 ] Sylvain Lebresne commented on CASSANDRA-2405: - This needs rebasing. First, a few small remarks: * It seems we store the time in microseconds but then, when computing the time since last repair we use System.currentTimeMillis() - stored_time. * I would be in favor of calling the system table REPAIR_INFO, because the truth is I think it would make sense to record a number of other statistics on repair and it doesn't hurt to make the system table less specific. That also means we should probably not force any type for the value (though that can be easily changed later, so it's not a big deal for this patch). * I think we usually put the code to query the system table in SystemTable, so I would move it from AntiEntropy to there. Then more generally, a given repair involves multiple states and multiple nodes, so I don't think keeping only one timestamp is enough. Right now, we save the time of the last scheduled validation compaction on each node. With only that, we're missing the information people need to make a reasonably informed decision: * First, this does not correspond to the last repair session started on that node, since the validation can be a request from another node. People may be interested in that information. * Second, given that repair concerns a given range, keeping only one general number is wrong (it would suggest the node has been repaired recently even when only one range out of 3 or 5 has actually been repaired). * Third, though recording the start of the validation compaction is important, this says nothing about the success of the repair (and we all know failures during repair do happen, if only because it's a fairly long operation during which nodes can die). So we need to record some info on the success of the operation if we don't want to return misleading information. 
Turns out, this is easy to record on the node coordinating the repair, maybe not so much on the other nodes participating in the repair. Truth is, I'm not so sure what the simplest way to handle this is. Maybe one option could be to only register the start and end time of a repair session on the coordinator of the repair (adding the info of which range was repaired). Also, what do people think of keeping a history (instead of just keeping the last number)? I'm thinking a little bit ahead here, but what about storing one supercolumn per repair, where the supercolumn name would be the repair session id (a TimeUUID really) and the columns would hold info on that repair. For this patch we would only record the range for that session, the start time and the end time (or maybe one end time for each node). But we would populate this a little bit further with stuff like CASSANDRA-2698. I think having such a history would be fairly interesting. should expose 'time since last successful repair' for easier aes monitoring --- Key: CASSANDRA-2405 URL: https://issues.apache.org/jira/browse/CASSANDRA-2405 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Peter Schuller Assignee: Pavel Yaskevich Priority: Minor Fix For: 0.8.1 Attachments: CASSANDRA-2405-v2.patch, CASSANDRA-2405.patch The practical implementation issues of actually ensuring repair runs is somewhat of an undocumented/untreated issue. One hopefully low hanging fruit would be to at least expose the time since last successful repair for a particular column family, to make it easier to write a correct script to monitor for lack of repair in a non-buggy fashion. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
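The unit mismatch raised in the first remark of the comment above (a timestamp stored in microseconds, subtracted from a millisecond clock) can be sketched as follows. This is only an illustration under assumptions: `RepairAgeSketch` and `millisSince` are hypothetical names, not code from the patch under review.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper: converts the stored value before subtracting.
final class RepairAgeSketch {
    /** Elapsed milliseconds since a timestamp that was stored in microseconds. */
    static long millisSince(long storedMicros, long nowMillis) {
        return nowMillis - TimeUnit.MICROSECONDS.toMillis(storedMicros);
    }

    public static void main(String[] args) {
        long storedMicros = 5_000_000L; // 5 seconds after the epoch, in microseconds
        long nowMillis = 7_000L;        // 7 seconds after the epoch, in milliseconds

        // Mixing units directly yields a meaningless (here negative) duration.
        System.out.println("mixed units: " + (nowMillis - storedMicros));
        // Converting first yields the intended 2000 ms.
        System.out.println("converted:   " + millisSince(storedMicros, nowMillis));
    }
}
```

Storing the value in milliseconds in the first place would work equally well; the point is only that both sides of the subtraction must use the same unit.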
[jira] [Commented] (CASSANDRA-2673) AssertionError post truncate
[ https://issues.apache.org/jira/browse/CASSANDRA-2673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13042199#comment-13042199 ] Sylvain Lebresne commented on CASSANDRA-2673: - +1 AssertionError post truncate Key: CASSANDRA-2673 URL: https://issues.apache.org/jira/browse/CASSANDRA-2673 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.7.0 Environment: linux 64-bit ubuntu. deb package (datastax). (Random partitioner) Reporter: Marko Mikulicic Assignee: Jonathan Ellis Priority: Minor Fix For: 0.7.7, 0.8.1 Attachments: 2673.txt I had 3 nodes with about 100G in a CF. I run truncate on that CF from cassandra-cli. Then I run cleanup for that CF. I saw this exception shortly after. INFO [FlushWriter:5] 2011-05-20 02:56:42,699 Memtable.java (line 157) Writing Memtable-body@1278535630(26722 bytes, 1 operations) INFO [FlushWriter:5] 2011-05-20 02:56:42,706 Memtable.java (line 172) Completed flushing /var/lib/cassandra/data/dnet/body-f-1892-Data.db (26915 bytes) INFO [NonPeriodicTasks:1] 2011-05-20 02:59:55,981 SSTable.java (line 147) Deleted /var/lib/cassandra/data/dnet/body-f-1892 INFO [NonPeriodicTasks:1] 2011-05-20 02:59:55,982 SSTable.java (line 147) Deleted /var/lib/cassandra/data/dnet/body-f-1889 INFO [NonPeriodicTasks:1] 2011-05-20 02:59:55,983 SSTable.java (line 147) Deleted /var/lib/cassandra/data/dnet/body-f-1890 INFO [NonPeriodicTasks:1] 2011-05-20 02:59:55,983 SSTable.java (line 147) Deleted /var/lib/cassandra/data/dnet/body-f-1888 INFO [NonPeriodicTasks:1] 2011-05-20 02:59:55,984 SSTable.java (line 147) Deleted /var/lib/cassandra/data/dnet/body-f-1887 INFO [CompactionExecutor:1] 2011-05-20 03:02:08,724 CompactionManager.java (line 750) Cleaned up to /var/lib/cassandra/data/dnet/body-tmp-f-1891-Data.db. 25,629,365,173 to 25,629,365,173 (~100% of original) bytes for 884,546 keys. Time: 1,165,900ms. 
ERROR [CompactionExecutor:1] 2011-05-20 03:02:08,727 AbstractCassandraDaemon.java (line 114) Fatal exception in thread Thread[CompactionExecutor:1,1,main] java.lang.AssertionError at org.apache.cassandra.io.sstable.SSTableTracker.replace(SSTableTracker.java:108) at org.apache.cassandra.db.ColumnFamilyStore.replaceCompactedSSTables(ColumnFamilyStore.java:1037) at org.apache.cassandra.db.CompactionManager.doCleanupCompaction(CompactionManager.java:769) at org.apache.cassandra.db.CompactionManager.access$500(CompactionManager.java:56) at org.apache.cassandra.db.CompactionManager$2.call(CompactionManager.java:173) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2231) Add CompositeType comparer to the comparers provided in org.apache.cassandra.db.marshal
[ https://issues.apache.org/jira/browse/CASSANDRA-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1304#comment-1304 ] Sylvain Lebresne commented on CASSANDRA-2231: - You'd have to apply the patch on CASSANDRA-2355 first. Add CompositeType comparer to the comparers provided in org.apache.cassandra.db.marshal --- Key: CASSANDRA-2231 URL: https://issues.apache.org/jira/browse/CASSANDRA-2231 Project: Cassandra Issue Type: New Feature Components: Contrib Reporter: Ed Anuff Assignee: Sylvain Lebresne Priority: Minor Fix For: 0.8.1 Attachments: 0001-Add-compositeType-and-DynamicCompositeType-v2.patch, 0001-Add-compositeType-and-DynamicCompositeType-v3.patch, 0001-Add-compositeType-and-DynamicCompositeType-v4.patch, 0001-Add-compositeType-and-DynamicCompositeType_0.7.patch, CompositeType-and-DynamicCompositeType.patch, edanuff-CassandraCompositeType-1e253c4.zip CompositeType is a custom comparer that makes it possible to create comparable composite values out of the basic types that Cassandra currently supports, such as Long, UUID, etc. This is very useful in both the creation of custom inverted indexes using columns in a skinny row, where each column name is a composite value, and also when using Cassandra's built-in secondary index support, where it can be used to encode the values in the columns that Cassandra indexes. One scenario for the usage of these is documented here: http://www.anuff.com/2010/07/secondary-indexes-in-cassandra.html. Source for contribution is attached and has been previously maintained on github here: https://github.com/edanuff/CassandraCompositeType -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2654) Work around native heap leak in sun.nio.ch.Util affecting IncomingTcpConnection
[ https://issues.apache.org/jira/browse/CASSANDRA-2654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13042268#comment-13042268 ] Sylvain Lebresne commented on CASSANDRA-2654: - looks good, +1 Work around native heap leak in sun.nio.ch.Util affecting IncomingTcpConnection --- Key: CASSANDRA-2654 URL: https://issues.apache.org/jira/browse/CASSANDRA-2654 Project: Cassandra Issue Type: Improvement Components: Core Affects Versions: 0.6.13, 0.7.5, 0.8.0 beta 2 Environment: OpenJDK Runtime Environment (IcedTea6 1.9.7) (6b20-1.9.7-0ubuntu1~10.04.1) OpenJDK 64-Bit Server VM (build 19.0-b09, mixed mode) Also observed on Sun/Oracle JDK. Probably platform- and os-independent. Reporter: Hannes Schmidt Fix For: 0.6.12 Attachments: 2654-v2.txt, 2654-v3.txt, 2654-v4-0.7.txt, 2654-v4.txt, chunking.diff NIO's leaky, per-thread caching of direct buffers in combination with IncomingTcpConnection's eager buffering of messages leads to leakage of large amounts of native heap. Details in [1]. More on the root cause in [2]. Even though it doesn't fix the leak, attached patch has been found to alleviate the problem by keeping the size of each direct buffer modest. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2280) Request specific column families using StreamIn
[ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13045323#comment-13045323 ] Sylvain Lebresne commented on CASSANDRA-2280: - In StreamRequestMessage deserializer, in the version VERSION_080 part, the type is deserialized again, it should be removed. It needs rebasing (at least for 0.8 branch) so I didn't run the tests with it, but looks good otherwise. Request specific column families using StreamIn --- Key: CASSANDRA-2280 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Stu Hood Assignee: Jonathan Ellis Fix For: 0.8.1 Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 2280-v3.txt, 2280-v4.txt, 2280-v5.txt StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-833) fix consistencylevel during bootstrap
[ https://issues.apache.org/jira/browse/CASSANDRA-833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13046017#comment-13046017 ] Sylvain Lebresne commented on CASSANDRA-833: +1 fix consistencylevel during bootstrap - Key: CASSANDRA-833 URL: https://issues.apache.org/jira/browse/CASSANDRA-833 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.5 Reporter: Jonathan Ellis Assignee: Sylvain Lebresne Fix For: 0.8.1 Attachments: 0001-Increase-CL-with-boostrapping-leaving-node.patch, 833-v2.txt As originally designed, bootstrap nodes should *always* get *all* writes under any consistencylevel, so when bootstrap finishes the operator can run cleanup on the old nodes w/o fear that he might lose data. but if a bootstrap operation fails or is aborted, that means all writes will fail until the ex-bootstrapping node is decommissioned. so starting in CASSANDRA-722, we just ignore dead nodes in consistencylevel calculations. but this breaks the original design. CASSANDRA-822 adds a partial fix for this (just adding bootstrap targets into the RF targets and hinting normally), but this is still broken under certain conditions. The real fix is to consider consistencylevel for two sets of nodes: 1. the RF targets as currently existing (no pending ranges) 2. the RF targets as they will exist after all movement ops are done If we satisfy CL for both sets then we will always be in good shape. I'm not sure if we can easily calculate 2. from the current TokenMetadata, though. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2590) row delete breaks read repair
[ https://issues.apache.org/jira/browse/CASSANDRA-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13046398#comment-13046398 ] Sylvain Lebresne commented on CASSANDRA-2590: - +1 on v4, we do need both calls. That being said, we should probably refactor that part of the code someday because it is not the cleanest thing ever. And there are probably ways to avoid those two phases (which I believe do some duplicate work). row delete breaks read repair -- Key: CASSANDRA-2590 URL: https://issues.apache.org/jira/browse/CASSANDRA-2590 Project: Cassandra Issue Type: Bug Components: Core Reporter: Aaron Morton Assignee: Aaron Morton Priority: Minor Fix For: 0.7.7, 0.8.1 Attachments: 0001-2590-v3.patch, 0001-cf-resolve-test-and-possible-solution-for-read-repai.patch, 2590-v2.txt, 2590-v4-0.7.txt related to CASSANDRA-2589 Working at CL ALL can get inconsistent reads after row deletion. Reproduced on the 0.7 and 0.8 source. Steps to reproduce: # two node cluster with rf 2 and HH turned off # insert rows via cli # flush both nodes # shutdown node 1 # connect to node 2 via cli and delete one row # bring up node 1 # connect to node 1 via cli and issue get with CL ALL # first get returns the deleted row, second get returns zero rows. RowRepairResolver.resolveSuperSet() resolves a local CF with the old row columns, and the remote CF which is marked for deletion. CF.resolve() does not pay attention to the deletion flags and the resolved CF has both markedForDeletion set and a column with a lower timestamp. The return from resolveSuperSet() is used as the return for the read without checking if the cols are relevant. Also when RowRepairResolver.maybeScheduleRepairs() runs it sends two mutations. Node 1 is given the row level deletion, and Node 2 is given a mutation to write the old (and now deleted) column from node 2. I have some log traces for this if needed. 
A quick fix is to check for relevant columns in the RowRepairResolver, will attach shortly. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
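The "check for relevant columns" idea from the issue description above can be sketched in isolation. This is a simplified model and not Cassandra's ColumnFamily API: `Row`, `markedForDeleteAt` and `resolve` are hypothetical names, and a column is treated as relevant only if its timestamp is strictly newer than the row-level deletion.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, simplified row model used only for illustration.
final class ResolveSketch {
    static final class Row {
        long markedForDeleteAt = Long.MIN_VALUE; // MIN_VALUE means "never deleted"
        final Map<String, Long> columnTimestamps = new TreeMap<>();
    }

    /** Merges two versions of a row, dropping columns shadowed by the row tombstone. */
    static Row resolve(Row a, Row b) {
        Row out = new Row();
        out.markedForDeleteAt = Math.max(a.markedForDeleteAt, b.markedForDeleteAt);
        for (Row version : new Row[] { a, b }) {
            for (Map.Entry<String, Long> column : version.columnTimestamps.entrySet()) {
                // Keep a column only if it was written after the row deletion.
                if (column.getValue() > out.markedForDeleteAt)
                    out.columnTimestamps.merge(column.getKey(), column.getValue(), Long::max);
            }
        }
        return out;
    }

    /** Node 1 still holds an old column; node 2 deleted the row later. */
    static int columnsAfterResolve() {
        Row node1 = new Row();
        node1.columnTimestamps.put("c1", 10L);
        Row node2 = new Row();
        node2.markedForDeleteAt = 20L;
        return resolve(node1, node2).columnTimestamps.size();
    }

    public static void main(String[] args) {
        System.out.println("columns surviving the resolve: " + columnsAfterResolve());
    }
}
```

With this filter in place the resolved row carries the tombstone but not the older column, so neither the read result nor a repair mutation built from it would resurrect the deleted data in this model.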
[jira] [Updated] (CASSANDRA-1034) Remove assumption that Key to Token is one-to-one
[ https://issues.apache.org/jira/browse/CASSANDRA-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-1034: Attachment: (was: 0002-Remove-assumption-that-token-and-keys-are-one-to-one-v2.patch) Remove assumption that Key to Token is one-to-one - Key: CASSANDRA-1034 URL: https://issues.apache.org/jira/browse/CASSANDRA-1034 Project: Cassandra Issue Type: Bug Reporter: Stu Hood Assignee: Sylvain Lebresne Priority: Minor Fix For: 1.0 Attachments: 0001-Make-range-accept-both-Token-and-DecoratedKey.patch, 0002-LengthPartitioner.patch, 1034_v1.txt get_range_slices assumes that Tokens do not collide and converts a KeyRange to an AbstractBounds. For RandomPartitioner, this assumption isn't safe, and would lead to a very weird heisenberg. Converting AbstractBounds to use a DecoratedKey would solve this, because the byte[] key portion of the DecoratedKey can act as a tiebreaker. Alternatively, we could make DecoratedKey extend Token, and then use DecoratedKeys in places where collisions are unacceptable. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-1034) Remove assumption that Key to Token is one-to-one
[ https://issues.apache.org/jira/browse/CASSANDRA-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-1034: Attachment: (was: 0002-Remove-assumption-that-token-and-keys-are-one-to-one.patch) Remove assumption that Key to Token is one-to-one - Key: CASSANDRA-1034 URL: https://issues.apache.org/jira/browse/CASSANDRA-1034 Project: Cassandra Issue Type: Bug Reporter: Stu Hood Assignee: Sylvain Lebresne Priority: Minor Fix For: 1.0 Attachments: 0001-Make-range-accept-both-Token-and-DecoratedKey.patch, 0002-LengthPartitioner.patch, 1034_v1.txt get_range_slices assumes that Tokens do not collide and converts a KeyRange to an AbstractBounds. For RandomPartitioner, this assumption isn't safe, and would lead to a very weird heisenberg. Converting AbstractBounds to use a DecoratedKey would solve this, because the byte[] key portion of the DecoratedKey can act as a tiebreaker. Alternatively, we could make DecoratedKey extend Token, and then use DecoratedKeys in places where collisions are unacceptable. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-1034) Remove assumption that Key to Token is one-to-one
[ https://issues.apache.org/jira/browse/CASSANDRA-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-1034: Attachment: 1034-2-Remove-assumption-that-token-and-keys-are-one-to-one-v3.patch 1034-1-Generify-AbstractBounds-v3.patch Patch rebased, this is against trunk. Remove assumption that Key to Token is one-to-one - Key: CASSANDRA-1034 URL: https://issues.apache.org/jira/browse/CASSANDRA-1034 Project: Cassandra Issue Type: Bug Reporter: Stu Hood Assignee: Sylvain Lebresne Priority: Minor Fix For: 1.0 Attachments: 0001-Make-range-accept-both-Token-and-DecoratedKey.patch, 0002-LengthPartitioner.patch, 1034-1-Generify-AbstractBounds-v3.patch, 1034-2-Remove-assumption-that-token-and-keys-are-one-to-one-v3.patch, 1034_v1.txt get_range_slices assumes that Tokens do not collide and converts a KeyRange to an AbstractBounds. For RandomPartitioner, this assumption isn't safe, and would lead to a very weird heisenberg. Converting AbstractBounds to use a DecoratedKey would solve this, because the byte[] key portion of the DecoratedKey can act as a tiebreaker. Alternatively, we could make DecoratedKey extend Token, and then use DecoratedKeys in places where collisions are unacceptable. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-1034) Remove assumption that Key to Token is one-to-one
[ https://issues.apache.org/jira/browse/CASSANDRA-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13046477#comment-13046477 ] Sylvain Lebresne commented on CASSANDRA-1034: - bq. One way to remove toSplitValue would be to use DecoratedKey everywhere; I'm not saying it's not possible, but I think this is overkill (in the changes it involves). Moreover, all the code that deals with topology really only cares about tokens. That's the right abstraction for those parts of the code. So I really (really) doubt using decorated key everywhere would be cleaner. Of course, anyone is free to actually do the experiment and prove me wrong. I also don't think it would remove the need for splitValue, it would just maybe call it differently. bq. The equivalent of today's Token is a DecoratedKey for that token with a null key This is only true today because we assume key and token are one-to-one. The goal is to change that. If multiple keys can have the same token (by definition the token is really the hash of a key), then the statement above is false. If a token corresponds to an infinite set of keys (which is the case with md5 btw, we just ignore it), then replacing a token by a given key *cannot* work. Overall, it could be that there is a better way to do this, but having spent some time on this, I am reasonably confident that it fixes the issue at hand without being too disruptive (which is not to say there aren't a few points here and there that could be improved). 
Remove assumption that Key to Token is one-to-one - Key: CASSANDRA-1034 URL: https://issues.apache.org/jira/browse/CASSANDRA-1034 Project: Cassandra Issue Type: Bug Reporter: Stu Hood Assignee: Sylvain Lebresne Priority: Minor Fix For: 1.0 Attachments: 0001-Make-range-accept-both-Token-and-DecoratedKey.patch, 0002-LengthPartitioner.patch, 1034-1-Generify-AbstractBounds-v3.patch, 1034-2-Remove-assumption-that-token-and-keys-are-one-to-one-v3.patch, 1034_v1.txt get_range_slices assumes that Tokens do not collide and converts a KeyRange to an AbstractBounds. For RandomPartitioner, this assumption isn't safe, and would lead to a very weird heisenberg. Converting AbstractBounds to use a DecoratedKey would solve this, because the byte[] key portion of the DecoratedKey can act as a tiebreaker. Alternatively, we could make DecoratedKey extend Token, and then use DecoratedKeys in places where collisions are unacceptable. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
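The tiebreaker idea from the issue description above (ordering by token first, then by the raw key bytes) can be sketched as follows. `DecoratedKeySketch` is a hypothetical class written for illustration; it is not Cassandra's DecoratedKey, and the unsigned byte ordering is an assumption.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;

// Hypothetical key type: compares by token, then by key bytes as a tiebreaker.
final class DecoratedKeySketch implements Comparable<DecoratedKeySketch> {
    final BigInteger token;
    final byte[] key;

    DecoratedKeySketch(BigInteger token, byte[] key) {
        this.token = token;
        this.key = key;
    }

    /** Convenience factory so examples can avoid building byte arrays by hand. */
    static DecoratedKeySketch of(long token, String key) {
        return new DecoratedKeySketch(BigInteger.valueOf(token),
                                      key.getBytes(StandardCharsets.UTF_8));
    }

    @Override
    public int compareTo(DecoratedKeySketch other) {
        int byToken = token.compareTo(other.token);
        if (byToken != 0)
            return byToken;
        // Same token (a collision): fall back to the key bytes.
        return compareUnsigned(key, other.key);
    }

    /** Lexicographic comparison treating each byte as unsigned. */
    static int compareUnsigned(byte[] a, byte[] b) {
        int common = Math.min(a.length, b.length);
        for (int i = 0; i < common; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y)
                return x < y ? -1 : 1;
        }
        return Integer.compare(a.length, b.length);
    }

    public static void main(String[] args) {
        DecoratedKeySketch first = of(10, "a");
        DecoratedKeySketch second = of(10, "b");
        System.out.println("tokens equal: " + first.token.equals(second.token));
        System.out.println("keys ordered: " + (first.compareTo(second) < 0));
    }
}
```

Two keys that collide on the token still get a total order here, which is what a range bound would need in order to distinguish them.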
[jira] [Commented] (CASSANDRA-2231) Add CompositeType comparer to the comparers provided in org.apache.cassandra.db.marshal
[ https://issues.apache.org/jira/browse/CASSANDRA-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13046500#comment-13046500 ] Sylvain Lebresne commented on CASSANDRA-2231: - The comment still applies to DynamicCompositeType, but what the comment doesn't say is that if you use a 0x01 as the end-of-component, it expects you to have no remaining component. The error message tells you that apparently there are some bytes remaining after that 0x01. You can look at the discussion above on that ticket for why it doesn't make sense to have anything after a 0x01. Add CompositeType comparer to the comparers provided in org.apache.cassandra.db.marshal --- Key: CASSANDRA-2231 URL: https://issues.apache.org/jira/browse/CASSANDRA-2231 Project: Cassandra Issue Type: New Feature Components: Contrib Reporter: Ed Anuff Assignee: Sylvain Lebresne Priority: Minor Fix For: 0.8.1 Attachments: 0001-Add-compositeType-and-DynamicCompositeType-v2.patch, 0001-Add-compositeType-and-DynamicCompositeType-v3.patch, 0001-Add-compositeType-and-DynamicCompositeType-v4.patch, 0001-Add-compositeType-and-DynamicCompositeType_0.7.patch, CompositeType-and-DynamicCompositeType.patch, edanuff-CassandraCompositeType-1e253c4.zip CompositeType is a custom comparer that makes it possible to create comparable composite values out of the basic types that Cassandra currently supports, such as Long, UUID, etc. This is very useful in both the creation of custom inverted indexes using columns in a skinny row, where each column name is a composite value, and also when using Cassandra's built-in secondary index support, where it can be used to encode the values in the columns that Cassandra indexes. One scenario for the usage of these is documented here: http://www.anuff.com/2010/07/secondary-indexes-in-cassandra.html. 
Source for contribution is attached and has been previously maintained on github here: https://github.com/edanuff/CassandraCompositeType -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-2433) Failed Streams Break Repair
[ https://issues.apache.org/jira/browse/CASSANDRA-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-2433: Attachment: 0004-Reports-validation-compaction-errors-back-to-repair-v3.patch 0003-Report-streaming-errors-back-to-repair-v3.patch 0002-Register-in-gossip-to-handle-node-failures-v3.patch 0001-Put-repair-session-on-a-Stage-and-add-a-method-to-re-v3.patch Attaching v3 rebased (on 0.8). bq. Since we're not trying to control throughput or monitor sessions, could we just use Stage.MISC? The thing is that repair sessions are very long lived. And MISC is single threaded. So that would block other tasks that are not supposed to block. We could make MISC multi-threaded but even then it's not a good idea to mix short lived and long lived tasks on the same stage. bq. I think RepairSession.exception needs to be volatile to ensure that the awoken thread sees it Done in v3. bq. Would it be better if RepairSession implemented IEndpointStateChangeSubscriber directly? Good idea, it's slightly simpler, done in v3. bq. The endpoint set needs to be threadsafe, since it will be modified by the endpoint state change thread, and the AE_STAGE thread Done in v3. That will probably change with CASSANDRA-2610 anyway (which I have to update) bq. Should StreamInSession.retries be volatile/atomic? (likely they won't retry quickly enough for it to be a problem, but...) I did not change that, but if it's a problem for retries to not be volatile, I suspect having StreamInSession.current not volatile is also a problem. But really I'd be curious to see that be a problem. bq. Playing devil's advocate: would sending a half-built tree in case of failure still be useful? I don't think it is. 
Or more precisely, if you do send a half-built tree, you'll have to be careful that the other node doesn't consider what's missing as ranges not being in sync (I don't think people will be happy with tons of data being streamed just because we happen to have a bug that makes compaction throw an exception during the validation). So I think you cannot do much with a half-built tree, and it will add complication, for a case where people will need to restart a repair anyway once whatever happened is fixed. bq. success might need to be volatile as well Done in v3. Failed Streams Break Repair --- Key: CASSANDRA-2433 URL: https://issues.apache.org/jira/browse/CASSANDRA-2433 Project: Cassandra Issue Type: Bug Components: Core Reporter: Benjamin Coverston Assignee: Sylvain Lebresne Labels: repair Fix For: 0.8.1 Attachments: 0001-Put-repair-session-on-a-Stage-and-add-a-method-to-re-v2.patch, 0001-Put-repair-session-on-a-Stage-and-add-a-method-to-re-v3.patch, 0001-Put-repair-session-on-a-Stage-and-add-a-method-to-re.patch, 0002-Register-in-gossip-to-handle-node-failures-v2.patch, 0002-Register-in-gossip-to-handle-node-failures-v3.patch, 0002-Register-in-gossip-to-handle-node-failures.patch, 0003-Report-streaming-errors-back-to-repair-v2.patch, 0003-Report-streaming-errors-back-to-repair-v3.patch, 0003-Report-streaming-errors-back-to-repair.patch, 0004-Reports-validation-compaction-errors-back-to-repair-v2.patch, 0004-Reports-validation-compaction-errors-back-to-repair-v3.patch, 0004-Reports-validation-compaction-errors-back-to-repair.patch Running repair in cases where a stream fails we are seeing multiple problems. 1. Although retry is initiated and completes, the old stream doesn't seem to clean itself up and repair hangs. 2. The temp files are left behind and multiple failures can end up filling up the data partition. These issues together are making repair very difficult for nearly everyone running repair on a non-trivial sized data set. 
This issue is also being worked on w.r.t CASSANDRA-2088, however that was moved to 0.8 for a few reasons. This ticket is to fix the immediate issues that we are seeing in 0.7. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
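Two of the review points in the comment above (a thread-safe endpoint set and a volatile failure field) can be sketched with standard java.util.concurrent tools. This is a hedged illustration, not the v3 patch: `RepairSessionSketch`, `onEndpointFailure` and the endpoint strings are hypothetical.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical session object touched by two threads: a failure-detection
// thread and the thread that runs the session.
final class RepairSessionSketch {
    // Concurrent set: safe to modify from one thread while another reads it.
    private final Set<String> endpoints = ConcurrentHashMap.newKeySet();
    // volatile: a write by one thread is visible to later reads on other
    // threads without taking a lock.
    private volatile Exception exception;

    RepairSessionSketch(String... initialEndpoints) {
        for (String endpoint : initialEndpoints)
            endpoints.add(endpoint);
    }

    /** Called from the failure-detection thread. */
    void onEndpointFailure(String endpoint) {
        if (endpoints.remove(endpoint))
            exception = new RuntimeException("Endpoint " + endpoint + " died");
    }

    /** Called from the thread running the session. */
    Exception failure() {
        return exception;
    }

    int liveEndpoints() {
        return endpoints.size();
    }

    /** Simulates a failure notification arriving on another thread. */
    static String demo() {
        RepairSessionSketch session = new RepairSessionSketch("10.0.0.1", "10.0.0.2");
        Thread gossip = new Thread(() -> session.onEndpointFailure("10.0.0.2"));
        gossip.start();
        try {
            gossip.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return session.liveEndpoints() + ":" + (session.failure() != null);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Whether the real code needs both mechanisms depends on how the waiting thread is woken up, which this sketch does not try to reproduce.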
[jira] [Created] (CASSANDRA-2759) Scrub could lose increments and replicate that loss
Scrub could lose increments and replicate that loss --- Key: CASSANDRA-2759 URL: https://issues.apache.org/jira/browse/CASSANDRA-2759 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.8.0 Reporter: Sylvain Lebresne Fix For: 0.8.1 If scrub cannot 'repair' a corrupted row, it will skip it. On node A, if the row contains some sub-count for A id, those will be lost forever since A is the source of truth on its current id. We should thus renew node A id when that happens to avoid this (not unlike we do in cleanup). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (CASSANDRA-2759) Scrub could lose increments and replicate that loss
[ https://issues.apache.org/jira/browse/CASSANDRA-2759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne reassigned CASSANDRA-2759: --- Assignee: Sylvain Lebresne Scrub could lose increments and replicate that loss --- Key: CASSANDRA-2759 URL: https://issues.apache.org/jira/browse/CASSANDRA-2759 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.8.0 Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Labels: counters Fix For: 0.8.1 If scrub cannot 'repair' a corrupted row, it will skip it. On node A, if the row contains some sub-count for A id, those will be lost forever since A is the source of truth on its current id. We should thus renew node A id when that happens to avoid this (not unlike we do in cleanup). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-2759) Scrub could lose increments and replicate that loss
[ https://issues.apache.org/jira/browse/CASSANDRA-2759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-2759: Attachment: 0001-Renew-nodeId-in-scrub-when-skipping-rows.patch Attached patch against 0.8. The patch also adds a new startup option to renew the node id on startup. This could be useful if someone loses one of their sstables (because of a bad disk for instance) and doesn't want to fully decommission that node. This could arguably be split into another ticket though. Scrub could lose increments and replicate that loss --- Key: CASSANDRA-2759 URL: https://issues.apache.org/jira/browse/CASSANDRA-2759 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.8.0 Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Labels: counters Fix For: 0.8.1 Attachments: 0001-Renew-nodeId-in-scrub-when-skipping-rows.patch If scrub cannot 'repair' a corrupted row, it will skip it. On node A, if the row contains some sub-count for A id, those will be lost forever since A is the source of truth on its current id. We should thus renew node A id when that happens to avoid this (not unlike we do in cleanup). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2759) Scrub could lose increments and replicate that loss
[ https://issues.apache.org/jira/browse/CASSANDRA-2759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13047285#comment-13047285 ] Sylvain Lebresne commented on CASSANDRA-2759: - It's picking a new UUID for the current node to use for new counter increments. The problem is that on a given node we store deltas for its current nodeId (to avoid a synchronized read-before-write, but I'm starting to wonder if that was the smartest idea ever). Anyway, if scrub skips a row, it may skip some of those deltas. Let's say at first there are no increments coming for this row with A as 'first distinguished replica'. So far we are still kind of good, because on a read (with CL ONE) the result coming from A will have a 'version' for its own sub-count smaller than the one on the other replicas, so we will use the sub-count from those replicas and return the correct value. However, as soon as A acknowledges new increments for this row, it will start inserting new deltas while it is not intrinsically up to date, which will result in a definitive undercount. The goal of renewing the node id of A is to make sure that second part never happens (because after the renewal A will add new deltas as A', not A anymore). Anyway, now that I've engaged my brain, this patch doesn't really work because A's now inconsistent value will never be repaired by the other nodes. So I have no clue how to actually fix that. Scrub could lose increments and replicate that loss --- Key: CASSANDRA-2759 URL: https://issues.apache.org/jira/browse/CASSANDRA-2759 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.8.0 Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Labels: counters Fix For: 0.8.1 Attachments: 0001-Renew-nodeId-in-scrub-when-skipping-rows.patch If scrub cannot 'repair' a corrupted row, it will skip it. On node A, if the row contains some sub-count for A id, those will be lost forever since A is the source of truth on its current id. 
We should thus renew node A id when that happens to avoid this (not unlike we do in cleanup). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2759) Scrub could lose increments and replicate that loss
[ https://issues.apache.org/jira/browse/CASSANDRA-2759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13047306#comment-13047306 ] Sylvain Lebresne commented on CASSANDRA-2759: - It may be that the best short-term fix here is to make scrub *not* skip rows on counter column families (though CASSANDRA-2614 would change that to 'never ever skip a row') and just throw a RuntimeException instead. Scrub could lose increments and replicate that loss --- Key: CASSANDRA-2759 URL: https://issues.apache.org/jira/browse/CASSANDRA-2759 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.8.0 Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Labels: counters Fix For: 0.8.1 Attachments: 0001-Renew-nodeId-in-scrub-when-skipping-rows.patch If scrub cannot 'repair' a corrupted row, it will skip it. On node A, if the row contains some sub-count for A's id, those will be lost forever since A is the source of truth on its current id. We should thus renew node A's id when that happens to avoid this (not unlike what we do in cleanup).
[jira] [Updated] (CASSANDRA-2759) Scrub could lose increments and replicate that loss
[ https://issues.apache.org/jira/browse/CASSANDRA-2759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-2759: Attachment: 0001-Don-t-skip-rows-on-scrub-for-counter-CFs.patch Attaching a patch that simply re-throws the exception instead of skipping the row for counter column families. bq. Only if you actually did have a counter in the column_metadata, right? Right. Scrub could lose increments and replicate that loss --- Key: CASSANDRA-2759 URL: https://issues.apache.org/jira/browse/CASSANDRA-2759 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.8.0 Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Labels: counters Fix For: 0.8.1 Attachments: 0001-Don-t-skip-rows-on-scrub-for-counter-CFs.patch, 0001-Renew-nodeId-in-scrub-when-skipping-rows.patch If scrub cannot 'repair' a corrupted row, it will skip it. On node A, if the row contains some sub-count for A's id, those will be lost forever since A is the source of truth on its current id. We should thus renew node A's id when that happens to avoid this (not unlike what we do in cleanup).
[jira] [Commented] (CASSANDRA-2641) AbstractBounds.normalize should deal with overlapping ranges
[ https://issues.apache.org/jira/browse/CASSANDRA-2641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049042#comment-13049042 ] Sylvain Lebresne commented on CASSANDRA-2641: - bq. it overlaps quite a bit with StorageProxy.getRestrictedRanges: is there anything there that can be reused? getRestrictedRanges splits a range at different tokens. This patch is about merging overlapping ranges as part of normalize. Not sure I follow what could be reused here. And in any case, I'm in favor of not refactoring anything that is not necessary for this patch. This is not worth it. AbstractBounds.normalize should deal with overlapping ranges Key: CASSANDRA-2641 URL: https://issues.apache.org/jira/browse/CASSANDRA-2641 Project: Cassandra Issue Type: Test Components: Core Reporter: Stu Hood Assignee: Stu Hood Priority: Minor Fix For: 0.8.1 Attachments: 0001-Assert-non-overlapping-ranges-in-normalize.txt, 0001-Make-normalize-deoverlap-ranges.patch, 0002-Don-t-use-overlapping-ranges-in-tests.txt Apparently no consumers have encountered it in production, but AbstractBounds.normalize does not handle overlapping ranges. If given overlapping ranges, the output will be sorted but still overlapping, for which SSTableReader.getPositionsForRanges will choose ranges in an SSTable that may overlap. We should either add an assert in normalize(), or in getPositionsForRanges() to ensure that this never bites us in production.
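Editorial sketch: the "merge overlapping ranges as part of normalize" idea discussed above can be illustrated roughly as follows. This is not the attached patch. Span and NormalizeSketch are hypothetical names, and the sketch assumes plain non-wrapping (left, right] ranges over long values, whereas real Cassandra ranges are over tokens and may wrap around the ring.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class NormalizeSketch {
    // Hypothetical stand-in for a non-wrapping (left, right] token range.
    static final class Span {
        final long left;
        final long right;
        Span(long left, long right) { this.left = left; this.right = right; }
        @Override public String toString() { return "(" + left + "," + right + "]"; }
    }

    // Sort by left bound, then fold each span into the previous one
    // whenever it starts at or before the previous span's right bound.
    static List<Span> normalize(List<Span> input) {
        List<Span> sorted = new ArrayList<>(input);
        sorted.sort(Comparator.comparingLong((Span s) -> s.left));
        List<Span> merged = new ArrayList<>();
        for (Span s : sorted) {
            int last = merged.size() - 1;
            if (last >= 0 && s.left <= merged.get(last).right) {
                Span prev = merged.get(last);
                merged.set(last, new Span(prev.left, Math.max(prev.right, s.right)));
            } else {
                merged.add(s);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Span> in = List.of(new Span(5, 9), new Span(0, 6), new Span(20, 30));
        System.out.println(normalize(in));
    }
}
```

With the input above, the first two spans overlap and collapse into one, while the third is left alone. Handling wrapping ranges would need an extra unwrapping step that this sketch leaves out.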
[jira] [Resolved] (CASSANDRA-2752) repair fails with java.io.EOFException
[ https://issues.apache.org/jira/browse/CASSANDRA-2752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne resolved CASSANDRA-2752. - Resolution: Fixed Reviewer: slebresne Assignee: Jonathan Ellis (was: Terje Marthinussen) repair fails with java.io.EOFException -- Key: CASSANDRA-2752 URL: https://issues.apache.org/jira/browse/CASSANDRA-2752 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.8.0 Reporter: Terje Marthinussen Assignee: Jonathan Ellis Priority: Critical Fix For: 0.8.1 Attachments: 2752.txt Issuing repair on node 1 (1.10.42.81) in a cluster quickly fails with INFO [AntiEntropyStage:1] 2011-06-09 19:02:47,999 AntiEntropyService.java (line 234) Queueing comparison #Differencer #TreeRequest manual-repair-0c17c5f9-583f-4a31-a6d4-a9e7306fb46e, /1 .10.42.82, (JP,XXX), (Token(bytes[6e]),Token(bytes[313039])] INFO [AntiEntropyStage:1] 2011-06-09 19:02:48,026 AntiEntropyService.java (line 468) Endpoints somewhere/1.10.42.81 and /1.10.42.82 have 2 range(s) out of sync for (JP,XXX) on (Token(bytes[6e]),Token(bytes[313039])] INFO [AntiEntropyStage:1] 2011-06-09 19:02:48,026 AntiEntropyService.java (line 485) Performing streaming repair of 2 ranges for #TreeRequest manual-repair-0c17c5f9-583f-4a31-a6d4-a9e7306 fb46e, /1.10.42.82, (JP,XXX), (Token(bytes[6e]),Token(bytes[313039])] INFO [AntiEntropyStage:1] 2011-06-09 19:02:48,030 StreamOut.java (line 173) Stream context metadata [/data/cassandra/node0/data/JP/XXX-g-3-Data.db sections=1 progress=0/36592 - 0%], 1 sstables. 
INFO [AntiEntropyStage:1] 2011-06-09 19:02:48,031 StreamOutSession.java (line 174) Streaming to /1.10.42.82 ERROR [CompactionExecutor:9] 2011-06-09 19:02:48,970 AbstractCassandraDaemon.java (line 113) Fatal exception in thread Thread[CompactionExecutor:9,1,main] java.io.EOFException at java.io.RandomAccessFile.readInt(RandomAccessFile.java:725) at org.apache.cassandra.io.sstable.SSTableWriter$RowIndexer.doIndexing(SSTableWriter.java:457) at org.apache.cassandra.io.sstable.SSTableWriter$RowIndexer.index(SSTableWriter.java:364) at org.apache.cassandra.io.sstable.SSTableWriter$Builder.build(SSTableWriter.java:315) at org.apache.cassandra.db.CompactionManager$9.call(CompactionManager.java:1099) at org.apache.cassandra.db.CompactionManager$9.call(CompactionManager.java:1090) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) On .82 ERROR [CompactionExecutor:12] 2011-06-09 19:02:48,051 AbstractCassandraDaemon.java (line 113) Fatal exception in thread Thread[CompactionExecutor:12,1,main] java.io.EOFException at java.io.RandomAccessFile.readInt(RandomAccessFile.java:725) at org.apache.cassandra.io.sstable.SSTableWriter$RowIndexer.doIndexing(SSTableWriter.java:457) at org.apache.cassandra.io.sstable.SSTableWriter$RowIndexer.index(SSTableWriter.java:364) at org.apache.cassandra.io.sstable.SSTableWriter$Builder.build(SSTableWriter.java:315) at org.apache.cassandra.db.CompactionManager$9.call(CompactionManager.java:1099) at org.apache.cassandra.db.CompactionManager$9.call(CompactionManager.java:1090) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) ERROR [Thread-132] 2011-06-09 19:02:48,051 AbstractCassandraDaemon.java (line 113) Fatal exception in thread Thread[Thread-132,5,main] java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.io.EOFException at org.apache.cassandra.streaming.StreamInSession.closeIfFinished(StreamInSession.java:152) at org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:63) at org.apache.cassandra.net.IncomingTcpConnection.stream(IncomingTcpConnection.java:155) at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:93) Caused by: java.util.concurrent.ExecutionException: java.io.EOFException at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222) at
[jira] [Commented] (CASSANDRA-2752) repair fails with java.io.EOFException
[ https://issues.apache.org/jira/browse/CASSANDRA-2752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049045#comment-13049045 ] Sylvain Lebresne commented on CASSANDRA-2752: - Good catch. +1 (committed). Thanks Terje. repair fails with java.io.EOFException -- Key: CASSANDRA-2752 URL: https://issues.apache.org/jira/browse/CASSANDRA-2752 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.8.0 Reporter: Terje Marthinussen Assignee: Terje Marthinussen Priority: Critical Fix For: 0.8.1 Attachments: 2752.txt
[jira] [Created] (CASSANDRA-2767) ConcurrentModificationException in AntiEntropyService.getNeighbors()
ConcurrentModificationException in AntiEntropyService.getNeighbors() Key: CASSANDRA-2767 URL: https://issues.apache.org/jira/browse/CASSANDRA-2767 Project: Cassandra Issue Type: Bug Affects Versions: 0.8.0 Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Fix For: 0.8.1
[jira] [Updated] (CASSANDRA-2767) ConcurrentModificationException in AntiEntropyService.getNeighbors()
[ https://issues.apache.org/jira/browse/CASSANDRA-2767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-2767: Attachment: 0001-Fix-ConcurrentModificationException.patch ConcurrentModificationException in AntiEntropyService.getNeighbors() Key: CASSANDRA-2767 URL: https://issues.apache.org/jira/browse/CASSANDRA-2767 Project: Cassandra Issue Type: Bug Affects Versions: 0.8.0 Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Labels: repair Fix For: 0.8.1 Attachments: 0001-Fix-ConcurrentModificationException.patch Original Estimate: 1h Remaining Estimate: 1h
[jira] [Updated] (CASSANDRA-2767) ConcurrentModificationException in AntiEntropyService.getNeighbors()
[ https://issues.apache.org/jira/browse/CASSANDRA-2767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-2767: Attachment: (was: 0001-Fix-ConcurrentModificationException.patch) ConcurrentModificationException in AntiEntropyService.getNeighbors() Key: CASSANDRA-2767 URL: https://issues.apache.org/jira/browse/CASSANDRA-2767 Project: Cassandra Issue Type: Bug Affects Versions: 0.8.0 Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Labels: repair Fix For: 0.8.1 Attachments: 0001-Fix-ConcurrentModificationException.patch Original Estimate: 1h Remaining Estimate: 1h
[jira] [Updated] (CASSANDRA-2767) ConcurrentModificationException in AntiEntropyService.getNeighbors()
[ https://issues.apache.org/jira/browse/CASSANDRA-2767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-2767: Attachment: 0001-Fix-ConcurrentModificationException.patch ConcurrentModificationException in AntiEntropyService.getNeighbors() Key: CASSANDRA-2767 URL: https://issues.apache.org/jira/browse/CASSANDRA-2767 Project: Cassandra Issue Type: Bug Affects Versions: 0.8.0 Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Labels: repair Fix For: 0.8.1 Attachments: 0001-Fix-ConcurrentModificationException.patch Original Estimate: 1h Remaining Estimate: 1h
[jira] [Commented] (CASSANDRA-2768) AntiEntropyService excluding nodes that are on version 0.7 or sooner
[ https://issues.apache.org/jira/browse/CASSANDRA-2768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049078#comment-13049078 ] Sylvain Lebresne commented on CASSANDRA-2768: - The important point here is that this is not a repair-specific problem per se. The relevant part of the log is the 'Excluding ...' message. It is triggered by the following code in AES.getNeighbors: {noformat} if (Gossiper.instance.getVersion(endpoint) <= MessagingService.VERSION_07) { logger.info("Excluding " + endpoint + " from repair because it is on version 0.7 or sooner. You should consider updating this node before running repair again."); neighbors.remove(endpoint); } {noformat} Since Sasha has reportedly verified that all nodes report being on 0.8.0, this suggests a Gossiper bug that reports the wrong version (even after node restarts). The exception itself has been fixed in CASSANDRA-2767 and should not be the focus of attention here. AntiEntropyService excluding nodes that are on version 0.7 or sooner Key: CASSANDRA-2768 URL: https://issues.apache.org/jira/browse/CASSANDRA-2768 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.8.0 Environment: 4 node environment -- Originally 0.7.6-2 with a Keyspace defined with RF=3 Upgraded all nodes ( 1 at a time ) to version 0.8.0: For each node, the node was shut down, new version was turned on, using the existing data files / directories and a nodetool repair was run. Reporter: Sasha Dolgy Assignee: Sylvain Lebresne When I run nodetool repair on any of the nodes, the /var/log/cassandra/system.log reports errors similar to: INFO [manual-repair-1c6b33bc-ef14-4ec8-94f6-f1464ec8bdec] 2011-06-13 21:28:39,877 AntiEntropyService.java (line 177) Excluding /10.128.34.18 from repair because it is on version 0.7 or sooner. You should consider updating this node before running repair again.
ERROR [manual-repair-1c6b33bc-ef14-4ec8-94f6-f1464ec8bdec] 2011-06-13 21:28:39,877 AbstractCassandraDaemon.java (line 113) Fatal exception in thread Thread[manual-repair-1c6b33bc-ef14-4ec8-94f6-f1464ec8bdec,5,RMI Runtime] java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793) at java.util.HashMap$KeyIterator.next(HashMap.java:828) at org.apache.cassandra.service.AntiEntropyService.getNeighbors(AntiEntropyService.java:173) at org.apache.cassandra.service.AntiEntropyService$RepairSession.run(AntiEntropyService.java:776) The INFO message and subsequent ERROR message are logged for 2 nodes .. I suspect that this is because RF=3. nodetool ring shows that all nodes are up. Client connections (read / write) are not having issues.. nodetool version on all nodes shows that each node is 0.8.0 At suggestion of some contributors, I have restarted each node and tried to run a nodetool repair again ... the result is the same with the messages being logged. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
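Editorial sketch: the stack trace above shows a HashMap-backed key iterator failing inside getNeighbors(), which is the usual symptom of removing an element from a collection while a for-each loop is iterating over it. One common way to avoid this, shown here under stated assumptions and not taken from the CASSANDRA-2767 patch, is to remove through the iterator itself. NeighborFilterSketch, filterNeighbors, the version map, and the numeric value given to VERSION_07 are all hypothetical.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

final class NeighborFilterSketch {
    // Hypothetical value, for illustration only.
    static final int VERSION_07 = 1;

    // Returns a copy of the candidates without the endpoints whose known
    // version is VERSION_07 or older. Endpoints with no known version are
    // treated as old, which is a choice made for this sketch only.
    static Set<String> filterNeighbors(Set<String> candidates, Map<String, Integer> versions) {
        Set<String> neighbors = new HashSet<>(candidates);
        for (Iterator<String> it = neighbors.iterator(); it.hasNext(); ) {
            String endpoint = it.next();
            if (versions.getOrDefault(endpoint, VERSION_07) <= VERSION_07) {
                // Removing through the iterator keeps it consistent with the set.
                it.remove();
            }
        }
        return neighbors;
    }

    public static void main(String[] args) {
        Set<String> all = Set.of("10.128.34.17", "10.128.34.18", "10.128.34.19");
        Map<String, Integer> versions = Map.of("10.128.34.17", 2, "10.128.34.18", 1, "10.128.34.19", 2);
        System.out.println(filterNeighbors(all, versions));
    }
}
```

Calling neighbors.remove(endpoint) directly inside a for-each loop over neighbors is what typically triggers the exception; iterating over a copy of the set is another workable option.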
[jira] [Updated] (CASSANDRA-2758) nodetool repair never finishes. Loops forever through merkle trees?
[ https://issues.apache.org/jira/browse/CASSANDRA-2758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-2758: Attachment: 0001-Fix-MerkleTree.init-to-not-create-non-sensical-trees.patch MerkleTree.init(), which is used to create the merkle tree in case there is no data, was creating a nonsensical tree by stopping its iteration too late. Attaching a patch to fix this (and lowering the priority to Minor because it has very little chance of hitting anyone in a real-life situation). nodetool repair never finishes. Loops forever through merkle trees? --- Key: CASSANDRA-2758 URL: https://issues.apache.org/jira/browse/CASSANDRA-2758 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.8.0 Reporter: Terje Marthinussen Assignee: Sylvain Lebresne Fix For: 0.8.1 Attachments: 0001-Fix-MerkleTree.init-to-not-create-non-sensical-trees.patch I am not sure all steps here are needed, but as part of testing something else, I set up node1: initial_token: 1 node2: initial_token: 5 Then: {noformat} create keyspace myks with placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy' with strategy_options = [{ replication_factor:2 }]; use myks; create column family test with comparator = AsciiType and column_metadata=[ {column_name: 'up_', validation_class: LongType, index_type: 0}, {column_name: 'del_', validation_class: LongType, index_type: 0} ] and keys_cached = 10 and rows_cached = 1 and min_compaction_threshold = 2; quit; {noformat} Doing nodetool repair after this gets both nodes busy looping forever. A quick look at one node in eclipse makes me guess it's having fun spinning through merkle trees, but I have to admit I have not looked at it for a long time.
[jira] [Updated] (CASSANDRA-2758) nodetool repair never finishes. Loops forever through merkle trees?
[ https://issues.apache.org/jira/browse/CASSANDRA-2758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-2758: Priority: Minor (was: Major) nodetool repair never finishes. Loops forever through merkle trees? --- Key: CASSANDRA-2758 URL: https://issues.apache.org/jira/browse/CASSANDRA-2758 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.8.0 Reporter: Terje Marthinussen Assignee: Sylvain Lebresne Priority: Minor Fix For: 0.8.1 Attachments: 0001-Fix-MerkleTree.init-to-not-create-non-sensical-trees.patch I am not sure all steps here are needed, but as part of testing something else, I set up node1: initial_token: 1 node2: initial_token: 5 Then: {noformat} create keyspace myks with placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy' with strategy_options = [{ replication_factor:2 }]; use myks; create column family test with comparator = AsciiType and column_metadata=[ {column_name: 'up_', validation_class: LongType, index_type: 0}, {column_name: 'del_', validation_class: LongType, index_type: 0} ] and keys_cached = 10 and rows_cached = 1 and min_compaction_threshold = 2; quit; {noformat} Doing nodetool repair after this gets both nodes busy looping forever. A quick look at one node in eclipse makes me guess it's having fun spinning through merkle trees, but I have to admit I have not looked at it for a long time.
[jira] [Commented] (CASSANDRA-2679) Move some column creation logic into Column factory functions
[ https://issues.apache.org/jira/browse/CASSANDRA-2679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049118#comment-13049118 ] Sylvain Lebresne commented on CASSANDRA-2679: - lgtm, but can we rename the 'get' methods to 'create' (or 'make'), as this better suggests what those methods do? Move some column creation logic into Column factory functions - Key: CASSANDRA-2679 URL: https://issues.apache.org/jira/browse/CASSANDRA-2679 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Stu Hood Assignee: Stu Hood Priority: Minor Fix For: 1.0 Attachments: 0001-CASSANDRA-2679-Move-Deleted-Expiring-switch-and-contex.txt Expiring and Counter columns have extra creation logic that is better encapsulated when implemented inside a factory function.
[jira] [Commented] (CASSANDRA-2766) ConcurrentModificationException during node recovery
[ https://issues.apache.org/jira/browse/CASSANDRA-2766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049191#comment-13049191 ] Sylvain Lebresne commented on CASSANDRA-2766: - lgtm +1 on v2 ConcurrentModificationException during node recovery Key: CASSANDRA-2766 URL: https://issues.apache.org/jira/browse/CASSANDRA-2766 Project: Cassandra Issue Type: Bug Components: Core Reporter: Terje Marthinussen Assignee: Jonathan Ellis Attachments: 2766-v2.txt, 2766.txt Testing some node recovery operations. In this case: 1. Data is being added/updated as it would be in production 2. repair is running on other nodes in the cluster 3. we wiped data on this node and started up again, but before repair was actually started on this node (but it had gotten data through the regular data feed) we got this error. I see no indication in the logs that outgoing streams have been started, but the node has finished one incoming stream before this (I guess from some other node doing repair).
INFO [CompactionExecutor:11] 2011-06-14 14:15:09,078 SSTableReader.java (line 155) Opening /data/cassandra/node1/data/JP/test-g-8 INFO [CompactionExecutor:13] 2011-06-14 14:15:09,079 SSTableReader.java (line 155) Opening /data/cassandra/node1/data/JP/test-g-10 INFO [HintedHandoff:1] 2011-06-14 14:15:26,623 HintedHandOffManager.java (line 302) Started hinted handoff for endpoint /1.10.42.216 INFO [HintedHandoff:1] 2011-06-14 14:15:26,623 HintedHandOffManager.java (line 358) Finished hinted handoff of 0 rows to endpoint /1.10.42.216 INFO [CompactionExecutor:9] 2011-06-14 14:15:29,417 SSTableReader.java (line 155) Opening /data/cassandra/node1/data/JP/Datetest-g-2 ERROR [Thread-84] 2011-06-14 14:15:36,755 AbstractCassandraDaemon.java (line 113) Fatal exception in thread Thread[Thread-84,5,main] java.util.ConcurrentModificationException at java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372) at java.util.AbstractList$Itr.next(AbstractList.java:343) at org.apache.cassandra.streaming.StreamInSession.closeIfFinished(StreamInSession.java:132) at org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:63) at org.apache.cassandra.net.IncomingTcpConnection.stream(IncomingTcpConnection.java:155) at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:93) ERROR [Thread-79] 2011-06-14 14:15:36,755 AbstractCassandraDaemon.java (line 113) Fatal exception in thread Thread[Thread-79,5,main] java.util.ConcurrentModificationException at java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372) at java.util.AbstractList$Itr.next(AbstractList.java:343) at org.apache.cassandra.streaming.StreamInSession.closeIfFinished(StreamInSession.java:132) at org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:63) at org.apache.cassandra.net.IncomingTcpConnection.stream(IncomingTcpConnection.java:155) at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:93) 
ERROR [Thread-83] 2011-06-14 14:15:36,755 AbstractCassandraDaemon.java (line 113) Fatal exception in thread Thread[Thread-83,5,main] java.util.ConcurrentModificationException at java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372) at java.util.AbstractList$Itr.next(AbstractList.java:343) at org.apache.cassandra.streaming.StreamInSession.closeIfFinished(StreamInSession.java:132) at org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:63) at org.apache.cassandra.net.IncomingTcpConnection.stream(IncomingTcpConnection.java:155) at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:93) ERROR [Thread-85] 2011-06-14 14:15:36,755 AbstractCassandraDaemon.java (line 113) Fatal exception in thread Thread[Thread-85,5,main] java.util.ConcurrentModificationException at java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372) at java.util.AbstractList$Itr.next(AbstractList.java:343) at org.apache.cassandra.streaming.StreamInSession.closeIfFinished(StreamInSession.java:132) at org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:63) at org.apache.cassandra.net.IncomingTcpConnection.stream(IncomingTcpConnection.java:155) at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:93) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2614) create Column and CounterColumn in the same column family
[ https://issues.apache.org/jira/browse/CASSANDRA-2614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049308#comment-13049308 ] Sylvain Lebresne commented on CASSANDRA-2614: - Turns out there is a tiny hiccup. For a Deletion (inside a thrift Mutation), the timestamp is required to be set if the CF is a regular one, but not if it is a counter CF. But more importantly, for a counter CF, the timestamp should be a server-generated timestamp. If we allow rows to mix counters and regular columns, then when facing a Deletion for a full row, there is no way to accommodate those two requirements. It would be a shame not to do this because of Deletion, but I don't see a good way around it (other than changing the API, which would move this ticket to 1.0). Ideas? create Column and CounterColumn in the same column family - Key: CASSANDRA-2614 URL: https://issues.apache.org/jira/browse/CASSANDRA-2614 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Dave Rav Assignee: Sylvain Lebresne Priority: Minor Fix For: 0.8.1 create Column and CounterColumn in the same column family
[jira] [Commented] (CASSANDRA-2774) one way to make counter delete work better
[ https://issues.apache.org/jira/browse/CASSANDRA-2774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049688#comment-13049688 ] Sylvain Lebresne commented on CASSANDRA-2774: - Consider 3 nodes A, B and C with RF=2 and a given counter c whose replica set is {B, C}. Consider a single client issuing the following operations (in order) while connected to node A: # client increments c by +2 at CL.ONE # client deletes c at CL.ONE # client increments c by +3 at CL.ONE # client reads c at CL.ALL The *only* valid answer the client should ever get on its last read is 3. Any other value is a break of the consistency level contract and not something we can expect people to be happy with. Any other answer means that deletes are broken (and this *is* the problem with the actual implementation). However, because the writes are made at CL.ONE in the example above, at the time the read is issued, the only thing we know for sure is that each write has been received by one node, but not necessarily the same one each time. Depending on the actual timing and on which node happens to be the one acknowledging each write, when the read reaches the nodes you can have a lot of different situations including: * A and B both have received the 3 writes in the right order, they will all return 3, the 'right' answer. * A received the deletion (the two increments are still on the wire yet to be received) and B received the other two increments (the delete is still on the wire yet to be received). A will return the tombstone, B will return 5. You can assign all the epoch numbers you want, there is no way you can return 3 to the client. It will be either 5 or 0. So the same query will result in different answers depending on the internal timing of events, and will sometimes return an answer that is a break of the contract. Removes of counters are broken and the only safe way to use them is for permanent removal with no following inserts. This patch doesn't fix it.
Btw, it's not too hard to come up with the same kind of example using only QUORUM reads and writes (but you'll need one more replica and a few more steps).
one way to make counter delete work better -- Key: CASSANDRA-2774 URL: https://issues.apache.org/jira/browse/CASSANDRA-2774 Project: Cassandra Issue Type: New Feature Affects Versions: 0.8.0 Reporter: Yang Yang Attachments: counter_delete.diff
Currently, counters do not work with deletes, because different merging orders of sstables produce different results. For example, take the operations: add 1, delete, add 2. If the merging happens in the order 1-2, (1,2)--3, the result we see will be 2; if the merging is 1--3, (1,3)--2, the result will be 3. The issue is that a delete currently cannot separate the adds that came before it from the adds that come after it. Conceptually, a delete should create a completely new incarnation of the counter, or a new lifetime, or epoch. The new approach uses the concept of an epoch number, so that each delete bumps up the epoch number. Since each write is replicated (replicate on write is almost always enabled in practice; if this is a concern, we could further force ROW in case of delete), the epoch number is global to a replica set. Changes are attached. Existing tests pass fine; some tests are modified since the semantics changed a bit. Some cql tests do not pass in the original 0.8.0 source; that's not the fault of this change. See details at http://mail-archives.apache.org/mod_mbox/cassandra-user/201106.mbox/%3cbanlktikqcglsnwtt-9hvqpseoo7sf58...@mail.gmail.com%3E The goal of this is to make delete work (at least with consistent behavior; yes, in case of a long network partition the behavior is not ideal, but it's consistent with the definition of a logical clock), so that we could have expiring counters.
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
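The merge-order problem in the issue description above can be reproduced with a small model. This is a sketch under simplifying assumptions, not Cassandra's actual reconciliation code: a counter cell is modeled as a (value, timestamp) pair, a tombstone carries only a timestamp, merging two counter cells sums the values and keeps the larger timestamp, and between a counter cell and a tombstone the newer timestamp wins. The names `Counter`, `Tombstone`, `merge` and `visible_value` are hypothetical.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Counter:
    value: int
    ts: int  # logical timestamp of the newest increment folded into this cell


@dataclass(frozen=True)
class Tombstone:
    ts: int  # logical timestamp of the delete


Cell = Union[Counter, Tombstone]


def merge(a: Cell, b: Cell) -> Cell:
    """Merge two cells as a compaction of two sstables would (simplified model)."""
    if isinstance(a, Counter) and isinstance(b, Counter):
        # Two counter cells: sum the values, keep the newest timestamp.
        return Counter(a.value + b.value, max(a.ts, b.ts))
    # At least one tombstone is involved: the cell with the newer timestamp wins.
    return a if a.ts > b.ts else b


def visible_value(cell: Cell) -> int:
    """What a reader would see: the counter value, or 0 for a tombstone."""
    return cell.value if isinstance(cell, Counter) else 0


add1, delete, add2 = Counter(1, ts=1), Tombstone(ts=2), Counter(2, ts=3)

# Merge 1 and 2 first, then 3: the delete removes "add 1" before "add 2" arrives.
first = visible_value(merge(merge(add1, delete), add2))
# Merge 1 and 3 first, then 2: the merged cell carries ts=3, so the delete is ignored.
second = visible_value(merge(merge(add1, add2), delete))

print(first, second)
```

In this model the two merge orders give 2 and 3 respectively, matching the two outcomes listed in the description.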
[jira] [Updated] (CASSANDRA-2769) Cannot Create Duplicate Compaction Marker
[ https://issues.apache.org/jira/browse/CASSANDRA-2769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-2769: Attachment: 0002-Only-compact-what-has-been-succesfully-marked-as-com.patch 0001-Do-compact-only-smallerSSTables.patch 0001-0.8.0-Remove-useless-unmarkCompacting-in-doCleanup.patch
Alright, there are a bunch of problems, one of which affects 0.8 and trunk and could cause this stack trace. The others are due to CASSANDRA-1610 and thus only affect trunk (but one of those can also result in the attached stack trace). The problem affecting 0.8 and trunk is related to a leftover line in doCleanup() that wrongly unmarks an sstable from the compacting set before it has been removed from the active set of sstables. Thus another compaction could start compacting this sstable and we'll end up marking the file as compacted twice (and we would have duplicated the sstable, which is a problem for counters). Patch 0001-0.8.0-Remove-useless-unmarkCompacting-in-doCleanup.patch removes it and is against 0.8. Trunk has a few problems of its own:
* If disk space is not sufficient to compact all sstables, it computes the smallestSSTables set that fits, but doesn't use it. Attached first patch (0001-Do-compact-only-smallerSSTables.patch) fixes that.
* The CompactionTask logic wrongly decorrelates the set of sstables that are successfully marked from the ones it actually compacts. That is, it grabs a list of sstables it wants to compact, then calls markCompacting on them, but does not check whether all of them were successfully marked and compacts the original list instead. In effect, a task will recompact sstables that are already being compacted by another task, and the given file will be compacted twice (or more) and marked compacted multiple times. Attached patch (0002-Only-compact-what-has-been-succesfully-marked-as-com.patch) fixes this by changing the sstables set of a given CompactionTask to whatever has been successfully marked only.
Since the marking involves updating the task, I've moved the logic to AbstractCompactionTask, where it seems to make more sense to me.
* For some reason, the markCompacting added for CompactionTasks was refusing to mark (and compact) anything if the set of sstables was bigger than MaxCompactionThreshold. This means that as soon as the number of sstables (of the same size) in the column family exceeded the threshold, no compaction would be started. This is not the expected behavior. The second patch also fixes this by reusing the original markCompacting, which handles this correctly.
Cannot Create Duplicate Compaction Marker - Key: CASSANDRA-2769 URL: https://issues.apache.org/jira/browse/CASSANDRA-2769 Project: Cassandra Issue Type: Bug Affects Versions: 0.8.0 Reporter: Benjamin Coverston Assignee: Sylvain Lebresne Fix For: 0.8.1, 1.0 Attachments: 0001-0.8.0-Remove-useless-unmarkCompacting-in-doCleanup.patch, 0001-Do-compact-only-smallerSSTables.patch, 0002-Only-compact-what-has-been-succesfully-marked-as-com.patch
Concurrent compaction can trigger the following exception when two threads compact the same sstable. DataTracker attempts to prevent this but apparently not successfully.
java.io.IOError: java.io.IOException: Unable to create compaction marker
    at org.apache.cassandra.io.sstable.SSTableReader.markCompacted(SSTableReader.java:638)
    at org.apache.cassandra.db.DataTracker.removeOldSSTablesSize(DataTracker.java:321)
    at org.apache.cassandra.db.DataTracker.replace(DataTracker.java:294)
    at org.apache.cassandra.db.DataTracker.replaceCompactedSSTables(DataTracker.java:255)
    at org.apache.cassandra.db.ColumnFamilyStore.replaceCompactedSSTables(ColumnFamilyStore.java:932)
    at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:173)
    at org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:119)
    at org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:102)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:680)
Caused by: java.io.IOException: Unable to create compaction marker
    at org.apache.cassandra.io.sstable.SSTableReader.markCompacted(SSTableReader.java:634)
    ... 12 more
-- This message is automatically generated by JIRA. For more information on JIRA, see:
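The idea behind 0002-Only-compact-what-has-been-succesfully-marked-as-com.patch, as described in the comment above, can be sketched as follows. This is a minimal illustration and not the actual DataTracker or AbstractCompactionTask code: `Tracker`, `mark_compacting`, `unmark_compacting` and this `CompactionTask` class are hypothetical names, and sstables are represented as plain strings.

```python
import threading


class Tracker:
    """Tracks which sstables are currently being compacted (hypothetical stand-in)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._compacting = set()

    def mark_compacting(self, candidates):
        """Atomically mark the candidates that are still free; return only those."""
        with self._lock:
            marked = {s for s in candidates if s not in self._compacting}
            self._compacting |= marked
            return marked

    def unmark_compacting(self, sstables):
        with self._lock:
            self._compacting -= set(sstables)


class CompactionTask:
    def __init__(self, tracker, sstables):
        self.tracker = tracker
        self.sstables = set(sstables)

    def execute(self):
        # Key point: shrink the task to what was actually marked, instead of
        # compacting the original wish list.
        self.sstables = self.tracker.mark_compacting(self.sstables)
        try:
            return sorted(self.sstables)  # stand-in for the real compaction work
        finally:
            self.tracker.unmark_compacting(self.sstables)


tracker = Tracker()
tracker.mark_compacting({"a", "b"})         # another task already compacts a and b
task = CompactionTask(tracker, {"b", "c"})  # this task would like b and c
compacted = task.execute()
print(compacted)
```

Because the task replaces its sstable set with the return value of `mark_compacting`, "b" is never handed to two tasks at once, which is the double-compaction path that leads to the duplicate marker.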
[jira] [Resolved] (CASSANDRA-2641) AbstractBounds.normalize should deal with overlapping ranges
[ https://issues.apache.org/jira/browse/CASSANDRA-2641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne resolved CASSANDRA-2641. - Resolution: Fixed Fix Version/s: (was: 0.8.1) 1.0 Reviewer: stuhood (was: slebresne) Assignee: Sylvain Lebresne (was: Stu Hood)
Committed to 1.0. Since I'm pretty sure we don't generate overlapping ranges so far, it's not worth taking the risk of putting this in 0.8.
AbstractBounds.normalize should deal with overlapping ranges Key: CASSANDRA-2641 URL: https://issues.apache.org/jira/browse/CASSANDRA-2641 Project: Cassandra Issue Type: Test Components: Core Reporter: Stu Hood Assignee: Sylvain Lebresne Priority: Minor Fix For: 1.0 Attachments: 0001-Assert-non-overlapping-ranges-in-normalize.txt, 0001-Make-normalize-deoverlap-ranges.patch, 0002-Don-t-use-overlapping-ranges-in-tests.txt
Apparently no consumers have encountered it in production, but AbstractBounds.normalize does not handle overlapping ranges. If given overlapping ranges, the output will be sorted but still overlapping, so SSTableReader.getPositionsForRanges will choose ranges in an SSTable that may overlap. We should either add an assert in normalize(), or in getPositionsForRanges(), to ensure that this never bites us in production.
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
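What "deoverlapping" in normalize() amounts to can be illustrated with a standard interval merge. This is a sketch only: it assumes non-wrapping ranges represented as (left, right) tuples of integer tokens with left < right, whereas Cassandra's AbstractBounds also has to deal with wrapping ranges. The standalone `normalize` function below is hypothetical and is not the Cassandra method.

```python
def normalize(ranges):
    """Return sorted, non-overlapping ranges covering the same tokens.

    Ranges are (left, right] pairs with left < right. Touching ranges
    such as (0, 5] and (5, 8] are merged as well.
    """
    result = []
    for left, right in sorted(ranges):
        if result and left <= result[-1][1]:
            # Overlaps (or touches) the previous range: extend it.
            prev_left, prev_right = result[-1]
            result[-1] = (prev_left, max(prev_right, right))
        else:
            result.append((left, right))
    return result


merged = normalize([(10, 20), (0, 5), (3, 8), (15, 18)])
print(merged)
```

Sorting alone, which is what the issue says normalize() did, would leave (0, 5) and (3, 8) overlapping; the extra merge step is what removes the overlap.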
[jira] [Updated] (CASSANDRA-2433) Failed Streams Break Repair
[ https://issues.apache.org/jira/browse/CASSANDRA-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-2433: Attachment: (was: 0001-Put-repair-session-on-a-Stage-and-add-a-method-to-re.patch) Failed Streams Break Repair --- Key: CASSANDRA-2433 URL: https://issues.apache.org/jira/browse/CASSANDRA-2433 Project: Cassandra Issue Type: Bug Components: Core Reporter: Benjamin Coverston Assignee: Sylvain Lebresne Labels: repair Fix For: 0.8.1 Running repair in cases where a stream fails we are seeing multiple problems. 1. Although retry is initiated and completes, the old stream doesn't seem to clean itself up and repair hangs. 2. The temp files are left behind and multiple failures can end up filling up the data partition. These issues together are making repair very difficult for nearly everyone running repair on a non-trivial sized data set. This issue is also being worked on w.r.t CASSANDRA-2088, however that was moved to 0.8 for a few reasons. This ticket is to fix the immediate issues that we are seeing in 0.7. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-2433) Failed Streams Break Repair
[ https://issues.apache.org/jira/browse/CASSANDRA-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-2433: Attachment: (was: 0002-Register-in-gossip-to-handle-node-failures-v3.patch) Failed Streams Break Repair --- Key: CASSANDRA-2433 URL: https://issues.apache.org/jira/browse/CASSANDRA-2433 Project: Cassandra Issue Type: Bug Components: Core Reporter: Benjamin Coverston Assignee: Sylvain Lebresne Labels: repair Fix For: 0.8.1 Running repair in cases where a stream fails we are seeing multiple problems. 1. Although retry is initiated and completes, the old stream doesn't seem to clean itself up and repair hangs. 2. The temp files are left behind and multiple failures can end up filling up the data partition. These issues together are making repair very difficult for nearly everyone running repair on a non-trivial sized data set. This issue is also being worked on w.r.t CASSANDRA-2088, however that was moved to 0.8 for a few reasons. This ticket is to fix the immediate issues that we are seeing in 0.7. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-2433) Failed Streams Break Repair
[ https://issues.apache.org/jira/browse/CASSANDRA-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-2433: Attachment: (was: 0002-Register-in-gossip-to-handle-node-failures-v2.patch) Failed Streams Break Repair --- Key: CASSANDRA-2433 URL: https://issues.apache.org/jira/browse/CASSANDRA-2433 Project: Cassandra Issue Type: Bug Components: Core Reporter: Benjamin Coverston Assignee: Sylvain Lebresne Labels: repair Fix For: 0.8.1 Running repair in cases where a stream fails we are seeing multiple problems. 1. Although retry is initiated and completes, the old stream doesn't seem to clean itself up and repair hangs. 2. The temp files are left behind and multiple failures can end up filling up the data partition. These issues together are making repair very difficult for nearly everyone running repair on a non-trivial sized data set. This issue is also being worked on w.r.t CASSANDRA-2088, however that was moved to 0.8 for a few reasons. This ticket is to fix the immediate issues that we are seeing in 0.7. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-2433) Failed Streams Break Repair
[ https://issues.apache.org/jira/browse/CASSANDRA-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-2433: Attachment: (was: 0001-Put-repair-session-on-a-Stage-and-add-a-method-to-re-v3.patch) Failed Streams Break Repair --- Key: CASSANDRA-2433 URL: https://issues.apache.org/jira/browse/CASSANDRA-2433 Project: Cassandra Issue Type: Bug Components: Core Reporter: Benjamin Coverston Assignee: Sylvain Lebresne Labels: repair Fix For: 0.8.1 Running repair in cases where a stream fails we are seeing multiple problems. 1. Although retry is initiated and completes, the old stream doesn't seem to clean itself up and repair hangs. 2. The temp files are left behind and multiple failures can end up filling up the data partition. These issues together are making repair very difficult for nearly everyone running repair on a non-trivial sized data set. This issue is also being worked on w.r.t CASSANDRA-2088, however that was moved to 0.8 for a few reasons. This ticket is to fix the immediate issues that we are seeing in 0.7. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-2433) Failed Streams Break Repair
[ https://issues.apache.org/jira/browse/CASSANDRA-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-2433: Attachment: (was: 0001-Put-repair-session-on-a-Stage-and-add-a-method-to-re-v2.patch) Failed Streams Break Repair --- Key: CASSANDRA-2433 URL: https://issues.apache.org/jira/browse/CASSANDRA-2433 Project: Cassandra Issue Type: Bug Components: Core Reporter: Benjamin Coverston Assignee: Sylvain Lebresne Labels: repair Fix For: 0.8.1 Running repair in cases where a stream fails we are seeing multiple problems. 1. Although retry is initiated and completes, the old stream doesn't seem to clean itself up and repair hangs. 2. The temp files are left behind and multiple failures can end up filling up the data partition. These issues together are making repair very difficult for nearly everyone running repair on a non-trivial sized data set. This issue is also being worked on w.r.t CASSANDRA-2088, however that was moved to 0.8 for a few reasons. This ticket is to fix the immediate issues that we are seeing in 0.7. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-2433) Failed Streams Break Repair
[ https://issues.apache.org/jira/browse/CASSANDRA-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-2433: Attachment: (was: 0002-Register-in-gossip-to-handle-node-failures.patch) Failed Streams Break Repair --- Key: CASSANDRA-2433 URL: https://issues.apache.org/jira/browse/CASSANDRA-2433 Project: Cassandra Issue Type: Bug Components: Core Reporter: Benjamin Coverston Assignee: Sylvain Lebresne Labels: repair Fix For: 0.8.1 Running repair in cases where a stream fails we are seeing multiple problems. 1. Although retry is initiated and completes, the old stream doesn't seem to clean itself up and repair hangs. 2. The temp files are left behind and multiple failures can end up filling up the data partition. These issues together are making repair very difficult for nearly everyone running repair on a non-trivial sized data set. This issue is also being worked on w.r.t CASSANDRA-2088, however that was moved to 0.8 for a few reasons. This ticket is to fix the immediate issues that we are seeing in 0.7. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-2433) Failed Streams Break Repair
[ https://issues.apache.org/jira/browse/CASSANDRA-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-2433: Attachment: (was: 0004-Reports-validation-compaction-errors-back-to-repair-v2.patch) Failed Streams Break Repair --- Key: CASSANDRA-2433 URL: https://issues.apache.org/jira/browse/CASSANDRA-2433 Project: Cassandra Issue Type: Bug Components: Core Reporter: Benjamin Coverston Assignee: Sylvain Lebresne Labels: repair Fix For: 0.8.1 Running repair in cases where a stream fails we are seeing multiple problems. 1. Although retry is initiated and completes, the old stream doesn't seem to clean itself up and repair hangs. 2. The temp files are left behind and multiple failures can end up filling up the data partition. These issues together are making repair very difficult for nearly everyone running repair on a non-trivial sized data set. This issue is also being worked on w.r.t CASSANDRA-2088, however that was moved to 0.8 for a few reasons. This ticket is to fix the immediate issues that we are seeing in 0.7. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-2433) Failed Streams Break Repair
[ https://issues.apache.org/jira/browse/CASSANDRA-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-2433: Attachment: (was: 0004-Reports-validation-compaction-errors-back-to-repair.patch) Failed Streams Break Repair --- Key: CASSANDRA-2433 URL: https://issues.apache.org/jira/browse/CASSANDRA-2433 Project: Cassandra Issue Type: Bug Components: Core Reporter: Benjamin Coverston Assignee: Sylvain Lebresne Labels: repair Fix For: 0.8.1 Running repair in cases where a stream fails we are seeing multiple problems. 1. Although retry is initiated and completes, the old stream doesn't seem to clean itself up and repair hangs. 2. The temp files are left behind and multiple failures can end up filling up the data partition. These issues together are making repair very difficult for nearly everyone running repair on a non-trivial sized data set. This issue is also being worked on w.r.t CASSANDRA-2088, however that was moved to 0.8 for a few reasons. This ticket is to fix the immediate issues that we are seeing in 0.7. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-2433) Failed Streams Break Repair
[ https://issues.apache.org/jira/browse/CASSANDRA-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-2433: Attachment: (was: 0003-Report-streaming-errors-back-to-repair-v2.patch) Failed Streams Break Repair --- Key: CASSANDRA-2433 URL: https://issues.apache.org/jira/browse/CASSANDRA-2433 Project: Cassandra Issue Type: Bug Components: Core Reporter: Benjamin Coverston Assignee: Sylvain Lebresne Labels: repair Fix For: 0.8.1 Running repair in cases where a stream fails we are seeing multiple problems. 1. Although retry is initiated and completes, the old stream doesn't seem to clean itself up and repair hangs. 2. The temp files are left behind and multiple failures can end up filling up the data partition. These issues together are making repair very difficult for nearly everyone running repair on a non-trivial sized data set. This issue is also being worked on w.r.t CASSANDRA-2088, however that was moved to 0.8 for a few reasons. This ticket is to fix the immediate issues that we are seeing in 0.7. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-2433) Failed Streams Break Repair
[ https://issues.apache.org/jira/browse/CASSANDRA-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-2433: Attachment: (was: 0003-Report-streaming-errors-back-to-repair-v3.patch) Failed Streams Break Repair --- Key: CASSANDRA-2433 URL: https://issues.apache.org/jira/browse/CASSANDRA-2433 Project: Cassandra Issue Type: Bug Components: Core Reporter: Benjamin Coverston Assignee: Sylvain Lebresne Labels: repair Fix For: 0.8.1 Running repair in cases where a stream fails we are seeing multiple problems. 1. Although retry is initiated and completes, the old stream doesn't seem to clean itself up and repair hangs. 2. The temp files are left behind and multiple failures can end up filling up the data partition. These issues together are making repair very difficult for nearly everyone running repair on a non-trivial sized data set. This issue is also being worked on w.r.t CASSANDRA-2088, however that was moved to 0.8 for a few reasons. This ticket is to fix the immediate issues that we are seeing in 0.7. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-2433) Failed Streams Break Repair
[ https://issues.apache.org/jira/browse/CASSANDRA-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-2433: Attachment: 0004-Reports-validation-compaction-errors-back-to-repair-v4.patch 0003-Report-streaming-errors-back-to-repair-v4.patch 0002-Register-in-gossip-to-handle-node-failures-v4.patch 0001-Put-repair-session-on-a-Stage-and-add-a-method-to-re-v4.patch
Attaching v4, which is rebased and simply makes the retries variable in StreamInSession volatile after all (I've removed the old versions because it was a mess).
Failed Streams Break Repair --- Key: CASSANDRA-2433 URL: https://issues.apache.org/jira/browse/CASSANDRA-2433 Project: Cassandra Issue Type: Bug Components: Core Reporter: Benjamin Coverston Assignee: Sylvain Lebresne Labels: repair Fix For: 0.8.1 Attachments: 0001-Put-repair-session-on-a-Stage-and-add-a-method-to-re-v4.patch, 0002-Register-in-gossip-to-handle-node-failures-v4.patch, 0003-Report-streaming-errors-back-to-repair-v4.patch, 0004-Reports-validation-compaction-errors-back-to-repair-v4.patch
Running repair in cases where a stream fails we are seeing multiple problems. 1. Although retry is initiated and completes, the old stream doesn't seem to clean itself up and repair hangs. 2. The temp files are left behind and multiple failures can end up filling up the data partition. These issues together are making repair very difficult for nearly everyone running repair on a non-trivial sized data set. This issue is also being worked on w.r.t CASSANDRA-2088, however that was moved to 0.8 for a few reasons. This ticket is to fix the immediate issues that we are seeing in 0.7.
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2405) should expose 'time since last successful repair' for easier aes monitoring
[ https://issues.apache.org/jira/browse/CASSANDRA-2405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13049770#comment-13049770 ] Sylvain Lebresne commented on CASSANDRA-2405: -
The problem with using the completion time as the (Super)Column name is that you have to wait for the end of the repair to store anything. First, this will not capture started but failed sessions (which, while not mandatory, could be nice; especially once we start keeping a bit more info, this could help troubleshooting). Second, it will be a pain to have to keep some of the information until the end (the processingStartedAt is a first sign of this). And third, we may want to keep some info on, say, merkle tree creation on all replicas participating in the repair, even though we only store the completed time on the node initiating the repair. So I would propose something like:
row key: KS/CF
super column name: repair session name (a TimeUUID)
columns: the infos on the session (range, start and end time, number of ranges repaired, bytes transferred, ...)
That is roughly the same thing as you propose but with the super column name being the repair session name. Now, because the repair session names are TimeUUIDs (well, right now it is a string including a UUID, but we can change it to a simple TimeUUID easily), the sessions will be ordered by creation time. So getting the last successful repair is probably not too hard: just grab the last 1000 created sessions and find the last successful one. And if we want, we can even use another specific index row that associates 'completion time' - 'session UUID' (and thanks to the new DynamicCompositeType we can have some rows ordered by TimeUUIDType and some others ordered by LongType without the need for multiple system tables).
should expose 'time since last successful repair' for easier aes monitoring --- Key: CASSANDRA-2405 URL: https://issues.apache.org/jira/browse/CASSANDRA-2405 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Peter Schuller Assignee: Pavel Yaskevich Priority: Minor Fix For: 0.8.1 Attachments: CASSANDRA-2405-v2.patch, CASSANDRA-2405-v3.patch, CASSANDRA-2405.patch The practical implementation issues of actually ensuring repair runs is somewhat of an undocumented/untreated issue. One hopefully low hanging fruit would be to at least expose the time since last successful repair for a particular column family, to make it easier to write a correct script to monitor for lack of repair in a non-buggy fashion. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
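The layout proposed in the comment above, and the "grab the last created sessions and find the last successful one" lookup, can be sketched as below. It is a rough in-memory model, not a Cassandra schema: a dict stands in for the system column family, `record_session` and `last_successful_repair` are hypothetical helpers, the column names are made up for illustration, and Python's `uuid.uuid1` plays the role of the TimeUUID session name.

```python
import uuid

# row key "KS/CF" -> {session TimeUUID -> columns describing the session}
repair_history = {}


def record_session(ks_cf, **columns):
    """Store a session as soon as it starts; columns can be filled in later."""
    # node and clock_seq are arbitrary placeholder values for this example.
    session_id = uuid.uuid1(node=1, clock_seq=0)
    repair_history.setdefault(ks_cf, {})[session_id] = dict(columns)
    return session_id


def last_successful_repair(ks_cf, limit=1000):
    """Scan the `limit` most recently created sessions, newest first."""
    sessions = repair_history.get(ks_cf, {})
    # A version-1 UUID embeds its creation timestamp in the `time` attribute.
    newest_first = sorted(sessions, key=lambda sid: sid.time, reverse=True)
    for sid in newest_first[:limit]:
        if sessions[sid].get("success"):
            return sid
    return None


s1 = record_session("ks1/cf1", range="(0,100]", success=True)
s2 = record_session("ks1/cf1", range="(0,100]", success=True)
s3 = record_session("ks1/cf1", range="(0,100]", success=False)  # started but failed

found = last_successful_repair("ks1/cf1")
print(found == s2)
```

Keying by session name rather than completion time means the failed session s3 is still recorded, while the lookup skips it and returns s2.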
[jira] [Commented] (CASSANDRA-2405) should expose 'time since last successful repair' for easier aes monitoring
[ https://issues.apache.org/jira/browse/CASSANDRA-2405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13049784#comment-13049784 ] Sylvain Lebresne commented on CASSANDRA-2405: -
I'm keen on adding persisted stats for repair for CASSANDRA-2698. Recording the start and end time of repair also amounts to persisting stats on repair. Given that, I don't care too much about what the description of this ticket says, but I'm pretty much opposed to doing anything here that would make CASSANDRA-2698 much harder than it needs to be unless there is a good reason, and I don't see one. I'm happy with making this a duplicate or dependency of CASSANDRA-2698 though.
should expose 'time since last successful repair' for easier aes monitoring --- Key: CASSANDRA-2405 URL: https://issues.apache.org/jira/browse/CASSANDRA-2405 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Peter Schuller Assignee: Pavel Yaskevich Priority: Minor Fix For: 0.8.1 Attachments: CASSANDRA-2405-v2.patch, CASSANDRA-2405-v3.patch, CASSANDRA-2405.patch
The practical implementation issues of actually ensuring repair runs is somewhat of an undocumented/untreated issue. One hopefully low hanging fruit would be to at least expose the time since last successful repair for a particular column family, to make it easier to write a correct script to monitor for lack of repair in a non-buggy fashion.
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2774) one way to make counter delete work better
[ https://issues.apache.org/jira/browse/CASSANDRA-2774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13049834#comment-13049834 ] Sylvain Lebresne commented on CASSANDRA-2774: -
bq. I think with quorum delete you will guarantee timing to be consistent with client and then achieve client expected result in your case. I'd like to hear your counter example
Consider a cluster with RF=3 and counter c replicated on nodes A, B and C. Consider that all operations are done by the same client connected to some other node (it doesn't have to be the same one each time, but it can be). All operations are performed at QUORUM consistency level. The client does the following operations:
# increment c by 1
# delete c
# increment c by 1
# reads c
Because QUORUM is 2, depending on internal timings (latency on the wire and such), either only 2 or all 3 nodes will have seen each write once it is acked to the client. Again, for the same inputs and depending on timing, the client could get a variety of results on the read:
* 1 if each node has received each operation in the order issued.
* 0 or 2, if for instance, by the time the read is issued:
** the first increment only reached B and C
** the deletion only reached A and C
** the second increment only reached A and B
and it happens that the first two nodes answering the read are B and C. The exact value depends on the exact rules for dealing with the epoch number, but in any case, B would only have the two increments and C would have the first increment and the deletion (issued after the increment, so the deletion wins). So B will answer 2 and C will answer a tombstone. Whatever resolution the coordinator does, it just cannot return 1 that time.
one way to make counter delete work better -- Key: CASSANDRA-2774 URL: https://issues.apache.org/jira/browse/CASSANDRA-2774 Project: Cassandra Issue Type: New Feature Affects Versions: 0.8.0 Reporter: Yang Yang Attachments: counter_delete.diff
Currently, counters do not work with deletes, because different merging orders of sstables produce different results. For example, take the operations: add 1, delete, add 2. If the merging happens in the order 1-2, (1,2)--3, the result we see will be 2; if the merging is 1--3, (1,3)--2, the result will be 3. The issue is that a delete currently cannot separate the adds that came before it from the adds that come after it. Conceptually, a delete should create a completely new incarnation of the counter, or a new lifetime, or epoch. The new approach uses the concept of an epoch number, so that each delete bumps up the epoch number. Since each write is replicated (replicate on write is almost always enabled in practice; if this is a concern, we could further force ROW in case of delete), the epoch number is global to a replica set. Changes are attached. Existing tests pass fine; some tests are modified since the semantics changed a bit. Some cql tests do not pass in the original 0.8.0 source; that's not the fault of this change. See details at http://mail-archives.apache.org/mod_mbox/cassandra-user/201106.mbox/%3cbanlktikqcglsnwtt-9hvqpseoo7sf58...@mail.gmail.com%3E The goal of this is to make delete work (at least with consistent behavior; yes, in case of a long network partition the behavior is not ideal, but it's consistent with the definition of a logical clock), so that we could have expiring counters.
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
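The QUORUM counter-example from the comment above can be replayed with a toy model. This sketch assumes each replica simply applies the operations it has received, in the order they were issued, with a delete resetting the local state to a tombstone. It does not model epoch numbers or Cassandra's actual counter code, and `replica_state` is a hypothetical helper.

```python
TOMBSTONE = "tombstone"

# Operations in the order the client issued them (None marks the delete).
OPS = {"inc1": +1, "delete": None, "inc2": +1}

# Which replicas had received each write when the read was issued.
# Every write reached 2 of the 3 replicas, so every QUORUM write was acked.
DELIVERED = {
    "inc1": {"B", "C"},
    "delete": {"A", "C"},
    "inc2": {"A", "B"},
}


def replica_state(replica):
    """Apply, in issue order, the operations this replica has received."""
    state = None  # None means the replica has seen nothing yet
    for op, delta in OPS.items():
        if replica not in DELIVERED[op]:
            continue
        if delta is None:
            state = TOMBSTONE
        else:
            state = delta if state in (None, TOMBSTONE) else state + delta
    return state


answers = {r: replica_state(r) for r in ("A", "B", "C")}
print(answers)
```

If B and C are the two replicas answering the read, the coordinator only sees 2 and a tombstone, so no resolution rule applied to those two responses can produce the expected 1.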
[jira] [Commented] (CASSANDRA-2405) should expose 'time since last successful repair' for easier aes monitoring
[ https://issues.apache.org/jira/browse/CASSANDRA-2405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13049882#comment-13049882 ] Sylvain Lebresne commented on CASSANDRA-2405: -
Hum, the thing is that there will be many repair sessions for a given set of KS/CF and range. So you need one of the keys (either the row key or the super column name) to be the session_id (or anything that is unique to a session). If you use a row for each KS/CF pair and one super column for each session, you will have one super column for each repair made in a session (or kind of; you will indeed have multiple merkle trees for instance, one for each replica, but we can easily prefix the column with the replica name if need be).
should expose 'time since last successful repair' for easier aes monitoring --- Key: CASSANDRA-2405 URL: https://issues.apache.org/jira/browse/CASSANDRA-2405 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Peter Schuller Assignee: Pavel Yaskevich Priority: Minor Fix For: 0.8.1 Attachments: CASSANDRA-2405-v2.patch, CASSANDRA-2405-v3.patch, CASSANDRA-2405.patch
The practical implementation issues of actually ensuring repair runs is somewhat of an undocumented/untreated issue. One hopefully low hanging fruit would be to at least expose the time since last successful repair for a particular column family, to make it easier to write a correct script to monitor for lack of repair in a non-buggy fashion.
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2369) support replication decisions per-key
[ https://issues.apache.org/jira/browse/CASSANDRA-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13049899#comment-13049899 ] Sylvain Lebresne commented on CASSANDRA-2369: - Let me also add that if you allow that, load balancing will be a bitch. One may argue it should be the problem of whomever wants to use this, but I'm not sure that providing tools that make foot-shooting too easy is such a good idea. support replication decisions per-key - Key: CASSANDRA-2369 URL: https://issues.apache.org/jira/browse/CASSANDRA-2369 Project: Cassandra Issue Type: New Feature Reporter: Jonathan Ellis Assignee: Vijay Priority: Minor Fix For: 1.0 Currently the replicationstrategy gets a token and a keyspace with which to decide how to place replicas. for per-row replication this is insufficient because tokenization is lossy (CASSANDRA-1034). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2774) one way to make counter delete work better
[ https://issues.apache.org/jira/browse/CASSANDRA-2774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049912#comment-13049912 ] Sylvain Lebresne commented on CASSANDRA-2774: - bq. but as I stated in my last comment, at least we can be sure that the new approach guarantees some common agreement eventually. It is already the case with the current implementation. Once compaction has compacted the deletes, all nodes will reach common agreement. bq. it would be nice if we achieve the agreement in case of quorum, but that's not my main argument My main argument is that this patch slightly changes the behavior here and there, but I don't think it adds any tangible new guarantee that people can work with. On the other side, it adds a fairly heavy performance hit by adding a read before write on every replica (and though you won't necessarily do a read for every write, you will do that read more often than not as soon as the set of counters you're incrementing is not small enough). one way to make counter delete work better -- Key: CASSANDRA-2774 URL: https://issues.apache.org/jira/browse/CASSANDRA-2774 Project: Cassandra Issue Type: New Feature Affects Versions: 0.8.0 Reporter: Yang Yang Attachments: counter_delete.diff The current Counter does not work with delete, because different merge orders of the sstables produce different results. For example, take three operations: (1) add 1, (2) delete, (3) add 2. If the merging happens in the order 1--2, then (1,2)--3, the result we see will be 2. If the merging is 1--3, then (1,3)--2, the result will be 3. The issue is that a delete currently cannot separate the adds that came before it from the adds that came after it. A delete is supposed to create a completely new incarnation of the counter: a new lifetime, or epoch. The new approach uses the concept of an epoch number, so that each delete bumps up the epoch number. Since each write is replicated (replicate on write is almost always enabled in practice; if this is a concern, we could further force ROW in case of delete), the epoch number is global to a replica set. Changes are attached. Existing tests pass fine; some tests are modified since the semantics changed a bit. Some CQL tests do not pass in the original 0.8.0 source; that's not the fault of this change. See details at http://mail-archives.apache.org/mod_mbox/cassandra-user/201106.mbox/%3cbanlktikqcglsnwtt-9hvqpseoo7sf58...@mail.gmail.com%3E The goal of this is to make delete work (at least with consistent behavior; yes, in case of a long network partition the behavior is not ideal, but it's consistent with the definition of a logical clock), so that we could have expiring Counters.
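The add 1 / delete / add 2 example from the description can be replayed with a small model of the epoch idea. This is only a sketch and not the code in counter_delete.diff: the EpochCount class, its method names, and the merge rule (higher epoch wins, equal epochs sum) are assumptions made for illustration, and it presumes each add is stamped with the epoch that was current when it was written.

```java
/** Hypothetical (epoch, count) pair; a simplified model, not the attached patch. */
final class EpochCount {
    final long epoch;
    final long count;

    EpochCount(long epoch, long count) {
        this.epoch = epoch;
        this.count = count;
    }

    /** An increment stamped with the epoch that was current when it was written. */
    static EpochCount add(long currentEpoch, long delta) {
        return new EpochCount(currentEpoch, delta);
    }

    /** A delete opens the next epoch with an empty count. */
    static EpochCount delete(long currentEpoch) {
        return new EpochCount(currentEpoch + 1, 0);
    }

    /** Higher epoch wins outright; equal epochs sum their counts. */
    EpochCount merge(EpochCount other) {
        if (epoch != other.epoch) {
            return epoch > other.epoch ? this : other;
        }
        return new EpochCount(epoch, count + other.count);
    }
}
```

With add(0, 1), delete(0) and add(1, 2), both merge orders from the description end at epoch 1 with count 2 under this rule. Knowing the current epoch at write time is what requires the read before write discussed in the comment above.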
[jira] [Commented] (CASSANDRA-2769) Cannot Create Duplicate Compaction Marker
[ https://issues.apache.org/jira/browse/CASSANDRA-2769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13049923#comment-13049923 ] Sylvain Lebresne commented on CASSANDRA-2769: - Alright, I've committed the 0.8 patch. I'll have a look at the checks. Cannot Create Duplicate Compaction Marker - Key: CASSANDRA-2769 URL: https://issues.apache.org/jira/browse/CASSANDRA-2769 Project: Cassandra Issue Type: Bug Affects Versions: 0.8.0 Reporter: Benjamin Coverston Assignee: Sylvain Lebresne Fix For: 0.8.1, 1.0 Attachments: 0001-0.8.0-Remove-useless-unmarkCompacting-in-doCleanup.patch, 0001-Do-compact-only-smallerSSTables.patch, 0002-Only-compact-what-has-been-succesfully-marked-as-com.patch Concurrent compaction can trigger the following exception when two threads compact the same sstable. DataTracker attempts to prevent this but apparently not successfully. java.io.IOError: java.io.IOException: Unable to create compaction marker at org.apache.cassandra.io.sstable.SSTableReader.markCompacted(SSTableReader.java:638) at org.apache.cassandra.db.DataTracker.removeOldSSTablesSize(DataTracker.java:321) at org.apache.cassandra.db.DataTracker.replace(DataTracker.java:294) at org.apache.cassandra.db.DataTracker.replaceCompactedSSTables(DataTracker.java:255) at org.apache.cassandra.db.ColumnFamilyStore.replaceCompactedSSTables(ColumnFamilyStore.java:932) at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:173) at org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:119) at org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:102) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at 
java.lang.Thread.run(Thread.java:680) Caused by: java.io.IOException: Unable to create compaction marker at org.apache.cassandra.io.sstable.SSTableReader.markCompacted(SSTableReader.java:634) ... 12 more
[jira] [Commented] (CASSANDRA-2405) should expose 'time since last successful repair' for easier aes monitoring
[ https://issues.apache.org/jira/browse/CASSANDRA-2405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049962#comment-13049962 ] Sylvain Lebresne commented on CASSANDRA-2405: - Nothing against that, though if we're going to have only a handful of rows in each, it could be more efficient/cleaner to use the DynamicCompositeType instead of creating two different CFs. Though if you absolutely prefer 2 CFs I won't fight against it. should expose 'time since last successful repair' for easier aes monitoring --- Key: CASSANDRA-2405 URL: https://issues.apache.org/jira/browse/CASSANDRA-2405 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Peter Schuller Assignee: Pavel Yaskevich Priority: Minor Fix For: 0.8.1 Attachments: CASSANDRA-2405-v2.patch, CASSANDRA-2405-v3.patch, CASSANDRA-2405.patch The practical issue of actually ensuring that repair runs is somewhat undocumented/untreated. One hopefully low-hanging fruit would be to at least expose the time since last successful repair for a particular column family, to make it easier to write a correct script to monitor for lack of repair in a non-buggy fashion.
[jira] [Commented] (CASSANDRA-2521) Move away from Phantom References for Compaction/Memtable
[ https://issues.apache.org/jira/browse/CASSANDRA-2521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13050287#comment-13050287 ] Sylvain Lebresne commented on CASSANDRA-2521: - Actually I started working on this yesterday evening and I think I'm almost done. So re-assigning to myself for now :) Move away from Phantom References for Compaction/Memtable - Key: CASSANDRA-2521 URL: https://issues.apache.org/jira/browse/CASSANDRA-2521 Project: Cassandra Issue Type: Improvement Reporter: Chris Goffinet Assignee: Sylvain Lebresne Fix For: 1.0 http://wiki.apache.org/cassandra/MemtableSSTable Let's move to using reference counting instead of relying on GC to be called in StorageService. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-2521) Move away from Phantom References for Compaction/Memtable
[ https://issues.apache.org/jira/browse/CASSANDRA-2521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-2521: Attachment: 0001-Use-reference-counting-to-decide-when-a-sstable-can-.patch Attaching patch against trunk. Tests are passing and it seems to work, at least with small tests. I started a stress on a 3-node cluster with a repair and a major compaction started towards the end; compacted files did wait to be fully streamed before being removed, and I didn't hit any bump (I did hit CASSANDRA-2769 a bunch of times, but that's another story). Still, this is a fairly tricky problem so it could use other eyes. The basics are fairly simple though: each time a thread wants to do something with an SSTableReader, it acquires a reference to that sstable and releases it when done. SSTableReader just keeps a counter of acquired references. When the sstable has been marked compacted, we start watching for all acquired references to be released. When that's the case, the file can be removed. Obviously the main drawback of this approach (compared to the PhantomReference one) is that there is room for error. If a consumer forgets to acquire a reference (or does it in a non-thread-safe manner), the sstable can be removed while still in use. Thankfully there are not so many places in the code that need to do this, so hopefully I haven't missed any. The other thing is that if a reference on an sstable is acquired, it should be released (otherwise the sstable will not be removed until the next restart). I've tried to ensure this using try-catch blocks, but it's not really possible with the way streaming works. However, if streaming fails, it's not really worse than before, since the files were not cleaned up due to the (failed) session staying in the global map of streaming sessions. CASSANDRA-2433 should fix that in most cases anyway. The last thing is http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4715154. In other words, the deletion of a file won't work until the mmapping is finalized (aka, GC, Where art thou), at least on Windows. For that reason, when the deletion of a file fails (after the usual number of retries, which btw may make less sense now), the deletion task is saved in a global list. If Cassandra is low on disk, it will still trigger a GC, after which it will reschedule all failed deletions in the hope that the files can now be deleted. There is also a JMX call to retry this rescheduling. Move away from Phantom References for Compaction/Memtable - Key: CASSANDRA-2521 URL: https://issues.apache.org/jira/browse/CASSANDRA-2521 Project: Cassandra Issue Type: Improvement Reporter: Chris Goffinet Assignee: Sylvain Lebresne Fix For: 1.0 Attachments: 0001-Use-reference-counting-to-decide-when-a-sstable-can-.patch http://wiki.apache.org/cassandra/MemtableSSTable Let's move to using reference counting instead of relying on GC to be called in StorageService.
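The acquire/release scheme described in the comment above can be sketched as follows. This is a minimal illustration and not the code from the attached patch: RefCountedSSTable and its methods are hypothetical names, the actual file deletion is replaced by a flag, and holding one initial "self" reference that markCompacted() drops is one possible way to get the behavior described, chosen here as an assumption.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for an sstable reader; not Cassandra's actual class. */
final class RefCountedSSTable {
    // Starts at 1: a "self" reference that is dropped once by markCompacted().
    private final AtomicInteger references = new AtomicInteger(1);
    private final AtomicBoolean compacted = new AtomicBoolean(false);
    private final AtomicBoolean deleted = new AtomicBoolean(false);

    /** Takes a reference; returns false if the count already reached zero. */
    boolean acquire() {
        while (true) {
            int n = references.get();
            if (n <= 0) {
                return false; // too late, the file may already be gone
            }
            if (references.compareAndSet(n, n + 1)) {
                return true;
            }
        }
    }

    /** Drops a reference; the last release stands in for deleting the file. */
    void release() {
        if (references.decrementAndGet() == 0) {
            deleted.set(true); // real code would schedule the file deletion here
        }
    }

    /** Marks the sstable as compacted and drops the self reference exactly once. */
    void markCompacted() {
        if (compacted.compareAndSet(false, true)) {
            release();
        }
    }

    boolean isDeleted() {
        return deleted.get();
    }
}
```

A consumer would call acquire(), do its work only if that returned true, and call release() when done. The compare-and-set loop in acquire() is what keeps a late consumer from reviving an sstable whose count already reached zero, which is the kind of non-thread-safe acquisition the comment warns about.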
[jira] [Created] (CASSANDRA-2782) Create a debian package for the CQL drivers
Create a debian package for the CQL drivers --- Key: CASSANDRA-2782 URL: https://issues.apache.org/jira/browse/CASSANDRA-2782 Project: Cassandra Issue Type: Wish Components: Packaging Reporter: Sylvain Lebresne Priority: Minor Since the CQL drivers are not released in lockstep with Cassandra, they are excluded from the Cassandra Debian package. Creating a Debian package for them could make Debian users' lives a bit easier.
[jira] [Created] (CASSANDRA-2783) Explore adding replay ID for counters
Explore adding replay ID for counters - Key: CASSANDRA-2783 URL: https://issues.apache.org/jira/browse/CASSANDRA-2783 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Sylvain Lebresne If a counter write returns a TimeoutException, the client cannot retry its write without risking an overcount. One idea to fix this would be to allow the client to specify, with each counter write, a replay ID unique to that write. If the write times out, the client would resubmit the write with the same replay ID, and the system would ensure that the write is only replayed if the previous write was not persisted. Of course, the last part of this (the system would ensure ...) is the hard part. Still worth exploring, I believe.
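The client-visible contract of a replay ID can be illustrated with a toy, single-process counter. This is only a sketch under strong assumptions: ReplayableCounter is a hypothetical class, the set of seen IDs lives in memory and grows without bound, and none of the replication and persistence questions that the ticket calls the hard part are addressed.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

/** Hypothetical in-memory counter that applies each replay ID at most once. */
final class ReplayableCounter {
    private long value;
    // IDs of writes that have already been applied (unbounded in this sketch).
    private final Set<UUID> applied = new HashSet<>();

    /** Applies the delta unless this replay ID was seen before; returns true if applied. */
    synchronized boolean add(UUID replayId, long delta) {
        if (!applied.add(replayId)) {
            return false; // a retry of a write that was already persisted
        }
        value += delta;
        return true;
    }

    synchronized long get() {
        return value;
    }
}
```

A client that sees a timeout would call add() again with the same UUID: if the first attempt had been applied, the retry is ignored, otherwise it is applied now, so the counter is incremented once either way.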
[jira] [Updated] (CASSANDRA-2769) Cannot Create Duplicate Compaction Marker
[ https://issues.apache.org/jira/browse/CASSANDRA-2769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-2769: Attachment: 0002-Only-compact-what-has-been-succesfully-marked-as-com-v2.patch 0001-Do-compact-only-smallerSSTables-v2.patch bq. For trunk patches, I'm not comfortable w/ 0001 reassigning the sstables field on general principles either. We could have the compaction proceed using smallerSSTables as a simpler alternative, but in general this organization feels like negative progress from the 0.8 doCompaction/doCompactionWithoutSizeEstimation. Attaching v2 that doesn't reassign the sstables field. bq. I think Alan has a good point. I don't think it's an appropriate role of the data tracker to modify the set of sstables to be compacted in a task. I do not disagree with that. However, I'd like us to fix trunk as a first priority. It's a pain to work on other issues (CASSANDRA-2521 for instance) while it is broken (and the goal must be to do our best to always have a working trunk). The attached patches don't really change any behavior, they just fix the bugs, so let's get that in first before thinking about refactoring. Cannot Create Duplicate Compaction Marker - Key: CASSANDRA-2769 URL: https://issues.apache.org/jira/browse/CASSANDRA-2769 Project: Cassandra Issue Type: Bug Affects Versions: 0.8.0 Reporter: Benjamin Coverston Assignee: Sylvain Lebresne Fix For: 0.8.2 Attachments: 0001-0.8.0-Remove-useless-unmarkCompacting-in-doCleanup.patch, 0001-Do-compact-only-smallerSSTables-v2.patch, 0001-Do-compact-only-smallerSSTables.patch, 0002-Only-compact-what-has-been-succesfully-marked-as-com-v2.patch, 0002-Only-compact-what-has-been-succesfully-marked-as-com.patch Concurrent compaction can trigger the following exception when two threads compact the same sstable. DataTracker attempts to prevent this but apparently not successfully.
java.io.IOError: java.io.IOException: Unable to create compaction marker at org.apache.cassandra.io.sstable.SSTableReader.markCompacted(SSTableReader.java:638) at org.apache.cassandra.db.DataTracker.removeOldSSTablesSize(DataTracker.java:321) at org.apache.cassandra.db.DataTracker.replace(DataTracker.java:294) at org.apache.cassandra.db.DataTracker.replaceCompactedSSTables(DataTracker.java:255) at org.apache.cassandra.db.ColumnFamilyStore.replaceCompactedSSTables(ColumnFamilyStore.java:932) at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:173) at org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:119) at org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:102) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:680) Caused by: java.io.IOException: Unable to create compaction marker at org.apache.cassandra.io.sstable.SSTableReader.markCompacted(SSTableReader.java:634) ... 12 more -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-2521) Move away from Phantom References for Compaction/Memtable
[ https://issues.apache.org/jira/browse/CASSANDRA-2521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-2521: Attachment: 0002-Force-unmapping-files-before-deletion-v2.patch 0001-Use-reference-counting-to-decide-when-a-sstable-can-v2.patch Attaching rebased first patch and a second patch to implement the Cleaner trick. I have confirmed on an example that, at least on Linux, it does force the unmapping: the JVM crashes if you try to access the buffer after the unmapping. This is the biggest drawback of this approach imho. If we screw up the reference counting and some thread does access the mapping, we won't get a nice exception; the JVM will simply crash (with the headache of having to find out whether it is a bug on our side or a JVM bug). But from the quick testing I've done, it seems to work correctly. Move away from Phantom References for Compaction/Memtable - Key: CASSANDRA-2521 URL: https://issues.apache.org/jira/browse/CASSANDRA-2521 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Chris Goffinet Assignee: Sylvain Lebresne Fix For: 1.0 Attachments: 0001-Use-reference-counting-to-decide-when-a-sstable-can-.patch, 0001-Use-reference-counting-to-decide-when-a-sstable-can-v2.patch, 0002-Force-unmapping-files-before-deletion-v2.patch http://wiki.apache.org/cassandra/MemtableSSTable Let's move to using reference counting instead of relying on GC to be called in StorageService.
[jira] [Updated] (CASSANDRA-2788) Add startup option renew the NodeId (for counters)
[ https://issues.apache.org/jira/browse/CASSANDRA-2788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-2788: Attachment: 0001-Option-to-renew-the-NodeId-on-startup.patch Add startup option renew the NodeId (for counters) -- Key: CASSANDRA-2788 URL: https://issues.apache.org/jira/browse/CASSANDRA-2788 Project: Cassandra Issue Type: Improvement Affects Versions: 0.8.0 Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Priority: Minor Labels: counters Fix For: 0.8.2 Attachments: 0001-Option-to-renew-the-NodeId-on-startup.patch If an sstable of a counter column family is corrupted, the only safe solution a user has right now is to: # Remove the NodeId System table to force the node to regenerate a new NodeId (and thus stop incrementing on its previous, corrupted, subcount) # Remove all the sstables for that column family on that node (this is important because otherwise the node will never get repaired for its previous subcount) This is far from ideal, but I think this is the price we pay for avoiding the read-before-write. In any case, the first step (removing the NodeId system table) happens to remove the list of the old NodeIds this node has, which could prevent us from merging the other potential previous NodeIds. This is ok but sub-optimal. This ticket proposes to add a new startup flag to make the node renew its NodeId, thus replacing this first step.
[jira] [Created] (CASSANDRA-2788) Add startup option renew the NodeId (for counters)
Add startup option renew the NodeId (for counters) -- Key: CASSANDRA-2788 URL: https://issues.apache.org/jira/browse/CASSANDRA-2788 Project: Cassandra Issue Type: Improvement Affects Versions: 0.8.0 Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Priority: Minor Fix For: 0.8.2 Attachments: 0001-Option-to-renew-the-NodeId-on-startup.patch If an sstable of a counter column family is corrupted, the only safe solution a user has right now is to: # Remove the NodeId System table to force the node to regenerate a new NodeId (and thus stop incrementing on its previous, corrupted, subcount) # Remove all the sstables for that column family on that node (this is important because otherwise the node will never get repaired for its previous subcount) This is far from ideal, but I think this is the price we pay for avoiding the read-before-write. In any case, the first step (removing the NodeId system table) happens to remove the list of the old NodeIds this node has, which could prevent us from merging the other potential previous NodeIds. This is ok but sub-optimal. This ticket proposes to add a new startup flag to make the node renew its NodeId, thus replacing this first step.
[jira] [Commented] (CASSANDRA-2786) After a minor compaction, deleted key-slices are visible again
[ https://issues.apache.org/jira/browse/CASSANDRA-2786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13051110#comment-13051110 ] Sylvain Lebresne commented on CASSANDRA-2786: - The java version would be really cool :) After a minor compaction, deleted key-slices are visible again -- Key: CASSANDRA-2786 URL: https://issues.apache.org/jira/browse/CASSANDRA-2786 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.8.0 Environment: Single node with empty database Reporter: rene kochen Attachments: CassandraIssue.zip After a minor compaction, deleted key-slices are visible again. Steps to reproduce: 1) Insert a row named test. 2) Insert 50 rows. During this step, test is included in a major compaction. 3) Delete row named test. 4) Insert 50 rows. During this step, test is included in a minor compaction. After step 4, row test is live again. Test environment: Single node with empty database. Standard configured super-column-family (I see this behavior with several gc_grace settings (big and small values): create column family Customers with column_type = 'Super' and comparator = 'BytesType; In Cassandra 0.7.6 I observe the expected behavior, i.e. after step 4, the row is still deleted. I've included a .NET program to reproduce the problem. I will add a Java version later on. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2653) index scan errors out when zero columns are requested
[ https://issues.apache.org/jira/browse/CASSANDRA-2653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051798#comment-13051798 ] Sylvain Lebresne commented on CASSANDRA-2653: - This really primarily fixes the error from Jake's test cases. I'll have to admit that's the only thing I looked at. I did not realize the original problem was not necessarily related, and so it is very possible (even likely) that this does not fix the zero-columns-requested problem. index scan errors out when zero columns are requested - Key: CASSANDRA-2653 URL: https://issues.apache.org/jira/browse/CASSANDRA-2653 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.8.0 beta 2 Reporter: Jonathan Ellis Assignee: Sylvain Lebresne Priority: Minor Fix For: 0.8.1 Attachments: 0001-Reset-SSTII-in-EchoedRow-constructor.patch, v1-0001-CASSANDRA-2653-reproduce-regression.txt As reported by Tyler Hobbs as an addendum to CASSANDRA-2401, {noformat} ERROR 16:13:38,864 Fatal exception in thread Thread[ReadStage:16,5,main] java.lang.AssertionError: No data found for SliceQueryFilter(start=java.nio.HeapByteBuffer[pos=10 lim=10 cap=30], finish=java.nio.HeapByteBuffer[pos=17 lim=17 cap=30], reversed=false, count=0] in DecoratedKey(81509516161424251288255223397843705139, 6b657931):QueryPath(columnFamilyName='cf', superColumnName='null', columnName='null') (original filter SliceQueryFilter(start=java.nio.HeapByteBuffer[pos=10 lim=10 cap=30], finish=java.nio.HeapByteBuffer[pos=17 lim=17 cap=30], reversed=false, count=0]) from expression 'cf.626972746864617465 EQ 1' at org.apache.cassandra.db.ColumnFamilyStore.scan(ColumnFamilyStore.java:1517) at org.apache.cassandra.service.IndexScanVerbHandler.doVerb(IndexScanVerbHandler.java:42) at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) {noformat}
[jira] [Commented] (CASSANDRA-2735) Timestamp Based Compaction Strategy
[ https://issues.apache.org/jira/browse/CASSANDRA-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051836#comment-13051836 ] Sylvain Lebresne commented on CASSANDRA-2735: - The goal here is not to have TTL for counters (or anything else). The goal is to have a compaction strategy that, as part of what it does, can throw away entire sstables when their content is considered old enough (and that's actually only part of the strategy, not necessarily its primary goal). As it turns out, this will roughly (and the rough part is important) amount to expiring data, including counters. But this will be a very heavy hammer; in particular, it will only work if all the counters/data in the column family have the exact same expiration time. And this won't work at all for, say, counters that you would want to start re-incrementing after expiration. But again, this is not the goal of the ticket. Timestamp Based Compaction Strategy --- Key: CASSANDRA-2735 URL: https://issues.apache.org/jira/browse/CASSANDRA-2735 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Alan Liang Assignee: Alan Liang Priority: Minor Labels: compaction Attachments: 0004-timestamp-bucketed-compaction-strategy.patch Compaction strategy implementation based on max timestamp ordering of the sstables while satisfying max sstable size, min and max compaction thresholds. It also handles expiration of sstables based on a timestamp.
[jira] [Commented] (CASSANDRA-2773) Index manager cannot support deleting and inserting into a row in the same mutation
[ https://issues.apache.org/jira/browse/CASSANDRA-2773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051871#comment-13051871 ] Sylvain Lebresne commented on CASSANDRA-2773: - Hum, we cannot remove the column from cf in ignoreObsoleteMutations() because cf is the original column family from the row mutation, and that's racy with the commit log write (à la CASSANDRA-2604). We should clone the column family, but maybe it's simpler to add validation logic after all? In any case, it could be worth adding some comments in Table.apply() or Table.ignoreObsoleteMutations(). Index manager cannot support deleting and inserting into a row in the same mutation - Key: CASSANDRA-2773 URL: https://issues.apache.org/jira/browse/CASSANDRA-2773 Project: Cassandra Issue Type: Bug Affects Versions: 0.7.0 Reporter: Boris Yen Assignee: Jonathan Ellis Priority: Minor Fix For: 0.8.2 Attachments: 2773.txt I use hector 0.8.0-1 and cassandra 0.8. 1. Create a mutator using the hector API. 2. Insert a few columns into the mutator for key key1, cf standard. 3. Add a deletion to the mutator to delete the record of key1, cf standard. 4. Repeat 2 and 3. 5. Execute the mutator. The result: the connection seems to be held by the server forever; it never returns. When I tried to restart Cassandra I saw unsupportedexception : Index manager cannot support deleting and inserting into a row in the same mutation. And Cassandra is dead forever, unless I delete the commitlog. I would expect to get an exception when I execute the mutator, not after I restart Cassandra.
[jira] [Commented] (CASSANDRA-2795) Autodelete empty rows
[ https://issues.apache.org/jira/browse/CASSANDRA-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051909#comment-13051909 ] Sylvain Lebresne commented on CASSANDRA-2795: - bq. I tested setting gc_grace very low (tried 0 and 1) in a single node, and the row didn't disappear. Ok, to be precise you need to have a compaction occurring after gc_grace has passed. So you'll need to flush after the insertion, wait for the column to expire, force a compaction, wait for it to finish and then issue the request. Autodelete empty rows - Key: CASSANDRA-2795 URL: https://issues.apache.org/jira/browse/CASSANDRA-2795 Project: Cassandra Issue Type: Improvement Components: Core, Tools Affects Versions: 0.8.0 Reporter: Pau Rodriguez In a system where every column expires using TTL, the rows persist, and they are empty. Is it possible to also delete them, if empty, once the last column has expired? I understand that this may be difficult to synchronize across the whole cluster. If this behavior isn't good for all cases, maybe it can be configured by a variable per Column Family. Alternatively, there could be a tool to remove empty rows across the whole cluster; the problem with doing that using the API is the time between when the check is done and when the remove is sent. I think it is preferable for this to be done when the last column has expired. Thanks in advance.
[jira] [Resolved] (CASSANDRA-2795) Autodelete empty rows
[ https://issues.apache.org/jira/browse/CASSANDRA-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne resolved CASSANDRA-2795. - Resolution: Not A Problem What you are seeing is range ghosts: http://wiki.apache.org/cassandra/FAQ#range_ghosts The row *is* correctly deleted when all columns expire. It won't show as a range ghost once gc_grace seconds have passed. Autodelete empty rows - Key: CASSANDRA-2795 URL: https://issues.apache.org/jira/browse/CASSANDRA-2795 Project: Cassandra Issue Type: Improvement Components: Core, Tools Affects Versions: 0.8.0 Reporter: Pau Rodriguez In a system where every column expires using TTL, the rows persist, and they are empty. Is it possible to also delete them, if empty, once the last column has expired? I understand that this may be difficult to synchronize across the whole cluster. If this behavior isn't good for all cases, maybe it can be configured by a variable per Column Family. Alternatively, there could be a tool to remove empty rows across the whole cluster; the problem with doing that using the API is the time between when the check is done and when the remove is sent. I think it is preferable for this to be done when the last column has expired. Thanks in advance.
[jira] [Issue Comment Edited] (CASSANDRA-2793) SSTable Corrupt (negative) value length encountered exception blocks compaction.
[ https://issues.apache.org/jira/browse/CASSANDRA-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13051915#comment-13051915 ] Sylvain Lebresne edited comment on CASSANDRA-2793 at 6/20/11 10:46 AM: --- bq. Hi the issue reported was that the sstable corruption is blocking compaction with the consequence the bucket of sstables Cassandra wants to compact just grows and you get huge cpu load (from repeated attempts at compaction and increasing read inefficiency). This is a dupe of CASSANDRA-2261. bq. the trace also shows that it has just skipped the corrupted row so in fact it hasn't solved the problem at all. In most cases of corruption, there is not much more we can do than skip the row. As the long as the corruption is local and you don't use RF=1, this is usually not a big deal (which does not mean corruption is something we should be happy with). bq. The corruption itself is also an issue Corruption can be of two forms: either we have a bug or the corruption is external (bad hard drive for instance). Hard drive corruptions do happen and there is not much we can do about it (well, actually we should use checksum to at least better dectect them : CASSANDRA-1717). On the front of a bug, since I see this happens on a Super column family, it could be due to a race fixed by CASSANDRA-2675. was (Author: slebresne): bq. Hi the issue reported was that the sstable corruption is blocking compaction with the consequence the bucket of sstables Cassandra wants to compact just grows and you get huge cpu load (from repeated attempts at compaction and increasing read inefficiency). This is a dupe of https://issues.apache.org/jira/browse/CASSANDRA-2261. bq. the trace also shows that it has just skipped the corrupted row so in fact it hasn't solved the problem at all. In most cases of corruption, there is not much more we can do than skip the row. 
As long as the corruption is local and you don't use RF=1, this is usually not a big deal (which does not mean corruption is something we should be happy with). bq. The corruption itself is also an issue Corruption can take two forms: either we have a bug or the corruption is external (a bad hard drive, for instance). Hard drive corruptions do happen and there is not much we can do about them (well, actually we should use checksums to at least detect them better: CASSANDRA-1717). As for a bug, since I see this happens on a Super column family, it could be due to a race fixed by CASSANDRA-2675. SSTable Corrupt (negative) value length encountered exception blocks compaction. -- Key: CASSANDRA-2793 URL: https://issues.apache.org/jira/browse/CASSANDRA-2793 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.7.6 Environment: Ubuntu Reporter: Dominic Williams A node was consistently experiencing high CPU load. Examination of the logs showed that compaction of an sstable was failing with an error: INFO [CompactionExecutor:1] 2011-06-17 00:18:51,676 CompactionManager.java (line 395) Compacting [SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-6993-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-6994-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-6995-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-6996-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-6998-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7000-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7002-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7004-Data.db
'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7006-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7008-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7010-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7012-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7014-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7016-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7018-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7020-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7022-Data.db'),SSTa