[Cassandra Wiki] Update of ClientOptions by DeanHiller
Dear Wiki user, You have subscribed to a wiki page or wiki category on Cassandra Wiki for change notification. The ClientOptions page has been changed by DeanHiller:
http://wiki.apache.org/cassandra/ClientOptions?action=diff&rev1=156&rev2=157

  * Pycassa: http://github.com/pycassa/pycassa
  * Telephus: http://github.com/driftx/Telephus (Twisted)
  * Java:
+  * PlayOrm: https://github.com/deanhiller/playorm
   * Astyanax: https://github.com/Netflix/astyanax/wiki/Getting-Started
   * Hector:
    * Site: http://hector-client.org
[jira] [Commented] (CASSANDRA-4512) Nodes removed with removetoken stay around preventing truncation
[ https://issues.apache.org/jira/browse/CASSANDRA-4512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13435958#comment-13435958 ]

Manoj Kanta Mainali commented on CASSANDRA-4512:
------------------------------------------------

Do you have any concrete steps on how to reproduce this, and what is the error message you get? Did you check the server logs? I tried truncating after creating two instances, killing one of them, and removing its token using nodetool, but the cli didn't report any error and truncation was successful. The only reason truncate will report nodes as UNREACHABLE is if the endpoint still exists in the down-nodes (unreachable) set. If your removetoken was successful and didn't throw any error, the endpoint would have been removed from the unreachable set.

Nodes removed with removetoken stay around preventing truncation

Key: CASSANDRA-4512
URL: https://issues.apache.org/jira/browse/CASSANDRA-4512
Project: Cassandra
Issue Type: Bug
Affects Versions: 1.0.10
Environment: Ubuntu, EC2
Reporter: Taras Ovsyankin
Priority: Minor

Removed multiple nodes from the cluster in order to scale down (killed VMs, then ran removetoken for every dead node). Nodetool ring looks happy, but cassandra-cli reports removed nodes as UNREACHABLE and truncation doesn't work.

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-4481) Commitlog not replayed after restart - data lost
[ https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436020#comment-13436020 ]

Jonathan Ellis commented on CASSANDRA-4481:
-------------------------------------------

I'm not aware of any recent commitlog format changes. Which specific versions are you saying are incompatible?

Commitlog not replayed after restart - data lost

Key: CASSANDRA-4481
URL: https://issues.apache.org/jira/browse/CASSANDRA-4481
Project: Cassandra
Issue Type: Bug
Affects Versions: 1.1.2
Environment: Single node cluster on 64-bit CentOS
Reporter: Ivo Meißner
Priority: Critical

When data is written to the commitlog and I restart the machine, all committed data that has not been flushed to disk is lost. The startup logs say that the commitlog was replayed successfully, but the data is not available afterwards. When I open the commitlog file in an editor I can see the added data, but after the restart it cannot be fetched from cassandra.

{code}
INFO 09:59:45,362 Replaying /var/myproject/cassandra/commitlog/CommitLog-83203377067.log
INFO 09:59:45,476 Finished reading /var/myproject/cassandra/commitlog/CommitLog-83203377067.log
INFO 09:59:45,476 Log replay complete, 0 replayed mutations
{code}
[jira] [Commented] (CASSANDRA-4481) Commitlog not replayed after restart - data lost
[ https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436024#comment-13436024 ]

Florent Clairambault commented on CASSANDRA-4481:
-------------------------------------------------

Well, I spoke a little too fast here. It does look like they are incompatible, and this (so-called by me) incompatibility seems to occur between 1.1.1 or 1.1.2 and 1.1.3.
[jira] [Comment Edited] (CASSANDRA-4481) Commitlog not replayed after restart - data lost
[ https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13435597#comment-13435597 ]

Florent Clairambault edited comment on CASSANDRA-4481 at 8/17/12 2:08 AM:
--------------------------------------------------------------------------

So, I didn't find (or try to find) a way to reproduce this bug, but I found an (incomplete) fix. I'm on Debian/6.0.5, still with Cassandra/1.1.3. The commands below are for a keyspace named dom2dom (so that I don't have to replace any name). In my case I removed all the commitlog files that were created prior to 1.1.3.

{code:title=Shell commands|borderStyle=solid}
# 1. Flush cassandra
nodetool flush
# 2. Stop cassandra
service cassandra stop
# 3. Move the sstable files to another directory
mkdir /var/lib/cassandra/toload
mv /var/lib/cassandra/data/dom2dom /var/lib/cassandra/toload/dom2dom
# In my case, I had to create a 127.0.0.2 loopback interface
# and update the cassandra.yaml file to change the rpc_address and listen_address settings
# to 127.0.0.2 so that sstableloader could work.
# 4. Start cassandra
service cassandra start
# At that point the commitlogs should work again and you should have some new sstables created
du -sh /var/lib/cassandra/data/dom2dom
# Returns: 236K
# You now have the new data and not the old, so you need to load the old data using sstableloader:
find /var/lib/cassandra/toload/dom2dom/ -type d -exec sstableloader -d 127.0.0.2 {} \;
# In my case, I had to put localhost back in cassandra.yaml for the rpc_address and listen_address settings
# You can now delete the /var/lib/cassandra/toload folder
{code}

*IMPORTANT NOTE:* I don't think the old (prior to 1.1.3) commitlog files will work; from what I've quickly tested, they don't. Still, it would be very good to have some kind of error/warning log around commitlogs that are not taken into account for whatever reason, because you currently only discover the problem when you restart your cassandra server.
[jira] [Comment Edited] (CASSANDRA-4481) Commitlog not replayed after restart - data lost
[ https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436024#comment-13436024 ]

Florent Clairambault edited comment on CASSANDRA-4481 at 8/17/12 2:08 AM:
--------------------------------------------------------------------------

Well, I spoke a little too fast here. It does look like they are incompatible, and this (so-called by me) incompatibility seems to occur between 1.1.1 or 1.1.2 and 1.1.3. I tried to clarify my previous message.
[jira] [Commented] (CASSANDRA-4481) Commitlog not replayed after restart - data lost
[ https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436030#comment-13436030 ]

Jonathan Ellis commented on CASSANDRA-4481:
-------------------------------------------

If you can reproduce commitlog incompatibility between those versions, please let us know.
[jira] [Commented] (CASSANDRA-4481) Commitlog not replayed after restart - data lost
[ https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436033#comment-13436033 ]

Florent Clairambault commented on CASSANDRA-4481:
-------------------------------------------------

As I have something that now works fine, I don't think I will do it.
[jira] [Comment Edited] (CASSANDRA-4481) Commitlog not replayed after restart - data lost
[ https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13435597#comment-13435597 ]

Florent Clairambault edited comment on CASSANDRA-4481 at 8/17/12 2:21 AM:
--------------------------------------------------------------------------

So, I didn't find (or try to find) a way to reproduce this bug, but I found an (incomplete) fix. I'm on Debian/6.0.5, still with Cassandra/1.1.3. The commands below are for a keyspace named dom2dom (so that I don't have to replace any name). In my case I removed all the commitlog files that were created prior to 1.1.3.

{code:title=Shell commands|borderStyle=solid}
# 1. Flush cassandra
nodetool flush
# 2. Stop cassandra
service cassandra stop
# 3. Move the sstable files to another directory
mkdir /var/lib/cassandra/toload
mv /var/lib/cassandra/data/dom2dom /var/lib/cassandra/toload/dom2dom
# In my case, I had to create a 127.0.0.2 loopback interface
# and update the cassandra.yaml file to change the rpc_address and listen_address settings
# to 127.0.0.2 so that sstableloader could work.
# 4. Start cassandra
service cassandra start
# At that point the commitlogs should work again and you should have some new sstables created
du -sh /var/lib/cassandra/data/dom2dom
# Returns: 236K
# You now have the new data and not the old, so you need to load the old data using sstableloader:
find /var/lib/cassandra/toload/dom2dom/ -type d -exec sstableloader -d 127.0.0.2 {} \;
# In my case, I had to put localhost back in cassandra.yaml for the rpc_address and listen_address settings
# You can now delete the /var/lib/cassandra/toload folder
{code}

*IMPORTANT NOTE:* I don't think the old (prior to 1.1.3) commitlog files will work; from what I've quickly tested, they don't. Still, it would be very good to have some kind of error/warning log around commitlogs that are not taken into account for whatever reason, because you currently only discover the problem when you restart your cassandra server.
[jira] [Commented] (CASSANDRA-4292) Per-disk I/O queues
[ https://issues.apache.org/jira/browse/CASSANDRA-4292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436059#comment-13436059 ]

Jonathan Ellis commented on CASSANDRA-4292:
-------------------------------------------

Your instincts were better than mine: combining compaction and flush i/o into a single executor was a mistake. We could band-aid it by adding some kind of semaphore mechanism to make sure we always leave at least one thread free for flushing, but this still won't let us max out on flushing temporarily at the expense of compaction without introducing extremely complicated preemption logic. So, color me convinced that we need to keep separate executors for flush and compaction.

Additionally, the more I think about it, the less I think the DBT abstraction is what we want here. Or, at a higher level: I don't think we want to be that strict about one thread per disk (which was my fault in the first place, sorry!). If we instead just follow the above disk prioritization logic, we'll still get effectively thread-per-disk until disks start to run out of space. But having a (standard) flexible pool of threads means that we generalize much better to SSDs, where having substantially more threads than disks makes sense (since compaction becomes CPU bound).

So I think we can simplify our approach a lot, perhaps by having a global Directory state that tracks space remaining and how many i/o tasks are running on each, that we can use when handing out flush and compaction targets. The executor architecture won't need to change. (We may want to introduce a DirectoryBoundRunnable abstraction, whose run method encapsulates updating the i/o task count and free space after running the flush/compaction, but without trying it I'm not sure that actually works as imagined.)
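The DirectoryBoundRunnable idea sketched above might look something like the following. This is purely illustrative of the proposal, not Cassandra code; the class, the `ACTIVE` map, and the free-space comment are all hypothetical. It wraps a flush/compaction task so that a per-directory count of running i/o tasks is maintained around execution:

```java
import java.io.File;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: wrap a flush or compaction task so that
// per-directory i/o task counts are updated around its execution.
public class DirectoryBoundRunnable implements Runnable {
    // directory -> number of i/o tasks currently running against it
    static final Map<File, AtomicInteger> ACTIVE = new ConcurrentHashMap<>();

    private final File directory;
    private final Runnable task;

    public DirectoryBoundRunnable(File directory, Runnable task) {
        this.directory = directory;
        this.task = task;
    }

    @Override
    public void run() {
        ACTIVE.computeIfAbsent(directory, d -> new AtomicInteger()).incrementAndGet();
        try {
            task.run(); // the actual flush or compaction work
        } finally {
            ACTIVE.get(directory).decrementAndGet();
            // a real implementation would also refresh free-space accounting here
        }
    }
}
```

A scheduler handing out flush/compaction targets could then consult `ACTIVE` (together with space remaining) to prefer the least-loaded data directory, without changing the executor architecture.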
Per-disk I/O queues

Key: CASSANDRA-4292
URL: https://issues.apache.org/jira/browse/CASSANDRA-4292
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Jonathan Ellis
Assignee: Yuki Morishita
Fix For: 1.2
Attachments: 4292.txt, 4292-v2.txt, 4292-v3.txt

As noted in CASSANDRA-809, we have a certain number of flush (and compaction) threads, which mix and match disk volumes indiscriminately. It may be worth creating a tight thread <-> disk affinity, to prevent unnecessary conflict at that level. OTOH, as SSDs become more prevalent this becomes a non-issue. It is unclear how much pain this actually causes in practice in the meantime.
[jira] [Commented] (CASSANDRA-2710) Get multiple column ranges
[ https://issues.apache.org/jira/browse/CASSANDRA-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436075#comment-13436075 ]

Jonathan Ellis commented on CASSANDRA-2710:
-------------------------------------------

What is the use case? TBH this seems like a misfeature to me, one that we only support for backwards compatibility.

Get multiple column ranges

Key: CASSANDRA-2710
URL: https://issues.apache.org/jira/browse/CASSANDRA-2710
Project: Cassandra
Issue Type: Sub-task
Components: API, Core
Reporter: David Boxenhorn
Assignee: Vijay
Labels: compositeColumns, cql
Attachments: 0001-2710-multiple-column-ranges-cql.patch, 0001-2710-multiple-column-ranges-thrift.patch

I have replaced all my super column families with regular column families using composite columns. I have easily been able to support all previous functionality (I don't need range delete) except for one thing: getting multiple super columns with a single access. For this, I would need to get multiple ranges. (I can get multiple columns, or a single range, but not multiple ranges.)

For example, I used to have [superColumnName1,subColumnName1..N],[superColumnName2,subColumnName1..N] and I could get superColumnName1, superColumnName2. Now I have [<len>superColumnName1<0><len>subColumnName1..<len>superColumnName1<0><len>subColumnNameN],[<len>superColumnName2<0><len>subColumnName1..<len>superColumnName2<0><len>subColumnNameN] and I need to get superColumnName1..superColumnName1+, superColumnName2..superColumnName2+ to get the same functionality.

I would like the clients to support this functionality, e.g. for Hector to have .setRanges parallel to .setColumnNames, and for CQL to support a syntax like SELECT [FIRST N] [REVERSED] name1..nameN1, name2..nameN2... FROM ...
[jira] [Commented] (CASSANDRA-4487) remove uses of SchemaDisagreementException
[ https://issues.apache.org/jira/browse/CASSANDRA-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436087#comment-13436087 ]

Jonathan Ellis commented on CASSANDRA-4487:
-------------------------------------------

LGTM

remove uses of SchemaDisagreementException

Key: CASSANDRA-4487
URL: https://issues.apache.org/jira/browse/CASSANDRA-4487
Project: Cassandra
Issue Type: Bug
Components: API
Affects Versions: 1.2
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
Priority: Minor
Fix For: 1.2
Attachments: 0001-code-changes.patch, 0002-re-generated-thrift.patch, CASSANDRA-4487-v2.patch

Since we can handle concurrent schema changes now, there's no need to validateSchemaAgreement before modification.
[jira] [Updated] (CASSANDRA-2897) Secondary indexes without read-before-write
[ https://issues.apache.org/jira/browse/CASSANDRA-2897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-2897:
--------------------------------------
Fix Version/s: 1.2

Secondary indexes without read-before-write

Key: CASSANDRA-2897
URL: https://issues.apache.org/jira/browse/CASSANDRA-2897
Project: Cassandra
Issue Type: Improvement
Components: Core
Affects Versions: 0.7.0
Reporter: Sylvain Lebresne
Priority: Minor
Labels: secondary_index
Fix For: 1.2
Attachments: 0001-CASSANDRA-2897-Secondary-indexes-without-read-before-w.txt

Currently, secondary index updates require a read-before-write to maintain index consistency. Keeping the index consistent at all times is not necessary, however. We could let the (secondary) index get inconsistent on writes and repair it on reads. This would be easy because on reads we make sure to request the indexed columns anyway, so we can just skip the rows that are not needed and repair the index at the same time. This trades work on writes for work on reads; however, read-before-write is sufficiently costly that it will likely be a win overall. There are (at least) two small technical difficulties here, though:
# If we repair on read, this will be racy with writes, so we'll probably have to synchronize there.
# We probably shouldn't rely only on reads to repair, and we should also have a task to repair the index for things that are rarely read. It's unclear how to make that low impact, though.
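The repair-on-read scheme described above can be sketched as a toy model (greatly simplified; in-memory maps stand in for the base table and the index, and all names are illustrative, not Cassandra internals). Writes touch only the base table and append to the index without a prior read, possibly leaving stale entries; reads filter stale candidates against the base row and clean them up as they go:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

// Toy model of "repair index on read": no read-before-write on the write
// path, so overwrites leave stale index entries that reads repair lazily.
public class LazyIndex {
    final Map<String, String> rows = new HashMap<>();        // rowKey -> indexed value
    final Map<String, Set<String>> index = new HashMap<>();  // value -> candidate rowKeys

    void write(String rowKey, String value) {
        rows.put(rowKey, value);                             // no read of the old value
        index.computeIfAbsent(value, v -> new HashSet<>()).add(rowKey);
    }

    Set<String> query(String value) {
        Set<String> candidates = index.getOrDefault(value, new HashSet<>());
        Set<String> result = new HashSet<>();
        for (Iterator<String> it = candidates.iterator(); it.hasNext(); ) {
            String rowKey = it.next();
            if (value.equals(rows.get(rowKey))) {
                result.add(rowKey);      // entry still valid
            } else {
                it.remove();             // stale entry: repair the index during the read
            }
        }
        return result;
    }
}
```

Even in this toy, the two difficulties from the ticket are visible: the `it.remove()` races with concurrent writes (real code needs synchronization there), and entries for values that are never queried stay stale until some background task sweeps them.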
[jira] [Created] (CASSANDRA-4549) Update the pig examples to include more recent pig/cassandra features
Jeremy Hanna created CASSANDRA-4549:
------------------------------------

Summary: Update the pig examples to include more recent pig/cassandra features
Key: CASSANDRA-4549
URL: https://issues.apache.org/jira/browse/CASSANDRA-4549
Project: Cassandra
Issue Type: Task
Components: Hadoop
Reporter: Jeremy Hanna
Assignee: Jeremy Hanna
Priority: Minor

Now that there is support for a variety of Cassandra features from Pig (especially 1.1+), it would be great to have some of them in the examples so that people can see how to use them.
[jira] [Resolved] (CASSANDRA-1743) Switch to TFastFramedTransport
[ https://issues.apache.org/jira/browse/CASSANDRA-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis resolved CASSANDRA-1743.
---------------------------------------
Resolution: Won't Fix
Fix Version/s: (was: 1.2)
Assignee: (was: T Jake Luciani)

Now that we have a custom binary protocol, we can leave thrift in peace.

Switch to TFastFramedTransport

Key: CASSANDRA-1743
URL: https://issues.apache.org/jira/browse/CASSANDRA-1743
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Jonathan Ellis
Attachments: 1743.txt, 1743.txt, 1743_v3.txt, 1743_v4.txt, 1743_v5.txt
Original Estimate: 16h
Remaining Estimate: 16h

Forgot that after THRIFT-831, fast mode is not the default and is a separate transport class.
[jira] [Commented] (CASSANDRA-2710) Get multiple column ranges
[ https://issues.apache.org/jira/browse/CASSANDRA-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436124#comment-13436124 ]

Vijay commented on CASSANDRA-2710:
----------------------------------

Hi Jonathan, consider a use case where we have hierarchical data: type:latest:1:2, type:v1:1:2, type:v2:1:2, etc. The user might want to query all the data most of the time, sometimes only the latest, and sometimes a specific version of a given type.
1) If we model the type as part of the row key, then for 80% or so of the use cases I will be doing a multi-get (we don't advise OPP, so sometimes you might need an index).
2) If I have all of them in one row, then I will be doing multiple calls to get the data out.
I am not arguing the need for it; there are other ways to get it done (by adding the type and v1 in the super column name, or something like that), but it would be a little more flexible. I am fine with closing the ticket too :) Let me know. Thanks!
[jira] [Created] (CASSANDRA-4550) nodetool ring output should use hex not integers for tokens
Aaron Turner created CASSANDRA-4550: --- Summary: nodetool ring output should use hex not integers for tokens Key: CASSANDRA-4550 URL: https://issues.apache.org/jira/browse/CASSANDRA-4550 Project: Cassandra Issue Type: Improvement Components: Tools Affects Versions: 1.0.9 Environment: Linux Reporter: Aaron Turner Priority: Minor The current output of nodetool ring prints start token values as base10 integers instead of hex. This is not very user friendly for a number of reasons: 1. Hides the fact that the values are 128bit 2. Values are not of a consistent length, while in hex padding with zero is generally accepted 3. When using the default random partitioner, having the values in hex makes it easier for users to determine which node(s) a given key resides on since md5 utilities like md5sum output hex. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
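The request above is purely a formatting change: a RandomPartitioner token is derived from the 128-bit MD5 of the key. A quick sketch of the proposed hex rendering and why it lines up with md5sum output (a simplification: the real partitioner takes the absolute value of the two's-complement MD5 as a BigInteger, while this sketch uses the raw unsigned digest):

```python
import hashlib

def token_of(key: bytes) -> int:
    # Token from the md5 of the key (unsigned interpretation; see note above).
    return int.from_bytes(hashlib.md5(key).digest(), "big")

def as_hex(token: int) -> str:
    # Zero-padded to 32 hex digits, making the 128-bit width explicit
    # and giving every token a consistent length.
    return format(token, "032x")

t = token_of(b"mykey")
assert len(as_hex(t)) == 32
# the hex form matches md5 hex output for the same bytes, so users can
# eyeball which node a key lands on
assert as_hex(t) == hashlib.md5(b"mykey").hexdigest()
```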
[jira] [Commented] (CASSANDRA-2710) Get multiple column ranges
[ https://issues.apache.org/jira/browse/CASSANDRA-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13436224#comment-13436224 ] Jonathan Ellis commented on CASSANDRA-2710: --- bq. if we model the type to be part of the Row Key then the problem is for 80% or so use case i will be doing a multi-get What's the objection here? multiget-within-a-single-row still has all the problems of multiget-across-rows, with the added problem that it doesn't parallelize across machines. Get multiple column ranges -- Key: CASSANDRA-2710 URL: https://issues.apache.org/jira/browse/CASSANDRA-2710 Project: Cassandra Issue Type: Sub-task Components: API, Core Reporter: David Boxenhorn Assignee: Vijay Labels: compositeColumns, cql Attachments: 0001-2710-multiple-column-ranges-cql.patch, 0001-2710-multiple-column-ranges-thrift.patch I have replaced all my super column families with regular column families using composite columns. I have easily been able to support all previous functionality (I don't need range delete) except for one thing: getting multiple super columns with a single access. For this, I would need to get multiple ranges. (I can get multiple columns, or a single range, but not multiple ranges.) For example, I used to have [superColumnName1,subColumnName1..N],[superColumnName2,subColumnName1..N] and I could get superColumnName1, superColumnName2 Now I have [lensuperColumnName10lensubColumnName1..lensuperColumnName10lensubColumnNameN],[lensuperColumnName20lensubColumnName1..lensuperColumnName20lensubColumnNameN] and I need to get superColumnName1..superColumnName1+, superColumnName2..superColumnName2+ to get the same functionality I would like the clients to support this functionality, e.g. Hector to have .setRages parallel to .setColumnNames and for CQL to support a syntax like SELECT [FIRST N] [REVERSED] name1..nameN1, name2..nameN2... FROM ... -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-4550) nodetool ring output should use hex not integers for tokens
[ https://issues.apache.org/jira/browse/CASSANDRA-4550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-4550: -- Priority: Trivial (was: Minor) Affects Version/s: (was: 1.0.9) Labels: lhf (was: ) nodetool ring output should use hex not integers for tokens --- Key: CASSANDRA-4550 URL: https://issues.apache.org/jira/browse/CASSANDRA-4550 Project: Cassandra Issue Type: Improvement Components: Tools Environment: Linux Reporter: Aaron Turner Priority: Trivial Labels: lhf The current output of nodetool ring prints start token values as base10 integers instead of hex. This is not very user friendly for a number of reasons: 1. Hides the fact that the values are 128bit 2. Values are not of a consistent length, while in hex padding with zero is generally accepted 3. When using the default random partitioner, having the values in hex makes it easier for users to determine which node(s) a given key resides on since md5 utilities like md5sum output hex. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-4304) Add bytes-limit clause to queries
[ https://issues.apache.org/jira/browse/CASSANDRA-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-4304: -- Fix Version/s: (was: 1.2.0) Add bytes-limit clause to queries - Key: CASSANDRA-4304 URL: https://issues.apache.org/jira/browse/CASSANDRA-4304 Project: Cassandra Issue Type: New Feature Components: API, Core Reporter: Christian Spriegel Attachments: TestImplForSlices.patch The idea is to add a second limit clause to (slice) queries. This would allow easy loading of batches, even if the content is variable-sized. Imagine the following use case: you want to load a batch of XMLs, where each is between 100 bytes and 5 MB large. Currently you can load either - a large number of XMLs, but risk OOMs or timeouts, or - a small number of XMLs, and do too many queries where each query usually retrieves very little data. With Cassandra able to limit by size and not just count, we could do a single query which would never OOM but always return a decent amount of data -- with no extra overhead for multiple queries. A few thoughts from my side: - The limit should be a soft limit, not a hard limit. Therefore it will always return at least one row/column, even if that one is larger than the limit specifies. - HintedHandoffManager:303 is already doing an InMemoryCompactionLimit/averageColumnSize calculation to avoid OOM. It could then simply use the new limit clause :-) - A bytes-limit on a range- or indexed-query should always return a complete row -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
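The proposed soft-limit semantics (stop once the byte budget is crossed, but always return at least one column even if it alone exceeds the budget) can be sketched as follows; the function name and representation are hypothetical, not the proposed API:

```python
def slice_by_bytes(columns, byte_limit):
    """Take columns until their cumulative size would exceed byte_limit.

    Soft limit: always returns at least one column, even if that single
    column is larger than the limit. `columns` is an iterable of
    (name, value) pairs; size here is just len(value) for illustration.
    """
    taken, total = [], 0
    for name, value in columns:
        if taken and total + len(value) > byte_limit:
            break
        taken.append((name, value))
        total += len(value)
    return taken

xmls = [("a", b"x" * 100), ("b", b"x" * 5_000_000), ("c", b"x" * 200)]
# the 5 MB XML blows the 1000-byte budget, so the batch stops before it
assert [n for n, _ in slice_by_bytes(xmls, 1000)] == ["a"]
# soft limit: a first column larger than the budget is still returned
assert [n for n, _ in slice_by_bytes([("big", b"x" * 100)], 50)] == ["big"]
```

A caller would resume the next batch from the last name returned, getting consistently sized pages regardless of how variable the values are.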
[jira] [Resolved] (CASSANDRA-3649) Code style changes, aka The Big Reformat
[ https://issues.apache.org/jira/browse/CASSANDRA-3649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-3649. --- Resolution: Won't Fix Fix Version/s: (was: 1.2.0) Realistically I guess this is not going to happen. Code style changes, aka The Big Reformat Key: CASSANDRA-3649 URL: https://issues.apache.org/jira/browse/CASSANDRA-3649 Project: Cassandra Issue Type: Wish Components: Core Reporter: Brandon Williams With a new major release coming soon and not having a ton of huge pending patches that have prevented us from doing this in the past, post-freeze looks like a good time to finally do this. Mostly this will include the removal of underscores in private variables, and no more brace-on-newline policy. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-1337) parallelize fetching rows for low-cardinality indexes
[ https://issues.apache.org/jira/browse/CASSANDRA-1337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-1337: -- Priority: Minor (was: Major) Fix Version/s: (was: 1.2.0) 1.2.1 parallelize fetching rows for low-cardinality indexes - Key: CASSANDRA-1337 URL: https://issues.apache.org/jira/browse/CASSANDRA-1337 Project: Cassandra Issue Type: Improvement Reporter: Jonathan Ellis Assignee: David Alves Priority: Minor Fix For: 1.2.1 Attachments: 0001-CASSANDRA-1337-scan-concurrently-depending-on-num-rows.txt, 1137-bugfix.patch, CASSANDRA-1337.patch Original Estimate: 8h Remaining Estimate: 8h currently, we read the indexed rows from the first node (in partitioner order); if that does not have enough matching rows, we read the rows from the next, and so forth. we should use the statistics fom CASSANDRA-1155 to query multiple nodes in parallel, such that we have a high chance of getting enough rows w/o having to do another round of queries (but, if our estimate is incorrect, we do need to loop and do more rounds until we have enough data or we have fetched from each node). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
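The (since reverted) patch for this ticket picked a scan concurrency from the estimated index cardinality. The clamping arithmetic, mirroring the reverted StorageProxy code (variable names follow that code; this is a sketch, not the committed implementation):

```python
def concurrency_factor(max_results, estimated_keys_per_range, num_ranges,
                       max_is_columns=False):
    """How many ranges to scan in parallel so that one round of queries
    likely returns max_results rows. Clamped to [1, num_ranges]."""
    factor = max_results // (estimated_keys_per_range + 1)
    if factor <= 0 or max_is_columns:
        return 1          # dense index (or column-limited query): scan sequentially
    return min(factor, num_ranges)

# plenty of matching keys per range: one range at a time is enough
assert concurrency_factor(100, 1000, 8) == 1
# sparse index: fan out, but never past the number of ranges
assert concurrency_factor(100, 4, 8) == 8
```

If the estimate is wrong, the caller still loops and issues further rounds until it has enough rows or has visited every range, as the description above notes.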
[jira] [Updated] (CASSANDRA-4292) Improve JBOD loadbalancing and reduce contention
[ https://issues.apache.org/jira/browse/CASSANDRA-4292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-4292: -- Summary: Improve JBOD loadbalancing and reduce contention (was: Per-disk I/O queues) Improve JBOD loadbalancing and reduce contention Key: CASSANDRA-4292 URL: https://issues.apache.org/jira/browse/CASSANDRA-4292 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Jonathan Ellis Assignee: Yuki Morishita Fix For: 1.2.0 Attachments: 4292.txt, 4292-v2.txt, 4292-v3.txt As noted in CASSANDRA-809, we have a certain amount of flush (and compaction) threads, which mix and match disk volumes indiscriminately. It may be worth creating a tight thread - disk affinity, to prevent unnecessary conflict at that level. OTOH as SSDs become more prevalent this becomes a non-issue. Unclear how much pain this actually causes in practice in the meantime. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-3783) Add 'null' support to CQL 3.0
[ https://issues.apache.org/jira/browse/CASSANDRA-3783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-3783: -- Fix Version/s: (was: 1.2.0) 1.2.1 Add 'null' support to CQL 3.0 - Key: CASSANDRA-3783 URL: https://issues.apache.org/jira/browse/CASSANDRA-3783 Project: Cassandra Issue Type: Sub-task Components: API Reporter: Sylvain Lebresne Priority: Minor Labels: cql3 Fix For: 1.2.1 Dense composite supports adding records where only a prefix of all the components specifying the key is defined. In other words, with:
{noformat}
CREATE TABLE connections (
    userid int,
    ip text,
    port int,
    protocol text,
    time timestamp,
    PRIMARY KEY (userid, ip, port, protocol)
) WITH COMPACT STORAGE
{noformat}
you can insert
{noformat}
INSERT INTO connections (userid, ip, port, time) VALUES (2, '192.168.0.1', 80, 123456789);
{noformat}
You cannot however select that column specifically (i.e. without selecting column (2, '192.168.0.1', 80, 'http') for instance). This ticket proposes to allow that through 'null', i.e. to allow
{noformat}
SELECT * FROM connections WHERE userid = 2 AND ip = '192.168.0.1' AND port = 80 AND protocol = null;
{noformat}
It would then also make sense to support:
{noformat}
INSERT INTO connections (userid, ip, port, protocol, time) VALUES (2, '192.168.0.1', 80, null, 123456789);
{noformat}
as an equivalent to the insert query above. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
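Under this proposal, `protocol = null` would match only the record whose clustering prefix stops before the protocol component. A toy model of that matching rule (pure Python, column names from the example above; this illustrates the proposed semantics, not any Cassandra internals):

```python
def matches(row, criteria):
    """row: dict of the components actually present in a record.
    criteria: dict where a value of None means 'this component must be
    absent' (the proposed `= null` semantics)."""
    for col, want in criteria.items():
        if want is None:
            if col in row:       # component present, so `= null` fails
                return False
        elif row.get(col) != want:
            return False
    return True

rows = [
    {"userid": 2, "ip": "192.168.0.1", "port": 80},                      # prefix-only insert
    {"userid": 2, "ip": "192.168.0.1", "port": 80, "protocol": "http"},  # full key
]
hits = [r for r in rows if matches(r, {"userid": 2, "ip": "192.168.0.1",
                                       "port": 80, "protocol": None})]
# only the prefix-only record is selected
assert hits == [{"userid": 2, "ip": "192.168.0.1", "port": 80}]
```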
[jira] [Updated] (CASSANDRA-3920) tests for cqlsh
[ https://issues.apache.org/jira/browse/CASSANDRA-3920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-3920: -- Fix Version/s: (was: 1.2.0) tests for cqlsh --- Key: CASSANDRA-3920 URL: https://issues.apache.org/jira/browse/CASSANDRA-3920 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: paul cannon Assignee: paul cannon Priority: Minor Labels: cqlsh Cqlsh has become big enough and tries to cover enough situations that it's time to start acting like a responsible adult and make this bugger some unit tests to guard against regressions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-4430) optional pluggable o.a.c.metrics reporters
[ https://issues.apache.org/jira/browse/CASSANDRA-4430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-4430: -- Fix Version/s: (was: 1.2.0) optional pluggable o.a.c.metrics reporters -- Key: CASSANDRA-4430 URL: https://issues.apache.org/jira/browse/CASSANDRA-4430 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Chris Burroughs Priority: Minor CASSANDRA-4009 expanded the use of the metrics library, which has a set of reporter modules: http://metrics.codahale.com/manual/core/#reporters You can report to flat files, ganglia, spit everything over http, etc. The next step is a mechanism for using those reporters with o.a.c.metrics. To avoid bundling everything, I suggest following the mx4j approach of enabling only if on the classpath, coupled with a reporter configuration file. Strawman file:
{noformat}
console:
  time: 1
  timeunit: seconds
csv:
  - time: 1
    timeunit: minutes
    file: foo.csv
  - time: 10
    timeunit: seconds
    file: bar.csv
ganglia:
  - time: 30
    timeunit: seconds
    host: server-1
    port: 8649
  - time: 30
    timeunit: seconds
    host: server-2
    port: 8649
{noformat}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
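A loader for such a file would mostly be schema validation plus dispatch on reporter type. A rough sketch of the validation half, with the reporter names and required keys taken from the strawman above (everything else, including the function name, is hypothetical):

```python
REQUIRED = {
    "console": {"time", "timeunit"},
    "csv": {"time", "timeunit", "file"},
    "ganglia": {"time", "timeunit", "host", "port"},
}

def validate(config):
    """config: dict mapping reporter type to one entry (a dict) or a
    list of entries, as in the strawman YAML. Returns a flat list of
    (reporter_type, entry) pairs; raises on unknown types or missing keys."""
    plans = []
    for kind, entries in config.items():
        if kind not in REQUIRED:
            raise ValueError(f"unknown reporter type: {kind}")
        if isinstance(entries, dict):   # single-entry shorthand (console)
            entries = [entries]
        for entry in entries:
            missing = REQUIRED[kind] - entry.keys()
            if missing:
                raise ValueError(f"{kind} entry missing {sorted(missing)}")
            plans.append((kind, entry))
    return plans

cfg = {"console": {"time": 1, "timeunit": "seconds"},
       "ganglia": [{"time": 30, "timeunit": "seconds",
                    "host": "server-1", "port": 8649}]}
assert len(validate(cfg)) == 2
```

Each validated `(kind, entry)` pair would then map onto the corresponding metrics reporter's enable call, only if that reporter class is on the classpath.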
[jira] [Updated] (CASSANDRA-4316) Compaction Throttle too bursty with large rows
[ https://issues.apache.org/jira/browse/CASSANDRA-4316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-4316: -- Fix Version/s: (was: 1.2.0) 1.2.1 Compaction Throttle too bursty with large rows -- Key: CASSANDRA-4316 URL: https://issues.apache.org/jira/browse/CASSANDRA-4316 Project: Cassandra Issue Type: Improvement Components: Core Affects Versions: 0.8.0 Reporter: Wayne Lewis Assignee: Yuki Morishita Priority: Minor Fix For: 1.2.1 In org.apache.cassandra.db.compaction.CompactionIterable the check for compaction throttling occurs once every 1000 rows. In our workload this interval is much too large, as we have many large rows (16 - 100 MB). With 100 MB rows, about 100 GB can be read (and possibly written) before the compaction throttle sleeps. This causes bursts of essentially unthrottled compaction IO followed by a long sleep, which yields inconsistent performance and high error rates during the bursts. We applied a workaround to check the throttle every row, which solved our performance and error issues: line 116 in org.apache.cassandra.db.compaction.CompactionIterable: if ((row++ % 1000) == 0) replaced with if ((row++ % 1) == 0) I think the better solution is to calculate how often the throttle should be checked based on the throttle rate, to apply sleeps more consistently. E.g. if 16 MB/sec is the limit, then check for sleep after every 16 MB is read, so sleeps are spaced out about every second. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
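The suggested fix is to derive the check interval from the rate rather than from a fixed row count: check once per throttle-rate's worth of bytes, so sleeps land roughly one second apart regardless of row size. A sketch of that arithmetic (function names hypothetical):

```python
def bytes_between_checks(throttle_bytes_per_sec, target_interval_secs=1.0):
    """Check the throttle after this many bytes, so that at the target
    rate, sleeps occur roughly every target_interval_secs."""
    return max(1, int(throttle_bytes_per_sec * target_interval_secs))

def should_check(bytes_read_since_check, throttle_bytes_per_sec):
    return bytes_read_since_check >= bytes_between_checks(throttle_bytes_per_sec)

MB = 1 << 20
# at a 16 MB/s limit, check every 16 MB read...
assert bytes_between_checks(16 * MB) == 16 * MB
# ...so one 100 MB row triggers several checks instead of zero
assert should_check(100 * MB, 16 * MB)
```

Compared with the fixed 1000-row interval, this bounds how far compaction can run ahead of its budget to about one second's worth of IO, whether rows are 100 bytes or 100 MB.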
[jira] [Updated] (CASSANDRA-2293) Rewrite nodetool help
[ https://issues.apache.org/jira/browse/CASSANDRA-2293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-2293: -- Fix Version/s: (was: 1.2.0) Rewrite nodetool help - Key: CASSANDRA-2293 URL: https://issues.apache.org/jira/browse/CASSANDRA-2293 Project: Cassandra Issue Type: Improvement Components: Core, Documentation website Affects Versions: 0.8 beta 1 Reporter: Aaron Morton Assignee: David Alves Priority: Minor Once CASSANDRA-2008 is through and we are happy with the approach I would like to write similar help for nodetool. Both command line help of the form nodetool help and nodetool help command. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-4536) Ability for CQL3 to list partition keys
[ https://issues.apache.org/jira/browse/CASSANDRA-4536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-4536: -- Fix Version/s: (was: 1.2.0) 1.2.1 Ability for CQL3 to list partition keys --- Key: CASSANDRA-4536 URL: https://issues.apache.org/jira/browse/CASSANDRA-4536 Project: Cassandra Issue Type: New Feature Components: API Affects Versions: 1.1.0 Reporter: Jonathan Ellis Assignee: Pavel Yaskevich Priority: Minor Labels: cql3 Fix For: 1.2.1 It can be useful to know the set of in-use partition keys (storage engine row keys). One example given to me was where application data was modeled as a few 10s of 1000s of wide rows, where the app required presenting these rows to the user sorted based on information in the partition key. The partition count is small enough to do the sort client-side in memory, which is what the app did with the Thrift API--a range slice with an empty columns list. This was a problem when migrating to CQL3. {{SELECT mykey FROM mytable}} includes all the logical rows, which makes the resultset too large to make this a reasonable approach, even with paging. One way to add support would be to allow DISTINCT in the special case of {{SELECT DISTINCT mykey FROM mytable}}. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
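Until such a DISTINCT exists, the client must collapse the oversized resultset itself: order-preserving deduplication of the partition-key column from `SELECT mykey FROM mytable`. A sketch of the client-side fallback:

```python
def distinct_keys(rows):
    """Collapse logical CQL3 rows down to unique partition keys,
    preserving the order in which each key first appears (what a
    server-side SELECT DISTINCT mykey would return)."""
    seen = set()
    out = []
    for key in rows:
        if key not in seen:
            seen.add(key)
            out.append(key)
    return out

# one wide storage row expands to many logical rows in CQL3
assert distinct_keys(["k1", "k1", "k2", "k1", "k3"]) == ["k1", "k2", "k3"]
```

The problem described above is that the input here can be enormous (every logical row) even when the output is only a few tens of thousands of keys, which is why pushing DISTINCT to the server matters.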
[jira] [Resolved] (CASSANDRA-2109) Improve default window size for DES
[ https://issues.apache.org/jira/browse/CASSANDRA-2109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams resolved CASSANDRA-2109. - Resolution: Duplicate Fix Version/s: (was: 1.3) 1.2.0 Resolved by CASSANDRA-4038 Improve default window size for DES --- Key: CASSANDRA-2109 URL: https://issues.apache.org/jira/browse/CASSANDRA-2109 Project: Cassandra Issue Type: Improvement Reporter: Stu Hood Priority: Minor Labels: des Fix For: 1.2.0 The window size for DES is currently hardcoded at 100 requests. A larger window means that it takes longer to react to a suddenly slow node, but that you have a smoother transition for scores. An example of bad behaviour: with a window of size 100, we saw a case with a failing node where if enough requests could be answered quickly out of cache or bloomfilters, the window might be momentarily filled with 10 ms requests, pushing out requests that had to go disk and took 10 seconds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
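The failure mode described above is easy to model: with a fixed-size score window, enough fast responses evict the slow ones entirely. A toy reproduction using a bounded deque (window size 100, as in the DES default this ticket complains about):

```python
from collections import deque

def windowed_mean(samples, window=100):
    """Mean latency over only the last `window` samples, as a
    fixed-size score window would compute it."""
    w = deque(maxlen=window)   # old samples fall off the front
    w.extend(samples)
    return sum(w) / len(w)

# a failing node that still answers some requests from cache/bloom filters:
# 20 disk reads at 10 seconds, then 100 cached responses at 10 ms
samples = [10_000] * 20 + [10] * 100
# the 100 fast samples completely push the 10-second reads out of the
# window, so the node momentarily scores as perfectly healthy
assert windowed_mean(samples) == 10
```

A larger window keeps the slow samples in view longer (at the cost of reacting more slowly to genuine changes), which is the trade-off the ticket describes.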
[jira] [Commented] (CASSANDRA-2710) Get multiple column ranges
[ https://issues.apache.org/jira/browse/CASSANDRA-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13436274#comment-13436274 ] Vijay commented on CASSANDRA-2710: -- multiget-within-a-single-row still has all the problems of multiget-across-rows, with the added problem that it doesn't parallelize across machines. Well multiget-within-a-single-row is suppose to be one sequential IO (hence more throughput at-least for the best case), and the co-ordinator doesn't need to wait for the slowest responding node (more transient memory in the co-ordinator) etc. Get multiple column ranges -- Key: CASSANDRA-2710 URL: https://issues.apache.org/jira/browse/CASSANDRA-2710 Project: Cassandra Issue Type: Sub-task Components: API, Core Reporter: David Boxenhorn Assignee: Vijay Labels: compositeColumns, cql Attachments: 0001-2710-multiple-column-ranges-cql.patch, 0001-2710-multiple-column-ranges-thrift.patch I have replaced all my super column families with regular column families using composite columns. I have easily been able to support all previous functionality (I don't need range delete) except for one thing: getting multiple super columns with a single access. For this, I would need to get multiple ranges. (I can get multiple columns, or a single range, but not multiple ranges.) For example, I used to have [superColumnName1,subColumnName1..N],[superColumnName2,subColumnName1..N] and I could get superColumnName1, superColumnName2 Now I have [lensuperColumnName10lensubColumnName1..lensuperColumnName10lensubColumnNameN],[lensuperColumnName20lensubColumnName1..lensuperColumnName20lensubColumnNameN] and I need to get superColumnName1..superColumnName1+, superColumnName2..superColumnName2+ to get the same functionality I would like the clients to support this functionality, e.g. Hector to have .setRages parallel to .setColumnNames and for CQL to support a syntax like SELECT [FIRST N] [REVERSED] name1..nameN1, name2..nameN2... FROM ... 
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2710) Get multiple column ranges
[ https://issues.apache.org/jira/browse/CASSANDRA-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13436276#comment-13436276 ] Jonathan Ellis commented on CASSANDRA-2710: --- bq. multiget-within-a-single-row is suppose to be one sequential IO only if the row is small enough, in which case, is there that much benefit over just grabbing the whole row? Get multiple column ranges -- Key: CASSANDRA-2710 URL: https://issues.apache.org/jira/browse/CASSANDRA-2710 Project: Cassandra Issue Type: Sub-task Components: API, Core Reporter: David Boxenhorn Assignee: Vijay Labels: compositeColumns, cql Attachments: 0001-2710-multiple-column-ranges-cql.patch, 0001-2710-multiple-column-ranges-thrift.patch I have replaced all my super column families with regular column families using composite columns. I have easily been able to support all previous functionality (I don't need range delete) except for one thing: getting multiple super columns with a single access. For this, I would need to get multiple ranges. (I can get multiple columns, or a single range, but not multiple ranges.) For example, I used to have [superColumnName1,subColumnName1..N],[superColumnName2,subColumnName1..N] and I could get superColumnName1, superColumnName2 Now I have [lensuperColumnName10lensubColumnName1..lensuperColumnName10lensubColumnNameN],[lensuperColumnName20lensubColumnName1..lensuperColumnName20lensubColumnNameN] and I need to get superColumnName1..superColumnName1+, superColumnName2..superColumnName2+ to get the same functionality I would like the clients to support this functionality, e.g. Hector to have .setRages parallel to .setColumnNames and for CQL to support a syntax like SELECT [FIRST N] [REVERSED] name1..nameN1, name2..nameN2... FROM ... -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2710) Get multiple column ranges
[ https://issues.apache.org/jira/browse/CASSANDRA-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13436284#comment-13436284 ] Vijay commented on CASSANDRA-2710: -- is there that much benefit over just grabbing the whole row? Sure but then we have to stream all the columns into the client which can be wasteful too... I am fine either ways, nice to have. Get multiple column ranges -- Key: CASSANDRA-2710 URL: https://issues.apache.org/jira/browse/CASSANDRA-2710 Project: Cassandra Issue Type: Sub-task Components: API, Core Reporter: David Boxenhorn Assignee: Vijay Labels: compositeColumns, cql Attachments: 0001-2710-multiple-column-ranges-cql.patch, 0001-2710-multiple-column-ranges-thrift.patch I have replaced all my super column families with regular column families using composite columns. I have easily been able to support all previous functionality (I don't need range delete) except for one thing: getting multiple super columns with a single access. For this, I would need to get multiple ranges. (I can get multiple columns, or a single range, but not multiple ranges.) For example, I used to have [superColumnName1,subColumnName1..N],[superColumnName2,subColumnName1..N] and I could get superColumnName1, superColumnName2 Now I have [lensuperColumnName10lensubColumnName1..lensuperColumnName10lensubColumnNameN],[lensuperColumnName20lensubColumnName1..lensuperColumnName20lensubColumnNameN] and I need to get superColumnName1..superColumnName1+, superColumnName2..superColumnName2+ to get the same functionality I would like the clients to support this functionality, e.g. Hector to have .setRages parallel to .setColumnNames and for CQL to support a syntax like SELECT [FIRST N] [REVERSED] name1..nameN1, name2..nameN2... FROM ... -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-1337) parallelize fetching rows for low-cardinality indexes
[ https://issues.apache.org/jira/browse/CASSANDRA-1337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13436286#comment-13436286 ] Jonathan Ellis commented on CASSANDRA-1337: --- Reverted the three commits here in 0bfea6f678034c54d64c0c613f758de02d266415 and bumped to 1.2.1 since David may not have time to get back to this before 1.2.0 freeze. parallelize fetching rows for low-cardinality indexes - Key: CASSANDRA-1337 URL: https://issues.apache.org/jira/browse/CASSANDRA-1337 Project: Cassandra Issue Type: Improvement Reporter: Jonathan Ellis Assignee: David Alves Priority: Minor Fix For: 1.2.1 Attachments: 0001-CASSANDRA-1337-scan-concurrently-depending-on-num-rows.txt, 1137-bugfix.patch, CASSANDRA-1337.patch Original Estimate: 8h Remaining Estimate: 8h currently, we read the indexed rows from the first node (in partitioner order); if that does not have enough matching rows, we read the rows from the next, and so forth. we should use the statistics fom CASSANDRA-1155 to query multiple nodes in parallel, such that we have a high chance of getting enough rows w/o having to do another round of queries (but, if our estimate is incorrect, we do need to loop and do more rounds until we have enough data or we have fetched from each node). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (CASSANDRA-4551) Nodetool getendpoints keys do not work with ASCII, key_validation=ascii value of key = a test no delimiter
Mark Valdez created CASSANDRA-4551: -- Summary: Nodetool getendpoints keys do not work with ASCII, key_validation=ascii value of key = a test no delimiter Key: CASSANDRA-4551 URL: https://issues.apache.org/jira/browse/CASSANDRA-4551 Project: Cassandra Issue Type: Bug Components: Tools Affects Versions: 1.0.9 Reporter: Mark Valdez Nodetool getendpoints keys do not work with ASCII, key_validation=ascii value of key = a test no delimiter tried to escape key = a test with double and single quotes. It doesn't work. It just reiterates the format of the tool's command: getendpoints requires ks, cf and key args -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[1/2] git commit: run local range scans on the read stage patch by jbellis; reviewed by vijay for CASSANDRA-3687
Updated Branches: refs/heads/trunk fe784f58e -> 5577ff626

run local range scans on the read stage patch by jbellis; reviewed by vijay for CASSANDRA-3687

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5577ff62
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5577ff62
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5577ff62

Branch: refs/heads/trunk
Commit: 5577ff626bb38d419a3540e0c0ccb1a9d8b8680f
Parents: 29fed1f
Author: Jonathan Ellis jbel...@apache.org
Authored: Thu Aug 16 15:43:02 2012 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Thu Aug 16 15:43:02 2012 -0500
--
 CHANGES.txt                                        |  1 +
 .../cassandra/service/AbstractRowResolver.java     | 11 --
 .../org/apache/cassandra/service/ReadCallback.java | 27 ++---
 .../org/apache/cassandra/service/StorageProxy.java | 91 ---
 4 files changed, 59 insertions(+), 71 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/5577ff62/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 5a2848d..75de54e 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 1.2-dev
+ * run local range scans on the read stage (CASSANDRA-3687)
  * clean up ioexceptions (CASSANDRA-2116)
  * Introduce new json format with row level deletion (CASSANDRA-4054)
  * remove redundant name column from schema_keyspaces (CASSANDRA-4433)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/5577ff62/src/java/org/apache/cassandra/service/AbstractRowResolver.java
--
diff --git a/src/java/org/apache/cassandra/service/AbstractRowResolver.java b/src/java/org/apache/cassandra/service/AbstractRowResolver.java
index b1647a2..beaf73c 100644
--- a/src/java/org/apache/cassandra/service/AbstractRowResolver.java
+++ b/src/java/org/apache/cassandra/service/AbstractRowResolver.java
@@ -51,17 +51,6 @@ public abstract class AbstractRowResolver implements IResponseResolver<ReadResponse>
         replies.add(message);
     }
-    /** hack so local reads don't force de/serialization of an extra real Message */
-    public void injectPreProcessed(ReadResponse result)
-    {
-        MessageIn<ReadResponse> message = MessageIn.create(FBUtilities.getBroadcastAddress(),
-                                                           result,
-                                                           Collections.<String, byte[]>emptyMap(),
-                                                           MessagingService.Verb.INTERNAL_RESPONSE,
-                                                           MessagingService.current_version);
-        replies.add(message);
-    }
-
     public Iterable<MessageIn<ReadResponse>> getMessages()
     {
         return replies;

http://git-wip-us.apache.org/repos/asf/cassandra/blob/5577ff62/src/java/org/apache/cassandra/service/ReadCallback.java
--
diff --git a/src/java/org/apache/cassandra/service/ReadCallback.java b/src/java/org/apache/cassandra/service/ReadCallback.java
index a3d273c..bfd0044 100644
--- a/src/java/org/apache/cassandra/service/ReadCallback.java
+++ b/src/java/org/apache/cassandra/service/ReadCallback.java
@@ -19,6 +19,7 @@ package org.apache.cassandra.service;
 import java.io.IOException;
 import java.net.InetAddress;
+import java.util.Collections;
 import java.util.List;
 import java.util.concurrent.TimeUnit;
 import java.util.concurrent.TimeoutException;
@@ -165,32 +166,20 @@ public class ReadCallback<TMessage, TResolved> implements IAsyncCallback<TMessage>
     /**
      * @return true if the message counts towards the blockfor threshold
-     * TODO turn the Message into a response so we don't need two versions of this method
      */
     protected boolean waitingFor(MessageIn message)
     {
         return true;
     }
-    /**
-     * @return true if the response counts towards the blockfor threshold
-     */
-    protected boolean waitingFor(ReadResponse response)
+    public void response(TMessage result)
     {
-        return true;
-    }
-
-    public void response(ReadResponse result)
-    {
-        ((RowDigestResolver) resolver).injectPreProcessed(result);
-        int n = waitingFor(result)
-              ? received.incrementAndGet()
-              : received.get();
-        if (n >= blockfor && resolver.isDataPresent())
-        {
-            condition.signal();
-            maybeResolveForRepair();
-        }
+        MessageIn<TMessage> message = MessageIn.create(FBUtilities.getBroadcastAddress(),
+                                                       result,
+
[2/2] git commit: revert CASSANDRA-1337 comprising commits ef23335, f17fbac, 9cf915f.
revert CASSANDRA-1337 comprising commits ef23335, f17fbac, 9cf915f. Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/29fed1f1 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/29fed1f1 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/29fed1f1 Branch: refs/heads/trunk Commit: 29fed1f18188cfcd71c817db394c1087e0698dbd Parents: fe784f5 Author: Jonathan Ellis jbel...@apache.org Authored: Thu Aug 16 15:42:30 2012 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Thu Aug 16 15:42:30 2012 -0500 -- .../org/apache/cassandra/service/StorageProxy.java | 60 -- 1 files changed, 17 insertions(+), 43 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/29fed1f1/src/java/org/apache/cassandra/service/StorageProxy.java -- diff --git a/src/java/org/apache/cassandra/service/StorageProxy.java b/src/java/org/apache/cassandra/service/StorageProxy.java index 8d0e0b3..9d55739 100644 --- a/src/java/org/apache/cassandra/service/StorageProxy.java +++ b/src/java/org/apache/cassandra/service/StorageProxy.java @@ -853,23 +853,6 @@ public class StorageProxy implements StorageProxyMBean int columnsCount = 0; rows = new ArrayListRow(); ListAbstractBoundsRowPosition ranges = getRestrictedRanges(command.range); - -// get the cardinality of this index based on row count -// use this info to decide how many scans to do in parallel -Table table = Table.open(command.keyspace); -long estimatedKeysPerRange = table.getColumnFamilyStore(command.column_family) -.estimateKeys() / table.getReplicationStrategy().getReplicationFactor(); - -int concurrencyFactor = (int) (command.maxResults / (estimatedKeysPerRange + 1)); -if (concurrencyFactor = 0 || command.maxIsColumns) -concurrencyFactor = 1; -else if (concurrencyFactor ranges.size()) -concurrencyFactor = ranges.size(); - -// parallel scan handlers -ListReadCallbackRangeSliceReply, IterableRow scanHandlers = new 
ArrayListReadCallbackRangeSliceReply, IterableRow(concurrencyFactor); - -int parallelHandlers = concurrencyFactor; for (AbstractBoundsRowPosition range : ranges) { RangeSliceCommand nodeCmd = new RangeSliceCommand(command.keyspace, @@ -904,7 +887,6 @@ public class StorageProxy implements StorageProxyMBean { throw new AssertionError(e); } -parallelHandlers--; } else { @@ -921,36 +903,28 @@ public class StorageProxy implements StorageProxyMBean logger.debug(reading + nodeCmd + from + endpoint); } -scanHandlers.add(handler); - -if (scanHandlers.size() = parallelHandlers) +try { -for (ReadCallbackRangeSliceReply, IterableRow scanHandler : scanHandlers) +for (Row row : handler.get()) { -try -{ -for (Row row : scanHandler.get()) -{ -rows.add(row); -columnsCount += row.getLiveColumnCount(); -logger.debug(range slices read {}, row.key); -} - FBUtilities.waitOnFutures(resolver.repairResults, DatabaseDescriptor.getRangeRpcTimeout()); -} -catch (TimeoutException ex) -{ -if (logger.isDebugEnabled()) -logger.debug(Range slice timeout: {}, ex.toString()); -throw ex; -} -catch (DigestMismatchException e) -{ -throw new AssertionError(e); // no digests in range slices yet -} +rows.add(row); +columnsCount += row.getLiveColumnCount(); +logger.debug(range slices read {}, row.key); } -scanHandlers.clear(); //go back for more +FBUtilities.waitOnFutures(resolver.repairResults,
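[Editor's note] The reverted CASSANDRA-1337 logic sized its parallel range scans from an estimated key count per range. A minimal sketch of that heuristic (function and parameter names are illustrative, not the actual StorageProxy code):

```python
def concurrency_factor(max_results, estimated_keys_per_range, num_ranges,
                       max_is_columns=False):
    # Estimate how many ranges must be scanned to satisfy max_results,
    # then clamp the result to [1, num_ranges].
    factor = max_results // (estimated_keys_per_range + 1)
    if factor <= 0 or max_is_columns:
        factor = 1              # fall back to scanning one range at a time
    elif factor > num_ranges:
        factor = num_ranges     # no point in more handlers than ranges
    return factor
```

For example, a 1000-row limit over ranges estimated at ~100 keys each yields a factor of 9 concurrent scans; the revert returns to strictly sequential scanning.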
[jira] [Assigned] (CASSANDRA-3237) refactor super column implementation to use composite column names instead
[ https://issues.apache.org/jira/browse/CASSANDRA-3237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vijay reassigned CASSANDRA-3237: Assignee: Vijay refactor super column implementation to use composite column names instead - Key: CASSANDRA-3237 URL: https://issues.apache.org/jira/browse/CASSANDRA-3237 Project: Cassandra Issue Type: Improvement Reporter: Matthew F. Dennis Assignee: Vijay Priority: Minor Labels: ponies Fix For: 1.3 Attachments: cassandra-supercolumn-irc.log super columns are annoying. composite columns offer a better API and performance. people should use composites over super columns. some people are already using super columns. C* should implement the super column API in terms of composites to reduce code, complexity and testing as well as increase performance. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-4481) Commitlog not replayed after restart - data lost
[ https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13436346#comment-13436346 ] Ivo Meißner commented on CASSANDRA-4481: I have also created the broken keyspace with a version prior to 1.1.2 (I'm pretty sure it was 1.1.1). So maybe there is a commitlog incompatibility... I also ran into some schema changing issues with that keyspace. Maybe I destroyed the keyspace structure. But it would be nice to get some kind of error message if something goes wrong with the commitlogs. Everything else seems to work with the keyspace. You really don't notice until you wonder where the data is... Commitlog not replayed after restart - data lost Key: CASSANDRA-4481 URL: https://issues.apache.org/jira/browse/CASSANDRA-4481 Project: Cassandra Issue Type: Bug Affects Versions: 1.1.2 Environment: Single node cluster on 64Bit CentOS Reporter: Ivo Meißner Priority: Critical When data is written to the commitlog and I restart the machine, all committed data is lost that has not been flushed to disk. In the startup logs it says that it replays the commitlog successfully, but the data is not available then. When I open the commitlog file in an editor I can see the added data, but after the restart it cannot be fetched from cassandra. {code} INFO 09:59:45,362 Replaying /var/myproject/cassandra/commitlog/CommitLog-83203377067.log INFO 09:59:45,476 Finished reading /var/myproject/cassandra/commitlog/CommitLog-83203377067.log INFO 09:59:45,476 Log replay complete, 0 replayed mutations {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-4481) Commitlog not replayed after restart - data lost
[ https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13436350#comment-13436350 ] Jonathan Ellis commented on CASSANDRA-4481: --- But this is exactly the situation if you dropped the keyspace on purpose: commitlog will have data for CFs that don't exist anymore. Not a good idea to panic users when things are working as designed. Commitlog not replayed after restart - data lost Key: CASSANDRA-4481 URL: https://issues.apache.org/jira/browse/CASSANDRA-4481 Project: Cassandra Issue Type: Bug Affects Versions: 1.1.2 Environment: Single node cluster on 64Bit CentOS Reporter: Ivo Meißner Priority: Critical When data is written to the commitlog and I restart the machine, all committed data is lost that has not been flushed to disk. In the startup logs it says that it replays the commitlog successfully, but the data is not available then. When I open the commitlog file in an editor I can see the added data, but after the restart it cannot be fetched from cassandra. {code} INFO 09:59:45,362 Replaying /var/myproject/cassandra/commitlog/CommitLog-83203377067.log INFO 09:59:45,476 Finished reading /var/myproject/cassandra/commitlog/CommitLog-83203377067.log INFO 09:59:45,476 Log replay complete, 0 replayed mutations {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
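[Editor's note] For context on the comment above: replay skips mutations whose column family no longer exists in the schema, so a commitlog full of data for a dropped keyspace legitimately reports zero replayed mutations. A simplified sketch of that filtering (illustrative shape, not the actual CommitLogReplayer code):

```python
def replay(log_entries, live_cf_ids):
    # Each log entry pairs a column-family id with a serialized mutation.
    # Entries whose id is missing from the current schema are skipped
    # silently, which is the "working as designed" behavior described above.
    replayed = 0
    for cf_id, mutation in log_entries:
        if cf_id not in live_cf_ids:
            continue  # CF dropped (or schema lost) -> nothing to apply
        replayed += 1  # a real replayer would apply the mutation here
    return replayed
```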
[jira] [Created] (CASSANDRA-4552) cqlsh doesn't handle Int32Type when fully-qualified package name is present
Kirk True created CASSANDRA-4552: Summary: cqlsh doesn't handle Int32Type when fully-qualified package name is present Key: CASSANDRA-4552 URL: https://issues.apache.org/jira/browse/CASSANDRA-4552 Project: Cassandra Issue Type: Bug Components: Tools Affects Versions: 1.2.0 Environment: Today's (08/16/2012) trunk. Reporter: Kirk True Assignee: Kirk True Fix For: 1.2.0 Steps to reproduce: 1. Start Cassandra 2. Start cqlsh: {{./bin/cqlsh -3 --debug}} 3. Execute these statements: {noformat} create keyspace foo with strategy_class = 'SimpleStrategy' and strategy_options:replication_factor=1; use foo; create table bar ( a int, b int, primary key (a) ); insert into bar (a, b) values (1, 1); select * from bar; {noformat} Expected: to see my row results Actual: I see this error: {noformat} Traceback (most recent call last): File ./bin/cqlsh, line 926, in onecmd self.handle_statement(st, statementtext) File ./bin/cqlsh, line 954, in handle_statement return custom_handler(parsed) File ./bin/cqlsh, line 1015, in do_select self.perform_statement(parsed.extract_orig(), decoder=decoder) File ./bin/cqlsh, line 1042, in perform_statement self.print_result(self.cursor) File ./bin/cqlsh, line 1096, in print_result self.print_static_result(cursor) File ./bin/cqlsh, line 1112, in print_static_result formatted_data = [map(self.myformat_value, row, coltypes) for row in cursor] File ./bin/cqlsh, line 622, in myformat_value float_precision=self.display_float_precision, **kwargs) File ./bin/cqlsh, line 504, in format_value escapedval = val.replace('\\', '') AttributeError: 'int' object has no attribute 'replace' {noformat} This is similar to CASSANDRA-4083 in terms of the error message, but may be of a different cause. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-4552) cqlsh doesn't handle Int32Type when fully-qualified package name is present
[ https://issues.apache.org/jira/browse/CASSANDRA-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated CASSANDRA-4552: - Attachment: trunk-4552.txt The value of the cqlsh Python script's {{casstype}} is in some cases the fully-qualified package name. In my case it was {{org.apache.cassandra.db.marshal.Int32Type}} while it appears the code is expecting it to be simply {{Int32Type}}. I don't think this is the right fix, but it's a start and it unblocks me :) cqlsh doesn't handle Int32Type when fully-qualified package name is present --- Key: CASSANDRA-4552 URL: https://issues.apache.org/jira/browse/CASSANDRA-4552 Project: Cassandra Issue Type: Bug Components: Tools Affects Versions: 1.2.0 Environment: Today's (08/16/2012) trunk. Reporter: Kirk True Assignee: Kirk True Fix For: 1.2.0 Attachments: trunk-4552.txt Steps to reproduce: 1. Start Cassandra 2. Start cqlsh: {{./bin/cqlsh -3 --debug}} 3. Execute these statements: {noformat} create keyspace foo with strategy_class = 'SimpleStrategy' and strategy_options:replication_factor=1; use foo; create table bar ( a int, b int, primary key (a) ); insert into bar (a, b) values (1, 1); select * from bar; {noformat} Expected: to see my row results Actual: I see this error: {noformat} Traceback (most recent call last): File ./bin/cqlsh, line 926, in onecmd self.handle_statement(st, statementtext) File ./bin/cqlsh, line 954, in handle_statement return custom_handler(parsed) File ./bin/cqlsh, line 1015, in do_select self.perform_statement(parsed.extract_orig(), decoder=decoder) File ./bin/cqlsh, line 1042, in perform_statement self.print_result(self.cursor) File ./bin/cqlsh, line 1096, in print_result self.print_static_result(cursor) File ./bin/cqlsh, line 1112, in print_static_result formatted_data = [map(self.myformat_value, row, coltypes) for row in cursor] File ./bin/cqlsh, line 622, in myformat_value float_precision=self.display_float_precision, **kwargs) File ./bin/cqlsh, line 504, in 
format_value escapedval = val.replace('\\', '') AttributeError: 'int' object has no attribute 'replace' {noformat} This is similar to CASSANDRA-4083 in terms of the error message, but may be of a different cause. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (CASSANDRA-4552) cqlsh doesn't handle Int32Type when fully-qualified package name is present
[ https://issues.apache.org/jira/browse/CASSANDRA-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True resolved CASSANDRA-4552. -- Resolution: Duplicate Marking as a duplicate of CASSANDRA-4546. cqlsh doesn't handle Int32Type when fully-qualified package name is present --- Key: CASSANDRA-4552 URL: https://issues.apache.org/jira/browse/CASSANDRA-4552 Project: Cassandra Issue Type: Bug Components: Tools Affects Versions: 1.2.0 Environment: Today's (08/16/2012) trunk. Reporter: Kirk True Assignee: Kirk True Fix For: 1.2.0 Attachments: trunk-4552.txt Steps to reproduce: 1. Start Cassandra 2. Start cqlsh: {{./bin/cqlsh -3 --debug}} 3. Execute these statements: {noformat} create keyspace foo with strategy_class = 'SimpleStrategy' and strategy_options:replication_factor=1; use foo; create table bar ( a int, b int, primary key (a) ); insert into bar (a, b) values (1, 1); select * from bar; {noformat} Expected: to see my row results Actual: I see this error: {noformat} Traceback (most recent call last): File ./bin/cqlsh, line 926, in onecmd self.handle_statement(st, statementtext) File ./bin/cqlsh, line 954, in handle_statement return custom_handler(parsed) File ./bin/cqlsh, line 1015, in do_select self.perform_statement(parsed.extract_orig(), decoder=decoder) File ./bin/cqlsh, line 1042, in perform_statement self.print_result(self.cursor) File ./bin/cqlsh, line 1096, in print_result self.print_static_result(cursor) File ./bin/cqlsh, line 1112, in print_static_result formatted_data = [map(self.myformat_value, row, coltypes) for row in cursor] File ./bin/cqlsh, line 622, in myformat_value float_precision=self.display_float_precision, **kwargs) File ./bin/cqlsh, line 504, in format_value escapedval = val.replace('\\', '') AttributeError: 'int' object has no attribute 'replace' {noformat} This is similar to CASSANDRA-4083 in terms of the error message, but may be of a different cause. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
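[Editor's note] The traceback boils down to cqlsh treating an already-deserialized int as a string. A stripped-down reproduction of the failure mode (illustrative only; the real format_value handles many more types):

```python
def format_value(val, casstype):
    # With the short type name, the value is recognized as a number...
    if casstype == 'Int32Type':
        return str(val)
    # ...but a fully qualified name falls through to the string branch,
    # where ints blow up because they have no .replace() method.
    return val.replace('\\', '\\\\')
```

Here format_value(1, 'Int32Type') returns '1', while format_value(1, 'org.apache.cassandra.db.marshal.Int32Type') raises the AttributeError in the traceback.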
[jira] [Commented] (CASSANDRA-4546) cqlsh: handle when full cassandra type class names are given
[ https://issues.apache.org/jira/browse/CASSANDRA-4546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13436359#comment-13436359 ] Kirk True commented on CASSANDRA-4546: -- +1 on the first part of the patch. The third change in the patch _appears_ unrelated to me. Please clarify for my own edification. Thanks. cqlsh: handle when full cassandra type class names are given Key: CASSANDRA-4546 URL: https://issues.apache.org/jira/browse/CASSANDRA-4546 Project: Cassandra Issue Type: Bug Components: Tools Affects Versions: 1.2.0 Reporter: paul cannon Assignee: paul cannon Labels: cqlsh Fix For: 1.2.0 Attachments: 4546.patch.txt When a builtin Cassandra type was being used for data in previous versions of Cassandra, only the short name was sent: UTF8Type, TimeUUIDType, etc. Starting with 1.2, as of CASSANDRA-4453, the full class names are sent. Cqlsh doesn't know how to handle this, and is currently treating all data as if it were an unknown type. This goes as far as to cause an exception when the type is actually a number, because the driver deserializes it right, and then cqlsh tries to use it as a string. Here for googlage: {noformat} AttributeError: 'int' object has no attribute 'replace' {noformat} Fixeries are in order. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-4546) cqlsh: handle when full cassandra type class names are given
[ https://issues.apache.org/jira/browse/CASSANDRA-4546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13436362#comment-13436362 ] paul cannon commented on CASSANDRA-4546: It's not directly related. Just a minor problem with error reporting that came up while I was testing this. cqlsh: handle when full cassandra type class names are given Key: CASSANDRA-4546 URL: https://issues.apache.org/jira/browse/CASSANDRA-4546 Project: Cassandra Issue Type: Bug Components: Tools Affects Versions: 1.2.0 Reporter: paul cannon Assignee: paul cannon Labels: cqlsh Fix For: 1.2.0 Attachments: 4546.patch.txt When a builtin Cassandra type was being used for data in previous versions of Cassandra, only the short name was sent: UTF8Type, TimeUUIDType, etc. Starting with 1.2, as of CASSANDRA-4453, the full class names are sent. Cqlsh doesn't know how to handle this, and is currently treating all data as if it were an unknown type. This goes as far as to cause an exception when the type is actually a number, because the driver deserializes it right, and then cqlsh tries to use it as a string. Here for googlage: {noformat} AttributeError: 'int' object has no attribute 'replace' {noformat} Fixeries are in order. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
git commit: cqlsh: handle fully qualified class names Patch by paul cannon, reviewed by brandonwilliams for CASSANDRA-4546
Updated Branches: refs/heads/trunk 5577ff626 - 7ddb5c7a4 cqlsh: handle fully qualified class names Patch by paul cannon, reviewed by brandonwilliams for CASSANDRA-4546 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7ddb5c7a Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7ddb5c7a Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7ddb5c7a Branch: refs/heads/trunk Commit: 7ddb5c7a477361f1a1dd7e4b7e9613b921e50b5b Parents: 5577ff6 Author: Brandon Williams brandonwilli...@apache.org Authored: Thu Aug 16 17:26:24 2012 -0500 Committer: Brandon Williams brandonwilli...@apache.org Committed: Thu Aug 16 17:26:24 2012 -0500 -- bin/cqlsh |6 -- 1 files changed, 4 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/7ddb5c7a/bin/cqlsh -- diff --git a/bin/cqlsh b/bin/cqlsh index 6b61364..7ea0128 100755 --- a/bin/cqlsh +++ b/bin/cqlsh @@ -457,6 +457,7 @@ def unix_time_from_uuid1(u): def format_value(val, casstype, output_encoding, addcolor=False, time_format='', float_precision=3, colormap=DEFAULT_VALUE_COLORS, nullval='null'): +casstype = trim_if_present(casstype, 'org.apache.cassandra.db.marshal.') color = colormap['default'] coloredval = None displaywidth = None @@ -498,6 +499,7 @@ def format_value(val, casstype, output_encoding, addcolor=False, time_format='', color = colormap['hex'] else: # AsciiType is the only other one known right now, but handle others +val = str(val) escapedval = val.replace('\\', '') bval = controlchars_re.sub(_show_control_chars, escapedval) if addcolor: @@ -775,8 +777,8 @@ class Shell(cmd.Cmd): def get_keyspace(self, ksname): try: return self.make_hacktastic_thrift_call('describe_keyspace', ksname) -except cql.cassandra.ttypes.NotFoundException, e: -raise KeyspaceNotFound('Keyspace %s not found.' % e) +except cql.cassandra.ttypes.NotFoundException: +raise KeyspaceNotFound('Keyspace %r not found.' 
% ksname) def get_keyspaces(self): return self.make_hacktastic_thrift_call('describe_keyspaces')
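[Editor's note] The fix above hinges on normalizing the type name before dispatching on it. A plausible implementation of the trim_if_present helper the patch calls (a sketch; the actual helper is defined elsewhere in bin/cqlsh):

```python
MARSHAL_PKG = 'org.apache.cassandra.db.marshal.'

def trim_if_present(s, prefix):
    # Reduce 'org.apache.cassandra.db.marshal.Int32Type' to 'Int32Type'
    # so the short-name comparisons in format_value keep working; leave
    # strings without the prefix untouched.
    if s.startswith(prefix):
        return s[len(prefix):]
    return s
```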
[jira] [Updated] (CASSANDRA-4553) NPE while loading Saved KeyCache
[ https://issues.apache.org/jira/browse/CASSANDRA-4553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vijay updated CASSANDRA-4553: - Attachment: 0001-CASSANDRA-4553.patch Simple fix to handle null in ASC NPE while loading Saved KeyCache Key: CASSANDRA-4553 URL: https://issues.apache.org/jira/browse/CASSANDRA-4553 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 1.2.0 Reporter: Vijay Assignee: Vijay Priority: Minor Fix For: 1.2.0 Attachments: 0001-CASSANDRA-4553.patch WARN [main] 2012-08-16 15:31:13,896 AutoSavingCache.java (line 146) error reading saved cache /var/lib/cassandra/saved_caches/system-local-KeyCache-b.db java.lang.NullPointerException at org.apache.cassandra.cache.AutoSavingCache.loadSaved(AutoSavingCache.java:140) at org.apache.cassandra.db.ColumnFamilyStore.init(ColumnFamilyStore.java:251) at org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:354) at org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:326) at org.apache.cassandra.db.Table.initCf(Table.java:312) at org.apache.cassandra.db.Table.init(Table.java:252) at org.apache.cassandra.db.Table.open(Table.java:97) at org.apache.cassandra.db.Table.open(Table.java:75) at org.apache.cassandra.db.SystemTable.checkHealth(SystemTable.java:285) at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:168) at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:318) at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:361) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-4553) NPE while loading Saved KeyCache
[ https://issues.apache.org/jira/browse/CASSANDRA-4553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13436384#comment-13436384 ] Jonathan Ellis commented on CASSANDRA-4553: --- +1, although would be even better w/ comment as to why we expect keycache deserialize to return nulls sometimes (but not rowcache) NPE while loading Saved KeyCache Key: CASSANDRA-4553 URL: https://issues.apache.org/jira/browse/CASSANDRA-4553 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 1.2.0 Reporter: Vijay Assignee: Vijay Priority: Minor Fix For: 1.2.0 Attachments: 0001-CASSANDRA-4553.patch WARN [main] 2012-08-16 15:31:13,896 AutoSavingCache.java (line 146) error reading saved cache /var/lib/cassandra/saved_caches/system-local-KeyCache-b.db java.lang.NullPointerException at org.apache.cassandra.cache.AutoSavingCache.loadSaved(AutoSavingCache.java:140) at org.apache.cassandra.db.ColumnFamilyStore.init(ColumnFamilyStore.java:251) at org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:354) at org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:326) at org.apache.cassandra.db.Table.initCf(Table.java:312) at org.apache.cassandra.db.Table.init(Table.java:252) at org.apache.cassandra.db.Table.open(Table.java:97) at org.apache.cassandra.db.Table.open(Table.java:75) at org.apache.cassandra.db.SystemTable.checkHealth(SystemTable.java:285) at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:168) at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:318) at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:361) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-4553) NPE while loading Saved KeyCache
[ https://issues.apache.org/jira/browse/CASSANDRA-4553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13436386#comment-13436386 ] Pavel Yaskevich commented on CASSANDRA-4553: +1 with Jonathan NPE while loading Saved KeyCache Key: CASSANDRA-4553 URL: https://issues.apache.org/jira/browse/CASSANDRA-4553 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 1.2.0 Reporter: Vijay Assignee: Vijay Priority: Minor Fix For: 1.2.0 Attachments: 0001-CASSANDRA-4553.patch WARN [main] 2012-08-16 15:31:13,896 AutoSavingCache.java (line 146) error reading saved cache /var/lib/cassandra/saved_caches/system-local-KeyCache-b.db java.lang.NullPointerException at org.apache.cassandra.cache.AutoSavingCache.loadSaved(AutoSavingCache.java:140) at org.apache.cassandra.db.ColumnFamilyStore.init(ColumnFamilyStore.java:251) at org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:354) at org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:326) at org.apache.cassandra.db.Table.initCf(Table.java:312) at org.apache.cassandra.db.Table.init(Table.java:252) at org.apache.cassandra.db.Table.open(Table.java:97) at org.apache.cassandra.db.Table.open(Table.java:75) at org.apache.cassandra.db.SystemTable.checkHealth(SystemTable.java:285) at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:168) at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:318) at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:361) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
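[Editor's note] The patch amounts to tolerating null keys while iterating the saved cache file. A sketch of the guarded load loop (hypothetical shape, not the actual AutoSavingCache.loadSaved code):

```python
def load_saved(entries):
    # Key-cache entries can deserialize to None, e.g. when they reference
    # an sstable that no longer exists; skip those entries instead of
    # dereferencing them and raising the NPE reported above.
    cache = {}
    for key, value in entries:
        if key is None:
            continue
        cache[key] = value
    return cache
```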
git commit: remove schema agreement checking from all external APIs (Thrift, CQL and CQL3) patch by Pavel Yaskevich; reviewed by Jonathan Ellis for CASSANDRA-4487
Updated Branches: refs/heads/trunk 7ddb5c7a4 - 71f5d91ab remove schema agreement checking from all external APIs (Thrift, CQL and CQL3) patch by Pavel Yaskevich; reviewed by Jonathan Ellis for CASSANDRA-4487 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/71f5d91a Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/71f5d91a Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/71f5d91a Branch: refs/heads/trunk Commit: 71f5d91ab7825196990a2744cf3e40e654917d33 Parents: 7ddb5c7 Author: Pavel Yaskevich xe...@apache.org Authored: Wed Aug 15 14:00:28 2012 +0300 Committer: Pavel Yaskevich xe...@apache.org Committed: Fri Aug 17 01:54:13 2012 +0300 -- CHANGES.txt|1 + interface/cassandra.thrift |7 ++- src/java/org/apache/cassandra/cli/CliClient.java | 49 +-- src/java/org/apache/cassandra/cli/CliMain.java |2 +- .../org/apache/cassandra/cql/QueryProcessor.java | 50 +-- .../org/apache/cassandra/cql3/CQLStatement.java|5 +- .../org/apache/cassandra/cql3/QueryProcessor.java | 10 +-- .../cql3/statements/CreateKeyspaceStatement.java |3 +- .../cql3/statements/DropKeyspaceStatement.java |3 +- .../cql3/statements/SchemaAlteringStatement.java | 46 + .../apache/cassandra/thrift/CassandraServer.java | 16 - .../cassandra/transport/messages/ErrorMessage.java |4 - .../cassandra/transport/messages/QueryMessage.java |4 +- 13 files changed, 24 insertions(+), 176 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/71f5d91a/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 75de54e..39c92b1 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -37,6 +37,7 @@ * improve DynamicEndpointSnitch by using reservoir sampling (CASSANDRA-4038) * (cql3) Add support for 2ndary indexes (CASSANDRA-3680) * (cql3) fix defining more than one PK to be invalid (CASSANDRA-4477) + * remove schema agreement checking from all external APIs (Thrift, CQL and CQL3) (CASSANDRA-4487) 1.1.4 
http://git-wip-us.apache.org/repos/asf/cassandra/blob/71f5d91a/interface/cassandra.thrift -- diff --git a/interface/cassandra.thrift b/interface/cassandra.thrift index 5e933d7..1f735e6 100644 --- a/interface/cassandra.thrift +++ b/interface/cassandra.thrift @@ -158,7 +158,12 @@ exception AuthorizationException { 1: required string why } -/** schemas are not in agreement across all nodes */ +/** + * NOTE: This up outdated exception left for backward compatibility reasons, + * no actual schema agreement validation is done starting from Cassandra 1.2 + * + * schemas are not in agreement across all nodes + */ exception SchemaDisagreementException { } http://git-wip-us.apache.org/repos/asf/cassandra/blob/71f5d91a/src/java/org/apache/cassandra/cli/CliClient.java -- diff --git a/src/java/org/apache/cassandra/cli/CliClient.java b/src/java/org/apache/cassandra/cli/CliClient.java index f2f492a..176f70a 100644 --- a/src/java/org/apache/cassandra/cli/CliClient.java +++ b/src/java/org/apache/cassandra/cli/CliClient.java @@ -198,7 +198,7 @@ public class CliClient } // Execute a CLI Statement -public void executeCLIStatement(String statement) throws CharacterCodingException, TException, TimedOutException, NotFoundException, NoSuchFieldException, InvalidRequestException, UnavailableException, InstantiationException, IllegalAccessException, ClassNotFoundException, SchemaDisagreementException +public void executeCLIStatement(String statement) throws CharacterCodingException, TException, TimedOutException, NotFoundException, NoSuchFieldException, InvalidRequestException, UnavailableException, InstantiationException, IllegalAccessException, ClassNotFoundException { Tree tree = CliCompiler.compileQuery(statement); try @@ -1006,7 +1006,6 @@ public class CliClient { String mySchemaVersion = thriftClient.system_add_keyspace(updateKsDefAttributes(statement, ksDef)); sessionState.out.println(mySchemaVersion); -validateSchemaIsSettled(mySchemaVersion); keyspacesMap.put(keyspaceName, 
thriftClient.describe_keyspace(keyspaceName)); } @@ -1037,7 +1036,6 @@ public class CliClient { String mySchemaVersion = thriftClient.system_add_column_family(updateCfDefAttributes(statement, cfDef)); sessionState.out.println(mySchemaVersion); -validateSchemaIsSettled(mySchemaVersion);
[jira] [Resolved] (CASSANDRA-4538) Strange CorruptedBlockException when massive insert binary data
[ https://issues.apache.org/jira/browse/CASSANDRA-4538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-4538. --- Resolution: Cannot Reproduce does sound like a problem with that specific machine given that neither you nor cathy can reproduce elsewhere Strange CorruptedBlockException when massive insert binary data --- Key: CASSANDRA-4538 URL: https://issues.apache.org/jira/browse/CASSANDRA-4538 Project: Cassandra Issue Type: Bug Affects Versions: 1.1.3 Environment: Debian sequeeze 32bit Reporter: Tommy Cheng Priority: Critical Labels: CorruptedBlockException, binary, insert Attachments: cassandra-stresstest.zip After inserting ~ 1 records, here is the error log INFO 10:53:33,543 Compacted to [/var/lib/cassandra/data/ST/company/ST-company.company_acct_no_idx-he-13-Data.db,]. 407,681 to 409,133 (~100% of original) bytes for 9,250 keys at 0.715926MB/s. Time: 545ms. ERROR 10:53:35,445 Exception in thread Thread[CompactionExecutor:3,1,main] java.io.IOError: org.apache.cassandra.io.compress.CorruptedBlockException: (/var/lib/cassandra/data/ST/company/ST-company-he-9-Data.db): corruption detected, chunk at 7530128 of length 19575. 
at org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:116)
at org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:99)
at org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:176)
at org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:83)
at org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:68)
at org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:118)
at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:101)
at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
at com.google.common.collect.Iterators$7.computeNext(Iterators.java:614)
at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:173)
at org.apache.cassandra.db.compaction.CompactionManager$1.runMayThrow(CompactionManager.java:154)
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.cassandra.io.compress.CorruptedBlockException: (/var/lib/cassandra/data/ST/company/ST-company-he-9-Data.db): corruption detected, chunk at 7530128 of length 19575.
at org.apache.cassandra.io.compress.CompressedRandomAccessReader.decompressChunk(CompressedRandomAccessReader.java:98)
at org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBuffer(CompressedRandomAccessReader.java:77)
at org.apache.cassandra.io.util.RandomAccessReader.read(RandomAccessReader.java:302)
at java.io.RandomAccessFile.readFully(RandomAccessFile.java:397)
at java.io.RandomAccessFile.readFully(RandomAccessFile.java:377)
at org.apache.cassandra.utils.BytesReadTracker.readFully(BytesReadTracker.java:95)
at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:401)
at org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:363)
at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:119)
at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:36)
at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:144)
at org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:234)
at org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:112)
... 20 more

Here is the startup of cassandra
[jira] [Resolved] (CASSANDRA-875) Performance regression tests, take 2
[ https://issues.apache.org/jira/browse/CASSANDRA-875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-875. -- Resolution: Won't Fix working on this out of tree, similar to dtests Performance regression tests, take 2 Key: CASSANDRA-875 URL: https://issues.apache.org/jira/browse/CASSANDRA-875 Project: Cassandra Issue Type: Test Components: Tools Reporter: Jonathan Ellis Assignee: Tyler Patterson Labels: gsoc, gsoc2010 We have a stress test in contrib/py_stress, and Paul has a tool using libcloud to automate running it against an ephemeral cluster of rackspace cloud servers, but to really qualify as performance regression tests we need to - test a wide variety of data types (skinny rows, wide rows, different comparator types, different value byte[] sizes, etc) - produce pretty graphs. seriously. - archive historical data somewhere for comparison (rackspace can provide a VM to host a db for this, if the ASF doesn't have something in place for this kind of thing already) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (CASSANDRA-2397) Improve or remove replicate-on-write setting
[ https://issues.apache.org/jira/browse/CASSANDRA-2397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-2397. --- Resolution: Won't Fix more discussion in CASSANDRA-3868 Improve or remove replicate-on-write setting Key: CASSANDRA-2397 URL: https://issues.apache.org/jira/browse/CASSANDRA-2397 Project: Cassandra Issue Type: Bug Reporter: Stu Hood The replicate on write setting breaks assumptions in various places in the codebase dealing with whether data will be replicated in a timely fashion. It's worthwhile to discuss whether we should go all-the-way on replicate-on-write, such that it is a fully supported feature, or whether we should remove it entirely. On one hand, ROW could be considered to be just another replication tunable like HH, RR and AES. On the other hand, a lazily replicating store is very rarely what you actually wanted. Open issues related to ROW are linked, but additionally, we'd need to: * Make the setting have an effect for standard column families * Change the default for ROW to enabled and properly warn of the effects -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (CASSANDRA-2636) Import/Export of Schema Migrations
[ https://issues.apache.org/jira/browse/CASSANDRA-2636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-2636.
---
Resolution: Invalid

We no longer store migrations, only the merged schema.

Import/Export of Schema Migrations
--
Key: CASSANDRA-2636
URL: https://issues.apache.org/jira/browse/CASSANDRA-2636
Project: Cassandra
Issue Type: Improvement
Reporter: David Boxenhorn

My use case is like this: I have a development cluster, a staging cluster and a production cluster. When I finish a set of migrations on the development cluster, I want to apply them to the staging cluster, and eventually the production cluster. I don't want to do it by hand, because it's a painful and error-prone process. What I would like to do is export the last N migrations from the development cluster as a text file, with exactly the same format as the original text commands, and import them to the staging and production clusters. I think the best place to do this might be the CLI, since you would probably want to view your migrations before exporting them. Something like this:

show migrations N; shows the last N migrations.
export migrations N fileName; exports the last N migrations to file fileName.
import migrations fileName; imports migrations from fileName.

The import process would apply the migrations one at a time, giving you feedback like "applying migration: update column family". If a migration fails, the process should give an appropriate message and stop.
[jira] [Resolved] (CASSANDRA-3360) Read data inconsistancy in Cassandra 1.0.0-rc2
[ https://issues.apache.org/jira/browse/CASSANDRA-3360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-3360.
---
Resolution: Cannot Reproduce

Read data inconsistancy in Cassandra 1.0.0-rc2
--
Key: CASSANDRA-3360
URL: https://issues.apache.org/jira/browse/CASSANDRA-3360
Project: Cassandra
Issue Type: Bug
Components: API
Affects Versions: 1.0.0
Reporter: Gopalakrishnan Rajagopal

When a super column for a particular key is queried using hector-core-0.8.0-2, the data retrieved is inconsistent. For the key that I use to fetch data, there are actually 7 sub columns, but the query returns 1 or 3 sub columns depending on which nodes respond to it (I tested by bringing down each one of the three nodes in turn). When I tried to fetch the data for the same key using the cassandra-cli tool, I get all 7 sub columns at both consistency levels ONE and QUORUM. Below is the code that I used to fetch data:

superColumnQuery = HFactory.createSuperColumnQuery(keyspaceOperator, stringSerializer, stringSerializer, stringSerializer, stringSerializer);
superColumnQuery.setColumnFamily(cfName).setKey(key).setSuperName(scName);
result = superColumnQuery.execute();
superColumn = result.get();
columnList = superColumn.getColumns();
[jira] [Resolved] (CASSANDRA-3731) bad data size in compactionstats
[ https://issues.apache.org/jira/browse/CASSANDRA-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-3731.
---
Resolution: Cannot Reproduce

bad data size in compactionstats
--
Key: CASSANDRA-3731
URL: https://issues.apache.org/jira/browse/CASSANDRA-3731
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 1.0.6
Environment: debian, oracle java 1.6.26, LeveledCompaction with 4096M (file size limit is 4096MB), count of sstables 500
Reporter: Zenek Kraweznik

pending tasks: -2147483648
compaction type  keyspace  column family  bytes compacted  bytes total  progress
Compaction       Archive   Messages       35050352366      *0*          n/a

0 bytes total is visible on this node (nodetool ring is reporting 37.18GB). After every compaction bytes total is about 73x (I guess it is not compressed data size), but this value isn't saved anywhere.
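The reported pending-task count of -2147483648 is exactly Integer.MIN_VALUE, the usual signature of a 32-bit signed counter wrapping around. Whether that is actually what happened here is an assumption not confirmed in the ticket, but the arithmetic is easy to check:

```python
def to_int32(n: int) -> int:
    """Interpret an arbitrary integer as a wrapped 32-bit signed value (Java int semantics)."""
    n &= 0xFFFFFFFF
    return n - 0x100000000 if n >= 0x80000000 else n

# One increment past Integer.MAX_VALUE wraps to Integer.MIN_VALUE.
print(to_int32(2**31 - 1))  # 2147483647
print(to_int32(2**31))      # -2147483648
```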
[jira] [Resolved] (CASSANDRA-3739) Cassandra setInputSplitSize is not working properly
[ https://issues.apache.org/jira/browse/CASSANDRA-3739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-3739.
---
Resolution: Cannot Reproduce

Let us know if you can reproduce on 1.0.12.

Cassandra setInputSplitSize is not working properly
---
Key: CASSANDRA-3739
URL: https://issues.apache.org/jira/browse/CASSANDRA-3739
Project: Cassandra
Issue Type: Bug
Components: Hadoop
Affects Versions: 1.0.6
Reporter: manu

I am using Hadoop 0.20.205 and Cassandra 1.0.6. I use setInputSplitSize(1000) and I expect that every split should be ~1000 rows. The problem is that each mapper still receives much more than 1000 rows.
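For context, the Hadoop integration's ConfigHelper.setInputSplitSize sets a target number of rows per input split, and actual splits are only approximate because they are derived from token-range size estimates. A sketch of the intended behavior (plan_splits and its numbers are illustrative, not Cassandra code):

```python
def plan_splits(estimated_rows: int, split_size: int) -> list:
    """Divide an estimated row count into contiguous chunks of ~split_size rows.

    Hypothetical helper mirroring the idea behind ConfigHelper.setInputSplitSize:
    since real splits come from per-token-range row estimates, the rows a mapper
    actually receives can deviate substantially from the target.
    """
    splits, start = [], 0
    while start < estimated_rows:
        end = min(start + split_size, estimated_rows)
        splits.append((start, end))
        start = end
    return splits

print(plan_splits(2500, 1000))  # [(0, 1000), (1000, 2000), (2000, 2500)]
```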
[jira] [Commented] (CASSANDRA-4481) Commitlog not replayed after restart - data lost
[ https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13435958#comment-13436450 ] Florent Clairambault commented on CASSANDRA-4481:
---

This bug is marked as resolved, so we're just documenting something that never happened. We're not scaring anyone here; we're making sure we have all the documentation to prove that we were wrong.

So just to make things clear, I didn't make any kind of change or deletion on my keyspaces. The two keyspaces were created by code (one with pelops and one with hector) once and never changed. I know I said I did it with cassandra-cli earlier, but it turns out that it was entirely by code.

While doing some tests, I did delete the keyspaces, and in those cases it gives an error that looks like: "Commit logs for non-existing Column Family 1036 were ignored" (I can't find the exact error in my logs). When I deleted the keyspace files, they were recreated by reading the commit logs (this is step 4 in my previous report). So I think they were in accordance with the schema stored in cassandra.

---

I wanted to actually test it. The only last versions I could find were 1.0.11, 1.1.2 and 1.1.3. I created a small test script and it definitely works with them. But it would be good if we could have access to 1.1.1 to do the same data upgrade we did in the past.

{code}
#!/bin/sh
apt-get remove --purge cassandra -y
rm -Rf /var/log/cassandra /var/lib/cassandra

if [ ! -f cassandra_1.0.11_all.deb ]; then
  wget http://apache.mirrors.multidist.eu/cassandra/debian/pool/main/c/cassandra/cassandra_1.0.11_all.deb
fi
if [ ! -f cassandra_1.1.2_all.deb ]; then
  wget http://apache.mirrors.multidist.eu/cassandra/debian/pool/main/c/cassandra/cassandra_1.1.2_all.deb
fi
if [ ! -f cassandra_1.1.3_all.deb ]; then
  wget http://apache.mirrors.multidist.eu/cassandra/debian/pool/main/c/cassandra/cassandra_1.1.3_all.deb
fi

wait_for_server() {
  while ! echo exit | nc localhost 9160; do sleep 1; done
}

dpkg -i cassandra_1.0.11_all.deb
tail -f /var/log/cassandra/output.log
wait_for_server

cassandra-cli -h localhost <<EOF
create keyspace m2mp;
use m2mp;
create column family Registry with column_type = 'Standard' and comparator = 'AsciiType' and default_validation_class = 'AsciiType' and key_validation_class = 'AsciiType';
set Registry['/user/florent']['first']='Florent';
set Registry['/user/florent']['country']='France';
set Registry['/version']['1.0.11']='done';
EOF

cassandra-cli -h localhost -k m2mp <<EOF
list Registry;
exit;
EOF

dpkg -i cassandra_1.1.2_all.deb
wait_for_server
cassandra-cli -h localhost -k m2mp <<EOF
set Registry['/version']['1.1.2']='done';
list Registry;
exit;
EOF

dpkg -i cassandra_1.1.3_all.deb
wait_for_server
cassandra-cli -h localhost -k m2mp <<EOF
set Registry['/version']['1.1.3']='done';
list Registry;
exit;
EOF

service cassandra restart
wait_for_server
cassandra-cli -h localhost -k m2mp <<EOF
list Registry;
exit;
EOF
{code}

In the end I do have:

{quote}
---
RowKey: /user/florent
= (column=country, value=France, timestamp=1345161343036000)
= (column=first, value=Florent, timestamp=1345161342992000)
---
RowKey: /version
= (column=1.0.11, value=done, timestamp=1345161343039000)
= (column=1.1.2, value=done, timestamp=1345161366935000)
= (column=1.1.3, value=done, timestamp=134516138976)
{quote}

So it's OK. But I would be pretty interested to see if we get the same result if we don't skip any version.

Commitlog not replayed after restart - data lost

Key: CASSANDRA-4481
URL: https://issues.apache.org/jira/browse/CASSANDRA-4481
Project: Cassandra
Issue Type: Bug
Affects Versions: 1.1.2
Environment: Single node cluster on 64Bit CentOS
Reporter: Ivo Meißner
Priority: Critical

When data is written to the commitlog and I restart the machine, all committed data is lost that has not been flushed to disk. In the startup logs it says that it replays the commitlog successfully, but the data is not available then. When I open the commitlog file in an editor I can see the added data, but after the restart it cannot be fetched from cassandra.

{code}
INFO 09:59:45,362 Replaying /var/myproject/cassandra/commitlog/CommitLog-83203377067.log
INFO 09:59:45,476 Finished reading /var/myproject/cassandra/commitlog/CommitLog-83203377067.log
INFO 09:59:45,476 Log replay complete, 0 replayed mutations
{code}
[jira] [Comment Edited] (CASSANDRA-4481) Commitlog not replayed after restart - data lost
[ https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13436450#comment-13436450 ] Florent Clairambault edited comment on CASSANDRA-4481 at 8/17/12 11:11 AM: --- This bug is marked as resolved, so we're just documenting something that never happened. We're not scaring anyone here, we're making sure we have all the documentation to prove that I we were wrong. So just to make things clear, I didn't make any kind of change or deletion on my keyspaces. The two keyspaces were created by code (one with pelops and one with hector) once and never changed. I know I told I did it with cassandra-cli earlier but it turns out that it was entirely by code. While doing some tests, I did delete the keyspaces and in that cases it gives an error that looks like: Commit logs for non-existing Column Family 1036 were ignored (I can't find the exact error in my logs). When I deleted the keyspace files, they were recreated by reading the commit logs (this is step 4 in my previous report). So I think they were in accordance with the schema stored in cassandra. --- I wanted to actually test it. The only last versions I could find were 1.0.11, 1.1.2 and 1.1.3. I created a small testscript and it definitely works with them. But it would be good to test it with 1.1.1 (which I didn't find) also. {code} #!/bin/sh apt-get remove --purge cassandra -y rm -Rf /var/log/cassandra /var/lib/cassandra if [ ! -f cassandra_1.0.11_all.deb ]; then wget http://apache.mirrors.multidist.eu/cassandra/debian/pool/main/c/cassandra/cassandra_1.0.11_all.deb fi if [ ! -f cassandra_1.1.2_all.deb ]; then wget http://apache.mirrors.multidist.eu/cassandra/debian/pool/main/c/cassandra/cassandra_1.1.2_all.deb fi if [ ! -f cassandra_1.1.3_all.deb ]; then wget http://apache.mirrors.multidist.eu/cassandra/debian/pool/main/c/cassandra/cassandra_1.1.3_all.deb fi wait_for_server() { while ! 
echo exit | nc localhost 9160; do sleep 1; done } dpkg -i cassandra_1.0.11_all.deb tail -f /var/log/cassandra/output.log wait_for_server; cassandra-cli -h localhost EOF create keyspace m2mp; use m2mp; create column family Registry with column_type = 'Standard' and comparator = 'AsciiType' and default_validation_class = 'AsciiType' and key_validation_class = 'AsciiType'; set Registry['/user/florent']['first']='Florent'; set Registry['/user/florent']['country']='France'; set Registry['/version']['1.0.11']='done'; EOF cassandra-cli -h localhost -k m2mp EOF list Registry; exit; EOF dpkg -i cassandra_1.1.2_all.deb wait_for_server; cassandra-cli -h localhost -k m2mp EOF set Registry['/version']['1.1.2']='done'; list Registry; exit; EOF dpkg -i cassandra_1.1.3_all.deb wait_for_server; cassandra-cli -h localhost -k m2mp EOF set Registry['/version']['1.1.3']='done'; list Registry; exit; EOF service cassandra restart wait_for_server; cassandra-cli -h localhost -k m2mp EOF list Registry; exit; EOF {code} In the end I do have: {quote} --- RowKey: /user/florent = (column=country, value=France, timestamp=1345161343036000) = (column=first, value=Florent, timestamp=1345161342992000) --- RowKey: /version = (column=1.0.11, value=done, timestamp=1345161343039000) = (column=1.1.2, value=done, timestamp=1345161366935000) = (column=1.1.3, value=done, timestamp=134516138976) {quote} So it's ok. But I would be pretty interested to see if we get the same result if we don't skip any version. was (Author: superfc): This bug is marked as resolved, so we're just documenting something that never happened. We're not scaring anyone here, we're making sure we have all the documentation to prove that I we were wrong. So just to make things clear, I didn't make any kind of change or deletion on my keyspaces. The two keyspaces were created by code (one with pelops and one with hector) once and never changed. 
I know I told I did it with cassandra-cli earlier but it turns out that it was entirely by code. While doing some tests, I did delete the keyspaces and in that cases it gives an error that looks like: Commit logs for non-existing Column Family 1036 were ignored (I can't find the exact error in my logs). When I deleted the keyspace files, they were recreated by reading the commit logs (this is step 4 in my previous report). So I think they were in accordance with the schema stored in cassandra. --- I wanted to actually test it. The only last versions I could find were 1.0.11, 1.1.2 and 1.1.3. I created a small testscript and it definitely works with them. But it would be good to test it with 1.1.1 (which I didn't have good access to) also. {code} #!/bin/sh apt-get remove --purge cassandra -y rm -Rf /var/log/cassandra /var/lib/cassandra if [ ! -f cassandra_1.0.11_all.deb ]; then wget
[jira] [Comment Edited] (CASSANDRA-4481) Commitlog not replayed after restart - data lost
[ https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13436450#comment-13436450 ] Florent Clairambault edited comment on CASSANDRA-4481 at 8/17/12 11:17 AM: --- This bug is marked as resolved, so we're just documenting something that never happened. We're not scaring anyone here, we're making sure we have all the documentation to prove that we were wrong. So just to make things clear, I didn't make any kind of change or deletion on my keyspaces. The two keyspaces were created by code (one with pelops and one with hector) once and never changed. I know I told I did it with cassandra-cli earlier but it turns out that it was entirely by code. While doing some tests, I did delete the keyspaces and in that cases it gives an error that looks like: Commit logs for non-existing Column Family 1036 were ignored (I can't find the exact error in my logs). When I deleted the keyspace files, they were recreated by reading the commit logs (this is step 4 in my previous report). So I think they were in accordance with the schema stored in cassandra. --- I wanted to actually test it. The only last versions I could find were 1.0.11, 1.1.2 and 1.1.3. I created a small testscript and it definitely works with them. But it would be good to test it with 1.1.1 (which I didn't find) also. {code} #!/bin/sh apt-get remove --purge cassandra -y rm -Rf /var/log/cassandra /var/lib/cassandra if [ ! -f cassandra_1.0.11_all.deb ]; then wget http://apache.mirrors.multidist.eu/cassandra/debian/pool/main/c/cassandra/cassandra_1.0.11_all.deb fi if [ ! -f cassandra_1.1.2_all.deb ]; then wget http://apache.mirrors.multidist.eu/cassandra/debian/pool/main/c/cassandra/cassandra_1.1.2_all.deb fi if [ ! -f cassandra_1.1.3_all.deb ]; then wget http://apache.mirrors.multidist.eu/cassandra/debian/pool/main/c/cassandra/cassandra_1.1.3_all.deb fi wait_for_server() { while ! 
echo exit | nc localhost 9160; do sleep 1; done } dpkg -i cassandra_1.0.11_all.deb tail -f /var/log/cassandra/output.log wait_for_server; cassandra-cli -h localhost EOF create keyspace m2mp; use m2mp; create column family Registry with column_type = 'Standard' and comparator = 'AsciiType' and default_validation_class = 'AsciiType' and key_validation_class = 'AsciiType'; set Registry['/user/florent']['first']='Florent'; set Registry['/user/florent']['country']='France'; set Registry['/version']['1.0.11']='done'; EOF cassandra-cli -h localhost -k m2mp EOF list Registry; exit; EOF dpkg -i cassandra_1.1.2_all.deb wait_for_server; cassandra-cli -h localhost -k m2mp EOF set Registry['/version']['1.1.2']='done'; list Registry; exit; EOF dpkg -i cassandra_1.1.3_all.deb wait_for_server; cassandra-cli -h localhost -k m2mp EOF set Registry['/version']['1.1.3']='done'; list Registry; exit; EOF service cassandra restart wait_for_server; cassandra-cli -h localhost -k m2mp EOF list Registry; exit; EOF {code} In the end I do have: {quote} --- RowKey: /user/florent = (column=country, value=France, timestamp=1345161343036000) = (column=first, value=Florent, timestamp=1345161342992000) --- RowKey: /version = (column=1.0.11, value=done, timestamp=1345161343039000) = (column=1.1.2, value=done, timestamp=1345161366935000) = (column=1.1.3, value=done, timestamp=134516138976) {quote} So it's ok. But I would be pretty interested to see if we get the same result if we don't skip any version. was (Author: superfc): This bug is marked as resolved, so we're just documenting something that never happened. We're not scaring anyone here, we're making sure we have all the documentation to prove that I we were wrong. So just to make things clear, I didn't make any kind of change or deletion on my keyspaces. The two keyspaces were created by code (one with pelops and one with hector) once and never changed. 
I know I told I did it with cassandra-cli earlier but it turns out that it was entirely by code. While doing some tests, I did delete the keyspaces and in that cases it gives an error that looks like: Commit logs for non-existing Column Family 1036 were ignored (I can't find the exact error in my logs). When I deleted the keyspace files, they were recreated by reading the commit logs (this is step 4 in my previous report). So I think they were in accordance with the schema stored in cassandra. --- I wanted to actually test it. The only last versions I could find were 1.0.11, 1.1.2 and 1.1.3. I created a small testscript and it definitely works with them. But it would be good to test it with 1.1.1 (which I didn't find) also. {code} #!/bin/sh apt-get remove --purge cassandra -y rm -Rf /var/log/cassandra /var/lib/cassandra if [ ! -f cassandra_1.0.11_all.deb ]; then wget
[jira] [Updated] (CASSANDRA-2897) Secondary indexes without read-before-write
[ https://issues.apache.org/jira/browse/CASSANDRA-2897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Jenvey updated CASSANDRA-2897:
---
Attachment: 41ec9fc-2897.txt

Here's an alternative patch that also tackles just the non-compaction changes (it's a little stale, against 41ec9fc). Briefly looking at Sam's version, I'll note that:
o Mine handles entire row deletions in Memtable
o but it lacks changes to CompositesSearcher/SchemaLoader/CFMetaDataTest (though I'm not familiar with these code paths, either)
o in KeysSearcher, I very likely should be using the compare method from getValueValidator to check for staleness (instead of naively just calling equals)

Secondary indexes without read-before-write
---
Key: CASSANDRA-2897
URL: https://issues.apache.org/jira/browse/CASSANDRA-2897
Project: Cassandra
Issue Type: Improvement
Components: Core
Affects Versions: 0.7.0
Reporter: Sylvain Lebresne
Priority: Minor
Labels: secondary_index
Fix For: 1.2.0
Attachments: 0001-CASSANDRA-2897-Secondary-indexes-without-read-before-w.txt, 41ec9fc-2897.txt

Currently, secondary index updates require a read-before-write to maintain the index consistency. Keeping the index consistent at all times is not necessary, however. We could let the (secondary) index get inconsistent on writes and repair it on reads. This would be easy because on reads we make sure to request the indexed columns anyway, so we can just skip the rows that are not needed and repair the index at the same time. This does trade work on writes for work on reads. However, read-before-write is sufficiently costly that it will likely be a win overall. There are (at least) two small technical difficulties here, though:
# If we repair on read, this will be racy with writes, so we'll probably have to synchronize there.
# We probably shouldn't rely only on reads to repair; we should also have a task to repair the index for things that are rarely read. It's unclear how to make that low impact, though.
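The repair-on-read idea in this ticket can be sketched in miniature: writes add the new index entry but never remove the obsolete one (that removal is what needed the read-before-write), and readers validate each index hit against the base row, dropping entries whose indexed value no longer matches. All names and data here are illustrative, not Cassandra internals:

```python
# base "rows": key -> current indexed value; the index maps value -> candidate keys.
# row2 was rewritten from blue to red: the write added the new "red" entry but,
# without read-before-write, could not remove the stale "blue" one.
rows = {"row1": "blue", "row2": "red", "row3": "red"}
index = {"blue": {"row1", "row2"}, "red": {"row2", "row3"}}

def read_and_repair(value):
    """Return live matches for `value`, lazily repairing stale index entries."""
    live = set()
    for key in list(index.get(value, ())):
        if rows.get(key) == value:
            live.add(key)              # still a genuine match
        else:
            index[value].discard(key)  # stale: indexed value changed since write
    return live

print(sorted(read_and_repair("blue")))  # ['row1']
print(sorted(index["blue"]))            # ['row1']  (stale row2 entry repaired)
print(sorted(read_and_repair("red")))   # ['row2', 'row3']
```

The sketch also shows both caveats from the ticket: the repair step mutates the index concurrently with writes (hence the synchronization concern), and values that are never read are never repaired (hence the background-repair task).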
[jira] [Commented] (CASSANDRA-4550) nodetool ring output should use hex not integers for tokens
[ https://issues.apache.org/jira/browse/CASSANDRA-4550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13436511#comment-13436511 ] Jonathan Ellis commented on CASSANDRA-4550: --- (Removed affects=1.09 because we use affects as the earliest version affected, which is all versions in this case.) nodetool ring output should use hex not integers for tokens --- Key: CASSANDRA-4550 URL: https://issues.apache.org/jira/browse/CASSANDRA-4550 Project: Cassandra Issue Type: Improvement Components: Tools Environment: Linux Reporter: Aaron Turner Priority: Trivial Labels: lhf The current output of nodetool ring prints start token values as base10 integers instead of hex. This is not very user friendly for a number of reasons: 1. Hides the fact that the values are 128bit 2. Values are not of a consistent length, while in hex padding with zero is generally accepted 3. When using the default random partitioner, having the values in hex makes it easier for users to determine which node(s) a given key resides on since md5 utilities like md5sum output hex. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
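The reporter's point 3 is easy to see concretely: under RandomPartitioner a key's token is derived from its MD5 digest, so rendering tokens as zero-padded 32-digit hex lines up directly with md5sum output. A hedged sketch (Python; abs-of-signed-MD5 is an approximation of the partitioner's hashing, and token_of/token_to_hex are illustrative names):

```python
import hashlib

def token_of(key: bytes) -> int:
    # RandomPartitioner-style token: abs of the MD5 digest read as a signed big-endian integer
    return abs(int.from_bytes(hashlib.md5(key).digest(), "big", signed=True))

def token_to_hex(token: int) -> str:
    # zero-padded to 32 hex digits so every token prints at a consistent width
    return format(token, "032x")

t = token_of(b"mykey")
print(token_to_hex(t))       # fixed-width hex, comparable against md5sum output
print(len(token_to_hex(t)))  # 32
```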
[jira] [Commented] (CASSANDRA-4481) Commitlog not replayed after restart - data lost
[ https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13436515#comment-13436515 ] Jonathan Ellis commented on CASSANDRA-4481: --- 1.1.1 is available here: http://archive.apache.org/dist/cassandra/1.1.1/ Commitlog not replayed after restart - data lost Key: CASSANDRA-4481 URL: https://issues.apache.org/jira/browse/CASSANDRA-4481 Project: Cassandra Issue Type: Bug Affects Versions: 1.1.2 Environment: Single node cluster on 64Bit CentOS Reporter: Ivo Meißner Priority: Critical When data is written to the commitlog and I restart the machine, all commited data is lost that has not been flushed to disk. In the startup logs it says that it replays the commitlog successfully, but the data is not available then. When I open the commitlog file in an editor I can see the added data, but after the restart it cannot be fetched from cassandra. {code} INFO 09:59:45,362 Replaying /var/myproject/cassandra/commitlog/CommitLog-83203377067.log INFO 09:59:45,476 Finished reading /var/myproject/cassandra/commitlog/CommitLog-83203377067.log INFO 09:59:45,476 Log replay complete, 0 replayed mutations {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (CASSANDRA-4554) Log when a node is down longer than the hint window and we stop saving hints
Jonathan Ellis created CASSANDRA-4554: - Summary: Log when a node is down longer than the hint window and we stop saving hints Key: CASSANDRA-4554 URL: https://issues.apache.org/jira/browse/CASSANDRA-4554 Project: Cassandra Issue Type: New Feature Reporter: Jonathan Ellis Priority: Minor Fix For: 1.1.5 We know that we need to repair whenever we lose a node or disk permanently (since it may have had undelivered hints on it), but without exposing this we don't know when nodes stop saving hints for a temporarily dead node, unless we're paying very close attention to external monitoring. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-4554) Log when a node is down longer than the hint window and we stop saving hints
[ https://issues.apache.org/jira/browse/CASSANDRA-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13436537#comment-13436537 ] Jonathan Ellis commented on CASSANDRA-4554: --- Should probably log this in a system table so it's easily queried. Log when a node is down longer than the hint window and we stop saving hints Key: CASSANDRA-4554 URL: https://issues.apache.org/jira/browse/CASSANDRA-4554 Project: Cassandra Issue Type: New Feature Reporter: Jonathan Ellis Priority: Minor Fix For: 1.1.5 We know that we need to repair whenever we lose a node or disk permanently (since it may have had undelivered hints on it), but without exposing this we don't know when nodes stop saving hints for a temporarily dead node, unless we're paying very close attention to external monitoring. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
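What the ticket asks for can be sketched as a small check in the hint-write path: track when an endpoint was first seen down, and log once (or record in a system table, per the comment above) when the outage exceeds the hint window. The constant and names below are illustrative stand-ins, not Cassandra's actual code:

```python
HINT_WINDOW_SECONDS = 3 * 60 * 60   # stand-in for the max_hint_window_in_ms setting

down_since = {}   # endpoint -> time it was first marked down
logged = set()    # endpoints we've already warned about, so we log only once

def should_store_hint(endpoint, now, log):
    """Return False (logging once) when the node has been down past the hint window."""
    start = down_since.get(endpoint)
    if start is None:
        return True                  # node is up; nothing to suppress
    if now - start > HINT_WINDOW_SECONDS:
        if endpoint not in logged:
            log("%s down longer than the hint window; no longer saving hints" % endpoint)
            logged.add(endpoint)
        return False
    return True
```

Logging only on the first suppression matters: the check runs on every write to the dead node, and without deduplication the warning would flood the log instead of flagging the "you now need a repair" moment.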