[jira] [Commented] (CASSANDRA-6799) schema_version of newly bootstrapped nodes disagrees with existing nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-6799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13920631#comment-13920631 ] Duncan Sands commented on CASSANDRA-6799: - I restarted one of the new nodes last night (neither had been restarted since it was bootstrapped), and now all nodes have the same schema version: not just the restarted node, but also the other newly bootstrapped node.

schema_version of newly bootstrapped nodes disagrees with existing nodes
Key: CASSANDRA-6799
URL: https://issues.apache.org/jira/browse/CASSANDRA-6799
Project: Cassandra
Issue Type: Bug
Components: Core
Environment: x86_64 ubuntu, java version 1.7.0_45, Cassandra 2.0.5
Reporter: Duncan Sands
Attachments: system.log.gz

After bootstrapping new nodes 172.18.33.23 and 172.18.33.24 last weekend, I noticed that they have a different schema_version from the existing nodes. The existing nodes have all been around for a while, saw some schema changes in the past (e.g. timeuuid -> timestamp on a column family) but none recently, and were originally running 1.2 (they were upgraded to 2.0.5).
Here you see the different schema version 0d9173d5-3947-328e-a14d-ce05239f61e0 for the two nodes:

cqlsh> select peer, data_center, host_id, preferred_ip, rack, release_version, rpc_address, schema_version from system.peers;

 peer           | data_center | host_id                              | preferred_ip | rack | release_version | rpc_address    | schema_version
----------------+-------------+--------------------------------------+--------------+------+-----------------+----------------+--------------------------------------
 192.168.21.12  | rdm         | 55e4b4b6-2e64-4542-87a4-d8a8e28b5135 | null         | RAC1 | 2.0.5           | 192.168.21.12  | f673ced0-8cfd-3d69-baba-4f81dc60c5b5
 172.18.33.24   | ldn         | 6e634206-94b6-4dcf-9cf8-72bfe190feee | null         | RAC1 | 2.0.5           | 172.18.33.24   | 0d9173d5-3947-328e-a14d-ce05239f61e0
 172.18.33.22   | ldn         | 75c9c81f-b00b-4335-8483-fb7f1bc0be1e | null         | RAC1 | 2.0.5           | 172.18.33.22   | f673ced0-8cfd-3d69-baba-4f81dc60c5b5
 192.168.60.136 | adm         | c83d403f-ef0d-4c54-a844-d69730fa54d3 | null         | RAC1 | 2.0.5           | 192.168.60.136 | f673ced0-8cfd-3d69-baba-4f81dc60c5b5
 192.168.60.137 | adm         | b12e6d71-e189-4fe8-b00a-8ff2cc9848fd | null         | RAC1 | 2.0.5           | 192.168.60.137 | f673ced0-8cfd-3d69-baba-4f81dc60c5b5
 192.168.21.11  | rdm         | dd2e69cb-232f-4236-89f2-b5479669d9f7 | null         | RAC1 | 2.0.5           | 192.168.21.11  | f673ced0-8cfd-3d69-baba-4f81dc60c5b5
 172.18.33.21   | ldn         | 6942404c-e512-46b4-977a-243defa48d0f | null         | RAC1 | 2.0.5           | 172.18.33.21   | f673ced0-8cfd-3d69-baba-4f81dc60c5b5
 192.168.60.138 | adm         | a229bc0f-201b-479e-8312-66891f37ca85 | null         | RAC1 | 2.0.5           | 192.168.60.138 | f673ced0-8cfd-3d69-baba-4f81dc60c5b5
 192.168.60.134 | adm         | 7b860a54-59ea-4a92-9b47-44b52793cc70 | null         | RAC1 | 2.0.5           | 192.168.60.134 | f673ced0-8cfd-3d69-baba-4f81dc60c5b5
 172.18.33.23   | ldn         | a08bad62-55bb-492b-be64-7cf5d5073d6d | null         | RAC1 | 2.0.5           | 172.18.33.23   | 0d9173d5-3947-328e-a14d-ce05239f61e0
 192.168.60.130 | adm         | 3498b4b8-1047-4b42-b13b-bf27b3aa3177 | null         | RAC1 | 2.0.5           | 192.168.60.130 | f673ced0-8cfd-3d69-baba-4f81dc60c5b5
 192.168.60.133 | adm         | 21d3faad-5c5d-447e-bab4-ad9323bdf4c1 | null         | RAC1 | 2.0.5           | 192.168.60.133 | f673ced0-8cfd-3d69-baba-4f81dc60c5b5
 192.168.60.135 | adm         | 860ff4bb-4fcf-43ba-b270-f1844bdd3e65 | null         | RAC1 | 2.0.5           | 192.168.60.135 | f673ced0-8cfd-3d69-baba-4f81dc60c5b5
 192.168.60.131 | adm         | d8b7b0b2-d697-43ae-ad6e-982b24637865 | null         | RAC1 | 2.0.5           | 192.168.60.131 | f673ced0-8cfd-3d69-baba-4f81dc60c5b5

(14 rows)

I've attached the Cassandra log showing the 172.18.33.23 node bootstrapping.

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6799) schema_version of newly bootstrapped nodes disagrees with existing nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-6799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13920639#comment-13920639 ] Duncan Sands commented on CASSANDRA-6799: - To be more precise, all nodes are now at schema version f673ced0-8cfd-3d69-baba-4f81dc60c5b5.
[jira] [Commented] (CASSANDRA-2356) make the debian package never start by default
[ https://issues.apache.org/jira/browse/CASSANDRA-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13920650#comment-13920650 ] Duncan Sands commented on CASSANDRA-2356: - Many Debian packages use the /etc/default/cassandra scheme suggested by Brandon Williams. Simple, standard - sounds good to me! I don't understand why it was rejected. For new installs it should clearly contain ENABLED=false; for people upgrading, the upgrade script would have to create this file with ENABLED=true if it didn't already exist, to preserve the previous behaviour. Another point that came up on IRC is that shutting down a C* instance using the init scripts doesn't first drain the node. As a result you get to replay all the commit logs when you start it up again - this can take a long time. So draining the node before shutdown (including restart) can be a big win.

make the debian package never start by default
Key: CASSANDRA-2356
URL: https://issues.apache.org/jira/browse/CASSANDRA-2356
Project: Cassandra
Issue Type: Improvement
Components: Packaging
Reporter: Jeremy Hanna
Priority: Minor
Labels: debian, packaging
Attachments: 2356.txt

Currently the debian package that installs cassandra starts cassandra by default. It sounds like that is a standard debian packaging convention. However, if you want to bootstrap a new node and want to configure it before it creates any sort of state information, it's a pain. I would think that the common use case would be to have it install all of the init scripts and such but *not* have it start up by default. That way an admin can configure cassandra with seed, token, host, etc. information and then start it. That makes it easier to do this programmatically as well - have chef/puppet install cassandra, do some configuration, then do the service start. With the current setup, it sounds like cassandra creates state on startup that has to be cleaned before a new configuration can take effect.
So the process of installing turns into:
* install debian package
* shutdown cassandra
* clean out state (data/log dirs)
* configure cassandra
* start cassandra
That seems suboptimal for the default case, especially when trying to automate the bootstrapping of new nodes. Another case might be when a downed node comes back up, starts by default, and tries to claim a token that has already been claimed by another newly bootstrapped node. Rob is more familiar with that case so I'll let him explain it in the comments.
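The ENABLED flag scheme discussed above can be sketched as an init-script guard. This is an illustration only, not the actual Debian packaging: the file path is a local stand-in for /etc/default/cassandra, and the variable name is the one proposed in the comments.

```sh
#!/bin/sh
# Sketch of the proposed /etc/default/cassandra guard (paths are stand-ins).
DEFAULTS=./cassandra.defaults
echo 'ENABLED=false' > "$DEFAULTS"   # what a fresh install would ship

# Fallback to true: a missing defaults file preserves the old
# always-start behaviour for people upgrading.
ENABLED=true
[ -r "$DEFAULTS" ] && . "$DEFAULTS"

if [ "$ENABLED" = "true" ]; then
    echo "starting cassandra"
else
    echo "cassandra disabled in $DEFAULTS; not starting"
fi
```

With the fallback defaulting to true, only a deliberately shipped ENABLED=false (new installs) suppresses startup, which is exactly the split between fresh-install and upgrade behaviour described above.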
[jira] [Commented] (CASSANDRA-6689) Partially Off Heap Memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-6689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13920720#comment-13920720 ] Benedict commented on CASSANDRA-6689: -

bq. sort of RCU (i'm looking at you OpOrder)

What do you mean here? If you mean read-copy-update, OpOrder is nothing like this.

bq. I'm not sure what is to retain here if we do that copy when we send to the wire

Ultimately, doing this copying before sending to the wire is something I would like to avoid. Using RefAction.allocateOnHeap() on top of this copying sees wire transfer speeds for thrift drop by about 10% in my fairly rough-and-ready benchmarks, so obviously copying has a cost. Possibly this cost is due to unavoidably copying data you don't necessarily want to serialise, but it seems to be there. Ultimately, if we want to get in-memory read operations to 10x their current performance, we can't go cutting any corners.

bq. introducing separate gc

I've stated clearly what this introduces as a benefit: overwrite workloads no longer cause excessive flushes.

bq. things but as we have a fixed number of threads it is going to work out the same way as for buffering open files in the steady system state

Your next sentence states how this is a large cause of memory consumption, so surely we should be using that memory if possible for other uses (returning it to the buffer cache, or using it internally for more caching)?

bq. Temporary memory allocated by readers is exactly what we should be managing in the first place because they allocate the most and it is always the biggest concern for us

I agree we should be moving to managing this as well; however, I disagree about how we should be managing it. In the medium term we should be bringing the buffer cache in process, so that we can answer some queries without handing off to the mutation stage (anything known to be non-blocking and fast should be answered immediately by the thread that processed the connection), at which point we will benefit from shared use of the memory pool, concrete control over how much memory readers are using, and zero-copy reads from the buffer cache. I hope we may be able to do this for 3.0.

bq. do a simple memcpy test and see how much mb/s can you get from copying from one pre-allocated pool to another

Are you performing a full object tree copy, and doing this with a running system to see how it affects the performance of other system components? If not, it doesn't seem to be a useful comparison. Note that this will still create a tremendous amount of heap churn, as most of the memory used by objects right now is on-heap. So copying the records is almost certainly no better for young gen pressure than what we currently do - in fact, *it probably makes the situation worse*.

bq. it's not the memtable which creates the most of the noise and memory pressure in the system (even tho it uses big chunk of heap)

It may not be causing the young gen pressure you're seeing, but it certainly offers some benefit here by keeping more rows in memory, so recent queries are more likely to be answered with zero allocation, reducing young gen pressure; it is also a foundation for improving the row cache and introducing a shared page cache, which could bring us closer to zero-allocation reads. It's also not clear to me how you would manage the reclaim of the off-heap allocations without OpOrder - or do you mean to only use off-heap buffers for readers, or to ref-count any memory as you're reading it? Not using off-heap memory for the memtables would negate the main original point of this ticket: to support larger memtables, thus reducing write amplification. Ref-counting incurs overhead linear in the size of the result set, much like copying, and is also fiddly to get right (I'm not convinced it's cleaner or neater), whereas OpOrder incurs overhead proportional to the number of times you reclaim. So if you're using OpOrder, all you're really talking about is a new RefAction: copyToAllocator() or something. So it doesn't notably reduce complexity, it just reduces the quality of the end result.

Partially Off Heap Memtables
Key: CASSANDRA-6689
URL: https://issues.apache.org/jira/browse/CASSANDRA-6689
Project: Cassandra
Issue Type: New Feature
Components: Core
Reporter: Benedict
Assignee: Benedict
Fix For: 2.1 beta2
Attachments: CASSANDRA-6689-small-changes.patch

Move the contents of ByteBuffers off-heap for records written to a memtable. (See comments for details)
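The cost trade-off argued in the comment above - ref-counting pays per item read, while an OpOrder-style barrier pays once per reclamation - can be sketched with a toy epoch-style guard. This is an illustration of the idea only, not Cassandra's actual OpOrder API; the class and method names are invented.

```java
import java.util.concurrent.atomic.AtomicLong;

// Toy epoch-style guard: readers mark the start/finish of an operation;
// the reclaimer waits until everything in flight has drained. Readers pay
// two counter increments per operation regardless of result-set size,
// unlike ref-counting, which pays per item touched.
// (The real OpOrder tracks per-group counters so that operations starting
// during the wait cannot delay the barrier indefinitely; this toy does not.)
class ToyOpOrder
{
    private final AtomicLong started = new AtomicLong();
    private final AtomicLong finished = new AtomicLong();

    void start()   { started.incrementAndGet(); }
    void finish()  { finished.incrementAndGet(); }
    long pending() { return started.get() - finished.get(); }

    // Cost paid once per reclamation, independent of items read.
    void awaitQuiescence()
    {
        while (pending() > 0)
            Thread.yield();
    }

    public static void main(String[] args)
    {
        ToyOpOrder order = new ToyOpOrder();
        order.start();              // a reader begins an operation
        order.finish();             // the reader is done
        order.awaitQuiescence();    // reclaimer: all prior readers drained
        System.out.println("safe to reclaim");  // prints "safe to reclaim"
    }
}
```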
[jira] [Commented] (CASSANDRA-6689) Partially Off Heap Memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-6689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13920721#comment-13920721 ] Benedict commented on CASSANDRA-6689: -

bq. but the reads and internode communication (especially the latter).

Also, I'd love to see some evidence for this (particularly the latter). I'm not disputing it, just would like to see what caused you to reach these conclusions. These definitely warrant separate tickets IMO, but if you have evidence for it, it would help direct any work.
[jira] [Commented] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination
[ https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13920786#comment-13920786 ] Piotr Kołaczkowski commented on CASSANDRA-6311: -

org/apache/cassandra/hadoop/cql3/CqlConfigHelper.java:275
{noformat}
Optional<SSLOptions> ssLOptions = getSSLOptions(conf);
{noformat}
typo: ssL -> ssl
--
org/apache/cassandra/hadoop/cql3/CqlConfigHelper.java:398:
{noformat}
Optional<Integer> maxSimultaneousRequests = getInputMinSimultReqPerConnections(conf);
Optional<Integer> minSimultaneousRequests = getInputMaxSimultReqPerConnections(conf);
{noformat}
min and max swapped?
--
org/apache/cassandra/hadoop/cql3/CqlConfigHelper.java:549:
{noformat}
Optional<String> keystorePassword = getInputNativeSSLTruststorePassword(conf);
{noformat}
should be:
{noformat}
Optional<String> keystorePassword = getInputNativeSSLKeystorePassword(conf);
{noformat}
--
org/apache/cassandra/hadoop/cql3/CqlConfigHelper.java:524:
{noformat}
return new AbstractIterator<Host>()
{
    protected Host computeNext()
    {
        return origHost;
    }
};
{noformat}
Not sure if it was the intent to create an infinite iterator returning nulls or the same host over and over again here. According to the docs, Guava iterator implementations *must* invoke endOfData() to terminate iteration. Don't we need an iterator here returning just one item, stickHost, and let the driver handle the rest? Also, I'm not sure whether returning nulls here is allowed at all (the driver docs aren't explicit on that). I guess it is very likely going to NPE if there is a connection problem, which might cause confusion. Probably a better solution would be to just return stickHost and let the driver attempt connecting and throw a meaningful error message upon failure. BTW the implementation of the LoadBalancingPolicy, having two fields origHost and stickHost, is redundant, and using null on one of those to mark the host as down/unreachable does not convey the intent clearly to me. Can't we just use stickHost and a direct boolean flag denoting whether it is reachable or not?
--
org/apache/cassandra/hadoop/cql3/CqlConfigHelper.java:591:
{noformat}
private static Optional<String> getStringSetting(String parameter, Configuration conf)
{
    String setting = conf.get(parameter);
    if (setting == null || setting.isEmpty())
        return Optional.absent();
    return Optional.of(setting);
}
{noformat}
In getStringSetting, an empty string is considered an absent option - so it is not possible to have an empty string setting (not sure if that would be useful - just double checking whether it was on purpose or by omission).
--
{noformat}
 * 2) where clause must include token(partition_key1 ... partition_keyn) > ? and
 *    token(partition_key1 ... partition_keyn) <= ?
{noformat}
Would be nice to have at least some basic validation of the WHERE clause, so the user gets a nice error message when they screw it up.
--
org/apache/cassandra/hadoop/cql3/CqlRecordReader.java:230
{noformat}
public RowIterator(Configuration conf)
{noformat}
conf not used
--
org/apache/cassandra/hadoop/cql3/CqlRecordReader.java:268
{noformat}
return Pair.create(Long.valueOf(keyId), row);
{noformat}
Boxing is not needed here.

Add CqlRecordReader to take advantage of native CQL pagination
Key: CASSANDRA-6311
URL: https://issues.apache.org/jira/browse/CASSANDRA-6311
Project: Cassandra
Issue Type: New Feature
Components: Hadoop
Reporter: Alex Liu
Assignee: Alex Liu
Fix For: 2.0.6
Attachments: 6311-v3-2.0-branch.txt, 6311-v4.txt, 6311-v5-2.0-branch.txt, 6331-2.0-branch.txt, 6331-v2-2.0-branch.txt

Since the latest CQL pagination is implemented and should be more efficient, we need to update CqlPagingRecordReader to use it instead of the custom thrift paging.
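The empty-string observation in the review above can be demonstrated in isolation. The sketch below mirrors the quoted getStringSetting logic, but substitutes java.util.Optional for Guava's Optional and a plain Map for Hadoop's Configuration, so all names here are stand-ins:

```java
import java.util.Map;
import java.util.Optional;

// Mirrors the reviewed getStringSetting: null and "" both map to absent,
// so an empty string can never round-trip through the configuration.
final class SettingsSketch
{
    static Optional<String> getStringSetting(Map<String, String> conf, String parameter)
    {
        String setting = conf.get(parameter);
        if (setting == null || setting.isEmpty())
            return Optional.empty();   // empty string treated the same as unset
        return Optional.of(setting);
    }

    public static void main(String[] args)
    {
        Map<String, String> conf = Map.of("a", "x", "b", "");
        System.out.println(getStringSetting(conf, "a"));        // Optional[x]
        System.out.println(getStringSetting(conf, "b"));        // Optional.empty
        System.out.println(getStringSetting(conf, "missing"));  // Optional.empty
    }
}
```

As the review notes, whether collapsing "" into absent is intentional depends on whether an empty string is ever a meaningful setting value; if it is, the isEmpty() check silently discards it.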
[jira] [Updated] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination
[ https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Kołaczkowski updated CASSANDRA-6311: - Reviewer: Piotr Kołaczkowski (was: Jonathan Ellis)
[jira] [Commented] (CASSANDRA-6283) Windows 7 data files kept open / can't be deleted after compaction.
[ https://issues.apache.org/jira/browse/CASSANDRA-6283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13920811#comment-13920811 ] Andreas Schnitzerling commented on CASSANDRA-6283: - Hello, since I don't know all code areas of C*, I describe what I tested to reproduce: I cleaned system.log and again used C* 2.0.5-rel with LEAK detection and the finalizer patch in RAR.java. After starting C* again, without doing anything, I got a lot of LEAK messages. I waited until C* finished its own work (mainly compacting, I think). Then I started repair -par. The result is a lot of LEAK messages. Here is the first one:

{panel:title=nodetool repair -par events}
ERROR [Finalizer] 2014-03-05 13:45:25,932 RandomAccessReader.java (line 394) LEAK finalizer had to clean up
java.lang.Exception: RAR for D:\Programme\cassandra\data\events\eventsbyproject\events-eventsbyproject-jb-2002-Index.db allocated
	at org.apache.cassandra.io.util.RandomAccessReader.<init>(RandomAccessReader.java:63)
	at org.apache.cassandra.io.util.RandomAccessReader.open(RandomAccessReader.java:103)
	at org.apache.cassandra.io.util.RandomAccessReader.open(RandomAccessReader.java:90)
	at org.apache.cassandra.io.util.BufferedPoolingSegmentedFile.createReader(BufferedPoolingSegmentedFile.java:45)
	at org.apache.cassandra.io.util.PoolingSegmentedFile.getSegment(PoolingSegmentedFile.java:39)
	at org.apache.cassandra.io.util.SegmentedFile$SegmentIterator.next(SegmentedFile.java:162)
	at org.apache.cassandra.io.util.SegmentedFile$SegmentIterator.next(SegmentedFile.java:143)
	at org.apache.cassandra.io.sstable.SSTableReader.getPosition(SSTableReader.java:936)
	at org.apache.cassandra.io.sstable.SSTableReader.getPosition(SSTableReader.java:871)
	at org.apache.cassandra.io.sstable.SSTableReader.getPositionsForRanges(SSTableReader.java:783)
	at org.apache.cassandra.io.sstable.SSTableReader.getScanner(SSTableReader.java:1186)
	at org.apache.cassandra.io.sstable.SSTableReader.getScanner(SSTableReader.java:1174)
	at org.apache.cassandra.db.compaction.AbstractCompactionStrategy.getScanners(AbstractCompactionStrategy.java:252)
	at org.apache.cassandra.db.compaction.CompactionManager$ValidationCompactionIterable.<init>(CompactionManager.java:888)
	at org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:787)
	at org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:62)
	at org.apache.cassandra.db.compaction.CompactionManager$8.call(CompactionManager.java:397)
	at java.util.concurrent.FutureTask.run(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.lang.Thread.run(Unknown Source)
{panel}

If I can make more tests, let me know. After Thursday I will be on holiday for 3 weeks and in the office again on Mon, 03/31/2014.

Windows 7 data files kept open / can't be deleted after compaction.
Key: CASSANDRA-6283
URL: https://issues.apache.org/jira/browse/CASSANDRA-6283
Project: Cassandra
Issue Type: Bug
Components: Core
Environment: Windows 7 (32) / Java 1.7.0.45
Reporter: Andreas Schnitzerling
Assignee: Joshua McKenzie
Labels: compaction
Fix For: 2.0.6
Attachments: 6283_StreamWriter_patch.txt, leakdetect.patch, screenshot-1.jpg, system.log

Files cannot be deleted; patch CASSANDRA-5383 (Win7 deleting problem) doesn't help on Win 7 with Cassandra 2.0.2. Even the 2.1 snapshot is not running. The cause is: opened file handles seem to be lost and not closed properly. Win 7 complains that another process is still using the file (but it's obviously cassandra). Only a restart of the server gets the files deleted. But after heavy use (changes) of tables, there are about 24K files in the data folder (instead of 35 after every restart) and Cassandra crashes. I experimented and found out that a finalizer fixes the problem. So after GC the files are deleted (not optimal, but working fine). It has now run 2 days continuously without problem. Possible fix/test: I wrote the following finalizer at the end of class org.apache.cassandra.io.util.RandomAccessReader:

{code:title=RandomAccessReader.java|borderStyle=solid}
@Override
protected void finalize() throws Throwable
{
    deallocate();
    super.finalize();
}
{code}

Can somebody test / develop / patch it? Thx.
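For reference, the same safety-net idea can be expressed without overriding finalize(). The sketch below uses java.lang.ref.Cleaner (Java 9+, so not applicable to the 1.7 environment in this report); the class names are invented for illustration and this is not Cassandra's code:

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicInteger;

// Safety-net cleanup without finalize(): the Cleaner runs the deallocation
// at most once, either on explicit close() or after the reader becomes
// unreachable, avoiding the finalizer queue's ordering/latency problems.
class PooledReaderSketch implements AutoCloseable
{
    private static final Cleaner CLEANER = Cleaner.create();
    static final AtomicInteger DEALLOCATIONS = new AtomicInteger(); // demo counter

    // Must NOT reference the enclosing instance, or it could never be GC'd.
    private static final class State implements Runnable
    {
        public void run()
        {
            // stands in for deallocate(): release the file handle / buffers
            DEALLOCATIONS.incrementAndGet();
        }
    }

    private final Cleaner.Cleanable cleanable = CLEANER.register(this, new State());

    @Override
    public void close()
    {
        cleanable.clean(); // runs State.run() at most once, even if called twice
    }
}
```

The explicit close() path keeps deallocation deterministic; the Cleaner only mops up readers that were leaked, which is the same role the proposed finalizer plays.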
[jira] [Comment Edited] (CASSANDRA-6283) Windows 7 data files kept open / can't be deleted after compaction.
[ https://issues.apache.org/jira/browse/CASSANDRA-6283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13920811#comment-13920811 ] Andreas Schnitzerling edited comment on CASSANDRA-6283 at 3/5/14 12:59 PM: - Hello, since I don't know all code areas of C*, I describe what I tested to reproduce: I cleaned system.log and again used C* 2.0.5-rel with LEAK detection and the finalizer patch in RAR.java. After starting C* again, without doing anything, I got a lot of LEAK messages. I waited until C* finished its own work (mainly compacting, I think). Then I started repair -par. The result is a lot of LEAK messages. Here is the first one:

{panel:title=nodetool repair -par events}
ERROR [Finalizer] 2014-03-05 13:45:25,932 RandomAccessReader.java (line 394) LEAK finalizer had to clean up
java.lang.Exception: RAR for D:\Programme\cassandra\data\events\eventsbyproject\events-eventsbyproject-jb-2002-Index.db allocated
	at org.apache.cassandra.io.util.RandomAccessReader.<init>(RandomAccessReader.java:63)
	at org.apache.cassandra.io.util.RandomAccessReader.open(RandomAccessReader.java:103)
	at org.apache.cassandra.io.util.RandomAccessReader.open(RandomAccessReader.java:90)
	at org.apache.cassandra.io.util.BufferedPoolingSegmentedFile.createReader(BufferedPoolingSegmentedFile.java:45)
	at org.apache.cassandra.io.util.PoolingSegmentedFile.getSegment(PoolingSegmentedFile.java:39)
	at org.apache.cassandra.io.util.SegmentedFile$SegmentIterator.next(SegmentedFile.java:162)
	at org.apache.cassandra.io.util.SegmentedFile$SegmentIterator.next(SegmentedFile.java:143)
	at org.apache.cassandra.io.sstable.SSTableReader.getPosition(SSTableReader.java:936)
	at org.apache.cassandra.io.sstable.SSTableReader.getPosition(SSTableReader.java:871)
	at org.apache.cassandra.io.sstable.SSTableReader.getPositionsForRanges(SSTableReader.java:783)
	at org.apache.cassandra.io.sstable.SSTableReader.getScanner(SSTableReader.java:1186)
	at org.apache.cassandra.io.sstable.SSTableReader.getScanner(SSTableReader.java:1174)
	at org.apache.cassandra.db.compaction.AbstractCompactionStrategy.getScanners(AbstractCompactionStrategy.java:252)
	at org.apache.cassandra.db.compaction.CompactionManager$ValidationCompactionIterable.<init>(CompactionManager.java:888)
	at org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:787)
	at org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:62)
	at org.apache.cassandra.db.compaction.CompactionManager$8.call(CompactionManager.java:397)
	at java.util.concurrent.FutureTask.run(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.lang.Thread.run(Unknown Source)
{panel}

{panel:title=neighbor node}
ERROR [Finalizer] 2014-03-05 13:50:54,061 RandomAccessReader.java (line 394) LEAK finalizer had to clean up
java.lang.Exception: RAR for D:\Programme\cassandra\data\events\evrangesdevice\events-evrangesdevice-jb-905-Index.db allocated
	at org.apache.cassandra.io.util.RandomAccessReader.<init>(RandomAccessReader.java:63)
	at org.apache.cassandra.io.util.RandomAccessReader.open(RandomAccessReader.java:103)
	at org.apache.cassandra.io.util.RandomAccessReader.open(RandomAccessReader.java:90)
	at org.apache.cassandra.io.util.BufferedPoolingSegmentedFile.createReader(BufferedPoolingSegmentedFile.java:45)
	at org.apache.cassandra.io.util.PoolingSegmentedFile.getSegment(PoolingSegmentedFile.java:39)
	at org.apache.cassandra.io.util.SegmentedFile$SegmentIterator.next(SegmentedFile.java:162)
	at org.apache.cassandra.io.util.SegmentedFile$SegmentIterator.next(SegmentedFile.java:143)
	at org.apache.cassandra.io.sstable.SSTableReader.getPosition(SSTableReader.java:936)
	at org.apache.cassandra.io.sstable.SSTableReader.getPosition(SSTableReader.java:871)
	at org.apache.cassandra.io.sstable.SSTableReader.getPositionsForRanges(SSTableReader.java:788)
	at org.apache.cassandra.io.sstable.SSTableReader.getScanner(SSTableReader.java:1186)
	at org.apache.cassandra.io.sstable.SSTableReader.getScanner(SSTableReader.java:1174)
	at org.apache.cassandra.db.compaction.AbstractCompactionStrategy.getScanners(AbstractCompactionStrategy.java:252)
	at org.apache.cassandra.db.compaction.CompactionManager$ValidationCompactionIterable.<init>(CompactionManager.java:888)
	at org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:787)
	at org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:62) at
[3/3] git commit: add Thrift get_multi_slice call patch by Ed Capriolo; reviewed by Tyler Hobbs for CASSANDRA-6757
add Thrift get_multi_slice call
patch by Ed Capriolo; reviewed by Tyler Hobbs for CASSANDRA-6757

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/60fb9230
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/60fb9230
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/60fb9230
Branch: refs/heads/trunk
Commit: 60fb923018a6fd2dabf04a1d4500f7b29a23a6f1
Parents: 630d3b9
Author: Jonathan Ellis jbel...@apache.org
Authored: Wed Mar 5 07:57:25 2014 -0600
Committer: Jonathan Ellis jbel...@apache.org
Committed: Wed Mar 5 08:02:59 2014 -0600
--
 CHANGES.txt                                     |    1 +
 interface/cassandra.thrift                      |   37 +-
 .../org/apache/cassandra/thrift/Cassandra.java  | 3071 +-
 .../apache/cassandra/thrift/ColumnSlice.java    |  551
 .../cassandra/thrift/MultiSliceRequest.java     | 1042 ++
 .../cassandra/thrift/cassandraConstants.java    |    2 +-
 .../cassandra/thrift/CassandraServer.java       |   68 +
 test/system/test_thrift_server.py               |   34 +
 .../cassandra/db/ColumnFamilyStoreTest.java     |    1 +
 .../apache/cassandra/thrift/MultiSliceTest.java |  149 +
 10 files changed, 4050 insertions(+), 906 deletions(-)
--
http://git-wip-us.apache.org/repos/asf/cassandra/blob/60fb9230/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index e324225..1c0941b 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,5 +1,6 @@
 3.0
  * Remove CQL2 (CASSANDRA-5918)
+ * add Thrift get_multi_slice call (CASSANDRA-6757)
 2.1.0-beta2
http://git-wip-us.apache.org/repos/asf/cassandra/blob/60fb9230/interface/cassandra.thrift
--
diff --git a/interface/cassandra.thrift b/interface/cassandra.thrift
index e46b85e..b6b06dc 100644
--- a/interface/cassandra.thrift
+++ b/interface/cassandra.thrift
@@ -55,7 +55,7 @@ namespace rb CassandraThrift
 # An effort should be made not to break forward-client-compatibility either
 # (e.g. one should avoid removing obsolete fields from the IDL), but no
 # guarantees in this respect are made by the Cassandra project.
-const string VERSION = "20.0.0"
+const string VERSION = "20.1.0"
 #
@@ -563,6 +563,35 @@ struct CfSplit {
     3: required i64 row_count
 }
+/** The ColumnSlice is used to select a set of columns from inside a row.
+ * If start or finish are unspecified they will default to the start-of /
+ * end-of value.
+ * @param start. The start of the ColumnSlice inclusive
+ * @param finish. The end of the ColumnSlice inclusive
+ */
+struct ColumnSlice {
+    1: optional binary start,
+    2: optional binary finish
+}
+
+/**
+ * Used to perform multiple slices on a single row key in one rpc operation
+ * @param key. The row key to be multi sliced
+ * @param column_parent. The column family (super columns are unsupported)
+ * @param column_slices. 0 to many ColumnSlice objects each will be used to select columns
+ * @param reversed. Direction of slice
+ * @param count. Maximum number of columns
+ * @param consistency_level. Level to perform the operation at
+ */
+struct MultiSliceRequest {
+    1: optional binary key,
+    2: optional ColumnParent column_parent,
+    3: optional list<ColumnSlice> column_slices,
+    4: optional bool reversed=false,
+    5: optional i32 count=1000,
+    6: optional ConsistencyLevel consistency_level=ConsistencyLevel.ONE
+}
+
 service Cassandra {
   # auth methods
   void login(1: required AuthenticationRequest auth_request) throws (1:AuthenticationException authnx, 2:AuthorizationException authzx),
@@ -741,7 +770,11 @@ service Cassandra {
   void truncate(1:required string cfname)
     throws (1: InvalidRequestException ire, 2: UnavailableException ue, 3: TimedOutException te),
-
+  /**
+   * Select multiple slices of a key in a single RPC operation
+   */
+  list<ColumnOrSuperColumn> get_multi_slice(1:required MultiSliceRequest request)
+    throws (1:InvalidRequestException ire, 2:UnavailableException ue, 3:TimedOutException te),
   // Meta-APIs -- APIs to get information about the node or cluster,
   // rather than user data. The nodeprobe program provides usage examples.
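The new get_multi_slice call bundles several column ranges for one row key into a single round trip. As a rough sketch of the selection semantics, here is plain Java over a sorted map standing in for a row; the `multiSlice` and `demo` helpers are illustrative only, not part of the generated Thrift API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

class MultiSliceDemo {
    // Emulates get_multi_slice over one row: each (start, finish) pair is an
    // inclusive column-name range; results from all slices are concatenated
    // and capped at 'count', mirroring MultiSliceRequest's count field.
    static List<String> multiSlice(TreeMap<String, String> row, String[][] slices, int count) {
        List<String> result = new ArrayList<>();
        for (String[] slice : slices) {
            // null start/finish default to the beginning/end of the row,
            // as described for ColumnSlice above
            String start = slice[0] == null ? row.firstKey() : slice[0];
            String finish = slice[1] == null ? row.lastKey() : slice[1];
            for (String col : row.subMap(start, true, finish, true).keySet()) {
                if (result.size() >= count)
                    return result;
                result.add(col);
            }
        }
        return result;
    }

    // Two slices of one row in a single "request": [a..b] and [d..e].
    static String demo() {
        TreeMap<String, String> row = new TreeMap<>();
        for (String c : new String[]{"a", "b", "c", "d", "e"})
            row.put(c, "v");
        return multiSlice(row, new String[][]{{"a", "b"}, {"d", "e"}}, 10).toString();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [a, b, d, e]
    }
}
```

The point of the API shape is that several inclusive ranges plus one shared count limit travel in one RPC, instead of one round trip per slice.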
[jira] [Commented] (CASSANDRA-6283) Windows 7 data files keept open / can't be deleted after compaction.
[ https://issues.apache.org/jira/browse/CASSANDRA-6283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920931#comment-13920931 ] Joshua McKenzie commented on CASSANDRA-6283:

Could you attach the system.log from the root and neighbor nodes to this ticket? It might help to see whether there's anything else going on in the environment involved here.

Windows 7 data files keept open / can't be deleted after compaction.

Key: CASSANDRA-6283
URL: https://issues.apache.org/jira/browse/CASSANDRA-6283
Project: Cassandra
Issue Type: Bug
Components: Core
Environment: Windows 7 (32) / Java 1.7.0.45
Reporter: Andreas Schnitzerling
Assignee: Joshua McKenzie
Labels: compaction
Fix For: 2.0.6
Attachments: 6283_StreamWriter_patch.txt, leakdetect.patch, screenshot-1.jpg, system.log

Files cannot be deleted; the CASSANDRA-5383 patch (the Win7 deletion problem) doesn't help on Win 7 with Cassandra 2.0.2. Even the 2.1 snapshot is not working. The cause: opened file handles seem to be lost and never closed properly. Win 7 complains that another process is still using the file (but it's obviously Cassandra). Only a restart of the server gets the files deleted. After heavy use (changes) of tables, there are about 24K files in the data folder (instead of 35 after every restart) and Cassandra crashes. I experimented and found that a finalizer fixes the problem, so after GC the files get deleted (not optimal, but working fine). It has now run for 2 days continuously without problems.

Possible fix/test: I wrote the following finalizer at the end of class org.apache.cassandra.io.util.RandomAccessReader:

{code:title=RandomAccessReader.java|borderStyle=solid}
@Override
protected void finalize() throws Throwable
{
    deallocate();
    super.finalize();
}
{code}

Can somebody test / develop / patch it? Thx.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
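The finalizer proposed in the issue description is essentially a leak backstop: explicit deallocate() remains the normal cleanup path, and finalize() only catches readers that were dropped without being closed. A minimal self-contained sketch of that pattern follows; the class and its internals are illustrative, not the actual RandomAccessReader code:

```java
// Finalizer-as-backstop pattern: explicit deallocate() is the normal path;
// finalize() only covers instances that leaked without being closed.
// Names and internals are illustrative, not RandomAccessReader's.
class LeakGuardedReader {
    private boolean open = true;

    // Normal, explicit cleanup path (what callers should always use).
    // Idempotent, so the finalizer can call it safely after an explicit close.
    public synchronized void deallocate() {
        if (open) {
            open = false;
            // ... release the underlying file handle / buffer here ...
        }
    }

    public synchronized boolean isOpen() {
        return open;
    }

    // Backstop: if the reader leaked, the GC eventually closes it.
    @Override
    protected void finalize() throws Throwable {
        try {
            deallocate();
        } finally {
            super.finalize();
        }
    }
}
```

Because finalization runs at the GC's discretion, leaked handles can stay open arbitrarily long; that matches the "not optimal, but working" observation above and is why the pattern is better suited as a diagnostic or safety net than as the fix.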
[jira] [Comment Edited] (CASSANDRA-6283) Windows 7 data files keept open / can't be deleted after compaction.
[ https://issues.apache.org/jira/browse/CASSANDRA-6283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920931#comment-13920931 ] Joshua McKenzie edited comment on CASSANDRA-6283 at 3/5/14 3:01 PM:

Could you attach the most recent system.log from the root and neighbor nodes to this ticket? It might help to see whether there's anything else going on in the environment involved here.

was (Author: joshuamckenzie): Could you attach the system.log from the root and neighbor nodes to this ticket? Might help see if there's anything else going on there in the environment involved in this.
[jira] [Updated] (CASSANDRA-6800) ant codecoverage no longer works due jdk 1.7
[ https://issues.apache.org/jira/browse/CASSANDRA-6800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-6800:
--
Reviewer: Jonathan Ellis
Component/s: Tests
Priority: Minor (was: Major)

ant codecoverage no longer works due jdk 1.7

Key: CASSANDRA-6800
URL: https://issues.apache.org/jira/browse/CASSANDRA-6800
Project: Cassandra
Issue Type: Bug
Components: Tests
Reporter: Edward Capriolo
Assignee: Edward Capriolo
Priority: Minor
Fix For: 2.1 beta2

Code coverage does not currently run due to a cobertura/JDK incompatibility. A fix is coming.
[jira] [Commented] (CASSANDRA-6800) ant codecoverage no longer works due jdk 1.7
[ https://issues.apache.org/jira/browse/CASSANDRA-6800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920955#comment-13920955 ] Jonathan Ellis commented on CASSANDRA-6800:
---

I'm getting a lot of errors even after realclean. The first is:

{noformat}
cobertura-instrument:
[cobertura-instrument] Cobertura null - GNU GPL License (NO WARRANTY) - See COPYRIGHT file
[cobertura-instrument] WARN instrumentClass, Unable to instrument file /Users/jbellis/projects/cassandra/git/build/classes/main/org/apache/cassandra/cli/CliClient.class
[cobertura-instrument] java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.cassandra.thrift.CounterSuperColumn
[cobertura-instrument]     at org.objectweb.asm.ClassWriter.getCommonSuperClass(Unknown Source)
[cobertura-instrument]     at org.objectweb.asm.ClassWriter.a(Unknown Source)
[cobertura-instrument]     at org.objectweb.asm.Frame.a(Unknown Source)
[cobertura-instrument]     at org.objectweb.asm.Frame.a(Unknown Source)
[cobertura-instrument]     at org.objectweb.asm.MethodWriter.visitMaxs(Unknown Source)
[cobertura-instrument]     at org.objectweb.asm.MethodVisitor.visitMaxs(Unknown Source)
[cobertura-instrument]     at org.objectweb.asm.util.CheckMethodAdapter.visitMaxs(Unknown Source)
[cobertura-instrument]     at org.objectweb.asm.MethodVisitor.visitMaxs(Unknown Source)
[cobertura-instrument]     at org.objectweb.asm.commons.LocalVariablesSorter.visitMaxs(Unknown Source)
[cobertura-instrument]     at org.objectweb.asm.tree.MethodNode.accept(Unknown Source)
[cobertura-instrument]     at org.objectweb.asm.util.CheckMethodAdapter$1.visitEnd(Unknown Source)
[cobertura-instrument]     at org.objectweb.asm.MethodVisitor.visitEnd(Unknown Source)
[cobertura-instrument]     at org.objectweb.asm.util.CheckMethodAdapter.visitEnd(Unknown Source)
[cobertura-instrument]     at org.objectweb.asm.ClassReader.b(Unknown Source)
[cobertura-instrument]     at org.objectweb.asm.ClassReader.accept(Unknown Source)
[cobertura-instrument]     at org.objectweb.asm.ClassReader.accept(Unknown Source)
[cobertura-instrument]     at net.sourceforge.cobertura.instrument.CoberturaInstrumenter.instrumentClass(CoberturaInstrumenter.java:204)
[cobertura-instrument]     at net.sourceforge.cobertura.instrument.CoberturaInstrumenter.instrumentClass(CoberturaInstrumenter.java:121)
[cobertura-instrument]     at net.sourceforge.cobertura.instrument.CoberturaInstrumenter.addInstrumentationToSingleClass(CoberturaInstrumenter.java:233)
[cobertura-instrument]     at net.sourceforge.cobertura.instrument.Main.addInstrumentationToSingleClass(Main.java:274)
[cobertura-instrument]     at net.sourceforge.cobertura.instrument.Main.addInstrumentation(Main.java:283)
[cobertura-instrument]     at net.sourceforge.cobertura.instrument.Main.parseArguments(Main.java:373)
[cobertura-instrument]     at net.sourceforge.cobertura.instrument.Main.main(Main.java:395)
{noformat}
[jira] [Commented] (CASSANDRA-6800) ant codecoverage no longer works due jdk 1.7
[ https://issues.apache.org/jira/browse/CASSANDRA-6800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920960#comment-13920960 ] Edward Capriolo commented on CASSANDRA-6800:

I noticed that. This is pretty weird. From maven I have used the cobertura plugin and it worked great. What a PITA ant is. Maybe we should switch to maven :) I made it all the way through the process and it built the cobertura.ser, but ran into some problem with the report target. I will keep looking at it for a bit.
[jira] [Commented] (CASSANDRA-6147) Break timestamp ties for thrift-ers
[ https://issues.apache.org/jira/browse/CASSANDRA-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920986#comment-13920986 ] Nate McCall commented on CASSANDRA-6147:

That would actually be very helpful and would not break anything in the wild (Astyanax, Hector and (I'm pretty sure) pycassa all assert a not-null timestamp on egress anyhoo), so it would be unusual for someone to be relying on this as validation currently.

Break timestamp ties for thrift-ers
---
Key: CASSANDRA-6147
URL: https://issues.apache.org/jira/browse/CASSANDRA-6147
Project: Cassandra
Issue Type: Sub-task
Reporter: Edward Capriolo
Assignee: Edward Capriolo
Fix For: 2.1 beta2

Thrift users are still forced to generate timestamps on the client side. Currently, the way the thrift bindings are generated, users are forced to supply timestamps. There are two solutions I see.
* -1 as timestamp means generate on the server side. This is a breaking change for those using -1 as a timestamp (which should effectively be no one).
* Prepare yourself: our thrift signatures are wrong. You can't overload methods in thrift.
thrift.get(byte[], byte[], ts)
should REALLY be changed to
GetRequest g = new GetRequest()
g.setName()
g.setValue()
g.setTs() // optional
thrift.get( g )
I know no one is going to want to make this change because thrift is quasi-dead, but it would allow us to evolve thrift in a meaningful way. We could simply add these new methods under different names as well.
[jira] [Commented] (CASSANDRA-6591) un-deprecate cache recentHitRate and expose in o.a.c.metrics
[ https://issues.apache.org/jira/browse/CASSANDRA-6591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920988#comment-13920988 ] Chris Burroughs commented on CASSANDRA-6591:

Sorry, I'm not following. If we are getting requests but no hits (mostly misses), the hit rate going down is what I would expect.

un-deprecate cache recentHitRate and expose in o.a.c.metrics

Key: CASSANDRA-6591
URL: https://issues.apache.org/jira/browse/CASSANDRA-6591
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Chris Burroughs
Assignee: Chris Burroughs
Priority: Minor
Attachments: j6591-1.2-v1.txt, j6591-1.2-v2.txt, j6591-1.2-v3.txt

recentHitRate metrics were not added as part of CASSANDRA-4009 because there is not an obvious way to do it with the Metrics library. Instead, hitRate was added as an all-time measurement since node restart. This does allow changes in cache hit rate (aka production performance problems) to be detected. Ideally there would be 1/5/15 moving averages for the hit rate, but I'm not sure how to calculate that. Instead I propose updating recentHitRate on a fixed interval and exposing that as a Gauge.
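The proposal in the description — update recentHitRate on a fixed interval and expose it as a Gauge — can be sketched with plain cumulative counters and a periodic snapshot. None of the names below are the actual o.a.c.metrics classes; this is just an illustration of the interval approach:

```java
// Sketch of the proposal above: keep cumulative hit/request counters (as the
// all-time hitRate already does) and, on a fixed schedule, turn the deltas
// since the last snapshot into a "recent" hit rate exposed as a gauge.
// All names here are illustrative, not the actual o.a.c.metrics types.
import java.util.concurrent.atomic.AtomicLong;

class RecentHitRateGauge {
    private final AtomicLong hits = new AtomicLong();
    private final AtomicLong requests = new AtomicLong();
    private long lastHits, lastRequests;
    private volatile double recentHitRate = Double.NaN;

    // Called on every cache lookup.
    public void record(boolean hit) {
        requests.incrementAndGet();
        if (hit)
            hits.incrementAndGet();
    }

    // Called by a scheduled task every N seconds (the "fixed interval").
    public synchronized void tick() {
        long h = hits.get(), r = requests.get();
        long deltaRequests = r - lastRequests;
        // NaN when there were no requests in the window
        recentHitRate = deltaRequests == 0
                ? Double.NaN
                : (double) (h - lastHits) / deltaRequests;
        lastHits = h;
        lastRequests = r;
    }

    // What the Gauge would report.
    public double getValue() {
        return recentHitRate;
    }
}
```

Each tick() reports the hit rate of the window since the previous tick, so a burst of misses drags the gauge down within one interval instead of being diluted by the all-time average — which is also why the value dropping under heavy misses is the expected behavior.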
[jira] [Commented] (CASSANDRA-6147) Break timestamp ties for thrift-ers
[ https://issues.apache.org/jira/browse/CASSANDRA-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13921007#comment-13921007 ] Edward Capriolo commented on CASSANDRA-6147:

I do not think this would break anything. Anything currently out there must be setting the timestamp explicitly; anything not setting the timestamp is just getting 0. Users quickly find out what happens when two inserts have the same 0 timestamp.
[jira] [Commented] (CASSANDRA-6147) Break timestamp ties for thrift-ers
[ https://issues.apache.org/jira/browse/CASSANDRA-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13921018#comment-13921018 ] Jonathan Ellis commented on CASSANDRA-6147:
---

I'm a little confused, are we changing scope on this ticket from break timestamp ties to allow opting in to server-side timestamps? nanotime is basically random so that would break ties but not very usefully :)
[jira] [Commented] (CASSANDRA-6147) Break timestamp ties for thrift-ers
[ https://issues.apache.org/jira/browse/CASSANDRA-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13921033#comment-13921033 ] Edward Capriolo commented on CASSANDRA-6147:

[~jbellis] You are right, I kinda stole this ticket. The point of this patch is that if CQL can auto-timestamp things, thrift should be able to as well. Would you like me to open another ticket? Should the auto-timestamp be system.currentTimeMillis() + 1000? How does CQL arrive at its auto timestamp?
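To make the auto-timestamp question above concrete: CQL-style server-side timestamps are conventionally microseconds since the Unix epoch, derived from the wall clock. The sketch below is an illustration of that convention, not Cassandra's actual implementation:

```java
// Sketch of server-side auto-timestamps: microseconds since the Unix epoch,
// i.e. currentTimeMillis() multiplied by 1000 (not "+ 1000"). Two writes
// landing in the same millisecond get the *same* timestamp, which is exactly
// the kind of tie this ticket is about. Illustrative only.
class ServerTimestamp {
    // Microseconds since epoch, at millisecond resolution.
    static long microsNow() {
        return System.currentTimeMillis() * 1000;
    }

    public static void main(String[] args) {
        long t1 = microsNow();
        long t2 = microsNow();
        // Back-to-back calls usually fall in the same millisecond: a tie.
        System.out.println(t1 + " " + t2 + " tie=" + (t1 == t2));
    }
}
```

Because the low three digits are always zero at millisecond resolution, any two mutations in the same millisecond collide, which is why a deterministic tie-breaker (rather than something effectively random like nanoTime) is needed.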
[6/6] git commit: Merge branch 'cassandra-2.1' into trunk
Merge branch 'cassandra-2.1' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f601cac0 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f601cac0 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f601cac0 Branch: refs/heads/trunk Commit: f601cac021be203b0c4caa8375a3c9eb3ee94b70 Parents: 60fb923 7f7a9cc Author: Brandon Williams brandonwilli...@apache.org Authored: Wed Mar 5 11:23:59 2014 -0600 Committer: Brandon Williams brandonwilli...@apache.org Committed: Wed Mar 5 11:23:59 2014 -0600 -- --
[2/6] git commit: Add hadoop progressable compatibility. Patch by Ben Coverston, reviewed by brandonwilliams for CASSANDRA-5201
Add hadoop progressable compatibility.
Patch by Ben Coverston, reviewed by brandonwilliams for CASSANDRA-5201

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/24923083
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/24923083
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/24923083
Branch: refs/heads/cassandra-2.1
Commit: 249230834c2ce1ac169b2b3228d5d222f5ecacc2
Parents: ab2717b
Author: Brandon Williams brandonwilli...@apache.org
Authored: Wed Mar 5 11:21:35 2014 -0600
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Wed Mar 5 11:21:35 2014 -0600
--
 build.xml                                       |   3 -
 .../hadoop/AbstractColumnFamilyInputFormat.java |   1 -
 .../AbstractColumnFamilyOutputFormat.java       |   1 -
 .../AbstractColumnFamilyRecordWriter.java       |   2 +
 .../cassandra/hadoop/BulkOutputFormat.java      |   3 +-
 .../cassandra/hadoop/BulkRecordWriter.java      |  16 +-
 .../hadoop/ColumnFamilyInputFormat.java         |   1 -
 .../hadoop/ColumnFamilyOutputFormat.java        |   2 +-
 .../hadoop/ColumnFamilyRecordReader.java        |   1 -
 .../hadoop/ColumnFamilyRecordWriter.java        |  15 +-
 .../apache/cassandra/hadoop/HadoopCompat.java   | 309 +++
 .../apache/cassandra/hadoop/Progressable.java   |  50 ---
 .../cassandra/hadoop/cql3/CqlOutputFormat.java  |   3 +-
 .../hadoop/cql3/CqlPagingInputFormat.java       |   2 +-
 .../hadoop/cql3/CqlPagingRecordReader.java      |   2 +-
 .../cassandra/hadoop/cql3/CqlRecordWriter.java  |  12 +-
 .../cassandra/hadoop/pig/CassandraStorage.java  |   2 +-
 .../apache/cassandra/hadoop/pig/CqlStorage.java |   3 +-
 18 files changed, 346 insertions(+), 82 deletions(-)
--
http://git-wip-us.apache.org/repos/asf/cassandra/blob/24923083/build.xml
--
diff --git a/build.xml b/build.xml
index 77b2639..9972aa2 100644
--- a/build.xml
+++ b/build.xml
@@ -367,7 +367,6 @@
         </dependency>
         <dependency groupId="org.apache.hadoop" artifactId="hadoop-core" version="1.0.3"/>
         <dependency groupId="org.apache.hadoop" artifactId="hadoop-minicluster" version="1.0.3"/>
-        <dependency groupId="com.twitter.elephantbird" artifactId="elephant-bird-hadoop-compat" version="4.3"/>
         <dependency groupId="org.apache.pig" artifactId="pig" version="0.10.0"/>
         <dependency groupId="net.java.dev.jna" artifactId="jna" version="3.2.7"/>
@@ -410,7 +409,6 @@
         <dependency groupId="org.apache.rat" artifactId="apache-rat"/>
         <dependency groupId="org.apache.hadoop" artifactId="hadoop-core"/>
         <dependency groupId="org.apache.hadoop" artifactId="hadoop-minicluster"/>
-        <dependency groupId="com.twitter.elephantbird" artifactId="elephant-bird-hadoop-compat"/>
         <dependency groupId="org.apache.pig" artifactId="pig"/>
         <dependency groupId="net.java.dev.jna" artifactId="jna"/>
@@ -474,7 +472,6 @@
       <!-- don't need hadoop classes to run, but if you use the hadoop stuff -->
       <dependency groupId="org.apache.hadoop" artifactId="hadoop-core" optional="true"/>
       <dependency groupId="org.apache.hadoop" artifactId="hadoop-minicluster" optional="true"/>
-      <dependency groupId="com.twitter.elephantbird" artifactId="elephant-bird-hadoop-compat" optional="true"/>
       <dependency groupId="org.apache.pig" artifactId="pig" optional="true"/>
       <!-- don't need jna to run, but nice to have -->
--
http://git-wip-us.apache.org/repos/asf/cassandra/blob/24923083/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyInputFormat.java
--
diff --git a/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyInputFormat.java b/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyInputFormat.java
index f547fd0..ba79eee 100644
--- a/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyInputFormat.java
+++ b/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyInputFormat.java
@@ -29,7 +29,6 @@ import java.util.concurrent.TimeUnit;
 import com.google.common.collect.ImmutableList;
 import com.google.common.collect.Lists;
-import com.twitter.elephantbird.util.HadoopCompat;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
--
http://git-wip-us.apache.org/repos/asf/cassandra/blob/24923083/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyOutputFormat.java
--
diff --git a/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyOutputFormat.java b/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyOutputFormat.java
index a3c4234..3041829 100644
--- a/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyOutputFormat.java
+++
[4/6] git commit: Merge branch 'cassandra-2.0' into cassandra-2.1
Merge branch 'cassandra-2.0' into cassandra-2.1 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7f7a9cc7 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7f7a9cc7 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7f7a9cc7 Branch: refs/heads/trunk Commit: 7f7a9cc754944cd7da19996c9e20377ecf2cfe7d Parents: 0851fd7 2492308 Author: Brandon Williams brandonwilli...@apache.org Authored: Wed Mar 5 11:23:47 2014 -0600 Committer: Brandon Williams brandonwilli...@apache.org Committed: Wed Mar 5 11:23:47 2014 -0600 -- --
[5/6] git commit: Merge branch 'cassandra-2.0' into cassandra-2.1
Merge branch 'cassandra-2.0' into cassandra-2.1 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7f7a9cc7 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7f7a9cc7 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7f7a9cc7 Branch: refs/heads/cassandra-2.1 Commit: 7f7a9cc754944cd7da19996c9e20377ecf2cfe7d Parents: 0851fd7 2492308 Author: Brandon Williams brandonwilli...@apache.org Authored: Wed Mar 5 11:23:47 2014 -0600 Committer: Brandon Williams brandonwilli...@apache.org Committed: Wed Mar 5 11:23:47 2014 -0600 -- --
[1/6] git commit: Add hadoop progressable compatibility. Patch by Ben Coverston, reviewed by brandonwilliams for CASSANDRA-5201
Repository: cassandra
Updated Branches:
refs/heads/cassandra-2.0 ab2717b6f -> 249230834
refs/heads/cassandra-2.1 0851fd74b -> 7f7a9cc75
refs/heads/trunk 60fb92301 -> f601cac02

Add hadoop progressable compatibility.
Patch by Ben Coverston, reviewed by brandonwilliams for CASSANDRA-5201

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/24923083
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/24923083
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/24923083
Branch: refs/heads/cassandra-2.0
Commit: 249230834c2ce1ac169b2b3228d5d222f5ecacc2
Parents: ab2717b
Author: Brandon Williams brandonwilli...@apache.org
Authored: Wed Mar 5 11:21:35 2014 -0600
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Wed Mar 5 11:21:35 2014 -0600
[3/6] git commit: Add hadoop progressable compatibility. Patch by Ben Coverston, reviewed by brandonwilliams for CASSANDRA-5201
Add hadoop progressable compatibility. Patch by Ben Coverston, reviewed by brandonwilliams for CASSANDRA-5201 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/24923083 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/24923083 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/24923083 Branch: refs/heads/trunk Commit: 249230834c2ce1ac169b2b3228d5d222f5ecacc2 Parents: ab2717b Author: Brandon Williams brandonwilli...@apache.org Authored: Wed Mar 5 11:21:35 2014 -0600 Committer: Brandon Williams brandonwilli...@apache.org Committed: Wed Mar 5 11:21:35 2014 -0600
[jira] [Commented] (CASSANDRA-6588) Add a 'NO EMPTY RESULTS' filter to SELECT
[ https://issues.apache.org/jira/browse/CASSANDRA-6588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13921074#comment-13921074 ] Sylvain Lebresne commented on CASSANDRA-6588: - It occurred to me that it's not at all impossible to optimize this all out at the storage engine level (and that's probably the right solution). Let me first quickly sum up the problem we're actually trying to solve here: when you query just one CQL row and only select some of its columns (and *only* in that case), we can't use a NamesQueryFilter underneath, because if we get back no result we're not able to distinguish between "the row exists but has no data for the columns that were selected" and "the row doesn't exist". So instead we currently issue a SliceQueryFilter for the whole CQL row, which can be slower than if we were able to use a NamesQueryFilter because: # NamesQueryFilter uses the CollationController.collectTimeOrderedData() path, which can potentially skip some sstables. # NamesQueryFilter avoids sending the values of the CQL row's unselected columns to the coordinator only to have them ignored later (this doesn't matter so much as far as disk reading is concerned, since we don't really read cells from disk one by one). So anyway, we could add a specialized new RowQueryFilter (which would be the new NamesQueryFilter for CQL3 tables). That filter would use the collectTimeOrderedData() path and would only return the queried columns (+ the row marker), but at the sstable level it would read from the beginning of the CQL row, and as soon as it encounters a live column it would add the row marker to the result; otherwise it would skip any column that is not among the selected ones. In other words, while we can't rely on the row marker being there because of TTLs, it's not too hard when deserializing the sstable to generate a fake one for the purpose of the query, while avoiding any extra work otherwise. 
As a side note, we could actually reuse the same idea for SliceQueryFilter (i.e. have a slice filter that only cares about a subset of the CQL row's columns), which would improve the case for slices (when you only select a subset of the columns, that is). Add a 'NO EMPTY RESULTS' filter to SELECT - Key: CASSANDRA-6588 URL: https://issues.apache.org/jira/browse/CASSANDRA-6588 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Priority: Minor Fix For: 2.1 beta2 It is the semantics of CQL that a (CQL) row exists as long as it has one non-null column (including the PK columns, which, given that no PK column can be null, means that it's enough for the PK to be set for a row to exist). This does mean that the result of {noformat} CREATE TABLE test (k int PRIMARY KEY, v1 int, v2 int); INSERT INTO test(k, v1) VALUES (0, 4); SELECT v2 FROM test; {noformat} must be (and is) {noformat} v2 -- null {noformat} That fact does mean however that when we only select a few columns of a row, we still need to find rows that exist but have no values for the selected columns. Long story short, given how the storage engine works, this means we need to query full (CQL) rows even when only some of the columns are selected, because that's the only way to distinguish between "the row exists but has no value for the selected columns" and "the row doesn't exist". I'll note in particular that, due to CASSANDRA-5762, we unfortunately can't rely on the row marker to optimize that out. Now, when you select only a subset of the columns of a row, there are many cases where you don't care about rows that exist but have no value for the columns you requested, and are happy to filter those out. So, for those cases, we could provide a new SELECT filter. 
Outside the potential convenience (not having to filter empty results client side), one interesting part is that when this filter is provided, we could optimize a bit by only querying the selected columns, since we wouldn't need to return rows that exist but have no values for the selected columns. For the exact syntax, there are probably a bunch of options. For instance: * {{SELECT NON EMPTY(v2, v3) FROM test}}: the vague rationale for putting it in the SELECT part is that such a filter is kind of in the spirit of DISTINCT. Possibly a bit ugly outside of that. * {{SELECT v2, v3 FROM test NO EMPTY RESULTS}} or {{SELECT v2, v3 FROM test NO EMPTY ROWS}} or {{SELECT v2, v3 FROM test NO EMPTY}}: the last one is shorter but maybe a bit less explicit. As for {{RESULTS}} versus {{ROWS}}, the only small objection to {{NO EMPTY ROWS}} could be that it might suggest it is filtering non-existing rows (I mean, the fact we never ever return non existing rows should hint
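The RowQueryFilter idea sketched in the comment above can be illustrated with a small toy model (a Python sketch, not the actual Java storage-engine code; RowQueryFilter is only a proposed name in this ticket). The filter scans the CQL row's cells in order, synthesizes a row marker as soon as any live cell is seen, and returns only the selected columns. Returning an empty dict for "row exists but has no selected data" versus None for "row doesn't exist" captures exactly the distinction the filter must preserve:

```python
def row_query_filter(cells, selected):
    """Toy sketch of the proposed RowQueryFilter: scan a row's cells in
    order, synthesize a row marker on the first live cell, and keep only
    the selected columns. 'cells' is an ordered list of
    (name, value, live) tuples."""
    result = {}
    row_exists = False
    for name, value, live in cells:
        if not live:
            continue                # skip deleted / expired cells
        row_exists = True           # any live cell proves the CQL row exists
        if name in selected:
            result[name] = value    # keep only what the query asked for
    return result if row_exists else None  # None: the row doesn't exist

# Row exists (v1 is live) but has no data for the selected column v2:
print(row_query_filter([("v1", 4, True), ("v2", None, False)], {"v2"}))  # {}
# Row doesn't exist at all (no live cells):
print(row_query_filter([("v2", None, False)], {"v2"}))                   # None
```

The two calls return different sentinels ({} vs None), which is precisely what the NamesQueryFilter path cannot express today.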
[jira] [Commented] (CASSANDRA-6591) un-deprecate cache recentHitRate and expose in o.a.c.metrics
[ https://issues.apache.org/jira/browse/CASSANDRA-6591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13921078#comment-13921078 ] Yuki Morishita commented on CASSANDRA-6591: --- My test is here https://gist.github.com/yukim/9371796 which simulates misses after hits. If I plot this on a graph: !https://docs.google.com/spreadsheet/oimg?key=0AhjS79jizSXtdDJUcnBzdU9tSG9WVG5ia1N3eUx1bncoid=5zx=xnsjrj3s0of0! The blue line is the one proposed in this ticket and the red line is the hit rate's one-minute rate, and I see quite a difference there. un-deprecate cache recentHitRate and expose in o.a.c.metrics Key: CASSANDRA-6591 URL: https://issues.apache.org/jira/browse/CASSANDRA-6591 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Chris Burroughs Assignee: Chris Burroughs Priority: Minor Attachments: j6591-1.2-v1.txt, j6591-1.2-v2.txt, j6591-1.2-v3.txt recentHitRate metrics were not added as part of CASSANDRA-4009 because there is no obvious way to do it with the Metrics library. Instead, hitRate was added as an all-time measurement since node restart. This does not allow changes in cache hit rate (aka production performance problems) to be detected. Ideally there would be 1/5/15 moving averages for the hit rate, but I'm not sure how to calculate that. Instead I propose updating recentHitRate on a fixed interval and exposing that as a Gauge. -- This message was sent by Atlassian JIRA (v6.2#6252)
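The divergence Yuki plots can be reproduced with a toy model (a Python sketch, not Cassandra's actual Metrics code; class and field names are illustrative). After a long run of hits, an all-time hit rate barely moves when misses start, while a fixed-window "recent" rate, the behaviour proposed in this ticket, exposes the regression immediately:

```python
from collections import deque

class CacheStats:
    """Tracks an all-time hit rate plus a windowed 'recent' hit rate,
    loosely mirroring the fixed-interval gauge proposed in CASSANDRA-6591."""
    def __init__(self, window=60):
        self.hits = self.requests = 0
        self.window = deque(maxlen=window)   # 1 = hit, 0 = miss
    def record(self, hit):
        self.hits += 1 if hit else 0
        self.requests += 1
        self.window.append(1 if hit else 0)
    def all_time_rate(self):
        return self.hits / self.requests if self.requests else 0.0
    def recent_rate(self):
        return sum(self.window) / len(self.window) if self.window else 0.0

stats = CacheStats(window=60)
for _ in range(1000):   # long run of hits...
    stats.record(True)
for _ in range(60):     # ...followed by sustained misses
    stats.record(False)
print(round(stats.all_time_rate(), 3))  # still high: 0.943
print(stats.recent_rate())              # 0.0, the regression is visible
```

The all-time rate (1000/1060) still reads as healthy, which is exactly why a restart-to-now hit rate hides production problems.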
[jira] [Commented] (CASSANDRA-6689) Partially Off Heap Memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-6689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13921110#comment-13921110 ] Pavel Yaskevich commented on CASSANDRA-6689: bq. I've stated clearly what this introduces as a benefit: overwrite workloads no longer cause excessive flushes If you do a copy of the memtable buffer beforehand, you can clearly put it back to the allocator once it's overwritten or becomes otherwise useless, in the process of merging columns with previous row contents. bq. Your next sentence states how this is a large cause of memory consumption, so surely we should be using that memory if possible for other uses (returning it to the buffer cache, or using it internally for more caching)? It doesn't state that it is a *large cause of memory consumption*; it states that it has an additional cost, but in the steady state it won't be allocating over the limit because of the properties of the system that we have, namely the fixed number of threads. bq. Are you performing a full object tree copy, and doing this with a running system to see how it affects the performance of other system components? If not, it doesn't seem to be a useful comparison. Note that this will still create a tremendous amount of heap churn, as most of the memory used by objects right now is on-heap. So copying the records is almost certainly no better for young gen pressure than what we currently do - in fact, it probably makes the situation worse. Do you mean this? Let's say we copy a Cell (or Column object), which is 1 level deep, so we just allocate additional space for the object headers and do a copy; most of the work would be spent doing a copy of the data (name/value) anyway. So, as we want to live inside of ParNew, see how many such allocations you will be able to do in e.g. 1 second, then wipe the whole thing and do it again. We are doing mlockall too, which should make that even faster, as we are sure that the heap is pre-faulted already. bq. 
It may not be causing the young gen pressure you're seeing, but it certainly offers some benefit here by keeping more rows in memory so recent queries are more likely to be answered with zero allocation, so reducing young gen pressure; it is also a foundation for improving the row cache and introducing a shared page cache which could bring us closer to zero allocation reads. _And so on_ I'm not sure how this would help in the case of the row cache: once a reference is added to the row cache, it means the memtable would hang in there until that row is purged. So if there is a long-lived row (write once, read multiple times) in each of the regions (and we reclaim based on regions), would that keep the memtable around longer than expected? bq. It's also not clear to me how you would be managing the reclaim of the off-heap allocations without OpOrder, or do you mean to only use off-heap buffers for readers, or to ref-count any memory as you're reading it? Not using off-heap memory for the memtables would negate the main original point of this ticket: to support larger memtables, thus reducing write amplification. Ref-counting incurs overhead linear to the size of the result set, much like copying, and is also fiddly to get right (not convinced it's cleaner or neater), whereas OpOrder incurs overhead proportional to the number of times you reclaim. So if you're using OpOrder, all you're really talking about is a new RefAction: copyToAllocator() or something. So it doesn't notably reduce complexity, it just reduces the quality of the end result. In terms of memory usage, copying adds an additional linear cost, yes, but at the same time it makes the system behavior more controllable/predictable, which is what ops usually care about; even on the artificial stress test, there seems to be a slowdown once the off-heap feature is enabled, which is no surprise once you look at how much complexity it actually adds. bq. Also, I'd love to see some evidence for this (particularly the latter). 
I'm not disputing it, just would like to see what caused you to reach these conclusions. These definitely warrant separate tickets IMO, but if you have evidence for it, it would help direct any work. Well, it seems like you have never operated a real Cassandra cluster, have you? All of the problems that I have listed here are well known; you can even simulate this with docker VMs by making the internal network gradually slower. There is no back-pressure mechanism built in, so right now Cassandra would accept a bunch of operations at normal speed (if the outgoing link is physically different from the internal one) but would suddenly just stop accepting anything and fail internally because of a GC storm caused by all of the internode buffers hanging around. Partially Off Heap Memtables Key: CASSANDRA-6689 URL:
[jira] [Created] (CASSANDRA-6802) Row cache improvements
Marcus Eriksson created CASSANDRA-6802: -- Summary: Row cache improvements Key: CASSANDRA-6802 URL: https://issues.apache.org/jira/browse/CASSANDRA-6802 Project: Cassandra Issue Type: Improvement Reporter: Marcus Eriksson Fix For: 3.0 There are a few things we could do: * Start using the native memory constructs from CASSANDRA-6694 to avoid serialization/deserialization costs and to minimize the on-heap overhead. * Stop invalidating cached rows on writes (update on write instead). -- This message was sent by Atlassian JIRA (v6.2#6252)
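The second bullet can be sketched with a minimal toy cache (illustrative Python, not Cassandra code; names are invented). Instead of dropping a cached row on write, the written columns are merged into the cached copy so a subsequent read needs no disk round trip:

```python
class RowCache:
    """Toy row cache contrasting invalidate-on-write with the
    update-on-write behaviour proposed in CASSANDRA-6802."""
    def __init__(self, update_on_write=True):
        self.rows = {}                  # key -> {column: value}
        self.update_on_write = update_on_write
    def read(self, key, load_from_disk):
        if key not in self.rows:
            self.rows[key] = dict(load_from_disk(key))  # miss: populate
        return self.rows[key]
    def write(self, key, columns):
        if key in self.rows:
            if self.update_on_write:
                self.rows[key].update(columns)  # merge in place, stay cached
            else:
                del self.rows[key]              # current behaviour: invalidate
        # the write itself goes to the memtable/commitlog, not modelled here

disk = {"k1": {"a": 1, "b": 2}}
cache = RowCache(update_on_write=True)
cache.read("k1", disk.get)
cache.write("k1", {"b": 99})
print(cache.rows["k1"])   # {'a': 1, 'b': 99}: still cached, no reload needed

legacy = RowCache(update_on_write=False)
legacy.read("k1", disk.get)
legacy.write("k1", {"b": 99})
print("k1" in legacy.rows)  # False: the next read must hit disk again
```

The trade-off, not modelled here, is that the merge must respect timestamps and tombstones exactly as a normal read-merge would.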
[jira] [Commented] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination
[ https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13921141#comment-13921141 ] Alex Liu commented on CASSANDRA-6311: - 1. Validating the input CQL query would require parsing the query, which is what we are trying to avoid. 2. The AbstractIterator is there to always return the local host (so that the task only reads data from the local host); it doesn't return endOfData(). It uses stickHost, a host name, to get the Host object, which can't be created directly because the class is not public. The Host object, origHost, is obtained from cluster internal code. It's possible that the origHost object is null, in which case the stickHost is not in the cluster. In that case we don't want the job to run, because it's on the wrong host. 3. I cleaned up the code according to the other notes. Attached v6 version. Add CqlRecordReader to take advantage of native CQL pagination -- Key: CASSANDRA-6311 URL: https://issues.apache.org/jira/browse/CASSANDRA-6311 Project: Cassandra Issue Type: New Feature Components: Hadoop Reporter: Alex Liu Assignee: Alex Liu Fix For: 2.0.6 Attachments: 6311-v3-2.0-branch.txt, 6311-v4.txt, 6311-v5-2.0-branch.txt, 6331-2.0-branch.txt, 6331-v2-2.0-branch.txt Since the latest CQL pagination is done and should be more efficient, we need to update CqlPagingRecordReader to use it instead of the custom thrift paging. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination
[ https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Liu updated CASSANDRA-6311: Attachment: 6311-v6-2.0-branch.txt Add CqlRecordReader to take advantage of native CQL pagination -- Key: CASSANDRA-6311 URL: https://issues.apache.org/jira/browse/CASSANDRA-6311 Project: Cassandra Issue Type: New Feature Components: Hadoop Reporter: Alex Liu Assignee: Alex Liu Fix For: 2.0.6 Attachments: 6311-v3-2.0-branch.txt, 6311-v4.txt, 6311-v5-2.0-branch.txt, 6311-v6-2.0-branch.txt, 6331-2.0-branch.txt, 6331-v2-2.0-branch.txt Since the latest CQL pagination is done and should be more efficient, we need to update CqlPagingRecordReader to use it instead of the custom thrift paging. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-6283) Windows 7 data files keept open / can't be deleted after compaction.
[ https://issues.apache.org/jira/browse/CASSANDRA-6283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Schnitzerling updated CASSANDRA-6283: - Attachment: neighbor-log.zip root-log.zip Logs during nodetool repair -par events. C* 2.0.5-rel with LEAK-log and finalizer-patch under Win-7. Windows 7 data files keept open / can't be deleted after compaction. Key: CASSANDRA-6283 URL: https://issues.apache.org/jira/browse/CASSANDRA-6283 Project: Cassandra Issue Type: Bug Components: Core Environment: Windows 7 (32) / Java 1.7.0.45 Reporter: Andreas Schnitzerling Assignee: Joshua McKenzie Labels: compaction Fix For: 2.0.6 Attachments: 6283_StreamWriter_patch.txt, leakdetect.patch, neighbor-log.zip, root-log.zip, screenshot-1.jpg, system.log Files cannot be deleted; patch CASSANDRA-5383 (Win7 deleting problem) doesn't help on Win-7 on Cassandra 2.0.2. Even the 2.1 snapshot is not running. The cause: opened file handles seem to be lost and not closed properly. Win 7 claims that another process is still using the file (but it's obviously Cassandra). Only a restart of the server lets the files be deleted. But after heavy use (changes) of tables, there are about 24K files in the data folder (instead of 35 after every restart) and Cassandra crashes. I experimented and found out that a finalizer fixes the problem. So after GC the files will be deleted (not optimal, but working fine). It has now run for 2 days continuously without problems. Possible fix/test: I wrote the following finalizer at the end of class org.apache.cassandra.io.util.RandomAccessReader: {code:title=RandomAccessReader.java|borderStyle=solid} @Override protected void finalize() throws Throwable { deallocate(); super.finalize(); } {code} Can somebody test / develop / patch it? Thx. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (CASSANDRA-6283) Windows 7 data files keept open / can't be deleted after compaction.
[ https://issues.apache.org/jira/browse/CASSANDRA-6283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13921180#comment-13921180 ] Andreas Schnitzerling edited comment on CASSANDRA-6283 at 3/5/14 6:40 PM: -- I attached logs (root-log.zip and neighbor-log.zip) during nodetool repair -par events. C* 2.0.5-rel with LEAK-log and finalizer-patch under Win-7. was (Author: andie78): Logs during nodetool repair -par events. C* 2.0.5-rel with LEAK-log and finalizer-patch under Win-7. Windows 7 data files keept open / can't be deleted after compaction. Key: CASSANDRA-6283 URL: https://issues.apache.org/jira/browse/CASSANDRA-6283 Project: Cassandra Issue Type: Bug Components: Core Environment: Windows 7 (32) / Java 1.7.0.45 Reporter: Andreas Schnitzerling Assignee: Joshua McKenzie Labels: compaction Fix For: 2.0.6 Attachments: 6283_StreamWriter_patch.txt, leakdetect.patch, neighbor-log.zip, root-log.zip, screenshot-1.jpg, system.log Files cannot be deleted; patch CASSANDRA-5383 (Win7 deleting problem) doesn't help on Win-7 on Cassandra 2.0.2. Even the 2.1 snapshot is not running. The cause: opened file handles seem to be lost and not closed properly. Win 7 claims that another process is still using the file (but it's obviously Cassandra). Only a restart of the server lets the files be deleted. But after heavy use (changes) of tables, there are about 24K files in the data folder (instead of 35 after every restart) and Cassandra crashes. I experimented and found out that a finalizer fixes the problem. So after GC the files will be deleted (not optimal, but working fine). It has now run for 2 days continuously without problems. Possible fix/test: I wrote the following finalizer at the end of class org.apache.cassandra.io.util.RandomAccessReader: {code:title=RandomAccessReader.java|borderStyle=solid} @Override protected void finalize() throws Throwable { deallocate(); super.finalize(); } {code} Can somebody test / develop / patch it? Thx. 
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (CASSANDRA-6689) Partially Off Heap Memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-6689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13921110#comment-13921110 ] Pavel Yaskevich edited comment on CASSANDRA-6689 at 3/5/14 6:55 PM: bq. I've stated clearly what this introduces as a benefit: overwrite workloads no longer cause excessive flushes If you do a copy of the memtable buffer beforehand, you can clearly put it back to the allocator once it's overwritten or becomes otherwise useless, in the process of merging columns with previous row contents. bq. Your next sentence states how this is a large cause of memory consumption, so surely we should be using that memory if possible for other uses (returning it to the buffer cache, or using it internally for more caching)? It doesn't state that it is a *large cause of memory consumption*; it states that it has an additional cost, but in the steady state it won't be allocating over the limit because of the properties of the system that we have, namely the fixed number of threads. bq. Are you performing a full object tree copy, and doing this with a running system to see how it affects the performance of other system components? If not, it doesn't seem to be a useful comparison. Note that this will still create a tremendous amount of heap churn, as most of the memory used by objects right now is on-heap. So copying the records is almost certainly no better for young gen pressure than what we currently do - in fact, it probably makes the situation worse. Do you mean this? Let's say we copy a Cell (or Column object), which is 1 level deep, so we just allocate additional space for the object headers and do a copy; most of the work would be spent doing a copy of the data (name/value) anyway. So, as we want to live inside of ParNew (so we can just discard already dead objects), see how many such allocations you will be able to do in e.g. 
1 second, then wipe the whole thing (the equivalent of a ParNew, which rejects the dead + compacts) and do it again. We are doing mlockall too, which should make that even faster, as we are sure that the heap is pre-faulted already. bq. It may not be causing the young gen pressure you're seeing, but it certainly offers some benefit here by keeping more rows in memory so recent queries are more likely to be answered with zero allocation, so reducing young gen pressure; it is also a foundation for improving the row cache and introducing a shared page cache which could bring us closer to zero allocation reads. _And so on_ I'm not sure how this would help in the case of the row cache: once a reference is added to the row cache, it means the memtable would hang in there until that row is purged. So if there is a long-lived row (write once, read multiple times) in each of the regions (and we reclaim based on regions), would that keep the memtable around longer than expected? bq. It's also not clear to me how you would be managing the reclaim of the off-heap allocations without OpOrder, or do you mean to only use off-heap buffers for readers, or to ref-count any memory as you're reading it? Not using off-heap memory for the memtables would negate the main original point of this ticket: to support larger memtables, thus reducing write amplification. Ref-counting incurs overhead linear to the size of the result set, much like copying, and is also fiddly to get right (not convinced it's cleaner or neater), whereas OpOrder incurs overhead proportional to the number of times you reclaim. So if you're using OpOrder, all you're really talking about is a new RefAction: copyToAllocator() or something. So it doesn't notably reduce complexity, it just reduces the quality of the end result. 
In terms of memory usage, copying adds an additional linear cost, yes, but at the same time it makes the system behavior more controllable/predictable, which is what ops usually care about; even with the artificial stress test, there seems to be a slowdown once the off-heap feature is enabled, which is no surprise once you look at how much complexity it actually adds. bq. Also, I'd love to see some evidence for this (particularly the latter). I'm not disputing it, just would like to see what caused you to reach these conclusions. These definitely warrant separate tickets IMO, but if you have evidence for it, it would help direct any work. Well, it seems like you have never operated a real Cassandra cluster, have you? All of the problems that I have listed here are well known; you can even simulate this with docker VMs by making the internal network gradually slower. There is *no* back-pressure mechanism built in, so right now Cassandra would accept a bunch of operations at normal speed (if the outgoing link is physically different from the internal one, which should always be the case) but would suddenly just stop accepting anything and fail internally because of GC storm
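The ref-counting alternative debated in this thread can be sketched in the abstract (illustrative Python; the mechanism actually proposed for Cassandra is OpOrder, and these class names are invented). Each reader pins a buffer while using it, and the memory returns to the allocator only when the last reference drops, which is the per-result bookkeeping the comment describes as linear in the size of the result set:

```python
class Allocator:
    """Trivial stand-in for an off-heap allocator: just counts freed bytes."""
    def __init__(self):
        self.freed = 0
    def free(self, size):
        self.freed += size

class RefCountedBuffer:
    """Minimal ref-counting sketch: readers pin the buffer while using it;
    the memory is returned to the allocator only on the last release."""
    def __init__(self, allocator, size):
        self.allocator, self.size = allocator, size
        self.refs = 1                        # the memtable itself holds one ref
    def retain(self):
        assert self.refs > 0, "buffer already freed"
        self.refs += 1
        return self
    def release(self):
        self.refs -= 1
        if self.refs == 0:
            self.allocator.free(self.size)   # safe: no reader can still see it

alloc = Allocator()
buf = RefCountedBuffer(alloc, 1024)
reader = buf.retain()      # a read pins the buffer...
buf.release()              # ...so the memtable dropping it frees nothing yet
print(alloc.freed)         # 0
reader.release()           # last reference gone: memory is reclaimed
print(alloc.freed)         # 1024
```

OpOrder avoids this per-reference cost by grouping operations into epochs and reclaiming once no operation from an older epoch is still in flight.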
[jira] [Updated] (CASSANDRA-5201) Cassandra/Hadoop does not support current Hadoop releases
[ https://issues.apache.org/jira/browse/CASSANDRA-5201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Coverston updated CASSANDRA-5201: -- Attachment: hadoop-compat-2.1-merge.patch Patch for 2.1 branch Cassandra/Hadoop does not support current Hadoop releases - Key: CASSANDRA-5201 URL: https://issues.apache.org/jira/browse/CASSANDRA-5201 Project: Cassandra Issue Type: Bug Components: Hadoop Affects Versions: 1.2.0 Reporter: Brian Jeltema Assignee: Benjamin Coverston Fix For: 2.0.6 Attachments: 5201_a.txt, hadoop-compat-2.1-merge.patch, hadoopCompat.patch, hadoopcompat-trunk.patch, progressable-fix.patch, progressable-wrapper.patch Using Hadoop 0.22.0 with Cassandra results in the stack trace below. It appears that version 0.21+ changed org.apache.hadoop.mapreduce.JobContext from a class to an interface. Exception in thread main java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSplits(ColumnFamilyInputFormat.java:103) at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:445) at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:462) at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:357) at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1045) at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1042) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1153) at org.apache.hadoop.mapreduce.Job.submit(Job.java:1042) at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1062) at MyHadoopApp.run(MyHadoopApp.java:163) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:69) at MyHadoopApp.main(MyHadoopApp.java:82) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at org.apache.hadoop.util.RunJar.main(RunJar.java:192) -- This message was sent by Atlassian JIRA (v6.2#6252)
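The underlying failure is a linkage problem: code compiled against Hadoop 0.20's JobContext class cannot link against 0.21+'s JobContext interface, hence IncompatibleClassChangeError. The HadoopCompat approach resolves such calls reflectively at runtime instead of binding at compile time. A loose Python analogue (illustrative only; the real shim does this in Java with java.lang.reflect and caches Method objects):

```python
_method_cache = {}

def get_configuration(context):
    """Resolve getConfiguration() on the concrete runtime type instead of
    binding to a compile-time class, caching the lookup per type. This is
    the shape of the HadoopCompat trick; the Java version does the same
    with Class.getMethod(...) and Method.invoke(...)."""
    cls = type(context)
    if cls not in _method_cache:
        _method_cache[cls] = getattr(cls, "getConfiguration")  # raises if absent
    return _method_cache[cls](context)

class OldJobContext:              # stand-in for the pre-0.21 concrete class
    def getConfiguration(self):
        return {"mapred.job.name": "example"}

class NewJobContext:              # stand-in for the 0.21+ interface flavour
    def getConfiguration(self):
        return {"mapreduce.job.name": "example"}

# The same caller works against either API shape:
print(get_configuration(OldJobContext()))
print(get_configuration(NewJobContext()))
```

Because the lookup happens per runtime type rather than per compile-time signature, the class-versus-interface change in JobContext never reaches the linker.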
[1/3] git commit: Add hadoop progressable compatibility. Patch by Ben Coverston, reviewed by brandonwilliams for CASSANDRA-5201
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.1 7f7a9cc75 -> 4cf8a8a6c
  refs/heads/trunk f601cac02 -> 7c7193769

Add hadoop progressable compatibility.
Patch by Ben Coverston, reviewed by brandonwilliams for CASSANDRA-5201

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/4cf8a8a6
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/4cf8a8a6
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/4cf8a8a6

Branch: refs/heads/cassandra-2.1
Commit: 4cf8a8a6c356889609f9ffb74d548a68e52ec506
Parents: 7f7a9cc
Author: Brandon Williams <brandonwilli...@apache.org>
Authored: Wed Mar 5 12:54:42 2014 -0600
Committer: Brandon Williams <brandonwilli...@apache.org>
Committed: Wed Mar 5 12:54:42 2014 -0600
----------------------------------------------------------------------
 build.xml                                       |   3 -
 .../hadoop/AbstractColumnFamilyInputFormat.java |   1 -
 .../AbstractColumnFamilyOutputFormat.java       |   1 -
 .../AbstractColumnFamilyRecordWriter.java       |   2 +
 .../cassandra/hadoop/BulkOutputFormat.java      |   3 +-
 .../cassandra/hadoop/BulkRecordWriter.java      |  16 +-
 .../hadoop/ColumnFamilyInputFormat.java         |   1 -
 .../hadoop/ColumnFamilyOutputFormat.java        |   2 +-
 .../hadoop/ColumnFamilyRecordReader.java        |   1 -
 .../hadoop/ColumnFamilyRecordWriter.java        |  15 +-
 .../apache/cassandra/hadoop/HadoopCompat.java   | 309 +++
 .../apache/cassandra/hadoop/Progressable.java   |  50 ---
 .../cassandra/hadoop/cql3/CqlOutputFormat.java  |   3 +-
 .../hadoop/cql3/CqlPagingInputFormat.java       |   2 +-
 .../hadoop/cql3/CqlPagingRecordReader.java      |   2 +-
 .../cassandra/hadoop/cql3/CqlRecordWriter.java  |  12 +-
 .../cassandra/hadoop/pig/CassandraStorage.java  |   2 +-
 .../apache/cassandra/hadoop/pig/CqlStorage.java |   1 -
 18 files changed, 345 insertions(+), 81 deletions(-)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/4cf8a8a6/build.xml
----------------------------------------------------------------------
diff --git a/build.xml b/build.xml
index bb8673e..304b5fe 100644
--- a/build.xml
+++ b/build.xml
@@ -374,7 +374,6 @@
             <exclusion groupId="org.mortbay.jetty" artifactId="servlet-api"/>
           </dependency>
           <dependency groupId="org.apache.hadoop" artifactId="hadoop-minicluster" version="1.0.3"/>
-          <dependency groupId="com.twitter.elephantbird" artifactId="elephant-bird-hadoop-compat" version="4.3"/>
           <dependency groupId="org.apache.pig" artifactId="pig" version="0.11.1"/>
           <dependency groupId="net.java.dev.jna" artifactId="jna" version="4.0.0"/>
@@ -418,7 +417,6 @@
         <dependency groupId="org.apache.rat" artifactId="apache-rat"/>
         <dependency groupId="org.apache.hadoop" artifactId="hadoop-core"/>
         <dependency groupId="org.apache.hadoop" artifactId="hadoop-minicluster"/>
-        <dependency groupId="com.twitter.elephantbird" artifactId="elephant-bird-hadoop-compat"/>
         <dependency groupId="org.apache.pig" artifactId="pig"/>
         <dependency groupId="com.google.code.findbugs" artifactId="jsr305"/>
       </artifact:pom>
@@ -485,7 +483,6 @@
       <!-- don't need hadoop classes to run, but if you use the hadoop stuff -->
       <dependency groupId="org.apache.hadoop" artifactId="hadoop-core" optional="true"/>
       <dependency groupId="org.apache.hadoop" artifactId="hadoop-minicluster" optional="true"/>
-      <dependency groupId="com.twitter.elephantbird" artifactId="elephant-bird-hadoop-compat" optional="true"/>
       <dependency groupId="org.apache.pig" artifactId="pig" optional="true"/>
       <!-- don't need jna to run, but nice to have -->

http://git-wip-us.apache.org/repos/asf/cassandra/blob/4cf8a8a6/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyInputFormat.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyInputFormat.java b/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyInputFormat.java
index 760193f..cb106e9 100644
--- a/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyInputFormat.java
+++ b/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyInputFormat.java
@@ -29,7 +29,6 @@ import java.util.concurrent.TimeUnit;
 import com.google.common.collect.ImmutableList;
 import com.google.common.collect.Lists;
-import com.twitter.elephantbird.util.HadoopCompat;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;

http://git-wip-us.apache.org/repos/asf/cassandra/blob/4cf8a8a6/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyOutputFormat.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyOutputFormat.java b/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyOutputFormat.java
index
[3/3] git commit: Merge branch 'cassandra-2.1' into trunk
Merge branch 'cassandra-2.1' into trunk

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7c719376
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7c719376
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7c719376

Branch: refs/heads/trunk
Commit: 7c7193769c5b85b8bcb38cf3f5afb3e3be0e1016
Parents: f601cac 4cf8a8a
Author: Brandon Williams <brandonwilli...@apache.org>
Authored: Wed Mar 5 12:55:30 2014 -0600
Committer: Brandon Williams <brandonwilli...@apache.org>
Committed: Wed Mar 5 12:55:30 2014 -0600
----------------------------------------------------------------------
 build.xml                                       |   3 -
 .../hadoop/AbstractColumnFamilyInputFormat.java |   1 -
 .../AbstractColumnFamilyOutputFormat.java       |   1 -
 .../AbstractColumnFamilyRecordWriter.java       |   2 +
 .../cassandra/hadoop/BulkOutputFormat.java      |   3 +-
 .../cassandra/hadoop/BulkRecordWriter.java      |  16 +-
 .../hadoop/ColumnFamilyInputFormat.java         |   1 -
 .../hadoop/ColumnFamilyOutputFormat.java        |   2 +-
 .../hadoop/ColumnFamilyRecordReader.java        |   1 -
 .../hadoop/ColumnFamilyRecordWriter.java        |  15 +-
 .../apache/cassandra/hadoop/HadoopCompat.java   | 309 +++
 .../apache/cassandra/hadoop/Progressable.java   |  50 ---
 .../cassandra/hadoop/cql3/CqlOutputFormat.java  |   3 +-
 .../hadoop/cql3/CqlPagingInputFormat.java       |   2 +-
 .../hadoop/cql3/CqlPagingRecordReader.java      |   2 +-
 .../cassandra/hadoop/cql3/CqlRecordWriter.java  |  12 +-
 .../cassandra/hadoop/pig/CassandraStorage.java  |   2 +-
 .../apache/cassandra/hadoop/pig/CqlStorage.java |   1 -
 18 files changed, 345 insertions(+), 81 deletions(-)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/7c719376/build.xml
----------------------------------------------------------------------
[2/3] git commit: Add hadoop progressable compatibility. Patch by Ben Coverston, reviewed by brandonwilliams for CASSANDRA-5201
Add hadoop progressable compatibility.
Patch by Ben Coverston, reviewed by brandonwilliams for CASSANDRA-5201

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/4cf8a8a6
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/4cf8a8a6
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/4cf8a8a6

Branch: refs/heads/trunk
Commit: 4cf8a8a6c356889609f9ffb74d548a68e52ec506
Parents: 7f7a9cc
Author: Brandon Williams <brandonwilli...@apache.org>
Authored: Wed Mar 5 12:54:42 2014 -0600
Committer: Brandon Williams <brandonwilli...@apache.org>
Committed: Wed Mar 5 12:54:42 2014 -0600
----------------------------------------------------------------------
 build.xml                                       |   3 -
 .../hadoop/AbstractColumnFamilyInputFormat.java |   1 -
 .../AbstractColumnFamilyOutputFormat.java       |   1 -
 .../AbstractColumnFamilyRecordWriter.java       |   2 +
 .../cassandra/hadoop/BulkOutputFormat.java      |   3 +-
 .../cassandra/hadoop/BulkRecordWriter.java      |  16 +-
 .../hadoop/ColumnFamilyInputFormat.java         |   1 -
 .../hadoop/ColumnFamilyOutputFormat.java        |   2 +-
 .../hadoop/ColumnFamilyRecordReader.java        |   1 -
 .../hadoop/ColumnFamilyRecordWriter.java        |  15 +-
 .../apache/cassandra/hadoop/HadoopCompat.java   | 309 +++
 .../apache/cassandra/hadoop/Progressable.java   |  50 ---
 .../cassandra/hadoop/cql3/CqlOutputFormat.java  |   3 +-
 .../hadoop/cql3/CqlPagingInputFormat.java       |   2 +-
 .../hadoop/cql3/CqlPagingRecordReader.java      |   2 +-
 .../cassandra/hadoop/cql3/CqlRecordWriter.java  |  12 +-
 .../cassandra/hadoop/pig/CassandraStorage.java  |   2 +-
 .../apache/cassandra/hadoop/pig/CqlStorage.java |   1 -
 18 files changed, 345 insertions(+), 81 deletions(-)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/4cf8a8a6/build.xml
----------------------------------------------------------------------
diff --git a/build.xml b/build.xml
index bb8673e..304b5fe 100644
--- a/build.xml
+++ b/build.xml
@@ -374,7 +374,6 @@
             <exclusion groupId="org.mortbay.jetty" artifactId="servlet-api"/>
           </dependency>
           <dependency groupId="org.apache.hadoop" artifactId="hadoop-minicluster" version="1.0.3"/>
-          <dependency groupId="com.twitter.elephantbird" artifactId="elephant-bird-hadoop-compat" version="4.3"/>
           <dependency groupId="org.apache.pig" artifactId="pig" version="0.11.1"/>
           <dependency groupId="net.java.dev.jna" artifactId="jna" version="4.0.0"/>
@@ -418,7 +417,6 @@
         <dependency groupId="org.apache.rat" artifactId="apache-rat"/>
         <dependency groupId="org.apache.hadoop" artifactId="hadoop-core"/>
         <dependency groupId="org.apache.hadoop" artifactId="hadoop-minicluster"/>
-        <dependency groupId="com.twitter.elephantbird" artifactId="elephant-bird-hadoop-compat"/>
         <dependency groupId="org.apache.pig" artifactId="pig"/>
         <dependency groupId="com.google.code.findbugs" artifactId="jsr305"/>
       </artifact:pom>
@@ -485,7 +483,6 @@
       <!-- don't need hadoop classes to run, but if you use the hadoop stuff -->
       <dependency groupId="org.apache.hadoop" artifactId="hadoop-core" optional="true"/>
       <dependency groupId="org.apache.hadoop" artifactId="hadoop-minicluster" optional="true"/>
-      <dependency groupId="com.twitter.elephantbird" artifactId="elephant-bird-hadoop-compat" optional="true"/>
       <dependency groupId="org.apache.pig" artifactId="pig" optional="true"/>
       <!-- don't need jna to run, but nice to have -->

http://git-wip-us.apache.org/repos/asf/cassandra/blob/4cf8a8a6/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyInputFormat.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyInputFormat.java b/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyInputFormat.java
index 760193f..cb106e9 100644
--- a/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyInputFormat.java
+++ b/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyInputFormat.java
@@ -29,7 +29,6 @@ import java.util.concurrent.TimeUnit;
 import com.google.common.collect.ImmutableList;
 import com.google.common.collect.Lists;
-import com.twitter.elephantbird.util.HadoopCompat;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;

http://git-wip-us.apache.org/repos/asf/cassandra/blob/4cf8a8a6/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyOutputFormat.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyOutputFormat.java b/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyOutputFormat.java
index a3c4234..3041829 100644
--- a/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyOutputFormat.java
+++
[jira] [Updated] (CASSANDRA-6778) FBUtilities.singleton() should use the CF comparator
[ https://issues.apache.org/jira/browse/CASSANDRA-6778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tyler Hobbs updated CASSANDRA-6778:
-----------------------------------
    Attachment: 6778-test.txt

+1. The attached 6778-test.txt adds a unit test to exercise this patch. As for 1.2, I agree that this is a pretty rare corner case, so it's probably safer to only apply this patch to 2.0. By the way, it looks like you already did most of this work on 2.1 as part of CASSANDRA-5417, so make sure the conflicts get resolved properly.

FBUtilities.singleton() should use the CF comparator
----------------------------------------------------
                Key: CASSANDRA-6778
                URL: https://issues.apache.org/jira/browse/CASSANDRA-6778
            Project: Cassandra
         Issue Type: Bug
           Reporter: Sylvain Lebresne
           Assignee: Sylvain Lebresne
            Fix For: 2.0.6
        Attachments: 0001-Proper-comparison-for-singleton-sorted-set.txt, 0002-Use-comparator-instead-of-BB.equals.txt, 6778-test.txt

We sometimes use FBUtilities.singleton() to create a SortedSet for NamesQueryFilter. However, the set created by that method does not use the CF comparator, so it falls back to ByteBuffer comparison/equality for methods like contains(). That is not OK if the comparator is such that two column names can be equal without their binary representations being equal, and as it turns out at least IntegerType and DecimalType have that property (because they let you put arbitrarily many zeros in front of the binary encoding). BooleanType should also have that property, though in practice it doesn't, which I think is a bug, but that's for another ticket.

I'll note that CASSANDRA-6733 contains an example where this matters. In practice, only a SELECT on a compact table that selects just one column can run into this, and only if the client inserts useless leading zeros in its IntegerType/DecimalType binary representation, which ought to be uncommon in the first place. It's still wrong and should be fixed. Patch attached to include the comparator in FBUtilities.singleton().

I also found two other small places where we were using ByteBuffer.equals() where the comparator should be used instead, and am attaching a second patch for those.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
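The corner case behind CASSANDRA-6778 is easy to reproduce outside Cassandra. The sketch below is illustrative only (not Cassandra's actual code): the comparator is a stand-in for IntegerType, which decodes the bytes before comparing, so two ByteBuffers that differ only by leading zero bytes are logically equal.

```java
import java.math.BigInteger;
import java.nio.ByteBuffer;
import java.util.Comparator;
import java.util.SortedSet;
import java.util.TreeSet;

// Why a singleton SortedSet must carry the column comparator: IntegerType
// allows leading zero bytes, so two distinct ByteBuffers can encode the
// same value. (Illustrative stand-in, not Cassandra's implementation.)
public class SingletonComparatorDemo {
    // Stand-in for IntegerType's comparator: compares the decoded integers.
    static final Comparator<ByteBuffer> INTEGER_TYPE =
        Comparator.comparing(bb -> new BigInteger(toArray(bb)));

    static byte[] toArray(ByteBuffer bb) {
        byte[] a = new byte[bb.remaining()];
        bb.duplicate().get(a);
        return a;
    }

    public static void main(String[] args) {
        ByteBuffer plain = ByteBuffer.wrap(new byte[] { 0x04 });        // encodes 4
        ByteBuffer padded = ByteBuffer.wrap(new byte[] { 0x00, 0x04 }); // also 4

        // Without the comparator: raw ByteBuffer ordering, padded key missed.
        SortedSet<ByteBuffer> withoutComparator = new TreeSet<>();
        withoutComparator.add(plain);
        System.out.println(withoutComparator.contains(padded)); // false

        // With the type's comparator: both encodings compare equal.
        SortedSet<ByteBuffer> withComparator = new TreeSet<>(INTEGER_TYPE);
        withComparator.add(plain);
        System.out.println(withComparator.contains(padded)); // true
    }
}
```

This is exactly the difference between a set built with the CF comparator and one built with plain ByteBuffer equality: the same query key is found in one and missed in the other.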
[jira] [Comment Edited] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination
[ https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13921141#comment-13921141 ]

Alex Liu edited comment on CASSANDRA-6311 at 3/5/14 7:08 PM:
-------------------------------------------------------------

1. Validating the input CQL query requires parsing the query, which is what we are trying to avoid.
2. The AbstractIterator always returns the local host (so that the task only reads data from the local host). It uses stickHost, a host name, to look up the Host object, which can't be created directly because the class is not public. liveRemoteHosts is added so that if the local host is down, a remote node is used instead.
3. I cleaned up the code according to the other notes.

Attaching the v6 version.

was (Author: alexliu68):
1. Validating the input CQL query requires parsing the query, which is what we are trying to avoid.
2. The AbstractIterator always returns the local host (so that the task only reads data from the local host); it doesn't return endOfData(). It uses stickHost, a host name, to look up the Host object, which can't be created directly because the class is not public. The Host object, origHost, is obtained from cluster-internal code. It's possible for origHost to be null, in which case stickHost is not in the cluster; we don't want the job to run then, since it's on the wrong host.
3. I cleaned up the code according to the other notes.

Attaching the v6 version.

Add CqlRecordReader to take advantage of native CQL pagination
--------------------------------------------------------------
                Key: CASSANDRA-6311
                URL: https://issues.apache.org/jira/browse/CASSANDRA-6311
            Project: Cassandra
         Issue Type: New Feature
         Components: Hadoop
           Reporter: Alex Liu
           Assignee: Alex Liu
            Fix For: 2.0.6
        Attachments: 6311-v3-2.0-branch.txt, 6311-v4.txt, 6311-v5-2.0-branch.txt, 6311-v6-2.0-branch.txt, 6331-2.0-branch.txt, 6331-v2-2.0-branch.txt

Since the latest CQL pagination is done and should be more efficient, we need to update CqlPagingRecordReader to use it instead of the custom Thrift paging.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Updated] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination
[ https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Liu updated CASSANDRA-6311:
--------------------------------
    Attachment: 6311-v6-2.0-branch.txt

Add CqlRecordReader to take advantage of native CQL pagination
--------------------------------------------------------------
                Key: CASSANDRA-6311
                URL: https://issues.apache.org/jira/browse/CASSANDRA-6311
            Project: Cassandra
         Issue Type: New Feature
         Components: Hadoop
           Reporter: Alex Liu
           Assignee: Alex Liu
            Fix For: 2.0.6
        Attachments: 6311-v3-2.0-branch.txt, 6311-v4.txt, 6311-v5-2.0-branch.txt, 6311-v6-2.0-branch.txt, 6331-2.0-branch.txt, 6331-v2-2.0-branch.txt

Since the latest CQL pagination is done and should be more efficient, we need to update CqlPagingRecordReader to use it instead of the custom Thrift paging.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Updated] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination
[ https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Liu updated CASSANDRA-6311:
--------------------------------
    Attachment: (was: 6311-v6-2.0-branch.txt)

Add CqlRecordReader to take advantage of native CQL pagination
--------------------------------------------------------------
                Key: CASSANDRA-6311
                URL: https://issues.apache.org/jira/browse/CASSANDRA-6311
            Project: Cassandra
         Issue Type: New Feature
         Components: Hadoop
           Reporter: Alex Liu
           Assignee: Alex Liu
            Fix For: 2.0.6
        Attachments: 6311-v3-2.0-branch.txt, 6311-v4.txt, 6311-v5-2.0-branch.txt, 6311-v6-2.0-branch.txt, 6331-2.0-branch.txt, 6331-v2-2.0-branch.txt

Since the latest CQL pagination is done and should be more efficient, we need to update CqlPagingRecordReader to use it instead of the custom Thrift paging.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (CASSANDRA-6526) CQLSSTableWriter addRow(Map&lt;String, Object&gt; values) does not work as documented.
[ https://issues.apache.org/jira/browse/CASSANDRA-6526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13921240#comment-13921240 ]

Tyler Hobbs commented on CASSANDRA-6526:
----------------------------------------

+1

CQLSSTableWriter addRow(Map<String, Object> values) does not work as documented.
--------------------------------------------------------------------------------
                Key: CASSANDRA-6526
                URL: https://issues.apache.org/jira/browse/CASSANDRA-6526
            Project: Cassandra
         Issue Type: Bug
         Components: Core
           Reporter: Yariv Amar
           Assignee: Sylvain Lebresne
            Fix For: 2.0.6
        Attachments: 6526.txt
  Original Estimate: 24h
 Remaining Estimate: 24h

There are two bugs in the method
{code}
addRow(Map<String, Object> values)
{code}
The first issue is that the map *must* contain all the column names as keys, otherwise addRow fails (with InvalidRequestException: "Invalid number of arguments, expecting %d values but got %d"). The second issue is that the keys in the map must be in lower case, otherwise they may not be found in the map, which results in an NPE during decompose.

h6. SUGGESTED SOLUTION:

Fix the addRow method with:
{code}
public CQLSSTableWriter addRow(Map<String, Object> values) throws InvalidRequestException, IOException
{
    int size = boundNames.size();
    Map<String, ByteBuffer> rawValues = new HashMap<>(size);
    for (int i = 0; i < size; i++)
    {
        ColumnSpecification spec = boundNames.get(i);
        String colName = spec.name.toString();
        rawValues.put(colName, values.get(colName) == null ? null : ((AbstractType) spec.type).decompose(values.get(colName)));
    }
    return rawAddRow(rawValues);
}
{code}
When creating the new map for the insert, we need to go over all columns and apply null to missing columns. Also fix the method documentation by adding this line:
{code}
 * <p>
 * Keys in the map <b>must</b> be in lower case, otherwise their value will be null.
{code}

--
This message was sent by Atlassian JIRA
(v6.2#6252)
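The essence of the suggested fix above can be seen in isolation: iterate over the bound column names rather than the caller's map, so a missing column becomes an explicit null instead of tripping the argument-count check. This is a sketch with made-up column names (k, v1, v2) and plain Objects in place of Cassandra's ColumnSpecification/ByteBuffer machinery:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of the normalization in the suggested addRow fix: drive the loop
// by the bound column names so that every column gets an entry, with null
// standing in for columns the caller omitted.
public class AddRowNormalizeDemo {
    public static void main(String[] args) {
        List<String> boundNames = Arrays.asList("k", "v1", "v2"); // example columns
        Map<String, Object> userValues = new HashMap<>();
        userValues.put("k", 0);
        userValues.put("v1", 4); // "v2" deliberately absent

        // Iterate over bound names, not over userValues' keys.
        Map<String, Object> raw = new HashMap<>(boundNames.size());
        for (String col : boundNames)
            raw.put(col, userValues.get(col)); // absent column -> null

        System.out.println(raw.size());    // 3: every bound column has an entry
        System.out.println(raw.get("v2")); // null, not a missing argument
    }
}
```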
[jira] [Commented] (CASSANDRA-6801) INSERT with IF NOT EXISTS fails when row is an expired ttl
[ https://issues.apache.org/jira/browse/CASSANDRA-6801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13921298#comment-13921298 ]

Paul Kendall commented on CASSANDRA-6801:
-----------------------------------------

This is related to CASSANDRA-6623. Although in that case only a single value has a TTL, in this case all values have a TTL.

INSERT with IF NOT EXISTS fails when row is an expired ttl
----------------------------------------------------------
                Key: CASSANDRA-6801
                URL: https://issues.apache.org/jira/browse/CASSANDRA-6801
            Project: Cassandra
         Issue Type: Bug
         Components: Core
           Reporter: Adam Hattrell

I ran this on a 2 DC cluster with 3 nodes each.

CREATE KEYSPACE test WITH replication = { 'class': 'NetworkTopologyStrategy', 'DC1': '3', 'DC2': '3' };

CREATE TABLE clusterlock (
    name text,
    hostname text,
    lockid text,
    PRIMARY KEY (name)
);

Then add some data and flush it to ensure the sstables exist (it didn't reproduce in memtables for some reason). Then:

insert into clusterlock (name, lockid, hostname) values ('adam', 'tt', '111') IF NOT EXISTS USING TTL 5;

Wait for the TTL to be reached, then try again:

insert into clusterlock (name, lockid, hostname) values ('adam', 'tt', '111') IF NOT EXISTS USING TTL 5;

 [applied]
-----------
     False

select * shows no rows in the table.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Updated] (CASSANDRA-6147) Allow Thrift opt-in to server-side timestamps
[ https://issues.apache.org/jira/browse/CASSANDRA-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-6147:
--------------------------------------
    Component/s: API
       Reviewer: Tyler Hobbs
       Priority: Minor  (was: Major)
        Summary: Allow Thrift opt-in to server-side timestamps  (was: Break timestamp ties for thrift-ers)

WDYT [~thobbs]?

Allow Thrift opt-in to server-side timestamps
---------------------------------------------
                Key: CASSANDRA-6147
                URL: https://issues.apache.org/jira/browse/CASSANDRA-6147
            Project: Cassandra
         Issue Type: Sub-task
         Components: API
           Reporter: Edward Capriolo
           Assignee: Edward Capriolo
           Priority: Minor
            Fix For: 2.1 beta2

Thrift users are still forced to generate timestamps on the client side; the way the thrift bindings are generated, users must supply timestamps. There are two solutions I see:

* Treat -1 as a timestamp meaning "generate on the server side". This is a breaking change for anyone using -1 as a timestamp (which should effectively be no one).
* Prepare yourself: our thrift signatures are wrong. You can't overload methods in thrift, so thrift.get(byte[], byte[], ts) should really be changed to:

GetRequest g = new GetRequest();
g.setName();
g.setValue();
g.setTs(); // optional
thrift.get(g);

I know no one is going to want to make this change because thrift is quasi-dead, but it would allow us to evolve thrift in a meaningful way. We could simply add these new methods under different names as well.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (CASSANDRA-6147) Allow Thrift opt-in to server-side timestamps
[ https://issues.apache.org/jira/browse/CASSANDRA-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13921328#comment-13921328 ]

Nate McCall commented on CASSANDRA-6147:
----------------------------------------

[~appodictic] FBUtilities#timestampMicros() looks like the common way to do what you want.

Allow Thrift opt-in to server-side timestamps
---------------------------------------------
                Key: CASSANDRA-6147
                URL: https://issues.apache.org/jira/browse/CASSANDRA-6147
            Project: Cassandra
         Issue Type: Sub-task
         Components: API
           Reporter: Edward Capriolo
           Assignee: Edward Capriolo
           Priority: Minor
            Fix For: 2.1 beta2

Thrift users are still forced to generate timestamps on the client side; the way the thrift bindings are generated, users must supply timestamps. There are two solutions I see:

* Treat -1 as a timestamp meaning "generate on the server side". This is a breaking change for anyone using -1 as a timestamp (which should effectively be no one).
* Prepare yourself: our thrift signatures are wrong. You can't overload methods in thrift, so thrift.get(byte[], byte[], ts) should really be changed to:

GetRequest g = new GetRequest();
g.setName();
g.setValue();
g.setTs(); // optional
thrift.get(g);

I know no one is going to want to make this change because thrift is quasi-dead, but it would allow us to evolve thrift in a meaningful way. We could simply add these new methods under different names as well.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
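For readers unfamiliar with the method mentioned above: Cassandra mutations are ordered by microsecond timestamps, and server-side generation amounts to producing one when the client didn't. The sketch below is a stand-in under the assumption that FBUtilities.timestampMicros() is equivalent to milliseconds times 1000 (Java of this era has no portable microsecond wall clock); it is not the project's actual source.

```java
// Stand-in for server-side timestamp generation in microseconds, assuming
// micros = System.currentTimeMillis() * 1000 (no portable sub-millisecond
// wall clock exists in Java, so the low digits are always zero here).
public class TimestampMicrosDemo {
    static long timestampMicros() {
        return System.currentTimeMillis() * 1000;
    }

    public static void main(String[] args) {
        long first = timestampMicros();
        long second = timestampMicros();
        System.out.println(second >= first);      // non-decreasing across calls
        System.out.println(first % 1000 == 0);    // millisecond granularity
    }
}
```

Note the granularity consequence: two writes in the same millisecond get the same timestamp, which is exactly the tie-breaking concern the original ticket title ("Break timestamp ties for thrift-ers") referred to.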
[jira] [Updated] (CASSANDRA-5899) Sends all interface in native protocol notification when rpc_address=0.0.0.0
[ https://issues.apache.org/jira/browse/CASSANDRA-5899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tyler Hobbs updated CASSANDRA-5899:
-----------------------------------
    Attachment: 5899.txt

5899.txt (and [branch|https://github.com/thobbs/cassandra/tree/CASSANDRA-5899]) adds a broadcast_rpc_address option. If not set, this defaults to rpc_address. If rpc_address is 0.0.0.0, broadcast_rpc_address must be set, and you can never use 0.0.0.0 for broadcast_rpc_address.

Sends all interface in native protocol notification when rpc_address=0.0.0.0
-----------------------------------------------------------------------------
                Key: CASSANDRA-5899
                URL: https://issues.apache.org/jira/browse/CASSANDRA-5899
            Project: Cassandra
         Issue Type: Improvement
           Reporter: Sylvain Lebresne
           Assignee: Tyler Hobbs
           Priority: Minor
            Fix For: 2.1 beta2
        Attachments: 5899.txt

For the native protocol notifications, when we send a new node notification, we send the rpc_address of that new node. For this to actually be useful, the address sent should be publicly accessible by the driver it is destined for. The problem is when rpc_address=0.0.0.0. Currently, we send the listen_address, which is correct in the sense that we are indeed bound on it, but it might not be accessible by client nodes. In fact, one of the good reasons to use a 0.0.0.0 rpc_address would be if you have a private network for internode communication and another for client-server communications, but still want to be able to issue queries from the private network for debugging. In that case, the current behavior of sending listen_address doesn't really help.

So one suggestion would be to instead send all the addresses the (native protocol) server is bound to (which would still leave the driver the task of picking the right one, but at least it has something to pick from). That's relatively trivial to do in practice, but it does require a minor binary protocol break to return a list instead of just one IP, which is why I'm tentatively marking this 2.0. Maybe we can shove that tiny change into the final (in protocol v2 only)? Provided we agree it's a good idea, of course.

Now, to be complete, for the same reasons we would also need to store all the addresses we are bound to in the peers table. That's also fairly simple, and the backward compatibility story is maybe a tad simpler: we could add a new {{rpc_addresses}} column that would be a list and deprecate {{rpc_address}} (to be removed in 2.1, for instance).

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Updated] (CASSANDRA-5899) Sends all interface in native protocol notification when rpc_address=0.0.0.0
[ https://issues.apache.org/jira/browse/CASSANDRA-5899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tyler Hobbs updated CASSANDRA-5899:
-----------------------------------
    Reviewer: Sylvain Lebresne

Sends all interface in native protocol notification when rpc_address=0.0.0.0
-----------------------------------------------------------------------------
                Key: CASSANDRA-5899
                URL: https://issues.apache.org/jira/browse/CASSANDRA-5899
            Project: Cassandra
         Issue Type: Improvement
           Reporter: Sylvain Lebresne
           Assignee: Tyler Hobbs
           Priority: Minor
            Fix For: 2.1 beta2
        Attachments: 5899.txt

For the native protocol notifications, when we send a new node notification, we send the rpc_address of that new node. For this to actually be useful, the address sent should be publicly accessible by the driver it is destined for. The problem is when rpc_address=0.0.0.0. Currently, we send the listen_address, which is correct in the sense that we are indeed bound on it, but it might not be accessible by client nodes. In fact, one of the good reasons to use a 0.0.0.0 rpc_address would be if you have a private network for internode communication and another for client-server communications, but still want to be able to issue queries from the private network for debugging. In that case, the current behavior of sending listen_address doesn't really help.

So one suggestion would be to instead send all the addresses the (native protocol) server is bound to (which would still leave the driver the task of picking the right one, but at least it has something to pick from). That's relatively trivial to do in practice, but it does require a minor binary protocol break to return a list instead of just one IP, which is why I'm tentatively marking this 2.0. Maybe we can shove that tiny change into the final (in protocol v2 only)? Provided we agree it's a good idea, of course.

Now, to be complete, for the same reasons we would also need to store all the addresses we are bound to in the peers table. That's also fairly simple, and the backward compatibility story is maybe a tad simpler: we could add a new {{rpc_addresses}} column that would be a list and deprecate {{rpc_address}} (to be removed in 2.1, for instance).

--
This message was sent by Atlassian JIRA
(v6.2#6252)
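As a hedged illustration of how the broadcast_rpc_address option described in the attached patch might look in cassandra.yaml (the option name comes from the patch description above; the address value is a placeholder):

```yaml
# Bind the client-facing server on all interfaces (private and public).
rpc_address: 0.0.0.0

# The single address advertised to drivers in system.peers and in native
# protocol notifications. Per the patch description, this must be set when
# rpc_address is 0.0.0.0, and may never itself be 0.0.0.0.
broadcast_rpc_address: 203.0.113.5
```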
[jira] [Created] (CASSANDRA-6803) nodetool getsstables fails with 'blob' type primary keys
Nate McCall created CASSANDRA-6803:
--------------------------------------

            Summary: nodetool getsstables fails with 'blob' type primary keys
                Key: CASSANDRA-6803
                URL: https://issues.apache.org/jira/browse/CASSANDRA-6803
            Project: Cassandra
         Issue Type: Bug
         Components: Tools
           Reporter: Nate McCall
           Assignee: Nate McCall
            Fix For: 2.0.6

Trivial fix: we just need to get the ByteBuffer from the CFMetaData's key validator instead of calling String#getBytes (which breaks for keys of BytesType).

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Updated] (CASSANDRA-6803) nodetool getsstables fails with 'blob' type primary keys
[ https://issues.apache.org/jira/browse/CASSANDRA-6803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nate McCall updated CASSANDRA-6803:
-----------------------------------
    Attachment: sstables_for_key_blob_support.txt
                sstables_for_key_blob_support_2.0.txt

Patches for 2.0 and 1.2.

nodetool getsstables fails with 'blob' type primary keys
--------------------------------------------------------
                Key: CASSANDRA-6803
                URL: https://issues.apache.org/jira/browse/CASSANDRA-6803
            Project: Cassandra
         Issue Type: Bug
         Components: Tools
           Reporter: Nate McCall
           Assignee: Nate McCall
            Fix For: 2.0.6
        Attachments: sstables_for_key_blob_support.txt, sstables_for_key_blob_support_2.0.txt

Trivial fix: we just need to get the ByteBuffer from the CFMetaData's key validator instead of calling String#getBytes (which breaks for keys of BytesType).

--
This message was sent by Atlassian JIRA
(v6.2#6252)
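The failure mode described in CASSANDRA-6803 comes down to this: for a blob (BytesType) key, the string typed on the command line is a hex encoding, so routing it through String#getBytes produces the ASCII bytes of the hex text rather than the key's actual bytes. The sketch below is illustrative only; the hex decoder stands in for the key validator's fromString, and is not Cassandra's code.

```java
import java.nio.ByteBuffer;

// Why String#getBytes breaks for blob partition keys: "cafe" typed at the
// CLI is hex, not raw bytes. The fix routes the string through the table's
// key validator; fromString here is a hand-rolled stand-in for BytesType.
public class BlobKeyDemo {
    // Stand-in for BytesType.fromString: decode a hex string to raw bytes.
    static ByteBuffer fromString(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++)
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        return ByteBuffer.wrap(out);
    }

    public static void main(String[] args) {
        String key = "cafe";
        ByteBuffer wrong = ByteBuffer.wrap(key.getBytes()); // 4 ASCII bytes: 'c','a','f','e'
        ByteBuffer right = fromString(key);                 // 2 raw bytes: 0xca, 0xfe
        System.out.println(wrong.remaining()); // 4
        System.out.println(right.remaining()); // 2
    }
}
```

The two buffers differ in both length and content, so a token computed from the wrong one will never match the row's actual partition.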
[jira] [Commented] (CASSANDRA-6147) Allow Thrift opt-in to server-side timestamps
[ https://issues.apache.org/jira/browse/CASSANDRA-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13921856#comment-13921856 ]

Edward Capriolo commented on CASSANDRA-6147:
--------------------------------------------

I was considering adding this feature to deletes as well, because the same logic holds. Here is the problem:

struct Deletion {
  1: optional i64 timestamp,
  2: optional binary super_column,
  3: optional SlicePredicate predicate,
}

void remove(1:required binary key,
            2:required ColumnPath column_path,
            3:required i64 timestamp,
            4:ConsistencyLevel consistency_level=ConsistencyLevel.ONE)
  throws (1:InvalidRequestException ire, 2:UnavailableException ue, 3:TimedOutException te),

else if (!del.isSetTimestamp())
{
    throw new org.apache.cassandra.exceptions.InvalidRequestException("Deletion timestamp is not optional for non commutative column family " + metadata.cfName);
}

remove requires a timestamp while Deletion does not, and we can't remove that requirement. What we can do is make the field optional, throw an exception server side, and then maybe later (in a year) truly allow it to be optional and stop throwing the exception.

I updated my branch with Nate's change. I also modified the interface file to document the change.

Allow Thrift opt-in to server-side timestamps
---------------------------------------------
                Key: CASSANDRA-6147
                URL: https://issues.apache.org/jira/browse/CASSANDRA-6147
            Project: Cassandra
         Issue Type: Sub-task
         Components: API
           Reporter: Edward Capriolo
           Assignee: Edward Capriolo
           Priority: Minor
            Fix For: 2.1 beta2

Thrift users are still forced to generate timestamps on the client side; the way the thrift bindings are generated, users must supply timestamps. There are two solutions I see:

* Treat -1 as a timestamp meaning "generate on the server side". This is a breaking change for anyone using -1 as a timestamp (which should effectively be no one).
* Prepare yourself: our thrift signatures are wrong. You can't overload methods in thrift, so thrift.get(byte[], byte[], ts) should really be changed to:

GetRequest g = new GetRequest();
g.setName();
g.setValue();
g.setTs(); // optional
thrift.get(g);

I know no one is going to want to make this change because thrift is quasi-dead, but it would allow us to evolve thrift in a meaningful way. We could simply add these new methods under different names as well.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (CASSANDRA-6588) Add a 'NO EMPTY RESULTS' filter to SELECT
[ https://issues.apache.org/jira/browse/CASSANDRA-6588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13921974#comment-13921974 ]

Tupshin Harper commented on CASSANDRA-6588:
-------------------------------------------

I read that three times, and as long as there is no technical objection or problem implementing it, I'd *love* to see that as our approach. +1

Add a 'NO EMPTY RESULTS' filter to SELECT
-----------------------------------------
                Key: CASSANDRA-6588
                URL: https://issues.apache.org/jira/browse/CASSANDRA-6588
            Project: Cassandra
         Issue Type: Improvement
           Reporter: Sylvain Lebresne
           Priority: Minor
            Fix For: 2.1 beta2

It is the semantics of CQL that a (CQL) row exists as long as it has one non-null column (including the PK columns which, given that no PK column can be null, means it's enough for the PK to be set for a row to exist). This means that the result of
{noformat}
CREATE TABLE test (k int PRIMARY KEY, v1 int, v2 int);
INSERT INTO test(k, v1) VALUES (0, 4);
SELECT v2 FROM test;
{noformat}
must be (and is)
{noformat}
 v2
------
 null
{noformat}
That fact does mean, however, that when we select only a few columns of a row, we still need to find rows that exist but have no values for the selected columns. Long story short, given how the storage engine works, this means we need to query full (CQL) rows even when only some of the columns are selected, because that's the only way to distinguish between "the row exists but has no value for the selected columns" and "the row doesn't exist". I'll note in particular that, due to CASSANDRA-5762, we unfortunately can't rely on the row marker to optimize that away.

Now, when you select only a subset of the columns of a row, there are many cases where you don't care about rows that exist but have no value for the columns you requested, and are happy to filter those out. So for those cases, we could provide a new SELECT filter.

Outside the potential convenience (not having to filter empty results client side), one interesting part is that when this filter is provided, we could optimize a bit by querying only the selected columns, since we wouldn't need to return rows that exist but have no values for them.

For the exact syntax, there are probably a bunch of options. For instance:
* {{SELECT NON EMPTY(v2, v3) FROM test}}: the vague rationale for putting it in the SELECT part is that such a filter is kind of in the spirit of DISTINCT. Possibly a bit ugly outside of that.
* {{SELECT v2, v3 FROM test NO EMPTY RESULTS}} or {{SELECT v2, v3 FROM test NO EMPTY ROWS}} or {{SELECT v2, v3 FROM test NO EMPTY}}: the last one is shorter but maybe a bit less explicit. As for {{RESULTS}} versus {{ROWS}}, the only small objection to {{NO EMPTY ROWS}} could be that it might suggest it filters out non-existing rows (the fact that we never, ever return non-existing rows should hint that that's not what it does, but well...) while we're just filtering empty resultSet rows.

Of course, if there is a pre-existing SQL syntax for that, even better, though a very quick search didn't turn up anything. Other suggestions welcome too.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
git commit: use junit asserts
Repository: cassandra
Updated Branches:
  refs/heads/trunk 7c7193769 -> b173ce207


use junit asserts


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b173ce20
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b173ce20
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b173ce20

Branch: refs/heads/trunk
Commit: b173ce207b311a57f288269eebf13375a2459a99
Parents: 7c71937
Author: Dave Brosius <dbros...@mebigfatguy.com>
Authored: Wed Mar 5 23:57:37 2014 -0500
Committer: Dave Brosius <dbros...@mebigfatguy.com>
Committed: Wed Mar 5 23:57:37 2014 -0500

----------------------------------------------------------------------
 .../db/compaction/CompactionsTest.java | 67 +---
 1 file changed, 43 insertions(+), 24 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/b173ce20/test/unit/org/apache/cassandra/db/compaction/CompactionsTest.java
----------------------------------------------------------------------
diff --git a/test/unit/org/apache/cassandra/db/compaction/CompactionsTest.java b/test/unit/org/apache/cassandra/db/compaction/CompactionsTest.java
index 1497b3a..ac47bb6 100644
--- a/test/unit/org/apache/cassandra/db/compaction/CompactionsTest.java
+++ b/test/unit/org/apache/cassandra/db/compaction/CompactionsTest.java
@@ -18,22 +18,35 @@
  */
 package org.apache.cassandra.db.compaction;
 
-import java.io.*;
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertFalse;
+import static org.junit.Assert.assertTrue;
+import static org.junit.Assert.assertNotNull;
+
+import java.io.File;
 import java.nio.ByteBuffer;
-import java.util.*;
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.Collections;
+import java.util.Map;
+import java.util.Set;
+import java.util.UUID;
 import java.util.concurrent.ExecutionException;
 import java.util.concurrent.TimeUnit;
 
-import com.google.common.base.Function;
-import com.google.common.collect.Iterables;
-import com.google.common.collect.Sets;
-import org.junit.Test;
-import org.junit.runner.RunWith;
-
 import org.apache.cassandra.OrderedJUnit4ClassRunner;
 import org.apache.cassandra.SchemaLoader;
 import org.apache.cassandra.Util;
-import org.apache.cassandra.db.*;
+import org.apache.cassandra.db.ColumnFamily;
+import org.apache.cassandra.db.ColumnFamilyStore;
+import org.apache.cassandra.db.DataRange;
+import org.apache.cassandra.db.DecoratedKey;
+import org.apache.cassandra.db.Keyspace;
+import org.apache.cassandra.db.Mutation;
+import org.apache.cassandra.db.RangeTombstone;
+import org.apache.cassandra.db.RowPosition;
+import org.apache.cassandra.db.SuperColumns;
+import org.apache.cassandra.db.SystemKeyspace;
 import org.apache.cassandra.db.columniterator.OnDiskAtomIterator;
 import org.apache.cassandra.db.filter.QueryFilter;
 import org.apache.cassandra.dht.BytesToken;
@@ -45,8 +58,13 @@ import org.apache.cassandra.io.sstable.SSTableScanner;
 import org.apache.cassandra.utils.ByteBufferUtil;
 import org.apache.cassandra.utils.FBUtilities;
 import org.apache.cassandra.utils.Pair;
+import org.junit.Ignore;
+import org.junit.Test;
+import org.junit.runner.RunWith;
 
-import static org.junit.Assert.*;
+import com.google.common.base.Function;
+import com.google.common.collect.Iterables;
+import com.google.common.collect.Sets;
 
 @RunWith(OrderedJUnit4ClassRunner.class)
 public class CompactionsTest extends SchemaLoader
@@ -115,7 +133,7 @@ public class CompactionsTest extends SchemaLoader
         ColumnFamilyStore store = testSingleSSTableCompaction(LeveledCompactionStrategy.class.getCanonicalName());
         LeveledCompactionStrategy strategy = (LeveledCompactionStrategy) store.getCompactionStrategy();
         // tombstone removal compaction should not promote level
-        assert strategy.getLevelSize(0) == 1;
+        assertEquals(1, strategy.getLevelSize(0));
     }
 
     @Test
@@ -151,8 +169,8 @@ public class CompactionsTest extends SchemaLoader
         SSTableScanner scanner = sstable.getScanner(DataRange.forKeyRange(keyRange));
         OnDiskAtomIterator iter = scanner.next();
         assertEquals(key, iter.getKey());
-        assert iter.next() instanceof RangeTombstone;
-        assert !iter.hasNext();
+        assertTrue(iter.next() instanceof RangeTombstone);
+        assertFalse(iter.hasNext());
     }
 
     public static void assertMaxTimestamp(ColumnFamilyStore cfs, long maxTimestampExpected)
@@ -187,7 +205,7 @@ public class CompactionsTest extends SchemaLoader
             cfs.forceBlockingFlush();
         }
         Collection<SSTableReader> toCompact = cfs.getSSTables();
-        assert toCompact.size() == 2;
+        assertEquals(2, toCompact.size());
         // Reinserting the same keys. We will compact only the previous sstable, but we need those new ones
         // to make sure we use
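The motivation for a conversion like the one in this commit can be sketched outside of Cassandra: the Java `assert` keyword is a silent no-op unless the JVM runs with `-ea`, while JUnit-style asserts always execute and report expected versus actual values on failure. A minimal self-contained illustration (the `assertEquals` below is a simplified stand-in, not JUnit itself):

```java
public class AssertStyle {
    // Simplified stand-in for org.junit.Assert.assertEquals: it runs
    // unconditionally and its failure message shows both values.
    public static void assertEquals(long expected, long actual) {
        if (expected != actual)
            throw new AssertionError("expected:<" + expected + "> but was:<" + actual + ">");
    }

    public static void main(String[] args) {
        // The style being replaced in the commit: silently skipped
        // unless the JVM was started with -ea.
        assert 1 + 1 == 2;

        // The JUnit style: checked unconditionally.
        assertEquals(2, 1 + 1);
        try {
            assertEquals(1, 2);
        } catch (AssertionError e) {
            System.out.println(e.getMessage()); // prints: expected:<1> but was:<2>
        }
    }
}
```

This is why a test like `assert toCompact.size() == 2` quietly passes in a test runner without `-ea`, whereas `assertEquals(2, toCompact.size())` would actually fail and say what it saw.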
[jira] [Commented] (CASSANDRA-6623) Null in a cell caused by expired TTL does not work with IF clause (in CQL3)
[ https://issues.apache.org/jira/browse/CASSANDRA-6623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13922046#comment-13922046 ]

Paul Kendall commented on CASSANDRA-6623:
-----------------------------------------

This problem is not fixed. I followed exactly the steps in comment #2 above and got exactly the same problems using the trunk version from git.

Null in a cell caused by expired TTL does not work with IF clause (in CQL3)
---------------------------------------------------------------------------

                Key: CASSANDRA-6623
                URL: https://issues.apache.org/jira/browse/CASSANDRA-6623
            Project: Cassandra
         Issue Type: Bug
         Components: Tests
        Environment: One cluster with two nodes on a Linux and a Windows system.
                     cqlsh 4.1.0 | Cassandra 2.0.4 | CQL spec 3.1.1 | Thrift protocol 19.39.0.
                     CQL3 Column Family
           Reporter: Csaba Seres
           Assignee: Sylvain Lebresne
           Priority: Minor
            Fix For: 2.0.6
        Attachments: 6623.txt

The IF onecell=null clause does not work if onecell got its null value from an expired TTL. If onecell is updated with a null value (UPDATE), then IF onecell=null works fine. This bug is not present when you create the table with the COMPACT STORAGE directive.
[jira] [Comment Edited] (CASSANDRA-6623) Null in a cell caused by expired TTL does not work with IF clause (in CQL3)
[ https://issues.apache.org/jira/browse/CASSANDRA-6623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13922046#comment-13922046 ]

Paul Kendall edited comment on CASSANDRA-6623 at 3/6/14 6:24 AM:
-----------------------------------------------------------------

This problem is not fixed. I followed exactly the steps in comment #2 above and got exactly the same problems using the trunk version from git.

* The expiry time of a column is in seconds.
* The time passed to isLive is in milliseconds.
* The queryTimestamp is in microseconds (not seconds, as the comment in the patch says).

There are 3 calls to the CQL3CasConditions constructor passing the queryTimestamp, and the one in ModificationStatement.executeWithCondition is the only one that changes the scale of the time value. From my testing, the best solution is to remove the scaling done there and instead divide by 1000 in the constructor of CQL3CasConditions.

was (Author: pkendall):
This problem is not fixed. I followed exactly the steps in comment #2 above and got exactly the same problems using the trunk version from git.

Null in a cell caused by expired TTL does not work with IF clause (in CQL3)
---------------------------------------------------------------------------

                Key: CASSANDRA-6623
                URL: https://issues.apache.org/jira/browse/CASSANDRA-6623
            Project: Cassandra
         Issue Type: Bug
         Components: Tests
        Environment: One cluster with two nodes on a Linux and a Windows system.
                     cqlsh 4.1.0 | Cassandra 2.0.4 | CQL spec 3.1.1 | Thrift protocol 19.39.0.
                     CQL3 Column Family
           Reporter: Csaba Seres
           Assignee: Sylvain Lebresne
           Priority: Minor
            Fix For: 2.0.6
        Attachments: 6623.txt

The IF onecell=null clause does not work if onecell got its null value from an expired TTL. If onecell is updated with a null value (UPDATE), then IF onecell=null works fine. This bug is not present when you create the table with the COMPACT STORAGE directive.
[jira] [Comment Edited] (CASSANDRA-6623) Null in a cell caused by expired TTL does not work with IF clause (in CQL3)
[ https://issues.apache.org/jira/browse/CASSANDRA-6623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13922046#comment-13922046 ]

Paul Kendall edited comment on CASSANDRA-6623 at 3/6/14 6:27 AM:
-----------------------------------------------------------------

This problem is not fixed. I followed exactly the steps in comment #2 above and got exactly the same problems using the trunk version from git.

* The expiry time of a column is in seconds.
* The time passed to isLive is in milliseconds.
* The queryTimestamp is in microseconds (not seconds, as the comment in the patch says).

There are 3 calls to the CQL3CasConditions constructor passing the queryTimestamp, and the one in ModificationStatement.executeWithCondition is the only one that changes the scale of the time value. From my testing, the best solution is to remove the scaling done there and instead divide by 1000 in the constructor of CQL3CasConditions.

Attached patch [^0001-Fix-for-expiring-columns-used-in-cas-conditions.patch]

was (Author: pkendall):
This problem is not fixed. I followed exactly the steps in comment #2 above and got exactly the same problems using the trunk version from git.

* The expiry time of a column is in seconds.
* The time passed to isLive is in milliseconds.
* The queryTimestamp is in microseconds (not seconds, as the comment in the patch says).

There are 3 calls to the CQL3CasConditions constructor passing the queryTimestamp, and the one in ModificationStatement.executeWithCondition is the only one that changes the scale of the time value. From my testing, the best solution is to remove the scaling done there and instead divide by 1000 in the constructor of CQL3CasConditions.

Null in a cell caused by expired TTL does not work with IF clause (in CQL3)
---------------------------------------------------------------------------

                Key: CASSANDRA-6623
                URL: https://issues.apache.org/jira/browse/CASSANDRA-6623
            Project: Cassandra
         Issue Type: Bug
         Components: Tests
        Environment: One cluster with two nodes on a Linux and a Windows system.
                     cqlsh 4.1.0 | Cassandra 2.0.4 | CQL spec 3.1.1 | Thrift protocol 19.39.0.
                     CQL3 Column Family
           Reporter: Csaba Seres
           Assignee: Sylvain Lebresne
           Priority: Minor
            Fix For: 2.0.6
        Attachments: 0001-Fix-for-expiring-columns-used-in-cas-conditions.patch, 6623.txt

The IF onecell=null clause does not work if onecell got its null value from an expired TTL. If onecell is updated with a null value (UPDATE), then IF onecell=null works fine. This bug is not present when you create the table with the COMPACT STORAGE directive.
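Paul's analysis boils down to keeping three time scales straight before comparing them. The following is a hypothetical standalone sketch (not Cassandra's actual CQL3CasConditions or Column code): the microsecond query timestamp must be divided by 1000 to milliseconds, and milliseconds by 1000 to seconds, before being compared with a column's second-granularity expiry time.

```java
public class TimeScales {
    // The single point where the proposed fix would do the scaling:
    // microseconds (query timestamp) -> milliseconds (liveness check).
    public static long microsToMillis(long micros) { return micros / 1000; }

    public static long millisToSeconds(long millis) { return millis / 1000; }

    // A cell is live if its expiry (stored in SECONDS) is still in the
    // future relative to "now" given in MILLISECONDS.
    public static boolean isLive(long localExpirationSeconds, long nowMillis) {
        return millisToSeconds(nowMillis) < localExpirationSeconds;
    }

    public static void main(String[] args) {
        long queryTimestampMicros = 1_394_088_000_000_000L; // microseconds
        long nowMillis = microsToMillis(queryTimestampMicros);

        System.out.println(isLive(1_394_088_001L, nowMillis)); // true: expires 1s later
        System.out.println(isLive(1_394_087_999L, nowMillis)); // false: already expired
    }
}
```

Passing the raw microsecond timestamp straight into a millisecond-based check makes "now" appear a thousand times later than it is, so every TTL cell looks expired — consistent with the behavior reported in the ticket.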
[jira] [Updated] (CASSANDRA-6623) Null in a cell caused by expired TTL does not work with IF clause (in CQL3)
[ https://issues.apache.org/jira/browse/CASSANDRA-6623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Kendall updated CASSANDRA-6623:
------------------------------------
    Attachment: 0001-Fix-for-expiring-columns-used-in-cas-conditions.patch

Null in a cell caused by expired TTL does not work with IF clause (in CQL3)
---------------------------------------------------------------------------

                Key: CASSANDRA-6623
                URL: https://issues.apache.org/jira/browse/CASSANDRA-6623
            Project: Cassandra
         Issue Type: Bug
         Components: Tests
        Environment: One cluster with two nodes on a Linux and a Windows system.
                     cqlsh 4.1.0 | Cassandra 2.0.4 | CQL spec 3.1.1 | Thrift protocol 19.39.0.
                     CQL3 Column Family
           Reporter: Csaba Seres
           Assignee: Sylvain Lebresne
           Priority: Minor
            Fix For: 2.0.6
        Attachments: 0001-Fix-for-expiring-columns-used-in-cas-conditions.patch, 6623.txt

The IF onecell=null clause does not work if onecell got its null value from an expired TTL. If onecell is updated with a null value (UPDATE), then IF onecell=null works fine. This bug is not present when you create the table with the COMPACT STORAGE directive.
git commit: FBUtilities.singleton() should use the CF comparator
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.0 249230834 -> 773fade9a


FBUtilities.singleton() should use the CF comparator

patch by slebresne; reviewed by thobbs for CASSANDRA-6778


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/773fade9
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/773fade9
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/773fade9

Branch: refs/heads/cassandra-2.0
Commit: 773fade9aee009170c7062d174f2b78211061fce
Parents: 2492308
Author: Sylvain Lebresne <sylv...@datastax.com>
Authored: Thu Mar 6 08:54:32 2014 +0100
Committer: Sylvain Lebresne <sylv...@datastax.com>
Committed: Thu Mar 6 08:56:08 2014 +0100

----------------------------------------------------------------------
 CHANGES.txt                                      | 1 +
 .../cql3/statements/ColumnGroupMap.java          | 4 +-
 .../cql3/statements/SelectStatement.java         | 7 +-
 .../org/apache/cassandra/db/SystemKeyspace.java  | 2 +-
 .../cassandra/db/filter/NamesQueryFilter.java    | 4 +-
 .../apache/cassandra/db/filter/QueryFilter.java  | 8 ---
 .../org/apache/cassandra/utils/FBUtilities.java  | 6 +-
 .../apache/cassandra/db/LongKeyspaceTest.java    | 3 +-
 .../unit/org/apache/cassandra/SchemaLoader.java  | 3 +-
 .../org/apache/cassandra/config/DefsTest.java    | 7 +-
 .../cassandra/db/CollationControllerTest.java    | 5 +-
 .../cassandra/db/ColumnFamilyStoreTest.java      | 67 +---
 .../org/apache/cassandra/db/KeyspaceTest.java    | 7 +-
 .../apache/cassandra/db/ReadMessageTest.java     | 4 +-
 .../db/RecoveryManagerTruncateTest.java          | 3 +-
 .../apache/cassandra/db/RemoveColumnTest.java    | 3 +-
 .../cassandra/io/sstable/LegacySSTableTest.java  | 4 +-
 .../cassandra/tools/SSTableExportTest.java       | 8 ++-
 18 files changed, 102 insertions(+), 44 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/773fade9/CHANGES.txt
----------------------------------------------------------------------
diff --git a/CHANGES.txt b/CHANGES.txt
index 19cedd8..d697e3f 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -33,6 +33,7 @@
  * Fix UPDATE updating PRIMARY KEY columns implicitly (CASSANDRA-6782)
  * Fix IllegalArgumentException when updating from 1.2 with SuperColumns (CASSANDRA-6733)
+ * FBUtilities.singleton() should use the CF comparator (CASSANDRA-6778)
 Merged from 1.2:
  * Add CMSClassUnloadingEnabled JVM option (CASSANDRA-6541)
  * Catch memtable flush exceptions during shutdown (CASSANDRA-6735)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/773fade9/src/java/org/apache/cassandra/cql3/statements/ColumnGroupMap.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/cql3/statements/ColumnGroupMap.java b/src/java/org/apache/cassandra/cql3/statements/ColumnGroupMap.java
index 5c3fcb9..1c9a346 100644
--- a/src/java/org/apache/cassandra/cql3/statements/ColumnGroupMap.java
+++ b/src/java/org/apache/cassandra/cql3/statements/ColumnGroupMap.java
@@ -25,6 +25,7 @@ import java.util.List;
 import java.util.Map;
 
 import org.apache.cassandra.db.Column;
+import org.apache.cassandra.db.marshal.AbstractType;
 import org.apache.cassandra.db.marshal.CompositeType;
 import org.apache.cassandra.utils.Pair;
 
@@ -155,7 +156,8 @@ public class ColumnGroupMap
     {
         for (int i = 0; i < idx; i++)
         {
-            if (!c[i].equals(previous[i]))
+            AbstractType<?> comp = composite.types.get(i);
+            if (comp.compare(c[i], previous[i]) != 0)
                 return false;
         }
         return true;

http://git-wip-us.apache.org/repos/asf/cassandra/blob/773fade9/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java b/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
index 5a9d3d9..100383f 100644
--- a/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
+++ b/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
@@ -717,7 +717,7 @@ public class SelectStatement implements CQLStatement, MeasurableForPreparedCache
     {
         if (cfDef.isCompact)
         {
-            return FBUtilities.singleton(builder.build());
+            return FBUtilities.singleton(builder.build(), cfDef.cfm.comparator);
         }
         else
         {
@@ -994,10 +994,11 @@ public class SelectStatement implements CQLStatement, MeasurableForPreparedCache
         }
         else if (sliceRestriction != null)
         {
+            Comparator<ByteBuffer> comp = cfDef.cfm.comparator;
             // For dynamic CF, the column could be out of the
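The ColumnGroupMap hunk replaces byte-wise `equals()` on column name components with the composite type's comparator. Why the two can disagree is easy to demonstrate with any comparator whose notion of equivalence is coarser than raw equality; the sketch below is purely illustrative (Java strings and a case-insensitive order, not Cassandra's AbstractType over ByteBuffers):

```java
import java.util.Comparator;

public class ComparatorVsEquals {
    // Two values are "the same column" when the type's comparator says
    // so, even if their raw representations differ byte for byte.
    public static boolean sameColumn(Comparator<String> comp, String a, String b) {
        return comp.compare(a, b) == 0;
    }

    public static void main(String[] args) {
        Comparator<String> comp = String.CASE_INSENSITIVE_ORDER;

        System.out.println("Foo".equals("foo"));            // false: raw equality
        System.out.println(sameColumn(comp, "Foo", "foo")); // true: comparator equality
    }
}
```

Using `equals()` where comparator equality was intended would split one logical column group into two whenever distinct encodings compare equal, which is the class of bug this commit guards against.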