[jira] [Updated] (CASSANDRA-4462) upgradesstables strips active data from sstables
[ https://issues.apache.org/jira/browse/CASSANDRA-4462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-4462:

Attachment: 4462.txt

There is indeed an unfortunate typo in the upgradesstables code that makes it purge every tombstone instead of purging none. Since we upgrade one sstable at a time, purging tombstones is a bug that will resurrect data. Attached is a patch to fix it (it also fixes a 2nd occurrence of the same problem, but that one was introduced by CASSANDRA-4456 and so wasn't released yet).

upgradesstables strips active data from sstables

Key: CASSANDRA-4462
URL: https://issues.apache.org/jira/browse/CASSANDRA-4462
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 1.1.2
Environment: Ubuntu 11.04 64-bit
Reporter: Mike Heffner
Fix For: 1.1.3
Attachments: 4462.txt

From the discussion here: http://mail-archives.apache.org/mod_mbox/cassandra-user/201207.mbox/%3CCAOac0GCtyDqS6ocuHOuQqre4re5wKj3o-ZpUZGkGsjCHzDVbTA%40mail.gmail.com%3E

We are trying to migrate a 0.8.8 cluster to 1.1.2 by migrating the sstables from the 0.8.8 ring to a parallel 1.1.2 ring. However, every time we run the `nodetool upgradesstables` step we find it removes active data from our CFs -- leading to lost data in our application.

The steps we took were:

1. Bring up a 1.1.2 ring in the same AZ/data center configuration with tokens matching the corresponding nodes in the 0.8.8 ring.
2. Create the same keyspace on 1.1.2.
3. Create each CF in the keyspace on 1.1.2.
4. Flush each node of the 0.8.8 ring.
5. Rsync each non-compacted sstable from 0.8.8 to the corresponding node in 1.1.2.
6. Move each 0.8.8 sstable into the 1.1.2 directory structure by renaming the file to the /cassandra/data/keyspace/cf/keyspace-cf... format. For example, for the keyspace Metrics and CF epochs_60 we get: cassandra/data/Metrics/epochs_60/Metrics-epochs_60-g-941-Data.db.
7. On each 1.1.2 node run `nodetool -h localhost refresh Metrics CF` for each CF in the keyspace. We notice that storage load jumps accordingly.
8. On each 1.1.2 node run `nodetool -h localhost upgradesstables`.

Afterwards we would test the validity of the data by comparing it with data from the original 0.8.8 ring. After an upgradesstables command the data was always incorrect.

With further testing we found that we could successfully use scrub to convert our sstables without data loss. However, any invocation of upgradesstables causes active data to be culled from the sstables:

INFO [CompactionExecutor:4] 2012-07-24 04:27:36,837 CompactionTask.java (line 109) Compacting [SSTableReader(path='/raid0/cassandra/data/Metrics/metrics_900/Metrics-metrics_900-hd-51-Data.db')]
INFO [CompactionExecutor:4] 2012-07-24 04:27:51,090 CompactionTask.java (line 221) Compacted to [/raid0/cassandra/data/Metrics/metrics_900/Metrics-metrics_900-hd-58-Data.db,]. 60,449,155 to 2,578,102 (~4% of original) bytes for 4,002 keys at 0.172562MB/s. Time: 14,248ms.

These are the steps we've tried:

WORKS refresh - scrub
WORKS refresh - scrub - major compaction
WORKS refresh - scrub - cleanup
WORKS refresh - scrub - repair
FAILS refresh - upgradesstables
FAILS refresh - scrub - upgradesstables
FAILS refresh - scrub - repair - upgradesstables
FAILS refresh - scrub - major compaction - upgradesstables

We have fewer than 143 million row keys in the CFs we're testing and none of the *-Filter.db files are 10MB, so I don't believe this is our problem: https://issues.apache.org/jira/browse/CASSANDRA-3820

The keyspace is defined as:

Keyspace: Metrics:
  Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
  Durable Writes: true
  Options: [us-east:3]

And the column family that we tested with is defined as:

ColumnFamily: metrics_900
  Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
  Default column value validator: org.apache.cassandra.db.marshal.BytesType
  Columns sorted by: org.apache.cassandra.db.marshal.CompositeType(org.apache.cassandra.db.marshal.LongType,org.apache.cassandra.db.marshal.UTF8Type,org.apache.cassandra.db.marshal.UTF8Type)
  GC grace seconds: 0
  Compaction min/max thresholds: 4/32
  Read repair chance: 0.1
  DC Local Read repair chance: 0.0
  Replicate on write: true
  Caching: KEYS_ONLY
  Bloom Filter FP chance: default
  Built indexes: []
  Compaction Strategy: org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy
  Compression Options:
    sstable_compression: org.apache.cassandra.io.compress.SnappyCompressor

All rows have a TTL of
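The resurrection mechanism named in the comment on this issue (purging tombstones while rewriting a single sstable) can be illustrated with a toy model. This is a minimal sketch and not Cassandra's code: sstables are modelled as plain dicts, timestamps are small integers, and `TOMBSTONE`, `compact`, and `read` are hypothetical names invented for the illustration. It covers only the resurrection case, not every symptom reported above.

```python
TOMBSTONE = object()  # hypothetical marker for a deleted cell


def compact(sstables, purge_tombstones):
    """Merge sstables, keeping the newest (timestamp, value) per key."""
    merged = {}
    for table in sstables:
        for key, (ts, value) in table.items():
            if key not in merged or ts > merged[key][0]:
                merged[key] = (ts, value)
    if purge_tombstones:
        # Dropping tombstones is only safe if no other sstable still
        # holds older data for the same key.
        merged = {k: v for k, v in merged.items() if v[1] is not TOMBSTONE}
    return merged


def read(sstables, key):
    """Return the live value for key across all sstables, or None."""
    newest = None
    for table in sstables:
        if key in table and (newest is None or table[key][0] > newest[0]):
            newest = table[key]
    if newest is None or newest[1] is TOMBSTONE:
        return None
    return newest[1]


old = {"k": (1, "stale")}     # older sstable still holds the value
new = {"k": (2, TOMBSTONE)}   # newer sstable holds the delete

before = read([old, new], "k")  # the tombstone shadows the old value

# Rewrite only `new`, one sstable at a time, purging its tombstones.
buggy = read([old, compact([new], purge_tombstones=True)], "k")
# Rewrite only `new` but keep its tombstones.
fixed = read([old, compact([new], purge_tombstones=False)], "k")
```

In this model the deleted value comes back only in the `buggy` case, because the tombstone was dropped while the older sstable still held the shadowed cell.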
[jira] [Commented] (CASSANDRA-4462) upgradesstables strips active data from sstables
[ https://issues.apache.org/jira/browse/CASSANDRA-4462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422186#comment-13422186 ]

Mike Heffner commented on CASSANDRA-4462:

Would that typo lead to the behavior we saw where non-tombstoned data would be removed from the sstable during an upgradesstables run?

upgradesstables strips active data from sstables

Key: CASSANDRA-4462
URL: https://issues.apache.org/jira/browse/CASSANDRA-4462
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 1.1.2
Environment: Ubuntu 11.04 64-bit
Reporter: Mike Heffner
Fix For: 1.1.3
Attachments: 4462.txt

[...]

All rows have a TTL of 30 days and a gc_grace=0 so it's possible that a small number of older columns would be removed during a compaction/scrub/upgradesstables step. However, the majority should still be kept as their TTLs have
[jira] [Updated] (CASSANDRA-4455) Nodetool fail to setcompactionthreshold
[ https://issues.apache.org/jira/browse/CASSANDRA-4455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksey Yeschenko updated CASSANDRA-4455:

Attachment: CASSANDRA-4455.patch

Nodetool fail to setcompactionthreshold

Key: CASSANDRA-4455
URL: https://issues.apache.org/jira/browse/CASSANDRA-4455
Project: Cassandra
Issue Type: Bug
Components: Tools
Affects Versions: 1.0.3
Environment: Cassandra 1.0.3
Reporter: Jason Tang
Assignee: Aleksey Yeschenko
Priority: Minor
Fix For: 1.1.4
Attachments: CASSANDRA-4455.patch

First, change the compaction threshold from 4/32 to 2/2:

/opt/dve/cassandra/bin/nodetool -h 127.0.0.1 -p 7199 setcompactionthreshold ks cf 2 2

That succeeds. Setting it back to 4/32 then fails:

/opt/dve/cassandra/bin/nodetool -h 127.0.0.1 -p 7199 setcompactionthreshold ks cf 4 32
Exception in thread "main" java.lang.RuntimeException: The min_compaction_threshold cannot be larger than the max.
        at org.apache.cassandra.db.ColumnFamilyStore.setMinimumCompactionThreshold(ColumnFamilyStore.java:1697)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93)
        at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27)
        at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeSetter(MBeanIntrospector.java:238)
        at com.sun.jmx.mbeanserver.PerInterface.setAttribute(PerInterface.java:84)
        at com.sun.jmx.mbeanserver.MBeanSupport.setAttribute(MBeanSupport.java:240)
        at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.setAttribute(DefaultMBeanServerInterceptor.java:762)
        at com.sun.jmx.mbeanserver.JmxMBeanServer.setAttribute(JmxMBeanServer.java:699)
        at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1450)
        at javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
        at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265)
        at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360)
        at javax.management.remote.rmi.RMIConnectionImpl.setAttribute(RMIConnectionImpl.java:683)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:303)
        at sun.rmi.transport.Transport$1.run(Transport.java:159)
        at java.security.AccessController.doPrivileged(Native Method)
        at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
        at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
        at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
        at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)

The tool sets min first and then max, so the call fails because the original max (2) is smaller than the new min (4).

The workaround is:

/opt/dve/cassandra/bin/nodetool -h 127.0.0.1 -p 7199 setcompactionthreshold ks cf 2 32
/opt/dve/cassandra/bin/nodetool -h 127.0.0.1 -p 7199 setcompactionthreshold ks cf 4 32

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
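The ordering problem in this report can be reproduced with a small model. This is a sketch under stated assumptions, not the nodetool or ColumnFamilyStore code: `ColumnFamilyStoreModel` and `set_thresholds_min_first` are hypothetical names, and the wording of the max-side error message is invented for the illustration (only the min-side message appears in the report).

```python
class ColumnFamilyStoreModel:
    """Hypothetical stand-in for a store with separate min/max setters."""

    def __init__(self, min_threshold=4, max_threshold=32):
        self.min_threshold = min_threshold
        self.max_threshold = max_threshold

    def set_min(self, value):
        # Each setter validates against the *current* other value.
        if value > self.max_threshold:
            raise RuntimeError(
                "The min_compaction_threshold cannot be larger than the max.")
        self.min_threshold = value

    def set_max(self, value):
        if value < self.min_threshold:
            # Message text is an assumption, not taken from the report.
            raise RuntimeError(
                "The max_compaction_threshold cannot be smaller than the min.")
        self.max_threshold = value


def set_thresholds_min_first(cfs, new_min, new_max):
    """Mimics the reported order: min is applied before max."""
    cfs.set_min(new_min)
    cfs.set_max(new_max)


cfs = ColumnFamilyStoreModel()
set_thresholds_min_first(cfs, 2, 2)       # 4/32 -> 2/2 works
try:
    set_thresholds_min_first(cfs, 4, 32)  # new min 4 > current max 2
    failed = False
except RuntimeError:
    failed = True

# The workaround from the report: raise max first, then raise min.
set_thresholds_min_first(cfs, 2, 32)
set_thresholds_min_first(cfs, 4, 32)
```

The second call fails only because the new minimum is checked against the stale maximum; the two-step workaround avoids that by never passing through an invalid intermediate pair.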
[jira] [Updated] (CASSANDRA-4406) Update stress for CQL3
[ https://issues.apache.org/jira/browse/CASSANDRA-4406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Alves updated CASSANDRA-4406:

Attachment: 4406.patch

removed whitespace

Update stress for CQL3

Key: CASSANDRA-4406
URL: https://issues.apache.org/jira/browse/CASSANDRA-4406
Project: Cassandra
Issue Type: Improvement
Components: Tools
Affects Versions: 1.1.0
Reporter: Sylvain Lebresne
Assignee: David Alves
Labels: stress
Fix For: 1.2
Attachments: 4406.patch, 4406.patch

Stress does not support CQL3. We should add support for it so that:
# we can benchmark CQL3
# we can benchmark CASSANDRA-2478
[jira] [Updated] (CASSANDRA-4455) Nodetool fail to setcompactionthreshold
[ https://issues.apache.org/jira/browse/CASSANDRA-4455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pavel Yaskevich updated CASSANDRA-4455:

Attachment: CASSANDRA-4455-v2.patch

Attaching an alternative version which adds a setCompactionThresholds(int, int) method instead of doing branch checking for old min/max values in nodetool. What do you think, Aleksey?

Nodetool fail to setcompactionthreshold

Key: CASSANDRA-4455
URL: https://issues.apache.org/jira/browse/CASSANDRA-4455
Project: Cassandra
Issue Type: Bug
Components: Tools
Affects Versions: 1.0.3
Environment: Cassandra 1.0.3
Reporter: Jason Tang
Assignee: Aleksey Yeschenko
Priority: Minor
Fix For: 1.1.4
Attachments: CASSANDRA-4455-v2.patch, CASSANDRA-4455.patch
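The idea behind a two-argument setter, as proposed in the comment on this update, can be sketched as follows. This is only an illustration of the design choice, not the contents of the attached patch: the `Thresholds` class and its `set_compaction_thresholds` method are hypothetical names.

```python
class Thresholds:
    """Hypothetical model of a store that validates min/max together."""

    def __init__(self, min_threshold=4, max_threshold=32):
        self.min_threshold = min_threshold
        self.max_threshold = max_threshold

    def set_compaction_thresholds(self, new_min, new_max):
        # Validate the new pair against itself, not against the old
        # values, so the order of the two writes cannot matter.
        if new_min > new_max:
            raise ValueError("min threshold cannot be larger than max")
        self.min_threshold = new_min
        self.max_threshold = new_max


t = Thresholds()
t.set_compaction_thresholds(2, 2)    # 4/32 -> 2/2
t.set_compaction_thresholds(4, 32)   # 2/2 -> 4/32 in a single call
```

Compared with checking the old values in the client and choosing which setter to call first, validating the pair in one place keeps the ordering logic out of the tool entirely.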
[jira] [Updated] (CASSANDRA-4459) pig driver casts ints as bytearray
[ https://issues.apache.org/jira/browse/CASSANDRA-4459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-4459:

Attachment: 4459-v2.txt

Update with a comment explaining that IntegerType is wrong, but we're doing it anyway. Also switched all the IntegerTypes to Int32Types in the tests, which pass. I don't see any point in explicitly testing IntegerType as well until pig has a BigInteger.

pig driver casts ints as bytearray

Key: CASSANDRA-4459
URL: https://issues.apache.org/jira/browse/CASSANDRA-4459
Project: Cassandra
Issue Type: Bug
Environment: C* 1.1.2 embedded in DSE
Reporter: Cathy Daw
Assignee: Brandon Williams
Fix For: 1.1.3
Attachments: 4459-v2.txt, 4459.txt

we seem to be auto-mapping C* int columns to bytearray in Pig, and farther down I can't seem to find a way to cast that to int and do an average.

{code}
grunt> cassandra_users = LOAD 'cassandra://cqldb/users' USING CassandraStorage();
grunt> dump cassandra_users;
(bobhatter,(act,22),(fname,bob),(gender,m),(highSchool,Cal High),(lname,hatter),(sat,500),(state,CA),{})
(alicesmith,(act,27),(fname,alice),(gender,f),(highSchool,Tuscon High),(lname,smith),(sat,650),(state,AZ),{})

// notice sat and act columns are bytearray values
grunt> describe cassandra_users;
cassandra_users: {key: chararray,act: (name: chararray,value: bytearray),fname: (name: chararray,value: chararray), gender: (name: chararray,value: chararray),highSchool: (name: chararray,value: chararray),lname: (name: chararray,value: chararray), sat: (name: chararray,value: bytearray),state: (name: chararray,value: chararray),columns: {(name: chararray,value: chararray)}}

grunt> users_by_state = GROUP cassandra_users BY state;
grunt> dump users_by_state;
((state,AX),{(aoakley,(highSchool,Phoenix High),(lname,Oakley),state,(act,22),(sat,500),(gender,m),(fname,Anne),{})})
((state,AZ),{(gjames,(highSchool,Tuscon High),(lname,James),state,(act,24),(sat,650),(gender,f),(fname,Geronomo),{})})
((state,CA),{(philton,(highSchool,Beverly High),(lname,Hilton),state,(act,37),(sat,220),(gender,m),(fname,Paris),{}),(jbrown,(highSchool,Cal High),(lname,Brown),state,(act,20),(sat,700),(gender,m),(fname,Jerry),{})})

// Error - use explicit cast
grunt> user_avg = FOREACH users_by_state GENERATE cassandra_users.state, AVG(cassandra_users.sat);
grunt> dump user_avg;
2012-07-22 17:15:04,361 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1045: Could not infer the matching function for org.apache.pig.builtin.AVG as multiple or none of them fit. Please use an explicit cast.

// Unable to cast as int
grunt> user_avg = FOREACH users_by_state GENERATE cassandra_users.state, AVG((int)cassandra_users.sat);
grunt> dump user_avg;
2012-07-22 17:07:39,217 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1052: Cannot cast bag with schema sat: bag({name: chararray,value: bytearray}) to int
{code}

*Seed data in CQL*
{code}
CREATE KEYSPACE cqldb with strategy_class = 'org.apache.cassandra.locator.SimpleStrategy' and strategy_options:replication_factor=3;
use cqldb;
CREATE COLUMNFAMILY users ( KEY text PRIMARY KEY, fname text, lname text, gender varchar, act int, sat int, highSchool text, state varchar);
insert into users (KEY, fname, lname, gender, act, sat, highSchool, state) values (gjames, Geronomo, James, f, 24, 650, 'Tuscon High', 'AZ');
insert into users (KEY, fname, lname, gender, act, sat, highSchool, state) values (aoakley, Anne, Oakley, m , 22, 500, 'Phoenix High', 'AX');
insert into users (KEY, fname, lname, gender, act, sat, highSchool, state) values (jbrown, Jerry, Brown, m , 20, 700, 'Cal High', 'CA');
insert into users (KEY, fname, lname, gender, act, sat, highSchool, state) values (philton, Paris, Hilton, m , 37, 220, 'Beverly High', 'CA');
select * from users;
{code}
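The distinction between the two integer types mentioned on this issue can be sketched outside Pig. The example below assumes that an Int32Type value is a fixed 4-byte big-endian signed integer and that an IntegerType value is a variable-length big-endian two's-complement integer; those encodings are my understanding, not something stated in the thread, and `decode_int32` and `decode_varint` are hypothetical helper names.

```python
import struct


def decode_int32(raw: bytes) -> int:
    """Decode a fixed-width 4-byte big-endian signed integer."""
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes")
    return struct.unpack(">i", raw)[0]


def decode_varint(raw: bytes) -> int:
    """Decode a variable-length big-endian two's-complement integer."""
    return int.from_bytes(raw, byteorder="big", signed=True)


# The two 'sat' values of the CA group in the dump above, encoded as int32.
sat_values = [struct.pack(">i", v) for v in (220, 700)]
average = sum(decode_int32(b) for b in sat_values) / len(sat_values)
```

Once the raw bytes are mapped to a fixed-width numeric type, an aggregate such as an average becomes straightforward; an arbitrary-precision integer has no equally direct target when the consumer lacks a big-integer type, which is the trade-off the comment on this issue describes.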
[jira] [Updated] (CASSANDRA-4292) Per-disk I/O queues
[ https://issues.apache.org/jira/browse/CASSANDRA-4292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuki Morishita updated CASSANDRA-4292:

Attachment: 4292-v2.txt

v2 attached.

This version introduces a new way of specifying the number of threads per disk. In cassandra.yaml, {{data_file_directories}} now takes an additional parameter in the following format (the number of threads follows the ':').

{code}
data_file_directories:
    - /mnt/d1/data:1
    - /mnt/d1/data:3
{code}

If ':#' is omitted, it defaults to 1, so we preserve backward compatibility. {{memtable_flush_writers}} is removed from the yaml.

In this version, compaction also uses the disk-bound task executor to write sstables. The directory is chosen based on the available space in both the queue and the disk.

bq. probably cleaner to use a Map for the new getLocationForDisk method

I did not change this to a Map, since I think it is redundant and looping through a few directories does not make a difference.

Per-disk I/O queues

Key: CASSANDRA-4292
URL: https://issues.apache.org/jira/browse/CASSANDRA-4292
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Jonathan Ellis
Assignee: Yuki Morishita
Fix For: 1.2
Attachments: 4292-v2.txt, 4292.txt

As noted in CASSANDRA-809, we have a certain amount of flush (and compaction) threads, which mix and match disk volumes indiscriminately. It may be worth creating a tight thread - disk affinity, to prevent unnecessary conflict at that level. OTOH as SSDs become more prevalent this becomes a non-issue. Unclear how much pain this actually causes in practice in the meantime.
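The `path:threads` format described in this update, with its default of one thread when the suffix is omitted, could be parsed along these lines. This is a sketch of the described format only, not the patch's parser: `parse_data_directory` is a hypothetical helper, and the `/mnt/d2` and `/mnt/d3` paths are made up for the example.

```python
def parse_data_directory(entry: str, default_threads: int = 1):
    """Split 'path[:threads]' into (path, threads); threads defaults to 1."""
    path, sep, suffix = entry.rpartition(":")
    # Only treat the suffix as a thread count if it is purely numeric,
    # so a plain path without ':N' passes through unchanged.
    if sep and suffix.isdigit():
        return path, int(suffix)
    return entry, default_threads


entries = ["/mnt/d1/data:1", "/mnt/d2/data:3", "/mnt/d3/data"]
parsed = [parse_data_directory(e) for e in entries]
```

Splitting on the last ':' keeps existing configurations valid, which matches the backward-compatibility point made in the comment.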
[jira] [Commented] (CASSANDRA-4459) pig driver casts ints as bytearray
[ https://issues.apache.org/jira/browse/CASSANDRA-4459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422395#comment-13422395 ]

Jonathan Ellis commented on CASSANDRA-4459:

+1

pig driver casts ints as bytearray

Key: CASSANDRA-4459
URL: https://issues.apache.org/jira/browse/CASSANDRA-4459
Project: Cassandra
Issue Type: Bug
Environment: C* 1.1.2 embedded in DSE
Reporter: Cathy Daw
Assignee: Brandon Williams
Fix For: 1.1.3
Attachments: 4459-v2.txt, 4459.txt
[Cassandra Wiki] Update of VirtualNodes/Balance by EricEvans
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Cassandra Wiki for change notification.

The VirtualNodes/Balance page has been changed by EricEvans:
http://wiki.apache.org/cassandra/VirtualNodes/Balance

Comment:
stubbed out page

New page:
This page is for design notes and information relating to operations affecting token/range ownership.

See also:
 * [[https://issues.apache.org/jira/browse/CASSANDRA-4445|CASSANDRA-4445: balance utility for vnodes]]
 * [[https://issues.apache.org/jira/browse/CASSANDRA-4443|CASSANDRA-4443: shuffle utility for vnodes]]

<<TableOfContents>>

<<Anchor(requirements)>>
== Requirements ==
 1. Offsetting ownership ratios for [[#heterogeneous_nodes|heterogeneous nodes]]
 1. Correcting [[#imbalance|imbalances created by random token selection]]
 1. [[#shuffling|Randomizing ranges]] after a migration

<<Anchor(heterogeneous_nodes)>>
== Heterogeneous Nodes ==
When running a cluster of heterogeneous nodes (i.e. differing amounts of storage, memory, cores, etc.), it may be desirable to place a greater or lesser portion of the keyspace on one or more nodes.

<<Anchor(imbalance)>>
== Imbalance ==
By default, a node's tokens are randomly generated with the expectation that an even distribution of the namespace will result. However, variations of as much as 7% have been reported on small clusters when using the `num_tokens` default of 256.

These randomly generated tokens are MD5 sums, so entropy isn't the problem here, at least not in the sense that using a better RNG would create a more even distribution of ranges. Increasing the token count (either by increasing num_tokens, or the number of nodes) will improve this (the more tokens, the more the distribution will even out).

This anecdotal worst case is probably Good Enough, especially when considering that key distribution is subject to the same properties, or that many data sets are skewed on their own (i.e. optimal ownership is not necessarily optimal anyway).

That said, our history is one where random token selection produced completely unacceptable results, and manual intervention was required. The typical (expected) result of manual token selection is near-perfect balance of ownership, and it will likely be some time before people are comfortable seeing otherwise.

<<Anchor(shuffling)>>
== Shuffling ==
When migrating a legacy cluster with one token per node to virtual nodes, the existing range is carved up into `num_tokens` new ranges. These new ranges are still contiguous, however, and a means of randomizing their placement is needed.
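The claim in the Imbalance section, that more tokens per node even out ownership, can be explored with a small simulation. This is a rough sketch under stated assumptions: the ring size `RING = 2 ** 127`, the rule that a node owns the range ending at each of its tokens, and the helper names `ownership` and `max_deviation` are all choices made for the illustration, and the figures it produces are not the measurements quoted on the page.

```python
import random

RING = 2 ** 127  # assumed size of the token space for this sketch


def ownership(num_nodes, num_tokens, rng):
    """Return each node's fraction of the ring under random token choice."""
    tokens = sorted(
        (rng.randrange(RING), node)
        for node in range(num_nodes)
        for _ in range(num_tokens)
    )
    owned = [0] * num_nodes
    prev = tokens[-1][0] - RING  # wrap around from the last token
    for token, node in tokens:
        owned[node] += token - prev  # node owns the range ending at its token
        prev = token
    return [o / RING for o in owned]


def max_deviation(shares):
    """Largest relative deviation from a perfectly even share."""
    ideal = 1 / len(shares)
    return max(abs(s - ideal) / ideal for s in shares)


rng = random.Random(42)
few = ownership(num_nodes=6, num_tokens=1, rng=rng)
many = ownership(num_nodes=6, num_tokens=256, rng=rng)
```

Comparing `max_deviation(few)` with `max_deviation(many)` for different seeds and cluster sizes gives a feel for how quickly random selection converges, and for why one token per node historically required manual placement.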
[jira] [Commented] (CASSANDRA-4459) pig driver casts ints as bytearray
[ https://issues.apache.org/jira/browse/CASSANDRA-4459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422400#comment-13422400 ]

Pavel Yaskevich commented on CASSANDRA-4459:

+1

pig driver casts ints as bytearray

Key: CASSANDRA-4459
URL: https://issues.apache.org/jira/browse/CASSANDRA-4459
Project: Cassandra
Issue Type: Bug
Environment: C* 1.1.2 embedded in DSE
Reporter: Cathy Daw
Assignee: Brandon Williams
Fix For: 1.1.3
Attachments: 4459-v2.txt, 4459.txt
git commit: Pig: support for Int32Type. Patch by brandonwilliams, reviewed by xedin for CASSANDRA-4459
Updated Branches: refs/heads/cassandra-1.1 9a6339476 - 6f384c54d Pig: support for Int32Type. Patch by brandonwilliams, reviewed by xedin for CASSANDRA-4459 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/6f384c54 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/6f384c54 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/6f384c54 Branch: refs/heads/cassandra-1.1 Commit: 6f384c54de567d8d901592f0c32769b6582e50e4 Parents: 9a63394 Author: Brandon Williams brandonwilli...@apache.org Authored: Wed Jul 25 12:06:49 2012 -0500 Committer: Brandon Williams brandonwilli...@apache.org Committed: Wed Jul 25 12:06:49 2012 -0500 -- examples/pig/test/populate-cli.txt |4 ++-- .../cassandra/hadoop/pig/CassandraStorage.java |2 +- 2 files changed, 3 insertions(+), 3 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/6f384c54/examples/pig/test/populate-cli.txt -- diff --git a/examples/pig/test/populate-cli.txt b/examples/pig/test/populate-cli.txt index 1f59642..b2dda58 100644 --- a/examples/pig/test/populate-cli.txt +++ b/examples/pig/test/populate-cli.txt @@ -8,7 +8,7 @@ column_metadata = [ {column_name: name, validation_class: UTF8Type, index_type: KEYS}, {column_name: vote_type, validation_class: UTF8Type}, -{column_name: rating, validation_class: IntegerType}, +{column_name: rating, validation_class: Int32Type}, {column_name: score, validation_class: LongType}, {column_name: percent, validation_class: FloatType}, {column_name: atomic_weight, validation_class: DoubleType}, @@ -23,7 +23,7 @@ column_metadata = [ {column_name: name, validation_class: UTF8Type, index_type: KEYS}, {column_name: vote_type, validation_class: UTF8Type}, -{column_name: rating, validation_class: IntegerType}, +{column_name: rating, validation_class: Int32Type}, {column_name: score, validation_class: LongType}, {column_name: percent, validation_class: FloatType}, {column_name: atomic_weight, 
validation_class: DoubleType}, http://git-wip-us.apache.org/repos/asf/cassandra/blob/6f384c54/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java -- diff --git a/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java b/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java index 454330c..f2fad67 100644 --- a/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java +++ b/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java @@ -670,7 +670,7 @@ public class CassandraStorage extends LoadFunc implements StoreFuncInterface, Lo { if (type instanceof LongType || type instanceof DateType) // DateType is bad and it should feel bad return DataType.LONG; -else if (type instanceof IntegerType) +else if (type instanceof IntegerType || type instanceof Int32Type) // IntegerType will overflow at 2**31, but is kept for compatibility until pig has a BigInteger return DataType.INTEGER; else if (type instanceof AsciiType) return DataType.CHARARRAY;
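The type dispatch changed by this commit can be pictured as a lookup from validator class name to Pig type. What follows is a minimal Python sketch, not the CassandraStorage code: the table `PIG_TYPE_FOR_VALIDATOR` and the helpers `pig_type` and `to_int32` are hypothetical names, only the branches visible in the diff are listed, and the bytearray fallback is an assumption based on the behaviour reported in CASSANDRA-4459. The second helper illustrates the overflow the code comment warns about, assuming the value is narrowed the way a Java int conversion would narrow it.

```python
# Hypothetical lookup table mirroring the branches visible in the diff;
# these names are illustrative and not part of CassandraStorage.
PIG_TYPE_FOR_VALIDATOR = {
    "LongType": "long",
    "DateType": "long",
    "IntegerType": "int",   # arbitrary precision in Cassandra, narrowed to 32 bits
    "Int32Type": "int",     # the mapping added for CASSANDRA-4459
    "AsciiType": "chararray",
}


def pig_type(validator):
    """Return the Pig type name for a validator (assumed bytearray fallback)."""
    return PIG_TYPE_FOR_VALIDATOR.get(validator, "bytearray")


def to_int32(value):
    """Wrap an arbitrary integer into the signed 32-bit range."""
    return (value + 2**31) % 2**32 - 2**31
```

Under these assumptions an `Int32Type` column is reported as `int` rather than `bytearray`, which is what lets `AVG` pick a matching function, and any `IntegerType` value at or beyond 2**31 wraps around instead of being preserved.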
[2/4] git commit: Pig: support for Int32Type. Patch by brandonwilliams, reviewed by xedin for CASSANDRA-4459
Pig: support for Int32Type. Patch by brandonwilliams, reviewed by xedin for CASSANDRA-4459 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/6f384c54 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/6f384c54 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/6f384c54 Branch: refs/heads/trunk Commit: 6f384c54de567d8d901592f0c32769b6582e50e4 Parents: 9a63394 Author: Brandon Williams brandonwilli...@apache.org Authored: Wed Jul 25 12:06:49 2012 -0500 Committer: Brandon Williams brandonwilli...@apache.org Committed: Wed Jul 25 12:06:49 2012 -0500 -- examples/pig/test/populate-cli.txt |4 ++-- .../cassandra/hadoop/pig/CassandraStorage.java |2 +- 2 files changed, 3 insertions(+), 3 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/6f384c54/examples/pig/test/populate-cli.txt -- diff --git a/examples/pig/test/populate-cli.txt b/examples/pig/test/populate-cli.txt index 1f59642..b2dda58 100644 --- a/examples/pig/test/populate-cli.txt +++ b/examples/pig/test/populate-cli.txt @@ -8,7 +8,7 @@ column_metadata = [ {column_name: name, validation_class: UTF8Type, index_type: KEYS}, {column_name: vote_type, validation_class: UTF8Type}, -{column_name: rating, validation_class: IntegerType}, +{column_name: rating, validation_class: Int32Type}, {column_name: score, validation_class: LongType}, {column_name: percent, validation_class: FloatType}, {column_name: atomic_weight, validation_class: DoubleType}, @@ -23,7 +23,7 @@ column_metadata = [ {column_name: name, validation_class: UTF8Type, index_type: KEYS}, {column_name: vote_type, validation_class: UTF8Type}, -{column_name: rating, validation_class: IntegerType}, +{column_name: rating, validation_class: Int32Type}, {column_name: score, validation_class: LongType}, {column_name: percent, validation_class: FloatType}, {column_name: atomic_weight, validation_class: DoubleType}, 
http://git-wip-us.apache.org/repos/asf/cassandra/blob/6f384c54/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java -- diff --git a/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java b/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java index 454330c..f2fad67 100644 --- a/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java +++ b/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java @@ -670,7 +670,7 @@ public class CassandraStorage extends LoadFunc implements StoreFuncInterface, Lo { if (type instanceof LongType || type instanceof DateType) // DateType is bad and it should feel bad return DataType.LONG; -else if (type instanceof IntegerType) +else if (type instanceof IntegerType || type instanceof Int32Type) // IntegerType will overflow at 2**31, but is kept for compatibility until pig has a BigInteger return DataType.INTEGER; else if (type instanceof AsciiType) return DataType.CHARARRAY;
[4/4] git commit: Fix scary message about secondaries always being created at startup
Fix scary message about secondaries always being created at startup

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/41c9ba63
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/41c9ba63
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/41c9ba63
Branch: refs/heads/trunk
Commit: 41c9ba63d624d1d6863b67a0cbcf4144bfbea29c
Parents: aba1f16
Author: Brandon Williams brandonwilli...@apache.org
Authored: Mon Jul 23 18:30:13 2012 -0500
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Mon Jul 23 18:32:00 2012 -0500

--
 .../cassandra/db/index/SecondaryIndexManager.java |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/41c9ba63/src/java/org/apache/cassandra/db/index/SecondaryIndexManager.java
--
diff --git a/src/java/org/apache/cassandra/db/index/SecondaryIndexManager.java b/src/java/org/apache/cassandra/db/index/SecondaryIndexManager.java
index 6733c90..ba066e2 100644
--- a/src/java/org/apache/cassandra/db/index/SecondaryIndexManager.java
+++ b/src/java/org/apache/cassandra/db/index/SecondaryIndexManager.java
@@ -205,7 +205,6 @@ public class SecondaryIndexManager
 return null;

 assert cdef.getIndexType() != null;
-logger.info("Creating new index : {}",cdef);

 SecondaryIndex index;
 try
@@ -231,6 +230,7 @@ public class SecondaryIndexManager
 {
 index = currentIndex;
 index.addColumnDef(cdef);
+logger.info("Creating new index : {}",cdef);
 }
 }
 else
[1/4] git commit: Merge branch 'cassandra-1.1' into trunk
Updated Branches: refs/heads/trunk e73b2a68b - d62f8c1e5 Merge branch 'cassandra-1.1' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/d62f8c1e Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/d62f8c1e Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/d62f8c1e Branch: refs/heads/trunk Commit: d62f8c1e5f4a901652cd9dd7ef7f8ecb4b779450 Parents: e73b2a6 6f384c5 Author: Brandon Williams brandonwilli...@apache.org Authored: Wed Jul 25 12:09:14 2012 -0500 Committer: Brandon Williams brandonwilli...@apache.org Committed: Wed Jul 25 12:09:14 2012 -0500 -- examples/pig/test/populate-cli.txt |4 ++-- .../cassandra/hadoop/pig/CassandraStorage.java |2 +- 2 files changed, 3 insertions(+), 3 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/d62f8c1e/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java --
[3/4] git commit: cqlsh: add a COPY TO command Patch by paul cannon, reviewed by brandonwilliams for CASSANDRA-4434
cqlsh: add a COPY TO command Patch by paul cannon, reviewed by brandonwilliams for CASSANDRA-4434 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/9a633947 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/9a633947 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/9a633947 Branch: refs/heads/trunk Commit: 9a63394765de28160d576c9285be68587e222a86 Parents: 41c9ba6 Author: Brandon Williams brandonwilli...@apache.org Authored: Tue Jul 24 13:57:19 2012 -0500 Committer: Brandon Williams brandonwilli...@apache.org Committed: Tue Jul 24 13:57:19 2012 -0500 -- CHANGES.txt |1 + bin/cqlsh | 126 - 2 files changed, 105 insertions(+), 22 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/9a633947/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 0885387..638574c 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -23,6 +23,7 @@ Merged from 1.0: * Fix LCS splitting sstable base on uncompressed size (CASSANDRA-4419) * Bootstraps that fail are detected upon restart and will retry safely without needing to delete existing data first (CASSANDRA-4427) + * (cqlsh) add a COPY TO command to copy a CF to a CSV file (CASSANDRA-4434) 1.1.2 http://git-wip-us.apache.org/repos/asf/cassandra/blob/9a633947/bin/cqlsh -- diff --git a/bin/cqlsh b/bin/cqlsh index 574d49b..c67a818 100755 --- a/bin/cqlsh +++ b/bin/cqlsh @@ -224,7 +224,8 @@ cqlsh_extra_syntax_rules = r''' copyCommand ::= COPY cf=columnFamilyName ( ( [colnames]=colname ( , [colnames]=colname )* ) )? - FROM ( fname=stringLiteral | STDIN ) + ( dir=FROM ( fname=stringLiteral | STDIN ) + | dir=TO ( fname=stringLiteral | STDOUT ) ) ( WITH copyOption ( AND copyOption )* )? 
; @@ -303,12 +304,16 @@ def complete_copy_column_names(ctxt, cqlsh): return [colnames[0]] return set(colnames[1:]) - set(existcols) -COPY_OPTIONS = ('DELIMITER', 'QUOTE', 'ESCAPE', 'HEADER') +COPY_OPTIONS = ('DELIMITER', 'QUOTE', 'ESCAPE', 'HEADER', 'ENCODING', 'NULL') @cqlsh_syntax_completer('copyOption', 'optnames') def complete_copy_options(ctxt, cqlsh): optnames = map(str.upper, ctxt.get_binding('optnames', ())) -return set(COPY_OPTIONS) - set(optnames) +direction = ctxt.get_binding('dir').upper() +opts = set(COPY_OPTIONS) - set(optnames) +if direction == 'FROM': +opts -= ('ENCODING', 'NULL') +return opts @cqlsh_syntax_completer('copyOption', 'optvals') def complete_copy_opt_values(ctxt, cqlsh): @@ -448,13 +453,13 @@ def unix_time_from_uuid1(u): return (u.get_time() - 0x01B21DD213814000) / 1000.0 def format_value(val, casstype, output_encoding, addcolor=False, time_format='', - float_precision=3, colormap=DEFAULT_VALUE_COLORS): + float_precision=3, colormap=DEFAULT_VALUE_COLORS, nullval='null'): color = colormap['default'] coloredval = None displaywidth = None if val is None: -bval = 'null' +bval = nullval color = colormap['error'] elif isinstance(val, DecodeError): casstype = 'BytesType' @@ -727,7 +732,7 @@ class Shell(cmd.Cmd): def get_column_names(self, ksname, cfname): if ksname is None: ksname = self.current_keyspace -if self.cqlver_atleast(3): +if ksname != 'system' and self.cqlver_atleast(3): return self.get_column_names_from_layout(ksname, cfname) else: return self.get_column_names_from_cfdef(ksname, cfname) @@ -1433,6 +1438,9 @@ class Shell(cmd.Cmd): COPY table_name [ ( column [, ...] ) ] FROM ( 'filename' | STDIN ) [ WITH option='value' [AND ...] ]; +COPY table_name [ ( column [, ...] ) ] + TO ( 'filename' | STDOUT ) + [ WITH option='value' [AND ...] 
]; Available options and defaults: @@ -1440,6 +1448,8 @@ class Shell(cmd.Cmd): QUOTE=''- quoting character to be used to quote fields ESCAPE='\' - character to appear before the QUOTE char when quoted HEADER=false - whether to ignore the first line + ENCODING='utf8' - encoding for CSV output (COPY TO only) + NULL='' - string that represents a null value (COPY TO only) When entering CSV data on STDIN, you can use the sequence \. on a line by itself to end the data input. @@ -1448,12 +1458,11 @@ class Shell(cmd.Cmd): ks
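One detail of `complete_copy_options` in the diff above deserves a second look: `opts` is a set, and Python's in-place `-=` operator on a set accepts only another set, so subtracting the tuple `('ENCODING', 'NULL')` as written would raise `TypeError`. Below is a minimal standalone sketch of the same filtering. The helper `copy_options_for` and the constant `COPY_TO_ONLY` are hypothetical; the helper takes the direction and the already-typed option names directly instead of a cqlsh completion context.

```python
COPY_OPTIONS = ('DELIMITER', 'QUOTE', 'ESCAPE', 'HEADER', 'ENCODING', 'NULL')
COPY_TO_ONLY = ('ENCODING', 'NULL')  # described in the help text as "COPY TO only"


def copy_options_for(direction, already_used=()):
    """Return the option names still worth suggesting for COPY FROM or COPY TO."""
    used = {name.upper() for name in already_used}
    opts = set(COPY_OPTIONS) - used
    if direction.upper() == 'FROM':
        # difference_update accepts any iterable, unlike the -= operator,
        # which requires the right-hand side to be a set
        opts.difference_update(COPY_TO_ONLY)
    return opts
```

Wrapping the tuple in `set(...)` before `-=` would work equally well; `difference_update` is used here only because it states the intent without an extra conversion.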
[Cassandra Wiki] Update of VirtualNodes/Balance by EricEvans
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Cassandra Wiki for change notification.

The VirtualNodes/Balance page has been changed by EricEvans:
http://wiki.apache.org/cassandra/VirtualNodes/Balance?action=diff&rev1=1&rev2=2

Comment:
hashing out implementation proposal

  When migrating a legacy cluster with one-token-per-node to virtual nodes, the existing range is carved up into `num_tokens` new ranges. These new ranges are still contiguous however, and a means of randomizing their placement is needed.

+ <<Anchor(implementation)>>
+ == Implementation (Draft) ==
+ === Nodes / Cluster ===
+ The most straightforward method of effecting ownership is a token move (i.e. relocating a range from one node to another). Exposing this with JMX would allow implementing all of the required operations client-side.
+
+ === User Interface ===
+
+ {{{
+ $ nodetool balance
+ }}}
+
+ {{{
+ $ nodetool shuffle
+ }}}
+
+ {{{
+ $ nodetool trim
+ }}}
+
[jira] [Updated] (CASSANDRA-4447) enable jamm for OpenJDK >= 1.6.0.23
[ https://issues.apache.org/jira/browse/CASSANDRA-4447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-4447:

Attachment: 4447.txt

Attaching a slightly different approach with cleaner logic.

enable jamm for OpenJDK >= 1.6.0.23
---
Key: CASSANDRA-4447
URL: https://issues.apache.org/jira/browse/CASSANDRA-4447
Project: Cassandra
Issue Type: Improvement
Components: Packaging
Environment: openjdk
Reporter: Ilya Shipitsin
Priority: Trivial
Fix For: 1.1.3
Attachments: 4447.txt

we tested jamm with OpenJDK, it works well starting at 1.6.0.23, so I suggest

--- cassandra-env.sh.dist 2012-07-19 12:24:44.938886154 +0600
+++ cassandra-env.sh 2012-07-19 12:28:34.913886847 +0600
@@ -119,8 +119,10 @@

 # add the jamm javaagent
 check_openjdk=`"${JAVA:-java}" -version 2>&1 | awk '{if (NR == 2) {print $1}}'`
-if [ "$check_openjdk" != "OpenJDK" ]
+check_openjdk_is_good_for_jamm=`"${JAVA:-java}" -version 2>&1 | awk -F "_|\"" '/1\.6\.0/ && $3 < 23 {print "bad" }'`
+if [ "$check_openjdk" = "OpenJDK" ] && [ "$check_openjdk_is_good_for_jamm" = "bad" ]
 then
+else
 JVM_OPTS="$JVM_OPTS -javaagent:$CASSANDRA_HOME/lib/jamm-0.2.5.jar"
 fi
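The intent of the proposed check is to load the jamm agent unless the JVM is an OpenJDK 1.6.0 build older than update 23. That decision can be sketched independently of the shell quoting. The Python below is a minimal illustration, not the attached patch: `jamm_supported` is a hypothetical helper, and it assumes the `java -version` banner puts a version string such as `1.6.0_20` on its first line and starts its second line with the runtime name, which is the layout the awk commands in the description rely on.

```python
import re


def jamm_supported(version_output):
    """Decide whether to add the jamm javaagent, given `java -version` output."""
    lines = version_output.splitlines()
    # The second line is assumed to start with the runtime name,
    # e.g. "OpenJDK Runtime Environment ..."
    is_openjdk = len(lines) > 1 and lines[1].split()[:1] == ["OpenJDK"]
    if not is_openjdk:
        return True
    # Only 1.6.0 builds with an update number below 23 are treated as bad;
    # any other version string passes through
    match = re.search(r'1\.6\.0_(\d+)', lines[0])
    return not (match and int(match.group(1)) < 23)
```

Expressing the rule as a single positive predicate also avoids the empty `then` branch that the shell version needs in order to invert the test.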
[jira] [Assigned] (CASSANDRA-4447) enable jamm for OpenJDK >= 1.6.0.23
[ https://issues.apache.org/jira/browse/CASSANDRA-4447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams reassigned CASSANDRA-4447: --- Assignee: Brandon Williams enable jamm for OpenJDK >= 1.6.0.23 --- Key: CASSANDRA-4447 URL: https://issues.apache.org/jira/browse/CASSANDRA-4447 Project: Cassandra Issue Type: Improvement Components: Packaging Environment: openjdk Reporter: Ilya Shipitsin Assignee: Brandon Williams Priority: Trivial Fix For: 1.1.3 Attachments: 4447.txt
[jira] [Updated] (CASSANDRA-4447) enable jamm for OpenJDK >= 1.6.0.23
[ https://issues.apache.org/jira/browse/CASSANDRA-4447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams updated CASSANDRA-4447: Reviewer: thepaul (was: brandon.williams) enable jamm for OpenJDK >= 1.6.0.23 --- Key: CASSANDRA-4447 URL: https://issues.apache.org/jira/browse/CASSANDRA-4447 Project: Cassandra Issue Type: Improvement Components: Packaging Environment: openjdk Reporter: Ilya Shipitsin Assignee: Brandon Williams Priority: Trivial Fix For: 1.1.3 Attachments: 4447.txt
[jira] [Commented] (CASSANDRA-4447) enable jamm for OpenJDK >= 1.6.0.23
[ https://issues.apache.org/jira/browse/CASSANDRA-4447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422455#comment-13422455 ] Ilya Shipitsin commented on CASSANDRA-4447: --- ok, it's better enable jamm for OpenJDK >= 1.6.0.23 --- Key: CASSANDRA-4447 URL: https://issues.apache.org/jira/browse/CASSANDRA-4447 Project: Cassandra Issue Type: Improvement Components: Packaging Environment: openjdk Reporter: Ilya Shipitsin Assignee: Brandon Williams Priority: Trivial Fix For: 1.1.3 Attachments: 4447.txt
[2/5] git commit: add comment to #4452
add comment to #4452 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f46232c0 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f46232c0 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f46232c0 Branch: refs/heads/cassandra-1.1 Commit: f46232c0b02f27c5177bd453a6d0b0f6441c2499 Parents: 06bdd3e Author: Jonathan Ellis jbel...@apache.org Authored: Wed Jul 25 13:15:21 2012 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Wed Jul 25 13:15:21 2012 -0500 -- .../cassandra/service/StorageServiceMBean.java |6 -- 1 files changed, 4 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/f46232c0/src/java/org/apache/cassandra/service/StorageServiceMBean.java -- diff --git a/src/java/org/apache/cassandra/service/StorageServiceMBean.java b/src/java/org/apache/cassandra/service/StorageServiceMBean.java index 0872e2b..c4c6a1d 100644 --- a/src/java/org/apache/cassandra/service/StorageServiceMBean.java +++ b/src/java/org/apache/cassandra/service/StorageServiceMBean.java @@ -401,8 +401,10 @@ public interface StorageServiceMBean public void loadNewSSTables(String ksName, String cfName); /** - * Return a List of Tokens representing a sample of keys - * across all ColumnFamilyStores + * Return a List of Tokens representing a sample of keys across all ColumnFamilyStores. + * + * Note: this should be left as an operation, not an attribute (methods starting with get) + * to avoid sending potentially multiple MB of data when accessing this mbean by default. See CASSANDRA-4452. * * @return set of Tokens as Strings */
[4/5] git commit: rename getRangeKeySample to sampleKeyRange patch by Jan Prach; reviewed by jbellis for CASSANDRA-4452
rename getRangeKeySample to sampleKeyRange
patch by Jan Prach; reviewed by jbellis for CASSANDRA-4452

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/06bdd3ea
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/06bdd3ea
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/06bdd3ea
Branch: refs/heads/trunk
Commit: 06bdd3ea8e6ec8ebf47b7bd813041550f99fa48b
Parents: 6f384c5
Author: Jonathan Ellis jbel...@apache.org
Authored: Wed Jul 25 13:12:32 2012 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Wed Jul 25 13:12:32 2012 -0500

--
 CHANGES.txt|2 ++
 .../apache/cassandra/service/StorageService.java |2 +-
 .../cassandra/service/StorageServiceMBean.java |2 +-
 src/java/org/apache/cassandra/tools/NodeCmd.java |2 +-
 src/java/org/apache/cassandra/tools/NodeProbe.java |4 ++--
 5 files changed, 7 insertions(+), 5 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/06bdd3ea/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 638574c..c160d69 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,6 @@
 1.1.3
+ * (JMX) rename getRangeKeySample to sampleKeyRange to avoid returning
+   multi-MB results as an attribute (CASSANDRA-4452)
  * flush based on data size, not throughput; overwritten columns no
    longer artificially inflate liveRatio (CASSANDRA-4399)
  * update default commitlog segment size to 32MB and total commitlog

http://git-wip-us.apache.org/repos/asf/cassandra/blob/06bdd3ea/src/java/org/apache/cassandra/service/StorageService.java
--
diff --git a/src/java/org/apache/cassandra/service/StorageService.java b/src/java/org/apache/cassandra/service/StorageService.java
index 28a3551..bfc8c81 100644
--- a/src/java/org/apache/cassandra/service/StorageService.java
+++ b/src/java/org/apache/cassandra/service/StorageService.java
@@ -3080,7 +3080,7 @@ public class StorageService implements IEndpointStateChangeSubscriber, StorageSe
 /**
  * #{@inheritDoc}
  */
-public List<String> getRangeKeySample()
+public List<String> sampleKeyRange() // do not rename to getter - see CASSANDRA-4452 for details
 {
 List<DecoratedKey> keys = keySamples(ColumnFamilyStore.allUserDefined(), getLocalPrimaryRange());

http://git-wip-us.apache.org/repos/asf/cassandra/blob/06bdd3ea/src/java/org/apache/cassandra/service/StorageServiceMBean.java
--
diff --git a/src/java/org/apache/cassandra/service/StorageServiceMBean.java b/src/java/org/apache/cassandra/service/StorageServiceMBean.java
index 72d03d1..0872e2b 100644
--- a/src/java/org/apache/cassandra/service/StorageServiceMBean.java
+++ b/src/java/org/apache/cassandra/service/StorageServiceMBean.java
@@ -406,7 +406,7 @@ public interface StorageServiceMBean
  *
  * @return set of Tokens as Strings
  */
-public List<String> getRangeKeySample();
+public List<String> sampleKeyRange();

 /**
  * rebuild the specified indexes

http://git-wip-us.apache.org/repos/asf/cassandra/blob/06bdd3ea/src/java/org/apache/cassandra/tools/NodeCmd.java
--
diff --git a/src/java/org/apache/cassandra/tools/NodeCmd.java b/src/java/org/apache/cassandra/tools/NodeCmd.java
index a8d3f55..b73e96a 100644
--- a/src/java/org/apache/cassandra/tools/NodeCmd.java
+++ b/src/java/org/apache/cassandra/tools/NodeCmd.java
@@ -922,7 +922,7 @@ public class NodeCmd
 private void printRangeKeySample(PrintStream outs)
 {
 outs.println("RangeKeySample: ");
-List<String> tokenStrings = this.probe.getRangeKeySample();
+List<String> tokenStrings = this.probe.sampleKeyRange();
 for (String tokenString : tokenStrings)
 {
 outs.println("\t" + tokenString);

http://git-wip-us.apache.org/repos/asf/cassandra/blob/06bdd3ea/src/java/org/apache/cassandra/tools/NodeProbe.java
--
diff --git a/src/java/org/apache/cassandra/tools/NodeProbe.java b/src/java/org/apache/cassandra/tools/NodeProbe.java
index d1a615d..5c04eff 100644
--- a/src/java/org/apache/cassandra/tools/NodeProbe.java
+++ b/src/java/org/apache/cassandra/tools/NodeProbe.java
@@ -690,9 +690,9 @@ public class NodeProbe
 ssProxy.rebuild(sourceDc);
 }

-public List<String> getRangeKeySample()
+public List<String> sampleKeyRange()
 {
-return ssProxy.getRangeKeySample();
+return ssProxy.sampleKeyRange();
 }

 public void resetLocalSchema() throws IOException
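The rename works because of the Standard MBean naming convention: a method with no parameters that returns a value and whose name starts with `get` is exposed as an attribute, which generic JMX clients may read whenever they inspect the bean, while other methods are exposed as operations that run only when invoked. Here is a simplified Python sketch of that classification. The helper `mbean_member_kind` is hypothetical, and it models only the `get` pattern; setters and boolean `is` getters are left out.

```python
def mbean_member_kind(method_name, param_count=0, returns_value=True):
    """Classify an MBean interface method as 'attribute' or 'operation'.

    Simplified: only the getter pattern is modelled (getX with no
    parameters and a non-void return type).
    """
    is_getter = (
        method_name.startswith("get")
        and len(method_name) > 3
        and param_count == 0
        and returns_value
    )
    return "attribute" if is_getter else "operation"
```

Under this rule `getRangeKeySample` counts as an attribute and `sampleKeyRange` as an operation, which is why the code comment asks that the method not be renamed back to a getter.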
[3/5] git commit: add comment to #4452
add comment to #4452 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f46232c0 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f46232c0 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f46232c0 Branch: refs/heads/trunk Commit: f46232c0b02f27c5177bd453a6d0b0f6441c2499 Parents: 06bdd3e Author: Jonathan Ellis jbel...@apache.org Authored: Wed Jul 25 13:15:21 2012 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Wed Jul 25 13:15:21 2012 -0500 -- .../cassandra/service/StorageServiceMBean.java |6 -- 1 files changed, 4 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/f46232c0/src/java/org/apache/cassandra/service/StorageServiceMBean.java -- diff --git a/src/java/org/apache/cassandra/service/StorageServiceMBean.java b/src/java/org/apache/cassandra/service/StorageServiceMBean.java index 0872e2b..c4c6a1d 100644 --- a/src/java/org/apache/cassandra/service/StorageServiceMBean.java +++ b/src/java/org/apache/cassandra/service/StorageServiceMBean.java @@ -401,8 +401,10 @@ public interface StorageServiceMBean public void loadNewSSTables(String ksName, String cfName); /** - * Return a List of Tokens representing a sample of keys - * across all ColumnFamilyStores + * Return a List of Tokens representing a sample of keys across all ColumnFamilyStores. + * + * Note: this should be left as an operation, not an attribute (methods starting with get) + * to avoid sending potentially multiple MB of data when accessing this mbean by default. See CASSANDRA-4452. * * @return set of Tokens as Strings */
[1/5] git commit: Merge branch 'cassandra-1.1' into trunk
Updated Branches: refs/heads/cassandra-1.1 6f384c54d - f46232c0b refs/heads/trunk d62f8c1e5 - b167e9ba7 Merge branch 'cassandra-1.1' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b167e9ba Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b167e9ba Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b167e9ba Branch: refs/heads/trunk Commit: b167e9ba74fa917aeed55cfdcbff9133c13720d5 Parents: d62f8c1 f46232c Author: Jonathan Ellis jbel...@apache.org Authored: Wed Jul 25 13:15:29 2012 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Wed Jul 25 13:15:29 2012 -0500 -- CHANGES.txt|2 ++ .../apache/cassandra/service/StorageService.java |2 +- .../cassandra/service/StorageServiceMBean.java |8 +--- src/java/org/apache/cassandra/tools/NodeCmd.java |2 +- src/java/org/apache/cassandra/tools/NodeProbe.java |4 ++-- 5 files changed, 11 insertions(+), 7 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/b167e9ba/CHANGES.txt -- diff --cc CHANGES.txt index c558c3f,c160d69..6dc6382 --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,39 -1,6 +1,41 @@@ +1.2-dev + * Introduce new json format with row level deletion (CASSANDRA-4054) + * remove redundant name column from schema_keyspaces (CASSANDRA-4433) + * improve nodetool ring handling of multi-dc clusters (CASSANDRA-3047) + * update NTS calculateNaturalEndpoints to be O(N log N) (CASSANDRA-3881) + * add UseCondCardMark XX jvm settings on jdk 1.7 (CASSANDRA-4366) + * split up rpc timeout by operation type (CASSANDRA-2819) + * rewrite key cache save/load to use only sequential i/o (CASSANDRA-3762) + * update MS protocol with a version handshake + broadcast address id + (CASSANDRA-4311) + * multithreaded hint replay (CASSANDRA-4189) + * add inter-node message compression (CASSANDRA-3127) + * remove COPP (CASSANDRA-2479) + * Track tombstone expiration and compact when tombstone content is + higher than a 
configurable threshold, default 20% (CASSANDRA-3442, 4234) + * update MurmurHash to version 3 (CASSANDRA-2975) + * (CLI) track elapsed time for `delete' operation (CASSANDRA-4060) + * (CLI) jline version is bumped to 1.0 to properly support + 'delete' key function (CASSANDRA-4132) + * Save IndexSummary into new SSTable 'Summary' component (CASSANDRA-2392, 4289) + * Add support for range tombstones (CASSANDRA-3708) + * Improve MessagingService efficiency (CASSANDRA-3617) + * Avoid ID conflicts from concurrent schema changes (CASSANDRA-3794) + * Set thrift HSHA server thread limit to unlimited by default (CASSANDRA-4277) + * Avoids double serialization of CF id in RowMutation messages + (CASSANDRA-4293) + * stream compressed sstables directly with java nio (CASSANDRA-4297) + * Support multiple ranges in SliceQueryFilter (CASSANDRA-3885) + * Add column metadata to system column families (CASSANDRA-4018) + * (cql3) Always use composite types by default (CASSANDRA-4329) + * (cql3) Add support for set, map and list (CASSANDRA-3647) + * Validate date type correctly (CASSANDRA-4441) + * (cql3) Allow definitions with only a PK (CASSANDRA-4361) + + 1.1.3 + * (JMX) rename getRangeKeySample to sampleKeyRange to avoid returning +multi-MB results as an attribute (CASSANDRA-4452) * flush based on data size, not throughput; overwritten columns no longer artificially inflate liveRatio (CASSANDRA-4399) * update default commitlog segment size to 32MB and total commitlog http://git-wip-us.apache.org/repos/asf/cassandra/blob/b167e9ba/src/java/org/apache/cassandra/service/StorageService.java -- diff --cc src/java/org/apache/cassandra/service/StorageService.java index d8bed6f,bfc8c81..4af399d --- a/src/java/org/apache/cassandra/service/StorageService.java +++ b/src/java/org/apache/cassandra/service/StorageService.java @@@ -3289,11 -3080,9 +3289,11 @@@ public class StorageService implements /** * #{@inheritDoc} */ - public ListString getRangeKeySample() + public ListString 
sampleKeyRange() // do not rename to getter - see CASSANDRA-4452 for details
     {
-        List<DecoratedKey> keys = keySamples(ColumnFamilyStore.allUserDefined(), getLocalPrimaryRange());
+        List<DecoratedKey> keys = new ArrayList<DecoratedKey>();
+        for (Range<Token> range : getLocalPrimaryRanges())
+            keys.addAll(keySamples(ColumnFamilyStore.allUserDefined(), range));
         List<String> sampledKeys = new ArrayList<String>(keys.size());
         for (DecoratedKey key : keys)
[Cassandra Wiki] Update of VirtualNodes/Balance by EricEvans
Dear Wiki user, You have subscribed to a wiki page or wiki category on Cassandra Wiki for change notification. The VirtualNodes/Balance page has been changed by EricEvans: http://wiki.apache.org/cassandra/VirtualNodes/Balance?action=diffrev1=2rev2=3 Comment: balance vs. intentional imbalance When migrating a legacy cluster with one-token-per-node to virtual nodes, the existing range is carved up into `num_tokens` new ranges. These new ranges are still contiguous however, and a means of randomizing their placement is needed. + Anchor(implementation) == Implementation (Draft) == + === Considerations === + In the most basic sense, ''balanced'' means that each node has 1/n of the token-space, so adjusting ownership for [[#heterogeneous_nodes|heterogeneous nodes]] is implicitly about ''unbalancing''. This is important because, if for example, you reduced ownership of a node to say (1/n)*.8, you expect that imbalance to persist, and not be balanced-away by operations on other nodes. + + ''Note: This will likely require storing state, in the form of an offset, on each node.'' + === Nodes / Cluster === The most straightforward method of effecting ownership is a token move (i.e. relocating a range from one node to another). Exposing this with JMX would allow implementing all of the required operations client-side.
[jira] [Assigned] (CASSANDRA-1920) Enhance word_count example s.t. it can ingest and analyze arbitrary text
[ https://issues.apache.org/jira/browse/CASSANDRA-1920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True reassigned CASSANDRA-1920: Assignee: Kirk True Enhance word_count example s.t. it can ingest and analyze arbitrary text Key: CASSANDRA-1920 URL: https://issues.apache.org/jira/browse/CASSANDRA-1920 Project: Cassandra Issue Type: Improvement Components: Contrib Environment: N/A Reporter: Benjamin Coverston Assignee: Kirk True Priority: Minor Labels: lhf Original Estimate: 4h Remaining Estimate: 4h Enhance the word_count demo so that arbitrary text files can be ingested, and those ingested files can also be analyzed in the map reduce jobs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3564) flush before shutdown so restart is faster
[ https://issues.apache.org/jira/browse/CASSANDRA-3564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13422511#comment-13422511 ] Brandon Williams commented on CASSANDRA-3564: - Doesn't {{nodetool flush}} already contain everything we need? We just need the packaging glue. flush before shutdown so restart is faster -- Key: CASSANDRA-3564 URL: https://issues.apache.org/jira/browse/CASSANDRA-3564 Project: Cassandra Issue Type: New Feature Components: Packaging Reporter: Jonathan Ellis Assignee: David Alves Priority: Minor Fix For: 1.2 Attachments: 3564.patch, 3564.patch Cassandra handles flush in its shutdown hook for durable_writes=false CFs (otherwise we're *guaranteed* to lose data) but leaves it up to the operator otherwise. I'd rather leave it that way to offer these semantics: - cassandra stop = shutdown nicely [explicit flush, then kill -int] - kill -INT = shutdown faster but don't lose any updates [current behavior] - kill -KILL = lose most recent writes unless durable_writes=true and batch commits are on [also current behavior] But if it's not reasonable to use nodetool from the init script then I guess we can just make the shutdown hook flush everything. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[1/2] git commit: merge from 1.1
Updated Branches: refs/heads/trunk b167e9ba7 - 5cde66bab merge from 1.1 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5cde66ba Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5cde66ba Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5cde66ba Branch: refs/heads/trunk Commit: 5cde66bab30e7aa8f98bae9a7504b2a4a17cdda1 Parents: b167e9b cc0be1b Author: Pavel Yaskevich xe...@apache.org Authored: Wed Jul 25 22:01:14 2012 +0300 Committer: Pavel Yaskevich xe...@apache.org Committed: Wed Jul 25 22:01:14 2012 +0300 -- CHANGES.txt|1 + .../org/apache/cassandra/db/ColumnFamilyStore.java | 33 +-- .../cassandra/db/ColumnFamilyStoreMBean.java |5 ++ .../compaction/SizeTieredCompactionStrategy.java |3 +- src/java/org/apache/cassandra/tools/NodeProbe.java |3 +- .../cassandra/db/compaction/CompactionsTest.java |3 +- 6 files changed, 29 insertions(+), 19 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/5cde66ba/CHANGES.txt -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/5cde66ba/src/java/org/apache/cassandra/db/ColumnFamilyStore.java -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/5cde66ba/src/java/org/apache/cassandra/db/ColumnFamilyStoreMBean.java -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/5cde66ba/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/5cde66ba/src/java/org/apache/cassandra/tools/NodeProbe.java -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/5cde66ba/test/unit/org/apache/cassandra/db/compaction/CompactionsTest.java --
[2/2] git commit: fix nodetool's setcompactionthreshold command patch by Aleksey Yeschenko; reviewed by Pavel Yaskevich for CASSANDRA-4455
fix nodetool's setcompactionthreshold command
patch by Aleksey Yeschenko; reviewed by Pavel Yaskevich for CASSANDRA-4455

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/cc0be1b4
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/cc0be1b4
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/cc0be1b4

Branch: refs/heads/trunk
Commit: cc0be1b40007ef4b653e4ad6bc4dbe0438b97785
Parents: f46232c
Author: Pavel Yaskevich xe...@apache.org
Authored: Wed Jul 25 17:52:39 2012 +0300
Committer: Pavel Yaskevich xe...@apache.org
Committed: Wed Jul 25 21:59:38 2012 +0300

--
 CHANGES.txt                                         |  1 +
 .../org/apache/cassandra/db/ColumnFamilyStore.java  | 33 +--
 .../cassandra/db/ColumnFamilyStoreMBean.java        |  5 ++
 .../compaction/SizeTieredCompactionStrategy.java    |  3 +-
 src/java/org/apache/cassandra/tools/NodeProbe.java  |  3 +-
 .../cassandra/db/compaction/CompactionsTest.java    |  3 +-
 6 files changed, 29 insertions(+), 19 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/cc0be1b4/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index c160d69..169f66d 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -18,6 +18,7 @@
  * Fix LCS bug with sstable containing only 1 row (CASSANDRA-4411)
  * fix Can't Modify Index Name problem on CF update (CASSANDRA-4439)
  * Fix assertion error in getOverlappingSSTables during repair (CASSANDRA-4456)
+ * fix nodetool's setcompactionthreshold command (CASSANDRA-4455)
 Merged from 1.0:
  * allow dropping columns shadowed by not-yet-expired supercolumn or row tombstones in PrecompactedRow (CASSANDRA-4396)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/cc0be1b4/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
--
diff --git a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
index 0b66020..b93adc1 100644
--- a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
+++ b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
@@ -1845,6 +1845,18 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean
         return compactionStrategy;
     }
 
+    public void setCompactionThresholds(int minThreshold, int maxThreshold)
+    {
+        validateCompactionThresholds(minThreshold, maxThreshold);
+
+        minCompactionThreshold.set(minThreshold);
+        maxCompactionThreshold.set(maxThreshold);
+
+        // this is called as part of CompactionStrategy constructor; avoid circular dependency by checking for null
+        if (compactionStrategy != null)
+            CompactionManager.instance.submitBackground(this);
+    }
+
     public int getMinimumCompactionThreshold()
     {
         return minCompactionThreshold.value();
@@ -1852,14 +1864,8 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean
 
     public void setMinimumCompactionThreshold(int minCompactionThreshold)
     {
-        if ((minCompactionThreshold > this.maxCompactionThreshold.value()) && this.maxCompactionThreshold.value() != 0)
-            throw new RuntimeException("The min_compaction_threshold cannot be larger than the max.");
-
+        validateCompactionThresholds(minCompactionThreshold, maxCompactionThreshold.value());
         this.minCompactionThreshold.set(minCompactionThreshold);
-
-        // this is called as part of CompactionStrategy constructor; avoid circular dependency by checking for null
-        if (compactionStrategy != null)
-            CompactionManager.instance.submitBackground(this);
     }
 
     public int getMaximumCompactionThreshold()
@@ -1869,14 +1875,15 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean
 
     public void setMaximumCompactionThreshold(int maxCompactionThreshold)
     {
-        if (maxCompactionThreshold > 0 && maxCompactionThreshold < this.minCompactionThreshold.value())
-            throw new RuntimeException("The max_compaction_threshold cannot be smaller than the min.");
-
+        validateCompactionThresholds(minCompactionThreshold.value(), maxCompactionThreshold);
         this.maxCompactionThreshold.set(maxCompactionThreshold);
+    }
 
-        // this is called as part of CompactionStrategy constructor; avoid circular dependency by checking for null
-        if (compactionStrategy != null)
-            CompactionManager.instance.submitBackground(this);
+    private void validateCompactionThresholds(int minThreshold, int maxThreshold)
+    {
+        if (minThreshold > maxThreshold && maxThreshold != 0)
+            throw new RuntimeException(String.format("The min_compaction_threshold cannot be
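The shape of the change in the commit above (validate the pair once, then apply both values) can be sketched in isolation. This is a minimal illustration, not the Cassandra code: the class `CompactionThresholds`, its starting values of 4 and 32, the exception type, and the message text are all assumptions made for the example. Only the guard `minThreshold > maxThreshold && maxThreshold != 0` is taken from the diff.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical standalone holder for a min/max threshold pair (not a Cassandra class). */
final class CompactionThresholds {
    // Starting values are assumptions chosen for the example.
    private final AtomicInteger min = new AtomicInteger(4);
    private final AtomicInteger max = new AtomicInteger(32);

    /** Validates the new pair as a whole, then applies both values. */
    void set(int minThreshold, int maxThreshold) {
        validate(minThreshold, maxThreshold);
        min.set(minThreshold);
        max.set(maxThreshold);
    }

    /** Validates the new min against the current max. */
    void setMin(int minThreshold) {
        validate(minThreshold, max.get());
        min.set(minThreshold);
    }

    /** Validates the new max against the current min. */
    void setMax(int maxThreshold) {
        validate(min.get(), maxThreshold);
        max.set(maxThreshold);
    }

    int min() { return min.get(); }
    int max() { return max.get(); }

    // Same guard as in the diff: a max of 0 skips the ordering check.
    // The message text here is illustrative, not the one from the commit.
    static void validate(int minThreshold, int maxThreshold) {
        if (minThreshold > maxThreshold && maxThreshold != 0)
            throw new IllegalArgumentException(
                String.format("min threshold %d is larger than max threshold %d", minThreshold, maxThreshold));
    }
}
```

In this sketch, moving from (4, 32) to (40, 64) succeeds through `set(40, 64)`, whereas calling `setMin(40)` first would be rejected because it is checked against the old max of 32.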
[jira] [Commented] (CASSANDRA-3564) flush before shutdown so restart is faster
[ https://issues.apache.org/jira/browse/CASSANDRA-3564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13422517#comment-13422517 ] Jonathan Ellis commented on CASSANDRA-3564: --- bq. we could still improve the (debian) packaging to have a 'call flush for me before shutdown' WFM, but it should default to off IMO. flush before shutdown so restart is faster -- Key: CASSANDRA-3564 URL: https://issues.apache.org/jira/browse/CASSANDRA-3564 Project: Cassandra Issue Type: New Feature Components: Packaging Reporter: Jonathan Ellis Assignee: David Alves Priority: Minor Fix For: 1.2 Attachments: 3564.patch, 3564.patch Cassandra handles flush in its shutdown hook for durable_writes=false CFs (otherwise we're *guaranteed* to lose data) but leaves it up to the operator otherwise. I'd rather leave it that way to offer these semantics: - cassandra stop = shutdown nicely [explicit flush, then kill -int] - kill -INT = shutdown faster but don't lose any updates [current behavior] - kill -KILL = lose most recent writes unless durable_writes=true and batch commits are on [also current behavior] But if it's not reasonable to use nodetool from the init script then I guess we can just make the shutdown hook flush everything. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3564) flush before shutdown so restart is faster
[ https://issues.apache.org/jira/browse/CASSANDRA-3564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13422522#comment-13422522 ] Brandon Williams commented on CASSANDRA-3564: - bq. it should default to off IMO +1 flush before shutdown so restart is faster -- Key: CASSANDRA-3564 URL: https://issues.apache.org/jira/browse/CASSANDRA-3564 Project: Cassandra Issue Type: New Feature Components: Packaging Reporter: Jonathan Ellis Assignee: David Alves Priority: Minor Fix For: 1.2 Attachments: 3564.patch, 3564.patch Cassandra handles flush in its shutdown hook for durable_writes=false CFs (otherwise we're *guaranteed* to lose data) but leaves it up to the operator otherwise. I'd rather leave it that way to offer these semantics: - cassandra stop = shutdown nicely [explicit flush, then kill -int] - kill -INT = shutdown faster but don't lose any updates [current behavior] - kill -KILL = lose most recent writes unless durable_writes=true and batch commits are on [also current behavior] But if it's not reasonable to use nodetool from the init script then I guess we can just make the shutdown hook flush everything. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-4436) Counters in columns don't preserve correct values after cluster restart
[ https://issues.apache.org/jira/browse/CASSANDRA-4436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13422532#comment-13422532 ]

Jonathan Ellis commented on CASSANDRA-4436:
---

bq. But we won't have the same ancestor multiple times

I don't think that's true. Suppose for instance we have leveled compaction with A and B in L0. They are larger than 5MB so we split the result into X, Y, and Z. Next we flush C to L0. It overlaps with Y and Z, so we're compacting C, Y, and Z. Now we have Y and Z both with A and B as ancestors.

(Switching from LCS back to STCS is another way you could get duplicate ancestors.)

Counters in columns don't preserve correct values after cluster restart
---

Key: CASSANDRA-4436
URL: https://issues.apache.org/jira/browse/CASSANDRA-4436
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 1.0.10
Reporter: Peter Velas
Assignee: Sylvain Lebresne
Fix For: 1.1.3
Attachments: 4436-1.0-2.txt, 4436-1.0.txt, 4436-1.1-2.txt, 4436-1.1.txt, increments.cql.gz

Similar to #3821, but affecting normal columns.

Set up a 2-node cluster with rf=2.
1. Create a counter column family and increment 100 keys in a loop 5000 times.
2. Then do a rolling restart of the cluster.
3. Again increment another 5000 times.
4. Do a rolling restart of the cluster.
5. Again increment another 5000 times.
6. Do a rolling restart of the cluster.

After step 6 we were able to reproduce the bug with bad counter values. The expected values were 15000. The values returned from the cluster are higher than that: 15000 plus some random number.

Rolling restarts are done with nodetool drain, always waiting until the second node discovers it is down, then killing the java process.
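The duplicate-ancestor scenario from the comment (compacting C, Y, and Z, where Y and Z both descend from A and B) can be illustrated with a toy model. Everything here is hypothetical: `SSTable` is a stand-in with a name and an ancestor set, and treating a freshly flushed sstable as its own ancestor is an assumption made for the example. The sketch says nothing about how Cassandra actually tracks ancestors.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class AncestorDemo {
    /** Hypothetical stand-in for an sstable: a name plus the flushed sstables it derives from. */
    static final class SSTable {
        final String name;
        final Set<String> ancestors;

        SSTable(String name, String... ancestors) {
            this.name = name;
            this.ancestors = new TreeSet<>(Arrays.asList(ancestors));
        }
    }

    /** Concatenates the ancestors of every input, so shared ancestors appear more than once. */
    static List<String> mergedAncestors(List<SSTable> inputs) {
        List<String> merged = new ArrayList<>();
        for (SSTable input : inputs)
            merged.addAll(input.ancestors);
        return merged;
    }

    /** Collapses the merged list into a set, keeping each ancestor once. */
    static Set<String> distinctAncestors(List<SSTable> inputs) {
        return new TreeSet<>(mergedAncestors(inputs));
    }

    public static void main(String[] args) {
        // Y and Z came out of the same compaction of A and B; C is a fresh flush.
        List<SSTable> inputs = Arrays.asList(
            new SSTable("C", "C"),
            new SSTable("Y", "A", "B"),
            new SSTable("Z", "A", "B"));
        System.out.println(mergedAncestors(inputs));   // A and B each appear twice
        System.out.println(distinctAncestors(inputs)); // each ancestor once
    }
}
```

Any logic that walks the merged list and assumes each ancestor shows up once would process A and B twice in this scenario, which is the situation the comment describes.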
[jira] [Commented] (CASSANDRA-4292) Per-disk I/O queues
[ https://issues.apache.org/jira/browse/CASSANDRA-4292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13422574#comment-13422574 ]

Jonathan Ellis commented on CASSANDRA-4292:
---

bq. Directory is chosen based on available space in both queue and disk.

We still want to prioritize disks that have no tasks yet, since IOPS are a bigger bottleneck than space, in general. So specifically, we want to prioritize in order of:

# enough space for the new sstable (boolean)
# zero tasks (boolean)
# total free space (long)

We may want to test changing #2 to ordering by task count... both have pros and cons.

Per-disk I/O queues
---

Key: CASSANDRA-4292
URL: https://issues.apache.org/jira/browse/CASSANDRA-4292
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Jonathan Ellis
Assignee: Yuki Morishita
Fix For: 1.2
Attachments: 4292-v2.txt, 4292.txt

As noted in CASSANDRA-809, we have a certain amount of flush (and compaction) threads, which mix and match disk volumes indiscriminately. It may be worth creating a tight thread -> disk affinity, to prevent unnecessary conflict at that level. OTOH as SSDs become more prevalent this becomes a non-issue. Unclear how much pain this actually causes in practice in the meantime.
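The three-level priority order proposed in the comment maps naturally onto a chained comparator. The sketch below is an assumption-laden illustration, not the patch attached to the ticket: `DiskChoice`, `Candidate`, and their fields are hypothetical names, and the code uses Java 8 comparator helpers.

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

final class DiskChoice {
    /** Hypothetical snapshot of one data directory; not a Cassandra class. */
    static final class Candidate {
        final String path;
        final long freeBytes;
        final int pendingTasks;

        Candidate(String path, long freeBytes, int pendingTasks) {
            this.path = path;
            this.freeBytes = freeBytes;
            this.pendingTasks = pendingTasks;
        }
    }

    /**
     * Picks the preferred directory for an sstable of the given estimated size.
     * Boolean keys sort false before true, so taking the maximum prefers "true".
     * Throws NoSuchElementException if the list is empty.
     */
    static Candidate pick(List<Candidate> candidates, long estimatedBytes) {
        Comparator<Candidate> order = Comparator
            .comparing((Candidate c) -> c.freeBytes >= estimatedBytes) // 1. enough space (boolean)
            .thenComparing((Candidate c) -> c.pendingTasks == 0)       // 2. zero tasks (boolean)
            .thenComparingLong(c -> c.freeBytes);                      // 3. total free space (long)
        return Collections.max(candidates, order);
    }
}
```

The variant mentioned at the end of the comment, ordering by task count instead of the zero-task boolean, would replace the second key with something like a reversed comparison on `pendingTasks`, so that fewer pending tasks ranks higher.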
[jira] [Commented] (CASSANDRA-4460) SystemTable.setBootstrapState always sets bootstrap state to true
[ https://issues.apache.org/jira/browse/CASSANDRA-4460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13422593#comment-13422593 ]

Dave Brosius commented on CASSANDRA-4460:
---

it has to be migrated anyway. The table is defined to be boolean currently. So either you migrate to integer or string. I chose string as 0, 1, 2 mean nothing to me.

SystemTable.setBootstrapState always sets bootstrap state to true
---

Key: CASSANDRA-4460
URL: https://issues.apache.org/jira/browse/CASSANDRA-4460
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 1.2
Reporter: Dave Brosius
Assignee: Dave Brosius
Priority: Trivial
Attachments: use_bootstrap_enum_strings.txt

    public static void setBootstrapState(BootstrapState state)
    {
        String req = "INSERT INTO system.%s (key, bootstrapped) VALUES ('%s', '%b')";
        processInternal(String.format(req, LOCAL_CF, LOCAL_KEY, getBootstrapState()));
        forceBlockingFlush(LOCAL_CF);
    }

Third parameter %b is set from getBootstrapState() which returns an enum, thus %b collapses to null/non null checks. This would seem then to always set it to true.
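The `%b` behaviour described in the report is easy to reproduce outside Cassandra. In the sketch below the enum constants match the ones in `SystemTable`, but the class `BootstrapFormatDemo` and the shortened format strings are made up for the demonstration.

```java
final class BootstrapFormatDemo {
    enum BootstrapState { NEEDS_BOOTSTRAP, COMPLETED, IN_PROGRESS }

    /** %b renders "false" for null and "true" for any other non-Boolean argument. */
    static String withPercentB(BootstrapState state) {
        return String.format("bootstrapped='%b'", state);
    }

    /** %s with the enum's name keeps the actual state in the output. */
    static String withName(BootstrapState state) {
        return String.format("bootstrapped='%s'", state.name());
    }

    public static void main(String[] args) {
        for (BootstrapState s : BootstrapState.values())
            System.out.println(withPercentB(s) + "  vs  " + withName(s));
    }
}
```

Every enum constant formats as `true` under `%b`, including `NEEDS_BOOTSTRAP`, which is why the report concludes that the stored value is always true.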
[Cassandra Wiki] Update of VirtualNodes/Balance by EricEvans
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Cassandra Wiki for change notification.

The VirtualNodes/Balance page has been changed by EricEvans:
http://wiki.apache.org/cassandra/VirtualNodes/Balance?action=diffrev1=3rev2=4

Comment:
proposed tool interfaces

  === User Interface ===
+ The `balance` sub-command balances the node it is run against, by default to a targeted ownership of `1/n`. The sub-command takes an optional offset in the range<<FootNote(Does this range make sense?)>> of `+100` to `-100`, which results in a targeted ownership of `(1/n)*(offset/100)`.
+
+ ''Note: ranges copied from/to other nodes must be selected in such a way as to respect their offsets.''
+
  {{{
- $ nodetool balance
+ $ nodetool balance [+/-offset]
  }}}
+
+ The `shuffle` sub-command randomly exchanges contiguous ranges on the node it is run against with other nodes in the cluster.
  {{{
  $ nodetool shuffle
  }}}
+ The `trim` sub-command assigns an offset in the range<<FootNote(Does this range make sense?)>> of `+100` to `-100`, and copies randomly selected ranges onto, or off of, the node it is run against to achieve the requested ownership (`(1/n)*(offset/100)`).
+
  {{{
- $ nodetool trim
+ $ nodetool trim +/- offset
  }}}
[jira] [Commented] (CASSANDRA-1967) commit log replay shouldn't end with a flush
[ https://issues.apache.org/jira/browse/CASSANDRA-1967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13422613#comment-13422613 ] Robert Coli commented on CASSANDRA-1967: After making the above update, I noticed Cassandra 1.0.10 flushing after replay. Given this experience clashing with my interpretation of the code, I conjectured that the flush must be deeper in the code paths than previous versions, and deeper than I read this time. I asked about this in #cassandra. Per jbellis in #cassandra : 1) Explicit flush at the end of replay is by design. 2) The design goal in this case is to avoid multiple replay of the same log, if node crashes before replayed data is flushed. I don't find 2) a compelling design goal, and believe it violates the principle of least surprise. The purpose of the commitlog is to hold the contents of memtables. In the case of a crash, I expect the commitlog replay process to result in the same memtables that my node contained before it crashed. If it then crashes again, I expect the same memtables to be replayed again. There may be some negative externalities to this repeated replay which are not currently clear to me, but I am relatively confident that being surprised by my memtable state is not one of them. In my opinion, avoiding compaction as a side effect of restart/replay is, in contrast, a compelling design goal. Significant production users appear to agree in CASSANDRA-2444 ([Twitter has] ran into many times where we do not want compaction to run right away against CFs when booting up a node.) But the resolution of CASSANDRA-2444 (If the node needs to compact, it will do so at the first flush, which is more likely to be staggered across the cluster) does not make sense if commitlog replay always ends with a flush. The logical result of both code paths appears the same : restart has a potential to trigger immediate compaction. In summary... 
+1 for re-opening this ticket and making commit log replay not end with a flush. commit log replay shouldn't end with a flush Key: CASSANDRA-1967 URL: https://issues.apache.org/jira/browse/CASSANDRA-1967 Project: Cassandra Issue Type: Improvement Components: Core Affects Versions: 0.3 Reporter: Robert Coli Priority: Minor (Apologies in advance if there is some very compelling reason to flush after replay, of which I am not currently aware. ;D) Currently, when a node restarts, the following sequence occurs : a) commitlog is replayed b) any memtables resulting from a) are flushed c) a new commitlog is opened, new memtables are switched in ... (other stuff happens) d) node starts taking traffic This has side effects, perhaps most seriously the potential of triggering compaction. As a node is likely to struggle performance-wise after restarting, triggering compaction at that time seems like something we might wish to avoid. I propose that the sequence be : a) commitlog is replayed b) a new commitlog is opened, new memtables are switched in ... (other stuff happens) c) node starts taking traffic Looking through the relevant code, the only code that appears to depend on this flush is at src/java/org/apache/cassandra/db/commitlog/CommitLog.java:112 : // all old segments are recovered and deleted before CommitLog is instantiated. // All we need to do is create a new one. segments.add(new CommitLogSegment()); Presumably this code would have to be refactored to be aware of the currently open commitlog. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-1967) commit log replay shouldn't end with a flush
[ https://issues.apache.org/jira/browse/CASSANDRA-1967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13422628#comment-13422628 ] Jonathan Ellis commented on CASSANDRA-1967: --- You're barking up the wrong tree by blaming flush. To the degree that compaction is a problem (and on a properly tuned system it shouldn't be), we can simply extend the five minute delay on autocompaction to these flushes as well. commit log replay shouldn't end with a flush Key: CASSANDRA-1967 URL: https://issues.apache.org/jira/browse/CASSANDRA-1967 Project: Cassandra Issue Type: Improvement Components: Core Affects Versions: 0.3 Reporter: Robert Coli Priority: Minor (Apologies in advance if there is some very compelling reason to flush after replay, of which I am not currently aware. ;D) Currently, when a node restarts, the following sequence occurs : a) commitlog is replayed b) any memtables resulting from a) are flushed c) a new commitlog is opened, new memtables are switched in ... (other stuff happens) d) node starts taking traffic This has side effects, perhaps most seriously the potential of triggering compaction. As a node is likely to struggle performance-wise after restarting, triggering compaction at that time seems like something we might wish to avoid. I propose that the sequence be : a) commitlog is replayed b) a new commitlog is opened, new memtables are switched in ... (other stuff happens) c) node starts taking traffic Looking through the relevant code, the only code that appears to depend on this flush is at src/java/org/apache/cassandra/db/commitlog/CommitLog.java:112 : // all old segments are recovered and deleted before CommitLog is instantiated. // All we need to do is create a new one. segments.add(new CommitLogSegment()); Presumably this code would have to be refactored to be aware of the currently open commitlog. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-4460) SystemTable.setBootstrapState always sets bootstrap state to true
[ https://issues.apache.org/jira/browse/CASSANDRA-4460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-4460:

Attachment: 4460.txt

bq. it has to be migrated anyway. The table is defined to be boolean currently

Actually, it's only boolean in trunk and we don't need to keep trunk compatible with itself. It turns out upgradeSystemData() is handling the 1.1 to trunk transition for us already.

bq. I chose string as 0, 1, 2 mean nothing to me.

Fair enough. Attaching a new version which takes all of this into account, and fixes a bug in setBootstrapState using getBootstrapState instead of the state passed to it.

SystemTable.setBootstrapState always sets bootstrap state to true
---

Key: CASSANDRA-4460
URL: https://issues.apache.org/jira/browse/CASSANDRA-4460
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 1.2
Reporter: Dave Brosius
Assignee: Dave Brosius
Priority: Trivial
Attachments: 4460.txt, use_bootstrap_enum_strings.txt

    public static void setBootstrapState(BootstrapState state)
    {
        String req = "INSERT INTO system.%s (key, bootstrapped) VALUES ('%s', '%b')";
        processInternal(String.format(req, LOCAL_CF, LOCAL_KEY, getBootstrapState()));
        forceBlockingFlush(LOCAL_CF);
    }

Third parameter %b is set from getBootstrapState() which returns an enum, thus %b collapses to null/non null checks. This would seem then to always set it to true.
[jira] [Created] (CASSANDRA-4465) Index fails to be created on all nodes in cluster, restart resolves
Grant Heffernan created CASSANDRA-4465: -- Summary: Index fails to be created on all nodes in cluster, restart resolves Key: CASSANDRA-4465 URL: https://issues.apache.org/jira/browse/CASSANDRA-4465 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 1.0.10 Environment: 21 node cluster, Ubuntu Linux 11.10 in a virtualized environment, Apache cassandra community release, binary distribution Reporter: Grant Heffernan Priority: Minor On a production cluster, under load, creating an index on a column resulted in the index being successfully created on 4 of 21 nodes. All nodes received the schema agreement and were in concert. There were no errors logged on any of the nodes that failed to build the index. A rolling restart of the cluster resulted in the nodes which had previously failed to build the index doing so when coming back up from a restart. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-4460) SystemTable.setBootstrapState always sets bootstrap state to true
[ https://issues.apache.org/jira/browse/CASSANDRA-4460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13422810#comment-13422810 ]

Dave Brosius commented on CASSANDRA-4460:
---

LGTM

SystemTable.setBootstrapState always sets bootstrap state to true
---

Key: CASSANDRA-4460
URL: https://issues.apache.org/jira/browse/CASSANDRA-4460
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 1.2
Reporter: Dave Brosius
Assignee: Dave Brosius
Priority: Trivial
Attachments: 4460.txt, use_bootstrap_enum_strings.txt

    public static void setBootstrapState(BootstrapState state)
    {
        String req = "INSERT INTO system.%s (key, bootstrapped) VALUES ('%s', '%b')";
        processInternal(String.format(req, LOCAL_CF, LOCAL_KEY, getBootstrapState()));
        forceBlockingFlush(LOCAL_CF);
    }

Third parameter %b is set from getBootstrapState() which returns an enum, thus %b collapses to null/non null checks. This would seem then to always set it to true.
git commit: Fix SystemTable.setBootstrapState and other merge fallout from #4427. Patch by Dave Brosius and brandonwilliams, reviewed by Dave Brosius for CASSANDRA-4460
Updated Branches:
  refs/heads/trunk 5cde66bab -> d96e813e0

Fix SystemTable.setBootstrapState and other merge fallout from #4427.
Patch by Dave Brosius and brandonwilliams, reviewed by Dave Brosius for CASSANDRA-4460

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/d96e813e
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/d96e813e
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/d96e813e

Branch: refs/heads/trunk
Commit: d96e813e06cc9fdf902d79af2962f38f047b14fa
Parents: 5cde66b
Author: Brandon Williams brandonwilli...@apache.org
Authored: Wed Jul 25 20:34:11 2012 -0500
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Wed Jul 25 20:36:49 2012 -0500

--
 .../org/apache/cassandra/config/CFMetaData.java    |  2 +-
 src/java/org/apache/cassandra/db/SystemTable.java  | 15 ---
 2 files changed, 9 insertions(+), 8 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/d96e813e/src/java/org/apache/cassandra/config/CFMetaData.java
--
diff --git a/src/java/org/apache/cassandra/config/CFMetaData.java b/src/java/org/apache/cassandra/config/CFMetaData.java
index 906017c..4321175 100644
--- a/src/java/org/apache/cassandra/config/CFMetaData.java
+++ b/src/java/org/apache/cassandra/config/CFMetaData.java
@@ -166,7 +166,7 @@ public final class CFMetaData
             + "token_bytes blob,"
             + "cluster_name text,"
             + "gossip_generation int,"
-            + "bootstrapped boolean,"
+            + "bootstrapped text,"
             + "ring_id uuid,"
             + "release_version text,"
             + "thrift_version text,"

http://git-wip-us.apache.org/repos/asf/cassandra/blob/d96e813e/src/java/org/apache/cassandra/db/SystemTable.java
--
diff --git a/src/java/org/apache/cassandra/db/SystemTable.java b/src/java/org/apache/cassandra/db/SystemTable.java
index e852003..778f3cb 100644
--- a/src/java/org/apache/cassandra/db/SystemTable.java
+++ b/src/java/org/apache/cassandra/db/SystemTable.java
@@ -80,8 +80,8 @@ public class SystemTable
 
     public enum BootstrapState
     {
-        NEEDS_BOOTSTRAP, // ordered for boolean backward compatibility, false
-        COMPLETED, // true
+        NEEDS_BOOTSTRAP,
+        COMPLETED,
         IN_PROGRESS
     }
 
@@ -136,8 +136,8 @@ public class SystemTable
             Token token = StorageService.getPartitioner().getTokenFactory().fromByteArray(oldColumns.next().value());
             String tokenBytes = ByteBufferUtil.bytesToHex(serializeTokens(Collections.singleton(token)));
             // (assume that any node getting upgraded was bootstrapped, since that was stored in a separate row for no particular reason)
-            String req = "INSERT INTO system.%s (key, cluster_name, token_bytes, bootstrapped) VALUES ('%s', '%s', '%s', 'true')";
-            processInternal(String.format(req, LOCAL_CF, LOCAL_KEY, clusterName, tokenBytes));
+            String req = "INSERT INTO system.%s (key, cluster_name, token_bytes, bootstrapped) VALUES ('%s', '%s', '%s', '%s')";
+            processInternal(String.format(req, LOCAL_CF, LOCAL_KEY, clusterName, tokenBytes, BootstrapState.COMPLETED.name()));
             oldStatusCfs.truncate();
         }
 
@@ -372,7 +372,8 @@ public class SystemTable
 
         if (result.isEmpty() || !result.one().has("bootstrapped"))
             return BootstrapState.NEEDS_BOOTSTRAP;
-        return BootstrapState.values()[result.one().getInt("bootstrapped")];
+
+        return BootstrapState.valueOf(result.one().getString("bootstrapped"));
     }
 
     public static boolean bootstrapComplete()
@@ -387,8 +388,8 @@ public class SystemTable
 
     public static void setBootstrapState(BootstrapState state)
     {
-        String req = "INSERT INTO system.%s (key, bootstrapped) VALUES ('%s', '%b')";
-        processInternal(String.format(req, LOCAL_CF, LOCAL_KEY, getBootstrapState()));
+        String req = "INSERT INTO system.%s (key, bootstrapped) VALUES ('%s', '%s')";
+        processInternal(String.format(req, LOCAL_CF, LOCAL_KEY, state.name()));
         forceBlockingFlush(LOCAL_CF);
     }
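The commit stores the state by name and reads it back with `valueOf` rather than indexing `values()` by a stored integer. A minimal sketch of that round trip follows. `BootstrapStateCodec`, `encode`, and `decode` are hypothetical names for illustration, and the `null` handling is an assumption that loosely mirrors the "no row or no column means NEEDS_BOOTSTRAP" branch in the diff.

```java
final class BootstrapStateCodec {
    enum BootstrapState { NEEDS_BOOTSTRAP, COMPLETED, IN_PROGRESS }

    /** Stores the constant's name, which does not depend on declaration order. */
    static String encode(BootstrapState state) {
        return state.name();
    }

    /**
     * Reads a stored name back. A missing value is treated as NEEDS_BOOTSTRAP;
     * an unknown name makes valueOf throw IllegalArgumentException.
     */
    static BootstrapState decode(String stored) {
        if (stored == null)
            return BootstrapState.NEEDS_BOOTSTRAP;
        return BootstrapState.valueOf(stored);
    }

    public static void main(String[] args) {
        for (BootstrapState s : BootstrapState.values())
            System.out.println(s + " -> " + encode(s) + " -> " + decode(encode(s)));
    }
}
```

One consequence of storing names is that the enum no longer has to keep a particular ordering, which fits with the diff dropping the "ordered for boolean backward compatibility" comments.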