Re: Is column update column-atomic or row atomic?
Sorry for the rather primitive question, but it's not clear to me if I need to fetch the whole row, add a column as a dictionary entry and re-insert it if I want to expand the row by one column. Help will be appreciated. As was pointed out, reading and re-inserting is definitely not the way to go. But note that when inserting a column, there is never going to be a guarantee that other columns are not inserted/deleted concurrently by other writers (unless there is external synchronization). Your question makes me believe you're trying to ensure some kind of consistency across multiple columns in a row. Maybe it would help if you described your use-case. -- / Peter Schuller
Re: Getting list of active cassandra nodes
moving to user list. describe_ring() will give you a list of the token ranges and the nodes that are responsible for them: http://wiki.apache.org/cassandra/API . It does not include information on which nodes are up, down, or bootstrapping. Information about the state of the nodes is available on the StorageService JMX MBean. Aaron On 16 Mar 2011, at 15:10, Anurag Gujral wrote: Hi All, How can I get the list of active cassandra nodes using the cassandra 0.7 API? Thanks a ton, Anurag
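Since describe_ring() only covers ranges and endpoints, client-side you have to join that with liveness information yourself. A hypothetical sketch (pure Python; it assumes you have already fetched the ring via Thrift and the live-node set from the StorageService MBean -- neither service is contacted here, both are passed in as plain data):

```python
# Hypothetical helper: combine describe_ring() output with node state
# fetched separately (e.g. from the StorageService JMX MBean).

def live_endpoints_per_range(ring, live_nodes):
    """ring: list of (start_token, end_token, [endpoints]) tuples, as you
    might flatten describe_ring()'s TokenRange structs.
    live_nodes: set of endpoint addresses currently reported as up.
    Returns {(start, end): [endpoints that are up]}."""
    return {
        (start, end): [ep for ep in endpoints if ep in live_nodes]
        for start, end, endpoints in ring
    }

# Example data (tokens and addresses are made up):
ring = [
    ("0", "85070591730234615865843651857942052864", ["10.0.0.1", "10.0.0.2"]),
    ("85070591730234615865843651857942052864", "0", ["10.0.0.2", "10.0.0.3"]),
]
live = {"10.0.0.1", "10.0.0.3"}
result = live_endpoints_per_range(ring, live)
```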
swap setting on linux
Dear community! Please share your settings for swap on a Linux box.
Re: swap setting on linux
According to the Cassandra wiki, the best strategy is no swap at all. http://wiki.apache.org/cassandra/MemtableThresholds#Virtual_Memory_and_Swap 2011/3/16 ruslan usifov ruslan.usi...@gmail.com: Dear community! Please share your settings for swap on a Linux box -- w3m
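For reference, the usual way to follow that advice looks roughly like this (commands are illustrative; paths and defaults vary by distro):

```shell
# Disable swap for the current boot
sudo swapoff -a

# Keep it off across reboots by commenting out the swap entries in
# /etc/fstab (edit by hand). If you must keep a swap device around,
# at least discourage the kernel from using it:
echo "vm.swappiness = 0" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
```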
Upgrade to a different version?
We are running 0.6.6 and are considering upgrading to either 0.6.8 or one of the 0.7.x releases. What is the recommended version and procedure? What are the issues we face? Are there any specific storage gotchas we need to be aware of? Are there any docs around this process for review? Thanks, jake -- Jake Maizel Soundcloud Mail GTalk: j...@soundcloud.com Skype: jakecloud Rosenthaler strasse 13, 101 19, Berlin, DE
Re: where to find the stress testing programs?
There are both Python and Java stress testing tools. I found the Java version easier to use. These directions (which echo the README for stress.java) may help get you going: http://www.datastax.com/docs/0.7/utilities/stress_java On Tue, Mar 15, 2011 at 9:25 AM, Jeremy Hanna jeremy.hanna1...@gmail.com wrote: contrib is only in the source download of cassandra On Mar 15, 2011, at 11:23 AM, Jonathan Colby wrote: According to the Cassandra wiki and the O'Reilly book, there is supposedly a contrib directory within the Cassandra download containing the Python stress test script stress.py. It's not in the binary tarball of 0.7.3. Anyone know where to find it? Anyone know of other, maybe better, stress testing scripts? Jon
Re: Upgrade to a different version?
Hi Jake, I'm sending this privately, because I wanted to tell you my opinion frankly. I don't know about the 0.6 series or 0.7.4, but so far, all of the 0.7 series of Cassandra has been a disaster. I would think twice about switching to anything in the 0.7 series in production until things stabilize and at least one reasonably large site starts using Cassandra 0.7. Jonathan claims reddit is using Cassandra, but it can't be a good experience with the type of bugs that have been found.
0.7.0 had data corruption issues.
0.7.1 also had data corruption issues, and had major issues with anything over 2 GB in memory.
0.7.2 had issues with reading properly.
0.7.3 had major issues with anything over 2 GB in memory, had performance issues due to flushing rules being broken, many people had huge issues with large numbers of insertions, and a few had startup issues.
0.7.4 is too new to say.
In either case, do a lot of testing for your use case before switching, as things in the 0.7 series are still very much in development. I've talked to Jonathan about putting it into beta status because of the severity of the bugs, but so far, there has been no decision to do so. Good luck. Paul On 3/16/2011 1:21 PM, Jake Maizel wrote: We are running 0.6.6 and are considering upgrading to either 0.6.8 or one of the 0.7.x releases. What is the recommended version and procedure? What are the issues we face? Are there any specific storage gotchas we need to be aware of? Are there any docs around this process for review? Thanks, jake
memory usage for secondary indexes
Was just reading through the code to get an understanding of the memory impact for secondary indexes. The index CF is created with the same memtable settings as the parent CF (in CFMetaData.newIndexMetadata). Does this mean that when estimating JVM heap size each index should be considered as another CF? I'll update the wiki with the answer http://wiki.apache.org/cassandra/MemtableThresholds Cheers Aaron
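If each index CF really does inherit the parent's memtable settings, a back-of-envelope heap estimate would simply count each secondary index as one more CF. A sketch of that arithmetic (the 3x-throughput-per-hot-CF-plus-overhead rule of thumb is my paraphrase of the MemtableThresholds wiki guidance; treat the numbers as approximate):

```python
def estimate_heap_mb(memtable_throughput_mb, hot_cfs, secondary_indexes, base_mb=1024):
    """Rough heap estimate: each actively-written CF can hold roughly
    3x its memtable_throughput in heap (live memtable plus copies being
    flushed), and each secondary index behaves like one more CF.
    base_mb covers key cache and other internal structures."""
    effective_cfs = hot_cfs + secondary_indexes
    return memtable_throughput_mb * 3 * effective_cfs + base_mb

# 4 hot CFs, 2 of which carry a secondary index, 64 MB memtables:
est = estimate_heap_mb(memtable_throughput_mb=64, hot_cfs=4, secondary_indexes=2)
```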
Re: On 0.6.6 to 0.7.3 migration, DC-aware traffic and minimising data transfer
That should work then, assuming SimpleStrategy/RackUnawareStrategy. Otherwise figuring out which machines share which data gets complicated. Note that if you have room on the machines, it's going to be faster to copy the entire data set to each machine and run cleanup than to have repair fix 3 of 4 replicas from scratch. Repair would work, eventually, but it's kind of a worst-case scenario for it. On Mon, Mar 14, 2011 at 10:39 AM, Jedd Rashbrooke j...@visualdna.com wrote: Jonathan, thank you for your answers here. To explain this bit ... On 11 March 2011 20:46, Jonathan Ellis jbel...@gmail.com wrote: On Thu, Mar 10, 2011 at 6:06 AM, Jedd Rashbrooke j...@visualdna.com wrote: Copying a cluster between AWS DCs: we have ~150-250 GB per node, with a replication factor of 4. I ack that 0.6 -> 0.7 is necessarily stop-the-world, so in an attempt to minimise that outage period I was wondering if it's possible to drain and stop the cluster, then copy over only the 1st, 5th, 9th, and 13th nodes' worth of data (which should be a full copy of all our actual data -- we are nicely partitioned, despite the disparity in GB per node) and have Cassandra re-populate the new destination 16 nodes from those four data sets. If this is feasible, is it likely to be more expensive (in terms of time the new cluster is unresponsive as it rebuilds) than just copying across all 16 sets of data -- about 2.7 TB? I'm confused. You're trying to upgrade and add a DC at the same time? Yeah, I know, it's probably not the sanest route -- but the hardware (virtualised, Amazonish EC2 that it is) will be the same between the two sites, so that reduces some of the usual roll in / roll out migration risk. But more importantly for us it would mean we'd have just the one major outage, rather than two (relocation and 0.6 -> 0.7). cheers, Jedd. -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
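The "copy everything, then cleanup" approach would look roughly like this per destination node (illustrative only; hostnames and data paths are placeholders, and the source cluster should be drained and stopped first):

```shell
# On each new node: pull the full data set from a source node
rsync -av olddc-node:/var/lib/cassandra/data/ /var/lib/cassandra/data/

# Start Cassandra with this node's own token configured, then drop
# the rows that fall outside this node's ranges:
nodetool -h localhost cleanup
```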
Re: Upgrade to a different version?
Sorry guys, that was meant to be private. My opinion stands, but I didn't want to hurt any of the devs' feelings by being too frank. I think the progress has been good in new features, but I feel we have taken a step back in reliability and scalability since so many features were added without adequate testing. Hopefully, at some point soon, it will get better, and doing a data import job won't take a Cassandra cluster to its knees, and we won't experience stop-the-world GC issues and out-of-memory errors from routine usage. Paul On 3/16/2011 2:13 PM, Paul Pak wrote: Hi Jake, I'm sending this privately, because I wanted to tell you my opinion frankly. [...] Good luck. Paul
replace one node with another
Hello. For example, if we want to change one server to another, with an IP address change as well -- how can we do that the easiest way? For now we do nodetool removetoken, then set autobootstrap: true on the new server (with the token that was on the old node).
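The sequence described above can be sketched like this (0.7-style cassandra.yaml keys; the token value is a placeholder and the exact key names should be checked against your config file):

```shell
# 1. On any live node, remove the old node's token from the ring:
nodetool -h localhost removetoken <old-node-token>

# 2. On the replacement server, before first start, set in cassandra.yaml:
#      auto_bootstrap: true
#      initial_token: <old-node-token>

# 3. Start Cassandra on the new server and let it stream its ranges in.
```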
Re: Upgrade to a different version?
So did you downgrade it back to the 0.6.x series? On Thu, Mar 17, 2011 at 6:36 AM, Paul Pak p...@yellowseo.com wrote: Sorry guys, that was meant to be private. My opinion stands, but I didn't want to hurt any of the devs' feelings by being too frank. [...] Good luck. Paul -- http://twitter.com/jpartogi
Please help decipher /proc/cpuinfo for optimal Cassandra config
Dear All, this is from my new Cassandra server. It obviously uses hyperthreading, I just don't know how to translate this to concurrent readers and writers in cassandra.yaml -- can somebody take a look and tell me what number of cores I need to assume for concurrent_reads and concurrent_writes. Is it 24? Thanks!

[cassandra@cassandra01 bin]$ cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 44
model name      : Intel(R) Xeon(R) CPU X5650 @ 2.67GHz
stepping        : 2
cpu MHz         : 1596.000
cache size      : 12288 KB
physical id     : 0
siblings        : 12
core id         : 0
cpu cores       : 6
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 popcnt aes lahf_lm arat tpr_shadow vnmi flexpriority ept vpid
bogomips        : 5333.91
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:

[stanzas for processors 1-4, identical apart from processor, core id and apicid values, trimmed]
Re: Is column update column-atomic or row atomic?
Hello Peter, thanks for the note. I'm not looking for anything fancy. It's just that when I'm looking at the following bit of the pycassa docs, it's not 100% clear to me that it won't overwrite the entire row for the key if I want to simply add an extra column {'foo':'bar'} to the already existing row. I don't care about cross-node consistency at this point. insert(key, columns[, timestamp][, ttl][, write_consistency_level]) Insert or update columns in the row with key key. columns should be a dictionary of columns or super columns to insert or update. If this is a standard column family, columns should look like {column_name: column_value}. If this is a super column family, columns should look like {super_column_name: {sub_column_name: value}} -- View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Is-column-update-column-atomic-or-row-atomic-tp6174445p6179492.html Sent from the cassandra-u...@incubator.apache.org mailing list archive at Nabble.com.
Re: Upgrade to a different version?
Paul, Don't feel like you have to hold back when it comes to feedback. There is a place to vote on releases. If you have something that could potentially be critical that you can isolate, by all means chime in. Even if your vote isn't binding because you are not a committer, votes with something credible behind them get taken seriously. Votes happen on the dev@cassandra mailing list. Alternately, feel free to create Jira tickets any time. Also, there are unit tests, integration tests, and distributed tests. If you feel like you can add to any of these, please get involved. It sounds like you already do internal testing, so it might be fairly simple to add to some of these tests. Wrt the distributed tests, some devs at Twitter along with others have contributed a distributed test harness for Cassandra which has been in 0.7 since 0.7.1. See CASSANDRA-1859 for the beginning and http://svn.apache.org/repos/asf/cassandra/branches/cassandra-0.7/test/ for the latest. This uses Apache Whirr to spin up some nodes and runs tests over them. In any case, we all want to make a solid release, and if you have specifics on what can make it better, it would benefit the whole community. Jeremy On Mar 16, 2011, at 2:36 PM, Paul Pak wrote: Sorry guys, that was meant to be private. My opinion stands, but I didn't want to hurt any of the devs' feelings by being too frank. [...] Good luck. Paul
Re: Please help decipher /proc/cpuinfo for optimal Cassandra config
On Wed, Mar 16, 2011 at 9:58 PM, buddhasystem potek...@bnl.gov wrote: Dear All, this is from my new Cassandra server. It obviously uses hyperthreading, I just don't know how to translate this to concurrent readers and writers in cassandra.yaml -- can somebody take a look and tell me what number of cores I need to assume for concurrent_reads and concurrent_writes. Is it 24? Thanks! [/proc/cpuinfo output quoted above, trimmed]
Re: Is column update column-atomic or row atomic?
insert() will only overwrite (or insert) the columns that you supply in the dictionary. So, if you do: cf.insert('key', {'foo': 'bar'}) and the column 'foo' doesn't exist in that row yet, the column will simply be added to the other columns in the row. On Wed, Mar 16, 2011 at 9:00 PM, buddhasystem potek...@bnl.gov wrote: Hello Peter, thanks for the note. I'm not looking for anything fancy. It's just when I'm looking at the following bit of Pycassa docs, it's not 100% clear to me that it won't overwrite the entire row for the key, if I want to simply add an extra column {'foo':'bar'} to the already existing row. [...] -- Tyler Hobbs Software Engineer, DataStax http://datastax.com/ Maintainer of the pycassa http://github.com/pycassa/pycassa Cassandra Python client library
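To make the merge behaviour concrete, here is a toy in-memory model of what the server does with an insert (pure Python, no pycassa, no timestamps; purely illustrative of column-level resolution, not Cassandra's actual write path):

```python
# Toy model of Cassandra's per-column write semantics: an insert merges
# the supplied columns into the row, replacing only colliding column names.
store = {}  # row_key -> {column_name: column_value}

def insert(key, columns):
    store.setdefault(key, {}).update(columns)

insert('key', {'name': 'alice', 'city': 'berlin'})
insert('key', {'foo': 'bar'})          # adds a column; nothing else touched
insert('key', {'city': 'hamburg'})     # replaces only the colliding column
```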
Re: reduced cached mem; resident set size growth
On Thu, Feb 3, 2011 at 1:49 AM, Ryan King r...@twitter.com wrote: On Wed, Feb 2, 2011 at 6:22 AM, Chris Burroughs chris.burrou...@gmail.com wrote: On 01/28/2011 09:19 PM, Chris Burroughs wrote: Thanks Oleg and Zhu. I swear that wasn't a new hotspot version when I checked, but that's obviously not the case. I'll update one node to the latest as soon as I can and report back. RSS over 48 hours with Java 6 update 23: http://img716.imageshack.us/img716/5202/u2348hours.png I'll continue monitoring, but RSS still appears to grow without bounds. Zhu reported a similar problem with Ubuntu 10.04. While possible, it would seem extraordinarily unlikely that there is a glibc or kernel bug affecting us both. We're seeing a similar problem with one of our clusters (but over a longer time scale). It's possible that it's not a leak, but just fragmentation. Unless you've told it otherwise, the JVM uses glibc's malloc implementation for off-heap allocations. We're currently running a test with jemalloc on one node to see if the problem goes away. Ryan, does jemalloc solve the RSS growth problem in your test? -ryan
Re: Is column update column-atomic or row atomic?
Thanks for the clarification, Tyler, and sorry again for the basic question. I've been doing straight inserts from Oracle so far, but now I need to update rows with new columns. -- View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Is-column-update-column-atomic-or-row-atomic-tp6174445p6179536.html Sent from the cassandra-u...@incubator.apache.org mailing list archive at Nabble.com.
Re: Please help decipher /proc/cpuinfo for optimal Cassandra config
Thanks! Docs say it's good to set it to 8*Ncores -- are you saying you see 8 cores in this output? I know I need to go way above the default of 32 with this setup. -- View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Please-help-decipher-proc-cpuinfo-for-optimal-Cassandra-config-tp6179487p6179539.html Sent from the cassandra-u...@incubator.apache.org mailing list archive at Nabble.com.
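You can count both logical processors and physical cores straight from that output rather than eyeballing it. A small stdlib-only sketch (the 8*Ncores rule of thumb comes from the thread, not from this code):

```python
def count_cores(cpuinfo_text):
    """Return (logical, physical) core counts from /proc/cpuinfo text.
    Logical = number of 'processor' stanzas; physical = number of
    distinct (physical id, core id) pairs, which collapses
    hyperthread siblings onto their shared core."""
    logical = 0
    physical_pairs = set()
    phys_id = None
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(':')
        key, value = key.strip(), value.strip()
        if key == 'processor':
            logical += 1
        elif key == 'physical id':
            phys_id = value
        elif key == 'core id':
            physical_pairs.add((phys_id, value))
    return logical, len(physical_pairs)

# Synthetic sample: 4 logical CPUs, two of them hyperthread siblings.
sample = """processor : 0
physical id : 0
core id : 0
processor : 1
physical id : 0
core id : 0
processor : 2
physical id : 0
core id : 1
processor : 3
physical id : 1
core id : 0
"""
logical, physical = count_cores(sample)
```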
Re: reduced cached mem; resident set size growth
On Thu, Mar 17, 2011 at 10:27 AM, Zhu Han schumi@gmail.com wrote: On Thu, Feb 3, 2011 at 1:49 AM, Ryan King r...@twitter.com wrote: On Wed, Feb 2, 2011 at 6:22 AM, Chris Burroughs chris.burrou...@gmail.com wrote: On 01/28/2011 09:19 PM, Chris Burroughs wrote: Thanks Oleg and Zhu. I swear that wasn't a new hotspot version when I checked, but that's obviously not the case. I'll update one node to the latest as soon as I can and report back. RSS over 48 hours with Java 6 update 23: http://img716.imageshack.us/img716/5202/u2348hours.png I'll continue monitoring, but RSS still appears to grow without bounds. Zhu reported a similar problem with Ubuntu 10.04. While possible, it would seem extraordinarily unlikely that there is a glibc or kernel bug affecting us both. We're seeing a similar problem with one of our clusters (but over a longer time scale). Does that mean that not all of your clusters running Cassandra observed the same RSS growth problem? It's possible that it's not a leak, but just fragmentation. Unless you've told it otherwise, the JVM uses glibc's malloc implementation for off-heap allocations. We're currently running a test with jemalloc on one node to see if the problem goes away. Ryan, does jemalloc solve the RSS growth problem in your test? -ryan
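For anyone wanting to reproduce the jemalloc experiment mentioned above, the usual mechanism is preloading it for the Cassandra process (the library path is distro-dependent and illustrative):

```shell
# In cassandra-env.sh (or your init script), before the JVM starts,
# replace glibc malloc for this process only:
export LD_PRELOAD=/usr/lib/libjemalloc.so
```

RSS can then be compared between preloaded and non-preloaded nodes over the same workload window.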
super_column.name?
Hi, I've been working on a Scala-based API for Cassandra. I've built it directly on top of Thrift. I'm having a problem getting a slice of a superColumn. When I get a ColumnOrSuperColumn back, call 'cos.super_column.name', and deserialize the bytes, I'm not getting the expected output. Here's what's in Cassandra: --- RowKey: key = (super_column=super-col-0, (column=column, value=76616c756530, timestamp=1300330948240) (column=column1, value=76616c756530, timestamp=1300330948244)) ... and this is the deserialized string: ? get_slice super-col-0 column value0 .?æ? column1 value0 .?æ? super-col-1 column value1 .?æ? column1 value1 .?æ? super-col-2 column value2 .?æ? column1 value2 .?æ? super-col-3 column value3 .?æ? column1 value3 .?æ? I would expect super-col-0. Any ideas on what I'm doing wrong? Thanks, Mike
Re: AW: problems while TimeUUIDType-index-querying with two expressions
Thanks for tracking that down, Roland. I've created https://issues.apache.org/jira/browse/CASSANDRA-2347 to fix this. On Wed, Mar 16, 2011 at 10:37 AM, Roland Gude roland.g...@yoochoose.com wrote: I have applied the suggested changes in my local source tree and ran all my test cases (the supplied ones as well as those with real data). They do work now. From: Roland Gude [mailto:roland.g...@yoochoose.com] Sent: Wednesday, 16 March 2011 16:29 To: user@cassandra.apache.org Subject: AW: AW: problems while TimeUUIDType-index-querying with two expressions While debugging into it I found something that might be the issue (please correct me if I am wrong): In ColumnFamilyStore.java, lines 1597 to 1613 contain the code that checks whether a column satisfies an index expression. Line 1608 compares the value of the column with the value given in the expression. For this comparison it uses the comparator of the column family, while it should use the comparator of the column validation class.

private static boolean satisfies(ColumnFamily data, IndexClause clause, IndexExpression first)
{
    for (IndexExpression expression : clause.expressions)
    {
        // (we can skip first since we already know it's satisfied)
        if (expression == first)
            continue;
        // check column data vs expression
        IColumn column = data.getColumn(expression.column_name);
        if (column == null)
            return false;
        int v = data.getComparator().compare(column.value(), expression.value);
        if (!satisfies(v, expression.op))
            return false;
    }
    return true;
}

Line 1608 should be changed from:
    int v = data.getComparator().compare(column.value(), expression.value);
to:
    int v = data.metadata().getValueValidator(expression.column_name).compare(column.value(), expression.value);
Greetings, Roland From: Roland Gude [mailto:roland.g...@yoochoose.com] Sent: Wednesday, 16 March 2011 14:50 To: user@cassandra.apache.org Subject: AW: AW: problems while TimeUUIDType-index-querying with two expressions Hi Aaron, now I am completely confused. The code that did not work for days now, like a miracle, works even against the unpatched Cassandra 0.7.3, but the testcase still does not. There seems to be some randomness in whether it works or not (which is a bad sign, I think). I will debug a little deeper into this and report anything I find. Greetings, Roland From: aaron morton [mailto:aa...@thelastpickle.com] Sent: Wednesday, 16 March 2011 01:15 To: user@cassandra.apache.org Subject: Re: AW: problems while TimeUUIDType-index-querying with two expressions I have attached a patch to https://issues.apache.org/jira/browse/CASSANDRA-2328 Can you give it a try? You should not get an InvalidRequestException when you send an invalid name or value in the query expression. Aaron On 16 Mar 2011, at 10:30, aaron morton wrote: Will have the Jira I created finished soon; it's a legitimate issue -- we should be validating the column names and values when a get_indexed_slice() request is sent. The error in your original email shows that. WRT your code example: you are using the TimeUUID validator for the column name when creating the index expression, but are using a string serializer for the value...

IndexedSlicesQuery<String, UUID, String> indexQuery = HFactory
    .createIndexedSlicesQuery(keyspace, stringSerializer, UUID_SERIALIZER, stringSerializer);
indexQuery.addEqualsExpression(MANDATOR_UUID, mandator);

But your schema is saying it is a bytes type: column_metadata=[{column_name: --1000--, validation_class: BytesType, index_name: mandatorIndex, index_type: KEYS}, {column_name: 0001--1000--, validation_class: BytesType, index_name: useridIndex, index_type: KEYS}]; Once I have the patch, can you apply it and run your test again? You may also want to ask on the Hector list whether it automagically checks that you are using the correct types when creating an IndexedSlicesQuery. Aaron On 15 Mar 2011, at 22:41, Roland Gude wrote: Forgot to attach the source code... here it comes. From: Roland Gude [mailto:roland.g...@yoochoose.com] Sent: Tuesday, 15 March 2011 10:39 To: user@cassandra.apache.org Subject: AW: problems while TimeUUIDType-index-querying with two expressions Actually it's not the column values that should be UUIDs in our case, but the column keys. The CF uses TimeUUID ordering and the values are just some ByteArrays. Even with changing the code to use UUIDSerializer instead of serializing the UUIDs manually, the issue still exists. As far as I can see, there is
Cassandra c++ client
Hi All, Does anyone know of a stable C++ client for Cassandra? Thanks, Anurag
Re: Cassandra c++ client
You could try this, https://github.com/posulliv/libcassandra - primal From: Anurag Gujral anurag.guj...@gmail.com To: user@cassandra.apache.org Sent: Wed, March 16, 2011 9:36:25 PM Subject: Cassandra c++ client Hi All, Anyone knows about stable C++ client for cassandra? Thanks Anurag
Re: Cassandra c++ client
libcassandra isn't very active. Since we already have an object pool library, we went with raw Thrift in C++ instead of using any other library. Thanks, Naren On Wed, Mar 16, 2011 at 10:03 PM, Primal Wijesekera primalwijesek...@yahoo.com wrote: You could try this, https://github.com/posulliv/libcassandra - primal -- *From:* Anurag Gujral anurag.guj...@gmail.com *To:* user@cassandra.apache.org *Sent:* Wed, March 16, 2011 9:36:25 PM *Subject:* Cassandra c++ client Hi All, Anyone know about a stable C++ client for cassandra? Thanks Anurag