Re: Upgrade from 1.2.x to 2.0.x, upgradesstables has doubled the size on disk?
Nope, just ran out of disk space... So on 1.2.x I had 70GB used on a 200GB disk and everything was great; with 2.0.x I'm now at 99% used and getting an exception about insufficient disk space while compacting. FML... Dan Washusen On Sun, Dec 31, 2017 at 6:47 AM, Dan Washusen <d...@reactive.org> wrote: > Thanks for the response Jeff. It wasn't snapshots but after running > upgradesstables on all nodes I started a repair and it seems like the file > sizes are reducing: > > INFO [CompactionExecutor:1626] 2017-12-30 19:42:36,065 > CompactionTask.java (line 299) Compacted 2 sstables to > [/appdata/lib/cassandra/data/dp/s_evt/dp-s_evt-jb-302,]. 9,663,696,752 > bytes to 4,834,895,601 (~50% of original) in 3,899,888ms = 1.182320MB/s. > 90,533 > total partitions merged to 45,278. Partition merge counts were {1:23, > 2:45255, } > > Dan Washusen > > > On Sun, Dec 31, 2017 at 1:51 AM, Jeff Jirsa <jji...@gmail.com> wrote: > >> 1.2 to 2.0 was a long time ago for many of us, but I don't recall >> anything that should have doubled size other than perhaps temporarily >> during the sstable rewrite or snapshots (which may be automatic on >> upgrade). >> >> The bloom filters, sstable count, compression ratio in cfstats all look >> similar; only the size is double, so that sorta hints at maybe a snapshot. >> >> You have few sstables, looks like STCS, so it'd be possible that if the >> upgrade is still running, maybe one sstable of the old version still >> (temporarily) exists on disk causing it to be double counted. >> >> >> >> -- >> Jeff Jirsa >> >> >> On Dec 29, 2017, at 4:33 PM, Dan Washusen <d...@reactive.org> wrote: >> >> Hi All, >> We're taking advantage of the lull in traffic to go through a production >> cluster upgrade from 1.2.x (latest) to 2.0.x (latest). We have three nodes >> with a replication factor of three. I've noticed that the 'space used' has >> almost doubled as a result of running 'nodetool upgradesstables'. >> >> Anyone have any ideas? Is that to be expected?
Re: Upgrade from 1.2.x to 2.0.x, upgradesstables has doubled the size on disk?
Thanks for the response Jeff. It wasn't snapshots but after running upgradesstables on all nodes I started a repair and it seems like the file sizes are reducing: INFO [CompactionExecutor:1626] 2017-12-30 19:42:36,065 CompactionTask.java (line 299) Compacted 2 sstables to [/appdata/lib/cassandra/data/dp/s_evt/dp-s_evt-jb-302,]. 9,663,696,752 bytes to 4,834,895,601 (~50% of original) in 3,899,888ms = 1.182320MB/s. 90,533 total partitions merged to 45,278. Partition merge counts were {1:23, 2:45255, } Dan Washusen On Sun, Dec 31, 2017 at 1:51 AM, Jeff Jirsa <jji...@gmail.com> wrote: > 1.2 to 2.0 was a long time ago for many of us, but I don't recall anything > that should have doubled size other than perhaps temporarily during the > sstable rewrite or snapshots (which may be automatic on upgrade). > > The bloom filters, sstable count, compression ratio in cfstats all look > similar; only the size is double, so that sorta hints at maybe a snapshot. > > You have few sstables, looks like STCS, so it'd be possible that if the > upgrade is still running, maybe one sstable of the old version still > (temporarily) exists on disk causing it to be double counted. > > > > -- > Jeff Jirsa > > > On Dec 29, 2017, at 4:33 PM, Dan Washusen <d...@reactive.org> wrote: > > Hi All, > We're taking advantage of the lull in traffic to go through a production > cluster upgrade from 1.2.x (latest) to 2.0.x (latest). We have three nodes > with a replication factor of three. I've noticed that the 'space used' has > almost doubled as a result of running 'nodetool upgradesstables'. > > Anyone have any ideas? Is that to be expected?
Upgrade from 1.2.x to 2.0.x, upgradesstables has doubled the size on disk?
Hi All, We're taking advantage of the lull in traffic to go through a production cluster upgrade from 1.2.x (latest) to 2.0.x (latest). We have three nodes with a replication factor of three. I've noticed that the 'space used' has almost doubled as a result of running 'nodetool upgradesstables'. Anyone have any ideas? Is that to be expected? For comparison, on a node (pre-upgrade): > nodetool cfstats dp.s_evt > Keyspace: dp > Read Count: 190570567 > Read Latency: 2.6280611004164145 ms. > Write Count: 46213651 > Write Latency: 0.08166790944519835 ms. > Pending Tasks: 0 > Column Family: s_evt > SSTable count: 8 > Space used (live): 36269415929 > Space used (total): 36274282945 > SSTable Compression Ratio: 0.2345030140572 > Number of Keys (estimate): 3213696 > Memtable Columns Count: 2934 > Memtable Data Size: 9561951 > Memtable Switch Count: 1974 > Read Count: 190570567 > Read Latency: 2.628 ms. > Write Count: 46213651 > Write Latency: 0.082 ms. > Pending Tasks: 0 > Bloom Filter False Positives: 1162636 > Bloom Filter False Ratio: 0.73869 > Bloom Filter Space Used: 4492256 > Compacted row minimum size: 373 > Compacted row maximum size: 1996099046 > Compacted row mean size: 63595 > Average live cells per slice (last five minutes): 11.0 > Average tombstones per slice (last five minutes): 0.0 And after upgrading and running 'upgradesstables' (different node): > nodetool cfstats dp.s_evt > Keyspace: dp > Read Count: 1461617 > Read Latency: 4.9734411921864625 ms. > Write Count: 359250 > Write Latency: 0.11328054279749478 ms. 
> Pending Tasks: 0 > Table: s_evt > SSTable count: 6 > Space used (live), bytes: 71266932602 > Space used (total), bytes: 71266932602 > Off heap memory used (total), bytes: 44853104 > SSTable Compression Ratio: 0.2387480210082192 > Number of keys (estimate): 3307776 > Memtable cell count: 603223 > Memtable data size, bytes: 121913569 > Memtable switch count: 9 > Local read count: 1461617 > Local read latency: 7.248 ms > Local write count: 359250 > Local write latency: 0.110 ms > Pending tasks: 0 > Bloom filter false positives: 2501 > Bloom filter false ratio: 0.01118 > Bloom filter space used, bytes: 4135248 > Bloom filter off heap memory used, bytes: 4135200 > Index summary off heap memory used, bytes: 723576 > Compression metadata off heap memory used, bytes: 39994328 > Compacted partition minimum bytes: 536 > Compacted partition maximum bytes: 2874382626 > Compacted partition mean bytes: 108773 > Average live cells per slice (last five minutes): 11.0 > Average tombstones per slice (last five minutes): 17.0 Column family definition: > create column family s_evt with column_type = 'Super' and comparator = > 'TimeUUIDType' and subcomparator = 'UTF8Type'; Also curious why the 'Average tombstones per slice' value has gone from 0 to 17. Not sure if it's relevant, but way back we used to write values to that (super) column family with a TTL; for a long time now it's been append-only (with no TTL)... Thanks, Dan
Re: which high level Java client
Pelops is a very thin wrapper over the Thrift client so that could be a good option. You could also check out a fork by the VMware/Spring guys which adds full async support: https://github.com/andrewswan/scale7-pelops. I'm not sure on the state of it, but it seems promising... On Thursday, 28 June 2012 at 11:04 AM, James Pirz wrote: Dear all, I am interested in using Cassandra 1.1.1 in a read-intensive scenario, where more than 95% of my operations are get(). I have a cluster with ~10 nodes, around 15-20 GB of data on each, while in the extreme case I expect to have 20-40 concurrent clients. I am kind of confused about which high-level Java client I should use (which one is the best/fastest for concurrent read operations?): Hector, Pelops, Astyanax, or something else? I browsed the mailing list, but I came across different types of arguments and conclusions about the various clients. Thanks in advance, James
Re: Configuring cassandra cluster with host preferences
It's not possible 'out of the box' but you could implement your own org.scale7.cassandra.pelops.pool.CommonsBackedPool.INodeSelectionStrategy that chooses the desired node. -- Dan Washusen Make big files fly visit digitalpigeon.com (http://digitalpigeon.com) On Tuesday, 15 May 2012 at 3:23 AM, Oleg Dulin wrote: I am running my processes on the same nodes as Cassandra. What I'd like to do is when I get a connection from Pelops, it gives preference to the Cassandra node local to the host my process is on. Is it possible ? How ? Regards, Oleg Dulin Please note my new office #: 732-917-0159
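To make the suggestion concrete, here is a minimal sketch of a "prefer the local node" selection strategy. The interface below is a simplified stand-in, not the actual org.scale7.cassandra.pelops.pool.CommonsBackedPool.INodeSelectionStrategy signature, so treat the shape as an assumption:

```java
import java.util.List;

// Sketch: a node selection strategy that prefers the node whose address
// matches the local host, falling back to the first live node otherwise.
// The NodeSelectionStrategy interface here is hypothetical; the real
// Pelops interface differs.
public class LocalNodePreference {
    interface NodeSelectionStrategy {
        String select(List<String> liveNodes);
    }

    static NodeSelectionStrategy preferLocal(String localAddress) {
        return nodes -> nodes.stream()
                .filter(localAddress::equals)      // pick the local node if it's live
                .findFirst()
                .orElse(nodes.isEmpty() ? null : nodes.get(0));
    }
}
```

The same idea applies to any pool that lets you plug in node selection: match against the address of the host the process runs on, and degrade gracefully when that node is down.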
Re: Cassandra Clients for Java
I've added some comments/questions inline... Cheers, -- Dan Washusen On Saturday, 18 June 2011 at 8:02 AM, Daniel Colchete wrote: Good day everyone! I'm getting started with a new project and I'm thinking about using Cassandra because of its distributed quality and because of its performance. I'm using Java on the back-end. There are many, many things being said about the Java high-level clients for Cassandra on the web. To be frank, I see problems with all of the Java clients. For example, Hector and Scale7-pelops have new semantics on them that are neither Java's nor Cassandra's, and I don't see much gain from it apart from the fact that it is more complex. Also, I was hoping to go with something that was annotation-based so that it wouldn't be necessary to write boilerplate code (again, no gain). I'm interested in hearing more on your comment regarding Hector and Pelops adding little but complexity; could you add a little context to the comment? If you're coming from an ORM framework like Hibernate then doing simple tasks may seem cumbersome. However, once you've wrapped your head around Cassandra's read and write concepts, both Hector and Pelops seem relatively straightforward (to me)...? Also, a quick look at the Hector wiki suggests that they have some form of annotation support (https://github.com/rantav/hector/wiki/Using-the-EntityManager). Demoiselle Cassandra seems to be one option but I couldn't find a download for it. I'm new to Java on the back-end and I find that Maven is too much to learn just because of a client library. Also it seems to be hard to integrate with the other things I use on my project (GWT, GWT-platform, Google Eclipse Plugin). Kundera looks great but besides not having a download link (the Google Site links to GitHub, which links back to the Google Site, but no download) its information is scattered across many blog posts, some of them saying things I couldn't find on its website.
One says it uses Lucandra for indexes, but that is the only place talking about it; there's no documentation about using it. It also doesn't seem to support Cassandra 0.8. Does it? It's my understanding that Lucandra has been superseded by Solandra (https://github.com/tjake/Solandra). I would like to hear from the users here what worked for you guys. Some real-world project in production that was good to write in Java, where the client was stable and is maintained. What are the success stories of using Cassandra with Java? What would you recommend? Pelops is used successfully on (at least) fightmymonster.com and digitalpigeon.com. Pelops is actively developed and maintained by those two companies plus contributors. Hector looks like it's backed by DataStax, which would seem to be a pretty compelling selling point. Thank you very much! Best, -- Dani Cloud3 Tech - http://cloud3.tc/ Twitter: @DaniCloud3 @Cloud3Tech
0.7.5 Debian packages - can't upgrade?
Hey all, I can't seem to upgrade to 0.7.5 using the Debian packages. Here's what I've done... Edited sources.list and changed unstable to 07x. deb http://www.apache.org/dist/cassandra/debian 07x main deb-src http://www.apache.org/dist/cassandra/debian 07x main Added the new key. sudo gpg --keyserver pgp.mit.edu --recv-keys 2B5C1B00 sudo gpg --export --armor 2B5C1B00 | sudo apt-key add - Ran the usual commands. sudo aptitude update sudo aptitude safe-upgrade The upgrade shows this: Reading package lists... Done Building dependency tree Reading state information... Done Reading extended state information Initializing package states... Done No packages will be installed, upgraded, or removed. 0 packages upgraded, 0 newly installed, 0 to remove and *1 not upgraded*. Need to get 0B of archives. After unpacking 0B will be used. Reading package lists... Done Building dependency tree Reading state information... Done Reading extended state information Initializing package states... Done The above mentions that 1 package wasn't upgraded (I assume this is 0.7.5). Anyone have any ideas what I'm doing wrong? Cheers, Dan
Re: 0.7.5 Debian packages - can't upgrade?
Thanks for the response. :) I should have also mentioned that I'm running this on Ubuntu Karmic Koala (9.10). The output of `sudo aptitude full-upgrade` looks the same as safe-upgrade: Reading package lists... Done Building dependency tree Reading state information... Done Reading extended state information Initializing package states... Done No packages will be installed, upgraded, or removed. 0 packages upgraded, 0 newly installed, 0 to remove and 1 not upgraded. Need to get 0B of archives. After unpacking 0B will be used. Reading package lists... Done Building dependency tree Reading state information... Done Reading extended state information Initializing package states... Done Here is the output of 'apt-cache policy cassandra': http://pastebin.com/PqRiGmWi On 30 April 2011 11:18, Eric Evans eev...@rackspace.com wrote: On Sat, 2011-04-30 at 09:34 +1000, Dan Washusen wrote: sudo aptitude update sudo aptitude safe-upgrade The upgrade shows this: [...] The above mentions that 1 package wasn't upgraded (I assume this is 0.7.5). Anyone have any ideas what I'm doing wrong? Usually this means that upgrading would install a new package (i.e. that it picked up a new dependency), which shouldn't be the case. You might try an `aptitude full-upgrade' just to see what that might be. You could also try pasting the output of `apt-cache policy cassandra' to the list. -- Eric Evans eev...@rackspace.com
Re: RE: batch_mutate failed: out of sequence response
It turns out that once a TProtocolException is thrown from Cassandra the connection is useless for future operations. Pelops was closing connections when it detected TimedOutException, TTransportException and UnavailableException, but not TProtocolException. We have now changed Pelops to close connections in all cases *except* NotFoundException. Cheers, -- Dan Washusen On Friday, 8 April 2011 at 7:28 AM, Dan Washusen wrote: Pelops uses a single connection per operation from a pool that is backed by Apache Commons Pool (assuming you're using Cassandra 0.7). I'm not saying it's perfect but it's NOT sharing a connection over multiple threads. Dan Hendry mentioned that he sees these errors. Is he also using Pelops? From his comment about retrying I'd assume not... -- Dan Washusen On Thursday, 7 April 2011 at 7:39 PM, Héctor Izquierdo Seliva wrote: On Wed, 06-04-2011 at 21:04 -0500, Jonathan Ellis wrote: out of sequence response is Thrift's way of saying I got a response for request Y when I expected request X. my money is on using a single connection from multiple threads. don't do that. I'm not using Thrift directly, and my application is single-threaded, so I guess this is Pelops' fault somehow. Since I managed to tame memory consumption the problem has not appeared again, but it always happened during a stop-the-world GC. Could it be that the message was sent instead of being dropped by the server when the client assumed it had timed out?
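The rule described above reduces to a simple predicate over the exception type. Here is a minimal sketch; exception names are passed as strings so the snippet stays self-contained, whereas the real Pelops code inspects the actual Thrift exception classes:

```java
// Sketch of the connection-invalidation rule: after the fix, a pooled
// connection is closed on every exception type except NotFoundException,
// since anything else (TimedOutException, TTransportException,
// UnavailableException, TProtocolException, ...) may leave the Thrift
// transport with unread bytes and out-of-sync sequence numbers.
public class ConnectionPolicy {
    static boolean shouldCloseConnection(String exceptionName) {
        // NotFoundException is a normal "no such column/row" result;
        // the connection itself is still healthy and can be reused.
        return !"NotFoundException".equals(exceptionName);
    }
}
```

The design choice is deliberately pessimistic: it is cheaper to discard a possibly-healthy connection than to return a corrupt one to the pool and poison a later, unrelated request.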
Re: RE: batch_mutate failed: out of sequence response
An example scenario (that is now fixed in Pelops):
1. Attempt to write a column with a null value.
2. Cassandra throws a TProtocolException, which renders the connection useless for future operations.
3. Pelops returns the corrupt connection to the pool.
4. A second read operation is attempted with the corrupt connection and Cassandra throws an ApplicationException.
A Pelops test case for this can be found here: https://github.com/s7/scale7-pelops/blob/3fe7584a24bb4b62b01897a814ef62415bd2fe43/src/test/java/org/scale7/cassandra/pelops/MutatorIntegrationTest.java#L262 Cheers, -- Dan Washusen On Tuesday, 19 April 2011 at 10:28 AM, Jonathan Ellis wrote: Any idea what's causing the original TPE? On Mon, Apr 18, 2011 at 6:22 PM, Dan Washusen d...@reactive.org wrote: It turns out that once a TProtocolException is thrown from Cassandra the connection is useless for future operations. Pelops was closing connections when it detected TimedOutException, TTransportException and UnavailableException, but not TProtocolException. We have now changed Pelops to close connections in all cases *except* NotFoundException. Cheers, -- Dan Washusen On Friday, 8 April 2011 at 7:28 AM, Dan Washusen wrote: Pelops uses a single connection per operation from a pool that is backed by Apache Commons Pool (assuming you're using Cassandra 0.7). I'm not saying it's perfect but it's NOT sharing a connection over multiple threads. Dan Hendry mentioned that he sees these errors. Is he also using Pelops? From his comment about retrying I'd assume not... -- Dan Washusen On Thursday, 7 April 2011 at 7:39 PM, Héctor Izquierdo Seliva wrote: On Wed, 06-04-2011 at 21:04 -0500, Jonathan Ellis wrote: out of sequence response is Thrift's way of saying I got a response for request Y when I expected request X. my money is on using a single connection from multiple threads. don't do that. I'm not using Thrift directly, and my application is single-threaded, so I guess this is Pelops' fault somehow.
Since I managed to tame memory consumption the problem has not appeared again, but it always happened during a stop-the-world GC. Could it be that the message was sent instead of being dropped by the server when the client assumed it had timed out? -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
Re: Atomicity Strategies
Here's a good writeup on how fightmymonster.com does it... http://ria101.wordpress.com/category/nosql-databases/locking/ -- Dan Washusen Make big files fly visit digitalpigeon.com On Saturday, 9 April 2011 at 11:53 AM, Alex Araujo wrote: On 4/8/11 5:46 PM, Drew Kutcharian wrote: I'm interested in this too, but I don't think this can be done with Cassandra alone. Cassandra doesn't support transactions. I think hector can retry operations, but I'm not sure about the atomicity of the whole thing. On Apr 8, 2011, at 1:26 PM, Alex Araujo wrote: Hi, I was wondering if there are any patterns/best practices for creating atomic units of work when dealing with several column families and their inverted indices. For example, if I have Users and Groups column families and did something like: Users.insert( user_id, columns ) UserGroupTimeline.insert( group_id, { timeuuid() : user_id } ) UserGroupStatus.insert( group_id + : + user_id, { Active : True } ) UserEvents.insert( timeuuid(), { user_id : user_id, group_id : group_id, event_type : join } ) Would I want the client to retry all subsequent operations that failed against other nodes after n succeeded, maintain an undo queue of operations to run, batch the mutations and choose a strong consistency level, some combination of these/others, etc? Thanks, Alex Thanks Drew. I'm familiar with lack of transactions and have read about people using ZK (possibly Cages as well?) to accomplish this, but since it seems that inverted indices are common place I'm interested in how anyone is mitigating lack of atomicity to any extent without the use of such tools. It appears that Hector and Pelops have retrying built in to their APIs and I'm fairly confident that proper use of those capabilities may help. Just trying to cover all bases. Hopefully someone can share their approaches and/or experiences. Cheers, Alex.
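One way to make the "retry subsequent operations" idea from this thread concrete: run the mutations in order, retry each failed step a bounded number of times, and report the steps that never succeeded so the caller can queue an undo/repair job. This is a generic sketch under the assumption that each step is idempotent; it is not Hector or Pelops API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Sketch: best-effort multi-column-family write with bounded retries.
// Each step returns true on success; run() returns the indexes of steps
// that still failed after maxRetries extra attempts, so the caller can
// log them to an undo/repair queue.
public class BestEffortBatch {
    static List<Integer> run(List<Supplier<Boolean>> steps, int maxRetries) {
        List<Integer> failed = new ArrayList<>();
        for (int i = 0; i < steps.size(); i++) {
            boolean ok = false;
            for (int attempt = 0; attempt <= maxRetries && !ok; attempt++) {
                try {
                    ok = steps.get(i).get();   // one mutation, e.g. an index insert
                } catch (RuntimeException e) {
                    // swallow and retry; a real client would back off here
                }
            }
            if (!ok) failed.add(i);
        }
        return failed;
    }
}
```

Because Cassandra writes are idempotent (last write wins per column timestamp), replaying a step that may have already succeeded is safe, which is what makes this retry-heavy pattern viable without transactions.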
Re: RE: batch_mutate failed: out of sequence response
Pelops uses a single connection per operation from a pool that is backed by Apache Commons Pool (assuming you're using Cassandra 0.7). I'm not saying it's perfect but it's NOT sharing a connection over multiple threads. Dan Hendry mentioned that he sees these errors. Is he also using Pelops? From his comment about retrying I'd assume not... -- Dan Washusen On Thursday, 7 April 2011 at 7:39 PM, Héctor Izquierdo Seliva wrote: On Wed, 06-04-2011 at 21:04 -0500, Jonathan Ellis wrote: out of sequence response is Thrift's way of saying I got a response for request Y when I expected request X. my money is on using a single connection from multiple threads. don't do that. I'm not using Thrift directly, and my application is single-threaded, so I guess this is Pelops' fault somehow. Since I managed to tame memory consumption the problem has not appeared again, but it always happened during a stop-the-world GC. Could it be that the message was sent instead of being dropped by the server when the client assumed it had timed out?
Re: RE: batch_mutate failed: out of sequence response
Pelops will retry when TimedOutException, TTransportException or UnavailableException exceptions are thrown, but not TApplicationException. TApplicationException has a type property which looks like it could be used to retry based on specific values. Based on the names, the INTERNAL_ERROR and BAD_SEQUENCE_ID types sound like good candidates for a retry. I just did a quick hunt through the Pycassa and Hector code and it doesn't look like they do anything special based on the type property. Jonathan (or other Cassandra gurus): should connection managers take different actions based on the type property of TApplicationException? Cheers, Dan On Wednesday, 6 April 2011 at 8:03 PM, Héctor Izquierdo Seliva wrote: On Wed, 06-04-2011 at 09:06 +1000, Dan Washusen wrote: Pelops raises a RuntimeException? Can you provide more info please? org.scale7.cassandra.pelops.exceptions.ApplicationException: batch_mutate failed: out of sequence response -- Dan Washusen Make big files fly visit digitalpigeon.com On Tuesday, 5 April 2011 at 11:43 PM, Héctor Izquierdo Seliva wrote: On Tue, 05-04-2011 at 09:35 -0400, Dan Hendry wrote: I too have seen the out of sequence response problem. My solution has just been to retry and it seems to work. None of my mutations are THAT large (< 200 columns). The only related information I could find points to a thrift/ubuntu bug of some kind (http://markmail.org/message/xc3tskhhvsf5awz7). What OS are you running? Dan Hi Dan. I'm running on Debian stable and Cassandra 0.7.4. I have rows with up to 1000 columns. I have changed the way I was doing the batch mutates to never be bigger than 100 columns at a time. I hope this will work, otherwise the move is going to take too long. The problem is aggravated by Pelops not retrying automatically and instead raising a RuntimeException. I'll try to add a retry if this doesn't work. Thanks for your response!
Héctor -Original Message- From: Héctor Izquierdo Seliva [mailto:izquie...@strands.com] Sent: April-05-11 8:30 To: user@cassandra.apache.org Subject: batch_mutate failed: out of sequence response Hi everyone. I'm having trouble while inserting big amounts of data into Cassandra. I'm getting this exception: batch_mutate failed: out of sequence response I'm guessing it's due to very big mutates. I have made the batch mutates smaller and it seems to be behaving. Can somebody shed some light? Thanks! No virus found in this incoming message. Checked by AVG - www.avg.com Version: 9.0.894 / Virus Database: 271.1.1/3551 - Release Date: 04/05/11 02:34:00
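The retry decision Dan raises above could be sketched as below. The numeric codes mirror Thrift's TApplicationException type constants (BAD_SEQUENCE_ID = 4, INTERNAL_ERROR = 6); verify them against your Thrift version before relying on them:

```java
// Sketch: deciding whether to retry based on the type property of a
// TApplicationException, as proposed in the thread. The constants below
// are assumed to match Thrift's TApplicationException type codes.
public class RetryDecision {
    static final int BAD_SEQUENCE_ID = 4;  // response seq-id mismatch
    static final int INTERNAL_ERROR = 6;   // unexpected server-side failure

    // Only the two codes named in the thread are treated as retryable;
    // everything else (unknown method, invalid message type, ...) indicates
    // a client bug that a retry would not fix.
    static boolean retryApplicationException(int type) {
        return type == BAD_SEQUENCE_ID || type == INTERNAL_ERROR;
    }
}
```

Note that a BAD_SEQUENCE_ID retry is only safe on a fresh connection; as established earlier in the thread, the old connection must be discarded first.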
Re: RE: batch_mutate failed: out of sequence response
Pelops raises a RuntimeException? Can you provide more info please? -- Dan Washusen Make big files fly visit digitalpigeon.com On Tuesday, 5 April 2011 at 11:43 PM, Héctor Izquierdo Seliva wrote: On Tue, 05-04-2011 at 09:35 -0400, Dan Hendry wrote: I too have seen the out of sequence response problem. My solution has just been to retry and it seems to work. None of my mutations are THAT large (< 200 columns). The only related information I could find points to a thrift/ubuntu bug of some kind (http://markmail.org/message/xc3tskhhvsf5awz7). What OS are you running? Dan Hi Dan. I'm running on Debian stable and Cassandra 0.7.4. I have rows with up to 1000 columns. I have changed the way I was doing the batch mutates to never be bigger than 100 columns at a time. I hope this will work, otherwise the move is going to take too long. The problem is aggravated by Pelops not retrying automatically and instead raising a RuntimeException. I'll try to add a retry if this doesn't work. Thanks for your response! Héctor -Original Message- From: Héctor Izquierdo Seliva [mailto:izquie...@strands.com] Sent: April-05-11 8:30 To: user@cassandra.apache.org Subject: batch_mutate failed: out of sequence response Hi everyone. I'm having trouble while inserting big amounts of data into Cassandra. I'm getting this exception: batch_mutate failed: out of sequence response I'm guessing it's due to very big mutates. I have made the batch mutates smaller and it seems to be behaving. Can somebody shed some light? Thanks!
Re: OOM during compaction with half the heap still available?
Ah, it would appear I forgot to do that on the Hudson machine. Thanks! -- Dan Washusen On Friday, 25 March 2011 at 2:23 PM, Jonathan Ellis wrote: Have you run nodetool scrub? The data versioning problem scrub fixes can manifest itself as trying to read GB of data into memory during compaction. On Thu, Mar 24, 2011 at 8:52 PM, Dan Washusen d...@reactive.org wrote: Hey All, I've noticed that the Cassandra instance I have running on our build machine occasionally crashes with an OOM error during compaction. I'm going to dial down the memtable thresholds etc. but I was wondering if anyone could help explain the heap usage at the time of the crash. I just happened to leave a JMX console window open and it's showing that just before the crash roughly 50% of the heap was still available. Screenshot of heap usage: http://img3.imageshack.us/img3/2822/memoryf.png The above screenshot was taken a few weeks ago on Cassandra 0.7.2 (I think) with mmap disabled on Java 1.6.0_24-b07. I'm now running 0.7.4 with mmap enabled and just got the same crash (based on the error message in the log)... Log snippet from crash that matches screenshot: http://pastebin.com/ACa8fKUu Log snippet from 0.7.4 crash: http://pastebin.com/SwSQawUM Cheers, -- Dan Washusen -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
Re: I: Re: Are row-keys sorted by the compareWith?
Pelops moved to github several months ago... https://github.com/s7/scale7-pelops/blob/master/src/main/java/org/scale7/cassandra/pelops/Selector.java#L1179 Cheers, -- Dan Washusen On Wednesday, 2 March 2011 at 3:35 AM, Matthew Dennis wrote: I'm not really familiar with pelops code, but I found two implementations (~ line 454 and ~ line 559) of getColumnsFromRows in Selector.java in pelops trunk. The first uses a HashMap so it clearly isn't ordered, the second uses a LinkedHashMap but it inserts the keys in the order returned by C* which we already know isn't ordered. See http://bit.ly/egZaXi for relevant code. Like I said, I'm not really familiar with pelops so I could be completely off on this, but it looks like if pelops was intending to preserve the order of the requested keys that it's not actually doing it... On Wed, Feb 23, 2011 at 3:44 PM, Dan Washusen d...@reactive.org wrote: Hi Matthew, As you mention the map returned from multiget_slice is not order preserving, Pelops is doing this on the client side... Cheers, Dan -- Dan Washusen Sent with Sparrow On Wednesday, 23 February 2011 at 8:38 PM, Matthew Dennis wrote: The map returned by multiget_slice (what I suspect is the underlying thrift call for getColumnsFromRows) is not a order preserving map, it's a HashMap so the order of the returned results cannot be depended on. Even if it was a order preserving map, not all languages would be able to make use of the results since not all languages have ordered maps (though many, including Java, certainly do). That being said, it would be fairly easy to change this on the C* side to preserve the order the keys were requested in, though as mentioned not all clients could take advantage of it. On Mon, Feb 21, 2011 at 4:09 PM, cbert...@libero.it cbert...@libero.it wrote: As Jonathan mentions the compareWith on a column family def. is defines the order for the columns *within* a row... 
In order to control the ordering of rows you'll need to use the OrderPreservingPartitioner (http://www.datastax.com/docs/0.7/operations/clustering#tokens-partitioners-ring). Thanks for your answer and for your time, I will take a look at this. As for getColumnsFromRows; it should be returning you a map of lists. The map is insertion-order-preserving and populated based on the provided list of row keys (so if you iterate over the entries in the map they should be in the same order as the list of row keys). mmm ... well it didn't happen like this. In my code I had a CF named Comments and also a CF called UserComments. UserComments uses a UUID as row key to keep the pointers to the user's comments sorted by TimeUUID. When I get the sorted list of keys from UserComments and use this list as the row-keys list in getColumnsFromRows, I don't get the data back sorted as I expect. It looks as if Cassandra/Pelops does not care how I provide the row-keys list. I am sure about that because I did something different: I iterated over my row-keys list and made many getColumnFromRow calls instead of one getColumnsFromRows, and when I iterate, the data are correctly sorted. But this cannot be a solution ... I am using Cassandra 0.6.9. Let me take advantage of your knowledge of Pelops to ask you something: I am evaluating the migration to Cassandra 0.7 ... as far as you know, in terms of written code, is it a heavy job? Best Regards Carlo Original message Da: d...@reactive.org On Saturday, 19 February 2011 at 8:16 AM, cbert...@libero.it wrote: Hi all, I created a CF in which I need to get the rows inside sorted by time. Each row represents a comment. <ColumnFamily Name="Comments" CompareWith="TimeUUIDType" /> I've created a few rows using a generated TimeUUID as the row key, but when I call the Pelops method getColumnsFromRows I don't get the data back as I expect: rows are not sorted by TimeUUID. I thought it was probably because of the random part of the TimeUUID, so I created a new CF ...
<ColumnFamily Name="Comments2" CompareWith="LongType" /> This time I created a few rows using the Java System.currentTimeMillis(), which returns a long. I called getColumnsFromRows again and got the same result: the data are not sorted! I've read many times that rows are sorted as specified in the compareWith, but I can't see it. To solve this problem for the moment I've used a super column family with a UNIQUE ROW ... but I think this is just a workaround and not the solution
Re: I: Re: Are row-keys sorted by the compareWith?
Hi Matthew, As you mention, the map returned from multiget_slice is not order preserving; Pelops is doing this on the client side... Cheers, Dan -- Dan Washusen Sent with Sparrow On Wednesday, 23 February 2011 at 8:38 PM, Matthew Dennis wrote: The map returned by multiget_slice (what I suspect is the underlying thrift call for getColumnsFromRows) is not an order-preserving map, it's a HashMap, so the order of the returned results cannot be depended on. Even if it were an order-preserving map, not all languages would be able to make use of the results since not all languages have ordered maps (though many, including Java, certainly do). That being said, it would be fairly easy to change this on the C* side to preserve the order the keys were requested in, though as mentioned not all clients could take advantage of it. On Mon, Feb 21, 2011 at 4:09 PM, cbert...@libero.it cbert...@libero.it wrote: As Jonathan mentions, the compareWith on a column family def. defines the order for the columns *within* a row... In order to control the ordering of rows you'll need to use the OrderPreservingPartitioner (http://www.datastax.com/docs/0.7/operations/clustering#tokens-partitioners-ring). Thanks for your answer and for your time, I will take a look at this. As for getColumnsFromRows; it should be returning you a map of lists. The map is insertion-order-preserving and populated based on the provided list of row keys (so if you iterate over the entries in the map they should be in the same order as the list of row keys). mmm ... well it didn't happen like this. In my code I had a CF named Comments and also a CF called UserComments. UserComments uses a UUID as row key to keep the pointers to the user's comments sorted by TimeUUID. When I get the sorted list of keys from UserComments and use this list as the row-keys list in getColumnsFromRows, I don't get the data back sorted as I expect.
It looks as if Cassandra/Pelops does not care how I provide the row-keys list. I am sure about that because I did something different: I iterated over my row-keys list and made many getColumnFromRow calls instead of one getColumnsFromRows, and when I iterate, the data are correctly sorted. But this cannot be a solution ... I am using Cassandra 0.6.9. Let me take advantage of your knowledge of Pelops to ask you something: I am evaluating the migration to Cassandra 0.7 ... as far as you know, in terms of written code, is it a heavy job? Best Regards Carlo Original message Da: d...@reactive.org On Saturday, 19 February 2011 at 8:16 AM, cbert...@libero.it wrote: Hi all, I created a CF in which I need to get the rows inside sorted by time. Each row represents a comment. <ColumnFamily Name="Comments" CompareWith="TimeUUIDType" /> I've created a few rows using a generated TimeUUID as the row key, but when I call the Pelops method getColumnsFromRows I don't get the data back as I expect: rows are not sorted by TimeUUID. I thought it was probably because of the random part of the TimeUUID, so I created a new CF ... <ColumnFamily Name="Comments2" CompareWith="LongType" /> This time I created a few rows using the Java System.currentTimeMillis(), which returns a long. I called getColumnsFromRows again and got the same result: the data are not sorted! I've read many times that rows are sorted as specified in the compareWith, but I can't see it. To solve this problem for the moment I've used a super column family with a UNIQUE ROW ... but I think this is just a workaround and not the solution. <ColumnFamily Name="Comments" Type="Super" CompareWith="TimeUUIDType" CompareSubcolumnsWith="BytesType"/> Now when I call getSuperColumnsFromRow I get all the super columns as I expected: sorted by TimeUUID. Why doesn't the same happen with the rows? I'm confused. TIA for any help. Best Regards Carlo
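The client-side reordering this thread describes can be sketched in a few lines of plain Java. This is an illustrative example, not the actual Pelops code: walk the *requested* key list and copy matches from the unordered Thrift result (a plain HashMap) into an insertion-order-preserving LinkedHashMap.

```java
import java.util.*;

// Illustrative sketch: rebuild a key-ordered view of multiget_slice results.
// Thrift hands back a plain HashMap, so iteration order is undefined; the
// fix is to iterate the requested key list, not the result map.
public class OrderedMultiget {
    static <K, V> Map<K, V> inRequestedOrder(List<K> requestedKeys, Map<K, V> thriftResult) {
        Map<K, V> ordered = new LinkedHashMap<>();
        for (K key : requestedKeys) {
            if (thriftResult.containsKey(key)) {
                ordered.put(key, thriftResult.get(key));
            }
        }
        return ordered;
    }

    public static void main(String[] args) {
        List<String> requested = Arrays.asList("c", "a", "b");
        Map<String, String> unordered = new HashMap<>();
        unordered.put("a", "row-a");
        unordered.put("b", "row-b");
        unordered.put("c", "row-c");
        Map<String, String> ordered = inRequestedOrder(requested, unordered);
        System.out.println(new ArrayList<>(ordered.keySet())); // [c, a, b]
    }
}
```

Note this only fixes iteration order on the client; it does not (and cannot) make the server return rows sorted by key under the RandomPartitioner.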
Re: Are row-keys sorted by the compareWith?
Hi Carlo, As Jonathan mentions, the compareWith on a column family def. defines the order for the columns *within* a row... In order to control the ordering of rows you'll need to use the OrderPreservingPartitioner (http://www.datastax.com/docs/0.7/operations/clustering#tokens-partitioners-ring). As for getColumnsFromRows; it should be returning you a map of lists. The map is insertion-order-preserving and populated based on the provided list of row keys (so if you iterate over the entries in the map they should be in the same order as the list of row keys). The list for each row entry is definitely in the order that Cassandra provides them; take a look at org.scale7.cassandra.pelops.Selector#toColumnList if you need more info. Cheers, Dan -- Dan Washusen Sent with Sparrow On Saturday, 19 February 2011 at 8:16 AM, cbert...@libero.it wrote: Hi all, I created a CF in which I need to get the rows inside sorted by time. Each row represents a comment. <ColumnFamily Name="Comments" CompareWith="TimeUUIDType" /> I've created a few rows using a generated TimeUUID as the row key, but when I call the Pelops method getColumnsFromRows I don't get the data back as I expect: rows are not sorted by TimeUUID. I thought it was probably because of the random part of the TimeUUID, so I created a new CF ... <ColumnFamily Name="Comments2" CompareWith="LongType" /> This time I created a few rows using the Java System.currentTimeMillis(), which returns a long. I called getColumnsFromRows again and got the same result: the data are not sorted! I've read many times that rows are sorted as specified in the compareWith, but I can't see it. To solve this problem for the moment I've used a super column family with a UNIQUE ROW ... but I think this is just a workaround and not the solution. <ColumnFamily Name="Comments" Type="Super" CompareWith="TimeUUIDType" CompareSubcolumnsWith="BytesType"/> Now when I call getSuperColumnsFromRow I get all the super columns as I expected: sorted by TimeUUID.
Why doesn't the same happen with the rows? I'm confused. TIA for any help. Best Regards Carlo
Re: Another EOFException
I'm seeing this as well; several column families with keys_cached = 0 on 0.7.1. Debug level logs: http://pastebin.com/qvujKDth -- Dan Washusen On Wednesday, 16 February 2011 at 1:12 PM, Jonathan Ellis wrote: Created https://issues.apache.org/jira/browse/CASSANDRA-2172. On Tue, Feb 15, 2011 at 3:34 PM, B. Todd Burruss bburr...@real.com wrote: it happens when i start the node. just tried it again. here's the saved_caches directory: [cassandra@kv-app02 ~]$ ls -l /data/cassandra-data/saved_caches/ total 12 -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 10:36 NotificationSystem-Events-KeyCache -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 10:36 NotificationSystem-Msgs-KeyCache -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 10:36 NotificationSystem-Rendered-KeyCache -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 10:36 NotificationSystem-ScheduledMsgs-KeyCache -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 10:36 NotificationSystem-ScheduledTimes-KeyCache -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 10:36 NotificationSystem-SystemState-KeyCache -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 10:36 NotificationSystem-Templates-KeyCache -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 10:36 NotificationSystem-Transports-KeyCache -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 10:36 Queues-EmailTransport_Pending-KeyCache -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 10:36 Queues-EmailTransport_Waiting-KeyCache -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 10:36 Queues-Errors_Pending-KeyCache -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 10:36 Queues-Errors_Waiting-KeyCache -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 10:36 Queues-MessageDescriptors-KeyCache -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 10:36 Queues-PipeDescriptors-KeyCache -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 10:36 Queues-Processing_Pending-KeyCache -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 10:36 Queues-Processing_Waiting-KeyCache -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 10:36 Queues-QueueDescriptors-KeyCache -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 10:36 
Queues-QueuePipeCnxn-KeyCache -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 10:36 Queues-QueueStats-KeyCache -rw-rw-r-- 1 cassandra cassandra 38 Feb 15 09:36 system-HintsColumnFamily-KeyCache -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 09:36 system-IndexInfo-KeyCache -rw-rw-r-- 1 cassandra cassandra 5 Feb 15 09:36 system-LocationInfo-KeyCache -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 09:36 system-Migrations-KeyCache -rw-rw-r-- 1 cassandra cassandra 18 Feb 15 09:36 system-Schema-KeyCache -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 09:36 UDS4Profile-ProfileDefinitions-KeyCache -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 09:36 UDS4Profile-ProfileNamespaces-KeyCache -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 09:36 UDS4Profile-Profiles_40229-KeyCache -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 09:36 UDS4Profile-Profiles_RN_test-KeyCache On 02/15/2011 01:01 PM, Jonathan Ellis wrote: Is this reproducible or just I happened to kill the server while it was in the middle of writing out the cache keys? On Tue, Feb 15, 2011 at 1:10 PM, B. Todd Burrussbburr...@real.com wrote: the following exception seems to be about loading saved caches, but i don't really care about the cache so maybe isn't a big deal. 
anyway, this is with patched 0.7.1 (0001-Fix-bad-signed-conversion-from-byte-to-int.patch) WARN 11:07:59,800 error reading saved cache /data/cassandra-data/saved_caches/UDS4Profile-Profiles_40229-KeyCache java.io.EOFException at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2281) at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2750) at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:780) at java.io.ObjectInputStream.<init>(ObjectInputStream.java:280) at org.apache.cassandra.db.ColumnFamilyStore.readSavedCache(ColumnFamilyStore.java:255) at org.apache.cassandra.db.ColumnFamilyStore.<init>(ColumnFamilyStore.java:198) at org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:451) at org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:432) at org.apache.cassandra.db.Table.initCf(Table.java:360) at org.apache.cassandra.db.Table.<init>(Table.java:290) at org.apache.cassandra.db.Table.open(Table.java:107) at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:162) at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:316) at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:79) -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http
Re: seed node failure crash the whole cluster
Hi, I've added some comments and questions inline. Cheers, Dan On 8 February 2011 10:00, Jonathan Ellis jbel...@gmail.com wrote: On Mon, Feb 7, 2011 at 1:51 AM, TSANG Yiu Wing ywts...@gmail.com wrote: cassandra version: 0.7 client library: scale7-pelops / 1.0-RC1-0.7.0-SNAPSHOT cluster: 3 machines (A, B, C) details: it works perfectly when all 3 machines are up and running but if the seed machine is down, the problems happen: 1) new client connection cannot be established sounds like pelops relies on the seed node to introduce it to the cluster. you should configure it either with a hardcoded list of nodes or use something like RRDNS instead. I don't use pelops so I can't help other than that. (I believe there is a mailing list for Pelops though.) When dynamic node discovery is turned on (off by default) it doesn't (shouldn't) rely on the initial seed node once past initialization. So either make sure you have dynamic node discovery turned on or seed Pelops with all nodes in your cluster... It would be helpful if you provided more information about the errors you're seeing, preferably with debug level logging turned on. 2) if a client keeps connecting to and operating at (issue get and update) the cluster, when the seed is down, the working client will throw exception upon the next operation I know Hector supports transparent failover to another Cassandra node. Perhaps Pelops does not. Pelops will validate connections at a configurable period (60 seconds by default) and remove them from the pool. Pelops will also retry the operation three times (configurable) against a different node in the pool each time. If you want Pelops to take more aggressive actions when it detects downed nodes then check out org.scale7.cassandra.pelops.pool.CommonsBackedPool.INodeSuspensionStrategy.
3) using cassandra-cli to connect the remaining nodes in the cluster, Internal error processing get_range_slices will happen when querying column family list cf; Cassandra always logs the cause of internal errors in system.log, so you should look there. -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
Re: Re: R: Re: Ring up but read fails ...
This is a known issue with the Cassandra 0.6 versions of Pelops. The issue was fixed in the 0.7 based versions a few months ago but never back-ported (Dominic, myself and the other contributors don't run 0.6)... On 24 January 2011 05:25, cbert...@libero.it cbert...@libero.it wrote: Reconnect and try again? Sorry what do you mean by Reconnect and try again? -- You mean to shut down the old pool and create a new pool of connections? I don't have the possibility to handle the single connection using Pelops ... From Dominic Williams' blog: To work with a Cassandra cluster, you need to start off by defining a connection pool. This is typically done once in the startup code of your application [...] One of the key design decisions that at the time of writing distinguishes Pelops, is that the data processing code written by developers does not involve connection pooling or management. Instead, classes like Mutator and Selector borrow connections to Cassandra from a Pelops pool for just the periods that they need to read and write to the underlying Thrift API. This has two advantages. Firstly, obviously, code becomes cleaner and developers are freed from connection management concerns. But also more subtly this enables the Pelops library to completely manage connection pooling itself, and for example keep track of how many outstanding operations are currently running against each cluster node. This for example, enables Pelops to perform more effective client load balancing by ensuring that new operations are performed against the node to which it currently has the least outstanding operations running. Because of this architectural choice, it will even be possible to offer strategies in the future where for example nodes are actually queried to determine their load. TIA -- - Carlo -
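The borrow/use/return pattern Dominic's blog describes can be sketched in a few lines. All names here are illustrative, not the actual Pelops implementation, and the real pool also tracks per-node outstanding operations for load balancing, which is omitted:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;

// Minimal sketch of the borrow/use/return connection-pooling pattern:
// calling code never sees a connection object outside the operation.
public class TinyPool<C> {
    private final BlockingQueue<C> idle;

    public TinyPool(Iterable<C> connections) {
        idle = new ArrayBlockingQueue<>(16);
        for (C c : connections) idle.offer(c);
    }

    // Operations borrow a connection only for the duration of the call.
    public <R> R withConnection(Function<C, R> op) {
        try {
            C conn = idle.take();          // block until a connection is free
            try {
                return op.apply(conn);     // run the caller's operation
            } finally {
                idle.put(conn);            // always hand the connection back
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        TinyPool<String> pool = new TinyPool<>(java.util.Arrays.asList("node-a", "node-b"));
        String result = pool.withConnection(conn -> conn + ":get_slice");
        System.out.println(result);
    }
}
```

The try/finally is the important part: the connection goes back to the pool whether the operation succeeds or throws, which is what frees callers from connection management.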
Re: Java client
Pelops is a pretty thin wrapper for the Thrift API. Its thinness has both up and down sides; on the up side it's very easy to map functionality mentioned on the Cassandra API wiki page to functionality provided by Pelops, and it is also relatively simple to add features (thanks to Alois^^ for indexing support). The down side is you often have to deal with the Cassandra Thrift classes like ColumnOrSuperColumn... On 20 January 2011 15:58, Dan Retzlaff dretzl...@gmail.com wrote: My team switched our production stack from Hector to Pelops a while back, based largely on this admittedly subjective programmer experience bit. I've found Pelops' code and abstractions significantly easier to follow and integrate with, plus Pelops has had feature-parity with Hector for all of our use cases. It's quite possible that we just caught Hector during its transition to what Nate calls v2 but for our part, with no disrespect to the Hector community intended, we've been quite happy with the transition. Dan On Wed, Jan 19, 2011 at 3:30 PM, Jonathan Shook jsh...@gmail.com wrote: Perhaps. I use Hector. I have a bit of rework to do moving from .6 to .7. This is something I wasn't anticipating in my earlier planning. Had Pelops been around when I started using Hector, I would have probably chosen it over Hector. The Pelops client seemed to be better conceived as far as programmer experience and simplicity went. Since then, Hector has had a v2 upgrade to their API which breaks much of the things that you would have done in version .6 and before. Conceptually speaking, they appear more similar now than before the Hector changes. I'm dreading having to do a significant amount of work on my client interface because of the incompatible API changes.. but I will have to in order to get my client/server caught up to the currently supported branch. That is just part of the cost of doing business with Cassandra at the moment.
Hopefully after 1.0 on the server and some of the clients, this type of thing will be more unusual. 2011/1/19 Noble Paul നോബിള് नोब्ळ् noble.p...@gmail.com: Thanks everyone. I guess, I should go with Hector On 18 Jan 2011 17:41, Alois Bělaška alois.bela...@gmail.com wrote: Definitely Pelops https://github.com/s7/scale7-pelops 2011/1/18 Noble Paul നോബിള് नोब्ळ् noble.p...@gmail.com What is the most commonly used java client library? Which is the most mature/feature complete? Noble
Re: Range Queries in RP on SCF in 0.7 with UUID SCs
Using the methods on the Bytes class would be preferable. The byte[] related methods on UuidHelper should have been deprecated when the Bytes class was introduced... e.g. new Bytes(col.getName()).toUuid() Cheers, Dan On Thu, Dec 2, 2010 at 10:26 AM, Frank LoVecchio fr...@isidorey.com wrote: Actually, it was a class issue at this line: System.out.println("NAME: " + UUID.nameUUIDFromBytes(col.getName())); The native Pelops class UuidHelper is what should be used. On Wed, Dec 1, 2010 at 4:16 PM, Aaron Morton aa...@thelastpickle.com wrote: When you say I want to get rows starting from a Super Column... it's a bit confusing. Do you want to get super columns from a single row, or multiple rows? I'm assuming you are talking about getting columns from a single row / key as that's what your code does. For the Pelops code, it looks OK but I've not used Pelops. You can turn the logging up on the server and check the command that is sent to it. I would guess there is something wrong with the way you are transforming the start key. For your CLI example, what was the command you executed? Aaron On 02 Dec, 2010, at 11:03 AM, Frank LoVecchio fr...@isidorey.com wrote: Hey Aaron, Yes, in regards to SCF definition, you are correct: name: Sensor column_type: Super compare_with: TimeUUIDType gc_grace_seconds: 864000 keys_cached: 1.0 read_repair_chance: 1.0 rows_cached: 0.0 I'm not quite sure I follow you, though, as I think I'm doing what you specify. The Pelops code is below. Basically, I want to get rows starting from a Super Column with a specific UUID and limit the number, just as you inferred. When I run this code I just get the last N values (25 in this case) if non-reversed, and the first N values if reversed. However, regardless of what start param we use (Super Column UUID is String startKey below), we still get the same values for the specified amount (e.g. the same 25).
public void getSuperRowKeys(String rowKey, String columnFamily, int limit, String startKey) throws Exception {
    byte[] byteArray = UuidHelper.timeUuidStringToBytes(startKey);
    ByteBuffer bb = ByteBuffer.wrap(byteArray);
    new UUID(bb.getLong(), bb.getLong());
    List<SuperColumn> cols = selector.getPageOfSuperColumnsFromRow(columnFamily, rowKey, Bytes.fromByteBuffer(bb), false, limit, ConsistencyLevel.ONE);
    for (SuperColumn col : cols) {
        if (col.getName() != null) {
            System.out.println("NAME: " + UUID.nameUUIDFromBytes(col.getName()));
            for (Column c : col.columns) {
                System.out.println("\t\tName: " + Bytes.toUTF8(c.getName()) + " Value: " + Bytes.toUTF8(c.getValue()) + " timestamp: " + c.timestamp);
            }
        }
    }
}
Here is some example data from the CLI. If we specify 2f814d30-f758-11df-2f81-4d30f75811df as the start param (second super column down), we still get 952e6540-f759-11df-952e-6540f75911df (first super column) returned. = (super_column=952e6540-f759-11df-952e-6540f75911df, (column=64617465, value=323031302d31312d32332032333a32393a30332e303030, timestamp=1290554997141000) (column=65787472615f696e666f, value=6e6f6e65, timestamp=1290554997141000) (column=726561736f6e, value=6e6f6e65, timestamp=1290554997141000) (column=7365636f6e64735f746f5f6e657874, value=373530, timestamp=1290554997141000) (column=73657269616c, value=393135353032353731, timestamp=1290554997141000) (column=737461747573, value=5550, timestamp=1290554997141000) (column=74797065, value=486561727462656174, timestamp=1290554997141000)) = (super_column=2f814d30-f758-11df-2f81-4d30f75811df, (column=64617465, value=323031302d31312d32332032333a31393a30332e303030, timestamp=129055439706) (column=65787472615f696e666f, value=6e6f6e65, timestamp=129055439706) (column=726561736f6e, value=6e6f6e65, timestamp=129055439706) (column=7365636f6e64735f746f5f6e657874, value=373530, timestamp=129055439706) (column=73657269616c, value=393135353032353731, timestamp=129055439706) (column=737461747573,
value=5550, timestamp=129055439706) (column=74797065, value=486561727462656174, timestamp=129055439706)) = (super_column=7c959f00-f757-11df-7c95-9f00f75711df, (column=64617465, value=323031302d31312d32332032333a31343a30332e303030, timestamp=1290554096881000) (column=65787472615f696e666f, value=6e6f6e65, timestamp=1290554096881000) (column=726561736f6e, value=6e6f6e65, timestamp=1290554096881000) (column=7365636f6e64735f746f5f6e657874, value=373530, timestamp=1290554096881000) (column=73657269616c, value=393135353032353731, timestamp=1290554096881000) (column=737461747573, value=5550, timestamp=1290554096881000) (column=74797065, value=486561727462656174, timestamp=1290554096881000)) = (super_column=c9be6330-f756-11df-c9be-6330f75611df,
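Dan's suggestion to use the Bytes class rather than UUID.nameUUIDFromBytes can be illustrated with plain JDK code (an illustrative sketch, not the Pelops implementation): nameUUIDFromBytes computes an MD5 name-based (version 3) UUID from the input rather than reinterpreting the 16 bytes, which is why the printed super column names didn't match. To round-trip a UUID through bytes, read the two longs back out instead.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

// Pure-JDK illustration of decoding a 16-byte column name back into a UUID.
public class UuidBytes {
    static byte[] toBytes(UUID uuid) {
        ByteBuffer bb = ByteBuffer.allocate(16);
        bb.putLong(uuid.getMostSignificantBits());
        bb.putLong(uuid.getLeastSignificantBits());
        return bb.array();
    }

    static UUID fromBytes(byte[] bytes) {
        ByteBuffer bb = ByteBuffer.wrap(bytes);
        return new UUID(bb.getLong(), bb.getLong());
    }

    public static void main(String[] args) {
        UUID original = UUID.randomUUID();
        byte[] raw = toBytes(original);
        System.out.println(original.equals(fromBytes(raw)));              // true: round-trips
        System.out.println(original.equals(UUID.nameUUIDFromBytes(raw))); // false: MD5-hashed
    }
}
```

A Bytes.toUuid-style helper presumably does the same two-long read; nameUUIDFromBytes is only appropriate when you actually want a deterministic hash-derived UUID.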
cassandra-cli multiline commands?
I notice CASSANDRA-1742 mentions support for commands that span multiple lines in cassandra-cli. Did it make it in? If so what's the syntax? Cheers, Dan
Re: HintedHandoff and ReplicationFactor with a downed node
The last time this came up on the list Jonathan Ellis said (something along the lines of) if your application can't tolerate stale data then you should read with a consistency level of QUORUM. It would be nice if there were some sort of middle ground where an application that can tolerate slightly stale data (minutes) but not very stale data (hours or days) could still get the performance gain of a consistency level of ONE. Even if a node just made a best effort in the OP's scenario it might be sufficient...? Is there an alternative solution to reading with a consistency level of QUORUM? For example, if a node has been down for an extended period of time could you re-add it as a new node (fetching all its data again) and avoid having to read with QUORUM? Just curious... :) Cheers, Dan On Sat, Oct 23, 2010 at 10:01 AM, Rob Coli rc...@digg.com wrote: On 10/22/10 2:55 PM, Craig Ching wrote: Even better, I'd love a way to not allow B to be available until replication is complete, can I detect that somehow? Proposed and rejected a while back: https://issues.apache.org/jira/browse/CASSANDRA-768 =Rob
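The arithmetic behind the QUORUM advice is the overlap rule: with replication factor N, reading R replicas and writing W replicas guarantees a read sees the latest write only when R + W > N. A small worked sketch (my framing of the rule, not code from the thread):

```java
// Sketch of the read/write overlap rule behind the QUORUM advice.
public class QuorumMath {
    // A quorum is a strict majority of the replicas.
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // Reads of R replicas and writes of W replicas must intersect iff R + W > N.
    static boolean overlaps(int r, int w, int n) {
        return r + w > n;
    }

    public static void main(String[] args) {
        int n = 3;
        int q = quorum(n);                      // 2
        System.out.println(overlaps(q, q, n));  // true: QUORUM writes + QUORUM reads
        System.out.println(overlaps(1, 1, n));  // false: ONE + ONE can return stale data
    }
}
```

This is why reading and writing at ONE gives no staleness bound at all: 1 + 1 = 2 is not greater than 3, so a read may hit a replica the write never reached.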
Re: What is the correct way of changing a partitioner?
http://wiki.apache.org/cassandra/DistributedDeletes From the http://wiki.apache.org/cassandra/StorageConfiguration page: Achtung! Changing this parameter requires wiping your data directories, since the partitioner can modify the sstable on-disk format. So delete your data and commit log dirs after shutting down Cassandra... On Tue, Oct 19, 2010 at 4:09 PM, Wicked J wickedj2...@gmail.com wrote: Hi, I deleted all the data (programmatically). Then I changed the partitioner from RandomPartitioner to OrderPreservingPartitioner and when I started Cassandra - I get the following error. What is the correct way of changing the partitioner and how can I get past this error? ERROR 17:28:28,985 Fatal exception during initialization java.io.IOException: Found system table files, but they couldn't be loaded. Did you change the partitioner? at org.apache.cassandra.db.SystemTable.checkHealth(SystemTable.java:154) at org.apache.cassandra.thrift.CassandraDaemon.setup(CassandraDaemon.java:94) at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:211) Thanks!
Re: Running out of heap
http://wiki.apache.org/cassandra/FAQ#i_deleted_what_gives That help? On Wed, Sep 22, 2010 at 5:27 PM, Chris Jansen chris.jan...@cognitomobile.com wrote: Hi all, I have written a test application that does a write, read and delete on one of the sample column families that ship with Cassandra, and for some reason when I leave it going for an extended period of time I see Cassandra crash with out of heap exceptions. I don't understand why this should be as I am deleting the data almost as soon as I have read it. Also I am seeing the data files grow for Keyspace1, again with apparently no reason as I am deleting the data as I read it, which eventually causes the disk space to completely fill up. How can this be, am I using Cassandra in the wrong way or is this a bug? Any help or advice would be greatly appreciated. Thanks in advance, Chris PS To give a better idea of what I am doing I've included some of the source from my Java test app, typically I have 20 threads running in parallel performing this operation:
while (true) {
    long startTime = System.currentTimeMillis();
    key = UUID.randomUUID().toString();
    long timestamp = System.currentTimeMillis();
    ColumnPath colPathFdl = new ColumnPath(columnFamily);
    colPathFdl.setColumn(("345345345354" + key).getBytes("UTF8"));
    boolean broken = true;
    while (broken) {
        try {
            client.insert(keyspace, key, colPathFdl, getBytesFromFile(new File("/opt/java/apache-cassandra/conf/storage-conf.xml")), timestamp, ConsistencyLevel.QUORUM);
            broken = false;
        } catch (Exception e) {
            System.out.println("Cannot write: " + key + " RETRYING");
            broken = true;
            e.printStackTrace();
        }
    }
    try {
        Column col = client1.get(keyspace, key, colPathFdl, ConsistencyLevel.QUORUM).getColumn();
        System.out.println(key + " column name: " + new String(col.name, "UTF8"));
        //System.out.println("column value: " + new String(col.value, "UTF8"));
        System.out.println(key + " column timestamp: " + new Date(col.timestamp));
    } catch (Exception e) {
        System.out.println("Cannot read: " + key);
        e.printStackTrace();
    }
    try {
        System.out.println(key + " delete column:: " + key);
        client.remove(keyspace, key, colPathFdl, timestamp, ConsistencyLevel.QUORUM);
    } catch (Exception e) {
        System.out.println("Cannot delete: " + key);
        e.printStackTrace();
    }
    long stopTime = System.currentTimeMillis();
    long timeTaken = stopTime - startTime;
    System.err.println(Thread.currentThread().getName() + " " + key + " Last operation took " + timeTaken + "ms");
}
NOTICE: Cognito Limited. Benham Valence, Newbury, Berkshire, RG20 8LU. UK. Company number 02723032. This e-mail message and any attachment is confidential. It may not be disclosed to or used by anyone other than the intended recipient. If you have received this e-mail in error please notify the sender immediately then delete it from your system. Whilst every effort has been made to check this mail is virus free we accept no responsibility for software viruses and you should check for viruses before opening any attachments. Opinions, conclusions and other information in this email and any attachments which do not relate to the official business of the company are neither given by the company nor endorsed by it. This email message has been scanned for viruses by Mimecast http://www.mimecast.com
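The FAQ entry linked in the reply explains the behaviour: a delete in Cassandra writes a tombstone marker rather than removing the data, and the data plus tombstone stay on disk until GCGraceSeconds has elapsed and a compaction runs. A toy model of that lifecycle (entirely illustrative, not Cassandra's storage engine; the 864,000-second grace period matches the default quoted elsewhere in this digest):

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of why disk usage grows even when every key is deleted right
// after it is read: deletes write tombstones, and compaction may only drop
// tombstones older than GCGraceSeconds.
public class TombstoneModel {
    public static final long GC_GRACE_MILLIS = 864_000L * 1000; // 864000 s = 10 days

    // key -> tombstone timestamp; a null value means the row is live.
    private final Map<String, Long> store = new HashMap<>();

    public void insert(String key)          { store.put(key, null); }
    public void delete(String key, long ts) { store.put(key, ts); } // marker, not removal

    public int rowsOnDisk() { return store.size(); }

    // Compaction reclaims only tombstones past the grace period.
    public void compact(long now) {
        store.values().removeIf(ts -> ts != null && now - ts > GC_GRACE_MILLIS);
    }

    public static void main(String[] args) {
        TombstoneModel cf = new TombstoneModel();
        cf.insert("key-1");
        cf.delete("key-1", 0L);
        cf.compact(1_000L);
        System.out.println(cf.rowsOnDisk()); // 1: still within GCGraceSeconds
        cf.compact(GC_GRACE_MILLIS + 1_000L);
        System.out.println(cf.rowsOnDisk()); // 0: reclaimed after grace + compaction
    }
}
```

So the test app above behaves as designed from Cassandra's point of view: twenty threads writing and deleting continuously will accumulate tombstones for the full grace period before any space can be reclaimed.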
Re: Error when compile pelops
I just downloaded the jar file in question and it seems fine... wget -O cassandra.jar http://github.com/s7/mvnrepo/raw/master/org/apache/cassandra/cassandra/0.7.0-2010-09-12_19-23-07/cassandra-0.7.0-2010-09-12_19-23-07.jar; unzip -t cassandra.jar Which version of Maven are you using? On Tue, Sep 14, 2010 at 1:02 PM, Ying Tang ivytang0...@gmail.com wrote: I downloaded the pelops source from github, then cd into the pelops folder and run mvn compile. But this error occurs: [INFO] Compilation failure error: error reading /root/.m2/repository/org/apache/cassandra/cassandra/0.7.0-2010-09-12_19-23-07/cassandra-0.7.0-2010-09-12_19-23-07.jar; error in opening zip file Has anyone met the same problem? How can I solve it? -- Best regards, Ivy Tang
Re: too many open files 0.7.0 beta1
Maybe you're seeing this: https://issues.apache.org/jira/browse/CASSANDRA-1416 On Thu, Aug 26, 2010 at 2:05 PM, Aaron Morton aa...@thelastpickle.com wrote: Under 0.7.0 beta1 am seeing cassandra run out of file handles... Caused by: java.io.FileNotFoundException: /local1/junkbox/cassandra/data/junkbox.wetafx.co.nz/ObjectIndex-e-31-Index.db (Too many open files) at java.io.RandomAccessFile.open(Native Method) at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212) at java.io.RandomAccessFile.<init>(RandomAccessFile.java:98) at org.apache.cassandra.io.util.BufferedRandomAccessFile.<init>(BufferedRandomAccessFile.java:142) If I look at the file descriptors for the process I can see it already has 1,958 for the file: sudo ls -l /proc/20862/fd | grep ObjectIndex-e-31-Data.db | wc -l 1958 Out of a total of 2044. Other nodes in the cluster have a similar number of fd's - around 2k with the majority to one SSTable. I did not experience this under 0.6 so just checking if this sounds OK and I should just increase the number of handles or if it's a bug? Thanks Aaron
Re: Upgrading to Cassanda 0.7 Thrift Erlang
Slightly off topic but still related (Java instead of Erlang). I just tried using the latest trunk build available on Hudson (2010-07-31_12-31-29) and I'm getting lock-ups. The same code (without the framed transport) was working with a build from 2010-07-07_13-32-16 I'm connecting using the following: TSocket socket = new TSocket(node, port); transport = new TFramedTransport(socket); protocol = new TBinaryProtocol(transport); client = new Cassandra.Client(protocol); transport.open(); // set the keyspace on the client and do get slice stuff The locked up thread looks like: main prio=5 tid=101801000 nid=0x100501000 runnable [1004fe000] java.lang.Thread.State: RUNNABLE at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.read(SocketInputStream.java:129) at java.io.BufferedInputStream.fill(BufferedInputStream.java:218) at java.io.BufferedInputStream.read1(BufferedInputStream.java:258) at java.io.BufferedInputStream.read(BufferedInputStream.java:317) - locked <1093daa10> (a java.io.BufferedInputStream) at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127) at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84) at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129) at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101) at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84) at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:369) at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:295) at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:202) at org.apache.cassandra.thrift.Cassandra$Client.recv_get_slice(Cassandra.java:542) at org.apache.cassandra.thrift.Cassandra$Client.get_slice(Cassandra.java:524) On 28 July 2010 17:43, J T jt4websi...@googlemail.com wrote: Hi, That fixed the problem!
I added the Framed option and like magic things have started working again. Example:

thrift_client:start_link("localhost", 9160, cassandra_thrift, [ { framed, true } ])

JT.

On Tue, Jul 27, 2010 at 10:04 PM, Jonathan Ellis jbel...@gmail.com wrote:

trunk is using framed thrift connections by default now (was unframed)

On Tue, Jul 27, 2010 at 11:33 AM, J T jt4websi...@googlemail.com wrote:

Hi,

I just tried upgrading a perfectly working Cassandra 0.6.3 to Cassandra 0.7 and am finding that, even after re-generating the Erlang thrift bindings, I am unable to perform any operation. I can get a connection, but if I try to login or set the keyspace I get a report from the Erlang bindings saying that the connection is closed. I then tried upgrading to a later version of thrift but still get the same error. e.g.

(zotonic3...@127.0.0.1)1> thrift_client:start_link("localhost", 9160, cassandra_thrift).
{ok,<0.327.0>}
(zotonic3...@127.0.0.1)2> {ok,C} = thrift_client:start_link("localhost", 9160, cassandra_thrift).
{ok,<0.358.0>}
(zotonic3...@127.0.0.1)3> thrift_client:call(C, set_keyspace, [ "Test" ]).
=ERROR REPORT==== 27-Jul-2010::03:48:08 ===
** Generic server <0.358.0> terminating
** Last message in was {call,set_keyspace,["Test"]}
** When Server state == {state,cassandra_thrift,
                         {protocol,thrift_binary_protocol,
                          {binary_protocol,
                           {transport,thrift_buffered_transport,<0.359.0>},
                           true,true}},
                         0}
** Reason for termination ==
** {{case_clause,{error,closed}},
    [{thrift_client,read_result,3},
     {thrift_client,catch_function_exceptions,2},
     {thrift_client,handle_call,3},
     {gen_server,handle_msg,5},
     {proc_lib,init_p_do_apply,3}]}
** exception exit: {case_clause,{error,closed}}
     in function  thrift_client:read_result/3
     in call from thrift_client:catch_function_exceptions/2
     in call from thrift_client:handle_call/3
     in call from gen_server:handle_msg/5
     in call from proc_lib:init_p_do_apply/3

The Cassandra log seems to indicate that a connection has been made (although that's only apparent from a TRACE log message saying that a logout has been done). The cassandra-cli program is able to connect and function normally, so I can only assume there is a problem with the Erlang bindings.

Has anyone else had any success using 0.7 from Erlang?

JT.

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
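The mismatch in this thread comes down to framing: a framed Thrift transport prefixes every message with a 4-byte big-endian length, while an unframed transport writes the raw protocol bytes, so an unframed client blocks forever waiting for bytes a framed server will never interpret, and vice versa. A minimal sketch of the frame layout (plain Java, no Thrift dependency; the payload and class name are illustrative):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class FrameDemo {
    // Wrap a payload framed-transport style: 4-byte big-endian length, then the bytes.
    static byte[] frame(byte[] payload) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        DataOutputStream data = new DataOutputStream(out);
        data.writeInt(payload.length); // length prefix
        data.write(payload);
        return out.toByteArray();
    }

    // Read one frame back: length prefix first, then exactly that many bytes.
    static byte[] unframe(byte[] framed) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(framed));
        int length = in.readInt();
        byte[] payload = new byte[length];
        in.readFully(payload);
        return payload;
    }

    public static void main(String[] args) throws IOException {
        byte[] message = "get_slice".getBytes("UTF-8");
        byte[] framed = frame(message);
        // The frame is 4 bytes longer than the payload...
        System.out.println(framed.length - message.length); // prints 4
        // ...and round-trips cleanly.
        System.out.println(new String(unframe(framed), "UTF-8")); // prints get_slice
    }
}
```

An unframed peer reads those four length bytes as if they were the start of a TBinaryProtocol message, which is why both sides end up blocked in readAll as in the thread dump above.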
Re: Upgrading to Cassandra 0.7 Thrift Erlang
p.s. If I set thrift_framed_transport_size_in_mb to 0 and just use TSocket instead of TFramedTransport, everything works as expected...

On 1 August 2010 12:16, Dan Washusen d...@reactive.org wrote:

Slightly off topic but still related (Java instead of Erlang). I just tried using the latest trunk build available on Hudson (2010-07-31_12-31-29) and I'm getting lock-ups. The same code (without the framed transport) was working with a build from 2010-07-07_13-32-16. [snip]
Re: Using Pelops with Cassandra 0.7.X
http://github.com/danwashusen/pelops/tree/cassandra-0.7.0

p.s. Pelops doesn't have any test coverage, and my implicit tests (my app integration tests) don't touch anywhere near all of the Pelops API.

p.p.s. I've made API-breaking changes to support the new 0.7.0 API, and Dominic (the original Pelops author) hasn't reviewed, commented on, or even looked at them yet...

On 14 July 2010 08:35, Ran Tavory ran...@gmail.com wrote:

Hector doesn't have 0.7 support yet.

On Jul 14, 2010 1:34 AM, Peter Harrison cheetah...@gmail.com wrote:

I know Cassandra 0.7 isn't released yet, but I was wondering if anyone has used Pelops with the latest builds of Cassandra? I'm having some issues, but I wanted to make sure that somebody else isn't working on a branch of Pelops to support Cassandra 0.7. I have downloaded and built the latest code from GitHub (trunk of Pelops), and this works with 0.6.3 but not Cassandra trunk. Is Pelops worth updating, or should I use other client libraries for Java such as Hector?
Re: Pelops 'up and running' post question + WTF is a SuperColumn = really confused.
L1Tickets = { // column family
    userId: { // row key
        42C120DF-D44A-44E4-9BDC-2B5439A5C7B4: { category: videoPhone, reportType: POOR_PICTURE, ... },
        99B60047-382A-4237-82CE-AE53A74FB747: { category: somethingElse, reportType: FOO, ... }
    }
}

On 3 July 2010 02:29, S Ahmed sahmed1...@gmail.com wrote:

https://ria101.wordpress.com/2010/06/11/pelops-the-beautiful-cassandra-database-client-for-java

So using the code snippet below, I want to create a JSON representation of the CF (super).

/**
 * Write multiple sub-column values to a super column...
 * @param rowKey     The key of the row to modify
 * @param colFamily  The name of the super column family to operate on
 * @param colName    The name of the super column
 * @param subColumns A list of the sub-columns to write
 */
mutator.writeSubColumns(
    userId,
    "L1Tickets",
    UuidHelper.newTimeUuidBytes(), // using a UUID value that sorts by time
    mutator.newColumnList(
        mutator.newColumn("category", "videoPhone"),
        mutator.newColumn("reportType", "POOR_PICTURE"),
        mutator.newColumn("createdDate", NumberHelper.toBytes(System.currentTimeMillis())),
        mutator.newColumn("capture", jpegBytes),
        mutator.newColumn("comment")));

Can someone show me what it would look like?
This is what I have so far:

SupportTickets = {
    userId : {
        L1Tickets : { }
    }
}

But from what I understood, a CF of type super looks like this (http://arin.me/blog/wtf-is-a-supercolumn-cassandra-data-model):

AddressBook = { // this is a ColumnFamily of type Super
    phatduckk: { // this is the key to this row inside the Super CF
        // the key here is the name of the owner of the address book
        // now we have an infinite # of super columns in this row
        // the keys inside the row are the names for the SuperColumns
        // each of these SuperColumns is an address book entry
        friend1: { street: "8th street", zip: "90210", city: "Beverley Hills", state: "CA" },
        // this is the address book entry for John in phatduckk's address book
        John: { street: "Howard street", zip: "94404", city: "FC", state: "CA" },
        Kim: { street: "X street", zip: "87876", city: "Balls", state: "VA" },
        Tod: { street: "Jerry street", zip: "54556", city: "Cartoon", state: "CO" },
        Bob: { street: "Q Blvd", zip: "24252", city: "Nowhere", state: "MN" },
        ... // we can have an infinite # of SuperColumns (aka address book entries)
    }, // end row
    ieure: { // this is the key to another row in the Super CF
        // all the address book entries for ieure
        joey: { street: "A ave", zip: "55485", city: "Hell", state: "NV" },
        William: { street: "Armpit Dr", zip: "93301", city: "Bakersfield", state: "CA" },
    },
}

The Pelops code snippet seems to be adding an additional inner layer to this. I'm confused!
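To make the nesting concrete, a super column family like the AddressBook example above is effectively a three-level map: row key, then super column name, then sub-column name/value. A plain-Java sketch of that shape (names taken from the example; this is just an in-memory model, not the Cassandra API):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SuperColumnModel {
    public static void main(String[] args) {
        // row key -> super column name -> (sub-column name -> value)
        Map<String, Map<String, Map<String, String>>> addressBook = new LinkedHashMap<>();

        // one super column = one address book entry
        Map<String, String> john = new LinkedHashMap<>();
        john.put("street", "Howard street");
        john.put("zip", "94404");

        // one row per address book owner
        Map<String, Map<String, String>> phatduckkRow = new LinkedHashMap<>();
        phatduckkRow.put("John", john);

        addressBook.put("phatduckk", phatduckkRow);

        // navigating row -> super column -> sub-column
        System.out.println(addressBook.get("phatduckk").get("John").get("zip")); // prints 94404
    }
}
```

Read this way, the "additional inner layer" in the L1Tickets snippet is just the row key: row = user id, super column = ticket UUID, sub-columns = the ticket fields.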
Re: Pelops - a new Java client library paradigm
Very nice!

You mention that the connections are handled internally by Pelops; does that mean that potentially a different connection is used for each operation performed? I had assumed that using the same connection for several operations with ConsistencyLevel.ONE would provide a basic level of atomicity. For example, using the same connection for all operations in a web request would allow the request to read its own writes. Is that assumption correct, and does it impact your decision to handle the connections internally to Pelops?

Cheers,
Dan

On 13 June 2010 05:05, Ran Tavory ran...@gmail.com wrote:

Nice going, Dominic, having a clear API for Cassandra is a big step forward :)
Interestingly, at Hector we came up with a similar approach, just didn't find the time to code it, as production systems keep me busy at nights as well... We started with the implementation of BatchMutation, but the rest of the API improvements are still TODO.
Keep up the good work, competition keeps us healthy ;)

On Fri, Jun 11, 2010 at 4:41 PM, Dominic Williams thedwilli...@googlemail.com wrote:

Pelops is a new high quality Java client library for Cassandra. It has a design that:

* reveals the full power of Cassandra through an elegant Mutator and Selector paradigm
* generates better, cleaner, less bug-prone code
* reduces the learning curve for new users
* drives rapid application development
* encapsulates advanced pooling algorithms

An article introducing Pelops can be found at http://ria101.wordpress.com/2010/06/11/pelops-the-beautiful-cassandra-database-client-for-java/

Thanks for reading. Best, Dominic
Re: Pelops - a new Java client library paradigm
Thanks for clarifying!

On 13 June 2010 09:03, Miguel Verde miguelitov...@gmail.com wrote:

AFAIK, Cassandra does nothing to guarantee connection-level read-your-own-writes consistency beyond its usual consistency levels. See https://issues.apache.org/jira/browse/CASSANDRA-876 and the earlier http://issues.apache.org/jira/browse/CASSANDRA-132

On Jun 12, 2010, at 5:48 PM, Dan Washusen d...@reactive.org wrote:

Very nice! You mention that the connections are handled internally by Pelops, does that mean that potentially a different connection is used for each operation performed? [snip]
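The "usual consistency levels" referred to in this thread work by replica-set overlap rather than connection affinity: with replication factor N, a write acknowledged by W replicas and a read from R replicas are guaranteed to intersect whenever R + W > N, which is what QUORUM on both sides gives you. A sketch of the arithmetic (plain Java; the class and method names are illustrative, not Cassandra API):

```java
public class QuorumCheck {
    // A read is guaranteed to observe the latest acknowledged write
    // exactly when the read and write replica sets must overlap: R + W > N.
    static boolean readYourWrites(int n, int w, int r) {
        return r + w > n;
    }

    // QUORUM is a strict majority of the N replicas.
    static int quorum(int n) {
        return n / 2 + 1;
    }

    public static void main(String[] args) {
        int n = 3;         // replication factor
        int q = quorum(n); // 2 for N = 3
        System.out.println(readYourWrites(n, q, q)); // QUORUM write + QUORUM read -> true
        System.out.println(readYourWrites(n, 1, 1)); // ONE write + ONE read -> false
    }
}
```

So with ConsistencyLevel.ONE on both sides (as in Dan's question), a later read can hit a replica the write never reached, regardless of which connection it uses.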