Re: Upgrade from 1.2.x to 2.0.x, upgradesstables has doubled the size on disk?

2017-12-30 Thread Dan Washusen
Nope, just ran out of disk space... So on 1.2.x I had 70GB used on a 200GB
disk and everything was great; with 2.0.x I'm now at 99% used and getting an
exception about insufficient disk space while compacting. FML...

Dan Washusen


On Sun, Dec 31, 2017 at 6:47 AM, Dan Washusen <d...@reactive.org> wrote:

> Thanks for the response Jeff. It wasn't snapshots but after running
> upgradesstables on all nodes I started a repair and it seems like the file
> sizes are reducing:
>
>  INFO [CompactionExecutor:1626] 2017-12-30 19:42:36,065
> CompactionTask.java (line 299) Compacted 2 sstables to
> [/appdata/lib/cassandra/data/dp/s_evt/dp-s_evt-jb-302,].  9,663,696,752
> bytes to 4,834,895,601 (~50% of original) in 3,899,888ms = 1.182320MB/s.
> 90,533 total partitions merged to 45,278.  Partition merge counts were
> {1:23, 2:45255, }
>
> Dan Washusen
>
>
> On Sun, Dec 31, 2017 at 1:51 AM, Jeff Jirsa <jji...@gmail.com> wrote:
>
>> 1.2 to 2.0 was a long time ago for many of us, but I don’t recall
>> anything that should have doubled size other than perhaps temporarily
>> during the sstable rewrite or snapshots (which may be automatic on
>> upgrade?).
>>
>> The bloom filters, sstable count, compression ratio in cfstats all look
>> similar; only the size is double, so that sorta hints at maybe a snapshot.
>>
>> You have few sstables, looks like STCS, so it’d be possible that if the
>> upgrade is still running, maybe one sstable of the old version still
>> (temporarily) exists on disk causing it to be double counted.
>>
>>
>>
>> --
>> Jeff Jirsa
>>
>>
>> On Dec 29, 2017, at 4:33 PM, Dan Washusen <d...@reactive.org> wrote:
>>
>> Hi All,
>> We're taking advantage of the lull in traffic to go through a production
>> cluster upgrade from 1.2.x (latest) to 2.0.x (latest). We have three nodes
>> with a replication factor of three. I've noticed that the 'space used' has
>> almost doubled as a result of running 'nodetool upgradesstables'.
>>
>> Anyone have any ideas? Is that to be expected?
>>
>> For comparison, on a node (pre-upgrade):
>>
>>> nodetool cfstats dp.s_evt
>>> Keyspace: dp
>>> Read Count: 190570567
>>> Read Latency: 2.6280611004164145 ms.
>>> Write Count: 46213651
>>> Write Latency: 0.08166790944519835 ms.
>>> Pending Tasks: 0
>>> Column Family: s_evt
>>> SSTable count: 8
>>> Space used (live): 36269415929
>>> Space used (total): 36274282945
>>> SSTable Compression Ratio: 0.2345030140572
>>> Number of Keys (estimate): 3213696
>>> Memtable Columns Count: 2934
>>> Memtable Data Size: 9561951
>>> Memtable Switch Count: 1974
>>> Read Count: 190570567
>>> Read Latency: 2.628 ms.
>>> Write Count: 46213651
>>> Write Latency: 0.082 ms.
>>> Pending Tasks: 0
>>> Bloom Filter False Positives: 1162636
>>> Bloom Filter False Ratio: 0.73869
>>> Bloom Filter Space Used: 4492256
>>> Compacted row minimum size: 373
>>> Compacted row maximum size: 1996099046
>>> Compacted row mean size: 63595
>>> Average live cells per slice (last five minutes): 11.0
>>> Average tombstones per slice (last five minutes): 0.0
>>
>>
>> And after upgrading and running 'upgradesstables' (different node):
>>
>>> nodetool cfstats dp.s_evt
>>> Keyspace: dp
>>> Read Count: 1461617
>>> Read Latency: 4.9734411921864625 ms.
>>> Write Count: 359250
>>> Write Latency: 0.11328054279749478 ms.
>>> Pending Tasks: 0
>>> Table: s_evt
>>> SSTable count: 6
>>> Space used (live), bytes: 71266932602
>>> Space used (total), bytes: 71266932602
>>> Off heap memory used (total), bytes: 44853104
>>> SSTable Compression Ratio: 0.2387480210082192
>>> Number of keys (estimate): 3307776
>>> Memtable cell count: 603223
>>> Memtable data size, bytes: 121913569
>>> Memtable switch count: 9
>>> Lo

Re: Upgrade from 1.2.x to 2.0.x, upgradesstables has doubled the size on disk?

2017-12-30 Thread Dan Washusen
Thanks for the response Jeff. It wasn't snapshots but after running
upgradesstables on all nodes I started a repair and it seems like the file
sizes are reducing:

 INFO [CompactionExecutor:1626] 2017-12-30 19:42:36,065 CompactionTask.java
(line 299) Compacted 2 sstables to
[/appdata/lib/cassandra/data/dp/s_evt/dp-s_evt-jb-302,].  9,663,696,752
bytes to 4,834,895,601 (~50% of original) in 3,899,888ms = 1.182320MB/s.
90,533 total partitions merged to 45,278.  Partition merge counts were
{1:23, 2:45255, }

Dan Washusen


On Sun, Dec 31, 2017 at 1:51 AM, Jeff Jirsa <jji...@gmail.com> wrote:

> 1.2 to 2.0 was a long time ago for many of us, but I don’t recall anything
> that should have doubled size other than perhaps temporarily during the
> sstable rewrite or snapshots (which may be automatic on upgrade?).
>
> The bloom filters, sstable count, compression ratio in cfstats all look
> similar; only the size is double, so that sorta hints at maybe a snapshot.
>
> You have few sstables, looks like STCS, so it’d be possible that if the
> upgrade is still running, maybe one sstable of the old version still
> (temporarily) exists on disk causing it to be double counted.
>
>
>
> --
> Jeff Jirsa
>
>
> On Dec 29, 2017, at 4:33 PM, Dan Washusen <d...@reactive.org> wrote:
>
> Hi All,
> We're taking advantage of the lull in traffic to go through a production
> cluster upgrade from 1.2.x (latest) to 2.0.x (latest). We have three nodes
> with a replication factor of three. I've noticed that the 'space used' has
> almost doubled as a result of running 'nodetool upgradesstables'.
>
> Anyone have any ideas? Is that to be expected?
>
> For comparison, on a node (pre-upgrade):
>
>> nodetool cfstats dp.s_evt
>> Keyspace: dp
>> Read Count: 190570567
>> Read Latency: 2.6280611004164145 ms.
>> Write Count: 46213651
>> Write Latency: 0.08166790944519835 ms.
>> Pending Tasks: 0
>> Column Family: s_evt
>> SSTable count: 8
>> Space used (live): 36269415929
>> Space used (total): 36274282945
>> SSTable Compression Ratio: 0.2345030140572
>> Number of Keys (estimate): 3213696
>> Memtable Columns Count: 2934
>> Memtable Data Size: 9561951
>> Memtable Switch Count: 1974
>> Read Count: 190570567
>> Read Latency: 2.628 ms.
>> Write Count: 46213651
>> Write Latency: 0.082 ms.
>> Pending Tasks: 0
>> Bloom Filter False Positives: 1162636
>> Bloom Filter False Ratio: 0.73869
>> Bloom Filter Space Used: 4492256
>> Compacted row minimum size: 373
>> Compacted row maximum size: 1996099046
>> Compacted row mean size: 63595
>> Average live cells per slice (last five minutes): 11.0
>> Average tombstones per slice (last five minutes): 0.0
>
>
> And after upgrading and running 'upgradesstables' (different node):
>
>> nodetool cfstats dp.s_evt
>> Keyspace: dp
>> Read Count: 1461617
>> Read Latency: 4.9734411921864625 ms.
>> Write Count: 359250
>> Write Latency: 0.11328054279749478 ms.
>> Pending Tasks: 0
>> Table: s_evt
>> SSTable count: 6
>> Space used (live), bytes: 71266932602
>> Space used (total), bytes: 71266932602
>> Off heap memory used (total), bytes: 44853104
>> SSTable Compression Ratio: 0.2387480210082192
>> Number of keys (estimate): 3307776
>> Memtable cell count: 603223
>> Memtable data size, bytes: 121913569
>> Memtable switch count: 9
>> Local read count: 1461617
>> Local read latency: 7.248 ms
>> Local write count: 359250
>> Local write latency: 0.110 ms
>> Pending tasks: 0
>> Bloom filter false positives: 2501
>> Bloom filter false ratio: 0.01118
>> Bloom filter space used, bytes: 4135248
>> Bloom filter off heap memory used, bytes: 4135200
>> Index summary off heap memory used, bytes: 723576
>> Compression metadata off heap memory used, bytes: 39994328
>> Compacted partition minimum bytes: 536
>>

Upgrade from 1.2.x to 2.0.x, upgradesstables has doubled the size on disk?

2017-12-29 Thread Dan Washusen
Hi All,
We're taking advantage of the lull in traffic to go through a production
cluster upgrade from 1.2.x (latest) to 2.0.x (latest). We have three nodes
with a replication factor of three. I've noticed that the 'space used' has
almost doubled as a result of running 'nodetool upgradesstables'.

Anyone have any ideas? Is that to be expected?

For comparison, on a node (pre-upgrade):

> nodetool cfstats dp.s_evt
> Keyspace: dp
> Read Count: 190570567
> Read Latency: 2.6280611004164145 ms.
> Write Count: 46213651
> Write Latency: 0.08166790944519835 ms.
> Pending Tasks: 0
> Column Family: s_evt
> SSTable count: 8
> Space used (live): 36269415929
> Space used (total): 36274282945
> SSTable Compression Ratio: 0.2345030140572
> Number of Keys (estimate): 3213696
> Memtable Columns Count: 2934
> Memtable Data Size: 9561951
> Memtable Switch Count: 1974
> Read Count: 190570567
> Read Latency: 2.628 ms.
> Write Count: 46213651
> Write Latency: 0.082 ms.
> Pending Tasks: 0
> Bloom Filter False Positives: 1162636
> Bloom Filter False Ratio: 0.73869
> Bloom Filter Space Used: 4492256
> Compacted row minimum size: 373
> Compacted row maximum size: 1996099046
> Compacted row mean size: 63595
> Average live cells per slice (last five minutes): 11.0
> Average tombstones per slice (last five minutes): 0.0


And after upgrading and running 'upgradesstables' (different node):

> nodetool cfstats dp.s_evt
> Keyspace: dp
> Read Count: 1461617
> Read Latency: 4.9734411921864625 ms.
> Write Count: 359250
> Write Latency: 0.11328054279749478 ms.
> Pending Tasks: 0
> Table: s_evt
> SSTable count: 6
> Space used (live), bytes: 71266932602
> Space used (total), bytes: 71266932602
> Off heap memory used (total), bytes: 44853104
> SSTable Compression Ratio: 0.2387480210082192
> Number of keys (estimate): 3307776
> Memtable cell count: 603223
> Memtable data size, bytes: 121913569
> Memtable switch count: 9
> Local read count: 1461617
> Local read latency: 7.248 ms
> Local write count: 359250
> Local write latency: 0.110 ms
> Pending tasks: 0
> Bloom filter false positives: 2501
> Bloom filter false ratio: 0.01118
> Bloom filter space used, bytes: 4135248
> Bloom filter off heap memory used, bytes: 4135200
> Index summary off heap memory used, bytes: 723576
> Compression metadata off heap memory used, bytes: 39994328
> Compacted partition minimum bytes: 536
> Compacted partition maximum bytes: 2874382626
> Compacted partition mean bytes: 108773
> Average live cells per slice (last five minutes): 11.0
> Average tombstones per slice (last five minutes): 17.0


Column family definition:

> create column family s_evt with column_type = 'Super' and comparator =
> 'TimeUUIDType' and subcomparator = 'UTF8Type';



Also curious why the 'Average tombstones per slice' value has gone from 0
to 17. Not sure if it's relevant, but way back we used to write values
to that (super) column family with a TTL; for a long time now it's been
append-only (with no TTL)...

Thanks,
Dan


Re: which high level Java client

2012-06-28 Thread Dan Washusen
Pelops is a very thin wrapper over the Thrift client so that could be a good 
option.  

You could also check out a fork by the VMWare/Spring guys which adds full async 
support: https://github.com/andrewswan/scale7-pelops.  I'm not sure on the 
state of it, but it seems promising... 
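
For reference, here is a minimal read sketch in the style discussed above. The
node, pool, keyspace and column family names are placeholders, and the
pool/selector method signatures follow the Pelops examples of this era, so check
them against the exact version you pull in:

    import java.util.List;
    import org.apache.cassandra.thrift.Column;
    import org.apache.cassandra.thrift.ConsistencyLevel;
    import org.scale7.cassandra.pelops.Bytes;
    import org.scale7.cassandra.pelops.Cluster;
    import org.scale7.cassandra.pelops.Pelops;
    import org.scale7.cassandra.pelops.Selector;

    public class PelopsReadSketch {
        public static void main(String[] args) throws Exception {
            // Define a pool once at startup (placeholder node/keyspace names).
            Cluster cluster = new Cluster("cass-node-1", 9160);
            Pelops.addPool("main", cluster, "MyKeyspace");

            // A Selector borrows a pooled connection for each read operation.
            Selector selector = Pelops.createSelector("main");
            List<Column> columns = selector.getColumnsFromRow(
                    "MyColumnFamily", "some-row-key", false, ConsistencyLevel.ONE);
            for (Column c : columns) {
                System.out.println(Bytes.toUTF8(c.getName()) + " = "
                        + Bytes.toUTF8(c.getValue()));
            }
        }
    }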


On Thursday, 28 June 2012 at 11:04 AM, James Pirz wrote:

 Dear all,
 
 I am interested in using Cassandra 1.1.1 in a read-intensive scenario, where 
 more than 95% of my operations are get().
 I have a cluster with ~10 nodes,  around 15-20 GB of data on each, while in 
 the extreme case I expect to have 20-40 concurrent clients.
 
 I am kind of confused about which high-level Java client I should use
 (which one is the best/fastest for concurrent read operations):
 Hector, Pelops, Astyanax, or something else?
 
 I browsed the mailing list, but I came across different types of arguments 
 and conclusions on behalf of various clients.
 
 Thanks in advance,
 
 James  



Re: Configuring cassandra cluster with host preferences

2012-05-14 Thread Dan Washusen
It's not possible 'out of the box' but you could implement your own 
org.scale7.cassandra.pelops.pool.CommonsBackedPool.INodeSelectionStrategy that 
chooses the desired node.

-- 
Dan Washusen
Make big files fly
visit digitalpigeon.com (http://digitalpigeon.com)



On Tuesday, 15 May 2012 at 3:23 AM, Oleg Dulin wrote:

 I am running my processes on the same nodes as Cassandra.
 
 What I'd like to do is when I get a connection from Pelops, it gives 
 preference to the Cassandra node local to the host my process is on.
 
 Is it possible ? How ?
 
 
 Regards,
 Oleg Dulin
 Please note my new office #: 732-917-0159
 
 



Re: Cassandra Clients for Java

2011-06-17 Thread Dan Washusen
I've added some comments/questions inline...

Cheers,
-- 
Dan Washusen

On Saturday, 18 June 2011 at 8:02 AM, Daniel Colchete wrote:

 Good day everyone!
 
 I'm getting started with a new project and I'm thinking about using Cassandra 
 because of its distributed quality and because of its performance.
 
 I'm using Java on the back-end. There are many many things being said about 
 the Java high level clients for Cassandra on the web. To be frank, I see 
 problems with all of the java clients. For example, Hector and Scale7-pelops 
 have new semantics on them that are neither Java's or Cassandra's, and I 
 don't see much gain from it apart from the fact that it is more complex. 
 Also, I was hoping to go with something that was annotation based so that it 
 wouldn't be necessary to write boilerplate code (again, no gain). 
I'm interested in hearing more on your comment regarding Hector and Pelops 
adding little but complexity; could you add a little context to the comment? If 
you're coming from an ORM framework like Hibernate then doing simple tasks may 
seem cumbersome. However, once you've wrapped your head around Cassandra's read 
and write concepts, both Hector and Pelops seem relatively straightforward (to 
me)...?

Also, a quick look at the Hector wiki suggests that they have some form of 
annotation support 
(https://github.com/rantav/hector/wiki/Using-the-EntityManager).
 
 Demoiselle Cassandra seems to be one option but I couldn't find a download 
 for it. I'm new to Java in the back-end and I find that maven is too much to 
 learn just because of a client library. Also it seems to be hard to integrate 
 with the other things I use on my project (GWT, GWT-platform, Google Eclipse 
 Plugin). 
 
 Kundera looks great but besides not having a download link (Google site link 
 to Github, that links to Google site, but no download) its information is 
 partitioned on many blog posts, some of them saying things I couldn't find on 
 its website. One says it uses Lucandra for indexes but that is the only place 
 talking about it, no documentation about using it. It doesn't seem to support 
 Cassandra 0.8 also. Does it?
It's my understanding that Lucandra has been superseded by Solandra 
(https://github.com/tjake/Solandra).
 
 I would like to hear from the users here what worked for you guys. Some real 
 world project in production that was good to write in Java, where the client 
 was stable and is maintained. What are the success stories of using Cassandra 
 with Java. What would you recommend?
Pelops is used successfully on (at least) fightmymonster.com and 
digitalpigeon.com. Pelops is actively developed and maintained by those two 
companies + contributors and Hector looks like it's backed by Datastax, which 
would seem to be a pretty compelling sell point. 
 
 Thank you very much!
 
 Best,
 -- 
 Dani
 Cloud3 Tech - http://cloud3.tc/ 
 Twitter: @DaniCloud3 @Cloud3Tech



0.7.5 Debian packages - can't upgrade?

2011-04-29 Thread Dan Washusen
Hey all,
I can't seem to upgrade to 0.7.5 using the Debian packages.  Here's what I've
done...

Edited sources.list and changed unstable to 07x.

 deb http://www.apache.org/dist/cassandra/debian 07x main
 deb-src http://www.apache.org/dist/cassandra/debian 07x main


Add the new key.

 sudo gpg --keyserver pgp.mit.edu --recv-keys 2B5C1B00

sudo gpg --export --armor 2B5C1B00 | sudo apt-key add -


Run the usual commands.

 sudo aptitude update

sudo aptitude safe-upgrade


The upgrade shows this:

 Reading package lists... Done
 Building dependency tree
 Reading state information... Done
 Reading extended state information
 Initializing package states... Done
 No packages will be installed, upgraded, or removed.
 0 packages upgraded, 0 newly installed, 0 to remove and *1 not upgraded*.
 Need to get 0B of archives. After unpacking 0B will be used.
 Reading package lists... Done
 Building dependency tree
 Reading state information... Done
 Reading extended state information
 Initializing package states... Done


The above mentions that 1 package wasn't upgraded (I assume this is 0.7.5).
 Anyone have any ideas what I'm doing wrong?

Cheers,
Dan


Re: 0.7.5 Debian packages - can't upgrade?

2011-04-29 Thread Dan Washusen
Thanks for the response. :)

I should have also mentioned that I'm running this on Ubuntu Karmic
Koala (9.10) (http://i44.tinypic.com/27xp2lc.jpg).

The output of `sudo aptitude full-upgrade` looks the same as safe-upgrade:

 Reading package lists... Done
 Building dependency tree
 Reading state information... Done
 Reading extended state information
 Initializing package states... Done
 No packages will be installed, upgraded, or removed.
 0 packages upgraded, 0 newly installed, 0 to remove and 1 not upgraded.
 Need to get 0B of archives. After unpacking 0B will be used.
 Reading package lists... Done
 Building dependency tree
 Reading state information... Done
 Reading extended state information
 Initializing package states... Done


Here is the output of 'apt-cache policy && apt-cache policy cassandra':
http://pastebin.com/PqRiGmWi


On 30 April 2011 11:18, Eric Evans eev...@rackspace.com wrote:

 On Sat, 2011-04-30 at 09:34 +1000, Dan Washusen wrote:
   sudo aptitude update
 
  sudo aptitude safe-upgrade
 
 
  The upgrade shows this:
 
   Reading package lists... Done
   Building dependency tree
   Reading state information... Done
   Reading extended state information
   Initializing package states... Done
   No packages will be installed, upgraded, or removed.
   0 packages upgraded, 0 newly installed, 0 to remove and *1 not
  upgraded*.
   Need to get 0B of archives. After unpacking 0B will be used.
   Reading package lists... Done
   Building dependency tree
   Reading state information... Done
   Reading extended state information
   Initializing package states... Done
 
 
  The above mentions that 1 package wasn't upgraded (I assume this is
  0.7.5).
   Anyone have any ideas what I'm doing wrong?

 Usually this means that upgrading would install a new package (i.e. that
 it picked up a new dependency), which shouldn't be the case.  You might
 try an `aptitude full-upgrade' just to see what that might be.  You
 could also try pasting the output of `apt-cache policy && apt-cache
 policy cassandra' to the list.

 --
 Eric Evans
 eev...@rackspace.com




Re: RE: batch_mutate failed: out of sequence response

2011-04-18 Thread Dan Washusen
It turns out that once a TProtocolException is thrown from Cassandra the 
connection is useless for future operations. Pelops was closing connections 
when it detected TimedOutException, TTransportException and 
UnavailableException but not TProtocolException. We have now changed Pelops to 
close connections in all cases *except* NotFoundException.

Cheers,
-- 
Dan Washusen
On Friday, 8 April 2011 at 7:28 AM, Dan Washusen wrote: 
 Pelops uses a single connection per operation from a pool that is backed by 
 Apache Commons Pool (assuming you're using Cassandra 0.7). I'm not saying 
 it's perfect but it's NOT sharing a connection over multiple threads.
 
 Dan Hendry mentioned that he sees these errors. Is he also using Pelops? From 
 his comment about retrying I'd assume not...
 
 -- 
 Dan Washusen
 On Thursday, 7 April 2011 at 7:39 PM, Héctor Izquierdo Seliva wrote:
  On Wed, 06-04-2011 at 21:04 -0500, Jonathan Ellis wrote:
   out of sequence response is thrift's way of saying I got a response
   for request Y when I expected request X.
   
   my money is on using a single connection from multiple threads. don't do 
   that.
  
  I'm not using thrift directly, and my application is single-threaded, so I
  guess this is Pelops' fault somehow. Since I managed to tame memory
  consumption the problem has not appeared again, but it always happened
  during a stop-the-world GC. Could it be that the message was sent
  instead of being dropped by the server when the client assumed it had
  timed out?
  
 


Re: RE: batch_mutate failed: out of sequence response

2011-04-18 Thread Dan Washusen
An example scenario (that is now fixed in Pelops):
1. Attempt to write a column with a null value.
2. Cassandra throws a TProtocolException, which renders the connection useless
   for future operations.
3. Pelops returns the corrupt connection to the pool.
4. A second read operation is attempted with the corrupt connection and
   Cassandra throws an ApplicationException.


A Pelops test case for this can be found here:
https://github.com/s7/scale7-pelops/blob/3fe7584a24bb4b62b01897a814ef62415bd2fe43/src/test/java/org/scale7/cassandra/pelops/MutatorIntegrationTest.java#L262
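
A minimal sketch of the resulting rule (illustrative only, not the actual Pelops
change); NotFoundException is the Cassandra Thrift class, and the helper name is
made up for the example:

    import org.apache.cassandra.thrift.NotFoundException;

    final class ConnectionReusePolicy {
        // After a failed operation, only NotFoundException (a normal "no such
        // row/column" outcome) leaves the Thrift connection in a known-good
        // state. Anything else (TimedOutException, TTransportException,
        // UnavailableException and, as this thread shows, TProtocolException)
        // may leave the transport out of sequence, so the connection should be
        // closed rather than returned to the pool.
        static boolean safeToReturnToPool(Exception failure) {
            return failure instanceof NotFoundException;
        }
    }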

Cheers,
-- 
Dan Washusen
On Tuesday, 19 April 2011 at 10:28 AM, Jonathan Ellis wrote: 
 Any idea what's causing the original TPE?
 
 On Mon, Apr 18, 2011 at 6:22 PM, Dan Washusen d...@reactive.org wrote:
  It turns out that once a TProtocolException is thrown from Cassandra the
  connection is useless for future operations. Pelops was closing connections
  when it detected TimedOutException, TTransportException and
  UnavailableException but not TProtocolException. We have now changed Pelops
  to close connections in all cases *except* NotFoundException.
  
  Cheers,
  --
  Dan Washusen
  
  On Friday, 8 April 2011 at 7:28 AM, Dan Washusen wrote:
  
  Pelops uses a single connection per operation from a pool that is backed by
  Apache Commons Pool (assuming you're using Cassandra 0.7). I'm not saying
  it's perfect but it's NOT sharing a connection over multiple threads.
  Dan Hendry mentioned that he sees these errors. Is he also using Pelops?
  From his comment about retrying I'd assume not...
  
  --
  Dan Washusen
  
  On Thursday, 7 April 2011 at 7:39 PM, Héctor Izquierdo Seliva wrote:
  
  On Wed, 06-04-2011 at 21:04 -0500, Jonathan Ellis wrote:
  
  out of sequence response is thrift's way of saying I got a response
  for request Y when I expected request X.
  
  my money is on using a single connection from multiple threads. don't do
  that.
  
  I'm not using thrift directly, and my application is single-threaded, so I
  guess this is Pelops' fault somehow. Since I managed to tame memory
  consumption the problem has not appeared again, but it always happened
  during a stop-the-world GC. Could it be that the message was sent
  instead of being dropped by the server when the client assumed it had
  timed out?
 
 
 
 -- 
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of DataStax, the source for professional Cassandra support
 http://www.datastax.com
 


Re: Atomicity Strategies

2011-04-08 Thread Dan Washusen
Here's a good writeup on how fightmymonster.com does it...

http://ria101.wordpress.com/category/nosql-databases/locking/

-- 
Dan Washusen
Make big files fly
visit digitalpigeon.com

On Saturday, 9 April 2011 at 11:53 AM, Alex Araujo wrote:
On 4/8/11 5:46 PM, Drew Kutcharian wrote:
  I'm interested in this too, but I don't think this can be done with 
  Cassandra alone. Cassandra doesn't support transactions. I think hector can 
  retry operations, but I'm not sure about the atomicity of the whole thing.
  
  
  
  On Apr 8, 2011, at 1:26 PM, Alex Araujo wrote:
  
   Hi, I was wondering if there are any patterns/best practices for creating 
   atomic units of work when dealing with several column families and their 
   inverted indices.
   
   For example, if I have Users and Groups column families and did something 
   like:
   
   Users.insert( user_id, columns )
   UserGroupTimeline.insert( group_id, { timeuuid() : user_id } )
   UserGroupStatus.insert( group_id + : + user_id, { Active : True } )
   UserEvents.insert( timeuuid(), { user_id : user_id, group_id : 
   group_id, event_type : join } )
   
   Would I want the client to retry all subsequent operations that failed 
   against other nodes after n succeeded, maintain an undo queue of 
   operations to run, batch the mutations and choose a strong consistency 
   level, some combination of these/others, etc?
   
   Thanks,
   Alex
 Thanks Drew. I'm familiar with lack of transactions and have read about 
 people using ZK (possibly Cages as well?) to accomplish this, but since 
 it seems that inverted indices are common place I'm interested in how 
 anyone is mitigating lack of atomicity to any extent without the use of 
 such tools. It appears that Hector and Pelops have retrying built in to 
 their APIs and I'm fairly confident that proper use of those 
 capabilities may help. Just trying to cover all bases. Hopefully 
 someone can share their approaches and/or experiences. Cheers, Alex.
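
As a rough illustration of the "batch the mutations and choose a strong
consistency level" option raised above, here is a sketch using the Pelops Mutator
mentioned elsewhere in this archive. The column family and key names come from
the example in the question, the pool name is a placeholder, and the Mutator
method names follow common Pelops usage of the time (they may differ between
versions). Note that batching only narrows the window for partial failure and
gives a single call to retry; it is not a transaction.

    import org.apache.cassandra.thrift.ConsistencyLevel;
    import org.scale7.cassandra.pelops.Mutator;
    import org.scale7.cassandra.pelops.Pelops;

    final class JoinGroupSketch {
        static void joinGroup(String userId, String groupId, String timeUuid)
                throws Exception {
            Mutator mutator = Pelops.createMutator("main");
            // Both index writes travel in one batch at QUORUM.
            mutator.writeColumn("UserGroupTimeline", groupId,
                    mutator.newColumn(timeUuid, userId));
            mutator.writeColumn("UserGroupStatus", groupId + ":" + userId,
                    mutator.newColumn("Active", "True"));
            mutator.execute(ConsistencyLevel.QUORUM);
        }
    }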
 


Re: RE: batch_mutate failed: out of sequence response

2011-04-07 Thread Dan Washusen
Pelops uses a single connection per operation from a pool that is backed by 
Apache Commons Pool (assuming you're using Cassandra 0.7). I'm not saying it's 
perfect but it's NOT sharing a connection over multiple threads.

Dan Hendry mentioned that he sees these errors. Is he also using Pelops? From 
his comment about retrying I'd assume not...

-- 
Dan Washusen
On Thursday, 7 April 2011 at 7:39 PM, Héctor Izquierdo Seliva wrote: 
 On Wed, 06-04-2011 at 21:04 -0500, Jonathan Ellis wrote:
  out of sequence response is thrift's way of saying I got a response
  for request Y when I expected request X.
  
  my money is on using a single connection from multiple threads. don't do 
  that.
 
 I'm not using thrift directly, and my application is single-threaded, so I
 guess this is Pelops' fault somehow. Since I managed to tame memory
 consumption the problem has not appeared again, but it always happened
 during a stop-the-world GC. Could it be that the message was sent
 instead of being dropped by the server when the client assumed it had
 timed out?
 


Re: RE: batch_mutate failed: out of sequence response

2011-04-06 Thread Dan Washusen
Pelops will retry when TimedOutException, TTransportException or 
UnavailableException exceptions are thrown but not TApplicationException. 
TApplicationException has a type property which looks like it could be used to 
retry based on specific values. Based on the names the INTERNAL_ERROR and 
BAD_SEQUENCE_ID types sound like good candidates for a retry.

I just did a quick hunt through the Pycassa and Hector code and it doesn't look 
like they do anything special based on the type property.

Jonathan (or other Cassandra gurus): should connection managers take different 
actions based on the type property of TApplicationException?
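
A minimal sketch of the kind of type-based check being floated here; the
TApplicationException constants are from the Thrift library, and whether a retry
(on a fresh connection) is actually safe for these types is exactly the open
question above:

    import org.apache.thrift.TApplicationException;

    final class TypeBasedRetrySketch {
        // Decide whether a failed call might be worth retrying against another
        // node, based on the TApplicationException type discussed above.
        static boolean looksRetryable(TApplicationException e) {
            switch (e.getType()) {
                case TApplicationException.INTERNAL_ERROR:
                case TApplicationException.BAD_SEQUENCE_ID:
                    return true;
                default:
                    return false;
            }
        }
    }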

Cheers,
Dan


On Wednesday, 6 April 2011 at 8:03 PM, Héctor Izquierdo Seliva wrote:
On Wed, 06-04-2011 at 09:06 +1000, Dan Washusen wrote:
  Pelops raises a RuntimeException? Can you provide more info please?
 
 org.scale7.cassandra.pelops.exceptions.ApplicationException:
 batch_mutate failed: out of sequence response
 
  -- 
  Dan Washusen
  Make big files fly
  visit digitalpigeon.com
  
  On Tuesday, 5 April 2011 at 11:43 PM, Héctor Izquierdo Seliva wrote:
  
   On Tue, 05-04-2011 at 09:35 -0400, Dan Hendry wrote:
I too have seen the out of sequence response problem. My solution
has just been to retry and it seems to work. None of my mutations
are THAT large (< 200 columns). 

The only related information I could find points to a
thrift/ubuntu bug of some kind
(http://markmail.org/message/xc3tskhhvsf5awz7). What OS are you
running? 

Dan
   
   Hi Dan. I'm running on Debian stable and cassandra 0.7.4. I have
   rows
   with up to 1000 columns. I have changed the way I was doing the
   batch
   mutates to never be bigger than 100 columns at a time. I hope this
   will
   work, otherwise the move is going to take too long.
   
   The problem is aggravated by Pelops not retrying automatically and
   instead raising a RuntimeException.
   
   I'll try to add a retry if this doesn't work. 
   
   Thanks for your response!
   
   Héctor
   
-Original Message-
From: Héctor Izquierdo Seliva [mailto:izquie...@strands.com] 
Sent: April-05-11 8:30
To: user@cassandra.apache.org
Subject: batch_mutate failed: out of sequence response

Hi everyone. I'm having trouble while inserting big amounts of
data into
cassandra. I'm getting this exception:

batch_mutate failed: out of sequence response

I'm guessing it's due to very big mutates. I have made the batch
mutates
smaller and it seems to be behaving. Can somebody shed some light?

Thanks!

 


Re: RE: batch_mutate failed: out of sequence response

2011-04-05 Thread Dan Washusen
Pelops raises a RuntimeException? Can you provide more info please?

-- 
Dan Washusen
Make big files fly
visit digitalpigeon.com

On Tuesday, 5 April 2011 at 11:43 PM, Héctor Izquierdo Seliva wrote:
On Tue, 05-04-2011 at 09:35 -0400, Dan Hendry wrote:
  I too have seen the out of sequence response problem. My solution has just 
  been to retry and it seems to work. None of my mutations are THAT large 
  (< 200 columns). 
  
  The only related information I could find points to a thrift/ubuntu bug of 
  some kind (http://markmail.org/message/xc3tskhhvsf5awz7). What OS are you 
  running? 
  
  Dan
 
 Hi Dan. I'm running on Debian stable and cassandra 0.7.4. I have rows
 with up to 1000 columns. I have changed the way I was doing the batch
 mutates to never be bigger than 100 columns at a time. I hope this will
 work, otherwise the move is going to take too long.
 
 The problem is aggravated by Pelops not retrying automatically and
 instead raising a RuntimeException.
 
 I'll try to add a retry if this doesn't work. 
 
 Thanks for your response!
 
 Héctor
 
  -Original Message-
  From: Héctor Izquierdo Seliva [mailto:izquie...@strands.com] 
  Sent: April-05-11 8:30
  To: user@cassandra.apache.org
  Subject: batch_mutate failed: out of sequence response
  
  Hi everyone. I'm having trouble while inserting big amounts of data into
  cassandra. I'm getting this exception:
  
  batch_mutate failed: out of sequence response
  
  I'm guessing it's due to very big mutates. I have made the batch mutates
  smaller and it seems to be behaving. Can somebody shed some light?
  
  Thanks!
  
 


Re: OOM during compaction with half the heap still available?

2011-03-24 Thread Dan Washusen
Ah, it would appear I forgot to do that on the hudson machine. Thanks!

-- 
Dan Washusen
On Friday, 25 March 2011 at 2:23 PM, Jonathan Ellis wrote: 
 Have you run nodetool scrub? The data versioning problem scrub fixes
 can manifest itself as trying to read GB of data into memory during
 compaction.
 
 On Thu, Mar 24, 2011 at 8:52 PM, Dan Washusen d...@reactive.org wrote:
  Hey All,
  I've noticed that the Cassandra instance I have running on our build machine
  occasionally crashes with an OOM error during compaction. I'm going to dial
  down the memtable thresholds etc but I was wondering if anyone could help
  explain the heap usage at the time of the crash. I just happened to leave a
  JMX console window open and it's showing that just before the crash roughly
  50% of the heap was still available.
  Screenshot of heap usage: http://img3.imageshack.us/img3/2822/memoryf.png
  The above screenshot was taken a few weeks ago on Cassandra 0.7.2 (I think)
  with mmap disabled on Java 1.6.0_24-b07. I'm now running 0.7.4 with mmap
  enabled and just got the same crash (based on the error message in the
  log)...
  Log snippet from crash that matches screenshot: http://pastebin.com/ACa8fKUu
  Log snippet from 0.7.4 crash: http://pastebin.com/SwSQawUM
  Cheers,
  --
  Dan Washusen
 
 
 
 -- 
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of DataStax, the source for professional Cassandra support
 http://www.datastax.com
 


Re: I: Re: Are row-keys sorted by the compareWith?

2011-03-01 Thread Dan Washusen
Pelops moved to github several months ago...

https://github.com/s7/scale7-pelops/blob/master/src/main/java/org/scale7/cassandra/pelops/Selector.java#L1179

Cheers,
-- 
Dan Washusen
On Wednesday, 2 March 2011 at 3:35 AM, Matthew Dennis wrote: 
 I'm not really familiar with pelops code, but I found two implementations (~ 
 line 454 and ~ line 559) of getColumnsFromRows in Selector.java in pelops 
 trunk.
 
 The first uses a HashMap so it clearly isn't ordered, the second uses a 
 LinkedHashMap but it inserts the keys in the order returned by C* which we 
 already know isn't ordered.
 
 See http://bit.ly/egZaXi for relevant code.
 
 Like I said, I'm not really familiar with pelops so I could be completely off 
 on this, but it looks like if pelops was intending to preserve the order of 
 the requested keys that it's not actually doing it...
 
 On Wed, Feb 23, 2011 at 3:44 PM, Dan Washusen d...@reactive.org wrote:
  Hi Matthew,
  As you mention the map returned from multiget_slice is not order 
  preserving, Pelops is doing this on the client side...
  
  Cheers,
  Dan
  
  -- 
  Dan Washusen
  Sent with Sparrow
  On Wednesday, 23 February 2011 at 8:38 PM, Matthew Dennis wrote:
   The map returned by multiget_slice (what I suspect is the underlying 
   thrift call for getColumnsFromRows) is not a order preserving map, it's a 
   HashMap so the order of the returned results cannot be depended on.  Even 
   if it was a order preserving map, not all languages would be able to make 
   use of the results since not all languages have ordered maps (though 
   many, including Java, certainly do).
   
   That being said, it would be fairly easy to change this on the C* side to 
   preserve the order the keys were requested in, though as mentioned not 
   all clients could take advantage of it.
   
On Mon, Feb 21, 2011 at 4:09 PM, cbert...@libero.it cbert...@libero.it 
   wrote:
 
 As Jonathan mentions, the compareWith on a column family def. 
 defines the order for the columns *within* a row... In order to 
 control the ordering of rows you'll need to use the 
 OrderPreservingPartitioner 
 (http://www.datastax.com/docs/0.7/operations/clustering#tokens-partitioners-ring).
 
 Thanks for your answer and for your time, I will take a look at this.
 
 As for getColumnsFromRows; it should be returning you a map of lists. 
 The map is insertion-order-preserving and populated based on the 
 provided list of row keys (so if you iterate over the entries in the 
 map they should be in the same order as the list of row keys). 
 
 mmm ... well it didn't happen like this. In my code I had a CF named 
 comments and also a CF called usercomments. UserComments use an uuid 
 as row-key to keep, TimeUUID sorted, the pointers to the comments 
 of the user. When I get the sorted list of keys from the UserComments 
 and I use this list as row-keys-list in the GetColumnsFromRows I 
 don't get back the data sorted as I expect them to be. 
 It looks like if Cassandra/Pelops does not care on how I provide the 
 row-keys-list. I am sure about that cause I did something different: 
 I iterate over my row-keys-list and made many GetColumnFromRow 
 instead of one GetColumnsFromRows and when I iterate data are 
 correctly sorted. But this can not be a solution ...
 
 I am using Cassandra 0.6.9
 
 I profit from your knowledge of Pelops to ask you something: I am 
 evaluating the migration to Cassandra 0.7 ... as far as you know, in 
 terms of written code, is it a heavy job? 
 
 Best Regards
 
 Carlo
 
   Messaggio originale
   Da: d...@reactive.org
  
  On Saturday, 19 February 2011 at 8:16 AM, cbert...@libero.it wrote:
   Hi all,
   I created a CF in which i need to get, sorted by time, the Rows 
   inside. Each 
   Row represents a comment.
   
   ColumnFamily name=Comments compareWith=TimeUUIDType / 
   
   I've created a few rows using as Row Key a generated TimeUUID but 
   when I call 
   the Pelops method GetColumnsFromRows I don't get the data back 
   as I expect: 
   rows are not sorted by TimeUUID.
I though it was probably cause of the random-part of the 
   TimeUUID so I create 
   a new CF ...
   
   ColumnFamily name=Comments2 compareWith=LongType / 
   
   This time I created a few rows using the java 
   System.CurrentTimeMillis() that 
retrieve a long. I call again the GetColumnsFromRows and again 
   the same 
   results: data are not sorted!
   I've read many times that Rows are sorted as specified in the 
   compareWith but 
   I can't see it. 
To solve this problem for the moment I've used a 
   SuperColumnFamily with an 
   UNIQUE ROW ... but I think this is just a workaround and not the 
   solution

Re: I: Re: Are row-keys sorted by the compareWith?

2011-02-23 Thread Dan Washusen
Hi Matthew,
As you mention the map returned from multiget_slice is not order preserving, 
Pelops is doing this on the client side...

Cheers,
Dan

-- 
Dan Washusen
Sent with Sparrow
On Wednesday, 23 February 2011 at 8:38 PM, Matthew Dennis wrote: 
 The map returned by multiget_slice (what I suspect is the underlying thrift 
 call for getColumnsFromRows) is not a order preserving map, it's a HashMap so 
 the order of the returned results cannot be depended on.  Even if it was a 
 order preserving map, not all languages would be able to make use of the 
 results since not all languages have ordered maps (though many, including 
 Java, certainly do).
 
 That being said, it would be fairly easy to change this on the C* side to 
 preserve the order the keys were requested in, though as mentioned not all 
 clients could take advantage of it.
 
  On Mon, Feb 21, 2011 at 4:09 PM, cbert...@libero.it cbert...@libero.it 
 wrote:
   
    As Jonathan mentions, the compareWith on a column family def. defines 
   the order for the columns *within* a row... In order to control the 
   ordering of rows you'll need to use the OrderPreservingPartitioner 
   (http://www.datastax.com/docs/0.7/operations/clustering#tokens-partitioners-ring).
   
   Thanks for your answer and for your time, I will take a look at this.
   
   As for getColumnsFromRows; it should be returning you a map of lists. The 
   map is insertion-order-preserving and populated based on the provided 
   list of row keys (so if you iterate over the entries in the map they 
   should be in the same order as the list of row keys). 
   
   mmm ... well it didn't happen like this. In my code I had a CF named 
   comments and also a CF called usercomments. UserComments use an uuid as 
   row-key to keep, TimeUUID sorted, the pointers to the comments of the 
   user. When I get the sorted list of keys from the UserComments and I use 
   this list as row-keys-list in the GetColumnsFromRows I don't get back the 
   data sorted as I expect them to be. 
   It looks like if Cassandra/Pelops does not care on how I provide the 
   row-keys-list. I am sure about that cause I did something different: I 
   iterate over my row-keys-list and made many GetColumnFromRow instead of 
   one GetColumnsFromRows and when I iterate data are correctly sorted. But 
   this can not be a solution ...
   
   I am using Cassandra 0.6.9
   
    I profit from your knowledge of Pelops to ask you something: I am 
   evaluating the migration to Cassandra 0.7 ... as far as you know, in 
    terms of written code, is it a heavy job? 
   
   Best Regards
   
   Carlo
   
 Messaggio originale
 Da: d...@reactive.org

On Saturday, 19 February 2011 at 8:16 AM, cbert...@libero.it wrote:
 Hi all,
 I created a CF in which i need to get, sorted by time, the Rows 
 inside. Each 
 Row represents a comment.
 
 ColumnFamily name=Comments compareWith=TimeUUIDType / 
 
 I've created a few rows using as Row Key a generated TimeUUID but 
 when I call 
 the Pelops method GetColumnsFromRows I don't get the data back as I 
 expect: 
 rows are not sorted by TimeUUID.
  I though it was probably cause of the random-part of the TimeUUID so 
 I create 
 a new CF ...
 
 ColumnFamily name=Comments2 compareWith=LongType / 
 
 This time I created a few rows using the java 
 System.CurrentTimeMillis() that 
  retrieve a long. I call again the GetColumnsFromRows and again the 
 same 
 results: data are not sorted!
 I've read many times that Rows are sorted as specified in the 
 compareWith but 
 I can't see it. 
  To solve this problem for the moment I've used a SuperColumnFamily 
 with an 
 UNIQUE ROW ... but I think this is just a workaround and not the 
 solution.
 
 ColumnFamily name=Comments type=Super compareWith=TimeUUIDType 
  CompareSubcolumnsWith=BytesType/ 
 
 Now when I call the GetSuperColumnsFromRow I get all the 
 SuperColumns as I 
 expected: sorted by TimeUUID. Why it does not happen the same with 
 the Rows? 
  I'm confused.
 
 TIA for any help.
 
 Best Regards
 
 Carlo
 


   
   
 
 


Re: Are row-keys sorted by the compareWith?

2011-02-20 Thread Dan Washusen
Hi Carlo,
As Jonathan mentions, the compareWith on a column family def. defines the 
order for the columns *within* a row... In order to control the ordering of 
rows you'll need to use the OrderPreservingPartitioner 
(http://www.datastax.com/docs/0.7/operations/clustering#tokens-partitioners-ring).


As for getColumnsFromRows; it should be returning you a map of lists. The map 
is insertion-order-preserving and populated based on the provided list of row 
keys (so if you iterate over the entries in the map they should be in the same 
order as the list of row keys). The list for each row entry are definitely in 
the order that Cassandra provides them, take a look at 
org.scale7.cassandra.pelops.Selector#toColumnList if you need more info.
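
A small JDK-only sketch of that client-side re-ordering step (names are
illustrative, not the actual Pelops code): the raw Thrift multiget result is an
unordered map, so it is copied into a LinkedHashMap keyed in the same order as
the requested row keys.

    import java.util.ArrayList;
    import java.util.LinkedHashMap;
    import java.util.List;
    import java.util.Map;

    final class OrderedResultSketch {
        // Rebuild an insertion-order-preserving view of a multiget result so
        // that iteration follows the order in which the row keys were requested.
        static <K, V> Map<K, List<V>> orderByRequestedKeys(List<K> requestedKeys,
                                                           Map<K, List<V>> unordered) {
            Map<K, List<V>> ordered = new LinkedHashMap<K, List<V>>();
            for (K key : requestedKeys) {
                List<V> value = unordered.get(key);
                // Keys Cassandra returned nothing for become empty lists.
                ordered.put(key, value != null ? value : new ArrayList<V>());
            }
            return ordered;
        }
    }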

Cheers,
Dan

-- 
Dan Washusen
Sent with Sparrow
On Saturday, 19 February 2011 at 8:16 AM, cbert...@libero.it wrote: 
 Hi all,
 I created a CF in which i need to get, sorted by time, the Rows inside. Each 
 Row represents a comment.
 
 <ColumnFamily name="Comments" compareWith="TimeUUIDType" /> 
 
 I've created a few rows using as Row Key a generated TimeUUID but when I call 
 the Pelops method GetColumnsFromRows I don't get the data back as I expect: 
 rows are not sorted by TimeUUID.
 I though it was probably cause of the random-part of the TimeUUID so I create 
 a new CF ...
 
 <ColumnFamily name="Comments2" compareWith="LongType" /> 
 
 This time I created a few rows using the java System.CurrentTimeMillis() that 
 retrieve a long. I call again the GetColumnsFromRows and again the same 
 results: data are not sorted!
 I've read many times that Rows are sorted as specified in the compareWith but 
 I can't see it. 
 To solve this problem for the moment I've used a SuperColumnFamily with an 
 UNIQUE ROW ... but I think this is just a workaround and not the solution.
 
 <ColumnFamily name="Comments" type="Super" compareWith="TimeUUIDType" 
 CompareSubcolumnsWith="BytesType" /> 
 
 Now when I call the GetSuperColumnsFromRow I get all the SuperColumns as I 
 expected: sorted by TimeUUID. Why it does not happen the same with the Rows? 
 I'm confused.
 
 TIA for any help.
 
 Best Regards
 
 Carlo
 


Re: Another EOFException

2011-02-15 Thread Dan Washusen
I'm seeing this as well; several column families with keys_cached = 0 on 0.7.1.

Debug level logs: http://pastebin.com/qvujKDth

-- 
Dan Washusen
On Wednesday, 16 February 2011 at 1:12 PM, Jonathan Ellis wrote: 
 Created https://issues.apache.org/jira/browse/CASSANDRA-2172.
 
 On Tue, Feb 15, 2011 at 3:34 PM, B. Todd Burruss bburr...@real.com wrote:
  it happens when i start the node. just tried it again. here's the
  saved_caches directory:
  
  
  [cassandra@kv-app02 ~]$ ls -l /data/cassandra-data/saved_caches/
  total 12
  -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 10:36
  NotificationSystem-Events-KeyCache
  -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 10:36
  NotificationSystem-Msgs-KeyCache
  -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 10:36
  NotificationSystem-Rendered-KeyCache
  -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 10:36
  NotificationSystem-ScheduledMsgs-KeyCache
  -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 10:36
  NotificationSystem-ScheduledTimes-KeyCache
  -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 10:36
  NotificationSystem-SystemState-KeyCache
  -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 10:36
  NotificationSystem-Templates-KeyCache
  -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 10:36
  NotificationSystem-Transports-KeyCache
  -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 10:36
  Queues-EmailTransport_Pending-KeyCache
  -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 10:36
  Queues-EmailTransport_Waiting-KeyCache
  -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 10:36
  Queues-Errors_Pending-KeyCache
  -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 10:36
  Queues-Errors_Waiting-KeyCache
  -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 10:36
  Queues-MessageDescriptors-KeyCache
  -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 10:36
  Queues-PipeDescriptors-KeyCache
  -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 10:36
  Queues-Processing_Pending-KeyCache
  -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 10:36
  Queues-Processing_Waiting-KeyCache
  -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 10:36
  Queues-QueueDescriptors-KeyCache
  -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 10:36
  Queues-QueuePipeCnxn-KeyCache
  -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 10:36 Queues-QueueStats-KeyCache
  -rw-rw-r-- 1 cassandra cassandra 38 Feb 15 09:36
  system-HintsColumnFamily-KeyCache
  -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 09:36 system-IndexInfo-KeyCache
  -rw-rw-r-- 1 cassandra cassandra 5 Feb 15 09:36
  system-LocationInfo-KeyCache
  -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 09:36 system-Migrations-KeyCache
  -rw-rw-r-- 1 cassandra cassandra 18 Feb 15 09:36 system-Schema-KeyCache
  -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 09:36
  UDS4Profile-ProfileDefinitions-KeyCache
  -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 09:36
  UDS4Profile-ProfileNamespaces-KeyCache
  -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 09:36
  UDS4Profile-Profiles_40229-KeyCache
  -rw-rw-r-- 1 cassandra cassandra 0 Feb 15 09:36
  UDS4Profile-Profiles_RN_test-KeyCache
  
  
  
  On 02/15/2011 01:01 PM, Jonathan Ellis wrote:
   
   Is this reproducible or just I happened to kill the server while it
   was in the middle of writing out the cache keys?
   
   On Tue, Feb 15, 2011 at 1:10 PM, B. Todd Burrussbburr...@real.com
   wrote:

the following exception seems to be about loading saved caches, but i
don't
really care about the cache so maybe isn't a big deal. anyway, this is
with
patched 0.7.1 (0001-Fix-bad-signed-conversion-from-byte-to-int.patch)


WARN 11:07:59,800 error reading saved cache
/data/cassandra-data/saved_caches/UDS4Profile-Profiles_40229-KeyCache
java.io.EOFException
at

java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2281)
at

java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2750)
at
java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:780)
at java.io.ObjectInputStream.init(ObjectInputStream.java:280)
at

org.apache.cassandra.db.ColumnFamilyStore.readSavedCache(ColumnFamilyStore.java:255)
at

org.apache.cassandra.db.ColumnFamilyStore.init(ColumnFamilyStore.java:198)
at

org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:451)
at

org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:432)
at org.apache.cassandra.db.Table.initCf(Table.java:360)
at org.apache.cassandra.db.Table.init(Table.java:290)
at org.apache.cassandra.db.Table.open(Table.java:107)
at

org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:162)
at

org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:316)
at
org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:79)
 
 
 
 -- 
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of DataStax, the source for professional Cassandra support
 http

Re: seed node failure crash the whole cluster

2011-02-07 Thread Dan Washusen
Hi,
I've added some comments and questions inline.

Cheers,
Dan

On 8 February 2011 10:00, Jonathan Ellis jbel...@gmail.com wrote:

 On Mon, Feb 7, 2011 at 1:51 AM, TSANG Yiu Wing ywts...@gmail.com wrote:
  cassandra version: 0.7
 
  client library: scale7-pelops / 1.0-RC1-0.7.0-SNAPSHOT
 
  cluster: 3 machines (A, B, C)
 
  details:
  it works perfectly when all 3 machines are up and running
 
  but if the seed machine is down, the problems happen:
 
  1) new client connection cannot be established

 sounds like pelops relies on the seed node to introduce it to the
 cluster.  you should configure it either with a hardcoded list of
 nodes or use something like RRDNS instead.  I don't use pelops so I
 can't help other than that.  (I believe there is a mailing list for
 Pelops though.)


When dynamic node discovery is turned on (off by default) it doesn't
(shouldn't) rely on the initial seed node once past initialization.  So
either make sure you have dynamic node discovery turned on or seed Pelops
with all nodes in your cluster...

It would be helpful if you provided more information about the errors you're
seeing preferably with debug level logging turned on.



  2) if a client keeps connecting to and operating at (issue get and
  update) the cluster, when the seed is down, the working client will
  throw exception upon the next operation

 I know Hector supports transparent failover to another Cassandra node.
  Perhaps Pelops does not.


Pelops will validate connections at a configurable period (60 seconds by
default) and remove them from the pool.  Pelops will also retry the
operation three times (configurable) against a different node in the pool
each time.

If you want Pelops to take more agressive actions when it detects downed
nodes then check out
org.scale7.cassandra.pelops.pool.CommonsBackedPool.INodeSuspensionStrategy.



  3) using cassandra-cli to connect the remaining nodes in the cluster,
  Internal error processing get_range_slices will happen when querying
  column family
  list cf;

 Cassandra always logs the cause of internal errors in system.log, so
 you should look there.

 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of DataStax, the source for professional Cassandra support
 http://www.datastax.com



Re: Re: R: Re: Ring up but read fails ...

2011-01-23 Thread Dan Washusen
This is a known issue with the Cassandra 0.6 versions of Pelops.  The issue
was fixed in the 0.7 based versions a few months ago but never back-ported
(Dominic, myself and the other contributors don't run 0.6)...

On 24 January 2011 05:25, cbert...@libero.it cbert...@libero.it wrote:

  Reconnect and try again?

 Sorry what do you mean by Reconnect and try again? -- You mean to shut
 down the old pool and create a new pool of connections?
 I don't have the possibility to handle the single connection using Pelops
 ...

 From Dominic Williams Blog

 To work with a Cassandra cluster, you need to start off by defining a
 connection pool. This is typically done once in the startup code of your
 application
 [...]
 One of the key design decisions that at the time of writing distinguishes
 Pelops, is that the data processing code written by developers does not
 involve connection pooling or management. Instead, classes like Mutatorand
 Selector borrow connections to Cassandra from a Pelops pool for just the
 periods that they need to read and write to the underlying Thrift API. This
 has two advantages.

 Firstly, obviously, code becomes cleaner and developers are freed from
 connection management concerns. But also more subtly this enables the Pelops
 library to completely manage connection pooling itself, and for example keep
 track of how many outstanding operations are currently running against each
 cluster node.

 This for example, enables Pelops to perform more effective client load
 balancing by ensuring that new operations are performed against the node to
 which it currently has the least outstanding operations running. Because of
 this architectural choice, it will even be possible to offer strategies in
 the future where for example nodes are actually queried to determine their
 load.


 TIA


 --

 - Carlo -



Re: Java cient

2011-01-19 Thread Dan Washusen
Pelops is a pretty thin wrapper for the Thrift API.  Its thinness has both up
and down sides; on the up side it's very easy to map functionality mentioned
on the Cassandra API wiki page to functionality provided by Pelops, it is
also relatively simple to add features (thanks to Alois^^ for indexing
support).  The down side is you often have to deal with the Cassandra Thrift
classes like ColumnOrSuperColumn...

On 20 January 2011 15:58, Dan Retzlaff dretzl...@gmail.com wrote:

 My team switched our production stack from Hector to Pelops a while back,
 based largely on this admittedly subjective programmer experience bit.
 I've found Pelops' code and abstractions significantly easier to follow and
 integrate with, plus Pelops has had feature-parity with Hector for all of
 our use cases. It's quite possible that we just caught Hector during its
 transition to what Nate calls v2 but for our part, with no disrespect to
 the Hector community intended, we've been quite happy with the transition.

 Dan

 On Wed, Jan 19, 2011 at 3:30 PM, Jonathan Shook jsh...@gmail.com wrote:

 Perhaps. I use Hector. I have a bit of rework to do moving from .6 to
 .7. This is something I wasn't anticipating in my earlier planning.
 Had Pelops been around when I started using Hector, I would have
 probably chosen it over Hector. The Pelops client seemed to be better
 conceived as far as programmer experience and simplicity went. Since
 then, Hector has had a v2 upgrade to their API which breaks much of
 the things that you would have done in version .6 and before.
 Conceptually speaking, they appear more similar now than before the
 Hector changes.

 I'm dreading having to do a significant amount of work on my client
 interface because of the incompatible API changes.. but I will have to
 in order to get my client/server caught up to the currently supported
 branch. That is just part of the cost of doing business with Cassandra
 at the moment. Hopefully after 1.0 on the server and some of the
 clients, this type of thing will be more unusual.


 2011/1/19 Noble Paul നോബിള്‍  नोब्ळ् noble.p...@gmail.com:
  Thanks everyone. I guess, I should go with hector
 
  On 18 Jan 2011 17:41, Alois Bělaška alois.bela...@gmail.com wrote:
  Definitelly Pelops https://github.com/s7/scale7-pelops
 
  2011/1/18 Noble Paul നോബിള്‍ नोब्ळ् noble.p...@gmail.com
 
  What is the most commonly used java client library? Which is the the
 most
  mature/feature complete?
  Noble
 
 





Re: Range Queries in RP on SCF in 0.7 with UUID SCs

2010-12-01 Thread Dan Washusen
Using the methods on the Bytes class would be preferable.  The byte[]
related methods on UuidHelper should have been deprecated when the Bytes
class was introduced...

e.g. new Bytes(col.getName()).toUuid()
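
For context on why the original output looked wrong: UUID.nameUUIDFromBytes()
derives a brand new version-3 (name-based) UUID from the bytes rather than
decoding them, whereas the Bytes wrapper reads the stored TimeUUID back out. A
tiny sketch using the types already shown in this thread:

    import java.util.UUID;
    import org.apache.cassandra.thrift.SuperColumn;
    import org.scale7.cassandra.pelops.Bytes;

    final class SuperColumnNameSketch {
        // Decode a TimeUUID super column name instead of deriving a new UUID
        // from its bytes.
        static UUID nameOf(SuperColumn col) {
            return new Bytes(col.getName()).toUuid();
        }
    }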

Cheers,
Dan

On Thu, Dec 2, 2010 at 10:26 AM, Frank LoVecchio fr...@isidorey.com wrote:

 Actually, it was a class issue at this line:

 System.out.println("NAME: " + UUID.nameUUIDFromBytes(col.getName()));

 The native Pelops class timeUuidHelper is what should be used.

 On Wed, Dec 1, 2010 at 4:16 PM, Aaron Morton aa...@thelastpickle.comwrote:

 When you say I want to get rows starting from a Super Column... it's a
 bit confusing. Do you want to get super columns from a single row, or
 multiple rows? I'm assuming you are talking about getting columns from a
 single row / key as that's what your code does.

 For the pelops code, it looks OK but I've not used Pelops. You can turn
 the logging up on the server and check the command that is sent to it. I
 would guess there is something wrong with the way you are transforming the
 start key

 For your cli example what was the command you executed ?

 Aaron

 On 02 Dec, 2010,at 11:03 AM, Frank LoVecchio fr...@isidorey.com wrote:

 Hey Aaron,


 Yes, in regards to SCF definition, you are correct:


 name: Sensor

   column_type: Super

   compare_with: TimeUUIDType

   gc_grace_seconds: 864000

   keys_cached: 1.0

   read_repair_chance: 1.0

   rows_cached: 0.0

 I'm not quite sure I follow you, though, as I think I'm doing what you
 specify.  The Pelops code is below.  Basically, I want to get rows
 starting from a Super Column with a specific UUID and limit the number, just
 as you inferred.  When I run this code I just get the last N values (25 in
 this case) if non-reversed, and the first N values if reversed.  However,
 regardless of what start param we use (Super Column UUID is String startKey
 below), we still get the same values for the specified amount (e.g. the same
 25).

 public void getSuperRowKeys(String rowKey, String columnFamily, int limit,
     String startKey) throws Exception {

   byte[] byteArray = UuidHelper.timeUuidStringToBytes(startKey);
   ByteBuffer bb = ByteBuffer.wrap(byteArray);
   new UUID(bb.getLong(), bb.getLong());

   List<SuperColumn> cols = selector.getPageOfSuperColumnsFromRow(columnFamily,
       rowKey, Bytes.fromByteBuffer(bb), false, limit, ConsistencyLevel.ONE);

   for (SuperColumn col : cols) {
     if (col.getName() != null) {
       System.out.println("NAME: " + UUID.nameUUIDFromBytes(col.getName()));

       for (Column c : col.columns) {
         System.out.println("\t\tName: " + Bytes.toUTF8(c.getName())
             + " Value: " + Bytes.toUTF8(c.getValue())
             + " timestamp: " + c.timestamp);
       }
     }
   }
 }

 Here is some example data from the CLI.  If we specify
 2f814d30-f758-11df-2f81-4d30f75811df
 as the start param (second super column down), we still get
 952e6540-f759-11df-952e-6540f75911df
 (first super column) returned.

 = (super_column=952e6540-f759-11df-952e-6540f75911df,
  (column=64617465,
 value=323031302d31312d32332032333a32393a30332e303030,
 timestamp=1290554997141000)
  (column=65787472615f696e666f, value=6e6f6e65,
 timestamp=1290554997141000)
  (column=726561736f6e, value=6e6f6e65, timestamp=1290554997141000)
  (column=7365636f6e64735f746f5f6e657874, value=373530,
 timestamp=1290554997141000)
  (column=73657269616c, value=393135353032353731,
 timestamp=1290554997141000)
  (column=737461747573, value=5550, timestamp=1290554997141000)
  (column=74797065, value=486561727462656174,
 timestamp=1290554997141000))
 = (super_column=2f814d30-f758-11df-2f81-4d30f75811df,
  (column=64617465,
 value=323031302d31312d32332032333a31393a30332e303030,
 timestamp=129055439706)
  (column=65787472615f696e666f, value=6e6f6e65,
 timestamp=129055439706)
  (column=726561736f6e, value=6e6f6e65, timestamp=129055439706)
  (column=7365636f6e64735f746f5f6e657874, value=373530,
 timestamp=129055439706)
  (column=73657269616c, value=393135353032353731,
 timestamp=129055439706)
  (column=737461747573, value=5550, timestamp=129055439706)
  (column=74797065, value=486561727462656174,
 timestamp=129055439706))
 = (super_column=7c959f00-f757-11df-7c95-9f00f75711df,
  (column=64617465,
 value=323031302d31312d32332032333a31343a30332e303030,
 timestamp=1290554096881000)
  (column=65787472615f696e666f, value=6e6f6e65,
 timestamp=1290554096881000)
  (column=726561736f6e, value=6e6f6e65, timestamp=1290554096881000)
  (column=7365636f6e64735f746f5f6e657874, value=373530,
 timestamp=1290554096881000)
  (column=73657269616c, value=393135353032353731,
 timestamp=1290554096881000)
  (column=737461747573, value=5550, timestamp=1290554096881000)
  (column=74797065, value=486561727462656174,
 timestamp=1290554096881000))
 = (super_column=c9be6330-f756-11df-c9be-6330f75611df,
  

cassandra-cli multiline commands?

2010-11-23 Thread Dan Washusen
I notice CASSANDRA-1742 mentions support for commands that span multiple
lines in cassandra-cli.  Did it make it in?  If so, what's the syntax?

Cheers,
Dan


Re: HintedHandoff and ReplicationFactor with a downed node

2010-10-22 Thread Dan Washusen
The last time this came up on the list Jonathan Ellis said (something
along the lines of) if your application can't tolerate stale data then
you should read with a consistency level of QUORUM.
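
In client code the only difference is the consistency level passed on the
read; a minimal sketch (reusing the 0.6-era Thrift call that appears
elsewhere in this archive; 'client', 'keyspace', 'key' and 'colPathFdl' are
assumed to already be set up):

    // fast, but a replica that missed the write may serve stale data
    Column maybeStale = client.get(keyspace, key, colPathFdl, ConsistencyLevel.ONE).getColumn();

    // read-your-writes (when paired with QUORUM writes), at higher latency
    Column fresh = client.get(keyspace, key, colPathFdl, ConsistencyLevel.QUORUM).getColumn();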

It would be nice if there was some sort of middle ground, so that an
application that can tolerate slightly stale data (minutes) but not
very stale data (hours or days) could still get the performance gain
of a consistency level of ONE.  Even if a node just made a best effort
in the OP's scenario it might be sufficient...?

Is there an alternative solution to reading with consistency level of
QUORUM?  For example, if a node has been down for an extended period
of time could you re-add it as a new node (fetching all its data
again) and avoid having to read with QUORUM?

Just curious... :)

Cheers,
Dan

On Sat, Oct 23, 2010 at 10:01 AM, Rob Coli rc...@digg.com wrote:

 On 10/22/10 2:55 PM, Craig Ching wrote:

 Even better, I'd love a way to not allow B to be available
 until replication is complete, can I detect that somehow?

 Proposed and rejected a while back :

 https://issues.apache.org/jira/browse/CASSANDRA-768

 =Rob


Re: What is the correct way of changing a partitioner?

2010-10-19 Thread Dan Washusen
http://wiki.apache.org/cassandra/DistributedDeletes

From the http://wiki.apache.org/cassandra/StorageConfiguration page:

 Achtung! Changing this parameter requires wiping your data directories,
 since the partitioner can modify the sstable on-disk format.


So delete your data and commit log dirs after shutting down Cassandra...

On Tue, Oct 19, 2010 at 4:09 PM, Wicked J wickedj2...@gmail.com wrote:

 Hi,
 I deleted all the data (programmatically). Then I changed the partitioner
 from RandomPartitioner to OrderPreservingPartitioner and when I started
 Cassandra - I get the following error. What is the correct way of changing
 the partitioner and how can I get past this error?

 ERROR 17:28:28,985 Fatal exception during initialization
 java.io.IOException: Found system table files, but they couldn't be loaded.
 Did you change the partitioner?
 at
 org.apache.cassandra.db.SystemTable.checkHealth(SystemTable.java:154)
 at
 org.apache.cassandra.thrift.CassandraDaemon.setup(CassandraDaemon.java:94)
 at
 org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:211)

 Thanks!


Re: Running out of heap

2010-09-22 Thread Dan Washusen
http://wiki.apache.org/cassandra/FAQ#i_deleted_what_gives

That help?  (Short version: a delete writes a tombstone rather than removing
the data immediately; space is only reclaimed when compaction runs, and the
tombstones themselves are only purged after GCGraceSeconds.)

On Wed, Sep 22, 2010 at 5:27 PM, Chris Jansen 
chris.jan...@cognitomobile.com wrote:

 Hi all,



 I have written a test application that does a write, read and delete on one
 of the sample column families that ship with Cassandra, and for some reason
 when I leave it going for an extended period of time I see Cassandra crash
 with out of heap exceptions. I don’t understand why this should be as I am
 deleting the data almost as soon as I have read it.



 Also I am seeing the data files grow for Keyspace1, again with apparently
 no reason as I am deleting the data as I read it, which eventually causes
 the disk space to completely fill up.



 How can this be, am I using Cassandra in the wrong way or is this a bug?



 Any help or advice would be greatly appreciated.



 Thanks in advance,



 Chris





 PS To give a better idea of what I am doing I’ve included some of the
 source from my Java test app, typically I have 20 threads running in
 parallel performing this operation:



 while (true) {
     long startTime = System.currentTimeMillis();
     key = UUID.randomUUID().toString();
     long timestamp = System.currentTimeMillis();
     ColumnPath colPathFdl = new ColumnPath(columnFamily);
     colPathFdl.setColumn(("345345345354" + key).getBytes("UTF8"));

     boolean broken = true;

     while (broken) {
         try {
             client.insert(keyspace, key, colPathFdl,
                     getBytesFromFile(new File("/opt/java/apache-cassandra/conf/storage-conf.xml")),
                     timestamp, ConsistencyLevel.QUORUM);
             broken = false;
         } catch (Exception e) {
             System.out.println("Cannot write: " + key + " RETRYING");
             broken = true;
             e.printStackTrace();
         }
     }

     try {
         Column col = client1.get(keyspace, key, colPathFdl, ConsistencyLevel.QUORUM).getColumn();
         System.out.println(key + " column name: " + new String(col.name, "UTF8"));
         //System.out.println("column value: " + new String(col.value, "UTF8"));
         System.out.println(key + " column timestamp: " + new Date(col.timestamp));
     } catch (Exception e) {
         System.out.println("Cannot read: " + key);
         e.printStackTrace();
     }

     try {
         System.out.println(key + " delete column:: " + key);
         client.remove(keyspace, key, colPathFdl, timestamp, ConsistencyLevel.QUORUM);
     } catch (Exception e) {
         System.out.println("Cannot delete: " + key);
         e.printStackTrace();
     }

     long stopTime = System.currentTimeMillis();
     long timeTaken = stopTime - startTime;
     System.err.println(Thread.currentThread().getName()
             + " " + key + " Last operation took " + timeTaken + "ms");
 }







Re: Error when compile pelops

2010-09-13 Thread Dan Washusen
I just downloaded the jar file in question and it seems fine...

 wget -O cassandra.jar "http://github.com/s7/mvnrepo/raw/master/org/apache/cassandra/cassandra/0.7.0-2010-09-12_19-23-07/cassandra-0.7.0-2010-09-12_19-23-07.jar"
 unzip -t cassandra.jar


Which version of Maven are you using?

On Tue, Sep 14, 2010 at 1:02 PM, Ying Tang ivytang0...@gmail.com wrote:

 I downloaded the Pelops source from github, then cd into the pelops folder and ran:
 mvn compile

 But the error occurs.

 [INFO] Compilation failure
 error: error reading
 /root/.m2/repository/org/apache/cassandra/cassandra/0.7.0-2010-09-12_19-23-07/cassandra-0.7.0-2010-09-12_19-23-07.jar;
 error in opening zip file

 Anyone met with the same problem with me ?

 How to solve it?

 --
 Best regards,

 Ivy Tang






Re: too many open files 0.7.0 beta1

2010-08-25 Thread Dan Washusen
Maybe you're seeing this:
https://issues.apache.org/jira/browse/CASSANDRA-1416

On Thu, Aug 26, 2010 at 2:05 PM, Aaron Morton aa...@thelastpickle.comwrote:

 Under 0.7.0 beta1 I am seeing Cassandra run out of file handles...

 Caused by: java.io.FileNotFoundException: /local1/junkbox/cassandra/data/
 junkbox.wetafx.co.nz/ObjectIndex-e-31-Index.db (Too many open files)
 at java.io.RandomAccessFile.open(Native Method)
 at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
 at java.io.RandomAccessFile.<init>(RandomAccessFile.java:98)
 at
 org.apache.cassandra.io.util.BufferedRandomAccessFile.<init>(BufferedRandomAccessFile.java:142)

 If I look at the file descriptors for the process I can see it already has
 1,958 open for the file

 sudo ls -l /proc/20862/fd | grep ObjectIndex-e-31-Data.db |  wc -l
 1958

 Out of a total of 2044.

 Other nodes in the cluster have a similar number of fd's - around 2k with
 the majority to one SSTable.

 I did not experience this under 0.6 so just checking if this sounds OK and
 I should just increase the number of handles or if it's a bug?

 Thanks
 Aaron






Re: Upgrading to Cassanda 0.7 Thrift Erlang

2010-07-31 Thread Dan Washusen
Slightly off topic but still related (java instead of erlang).  I just tried
using the latest trunk build available on Hudson (2010-07-31_12-31-29) and
I'm getting lock ups.

The same code (without the framed transport) was working with a build from
2010-07-07_13-32-16.

I'm connecting using the following:

 TSocket socket = new TSocket(node, port);

 transport = new TFramedTransport(socket);

 protocol = new TBinaryProtocol(transport);

 client = new Cassandra.Client(protocol);


 transport.open();


 // set the keyspace on the client and do get slice stuff



The locked up thread looks like:

 main prio=5 tid=101801000 nid=0x100501000 runnable [1004fe000]

   java.lang.Thread.State: RUNNABLE

at java.net.SocketInputStream.socketRead0(Native Method)

at java.net.SocketInputStream.read(SocketInputStream.java:129)

at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)

at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)

at java.io.BufferedInputStream.read(BufferedInputStream.java:317)

- locked 1093daa10 (a java.io.BufferedInputStream)

at
 org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)

at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)

at
 org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)

at
 org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)

at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)

at
 org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:369)

at
 org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:295)

at
 org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:202)

at
 org.apache.cassandra.thrift.Cassandra$Client.recv_get_slice(Cassandra.java:542)

at
 org.apache.cassandra.thrift.Cassandra$Client.get_slice(Cassandra.java:524)




On 28 July 2010 17:43, J T jt4websi...@googlemail.com wrote:

 Hi,

 That fixed the problem!

 I added the Framed option and like magic things have started working again.

 Example:

 thrift_client:start_link(localhost, 9160, cassandra_thrift, [ { framed,
 true } ] )

 JT.



 On Tue, Jul 27, 2010 at 10:04 PM, Jonathan Ellis jbel...@gmail.comwrote:

 trunk is using framed thrift connections by default now (was unframed)

 On Tue, Jul 27, 2010 at 11:33 AM, J T jt4websi...@googlemail.com wrote:
  Hi,
  I just tried upgrading a perfectly working Cassandra 0.6.3 to Cassandra
 0.7
  and am finding that even after re-generating the erlang thrift bindings
 that
  I am unable to perform any operation.
  I can get a connection but if I try to login or set the keyspace I get a
  report from the erlang bindings to say that the connection is closed.
  I then tried upgrading to a later version of thrift but still get the
 same
  error.
  e.g.
  (zotonic3...@127.0.0.1)1 thrift_client:start_link(localhost, 9160,
  cassandra_thrift).
  {ok,0.327.0}
  (zotonic3...@127.0.0.1)2 {ok,C}=thrift_client:start_link(localhost,
 9160,
  cassandra_thrift).
  {ok,0.358.0}
  (zotonic3...@127.0.0.1)3 thrift_client:call( C, set_keyspace, [
 Test
   ]).
  =ERROR REPORT 27-Jul-2010::03:48:08 ===
  ** Generic server 0.358.0 terminating
  ** Last message in was {call,set_keyspace,[Test]}
  ** When Server state == {state,cassandra_thrift,
   {protocol,thrift_binary_protocol,
{binary_protocol,
 
 {transport,thrift_buffered_transport,0.359.0},
 true,true}},
   0}
  ** Reason for termination ==
  ** {{case_clause,{error,closed}},
  [{thrift_client,read_result,3},
   {thrift_client,catch_function_exceptions,2},
   {thrift_client,handle_call,3},
   {gen_server,handle_msg,5},
   {proc_lib,init_p_do_apply,3}]}
  ** exception exit: {case_clause,{error,closed}}
   in function  thrift_client:read_result/3
   in call from thrift_client:catch_function_exceptions/2
   in call from thrift_client:handle_call/3
   in call from gen_server:handle_msg/5
   in call from proc_lib:init_p_do_apply/3
  The cassandra log seems to indicate that a connection has been made
  (although thats only apparent by a TRACE log message saying that a
 logout
  has been done).
  The cassandra-cli program is able to connect and function normally so I
 can
  only assume that there is a problem with the erlang bindings.
  Has anyone else had any success using 0.7 from Erlang ?
  JT.



 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of Riptano, the source for professional Cassandra support
 http://riptano.com





Re: Upgrading to Cassanda 0.7 Thrift Erlang

2010-07-31 Thread Dan Washusen
p.s. If I set thrift_framed_transport_size_in_mb to 0 and just use TSocket
instead of TFramedTransport everything works as expected...
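
For reference, the unframed setup is just the following (a sketch, mirroring
the framed connection code quoted below, with the TFramedTransport wrapper
dropped):

    TSocket socket = new TSocket(node, port);
    transport = socket; // no TFramedTransport wrapper
    protocol = new TBinaryProtocol(transport);
    client = new Cassandra.Client(protocol);

    transport.open();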

On 1 August 2010 12:16, Dan Washusen d...@reactive.org wrote:

 Slightly off topic but still related (java instead of erlang).  I just
 tried using the latest trunk build available on Hudson (2010-07-31_12-31-29)
 and I'm getting lock ups.

  The same code (without the framed transport) was working with a build from
  2010-07-07_13-32-16.

 I'm connecting using the following:

  TSocket socket = new TSocket(node, port);

 transport = new TFramedTransport(socket);

 protocol = new TBinaryProtocol(transport);

 client = new Cassandra.Client(protocol);


 transport.open();


 // set the keyspace on the client and do get slice stuff



 The locked up thread looks like:

 main prio=5 tid=101801000 nid=0x100501000 runnable [1004fe000]

java.lang.Thread.State: RUNNABLE

  at java.net.SocketInputStream.socketRead0(Native Method)

  at java.net.SocketInputStream.read(SocketInputStream.java:129)

  at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)

  at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)

  at java.io.BufferedInputStream.read(BufferedInputStream.java:317)

  - locked 1093daa10 (a java.io.BufferedInputStream)

  at
 org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)

  at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)

  at
 org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)

  at
 org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)

  at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)

  at
 org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:369)

  at
 org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:295)

  at
 org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:202)

  at
 org.apache.cassandra.thrift.Cassandra$Client.recv_get_slice(Cassandra.java:542)

  at
 org.apache.cassandra.thrift.Cassandra$Client.get_slice(Cassandra.java:524)




 On 28 July 2010 17:43, J T jt4websi...@googlemail.com wrote:

 Hi,

 That fixed the problem!

 I added the Framed option and like magic things have started working
 again.

 Example:

 thrift_client:start_link(localhost, 9160, cassandra_thrift, [ { framed,
 true } ] )

 JT.



 On Tue, Jul 27, 2010 at 10:04 PM, Jonathan Ellis jbel...@gmail.comwrote:

 trunk is using framed thrift connections by default now (was unframed)

 On Tue, Jul 27, 2010 at 11:33 AM, J T jt4websi...@googlemail.com
 wrote:
  Hi,
  I just tried upgrading a perfectly working Cassandra 0.6.3 to Cassandra
 0.7
  and am finding that even after re-generating the erlang thrift bindings
 that
  I am unable to perform any operation.
  I can get a connection but if I try to login or set the keyspace I get
 a
  report from the erlang bindings to say that the connection is closed.
  I then tried upgrading to a later version of thrift but still get the
 same
  error.
  e.g.
  (zotonic3...@127.0.0.1)1 thrift_client:start_link(localhost, 9160,
  cassandra_thrift).
  {ok,0.327.0}
  (zotonic3...@127.0.0.1)2 {ok,C}=thrift_client:start_link(localhost,
 9160,
  cassandra_thrift).
  {ok,0.358.0}
  (zotonic3...@127.0.0.1)3 thrift_client:call( C, set_keyspace, [
 Test
   ]).
  =ERROR REPORT 27-Jul-2010::03:48:08 ===
  ** Generic server 0.358.0 terminating
  ** Last message in was {call,set_keyspace,[Test]}
  ** When Server state == {state,cassandra_thrift,
   {protocol,thrift_binary_protocol,
{binary_protocol,
 
 {transport,thrift_buffered_transport,0.359.0},
 true,true}},
   0}
  ** Reason for termination ==
  ** {{case_clause,{error,closed}},
  [{thrift_client,read_result,3},
   {thrift_client,catch_function_exceptions,2},
   {thrift_client,handle_call,3},
   {gen_server,handle_msg,5},
   {proc_lib,init_p_do_apply,3}]}
  ** exception exit: {case_clause,{error,closed}}
   in function  thrift_client:read_result/3
   in call from thrift_client:catch_function_exceptions/2
   in call from thrift_client:handle_call/3
   in call from gen_server:handle_msg/5
   in call from proc_lib:init_p_do_apply/3
  The cassandra log seems to indicate that a connection has been made
  (although thats only apparent by a TRACE log message saying that a
 logout
  has been done).
  The cassandra-cli program is able to connect and function normally so I
 can
  only assume that there is a problem with the erlang bindings.
  Has anyone else had any success using 0.7 from Erlang ?
  JT.



 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of Riptano, the source for professional Cassandra support
 http://riptano.com






Re: Using Pelops with Cassandra 0.7.X

2010-07-13 Thread Dan Washusen
http://github.com/danwashusen/pelops/tree/cassandra-0.7.0

p.s. Pelops doesn't have any test coverage and my implicit tests (my app
integration tests) don't touch anywhere near all of the Pelops API.

p.p.s. I've made API-breaking changes to support the new 0.7.0 API and
Dominic (the original Pelops author) hasn't reviewed, commented on or even
looked at them yet...

On 14 July 2010 08:35, Ran Tavory ran...@gmail.com wrote:

 Hector doesn't have 0.7 support yet

 On Jul 14, 2010 1:34 AM, Peter Harrison cheetah...@gmail.com wrote:

 I know Cassandra 0.7 isn't released yet, but I was wondering if anyone
 has used Pelops with the latest builds of Cassandra? I'm having some
 issues, but I wanted to check whether somebody else is already working on
 a branch of Pelops to support Cassandra 0.7. I have downloaded and built
 the latest code from GitHub, trunk of Pelops, and this works with 0.6.3,
 but not Cassandra trunk. Is Pelops worth updating or should I use
 other client libraries for Java such as Hector?




Re: Pelops 'up and running' post question + WTF is a SuperColumn = really confused.

2010-07-02 Thread Dan Washusen
L1Tickets = { // column family
    userId: { // row key
        42C120DF-D44A-44E4-9BDC-2B5439A5C7B4: { "category": "videoPhone", "reportType": "POOR_PICTURE", ... },
        99B60047-382A-4237-82CE-AE53A74FB747: { "category": "somethingElse", "reportType": "FOO", ... }
    }
}

On 3 July 2010 02:29, S Ahmed sahmed1...@gmail.com wrote:


 https://ria101.wordpress.com/2010/06/11/pelops-the-beautiful-cassandra-database-client-for-java

  So using the code snippet below, I want to create a JSON representation of
  the CF (super).


  /**
   * Write multiple sub-column values to a super column...
   * @param rowKey     The key of the row to modify
   * @param colFamily  The name of the super column family to operate on
   * @param colName    The name of the super column
   * @param subColumns A list of the sub-columns to write
   */
  mutator.writeSubColumns(
      userId,
      "L1Tickets",
      UuidHelper.newTimeUuidBytes(), // using a UUID value that sorts by time
      mutator.newColumnList(
          mutator.newColumn("category", "videoPhone"),
          mutator.newColumn("reportType", "POOR_PICTURE"),
          mutator.newColumn("createdDate",
              NumberHelper.toBytes(System.currentTimeMillis())),
          mutator.newColumn("capture", jpegBytes),
          mutator.newColumn("comment")));


 Can someone show me what it would look like?

  This is what I have so far:

  SupportTickets = {
      userId: {
          "L1Tickets": { }
      }
  }


 But from what I understood, a CF of type super looks like (
 http://arin.me/blog/wtf-is-a-supercolumn-cassandra-data-model) :

  AddressBook = { // this is a ColumnFamily of type Super
      phatduckk: { // this is the key to this row inside the Super CF
          // the key here is the name of the owner of the address book

          // now we have an infinite # of super columns in this row
          // the keys inside the row are the names for the SuperColumns
          // each of these SuperColumns is an address book entry
          friend1: {street: "8th street", zip: "90210", city: "Beverley Hills", state: "CA"},

          // this is the address book entry for John in phatduckk's address book
          John: {street: "Howard street", zip: "94404", city: "FC", state: "CA"},
          Kim: {street: "X street", zip: "87876", city: "Balls", state: "VA"},
          Tod: {street: "Jerry street", zip: "54556", city: "Cartoon", state: "CO"},
          Bob: {street: "Q Blvd", zip: "24252", city: "Nowhere", state: "MN"},
          ...
          // we can have an infinite # of SuperColumns (aka address book entries)
      }, // end row
      ieure: { // this is the key to another row in the Super CF
          // all the address book entries for ieure
          joey: {street: "A ave", zip: "55485", city: "Hell", state: "NV"},
          William: {street: "Armpit Dr", zip: "93301", city: "Bakersfield", state: "CA"},
      },
  }

  The Pelops code snippet seems to add an additional inner layer compared to
  this, which has me confused!




Re: Pelops - a new Java client library paradigm

2010-06-12 Thread Dan Washusen
Very nice!

You mention that the connections are handled internally by Pelops; does that
mean that potentially a different connection is used for each operation
performed?

I had assumed using the same connection for several operations with
ConsistencyLevel.ONE would provide a basic level of atomicity.  For example,
using the same connection for all operations in a web request would allow
the request to read its own writes.  Is that assumption correct, and does
that impact your decision to handle the connections internally to Pelops?

Cheers,
Dan

On 13 June 2010 05:05, Ran Tavory ran...@gmail.com wrote:

 Nice going, Dominic, having a clear API for cassandra is a big step forward
 :)
  Interestingly, at Hector we came up with a similar approach, just didn't find
  the time to code it, as production systems keep me busy at nights as
  well... We started with the implementation of BatchMutation, but the rest of
  the API improvements are still TODO.
 Keep up the good work, competition keeps us healthy ;)


 On Fri, Jun 11, 2010 at 4:41 PM, Dominic Williams 
 thedwilli...@googlemail.com wrote:

 Pelops is a new high quality Java client library for Cassandra.

 It has a design that:
 * reveals the full power of Cassandra through an elegant Mutator and
 Selector paradigm
  * generates better, cleaner, less bug prone code
 * reduces the learning curve for new users
 * drives rapid application development
 * encapsulates advanced pooling algorithms

 An article introducing Pelops can be found at

 http://ria101.wordpress.com/2010/06/11/pelops-the-beautiful-cassandra-database-client-for-java/

 Thanks for reading.
 Best, Dominic





Re: Pelops - a new Java client library paradigm

2010-06-12 Thread Dan Washusen
Thanks for clarifying!

On 13 June 2010 09:03, Miguel Verde miguelitov...@gmail.com wrote:

 afaik, Cassandra does nothing to guarantee connection-level read your own
 writes consistency beyond its usual consistency levels.  See
 https://issues.apache.org/jira/browse/CASSANDRA-876 and the earlier  
 http://issues.apache.org/jira/browse/CASSANDRA-132

 On Jun 12, 2010, at 5:48 PM, Dan Washusen d...@reactive.org wrote:

 Very nice!

 You mention that the connections are handled internally by Pelops, does
 that mean that potentially a different connection is used for each operation
 performed?

 I had assumed using the same connection for several operations with
 ConsistencyLevel.ONE would provide a basic level of atomicity.  For
 example, using the same connection for all operations in a web request would
 allow the request to read it's own writes.  Is that assumption correct and
 does that impact on your decision to handle the connections internally to
 Pelops?

 Cheers,
 Dan

  On 13 June 2010 05:05, Ran Tavory ran...@gmail.com wrote:

 Nice going, Dominic, having a clear API for cassandra is a big step
 forward :)
 Interestingly, at hector we came up with similar approach, just didn't
 find the time for code that, as production systems keep me busy at nights as
 well... We started with the implementation of BatchMutation, but the rest of
 the API improvements are still TODO
 Keep up the good work, competition keeps us healthy ;)


 On Fri, Jun 11, 2010 at 4:41 PM, Dominic Williams 
  thedwilli...@googlemail.com wrote:

 Pelops is a new high quality Java client library for Cassandra.

 It has a design that:
 * reveals the full power of Cassandra through an elegant Mutator and
 Selector paradigm
  * generates better, cleaner, less bug prone code
 * reduces the learning curve for new users
 * drives rapid application development
 * encapsulates advanced pooling algorithms

 An article introducing Pelops can be found at

 http://ria101.wordpress.com/2010/06/11/pelops-the-beautiful-cassandra-database-client-for-java/

 Thanks for reading.
 Best, Dominic