Re: client API

2011-06-17 Thread Jonathan Ellis
Cassandra also uses a bunch of classes that are new in JDK6. JDK5 is end-of-lifed; time to let it rest in peace. On Thu, Jun 16, 2011 at 10:41 PM, aaron morton aa...@thelastpickle.com wrote: The Thrift Java compiler creates code that is not compliant with Java 5.

Re: Docs: Token Selection

2011-06-17 Thread Jonathan Ellis
Replication location is determined by the row key, not the location of the client that inserted it. (Otherwise, without knowing what DC a row was inserted in, you couldn't look it up to read it!) On Fri, Jun 17, 2011 at 12:20 AM, AJ a...@dude.podzone.net wrote: On 6/16/2011 9:45 PM, aaron

Re: jsvc hangs shell

2011-06-17 Thread Jonathan Colby
jsvc is not very flexible. Check out the wrapper software; we swear by it. http://wrapper.tanukisoftware.com/doc/english/download.jsp On Jun 17, 2011, at 2:52 AM, Ken Brumer wrote: Anton Belyaev anton.belyaev at gmail.com writes: I guess it is not trivial to modify the package to make

cassandra crash

2011-06-17 Thread Donna Li
All: Can you find any exception in the last sentence? Would Cassandra crash when memory is not enough? There are some other applications running alongside Cassandra, and they may use a lot of memory. From: Donna Li Sent: June 17, 2011 9:58 To:

Re: Easy way to overload a single node on purpose?

2011-06-17 Thread aaron morton
The short answer to the problem you saw is: monitor the disk space. Also monitor client-side logs for errors. Running out of commit log space does not stop the node from doing reads, so it can still be considered up. One node's view of its own UP-ness is not as important as the other nodes' (or

Re: cassandra crash

2011-06-17 Thread aaron morton
What do you mean by crash? If there was some sort of error in Cassandra (including Java running out of heap space) it will appear in the logs. Are there any error messages in the log? If there was some sort of JVM error it will be output to std error and probably end up on std out /

Cassandra.yaml

2011-06-17 Thread Vivek Mishra
I have a query: I have my Cassandra server running on my local machine and it has loaded Cassandra-specific settings from apache-cassandra-0.8.0-src/apache-cassandra-0.8.0-src/conf/cassandra.yaml. Now, if I am writing a Java program to connect to this server, why do I need to provide a new

Re: cassandra crash

2011-06-17 Thread Sasha Dolgy
What type of environment? We had issues with our cluster on 0.7.6-2 ... The messages you see and highlighted, from what I recall, aren't bad ... they are good. Investigating our crash, it turned out that the OS killed our Cassandra process, and this was found in /var/log/messages. Since then, I

Re: Cassandra.yaml

2011-06-17 Thread Sasha Dolgy
Hi Vivek, When I write client code in Java, using Hector, I don't specify a cassandra.yaml ... I specify the host(s) and keyspace I want to connect to. Alternately, I specify the host(s) and create the keyspace if the one I would like to use doesn't exist (new cluster for example). At no point

MemoryMeter uninitialized (jamm not specified as java agent)

2011-06-17 Thread Rene Kochen
Since using Cassandra 0.8, I see the following warning: WARN 12:05:59,807 MemoryMeter uninitialized (jamm not specified as java agent); assuming liveRatio of 10.0. Usually this means cassandra-env.sh disabled jamm because you are using a buggy JRE; upgrade to the Sun JRE instead. I'm using
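That warning goes away when the JVM is started with jamm registered as a java agent, which cassandra-env.sh normally does in 0.8. A sketch of the relevant line (the jamm jar version and paths are assumptions and may differ in your distribution):

```shell
# in conf/cassandra-env.sh: register jamm as a java agent so MemoryMeter
# can measure live memtable sizes instead of assuming a liveRatio of 10.0
JVM_OPTS="$JVM_OPTS -javaagent:$CASSANDRA_HOME/lib/jamm-0.2.2.jar"
```

If cassandra-env.sh detected a buggy JRE it comments this line out, which produces exactly the warning above.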

Pruning commit logs manually

2011-06-17 Thread Marcus Bointon
My commit logs sometimes eat too much disk space. I see that the oldest is about a day old, so it's clearly pruning already, but is there some way I can clear them out manually without breaking stuff, assuming that all the transactions they describe have been completed? Marcus smime.p7s

Re: Pruning commit logs manually

2011-06-17 Thread Peter Schuller
My commit logs sometimes eat too much disk space. I see that the oldest is about a day old, so it's clearly pruning already, but is there some way I can clear them out manually without breaking stuff, assuming that all the transactions they describe have been completed? Don't manually

Re: SSTable corruption blocking compaction and scrub can't fix it

2011-06-17 Thread Sylvain Lebresne
Scrub apparently dies because it cannot acquire a file descriptor. Scrub does not correctly close files (https://issues.apache.org/jira/browse/CASSANDRA-2669), so that may be part of why that happens. However, a simple fix is probably to raise the file descriptor limit. -- Sylvain On Fri,

RE: Querying superColumn

2011-06-17 Thread Vivek Mishra
Correct. But that will not solve the issue of data colocation (data locality)? From: Sasha Dolgy [mailto:sdo...@gmail.com] Sent: Thursday, June 16, 2011 8:47 PM To: user@cassandra.apache.org Subject: Re: Querying superColumn Have 1 row with employee info for country/office/division, each column an

Re: Querying superColumn

2011-06-17 Thread Sasha Dolgy
Write two records ... 1. [department1] = { Vivek : India } 2. [India] = { Vivek : department1 } 1. [department1] = { Vivs : USA } 2. [USA] = { Vivs : department1 } Now you can query a single row to display all employees in USA or all employees in department1 ... employee moves to a new
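The two-writes pattern above can be sketched with plain dictionaries standing in for rows (the `rows`/`assign` names are illustrative only, not a Cassandra API; a real client would issue two inserts per assignment):

```python
# Denormalization sketch: each employee assignment is written under two
# row keys, so both lookups below are single-row reads.
rows = {}

def assign(employee, department, country):
    # row keyed by department: column name = employee, value = country
    rows.setdefault(department, {})[employee] = country
    # row keyed by country: column name = employee, value = department
    rows.setdefault(country, {})[employee] = department

assign("Vivek", "department1", "India")
assign("Vivs", "department1", "USA")

print(rows["USA"])          # all employees in the USA
print(rows["department1"])  # all employees in department1
```

Moving an employee means rewriting both rows, which is the cost of trading extra writes for cheap reads.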

Re: SSTable corruption blocking compaction and scrub can't fix it

2011-06-17 Thread Dominic Williams
As far as scrub goes, that could be it. I'm already running unlimited file handles though, so ulimit is not the answer, unfortunately. Dominic On 17 June 2011 12:12, Sylvain Lebresne sylv...@datastax.com wrote: Scrub apparently dies because it cannot acquire a file descriptor. Scrub does not correctly

Re: SSTable corruption blocking compaction and scrub can't fix it

2011-06-17 Thread Sylvain Lebresne
On Fri, Jun 17, 2011 at 1:51 PM, Dominic Williams dwilli...@system7.co.uk wrote: As far as scrub goes that could be it. I'm already running unlimited file handles though so ulimit not answer unfortunately Are you sure? How many file descriptors are open on the system when you get that scrub

Re: Docs: Token Selection

2011-06-17 Thread William Oberman
I haven't done it yet, but when I researched how to make geo-diverse/failover DCs, I figured I'd have to do something like RF=6, strategy = {DC1=3, DC2=3}, and LOCAL_QUORUM for reads/writes. This gives you an ack after 2 local nodes do the read/write, but the data eventually gets distributed to
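The placement William describes can be approximated in a few lines. This is a simplified sketch of NetworkTopologyStrategy that ignores rack awareness; the ring layout, node names, and DC names are made up for illustration:

```python
from bisect import bisect_right

def nts_replicas(ring, key_token, rf_per_dc):
    """Walk the ring clockwise from the key's token and take nodes until
    each data center holds its configured replica count. (The real
    strategy additionally spreads replicas across racks.)"""
    tokens = sorted(ring)
    start = bisect_right(tokens, key_token) % len(tokens)
    needed = dict(rf_per_dc)
    replicas = []
    for i in range(len(tokens)):
        node, dc = ring[tokens[(start + i) % len(tokens)]]
        if needed.get(dc, 0) > 0:
            replicas.append(node)
            needed[dc] -= 1
    return replicas

# Four nodes, tokens alternating between two data centers.
ring = {0: ("dc1-a", "DC1"), 25: ("dc2-a", "DC2"),
        50: ("dc1-b", "DC1"), 75: ("dc2-b", "DC2")}
print(nts_replicas(ring, 30, {"DC1": 1, "DC2": 1}))  # ['dc1-b', 'dc2-b']
```

Because placement depends only on the key's token and the fixed strategy options, any node can recompute the replica set at read time, which is why replication cannot depend on where the client wrote from.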

Re: getFieldValue()

2011-06-17 Thread Jonathan Ellis
1. the right way to write that is to just say struct.name, struct.value, etc. 2. why are you writing raw Thrift instead of using Hector? On Fri, Jun 17, 2011 at 5:03 AM, Vivek Mishra vivek.mis...@impetus.co.in wrote: From: Vivek Mishra Sent: Friday, June 17, 2011 3:25 PM To:

Re: getFieldValue()

2011-06-17 Thread Markus Wiesenbacher | Codefreun.de
One question regarding point 2: why should we always use Hector? Thrift is not that bad. Sent from my iPhone. On 17.06.2011 at 17:12, Jonathan Ellis jbel...@gmail.com wrote: 1. the right way to right that is to just say struct.name, struct.value, etc 2. why are you writing raw thrift

Re: getFieldValue()

2011-06-17 Thread Sasha Dolgy
A good example, as I understand it, of why to use Hector / pycassa / etc. is this: if you wanted to implement connection pooling, you would have to craft your own solution, versus implementing the solution that is tested and ready to go, provided by Hector. Thrift doesn't provide native connection pooling

Re: getFieldValue()

2011-06-17 Thread Jonathan Ellis
If you don't get frustrated writing Thrift by hand you are a far, far more patient man than I am. It's tedious and error-prone to boot. On Fri, Jun 17, 2011 at 10:30 AM, Markus Wiesenbacher | Codefreun.de m...@codefreun.de wrote: One question regarding point 2: Why should we always use Hector,

Re: Docs: Token Selection

2011-06-17 Thread AJ
Thanks Jonathan. I assumed since each data center owned the full key space that the first replica would be stored in the dc of the coordinating node, the 2nd in another dc, and the 3rd+ back in the 1st dc. But, are you saying that the first endpoint is selected regardless of the location of

Re: getFieldValue()

2011-06-17 Thread Markus Wiesenbacher | Codefreun.de
I see ;) Sent from my iPhone. On 17.06.2011 at 17:55, Jonathan Ellis jbel...@gmail.com wrote: If you don't get frustrated writing Thrift by hand you are a far, far more patient man than I am. It's tedious and error-prone to boot. On Fri, Jun 17, 2011 at 10:30 AM, Markus

Re: Docs: Token Selection

2011-06-17 Thread AJ
On 6/17/2011 7:26 AM, William Oberman wrote: I haven't done it yet, but when I researched how to make geo-diverse/failover DCs, I figured I'd have to do something like RF=6, strategy = {DC1=3, DC2=3}, and LOCAL_QUORUM for reads/writes. This gives you an ack after 2 local nodes do the

Re: Docs: Token Selection

2011-06-17 Thread Eric tamme
On Fri, Jun 17, 2011 at 12:07 PM, AJ a...@dude.podzone.net wrote: Thanks Jonathan.  I assumed since each data center owned the full key space that the first replica would be stored in the dc of the coordinating node, the 2nd in another dc, and the 3rd+ back in the 1st dc.  But, are you saying

Re: Docs: Token Selection

2011-06-17 Thread Eric tamme
What I don't like about NTS is I would have to have more replicas than I need.  {DC1=2, DC2=2}, RF=4 would be the minimum.  If I felt that 2 local replicas was insufficient, I'd have to move up to RF=6 which seems like a waste... I'm predicting data in the TB range so I'm trying to keep

Re: Docs: Token Selection

2011-06-17 Thread Sasha Dolgy
+1 for this if it is possible... On Fri, Jun 17, 2011 at 6:31 PM, Eric tamme eta...@gmail.com wrote: What I don't like about NTS is I would have to have more replicas than I need.  {DC1=2, DC2=2}, RF=4 would be the minimum.  If I felt that 2 local replicas was insufficient, I'd have to move up

RE: Docs: Token Selection

2011-06-17 Thread Jeremiah Jordan
Run two Cassandra clusters... -Original Message- From: Eric tamme [mailto:eta...@gmail.com] Sent: Friday, June 17, 2011 11:31 AM To: user@cassandra.apache.org Subject: Re: Docs: Token Selection What I don't like about NTS is I would have to have more replicas than I need.  {DC1=2,

Re: Docs: Token Selection

2011-06-17 Thread AJ
+1 Yes, that is what I'm talking about, Eric. Maybe I could write my own strategy, I dunno. I'll have to understand more first. On 6/17/2011 10:37 AM, Sasha Dolgy wrote: +1 for this if it is possible... On Fri, Jun 17, 2011 at 6:31 PM, Eric tamme eta...@gmail.com wrote: What I don't like

Re : last record rowId

2011-06-17 Thread karim abbouh
Is there any way to remember the keys (rowId) inserted into the Cassandra database? B.R. From: Jonathan Ellis jbel...@gmail.com To: user@cassandra.apache.org Cc: karim abbouh karim_...@yahoo.fr Sent: Wednesday, 15 June 2011 18:05 Subject: Re: last record rowId

urgent how to specify multiple hosts in cassandra

2011-06-17 Thread Anurag Gujral
Hi All, I specified multiple hosts in the seeds field when using cassandra-0.8, like this: seeds: 192.168.1.115,192.168.1.110,192.168.1.113 But I am getting this error: while parsing a block mapping in reader, line 106, column 13: - seeds: 192.168.1.115,192.168. ...

Re: urgent how to specify multiple hosts in cassandra

2011-06-17 Thread Sasha Dolgy
Have them all within one quoted string, not multiple values, for example: seeds: "192.168.1.115, 192.168.1.110" versus what you have... On Fri, Jun 17, 2011 at 7:00 PM, Anurag Gujral anurag.guj...@gmail.com wrote: Hi All           I specified multiple hosts in seeds field when using cassandra-0.8 like this
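In the 0.8 cassandra.yaml the seed list lives under seed_provider, and the whole comma-separated list has to be one quoted string. A sketch, following the layout of the default 0.8 config (indentation and class_name as shipped):

```yaml
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          # one quoted string, not three separate YAML values
          - seeds: "192.168.1.115,192.168.1.110,192.168.1.113"
```

An unquoted comma-separated list is invalid YAML at that position, which is what produces the "while parsing a block mapping" error above.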

Re: SSTable corruption blocking compaction and scrub can't fix it

2011-06-17 Thread Ryan King
Even without lsof, you should be able to get the data from /proc/$pid -ryan On Fri, Jun 17, 2011 at 5:08 AM, Dominic Williams dwilli...@system7.co.uk wrote: Unfortunately I shutdown that node and anyway lsof wasn't installed. But $ulimit gives unlimited On 17 June 2011 13:00, Sylvain
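A sketch of what Ryan suggests, for Linux only; the pgrep pattern used to find the Cassandra pid is an assumption about how the process appears in the process table:

```shell
# Count open file descriptors for a process via /proc, without lsof.
# For Cassandra you would use something like: PID=$(pgrep -f CassandraDaemon)
# Here we demonstrate on the current shell's own pid.
PID=$$
ls "/proc/$PID/fd" | wc -l
```

Each entry in /proc/$PID/fd is a symlink to the open file, so `ls -l` on that directory also shows which files are held open.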

Advice on configuring a Brisk cluster across regions in Amazon

2011-06-17 Thread Sameer Farooqui
Hi, I'd like to learn how to set up a Brisk cluster with HA/DR in Amazon. Last time I tried this a few months ago, it was tricky because we had to either set up a VPN or hack the Cassandra source to get internode communications to work across regions. But with v 0.8's new BriskSnitch or

Re: Docs: Token Selection

2011-06-17 Thread AJ
Hi Jeremiah, can you give more details? Thanks On 6/17/2011 10:49 AM, Jeremiah Jordan wrote: Run two Cassandra clusters... -Original Message- From: Eric tamme [mailto:eta...@gmail.com] Sent: Friday, June 17, 2011 11:31 AM To: user@cassandra.apache.org Subject: Re: Docs: Token

Re: SSTable corruption blocking compaction and scrub can't fix it

2011-06-17 Thread Dominic Williams
Yeah, that would get the count (although I don't think you can see filenames - or maybe I just don't know how). Unfortunately, that node was shut down. I then tried restarting with storage port 7001 to isolate it, as it was quite toxic for the performance of the cluster, but it now gets OOM on restart. If it's

RE: Docs: Token Selection

2011-06-17 Thread Jeremiah Jordan
Run two clusters, one which has {DC1:2, DC2:1} and one which is {DC1:1,DC2:2}. You can't have both in the same cluster, otherwise it isn't possible to tell where the data got written when you want to read it. For a given key XYZ you must be able to compute which nodes it is stored on

Re: Docs: Token Selection

2011-06-17 Thread Eric tamme
Yes.  But, the more I think about it, the more I see issues.  Here is what I envision (Issues marked with *): Three or more dc's, each serving as fail-overs for the others with 1 maximum unavailable dc supported at a time. Each dc is a production dc serving users that I choose. Each dc also

Re: Docs: Token Selection

2011-06-17 Thread AJ
On 6/17/2011 12:33 PM, Eric tamme wrote: As I said previously, trying to make Cassandra treat things differently based on some kind of persistent locality set it maintains in memory .. or whatever .. sounds like you will be absolutely undermining the core principles of how Cassandra

Re: Docs: Token Selection

2011-06-17 Thread AJ
On 6/17/2011 12:32 PM, Jeremiah Jordan wrote: Run two clusters, one which has {DC1:2, DC2:1} and one which is {DC1:1,DC2:2}. You can't have both in the same cluster, otherwise it isn't possible to tell where the data got written when you want to read it. For a given key XYZ you must be

Re: Docs: Token Selection

2011-06-17 Thread Sasha Dolgy
Replication factor is defined per keyspace, if I'm not mistaken. Can't remember if NTS is per keyspace or per cluster ... if it's per keyspace, that would be a way around it ... without having to maintain multiple clusters, just have multiple keyspaces ... On Fri, Jun 17, 2011 at 9:23 PM, AJ

Re: Docs: Token Selection

2011-06-17 Thread AJ
On 6/17/2011 1:27 PM, Sasha Dolgy wrote: Replication factor is defined per keyspace if i'm not mistaken. Can't remember if NTS is per keyspace or per cluster ... if it's per keyspace, that would be a way around it ... without having to maintain multiple clusters just have multiple

Cassandra Clients for Java

2011-06-17 Thread Daniel Colchete
Good day everyone! I'm getting started with a new project and I'm thinking about using Cassandra because of its distributed quality and because of its performance. I'm using Java on the back-end. There are many many things being said about the Java high level clients for Cassandra on the web. To

Re: Cassandra Clients for Java

2011-06-17 Thread Jeffrey Kesselman
I'm using Hector. AFAIK it's the only one that supports failover today. On Fri, Jun 17, 2011 at 6:02 PM, Daniel Colchete d...@cloud3.tc wrote: Good day everyone! I'm getting started with a new project and I'm thinking about using Cassandra because of its distributed quality and because of its

Re: Cassandra Clients for Java

2011-06-17 Thread Dan Retzlaff
My team prefers Pelops. https://github.com/s7/scale7-pelops It's had failover since 0.7. http://groups.google.com/group/scale7/browse_thread/thread/19d441b7cd000de0/624257fe4f94a037 With respect to avoiding writing marshaling code yourself, I agree with the OP that that is rather lacking with

Re: Cassandra Clients for Java

2011-06-17 Thread Dan Washusen
I've added some comments/questions inline... Cheers, -- Dan Washusen On Saturday, 18 June 2011 at 8:02 AM, Daniel Colchete wrote: Good day everyone! I'm getting started with a new project and I'm thinking about using Cassandra because of its distributed quality and because of its