Re: tmp files in /var/lib/cassandra/data

2011-12-14 Thread Ramesh Natarajan
Yep, so far it looks like a file descriptor leak. Not sure if GC or some other event like compaction would close these files. [root@CAP-VM-1 ~]# ls -al /proc/31134/fd | grep MSA | wc -l 540 [root@CAP-VM-1 ~]# ls -al /proc/31134/fd | grep MSA | wc -l 542 [root@CAP-VM-1 ~]# ls -al /proc/31134/
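
The repeated `ls -al /proc/<pid>/fd | grep MSA | wc -l` check above can be scripted so the count is easy to sample over time. This is only a minimal sketch, assuming a Linux-style directory of descriptor symlinks; the function name `count_open_fds` is made up for illustration and is not part of Cassandra or any tool mentioned in the thread.

```python
import os


def count_open_fds(fd_dir, needle):
    """Count entries in fd_dir whose symlink target contains needle.

    fd_dir is expected to look like /proc/<pid>/fd, where every entry
    is a symlink pointing at the file the descriptor refers to.
    """
    count = 0
    for name in os.listdir(fd_dir):
        path = os.path.join(fd_dir, name)
        try:
            target = os.readlink(path)
        except OSError:
            # The descriptor was closed between listdir() and readlink(),
            # or the entry is not a symlink at all; skip it.
            continue
        if needle in target:
            count += 1
    return count
```

Calling `count_open_fds("/proc/31134/fd", "MSA")` every few seconds and watching whether the number only ever grows would be one way to confirm a leak; the pid and the `MSA` pattern are simply the ones from the message above.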

Best way to implement indexing for high-cardinality values?

2011-12-14 Thread Maxim Potekhin
I now have a CF with extremely skinny rows (in the current implementation), and the application will want to query by more than one column value. Problem is that the values in a lot of cases will be high cardinality. One other factor is that I want to rotate data in and out of the system in one d

RE: tmp files in /var/lib/cassandra/data

2011-12-14 Thread Bryce Godfrey
I'm seeing this also, and my nodes have started crashing with "too many open files" errors. Running lsof I see lots of these open tmp files. java 8185 root 911u REG 8,32 38 129108266 /opt/cassandra/data/MonitoringData/Properties-tmp-hc-268721-CompressionI

tmp files in /var/lib/cassandra/data

2011-12-14 Thread Ramesh Natarajan
We are using leveled compaction running Cassandra 1.0.6. I checked the data directory (/var/lib/cassandra/data) and I see these 0-byte tmp files. What are these files? Thanks, Ramesh -rw-r--r-- 1 root root 0 Dec 14 17:15 uid-tmp-hc-106-Data.db -rw-r--r-- 1 root root 0 Dec 14 17:15
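
For anyone wanting to list these files without eyeballing `ls` output, here is a small sketch. It assumes the files of interest are the ones with `-tmp-` in their name and a size of zero, which matches the listing above; the helper name `find_empty_tmp_files` is hypothetical, and the script only reports files, it deletes nothing.

```python
import os


def find_empty_tmp_files(data_dir, marker="-tmp-"):
    """Return sorted paths of zero-byte files under data_dir whose name contains marker."""
    hits = []
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            if marker not in name:
                continue
            path = os.path.join(root, name)
            try:
                if os.path.getsize(path) == 0:
                    hits.append(path)
            except OSError:
                # The file vanished while we were walking (e.g. a compaction
                # finished and renamed it); ignore and move on.
                continue
    return sorted(hits)
```

Running it as `find_empty_tmp_files("/var/lib/cassandra/data")` would list the candidates; whether they are safe to remove is a separate question that this sketch does not answer.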

Re: Cassandra C client implementation

2011-12-14 Thread Mina Naguib
Hi Vlad I'm the author of libcassie. For what it's worth, it's in production where I work, consuming a heavily-used cassandra 0.7.9 cluster. We do have plans to upgrade the cluster to 1.x, to benefit from all the improvements, CQL, etc... but that includes revising all our clients (across se

Re: Crazy compactionstats

2011-12-14 Thread Peter Schuller
> Exception in thread "main" java.io.IOException: Repair command #1: some > repair session(s) failed (see log for details). For why repair failed you unfortunately need to look at the logs as it suggests. > I still see pending tasks in nodetool compactionstats, and their number goes > into hundreds wh

Re: Cassandra C client implementation

2011-12-14 Thread Vlad Paiu
Hello Eric, We have that, thanks a lot for the contribution. The idea is to not play around with including C++ code in a C app, if there's an alternative (the thrift g_libc). Unfortunately, since thrift does not generate a skeleton for the glibc code, I don't know how to find out what the API

Re: Cassandra C client implementation

2011-12-14 Thread Eric Tamme
On 12/14/2011 04:18 PM, Vlad Paiu wrote: Hi, Just tried libcassie and seems it's not compatible with latest cassandra, as even simple inserts and fetches fail with InvalidRequestException... So can anybody please provide a very simple example in C for connecting & fetching columns with thrift

Re: Cassandra C client implementation

2011-12-14 Thread Vlad Paiu
Hi, Just tried libcassie and seems it's not compatible with latest cassandra, as even simple inserts and fetches fail with InvalidRequestException... So can anybody please provide a very simple example in C for connecting & fetching columns with thrift? Regards, Vlad Vlad Paiu wrote: >Hell

Crazy compactionstats

2011-12-14 Thread Maxim Potekhin
Hello I ran repair like this: nohup repair.sh & where repair.sh contains simply nodetool repair plus timestamp. The process dies while dumping this: Exception in thread "main" java.io.IOException: Repair command #1: some repair session(s) failed (see log for details). at org.apache.c

Asymmetric load

2011-12-14 Thread Maxim Potekhin
What could be the reason I see unequal loads on a 3-node cluster? This all started happening during repairs (which again are not going smoothly). Maxim

[RELEASE] Apache Cassandra 0.8.9 released

2011-12-14 Thread Sylvain Lebresne
The Cassandra team is pleased to announce the release of Apache Cassandra version 0.8.9. Cassandra is a highly scalable second-generation distributed database, bringing together Dynamo's fully distributed design and Bigtable's ColumnFamily-based data model. You can read more here: http://cassand

Re: 1.0.3 CLI oddities

2011-12-14 Thread Janne Jalkanen
Correct. 1.0.6 fixes this for me. /Janne On 12 Dec 2011, at 02:57, Chris Burroughs wrote: > Sounds like https://issues.apache.org/jira/browse/CASSANDRA-3558 and the > other tickets referenced there. > > On 11/28/2011 05:05 AM, Janne Jalkanen wrote: >> Hi! >> >> (Asked this on IRC too, but didn

Re: Cassandra C client implementation

2011-12-14 Thread Vlad Paiu
Hello, Thanks very much for your suggestions. Libcassie seems nice but doesn't seem like it's actively maintained and I'm not sure if it's compatible with the latest Cassandra versions. Will give it a try though. I was looking through the generated thrift .c files and I can't seem to find what fun

Re: One ColumnFamily places data on only 3 out of 4 nodes

2011-12-14 Thread Mohit Anchlia
> bart@node1:~$ nodetool -h localhost getendpoints A UserDetails 4545027 > 192.168.81.5 > 192.168.81.2 > 192.168.81.3 Can you see what happens if you stop C* say on node .5 and write and read at quorum? On Wed, Dec 14, 2011 at 7:06 AM, Bart Swedrowski wrote: > > > On 14 December 2011 14:58, wro
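
The suggested experiment (stop one of the three listed replicas, then read and write at quorum) relies on simple arithmetic: a quorum is a majority of the replication factor, so with RF=3 two live replicas are enough. A tiny sketch of that reasoning follows; the function names are made up for illustration.

```python
def quorum(replication_factor):
    """Number of replicas that must respond for a QUORUM read or write."""
    return replication_factor // 2 + 1


def survives_at_quorum(replication_factor, replicas_down):
    """True if enough replicas remain alive to satisfy QUORUM."""
    return replication_factor - replicas_down >= quorum(replication_factor)
```

With RF=3, `quorum(3)` is 2, so taking node .5 down should leave quorum operations on key 4545027 working as long as .2 and .3 really hold the data; if they fail, that points at the placement problem being discussed.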

[RELEASE] Apache Cassandra 1.0.6 released

2011-12-14 Thread Sylvain Lebresne
The Cassandra team is pleased to announce the release of Apache Cassandra version 1.0.6. Cassandra is a highly scalable second-generation distributed database, bringing together Dynamo's fully distributed design and Bigtable's ColumnFamily-based data model. You can read more here: http://cassand

Re: Cassandra C client implementation

2011-12-14 Thread Jeremiah Jordan
If you are OK linking to a C++ based library you can look at: https://github.com/minaguib/libcassandra/tree/kickstart-libcassie-0.7/libcassie It is wrapper code around libcassandra which exports a C++ interface. If you look at the function names etc in the other languages, just use the similar fu

RE: Cassandra C client implementation

2011-12-14 Thread Don Smith
Virgil apparently lets you access Cassandra via a RESTful interface: http://code.google.com/a/apache-extras.org/p/virgil/ Depending on your performance needs and the maturity of Virgil's code (I think it's alpha), that may work. You could always fork a java process and pipe to it. Don

Re: Keys for deleted rows visible in CLI

2011-12-14 Thread Maxim Potekhin
Thanks, it makes perfect sense now. Well, an option in Cassandra could make it optional as far as display is concerned, w/o a performance hit -- of course this is all unimportant. Thanks again Maxim On 12/14/2011 11:30 AM, Brandon Williams wrote: http://wiki.apache.org/cassandra/FAQ#range_ghos

Re: commit log size

2011-12-14 Thread Maxim Potekhin
Alexandru, Jeremiah -- what setting needs to be tweaked, and what's the recommended value? I observed similar behavior this morning. Maxim On 11/28/2011 2:53 PM, Jeremiah Jordan wrote: Yes, the low volume memtables are causing the problem. Lower the thresholds for those tables if you don't

Re: Cassandra C client implementation

2011-12-14 Thread Vlad Paiu
Hello, Thanks for your answer. Unfortunately libcassandra is C++, I'm looking for something written in ANSI C. I've searched a lot and my guess is glibc thrift is my only option, but I could not find even one example of how to make a connection & some queries to Cassandra using glibc thrift.

Re: configurable bloom filters (like hbase)

2011-12-14 Thread Brandon Williams
https://issues.apache.org/jira/browse/CASSANDRA-3497 On Wed, Dec 14, 2011 at 4:52 AM, Radim Kolar wrote: > On 11.11.2011 7:55, Radim Kolar wrote: > >> I have a problem with a large CF (about 200 billion entries per node). While >> I can configure index_interval to lower memory requirements, I s

Re: Keys for deleted rows visible in CLI

2011-12-14 Thread Brandon Williams
http://wiki.apache.org/cassandra/FAQ#range_ghosts On Wed, Dec 14, 2011 at 4:36 AM, Radim Kolar wrote: > On 14.12.2011 1:15, Maxim Potekhin wrote: > >> Thanks. It could be hidden from a human operator, I suppose :) > > I agree. Open a JIRA for it.

Counters != Counts

2011-12-14 Thread Alain RODRIGUEZ
Hi everybody. I'm using a lot of counters to make statistics on a 4-node cluster (ec2 m1.small) with phpcassa (Cassandra v1.0.2). I store some events and increment counters at the same time. Counters give me over-counts compared with the count of the corresponding events. I'm sure that my non-

Re: Cassandra C client implementation

2011-12-14 Thread i
BTW please use https://github.com/eyealike/libcassandra Best Regards, Yi "Steve" Yang ~~~ +1-401-441-5086 +86-13910771510 Sent via BlackBerry® from China Mobile -Original Message- From: i...@iyyang.com Date: Wed, 14 Dec 2011 15:52:56 To: Reply-To: i...@iyyang.com Subject:

Re: Cassandra C client implementation

2011-12-14 Thread i
Try libcassandra, but it doesn't support connection pooling --Original Message-- From: Vlad Paiu To: user@cassandra.apache.org ReplyTo: user@cassandra.apache.org Subject: Cassandra C client implementation Sent: Dec 14, 2011 11:11 PM Hello, I am trying to integrate some Cassandra related

Cassandra C client implementation

2011-12-14 Thread Vlad Paiu
Hello, I am trying to integrate some Cassandra related ops (insert, get, etc.) into an application written entirely in C, so C++ is not an option. Is there any C client library for Cassandra? I have also tried to generate thrift glibc code for Cassandra, but on wiki.apache.org/cassandra/Th

Re: One ColumnFamily places data on only 3 out of 4 nodes

2011-12-14 Thread Bart Swedrowski
On 14 December 2011 14:58, wrote: > No idea, try to check logs for errors, and increase verbosity level on > that node. > No errors at all, a few warnings about HEAP size, that's it. Okay, thanks. Has anyone else got any ideas on how to push this forward?

Re: One ColumnFamily places data on only 3 out of 4 nodes

2011-12-14 Thread igor
No idea, try to check logs for errors, and increase verbosity level on that node. -Original Message- From: Bart Swedrowski To: user@cassandra.apache.org Sent: Wed, 14 Dec 2011 16:45 Subject: Re: One ColumnFamily places data on only 3 out of 4 nodes On 14 December 2011 13:02, wrote:

Re: One ColumnFamily places data on only 3 out of 4 nodes

2011-12-14 Thread Bart Swedrowski
On 14 December 2011 14:45, Bart Swedrowski wrote: > I have queried few, and to my surprise, 192.168.82.2 (node1) The IP is supposed to be 192.168.81.2

Re: One ColumnFamily places data on only 3 out of 4 nodes

2011-12-14 Thread Bart Swedrowski
On 14 December 2011 13:02, wrote: > Do you use randompartitiner? What nodetool getendpoints show for several > random keys > Yes, RandomPartitioner it is. Thanks for the hint re 'nodetool getendpoints'. I have queried a few, and to my surprise, 192.168.82.2 (node1) is showing up as an endpoint for few

Re: One ColumnFamily places data on only 3 out of 4 nodes

2011-12-14 Thread igor
Do you use RandomPartitioner? What does nodetool getendpoints show for several random keys? -Original Message- From: Bart Swedrowski To: user@cassandra.apache.org Sent: Wed, 14 Dec 2011 12:56 Subject: Re: One ColumnFamily places data on only 3 out of 4 nodes Anyone? On 12 December 2011 15:

Counters and Top 10

2011-12-14 Thread cbert...@libero.it
Hi all, I'm using Cassandra in production for a small social network (~10.000 people). Now I have to assign some "credits" to each user operation (login, write post and so on) and then be capable of providing at any moment the top 10 of the most active users. I'm on Cassandra 0.7.6. I'd like
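
One possible way to get a top 10 for a user base of this size is to rank on the client side once the per-user credit totals have been read. The sketch below assumes those totals are already in a plain dictionary; how they are stored in or fetched from Cassandra is deliberately left out, and the name `top_n` is made up for illustration.

```python
import heapq


def top_n(credits, n=10):
    """Return the n (user, credits) pairs with the highest credit totals.

    credits is assumed to be a dict mapping user id -> accumulated credits,
    already loaded from wherever the totals are kept.
    """
    # nlargest avoids sorting the whole dict when only a few entries are needed.
    return heapq.nlargest(n, credits.items(), key=lambda item: item[1])
```

For roughly ten thousand users a full scan like this is cheap, so the ranking itself is unlikely to be the bottleneck; keeping the totals accurate is the harder part of the question.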

Re: One ColumnFamily places data on only 3 out of 4 nodes

2011-12-14 Thread Bart Swedrowski
Anyone? On 12 December 2011 15:25, Bart Swedrowski wrote: > Hello everyone, > > I seem to have come across a rather weird (at least for me!) problem / > behaviour with Cassandra. > > I am running a 4-node cluster on Cassandra 0.8.7. For the keyspace in > question, I have RF=3, SimpleStrategy wit
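
To see why data landing on only three of four nodes is surprising with RF=3, it helps to sketch how ring placement is expected to work: the key is hashed to a token, the first replica is the node with the next token on the ring, and the remaining replicas are the following nodes clockwise. The code below is a simplified illustration under that assumption, not Cassandra's implementation; the node names, the evenly spaced tokens, and the helper names are all hypothetical.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 127  # token space assumed for this illustration


def token_for(key):
    """Map a key to a token via MD5 (a simplified stand-in for the real partitioner)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return abs(int.from_bytes(digest, "big", signed=True))


def endpoints(ring, key, rf):
    """Return the rf nodes that should hold key.

    ring is a list of (token, node) pairs sorted by token. The first replica
    is the node with the smallest token >= the key's token (wrapping around),
    and the others are the next nodes clockwise.
    """
    tokens = [token for token, _node in ring]
    start = bisect.bisect_left(tokens, token_for(key)) % len(ring)
    return [ring[(start + i) % len(ring)][1] for i in range(min(rf, len(ring)))]


# Hypothetical 4-node ring with evenly spaced tokens.
RING = [(i * RING_SIZE // 4, "node%d" % (i + 1)) for i in range(4)]
```

Under these assumptions every key maps to three distinct nodes, and over many keys each of the four nodes should end up holding replicas for about three quarters of them, which is why a ColumnFamily that never appears on one node deserves a closer look.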

Re: configurable bloom filters (like hbase)

2011-12-14 Thread Radim Kolar
On 11.11.2011 7:55, Radim Kolar wrote: I have a problem with a large CF (about 200 billion entries per node). While I can configure index_interval to lower memory requirements, I still have to stick with huge bloom filters. Ideal would be to have bloom filters configurable like in HBase. Ca

Re: Keys for deleted rows visible in CLI

2011-12-14 Thread Radim Kolar
On 14.12.2011 1:15, Maxim Potekhin wrote: Thanks. It could be hidden from a human operator, I suppose :) I agree. Open a JIRA for it.