Re: nodetool repair uses insane amount of disk space

2012-08-20 Thread Michael Morris
Thanks everyone, for the pointers. I've found an opportunity to simplify the setup, still 2 DCs and 3 rack setup (RF = 1 for DC with 1 rack, and RF = 2 for DC with 2 racks), but now each rack contains 9 nodes with even token distribution. Once I got the new topology in place, I ran multiple repai

Re: nodetool output through REST API?

2012-08-20 Thread Tyler Hobbs
Your best bet is probably to set up mx4j with Cassandra, which will expose a REST API for all of the JMX stuff. On Mon, Aug 20, 2012 at 2:46 PM, Yang wrote: > I'm trying to write a little python script to manage our cassandra cluster. > > it uses output from nodetool, for example to find the cur

Re: nodetool , localhost connection refused

2012-08-20 Thread A J
Yes, the telnet does not work. Don't know what it was but switching to 1.1.4 solved the issue. On Mon, Aug 20, 2012 at 6:17 PM, Hiller, Dean wrote: > My guess is "telnet localhost 7199" also fails? And if you are on linux > and run netstat -anp, you will see no one is listening on that port? > >

Re: nodetool , localhost connection refused

2012-08-20 Thread Hiller, Dean
My guess is "telnet localhost 7199" also fails? And if you are on Linux and run netstat -anp, you will see no one is listening on that port? So the database node did not start and bind to that port, and you would see an exception in that node's logs... just a guess. Dean On 8/20/12 4:10 PM,
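The check Dean describes (is anything listening on the JMX port?) can also be scripted. This is a minimal sketch, assuming a plain TCP connect is an acceptable stand-in for the telnet test; `is_port_open` is a hypothetical helper name, and 7199 is the port from the error message in this thread.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError)
        # when nothing is listening or the host is unreachable.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `is_port_open("localhost", 7199)` on the database node should return False while nodetool reports "Connection refused"; in that case the node's own log is the next place to look.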

nodetool , localhost connection refused

2012-08-20 Thread A J
I am running 1.1.3. Nodetool on the database node (just a single-node db) is giving the error: Failed to connect to 'localhost:7199': Connection refused. Any idea what could be causing this? Thanks.

nodetool output through REST API?

2012-08-20 Thread Yang
I'm trying to write a little python script to manage our cassandra cluster. it uses output from nodetool, for example to find the current token assignment, node status etc. I could do this by parsing output from "nodetool ring" command. but is there a more "native way" , for example through some

Re: Thrift batch_mutate erase previous data?

2012-08-20 Thread Cyril Auburtin
no right it's ok, it was a bug on my side 2012/8/11 Tyler Hobbs > > > On Thu, Aug 9, 2012 at 10:43 AM, Cyril Auburtin > wrote: > >> It seems the Thrift method *batch-mutate*, with Mutations, will not >> update the previous data with the mutation given, but clear and replace by >> it? right? >>

Re: Index build status

2012-08-20 Thread Jeremy Hanna
For an individual node, you can check the status of building indexes using nodetool compactionstats. And similarly, if you want to speed up building the indexes (and you have the extra IO) you can increase or unthrottle your compaction throughput temporarily - nodetool setcompactionthroughput 0 to

Re: CQL results are confusing me

2012-08-20 Thread Juan Ezquerro
EmailAddress is not indexed; you must declare an index on it before you can search by it. 2012/8/20 Peter Morris > Consider the following statements > > #1 New family is created so I have no data > create columnfamily Test (UserName varchar primary key, EmailAddress > varchar); > > #2 Count how many rows I

CQL results are confusing me

2012-08-20 Thread Peter Morris
Consider the following statements #1 New family is created so I have no data create columnfamily Test (UserName varchar primary key, EmailAddress varchar); #2 Count how many rows I have select count(1) from Test; -Expected: 0 -Actual: 0 #3 Select all users with a specific email address select *

How to add secondary index to existing column family with CLI?

2012-08-20 Thread Ryabin, Thomas
I want to add a secondary index to an existing column family, but am running into some trouble. I'm trying to use the Cassandra CLI to add the secondary index. The column family is called "books", the column I'm trying to index is called "title", the key validation class is UTF8Type, and the def

[RELEASE] Apache Cassandra 1.1.4 released

2012-08-20 Thread Eric Evans
The Cassandra team is pleased to announce the release of Apache Cassandra 1.1.4 This is a maintenance release; The list of changes[1] is quite small but practice safe upgrades, and always read the release notes[2]. If you encounter any problems, please let us know[3]. Downloads of source and bin

get_slice on wide rows

2012-08-20 Thread feedly team
I have a column family that I am using for consistency purposes. Basically a marker column is written to a row in this family before some actions take place and is deleted only after all the actions complete. The idea is that if something goes horribly wrong this table can be read to see what needs

Re: Why so slow?

2012-08-20 Thread Hiller, Dean
Be careful with bulk inserts, as Cassandra takes a bit longer to process them. In our performance testing it was faster, when multithreaded, not to send too many rows at a time, and if I remember correctly Aaron Morton might have told me that as well. Definitely use the cassandra bulk testing tool as well. I used that and compa

Re: Why so slow?

2012-08-20 Thread Carlos Carrasco
Are you inserting in bulk? Try to increase the amount of mutations you send in a single batch, otherwise you are just measuring the TCP roundtrip time. On 20 August 2012 17:36, Peter Morris wrote: > My misunderstanding, thanks for correcting me! > > > On Mon, Aug 20, 2012 at 4:32 PM, Hiller, Dea
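The batching advice above can be sketched as a small chunking helper. This is an illustration only: `chunked` is a hypothetical helper, and the call that actually sends a batch depends on the client library, which the thread does not name.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        # Final, possibly shorter, batch.
        yield batch
```

Each yielded list would be handed to one batch call, so the network round trip is paid once per batch rather than once per row. Dean's reply in this thread suggests keeping batches moderate, since very large ones were slower in his testing.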

Re: CQL logical operator: OR

2012-08-20 Thread Juan Ezquerro
Cassandra doesn't support disjunctions (OR) yet, so you'll have to do multiple queries. https://groups.google.com/forum/?fromgroups#!topic/phpcassa/Py42QgDHm3w%5B1-25%5D 2012/8/20 Peter Morris > select * from Users where UserName='me' or EmailAddress='m...@home.com'; > Bad Request: line 1:40 mi
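A sketch of the "multiple queries" workaround: run one query per OR branch and merge the rows client-side, dropping duplicates by key. `query_any` and its `run_query` parameter are hypothetical; `run_query` stands for whatever function the client library provides to execute one CQL statement and return rows as dicts.

```python
from typing import Callable, Dict, Iterable, List

Row = Dict[str, str]


def query_any(
    run_query: Callable[[str], Iterable[Row]],
    statements: Iterable[str],
    key: str,
) -> List[Row]:
    """Emulate OR: run each statement and merge rows, deduplicated by `key`."""
    merged: Dict[str, Row] = {}
    for statement in statements:
        for row in run_query(statement):
            # Keep the first copy of a row that matches several branches.
            merged.setdefault(row[key], row)
    return list(merged.values())
```

For the query in this thread, the two statements would be the `UserName='me'` select and the `EmailAddress=...` select, with `key="UserName"`. The second one needs an index on EmailAddress, as noted in the "CQL results are confusing me" thread.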

CQL logical operator: OR

2012-08-20 Thread Peter Morris
select * from Users where UserName='me' or EmailAddress='m...@home.com'; Bad Request: line 1:40 mismatched input 'or' expecting EOF Could someone tell me how to use OR conditions in CQL? I am able to find examples of AND, but none for OR and it doesn't seem to work.

Re: Why so slow?

2012-08-20 Thread Peter Morris
My misunderstanding, thanks for correcting me! On Mon, Aug 20, 2012 at 4:32 PM, Hiller, Dean wrote: > There is latency and throughput. These are two totally different things > even for MySQL. If you are single threaded, each request (even with MySql) > has to be delayed by 1ms or whatever you

Re: Why so slow?

2012-08-20 Thread Hiller, Dean
There is latency and throughput. These are two totally different things even for MySQL. If you are single threaded, each request (even with MySQL) has to be delayed by 1ms or whatever your ping time is. To fully utilize a 1Gbps bandwidth, you NEED to be multithreaded or you are wasting bandwid

Re: Why so slow?

2012-08-20 Thread Peter Morris
I'm assessing how quickly on average I can deal with a single request. I cannot believe that connecting through a 1Gbps network cable is 14 times slower. I think I get a higher insert rate for SQL Server. On Mon, Aug 20, 2012 at 1:20 PM, Hiller, Dean wrote: > IF one has 1ms delay per reques

RE: new node joins the cluster but can't drop schemas

2012-08-20 Thread mdione.ext
From: mdione@orange.com [mailto:mdione@orange.com] > We used to have a nice test cluster with 2 nodes and everything was > peachy. At some point we (re)added a third node, which seems to work > all right. But then we try to delete one CF and requery it and we get > this: Seems we've got

Re: Opscenter 2.1 vs 1.3

2012-08-20 Thread Nick Bailey
Robin, RF shouldn't affect the numbers on that graph at all. The only explanation for those differences that I can see is the increase in the number of writes OpsCenter itself is doing. Do you see the same jump in writes when viewing graphs just for your application's column families? -Nick On S

new node joins the cluster but can't drop schemas

2012-08-20 Thread mdione.ext
We used to have a nice test cluster with 2 nodes and everything was peachy. At some point we (re)added a third node, which seems to work all right. But then we try to delete one CF and requery it and we get this: root@pnscassandra03:~# cqlsh -3 [cqlsh 2.2.0 | Cassandra 1.1.2hebex1 | CQL spec 3

Re: What is the ideal server-side technology stack to use with Cassandra?

2012-08-20 Thread Alex Major
On Sun, Aug 19, 2012 at 11:04 PM, Tyler Hobbs wrote: > On Sun, Aug 19, 2012 at 3:55 AM, aaron morton wrote: > >> >> >> It is not a judgement on the quality of PHPCassa or PDO-cassandra, >> neither of which I have used. >> >> My comments were mostly informed by past issues with Thrift and PHP. >>

(new nosqlOrm linke) composite table with cassandra without using cql3?

2012-08-20 Thread Hiller, Dean
Sorry, project went through a rename and I didn't realize links changed… https://github.com/deanhiller/playorm/blob/master/input/javasrc/com/alvazan/orm/layer9z/spi/db/cassandra/CassandraSession.java NOTE: You can look for the trick we use to store all longs, ints, shorts as smallest possible by

Re: What is the ideal server-side technology stack to use with Cassandra?

2012-08-20 Thread Hiller, Dean
As far as opinions go, the stack we are using is Playframework 1.2.5 (the stateless nature rocks compared to other platforms like tomcat or servlet container stuff). playOrm Astyanax Later, Dean On 8/17/12 11:54 AM, "Aaron Turner" wrote: >My stack: > >Java + JRuby + Rails + Torquebox > >I'm us

Re: Why so slow?

2012-08-20 Thread Hiller, Dean
If one has 1ms delay per request and the other has .001, 1000 requests will be a one second delay tacked on (which is huge). This is why he suggested multi-threaded ;). Maybe there are some other factors as well. Dean From: Peter Morris <mrpmor...@gmail.com> Reply-To: "user@cassandra.ap
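The arithmetic behind this reply can be written out. This is a deliberately simple model, assuming each request costs exactly one round trip and ignoring server processing time and bandwidth limits; both function names are hypothetical.

```python
def sequential_seconds(requests: int, latency_s: float) -> float:
    """Total wait when requests are sent one at a time."""
    return requests * latency_s


def max_requests_per_second(latency_s: float, concurrency: int = 1) -> float:
    """Upper bound on request rate with `concurrency` requests in flight."""
    return concurrency / latency_s
```

With 1 ms per request, `sequential_seconds(1000, 0.001)` is about one second, and a single thread cannot exceed roughly 1000 requests per second. Under this model, adding threads raises that ceiling in proportion, which is the point of the multithreading suggestion.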

Re: ColumnFamilies.ReadCount

2012-08-20 Thread Rene Kochen
Okay, thanks for the info! I was just trying to understand what I saw. 2012/8/20 Tyler Hobbs : > > > On Sun, Aug 19, 2012 at 6:27 AM, Rene Kochen > wrote: >> >> >> Why does it not increase when servicing a range operation? > > > It doesn't because, basically, it wasn't designed to. Range queries

Re: Why so slow?

2012-08-20 Thread Peter Morris
I've set NoDelay = true on the socket, and although it is much better it is still only giving me 500 record inserts per second over a 1Gbps crossover cable - (I now also get 200 record inserts per second over wireless.) I would expect the crossover cable to have much better performance than this. Any

AUTO: Ken Robbins is out of the office (returning 08/22/2012)

2012-08-20 Thread Ken Robbins
I am out of the office until 08/22/2012. I will be out of the office with no access to email on Monday and Tuesday (8/20, 8/21). For urgent issues, please call or text 781-856-0078. Note: This is an automated response to your message "Cassandra with large number of columns per row" sent on

Re: What is the ideal server-side technology stack to use with Cassandra?

2012-08-20 Thread Andy Ballingall TF
On Aug 19, 2012 9:55 AM, "aaron morton" wrote: > > > Aaron Morton (aa...@thelastpickle.com) advised: > > > > "If possible i would avoid using PHP. The PHP story with cassandra has > > not been great in the past. There is little love for it, so it takes a > > while for work changes to get in the cl

Re: Why so slow?

2012-08-20 Thread Peter Morris
Thanks, I shall get onto the developer of the library :) On Sun, Aug 19, 2012 at 10:13 PM, Peter Schuller < peter.schul...@infidyne.com> wrote: > You're almost certainly using a client that doesn't set TCP_NODELAY on > the thrift TCP socket. The nagle algorithm is enabled, leading to 200 > ms la
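For reference, this is what disabling the Nagle algorithm looks like on a plain TCP socket. It is a sketch at the socket level only; a Thrift client library wraps its own socket, so where this option is set depends on that library, and `connect_nodelay` is a hypothetical helper name.

```python
import socket


def connect_nodelay(host: str, port: int, timeout: float = 5.0) -> socket.socket:
    """Open a TCP connection with the Nagle algorithm disabled."""
    sock = socket.create_connection((host, port), timeout=timeout)
    # TCP_NODELAY=1 sends small writes immediately instead of coalescing them.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock
```

The setting can be confirmed on an open connection with `sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)`, which returns a non-zero value when the option is enabled.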

Re: Cassandra with large number of columns per row

2012-08-20 Thread Chuan-Heng Hsiao
I think the limit of the size per row in cassandra is 2G? 10000 x 1M = 10G. Hsiao On Mon, Aug 20, 2012 at 1:07 PM, oupfevph wrote: > I setup cassandra with default configuration in clean AWS instance, and I > insert 10000 columns into a row, each column has a 1MB data. I use this > ruby(versio