Re: column family names

2010-08-30 Thread Aaron Morton
Moving to the user list. The new restrictions were added as part of CASSANDRA-1377 for 0.6.5 and 0.7, AFAIK it's to ensure the file names created for the CFs can be correctly parsed. So it's probably not going to change. The names have to match the \w reg ex class, which includes the
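A minimal sketch of the check described above, assuming the rule is simply "every character of the name is in the \w class" and that \w is ASCII-only (letters, digits, underscore), as it is by default in Java regular expressions. The helper name is_valid_cf_name is hypothetical and is not Cassandra's actual validation code.

```python
import re

# Assumption: a valid name is one or more \w characters, ASCII only.
_CF_NAME = re.compile(r"\w+", re.ASCII)


def is_valid_cf_name(name: str) -> bool:
    """Return True if every character of `name` is a letter, digit or underscore."""
    return _CF_NAME.fullmatch(name) is not None
```

Under these assumptions, "user_events" passes while "user-events" and the empty string do not.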

Re: column family names

2010-08-30 Thread Terje Marthinussen
Ah, sorry, I forgot that underscore was part of \w. That will do the trick for now. I do not see the big issue with file names though. Why not expand the allowed characters a bit and escape the file names? Maybe some sort of URL like escaping. Terje On Mon, Aug 30, 2010 at 6:29 PM, Aaron Morton
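The escaping idea suggested above could look roughly like the following sketch. It is only an illustration of the proposal, not anything Cassandra does: the function names are hypothetical, and the choice to escape every byte outside letters, digits and underscore (so the result itself matches \w plus "%") is an assumption.

```python
from urllib.parse import unquote


def escape_name(name: str) -> str:
    """Percent-escape every UTF-8 byte that is not an ASCII letter, digit or underscore."""
    out = []
    for byte in name.encode("utf-8"):
        ch = chr(byte)
        if byte < 128 and (ch.isalnum() or ch == "_"):
            out.append(ch)
        else:
            # "%" itself is escaped too, so the mapping stays reversible.
            out.append("%%%02X" % byte)
    return "".join(out)


def unescape_name(escaped: str) -> str:
    """Reverse escape_name by decoding the %XX sequences as UTF-8."""
    return unquote(escaped)
```

Because "%" is always escaped, any name survives a round trip through escape_name and unescape_name.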

Re: Thrift + PHP: help!

2010-08-30 Thread Mike Peters
Juho, do you mind sharing your implementation with the group? We'd love to help as well with rewriting the thrift interface, specifically TSocket.php, which seems to be where the majority of the problems are lurking. Has anyone tried compiling native thrift support as described here

Re: cassandra for a inbox search with high reading qps

2010-08-30 Thread Mike Peters
Chen, have you considered using Lucandra (http://www.slideshare.net/otisg/lucandra) for Inbox search? We have a similar setup and are currently looking into using Lucandra over implementing the searching ourselves with pure Cassandra.

Re: TException: Error: TSocket: timed out reading 1024 bytes from 10.1.1.27:9160

2010-08-30 Thread Mike Peters
Hi guys, There are several patches you need to apply to Thrift to completely resolve all timeout errors. Here's a list of them along with a link to download a patched thrift library: http://www.softwareprojects.com/resources/programming/t-php-thrift-library-for-cassandra-1982.html

Re: RowMutationVerbHandler.java (line 78) Error in row mutation

2010-08-30 Thread Gary Dusbabek
Is it possible this was a new node with a manual token and autobootstrap turned off? If not, could you give more details about the node? Gary. On Fri, Aug 27, 2010 at 17:58, B. Todd Burruss bburr...@real.com wrote: i got the latest code this morning.  i'm testing with 0.7 ERROR

get_slice sometimes returns previous result on php

2010-08-30 Thread Juho Mäkinen
I've run into a strange bug where get_slice returns the result from the previous query. My application iterates over a set of columns inside a supercolumn, and for some reason (quite rarely, but often enough that it shows up) the results get shifted around so that the application gets the

Re: Calls block when using Thrift API

2010-08-30 Thread Gary Dusbabek
If you're only interested in accessing data natively, I suggest you try the fat client. It brings up a node that participates in gossip, exposes the StorageProxy API, but does not receive a token and so does not have storage responsibilities. StorageService.instance.initClient(); in 0.7 you

Re: cassandra disk usage

2010-08-30 Thread Jonathan Ellis
column names are stored per cell (moving to user@) On Mon, Aug 30, 2010 at 6:58 AM, Terje Marthinussen tmarthinus...@gmail.com wrote: Hi, Was just looking at a SSTable file after loading a dataset. The data load has no updates of data  but: - Columns can in some rare cases be added to

Re: Thrift + PHP: help!

2010-08-30 Thread Mike Peters
Interesting! Thanks for sharing Have you considered instead of retrying the failing node, to iterate through other nodes in your cluster? If one node is failing (let's assume it's overloaded for a minute), you're probably going to be better off having the client send the insert to the next

Re: Thrift + PHP: help!

2010-08-30 Thread Juho Mäkinen
On Mon, Aug 30, 2010 at 4:24 PM, Mike Peters cassan...@softwareprojects.com wrote: Have you considered instead of retrying the failing node, to iterate through other nodes in your cluster? Yes, the $this->connect() does just that: it removes the previous node from the node list and gives the
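The failover idea discussed in this thread (move on to the next node rather than retrying the one that just failed) can be sketched as follows. This is not Juho's PHP implementation; try_nodes and the operation callback are hypothetical names, and treating ConnectionError and TimeoutError as the "node is unhealthy" signals is an assumption.

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


def try_nodes(nodes: Iterable[str], operation: Callable[[str], T]) -> T:
    """Run `operation` against each node in turn and return the first success."""
    last_error: Optional[Exception] = None
    for node in nodes:
        try:
            return operation(node)
        except (ConnectionError, TimeoutError) as exc:
            # Do not retry this node; remember the error and move on.
            last_error = exc
    raise RuntimeError("all nodes failed") from last_error
```

A caller would pass the list of cluster addresses and a function that opens a connection to the given node and performs the insert.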

NodeTool won't connect remotely

2010-08-30 Thread Allan Carroll
Hi, I'm trying to manage my cassandra cluster from a remote box and having issues getting nodetool to connect. All the machines I'm using are running on AWS. Here's what happens when I try: /opt/apache-cassandra-0.6.4/bin/nodetool -h xxx.xxx.xxx.143 -p 10036 ring Error connecting to remote

Re: NodeTool won't connect remotely

2010-08-30 Thread Juho Mäkinen
I think that JMX needs additional ports to function correctly. Try to disable all firewalls between the client and the server so that client can connect to any port in the server and try again. - Juho Mäkinen On Mon, Aug 30, 2010 at 7:07 PM, Allan Carroll alla...@gmail.com wrote: Hi, I'm

Re: Cassandra HAProxy

2010-08-30 Thread Dave Viner
FWIW - we've been using HAProxy in front of a cassandra cluster in production and haven't run into any problems yet. It sounds like our cluster is tiny in comparison to Anthony M's cluster. But I just wanted to mention that others out there are doing the same. One thing in this thread that I

Re: cassandra disk usage

2010-08-30 Thread Terje Marthinussen
On Mon, Aug 30, 2010 at 10:10 PM, Jonathan Ellis jbel...@gmail.com wrote: column names are stored per cell (moving to user@) I think that is already accommodated for in my numbers? What i listed was measured from the actual SSTable file (using the output from strings sstable.db), so

Re: Cassandra HAProxy

2010-08-30 Thread Edward Capriolo
On Mon, Aug 30, 2010 at 12:40 PM, Dave Viner davevi...@pobox.com wrote: FWIW - we've been using HAProxy in front of a cassandra cluster in production and haven't run into any problems yet. It sounds like our cluster is tiny in comparison to Anthony M's cluster. But I just wanted to mention

Re: NodeTool won't connect remotely

2010-08-30 Thread Allan Carroll
Thanks! That did it. Looks like the connection happens on 10036 and then the server negotiates a separate port for continued communication. Found this article once I knew what to look for. It also describes how to get more consistency on port numbers to allow for ssh tunneling and firewalls.

Dumping

2010-08-30 Thread Mark
Is there an easy way to retrieve all values from a CF.. similar to a dump? How about retrieving all columns for a particular key? In the second use case a simple iteration would work using a start and finish but how would this be accomplished across all keys for a particular CF when you

Re: Dumping

2010-08-30 Thread aaron morton
sstable2json discussed here http://wiki.apache.org/cassandra/Operations may be what you are after, or the snapshot feature. Not sure what you want to use the dump for. If you do not know the keys in the CF in advance take a look at get_range_slices (http://wiki.apache.org/cassandra/API) it
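When the keys are not known in advance, the usual pattern with a range call such as get_range_slices is to page: ask for a batch of rows, then ask again starting from the last key seen. The sketch below shows only that loop. The fetch_page(start_key, count) callback is hypothetical and stands in for whatever client call is used; it is assumed to return (key, columns) pairs in the server's own order with an inclusive start key, which is why the repeated first row of each later page is dropped.

```python
from typing import Callable, Iterator, List, Tuple

Row = Tuple[str, dict]


def iter_all_rows(fetch_page: Callable[[str, int], List[Row]],
                  page_size: int = 100) -> Iterator[Row]:
    """Yield every row of a column family by paging through key ranges."""
    if page_size < 2:
        # With an inclusive start key, a page of 1 would never advance.
        raise ValueError("page_size must be at least 2")
    start = ""  # assumption: an empty start key means "from the beginning"
    first_page = True
    while True:
        rows = fetch_page(start, page_size)
        if not first_page and rows and rows[0][0] == start:
            rows = rows[1:]  # already yielded on the previous page
        if not rows:
            return
        for row in rows:
            yield row
        start = rows[-1][0]
        first_page = False
```

The loop relies only on the order the server returns, so it makes no assumption that keys come back sorted alphabetically.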

Client developer mailing list

2010-08-30 Thread Jeremy Hanna
There has been a new mailing list created for those who are working on Cassandra clients above thrift and/or avro. You can subscribe by sending an email to client-dev-subscr...@cassandra.apache.org or using the link at the bottom of http://cassandra.apache.org The list is meant to give client

Re: Job opening cassandra Barcelona, Spain

2010-08-30 Thread Dimitry Lvovsky
Thanks for the suggestion. On Aug 30, 2010, at 8:01 PM, Norman Maurer wrote: I think you should try jobs at apache.org too ;) Bye, Norman 2010/8/25 Dimitry Lvovsky dimi...@reviewpro.com: Hi All, Please forgive the job offer spam. We're looking to add a developer with experience

Re: Client developer mailing list

2010-08-30 Thread Ran Tavory
awesome, thanks, I'm subscribed :) On Mon, Aug 30, 2010 at 10:05 PM, Jeremy Hanna jeremy.hanna1...@gmail.com wrote: There has been a new mailing list created for those who are working on Cassandra clients above thrift and/or avro. You can subscribe by sending an email to

Re: Client developer mailing list

2010-08-30 Thread Mike Peters
I'm in! We really need a better PHP Thrift

Re: get_slice sometimes returns previous result on php

2010-08-30 Thread Benjamin Black
On Mon, Aug 30, 2010 at 6:05 AM, Juho Mäkinen juho.maki...@gmail.com wrote: The application is using the same cassandra thrift connection (it doesn't close it in between) and everything is happening inside same php process. This is why you are seeing this problem (and is specific to

Re: get_slice sometimes returns previous result on php

2010-08-30 Thread Juho Mäkinen
I'm not using connection pooling where the same tcp socket is used between different php requests. I open a new thrift connection with a new socket to the node, use the node throughout the request, and close it afterwards. The get_slice requests are all happening in the same request, so something odd

Re: Follow-up post on cassandra configuration with some experiments on GC tuning

2010-08-30 Thread Peter Schuller
collection runs for the cases tested. In most cases, I prefer having low pauses due to any garbage collection runs and don't care too much about the shape of the memory usage, and I guess, that's the reason why the low pause collector is used by default for running cassandra. For myself, I

Re: Follow-up post on cassandra configuration with some experiments on GC tuning

2010-08-30 Thread Jonathan Ellis
On Mon, Aug 30, 2010 at 5:18 PM, Peter Schuller peter.schul...@infidyne.com wrote: Has anyone run Cassandra with G1 in production for prolonged periods of time? Not AFAIK. -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support

Re: cassandra for a inbox search with high reading qps

2010-08-30 Thread Todd Nine
We use Lucandra as well for searching for users, as well as geo-encoding. It really works well except for numeric fields. https://issues.apache.org/jira/browse/CASSANDRA-1235 That bug may be a bit of an issue, but after they release 0.6.5 all the Lucene functionality will be available to you.

Re: cassandra for a inbox search with high reading qps

2010-08-30 Thread Chen Xinli
What's the average size of a user? As far as I know, Lucandra will first pull the data from Cassandra, then do the computation in the client. That's OK for small rows. But our rows are 1M on average, and some rows grow to 100M; at the same time, we expect high reading qps. Pulling these data to the client

Re: column family names

2010-08-30 Thread Benjamin Black
URL encoding. On Mon, Aug 30, 2010 at 5:55 PM, Aaron Morton aa...@thelastpickle.com wrote: underscores or URL encoding? Aaron On 31 Aug, 2010, at 12:27 PM, Benjamin Black b...@b3k.us wrote: Please don't do this. On Mon, Aug 30, 2010 at 5:22 AM, Terje Marthinussen tmarthinus...@gmail.com

Re: column family names

2010-08-30 Thread Terje Marthinussen
Beyond aesthetics, specific reasons? Terje On Tue, Aug 31, 2010 at 11:54 AM, Benjamin Black b...@b3k.us wrote: URL encoding.