Re: logged out: #User allow_all groups=[]

2011-05-10 Thread Stu Hood
As a side note, be aware that running with DEBUG logging enabled can make your cluster run a full order of magnitude slower. On Mon, May 9, 2011 at 6:54 PM, Suan Aik Yeo yeosuan...@gmail.com wrote: Ah, must be the status check that I set up. Thanks! On Mon, May 9, 2011 at 7:42 PM, Tyler

Finding big rows

2011-05-10 Thread Meler Wojciech
Hello, I've noticed very nice stats exposed with JMX. I was quite shocked when I saw that MaxRowSize was about 400MB (it was expected to be several MB). What is the best way to find keys of such big rows? I couldn't find anything so I've written a simple program to dump sizes from Index files
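A minimal sketch of the "dump sizes from Index files" idea, not the poster's actual program. It assumes each index entry is a 2-byte big-endian key length, the key bytes, and an 8-byte big-endian position into the data file; that layout is an assumption here and should be checked against the Cassandra version in use. The names row_sizes and make_entry are hypothetical helpers.

```python
import struct


def row_sizes(index_bytes, data_file_length):
    """Return (key, approximate_row_size) pairs from raw index bytes.

    ASSUMED entry layout (not confirmed by the thread):
    unsigned short key length, key bytes, signed 64-bit data-file position.
    """
    entries = []
    offset = 0
    while offset < len(index_bytes):
        (key_len,) = struct.unpack_from(">H", index_bytes, offset)
        offset += 2
        key = index_bytes[offset:offset + key_len]
        offset += key_len
        (position,) = struct.unpack_from(">q", index_bytes, offset)
        offset += 8
        entries.append((key, position))

    sizes = []
    for i, (key, position) in enumerate(entries):
        # A row ends where the next row starts; the last row ends at EOF.
        if i + 1 < len(entries):
            next_position = entries[i + 1][1]
        else:
            next_position = data_file_length
        sizes.append((key, next_position - position))
    return sizes


def make_entry(key, position):
    """Build one synthetic index entry in the assumed layout (for testing)."""
    return struct.pack(">H", len(key)) + key + struct.pack(">q", position)
```

Sorting the result by size in descending order would surface the largest rows first; the sizes include per-row overhead in the data file, so they are approximate.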

Re: PIG Cassandra - IPs of nodes in a ring

2011-05-10 Thread Jeremy Hanna
Anyone have any thoughts on this thread - about configuring cassandra with a different ip for listen address and rpc address? Moving this to the cassandra user list as it involves cassandra configuration more at this point. On May 10, 2011, at 12:58 AM, Badrinarayanan S wrote: Hi, after

Datacenter migration

2011-05-10 Thread André-Philippe Paquet
Hi, We have a cluster of ~20 nodes all located in 1 datacenter running Cassandra 0.7.2. We are planning to move from this datacenter to another with as little downtime as possible. The first strategy that came to mind is to use the topology placement strategy, create nodes in the new

Re: Datacenter migration

2011-05-10 Thread Jonathan Ellis
http://wiki.apache.org/cassandra/Operations#Replication 2011/5/10 André-Philippe Paquet andre-phili...@wajam.com: Hi, We have a cluster of ~20 nodes all located in 1 datacenter running Cassandra 0.7.2. We are planning to move from this datacenter to another with as minimal downtime as

Newbie question

2011-05-10 Thread Sam Ganesan
All: A newbie question to the aficionados. I understand that I can stipulate an ordering mechanism when I create a column family to reflect what I am querying in the long run. Generally I need to query a particular column space that I am constructing based on two different columns. The

column bloat

2011-05-10 Thread Terje Marthinussen
Hi, If you make a supercolumn today, what you end up with is: - short + Super Column name - int (local deletion time) - long (delete time) Byte array of columns each with: - short + column name - int (TTL) - int (local deletion time) - long (timestamp) - int + value of column That

Small typo in conf/cassandra.yaml

2011-05-10 Thread Benoit Perroud
Hi all, I found out a small typo in cassandra.yaml, which can confuse inattentive copy-paster. Here is the patch. Index: conf/cassandra.yaml === --- conf/cassandra.yaml (revision 1101465) +++ conf/cassandra.yaml (working copy) @@

Re: column bloat

2011-05-10 Thread Sylvain Lebresne
On Tue, May 10, 2011 at 3:44 PM, Terje Marthinussen tmarthinus...@gmail.com wrote: Hi, If you make a supercolumn today, what you end up with is: - short + Super Column name - int (local deletion time) - long (delete time) Byte array of columns each with: - short + column name - int

Re: column bloat

2011-05-10 Thread Terje Marthinussen
Anyway, to sum that up, expiring columns are 1 byte more and non-expiring ones are 7 bytes less. Not arguing, it's still fairly verbose, especially with tons of very small columns. Yes, you are right, sorry. Trying to do one thing too many at the same time. My brain filtered out part of the
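The arithmetic in this thread can be sketched as follows. It assumes the "1 byte more" and "7 bytes less" corrections are measured against the per-column field list in the original post (short + int + int + long + int, excluding the name and value bytes). One reading consistent with those numbers is a 1-byte flag on every column, with the TTL and local deletion time written only for expiring columns; that reading is an assumption, and the function names are hypothetical.

```python
# Field widths in bytes.
BYTE, SHORT, INT, LONG = 1, 2, 4, 8


def overhead_as_listed():
    """Fixed per-column bytes as listed in the original post:
    short (name length) + int (TTL) + int (local deletion time)
    + long (timestamp) + int (value length)."""
    return SHORT + INT + INT + LONG + INT


def overhead_corrected(expiring):
    """Fixed per-column bytes after the correction in the reply.

    ASSUMED reading: every column carries a 1-byte flag, and only
    expiring columns carry the TTL and local deletion time.
    """
    fixed = SHORT + BYTE + LONG + INT
    if expiring:
        fixed += INT + INT
    return fixed


def column_size(name_len, value_len, expiring=False):
    """Total serialized size of one column under the same assumption."""
    return overhead_corrected(expiring) + name_len + value_len
```

Under this reading a non-expiring column with a 4-byte name and a 1-byte value takes 20 bytes, of which 15 are overhead, which illustrates why very small columns remain verbose.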

Re: Newbie question

2011-05-10 Thread Narendra Sharma
You can have only one ordering defined in a CF. A Super CF will allow you to have nested ordering, i.e. SCs can have one ordering whereas columns within an SC can have another ordering. Note this is defined at CF level and cannot be defined at SC level. To model what you are trying to do, you can check if

Re: compaction strategy

2011-05-10 Thread Terje Marthinussen
Everyone may be well aware of that, but I'll still remark that a minor compaction will try to merge as many 20MB sstables as it can up to the max compaction threshold (which is configurable). So if you do accumulate some newly created sstable at some point in time, the next minor compaction

Re: Newbie question

2011-05-10 Thread Nate McCall
If I understand correctly, CompositeType comparator (https://issues.apache.org/jira/browse/CASSANDRA-2231) may be of interest to you once it becomes available.

installing cassandra on ec2 boxes

2011-05-10 Thread Anurag Gujral
Hi All, I am trying to install cassandra on ec2 boxes. I am using domain names to specify the listen_address and seeds. I am getting the following error from cassandra when I try to create a keyspace. Warning: unreachable nodes IP(address of the cassandra instance).. schemas agree across the

Re: Small typo in conf/cassandra.yaml

2011-05-10 Thread Jonathan Ellis
. inside quotations is correct in English. On Tue, May 10, 2011 at 9:06 AM, Benoit Perroud ben...@noisette.ch wrote: Hi all, I found out a small typo in cassandra.yaml, which can confuse inattentive copy-paster. Here is the patch. Index: conf/cassandra.yaml

RE: Small typo in conf/cassandra.yaml

2011-05-10 Thread Gert van der Spoel
. inside quotations is correct in English. Caused by: org.yaml.snakeyaml.error.YAMLException: Cannot create property=commitlog_sync for JavaBean=org.apache.cassandra.config.Config@1fd5e2; Unable to find enum value 'batch.' for enum class: org.apache.cassandra.config.Config$CommitLogSync That's
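The failure quoted above can be illustrated with a small Python analogue: a strict enum lookup rejects 'batch.' because the trailing period is part of the string. This only mimics the behaviour; the real parsing happens in SnakeYAML and the Java Config class, and the member names below (periodic, batch) plus the helper parse_commitlog_sync are assumptions made for the sketch.

```python
from enum import Enum


class CommitLogSync(Enum):
    # Assumed members, mirroring the two commitlog_sync modes.
    periodic = "periodic"
    batch = "batch"


def parse_commitlog_sync(raw):
    """Look up a config value by exact name, as a strict enum parser would."""
    try:
        return CommitLogSync[raw]
    except KeyError:
        raise ValueError(
            f"Unable to find enum value {raw!r} for enum class CommitLogSync"
        ) from None
```

Copying the value together with the sentence-ending period from the comment therefore produces a startup error rather than a silently ignored setting.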

Re: Small typo in conf/cassandra.yaml

2011-05-10 Thread Tyler Hobbs
On Tue, May 10, 2011 at 11:40 AM, Jonathan Ellis jbel...@gmail.com wrote: . inside quotations is correct in English. It's a subtle and non-obvious rule of English. Besides, in a programming context, the English rule always feels wrong to me :)

Re: Renaming cluster

2011-05-10 Thread Shaun Newman
Hi, Here is the actual error I get when I remove all LocationInfo files: ERROR 18:05:04,918 Fatal exception during initialization org.apache.cassandra.config.ConfigurationException: Found system table files, but they couldn't be loaded. Did you change the partitioner? at

Re: Index interval tuning

2011-05-10 Thread Peter Schuller
That reminds me, my false positive ratio is stuck at 1.0, so I guess bloom filters aren't doing a lot for me. That sounds unlikely unless you're hitting some edge case like reading a particular row that happened to be a collision, and only that row. This is from JMX stats on the column family

Re: compaction strategy

2011-05-10 Thread Sylvain Lebresne
On Tue, May 10, 2011 at 6:20 PM, Terje Marthinussen tmarthinus...@gmail.com wrote: Everyone may be well aware of that, but I'll still remark that a minor compaction will try to merge as many 20MB sstables as it can up to the max compaction threshold (which is configurable). So if you do

Re: installing cassandra on ec2 boxes

2011-05-10 Thread Sameer Farooqui
Hi Anurag, We're using an elastic IP for the seed address (public DNS name should also work) and using the private IP (10.255.x.x) for the listen address. As you're getting started, you may also find this blog that my team put together helpful:

Re: Small typo in conf/cassandra.yaml

2011-05-10 Thread Jonathan Ellis
Comments are text, not code. On Tue, May 10, 2011 at 12:01 PM, Tyler Hobbs ty...@datastax.com wrote: On Tue, May 10, 2011 at 11:40 AM, Jonathan Ellis jbel...@gmail.com wrote: . inside quotations is correct in English. It's a subtle and non-obvious rule of English.  Besides, in a programming

Re: installing cassandra on ec2 boxes

2011-05-10 Thread Anurag Gujral
Are you using different regions in ec2? Thanks, Anurag On Tue, May 10, 2011 at 11:19 AM, Sameer Farooqui cassandral...@gmail.com wrote: Hi Anurag, We're using an elastic IP for the seed address (public DNS name should also work) and using the private IP (10.255.x.x) for the listen address. As

How to load schema non-programmatically? loadSchemaFromYAML doesn't work

2011-05-10 Thread Jim the Standing Bear
Hi, I am just learning Cassandra at the moment. The O'Reilly book on Cassandra says that I should be able to add the keyspace schema definition to cassandra.yaml and then invoke loadSchemaFromYAML() to load it. I tried that but it didn't work (and I do realize this book is a bit obsolete now).

Re: installing cassandra on ec2 boxes

2011-05-10 Thread Sameer Farooqui
Not yet, but I'm working on deploying 0.8.0 beta 2 on multi-regions using a VPN on Ubuntu. I can share my technique on this mailing list in a little bit for how I did it. On Tue, May 10, 2011 at 12:56 PM, Anurag Gujral anurag.guj...@gmail.com wrote: Are you using different regions in ec2

Re: How to load schema non-programmatically? loadSchemaFromYAML doesn't work

2011-05-10 Thread chovatia jaydeep
Hi, If you are looking for some utility for loading the schema then you can use the schematool command-line utility as: $ schematool host port import|export Thank you, Jaydeep From: Jim the Standing Bear standingb...@gmail.com To: user@cassandra.apache.org Sent:

Re: How to load schema non-programmatically? loadSchemaFromYAML doesn't work

2011-05-10 Thread Anurag Gujral
Use jconsole: give it your hostname:port (the hostname and port on which cassandra is running), then from the MBeans select the storage service. On the storage service MBean execute the operation loadSchemaFromXML(); You may have to change the hostname in your cassandra jmx settings. Thanks, A On Tue, May 10,

Re: How to load schema non-programmatically? loadSchemaFromYAML doesn't work

2011-05-10 Thread Jim the Standing Bear
Hi Anurag, That was what I used, but it didn't seem to work. On Tue, May 10, 2011 at 4:54 PM, Anurag Gujral anurag.guj...@gmail.com wrote: Use jconsole give your hostname:port (hostname,port on which cassandra is running) then from the MBean select storage service . On the storage service

Re: How to load schema non-programmatically? loadSchemaFromYAML doesn't work

2011-05-10 Thread Praveen Sadhu
The export dumps to the console. The import only works on a fresh system and imports from the cassandra.yaml file. Praveen On 5/10/11 1:59 PM, Jim the Standing Bear standingb...@gmail.com wrote: Hi Anurag, That was what I used, but it didn't seem to work. On Tue, May 10, 2011 at 4:54 PM, Anurag

Has anyone used cassandra.yml straight from 0.7.x on 0.8.x?

2011-05-10 Thread Larry Liu
Clearly it doesn't work if I go this route. Just wondering if anyone has such experience and can give me some pointers. Thanks.

Re: Has anyone used cassandra.yml straight from 0.7.x on 0.8.x?

2011-05-10 Thread Jonathan Ellis
Start w/ the 0.8 one and add your modifications to it. On Tue, May 10, 2011 at 4:24 PM, Larry Liu larryliu...@gmail.com wrote: Clearly it doesn't work if go this route. Just wonder if anyone has such experience which can give me some pointers. Thanks. -- Jonathan Ellis Project Chair,

Re: Has anyone used cassandra.yml straight from 0.7.x on 0.8.x?

2011-05-10 Thread Larry Liu
Thanks, Jonathan. How about data? Restoring the snapshot taken on a 0.7.x box to a 0.8.x should work, shouldn't it? On Tue, May 10, 2011 at 2:25 PM, Jonathan Ellis jbel...@gmail.com wrote: Start w/ the 0.8 one and add your modifications to it. On Tue, May 10, 2011 at 4:24 PM, Larry Liu

Re: Finding big rows

2011-05-10 Thread aaron morton
I'm not aware of anything to find the row sizes, and your code looks like a good approach. Converting the key bytes to a string only makes sense if your app is doing the same thing. In the cli try using one of the data type functions to format the key the same way as your app is, e.g. get

Re: Index interval tuning

2011-05-10 Thread Chris Burroughs
On 05/10/2011 02:12 PM, Peter Schuller wrote: That reminds me, my false positive ratio is stuck at 1.0, so I guess bloom filters aren't doing a lot for me. That sounds unlikely unless you're hitting some edge case like reading a particular row that happened to be a collision, and only that

Re: Cassandra node throws NPE on startup

2011-05-10 Thread Shu Zhang
Late reply, but I just got the same error restarting after upgrading from 0.7.2 to 0.7.5. I did a drain using nodetool on each node before I killed them and did the upgrade. Should all commitlogs have been cleaned up after a drain? I would think so, but they were not. Maybe there is a bug around

Re: column bloat

2011-05-10 Thread aaron morton
For a reasonably large number of use cases (for me, 2 out of 3 at the moment) supercolumns will be units of data where the columns (attributes) will never change by themselves or where the data does not change anyway (archived data). Can you use a standard CF and pack the multiple columns

Re: Newbie question

2011-05-10 Thread aaron morton
Depending on the data size and workload consider storing the data under both keys, in the same CF. e.g. keys id/1234 and name/MrFoo are different rows with the same data in them. Or use secondary indexes or a custom index using another CF as Narendra suggests. Secondary indexes are doing
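The "store the data under both keys" suggestion can be sketched with a plain dictionary standing in for the column family. This is only a model of the data layout, not client code; store_user and the email column are hypothetical names introduced for illustration.

```python
def store_user(cf, user_id, name, columns):
    """Write the same columns under two row keys so either lookup works.

    `cf` is a dict standing in for a column family: row key -> columns.
    """
    cf[f"id/{user_id}"] = dict(columns)
    cf[f"name/{name}"] = dict(columns)


# Usage: one logical record, reachable by id or by name.
cf = {}
store_user(cf, 1234, "MrFoo", {"email": "foo@example.com"})
by_id = cf["id/1234"]
by_name = cf["name/MrFoo"]
```

The trade-off is that every update must be written to both rows, which is why the data size and workload matter when choosing between this and an index.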

Re: Has anyone used cassandra.yml straight from 0.7.x on 0.8.x?

2011-05-10 Thread Jonathan Ellis
Data you could just upgrade in place, but the snapshots are also compatible. On Tue, May 10, 2011 at 4:37 PM, Larry Liu larryliu...@gmail.com wrote: Thanks, Jonathan. How about data?  Restoring the snapshot taken on a 0.7.x box to a 0.8.x should work, shouldn't it? On Tue, May 10, 2011 at

Re: Cassandra node throws NPE on startup

2011-05-10 Thread Jonathan Ellis
It looks like this will happen if it tries to replay a mutation for a dropped keyspace (dropped CFs are handled correctly). Created https://issues.apache.org/jira/browse/CASSANDRA-2631 to fix. On Wed, Apr 27, 2011 at 6:22 AM, Subscriber subscri...@zfabrik.de wrote: Hi, I'm using Cassandra

How to build and use modified Cassandra src code?

2011-05-10 Thread Sameer Farooqui
I just edited the MessagingServices.java and OutboundTcpConnection.java files in 0.8.0 beta 2 and built it successfully using Ant (I just ran the ant command in the apache-cassandra-0.8.0-beta2-src directory). I need some help with how to deploy the newly built binaries to a new Cassandra cluster

EC2 Snitch

2011-05-10 Thread Sameer Farooqui
Has anybody successfully used EC2 Snitch for cross-region deployments on EC2? Brandon Williams has not recommended using this just yet, but I was curious if anybody is using it with 0.8.0. Also, the snitch just lets the cluster automatically discover what the different regions (aka data centers)

Re: Renaming cluster

2011-05-10 Thread aaron morton
I was looking in the wrong place, the code expects the data directory for the system keyspace to be empty if it could not read from the LocationInfo CF. Created a patch here https://issues.apache.org/jira/browse/CASSANDRA-2632 to see if it can be changed. In the mean time try removing all

Re: How to load schema non-programmatically? loadSchemaFromYAML doesn't work

2011-05-10 Thread Jim the Standing Bear
Hi Praveen, The import only works on fresh system and imports from cassandra.yaml file. Thanks for this information. What do you mean by a fresh system? A system that was just installed, configured, and launched, with no other user-defined keyspaces? So in other words, I can only add

Re: Index interval tuning

2011-05-10 Thread aaron morton
What version and what were the values for RecentBloomFilterFalsePositives and BloomFilterFalsePositives? The bloom filter metrics are updated in SSTableReader.getPosition(); the only slightly odd thing I can see is that we do not count a key cache hit as a true positive for the bloom filter. If
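A small sketch of how a ratio could sit at 1.0 under the behaviour described above. It assumes the ratio is computed as false positives divided by the sum of false and true positives, and that reads served from the key cache are not counted at all; both points are assumptions for illustration, not a diagnosis of the reported cluster. The names tally and false_positive_ratio are hypothetical.

```python
def tally(reads):
    """Count bloom filter outcomes for reads that passed the filter.

    `reads` is an iterable of (row_present, key_cache_hit) pairs.
    ASSUMED: a key cache hit returns early and is never counted
    as a true positive.
    """
    false_positives = true_positives = 0
    for row_present, key_cache_hit in reads:
        if key_cache_hit:
            continue  # served from the key cache, not counted
        if row_present:
            true_positives += 1
        else:
            false_positives += 1
    return false_positives, true_positives


def false_positive_ratio(false_positives, true_positives):
    """ASSUMED formula: false / (false + true), or 0.0 with no samples."""
    total = false_positives + true_positives
    return false_positives / total if total else 0.0
```

With a warm key cache, nine successful reads that all hit the cache plus a single false positive give counts of (1, 0) and a ratio of 1.0, even though the filter is working for most reads.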

Re: How to build and use modified Cassandra src code?

2011-05-10 Thread aaron morton
Try "ant artifacts"; that will package up the same bin and src releases you see on the website into the build dir. Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 11 May 2011, at 12:42, Sameer Farooqui wrote: I just edited the

Re: How to load schema non-programmatically? loadSchemaFromYAML doesn't work

2011-05-10 Thread aaron morton
See conf/schema-sample.txt; it's a script that can be passed to the cli (there is an example in the file), and you can also paste the text into the cli if you want to. The cli has a bunch of online help as well. Hope that helps. - Aaron Morton Freelance Cassandra Developer @aaronmorton