How to store a list of values?

2012-03-26 Thread Ben McCann
I have a profile column family and want to store a list of skills in each profile. In BigTable I could store a Protocol Buffer (http://code.google.com/apis/protocolbuffers/docs/overview.html) with a repeated field, but I'm not sure how this is typically accomplished in Cassandra. One option would be

Re: Error in FAQ?

2012-03-26 Thread R. Verlangen
If you want to modify a column family, just open the command line interface (cassandra-cli), connect to a node (probably: connect localhost/9160;). When you have to create your first keyspace type: create keyspace MyKeyspace; For modifying an existing keyspace type: use MyKeyspace; If you need

Re: unbalanced ring

2012-03-26 Thread Radim Kolar
How can I fix this? add more data. 1.5M is not enough to get reliable reports

problem in create column family

2012-03-26 Thread puneet loya
It is giving errors like Unable to find abstract-type class 'org.apache.cassandra.db.marshal.utf8' and java.lang.RuntimeException: org.apache.cassandra.db.marshal.MarshalException: cannot parse 'catalogueId' as hex bytes where catalogueId is a column that has utf8 as its data type. they may be

Re: How to store a list of values?

2012-03-26 Thread samal
I would take a simple approach: create one other CF, UserSkill, with the same row key as the profile_cf key. In the user_skill CF, add each skill as a column name with a null value. Columns can be added or removed. UserProfile={ '*ben*'={ blah :blah blah :blah blah :blah } } UserSkill={ '*ben*'={
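A rough sketch of the layout described above, modelled with plain Python dicts rather than a real Cassandra client. The names `user_skill`, `add_skill`, `remove_skill` and `list_skills` are made up for illustration; the only point is that each skill is a column name with an empty value, so a skill can be added or removed without touching the rest of the row.

```python
# Plain-Python model of the layout; no Cassandra client is involved.
# Outer dict = column family, inner dict = one row (column name -> value).
user_skill = {}  # hypothetical stand-in for the UserSkill column family


def add_skill(cf, row_key, skill):
    """Insert a column whose name is the skill and whose value is empty."""
    cf.setdefault(row_key, {})[skill] = None


def remove_skill(cf, row_key, skill):
    """Delete a single column; other skills in the row are untouched."""
    cf.get(row_key, {}).pop(skill, None)


def list_skills(cf, row_key):
    """Return the column names of a row in sorted order."""
    return sorted(cf.get(row_key, {}))


add_skill(user_skill, "ben", "java")
add_skill(user_skill, "ben", "cassandra")
remove_skill(user_skill, "ben", "java")
print(list_skills(user_skill, "ben"))
```

Each add or remove here is a single write to one column, which is the property this approach relies on.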

Re: problem in create column family

2012-03-26 Thread R. Verlangen
You should use the full type names, e.g. create column family MyColumnFamily with comparator=UTF8Type; 2012/3/26 puneet loya puneetl...@gmail.com It is giving errors like Unable to find abstract-type class 'org.apache.cassandra.db.marshal.utf8' and java.lang.RuntimeException:

Re: How to store a list of values?

2012-03-26 Thread Ben McCann
Thanks for the reply Samal. I did not realize that you could store a column with null value. Do you know if this solution would work with composite columns? It seems super columns are being phased out in favor of composites, but I do not understand composites very well yet. I'm trying to

Re: How to store a list of values?

2012-03-26 Thread samal
plus it is fully compatible with CQL. SELECT * FROM UserSkill WHERE KEY='ben'; On Mon, Mar 26, 2012 at 9:13 PM, samal samalgo...@gmail.com wrote: I would take simple approach. create one other CF UserSkill with row key same as profile_cf key, In user_skill cf will add skill as column name

Re: How to store a list of values?

2012-03-26 Thread samal
On Mon, Mar 26, 2012 at 9:20 PM, Ben McCann b...@benmccann.com wrote: Thanks for the reply Samal. I did not realize that you could store a column with null value. values can be null or any value like [default@node] set hus['test']['wowq']='\{de\'.de\;\}\+\^anything'; Value inserted.

Re: Counters and replication factor

2012-03-26 Thread aaron morton
Can you describe the situations where counter updates are lost or go backwards ? Do you ever get TimedOutExceptions when performing counter updates ? Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 24/03/2012, at 6:34 PM, Radim Kolar

Re: Adding Long type rows to a CF containing Integer(32) type row keys, without overlapping ?

2012-03-26 Thread aaron morton
without them overlapping/disturbing each other (assuming that keys lie in above domains) ? Not sure what you mean by overlapping. 42 as an int and 42 as a long are the same key. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On

Re: How to store a list of values?

2012-03-26 Thread Sasha Dolgy
Save the skills in a single column in json format. Job done. On Mar 26, 2012 7:04 PM, Ben McCann b...@benmccann.com wrote: True. But I don't need the skills to be searchable, so I'd rather embed them in the user than add another top-level CF. I was thinking of doing something along the

Re: smart client proxy for cassandra

2012-03-26 Thread aaron morton
I've heard of people using HA Proxy http://haproxy.1wt.eu/ with php as a connection pool. Note that detecting failure in Cassandra can only be done as part of a request. So HA Proxy cannot understand if a node is actually functional, only that it allows a socket to be opened. There is some

Re: Regarding nodetool tpstats

2012-03-26 Thread aaron morton
Cheers :) - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 26/03/2012, at 1:55 PM, Watanabe Maki wrote: - InternalResponseStage Handles response to non client initiated messages, including bootstrap, schema check, etc. maki On

Re: Sample Data

2012-03-26 Thread Tyler Hobbs
The 'stress' tool that you can find in a source checkout of cassandra sounds like what you're looking for. It's designed to write data to (or read data from) a cluster as fast as possible, and has plenty of options for tweaking the type of data it inserts. You can read more about it here:

Re: Sample Data

2012-03-26 Thread Tom Melendez
I wish to test certain things in Cassandra so can someone help me with sample database or sample database data generator which can help me flood Cassandra nodes with large amount of data. I would recommend YCSB: https://github.com/brianfrankcooper/YCSB/wiki/ Thanks, Tom

Re: Adding Long type rows to a CF containing Integer(32) type row keys, without overlapping ?

2012-03-26 Thread Ertio Lew
I need to use the range beyond the integer32 type range, so I am using Long to write those keys. I am afraid if this might lead to collisions with the previously stored integer keys in the same CF even if I leave out the int32 type range. On Mon, Mar 26, 2012 at 10:51 PM, aaron morton

Re: Estimation of memtable size are wrong

2012-03-26 Thread aaron morton
Yes i noticed that. It's not too often, about once per week. The assumption would be that the workload stabilises over time. INFO [MemoryMeter:1] 2012-03-23 00:00:18,407 Memtable.java (line 186) CFS(Keyspace='whois', ColumnFamily='ipbans') liveRatio is 64.0 (just-counted was

Re: One or Two clusters?

2012-03-26 Thread aaron morton
Use one cluster. Use lots-o-machines. The read and write paths do not directly interfere with each other like they do in a RDBMS. Compaction created by writes can suck up disk IO, but this is throttled so in practice it is not such a big problem. Excessive GC created by reads or compaction

Re: Performance overhead when using start and end columns

2012-03-26 Thread aaron morton
See the tests in the article. The code I used for profiling is also available. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 27/03/2012, at 6:21 AM, Mohit Anchlia wrote: Thanks but if I do have to specify start and end columns then

Re: Adding Long type rows to a CF containing Integer(32) type row keys, without overlapping ?

2012-03-26 Thread aaron morton
Only if you reuse a row key. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 27/03/2012, at 6:38 AM, Ertio Lew wrote: I need to use the range beyond the integer32 type range, so I am using Long to write those keys. I am afraid if

Re: Performance overhead when using start and end columns

2012-03-26 Thread Data Craftsman
Hi Aaron, Thanks for the benchmark. The matrix is valuable. Thanks, Charlie (@mujiang) 一个 木匠 === Data Architect Developer http://mujiang.blogspot.com On Mon, Mar 26, 2012 at 10:53 AM, aaron morton aa...@thelastpickle.com wrote: See the tests in the article. The code I used for

Re: How to store a list of values?

2012-03-26 Thread samal
Save the skills in a single column in json format. Job done. Good if there is a fixed set of skills; otherwise any add or delete has to be handled in the app: read the column first, reformat the JSON, update the column (2 Thrift calls). skill~Java: null, skill~Cassandra: null This is also a good option, but any
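A minimal sketch of the read, reformat, update cycle mentioned above for the JSON-in-one-column approach. `profile_row` is a hypothetical in-memory stand-in for a profile row and `add_skill_json` is a made-up helper; with a real client the two marked steps would each be a round trip to the cluster.

```python
import json

# Hypothetical in-memory stand-in for one profile row; not a Cassandra client.
profile_row = {"skills": json.dumps(["java"])}


def add_skill_json(row, skill):
    """Read the column, change the decoded list, write the column back."""
    skills = json.loads(row["skills"])   # step 1: read the column
    if skill not in skills:
        skills.append(skill)
    row["skills"] = json.dumps(skills)   # step 2: overwrite the column


add_skill_json(profile_row, "cassandra")
print(profile_row["skills"])
```

Because the whole list is rewritten on every change, two writers updating the same profile at the same time can overwrite each other, which is the cost of keeping everything in one column.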

Server side scripting support in Cassandra - go Python !

2012-03-26 Thread Data Craftsman
Howdy, Some polyglot persistence (NoSQL) products have started to support server-side scripting, similar to RDBMS stored procedures, e.g. Redis Lua scripting. I hope it will be Python when Cassandra gets a server-side scripting feature. FYI, http://antirez.com/post/250

Re: Performance overhead when using start and end columns

2012-03-26 Thread Mohit Anchlia
Thanks! On Mon, Mar 26, 2012 at 10:53 AM, aaron morton aa...@thelastpickle.com wrote: See the tests in the article. The code I used for profiling is also available. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On

Re: CQL Reversed and Comparator reversed=true

2012-03-26 Thread Praveen Baratam
Thank you Aaron! On Mon, Mar 26, 2012 at 10:44 PM, aaron morton aa...@thelastpickle.com wrote: create column family Comments with comparator = 'CompositeType(UTF8Type(reversed=True), UTF8Type)' and key_validation_class = 'UTF8Type' and default_validation_class = 'UTF8Type';
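As a rough illustration of the ordering such a comparator is meant to give, assuming column names are (first, second) pairs of strings: the first component sorts descending, the second ascending. The sample names are made up, and this is plain Python sorting, not Cassandra's comparator code.

```python
# Made-up composite column names: (first component, second component).
columns = [("2012-03-25", "b"), ("2012-03-26", "b"), ("2012-03-26", "a")]

# Python's sort is stable, so sort by the secondary key first (ascending),
# then by the primary key in reverse (descending).
ordered = sorted(columns, key=lambda c: c[1])
ordered = sorted(ordered, key=lambda c: c[0], reverse=True)
print(ordered)
```

With this ordering, reading the first N columns of a row returns the entries with the largest first component without asking for a reversed slice.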

multi region EC2

2012-03-26 Thread Deno Vichas
all, we're just about ready to push our app live and just have some cassandra tuning left. i've been currently running a 4 node (rep factor 3, simple) in EC2 using the datastax AMIs (thanks datastax). so after reading through a bunch of docs i have a few questions. - what is the min and

copy data for dev

2012-03-26 Thread Deno Vichas
all, is there an easy way to take a 4 node snapshot and restore it on my single node dev cluster? thanks, deno

what other ports than 7199 need to be open for nodetool to work?

2012-03-26 Thread Yiming Sun
Hi, We opened port 7199 on a cassandra node, but were unable to get a nodetool to talk to it remotely unless we turn off the firewall entirely. So what other ports should be opened for this -- online posts all indicate that JMX uses a random dynamic port, which would be difficult to create a

Re: what other ports than 7199 need to be open for nodetool to work?

2012-03-26 Thread Nick Bailey
You are correct about the second random dynamic port. There is a ticket open to fix that as well as some other jmx issues: https://issues.apache.org/jira/browse/CASSANDRA-2967 Regarding nodetool, it doesn't do anything special. Nodetool is often used to connect to 'localhost' which generally

Re: what other ports than 7199 need to be open for nodetool to work?

2012-03-26 Thread Yiming Sun
Thanks Nick -- I didn't know about this ticket. Good to know. Yes, nodetool doesn't do anything special - but I still wish I could use nodetool to examine other nodes, instead of having to ssh to other nodes first and then nodetool each one (i am lazy :-). -- Y. On Mon, Mar 26, 2012 at 3:50

Re: How to store a list of values?

2012-03-26 Thread R. Verlangen
but any schema change will break it How do you mean? You don't have to specify the columns in Cassandra so it should work perfectly. Except that the skill~ prefix is reserved for your list. 2012/3/26 samal samalgo...@gmail.com Save the skills in a single column in json format. Job done. Good if

Re: Performance overhead when using start and end columns

2012-03-26 Thread R. Verlangen
@Aaron: Very interesting article! Mentioned it on my Dutch blog. 2012/3/26 Mohit Anchlia mohitanch...@gmail.com Thanks! On Mon, Mar 26, 2012 at 10:53 AM, aaron morton aa...@thelastpickle.com wrote: See the tests in the article. The code I used for profiling is also available. Cheers

Re: what other ports than 7199 need to be open for nodetool to work?

2012-03-26 Thread Edward Capriolo
I have documented some of the things you can do to make the random port nature of JMX happy. http://www.jointhegrid.com/highperfcassandra/?p=140 Other options are setting up mx4j or using jmxterm, or setting up a SOCKS proxy and telling jconsole to use your proxy. Also there is the xwindows over

Re: multi region EC2

2012-03-26 Thread aaron morton
(rep factor 3, simple) if this means you are using the SimpleStrategy I would recommend using the NetworkTopologyStrategy. - what is the min and recommended number of nodes to use in multiple region cluster. we only have a single app server right now. It depends on how exciting you want

Schema advice/help

2012-03-26 Thread Ertio Lew
I need to store activities by each user, on 5 items types. I always want to read last 10 activities on each item type, by a user (ie, total activities to read at a time =50). I am wanting to store these activities in a single row for each user so that they can be retrieved in single row query,

cassandra 1.08 on java7 and win7

2012-03-26 Thread Frank Hsueh
I think I have the cassandra server started. In another window: cassandra-cli.bat -h localhost -p 9160 Starting Cassandra Client Connected to: Test Cluster on localhost/9160 Welcome to Cassandra CLI version 1.0.8 Type 'help;' or '?' for help. Type 'quit;' or 'exit;' to quit. [default@unknown]

Re: cassandra 1.08 on java7 and win7

2012-03-26 Thread R. Verlangen
Ben Coverston wrote earlier today: Use a version of the Java 6 runtime, Cassandra hasn't been tested at all with the Java 7 runtime So I think that might be a good way to start. 2012/3/26 Frank Hsueh frank.hs...@gmail.com I think I have cassandra the server started In another window:

Re: cassandra 1.08 on java7 and win7

2012-03-26 Thread Sasha Dolgy
interesting. that behaviour _does_ happen in 1.0.8, but doesn't in 1.0.6 on windows 7 with Java 7. looks to be a problem with the CLI and not the actual Cassandra service. just tried it now. -sd On Mon, Mar 26, 2012 at 11:29 PM, R. Verlangen ro...@us2.nl wrote: Ben Coverston wrote earlier

Re: cassandra 1.08 on java7 and win7

2012-03-26 Thread Frank Hsueh
I'm using the latest of Java 1.6 from Oracle. On Mon, Mar 26, 2012 at 2:29 PM, R. Verlangen ro...@us2.nl wrote: Ben Coverston wrote earlier today: Use a version of the Java 6 runtime, Cassandra hasn't been tested at all with the Java 7 runtime So I think that might be a good way to start.

Re: cassandra 1.08 on java7 and win7

2012-03-26 Thread Frank Hsueh
err ... same thing happens with Java 1.6 On Mon, Mar 26, 2012 at 2:35 PM, Frank Hsueh frank.hs...@gmail.com wrote: I'm using the latest of Java 1.6 from Oracle. On Mon, Mar 26, 2012 at 2:29 PM, R. Verlangen ro...@us2.nl wrote: Ben Coverston wrote earlier today: Use a version of the

Re: cassandra 1.08 on java7 and win7

2012-03-26 Thread Sasha Dolgy
best to open an issue: https://issues.apache.org/jira/browse/CASSANDRA On Mon, Mar 26, 2012 at 11:35 PM, Frank Hsueh frank.hs...@gmail.com wrote: err ... same thing happens with Java 1.6 On Mon, Mar 26, 2012 at 2:35 PM, Frank Hsueh frank.hs...@gmail.com wrote: I'm using the latest of

The cassandra gem on rubygems.org needs to be updated

2012-03-26 Thread Ilya Maykov
If you're not a maintainer of the cassandra gem on rubygems.org, you can stop reading. And if you are ... I'm just bringing this up to your attention: https://github.com/twitter/cassandra/issues/142 Thanks! -- Ilya

Re: cassandra 1.08 on java7 and win7

2012-03-26 Thread Frank Hsueh
create keyspace via cassandra cli fails https://issues.apache.org/jira/browse/CASSANDRA-4085 On Mon, Mar 26, 2012 at 2:44 PM, Sasha Dolgy sdo...@gmail.com wrote: best to open an issue: https://issues.apache.org/jira/browse/CASSANDRA On Mon, Mar 26, 2012 at 11:35 PM, Frank Hsueh

Re: multi region EC2

2012-03-26 Thread Deno Vichas
On 3/26/2012 2:15 PM, aaron morton wrote: - can i migrate the replication strategy one node at a time or do i need to shut down the whole cluster to do this? Just use the NTS from the start. but what if i already have a bunch (8g per node) of data that i need and i don't have a way to re-create

Internal error processing get_slice (NullPointerException)

2012-03-26 Thread John Laban
Has anyone seen this particular NPE before from Cassandra? This is on 1.0.8. It seems to happen transiently on multiple nodes in my cluster, every so often, and goes away. ERROR [Thrift:45] 2012-03-26 19:59:12,024 Cassandra.java (line 3041) Internal error processing get_slice

Re: Fwd: information on cassandra

2012-03-26 Thread Maki Watanabe
auto_bootstrap has been removed from cassandra.yaml and always enabled since 1.0. fyi. maki 2012/3/26 R. Verlangen ro...@us2.nl: Yes, you can add nodes to a running cluster. It's very simple: configure the cluster name and seed node(s) in cassandra.yaml, set auto_bootstrap to true and start

Re: unbalanced ring

2012-03-26 Thread Maki Watanabe
What version are you using? Anyway try nodetool repair compact. maki 2012/3/26 Tamar Fraenkel ta...@tok-media.com Hi! I created Amazon ring using datastax image and started filling the db. The cluster seems un-balanced. nodetool ring returns: Address DC Rack

Re: Error in FAQ?

2012-03-26 Thread Ben McCann
I updated the FAQ to the best of my ability: http://wiki.apache.org/cassandra/FAQ#modify_cf_config On Mon, Mar 26, 2012 at 12:25 AM, R. Verlangen ro...@us2.nl wrote: If you want to modify a column family, just open the command line interface (cassandra-cli), connect to a node (probably:

Re: How to store a list of values?

2012-03-26 Thread samal
On Tue, Mar 27, 2012 at 1:47 AM, R. Verlangen ro...@us2.nl wrote: but any schema change will break it How do you mean? You don't have to specify the columns in Cassandra so it should work perfectly. Except that the skill~ prefix is reserved for your list. In case skill~ is decided to change to