Re: Error messages after rolling updating cassandra from 0.7.0 to 0.7.2

2011-04-08 Thread Kazuo YAGI
That's it! Nagios causes this message. I should have noticed it when I found the message appeared exactly every 5 minutes.

define command{
    command_name check_cassandra_node
    command_line $USER1$/check_tcp -H $HOSTADDRESS$ -p 7000 -t 5 -E -M crit
}

The issue is settled now. Thank

selecting random columns ..

2011-04-08 Thread Sasha Dolgy
hi all, is there a way to select random columns from a key? -- Sasha Dolgy sasha.do...@gmail.com

Re: Minor Follow-up: reduced cached mem; resident set size growth

2011-04-08 Thread Chris Burroughs
On 04/05/2011 03:04 PM, Chris Burroughs wrote: I have gc logs if anyone is interested. This is from a node with standard io, jna enabled, but limits were not set for mlockall to succeed. One can see -/+ buffers/cache free shrinking and the C* pid's RSS growing. Includes several days of: gc

Is the repair still going on or did it fail because of exceptions?

2011-04-08 Thread Jonathan Colby
It seems there are a few unserializable rows on my cluster. I'm trying to run a repair on the nodes, but it also seems that the replica nodes have unreadable or unserializable rows. The problem is, I cannot determine if the repair is still going on, or if it was interrupted because of these

ballpark low cardinality range for secondary indexes

2011-04-08 Thread Adi
I am trying to decide whether to use secondary indexes or an inverted index column family for a use case. Is there any suggested ballpark range for low cardinality for which secondary indexes are suitable? Meaning, at what range should using a secondary index be ruled in or out: cardinality of

Re: Is the repair still going on or did it fail because of exceptions?

2011-04-08 Thread Sylvain Lebresne
Sadly, repair isn't very resilient to errors and has failed. There are a few tickets open to improve this and repair in general, but right now, if any problem occurs during a repair, it will fail (and nodetool repair won't return, so you could just ctrl-c). Provided you're on a recent enough

Pyramid Organization of Data

2011-04-08 Thread Patrick Julien
We have a pilot project running where all our historical data worldwide would be stored using cassandra. So far, we have been successful at getting the write and read throughput we need, in fact coming in more than 27% over our needed capacity and well beyond what we were able to achieve with mysql,

auto_bootstrap

2011-04-08 Thread mcasandra
in yaml: # Set to true to make new [non-seed] nodes automatically migrate data # to themselves from the pre-existing nodes in the cluster. Why only non-seed nodes? What if seed nodes need to bootstrap? -- View this message in context:

Re: selecting random columns ..

2011-04-08 Thread Edward Capriolo
On Fri, Apr 8, 2011 at 4:48 AM, Sasha Dolgy sdo...@gmail.com wrote: hi all, is there a way to select random columns from a key? -- Sasha Dolgy sasha.do...@gmail.com getRangeSlice with random column start key.
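
The suggestion above (slice the row starting from a random column name) can be sketched without any client library. This is only an in-memory illustration: the `TreeMap` stands in for one row's sorted columns, and `RandomColumnSlice`, `sliceFrom`, and `randomStart` are hypothetical names, not part of Cassandra or Hector. A real client would pass the random start key to its own slice call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.Random;
import java.util.TreeMap;

// In-memory stand-in for one row: column names are kept in sorted order,
// much as a comparator-ordered row keeps them. All names are hypothetical.
final class RandomColumnSlice {

    // Returns up to `count` column names, starting at the first name >= start.
    static List<String> sliceFrom(NavigableMap<String, String> row, String start, int count) {
        List<String> names = new ArrayList<>();
        for (String name : row.tailMap(start, true).keySet()) {
            if (names.size() == count) {
                break;
            }
            names.add(name);
        }
        return names;
    }

    // Builds a random start key of the given length from lowercase letters.
    static String randomStart(Random rng, int length) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < length; i++) {
            sb.append((char) ('a' + rng.nextInt(26)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        NavigableMap<String, String> row = new TreeMap<>();
        for (String n : new String[] {"apple", "grape", "melon", "peach", "plum"}) {
            row.put(n, "v");
        }
        String start = randomStart(new Random(), 3);
        System.out.println("start=" + start + " -> " + sliceFrom(row, start, 2));
    }
}
```

Note that the slice comes back empty when the random start sorts after the last column, so a caller would need to retry or wrap around, and the selection is only as uniform as the distribution of column names allows.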

Re: ballpark low cardinality range for secondary indexes

2011-04-08 Thread Ed Anuff
If you're just indexing on a single column value and the values have low cardinality in, say, the 10's - I'd have a wide row for each cardinal value that contained the set of keys for rows that contained that value. For higher levels of cardinality or if you're indexing on multiple columns, there
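
The wide-row layout described above (one row per indexed value, holding the keys of the rows that contain that value) can be modeled in memory as follows. This is a rough sketch under stated assumptions: `InvertedIndexSketch` and its methods are hypothetical, and the map only mimics the shape of an index column family rather than talking to Cassandra.

```java
import java.util.Collections;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical in-memory model of an inverted index column family:
// index row key = indexed value, columns = keys of rows holding that value.
final class InvertedIndexSketch {
    private final Map<String, SortedSet<String>> index = new TreeMap<>();

    // Records that the row identified by rowKey contains the given value.
    void put(String value, String rowKey) {
        index.computeIfAbsent(value, v -> new TreeSet<>()).add(rowKey);
    }

    // Returns the sorted row keys for a value, or an empty set if none exist.
    SortedSet<String> lookup(String value) {
        return index.getOrDefault(value, Collections.emptySortedSet());
    }

    public static void main(String[] args) {
        InvertedIndexSketch idx = new InvertedIndexSketch();
        idx.put("red", "row2");
        idx.put("red", "row1");
        idx.put("blue", "row3");
        System.out.println("red -> " + idx.lookup("red"));
    }
}
```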

Problem with UUID

2011-04-08 Thread Олександр Силка
Hi everyone, I have a column family called site sorted by org.apache.cassandra.db.marshal.TimeUUIDType. When I try to save some data using hector I get the following message: InvalidRequestException(why:TimeUUID should be 16 or 0 bytes (3)). My Cassandra version is 0.7.0. These are snippets of my code: public

Re: Problem with UUID

2011-04-08 Thread Ed Anuff
Hmm, if you're really doing this, you're not getting a time uuid: UUID timeUUID = getTimeUUID().randomUUID(); That call to randomUUID() is invoking the static randomUUID() method in java.util.UUID which is generating a non-time random uuid. I'm not sure why you're getting that error message
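
The difference between a random UUID and a time UUID can be checked with the JDK alone. The sketch below uses only `java.util.UUID`; the class name `UuidVersionCheck` and the sample version 1 UUID string are illustrative assumptions, not taken from the thread.

```java
import java.util.UUID;

// Hypothetical helper: distinguishes time-based (version 1) UUIDs from others.
final class UuidVersionCheck {

    // True only for version 1 (time-based) UUIDs.
    static boolean isTimeBased(UUID uuid) {
        return uuid.version() == 1;
    }

    public static void main(String[] args) {
        // java.util.UUID.randomUUID() is static and always yields a version 4 UUID.
        UUID random = UUID.randomUUID();
        System.out.println(random + " version=" + random.version()
                + " timeBased=" + isTimeBased(random));

        // An illustrative version 1 UUID, parsed from its string form.
        UUID timeBased = UUID.fromString("c9d1a8a0-6221-11e0-ae3e-0800200c9a66");
        System.out.println(timeBased + " version=" + timeBased.version()
                + " timeBased=" + isTimeBased(timeBased));
    }
}
```

Because `randomUUID()` is a static method, calling it on an existing UUID instance discards that instance and returns a fresh version 4 value.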

Atomicity Strategies

2011-04-08 Thread Alex Araujo
Hi, I was wondering if there are any patterns/best practices for creating atomic units of work when dealing with several column families and their inverted indices. For example, if I have Users and Groups column families and did something like: Users.insert( user_id, columns )

Host score calculation for dynamic_snitch

2011-04-08 Thread A J
dynamic_snitch seems to do host score calculation to figure out the latency of each node. What are the details of this calculation: 1. What is the mechanism to determine latency? 2. Does it store the calculated scores and use the historical figures to come up with the latest scores? (You can't

Re: Pyramid Organization of Data

2011-04-08 Thread Joe Stump
A few lines of Java in a partitioning or rack aware strategy might be able to achieve this. --Joe -- Typed with big fingers on a small keyboard. On Apr 8, 2011, at 13:17, Patrick Julien pjul...@gmail.com wrote: We have a pilot project running where all our historical data worldwide would

Re: ballpark low cardinality range for secondary indexes

2011-04-08 Thread Adi
Thanks for the suggestions, Ed. Your blog post is quite helpful in deciding on and implementing CF inverted indexes. Our data definitely leans towards an external CF: it has high cardinality (1000s for one column, millions for another), multiple columns need to be indexed, and it needs sorted order. Hope that

Re: ballpark low cardinality range for secondary indexes

2011-04-08 Thread Ed Anuff
Well, the amazon paper is good at describing the nature of the problem, but to solve it you'll probably want to use zookeeper. The paper is useful in understanding exactly what you need to lock on and what you don't while updating the index, so you can avoid slowing things down any more than is

Re: Problem with UUID

2011-04-08 Thread Олександр Силка
Then how can I generate a correct time UUID key in Java? On 8 April 2011 at 22:58, Ed Anuff e...@anuff.com wrote: Hmm, if you're really doing this, you're not getting a time uuid: UUID timeUUID = getTimeUUID().randomUUID(); That call to randomUUID() is invoking the static randomUUID() method

Re: Problem with UUID

2011-04-08 Thread Patrick Julien
I think this is what you're looking for: http://wiki.apache.org/cassandra/FAQ#working_with_timeuuid_in_java 2011/4/8 Олександр Силка sylkaa...@gmail.com: Then how i can generate correct time UUID key in java ? On 8 April 2011 at 22:58, Ed Anuff e...@anuff.com wrote: Hmm, if you're really

Re: Problem with UUID

2011-04-08 Thread Ed Anuff
Oops, I should have been more clear. You have this code: UUID timeUUID = getTimeUUID().randomUUID(); what you need is this code: UUID timeUUID = getTimeUUID(); What I meant by not understanding the error message was that I thought the TimeUUIDType gave a different error message than the one

Re: Problem with UUID

2011-04-08 Thread Олександр Силка
Thanks for trying to help me, but I still get the error message InvalidRequestException(why:TimeUUID should be 16 or 0 bytes (3)). This code, UUID timeUUID = getTimeUUID();, doesn't solve my problem. On 9 April 2011 at 01:16, Ed Anuff e...@anuff.com wrote: Oops, I should have been more clear. You

Re: Pyramid Organization of Data

2011-04-08 Thread Jonathan Ellis
On Fri, Apr 8, 2011 at 12:17 PM, Patrick Julien pjul...@gmail.com wrote: The problem is this: we would like the historical data from Tokyo to stay in Tokyo and only be replicated to New York.  The one in London to be in London and only be replicated to New York and so on for all data centers.

Re: Problem with UUID

2011-04-08 Thread Олександр Силка
I tried to use the method getUniqueTimeUUIDinMillis from https://github.com/rantav/hector/blob/master/core/src/main/java/me/prettyprint/cassandra/utils/TimeUUIDUtils.java but I still get the same result: InvalidRequestException(why:TimeUUID should be 16 or 0 bytes (3)). On 9 April 2011 at 01:32, Олександр

Re: Problem with UUID

2011-04-08 Thread Ed Anuff
I think the problem is this then: mutator.addInsertion(timeUUID, columnFamilyName, column); I'm not sure what you're doing here, but you're using your timeUUID as the row key, not the column name. I don't see you actually assigning the column name so I don't know what you're putting
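
The error text says the comparator expects a column name of 16 or 0 bytes, and the point above is that the column name, not the row key, is what gets validated. The sketch below shows, with the JDK only, that a UUID serializes to 16 bytes while a short string name does not. `UuidBytes` and `toBytes` are hypothetical names, the sample strings are illustrative, and reading the "(3)" in the message as the length actually received is an assumption.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.UUID;

// Hypothetical helper showing the byte length of a UUID versus a string name.
final class UuidBytes {

    // Serializes a UUID as 16 bytes: most significant long, then least significant.
    static byte[] toBytes(UUID uuid) {
        return ByteBuffer.allocate(16)
                .putLong(uuid.getMostSignificantBits())
                .putLong(uuid.getLeastSignificantBits())
                .array();
    }

    public static void main(String[] args) {
        UUID uuid = UUID.fromString("c9d1a8a0-6221-11e0-ae3e-0800200c9a66");
        byte[] stringName = "abc".getBytes(StandardCharsets.UTF_8);
        System.out.println("UUID column name length:   " + toBytes(uuid).length);
        System.out.println("String column name length: " + stringName.length);
    }
}
```

Under that reading, a column family sorted by TimeUUIDType would need the UUID passed as the column name, with a UUID-aware serializer, rather than a string name.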

Re: Atomicity Strategies

2011-04-08 Thread Drew Kutcharian
I'm interested in this too, but I don't think this can be done with Cassandra alone. Cassandra doesn't support transactions. I think hector can retry operations, but I'm not sure about the atomicity of the whole thing. On Apr 8, 2011, at 1:26 PM, Alex Araujo wrote: Hi, I was wondering if

Re: Pyramid Organization of Data

2011-04-08 Thread Patrick Julien
I'm familiar with this material. I hadn't thought of it from this angle but I believe what you're suggesting is that the different data centers would hold a different properties file for node discovery instead of using auto-discovery. So Tokyo, and others, would have a configuration that make it

CF config for Stress Test

2011-04-08 Thread mcasandra
I am starting a stress test using hector on 6 node machine 4GB heap and 12 core. In the hector readme this is what I got by default: create keyspace StressKeyspace with replication_factor = 3 and placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy'; use StressKeyspace; drop

Re: Problem with UUID

2011-04-08 Thread Олександр Силка
Does that mean that with this configuration I must use only UUIDs for column values? I really don't understand how it works. I changed my code a little: UUID timeUUID = DaoHelper.getTimeUUID(); HColumn<String, String> column = HFactory.createColumn(name, Alex, StringSerializer.get(), StringSerializer.get());

Re: Pyramid Organization of Data

2011-04-08 Thread Jonathan Ellis
No, I'm suggesting you have a Tokyo keyspace that gets replicated as {Tokyo: 2, NYC:1}, a London keyspace that gets replicated to {London: 2, NYC: 1}, for example. On Fri, Apr 8, 2011 at 5:59 PM, Patrick Julien pjul...@gmail.com wrote: I'm familiar with this material.  I hadn't thought of it
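
The per-keyspace layout suggested above can be written down as plain data to see which data center ends up holding what. This is only a model of the replica counts quoted in the message; `ReplicationPlan`, `plan`, and `keyspacesIn` are hypothetical names and nothing here configures Cassandra.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model: keyspace name -> (data center -> replica count).
final class ReplicationPlan {

    // The two example keyspaces from the message: {Tokyo: 2, NYC: 1} and {London: 2, NYC: 1}.
    static Map<String, Map<String, Integer>> plan() {
        Map<String, Map<String, Integer>> plan = new LinkedHashMap<>();
        plan.put("Tokyo", replicas("Tokyo", 2, "NYC", 1));
        plan.put("London", replicas("London", 2, "NYC", 1));
        return plan;
    }

    private static Map<String, Integer> replicas(String dc1, int n1, String dc2, int n2) {
        Map<String, Integer> m = new LinkedHashMap<>();
        m.put(dc1, n1);
        m.put(dc2, n2);
        return m;
    }

    // Lists the keyspaces that place at least one replica in the given data center.
    static Set<String> keyspacesIn(String dc, Map<String, Map<String, Integer>> plan) {
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, Map<String, Integer>> e : plan.entrySet()) {
            if (e.getValue().getOrDefault(dc, 0) > 0) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Map<String, Integer>> plan = plan();
        for (String dc : new String[] {"Tokyo", "London", "NYC"}) {
            System.out.println(dc + " holds " + keyspacesIn(dc, plan));
        }
    }
}
```

In this model New York holds a replica of every keyspace, while Tokyo and London each hold only their own.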

Error connection to remote JMX agent

2011-04-08 Thread Mengchen Yu
I use version 0.7.4. I've done something like this: 1. I ssh to my Eucalyptus account and applied for 1 instance, got a public IP and an internal IP. 2. I scp the tarball apache-cassandra-0.7.4-bin.tar.gz to the root of my instance, unzip it and create directories. 3. I run /bin/cassandra -f, everything

What need to be monitored while running stress test

2011-04-08 Thread mcasandra
What are the key things to monitor while running a stress test? There are tons of details in nodetool tpstats/netstats/cfstats. What in particular should I be looking at? Also, I've been looking at iostat and await goes really high, but cfstats shows low latency in microsecs. Is latency in cfstats

Re: Pyramid Organization of Data

2011-04-08 Thread Patrick Julien
thank you, I get it now. On Fri, Apr 8, 2011 at 7:15 PM, Jonathan Ellis jbel...@gmail.com wrote: No, I'm suggesting you have a Tokyo keyspace that gets replicated as {Tokyo: 2, NYC:1}, a London keyspace that gets replicated to {London: 2, NYC: 1}, for example. On Fri, Apr 8, 2011 at 5:59 PM,

Re: Atomicity Strategies

2011-04-08 Thread Alex Araujo
On 4/8/11 5:46 PM, Drew Kutcharian wrote: I'm interested in this too, but I don't think this can be done with Cassandra alone. Cassandra doesn't support transactions. I think hector can retry operations, but I'm not sure about the atomicity of the whole thing. On Apr 8, 2011, at 1:26 PM,

Re: nodetool move hammers the next node in the ring

2011-04-08 Thread aaron morton
My brain just started working. The streaming for the move may need to be throttled, but once the file has been received the bloom filters, row indexes and secondary indexes are built. That will also take some effort, do you have any secondary indexes? If you are doing a move again could you

Re: nodetool move hammers the next node in the ring

2011-04-08 Thread Chris Goffinet
We also have a ticket open at https://issues.apache.org/jira/browse/CASSANDRA-2399 We have observed in production the impact of streaming data to new nodes being added. We actually have our entire dataset in page cache in one of our clusters, our 99th percentiles go from 20ms to 1 second on

Re: Atomicity Strategies

2011-04-08 Thread Dan Washusen
Here's a good writeup on how fightmymonster.com does it... http://ria101.wordpress.com/category/nosql-databases/locking/ -- Dan Washusen Make big files fly visit digitalpigeon.com On Saturday, 9 April 2011 at 11:53 AM, Alex Araujo wrote: On 4/8/11 5:46 PM, Drew Kutcharian wrote: I'm

Re: Columns values(integer) need frequent updates/ increments

2011-04-08 Thread aaron morton
A lot depends on your definition of frequently. Also when a column is updated in the memtable the previous column is replaced, so when the memtable is flushed to disk as an SSTable only one copy of the column is stored. If you have a situation where a lot of columns are overwritten setting a