...@gmail.com]
Sent: Wednesday, May 05, 2010 7:04 PM
To: user@cassandra.apache.org
Subject: Re: performance tuning - where does the slowness come from?
On Wed, May 5, 2010 at 6:59 PM, Mark Jones
mjo...@imagehawk.com wrote:
My data is single row/key to a 500 byte column and I'm
The max size would probably be best determined by looking at the size of your
MemTable
!--
~ Flush memtable after this much data has been inserted, including
~ overwritten data. There is one memtable per column family, and
~ this threshold is based solely on the amount of data
One of your problems here is that connect uses a daft connection string
convention.
You would think node:port, but it's actually node/port.
Your connection only succeeded because 9160 is the default when no port is
specified.
And the keyspace thing that jbellis mentioned.
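A minimal sketch of the node/port convention described above, assuming nothing about the real client beyond what this message says (a "/" separator, and 9160 used when no port is given). The helper name parse_endpoint and the constant kDefaultPort are made up for illustration and are not part of any Cassandra tool:

```cpp
#include <cassert>
#include <string>
#include <utility>

// Default port used when none is specified, as stated in the message above.
constexpr int kDefaultPort = 9160;

// Hypothetical helper: split "node/port" into (host, port).
// A string without '/' (or with nothing after it) falls back to kDefaultPort.
// std::stoi throws std::invalid_argument if the port part is not a number.
std::pair<std::string, int> parse_endpoint(const std::string& text) {
    assert(!text.empty());
    const std::string::size_type slash = text.find('/');
    if (slash == std::string::npos) {
        return {text, kDefaultPort};  // no port given at all
    }
    const std::string port_part = text.substr(slash + 1);
    const int port = port_part.empty() ? kDefaultPort : std::stoi(port_part);
    return {text.substr(0, slash), port};
}
```

With this sketch, parse_endpoint("localhost/9170") gives port 9170, while parse_endpoint("localhost") falls back to 9160.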
-Original Message-
At the moment they all have to fit in memory during compaction. Columns OR
SuperColumns (for one Key).
From: Andrew Nguyen [mailto:andrew-lists-cassan...@ucsfcti.org]
Sent: Thursday, April 29, 2010 10:30 AM
To: user@cassandra.apache.org
Subject: Re: Cassandra data model for financial data
What
Sounds like you want something like http://oss.oetiker.ch/rrdtool/
Assuming you are trying to store computer log data.
Do you have any other data that can spread the data load? Like a machine name?
If so, you can use a hash of that value to place that machine randomly on
the net, then
MD5 is not a perfect hash, it can produce collisions, how are these dealt with?
Is there a size appended to them?
If 2 keys collide, would that result in a merging of data (if the column names
aren't the same) or an overwrite if they were?
Orthogonal in this case means at cross purposes. Transactions can't really be
done with eventual consistency because all nodes don't have all the info at the
time the transaction is done. I think they recommend zookeeper for this kind
of stuff, but I don't know why you want to use Cassandra vs
How is this specified?
Is it a large hex #?
A string of bytes in hex?
http://wiki.apache.org/cassandra/StorageConfiguration doesn't say.
Ellis [mailto:jbel...@gmail.com]
Sent: Friday, April 23, 2010 10:22 AM
To: user@cassandra.apache.org
Subject: Re: org.apache.cassandra.dht.OrderPreservingPartitioner Initial Token
a normal String from the same universe as your keys.
On Fri, Apr 23, 2010 at 7:23 AM, Mark Jones mjo...@imagehawk.com
Turns out assign can be called with the length as well
So mod your code to be
new_col.column.assign((char *)uuid, 16);
and you are fixed.
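A small sketch of why the length argument matters, assuming the column name is a std::string (as in Thrift-generated C++ code) and the UUID is 16 raw bytes. The function names here are invented for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Length of a raw (binary) UUID in bytes.
constexpr std::size_t kUuidBytes = 16;

// Copy all 16 bytes of a raw UUID into a std::string.
// The explicit length keeps any embedded zero bytes.
std::string uuid_to_column_name(const unsigned char* uuid) {
    assert(uuid != nullptr);
    std::string name;
    name.assign(reinterpret_cast<const char*>(uuid), kUuidBytes);
    return name;
}

// For comparison only: the C-string overload stops at the first 0x00 byte.
// It is only safe to call when the buffer actually contains a zero byte;
// otherwise it reads past the end of the 16-byte buffer.
std::string uuid_to_column_name_truncating(const unsigned char* uuid) {
    assert(uuid != nullptr);
    std::string name;
    name.assign(reinterpret_cast<const char*>(uuid));
    return name;
}
```

For a UUID whose fourth byte is 0x00, the first function returns a 16-byte string and the second returns only the first 3 bytes, which is the kind of silent truncation the two-argument assign avoids.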
-Original Message-
From: Mark Jones [mailto:mjo...@imagehawk.com]
Sent: Friday, April 23, 2010 10:52 AM
To: user@cassandra.apache.org
Subject: RE
Eliminating GC hell would probably do a lot to help Cassandra maintain speed vs
periods of superfast/superslow performance. I look forward to hearing how this
experiment goes.
From: Eric Hauser [mailto:ewhau...@gmail.com]
Sent: Friday, April 23, 2010 3:37 PM
To: user@cassandra.apache.org
Stop the program, wipe the data dir and commit logs, start the program, it's
what I'm doing.
I even made a script that will do it so it's just a one line command.
From: ROGER PUIG GANZA [mailto:rp...@tid.es]
Sent: Wednesday, April 21, 2010 5:20 AM
To: cassandra-u...@incubator.apache.org
I'm seeing a cluster of 4 (replication factor=2) to be about as slow overall
as, or barely faster than, the slowest node in the group. When I run the 4 nodes
individually, I see:
For inserts:
Two nodes @ 12000/second
1 node @ 9000/second
1 node @ 7000/second
For reads:
Abysmal, less than
I too am seeing very slow performance while testing worst case scenarios of 1
key leading to 1 supercolumn and 1 column beyond that.
Key - SuperColumn - 1 Column (of ~ 500 bytes)
Drive utilization is 80-90% and I'm only dealing with 50-70 million rows.
(With NO swapping) So far, I've found
the subcolumns in that supercolumn
http://wiki.apache.org/cassandra/CassandraLimitations
On Tue, Apr 20, 2010 at 9:50 AM, Mark Jones mjo...@imagehawk.com wrote:
I too am seeing very slow performance while testing worst case scenarios of
1 key leading to 1 supercolumn and 1 column beyond
at 11:08 AM, Mark Jones mjo...@imagehawk.com wrote:
When I first read this, it bothered me because it seemed like it couldn't be
so. So I read the link, and it says the whole thing, so I have to ask for
some clarification here.
I had always assumed a super column was similar to a local
email per row, and another CF for
UserEmails with per-user index rows referencing the Emails rows.
b
On Tue, Apr 20, 2010 at 9:44 AM, Mark Jones mjo...@imagehawk.com wrote:
To make sure I'm clear on what you are saying:
Are the Individual Emails in the example below, Supercolumns
I'm seeing some issues like this as well, in fact, I think seeing your graphs
has helped me understand the dynamics of my cluster better.
Using some ballpark figures for inserting single column objects of ~500 bytes
onto individual nodes (not when combined as a cluster):
Node1: Inserts 12000/s
I don't see any way to increase the # of active Deserializers in
storage-conf.xml
Tpstats more than 8 hours after insert/read stop
Pool Name                    Active   Pending      Completed
FILEUTILS-DELETE-POOL             0         0            227
STREAM-STAGE                      0