A.) Store ALL the data associated with the user under a single user row key.
Some user keys may be small, others may get larger over time depending upon
activity.
I would go with this.
The important thing is supporting the read queries.
Cheers
Aaron
-
Aaron Morton
Freelance
The on-disk layout is described here; I'm not sure how accurate it is nowadays.
http://wiki.apache.org/cassandra/ArchitectureSSTable
There are multiple files involved; this will give you an idea of the read and
write paths: http://thelastpickle.com/2011/04/28/Forces-of-Write-and-Read/
Hope that helps.
For querying purposes it would be better to use readable strings because
you can really get information out of that.
TimeUUID is just a unique value based on time; but not only the time.
2012/2/28 Tamar Fraenkel ta...@tok-media.com
Hi!
I have a column family where I use rows as time buckets.
Have you tried lowering the batch size and increasing the time out? Even just
to get it to work.
If you get a TimedOutException it means the number of servers required by the
consistency level (CL) did not respond in time.
Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
On
not a great deal of difference, personally I would stick with seconds since
epoch (it is probably slightly faster).
Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
On 28/02/2012, at 7:24 PM, Tamar Fraenkel wrote:
Hi!
I have a column
Hi All,
We are running a 3-node cluster with Cassandra 0.6.13.
We are in the process of upgrading to 1.x, but can't do so for a while
because we can't take the cluster offline.
Until now 0.6.13 has run without problems, but lately we are getting
some performance issues.
We are getting
* Does the column name get stored for every col/val for every key
(which sort of worries me for long column names)
Yes, the column name is stored with each value for every key, but it may
not matter if you switch on compression, which AFAIK has only advantages
and will be the default. I
2012/2/28 Hontvári József Levente hontv...@flyordie.com
* Does the column name get stored for every col/val for every key (which
sort of worries me for long column names)
Yes, the column name is stored with each value for every key, but it may
not matter if you switch on compression,
Thanks, makes my life easier.
Tamar
On Tue, Feb 28, 2012 at 10:23 AM, aaron morton aa...@thelastpickle.com wrote:
not a great deal of difference, personally I would stick with seconds
since epoch (it is probably slightly faster).
Cheers
-
Aaron Morton
Freelance Developer
I'll alter these settings and will let you know.
Regards,
P.
On Tue, Feb 28, 2012 at 09:23, aaron morton aa...@thelastpickle.com wrote:
Have you tried lowering the batch size and increasing the time out? Even
just to get it to work.
If you get a TimedOutException it means CL number of
When writing to Cassandra (v 1.0.7) I'm seeing occasional delays of up to 4
seconds. Below is from the system.log where we are seeing the delays. Is
this a result of GC, and is it worth tuning these settings in order to
fix it? If so, any suggestions? Adjusting memtable_total_space_in_mb?
*DEBUG
I have a small ring of two nodes running successfully on AWS.
In order to understand Cassandra's support for NAT, I have tried to add another
node outside AWS on a machine behind NAT.
When I try to join the ring, there is a 30s pause after starting the
messaging service and then it fails, unable to
Hello.
Are there any messages about GC earlier in the logs? The Cassandra server
monitors memory and starts complaining in advance if memory is filling up.
Any chance you've got a full-key, delete-only scenario for some column
families? Cassandra has a bug where it is unable to flush such memtables.
I've filled a
In a multi server env, to avoid key collisions timeuuid may be the better
choice.
On Monday, February 27, 2012, Tamar Fraenkel wrote:
Hi!
I have a column family where I use rows as time buckets.
What I do is take epoch time in seconds, and round it to 1 hour (taking the
result of
Given that these rows are meant to be time buckets, you would want
collisions; in fact, that would be the standard way of working. So,
IMO, the UUID just removes the ability to bucket data and would not
be wanted.
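
A minimal sketch of the bucketing idea discussed in this thread, in Python.
It assumes "round it to 1 hour" means rounding the epoch timestamp down to the
start of its hour; the function name and the constant are made up for
illustration and do not come from the original messages.

```python
BUCKET_SECONDS = 3600  # one-hour buckets, as described in the question


def bucket_row_key(epoch_seconds: int, bucket_seconds: int = BUCKET_SECONDS) -> int:
    """Round an epoch timestamp (in seconds) down to the start of its bucket.

    Assumption: rounding is done downwards, so every event inside the same
    hour maps to the same row key.
    """
    return epoch_seconds - (epoch_seconds % bucket_seconds)


# Events in the same hour collide on purpose; the next hour gets a new key.
start_of_hour = 1330416000               # a timestamp that is a multiple of 3600
same_hour = bucket_row_key(start_of_hour + 1234)
next_hour = bucket_row_key(start_of_hour + 3600)
print(same_hour, next_hour)
```

The deliberate collision is the point: all writes for a given hour land in one
row, which a TimeUUID row key would prevent.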
On 02/28/2012 10:30 AM, Paul Loy wrote:
I have not found any examples of utilizing a CompositeType or
DynamicCompositeType as a row key. Is doing this frowned upon? All the
examples I've seen have been using a CompositeType only for Column names
(or values).
My particular use case involves having the two components in the key being
a
Phil,
That's the problem with examples :)
Row keys can be composite values. That works just fine. Was there something
in particular you were trying to do?
- Chris
Chris Gerken
chrisger...@mindspring.com
512.587.5261
http://www.linkedin.com/in/chgerken
On Feb 28, 2012, at 10:25 AM,
Hi Stefan. Can you share the output of nodetool cfstats?
On Tue, Feb 28, 2012 at 1:50 AM, Stefan Reek ste...@unitedgames.com wrote:
Hi All,
We are running a 3-node cluster with Cassandra 0.6.13.
We are in the process of upgrading to 1.x, but can't do so for a while
because we can't take the
@Aaron: Are you suggesting 3 nodes (rather than 2) to allow quorum
operations even with the temporary loss of 1 node from the cluster's reach? I
understand this, but another question just popped up in my mind, probably
since I'm not very experienced in managing Cassandra, so I'm unaware whether
it may be
When I migrated data from our RDBMS, I hashed column names to integers.
This makes for some footwork, but the space gain is clearly there, so it's
worth it. I de-hash on read.
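
One way the approach above could look, as a rough Python sketch. The message
does not say how the hashing was done, so this assumes a fixed lookup table
that assigns a small integer to each column name; the column names and helper
functions are hypothetical.

```python
# Hypothetical mapping from long column names to short integer IDs.
COLUMN_IDS = {"first_name": 1, "last_name": 2, "email_address": 3}
# Reverse mapping, used to "de-hash" on read.
COLUMN_NAMES = {col_id: name for name, col_id in COLUMN_IDS.items()}


def encode_row(row: dict) -> dict:
    """Replace readable column names with their integer IDs before writing."""
    return {COLUMN_IDS[name]: value for name, value in row.items()}


def decode_row(stored: dict) -> dict:
    """Restore readable column names after reading."""
    return {COLUMN_NAMES[col_id]: value for col_id, value in stored.items()}


stored = encode_row({"first_name": "Ada", "email_address": "ada@example.org"})
print(stored)
print(decode_row(stored))
```

The mapping has to be kept stable and available to every reader, which is the
"footwork" traded for the smaller column names.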
Maxim
On 2/10/2012 5:15 PM, Narendra Sharma wrote:
It is good to have short column names. They save space all the
Thank you Aaron and others.
That helped and we were able to limit the commitlog disk usage.
We will be doing some tests by changing the memtable_total_space_in_mb
param and see how that goes.
On Mon, Feb 27, 2012 at 12:51 PM, aaron morton aa...@thelastpickle.com wrote:
yes, reducing
If you have 3 nodes with RF=3, you can keep the service running on Cassandra
even if one of the nodes fails (through hardware or software failure).
Another benefit is that you can shut down one node for maintenance or patching
without service interruption.
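
The arithmetic behind this can be sketched in Python, assuming reads and
writes use CL=QUORUM, where a quorum is floor(RF / 2) + 1 replicas. The
function names are made up for illustration.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that must respond at CL=QUORUM."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor: int) -> int:
    """Replicas that can be down while QUORUM operations still succeed."""
    return replication_factor - quorum(replication_factor)


for rf in (2, 3):
    print(f"RF={rf}: quorum={quorum(rf)}, tolerated failures={tolerated_failures(rf)}")
```

With RF=3 a quorum is 2, so one node can be down; with RF=2 a quorum is also 2,
so QUORUM operations need both nodes up.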
If you run your service with 2 nodes and RF=2, your data will be replicated but
your service will not be redundant. (You can't stop both nodes.)
If your service doesn't need strong consistency (i.e. it can tolerate Cassandra
returning old data after a write, and possibly losing writes), you can use CL=ONE
for
Thanks. I don't think I need high consistency (as per my app requirements),
so I might be fine with CL.ONE instead of quorum, and I think I'm probably
going to be OK with a 2-node cluster initially.
Could you guys also recommend some minimum memory to start with ? Of course
that would depend on my