If you store only the key mappings in a column family, you get custom ordering
of rows etc. for things like:
friends = {
user_id : { friendid1, friendid2, }
}
or
topForumPosts = {
forum_id1 : { post2343, post32343, post32223, ...}
}
Now on the friends page or on the top_forum_posts
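The two layouts above can be modeled as plain maps (a sketch of the access pattern only, not a client API; the keys and names are the examples from the post):

```python
# The row key points at an ordered collection of column names that are
# themselves the data; values can stay empty.
friends = {
    "user_id": ["friendid1", "friendid2"],
}
top_forum_posts = {
    "forum_id1": ["post2343", "post32343", "post32223"],
}

# Rendering the friends page is then one row lookup by key, no scan or join:
print(friends["user_id"])  # ['friendid1', 'friendid2']
```

The point is that each page render becomes a single row lookup, with the ordering baked into the column names.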
Hello,
We want to use cassandra to store and retrieve time related data. Storing
the time-value pairs is easy and works perfectly. The problem arrives at
retrieving the data. We do not only want to retrieve data from within a time
range, but also be able to get the previous and/or next data
You want to use 'reversed' in SliceRange (with a start of whatever
you want and a count of 2).
--
Sylvain
On Tue, Jun 15, 2010 at 12:01 PM, Bram van der Waaij
bramat...@gmail.com wrote:
Hello,
We want to use cassandra to store and retrieve time related data. Storing
the time-value pairs is
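The effect of a reversed slice can be simulated in a few lines (a sketch only; `slice_columns` is a stand-in for a Thrift `get_slice` with a `SliceRange`, not a real client call):

```python
# Simulated row: (timestamp, value) columns kept in sorted order, the way
# Cassandra orders columns under a time-based comparator.
columns = [(10, "a"), (20, "b"), (30, "c"), (40, "d"), (50, "e")]

def slice_columns(columns, start, count, reverse=False):
    """Mimic a SliceRange: from 'start', take 'count' columns,
    walking backwards when reverse=True."""
    if reverse:
        hits = sorted((c for c in columns if c[0] <= start),
                      key=lambda c: c[0], reverse=True)
    else:
        hits = [c for c in columns if c[0] >= start]
    return hits[:count]

# The column at or just before t=35 plus the one before it, i.e. the
# "current" and "previous" data points:
print(slice_columns(columns, 35, 2, reverse=True))  # [(30, 'c'), (20, 'b')]
```

With `reverse=False` and the same start you get the "current" and "next" points instead.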
On Mon, 14 Jun 2010 16:01:57 -0700 Anthony Molinaro
antho...@alumni.caltech.edu wrote:
AM Now I would assume that for 'production' you want to remove
AM-ea
AM and
AM-XX:+HeapDumpOnOutOfMemoryError
AM as well as adjust -Xms and -Xmx accordingly, but are there any others
AM which should
well it won't be a range, it will be random key lookups.
On Tue, Jun 15, 2010 at 8:44 AM, Gary Dusbabek gdusba...@gmail.com wrote:
On Tue, Jun 15, 2010 at 04:29, S Ahmed sahmed1...@gmail.com wrote:
If you store only the key mappings in a column family, you get custom
ordering
of rows etc. for
Perfect! Thanks :-)
2010/6/15 Sylvain Lebresne sylv...@yakaz.com
You want to use 'reversed' in SliceRange (with a start of whatever
you want and a count of 2).
--
Sylvain
On Tue, Jun 15, 2010 at 12:01 PM, Bram van der Waaij
bramat...@gmail.com wrote:
Hello,
We want to use
if you are reading 500MB per thrift request from each of 3 threads,
then yes, simple arithmetic indicates that 1GB heap is not enough.
On Mon, Jun 14, 2010 at 6:13 PM, Caribbean410 caribbean...@gmail.com wrote:
Hi,
I wrote 200k records to the db, each record 5MB. I get this error when I use
3
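The simple arithmetic in the reply above, spelled out (numbers taken from the thread):

```python
# Three threads each holding a 500 MB response need ~1.5 GB of live
# payload at once, before any other overhead.
bytes_per_request = 500 * 1024 * 1024
threads = 3
in_flight = bytes_per_request * threads          # 1,572,864,000 bytes
heap = 1 * 1024 * 1024 * 1024                    # a 1 GB heap
print(in_flight > heap)  # True
```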
Hi, I'm running Cassandra 0.6.2 on a dedicated 4 node cluster and I
also have a dedicated 4 node hadoop cluster. I'm trying to run a
simple map reduce job against a single column family and it only takes
32 map tasks before I get floods of thrift timeouts. That would make
sense to me if the
Sorry, the record size should be 5KB, not 5MB; 4KB is still OK. I will
try Benjamin's suggestion.
-Original Message-
From: Jonathan Ellis [mailto:jbel...@gmail.com]
Sent: Tuesday, June 15, 2010 8:09 AM
To: user@cassandra.apache.org
Subject: Re: java.lang.OutofMemoryerror: Java heap
You should only have to restart once per node to pick up config changes.
On Tue, Jun 15, 2010 at 9:41 AM, caribbean410 caribbean...@gmail.com wrote:
Today I retried with the 2GB heap and now it's working. No more out-of-memory
error. Looks like I have to restart Cassandra several times before the new
(moving to user@)
On Mon, Jun 14, 2010 at 10:43 PM, Masood Mortazavi
masoodmortaz...@gmail.com wrote:
Is the clearer interpretation of this statement (in
conf/datacenters.properties) given anywhere else?
# The sum of all the datacenter replication factor values should equal
# the replication
http://wiki.apache.org/cassandra/ArticlesAndPresentations might help.
On Mon, Jun 14, 2010 at 1:13 PM, Johannes Weissensel
whitesensl...@googlemail.com wrote:
Hi everyone,
I am new to NoSQL databases and especially column-oriented databases
like Cassandra.
I am a student on
I am running a 10 node cassandra 0.6.1 cluster with a replication factor of 3.
To populate the database to perform my read benchmarking, I have 8 applications
using Thrift, each connecting to a different cassandra server and writing
100,000 rows of data (100 KB each row), using a
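For context, a rough sizing of this benchmark (figures from the post; the even per-node split assumes balanced tokens, which is an assumption):

```python
# 8 writers x 100,000 rows x 100 KB, replicated 3 ways across 10 nodes.
writers = 8
rows_per_writer = 100_000
row_kb = 100
nodes = 10
rf = 3

raw_gb = writers * rows_per_writer * row_kb / 1024 / 1024   # ~76 GB raw
stored_gb = raw_gb * rf                                     # ~229 GB after replication
per_node_gb = stored_gb / nodes                             # ~23 GB per node, pre-overhead
print(round(per_node_gb, 1))
```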
You are likely exhausting your heap space (probably still at the very
small 1G default?), and maximizing the amount of resource consumption
by using CL.ALL. Why are you using ALL?
On Tue, Jun 15, 2010 at 11:58 AM, Julie julie.su...@nextcentury.com wrote:
I am running a 10 node cassandra 0.6.1
Benjamin Black b at b3k.us writes:
You are likely exhausting your heap space (probably still at the very
small 1G default?), and maximizing the amount of resource consumption
by using CL.ALL. Why are you using ALL?
On Tue, Jun 15, 2010 at 11:58 AM, Julie julie.sugar at nextcentury.com
How are you doing your inserts?
I draw a clear line between 1) bootstrapping a cluster with data and 2)
simulating expected/projected read/write behavior.
If you are bootstrapping then I would look into the batch_mutate APIs. They
allow you to improve your performance on writes dramatically.
On Tue, Jun 15, 2010 at 1:40 PM, Julie julie.su...@nextcentury.com wrote:
Thanks for your reply. Yes, my heap space is 1G. My VMs have only 1.7G of
memory so I hesitate to use more.
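The batching idea can be illustrated without any Cassandra client at all; `batched` here is just a generic chunking helper, not the Thrift `batch_mutate` signature:

```python
# Chunk a stream of rows into batches: one network round trip per batch
# instead of one per row.
def batched(rows, batch_size):
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush the final partial batch
        yield batch

rows = [f"row{i}" for i in range(1000)]
round_trips = sum(1 for _ in batched(rows, 50))
print(round_trips)  # 20 round trips instead of 1000
```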
Then write slower. There is no free lunch.
b
On Tue, Jun 15, 2010 at 1:58 PM, Julie julie.su...@nextcentury.com wrote:
Coinciding with my write timeouts, all 10 of my cassandra servers are getting
the following exception written to system.log:
"Value too large for defined data type" looks like a bug found in
older JREs. Upgrade to u19 or
Hello,
Phil Stanhope pstanhope at wimba.com writes:
How are you doing your inserts?
I draw a clear line between 1) bootstrapping a cluster with data and 2)
simulating expected/projected
read/write behavior.
If you are bootstrapping then I would look into the batch_mutate APIs. They
allow you
On Tue, Jun 15, 2010 at 5:15 PM, Julie julie.su...@nextcentury.com wrote:
I'm also baffled that after all compactions are done on every one of the 10
servers, about 5 out of 10 servers are still at 40% CPU usage, although they
are doing 0 disk IO. I am not running anything else on these
firstly, my apologies for the off-topic message,
but I thought most people on this list would be knowledgeable and
interested in this kind of thing.
We are looking to find an open source, scalable solution to do RT
aggregation and stream processing (similar to what the 'hop' project
hello,
I have a 4 node cassandra cluster with 0.6.1 installed. We've been running
a mixed read/write workload to test how it works in our environment; we run
about 4M batch mutations and 40M get_range_slice requests over 6 to 8 hours
that load about 10 to 15 GB of data.
Yesterday while there was
Known bug, fixed in latest 0.6 release.
On Tue, Jun 15, 2010 at 3:29 PM, aaron aa...@thelastpickle.com wrote:
hello,
I have a 4 node cassandra cluster with 0.6.1 installed. We've been running
a mixed read/write workload to test how it works in our environment; we run
about 4M batch mutations
Benjamin Black b at b3k.us writes:
Then write slower. There is no free lunch.
b
Are you implying that clients need to throttle their collective load on the
server to avoid causing the server to fail? That seems undesirable. Is this a
side effect of a server bug, or is it part of the
On Tue, Jun 15, 2010 at 3:55 PM, Charles Butterfield
charles.butterfi...@nextcentury.com wrote:
Benjamin Black b at b3k.us writes:
Then write slower. There is no free lunch.
b
Are you implying that clients need to throttle their collective load on the
server to avoid causing the server
Thanks, will move to 0.6.2.
Aaron
On Tue, 15 Jun 2010 15:55:46 -0700, Benjamin Black b...@b3k.us wrote:
Known bug, fixed in latest 0.6 release.
On Tue, Jun 15, 2010 at 3:29 PM, aaron aa...@thelastpickle.com wrote:
hello,
I have a 4 node cassandra cluster with 0.6.1 installed. We've been
Thanks for your updates, good to know that your performance is better now.
Actually, if users ask for one record at a time, it will usually be done with
multi-threading, since most likely the requests come from different users.
If a single user wants 200k, and there is no difference to get 1
Benjamin Black b at b3k.us writes:
I am only saying something obvious: if you don't have sufficient
resources to handle the demand, you should reduce demand, increase
resources, or expect errors. Doing lots of writes without much heap
space is such a situation (whether or not it is
Actually, you shouldn't expect errors in the general case, unless you
are simply trying to use data that can't fit in available heap. There
are some practical limitations, as always.
If there aren't enough resources on the server side to service the
clients, the expectation should be that the
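"Write slower" in practice means throttling on the client side. A minimal, library-free sketch of such a throttle (illustrative only, not a Cassandra API):

```python
import time

# Allow at most `rate` operations per second by enforcing a minimum
# interval between successive calls to wait().
class Throttle:
    def __init__(self, rate):
        self.min_interval = 1.0 / rate
        self.last = 0.0

    def wait(self):
        now = time.monotonic()
        sleep_for = self.last + self.min_interval - now
        if sleep_for > 0:
            time.sleep(sleep_for)
        self.last = time.monotonic()

t = Throttle(rate=1000)  # cap at 1000 writes/sec
start = time.monotonic()
for _ in range(100):
    t.wait()  # in real code: t.wait() before each insert
elapsed = time.monotonic() - start
print(elapsed >= 0.099)  # True: 100 ops at 1000 ops/s take at least ~0.1 s
```

A real client would call `wait()` before each write so a burst of inserts cannot outrun the server.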
We are currently looking at a distributed database option and so far
Cassandra ticks all the boxes. However, I still have some questions.
Is there any need for archiving of Cassandra and what backup options are
available? As it is a no-data-loss system I'm guessing archiving is not
exactly
There is JSON import and export, of you want a form of external backup.
No, you can't hook event subscribers into the storage engine. You can modify
it to do this, however. It may not be trivial.
An easier way to do this would be to have a boundary system (or dedicated
thread, for example)
Doh! Replace of with if in the top line.
On Tue, Jun 15, 2010 at 7:57 PM, Jonathan Shook jsh...@gmail.com wrote:
There is JSON import and export, of you want a form of external backup.
No, you can't hook event subscribers into the storage engine. You can
modify it to do this, however. It may
Thanks Jonathan, I was only asking about the event listeners because an
alternative we are considering is TIBCO Active Spaces which draws quite
a lot of parallels to Cassandra.
I guess it would be interesting to find out how other people use
Cassandra, i.e., is it your one stop shop for data
This is not the bug to which I was referring. I don't recall the
number, perhaps someone else can assist on that front? I just know I
specifically upgraded to 0.6 trunk a bit before 0.6.2 to pick up the
fix (and it worked).
b
On Tue, Jun 15, 2010 at 6:07 PM, Rob Coli rc...@digg.com wrote:
On Tue, Jun 15, 2010 at 4:44 PM, Charles Butterfield
charles.butterfi...@nextcentury.com wrote:
I guess my point is that I have rarely run across database servers that die
from either too many client connections, or too rapid client requests. They
generally stop accepting incoming connections
On Tue, Jun 15, 2010 at 4:44 PM, Charles Butterfield
charles.butterfi...@nextcentury.com wrote:
To clarify the history here -- initially we were writing with CL=0 and had
great performance but ended up killing the server. It was pointed out that
we were really asking the server to accept and
On Tue, Jun 15, 2010 at 4:58 PM, Jonathan Shook jsh...@gmail.com wrote:
If there aren't enough resources on the server side to service the
clients, the expectation should be that the servers have a graceful
performance degradation, or in the worst case throw an error specific
to resource
On 6/15/10 6:35 PM, Benjamin Black wrote:
jmhodges contributed a patch (I remain incompetent at Jira searches)
for 'coprocessors' to do what you want. That'd be where I'd start
looking.
https://issues.apache.org/jira/browse/CASSANDRA-1016
=Rob
Thanks Benjamin. Looking at the 'plugins' now :)
-Original Message-
From: Benjamin Black [mailto:b...@b3k.us]
Sent: Wednesday, 16 June 2010 11:35 AM
To: user@cassandra.apache.org
Subject: Re: Some questions about using Cassandra
On Tue, Jun 15, 2010 at 6:07 PM, Anthony Ikeda
I think the one you're referring to is
https://issues.apache.org/jira/browse/CASSANDRA-1076
On Tue, Jun 15, 2010 at 8:16 PM, Benjamin Black b...@b3k.us wrote:
This is not the bug to which I was referring. I don't recall the
number, perhaps someone else can assist on that front? I just know I
Yes!
On Tue, Jun 15, 2010 at 6:44 PM, Jonathan Ellis jbel...@gmail.com wrote:
I think the one you're referring to is
https://issues.apache.org/jira/browse/CASSANDRA-1076
On Tue, Jun 15, 2010 at 8:16 PM, Benjamin Black b...@b3k.us wrote:
This is not the bug to which I was referring. I don't
The main change you'd commonly make is decreasing the max new gen size
on large heaps (say to 2GB) from the default of 1/3 of the heap.
IMO keeping heap dump on OOM around is a good idea in production; it
doesn't cost much (you're already screwed at the point where it starts
writing a dump, so
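A hypothetical JVM_OPTS fragment along those lines for cassandra.in.sh (the sizes are illustrative and must be tuned per machine):

```shell
# Illustrative only: fixed 8G heap, new gen capped at 2G, heap dump kept on.
JVM_OPTS=" \
        -Xms8G \
        -Xmx8G \
        -Xmn2G \
        -XX:+HeapDumpOnOutOfMemoryError"
```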
Thank you for the update. For the select issue, right now we just focus on
read and write; later we may test delete operations, which need to query all
keys.
From: Dop Sun [mailto:su...@dopsun.com]
Sent: Tuesday, June 15, 2010 4:14 PM
To: user@cassandra.apache.org
Subject: RE: read operation