Are there any best practices for Storage configurations, MemTable thresholds
and Linux performance tuning to tune Cassandra nodes?
--
Thanks,
Mubarak Seyed.
Hi,
Has anybody done any memory usage analysis for Cassandra?
How much memory does Cassandra need to manage a 300G data load? How much
extra memory will be needed when doing compaction?
Regarding mmap, memory usage will be determined by the OS so it has nothing to
do with the heap size of
How much memory does Cassandra need to manage a 300G data load? How much
extra memory will be needed when doing compaction?
For one thing it depends on the data. One thing that scales linearly
(but with a low constant) with the amount of data are the bloom
filters. If those 300 GB correspond
Ran, I do know how to run a test in its own thread with the Maven Surefire plugin, but
I'm not sure how to do this with a separate JVM for each test. How are you doing
this? Thanks.
On Fri, Jul 9, 2010 at 10:33 PM, Ran Tavory ran...@gmail.com wrote:
The workaround I do is fork always. Each test pulls up its own
look at my pom. it has <forkMode>always</forkMode>
http://github.com/rantav/hector/blob/master/pom.xml#L95
On Wed, Jul 14, 2010 at 3:02 PM, Andriy Kopachevsky
kopachev...@gmail.com wrote:
Ran, I do know how to run a test in its own thread with the Maven Surefire plugin, but
I'm not sure how to do this with a separate JVM for
Will Hector release one along with 0.7, or will there be any beta or alpha before
the official release of 0.7?
I’m planning to update my client to work with Cassandra 0.7 trunk now, and I
have a dependency on your library. :)
Dop
From: Ran Tavory [mailto:ran...@gmail.com]
Sent: Wednesday,
Sounds good to me.
On Wed, Jul 14, 2010 at 12:25 AM, Mike Malone m...@simplegeo.com wrote:
Yep, as Ben said, we're not asking for anyone to write this for us.
We've been playing with some ideas around encryption between EC2
data-centers/regions (intra-region is already secure enough for us --
socketexception means this is coming from the network, not the sstables
knowing the full error message would be nice, but just about any
problem on that end should be fixed by adding connection pooling to
your client.
(moving to user@)
On Wed, Jul 14, 2010 at 5:09 AM, Thomas Downing
Turns out we can get a list from Eventbrite:
http://www.eventbrite.com/org/474011012?s=1926097
On Tue, Jul 13, 2010 at 3:09 PM, Jonathan Ellis jbel...@gmail.com wrote:
On Fri, Jul 9, 2010 at 9:36 AM, Jeremy Dunck jdu...@gmail.com wrote:
On Fri, Jul 2, 2010 at 1:08 PM, Jonathan Ellis
Denver on Sept 10
Seattle on Oct 8
http://www.eventbrite.com/org/474011012?s=1926097
--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
How will we load the VM on our machines? Do we download it ?
Is it running Ubuntu?
On Wed, Jul 14, 2010 at 11:11 AM, Jonathan Ellis jbel...@gmail.com wrote:
Turns out we can get a list from Eventbrite:
http://www.eventbrite.com/org/474011012?s=1926097
On Tue, Jul 13, 2010 at 3:09 PM,
[snip]
I'm not sure that is the case.
When the server gets into the unrecoverable state, the repeating exceptions
are indeed SocketException: Too many open files.
[snip]
Although this is unquestionably a network error, I don't think it is
actually a
network problem per se, as the maximum
I wrote code that iterates over all the rows by using get_range_slices.
for the first call I use KeyRange from "" to "".
for all the others I use from the last key that I got in the previous
iteration to "".
I always get the same rows that I got in the previous iteration. I tried
changing the batch size
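
The paging pattern described above can be sketched as follows. This is a minimal illustration, not the Thrift API: RangePager, rangeSlice and allKeys are hypothetical names, a sorted in-memory map stands in for the cluster, and the sketch assumes the start key of each range is inclusive, so every batch after the first repeats the last key of the previous batch and that one row is skipped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;

// Hypothetical stand-in for a cluster: a sorted in-memory map, not a real Thrift call.
class RangePager {
    // Mimics a range query: up to `count` keys starting at `start` (inclusive).
    // An empty start key means "from the beginning".
    static List<String> rangeSlice(NavigableMap<String, String> store, String start, int count) {
        List<String> keys = new ArrayList<>();
        for (String k : store.tailMap(start, true).keySet()) {
            if (keys.size() == count) break;
            keys.add(k);
        }
        return keys;
    }

    // Walks every key in batches of `batchSize`.
    static List<String> allKeys(NavigableMap<String, String> store, int batchSize) {
        if (batchSize < 2) {
            // With a batch of 1 the only row returned is the repeated start key.
            throw new IllegalArgumentException("batchSize must be at least 2");
        }
        List<String> seen = new ArrayList<>();
        String start = "";
        boolean first = true;
        while (true) {
            List<String> batch = rangeSlice(store, start, batchSize);
            // Skip the repeated start key on every batch except the first.
            int from = first ? 0 : Math.min(1, batch.size());
            seen.addAll(batch.subList(from, batch.size()));
            if (batch.size() < batchSize) break;   // short batch: nothing left
            start = batch.get(batch.size() - 1);   // resume from the last key seen
            first = false;
        }
        return seen;
    }
}
```

If a loop like this keeps returning the same rows no matter what the batch size is, the client-side logic is probably not the cause.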
I bring a USB drive for every attendee.
The VM runs Debian.
On Wed, Jul 14, 2010 at 10:20 AM, S Ahmed sahmed1...@gmail.com wrote:
How will we load the VM on our machines? Do we download it ?
Is it running Ubuntu?
On Wed, Jul 14, 2010 at 11:11 AM, Jonathan Ellis jbel...@gmail.com wrote:
All,
Can anyone help?
I followed the instructions for a single node installation of Cassandra. I
tried to start it and got:
ERROR 08:13:53,499 Exception encountered during startup.
java.io.StreamCorruptedException: invalid stream header: 61696E5D
at
This is a bug. If you can give us data to reproduce with we can fix it faster.
On Wed, Jul 14, 2010 at 10:29 AM, shimi shim...@gmail.com wrote:
I wrote code that iterates over all the rows by using get_range_slices.
for the first call I use KeyRange from "" to "".
for all the others I use from the
Thomas, I had a similar problem a few weeks back. I changed my code to make
sure that each thread only creates and uses one Hector connection. It seems
that client sockets are not being released properly, but I didn't have the
time to dig into it.
Jorge
On Wed, Jul 14, 2010 at 8:28 AM, Peter
there is a window of time from when a node goes down and when the rest
of the cluster actually realizes that it is down.
what happens to writes during this time frame? does hinted handoff
record these writes and then handoff when the down node returns? or
does hinted handoff not kick in until
Each of my top-level functions was allocating a Hector client connection at
the top, and releasing it when returning. The problem arose when a top-level
function had to call another top-level function, which led to the same
thread allocating two connections. Hector was not releasing one of them
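
One way to avoid the nested-allocation problem described above is to let a thread borrow a connection only at the outermost call and reuse it in nested calls. The sketch below is an assumption-laden illustration and does not use the Hector API: ConnectionScope, withConnection, saveUser and saveOrder are hypothetical names, and a plain Object stands in for a client connection.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical per-thread connection scope; NOT the Hector API.
class ConnectionScope {
    static final AtomicInteger open = new AtomicInteger();  // connections currently borrowed
    static final AtomicInteger peak = new AtomicInteger();  // highest value `open` ever reached

    private static final ThreadLocal<Object> current = new ThreadLocal<>();
    private static final ThreadLocal<Integer> depth = ThreadLocal.withInitial(() -> 0);

    // Runs `body` with this thread's connection, borrowing one only at the outermost call.
    static void withConnection(Consumer<Object> body) {
        if (depth.get() == 0) {
            current.set(new Object());                       // stand-in for "borrow from the pool"
            peak.accumulateAndGet(open.incrementAndGet(), Math::max);
        }
        depth.set(depth.get() + 1);
        try {
            body.accept(current.get());
        } finally {
            depth.set(depth.get() - 1);
            if (depth.get() == 0) {                          // outermost call: release exactly once
                current.remove();
                open.decrementAndGet();
            }
        }
    }

    // Two "top-level" functions; the second calls the first and shares its connection.
    static void saveUser()  { withConnection(conn -> { /* write user columns */ }); }
    static void saveOrder() { withConnection(conn -> saveUser()); }
}
```

Calling ConnectionScope.saveOrder() borrows one connection for the whole call chain and releases it in the finally block, even if the body throws.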
On Wed, Jul 14, 2010 at 1:43 PM, B. Todd Burruss bburr...@real.com wrote:
there is a window of time from when a node goes down and when the rest
of the cluster actually realizes that it is down.
what happens to writes during this time frame? does hinted handoff
record these writes and then
Where is the link that describes the various key types and their impact on
sorting? (I believe I read it before, can't seem to find it now).
So my application supports multi-tenants, so I need the keys to represent
things like:
website1123 + contentID
or
website3454 + userID
And for range
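
A sketch of the tenant-prefixed key idea, under stated assumptions: it assumes keys are stored in sorted order (as with an order-preserving partitioner), and the ':' separator and the TenantKeys helper are illustrative choices that do not come from the thread. A TreeSet stands in for the sorted key space.

```java
import java.util.NavigableSet;

// Hypothetical helper for tenant-prefixed row keys; a sorted set stands in for the key space.
class TenantKeys {
    static final char SEP = ':';   // assumed separator, not from the thread

    // Builds a key such as "website1123:contentID".
    static String key(String website, String id) {
        return website + SEP + id;
    }

    // All keys of one website: the half-open range [website + ':', website + ';').
    // The separator keeps "website1123" from also matching "website11234".
    static NavigableSet<String> keysFor(NavigableSet<String> all, String website) {
        String from = website + SEP;
        String to = website + (char) (SEP + 1);
        return all.subSet(from, true, to, false);
    }
}
```

With a random partitioner the keys are not stored in key order, so a prefix range like this would not return a contiguous tenant slice.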
thx, but disappointing :)
is this just something we have to live with and periodically repair
the nodes? or is there future work to tighten up the window?
thx
On Wed, 2010-07-14 at 12:13 -0700, Jonathan Ellis wrote:
On Wed, Jul 14, 2010 at 1:43 PM, B. Todd Burruss bburr...@real.com wrote:
Is it OK or recommended to use the same timestamp value for all Column and Deletion records sent in a batch mutation? Am thinking of cases where there is a potential for multiple clients to update the same key (with multiple columns) at the same time. In the use case it's acceptable, as the client
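
A minimal sketch of what "the same timestamp for the whole batch" could look like. The Mutation class and buildBatch helper are hypothetical and are not the Thrift-generated Column or Deletion types; the microsecond timestamp is a common client convention rather than a requirement.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical batch builder; not the Thrift-generated Column/Deletion classes.
class BatchTimestamps {
    static final class Mutation {
        final String column;
        final String value;      // null marks a deletion in this sketch
        final long timestamp;

        Mutation(String column, String value, long timestamp) {
            this.column = column;
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    // Common convention (an assumption here): microseconds since the epoch.
    static long nowMicros() {
        return System.currentTimeMillis() * 1000L;
    }

    // Stamps every write and every deletion in the batch with one shared timestamp.
    static List<Mutation> buildBatch(long timestamp, String[][] writes, String[] deletes) {
        List<Mutation> batch = new ArrayList<>();
        for (String[] w : writes) {
            batch.add(new Mutation(w[0], w[1], timestamp));
        }
        for (String d : deletes) {
            batch.add(new Mutation(d, null, timestamp));
        }
        return batch;
    }
}
```

Reading the clock once per batch, for example buildBatch(BatchTimestamps.nowMicros(), writes, deletes), means the columns of one logical update cannot end up with different timestamps.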
The key structure you have should group the keys based on the website. There are some differences between range queries with RP and OPP; this article may help:
http://ria101.wordpress.com/2010/02/22/cassandra-randompartitioner-vs-orderpreservingpartitioner/
Aaron
On 15 Jul, 2010, at 08:44 AM, S Ahmed
Hi,
I have a 0.6.3 cluster which contains 6 nodes. I added 6 new nodes
by setting AutoBootstrap to true and setting an InitialToken on each new
node, then waiting for the Bootstrapping message in the log before
starting another. Then I've been watching the logs on the old boxes
waiting to see
Coordination in a distributed system is difficult. I don't think we
can fix HH's existing edge cases, without introducing other more
complicated edge cases.
So weekly-or-so repair will remain a common maintenance task for the
foreseeable future.
On Wed, Jul 14, 2010 at 4:17 PM, B. Todd Burruss
It is good style but may not be necessary.
On Wed, Jul 14, 2010 at 4:54 PM, Aaron Morton aa...@thelastpickle.com wrote:
Is it OK or recommended to use the same timestamp value for all Column and
Deletion records sent in a batch mutation?
Am thinking of cases where there is a potential for
Each node logs what token it is going to bootstrap to. Who owns the
ranges that contain those tokens?
On Wed, Jul 14, 2010 at 5:58 PM, Anthony Molinaro
antho...@alumni.caltech.edu wrote:
Hi,
I have a 0.6.3 cluster which contains 6 nodes. I added 6 new nodes
by setting AutoBootstrap to true
The cluster nodes were running fine. When I restarted them to modify the JVM heap
settings, two of the nodes did not join the cluster and threw a Bootstrap
Token collision error.
Any idea how to fix this error?
ERROR [GMFD:1] 2010-07-15 01:23:13,756 DebuggableThreadPoolExecutor.java
(line 101) Error in
for your apps, how about this schema:
key: website1123
columnName: UserID
...
On Thu, Jul 15, 2010 at 6:13 AM, Aaron Morton aa...@thelastpickle.com wrote:
The key structure you have should group the keys based on the website. There
are some differences between range queries with RP and OPP this
I have 3 nodes A, B, C with RF=3. When I configure the cluster, and before it starts
taking any read/write requests, I first start A with A itself as the seed (following
the instructions on the wiki), then start B (with A as the seed), and then
start C (also with A as the seed).
B and C seem joining the
BTW,
A is 192.168.11.29
B is 192.168.11.28
C is 192.168.11.27
from the result of nodetool ring, does it mean that B thinks A, C are down and
C thinks B is down?
I tried to restart B and for a brief moment, I didn't get this problem (all the
nodes are all from nodetool) but after a while, this
Hi everyone,
I'm a newbie to Cassandra :D. I am trying to insert data from MySQL into Cassandra.
The data dump from MySQL is about 11 MB (64716 records). But when I insert it into
Cassandra, I think the data becomes bigger than in MySQL. Is that true?
Thanks
Can you do an insert with CL ALL? Are there any ERRORs in the log file? Try turning the logging up to
TRACE and see what's happening. Check B and see A by ssh'ing into
B and using nodetool from there to connect to A. Do you have
any switches / firewalls between the nodes ? Could this be