i'll second edward's comment. cassandra is designed to scale horizontally,
so if disk I/O is slowing you down then you must scale
On Tue, Jan 8, 2013 at 7:10 AM, Jim Cistaro jcist...@netflix.com wrote:
One metric to watch is pending compactions (via nodetool
compactionstats). This count
?
Aye.
There should be snapshots in there
https://github.com/apache/cassandra/blob/trunk/conf/cassandra.yaml#L402
Cheers
-
Aaron Morton
Freelance Cassandra Developer
New Zealand
@aaronmorton
http://www.thelastpickle.com
On 4/01/2013, at 2:04 PM, B. Todd Burruss bto
i will add that we have had a good experience with leveled compaction
cleaning out tombstoned data faster than size tiered, therefore
keeping our total disk usage much more reasonable than size tiered.
it is at the cost of I/O ... maybe 2X the I/O?? but that is not
bothering us.
what is
to get it correct, meaning consistent, it seems you will need to do
a repair no matter what since the source cluster is taking writes
during this time and writing to commit log. so to avoid filename
issues just do the first copy and then repair. i am not sure if they
can have any filename.
to
i think this was a repair without -pr
thanks,
Andras
Andras Szerdahelyi
Solutions Architect, IgnitionOne | 1831 Diegem E.Mommaertslaan 20A
M: +32 493 05 50 88 | Skype: sandrew84
On 18 Dec 2012, at 22:09, B. Todd Burruss bto...@gmail.com wrote:
in your data directory, for each
i believe we have hit this as well. if you use nodetool to
rebuild_index, does it work?
On Wed, Dec 19, 2012 at 8:10 PM, aaron morton aa...@thelastpickle.com wrote:
Well that was fun https://issues.apache.org/jira/browse/CASSANDRA-5079
Just testing my idea of a fix now.
Cheers
in your data directory, for each keyspace there is a solr.json. cassandra
stores the SSTABLEs it knows about when using leveled compaction. take a
look at that file and see if it looks accurate. if not, this is a bug with
cassandra that we are checking into as well
On Thu, Dec 6, 2012 at 7:38
my two cents ... i know this thread is a bit old, but the fact that
odd-sized SSTABLEs (usually large ones) will hang around for a while
can be very troublesome on disk space and planning. our data is
temporal in cassandra, being deleted constantly. we have seen space
usage in the 1+ TB range
trying to figure out if i'm doing something wrong or a bug. i am
creating a simple schema, inserting a timestamp using ISO8601 format,
but when retrieving the timestamp, the timezone is displayed
incorrectly. i'm inserting using GMT, the result is shown with
+, but the time is for my local
at 12:09 PM, B. Todd Burruss bto...@gmail.com wrote:
if i stop a node and remove an SSTABLE, let's call it X, is that safe?
ok, more info. i know that the data in SSTABLE X has been tombstoned
but the tombstones are in SSTABLE Y. i want to simply delete X and get
rid of the data.
how do i
with NetworkTopologyStrategy it theoretically should work
http://www.datastax.com/docs/1.0/cluster_architecture/replication
On Thu, Nov 8, 2012 at 5:11 PM, ws w...@jeremymckay.com wrote:
If I have multiple clusters can I replicate a keyspace from each of those
cluster to separate cluster?
@oleg, to answer your last question a cassandra node should never ask
another node for information it doesn't have. it uses the key and the
partitioner to determine where the data is located before ever
contacting another node.
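
A minimal sketch of the lookup described above, assuming a single replica per key and an md5-based token. The names `token_for`, `owner_for`, and the three-node ring are hypothetical, and the hashing is a simplified stand-in for a real partitioner, not Cassandra's exact computation.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 127  # token space assumed for this sketch


def token_for(key: bytes) -> int:
    """Hash a row key to a token (simplified stand-in for a partitioner)."""
    return int.from_bytes(hashlib.md5(key).digest(), "big") % RING_SIZE


def owner_for(key: bytes, ring: dict) -> str:
    """Return the node holding the first token >= the key's token, wrapping around."""
    tokens = sorted(ring)
    idx = bisect.bisect_left(tokens, token_for(key))
    return ring[tokens[idx % len(tokens)]]


# hypothetical three-node ring with evenly spaced tokens
ring = {i * RING_SIZE // 3: f"node{i}" for i in range(3)}
print(owner_for(b"user:42", ring))
```

The point of the sketch is that the owner is computed locally from the key and the ring layout, so no other node has to be asked where the data lives.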
On Mon, Nov 5, 2012 at 9:45 AM, Andrey Ilinykh ailin...@gmail.com
we are having the problem where we have huge SSTABLEs with tombstoned data
in them that is not being compacted soon enough (because size tiered
compaction requires, by default, 4 like sized SSTABLEs). this is using
more disk space than we anticipated.
we are very write heavy compared to reads,
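
A rough sketch of why a huge SSTABLE can sit uncompacted under size tiered compaction, assuming SSTABLEs are bucketed by similar size and a bucket is only compacted once it holds 4 of them. The 0.5x/1.5x similarity bounds and the function names are assumptions for illustration, not taken from the thread.

```python
def bucket_by_size(sizes, low=0.5, high=1.5):
    """Group SSTABLE sizes into buckets whose members are close to the bucket average."""
    buckets = []  # each bucket is a list of sizes
    for size in sorted(sizes):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            if low * avg <= size <= high * avg:
                bucket.append(size)
                break
        else:
            # no existing bucket is similar enough, so start a new one
            buckets.append([size])
    return buckets


def compaction_candidates(sizes, min_threshold=4):
    """Return only the buckets that hold at least min_threshold SSTABLEs."""
    return [b for b in bucket_by_size(sizes) if len(b) >= min_threshold]


# hypothetical sizes in MB: four small SSTABLEs, one medium, one very large
sizes_mb = [10, 11, 12, 9, 400, 950_000]
print(compaction_candidates(sizes_mb))
```

In this example only the four small SSTABLEs qualify. The very large one has no like-sized peers, so any tombstoned data inside it stays on disk until three more SSTABLEs of similar size appear.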
we are running Datastax enterprise and cannot patch it. how bad is
kill performance? if it is so bad, why is it an option?
On Thu, Nov 8, 2012 at 10:17 AM, Radim Kolar h...@filez.com wrote:
On 8.11.2012 19:12, B. Todd Burruss wrote:
my question is would leveled compaction help to get
thanks for the links! i had forgotten about live sampling
On Thu, Nov 8, 2012 at 11:41 AM, Brandon Williams dri...@gmail.com wrote:
On Thu, Nov 8, 2012 at 1:33 PM, Aaron Turner synfina...@gmail.com wrote:
There are also ways to bring up a test node and just run Level Compaction on
that. Wish
, Nov 8, 2012 at 11:53 AM, B. Todd Burruss bto...@gmail.com wrote:
thanks for the links! i had forgotten about live sampling
On Thu, Nov 8, 2012 at 11:41 AM, Brandon Williams dri...@gmail.com wrote:
On Thu, Nov 8, 2012 at 1:33 PM, Aaron Turner synfina...@gmail.com wrote:
There are also ways
bryce, did you resolve this? i'm interested in the outcome.
when you write does it help to use CL = LOCAL_QUORUM?
On Mon, Oct 29, 2012 at 12:52 AM, aaron morton aa...@thelastpickle.com wrote:
Outbound messages for other DC's are grouped and a single instance is sent
to a single node in the
, 2012 11:54 AM, B. Todd Burruss bto...@gmail.com wrote:
does nodetool cleanup perform a major compaction in the process of
removing unwanted data?
No.
what is the internal memory model used? It sounds like it doesn't have a
page manager?
Regarding memory usage after a repair ... Are the merkle trees kept around?
On Oct 23, 2012 3:00 PM, Bryan Talbot btal...@aeriagames.com wrote:
On Mon, Oct 22, 2012 at 6:05 PM, aaron morton aa...@thelastpickle.com wrote:
The GC was on-going even when the nodes were not compacting or running a
if a node, X, has a tombstone marking deleted data, when can node X
remove the data - not the tombstone, but the data? i understand the
tombstone cannot be removed until GCGraceSeconds has passed, but it
seems the data could be compacted away at any time.
.
Dean
On 10/22/12 10:43 AM, B. Todd Burruss bto...@gmail.com wrote:
if a node, X, has a tombstone marking deleted data, when can node X
remove the data - not the tombstone, but the data? i understand the
tombstone cannot be removed until GCGraceSeconds has passed, but it
seems the data could
does nodetool cleanup perform a major compaction in the process of
removing unwanted data?
i seem to remember this to be the case, but can't find anything definitive
i have used StorageProxy and was forgetting to rewind (or otherwise
setup my ByteBuffer properly) and was getting, i believe, the same
error.
check your ByteBuffers
On Sat, Oct 13, 2012 at 8:49 AM, Nick Morizio nmori...@yahoo.com wrote:
I'm wondering if anyone has seen this issue before:
We
did the amount of data finally exceed your per machine RAM capacity?
is it the same 20% each time you read? or do your periodic reads
eventually work through the entire dataset?
if you are essentially table scanning your data set, and the size
exceeds available RAM, then a degradation like that
trying to think of a use case where you would want to order by
timestamp, and also have unique column names for direct access.
not really trying to challenge the use case, but you can get ordering
by timestamp and still maintain a name for the column using
composites. if the first component of
as of 1.0 (CASSANDRA-2034) hints are generated for nodes that timeout.
On Thu, Oct 11, 2012 at 3:55 AM, Watanabe Maki watanabe.m...@gmail.com wrote:
Even if HH works fine, HH will not be created until the failure detector
marks the node as dead.
HH will not be created for partially timeouted
https://issues.apache.org/jira/browse/CASSANDRA/fixforversion/12323284
On Wed, Oct 10, 2012 at 1:41 AM, Alexey Zotov azo...@griddynamics.com wrote:
Hi Guys,
What known critical bugs are there that would prevent using 1.2 beta 1 in
production?
We don't use cql and secondary indexes.
--
major compaction in production is fine, however it is a heavy operation on
the node and will take I/O and some CPU.
the only time i have seen this happen is when i have changed the tokens in
the ring, like nodetool movetoken. cassandra does not auto-delete data
that it doesn't use anymore just
if you have N nodes in your cluster, add N new nodes using the new
hardware, then decommission the old N nodes.
(and migrate to VPC like dean said)
On Wed, Oct 10, 2012 at 5:23 AM, Hiller, Dean dean.hil...@nrel.gov wrote:
Well, you could use amazon VPC in which case you DO pick the IP yourself
On Wed, Oct 10, 2012 at 6:12 PM, B. Todd Burruss bto...@gmail.com wrote:
major compaction in production is fine, however it is a heavy operation
on the node and will take I/O and some CPU.
the only time i have seen this happen is when i have changed the tokens
in the ring, like nodetool
the following exception seems to be about loading saved caches, but i
don't really care about the cache so maybe isn't a big deal. anyway,
this is with patched 0.7.1
(0001-Fix-bad-signed-conversion-from-byte-to-int.patch)
WARN 11:07:59,800 error reading saved cache
wiki page is here ...
https://github.com/rantav/hector/wiki/Hector-Object-Mapper-(HOM)
https://github.com/rantav/hector/wiki/Hector-Object-Mapper-%28HOM%29
it does not handle relationships between objects yet, but does handle
inheritance
On 02/10/2011 12:21 PM, Jonathan Ellis wrote:
An
batch_mutate doesn't guarantee consistency. each mutation in the batch
is guaranteed to be consistent based on your CL, but if it returns an
error it means that it couldn't complete all mutations ... but the
converse isn't true. it may have successfully completed some
mutations. if you get
any word on when to expect 0.7.1? lots of good fixes we need. trying
to decide if i should apply patches or wait.
thx!
web site says sold out, too bad for me ;)
On 01/28/2011 07:01 PM, Jonathan Ellis wrote:
Next week is the Strata conference and not one, not two, but five
Cassandra events!
In chronological order:
1. My Strata Cassandra tutorial Tuesday afternoon:
ok thx. what about the repair creating hundreds of new sstables and
lsof showing cassandra using currently over 800 Data.db files? is this
normal?
On 01/27/2011 08:40 AM, Brandon Williams wrote:
On Thu, Jan 27, 2011 at 10:21 AM, Todd Burruss bburr...@real.com wrote:
as -tmp-?
On Jan 27, 2011 9:00 AM, B. Todd Burruss bburr...@real.com wrote:
ok thx. what about the repair creating hundreds of new sstables and
lsof showing cassandra using currently over 800 Data.db files? is this
normal?
On 01/27/2011 08:40 AM, Brandon Williams
i ran out of file handles on the repairing node after doing nodetool
repair - strange as i have never had this issue until using 0.7.0 (but i
should say that i have not truly tested 0.7.0 until now.) up'ed the
number of file handles, removed data, restarted nodes, then restarted my
test.
we use zabbix. we run the agent on our linux boxes and also start
zapcat using the class that follows. essentially you go into the zabbix
console and setup hosts for the zapcat port, and hosts for the
zabbix agent. then setup items for the zapcat host that are JMX
metrics. info on zapcat
has anyone created a maven plugin, like cargo for tomcat, for automating
starting/stopping a cassandra instance?
how are folks customizing the cassandra.yaml for each node in the
cluster. specifically the token and IP address.
with XML i used entities, but i'm not familiar with YAML. does yaml
support the same concept? or any sort of textual substitution?
thx
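
One possible approach, sketched under assumptions: YAML anchors and aliases only reuse values inside a single file, so per-node values are often filled in by an external templating step instead. The template text, the token values, and the IP addresses below are hypothetical, and only a couple of settings are shown.

```python
from string import Template

# hypothetical fragment of a cassandra.yaml with placeholders for per-node values
TEMPLATE = Template(
    "cluster_name: 'Test Cluster'\n"
    "initial_token: $token\n"
    "listen_address: $ip\n"
)


def render_node_config(token: int, ip: str) -> str:
    """Fill in the per-node token and IP address."""
    return TEMPLATE.substitute(token=token, ip=ip)


# hypothetical two-node cluster
nodes = [(0, "10.0.0.1"), (2 ** 126, "10.0.0.2")]
for token, ip in nodes:
    print(render_node_config(token, ip))
```

Each rendered string would then be written out as that node's config file by whatever deployment tooling is in use.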
i am seeing several different exceptions across my 8 node cluster.
running 0.7 RC2. the following are all from one node. is this a known
issue?
ERROR [MutationStage:35] 2010-12-15 09:25:06,466
RowMutationVerbHandler.java (line 83) Error in row mutation
http://www.hazelcast.com/product.jsp
has anyone tested hazelcast as a distributed locking mechanism for java
clients? seems very attractive on the surface.
thx, it does say that in the log, but that is probably just a
reflection of whatever is read from cassandra.yaml.
i am wondering if some unix tool can tell me if my process is mmap'ing
files. maybe lsof?
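
A small sketch of one way to check this on Linux, assuming the process's memory map is readable at `/proc/<pid>/maps`. The helper names and the sample path are hypothetical; this is an alternative to lsof, not something suggested in the thread.

```python
def mapped_files(maps_text: str) -> set:
    """Return the file paths that appear in /proc/<pid>/maps-style text."""
    paths = set()
    for line in maps_text.splitlines():
        # columns: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) == 6 and fields[5].startswith("/"):
            paths.add(fields[5])
    return paths


def mapped_files_for_pid(pid: int) -> set:
    """Read the live memory map of a process (Linux only)."""
    with open(f"/proc/{pid}/maps") as f:
        return mapped_files(f.read())


# hypothetical sample: one mapped data file, one anonymous heap region
sample = (
    "7f0000000000-7f0000100000 r--s 00000000 08:01 1234 /var/lib/cassandra/data/ks/cf-1-Data.db\n"
    "7f0000200000-7f0000300000 rw-p 00000000 00:00 0 [heap]\n"
)
print(mapped_files(sample))
```

If Data.db files show up in the result for the cassandra pid, the process has them mapped into memory.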
On 10/14/2010 12:07 PM, Rob Coli wrote:
On 10/14/10 10:59 AM, B. Todd Burruss wrote
you should upgrade to the latest version of the JVM, 1.6.0_21
there was a bug around 1.6.0_18 (or thereabouts) that affected cassandra
On 10/13/2010 07:55 PM, Eric Czech wrote:
And this is the java version:
java version 1.6.0_13
Java(TM) SE Runtime Environment (build 1.6.0_13-b03)
Java
(if it is actually corrupted). Do you know if
compact or repair would detect bad data and disregard it? I'd like to
try something like that if possible before just upgrading the JVM and
potentially hiding the real problem.
On Wed, Oct 13, 2010 at 9:35 PM, B. Todd Burruss bburr...@real.com
if you are updating columns quite rapidly, you will scatter the columns
over many sstables as you update them over time. this means that a read
of a specific column will require looking at more sstables to find the
data. performing a compaction (using nodetool) will merge the sstables
into
i don't see a beta2 subversion tag. is there one?
On 10/01/2010 11:56 AM, Eric Evans wrote:
It's like Christmas in October, but without the long lines.
First, the obligatory disclaimer.
This is beta software. It's like a teenage driver, it seems as though
it's up to the task, and it almost
using 0.7 latest from trunk as of few minutes ago. 1 client, 1 node
i have the scenario where i want to drop a column family and recreate it
- unit testing for instance, is a good reason you may want to do this
(always start fresh).
the problem i observe is that if i do the following:
1 -
https://issues.apache.org/jira/browse/CASSANDRA-1477
comments below
On 09/07/2010 02:10 PM, Jonathan Ellis wrote:
On Tue, Sep 7, 2010 at 3:55 PM, B. Todd Burruss bburr...@real.com wrote:
using 0.7 latest from trunk as of few minutes ago. 1 client, 1 node
i have the scenario where i want
5 secs isn't enough for me, 10 is good. i haven't tried any other
values as i can get around this through another manner.
On 09/07/2010 02:24 PM, Edward Capriolo wrote:
On Tue, Sep 7, 2010 at 5:10 PM, Jonathan Ellis jbel...@gmail.com wrote:
On Tue, Sep 7, 2010 at 3:55 PM, B. Todd
i got the latest code this morning. i'm testing with 0.7
ERROR [ROW-MUTATION-STAGE:388] 2010-08-27 15:54:58,053
RowMutationVerbHandler.java (line 78) Error in row mutation
org.apache.cassandra.db.UnserializableColumnFamilyException: Couldn't
find cfId=1002
at
i did check sstables, and there are only three. i haven't done any
major compacts.
do u think it is taking so long because it must sift thru the deleted
columns before compaction?
so accessing a column by name instead of slice predicate is faster?
On 08/24/2010 11:23 PM, Benjamin Black
CentOS works fine for me. straight out-o-the box. i also use ubuntu
10.04 w/o any troubles. make sure to have jdk 1.6.0_20 or better.
there was a bug that affects cassandra somewhere around 1.6.0_18 i
think.
On Tue, 2010-08-24 at 08:58 -0700, S Ahmed wrote:
Is there a particular linux flavor
, 2010 at 1:28 PM, B. Todd Burruss bburr...@real.com wrote:
i just came across this and i use tokens in range queries because it is
an easy straightforward way to divide the keyspace and operate on it
using multiple threads and throttle the processing. maybe this is what
hadoop does, i
i am using get_slice to pull columns from a row to emulate a queue.
column names are TimeUUID and the values are small, 32 bytes. simple
ColumnFamily.
i am using SlicePredicate like this to pull the first (oldest) column
in the row:
SlicePredicate predicate = new
, Aug 24, 2010 at 9:14 PM, B. Todd Burruss bburr...@real.com wrote:
i am using get_slice to pull columns from a row to emulate a
queue. column names are TimeUUID and the values are small, 32
bytes. simple ColumnFamily.
i am using SlicePredicate like
i see the following in my server logs quite closely while doing a lot of
batch_mutations and reads. i create keyspaces and column families using
thrift api, not cassandra.yaml. did not migrate anything from 0.6.
4 node cluster, RF = 3, QUORUM read/write.
happens immediately on a fresh start of
into
https://issues.apache.org/jira/browse/CASSANDRA-1403, which was fixed
last week and will be included in beta2.
If you are experiencing this on trunk, please do file another ticket,
or comment on the existing one.
Gary.
On Mon, Aug 23, 2010 at 13:33, B. Todd Burruss bburr...@real.com
i am getting this as well. i am calling batch_mutate. i don't see any
server logs at INFO level. switched to DEBUG and still no interesting
messages.
by setting break points i tracked it down to TIOStreamTransport with
type TTransportException.END_OF_FILE. seems for some reason the bytes
read
if i am using batch_mutate to update/insert two columns in the same CF
and same key, is this an atomic operation?
i understand that an operation on a single key in a CF is atomic, but
not sure if the above scenario boils down to two operations or
considered one operation.
thx
ok i just saw the FAQ
(http://wiki.apache.org/cassandra/FAQ#batch_mutate_atomic)
follow up question ...
it states that As a special case, mutations against a single key are
atomic, but more generally no ... i interpret that to also mean ..
mutations against a single key in the same CF ...
so
AggressiveOpts, if i remember correctly, uses options that are not
documented but will probably make it into a future release of the JVM.
cassandra used it once upon a time. probably should take it out, but
things work just fine for me now ;)
On Tue, 2010-07-27 at 01:48 -0700, Dathan Pattishall
if i have N=3 and run nodetool repair on node X. i assume that merkle
trees (at a minimum) are calculated on nodes X, X+1, and X+2 (since
N=3). when the repair is finished are nodes X, X+1, and X+2 all in sync
with respect to node X's data? or does X have the latest data and X+1
and X+2 still
there is a window of time from when a node goes down and when the rest
of the cluster actually realizes that it is down.
what happens to writes during this time frame? does hinted handoff
record these writes and then handoff when the down node returns? or
does hinted handoff not kick in until
thx, but disappointing :)
is this just something we have to live with and periodically repair
the nodes? or is there future work to tighten up the window?
thx
On Wed, 2010-07-14 at 12:13 -0700, Jonathan Ellis wrote:
On Wed, Jul 14, 2010 at 1:43 PM, B. Todd Burruss bburr...@real.com wrote
i'll jump in ... why AVRO over Thrift. can you guys point me at a
comparison? (i know next to nothing about both of them)
On 06/18/2010 03:41 PM, Paul Brown wrote:
On Jun 18, 2010, at 2:12 PM, Eric Evans wrote:
On Fri, 2010-06-18 at 11:00 -0700, Paul Brown wrote:
At the risk of
i just figured out that i can't do a batch mutate + deletion that uses a
slice range predicate. is adding this functionality targeted for a
particular release? what i am trying to do is delete the first X
columns in a row. i can get around it by requesting all the columns in
question and then
thx
On 05/13/2010 02:12 PM, Gary Dusbabek wrote:
Yes--0.7. I aim to make it part of
https://issues.apache.org/jira/browse/CASSANDRA-494 (remove_slice).
Gary.
On Thu, May 13, 2010 at 16:08, B. Todd Burruss bburr...@real.com wrote:
i just figured out that i can't do a batch mutate +
another note on this ... since all my nodes are very well balanced and
were started at the same time, i notice that they all do garbage
collection at about the same time. this of course causes a performance
issue.
i also have noticed that with the default JVM options and heavy load,
have you put your commit log on a disk by itself? not a logical
partition shared by oracle or cassandra data. this will make a
difference, as you don't want the cassandra commit logs competing with
other OS and oracle I/O. look in storage-conf.xml and see if you can
move this.
also check
i think you will see a slow down because of large values in your
columns. make sure you take a look at MemtableThroughputInMB in your
config. if you are writing 1MB of data per row, then you'll probably
want to increase this quite a bit so you are not constantly creating
sstables. can't
i see these exceptions on 4 out of the 7 nodes in my cluster. in
addition those same four nodes all show AE-SERVICE-STAGE with pending
work, and have been showing this for several hours now. each node in the
cluster has less than 2gb, so it should be finished by now.
when i do nodetool streams
i agree, but it seems to have implications on the streaming service.
Jonathan Ellis wrote:
java.net.ConnectException: Connection timed out at
sun.nio.ch.Net.connect is an os-level connection problem.
On Fri, Apr 23, 2010 at 3:34 PM, B. Todd Burruss bburr...@real.com wrote:
i see
https://issues.apache.org/jira/browse/CASSANDRA-1019
Jonathan Ellis wrote:
Can you create a ticket?
On Fri, Apr 23, 2010 at 3:50 PM, B. Todd Burruss bburr...@real.com wrote:
i agree, but it seems to have implications on the streaming service.
Jonathan Ellis wrote
http://sourceforge.net/projects/clusterssh/
Roger Schildmeijer wrote:
dancer's shell / distributed shell
http://www.netfort.gr.jp/~dancer/software/dsh.html.en
On 20 apr 2010, at 17.18em, Joost Ouwerkerk wrote:
What are people using to manage Cassandra cluster nodes? i.e. to
responded with i believe what i need.
thx!
Benjamin Black wrote:
Are you deleting data through the API or just doing a bunch of inserts
and then running a compaction? The latter will not result in anything
to clean up since data must be explicitly deleted.
b
On Tue, Apr 20, 2010 at 10:33 AM, B. Todd
representation, so returning the internal IPs
is correct, even though this makes it slightly more difficult to use
for thrift clients.
On Tue, Mar 16, 2010 at 4:55 PM, B. Todd Burruss bburr...@real.com wrote:
if you choose #3 - get_string_property(token map) - keep in mind that the
IPs returned from
at 11:39, B. Todd Burruss bburr...@real.com wrote:
any other ideas on how to troubleshoot? i have tried kill -3 java_pid in
the past but don't know where cassandra writes the console out. i'll look
at scripts.
I have a sneaking suspicion that unless you're running with '-f