Have you tried using the G1 garbage collector instead of CMS?
We had the same issue: things were normally fine, but as soon as
something extraordinary happened, a node could go into GC hell and never
recover, and that could then spread to other nodes as they took up the
slack, trapping them i
We are running 2.1.x and are currently looking into changing from STCS to
LCS, as well as enabling incremental repairs.
In what order should we do that? Should we enable incremental repairs
first, let it run its course which would mark a lot of tables as repaired,
and those marks would then carry
gons. And not the fun Disney kind.
>
> While it may be more work I personally would use one node in write survey
> to test LCS
>
> Cheers
>
> -
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
>
> @aaronmorton
> http://www.thelastpi
>
> *From:* Henrik Schröder [mailto:skro...@gmail.com]
> *Sent:* Tuesday, February 19, 2
Hey,
Version 1.1 of Cassandra introduced live traffic sampling, which allows you
to measure the performance of a node without it really joining the cluster:
http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-1-live-traffic-sampling
That page mentions that you can change the compaction stra
From your logs you can see that it explicitly binds to 127.0.0.1/9160, so
connecting to it on any other IP address won't work. To understand what
happened in the other cases, simply check what it says it binds to when you
start it up for different values of rpc_address.
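For illustration, the setting involved in cassandra.yaml (the address below is a made-up example, substitute your own):

```yaml
# Bind the Thrift RPC interface to one specific address...
rpc_address: 192.168.1.10   # hypothetical internal IP
# ...or to all interfaces, so clients can connect on any of the node's IPs:
# rpc_address: 0.0.0.0
```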
If you set rpc_address to
Don't run with a replication factor of 2, use 3 instead, and do all reads
and writes using quorum consistency.
That way, if a single node is down, all your operations will complete. In
fact, if every third node is down, you'll still be fine and able to handle
all requests.
However, if two adjacen
happen. This is one of the problems caused by major compaction.
> For maintenance it is better to have a set of small sstables than one
> big one.
>
> Andrey
>
>
> On Thu, Nov 8, 2012 at 2:55 AM, Henrik Schröder wrote:
>
>> Hi,
>>
>> We recently ran a major compact
you repaired last time ?
>
>
> 2012/11/8 Henrik Schröder
>
>> No, we're not using columns with TTL, and I performed a major compaction
>> before the repair, so there shouldn't be vast amounts of tombstones moving
>> around.
>>
>> And the
the load on my cluster, but what you are
> describing sounds like a lot to me.
>
> Does this increase happen due to repair entirely? Or was the load maybe
> increasing gradually over the week and you just checked for the first time?
>
> cheers,
> Christian
>
>
>
> O
Hi,
We recently ran a major compaction across our cluster, which reduced the
storage used by about 50%. This is fine, since we do a lot of updates to
existing data, so that's the expected result.
The day after, we ran a full repair -pr across the cluster, and when that
finished, each storage node
erved.
>
> The config options we are unsure about are things like commit log sizes, ….
>
> I would try to find some indication of what's going on before tweaking.
> Have you checked iostat ?
>
> Hope that helps.
>
> -
> Aaron Morton
> Freelance
When we ran Cassandra on Windows, we got better performance without
memory-mapped IO. We had the same problems you are describing. What happens
is that Windows is rather aggressive about swapping out memory when all the
memory is used, and it starts swapping out "unused" parts of the heap,
which c
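For anyone hitting this, the knob involved is disk_access_mode in cassandra.yaml (values as of the 1.0.x era; check the comments in your version's file):

```yaml
# Possible values: auto | mmap | mmap_index_only | standard
# 'standard' disables memory-mapped I/O entirely;
# 'mmap_index_only' maps only the index files, a common middle ground.
disk_access_mode: standard
```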
Hi all,
We're running a small Cassandra cluster (v1.0.10) serving data to our web
application, and as our traffic grows, we're starting to see some weird
issues. The biggest of these is that sometimes, a single node becomes
unresponsive. It's impossible to start new connections, or impossible to
s
Removetoken should only be used when removing a dead node from a cluster;
it's a much slower and more expensive operation since it triggers a repair
so that the remaining nodes can figure out which data they should now have.
Decommission on the other hand is much simpler, the node that's being
deco
Bug: https://lkml.org/lkml/2012/6/30/122
Simple fix to reset the leap second flag:
date; date `date +"%m%d%H%M%C%y.%S"`; date;
/Henrik
On Mon, Jul 2, 2012 at 1:56 PM, Jean Paul Adant
wrote:
> Hi,
>
> I did have the same problem with cassandra 1.1.1 on Ubuntu 11.10
> I had to reboot all nodes
>
running a tight connection pool which recycles connections every few hours
and only waits a few seconds for a connection before timing out?
/Henrik
On Thu, Jun 14, 2012 at 4:54 PM, Mina Naguib
wrote:
>
> On 2012-06-14, at 10:38 AM, Henrik Schröder wrote:
>
> > Hi everyone,
>
Hi everyone,
We have a problem with our Cassandra cluster: sometimes it
takes several seconds to open a new Thrift connection to the server. We've
had this issue when we ran on windows, and we have this issue now that we
run on Ubuntu. We've had it with our old networking setup, and
On Thu, May 24, 2012 at 9:28 PM, Brandon Williams wrote:
>
> That sounds fine, with the caveat that you can't run sstableloader
> from a machine running Cassandra before 1.1, so copying the sstables
> manually (assuming both clusters are the same size and have the same
> tokens) might be better.
On Thu, May 24, 2012 at 8:07 PM, Brandon Williams wrote:
> > Are there any other ways of doing the migration? What happens if we join
> the
> > new servers without bootstrapping and run repair? Are there any other
> ugly
> > hacks or workaround we can do? We're not looking to run a mixed cluster,
datafiles, it should have all the
data associated with that token, and on joining the cluster it should just
pop in at the right place, but with a new ip address. And then we repeat
that for each server.
Will this work? Or is there a better way?
/Henrik
On Thu, May 24, 2012 at 7:41 PM, Henrik
Hey everyone,
We're trying to migrate a cassandra cluster from a bunch of Windows
machines to a bunch of (newer and more powerful) Linux machines.
Our initial plan was to simply bootstrap the Linux servers into the cluster
one by one, and then decommission the old servers one by one. However, whe
On Tue, May 1, 2012 at 10:00 PM, Oleg Proudnikov wrote:
> There is this note regarding major compaction in the tuning guide:
>
> "once you run a major compaction, automatic minor compactions are no longer
> triggered frequently forcing you to manually run major compactions on a
> routine
> basis"
On Tue, May 1, 2012 at 9:06 PM, Edward Capriolo wrote:
> Also there are some tickets in JIRA to impose a max sstable size and
> some other related optimizations that I think got stuck behind levelDB
> in coolness factor. Not every use case is good for leveled so adding
> more tools and optimizatio
On Tue, May 1, 2012 at 6:07 PM, Rob Coli wrote:
>
> The primary differences, as I understand it, are that the index
> performance and bloom filter false positive rate for your One Big File
> are worse. First, you are more likely to get a bloom filter false
> positive due to the intrinsic degradat
But what's the difference between doing an extra read from that One Big
File, and doing an extra read from whatever SSTable happens to be largest
in the course of automatic minor compaction?
We have a pretty update-heavy application, and doing a major compaction can
remove up to 30% of the used di
In your code you are using BufferedTransport, but in the Cassandra logs
you're getting errors when it tries to use FramedTransport. If I remember
correctly, BufferedTransport is gone, so you should only use
FramedTransport. Like this:
TTransport transport = new TFramedTransport(new TSocket(host, port));
Great, thanks!
/Henrik
On Thu, Mar 1, 2012 at 13:08, Sylvain Lebresne wrote:
> It's a bug, namely: https://issues.apache.org/jira/browse/CASSANDRA-3616
> You'd want to upgrade.
>
> --
> Sylvain
>
> On Thu, Mar 1, 2012 at 1:01 PM, Henrik Schröder wrote:
> >
don't remember seeing this behaviour in older
versions of Cassandra, shouldn't it delete temp files while running? Is it
possible to force it to delete temp files while running? Is this fixed in a
later version? Or do we have to periodically restart servers to clean up
the datadirectories?
/Henrik Schröder
I had to port that piece of code to C#, and it's just a few lines of code,
so just write your own. Here's the original so you can see what it does:
http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=src/java/org/apache/cassandra/utils/FBUtilities.java;hb=refs/heads/trunk
/Henri
I'm running Cassandra 1.0.1 if that makes any difference.
/Henrik
On Sun, Dec 11, 2011 at 13:16, Henrik Schröder wrote:
> I have an existing cluster of four Cassandra nodes. The machines have both
> an internal and an external IP, and originally I set them up to use the
> exter
I have an existing cluster of four Cassandra nodes. The machines have both
an internal and an external IP, and originally I set them up to use the
external network. A little while later I moved them to the internal network
by bringing all machines down, changing the config, and bringing them up
aga
only
restarting Cassandra fixes it.
I hope this helps a little bit at least.
/Henrik Schröder
On Thu, Oct 20, 2011 at 18:53, Dan Hendry wrote:
> I have been playing around with Cassandra 1.0.0 in our test environment and
> it seems pretty sweet so far. I have however come across what appears t
can create datafiles in 0.6 that are uncleanable in 0.7 so
that you all can repeat this and hopefully fix it.
/Henrik Schröder
On Sat, May 7, 2011 at 00:35, Jeremy Hanna wrote:
> If you're able, go into the #cassandra channel on freenode (IRC) and talk
> to driftx or jbellis or aaron_mort
I'll see if I can make some example broken files this weekend.
/Henrik Schröder
On Fri, May 6, 2011 at 02:10, aaron morton wrote:
> The difficulty is the different thrift clients between 0.6 and 0.7.
>
> If you want to roll your own solution I would consider:
> - write an a
Uhm, having a program that can talk to 0.6 and 0.7 servers at the same time
is not the hard problem; it took way less than five minutes to copy both
generated clients into the same project and rename the C# namespaces. Two
apps that write to disk in between? Maven? That's crazy talk. :-D
What I was
Are we correct in assuming that if we use the consistency level ALL we'll get
all rows?
/Henrik Schröder
e)
> {
> throw new RuntimeException(SSTableScanner.this + " failed
> to provide next columns from " + this, e);
> }
> }
>
> The string key is new String(ByteBufferUtil.getArray(key.key), "UTF-8")
> If y
eys throws the error, cleanup throws the
error, etc.
/Henrik Schröder
On Thu, May 5, 2011 at 13:57, aaron morton wrote:
> The hard core way to fix the data is export to json with sstable2json, hand
> edit, and then json2sstable it back.
>
> Also to confirm, this only happens when
reacts.
/Henrik
On Wed, May 4, 2011 at 18:53, Daniel Doubleday wrote:
> This is a bit of a wild guess but Windows and encoding and 0.7.5 sounds
> like
>
> https://issues.apache.org/jira/browse/CASSANDRA-2367
>
> <https://issues.apache.org/jira/browse/CASSANDRA-2367>
> On May 3, 2011, at 5:15 PM, Henrik Schröder wrote:
>
> Hey everyone,
>
> We did some tests before upgrading our Cassandra
now we're trying to figure out how to remove the rows that get
corrupted by upgrading, hopefully it can be solved.
/Henrik Schröder
On Tue, May 3, 2011 at 17:19, Henrik Schröder wrote:
> The way we solved this problem is that it turned out we had only a few
> hundred rows with unico
95b0e69982e99693": [["00", "02", 1304521931818, false]],
"666f6f": [["00", "01", 1304519721274, false]]
}
So I now have an SSTable with two rows with identical keys, except one of
the rows doesn't really work? So, now what? And how did I e
a problem in the future? Is there a chance that the good
duplicate is cleaned out in favour of the bad duplicate so that we suddenly
lose those rows again?
/Henrik Schröder
but apparently it
isn't.
Has anyone else experienced the same problem? Is it a platform-specific
problem? Is there a way to avoid this and upgrade from 0.6 to 0.7 and not
lose any rows? I would also really like to know which byte-array I should
send in to get back that second row, there's gotta be some key that can be
used to get it, the row is still there after all.
/Henrik Schröder
On Mon, Mar 29, 2010 at 14:15, Jonathan Ellis wrote:
> On Mon, Mar 29, 2010 at 4:06 AM, Henrik Schröder
> wrote:
> > On Fri, Mar 26, 2010 at 14:47, Jonathan Ellis wrote:
> >> It's a unique index then? And you're trying to read things ordered by
> >> the
On Fri, Mar 26, 2010 at 14:47, Jonathan Ellis wrote:
> On Fri, Mar 26, 2010 at 7:40 AM, Henrik Schröder
> wrote:
> > For each indexvalue we insert a row where the key is indexid + ":" +
> > indexvalue encoded as hex string, and the row contains only one column,
>
>
> So all the values for an entire index will be in one row? That
> doesn't sound good.
>
> You really want to put each index [and each table] in its own CF, but
> until we can do that dynamically (0.7) you could at least make the
> index row keys a tuple of (indexid, indexvalue) and the column n
On Thu, Mar 25, 2010 at 15:17, Sylvain Lebresne wrote:
> I don't know If that could play any role, but if ever you have
> disabled the assertions
> when running cassandra (that is, you removed the -ea line in
> cassandra.in.sh), there
> was a bug in 0.6beta2 that will make read in row with lots o
Hi everyone,
We're trying to implement a virtual datastore for our users where they can
set up "tables" and "indexes" to store objects and have them indexed on
arbitrary properties. And we did a test implementation for Cassandra in the
following way:
Objects are stored in one columnfamily, each k