Re: Backup full cluster

2011-05-04 Thread Peter Schuller
> Snapshot runs on a local node. How do I ensure I have a 'point in
> time' snapshot of the full cluster? Do I have to stop the writes on
> the full cluster and then snapshot all the nodes individually?

You don't. By backing up individual nodes you can do a full cluster recovery that is eventual
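
For illustration, a minimal sketch of snapshotting every node under a single tag over JMX. The host list and JMX port are made up, and the MBean operation name ("takeAllSnapshot", as in the 0.6/0.7-era StorageServiceMBean) is an assumption to verify against your version:

    import javax.management.*;
    import javax.management.remote.*;
    import java.util.*;

    public class ClusterSnapshot {
        public static void main(String[] args) throws Exception {
            String tag = "backup-" + System.currentTimeMillis();
            // Hypothetical host list; in practice take it from nodetool ring.
            for (String host : Arrays.asList("10.0.0.1", "10.0.0.2", "10.0.0.3")) {
                JMXServiceURL url = new JMXServiceURL(
                        "service:jmx:rmi:///jndi/rmi://" + host + ":8080/jmxrmi");
                JMXConnector jmx = JMXConnectorFactory.connect(url);
                try {
                    MBeanServerConnection mbs = jmx.getMBeanServerConnection();
                    ObjectName ss = new ObjectName("org.apache.cassandra.db:type=StorageService");
                    // Snapshot all keyspaces on this node under the shared tag.
                    mbs.invoke(ss, "takeAllSnapshot",
                               new Object[]{tag}, new String[]{"java.lang.String"});
                } finally {
                    jmx.close();
                }
            }
        }
    }

Since the nodes are snapshotted one after another rather than simultaneously, the result is exactly the "eventually consistent" cluster-wide backup described above, not a strict point in time.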

Re: Backup full cluster

2011-05-04 Thread Peter Schuller
> would help here too in a very simple way... will try to file a JIRA
> for that.

No I won't, it's a broken suggestion (because you'd lose data; write 1 before the cut-off will be overwritten by write 2 after the cut-off, but starting up a node with the cut-off will then see neither). Note to self again: Thi

Re: MemtablePostFlusher with high number of pending calls?

2011-05-04 Thread Terje Marthinussen
Partially, I guess this may be a side effect of multithreaded compactions? Before running out of space completely, I do see a few of these:

WARN [CompactionExecutor:448] 2011-05-02 01:08:10,480 CompactionManager.java (line 516) insufficient space to compact all requested files SSTableReader(path=

Re: Decommissioning node is causing broken pipe error

2011-05-04 Thread aaron morton
It's no longer recommended to run nodetool compact regularly as it can mean that some tombstones do not get to be purged for a very long time. Minor compaction is all you need to keep things in check; however, 648 seems like a lot of SSTables. Were some of these compacted files? How many SSTable

Re: Problems recovering a dead node

2011-05-04 Thread aaron morton
Certainly sounds a bit sick. The first error looks like it happens when the index file points to the wrong place in the data file for the SSTable. The second one happens when the index file is corrupted. These should be problems nodetool scrub can fix. The disk space may be dead space to cassand

Re: Backup full cluster

2011-05-04 Thread aaron morton
Also see the recent discussion "Best way to backup" and this handy tool from simplegeo.com: https://github.com/simplegeo/tablesnap

- Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 4 May 2011, at 19:12, Peter Schuller wrote:
>> would help

Re: Problems recovering a dead node

2011-05-04 Thread Héctor Izquierdo Seliva
I'm sorry but I can't provide more detailed info as I have restarted the node. After that the number of pending tasks started at 40, and rapidly went down as compactions finished. After that, the ring looks ok, with all the nodes having about the same amount of data. There were no errors in the nod

Re: low performance inserting

2011-05-04 Thread charles THIBAULT
I have understood my error: I was comparing bulk insert in MySQL with batch_insert in Cassandra. I made these two tests from a remote machine. I have made the same test with one insert per row in MySQL and it took 4min28s, but the result I have with the stress utility seems to be abnormal, doesn't it

Re: Replica data distributing between racks

2011-05-04 Thread aaron morton
Eric,

Jonathan is suggesting the approach Jeremiah was using. Calculate the tokens for the nodes in each DC independently, and then add 1 to the tokens if there are two nodes with the same tokens. In your case with 2 DCs with 2 nodes each:

In DC 1
node 1 = 0
node 2 = 85
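
As a worked example of that arithmetic, a small standalone sketch (RandomPartitioner's 2^127 token space assumed, +1 offset per additional DC):

    import java.math.BigInteger;

    public class Tokens {
        public static void main(String[] args) {
            BigInteger ringSize = BigInteger.valueOf(2).pow(127);
            int nodesPerDc = 2, dcs = 2;
            for (int dc = 0; dc < dcs; dc++) {
                for (int n = 0; n < nodesPerDc; n++) {
                    // Evenly spaced tokens within the DC, shifted by +dc so
                    // no two DCs end up with an identical token.
                    BigInteger token = ringSize
                            .multiply(BigInteger.valueOf(n))
                            .divide(BigInteger.valueOf(nodesPerDc))
                            .add(BigInteger.valueOf(dc));
                    System.out.println("DC " + (dc + 1) + " node " + (n + 1) + " = " + token);
                }
            }
        }
    }

This prints 0 and 2^127/2 for DC 1, and the same two tokens plus one for DC 2.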

Re: Problems recovering a dead node

2011-05-04 Thread Héctor Izquierdo Seliva
> On Wed, 04-05-2011 at 21:02 +1200, aaron morton wrote:
> > Certainly sounds a bit sick.
> >
> > The first error looks like it happens when the index file points to the
> > wrong place in the data file for the SSTable. The second one happens when
> > the index file is corrupted. The

Re: MemtablePostFlusher with high number of pending calls?

2011-05-04 Thread Jonathan Ellis
Or we could "reserve" space when starting a compaction.

On Wed, May 4, 2011 at 2:32 AM, Terje Marthinussen wrote:
> Partially, I guess this may be a side effect of multithreaded compactions?
> Before running out of space completely, I do see a few of these:
>  WARN [CompactionExecutor:448] 2011-0
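
Nothing like this exists in 0.7; purely a toy sketch, with invented names, of what reserving space per compaction could look like:

    import java.util.concurrent.atomic.AtomicLong;

    public class DiskBudget {
        private final AtomicLong free;

        public DiskBudget(long freeBytes) {
            this.free = new AtomicLong(freeBytes);
        }

        /** Try to claim estimatedBytes for a compaction; false means "don't start". */
        public boolean reserve(long estimatedBytes) {
            long cur;
            do {
                cur = free.get();
                if (cur < estimatedBytes) {
                    return false; // insufficient space: skip or shrink the compaction
                }
            } while (!free.compareAndSet(cur, cur - estimatedBytes));
            return true;
        }

        /** Return the difference once the real output size is known. */
        public void release(long estimatedBytes, long actualBytes) {
            free.addAndGet(estimatedBytes - actualBytes);
        }
    }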

Re: Replica data distributing between racks

2011-05-04 Thread Eric tamme
> Jonathan is suggesting the approach Jeremiah was using.
>
> Calculate the tokens for the nodes in each DC independently, and then add
> 1 to the tokens if there are two nodes with the same tokens.
>
> In your case with 2 DCs with 2 nodes each.
>
> In DC 1
> node 1 = 0
> node 2

Re: Write performance help needed

2011-05-04 Thread Steve Smith
Since each row in my column family has 30 columns, wouldn't this translate to ~8,000 rows per second... or am I misunderstanding something? Talking in terms of columns, my load test would seem to perform as follows: 100,000 rows / 26 sec * 30 columns/row = 115K columns per second. That's on a dua

Re: Replica data distributing between racks

2011-05-04 Thread Konstantin Naryshkin
The way that I understand it (and that seems to be consistent with what was said in this discussion) is that each DC has its own data space. Using your simplified 1-10 system:

   DC1   DC2
0  D1R1  D2R2
1  D1R1  D2R1
2  D1R1  D2R1
3  D1R1  D2R1
4  D1R1  D2R1
5  D1R2  D2R1
6  D1R2  D2R2
7  D1R2  D
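
A toy sketch (invented names) that reproduces the table above, under its simplified ownership direction: each node owns tokens from its own token up to the next node's token within the same DC:

    import java.util.Map;
    import java.util.NavigableMap;
    import java.util.TreeMap;

    public class PerDcPlacement {
        // Owner in a DC = node with the largest token <= the key's token,
        // wrapping around to the highest token.
        static String replicaFor(int token, NavigableMap<Integer, String> dcRing) {
            Map.Entry<Integer, String> e = dcRing.floorEntry(token);
            return (e != null ? e : dcRing.lastEntry()).getValue();
        }

        public static void main(String[] args) {
            NavigableMap<Integer, String> dc1 = new TreeMap<Integer, String>();
            dc1.put(0, "D1R1"); dc1.put(5, "D1R2");
            NavigableMap<Integer, String> dc2 = new TreeMap<Integer, String>();
            dc2.put(1, "D2R1"); dc2.put(6, "D2R2");
            for (int t = 0; t <= 9; t++) {
                System.out.println(t + "  " + replicaFor(t, dc1) + "  " + replicaFor(t, dc2));
            }
        }
    }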

Re: MemtablePostFlusher with high number of pending calls?

2011-05-04 Thread Terje Marthinussen
Yes, some sort of data structure to coordinate this could reduce the problem as well. I made some comments on that at the end of 2558.

I believe a coordinator could be in place both to
- plan the start of compaction and
- to coordinate compaction thread shutdown and tmp file deletion before we com

Re: Replica data distributing between racks

2011-05-04 Thread Eric tamme
On Wed, May 4, 2011 at 10:09 AM, Konstantin Naryshkin wrote:
> The way that I understand it (and that seems to be consistent with what was
> said in this discussion) is that each DC has its own data space. Using your
> simplified 1-10 system:
>   DC1   DC2
> 0  D1R1  D2R2
> 1  D1R1  D2R1
> 2  D

Re: Unicode key encoding problem when upgrading from 0.6.13 to 0.7.5

2011-05-04 Thread Henrik Schröder
My two keys that I send in my test program are 0xe695b0e69982e99693 and 0x666f6f, which decode to "数時間" and "foo" respectively. So I ran my tests again: I started with a clean 0.6.13, wrote two rows with those two keys, drained, shut down, started 0.7.5, and imported my keyspace. In my test prog
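
As a standalone sanity check of that decoding claim:

    import java.nio.charset.Charset;

    public class KeyDecode {
        static byte[] fromHex(String hex) {
            byte[] out = new byte[hex.length() / 2];
            for (int i = 0; i < out.length; i++) {
                out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
            }
            return out;
        }

        public static void main(String[] args) {
            // Prints the two keys' UTF-8 decodings: 数時間 and foo.
            System.out.println(new String(fromHex("e695b0e69982e99693"), Charset.forName("UTF-8")));
            System.out.println(new String(fromHex("666f6f"), Charset.forName("UTF-8")));
        }
    }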

Re: Unicode key encoding problem when upgrading from 0.6.13 to 0.7.5

2011-05-04 Thread Henrik Schröder
Those few hundred duplicated rows turned out to be a HUGE problem: the automatic compactions started throwing "Keys must be written in ascending order." errors, and successive failing compactions started to fill the disk until it ran out of diskspace, which was a bit sad. Right now we're trying

Node setup - recommended hardware

2011-05-04 Thread Anthony Ikeda
I just want to ask: when setting up nodes in a node ring, is it worthwhile using a 2-partition setup, i.e. Cassandra on the primary partition and the data directories etc. on the second partition, or does it really not make a difference? Anthony

Re: Node setup - recommended hardware

2011-05-04 Thread Edward Capriolo
On Wed, May 4, 2011 at 12:25 PM, Anthony Ikeda wrote:
> I just want to ask: when setting up nodes in a node ring, is it worthwhile
> using a 2-partition setup, i.e. Cassandra on the primary partition and the
> data directories etc. on the second partition, or does it really not make a
> difference?
> Anthony

Putting

Re: Node setup - recommended hardware

2011-05-04 Thread Eric tamme
On Wed, May 4, 2011 at 12:25 PM, Anthony Ikeda wrote:
> I just want to ask: when setting up nodes in a node ring, is it worthwhile
> using a 2-partition setup, i.e. Cassandra on the primary partition and the
> data directories etc. on the second partition, or does it really not make a
> difference?
> Anthony

I don'

Re: Node setup - recommended hardware

2011-05-04 Thread Anthony Ikeda
I'm not concerned so much about the performance with this configuration; I'm looking at it more from a maintenance perspective. I have to draft some maintenance procedures for our infrastructure team, who are used to a standard NAS storage setup, which Cassandra obviously breaks. Ultimately, would keeping the ca

Re: Unicode key encoding problem when upgrading from 0.6.13 to 0.7.5

2011-05-04 Thread Daniel Doubleday
This is a bit of a wild guess, but Windows and encoding and 0.7.5 sounds like https://issues.apache.org/jira/browse/CASSANDRA-2367

On May 3, 2011, at 5:15 PM, Henrik Schröder wrote:
> Hey everyone,
>
> We did some tests before upgrading our Cassandra cluster from 0.6 to 0.7,
> just to make su

Re: Node setup - recommended hardware

2011-05-04 Thread Serediuk, Adam
Having a well known node configuration that is trivial (one step) to create is your best maintenance bet. We are using 4-disk nodes in the following configuration:

disk1: boot_raid1 os_raid1 cassandra_commit_log
disk2: boot_raid1 os_raid1 cassandra_data_dir_raid0
disk3: cassandra_data_dir_raid0

Re: Node setup - recommended hardware

2011-05-04 Thread Anthony Ikeda
Thanks Adam.

On Wed, May 4, 2011 at 10:02 AM, Serediuk, Adam <adam.sered...@serialssolutions.com> wrote:
> Having a well known node configuration that is trivial (one step) to create
> is your best maintenance bet. We are using 4-disk nodes in the following
> configuration:
>
> disk1: boot_rai

Native heap leaks?

2011-05-04 Thread Hannes Schmidt
Hi, We are using Cassandra 0.6.12 in a cluster of 9 nodes. Each node is 64-bit, has 4 cores and 4G of RAM and runs on Ubuntu Lucid with the stock 2.6.32-31-generic kernel. We use the Sun/Oracle JDK. Here's the problem: The Cassandra process starts up with 1.1G resident memory (according to top) b

Making a custom Cassandra RPM

2011-05-04 Thread Konstantin Naryshkin
I want to create a custom RPM of Cassandra (so I can deploy it pre-configured). There is an RPM spec in the source tree, but it does not contain any details of the setup required to create the RPM (what files should I have where). I have tried to run rpmbuild -bi on the spec file and I am getting the

Re: Making a custom Cassandra RPM

2011-05-04 Thread Nick Bailey
Your Apache Ant install is too old. The Ant that comes with RHEL/CentOS 5.x isn't new enough to build Cassandra. You will need to install Ant manually.

On Wed, May 4, 2011 at 2:01 PM, Konstantin Naryshkin wrote:
> I want to create a custom RPM of Cassandra (so I can deploy it
> pre-configured).

me.prettyprint.hector.api.exceptions.HInvalidRequestException: InvalidRequestException(why:Cluster schema does not yet agree)

2011-05-04 Thread Dikang Gu
I got this exception when I was trying to create a new column family using the Hector API:

me.prettyprint.hector.api.exceptions.HInvalidRequestException: InvalidRequestException(why:Cluster schema does not yet agree)

What does this mean and how do I resolve it? I have 3 nodes in the cassandra 0.7.4 c
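
For context, a hedged sketch of the usual client-side workaround: poll the Thrift describe_schema_versions() call (Hector wraps the same call) until only one live schema version remains before issuing the next schema change. The host/port and the "UNREACHABLE" sentinel key are assumptions against the 0.7 API:

    import java.util.List;
    import java.util.Map;
    import org.apache.cassandra.thrift.Cassandra;
    import org.apache.thrift.protocol.TBinaryProtocol;
    import org.apache.thrift.transport.TFramedTransport;
    import org.apache.thrift.transport.TSocket;

    public class SchemaWait {
        public static void waitForAgreement(Cassandra.Client client) throws Exception {
            while (true) {
                // Maps schema version -> endpoints reporting it; down nodes
                // are believed to be grouped under the key "UNREACHABLE".
                Map<String, List<String>> versions = client.describe_schema_versions();
                versions.remove("UNREACHABLE");
                if (versions.size() == 1) {
                    return; // all live nodes agree
                }
                Thread.sleep(1000);
            }
        }

        public static void main(String[] args) throws Exception {
            TSocket socket = new TSocket("localhost", 9160);
            TFramedTransport transport = new TFramedTransport(socket);
            transport.open();
            waitForAgreement(new Cassandra.Client(new TBinaryProtocol(transport)));
            transport.close();
        }
    }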

Re: me.prettyprint.hector.api.exceptions.HInvalidRequestException: InvalidRequestException(why:Cluster schema does not yet agree)

2011-05-04 Thread Tyler Hobbs
The issue is quite possibly this: https://issues.apache.org/jira/browse/CASSANDRA-2536

A person on the ticket commented that decommissioning and rejoining the node with the disagreeing schema solved the issue.

On Thu, May 5, 2011 at 12:40 AM, Dikang Gu wrote:
> I got this exception when I was try

Re: Decommissioning node is causing broken pipe error

2011-05-04 Thread Peter Schuller
> It's no longer recommended to run nodetool compact regularly as it can mean
> that some tombstones do not get to be purged for a very long time.

I think this is a mis-typing; it used to be that major compactions were necessary to remove tombstones, but this is no longer the case in 0.7, so that t

Re: me.prettyprint.hector.api.exceptions.HInvalidRequestException: InvalidRequestException(why:Cluster schema does not yet agree)

2011-05-04 Thread Dikang Gu
Thanks, I will try the decommission and rejoin.

On Thu, May 5, 2011 at 1:43 PM, Tyler Hobbs wrote:
> The issue is quite possibly this:
> https://issues.apache.org/jira/browse/CASSANDRA-2536
>
> A person on the ticket commented that decommissioning and rejoining the node
> with the disagreeing sche

Re: Native heap leaks?

2011-05-04 Thread Oleg Anastasyev
Probably this is because of the mmapped I/O access mode, which is enabled by default on 64-bit JVMs - RAM is occupied by memory-mapped data files. If you have such tight memory requirements, you can turn on the standard access mode in storage-conf.xml, but don't expect it to work as fast then:
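
A sketch of the relevant 0.6-era storage-conf.xml setting (element name as I recall it from that version's sample config; verify against your own file):

    <!-- "auto" mmaps data and index files on 64-bit JVMs; "standard" avoids
         mmap entirely at a read-performance cost; "mmap_index_only" is a
         middle ground that maps only the index files. -->
    <DiskAccessMode>standard</DiskAccessMode>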

Re: Decommissioning node is causing broken pipe error

2011-05-04 Thread Tyler Hobbs
On Thu, May 5, 2011 at 1:21 AM, Peter Schuller wrote:
> > It's no longer recommended to run nodetool compact regularly as it can mean
> > that some tombstones do not get to be purged for a very long time.
>
> I think this is a mis-typing; it used to be that major compactions
> were necessary to