Re: Fill disks more than 50%

2011-02-25 Thread Terje Marthinussen
 I am suggesting that you probably want to rethink your schema design,
 since partitioning by year is going to hurt performance: the old
 servers will be nothing more than expensive tape drives.


You fail to see the obvious.

It is just the fact that most of the data is stale that makes the question
interesting in the first place; I would obviously not have asked if there
were an I/O throughput problem in doing this.

That said, we tested a repair on a set of nodes that were 70-80% full,
with no luck. We ran out of disk :(

Terje


Re: Fill disks more than 50%

2011-02-25 Thread Terje Marthinussen


 @Thibaut Britz
 Caveat: this assumes SimpleStrategy.
 This works because Cassandra scans its data directories at startup and
 then serves what it finds. For a join, for example, you can rsync all
 the data from the node below/to the right of where the new node is
 joining, join without bootstrap, and then run cleanup on both nodes.
 (You also have to shut down the first node so you do not have a
 lost-write scenario between the rsync and the new node's startup.)


rsync all data from the node to the left/right...
Wouldn't that mean that you need 2x the data to recover...?

Terje


Re: Fill disks more than 50%

2011-02-25 Thread Edward Capriolo
On Fri, Feb 25, 2011 at 7:38 AM, Terje Marthinussen
tmarthinus...@gmail.com wrote:

 @Thibaut Britz
 Caveat: this assumes SimpleStrategy.
 This works because Cassandra scans its data directories at startup and
 then serves what it finds. For a join, for example, you can rsync all
 the data from the node below/to the right of where the new node is
 joining, join without bootstrap, and then run cleanup on both nodes.
 (You also have to shut down the first node so you do not have a
 lost-write scenario between the rsync and the new node's startup.)


 rsync all data from the node to the left/right...
 Wouldn't that mean that you need 2x the data to recover...?
 Terje

Terje,

In your scenario, where you never update data, running repair becomes
less important. I have an alternative for you: a program I call
RescueRanger. We use it to range-scan all our data, find old entries,
and delete them. However, if we set that program to read-only mode and
tell it to read at CL.ALL, it becomes a program that read-repairs data!

This is a tradeoff. Range scanning through all your data is not fast,
but it does not require the extra disk space. Kinda like merge sort vs
bubble sort.
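RescueRanger itself is not a public tool, but the mechanism it relies on can be sketched with an in-memory stand-in for three replicas: a read at CL.ALL queries every replica, reconciles the answers by timestamp, and pushes the winning value back to stale replicas. All names and values below are illustrative.

```python
from dataclasses import dataclass

@dataclass
class Cell:
    value: str
    timestamp: int  # writetime; highest wins on reconciliation

# Three replicas holding divergent copies of key "k1".
replicas = [
    {"k1": Cell("old", 100)},
    {"k1": Cell("new", 200)},
    {},  # this replica missed the write entirely
]

def read_all(key):
    """Read at CL.ALL: query every replica, reconcile by timestamp,
    then push the winning cell to any stale replica (read repair)."""
    answers = [(i, r.get(key)) for i, r in enumerate(replicas)]
    winner = max((c for _, c in answers if c), key=lambda c: c.timestamp)
    for i, c in answers:
        if c is None or c.timestamp < winner.timestamp:
            replicas[i][key] = winner  # the repair write
    return winner.value

print(read_all("k1"))                                  # -> new
print(all(r["k1"].value == "new" for r in replicas))   # -> True
```

Scanning every key this way repairs the whole dataset without the extra SSTable copies a repair would stream, which is exactly the space/time tradeoff described above.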


Re: Fill disks more than 50%

2011-02-24 Thread Thibaut Britz
Hi,

How would you use rsync instead of repair in case of a node failure?

Rsync all files from the data directories of the adjacent nodes
(which are part of the quorum group) and then run a compaction, which
will remove all the unneeded keys?

Thanks,
Thibaut


On Thu, Feb 24, 2011 at 4:22 AM, Edward Capriolo edlinuxg...@gmail.com wrote:
 On Wed, Feb 23, 2011 at 9:39 PM, Terje Marthinussen
 tmarthinus...@gmail.com wrote:
 Hi,
 Given that you have always-increasing key values (timestamps), never
 delete, and hardly ever overwrite data.
 If you want to minimize rebalancing work, you could statically assign
 (new) token ranges to new nodes as you add them, so they always get the
 latest data.
 Let's say you add a new node each year to handle next year's data.
 In a scenario like this, could you, with 0.7, safely fill disks
 significantly more than 50% and still manage things like repair/recovery
 of faulty nodes?

 Regards,
 Terje

 Since all your data for a given day/month/year would sit on the same
 server, all your servers with old data would be idle while your servers
 with current data would be very busy. This is probably not a good way
 to go.

 There is a ticket open for 0.8 for efficient node moves/joins. It is
 already a lot better in 0.7. Pretend you did not see this (you can
 join nodes using rsync if you know some tricks) if you are really
 afraid of joins, which you really should not be.

 As for the 50% statement: in the worst case, a major compaction
 requires free space equal to the size of the column family being
 compacted. So if you have more than one column family, you do NOT need
 50% overhead.



Re: Fill disks more than 50%

2011-02-24 Thread Edward Capriolo
On Thu, Feb 24, 2011 at 4:08 AM, Thibaut Britz
thibaut.br...@trendiction.com wrote:
 Hi,

 How would you use rsync instead of repair in case of a node failure?

 Rsync all files from the data directories of the adjacent nodes
 (which are part of the quorum group) and then run a compaction, which
 will remove all the unneeded keys?

 Thanks,
 Thibaut


 On Thu, Feb 24, 2011 at 4:22 AM, Edward Capriolo edlinuxg...@gmail.com 
 wrote:
 On Wed, Feb 23, 2011 at 9:39 PM, Terje Marthinussen
 tmarthinus...@gmail.com wrote:
 Hi,
 Given that you have always-increasing key values (timestamps), never
 delete, and hardly ever overwrite data.
 If you want to minimize rebalancing work, you could statically assign
 (new) token ranges to new nodes as you add them, so they always get the
 latest data.
 Let's say you add a new node each year to handle next year's data.
 In a scenario like this, could you, with 0.7, safely fill disks
 significantly more than 50% and still manage things like repair/recovery
 of faulty nodes?

 Regards,
 Terje

 Since all your data for a given day/month/year would sit on the same
 server, all your servers with old data would be idle while your servers
 with current data would be very busy. This is probably not a good way
 to go.

 There is a ticket open for 0.8 for efficient node moves/joins. It is
 already a lot better in 0.7. Pretend you did not see this (you can
 join nodes using rsync if you know some tricks) if you are really
 afraid of joins, which you really should not be.

 As for the 50% statement: in the worst case, a major compaction
 requires free space equal to the size of the column family being
 compacted. So if you have more than one column family, you do NOT need
 50% overhead.


@Thibaut Britz
Caveat: this assumes SimpleStrategy.
This works because Cassandra scans its data directories at startup and
then serves what it finds. For a join, for example, you can rsync all
the data from the node below/to the right of where the new node is
joining, join without bootstrap, and then run cleanup on both nodes.
(You also have to shut down the first node so you do not have a
lost-write scenario between the rsync and the new node's startup.)

It does not make as much sense for repair, because the data on a node
will triple before you compact/cleanup it.
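The join-by-rsync steps above can be laid out as an ordered command plan. Hostnames, paths, and service commands below are hypothetical, and the plan is only printed (a dry run), so the ordering is easy to review before anyone runs it.

```python
SRC = "node-b"                      # node to the right of the join point
DST = "node-new"                    # new node, to start with bootstrap off
DATA = "/var/lib/cassandra/data"

plan = [
    # 1. bulk copy while SRC is still serving
    f"rsync -av {SRC}:{DATA}/ {DST}:{DATA}/",
    # 2. stop SRC so no write lands between the copy and DST's startup
    f"ssh {SRC} 'service cassandra stop'",
    # 3. catch-up copy for files flushed since step 1
    f"rsync -av {SRC}:{DATA}/ {DST}:{DATA}/",
    # 4. start DST without bootstrap, then clean up both nodes
    f"ssh {DST} 'service cassandra start'",
    f"nodetool -h {DST} cleanup",
    f"nodetool -h {SRC} cleanup",   # after SRC has been restarted
]

for cmd in plan:
    print(cmd)
```

Step 2 is the part the caveat insists on: if SRC keeps serving after the copy, writes accepted in that window exist only on SRC and are lost to DST.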

@Terje
I am suggesting that you probably want to rethink your schema design,
since partitioning by year is going to hurt performance: the old
servers will be nothing more than expensive tape drives.


Re: Fill disks more than 50%

2011-02-23 Thread Edward Capriolo
On Wed, Feb 23, 2011 at 9:39 PM, Terje Marthinussen
tmarthinus...@gmail.com wrote:
 Hi,
 Given that you have always-increasing key values (timestamps), never
 delete, and hardly ever overwrite data.
 If you want to minimize rebalancing work, you could statically assign
 (new) token ranges to new nodes as you add them, so they always get the
 latest data.
 Let's say you add a new node each year to handle next year's data.
 In a scenario like this, could you, with 0.7, safely fill disks
 significantly more than 50% and still manage things like repair/recovery
 of faulty nodes?

 Regards,
 Terje

Since all your data for a given day/month/year would sit on the same
server, all your servers with old data would be idle while your servers
with current data would be very busy. This is probably not a good way
to go.
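The hot-spot objection can be made concrete: with an order-preserving partitioner and one static token range per year, every key written "now" lands on the newest node. Node names and years below are illustrative.

```python
import bisect

# Static token boundaries: node i owns keys whose year is below
# boundaries[i] (exclusive upper bounds).
boundaries = [2009, 2010, 2011, 2012]
nodes      = ["node-2008", "node-2009", "node-2010", "node-2011"]

def node_for(year):
    """Route a timestamp-keyed write to the node owning that year."""
    return nodes[bisect.bisect_right(boundaries, year)]

# Simulate a day of current traffic: every write is for this year.
load = {}
for year in [2011] * 1000:
    load[node_for(year)] = load.get(node_for(year), 0) + 1

print(load)   # -> {'node-2011': 1000}: older nodes sit completely idle
```

All 1000 writes hit the newest node, which is the "expensive tape drives" problem: the other three machines store data but do almost no work.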

There is a ticket open for 0.8 for efficient node moves/joins. It is
already a lot better in 0.7. Pretend you did not see this (you can
join nodes using rsync if you know some tricks) if you are really
afraid of joins, which you really should not be.

As for the 50% statement: in the worst case, a major compaction
requires free space equal to the size of the column family being
compacted. So if you have more than one column family, you do NOT need
50% overhead.
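The 50% point is simple arithmetic: a major compaction rewrites one column family at a time, so the headroom you need is set by the largest single column family, not by total disk usage. The sizes below are made-up numbers for illustration.

```python
def headroom_fraction(cf_sizes_gb, disk_gb):
    """Free-space fraction needed so the biggest column family can be
    rewritten in full during a major compaction."""
    return max(cf_sizes_gb) / disk_gb

# One 400 GB CF on a 1 TB disk: 40% free, close to the "50%" rule.
print(headroom_fraction([400], 1000))                  # -> 0.4

# Four 100 GB CFs on the same disk: only the largest (100 GB) is
# rewritten at once, so 10% free suffices in the worst case.
print(headroom_fraction([100, 100, 100, 100], 1000))   # -> 0.1
```

The more evenly your data is split across column families, the further below 50% the required overhead falls.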