RE: Cleanup

2023-02-17 Thread Durity, Sean R via user
Cleanup, by itself, uses all the compactors available, so it is important to see 
whether you have the disk space for multiple large cleanup compactions running 
at the same time. We have a utility to do cleanup more intelligently – it 
temporarily doubles compaction throughput, operates on a single keyspace, sorts 
tables by size ascending, and runs only 1 thread (-j 1) at a time to protect 
against the issue of multiple large compactions running simultaneously. It also 
verifies that there is enough disk space to handle the largest sstable of the 
table about to be cleaned up.

It works very well in the use cases where we have a stair step arrangement of 
table sizes. We recover space from smaller tables and work up to the largest 
ones with whatever extra space we have acquired.
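
A very rough sketch of that kind of wrapper (not the actual utility; keyspace 
name and paths are hypothetical, GNU du/find/df are assumed, and the -j flag 
needs nodetool from 2.1.14 or later) might look like this:

```
#!/usr/bin/env bash
# Sketch only: boost throughput, clean one keyspace's tables smallest-first,
# and check free space against the largest SSTable before each table.
set -euo pipefail

KEYSPACE="my_keyspace"                        # hypothetical keyspace
DATA_DIR="/var/lib/cassandra/data/$KEYSPACE"

ORIG=$(nodetool getcompactionthroughput | grep -o '[0-9]\+' | head -1)
nodetool setcompactionthroughput $((ORIG * 2))   # temporary boost (0 stays unthrottled)

for dir in $(du -s "$DATA_DIR"/* | sort -n | awk '{print $2}'); do   # tables by size, ascending
  table=$(basename "$dir" | cut -d '-' -f 1)
  largest=$(find "$dir" -name '*-Data.db' -printf '%s\n' | sort -n | tail -1)
  free=$(df --output=avail -B1 "$DATA_DIR" | tail -1)
  if [ "${largest:-0}" -gt "$free" ]; then
    echo "Skipping $KEYSPACE.$table: largest SSTable exceeds free space"
    continue
  fi
  echo "Cleaning up $KEYSPACE.$table ..."
  nodetool cleanup -j 1 "$KEYSPACE" "$table"     # one compaction thread at a time
done

nodetool setcompactionthroughput "$ORIG"         # restore the original setting
```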


Sean R. Durity

From: Dipan Shah 
Sent: Friday, February 17, 2023 2:50 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Cleanup


Hi Marc,

Changes done using "nodetool setcompactionthroughput" will only be applicable 
till Cassandra service restart.

The throughput value will revert back to the settings inside cassandra.yaml 
post service restart.
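
For example (the yaml key name, compaction_throughput_mb_per_sec, is the 
3.x/4.0 spelling and may differ in other releases):

```
nodetool getcompactionthroughput       # e.g. "Current compaction throughput: 16 MB/s"
nodetool setcompactionthroughput 64    # effective immediately, but not persisted
# After a restart the value comes from cassandra.yaml again, e.g.:
#   compaction_throughput_mb_per_sec: 16
```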

On Fri, Feb 17, 2023 at 1:04 PM Marc Hoppins 
mailto:marc.hopp...@eset.com>> wrote:
…and if it is altered via nodetool, is it altered until manually changed or 
service restart, so must be manually put back?



From: Aaron Ploetz mailto:aaronplo...@gmail.com>>
Sent: Thursday, February 16, 2023 4:50 PM
To: user@cassandra.apache.org<mailto:user@cassandra.apache.org>
Subject: Re: Cleanup

EXTERNAL
So if I remember right, setting compaction_throughput_per_mb to zero 
effectively disables throttling, which means cleanup and compaction will run as 
fast as the instance will allow.  For normal use, I'd recommend capping that at 
8 or 16.

Aaron


On Thu, Feb 16, 2023 at 9:43 AM Marc Hoppins 
mailto:marc.hopp...@eset.com>> wrote:
Compaction_throughput_per_mb is 0 in cassandra.yaml. Is setting it in nodetool 
going to provide any increase?

From: Durity, Sean R via user 
mailto:user@cassandra.apache.org>>
Sent: Thursday, February 16, 2023 4:20 PM
To: user@cassandra.apache.org<mailto:user@cassandra.apache.org>
Subject: RE: Cleanup

EXTERNAL
Clean-up is constrained/throttled by compactionthroughput. If your system can 
handle it, you can increase that throughput (nodetool setcompactionthroughput) 
for the clean-up in order to reduce the total time.

It is a node-isolated operation, not cluster-involved. I often run clean up on 
all nodes in a DC at the same time. Think of it as compaction and consider your 
cluster performance/workload/timelines accordingly.
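
A minimal way to kick that off across one DC (hostnames hypothetical; assumes 
ssh access and nodetool on every node):

```
for host in cass-dc1-01 cass-dc1-02 cass-dc1-03; do
  ssh "$host" 'nohup nodetool cleanup > /tmp/cleanup.out 2>&1 &'
done
# check progress per node with: ssh <host> nodetool compactionstats
```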

Sean R. Durity

From: manish khandelwal 
mailto:manishkhandelwa...@gmail.com>>
Sent: Thursday, February 16, 2023 5:05 AM
To: user@cassandra.apache.org<mailto:user@cassandra.apache.org>
Subject: [EXTERNAL] Re: Cleanup


There is no advantage of running cleanup if no new nodes are introduced. So 
cleanup time should remain the same when adding new nodes.

Cleanup is local to a node, so network bandwidth should have no effect on 
reducing cleanup time.

Don't ignore cleanup, as it can leave your disks occupied without any use.

You should plan to run cleanup in a lean period (low traffic). You can also use 
the keyspace and table name suboptions to plan it in such a way that I/O 
pressure stays low.
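
For example, two small off-peak runs could look like this (keyspace and table 
names hypothetical):

```
nodetool cleanup my_keyspace                 # one keyspace only
nodetool cleanup my_keyspace small_table     # a single table
```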


Regards
Manish

On Thu, Feb 16, 2023 at 3:12 PM Marc Hoppins 
mailto:marc.hopp...@eset.com>> wrote:
Hulloa all,

I read a thing re. adding new nodes where the recommendation was to run cleanup 
on the nodes after adding a new node to remove redundant token ranges.

I timed this way back when we only had ~20G of data per node and it took 
approx. 5 mins per node.  After adding a node on Tuesday, I figured I’d run 
cleanup.

Per node, it is taking 6+ hours now as we have 2-2.5T per node.

Should we be running cleanup regularly regardless of whether or not new nodes 
have been added?  Would it reduce cleanup times for when we do add new nodes?
If we double the network bandwidth can we effectively reduce this lengthy 
cleanup?
Maybe just ignore cleanup entirely?
I appreciate that cleanup will increase the load but running cleanup on one 
node at a time seems impractical.  How many simultaneous nodes (per rack) 
should we limit cleanup to?

More experienced suggestions would be most appreciated.

Marc




--

Thanks,

Dipan Shah

Data Engineer





Re: Cleanup

2023-02-16 Thread Dipan Shah
Hi Marc,

Changes done using "nodetool setcompactionthroughput" will only be
applicable till Cassandra service restart.

The throughput value will revert back to the settings inside cassandra.yaml
post service restart.

On Fri, Feb 17, 2023 at 1:04 PM Marc Hoppins  wrote:

> …and if it is altered via nodetool, is it altered until manually changed
> or service restart, so must be manually put back?
>
>
>
> *From:* Aaron Ploetz 
> *Sent:* Thursday, February 16, 2023 4:50 PM
> *To:* user@cassandra.apache.org
> *Subject:* Re: Cleanup
>
>
>
> EXTERNAL
>
> So if I remember right, setting compaction_throughput_per_mb to zero
> effectively disables throttling, which means cleanup and compaction will
> run as fast as the instance will allow.  For normal use, I'd recommend
> capping that at 8 or 16.
>
>
>
> Aaron
>
>
>
>
>
> On Thu, Feb 16, 2023 at 9:43 AM Marc Hoppins 
> wrote:
>
> Compaction_throughput_per_mb is 0 in cassandra.yaml. Is setting it in
> nodetool going to provide any increase?
>
>
>
> *From:* Durity, Sean R via user 
> *Sent:* Thursday, February 16, 2023 4:20 PM
> *To:* user@cassandra.apache.org
> *Subject:* RE: Cleanup
>
>
>
> EXTERNAL
>
> Clean-up is constrained/throttled by compactionthroughput. If your system
> can handle it, you can increase that throughput (nodetool
> setcompactionthroughput) for the clean-up in order to reduce the total time.
>
>
>
> It is a node-isolated operation, not cluster-involved. I often run clean
> up on all nodes in a DC at the same time. Think of it as compaction and
> consider your cluster performance/workload/timelines accordingly.
>
>
>
> Sean R. Durity
>
>
>
> *From:* manish khandelwal 
> *Sent:* Thursday, February 16, 2023 5:05 AM
> *To:* user@cassandra.apache.org
> *Subject:* [EXTERNAL] Re: Cleanup
>
>
>
>
>
>
> There is no advantage of running cleanup if no new nodes are introduced.
> So cleanup time should remain same when adding new nodes.
>
>
>
>  Cleanup is a local to node so network bandwidth should have no effect on
> reducing cleanup time.
>
>
>
>  Dont ignore cleanup as it can cause you disks occupied without any use.
>
>
>
>  You should plan to run cleanup in a lean period (low traffic). Also you
> can use suboptions of keyspace and table names to plan it such a way that
> I/O pressure is not much.
>
>
>
>
>
> Regards
>
> Manish
>
>
>
> On Thu, Feb 16, 2023 at 3:12 PM Marc Hoppins 
> wrote:
>
> Hulloa all,
>
>
>
> I read a thing re. adding new nodes where the recommendation was to run
> cleanup on the nodes after adding a new node to remove redundant token
> ranges.
>
>
>
> I timed this way back when we only had ~20G of data per node and it took
> approx. 5 mins per node.  After adding a node on Tuesday, I figured I’d run
> cleanup.
>
>
>
> Per node, it is taking 6+ hours now as we have 2-2.5T per node.
>
>
>
> Should we be running cleanup regularly regardless of whether or not new
> nodes have been added?  Would it reduce cleanup times for when we do add
> new nodes?
>
> If we double the network bandwidth can we effectively reduce this lengthy
> cleanup?
>
> Maybe just ignore cleanup entirely?
>
> I appreciate that cleanup will increase the load but running cleanup on
> one node at a time seems impractical.  How many simultaneous nodes (per
> rack) should we limit cleanup to?
>
>
>
> More experienced suggestions would be most appreciated.
>
>
> Marc
>
>
>
>
>

-- 

Thanks,

*Dipan Shah*

*Data Engineer*



3 Washington Circle NW, Suite 301

Washington, D.C. 20037


Check out our *blog* <https://blog.anant.us/>!


This email and any attachments to it may be confidential and are intended
solely for the use of the individual to whom it is addressed. Any views or
opinions expressed are solely those of the author and do not necessarily
represent those of Anant Corporation. If you are not the intended recipient
of this email, you must neither take any action based upon its contents,
nor copy or show it to anyone. Please contact the sender if you believe you
have received this email in error.


RE: Cleanup

2023-02-16 Thread Marc Hoppins
…and if it is altered via nodetool, is it altered until manually changed or 
service restart, so must be manually put back?

From: Aaron Ploetz 
Sent: Thursday, February 16, 2023 4:50 PM
To: user@cassandra.apache.org
Subject: Re: Cleanup

EXTERNAL
So if I remember right, setting compaction_throughput_per_mb to zero 
effectively disables throttling, which means cleanup and compaction will run as 
fast as the instance will allow.  For normal use, I'd recommend capping that at 
8 or 16.

Aaron


On Thu, Feb 16, 2023 at 9:43 AM Marc Hoppins 
mailto:marc.hopp...@eset.com>> wrote:
Compaction_throughput_per_mb is 0 in cassandra.yaml. Is setting it in nodetool 
going to provide any increase?

From: Durity, Sean R via user 
mailto:user@cassandra.apache.org>>
Sent: Thursday, February 16, 2023 4:20 PM
To: user@cassandra.apache.org<mailto:user@cassandra.apache.org>
Subject: RE: Cleanup

EXTERNAL
Clean-up is constrained/throttled by compactionthroughput. If your system can 
handle it, you can increase that throughput (nodetool setcompactionthroughput) 
for the clean-up in order to reduce the total time.

It is a node-isolated operation, not cluster-involved. I often run clean up on 
all nodes in a DC at the same time. Think of it as compaction and consider your 
cluster performance/workload/timelines accordingly.

Sean R. Durity

From: manish khandelwal 
mailto:manishkhandelwa...@gmail.com>>
Sent: Thursday, February 16, 2023 5:05 AM
To: user@cassandra.apache.org<mailto:user@cassandra.apache.org>
Subject: [EXTERNAL] Re: Cleanup


There is no advantage of running cleanup if no new nodes are introduced. So 
cleanup time should remain same when adding new nodes.

 Cleanup is a local to node so network bandwidth should have no effect on 
reducing cleanup time.

 Dont ignore cleanup as it can cause you disks occupied without any use.

 You should plan to run cleanup in a lean period (low traffic). Also you can 
use suboptions of keyspace and table names to plan it such a way that I/O 
pressure is not much.


Regards
Manish

On Thu, Feb 16, 2023 at 3:12 PM Marc Hoppins 
mailto:marc.hopp...@eset.com>> wrote:
Hulloa all,

I read a thing re. adding new nodes where the recommendation was to run cleanup 
on the nodes after adding a new node to remove redundant token ranges.

I timed this way back when we only had ~20G of data per node and it took 
approx. 5 mins per node.  After adding a node on Tuesday, I figured I’d run 
cleanup.

Per node, it is taking 6+ hours now as we have 2-2.5T per node.

Should we be running cleanup regularly regardless of whether or not new nodes 
have been added?  Would it reduce cleanup times for when we do add new nodes?
If we double the network bandwidth can we effectively reduce this lengthy 
cleanup?
Maybe just ignore cleanup entirely?
I appreciate that cleanup will increase the load but running cleanup on one 
node at a time seems impractical.  How many simultaneous nodes (per rack) 
should we limit cleanup to?

More experienced suggestions would be most appreciated.

Marc




Re: Cleanup

2023-02-16 Thread Aaron Ploetz
So if I remember right, setting compaction_throughput_per_mb to zero
effectively disables throttling, which means cleanup and compaction will
run as fast as the instance will allow.  For normal use, I'd recommend
capping that at 8 or 16.
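
In cassandra.yaml terms that is roughly the following (key name as of 3.x/4.0, 
check your version), and the same cap can be applied at runtime:

```
#   compaction_throughput_mb_per_sec: 0    # unthrottled: runs as fast as the node allows
#   compaction_throughput_mb_per_sec: 16   # a modest cap, as suggested above
nodetool setcompactionthroughput 16        # same cap, no restart needed
```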

Aaron


On Thu, Feb 16, 2023 at 9:43 AM Marc Hoppins  wrote:

> Compaction_throughput_per_mb is 0 in cassandra.yaml. Is setting it in
> nodetool going to provide any increase?
>
>
>
> *From:* Durity, Sean R via user 
> *Sent:* Thursday, February 16, 2023 4:20 PM
> *To:* user@cassandra.apache.org
> *Subject:* RE: Cleanup
>
>
>
> EXTERNAL
>
> Clean-up is constrained/throttled by compactionthroughput. If your system
> can handle it, you can increase that throughput (nodetool
> setcompactionthroughput) for the clean-up in order to reduce the total time.
>
>
>
> It is a node-isolated operation, not cluster-involved. I often run clean
> up on all nodes in a DC at the same time. Think of it as compaction and
> consider your cluster performance/workload/timelines accordingly.
>
>
>
> Sean R. Durity
>
>
>
> *From:* manish khandelwal 
> *Sent:* Thursday, February 16, 2023 5:05 AM
> *To:* user@cassandra.apache.org
> *Subject:* [EXTERNAL] Re: Cleanup
>
>
>
>
>
>
> There is no advantage of running cleanup if no new nodes are introduced.
> So cleanup time should remain same when adding new nodes.
>
>
>
>  Cleanup is a local to node so network bandwidth should have no effect on
> reducing cleanup time.
>
>
>
>  Dont ignore cleanup as it can cause you disks occupied without any use.
>
>
>
>  You should plan to run cleanup in a lean period (low traffic). Also you
> can use suboptions of keyspace and table names to plan it such a way that
> I/O pressure is not much.
>
>
>
>
>
> Regards
>
> Manish
>
>
>
> On Thu, Feb 16, 2023 at 3:12 PM Marc Hoppins 
> wrote:
>
> Hulloa all,
>
>
>
> I read a thing re. adding new nodes where the recommendation was to run
> cleanup on the nodes after adding a new node to remove redundant token
> ranges.
>
>
>
> I timed this way back when we only had ~20G of data per node and it took
> approx. 5 mins per node.  After adding a node on Tuesday, I figured I’d run
> cleanup.
>
>
>
> Per node, it is taking 6+ hours now as we have 2-2.5T per node.
>
>
>
> Should we be running cleanup regularly regardless of whether or not new
> nodes have been added?  Would it reduce cleanup times for when we do add
> new nodes?
>
> If we double the network bandwidth can we effectively reduce this lengthy
> cleanup?
>
> Maybe just ignore cleanup entirely?
>
> I appreciate that cleanup will increase the load but running cleanup on
> one node at a time seems impractical.  How many simultaneous nodes (per
> rack) should we limit cleanup to?
>
>
>
> More experienced suggestions would be most appreciated.
>
>
> Marc
>
>
>
>
>


RE: Cleanup

2023-02-16 Thread Marc Hoppins
Compaction_throughput_per_mb is 0 in cassandra.yaml. Is setting it in nodetool 
going to provide any increase?

From: Durity, Sean R via user 
Sent: Thursday, February 16, 2023 4:20 PM
To: user@cassandra.apache.org
Subject: RE: Cleanup

EXTERNAL
Clean-up is constrained/throttled by compactionthroughput. If your system can 
handle it, you can increase that throughput (nodetool setcompactionthroughput) 
for the clean-up in order to reduce the total time.

It is a node-isolated operation, not cluster-involved. I often run clean up on 
all nodes in a DC at the same time. Think of it as compaction and consider your 
cluster performance/workload/timelines accordingly.

Sean R. Durity

From: manish khandelwal 
mailto:manishkhandelwa...@gmail.com>>
Sent: Thursday, February 16, 2023 5:05 AM
To: user@cassandra.apache.org<mailto:user@cassandra.apache.org>
Subject: [EXTERNAL] Re: Cleanup


There is no advantage of running cleanup if no new nodes are introduced. So 
cleanup time should remain same when adding new nodes.

 Cleanup is a local to node so network bandwidth should have no effect on 
reducing cleanup time.

 Dont ignore cleanup as it can cause you disks occupied without any use.

 You should plan to run cleanup in a lean period (low traffic). Also you can 
use suboptions of keyspace and table names to plan it such a way that I/O 
pressure is not much.


Regards
Manish

On Thu, Feb 16, 2023 at 3:12 PM Marc Hoppins 
mailto:marc.hopp...@eset.com>> wrote:
Hulloa all,

I read a thing re. adding new nodes where the recommendation was to run cleanup 
on the nodes after adding a new node to remove redundant token ranges.

I timed this way back when we only had ~20G of data per node and it took 
approx. 5 mins per node.  After adding a node on Tuesday, I figured I'd run 
cleanup.

Per node, it is taking 6+ hours now as we have 2-2.5T per node.

Should we be running cleanup regularly regardless of whether or not new nodes 
have been added?  Would it reduce cleanup times for when we do add new nodes?
If we double the network bandwidth can we effectively reduce this lengthy 
cleanup?
Maybe just ignore cleanup entirely?
I appreciate that cleanup will increase the load but running cleanup on one 
node at a time seems impractical.  How many simultaneous nodes (per rack) 
should we limit cleanup to?

More experienced suggestions would be most appreciated.

Marc




RE: Cleanup

2023-02-16 Thread Durity, Sean R via user
Clean-up is constrained/throttled by compactionthroughput. If your system can 
handle it, you can increase that throughput (nodetool setcompactionthroughput) 
for the clean-up in order to reduce the total time.

It is a node-isolated operation, not cluster-involved. I often run clean up on 
all nodes in a DC at the same time. Think of it as compaction and consider your 
cluster performance/workload/timelines accordingly.

Sean R. Durity

From: manish khandelwal 
Sent: Thursday, February 16, 2023 5:05 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Cleanup


There is no advantage of running cleanup if no new nodes are introduced. So 
cleanup time should remain same when adding new nodes.

 Cleanup is a local to node so network bandwidth should have no effect on 
reducing cleanup time.

 Dont ignore cleanup as it can cause you disks occupied without any use.

 You should plan to run cleanup in a lean period (low traffic). Also you can 
use suboptions of keyspace and table names to plan it such a way that I/O 
pressure is not much.


Regards
Manish

On Thu, Feb 16, 2023 at 3:12 PM Marc Hoppins 
mailto:marc.hopp...@eset.com>> wrote:
Hulloa all,

I read a thing re. adding new nodes where the recommendation was to run cleanup 
on the nodes after adding a new node to remove redundant token ranges.

I timed this way back when we only had ~20G of data per node and it took 
approx. 5 mins per node.  After adding a node on Tuesday, I figured I’d run 
cleanup.

Per node, it is taking 6+ hours now as we have 2-2.5T per node.

Should we be running cleanup regularly regardless of whether or not new nodes 
have been added?  Would it reduce cleanup times for when we do add new nodes?
If we double the network bandwidth can we effectively reduce this lengthy 
cleanup?
Maybe just ignore cleanup entirely?
I appreciate that cleanup will increase the load but running cleanup on one 
node at a time seems impractical.  How many simultaneous nodes (per rack) 
should we limit cleanup to?

More experienced suggestions would be most appreciated.

Marc




Re: Cleanup

2023-02-16 Thread manish khandelwal
There is no advantage of running cleanup if no new nodes are introduced. So
cleanup time should remain the same when adding new nodes.

Cleanup is local to a node, so network bandwidth should have no effect on
reducing cleanup time.

Don't ignore cleanup, as it can leave your disks occupied without any use.

You should plan to run cleanup in a lean period (low traffic). You can also
use the keyspace and table name suboptions to plan it in such a way that
I/O pressure stays low.


Regards
Manish

On Thu, Feb 16, 2023 at 3:12 PM Marc Hoppins  wrote:

> Hulloa all,
>
>
>
> I read a thing re. adding new nodes where the recommendation was to run
> cleanup on the nodes after adding a new node to remove redundant token
> ranges.
>
>
>
> I timed this way back when we only had ~20G of data per node and it took
> approx. 5 mins per node.  After adding a node on Tuesday, I figured I’d run
> cleanup.
>
>
>
> Per node, it is taking 6+ hours now as we have 2-2.5T per node.
>
>
>
> Should we be running cleanup regularly regardless of whether or not new
> nodes have been added?  Would it reduce cleanup times for when we do add
> new nodes?
>
> If we double the network bandwidth can we effectively reduce this lengthy
> cleanup?
>
> Maybe just ignore cleanup entirely?
>
> I appreciate that cleanup will increase the load but running cleanup on
> one node at a time seems impractical.  How many simultaneous nodes (per
> rack) should we limit cleanup to?
>
>
>
> More experienced suggestions would be most appreciated.
>
>
> Marc
>


Re: Cleanup cluster after expansion?

2018-10-25 Thread Alain RODRIGUEZ
Hello,

'*nodetool cleanup*' used to be mono-threaded (up to C*2.1), then used all
the cores (C*2.1 - C*2.1.14), and is now something that can be controlled
(C*2.1.14+):
'*nodetool cleanup -j 2*', for example, would use 2 compactors maximum (out
of the number of concurrent_compactors you defined, probably no more than
8).

*Global*: My advice would be to run on all nodes with a 1 or 2 threads
(never more than half of what's available). The impact of the cleanup
should not be bigger than the impact of a compaction. Also, be sure to
leave some room for regular compactions. This way, the cleanup should be
rather safe and it should be acceptable to run it in parallel in most
cases. In parallel you will save time to move to other operations quickly,
but generally there is no rush to run cleanup per se. So it's up to you to
run it in parallel or not. I often did, fwiw.

*Early Cassandra 2.1:* If you're using a Cassandra version between 2.1 and
2.1.14, I would go 1 node at the time, as you cannot really control the
number of threads. This operation in early C*2.1 is risky and heavy,
upgrade if you can, then cleanup would be my advice here :). Be careful
there if you decide to go for the cleanup anyway. Monitor pending
compaction stacking and disk space used mostly. In worst case you want to
have 50% of the disk free before starting cleanups.
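
A few commands cover that monitoring (data path is the usual package default):

```
nodetool compactionstats               # pending and active compactions, including cleanup
nodetool compactionhistory | head -20  # recently completed compactions
df -h /var/lib/cassandra/data          # free space on the data volume
```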

*Note: *Reducing disk space usage - If disk space available is low or if
you mind the data size variation, you can run the cleanup per* tables*
sequentially, one by one, instead of running it on the whole node or
keyspace. Cleanups are going through compactions that starts by increasing
the used disk space to write temporary SSTables. Most of the disk space is
freed at the end of the cleanup operation. Going one table at the time and
with a low number of threads helped me in the past running cleanups in the
most extreme conditions.
Here is how this could be run (you may need to adapt this):

```
screen -R cleanup
# From screen:
for ks in myks yourks whateverks; do
  tables=$(ls /var/lib/cassandra/data/$ks | sort | cut -d "-" -f 1)
  for table in $tables; do
    echo "Running nodetool cleanup on $ks.$table..."
    nodetool cleanup -j 2 $ks $table
  done
done
```

The screen is a good idea to answer the question 'Did the cleanup finish?'.
You get back to the screen and see if the command returned or not and you
don't have to kill the command just after running it.

C*heers,
---
Alain Rodriguez - al...@thelastpickle.com
France / Spain

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

On Mon, 22 Oct 2018 at 21:18, Jeff Jirsa  wrote:

> Nodetool will eventually return when it’s done
>
> You can also watch nodetool compactionstats
>
> --
> Jeff Jirsa
>
>
> > On Oct 22, 2018, at 10:53 AM, Ian Spence 
> wrote:
> >
> > Environment: Cassandra 2.2.9, GNU/Linux CentOS 6 + 7. Two DCs, 3 RACs in
> DC1 and 6 in DC2.
> >
> > We recently added 16 new nodes to our 38-node cluster (now 54 nodes).
> What would be the safest and most
> > efficient way of running a cleanup operation? I’ve experimented with
> running cleanup on a single node and
> > nodetool just hangs, but that seems to be a known issue.
> >
> > Would something like running it on a couple of nodes per day, working
> through the cluster, work?
> >
> >
> > -
> > To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> > For additional commands, e-mail: user-h...@cassandra.apache.org
>
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
>
>


Re: Cleanup cluster after expansion?

2018-10-22 Thread Jeff Jirsa
Nodetool will eventually return when it’s done

You can also watch nodetool compactionstats 
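
For example, polling it every 30 seconds (interval arbitrary):

```
watch -n 30 nodetool compactionstats
```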

-- 
Jeff Jirsa


> On Oct 22, 2018, at 10:53 AM, Ian Spence  wrote:
> 
> Environment: Cassandra 2.2.9, GNU/Linux CentOS 6 + 7. Two DCs, 3 RACs in DC1 
> and 6 in DC2.
> 
> We recently added 16 new nodes to our 38-node cluster (now 54 nodes). What 
> would be the safest and most
> efficient way of running a cleanup operation? I’ve experimented with running 
> cleanup on a single node and
> nodetool just hangs, but that seems to be a known issue.
> 
> Would something like running it on a couple of nodes per day, working through 
> the cluster, work?
> 
> 
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: Cleanup blocking snapshots - Options?

2018-01-31 Thread kurt greaves
Thanks Thomas. I'll give it a shot myself and see if backporting the patch
fixes the problem. If it does I'll create a new ticket for backporting.

On 30 January 2018 at 09:22, Steinmaurer, Thomas <
thomas.steinmau...@dynatrace.com> wrote:

> Hi Kurt,
>
>
>
> had another try now, and yes, with 2.1.18, this constantly happens.
> Currently running nodetool cleanup on a single node in production with
> disabled hourly snapshots. SSTables with > 100G involved here. Triggering
> nodetool snapshot will result in being blocked. From an operational
> perspective, a bit annoying right now 
>
>
>
> Have asked on https://issues.apache.org/jira/browse/CASSANDRA-13873
> regarding a backport to 2.1, but possibly won’t get attention, cause the
> ticket has been resolved for 2.2+ already.
>
>
>
> Regards,
>
> Thomas
>
>
>
> *From:* kurt greaves [mailto:k...@instaclustr.com]
> *Sent:* Monday, 15 January 2018 06:18
> *To:* User <user@cassandra.apache.org>
> *Subject:* Re: Cleanup blocking snapshots - Options?
>
>
>
> Disabling the snapshots is the best and only real option other than
> upgrading at the moment. Although apparently it was thought that there was
> only a small race condition in 2.1 that triggered this and it wasn't worth
> fixing. If you are triggering it easily maybe it is worth fixing in 2.1 as
> well. Does this happen consistently? Can you provide some more logs on the
> JIRA or better yet a way to reproduce?
>
>
>
> On 14 January 2018 at 16:12, Steinmaurer, Thomas <
> thomas.steinmau...@dynatrace.com> wrote:
>
> Hello,
>
>
>
> we are running 2.1.18 with vnodes in production and due to (
> https://issues.apache.org/jira/browse/CASSANDRA-11155) we can’t run
> cleanup e.g. after extending the cluster without blocking our hourly
> snapshots.
>
>
>
> What options do we have to get rid of partitions a node does not own
> anymore?
>
> · Using a version which has this issue fixed, although upgrading
> to 2.2+, due to various issues, is not an option at the moment
>
> · Temporarily disabling the hourly cron job before starting
> cleanup and re-enable after cleanup has finished
>
> · Any other way to re-write SSTables with data a node owns after
> a cluster scale out
>
>
>
> Thanks,
>
> Thomas
>
>
>
> The contents of this e-mail are intended for the named addressee only. It
> contains information that may be confidential. Unless you are the named
> addressee or an authorized designee, you may not copy or use it, or
> disclose it to anyone else. If you received it in error please notify us
> immediately and then destroy it. Dynatrace Austria GmbH (registration
> number FN 91482h) is a company registered in Linz whose registered office
> is at 4040 Linz, Austria, Freistädterstraße 313
>
>
> The contents of this e-mail are intended for the named addressee only. It
> contains information that may be confidential. Unless you are the named
> addressee or an authorized designee, you may not copy or use it, or
> disclose it to anyone else. If you received it in error please notify us
> immediately and then destroy it. Dynatrace Austria GmbH (registration
> number FN 91482h) is a company registered in Linz whose registered office
> is at 4040 Linz, Austria, Freistädterstraße 313
>


RE: Cleanup blocking snapshots - Options?

2018-01-30 Thread Steinmaurer, Thomas
Hi Kurt,

had another try now, and yes, with 2.1.18, this constantly happens. Currently 
running nodetool cleanup on a single node in production with disabled hourly 
snapshots. SSTables with > 100G involved here. Triggering nodetool snapshot 
will result in being blocked. From an operational perspective, a bit annoying 
right now 
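
A sketch of that pause-the-snapshot-cron-around-cleanup routine (the cron 
entry text and the keyspace name are hypothetical):

```
crontab -l > /tmp/crontab.bak                          # keep a copy
crontab -l | grep -v 'nodetool snapshot' | crontab -   # drop the hourly snapshot job
nodetool cleanup -j 1 my_keyspace                      # run the cleanup
crontab /tmp/crontab.bak                               # restore the original crontab
```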

Have asked on https://issues.apache.org/jira/browse/CASSANDRA-13873 regarding a 
backport to 2.1, but possibly won’t get attention, cause the ticket has been 
resolved for 2.2+ already.

Regards,
Thomas

From: kurt greaves [mailto:k...@instaclustr.com]
Sent: Monday, 15 January 2018 06:18
To: User <user@cassandra.apache.org>
Subject: Re: Cleanup blocking snapshots - Options?

Disabling the snapshots is the best and only real option other than upgrading 
at the moment. Although apparently it was thought that there was only a small 
race condition in 2.1 that triggered this and it wasn't worth fixing. If you 
are triggering it easily maybe it is worth fixing in 2.1 as well. Does this 
happen consistently? Can you provide some more logs on the JIRA or better yet a 
way to reproduce?

On 14 January 2018 at 16:12, Steinmaurer, Thomas 
<thomas.steinmau...@dynatrace.com<mailto:thomas.steinmau...@dynatrace.com>> 
wrote:
Hello,

we are running 2.1.18 with vnodes in production and due to 
(https://issues.apache.org/jira/browse/CASSANDRA-11155) we can’t run cleanup 
e.g. after extending the cluster without blocking our hourly snapshots.

What options do we have to get rid of partitions a node does not own anymore?

• Using a version which has this issue fixed, although upgrading to 
2.2+, due to various issues, is not an option at the moment

• Temporarily disabling the hourly cron job before starting cleanup and 
re-enable after cleanup has finished

• Any other way to re-write SSTables with data a node owns after a 
cluster scale out

Thanks,
Thomas

The contents of this e-mail are intended for the named addressee only. It 
contains information that may be confidential. Unless you are the named 
addressee or an authorized designee, you may not copy or use it, or disclose it 
to anyone else. If you received it in error please notify us immediately and 
then destroy it. Dynatrace Austria GmbH (registration number FN 91482h) is a 
company registered in Linz whose registered office is at 4040 Linz, Austria, 
Freistädterstraße 313

The contents of this e-mail are intended for the named addressee only. It 
contains information that may be confidential. Unless you are the named 
addressee or an authorized designee, you may not copy or use it, or disclose it 
to anyone else. If you received it in error please notify us immediately and 
then destroy it. Dynatrace Austria GmbH (registration number FN 91482h) is a 
company registered in Linz whose registered office is at 4040 Linz, Austria, 
Freistädterstraße 313


Re: Cleanup blocking snapshots - Options?

2018-01-15 Thread Nicolas Guyomar
Hi,

It might really be a long shot, but I thought a UserDefinedCompaction
triggered by JMX on a single sstable might remove data the node does not
own (to answer the "*Any other way to re-write SSTables with data a node
owns after a cluster scale out*" part of your question).

I might be wrong though
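
For anyone who wants to try it, a sketch with the jmxterm CLI might look like 
this (the MBean and operation name are as I recall them from 2.1-era builds, 
and the jar path and SSTable filename are hypothetical, so please verify 
against your version first):

```
CMD='run -b org.apache.cassandra.db:type=CompactionManager forceUserDefinedCompaction mykeyspace-mytable-ka-42-Data.db'
echo "$CMD" | java -jar /opt/jmxterm-uber.jar -l localhost:7199 -n
```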

On 15 January 2018 at 08:43, Steinmaurer, Thomas <
thomas.steinmau...@dynatrace.com> wrote:

> Hi Kurt,
>
>
>
> it was easily triggered with the mentioned combination (cleanup after
> extending the cluster) a few months ago, thus I guess it will be the same
> when I re-try. Due to the issue we simply omitted running cleanup then, but
> as disk space is becoming some sort of bottle-neck again, we need to
> re-evaluate this situation J
>
>
>
> Regards,
>
> Thomas
>
>
>
> *From:* kurt greaves [mailto:k...@instaclustr.com]
> *Sent:* Monday, 15 January 2018 06:18
> *To:* User <user@cassandra.apache.org>
> *Subject:* Re: Cleanup blocking snapshots - Options?
>
>
>
> Disabling the snapshots is the best and only real option other than
> upgrading at the moment. Although apparently it was thought that there was
> only a small race condition in 2.1 that triggered this and it wasn't worth
> fixing. If you are triggering it easily maybe it is worth fixing in 2.1 as
> well. Does this happen consistently? Can you provide some more logs on the
> JIRA or better yet a way to reproduce?
>
>
>
> On 14 January 2018 at 16:12, Steinmaurer, Thomas <
> thomas.steinmau...@dynatrace.com> wrote:
>
> Hello,
>
>
>
> we are running 2.1.18 with vnodes in production and due to (
> https://issues.apache.org/jira/browse/CASSANDRA-11155) we can’t run
> cleanup e.g. after extending the cluster without blocking our hourly
> snapshots.
>
>
>
> What options do we have to get rid of partitions a node does not own
> anymore?
>
> · Using a version which has this issue fixed, although upgrading
> to 2.2+, due to various issues, is not an option at the moment
>
> · Temporarily disabling the hourly cron job before starting
> cleanup and re-enable after cleanup has finished
>
> · Any other way to re-write SSTables with data a node owns after
> a cluster scale out
>
>
>
> Thanks,
>
> Thomas
>
>
>
> The contents of this e-mail are intended for the named addressee only. It
> contains information that may be confidential. Unless you are the named
> addressee or an authorized designee, you may not copy or use it, or
> disclose it to anyone else. If you received it in error please notify us
> immediately and then destroy it. Dynatrace Austria GmbH (registration
> number FN 91482h) is a company registered in Linz whose registered office
> is at 4040 Linz, Austria, Freistädterstraße 313
>
>
> The contents of this e-mail are intended for the named addressee only. It
> contains information that may be confidential. Unless you are the named
> addressee or an authorized designee, you may not copy or use it, or
> disclose it to anyone else. If you received it in error please notify us
> immediately and then destroy it. Dynatrace Austria GmbH (registration
> number FN 91482h) is a company registered in Linz whose registered office
> is at 4040 Linz, Austria, Freistädterstraße 313
>


RE: Cleanup blocking snapshots - Options?

2018-01-14 Thread Steinmaurer, Thomas
Hi Kurt,

it was easily triggered with the mentioned combination (cleanup after extending 
the cluster) a few months ago, thus I guess it will be the same when I re-try. 
Due to the issue we simply omitted running cleanup then, but as disk space is 
becoming some sort of bottle-neck again, we need to re-evaluate this situation ☺

Regards,
Thomas

From: kurt greaves [mailto:k...@instaclustr.com]
Sent: Monday, 15 January 2018 06:18
To: User <user@cassandra.apache.org>
Subject: Re: Cleanup blocking snapshots - Options?

Disabling the snapshots is the best and only real option other than upgrading 
at the moment. Although apparently it was thought that there was only a small 
race condition in 2.1 that triggered this and it wasn't worth fixing. If you 
are triggering it easily maybe it is worth fixing in 2.1 as well. Does this 
happen consistently? Can you provide some more logs on the JIRA or better yet a 
way to reproduce?

On 14 January 2018 at 16:12, Steinmaurer, Thomas 
<thomas.steinmau...@dynatrace.com<mailto:thomas.steinmau...@dynatrace.com>> 
wrote:
Hello,

we are running 2.1.18 with vnodes in production and due to 
(https://issues.apache.org/jira/browse/CASSANDRA-11155) we can’t run cleanup 
e.g. after extending the cluster without blocking our hourly snapshots.

What options do we have to get rid of partitions a node does not own anymore?

• Using a version which has this issue fixed, although upgrading to 
2.2+, due to various issues, is not an option at the moment

• Temporarily disabling the hourly cron job before starting cleanup and 
re-enable after cleanup has finished

• Any other way to re-write SSTables with data a node owns after a 
cluster scale out

Thanks,
Thomas

The contents of this e-mail are intended for the named addressee only. It 
contains information that may be confidential. Unless you are the named 
addressee or an authorized designee, you may not copy or use it, or disclose it 
to anyone else. If you received it in error please notify us immediately and 
then destroy it. Dynatrace Austria GmbH (registration number FN 91482h) is a 
company registered in Linz whose registered office is at 4040 Linz, Austria, 
Freistädterstraße 313

The contents of this e-mail are intended for the named addressee only. It 
contains information that may be confidential. Unless you are the named 
addressee or an authorized designee, you may not copy or use it, or disclose it 
to anyone else. If you received it in error please notify us immediately and 
then destroy it. Dynatrace Austria GmbH (registration number FN 91482h) is a 
company registered in Linz whose registered office is at 4040 Linz, Austria, 
Freistädterstraße 313


Re: Cleanup blocking snapshots - Options?

2018-01-14 Thread kurt greaves
Disabling the snapshots is the best and only real option other than
upgrading at the moment. Although apparently it was thought that there was
only a small race condition in 2.1 that triggered this and it wasn't worth
fixing. If you are triggering it easily maybe it is worth fixing in 2.1 as
well. Does this happen consistently? Can you provide some more logs on the
JIRA or better yet a way to reproduce?

On 14 January 2018 at 16:12, Steinmaurer, Thomas <
thomas.steinmau...@dynatrace.com> wrote:

> Hello,
>
>
>
> we are running 2.1.18 with vnodes in production and due to (
> https://issues.apache.org/jira/browse/CASSANDRA-11155) we can’t run
> cleanup e.g. after extending the cluster without blocking our hourly
> snapshots.
>
>
>
> What options do we have to get rid of partitions a node does not own
> anymore?
>
> · Using a version which has this issue fixed, although upgrading
> to 2.2+, due to various issues, is not an option at the moment
>
> · Temporarily disabling the hourly cron job before starting
> cleanup and re-enable after cleanup has finished
>
> · Any other way to re-write SSTables with data a node owns after
> a cluster scale out
>
>
>
> Thanks,
>
> Thomas
>
>
> The contents of this e-mail are intended for the named addressee only. It
> contains information that may be confidential. Unless you are the named
> addressee or an authorized designee, you may not copy or use it, or
> disclose it to anyone else. If you received it in error please notify us
> immediately and then destroy it. Dynatrace Austria GmbH (registration
> number FN 91482h) is a company registered in Linz whose registered office
> is at 4040 Linz, Austria, Freistädterstraße 313
> 
>


Re: Cleanup and old files

2013-12-30 Thread Aaron Morton
Check whether the SSTable is actually in use by Cassandra; if it's missing a component 
or is otherwise corrupt, it will not be opened at run time and so not included in 
all the fun games the other SSTables get to play. 

If you have the last startup in the logs, check for an “Opening… “ message or an 
ERROR about the file. 
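
Something like this against the node's system log answers it quickly (log path 
is the usual package default, table name per the file above):

```
grep -E 'Opening|ERROR' /var/log/cassandra/system.log | grep 'tablename-ic'
```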

Cheers

-
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 30/12/2013, at 1:28 pm, David McNelis dmcne...@gmail.com wrote:

 I am currently running a cluster with 1.2.8.  One of my larger column 
 families on one of my nodes has keyspace-tablename-ic--Data.db with a 
 modify date in August.
 
 Since august we have added several nodes (with vnodes), with the same number 
 of vnodes as all the existing nodes.
 
 As a result, (we've since gone from 15 to 21 nodes), then ~32% of my data of 
 the original 15 nodes should have been essentially balanced out to the 6 new 
 nodes.  (1/15 + 1/16 +  1/21).
 
 When I run a cleanup, however, the old data files never get updated, and I 
 can't believe that they all should have remained the same.
 
 The only recently updated files in that data directory are secondary index 
 sstable files.  Am I doing something wrong here?  Am I thinking about this 
 wrong?
 
 David



Re: Cleanup and old files

2013-12-30 Thread David McNelis
I see the SSTable in this log statement: "Stream context metadata" (along
with a bunch of other files), but I do not see it in the list of files
"Opening" (which I see quite a bit of, as expected).

Safe to try moving that file off server (to a backup location)?  If I tried
this, would I want to shut down the node first and monitor startup to see
if it all of a sudden is 'missing' something / throws an error then?


On Mon, Dec 30, 2013 at 9:26 PM, Aaron Morton aa...@thelastpickle.com wrote:

 Check the SSTable is actually in use by cassandra, if it’s missing a
 component or otherwise corrupt it will not be opened at run time and so not
 included in all the fun games the other SSTables get to play.

 If you have the last startup in the logs check for an “Opening… “ message
 or an ERROR about the file.

 Cheers

 -
 Aaron Morton
 New Zealand
 @aaronmorton

Co-Founder & Principal Consultant
 Apache Cassandra Consulting
 http://www.thelastpickle.com

 On 30/12/2013, at 1:28 pm, David McNelis dmcne...@gmail.com wrote:

 I am currently running a cluster with 1.2.8.  One of my larger column
 families on one of my nodes has keyspace-tablename-ic--Data.db with a
 modify date in August.

 Since august we have added several nodes (with vnodes), with the same
 number of vnodes as all the existing nodes.

 As a result, (we've since gone from 15 to 21 nodes), then ~32% of my data
 of the original 15 nodes should have been essentially balanced out to the 6
 new nodes.  (1/15 + 1/16 +  1/21).

 When I run a cleanup, however, the old data files never get updated, and I
 can't believe that they all should have remained the same.

 The only recently updated files in that data directory are secondary index
 sstable files.  Am I doing something wrong here?  Am I thinking about this
 wrong?

 David





Re: cleanup failure; FileNotFoundException deleting (wrong?) db file

2013-11-08 Thread Elias Ross
On Thu, Nov 7, 2013 at 7:01 PM, Krishna Chaitanya bnsk1990r...@gmail.com wrote:

 Check if its an issue with permissions or broken links..


I don't think permissions are an issue. You might be on to something
regarding the links.

I've been seeing this on 4 nodes, configured identically.

Here's what I think the problem may be: (or may be a combination of a few
problems)

1. I have symlinked the data directories. This confuses Cassandra in some
way, causing it to create multiple files. Does Cassandra care if the data
directory was symlinked from someplace? Would this cause an issue.

lrwxrwxrwx 1 root root 6 Oct 30 18:37 data01 -> /data1 # [1]

Evidence for:
a. Somehow it's creating duplicate hard links.
b. It is unlikely other Cassandra users would have setup their directories
like this and this seems like a serious bug.
c. Also, my other cluster is nearly identical (OS, JVM, 6 drives, same
Cassandra/RHQ, hardware similar) and not seeing the same issues, although
that is a two node cluster.

If I were to grep through, I guess I would see if there's a chance the path
that Java sees, maybe File.getAbsoluteFile() (which might resolve the link)
doesn't match the path of another file. In other words, it is a Cassandra
bug, based on some assumptions from the JVM


2. When I created the cluster, I had a single data directory for each node.
I then added 5 more. Somehow Cassandra mis-remembers where the data was
put, causing all sorts of issues. How does Cassandra decide where to put
its data and where to read it from? What happens when additional data
directories are added? There could be a bug in the code.

Evidence for:
a. Somehow it's looking for data in the wrong directory. It also seems
unlikely a user would create a cluster, then add 5 more drives.

# [1] The reason the links are setup is because the mount points didn't
match my Puppet setup, which sets up my directory permissions. So I added
the links to compensate.


Re: cleanup failure; FileNotFoundException deleting (wrong?) db file

2013-11-08 Thread Elias Ross
On Fri, Nov 8, 2013 at 10:31 AM, Elias Ross gen...@noderunner.net wrote:


 On Thu, Nov 7, 2013 at 7:01 PM, Krishna Chaitanya 
 bnsk1990r...@gmail.comwrote:

 Check if its an issue with permissions or broken links..


 I don't think permissions are an issue. You might be on to something
 regarding the links.


As it turns out (and I noted in CASSANDRA-6298 already) this was a user
issue. One of my links was pointing to the same drive:

lrwxrwxrwx 1 root root 6 Oct 30 18:37 data05 -> /data5
lrwxrwxrwx 1 root root 6 Oct 30 18:37 data06 -> /data5
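
A one-liner like this (parent directory hypothetical) flags links that resolve 
to the same target:

```
for l in /var/lib/cassandra/data0*; do readlink -f "$l"; done | sort | uniq -d
```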

Thanks for the help everyone, I'm happy it's all working. I'm not so happy
that I messed up my configuration like this.


Re: cleanup failure; FileNotFoundException deleting (wrong?) db file

2013-11-07 Thread Krishna Chaitanya
Check if its an issue with permissions or broken links..
On Nov 6, 2013 11:17 AM, Elias Ross gen...@noderunner.net wrote:


 I'm seeing the following:

 Caused by: java.lang.RuntimeException: java.io.FileNotFoundException:
 /data05/rhq/data/rhq/six_hour_metrics/rhq-six_hour_metrics-ic-1-Data.db (No
 such file or directory)
 at
 org.apache.cassandra.io.util.ThrottledReader.open(ThrottledReader.java:53)
 at
 org.apache.cassandra.io.sstable.SSTableReader.openDataReader(SSTableReader.java:1212)
 at
 org.apache.cassandra.io.sstable.SSTableScanner.init(SSTableScanner.java:54)
 at
 org.apache.cassandra.io.sstable.SSTableReader.getDirectScanner(SSTableReader.java:1032)
 at
 org.apache.cassandra.db.compaction.CompactionManager.doCleanupCompaction(CompactionManager.java:594)
 at
 org.apache.cassandra.db.compaction.CompactionManager.access$500(CompactionManager.java:73)
 at
 org.apache.cassandra.db.compaction.CompactionManager$5.perform(CompactionManager.java:327)
 at
 org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:253)

 This is on an install with multiple data directories. The actual directory
 contains files named something else:

 [rhq@st11p01ad-rhq006 ~]$ ls -l
 /data05/rhq/data/rhq/six_hour_metrics/rhq-six_hour_metrics-ic-6-*
 -rw-r--r-- 1 rhq rhq 849924573 Nov  1 14:24
 /data05/rhq/data/rhq/six_hour_metrics/rhq-six_hour_metrics-ic-6-Data.db
 -rw-r--r-- 1 rhq rhq75 Nov  1 14:24
 /data05/rhq/data/rhq/six_hour_metrics/rhq-six_hour_metrics-ic-6-Digest.sha1
 -rw-r--r-- 1 rhq rhq151696 Nov  1 14:24
 /data05/rhq/data/rhq/six_hour_metrics/rhq-six_hour_metrics-ic-6-Filter.db
 -rw-r--r-- 1 rhq rhq   2186766 Nov  1 14:24
 /data05/rhq/data/rhq/six_hour_metrics/rhq-six_hour_metrics-ic-6-Index.db
 -rw-r--r-- 1 rhq rhq  5957 Nov  1 14:24
 /data05/rhq/data/rhq/six_hour_metrics/rhq-six_hour_metrics-ic-6-Statistics.db
 -rw-r--r-- 1 rhq rhq 15276 Nov  1 14:24
 /data05/rhq/data/rhq/six_hour_metrics/rhq-six_hour_metrics-ic-6-Summary.db
 -rw-r--r-- 1 rhq rhq72 Nov  1 14:24
 /data05/rhq/data/rhq/six_hour_metrics/rhq-six_hour_metrics-ic-6-TOC.txt


 It seems like it's missing the files it needs to hit? Is there something I
 can do here?



Re: cleanup failure; FileNotFoundException deleting (wrong?) db file

2013-11-06 Thread Keith Freeman
Is it possible that the keyspace was dropped then re-created ( 
https://issues.apache.org/jira/browse/CASSANDRA-4857)? I've seen similar 
stack traces in that case.


On 11/05/2013 10:47 PM, Elias Ross wrote:


I'm seeing the following:

Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: 
/data05/rhq/data/rhq/six_hour_metrics/rhq-six_hour_metrics-ic-1-Data.db (No 
such file or directory)
at 
org.apache.cassandra.io.util.ThrottledReader.open(ThrottledReader.java:53)
at 
org.apache.cassandra.io.sstable.SSTableReader.openDataReader(SSTableReader.java:1212)
at 
org.apache.cassandra.io.sstable.SSTableScanner.init(SSTableScanner.java:54)
at 
org.apache.cassandra.io.sstable.SSTableReader.getDirectScanner(SSTableReader.java:1032)
at 
org.apache.cassandra.db.compaction.CompactionManager.doCleanupCompaction(CompactionManager.java:594)
at 
org.apache.cassandra.db.compaction.CompactionManager.access$500(CompactionManager.java:73)
at 
org.apache.cassandra.db.compaction.CompactionManager$5.perform(CompactionManager.java:327)
at 
org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:253)


This is on an install with multiple data directories. The actual 
directory contains files named something else:


[rhq@st11p01ad-rhq006 ~]$ ls -l 
/data05/rhq/data/rhq/six_hour_metrics/rhq-six_hour_metrics-ic-6-*
-rw-r--r-- 1 rhq rhq 849924573 Nov  1 14:24 
/data05/rhq/data/rhq/six_hour_metrics/rhq-six_hour_metrics-ic-6-Data.db
-rw-r--r-- 1 rhq rhq75 Nov  1 14:24 
/data05/rhq/data/rhq/six_hour_metrics/rhq-six_hour_metrics-ic-6-Digest.sha1
-rw-r--r-- 1 rhq rhq151696 Nov  1 14:24 
/data05/rhq/data/rhq/six_hour_metrics/rhq-six_hour_metrics-ic-6-Filter.db
-rw-r--r-- 1 rhq rhq   2186766 Nov  1 14:24 
/data05/rhq/data/rhq/six_hour_metrics/rhq-six_hour_metrics-ic-6-Index.db
-rw-r--r-- 1 rhq rhq  5957 Nov  1 14:24 
/data05/rhq/data/rhq/six_hour_metrics/rhq-six_hour_metrics-ic-6-Statistics.db
-rw-r--r-- 1 rhq rhq 15276 Nov  1 14:24 
/data05/rhq/data/rhq/six_hour_metrics/rhq-six_hour_metrics-ic-6-Summary.db
-rw-r--r-- 1 rhq rhq72 Nov  1 14:24 
/data05/rhq/data/rhq/six_hour_metrics/rhq-six_hour_metrics-ic-6-TOC.txt



It seems like it's missing the files it needs to hit? Is there 
something I can do here?




Re: cleanup failure; FileNotFoundException deleting (wrong?) db file

2013-11-06 Thread Elias Ross
On Wed, Nov 6, 2013 at 9:10 AM, Keith Freeman 8fo...@gmail.com wrote:

 Is it possible that the keyspace was dropped then re-created (
 https://issues.apache.org/jira/browse/CASSANDRA-4857)? I've seen similar
 stack traces in that case.


Thanks for the pointer.

There's a program (RHQ) that's managing my server and may have done the
create-drop-create sequence by mistake.

I also wonder if adding additional data directories after re-starting the
server may cause issues. What I mean is adding more dirs to
'data_file_directories' in cassandra.yaml, then restarting.


Re: Cleanup understastanding

2013-05-29 Thread Víctor Hugo Oliveira Molinar
Thanks for the answers.

I got it. I was using cleanup, because I thought it would delete the
tombstones.
But, that is still awkward. Does cleanup take so much disk space to
complete the compaction operation? In other words, twice the size?


*Atenciosamente,*
*Víctor Hugo Molinar - *@vhmolinar http://twitter.com/#!/vhmolinar


On Tue, May 28, 2013 at 9:55 PM, Takenori Sato(Cloudian) ts...@cloudian.com
 wrote:

  Hi Victor,

 As Andrey said, running cleanup doesn't work as you expect.


  The reason I need to clean things is that I wont need most of my
 inserted data on the next day.

 Deleted objects(columns/records) become deletable from sstable file when
 they get expired(after gc_grace_seconds).

 Such deletable objects are actually gotten rid of by compaction.

 The tricky part is that a deletable object remains unless all of its old
 objects(the same row key) are contained in the set of sstable files
 involved in the compaction.

 - Takenori


 (2013/05/29 3:01), Andrey Ilinykh wrote:

 cleanup removes data which doesn't belong to the current node. You have to
 run it only if you move (or add new) nodes. In your case there is no any
 reason to do it.


 On Tue, May 28, 2013 at 7:39 AM, Víctor Hugo Oliveira Molinar 
 vhmoli...@gmail.com wrote:

 Hello everyone.
 I have a daily maintenance task at c* which does:

 -truncate cfs
 -clearsnapshots
 -repair
 -cleanup

 The reason I need to clean things is that I wont need most of my inserted
 data on the next day. It's kind a business requirement.

 Well,  the problem I'm running to, is the misunderstanding about cleanup
 operation.
 I have 2 nodes with lower than half usage of disk, which is moreless 13GB;

 But, the last few days, arbitrarily each node have reported me a cleanup
 error indicating that the disk was full. Which is not true.

 *Error occured during cleanup*
 *java.util.concurrent.ExecutionException: java.io.IOException: disk full*


  So I'd like to know more about what does happens in a cleanup operation.
 Appreciate any help.






Re: Cleanup understastanding

2013-05-29 Thread Takenori Sato
 But, that is still awkward. Does cleanup take so much disk space to
complete the compaction operation? In other words, twice the size?

Not really, but logically yes.

According to 1.0.7 source, cleanup checks if there's enough space that is
larger than the worst scenario as below. If not, the exception you got is
thrown.

/*
 * Add up all the files sizes this is the worst case file
 * size for compaction of all the list of files given.
 */
public long getExpectedCompactedFileSize(Iterable<SSTableReader> sstables)
{
    long expectedFileSize = 0;
    for (SSTableReader sstable : sstables)
    {
        long size = sstable.onDiskLength();
        expectedFileSize = expectedFileSize + size;
    }
    return expectedFileSize;
}
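
So a rough pre-flight check from the shell is just to compare the table's total 
SSTable size with the free space on the data volume (paths hypothetical, modern 
per-table directory layout assumed):

```
du -ch /var/lib/cassandra/data/my_keyspace/my_table-*/*-Data.db | tail -1   # worst-case extra space needed
df -h /var/lib/cassandra/data                                               # space currently available
```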


On Wed, May 29, 2013 at 10:43 PM, Víctor Hugo Oliveira Molinar 
vhmoli...@gmail.com wrote:

 Thanks for the answers.

  I got it. I was using cleanup because I thought it would delete the
  tombstones.
  Still, that seems odd. Does cleanup really take that much disk space to
  complete the compaction operation? In other words, twice the size?


 *Best regards,*
 *Víctor Hugo Molinar - *@vhmolinar http://twitter.com/#!/vhmolinar


 On Tue, May 28, 2013 at 9:55 PM, Takenori Sato(Cloudian) 
 ts...@cloudian.com wrote:

  Hi Victor,

 As Andrey said, running cleanup doesn't work as you expect.


   The reason I need to clean things is that I won't need most of my
  inserted data on the next day.

  Deleted objects (columns/records) become deletable from an sstable file once
  they have expired (after gc_grace_seconds).

  Such deletable objects are actually removed by compaction.

  The tricky part is that a deletable object remains unless all of its older
  objects (for the same row key) are contained in the set of sstable files
  involved in the compaction.

 - Takenori


 (2013/05/29 3:01), Andrey Ilinykh wrote:

  cleanup removes data which doesn't belong to the current node. You have
  to run it only if you move (or add new) nodes. In your case there is no
  reason to do it.


 On Tue, May 28, 2013 at 7:39 AM, Víctor Hugo Oliveira Molinar 
 vhmoli...@gmail.com wrote:

 Hello everyone.
 I have a daily maintenance task at c* which does:

 -truncate cfs
 -clearsnapshots
 -repair
 -cleanup

  The reason I need to clean things is that I won't need most of my
  inserted data on the next day. It's kind of a business requirement.

  Well, the problem I'm running into is my misunderstanding of the cleanup
  operation.
  I have 2 nodes, each using less than half of its disk, which is roughly
  13GB.

  But over the last few days, each node has arbitrarily reported a cleanup
  error indicating that the disk was full, which is not true.

 *Error occured during cleanup*
 *java.util.concurrent.ExecutionException: java.io.IOException: disk full
 *


   So I'd like to know more about what happens in a cleanup
  operation.
  I appreciate any help.







Re: Cleanup understanding

2013-05-28 Thread Andrey Ilinykh
cleanup removes data which doesn't belong to the current node. You have to
run it only if you move (or add new) nodes. In your case there is no
reason to do it.


On Tue, May 28, 2013 at 7:39 AM, Víctor Hugo Oliveira Molinar 
vhmoli...@gmail.com wrote:

 Hello everyone.
 I have a daily maintenance task at c* which does:

 -truncate cfs
 -clearsnapshots
 -repair
 -cleanup

 The reason I need to clean things is that I won't need most of my inserted
 data on the next day. It's kind of a business requirement.

 Well, the problem I'm running into is my misunderstanding of the cleanup
 operation.
 I have 2 nodes, each using less than half of its disk, which is roughly 13GB.

 But over the last few days, each node has arbitrarily reported a cleanup
 error indicating that the disk was full, which is not true.

 *Error occured during cleanup*
 *java.util.concurrent.ExecutionException: java.io.IOException: disk full*


 So I'd like to know more about what happens in a cleanup operation.
 I appreciate any help.



Re: Cleanup understanding

2013-05-28 Thread Robert Coli
On Tue, May 28, 2013 at 7:39 AM, Víctor Hugo Oliveira Molinar
vhmoli...@gmail.com wrote:
 So I'd like to know more about what happens in a cleanup operation.
 I appreciate any help.

./src/java/org/apache/cassandra/db/compaction/CompactionManager.java
line 591 of 1175

logger.info("Cleaning up " + sstable);
// Calculate the expected compacted filesize
long expectedRangeFileSize =
    cfs.getExpectedCompactedFileSize(Arrays.asList(sstable),
                                     OperationType.CLEANUP);
File compactionFileLocation =
    cfs.directories.getDirectoryForNewSSTables(expectedRangeFileSize);
if (compactionFileLocation == null)
    throw new IOException("disk full");


It looks like it is actually saying your disk is too full to complete the
compaction, not that it is actually full right now.

That said, a cleanup compaction does a 1:1 traversal of all SSTables,
writing out a new one without any data that no longer belongs on the
node due to range ownership changes. There is some lag in Cassandra
before the JVM is able to actually delete files from disk; perhaps you
are hitting this race condition?

=Rob
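
As a rough illustration of that space check, here is a minimal sketch (not
Cassandra code; the data directory path and class name below are made up for
the example) that compares the largest -Data.db file in a table's data
directory against the remaining usable space. If the largest sstable no
longer fits, a cleanup of that table is likely to fail with the same
"disk full" message:

import java.io.File;

public class CleanupSpaceCheck
{
    public static void main(String[] args)
    {
        // Hypothetical path: point this at the table's directory under data_file_directories.
        File dataDir = new File("/var/lib/cassandra/data/mykeyspace/mycf");

        long largestSstable = 0;
        File[] files = dataDir.listFiles();
        if (files != null)
        {
            for (File f : files)
            {
                // Only the sstable data component sizes matter for this rough estimate.
                if (f.getName().endsWith("-Data.db") && f.length() > largestSstable)
                    largestSstable = f.length();
            }
        }

        long usable = dataDir.getUsableSpace();
        System.out.println("largest sstable: " + largestSstable
                + " bytes, usable space: " + usable + " bytes");

        if (usable < largestSstable)
            System.out.println("Not enough headroom: cleaning up this table would likely report 'disk full'.");
    }
}

The real check quoted above is stricter (it asks the data directories for a
location that can hold the expected compacted size), but the idea is the
same: free space is compared against an estimate, not against what cleanup
will actually end up writing.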


Re: Cleanup understanding

2013-05-28 Thread Takenori Sato(Cloudian)

Hi Victor,

As Andrey said, running cleanup doesn't work as you expect.

  The reason I need to clean things is that I won't need most of my
 inserted data on the next day.


 Deleted objects (columns/records) become deletable from an sstable file once
 they have expired (after gc_grace_seconds).

 Such deletable objects are actually removed by compaction.

 The tricky part is that a deletable object remains unless all of its older
 objects (for the same row key) are contained in the set of sstable files
 involved in the compaction.
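
To make that last point concrete, here is a small sketch (a toy model only,
assuming Java 16+ for records; none of these class or method names exist in
Cassandra) that treats sstables as maps of row key to timestamped cells and
shows why purging a tombstone is only safe when every sstable holding older
versions of that row takes part in the compaction:

import java.util.*;

public class TombstonePurgeSketch
{
    // A toy cell: a value (null means tombstone) plus its write timestamp.
    record Cell(String value, long timestamp) {}

    // Toy read path: merge all sstables and keep the newest cell per key.
    static Cell read(String key, List<Map<String, Cell>> sstables)
    {
        return sstables.stream()
                       .map(t -> t.get(key))
                       .filter(Objects::nonNull)
                       .max(Comparator.comparingLong(Cell::timestamp))
                       .orElse(null);
    }

    public static void main(String[] args)
    {
        Map<String, Cell> oldSstable = Map.of("row1", new Cell("v1", 100));  // original value
        Map<String, Cell> newSstable = Map.of("row1", new Cell(null, 200));  // expired tombstone

        // Compact only the newer sstable and (incorrectly) purge the expired tombstone.
        Map<String, Cell> compacted = new HashMap<>();

        // A later read merging the compacted result with the untouched older sstable
        // now "resurrects" the deleted value, which is why the tombstone must be kept
        // unless every sstable holding that row key is part of the compaction.
        Cell c = read("row1", List.of(compacted, oldSstable));
        System.out.println(c == null ? "row1 is deleted" : "row1 = " + c.value());  // prints: row1 = v1
    }
}

If the tombstone had been kept instead of purged, the merge would have
returned the newer deletion marker and the row would still read as deleted.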


- Takenori

(2013/05/29 3:01), Andrey Ilinykh wrote:
cleanup removes data which doesn't belong to the current node. You 
have to run it only if you move (or add new) nodes. In your case there 
is no reason to do it.



On Tue, May 28, 2013 at 7:39 AM, Víctor Hugo Oliveira Molinar 
vhmoli...@gmail.com mailto:vhmoli...@gmail.com wrote:


Hello everyone.
I have a daily maintenance task at c* which does:

-truncate cfs
-clearsnapshots
-repair
-cleanup

The reason I need to clean things is that I won't need most of my
inserted data on the next day. It's kind of a business requirement.

Well, the problem I'm running into is my misunderstanding of the
cleanup operation.
I have 2 nodes, each using less than half of its disk, which is
roughly 13GB.

But over the last few days, each node has arbitrarily reported a
cleanup error indicating that the disk was full, which is not true.

/Error occured during cleanup/
/java.util.concurrent.ExecutionException: java.io.IOException:
disk full/


So I'd like to know more about what happens in a cleanup
operation.
I appreciate any help.






Re: Cleanup the peers columnfamily

2013-05-06 Thread Sylvain Lebresne
What version of Cassandra are you using? If you're using 1.2.0 (or *were*
using 1.2.0 when the 2 nodes were removed), you might be seeing
https://issues.apache.org/jira/browse/CASSANDRA-5167.

 Or I have to delete the row in the table

That should work.
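
For what it's worth, a minimal sketch of doing that by hand (assuming the
DataStax Java driver 2.x; the contact point and peer address below are
hypothetical, and you should only do this after confirming the nodes are
really gone from the ring) could look like:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class RemoveStalePeer
{
    public static void main(String[] args)
    {
        // Connect to the node whose system.peers table still lists the removed nodes.
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect();

        // Hypothetical address of a node that has already been removed from the cluster.
        session.execute("DELETE FROM system.peers WHERE peer = '10.0.0.3'");

        cluster.close();
    }
}

The same statement can of course be run from cqlsh against that node; the
point is simply that the stale row lives in the local system.peers table of
the node that is reporting it.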


On Mon, May 6, 2013 at 4:22 PM, Shahryar Sedghi shsed...@gmail.com wrote:

 I had a 4-node cluster in my dev environment and, due to resource
 limitations, I had to remove two nodes. Nodetool status shows only two nodes
 on both machines, but the peers table on one machine still shows entries for
 the removed nodes with a null rpc address. Thrift has no problem with it, but
 the new binary protocol client is slow connecting to that node because of
 those entries.

 Nodetool remove does recognize those removed nodes. Is there a way, through
 the commands, to remove those entries, or do I have to delete the row in the
 table?

 Thanks  in advance

 Shahryar



Re: cleanup crashing with java.util.concurrent.ExecutionException: java.lang.ArrayIndexOutOfBoundsException: 8

2012-03-14 Thread Maki Watanabe
Fixed in 1.0.9, 1.1.0
https://issues.apache.org/jira/browse/CASSANDRA-3989

It is better to avoid using cleanup/scrub/upgradesstables on 1.0.7 if you
can, though it will not corrupt sstables.

2012/3/14 Thomas van Neerijnen t...@bossastudios.com:
 Hi all

 I am trying to run a cleanup on a column family and am getting the following
 error returned after about 15 seconds. A cleanup on a slightly smaller
 column family completes in about 21 minutes. This is on the Apache packaged
 version of Cassandra on Ubuntu 11.10, version 1.0.7.

 ~# nodetool -h localhost cleanup Player PlayerDetail
 Error occured during cleanup
 java.util.concurrent.ExecutionException:
 java.lang.ArrayIndexOutOfBoundsException: 8
     at
 java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
     at java.util.concurrent.FutureTask.get(FutureTask.java:83)
     at
 org.apache.cassandra.db.compaction.CompactionManager.performAllSSTableOperation(CompactionManager.java:203)
     at
 org.apache.cassandra.db.compaction.CompactionManager.performCleanup(CompactionManager.java:237)
     at
 org.apache.cassandra.db.ColumnFamilyStore.forceCleanup(ColumnFamilyStore.java:984)
     at
 org.apache.cassandra.service.StorageService.forceTableCleanup(StorageService.java:1635)
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
     at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
     at java.lang.reflect.Method.invoke(Method.java:597)
     at
 com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93)
     at
 com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27)
     at
 com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208)
     at
 com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:120)
     at
 com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:262)
     at
 com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:836)
     at
 com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761)
     at
 javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1427)
     at
 javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
     at
 javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265)
     at
 javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360)
     at
 javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:788)
     at sun.reflect.GeneratedMethodAccessor28.invoke(Unknown Source)
     at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
     at java.lang.reflect.Method.invoke(Method.java:597)
     at
 sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
     at sun.rmi.transport.Transport$1.run(Transport.java:159)
     at java.security.AccessController.doPrivileged(Native Method)
     at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
     at
 sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
     at
 sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
     at
 sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
     at
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
     at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
     at java.lang.Thread.run(Thread.java:662)
 Caused by: java.lang.ArrayIndexOutOfBoundsException: 8
     at
 org.apache.cassandra.db.compaction.LeveledManifest.add(LeveledManifest.java:298)
     at
 org.apache.cassandra.db.compaction.LeveledManifest.promote(LeveledManifest.java:186)
     at
 org.apache.cassandra.db.compaction.LeveledCompactionStrategy.handleNotification(LeveledCompactionStrategy.java:141)
     at
 org.apache.cassandra.db.DataTracker.notifySSTablesChanged(DataTracker.java:494)
     at
 org.apache.cassandra.db.DataTracker.replaceCompactedSSTables(DataTracker.java:234)
     at
 org.apache.cassandra.db.ColumnFamilyStore.replaceCompactedSSTables(ColumnFamilyStore.java:1006)
     at
 org.apache.cassandra.db.compaction.CompactionManager.doCleanupCompaction(CompactionManager.java:791)
     at
 org.apache.cassandra.db.compaction.CompactionManager.access$300(CompactionManager.java:63)
     at
 org.apache.cassandra.db.compaction.CompactionManager$5.perform(CompactionManager.java:241)
     at
 org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:182)
     at
 

Re: Cleanup in a write-only environment

2011-11-30 Thread Nick Bailey
I believe you are misunderstanding what cleanup does. Cleanup is used
to remove data from a node that the node no longer owns. For example
when you move a node in the ring, it changes responsibility and gets
new data, but does not automatically delete the data it used to be
responsible for but no longer is. In this situation, you run cleanup
to delete all of that old data.

Data that has been deleted/expired will get removed automatically as
compaction runs.

On Wed, Nov 30, 2011 at 7:24 AM, David McNelis
dmcne...@agentisenergy.com wrote:
 In my understanding, cleanup is meant to help clear out data that has been
 removed. If you have an environment where data is only ever added (the case
 for the production system I'm working with), is there a point to automating
 cleanup? I understand that if we were to ever purge a segment of data from
 our cluster we'd certainly want to run it, or after adding a new node and
 adjusting the tokens.

 So I want to make sure I'm not missing something here: are there other
 reasons to run cleanup regularly?

 --
 David McNelis
 Lead Software Engineer
 Agentis Energy
 www.agentisenergy.com
 c: 219.384.5143

 A Smart Grid technology company focused on helping consumers of energy
 control an often under-managed resource.




Re: Cleanup in a write-only environment

2011-11-30 Thread Edward Capriolo
Your understanding of nodetool cleanup is not correct. Cleanup is used only
after cluster balancing, like adding or removing nodes. It removes data that
does not belong on the node anymore (in older versions it removed hints as
well).

The real question is whether you need to run compaction. In a write-only
workload you should let Cassandra do its normal compaction (in most cases).

On Wednesday, November 30, 2011, David McNelis dmcne...@agentisenergy.com
wrote:
  In my understanding, cleanup is meant to help clear out data that has
  been removed. If you have an environment where data is only ever added
 (the case for the production system I'm working with), is there a point to
 automating cleanup? I understand that if we were to ever purge a segment
 of data from our cluster we'd certainly want to run it, or after adding a
 new node and adjusting the tokens.
  So I want to make sure I'm not missing something here: are there other
 reasons to run cleanup regularly?

 --
 David McNelis
 Lead Software Engineer
 Agentis Energy
 www.agentisenergy.com
 c: 219.384.5143
 A Smart Grid technology company focused on helping consumers of energy
control an often under-managed resource.




Re: Cleanup in a write-only environment

2011-11-30 Thread David McNelis
Thanks, folks.

I think I must have read compaction, thought cleanup, and gotten muddled
from there.

David
On Nov 30, 2011 6:45 PM, Edward Capriolo edlinuxg...@gmail.com wrote:

  Your understanding of nodetool cleanup is not correct. Cleanup is used
  only after cluster balancing, like adding or removing nodes. It removes data
  that does not belong on the node anymore (in older versions it removed
  hints as well).

  The real question is whether you need to run compaction. In a write-only
  workload you should let Cassandra do its normal compaction (in most cases).

 On Wednesday, November 30, 2011, David McNelis dmcne...@agentisenergy.com
 wrote:
   In my understanding, cleanup is meant to help clear out data that has
   been removed. If you have an environment where data is only ever added
  (the case for the production system I'm working with), is there a point to
  automating cleanup? I understand that if we were to ever purge a segment
  of data from our cluster we'd certainly want to run it, or after adding a
  new node and adjusting the tokens.
   So I want to make sure I'm not missing something here: are there other
   reasons to run cleanup regularly?
 
  --
  David McNelis
  Lead Software Engineer
  Agentis Energy
  www.agentisenergy.com
  c: 219.384.5143
  A Smart Grid technology company focused on helping consumers of energy
 control an often under-managed resource.
 
 


Re: cleanup / move

2011-09-12 Thread aaron morton
  is there a technical problem with running a nodetool move on a node while a 
 cleanup is running?  
Cleanup removes data that the node is no longer responsible for, while move 
first removes *all* data from the node and then streams new data to it. 

I'd put that in the crossing the streams category 
(http://www.youtube.com/watch?v=jyaLZHiJJnE). i.e. best avoided. 

To kill the cleanup, kill the node. Operations such as that create new data and 
then delete old data; they do not mutate existing data. 

Cleanup will write new SSTables and then mark the old ones as compacted. When 
the old SSTables are marked as compacted you will see a zero-length 
.Compacted file. Cassandra will delete the compacted data files when it needs 
to. 

If you want the deletion to happen sooner rather than later, force a Java GC 
through JConsole. 
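
If you would rather trigger that GC from a script than from JConsole, a
minimal sketch (assuming remote JMX is reachable on the default port 7199
with no authentication; adjust host, port and credentials for your setup)
could look like:

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class ForceGc
{
    public static void main(String[] args) throws Exception
    {
        // Same JMX endpoint that nodetool and JConsole use (default port 7199).
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:7199/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try
        {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            // Equivalent to pressing "Perform GC" in JConsole's Memory tab.
            mbs.invoke(new ObjectName("java.lang:type=Memory"), "gc", null, null);
        }
        finally
        {
            connector.close();
        }
    }
}

This only nudges the JVM to collect the references that keep the compacted
files alive, so the files may still take a moment to disappear.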

Hope that helps. 
 
-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 13/09/2011, at 7:41 AM, David McNelis wrote:

 While it would certainly be preferable not to run a cleanup and a move at 
 the same time on the same node, is there a technical problem with running a 
 nodetool move on a node while a cleanup is running? Or is it possible to 
 gracefully kill a cleanup, so that a move can be run and then cleanup run 
 after?
 
 We have a node that is almost full and we need to move it so that we can shift 
 its load, but it already has a cleanup process running which, instead of 
 reducing data usage as expected, is actually growing the amount of space 
 taken at a pretty fast rate.
 
 -- 
 David McNelis
 Lead Software Engineer
 Agentis Energy
 www.agentisenergy.com
 o: 630.359.6395
 c: 219.384.5143
 
 A Smart Grid technology company focused on helping consumers of energy 
 control an often under-managed resource.