Re: [Cassandra] Ignoring interval time

2017-05-30 Thread Akhil Mehra
The debug output is from the failure detector in the gossip module. The code can
be found here:
https://github.com/apache/cassandra/blob/8b3a60b9a7dbefeecc06bace617279612ec7092d/src/java/org/apache/cassandra/gms/FailureDetector.java#L450-L474

The debug logging above reports an acknowledgement latency of roughly 2 to 3
seconds for each gossip message sent by the node to the listed IP addresses.
This is above the expected 2-second threshold defined by MAX_INTERVAL_IN_NANO.
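
To make the check concrete, here is a simplified, illustrative sketch in Java of
the arrival-window logic behind that log line, modeled on the linked
FailureDetector.java (field names are abridged and the phi-accrual bookkeeping is
omitted, so treat it as a sketch rather than the exact implementation):

import java.net.InetAddress;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.TimeUnit;

// Illustrative sketch only: intervals above the threshold are logged and dropped;
// they do not feed the failure detector's estimate.
class ArrivalWindowSketch
{
    // roughly MAX_INTERVAL_IN_NANO: the expected 2 second gossip interval
    private static final long MAX_INTERVAL_IN_NANO = TimeUnit.SECONDS.toNanos(2);

    private final Deque<Long> arrivalIntervals = new ArrayDeque<>();
    private long tLast = 0L;

    synchronized void add(long nowNanos, InetAddress ep)
    {
        if (tLast > 0L)
        {
            long interArrivalTime = nowNanos - tLast; // nanos since the previous gossip ack from ep
            if (interArrivalTime <= MAX_INTERVAL_IN_NANO)
                arrivalIntervals.add(interArrivalTime); // used for the phi-accrual estimate
            else
                // this is the "Ignoring interval time of ... for ..." debug line
                System.out.printf("Ignoring interval time of %d for %s%n", interArrivalTime, ep);
        }
        tLast = nowNanos;
    }
}

Each ignored interval means the gap since the previous gossip message from that
endpoint exceeded the threshold; seeing it continuously for most peers points at
the node (or cluster) being too busy to gossip on time.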

As Varun pointed out, your cluster is probably under too much load.

The pressure on your cluster might also be causing the read repairs you
reported in your earlier email.

Regards,
Akhil


On Wed, May 31, 2017 at 7:21 AM, Varun Gupta  wrote:

> Can you please check Cassandra Stats, if cluster is under too much load.
> This is the symptom, not the root cause.
>
> On Tue, May 30, 2017 at 2:33 AM, Abhishek Kumar Maheshwari <
> abhishek.maheshw...@timesinternet.in> wrote:
>
>> Hi All,
>>
>>
>>
>> Please let me know why this debug log is coming:
>>
>>
>>
>> DEBUG [GossipStage:1] 2017-05-30 15:01:31,496 FailureDetector.java:456 -
>> Ignoring interval time of 2000686406 for /XXX.XX.XXX.204
>>
>> DEBUG [GossipStage:1] 2017-05-30 15:01:34,497 FailureDetector.java:456 -
>> Ignoring interval time of 2349724693 for /XXX.XX.XXX.207
>>
>> DEBUG [GossipStage:1] 2017-05-30 15:01:34,497 FailureDetector.java:456 -
>> Ignoring interval time of 2000655389 for /XXX.XX.XXX.206
>>
>> DEBUG [GossipStage:1] 2017-05-30 15:01:34,497 FailureDetector.java:456 -
>> Ignoring interval time of 2000721304 for /XXX.XX.XXX.201
>>
>> DEBUG [GossipStage:1] 2017-05-30 15:01:34,497 FailureDetector.java:456 -
>> Ignoring interval time of 2000770809 for /XXX.XX.XXX.202
>>
>> DEBUG [GossipStage:1] 2017-05-30 15:01:34,497 FailureDetector.java:456 -
>> Ignoring interval time of 2000825217 for /XXX.XX.XXX.209
>>
>> DEBUG [GossipStage:1] 2017-05-30 15:01:35,449 FailureDetector.java:456 -
>> Ignoring interval time of 2953167747 for /XXX.XX.XXX.205
>>
>> DEBUG [GossipStage:1] 2017-05-30 15:01:37,497 FailureDetector.java:456 -
>> Ignoring interval time of 2047662469 for /XXX.XX.XXX.205
>>
>> DEBUG [GossipStage:1] 2017-05-30 15:01:37,497 FailureDetector.java:456 -
>> Ignoring interval time of 2000717144 for /XXX.XX.XXX.207
>>
>> DEBUG [GossipStage:1] 2017-05-30 15:01:37,497 FailureDetector.java:456 -
>> Ignoring interval time of 2000780785 for /XXX.XX.XXX.201
>>
>> DEBUG [GossipStage:1] 2017-05-30 15:01:38,497 FailureDetector.java:456 -
>> Ignoring interval time of 2000113606 for /XXX.XX.XXX.209
>>
>> DEBUG [GossipStage:1] 2017-05-30 15:01:39,121 FailureDetector.java:456 -
>> Ignoring interval time of 2334491585 for /XXX.XX.XXX.204
>>
>> DEBUG [GossipStage:1] 2017-05-30 15:01:39,497 FailureDetector.java:456 -
>> Ignoring interval time of 2000209788 for /XXX.XX.XXX.207
>>
>> DEBUG [GossipStage:1] 2017-05-30 15:01:39,497 FailureDetector.java:456 -
>> Ignoring interval time of 2000226568 for /XXX.XX.XXX.208
>>
>> DEBUG [GossipStage:1] 2017-05-30 15:01:42,178 FailureDetector.java:456 -
>> Ignoring interval time of 2390977968 for /XXX.XX.XXX.204
>>
>>
>>
>> *Thanks & Regards,*
>> *Abhishek Kumar Maheshwari*
>> *+91- 805591 (Mobile)*
>>
>> Times Internet Ltd. | A Times of India Group Company
>>
>> FC - 6, Sector 16A, Film City,  Noida,  U.P. 201301 | INDIA
>>
>> *Please do not print this email unless it is absolutely necessary.
>> Spread environmental awareness.*
>>
>>
>>
>
>


Netty SSL memory leak

2017-05-30 Thread John Sanda
I have a Cassandra 3.0.9 cluster that is hitting OutOfMemoryErrors during byte
buffer allocation. The stack trace looks like:

java.lang.OutOfMemoryError: Direct buffer memory
at java.nio.Bits.reserveMemory(Bits.java:694) ~[na:1.8.0_131]
at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123)
~[na:1.8.0_131]
at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311)
~[na:1.8.0_131]
at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:434)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:179)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
at io.netty.buffer.PoolArena.allocate(PoolArena.java:168)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
at io.netty.buffer.PoolArena.allocate(PoolArena.java:98)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
at
io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:250)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
at
io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:155)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
at
io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:146)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
at
io.netty.buffer.AbstractByteBufAllocator.buffer(AbstractByteBufAllocator.java:83)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
at io.netty.handler.ssl.SslHandler.allocate(SslHandler.java:1265)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
at
io.netty.handler.ssl.SslHandler.allocateOutNetBuf(SslHandler.java:1275)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
at io.netty.handler.ssl.SslHandler.wrap(SslHandler.java:453)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
at io.netty.handler.ssl.SslHandler.flush(SslHandler.java:432)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
at
io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:688)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]

I do not yet have a heap dump. The two relevant tickets are CASSANDRA-13114 and
CASSANDRA-13126. The upstream Netty ticket is 3057. Cassandra 3.0.11 upgraded
Netty to the version with the fix. Is there anything I can check to confirm that
this is in fact the issue I am hitting?

Secondly, is there a way to monitor for this? The OOME does not cause the
JVM to exit. Instead, the logs are getting filled up with OutOfMemoryErrors.
nodetool status reports UN, and nodetool statusbinary reports running.
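
One thing I could watch in the meantime (a minimal sketch, not Cassandra-specific;
the poll interval and output format are just illustrative) is the JVM's direct
buffer pool via the standard BufferPoolMXBean, either in-process or remotely over
JMX (the MBean is java.nio:type=BufferPool,name=direct):

import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;

public class DirectBufferMonitor
{
    public static void main(String[] args) throws InterruptedException
    {
        while (true)
        {
            for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class))
            {
                if ("direct".equals(pool.getName()))
                    // memory used creeping toward -XX:MaxDirectMemorySize is the warning sign
                    System.out.printf("direct buffers: count=%d used=%d capacity=%d%n",
                                      pool.getCount(), pool.getMemoryUsed(), pool.getTotalCapacity());
            }
            Thread.sleep(10_000L); // poll every 10s; in practice export this to a metrics system
        }
    }
}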

-- 

- John


Re: Restarting nodes and reported load

2017-05-30 Thread Jonathan Haddad
You're the only one I see in the thread that's made any reference to HDFS.
The OP even noted that his question is about C*, not HDFS.

On Tue, May 30, 2017 at 2:59 PM daemeon reiydelle 
wrote:

> Did you notice that HDFS is the distributed file system used?
>
>
>
>
>
> *Daemeon C.M. Reiydelle*
> *USA (+1) 415.501.0198 | London (+44) (0) 20 8144 9872*
>
>
> *“All men dream, but not equally. Those who dream by night in the dusty
> recesses of their minds wake up in the day to find it was vanity, but the
> dreamers of the day are dangerous men, for they may act their dreams with
> open eyes, to make it possible.” — T.E. Lawrence*
>
>
> On Tue, May 30, 2017 at 2:18 PM, Jonathan Haddad 
> wrote:
>
>> This isn't an HDFS mailing list.
>>
>> On Tue, May 30, 2017 at 2:14 PM daemeon reiydelle 
>> wrote:
>>
>>> no, 3tb is small. 30-50tb of hdfs space is typical these days per hdfs
>>> node. Depends somewhat on whether there is a mix of more and less
>>> frequently accessed data. But even storing only hot data, never saw
>>> anything less than 20tb hdfs per node.
>>>
>>>
>>>
>>>
>>>
>>> *Daemeon C.M. Reiydelle*
>>> *USA (+1) 415.501.0198 | London (+44) (0) 20 8144 9872*
>>>
>>>
>>> *“All men dream, but not equally. Those who dream by night in the dusty
>>> recesses of their minds wake up in the day to find it was vanity, but the
>>> dreamers of the day are dangerous men, for they may act their dreams with
>>> open eyes, to make it possible.” — T.E. Lawrence*
>>>
>>>
>>> On Tue, May 30, 2017 at 2:00 PM, tommaso barbugli 
>>> wrote:
>>>
 Am I the only one thinking 3TB is way too much data for a single node
 on a VM?

 On Tue, May 30, 2017 at 10:36 PM, Daniel Steuernol <
 dan...@sendwithus.com> wrote:

> I don't believe incremental repair is enabled, I have never enabled it
> on the cluster, and unless it's the default then it is off. Also I don't
> see a setting in cassandra.yaml for it.
>
>
>
> On May 30 2017, at 1:10 pm, daemeon reiydelle 
> wrote:
>
>> Unless there is a bug, snapshots are excluded (they are not HDFS
>> anyway!) from nodetool status.
>>
>> Out of curiousity, is incremenatal repair enabled? This is almost
>> certainly a rat hole, but there was an issue a few releases back where 
>> load
>> would only increase until the node was restarted. Had been fixed ages 
>> ago,
>> but wondering what happens if you restart a node, IF you have incremental
>> enabled.
>>
>>
>>
>>
>>
>> *Daemeon C.M. Reiydelle*
>> *USA (+1) 415.501.0198 | London (+44) (0) 20 8144 9872*
>>
>>
>> *“All men dream, but not equally. Those who dream by night in the
>> dusty recesses of their minds wake up in the day to find it was vanity, 
>> but
>> the dreamers of the day are dangerous men, for they may act their dreams
>> with open eyes, to make it possible.” — T.E. Lawrence*
>>
>>
>> On Tue, May 30, 2017 at 12:15 PM, Varun Gupta 
>> wrote:
>>
>> Can you please check if you have incremental backup enabled and
>> snapshots are occupying the space.
>>
>> run nodetool clearsnapshot command.
>>
>> On Tue, May 30, 2017 at 11:12 AM, Daniel Steuernol <
>> dan...@sendwithus.com> wrote:
>>
>> It's 3-4TB per node, and by load rises, I'm talking about load as
>> reported by nodetool status.
>>
>>
>>
>> On May 30 2017, at 10:25 am, daemeon reiydelle 
>> wrote:
>>
>> When you say "the load rises ... ", could you clarify what you mean
>> by "load"? That has a specific Linux term, and in e.g. Cloudera Manager.
>> But in neither case would that be relevant to transient or persisted 
>> disk.
>> Am I missing something?
>>
>>
>> On Tue, May 30, 2017 at 10:18 AM, tommaso barbugli <
>> tbarbu...@gmail.com> wrote:
>>
>> 3-4 TB per node or in total?
>>
>> On Tue, May 30, 2017 at 6:48 PM, Daniel Steuernol <
>> dan...@sendwithus.com> wrote:
>>
>> I should also mention that I am running cassandra 3.10 on the cluster
>>
>>
>>
>> On May 29 2017, at 9:43 am, Daniel Steuernol 
>> wrote:
>>
>> The cluster is running with RF=3, right now each node is storing
>> about 3-4 TB of data. I'm using r4.2xlarge EC2 instances, these have 8
>> vCPU's, 61 GB of RAM, and the disks attached for the data drive are gp2 
>> ssd
>> ebs volumes with 10k iops. I guess this brings up the question of what's 
>> a
>> good marker to decide on whether to increase disk space vs provisioning a
>> new node?
>>
>>
>> On May 29 2017, at 9:35 am, 

Re: Restarting nodes and reported load

2017-05-30 Thread daemeon reiydelle
Did you notice that HDFS is the distributed file system used?





*Daemeon C.M. Reiydelle*
*USA (+1) 415.501.0198 | London (+44) (0) 20 8144 9872*


*“All men dream, but not equally. Those who dream by night in the dusty
recesses of their minds wake up in the day to find it was vanity, but the
dreamers of the day are dangerous men, for they may act their dreams with
open eyes, to make it possible.” — T.E. Lawrence*


On Tue, May 30, 2017 at 2:18 PM, Jonathan Haddad  wrote:

> This isn't an HDFS mailing list.
>
> On Tue, May 30, 2017 at 2:14 PM daemeon reiydelle 
> wrote:
>
>> no, 3tb is small. 30-50tb of hdfs space is typical these days per hdfs
>> node. Depends somewhat on whether there is a mix of more and less
>> frequently accessed data. But even storing only hot data, never saw
>> anything less than 20tb hdfs per node.
>>
>>
>>
>>
>>
>> *Daemeon C.M. Reiydelle*
>> *USA (+1) 415.501.0198 | London (+44) (0) 20 8144 9872*
>>
>>
>> *“All men dream, but not equally. Those who dream by night in the dusty
>> recesses of their minds wake up in the day to find it was vanity, but the
>> dreamers of the day are dangerous men, for they may act their dreams with
>> open eyes, to make it possible.” — T.E. Lawrence*
>>
>>
>> On Tue, May 30, 2017 at 2:00 PM, tommaso barbugli 
>> wrote:
>>
>>> Am I the only one thinking 3TB is way too much data for a single node on
>>> a VM?
>>>
>>> On Tue, May 30, 2017 at 10:36 PM, Daniel Steuernol <
>>> dan...@sendwithus.com> wrote:
>>>
 I don't believe incremental repair is enabled, I have never enabled it
 on the cluster, and unless it's the default then it is off. Also I don't
 see a setting in cassandra.yaml for it.



 On May 30 2017, at 1:10 pm, daemeon reiydelle 
 wrote:

> Unless there is a bug, snapshots are excluded (they are not HDFS
> anyway!) from nodetool status.
>
> Out of curiousity, is incremenatal repair enabled? This is almost
> certainly a rat hole, but there was an issue a few releases back where 
> load
> would only increase until the node was restarted. Had been fixed ages ago,
> but wondering what happens if you restart a node, IF you have incremental
> enabled.
>
>
>
>
>
> *Daemeon C.M. Reiydelle*
> *USA (+1) 415.501.0198 | London (+44) (0) 20 8144 9872*
>
>
> *“All men dream, but not equally. Those who dream by night in the
> dusty recesses of their minds wake up in the day to find it was vanity, 
> but
> the dreamers of the day are dangerous men, for they may act their dreams
> with open eyes, to make it possible.” — T.E. Lawrence*
>
>
> On Tue, May 30, 2017 at 12:15 PM, Varun Gupta  wrote:
>
> Can you please check if you have incremental backup enabled and
> snapshots are occupying the space.
>
> run nodetool clearsnapshot command.
>
> On Tue, May 30, 2017 at 11:12 AM, Daniel Steuernol <
> dan...@sendwithus.com> wrote:
>
> It's 3-4TB per node, and by load rises, I'm talking about load as
> reported by nodetool status.
>
>
>
> On May 30 2017, at 10:25 am, daemeon reiydelle 
> wrote:
>
> When you say "the load rises ... ", could you clarify what you mean by
> "load"? That has a specific Linux term, and in e.g. Cloudera Manager. But
> in neither case would that be relevant to transient or persisted disk. Am 
> I
> missing something?
>
>
> On Tue, May 30, 2017 at 10:18 AM, tommaso barbugli <
> tbarbu...@gmail.com> wrote:
>
> 3-4 TB per node or in total?
>
> On Tue, May 30, 2017 at 6:48 PM, Daniel Steuernol <
> dan...@sendwithus.com> wrote:
>
> I should also mention that I am running cassandra 3.10 on the cluster
>
>
>
> On May 29 2017, at 9:43 am, Daniel Steuernol 
> wrote:
>
> The cluster is running with RF=3, right now each node is storing about
> 3-4 TB of data. I'm using r4.2xlarge EC2 instances, these have 8 vCPU's, 
> 61
> GB of RAM, and the disks attached for the data drive are gp2 ssd ebs
> volumes with 10k iops. I guess this brings up the question of what's a 
> good
> marker to decide on whether to increase disk space vs provisioning a new
> node?
>
>
> On May 29 2017, at 9:35 am, tommaso barbugli 
> wrote:
>
> Hi Daniel,
>
> This is not normal. Possibly a capacity problem. Whats the RF, how
> much data do you store per node and what kind of servers do you use (core
> count, RAM, disk, ...)?
>
> Cheers,
> Tommaso
>
> On Mon, May 29, 2017 at 6:22 PM, Daniel Steuernol <
> dan...@sendwithus.com> wrote:
>
>
> I am running a 6 

Re: Restarting nodes and reported load

2017-05-30 Thread Jonathan Haddad
Daniel - my comment wasn't to you, it was in response to Daemeon.

> no, 3tb is small. 30-50tb of hdfs space is typical these days per hdfs
node

Jon

On Tue, May 30, 2017 at 2:30 PM Daniel Steuernol 
wrote:

> My question is about cassandra, ultimately I'm trying to figure out why
> our clusters performance degrades approximately every 6 days. I noticed
> that the load as reported by nodetool status was very high, but that might
> be unrelated to the problem. A restart solves the performance problem.
>
> I've attached a latency graph for inserts into the cluster as you can see
> over the weekend there was a massive latency spike, and it was fixed by a
> restart of all the nodes.
>
> On May 30 2017, at 2:18 pm, Jonathan Haddad  wrote:
>
>> This isn't an HDFS mailing list.
>>
>> On Tue, May 30, 2017 at 2:14 PM daemeon reiydelle 
>> wrote:
>>
>> no, 3tb is small. 30-50tb of hdfs space is typical these days per hdfs
>> node. Depends somewhat on whether there is a mix of more and less
>> frequently accessed data. But even storing only hot data, never saw
>> anything less than 20tb hdfs per node.
>>
>>
>>
>>
>>
>> *Daemeon C.M. Reiydelle*
>> *USA (+1) 415.501.0198 | London (+44) (0) 20 8144 9872*
>>
>>
>> *“All men dream, but not equally. Those who dream by night in the dusty
>> recesses of their minds wake up in the day to find it was vanity, but the
>> dreamers of the day are dangerous men, for they may act their dreams with
>> open eyes, to make it possible.” — T.E. Lawrence*
>>
>>
>> On Tue, May 30, 2017 at 2:00 PM, tommaso barbugli 
>> wrote:
>>
>> Am I the only one thinking 3TB is way too much data for a single node on
>> a VM?
>>
>> On Tue, May 30, 2017 at 10:36 PM, Daniel Steuernol > > wrote:
>>
>> I don't believe incremental repair is enabled, I have never enabled it on
>> the cluster, and unless it's the default then it is off. Also I don't see a
>> setting in cassandra.yaml for it.
>>
>>
>>
>> On May 30 2017, at 1:10 pm, daemeon reiydelle 
>> wrote:
>>
>> Unless there is a bug, snapshots are excluded (they are not HDFS anyway!)
>> from nodetool status.
>>
>> Out of curiousity, is incremenatal repair enabled? This is almost
>> certainly a rat hole, but there was an issue a few releases back where load
>> would only increase until the node was restarted. Had been fixed ages ago,
>> but wondering what happens if you restart a node, IF you have incremental
>> enabled.
>>
>>
>>
>>
>>
>> *Daemeon C.M. Reiydelle*
>> *USA (+1) 415.501.0198 | London (+44) (0) 20 8144 9872*
>>
>>
>> *“All men dream, but not equally. Those who dream by night in the dusty
>> recesses of their minds wake up in the day to find it was vanity, but the
>> dreamers of the day are dangerous men, for they may act their dreams with
>> open eyes, to make it possible.” — T.E. Lawrence*
>>
>>
>> On Tue, May 30, 2017 at 12:15 PM, Varun Gupta  wrote:
>>
>> Can you please check if you have incremental backup enabled and snapshots
>> are occupying the space.
>>
>> run nodetool clearsnapshot command.
>>
>> On Tue, May 30, 2017 at 11:12 AM, Daniel Steuernol > > wrote:
>>
>> It's 3-4TB per node, and by load rises, I'm talking about load as
>> reported by nodetool status.
>>
>>
>>
>> On May 30 2017, at 10:25 am, daemeon reiydelle 
>> wrote:
>>
>> When you say "the load rises ... ", could you clarify what you mean by
>> "load"? That has a specific Linux term, and in e.g. Cloudera Manager. But
>> in neither case would that be relevant to transient or persisted disk. Am I
>> missing something?
>>
>>
>> On Tue, May 30, 2017 at 10:18 AM, tommaso barbugli 
>> wrote:
>>
>> 3-4 TB per node or in total?
>>
>> On Tue, May 30, 2017 at 6:48 PM, Daniel Steuernol 
>> wrote:
>>
>> I should also mention that I am running cassandra 3.10 on the cluster
>>
>>
>>
>> On May 29 2017, at 9:43 am, Daniel Steuernol 
>> wrote:
>>
>> The cluster is running with RF=3, right now each node is storing about
>> 3-4 TB of data. I'm using r4.2xlarge EC2 instances, these have 8 vCPU's, 61
>> GB of RAM, and the disks attached for the data drive are gp2 ssd ebs
>> volumes with 10k iops. I guess this brings up the question of what's a good
>> marker to decide on whether to increase disk space vs provisioning a new
>> node?
>>
>>
>> On May 29 2017, at 9:35 am, tommaso barbugli 
>> wrote:
>>
>> Hi Daniel,
>>
>> This is not normal. Possibly a capacity problem. Whats the RF, how much
>> data do you store per node and what kind of servers do you use (core count,
>> RAM, disk, ...)?
>>
>> Cheers,
>> Tommaso
>>
>> On Mon, May 29, 2017 at 6:22 PM, Daniel Steuernol 
>> wrote:
>>
>>
>> I am running a 6 node cluster, and I have noticed that the reported load
>> on each node rises throughout the week and grows way past 

Re: Restarting nodes and reported load

2017-05-30 Thread Daniel Steuernol
My question is about Cassandra; ultimately I'm trying to figure out why our cluster's performance degrades approximately every 6 days. I noticed that the load as reported by nodetool status was very high, but that might be unrelated to the problem. A restart solves the performance problem.

I've attached a latency graph for inserts into the cluster; as you can see, over the weekend there was a massive latency spike, and it was fixed by a restart of all the nodes.
  

On May 30 2017, at 2:18 pm, Jonathan Haddad  wrote:


  This isn't an HDFS mailing list.On Tue, May 30, 2017 at 2:14 PM daemeon reiydelle  wrote:no, 3tb is small. 30-50tb of hdfs space is typical these days per hdfs node. Depends somewhat on whether there is a mix of more and less frequently accessed data. But even storing only hot data, never saw anything less than 20tb hdfs per node.Daemeon C.M. ReiydelleUSA (+1) 415.501.0198London (+44) (0) 20 8144 9872“All men dream, but not equally. Those who dream by night in the dusty 
recesses of their minds wake up in the day to find it was vanity, but 
the dreamers of the day are dangerous men, for they may act their dreams
 with open eyes, to make it possible.” — T.E. Lawrence
On Tue, May 30, 2017 at 2:00 PM, tommaso barbugli  wrote:Am I the only one thinking 3TB is way too much data for a single node on a VM?On Tue, May 30, 2017 at 10:36 PM, Daniel Steuernol  wrote:I don't believe incremental repair is enabled, I have never enabled it on the cluster, and unless it's the default then it is off. Also I don't see a setting in cassandra.yaml for it.
  

On May 30 2017, at 1:10 pm, daemeon reiydelle  wrote:


  Unless there is a bug, snapshots are excluded (they are not HDFS anyway!) from nodetool status. Out of curiousity, is incremenatal repair enabled? This is almost certainly a rat hole, but there was an issue a few releases back where load would only increase until the node was restarted. Had been fixed ages ago, but wondering what happens if you restart a node, IF you have incremental enabled.Daemeon C.M. ReiydelleUSA (+1) 415.501.0198London (+44) (0) 20 8144 9872“All men dream, but not equally. Those who dream by night in the dusty 
recesses of their minds wake up in the day to find it was vanity, but 
the dreamers of the day are dangerous men, for they may act their dreams
 with open eyes, to make it possible.” — T.E. Lawrence
On Tue, May 30, 2017 at 12:15 PM, Varun Gupta  wrote:Can you please check if you have incremental backup enabled and snapshots are occupying the space.run nodetool clearsnapshot command.On Tue, May 30, 2017 at 11:12 AM, Daniel Steuernol  wrote:It's 3-4TB per node, and by load rises, I'm talking about load as reported by nodetool status.
  

On May 30 2017, at 10:25 am, daemeon reiydelle  wrote:


  When you say "the load rises ... ", could you clarify what you mean by "load"? That has a specific Linux term, and in e.g. Cloudera Manager. But in neither case would that be relevant to transient or persisted disk. Am I missing something?
On Tue, May 30, 2017 at 10:18 AM, tommaso barbugli  wrote:3-4 TB per node or in total?On Tue, May 30, 2017 at 6:48 PM, Daniel Steuernol  wrote:I should also mention that I am running cassandra 3.10 on the cluster
  

On May 29 2017, at 9:43 am, Daniel Steuernol  wrote:


  The cluster is running with RF=3, right now each node is storing about 3-4 TB of data. I'm using r4.2xlarge EC2 instances, these have 8 vCPU's, 61 GB of RAM, and the disks attached for the data drive are gp2 ssd ebs volumes with 10k iops. I guess this brings up the question of what's a good marker to decide on whether to increase disk space vs provisioning a new node?
  

On May 29 2017, at 9:35 am, tommaso barbugli  wrote:


  Hi Daniel,This is not normal. Possibly a capacity problem. Whats the RF, how much data do you store per node and what kind of servers do you use (core count, RAM, disk, ...)?Cheers,TommasoOn Mon, May 29, 2017 at 6:22 PM, Daniel Steuernol  wrote:I am running a 6 node cluster, and I have noticed that the reported load on each node rises throughout the week and grows way past the actual disk space used and available on each node. Also eventually latency for operations suffers and the nodes have to be restarted. A couple questions on this, is this normal? Also does cassandra need to be restarted every few days for best performance? Any insight on this 

Re: Restarting nodes and reported load

2017-05-30 Thread Jonathan Haddad
This isn't an HDFS mailing list.

On Tue, May 30, 2017 at 2:14 PM daemeon reiydelle 
wrote:

> no, 3tb is small. 30-50tb of hdfs space is typical these days per hdfs
> node. Depends somewhat on whether there is a mix of more and less
> frequently accessed data. But even storing only hot data, never saw
> anything less than 20tb hdfs per node.
>
>
>
>
>
> *Daemeon C.M. Reiydelle*
> *USA (+1) 415.501.0198 | London (+44) (0) 20 8144 9872*
>
>
> *“All men dream, but not equally. Those who dream by night in the dusty
> recesses of their minds wake up in the day to find it was vanity, but the
> dreamers of the day are dangerous men, for they may act their dreams with
> open eyes, to make it possible.” — T.E. Lawrence*
>
>
> On Tue, May 30, 2017 at 2:00 PM, tommaso barbugli 
> wrote:
>
>> Am I the only one thinking 3TB is way too much data for a single node on
>> a VM?
>>
>> On Tue, May 30, 2017 at 10:36 PM, Daniel Steuernol > > wrote:
>>
>>> I don't believe incremental repair is enabled, I have never enabled it
>>> on the cluster, and unless it's the default then it is off. Also I don't
>>> see a setting in cassandra.yaml for it.
>>>
>>>
>>>
>>> On May 30 2017, at 1:10 pm, daemeon reiydelle 
>>> wrote:
>>>
 Unless there is a bug, snapshots are excluded (they are not HDFS
 anyway!) from nodetool status.

 Out of curiousity, is incremenatal repair enabled? This is almost
 certainly a rat hole, but there was an issue a few releases back where load
 would only increase until the node was restarted. Had been fixed ages ago,
 but wondering what happens if you restart a node, IF you have incremental
 enabled.





 *Daemeon C.M. Reiydelle*
 *USA (+1) 415.501.0198 | London (+44) (0) 20 8144 9872*


 *“All men dream, but not equally. Those who dream by night in the dusty
 recesses of their minds wake up in the day to find it was vanity, but the
 dreamers of the day are dangerous men, for they may act their dreams with
 open eyes, to make it possible.” — T.E. Lawrence*


 On Tue, May 30, 2017 at 12:15 PM, Varun Gupta  wrote:

 Can you please check if you have incremental backup enabled and
 snapshots are occupying the space.

 run nodetool clearsnapshot command.

 On Tue, May 30, 2017 at 11:12 AM, Daniel Steuernol <
 dan...@sendwithus.com> wrote:

 It's 3-4TB per node, and by load rises, I'm talking about load as
 reported by nodetool status.



 On May 30 2017, at 10:25 am, daemeon reiydelle 
 wrote:

 When you say "the load rises ... ", could you clarify what you mean by
 "load"? That has a specific Linux term, and in e.g. Cloudera Manager. But
 in neither case would that be relevant to transient or persisted disk. Am I
 missing something?


 On Tue, May 30, 2017 at 10:18 AM, tommaso barbugli  wrote:

 3-4 TB per node or in total?

 On Tue, May 30, 2017 at 6:48 PM, Daniel Steuernol <
 dan...@sendwithus.com> wrote:

 I should also mention that I am running cassandra 3.10 on the cluster



 On May 29 2017, at 9:43 am, Daniel Steuernol 
 wrote:

 The cluster is running with RF=3, right now each node is storing about
 3-4 TB of data. I'm using r4.2xlarge EC2 instances, these have 8 vCPU's, 61
 GB of RAM, and the disks attached for the data drive are gp2 ssd ebs
 volumes with 10k iops. I guess this brings up the question of what's a good
 marker to decide on whether to increase disk space vs provisioning a new
 node?


 On May 29 2017, at 9:35 am, tommaso barbugli 
 wrote:

 Hi Daniel,

 This is not normal. Possibly a capacity problem. Whats the RF, how much
 data do you store per node and what kind of servers do you use (core count,
 RAM, disk, ...)?

 Cheers,
 Tommaso

 On Mon, May 29, 2017 at 6:22 PM, Daniel Steuernol <
 dan...@sendwithus.com> wrote:


 I am running a 6 node cluster, and I have noticed that the reported
 load on each node rises throughout the week and grows way past the actual
 disk space used and available on each node. Also eventually latency for
 operations suffers and the nodes have to be restarted. A couple questions
 on this, is this normal? Also does cassandra need to be restarted every few
 days for best performance? Any insight on this behaviour would be helpful.

 Cheers,
 Daniel
 -
 To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org For
 additional commands, e-mail: 

Re: Restarting nodes and reported load

2017-05-30 Thread daemeon reiydelle
No, 3 TB is small; 30-50 TB of HDFS space per node is typical these days. It
depends somewhat on whether there is a mix of more and less frequently accessed
data, but even storing only hot data, I never saw anything less than 20 TB of
HDFS per node.





*Daemeon C.M. Reiydelle*
*USA (+1) 415.501.0198 | London (+44) (0) 20 8144 9872*


*“All men dream, but not equally. Those who dream by night in the dusty
recesses of their minds wake up in the day to find it was vanity, but the
dreamers of the day are dangerous men, for they may act their dreams with
open eyes, to make it possible.” — T.E. Lawrence*


On Tue, May 30, 2017 at 2:00 PM, tommaso barbugli 
wrote:

> Am I the only one thinking 3TB is way too much data for a single node on a
> VM?
>
> On Tue, May 30, 2017 at 10:36 PM, Daniel Steuernol 
> wrote:
>
>> I don't believe incremental repair is enabled, I have never enabled it on
>> the cluster, and unless it's the default then it is off. Also I don't see a
>> setting in cassandra.yaml for it.
>>
>>
>>
>> On May 30 2017, at 1:10 pm, daemeon reiydelle 
>> wrote:
>>
>>> Unless there is a bug, snapshots are excluded (they are not HDFS
>>> anyway!) from nodetool status.
>>>
>>> Out of curiousity, is incremenatal repair enabled? This is almost
>>> certainly a rat hole, but there was an issue a few releases back where load
>>> would only increase until the node was restarted. Had been fixed ages ago,
>>> but wondering what happens if you restart a node, IF you have incremental
>>> enabled.
>>>
>>>
>>>
>>>
>>>
>>> *Daemeon C.M. Reiydelle*
>>> *USA (+1) 415.501.0198 | London (+44) (0) 20 8144 9872*
>>>
>>>
>>> *“All men dream, but not equally. Those who dream by night in the dusty
>>> recesses of their minds wake up in the day to find it was vanity, but the
>>> dreamers of the day are dangerous men, for they may act their dreams with
>>> open eyes, to make it possible.” — T.E. Lawrence*
>>>
>>>
>>> On Tue, May 30, 2017 at 12:15 PM, Varun Gupta  wrote:
>>>
>>> Can you please check if you have incremental backup enabled and
>>> snapshots are occupying the space.
>>>
>>> run nodetool clearsnapshot command.
>>>
>>> On Tue, May 30, 2017 at 11:12 AM, Daniel Steuernol <
>>> dan...@sendwithus.com> wrote:
>>>
>>> It's 3-4TB per node, and by load rises, I'm talking about load as
>>> reported by nodetool status.
>>>
>>>
>>>
>>> On May 30 2017, at 10:25 am, daemeon reiydelle 
>>> wrote:
>>>
>>> When you say "the load rises ... ", could you clarify what you mean by
>>> "load"? That has a specific Linux term, and in e.g. Cloudera Manager. But
>>> in neither case would that be relevant to transient or persisted disk. Am I
>>> missing something?
>>>
>>>
>>> On Tue, May 30, 2017 at 10:18 AM, tommaso barbugli 
>>> wrote:
>>>
>>> 3-4 TB per node or in total?
>>>
>>> On Tue, May 30, 2017 at 6:48 PM, Daniel Steuernol >> > wrote:
>>>
>>> I should also mention that I am running cassandra 3.10 on the cluster
>>>
>>>
>>>
>>> On May 29 2017, at 9:43 am, Daniel Steuernol 
>>> wrote:
>>>
>>> The cluster is running with RF=3, right now each node is storing about
>>> 3-4 TB of data. I'm using r4.2xlarge EC2 instances, these have 8 vCPU's, 61
>>> GB of RAM, and the disks attached for the data drive are gp2 ssd ebs
>>> volumes with 10k iops. I guess this brings up the question of what's a good
>>> marker to decide on whether to increase disk space vs provisioning a new
>>> node?
>>>
>>>
>>> On May 29 2017, at 9:35 am, tommaso barbugli 
>>> wrote:
>>>
>>> Hi Daniel,
>>>
>>> This is not normal. Possibly a capacity problem. Whats the RF, how much
>>> data do you store per node and what kind of servers do you use (core count,
>>> RAM, disk, ...)?
>>>
>>> Cheers,
>>> Tommaso
>>>
>>> On Mon, May 29, 2017 at 6:22 PM, Daniel Steuernol >> > wrote:
>>>
>>>
>>> I am running a 6 node cluster, and I have noticed that the reported load
>>> on each node rises throughout the week and grows way past the actual disk
>>> space used and available on each node. Also eventually latency for
>>> operations suffers and the nodes have to be restarted. A couple questions
>>> on this, is this normal? Also does cassandra need to be restarted every few
>>> days for best performance? Any insight on this behaviour would be helpful.
>>>
>>> Cheers,
>>> Daniel
>>> -
>>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org For
>>> additional commands, e-mail: user-h...@cassandra.apache.org
>>>
>>>
>>>
>>>
>>> -
>>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org For
>>> additional commands, e-mail: user-h...@cassandra.apache.org
>>>
>>>
>>>
>>>
>


Re: Restarting nodes and reported load

2017-05-30 Thread tommaso barbugli
Am I the only one thinking 3TB is way too much data for a single node on a
VM?

On Tue, May 30, 2017 at 10:36 PM, Daniel Steuernol 
wrote:

> I don't believe incremental repair is enabled, I have never enabled it on
> the cluster, and unless it's the default then it is off. Also I don't see a
> setting in cassandra.yaml for it.
>
>
>
> On May 30 2017, at 1:10 pm, daemeon reiydelle  wrote:
>
>> Unless there is a bug, snapshots are excluded (they are not HDFS anyway!)
>> from nodetool status.
>>
>> Out of curiousity, is incremenatal repair enabled? This is almost
>> certainly a rat hole, but there was an issue a few releases back where load
>> would only increase until the node was restarted. Had been fixed ages ago,
>> but wondering what happens if you restart a node, IF you have incremental
>> enabled.
>>
>>
>>
>>
>>
>> *Daemeon C.M. Reiydelle*
>> *USA (+1) 415.501.0198 | London (+44) (0) 20 8144 9872*
>>
>>
>> *“All men dream, but not equally. Those who dream by night in the dusty
>> recesses of their minds wake up in the day to find it was vanity, but the
>> dreamers of the day are dangerous men, for they may act their dreams with
>> open eyes, to make it possible.” — T.E. Lawrence*
>>
>>
>> On Tue, May 30, 2017 at 12:15 PM, Varun Gupta  wrote:
>>
>> Can you please check if you have incremental backup enabled and snapshots
>> are occupying the space.
>>
>> run nodetool clearsnapshot command.
>>
>> On Tue, May 30, 2017 at 11:12 AM, Daniel Steuernol > > wrote:
>>
>> It's 3-4TB per node, and by load rises, I'm talking about load as
>> reported by nodetool status.
>>
>>
>>
>> On May 30 2017, at 10:25 am, daemeon reiydelle 
>> wrote:
>>
>> When you say "the load rises ... ", could you clarify what you mean by
>> "load"? That has a specific Linux term, and in e.g. Cloudera Manager. But
>> in neither case would that be relevant to transient or persisted disk. Am I
>> missing something?
>>
>>
>> On Tue, May 30, 2017 at 10:18 AM, tommaso barbugli 
>> wrote:
>>
>> 3-4 TB per node or in total?
>>
>> On Tue, May 30, 2017 at 6:48 PM, Daniel Steuernol 
>> wrote:
>>
>> I should also mention that I am running cassandra 3.10 on the cluster
>>
>>
>>
>> On May 29 2017, at 9:43 am, Daniel Steuernol 
>> wrote:
>>
>> The cluster is running with RF=3, right now each node is storing about
>> 3-4 TB of data. I'm using r4.2xlarge EC2 instances, these have 8 vCPU's, 61
>> GB of RAM, and the disks attached for the data drive are gp2 ssd ebs
>> volumes with 10k iops. I guess this brings up the question of what's a good
>> marker to decide on whether to increase disk space vs provisioning a new
>> node?
>>
>>
>> On May 29 2017, at 9:35 am, tommaso barbugli 
>> wrote:
>>
>> Hi Daniel,
>>
>> This is not normal. Possibly a capacity problem. Whats the RF, how much
>> data do you store per node and what kind of servers do you use (core count,
>> RAM, disk, ...)?
>>
>> Cheers,
>> Tommaso
>>
>> On Mon, May 29, 2017 at 6:22 PM, Daniel Steuernol 
>> wrote:
>>
>>
>> I am running a 6 node cluster, and I have noticed that the reported load
>> on each node rises throughout the week and grows way past the actual disk
>> space used and available on each node. Also eventually latency for
>> operations suffers and the nodes have to be restarted. A couple questions
>> on this, is this normal? Also does cassandra need to be restarted every few
>> days for best performance? Any insight on this behaviour would be helpful.
>>
>> Cheers,
>> Daniel
>> - To
>> unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org For
>> additional commands, e-mail: user-h...@cassandra.apache.org
>>
>>
>>
>>
>> - To
>> unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org For
>> additional commands, e-mail: user-h...@cassandra.apache.org
>>
>>
>>
>>


Re: Restarting nodes and reported load

2017-05-30 Thread daemeon reiydelle
No degradation.





*Daemeon C.M. Reiydelle*
*USA (+1) 415.501.0198 | London (+44) (0) 20 8144 9872*


*“All men dream, but not equally. Those who dream by night in the dusty
recesses of their minds wake up in the day to find it was vanity, but the
dreamers of the day are dangerous men, for they may act their dreams with
open eyes, to make it possible.” — T.E. Lawrence*


On Tue, May 30, 2017 at 1:54 PM, Daniel Steuernol 
wrote:

> That does sound like what's happening, did performance degrade as the
> reported load increased?
>
>
>
> On May 30 2017, at 1:52 pm, daemeon reiydelle  wrote:
>
>> OK, thanks.
>>
>> So there was a bug in a prior version of C*, symptoms were:
>>
>> Nodetool would show increasing load utilization over time. Stopping and
>> restarting C* nodes would reset the storage back to what one would expect
>> on that node, for a while, then it would creep upwards again, until the
>> node(s) are restarted, etc. FYI it ONLY occurred on an in-use system, etc.
>>
>> I know (double checked) that the problem was fixed a while back.
>> Wondering if it resurfaced?
>>
>>
>>
>>
>>
>> *Daemeon C.M. Reiydelle*
>> *USA (+1) 415.501.0198 | London (+44) (0) 20 8144 9872*
>>
>>
>> *“All men dream, but not equally. Those who dream by night in the dusty
>> recesses of their minds wake up in the day to find it was vanity, but the
>> dreamers of the day are dangerous men, for they may act their dreams with
>> open eyes, to make it possible.” — T.E. Lawrence*
>>
>>
>> On Tue, May 30, 2017 at 1:36 PM, Daniel Steuernol 
>> wrote:
>>
>> I don't believe incremental repair is enabled, I have never enabled it on
>> the cluster, and unless it's the default then it is off. Also I don't see a
>> setting in cassandra.yaml for it.
>>
>>
>> On May 30 2017, at 1:10 pm, daemeon reiydelle 
>> wrote:
>>
>> Unless there is a bug, snapshots are excluded (they are not HDFS anyway!)
>> from nodetool status.
>>
>> Out of curiousity, is incremenatal repair enabled? This is almost
>> certainly a rat hole, but there was an issue a few releases back where load
>> would only increase until the node was restarted. Had been fixed ages ago,
>> but wondering what happens if you restart a node, IF you have incremental
>> enabled.
>>
>>
>>
>>
>>
>> *Daemeon C.M. Reiydelle*
>> *USA (+1) 415.501.0198 | London (+44) (0) 20 8144 9872*
>>
>>
>> *“All men dream, but not equally. Those who dream by night in the dusty
>> recesses of their minds wake up in the day to find it was vanity, but the
>> dreamers of the day are dangerous men, for they may act their dreams with
>> open eyes, to make it possible.” — T.E. Lawrence*
>>
>>
>> On Tue, May 30, 2017 at 12:15 PM, Varun Gupta  wrote:
>>
>> Can you please check if you have incremental backup enabled and snapshots
>> are occupying the space.
>>
>> run nodetool clearsnapshot command.
>>
>> On Tue, May 30, 2017 at 11:12 AM, Daniel Steuernol > > wrote:
>>
>> It's 3-4TB per node, and by load rises, I'm talking about load as
>> reported by nodetool status.
>>
>>
>>
>> On May 30 2017, at 10:25 am, daemeon reiydelle 
>> wrote:
>>
>> When you say "the load rises ... ", could you clarify what you mean by
>> "load"? That has a specific Linux term, and in e.g. Cloudera Manager. But
>> in neither case would that be relevant to transient or persisted disk. Am I
>> missing something?
>>
>>
>> On Tue, May 30, 2017 at 10:18 AM, tommaso barbugli 
>> wrote:
>>
>> 3-4 TB per node or in total?
>>
>> On Tue, May 30, 2017 at 6:48 PM, Daniel Steuernol 
>> wrote:
>>
>> I should also mention that I am running cassandra 3.10 on the cluster
>>
>>
>>
>> On May 29 2017, at 9:43 am, Daniel Steuernol 
>> wrote:
>>
>> The cluster is running with RF=3, right now each node is storing about
>> 3-4 TB of data. I'm using r4.2xlarge EC2 instances, these have 8 vCPU's, 61
>> GB of RAM, and the disks attached for the data drive are gp2 ssd ebs
>> volumes with 10k iops. I guess this brings up the question of what's a good
>> marker to decide on whether to increase disk space vs provisioning a new
>> node?
>>
>>
>> On May 29 2017, at 9:35 am, tommaso barbugli 
>> wrote:
>>
>> Hi Daniel,
>>
>> This is not normal. Possibly a capacity problem. Whats the RF, how much
>> data do you store per node and what kind of servers do you use (core count,
>> RAM, disk, ...)?
>>
>> Cheers,
>> Tommaso
>>
>> On Mon, May 29, 2017 at 6:22 PM, Daniel Steuernol 
>> wrote:
>>
>>
>> I am running a 6 node cluster, and I have noticed that the reported load
>> on each node rises throughout the week and grows way past the actual disk
>> space used and available on each node. Also eventually latency for
>> operations suffers and the nodes have to be restarted. A couple questions
>> on this, is 

Re: Restarting nodes and reported load

2017-05-30 Thread Daniel Steuernol
That does sound like what's happening. Did performance degrade as the reported load increased?
  

On May 30 2017, at 1:52 pm, daemeon reiydelle  wrote:


  OK, thanks.So there was a bug in a prior version of C*, symptoms were:Nodetool would show increasing load utilization over time. Stopping and restarting C* nodes would reset the storage back to what one would expect on that node, for a while, then it would creep upwards again, until the node(s) are restarted, etc. FYI it ONLY occurred on an in-use system, etc.I know (double checked) that the problem was fixed a while back. Wondering if it resurfaced? Daemeon C.M. ReiydelleUSA (+1) 415.501.0198London (+44) (0) 20 8144 9872“All men dream, but not equally. Those who dream by night in the dusty 
recesses of their minds wake up in the day to find it was vanity, but 
the dreamers of the day are dangerous men, for they may act their dreams
 with open eyes, to make it possible.” — T.E. Lawrence
On Tue, May 30, 2017 at 1:36 PM, Daniel Steuernol  wrote:I don't believe incremental repair is enabled, I have never enabled it on the cluster, and unless it's the default then it is off. Also I don't see a setting in cassandra.yaml for it.
  

On May 30 2017, at 1:10 pm, daemeon reiydelle  wrote:


  Unless there is a bug, snapshots are excluded (they are not HDFS anyway!) from nodetool status. Out of curiousity, is incremenatal repair enabled? This is almost certainly a rat hole, but there was an issue a few releases back where load would only increase until the node was restarted. Had been fixed ages ago, but wondering what happens if you restart a node, IF you have incremental enabled.Daemeon C.M. ReiydelleUSA (+1) 415.501.0198London (+44) (0) 20 8144 9872“All men dream, but not equally. Those who dream by night in the dusty 
recesses of their minds wake up in the day to find it was vanity, but 
the dreamers of the day are dangerous men, for they may act their dreams
 with open eyes, to make it possible.” — T.E. Lawrence
On Tue, May 30, 2017 at 12:15 PM, Varun Gupta  wrote:Can you please check if you have incremental backup enabled and snapshots are occupying the space.run nodetool clearsnapshot command.On Tue, May 30, 2017 at 11:12 AM, Daniel Steuernol  wrote:It's 3-4TB per node, and by load rises, I'm talking about load as reported by nodetool status.
  

On May 30 2017, at 10:25 am, daemeon reiydelle  wrote:


  When you say "the load rises ... ", could you clarify what you mean by "load"? That has a specific Linux term, and in e.g. Cloudera Manager. But in neither case would that be relevant to transient or persisted disk. Am I missing something?
On Tue, May 30, 2017 at 10:18 AM, tommaso barbugli  wrote:3-4 TB per node or in total?On Tue, May 30, 2017 at 6:48 PM, Daniel Steuernol  wrote:I should also mention that I am running cassandra 3.10 on the cluster
  

On May 29 2017, at 9:43 am, Daniel Steuernol  wrote:


  The cluster is running with RF=3, right now each node is storing about 3-4 TB of data. I'm using r4.2xlarge EC2 instances, these have 8 vCPU's, 61 GB of RAM, and the disks attached for the data drive are gp2 ssd ebs volumes with 10k iops. I guess this brings up the question of what's a good marker to decide on whether to increase disk space vs provisioning a new node?
  

On May 29 2017, at 9:35 am, tommaso barbugli  wrote:


  Hi Daniel,This is not normal. Possibly a capacity problem. Whats the RF, how much data do you store per node and what kind of servers do you use (core count, RAM, disk, ...)?Cheers,TommasoOn Mon, May 29, 2017 at 6:22 PM, Daniel Steuernol  wrote:I am running a 6 node cluster, and I have noticed that the reported load on each node rises throughout the week and grows way past the actual disk space used and available on each node. Also eventually latency for operations suffers and the nodes have to be restarted. A couple questions on this, is this normal? Also does cassandra need to be restarted every few days for best performance? Any insight on this behaviour would be helpful.Cheers,Daniel

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org




  

  




  

-
To unsubscribe, e-mail: 

Re: Restarting nodes and reported load

2017-05-30 Thread daemeon reiydelle
OK, thanks.

So there was a bug in a prior version of C*, symptoms were:

Nodetool would show increasing load utilization over time. Stopping and
restarting C* nodes would reset the reported storage back to what one would
expect on that node for a while; then it would creep upwards again until the
node(s) were restarted. FYI, it only occurred on an in-use system.

I know (double checked) that the problem was fixed a while back. Wondering
if it resurfaced?





*Daemeon C.M. Reiydelle*
*USA (+1) 415.501.0198 | London (+44) (0) 20 8144 9872*


*“All men dream, but not equally. Those who dream by night in the dusty
recesses of their minds wake up in the day to find it was vanity, but the
dreamers of the day are dangerous men, for they may act their dreams with
open eyes, to make it possible.” — T.E. Lawrence*


On Tue, May 30, 2017 at 1:36 PM, Daniel Steuernol 
wrote:

> I don't believe incremental repair is enabled, I have never enabled it on
> the cluster, and unless it's the default then it is off. Also I don't see a
> setting in cassandra.yaml for it.
>
>
> On May 30 2017, at 1:10 pm, daemeon reiydelle  wrote:
>
>> Unless there is a bug, snapshots are excluded (they are not HDFS anyway!)
>> from nodetool status.
>>
>> Out of curiousity, is incremenatal repair enabled? This is almost
>> certainly a rat hole, but there was an issue a few releases back where load
>> would only increase until the node was restarted. Had been fixed ages ago,
>> but wondering what happens if you restart a node, IF you have incremental
>> enabled.
>>
>>
>>
>>
>>
>> *Daemeon C.M. Reiydelle*
>> *USA (+1) 415.501.0198 | London (+44) (0) 20 8144 9872*
>>
>>
>> *“All men dream, but not equally. Those who dream by night in the dusty
>> recesses of their minds wake up in the day to find it was vanity, but the
>> dreamers of the day are dangerous men, for they may act their dreams with
>> open eyes, to make it possible.” — T.E. Lawrence*
>>
>>
>> On Tue, May 30, 2017 at 12:15 PM, Varun Gupta  wrote:
>>
>> Can you please check if you have incremental backup enabled and snapshots
>> are occupying the space.
>>
>> run nodetool clearsnapshot command.
>>
>> On Tue, May 30, 2017 at 11:12 AM, Daniel Steuernol > > wrote:
>>
>> It's 3-4TB per node, and by load rises, I'm talking about load as
>> reported by nodetool status.
>>
>>
>>
>> On May 30 2017, at 10:25 am, daemeon reiydelle 
>> wrote:
>>
>> When you say "the load rises ... ", could you clarify what you mean by
>> "load"? That has a specific Linux term, and in e.g. Cloudera Manager. But
>> in neither case would that be relevant to transient or persisted disk. Am I
>> missing something?
>>
>>
>> On Tue, May 30, 2017 at 10:18 AM, tommaso barbugli 
>> wrote:
>>
>> 3-4 TB per node or in total?
>>
>> On Tue, May 30, 2017 at 6:48 PM, Daniel Steuernol 
>> wrote:
>>
>> I should also mention that I am running cassandra 3.10 on the cluster
>>
>>
>>
>> On May 29 2017, at 9:43 am, Daniel Steuernol 
>> wrote:
>>
>> The cluster is running with RF=3, right now each node is storing about
>> 3-4 TB of data. I'm using r4.2xlarge EC2 instances, these have 8 vCPU's, 61
>> GB of RAM, and the disks attached for the data drive are gp2 ssd ebs
>> volumes with 10k iops. I guess this brings up the question of what's a good
>> marker to decide on whether to increase disk space vs provisioning a new
>> node?
>>
>>
>> On May 29 2017, at 9:35 am, tommaso barbugli 
>> wrote:
>>
>> Hi Daniel,
>>
>> This is not normal. Possibly a capacity problem. Whats the RF, how much
>> data do you store per node and what kind of servers do you use (core count,
>> RAM, disk, ...)?
>>
>> Cheers,
>> Tommaso
>>
>> On Mon, May 29, 2017 at 6:22 PM, Daniel Steuernol 
>> wrote:
>>
>>
>> I am running a 6 node cluster, and I have noticed that the reported load
>> on each node rises throughout the week and grows way past the actual disk
>> space used and available on each node. Also eventually latency for
>> operations suffers and the nodes have to be restarted. A couple questions
>> on this, is this normal? Also does cassandra need to be restarted every few
>> days for best performance? Any insight on this behaviour would be helpful.
>>
>> Cheers,
>> Daniel
>> - To
>> unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org For
>> additional commands, e-mail: user-h...@cassandra.apache.org
>>
>>
>>
>>
>> - To
>> unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org For
>> additional commands, e-mail: user-h...@cassandra.apache.org
>>
>>
>>
>>


Re: Restarting nodes and reported load

2017-05-30 Thread Daniel Steuernol
I don't believe incremental repair is enabled; I have never enabled it on the cluster, and unless it's the default, it is off. Also, I don't see a setting for it in cassandra.yaml.
  

On May 30 2017, at 1:10 pm, daemeon reiydelle  wrote:


  Unless there is a bug, snapshots are excluded (they are not HDFS anyway!) from nodetool status. Out of curiousity, is incremenatal repair enabled? This is almost certainly a rat hole, but there was an issue a few releases back where load would only increase until the node was restarted. Had been fixed ages ago, but wondering what happens if you restart a node, IF you have incremental enabled.Daemeon C.M. ReiydelleUSA (+1) 415.501.0198London (+44) (0) 20 8144 9872“All men dream, but not equally. Those who dream by night in the dusty 
recesses of their minds wake up in the day to find it was vanity, but 
the dreamers of the day are dangerous men, for they may act their dreams
 with open eyes, to make it possible.” — T.E. Lawrence
On Tue, May 30, 2017 at 12:15 PM, Varun Gupta  wrote:Can you please check if you have incremental backup enabled and snapshots are occupying the space.run nodetool clearsnapshot command.On Tue, May 30, 2017 at 11:12 AM, Daniel Steuernol  wrote:It's 3-4TB per node, and by load rises, I'm talking about load as reported by nodetool status.
  

On May 30 2017, at 10:25 am, daemeon reiydelle  wrote:


  When you say "the load rises ... ", could you clarify what you mean by "load"? That has a specific Linux term, and in e.g. Cloudera Manager. But in neither case would that be relevant to transient or persisted disk. Am I missing something?
On Tue, May 30, 2017 at 10:18 AM, tommaso barbugli  wrote:3-4 TB per node or in total?On Tue, May 30, 2017 at 6:48 PM, Daniel Steuernol  wrote:I should also mention that I am running cassandra 3.10 on the cluster
  

On May 29 2017, at 9:43 am, Daniel Steuernol  wrote:


  The cluster is running with RF=3, right now each node is storing about 3-4 TB of data. I'm using r4.2xlarge EC2 instances, these have 8 vCPU's, 61 GB of RAM, and the disks attached for the data drive are gp2 ssd ebs volumes with 10k iops. I guess this brings up the question of what's a good marker to decide on whether to increase disk space vs provisioning a new node?
  

On May 29 2017, at 9:35 am, tommaso barbugli  wrote:


  Hi Daniel,This is not normal. Possibly a capacity problem. Whats the RF, how much data do you store per node and what kind of servers do you use (core count, RAM, disk, ...)?Cheers,TommasoOn Mon, May 29, 2017 at 6:22 PM, Daniel Steuernol  wrote:I am running a 6 node cluster, and I have noticed that the reported load on each node rises throughout the week and grows way past the actual disk space used and available on each node. Also eventually latency for operations suffers and the nodes have to be restarted. A couple questions on this, is this normal? Also does cassandra need to be restarted every few days for best performance? Any insight on this behaviour would be helpful.Cheers,Daniel

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org




  

  




  

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org





  

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: How to know when repair repaired something?

2017-05-30 Thread Jan Algermissen


On 30 May 2017, at 21:11, Varun Gupta wrote:


I am missing the point, why do you want to re-trigger the process post
repair. Repair will sync the data correctly.


Sorry - I misrepresented that. I want to trigger something else, not repair.


I am investigating a CQRS/Event Sourced pattern in which C* serves as a 
distributed event log, with a process reading from that log and changing 
state in other databases (Solr, Graph-DB, other C* tables, etc.).


Since I do not want to write to/read from the commit log with 
EACH_QUORUM or LOCAL_QUORUM, it could happen that the process processing 
the event log misses an event that only later pops up during repair.


When that happens, I'd like to re-process the log (my processing is 
idempotent, so it can just go again).


This is why I was looking for a way to learn that a repair has actually 
repaired something.
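
For illustration, the rough direction I am considering (a hypothetical sketch):
subscribe to StorageService notifications over JMX, which is how nodetool repair
tracks progress. The object name and port are standard, but how much detail the
events carry, and whether they can distinguish "repaired something" from "nothing
to repair", varies by version, so the payload handling below is only an assumption.

import javax.management.MBeanServerConnection;
import javax.management.NotificationListener;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical sketch: listen for repair-related notifications from a node and,
// when one suggests data was actually streamed, kick off re-processing of the
// event log. The payload inspection is an assumption and needs to be checked
// against the Cassandra version in use.
public class RepairNotificationSketch
{
    public static void main(String[] args) throws Exception
    {
        JMXServiceURL url = new JMXServiceURL("service:jmx:rmi:///jndi/rmi://127.0.0.1:7199/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url, null);
        MBeanServerConnection mbs = connector.getMBeanServerConnection();
        ObjectName storageService = new ObjectName("org.apache.cassandra.db:type=StorageService");

        NotificationListener listener = (notification, handback) ->
            // inspect type/message/userData here; on evidence of repaired data, trigger re-processing
            System.out.println(notification.getType() + ": " + notification.getMessage());

        mbs.addNotificationListener(storageService, listener, null, null);
        Thread.sleep(Long.MAX_VALUE); // keep the connection open and keep listening
    }
}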



Jan



On Mon, May 29, 2017 at 8:07 AM, Jan Algermissen 


Re: Restarting nodes and reported load

2017-05-30 Thread daemeon reiydelle
Unless there is a bug, snapshots are excluded (they are not HDFS anyway!)
from nodetool status.

Out of curiosity, is incremental repair enabled? This is almost certainly a rat
hole, but there was an issue a few releases back where load would only increase
until the node was restarted. It had been fixed ages ago, but I am wondering what
happens if you restart a node, IF you have incremental repair enabled.





*Daemeon C.M. Reiydelle*
*USA (+1) 415.501.0198 | London (+44) (0) 20 8144 9872*


*“All men dream, but not equally. Those who dream by night in the dusty
recesses of their minds wake up in the day to find it was vanity, but the
dreamers of the day are dangerous men, for they may act their dreams with
open eyes, to make it possible.” — T.E. Lawrence*


On Tue, May 30, 2017 at 12:15 PM, Varun Gupta  wrote:

> Can you please check if you have incremental backup enabled and snapshots
> are occupying the space.
>
> run nodetool clearsnapshot command.
>
> On Tue, May 30, 2017 at 11:12 AM, Daniel Steuernol 
> wrote:
>
>> It's 3-4TB per node, and by load rises, I'm talking about load as
>> reported by nodetool status.
>>
>>
>>
>> On May 30 2017, at 10:25 am, daemeon reiydelle 
>> wrote:
>>
>>> When you say "the load rises ... ", could you clarify what you mean by
>>> "load"? That has a specific Linux term, and in e.g. Cloudera Manager. But
>>> in neither case would that be relevant to transient or persisted disk. Am I
>>> missing something?
>>>
>>>
>>> On Tue, May 30, 2017 at 10:18 AM, tommaso barbugli 
>>> wrote:
>>>
>>> 3-4 TB per node or in total?
>>>
>>> On Tue, May 30, 2017 at 6:48 PM, Daniel Steuernol >> > wrote:
>>>
>>> I should also mention that I am running cassandra 3.10 on the cluster
>>>
>>>
>>>
>>> On May 29 2017, at 9:43 am, Daniel Steuernol 
>>> wrote:
>>>
>>> The cluster is running with RF=3, right now each node is storing about
>>> 3-4 TB of data. I'm using r4.2xlarge EC2 instances, these have 8 vCPU's, 61
>>> GB of RAM, and the disks attached for the data drive are gp2 ssd ebs
>>> volumes with 10k iops. I guess this brings up the question of what's a good
>>> marker to decide on whether to increase disk space vs provisioning a new
>>> node?
>>>
>>>
>>> On May 29 2017, at 9:35 am, tommaso barbugli 
>>> wrote:
>>>
>>> Hi Daniel,
>>>
>>> This is not normal. Possibly a capacity problem. Whats the RF, how much
>>> data do you store per node and what kind of servers do you use (core count,
>>> RAM, disk, ...)?
>>>
>>> Cheers,
>>> Tommaso
>>>
>>> On Mon, May 29, 2017 at 6:22 PM, Daniel Steuernol >> > wrote:
>>>
>>>
>>> I am running a 6 node cluster, and I have noticed that the reported load
>>> on each node rises throughout the week and grows way past the actual disk
>>> space used and available on each node. Also eventually latency for
>>> operations suffers and the nodes have to be restarted. A couple questions
>>> on this, is this normal? Also does cassandra need to be restarted every few
>>> days for best performance? Any insight on this behaviour would be helpful.
>>>
>>> Cheers,
>>> Daniel
>>> -
>>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org For
>>> additional commands, e-mail: user-h...@cassandra.apache.org
>>>
>>>
>>>
>>>
>>> -
>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org For
>> additional commands, e-mail: user-h...@cassandra.apache.org
>>
>
>


Re: Restarting nodes and reported load

2017-05-30 Thread Daniel Steuernol
Incremental backup is set to false in the config file, and I have set snapshot_before_compaction and auto_snapshot to false as well. I ran nodetool clearsnapshot, but before doing that I ran nodetool listsnapshots and it listed a bunch of snapshots. I would have expected that to be empty because I've disabled auto_snapshot.
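
For anyone hitting the same thing, here is a small sketch that measures how much disk the remaining snapshots actually occupy by walking the data directory (snapshots live under <data_dir>/<keyspace>/<table>/snapshots/<name>/). The /var/lib/cassandra/data path is an assumption (the packaged default); point it at the data_file_directories from your cassandra.yaml. Note too that auto_snapshot only covers snapshots taken on TRUNCATE or DROP; snapshots can also be created explicitly or, in some versions, by sequential repair, which would explain entries in nodetool listsnapshots even with auto_snapshot off.

    #!/usr/bin/env python3
    """Sum on-disk size per snapshot name under the Cassandra data directory."""
    import os

    DATA_DIR = "/var/lib/cassandra/data"   # assumption: adjust to data_file_directories

    def snapshot_usage(data_dir=DATA_DIR):
        totals = {}
        for root, _dirs, files in os.walk(data_dir):
            parts = root.split(os.sep)
            if "snapshots" not in parts:
                continue
            idx = parts.index("snapshots")
            name = parts[idx + 1] if idx + 1 < len(parts) else "(snapshots dir)"
            size = sum(os.path.getsize(os.path.join(root, f)) for f in files)
            totals[name] = totals.get(name, 0) + size
        return totals

    if __name__ == "__main__":
        for name, size in sorted(snapshot_usage().items(), key=lambda kv: -kv[1]):
            print(f"{name}: {size / 1024**3:.2f} GiB")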
  

On May 30 2017, at 12:15 pm, Varun Gupta  wrote:


  Can you please check if you have incremental backup enabled and snapshots are occupying the space.

  run nodetool clearsnapshot command.

  On Tue, May 30, 2017 at 11:12 AM, Daniel Steuernol  wrote:

  It's 3-4TB per node, and by load rises, I'm talking about load as reported by nodetool status.
  

On May 30 2017, at 10:25 am, daemeon reiydelle  wrote:


  When you say "the load rises ... ", could you clarify what you mean by "load"? That has a specific Linux term, and in e.g. Cloudera Manager. But in neither case would that be relevant to transient or persisted disk. Am I missing something?
On Tue, May 30, 2017 at 10:18 AM, tommaso barbugli  wrote:

3-4 TB per node or in total?

On Tue, May 30, 2017 at 6:48 PM, Daniel Steuernol  wrote:

I should also mention that I am running cassandra 3.10 on the cluster
  

On May 29 2017, at 9:43 am, Daniel Steuernol  wrote:


  The cluster is running with RF=3, right now each node is storing about 3-4 TB of data. I'm using r4.2xlarge EC2 instances, these have 8 vCPU's, 61 GB of RAM, and the disks attached for the data drive are gp2 ssd ebs volumes with 10k iops. I guess this brings up the question of what's a good marker to decide on whether to increase disk space vs provisioning a new node?
  

On May 29 2017, at 9:35 am, tommaso barbugli  wrote:


  Hi Daniel,

  This is not normal. Possibly a capacity problem. Whats the RF, how much data do you store per node and what kind of servers do you use (core count, RAM, disk, ...)?

  Cheers,
  Tommaso

  On Mon, May 29, 2017 at 6:22 PM, Daniel Steuernol  wrote:

  I am running a 6 node cluster, and I have noticed that the reported load on each node rises throughout the week and grows way past the actual disk space used and available on each node. Also eventually latency for operations suffers and the nodes have to be restarted. A couple questions on this, is this normal? Also does cassandra need to be restarted every few days for best performance? Any insight on this behaviour would be helpful.

  Cheers,
  Daniel

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: [Cassandra] Ignoring interval time

2017-05-30 Thread Varun Gupta
Can you please check Cassandra stats to see if the cluster is under too much
load? This is the symptom, not the root cause.
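
As a concrete starting point, "check Cassandra stats" usually amounts to something like the sketch below: run nodetool tpstats and flag thread pools with pending or blocked tasks, which is a common sign the node cannot keep up. The column layout assumed here matches 3.x output (Pool Name, Active, Pending, Completed, Blocked, All time blocked); adjust the parsing if your version prints different columns.

    #!/usr/bin/env python3
    """Flag nodetool tpstats thread pools that have pending or blocked tasks."""
    import subprocess

    def busy_pools():
        out = subprocess.run(["nodetool", "tpstats"],
                             capture_output=True, text=True).stdout
        flagged = []
        for line in out.splitlines()[1:]:
            cols = line.split()
            # Expect: name, active, pending, completed, blocked, all-time-blocked
            if len(cols) >= 6 and cols[1].isdigit():
                name, pending, blocked = cols[0], int(cols[2]), int(cols[4])
                if pending > 0 or blocked > 0:
                    flagged.append((name, pending, blocked))
        return flagged

    if __name__ == "__main__":
        for name, pending, blocked in busy_pools():
            print(f"{name}: pending={pending} blocked={blocked}")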

On Tue, May 30, 2017 at 2:33 AM, Abhishek Kumar Maheshwari <
abhishek.maheshw...@timesinternet.in> wrote:

> Hi All,
>
>
>
> Please let me know why this debug log is coming:
>
>
>
> DEBUG [GossipStage:1] 2017-05-30 15:01:31,496 FailureDetector.java:456 -
> Ignoring interval time of 2000686406 for /XXX.XX.XXX.204
>
> DEBUG [GossipStage:1] 2017-05-30 15:01:34,497 FailureDetector.java:456 -
> Ignoring interval time of 2349724693 for
> /XXX.XX.XXX.207
>
> DEBUG [GossipStage:1] 2017-05-30 15:01:34,497 FailureDetector.java:456 -
> Ignoring interval time of 2000655389 for /XXX.XX.XXX.206
>
> DEBUG [GossipStage:1] 2017-05-30 15:01:34,497 FailureDetector.java:456 -
> Ignoring interval time of 2000721304 for /XXX.XX.XXX.201
>
> DEBUG [GossipStage:1] 2017-05-30 15:01:34,497 FailureDetector.java:456 -
> Ignoring interval time of 2000770809 for /XXX.XX.XXX.202
>
> DEBUG [GossipStage:1] 2017-05-30 15:01:34,497 FailureDetector.java:456 -
> Ignoring interval time of 2000825217 for /XXX.XX.XXX.209
>
> DEBUG [GossipStage:1] 2017-05-30 15:01:35,449 FailureDetector.java:456 -
> Ignoring interval time of 2953167747 for /XXX.XX.XXX.205
>
> DEBUG [GossipStage:1] 2017-05-30 15:01:37,497 FailureDetector.java:456 -
> Ignoring interval time of 2047662469 for
> /XXX.XX.XXX.205
>
> DEBUG [GossipStage:1] 2017-05-30 15:01:37,497 FailureDetector.java:456 -
> Ignoring interval time of 2000717144 for /XXX.XX.XXX.207
>
> DEBUG [GossipStage:1] 2017-05-30 15:01:37,497 FailureDetector.java:456 -
> Ignoring interval time of 2000780785 for /XXX.XX.XXX.201
>
> DEBUG [GossipStage:1] 2017-05-30 15:01:38,497 FailureDetector.java:456 -
> Ignoring interval time of 2000113606 for /XXX.XX.XXX.209
>
> DEBUG [GossipStage:1] 2017-05-30 15:01:39,121 FailureDetector.java:456 -
> Ignoring interval time of 2334491585 for /XXX.XX.XXX.204
>
> DEBUG [GossipStage:1] 2017-05-30 15:01:39,497 FailureDetector.java:456 -
> Ignoring interval time of 2000209788 for /XXX.XX.XXX.207
>
> DEBUG [GossipStage:1] 2017-05-30 15:01:39,497 FailureDetector.java:456 -
> Ignoring interval time of 2000226568 for /XXX.XX.XXX.208
>
> DEBUG [GossipStage:1] 2017-05-30 15:01:42,178 FailureDetector.java:456 -
> Ignoring interval time of 2390977968 for /XXX.XX.XXX.204
>
>
>
> *Thanks & Regards,*
> *Abhishek Kumar Maheshwari*
> *+91- 805591 (Mobile)*
>
> Times Internet Ltd. | A Times of India Group Company
>
> FC - 6, Sector 16A, Film City,  Noida,  U.P. 201301 | INDIA
>
> *P** Please do not print this email unless it is absolutely necessary.
> Spread environmental awareness.*
>
>
>


Re: Restarting nodes and reported load

2017-05-30 Thread Varun Gupta
Can you please check if you have incremental backup enabled and whether
snapshots are occupying the space?

Run the nodetool clearsnapshot command.

On Tue, May 30, 2017 at 11:12 AM, Daniel Steuernol 
wrote:

> It's 3-4TB per node, and by load rises, I'm talking about load as reported
> by nodetool status.
>
>
>
> On May 30 2017, at 10:25 am, daemeon reiydelle 
> wrote:
>
>> When you say "the load rises ... ", could you clarify what you mean by
>> "load"? That has a specific Linux term, and in e.g. Cloudera Manager. But
>> in neither case would that be relevant to transient or persisted disk. Am I
>> missing something?
>>
>>
>> On Tue, May 30, 2017 at 10:18 AM, tommaso barbugli 
>> wrote:
>>
>> 3-4 TB per node or in total?
>>
>> On Tue, May 30, 2017 at 6:48 PM, Daniel Steuernol 
>> wrote:
>>
>> I should also mention that I am running cassandra 3.10 on the cluster
>>
>>
>>
>> On May 29 2017, at 9:43 am, Daniel Steuernol 
>> wrote:
>>
>> The cluster is running with RF=3, right now each node is storing about
>> 3-4 TB of data. I'm using r4.2xlarge EC2 instances, these have 8 vCPU's, 61
>> GB of RAM, and the disks attached for the data drive are gp2 ssd ebs
>> volumes with 10k iops. I guess this brings up the question of what's a good
>> marker to decide on whether to increase disk space vs provisioning a new
>> node?
>>
>>
>> On May 29 2017, at 9:35 am, tommaso barbugli 
>> wrote:
>>
>> Hi Daniel,
>>
>> This is not normal. Possibly a capacity problem. Whats the RF, how much
>> data do you store per node and what kind of servers do you use (core count,
>> RAM, disk, ...)?
>>
>> Cheers,
>> Tommaso
>>
>> On Mon, May 29, 2017 at 6:22 PM, Daniel Steuernol 
>> wrote:
>>
>>
>> I am running a 6 node cluster, and I have noticed that the reported load
>> on each node rises throughout the week and grows way past the actual disk
>> space used and available on each node. Also eventually latency for
>> operations suffers and the nodes have to be restarted. A couple questions
>> on this, is this normal? Also does cassandra need to be restarted every few
>> days for best performance? Any insight on this behaviour would be helpful.
>>
>> Cheers,
>> Daniel
>> - To
>> unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org For
>> additional commands, e-mail: user-h...@cassandra.apache.org
>>
>>
>>
>>
>> - To
> unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org For additional
> commands, e-mail: user-h...@cassandra.apache.org
>


Re: How to know when repair repaired something?

2017-05-30 Thread Varun Gupta
I am missing the point: why do you want to re-trigger the process post
repair? Repair will sync the data correctly.

On Mon, May 29, 2017 at 8:07 AM, Jan Algermissen  wrote:

> Hi,
>
> is it possible to extract from repair logs the writetime of the writes
> that needed to be repaired?
>
> I have some processes I would like to re-trigger from a time point if
> repair found problems.
>
> Is that useful? Possible?
>
> Jan
>
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
>
>


Re: Restarting nodes and reported load

2017-05-30 Thread Daniel Steuernol
It's 3-4 TB per node, and by "load rises" I mean the load as reported by nodetool status.
  

On May 30 2017, at 10:25 am, daemeon reiydelle  wrote:


  When you say "the load rises ... ", could you clarify what you mean by "load"? That has a specific Linux term, and in e.g. Cloudera Manager. But in neither case would that be relevant to transient or persisted disk. Am I missing something?
On Tue, May 30, 2017 at 10:18 AM, tommaso barbugli  wrote:

3-4 TB per node or in total?

On Tue, May 30, 2017 at 6:48 PM, Daniel Steuernol  wrote:

I should also mention that I am running cassandra 3.10 on the cluster
  

On May 29 2017, at 9:43 am, Daniel Steuernol  wrote:


  The cluster is running with RF=3, right now each node is storing about 3-4 TB of data. I'm using r4.2xlarge EC2 instances, these have 8 vCPU's, 61 GB of RAM, and the disks attached for the data drive are gp2 ssd ebs volumes with 10k iops. I guess this brings up the question of what's a good marker to decide on whether to increase disk space vs provisioning a new node?
  

On May 29 2017, at 9:35 am, tommaso barbugli  wrote:


  Hi Daniel,

  This is not normal. Possibly a capacity problem. Whats the RF, how much data do you store per node and what kind of servers do you use (core count, RAM, disk, ...)?

  Cheers,
  Tommaso

  On Mon, May 29, 2017 at 6:22 PM, Daniel Steuernol  wrote:

  I am running a 6 node cluster, and I have noticed that the reported load on each node rises throughout the week and grows way past the actual disk space used and available on each node. Also eventually latency for operations suffers and the nodes have to be restarted. A couple questions on this, is this normal? Also does cassandra need to be restarted every few days for best performance? Any insight on this behaviour would be helpful.

  Cheers,
  Daniel

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: Restarting nodes and reported load

2017-05-30 Thread daemeon reiydelle
When you say "the load rises ...", could you clarify what you mean by
"load"? That term has a specific meaning in Linux, and in e.g. Cloudera
Manager, but in neither case would it be relevant to transient or persisted
disk. Am I missing something?


On Tue, May 30, 2017 at 10:18 AM, tommaso barbugli 
wrote:

> 3-4 TB per node or in total?
>
> On Tue, May 30, 2017 at 6:48 PM, Daniel Steuernol 
> wrote:
>
>> I should also mention that I am running cassandra 3.10 on the cluster
>>
>>
>>
>> On May 29 2017, at 9:43 am, Daniel Steuernol 
>> wrote:
>>
>>> The cluster is running with RF=3, right now each node is storing about
>>> 3-4 TB of data. I'm using r4.2xlarge EC2 instances, these have 8 vCPU's, 61
>>> GB of RAM, and the disks attached for the data drive are gp2 ssd ebs
>>> volumes with 10k iops. I guess this brings up the question of what's a good
>>> marker to decide on whether to increase disk space vs provisioning a new
>>> node?
>>>
>>>
>>> On May 29 2017, at 9:35 am, tommaso barbugli 
>>> wrote:
>>>
>>> Hi Daniel,
>>>
>>> This is not normal. Possibly a capacity problem. Whats the RF, how much
>>> data do you store per node and what kind of servers do you use (core count,
>>> RAM, disk, ...)?
>>>
>>> Cheers,
>>> Tommaso
>>>
>>> On Mon, May 29, 2017 at 6:22 PM, Daniel Steuernol >> > wrote:
>>>
>>>
>>> I am running a 6 node cluster, and I have noticed that the reported load
>>> on each node rises throughout the week and grows way past the actual disk
>>> space used and available on each node. Also eventually latency for
>>> operations suffers and the nodes have to be restarted. A couple questions
>>> on this, is this normal? Also does cassandra need to be restarted every few
>>> days for best performance? Any insight on this behaviour would be helpful.
>>>
>>> Cheers,
>>> Daniel
>>> -
>>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org For
>>> additional commands, e-mail: user-h...@cassandra.apache.org
>>>
>>>
>>>
>


Re: Restarting nodes and reported load

2017-05-30 Thread tommaso barbugli
3-4 TB per node or in total?

On Tue, May 30, 2017 at 6:48 PM, Daniel Steuernol 
wrote:

> I should also mention that I am running cassandra 3.10 on the cluster
>
>
>
> On May 29 2017, at 9:43 am, Daniel Steuernol 
> wrote:
>
>> The cluster is running with RF=3, right now each node is storing about
>> 3-4 TB of data. I'm using r4.2xlarge EC2 instances, these have 8 vCPU's, 61
>> GB of RAM, and the disks attached for the data drive are gp2 ssd ebs
>> volumes with 10k iops. I guess this brings up the question of what's a good
>> marker to decide on whether to increase disk space vs provisioning a new
>> node?
>>
>>
>> On May 29 2017, at 9:35 am, tommaso barbugli 
>> wrote:
>>
>> Hi Daniel,
>>
>> This is not normal. Possibly a capacity problem. Whats the RF, how much
>> data do you store per node and what kind of servers do you use (core count,
>> RAM, disk, ...)?
>>
>> Cheers,
>> Tommaso
>>
>> On Mon, May 29, 2017 at 6:22 PM, Daniel Steuernol 
>> wrote:
>>
>>
>> I am running a 6 node cluster, and I have noticed that the reported load
>> on each node rises throughout the week and grows way past the actual disk
>> space used and available on each node. Also eventually latency for
>> operations suffers and the nodes have to be restarted. A couple questions
>> on this, is this normal? Also does cassandra need to be restarted every few
>> days for best performance? Any insight on this behaviour would be helpful.
>>
>> Cheers,
>> Daniel
>> - To
>> unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org For
>> additional commands, e-mail: user-h...@cassandra.apache.org
>>
>>
>>


Re: Restarting nodes and reported load

2017-05-30 Thread Daniel Steuernol
I should also mention that I am running cassandra 3.10 on the cluster
  

On May 29 2017, at 9:43 am, Daniel Steuernol  wrote:


  The cluster is running with RF=3, right now each node is storing about 3-4 TB of data. I'm using r4.2xlarge EC2 instances, these have 8 vCPU's, 61 GB of RAM, and the disks attached for the data drive are gp2 ssd ebs volumes with 10k iops. I guess this brings up the question of what's a good marker to decide on whether to increase disk space vs provisioning a new node?
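
A rough back-of-the-envelope from the numbers in this thread alone (6 nodes, RF=3, roughly 3-4 TB reported per node), which can help frame the "more disk vs. more nodes" question; the 3.5 TB midpoint is only an assumption for illustration.

    # Back-of-the-envelope from the figures quoted in this thread.
    nodes = 6
    rf = 3
    per_node_tb = 3.5                           # midpoint of the reported 3-4 TB

    total_replicated_tb = nodes * per_node_tb   # ~21 TB of replicated data
    unique_data_tb = total_replicated_tb / rf   # ~7 TB of unique data

    print(f"replicated: ~{total_replicated_tb:.0f} TB, unique: ~{unique_data_tb:.0f} TB")

Whatever the raw capacity, if the reported load keeps growing past what the filesystem shows, the first thing to reconcile is nodetool status against du on the data directory.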
  

On May 29 2017, at 9:35 am, tommaso barbugli  wrote:


  Hi Daniel,

  This is not normal. Possibly a capacity problem. Whats the RF, how much data do you store per node and what kind of servers do you use (core count, RAM, disk, ...)?

  Cheers,
  Tommaso

  On Mon, May 29, 2017 at 6:22 PM, Daniel Steuernol  wrote:

  I am running a 6 node cluster, and I have noticed that the reported load on each node rises throughout the week and grows way past the actual disk space used and available on each node. Also eventually latency for operations suffers and the nodes have to be restarted. A couple questions on this, is this normal? Also does cassandra need to be restarted every few days for best performance? Any insight on this behaviour would be helpful.

  Cheers,
  Daniel

-
To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: Is DataStax's DSE better than cassandra's free open source for a newbie developer's good start for cassandra?

2017-05-30 Thread Hannu Kröger
Hello,

DSE is commercial and costs money to use in production. More info from DataStax:
http://www.datastax.com/products/subscriptions 


RPMs are currently not available for the latest version. There is 3.0.13, but 
nothing newer than that is available from Apache AFAIK:
http://cassandra.apache.org/download/ 

Build instructions are here:
https://github.com/apache/cassandra/tree/trunk/redhat 


I hope those are helpful!

Cheers,
Hannu


> On 30 May 2017, at 13:30, gloCalHelp.com  wrote:
> 
> Dear sir,
> 
> Good evening, this is Georgelin from the biggest market of ShangHai, China,
> 
> I have known how to download an odd-number(bug fixed) version cassandra 
> source but not a rpm package.
> 
> would you like to give me a step by step guiding from  compiling, 
> distributing compiled classes to several computers to setup cassandra 
> cluster?   Cause I am starting to focus on cassandra big data but not other 
> big  data such as Greenplum, CGE, HDP etc. 
> 
> And I am a quick learner and deserving teaching student( I have ever worked 
> in IBM too) in big market of China, is there a senior Cassandra's system 
> designer who  would like to be my Cassandra's teacher?
> 
> And this is a starting point question,  is it better for a newbie to download 
> a third party cassandra source such as DSE to dig in  or download original 
> Cassandra source? Because DSE has spark, solr, graph DB more functions than 
> original cassandra,
> but is there anyone know DSE's license?
> 
> This is my personal mobile phone: 
> 0086 180 5004 2436
> , except cassandra, any big data problems on MPP Greenplum and CGE are 
> welcomed to ask me, we can exchange big data systems' knowledge too.
> 
> Sincerely yours,
> Georgelin
> www_8ems_...@sina.com
> mobile:0086 180 5004 2436
> 



Re: org.apache.cassandra.service.DigestMismatchException: Mismatch for key DecoratedKey

2017-05-30 Thread Akhil Mehra
This blog post (http://thelastpickle.com/blog/2011/05/15/Deletes-and-Tombstones.html) 
provides a good explanation of the exception in your debug log.
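
One detail worth noting about the log snippet: the second digest, d41d8cd98f00b204e9800998ecf8427e, is simply the MD5 of empty input, which typically means one replica returned an empty result for that key; the post above on deletes and tombstones is relevant because a replica that missed a write or a delete is a common way to end up here. The exact digest computation depends on the Cassandra version, so treat this as a quick sanity check rather than a diagnosis.

    # The second digest in the log is the MD5 of empty input, i.e. an empty response.
    import hashlib

    print(hashlib.md5(b"").hexdigest())   # -> d41d8cd98f00b204e9800998ecf8427e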

Regards,
Akhil

> On 30/05/2017, at 9:29 PM, Abhishek Kumar Maheshwari 
>  wrote:
> 
> Hi All,
>  
> I am getting below exception in debug.log.
>  
> DEBUG [ReadRepairStage:636754] 2017-05-30 14:49:44,259 ReadCallback.java:234 
> - Digest mismatch:
> org.apache.cassandra.service.DigestMismatchException: Mismatch for key 
> DecoratedKey(4329955402556695061, 
> 000808440801579b425c4000) 
> (343b7ef24feb594118ecb4bf7680d07f vs d41d8cd98f00b204e9800998ecf8427e)
> at 
> org.apache.cassandra.service.DigestResolver.resolve(DigestResolver.java:85) 
> ~[apache-cassandra-3.0.9.jar:3.0.9]
> at 
> org.apache.cassandra.service.ReadCallback$AsyncRepairRunner.run(ReadCallback.java:225)
>  ~[apache-cassandra-3.0.9.jar:3.0.9]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  [na:1.8.0_101]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_101]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101]
>  
>  
> Please let me know why it’s coming?
>  
> Thanks & Regards,
> Abhishek Kumar Maheshwari
> +91- 805591 (Mobile)
> Times Internet Ltd. | A Times of India Group Company
> FC - 6, Sector 16A, Film City,  Noida,  U.P. 201301 | INDIA
> P Please do not print this email unless it is absolutely necessary. Spread 
> environmental awareness.
>  
> 



Is DataStax's DSE better than cassandra's free open source for a newbie developer's good start for cassandra?

2017-05-30 Thread gloCalHelp.com
Dear sir,

Good evening, this is Georgelin from ShangHai, the biggest market in China.

I know how to download an odd-numbered (bug-fix) version of the Cassandra 
source, but not an RPM package.

Would you give me step-by-step guidance on compiling and distributing the 
compiled classes to several computers to set up a Cassandra cluster? I am 
starting to focus on Cassandra rather than other big data systems such as 
Greenplum, CGE, HDP, etc.

I am a quick learner and a student worth teaching (I have worked at IBM too). 
Is there a senior Cassandra system designer who would like to be my Cassandra 
teacher?

And a starting-point question: is it better for a newbie to dig into a 
third-party Cassandra distribution such as DSE, or to download the original 
Cassandra source? DSE has Spark, Solr, and graph DB functions beyond the 
original Cassandra, but does anyone know DSE's license?

This is my personal mobile phone: 0086 180 5004 2436. Apart from Cassandra, 
any big data problems on MPP Greenplum and CGE are welcome; we can exchange 
big data systems' knowledge too.

Sincerely yours,
Georgelin
www_8ems_...@sina.com
mobile:0086 180 5004 2436



[Cassandra] Ignoring interval time

2017-05-30 Thread Abhishek Kumar Maheshwari
Hi All,

Please let me know why this debug log is appearing:

DEBUG [GossipStage:1] 2017-05-30 15:01:31,496 FailureDetector.java:456 - 
Ignoring interval time of 2000686406 for /XXX.XX.XXX.204
DEBUG [GossipStage:1] 2017-05-30 15:01:34,497 FailureDetector.java:456 - 
Ignoring interval time of 2349724693 for /XXX.XX.XXX.207
DEBUG [GossipStage:1] 2017-05-30 15:01:34,497 FailureDetector.java:456 - 
Ignoring interval time of 2000655389 for /XXX.XX.XXX.206
DEBUG [GossipStage:1] 2017-05-30 15:01:34,497 FailureDetector.java:456 - 
Ignoring interval time of 2000721304 for /XXX.XX.XXX.201
DEBUG [GossipStage:1] 2017-05-30 15:01:34,497 FailureDetector.java:456 - 
Ignoring interval time of 2000770809 for /XXX.XX.XXX.202
DEBUG [GossipStage:1] 2017-05-30 15:01:34,497 FailureDetector.java:456 - 
Ignoring interval time of 2000825217 for /XXX.XX.XXX.209
DEBUG [GossipStage:1] 2017-05-30 15:01:35,449 FailureDetector.java:456 - 
Ignoring interval time of 2953167747 for /XXX.XX.XXX.205
DEBUG [GossipStage:1] 2017-05-30 15:01:37,497 FailureDetector.java:456 - 
Ignoring interval time of 2047662469 for /XXX.XX.XXX.205
DEBUG [GossipStage:1] 2017-05-30 15:01:37,497 FailureDetector.java:456 - 
Ignoring interval time of 2000717144 for /XXX.XX.XXX.207
DEBUG [GossipStage:1] 2017-05-30 15:01:37,497 FailureDetector.java:456 - 
Ignoring interval time of 2000780785 for /XXX.XX.XXX.201
DEBUG [GossipStage:1] 2017-05-30 15:01:38,497 FailureDetector.java:456 - 
Ignoring interval time of 2000113606 for /XXX.XX.XXX.209
DEBUG [GossipStage:1] 2017-05-30 15:01:39,121 FailureDetector.java:456 - 
Ignoring interval time of 2334491585 for /XXX.XX.XXX.204
DEBUG [GossipStage:1] 2017-05-30 15:01:39,497 FailureDetector.java:456 - 
Ignoring interval time of 2000209788 for /XXX.XX.XXX.207
DEBUG [GossipStage:1] 2017-05-30 15:01:39,497 FailureDetector.java:456 - 
Ignoring interval time of 2000226568 for /XXX.XX.XXX.208
DEBUG [GossipStage:1] 2017-05-30 15:01:42,178 FailureDetector.java:456 - 
Ignoring interval time of 2390977968 for /XXX.XX.XXX.204

Thanks & Regards,
Abhishek Kumar Maheshwari
+91- 805591 (Mobile)
Times Internet Ltd. | A Times of India Group Company
FC - 6, Sector 16A, Film City,  Noida,  U.P. 201301 | INDIA
P Please do not print this email unless it is absolutely necessary. Spread 
environmental awareness.




org.apache.cassandra.service.DigestMismatchException: Mismatch for key DecoratedKey

2017-05-30 Thread Abhishek Kumar Maheshwari
Hi All,

I am getting below exception in debug.log.

DEBUG [ReadRepairStage:636754] 2017-05-30 14:49:44,259 ReadCallback.java:234 - 
Digest mismatch:
org.apache.cassandra.service.DigestMismatchException: Mismatch for key 
DecoratedKey(4329955402556695061, 000808440801579b425c4000) 
(343b7ef24feb594118ecb4bf7680d07f vs d41d8cd98f00b204e9800998ecf8427e)
at 
org.apache.cassandra.service.DigestResolver.resolve(DigestResolver.java:85) 
~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.service.ReadCallback$AsyncRepairRunner.run(ReadCallback.java:225)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
[na:1.8.0_101]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[na:1.8.0_101]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101]


Please let me know why this is happening.

Thanks & Regards,
Abhishek Kumar Maheshwari
+91- 805591 (Mobile)
Times Internet Ltd. | A Times of India Group Company
FC - 6, Sector 16A, Film City,  Noida,  U.P. 201301 | INDIA
P Please do not print this email unless it is absolutely necessary. Spread 
environmental awareness.
