Daniel - my comment wasn't to you, it was in response to Daemeon.

> No, 3 TB is small. 30-50 TB of HDFS space is typical these days per HDFS node.
Jon

On Tue, May 30, 2017 at 2:30 PM Daniel Steuernol <dan...@sendwithus.com> wrote:

> My question is about Cassandra. Ultimately I'm trying to figure out why
> our cluster's performance degrades approximately every 6 days. I noticed
> that the load reported by nodetool status was very high, but that might
> be unrelated to the problem. A restart solves the performance problem.
>
> I've attached a latency graph for inserts into the cluster. As you can see,
> over the weekend there was a massive latency spike, and it was fixed by a
> restart of all the nodes.
>
> On May 30 2017, at 2:18 pm, Jonathan Haddad <j...@jonhaddad.com> wrote:
>
>> This isn't an HDFS mailing list.
>>
>> On Tue, May 30, 2017 at 2:14 PM daemeon reiydelle <daeme...@gmail.com> wrote:
>>
>> No, 3 TB is small. 30-50 TB of HDFS space is typical these days per HDFS
>> node. It depends somewhat on whether there is a mix of more and less
>> frequently accessed data. But even storing only hot data, I never saw
>> anything less than 20 TB of HDFS per node.
>>
>> *Daemeon C.M. Reiydelle*
>> *USA (+1) 415.501.0198*
>> *London (+44) (0) 20 8144 9872*
>>
>> *“All men dream, but not equally. Those who dream by night in the dusty
>> recesses of their minds wake up in the day to find it was vanity, but the
>> dreamers of the day are dangerous men, for they may act their dreams with
>> open eyes, to make it possible.” — T.E. Lawrence*
>>
>> On Tue, May 30, 2017 at 2:00 PM, tommaso barbugli <tbarbu...@gmail.com> wrote:
>>
>> Am I the only one thinking 3 TB is way too much data for a single node on
>> a VM?
>>
>> On Tue, May 30, 2017 at 10:36 PM, Daniel Steuernol <dan...@sendwithus.com> wrote:
>>
>> I don't believe incremental repair is enabled; I have never enabled it on
>> the cluster, and unless it's the default, it is off. Also, I don't see a
>> setting for it in cassandra.yaml.
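Since the thread keeps coming back to the Load figure in nodetool status, one way to confirm whether it really grows over the week is to log it periodically. A minimal sketch below; the sample output, addresses, and sizes are made up for illustration, not taken from the cluster in question:

```shell
# Sketch: pull the per-node Load column out of `nodetool status` so it can
# be logged and graphed over time. The canned text below stands in for a
# real `nodetool status` invocation; addresses and sizes are hypothetical.
sample='Datacenter: dc1
==========
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load      Tokens  Owns  Host ID                               Rack
UN  10.0.0.1   3.41 TiB  256     ?     11111111-1111-1111-1111-111111111111  r1
UN  10.0.0.2   3.87 TiB  256     ?     22222222-2222-2222-2222-222222222222  r1'

# Status lines start with U/D (up/down) plus N/L/J/M (state); print the
# address, load value, and unit. In production, replace `echo "$sample"`
# with `nodetool status`.
echo "$sample" | awk '/^(U|D)[NLJM] / {print $2, $3, $4}'
```

Run from cron and appended to a file with a timestamp, this gives a cheap time series to compare against the actual `df` disk usage.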
>>
>> On May 30 2017, at 1:10 pm, daemeon reiydelle <daeme...@gmail.com> wrote:
>>
>> Unless there is a bug, snapshots are excluded (they are not HDFS anyway!)
>> from nodetool status.
>>
>> Out of curiosity, is incremental repair enabled? This is almost
>> certainly a rat hole, but there was an issue a few releases back where load
>> would only increase until the node was restarted. It had been fixed ages
>> ago, but I'm wondering what happens when you restart a node, IF you have
>> incremental repair enabled.
>>
>> On Tue, May 30, 2017 at 12:15 PM, Varun Gupta <var...@uber.com> wrote:
>>
>> Can you please check if you have incremental backup enabled and snapshots
>> are occupying the space?
>>
>> Run the nodetool clearsnapshot command.
>>
>> On Tue, May 30, 2017 at 11:12 AM, Daniel Steuernol <dan...@sendwithus.com> wrote:
>>
>> It's 3-4 TB per node, and by "load rises" I'm talking about load as
>> reported by nodetool status.
>>
>> On May 30 2017, at 10:25 am, daemeon reiydelle <daeme...@gmail.com> wrote:
>>
>> When you say "the load rises ...", could you clarify what you mean by
>> "load"? That is a specific Linux term, and also one in e.g. Cloudera
>> Manager. But in neither case would it be relevant to transient or
>> persisted disk. Am I missing something?
>>
>> On Tue, May 30, 2017 at 10:18 AM, tommaso barbugli <tbarbu...@gmail.com> wrote:
>>
>> 3-4 TB per node or in total?
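On Varun's snapshot suggestion: besides `nodetool listsnapshots` and `nodetool clearsnapshot`, the space snapshots hold can be measured directly on disk. A minimal sketch, assuming the default data directory layout (`<data_dir>/<keyspace>/<table>/snapshots/...`); the `DATA_DIR` path is an assumption, not something stated in the thread:

```shell
# Sketch: measure how much disk the snapshot directories are holding,
# since snapshot hard links are not reflected in the Load figure.
# DATA_DIR is an assumption; /var/lib/cassandra/data is a common default.
DATA_DIR="${DATA_DIR:-/var/lib/cassandra/data}"
find "$DATA_DIR" -type d -name snapshots -exec du -sh {} + 2>/dev/null || true

# To actually reclaim the space:
#   nodetool listsnapshots   # inspect what exists first
#   nodetool clearsnapshot   # then clear
```

If these directories are small, snapshots are not the source of the growing load number and the incremental-backup `backups/` directories are worth checking the same way.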
>>
>> On Tue, May 30, 2017 at 6:48 PM, Daniel Steuernol <dan...@sendwithus.com> wrote:
>>
>> I should also mention that I am running Cassandra 3.10 on the cluster.
>>
>> On May 29 2017, at 9:43 am, Daniel Steuernol <dan...@sendwithus.com> wrote:
>>
>> The cluster is running with RF=3; right now each node is storing about
>> 3-4 TB of data. I'm using r4.2xlarge EC2 instances; these have 8 vCPUs,
>> 61 GB of RAM, and the disks attached for the data drive are gp2 SSD EBS
>> volumes with 10k IOPS. I guess this brings up the question of what's a
>> good marker for deciding whether to increase disk space vs. provisioning
>> a new node?
>>
>> On May 29 2017, at 9:35 am, tommaso barbugli <tbarbu...@gmail.com> wrote:
>>
>> Hi Daniel,
>>
>> This is not normal. Possibly a capacity problem. What's the RF, how much
>> data do you store per node, and what kind of servers do you use (core
>> count, RAM, disk, ...)?
>>
>> Cheers,
>> Tommaso
>>
>> On Mon, May 29, 2017 at 6:22 PM, Daniel Steuernol <dan...@sendwithus.com> wrote:
>>
>> I am running a 6-node cluster, and I have noticed that the reported load
>> on each node rises throughout the week and grows way past the actual disk
>> space used and available on each node. Eventually, latency for operations
>> suffers and the nodes have to be restarted. A couple of questions on
>> this: is this normal? Does Cassandra need to be restarted every few days
>> for best performance? Any insight on this behaviour would be helpful.
>>
>> Cheers,
>> Daniel
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: user-h...@cassandra.apache.org
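On the "more disk vs. more nodes" question raised above: one rough rule of thumb (an assumption about size-tiered compaction's worst case, not anything stated in the thread) is that a node can transiently need about as much free space as the data it is compacting, so keeping roughly 50% of the disk free is often advised. A back-of-envelope sketch; `disk_tb` is a hypothetical volume size, `data_tb` is the 3-4 TB per-node figure from the thread:

```shell
# Back-of-envelope sketch: with size-tiered compaction a node may
# transiently need free space comparable to the data being compacted,
# so usable capacity is roughly half the disk. disk_tb is hypothetical;
# data_tb is the per-node load figure from the thread.
disk_tb=8
data_tb=4
headroom_tb=$(( disk_tb - 2 * data_tb ))
echo "compaction headroom: ${headroom_tb} TB"
# A result <= 0 suggests adding nodes or growing the volumes before
# compaction (or repair/streaming) runs the disk out of space.
```

By this yardstick, 3-4 TB on a node is already at the edge unless the volumes are considerably larger than the data, which would argue for adding nodes rather than disk.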