Daniel - my comment wasn't to you, it was in response to Daemeon.

> No, 3 TB is small. 30-50 TB of HDFS space is typical these days per HDFS node.
Jon

On Tue, May 30, 2017 at 2:30 PM Daniel Steuernol <dan...@sendwithus.com> wrote:

> My question is about Cassandra. Ultimately I'm trying to figure out why
> our cluster's performance degrades approximately every 6 days. I noticed
> that the load reported by nodetool status was very high, but that might
> be unrelated to the problem. A restart solves the performance problem.
>
> I've attached a latency graph for inserts into the cluster. As you can see,
> over the weekend there was a massive latency spike, and it was fixed by a
> restart of all the nodes.
>
> On May 30 2017, at 2:18 pm, Jonathan Haddad <j...@jonhaddad.com> wrote:
>
>> This isn't an HDFS mailing list.
>>
>> On Tue, May 30, 2017 at 2:14 PM daemeon reiydelle <daeme...@gmail.com> wrote:
>>
>> No, 3 TB is small. 30-50 TB of HDFS space is typical these days per HDFS
>> node. It depends somewhat on whether there is a mix of more and less
>> frequently accessed data. But even storing only hot data, I never saw
>> anything less than 20 TB of HDFS per node.
>>
>> *Daemeon C.M. Reiydelle*
>> *USA (+1) 415.501.0198*
>> *London (+44) (0) 20 8144 9872*
>>
>> *“All men dream, but not equally. Those who dream by night in the dusty
>> recesses of their minds wake up in the day to find it was vanity, but the
>> dreamers of the day are dangerous men, for they may act their dreams with
>> open eyes, to make it possible.” — T.E. Lawrence*
>>
>> On Tue, May 30, 2017 at 2:00 PM, tommaso barbugli <tbarbu...@gmail.com> wrote:
>>
>> Am I the only one thinking 3 TB is way too much data for a single node on
>> a VM?
>>
>> On Tue, May 30, 2017 at 10:36 PM, Daniel Steuernol <dan...@sendwithus.com> wrote:
>>
>> I don't believe incremental repair is enabled; I have never enabled it on
>> the cluster, and unless it's the default, it is off. Also, I don't see a
>> setting for it in cassandra.yaml.
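Since the thread keeps coming back to the Load figure in nodetool status, one way to confirm whether it really grows over the week is to log it periodically. A minimal sketch below; the sample output, addresses, and sizes are made up for illustration, not taken from the cluster in question:

```shell
# Sketch: pull the per-node Load column out of `nodetool status` so it can
# be logged and graphed over time. The canned text below stands in for a
# real `nodetool status` invocation; addresses and sizes are hypothetical.
sample='Datacenter: dc1
==========
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load      Tokens  Owns  Host ID                               Rack
UN  10.0.0.1   3.41 TiB  256     ?     11111111-1111-1111-1111-111111111111  r1
UN  10.0.0.2   3.87 TiB  256     ?     22222222-2222-2222-2222-222222222222  r1'

# Status lines start with U/D (up/down) plus N/L/J/M (state); print the
# address, load value, and unit. In production, replace `echo "$sample"`
# with `nodetool status`.
echo "$sample" | awk '/^(U|D)[NLJM] / {print $2, $3, $4}'
```

Run from cron and appended to a file with a timestamp, this gives a cheap time series to compare against the actual `df` disk usage.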
>>
>> On May 30 2017, at 1:10 pm, daemeon reiydelle <daeme...@gmail.com> wrote:
>>
>> Unless there is a bug, snapshots are excluded (they are not HDFS anyway!)
>> from nodetool status.
>>
>> Out of curiosity, is incremental repair enabled? This is almost
>> certainly a rat hole, but there was an issue a few releases back where load
>> would only increase until the node was restarted. It had been fixed ages
>> ago, but I'm wondering what happens when you restart a node, IF you have
>> incremental repair enabled.
>>
>> On Tue, May 30, 2017 at 12:15 PM, Varun Gupta <var...@uber.com> wrote:
>>
>> Can you please check if you have incremental backup enabled and snapshots
>> are occupying the space?
>>
>> Run the nodetool clearsnapshot command.
>>
>> On Tue, May 30, 2017 at 11:12 AM, Daniel Steuernol <dan...@sendwithus.com> wrote:
>>
>> It's 3-4 TB per node, and by "load rises" I'm talking about load as
>> reported by nodetool status.
>>
>> On May 30 2017, at 10:25 am, daemeon reiydelle <daeme...@gmail.com> wrote:
>>
>> When you say "the load rises ...", could you clarify what you mean by
>> "load"? That is a specific Linux term, and also one in e.g. Cloudera
>> Manager. But in neither case would it be relevant to transient or
>> persisted disk. Am I missing something?
>>
>> On Tue, May 30, 2017 at 10:18 AM, tommaso barbugli <tbarbu...@gmail.com> wrote:
>>
>> 3-4 TB per node or in total?
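On Varun's snapshot suggestion: besides `nodetool listsnapshots` and `nodetool clearsnapshot`, the space snapshots hold can be measured directly on disk. A minimal sketch, assuming the default data directory layout (`<data_dir>/<keyspace>/<table>/snapshots/...`); the `DATA_DIR` path is an assumption, not something stated in the thread:

```shell
# Sketch: measure how much disk the snapshot directories are holding,
# since snapshot hard links are not reflected in the Load figure.
# DATA_DIR is an assumption; /var/lib/cassandra/data is a common default.
DATA_DIR="${DATA_DIR:-/var/lib/cassandra/data}"
find "$DATA_DIR" -type d -name snapshots -exec du -sh {} + 2>/dev/null || true

# To actually reclaim the space:
#   nodetool listsnapshots   # inspect what exists first
#   nodetool clearsnapshot   # then clear
```

If these directories are small, snapshots are not the source of the growing load number and the incremental-backup `backups/` directories are worth checking the same way.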
>>
>> On Tue, May 30, 2017 at 6:48 PM, Daniel Steuernol <dan...@sendwithus.com> wrote:
>>
>> I should also mention that I am running Cassandra 3.10 on the cluster.
>>
>> On May 29 2017, at 9:43 am, Daniel Steuernol <dan...@sendwithus.com> wrote:
>>
>> The cluster is running with RF=3; right now each node is storing about
>> 3-4 TB of data. I'm using r4.2xlarge EC2 instances; these have 8 vCPUs,
>> 61 GB of RAM, and the disks attached for the data drive are gp2 SSD EBS
>> volumes with 10k IOPS. I guess this brings up the question of what's a
>> good marker for deciding whether to increase disk space vs. provisioning
>> a new node?
>>
>> On May 29 2017, at 9:35 am, tommaso barbugli <tbarbu...@gmail.com> wrote:
>>
>> Hi Daniel,
>>
>> This is not normal. Possibly a capacity problem. What's the RF, how much
>> data do you store per node, and what kind of servers do you use (core
>> count, RAM, disk, ...)?
>>
>> Cheers,
>> Tommaso
>>
>> On Mon, May 29, 2017 at 6:22 PM, Daniel Steuernol <dan...@sendwithus.com> wrote:
>>
>> I am running a 6-node cluster, and I have noticed that the reported load
>> on each node rises throughout the week and grows way past the actual disk
>> space used and available on each node. Eventually, latency for operations
>> suffers and the nodes have to be restarted. A couple of questions on
>> this: is this normal? Does Cassandra need to be restarted every few days
>> for best performance? Any insight on this behaviour would be helpful.
>>
>> Cheers,
>> Daniel
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: user-h...@cassandra.apache.org
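On the "more disk vs. more nodes" question raised above: one rough rule of thumb (an assumption about size-tiered compaction's worst case, not anything stated in the thread) is that a node can transiently need about as much free space as the data it is compacting, so keeping roughly 50% of the disk free is often advised. A back-of-envelope sketch; `disk_tb` is a hypothetical volume size, `data_tb` is the 3-4 TB per-node figure from the thread:

```shell
# Back-of-envelope sketch: with size-tiered compaction a node may
# transiently need free space comparable to the data being compacted,
# so usable capacity is roughly half the disk. disk_tb is hypothetical;
# data_tb is the per-node load figure from the thread.
disk_tb=8
data_tb=4
headroom_tb=$(( disk_tb - 2 * data_tb ))
echo "compaction headroom: ${headroom_tb} TB"
# A result <= 0 suggests adding nodes or growing the volumes before
# compaction (or repair/streaming) runs the disk out of space.
```

By this yardstick, 3-4 TB on a node is already at the edge unless the volumes are considerably larger than the data, which would argue for adding nodes rather than disk.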