This isn't an HDFS mailing list.

On Tue, May 30, 2017 at 2:14 PM daemeon reiydelle <daeme...@gmail.com>
wrote:

> No, 3 TB is small. 30-50 TB of HDFS space per HDFS node is typical these
> days. It depends somewhat on whether there is a mix of more and less
> frequently accessed data, but even when storing only hot data, I never saw
> anything less than 20 TB of HDFS per node.
>
> *Daemeon C.M. Reiydelle
> USA (+1) 415.501.0198
> London (+44) (0) 20 8144 9872*
>
> *“All men dream, but not equally. Those who dream by night in the dusty
> recesses of their minds wake up in the day to find it was vanity, but the
> dreamers of the day are dangerous men, for they may act their dreams with
> open eyes, to make it possible.” — T.E. Lawrence*
>
>
> On Tue, May 30, 2017 at 2:00 PM, tommaso barbugli <tbarbu...@gmail.com>
> wrote:
>
>> Am I the only one thinking 3TB is way too much data for a single node on
>> a VM?
>>
>> On Tue, May 30, 2017 at 10:36 PM, Daniel Steuernol <dan...@sendwithus.com
>> > wrote:
>>
>>> I don't believe incremental repair is enabled. I have never enabled it
>>> on the cluster, so unless it's on by default, it is off. I also don't
>>> see a setting for it in cassandra.yaml.
>>>
>>>
>>>
>>> On May 30 2017, at 1:10 pm, daemeon reiydelle <daeme...@gmail.com>
>>> wrote:
>>>
>>>> Unless there is a bug, snapshots are excluded (they are not HDFS
>>>> anyway!) from nodetool status.
>>>>
>>>> Out of curiosity, is incremental repair enabled? This is almost
>>>> certainly a rat hole, but there was an issue a few releases back where load
>>>> would only increase until the node was restarted. It was fixed ages ago,
>>>> but I'm wondering what happens if you restart a node, IF you have
>>>> incremental enabled.
>>>>
>>>> On Tue, May 30, 2017 at 12:15 PM, Varun Gupta <var...@uber.com> wrote:
>>>>
>>>> Can you please check whether incremental backups are enabled and whether
>>>> snapshots are occupying the space?
>>>>
>>>> Run the nodetool clearsnapshot command.
>>>>
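To make the snapshot and backup check above concrete, here is a minimal Python sketch that totals the apparent size of the `snapshots` and `backups` directories under a Cassandra data directory. It assumes the usual on-disk layout of `<data_dir>/<keyspace>/<table>/snapshots/` and `<data_dir>/<keyspace>/<table>/backups/`; the function names and the example path are illustrative and not taken from the thread.

```python
import os


def dir_size(path):
    """Sum the apparent size in bytes of all files below path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                # A file may vanish while we walk (e.g. compaction); skip it.
                pass
    return total


def snapshot_and_backup_bytes(data_dir):
    """Return apparent bytes held under snapshots/ and backups/ directories.

    Assumes the layout <data_dir>/<keyspace>/<table>/{snapshots,backups}.
    """
    totals = {"snapshots": 0, "backups": 0}
    for keyspace in os.listdir(data_dir):
        ks_path = os.path.join(data_dir, keyspace)
        if not os.path.isdir(ks_path):
            continue
        for table in os.listdir(ks_path):
            for kind in totals:
                sub = os.path.join(ks_path, table, kind)
                if os.path.isdir(sub):
                    totals[kind] += dir_size(sub)
    return totals
```

It could be called as `snapshot_and_backup_bytes("/var/lib/cassandra/data")`, where that path is only an assumed default. Snapshot and incremental backup files are hard links to SSTables, so these totals are an upper bound: a file costs extra disk space only once the live SSTable it points to has been compacted away.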
>>>> On Tue, May 30, 2017 at 11:12 AM, Daniel Steuernol <
>>>> dan...@sendwithus.com> wrote:
>>>>
>>>> It's 3-4 TB per node, and by "load rises" I mean the load as reported
>>>> by nodetool status.
>>>>
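Since the symptom is the Load column of nodetool status growing past the size of the disk, a small script can make that comparison explicit. The sketch below parses the Load column from captured `nodetool status` text and flags nodes whose reported load exceeds a given disk capacity. The row layout and unit suffixes it expects are assumptions (they vary between Cassandra versions), and `parse_loads` and `nodes_over_capacity` are hypothetical helper names.

```python
import re

# Unit suffixes seen in the Load column; treated as binary multiples here.
_UNITS = {
    "bytes": 1,
    "KB": 1024, "KiB": 1024,
    "MB": 1024 ** 2, "MiB": 1024 ** 2,
    "GB": 1024 ** 3, "GiB": 1024 ** 3,
    "TB": 1024 ** 4, "TiB": 1024 ** 4,
}

# Assumed row shape: "<status><state>  <address>  <number> <unit>  ..."
_ROW = re.compile(
    r"^(?P<state>[UD][NLJM])\s+(?P<address>\S+)\s+"
    r"(?P<load>\d+(?:\.\d+)?)\s+(?P<unit>bytes|[KMGT]i?B)\b"
)


def parse_loads(status_text):
    """Map node address -> reported load in bytes.

    Rows that do not match (headers, unknown load shown as '?') are skipped.
    """
    loads = {}
    for line in status_text.splitlines():
        match = _ROW.match(line.strip())
        if match:
            value = float(match.group("load")) * _UNITS[match.group("unit")]
            loads[match.group("address")] = int(value)
    return loads


def nodes_over_capacity(loads, disk_total_bytes):
    """Return addresses whose reported load exceeds the disk capacity."""
    return sorted(addr for addr, load in loads.items() if load > disk_total_bytes)
```

The status text could come from running `nodetool status` on a node and saving the output, and the capacity of the data volume could be read with `shutil.disk_usage(path).total`. A node that shows up in `nodes_over_capacity` is reporting more data than its disk can hold, which points at the reported figure rather than at real disk usage.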
>>>>
>>>>
>>>> On May 30 2017, at 10:25 am, daemeon reiydelle <daeme...@gmail.com>
>>>> wrote:
>>>>
>>>> When you say "the load rises ... ", could you clarify what you mean by
>>>> "load"? That has a specific meaning in Linux, and in e.g. Cloudera
>>>> Manager. But in neither case would that be relevant to transient or
>>>> persisted disk. Am I missing something?
>>>>
>>>>
>>>> On Tue, May 30, 2017 at 10:18 AM, tommaso barbugli <tbarbu...@gmail.com
>>>> > wrote:
>>>>
>>>> 3-4 TB per node or in total?
>>>>
>>>> On Tue, May 30, 2017 at 6:48 PM, Daniel Steuernol <
>>>> dan...@sendwithus.com> wrote:
>>>>
>>>> I should also mention that I am running Cassandra 3.10 on the cluster.
>>>>
>>>>
>>>>
>>>> On May 29 2017, at 9:43 am, Daniel Steuernol <dan...@sendwithus.com>
>>>> wrote:
>>>>
>>>> The cluster is running with RF=3, and right now each node is storing
>>>> about 3-4 TB of data. I'm using r4.2xlarge EC2 instances, which have 8
>>>> vCPUs and 61 GB of RAM, and the disks attached for the data drive are gp2
>>>> SSD EBS volumes with 10k IOPS. I guess this brings up the question of
>>>> what's a good marker for deciding whether to increase disk space or
>>>> provision a new node?
>>>>
>>>>
>>>> On May 29 2017, at 9:35 am, tommaso barbugli <tbarbu...@gmail.com>
>>>> wrote:
>>>>
>>>> Hi Daniel,
>>>>
>>>> This is not normal. Possibly a capacity problem. What's the RF, how much
>>>> data do you store per node, and what kind of servers do you use (core
>>>> count, RAM, disk, ...)?
>>>>
>>>> Cheers,
>>>> Tommaso
>>>>
>>>> On Mon, May 29, 2017 at 6:22 PM, Daniel Steuernol <
>>>> dan...@sendwithus.com> wrote:
>>>>
>>>>
>>>> I am running a 6-node cluster, and I have noticed that the reported
>>>> load on each node rises throughout the week and grows way past the actual
>>>> disk space used and available on each node. Eventually, latency for
>>>> operations also suffers and the nodes have to be restarted. A couple of
>>>> questions on this: is this normal? Also, does Cassandra need to be
>>>> restarted every few days for best performance? Any insight on this
>>>> behaviour would be helpful.
>>>>
>>>> Cheers,
>>>> Daniel
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org For
>>>> additional commands, e-mail: user-h...@cassandra.apache.org