Note that you can dedicate, let's say, 50 GB to the OS. Considering
your disks should be a minimum of 1000 GB, and probably 1.5-2 TB now
depending on what's cost effective, that is small change and worth the
price of admission, I think.
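
To put rough numbers on "small change", here is a quick sketch of how
much of one disk a 50 GB OS slice takes at the sizes mentioned above.
The helper name is made up for illustration.

```python
def os_overhead_percent(os_gb, disk_gb):
    """Percent of one disk's capacity taken by an OS partition of os_gb."""
    return 100.0 * os_gb / disk_gb

# Disk sizes from the discussion: 1000 GB minimum, 1.5-2 TB likely.
for disk_gb in (1000, 1500, 2000):
    pct = os_overhead_percent(50, disk_gb)
    print(f"{disk_gb} GB disk: {pct:.1f}% goes to the OS")
```

That works out to 5% of a 1000 GB disk and less on anything larger.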

The raid idea sounds good; the only potential caveats would be to
ensure IO to the OS raid isn't choking out HDFS and that the raid5
write hole doesn't ruin you.
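
For reference, the 100% vs 33% figures quoted below can be reproduced
with a tiny sketch. It assumes exactly one disk has failed and that
every disk is equally likely to be the one, as stated in the quoted
message; the function name is hypothetical.

```python
from fractions import Fraction

def box_loss_chance(n_disks, os_disks):
    """Chance that the one failed disk is a disk the OS depends on.

    Assumes a single failure, equally likely on any of n_disks, and an
    OS layout with no redundancy (striped or on a single disk).
    """
    return Fraction(os_disks, n_disks)

striped = box_loss_chance(3, 3)  # OS striped across all 3 disks
single = box_loss_chance(3, 1)   # OS only on sda

print(f"striped OS: {float(striped):.0%}, OS on one disk: {float(single):.0%}")
```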

On Thu, Sep 30, 2010 at 4:31 PM, Daniel Einspanjer
<[email protected]> wrote:
> We had another configuration we used at first which had four disks with the
> first disk having an extra partition that was quite tiny for the OS.
> Two annoying things there:
> 1. If we lost disk 1, we lost the whole box.
> 2. RHEL5 tends to put several GB worth of crap on the OS partition, such
> as unused locale files, and the yum cache can easily take up a few
> hundred MB. This means that we are constantly cleaning up yum and
> deleting old log files and being generally space constrained on these
> boxes.
>
> -Daniel
>
> On 9/30/10 7:25 PM, Ryan Rawson wrote:
>>
>> What kind of raid are you doing?  Sounds like raid0, which means you
>> have a 100% chance of losing the entire box if a single disk goes
>> down.  If you choose just one, let's say sda, to host the OS, you are
>> now at a 33% chance of losing the box if a disk goes bad, assuming that
>> all disks have the same failure probability of course.
>>
>> What we do is install the OS on disk1 (sda), then have 4 JBODs, and I
>> put our logs on disk1 as well.  log4j is tricky because it will cause
>> issues on disk corruption/IO error events, but I have seen systems
>> continue to operate even if log4j can't write to disk due to a disk
>> full scenario.
>>
>> There is almost no non-HDFS data; you can literally wedge it into
>> something like 8 GB.  The biggest things that are not HDFS data are
>> logs, and those can go into the HDFS partition. They tend to be low
>> volume but can add up over time since the default is not to reap them.
>>
>>
>>
>> On Thu, Sep 30, 2010 at 4:17 PM, Daniel Einspanjer
>> <[email protected]>  wrote:
>>>
>>> Right now, most of our boxes have 3 disks in them.  We take a small
>>> partition on each of those and raid stripe them together to use as the
>>> OS partition, then allocate the rest of the disks as JBOD for HDFS
>>> storage.
>>>
>>> We are building out a new cluster and I'm wondering if there are any
>>> better ideas for balancing the need for storage and speed of the HDFS
>>> disks with having *some place* to put the OS and non-HDFS data.
>>>
>>> What are other people doing about that?
>>>
>>> -Daniel
>>>
>
