Hi.

>
> Depends on what kind of I/O you do - are you going to be using MapReduce
> and co-locating jobs and data?  If so, it's possible to get close to those
> speeds if you are I/O bound in your job and read right through each chunk.
>  If you have multiple disks mounted individually, you'll need the number of
> streams equal to the number of disks.  If you're going to do I/O that's not
> through MapReduce, you'll probably be bound by the network interface.
>

Btw, this is what I wanted to ask as well:

Is it more efficient to unify the disks into one volume (RAID or LVM) and
then present them as a single space? Or is it better to specify each disk
separately?

Reliability-wise, the latter sounds more correct, since one or several (up
to 3) disks going down won't take the whole node with them. But perhaps
there is a performance penalty?
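
To be clear about what I mean by specifying each disk separately, here is a
rough sketch of the DataNode setting as I understand it. The mount points
below are made up for illustration, and the property is dfs.data.dir in older
releases (I believe it is called dfs.datanode.data.dir in later ones), so
please check it against your version:

```xml
<!-- Sketch only: /mnt/disk1..3 are hypothetical mount points,
     one per physical disk, each listed as its own data directory. -->
<property>
  <name>dfs.data.dir</name>
  <value>/mnt/disk1/hdfs/data,/mnt/disk2/hdfs/data,/mnt/disk3/hdfs/data</value>
</property>
```

With RAID or LVM there would instead be a single directory on the combined
volume in that list.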
