Re: [ceph-users] HW Raid vs. Multiple OSD

2017-11-15 Thread Phil Schwarz
Hi,
thanks for the explanation, but...
Twisting the Ceph storage model the way you plan is not a good idea:
- You will reduce the level of support (I'm not sure many people
build such an architecture).
- You are certainly going to face strange issues with HW RAID underneath
Ceph OSDs.
- You shouldn't go down to size=2. I know the trade-offs of size=3
(IOPS, usable space), but it does not seem really safe to drop to size=2.
- Your servers seem to have enough horsepower in terms of CPU, RAM and
disks, but you haven't told us about the Ceph replication network. At least
10 GbE, I hope.
- Your public network should be more than 1 GbE too, ideally far more.
- How will you export the VMs? A single KVM/Samba server? cephx clients?
- Roughly, with size=3 and 4 servers, you have 4*8*2/3 ~= 21 TB of usable
space; with 100 VDI that is about 210 GB per VM. Is that enough to expand
those VM sizes? (A quick sketch of this arithmetic follows below.)
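For reference, a back-of-the-envelope sketch of that arithmetic in Python (the host count, disk size and VDI count are just the figures quoted in this thread, not recommendations):

    # Rough usable-capacity estimate (illustrative only; ignores near-full
    # ratios, filesystem/BlueStore overhead and thin provisioning).
    hosts = 4
    disks_per_host = 8
    disk_tb = 1.9          # the 1.9 TB SSDs from the proposed spec
    vdi_count = 100

    for size in (2, 3):    # replication factor
        usable_tb = hosts * disks_per_host * disk_tb / size
        print(f"size={size}: ~{usable_tb:.1f} TB usable, "
              f"~{usable_tb * 1000 / vdi_count:.0f} GB per VDI")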


In conclusion, I fully understand the difficulty of building a complete test lab
before buying a complete cluster, but you should still do a few tests first
to tweak the solution to your needs.

Good luck
Best regards


On 14/11/2017 at 11:36, Oscar Segarra wrote:
> Hi Anthony,
> 
> 
> o I think you might have some misunderstandings about how Ceph works. 
> Ceph is best deployed as a single cluster spanning multiple servers,
> generally at least 3.  Is that your plan?   
> 
> I want to deploy servers for 100 Windows 10 VDIs each (at least 3 servers).
> I plan to sell servers depending on the number of VDIs required by my
> customer. For 100 VDI --> 3 servers, for 400 VDI --> 4 servers
> 
> This is my proposal of configuration:
> 
> *Server1:*
> CPU: 2x16 cores
> RAM: 512 GB
> Disk: 2x400 GB for OS and 8x1.9 TB SSD for VMs
> 
> *Server2:*
> CPU: 2x16 cores
> RAM: 512 GB
> Disk: 2x400 GB for OS and 8x1.9 TB SSD for VMs
> 
> *Server3:*
> CPU: 2x16 cores
> RAM: 512 GB
> Disk: 2x400 GB for OS and 8x1.9 TB SSD for VMs
> 
> *Server4:*
> CPU: 2x16 cores
> RAM: 512 GB
> Disk: 2x400 GB for OS and 8x1.9 TB SSD for VMs
> ...
> *ServerN:*
> CPU: 2x16 cores
> RAM: 512 GB
> Disk: 2x400 GB for OS and 8x1.9 TB SSD for VMs
> 
> If I create an OSD for each disk and pin a core to each OSD in a
> server, I will need 8 cores just for running OSDs. If I create 4 RAID0
> arrays of 2 disks each, I will need just 4 OSDs, and so on:
> 
> 1 osd x 1 disk of 4TB
> 1 osd x 2 disks of 2TB
> 1 osd x 4 disks of 1 TB
> 
> If the CPU cycles used by Ceph are a problem, your architecture has IMHO
> bigger problems.  You need to design for a safety margin of RAM and CPU
> to accommodate spikes in usage, both by Ceph and by your desktops. 
> There is no way each of the systems you describe is going to have enough
> cycles for 100 desktops concurrently active.  You'd be allocating each
> of them only ~3GB of RAM -- I've not had to run MS Windows 10 but even
> with page sharing that seems awfully tight on RAM.
> 
> Sorry, I don't think my design was explained correctly. I hope my
> previous explanation clarifies it. The problem is that I'm in the design
> phase and I don't know whether Ceph CPU cycles will be a problem, and that is
> the main point of this post.
> 
> With the numbers you mention throughout the thread, it would seem as
> though you would end up with potentially as little as 80GB of usable
> space per virtual desktop - will that meet your needs?
> 
> I think 80 GB is enough; nevertheless, I plan to use RBD clones,
> and therefore even with size=2 I think I will have more than 80 GB
> available for each VDI.
> 
> At this design phase, any advice is really welcome!
> 
> Thanks a lot
> 
> 2017-11-13 23:40 GMT+01:00 Anthony D'Atri:
> 
> Oscar, a few thoughts:
> 
> o I think you might have some misunderstandings about how Ceph
> works.  Ceph is best deployed as a single cluster spanning multiple
> servers, generally at least 3.  Is that your plan?  It sort of
> sounds as though you're thinking of Ceph managing only the drives
> local to each of your converged VDI hosts, like local RAID would. 
> Ceph doesn't work that way.  Well, technically it could but wouldn't
> be a great architecture.  You would want to have at least 3 servers,
> with all of the Ceph OSDs in a single cluster.
> 
> o Re RAID0:
> 
> > Then, may I understand that your advice is a RAID0 for each 4TB? For a
> > balanced configuration...
> >
> > 1 osd x 1 disk of 4TB
> > 1 osd x 2 disks of 2TB
> > 1 osd x 4 disks of 1 TB
> 
> 
> For performance a greater number of smaller drives is generally
> going to be best.  VDI desktops are going to be fairly
> latency-sensitive and you'd really do best with SSDs.  All those
> desktops thrashing a small number of HDDs is not going to deliver
> tolerable performance.
> 
> Don't use RAID at all for the OSDs.  Even if you get hardware RAID
> HBAs, configure JBOD/passthrough mode so that OSDs are deployed
> directly on the drives.  This will minimize latency as well as
>  

Re: [ceph-users] HW Raid vs. Multiple OSD

2017-11-14 Thread Oscar Segarra
Hi Anthony,


o I think you might have some misunderstandings about how Ceph works.  Ceph
is best deployed as a single cluster spanning multiple servers, generally
at least 3.  Is that your plan?

I want to deploy servers for 100 Windows 10 VDIs each (at least 3 servers). I
plan to sell servers depending on the number of VDIs required by my
customer. For 100 VDI --> 3 servers, for 400 VDI --> 4 servers

This is my proposal of configuration:

*Server1:*
CPU: 2x16 cores
RAM: 512 GB
Disk: 2x400 GB for OS and 8x1.9 TB SSD for VMs

*Server2:*
CPU: 2x16 cores
RAM: 512 GB
Disk: 2x400 GB for OS and 8x1.9 TB SSD for VMs

*Server3:*
CPU: 2x16 cores
RAM: 512 GB
Disk: 2x400 GB for OS and 8x1.9 TB SSD for VMs

*Server4:*
CPU: 2x16 cores
RAM: 512 GB
Disk: 2x400 GB for OS and 8x1.9 TB SSD for VMs
...
*ServerN:*
CPU: 2x16 cores
RAM: 512 GB
Disk: 2x400 GB for OS and 8x1.9 TB SSD for VMs

If I create an OSD for each disk and pin a core to each OSD in a server,
I will need 8 cores just for running OSDs. If I create 4 RAID0 arrays of 2 disks
each, I will need just 4 OSDs, and so on:

1 osd x 1 disk of 4TB
1 osd x 2 disks of 2TB
1 osd x 4 disks of 1 TB

If the CPU cycles used by Ceph are a problem, your architecture has IMHO
bigger problems.  You need to design for a safety margin of RAM and CPU to
accommodate spikes in usage, both by Ceph and by your desktops.  There is
no way each of the systems you describe is going to have enough cycles for
100 desktops concurrently active.  You'd be allocating each of them only
~3GB of RAM -- I've not had to run MS Windows 10 but even with page sharing
that seems awfully tight on RAM.

Sorry, I don't think my design was explained correctly. I hope my
previous explanation clarifies it. The problem is that I'm in the design phase
and I don't know whether Ceph CPU cycles will be a problem; that is the
main point of this post.
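As a rough illustration of that core-budget question, a hypothetical sketch (the one-core-per-OSD figure is only the rule of thumb mentioned in this thread, not a hard requirement):

    # Hypothetical per-host core budget (illustrative only).
    total_cores = 32
    cores_per_osd = 1      # rule-of-thumb reservation discussed in the thread

    for osd_count, layout in [(8, "1 OSD per disk"),
                              (4, "4x RAID0 of 2 disks"),
                              (1, "1 OSD on one big RAID volume")]:
        left_for_vms = total_cores - osd_count * cores_per_osd
        print(f"{layout}: {osd_count} OSDs -> {left_for_vms} cores left for guests")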

With the numbers you mention throughout the thread, it would seem as though
you would end up with potentially as little as 80GB of usable space per
virtual desktop - will that meet your needs?

I think 80 GB is enough; nevertheless, I plan to use RBD clones, and
therefore even with size=2 I think I will have more than 80 GB available
for each VDI.
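For reference, the RBD clone workflow is roughly snapshot -> protect -> clone. A minimal sketch, assuming the rbd CLI is already configured; the pool, image and snapshot names are placeholders:

    # Minimal sketch: copy-on-write clones from a golden Windows image.
    # Assumes the golden image is format 2 (required for cloning).
    import subprocess

    def run(cmd):
        print("+", " ".join(cmd))
        subprocess.check_call(cmd)

    pool, golden, snap = "vdi", "win10-golden", "base"

    run(["rbd", "snap", "create", f"{pool}/{golden}@{snap}"])
    run(["rbd", "snap", "protect", f"{pool}/{golden}@{snap}"])
    for i in range(3):   # one clone per virtual desktop
        run(["rbd", "clone", f"{pool}/{golden}@{snap}", f"{pool}/win10-vdi-{i:03d}"])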

At this design phase, any advice is really welcome!

Thanks a lot

2017-11-13 23:40 GMT+01:00 Anthony D'Atri :

> Oscar, a few thoughts:
>
> o I think you might have some misunderstandings about how Ceph works.
> Ceph is best deployed as a single cluster spanning multiple servers,
> generally at least 3.  Is that your plan?  It sort of sounds as though
> you're thinking of Ceph managing only the drives local to each of your
> converged VDI hosts, like local RAID would.  Ceph doesn't work that way.
> Well, technically it could but wouldn't be a great architecture.  You would
> want to have at least 3 servers, with all of the Ceph OSDs in a single
> cluster.
>
> o Re RAID0:
>
> > Then, may I understand that your advice is a RAID0 for each 4TB? For a
> > balanced configuration...
> >
> > 1 osd x 1 disk of 4TB
> > 1 osd x 2 disks of 2TB
> > 1 osd x 4 disks of 1 TB
>
>
> For performance a greater number of smaller drives is generally going to
> be best.  VDI desktops are going to be fairly latency-sensitive and you'd
> really do best with SSDs.  All those desktops thrashing a small number of
> HDDs is not going to deliver tolerable performance.
>
> Don't use RAID at all for the OSDs.  Even if you get hardware RAID HBAs,
> configure JBOD/passthrough mode so that OSDs are deployed directly on the
> drives.  This will minimize latency as well as manifold hassles that one
> adds when wrapping drives in HBA RAID volumes.
>
> o Re CPU:
>
> > The other question is considering having one OSD vs 8 OSDs... 8 OSDs will
> > consume more CPU than 1 OSD (RAID5)?
> >
> > As I want to share compute and osd in the same box, resources consumed by
> > OSD can be a handicap.
>
>
> If the CPU cycles used by Ceph are a problem, your architecture has IMHO
> bigger problems.  You need to design for a safety margin of RAM and CPU to
> accommodate spikes in usage, both by Ceph and by your desktops.  There is
> no way each of the systems you describe is going to have enough cycles for
> 100 desktops concurrently active.  You'd be allocating each of them only
> ~3GB of RAM -- I've not had to run MS Windows 10 but even with page sharing
> that seems awfully tight on RAM.
>
> Since you mention ProLiant and 8 drives, I'm going to assume you're targeting
> the DL360?  I suggest if possible considering the 10SFF models to get you
> more drive bays, ditching the optical drive.  If you can get rear bays to
> use to boot the OS from, that's better yet so you free up front panel drive
> bays for OSD use.  You want to maximize the number of drive bays available
> for OSD use, and if at all possible you want to avoid deploying the
> operating system's filesystems and OSDs on the same drives.
>
> With the numbers you mention throughout the thread, it would seem as
> 

Re: [ceph-users] HW Raid vs. Multiple OSD

2017-11-13 Thread Anthony D'Atri
Oscar, a few thoughts:

o I think you might have some misunderstandings about how Ceph works.  Ceph is 
best deployed as a single cluster spanning multiple servers, generally at least 
3.  Is that your plan?  It sort of sounds as though you're thinking of Ceph 
managing only the drives local to each of your converged VDI hosts, like local 
RAID would.  Ceph doesn't work that way.  Well, technically it could but 
wouldn't be a great architecture.  You would want to have at least 3 servers, 
with all of the Ceph OSDs in a single cluster.

o Re RAID0:

> Then, may I understand that your advice is a RAID0 for each 4TB? For a
> balanced configuration...
> 
> 1 osd x 1 disk of 4TB
> 1 osd x 2 disks of 2TB
> 1 osd x 4 disks of 1 TB


For performance a greater number of smaller drives is generally going to be 
best.  VDI desktops are going to be fairly latency-sensitive and you'd really 
do best with SSDs.  All those desktops thrashing a small number of HDDs is not 
going to deliver tolerable performance.

Don't use RAID at all for the OSDs.  Even if you get hardware RAID HBAs, 
configure JBOD/passthrough mode so that OSDs are deployed directly on the 
drives.  This will minimize latency as well as manifold hassles that one adds 
when wrapping drives in HBA RAID volumes.

o Re CPU:

> The other question is considering having one OSD vs 8 OSDs... 8 OSDs will
> consume more CPU than 1 OSD (RAID5)?
> 
> As I want to share compute and osd in the same box, resources consumed by
> OSD can be a handicap.


If the CPU cycles used by Ceph are a problem, your architecture has IMHO bigger 
problems.  You need to design for a safety margin of RAM and CPU to accommodate 
spikes in usage, both by Ceph and by your desktops.  There is no way each of 
the systems you describe is going to have enough cycles for 100 desktops 
concurrently active.  You'd be allocating each of them only ~3GB of RAM -- I've 
not had to run MS Windows 10 but even with page sharing that seems awfully 
tight on RAM.
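
To put rough numbers on that, a hypothetical RAM-budget sketch (the per-OSD and host-overhead figures are assumptions, not measurements):

    # Rough per-host RAM budget (illustrative; overhead figures are assumptions).
    ram_gb = 384            # per-host RAM mentioned earlier in the thread
    osds = 8
    ram_per_osd_gb = 2      # assumed steady-state OSD footprint; recovery spikes higher
    host_overhead_gb = 16   # assumed OS / hypervisor / page-cache reserve
    desktops = 100

    left_gb = ram_gb - osds * ram_per_osd_gb - host_overhead_gb
    print(f"~{left_gb / desktops:.1f} GB of RAM per desktop")   # roughly 3.5 GB here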

Since you mention ProLiant and 8 drives, I'm going to assume you're targeting the
DL360?  I suggest if possible considering the 10SFF models to get you more 
drive bays, ditching the optical drive.  If you can get rear bays to use to 
boot the OS from, that's better yet so you free up front panel drive bays for 
OSD use.  You want to maximize the number of drive bays available for OSD use, 
and if at all possible you want to avoid deploying the operating system's 
filesystems and OSDs on the same drives.

With the numbers you mention throughout the thread, it would seem as though you 
would end up with potentially as little as 80GB of usable space per virtual 
desktop - will that meet your needs?  One of the difficulties with converged 
architectures is that storage and compute don't necessarily scale at the same 
rate.  To that end I suggest considering 2U 25-drive-bay systems so that you 
have room to add more drives.





Re: [ceph-users] HW Raid vs. Multiple OSD

2017-11-13 Thread Oscar Segarra
Hi Brady,

For me it is very difficult to build a PoC because the servers are very expensive.

Then, do I understand correctly that your advice is one RAID0 per 4 TB? For a
balanced configuration...

1 osd x 1 disk of 4TB
1 osd x 2 disks of 2TB
1 osd x 4 disks of 1 TB

Is that right?

Thanks a lot



On Nov 13, 2017 18:40, "Brady Deetz" wrote:



On Nov 13, 2017 11:17 AM, "Oscar Segarra"  wrote:

Hi Brady,

Thanks a lot again for your comments and experience.

This is a departure from what I've seen people do here. I agree that 100
VMs on 24 cores would be potentially over consolidating. But, when it comes
to your storage, you probably don't want to lose the data and shouldn't
skimp. Could you lower VMs per host to 75-80?
--> Yes, that's the reason I'm asking this... If I create a RAID5 or RAID0
with 8 disks, I will have just a single OSD process and can therefore
leave 31 cores for my 100 VDIs, which I think should be enough.

Also, I notice you have no ssd storage. Are these VMs expected to be
performant at all? 100 VMs accessing 8 spinners could cause some serious
latency.
--> I'm planning to use all SSD in my infrastructure in order to avoid IO
issues. This should not be a problem.


My mistake, I read 8x 8TB not 1TB. There are some decent sizing
conversations on the list regarding all ssd deployments. If I were doing
this and forced to scrape a few more cores per host, I would run some tests
in different configurations. My guess is that 4x raid 0 per host will
result in a nice compromise between overhead, performance, and
consolidation ratio. But again, this is not an advised configuration. No
matter what, before I took this into production, I'd purchase enough
hardware to do a proof of concept using a minimal configuration of 3 hosts.
Then just run benchmarks with 1x raid 6, 1x raid 0, 4x raid 0, and no raid
+ pinned osd process 2-to-1 core.

If none of that works, it's back to the drawing board for you.
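
As a starting point for those benchmark passes, a minimal sketch driving rados bench from Python (the pool name and duration are placeholders; run it once per disk layout under test):

    # Minimal benchmark sketch: write, then sequential and random reads.
    # Assumes a dedicated test pool already exists.
    import subprocess

    POOL = "bench-test"
    SECONDS = "60"

    def bench(mode, extra=()):
        cmd = ["rados", "bench", "-p", POOL, SECONDS, mode, *extra]
        print("+", " ".join(cmd))
        subprocess.check_call(cmd)

    bench("write", ["--no-cleanup"])   # keep objects so the read phases have data
    bench("seq")
    bench("rand")
    subprocess.check_call(["rados", "-p", POOL, "cleanup"])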


Minimum cluster size should be 3 because you are making 3 replicas with
min_size 2. If you lose 1 host in a cluster of 2, you will likely lose
access to data because 2 replicas existed on the host that went down. You
will have a bad time if you run a cluster with 2 replicas.
--> Yes, it depends on the number of VDI nodes, starting from 3.
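
For reference, the replication settings described above are set per pool; a minimal sketch of applying them (the pool name is a placeholder):

    # Minimal sketch: 3 replicas with min_size 2 on a pool.
    import subprocess

    POOL = "vdi"
    subprocess.check_call(["ceph", "osd", "pool", "set", POOL, "size", "3"])
    subprocess.check_call(["ceph", "osd", "pool", "set", POOL, "min_size", "2"])
    print(subprocess.check_output(["ceph", "osd", "pool", "get", POOL, "size"]).decode())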

Thanks a lot in advance for your help!



2017-11-13 18:06 GMT+01:00 Brady Deetz :

>
>
> On Nov 13, 2017 10:44 AM, "Oscar Segarra"  wrote:
>
> Hi Brady,
>
> Thanks a lot for your comments.
>
> I can't think of a reason to use raid 5 and ceph together, even in a vdi
> instance. You're going to want throughput for this use case. What you can
> do is set the affinity of those osd processes to cores not in use by the
> VMs. I do think it will need to be more than 1 core. It is recommended that
> you dedicate 1 core per osd, but you could maybe get away with collocating
> the processes. You'd just have to experiment.
> What we really need to help you is more information.
> --> If my host has 32 cores and 8 disks (8 OSDs) and I have to pin each
> OSD process to a core, I will have just 24 cores left for all my host and Windows
> guest load.
>
>
> What hardware are you planning to use?
> --> I'm planning to use a standard server such as a ProLiant. In my
> configuration, each ProLiant will be both compute for 100 VDIs and a storage node.
> Each ProLiant will have 32 cores, 384 GB RAM and a RAID1 for the OS.
>
>
> This is a departure from what I've seen people do here. I agree that 100
> VMs on 24 cores would be potentially over consolidating. But, when it comes
> to your storage, you probably don't want to lose the data and shouldn't
> skimp. Could you lower VMs per host to 75-80?
> Also, I notice you have no ssd storage. Are these VMs expected to be
> performant at all? 100 VMs accessing 8 spinners could cause some serious
> latency.
>
>
>
> How many osd nodes do you plan to deploy?
> --> It depends on the number of VDIs to deploy. If the customer wants to
> deploy 100 VDIs, then 2 OSD nodes will be deployed.
>
>
> Minimum cluster size should be 3 because you are making 3 replicas with
> min_size 2. If you lose 1 host in a cluster of 2, you will likely lose
> access to data because 2 replicas existed on the host that went down. You
> will have a bad time if you run a cluster with 2 replicas.
>
>
> What will the network look like?
> --> I'm planning to use 10 GbE. I don't know whether 1 GbE would be enough.
>
>
> For the sake of latency alone, you want 10 Gbps SFP+.
>
>
> Are you sure Ceph is the right solution for you?
> --> Yes, I have tested others such as Gluster, but Ceph looks like the one
> that fits my solution best.
>
> Have you read and do you understand the architecture docs for Ceph?
> --> Absolutely.
>
> Thanks a lot!
>
>
>
> 2017-11-13 17:27 GMT+01:00 Brady Deetz :
>
>> I can't think of a reason to use raid 5 and ceph together, even in a vdi
>> instance. You're going to want throughput for this use case. What you can
>> do is set the 

Re: [ceph-users] HW Raid vs. Multiple OSD

2017-11-13 Thread Lionel Bouton
On 13/11/2017 at 15:47, Oscar Segarra wrote:
> Thanks Mark, Peter, 
>
> For clarification, the configuration with RAID5 is having many servers
> (2 or more) with RAID5 and CEPH on top of it. Ceph will replicate data
> between servers. Of course, each server will have just one OSD daemon
> managing a big disk.
>
> It looks like it is functionally the same to use RAID5 + 1 Ceph daemon as 8
> Ceph daemons.

Functionally it's the same but RAID5 will kill your write performance.

For example, if you start with 3 OSD hosts and a pool size of 3, then due to
RAID5 each and every write on your Ceph cluster will imply a read, on one
server, of every disk but one, followed by a write on *all* the disks of the
cluster.

If you use one OSD per disk you'll have a read on one disk only and a
write on 3 disks only: you'll get approximately 8 times the IOPS for
writes (with 8 disks per server). Clever RAID5 logic can minimize this
for some I/O patterns but it is a bet and will never be as good as what
you'll get with one disk per OSD.
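
A rough way to see the gap is to count low-level disk operations per client write. The sketch below uses a simplified read-modify-write parity model rather than the full-stripe pattern described above, so treat the exact figures as assumptions; the point is the ratio:

    # Simplified disk-op count per small client write, pool size = 3 (illustrative).
    replicas = 3

    # One OSD per disk: each replica is a single write to a single disk.
    per_disk_ops = replicas * 1

    # One OSD on an 8-disk RAID5: a small write typically costs
    # read old data + read old parity + write data + write parity = 4 disk ops.
    raid5_ops = replicas * 4

    print(f"per-disk OSDs: {per_disk_ops} disk ops per client write")
    print(f"RAID5 OSDs   : {raid5_ops} disk ops per client write")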

Best regards,

Lionel


Re: [ceph-users] HW Raid vs. Multiple OSD

2017-11-13 Thread Oscar Segarra
Thanks Mark, Peter,

For clarification, the RAID5 configuration means having several servers (2
or more) with RAID5 and Ceph on top of it. Ceph will replicate data between
servers. Of course, each server will have just one OSD daemon managing a
big disk.

It looks like it is functionally the same to use RAID5 + 1 Ceph daemon as 8 Ceph
daemons.

I appreciate a lot your comments!

Oscar Segarra



2017-11-13 15:37 GMT+01:00 Marc Roos <m.r...@f1-outsourcing.eu>:

>
> Keep in mind also whether you want to have failover in the future. We were
> running a 2nd server and replicating the RAID arrays via DRBD.
> Expanding that storage is quite a hassle compared to just adding a few
> OSDs.
>
>
>
> -Original Message-
> From: Oscar Segarra [mailto:oscar.sega...@gmail.com]
> Sent: Monday, 13 November 2017 15:26
> To: Peter Maloney
> Cc: ceph-users
> Subject: Re: [ceph-users] HW Raid vs. Multiple OSD
>
> Hi Peter,
>
> Thanks a lot for your consideration in terms of storage consumption.
>
> The other question is considering having one OSD vs 8 OSDs... 8 OSDs
> will consume more CPU than 1 OSD (RAID5)?
>
> As I want to share compute and osd in the same box, resources consumed
> by OSD can be a handicap.
>
> Thanks a lot.
>
> 2017-11-13 12:59 GMT+01:00 Peter Maloney
> <peter.malo...@brockmann-consult.de>:
>
>
> Once you've replaced an OSD, you'll see it is quite simple... doing
> it for a few is not much more work (you've scripted it, right?). I don't
> see RAID as giving any benefit here at all. It's not tricky...it's
> perfectly normal operation. Just get used to ceph, and it'll be as
> normal as replacing a RAID disk. And for performance degradation, maybe
> it could be better on either... or better on ceph if you don't mind
> setting the rate to the lowest... but when the QoS functionality is
> ready, probably ceph will be much better. Also RAID will cost you more
> for hardware.
>
> And raid5 is really bad for IOPS. And ceph already replicates, so
> you will have 2 layers of redundancy... and ceph does it cluster wide,
> not just one machine. Using ceph with replication is like all your free
> space as hot spares... you could lose 2 disks on all your machines, and
> it can still run (assuming it had time to recover in between, and enough
> space). And you don't want min_size=1, and if you have 2 layers of
> redundancy, you'll be tempted to do that probably.
>
> But for some workloads, like RBD, ceph doesn't balance out the
> workload very evenly for a specific client, only many clients at once...
> raid might help solve that, but I don't see it as worth it.
>
> I would just software RAID1 the OS and mons, and mds, not the OSDs.
>
>
> On 11/13/17 12:26, Oscar Segarra wrote:
>
>
> Hi,
>
> I'm designing my infraestructure. I want to provide 8TB (8
> disks x 1TB each) of data per host just for Microsoft Windows 10 VDI. In
> each host I will have storage (ceph osd) and compute (on kvm).
>
> I'd like to hear your opinion about theese two
> configurations:
>
> 1.- RAID5 with 8 disks (I will have 7TB but for me it is
> enough) + 1 OSD daemon
> 2.- 8 OSD daemons
>
> I'm a little bit worried that 8 osd daemons can affect
> performance because all jobs running and scrubbing.
>
> Another question is the procedure of a replacement of a
> failed
> disk. In case of a big RAID, replacement is direct. In case of many
> OSDs, the procedure is a little bit tricky.
>
>
> http://ceph.com/geen-categorie/admin-guide-replacing-a-failed-disk-in-a-ceph-cluster/
>
>
> What is your advice?
>
> Thanks a lot everybody in advance...
>
>
>
>
>
>
>
> --
>
> 
> Peter Maloney
> Brockmann Consult
> Max-Planck-Str. 2
> 21502 Geesthacht
> Germany
> Tel: +49 4152 889 300
> Fax: +49 4152 889 333
> E-mail: peter.malo...@brockmann-consult.de
> Internet: http://www.brockmann-consult.de
> 
>
>
>
>


Re: [ceph-users] HW Raid vs. Multiple OSD

2017-11-13 Thread Marc Roos
 
Keep in mind also whether you want to have failover in the future. We were
running a 2nd server and replicating the RAID arrays via DRBD.
Expanding that storage is quite a hassle compared to just adding a few
OSDs.



-Original Message-
From: Oscar Segarra [mailto:oscar.sega...@gmail.com] 
Sent: Monday, 13 November 2017 15:26
To: Peter Maloney
Cc: ceph-users
Subject: Re: [ceph-users] HW Raid vs. Multiple OSD

Hi Peter, 

Thanks a lot for your consideration in terms of storage consumption. 

The other question is about having one OSD vs 8 OSDs... will 8 OSDs
consume more CPU than 1 OSD (RAID5)?

As I want to share compute and osd in the same box, resources consumed 
by OSD can be a handicap.

Thanks a lot.

2017-11-13 12:59 GMT+01:00 Peter Maloney 
<peter.malo...@brockmann-consult.de>:


Once you've replaced an OSD, you'll see it is quite simple... doing 
it for a few is not much more work (you've scripted it, right?). I don't 
see RAID as giving any benefit here at all. It's not tricky...it's 
perfectly normal operation. Just get used to ceph, and it'll be as 
normal as replacing a RAID disk. And for performance degradation, maybe 
it could be better on either... or better on ceph if you don't mind 
setting the rate to the lowest... but when the QoS functionality is 
ready, probably ceph will be much better. Also RAID will cost you more 
for hardware.

And raid5 is really bad for IOPS. And ceph already replicates, so 
you will have 2 layers of redundancy... and ceph does it cluster wide, 
not just one machine. Using ceph with replication is like all your free 
space as hot spares... you could lose 2 disks on all your machines, and 
it can still run (assuming it had time to recover in between, and enough 
space). And you don't want min_size=1, and if you have 2 layers of 
redundancy, you'll be tempted to do that probably.

But for some workloads, like RBD, ceph doesn't balance out the 
workload very evenly for a specific client, only many clients at once... 
raid might help solve that, but I don't see it as worth it.

I would just software RAID1 the OS and mons, and mds, not the OSDs.


On 11/13/17 12:26, Oscar Segarra wrote:


Hi,  

I'm designing my infrastructure. I want to provide 8TB (8 
disks x 1TB each) of data per host just for Microsoft Windows 10 VDI. In 
each host I will have storage (ceph osd) and compute (on kvm).

I'd like to hear your opinion about these two configurations:

1.- RAID5 with 8 disks (I will have 7TB but for me it is 
enough) + 1 OSD daemon
2.- 8 OSD daemons

I'm a little bit worried that 8 osd daemons can affect 
performance because all jobs running and scrubbing.

Another question is the procedure of a replacement of a failed 
disk. In case of a big RAID, replacement is direct. In case of many 
OSDs, the procedure is a little bit tricky.


http://ceph.com/geen-categorie/admin-guide-replacing-a-failed-disk-in-a-ceph-cluster/
 
 


What is your advice?

Thanks a lot everybody in advance...

 





-- 


Peter Maloney
Brockmann Consult
Max-Planck-Str. 2
21502 Geesthacht
Germany
Tel: +49 4152 889 300
Fax: +49 4152 889 333
E-mail: peter.malo...@brockmann-consult.de
Internet: http://www.brockmann-consult.de






Re: [ceph-users] HW Raid vs. Multiple OSD

2017-11-13 Thread Oscar Segarra
Hi Peter,

Thanks a lot for your consideration in terms of storage consumption.

The other question is about having one OSD vs 8 OSDs... will 8 OSDs
consume more CPU than 1 OSD (RAID5)?

As I want to share compute and osd in the same box, resources consumed by
OSD can be a handicap.

Thanks a lot.

2017-11-13 12:59 GMT+01:00 Peter Maloney :

> Once you've replaced an OSD, you'll see it is quite simple... doing it for
> a few is not much more work (you've scripted it, right?). I don't see RAID
> as giving any benefit here at all. It's not tricky...it's perfectly normal
> operation. Just get used to ceph, and it'll be as normal as replacing a
> RAID disk. And for performance degradation, maybe it could be better on
> either... or better on ceph if you don't mind setting the rate to the
> lowest... but when the QoS functionality is ready, probably ceph will be
> much better. Also RAID will cost you more for hardware.
>
> And raid5 is really bad for IOPS. And ceph already replicates, so you will
> have 2 layers of redundancy... and ceph does it cluster wide, not just one
> machine. Using ceph with replication is like all your free space as hot
> spares... you could lose 2 disks on all your machines, and it can still run
> (assuming it had time to recover in between, and enough space). And you
> don't want min_size=1, and if you have 2 layers of redundancy, you'll be
> tempted to do that probably.
>
> But for some workloads, like RBD, ceph doesn't balance out the workload
> very evenly for a specific client, only many clients at once... raid might
> help solve that, but I don't see it as worth it.
>
> I would just software RAID1 the OS and mons, and mds, not the OSDs.
>
>
> On 11/13/17 12:26, Oscar Segarra wrote:
>
> Hi,
>
> I'm designing my infrastructure. I want to provide 8TB (8 disks x 1TB
> each) of data per host just for Microsoft Windows 10 VDI. In each host I
> will have storage (ceph osd) and compute (on kvm).
>
> I'd like to hear your opinion about these two configurations:
>
> 1.- RAID5 with 8 disks (I will have 7TB but for me it is enough) + 1 OSD
> daemon
> 2.- 8 OSD daemons
>
> I'm a little bit worried that 8 osd daemons can affect performance because
> all jobs running and scrubbing.
>
> Another question is the procedure of a replacement of a failed disk. In
> case of a big RAID, replacement is direct. In case of many OSDs, the
> procedure is a little bit tricky.
>
> http://ceph.com/geen-categorie/admin-guide-replacing-a-failed-disk-in-a-ceph-cluster/
>
> What is your advice?
>
> Thanks a lot everybody in advance...
>
>
>
>
> --
>
> 
> Peter Maloney
> Brockmann Consult
> Max-Planck-Str. 2
> 21502 Geesthacht
> Germany
> Tel: +49 4152 889 300
> Fax: +49 4152 889 333
> E-mail: peter.malo...@brockmann-consult.de
> Internet: http://www.brockmann-consult.de
> 
>
>


Re: [ceph-users] HW Raid vs. Multiple OSD

2017-11-13 Thread Michael

Oscar Segarra wrote:

I'd like to hear your opinion about theese two configurations:

1.- RAID5 with 8 disks (I will have 7TB but for me it is enough) + 1 
OSD daemon

2.- 8 OSD daemons
You mean 1 OSD daemon on top of RAID5? I don't think I'd do that. You'll 
probably want redundancy at Ceph's level anyhow, and then what is the 
point...?
I'm a little bit worried that 8 osd daemons can affect performance 
because all jobs running and scrubbing.
If you ran RAID instead of Ceph, RAID might still perform better. But I 
don't believe anything much changes for the better if you run Ceph 
on top of RAID rather than on top of individual OSDs, unless your 
configuration is bad. I also don't generally think you have to worry 
much about whether a reasonably modern machine can handle running a few 
extra jobs.


But you could certainly do some tests on your hardware to be sure.

Another question is the procedure of a replacement of a failed disk. 
In case of a big RAID, replacement is direct. In case of many OSDs, 
the procedure is a little bit tricky.


http://ceph.com/geen-categorie/admin-guide-replacing-a-failed-disk-in-a-ceph-cluster/ 

I wasn't using Ceph in 2014, but at least in my limited experience, 
today the most important step is done when you add the new drive and 
activate an OSD on it.


You probably still want to remove the leftovers of the old failed OSD 
so they don't clutter your list, but as far as I can tell replication 
and so on will trigger *before* you remove it. (There is a configurable 
timeout for how long an OSD can be down, after which the OSD is 
essentially treated as dead already, at which point replication and 
rebalancing starts).
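
For reference, a minimal sketch of cleaning up those leftovers once the replacement is in place (the OSD id is a placeholder; this is the classic command sequence, and newer releases also offer 'ceph osd purge'):

    # Minimal sketch: remove the leftovers of a dead OSD (the id is a placeholder).
    import subprocess

    osd_id = "5"   # hypothetical failed OSD

    def ceph(*args):
        cmd = ["ceph", *args]
        print("+", " ".join(cmd))
        subprocess.check_call(cmd)

    ceph("osd", "out", osd_id)                       # stop new data from landing on it
    ceph("osd", "crush", "remove", f"osd.{osd_id}")  # drop it from the CRUSH map
    ceph("auth", "del", f"osd.{osd_id}")             # remove its cephx key
    ceph("osd", "rm", osd_id)                        # delete the OSD entry itself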



-Michael




Re: [ceph-users] HW Raid vs. Multiple OSD

2017-11-13 Thread Peter Maloney
Once you've replaced an OSD, you'll see it is quite simple... doing it
for a few is not much more work (you've scripted it, right?). I don't
see RAID as giving any benefit here at all. It's not tricky...it's
perfectly normal operation. Just get used to ceph, and it'll be as
normal as replacing a RAID disk. And for performance degradation, maybe
it could be better on either... or better on ceph if you don't mind
setting the rate to the lowest... but when the QoS functionality is
ready, probably ceph will be much better. Also RAID will cost you more
for hardware.

And raid5 is really bad for IOPS. And ceph already replicates, so you
will have 2 layers of redundancy... and ceph does it cluster wide, not
just one machine. Using ceph with replication is like all your free
space as hot spares... you could lose 2 disks on all your machines, and
it can still run (assuming it had time to recover in between, and enough
space). And you don't want min_size=1, and if you have 2 layers of
redundancy, you'll be tempted to do that probably.

But for some workloads, like RBD, ceph doesn't balance out the workload
very evenly for a specific client, only many clients at once... raid
might help solve that, but I don't see it as worth it.

I would just software RAID1 the OS and mons, and mds, not the OSDs.
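
For what it's worth, a minimal sketch of that software RAID1 for the OS disks (mdadm with hypothetical device names; nothing Ceph-specific):

    # Minimal sketch: mirror two OS disks with mdadm (device names are placeholders).
    import subprocess

    devices = ["/dev/sdy", "/dev/sdz"]   # hypothetical OS disks
    subprocess.check_call([
        "mdadm", "--create", "/dev/md0",
        "--level=1", "--raid-devices=2", *devices,
    ])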

On 11/13/17 12:26, Oscar Segarra wrote:
> Hi, 
>
> I'm designing my infrastructure. I want to provide 8TB (8 disks x 1TB
> each) of data per host just for Microsoft Windows 10 VDI. In each host
> I will have storage (ceph osd) and compute (on kvm).
>
> I'd like to hear your opinion about these two configurations:
>
> 1.- RAID5 with 8 disks (I will have 7TB but for me it is enough) + 1
> OSD daemon
> 2.- 8 OSD daemons
>
> I'm a little bit worried that 8 osd daemons can affect performance
> because all jobs running and scrubbing.
>
> Another question is the procedure of a replacement of a failed disk.
> In case of a big RAID, replacement is direct. In case of many OSDs,
> the procedure is a little bit tricky.
>
> http://ceph.com/geen-categorie/admin-guide-replacing-a-failed-disk-in-a-ceph-cluster/
>
> What is your advice?
>
> Thanks a lot everybody in advance...
>
>


-- 


Peter Maloney
Brockmann Consult
Max-Planck-Str. 2
21502 Geesthacht
Germany
Tel: +49 4152 889 300
Fax: +49 4152 889 333
E-mail: peter.malo...@brockmann-consult.de
Internet: http://www.brockmann-consult.de




[ceph-users] HW Raid vs. Multiple OSD

2017-11-13 Thread Oscar Segarra
Hi,

I'm designing my infrastructure. I want to provide 8TB (8 disks x 1TB
each) of data per host just for Microsoft Windows 10 VDI. In each host I
will have storage (ceph osd) and compute (on kvm).

I'd like to hear your opinion about these two configurations:

1.- RAID5 with 8 disks (I will have 7TB but for me it is enough) + 1 OSD
daemon
2.- 8 OSD daemons

I'm a little bit worried that 8 osd daemons can affect performance because
all jobs running and scrubbing.

Another question is the procedure of a replacement of a failed disk. In
case of a big RAID, replacement is direct. In case of many OSDs, the
procedure is a little bit tricky.

http://ceph.com/geen-categorie/admin-guide-replacing-a-failed-disk-in-a-ceph-cluster/

What is your advice?

Thanks a lot everybody in advance...