Re: [ovirt-users] Good practices

2017-08-08 Thread Moacir Ferreira
Thanks Johan, you brought "light" into my darkness! I went looking for the 
GlusterFS tiering how-to and it looks quite simple to attach an SSD as a hot 
tier. For those willing to read about it, go here: 
http://blog.gluster.org/2016/03/automated-tiering-in-gluster/


Now, I still have a question: VMs are made of very large .qcow2 files. My 
understanding is that files in Gluster are kept whole in a single brick. 
If so, I will not benefit from tiering, as a single SSD will not be big enough 
to fit all my large VM .qcow2 files. This would not be true if Gluster could 
store the "blocks" of data that compose a large file spread over several bricks. 
But if I am not wrong, this is one of the key differences between GlusterFS and 
Ceph. Can you comment?


Moacir



From: Johan Bernhardsson <jo...@kafit.se>
Sent: Tuesday, August 8, 2017 7:03 AM
To: Moacir Ferreira; Devin Acosta; users@ovirt.org
Subject: Re: [ovirt-users] Good practices


You attach the SSD as a hot tier with a gluster command. I don't think that 
gdeploy or the oVirt GUI can do it.
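
For reference, attaching an existing set of SSD bricks as a hot tier looks 
roughly like this (a sketch only; the volume name and brick paths are 
placeholders, and the exact syntax depends on the Gluster version; older 
releases use "gluster volume attach-tier" instead):

gluster volume tier vmstore attach replica 3 \
    srv1:/gluster/ssd/brick srv2:/gluster/ssd/brick srv3:/gluster/ssd/brick

# once attached, promotion/demotion activity can be checked with
gluster volume tier vmstore status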

The Gluster docs and Red Hat docs explain tiering quite well.

/Johan

On August 8, 2017 07:06:42 Moacir Ferreira <moacirferre...@hotmail.com> wrote:

Hi Devin,


Please consider that for the OS I have a RAID 1. Now, let's say I use RAID 5 to 
assemble a single disk on each server. In this case, the SSD will not make any 
difference, right? I guess that for the SSD to be usable, it should not be part 
of the RAID 5. In that case I could create a logical volume made of the RAIDed 
brick and then extend it using the SSD. I.e., using gdeploy:


[disktype]
jbod

[pv1]
action=create
devices=sdb,sdc
wipefs=yes
ignore_vg_errors=no

[vg1]
action=create
vgname=gluster_vg_jbod
pvname=sdb
ignore_vg_errors=no

[vg2]
action=extend
vgname=gluster_vg_jbod
pvname=sdc
ignore_vg_errors=no
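
Note that the snippet above only creates the PV and VG; the brick logical 
volume itself would still need an [lv] section on top of it. A rough sketch of 
what that might look like, with hypothetical names and sizes and key names 
taken from common gdeploy examples (double-check against the gdeploy docs):

[lv1]
action=create
vgname=gluster_vg_jbod
poolname=gluster_thinpool
lvtype=thinpool
size=1000GB
poolmetadatasize=16GB

[lv2]
action=create
lvname=gluster_lv_brick1
vgname=gluster_vg_jbod
poolname=gluster_thinpool
lvtype=thinlv
virtualsize=1000GB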


But will Gluster be able to auto-detect and use this SSD brick for tiering? Do 
I have to do some other configuration? Also, as the VM files (.qcow2) are 
quite big, will I benefit from tiering? Or is this approach wrong and should I 
be doing something else?


Thanks,

Moacir



From: Devin Acosta <de...@pabstatencio.com>
Sent: Monday, August 7, 2017 7:46 AM
To: Moacir Ferreira; users@ovirt.org
Subject: Re: [ovirt-users] Good practices


Moacir,

I have recently installed multiple Red Hat Virtualization hosts for several 
different companies, and have dealt with the Red Hat Support Team in depth 
about optimal configuration in regards to setting up GlusterFS most efficiently 
and I wanted to share with you what I learned.

In general the Red Hat Virtualization team frowns upon using each DISK of the 
system as just a JBOD. Sure, there is some protection by having the data 
replicated; however, the recommendation is to use RAID 6 (preferred) or RAID 5, 
or RAID 1 at the very least.

Here is the direct quote from Red Hat when I asked about RAID and Bricks:

"A typical Gluster configuration would use RAID underneath the bricks. RAID 6 
is most typical as it gives you 2 disk failure protection, but RAID 5 could be 
used too. Once you have the RAIDed bricks, you'd then apply the desired 
replication on top of that. The most popular way of doing this would be 
distributed replicated with 2x replication. In general you'll get better 
performance with larger bricks. 12 drives is often a sweet spot. Another option 
would be to create a separate tier using all SSD’s.”

In order to do SSD tiering, from my understanding, you would need 1 x NVMe drive 
in each server, or 4 x SSDs for the hot tier (it needs to be distributed-replicated 
for the hot tier if not using NVMe). So with you only having 1 SSD drive in each 
server, I'd suggest maybe looking into the NVMe option.

Since you're using only 3 servers, what I'd probably suggest is to do (2 Replicas 
+ Arbiter Node); this setup actually doesn't require the 3rd server to have big 
drives at all, as it only stores metadata about the files and not actually a 
full copy.
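
For illustration, a two-replica-plus-arbiter volume is created with the arbiter 
keyword (a sketch; host names and brick paths are placeholders):

gluster volume create vmstore replica 3 arbiter 1 \
    srv1:/gluster/brick1/vmstore srv2:/gluster/brick1/vmstore \
    srv3:/gluster/arbiter1/vmstore

Every third brick listed becomes an arbiter that stores only file names and 
metadata, which is why the third server can get away with small disks.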

Please see the attached document that was given to me by Red Hat to get more 
information on this. Hope this information helps you.


--

Devin Acosta, RHCA, RHVCA
Red Hat Certified Architect


On August 6, 2017 at 7:29:29 PM, Moacir Ferreira 
(moacirferre...@hotmail.com<mailto:moacirferre...@hotmail.com>) wrote:

I am willing to assemble an oVirt "pod", made of 3 servers, each with 2 CPU 
sockets of 12 cores, 256GB RAM, 7 x 10K HDDs, and 1 SSD. The idea is to use 
GlusterFS to provide HA for the VMs. The 3 servers have a dual 40Gb NIC and a 
dual 10Gb NIC. So my intention is to create a loop, like a server triangle, 
using the 40Gb NICs for virtualization file (VM .qcow2) access and to move VMs 
around the pod (east/west traffic), while using the 10Gb interfaces for giving 
services to the outside world (north/south traffic).


This said, my first question is: How should I depl

Re: [ovirt-users] Good practices

2017-08-08 Thread Pavel Gashev
Fernando,

I agree that RAID is not required here by common sense. The only reason to set up 
RAID is the lack of manageability in GlusterFS. So you are just buying manageability 
at the cost of extra hardware and of write performance in some scenarios. That is it.

On 08/08/2017, 16:24, "users-boun...@ovirt.org on behalf of FERNANDO FREDIANI" 
 wrote:

That's just the way RAID works, regardless of how 'super-ultra' powerful a 
hardware controller you may have. RAID 5 or 6 will never have the same write 
performance as RAID 10 or 0, for example. Writeback caches can deal with bursts 
well, but they have a limit, therefore there will always be a penalty compared 
to what else you could have.

If you have a continuous stream of data (a big VM deployment or a large 
data copy) there will be a continuous write, and that will likely fill up 
the cache, making the disks underneath the bottleneck.
That's why in some other scenarios, like ZFS, people have multiple groups 
of RAID 6 (called RAIDZ2), so it improves the write speeds for these types 
of scenarios.

In the scenario given in this thread, with just 3 servers, each with a 
RAID 6, there will be a hard limit on write performance, especially for 
streamed data, no matter how much write-back your hardware controller can do.

Also, I agree the 40Gb NICs may not be used fully and 10Gb can do the job 
well, but if they were available at the beginning, why not use them.

Fernando


On 08/08/2017 03:16, Fabrice Bacchella wrote:
>> Le 8 août 2017 à 04:08, FERNANDO FREDIANI  a 
écrit :
>> Even if you have a Hardware RAID Controller with Writeback cache you 
will have a significant performance penalty and may not fully use all the 
resources you mentioned you have.
>>
> Nope again. From my experience with HP Smart Array and write-back cache, 
writes that go into the cache are even faster than reads, which must go to the 
disks. Of course, if the writes are too fast and too big, they will overflow the 
cache. But on today's controllers there are multi-gigabyte caches; you must 
write a lot to fill them. And if you can afford a 40Gb card, you can afford a 
decent controller.
>
>
>

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Good practices

2017-08-08 Thread Moacir Ferreira
Thanks once again Johan!


What would be your approach: straight JBOD, or JBOD made of RAIDed bricks?


Moacir


From: Johan Bernhardsson <jo...@kafit.se>
Sent: Tuesday, August 8, 2017 11:24 AM
To: Moacir Ferreira; Devin Acosta; users@ovirt.org
Subject: Re: [ovirt-users] Good practices


On oVirt, Gluster uses sharding, so all large files are broken up into small 
pieces on the Gluster bricks.
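
For reference, sharding is a per-volume option; the settings behind it look 
roughly like this (a sketch with a placeholder volume name; the 512MB shard 
size is the value commonly used for oVirt/RHV volumes, not something stated in 
this thread):

gluster volume set vmstore features.shard on
gluster volume set vmstore features.shard-block-size 512MB

With sharding enabled, a large .qcow2 file is split into shard-sized pieces 
that are placed (and, with tiering, promoted or demoted) individually, so a 
single file is no longer confined to one brick or to the size of one SSD.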

/Johan

On August 8, 2017 12:19:39 Moacir Ferreira <moacirferre...@hotmail.com> wrote:

Thanks Johan, you brought "light" into my darkness! I went looking for the 
GlusterFS tiering how-to and it looks like quite simple to attach a SSD as hot 
tier. For those willing to read about it, go here: 
http://blog.gluster.org/2016/03/automated-tiering-in-gluster/


Now, I still have a question: VMs are made of very large .qcow2 files. My 
understanding is that files in Gluster are kept all together in a single brick. 
If so, I will not benefit from tiering as a single SSD will not be big enough 
to fit all my large VM .qcow2 files. This would not be true if Gluster can 
store "blocks" of data that compose a large file spread on several bricks. But 
if I am not wrong, this is one of key differences in between GlusterFS and 
Ceph. Can you comment?


Moacir



From: Johan Bernhardsson <jo...@kafit.se>
Sent: Tuesday, August 8, 2017 7:03 AM
To: Moacir Ferreira; Devin Acosta; users@ovirt.org
Subject: Re: [ovirt-users] Good practices


You attach the ssd as a hot tier with a gluster command. I don't think that 
gdeploy or ovirt gui can do it.

The gluster docs and redhat docs explains tiering quite good.

/Johan

On August 8, 2017 07:06:42 Moacir Ferreira <moacirferre...@hotmail.com> wrote:

Hi Devin,


Please consider that for the OS I have a RAID 1. Now, lets say I use RAID 5 to 
assemble a single disk on each server. In this case, the SSD will not make any 
difference, right? I guess that to be possible to use it, the SSD should not be 
part of the RAID 5. In this case I could create a logical volume made of the 
RAIDed brick and then extend it using the SSD. I.e.: Using gdeploy:


[disktype]

jbod



[pv1]

action=create

devices=sdb, sdc

wipefs=yes

ignore_vg_erros=no


[vg1]

action=create

vgname=gluster_vg_jbod

pvname=sdb

ignore_vg_erros=no


[vg2]

action=extend

vgname=gluster_vg_jbod

pvname=sdc

ignore_vg_erros=no


But will Gluster be able to auto-detect and use this SSD brick for tiering? Do 
I have to do some other configurations? Also, as the VM files (.qcow2) are 
quite big will I benefit from tiering? This is wrong and my approach should be 
other?


Thanks,

Moacir



From: Devin Acosta <de...@pabstatencio.com>
Sent: Monday, August 7, 2017 7:46 AM
To: Moacir Ferreira; users@ovirt.org
Subject: Re: [ovirt-users] Good practices


Moacir,

I have recently installed multiple Red Hat Virtualization hosts for several 
different companies, and have dealt with the Red Hat Support Team in depth 
about optimal configuration in regards to setting up GlusterFS most efficiently 
and I wanted to share with you what I learned.

In general Red Hat Virtualization team frowns upon using each DISK of the 
system as just a JBOD, sure there is some protection by having the data 
replicated, however, the recommendation is to use RAID 6 (preferred) or RAID-5, 
or at least RAID-1 at the very least.

Here is the direct quote from Red Hat when I asked about RAID and Bricks:

"A typical Gluster configuration would use RAID underneath the bricks. RAID 6 
is most typical as it gives you 2 disk failure protection, but RAID 5 could be 
used too. Once you have the RAIDed bricks, you'd then apply the desired 
replication on top of that. The most popular way of doing this would be 
distributed replicated with 2x replication. In general you'll get better 
performance with larger bricks. 12 drives is often a sweet spot. Another option 
would be to create a separate tier using all SSD’s.”

In order to SSD tiering from my understanding you would need 1 x NVMe drive in 
each server, or 4 x SSD hot tier (it needs to be distributed, replicated for 
the hot tier if not using NVME). So with you only having 1 SSD drive in each 
server, I’d suggest maybe looking into the NVME option.

Since your using only 3-servers, what I’d probably suggest is to do (2 Replicas 
+ Arbiter Node), this setup actually doesn’t require the 3rd server to have big 
drives at all as it only stores meta-data about the files and not actually a 
full copy.

Please see the attached document that was given to me by Red Hat to get more 
information on this. Hope this information helps you.


--

Devin Acosta, RHCA, RHVCA
Red Hat Certified Architect


On August 6, 2017 at 7:29:29 PM, Moacir Ferreira 
(moacirferre...@hotmail.com<mailto:moacirferre...@hotmail.com>) wrote:

I am willing to assemble a oVirt "pod", made o

Re: [ovirt-users] Good practices

2017-08-08 Thread Fabrice Bacchella

> Le 8 août 2017 à 15:24, FERNANDO FREDIANI  a écrit 
> :
> 
> That's something on the way RAID works, regardless what most 'super-ultra' 
> powerfull hardware controller you may have. RAID 5 or 6 will never have the 
> same write performance as a RAID 10 o 0 for example. Writeback caches can 
> deal with bursts well but they have a limit therefore there will always be a 
> penalty compared to what else you could have.

Hardware RAID 5/6 can have better performance with quite common hardware than 
software RAID 0. I have seen many times, even on old servers, that write latency 
(hitting the cache) was smaller than read latency, which goes directly to 
the disk. I'm not talking about 'super-ultra' powerful hardware. An HP Smart 
Array P440ar with 2 GB of flash sells for 560€, public price. Not cheap, but not 
ultra powerful.

It's now a matter of identifying the bottleneck, and how much money you can 
throw at it.
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Good practices

2017-08-08 Thread Moacir Ferreira
Fernando,


Let's see what people say... But this is what I understood Red Hat says is the 
best performance model. This is the main reason to open this discussion, because 
as far as I can see, some of you in the community do not agree.


But when I think about a "distributed file system" that can make any number of 
copies you want, it does not make sense to use a RAIDed brick; what makes 
sense is to use JBOD.


Moacir


From: fernando.fredi...@upx.com.br <fernando.fredi...@upx.com.br> on behalf of 
FERNANDO FREDIANI <fernando.fredi...@upx.com>
Sent: Tuesday, August 8, 2017 3:08 AM
To: Moacir Ferreira
Cc: Colin Coe; users@ovirt.org
Subject: Re: [ovirt-users] Good practices

Moacir, I understand that if you do this type of configuration you will be 
severely impacted on storage performance, especially for writes. Even if you 
have a hardware RAID controller with writeback cache you will have a 
significant performance penalty and may not fully use all the resources you 
mentioned you have.

Fernando

2017-08-07 10:03 GMT-03:00 Moacir Ferreira 
<moacirferre...@hotmail.com<mailto:moacirferre...@hotmail.com>>:

Hi Colin,


Take a look at Devin's response. Also, read the doc he shared, which gives some 
hints on how to deploy Gluster.


It is more like this: if you want high performance you should have the bricks 
created as RAID (5 or 6) by the server's disk controller and then assemble a 
JBOD GlusterFS on top. The attached document is Gluster specific and not for oVirt. 
But at this point I think that having the SSD will not be a plus, as when using 
the RAID controller Gluster will not be aware of the SSD. Regarding the OS, my 
idea is to have a RAID 1, made of 2 low cost HDDs, to install it on.


So far, based on the information received, I should create a single RAID 5 or 6 
on each server and then use this disk as a brick to create my Gluster cluster, 
made of 2 replicas + 1 arbiter. What is new for me is the detail that the 
arbiter does not need a lot of space as it only keeps metadata.


Thanks for your response!

Moacir


From: Colin Coe <colin@gmail.com<mailto:colin@gmail.com>>
Sent: Monday, August 7, 2017 12:41 PM

To: Moacir Ferreira
Cc: users@ovirt.org<mailto:users@ovirt.org>
Subject: Re: [ovirt-users] Good practices

Hi

I just thought that you'd do hardware RAID if you had the controller or JBOD if 
you didn't.  In hindsight, a server with 40Gbps NICs is pretty likely to have a 
hardware RAID controller.  I've never done JBOD with hardware RAID.  I think 
having a single gluster brick on hardware JBOD would be riskier than multiple 
bricks, each on a single disk, but that's not based on anything other than my 
prejudices.

I thought gluster tiering was for the most frequently accessed files, in which 
case all the VMs disks would end up in the hot tier.  However, I have been 
wrong before...

I just wanted to know where the OS was going as I didn't see it mentioned in 
the OP.  Normally, I'd have the OS on a RAID 1, but in your case that's a lot of 
wasted disk.

Honestly, I think Yaniv's answer was far better than my own and made the 
important point about having an arbiter.

Thanks

On Mon, Aug 7, 2017 at 5:56 PM, Moacir Ferreira 
<moacirferre...@hotmail.com<mailto:moacirferre...@hotmail.com>> wrote:

Hi Colin,


I am in Portugal, so sorry for this late response. It is quite confusing for 
me, please consider:

1 - What if the RAID is done by the server's disk controller, not by software?


2 - For JBOD I am just using gdeploy to deploy it. However, I am not using the 
oVirt node GUI to do this.


3 - As the VM .qcow2 files are quite big, tiering would only help if done by an 
intelligent system that uses the SSD for chunks of data, not for the entire .qcow2 
file. But I guess this is a problem everybody else has. So, do you know how 
tiering works in Gluster?


4 - I am putting the OS on the first disk. However, would you do differently?


Moacir


From: Colin Coe <colin@gmail.com<mailto:colin@gmail.com>>
Sent: Monday, August 7, 2017 4:48 AM
To: Moacir Ferreira
Cc: users@ovirt.org<mailto:users@ovirt.org>
Subject: Re: [ovirt-users] Good practices

1) RAID5 may be a performance hit-

2) I'd be inclined to do this as JBOD by creating a distributed disperse volume 
on each server.  Something like

echo gluster volume create dispersevol disperse-data 5 redundancy 2 \
$(for SERVER in a b c; do for BRICK in $(seq 1 5); do echo -e 
"server${SERVER}:/brick/brick-${SERVER}${BRICK}/brick \c"; done; done)

3) I think the above.

4) Gluster does support tiering, but IIRC you'd need the same number of SSD as 
spindle drives.  There may be another way to use the SSD as a fast cache.
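
One such option is LVM caching (dm-cache) underneath the brick, which keeps 
Gluster entirely unaware of the SSD. A rough sketch, assuming a brick LV named 
gluster_vg_jbod/lv_brick1 and the SSD at /dev/sdh (both hypothetical names):

pvcreate /dev/sdh
vgextend gluster_vg_jbod /dev/sdh
# carve a cache pool out of the SSD
lvcreate --type cache-pool -L 200G -n ssd_cache gluster_vg_jbod /dev/sdh
# attach the cache pool to the existing brick LV
lvconvert --type cache --cachepool gluster_vg_jbod/ssd_cache gluster_vg_jbod/lv_brick1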

Where are you putting the OS?

Hope I understood the question...

Thanks

On Sun, Aug 6, 2017 at 10:49 PM, Moacir Ferreira 
<moacirferre...@hotma

Re: [ovirt-users] Good practices

2017-08-08 Thread Karli Sjöberg
On tis, 2017-08-08 at 10:24 -0300, FERNANDO FREDIANI wrote:
> That's something on the way RAID works, regardless what most 
> 'super-ultra' powerfull hardware controller you may have. RAID 5 or
> 6 
> will never have the same write performance as a RAID 10 o 0 for
> example. 
> Writeback caches can deal with bursts well but they have a limit 
> therefore there will always be a penalty compared to what else you
> could 
> have.
> 
> If you have a continuous stream of data (a big VM deployment or a
> large 
> data copy) there will be a continuous write and that will likely fill
> up 
> the cache making the disks underneath the bottleneck.
> That's why on some other scenarios, like ZFS people have multiple
> groups 
> of RAID 6 (called RAIDZ2) so it improves the write speeds for these
> type 
> of scenarios.

Just pointing out that it is commonly known as RAID 60, outside of the
ZFS lingo:
https://en.wikipedia.org/wiki/Nested_RAID_levels#RAID_60

/K

> 
> In the scenario given in this thread with just 3 servers, each with
> a 
> RAID 6 there will be a bare limit on the write performance specially
> for 
> streammed data for most powerfull your hardware controller can do 
> write-back.
> 
> Also I agree the 40Gb NICs may not be used fully and 10Gb can do the
> job 
> well, but if they were available at the begining, why not use them.
> 
> Fernando
> 
> 
> On 08/08/2017 03:16, Fabrice Bacchella wrote:
> > 
> > > 
> > > Le 8 août 2017 à 04:08, FERNANDO FREDIANI a écrit :
> > > Even if you have a Hardware RAID Controller with Writeback cache
> > > you will have a significant performance penalty and may not fully
> > > use all the resources you mentioned you have.
> > > 
> > Nope again,from my experience with HP Smart Array and write back
> > cache, write, that goes in the cache, are even faster that read
> > that must goes to the disks. of course if the write are too fast
> > and to big, they will over overflow the cache. But on todays
> > controller they are multi-gigabyte cache, you must write a lot to
> > fill them. And if you can afford 40Gb card, you can afford decent
> > controller.
> > 
> > 
> > 
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Good practices

2017-08-08 Thread FERNANDO FREDIANI

Exactly Moacir, that is my point.


A proper distributed filesystem should not rely on any type of RAID, as 
it can provide its own redundancy without having to rely on any layer 
underneath (look at Ceph). Using RAID may help with management and, in certain 
scenarios, with replacing a faulty disk, but at a cost, and not a cheap one.
That's why, in terms of resource saving, if replica 3 brings the 
issues mentioned, it is much better to have a small arbiter somewhere 
instead of wasting a significant amount of disk space.



Fernando


On 08/08/2017 06:09, Moacir Ferreira wrote:


Fernando,


Let's see what people say... But this is what I understood Red Hat 
says is the best performance model. This is the main reason to open 
this discussion because as long as I can see, some of you in the 
community, do not agree.



But when I think about a "distributed file system", that can make any 
number of copies you want, it does not make sense using a RAIDed 
brick, what it makes sense is to use JBOD.



Moacir



*From:* fernando.fredi...@upx.com.br <fernando.fredi...@upx.com.br> on 
behalf of FERNANDO FREDIANI <fernando.fredi...@upx.com>

*Sent:* Tuesday, August 8, 2017 3:08 AM
*To:* Moacir Ferreira
*Cc:* Colin Coe; users@ovirt.org
*Subject:* Re: [ovirt-users] Good practices
Moacir, I understand that if you do this type of configuration you 
will be severely impacted on storage performance, specially for 
writes. Even if you have a Hardware RAID Controller with Writeback 
cache you will have a significant performance penalty and may not 
fully use all the resources you mentioned you have.


Fernando

2017-08-07 10:03 GMT-03:00 Moacir Ferreira <moacirferre...@hotmail.com 
<mailto:moacirferre...@hotmail.com>>:


Hi Colin,


Take a look on Devin's response. Also, read the doc he shared that
gives some hints on how to deploy Gluster.


It is more like that if you want high-performance you should have
the bricks created as RAID (5 or 6) by the server's disk
controller and them assemble a JBOD GlusterFS. The attached
document is Gluster specific and not for oVirt. But at this point
I think that having SSD will not be a plus as using the RAID
controller Gluster will not be aware of the SSD. Regarding the OS,
my idea is to have a RAID 1, made of 2 low cost HDDs, to install it.


So far, based on the information received I should create a single
RAID 5 or 6 on each server and then use this disk as a brick to
create my Gluster cluster, made of 2 replicas + 1 arbiter. What is
new for me is the detail that the arbiter does not need a lot of
space as it only keeps meta data.


Thanks for your response!

Moacir


*From:* Colin Coe <colin@gmail.com <mailto:colin@gmail.com>>
*Sent:* Monday, August 7, 2017 12:41 PM

*To:* Moacir Ferreira
*Cc:* users@ovirt.org <mailto:users@ovirt.org>
*Subject:* Re: [ovirt-users] Good practices
Hi

I just thought that you'd do hardware RAID if you had the
controller or JBOD if you didn't.  In hindsight, a server with
40Gbps NICs is pretty likely to have a hardware RAID controller. 
I've never done JBOD with hardware RAID.  I think having a single

gluster brick on hardware JBOD would be riskier than multiple
bricks, each on a single disk, but thats not based on anything
other than my prejudices.

I thought gluster tiering was for the most frequently accessed
files, in which case all the VMs disks would end up in the hot
tier.  However, I have been wrong before...

I just wanted to know where the OS was going as I didn't see it
mentioned in the OP.  Normally, I'd have the OS on a RAID1 but in
your case thats a lot of wasted disk.

Honestly, I think Yaniv's answer was far better than my own and
made the important point about having an arbiter.

Thanks

On Mon, Aug 7, 2017 at 5:56 PM, Moacir Ferreira
<moacirferre...@hotmail.com <mailto:moacirferre...@hotmail.com>>
wrote:

Hi Colin,


I am in Portugal, so sorry for this late response. It is quite
confusing for me, please consider:

1 - What if the RAID is done by the server's disk
controller, not by software?

2 - For JBOD I am just using gdeploy to deploy it. However, I
am not using the oVirt node GUI to do this.


3 - As the VM .qcow2 files are quite big, tiering would only
help if done by an intelligent system that uses the SSD for chunks
of data, not for the entire .qcow2 file. But I guess this is a
problem everybody else has. So, do you know how tiering works
in Gluster?


4 - I am putting the OS on the first d

Re: [ovirt-users] Good practices

2017-08-08 Thread FERNANDO FREDIANI
That's just the way RAID works, regardless of how 'super-ultra' powerful a 
hardware controller you may have. RAID 5 or 6 will never have the same write 
performance as RAID 10 or 0, for example. Writeback caches can deal with bursts 
well, but they have a limit, therefore there will always be a penalty compared 
to what else you could have.


If you have a continuous stream of data (a big VM deployment or a large 
data copy) there will be a continuous write, and that will likely fill up 
the cache, making the disks underneath the bottleneck.
That's why in some other scenarios, like ZFS, people have multiple groups 
of RAID 6 (called RAIDZ2), so it improves the write speeds for these types 
of scenarios.


In the scenario given in this thread, with just 3 servers, each with a 
RAID 6, there will be a hard limit on write performance, especially for 
streamed data, no matter how much write-back your hardware controller can do.


Also, I agree the 40Gb NICs may not be used fully and 10Gb can do the job 
well, but if they were available at the beginning, why not use them.


Fernando


On 08/08/2017 03:16, Fabrice Bacchella wrote:

Le 8 août 2017 à 04:08, FERNANDO FREDIANI  a écrit :
Even if you have a Hardware RAID Controller with Writeback cache you will have 
a significant performance penalty and may not fully use all the resources you 
mentioned you have.


Nope again. From my experience with HP Smart Array and write-back cache, writes 
that go into the cache are even faster than reads, which must go to the disks. 
Of course, if the writes are too fast and too big, they will overflow the 
cache. But on today's controllers there are multi-gigabyte caches; you must write a 
lot to fill them. And if you can afford a 40Gb card, you can afford a decent 
controller.





___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Good practices

2017-08-08 Thread Moacir Ferreira
Ok, the 40Gb NICs that I got were for free. But anyway, if you were working with 
6 HDDs + 1 SSD per server, then you get 21 disks in your cluster. As data in a 
JBOD setup is replicated over the network, traffic can be really intensive, 
especially depending on the number of replicas you choose for your needs. Also, 
when live-migrating a VM you must transfer the memory contents of the VM to another 
node (just think about moving a VM with 32GB RAM). All together, it can be a 
quite large chunk of data moving over the network all the time. While a 40Gb NIC 
is not a "must", I think it is more affordable as it costs much less than a good 
disk controller.


But my confusion is that, as said by other fellows, the best "performance 
model" is when you use a hardware RAIDed brick (i.e. RAID 5 or 6) to assemble your 
GlusterFS. In that case, as I would have to buy a good controller but would have 
less network traffic, to lower the cost I would then use a separate network made of 
10Gb NICs plus the controller.


Moacir



>
> > Le 8 août 2017 à 04:08, FERNANDO FREDIANI  a
> écrit :
>
> > Even if you have a Hardware RAID Controller with Writeback cache you
> will have a significant performance penalty and may not fully use all the
> resources you mentioned you have.
> >
>
> Nope again,from my experience with HP Smart Array and write back cache,
> write, that goes in the cache, are even faster that read that must goes to
> the disks. of course if the write are too fast and to big, they will over
> overflow the cache. But on todays controller they are multi-gigabyte cache,
> you must write a lot to fill them. And if you can afford 40Gb card, you can
> afford decent controller.
>

The last sentence raises an excellent point: balance your resources. Don't
spend a fortune on one component while another will end up being your
bottleneck.
Storage is usually the slowest link in the chain. I personally believe that
spending the money on NVMe drives makes more sense than 40Gb (except [1],
which is suspiciously cheap!)

Y.
[1] http://a.co/4hsCTqG

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Good practices

2017-08-08 Thread Johan Bernhardsson
On oVirt, Gluster uses sharding, so all large files are broken up into small 
pieces on the Gluster bricks.


/Johan


On August 8, 2017 12:19:39 Moacir Ferreira <moacirferre...@hotmail.com> wrote:

Thanks Johan, you brought "light" into my darkness! I went looking for the 
GlusterFS tiering how-to and it looks like quite simple to attach a SSD as 
hot tier. For those willing to read about it, go here: 
http://blog.gluster.org/2016/03/automated-tiering-in-gluster/



Now, I still have a question: VMs are made of very large .qcow2 files. My 
understanding is that files in Gluster are kept all together in a single 
brick. If so, I will not benefit from tiering as a single SSD will not be 
big enough to fit all my large VM .qcow2 files. This would not be true if 
Gluster can store "blocks" of data that compose a large file spread on 
several bricks. But if I am not wrong, this is one of key differences in 
between GlusterFS and Ceph. Can you comment?



Moacir



From: Johan Bernhardsson <jo...@kafit.se>
Sent: Tuesday, August 8, 2017 7:03 AM
To: Moacir Ferreira; Devin Acosta; users@ovirt.org
Subject: Re: [ovirt-users] Good practices


You attach the ssd as a hot tier with a gluster command. I don't think that 
gdeploy or ovirt gui can do it.


The gluster docs and redhat docs explains tiering quite good.

/Johan

On August 8, 2017 07:06:42 Moacir Ferreira <moacirferre...@hotmail.com> wrote:

Hi Devin,


Please consider that for the OS I have a RAID 1. Now, lets say I use RAID 5 
to assemble a single disk on each server. In this case, the SSD will not 
make any difference, right? I guess that to be possible to use it, the SSD 
should not be part of the RAID 5. In this case I could create a logical 
volume made of the RAIDed brick and then extend it using the SSD. I.e.: 
Using gdeploy:



[disktype]

jbod



[pv1]

action=create

devices=sdb, sdc

wipefs=yes

ignore_vg_erros=no


[vg1]

action=create

vgname=gluster_vg_jbod

pvname=sdb

ignore_vg_erros=no


[vg2]

action=extend

vgname=gluster_vg_jbod

pvname=sdc

ignore_vg_erros=no


But will Gluster be able to auto-detect and use this SSD brick for tiering? 
Do I have to do some other configurations? Also, as the VM files (.qcow2) 
are quite big will I benefit from tiering? This is wrong and my approach 
should be other?



Thanks,

Moacir



From: Devin Acosta <de...@pabstatencio.com>
Sent: Monday, August 7, 2017 7:46 AM
To: Moacir Ferreira; users@ovirt.org
Subject: Re: [ovirt-users] Good practices


Moacir,

I have recently installed multiple Red Hat Virtualization hosts for several 
different companies, and have dealt with the Red Hat Support Team in depth 
about optimal configuration in regards to setting up GlusterFS most 
efficiently and I wanted to share with you what I learned.


In general Red Hat Virtualization team frowns upon using each DISK of the 
system as just a JBOD, sure there is some protection by having the data 
replicated, however, the recommendation is to use RAID 6 (preferred) or 
RAID-5, or at least RAID-1 at the very least.


Here is the direct quote from Red Hat when I asked about RAID and Bricks:

"A typical Gluster configuration would use RAID underneath the bricks. RAID 
6 is most typical as it gives you 2 disk failure protection, but RAID 5 
could be used too. Once you have the RAIDed bricks, you'd then apply the 
desired replication on top of that. The most popular way of doing this 
would be distributed replicated with 2x replication. In general you'll get 
better performance with larger bricks. 12 drives is often a sweet spot. 
Another option would be to create a separate tier using all SSD’s.”


In order to SSD tiering from my understanding you would need 1 x NVMe drive 
in each server, or 4 x SSD hot tier (it needs to be distributed, replicated 
for the hot tier if not using NVME). So with you only having 1 SSD drive in 
each server, I’d suggest maybe looking into the NVME option.


Since your using only 3-servers, what I’d probably suggest is to do (2 
Replicas + Arbiter Node), this setup actually doesn’t require the 3rd 
server to have big drives at all as it only stores meta-data about the 
files and not actually a full copy.


Please see the attached document that was given to me by Red Hat to get 
more information on this. Hope this information helps you.



--

Devin Acosta, RHCA, RHVCA
Red Hat Certified Architect


On August 6, 2017 at 7:29:29 PM, Moacir Ferreira 
(moacirferre...@hotmail.com<mailto:moacirferre...@hotmail.com>) wrote:


I am willing to assemble a oVirt "pod", made of 3 servers, each with 2 CPU 
sockets of 12 cores, 256GB RAM, 7 HDD 10K, 1 SSD. The idea is to use 
GlusterFS to provide HA for the VMs. The 3 servers have a dual 40Gb NIC and 
a dual 10Gb NIC. So my intention is to create a loop like a server triangle 
using the 40Gb NICs for virtualization files (VMs .qcow2) 

Re: [ovirt-users] Good practices

2017-08-08 Thread Fabrice Bacchella

> Le 8 août 2017 à 08:50, Yaniv Kaul  a écrit :
> 

> Storage is usually the slowest link in the chain. I personally believe that 
> spending the money on NVMe drives makes more sense than 40Gb (except [1], 
> which is suspiciously cheap!)
> 
> Y.
> [1] http://a.co/4hsCTqG 

http://h20564.www2.hpe.com/hpsc/doc/public/display?docId=emr_na-c04374078

It's supported on old Gen8 servers (G10 is coming). It must be coming from an 
attic.


___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Good practices

2017-08-08 Thread Yaniv Kaul
On Tue, Aug 8, 2017 at 9:16 AM, Fabrice Bacchella <
fabrice.bacche...@orange.fr> wrote:

>
> > Le 8 août 2017 à 04:08, FERNANDO FREDIANI  a
> écrit :
>
> > Even if you have a Hardware RAID Controller with Writeback cache you
> will have a significant performance penalty and may not fully use all the
> resources you mentioned you have.
> >
>
> Nope again,from my experience with HP Smart Array and write back cache,
> write, that goes in the cache, are even faster that read that must goes to
> the disks. of course if the write are too fast and to big, they will over
> overflow the cache. But on todays controller they are multi-gigabyte cache,
> you must write a lot to fill them. And if you can afford 40Gb card, you can
> afford decent controller.
>

The last sentence raises an excellent point: balance your resources. Don't
spend a fortune on one component while another will end up being your
bottleneck.
Storage is usually the slowest link in the chain. I personally believe that
spending the money on NVMe drives makes more sense than 40Gb (except [1],
which is suspiciously cheap!)

Y.
[1] http://a.co/4hsCTqG


>
>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Good practices

2017-08-08 Thread Yaniv Kaul
On Tue, Aug 8, 2017 at 12:03 AM, FERNANDO FREDIANI <
fernando.fredi...@upx.com> wrote:

> Thanks for the detailed answer Erekle.
>
> I conclude that it is worth in any scenario to have an arbiter node in
> order to avoid wasting more disk space on RAID X + Gluster replication on
> top of it. The cost seems much lower if you consider the running costs of
> the whole storage and compare it with the cost to build the arbiter node.
> Even having a fully redundant arbiter service with 2 nodes would make it
> worth it on a larger deployment.
>


Note that although you get the same consistency as a replica 3 setup, a
2+arbiter setup gives you the data availability of a replica 2 setup. That may
or may not be OK with your high availability requirements.
Y.


> Regards
> Fernando
> On 07/08/2017 17:07, Erekle Magradze wrote:
>
> Hi Fernando (sorry for misspelling your name, I used a different keyboard),
>
> So let's go with the following scenarios:
>
> 1. Let's say you have two servers (replication factor is 2), i.e. two
> bricks per volume, in this case it is strongly recommended to have the
> arbiter node, the metadata storage that will guarantee avoiding the split
> brain situation, in this case for arbiter you don't even need a disk with
> lots of space, it's enough to have a tiny ssd but hosted on a separate
> server. Advantage of such setup is that you don't need the RAID 1 for each
> brick, you have the metadata information stored in arbiter node and brick
> replacement is easy.
>
> 2. If you have odd number of bricks (let's say 3, i.e. replication factor
> is 3) in your volume and you didn't create the arbiter node as well as you
> didn't configure the quorum, in this case the entire load for keeping the
> consistency of the volume resides on all 3 servers, each of them is
> important and each brick contains key information, they need to cross-check
> each other (that's what people usually do with the first try of gluster :)
> ), in this case replacing a brick is a big pain and in this case RAID 1 is
> a good option to have (that's the disadvantage, i.e. losing the space and
> not having the JBOD option); the advantage is that you don't have to have an
> additional arbiter node.
>
> 3. You have odd number of bricks and configured arbiter node, in this case
> you can easily go with JBOD, however a good practice would be to have a
> RAID 1 for arbiter disks (tiny 128GB SSDs are perfectly sufficient for
> volumes with 10s of TB-s in size.)
>
> That's basically it
>
> The rest about the reliability and setup scenarios you can find in gluster
> documentation, especially look for quorum and arbiter node configs+options.
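
(For reference, the quorum options mentioned above are ordinary volume options; 
a sketch with a placeholder volume name:

gluster volume set vmstore cluster.quorum-type auto
gluster volume set vmstore cluster.server-quorum-type server
gluster volume set all cluster.server-quorum-ratio 51%

The first enforces client-side write quorum within each replica set; the other 
two make glusterd stop the bricks when too few servers are reachable.)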
>
> Cheers
>
> Erekle
> P.S. What I was mentioning, regarding a good practice is mostly related to
> the operations of gluster not installation or deployment, i.e. not the
> conceptual understanding of gluster (conceptually it's a JBOD system).
>
> On 08/07/2017 05:41 PM, FERNANDO FREDIANI wrote:
>
> Thanks for the clarification Erekle.
>
> However, I am surprised by this way of operating GlusterFS, as it
> adds another layer of complexity to the system (either a hardware or
> software RAID) before the gluster config and increases the system's overall
> costs.
>
> An important point to consider is: In RAID configuration you already have
> space 'wasted' in order to build redundancy (either RAID 1, 5, or 6). Then
> when you have GlusterFS on the top of several RAIDs you have again more
> data replicated so you end up with the same data consuming more space in a
> group of disks and again on the top of several RAIDs depending on the
> Gluster configuration you have (in a RAID 1 config the same data is
> replicated 4 times).
>
> Yet another downside of having a RAID (specially RAID 5 or 6) is that it
> reduces considerably the write speeds as each group of disks will end up
> having the write speed of a single disk as all other disks of that group
> have to wait for each other to write as well.
>
> Therefore if Gluster already replicates data why does it create this big
> pain you mentioned if the data is replicated somewhere else, can still be
> retrieved to both serve clients and reconstruct the equivalent disk when it
> is replaced ?
>
> Fernando
>
> On 07/08/2017 10:26, Erekle Magradze wrote:
>
> Hi Fernando,
>
> Here is my experience: if you consider a particular hard drive as a brick
> for a gluster volume and it dies, i.e. it becomes not accessible, it's a huge
> hassle to discard that brick and exchange it with another one, since gluster
> still tries to access that broken brick and it's causing (at least it caused
> for me) a big pain. Therefore it's better to have a RAID as the brick, i.e.
> have RAID 1 (mirroring) for each brick; in this case if a disk is down
> you can easily exchange it and rebuild the RAID without going offline, i.e.
> switching off the volume, doing brick manipulations and switching it back on.
>
> Cheers
>
> Erekle
>
> On 08/07/2017 03:04 PM, FERNANDO FREDIANI wrote:
>
> For any RAID 5 or 6 

Re: [ovirt-users] Good practices

2017-08-08 Thread Fabrice Bacchella

> Le 8 août 2017 à 04:08, FERNANDO FREDIANI  a écrit 
> :

> Even if you have a Hardware RAID Controller with Writeback cache you will 
> have a significant performance penalty and may not fully use all the 
> resources you mentioned you have.
> 

Nope again. From my experience with HP Smart Array and write-back cache, writes 
that go into the cache are even faster than reads, which must go to the disks. 
Of course, if the writes are too fast and too big, they will overflow the 
cache. But on today's controllers there are multi-gigabyte caches; you must write a 
lot to fill them. And if you can afford a 40Gb card, you can afford a decent 
controller.



___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Good practices

2017-08-08 Thread Johan Bernhardsson
You attach the SSD as a hot tier with a gluster command. I don't think that 
gdeploy or the oVirt GUI can do it.


The Gluster docs and Red Hat docs explain tiering quite well.

/Johan


On August 8, 2017 07:06:42 Moacir Ferreira <moacirferre...@hotmail.com> wrote:


Hi Devin,


Please consider that for the OS I have a RAID 1. Now, lets say I use RAID 5 
to assemble a single disk on each server. In this case, the SSD will not 
make any difference, right? I guess that to be possible to use it, the SSD 
should not be part of the RAID 5. In this case I could create a logical 
volume made of the RAIDed brick and then extend it using the SSD. I.e.: 
Using gdeploy:



[disktype]

jbod



[pv1]

action=create

devices=sdb, sdc

wipefs=yes

ignore_vg_erros=no


[vg1]

action=create

vgname=gluster_vg_jbod

pvname=sdb

ignore_vg_erros=no


[vg2]

action=extend

vgname=gluster_vg_jbod

pvname=sdc

ignore_vg_erros=no


But will Gluster be able to auto-detect and use this SSD brick for tiering? 
Do I have to do some other configurations? Also, as the VM files (.qcow2) 
are quite big will I benefit from tiering? This is wrong and my approach 
should be other?



Thanks,

Moacir



From: Devin Acosta <de...@pabstatencio.com>
Sent: Monday, August 7, 2017 7:46 AM
To: Moacir Ferreira; users@ovirt.org
Subject: Re: [ovirt-users] Good practices


Moacir,

I have recently installed multiple Red Hat Virtualization hosts for several 
different companies, and have dealt with the Red Hat Support Team in depth 
about optimal configuration in regards to setting up GlusterFS most 
efficiently and I wanted to share with you what I learned.


In general Red Hat Virtualization team frowns upon using each DISK of the 
system as just a JBOD, sure there is some protection by having the data 
replicated, however, the recommendation is to use RAID 6 (preferred) or 
RAID-5, or at least RAID-1 at the very least.


Here is the direct quote from Red Hat when I asked about RAID and Bricks:

"A typical Gluster configuration would use RAID underneath the bricks. RAID 
6 is most typical as it gives you 2 disk failure protection, but RAID 5 
could be used too. Once you have the RAIDed bricks, you'd then apply the 
desired replication on top of that. The most popular way of doing this 
would be distributed replicated with 2x replication. In general you'll get 
better performance with larger bricks. 12 drives is often a sweet spot. 
Another option would be to create a separate tier using all SSD’s.”


In order to SSD tiering from my understanding you would need 1 x NVMe drive 
in each server, or 4 x SSD hot tier (it needs to be distributed, replicated 
for the hot tier if not using NVME). So with you only having 1 SSD drive in 
each server, I’d suggest maybe looking into the NVME option.


Since your using only 3-servers, what I’d probably suggest is to do (2 
Replicas + Arbiter Node), this setup actually doesn’t require the 3rd 
server to have big drives at all as it only stores meta-data about the 
files and not actually a full copy.


Please see the attached document that was given to me by Red Hat to get 
more information on this. Hope this information helps you.



--

Devin Acosta, RHCA, RHVCA
Red Hat Certified Architect


On August 6, 2017 at 7:29:29 PM, Moacir Ferreira 
(moacirferre...@hotmail.com<mailto:moacirferre...@hotmail.com>) wrote:


I am willing to assemble a oVirt "pod", made of 3 servers, each with 2 CPU 
sockets of 12 cores, 256GB RAM, 7 HDD 10K, 1 SSD. The idea is to use 
GlusterFS to provide HA for the VMs. The 3 servers have a dual 40Gb NIC and 
a dual 10Gb NIC. So my intention is to create a loop like a server triangle 
using the 40Gb NICs for virtualization files (VMs .qcow2) access and to 
move VMs around the pod (east /west traffic) while using the 10Gb 
interfaces for giving services to the outside world (north/south traffic).



This said, my first question is: How should I deploy GlusterFS in such 
oVirt scenario? My questions are:



1 - Should I create 3 RAID (i.e.: RAID 5), one on each oVirt node, and then 
create a GlusterFS using them?


2 - Instead, should I create a JBOD array made of all server's disks?

3 - What is the best Gluster configuration to provide for HA while not 
consuming too much disk space?


4 - Does a oVirt hypervisor pod like I am planning to build, and the 
virtualization environment, benefits from tiering when using a SSD disk? 
And yes, will Gluster do it by default or I have to configure it to do so?



At the bottom line, what is the good practice for using GlusterFS in small 
pods for enterprises?



You opinion/feedback will be really appreciated!

Moacir

___
Users mailing list
Users@ovirt.org<mailto:Users@ovirt.org>
http://lists.ovirt.org/mailman/listinfo/users



--
___
Users mailing list
Users@ovirt.org
http:/

Re: [ovirt-users] Good practices

2017-08-07 Thread Moacir Ferreira
Hi Devin,


Please consider that for the OS I have a RAID 1. Now, let's say I use RAID 5 to 
assemble a single disk on each server. In this case, the SSD will not make any 
difference, right? I guess that for the SSD to be usable, it should not be part 
of the RAID 5. In that case I could create a logical volume made of the RAIDed 
brick and then extend it using the SSD. I.e., using gdeploy:


[disktype]
jbod

[pv1]
action=create
devices=sdb,sdc
wipefs=yes
ignore_vg_errors=no

[vg1]
action=create
vgname=gluster_vg_jbod
pvname=sdb
ignore_vg_errors=no

[vg2]
action=extend
vgname=gluster_vg_jbod
pvname=sdc
ignore_vg_errors=no


But will Gluster be able to auto-detect and use this SSD brick for tiering? Do 
I have to do some other configuration? Also, as the VM files (.qcow2) are 
quite big, will I benefit from tiering? Or is this approach wrong and should I 
be doing something else?


Thanks,

Moacir



From: Devin Acosta <de...@pabstatencio.com>
Sent: Monday, August 7, 2017 7:46 AM
To: Moacir Ferreira; users@ovirt.org
Subject: Re: [ovirt-users] Good practices


Moacir,

I have recently installed multiple Red Hat Virtualization hosts for several 
different companies, and have dealt with the Red Hat Support Team in depth 
about optimal configuration in regards to setting up GlusterFS most efficiently 
and I wanted to share with you what I learned.

In general Red Hat Virtualization team frowns upon using each DISK of the 
system as just a JBOD, sure there is some protection by having the data 
replicated, however, the recommendation is to use RAID 6 (preferred) or RAID-5, 
or at least RAID-1 at the very least.

Here is the direct quote from Red Hat when I asked about RAID and Bricks:

"A typical Gluster configuration would use RAID underneath the bricks. RAID 6 
is most typical as it gives you 2 disk failure protection, but RAID 5 could be 
used too. Once you have the RAIDed bricks, you'd then apply the desired 
replication on top of that. The most popular way of doing this would be 
distributed replicated with 2x replication. In general you'll get better 
performance with larger bricks. 12 drives is often a sweet spot. Another option 
would be to create a separate tier using all SSD’s.”

In order to SSD tiering from my understanding you would need 1 x NVMe drive in 
each server, or 4 x SSD hot tier (it needs to be distributed, replicated for 
the hot tier if not using NVME). So with you only having 1 SSD drive in each 
server, I’d suggest maybe looking into the NVME option.

Since your using only 3-servers, what I’d probably suggest is to do (2 Replicas 
+ Arbiter Node), this setup actually doesn’t require the 3rd server to have big 
drives at all as it only stores meta-data about the files and not actually a 
full copy.

Please see the attached document that was given to me by Red Hat to get more 
information on this. Hope this information helps you.


--

Devin Acosta, RHCA, RHVCA
Red Hat Certified Architect


On August 6, 2017 at 7:29:29 PM, Moacir Ferreira 
(moacirferre...@hotmail.com<mailto:moacirferre...@hotmail.com>) wrote:

I am willing to assemble a oVirt "pod", made of 3 servers, each with 2 CPU 
sockets of 12 cores, 256GB RAM, 7 HDD 10K, 1 SSD. The idea is to use GlusterFS 
to provide HA for the VMs. The 3 servers have a dual 40Gb NIC and a dual 10Gb 
NIC. So my intention is to create a loop like a server triangle using the 40Gb 
NICs for virtualization files (VMs .qcow2) access and to move VMs around the 
pod (east /west traffic) while using the 10Gb interfaces for giving services to 
the outside world (north/south traffic).


This said, my first question is: How should I deploy GlusterFS in such oVirt 
scenario? My questions are:


1 - Should I create 3 RAID (i.e.: RAID 5), one on each oVirt node, and then 
create a GlusterFS using them?

2 - Instead, should I create a JBOD array made of all server's disks?

3 - What is the best Gluster configuration to provide for HA while not 
consuming too much disk space?

4 - Does a oVirt hypervisor pod like I am planning to build, and the 
virtualization environment, benefits from tiering when using a SSD disk? And 
yes, will Gluster do it by default or I have to configure it to do so?


At the bottom line, what is the good practice for using GlusterFS in small pods 
for enterprises?


You opinion/feedback will be really appreciated!

Moacir

___
Users mailing list
Users@ovirt.org<mailto:Users@ovirt.org>
http://lists.ovirt.org/mailman/listinfo/users
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Good practices

2017-08-07 Thread FERNANDO FREDIANI
Moacir, I understand that if you do this type of configuration you will be
severely impacted on storage performance, especially for writes. Even if you
have a hardware RAID controller with writeback cache you will have a
significant performance penalty and may not fully use all the resources you
mentioned you have.

Fernando

2017-08-07 10:03 GMT-03:00 Moacir Ferreira <moacirferre...@hotmail.com>:

> Hi Colin,
>
>
> Take a look on Devin's response. Also, read the doc he shared that gives
> some hints on how to deploy Gluster.
>
>
> It is more like that if you want high-performance you should have the
> bricks created as RAID (5 or 6) by the server's disk controller and them
> assemble a JBOD GlusterFS. The attached document is Gluster specific and
> not for oVirt. But at this point I think that having SSD will not be a plus
> as using the RAID controller Gluster will not be aware of the SSD.
> Regarding the OS, my idea is to have a RAID 1, made of 2 low cost HDDs, to
> install it.
>
>
> So far, based on the information received I should create a single RAID 5
> or 6 on each server and then use this disk as a brick to create my Gluster
> cluster, made of 2 replicas + 1 arbiter. What is new for me is the detail
> that the arbiter does not need a lot of space as it only keeps meta data.
>
>
> Thanks for your response!
> Moacir
>
> --
> *From:* Colin Coe <colin@gmail.com>
> *Sent:* Monday, August 7, 2017 12:41 PM
>
> *To:* Moacir Ferreira
> *Cc:* users@ovirt.org
> *Subject:* Re: [ovirt-users] Good practices
>
> Hi
>
> I just thought that you'd do hardware RAID if you had the controller or
> JBOD if you didn't.  In hindsight, a server with 40Gbps NICs is pretty
> likely to have a hardware RAID controller.  I've never done JBOD with
> hardware RAID.  I think having a single gluster brick on hardware JBOD
> would be riskier than multiple bricks, each on a single disk, but thats not
> based on anything other than my prejudices.
>
> I thought gluster tiering was for the most frequently accessed files, in
> which case all the VMs disks would end up in the hot tier.  However, I have
> been wrong before...
>
> I just wanted to know where the OS was going as I didn't see it mentioned
> in the OP.  Normally, I'd have the OS on a RAID1 but in your case thats a
> lot of wasted disk.
>
> Honestly, I think Yaniv's answer was far better than my own and made the
> important point about having an arbiter.
>
> Thanks
>
> On Mon, Aug 7, 2017 at 5:56 PM, Moacir Ferreira <
> moacirferre...@hotmail.com> wrote:
>
>> Hi Colin,
>>
>>
>> I am in Portugal, so sorry for this late response. It is quite confusing
>> for me, please consider:
>>
>>
>> 1 - What if the RAID is done by the server's disk controller, not by
>> software?
>>
>> 2 - For JBOD I am just using gdeploy to deploy it. However, I am not
>> using the oVirt node GUI to do this.
>>
>>
>> 3 - As the VM .qcow2 files are quite big, tiering would only help if
>> made by an intelligent system that uses SSD for chunks of data not for the
>> entire .qcow2 file. But I guess this is a problem everybody else has. So,
>> Do you know how tiering works in Gluster?
>>
>>
>> 4 - I am putting the OS on the first disk. However, would you do
>> differently?
>>
>>
>> Moacir
>>
>> --
>> *From:* Colin Coe <colin@gmail.com>
>> *Sent:* Monday, August 7, 2017 4:48 AM
>> *To:* Moacir Ferreira
>> *Cc:* users@ovirt.org
>> *Subject:* Re: [ovirt-users] Good practices
>>
>> 1) RAID5 may be a performance hit-
>>
>> 2) I'd be inclined to do this as JBOD by creating a distributed disperse
>> volume on each server.  Something like
>>
>> echo gluster volume create dispersevol disperse-data 5 redundancy 2 \
>> $(for SERVER in a b c; do for BRICK in $(seq 1 5); do echo -e
>> "server${SERVER}:/brick/brick-${SERVER}${BRICK}/brick \c"; done; done)
>>
>> 3) I think the above.
>>
>> 4) Gluster does support tiering, but IIRC you'd need the same number of
>> SSD as spindle drives.  There may be another way to use the SSD as a fast
>> cache.
>>
>> Where are you putting the OS?
>>
>> Hope I understood the question...
>>
>> Thanks
>>
>> On Sun, Aug 6, 2017 at 10:49 PM, Moacir Ferreira <
>> moacirferre...@hotmail.com> wrote:
>>
>>> I am willing to assemble a oVirt "pod", made of 3 servers, each with 2
>>> CPU sockets of 12 cores, 256GB RAM, 7 HDD 10K, 1 SSD. The idea is to use

Re: [ovirt-users] Good practices

2017-08-07 Thread Moacir Ferreira
Hi Colin,


Take a look at Devin's response. Also, read the doc he shared, which gives some 
hints on how to deploy Gluster.


It is more like this: if you want high performance you should have the bricks 
created as RAID (5 or 6) by the server's disk controller and then assemble a 
JBOD GlusterFS on top. The attached document is Gluster specific and not for oVirt. 
But at this point I think that having the SSD will not be a plus, as when using 
the RAID controller Gluster will not be aware of the SSD. Regarding the OS, my 
idea is to have a RAID 1, made of 2 low cost HDDs, to install it on.


So far, based on the information received, I should create a single RAID 5 or 6 
on each server and then use this disk as a brick to create my Gluster cluster, 
made of 2 replicas + 1 arbiter. What is new for me is the detail that the 
arbiter does not need a lot of space as it only keeps metadata.


Thanks for your response!

Moacir


From: Colin Coe <colin@gmail.com>
Sent: Monday, August 7, 2017 12:41 PM
To: Moacir Ferreira
Cc: users@ovirt.org
Subject: Re: [ovirt-users] Good practices

Hi

I just thought that you'd do hardware RAID if you had the controller, or JBOD if
you didn't.  In hindsight, a server with 40Gbps NICs is pretty likely to have a
hardware RAID controller.  I've never done JBOD with hardware RAID.  I think
having a single gluster brick on hardware JBOD would be riskier than multiple
bricks, each on a single disk, but that's not based on anything other than my
prejudices.

I thought gluster tiering was for the most frequently accessed files, in which
case all the VMs' disks would end up in the hot tier.  However, I have been
wrong before...

I just wanted to know where the OS was going as I didn't see it mentioned in
the OP.  Normally, I'd have the OS on a RAID 1, but in your case that's a lot of
wasted disk.

Honestly, I think Yaniv's answer was far better than my own and made the 
important point about having an arbiter.

Thanks

On Mon, Aug 7, 2017 at 5:56 PM, Moacir Ferreira 
<moacirferre...@hotmail.com<mailto:moacirferre...@hotmail.com>> wrote:

Hi Colin,


I am in Portugal, so sorry for this late response. It is quite confusing for 
me, please consider:

1 - What if the RAID is done by the server's disk controller, not by software?


2 - For JBOD I am just using gdeploy to deploy it. However, I am not using the 
oVirt node GUI to do this.


3 - As the VM .qcow2 files are quite big, tiering would only help if made by an
intelligent system that uses the SSD for chunks of data, not for the entire
.qcow2 file. But I guess this is a problem everybody else has. So, do you know
how tiering works in Gluster? (See the sketch after question 4 below.)


4 - I am putting the OS on the first disk. However, would you do differently?
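
Regarding question 3 above, purely as a hedged sketch (tiering syntax changed
between Gluster releases, and the volume/brick names here are hypothetical),
attaching and later detaching an SSD hot tier looks roughly like:

gluster volume tier vmstore attach replica 3 \
  host1:/gluster/ssd1/vmstore \
  host2:/gluster/ssd1/vmstore \
  host3:/gluster/ssd1/vmstore
gluster volume tier vmstore detach start
gluster volume tier vmstore detach commit

Promotion and demotion happen per file unless sharding is enabled, which is why
large .qcow2 files are a poor fit for tiering without sharding.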


Moacir


From: Colin Coe <colin@gmail.com<mailto:colin@gmail.com>>
Sent: Monday, August 7, 2017 4:48 AM
To: Moacir Ferreira
Cc: users@ovirt.org<mailto:users@ovirt.org>
Subject: Re: [ovirt-users] Good practices

1) RAID5 may be a performance hit.

2) I'd be inclined to do this as JBOD by creating a distributed disperse volume 
on each server.  Something like

echo gluster volume create dispersevol disperse-data 5 redundancy 2 \
$(for SERVER in a b c; do for BRICK in $(seq 1 5); do echo -e 
"server${SERVER}:/brick/brick-${SERVER}${BRICK}/brick \c"; done; done)

3) I think the above.

4) Gluster does support tiering, but IIRC you'd need the same number of SSD as 
spindle drives.  There may be another way to use the SSD as a fast cache.

Where are you putting the OS?

Hope I understood the question...

Thanks

On Sun, Aug 6, 2017 at 10:49 PM, Moacir Ferreira 
<moacirferre...@hotmail.com<mailto:moacirferre...@hotmail.com>> wrote:

I am willing to assemble an oVirt "pod", made of 3 servers, each with 2 CPU
sockets of 12 cores, 256GB RAM, 7 x 10K HDDs, and 1 SSD. The idea is to use
GlusterFS to provide HA for the VMs. The 3 servers have a dual 40Gb NIC and a
dual 10Gb NIC. So my intention is to create a loop, like a server triangle, using
the 40Gb NICs for virtualization file (VM .qcow2) access and to move VMs around
the pod (east/west traffic), while using the 10Gb interfaces for giving services
to the outside world (north/south traffic).


This said, my first question is: How should I deploy GlusterFS in such oVirt 
scenario? My questions are:


1 - Should I create 3 RAID (i.e.: RAID 5), one on each oVirt node, and then 
create a GlusterFS using them?

2 - Instead, should I create a JBOD array made of all server's disks?

3 - What is the best Gluster configuration to provide for HA while not 
consuming too much disk space?

4 - Does an oVirt hypervisor pod like I am planning to build, and the
virtualization environment, benefit from tiering when using an SSD disk? And if
yes, will Gluster do it by default or do I have to configure it to do so?


At the bottom line, what is the good practice for us

Re: [ovirt-users] Good practices

2017-08-07 Thread Erekle Magradze

Hi Fernando,

Indeed, having an arbiter node is always a good idea, and it saves a lot of
cost.


Good luck with your setup.

Cheers

Erekle


On 07.08.2017 23:03, FERNANDO FREDIANI wrote:


Thanks for the detailed answer Erekle.

I conclude that it is worth it in any scenario to have an arbiter node in
order to avoid wasting more disk space on RAID X + Gluster replication on
top of it. The cost seems much lower if you consider the running costs of
the whole storage and compare them with the cost to build the arbiter node.
Even having a fully redundant arbiter service with 2 nodes would make it
worth it on a larger deployment.


Regards
Fernando

On 07/08/2017 17:07, Erekle Magradze wrote:


Hi Fernando (sorry for misspelling your name, I used a different 
keyboard),


So let's go with the following scenarios:

1. Let's say you have two servers (replication factor is 2), i.e. two 
bricks per volume, in this case it is strongly recommended to have 
the arbiter node, the metadata storage that will guarantee avoiding 
the split brain situation, in this case for arbiter you don't even 
need a disk with lots of space, it's enough to have a tiny ssd but 
hosted on a separate server. Advantage of such setup is that you 
don't need the RAID 1 for each brick, you have the metadata 
information stored in arbiter node and brick replacement is easy.


2. If you have odd number of bricks (let's say 3, i.e. replication 
factor is 3) in your volume and you didn't create the arbiter node as 
well as you didn't configure the quorum, in this case the entire load 
for keeping the consistency of the volume resides on all 3 servers, 
each of them is important and each brick contains key information, 
they need to cross-check each other (that's what people usually do 
with the first try of gluster :) ), in this case replacing a brick is 
a big pain and in this case RAID 1 is a good option to have (that's 
the disadvantage, i.e. loosing the space and not having the JBOD 
option) advantage is that you don't have the to have additional 
arbiter node.


3. You have odd number of bricks and configured arbiter node, in this 
case you can easily go with JBOD, however a good practice would be to 
have a RAID 1 for arbiter disks (tiny 128GB SSD-s ar perfectly 
sufficient for volumes with 10s of TB-s in size.)


That's basically it

The rest about the reliability and setup scenarios you can find in 
gluster documentation, especially look for quorum and arbiter node 
configs+options.


Cheers

Erekle

P.S. What I was mentioning, regarding a good practice is mostly 
related to the operations of gluster not installation or deployment, 
i.e. not the conceptual understanding of gluster (conceptually it's a 
JBOD system).


On 08/07/2017 05:41 PM, FERNANDO FREDIANI wrote:


Thanks for the clarification Erekle.

However I get surprised with this way of operating from GlusterFS as 
it adds another layer of complexity to the system (either a hardware 
or software RAID) before the gluster config and increase the 
system's overall costs.


An important point to consider is: In RAID configuration you already 
have space 'wasted' in order to build redundancy (either RAID 1, 5, 
or 6). Then when you have GlusterFS on the top of several RAIDs you 
have again more data replicated so you end up with the same data 
consuming more space in a group of disks and again on the top of 
several RAIDs depending on the Gluster configuration you have (in a 
RAID 1 config the same data is replicated 4 times).


Yet another downside of having a RAID (specially RAID 5 or 6) is 
that it reduces considerably the write speeds as each group of disks 
will end up having the write speed of a single disk as all other 
disks of that group have to wait for each other to write as well.


Therefore if Gluster already replicates data why does it create this 
big pain you mentioned if the data is replicated somewhere else, can 
still be retrieved to both serve clients and reconstruct the 
equivalent disk when it is replaced ?


Fernando


On 07/08/2017 10:26, Erekle Magradze wrote:


Hi Frenando,

Here is my experience, if you consider a particular hard drive as a 
brick for gluster volume and it dies, i.e. it becomes not 
accessible it's a huge hassle to discard that brick and exchange 
with another one, since gluster some tries to access that broken 
brick and it's causing (at least it cause for me) a big pain, 
therefore it's better to have a RAID as brick, i.e. have RAID 1 
(mirroring) for each brick, in this case if the disk is down you 
can easily exchange it and rebuild the RAID without going offline, 
i.e switching off the volume doing brick manipulations and 
switching it back on.


Cheers

Erekle


On 08/07/2017 03:04 PM, FERNANDO FREDIANI wrote:


For any RAID 5 or 6 configuration I normally follow a simple gold 
rule which gave good results so far:

- up to 4 disks RAID 5
- 5 or more disks RAID 6

However I didn't really understand well the recommendation to use 
any RAID 

Re: [ovirt-users] Good practices

2017-08-07 Thread FERNANDO FREDIANI

Thanks for the detailed answer Erekle.

I conclude that it is worth it in any scenario to have an arbiter node in
order to avoid wasting more disk space on RAID X + Gluster replication on
top of it. The cost seems much lower if you consider the running costs of
the whole storage and compare them with the cost to build the arbiter node.
Even having a fully redundant arbiter service with 2 nodes would make it
worth it on a larger deployment.


Regards
Fernando

On 07/08/2017 17:07, Erekle Magradze wrote:


Hi Fernando (sorry for misspelling your name, I used a different 
keyboard),


So let's go with the following scenarios:

1. Let's say you have two servers (replication factor is 2), i.e. two 
bricks per volume, in this case it is strongly recommended to have the 
arbiter node, the metadata storage that will guarantee avoiding the 
split brain situation, in this case for arbiter you don't even need a 
disk with lots of space, it's enough to have a tiny ssd but hosted on 
a separate server. Advantage of such setup is that you don't need the 
RAID 1 for each brick, you have the metadata information stored in 
arbiter node and brick replacement is easy.


2. If you have odd number of bricks (let's say 3, i.e. replication 
factor is 3) in your volume and you didn't create the arbiter node as 
well as you didn't configure the quorum, in this case the entire load 
for keeping the consistency of the volume resides on all 3 servers, 
each of them is important and each brick contains key information, 
they need to cross-check each other (that's what people usually do 
with the first try of gluster :) ), in this case replacing a brick is 
a big pain and in this case RAID 1 is a good option to have (that's 
the disadvantage, i.e. loosing the space and not having the JBOD 
option) advantage is that you don't have the to have additional 
arbiter node.


3. You have odd number of bricks and configured arbiter node, in this 
case you can easily go with JBOD, however a good practice would be to 
have a RAID 1 for arbiter disks (tiny 128GB SSD-s ar perfectly 
sufficient for volumes with 10s of TB-s in size.)


That's basically it

The rest about the reliability and setup scenarios you can find in 
gluster documentation, especially look for quorum and arbiter node 
configs+options.


Cheers

Erekle

P.S. What I was mentioning, regarding a good practice is mostly 
related to the operations of gluster not installation or deployment, 
i.e. not the conceptual understanding of gluster (conceptually it's a 
JBOD system).


On 08/07/2017 05:41 PM, FERNANDO FREDIANI wrote:


Thanks for the clarification Erekle.

However I get surprised with this way of operating from GlusterFS as 
it adds another layer of complexity to the system (either a hardware 
or software RAID) before the gluster config and increase the system's 
overall costs.


An important point to consider is: In RAID configuration you already 
have space 'wasted' in order to build redundancy (either RAID 1, 5, 
or 6). Then when you have GlusterFS on the top of several RAIDs you 
have again more data replicated so you end up with the same data 
consuming more space in a group of disks and again on the top of 
several RAIDs depending on the Gluster configuration you have (in a 
RAID 1 config the same data is replicated 4 times).


Yet another downside of having a RAID (specially RAID 5 or 6) is that 
it reduces considerably the write speeds as each group of disks will 
end up having the write speed of a single disk as all other disks of 
that group have to wait for each other to write as well.


Therefore if Gluster already replicates data why does it create this 
big pain you mentioned if the data is replicated somewhere else, can 
still be retrieved to both serve clients and reconstruct the 
equivalent disk when it is replaced ?


Fernando


On 07/08/2017 10:26, Erekle Magradze wrote:


Hi Frenando,

Here is my experience, if you consider a particular hard drive as a 
brick for gluster volume and it dies, i.e. it becomes not accessible 
it's a huge hassle to discard that brick and exchange with another 
one, since gluster some tries to access that broken brick and it's 
causing (at least it cause for me) a big pain, therefore it's better 
to have a RAID as brick, i.e. have RAID 1 (mirroring) for each 
brick, in this case if the disk is down you can easily exchange it 
and rebuild the RAID without going offline, i.e switching off the 
volume doing brick manipulations and switching it back on.


Cheers

Erekle


On 08/07/2017 03:04 PM, FERNANDO FREDIANI wrote:


For any RAID 5 or 6 configuration I normally follow a simple gold 
rule which gave good results so far:

- up to 4 disks RAID 5
- 5 or more disks RAID 6

However I didn't really understand well the recommendation to use 
any RAID with GlusterFS. I always thought that GlusteFS likes to 
work in JBOD mode and control the disks (bricks) directlly so you 
can create whatever distribution rule you wish, and if a single 

Re: [ovirt-users] Good practices

2017-08-07 Thread Erekle Magradze

Hi Fernando (sorry for misspelling your name, I used a different keyboard),

So let's go with the following scenarios:

1. Let's say you have two servers (replication factor 2), i.e. two bricks per
volume. In this case it is strongly recommended to have an arbiter node: the
metadata storage that guarantees avoiding the split-brain situation. For the
arbiter you don't even need a disk with lots of space; a tiny SSD is enough,
but it should be hosted on a separate server. The advantage of such a setup is
that you don't need RAID 1 for each brick, you have the metadata stored on the
arbiter node, and brick replacement is easy.


2. If you have an odd number of bricks (let's say 3, i.e. replication factor 3)
in your volume and you created neither an arbiter node nor configured the
quorum, the entire load for keeping the volume consistent resides on all 3
servers. Each of them is important and each brick contains key information;
they need to cross-check each other (that's what people usually do on their
first try of gluster :) ). In this case replacing a brick is a big pain, and
RAID 1 is a good option to have (the disadvantage is losing the space and not
having the JBOD option); the advantage is that you don't have to have an
additional arbiter node.


3. You have an odd number of bricks and a configured arbiter node. In this case
you can easily go with JBOD; however, a good practice would be to have RAID 1
for the arbiter disks (tiny 128GB SSDs are perfectly sufficient for volumes
with tens of TBs in size).


That's basically it.

The rest, about reliability and setup scenarios, you can find in the gluster
documentation; in particular, look for the quorum and arbiter node configs and
options.
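
For reference, a hedged example of the quorum knobs being referred to (these
are standard Gluster volume options; the volume name is made up):

gluster volume set vmstore cluster.quorum-type auto
gluster volume set vmstore cluster.server-quorum-type server
gluster volume set all cluster.server-quorum-ratio 51%

With replica 3 or replica 2 + arbiter, client quorum set to auto means writes
are refused as soon as fewer than a majority of the bricks in a replica set are
reachable, which is what prevents split brain.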


Cheers

Erekle

P.S. What I was mentioning regarding good practice is mostly related to the
operation of gluster, not installation or deployment, i.e. not the conceptual
understanding of gluster (conceptually it's a JBOD system).


On 08/07/2017 05:41 PM, FERNANDO FREDIANI wrote:


Thanks for the clarification Erekle.

However I get surprised with this way of operating from GlusterFS as 
it adds another layer of complexity to the system (either a hardware 
or software RAID) before the gluster config and increase the system's 
overall costs.


An important point to consider is: In RAID configuration you already 
have space 'wasted' in order to build redundancy (either RAID 1, 5, or 
6). Then when you have GlusterFS on the top of several RAIDs you have 
again more data replicated so you end up with the same data consuming 
more space in a group of disks and again on the top of several RAIDs 
depending on the Gluster configuration you have (in a RAID 1 config 
the same data is replicated 4 times).


Yet another downside of having a RAID (specially RAID 5 or 6) is that 
it reduces considerably the write speeds as each group of disks will 
end up having the write speed of a single disk as all other disks of 
that group have to wait for each other to write as well.


Therefore if Gluster already replicates data why does it create this 
big pain you mentioned if the data is replicated somewhere else, can 
still be retrieved to both serve clients and reconstruct the 
equivalent disk when it is replaced ?


Fernando


On 07/08/2017 10:26, Erekle Magradze wrote:


Hi Frenando,

Here is my experience, if you consider a particular hard drive as a 
brick for gluster volume and it dies, i.e. it becomes not accessible 
it's a huge hassle to discard that brick and exchange with another 
one, since gluster some tries to access that broken brick and it's 
causing (at least it cause for me) a big pain, therefore it's better 
to have a RAID as brick, i.e. have RAID 1 (mirroring) for each brick, 
in this case if the disk is down you can easily exchange it and 
rebuild the RAID without going offline, i.e switching off the volume 
doing brick manipulations and switching it back on.


Cheers

Erekle


On 08/07/2017 03:04 PM, FERNANDO FREDIANI wrote:


For any RAID 5 or 6 configuration I normally follow a simple gold 
rule which gave good results so far:

- up to 4 disks RAID 5
- 5 or more disks RAID 6

However I didn't really understand well the recommendation to use 
any RAID with GlusterFS. I always thought that GlusteFS likes to 
work in JBOD mode and control the disks (bricks) directlly so you 
can create whatever distribution rule you wish, and if a single disk 
fails you just replace it and which obviously have the data 
replicated from another. The only downside of using in this way is 
that the replication data will be flow accross all servers but that 
is not much a big issue.


Anyone can elaborate about Using RAID + GlusterFS and JBOD + GlusterFS.

Thanks
Regards
Fernando


On 07/08/2017 03:46, Devin Acosta wrote:


Moacir,

I have recently installed multiple Red Hat Virtualization hosts for 
several different companies, and have dealt with the Red Hat 

Re: [ovirt-users] Good practices

2017-08-07 Thread Erekle Magradze

Hi Fernando,

So let's go with the following scenarios:

1. Let's say you have two servers (replication factor 2), i.e. two bricks per
volume. In this case it is strongly recommended to have an arbiter node: the
metadata storage that guarantees avoiding the split-brain situation. For the
arbiter you don't even need a disk with lots of space; a tiny SSD is enough,
but it should be hosted on a separate server. The advantage of such a setup is
that you don't need RAID 1 for each brick, you have the metadata stored on the
arbiter node, and brick replacement is easy.


2. If you have an odd number of bricks (let's say 3, i.e. replication factor 3)
in your volume and you created neither an arbiter node nor configured the
quorum, the entire load for keeping the volume consistent resides on all 3
servers. Each of them is important and each brick contains key information;
they need to cross-check each other (that's what people usually do on their
first try of gluster :) ). In this case replacing a brick is a big pain, and
RAID 1 is a good option to have (the disadvantage is losing the space and not
having the JBOD option); the advantage is that you don't have to have an
additional arbiter node.


3. You have an odd number of bricks and a configured arbiter node. In this case
you can easily go with JBOD; however, a good practice would be to have RAID 1
for the arbiter disks (tiny 128GB SSDs are perfectly sufficient for volumes
with tens of TBs in size).


That's basically it

The rest, about reliability and setup scenarios, you can find in the gluster
documentation; in particular, look for the quorum and arbiter node configs and
options.


Cheers

Erekle

P.S. What I was mentioning regarding good practice is mostly related to the
operation of gluster, not installation or deployment, i.e. not the conceptual
understanding of gluster (conceptually it's a JBOD system).



On 08/07/2017 05:41 PM, FERNANDO FREDIANI wrote:


Thanks for the clarification Erekle.

However I get surprised with this way of operating from GlusterFS as 
it adds another layer of complexity to the system (either a hardware 
or software RAID) before the gluster config and increase the system's 
overall costs.


An important point to consider is: In RAID configuration you already 
have space 'wasted' in order to build redundancy (either RAID 1, 5, or 
6). Then when you have GlusterFS on the top of several RAIDs you have 
again more data replicated so you end up with the same data consuming 
more space in a group of disks and again on the top of several RAIDs 
depending on the Gluster configuration you have (in a RAID 1 config 
the same data is replicated 4 times).


Yet another downside of having a RAID (specially RAID 5 or 6) is that 
it reduces considerably the write speeds as each group of disks will 
end up having the write speed of a single disk as all other disks of 
that group have to wait for each other to write as well.


Therefore if Gluster already replicates data why does it create this 
big pain you mentioned if the data is replicated somewhere else, can 
still be retrieved to both serve clients and reconstruct the 
equivalent disk when it is replaced ?


Fernando


On 07/08/2017 10:26, Erekle Magradze wrote:


Hi Frenando,

Here is my experience, if you consider a particular hard drive as a 
brick for gluster volume and it dies, i.e. it becomes not accessible 
it's a huge hassle to discard that brick and exchange with another 
one, since gluster some tries to access that broken brick and it's 
causing (at least it cause for me) a big pain, therefore it's better 
to have a RAID as brick, i.e. have RAID 1 (mirroring) for each brick, 
in this case if the disk is down you can easily exchange it and 
rebuild the RAID without going offline, i.e switching off the volume 
doing brick manipulations and switching it back on.


Cheers

Erekle


On 08/07/2017 03:04 PM, FERNANDO FREDIANI wrote:


For any RAID 5 or 6 configuration I normally follow a simple gold 
rule which gave good results so far:

- up to 4 disks RAID 5
- 5 or more disks RAID 6

However I didn't really understand well the recommendation to use 
any RAID with GlusterFS. I always thought that GlusteFS likes to 
work in JBOD mode and control the disks (bricks) directlly so you 
can create whatever distribution rule you wish, and if a single disk 
fails you just replace it and which obviously have the data 
replicated from another. The only downside of using in this way is 
that the replication data will be flow accross all servers but that 
is not much a big issue.


Anyone can elaborate about Using RAID + GlusterFS and JBOD + GlusterFS.

Thanks
Regards
Fernando


On 07/08/2017 03:46, Devin Acosta wrote:


Moacir,

I have recently installed multiple Red Hat Virtualization hosts for 
several different companies, and have dealt with the Red Hat 
Support Team in depth about optimal configuration in regards 

Re: [ovirt-users] Good practices

2017-08-07 Thread FERNANDO FREDIANI
What you mentioned is a specific case and not a generic situation. The main
point there is that RAID 5 or 6 impacts write performance compared to when you
write to only 2 given disks at a time. That was the comparison made.


Fernando


On 07/08/2017 16:49, Fabrice Bacchella wrote:


On 7 Aug 2017, at 17:41, FERNANDO FREDIANI wrote:




Yet another downside of having a RAID (specially RAID 5 or 6) is that 
it reduces considerably the write speeds as each group of disks will 
end up having the write speed of a single disk as all other disks of 
that group have to wait for each other to write as well.




That's not true if you have a medium to high range hardware RAID controller.
For example, HP Smart Array controllers come with a flash cache of about 1 or
2 GB that hides that from the OS.


___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Good practices

2017-08-07 Thread Fabrice Bacchella
>> Moacir: Yes! This is another reason to have separate networks for 
>> north/south and east/west. In that way I can use the standard MTU on the 
>> 10Gb NICs and jumbo frames on the file/move 40Gb NICs.

Why not jumbo frames everywhere?

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Good practices

2017-08-07 Thread Fabrice Bacchella

> On 7 Aug 2017, at 17:41, FERNANDO FREDIANI wrote:
> 

> Yet another downside of having a RAID (specially RAID 5 or 6) is that it 
> reduces considerably the write speeds as each group of disks will end up 
> having the write speed of a single disk as all other disks of that group have 
> to wait for each other to write as well.
> 

That's not true if you have a medium to high range hardware RAID controller.
For example, HP Smart Array controllers come with a flash cache of about 1 or
2 GB that hides that from the OS.

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Good practices

2017-08-07 Thread Moacir Ferreira
Hi Colin,


I am in Portugal, so sorry for this late response. It is quite confusing for 
me, please consider:

1 - What if the RAID is done by the server's disk controller, not by software?


2 - For JBOD I am just using gdeploy to deploy it. However, I am not using the 
oVirt node GUI to do this.


3 - As the VM .qcow2 files are quite big, tiering would only help if made by an 
intelligent system that uses SSD for chunks of data not for the entire .qcow2 
file. But I guess this is a problem everybody else has. So, Do you know how 
tiering works in Gluster?


4 - I am putting the OS on the first disk. However, would you do differently?


Moacir


From: Colin Coe <colin@gmail.com>
Sent: Monday, August 7, 2017 4:48 AM
To: Moacir Ferreira
Cc: users@ovirt.org
Subject: Re: [ovirt-users] Good practices

1) RAID5 may be a performance hit.

2) I'd be inclined to do this as JBOD by creating a distributed disperse volume
on each server.  Something like the loop below (see the expanded sketch after
this list):

echo gluster volume create dispersevol disperse-data 5 redundancy 2 \
$(for SERVER in a b c; do for BRICK in $(seq 1 5); do echo -e 
"server${SERVER}:/brick/brick-${SERVER}${BRICK}/brick \c"; done; done)

3) I think the above.

4) Gluster does support tiering, but IIRC you'd need the same number of SSD as 
spindle drives.  There may be another way to use the SSD as a fast cache.
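
For clarity, a sketch of what the loop in item 2 expands to (server and brick
names are illustrative only):

gluster volume create dispersevol disperse-data 5 redundancy 2 \
  servera:/brick/brick-a1/brick servera:/brick/brick-a2/brick ... \
  serverc:/brick/brick-c5/brick

i.e. 15 bricks in total across the three servers. Note that Gluster expects the
brick count to be a multiple of disperse-data + redundancy (here 5 + 2 = 7), so
the counts above would likely need adjusting before this is accepted.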

Where are you putting the OS?

Hope I understood the question...

Thanks

On Sun, Aug 6, 2017 at 10:49 PM, Moacir Ferreira 
<moacirferre...@hotmail.com<mailto:moacirferre...@hotmail.com>> wrote:

I am willing to assemble a oVirt "pod", made of 3 servers, each with 2 CPU 
sockets of 12 cores, 256GB RAM, 7 HDD 10K, 1 SSD. The idea is to use GlusterFS 
to provide HA for the VMs. The 3 servers have a dual 40Gb NIC and a dual 10Gb 
NIC. So my intention is to create a loop like a server triangle using the 40Gb 
NICs for virtualization files (VMs .qcow2) access and to move VMs around the 
pod (east /west traffic) while using the 10Gb interfaces for giving services to 
the outside world (north/south traffic).


This said, my first question is: How should I deploy GlusterFS in such oVirt 
scenario? My questions are:


1 - Should I create 3 RAID (i.e.: RAID 5), one on each oVirt node, and then 
create a GlusterFS using them?

2 - Instead, should I create a JBOD array made of all server's disks?

3 - What is the best Gluster configuration to provide for HA while not 
consuming too much disk space?

4 - Does a oVirt hypervisor pod like I am planning to build, and the 
virtualization environment, benefits from tiering when using a SSD disk? And 
yes, will Gluster do it by default or I have to configure it to do so?


At the bottom line, what is the good practice for using GlusterFS in small pods 
for enterprises?


You opinion/feedback will be really appreciated!

Moacir

___
Users mailing list
Users@ovirt.org<mailto:Users@ovirt.org>
http://lists.ovirt.org/mailman/listinfo/users


___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Good practices

2017-08-07 Thread Yaniv Kaul
On Mon, Aug 7, 2017 at 2:41 PM, Colin Coe <colin@gmail.com> wrote:

> Hi
>
> I just thought that you'd do hardware RAID if you had the controller or
> JBOD if you didn't.  In hindsight, a server with 40Gbps NICs is pretty
> likely to have a hardware RAID controller.  I've never done JBOD with
> hardware RAID.  I think having a single gluster brick on hardware JBOD
> would be riskier than multiple bricks, each on a single disk, but thats not
> based on anything other than my prejudices.
>
> I thought gluster tiering was for the most frequently accessed files, in
> which case all the VMs disks would end up in the hot tier.  However, I have
> been wrong before...
>

The most frequently accessed shards may not be complete files.
Y.
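
For context, a hedged example of enabling sharding on a volume (these are
standard Gluster volume options; the volume name and block size are just
illustrative, and sharding should only be enabled on a new, empty volume):

gluster volume set vmstore features.shard on
gluster volume set vmstore features.shard-block-size 64MB

With sharding, a large .qcow2 is stored as many fixed-size shards, so a hot
tier (or a self-heal) moves individual shards rather than the whole image.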


> I just wanted to know where the OS was going as I didn't see it mentioned
> in the OP.  Normally, I'd have the OS on a RAID1 but in your case thats a
> lot of wasted disk.
>
> Honestly, I think Yaniv's answer was far better than my own and made the
> important point about having an arbiter.
>
> Thanks
>
> On Mon, Aug 7, 2017 at 5:56 PM, Moacir Ferreira <
> moacirferre...@hotmail.com> wrote:
>
>> Hi Colin,
>>
>>
>> I am in Portugal, so sorry for this late response. It is quite confusing
>> for me, please consider:
>>
>>
>> 1* - *What if the RAID is done by the server's disk controller, not by
>> software?
>>
>> 2 - For JBOD I am just using gdeploy to deploy it. However, I am not
>> using the oVirt node GUI to do this.
>>
>>
>> 3 - As the VM .qcow2 files are quite big, tiering would only help if
>> made by an intelligent system that uses SSD for chunks of data not for the
>> entire .qcow2 file. But I guess this is a problem everybody else has. So,
>> Do you know how tiering works in Gluster?
>>
>>
>> 4 - I am putting the OS on the first disk. However, would you do
>> differently?
>>
>>
>> Moacir
>>
>> --
>> *From:* Colin Coe <colin@gmail.com>
>> *Sent:* Monday, August 7, 2017 4:48 AM
>> *To:* Moacir Ferreira
>> *Cc:* users@ovirt.org
>> *Subject:* Re: [ovirt-users] Good practices
>>
>> 1) RAID5 may be a performance hit-
>>
>> 2) I'd be inclined to do this as JBOD by creating a distributed disperse
>> volume on each server.  Something like
>>
>> echo gluster volume create dispersevol disperse-data 5 redundancy 2 \
>> $(for SERVER in a b c; do for BRICK in $(seq 1 5); do echo -e
>> "server${SERVER}:/brick/brick-${SERVER}${BRICK}/brick \c"; done; done)
>>
>> 3) I think the above.
>>
>> 4) Gluster does support tiering, but IIRC you'd need the same number of
>> SSD as spindle drives.  There may be another way to use the SSD as a fast
>> cache.
>>
>> Where are you putting the OS?
>>
>> Hope I understood the question...
>>
>> Thanks
>>
>> On Sun, Aug 6, 2017 at 10:49 PM, Moacir Ferreira <
>> moacirferre...@hotmail.com> wrote:
>>
>>> I am willing to assemble a oVirt "pod", made of 3 servers, each with 2
>>> CPU sockets of 12 cores, 256GB RAM, 7 HDD 10K, 1 SSD. The idea is to use
>>> GlusterFS to provide HA for the VMs. The 3 servers have a dual 40Gb NIC and
>>> a dual 10Gb NIC. So my intention is to create a loop like a server triangle
>>> using the 40Gb NICs for virtualization files (VMs .qcow2) access and to
>>> move VMs around the pod (east /west traffic) while using the 10Gb
>>> interfaces for giving services to the outside world (north/south traffic).
>>>
>>>
>>> This said, my first question is: How should I deploy GlusterFS in such
>>> oVirt scenario? My questions are:
>>>
>>>
>>> 1 - Should I create 3 RAID (i.e.: RAID 5), one on each oVirt node, and
>>> then create a GlusterFS using them?
>>>
>>> 2 - Instead, should I create a JBOD array made of all server's disks?
>>>
>>> 3 - What is the best Gluster configuration to provide for HA while not
>>> consuming too much disk space?
>>>
>>> 4 - Does a oVirt hypervisor pod like I am planning to build, and the
>>> virtualization environment, benefits from tiering when using a SSD disk?
>>> And yes, will Gluster do it by default or I have to configure it to do so?
>>>
>>>
>>> At the bottom line, what is the good practice for using GlusterFS in
>>> small pods for enterprises?
>>>
>>>
>>> You opinion/feedback will be really appreciated!
>>>
>>> Moacir
>>>
>>> ___
>>> Users mailing list
>>> Users@ovirt.org
>>> http://lists.ovirt.org/mailman/listinfo/users
>>>
>>>
>>
>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Good practices

2017-08-07 Thread Yaniv Kaul
On Mon, Aug 7, 2017 at 6:41 PM, FERNANDO FREDIANI  wrote:

> Thanks for the clarification Erekle.
>
> However I am surprised by this way of operating in GlusterFS, as it
> adds another layer of complexity to the system (either a hardware or
> software RAID) before the gluster config and increases the system's overall
> costs.
>

It does, but with HW based RAID it's not a big deal. The complexity is all
the stripe size math... which I personally don't like to calculate.
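
As a rough worked example (the numbers are illustrative, not a recommendation):
a 12-disk RAID 6 set has 10 data disks; with a 256 KiB stripe unit the full
stripe is 10 x 256 KiB = 2560 KiB, and the brick filesystem is then typically
created aligned to that, e.g. something like

mkfs.xfs -i size=512 -d su=256k,sw=10 /dev/gluster_vg/brick1

where su is the per-disk stripe unit, sw the number of data disks, and the
device path is made up.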


> An important point to consider is: In RAID configuration you already have
> space 'wasted' in order to build redundancy (either RAID 1, 5, or 6). Then
> when you have GlusterFS on the top of several RAIDs you have again more
> data replicated so you end up with the same data consuming more space in a
> group of disks and again on the top of several RAIDs depending on the
> Gluster configuration you have (in a RAID 1 config the same data is
> replicated 4 times).
>
> Yet another downside of having a RAID (especially RAID 5 or 6) is that it
> reduces considerably the write speeds as each group of disks will end up
> having the write speed of a single disk as all other disks of that group
> have to wait for each other to write as well.
>
> Therefore if Gluster already replicates data why does it create this big
> pain you mentioned if the data is replicated somewhere else, can still be
> retrieved to both serve clients and reconstruct the equivalent disk when it
> is replaced ?
>

I think it's a matter of how fast you can replace a disk (over a long
weekend?), how reliably you can do it (please, don't pull the wrong disk!
I've seen it happen too many times!) and how much of a performance hit you
are willing to accept while in degraded mode (and how long it took to detect
it). HDDs, unlike SSDs, die slowly. At least when an SSD dies, it dies a
quick and determined death. HDDs may accumulate errors and errors and still
function.
Y.



Fernando
>
> On 07/08/2017 10:26, Erekle Magradze wrote:
>
> Hi Frenando,
>
> Here is my experience, if you consider a particular hard drive as a brick
> for gluster volume and it dies, i.e. it becomes not accessible it's a huge
> hassle to discard that brick and exchange with another one, since gluster
> some tries to access that broken brick and it's causing (at least it cause
> for me) a big pain, therefore it's better to have a RAID as brick, i.e.
> have RAID 1 (mirroring) for each brick, in this case if the disk is down
> you can easily exchange it and rebuild the RAID without going offline, i.e
> switching off the volume doing brick manipulations and switching it back on.
>
> Cheers
>
> Erekle
>
> On 08/07/2017 03:04 PM, FERNANDO FREDIANI wrote:
>
> For any RAID 5 or 6 configuration I normally follow a simple gold rule
> which gave good results so far:
> - up to 4 disks RAID 5
> - 5 or more disks RAID 6
>
> However I didn't really understand well the recommendation to use any RAID
> with GlusterFS. I always thought that GlusteFS likes to work in JBOD mode
> and control the disks (bricks) directlly so you can create whatever
> distribution rule you wish, and if a single disk fails you just replace it
> and which obviously have the data replicated from another. The only
> downside of using in this way is that the replication data will be flow
> accross all servers but that is not much a big issue.
>
> Anyone can elaborate about Using RAID + GlusterFS and JBOD + GlusterFS.
>
> Thanks
> Regards
> Fernando
>
> On 07/08/2017 03:46, Devin Acosta wrote:
>
>
> Moacir,
>
> I have recently installed multiple Red Hat Virtualization hosts for
> several different companies, and have dealt with the Red Hat Support Team
> in depth about optimal configuration in regards to setting up GlusterFS
> most efficiently and I wanted to share with you what I learned.
>
> In general Red Hat Virtualization team frowns upon using each DISK of the
> system as just a JBOD, sure there is some protection by having the data
> replicated, however, the recommendation is to use RAID 6 (preferred) or
> RAID-5, or at least RAID-1 at the very least.
>
> Here is the direct quote from Red Hat when I asked about RAID and Bricks:
>
> *"A typical Gluster configuration would use RAID underneath the bricks.
> RAID 6 is most typical as it gives you 2 disk failure protection, but RAID
> 5 could be used too. Once you have the RAIDed bricks, you'd then apply the
> desired replication on top of that. The most popular way of doing this
> would be distributed replicated with 2x replication. In general you'll get
> better performance with larger bricks. 12 drives is often a sweet spot.
> Another option would be to create a separate tier using all SSD’s.” *
>
> *In order to SSD tiering from my understanding you would need 1 x NVMe
> drive in each server, or 4 x SSD hot tier (it needs to be distributed,
> replicated for the hot tier if not using NVME). So with you only having 1
> SSD drive in each server, I’d suggest maybe looking 

Re: [ovirt-users] Good practices

2017-08-07 Thread FERNANDO FREDIANI

Thanks for the clarification Erekle.

However I am surprised by this way of operating in GlusterFS, as it adds
another layer of complexity to the system (either a hardware or software RAID)
before the gluster config and increases the system's overall costs.


An important point to consider is: in a RAID configuration you already have
space 'wasted' in order to build redundancy (either RAID 1, 5, or 6). Then,
when you have GlusterFS on top of several RAIDs, you have again more data
replicated, so you end up with the same data consuming more space in a group
of disks and again on top of several RAIDs, depending on the Gluster
configuration you have (in a RAID 1 config the same data is replicated 4
times).


Yet another downside of having a RAID (especially RAID 5 or 6) is that it
reduces write speeds considerably, as each group of disks ends up having the
write speed of a single disk, since all the other disks of that group have to
wait for each other to write as well.


Therefore, if Gluster already replicates data, why does it create the big pain
you mentioned if the data is replicated somewhere else and can still be
retrieved both to serve clients and to reconstruct the equivalent disk when it
is replaced?


Fernando


On 07/08/2017 10:26, Erekle Magradze wrote:


Hi Frenando,

Here is my experience, if you consider a particular hard drive as a 
brick for gluster volume and it dies, i.e. it becomes not accessible 
it's a huge hassle to discard that brick and exchange with another 
one, since gluster some tries to access that broken brick and it's 
causing (at least it cause for me) a big pain, therefore it's better 
to have a RAID as brick, i.e. have RAID 1 (mirroring) for each brick, 
in this case if the disk is down you can easily exchange it and 
rebuild the RAID without going offline, i.e switching off the volume 
doing brick manipulations and switching it back on.


Cheers

Erekle


On 08/07/2017 03:04 PM, FERNANDO FREDIANI wrote:


For any RAID 5 or 6 configuration I normally follow a simple gold 
rule which gave good results so far:

- up to 4 disks RAID 5
- 5 or more disks RAID 6

However I didn't really understand well the recommendation to use any 
RAID with GlusterFS. I always thought that GlusteFS likes to work in 
JBOD mode and control the disks (bricks) directlly so you can create 
whatever distribution rule you wish, and if a single disk fails you 
just replace it and which obviously have the data replicated from 
another. The only downside of using in this way is that the 
replication data will be flow accross all servers but that is not 
much a big issue.


Anyone can elaborate about Using RAID + GlusterFS and JBOD + GlusterFS.

Thanks
Regards
Fernando


On 07/08/2017 03:46, Devin Acosta wrote:


Moacir,

I have recently installed multiple Red Hat Virtualization hosts for 
several different companies, and have dealt with the Red Hat Support 
Team in depth about optimal configuration in regards to setting up 
GlusterFS most efficiently and I wanted to share with you what I 
learned.


In general Red Hat Virtualization team frowns upon using each DISK 
of the system as just a JBOD, sure there is some protection by 
having the data replicated, however, the recommendation is to use 
RAID 6 (preferred) or RAID-5, or at least RAID-1 at the very least.


Here is the direct quote from Red Hat when I asked about RAID and 
Bricks:

"A typical Gluster configuration would use RAID underneath the bricks. RAID 6
is most typical as it gives you 2 disk failure protection, but RAID 5 could be
used too. Once you have the RAIDed bricks, you'd then apply the desired
replication on top of that. The most popular way of doing this would be
distributed replicated with 2x replication. In general you'll get better
performance with larger bricks. 12 drives is often a sweet spot. Another
option would be to create a separate tier using all SSD's."


In order to do SSD tiering, from my understanding you would need 1 x NVMe
drive in each server, or 4 x SSD hot tier (it needs to be distributed,
replicated for the hot tier if not using NVMe). So with you only having 1 SSD
drive in each server, I'd suggest maybe looking into the NVMe option.


Since you're using only 3 servers, what I'd probably suggest is to do
(2 Replicas + Arbiter Node); this setup actually doesn't require the 3rd
server to have big drives at all, as it only stores metadata about the files
and not actually a full copy.


Please see the attached document that was given to me by Red Hat to get more
information on this. Hope this information helps you.

--

Devin Acosta, RHCA, RHVCA
Red Hat Certified Architect

On August 6, 2017 at 7:29:29 PM, Moacir Ferreira 
(moacirferre...@hotmail.com ) wrote:


I am willing to assemble a oVirt "pod", made of 3 servers, each 
with 2 CPU sockets of 12 cores, 256GB RAM, 7 HDD 10K, 1 SSD. The 
idea is to use GlusterFS to provide HA for the 

Re: [ovirt-users] Good practices

2017-08-07 Thread Erekle Magradze

Hi Fernando,

Here is my experience: if you consider a particular hard drive as a brick for
a gluster volume and it dies, i.e. it becomes not accessible, it's a huge
hassle to discard that brick and exchange it with another one, since gluster
sometimes tries to access that broken brick and that causes (at least it
caused for me) a big pain. Therefore it's better to have a RAID as the brick,
i.e. have RAID 1 (mirroring) for each brick; in this case, if a disk is down
you can easily exchange it and rebuild the RAID without going offline, i.e.
without switching off the volume, doing brick manipulations, and switching it
back on.
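
For completeness, a hedged sketch of what replacing a dead brick looks like
when the brick is a raw disk rather than a RAID set (the volume, host and path
names are made up):

gluster volume replace-brick vmstore \
  host2:/gluster/brick1/vmstore host2:/gluster/brick1b/vmstore \
  commit force
gluster volume heal vmstore full

The replace-brick itself is quick; the pain described above is mostly the full
self-heal that follows and the volume behaviour while the old brick is
unreachable.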


Cheers

Erekle


On 08/07/2017 03:04 PM, FERNANDO FREDIANI wrote:


For any RAID 5 or 6 configuration I normally follow a simple gold rule 
which gave good results so far:

- up to 4 disks RAID 5
- 5 or more disks RAID 6

However I didn't really understand well the recommendation to use any 
RAID with GlusterFS. I always thought that GlusteFS likes to work in 
JBOD mode and control the disks (bricks) directlly so you can create 
whatever distribution rule you wish, and if a single disk fails you 
just replace it and which obviously have the data replicated from 
another. The only downside of using in this way is that the 
replication data will be flow accross all servers but that is not much 
a big issue.


Anyone can elaborate about Using RAID + GlusterFS and JBOD + GlusterFS.

Thanks
Regards
Fernando


On 07/08/2017 03:46, Devin Acosta wrote:


Moacir,

I have recently installed multiple Red Hat Virtualization hosts for 
several different companies, and have dealt with the Red Hat Support 
Team in depth about optimal configuration in regards to setting up 
GlusterFS most efficiently and I wanted to share with you what I learned.


In general Red Hat Virtualization team frowns upon using each DISK of 
the system as just a JBOD, sure there is some protection by having 
the data replicated, however, the recommendation is to use RAID 6 
(preferred) or RAID-5, or at least RAID-1 at the very least.


Here is the direct quote from Red Hat when I asked about RAID and Bricks:

"A typical Gluster configuration would use RAID underneath the bricks. RAID 6
is most typical as it gives you 2 disk failure protection, but RAID 5 could be
used too. Once you have the RAIDed bricks, you'd then apply the desired
replication on top of that. The most popular way of doing this would be
distributed replicated with 2x replication. In general you'll get better
performance with larger bricks. 12 drives is often a sweet spot. Another
option would be to create a separate tier using all SSD's."


In order to do SSD tiering, from my understanding you would need 1 x NVMe
drive in each server, or 4 x SSD hot tier (it needs to be distributed,
replicated for the hot tier if not using NVMe). So with you only having 1 SSD
drive in each server, I'd suggest maybe looking into the NVMe option.


Since you're using only 3 servers, what I'd probably suggest is to do
(2 Replicas + Arbiter Node); this setup actually doesn't require the 3rd
server to have big drives at all, as it only stores metadata about the files
and not actually a full copy.


Please see the attached document that was given to me by Red Hat to get more
information on this. Hope this information helps you.

--

Devin Acosta, RHCA, RHVCA
Red Hat Certified Architect

On August 6, 2017 at 7:29:29 PM, Moacir Ferreira 
(moacirferre...@hotmail.com ) wrote:


I am willing to assemble a oVirt "pod", made of 3 servers, each with 
2 CPU sockets of 12 cores, 256GB RAM, 7 HDD 10K, 1 SSD. The idea is 
to use GlusterFS to provide HA for the VMs. The 3 servers have a 
dual 40Gb NIC and a dual 10Gb NIC. So my intention is to create a 
loop like a server triangle using the 40Gb NICs for virtualization 
files (VMs .qcow2) access and to move VMs around the pod (east /west 
traffic) while using the 10Gb interfaces for giving services to the 
outside world (north/south traffic).



This said, my first question is: How should I deploy GlusterFS in 
such oVirt scenario? My questions are:



1 - Should I create 3 RAID (i.e.: RAID 5), one on each oVirt node, 
and then create a GlusterFS using them?


2 - Instead, should I create a JBOD array made of all server's disks?

3 - What is the best Gluster configuration to provide for HA while 
not consuming too much disk space?


4 - Does a oVirt hypervisor pod like I am planning to build, and the 
virtualization environment, benefits from tiering when using a SSD 
disk? And yes, will Gluster do it by default or I have to configure 
it to do so?



At the bottom line, what is the good practice for using GlusterFS in 
small pods for enterprises?



You opinion/feedback will be really appreciated!

Moacir

___
Users mailing list
Users@ovirt.org 
http://lists.ovirt.org/mailman/listinfo/users




Re: [ovirt-users] Good practices

2017-08-07 Thread FERNANDO FREDIANI
Moacir, I believe that to use the 3 servers directly connected to each other
without a switch, you have to have a bridge on each server for every 2 physical
interfaces to allow the traffic to pass through at layer 2 (is it possible to
create this from the oVirt Engine web interface?). If your ovirtmgmt network is
separate from the others (it really should be), that should be fine to do.



Fernando


On 07/08/2017 07:13, Moacir Ferreira wrote:


Hi, in-line responses.


Thanks,

Moacir



*From:* Yaniv Kaul <yk...@redhat.com>
*Sent:* Monday, August 7, 2017 7:42 AM
*To:* Moacir Ferreira
*Cc:* users@ovirt.org
*Subject:* Re: [ovirt-users] Good practices


On Sun, Aug 6, 2017 at 5:49 PM, Moacir Ferreira 
<moacirferre...@hotmail.com <mailto:moacirferre...@hotmail.com>> wrote:


I am willing to assemble a oVirt "pod", made of 3 servers, each
with 2 CPU sockets of 12 cores, 256GB RAM, 7 HDD 10K, 1 SSD. The
idea is to use GlusterFS to provide HA for the VMs. The 3 servers
have a dual 40Gb NIC and a dual 10Gb NIC. So my intention is to
create a loop like a server triangle using the 40Gb NICs for
virtualization files (VMs .qcow2) access and to move VMs around
the pod (east /west traffic) while using the 10Gb interfaces for
giving services to the outside world (north/south traffic).


Very nice gear. How are you planning the network exactly? Without a 
switch, back-to-back? (sounds OK to me, just wanted to ensure this is 
what the 'dual' is used for). However, I'm unsure if you have the 
correct balance between the interface speeds (40g) and the disks (too 
many HDDs?).


Moacir: The idea is to have a very high performance network for the 
distributed file system and to prevent bottlenecks when we move one VM 
from a node to another. Using 40Gb NICs I can just connect the servers 
back-to-back. In this case I don't need the expensive 40Gb switch, I 
get very high speed and no contention between north/south traffic with 
east/west.



This said, my first question is: How should I deploy GlusterFS in
such oVirt scenario? My questions are:


1 - Should I create 3 RAID (i.e.: RAID 5), one on each oVirt node,
and then create a GlusterFS using them?

I would assume RAID 1 for the operating system (you don't want a 
single point of failure there?) and the rest JBODs. The SSD will be 
used for caching, I reckon? (I personally would add more SSDs instead 
of HDDs, but it does depend on the disk sizes and your space requirements.


Moacir: Yes, I agree that I need a RAID-1 for the OS. Now, generic 
JBOD or a JBOD assembled using RAID-5 "disks" created by the server's 
disk controller?


2 - Instead, should I create a JBOD array made of all server's disks?

3 - What is the best Gluster configuration to provide for HA while
not consuming too much disk space?


Replica 2 + Arbiter sounds good to me.
Moacir: I agree, and that is what I am using.

4 - Does a oVirt hypervisor pod like I am planning to build, and
the virtualization environment, benefits from tiering when using a
SSD disk? And yes, will Gluster do it by default or I have to
configure it to do so?


Yes, I believe using lvmcache is the best way to go.

Moacir: Are you sure? I say that because the qcow2 files will be
quite big. So if tiering is "file based" the SSD would have to be
very, very big, unless Gluster tiering does it by "chunks of data".
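
A hedged sketch of the lvmcache approach (device names and sizes are made up,
and the exact lvconvert steps vary with the LVM version). Because lvmcache
works on block extents underneath the filesystem, it caches the hot parts of a
large .qcow2 rather than needing to hold whole files:

pvcreate /dev/sdb /dev/sdc            # sdb = HDD RAID/JBOD, sdc = SSD
vgcreate gluster_vg /dev/sdb /dev/sdc
lvcreate -L 4T -n brick1 gluster_vg /dev/sdb
lvcreate -L 350G -n brick1_cache gluster_vg /dev/sdc
lvcreate -L 2G -n brick1_cmeta gluster_vg /dev/sdc
lvconvert --type cache-pool --poolmetadata gluster_vg/brick1_cmeta \
  gluster_vg/brick1_cache
lvconvert --type cache --cachepool gluster_vg/brick1_cache gluster_vg/brick1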


At the bottom line, what is the good practice for using GlusterFS
in small pods for enterprises?


Don't forget jumbo frames. libgfapi (coming hopefully in 4.1.5). 
Sharding (enabled out of the box if you use a hyper-converged setup 
via gdeploy).
*Moacir:* Yes! This is another reason to have separate networks for 
north/south and east/west. In that way I can use the standard MTU on 
the 10Gb NICs and jumbo frames on the file/move 40Gb NICs.
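
As a hedged illustration (the interface names are hypothetical, and the MTU has
to match on both ends of each back-to-back link), jumbo frames on the storage
NICs are just an MTU setting:

ip link set dev ens1f0 mtu 9000
ip link set dev ens1f1 mtu 9000

made persistent with MTU=9000 in the corresponding ifcfg files, or via the MTU
field of the oVirt logical network so the engine pushes it to every host.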


Y.


You opinion/feedback will be really appreciated!

Moacir


___
Users mailing list
Users@ovirt.org <mailto:Users@ovirt.org>
http://lists.ovirt.org/mailman/listinfo/users
<http://lists.ovirt.org/mailman/listinfo/users>




___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Good practices

2017-08-07 Thread FERNANDO FREDIANI
For any RAID 5 or 6 configuration I normally follow a simple golden rule which
has given good results so far:

- up to 4 disks: RAID 5
- 5 or more disks: RAID 6

However, I didn't really understand the recommendation to use any RAID with
GlusterFS. I always thought that GlusterFS likes to work in JBOD mode and
control the disks (bricks) directly, so you can create whatever distribution
rule you wish, and if a single disk fails you just replace it, and it obviously
has the data replicated from another one. The only downside of using it this
way is that the replication data will flow across all servers, but that is not
much of an issue.


Can anyone elaborate on using RAID + GlusterFS versus JBOD + GlusterFS?

Thanks
Regards
Fernando


On 07/08/2017 03:46, Devin Acosta wrote:


Moacir,

I have recently installed multiple Red Hat Virtualization hosts for 
several different companies, and have dealt with the Red Hat Support 
Team in depth about optimal configuration in regards to setting up 
GlusterFS most efficiently and I wanted to share with you what I learned.


In general the Red Hat Virtualization team frowns upon using each disk of 
the system as just a JBOD; sure, there is some protection by having the 
data replicated, however the recommendation is to use RAID 6 (preferred) 
or RAID 5, or RAID 1 at the very least.


Here is the direct quote from Red Hat when I asked about RAID and Bricks:

"A typical Gluster configuration would use RAID underneath the bricks. RAID 6
is most typical as it gives you 2 disk failure protection, but RAID 5 could be
used too. Once you have the RAIDed bricks, you'd then apply the desired
replication on top of that. The most popular way of doing this would be
distributed replicated with 2x replication. In general you'll get better
performance with larger bricks. 12 drives is often a sweet spot. Another
option would be to create a separate tier using all SSD's."


In order to do SSD tiering, from my understanding you would need 1 x NVMe
drive in each server, or 4 x SSD hot tier (it needs to be distributed,
replicated for the hot tier if not using NVMe). So with you only having 1 SSD
drive in each server, I'd suggest maybe looking into the NVMe option.


Since you're using only 3 servers, what I'd probably suggest is to do
(2 Replicas + Arbiter Node); this setup actually doesn't require the 3rd
server to have big drives at all, as it only stores metadata about the files
and not actually a full copy.


Please see the attached document that was given to me by Red Hat to get more
information on this. Hope this information helps you.

--

Devin Acosta, RHCA, RHVCA
Red Hat Certified Architect

On August 6, 2017 at 7:29:29 PM, Moacir Ferreira 
(moacirferre...@hotmail.com ) wrote:


I am willing to assemble a oVirt "pod", made of 3 servers, each with 
2 CPU sockets of 12 cores, 256GB RAM, 7 HDD 10K, 1 SSD. The idea is 
to use GlusterFS to provide HA for the VMs. The 3 servers have a dual 
40Gb NIC and a dual 10Gb NIC. So my intention is to create a loop 
like a server triangle using the 40Gb NICs for virtualization files 
(VMs .qcow2) access and to move VMs around the pod (east /west 
traffic) while using the 10Gb interfaces for giving services to the 
outside world (north/south traffic).



This said, my first question is: How should I deploy GlusterFS in 
such oVirt scenario? My questions are:



1 - Should I create 3 RAID (i.e.: RAID 5), one on each oVirt node, 
and then create a GlusterFS using them?


2 - Instead, should I create a JBOD array made of all server's disks?

3 - What is the best Gluster configuration to provide for HA while 
not consuming too much disk space?


4 - Does a oVirt hypervisor pod like I am planning to build, and the 
virtualization environment, benefits from tiering when using a SSD 
disk? And yes, will Gluster do it by default or I have to configure 
it to do so?



At the bottom line, what is the good practice for using GlusterFS in 
small pods for enterprises?



You opinion/feedback will be really appreciated!

Moacir



___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Good practices

2017-08-07 Thread Moacir Ferreira
Hi, in-line responses.


Thanks,

Moacir


From: Yaniv Kaul <yk...@redhat.com>
Sent: Monday, August 7, 2017 7:42 AM
To: Moacir Ferreira
Cc: users@ovirt.org
Subject: Re: [ovirt-users] Good practices



On Sun, Aug 6, 2017 at 5:49 PM, Moacir Ferreira 
<moacirferre...@hotmail.com<mailto:moacirferre...@hotmail.com>> wrote:

I am willing to assemble a oVirt "pod", made of 3 servers, each with 2 CPU 
sockets of 12 cores, 256GB RAM, 7 HDD 10K, 1 SSD. The idea is to use GlusterFS 
to provide HA for the VMs. The 3 servers have a dual 40Gb NIC and a dual 10Gb 
NIC. So my intention is to create a loop like a server triangle using the 40Gb 
NICs for virtualization files (VMs .qcow2) access and to move VMs around the 
pod (east /west traffic) while using the 10Gb interfaces for giving services to 
the outside world (north/south traffic).

Very nice gear. How are you planning the network exactly? Without a switch, 
back-to-back? (sounds OK to me, just wanted to ensure this is what the 'dual' 
is used for). However, I'm unsure if you have the correct balance between the 
interface speeds (40g) and the disks (too many HDDs?).

Moacir: The idea is to have a very high-performance network for the distributed 
file system and to prevent bottlenecks when we move one VM from a node to 
another. Using 40Gb NICs I can just connect the servers back-to-back. In this 
case I don't need the expensive 40Gb switch, I get very high speed, and there 
is no contention between north/south and east/west traffic.
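
A back-to-back triangle only needs a small point-to-point subnet per link; a 
rough sketch for one node with NetworkManager (interface names and addresses 
are just examples):

# node1 <-> node2 link on the first 40Gb port (peer gets 10.10.12.2/30)
nmcli connection add type ethernet ifname ens1f0 con-name storage-n2 \
  ipv4.method manual ipv4.addresses 10.10.12.1/30
# node1 <-> node3 link on the second 40Gb port (peer gets 10.10.13.2/30)
nmcli connection add type ethernet ifname ens1f1 con-name storage-n3 \
  ipv4.method manual ipv4.addresses 10.10.13.1/30

Gluster peers are then probed over these point-to-point addresses, keeping 
storage and migration traffic off the 10Gb front-end NICs.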



This said, my first question is: How should I deploy GlusterFS in such oVirt 
scenario? My questions are:


1 - Should I create 3 RAID (i.e.: RAID 5), one on each oVirt node, and then 
create a GlusterFS using them?

I would assume RAID 1 for the operating system (you don't want a single point 
of failure there?) and the rest JBODs. The SSD will be used for caching, I 
reckon? (I personally would add more SSDs instead of HDDs, but it does depend 
on the disk sizes and your space requirements.)

Moacir: Yes, I agree that I need a RAID-1 for the OS. Now, generic JBOD or a 
JBOD assembled using RAID-5 "disks" created by the server's disk controller?


2 - Instead, should I create a JBOD array made of all server's disks?

3 - What is the best Gluster configuration to provide for HA while not 
consuming too much disk space?

Replica 2 + Arbiter sounds good to me.
Moacir: I agree, and that is what I am using.
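
As a rough sketch of that layout (volume name, hosts and brick paths are 
placeholders), the arbiter is declared when the volume is created:

# two full data copies plus a metadata-only arbiter brick on the third node
gluster volume create vmstore replica 3 arbiter 1 \
  srv1:/gluster/brick1/vmstore srv2:/gluster/brick1/vmstore srv3:/gluster/arbiter1/vmstore

The arbiter brick only holds file names and metadata, so a small disk on the 
third node is enough to provide the quorum vote that prevents split-brain.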


4 - Does a oVirt hypervisor pod like I am planning to build, and the 
virtualization environment, benefits from tiering when using a SSD disk? And 
yes, will Gluster do it by default or I have to configure it to do so?

Yes, I believe using lvmcache is the best way to go.

Moacir: Are you sure? I say that because the qcow2 files will be quite big. So 
if tiering is "file based" the SSD would have to be very, very big, unless 
Gluster tiering does it by "chunks of data".
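
For what it's worth, lvmcache works at the block level, so it sidesteps the 
whole-file concern: only the hot extents of a large .qcow2 end up on the SSD. 
A rough sketch, assuming a volume group gluster_vg that already holds the 
HDD-backed brick LV (brick_lv) and the SSD (/dev/sdh) as an extra PV; all 
names and sizes are placeholders:

# carve cache data and metadata LVs out of the SSD
lvcreate -L 400G -n brick_cache gluster_vg /dev/sdh
lvcreate -L 4G -n brick_cache_meta gluster_vg /dev/sdh
# bind them into a cache pool and attach it to the existing brick LV
lvconvert --type cache-pool --poolmetadata gluster_vg/brick_cache_meta gluster_vg/brick_cache
lvconvert --type cache --cachepool gluster_vg/brick_cache gluster_vg/brick_lv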


At the bottom line, what is the good practice for using GlusterFS in small pods 
for enterprises?

Don't forget jumbo frames. libgfapi (coming hopefully in 4.1.5). Sharding 
(enabled out of the box if you use a hyper-converged setup via gdeploy).
Moacir: Yes! This is another reason to have separate networks for north/south 
and east/west. In that way I can use the standard MTU on the 10Gb NICs and 
jumbo frames on the file/move 40Gb NICs.

Y.
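
A rough sketch of both suggestions, reusing the placeholder connection and 
volume names from the sketches above (storage-n2, vmstore); the MTU change 
goes on the 40Gb storage links only, and sharding is a per-volume option:

# jumbo frames on an east/west (storage) connection
nmcli connection modify storage-n2 802-3-ethernet.mtu 9000
nmcli connection up storage-n2

# sharding splits each .qcow2 into fixed-size pieces; gdeploy hyper-converged
# setups enable it out of the box, otherwise:
gluster volume set vmstore features.shard on
gluster volume set vmstore features.shard-block-size 512MB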



You opinion/feedback will be really appreciated!

Moacir



___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Good practices

2017-08-07 Thread Colin Coe
Hi

I just thought that you'd do hardware RAID if you had the controller or
JBOD if you didn't.  In hindsight, a server with 40Gbps NICs is pretty
likely to have a hardware RAID controller.  I've never done JBOD with
hardware RAID.  I think having a single gluster brick on hardware JBOD
would be riskier than multiple bricks, each on a single disk, but that's not
based on anything other than my prejudices.
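
For what it's worth, replacing a failed single-disk brick is a short 
procedure; a rough sketch with placeholder volume, host and brick paths:

# after swapping the disk, recreate the filesystem and mount it, then:
gluster volume replace-brick vmdata node2:/gluster/disk3/brick \
  node2:/gluster/disk3new/brick commit force
# self-heal copies the data back from the surviving replicas; watch it with:
gluster volume heal vmdata info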

I thought gluster tiering was for the most frequently accessed files, in
which case all the VMs disks would end up in the hot tier.  However, I have
been wrong before...

I just wanted to know where the OS was going as I didn't see it mentioned
in the OP.  Normally, I'd have the OS on a RAID1 but in your case that's a
lot of wasted disk.

Honestly, I think Yaniv's answer was far better than my own and made the
important point about having an arbiter.

Thanks

On Mon, Aug 7, 2017 at 5:56 PM, Moacir Ferreira <moacirferre...@hotmail.com>
wrote:

> Hi Colin,
>
>
> I am in Portugal, so sorry for this late response. It is quite confusing
> for me, please consider:
>
>
> 1 - What if the RAID is done by the server's disk controller, not by
> software?
>
> 2 - For JBOD I am just using gdeploy to deploy it. However, I am not
> using the oVirt node GUI to do this.
>
>
> 3 - As the VM .qcow2 files are quite big, tiering would only help if done
> by an intelligent system that uses the SSD for chunks of data, not for the
> entire .qcow2 file. But I guess this is a problem everybody else has. So,
> do you know how tiering works in Gluster?
>
>
> 4 - I am putting the OS on the first disk. However, would you do
> differently?
>
>
> Moacir
>
> --
> From: Colin Coe <colin@gmail.com>
> Sent: Monday, August 7, 2017 4:48 AM
> To: Moacir Ferreira
> Cc: users@ovirt.org
> Subject: Re: [ovirt-users] Good practices
>
> 1) RAID5 may be a performance hit
>
> 2) I'd be inclined to do this as JBOD by creating a distributed disperse
> volume on each server.  Something like
>
> echo gluster volume create dispersevol disperse-data 5 redundancy 2 \
>   $(for SERVER in a b c; do
>       for BRICK in $(seq 1 5); do
>         echo -e "server${SERVER}:/brick/brick-${SERVER}${BRICK}/brick \c"
>       done
>     done)
>
> 3) I think the above.
>
> 4) Gluster does support tiering, but IIRC you'd need the same number of
> SSD as spindle drives.  There may be another way to use the SSD as a fast
> cache.
>
> Where are you putting the OS?
>
> Hope I understood the question...
>
> Thanks
>
> On Sun, Aug 6, 2017 at 10:49 PM, Moacir Ferreira <
> moacirferre...@hotmail.com> wrote:
>
>> I am willing to assemble a oVirt "pod", made of 3 servers, each with 2
>> CPU sockets of 12 cores, 256GB RAM, 7 HDD 10K, 1 SSD. The idea is to use
>> GlusterFS to provide HA for the VMs. The 3 servers have a dual 40Gb NIC and
>> a dual 10Gb NIC. So my intention is to create a loop like a server triangle
>> using the 40Gb NICs for virtualization files (VMs .qcow2) access and to
>> move VMs around the pod (east /west traffic) while using the 10Gb
>> interfaces for giving services to the outside world (north/south traffic).
>>
>>
>> This said, my first question is: How should I deploy GlusterFS in such
>> oVirt scenario? My questions are:
>>
>>
>> 1 - Should I create 3 RAID (i.e.: RAID 5), one on each oVirt node, and
>> then create a GlusterFS using them?
>>
>> 2 - Instead, should I create a JBOD array made of all server's disks?
>>
>> 3 - What is the best Gluster configuration to provide for HA while not
>> consuming too much disk space?
>>
>> 4 - Does a oVirt hypervisor pod like I am planning to build, and the
>> virtualization environment, benefits from tiering when using a SSD disk?
>> And yes, will Gluster do it by default or I have to configure it to do so?
>>
>>
>> At the bottom line, what is the good practice for using GlusterFS in
>> small pods for enterprises?
>>
>>
>> You opinion/feedback will be really appreciated!
>>
>> Moacir
>>
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Good practices

2017-08-07 Thread Moacir Ferreira
Devin,


Many, many thanks for your response. I will read the doc you sent and if I still 
have questions I will post them here.


But why would I use a RAIDed brick if Gluster, by itself, already "protects" 
the data by making replicas? You see, that is what is confusing to me...


Thanks,

Moacir



From: Devin Acosta <de...@pabstatencio.com>
Sent: Monday, August 7, 2017 7:46 AM
To: Moacir Ferreira; users@ovirt.org
Subject: Re: [ovirt-users] Good practices


Moacir,

I have recently installed multiple Red Hat Virtualization hosts for several 
different companies, and have dealt with the Red Hat Support Team in depth 
about optimal configuration in regards to setting up GlusterFS most efficiently 
and I wanted to share with you what I learned.

In general, the Red Hat Virtualization team frowns upon using each disk of the 
system as just a JBOD. Sure, there is some protection by having the data 
replicated; however, the recommendation is to use RAID 6 (preferred), RAID 5, 
or at the very least RAID 1.

Here is the direct quote from Red Hat when I asked about RAID and Bricks:

"A typical Gluster configuration would use RAID underneath the bricks. RAID 6 
is most typical as it gives you 2 disk failure protection, but RAID 5 could be 
used too. Once you have the RAIDed bricks, you'd then apply the desired 
replication on top of that. The most popular way of doing this would be 
distributed replicated with 2x replication. In general you'll get better 
performance with larger bricks. 12 drives is often a sweet spot. Another option 
would be to create a separate tier using all SSD’s.”

In order to do SSD tiering, from my understanding you would need 1 x NVMe drive 
in each server, or 4 x SSD for the hot tier (it needs to be distributed, 
replicated for the hot tier if not using NVMe). So with you only having 1 SSD 
drive in each server, I'd suggest maybe looking into the NVMe option.

Since you're using only 3 servers, what I'd probably suggest is to do (2 Replicas 
+ Arbiter Node); this setup actually doesn't require the 3rd server to have big 
drives at all as it only stores metadata about the files and not actually a 
full copy.

Please see the attached document that was given to me by Red Hat to get more 
information on this. Hope this information helps you.


--

Devin Acosta, RHCA, RHVCA
Red Hat Certified Architect


On August 6, 2017 at 7:29:29 PM, Moacir Ferreira 
(moacirferre...@hotmail.com<mailto:moacirferre...@hotmail.com>) wrote:

I am willing to assemble a oVirt "pod", made of 3 servers, each with 2 CPU 
sockets of 12 cores, 256GB RAM, 7 HDD 10K, 1 SSD. The idea is to use GlusterFS 
to provide HA for the VMs. The 3 servers have a dual 40Gb NIC and a dual 10Gb 
NIC. So my intention is to create a loop like a server triangle using the 40Gb 
NICs for virtualization files (VMs .qcow2) access and to move VMs around the 
pod (east /west traffic) while using the 10Gb interfaces for giving services to 
the outside world (north/south traffic).


This said, my first question is: How should I deploy GlusterFS in such oVirt 
scenario? My questions are:


1 - Should I create 3 RAID (i.e.: RAID 5), one on each oVirt node, and then 
create a GlusterFS using them?

2 - Instead, should I create a JBOD array made of all server's disks?

3 - What is the best Gluster configuration to provide for HA while not 
consuming too much disk space?

4 - Does a oVirt hypervisor pod like I am planning to build, and the 
virtualization environment, benefits from tiering when using a SSD disk? And 
yes, will Gluster do it by default or I have to configure it to do so?


At the bottom line, what is the good practice for using GlusterFS in small pods 
for enterprises?


You opinion/feedback will be really appreciated!

Moacir

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Good practices

2017-08-07 Thread Yaniv Kaul
On Sun, Aug 6, 2017 at 5:49 PM, Moacir Ferreira 
wrote:

> I am willing to assemble a oVirt "pod", made of 3 servers, each with 2 CPU
> sockets of 12 cores, 256GB RAM, 7 HDD 10K, 1 SSD. The idea is to use
> GlusterFS to provide HA for the VMs. The 3 servers have a dual 40Gb NIC and
> a dual 10Gb NIC. So my intention is to create a loop like a server triangle
> using the 40Gb NICs for virtualization files (VMs .qcow2) access and to
> move VMs around the pod (east /west traffic) while using the 10Gb
> interfaces for giving services to the outside world (north/south traffic).
>

Very nice gear. How are you planning the network exactly? Without a switch,
back-to-back? (sounds OK to me, just wanted to ensure this is what the
'dual' is used for). However, I'm unsure if you have the correct balance
between the interface speeds (40g) and the disks (too many HDDs?).


>
> This said, my first question is: How should I deploy GlusterFS in such
> oVirt scenario? My questions are:
>
>
> 1 - Should I create 3 RAID (i.e.: RAID 5), one on each oVirt node, and
> then create a GlusterFS using them?
>
I would assume RAID 1 for the operating system (you don't want a single
point of failure there?) and the rest JBODs. The SSD will be used for
caching, I reckon? (I personally would add more SSDs instead of HDDs, but
it does depend on the disk sizes and your space requirements.)


> 2 - Instead, should I create a JBOD array made of all server's disks?
>
> 3 - What is the best Gluster configuration to provide for HA while not
> consuming too much disk space?
>

Replica 2 + Arbiter sounds good to me.


> 4 - Does a oVirt hypervisor pod like I am planning to build, and the
> virtualization environment, benefits from tiering when using a SSD disk?
> And yes, will Gluster do it by default or I have to configure it to do so?
>

Yes, I believe using lvmcache is the best way to go.

>
> At the bottom line, what is the good practice for using GlusterFS in small
> pods for enterprises?
>

Don't forget jumbo frames. libgfapi (coming hopefully in 4.1.5). Sharding
(enabled out of the box if you use a hyper-converged setup via gdeploy).
Y.


>
> You opinion/feedback will be really appreciated!
>
> Moacir
>
>
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Good practices

2017-08-06 Thread Colin Coe
1) RAID5 may be a performance hit

2) I'd be inclined to do this as JBOD by creating a distributed disperse
volume on each server.  Something like

echo gluster volume create dispersevol disperse-data 5 redundancy 2 \
  $(for SERVER in a b c; do
      for BRICK in $(seq 1 5); do
        echo -e "server${SERVER}:/brick/brick-${SERVER}${BRICK}/brick \c"
      done
    done)

3) I think the above

4) Gluster does support tiering, but IIRC you'd need the same number of SSD
as spindle drives.  There may be another way to use the SSD as a fast
cache.
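
For reference, attaching a hot tier looked roughly like the sketch below in 
the Gluster 3.7-era CLI (volume name, hosts and SSD brick paths are 
placeholders); the hot tier needs its own replication if it must survive a 
node failure:

gluster volume tier vmstore attach replica 2 \
  srv1:/gluster/ssd/brick srv2:/gluster/ssd/brick
gluster volume tier vmstore status
# detach again later with:
# gluster volume tier vmstore detach start
# gluster volume tier vmstore detach commit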

Where are you putting the OS?

Hope I understood the question...

Thanks

On Sun, Aug 6, 2017 at 10:49 PM, Moacir Ferreira  wrote:

> I am willing to assemble a oVirt "pod", made of 3 servers, each with 2 CPU
> sockets of 12 cores, 256GB RAM, 7 HDD 10K, 1 SSD. The idea is to use
> GlusterFS to provide HA for the VMs. The 3 servers have a dual 40Gb NIC and
> a dual 10Gb NIC. So my intention is to create a loop like a server triangle
> using the 40Gb NICs for virtualization files (VMs .qcow2) access and to
> move VMs around the pod (east /west traffic) while using the 10Gb
> interfaces for giving services to the outside world (north/south traffic).
>
>
> This said, my first question is: How should I deploy GlusterFS in such
> oVirt scenario? My questions are:
>
>
> 1 - Should I create 3 RAID (i.e.: RAID 5), one on each oVirt node, and
> then create a GlusterFS using them?
>
> 2 - Instead, should I create a JBOD array made of all server's disks?
>
> 3 - What is the best Gluster configuration to provide for HA while not
> consuming too much disk space?
>
> 4 - Does a oVirt hypervisor pod like I am planning to build, and the
> virtualization environment, benefits from tiering when using a SSD disk?
> And yes, will Gluster do it by default or I have to configure it to do so?
>
>
> At the bottom line, what is the good practice for using GlusterFS in small
> pods for enterprises?
>
>
> You opinion/feedback will be really appreciated!
>
> Moacir
>
>
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] Good practices

2017-08-06 Thread Moacir Ferreira
I am willing to assemble an oVirt "pod", made of 3 servers, each with 2 CPU 
sockets of 12 cores, 256GB RAM, 7 HDD 10K, 1 SSD. The idea is to use GlusterFS 
to provide HA for the VMs. The 3 servers have a dual 40Gb NIC and a dual 10Gb 
NIC. So my intention is to create a loop like a server triangle using the 40Gb 
NICs for virtualization files (VMs .qcow2) access and to move VMs around the 
pod (east /west traffic) while using the 10Gb interfaces for giving services to 
the outside world (north/south traffic).


This said, my first question is: How should I deploy GlusterFS in such oVirt 
scenario? My questions are:


1 - Should I create 3 RAID arrays (e.g. RAID 5), one on each oVirt node, and 
then create a GlusterFS volume using them?

2 - Instead, should I create a JBOD array made of all server's disks?

3 - What is the best Gluster configuration to provide for HA while not 
consuming too much disk space?

4 - Does an oVirt hypervisor pod like the one I am planning to build, and the 
virtualization environment, benefit from tiering when using an SSD disk? And if 
so, will Gluster do it by default or do I have to configure it to do so?


Bottom line: what is the good practice for using GlusterFS in small pods 
for enterprises?


Your opinion/feedback will be really appreciated!

Moacir
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users