Re: [Gluster-users] [ovirt-users] Re: Announcing Gluster release 5.5

2019-03-31 Thread Krutika Dhananjay
Adding back gluster-users
Comments inline ...

On Fri, Mar 29, 2019 at 8:11 PM Olaf Buitelaar 
wrote:

> Dear Krutika,
>
>
>
> 1. I’ve made 2 profile runs of around 10 minutes (see files
> profile_data.txt and profile_data2.txt). Looking at them, most time seems to be
> spent in the FSYNC and READDIRP fops.
>
> Unfortunately I don’t have the profile info for the 3.12.15 version, so it’s
> a bit hard to compare.
>
> One additional thing I do notice: on 1 machine (10.32.9.5) the iowait time
> increased a lot, from an average below 1% to around 12% after
> the upgrade.
>
> So my first suspicion would be that lightning strikes twice and I now also
> have a bad disk, but that doesn’t appear to be the case, since all SMART
> statuses report ok.
>
> Also dd shows performance I would more or less expect;
>
> dd if=/dev/zero of=/data/test_file  bs=100M count=1  oflag=dsync
>
> 1+0 records in
>
> 1+0 records out
>
> 104857600 bytes (105 MB) copied, 0.686088 s, 153 MB/s
>
> dd if=/dev/zero of=/data/test_file  bs=1G count=1  oflag=dsync
>
> 1+0 records in
>
> 1+0 records out
>
> 1073741824 bytes (1.1 GB) copied, 7.61138 s, 141 MB/s
>
> dd if=/dev/urandom of=/data/test_file  bs=1024 count=100
>
> 100+0 records in
>
> 100+0 records out
>
> 102400 bytes (1.0 GB) copied, 6.35051 s, 161 MB/s
>
> dd if=/dev/zero of=/data/test_file  bs=1024 count=100
>
> 100+0 records in
>
> 100+0 records out
>
> 102400 bytes (1.0 GB) copied, 1.6899 s, 606 MB/s
>
> When I disable this brick (service glusterd stop; pkill glusterfsd),
> performance in gluster is better, but not on par with what it was. Also the
> cpu usage on the “neighbor” nodes which host the other bricks in the same
> subvolume increases quite a lot in this case, which I wouldn’t expect
> actually since they shouldn't handle much more work, except flagging shards
> to heal. Iowait also goes to idle once gluster is stopped, so it’s for
> sure gluster which waits for io.
>
>
>

So I see that FSYNC %-latency is on the higher side. And I also noticed you
don't have direct-io options enabled on the volume.
Could you set the following options on the volume -
# gluster volume set  network.remote-dio off
# gluster volume set  performance.strict-o-direct on
and also disable choose-local
# gluster volume set  cluster.choose-local off

Let me know if this helps.
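
For reference, a quick way to confirm the options took effect once set
(a minimal sketch; VOLNAME stands for your volume's name):
# gluster volume get VOLNAME network.remote-dio
# gluster volume get VOLNAME performance.strict-o-direct
# gluster volume get VOLNAME cluster.choose-local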

2. I’ve attached the mnt log and volume info, but I couldn’t find anything
> relevant in those logs. I think this is because we run the VMs with
> libgfapi;
>
> [root@ovirt-host-01 ~]# engine-config  -g LibgfApiSupported
>
> LibgfApiSupported: true version: 4.2
>
> LibgfApiSupported: true version: 4.1
>
> LibgfApiSupported: true version: 4.3
>
> And I can confirm the qemu process is invoked with the gluster:// address
> for the images.
>
> The message is logged in the /var/lib/libvirt/qemu/  file, which
> I’ve also included. For a sample case see around; 2019-03-28 20:20:07
>
> Which has the error; E [MSGID: 133010]
> [shard.c:2294:shard_common_lookup_shards_cbk] 0-ovirt-kube-shard: Lookup on
> shard 109886 failed. Base file gfid = a38d64bc-a28b-4ee1-a0bb-f919e7a1022c
> [Stale file handle]
>

Could you also attach the brick logs for this volume?
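
For reference, the brick logs live on each brick host under
/var/log/glusterfs/bricks/, named after the brick path (the exact -l path is
visible in the glusterfsd command line further below). A rough sketch for
pulling the entries around that stale-handle error -- the gfid is taken from
the message above, the wildcard log path is an assumption:
# grep -B2 -A2 'a38d64bc-a28b-4ee1-a0bb-f919e7a1022c' /var/log/glusterfs/bricks/*.log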


>
> 3. Yes, I see multiple instances for the same brick directory, like:
>
> /usr/sbin/glusterfsd -s 10.32.9.6 --volfile-id
> ovirt-core.10.32.9.6.data-gfs-bricks-brick1-ovirt-core -p
> /var/run/gluster/vols/ovirt-core/10.32.9.6-data-gfs-bricks-brick1-ovirt-core.pid
> -S /var/run/gluster/452591c9165945d9.socket --brick-name
> /data/gfs/bricks/brick1/ovirt-core -l
> /var/log/glusterfs/bricks/data-gfs-bricks-brick1-ovirt-core.log
> --xlator-option *-posix.glusterd-uuid=fb513da6-f3bd-4571-b8a2-db5efaf60cc1
> --process-name brick --brick-port 49154 --xlator-option
> ovirt-core-server.listen-port=49154
>
>
>
> I’ve made an export of the output of ps from the time I observed these
> multiple processes.
>
> In addition to the brick_mux bug noted by Atin, I might also have another
> possible cause: as ovirt moves nodes from non-operational state or
> maintenance state to active/activating, it also seems to restart gluster.
> However, I don’t have direct proof for this theory.
>
>
>

+Atin Mukherjee  ^^
+Mohit Agrawal   ^^
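
(For the duplicate brick processes in point 3, a quick way to spot them is to
count glusterfsd instances per --brick-name -- a rough sketch, not specific to
your layout; any count above 1 means more than one process serves the same
brick directory:)
# ps -ef | grep '[g]lusterfsd' | grep -o -- '--brick-name [^ ]*' | sort | uniq -c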

-Krutika

Thanks Olaf
>
> On Fri, 29 Mar 2019 at 10:03, Sandro Bonazzola wrote:
>
>>
>>
>> On Thu, 28 Mar 2019 at 17:48,  wrote:
>>
>>> Dear All,
>>>
>>> I wanted to share my experience upgrading from 4.2.8 to 4.3.1. While
>>> previous upgrades from 4.1 to 4.2 etc. went rather smoothly, this one was a
>>> different experience. After first trying a test upgrade on a 3 node setup,
>>> which went fine, I headed to upgrade the 9 node production platform,
>>> unaware of the backward compatibility issues between gluster 3.12.15 ->
>>> 5.3. After upgrading 2 nodes, the HA engine stopped and wouldn't start.
>>> Vdsm wasn't able to mount the engine storage domain, since /dom_md/metadata
>>> was missing or couldn't be accessed. 

Re: [Gluster-users] upgrade best practices

2019-03-31 Thread Hari Gowtham
Hi,

As mentioned above, you need not stop the whole cluster and then
upgrade and restart the gluster processes.
We did the basic rolling upgrade test
with a replica volume, and things turned out fine.
There was this minor issue:
https://bugzilla.redhat.com/show_bug.cgi?id=1694010

To overcome this, you will have to check if your upgraded node is
getting disconnected. If it does, then you will have to (see the sketch below)
1) stop the glusterd service on all the nodes (only glusterd)
2) flush the iptables (iptables -F)
3) start glusterd
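
A minimal sketch of that workaround, assuming a systemd-based distribution
(only the management daemon is touched; brick and client processes keep running):

on every node:
# systemctl stop glusterd
# iptables -F
# systemctl start glusterd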

If you are fine with stopping your service and upgrading all nodes at
the same time, you can go ahead with that as well.
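
Whichever route you take, once all nodes are upgraded remember to bump the
cluster op-version so the new features become usable. A minimal sketch -- the
value 60000 here is an assumption for glusterfs-6; check the release notes for
the exact number:
# gluster volume get all cluster.max-op-version
# gluster volume set all cluster.op-version 60000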

On Sun, Mar 31, 2019 at 11:02 PM Soumya Koduri  wrote:
>
>
>
> On 3/29/19 10:39 PM, Poornima Gurusiddaiah wrote:
> >
> >
> > On Fri, Mar 29, 2019, 10:03 PM Jim Kinney  > > wrote:
> >
> > Currently running 3.12 on Centos 7.6. Doing cleanups on split-brain
> > and out-of-sync files that need healing.
> >
> > We need to migrate the three replica servers to gluster v. 5 or 6.
> > Also will need to upgrade about 80 clients as well. Given that a
> > complete removal of gluster will not touch the 200+TB of data on 12
> > volumes, we are looking at doing that process: stop all clients,
> > stop all glusterd services, remove all of it, install new version,
> > setup new volumes from old bricks, install new clients, mount
> > everything.
> >
> > We would like to get some better performance from nfs-ganesha mounts
> > but that doesn't look like an option (not done any parameter tweaks
> > in testing yet). At a bare minimum, we would like to minimize the
> > total downtime of all systems.
>
> Could you please be more specific here? As in, are you looking for better
> performance during the upgrade process or in general? Compared to 3.12,
> there are a lot of perf improvements done in both glusterfs and esp.
> nfs-ganesha (latest stable - V2.7.x) stack. If you could provide more
> information about your workloads (for eg., large-file, small-file,
> metadata-intensive), we can make some recommendations wrt configuration.
>
> Thanks,
> Soumya
>
> >
> > Does this process make more sense than a version upgrade process to
> > 4.1, then 5, then 6? What "gotcha's" do I need to be ready for? I
> > have until late May to prep and test on old, slow hardware with a
> > small amount of files and volumes.
> >
> >
> > You can directly upgrade from 3.12 to 6.x. I would suggest that rather
> > than deleting and recreating the Gluster volumes. +Hari and +Sanju for further
> > guidelines on upgrade, as they recently did upgrade tests. +Soumya to
> > add to the nfs-ganesha aspect.
> >
> > Regards,
> > Poornima
> >
> > --
> >
> > James P. Kinney III
> >
> > Every time you stop a school, you will have to build a jail. What you
> > gain at one end you lose at the other. It's like feeding a dog on his
> > own tail. It won't fatten the dog.
> > - Speech 11/23/1900 Mark Twain
> >
> > http://heretothereideas.blogspot.com/
> >
> > ___
> > Gluster-users mailing list
> > Gluster-users@gluster.org 
> > https://lists.gluster.org/mailman/listinfo/gluster-users
> >



-- 
Regards,
Hari Gowtham.
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] [Gluster-devel] Upgrade testing to gluster 6

2019-03-31 Thread Hari Gowtham
Comments inline.

On Mon, Apr 1, 2019 at 5:55 AM Sankarshan Mukhopadhyay
 wrote:
>
> Quite a considerable amount of detail here. Thank you!
>
> On Fri, Mar 29, 2019 at 11:42 AM Hari Gowtham  wrote:
> >
> > Hello Gluster users,
> >
> > As you are all aware, glusterfs-6 is out. We would like to inform you
> > that we have spent a significant amount of time testing
> > glusterfs-6 in upgrade scenarios. We have done upgrade testing to
> > glusterfs-6 from various releases like 3.12, 4.1 and 5.3.
> >
> > As glusterfs-6 has brought in a lot of changes, we wanted to test those
> > portions.
> > There were xlators (and respective options to enable/disable them)
> > added and deprecated in glusterfs-6 from various versions [1].
> >
> > We had to check the following upgrade scenarios for all such options
> > Identified in [1]:
> > 1) option never enabled and upgraded
> > 2) option enabled and then upgraded
> > 3) option enabled and then disabled and then upgraded
> >
> > We weren't able to manually check all the combinations for all the options,
> > so the options involving enabling and disabling xlators were prioritized.
> > Below are the results of the ones tested.
> >
> > Never enabled and upgraded:
> > Checked from 3.12, 4.1 and 5.3 to 6; the upgrade works.
> >
> > Enabled and upgraded:
> > Tested for tier, which is deprecated. It is not a recommended upgrade.
> > As expected the volume won't be consumable and will have a few more
> > issues as well.
> > Tested with 3.12, 4.1 and 5.3 to 6 upgrade.
> >
> > Enabled, then disabled before upgrade:
> > Tested for tier with 3.12 and the upgrade went fine.
> >
> > There is one common issue to note in every upgrade. The node being
> > upgraded goes into disconnected state. You have to flush the iptables
> > and then restart glusterd on all nodes to fix this.
> >
>
> Is this something that is written in the upgrade notes? I do not seem
> to recall, if not, I'll send a PR

No, this wasn't mentioned in the release notes. PRs are welcome.

>
> > The testing for enabling new options is still pending. The new options
> > won't cause as many issues as the deprecated ones, so this was put at
> > the end of the priority list. It would be nice to get contributions
> > for this.
> >
>
> Did the range of tests lead to any new issues?

Yes. In the first round of testing we found an issue and had to postpone the
release of 6 until the fix was made available.
https://bugzilla.redhat.com/show_bug.cgi?id=1684029

And then we tested it again after this patch was made available,
and came across this:
https://bugzilla.redhat.com/show_bug.cgi?id=1694010

I have mentioned in the second mail how to overcome this situation
for now, until the fix is available.
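
(A quick way to notice the problem after upgrading a node -- a rough sketch;
the exact output format may differ by version:)
# gluster peer status
If the other nodes show the upgraded peer as "State: Peer in Cluster
(Disconnected)", apply the glusterd-stop / iptables -F / glusterd-start
workaround described in the other mail.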

>
> > For the disable testing, tier was used as it covers most of the xlators
> > that were removed. And all of these tests were done on a replica 3 volume.
> >
>
> I'm not sure if the Glusto team is reading this, but it would be
> pertinent to understand if the approach you have taken can be
> converted into a form of automated testing pre-release.

I don't have an answer for this; I have CCed Vijay.
He might have an idea.

>
> > Note: This is only for upgrade testing of the newly added and removed
> > xlators. Does not involve the normal tests for the xlator.
> >
> > If you have any questions, please feel free to reach us.
> >
> > [1] 
> > https://docs.google.com/spreadsheets/d/1nh7T5AXaV6kc5KgILOy2pEqjzC3t_R47f1XUXSVFetI/edit?usp=sharing
> >
> > Regards,
> > Hari and Sanju.
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users



-- 
Regards,
Hari Gowtham.
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] [Gluster-devel] Upgrade testing to gluster 6

2019-03-31 Thread Sankarshan Mukhopadhyay
Quite a considerable amount of detail here. Thank you!

On Fri, Mar 29, 2019 at 11:42 AM Hari Gowtham  wrote:
>
> Hello Gluster users,
>
> As you are all aware, glusterfs-6 is out. We would like to inform you
> that we have spent a significant amount of time testing
> glusterfs-6 in upgrade scenarios. We have done upgrade testing to
> glusterfs-6 from various releases like 3.12, 4.1 and 5.3.
>
> As glusterfs-6 has brought in a lot of changes, we wanted to test those portions.
> There were xlators (and respective options to enable/disable them)
> added and deprecated in glusterfs-6 from various versions [1].
>
> We had to check the following upgrade scenarios for all such options
> Identified in [1]:
> 1) option never enabled and upgraded
> 2) option enabled and then upgraded
> 3) option enabled and then disabled and then upgraded
>
> We weren't able to manually check all the combinations for all the options,
> so the options involving enabling and disabling xlators were prioritized.
> Below are the results of the ones tested.
>
> Never enabled and upgraded:
> Checked from 3.12, 4.1 and 5.3 to 6; the upgrade works.
>
> Enabled and upgraded:
> Tested for tier, which is deprecated. It is not a recommended upgrade.
> As expected the volume won't be consumable and will have a few more
> issues as well.
> Tested with 3.12, 4.1 and 5.3 to 6 upgrade.
>
> Enabled, then disabled before upgrade:
> Tested for tier with 3.12 and the upgrade went fine.
>
> There is one common issue to note in every upgrade. The node being
> upgraded goes into disconnected state. You have to flush the iptables
> and then restart glusterd on all nodes to fix this.
>

Is this something that is written in the upgrade notes? I do not seem
to recall, if not, I'll send a PR

> The testing for enabling new options is still pending. The new options
> won't cause as many issues as the deprecated ones, so this was put at
> the end of the priority list. It would be nice to get contributions
> for this.
>

Did the range of tests lead to any new issues?

> For the disable testing, tier was used as it covers most of the xlators
> that were removed. And all of these tests were done on a replica 3 volume.
>

I'm not sure if the Glusto team is reading this, but it would be
pertinent to understand if the approach you have taken can be
converted into a form of automated testing pre-release.

> Note: This is only for upgrade testing of the newly added and removed
> xlators. Does not involve the normal tests for the xlator.
>
> If you have any questions, please feel free to reach us.
>
> [1] 
> https://docs.google.com/spreadsheets/d/1nh7T5AXaV6kc5KgILOy2pEqjzC3t_R47f1XUXSVFetI/edit?usp=sharing
>
> Regards,
> Hari and Sanju.
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] Gluster Monthly Newsletter, March 2019

2019-03-31 Thread Amye Scavarda
Gluster Monthly Newsletter, March 2019
Congratulations to the team for getting Gluster 6 released!
https://www.gluster.org/announcing-gluster-6/
https://lists.gluster.org/pipermail/gluster-users/2019-March/036144.html

Our retrospective survey is open through April 8th; give us feedback
on what we should start, stop, or continue!
https://lists.gluster.org/pipermail/gluster-users/2019-March/036144.html

Gluster 7 Roadmap
Discussion has kicked off for our Gluster 7 roadmap on the mailing lists; see
[Gluster-users] GlusterFS v7.0 (and v8.0) roadmap discussion
https://lists.gluster.org/pipermail/gluster-users/2019-March/036139.html
for more details.

Gluster Friday Five:
See all of our Friday Five casts at
https://www.youtube.com/user/GlusterCommunity

Contributors
Top Contributing Companies:  Red Hat, Comcast, DataLab, Gentoo Linux,
Facebook, BioDec, Samsung, Etersoft

Top Contributors in February: Yaniv Kaul, Pranith Kumar Karampuri,
Aravinda VK, Ravishankar N

Noteworthy Threads:
[Gluster-users] Gluster : Improvements on "heal info" command
https://lists.gluster.org/pipermail/gluster-users/2019-March/035955.html
[Gluster-users] Announcing Gluster release 5.5
https://lists.gluster.org/pipermail/gluster-users/2019-March/036098.html
[Gluster-users] GlusterFS v7.0 (and v8.0) roadmap discussion
https://lists.gluster.org/pipermail/gluster-users/2019-March/036139.html
[Gluster-users] Proposal: Changes in Gluster Community meetings
https://lists.gluster.org/pipermail/gluster-users/2019-March/036140.html
[Gluster-users] Help: gluster-block
https://lists.gluster.org/pipermail/gluster-users/2019-March/036147.html
[Gluster-users] POSIX locks and disconnections between clients and bricks
https://lists.gluster.org/pipermail/gluster-users/2019-March/036161.html
[Gluster-users] [Gluster-infra] Gluster HA
https://lists.gluster.org/pipermail/gluster-users/2019-March/036200.html
[Gluster-users] [Event CfP Announce] DevConf events India and US in
the month of August 2019
https://lists.gluster.org/pipermail/gluster-users/2019-March/036211.html
[Gluster-users] Upgrade testing to gluster 6
https://lists.gluster.org/pipermail/gluster-users/2019-March/036214.html
[Gluster-users] Quick update on glusterd's volume scalability improvements
https://lists.gluster.org/pipermail/gluster-users/2019-March/036219.html
[Gluster-devel] [Gluster-infra] 8/10 AWS jenkins builders disconnected
https://lists.gluster.org/pipermail/gluster-devel/2019-March/055906.html
[Gluster-devel] [Gluster-users] Experiences with FUSE in real world -
Presentation at Vault 2019
https://lists.gluster.org/pipermail/gluster-devel/2019-March/055944.html
[Gluster-devel] [Gluster-infra] Upgrading build.gluster.org
https://lists.gluster.org/pipermail/gluster-devel/2019-March/055912.html
[Gluster-devel] Github#268 Compatibility with Alpine Linux
https://lists.gluster.org/pipermail/gluster-devel/2019-March/055921.html
[Gluster-devel] GF_CALLOC to GF_MALLOC conversion - is it safe?
https://lists.gluster.org/pipermail/gluster-devel/2019-March/055969.html
[Gluster-devel] Issue with posix locks
https://lists.gluster.org/pipermail/gluster-devel/2019-March/056027.html

Events:
Red Hat Summit, May 4-6, 2019 - https://www.redhat.com/en/summit/2019
Open Source Summit and KubeCon + CloudNativeCon Shanghai, June 24-26,
2019 https://www.lfasiallc.com/events/kubecon-cloudnativecon-china-2019/
DevConf India, August 2-3, 2019, Bengaluru - https://devconf.info/in
DevConf USA, August 15-17, 2019, Boston -  https://devconf.info/us/



-- 
Amye Scavarda | a...@redhat.com | Gluster Community Lead
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] upgrade best practices

2019-03-31 Thread Soumya Koduri




On 3/29/19 10:39 PM, Poornima Gurusiddaiah wrote:



On Fri, Mar 29, 2019, 10:03 PM Jim Kinney wrote:


Currently running 3.12 on Centos 7.6. Doing cleanups on split-brain
and out-of-sync files that need healing.

We need to migrate the three replica servers to gluster v. 5 or 6.
Also will need to upgrade about 80 clients as well. Given that a
complete removal of gluster will not touch the 200+TB of data on 12
volumes, we are looking at doing that process: stop all clients,
stop all glusterd services, remove all of it, install new version,
setup new volumes from old bricks, install new clients, mount
everything.

We would like to get some better performance from nfs-ganesha mounts
but that doesn't look like an option (not done any parameter tweaks
in testing yet). At a bare minimum, we would like to minimize the
total downtime of all systems.


Could you please be more specific here? As in, are you looking for better
performance during the upgrade process or in general? Compared to 3.12,
there are a lot of perf improvements done in both glusterfs and esp.
nfs-ganesha (latest stable - V2.7.x) stack. If you could provide more
information about your workloads (for eg., large-file, small-file,
metadata-intensive), we can make some recommendations wrt configuration.
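
(One low-effort way to characterize the workload is the built-in profiler --
a rough sketch, assuming a volume named VOLNAME and a representative window of
normal activity:)
# gluster volume profile VOLNAME start
  ... let the usual workload run for a while ...
# gluster volume profile VOLNAME info
# gluster volume profile VOLNAME stop
The fop mix and block-size histogram in the output give a first indication of
whether the load is large-file, small-file or metadata-heavy.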


Thanks,
Soumya



Does this process make more sense than a version upgrade process to
4.1, then 5, then 6? What "gotcha's" do I need to be ready for? I
have until late May to prep and test on old, slow hardware with a
small amount of files and volumes.


You can directly upgrade from 3.12 to 6.x. I would suggest that rather
than deleting and recreating the Gluster volumes. +Hari and +Sanju for further
guidelines on upgrade, as they recently did upgrade tests. +Soumya to
add to the nfs-ganesha aspect.


Regards,
Poornima

-- 


James P. Kinney III

Every time you stop a school, you will have to build a jail. What you
gain at one end you lose at the other. It's like feeding a dog on his
own tail. It won't fatten the dog.
- Speech 11/23/1900 Mark Twain

http://heretothereideas.blogspot.com/

___
Gluster-users mailing list
Gluster-users@gluster.org 
https://lists.gluster.org/mailman/listinfo/gluster-users


___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users