Re: [Gluster-users] Upgrading from 3.6.3 to 3.10/3.11

2017-08-08 Thread Diego Remolina
I cannot speak to an interim version. I went from 3.6.x to 3.7.x a
long time ago and it was a disaster. Many samba crashes and core dumps
scared me, so I rolled back to the 3.6.x series and stayed there until I
upgraded to 3.10.2.

I never tried 3.8.x so I cannot speak to it, other than knowing it is
what Red Hat considers stable on their supported RHEL OS.

Diego


Re: [Gluster-users] Upgrading from 3.6.3 to 3.10/3.11

2017-08-08 Thread Brett Randall
Thanks Diego. This is invaluable information, appreciate it immensely. I had 
heard previously that you can always go back to previous Gluster binaries, but 
without understanding the data structures behind Gluster, I had no idea how 
safe that was. Backing up the lib folder makes perfect sense.

The performance gains we're specifically keen to get are the small-file
performance improvements introduced in 3.7. I feel that a lot of the complaints
we get are from people using apps that are slowly crawling massively deep
folders via SMB. I'm hoping that the improvements made in 3.7 have stayed
intact in 3.10! Otherwise, is there a generally accepted "fast and stable"
version earlier than 3.10 that we should be looking at as an interim step?

Brett



Re: [Gluster-users] Upgrading from 3.6.3 to 3.10/3.11

2017-08-08 Thread Diego Remolina
I had a mixed experience going from 3.6.6 to 3.10.2 on a two-server
setup. I have since upgraded to 3.10.3, but I still have a bad problem
with specific files (see CONS below).

PROS
- Back on a "supported" version.
- Windows roaming profiles (small file performance) improved
significantly via samba. This may be due to new tuning options added
(see my tuning options for the volume below):
Volume Name: export
Type: Replicate
Volume ID: ---snip---
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.0.1.7:/bricks/hdds/brick
Brick2: 10.0.1.6:/bricks/hdds/brick
Options Reconfigured:
performance.stat-prefetch: on
performance.cache-min-file-size: 0
network.inode-lru-limit: 65536
performance.cache-invalidation: on
features.cache-invalidation: on
performance.md-cache-timeout: 600
features.cache-invalidation-timeout: 600
performance.cache-samba-metadata: on
transport.address-family: inet
server.allow-insecure: on
performance.cache-size: 10GB
cluster.server-quorum-type: server
nfs.disable: on
performance.io-thread-count: 64
performance.io-cache: on
cluster.lookup-optimize: on
cluster.readdir-optimize: on
server.event-threads: 5
client.event-threads: 5
performance.cache-max-file-size: 256MB
diagnostics.client-log-level: INFO
diagnostics.brick-log-level: INFO
cluster.server-quorum-ratio: 51%
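
To carry a set of tuning options like the one listed above from one volume to
another, the "Options Reconfigured" block can be turned into `gluster volume
set` commands. The following is only a sketch: the sample text is a shortened
excerpt of the listing above, the function names are made up for illustration,
and the assumption that cluster.server-quorum-ratio is set on "all" rather than
on a single volume should be checked against the documentation for your
release. It only prints commands and runs nothing.

```python
import shlex

# Shortened excerpt of the `gluster volume info` listing above (illustrative).
VOLUME_INFO = """\
Volume Name: export
Type: Replicate
Bricks:
Brick1: 10.0.1.7:/bricks/hdds/brick
Options Reconfigured:
performance.stat-prefetch: on
performance.md-cache-timeout: 600
performance.cache-size: 10GB
cluster.server-quorum-ratio: 51%
"""

# Assumption: options treated as cluster-wide are set on "all" instead of on
# one volume. Only the quorum ratio is handled here.
CLUSTER_WIDE = {"cluster.server-quorum-ratio"}


def parse_volume_info(text):
    """Return (volume_name, {option: value}) from `gluster volume info` text."""
    name = None
    options = {}
    in_options = False
    for raw in text.splitlines():
        key, sep, value = raw.strip().partition(":")
        if not sep:
            continue  # blank line or a line without "key: value" shape
        key, value = key.strip(), value.strip()
        if key == "Volume Name":
            name = value
            in_options = False
        elif key == "Options Reconfigured":
            in_options = True
        elif in_options:
            options[key] = value
    return name, options


def set_commands(name, options):
    """Build one `gluster volume set` command line per option."""
    commands = []
    for key, value in options.items():
        target = "all" if key in CLUSTER_WIDE else name
        parts = ["gluster", "volume", "set", target, key, value]
        commands.append(" ".join(shlex.quote(p) for p in parts))
    return commands


if __name__ == "__main__":
    volume, opts = parse_volume_info(VOLUME_INFO)
    for command in set_commands(volume, opts):
        print(command)
```

Reviewing the printed commands by hand before running any of them keeps the
step safe on a production volume.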

CONS
- New problems came up with specific files (Autodesk Revit files) for
which no solution has been found, other than to stop using the samba vfs
gluster plugin and to play some stupid file renaming game. See:
http://lists.gluster.org/pipermail/gluster-users/2017-June/031377.html
- With 3.6.6 I had a nightly rsync process that would copy all the
data from the gluster server pair to another server (nightly backup).
This operation used to finish between 1-2AM every day. After upgrade,
this operation is much slower with rsync finishing up between 3-5AM.
- I have not looked into it much, but about 40 days after the
upgrade, the gluster mount on one server became stuck and I had to
reboot the servers.

As for recommendations, definitely do *not* go with 3.11 as that is
*not* a long-term release. Stay with 3.10.
https://www.gluster.org/community/release-schedule/

Make sure you have the 3.6.3 rpms available to downgrade if needed.
You can always go back to the previous rpms if you have them available
(this is not easy if you have a mix with other distros, e.g. Ubuntu,
where the PPA only has the latest .deb file for each minor version).

You must schedule downtime and bring the whole gluster cluster down for the
upgrade. Upgrade all servers, then the clients, then test, test, test and
test more (I did not notice my Revit file problem until users brought
it to my attention). If things are going well in your testing, then
you should do the op version upgrade, but not before committing to
staying with 3.10. It is true that you can lower the op version later,
but then you have to manually edit several files on each
server, so I say, stay with the *older* op version until you are sure
you want to stay on 3.10, then upgrade the op version.

https://gluster.readthedocs.io/en/latest/Upgrade-Guide/op_version/
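
The "hold the op version until you are committed" rule can be written down as a
small check. This is a sketch under assumptions, not an official procedure: it
assumes the current op version is recorded as an `operating-version=` line in
/var/lib/glusterd/glusterd.info, the sample contents and the numbers 30600 and
31000 are placeholders to be replaced with the values from the op-version guide
linked above, and the function names are made up. It only builds the command
string and runs nothing.

```python
# Placeholder contents of /var/lib/glusterd/glusterd.info. The UUID and the
# version number are illustrative, not taken from a real server.
SAMPLE_GLUSTERD_INFO = """\
UUID=00000000-0000-0000-0000-000000000000
operating-version=30600
"""


def read_op_version(text):
    """Extract the integer op version from glusterd.info-style text."""
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "operating-version":
            return int(value.strip())
    raise ValueError("operating-version not found")


def bump_command(current, target, committed):
    """Return the op-version bump command, or None if no bump should happen.

    The bump is withheld until the admin has committed to the new release,
    because lowering the op version afterwards means hand-editing files.
    """
    if not committed or target <= current:
        return None
    return f"gluster volume set all cluster.op-version {target}"


if __name__ == "__main__":
    current = read_op_version(SAMPLE_GLUSTERD_INFO)
    # Still testing: nothing is printed to run.
    print(bump_command(current, 31000, committed=False))
    # Committed to the new release: the command to run once on one server.
    print(bump_command(current, 31000, committed=True))
```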

Prior to any changes, back up all your gluster server configuration
folders ( /var/lib/glusterd/ ) on every single server. That will allow
you to go back to the moment before the upgrade if really needed.
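
One way to take that backup is a timestamped tarball per server. The sketch
below uses only the Python standard library. The destination directory and the
function name are hypothetical, and it would need to be run as root on each
server with glusterd stopped.

```python
import tarfile
import time
from pathlib import Path


def backup_config(src="/var/lib/glusterd", dest_dir="/root/gluster-backups"):
    """Archive the glusterd configuration directory into a .tar.gz file.

    dest_dir is a hypothetical location; any path outside src will do.
    Returns the path of the archive that was written.
    """
    src_path = Path(src)
    if not src_path.is_dir():
        raise FileNotFoundError(f"{src} is not a directory")
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = dest / f"{src_path.name}-{stamp}.tar.gz"
    with tarfile.open(str(archive), "w:gz") as tar:
        # arcname keeps paths in the archive relative ("glusterd/...").
        tar.add(str(src_path), arcname=src_path.name)
    return archive
```

Calling `backup_config()` with its defaults writes something like
/root/gluster-backups/glusterd-<timestamp>.tar.gz. Copying that file off the
server as well guards against losing the backup together with the node.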

HTH,

Diego



___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] Upgrading from 3.6.3 to 3.10/3.11

2017-08-08 Thread Brett Randall
Hi all

We have a 20-node, 1 PB Gluster deployment that is running 3.6.3 - the same
version we installed on day 1. There are obviously numerous performance and
feature improvements that we'd like to take advantage of. However, this is
a production system and we don't have a replica of it that we can test the
upgrade on.

We're running CentOS 6.6 with official Gluster binaries. We rely on
Gluster's NFS daemon, and also use samba-glusterfs with samba for SMB
access to our Gluster volume.

What risks might we face with an upgrade from 3.6 to 3.10/3.11? And what
rollback options do we have?

More importantly, is there anyone who would be willing to work for a
retainer plus worked hours to be "on call" in case we have problems during
the upgrade? Someone with plenty of experience in Gluster over the years
who could diagnose any issues we may experience in an upgrade. If you're
interested, please e-mail me off-list. I'm, of course, interested in advice
on-list as well.

Thanks

Brett.