Re: [Gluster-Maintainers] Release 4.0: Unable to complete rolling upgrade tests

Anoop C S Thu, 01 Mar 2018 21:35:12 -0800

On Fri, 2018-03-02 at 10:11 +0530, Ravishankar N wrote:
> + Anoop.
> 
> It looks like clients on the old (3.12) nodes are not able to talk to 
> the upgraded (4.0) node. I see messages like these on the old clients:
> 
>   [2018-03-02 03:49:13.483458] W [MSGID: 114007] 
> [client-handshake.c:1197:client_setvolume_cbk] 0-testvol-client-2: 
> failed to find key 'clnt-lk-version' in the options


It seems we also need to set clnt-lk-version from the server side, similar to
what we did for the client via
https://review.gluster.org/#/c/19560/. Can you try the attached patch?
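
To illustrate the idea, here is a minimal, self-contained sketch in C. It is
not GlusterFS code: the toy_* names, the fixed-size dictionary and the
"some-other-option" key are all hypothetical stand-ins, and it assumes that an
older client rejects a SETVOLUME reply that lacks 'clnt-lk-version'. It only
shows why inserting a dummy value on the server side lets such a lookup
succeed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for a reply dictionary (hypothetical, not the real dict_t). */
#define TOY_MAX_PAIRS 8

struct toy_pair {
    const char *key;
    uint32_t value;
};

struct toy_dict {
    struct toy_pair pairs[TOY_MAX_PAIRS];
    size_t count;
};

/* Store or overwrite a key. Returns 0 on success, -1 if the dict is full. */
static int toy_dict_set_uint32(struct toy_dict *d, const char *key,
                               uint32_t value)
{
    for (size_t i = 0; i < d->count; i++) {
        if (strcmp(d->pairs[i].key, key) == 0) {
            d->pairs[i].value = value;
            return 0;
        }
    }
    if (d->count == TOY_MAX_PAIRS)
        return -1;
    d->pairs[d->count].key = key;
    d->pairs[d->count].value = value;
    d->count++;
    return 0;
}

/* Look up a key. Returns 0 and fills *out if found, -1 otherwise. */
static int toy_dict_get_uint32(const struct toy_dict *d, const char *key,
                               uint32_t *out)
{
    for (size_t i = 0; i < d->count; i++) {
        if (strcmp(d->pairs[i].key, key) == 0) {
            *out = d->pairs[i].value;
            return 0;
        }
    }
    return -1;
}

/* Assumed behaviour of an older client: the handshake is accepted only if
 * the reply carries 'clnt-lk-version'. */
static bool toy_old_client_accepts(const struct toy_dict *reply)
{
    uint32_t lk_version;
    return toy_dict_get_uint32(reply, "clnt-lk-version", &lk_version) == 0;
}

/* Build a server reply, optionally adding the dummy 'clnt-lk-version'. */
static void toy_build_reply(struct toy_dict *reply, bool add_dummy_lk_version)
{
    int ret;

    reply->count = 0;
    ret = toy_dict_set_uint32(reply, "some-other-option", 1);
    assert(ret == 0);
    if (add_dummy_lk_version) {
        ret = toy_dict_set_uint32(reply, "clnt-lk-version", 0);
        assert(ret == 0);
    }
}
```

Calling toy_build_reply() with the flag unset models a newer server without
the change, and toy_old_client_accepts() then returns false. With the flag set
the lookup succeeds, even though the value itself is only a placeholder.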

> Is there something more to be done on BZ 1544366?
> 
> -Ravi
> On 03/02/2018 08:44 AM, Ravishankar N wrote:
> > 
> > On 03/02/2018 07:26 AM, Shyam Ranganathan wrote:
> > > Hi Pranith/Ravi,
> > > 
> > > > To keep a long story short: after upgrading 1 node in a 3-node 3.13
> > > > cluster, self-heal is not able to catch up with the heal backlog. This
> > > > is a very simple synthetic test, but the end result is that upgrade
> > > > testing is failing.
> > 
> > > Let me try this now and get back. I had done something similar when 
> > > testing the FIPS patch and the rolling upgrade had worked.
> > Thanks,
> > Ravi
> > > 
> > > Here are the details,
> > > 
> > > - Using
> > > https://hackmd.io/GYIwTADCDsDMCGBaArAUxAY0QFhBAbIgJwCMySIwJmAJvGMBvNEA#
> > > > I set up 3 server containers to install 3.13 first, as follows (within
> > > > the containers)
> > > 
> > > (inside the 3 server containers)
> > > yum -y update; yum -y install centos-release-gluster313; yum install
> > > glusterfs-server; glusterd
> > > 
> > > (inside centos-glfs-server1)
> > > gluster peer probe centos-glfs-server2
> > > gluster peer probe centos-glfs-server3
> > > gluster peer status
> > > gluster v create patchy replica 3 centos-glfs-server1:/d/brick1
> > > centos-glfs-server2:/d/brick2 centos-glfs-server3:/d/brick3
> > > centos-glfs-server1:/d/brick4 centos-glfs-server2:/d/brick5
> > > centos-glfs-server3:/d/brick6 force
> > > gluster v start patchy
> > > gluster v status
> > > 
> > > Create a client container as per the document above, and mount the above
> > > volume and create 1 file, 1 directory and a file within that directory.
> > > 
> > > Now we start the upgrade process (as laid out for 3.13 here
> > > http://docs.gluster.org/en/latest/Upgrade-Guide/upgrade_to_3.13/ ):
> > > - killall glusterfs glusterfsd glusterd
> > > > - yum install
> > > > http://cbs.centos.org/kojifiles/work/tasks/1548/311548/centos-release-gluster40-0.9-1.el7.centos.x86_64.rpm
> > > 
> > > - yum upgrade --enablerepo=centos-gluster40-test glusterfs-server
> > > 
> > > < Go back to the client and edit the contents of one of the files and
> > > change the permissions of a directory, so that there are things to heal
> > > when we bring up the newly upgraded server>
> > > 
> > > - gluster --version
> > > - glusterd
> > > - gluster v status
> > > - gluster v heal patchy
> > > 
> > > The above starts failing as follows,
> > > [root@centos-glfs-server1 /]# gluster v heal patchy
> > > Launching heal operation to perform index self heal on volume patchy has
> > > been unsuccessful:
> > > Commit failed on centos-glfs-server2.glfstest20. Please check log file
> > > for details.
> > > Commit failed on centos-glfs-server3. Please check log file for details.
> > > 
> > > > From here, if further files or directories are created from the client,
> > > > they just get added to the heal backlog, and heal does not catch up.
> > > > 
> > > > As is obvious, I cannot proceed, as the upgrade procedure is broken. The
> > > > issue itself may not be the self-heal daemon but something around
> > > > connections. Since the process fails here, I am looking to you guys to
> > > > unblock this as soon as possible, as we are already running a day's slip
> > > > in the release.
> > > 
> > > Thanks,
> > > Shyam
> 
>

From 081f8cfeb6b1947d281125fab718a9478d775574 Mon Sep 17 00:00:00 2001
From: Anoop C S <anoo...@redhat.com>
Date: Fri, 2 Mar 2018 10:32:17 +0530
Subject: [PATCH] protocol/server: Insert dummy clnt-lk-version to avoid
 upgrade failure

This is required as we check for 'clnt-lk-version' in SETVOLUME callback
with older clients in place against newer servers. Change is similar to
what we have done via https://review.gluster.org/#/c/19560/.

Change-Id: If333c20cf9503f40687ec926c44c7e50222c05b5
Signed-off-by: Anoop C S <anoo...@redhat.com>
---
 xlators/protocol/server/src/server-handshake.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/xlators/protocol/server/src/server-handshake.c b/xlators/protocol/server/src/server-handshake.c
index f6057da3b..de90a6b8e 100644
--- a/xlators/protocol/server/src/server-handshake.c
+++ b/xlators/protocol/server/src/server-handshake.c
@@ -838,6 +838,16 @@ server_setvolume (rpcsvc_request_t *req)
         if (ret)
                 gf_msg_debug (this->name, 0, "failed to set 'process-uuid'");
 
+        /* Insert a dummy key value pair to avoid failure at client side for
+         * clnt-lk-version with older clients.
+         */
+        ret = dict_set_uint32 (reply, "clnt-lk-version", 0);
+        if (ret) {
+               gf_msg (this->name, GF_LOG_WARNING, 0,
+                       PS_MSG_CLIENT_LK_VERSION_ERROR, "failed to set "
+                       "'clnt-lk-version'");
+        }
+
         ret = dict_set_uint64 (reply, "transport-ptr",
                                ((uint64_t) (long) req->trans));
         if (ret)
-- 
2.14.3

_______________________________________________
maintainers mailing list
maintainers@gluster.org
http://lists.gluster.org/mailman/listinfo/maintainers
