On Fri, 2018-03-02 at 10:11 +0530, Ravishankar N wrote:
> + Anoop.
>
> It looks like clients on the old (3.12) nodes are not able to talk to
> the upgraded (4.0) node. I see messages like these on the old clients:
>
> [2018-03-02 03:49:13.483458] W [MSGID: 114007]
> [client-handshake.c:1197:client_setvolume_cbk] 0-testvol-client-2:
> failed to find key 'clnt-lk-version' in the options

Seems like we need to set clnt-lk-version from the server side too, similar
to what we did for the client via https://review.gluster.org/#/c/19560/.
Can you try with the attached patch?

> Is there something more to be done on BZ 1544366?
>
> -Ravi
> On 03/02/2018 08:44 AM, Ravishankar N wrote:
> >
> > On 03/02/2018 07:26 AM, Shyam Ranganathan wrote:
> > > Hi Pranith/Ravi,
> > >
> > > So, to keep a long story short, post upgrading 1 node in a 3 node 3.13
> > > cluster, self-heal is not able to catch the heal backlog and this is a
> > > very simple synthetic test anyway, but the end result is that upgrade
> > > testing is failing.
> >
> > Let me try this now and get back. I had done something similar when
> > testing the FIPS patch and the rolling upgrade had worked.
> > Thanks,
> > Ravi
> > >
> > > Here are the details,
> > >
> > > - Using
> > > https://hackmd.io/GYIwTADCDsDMCGBaArAUxAY0QFhBAbIgJwCMySIwJmAJvGMBvNEA#
> > > I set up 3 server containers to install 3.13 first as follows (within the
> > > containers)
> > >
> > > (inside the 3 server containers)
> > > yum -y update; yum -y install centos-release-gluster313; yum install
> > > glusterfs-server; glusterd
> > >
> > > (inside centos-glfs-server1)
> > > gluster peer probe centos-glfs-server2
> > > gluster peer probe centos-glfs-server3
> > > gluster peer status
> > > gluster v create patchy replica 3 centos-glfs-server1:/d/brick1
> > > centos-glfs-server2:/d/brick2 centos-glfs-server3:/d/brick3
> > > centos-glfs-server1:/d/brick4 centos-glfs-server2:/d/brick5
> > > centos-glfs-server3:/d/brick6 force
> > > gluster v start patchy
> > > gluster v status
> > >
> > > Create a client container as per the document above, and mount the above
> > > volume and create 1 file, 1 directory and a file within that directory.
> > >
> > > Now we start the upgrade process (as laid out for 3.13 here
> > > http://docs.gluster.org/en/latest/Upgrade-Guide/upgrade_to_3.13/ ):
> > > - killall glusterfs glusterfsd glusterd
> > > - yum install
> > > http://cbs.centos.org/kojifiles/work/tasks/1548/311548/centos-release-gluster40-0.9-1.el7.centos.x86_64.rpm
> > >
> > > - yum upgrade --enablerepo=centos-gluster40-test glusterfs-server
> > >
> > > < Go back to the client and edit the contents of one of the files and
> > > change the permissions of a directory, so that there are things to heal
> > > when we bring up the newly upgraded server>
> > >
> > > - gluster --version
> > > - glusterd
> > > - gluster v status
> > > - gluster v heal patchy
> > >
> > > The above starts failing as follows,
> > > [root@centos-glfs-server1 /]# gluster v heal patchy
> > > Launching heal operation to perform index self heal on volume patchy has
> > > been unsuccessful:
> > > Commit failed on centos-glfs-server2.glfstest20. Please check log file
> > > for details.
> > > Commit failed on centos-glfs-server3. Please check log file for details.
> > >
> > > From here, if further files or directories are created from the client,
> > > they just get added to the heal backlog, and heal does not catch up.
> > >
> > > As is obvious, I cannot proceed, as the upgrade procedure is broken. The
> > > issue itself may not be the self-heal daemon, but something around
> > > connections, but as the process fails here, looking to you guys to
> > > unblock this as soon as possible, as we are already running a day's slip
> > > in the release.
> > >
> > > Thanks,
> > > Shyam
> >
From 081f8cfeb6b1947d281125fab718a9478d775574 Mon Sep 17 00:00:00 2001
From: Anoop C S <anoo...@redhat.com>
Date: Fri, 2 Mar 2018 10:32:17 +0530
Subject: [PATCH] protocol/server: Insert dummy clnt-lk-version to avoid upgrade
 failure

This is required as we check for 'clnt-lk-version' in SETVOLUME callback
with older clients in place against newer servers. Change is similar to
what we have done via https://review.gluster.org/#/c/19560/.

Change-Id: If333c20cf9503f40687ec926c44c7e50222c05b5
Signed-off-by: Anoop C S <anoo...@redhat.com>
---
 xlators/protocol/server/src/server-handshake.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/xlators/protocol/server/src/server-handshake.c b/xlators/protocol/server/src/server-handshake.c
index f6057da3b..de90a6b8e 100644
--- a/xlators/protocol/server/src/server-handshake.c
+++ b/xlators/protocol/server/src/server-handshake.c
@@ -838,6 +838,16 @@ server_setvolume (rpcsvc_request_t *req)
         if (ret)
                 gf_msg_debug (this->name, 0, "failed to set 'process-uuid'");
 
+        /* Insert a dummy key value pair to avoid failure at client side for
+         * clnt-lk-version with older clients.
+         */
+        ret = dict_set_uint32 (reply, "clnt-lk-version", 0);
+        if (ret) {
+                gf_msg (this->name, GF_LOG_WARNING, 0,
+                        PS_MSG_CLIENT_LK_VERSION_ERROR, "failed to set "
+                        "'clnt-lk-version'");
+        }
+
         ret = dict_set_uint64 (reply, "transport-ptr",
                                ((uint64_t) (long) req->trans));
         if (ret)
-- 
2.14.3
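
For anyone following the thread without the sources at hand, here is a rough
standalone sketch of the interaction the patch is meant to fix. It is not
GlusterFS code: struct reply_dict, reply_set_uint32, reply_get_uint32,
sketch_server_build_reply and sketch_old_client_accepts are all made-up names,
and the table is only a toy stand-in for the real reply dictionary. It also
assumes that an older client treats a missing 'clnt-lk-version' key as a
handshake failure, which is what the log message and the patch description
suggest.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the SETVOLUME reply dictionary. The real
 * GlusterFS dictionary type is far richer than this fixed-size table. */
#define REPLY_MAX_KEYS 8

struct reply_entry {
        const char *key;
        uint32_t    value;
};

struct reply_dict {
        struct reply_entry entries[REPLY_MAX_KEYS];
        size_t             count;
};

/* Store a key/value pair; returns 0 on success, -1 if the table is full. */
static int
reply_set_uint32 (struct reply_dict *reply, const char *key, uint32_t value)
{
        assert (reply != NULL && key != NULL);
        if (reply->count >= REPLY_MAX_KEYS)
                return -1;
        reply->entries[reply->count].key = key;
        reply->entries[reply->count].value = value;
        reply->count++;
        return 0;
}

/* Look up a key; returns 0 and fills *value if found, -1 otherwise. */
static int
reply_get_uint32 (const struct reply_dict *reply, const char *key,
                  uint32_t *value)
{
        assert (reply != NULL && key != NULL && value != NULL);
        for (size_t i = 0; i < reply->count; i++) {
                if (strcmp (reply->entries[i].key, key) == 0) {
                        *value = reply->entries[i].value;
                        return 0;
                }
        }
        return -1;
}

/* Toy model of a server building its handshake reply. When
 * send_dummy_lk_version is non-zero it adds the placeholder key with
 * value 0, mirroring what the patch does; otherwise the key is absent. */
static int
sketch_server_build_reply (struct reply_dict *reply, int send_dummy_lk_version)
{
        assert (reply != NULL);
        memset (reply, 0, sizeof (*reply));
        if (send_dummy_lk_version)
                return reply_set_uint32 (reply, "clnt-lk-version", 0);
        return 0;
}

/* Toy model of an older client's reply check (assumed behaviour): the
 * handshake is rejected when the key is missing. Returns 1 if the reply
 * is accepted, 0 if it is rejected. */
static int
sketch_old_client_accepts (const struct reply_dict *reply)
{
        uint32_t lk_version = 0;

        if (reply_get_uint32 (reply, "clnt-lk-version", &lk_version) != 0) {
                fprintf (stderr, "failed to find key 'clnt-lk-version' "
                         "in the options\n");
                return 0;
        }
        return 1;
}
```

Building the reply with send_dummy_lk_version set to 0 and passing it to
sketch_old_client_accepts models the failing case reported above; setting it
to 1 models a patched server, where the older client finds the key and carries
on. The value 0 is only a placeholder, as in the patch.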