Re: [Gluster-users] Issues with glustershd with release 8.4 and 9.1

2021-05-26 Thread Srijan Sivakumar
Hi Marco,

If possible, let's open an issue on GitHub and track this from there. I am
checking the previous mails in the chain to see if I can infer something
about the situation. It would be helpful if we could analyze this with the
help of the log files, especially glusterd.log and glustershd.log.
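
If it helps, a minimal sketch for gathering these from each node (this
assumes the default log directory /var/log/glusterfs; adjust the paths if
your installation differs):

    # Bundle the glusterd and self-heal daemon logs for this node
    tar czf gluster-logs-$(hostname).tar.gz \
        /var/log/glusterfs/glusterd.log \
        /var/log/glusterfs/glustershd.log

The resulting archives can then be attached to the GitHub issue.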

To open an issue, you can use this link: Open a new issue
<https://github.com/gluster/glusterfs/issues/new>


On Wed, May 26, 2021 at 5:02 PM Marco Fais  wrote:

> Ravi,
>
> thanks a million.
> @Mohit, @Srijan please let me know if you need any additional information.
>
> Thanks,
> Marco
>
>
> On Tue, 25 May 2021 at 17:28, Ravishankar N 
> wrote:
>
>> Hi Marco,
>> I haven't had any luck yet. Adding Mohit and Srijan, who work on glusterd,
>> in case they have some inputs.
>> -Ravi
>>
>>
>> On Tue, May 25, 2021 at 9:31 PM Marco Fais  wrote:
>>
>>> Hi Ravi
>>>
>>> just wondering if you have any further thoughts on this -- unfortunately
>>> it is something still very much affecting us at the moment.
>>> I am trying to understand how to troubleshoot it further but haven't
>>> been able to make much progress...
>>>
>>> Thanks,
>>> Marco
>>>
>>>
>>> On Thu, 20 May 2021 at 19:04, Marco Fais  wrote:
>>>
 Just to complete...

 from the FUSE mount log on server 2 I see the same errors as in
 glustershd.log on node 1:

 [2021-05-20 17:58:34.157971 +0000] I [MSGID: 114020] [client.c:2319:notify] 0-VM_Storage_1-client-11: parent translators are ready, attempting connect on transport []
 [2021-05-20 17:58:34.160586 +0000] I [rpc-clnt.c:1968:rpc_clnt_reconfig] 0-VM_Storage_1-client-11: changing port to 49170 (from 0)
 [2021-05-20 17:58:34.160608 +0000] I [socket.c:849:__socket_shutdown] 0-VM_Storage_1-client-11: intentional socket shutdown(20)
 [2021-05-20 17:58:34.161403 +0000] I [MSGID: 114046] [client-handshake.c:857:client_setvolume_cbk] 0-VM_Storage_1-client-10: Connected, attached to remote volume [{conn-name=VM_Storage_1-client-10}, {remote_subvol=/bricks/vm_b3_vol/brick}]
 [2021-05-20 17:58:34.161513 +0000] I [MSGID: 108002] [afr-common.c:6435:afr_notify] 0-VM_Storage_1-replicate-3: Client-quorum is met
 [2021-05-20 17:58:34.162043 +0000] I [MSGID: 114020] [client.c:2319:notify] 0-VM_Storage_1-client-13: parent translators are ready, attempting connect on transport []
 [2021-05-20 17:58:34.162491 +0000] I [rpc-clnt.c:1968:rpc_clnt_reconfig] 0-VM_Storage_1-client-12: changing port to 49170 (from 0)
 [2021-05-20 17:58:34.162507 +0000] I [socket.c:849:__socket_shutdown] 0-VM_Storage_1-client-12: intentional socket shutdown(26)
 [2021-05-20 17:58:34.163076 +0000] I [MSGID: 114057] [client-handshake.c:1128:select_server_supported_programs] 0-VM_Storage_1-client-11: Using Program [{Program-name=GlusterFS 4.x v1}, {Num=1298437}, {Version=400}]
 [2021-05-20 17:58:34.163339 +0000] W [MSGID: 114043] [client-handshake.c:727:client_setvolume_cbk] 0-VM_Storage_1-client-11: failed to set the volume [{errno=2}, {error=No such file or directory}]
 [2021-05-20 17:58:34.163351 +0000] W [MSGID: 114007] [client-handshake.c:752:client_setvolume_cbk] 0-VM_Storage_1-client-11: failed to get from reply dict [{process-uuid}, {errno=22}, {error=Invalid argument}]
 [2021-05-20 17:58:34.163360 +0000] E [MSGID: 114044] [client-handshake.c:757:client_setvolume_cbk] 0-VM_Storage_1-client-11: SETVOLUME on remote-host failed [{remote-error=Brick not found}, {errno=2}, {error=No such file or directory}]
 [2021-05-20 17:58:34.163365 +0000] I [MSGID: 114051] [client-handshake.c:879:client_setvolume_cbk] 0-VM_Storage_1-client-11: sending CHILD_CONNECTING event []
 [2021-05-20 17:58:34.163425 +0000] I [MSGID: 114018] [client.c:2229:client_rpc_notify] 0-VM_Storage_1-client-11: disconnected from client, process will keep trying to connect glusterd until brick's port is available [{conn-name=VM_Storage_1-client-11}]
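
 For cross-checking (a sketch; the volume name is taken from the logs
 above): the port the client xlators keep retrying (49170 here) can be
 compared against the brick ports glusterd currently advertises, and
 against what is actually listening on that port on the brick node:

     # Ports glusterd advertises for each brick of the volume
     gluster volume status VM_Storage_1

     # Check what is actually listening on the port from the log
     ss -tlnp | grep 49170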

 On Thu, 20 May 2021 at 18:54, Marco Fais  wrote:

> Hi Ravi,
>
> thanks again for your help.
>
> Here is the output of "cat
> graphs/active/VM_Storage_1-client-11/private" from the same node
> where glustershd is complaining:
>
> [xlator.protocol.client.VM_Storage_1-client-11.priv]
> fd.0.remote_fd = 1
> -- = --
> granted-posix-lock[0] = owner = 7904e87d91693fb7, cmd = F_SETLK fl_type = F_RDLCK, fl_start = 100, fl_end = 100, user_flock: l_type = F_RDLCK, l_start = 100, l_len = 1
> granted-posix-lock[1] = owner = 7904e87d91693fb7, cmd = F_SETLK fl_type = F_RDLCK, fl_start = 101, fl_end = 101, user_flock: l_type = F_RDLCK, l_start = 101, l_len = 1
> granted-posix-lock[2] = owner = 7904e87d91693fb7, cmd = F_SETLK fl_type = F_RDLCK, fl_start = 103, fl_end = 103, user_flock: l_type = F_RDLCK, l_start = 103, l_len = 1
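>
> For reference, this private file is the one the meta xlator exposes under
> the virtual .meta directory of any GlusterFS FUSE mount; the full path
> would be something like the following (the mount point /mnt/vm_storage_1
> is an assumption, substitute your own):
>
>     cat /mnt/vm_storage_1/.meta/graphs/active/VM_Storage_1-client-11/private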

Re: [Gluster-users] Issues with glustershd with release 8.4 and 9.1

2021-05-26 Thread Marco Fais
Ravi,

thanks a million.
@Mohit, @Srijan please let me know if you need any additional information.

Thanks,
Marco


On Tue, 25 May 2021 at 17:28, Ravishankar N  wrote:

> Hi Marco,
> I haven't had any luck yet. Adding Mohit and Srijan, who work on glusterd,
> in case they have some inputs.
> -Ravi
>
>
> On Tue, May 25, 2021 at 9:31 PM Marco Fais  wrote:
>
>> Hi Ravi
>>
>> just wondering if you have any further thoughts on this -- unfortunately
>> it is something still very much affecting us at the moment.
>> I am trying to understand how to troubleshoot it further but haven't been
>> able to make much progress...
>>
>> Thanks,
>> Marco
>>
>>
>> On Thu, 20 May 2021 at 19:04, Marco Fais  wrote:
>>
>>> Just to complete...
>>>
>>> from the FUSE mount log on server 2 I see the same errors as in
>>> glustershd.log on node 1:
>>>
>>> [2021-05-20 17:58:34.157971 +0000] I [MSGID: 114020] [client.c:2319:notify] 0-VM_Storage_1-client-11: parent translators are ready, attempting connect on transport []
>>> [2021-05-20 17:58:34.160586 +0000] I [rpc-clnt.c:1968:rpc_clnt_reconfig] 0-VM_Storage_1-client-11: changing port to 49170 (from 0)
>>> [2021-05-20 17:58:34.160608 +0000] I [socket.c:849:__socket_shutdown] 0-VM_Storage_1-client-11: intentional socket shutdown(20)
>>> [2021-05-20 17:58:34.161403 +0000] I [MSGID: 114046] [client-handshake.c:857:client_setvolume_cbk] 0-VM_Storage_1-client-10: Connected, attached to remote volume [{conn-name=VM_Storage_1-client-10}, {remote_subvol=/bricks/vm_b3_vol/brick}]
>>> [2021-05-20 17:58:34.161513 +0000] I [MSGID: 108002] [afr-common.c:6435:afr_notify] 0-VM_Storage_1-replicate-3: Client-quorum is met
>>> [2021-05-20 17:58:34.162043 +0000] I [MSGID: 114020] [client.c:2319:notify] 0-VM_Storage_1-client-13: parent translators are ready, attempting connect on transport []
>>> [2021-05-20 17:58:34.162491 +0000] I [rpc-clnt.c:1968:rpc_clnt_reconfig] 0-VM_Storage_1-client-12: changing port to 49170 (from 0)
>>> [2021-05-20 17:58:34.162507 +0000] I [socket.c:849:__socket_shutdown] 0-VM_Storage_1-client-12: intentional socket shutdown(26)
>>> [2021-05-20 17:58:34.163076 +0000] I [MSGID: 114057] [client-handshake.c:1128:select_server_supported_programs] 0-VM_Storage_1-client-11: Using Program [{Program-name=GlusterFS 4.x v1}, {Num=1298437}, {Version=400}]
>>> [2021-05-20 17:58:34.163339 +0000] W [MSGID: 114043] [client-handshake.c:727:client_setvolume_cbk] 0-VM_Storage_1-client-11: failed to set the volume [{errno=2}, {error=No such file or directory}]
>>> [2021-05-20 17:58:34.163351 +0000] W [MSGID: 114007] [client-handshake.c:752:client_setvolume_cbk] 0-VM_Storage_1-client-11: failed to get from reply dict [{process-uuid}, {errno=22}, {error=Invalid argument}]
>>> [2021-05-20 17:58:34.163360 +0000] E [MSGID: 114044] [client-handshake.c:757:client_setvolume_cbk] 0-VM_Storage_1-client-11: SETVOLUME on remote-host failed [{remote-error=Brick not found}, {errno=2}, {error=No such file or directory}]
>>> [2021-05-20 17:58:34.163365 +0000] I [MSGID: 114051] [client-handshake.c:879:client_setvolume_cbk] 0-VM_Storage_1-client-11: sending CHILD_CONNECTING event []
>>> [2021-05-20 17:58:34.163425 +0000] I [MSGID: 114018] [client.c:2229:client_rpc_notify] 0-VM_Storage_1-client-11: disconnected from client, process will keep trying to connect glusterd until brick's port is available [{conn-name=VM_Storage_1-client-11}]
>>>
>>> On Thu, 20 May 2021 at 18:54, Marco Fais  wrote:
>>>
 Hi Ravi,

 thanks again for your help.

 Here is the output of "cat
 graphs/active/VM_Storage_1-client-11/private" from the same node
 where glustershd is complaining:

 [xlator.protocol.client.VM_Storage_1-client-11.priv]
 fd.0.remote_fd = 1
 -- = --
 granted-posix-lock[0] = owner = 7904e87d91693fb7, cmd = F_SETLK fl_type = F_RDLCK, fl_start = 100, fl_end = 100, user_flock: l_type = F_RDLCK, l_start = 100, l_len = 1
 granted-posix-lock[1] = owner = 7904e87d91693fb7, cmd = F_SETLK fl_type = F_RDLCK, fl_start = 101, fl_end = 101, user_flock: l_type = F_RDLCK, l_start = 101, l_len = 1
 granted-posix-lock[2] = owner = 7904e87d91693fb7, cmd = F_SETLK fl_type = F_RDLCK, fl_start = 103, fl_end = 103, user_flock: l_type = F_RDLCK, l_start = 103, l_len = 1
 granted-posix-lock[3] = owner = 7904e87d91693fb7, cmd = F_SETLK fl_type = F_RDLCK, fl_start = 201, fl_end = 201, user_flock: l_type = F_RDLCK, l_start = 201, l_len = 1
 granted-posix-lock[4] = owner = 7904e87d91693fb7, cmd = F_SETLK fl_type = F_RDLCK, fl_start = 203, fl_end = 203, user_flock: l_type = F_RDLCK, l_start = 203, l_len = 1
 -- = --
 fd.1.remote_fd = 0
 -- = --
 granted-posix-lock[0] = owner = b43238094746d9fe, cmd = F_SETLK fl_type = F_RDLCK, fl_start = 100, fl_end = 100, user_flock: l_type = 

Re: [Gluster-users] [Gluster-devel] Meeting minutes for the Gluster community meeting held on 25-05-2021

2021-05-26 Thread Saju Mohammed Noohu
Hi Sankarshan,

Replies inline.

Thanks
Saju


On Tue, May 25, 2021 at 6:51 PM sankarshan  wrote:

> Ayush - thank you for hosting what is your first Gluster community
> meeting! It was an excellent effort at keeping the conversation moving
> along.
>
> Some additional comments in-line.
>
> On Tue, 25 May 2021 at 17:52, Ayush Ujjwal  wrote:
> >
> > # Gluster Community Meeting -  25/05/2021
>
> [snip]
>
> > * Project metrics:
> >
> > | Metrics | Value |
> > | ------- | ----- |
> > | [Coverity](https://scan.coverity.com/projects/gluster-glusterfs) | 38 |
> > | [Clang Scan](https://build.gluster.org/job/clang-scan/lastBuild/) | 89 |
> > | [Test coverage](https://build.gluster.org/job/line-coverage/lastCompletedBuild/Line_20Coverage_20Report/) | 70.9 |
> > | [Gluster User Queries in last 14 days](https://lists.gluster.org/pipermail/gluster-users/2021-May/thread.html#start) | 27 |
> > | [Total Github issues](https://github.com/gluster/glusterfs/issues) | 315 |
> >
>
> As brought up at the meeting - it might be useful to discuss the trend
> of these values and from there deduce if these are in the right
> direction. The values in isolation do not communicate enough data to
> determine whether there are opportunities to improve. At a certain
> point in time in early 2020 there was an intense focus on test coverage.
> I am not sure if that has resulted in actually better coverage or just
> spreading butter on toast.
>

Please follow the gluster-devel mailing list; there is an email sent out
every Monday with the subject "Gluster Code Metrics Weekly Report", which
gives the weekly values, a trend graph, and links to the respective jobs.
On test coverage: yes, we need more participation from the community.


>
> >
> > * Any release updates?
> > * None
> >
> > * Blocker issues across the project?
> > * It looks like the lock-recovery changes introduced with
> https://review.gluster.org/#/c/glusterfs/+/22712/ have issues. We already
> fixed https://github.com/gluster/glusterfs/pull/2456 and
> https://github.com/gluster/glusterfs/issues/2337, but it looks like the
> code is still buggy. Someone needs to look at how the posix-locks and
> client xlators differ in how they maintain locks to fix the issue
> completely.
> >
>
> As a project it is necessary to do right by our community - this means
> that the impact of the issue and remedy/workaround should be
> immediately shared widely enough to ensure that this is not missed.
> Since the issues are tagged 'blocker' I am guessing that these meet
> the somewhat established criteria of a blocker issue and would need an
> enhanced level of attention. Have the issues been triaged and
> developers assigned?
>
The above doesn't look to be a blocker issue, as it happens only with a
particular locking pattern. I discussed and confirmed this with Pranith
yesterday. The fix for this will go into the next minor releases happening
in June. We do highlight issues in the release notes; this is not a
showstopper.
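
For users who want to check whether they are hitting this locking pattern,
the lock state held on the bricks can be inspected with a statedump (a
sketch; the dump files land under /var/run/gluster by default):

    gluster volume statedump <VOLNAME>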


>
> >
> > * Notable threads from the mailing list
> > * Not exactly from the mailing list: a Slack user pinged me and asked
> if it is possible to let users know of any known issues in the latest
> releases so that they can decide which version to use. For example, 9.0
> and 9.1 had a protocol issue.
> > * Along the same lines, I wanted to ask one more question: should we
> publish beta releases for major releases so that we get feedback about
> issues in users' particular environments, and address them even before
> the stable releases are made?
> >
> >
>
> I've offered to look at the criteria that define a 'beta' and check
> how they align with a release schedule. The history of 'beta' releases
> of storage software (as compared to, say, a browser) is that we have
> often received no uptake. There are many reasons for this - but one
> key aspect is that it is additional work being asked from the
> community. If the 'beta' is reasonably well described, perhaps the
> accrued value from this testing cycle would be better understood.
> 
>
The key aspect statement, "additional work being asked from the
community", is itself worrying. The community should help and serve each
other; as a community, we expect more participation. As you know, it is
practically not possible to test all scenarios/use cases/setups. That is
why more regression tests would make sense here.
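
For anyone who would like to contribute tests: the regression suite lives
under tests/ in the glusterfs source tree and is driven by run-tests.sh. A
rough sketch, assuming a disposable machine (the suite needs root and
starts real daemons):

    git clone https://github.com/gluster/glusterfs.git
    cd glusterfs
    ./autogen.sh && ./configure && make -j
    sudo make install
    # Run the whole suite, or pass individual .t files for a subset
    sudo ./run-tests.sh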

Thanks
Saju


Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users