Re: [Gluster-users] blocking process on FUSE mount in directory which is using quota

2018-09-02 Thread mabi
Hello,

I wanted to report that this morning I had a similar issue on another server, 
where a few PHP-FPM processes got blocked on a different GlusterFS volume 
mounted through a FUSE mount. This GlusterFS volume has no quota enabled, so 
the issue might not be quota related after all.

Here is the Linux kernel stack trace:

[Sun Sep  2 06:47:47 2018] INFO: task php5-fpm:25880 blocked for more than 120 
seconds.
[Sun Sep  2 06:47:47 2018]   Not tainted 3.16.0-4-amd64 #1
[Sun Sep  2 06:47:47 2018] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
disables this message.
[Sun Sep  2 06:47:47 2018] php5-fpm  D 88017ee12f40  0 25880  1 0x0004
[Sun Sep  2 06:47:47 2018]  880101688b60 0282 00012f40 
880059ca3fd8
[Sun Sep  2 06:47:47 2018]  00012f40 880101688b60 8801093b51b0 
8801067ec800
[Sun Sep  2 06:47:47 2018]  880059ca3cc0 8801093b5290 8801093b51b0 
880059ca3e80
[Sun Sep  2 06:47:47 2018] Call Trace:
[Sun Sep  2 06:47:47 2018]  [] ? 
__fuse_request_send+0xbd/0x270 [fuse]
[Sun Sep  2 06:47:47 2018]  [] ? 
prepare_to_wait_event+0xf0/0xf0
[Sun Sep  2 06:47:47 2018]  [] ? fuse_send_write+0xd0/0x100 
[fuse]
[Sun Sep  2 06:47:47 2018]  [] ? 
fuse_perform_write+0x26f/0x4b0 [fuse]
[Sun Sep  2 06:47:47 2018]  [] ? 
fuse_file_write_iter+0x1dd/0x2b0 [fuse]
[Sun Sep  2 06:47:47 2018]  [] ? new_sync_write+0x74/0xa0
[Sun Sep  2 06:47:47 2018]  [] ? vfs_write+0xb2/0x1f0
[Sun Sep  2 06:47:47 2018]  [] ? vfs_read+0xed/0x170
[Sun Sep  2 06:47:47 2018]  [] ? SyS_write+0x42/0xa0
[Sun Sep  2 06:47:47 2018]  [] ? SyS_lseek+0x7e/0xa0
[Sun Sep  2 06:47:47 2018]  [] ? 
system_call_fast_compare_end+0x10/0x15
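
(For anyone else hitting this: a quick way to list all processes stuck in
uninterruptible sleep is a one-liner along these lines; the exact ps column
widths are just a convenience, not anything Gluster-specific:)

ps -eo pid,stat,wchan:30,cmd | awk '$2 ~ /^D/'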

Has anyone had time yet to look at the statedump file I sent around 3 weeks 
ago?

I never saw this type of problem in the past; it started to appear after I 
upgraded to GlusterFS 3.12.12.

Best regards,
Mabi

‐‐‐ Original Message ‐‐‐
On August 15, 2018 9:21 AM, mabi  wrote:

> Great; attached you will find the statedump of the client using the FUSE 
> glusterfs mount, taken right after two processes blocked.
>
> Two notes here regarding the "path=" entries in this statedump file:
> - I have renamed all the "path=" entries which contain the problematic 
> directory to "path=PROBLEMATIC_DIRECTORY_HERE".
> - All the other "path=" entries I have renamed to "path=REMOVED_FOR_PRIVACY".
>
> Note also that, funnily enough, the number of "path=" entries for that 
> problematic directory adds up to exactly 5000. Coincidence, or maybe a hint 
> at the problem?
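>
> (A hedged one-liner that should reproduce that count, assuming GNU grep and
> a sanitized dump file produced as described above; the filename is
> hypothetical:)
>
> grep -c 'path=PROBLEMATIC_DIRECTORY_HERE' glusterdump.sanitized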
>
> ‐‐‐ Original Message ‐‐‐
> On August 15, 2018 5:21 AM, Raghavendra Gowdappa  wrote:
>
>> On Tue, Aug 14, 2018 at 7:23 PM, mabi  wrote:
>>
>>> Bad news: the blocked-process issue happened again, this time with another 
>>> directory of another user which is NOT over its quota but which also has 
>>> quota enabled.
>>>
>>> The symptoms on the Linux side are the same:
>>>
>>> [Tue Aug 14 15:30:33 2018] INFO: task php5-fpm:14773 blocked for more than 
>>> 120 seconds.
>>> [Tue Aug 14 15:30:33 2018]   Not tainted 3.16.0-4-amd64 #1
>>> [Tue Aug 14 15:30:33 2018] "echo 0 > 
>>> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>> [Tue Aug 14 15:30:33 2018] php5-fpm  D 8801fea13200  0 14773  729 0x
>>> [Tue Aug 14 15:30:33 2018]  880100bbe0d0 0282 
>>> 00013200 880129bcffd8
>>> [Tue Aug 14 15:30:33 2018]  00013200 880100bbe0d0 
>>> 880153ed0d68 880129bcfee0
>>> [Tue Aug 14 15:30:33 2018]  880153ed0d6c 880100bbe0d0 
>>>  880153ed0d70
>>> [Tue Aug 14 15:30:33 2018] Call Trace:
>>> [Tue Aug 14 15:30:33 2018]  [] ? 
>>> schedule_preempt_disabled+0x25/0x70
>>> [Tue Aug 14 15:30:33 2018]  [] ? 
>>> __mutex_lock_slowpath+0xd3/0x1d0
>>> [Tue Aug 14 15:30:33 2018]  [] ? write_inode_now+0x93/0xc0
>>> [Tue Aug 14 15:30:33 2018]  [] ? mutex_lock+0x1b/0x2a
>>> [Tue Aug 14 15:30:33 2018]  [] ? fuse_flush+0x8f/0x1e0 
>>> [fuse]
>>> [Tue Aug 14 15:30:33 2018]  [] ? vfs_read+0x93/0x170
>>> [Tue Aug 14 15:30:33 2018]  [] ? filp_close+0x2a/0x70
>>> [Tue Aug 14 15:30:33 2018]  [] ? SyS_close+0x1f/0x50
>>> [Tue Aug 14 15:30:33 2018]  [] ? 
>>> system_call_fast_compare_end+0x10/0x15
>>>
>>> and if I check this process, it is in state "D" (uninterruptible sleep).
>>>
>>> Now I also managed to take a statedump as recommended, but in its content 
>>> under "[io-cache.inode]" I see "path=" entries which I would need to 
>>> remove, as they contain filenames (for privacy reasons). Can I remove every 
>>> "path=" line and still send you the statedump file for analysis?
>>
>> Yes. Removing path is fine and statedumps will still be useful for debugging 
>> the issue.
>>
>>> Thank you.
>>>
>>> ‐‐‐ Original Message ‐‐‐
>>> On August 14, 2018 10:48 AM, Nithya Balachandran  
>>> wrote:
>>>
 Thanks for letting us know. Sanoj, can you take a look at this?

 Thanks.
 Nithya

Re: [Gluster-users] blocking process on FUSE mount in directory which is using quota

2018-08-14 Thread Raghavendra Gowdappa
On Tue, Aug 14, 2018 at 7:23 PM, mabi  wrote:

> Bad news: the blocked-process issue happened again, this time with another
> directory of another user which is NOT over its quota but which also has
> quota enabled.
>
> The symptoms on the Linux side are the same:
>
> [Tue Aug 14 15:30:33 2018] INFO: task php5-fpm:14773 blocked for more than
> 120 seconds.
> [Tue Aug 14 15:30:33 2018]   Not tainted 3.16.0-4-amd64 #1
> [Tue Aug 14 15:30:33 2018] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [Tue Aug 14 15:30:33 2018] php5-fpm  D 8801fea13200  0 14773  729 0x
> [Tue Aug 14 15:30:33 2018]  880100bbe0d0 0282
> 00013200 880129bcffd8
> [Tue Aug 14 15:30:33 2018]  00013200 880100bbe0d0
> 880153ed0d68 880129bcfee0
> [Tue Aug 14 15:30:33 2018]  880153ed0d6c 880100bbe0d0
>  880153ed0d70
> [Tue Aug 14 15:30:33 2018] Call Trace:
> [Tue Aug 14 15:30:33 2018]  [] ?
> schedule_preempt_disabled+0x25/0x70
> [Tue Aug 14 15:30:33 2018]  [] ?
> __mutex_lock_slowpath+0xd3/0x1d0
> [Tue Aug 14 15:30:33 2018]  [] ?
> write_inode_now+0x93/0xc0
> [Tue Aug 14 15:30:33 2018]  [] ? mutex_lock+0x1b/0x2a
> [Tue Aug 14 15:30:33 2018]  [] ? fuse_flush+0x8f/0x1e0
> [fuse]
> [Tue Aug 14 15:30:33 2018]  [] ? vfs_read+0x93/0x170
> [Tue Aug 14 15:30:33 2018]  [] ? filp_close+0x2a/0x70
> [Tue Aug 14 15:30:33 2018]  [] ? SyS_close+0x1f/0x50
> [Tue Aug 14 15:30:33 2018]  [] ?
> system_call_fast_compare_end+0x10/0x15
>
> and if I check this process, it is in state "D" (uninterruptible sleep).
>
> Now I also managed to take a statedump as recommended, but in its content
> under "[io-cache.inode]" I see "path=" entries which I would need to
> remove, as they contain filenames (for privacy reasons). Can I remove every
> "path=" line and still send you the statedump file for analysis?
>

Yes. Removing path is fine and statedumps will still be useful for
debugging the issue.
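
(A minimal sketch of such a sanitization pass, assuming GNU sed; the dump
filename is hypothetical, and the replacement string mirrors the one used
elsewhere in this thread:)

sed 's|^path=.*|path=REMOVED_FOR_PRIVACY|' glusterdump.456.dump > glusterdump.sanitized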


> Thank you.
>
>
>
> ‐‐‐ Original Message ‐‐‐
> On August 14, 2018 10:48 AM, Nithya Balachandran 
> wrote:
>
> Thanks for letting us know. Sanoj, can you take a look at this?
>
> Thanks.
> Nithya
>
> On 14 August 2018 at 13:58, mabi  wrote:
>
>> As you mentioned, after creating the /var/run/gluster directory I got a
>> statedump file in there.
>>
>> As a workaround I have now removed the quota for this specific directory,
>> and as it is a production server I currently cannot "play" with it by
>> adding the quota back to reproduce the problem, since that would require
>> rebooting the server with downtime...
>>
>> But I can confirm that by removing the quota from that directory, the
>> problem is gone (no more blocking processes such as "ls") so there must be
>> an issue or bug with the quota part of gluster.
>>
>>
>>
>> ‐‐‐ Original Message ‐‐‐
>> On August 10, 2018 4:19 PM, Nithya Balachandran 
>> wrote:
>>
>>
>>
>> On 9 August 2018 at 19:54, mabi  wrote:
>>
>>> Thanks for the documentation. On my client using FUSE mount I found the
>>> PID by using ps (output below):
>>>
>>> root   456 1  4 14:17 ?00:05:15 /usr/sbin/glusterfs
>>> --volfile-server=gfs1a --volfile-id=myvol-private /mnt/myvol-private
>>>
>>> Then I ran the following command
>>>
>>> sudo kill -USR1 456
>>>
>>> but now I can't find where the files are stored. Are these supposed to
>>> be stored on the client directly? I checked /var/run/gluster and
>>> /var/log/gluster but could not see anything and /var/log/gluster does not
>>> even exist on the client.
>>>
>>
>> They are usually created in /var/run/gluster. You will need to create the
>> directory on the client if it does not exist.
>>
>>
>>
>> ‐‐‐ Original Message ‐‐‐
>>> On August 9, 2018 3:59 PM, Raghavendra Gowdappa 
>>> wrote:
>>>
>>>
>>>
>>> On Thu, Aug 9, 2018 at 6:47 PM, mabi  wrote:
>>>
 Hi Nithya,

 Thanks for the fast answer. Here is the additional info:

 1. gluster volume info

 Volume Name: myvol-private
 Type: Replicate
 Volume ID: e7a40a1b-45c9-4d3c-bb19-0c59b4eceec5
 Status: Started
 Snapshot Count: 0
 Number of Bricks: 1 x (2 + 1) = 3
 Transport-type: tcp
 Bricks:
 Brick1: gfs1a:/data/myvol-private/brick
 Brick2: gfs1b:/data/myvol-private/brick
 Brick3: gfs1c:/srv/glusterfs/myvol-private/brick (arbiter)
 Options Reconfigured:
 features.default-soft-limit: 95%
 transport.address-family: inet
 features.quota-deem-statfs: on
 features.inode-quota: on
 features.quota: on
 nfs.disable: on
 performance.readdir-ahead: on
 client.event-threads: 4
 server.event-threads: 4
 auth.allow: 192.168.100.92



 2. Sorry, I have no clue how to take a "statedump" of a process on
 Linux. Which command should I use for that? And which process would you
 like, the blocked process (for example "ls")?

>>>
>>> Statedumps are gluster specific. Please refer to
>>> 

Re: [Gluster-users] blocking process on FUSE mount in directory which is using quota

2018-08-14 Thread mabi
Bad news: the blocked-process issue happened again, this time with another 
directory of another user which is NOT over its quota but which also has quota 
enabled.

The symptoms on the Linux side are the same:

[Tue Aug 14 15:30:33 2018] INFO: task php5-fpm:14773 blocked for more than 120 
seconds.
[Tue Aug 14 15:30:33 2018]   Not tainted 3.16.0-4-amd64 #1
[Tue Aug 14 15:30:33 2018] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
disables this message.
[Tue Aug 14 15:30:33 2018] php5-fpm  D 8801fea13200  0 14773  729 0x
[Tue Aug 14 15:30:33 2018]  880100bbe0d0 0282 00013200 
880129bcffd8
[Tue Aug 14 15:30:33 2018]  00013200 880100bbe0d0 880153ed0d68 
880129bcfee0
[Tue Aug 14 15:30:33 2018]  880153ed0d6c 880100bbe0d0  
880153ed0d70
[Tue Aug 14 15:30:33 2018] Call Trace:
[Tue Aug 14 15:30:33 2018]  [] ? 
schedule_preempt_disabled+0x25/0x70
[Tue Aug 14 15:30:33 2018]  [] ? 
__mutex_lock_slowpath+0xd3/0x1d0
[Tue Aug 14 15:30:33 2018]  [] ? write_inode_now+0x93/0xc0
[Tue Aug 14 15:30:33 2018]  [] ? mutex_lock+0x1b/0x2a
[Tue Aug 14 15:30:33 2018]  [] ? fuse_flush+0x8f/0x1e0 [fuse]
[Tue Aug 14 15:30:33 2018]  [] ? vfs_read+0x93/0x170
[Tue Aug 14 15:30:33 2018]  [] ? filp_close+0x2a/0x70
[Tue Aug 14 15:30:33 2018]  [] ? SyS_close+0x1f/0x50
[Tue Aug 14 15:30:33 2018]  [] ? 
system_call_fast_compare_end+0x10/0x15

and if I check this process, it is in state "D" (uninterruptible sleep).

Now I also managed to take a statedump as recommended, but in its content 
under "[io-cache.inode]" I see "path=" entries which I would need to remove, 
as they contain filenames (for privacy reasons). Can I remove every "path=" 
line and still send you the statedump file for analysis?

Thank you.

‐‐‐ Original Message ‐‐‐
On August 14, 2018 10:48 AM, Nithya Balachandran  wrote:

> Thanks for letting us know. Sanoj, can you take a look at this?
>
> Thanks.
> Nithya
>
> On 14 August 2018 at 13:58, mabi  wrote:
>
>> As you mentioned, after creating the /var/run/gluster directory I got a 
>> statedump file in there.
>>
>> As a workaround I have now removed the quota for this specific directory, 
>> and as it is a production server I currently cannot "play" with it by 
>> adding the quota back to reproduce the problem, since that would require 
>> rebooting the server with downtime...
>>
>> But I can confirm that by removing the quota from that directory, the 
>> problem is gone (no more blocking processes such as "ls") so there must be 
>> an issue or bug with the quota part of gluster.
>>
>> ‐‐‐ Original Message ‐‐‐
>> On August 10, 2018 4:19 PM, Nithya Balachandran  wrote:
>>
>>> On 9 August 2018 at 19:54, mabi  wrote:
>>>
 Thanks for the documentation. On my client using FUSE mount I found the 
 PID by using ps (output below):

 root   456 1  4 14:17 ?00:05:15 /usr/sbin/glusterfs 
 --volfile-server=gfs1a --volfile-id=myvol-private /mnt/myvol-private

 Then I ran the following command

 sudo kill -USR1 456

 but now I can't find where the files are stored. Are these supposed to be 
 stored on the client directly? I checked /var/run/gluster and 
 /var/log/gluster but could not see anything and /var/log/gluster does not 
 even exist on the client.
>>>
>>> They are usually created in /var/run/gluster. You will need to create the 
>>> directory on the client if it does not exist.
>>>
 ‐‐‐ Original Message ‐‐‐
 On August 9, 2018 3:59 PM, Raghavendra Gowdappa  
 wrote:

> On Thu, Aug 9, 2018 at 6:47 PM, mabi  wrote:
>
>> Hi Nithya,
>>
>> Thanks for the fast answer. Here is the additional info:
>>
>> 1. gluster volume info
>>
>> Volume Name: myvol-private
>> Type: Replicate
>> Volume ID: e7a40a1b-45c9-4d3c-bb19-0c59b4eceec5
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 1 x (2 + 1) = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: gfs1a:/data/myvol-private/brick
>> Brick2: gfs1b:/data/myvol-private/brick
>> Brick3: gfs1c:/srv/glusterfs/myvol-private/brick (arbiter)
>> Options Reconfigured:
>> features.default-soft-limit: 95%
>> transport.address-family: inet
>> features.quota-deem-statfs: on
>> features.inode-quota: on
>> features.quota: on
>> nfs.disable: on
>> performance.readdir-ahead: on
>> client.event-threads: 4
>> server.event-threads: 4
>> auth.allow: 192.168.100.92
>>
>> 2. Sorry, I have no clue how to take a "statedump" of a process on Linux. 
>> Which command should I use for that? And which process would you like, 
>> the blocked process (for example "ls")?
>
> Statedumps are gluster specific. Please refer to 
> https://docs.gluster.org/en/v3/Troubleshooting/statedump/ for 
> instructions.
>
>> Regards,
>> M.
>>
>> ‐‐‐ Original Message 

Re: [Gluster-users] blocking process on FUSE mount in directory which is using quota

2018-08-14 Thread Nithya Balachandran
Thanks for letting us know. Sanoj, can you take a look at this?

Thanks.
Nithya

On 14 August 2018 at 13:58, mabi  wrote:

> As you mentioned, after creating the /var/run/gluster directory I got a
> statedump file in there.
>
> As a workaround I have now removed the quota for this specific directory,
> and as it is a production server I currently cannot "play" with it by
> adding the quota back to reproduce the problem, since that would require
> rebooting the server with downtime...
>
> But I can confirm that by removing the quota from that directory, the
> problem is gone (no more blocking processes such as "ls") so there must be
> an issue or bug with the quota part of gluster.
>
>
>
> ‐‐‐ Original Message ‐‐‐
> On August 10, 2018 4:19 PM, Nithya Balachandran 
> wrote:
>
>
>
> On 9 August 2018 at 19:54, mabi  wrote:
>
>> Thanks for the documentation. On my client using FUSE mount I found the
>> PID by using ps (output below):
>>
>> root   456 1  4 14:17 ?00:05:15 /usr/sbin/glusterfs
>> --volfile-server=gfs1a --volfile-id=myvol-private /mnt/myvol-private
>>
>> Then I ran the following command
>>
>> sudo kill -USR1 456
>>
>> but now I can't find where the files are stored. Are these supposed to be
>> stored on the client directly? I checked /var/run/gluster and
>> /var/log/gluster but could not see anything and /var/log/gluster does not
>> even exist on the client.
>>
>
> They are usually created in /var/run/gluster. You will need to create the
> directory on the client if it does not exist.
>
>
>
> ‐‐‐ Original Message ‐‐‐
>> On August 9, 2018 3:59 PM, Raghavendra Gowdappa 
>> wrote:
>>
>>
>>
>> On Thu, Aug 9, 2018 at 6:47 PM, mabi  wrote:
>>
>>> Hi Nithya,
>>>
>>> Thanks for the fast answer. Here is the additional info:
>>>
>>> 1. gluster volume info
>>>
>>> Volume Name: myvol-private
>>> Type: Replicate
>>> Volume ID: e7a40a1b-45c9-4d3c-bb19-0c59b4eceec5
>>> Status: Started
>>> Snapshot Count: 0
>>> Number of Bricks: 1 x (2 + 1) = 3
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: gfs1a:/data/myvol-private/brick
>>> Brick2: gfs1b:/data/myvol-private/brick
>>> Brick3: gfs1c:/srv/glusterfs/myvol-private/brick (arbiter)
>>> Options Reconfigured:
>>> features.default-soft-limit: 95%
>>> transport.address-family: inet
>>> features.quota-deem-statfs: on
>>> features.inode-quota: on
>>> features.quota: on
>>> nfs.disable: on
>>> performance.readdir-ahead: on
>>> client.event-threads: 4
>>> server.event-threads: 4
>>> auth.allow: 192.168.100.92
>>>
>>>
>>>
>>> 2. Sorry, I have no clue how to take a "statedump" of a process on Linux.
>>> Which command should I use for that? And which process would you like, the
>>> blocked process (for example "ls")?
>>>
>>
>> Statedumps are gluster specific. Please refer to
>> https://docs.gluster.org/en/v3/Troubleshooting/statedump/ for
>> instructions.
>>
>>
>>>
>>> Regards,
>>> M.
>>>
>>> ‐‐‐ Original Message ‐‐‐
>>> On August 9, 2018 3:10 PM, Nithya Balachandran 
>>> wrote:
>>>
>>> Hi,
>>>
>>> Please provide the following:
>>>
>>>1. gluster volume info
>>>2. statedump of the fuse process when it hangs
>>>
>>>
>>> Thanks,
>>> Nithya
>>>
>>> On 9 August 2018 at 18:24, mabi  wrote:
>>>
 Hello,

 I recently upgraded my GlusterFS replica 2+1 (arbiter) to version
 3.12.12 and now I see a weird behaviour on my client (using FUSE mount)
 where I have processes (PHP 5.6 FPM) trying to access a specific directory
 and then the process blocks. I can't kill the process either, not even with
 kill -9. I need to reboot the machine in order to get rid of these blocked
 processes.

 This directory has one particularity compared to the other directories:
 it has reached its quota soft-limit, as you can see here in the
 output of gluster volume quota list:

   Path   Hard-limit  Soft-limit  Used  Available  Soft-limit exceeded? Hard-limit exceeded?
 ---
 /directory  100.0GB 80%(80.0GB)   90.5GB   9.5GB Yes   No

 That does not mean it is the quota's fault, but it might be a hint
 about where to start looking... And by the way, can someone explain to
 me what the soft-limit does? Or does it not do anything special?

 Here is the Linux stack trace of a blocked process on that directory,
 which happened with a simple "ls -la":

 [Thu Aug  9 14:21:07 2018] INFO: task ls:2272 blocked for more than 120
 seconds.
 [Thu Aug  9 14:21:07 2018]   Not tainted 3.16.0-4-amd64 #1
 [Thu Aug  9 14:21:07 2018] "echo 0 > 
 /proc/sys/kernel/hung_task_timeout_secs"
 disables this message.
 [Thu Aug  9 14:21:07 2018] ls  D 88017ef93200 0
 2272   2268 0x0004
 [Thu Aug  9 14:21:07 2018]  

Re: [Gluster-users] blocking process on FUSE mount in directory which is using quota

2018-08-14 Thread mabi
As you mentioned, after creating the /var/run/gluster directory I got a 
statedump file in there.

As a workaround I have now removed the quota for this specific directory, and 
as it is a production server I currently cannot "play" with it by adding the 
quota back to reproduce the problem, since that would require rebooting the 
server with downtime...

But I can confirm that by removing the quota from that directory, the problem 
is gone (no more blocking processes such as "ls") so there must be an issue or 
bug with the quota part of gluster.
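
(For reference, removing a directory quota is done with something along these
lines; volume name and path are the placeholders used earlier in this thread:)

gluster volume quota myvol-private remove /directory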

‐‐‐ Original Message ‐‐‐
On August 10, 2018 4:19 PM, Nithya Balachandran  wrote:

> On 9 August 2018 at 19:54, mabi  wrote:
>
>> Thanks for the documentation. On my client using FUSE mount I found the PID 
>> by using ps (output below):
>>
>> root   456 1  4 14:17 ?00:05:15 /usr/sbin/glusterfs 
>> --volfile-server=gfs1a --volfile-id=myvol-private /mnt/myvol-private
>>
>> Then I ran the following command
>>
>> sudo kill -USR1 456
>>
>> but now I can't find where the files are stored. Are these supposed to be 
>> stored on the client directly? I checked /var/run/gluster and 
>> /var/log/gluster but could not see anything and /var/log/gluster does not 
>> even exist on the client.
>
> They are usually created in /var/run/gluster. You will need to create the 
> directory on the client if it does not exist.
>
>> ‐‐‐ Original Message ‐‐‐
>> On August 9, 2018 3:59 PM, Raghavendra Gowdappa  wrote:
>>
>>> On Thu, Aug 9, 2018 at 6:47 PM, mabi  wrote:
>>>
 Hi Nithya,

 Thanks for the fast answer. Here is the additional info:

 1. gluster volume info

 Volume Name: myvol-private
 Type: Replicate
 Volume ID: e7a40a1b-45c9-4d3c-bb19-0c59b4eceec5
 Status: Started
 Snapshot Count: 0
 Number of Bricks: 1 x (2 + 1) = 3
 Transport-type: tcp
 Bricks:
 Brick1: gfs1a:/data/myvol-private/brick
 Brick2: gfs1b:/data/myvol-private/brick
 Brick3: gfs1c:/srv/glusterfs/myvol-private/brick (arbiter)
 Options Reconfigured:
 features.default-soft-limit: 95%
 transport.address-family: inet
 features.quota-deem-statfs: on
 features.inode-quota: on
 features.quota: on
 nfs.disable: on
 performance.readdir-ahead: on
 client.event-threads: 4
 server.event-threads: 4
 auth.allow: 192.168.100.92

 2. Sorry, I have no clue how to take a "statedump" of a process on Linux. 
 Which command should I use for that? And which process would you like, the 
 blocked process (for example "ls")?
>>>
>>> Statedumps are gluster specific. Please refer to 
>>> https://docs.gluster.org/en/v3/Troubleshooting/statedump/ for instructions.
>>>
 Regards,
 M.

 ‐‐‐ Original Message ‐‐‐
 On August 9, 2018 3:10 PM, Nithya Balachandran  wrote:

> Hi,
>
> Please provide the following:
>
> - gluster volume info
> - statedump of the fuse process when it hangs
>
> Thanks,
> Nithya
>
> On 9 August 2018 at 18:24, mabi  wrote:
>
>> Hello,
>>
>> I recently upgraded my GlusterFS replica 2+1 (arbiter) to version 
>> 3.12.12 and now I see a weird behaviour on my client (using FUSE mount) 
>> where I have processes (PHP 5.6 FPM) trying to access a specific 
>> directory and then the process blocks. I can't kill the process either, 
>> not even with kill -9. I need to reboot the machine in order to get rid 
>> of these blocked processes.
>>
>> This directory has one particularity compared to the other directories: 
>> it has reached its quota soft-limit, as you can see here in the output 
>> of gluster volume quota list:
>>
>>   Path   Hard-limit  Soft-limit  
>> Used  Available  Soft-limit exceeded? Hard-limit exceeded?
>> ---
>> /directory  100.0GB 80%(80.0GB)   90.5GB   
>> 9.5GB Yes   No
>>
>> That does not mean it is the quota's fault, but it might be a hint about 
>> where to start looking... And by the way, can someone explain to me what 
>> the soft-limit does? Or does it not do anything special?
>>
>> Here is the Linux stack trace of a blocked process on that directory, which 
>> happened with a simple "ls -la":
>>
>> [Thu Aug  9 14:21:07 2018] INFO: task ls:2272 blocked for more than 120 
>> seconds.
>> [Thu Aug  9 14:21:07 2018]   Not tainted 3.16.0-4-amd64 #1
>> [Thu Aug  9 14:21:07 2018] "echo 0 > 
>> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> [Thu Aug  9 14:21:07 2018] ls  D 88017ef93200 0  
>> 2272   2268 0x0004
>> [Thu Aug  9 14:21:07 2018]  88017653f490 0286 
>> 

Re: [Gluster-users] blocking process on FUSE mount in directory which is using quota

2018-08-10 Thread Nithya Balachandran
On 9 August 2018 at 19:54, mabi  wrote:

> Thanks for the documentation. On my client using FUSE mount I found the
> PID by using ps (output below):
>
> root   456 1  4 14:17 ?00:05:15 /usr/sbin/glusterfs
> --volfile-server=gfs1a --volfile-id=myvol-private /mnt/myvol-private
>
> Then I ran the following command
>
> sudo kill -USR1 456
>
> but now I can't find where the files are stored. Are these supposed to be
> stored on the client directly? I checked /var/run/gluster and
> /var/log/gluster but could not see anything and /var/log/gluster does not
> even exist on the client.
>

They are usually created in /var/run/gluster. You will need to create the
directory on the client if it does not exist.



‐‐‐ Original Message ‐‐‐
> On August 9, 2018 3:59 PM, Raghavendra Gowdappa 
> wrote:
>
>
>
> On Thu, Aug 9, 2018 at 6:47 PM, mabi  wrote:
>
>> Hi Nithya,
>>
>> Thanks for the fast answer. Here is the additional info:
>>
>> 1. gluster volume info
>>
>> Volume Name: myvol-private
>> Type: Replicate
>> Volume ID: e7a40a1b-45c9-4d3c-bb19-0c59b4eceec5
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 1 x (2 + 1) = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: gfs1a:/data/myvol-private/brick
>> Brick2: gfs1b:/data/myvol-private/brick
>> Brick3: gfs1c:/srv/glusterfs/myvol-private/brick (arbiter)
>> Options Reconfigured:
>> features.default-soft-limit: 95%
>> transport.address-family: inet
>> features.quota-deem-statfs: on
>> features.inode-quota: on
>> features.quota: on
>> nfs.disable: on
>> performance.readdir-ahead: on
>> client.event-threads: 4
>> server.event-threads: 4
>> auth.allow: 192.168.100.92
>>
>>
>>
>> 2. Sorry, I have no clue how to take a "statedump" of a process on Linux.
>> Which command should I use for that? And which process would you like, the
>> blocked process (for example "ls")?
>>
>
> Statedumps are gluster specific. Please refer to
> https://docs.gluster.org/en/v3/Troubleshooting/statedump/ for
> instructions.
>
>
>>
>> Regards,
>> M.
>>
>> ‐‐‐ Original Message ‐‐‐
>> On August 9, 2018 3:10 PM, Nithya Balachandran 
>> wrote:
>>
>> Hi,
>>
>> Please provide the following:
>>
>>1. gluster volume info
>>2. statedump of the fuse process when it hangs
>>
>>
>> Thanks,
>> Nithya
>>
>> On 9 August 2018 at 18:24, mabi  wrote:
>>
>>> Hello,
>>>
>>> I recently upgraded my GlusterFS replica 2+1 (arbiter) to version
>>> 3.12.12 and now I see a weird behaviour on my client (using FUSE mount)
>>> where I have processes (PHP 5.6 FPM) trying to access a specific directory
>>> and then the process blocks. I can't kill the process either, not even with
>>> kill -9. I need to reboot the machine in order to get rid of these blocked
>>> processes.
>>>
>>> This directory has one particularity compared to the other directories:
>>> it has reached its quota soft-limit, as you can see here in the
>>> output of gluster volume quota list:
>>>
>>>   Path   Hard-limit  Soft-limit  Used  Available  Soft-limit exceeded? Hard-limit exceeded?
>>> ---
>>> /directory  100.0GB 80%(80.0GB)   90.5GB   9.5GB Yes   No
>>>
>>> That does not mean it is the quota's fault, but it might be a hint
>>> about where to start looking... And by the way, can someone explain to
>>> me what the soft-limit does? Or does it not do anything special?
>>>
>>> Here is the Linux stack trace of a blocked process on that directory, which
>>> happened with a simple "ls -la":
>>>
>>> [Thu Aug  9 14:21:07 2018] INFO: task ls:2272 blocked for more than 120
>>> seconds.
>>> [Thu Aug  9 14:21:07 2018]   Not tainted 3.16.0-4-amd64 #1
>>> [Thu Aug  9 14:21:07 2018] "echo 0 > 
>>> /proc/sys/kernel/hung_task_timeout_secs"
>>> disables this message.
>>> [Thu Aug  9 14:21:07 2018] ls  D 88017ef93200 0
>>> 2272   2268 0x0004
>>> [Thu Aug  9 14:21:07 2018]  88017653f490 0286
>>> 00013200 880174d7bfd8
>>> [Thu Aug  9 14:21:07 2018]  00013200 88017653f490
>>> 8800eeb3d5f0 8800fefac800
>>> [Thu Aug  9 14:21:07 2018]  880174d7bbe0 8800eeb3d6d0
>>> 8800fefac800 8800ffe1e1c0
>>> [Thu Aug  9 14:21:07 2018] Call Trace:
>>> [Thu Aug  9 14:21:07 2018]  [] ?
>>> __fuse_request_send+0xbd/0x270 [fuse]
>>> [Thu Aug  9 14:21:07 2018]  [] ?
>>> prepare_to_wait_event+0xf0/0xf0
>>> [Thu Aug  9 14:21:07 2018]  [] ?
>>> fuse_dentry_revalidate+0x181/0x300 [fuse]
>>> [Thu Aug  9 14:21:07 2018]  [] ?
>>> lookup_fast+0x25e/0x2b0
>>> [Thu Aug  9 14:21:07 2018]  [] ?
>>> path_lookupat+0x155/0x780
>>> [Thu Aug  9 14:21:07 2018]  [] ?
>>> kmem_cache_alloc+0x75/0x480
>>> [Thu Aug  9 14:21:07 2018]  [] ?
>>> fuse_getxattr+0xe9/0x150 [fuse]
>>> [Thu Aug  9 14:21:07 2018]  [] ?
>>> filename_lookup+0x26/0xc0
>>> [Thu Aug  9 14:21:07 2018] 

Re: [Gluster-users] blocking process on FUSE mount in directory which is using quota

2018-08-09 Thread mabi
Thanks for the documentation. On my client using FUSE mount I found the PID by 
using ps (output below):

root   456 1  4 14:17 ?00:05:15 /usr/sbin/glusterfs 
--volfile-server=gfs1a --volfile-id=myvol-private /mnt/myvol-private

Then I ran the following command

sudo kill -USR1 456

but now I can't find where the files are stored. Are these supposed to be 
stored on the client directly? I checked /var/run/gluster and /var/log/gluster 
but could not see anything and /var/log/gluster does not even exist on the 
client.
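
(For anyone following along, the sequence that eventually worked is roughly
the following; the glusterdump.* filename pattern is my assumption of what
the client writes:)

sudo mkdir -p /var/run/gluster
sudo kill -USR1 456
ls -l /var/run/gluster/glusterdump.*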

‐‐‐ Original Message ‐‐‐
On August 9, 2018 3:59 PM, Raghavendra Gowdappa  wrote:

> On Thu, Aug 9, 2018 at 6:47 PM, mabi  wrote:
>
>> Hi Nithya,
>>
>> Thanks for the fast answer. Here is the additional info:
>>
>> 1. gluster volume info
>>
>> Volume Name: myvol-private
>> Type: Replicate
>> Volume ID: e7a40a1b-45c9-4d3c-bb19-0c59b4eceec5
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 1 x (2 + 1) = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: gfs1a:/data/myvol-private/brick
>> Brick2: gfs1b:/data/myvol-private/brick
>> Brick3: gfs1c:/srv/glusterfs/myvol-private/brick (arbiter)
>> Options Reconfigured:
>> features.default-soft-limit: 95%
>> transport.address-family: inet
>> features.quota-deem-statfs: on
>> features.inode-quota: on
>> features.quota: on
>> nfs.disable: on
>> performance.readdir-ahead: on
>> client.event-threads: 4
>> server.event-threads: 4
>> auth.allow: 192.168.100.92
>>
>> 2. Sorry, I have no clue how to take a "statedump" of a process on Linux. 
>> Which command should I use for that? And which process would you like, the 
>> blocked process (for example "ls")?
>
> Statedumps are gluster specific. Please refer to 
> https://docs.gluster.org/en/v3/Troubleshooting/statedump/ for instructions.
>
>> Regards,
>> M.
>>
>> ‐‐‐ Original Message ‐‐‐
>> On August 9, 2018 3:10 PM, Nithya Balachandran  wrote:
>>
>>> Hi,
>>>
>>> Please provide the following:
>>>
>>> - gluster volume info
>>> - statedump of the fuse process when it hangs
>>>
>>> Thanks,
>>> Nithya
>>>
>>> On 9 August 2018 at 18:24, mabi  wrote:
>>>
 Hello,

 I recently upgraded my GlusterFS replica 2+1 (arbiter) to version 3.12.12 
 and now I see a weird behaviour on my client (using FUSE mount) where I 
 have processes (PHP 5.6 FPM) trying to access a specific directory and 
 then the process blocks. I can't kill the process either, not even with 
 kill -9. I need to reboot the machine in order to get rid of these blocked 
 processes.

 This directory has one particularity compared to the other directories: 
 it has reached its quota soft-limit, as you can see here in the 
 output of gluster volume quota list:

   Path   Hard-limit  Soft-limit  Used  
 Available  Soft-limit exceeded? Hard-limit exceeded?
 ---
 /directory  100.0GB 80%(80.0GB)   90.5GB   
 9.5GB Yes   No

 That does not mean it is the quota's fault, but it might be a hint 
 about where to start looking... And by the way, can someone explain to 
 me what the soft-limit does? Or does it not do anything special?

 Here is the Linux stack trace of a blocked process on that directory, which 
 happened with a simple "ls -la":

 [Thu Aug  9 14:21:07 2018] INFO: task ls:2272 blocked for more than 120 
 seconds.
 [Thu Aug  9 14:21:07 2018]   Not tainted 3.16.0-4-amd64 #1
 [Thu Aug  9 14:21:07 2018] "echo 0 > 
 /proc/sys/kernel/hung_task_timeout_secs" disables this message.
 [Thu Aug  9 14:21:07 2018] ls  D 88017ef93200 0  2272  
  2268 0x0004
 [Thu Aug  9 14:21:07 2018]  88017653f490 0286 
 00013200 880174d7bfd8
 [Thu Aug  9 14:21:07 2018]  00013200 88017653f490 
 8800eeb3d5f0 8800fefac800
 [Thu Aug  9 14:21:07 2018]  880174d7bbe0 8800eeb3d6d0 
 8800fefac800 8800ffe1e1c0
 [Thu Aug  9 14:21:07 2018] Call Trace:
 [Thu Aug  9 14:21:07 2018]  [] ? 
 __fuse_request_send+0xbd/0x270 [fuse]
 [Thu Aug  9 14:21:07 2018]  [] ? 
 prepare_to_wait_event+0xf0/0xf0
 [Thu Aug  9 14:21:07 2018]  [] ? 
 fuse_dentry_revalidate+0x181/0x300 [fuse]
 [Thu Aug  9 14:21:07 2018]  [] ? lookup_fast+0x25e/0x2b0
 [Thu Aug  9 14:21:07 2018]  [] ? 
 path_lookupat+0x155/0x780
 [Thu Aug  9 14:21:07 2018]  [] ? 
 kmem_cache_alloc+0x75/0x480
 [Thu Aug  9 14:21:07 2018]  [] ? 
 fuse_getxattr+0xe9/0x150 [fuse]
 [Thu Aug  9 14:21:07 2018]  [] ? 
 filename_lookup+0x26/0xc0
 [Thu Aug  9 14:21:07 2018]  [] ? 
 user_path_at_empty+0x54/0x90
 [Thu Aug  9 14:21:07 2018]  [] ? 
 kmem_cache_free+0xd8/0x210
 [Thu Aug  9 

Re: [Gluster-users] blocking process on FUSE mount in directory which is using quota

2018-08-09 Thread Raghavendra Gowdappa
On Thu, Aug 9, 2018 at 6:47 PM, mabi  wrote:

> Hi Nithya,
>
> Thanks for the fast answer. Here is the additional info:
>
> 1. gluster volume info
>
> Volume Name: myvol-private
> Type: Replicate
> Volume ID: e7a40a1b-45c9-4d3c-bb19-0c59b4eceec5
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x (2 + 1) = 3
> Transport-type: tcp
> Bricks:
> Brick1: gfs1a:/data/myvol-private/brick
> Brick2: gfs1b:/data/myvol-private/brick
> Brick3: gfs1c:/srv/glusterfs/myvol-private/brick (arbiter)
> Options Reconfigured:
> features.default-soft-limit: 95%
> transport.address-family: inet
> features.quota-deem-statfs: on
> features.inode-quota: on
> features.quota: on
> nfs.disable: on
> performance.readdir-ahead: on
> client.event-threads: 4
> server.event-threads: 4
> auth.allow: 192.168.100.92
>
>
>
> 2. Sorry, I have no clue how to take a "statedump" of a process on Linux.
> Which command should I use for that? And which process would you like, the
> blocked process (for example "ls")?
>

Statedumps are gluster specific. Please refer to
https://docs.gluster.org/en/v3/Troubleshooting/statedump/ for instructions.
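
(For completeness: on the servers, statedumps of all brick processes can be
triggered centrally with something like the command below; the FUSE client
still needs the SIGUSR1 approach from the linked page. Volume name taken
from this thread.)

gluster volume statedump myvol-private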


> Regards,
> M.
>
> ‐‐‐ Original Message ‐‐‐
> On August 9, 2018 3:10 PM, Nithya Balachandran 
> wrote:
>
> Hi,
>
> Please provide the following:
>
>1. gluster volume info
>2. statedump of the fuse process when it hangs
>
>
> Thanks,
> Nithya
>
> On 9 August 2018 at 18:24, mabi  wrote:
>
>> Hello,
>>
>> I recently upgraded my GlusterFS replica 2+1 (arbiter) to version 3.12.12
>> and now I see a weird behaviour on my client (using FUSE mount) where I
>> have processes (PHP 5.6 FPM) trying to access a specific directory and then
>> the process blocks. I can't kill the process either, not even with kill -9.
>> I need to reboot the machine in order to get rid of these blocked processes.
>>
>> This directory has one particularity compared to the other directories:
>> it has reached its quota soft-limit, as you can see here in the
>> output of gluster volume quota list:
>>
>>   Path   Hard-limit  Soft-limit  Used  Available  Soft-limit exceeded? Hard-limit exceeded?
>> ---
>> /directory  100.0GB 80%(80.0GB)   90.5GB   9.5GB Yes   No
>>
>> That does not mean it is the quota's fault, but it might be a hint
>> about where to start looking... And by the way, can someone explain to
>> me what the soft-limit does? Or does it not do anything special?
>>
>> Here is the Linux stack trace of a blocked process on that directory, which
>> happened with a simple "ls -la":
>>
>> [Thu Aug  9 14:21:07 2018] INFO: task ls:2272 blocked for more than 120
>> seconds.
>> [Thu Aug  9 14:21:07 2018]   Not tainted 3.16.0-4-amd64 #1
>> [Thu Aug  9 14:21:07 2018] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>> disables this message.
>> [Thu Aug  9 14:21:07 2018] ls  D 88017ef93200 0
>> 2272   2268 0x0004
>> [Thu Aug  9 14:21:07 2018]  88017653f490 0286
>> 00013200 880174d7bfd8
>> [Thu Aug  9 14:21:07 2018]  00013200 88017653f490
>> 8800eeb3d5f0 8800fefac800
>> [Thu Aug  9 14:21:07 2018]  880174d7bbe0 8800eeb3d6d0
>> 8800fefac800 8800ffe1e1c0
>> [Thu Aug  9 14:21:07 2018] Call Trace:
>> [Thu Aug  9 14:21:07 2018]  [] ?
>> __fuse_request_send+0xbd/0x270 [fuse]
>> [Thu Aug  9 14:21:07 2018]  [] ?
>> prepare_to_wait_event+0xf0/0xf0
>> [Thu Aug  9 14:21:07 2018]  [] ?
>> fuse_dentry_revalidate+0x181/0x300 [fuse]
>> [Thu Aug  9 14:21:07 2018]  [] ? lookup_fast+0x25e/0x2b0
>> [Thu Aug  9 14:21:07 2018]  [] ?
>> path_lookupat+0x155/0x780
>> [Thu Aug  9 14:21:07 2018]  [] ?
>> kmem_cache_alloc+0x75/0x480
>> [Thu Aug  9 14:21:07 2018]  [] ?
>> fuse_getxattr+0xe9/0x150 [fuse]
>> [Thu Aug  9 14:21:07 2018]  [] ?
>> filename_lookup+0x26/0xc0
>> [Thu Aug  9 14:21:07 2018]  [] ?
>> user_path_at_empty+0x54/0x90
>> [Thu Aug  9 14:21:07 2018]  [] ?
>> kmem_cache_free+0xd8/0x210
>> [Thu Aug  9 14:21:07 2018]  [] ?
>> user_path_at_empty+0x5f/0x90
>> [Thu Aug  9 14:21:07 2018]  [] ? vfs_fstatat+0x46/0x90
>> [Thu Aug  9 14:21:07 2018]  [] ? SYSC_newlstat+0x1d/0x40
>> [Thu Aug  9 14:21:07 2018]  [] ? SyS_lgetxattr+0x58/0x80
>> [Thu Aug  9 14:21:07 2018]  [] ?
>> system_call_fast_compare_end+0x10/0x15
>>
>>
>> My 3 gluster nodes are all Debian 9 and my client Debian 8.
>>
>> Let me know if you need more information.
>>
>> Best regards,
>> Mabi
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
>

Re: [Gluster-users] blocking process on FUSE mount in directory which is using quota

2018-08-09 Thread Alvin Starr

A while back we found an issue with PHP around file locks.

https://bugs.php.net/bug.php?id=53076

This would leave hanging processes that we could only kill by forcibly 
remounting the filesystems.


It's not really a PHP bug, but they are not willing to fix the code to 
deal with Gluster's idiosyncrasies.
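
(A quick, non-destructive way to check whether file locks on the FUSE mount
are the culprit is flock(1) with a non-blocking attempt; the test file path
here is hypothetical:)

flock -n /mnt/myvol-private/testlock -c 'echo lock acquired' || echo 'lock would block'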



On 08/09/2018 09:10 AM, Nithya Balachandran wrote:

Hi,

Please provide the following:

 1. gluster volume info
 2. statedump of the fuse process when it hangs


Thanks,
Nithya

On 9 August 2018 at 18:24, mabi  wrote:


Hello,

I recently upgraded my GlusterFS replica 2+1 (arbiter) to version
3.12.12 and now I see a weird behaviour on my client (using FUSE
mount) where I have processes (PHP 5.6 FPM) trying to access a
specific directory and then the process blocks. I can't kill the
process either, not even with kill -9. I need to reboot the
machine in order to get rid of these blocked processes.

This directory has one particularity compared to the other
directories: it has reached its quota soft-limit, as you
can see here in the output of gluster volume quota list:

  Path   Hard-limit  Soft-limit  Used  Available  Soft-limit exceeded? Hard-limit exceeded?
---
/directory  100.0GB 80%(80.0GB)   90.5GB   9.5GB Yes   No

That does not mean it is the quota's fault, but it might be a
hint about where to start looking... And by the way, can someone
explain to me what the soft-limit does? Or does it not do anything
special?

Here is the Linux stack trace of a blocked process on that directory,
which happened with a simple "ls -la":

[Thu Aug  9 14:21:07 2018] INFO: task ls:2272 blocked for more
than 120 seconds.
[Thu Aug  9 14:21:07 2018]       Not tainted 3.16.0-4-amd64 #1
[Thu Aug  9 14:21:07 2018] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Thu Aug  9 14:21:07 2018] ls  D 88017ef93200  0  2272   2268 0x0004
[Thu Aug  9 14:21:07 2018]  88017653f490 0286
00013200 880174d7bfd8
[Thu Aug  9 14:21:07 2018]  00013200 88017653f490
8800eeb3d5f0 8800fefac800
[Thu Aug  9 14:21:07 2018]  880174d7bbe0 8800eeb3d6d0
8800fefac800 8800ffe1e1c0
[Thu Aug  9 14:21:07 2018] Call Trace:
[Thu Aug  9 14:21:07 2018]  [] ?
__fuse_request_send+0xbd/0x270 [fuse]
[Thu Aug  9 14:21:07 2018]  [] ?
prepare_to_wait_event+0xf0/0xf0
[Thu Aug  9 14:21:07 2018]  [] ?
fuse_dentry_revalidate+0x181/0x300 [fuse]
[Thu Aug  9 14:21:07 2018]  [] ?
lookup_fast+0x25e/0x2b0
[Thu Aug  9 14:21:07 2018]  [] ?
path_lookupat+0x155/0x780
[Thu Aug  9 14:21:07 2018]  [] ?
kmem_cache_alloc+0x75/0x480
[Thu Aug  9 14:21:07 2018]  [] ?
fuse_getxattr+0xe9/0x150 [fuse]
[Thu Aug  9 14:21:07 2018]  [] ?
filename_lookup+0x26/0xc0
[Thu Aug  9 14:21:07 2018]  [] ?
user_path_at_empty+0x54/0x90
[Thu Aug  9 14:21:07 2018]  [] ?
kmem_cache_free+0xd8/0x210
[Thu Aug  9 14:21:07 2018]  [] ?
user_path_at_empty+0x5f/0x90
[Thu Aug  9 14:21:07 2018]  [] ?
vfs_fstatat+0x46/0x90
[Thu Aug  9 14:21:07 2018]  [] ?
SYSC_newlstat+0x1d/0x40
[Thu Aug  9 14:21:07 2018]  [] ?
SyS_lgetxattr+0x58/0x80
[Thu Aug  9 14:21:07 2018]  [] ?
system_call_fast_compare_end+0x10/0x15


My 3 gluster nodes are all Debian 9 and my client Debian 8.

Let me know if you need more information.

Best regards,
Mabi
___
Gluster-users mailing list
Gluster-users@gluster.org 
https://lists.gluster.org/mailman/listinfo/gluster-users





___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


--
Alvin Starr   ||   land:  (905)513-7688
Netvel Inc.   ||   Cell:  (416)806-0133
al...@netvel.net  ||

___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] blocking process on FUSE mount in directory which is using quota

2018-08-09 Thread mabi
Hi Nithya,

Thanks for the fast answer. Here is the additional info:

1. gluster volume info

Volume Name: myvol-private
Type: Replicate
Volume ID: e7a40a1b-45c9-4d3c-bb19-0c59b4eceec5
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: gfs1a:/data/myvol-private/brick
Brick2: gfs1b:/data/myvol-private/brick
Brick3: gfs1c:/srv/glusterfs/myvol-private/brick (arbiter)
Options Reconfigured:
features.default-soft-limit: 95%
transport.address-family: inet
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on
nfs.disable: on
performance.readdir-ahead: on
client.event-threads: 4
server.event-threads: 4
auth.allow: 192.168.100.92

2. Sorry, I have no clue how to take a "statedump" of a process on Linux. Which 
command should I use for that? And which process would you like, the blocked 
process (for example "ls")?

Regards,
M.

‐‐‐ Original Message ‐‐‐
On August 9, 2018 3:10 PM, Nithya Balachandran  wrote:

> Hi,
>
> Please provide the following:
>
> - gluster volume info
> - statedump of the fuse process when it hangs
>
> Thanks,
> Nithya
>
> On 9 August 2018 at 18:24, mabi  wrote:
>
>> Hello,
>>
>> I recently upgraded my GlusterFS replica 2+1 (arbiter) to version 3.12.12 
>> and now I see a weird behaviour on my client (using FUSE mount) where I have 
>> processes (PHP 5.6 FPM) trying to access a specific directory and then the 
>> process blocks. I can't kill the process either, not even with kill -9. I 
>> need to reboot the machine in order to get rid of these blocked processes.
>>
>> This directory has one particularity compared to the other directories: it 
>> has reached its quota soft-limit, as you can see here in the output 
>> of gluster volume quota list:
>>
>>   Path   Hard-limit  Soft-limit  Used  
>> Available  Soft-limit exceeded? Hard-limit exceeded?
>> ---
>> /directory  100.0GB 80%(80.0GB)   90.5GB   9.5GB 
>> Yes   No
>>
>> That does not mean it is the quota's fault, but it might be a hint about 
>> where to start looking... And by the way, can someone explain to me what the 
>> soft-limit does? Or does it not do anything special?
>>
>> Here is the Linux stack trace of a blocked process on that directory, which 
>> happened with a simple "ls -la":
>>
>> [Thu Aug  9 14:21:07 2018] INFO: task ls:2272 blocked for more than 120 
>> seconds.
>> [Thu Aug  9 14:21:07 2018]   Not tainted 3.16.0-4-amd64 #1
>> [Thu Aug  9 14:21:07 2018] "echo 0 > 
>> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> [Thu Aug  9 14:21:07 2018] ls  D 88017ef93200 0  2272   
>> 2268 0x0004
>> [Thu Aug  9 14:21:07 2018]  88017653f490 0286 
>> 00013200 880174d7bfd8
>> [Thu Aug  9 14:21:07 2018]  00013200 88017653f490 
>> 8800eeb3d5f0 8800fefac800
>> [Thu Aug  9 14:21:07 2018]  880174d7bbe0 8800eeb3d6d0 
>> 8800fefac800 8800ffe1e1c0
>> [Thu Aug  9 14:21:07 2018] Call Trace:
>> [Thu Aug  9 14:21:07 2018]  [] ? 
>> __fuse_request_send+0xbd/0x270 [fuse]
>> [Thu Aug  9 14:21:07 2018]  [] ? 
>> prepare_to_wait_event+0xf0/0xf0
>> [Thu Aug  9 14:21:07 2018]  [] ? 
>> fuse_dentry_revalidate+0x181/0x300 [fuse]
>> [Thu Aug  9 14:21:07 2018]  [] ? lookup_fast+0x25e/0x2b0
>> [Thu Aug  9 14:21:07 2018]  [] ? path_lookupat+0x155/0x780
>> [Thu Aug  9 14:21:07 2018]  [] ? 
>> kmem_cache_alloc+0x75/0x480
>> [Thu Aug  9 14:21:07 2018]  [] ? fuse_getxattr+0xe9/0x150 
>> [fuse]
>> [Thu Aug  9 14:21:07 2018]  [] ? filename_lookup+0x26/0xc0
>> [Thu Aug  9 14:21:07 2018]  [] ? 
>> user_path_at_empty+0x54/0x90
>> [Thu Aug  9 14:21:07 2018]  [] ? kmem_cache_free+0xd8/0x210
>> [Thu Aug  9 14:21:07 2018]  [] ? 
>> user_path_at_empty+0x5f/0x90
>> [Thu Aug  9 14:21:07 2018]  [] ? vfs_fstatat+0x46/0x90
>> [Thu Aug  9 14:21:07 2018]  [] ? SYSC_newlstat+0x1d/0x40
>> [Thu Aug  9 14:21:07 2018]  [] ? SyS_lgetxattr+0x58/0x80
>> [Thu Aug  9 14:21:07 2018]  [] ? 
>> system_call_fast_compare_end+0x10/0x15
>>
>> My 3 gluster nodes are all Debian 9 and my client Debian 8.
>>
>> Let me know if you need more information.
>>
>> Best regards,
>> Mabi
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] blocking process on FUSE mount in directory which is using quota

2018-08-09 Thread Nithya Balachandran
Hi,

Please provide the following:

   1. gluster volume info
   2. statedump of the fuse process when it hangs


Thanks,
Nithya

On 9 August 2018 at 18:24, mabi  wrote:

> Hello,
>
> I recently upgraded my GlusterFS replica 2+1 (arbiter) to version 3.12.12
> and now I see a weird behaviour on my client (using FUSE mount) where I
> have processes (PHP 5.6 FPM) trying to access a specific directory and then
> the process blocks. I can't kill the process either, not even with kill -9.
> I need to reboot the machine in order to get rid of these blocked processes.
>
> This directory has one particularity compared to the other directories:
> it has reached its quota soft-limit, as you can see here in the
> output of gluster volume quota list:
>
>   Path   Hard-limit  Soft-limit  Used  Available  Soft-limit exceeded? Hard-limit exceeded?
> ---
> /directory  100.0GB 80%(80.0GB)   90.5GB   9.5GB Yes   No
>
> That does not mean it is the quota's fault, but it might be a hint
> about where to start looking... And by the way, can someone explain to
> me what the soft-limit does? Or does it not do anything special?
>
> Here is the Linux stack trace of a blocked process on that directory, which
> happened with a simple "ls -la":
>
> [Thu Aug  9 14:21:07 2018] INFO: task ls:2272 blocked for more than 120
> seconds.
> [Thu Aug  9 14:21:07 2018]   Not tainted 3.16.0-4-amd64 #1
> [Thu Aug  9 14:21:07 2018] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [Thu Aug  9 14:21:07 2018] ls  D 88017ef93200 0  2272
>  2268 0x0004
> [Thu Aug  9 14:21:07 2018]  88017653f490 0286
> 00013200 880174d7bfd8
> [Thu Aug  9 14:21:07 2018]  00013200 88017653f490
> 8800eeb3d5f0 8800fefac800
> [Thu Aug  9 14:21:07 2018]  880174d7bbe0 8800eeb3d6d0
> 8800fefac800 8800ffe1e1c0
> [Thu Aug  9 14:21:07 2018] Call Trace:
> [Thu Aug  9 14:21:07 2018]  [] ?
> __fuse_request_send+0xbd/0x270 [fuse]
> [Thu Aug  9 14:21:07 2018]  [] ?
> prepare_to_wait_event+0xf0/0xf0
> [Thu Aug  9 14:21:07 2018]  [] ?
> fuse_dentry_revalidate+0x181/0x300 [fuse]
> [Thu Aug  9 14:21:07 2018]  [] ? lookup_fast+0x25e/0x2b0
> [Thu Aug  9 14:21:07 2018]  [] ?
> path_lookupat+0x155/0x780
> [Thu Aug  9 14:21:07 2018]  [] ?
> kmem_cache_alloc+0x75/0x480
> [Thu Aug  9 14:21:07 2018]  [] ?
> fuse_getxattr+0xe9/0x150 [fuse]
> [Thu Aug  9 14:21:07 2018]  [] ?
> filename_lookup+0x26/0xc0
> [Thu Aug  9 14:21:07 2018]  [] ?
> user_path_at_empty+0x54/0x90
> [Thu Aug  9 14:21:07 2018]  [] ?
> kmem_cache_free+0xd8/0x210
> [Thu Aug  9 14:21:07 2018]  [] ?
> user_path_at_empty+0x5f/0x90
> [Thu Aug  9 14:21:07 2018]  [] ? vfs_fstatat+0x46/0x90
> [Thu Aug  9 14:21:07 2018]  [] ? SYSC_newlstat+0x1d/0x40
> [Thu Aug  9 14:21:07 2018]  [] ? SyS_lgetxattr+0x58/0x80
> [Thu Aug  9 14:21:07 2018]  [] ?
> system_call_fast_compare_end+0x10/0x15
>
>
> My 3 gluster nodes are all Debian 9 and my client Debian 8.
>
> Let me know if you need more information.
>
> Best regards,
> Mabi
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users