Re: [Gluster-devel] Bugs summary jenkins job

2021-03-10 Thread Ravishankar N

+1.

On 10/03/21 7:37 pm, Amar Tumballi wrote:

I personally haven't checked it after migrating to GitHub.

Haven't seen any PRs coming with a bug reference either. IMO, it is okay to stop 
the job and clean up the python2 reference.


On Wed, 10 Mar, 2021, 7:21 pm Michael Scherer wrote:


Hi,

are we still using the bugs summary on
https://bugs.gluster.org/gluster-bugs.html
 ?

As we moved out of bugzilla, I think the script wasn't adapted to
github, and it is still running on python 2 (so we need to keep a
Fedora 30 around for that)


-- 
Michael Scherer / He/Il/Er/Él

Sysadmin, Community Infrastructure



---

Community Meeting Calendar:
Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk


Gluster-devel mailing list
Gluster-devel@gluster.org 
https://lists.gluster.org/mailman/listinfo/gluster-devel





Re: [Gluster-devel] Removing problematic language in geo-replication

2020-12-30 Thread Ravishankar N

Hello,

Just a quick update: all geo-rep related offensive words (in fact, even the 
non geo-rep ones) that could be removed from the code have now been removed 
from the devel branch of the glusterfs repo. I thank 
everyone for their suggestions, debugging/testing help and code reviews.


Since we have some soak time before the changes make it to the 
release-10 branch, I would encourage you to test the changes and report 
any issues that you might find. Please try out both new geo-rep setups 
as well as upgrade scenarios (say, from a supported release version to 
the latest devel branch).
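
For anyone who wants to give it a spin, a minimal test session could look like the 
sketch below. All names are placeholders, passwordless SSH from the primary node to 
the secondary node is assumed to be set up already, and the exact flags are from 
memory, so treat it as a sketch rather than a recipe:

# Create and start a session from the primary volume to the secondary volume.
gluster volume geo-replication primary-vol secondary-host::secondary-vol create push-pem
gluster volume geo-replication primary-vol secondary-host::secondary-vol start

# Verify that the session becomes Active and that the status output
# (plain and XML) uses the new terminology:
gluster volume geo-replication primary-vol secondary-host::secondary-vol status
gluster --xml volume geo-replication primary-vol secondary-host::secondary-vol status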


Also, for any new PRs that we are sending/reviewing/merging, we need to 
keep in mind not to re-introduce any offensive words.


Wishing you all a happy new year!
Ravi

On 22/07/20 5:06 pm, Aravinda VK wrote:

+1


On 22-Jul-2020, at 2:34 PM, Ravishankar N  wrote:

Hi,

The gluster code base has some words and terminology (blacklist, whitelist, 
master, slave etc.) that can be considered hurtful/offensive to people in a 
global open source setting. Some of the words can be fixed trivially, but the 
Geo-replication code seems to be something that needs extensive rework, more so 
because we have these words being used in the CLI itself. Two questions that I 
had were:

1. Can I replace master:slave with primary:secondary everywhere in the code and 
the CLI? Are there any suggestions for more appropriate terminology?

Primary -> Secondary looks good.


2. Is it okay to target the changes to a major release (release-9) and *not* 
provide backward compatibility for the CLI?

Functionality is not affected and CLI commands are compatible since all are 
positional arguments. Need changes in

- Geo-rep status xml output
- Documentation
- CLI help
- Variables and other references in Code.


Thanks,

Ravi


___

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968




Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel


Aravinda Vishwanathapura
https://kadalu.io





---

Community Meeting Calendar:
Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk

Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel



Re: [Gluster-devel] .glusterfs directory?

2020-12-21 Thread Ravishankar N



On 21/12/20 2:35 pm, Emmanuel Dreyfus wrote:

On Mon, Dec 21, 2020 at 01:53:06PM +0530, Ravishankar N wrote:

Are you talking about the entries inside .glusterfs/indices/xattrop/* ? Any
stale entries here should automatically be purged by the self-heal daemon as
it crawls the folder periodically.

I mean for instance:
# ls -l .glusterfs/aa/aa/dd69-7b3d-45e9-bd0f-8a8bbaa189a5
lrwxrwxrwx  1 root  wheel  60 Nov  4  2018 
.glusterfs//aa/aa/dd69-7b3d-45e9-bd0f-8a8bbaa189a5 -> 
../../f0/91/f091de81-a4e2-4548-acf4-4b19c7bdac5e/tpm_nvwrite
# ls -l .glusterfs/f0/91/f091de81-a4e2-4548-acf4-4b19c7bdac
ls: .glusterfs/f0/91/f091de81-a4e2-4548-acf4-4b19c7bdac5e/tpm_nvwrite: No such 
file or directory


If this is the case on all bricks, then it might be okay to remove this 
stale symlink. But if the tpm_nvwrite directory is present on other bricks, 
then it is better to check what the path to it is [1], whether its 
trusted.gfid xattr is indeed dd69-7b3d-45e9-bd0f-8a8bbaa189a5, and 
why it is missing on this brick alone (maybe a pending self-heal?)
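
For example, something along these lines (the brick root below is a placeholder 
for your actual brick path):

# On a brick that still has the real directory: print its gfid and compare it
# with the name of the gfid symlink on the problematic brick.
getfattr -n trusted.gfid -e hex /bricks/brick1/path/to/tpm_nvwrite

# On the problematic brick: confirm that the symlink target really does not
# exist before deciding to remove the stale symlink.
ls -l /bricks/brick1/.glusterfs/aa/aa/
stat /bricks/brick1/.glusterfs/f0/91/f091de81-a4e2-4548-acf4-4b19c7bdac5e/tpm_nvwrite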


[1] 
https://github.com/gluster/glusterfs/commit/afbdcda3f4d6ffb906976064e0fa6f6b824718c8


-Ravi

---

Community Meeting Calendar:
Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk

Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel



Re: [Gluster-devel] .glusterfs directory?

2020-12-21 Thread Ravishankar N



On 21/12/20 1:16 pm, Emmanuel Dreyfus wrote:

On a healthy system, one should definitely not remove any files or sub
directories inside .glusterfs as they contain important metadata. Which
entries specifically inside .glusterfs do you think are stale and why?

There are indexes leading to no file, causing heal complaints.

Are you talking about the entries inside .glusterfs/indices/xattrop/* ?
Any stale entries here should automatically be purged by the self-heal
daemon as it crawls the folder periodically.
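
To cross-check whether the index entries really correspond to pending heals, 
something like this usually helps (brick path and volume name are placeholders):

# The entries here (other than the base xattrop-<uuid> file they are
# hard-linked to) are gfids of files that still need heal on this brick.
ls /bricks/brick1/.glusterfs/indices/xattrop/

# The CLI view of the same information, aggregated across bricks:
gluster volume heal myvol info
gluster volume heal myvol statistics heal-count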




---

Community Meeting Calendar:
Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk

Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel



Re: [Gluster-devel] .glusterfs directory?

2020-12-20 Thread Ravishankar N



On 21/12/20 7:10 am, Emmanuel Dreyfus wrote:

Hello

I have a lot of stale entries in bricks' .glusterfs directories. Is it
safe to just rm -rf it and hope for automatic rebuild? Reading the
source and experimenting, it does not seem obvious.

Or is there a way to clean up stale entries that lead to files that do
not exist anymore?

On a healthy system, one should definitely not remove any files or 
subdirectories inside .glusterfs as they contain important metadata. Which 
entries specifically inside .glusterfs do you think are stale and why?


-Ravi

---

Community Meeting Calendar:
Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk

Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel



Re: [Gluster-devel] Toggle storage.linux-aio and volume restart

2020-12-08 Thread Ravishankar N



On 09/12/20 10:39 am, Ravishankar N wrote:


On 08/12/20 9:15 pm, Dmitry Antipov wrote:
IOW if aio_configured is true, fops->readv and fops->writev should be 
set to posix_aio_readv()
and posix_aio_writev(), respectively. But the whole picture looks 
like something in xlator
graph silently reverts fops->readv and fops->writev back to 
posix_xxxv() defaults.

Looks like the (priv->io_uring_configured) check (which I added), which 
comes after the (priv->aio_configured) check in posix_reconfigure(), is 
overwriting this.


And considering we do not have graph switch implemented on the server 
side, it is undesirable to toggle the fop dispatch table when there 
could be in-flight fops. Perhaps we should disallow changing linux-aio on 
the fly, like how it is done for linux-io_uring (see the check added in 
glusterd_op_stage_set_volume() for io_uring).
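
Until such a check is in place, a safer way to flip the option would be to do it 
while the volume is stopped, so that no brick has in-flight fops; a rough sketch 
(volume name is a placeholder):

gluster volume stop myvol
gluster volume set myvol storage.linux-aio on    # or off
gluster volume start myvol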


-Ravi



-Ravi

---

Community Meeting Calendar:
Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk

Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel






Re: [Gluster-devel] Toggle storage.linux-aio and volume restart

2020-12-08 Thread Ravishankar N



On 08/12/20 9:15 pm, Dmitry Antipov wrote:
IOW if aio_configured is true, fops->readv and fops->writev should be 
set to posix_aio_readv()
and posix_aio_writev(), respectively. But the whole picture looks like 
something in xlator
graph silently reverts fops->readv and fops->writev back to 
posix_xxxv() defaults.

Looks like the (priv->io_uring_configured) check (which I added), which 
comes after the (priv->aio_configured) check in posix_reconfigure(), is 
overwriting this.


-Ravi

---

Community Meeting Calendar:
Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk

Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel



Re: [Gluster-devel] NFS Ganesha fails to export a volume

2020-11-17 Thread Ravishankar N


On 18/11/20 12:17 pm, Strahil Nikolov wrote:

Nope, it's a deeper s**t.
I had to edit the ".spec.in" file so it has Source0 point to a local tar.gz.
Then I edited the Requires in both ".spec" & ".spec.in", and I also had to remove 
an obsolete stanza in the glusterfs section.

In the end, I got the source, extracted it, copied in the spec & spec.in, and then 
tar.gz-ed it again and put it into the dir.

Only then were the rpms properly built.

The proposed patch is fixing the issue.

Thanks for confirming!


Why do we have line 285 in 
https://raw.githubusercontent.com/gluster/glusterfs/devel/glusterfs.spec.in ?

I guess I need to open 2 issues for the glusterfs:
- that obsolete stanza is useless


Using git blame points me to 
https://github.com/gluster/glusterfs/commit/f9118c2c9389e0793951388c2d69ce0350bb9318. 
Adding Shwetha to confirm if the change was intended.


-Ravi




Best Regards,
Strahil Nikolov



On Tuesday, 17 November 2020 at 14:16:36 GMT+2, Ravishankar N wrote:





Hi Strahil,

I would have imagined editing the 'Requires' section in
glusterfs.spec.in would have sufficed. Do you need rpms though? A source
install is not enough?

Regards,
Ravi

On 17/11/20 5:32 pm, Strahil Nikolov wrote:

Hi Ravi,


Any idea how to make the glusterfs-ganesha.x86_64 require resource-agents >= 
4.1.0 (instead of 4.2.0)?
I've replaced every occurrence I found and it still tries to grab 
resource-agents 4.2 (which is not available on EL8).

Best Regards,
Strahil Nikolov






On Monday, 16 November 2020 at 13:15:54 GMT+2, Ravishankar N wrote:






I am surprised too that it wasn't caught earlier.


Steps:

1. Clone the gluster repo

2. Compile the source: 
https://docs.gluster.org/en/latest/Developer-guide/Building-GlusterFS/

3. Make the changes (in a different branch if you prefer), compile again and 
install

4.  Test it out:

[root@linuxpad glusterfs]#  gluster v create testvol  
127.0.0.2:/home/ravi/bricks/brick{1..2} force
volume create: testvol: success: please start the volume to access data
[root@linuxpad glusterfs]#
[root@linuxpad glusterfs]# gluster v start testvol
volume start: testvol: success
[root@linuxpad glusterfs]#
[root@linuxpad glusterfs]# gluster v set testvol ganesha.enable on
volume set: failed: The option nfs-ganesha should be enabled before setting 
ganesha.enable.
[root@linuxpad glusterfs]#
 


I just tried the change and it looks like some new error shows up. Not too 
familiar with these settings; I will need to debug further.

Thanks,

Ravi


On 16/11/20 4:05 pm, Strahil Nikolov wrote:



     I can try to help with the testing (I'm quite new to that).
Can someone share documentation of that process ?

yet we have another problem -> ganesha is deployed with ocf:heartbeat:portblock 
which supports only IPTABLES, while EL8 uses NFTABLES ...

Best Regards,
Strahil Nikolov






On Monday, 16 November 2020 at 10:47:43 GMT+2, Yaniv Kaul wrote:







On Mon, Nov 16, 2020 at 10:26 AM Ravishankar N  wrote:


     On 15/11/20 8:24 pm, Strahil Nikolov wrote:


     Hello All,

did anyone get a chance to look at 
https://github.com/gluster/glusterfs/issues/1778 ?


A look at
https://review.gluster.org/#/c/glusterfs/+/23648/4/xlators/mgmt/glusterd/src/glusterd-op-sm.c@1117
seems to indicate this could be due to a typo error. Do you have a
source install where you can apply this simple diff and see if it fixes
the issue?


I think you are right - I seem to have introduced it as part of 
https://github.com/gluster/glusterfs/commit/e081ac683b6a5bda54891318fa1e3ffac981e553 
- my bad.

However, it was merged ~1 year ago, and no one has complained thus far... :-/
1. Is no one using NFS Ganesha?
2. We are lacking tests for NFS Ganesha - code coverage indicates this path is 
not covered.

Y.


   
diff --git a/xlators/mgmt/glusterd/src/glusterd-op-sm.c

b/xlators/mgmt/glusterd/src/glusterd-op-sm.c
index 558f04fb2..d7bf96adf 100644
--- a/xlators/mgmt/glusterd/src/glusterd-op-sm.c
+++ b/xlators/mgmt/glusterd/src/glusterd-op-sm.c
@@ -1177,7 +1177,7 @@ glusterd_op_stage_set_volume(dict_t *dict, char
**op_errstr)
     }
     } else if (len_strcmp(key, keylen, "ganesha.enable")) {
     key_matched = _gf_true;
-    if (!strcmp(value, "off") == 0) {
+    if (strcmp(value, "off") == 0) {
     ret = ganesha_manage_export(dict, "off", _gf_true,
op_errstr);
     if (ret)
     goto out;

Thanks,

Ravi


     It's really strange that NFS Ganesha has ever passed the tests.
How do we test NFS Ganesha exporting ?

Best Regards,
Strahil Nikolov
___

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge:https://bluejeans.com/441850968




Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listi

Re: [Gluster-devel] NFS Ganesha fails to export a volume

2020-11-17 Thread Ravishankar N

Hi Strahil,

I would have imagined editing the 'Requires' section in 
glusterfs.spec.in would have sufficed. Do you need rpms though? A source 
install is not enough?
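
Something along these lines is what I had in mind (the exact Requires string 
below is an assumption, so please check what your glusterfs.spec.in actually 
contains):

# In a glusterfs source checkout:
grep -n 'resource-agents' glusterfs.spec.in
# Assuming the line reads "Requires: resource-agents >= 4.2.0", relax it:
sed -i 's/resource-agents >= 4.2.0/resource-agents >= 4.1.0/' glusterfs.spec.in
# and then rebuild the rpms from this modified tree (rpmbuild or your usual
# rpm make target).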


Regards,
Ravi

On 17/11/20 5:32 pm, Strahil Nikolov wrote:

Hi Ravi,


Any idea how to make the glusterfs-ganesha.x86_64 require resource-agents >= 
4.1.0 (instead of 4.2.0)?
I've replaced every occurrence I found and it still tries to grab 
resource-agents 4.2 (which is not available on EL8).

Best Regards,
Strahil Nikolov






On Monday, 16 November 2020 at 13:15:54 GMT+2, Ravishankar N wrote:






I am surprised too that it wasn't caught earlier.


Steps:

1. Clone the gluster repo

2. Compile the source: 
https://docs.gluster.org/en/latest/Developer-guide/Building-GlusterFS/

3. Make the changes (in a different branch if you prefer), compile again and 
install

4.  Test it out:

[root@linuxpad glusterfs]#  gluster v create testvol  
127.0.0.2:/home/ravi/bricks/brick{1..2} force
volume create: testvol: success: please start the volume to access data
[root@linuxpad glusterfs]#
[root@linuxpad glusterfs]# gluster v start testvol
volume start: testvol: success
[root@linuxpad glusterfs]#
[root@linuxpad glusterfs]# gluster v set testvol ganesha.enable on
volume set: failed: The option nfs-ganesha should be enabled before setting 
ganesha.enable.
[root@linuxpad glusterfs]#
   


I just tried the change and it looks like some new error shows up. Not too 
familiar with these settings; I will need to debug further.

Thanks,

Ravi


On 16/11/20 4:05 pm, Strahil Nikolov wrote:



   I can try to help with the testing (I'm quite new to that).
Can someone share documentation of that process ?

yet we have another problem -> ganesha is deployed with ocf:heartbeat:portblock 
which supports only IPTABLES, while EL8 uses NFTABLES ...

Best Regards,
Strahil Nikolov






On Monday, 16 November 2020 at 10:47:43 GMT+2, Yaniv Kaul wrote:







On Mon, Nov 16, 2020 at 10:26 AM Ravishankar N  wrote:


   On 15/11/20 8:24 pm, Strahil Nikolov wrote:


   Hello All,

did anyone get a chance to look at 
https://github.com/gluster/glusterfs/issues/1778 ?


A look at
https://review.gluster.org/#/c/glusterfs/+/23648/4/xlators/mgmt/glusterd/src/glusterd-op-sm.c@1117  
seems to indicate this could be due to a typo error. Do you have a

source install where you can apply this simple diff and see if it fixes
the issue?


I think you are right - I seem to have introduced it as part of 
https://github.com/gluster/glusterfs/commit/e081ac683b6a5bda54891318fa1e3ffac981e553 
- my bad.

However, it was merged ~1 year ago, and no one has complained thus far... :-/
1. Is no one using NFS Ganesha?
2. We are lacking tests for NFS Ganesha - code coverage indicates this path is 
not covered.

Y.


 
diff --git a/xlators/mgmt/glusterd/src/glusterd-op-sm.c

b/xlators/mgmt/glusterd/src/glusterd-op-sm.c
index 558f04fb2..d7bf96adf 100644
--- a/xlators/mgmt/glusterd/src/glusterd-op-sm.c
+++ b/xlators/mgmt/glusterd/src/glusterd-op-sm.c
@@ -1177,7 +1177,7 @@ glusterd_op_stage_set_volume(dict_t *dict, char
**op_errstr)
   }
   } else if (len_strcmp(key, keylen, "ganesha.enable")) {
   key_matched = _gf_true;
-    if (!strcmp(value, "off") == 0) {
+    if (strcmp(value, "off") == 0) {
   ret = ganesha_manage_export(dict, "off", _gf_true,
op_errstr);
   if (ret)
   goto out;

Thanks,

Ravi


   It's really strange that NFS Ganesha has ever passed the tests.
How do we test NFS Ganesha exporting ?

Best Regards,
Strahil Nikolov
___

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge:https://bluejeans.com/441850968




Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

   




Re: [Gluster-devel] NFS Ganesha fails to export a volume

2020-11-16 Thread Ravishankar N

I am surprised too that it wasn't caught earlier.

Steps:

1. Clone the gluster repo

2. Compile  the source 
https://docs.gluster.org/en/latest/Developer-guide/Building-GlusterFS/


3. Make the changes (in a different branch if you prefer), compile again 
and install


4.  Test it out:

[root@linuxpad glusterfs]#  gluster v create testvol 
127.0.0.2:/home/ravi/bricks/brick{1..2} force

volume create: testvol: success: please start the volume to access data
[root@linuxpad glusterfs]#
[root@linuxpad glusterfs]# gluster v start testvol
volume start: testvol: success
[root@linuxpad glusterfs]#
[root@linuxpad glusterfs]# gluster v set testvol ganesha.enable on
volume set: failed: The option nfs-ganesha should be enabled before 
setting ganesha.enable.

[root@linuxpad glusterfs]#

I just tried the change and it looks like some new error shows up. Not 
too familiar with these settings; I will need to debug further.


Thanks,

Ravi

On 16/11/20 4:05 pm, Strahil Nikolov wrote:

I can try to help with the testing (I'm quite new to that).
Can someone share documentation of that process ?

yet we have another problem -> ganesha is deployed with ocf:heartbeat:portblock 
which supports only IPTABLES, while EL8 uses NFTABLES ...

Best Regards,
Strahil Nikolov






On Monday, 16 November 2020 at 10:47:43 GMT+2, Yaniv Kaul wrote:







On Mon, Nov 16, 2020 at 10:26 AM Ravishankar N  wrote:

On 15/11/20 8:24 pm, Strahil Nikolov wrote:

Hello All,

did anyone get a chance to look at 
https://github.com/gluster/glusterfs/issues/1778 ?

A look at
https://review.gluster.org/#/c/glusterfs/+/23648/4/xlators/mgmt/glusterd/src/glusterd-op-sm.c@1117
seems to indicate this could be due to a typo error. Do you have a
source install where you can apply this simple diff and see if it fixes
the issue?

I think you are right - I seem to have introduced it as part of 
https://github.com/gluster/glusterfs/commit/e081ac683b6a5bda54891318fa1e3ffac981e553
 - my bad.

However, it was merged ~1 year ago, and no one has complained thus far... :-/
1. Is no one using NFS Ganesha?
2. We are lacking tests for NFS Ganesha - code coverage indicates this path is 
not covered.

Y.

   
diff --git a/xlators/mgmt/glusterd/src/glusterd-op-sm.c

b/xlators/mgmt/glusterd/src/glusterd-op-sm.c
index 558f04fb2..d7bf96adf 100644
--- a/xlators/mgmt/glusterd/src/glusterd-op-sm.c
+++ b/xlators/mgmt/glusterd/src/glusterd-op-sm.c
@@ -1177,7 +1177,7 @@ glusterd_op_stage_set_volume(dict_t *dict, char
**op_errstr)
   }
   } else if (len_strcmp(key, keylen, "ganesha.enable")) {
   key_matched = _gf_true;
-    if (!strcmp(value, "off") == 0) {
+    if (strcmp(value, "off") == 0) {
   ret = ganesha_manage_export(dict, "off", _gf_true,
op_errstr);
   if (ret)
   goto out;

Thanks,

Ravi

It's really strange that NFS Ganesha has ever passed the tests.
How do we test NFS Ganesha exporting ?

Best Regards,
Strahil Nikolov
___

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968




Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel




Re: [Gluster-devel] [Gluster-users] Docs on gluster parameters

2020-11-16 Thread Ravishankar N

Hi Strahil

On 16/11/20 4:21 pm, Strahil Nikolov wrote:

Hi Ravi,

I can propose a pull request if someone gives me a general idea of each setting.
Do we have comments in the source code that can be used as a description ?


`gluster volume set help` lists many of the documented options. For the 
others (which usually do not need to be tweaked, but are there if you 
still want to play with them), each translator has a *struct volume_options 
options[]* array in the source code (do a git grep "struct volume_options 
options" on the source tree) which usually has a ".description" field 
that gives a short description.
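
For example:

# Documented, user-facing options along with their descriptions:
gluster volume set help

# For the rest, look at each translator's option table in the source:
git grep -n "struct volume_options options" xlators/
# then open the matching file and read the .description string of the
# option you are interested in.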


HTH,

Ravi



Best Regards,
Strahil Nikolov






On Monday, 16 November 2020 at 10:36:09 GMT+2, Ravishankar N wrote:









On 14/11/20 3:23 am, Mahdi Adnan wrote:


   

Hi,



Definitely, the Gluster docs are missing quite a bit regarding the available 
options that can be used in the volumes.

Not only that, there are some options that might corrupt data and do not have proper 
documentation; for example, disabling Sharding will lead to data corruption and I think it 
does not give any warning ("maybe I'm wrong regarding the warning though"), and I 
can not find any details about it in the official Gluster docs. The same goes for 
multiple clients accessing a volume with Sharding enabled.

Also, in some cases, write-behind and stat-prefetch can lead to data 
inconsistency if multiple clients are accessing the same data.

I think having solid "Official" Gluster docs with all of these details is 
essential to have stable Gluster deployments.




On Thu, Nov 12, 2020 at 7:34 PM Eli V  wrote:



I think docs.gluster.org needs a section on the available parameters,
especially considering how important some of them can be. For example
a google for performance.parallel-readdir, or
features.cache-invalidation only seems to turn up some hits in the
release notes on docs.gluster.org. I wouldn't expect a new user to have
to go read the release notes for all previous releases to understand
the importance of these parameters, or what parameters even exist.






https://docs.gluster.org/en/latest/  can be updated by sending pull requests to 
https://github.com/gluster/glusterdocs. It would be great if you can send some 
patches regarding the changes you would like to see. It doesn't have to be 
perfect. I can help in getting them reviewed and merged.
Thanks,
Ravi





   
   

   



Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
gluster-us...@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users






--

   
Respectfully

Mahdi









Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
gluster-us...@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users







___

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968




Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel



Re: [Gluster-devel] [Gluster-users] Docs on gluster parameters

2020-11-16 Thread Ravishankar N


On 14/11/20 3:23 am, Mahdi Adnan wrote:

Hi,

Definitely, the Gluster docs are missing quite a bit regarding the 
available options that can be used in the volumes.
Not only that, there are some options that might corrupt data and do 
not have proper documentation; for example, disabling Sharding will lead 
to data corruption and I think it does not give any warning ("maybe 
I'm wrong regarding the warning though"), and I can not find any details 
about it in the official Gluster docs. The same goes for multiple 
clients accessing a volume with Sharding enabled.
Also, in some cases, write-behind and stat-prefetch can lead to data 
inconsistency if multiple clients are accessing the same data.
I think having solid "Official" Gluster docs with all of these details 
is essential to have stable Gluster deployments.


On Thu, Nov 12, 2020 at 7:34 PM Eli V > wrote:


I think docs.gluster.org  needs a section
on the available parameters,
especially considering how important some of them can be. For example
a google for performance.parallel-readdir, or
features.cache-invalidation only seems to turn up some hits in the
release notes on docs.gluster.org . I
wouldn't expect a new user to have
to go read the release notes for all previous releases to understand
the importance of these parameters, or what parameters even exist.



https://docs.gluster.org/en/latest/  can be updated by sending pull 
requests to https://github.com/gluster/glusterdocs. It would be great if 
you can send some patches regarding the changes you would like to see. 
It doesn't have to be perfect. I can help in getting them reviewed and 
merged.
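
If it helps, the flow is just the standard GitHub one; roughly (user and branch 
names are placeholders, and the docs are mostly plain markdown files):

# Fork https://github.com/gluster/glusterdocs on GitHub first, then:
git clone git@github.com:<your-user>/glusterdocs.git
cd glusterdocs
git checkout -b document-volume-options
# edit the relevant .md files, then:
git add -A
git commit -s -m "Document commonly used volume options"
git push origin document-volume-options
# and open a pull request against gluster/glusterdocs from the GitHub UI.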


Thanks,
Ravi







Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk

Gluster-users mailing list
gluster-us...@gluster.org 
https://lists.gluster.org/mailman/listinfo/gluster-users




--
Respectfully
Mahdi





Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
gluster-us...@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users
___

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968




Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel



Re: [Gluster-devel] NFS Ganesha fails to export a volume

2020-11-16 Thread Ravishankar N


On 15/11/20 8:24 pm, Strahil Nikolov wrote:

Hello All,

did anyone get a chance to look at 
https://github.com/gluster/glusterfs/issues/1778 ?


A look at 
https://review.gluster.org/#/c/glusterfs/+/23648/4/xlators/mgmt/glusterd/src/glusterd-op-sm.c@1117 
seems to indicate this could be due to a typo error. Do you have a 
source install where you can apply this simple diff and see if it fixes 
the issue?


diff --git a/xlators/mgmt/glusterd/src/glusterd-op-sm.c 
b/xlators/mgmt/glusterd/src/glusterd-op-sm.c

index 558f04fb2..d7bf96adf 100644
--- a/xlators/mgmt/glusterd/src/glusterd-op-sm.c
+++ b/xlators/mgmt/glusterd/src/glusterd-op-sm.c
@@ -1177,7 +1177,7 @@ glusterd_op_stage_set_volume(dict_t *dict, char 
**op_errstr)

 }
 } else if (len_strcmp(key, keylen, "ganesha.enable")) {
 key_matched = _gf_true;
-    if (!strcmp(value, "off") == 0) {
+    if (strcmp(value, "off") == 0) {
 ret = ganesha_manage_export(dict, "off", _gf_true, 
op_errstr);

 if (ret)
 goto out;
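
If you want to try it on a source install, a rough sequence would be the 
following (this assumes the diff above is saved as /tmp/ganesha-enable.patch 
and that the build tree is already set up as per the developer guide):

cd glusterfs
git apply /tmp/ganesha-enable.patch
make -j"$(nproc)" && sudo make install
# restart glusterd and retry:
# gluster volume set <volname> ganesha.enable on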

Thanks,

Ravi


It's really strange that NFS Ganesha has ever passed the tests.
How do we test NFS Ganesha exporting ?

Best Regards,
Strahil Nikolov
___

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968




Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel






Re: [Gluster-devel] Experimental xlators? Where to find info about them

2020-11-12 Thread Ravishankar N



On 12/11/20 5:07 pm, Federico Strati wrote:

Hello,

thanks but I already found their code, I was looking for documentation.


Development of JBR was never fully completed, AFAIK, which is why it was moved 
out of the main repo.

So if you are looking for documentation on creating and using a JBR-based 
gluster volume, there isn't any. But if you want documentation on the design 
etc., you can find it online if you search (e.g. 
https://www.snia.org/sites/default/files/SDCIndia/2017/Slides/Mohammed%20Rafi%20KC%20-%20Red%20Hat%20-%20Next%20Generation%20File%20Replication%20system%20in%20Gluster%20FS.pdf)


Regards,

Ravi



Federico

On 12/11/20 11:55, Ravishankar N wrote:


On 12/11/20 4:18 pm, Federico Strati wrote:

Hello,

I'm looking for info on experimental xlators fdl and jbr and lex,

where to find info about them?

They were moved to https://github.com/gluster/glusterfs-xlators


Thanks in advance

Federico

___

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968




Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel







___

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968




Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel



Re: [Gluster-devel] Experimental xlators? Where to find info about them

2020-11-12 Thread Ravishankar N



On 12/11/20 4:18 pm, Federico Strati wrote:

Hello,

I'm looking for info on experimental xlators fdl and jbr and lex,

where to find info about them?

They were moved to https://github.com/gluster/glusterfs-xlators


Thanks in advance

Federico

___

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968




Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel



___

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968




Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel



Re: [Gluster-devel] On some (spurious) test failures

2020-10-27 Thread Ravishankar N


On 27/10/20 9:13 pm, Dmitry Antipov wrote:

I've never had the following tests succeeded, neither via
'run-tests.sh' nor running manually with 'prove -vf':

tests/basic/afr/entry-self-heal.t (Wstat: 0 Tests: 252 Failed: 2)
  Failed tests:  104, 208

tests/basic/afr/entry-self-heal-anon-dir-off.t (Wstat: 0 Tests: 261 
Failed: 2)

  Failed tests:  105, 209

tests/basic/afr/granular-esh/granular-esh.t (Wstat: 0 Tests: 82 
Failed: 1)

  Failed test:  46

tests/basic/afr/self-heal.t (Wstat: 0 Tests: 145 Failed: 1)
  Failed test:  124

tests/basic/ec/ec-quorum-count.t (Wstat: 0 Tests: 142 Failed: 2)
  Failed tests:  111, 139

tests/basic/ctime/ctime-heal-symlinks.t (Wstat: 0 Tests: 31 Failed: 2)
  Failed tests:  12, 26

tests/basic/ctime/ctime-utimesat.t (Wstat: 0 Tests: 14 Failed: 1)
  Failed test:  14

tests/bugs/heal-symlinks.t (Wstat: 0 Tests: 31 Failed: 2)
  Failed tests:  12, 26

Has anyone seen something similar? Any ideas on what may be wrong 
here?


Which branch are you running them on? They pass for me locally on the 
latest devel branch, as do the upstream regression runs when you submit 
a patch.


If a .t is always failing in the same line in your setup, you could put 
an 'exit' after the line in the test so that you have a live state for 
debugging.
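
A rough sketch of what I mean (the helper names below are just from memory; 
adapt them to whatever the failing .t actually uses):

# Inside the failing .t, right after the failing line:
EXPECT_WITHIN $HEAL_TIMEOUT "0" get_pending_heal_count $V0   # suppose this fails
exit   # stop here, leaving the volume, bricks and mounts alive for inspection

# Then run only this test:
prove -vf tests/basic/afr/self-heal.t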


Thanks,
Ravi


Thanks,
Dmitry

___

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968




Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel



___

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968




Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel



Re: [Gluster-devel] Pull Request review workflow

2020-10-15 Thread Ravishankar N


On 15/10/20 4:36 pm, Sheetal Pamecha wrote:


+1
Just a note to the maintainers who are merging PRs to have patience 
and check the commit message when there is more than one commit in a PR.


Makes sense.



Another thing to consider is that rfc.sh script always does a
rebase before pushing changes. This rewrites history and
changes all commits of a PR. I think we shouldn't do a rebase
in rfc.sh. Only if there are conflicts, I would do a manual
rebase and push the changes.



I think we would also need to rebase if say some .t failure was fixed 
and we need to submit the PR on top of that, unless "run regression" 
always applies your PR on the latest HEAD in the concerned branch and 
triggers the regression.





Actually true. Since the migration to GitHub, I have not been using 
./rfc.sh, and for me it's easier and cleaner.



Me as well :)
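
For reference, the plain-git flow I use instead of ./rfc.sh is roughly this 
(fork and branch names are placeholders):

git checkout -b issue-1234 origin/devel     # topic branch for the change
# hack, then commit with a signed-off-by and a "Fixes: #1234" reference:
git commit -s
git push <your-fork> issue-1234
# open the pull request from the GitHub UI; rebase manually
# (git rebase origin/devel) only when there is a conflict.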

-Ravi
___

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968




Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel



[Gluster-devel] Removing problematic language in geo-replication

2020-07-22 Thread Ravishankar N

Hi,

The gluster code base has some words and terminology (blacklist, 
whitelist, master, slave etc.) that can be considered hurtful/offensive 
to people in a global open source setting. Some of the words can be fixed 
trivially, but the Geo-replication code seems to be something that needs 
extensive rework, more so because we have these words being used in the 
CLI itself. Two questions that I had were:


1. Can I replace master:slave with primary:secondary everywhere in the 
code and the CLI? Are there any suggestions for more appropriate 
terminology?


2. Is it okay to target the changes to a major release (release-9) and 
*not* provide backward compatibility for the CLI?


Thanks,

Ravi


___

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968




Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel



Re: [Gluster-devel] Help with smoke test failure

2020-07-17 Thread Ravishankar N



On 17/07/20 7:47 pm, Emmanuel Dreyfus wrote:

Hello

I am still stuck on this one: how should I address the missing
SpecApproved and DocApproved here?

On Fri, Jul 10, 2020 at 05:43:54PM +0200, Emmanuel Dreyfus wrote:

What should I do to get this passed?


One needs to add the appropriate label (Type:Bug) on the github issue 
for the smoke to pass.


I'm not sure if you have the right permissions to add it but I have done 
it for you.


-Ravi



https://build.gluster.org/job/comment-on-issue/19308/ : FAILURE <<<
Missing SpecApproved flag on Issue 1361
Missing DocApproved flag on Issue 1361


___

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968




Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel



Re: [Gluster-devel] gluster safe ugpgrade

2020-01-20 Thread Ravishankar N


On 20/01/20 2:13 pm, Roman wrote:

Hello dear devs team!

I'm sorry to write to you, not user-s list, but I really would like to 
have DEV's opinion on my issue.


I've got multiple solutions running on old gluster versions (same 
version per cluster, no mixed versions in same cluster):


Some of them are: 3.7.19, 3.10.9, 3.8.15

Yeah, I'm the geek who has liked gluster from its beginning. I started 
with glusterfs for KVM (Proxmox) and am now running glusterfs for the 
Tallinn University Academic Library digitalization project (NAS), which 
runs fully at 1gbps without any problems. One of the gluster setups is 
running 250 TB of storage and it has to be extended; it is a DISTRIBUTED 
VOLUME :). I could go the easy way and get 3.8.15 from the old releases in 
the gluster repo, but I don't feel that is the right way. What I would 
really like to do is:

1. upgrade OS atm it is  Debian GNU/Linux 8.10
2. upgrade the glusterfs

So my question is: what is the safe gluster version I could upgrade to 
from my versions? I will do the offline upgrade. And would my steps below 
be right? As I see it:


1. shutdown gluster on all servers
2. upgrade the OS and gluster (change the repo file for both the OS and the 
glusterfs version supported on that OS and safe to upgrade to, then run 
apt-get dist-upgrade)


As you understand, there is no possibility to back up that amount of 
data (the time it would take is not acceptable). We have some data 
copied periodically to the national archive, but not all of it.


What do you think on this?


As long as you are upgrading, it is better to use the latest version 
(7.x). While nothing should go wrong, since 3.x is really old, I'm not 
100% sure everything will be smooth post-upgrade. The best way is 
to create a small 'test' setup (maybe a 1-brick volume with 1 client and 
some data) and try upgrading it to 7.x, bump up to the right op-version 
etc. and verify that everything works. If it does, then you can do the 
actual offline upgrade in peace.
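
Something as small as this is enough for the trial run (host, brick and volume 
names are placeholders, and the op-version number is only an example; use 
whatever cluster.max-op-version reports on the upgraded test node):

# Before the upgrade, on the throwaway test machine:
gluster volume create testvol server1:/bricks/testbrick force
gluster volume start testvol
mount -t glusterfs server1:/testvol /mnt/testvol
cp -a /some/sample/data /mnt/testvol/

# After upgrading the OS and the gluster packages:
gluster volume get all cluster.max-op-version   # highest op-version supported
gluster volume set all cluster.op-version 70200 # example value; use the reported one
ls -lR /mnt/testvol                             # sanity-check that the data is accessible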


Hope this helps,

Ravi



--
Best regards,
Roman.

___

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968


NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

___

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968


NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel



Re: [Gluster-devel] [Gluster-Maintainers] Modifying gluster's logging mechanism

2019-11-22 Thread Ravishankar N


On 22/11/19 3:13 pm, Barak Sason Rofman wrote:
This is actually one of the main reasons I wanted to bring this up for 
discussion - will it be fine with the community to run a dedicated 
tool to reorder the logs offline?


I think it is a bad idea to log without ordering and later rely on an 
external tool to sort it. This is definitely not something I would want 
to do while doing testing and development or debugging field issues. 
Structured logging is definitely useful for gathering statistics and 
post-processing to make reports and charts and whatnot, but from a 
debugging point of view, maintaining causality of messages and working 
with command-line text editors and tools on log files is important IMO.


I had similar concerns when the brick multiplexing feature was developed, 
where a single log file was used for logging all multiplexed bricks' 
logs. It is so much extra work to weed out the messages of 99 processes to 
read the log of the 1 process you are interested in.


Regards,
Ravi

___

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968


NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel



Re: [Gluster-devel] Proposal: move glusterfs development to github workflow, completely

2019-08-26 Thread Ravishankar N



On 24/08/19 9:26 AM, Yaniv Kaul wrote:
I don't like mixed mode, but I also dislike Github's code review 
tools, so I'd like to remind the option of using http://gerrithub.io/ 
for code review.

Other than that, I'm in favor of moving over.
Y.

+1 for using gerrithub for code review when we move to github.
___

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/836554017

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/486278655

Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel



Re: [Gluster-devel] [RFC] What if client fuse process crash?

2019-08-06 Thread Ravishankar N


On 06/08/19 11:44 AM, Changwei Ge wrote:

Hi Ravishankar,


Thanks for sharing, it's very useful to me.

I have been setting up a glusterfs storage cluster recently and the 
umount/mount recovery process has bothered me.

Hi Changwei,
Why are you needing to do frequent remounts? If your gluster fuse client 
is crashing frequently, that should be investigated and fixed. If you 
have a reproducer, please raise a bug with all the details like the 
glusterfs version, core files and log files.

Regards,
Ravi



I happened to find some patches [1] on the internet aiming to address 
such a problem, but I have no idea why they never managed to get merged into 
the glusterfs mainline.


Do you know why?


Thanks,

Changwei


[1]:

https://review.gluster.org/#/c/glusterfs/+/16843/

https://github.com/gluster/glusterfs/issues/242


On 2019/8/6 1:12 下午, Ravishankar N wrote:

On 05/08/19 3:31 PM, Changwei Ge wrote:

Hi list,

If somehow, glusterfs client fuse process dies. All subsequent file 
operations will be failed with error 'no connection'.


I am curious if the only way to recover is umount and mount again?
Yes, this is pretty much the case with all fuse based file systems. 
You can use -o auto_unmount (https://review.gluster.org/#/c/17230/) 
to automatically cleanup and not having to manually unmount.


If so, that means all processes working on top of glusterfs have to 
close files, which sometimes is hard to be acceptable.


There is 
https://research.cs.wisc.edu/wind/Publications/refuse-eurosys11.html, 
which claims to provide a framework for transparent failovers. I 
can't find any publicly available code though.


Regards,
Ravi



Thanks,

Changwei


___

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/836554017

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/486278655

Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel





Re: [Gluster-devel] [RFC] What if client fuse process crash?

2019-08-05 Thread Ravishankar N

On 05/08/19 3:31 PM, Changwei Ge wrote:

Hi list,

If somehow the glusterfs client fuse process dies, all subsequent file 
operations will fail with the error 'no connection'.


I am curious if the only way to recover is umount and mount again?
Yes, this is pretty much the case with all fuse-based file systems. You 
can use -o auto_unmount (https://review.gluster.org/#/c/17230/) to 
automatically clean up without having to manually unmount.
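
For example (host, volume and mount point are placeholders; this assumes a 
build that carries the auto_unmount change):

# one-off mount:
mount -t glusterfs -o auto_unmount server1:/myvol /mnt/myvol

# or the equivalent fstab entry:
# server1:/myvol  /mnt/myvol  glusterfs  defaults,auto_unmount  0 0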


If so, that means all processes working on top of glusterfs have to 
close files, which is sometimes hard to accept.


There is 
https://research.cs.wisc.edu/wind/Publications/refuse-eurosys11.html, 
which claims to provide a framework for transparent failovers.  I can't 
find any publicly available code though.


Regards,
Ravi



Thanks,

Changwei


___

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/836554017

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/486278655

Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel





Re: [Gluster-devel] fallocate behavior in glusterfs

2019-07-02 Thread Ravishankar N


On 02/07/19 8:52 PM, FNU Raghavendra Manjunath wrote:


Hi All,

In glusterfs, there is an issue regarding the fallocate behavior. In 
short, if someone does fallocate from the mount point with some size 
that is greater than the available size in the backend filesystem 
where the file is present, then fallocate can fail with a subset of 
the required number of blocks allocated and then failing in the 
backend filesystem with ENOSPC error.


The behavior of fallocate in itself is similar to how it would have 
been on a disk filesystem (at least xfs, where it was checked), i.e. it 
allocates a subset of the required number of blocks and then fails with 
ENOSPC. And the file itself would show the number of blocks in stat 
to be whatever was allocated as part of fallocate. Please refer to [1] 
where the issue is explained.


Now, there is one small difference between how the behavior is between 
glusterfs and xfs.
In xfs after fallocate fails, doing 'stat' on the file shows the 
number of blocks that have been allocated. Whereas in glusterfs, the 
number of blocks is shown as zero which makes tools like "du" show 
zero consumption. This difference in behavior in glusterfs is because 
of libglusterfs on how it handles sparse files etc for calculating 
number of blocks (mentioned in [1])


At this point I can think of 3 things on how to handle this.

1) Except for how many blocks are shown in the stat output for the 
file from the mount point (on which fallocate was done), the remaining 
behavior of attempting to allocate the requested size and failing when 
the filesystem becomes full is similar to that of XFS.


Hence, what is required is to come up with a solution on how 
libglusterfs calculate blocks for sparse files etc (without breaking 
any of the existing components and features). This makes the behavior 
similar to that of backend filesystem. This might require its own time 
to fix libglusterfs logic without impacting anything else.


I think we should just revert the commit 
b1a5fa55695f497952264e35a9c8eb2bbf1ec4c3 (BZ 817343) and see if it 
really breaks anything (or check whether whatever it breaks is something we 
can live with). XFS speculative preallocation is not permanent and the 
extra space is freed up eventually. It can be sped up via a procfs 
tunable: 
http://xfs.org/index.php/XFS_FAQ#Q:_How_can_I_speed_up_or_avoid_delayed_removal_of_speculative_preallocation.3F. 
We could also tune the allocsize option to a low value like 4k so that 
glusterfs quota is not affected.
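
For reference, those two knobs would be applied on the brick filesystems roughly 
like this (device and mount point are placeholders; the tunable name comes from 
the XFS FAQ and the value is in seconds):

# Drop speculative preallocation much sooner than the default 300 seconds:
echo 10 > /proc/sys/fs/xfs/speculative_prealloc_lifetime

# Or mount the brick with a small fixed allocsize so that XFS does not
# speculatively preallocate large extents at all (fstab or mount line):
mount -o allocsize=4k /dev/sdb1 /bricks/brick1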


FWIW, ENOSPC is not the only fallocate problem in gluster because of  
'iatt->ia_block' tweaking. It also breaks the --keep-size option (i.e. 
the FALLOC_FL_KEEP_SIZE flag in fallocate(2)) and reports incorrect du size.


Regards,
Ravi


OR

2) Once the fallocate fails in the backend filesystem, make posix 
xlator in the brick truncate the file to the previous size of the file 
before attempting fallocate. A patch [2] has been sent for this. But 
there is an issue with this when there are parallel writes and 
fallocate operations happening on the same file. It can lead to a data 
loss.


a) statpre is obtained ===> before fallocate is attempted, get the 
stat hence the size of the file b) A parrallel Write fop on the same 
file that extends the file is successful c) Fallocate fails d) 
ftruncate truncates it to size given by statpre (i.e. the previous 
stat and the size obtained in step a)


OR

3) Make posix check for available disk size before doing fallocate. 
i.e. in fallocate once posix gets the number of bytes to be allocated 
for the file from a particular offset, it checks whether so many bytes 
are available or not in the disk. If not, fail the fallocate fop with 
ENOSPC (without attempting it on the backend filesystem).


There still is a probability of a parallel write happening while this 
fallocate is happening and by the time falllocate system call is 
attempted on the disk, the available space might have been less than 
what was calculated before fallocate.

i.e. following things can happen

 a) statfs ===> get the available space of the backend filesystem
 b) a parallel write succeeds and extends the file
 c) fallocate is attempted assuming there is sufficient space in the 
backend


While the above situation can arise, I think we are still fine. 
Because fallocate is attempted from the offset received in the fop. 
So, irrespective of whether write extended the file or not, the 
fallocate itself will be attempted for so many bytes from the offset 
which we found to be available by getting statfs information.


[1] https://bugzilla.redhat.com/show_bug.cgi?id=1724754#c3
[2] https://review.gluster.org/#/c/glusterfs/+/22969/

Please provide feedback.

Regards,
Raghavendra

___

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/836554017

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: 

Re: [Gluster-devel] [Gluster-users] No healing on peer disconnect - is it correct?

2019-06-10 Thread Ravishankar N
There will be pending heals only when the brick process goes down or 
there is a disconnect between the client and that brick. When you say " 
gluster process is down but bricks running", I'm guessing you killed 
only glusterd and not the glusterfsd brick process. That won't cause any 
pending heals. If there is something to be healed, `gluster volume heal 
$volname info` will display the list of files.
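
For example (volume name is a placeholder):

# Files/gfids that still need heal, listed per brick:
gluster volume heal myvol info

# Just the counts, handy for watching progress:
gluster volume heal myvol statistics heal-count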


Hope that helps,
Ravi
On 10/06/19 7:53 PM, Martin wrote:
My VMs use Gluster as storage through libgfapi support in Qemu, but 
I don't see any healing of the reconnected brick.


Thanks Karthik / Ravishankar in advance!

On 10 Jun 2019, at 16:07, Hari Gowtham <hgowt...@redhat.com> wrote:


On Mon, Jun 10, 2019 at 7:21 PM snowmailer <snowmai...@gmail.com> wrote:


Can someone advice on this, please?

BR!

On 3 Jun 2019 at 18:58, Martin <snowmai...@gmail.com> wrote:



Hi all,

I need someone to explain whether my gluster behaviour is correct. I am 
not sure if my gluster works as it should. I have a simple Replica 3 
setup - Number of Bricks: 1 x 3 = 3.


When one of my hypervisors is disconnected as a peer, i.e. the gluster 
process is down but the bricks are running, the other two healthy nodes start 
signalling that they lost one peer. This is correct.
Next, I restart the gluster process on the node where it failed, and I 
thought it should trigger healing of files on the failed node, but nothing 
is happening.


I run VM disks on this gluster volume. No healing is triggered 
after the gluster restart, the remaining two nodes get the peer back after 
the restart of gluster, and everything is running without downtime.
Even VMs that are running on the “failed” node where the gluster process 
was down (bricks were up) are running without downtime.


I assume your VMs use gluster as the storage. In that case, the
gluster volume might be mounted on all the hypervisors.
The mount/client is smart enough to give the correct data from the
other two machines which were always up.
This is the reason things are working fine.

Gluster should heal the brick.
Adding people who can help you better with the heal part.
@Karthik Subrahmanya  @Ravishankar N do take a look and answer this part.



Is this behaviour correct? I mean No healing is triggered after 
peer is reconnected back and VMs.


Thanks for explanation.

BR!
Martin



___
Gluster-users mailing list
gluster-us...@gluster.org <mailto:gluster-us...@gluster.org>
https://lists.gluster.org/mailman/listinfo/gluster-users




--
Regards,
Hari Gowtham.


___

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/836554017

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/486278655

Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel



Re: [Gluster-devel] questions on callstubs and "link-count" in index translator

2019-04-27 Thread Ravishankar N


On 26/04/19 10:53 PM, Junsong Li wrote:


Hello list,

I have a couple of questions on index translator implementation.

  * Why does gluster need callstub and a different worker queue (and
thread) to process those call stubs? Is it just to lower the
priority of fops of internal inodes?

As far as I know, this is to move the processing to the background and free 
up the server thread to process other requests.




  * What’s the purpose of “link-count” in xdata? It’s being used only
in index_fstat and index_lookup. I see sometimes the key is
assigned 0/1 after callback, and sometimes AFR uses it to store
flag GF_XATTROP_INDEX_COUNT. Is the code purposely reusing the key?

A non-zero link count means there are entries that are pending heal. AFR 
requests this information in lookup and fstat fops and updates 
priv->need_heal in the fop-callbacks. It then uses that information to 
not nullify the inodes of the entries fetched during a readdirp call, 
improving readdirp performance.


https://review.gluster.org/#/c/glusterfs/+/12507/ is the patch that 
introduced it.


HTH,
Ravi



Thanks,

Junsong


___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Release 6.1: Expected tagging on April 10th

2019-04-06 Thread Ravishankar N
Tracker bug is https://bugzilla.redhat.com/show_bug.cgi?id=1692394, in 
case anyone wants to add blocker bugs.



On 05/04/19 8:03 PM, Shyam Ranganathan wrote:

Hi,

Expected tagging date for release-6.1 is on April, 10th, 2019.

Please ensure required patches are backported and also are passing
regressions and are appropriately reviewed for easy merging and tagging
on the date.

Thanks,
Shyam
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [Gluster-users] Self/Healing process after node maintenance

2019-01-22 Thread Ravishankar N



On 01/22/2019 02:57 PM, Martin Toth wrote:

Hi all,

I just want to make sure I understand exactly how the self-healing process works, because I 
need to take one of my nodes down for maintenance.
I have a replica 3 setup. Nothing complicated: 3 nodes, 1 volume, 1 brick per 
node (ZFS pool). All nodes run Qemu VMs and the disks of the VMs are on the Gluster 
volume.

I want to turn off node1 for maintenance. If I migrate all VMs to node2 
and node3 and shut down node1, I suppose everything will keep running without 
downtime. (2 nodes of 3 will be online)
Yes it should. Before you `shutdown` a node, kill all the gluster 
processes on it. i.e. `pkill gluster`.


My question is: if I start node1 back up after maintenance and it comes 
back online in a running state, this will trigger the self-healing process on 
all the disk files of all VMs. Will this healing process happen only on node1?
The list of files needing heal on node1 is captured on the other 2 
nodes that were up, so the self-heal daemons on those nodes will do the 
heals.
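
(Just for reference, the backlog and its progress can be watched from any 
of the nodes that stayed up using the standard CLI; the volume name below 
is a placeholder:)

 gluster volume heal <VOLNAME> info                   # entries still pending heal
 gluster volume heal <VOLNAME> statistics heal-count  # per-brick pending-heal counts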

Can node2 and node3 run VMs without problems while node1 is healing these 
files?
Yes. You might notice some performance drop if there are a lot of heals 
happening though.



I want to make sure these files (VM disks) will not get “locked” on node2 
and node3 while self-healing is in progress on node1.
Heal won't block I/O from clients indefinitely. If both are writing to 
an overlapping offset, one of them (i.e. either heal or client I/O) will 
get the lock, do its job and release the lock so that the other can 
acquire it and continue.

HTH,
Ravi


Thanks for clarification in advance.

BR!
___
Gluster-users mailing list
gluster-us...@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [Gluster-Maintainers] Release 5: Branched and further dates

2018-10-08 Thread Ravishankar N




On 10/05/2018 08:29 PM, Shyam Ranganathan wrote:

On 10/04/2018 11:33 AM, Shyam Ranganathan wrote:

On 09/13/2018 11:10 AM, Shyam Ranganathan wrote:

RC1 would be around 24th of Sep. with final release tagging around 1st
of Oct.

RC1 now stands to be tagged tomorrow, and patches that are being
targeted for a back port include,

We still are awaiting release notes (other than the bugs section) to be
closed.

There is one new bug that needs attention from the replicate team.
https://bugzilla.redhat.com/show_bug.cgi?id=1636502

The above looks important to me to be fixed before the release, @ravi or
@pranith can you take a look?

I've attempted a fix @ https://review.gluster.org/#/c/glusterfs/+/21366/
-Ravi



1) https://review.gluster.org/c/glusterfs/+/21314 (snapshot volfile in
mux cases)

@RaBhat working on this.

Done


2) Py3 corrections in master

@Kotresh are all changes made to master backported to release-5 (may not
be merged, but looking at if they are backported and ready for merge)?

Done, release notes amend pending


3) Release notes review and updates with GD2 content pending

@Kaushal/GD2 team can we get the updates as required?
https://review.gluster.org/c/glusterfs/+/21303

Still awaiting this.


4) This bug [2] was filed when we released 4.0.

The issue has not bitten us in 4.0 or in 4.1 (yet!) (i.e the options
missing and hence post-upgrade clients failing the mount). This is
possibly the last chance to fix it.

Glusterd and protocol maintainers, can you chime in, if this bug needs
to be and can be fixed? (thanks to @anoopcs for pointing it out to me)

Release notes to be corrected to call this out.


The tracker bug [1] does not have any other blockers against it, hence
assuming we are not tracking/waiting on anything other than the set above.

Thanks,
Shyam

[1] Tracker: https://bugzilla.redhat.com/show_bug.cgi?id=glusterfs-5.0
[2] Potential upgrade bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1540659
___
maintainers mailing list
maintain...@gluster.org
https://lists.gluster.org/mailman/listinfo/maintainers



___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] index_lookup segfault in glusterfsd brick process

2018-10-04 Thread Ravishankar N



On 10/04/2018 01:57 PM, Pranith Kumar Karampuri wrote:



On Wed, Oct 3, 2018 at 11:20 PM 김경표 wrote:


Hello folks.

A few days ago I found my EC(4+2) volume was degraded.
I am using 3.12.13-1.el7.x86_64.
One brick was down; below is the brick log.
I suspect a loc->inode bug in index.c (see attached picture).
In GDB, loc->inode is null.

inode_find (loc->inode->table, loc->gfid);


I see that loc->inode is coming from resolve_gfid() where the 
following should have been executed.

  resolve_loc->inode = server_inode_new (state->itable, resolve_loc->gfid);

As per the log:
"[2018-09-29 13:22:36.536579] W [inode.c:680:inode_new] 
(-->/usr/lib64/glusterfs/3.12.13/xlator/protocol/server.so(+0xd048) 
[0x7f9bd2494048] 
-->/usr/lib64/glusterfs/3.12.13/xlator/protocol/server.so(+0xc14d) 
[0x7f9bd249314d] -->/lib64/libglusterfs.so.0(inode_new+0x8a) [0x

7f9be70900ba] ) 0-gluvol02-05-server: inode not found"

it indicates that inode-table is NULL. Is there a possibility to 
upload the core somewhere for us to take a closer look?


https://bugzilla.redhat.com/show_bug.cgi?id=1635784 has been raised by 
kpkim; it is best to attach the core to the BZ.
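
(In case it helps while attaching it: a full backtrace can be pulled out of 
the core with something like the following; the binary and core paths are 
examples and depend on your install:)

 gdb /usr/sbin/glusterfsd /path/to/core -ex 'set pagination off' \
     -ex 'thread apply all bt full' -ex quit > backtrace.txt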

-Ravi




Thansk for Gluster Community!!!

- kpkim

--
[2018-09-29 13:22:36.536532] W [inode.c:942:inode_find]
(-->/usr/lib64/glusterfs/3.12.13/xlator/protocol/server.so(+0xd01c)
[0x7f9bd249401c]
-->/usr/lib64/glusterfs/3.12.13/xlator/protocol/server.so(+0xc638)
[0x7f9bd2493638] -->/lib64/libglusterfs.so.0(inode_find+0x92) [
0x7f9be7090a82] ) 0-gluvol02-05-server: table not found
[2018-09-29 13:22:36.536579] W [inode.c:680:inode_new]
(-->/usr/lib64/glusterfs/3.12.13/xlator/protocol/server.so(+0xd048)
[0x7f9bd2494048]
-->/usr/lib64/glusterfs/3.12.13/xlator/protocol/server.so(+0xc14d)
[0x7f9bd249314d] -->/lib64/libglusterfs.so.0(inode_new+0x8a) [0x
7f9be70900ba] ) 0-gluvol02-05-server: inode not found
[2018-09-29 13:22:36.537568] W [inode.c:2305:inode_is_linked]
(-->/usr/lib64/glusterfs/3.12.13/xlator/features/quota.so(+0x4fc6)
[0x7f9bd2b1cfc6]
-->/usr/lib64/glusterfs/3.12.13/xlator/features/index.so(+0x4bb9)
[0x7f9bd2d43bb9] -->/lib64/libglusterfs.so.0(inode_is_linke
d+0x8a) [0x7f9be70927ea] ) 0-gluvol02-05-index: inode not found
pending frames:
frame : type(0) op(18)
frame : type(0) op(18)
frame : type(0) op(28)
--snip --
frame : type(0) op(28)
frame : type(0) op(28)
frame : type(0) op(18)
patchset: git://git.gluster.org/glusterfs.git

signal received: 11
time of crash:
2018-09-29 13:22:36
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.12.13
/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xa0)[0x7f9be70804c0]
/lib64/libglusterfs.so.0(gf_print_trace+0x334)[0x7f9be708a3f4]
/lib64/libc.so.6(+0x362f0)[0x7f9be56e02f0]

/usr/lib64/glusterfs/3.12.13/xlator/features/index.so(+0x4bc4)[0x7f9bd2d43bc4]

/usr/lib64/glusterfs/3.12.13/xlator/features/quota.so(+0x4fc6)[0x7f9bd2b1cfc6]

/usr/lib64/glusterfs/3.12.13/xlator/debug/io-stats.so(+0x4e53)[0x7f9bd28eee53]
/lib64/libglusterfs.so.0(default_lookup+0xbd)[0x7f9be70fddfd]

/usr/lib64/glusterfs/3.12.13/xlator/protocol/server.so(+0xc342)[0x7f9bd2493342]

/usr/lib64/glusterfs/3.12.13/xlator/protocol/server.so(+0xd048)[0x7f9bd2494048]

/usr/lib64/glusterfs/3.12.13/xlator/protocol/server.so(+0xd2c0)[0x7f9bd24942c0]

/usr/lib64/glusterfs/3.12.13/xlator/protocol/server.so(+0xc89e)[0x7f9bd249389e]

/usr/lib64/glusterfs/3.12.13/xlator/protocol/server.so(+0xd354)[0x7f9bd2494354]

/usr/lib64/glusterfs/3.12.13/xlator/protocol/server.so(+0x2f829)[0x7f9bd24b6829]
/lib64/libgfrpc.so.0(rpcsvc_request_handler+0x96)[0x7f9be6e42246]
/lib64/libpthread.so.0(+0x7e25)[0x7f9be5edfe25]
/lib64/libc.so.6(clone+0x6d)[0x7f9be57a8bad]


___
Gluster-devel mailing list
Gluster-devel@gluster.org 
https://lists.gluster.org/mailman/listinfo/gluster-devel



--
Pranith


___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Clang-Formatter for GlusterFS.

2018-09-18 Thread Ravishankar N




On 09/18/2018 02:02 PM, Hari Gowtham wrote:

I see that the procedure mentioned in the coding standard document is buggy.

git show --pretty="format:" --name-only | grep -v "contrib/" | egrep
"*\.[ch]$" | xargs clang-format -i

The above command edited the whole file, which is not supposed to happen.
It works fine on fedora 28 (clang version 6.0.1). I had the same problem 
you faced on fedora 26 though, presumably because of the older clang 
version.

-Ravi



+1 for the readability of the code having been affected.
On Mon, Sep 17, 2018 at 10:45 AM Amar Tumballi  wrote:



On Mon, Sep 17, 2018 at 10:00 AM, Ravishankar N  wrote:


On 09/13/2018 03:34 PM, Niels de Vos wrote:

On Thu, Sep 13, 2018 at 02:25:22PM +0530, Ravishankar N wrote:
...

What rules does clang impose on function/argument wrapping and alignment? I
somehow found the new code wrapping to be random and highly unreadable. An
example of 'before and after' the clang format patches went in:
https://paste.fedoraproject.org/paste/dC~aRCzYgliqucGYIzxPrQ Wondering if
this is just me or is it some problem of spurious clang fixes.

I agree that this example looks pretty ugly. Looking at random changes
to the code where I am most active does not show this awkward
formatting.


So one of my recent patches is failing smoke and clang-format is insisting 
[https://build.gluster.org/job/clang-format/22/console] on wrapping function 
arguments in an unsightly manner. Should I resend my patch with this new style 
of wrapping ?


I would say yes! We can get better by changing the clang-format options once 
we find better ones. But for now, just following what the clang-format job 
suggests is good IMO.

-Amar


Regards,
Ravi




However, I was expecting to see enforcing of the
single-line-if-statements like this (and while/for/.. loops):

  if (need_to_do_it) {
   do_it();
  }

instead of

  if (need_to_do_it)
   do_it();

At least the conversion did not take care of this. But, maybe I'm wrong
as I can not find the discussion in https://bugzilla.redhat.com/1564149
about this. Does someone remember what was decided in the end?

Thanks,
Niels





--
Amar Tumballi (amarts)
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel





___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Clang-Formatter for GlusterFS.

2018-09-13 Thread Ravishankar N


On 09/12/2018 07:31 PM, Amar Tumballi wrote:

Top posting:

All is well at the tip of glusterfs master branch now.

We will post a postmortem report of events and what went wrong in this 
activity, later.


With this, Shyam, you can go ahead with release-v5.0 branching.

-Amar

On Wed, Sep 12, 2018 at 6:21 PM, Amar Tumballi wrote:




On Wed, Sep 12, 2018 at 5:36 PM, Amar Tumballi wrote:



On Mon, Aug 27, 2018 at 8:47 AM, Amar Tumballi wrote:



On Wed, Aug 22, 2018 at 12:35 PM, Amar Tumballi wrote:

Hi All,

Below is an update about the project’s move towards
using clang-formatter for imposing a few coding standards.

The Gluster project, since inception, followed certain
basic coding standards, which were (at that time) easy
to follow and easy to review.

Over time, with the inclusion of many more developers
and work with other communities, whose coding
standards differ across projects, we got
different types of code into the source. After 11+ years,
now is the time we should be depending on a tool for it
more than ever, and hence we have decided to depend on
clang-formatter for this.

Below are some highlights of this activity. We expect
each of you to actively help us in this move, so it is
smooth for all of us.

  * We kickstarted this activity sometime around April 2018

  * There was a repo created for trying out the
    options, and validating the code. Link to Repo

  * Now, with the latest .clang-format file, we have
    made the changes across the whole GlusterFS codebase. The
    change here
  * We will be running regression with the changes,
multiple times, so we don’t want to miss something
getting in without our notice.
  * As it is a very big change (almost 6 lakh, i.e. ~600,000,
    lines changed), we will not put this commit through
    gerrit, but push it directly to the repo.
  * Once this patch gets in (ETA: 28th August), all
    the pending patches need to go through a rebase.


All, as Shyam has proposed to change the branch out date
for release-5.0 as Sept 10th [1], we are now targeting
Sept 7th for this activity.


We are finally Done!

We delayed it by another 4 days to make sure we pass the
regression properly with clang changes, and it doesn't break
anything.

Also note, from now on, it is always better to format the changes
with the below command before committing.

 sh$ cd glusterfs-git-repo/
 sh$ clang-format -i $(list_of_files_changed)
 sh$ git commit # and usual steps to publish your changes.
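
(One possible way to derive list_of_files_changed for the commit you are
about to push, just an illustrative git one-liner and not an official
script:)

 sh$ git diff --name-only HEAD~1 -- '*.c' '*.h' | xargs -r clang-format -i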

Also note, all the changes which were present earlier need
to be rebased with clang-format too.

One quick and dirty way to get your changes rebased, in
case your patch is significantly large, is to apply
the patches on top of the commit before the clang changes,
copy the files over, run clang-format -i on them, and
check the diff. As no changes other than coding-style changes
happened, this should work fine.

Please post if you have any concerns.




What rules does clang impose on function/argument wrapping and 
alignment? I somehow found the new code wrapping to be random and highly 
unreadable. An example of 'before and after' the clang format patches 
went in: https://paste.fedoraproject.org/paste/dC~aRCzYgliqucGYIzxPrQ 
Wondering if this is just me or is it some problem of spurious clang fixes.


Regards,
Ravi





Noticed some glitches! Stand with us till we handle the situation...

meantime, found that below command for git am works better for
applying smaller patches:

 $ git am --ignore-whitespace --ignore-space-change --reject 0001-patch

-Amar

Regards,
Amar

[1] -

https://lists.gluster.org/pipermail/gluster-devel/2018-August/055308.html



What are the next steps:

  * The patch of

Re: [Gluster-devel] Master branch lock down: RCA for tests (remove-brick-testcases.t)

2018-08-13 Thread Ravishankar N



On 08/13/2018 06:12 AM, Shyam Ranganathan wrote:

As a means of keeping the focus going and squashing the remaining tests
that were failing sporadically, request each test/component owner to,

- respond to this mail changing the subject (testname.t) to the test
name that they are responding to (adding more than one in case they have
the same RCA)
- with the current RCA and status of the same

List of tests and current owners as per the spreadsheet that we were
tracking are:

TBD

./tests/bugs/glusterd/remove-brick-testcases.t  TBD
In this case, the .t passed but self-heal-daemon (which btw does not 
have any role in this test because there is no I/O or heals in this .t) 
has crashed with the following bt:


Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x7ff8c6bc0b4f in _IO_cleanup () from ./lib64/libc.so.6
[Current thread is 1 (LWP 17530)]
(gdb)
(gdb) bt
#0  0x7ff8c6bc0b4f in _IO_cleanup () from ./lib64/libc.so.6
#1  0x7ff8c6b7cb8b in __run_exit_handlers () from ./lib64/libc.so.6
#2  0x7ff8c6b7cc27 in exit () from ./lib64/libc.so.6
#3  0x0040b14d in cleanup_and_exit (signum=15) at glusterfsd.c:1570
#4  0x0040de71 in glusterfs_sigwaiter (arg=0x7ffd5f270d20) at 
glusterfsd.c:2332

#5  0x7ff8c757ce25 in start_thread () from ./lib64/libpthread.so.0
#6  0x7ff8c6c41bad in clone () from ./lib64/libc.so.6

I am not able to find out the reason for the crash. Any pointers are 
appreciated. The regression run/core can be found at 
https://build.gluster.org/job/line-coverage/432/consoleFull .


Thanks,
Ravi

___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Announcing Softserve- serve yourself a VM

2018-08-11 Thread Ravishankar N



On 02/28/2018 06:56 PM, Deepshikha Khandelwal wrote:


Hi,


We have launched the alpha version of  SOFTSERVE[1], which allows 
Gluster Github organization members to provision virtual machines for 
a specified duration of time. These machines will be deleted 
automatically afterwards.



Now you don’t need to file a bug to get a VM. It’s just a form away, with 
a dashboard to monitor the machines.



Once the machine is up, you can access it via SSH and run your 
debugging (test regression runs).



We’ve enabled certain limits for this application:

1. A maximum of 5 VMs at a time across all the users. Users have to
   wait until a slot is available for them after 5 machines are
   allocated.

2. Users will get the requested machines for a maximum of up to 4 hours.

3. Access is limited to Gluster organization members.

These limits may be revisited in the near future. This service is ready 
to use, and if you find any problems, feel free to file an issue on the 
github repository[2].



Hi,
While https://github.com/gluster/softserve/issues/31 gets some 
attention, I've sent a PR [*] based on grepping through the source code to 
allow a 24-hour reservation slot. Please have a look.

Thanks!
Ravi
[*] https://github.com/gluster/softserve/pull/46



[1]https://softserve.gluster.org

[2]https://github.com/gluster/softserve

Thanks,

Deepshikha




___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Master branch lock down status (Fri, August 9th)

2018-08-10 Thread Ravishankar N




On 08/11/2018 07:29 AM, Shyam Ranganathan wrote:

./tests/bugs/replicate/bug-1408712.t (one retry)
I'll take a look at this. But it looks like archiving the artifacts 
(logs) for this run 
(https://build.gluster.org/job/regression-on-demand-full-run/44/consoleFull) 
was a failure.

Thanks,
Ravi
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] Are there daemons directly talking to arbiter bricks?

2018-08-09 Thread Ravishankar N

Hi,

Arbiter brick does not store any data and fails any readv requests wound 
to it with a log message.  Any client talking to the bricks via AFR is 
safe because AFR takes care of not winding down any readv to arbiter bricks.
But we have other daemons like bitd and scrubd talk directly to bricks. 
This is causing flooding of brick logs when bitrot is enabled in arbiter 
volumes as listed in [1].


While we need to fix the volfile in these daemons to not talk to the 
arbiter bricks, I wanted to know if there are other daemons existing in 
gluster that directly talk to brick processes which would need similar 
fixing.


Thanks,
Ravi

[1] https://github.com/gluster/glusterd2/issues/1089
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Master branch lock down status

2018-08-08 Thread Ravishankar N



On 08/08/2018 05:07 AM, Shyam Ranganathan wrote:

5) Current test failures
We still have the following tests failing and some without any RCA or
attention, (If something is incorrect, write back).

./tests/basic/afr/add-brick-self-heal.t (needs attention)
From the runs captured at https://review.gluster.org/#/c/20637/ , I saw 
that the latest runs where this particular .t failed were at 
https://build.gluster.org/job/line-coverage/415 and 
https://build.gluster.org/job/line-coverage/421/.
In both of these runs, there are no gluster 'regression' logs available 
at https://build.gluster.org/job/line-coverage//artifact. 
I have raised BZ 1613721 for it.


Also, Shyam was saying that in case of retries, the old (failure) logs 
get overwritten by the retries which are successful. Can we disable 
re-trying the .ts when they fail just for this lock down period alone so 
that we do have the logs?


Regards,
Ravi

___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] regression failures on afr/split-brain-resolution

2018-07-24 Thread Ravishankar N



On 07/24/2018 08:45 PM, Raghavendra Gowdappa wrote:


I tried higher values of attribute-timeout and it's not helping. Are 
there any other similar split-brain related tests? Can I mark these 
tests bad for the time being, as the md-cache patch has a deadline?





`git grep split-brain-status ` on the tests folder returned the following:
tests/basic/afr/split-brain-resolution.t:
tests/bugs/bug-1368312.t:
tests/bugs/replicate/bug-1238398-split-brain-resolution.t:
tests/bugs/replicate/bug-1417522-block-split-brain-resolution.t

I guess if it is blocking you, you can mark them as bad tests and 
assign the bug to me.

-Ravi
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] regression failures on afr/split-brain-resolution

2018-07-24 Thread Ravishankar N



On 07/25/2018 09:06 AM, Raghavendra Gowdappa wrote:



On Tue, Jul 24, 2018 at 6:54 PM, Ravishankar N <ravishan...@redhat.com> wrote:




On 07/24/2018 06:30 PM, Ravishankar N wrote:




On 07/24/2018 02:56 PM, Raghavendra Gowdappa wrote:

All,

I was trying to debug regression failures on [1] and observed
that split-brain-resolution.t was failing consistently.

=
TEST 45 (line 88): 0 get_pending_heal_count patchy
./tests/basic/afr/split-brain-resolution.t .. 45/45 RESULT 45: 1
./tests/basic/afr/split-brain-resolution.t .. Failed 17/45 subtests

Test Summary Report
---
./tests/basic/afr/split-brain-resolution.t (Wstat: 0 Tests: 45
Failed: 17)
  Failed tests:  24-26, 28-36, 41-45


On probing deeper, I observed a curious fact - on most of the
failures stat was not served from md-cache, but instead was
wound down to afr which failed stat with EIO as the file was in
split brain. So, I did another test:
* disabled md-cache
* mount glusterfs with attribute-timeout 0 and entry-timeout 0

Now the test fails always. So, I think the test relied on stat
requests being absorbed either by kernel attribute cache or
md-cache. When its not happening stats are reaching afr and
resulting in failures of cmds like getfattr etc.


This indeed seems to be the case. Is there any way we can avoid
the stat? When a getfattr is performed on the mount, aren't
lookup + getfattr the only fops that need to hit gluster?


Or should afr allow (f)stat even for replica-2 split-brains
because it is allowing lookup anyway (lookup cbk contains stat
information from one of its children) ?


I think the question here should be what kind of access we have to 
provide for files in split-brain. Once that broader question is 
answered, we should think about what fops come under those kinds of 
access. If setfattr/getfattr cmd access has to be provided, I guess 
lookup, stat, setxattr and getxattr need to work with split-brain files.


Ideally, the only fop that should be allowed access is checking whether 
the file exists or not (i.e. lookup), subject to quorum checks. All 
others should be denied. This is how it works as of today too, but we 
(afr) overloaded setfattr and getfattr with virtual xattrs to allow 
examining and resolving split-brain from the mount, which is now failing 
in the .t because of the stat failing like you pointed out. I think we 
should allow (f)stat too for the replica-2 case even when there are no good 
copies (i.e. read_subvol) to support the mount-based split-brain 
resolution method. Pranith, what do you think?


-Ravi




-Ravi

-Ravi


Thoughts?

[1] https://review.gluster.org/#/c/20549/
<https://review.gluster.org/#/c/20549/>


___
Gluster-devel mailing list
Gluster-devel@gluster.org <mailto:Gluster-devel@gluster.org>
https://lists.gluster.org/mailman/listinfo/gluster-devel
<https://lists.gluster.org/mailman/listinfo/gluster-devel>




___
Gluster-devel mailing list
Gluster-devel@gluster.org <mailto:Gluster-devel@gluster.org>
https://lists.gluster.org/mailman/listinfo/gluster-devel
<https://lists.gluster.org/mailman/listinfo/gluster-devel>





___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] regression failures on afr/split-brain-resolution

2018-07-24 Thread Ravishankar N



On 07/24/2018 06:30 PM, Ravishankar N wrote:




On 07/24/2018 02:56 PM, Raghavendra Gowdappa wrote:

All,

I was trying to debug regression failures on [1] and observed that 
split-brain-resolution.t was failing consistently.


=
TEST 45 (line 88): 0 get_pending_heal_count patchy
./tests/basic/afr/split-brain-resolution.t .. 45/45 RESULT 45: 1
./tests/basic/afr/split-brain-resolution.t .. Failed 17/45 subtests

Test Summary Report
---
./tests/basic/afr/split-brain-resolution.t (Wstat: 0 Tests: 45 
Failed: 17)

  Failed tests:  24-26, 28-36, 41-45


On probing deeper, I observed a curious fact - on most of the 
failures stat was not served from md-cache, but instead was wound 
down to afr which failed stat with EIO as the file was in split 
brain. So, I did another test:

* disabled md-cache
* mount glusterfs with attribute-timeout 0 and entry-timeout 0

Now the test always fails. So, I think the test relied on stat 
requests being absorbed either by the kernel attribute cache or md-cache. 
When that's not happening, stats are reaching afr and resulting in 
failures of cmds like getfattr etc.


This indeed seems to be the case. Is there any way we can avoid the 
stat? When a getfattr is performed on the mount, aren't lookup + 
getfattr the only fops that need to hit gluster?


Or should afr allow (f)stat even for replica-2 split-brains because it 
is allowing lookup anyway (lookup cbk contains stat information from one 
of its children) ?

-Ravi

-Ravi


Thoughts?

[1] https://review.gluster.org/#/c/20549/


___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel




___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] How gluster handle split-brain in the corner case from non-overlapping range lock in same file?

2018-05-13 Thread Ravishankar N



On 05/05/2018 10:04 PM, Yanfei Wang wrote:

Hi,

https://docs.gluster.org/en/v3/Administrator%20Guide/arbiter-volumes-and-quorum/,
said,

```
There is a corner case even with replica 3 volumes where the file can
end up in a split-brain. AFR usually takes range locks for the
{offset, length} of the write. If 3 writes happen on the same file at
non-overlapping {offset, length} and each write fails on (only) one
different brick, then we have AFR xattrs of the file blaming each
other.
```

Could some body give more details on it? Any clues are welcome.


We recently (glusterfs 3.13 onward) changed AFR to take full locks for 
all writes. You can see the commit message in 
https://review.gluster.org/#/c/19218/ for details. It is a configurable 
option, so you can change it to use range locks if needed.
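
(For completeness, toggling it would look something like the lines below. 
The exact option key, cluster.full-lock, is my recollection from that patch 
and should be treated as an assumption here; please verify it with 
`gluster volume set help` on your version.)

 gluster volume set <VOLNAME> cluster.full-lock off  # assumed key: fall back to range locks
 gluster volume set <VOLNAME> cluster.full-lock on   # assumed key: full file locks (the new default)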

-Ravi



Thanks a lot

- Fei
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [Gluster-users] Release 3.12.8: Scheduled for the 12th of April

2018-04-11 Thread Ravishankar N

Mabi,

It looks like one of the patches is not a straightforward cherry-pick 
to the 3.12 branch. Even though the conflict might be easy to resolve, I 
don't think it is a good idea to hurry it in for tomorrow. We will 
definitely have it ready by the next minor release (or earlier, if by chance the 
release is delayed and the backport is reviewed and merged before 
that). Hope that is acceptable.


-Ravi

On 04/11/2018 01:11 PM, mabi wrote:

Dear Jiffin,

Would it be possible to have the following backported to 3.12:

https://bugzilla.redhat.com/show_bug.cgi?id=1482064



See my mail with subject "New 3.12.7 possible split-brain on replica 
3" on the list earlier this week for more details.


Thank you very much.

Best regards,
Mabi

‐‐‐ Original Message ‐‐‐
On April 11, 2018 5:16 AM, Jiffin Tony Thottan  
wrote:



Hi,

It's time to prepare the 3.12.8 release, which falls on the 10th of
each month, and hence would be 12-04-2018 this time around.

This mail is to call out the following,

1) Are there any pending *blocker* bugs that need to be tracked for
3.12.8? If so, mark them against the provided tracker [1] as blockers
for the release, or at the very least post them as a response to this
mail.

2) Pending reviews in the 3.12 dashboard will be part of the release,
*iff* they pass regressions and have the review votes, so use the
dashboard [2] to check on the status of your patches to 3.12 and get
these going

3) I have checked what went into 3.10 post the 3.12 release, and since these
fixes are already included in the 3.12 branch, the status on this is *green*,
as all fixes ported to 3.10 are ported to 3.12 as well.

@Mlind

IMO https://review.gluster.org/19659 looks like a minor feature to me. 
Can you please provide a justification for why it needs to be included in the 3.12 
stable release?


And please rebase the change as well

@Raghavendra

The smoke test failed for https://review.gluster.org/#/c/19818/. Can 
you please check the same?


Thanks,
Jiffin

[1] Release bug tracker:
https://bugzilla.redhat.com/show_bug.cgi?id=glusterfs-3.12.8

[2] 3.12 review dashboard:
https://review.gluster.org/#/projects/glusterfs,dashboards/dashboard:3-12-dashboard




___
Gluster-users mailing list
gluster-us...@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Release 4.1: LTM release targeted for end of May

2018-03-21 Thread Ravishankar N



On 03/20/2018 07:07 PM, Shyam Ranganathan wrote:

On 03/12/2018 09:37 PM, Shyam Ranganathan wrote:

Hi,

As we wind down on 4.0 activities (waiting on docs to hit the site, and
packages to be available in CentOS repositories before announcing the
release), it is time to start preparing for the 4.1 release.

4.1 is where we have GD2 fully functional and shipping with migration
tools to aid Glusterd to GlusterD2 migrations.

Other than the above, this is a call out for features that are in the
works for 4.1. Please *post* the github issues to the *devel lists* that
you would like as a part of 4.1, and also mention the current state of
development.

Thanks to those who responded. The github lane and milestones for the
said features are updated; I request those who mentioned issues being
tracked for 4.1 to check that these are reflected in the project lane [1].

I have a few requests as follows that, if picked up, would be good things
to achieve by 4.1; volunteers welcome!

- Issue #224: Improve SOS report plugin maintenance
   - https://github.com/gluster/glusterfs/issues/224

- Issue #259: Compilation warnings with gcc 7.x
   - https://github.com/gluster/glusterfs/issues/259

- Issue #411: Ensure python3 compatibility across code base
   - https://github.com/gluster/glusterfs/issues/411

- NFS Ganesha HA (storhaug)
   - Does this need an issue for Gluster releases to track? (maybe packaging)

I will close the call for features by Monday 26th Mar, 2018. Post this,
I would request that features that need to make it into 4.1 be raised as
exceptions to the devel and maintainers list for evaluation.


Hi Shyam,

I want to add https://github.com/gluster/glusterfs/issues/363 also for 
4.1. It is not a new feature but rather an enhancement to a volume 
option in AFR. I don't think it can qualify as a bug fix, so mentioning 
it here just in case it needs to be tracked too. The (only) patch is 
undergoing review cycles.


Regards,
Ravi



Further, as we hit end of March, we would make it mandatory for features
to have required spec and doc labels, before the code is merged, so
factor in efforts for the same if not already done.

Current 4.1 project release lane is empty! I cleaned it up, because I
want to hear from all as to what content to add, than add things marked
with the 4.1 milestone by default.

[1] 4.1 Release lane:
https://github.com/gluster/glusterfs/projects/1#column-1075416


Thanks,
Shyam
P.S: Also any volunteers to shadow/participate/run 4.1 as a release owner?

Calling this out again!


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Release 4.1: LTM release targeted for end of May

2018-03-15 Thread Ravishankar N



On 03/13/2018 07:07 AM, Shyam Ranganathan wrote:

Hi,

As we wind down on 4.0 activities (waiting on docs to hit the site, and
packages to be available in CentOS repositories before announcing the
release), it is time to start preparing for the 4.1 release.

4.1 is where we have GD2 fully functional and shipping with migration
tools to aid Glusterd to GlusterD2 migrations.

Other than the above, this is a call out for features that are in the
works for 4.1. Please *post* the github issues to the *devel lists* that
you would like as a part of 4.1, and also mention the current state of
development.

Hi,

We are targeting the 'thin-arbiter' feature for 4.1:
https://github.com/gluster/glusterfs/issues/352

Status: High level design is there in the github issue.
Thin arbiter xlator patch https://review.gluster.org/#/c/19545/ is 
undergoing reviews.
Implementation details on AFR and glusterd(2) related changes are being 
discussed.  Will make sure all patches are posted against issue 352.


Thanks,
Ravi



Further, as we hit end of March, we would make it mandatory for features
to have required spec and doc labels, before the code is merged, so
factor in efforts for the same if not already done.

Current 4.1 project release lane is empty! I cleaned it up, because I
want to hear from all as to what content to add, than add things marked
with the 4.1 milestone by default.

Thanks,
Shyam
P.S: Also any volunteers to shadow/participate/run 4.1 as a release owner?
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Release 4.0: Unable to complete rolling upgrade tests

2018-03-01 Thread Ravishankar N



On 03/02/2018 11:04 AM, Anoop C S wrote:

On Fri, 2018-03-02 at 10:11 +0530, Ravishankar N wrote:

+ Anoop.

It looks like clients on the old (3.12) nodes are not able to talk to
the upgraded (4.0) node. I see messages like these on the old clients:

   [2018-03-02 03:49:13.483458] W [MSGID: 114007]
[client-handshake.c:1197:client_setvolume_cbk] 0-testvol-client-2:
failed to find key 'clnt-lk-version' in the options

Seems like we need to set clnt-lk-version from the server side too, similar to what 
we did for the client via
https://review.gluster.org/#/c/19560/. Can you try with the attached patch?
Thanks, self-heal works with this. You might want to get it merged in 
4.0 ASAP.


I still got the mkdir error on a plain distribute volume that I referred 
to in the other email in this thread. For anyone who is interested in trying 
it out, the steps are (a rough sketch of the commands follows below):

- Create a 2-node 2x1 plain distribute volume on 3.13 and fuse-mount it on node-1
- Upgrade the 2nd node to 4.0 and, once it is up and running,
- Perform mkdir from the mount on node-1 --> this returns EIO
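
A rough sketch of those steps, with hypothetical host and brick names (not
the exact commands from my setup):

 # on 3.13, hypothetical hosts node-1 and node-2
 gluster volume create testvol node-1:/bricks/b1 node-2:/bricks/b2 force
 gluster volume start testvol
 mount -t glusterfs node-1:/testvol /mnt/testvol   # fuse mount on node-1
 # upgrade node-2 to 4.0 (kill the gluster processes, yum upgrade, restart glusterd) ...
 mkdir /mnt/testvol/newdir                         # fails with EIO from the old client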

Thanks
Ravi
PS: Feeling a bit under the weather, so I might not be online today again.




Is there something more to be done on BZ 1544366?

-Ravi
On 03/02/2018 08:44 AM, Ravishankar N wrote:

On 03/02/2018 07:26 AM, Shyam Ranganathan wrote:

Hi Pranith/Ravi,

So, to keep a long story short, post upgrading 1 node in a 3 node 3.13
cluster, self-heal is not able to catch the heal backlog and this is a
very simple synthetic test anyway, but the end result is that upgrade
testing is failing.

Let me try this now and get back. I had done some thing similar when
testing the FIPS patch and the rolling upgrade had worked.
Thanks,
Ravi

Here are the details,

- Using
https://hackmd.io/GYIwTADCDsDMCGBaArAUxAY0QFhBAbIgJwCMySIwJmAJvGMBvNEA#
I setup 3 server containers to install 3.13 first as follows (within the
containers)

(inside the 3 server containers)
yum -y update; yum -y install centos-release-gluster313; yum install
glusterfs-server; glusterd

(inside centos-glfs-server1)
gluster peer probe centos-glfs-server2
gluster peer probe centos-glfs-server3
gluster peer status
gluster v create patchy replica 3 centos-glfs-server1:/d/brick1
centos-glfs-server2:/d/brick2 centos-glfs-server3:/d/brick3
centos-glfs-server1:/d/brick4 centos-glfs-server2:/d/brick5
centos-glfs-server3:/d/brick6 force
gluster v start patchy
gluster v status

Create a client container as per the document above, and mount the above
volume and create 1 file, 1 directory and a file within that directory.

Now we start the upgrade process (as laid out for 3.13 here
http://docs.gluster.org/en/latest/Upgrade-Guide/upgrade_to_3.13/ ):
- killall glusterfs glusterfsd glusterd
- yum install
http://cbs.centos.org/kojifiles/work/tasks/1548/311548/centos-release-gluster40-0.9-1.el7.centos.x86_64.rpm

- yum upgrade --enablerepo=centos-gluster40-test glusterfs-server

< Go back to the client and edit the contents of one of the files and
change the permissions of a directory, so that there are things to heal
when we bring up the newly upgraded server>

- gluster --version
- glusterd
- gluster v status
- gluster v heal patchy

The above starts failing as follows,
[root@centos-glfs-server1 /]# gluster v heal patchy
Launching heal operation to perform index self heal on volume patchy has
been unsuccessful:
Commit failed on centos-glfs-server2.glfstest20. Please check log file
for details.
Commit failed on centos-glfs-server3. Please check log file for details.

  From here, if further files or directories are created from the client,
they just get added to the heal backlog, and heal does not catchup.

As is obvious, I cannot proceed, as the upgrade procedure is broken. The
issue itself may not be selfheal deamon, but something around
connections, but as the process fails here, looking to you guys to
unblock this as soon as possible, as we are already running a day's slip
in the release.

Thanks,
Shyam


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Release 4.0: Unable to complete rolling upgrade tests

2018-03-01 Thread Ravishankar N


On 03/02/2018 10:11 AM, Ravishankar N wrote:

+ Anoop.

It looks like clients on the old (3.12) nodes are not able to talk to 
the upgraded (4.0) node. I see messages like these on the old clients:


 [2018-03-02 03:49:13.483458] W [MSGID: 114007] 
[client-handshake.c:1197:client_setvolume_cbk] 0-testvol-client-2: 
failed to find key 'clnt-lk-version' in the options


I see this in a 2x1 plain distribute also. I see ENOTCONN for the 
upgraded brick on the old client:


[2018-03-02 04:58:54.559446] E [MSGID: 114058] 
[client-handshake.c:1571:client_query_portmap_cbk] 0-testvol-client-1: 
failed to get the port number for remote subvolume. Please run 'gluster 
volume status' on server to see if brick process is running.
[2018-03-02 04:58:54.559618] I [MSGID: 114018] 
[client.c:2285:client_rpc_notify] 0-testvol-client-1: disconnected from 
testvol-client-1. Client process will keep trying to connect to glusterd 
until brick's port is available
[2018-03-02 04:58:56.973199] I [rpc-clnt.c:1994:rpc_clnt_reconfig] 
0-testvol-client-1: changing port to 49152 (from 0)
[2018-03-02 04:58:56.975844] I [MSGID: 114057] 
[client-handshake.c:1484:select_server_supported_programs] 
0-testvol-client-1: Using Program GlusterFS 3.3, Num (1298437), Version 
(330)
[2018-03-02 04:58:56.978114] W [MSGID: 114007] 
[client-handshake.c:1197:client_setvolume_cbk] 0-testvol-client-1: 
failed to find key 'clnt-lk-version' in the options
[2018-03-02 04:58:46.618036] E [MSGID: 114031] 
[client-rpc-fops.c:2768:client3_3_opendir_cbk] 0-testvol-client-1: 
remote operation failed. Path: / (----0001) 
[Transport endpoint is not connected]
The message "W [MSGID: 114031] 
[client-rpc-fops.c:2577:client3_3_readdirp_cbk] 0-testvol-client-1: 
remote operation failed [Transport endpoint is not connected]" repeated 
3 times between [2018-03-02 04:58:46.609529] and [2018-03-02 
04:58:46.618683]


Also, mkdir fails on the old mount with EIO, though physically 
succeeding on both bricks. Can the rpc folks offer a helping hand?


-Ravi

Is there something more to be done on BZ 1544366?

-Ravi
On 03/02/2018 08:44 AM, Ravishankar N wrote:


On 03/02/2018 07:26 AM, Shyam Ranganathan wrote:

Hi Pranith/Ravi,

So, to keep a long story short, post upgrading 1 node in a 3 node 3.13
cluster, self-heal is not able to catch the heal backlog and this is a
very simple synthetic test anyway, but the end result is that upgrade
testing is failing.


Let me try this now and get back. I had done some thing similar when 
testing the FIPS patch and the rolling upgrade had worked.

Thanks,
Ravi


Here are the details,

- Using
https://hackmd.io/GYIwTADCDsDMCGBaArAUxAY0QFhBAbIgJwCMySIwJmAJvGMBvNEA#
I setup 3 server containers to install 3.13 first as follows (within 
the

containers)

(inside the 3 server containers)
yum -y update; yum -y install centos-release-gluster313; yum install
glusterfs-server; glusterd

(inside centos-glfs-server1)
gluster peer probe centos-glfs-server2
gluster peer probe centos-glfs-server3
gluster peer status
gluster v create patchy replica 3 centos-glfs-server1:/d/brick1
centos-glfs-server2:/d/brick2 centos-glfs-server3:/d/brick3
centos-glfs-server1:/d/brick4 centos-glfs-server2:/d/brick5
centos-glfs-server3:/d/brick6 force
gluster v start patchy
gluster v status

Create a client container as per the document above, and mount the 
above

volume and create 1 file, 1 directory and a file within that directory.

Now we start the upgrade process (as laid out for 3.13 here
http://docs.gluster.org/en/latest/Upgrade-Guide/upgrade_to_3.13/ ):
- killall glusterfs glusterfsd glusterd
- yum install
http://cbs.centos.org/kojifiles/work/tasks/1548/311548/centos-release-gluster40-0.9-1.el7.centos.x86_64.rpm 


- yum upgrade --enablerepo=centos-gluster40-test glusterfs-server

< Go back to the client and edit the contents of one of the files and
change the permissions of a directory, so that there are things to heal
when we bring up the newly upgraded server>

- gluster --version
- glusterd
- gluster v status
- gluster v heal patchy

The above starts failing as follows,
[root@centos-glfs-server1 /]# gluster v heal patchy
Launching heal operation to perform index self heal on volume patchy 
has

been unsuccessful:
Commit failed on centos-glfs-server2.glfstest20. Please check log file
for details.
Commit failed on centos-glfs-server3. Please check log file for 
details.


 From here, if further files or directories are created from the 
client,

they just get added to the heal backlog, and heal does not catchup.

As is obvious, I cannot proceed, as the upgrade procedure is broken. 
The

issue itself may not be selfheal deamon, but something around
connections, but as the process fails here, looking to you guys to
unblock this as soon as possible, as we are already running a day's 
slip

in the release.

Thanks,
Shyam






___
Gluster-devel mailing list
Gluster-devel@

Re: [Gluster-devel] Release 4.0: Unable to complete rolling upgrade tests

2018-03-01 Thread Ravishankar N

+ Anoop.

It looks like clients on the old (3.12) nodes are not able to talk to 
the upgraded (4.0) node. I see messages like these on the old clients:


 [2018-03-02 03:49:13.483458] W [MSGID: 114007] 
[client-handshake.c:1197:client_setvolume_cbk] 0-testvol-client-2: 
failed to find key 'clnt-lk-version' in the options


Is there something more to be done on BZ 1544366?

-Ravi
On 03/02/2018 08:44 AM, Ravishankar N wrote:


On 03/02/2018 07:26 AM, Shyam Ranganathan wrote:

Hi Pranith/Ravi,

So, to keep a long story short, post upgrading 1 node in a 3 node 3.13
cluster, self-heal is not able to catch the heal backlog and this is a
very simple synthetic test anyway, but the end result is that upgrade
testing is failing.


Let me try this now and get back. I had done some thing similar when 
testing the FIPS patch and the rolling upgrade had worked.

Thanks,
Ravi


Here are the details,

- Using
https://hackmd.io/GYIwTADCDsDMCGBaArAUxAY0QFhBAbIgJwCMySIwJmAJvGMBvNEA#
I setup 3 server containers to install 3.13 first as follows (within the
containers)

(inside the 3 server containers)
yum -y update; yum -y install centos-release-gluster313; yum install
glusterfs-server; glusterd

(inside centos-glfs-server1)
gluster peer probe centos-glfs-server2
gluster peer probe centos-glfs-server3
gluster peer status
gluster v create patchy replica 3 centos-glfs-server1:/d/brick1
centos-glfs-server2:/d/brick2 centos-glfs-server3:/d/brick3
centos-glfs-server1:/d/brick4 centos-glfs-server2:/d/brick5
centos-glfs-server3:/d/brick6 force
gluster v start patchy
gluster v status

Create a client container as per the document above, and mount the above
volume and create 1 file, 1 directory and a file within that directory.

Now we start the upgrade process (as laid out for 3.13 here
http://docs.gluster.org/en/latest/Upgrade-Guide/upgrade_to_3.13/ ):
- killall glusterfs glusterfsd glusterd
- yum install
http://cbs.centos.org/kojifiles/work/tasks/1548/311548/centos-release-gluster40-0.9-1.el7.centos.x86_64.rpm 


- yum upgrade --enablerepo=centos-gluster40-test glusterfs-server

< Go back to the client and edit the contents of one of the files and
change the permissions of a directory, so that there are things to heal
when we bring up the newly upgraded server>

- gluster --version
- glusterd
- gluster v status
- gluster v heal patchy

The above starts failing as follows,
[root@centos-glfs-server1 /]# gluster v heal patchy
Launching heal operation to perform index self heal on volume patchy has
been unsuccessful:
Commit failed on centos-glfs-server2.glfstest20. Please check log file
for details.
Commit failed on centos-glfs-server3. Please check log file for details.

 From here, if further files or directories are created from the client,
they just get added to the heal backlog, and heal does not catchup.

As is obvious, I cannot proceed, as the upgrade procedure is broken. The
issue itself may not be selfheal deamon, but something around
connections, but as the process fails here, looking to you guys to
unblock this as soon as possible, as we are already running a day's slip
in the release.

Thanks,
Shyam




___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Release 4.0: Unable to complete rolling upgrade tests

2018-03-01 Thread Ravishankar N


On 03/02/2018 07:26 AM, Shyam Ranganathan wrote:

Hi Pranith/Ravi,

So, to keep a long story short, post upgrading 1 node in a 3 node 3.13
cluster, self-heal is not able to catch the heal backlog and this is a
very simple synthetic test anyway, but the end result is that upgrade
testing is failing.


Let me try this now and get back. I had done something similar when 
testing the FIPS patch and the rolling upgrade had worked.

Thanks,
Ravi


Here are the details,

- Using
https://hackmd.io/GYIwTADCDsDMCGBaArAUxAY0QFhBAbIgJwCMySIwJmAJvGMBvNEA#
I setup 3 server containers to install 3.13 first as follows (within the
containers)

(inside the 3 server containers)
yum -y update; yum -y install centos-release-gluster313; yum install
glusterfs-server; glusterd

(inside centos-glfs-server1)
gluster peer probe centos-glfs-server2
gluster peer probe centos-glfs-server3
gluster peer status
gluster v create patchy replica 3 centos-glfs-server1:/d/brick1
centos-glfs-server2:/d/brick2 centos-glfs-server3:/d/brick3
centos-glfs-server1:/d/brick4 centos-glfs-server2:/d/brick5
centos-glfs-server3:/d/brick6 force
gluster v start patchy
gluster v status

Create a client container as per the document above, and mount the above
volume and create 1 file, 1 directory and a file within that directory.

Now we start the upgrade process (as laid out for 3.13 here
http://docs.gluster.org/en/latest/Upgrade-Guide/upgrade_to_3.13/ ):
- killall glusterfs glusterfsd glusterd
- yum install
http://cbs.centos.org/kojifiles/work/tasks/1548/311548/centos-release-gluster40-0.9-1.el7.centos.x86_64.rpm
- yum upgrade --enablerepo=centos-gluster40-test glusterfs-server

< Go back to the client and edit the contents of one of the files and
change the permissions of a directory, so that there are things to heal
when we bring up the newly upgraded server>

- gluster --version
- glusterd
- gluster v status
- gluster v heal patchy

The above starts failing as follows,
[root@centos-glfs-server1 /]# gluster v heal patchy
Launching heal operation to perform index self heal on volume patchy has
been unsuccessful:
Commit failed on centos-glfs-server2.glfstest20. Please check log file
for details.
Commit failed on centos-glfs-server3. Please check log file for details.

 From here, if further files or directories are created from the client,
they just get added to the heal backlog, and heal does not catchup.

As is obvious, I cannot proceed, as the upgrade procedure is broken. The
issue itself may not be selfheal deamon, but something around
connections, but as the process fails here, looking to you guys to
unblock this as soon as possible, as we are already running a day's slip
in the release.

Thanks,
Shyam


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] use-case for 4 replicas and 1 arbiter

2018-02-12 Thread Ravishankar N



On 02/12/2018 05:02 PM, Niels de Vos wrote:

Hi Ravi,

Last week I was in a discussion about 4-way replication and one arbiter
(5 bricks per set). It seems that it is not possible to create this
configuration through the CLI. What would it take to make this
available?
The most important changes would be in the afr write transaction code, 
deciding when to prevent winding of FOPs and when to report the FOP cbk 
as a failure if quorum is not met, and for split-brain avoidance. The 
current arbitration logic is mostly written for 2+1, so that would need 
some thinking to modify/validate it for the generic n+1 (n being even) 
case that you mention.

The idea is to get a high available storage, split over three
datacenters. Two large datacenter have red and blue racks (separated
power supplies, networking etc.) and the smaller datacenter can host the
arbiter brick.

 .--.   .--.
 |   DC-1   |   |   DC-2   |
 | .---red---.  .--blue---. |   | .---red---.  .--blue---. |
 | | |  | | |   | | |  | | |
 | | |  | | |   | | |  | | |
 | |  [b-1]  |  |  [b-2]  | |===| |  [b-3]  |  |  [b-4]  | |
 | | |  | | |   | | |  | | |
 | | |  | | |   | | |  | | |
 | '-'  '-' |   | '-'  '-' |
 '--'   '--'
\   /
 \ /
  \   /
   .-.
   | DC-3|
   | .-. |
   | | | |
   | | | |
   | |  [a-1]  | |
   | | | |
   | | | |
   | '-' |
   '-'

Creating the volume looks like this, and errors out:

# gluster volume create red-blue replica 5 arbiter 1 \
dc1-red-svr1:/bricks/b-1 dc1-blue-svr1:/bricks/b-2 \
dc2-red-svr1:/bricks/b-3 dc2-blue-svr1:/bricks/b-4 \
dc3-svr1:/bricks/a-1
For arbiter configuration, replica count must be 3 and arbiter count
must be 1. The 3rd brick of the replica will be the arbiter

Possibly the thin-arbiter from https://review.gluster.org/19545 could be
a replacement for the 'full' arbiter. But that may require more time to
get stable than the current arbiter?
Thin arbiter is also targeted as a 2+1 solution, except there is only 
one brick that acts as the arbiter for all replica sub-volumes in a dist-rep 
setup. Also, it won't participate in the I/O path in the happy case of all 
bricks being up, so the latency of the thin-arbiter node can be higher, 
unlike the normal arbiter which has to be in the trusted storage pool. The 
level of granularity (for file availability) is less than that of normal 
arbiter volumes. Details can be found @ 
https://github.com/gluster/glusterfs/issues/352


Regards,
Ravi


Thanks,
Niels


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Release 4.0: Release notes (please read and contribute)

2018-02-09 Thread Ravishankar N



On 02/10/2018 01:24 AM, Shyam Ranganathan wrote:

On 02/02/2018 10:26 AM, Ravishankar N wrote:

2) "Replace MD5 usage to enable FIPS support" - Ravi, Amar

+ Kotresh who has done most (all to be precise) of the patches listed in
https://github.com/gluster/glusterfs/issues/230  in case he would like
to add anything.

There is pending work for this w.r.t. rolling upgrade support. I hope
to work on this next week, but I cannot commit to anything looking at the other
things in my queue :(.

I have this confusion reading the issue comments: if one of the
servers is updated, and the other server(s) in the replica are still old,
would the self-heal daemon work without the fix?

 From the comment [1], I understand that the new node's self-heal daemon
would crash.

If the above is true and the fix is to be at MD5 till  then
that is a must fix before release, as there is no way to upgrade and
handle heals, and hence not get into split brains later as I understand.

Where am I going wrong? or, is this understanding correct?
This is right. In the new node, the shd (afr) requests the checksum from 
both bricks (assuming a 1x2 setup). When saving the checksum in its local 
structures in the cbk (see __checksum_cbk, 
https://github.com/gluster/glusterfs/blob/release-4.0/xlators/cluster/afr/src/afr-self-heal-data.c#L45), 
it will do a memcpy of SHA256_DIGEST_LENGTH bytes even if the older 
brick sends only MD5_DIGEST_LENGTH bytes. It might or might not crash, but it 
is an illegal memory access.


The summary of changes we had in mind is:
- Restore md5sum in the code.
- Have a volume set option for the posix xlator tied to 
GD_OP_VERSION_4_0_0. By default, without this option set, 
posix_rchecksum will still send MD5SUM.
- Amar has introduced a flag in gfx_rchecksum_rsp. At the brick side, 
set that flag to 1 only if we are sending SHA256.
- Change the rchecksum fop_cbk signature to include the flag (or maybe 
capture the flag in the response xdata dict instead?).
- In afr, depending on whether the flag is set or not, memcpy the 
appropriate length.
- After the upgrade is complete and the cluster op-version becomes 
GD_OP_VERSION_4_0_0, the user can set the volume option and from then 
onwards rchecksum will use SHA256.


Regards,
Ravi



To add more clarity: for a fresh setup (clients + servers) on 4.0,
enabling FIPS works fine. But we need to handle the case of old servers and
new clients and vice versa. If this can be considered a bug fix, then
here is my attempt at the release notes for this fix:

"Previously, if gluster was run on a FIPS enabled system, it used to
crash because gluster used MD5 checksum in various places like self-heal
and geo-rep. This has been fixed by replacing MD5 with SHA256 which is
FIPS compliant."

I'm happy to update the above text in doc/release-notes/4.0.0.md and
send it on gerrit for review.

I can take care of this, no worries. I need the information, so that I
do not misrepresent :) provided any which way is fine...

[1] https://github.com/gluster/glusterfs/issues/230#issuecomment-358293386


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] query about a split-brain problem found in glusterfs3.12.3

2018-02-08 Thread Ravishankar N



On 02/08/2018 01:08 PM, Zhou, Cynthia (NSB - CN/Hangzhou) wrote:


Hi,

I checked the link you provided. It does not mention the “dirty” 
attribute. If I try to fix this split-brain manually with the setfattr 
command, should I only set the “trusted.afr.export-client-0” attribute?


Manually resetting xattrs is not recommended. Use the gluster CLI to 
resolve it.


By the way, I feel it is quite strange that in the output of the “gluster 
volume heal export info” command there are two entries with the same 
name. How does this happen?


Maybe the same entry is listed in different subfolders of 
.glusterfs/indices?


gluster volume heal export info
Brick sn-0.local:/mnt/bricks/export/brick
Status: Connected
Number of entries: 0

Brick sn-1.local:/mnt/bricks/export/brick
/testdir - Is in split-brain

/testdir - Possibly undergoing heal

Status: Connected
Number of entries: 2

I also did some other tests: when the sn-0 side file/dir does not have the 
“dirty” and “trusted.afr.export-client-*” attributes and the sn-1 side 
file/dir has both “dirty” and “trusted.afr.export-client-*” non-zero, 
gluster could self-heal such a scenario. But in this case it 
could never self-heal.


*From:*Ravishankar N [mailto:ravishan...@redhat.com]
*Sent:* Thursday, February 08, 2018 11:56 AM
*To:* Zhou, Cynthia (NSB - CN/Hangzhou) 
<cynthia.z...@nokia-sbell.com>; Gluster-devel@gluster.org

*Subject:* Re: query about a split-brain problem found in glusterfs3.12.3

On 02/08/2018 07:16 AM, Zhou, Cynthia (NSB - CN/Hangzhou) wrote:

Hi,

Thanks for responding!

If split-brain happening in such a test is reasonable, how do we
fix this split-brain situation?

If you are using replica 2, then there is no prevention. Once they 
occur, you can resolve them using 
http://docs.gluster.org/en/latest/Troubleshooting/resolving-splitbrain/


If you want to prevent split-brain, you would need to use replica 3 or 
arbiter volume.


Regards,
Ravi

*From:*Ravishankar N [mailto:ravishan...@redhat.com]
*Sent:* Thursday, February 08, 2018 12:12 AM
*To:* Zhou, Cynthia (NSB - CN/Hangzhou)
<cynthia.z...@nokia-sbell.com>
<mailto:cynthia.z...@nokia-sbell.com>; Gluster-devel@gluster.org
<mailto:Gluster-devel@gluster.org>
*Subject:* Re: query about a split-brain problem found in
glusterfs3.12.3

On 02/07/2018 10:39 AM, Zhou, Cynthia (NSB - CN/Hangzhou) wrote:

Hi glusterfs expert:

Good day.

Lately, we hit a glusterfs split-brain problem in our env in
/mnt/export/testdir. We start 3 IOR processes (IOR tool) from
non-sn nodes, which create/remove files repeatedly in
testdir. Then we reboot the sn nodes (sn-0 and sn-1) in sequence.
Then we hit the following problem.

Do you have some comments on how this could happen? And how to
fix it in this situation? Thanks!


Is the problem that split-brain is happening? Is this a replica 2
volume? If yes, then it looks like it is expected behavior?
Regards
Ravi


gluster volume heal export info
Brick sn-0.local:/mnt/bricks/export/brick
Status: Connected
Number of entries: 0

Brick sn-1.local:/mnt/bricks/export/brick
/testdir - Is in split-brain

/testdir - Possibly undergoing heal

Status: Connected
Number of entries: 2

wait for a while …..

gluster volume heal export info
Brick sn-0.local:/mnt/bricks/export/brick
Status: Connected
Number of entries: 0

Brick sn-1.local:/mnt/bricks/export/brick
/testdir - Possibly undergoing heal

/testdir - Possibly undergoing heal

and finally:

[root@sn-0:/root <http://sn-0/root>]
# gluster v heal export info
Brick sn-0.local:/mnt/bricks/export/brick
<http://local/mnt/bricks/export/brick>
Status: Connected
Number of entries: 0

Brick sn-1.local:/mnt/bricks/export/brick
<http://local/mnt/bricks/export/brick>
/testdir - Is in split-brain

Status: Connected
Number of entries: 1

[root@sn-0:/root <http://sn-0/root>]

# getfattr -m .* -d -e  hex /mnt/bricks/export/brick/testdir

getfattr: Removing leading '/' from absolute path names

# file: mnt/bricks/export/brick/testdir

trusted.gfid=0x5622cff893b3484dbdb6a20a0edb0e77

trusted.glusterfs.dht=0x0001

[root@sn-1:/root <http://sn-1/root>]

# getfattr -m .* -d -e  hex /mnt/bricks/export/brick/testdir

getfattr: Removing leading '/' from absolute path names

# file: mnt/bricks/export/brick/testdir

trusted.afr.dirty=0x0001

trusted.afr.export-client-0=0x0038

trusted.gfid=0x5622cff893b3484dbdb6a20a0edb0e77

trusted.glusterfs.dht=0x0001



_

Re: [Gluster-devel] query about a split-brain problem found in glusterfs3.12.3

2018-02-07 Thread Ravishankar N



On 02/08/2018 07:16 AM, Zhou, Cynthia (NSB - CN/Hangzhou) wrote:


Hi,

Thanks for responding!

If split-brain happening in such a test is reasonable, how do we fix 
this split-brain situation?


If you are using replica 2, then there is no prevention. Once they 
occur, you can resolve them using 
http://docs.gluster.org/en/latest/Troubleshooting/resolving-splitbrain/


If you want to prevent split-brain, you would need to use replica 3 or 
arbiter volume.


Regards,
Ravi


*From:*Ravishankar N [mailto:ravishan...@redhat.com]
*Sent:* Thursday, February 08, 2018 12:12 AM
*To:* Zhou, Cynthia (NSB - CN/Hangzhou) 
<cynthia.z...@nokia-sbell.com>; Gluster-devel@gluster.org

*Subject:* Re: query about a split-brain problem found in glusterfs3.12.3

On 02/07/2018 10:39 AM, Zhou, Cynthia (NSB - CN/Hangzhou) wrote:

Hi glusterfs expert:

Good day.

Lately, we hit a glusterfs split-brain problem in our env in
/mnt/export/testdir. We start 3 IOR processes (IOR tool) from non-sn
nodes, which create/remove files repeatedly in testdir.
Then we reboot the sn nodes (sn-0 and sn-1) in sequence. Then we hit the
following problem.

Do you have some comments on how this could happen? And how to fix
it in this situation? Thanks!


Is the problem that split-brain is happening? Is this a replica 2 
volume? If yes, then it looks like it is expected behavior?

Regards
Ravi

gluster volume heal export info
Brick sn-0.local:/mnt/bricks/export/brick
Status: Connected
Number of entries: 0

Brick sn-1.local:/mnt/bricks/export/brick
/testdir - Is in split-brain

/testdir - Possibly undergoing heal

Status: Connected
Number of entries: 2

wait for a while …..

gluster volume heal export info
Brick sn-0.local:/mnt/bricks/export/brick
Status: Connected
Number of entries: 0

Brick sn-1.local:/mnt/bricks/export/brick
/testdir - Possibly undergoing heal

/testdir - Possibly undergoing heal

and finally:

[root@sn-0:/root <http://sn-0/root>]
# gluster v heal export info
Brick sn-0.local:/mnt/bricks/export/brick
<http://local/mnt/bricks/export/brick>
Status: Connected
Number of entries: 0

Brick sn-1.local:/mnt/bricks/export/brick
<http://local/mnt/bricks/export/brick>
/testdir - Is in split-brain

Status: Connected
Number of entries: 1

[root@sn-0:/root <http://sn-0/root>]

# getfattr -m .* -d -e  hex /mnt/bricks/export/brick/testdir

getfattr: Removing leading '/' from absolute path names

# file: mnt/bricks/export/brick/testdir

trusted.gfid=0x5622cff893b3484dbdb6a20a0edb0e77

trusted.glusterfs.dht=0x0001

[root@sn-1:/root <http://sn-1/root>]

# getfattr -m .* -d -e  hex /mnt/bricks/export/brick/testdir

getfattr: Removing leading '/' from absolute path names

# file: mnt/bricks/export/brick/testdir

trusted.afr.dirty=0x0001

trusted.afr.export-client-0=0x0038

trusted.gfid=0x5622cff893b3484dbdb6a20a0edb0e77

trusted.glusterfs.dht=0x0001



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] query about a split-brain problem found in glusterfs3.12.3

2018-02-07 Thread Ravishankar N



On 02/07/2018 10:39 AM, Zhou, Cynthia (NSB - CN/Hangzhou) wrote:


Hi glusterfs expert:

Good day.

Lately, we hit a glusterfs split-brain problem in our env in 
/mnt/export/testdir. We start 3 IOR processes (IOR tool) from non-sn 
nodes, which create/remove files repeatedly in testdir. Then we 
reboot the sn nodes (sn-0 and sn-1) in sequence. Then we hit the following problem.


Do you have some comments on how this could happen? And how to fix it 
in this situation? Thanks!




Is the problem that split-brain is happening? Is this a replica 2 
volume? If yes, then it looks like it is expected behavior?

Regards
Ravi


gluster volume heal export info
Brick sn-0.local:/mnt/bricks/export/brick
Status: Connected
Number of entries: 0

Brick sn-1.local:/mnt/bricks/export/brick
/testdir - Is in split-brain

/testdir - Possibly undergoing heal

Status: Connected
Number of entries: 2

wait for a while …..

gluster volume heal export info
Brick sn-0.local:/mnt/bricks/export/brick
Status: Connected
Number of entries: 0

Brick sn-1.local:/mnt/bricks/export/brick
/testdir - Possibly undergoing heal

/testdir - Possibly undergoing heal

and finally:

[root@sn-0:/root ]
# gluster v heal export info
Brick sn-0.local:/mnt/bricks/export/brick 


Status: Connected
Number of entries: 0

Brick sn-1.local:/mnt/bricks/export/brick 


/testdir - Is in split-brain

Status: Connected
Number of entries: 1

[root@sn-0:/root ]

# getfattr -m .* -d -e  hex /mnt/bricks/export/brick/testdir

getfattr: Removing leading '/' from absolute path names

# file: mnt/bricks/export/brick/testdir

trusted.gfid=0x5622cff893b3484dbdb6a20a0edb0e77

trusted.glusterfs.dht=0x0001

[root@sn-1:/root ]

# getfattr -m .* -d -e  hex /mnt/bricks/export/brick/testdir

getfattr: Removing leading '/' from absolute path names

# file: mnt/bricks/export/brick/testdir

trusted.afr.dirty=0x0001

trusted.afr.export-client-0=0x0038

trusted.gfid=0x5622cff893b3484dbdb6a20a0edb0e77

trusted.glusterfs.dht=0x0001



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Release 4.0: Release notes (please read and contribute)

2018-02-02 Thread Ravishankar N



On 02/01/2018 11:02 PM, Shyam Ranganathan wrote:

On 01/29/2018 05:10 PM, Shyam Ranganathan wrote:

Hi,

I have posted an initial draft version of the release notes here [1].

I would like to *suggest* the following contributors to help improve and
finish the release notes by 6th Feb, 2018. As you read this mail, if
you feel you cannot contribute, do let us know, so that we can find the
appropriate contributors for the same.

Reminder (1)

Request a response if you would be able to provide the release notes.
Release notes itself can come in later.

Helps plan for contingency in case you are unable to generate the
required notes.

Thanks!


NOTE: Please use the release tracker to post patches that modify the
release notes, the bug ID is *1539842* (see [2]).

1) Aravinda/Kotresh: Geo-replication section in the release notes

2) Kaushal/Aravinda/ppai: GD2 section in the release notes

3) Du/Poornima/Pranith: Performance section in the release notes

4) Amar: monitoring section in the release notes

Following are individual call outs for certain features:

1) "Ability to force permissions while creating files/directories on a
volume" - Niels

2) "Replace MD5 usage to enable FIPS support" - Ravi, Amar


+ Kotresh who has done most (all to be precise) of the patches listed in 
https://github.com/gluster/glusterfs/issues/230 in case he would like to 
add anything.


There is pending work for this w.r.t. rolling upgrade support. I hope 
to work on this next week, but I cannot commit to anything looking at the other 
things in my queue :(.
To add more clarity, for a fresh setup (clients + servers) in 4.0, 
enabling FIPS works fine. But we need to handle the case of old servers and 
new clients and vice versa. If this can be considered a bug fix, then 
here is my attempt at the release notes for this fix:


"Previously, if gluster was run on a FIPS enabled system, it used to 
crash because gluster used MD5 checksum in various places like self-heal 
and geo-rep. This has been fixed by replacing MD5 with SHA256 which is 
FIPS compliant."


I'm happy to update the above text in doc/release-notes/4.0.0.md and 
send it on gerrit for review.



Regards,
Ravi






3) "Dentry fop serializer xlator on brick stack" - Du

4) "Add option to disable nftw() based deletes when purging the landfill
directory" - Amar

5) "Enhancements for directory listing in readdirp" - Nithya

6) "xlators should not provide init(), fini() and others directly, but
have class_methods" - Amar

7) "New on-wire protocol (XDR) needed to support iattx and cleaner
dictionary structure" - Amar

8) "The protocol xlators should prevent sending binary values in a dict
over the networks" - Amar

9) "Translator to handle 'global' options" - Amar

Thanks,
Shyam

[1] github link to draft release notes:
https://github.com/gluster/glusterfs/blob/release-4.0/doc/release-notes/4.0.0.md

[2] Initial gerrit patch for the release notes:
https://review.gluster.org/#/c/19370/
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Release 3.13.2: Planned for 19th of Jan, 2018

2018-01-18 Thread Ravishankar N



On 01/19/2018 06:19 AM, Shyam Ranganathan wrote:

On 01/18/2018 07:34 PM, Ravishankar N wrote:


On 01/18/2018 11:53 PM, Shyam Ranganathan wrote:

On 01/02/2018 11:08 AM, Shyam Ranganathan wrote:

Hi,

As release 3.13.1 is announced, here are the needed details for
3.13.2

Release date: 19th Jan, 2018 (20th is a Saturday)

Heads up, this is tomorrow.


Tracker bug for blockers:
https://bugzilla.redhat.com/show_bug.cgi?id=glusterfs-3.13.2

The one blocker bug has had its patch merged, so I am assuming there are
no more that should block this release.

As usual, shout out in case something needs attention.

Hi Shyam,

1. There is one patch https://review.gluster.org/#/c/19218/ which
introduces full locks for AFR writevs. We're introducing this as a
GD_OP_VERSION_3_13_2 option. Please wait for it to be merged on the 3.13
branch today. Karthik, please backport the patch.

Do we need this behind an option, if existing behavior causes split
brains?
Yes this is for split-brain prevention. Arbiter volumes already take 
full locks but not normal replica volumes. This is for normal replica 
volumes. See Pranith's comment in 
https://review.gluster.org/#/c/19218/1/xlators/mgmt/glusterd/src/glusterd-volume-set.c@1557

Or is the option being added for workloads that do not have
multiple clients or clients writing to non-overlapping regions (and thus
need not suffer a penalty in performance maybe? But they should not
anyway as a single client and AFR eager locks should ensure this is done
only once for the lifetime of the file being accessed, right?)
Yes, single writers take eager lock which is always a full lock 
regardless of this change.
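
(For anyone wondering what "full lock" means mechanically: it is a lock 
over the file's entire byte range instead of just the region being 
written. AFR takes its locks via inodelk on the bricks, so the standalone 
POSIX record-locking snippet below is only an analogy of the "0/0 = whole 
file" idea, not gluster code.)

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/tmp/lock-demo", O_RDWR | O_CREAT, 0644); /* placeholder */
    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* l_start = 0 and l_len = 0 lock the entire file ("to EOF and beyond"),
     * which is the record-locking analogue of a full lock, as opposed to
     * locking only the byte range of one write. */
    struct flock full = {
        .l_type   = F_WRLCK,
        .l_whence = SEEK_SET,
        .l_start  = 0,
        .l_len    = 0,
    };

    if (fcntl(fd, F_SETLKW, &full) != 0)
        perror("fcntl(F_SETLKW)");

    /* ... write ... the lock is released when the fd is closed */
    close(fd);
    return 0;
}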

Regards
Ravi

Basically I would like to keep options out if possible in backports, as
that changes the gluster op-version and involves other upgrade steps to
be sure users can use this option etc. Which means more reading and
execution of upgrade steps for our users. Hence the concern!


2. I'm also backporting https://review.gluster.org/#/c/18571/. Please
consider merging it too today if it is ready.

This should be fine.


We will attach the relevant BZs to the tracker bug.

Thanks
Ravi

Shyam
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Release 3.13.2: Planned for 19th of Jan, 2018

2018-01-18 Thread Ravishankar N



On 01/18/2018 11:53 PM, Shyam Ranganathan wrote:

On 01/02/2018 11:08 AM, Shyam Ranganathan wrote:

Hi,

As release 3.13.1 is announced, here are the needed details for 3.13.2

Release date: 19th Jan, 2018 (20th is a Saturday)

Heads up, this is tomorrow.


Tracker bug for blockers:
https://bugzilla.redhat.com/show_bug.cgi?id=glusterfs-3.13.2

The one blocker bug has had its patch merged, so I am assuming there are
no more that should block this release.

As usual, shout out in case something needs attention.


Hi Shyam,

1. There is one patch https://review.gluster.org/#/c/19218/ which 
introduces full locks for AFR writevs. We're introducing this as a 
GD_OP_VERSION_3_13_2 option. Please wait for it to be merged on the 3.13 
branch today. Karthik, please backport the patch.


2. I'm also backporting https://review.gluster.org/#/c/18571/. Please 
consider merging it too today if it is ready.


We will attach the relevant BZs to the tracker bug.

Thanks
Ravi



Shyam
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] query about why glustershd can not afr_selfheal_recreate_entry because of "afr: Prevent null gfids in self-heal entry re-creation"

2018-01-16 Thread Ravishankar N



On 01/16/2018 02:22 PM, Lian, George (NSB - CN/Hangzhou) wrote:


Hi,

Thanks a lots for your update.

I would like to try to introduce more detail on where the issue came from.

This issue came from a test case in our team; the steps are like 
the following:


1) Set up a glusterfs ENV with replica 2: 2 storage server nodes and 2 
client nodes


2) Generate a split-brain file: sn-0 is normal, sn-1 is dirty.

Hi, sorry I did not understand the test case. What type of split-brain 
did you create? (data/metadata or gfid or file type mismatch)?


3) Delete the directory before heal begins (in this phase, the normal 
correct file in sn-0 is deleted by the “rm” command; the dirty file is still 
there)



Delete from the backend brick directly?


4) After that, the self-heal process always fails, with the 
log attached in the last mail


Maybe you can write a script or a .t file (like the ones in 
https://github.com/gluster/glusterfs/tree/master/tests/basic/afr) so 
that your test can be understood unambiguously.



Also attaching some command output FYI.

From my understanding, glusterfs maybe can’t handle the split-brain 
file in this case. Could you share your comments and confirm whether 
some enhancement will be done for this case or not?


If you create a split-brain in gluster, self-heal cannot heal it. You 
need to resolve it using one of the methods listed in 
https://docs.gluster.org/en/latest/Troubleshooting/resolving-splitbrain/#heal-info-and-split-brain-resolution


Thanks,
Ravi

rm -rf /mnt/export/testdir
rm: cannot remove '/mnt/export/testdir/test file': No data available

[root@sn-1:/root]
# ls -l /mnt/export/testdir/
ls: cannot access '/mnt/export/testdir/IORFILE_82_2': No data available
total 0
-? ? ? ? ?    ? test_file

[root@sn-1:/root]
# getfattr -m . -d -e hex /mnt/bricks/export/brick/testdir/
getfattr: Removing leading '/' from absolute path names
# file: mnt/bricks/export/brick/testdir/
trusted.afr.dirty=0x0001
trusted.afr.export-client-0=0x0054
trusted.gfid=0xb217d6af49024f189a69e0ccf5207572
trusted.glusterfs.dht=0x0001

[root@sn-0:/var/log/glusterfs]
# getfattr -m . -d -e hex /mnt/bricks/export/brick/testdir/
getfattr: Removing leading '/' from absolute path names
# file: mnt/bricks/export/brick/testdir/
trusted.gfid=0xb217d6af49024f189a69e0ccf5207572
trusted.glusterfs.dht=0x0001

Best Regards

George

*From:*gluster-devel-boun...@gluster.org 
[mailto:gluster-devel-boun...@gluster.org] *On Behalf Of *Ravishankar N

*Sent:* Tuesday, January 16, 2018 1:44 PM
*To:* Zhou, Cynthia (NSB - CN/Hangzhou) 
<cynthia.z...@nokia-sbell.com>; Gluster Devel <gluster-devel@gluster.org>
*Subject:* Re: [Gluster-devel] query about why glustershd can not 
afr_selfheal_recreate_entry because of "afr: Prevent null gfids in 
self-heal entry re-creation"


+ gluster-devel

On 01/15/2018 01:41 PM, Zhou, Cynthia (NSB - CN/Hangzhou) wrote:

Hi glusterfs expert,

    Good day,

    When I do some tests on glusterfs self-heal, I find the
following prints showing that when the dir/file type gets an error, it cannot
get self-healed.

Could you help to check if it is expected behavior? I find the code
change https://review.gluster.org/#/c/17981/ adds a
check for iatt->ia_type, so what if a file’s ia_type gets
corrupted? In this case it should not get self-healed?


Yes, without knowing the ia_type, afr_selfheal_recreate_entry() 
cannot decide what type of FOP to do (mkdir/link/mknod) to create the 
appropriate file on the sink. You would need to find out why the 
source brick is not returning a valid ia_type, i.e. why 
replies[source].poststat is not valid.
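
For reference, the guard that https://review.gluster.org/#/c/17981/ added 
is conceptually like the snippet below. This is a simplified, standalone 
paraphrase with stubbed types (gf_uuid_t, struct iatt and the helper names 
are all local to this example), not the actual upstream code:

#include <stdio.h>
#include <string.h>

typedef unsigned char gf_uuid_t[16];
typedef enum { IA_INVAL = 0, IA_IFREG, IA_IFDIR, IA_IFLNK } ia_type_t;

/* Stub of the fields the check cares about. */
struct iatt {
    ia_type_t ia_type;
    gf_uuid_t ia_gfid;
};

static int
gfid_is_null(const gf_uuid_t gfid)
{
    static const gf_uuid_t null_gfid = { 0 };
    return memcmp(gfid, null_gfid, sizeof(gf_uuid_t)) == 0;
}

/* Returns -1 when the source reply cannot be trusted and the entry must
 * not be re-created (this is what produces the "Invalid ia_type" log). */
static int
check_recreate_entry_ok(const struct iatt *iatt, const char *name)
{
    if (iatt->ia_type == IA_INVAL || gfid_is_null(iatt->ia_gfid)) {
        fprintf(stderr, "Invalid ia_type (%d) or null gfid for %s; "
                        "refusing to re-create the entry\n",
                iatt->ia_type, name);
        return -1;
    }
    return 0;
}

int main(void)
{
    struct iatt bad = { .ia_type = IA_INVAL }; /* what the gdb session showed */

    return check_recreate_entry_ok(&bad, "IORFILE_82_2") ? 1 : 0;
}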

Thanks,
Ravi


Thanks!

//heal info output

[root@sn-0:/home/robot]

# gluster v heal export info

Brick sn-0.local:/mnt/bricks/export/brick

Status: Connected

Number of entries: 0

Brick sn-1.local:/mnt/bricks/export/brick

/testdir - Is in split-brain

Status: Connected

Number of entries: 1

//sn-1 glustershd
log///

[2018-01-15 03:53:40.011422] I [MSGID: 108026]
[afr-self-heal-entry.c:887:afr_selfheal_entry_do]
0-export-replicate-0: performing entry selfheal on
b217d6af-4902-4f18-9a69-e0ccf5207572

[2018-01-15 03:53:40.013994] W [MSGID: 114031]
[client-rpc-fops.c:2860:client3_3_lookup_cbk] 0-export-client-1:
remote operation failed. Path: (null)
(----) [No data available]

[2018-01-15 03:53:40.014025] E [MSGID: 108037]
[afr-self-heal-entry.c:92:afr_selfheal_recreate_entry]
0-export-replic

Re: [Gluster-devel] query about why glustershd can not afr_selfheal_recreate_entry because of "afr: Prevent null gfids in self-heal entry re-creation"

2018-01-15 Thread Ravishankar N

+ gluster-devel


On 01/15/2018 01:41 PM, Zhou, Cynthia (NSB - CN/Hangzhou) wrote:

Hi glusterfs expert,
    Good day,
    When I do some tests on glusterfs self-heal, I find the following 
prints showing that when the dir/file type gets an error, it cannot get self-healed.
Could you help to check if it is expected behavior? I find the code 
change https://review.gluster.org/#/c/17981/ adds a 
check for iatt->ia_type, so what if a file’s ia_type gets 
corrupted? In this case it should not get self-healed?


Yes, without knowing the ia_type, afr_selfheal_recreate_entry() cannot 
decide what type of FOP to do (mkdir/link/mknod) to create the 
appropriate file on the sink. You would need to find out why the source 
brick is not returning a valid ia_type, i.e. why replies[source].poststat 
is not valid.

Thanks,
Ravi


Thanks!
//heal info output
[root@sn-0:/home/robot]
# gluster v heal export info
Brick sn-0.local:/mnt/bricks/export/brick
Status: Connected
Number of entries: 0
Brick sn-1.local:/mnt/bricks/export/brick
/testdir - Is in split-brain
Status: Connected
Number of entries: 1
//sn-1 glustershd 
log///
[2018-01-15 03:53:40.011422] I [MSGID: 108026] 
[afr-self-heal-entry.c:887:afr_selfheal_entry_do] 
0-export-replicate-0: performing entry selfheal on 
b217d6af-4902-4f18-9a69-e0ccf5207572
[2018-01-15 03:53:40.013994] W [MSGID: 114031] 
[client-rpc-fops.c:2860:client3_3_lookup_cbk] 0-export-client-1: 
remote operation failed. Path: (null) 
(----) [No data available]
[2018-01-15 03:53:40.014025] E [MSGID: 108037] 
[afr-self-heal-entry.c:92:afr_selfheal_recreate_entry] 
0-export-replicate-0: Invalid ia_type (0) or 
gfid(----). source brick=1, 
pargfid=----, name=IORFILE_82_2
//gdb attached to sn-1 
glustershd/

root@sn-1:/var/log/glusterfs]
# gdb attach 2191
GNU gdb (GDB) 8.0.1
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
<_http://gnu.org/licenses/gpl.html_>

This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<_http://www.gnu.org/software/gdb/bugs/_>.
Find the GDB manual and other documentation resources online at:
<_http://www.gnu.org/software/gdb/documentation/_>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
attach: No such file or directory.
Attaching to process 2191
[New LWP 2192]
[New LWP 2193]
[New LWP 2194]
[New LWP 2195]
[New LWP 2196]
[New LWP 2197]
[New LWP 2239]
[New LWP 2241]
[New LWP 2243]
[New LWP 2245]
[New LWP 2247]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
0x7f90aca037bd in __pthread_join (threadid=140259279345408, 
thread_return=0x0) at pthread_join.c:90

90 pthread_join.c: No such file or directory.
(gdb) break afr_selfheal_recreate_entry
Breakpoint 1 at 0x7f90a3b56dec: file afr-self-heal-entry.c, line 73.
(gdb) c
Continuing.
[Switching to Thread 0x7f90a1b8e700 (LWP 2241)]
Thread 9 "glustershdheal" hit Breakpoint 1, 
afr_selfheal_recreate_entry (frame=0x7f90980018d0, dst=0, source=1, 
sources=0x7f90a1b8ceb0 "", dir=0x7f9098011940, name=0x7f909c015d48 
"IORFILE_82_2",

inode=0x7f9098001bd0, replies=0x7f90a1b8c890) at afr-self-heal-entry.c:73
73 afr-self-heal-entry.c: No such file or directory.
(gdb) n
74  in afr-self-heal-entry.c
(gdb) n
75  in afr-self-heal-entry.c
(gdb) n
76  in afr-self-heal-entry.c
(gdb) n
77  in afr-self-heal-entry.c
(gdb) n
78  in afr-self-heal-entry.c
(gdb) n
79  in afr-self-heal-entry.c
(gdb) n
80  in afr-self-heal-entry.c
(gdb) n
81  in afr-self-heal-entry.c
(gdb) n
82  in afr-self-heal-entry.c
(gdb) n
83  in afr-self-heal-entry.c
(gdb) n
85  in afr-self-heal-entry.c
(gdb) n
86  in afr-self-heal-entry.c
(gdb) n
87  in afr-self-heal-entry.c
(gdb) print iatt->ia_type
$1 = IA_INVAL
(gdb) print gf_uuid_is_null(iatt->ia_gfid)
$2 = 1
(gdb) bt
#0 afr_selfheal_recreate_entry (frame=0x7f90980018d0, dst=0, source=1, 
sources=0x7f90a1b8ceb0 "", dir=0x7f9098011940, name=0x7f909c015d48 
"IORFILE_82_2", inode=0x7f9098001bd0, replies=0x7f90a1b8c890)

    at afr-self-heal-entry.c:87
#1 0x7f90a3b57d20 in __afr_selfheal_merge_dirent 
(frame=0x7f90980018d0, this=0x7f90a4024610, fd=0x7f9098413090, 
name=0x7f909c015d48 "IORFILE_82_2", inode=0x7f9098001bd0,
sources=0x7f90a1b8ceb0 "", healed_sinks=0x7f90a1b8ce70 
"\001\001A\230\220\177", locked_on=0x7f90a1b8ce50 
"\001\001\270\241\220\177", 

[Gluster-devel] Delete stale gfid entries during lookup

2017-12-28 Thread Ravishankar N

Hi,

https://review.gluster.org/#/c/19070/2 modifies posix_lookup() to remove 
the stale .glusterfs entry during a gfid (nameless) lookup.
The initial version (v1) of the patch attempted to remove it in 
posix_symlink in order to fix BZ 1529488, but in the review 
discussions (see the patch), it was decided that the correct fix should 
be to remove it in the lookup code path. Does anyone foresee any issues with 
this fix?
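
For context, a "stale .glusterfs entry" here is a gfid handle under 
<brick>/.glusterfs/XX/YY/<gfid> whose backing entry no longer exists (for 
directories the handle is a symlink, so it ends up dangling). The 
standalone sketch below only illustrates that idea with plain POSIX calls; 
it is not the code in the patch, and the path building and policy are 
deliberately simplified:

#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Build the handle path <brick>/.glusterfs/XX/YY/<gfid> for a gfid string
 * like "b217d6af-4902-4f18-9a69-e0ccf5207572". */
static void
gfid_handle_path(const char *brick, const char *gfid, char *out, size_t len)
{
    snprintf(out, len, "%s/.glusterfs/%c%c/%c%c/%s",
             brick, gfid[0], gfid[1], gfid[2], gfid[3], gfid);
}

/* Directory gfids are stored as symlinks under .glusterfs; if the symlink
 * no longer resolves, the handle is stale. Returns 1 if it was removed. */
static int
remove_stale_dir_handle(const char *handle)
{
    struct stat lst, st;

    if (lstat(handle, &lst) != 0 || !S_ISLNK(lst.st_mode))
        return 0;               /* absent, or not a directory handle */
    if (stat(handle, &st) == 0)
        return 0;               /* still resolves: not stale */
    return unlink(handle) == 0; /* dangling: drop it */
}

int main(int argc, char **argv)
{
    char handle[PATH_MAX];

    if (argc != 3 || strlen(argv[2]) < 36) {
        fprintf(stderr, "usage: %s <brick-root> <gfid-string>\n", argv[0]);
        return 2;
    }
    gfid_handle_path(argv[1], argv[2], handle, sizeof(handle));
    printf("%s: %s\n", handle,
           remove_stale_dir_handle(handle) ? "removed (stale)" : "left alone");
    return 0;
}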


Regards,
Ravi
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] glusterd crashes on /tests/bugs/replicate/bug-884328.t

2017-12-14 Thread Ravishankar N
...for a lot of patches on master. The crash is in volume set; the .t 
just does a volume set help. Can the glusterd devs take a look, as it is 
blocking merging of patches? I have raised BZ 1526268 with the details.


Thanks!

Ravi

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Tests failing on Centos 7

2017-11-27 Thread Ravishankar N



On 11/27/2017 07:12 PM, Nigel Babu wrote:

Hello folks,

I have an update on chunking. There's good news and bad. The first bit 
is that we have a chunked regression job now. It splits the run into 10 
chunks that are run in parallel. This chunking is quite simple at the 
moment and doesn't try to be very smart. The intelligence steps will 
come in once we're ready to go live.


In the meanwhile, we've run into a few road blocks. The following 
tests do not work on CentOS 7:


./tests/bugs/cli/bug-1169302.t
./tests/bugs/posix/bug-990028.t
./tests/bugs/glusterd/bug-1260185-donot-allow-detach-commit-unnecessarily.t
./tests/bugs/core/multiplex-limit-issue-151.t
./tests/basic/afr/split-brain-favorite-child-policy.t


Raised BZ 1518062 for the CentOS 7 machine.
-Ravi

./tests/bugs/core/bug-1432542-mpx-restart-crash.t

Can the maintainers for these components please take a look at this 
test and fix them to run on Centos 7? When we land chunked 
regressions, we'll switch out our entire build farm over to centos 7. 
If you want a test machine to reproduce the failure and debug, please 
file a bug requesting one with your SSH public key attached.


--
nigelb


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] ./tests/basic/ec/ec-4-1.t failed

2017-11-23 Thread Ravishankar N
...for my patch https://review.gluster.org/#/c/18791/ which only has AFR 
fixes.  Log is at 
https://build.gluster.org/job/centos6-regression/7616/console . Request 
EC folks to take a look.


Thanks,

Ravi

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Test cases failing on X86

2017-11-17 Thread Ravishankar N


On 11/17/2017 05:34 PM, Vaibhav Vaingankar wrote:


Hi,

I was executing test cases on an x86 Ubuntu 16.04 VM; however, I found the 
following test cases are consistently failing. Are they expected 
failures, or is something missing? Following are the build steps I used:


apt-get install make automake autoconf libtool flex bison
pkg-config libssl-dev libxml2-dev python-dev libaio-dev
libibverbs-dev librdmacm-dev libreadline-dev liblvm2-dev
libglib2.0-dev liburcu-dev libcmocka-dev libsqlite3-dev
libacl1-dev wget tar dbench git xfsprogs attr nfs-common
yajl-tools sqlite3
git clone https://github.com/gluster/glusterfs
cd glusterfs
git checkout v3.10.7
./autogen.sh
./configure
make
make install
./run-tests.sh 



Most likely setup issues. There were only 2 tests failing for me on 
master (see the thread on 
http://lists.gluster.org/pipermail/maintainers/2017-November/003617.html). 
Tests with G_TESTDEF_TEST_STATUS are known bad tests. For nfs tests, 
you need to `./configure --enable-gnfs`. Run the tests individually and 
see why they are failing.


HTH
Ravi


 1. arbiter-mount
 2. arbiter-statfs
 3. split-brain-favorite-child-policy
 4. bd
 5. nfs
 6. basic/mount
 7. nufa
 8. op_errnos
 9. quota-nfs
10. basic/quota
11. tier-snapshot
12. uss
13. volume-snapshot-clone
14. volume-snapshot-xml
15. volume-snapshot
16. mount-nfs-auth
17. quota-anon-fd-nfs
18. stats-dump
19. volume-status
20. br-stub
21. weighted-rebalance


Waiting for positive response.

Thanks and Regards.
Vaibhav


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] AFR: Fail lookups when quorum not met

2017-10-09 Thread Ravishankar N



On 09/22/2017 07:27 PM, Niels de Vos wrote:

On Fri, Sep 22, 2017 at 12:27:46PM +0530, Ravishankar N wrote:

Hello,

In AFR we currently allow look-ups to pass through without taking into
account whether the lookup is served from the good or bad brick. We always
serve from the good brick whenever possible, but if there is none, we just
serve the lookup from one of the bricks that we got a positive reply from.

We found a bug [1] due to this behavior where the iatt values returned in
the lookup call were bad and caused the client to hang. The proposed fix [2]
was to fail lookups when we definitely know the lookup can't be trusted (by
virtue of AFR xattrs indicating the replies we got from the up bricks are
indeed bad).

Note that this fix is *only* for replica 3 or arbiter volumes (not replica
2, where there is no notion of quorum). But we want to 'harden' the fix by
not allowing any look ups at all if quorum is not met (or) it is met but
there are no good copies.

Some implications of this:

-If a file ends up in data/meta data split-brain in replica 3/arbiter (rare
occurrence), we won't be able to delete it from the mount.

-Even if the only brick that is up is the good copy, we still fail it due to
lack of quorum.

Does anyone have comments/feedback?

I think additional improvements for correctness outweigh the two
negative side-effects that you listed.
Thanks for the feedback Niels. Since we haven't received any other 
inputs, I will rework my patch to include the changes for correctness.


Possibly the 2nd point could get some confusion from users. "it always
worked before" may be a reason to add a volume option for this? That is
something you can consider, but if you deem that overkill then I'm ok
with that too.
Yeah, a vol option is overkill IMO. Once merged in master, I am 
thinking of having this fix only in 3.13.0 and calling it out explicitly 
(and not back-porting it to a 3.12 minor release) to mitigate confusion to a 
certain extent.

Regards,
Ravi


Thanks,
Niels



Thanks,

Ravi

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1467250

[2] https://review.gluster.org/#/c/17673/ (See review comments on the
landing page if interested)

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] brick multiplexing regression is broken

2017-10-06 Thread Ravishankar N

The test is failing on master without any patches:

[root@tuxpad glusterfs]# prove tests/bugs/bug-1371806_1.t
tests/bugs/bug-1371806_1.t .. 7/9 setfattr: ./tmp1: No such file or 
directory

setfattr: ./tmp2: No such file or directory
setfattr: ./tmp3: No such file or directory
setfattr: ./tmp4: No such file or directory
setfattr: ./tmp5: No such file or directory
setfattr: ./tmp6: No such file or directory
setfattr: ./tmp7: No such file or directory
setfattr: ./tmp8: No such file or directory
setfattr: ./tmp9: No such file or directory
setfattr: ./tmp10: No such file or directory
./tmp1: user.foo: No such attribute
tests/bugs/bug-1371806_1.t .. Failed 2/9 subtests

Mount log for one of the directories:
[2017-10-06 05:32:10.059798] I [MSGID: 109005] 
[dht-selfheal.c:2458:dht_selfheal_directory] 0-patchy-dht: Directory 
selfheal failed: Unable to form layout for directory /tmp1
[2017-10-06 05:32:10.060013] E [MSGID: 109011] 
[dht-common.c:5011:dht_dir_common_setxattr] 0-patchy-dht: Failed to get 
mds subvol for path /tmp1gfid is ----
[2017-10-06 05:32:10.060041] W [fuse-bridge.c:1377:fuse_err_cbk] 
0-glusterfs-fuse: 99: SETXATTR() /tmp1 => -1 (No such file or directory)


Request the patch authors to take a look at it.
Thanks
Ravi

On 10/05/2017 06:04 PM, Atin Mukherjee wrote:
The following commit has broken the brick multiplexing regression job. 
tests/bugs/bug-1371806_1.t has failed a couple of times. One of the 
latest regression job reports is at 
https://build.gluster.org/job/regression-test-with-multiplex/406/console .



commit 9b4de61a136b8e5ba7bf0e48690cdb1292d0dee8
Author: Mohit Agrawal >
Date:   Fri May 12 21:12:47 2017 +0530

    cluster/dht : User xattrs are not healed after brick stop/start

    Problem: In a distributed volume custom extended attribute value 
for a directory
 does not display correct value after stop/start or added 
newly brick.
 If any extended(acl) attribute value is set for a 
directory after stop/added
 the brick the attribute(user|acl|quota) value is not 
updated on brick

 after start the brick.

    Solution: First store hashed subvol or subvol(has internal xattr) 
on inode ctx and
  consider it as a MDS subvol.At the time of update custom 
xattr
  (user,quota,acl, selinux) on directory first check the 
mds from
  inode ctx, if mds is not present on inode ctx then throw 
EINVAL error
  to application otherwise set xattr on MDS subvol with 
internal xattr
  value of -1 and then try to update the attribute on 
other non MDS

  volumes also.If mds subvol is down in that case throw an
  error "Transport endpoint is not connected". In 
dht_dir_lookup_cbk|
  dht_revalidate_cbk|dht_discover_complete call 
dht_call_dir_xattr_heal

  to heal custom extended attribute.
  In case of gnfs server if hashed subvol has not found 
based on

  loc then wind a call on all subvol to update xattr.

    Fix:    1) Save MDS subvol on inode ctx
    2) Check if mds subvol is present on inode ctx
    3) If mds subvol is down then call unwind with error 
ENOTCONN and if it is up
   then set new xattr "GF_DHT_XATTR_MDS" to -1 and wind a 
call on other

   subvol.
    4) If setxattr fop is successful on non-mds subvol then 
increment the value of

   internal xattr to +1
    5) At the time of directory_lookup check the value of new 
xattr GF_DHT_XATTR_MDS
    6) If value is not 0 in dht_lookup_dir_cbk(other cbk) 
functions then call heal

   function to heal user xattr
    7) syncop_setxattr on hashed_subvol to reset the value of 
xattr to 0

   if heal is successful on all subvol.

    Test : To reproduce the issue followed below steps
   1) Create a distributed volume and create mount point
   2) Create some directory from mount point mkdir tmp{1..5}
   3) Kill any one brick from the volume
   4) Set extended attribute from mount point on directory
  setfattr -n user.foo -v "abc" ./tmp{1..5}
  It will throw error " Transport End point is not connected "
  for those hashed subvol is down
   5) Start volume with force option to start brick process
   6) Execute getfattr command on mount point for directory
   7) Check extended attribute on brick
  getfattr -n user.foo /tmp{1..5}
  It shows correct value for directories for those
  xattr fop were executed successfully.

    Note: The patch will resolve xattr healing problem only for fuse mount
  not for nfs mount.

    BUG: 1371806
    Signed-off-by: Mohit Agrawal >


    Change-Id: 

Re: [Gluster-devel] brick multiplexing regression is broken

2017-10-06 Thread Ravishankar N



On 10/06/2017 11:08 AM, Mohit Agrawal wrote:

Without a patch the test case will fail; it is expected behavior.
When I said without patches, I meant it is failing on current HEAD on 
master, which has the commit 9b4de61a136b8e5ba7bf0e48690cdb1292d0dee8.
-Ravi



Regards
Mohit Agrawal

On Fri, Oct 6, 2017 at 11:04 AM, Ravishankar N <ravishan...@redhat.com 
<mailto:ravishan...@redhat.com>> wrote:


The test is failing on master without any patches:

[root@tuxpad glusterfs]# prove tests/bugs/bug-1371806_1.t
tests/bugs/bug-1371806_1.t .. 7/9 setfattr: ./tmp1: No such file
or directory
setfattr: ./tmp2: No such file or directory
setfattr: ./tmp3: No such file or directory
setfattr: ./tmp4: No such file or directory
setfattr: ./tmp5: No such file or directory
setfattr: ./tmp6: No such file or directory
setfattr: ./tmp7: No such file or directory
setfattr: ./tmp8: No such file or directory
setfattr: ./tmp9: No such file or directory
setfattr: ./tmp10: No such file or directory
./tmp1: user.foo: No such attribute
tests/bugs/bug-1371806_1.t .. Failed 2/9 subtests

Mount log for one of the directories:
[2017-10-06 05:32:10.059798] I [MSGID: 109005]
[dht-selfheal.c:2458:dht_selfheal_directory] 0-patchy-dht:
Directory selfheal failed: Unable to form layout for directory /tmp1
[2017-10-06 05:32:10.060013] E [MSGID: 109011]
[dht-common.c:5011:dht_dir_common_setxattr] 0-patchy-dht: Failed
to get mds subvol for path /tmp1gfid is
----
[2017-10-06 05:32:10.060041] W [fuse-bridge.c:1377:fuse_err_cbk]
0-glusterfs-fuse: 99: SETXATTR() /tmp1 => -1 (No such file or
directory)

Request the patch authors to take a look at it.
Thanks
Ravi


On 10/05/2017 06:04 PM, Atin Mukherjee wrote:

The following commit has broken the brick multiplexing regression
job. tests/bugs/bug-1371806_1.t has failed a couple of times. One
of the latest regression job reports is at
https://build.gluster.org/job/regression-test-with-multiplex/406/console
.


commit 9b4de61a136b8e5ba7bf0e48690cdb1292d0dee8
Author: Mohit Agrawal <moagr...@redhat.com
<mailto:moagr...@redhat.com>>
Date:   Fri May 12 21:12:47 2017 +0530

    cluster/dht : User xattrs are not healed after brick stop/start

    Problem: In a distributed volume custom extended attribute
value for a directory
 does not display correct value after stop/start or
added newly brick.
 If any extended(acl) attribute value is set for a
directory after stop/added
 the brick the attribute(user|acl|quota) value is not
updated on brick
 after start the brick.

    Solution: First store hashed subvol or subvol(has internal
xattr) on inode ctx and
  consider it as a MDS subvol.At the time of update
custom xattr
  (user,quota,acl, selinux) on directory first check
the mds from
  inode ctx, if mds is not present on inode ctx then
throw EINVAL error
  to application otherwise set xattr on MDS subvol
with internal xattr
  value of -1 and then try to update the attribute on
other non MDS
  volumes also.If mds subvol is down in that case
throw an
  error "Transport endpoint is not connected". In
dht_dir_lookup_cbk|
  dht_revalidate_cbk|dht_discover_complete call
dht_call_dir_xattr_heal
  to heal custom extended attribute.
  In case of gnfs server if hashed subvol has not
found based on
  loc then wind a call on all subvol to update xattr.

    Fix:    1) Save MDS subvol on inode ctx
    2) Check if mds subvol is present on inode ctx
    3) If mds subvol is down then call unwind with error
ENOTCONN and if it is up
   then set new xattr "GF_DHT_XATTR_MDS" to -1 and
wind a call on other
   subvol.
    4) If setxattr fop is successful on non-mds subvol
then increment the value of
   internal xattr to +1
    5) At the time of directory_lookup check the value of
new xattr GF_DHT_XATTR_MDS
    6) If value is not 0 in dht_lookup_dir_cbk(other cbk)
functions then call heal
   function to heal user xattr
    7) syncop_setxattr on hashed_subvol to reset the
value of xattr to 0
   if heal is successful on all subvol.

    Test : To reproduce the issue followed below steps
   1) Create a distributed volume and create mount point
   2) Create some directory from mount point mkdir tmp{1..5}
   3

[Gluster-devel] AFR: Fail lookups when quorum not met

2017-09-22 Thread Ravishankar N

Hello,

In AFR we currently allow look-ups to pass through without taking into 
account whether the lookup is served from the good or bad brick. We 
always serve from the good brick whenever possible, but if there is 
none, we just serve the lookup from one of the bricks that we got a 
positive reply from.


We found a bug [1] due to this behavior where the iatt values returned 
in the lookup call were bad and caused the client to hang. The proposed 
fix [2] was to fail lookups when we definitely know the lookup can't be 
trusted (by virtue of AFR xattrs indicating the replies we got from the 
up bricks are indeed bad).


Note that this fix is *only* for replica 3 or arbiter volumes (not 
replica 2, where there is no notion of quorum). But we want to 'harden' 
the fix by not allowing any lookups at all if quorum is not met, or if 
it is met but there are no good copies.


Some implications of this:

-If a file ends up in data/meta data split-brain in replica 3/arbiter 
(rare occurrence), we won't be able to delete it from the mount.


-Even if the only brick that is up is the good copy, we still fail it 
due to lack of quorum.


Does anyone have comments/feedback?

Thanks,

Ravi

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1467250

[2] https://review.gluster.org/#/c/17673/ (See review comments on the 
landing page if interested)


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [Gluster-users] [Gluster-infra] lists.gluster.org issues this weekend

2017-09-22 Thread Ravishankar N

Hello,
Are our servers still facing the overload issue? My replies to 
gluster-users ML are not getting delivered to the list.

Regards,
Ravi

On 09/19/2017 10:03 PM, Michael Scherer wrote:

Le samedi 16 septembre 2017 à 20:48 +0530, Nigel Babu a écrit :

Hello folks,

We have discovered that for the last few weeks our mailman server was
used
for a spam attack. The attacker would make use of the + feature
offered by
gmail and hotmail. If you send an email to exam...@hotmail.com,
example+...@hotmail.com, example+...@hotmail.com, it goes to the same
inbox. We were constantly hit with requests to subscribe to a few
inboxes.
These requests overloaded our mail server so much that it gave up. We
detected this failure because a postmortem email to
gluster-in...@gluster.org bounced. Any emails sent to our mailman
server
may have been on hold for the last 24 hours or so. They should be
processed
now as your email provider re-attempts.

For the moment, we've banned subscribing with an email address with a
+ in
the name. If you are already subscribed to the lists with a + in your
email
address, you will continue to be able to use the lists.

We're looking at banning the spam IP addresses from being able to hit
the
web interface at all. When we have a working alternative, we will
look at
removing the current ban of using + in address.

So we have a alternative in place, I pushed a blacklist using
mod_security and a few DNS blacklist:
https://github.com/gluster/gluster.org_ansible_configuration/commit/2f4
c1b8feeae16e1d0b7d6073822a6786ed21dde





Apologies for the outage and a big shout out to Michael for taking
time out
of his weekend to debug and fix the issue.

Well, you can thank the airport in Prague for being less interesting
than a spammer attacking us.



___
Gluster-users mailing list
gluster-us...@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] How commonly applications make use of fadvise?

2017-08-11 Thread Ravishankar N



On 08/11/2017 04:51 PM, Niels de Vos wrote:

On Fri, Aug 11, 2017 at 12:47:47AM -0400, Raghavendra Gowdappa wrote:

Hi all,

In a conversation between me, Milind and Csaba, Milind pointed out
fadvise(2) [1] and its potential benefits to Glusterfs' caching
translators like read-ahead etc. After discussing it, we agreed
that our performance translators can leverage the hints to provide
better performance. Now the question is how commonly applications
actually provide hints? Is it something that is used quite frequently?
If yes, we can think of implementing this in glusterfs (probably
kernel-fuse too?). If no, there is not much of an advantage in
spending our energies here. Your inputs will help us to prioritize
this feature.

If functionality like this is available, we would add support in
libgfapi.so as well. NFS-Ganesha is prepared for consuming this
(fsal_obj_ops->io_advise), so applications running on top of NFS will
benefit. I failed to see if the standard Samba/vfs can use it.

A quick check in QEMU does not suggest it is used by the block drivers.

I don't think Linux/FUSE supports fadvise though. So this is an
oppertunity for a Gluster developer to get their name in the Linux
kernel :-) Feature additions like this have been done before by us, and
we should continue where we can. It is a relatively easy entry for
contributing to the Linux kernel.


To me it looks like fadvise (mm/fadvise.c) affects only the Linux page 
cache behavior and is decoupled from the filesystem itself. What this 
means for FUSE is that the 'advise' only applies to the content that the 
fuse kernel module has stored in that machine's page cache. Exposing it 
as a FOP would likely involve adding a new fop to struct file_operations 
that is common across the entire VFS and likely won't fly with the 
kernel folks. I could be wrong in understanding all of this. :-)
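
For reference, this is what the hint looks like from an application's 
point of view, using posix_fadvise(2) (the portable flavour of the fadvise 
family); the file path is just a placeholder:

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    int rc;
    int fd = open("/tmp/some-large-file", O_RDONLY); /* placeholder path */

    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* Hint that we will read sequentially, so the kernel can read ahead
     * more aggressively. posix_fadvise() returns an errno value directly. */
    rc = posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
    if (rc != 0)
        fprintf(stderr, "posix_fadvise(SEQUENTIAL): %s\n", strerror(rc));

    /* ... read the file ... */

    /* Once done, hint that the cached pages can be dropped. */
    rc = posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
    if (rc != 0)
        fprintf(stderr, "posix_fadvise(DONTNEED): %s\n", strerror(rc));

    close(fd);
    return 0;
}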


Regards,
Ravi
  

[1] https://linux.die.net/man/2/fadvise

As well as local man-pages for fadvise64/posix_fadvise.

Showing that we have support for this, suggests that the filesystem
becomes more mature and gains advanced features. This should impress
users and might open up more interest for certain (HPC?) use-cases.

Thanks,
Niels



regards,
Raghavendra
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] upstream regression suite is broken

2017-07-06 Thread Ravishankar N

I've sent a fix @ https://review.gluster.org/#/c/17721

On 07/07/2017 09:51 AM, Atin Mukherjee wrote:

Krutika,

tests/basic/stats-dump.t is failing all the time. As per my initial 
analysis, the failures are seen after https://review.gluster.org/#/c/17709/ 
got into the mainline, and reverting this patch makes the test 
run successfully. I do understand that the CentOS vote for this 
patch was green, but the last run was on 5th June, which was 1 month 
back. So some other changes have gone in in between which are now 
causing this patch to break the test.


This makes me think that, as maintainers, we do need to ensure that if the 
regression vote on a patch is quite old, a rebase of the patch is a 
must, to be on the safer side?


~Atin


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] reagarding backport information while porting patches

2017-06-22 Thread Ravishankar N

On 06/23/2017 09:15 AM, Pranith Kumar Karampuri wrote:

hi,
 Now that we are doing backports with the same Change-Id, we can find 
the patches and their backports both online and in the tree without 
any extra information in the commit message. So shall we stop adding 
text similar to:


> Reviewed-on: https://review.gluster.org/17414


Sometimes I combine 2 commits from master (typically commit 2, which 
fixes a bug in commit 1) into a single patch while backporting. The 
Change-Id is not the same in that case and I explicitly mention the 2 
patch URLs in the squashed commit sent to the release branch. So in 
those cases, some way to trace back to the patches in master is helpful. 
Otherwise I think it is fair to omit it.


> Smoke: Gluster Build System >
> Reviewed-by: Pranith Kumar Karampuri >
> Tested-by: Pranith Kumar Karampuri >
> NetBSD-regression: NetBSD Build System 
>
> Reviewed-by: Amar Tumballi >
> CentOS-regression: Gluster Build System 
>

(cherry picked from commit de92c363c95d16966dbcc9d8763fd4448dd84d13)

in the patches?

Do you see any other value from this information that I might be missing?

--
Pranith


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Support for statx in 4.0

2017-06-14 Thread Ravishankar N

On 06/12/2017 01:21 PM, Vijay Bellur wrote:

Hey All,

Linux 4.11 has added support for a new system call, statx [1]. statx 
provides more information than what stat() does today. Given that 
there could be potential users for this new interface it would be nice 
to have statx supported in 4.0. We could review whether some of our 
translators and the nfs & smb access layers could leverage this interface 
for some enhanced functionality.


I have not yet checked the current state of support for statx in fuse. 
Needless to say it would be useful to have it in there. If somebody's 
interested in working this out across the entire stack (fuse & 
gluster), do let us know here!




I can take a shot at this.  A cursory look seems to indicate support in 
fuse as well, since the syscall is basically overloading vfs_getattr()
and the patch contains changes to fuse_getattr to accommodate the new 
params. So it might just be the gluster bits where we need to introduce 
the fop.
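
For anyone who wants to experiment, here is a minimal C example of calling 
statx(2) directly. It assumes Linux 4.11+ and a libc that exposes the 
wrapper (recent glibc does); on older glibc one would have to go through 
syscall(SYS_statx, ...). The path is just a placeholder:

#define _GNU_SOURCE
#include <fcntl.h>     /* AT_FDCWD, AT_SYMLINK_NOFOLLOW */
#include <stdio.h>
#include <sys/stat.h>  /* statx(), struct statx on recent glibc */

int main(void)
{
    struct statx stx;

    /* Ask for the basic stats plus the birth time, one of the fields
     * plain stat() cannot return. */
    if (statx(AT_FDCWD, "/etc/hostname", AT_SYMLINK_NOFOLLOW,
              STATX_BASIC_STATS | STATX_BTIME, &stx) != 0) {
        perror("statx");
        return 1;
    }

    printf("size: %llu bytes\n", (unsigned long long)stx.stx_size);
    if (stx.stx_mask & STATX_BTIME) /* btime is valid only if reported */
        printf("birth time: %lld\n", (long long)stx.stx_btime.tv_sec);
    return 0;
}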


Regards,
Ravi


Regards,
Vijay

[1] https://patchwork.kernel.org/patch/8982111/
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [Gluster-users] 120k context switches on GlsuterFS nodes

2017-05-17 Thread Ravishankar N

On 05/17/2017 11:07 PM, Pranith Kumar Karampuri wrote:

+ gluster-devel

On Wed, May 17, 2017 at 10:50 PM, mabi > wrote:


I don't know exactly what kind of context-switches it was but what
I know is that it is the "cs" number under "system" when you run
vmstat.

Okay, that could be due to the syscalls themselves or pre-emptive 
multitasking in case there aren't enough CPU cores. I think the spike in 
numbers is due to more users accessing the files at the same time, like 
you observed, translating into more syscalls. You can try capturing the 
gluster volume profile info the next time it occurs and correlate it with 
the cs count. If you don't see any negative performance impact, I think 
you don't need to be bothered much by the numbers.


HTH,
Ravi



Also I use the Percona Linux monitoring template for Cacti
(https://www.percona.com/doc/percona-monitoring-plugins/LATEST/cacti/linux-templates.html)
which monitors context switches too. If that's of any use,
interrupts were also quite high during that time, with peaks up to
50k interrupts.




 Original Message 
Subject: Re: [Gluster-users] 120k context switches on GlsuterFS nodes
Local Time: May 17, 2017 2:37 AM
UTC Time: May 17, 2017 12:37 AM
From: ravishan...@redhat.com 
To: mabi >,
Gluster Users >


On 05/16/2017 11:13 PM, mabi wrote:

Today I even saw up to 400k context switches for around 30
minutes on my two-node replica... Does anyone else have such high
context switches on their GlusterFS nodes?

I am wondering what is "normal" and if I should be worried...





 Original Message 
Subject: 120k context switches on GlsuterFS nodes
Local Time: May 11, 2017 9:18 PM
UTC Time: May 11, 2017 7:18 PM
From: m...@protonmail.ch 
To: Gluster Users 


Hi,

Today I noticed that for around 50 minutes my two GlusterFS
3.8.11 nodes had a very high amount of context switches, around
120k. Usually the average is more around 1k-2k. So I checked
what was happening and there where just more users accessing
(downloading) their files at the same time. These are
directories with typical cloud files, which means files of any
sizes ranging from a few kB to MB and a lot of course.

Now I never saw such a high number in context switches in my
entire life so I wanted to ask if this is normal or to be
expected? I do not find any signs of errors or warnings in any
log files.



What context switch are you referring to (syscalls context-switch
on the bricks?) ? How did you measure this?
-Ravi


My volume is a replicated volume on two nodes with ZFS as
filesystem behind and the volume is mounted using FUSE on the
client (the cloud server). On that cloud server the glusterfs
process was using quite a lot of system CPU but that server
(VM) only has 2 vCPUs so maybe I should increase the number of
vCPUs...

Any ideas or recommendations?



Regards,
M.




___
Gluster-users mailing list
gluster-us...@gluster.org 
http://lists.gluster.org/mailman/listinfo/gluster-users




___ Gluster-users
mailing list gluster-us...@gluster.org

http://lists.gluster.org/mailman/listinfo/gluster-users
 


--
Pranith


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] tests/basic/afr/gfid-mismatch-resolution-with-fav-child-policy.t - regression failures

2017-05-17 Thread Ravishankar N

On 05/17/2017 04:09 PM, Pranith Kumar Karampuri wrote:

karthik, Ravi,
 What is the plan to bring it back? Did you guys find the RC for the 
failure?
Are you referring to gfid-mismatch-resolution-with-fav-child-policy.t? I 
already mentioned the RCA in the patch I linked to earlier in this thread.


On Mon, May 15, 2017 at 10:52 AM, Ravishankar N 
<ravishan...@redhat.com <mailto:ravishan...@redhat.com>> wrote:


On 05/12/2017 03:33 PM, Atin Mukherjee wrote:

tests/basic/afr/add-brick-self-heal.t
(http://git.gluster.org/cgit/glusterfs.git/tree/tests/basic/afr/add-brick-self-heal.t)
is the 2nd in the list.



All failures (https://fstat.gluster.org/weeks/1/failure/2) are in netbsd and
look like an issue with the slaves.
Most of the runs have this error:
Most of the runs have this error:

[07:29:32] Running tests in file ./tests/basic/afr/add-brick-self-heal.t
volume create: patchy: failed: Host nbslave70.cloud.gluster.org is not in 'Peer in Cluster' state

Not investigating the .t itself any further. -Ravi
___ Gluster-devel
mailing list Gluster-devel@gluster.org
<mailto:Gluster-devel@gluster.org>
http://lists.gluster.org/mailman/listinfo/gluster-devel
<http://lists.gluster.org/mailman/listinfo/gluster-devel> 


--
Pranith


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] tests/basic/afr/gfid-mismatch-resolution-with-fav-child-policy.t - regression failures

2017-05-14 Thread Ravishankar N

On 05/12/2017 03:33 PM, Atin Mukherjee wrote:
tests/basic/afr/add-brick-self-heal.t
is the 2nd in the list.



All failures (https://fstat.gluster.org/weeks/1/failure/2) are in netbsd 
and look like an issue with the slaves.

Most of the runs have this error:

[07:29:32] Running tests in file ./tests/basic/afr/add-brick-self-heal.t
volume create: patchy: failed: Host nbslave70.cloud.gluster.org is not in 'Peer 
in Cluster' state

Not investigating the .t itself any further.

-Ravi
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] tests/basic/afr/gfid-mismatch-resolution-with-fav-child-policy.t - regression failures

2017-05-14 Thread Ravishankar N

On 05/14/2017 10:05 PM, Atin Mukherjee wrote:



On Fri, May 12, 2017 at 3:51 PM, Karthik Subrahmanya 
wrote:


Hey Atin,

I had a look at
"tests/basic/afr/gfid-mismatch-resolution-with-fav-child-policy.t".
The test case passes on my local system with the latest master. I also
tried cherry-picking some of the patches which failed regression,
but it passed on my system.
In the list https://fstat.gluster.org/weeks/1/failure/214 many of the
patches passed this test case in a later run and are already
merged on master.

For many patches the test case failed the first time and passed
when it was tried a second time.
In some cases it failed with EIO while doing "ls" on the file,
but the immediate "cat" on the file passed.
It has some dependency on the CLI option to resolve gfid
split-brain, which is in progress.
So as discussed with Ravi, we were planning to mark it as bad for
the moment. Is that fine?


I'd suggest that and would request to mark it bad asap. It's been 
failing very frequently now.
Sent https://review.gluster.org/17290 . Karthik, please remove it as a 
part of testing your gfid split-brain CLI patch.

-Ravi



Regards,
Karthik

On Fri, May 12, 2017 at 3:33 PM, Atin Mukherjee
wrote:

Refer https://fstat.gluster.org/weeks/1 .
tests/basic/afr/add-brick-self-heal.t
is the 2nd in the list.


___
Gluster-devel mailing list
Gluster-devel@gluster.org 
http://lists.gluster.org/mailman/listinfo/gluster-devel






___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] stat() returns invalid file size when self healing

2017-04-12 Thread Ravishankar N

On 04/12/2017 01:57 PM, Mateusz Slupny wrote:

Hi,

I'm observing strange behavior when accessing glusterfs 3.10.0 volume 
through FUSE mount: when self-healing, stat() on a file that I know 
has non-zero size and is being appended to results in stat() return 
code 0, and st_size being set to 0 as well.


Next week I'm planning to find a minimal reproducible example and file 
a bug report. I wasn't able to find any references to similar issues, 
but I wanted to make sure that it isn't an already known problem.


Some notes about my current setup:
- Multiple applications are writing to multiple FUSE mounts pointing 
to the same gluster volume. Only one of those applications is writing 
to a given file at a time. I am only appending to files, or to be 
specific calling pwrite() with offset set to file size obtained by 
stat(). (I'm not sure if using O_APPEND would change anything, but 
still it would be a workaround, so shouldn't matter.)
- The issue happens even if no reads are performed on those files, 
e.g. load is no higher than usual.
- Since I'm calling stat() only before writing, and only one node 
writes to a given file, it means that stat() returns invalid size even 
to clients that write to the file.


Steps to reproduce:
0. Have multiple processes constantly appending data to files.
1. Stop one replica.
2. Wait few minutes.
3. Start that replica again - shd starts self healing.
4. stat() on some of the files that are being healed returns st_size 
equal to 0.


Setup:
- glusterfs 3.10.0

- volume type: replicas with arbiters
Type: Distributed-Replicate
Number of Bricks: 12 x (2 + 1) = 36

- FUSE mount configuration:
-o direct-io-mode=on passed explicitly to mount

- volume configuration:
cluster.consistent-metadata: yes
cluster.eager-lock: on
cluster.readdir-optimize: on
cluster.self-heal-readdir-size: 64KB
cluster.self-heal-daemon: on
cluster.read-hash-mode: 2
cluster.use-compound-fops: on
cluster.ensure-durability: on
cluster.granular-entry-heal: enable
cluster.entry-self-heal: off
cluster.data-self-heal: off
cluster.metadata-self-heal: off
performance.quick-read: off
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.stat-prefetch: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
performance.flush-behind: off
performance.write-behind: off
performance.open-behind: off
cluster.background-self-heal-count: 1
network.inode-lru-limit: 1024
network.ping-timeout: 1
performance.io-cache: off
transport.address-family: inet
nfs.disable: on
cluster.locking-scheme: granular

I have already verified that following options do not influence this 
behavior:

- cluster.data-self-heal-algorithm (all possible values)
- cluster.eager-lock
- cluster.consistent-metadata
- performance.stat-prefetch

I would greatly appreciate any hints on what may be wrong with the 
current setup, or what to focus on (or not) in minimal reproducible 
example.



Would you be able to try and see if you can reproduce this in a 
replica-3 volume? Since you are observing it on an arbiter config, the bug 
could be that the stat is being served from the arbiter brick, but we had 
fixed that (http://review.gluster.org/13609) in one of the 3.7 releases 
itself, so maybe this is a new bug. In any case, please do raise the bug 
with the gluster logs attached.
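
If it helps, a plain replica 3 test volume (no arbiter) can be created along
these lines; the host names and brick paths below are just placeholders:

gluster volume create testvol replica 3 \
    host1:/bricks/testvol/b1 host2:/bricks/testvol/b2 host3:/bricks/testvol/b3
gluster volume start testvol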


Regards,
Ravi



thanks and best regards,
Matt
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] glusterd regression failure on centos

2017-03-22 Thread Ravishankar N

On 03/22/2017 11:54 PM, Atin Mukherjee wrote:
Please file a bug in project-infra in gluster asking for a centos 
slave machine to debug the issue further and Nigel should be able to 
assist you on that.


On Wed, 22 Mar 2017 at 13:55, Gaurav Yadav wrote:


Hi All,

The glusterd regression is failing while executing the
"tests/basic/afr/arbiter-mount.t" test case.

Test Summary Report
*14:16:27* ---
*14:16:27* ./tests/basic/afr/arbiter-mount.t (Wstat: 0 Tests: 22 Failed: 4)
*14:16:27*Failed tests:  7, 17, 21-22
*14:16:27* Files=1, Tests=22, 71 wallclock secs ( 0.03 usr  0.01 sys +  
1.54 cusr  2.42 csys =  4.00 CPU)
*14:16:27* Result: FAIL
*14:16:27* End of test ./tests/basic/afr/arbiter-mount.t

Here is the link to the logs generated by Jenkins:
https://build.gluster.org/job/centos6-regression/3732/consoleFull


"EXPECT_WITHIN $NFS_EXPORT_TIMEOUT "1" is_nfs_export_available"  is failing.
Looks like rpcbind was not running on the slave. From the regression 
log, the `cleanup` which is called before the TESTs in the .t are run is 
spewing out some errors:


*4:14:04* [14:14:04] Running tests in file ./tests/basic/afr/arbiter-mount.t
*14:14:47* rm: cannot remove `/mnt/glusterfs/0/xy_zzy': Transport endpoint is 
not connected
*14:14:47* mount.nfs: rpc.statd is not running but is required for remote 
locking.
*14:14:47* mount.nfs: Either use '-o nolock' to keep locks local, or start 
statd.
*14:14:47* mount.nfs: an incorrect mount option was specified
*14:15:10* mount.nfs: rpc.statd is not running but is required for remote 
locking.
*14:15:10* mount.nfs: Either use '-o nolock' to keep locks local, or start 
statd.
*14:15:10* mount.nfs: an incorrect mount option was specified
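
A quick sanity check on the slave along these lines should confirm whether
rpcbind and rpc.statd are up (the service names below are for CentOS 6 and may
differ on other distros):

service rpcbind status || service rpcbind start
service nfslock status || service nfslock start   # nfslock runs rpc.statd
rpcinfo -p localhost | grep -E 'portmapper|status'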


-Ravi


I executed the tests/basic/afr/arbiter-mount.t script explicitly, but the 
test case passed for me.

prove tests/basic/afr/arbiter-mount.t
tests/basic/afr/arbiter-mount.t .. 9/22 rm: cannot remove 
'/mnt/glusterfs/0/xy_zzy': Transport endpoint is not connected
tests/basic/afr/arbiter-mount.t .. 10/22 mount.nfs: Remote I/O error
tests/basic/afr/arbiter-mount.t .. ok
All tests successful.
Files=1, Tests=22, 55 wallclock secs ( 0.03 usr  0.00 sys +  0.69 cusr  
0.55 csys =  1.27 CPU)
Result: PASS


Thanks
Gaurav

___
Gluster-devel mailing list
Gluster-devel@gluster.org 
http://lists.gluster.org/mailman/listinfo/gluster-devel

--
- Atin (atinm)


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [Gluster-users] Announcing release 3.11 : Scope, schedule and feature tracking

2017-03-03 Thread Ravishankar N

On 03/03/2017 07:23 PM, Shyam wrote:

On 03/03/2017 06:44 AM, Prashanth Pai wrote:



On 02/28/2017 08:47 PM, Shyam wrote:

We should be transitioning to using github for feature reporting and
tracking, more fully from this release. So once again, if there exists
any confusion on that front, reach out to the lists for clarification.


I see that there was a discussion on this on the maintainers ML [1]. If
it is not too late or if I may cast vote as a non-maintainer, I prefer
to have bugzilla  for tracking bugs and the users ML for queries. I see
many 'issues' on the github page which are mostly candidates for
gluster-users ML.


+1 to that


Ok, there is some confusion here I think ;)

So,
- github issues is for features *only*
- Issues are *not* for bugs or a substitute for ML posts/discussions
- The proposal in maintainers for the same (i.e [1]) did not go 
through, and hence was never proposed to the devel and users groups


So to make this clear in github, this [2] commit was put up for review 
(and is now merged and live, check [3]), which will clarify this for users 
coming into github to file issues. If users still file issues 
despite this information, we can politely redirect them to the ML or BZ 
and close the issue.


Further, thanks to the folks who have been responding to queries 
over github issues (which have increased in frequency in the recent past).


HTH 


Yes, this makes sense! Thanks Shyam.



and thanks for the shout out,
Shyam

[1] 
http://lists.gluster.org/pipermail/maintainers/2017-February/002195.html 



[2] Review on github issue template: 
https://review.gluster.org/#/c/16795/2/.github/ISSUE_TEMPLATE


[3] New issue in glusterfs github view: 
https://github.com/gluster/glusterfs/issues/new



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [Gluster-users] Announcing release 3.11 : Scope, schedule and feature tracking

2017-03-03 Thread Ravishankar N

On 02/28/2017 08:47 PM, Shyam wrote:
We should be transitioning to using github for feature reporting and 
tracking, more fully from this release. So once again, if there exists 
any confusion on that front, reach out to the lists for clarification.


I see that there was a discussion on this on the maintainers ML [1]. If 
it is not too late, or if I may cast a vote as a non-maintainer, I prefer 
to have bugzilla for tracking bugs and the users ML for queries. I see 
many 'issues' on the github page which are mostly candidates for the 
gluster-users ML.


Regards,

Ravi


[1] http://lists.gluster.org/pipermail/maintainers/2017-February/002195.html

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Logging in a multi-brick daemon

2017-02-15 Thread Ravishankar N

On 02/16/2017 04:09 AM, Jeff Darcy wrote:

One of the issues that has come up with multiplexing is that all of the bricks 
in a process end up sharing a single log file.  The reaction from both of the 
people who have mentioned this is that we should find a way to give each brick 
its own log even when they're in the same process, and make sure gf_log etc. 
are able to direct messages to the correct one.  I can think of ways to do 
this, but it doesn't seem optimal to me.  It will certainly use up a lot of 
file descriptors.  I think it will use more memory.  And then there's the issue 
of whether this would really be better for debugging.  Often it's necessary to 
look at multiple brick logs while trying to diagnose this problem, so it's 
actually kind of handy to have them all in one file.  Which would you rather do?

(a) Weave together entries in multiple logs, either via a script or in your 
head?

(b) Split or filter entries in a single log, according to which brick they're 
from?

To me, (b) seems like a much more tractable problem.  I'd say that what we need 
is not multiple logs, but *marking of entries* so that everything pertaining to 
one brick can easily be found.  One way to do this would be to modify volgen so 
that a brick ID (not name because that's a path and hence too long) is 
appended/prepended to the name of every translator in the brick.  Grep for that 
brick ID, and voila!  You now have all log messages for that brick and no 
other.  A variant of this would be to leave the names alone and modify gf_log 
so that it adds the brick ID automagically (based on a thread-local variable 
similar to THIS).  Same effect, other than making translator names longer, so 
I'd kind of prefer this approach.  Before I start writing the code, does 
anybody else have any opinions, preferences, or alternatives I haven't 
mentioned yet?

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel
My vote is for having separate log files per brick. Even with the separate log
files that we have today, I find it difficult to mentally ignore irrelevant
messages in a single log file as I am sifting through it to look for errors that are
related to the problem at hand. Having entries from multiple bricks and then
grepping would only make things harder. I cannot think of a case where having
entries from all bricks in one file would be particularly beneficial for
debugging, since what happens in one brick is independent of the other bricks
(at least until we move client xlators to the server side and run them in
the brick process).

As for file descriptor count/memory usage, I think we should be okay
as it is not any worse than that in the non-multiplexed approach we have
today.

On a side note, I think the problem is not having too many log files but
having them on multiple nodes. Having a log-aggregation solution where all
messages are logged to a single machine (but still in separate files) would
make it easier to monitor/debug issues.
-Ravi
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] gluster source code help

2017-02-03 Thread Ravishankar N

On 02/03/2017 09:14 AM, jayakrishnan mm wrote:



On Thu, Feb 2, 2017 at 8:17 PM, Ravishankar N <ravishan...@redhat.com 
<mailto:ravishan...@redhat.com>> wrote:


On 02/02/2017 10:46 AM, jayakrishnan mm wrote:

Hi

How  do I determine, which part of the  code is run on the
client, and which part of the code is run on the server nodes by
merely looking at the the glusterfs  source code ?
I knew  there are client side  and server side translators which
will run on respective platforms. I am looking at part of self
heal daemon source  (ec/afr) which will run on the server nodes 
and  the part which run on the clients.


The self-heal daemon that runs on the server is also a client
process in the sense that it has client side xlators like ec or
afr and  protocol/client (see the shd volfile
'glustershd-server.vol') loaded and talks to the bricks like a
normal client does.
The difference is that only self-heal related 'logic' get executed
on the shd while both self-heal and I/O related logic get executed
from the mount. The self-heal logic resides mostly in
afr-self-heal*.[ch] while I/O related logic is there in the other
files.
HTH,
Ravi



Hi JK,

Dear  Ravi,
Thanks for your kind explanation.
So, each server node will have a separate self-heal daemon (shd) up and 
running, every time a child_up event occurs a heal is triggered, and this 
will be an index heal.
And each daemon will spawn "priv->child_count" number of threads on 
each server node, correct?
shd is always running, and yes, that many threads are spawned for index 
heal when the process starts.

1. When exactly does a full healer spawn threads?
Whenever you run `gluster volume heal volname full`. See afr_xl_op(). 
There are some bugs in launching full heal though.
2. When can GF_EVENT_TRANSLATOR_OP & GF_SHD_OP_HEAL_INDEX happen 
together (so that the index healer spawns threads)?
Similarly, when can GF_EVENT_TRANSLATOR_OP & GF_SHD_OP_HEAL_FULL 
happen? During replace-brick?
Is it possible that the index healer and full healer spawn threads 
together (so that the total number of threads is 2*priv->child_count)?


Index heal threads wake up and run once every 10 minutes, or whatever the 
cluster.heal-timeout is. They are also run when a brick comes up, like 
you said, via afr_notify(). It is also run when you manually launch 
`gluster volume heal volname`. Again, see afr_xl_op().
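
For reference, the commands that exercise these code paths look roughly like
this (volname is a placeholder):

gluster volume heal volname          # index heal (GF_SHD_OP_HEAL_INDEX)
gluster volume heal volname full     # full heal (GF_SHD_OP_HEAL_FULL)
gluster volume set volname cluster.heal-timeout 600   # periodic index-heal interval, in seconds
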
3. In /var/lib/glusterd/glustershd/glustershd-server.vol, why 
is debug/io-stats chosen as the top xlator?


io-stats is generally loaded as the topmost xlator in all graphs at the 
appropriate place for gathering profile info, but for shd, I'm not sure 
if it has any specific use other than acting as a placeholder parent 
to all the replica xlators.


Regards,
Ravi

Thanks
Best regards



Best regards
JK


___
Gluster-devel mailing list
Gluster-devel@gluster.org <mailto:Gluster-devel@gluster.org>
http://lists.gluster.org/mailman/listinfo/gluster-devel
<http://lists.gluster.org/mailman/listinfo/gluster-devel>


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] gluster source code help

2017-02-02 Thread Ravishankar N

On 02/02/2017 10:46 AM, jayakrishnan mm wrote:

Hi

How  do I determine, which part of the  code is run on the client, and 
which part of the code is run on the server nodes by merely looking at 
the the glusterfs  source code ?
I knew  there are client side  and server side translators which will 
run on respective platforms. I am looking at part of self heal daemon 
source  (ec/afr) which will run on the server nodes  and  the part 
which run on the clients.


The self-heal daemon that runs on the server is also a client process in 
the sense that it has client side xlators like ec or afr and  
protocol/client (see the shd volfile 'glustershd-server.vol') loaded and 
talks to the bricks like a normal client does.
The difference is that only self-heal related 'logic' get executed on 
the shd while both self-heal and I/O related logic get executed from the 
mount. The self-heal logic resides mostly in afr-self-heal*.[ch] while 
I/O related logic is there in the other files.
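
To poke around yourself, something like the following should work (the volfile
path below is from my setup and may vary with version/distro):

# the shd volfile generated by glusterd on a server node
cat /var/lib/glusterd/glustershd/glustershd-server.vol
# the self-heal logic in the source tree
ls xlators/cluster/afr/src/afr-self-heal*.[ch]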

HTH,
Ravi


Best regards
JK


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Patches being posted by Facebook and plans thereof

2016-12-22 Thread Ravishankar N

On 12/22/2016 11:31 AM, Shyam wrote:
1) Facebook will port all their patches to the special branch 
release-3.8-fb, where they have exclusive merge rights. 


i) I see that the Bugzilla IDs they are using for these patches are the 
same as the BZ ID of the corresponding 3.8 branch patches. These BZs are 
in MODIFIED/ CLOSED CURRENTRELEASE. Is that alright?


ii) Do we need to review these backports? (Asking because they have 
added the folks who sent the original patch as reviewers).


-Ravi


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [Gluster-users] Feature Request: Lock Volume Settings

2016-11-14 Thread Ravishankar N

On 11/14/2016 05:57 PM, Atin Mukherjee wrote:
This would be a straightforward thing to implement in glusterd; 
anyone up for it? If not, we will take this into consideration for 
GlusterD 2.0.


On Mon, Nov 14, 2016 at 10:28 AM, Mohammed Rafi K C 
wrote:


I think it is worth to implement a lock option.

+1


Rafi KC


On 11/14/2016 06:12 AM, David Gossage wrote:

On Sun, Nov 13, 2016 at 6:35 PM, Lindsay Mathieson
wrote:

As discussed recently, it is way too easy to make destructive
changes to a volume, e.g. change the shard size. This can corrupt
the data with no warnings, and it's all too easy to make a typo
or access the wrong volume
when doing 3am maintenance ...

So I'd like to suggest something like the following:

  gluster volume lock 




I don't think this is a good idea. It would make more sense to give out 
verbose warnings in the individual commands themselves. A volume lock 
doesn't prevent users from unlocking and still inadvertently running 
those commands without knowing the implications. The remove brick set of 
commands provides verbose messages nicely:


$gluster v remove-brick testvol 127.0.0.2:/home/ravi/bricks/brick{4..6} 
commit

Removing brick(s) can result in data loss. Do you want to Continue? (y/n) y
volume remove-brick commit: success
Check the removed bricks to ensure all files are migrated.
If files with data are found on the brick path, copy them via a gluster 
mount point before re-purposing the removed brick


My 2 cents,
Ravi




Setting this would fail all:
- setting changes
- add bricks
- remove bricks
- delete volume

  gluster volume unlock 

would allow all changes to be made.

Just a thought, open to alternate suggestions.

Thanks

+
sounds handy

--
Lindsay
___
Gluster-users mailing list
gluster-us...@gluster.org 
http://www.gluster.org/mailman/listinfo/gluster-users





___
Gluster-users mailing list
gluster-us...@gluster.org 
http://www.gluster.org/mailman/listinfo/gluster-users


___ Gluster-devel
mailing list Gluster-devel@gluster.org

http://www.gluster.org/mailman/listinfo/gluster-devel
 


--
~ Atin (atinm)

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [Gluster-users] Hole punch support

2016-11-11 Thread Ravishankar N

+ gluster-devel.

Can you raise an RFE bug for this and assign it to me?
The thing is, FALLOC_FL_PUNCH_HOLE must be used in tandem with 
FALLOC_FL_KEEP_SIZE, and the latter is currently broken in gluster 
because there are some conversions done in iatt_from_stat() for 
quota to work. I'm not sure if these are needed anymore, or can be 
circumvented, but it is about time we looked into it.
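
Once the support is in place, exercising it from the command line would look
roughly like the sketch below (util-linux fallocate; the file name is just a
placeholder):

# PUNCH_HOLE is always combined with KEEP_SIZE at the syscall level,
# which fallocate(1) takes care of when punching:
fallocate --punch-hole --offset 0 --length 1048576 /mnt/glusterfs/testfile
stat --format='size=%s blocks=%b' /mnt/glusterfs/testfile   # size stays the same, block count drops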


Thanks,
Ravi

On 11/11/2016 07:55 PM, Ankireddypalle Reddy wrote:


Hi,

   Any idea when hole punch support will be available in 
glusterfs?


Thanks and Regards,

Ram

***Legal Disclaimer***
"This communication may contain confidential and privileged material 
for the
sole use of the intended recipient. Any unauthorized review, use or 
distribution
by others is strictly prohibited. If you have received the message by 
mistake,
please advise the sender by reply email and delete the message. Thank 
you."

**


___
Gluster-users mailing list
gluster-us...@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] Preventing lookups from serving metadata.

2016-11-08 Thread Ravishankar N
So there is a class of bugs* exposed in replicate volumes where, if 
the only good copy of the file is down, we still end up serving stale 
data to the application because of caching in various layers outside 
gluster. In fuse, this can be mitigated by setting attribute-timeout and 
entry-timeout to zero so that the actual FOP (stat, read, write, etc.) 
reaches AFR, which will then fail it with EIO. But this does not work 
for NFS-based clients.
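
For reference, the fuse-side mitigation above is just a mount-option tweak,
roughly the following (server and volname are placeholders):

mount -t glusterfs -o attribute-timeout=0,entry-timeout=0 server:/volname /mnt/glusterfs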

1) Is there a way by which we can make the 'lookup' FOP in gluster do 
just that- i.e. tell whether the entry exists or not, and *not serve* 
any other (stat) information except the gfid?


2) If that is not possible, is it okay for AFR to fail lookups with EIO 
when client-quorum is met and there is no source available? The 
downside is that if we fail lookups with EIO, even unlink cannot be served.
(Think of a user who doesn't want to resolve a file in split-brain, but 
would rather delete it.)


Thanks,
Ravi

*bugs:
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1356974
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1224709




___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [Gluster-Maintainers] 'Reviewd-by' tag for commits

2016-10-02 Thread Ravishankar N

On 10/03/2016 06:58 AM, Pranith Kumar Karampuri wrote:



On Mon, Oct 3, 2016 at 6:41 AM, Pranith Kumar Karampuri 
<pkara...@redhat.com <mailto:pkara...@redhat.com>> wrote:




On Fri, Sep 30, 2016 at 8:50 PM, Ravishankar N
<ravishan...@redhat.com <mailto:ravishan...@redhat.com>> wrote:

On 09/30/2016 06:38 PM, Niels de Vos wrote:

On Fri, Sep 30, 2016 at 07:11:51AM +0530, Pranith Kumar Karampuri wrote:

hi,
  At the moment 'Reviewed-by' tag comes only if a +1 is given on the
final version of the patch. But for most of the patches, different 
people
would spend time on different versions making the patch better, they may
not get time to do the review for every version of the patch. Is it
possible to change the gerrit script to add 'Reviewed-by' for all the
people who participated in the review?

+1 to this. For the argument that this *might* encourage
me-too +1s, it only exposes
such persons in bad light.

Or removing 'Reviewed-by' tag completely would also help to make sure it
doesn't give skewed counts.

I'm not going to lie, for me, that takes away the incentive of
doing any reviews at all.


Could you elaborate why? May be you should also talk about your
primary motivation for doing reviews.


I guess it is probably because the effort needs to be recognized? I 
think there is an option to recognize it, so it is probably not a good 
idea to remove the tag.


Yes, numbers provide good motivation for me:
Motivation for looking at patches and finding bugs in known components, 
even though I am not their maintainer.
Motivation to learn new components, because a bug and its fix are usually 
the occasion for me to look at code of unknown components.

Motivation to level up when statistics indicate I'm behind my peers.

I think even you said some time back in an ML thread that what can be 
measured can be improved.


-Ravi




I would not feel comfortable automatically adding Reviewed-by tags for
people that did not review the last version. They may not agree with the
last version, so adding their "approved stamp" on it may not be correct.
See the description of Reviewed-by in the Linux kernel sources [0].

While the Linux kernel model is the poster child for projects
to draw standards
from, IMO, their email based review system is certainly not
one to emulate. It
does not provide a clean way to view patch-set diffs, does not
present a single
URL based history that tracks all review comments, relies on
the sender to
provide information on what changed between versions, allows a
variety of
'Komedians' [1] to add random tags which may or may not be
picked up
by the maintainer who takes patches in etc.

Maybe we can add an additional tag that mentions all the people that
did do reviews of older versions of the patch. Not sure what the tag
would be, maybe just CC?

It depends on what tags would be processed to obtain
statistics on review contributions.
I agree that not all reviewers might be okay with the latest
revision but that
% might be miniscule (zero, really) compared to the normal
case where the reviewer spent
considerable time and effort to provide feedback (and an
eventual +1) on previous
revisions. If converting all +1s into 'Reviewed-by's is not
feasible in gerrit
or is not considered acceptable, then the maintainer could
wait for a reasonable
time for reviewers to give +1 for the final revision before
he/she goes ahead
with a +2 and merges it. While we cannot wait indefinitely for
all acks, a comment
like 'LGTM, will wait for a day for other acks before I go
ahead and merge' would be
appreciated.

Enough of bike-shedding from my end I suppose.:-)
Ravi

[1] https://lwn.net/Articles/503829/
<https://lwn.net/Articles/503829/>


Niels


0.http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/SubmittingPatches#n552

<http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/SubmittingPatches#n552>

___
Gluster-devel mailing list
Gluster-devel@gluster.org <mailto:Gluster-devel@gluster.org>
http://www.gluster.org/mailman/listinfo/gluster-devel
<http://www.gluster.org/mailman/listinfo/gluster-devel>


-- 
Pranith


--
Pranith


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [Gluster-Maintainers] 'Reviewd-by' tag for commits

2016-09-30 Thread Ravishankar N

On 09/30/2016 06:38 PM, Niels de Vos wrote:

On Fri, Sep 30, 2016 at 07:11:51AM +0530, Pranith Kumar Karampuri wrote:

hi,
  At the moment 'Reviewed-by' tag comes only if a +1 is given on the
final version of the patch. But for most of the patches, different people
would spend time on different versions making the patch better, they may
not get time to do the review for every version of the patch. Is it
possible to change the gerrit script to add 'Reviewed-by' for all the
people who participated in the review?
+1 to this. For the argument that this *might* encourage me-too +1s, it 
only exposes such persons in a bad light.

Or removing 'Reviewed-by' tag completely would also help to make sure it
doesn't give skewed counts.
I'm not going to lie, for me, that takes away the incentive of doing any 
reviews at all.

I would not feel comfortable automatically adding Reviewed-by tags for
people that did not review the last version. They may not agree with the
last version, so adding their "approved stamp" on it may not be correct.
See the description of Reviewed-by in the Linux kernel sources [0].
While the Linux kernel model is the poster child for projects to draw
standards from, IMO, their email-based review system is certainly not one to
emulate. It does not provide a clean way to view patch-set diffs, does not
present a single URL-based history that tracks all review comments, relies on
the sender to provide information on what changed between versions, allows a
variety of 'Komedians' [1] to add random tags which may or may not be picked up
by the maintainer who takes patches in, etc.

Maybe we can add an additional tag that mentions all the people that
did do reviews of older versions of the patch. Not sure what the tag
would be, maybe just CC?
It depends on what tags would be processed to obtain statistics on 
review contributions.
I agree that not all reviewers might be okay with the latest revision 
but that
% might be miniscule (zero, really) compared to the normal case where 
the reviewer spent
considerable time and effort to provide feedback (and an eventual +1) on 
previous
revisions. If converting all +1s into 'Reviewed-by's is not feasible in 
gerrit
or is not considered acceptable, then the maintainer could wait for a 
reasonable
time for reviewers to give +1 for the final revision before he/she goes 
ahead
with a +2 and merges it. While we cannot wait indefinitely for all acks, 
a comment
like 'LGTM, will wait for a day for other acks before I go ahead and 
merge' would be

appreciated.

Enough of bike-shedding from my end I suppose.:-)
Ravi

[1] https://lwn.net/Articles/503829/



Niels

0. 
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/SubmittingPatches#n552


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [Gluster-users] GlusterFs upstream bugzilla components Fine graining

2016-09-28 Thread Ravishankar N

On 09/28/2016 11:24 AM, Muthu Vigneshwaran wrote:

Hi,

This is an update to the previous mail about fine-graining the
GlusterFS upstream bugzilla components.

Finally, we have come up with a new structure that should make bugs easier
to access for both the reporter and the assignee.

In the new structure, we have decided to remove the components
listed below -

- BDB
- HDFS
- booster
- coreutils
- gluster-hdoop
- gluster-hadoop-install
- libglusterfsclient
- map
- path-converter
- protect
- qemu-block
- stripe
- unify

as we find that the above-mentioned components are either
deprecated or use GitHub for bug/issue filing. We also plan to add
the following components as main components:

- common-ha
- documentation
- gdeploy
- gluster-nagios
- project-infrastructure
- puppet-gluster

The final structure would look like as below -



Structure



Product GlusterFS (Versions: 3.6, 3.7, 3.8, 3.9, mainline )

|

+- Component GlusterFS

|  |

|  +Subcomponent access-controll

|  +Subcomponent afr(automatic file replication)

|  +Subcomponent arbiter

|  +Subcomponent barrier

|  +Subcomponent blockdevice

|  +Subcomponent bitrot

|  +Subcomponent build

|  +Subcomponent changelog

|  +Subcomponent changetimerecorder

|  +Subcomponent cli

|  +Subcomponent core

|  +Subcomponent dht2(distributed hashing table)

|  +Subcomponent disperse

|  +Subcomponent distribute

|  +Subcomponent encryption-xlator

|  +Subcomponent error-gen

|  +Subcomponent eventsapi

|  +Subcomponent filter

|  +Subcomponent fuse

|  +Subcomponent geo-replication

|  +Subcomponent gfid-access

|  +Subcomponent glupy

|  +Subcomponent gluster-smb

|  +Subcomponent glusterd

|  +Subcomponent glusterd2

|  +Subcomponent glusterfind

|  +Subcomponent index

|  +Subcomponent io-cache

|  +Subcomponent io-stats

|  +Subcomponent io-threads

|  +Subcomponent jbr

|  +Subcomponent libgfapi

|  +Subcomponent locks

|  +Subcomponent logging

|  +Subcomponent marker

|  +Subcomponent md-cache

|  +Subcomponent nfs

|  +Subcomponent open-behind

|  +Subcomponent packaging

|  +Subcomponent porting

|  +Subcomponent posix

|  +Subcomponent posix-acl

|  +Subcomponent protocol

|  +Subcomponent quick-read

|  +Subcomponent quiesce

|  +Subcomponent quota

|  +Subcomponent rdma

|  +Subcomponent read-head

|  +Subcomponent replicate
Currently this is the component being used for AFR, so you could remove 
AFR from the list. Or retain AFR and remove this one, since we also have 
jbr as a form of replication. I'd prefer the former since all current 
AFR bugs are filed under replicate.




|  +Subcomponent richacl

|  +Subcomponent rpc

|  +Subcomponent scripts

|  +Subcomponent selfheal
Is this new component being introduced for a specific reason? selfheal 
is just a process used by various components like afr and ec and IMO 
doesn't need to be an explicit component.


Regards,
Ravi


|  +Subcomponent sharding

|  +Subcomponent snapshot

|  +Subcomponent stat-prefetch

|  +Subcomponent symlink-cache

|  +Subcomponent tests

|  +Subcomponent tiering

|  +Subcomponent trace

|  +Subcomponent transport

|  +Subcomponent trash-xlator

|  +Subcomponent unclassified

|  +Subcomponent upcall

|  +Subcomponent write-behind

|

+- Component common-ha

|  |

|  +Subcomponent ganesha

|

+- documentation

|

+- Component gdeploy

|  |

|  +Subcomponent sambha

|  +Subcomponent hyperconvergence

|  +Subcomponent RHSC 2.0

|

+- Component gluster-nagios

|

+- Component project-infrastructure (Version: staging, production)

|  |

|  +Subcomponent website

|  +Subcomponent jenkins

|

+- Component puppet-gluster

Here the versions are the same for all components, as versions do not
vary per component; they vary per product.

So we would like to have your comments on the new structure before 1st
Oct, i.e. three days from now: is there anything that needs to be added,
removed or moved? :) We are also planning to ask the Bugzilla
admins to update the structure early next week.

Thanks and regards,

Muthu Vigneshwaran & Niels de Vos
___
Gluster-users mailing list
gluster-us...@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] logs/cores for smoke failures

2016-09-26 Thread Ravishankar N

On 09/27/2016 09:36 AM, Pranith Kumar Karampuri wrote:

hi Nigel,
  Is there already a bug to capture these in the runs when 
failures happen? I am not able to understand why this failure 
happened: https://build.gluster.org/job/smoke/30843/console, 
logs/cores would have helped. Let me know if I should raise a bug for 
this.

I raised one y'day: https://bugzilla.redhat.com/show_bug.cgi?id=1379228
-Ravi


--
Pranith


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] make install again compiling source

2016-09-19 Thread Ravishankar N

On 09/19/2016 05:07 PM, Avra Sengupta wrote:

Hi,

I ran "make -j" on the latest master, followed by make install. The 
make install, by itself is doing a fresh compile every time (and 
totally ignoring the make i did before it). Is there any recent 
change, which would cause this. Thanks.


Regards,
Avra
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel



Reverting http://review.gluster.org/14085 seems to fix things for me.
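
In case anyone wants to try the same locally, roughly (COMMIT below is a
placeholder for whatever commit id review 14085 landed as in your checkout):

git log --oneline | less            # locate the commit from review 14085, say COMMIT
git revert --no-edit COMMIT
make -j && sudo make install
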
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] How to enable FUSE kernel cache about dentry and inode?

2016-09-06 Thread Ravishankar N

On 09/06/2016 12:27 PM, Keiviw wrote:
Could you please tell me your glusterfs version and the mount command 
that you have used? My GlusterFS version is 3.3.0; different versions 
may exhibit different results.


I tried it on the master branch, on Fedora 22 virtual machines (kernel 
version: 4.1.6-200.fc22.x86_64 ). By the way 3.3 is a rather old 
version, you might want to use the latest 3.8.x release.








At 2016-09-06 12:35:19, "Ravishankar N" <ravishan...@redhat.com> wrote:

That is strange. I tried the experiment on a volume with a million
files. The client node's memory usage did grow, as I observed from
the output of free(1) http://paste.fedoraproject.org/422551/ when
I did a `ls`.
-Ravi

On 09/02/2016 07:31 AM, Keiviw wrote:

Exactly, I mounted the volume in a no-brick node(nodeB), and
nodeA was the server. I have set different timeout, but when I
excute "ls /mnt/glusterfs(about 3 million small files, in other
words, about 3 million dentries)", the result was the same,
memory usage in nodeB didn't change at all while nodeA's memory
usage was changed about 4GB!

Sent from NetEase Mail Master <http://u.163.com/signature>
On 09/02/2016 09:45, Ravishankar N
<mailto:ravishan...@redhat.com> wrote:

On 09/02/2016 05:42 AM, Keiviw wrote:

Even if I set the attribute-timeout and entry-timeout to
3600s(1h), in the nodeB, it didn't cache any metadata
because the memory usage didn't change. So I was confused
that why did the client not cache dentries and inodes.


If you only want to test fuse's caching, I would try mounting
the volume on a separate machine (not on the brick node
itself), disable all gluster performance xlators, do a
find.|xargs stat on the mount 2 times in succession and see
what free(1) reports the 1st and 2nd time. You could do this
experiment with various attr/entry timeout values. Make sure
your volume has a lot of small files.
-Ravi



    On 2016-09-01 16:37:00, "Ravishankar N"
<ravishan...@redhat.com> wrote:

On 09/01/2016 01:04 PM, Keiviw wrote:

Hi,
I have found that GlusterFS client(mounted by FUSE)
didn't cache metadata like dentries and inodes. I have
installed GlusterFS 3.6.0 in nodeA and nodeB, and the
brick1 and brick2 was in nodeA, then in nodeB, I
mounted the volume to /mnt/glusterfs by FUSE. From my
test, I excuted 'ls /mnt/glusterfs' in nodeB, and found
that the memory didn't use at all. Here are my questions:
1. In fuse kernel, the author set some attributes
to control the time-out about dentry and inode, in
other words, the fuse kernel supports metadata cache,
but in my test, dentries and inodes were not cached. WHY?
2. Were there some options in GlusterFS mounted to
local to enable the metadata cache in fuse kernel?



You can tweak the attribute-timeout and entry-timeout
seconds while mounting the volume. Default is 1 second
for both.  `man mount.glusterfs` lists various mount
options.
-Ravi




___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel















___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] How to enable FUSE kernel cache about dentry and inode?

2016-09-05 Thread Ravishankar N
That is strange. I tried the experiment on a volume with a million 
files. The client node's memory usage did grow, as I observed from the 
output of free(1) http://paste.fedoraproject.org/422551/ when I did a `ls`.

-Ravi

On 09/02/2016 07:31 AM, Keiviw wrote:
Exactly, I mounted the volume on a no-brick node (nodeB), and nodeA was 
the server. I have set different timeouts, but when I execute "ls 
/mnt/glusterfs" (about 3 million small files, in other words, about 3 
million dentries), the result is the same: memory usage on nodeB 
didn't change at all, while nodeA's memory usage grew by about 4GB!


Sent from NetEase Mail Master <http://u.163.com/signature>
On 09/02/2016 09:45, Ravishankar N <mailto:ravishan...@redhat.com> wrote:

On 09/02/2016 05:42 AM, Keiviw wrote:

Even if I set the attribute-timeout and entry-timeout to
3600s(1h), in the nodeB, it didn't cache any metadata because the
memory usage didn't change. So I was confused that why did the
client not cache dentries and inodes.


If you only want to test fuse's caching, I would try mounting the
volume on a separate machine (not on the brick node itself),
disable all gluster performance xlators, do a find.|xargs stat on
the mount 2 times in succession and see what free(1) reports the
1st and 2nd time. You could do this experiment with various
attr/entry timeout values. Make sure your volume has a lot of
small files.
-Ravi



On 2016-09-01 16:37:00, "Ravishankar N" <ravishan...@redhat.com>
wrote:

On 09/01/2016 01:04 PM, Keiviw wrote:

Hi,
I have found that GlusterFS client(mounted by FUSE)
didn't cache metadata like dentries and inodes. I have
installed GlusterFS 3.6.0 in nodeA and nodeB, and the brick1
and brick2 was in nodeA, then in nodeB, I mounted the volume
to /mnt/glusterfs by FUSE. From my test, I excuted 'ls
/mnt/glusterfs' in nodeB, and found that the memory didn't
use at all. Here are my questions:
1. In fuse kernel, the author set some attributes to
control the time-out about dentry and inode, in other words,
the fuse kernel supports metadata cache, but in my test,
dentries and inodes were not cached. WHY?
2. Were there some options in GlusterFS mounted to local
to enable the metadata cache in fuse kernel?



You can tweak the attribute-timeout and entry-timeout seconds
while mounting the volume. Default is 1 second for both. 
`man mount.glusterfs` lists various mount options.

-Ravi




___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel











___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] How to enable FUSE kernel cache about dentry and inode?

2016-09-01 Thread Ravishankar N

On 09/02/2016 05:42 AM, Keiviw wrote:
Even if I set the attribute-timeout and entry-timeout to 3600s (1h) on 
nodeB, it didn't cache any metadata, because the memory usage 
didn't change. So I was confused as to why the client did not cache 
dentries and inodes.


If you only want to test fuse's caching, I would try mounting the volume 
on a separate machine (not on the brick node itself), disable all the 
gluster performance xlators, do a `find . | xargs stat` on the mount 2 times 
in succession and see what free(1) reports the 1st and 2nd time. You 
could do this experiment with various attr/entry timeout values. Make 
sure your volume has a lot of small files.
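
Roughly something like the following; the server/volume names are placeholders
and the exact set of performance xlators to disable may vary with your version:

# on the client machine
mount -t glusterfs -o attribute-timeout=3600,entry-timeout=3600 server:/volname /mnt/glusterfs
# turn off the gluster-side caches so that only the fuse kernel cache is in play
gluster volume set volname performance.quick-read off
gluster volume set volname performance.io-cache off
gluster volume set volname performance.stat-prefetch off
gluster volume set volname performance.read-ahead off
# crawl twice and compare what free(1) reports after each pass
free -m; find /mnt/glusterfs | xargs stat > /dev/null; free -m
find /mnt/glusterfs | xargs stat > /dev/null; free -m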

-Ravi



On 2016-09-01 16:37:00, "Ravishankar N" <ravishan...@redhat.com> wrote:

On 09/01/2016 01:04 PM, Keiviw wrote:

Hi,
I have found that GlusterFS client(mounted by FUSE) didn't
cache metadata like dentries and inodes. I have installed
GlusterFS 3.6.0 in nodeA and nodeB, and the brick1 and brick2 was
in nodeA, then in nodeB, I mounted the volume to /mnt/glusterfs
by FUSE. From my test, I excuted 'ls /mnt/glusterfs' in nodeB,
and found that the memory didn't use at all. Here are my questions:
1. In fuse kernel, the author set some attributes to control
the time-out about dentry and inode, in other words, the fuse
kernel supports metadata cache, but in my test, dentries and
inodes were not cached. WHY?
2. Were there some options in GlusterFS mounted to local to
enable the metadata cache in fuse kernel?



You can tweak the attribute-timeout and entry-timeout seconds
while mounting the volume. Default is 1 second for both.  `man
mount.glusterfs` lists various mount options.
-Ravi




___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel







___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] Fwd: [Gluster-users] bug-upcall-stat.t always fails on master

2016-09-01 Thread Ravishankar N

Sorry sent it to users instead of devel.

I'll show myself out.

 Forwarded Message 
Subject:[Gluster-users] bug-upcall-stat.t always fails on master
Date:   Thu, 1 Sep 2016 22:41:53 +0530
From:   Ravishankar N <ravishan...@redhat.com>
To: gluster-us...@gluster.org List <gluster-us...@gluster.org>



Test Summary Report
---
./tests/bugs/upcall/bug-upcall-stat.t (Wstat: 0 Tests: 16 Failed: 2)
   Failed tests:  15-16

https://build.gluster.org/job/centos6-regression/470/consoleFull
https://build.gluster.org/job/centos6-regression/471/consoleFull
https://build.gluster.org/job/centos6-regression/469/console

Please take a look. It's failing locally also,

___
Gluster-users mailing list
gluster-us...@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


--
After watching my newly-retired dad spend two weeks learning how to make 
a new folder, it became obvious that "intuitive" mostly means "what the 
writer or speaker of intuitive likes". (Bruce Ediger, 
bedi...@teal.csn.org, in comp.os.linux.misc, on X the intuitiveness of a 
Mac interface.)
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] How to enable FUSE kernel cache about dentry and inode?

2016-09-01 Thread Ravishankar N

On 09/01/2016 01:04 PM, Keiviw wrote:

Hi,
I have found that the GlusterFS client (mounted by FUSE) didn't cache 
metadata like dentries and inodes. I have installed GlusterFS 3.6.0 on 
nodeA and nodeB; brick1 and brick2 were on nodeA, and on 
nodeB I mounted the volume to /mnt/glusterfs by FUSE. In my test, I 
executed 'ls /mnt/glusterfs' on nodeB, and found that no memory was 
used at all. Here are my questions:
1. In the fuse kernel, the author has provided attributes to control the 
timeouts for dentries and inodes; in other words, the fuse kernel 
supports metadata caching, but in my test, dentries and inodes were not 
cached. WHY?
2. Are there some options when mounting GlusterFS locally to enable 
the metadata cache in the fuse kernel?



You can tweak the attribute-timeout and entry-timeout seconds while 
mounting the volume. Default is 1 second for both.  `man 
mount.glusterfs` lists various mount options.

-Ravi




___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] 3.9. feature freeze status check

2016-08-29 Thread Ravishankar N

On 08/26/2016 09:39 PM, Pranith Kumar Karampuri wrote:



On Fri, Aug 26, 2016 at 9:38 PM, Pranith Kumar Karampuri 
wrote:


hi,
  Now that we are almost near the feature freeze date (31st of
Aug), want to get a sense if any of the status of the features.


I meant "want to get a sense of the status of the features"


Please respond with:
1) Feature already merged
2) Undergoing review will make it by 31st Aug
3) Undergoing review, but may not make it by 31st Aug
4) Feature won't make it for 3.9.

I added the features that were not planned(i.e. not in the 3.9
roadmap page) but made it to the release and not planned but may
make it to release at the end of this mail.
If you added a feature on master that will be released as part of
3.9.0 but forgot to add it to roadmap page, please let me know I
will add it.

Here are the features planned as per the roadmap:
1) Throttling
Feature owner: Ravishankar



Sorry, this won't make it to 3.9. I'm working on the patch and hope to 
get it ready for the next release.

Thanks,
Ravi



2) Trash improvements
Feature owners: Anoop, Jiffin

3) Kerberos for Gluster protocols:
Feature owners: Niels, Csaba

4) SELinux on gluster volumes:
Feature owners: Niels, Manikandan

5) Native sub-directory mounts:
Feature owners: Kaushal, Pranith

6) RichACL support for GlusterFS:
Feature owners: Rajesh Joseph

7) Sharemodes/Share reservations:
Feature owners: Raghavendra Talur, Poornima G, Soumya Koduri,
Rajesh Joseph, Anoop C S

8) Integrate with external resource management software
Feature owners: Kaleb Keithley, Jose Rivera

9) Python Wrappers for Gluster CLI Commands
Feature owners: Aravinda VK

10) Package and ship libgfapi-python
Feature owners: Prashant Pai

11) Management REST APIs
Feature owners: Aravinda VK

12) Events APIs
Feature owners: Aravinda VK

13) CLI to get state representation of a cluster from the local
glusterd pov
Feature owners: Samikshan Bairagya

14) Posix-locks Reclaim support
Feature owners: Soumya Koduri

15) Deprecate striped volumes
Feature owners: Vijay Bellur, Niels de Vos

16) Improvements in Gluster NFS-Ganesha integration
Feature owners: Jiffin Tony Thottan, Soumya Koduri

*The following need to be added to the roadmap:*

Features that made it to master already but were not planned:
1) Multi-threaded self-heal in EC
Feature owner: Pranith (Did this because Serkan asked for it. He
has a 9PB volume; self-healing takes a long time :-/)

2) Lock revocation (Facebook patch)
Feature owner: Richard Wareing

Features that look like will make it to 3.9.0:
1) Hardware extension support for EC
Feature owner: Xavi

2) Reset brick support for replica volumes:
Feature owner: Anuradha

3) Md-cache perf improvements in smb:
Feature owner: Poornima

-- 
Pranith





--
Pranith



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [Gluster-users] CFP for Gluster Developer Summit

2016-08-23 Thread Ravishankar N

Hello,

Here is a proposal I'd like to make.

Title: Throttling in gluster 
(https://github.com/gluster/glusterfs-specs/blob/master/accepted/throttling.md)

Theme: Performance and scalability.

The talk/discussion will be focused on server-side throttling of FOPs, 
using a throttling translator. The primary consumer of this would be 
self-heal traffic in AFR, but it can be extended to other clients as well.
I'm working on getting it working for AFR for the first cut so that 
multi-threaded self-heal (courtesy Facebook) can be enabled without 
consuming too many system resources and possibly starving clients.
I'm hoping to have some discussions around this to make it more generic 
and see if it can be aligned with the long-term goals for QoS in gluster.


Thanks.
Ravi

On 08/13/2016 01:18 AM, Vijay Bellur wrote:

Hey All,

Gluster Developer Summit 2016 is fast approaching [1] on us. We are 
looking to have talks and discussions related to the following themes 
in the summit:


1. Gluster.Next - focusing on features shaping the future of Gluster

2. Experience - Description of real world experience and feedback from:
   a> Devops and Users deploying Gluster in production
   b> Developers integrating Gluster with other 
ecosystems


3. Use cases  - focusing on key use cases that drive Gluster.today and 
Gluster.Next


4. Stability & Performance - focusing on current improvements to 
reduce our technical debt backlog


5. Process & infrastructure  - focusing on improving current workflow, 
infrastructure to make life easier for all of us!


If you have a talk/discussion proposal that can be part of these 
themes, please send out your proposal(s) by replying to this thread. 
Please clearly mention the theme for which your proposal is relevant 
when you do so. We will be ending the CFP by 12 midnight PDT on August 
31st, 2016.


If you have other topics that do not fit in the themes listed, please 
feel free to propose and we might be able to accommodate some of them 
as lightning talks or something similar.


Please do reach out to me or Amye if you have any questions.

Thanks!
Vijay

[1] https://www.gluster.org/events/summit2016/
___
Gluster-users mailing list
gluster-us...@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users




___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel
