Re: [Gluster-devel] Spurious failures because of nfs and snapshots

2014-05-18 Thread Pranith Kumar Karampuri
The latest build failure also has the same issue. Download the logs from here:
http://build.gluster.org:443/logs/glusterfs-logs-20140518%3a22%3a27%3a31.tgz

Pranith

- Original Message -
> From: "Vijaikumar M" 
> To: "Joseph Fernandes" 
> Cc: "Pranith Kumar Karampuri" , "Gluster Devel" 
> 
> Sent: Monday, 19 May, 2014 11:41:28 AM
> Subject: Re: Spurious failures because of nfs and snapshots
> 
> Hi Joseph,
> 
> In the log mentioned below, it says ping-timeout is set to the default
> value of 30 secs. I think the issue is different.
> Can you please point me to the logs where you were able to re-create
> the problem?
> 
> Thanks,
> Vijay
> 
> 
> 
> On Monday 19 May 2014 09:39 AM, Pranith Kumar Karampuri wrote:
> > hi Vijai, Joseph,
> >  In 2 of the last 3 build failures,
> >  http://build.gluster.org/job/regression/4479/console,
> >  http://build.gluster.org/job/regression/4478/console this
> >  test(tests/bugs/bug-1090042.t) failed. Do you guys think it is better
> >  to revert this test until the fix is available? Please send a patch
> >  to revert the test case if you guys feel so. You can re-submit it
> >  along with the fix to the bug mentioned by Joseph.
> >
> > Pranith.
> >
> > - Original Message -
> >> From: "Joseph Fernandes" 
> >> To: "Pranith Kumar Karampuri" 
> >> Cc: "Gluster Devel" 
> >> Sent: Friday, 16 May, 2014 5:13:57 PM
> >> Subject: Re: Spurious failures because of nfs and snapshots
> >>
> >>
> >> Hi All,
> >>
> >> tests/bugs/bug-1090042.t :
> >>
> >> I was able to reproduce the issue, i.e. when this test is run in a loop:
> >>
> >> for i in {1..135}; do ./bugs/bug-1090042.t; done
> >>
> >> When I checked the logs:
> >> [2014-05-16 10:49:49.003978] I [rpc-clnt.c:973:rpc_clnt_connection_init]
> >> 0-management: setting frame-timeout to 600
> >> [2014-05-16 10:49:49.004035] I [rpc-clnt.c:988:rpc_clnt_connection_init]
> >> 0-management: defaulting ping-timeout to 30secs
> >> [2014-05-16 10:49:49.004303] I [rpc-clnt.c:973:rpc_clnt_connection_init]
> >> 0-management: setting frame-timeout to 600
> >> [2014-05-16 10:49:49.004340] I [rpc-clnt.c:988:rpc_clnt_connection_init]
> >> 0-management: defaulting ping-timeout to 30secs
> >>
> >> The issue is with ping-timeout and is tracked under the bug
> >>
> >> https://bugzilla.redhat.com/show_bug.cgi?id=1096729
> >>
> >>
> >> The workaround is mentioned in
> >> https://bugzilla.redhat.com/show_bug.cgi?id=1096729#c8
> >>
> >>
> >> Regards,
> >> Joe
> >>
> >> - Original Message -
> >> From: "Pranith Kumar Karampuri" 
> >> To: "Gluster Devel" 
> >> Cc: "Joseph Fernandes" 
> >> Sent: Friday, May 16, 2014 6:19:54 AM
> >> Subject: Spurious failures because of nfs and snapshots
> >>
> >> hi,
> >>  The latest build I fired for review.gluster.com/7766
> >>  (http://build.gluster.org/job/regression/4443/console) failed because
> >>  of a spurious failure. The script doesn't wait for the NFS export to
> >>  be available. I fixed that, but interestingly I found quite a few
> >>  scripts with the same problem. Some of the scripts rely on 'sleep 5',
> >>  which could also lead to spurious failures if the export is not
> >>  available in 5 seconds. We found that waiting for 20 seconds is
> >>  better, but 'sleep 20' would unnecessarily delay the build execution.
> >>  So if you guys are going to write any scripts which have to do NFS
> >>  mounts, please do it the following way:
> >>
> >> EXPECT_WITHIN 20 "1" is_nfs_export_available;
> >> TEST mount -t nfs -o vers=3 $H0:/$V0 $N0;
> >>
> >> Please review http://review.gluster.com/7773 :-)
> >>
> >> I saw one more spurious failure in a snapshot related script
> >> tests/bugs/bug-1090042.t on the next build fired by Niels.
> >> Joseph (CCed) is debugging it. He agreed to reply with what he finds and
> >> share it with us so that we won't introduce similar bugs in future.
> >>
> >> I encourage you guys to share what you fix to prevent spurious failures in
> >> future.
> >>
> >> Thanks
> >> Pranith
> >>
> 
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Spurious failures because of nfs and snapshots

2014-05-18 Thread Vijaikumar M

Hi Joseph,

In the log mentioned below, it says ping-timeout is set to the default
value of 30 secs. I think the issue is different.
Can you please point me to the logs where you were able to re-create
the problem?


Thanks,
Vijay



On Monday 19 May 2014 09:39 AM, Pranith Kumar Karampuri wrote:

hi Vijai, Joseph,
 In 2 of the last 3 build failures, 
http://build.gluster.org/job/regression/4479/console, 
http://build.gluster.org/job/regression/4478/console this 
test(tests/bugs/bug-1090042.t) failed. Do you guys think it is better to revert 
this test until the fix is available? Please send a patch to revert the test 
case if you guys feel so. You can re-submit it along with the fix to the bug 
mentioned by Joseph.

Pranith.

- Original Message -

From: "Joseph Fernandes" 
To: "Pranith Kumar Karampuri" 
Cc: "Gluster Devel" 
Sent: Friday, 16 May, 2014 5:13:57 PM
Subject: Re: Spurious failures because of nfs and snapshots


Hi All,

tests/bugs/bug-1090042.t :

I was able to reproduce the issue, i.e. when this test is run in a loop:

for i in {1..135}; do ./bugs/bug-1090042.t; done
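A variant of this loop that stops at the first failing iteration keeps the failing run's logs intact; `TEST_CMD` below is a stand-in for `./bugs/bug-1090042.t`:

```shell
# Sketch: run the test repeatedly and stop on the first failure, so the
# logs of the failing run are not overwritten by later iterations.
# TEST_CMD is a stand-in for ./bugs/bug-1090042.t from the mail above.
TEST_CMD=${TEST_CMD:-true}
for i in $(seq 1 135); do
    if ! $TEST_CMD; then
        echo "first failure on iteration $i"
        break
    fi
done
```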

When I checked the logs:
[2014-05-16 10:49:49.003978] I [rpc-clnt.c:973:rpc_clnt_connection_init]
0-management: setting frame-timeout to 600
[2014-05-16 10:49:49.004035] I [rpc-clnt.c:988:rpc_clnt_connection_init]
0-management: defaulting ping-timeout to 30secs
[2014-05-16 10:49:49.004303] I [rpc-clnt.c:973:rpc_clnt_connection_init]
0-management: setting frame-timeout to 600
[2014-05-16 10:49:49.004340] I [rpc-clnt.c:988:rpc_clnt_connection_init]
0-management: defaulting ping-timeout to 30secs

The issue is with ping-timeout and is tracked under the bug

https://bugzilla.redhat.com/show_bug.cgi?id=1096729


The workaround is mentioned in
https://bugzilla.redhat.com/show_bug.cgi?id=1096729#c8


Regards,
Joe

- Original Message -
From: "Pranith Kumar Karampuri" 
To: "Gluster Devel" 
Cc: "Joseph Fernandes" 
Sent: Friday, May 16, 2014 6:19:54 AM
Subject: Spurious failures because of nfs and snapshots

hi,
 The latest build I fired for review.gluster.com/7766
 (http://build.gluster.org/job/regression/4443/console) failed because of
 a spurious failure. The script doesn't wait for the NFS export to be
 available. I fixed that, but interestingly I found quite a few scripts
 with the same problem. Some of the scripts rely on 'sleep 5', which
 could also lead to spurious failures if the export is not available in 5
 seconds. We found that waiting for 20 seconds is better, but 'sleep 20'
 would unnecessarily delay the build execution. So if you guys are going
 to write any scripts which have to do NFS mounts, please do it the
 following way:

EXPECT_WITHIN 20 "1" is_nfs_export_available;
TEST mount -t nfs -o vers=3 $H0:/$V0 $N0;
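For anyone new to the test framework, the point of EXPECT_WITHIN over a fixed sleep is that it polls the condition and returns as soon as it holds. A minimal sketch of such a helper, with illustrative names (not the actual code from the tests/ framework):

```shell
# Sketch of an EXPECT_WITHIN-style helper: poll a command once per
# second until its output matches the expected value, or give up after
# the timeout. Names here are illustrative, not the real framework code.
expect_within() {
    timeout=$1; expected=$2; cmd=$3
    elapsed=0
    while [ "$elapsed" -le "$timeout" ]; do
        if [ "$($cmd)" = "$expected" ]; then
            echo "OK"
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    echo "TIMEOUT"
    return 1
}

# Demo condition: becomes "1" once a marker file appears.
marker=$(mktemp -u)
is_ready() { [ -e "$marker" ] && echo 1 || echo 0; }
(sleep 1; touch "$marker") &    # the "export becomes available" event
expect_within 10 "1" is_ready   # prints: OK
rm -f "$marker"
```

The real is_nfs_export_available helper would query the NFS server (e.g. via showmount); here a marker file stands in for the export becoming available.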

Please review http://review.gluster.com/7773 :-)

I saw one more spurious failure in a snapshot related script
tests/bugs/bug-1090042.t on the next build fired by Niels.
Joseph (CCed) is debugging it. He agreed to reply with what he finds and share it
with us so that we won't introduce similar bugs in future.

I encourage you guys to share what you fix to prevent spurious failures in
future.

Thanks
Pranith





Re: [Gluster-devel] Spurious failures because of nfs and snapshots

2014-05-18 Thread Pranith Kumar Karampuri


- Original Message -
> From: "Justin Clift" 
> To: "Pranith Kumar Karampuri" 
> Cc: "Gluster Devel" 
> Sent: Monday, 19 May, 2014 10:41:03 AM
> Subject: Re: [Gluster-devel] Spurious failures because of nfs and snapshots
> 
> On 19/05/2014, at 6:00 AM, Pranith Kumar Karampuri wrote:
> 
> > This particular class is eliminated :-). Patch was merged on Friday.
> 
> 
> Excellent.  I've just kicked off 10 instances in Rackspace to each run
> the regression tests on master head.
> 
> Hopefully less than half of them fail this time.  It has been about a 30%
> pass rate recently. :)

I am working on one more patch about timeouts at the moment. Will be sending it 
shortly. That should help us manage waiting for timeouts easily.
With the work Kaushal and Vijay did to provide logs and core files, we should be 
able to reduce the number of spurious regressions, because now we can debug 
them without stopping regression runs :-).

Pranith
> 
> + Justin
> 
> --
> Open Source and Standards @ Red Hat
> 
> twitter.com/realjustinclift
> 
> 


[Gluster-devel] Log level readability in gluster logfile

2014-05-18 Thread Atin Mukherjee
Hi List,

I would appreciate if you can go through the patch [1] and let me know
whether this makes sense or not.

[1] http://review.gluster.org/#/c/7790/

Cheers,
Atin


Re: [Gluster-devel] Spurious failures because of nfs and snapshots

2014-05-18 Thread Justin Clift
On 19/05/2014, at 6:00 AM, Pranith Kumar Karampuri wrote:

> This particular class is eliminated :-). Patch was merged on Friday.


Excellent.  I've just kicked off 10 instances in Rackspace to each run
the regression tests on master head.

Hopefully less than half of them fail this time.  It has been about a 30%
pass rate recently. :)

+ Justin

--
Open Source and Standards @ Red Hat

twitter.com/realjustinclift



Re: [Gluster-devel] Changes to Regression script

2014-05-18 Thread Pranith Kumar Karampuri


- Original Message -
> From: "Vijay Bellur" 
> To: "Pranith Kumar Karampuri" 
> Cc: "gluster-infra" , gluster-devel@gluster.org
> Sent: Monday, 19 May, 2014 10:03:41 AM
> Subject: Re: [Gluster-devel] Changes to Regression script
> 
> On 05/19/2014 09:41 AM, Pranith Kumar Karampuri wrote:
> >
> >
> > - Original Message -
> >> From: "Vijay Bellur" 
> >> To: "Pranith Kumar Karampuri" 
> >> Cc: "gluster-infra" , gluster-devel@gluster.org
> >> Sent: Saturday, 17 May, 2014 2:52:03 PM
> >> Subject: Re: [Gluster-devel] Changes to Regression script
> >>
> >> On 05/17/2014 02:10 PM, Pranith Kumar Karampuri wrote:
> >>>
> >>>
> >>> - Original Message -
>  From: "Vijay Bellur" 
>  To: "gluster-infra" 
>  Cc: gluster-devel@gluster.org
>  Sent: Tuesday, May 13, 2014 4:13:02 PM
>  Subject: [Gluster-devel] Changes to Regression script
> 
>  Hi All,
> 
>  Kaushal and I have made the following changes to regression.sh on
>  build.gluster.org:
> 
>  1. If a regression run results in a core and all tests pass, that
>  particular run will be flagged as a failure. Previously, a run was
>  marked as a failure only if the core caused test failures.
> 
>  2. Cores from a particular test run are now archived and are available
>  at /d/archived_builds/. This also avoids the need for manual
>  intervention to manage cores.
> 
>  3. Logs from failed regression runs are now archived and are available
>  at /d/logs/glusterfs-.tgz
> 
>  Do let us know if you have any comments on these changes.
> >>>
> >>> This is already proving to be useful :-). I was able to debug one of the
> >>> spurious failures for crypt.t. The only problem is I was not able to copy
> >>> out the logs; I had to take Avati's help to get the log files. Will it be
> >>> possible to give access to these files so that anyone can download them?
> >>>
> >>
> >> Good to know!
> >>
> >> You can access the .tgz files from:
> >>
> >> http://build.gluster.org:443/logs/
> >
> > I was able to access these yesterday. But now it gives 404.

It's working now. But how do we convert the archive's timestamp to the logs' 
timestamp? I want to know the time difference.
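If the archive name (e.g. the 2014-05-18 22:27:31 stamp in the URL earlier in the thread) carries the build host's local time while the glusterfs logs are written in UTC, GNU date can do the conversion with a fixed offset. The -0400 offset below is an assumption about the build host's timezone, purely for illustration:

```shell
# Shift a local timestamp with an assumed UTC offset (-0400, purely
# illustrative) into UTC so it can be matched against log timestamps.
date -u -d '2014-05-18 22:27:31 -0400' +'%Y-%m-%d %H:%M:%S'
# prints: 2014-05-19 02:27:31
```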

Pranith.

> >
> 
> Fixed.
> 
> -Vijay
> 
> 


Re: [Gluster-devel] Spurious failures because of nfs and snapshots

2014-05-18 Thread Pranith Kumar Karampuri


- Original Message -
> From: "Justin Clift" 
> To: "Pranith Kumar Karampuri" 
> Cc: "Gluster Devel" 
> Sent: Monday, 19 May, 2014 10:26:04 AM
> Subject: Re: [Gluster-devel] Spurious failures because of nfs and snapshots
> 
> On 16/05/2014, at 1:49 AM, Pranith Kumar Karampuri wrote:
> > hi,
> > The latest build I fired for review.gluster.com/7766
> > (http://build.gluster.org/job/regression/4443/console) failed because
> > of a spurious failure. The script doesn't wait for the NFS export to
> > be available. I fixed that, but interestingly I found quite a few
> > scripts with the same problem. Some of the scripts rely on 'sleep 5',
> > which could also lead to spurious failures if the export is not
> > available in 5 seconds.
> 
> Cool.  Fixing this NFS problem across all of the tests would be really
> welcome.  That specific failed test (bug-1087198.t) is the most common
> one I've seen over the last few weeks, causing about half of all
> failures in master.
> 
> Eliminating this class of regression failure would be really helpful. :)

This particular class is eliminated :-). Patch was merged on Friday.

Pranith
> 
> + Justin
> 
> --
> Open Source and Standards @ Red Hat
> 
> twitter.com/realjustinclift
> 
> 


Re: [Gluster-devel] Spurious failures because of nfs and snapshots

2014-05-18 Thread Justin Clift
On 16/05/2014, at 1:49 AM, Pranith Kumar Karampuri wrote:
> hi,
> The latest build I fired for review.gluster.com/7766 
> (http://build.gluster.org/job/regression/4443/console) failed because of a 
> spurious failure. The script doesn't wait for the NFS export to be available. 
> I fixed that, but interestingly I found quite a few scripts with the same 
> problem. Some of the scripts rely on 'sleep 5', which could also lead to 
> spurious failures if the export is not available in 5 seconds.

Cool.  Fixing this NFS problem across all of the tests would be really
welcome.  That specific failed test (bug-1087198.t) is the most common
one I've seen over the last few weeks, causing about half of all
failures in master.

Eliminating this class of regression failure would be really helpful. :)

+ Justin

--
Open Source and Standards @ Red Hat

twitter.com/realjustinclift



Re: [Gluster-devel] Changes to Regression script

2014-05-18 Thread Vijay Bellur

On 05/19/2014 09:41 AM, Pranith Kumar Karampuri wrote:



- Original Message -

From: "Vijay Bellur" 
To: "Pranith Kumar Karampuri" 
Cc: "gluster-infra" , gluster-devel@gluster.org
Sent: Saturday, 17 May, 2014 2:52:03 PM
Subject: Re: [Gluster-devel] Changes to Regression script

On 05/17/2014 02:10 PM, Pranith Kumar Karampuri wrote:



- Original Message -

From: "Vijay Bellur" 
To: "gluster-infra" 
Cc: gluster-devel@gluster.org
Sent: Tuesday, May 13, 2014 4:13:02 PM
Subject: [Gluster-devel] Changes to Regression script

Hi All,

Kaushal and I have made the following changes to regression.sh on
build.gluster.org:

1. If a regression run results in a core and all tests pass, that
particular run will be flagged as a failure. Previously, a run was
marked as a failure only if the core caused test failures.

2. Cores from a particular test run are now archived and are available
at /d/archived_builds/. This also avoids the need for manual
intervention to manage cores.

3. Logs from failed regression runs are now archived and are available
at /d/logs/glusterfs-.tgz

Do let us know if you have any comments on these changes.
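A rough sketch of the archiving step described in points 2 and 3; the paths and naming scheme below are assumptions for illustration (demonstrated in throwaway temp directories), not the actual regression.sh code:

```shell
# Sketch of a post-run log archiving step: tar the log directory into a
# timestamped .tgz under an archive directory. All paths are illustrative.
log_src=$(mktemp -d)/glusterfs          # stand-in for the real log dir
mkdir -p "$log_src"
echo "sample log line" > "$log_src/cli.log"

archive_dir=$(mktemp -d)                # stand-in for /d/logs
stamp=$(date +%Y%m%d-%H%M%S)
tar -czf "$archive_dir/glusterfs-logs-$stamp.tgz" \
    -C "$(dirname "$log_src")" "$(basename "$log_src")"

tar -tzf "$archive_dir/glusterfs-logs-$stamp.tgz"   # lists the archived paths
```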


This is already proving to be useful :-). I was able to debug one of the
spurious failures for crypt.t. The only problem is I was not able to copy
out the logs; I had to take Avati's help to get the log files. Will it be
possible to give access to these files so that anyone can download them?



Good to know!

You can access the .tgz files from:

http://build.gluster.org:443/logs/


I was able to access these yesterday. But now it gives 404.



Fixed.

-Vijay



Re: [Gluster-devel] Changes to Regression script

2014-05-18 Thread Pranith Kumar Karampuri


- Original Message -
> From: "Vijay Bellur" 
> To: "Pranith Kumar Karampuri" 
> Cc: "gluster-infra" , gluster-devel@gluster.org
> Sent: Saturday, 17 May, 2014 2:52:03 PM
> Subject: Re: [Gluster-devel] Changes to Regression script
> 
> On 05/17/2014 02:10 PM, Pranith Kumar Karampuri wrote:
> >
> >
> > - Original Message -
> >> From: "Vijay Bellur" 
> >> To: "gluster-infra" 
> >> Cc: gluster-devel@gluster.org
> >> Sent: Tuesday, May 13, 2014 4:13:02 PM
> >> Subject: [Gluster-devel] Changes to Regression script
> >>
> >> Hi All,
> >>
> >> Kaushal and I have made the following changes to regression.sh on
> >> build.gluster.org:
> >>
> >> 1. If a regression run results in a core and all tests pass, that
> >> particular run will be flagged as a failure. Previously, a run was
> >> marked as a failure only if the core caused test failures.
> >>
> >> 2. Cores from a particular test run are now archived and are available
> >> at /d/archived_builds/. This also avoids the need for manual
> >> intervention to manage cores.
> >>
> >> 3. Logs from failed regression runs are now archived and are available
> >> at /d/logs/glusterfs-.tgz
> >>
> >> Do let us know if you have any comments on these changes.
> >
> > This is already proving to be useful :-). I was able to debug one of the
> > spurious failures for crypt.t. The only problem is I was not able to copy
> > out the logs; I had to take Avati's help to get the log files. Will it be
> > possible to give access to these files so that anyone can download them?
> >
> 
> Good to know!
> 
> You can access the .tgz files from:
> 
> http://build.gluster.org:443/logs/

I was able to access these yesterday. But now it gives 404.

Pranith
> 
> -Vijay
> 
> 


Re: [Gluster-devel] Spurious failures because of nfs and snapshots

2014-05-18 Thread Pranith Kumar Karampuri
hi Vijai, Joseph,
In 2 of the last 3 build failures, 
http://build.gluster.org/job/regression/4479/console, 
http://build.gluster.org/job/regression/4478/console this 
test(tests/bugs/bug-1090042.t) failed. Do you guys think it is better to revert 
this test until the fix is available? Please send a patch to revert the test 
case if you guys feel so. You can re-submit it along with the fix to the bug 
mentioned by Joseph.

Pranith.

- Original Message -
> From: "Joseph Fernandes" 
> To: "Pranith Kumar Karampuri" 
> Cc: "Gluster Devel" 
> Sent: Friday, 16 May, 2014 5:13:57 PM
> Subject: Re: Spurious failures because of nfs and snapshots
> 
> 
> Hi All,
> 
> tests/bugs/bug-1090042.t :
> 
> I was able to reproduce the issue, i.e. when this test is run in a loop:
> 
> for i in {1..135}; do ./bugs/bug-1090042.t; done
> 
> When I checked the logs:
> [2014-05-16 10:49:49.003978] I [rpc-clnt.c:973:rpc_clnt_connection_init]
> 0-management: setting frame-timeout to 600
> [2014-05-16 10:49:49.004035] I [rpc-clnt.c:988:rpc_clnt_connection_init]
> 0-management: defaulting ping-timeout to 30secs
> [2014-05-16 10:49:49.004303] I [rpc-clnt.c:973:rpc_clnt_connection_init]
> 0-management: setting frame-timeout to 600
> [2014-05-16 10:49:49.004340] I [rpc-clnt.c:988:rpc_clnt_connection_init]
> 0-management: defaulting ping-timeout to 30secs
> 
> The issue is with ping-timeout and is tracked under the bug
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1096729
> 
> 
> The workaround is mentioned in
> https://bugzilla.redhat.com/show_bug.cgi?id=1096729#c8
> 
> 
> Regards,
> Joe
> 
> - Original Message -
> From: "Pranith Kumar Karampuri" 
> To: "Gluster Devel" 
> Cc: "Joseph Fernandes" 
> Sent: Friday, May 16, 2014 6:19:54 AM
> Subject: Spurious failures because of nfs and snapshots
> 
> hi,
> The latest build I fired for review.gluster.com/7766
> (http://build.gluster.org/job/regression/4443/console) failed because of
> a spurious failure. The script doesn't wait for the NFS export to be
> available. I fixed that, but interestingly I found quite a few scripts
> with the same problem. Some of the scripts rely on 'sleep 5', which
> could also lead to spurious failures if the export is not available in 5
> seconds. We found that waiting for 20 seconds is better, but 'sleep 20'
> would unnecessarily delay the build execution. So if you guys are going
> to write any scripts which have to do NFS mounts, please do it the
> following way:
> 
> EXPECT_WITHIN 20 "1" is_nfs_export_available;
> TEST mount -t nfs -o vers=3 $H0:/$V0 $N0;
> 
> Please review http://review.gluster.com/7773 :-)
> 
> I saw one more spurious failure in a snapshot related script
> tests/bugs/bug-1090042.t on the next build fired by Niels.
> Joseph (CCed) is debugging it. He agreed to reply with what he finds and share
> it with us so that we won't introduce similar bugs in future.
> 
> I encourage you guys to share what you fix to prevent spurious failures in
> future.
> 
> Thanks
> Pranith
> 


Re: [Gluster-devel] Regression tests: Should we test non-XFS too?

2014-05-18 Thread Dan Mons
On 15 May 2014 14:35, Ric Wheeler  wrote:
>
> it is up to those developers and users to test their preferred combination.
>

Not sure if this was quoting me or someone else.  BtrFS is in-tree for
most distros these days, and RHEL is putting it in as a "technology
preview" in 7, which likely means it'll be supported in a point
release down the road somewhere.  My question was merely if that's
going to be a bigger emphasis for Gluster.org folks to test into the
future, or if XFS is going to remain the default/recommended for a lot
longer yet.

If the answer is "it depends on our customers' needs", then put me
down as one who needs something better than XFS.  I'll happily put in
the hard yards to test BtrFS with GlusterFS, but at the same time I'm
keen to know if that's a wise use of my time or a complete waste of my
time if I'm deviating too far from what RedHat/Gluster.org is planning
on blessing in the future.

>
> The reason to look at either ZFS or btrfs is not really performance driven
> in most cases.
>

"Performance" means different things to different people.  For me,
part of XFS's production performance is how frequently I need to run
xfs_repair on my 40TB bricks.  BtrFS/ZFS drastically reduce this sort of
thing thanks to various checksumming properties not native to other
current filesystems.

When I average my MB/s over 6 months in a 24x7 business, a weekend-long
outage required to run xfs_repair on my entire cluster has as much
impact (potentially even more) as a file system with slower file IO
performance.

XFS is great when it works.  When it doesn't, there's tears and
tantrums.  Over the course of a production year, that all impacts
"performance" when the resolution of my Munin graphs are that low.

-Dan


Re: [Gluster-devel] regarding special treatment of ENOTSUP for setxattr

2014-05-18 Thread Pranith Kumar Karampuri
Sent the following patch to remove the special treatment of ENOTSUP here: 
http://review.gluster.org/7788

Pranith
- Original Message -
> From: "Kaleb KEITHLEY" 
> To: gluster-devel@gluster.org
> Sent: Tuesday, May 13, 2014 8:01:53 PM
> Subject: Re: [Gluster-devel] regarding special treatment of ENOTSUP for   
> setxattr
> 
> On 05/13/2014 08:00 AM, Nagaprasad Sathyanarayana wrote:
> > On 05/07/2014 03:44 PM, Pranith Kumar Karampuri wrote:
> >>
> >> - Original Message -
> >>> From: "Raghavendra Gowdappa" 
> >>> To: "Pranith Kumar Karampuri" 
> >>> Cc: "Vijay Bellur" , gluster-devel@gluster.org,
> >>> "Anand Avati" 
> >>> Sent: Wednesday, May 7, 2014 3:42:16 PM
> >>> Subject: Re: [Gluster-devel] regarding special treatment of ENOTSUP
> >>> for setxattr
> >>>
> >>> I think with "repetitive log message suppression" patch being merged, we
> >>> don't really need gf_log_occasionally (except if they are logged in
> >>> DEBUG or
> >>> TRACE levels).
> >> That definitely helps. But still, setxattr calls are not supposed to
> >> fail with ENOTSUP on FS where we support gluster. If there are special
> >> keys which fail with ENOTSUPP, we can conditionally log setxattr
> >> failures only when the key is something new?
> 
> I know this is about EOPNOTSUPP (a.k.a. ENOTSUPP) returned by
> setxattr(2) for legitimate attrs.
> 
> But I can't help but wondering if this isn't related to other bugs we've
> had with, e.g., lgetxattr(2) called on invalid xattrs?
> 
> E.g. see https://bugzilla.redhat.com/show_bug.cgi?id=765202. We have a
> hack where xlators communicate with each other by getting (and setting?)
> invalid xattrs; the posix xlator has logic to filter out  invalid
> xattrs, but due to bugs this hasn't always worked perfectly.
> 
> It would be interesting to know which xattrs are getting errors and on
> which fs types.
> 
> FWIW, in a quick perusal of a fairly recent (3.14.3) kernel, in xfs
> there are only six places where EOPNOTSUPP is returned, none of them
> related to xattrs. In ext[34] EOPNOTSUPP can be returned if the
> user_xattr option is not enabled (enabled by default in ext4.) And in
> the higher level vfs xattr code there are many places where EOPNOTSUPP
> _might_ be returned, primarily when the subordinate function calls that
> would clear the default or return a different error are not invoked.
> 
> --
> 
> Kaleb
> 
> 
> 
> 
> 
> 