Re: [Gluster-devel] NetBSD's read-subvol-entry.t spurious failures explained

2015-03-06 Thread Emmanuel Dreyfus
Ravishankar N  wrote:

> But since in the test case we are doing a 'volume start force', this
> code path doesn't seem to be hit, and it looks like we are calling
> local->readfn() from afr_read_txn(). But read_subvol is still 1 (i.e. the
> 2nd brick). Is that the case for you too? i.e. does afr_readdir_wind()
> get called with subvol=1?

When the test fails, afr_readdir_wind() is always called with
subvol=0. When it succeeds, it is called with subvol=1.


-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] NetBSD hanging regression tests

2015-03-06 Thread Emmanuel Dreyfus
Emmanuel Dreyfus  wrote:

> Obviously something went wrong. Perhaps there should be a timeout there,
> and/or a check that write() does not fail?

Submitted here:
http://review.gluster.org/9825

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] NetBSD hanging regression tests

2015-03-06 Thread Emmanuel Dreyfus
Hi 

Recently NetBSD regression tests started hanging quite frequently. Here is
an example:
http://build.gluster.org/job/rackspace-netbsd7-regression-triggered/1679/

The offending test is root-squash-self-heal.t, which starts a never-ending
glfsheal process:
  PID LID WCHAN   STAT LTIME COMMAND
28554   5 parked  Rl 0:00.04 /build/install/sbin/glfsheal patchy 
28554   4 nanoslp Rl 0:01.28 /build/install/sbin/glfsheal patchy 
28554   3 -   Rl 0:00.00 /build/install/sbin/glfsheal patchy 
28554   1 -   Rl   754:21.27 /build/install/sbin/glfsheal patchy 

Thread 1 ate a lot of CPU time. It is looping on failed writes:
 28554  1 glfsheal CALL  __gettimeofday50(0xbf7fe650,0)
 28554  1 glfsheal RET   __gettimeofday50 0
 28554  1 glfsheal CALL  write(9,0xbb7c63fb,6)
 28554  1 glfsheal RET   write -1 errno 35 Resource temporarily
unavailable
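
For illustration, here is a minimal self-contained C program (not glfsheal
itself; everything in it is made up for the demo) that reproduces this
failure mode: once the buffer of a non-blocking pipe that nobody reads is
full, write() keeps failing with EAGAIN, which is errno 35 ("Resource
temporarily unavailable") on NetBSD:

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
        int fd[2];

        if (pipe(fd) == -1)
                return 1;

        /* Make the write end non-blocking; the ktrace above shows
         * glfsheal's pipe behaves this way (EAGAIN instead of blocking). */
        fcntl(fd[1], F_SETFL, fcntl(fd[1], F_GETFL) | O_NONBLOCK);

        /* Nobody reads fd[0], so the pipe buffer eventually fills up
         * and write() starts failing instead of blocking. */
        while (write(fd[1], "dummy", 6) != -1)
                ;

        /* Prints errno 35 (EAGAIN) on NetBSD, 11 on Linux. */
        printf("write -1 errno %d %s\n", errno, strerror(errno));
        return 0;
}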

Running a standalone glfsheal process shows it first writes "dummy" before
it hits the same error. This suggests we are in event_dispatch_destroy():

/* Write to pipe(fd[1]) and then wait for 1 second or until
 * a poller thread that is dying, broadcasts.
 */
while (event_pool->activethreadcount > 0) {
write (fd[1], "dummy", 6);
sleep_till.tv_sec = time (NULL) + 1;
ret = pthread_cond_timedwait (&event_pool->cond,
  &event_pool->mutex,
  &sleep_till);
}

Obviously something went wrong. Perhaps there should be a timeout there,
and/or a check that write() does not fail?

diff --git a/libglusterfs/src/event.c b/libglusterfs/src/event.c
index f19d43a..b956d25 100644
--- a/libglusterfs/src/event.c
+++ b/libglusterfs/src/event.c
@@ -235,10 +235,14 @@ event_dispatch_destroy (struct event_pool *event_pool)
 pthread_mutex_lock (&event_pool->mutex);
 {
 /* Write to pipe(fd[1]) and then wait for 1 second or until
- * a poller thread that is dying, broadcasts.
+ * a poller thread that is dying, broadcasts. Make sure we
+ * do not loop forever by limiting to 10 retries
  */
-while (event_pool->activethreadcount > 0) {
-write (fd[1], "dummy", 6);
+int retry = 0;
+
+while (event_pool->activethreadcount > 0 && retry++ < 10) {
+if (write (fd[1], "dummy", 6) == -1)
+break;
 sleep_till.tv_sec = time (NULL) + 1;
 ret = pthread_cond_timedwait (&event_pool->cond,
   &event_pool->mutex,

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org


-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] NetBSD's read-subvol-entry.t spurious failures explained

2015-03-06 Thread Ravishankar N


On 03/06/2015 10:28 PM, Emmanuel Dreyfus wrote:

On Fri, Mar 06, 2015 at 05:55:34PM +0530, Ravishankar N wrote:

On NetBSD I can see that AFR never gets trusted.afr.patchy-client-0
and always thinks brick0 is fine. AFR randomly picks brick0 or brick1
to list directory content, and when it picks brick0 the test fails.

After bringing brick0 up, and performing "ls abc/def", does afr_do_readdir()
get called for "def"?

Yes, it is.


If it does, then AFR will send a lookup to both bricks via
afr_inode_refresh(),

How is it supposed to happen? I can see I do not get into
afr_inode_refresh_do() after visiting afr_do_readdir().



If only the brick process is killed and brought back: afr_do_readdir()
--> afr_read_txn() --> afr_inode_refresh(), because
"local->event_generation != event_generation".


But since in the test case we are doing a 'volume start force', this
code path doesn't seem to be hit, and it looks like we are calling
local->readfn() from afr_read_txn(). But read_subvol is still 1 (i.e. the
2nd brick). Is that the case for you too? i.e. does afr_readdir_wind()
get called with subvol=1?
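
To make the two code paths concrete, here is a tiny self-contained sketch
(illustrative only, not the real AFR code; the struct, function names and
values are made up) of the decision described above:

#include <stdio.h>

/* Toy model of the read transaction state; real AFR keeps this in
 * afr_local_t and the inode context. */
struct toy_txn {
        int local_event_generation;  /* generation cached for this txn  */
        int inode_event_generation;  /* current generation on the inode */
        int read_subvol;             /* brick index chosen for the read */
};

static void
toy_inode_refresh(struct toy_txn *txn)
{
        /* Real AFR would look the entry up on all bricks and pick a
         * fresh read_subvol from the afr pending xattrs. */
        printf("refreshing inode, re-picking read subvolume\n");
        txn->local_event_generation = txn->inode_event_generation;
}

static void
toy_read_txn(struct toy_txn *txn)
{
        if (txn->local_event_generation != txn->inode_event_generation)
                toy_inode_refresh(txn);   /* kill-brick case */
        /* else: generations match, readfn is wound directly */
        printf("winding readdir to subvol=%d\n", txn->read_subvol);
}

int
main(void)
{
        /* The 'volume start force' case described above: generations
         * match, so no refresh happens and whatever read_subvol was
         * picked earlier is used as-is. */
        struct toy_txn txn = {
                .local_event_generation = 2,
                .inode_event_generation = 2,
                .read_subvol            = 1,
        };

        toy_read_txn(&txn);
        return 0;
}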





___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] NetBSD's read-subvol-entry.t spurious failures explained

2015-03-06 Thread Emmanuel Dreyfus
On Fri, Mar 06, 2015 at 05:55:34PM +0530, Ravishankar N wrote:
> >On NetBSD I can see that AFR never gets trusted.afr.patchy-client-0
> >and always thinks brick0 is fine. AFR randomly picks brick0 or brick1
> >to list directory content, and when it picks brick0 the test fails.
> After bringing brick0 up, and performing "ls abc/def", does afr_do_readdir()
> get called for "def"?

Yes, it is. 

> If it does, then AFR will send a lookup to both bricks via
> afr_inode_refresh(),

How is it supposed to happen? I can see I do not get into 
afr_inode_refresh_do() after visiting afr_do_readdir().

-- 
Emmanuel Dreyfus
m...@netbsd.org
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Spurious failure report for master branch - 2015-03-03

2015-03-06 Thread Justin Clift
On 4 Mar 2015, at 15:25, Shyam  wrote:
> On 03/03/2015 11:27 PM, Justin Clift wrote:
>> 2 x Coredumps
>> *
>> 
>>   * http://mirror.salasaga.org/gluster/master/2015-03-03/bulk5/
>> 
>> IP - 104.130.74.142
>> 
>> This coredump run also failed on:
>> 
>>   * tests/basic/fops-sanity.t
>>  (Wstat: 0 Tests: 11 Failed: 1)
>> Failed test:  10
>> 
>>   * tests/bugs/glusterfs-server/bug-861542.t 
>>  (Wstat: 0 Tests: 13 Failed: 1)
>> Failed test:  10
>> 
>>   * tests/performance/open-behind.t  
>>  (Wstat: 0 Tests: 17 Failed: 1)
>> Failed test:  17
> 
> FWIW, this is the same as https://bugzilla.redhat.com/show_bug.cgi?id=1195415
> 
>> 
>>   * http://mirror.salasaga.org/gluster/master/2015-03-03/bulk8/
>> 
>> IP - 104.130.74.143
>> 
>> This coredump run also failed on:
>> 
>>   * tests/basic/afr/entry-self-heal.t
>>  (Wstat: 0 Tests: 180 Failed: 2)
>> Failed tests:  127-128
>> 
>>   * tests/bugs/glusterfs-server/bug-861542.t 
>>  (Wstat: 0 Tests: 13 Failed: 1)
>> Failed test:  10
> 
> So is this one. i.e same as 
> https://bugzilla.redhat.com/show_bug.cgi?id=1195415

Thanks Shyam.  Somehow missed your email earlier, but all good now.

:)

+ Justin

--
GlusterFS - http://www.gluster.org

An open source, distributed file system scaling to several
petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] NetBSD's read-subvol-entry.t spurious failures explained

2015-03-06 Thread Ravishankar N


On 03/06/2015 04:31 PM, Emmanuel Dreyfus wrote:

Hi

I tracked down the spurious failures of read-subvol-entry.t on NetBSD.

Here is what should happen: we have a volume with brick0 and brick1.
We disable self-heal, kill brick0, create a file in a directory,
restart brick0, and list the directory content to check that we find the file.

The tested mechanism is that in brick1, trusted.afr.patchy-client-0
accuses brick0 of being outdated, hence AFR should rule out brick0
for listing directory content, and it should use brick1, which contains
the file we look for.

On NetBSD I can see that AFR never gets trusted.afr.patchy-client-0
and always thinks brick0 is fine. AFR randomly picks brick0 or brick1
to list directory content, and when it picks brick0 the test fails.
After bringing brick0 up, and performing "ls abc/def", does 
afr_do_readdir() get called for "def"?
If it does, then AFR will send a lookup to both bricks via
afr_inode_refresh(), and it will pick brick1 as the source.
Like I suggested earlier, we could put a print in afr_readdir_wind() and
see that it indeed goes to brick0 when the test fails.

The reason why trusted.afr.patchy-client-0 is not there is that the
node is cached in kernel FUSE from an earlier lookup. The TTL obtained
at that time tells the kernel this node is still valid, hence the
kernel does not send the new lookup to GlusterFS. Since GlusterFS uses
lookups to refresh the client's view of xattrs, it sticks with the older
value where brick0 was not yet outdated, and trusted.afr.patchy-client-0 is
unset.
If readdir comes on def, then it is AFR that initiates the lookup. So no 
fuse caching should be involved.




Questions:

1) Is NetBSD behavior wrong here? It got a TTL for a node; I understand
it should not send lookups to the filesystem until the TTL has expired.

2) How to fix it? If NetBSD behavior is correct, then I guess the test
only succeeds on Linux by chance and we only need to fix the test.
The change below flushes the kernel cache before looking for the file:

--- a/tests/basic/afr/read-subvol-entry.t
+++ b/tests/basic/afr/read-subvol-entry.t
@@ -26,6 +26,7 @@ TEST kill_brick $V0 $H0 $B0/brick0
  
  TEST touch $M0/abc/def/ghi

  TEST $CLI volume start $V0 force
+( cd $M0 && umount $M0 )
  EXPECT_WITHIN $PROCESS_UP_TIMEOUT "ghi" echo `ls $M0/abc/def/`
  
  #Cleanup






___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] NetBSD's read-subvol-entry.t spurious failures explained

2015-03-06 Thread Emmanuel Dreyfus
Hi

I tracked down the spurious failures of read-subvol-entry.t on NetBSD.

Here is what should happen: we have a volume with brick0 and brick1.
We disable self-heal, kill brick0, create a file in a directory,
restart brick0, and list the directory content to check that we find the file.

The tested mechanism is that in brick1, trusted.afr.patchy-client-0
accuses brick0 of being outdated, hence AFR should rule out brick0
for listing directory content, and it should use brick1, which contains
the file we look for.

On NetBSD I can see that AFR never gets trusted.afr.patchy-client-0
and always thinks brick0 is fine. AFR randomly picks brick0 or brick1
to list directory content, and when it picks brick0 the test fails.

The reason why trusted.afr.patchy-client-0 is not there is that the
node is cached in kernel FUSE from an earlier lookup. The TTL obtained
at that time tells the kernel this node is still valid, hence the
kernel does not send the new lookup to GlusterFS. Since GlusterFS uses
lookups to refresh the client's view of xattrs, it sticks with the older
value where brick0 was not yet outdated, and trusted.afr.patchy-client-0 is
unset.
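
For reference, the TTL in question is set by the filesystem itself in its
lookup reply. Below is a minimal libfuse low-level fragment showing the
mechanism (illustrative only: GlusterFS uses its own fuse-bridge rather
than libfuse, and the inode number and timeouts here are made up). The
entry_timeout/attr_timeout values tell the kernel how long it may answer
lookups for that name from its cache:

#define FUSE_USE_VERSION 26
#include <fuse_lowlevel.h>
#include <string.h>
#include <sys/stat.h>

/* Sketch of a lookup handler: the timeouts in the reply are the TTLs
 * the kernel honours before it sends another LOOKUP for this name. */
static void
sketch_lookup(fuse_req_t req, fuse_ino_t parent, const char *name)
{
        struct fuse_entry_param e;

        memset(&e, 0, sizeof(e));
        e.ino = 2;                      /* illustrative inode number      */
        e.attr.st_ino = 2;
        e.attr.st_mode = S_IFDIR | 0755;
        e.attr_timeout  = 1.0;          /* seconds attrs may be cached    */
        e.entry_timeout = 1.0;          /* seconds lookups may be skipped */

        fuse_reply_entry(req, &e);
        (void)parent;
        (void)name;
}

static const struct fuse_lowlevel_ops sketch_ops = {
        .lookup = sketch_lookup,
};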

Questions:

1) Is NetBSD behavior wrong here? It got a TTL for a node; I understand
it should not send lookups to the filesystem until the TTL has expired.

2) How to fix it? If NetBSD behavior is correct, then I guess the test
only succeeds on Linux by chance and we only need to fix the test.
The change below flushes the kernel cache before looking for the file:

--- a/tests/basic/afr/read-subvol-entry.t
+++ b/tests/basic/afr/read-subvol-entry.t
@@ -26,6 +26,7 @@ TEST kill_brick $V0 $H0 $B0/brick0
 
 TEST touch $M0/abc/def/ghi
 TEST $CLI volume start $V0 force
+( cd $M0 && umount $M0 ) 
 EXPECT_WITHIN $PROCESS_UP_TIMEOUT "ghi" echo `ls $M0/abc/def/`
 
 #Cleanup



-- 
Emmanuel Dreyfus
m...@netbsd.org
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] glfs-fini issue in upstream master

2015-03-06 Thread RAGHAVENDRA TALUR
On Thu, Mar 5, 2015 at 10:23 PM, Ravishankar N 
wrote:

>  tests/basic/afr/split-brain-healing.t is failing in upstream master:
>
>
> 
> ok 52
> ok 53
> glfsheal: quick-read.c:1052: qr_inode_table_destroy: Assertion
> `list_empty (&priv->table.lru[i])' failed.
> Healing /file1 failed: File not in split-brain.
> not ok 54 Got "0" instead of "1"
> FAILED COMMAND: 1 echo 0
> glfsheal: quick-read.c:1052: qr_inode_table_destroy: Assertion
> `list_empty (&priv->table.lru[i])' failed.
> Healing /file3 failed: File not in split-brain.
> not ok 55 Got "0" instead of "1"
> FAILED COMMAND: 1 echo 0
> /root/workspace/glusterfs
> ok 56
> Failed 2/56 subtests
>
> 
>
>
> If I comment out the calls to glfs_fini() in glfs-heal.c, the test passes.
>
> 
>
> ok 52
> ok 53
> Healing /file1 failed: File not in split-brain.
> Volume heal failed.
> ok 54
> Healing /file3 failed: File not in split-brain.
> Volume heal failed.
> ok 55
> /root/workspace/glusterfs
> ok 56
>
>
> 
>

>
> Help!
>
>
I think this is an issue similar to what Poornima fixed in the io-cache xlator.
Refer:
http://review.gluster.org/#/c/7642/25/xlators/performance/io-cache/src/io-cache.c
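
For anyone not familiar with this class of failure, here is a toy,
self-contained sketch of one generic way such a fini-time assertion can be
avoided (purely illustrative: this is neither the quick-read code nor
Poornima's actual patch, and all names are made up): if cached inodes can
still sit on the LRU lists when fini runs, the destroy path can drain them
instead of asserting the lists are empty.

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Toy stand-ins for a cached inode and its LRU list. */
struct toy_inode {
        int               id;
        struct toy_inode *next;
};

struct toy_table {
        struct toy_inode *lru;   /* whatever is still cached at fini time */
};

/* Tolerant teardown: forget the leftovers, then free the table.
 * An assert(t->lru == NULL) here instead is the kind of check that
 * fires in the log above when something is still cached. */
static void
toy_table_destroy(struct toy_table *t)
{
        while (t->lru != NULL) {
                struct toy_inode *ino = t->lru;

                t->lru = ino->next;
                printf("forgetting cached inode %d at fini\n", ino->id);
                free(ino);
        }
        free(t);
}

int
main(void)
{
        struct toy_table *t   = calloc(1, sizeof(*t));
        struct toy_inode *ino = calloc(1, sizeof(*ino));

        assert(t != NULL && ino != NULL);
        ino->id   = 42;
        ino->next = NULL;
        t->lru    = ino;          /* something survived until fini */

        toy_table_destroy(t);     /* drain instead of asserting empty */
        return 0;
}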

Adding Poornima.


>
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
>
>


-- 
Raghavendra Talur
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Spurious failure report for master branch - 2015-03-03

2015-03-06 Thread Pranith Kumar Karampuri


On 03/04/2015 09:57 AM, Justin Clift wrote:

Ran 20 x regression tests on our GlusterFS master branch code
as of a few hours ago, commit 95d5e60afb29aedc29909340e7564d54a6a247c2.

5 of them were successful (25%), 15 of them failed in various ways
(75%).

We need to get this down to about 5% or less (preferably 0%), as it's
killing our development iteration speed.  We're wasting huge amounts
of time working around this. :(


Spurious failures
*

   * 5 x tests/bugs/distribute/bug-1117851.t
   (Wstat: 0 Tests: 24 Failed: 1)
 Failed test:  15

 This one is causing a 25% failure rate all by itself. :(

 This needs fixing soon. :)


   * 3 x tests/bugs/geo-replication/bug-877293.t
   (Wstat: 0 Tests: 15 Failed: 1)
 Failed test:  11

Nice catch by regression. Fix: http://review.gluster.org/9817

Pranith


   * 2 x tests/basic/afr/entry-self-heal.t  
   (Wstat: 0 Tests: 180 Failed: 2)
 Failed tests:  127-128

   * 1 x tests/basic/ec/ec-12-4.t   
   (Wstat: 0 Tests: 541 Failed: 2)
 Failed tests:  409, 441

   * 1 x tests/basic/fops-sanity.t  
   (Wstat: 0 Tests: 11 Failed: 1)
 Failed test:  10

   * 1 x tests/basic/uss.t  
   (Wstat: 0 Tests: 160 Failed: 1)
 Failed test:  26

   * 1 x tests/performance/open-behind.t
   (Wstat: 0 Tests: 17 Failed: 1)
 Failed test:  17

   * 1 x tests/bugs/distribute/bug-884455.t 
   (Wstat: 0 Tests: 22 Failed: 1)
 Failed test:  11

   * 1 x tests/bugs/fuse/bug-1126048.t  
   (Wstat: 0 Tests: 12 Failed: 1)
 Failed test:  10

   * 1 x tests/bugs/quota/bug-1038598.t 
   (Wstat: 0 Tests: 28 Failed: 1)
 Failed test:  28


2 x Coredumps
*

   * http://mirror.salasaga.org/gluster/master/2015-03-03/bulk5/

 IP - 104.130.74.142

 This coredump run also failed on:

   * tests/basic/fops-sanity.t  
   (Wstat: 0 Tests: 11 Failed: 1)
 Failed test:  10

   * tests/bugs/glusterfs-server/bug-861542.t   
   (Wstat: 0 Tests: 13 Failed: 1)
 Failed test:  10

   * tests/performance/open-behind.t
   (Wstat: 0 Tests: 17 Failed: 1)
 Failed test:  17

   * http://mirror.salasaga.org/gluster/master/2015-03-03/bulk8/

 IP - 104.130.74.143

 This coredump run also failed on:

   * tests/basic/afr/entry-self-heal.t  
   (Wstat: 0 Tests: 180 Failed: 2)
 Failed tests:  127-128

   * tests/bugs/glusterfs-server/bug-861542.t   
   (Wstat: 0 Tests: 13 Failed: 1)
 Failed test:  10

Both VMs are also online, in case they're useful to log into
for investigation (root / the jenkins slave pw).

If they're not, please let me know so I can blow them away. :)


1 x hung host
*

Hung on tests/bugs/posix/bug-1113960.t

root  12497  1290  0 Mar03 ?  S  0:00  \_ /bin/bash /opt/qa/regression.sh
root  12504 12497  0 Mar03 ?  S  0:00  \_ /bin/bash ./run-tests.sh
root  12519 12504  0 Mar03 ?  S  0:03  \_ /usr/bin/perl /usr/bin/prove 
-rf --timer ./tests
root  22018 12519  0 00:17 ?  S  0:00  \_ /bin/bash 
./tests/bugs/posix/bug-1113960.t
root  30002 22018  0 01:57 ?  S  0:00  \_ mv 
/mnt/glusterfs/0/longernamedir1/longernamedir2/longernamedir3/

This VM (23.253.53.111) is still online + untouched (still hung),
if someone wants to log in to investigate.  (root / the jenkins
slave pw)

Hope that's helpful. :)

Regards and best wishes,

Justin Clift

--
GlusterFS - http://www.gluster.org

An open source, distributed file system scaling to several
petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel