Re: [Gluster-devel] Spurious regression failure? tests/basic/ec/ec-background-heals.t
On Thu, Jan 26, 2017 at 7:45 PM, Ashish Pandey <aspan...@redhat.com> wrote:

> Xavi,
>
> shd has been disabled in this test on line number 12 and we have also
> disabled client side heal.
> So, nobody is going to try to heal it.

Already enqueued heals should be healed. I am taking a look at it. Let's see.

> Ashish
Re: [Gluster-devel] Spurious regression failure? tests/basic/ec/ec-background-heals.t
Xavi,

shd has been disabled in this test on line number 12 and we have also disabled client side heal.
So, nobody is going to try to heal it.

Ashish

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Spurious regression failure? tests/basic/ec/ec-background-heals.t
I've +1ed it now.

On Thu, 26 Jan 2017 at 15:05, Xavier Hernandez <xhernan...@datalab.es> wrote:

> Hi Atin,
>
> I don't clearly see what's the problem. Even if the truncate causes a
> dirty flag to be set, eventually it should be removed before the
> $HEAL_TIMEOUT value.
>
> For now I've marked the test as bad.
>
> Patch is: https://review.gluster.org/16470
>
> Xavi
Re: [Gluster-devel] Spurious regression failure? tests/basic/ec/ec-background-heals.t
Hi Atin,

I don't clearly see what's the problem. Even if the truncate causes a dirty flag to be set, eventually it should be removed before the $HEAL_TIMEOUT value.

For now I've marked the test as bad.

Patch is: https://review.gluster.org/16470

Xavi

On 25/01/17 17:24, Atin Mukherjee wrote:
> Can we please address this as early as possible? My patch has hit this
> failure 3 out of 4 recheck attempts now. I'm guessing some recent
> changes have caused it.
Re: [Gluster-devel] Spurious regression failure? tests/basic/ec/ec-background-heals.t
Pranith,

In this test, tests/basic/ec/ec-background-heals.t, I think line number 86 is actually creating a heal entry instead of helping the data heal quickly. What if all the data was already healed at that moment, truncate came and set the dirty flag in preop, and at the end, as part of the heal, the dirty flag was unset on the previously good bricks only, so that the brick which acted as heal-sink still has the dirty flag set by truncate?
That is why we are only seeing "1" as get_pending_heal_count. If a file was actually not healed it should be "2".
If heal on this file completes and the dirty flag is unset before the truncate, everything will be fine.

I think we can wait for the file to be healed without the truncate?

71 #Test that disabling background-heals still drains the queue
72 TEST $CLI volume set $V0 disperse.background-heals 1
73 TEST touch $M0/{a,b,c,d}
74 TEST kill_brick $V0 $H0 $B0/${V0}2
75 EXPECT_WITHIN $CONFIG_UPDATE_TIMEOUT "1" mount_get_option_value $M0 $V0-disperse-0 background-heals
76 EXPECT_WITHIN $CONFIG_UPDATE_TIMEOUT "200" mount_get_option_value $M0 $V0-disperse-0 heal-wait-qlength
77 TEST truncate -s 1GB $M0/a
78 echo abc > $M0/b
79 echo abc > $M0/c
80 echo abc > $M0/d
81 TEST $CLI volume start $V0 force
82 EXPECT_WITHIN $CHILD_UP_TIMEOUT "3" ec_child_up_count $V0 0
83 TEST chown root:root $M0/{a,b,c,d}
84 TEST $CLI volume set $V0 disperse.background-heals 0
85 EXPECT_NOT "0" mount_get_option_value $M0 $V0-disperse-0 heal-waiters

86 TEST truncate -s 0 $M0/a # This completes the heal fast ;-) <<<<<<<

87 EXPECT_WITHIN $HEAL_TIMEOUT "^0$" get_pending_heal_count $V0

Ashish
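The "wait for the heal without truncate" idea in the message above amounts to dropping line 86 and relying only on the polling check on line 87. The sketch below illustrates that polling pattern on its own, without a Gluster volume. It is not the test framework's code: expect_within and fake_pending_heal_count are hypothetical stand-ins, invented here, for EXPECT_WITHIN and get_pending_heal_count, and the stub simply counts down from 2 to 0 to imitate a queue that drains.

```shell
#!/bin/bash
# Hypothetical stand-in for the framework's EXPECT_WITHIN (an assumption,
# not the real implementation): run a command repeatedly until its output
# matches a regex, or give up once the timeout in seconds has passed.
expect_within() {
    local timeout=$1 pattern=$2
    shift 2
    local deadline=$(( $(date +%s) + timeout ))
    local out
    while :; do
        out=$("$@" 2>/dev/null)
        if [[ $out =~ $pattern ]]; then
            return 0
        fi
        if (( $(date +%s) >= deadline )); then
            echo "timed out; last value: $out" >&2
            return 1
        fi
        sleep 1
    done
}

# Hypothetical stub for get_pending_heal_count. The counter lives in a
# temporary file because the function is called inside a command
# substitution, so a shell variable would not survive between calls.
state=$(mktemp)
echo 2 > "$state"
fake_pending_heal_count() {
    local n
    n=$(cat "$state")
    echo "$n"
    if (( n > 0 )); then
        echo $(( n - 1 )) > "$state"
    fi
}

# No truncate before the wait: only poll until nothing is pending.
if expect_within 10 '^0$' fake_pending_heal_count; then
    echo "heal queue drained"
fi
rm -f "$state"
```

In the real test the same shape would be line 87 alone, with get_pending_heal_count as the polled command. Whether that is enough depends on the already enqueued heals completing on their own, which is the open question in this thread.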
Re: [Gluster-devel] Spurious regression failure? tests/basic/ec/ec-background-heals.t
Found another failure on same test:
https://build.gluster.org/job/centos6-regression/2874/consoleFull

----- Original Message -----
> From: "Nithya Balachandran" <nbala...@redhat.com>
> To: "Gluster Devel" <gluster-devel@gluster.org>, "Pranith Kumar Karampuri" <pkara...@redhat.com>, "Ashish Pandey" <aspan...@redhat.com>
> Sent: Tuesday, January 24, 2017 9:16:31 AM
> Subject: [Gluster-devel] Spurious regression failure? tests/basic/ec/ec-background-heals.t
>
> Hi,
>
> Can you please take a look at
> https://build.gluster.org/job/centos6-regression/2859/console ?
>
> tests/basic/ec/ec-background-heals.t has failed.
>
> Thanks,
> Nithya