Re: [Gluster-devel] [Gluster-Maintainers] Master branch lock down status (Fri, August 9th)

2018-08-11 Thread Shyam Ranganathan
On 08/11/2018 02:09 AM, Atin Mukherjee wrote:
> I saw the same behaviour for
> https://build.gluster.org/job/regression-on-demand-full-run/47/consoleFull
> as well. In both cases the common pattern is that a test was retried
> but the job succeeded overall. Is this a bug that was introduced
> recently? At the moment, this blocks us from debugging any test that
> was retried while the job succeeded overall.
> 
> *01:54:20* Archiving artifacts
> *01:54:21* ‘glusterfs-logs.tgz’ doesn’t match anything
> *01:54:21* No artifacts found that match the file pattern "glusterfs-logs.tgz". Configuration error?
> *01:54:21* Finished: SUCCESS
> 

This has always been the behavior. Logs are archived only when
run-tests.sh calls out a run as failed. We do not call out a run as a
failure when tests were merely retried and then passed, hence no
logs.

I will add this today to the WIP testing patchset.
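
To make the rule above concrete, here is a rough Python model of it. This is only an illustrative sketch, not the actual run-tests.sh or Jenkins job logic; the names `TestResult`, `run_failed`, `should_archive_logs` and the `archive_on_retry` switch are hypothetical and stand in for whatever the patchset ends up doing.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TestResult:
    name: str       # path of the .t file
    attempts: int   # 1 means no retry was needed
    passed: bool    # outcome of the final attempt


def run_failed(results: List[TestResult]) -> bool:
    """A run is called out as failed only if a test failed its final attempt."""
    return any(not r.passed for r in results)


def should_archive_logs(results: List[TestResult],
                        archive_on_retry: bool = False) -> bool:
    """Model of the archiving decision (hypothetical names).

    With archive_on_retry=False, logs are kept only for failed runs, so a
    test that was retried and then passed leaves no logs behind.
    With archive_on_retry=True, any retry is enough to keep the logs.
    """
    if run_failed(results):
        return True
    return archive_on_retry and any(r.attempts > 1 for r in results)
```

With `archive_on_retry` left at its default, a run where bug-1408712.t needed one retry and then passed reports success and keeps nothing, which matches the "No artifacts found" message in the console output.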

> 
> 
> On Sat, Aug 11, 2018 at 9:40 AM Ravishankar N wrote:
> 
> 
> 
> On 08/11/2018 07:29 AM, Shyam Ranganathan wrote:
> > ./tests/bugs/replicate/bug-1408712.t (one retry)
> I'll take a look at this. But it looks like archiving the artifacts
> (logs) for this run
> (https://build.gluster.org/job/regression-on-demand-full-run/44/consoleFull)
> was a failure.
> Thanks,
> Ravi
> ___
> maintainers mailing list
> maintain...@gluster.org 
> https://lists.gluster.org/mailman/listinfo/maintainers
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [Gluster-Maintainers] Master branch lock down status (Fri, August 9th)

2018-08-11 Thread Atin Mukherjee
I saw the same behaviour for
https://build.gluster.org/job/regression-on-demand-full-run/47/consoleFull
as well. In both cases the common pattern is that a test was retried but
the job succeeded overall. Is this a bug that was introduced recently? At
the moment, this blocks us from debugging any test that was retried
while the job succeeded overall.

*01:54:20* Archiving artifacts
*01:54:21* ‘glusterfs-logs.tgz’ doesn’t match anything
*01:54:21* No artifacts found that match the file pattern "glusterfs-logs.tgz". Configuration error?
*01:54:21* Finished: SUCCESS



On Sat, Aug 11, 2018 at 9:40 AM Ravishankar N wrote:

>
>
> On 08/11/2018 07:29 AM, Shyam Ranganathan wrote:
> > ./tests/bugs/replicate/bug-1408712.t (one retry)
> I'll take a look at this. But it looks like archiving the artifacts
> (logs) for this run
> (https://build.gluster.org/job/regression-on-demand-full-run/44/consoleFull)
> was a failure.
> Thanks,
> Ravi

Re: [Gluster-devel] [Gluster-Maintainers] Master branch lock down status (Thu, August 09th)

2018-08-10 Thread Atin Mukherjee
Pranith,

https://review.gluster.org/c/glusterfs/+/20685 seems to have caused
multiple failed runs of
https://review.gluster.org/c/glusterfs/+/20637/8 in yesterday's report.
Did you get a chance to look at it?

On Fri, Aug 10, 2018 at 1:03 PM Pranith Kumar Karampuri wrote:

>
>
> On Fri, Aug 10, 2018 at 6:34 AM Shyam Ranganathan 
> wrote:
>
>> Today's test results are updated in the spreadsheet in sheet named "Run
>> patch set 8".
>>
>> I took in patch https://review.gluster.org/c/glusterfs/+/20685 which
>> caused quite a few failures, so not updating new failures as issue yet.
>>
>> Please look at the failures for tests that were retried and passed, as
>> the logs for the initial runs should be preserved from this run onward.
>>
>> Otherwise nothing else to report on the run status, if you are averse to
>> spreadsheets look at this comment in gerrit [1].
>>
>> Shyam
>>
>> [1] Patch set 8 run status:
>>
>> https://review.gluster.org/c/glusterfs/+/20637/8#message-54de30fa384fd02b0426d9db6d07fad4eeefcf08
>> On 08/07/2018 07:37 PM, Shyam Ranganathan wrote:
>> > Deserves a new beginning, threads on the other mail have gone deep
>> > enough.
>> >
>> > NOTE: (5) below needs your attention, rest is just process and data on
>> > how to find failures.
>> >
>> > 1) We are running the tests using the patch [2].
>> >
>> > 2) Run details are extracted into a separate sheet in [3] named "Run
>> > Failures" use a search to find a failing test and the corresponding run
>> > that it failed in.
>> >
>> > 3) Patches that are fixing issues can be found here [1], if you think
>> > you have a patch out there, that is not in this list, shout out.
>> >
>> > 4) If you own up a test case failure, update the spreadsheet [3] with
>> > your name against the test, and also update other details as needed (as
>> > comments, as edit rights to the sheet are restricted).
>> >
>> > 5) Current test failures
>> > We still have the following tests failing and some without any RCA or
>> > attention, (If something is incorrect, write back).
>> >
>> > ./tests/bugs/replicate/bug-1290965-detect-bitrotten-objects.t (needs
>> > attention)
>> > ./tests/00-geo-rep/georep-basic-dr-tarssh.t (Kotresh)
>> > ./tests/bugs/glusterd/add-brick-and-validate-replicated-volume-options.t
>> > (Atin)
>> > ./tests/bugs/ec/bug-1236065.t (Ashish)
>> > ./tests/00-geo-rep/georep-basic-dr-rsync.t (Kotresh)
>> > ./tests/basic/ec/ec-1468261.t (needs attention)
>> > ./tests/basic/afr/add-brick-self-heal.t (needs attention)
>> > ./tests/basic/afr/granular-esh/replace-brick.t (needs attention)
>> > ./tests/bugs/core/multiplex-limit-issue-151.t (needs attention)
>> > ./tests/bugs/glusterd/validating-server-quorum.t (Atin)
>> > ./tests/bugs/replicate/bug-1363721.t (Ravi)
>> >
>> > Here are some newer failures, but mostly one-off failures except cores
>> > in ec-5-2.t. All of the following need attention as these are new.
>> >
>> > ./tests/00-geo-rep/00-georep-verify-setup.t
>> > ./tests/basic/afr/gfid-mismatch-resolution-with-fav-child-policy.t
>> > ./tests/basic/stats-dump.t
>> > ./tests/bugs/bug-1110262.t
>> > ./tests/bugs/glusterd/mgmt-handshake-and-volume-sync-post-glusterd-restart.t
>> > ./tests/basic/ec/ec-data-heal.t
>> > ./tests/bugs/replicate/bug-1448804-check-quorum-type-values.t
>>
>
> Sent https://review.gluster.org/c/glusterfs/+/20697 for the test above.
>
>
>> > ./tests/bugs/snapshot/bug-1482023-snpashot-issue-with-other-processes-accessing-mounted-path.t
>> > ./tests/basic/ec/ec-5-2.t
>> >
>> > 6) Tests that are addressed or are not occurring anymore are,
>> >
>> > ./tests/bugs/glusterd/rebalance-operations-in-single-node.t
>> > ./tests/bugs/index/bug-1559004-EMLINK-handling.t
>> > ./tests/bugs/replicate/bug-1386188-sbrain-fav-child.t
>> > ./tests/bugs/replicate/bug-1433571-undo-pending-only-on-up-bricks.t
>> > ./tests/bitrot/bug-1373520.t
>> > ./tests/bugs/distribute/bug-1117851.t
>> > ./tests/bugs/glusterd/quorum-validation.t
>> > ./tests/bugs/distribute/bug-1042725.t
>> > ./tests/bugs/replicate/bug-1586020-mark-dirty-for-entry-txn-on-quorum-failure.t
>> > ./tests/bugs/quota/bug-1293601.t
>> > ./tests/bugs/bug-1368312.t
>> > ./tests/bugs/distribute/bug-1122443.t
>> > ./tests/bugs/core/bug-1432542-mpx-restart-crash.t
>> >
>> > Shyam (and Atin)
>> >
>> > On 08/05/2018 06:24 PM, Shyam Ranganathan wrote:
>> >> Health on master as of the last nightly run [4] is still the same.
>> >>
>> >> Potential patches that rectify the situation (as in [1]) are bunched in
>> >> a patch [2] that Atin and myself have put through several regressions
>> >> (mux, normal and line coverage) and these have also not passed.
>> >>
>> >> Till we rectify the situation we are locking down master branch commit
>> >> rights to the following people, Amar, Atin, Shyam, Vijay.
>> >>
>> >> The intention is to stabilize master and not add more patches that may
>> >> destabilize it.
>> >>
>> >> Test cases that are tracked as failures and need action are present
>> >> here [3].
>> 

Re: [Gluster-devel] [Gluster-Maintainers] Master branch lock down status (Wed, August 08th)

2018-08-10 Thread Kotresh Hiremath Ravishankar
Hi Shyam/Atin,

I have posted a patch [1] for the geo-rep test case failures:
tests/00-geo-rep/georep-basic-dr-rsync.t
tests/00-geo-rep/georep-basic-dr-tarssh.t
tests/00-geo-rep/00-georep-verify-setup.t

Please include patch [1] while triggering tests.
The instrumentation patch [2] which was included can be removed.

[1]  https://review.gluster.org/#/c/glusterfs/+/20704/
[2]  https://review.gluster.org/#/c/glusterfs/+/20477/

Thanks,
Kotresh HR




On Fri, Aug 10, 2018 at 3:21 PM, Pranith Kumar Karampuri <
pkara...@redhat.com> wrote:

>
>
> On Thu, Aug 9, 2018 at 4:02 PM Pranith Kumar Karampuri <
> pkara...@redhat.com> wrote:
>
>>
>>
>> On Thu, Aug 9, 2018 at 6:34 AM Shyam Ranganathan 
>> wrote:
>>
>>> Today's patch set 7 [1], included fixes provided till last evening IST,
>>> and its runs can be seen here [2] (yay! we can link to comments in
>>> gerrit now).
>>>
>>> New failures: (added to the spreadsheet)
>>> ./tests/bugs/protocol/bug-808400-repl.t (core dumped)
>>> ./tests/bugs/quick-read/bug-846240.t
>>>
>>> Older tests that had not recurred, but failed today: (moved up in the
>>> spreadsheet)
>>> ./tests/bugs/replicate/bug-1433571-undo-pending-only-on-up-bricks.t
>>> ./tests/bugs/index/bug-1559004-EMLINK-handling.t
>>>
>>
>> The above test is timing out. I had to increase the timeout while adding
>> the .t so that it has time to create enough links to hit the maximum in
>> ext4. Will re-check if it is the same issue and get back.
>>
>
> This test is timing out with lcov. I bumped up the timeout to 30 minutes @
> https://review.gluster.org/#/c/glusterfs/+/20699. I am not happy that
> this test takes so long, but without it, it is difficult to find
> regressions on ext4, which has limits on the number of hardlinks in a
> directory. (The last time we introduced such a regression, it took us
> almost one year to find the problem.) If there is a way of running this
> .t once per day and before each release, I will be happy to make it part
> of that. Let me know.
>
>
>>
>>
>>>
>>> Other issues;
>>> Test ./tests/basic/ec/ec-5-2.t core dumped again
>>> Few geo-rep failures, Kotresh should have more logs to look at with
>>> these runs
>>> Test ./tests/bugs/glusterd/quorum-validation.t dumped core again
>>>
>>> Atin/Amar, we may need to merge some of the patches that have proven to
>>> be holding up and fixing issues today, so that we do not leave
>>> everything to the last. Check and move them along or lmk.
>>>
>>> Shyam
>>>
>>> [1] Patch set 7: https://review.gluster.org/c/glusterfs/+/20637/7
>>> [2] Runs against patch set 7 and its status (incomplete as some runs
>>> have not completed):
>>> https://review.gluster.org/c/glusterfs/+/20637/7#message-37bc68ce6f2157f2947da6fd03b361ab1b0d1a77
>>> (also updated in the spreadsheet)
>>>
>>> On 08/07/2018 07:37 PM, Shyam Ranganathan wrote:
>>> > Deserves a new beginning, threads on the other mail have gone deep
>>> > enough.
>>> >
>>> > NOTE: (5) below needs your attention, rest is just process and data on
>>> > how to find failures.
>>> >
>>> > 1) We are running the tests using the patch [2].
>>> >
>>> > 2) Run details are extracted into a separate sheet in [3] named "Run
>>> > Failures" use a search to find a failing test and the corresponding run
>>> > that it failed in.
>>> >
>>> > 3) Patches that are fixing issues can be found here [1], if you think
>>> > you have a patch out there, that is not in this list, shout out.
>>> >
>>> > 4) If you own up a test case failure, update the spreadsheet [3] with
>>> > your name against the test, and also update other details as needed (as
>>> > comments, as edit rights to the sheet are restricted).
>>> >
>>> > 5) Current test failures
>>> > We still have the following tests failing and some without any RCA or
>>> > attention, (If something is incorrect, write back).
>>> >
>>> > ./tests/bugs/replicate/bug-1290965-detect-bitrotten-objects.t (needs
>>> > attention)
>>> > ./tests/00-geo-rep/georep-basic-dr-tarssh.t (Kotresh)
>>> > ./tests/bugs/glusterd/add-brick-and-validate-replicated-volume-options.t
>>> > (Atin)
>>> > ./tests/bugs/ec/bug-1236065.t (Ashish)
>>> > ./tests/00-geo-rep/georep-basic-dr-rsync.t (Kotresh)
>>> > ./tests/basic/ec/ec-1468261.t (needs attention)
>>> > ./tests/basic/afr/add-brick-self-heal.t (needs attention)
>>> > ./tests/basic/afr/granular-esh/replace-brick.t (needs attention)
>>> > ./tests/bugs/core/multiplex-limit-issue-151.t (needs attention)
>>> > ./tests/bugs/glusterd/validating-server-quorum.t (Atin)
>>> > ./tests/bugs/replicate/bug-1363721.t (Ravi)
>>> >
>>> > Here are some newer failures, but mostly one-off failures except cores
>>> > in ec-5-2.t. All of the following need attention as these are new.
>>> >
>>> > ./tests/00-geo-rep/00-georep-verify-setup.t
>>> > ./tests/basic/afr/gfid-mismatch-resolution-with-fav-child-policy.t
>>> > ./tests/basic/stats-dump.t
>>> > ./tests/bugs/bug-1110262.t
>>> > ./tests/bugs/glusterd/mgmt-handshake-and-volume-sync-
>>> 

Re: [Gluster-devel] [Gluster-Maintainers] Master branch lock down status (Wed, August 08th)

2018-08-10 Thread Pranith Kumar Karampuri
On Thu, Aug 9, 2018 at 4:02 PM Pranith Kumar Karampuri wrote:

>
>
> On Thu, Aug 9, 2018 at 6:34 AM Shyam Ranganathan 
> wrote:
>
>> Today's patch set 7 [1], included fixes provided till last evening IST,
>> and its runs can be seen here [2] (yay! we can link to comments in
>> gerrit now).
>>
>> New failures: (added to the spreadsheet)
>> ./tests/bugs/protocol/bug-808400-repl.t (core dumped)
>> ./tests/bugs/quick-read/bug-846240.t
>>
>> Older tests that had not recurred, but failed today: (moved up in the
>> spreadsheet)
>> ./tests/bugs/replicate/bug-1433571-undo-pending-only-on-up-bricks.t
>> ./tests/bugs/index/bug-1559004-EMLINK-handling.t
>>
>
> The above test is timing out. I had to increase the timeout while adding
> the .t so that it has time to create enough links to hit the maximum in
> ext4. Will re-check if it is the same issue and get back.
>

This test is timing out with lcov. I bumped up the timeout to 30 minutes @
https://review.gluster.org/#/c/glusterfs/+/20699. I am not happy that this
test takes so long, but without it, it is difficult to find regressions on
ext4, which has limits on the number of hardlinks in a directory. (The last
time we introduced such a regression, it took us almost one year to find
the problem.) If there is a way of running this .t once per day and before
each release, I will be happy to make it part of that. Let me know.
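
For anyone unfamiliar with why the test is slow, the idea can be sketched as below. This is a minimal Python illustration under my own assumptions, not what bug-1559004-EMLINK-handling.t actually does; the function name `links_until_emlink` and its parameters are hypothetical. It keeps hard-linking one file until the filesystem returns EMLINK or a cap is reached, and the real limit depends on the filesystem underneath.

```python
import errno
import os
import tempfile


def links_until_emlink(directory: str, max_attempts: int) -> int:
    """Hard-link one file repeatedly inside `directory`.

    Returns how many links were created before the filesystem refused
    with EMLINK, or max_attempts if the limit was never reached.
    """
    target = os.path.join(directory, "target")
    with open(target, "w"):
        pass  # create an empty file to link against

    created = 0
    for i in range(max_attempts):
        try:
            os.link(target, os.path.join(directory, f"link-{i}"))
        except OSError as err:
            if err.errno == errno.EMLINK:
                break  # link limit of the underlying filesystem reached
            raise  # anything else is a genuine error
        created += 1
    return created
```

Calling it on a scratch directory, for example `links_until_emlink(tempfile.mkdtemp(), 100000)`, shows where the limit sits on that filesystem. The loop has to go all the way to the limit before EMLINK shows up, and with lcov instrumentation each operation is slower, which is where the time goes.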


>
>
>>
>> Other issues;
>> Test ./tests/basic/ec/ec-5-2.t core dumped again
>> Few geo-rep failures, Kotresh should have more logs to look at with
>> these runs
>> Test ./tests/bugs/glusterd/quorum-validation.t dumped core again
>>
>> Atin/Amar, we may need to merge some of the patches that have proven to
>> be holding up and fixing issues today, so that we do not leave
>> everything to the last. Check and move them along or lmk.
>>
>> Shyam
>>
>> [1] Patch set 7: https://review.gluster.org/c/glusterfs/+/20637/7
>> [2] Runs against patch set 7 and its status (incomplete as some runs
>> have not completed):
>>
>> https://review.gluster.org/c/glusterfs/+/20637/7#message-37bc68ce6f2157f2947da6fd03b361ab1b0d1a77
>> (also updated in the spreadsheet)
>>
>> On 08/07/2018 07:37 PM, Shyam Ranganathan wrote:
>> > Deserves a new beginning, threads on the other mail have gone deep
>> > enough.
>> >
>> > NOTE: (5) below needs your attention, rest is just process and data on
>> > how to find failures.
>> >
>> > 1) We are running the tests using the patch [2].
>> >
>> > 2) Run details are extracted into a separate sheet in [3] named "Run
>> > Failures" use a search to find a failing test and the corresponding run
>> > that it failed in.
>> >
>> > 3) Patches that are fixing issues can be found here [1], if you think
>> > you have a patch out there, that is not in this list, shout out.
>> >
>> > 4) If you own up a test case failure, update the spreadsheet [3] with
>> > your name against the test, and also update other details as needed (as
>> > comments, as edit rights to the sheet are restricted).
>> >
>> > 5) Current test failures
>> > We still have the following tests failing and some without any RCA or
>> > attention, (If something is incorrect, write back).
>> >
>> > ./tests/bugs/replicate/bug-1290965-detect-bitrotten-objects.t (needs
>> > attention)
>> > ./tests/00-geo-rep/georep-basic-dr-tarssh.t (Kotresh)
>> > ./tests/bugs/glusterd/add-brick-and-validate-replicated-volume-options.t
>> > (Atin)
>> > ./tests/bugs/ec/bug-1236065.t (Ashish)
>> > ./tests/00-geo-rep/georep-basic-dr-rsync.t (Kotresh)
>> > ./tests/basic/ec/ec-1468261.t (needs attention)
>> > ./tests/basic/afr/add-brick-self-heal.t (needs attention)
>> > ./tests/basic/afr/granular-esh/replace-brick.t (needs attention)
>> > ./tests/bugs/core/multiplex-limit-issue-151.t (needs attention)
>> > ./tests/bugs/glusterd/validating-server-quorum.t (Atin)
>> > ./tests/bugs/replicate/bug-1363721.t (Ravi)
>> >
>> > Here are some newer failures, but mostly one-off failures except cores
>> > in ec-5-2.t. All of the following need attention as these are new.
>> >
>> > ./tests/00-geo-rep/00-georep-verify-setup.t
>> > ./tests/basic/afr/gfid-mismatch-resolution-with-fav-child-policy.t
>> > ./tests/basic/stats-dump.t
>> > ./tests/bugs/bug-1110262.t
>> > ./tests/bugs/glusterd/mgmt-handshake-and-volume-sync-post-glusterd-restart.t
>> > ./tests/basic/ec/ec-data-heal.t
>> > ./tests/bugs/replicate/bug-1448804-check-quorum-type-values.t
>> > ./tests/bugs/snapshot/bug-1482023-snpashot-issue-with-other-processes-accessing-mounted-path.t
>> > ./tests/basic/ec/ec-5-2.t
>> >
>> > 6) Tests that are addressed or are not occurring anymore are,
>> >
>> > ./tests/bugs/glusterd/rebalance-operations-in-single-node.t
>> > ./tests/bugs/index/bug-1559004-EMLINK-handling.t
>> > ./tests/bugs/replicate/bug-1386188-sbrain-fav-child.t
>> > ./tests/bugs/replicate/bug-1433571-undo-pending-only-on-up-bricks.t
>> > ./tests/bitrot/bug-1373520.t
>> > ./tests/bugs/distribute/bug-1117851.t
>> > 

Re: [Gluster-devel] [Gluster-Maintainers] Master branch lock down status (Thu, August 09th)

2018-08-10 Thread Pranith Kumar Karampuri
On Fri, Aug 10, 2018 at 6:34 AM Shyam Ranganathan wrote:

> Today's test results are updated in the spreadsheet in sheet named "Run
> patch set 8".
>
> I took in patch https://review.gluster.org/c/glusterfs/+/20685 which
> caused quite a few failures, so not updating new failures as issue yet.
>
> Please look at the failures for tests that were retried and passed, as
> the logs for the initial runs should be preserved from this run onward.
>
> Otherwise nothing else to report on the run status, if you are averse to
> spreadsheets look at this comment in gerrit [1].
>
> Shyam
>
> [1] Patch set 8 run status:
>
> https://review.gluster.org/c/glusterfs/+/20637/8#message-54de30fa384fd02b0426d9db6d07fad4eeefcf08
> On 08/07/2018 07:37 PM, Shyam Ranganathan wrote:
> > Deserves a new beginning, threads on the other mail have gone deep
> > enough.
> >
> > NOTE: (5) below needs your attention, rest is just process and data on
> > how to find failures.
> >
> > 1) We are running the tests using the patch [2].
> >
> > 2) Run details are extracted into a separate sheet in [3] named "Run
> > Failures" use a search to find a failing test and the corresponding run
> > that it failed in.
> >
> > 3) Patches that are fixing issues can be found here [1], if you think
> > you have a patch out there, that is not in this list, shout out.
> >
> > 4) If you own up a test case failure, update the spreadsheet [3] with
> > your name against the test, and also update other details as needed (as
> > comments, as edit rights to the sheet are restricted).
> >
> > 5) Current test failures
> > We still have the following tests failing and some without any RCA or
> > attention, (If something is incorrect, write back).
> >
> > ./tests/bugs/replicate/bug-1290965-detect-bitrotten-objects.t (needs
> > attention)
> > ./tests/00-geo-rep/georep-basic-dr-tarssh.t (Kotresh)
> > ./tests/bugs/glusterd/add-brick-and-validate-replicated-volume-options.t
> > (Atin)
> > ./tests/bugs/ec/bug-1236065.t (Ashish)
> > ./tests/00-geo-rep/georep-basic-dr-rsync.t (Kotresh)
> > ./tests/basic/ec/ec-1468261.t (needs attention)
> > ./tests/basic/afr/add-brick-self-heal.t (needs attention)
> > ./tests/basic/afr/granular-esh/replace-brick.t (needs attention)
> > ./tests/bugs/core/multiplex-limit-issue-151.t (needs attention)
> > ./tests/bugs/glusterd/validating-server-quorum.t (Atin)
> > ./tests/bugs/replicate/bug-1363721.t (Ravi)
> >
> > Here are some newer failures, but mostly one-off failures except cores
> > in ec-5-2.t. All of the following need attention as these are new.
> >
> > ./tests/00-geo-rep/00-georep-verify-setup.t
> > ./tests/basic/afr/gfid-mismatch-resolution-with-fav-child-policy.t
> > ./tests/basic/stats-dump.t
> > ./tests/bugs/bug-1110262.t
> > ./tests/bugs/glusterd/mgmt-handshake-and-volume-sync-post-glusterd-restart.t
> > ./tests/basic/ec/ec-data-heal.t
> > ./tests/bugs/replicate/bug-1448804-check-quorum-type-values.t
>

Sent https://review.gluster.org/c/glusterfs/+/20697 for the test above.


> > ./tests/bugs/snapshot/bug-1482023-snpashot-issue-with-other-processes-accessing-mounted-path.t
> > ./tests/basic/ec/ec-5-2.t
> >
> > 6) Tests that are addressed or are not occurring anymore are,
> >
> > ./tests/bugs/glusterd/rebalance-operations-in-single-node.t
> > ./tests/bugs/index/bug-1559004-EMLINK-handling.t
> > ./tests/bugs/replicate/bug-1386188-sbrain-fav-child.t
> > ./tests/bugs/replicate/bug-1433571-undo-pending-only-on-up-bricks.t
> > ./tests/bitrot/bug-1373520.t
> > ./tests/bugs/distribute/bug-1117851.t
> > ./tests/bugs/glusterd/quorum-validation.t
> > ./tests/bugs/distribute/bug-1042725.t
> > ./tests/bugs/replicate/bug-1586020-mark-dirty-for-entry-txn-on-quorum-failure.t
> > ./tests/bugs/quota/bug-1293601.t
> > ./tests/bugs/bug-1368312.t
> > ./tests/bugs/distribute/bug-1122443.t
> > ./tests/bugs/core/bug-1432542-mpx-restart-crash.t
> >
> > Shyam (and Atin)
> >
> > On 08/05/2018 06:24 PM, Shyam Ranganathan wrote:
> >> Health on master as of the last nightly run [4] is still the same.
> >>
> >> Potential patches that rectify the situation (as in [1]) are bunched in
> >> a patch [2] that Atin and myself have put through several regressions
> >> (mux, normal and line coverage) and these have also not passed.
> >>
> >> Till we rectify the situation we are locking down master branch commit
> >> rights to the following people, Amar, Atin, Shyam, Vijay.
> >>
> >> The intention is to stabilize master and not add more patches that may
> >> destabilize it.
> >>
> >> Test cases that are tracked as failures and need action are present here
> >> [3].
> >>
> >> @Nigel, request you to apply the commit rights change as you see this
> >> mail and let the list know regarding the same as well.
> >>
> >> Thanks,
> >> Shyam
> >>
> >> [1] Patches that address regression failures:
> >> https://review.gluster.org/#/q/starredby:srangana%2540redhat.com
> >>
> >> [2] Bunched up patch against which regressions were run:
> >> https://review.gluster.org/#/c/20637
> >>
> 

Re: [Gluster-devel] [Gluster-Maintainers] Master branch lock down status (Wed, August 08th)

2018-08-09 Thread Pranith Kumar Karampuri
On Thu, Aug 9, 2018 at 6:34 AM Shyam Ranganathan wrote:

> Today's patch set 7 [1], included fixes provided till last evening IST,
> and its runs can be seen here [2] (yay! we can link to comments in
> gerrit now).
>
> New failures: (added to the spreadsheet)
> ./tests/bugs/protocol/bug-808400-repl.t (core dumped)
> ./tests/bugs/quick-read/bug-846240.t
>
> Older tests that had not recurred, but failed today: (moved up in the
> spreadsheet)
> ./tests/bugs/replicate/bug-1433571-undo-pending-only-on-up-bricks.t
> ./tests/bugs/index/bug-1559004-EMLINK-handling.t
>

The above test is timing out. I had to increase the timeout while adding
the .t so that it has time to create enough links to hit the maximum in
ext4. Will re-check if it is the same issue and get back.


>
> Other issues;
> Test ./tests/basic/ec/ec-5-2.t core dumped again
> Few geo-rep failures, Kotresh should have more logs to look at with
> these runs
> Test ./tests/bugs/glusterd/quorum-validation.t dumped core again
>
> Atin/Amar, we may need to merge some of the patches that have proven to
> be holding up and fixing issues today, so that we do not leave
> everything to the last. Check and move them along or lmk.
>
> Shyam
>
> [1] Patch set 7: https://review.gluster.org/c/glusterfs/+/20637/7
> [2] Runs against patch set 7 and its status (incomplete as some runs
> have not completed):
>
> https://review.gluster.org/c/glusterfs/+/20637/7#message-37bc68ce6f2157f2947da6fd03b361ab1b0d1a77
> (also updated in the spreadsheet)
>
> On 08/07/2018 07:37 PM, Shyam Ranganathan wrote:
> > Deserves a new beginning, threads on the other mail have gone deep
> > enough.
> >
> > NOTE: (5) below needs your attention, rest is just process and data on
> > how to find failures.
> >
> > 1) We are running the tests using the patch [2].
> >
> > 2) Run details are extracted into a separate sheet in [3] named "Run
> > Failures" use a search to find a failing test and the corresponding run
> > that it failed in.
> >
> > 3) Patches that are fixing issues can be found here [1], if you think
> > you have a patch out there, that is not in this list, shout out.
> >
> > 4) If you own up a test case failure, update the spreadsheet [3] with
> > your name against the test, and also update other details as needed (as
> > comments, as edit rights to the sheet are restricted).
> >
> > 5) Current test failures
> > We still have the following tests failing and some without any RCA or
> > attention, (If something is incorrect, write back).
> >
> > ./tests/bugs/replicate/bug-1290965-detect-bitrotten-objects.t (needs
> > attention)
> > ./tests/00-geo-rep/georep-basic-dr-tarssh.t (Kotresh)
> > ./tests/bugs/glusterd/add-brick-and-validate-replicated-volume-options.t
> > (Atin)
> > ./tests/bugs/ec/bug-1236065.t (Ashish)
> > ./tests/00-geo-rep/georep-basic-dr-rsync.t (Kotresh)
> > ./tests/basic/ec/ec-1468261.t (needs attention)
> > ./tests/basic/afr/add-brick-self-heal.t (needs attention)
> > ./tests/basic/afr/granular-esh/replace-brick.t (needs attention)
> > ./tests/bugs/core/multiplex-limit-issue-151.t (needs attention)
> > ./tests/bugs/glusterd/validating-server-quorum.t (Atin)
> > ./tests/bugs/replicate/bug-1363721.t (Ravi)
> >
> > Here are some newer failures, but mostly one-off failures except cores
> > in ec-5-2.t. All of the following need attention as these are new.
> >
> > ./tests/00-geo-rep/00-georep-verify-setup.t
> > ./tests/basic/afr/gfid-mismatch-resolution-with-fav-child-policy.t
> > ./tests/basic/stats-dump.t
> > ./tests/bugs/bug-1110262.t
> > ./tests/bugs/glusterd/mgmt-handshake-and-volume-sync-post-glusterd-restart.t
> > ./tests/basic/ec/ec-data-heal.t
> > ./tests/bugs/replicate/bug-1448804-check-quorum-type-values.t
> > ./tests/bugs/snapshot/bug-1482023-snpashot-issue-with-other-processes-accessing-mounted-path.t
> > ./tests/basic/ec/ec-5-2.t
> >
> > 6) Tests that are addressed or are not occurring anymore are,
> >
> > ./tests/bugs/glusterd/rebalance-operations-in-single-node.t
> > ./tests/bugs/index/bug-1559004-EMLINK-handling.t
> > ./tests/bugs/replicate/bug-1386188-sbrain-fav-child.t
> > ./tests/bugs/replicate/bug-1433571-undo-pending-only-on-up-bricks.t
> > ./tests/bitrot/bug-1373520.t
> > ./tests/bugs/distribute/bug-1117851.t
> > ./tests/bugs/glusterd/quorum-validation.t
> > ./tests/bugs/distribute/bug-1042725.t
> > ./tests/bugs/replicate/bug-1586020-mark-dirty-for-entry-txn-on-quorum-failure.t
> > ./tests/bugs/quota/bug-1293601.t
> > ./tests/bugs/bug-1368312.t
> > ./tests/bugs/distribute/bug-1122443.t
> > ./tests/bugs/core/bug-1432542-mpx-restart-crash.t
> >
> > Shyam (and Atin)
> >
> > On 08/05/2018 06:24 PM, Shyam Ranganathan wrote:
> >> Health on master as of the last nightly run [4] is still the same.
> >>
> >> Potential patches that rectify the situation (as in [1]) are bunched in
> >> a patch [2] that Atin and myself have put through several regressions
> >> (mux, normal and line coverage) and these have also not passed.
> >>
> >> Till we rectify the 

Re: [Gluster-devel] [Gluster-Maintainers] Master branch lock down status

2018-08-09 Thread Pranith Kumar Karampuri
On Wed, Aug 8, 2018 at 5:08 AM Shyam Ranganathan wrote:

> Deserves a new beginning, threads on the other mail have gone deep enough.
>
> NOTE: (5) below needs your attention, rest is just process and data on
> how to find failures.
>
> 1) We are running the tests using the patch [2].
>
> 2) Run details are extracted into a separate sheet in [3] named "Run
> Failures" use a search to find a failing test and the corresponding run
> that it failed in.
>
> 3) Patches that are fixing issues can be found here [1], if you think
> you have a patch out there, that is not in this list, shout out.
>
> 4) If you own up a test case failure, update the spreadsheet [3] with
> your name against the test, and also update other details as needed (as
> comments, as edit rights to the sheet are restricted).
>
> 5) Current test failures
> We still have the following tests failing and some without any RCA or
> attention, (If something is incorrect, write back).
>
> ./tests/bugs/replicate/bug-1290965-detect-bitrotten-objects.t (needs
> attention)
> ./tests/00-geo-rep/georep-basic-dr-tarssh.t (Kotresh)
> ./tests/bugs/glusterd/add-brick-and-validate-replicated-volume-options.t
> (Atin)
> ./tests/bugs/ec/bug-1236065.t (Ashish)
> ./tests/00-geo-rep/georep-basic-dr-rsync.t (Kotresh)
> ./tests/basic/ec/ec-1468261.t (needs attention)
>

Sent a fix for above @ https://review.gluster.org/#/c/glusterfs/+/20685


> ./tests/basic/afr/add-brick-self-heal.t (needs attention)
> ./tests/basic/afr/granular-esh/replace-brick.t (needs attention)
> ./tests/bugs/core/multiplex-limit-issue-151.t (needs attention)
> ./tests/bugs/glusterd/validating-server-quorum.t (Atin)
> ./tests/bugs/replicate/bug-1363721.t (Ravi)
>
> Here are some newer failures, but mostly one-off failures except cores
> in ec-5-2.t. All of the following need attention as these are new.
>
> ./tests/00-geo-rep/00-georep-verify-setup.t
> ./tests/basic/afr/gfid-mismatch-resolution-with-fav-child-policy.t
> ./tests/basic/stats-dump.t
> ./tests/bugs/bug-1110262.t
>
> ./tests/bugs/glusterd/mgmt-handshake-and-volume-sync-post-glusterd-restart.t
> ./tests/basic/ec/ec-data-heal.t
> ./tests/bugs/replicate/bug-1448804-check-quorum-type-values.t
>
> ./tests/bugs/snapshot/bug-1482023-snpashot-issue-with-other-processes-accessing-mounted-path.t
> ./tests/basic/ec/ec-5-2.t
>
> 6) Tests that are addressed or are not occurring anymore are,
>
> ./tests/bugs/glusterd/rebalance-operations-in-single-node.t
> ./tests/bugs/index/bug-1559004-EMLINK-handling.t
> ./tests/bugs/replicate/bug-1386188-sbrain-fav-child.t
> ./tests/bugs/replicate/bug-1433571-undo-pending-only-on-up-bricks.t
> ./tests/bitrot/bug-1373520.t
> ./tests/bugs/distribute/bug-1117851.t
> ./tests/bugs/glusterd/quorum-validation.t
> ./tests/bugs/distribute/bug-1042725.t
>
> ./tests/bugs/replicate/bug-1586020-mark-dirty-for-entry-txn-on-quorum-failure.t
> ./tests/bugs/quota/bug-1293601.t
> ./tests/bugs/bug-1368312.t
> ./tests/bugs/distribute/bug-1122443.t
> ./tests/bugs/core/bug-1432542-mpx-restart-crash.t
>
> Shyam (and Atin)
>
> On 08/05/2018 06:24 PM, Shyam Ranganathan wrote:
> > Health on master as of the last nightly run [4] is still the same.
> >
> > Potential patches that rectify the situation (as in [1]) are bunched in
> > a patch [2] that Atin and myself have put through several regressions
> > (mux, normal and line coverage) and these have also not passed.
> >
> > Till we rectify the situation we are locking down master branch commit
> > rights to the following people, Amar, Atin, Shyam, Vijay.
> >
> > The intention is to stabilize master and not add more patches that may
> > destabilize it.
> >
> > Test cases that are tracked as failures and need action are present here
> > [3].
> >
> > @Nigel, request you to apply the commit rights change as you see this
> > mail and let the list know regarding the same as well.
> >
> > Thanks,
> > Shyam
> >
> > [1] Patches that address regression failures:
> > https://review.gluster.org/#/q/starredby:srangana%2540redhat.com
> >
> > [2] Bunched up patch against which regressions were run:
> > https://review.gluster.org/#/c/20637
> >
> > [3] Failing tests list:
> > https://docs.google.com/spreadsheets/d/1IF9GhpKah4bto19RQLr0y_Kkw26E_-crKALHSaSjZMQ/edit?usp=sharing
> >
> > [4] Nightly run dashboard: https://build.gluster.org/job/nightly-master/


-- 
Pranith

Re: [Gluster-devel] [Gluster-Maintainers] Master branch lock down status (Wed, August 08th)

2018-08-08 Thread Atin Mukherjee
On Thu, 9 Aug 2018 at 06:34, Shyam Ranganathan wrote:

> Today's patch set 7 [1], included fixes provided till last evening IST,
> and its runs can be seen here [2] (yay! we can link to comments in
> gerrit now).
>
> New failures: (added to the spreadsheet)
> ./tests/bugs/protocol/bug-808400-repl.t (core dumped)
> ./tests/bugs/quick-read/bug-846240.t
>
> Older tests that had not recurred, but failed today: (moved up in the
> spreadsheet)
> ./tests/bugs/replicate/bug-1433571-undo-pending-only-on-up-bricks.t
> ./tests/bugs/index/bug-1559004-EMLINK-handling.t
>
> Other issues:
> Test ./tests/basic/ec/ec-5-2.t core dumped again
> Few geo-rep failures, Kotresh should have more logs to look at with
> these runs
> Test ./tests/bugs/glusterd/quorum-validation.t dumped core again
>
> Atin/Amar, we may need to merge some of the patches that have proven to
> be holding up and fixing issues today, so that we do not leave
> everything to the last. Check and move them along or let me know.


Ack. I’ll be merging those patches.


>
> Shyam
>
> [1] Patch set 7: https://review.gluster.org/c/glusterfs/+/20637/7
> [2] Runs against patch set 7 and its status (incomplete as some runs
> have not completed):
>
> https://review.gluster.org/c/glusterfs/+/20637/7#message-37bc68ce6f2157f2947da6fd03b361ab1b0d1a77
> (also updated in the spreadsheet)
>
> On 08/07/2018 07:37 PM, Shyam Ranganathan wrote:
> > Deserves a new beginning, threads on the other mail have gone deep
> > enough.
> >
> > NOTE: (5) below needs your attention, rest is just process and data on
> > how to find failures.
> >
> > 1) We are running the tests using the patch [2].
> >
> > 2) Run details are extracted into a separate sheet in [3] named "Run
> > Failures" use a search to find a failing test and the corresponding run
> > that it failed in.
> >
> > 3) Patches that are fixing issues can be found here [1], if you think
> > you have a patch out there, that is not in this list, shout out.
> >
> > 4) If you own up a test case failure, update the spreadsheet [3] with
> > your name against the test, and also update other details as needed (as
> > comments, as edit rights to the sheet are restricted).
> >
> > 5) Current test failures
> > We still have the following tests failing and some without any RCA or
> > attention, (If something is incorrect, write back).
> >
> > ./tests/bugs/replicate/bug-1290965-detect-bitrotten-objects.t (needs
> > attention)
> > ./tests/00-geo-rep/georep-basic-dr-tarssh.t (Kotresh)
> > ./tests/bugs/glusterd/add-brick-and-validate-replicated-volume-options.t
> > (Atin)
> > ./tests/bugs/ec/bug-1236065.t (Ashish)
> > ./tests/00-geo-rep/georep-basic-dr-rsync.t (Kotresh)
> > ./tests/basic/ec/ec-1468261.t (needs attention)
> > ./tests/basic/afr/add-brick-self-heal.t (needs attention)
> > ./tests/basic/afr/granular-esh/replace-brick.t (needs attention)
> > ./tests/bugs/core/multiplex-limit-issue-151.t (needs attention)
> > ./tests/bugs/glusterd/validating-server-quorum.t (Atin)
> > ./tests/bugs/replicate/bug-1363721.t (Ravi)
> >
> > Here are some newer failures, but mostly one-off failures except cores
> > in ec-5-2.t. All of the following need attention as these are new.
> >
> > ./tests/00-geo-rep/00-georep-verify-setup.t
> > ./tests/basic/afr/gfid-mismatch-resolution-with-fav-child-policy.t
> > ./tests/basic/stats-dump.t
> > ./tests/bugs/bug-1110262.t
> > ./tests/bugs/glusterd/mgmt-handshake-and-volume-sync-post-glusterd-restart.t
> > ./tests/basic/ec/ec-data-heal.t
> > ./tests/bugs/replicate/bug-1448804-check-quorum-type-values.t
> > ./tests/bugs/snapshot/bug-1482023-snpashot-issue-with-other-processes-accessing-mounted-path.t
> > ./tests/basic/ec/ec-5-2.t
> >
> > 6) Tests that are addressed or are not occurring anymore are,
> >
> > ./tests/bugs/glusterd/rebalance-operations-in-single-node.t
> > ./tests/bugs/index/bug-1559004-EMLINK-handling.t
> > ./tests/bugs/replicate/bug-1386188-sbrain-fav-child.t
> > ./tests/bugs/replicate/bug-1433571-undo-pending-only-on-up-bricks.t
> > ./tests/bitrot/bug-1373520.t
> > ./tests/bugs/distribute/bug-1117851.t
> > ./tests/bugs/glusterd/quorum-validation.t
> > ./tests/bugs/distribute/bug-1042725.t
> > ./tests/bugs/replicate/bug-1586020-mark-dirty-for-entry-txn-on-quorum-failure.t
> > ./tests/bugs/quota/bug-1293601.t
> > ./tests/bugs/bug-1368312.t
> > ./tests/bugs/distribute/bug-1122443.t
> > ./tests/bugs/core/bug-1432542-mpx-restart-crash.t
> >
> > Shyam (and Atin)
> >
> > On 08/05/2018 06:24 PM, Shyam Ranganathan wrote:
> >> Health on master as of the last nightly run [4] is still the same.
> >>
> >> Potential patches that rectify the situation (as in [1]) are bunched in
> >> a patch [2] that Atin and myself have put through several regressions
> >> (mux, normal and line coverage) and these have also not passed.
> >>
> >> Till we rectify the situation we are locking down master branch commit
> >> rights to the following people, Amar, Atin, Shyam, Vijay.
> >>
> >> The intention is to stabilize master and not 

Re: [Gluster-devel] [Gluster-Maintainers] Master branch lock down status

2018-08-08 Thread Shyam Ranganathan
On 08/08/2018 09:43 AM, Shyam Ranganathan wrote:
> On 08/08/2018 09:41 AM, Kotresh Hiremath Ravishankar wrote:
>> For the geo-rep test retries, could you take this instrumentation patch [1]
>> and give it a run?
>> I have tried three times on the patch, with and without brick mux enabled,
>> but couldn't hit the geo-rep failure. Maybe it is a race that does not
>> happen with the instrumentation patch.
>>
>> [1] https://review.gluster.org/20477
> 
> Will do in my refresh today, thanks.
> 

Kotresh, this run may have the additional logs that you are looking for,
as it is a failed run on one of the geo-rep test cases.

https://build.gluster.org/job/line-coverage/434/consoleFull
19:10:55, 1 test(s) failed
19:10:55, ./tests/00-geo-rep/georep-basic-dr-tarssh.t
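
To pull the failing test names out of a console log like the excerpt
above, something along these lines could work. This is only a sketch: it
assumes the console text has the shape shown here (an optional HH:MM:SS
timestamp, an "N test(s) failed" line, then one ./tests/....t path per
line), and failed_tests is a hypothetical helper, not part of
run-tests.sh or Jenkins.

```python
import re

# Matches an optional leading "HH:MM:SS, " timestamp (assumed format).
TIMESTAMP = re.compile(r"^\d{2}:\d{2}:\d{2},?\s*")
# Matches the summary line that introduces the list of failed tests.
SUMMARY = re.compile(r"^\d+ test\(s\) failed")


def failed_tests(console_text):
    """Return the test paths listed after an "N test(s) failed" line."""
    failed = []
    in_block = False
    for line in console_text.splitlines():
        body = TIMESTAMP.sub("", line).strip()
        if SUMMARY.match(body):
            in_block = True
            continue
        if in_block:
            if body.startswith("./tests/") and body.endswith(".t"):
                failed.append(body)
            else:
                # Any other line ends the list of failed tests.
                in_block = False
    return failed
```

Feeding it the two console lines quoted above should give back the single
georep-basic-dr-tarssh.t path.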


Re: [Gluster-devel] [Gluster-Maintainers] Master branch lock down status

2018-08-08 Thread Shyam Ranganathan
On 08/08/2018 04:56 AM, Nigel Babu wrote:
> Also, Shyam was saying that in case of retries, the old (failure) logs
> get overwritten by the retries which are successful. Can we disable
> re-trying the .ts when they fail just for this lock down period alone so
> that we do have the logs?
> 
> 
> Please don't apply a band-aid. Please fix run-tests.sh so that the second
> run has a -retry attached to the file name or some such, please.

Posted patch https://review.gluster.org/c/glusterfs/+/20682 that
achieves this.

I do not like the fact that I use the gluster CLI in run-tests.sh;
alternatives welcome.

If it looks functionally fine, then I will merge it into the big patch
[1] that we are using to run multiple tests (so that at least we start
getting retry logs from there).

Prior to this I had done this within include.rc and in cleanup, but that
gets invoked twice (at least) per test, and so generated far too many
empty tarballs for no reason.

Also, the change above does not prevent half-complete logs if any test
calls cleanup in between (as that would create a tarball in between that
would be overwritten by the last invocation of cleanup).
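
As a rough illustration of the "-retry" naming idea from this thread (not
the code in the posted patch), the sketch below archives a log directory
under a name derived from the test path and adds a "-retry" suffix on the
second attempt, so the first attempt's tarball is not overwritten. The
function names, the tarball naming scheme and the directory arguments are
all assumptions made for this example, and it assumes at most one retry
per test.

```python
import os
import posixpath
import tarfile


def archive_name(test_path, attempt):
    """Build a tarball name for one run of a test; attempt 1 is the first run."""
    # "./tests/basic/x.t" -> "tests/basic/x"
    base = posixpath.normpath(test_path)
    if base.endswith(".t"):
        base = base[:-2]
    name = base.replace("/", "-")
    # Assumption: only one retry happens, so a single suffix is enough.
    suffix = "" if attempt == 1 else "-retry"
    return name + suffix + ".tgz"


def archive_logs(log_dir, out_dir, test_path, attempt):
    """Tar up log_dir into out_dir and return the path of the new tarball."""
    os.makedirs(out_dir, exist_ok=True)
    target = os.path.join(out_dir, archive_name(test_path, attempt))
    with tarfile.open(target, "w:gz") as tar:
        # Store the logs under the directory's own name inside the tarball.
        tar.add(log_dir, arcname=os.path.basename(os.path.normpath(log_dir)))
    return target
```

Calling archive_logs once per attempt, right after the test finishes,
would leave one tarball for the failed first run and a separate one for
the retry.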

Shyam

[1] big patch: https://review.gluster.org/c/glusterfs/+/20637


Re: [Gluster-devel] [Gluster-Maintainers] Master branch lock down status

2018-08-08 Thread Shyam Ranganathan
On 08/08/2018 09:41 AM, Kotresh Hiremath Ravishankar wrote:
> For the geo-rep test retries, could you take this instrumentation patch [1]
> and give it a run?
> I have tried three times on the patch, with and without brick mux enabled,
> but couldn't hit the geo-rep failure. Maybe it is a race that does not
> happen with the instrumentation patch.
> 
> [1] https://review.gluster.org/20477

Will do in my refresh today, thanks.


Re: [Gluster-devel] [Gluster-Maintainers] Master branch lock down status

2018-08-08 Thread Kotresh Hiremath Ravishankar
Hi Atin/Shyam

For the geo-rep test retries, could you take this instrumentation patch [1]
and give it a run?
I have tried three times on the patch, with and without brick mux enabled,
but couldn't hit the geo-rep failure. Maybe it is a race that does not
happen with the instrumentation patch.

[1] https://review.gluster.org/20477

Thanks,
Kotresh HR


On Wed, Aug 8, 2018 at 4:00 PM, Pranith Kumar Karampuri wrote:

>
>
> On Wed, Aug 8, 2018 at 5:08 AM Shyam Ranganathan wrote:
>
>> Deserves a new beginning, threads on the other mail have gone deep enough.
>>
>> NOTE: (5) below needs your attention, rest is just process and data on
>> how to find failures.
>>
>> 1) We are running the tests using the patch [2].
>>
>> 2) Run details are extracted into a separate sheet in [3] named "Run
>> Failures" use a search to find a failing test and the corresponding run
>> that it failed in.
>>
>> 3) Patches that are fixing issues can be found here [1], if you think
>> you have a patch out there, that is not in this list, shout out.
>>
>> 4) If you own up a test case failure, update the spreadsheet [3] with
>> your name against the test, and also update other details as needed (as
>> comments, as edit rights to the sheet are restricted).
>>
>> 5) Current test failures
>> We still have the following tests failing and some without any RCA or
>> attention, (If something is incorrect, write back).
>>
>> ./tests/bugs/replicate/bug-1290965-detect-bitrotten-objects.t (needs
>> attention)
>> ./tests/00-geo-rep/georep-basic-dr-tarssh.t (Kotresh)
>> ./tests/bugs/glusterd/add-brick-and-validate-replicated-volume-options.t
>> (Atin)
>> ./tests/bugs/ec/bug-1236065.t (Ashish)
>> ./tests/00-geo-rep/georep-basic-dr-rsync.t (Kotresh)
>> ./tests/basic/ec/ec-1468261.t (needs attention)
>> ./tests/basic/afr/add-brick-self-heal.t (needs attention)
>> ./tests/basic/afr/granular-esh/replace-brick.t (needs attention)
>>
>
> Sent https://review.gluster.org/#/c/glusterfs/+/20681 for the failure
> above. Because the test was retried, there were no logs. Entry heal
> succeeded, but the data/metadata heal after that didn't. I found only one
> case that could explain this, based on code reading and the point at which
> the .t failed.
>
>
>> ./tests/bugs/core/multiplex-limit-issue-151.t (needs attention)
>> ./tests/bugs/glusterd/validating-server-quorum.t (Atin)
>> ./tests/bugs/replicate/bug-1363721.t (Ravi)
>>
>> Here are some newer failures, but mostly one-off failures except cores
>> in ec-5-2.t. All of the following need attention as these are new.
>>
>> ./tests/00-geo-rep/00-georep-verify-setup.t
>> ./tests/basic/afr/gfid-mismatch-resolution-with-fav-child-policy.t
>> ./tests/basic/stats-dump.t
>> ./tests/bugs/bug-1110262.t
>> ./tests/bugs/glusterd/mgmt-handshake-and-volume-sync-post-glusterd-restart.t
>> ./tests/basic/ec/ec-data-heal.t
>> ./tests/bugs/replicate/bug-1448804-check-quorum-type-values.t
>> ./tests/bugs/snapshot/bug-1482023-snpashot-issue-with-other-processes-accessing-mounted-path.t
>> ./tests/basic/ec/ec-5-2.t
>>
>> 6) Tests that are addressed or are not occurring anymore are,
>>
>> ./tests/bugs/glusterd/rebalance-operations-in-single-node.t
>> ./tests/bugs/index/bug-1559004-EMLINK-handling.t
>> ./tests/bugs/replicate/bug-1386188-sbrain-fav-child.t
>> ./tests/bugs/replicate/bug-1433571-undo-pending-only-on-up-bricks.t
>> ./tests/bitrot/bug-1373520.t
>> ./tests/bugs/distribute/bug-1117851.t
>> ./tests/bugs/glusterd/quorum-validation.t
>> ./tests/bugs/distribute/bug-1042725.t
>> ./tests/bugs/replicate/bug-1586020-mark-dirty-for-entry-txn-on-quorum-failure.t
>> ./tests/bugs/quota/bug-1293601.t
>> ./tests/bugs/bug-1368312.t
>> ./tests/bugs/distribute/bug-1122443.t
>> ./tests/bugs/core/bug-1432542-mpx-restart-crash.t
>>
>> Shyam (and Atin)
>>
>> On 08/05/2018 06:24 PM, Shyam Ranganathan wrote:
>> > Health on master as of the last nightly run [4] is still the same.
>> >
>> > Potential patches that rectify the situation (as in [1]) are bunched in
>> > a patch [2] that Atin and myself have put through several regressions
>> > (mux, normal and line coverage) and these have also not passed.
>> >
>> > Till we rectify the situation we are locking down master branch commit
>> > rights to the following people, Amar, Atin, Shyam, Vijay.
>> >
>> > The intention is to stabilize master and not add more patches that may
>> > destabilize it.
>> >
>> > Test cases that are tracked as failures and need action are present here
>> > [3].
>> >
>> > @Nigel, request you to apply the commit rights change as you see this
>> > mail and let the list know regarding the same as well.
>> >
>> > Thanks,
>> > Shyam
>> >
>> > [1] Patches that address regression failures:
>> > https://review.gluster.org/#/q/starredby:srangana%2540redhat.com
>> >
>> > [2] Bunched up patch against which regressions were run:
>> > https://review.gluster.org/#/c/20637
>> >
>> > [3] Failing tests list:
>> > https://docs.google.com/spreadsheets/d/1IF9GhpKah4bto19RQLr0y_Kkw26E_-crKALHSaSjZMQ/edit?usp=sharing
>> >
>> > [4] 

Re: [Gluster-devel] [Gluster-Maintainers] Master branch lock down status

2018-08-08 Thread Karthik Subrahmanya
On Wed, Aug 8, 2018 at 2:28 PM Nigel Babu wrote:

>
>
> On Wed, Aug 8, 2018 at 2:00 PM Ravishankar N wrote:
>
>>
>> On 08/08/2018 05:07 AM, Shyam Ranganathan wrote:
>> > 5) Current test failures
>> > We still have the following tests failing and some without any RCA or
>> > attention, (If something is incorrect, write back).
>> >
>> > ./tests/basic/afr/add-brick-self-heal.t (needs attention)
>>  From the runs captured at https://review.gluster.org/#/c/20637/ , I saw
>> that the latest runs where this particular .t failed were at
>> https://build.gluster.org/job/line-coverage/415 and
>> https://build.gluster.org/job/line-coverage/421/.
>> In both of these runs, there are no gluster 'regression' logs available
>> at https://build.gluster.org/job/line-coverage//artifact.
>> I have raised BZ 1613721 for it.
>
>
> We've fixed this for newer runs, but we can do nothing for older runs,
> sadly.
>
Thanks Nigel! I'm also blocked on this. The failures are not reproducible
locally, and without the logs we cannot debug the issue. Will wait for the
new runs to complete.

>
>
>>
>> Also, Shyam was saying that in case of retries, the old (failure) logs
>> get overwritten by the retries which are successful. Can we disable
>> re-trying the .ts when they fail just for this lock down period alone so
>> that we do have the logs?
>
>
> Please don't apply a band-aid. Please fix run-tests.sh so that the second
> run has a -retry attached to the file name or some such, please.

Re: [Gluster-devel] [Gluster-Maintainers] Master branch lock down status

2018-08-08 Thread Nigel Babu
On Wed, Aug 8, 2018 at 2:00 PM Ravishankar N wrote:

>
> On 08/08/2018 05:07 AM, Shyam Ranganathan wrote:
> > 5) Current test failures
> > We still have the following tests failing and some without any RCA or
> > attention, (If something is incorrect, write back).
> >
> > ./tests/basic/afr/add-brick-self-heal.t (needs attention)
>  From the runs captured at https://review.gluster.org/#/c/20637/ , I saw
> that the latest runs where this particular .t failed were at
> https://build.gluster.org/job/line-coverage/415 and
> https://build.gluster.org/job/line-coverage/421/.
> In both of these runs, there are no gluster 'regression' logs available
> at https://build.gluster.org/job/line-coverage//artifact.
> I have raised BZ 1613721 for it.
>

We've fixed this for newer runs, but we can do nothing for older runs,
sadly.


>
> Also, Shyam was saying that in case of retries, the old (failure) logs
> get overwritten by the retries which are successful. Can we disable
> re-trying the .ts when they fail just for this lock down period alone so
> that we do have the logs?


Please don't apply a band-aid. Please fix run-tests.sh so that the second
run has a -retry attached to the file name or some such, please.

Re: [Gluster-devel] [Gluster-Maintainers] Master branch lock down status

2018-08-08 Thread Atin Mukherjee
On Wed, Aug 8, 2018 at 5:08 AM Shyam Ranganathan wrote:

> Deserves a new beginning, threads on the other mail have gone deep enough.
>
> NOTE: (5) below needs your attention, rest is just process and data on
> how to find failures.
>
> 1) We are running the tests using the patch [2].
>
> 2) Run details are extracted into a separate sheet in [3] named "Run
> Failures" use a search to find a failing test and the corresponding run
> that it failed in.
>
> 3) Patches that are fixing issues can be found here [1], if you think
> you have a patch out there, that is not in this list, shout out.
>
> 4) If you own up a test case failure, update the spreadsheet [3] with
> your name against the test, and also update other details as needed (as
> comments, as edit rights to the sheet are restricted).
>
> 5) Current test failures
> We still have the following tests failing and some without any RCA or
> attention, (If something is incorrect, write back).
>
> ./tests/bugs/replicate/bug-1290965-detect-bitrotten-objects.t (needs
> attention)
> ./tests/00-geo-rep/georep-basic-dr-tarssh.t (Kotresh)
> ./tests/bugs/glusterd/add-brick-and-validate-replicated-volume-options.t
> (Atin)
>

This one is fixed through https://review.gluster.org/20651, as I see no
failures of this test in the latest report from patch set 6.

> ./tests/bugs/ec/bug-1236065.t (Ashish)
> ./tests/00-geo-rep/georep-basic-dr-rsync.t (Kotresh)
> ./tests/basic/ec/ec-1468261.t (needs attention)
> ./tests/basic/afr/add-brick-self-heal.t (needs attention)
> ./tests/basic/afr/granular-esh/replace-brick.t (needs attention)
> ./tests/bugs/core/multiplex-limit-issue-151.t (needs attention)
> ./tests/bugs/glusterd/validating-server-quorum.t (Atin)
> ./tests/bugs/replicate/bug-1363721.t (Ravi)
>
> Here are some newer failures, but mostly one-off failures except cores
> in ec-5-2.t. All of the following need attention as these are new.
>
> ./tests/00-geo-rep/00-georep-verify-setup.t
> ./tests/basic/afr/gfid-mismatch-resolution-with-fav-child-policy.t
> ./tests/basic/stats-dump.t
> ./tests/bugs/bug-1110262.t
>
> ./tests/bugs/glusterd/mgmt-handshake-and-volume-sync-post-glusterd-restart.t
>

This failed because of https://review.gluster.org/20584. I believe there's
a timing issue introduced by this patch. As I highlighted in a comment on
https://review.gluster.org/#/c/20637, I'd request you to revert this change
and include https://review.gluster.org/20658.

> ./tests/basic/ec/ec-data-heal.t
> ./tests/bugs/replicate/bug-1448804-check-quorum-type-values.t
>
> ./tests/bugs/snapshot/bug-1482023-snpashot-issue-with-other-processes-accessing-mounted-path.t
> ./tests/basic/ec/ec-5-2.t
>
> 6) Tests that are addressed or are not occurring anymore are,
>
> ./tests/bugs/glusterd/rebalance-operations-in-single-node.t
> ./tests/bugs/index/bug-1559004-EMLINK-handling.t
> ./tests/bugs/replicate/bug-1386188-sbrain-fav-child.t
> ./tests/bugs/replicate/bug-1433571-undo-pending-only-on-up-bricks.t
> ./tests/bitrot/bug-1373520.t
> ./tests/bugs/distribute/bug-1117851.t
> ./tests/bugs/glusterd/quorum-validation.t
> ./tests/bugs/distribute/bug-1042725.t
>
> ./tests/bugs/replicate/bug-1586020-mark-dirty-for-entry-txn-on-quorum-failure.t
> ./tests/bugs/quota/bug-1293601.t
> ./tests/bugs/bug-1368312.t
> ./tests/bugs/distribute/bug-1122443.t
> ./tests/bugs/core/bug-1432542-mpx-restart-crash.t
>
> Shyam (and Atin)
>
> On 08/05/2018 06:24 PM, Shyam Ranganathan wrote:
> > Health on master as of the last nightly run [4] is still the same.
> >
> > Potential patches that rectify the situation (as in [1]) are bunched in
> > a patch [2] that Atin and myself have put through several regressions
> > (mux, normal and line coverage) and these have also not passed.
> >
> > Till we rectify the situation we are locking down master branch commit
> > rights to the following people, Amar, Atin, Shyam, Vijay.
> >
> > The intention is to stabilize master and not add more patches that may
> > destabilize it.
> >
> > Test cases that are tracked as failures and need action are present here
> > [3].
> >
> > @Nigel, request you to apply the commit rights change as you see this
> > mail and let the list know regarding the same as well.
> >
> > Thanks,
> > Shyam
> >
> > [1] Patches that address regression failures:
> > https://review.gluster.org/#/q/starredby:srangana%2540redhat.com
> >
> > [2] Bunched up patch against which regressions were run:
> > https://review.gluster.org/#/c/20637
> >
> > [3] Failing tests list:
> > https://docs.google.com/spreadsheets/d/1IF9GhpKah4bto19RQLr0y_Kkw26E_-crKALHSaSjZMQ/edit?usp=sharing
> >
> > [4] Nightly run dashboard: https://build.gluster.org/job/nightly-master/