Re: [Gluster-devel] Release 5: Master branch health report (Week of 30th July)

2018-08-07 Thread Yaniv Kaul
On Tue, Aug 7, 2018, 10:46 PM Shyam Ranganathan wrote:

> On 08/07/2018 02:58 PM, Yaniv Kaul wrote:
> > The intention is to stabilize master and not add more patches that may
> > destabilize it.
> >
> >
> > https://review.gluster.org/#/c/20603/ has been merged.
> > As far as I can see, it has nothing to do with stabilization and should
> > be reverted.
>
> Posted this on the gerrit review as well:
>
> 
> 4.1 does not have nightly tests, those run on master only.
>

That should change of course. We cannot strive for stability otherwise,
AFAIK.


> Stability of master does not (and will not), in the near term, guarantee
> stability of release branches, unless patches that impact code already
> on release branches get fixes on master and are backported.
>
> Release branches get fixes backported (as is normal); this fix and its
> merge should not impact current master stability in any way, nor the
> stability of the 4.1 branch.
> 
>
> The current hold is on master, not on release branches. I agree that
> merging further code changes on release branches (for example geo-rep
> issues that are backported (see [1]), as there are tests that fail
> regularly on master), may further destabilize the release branch. This
> patch is not one of those.
>

Two issues I have with the merge:
1. It just makes comparing master branch to release branch harder. For
example, to understand if there's a test that fails on master but succeeds
on release branch, or vice versa.
2. It means we are not focused on stabilizing master branch.
Y.
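
To make point 1 concrete: comparing the two branches comes down to a set difference between the tests that fail on each. The following is a minimal sketch only. The test names are placeholders rather than real results, and the variable names are hypothetical; note also that 4.1 currently has no nightly runs, so the release-branch set would have to come from somewhere else.

```python
# Illustrative only: the test names below are placeholders, not real results.
master_failures = {"tests/a.t", "tests/b.t", "tests/c.t"}
release_failures = {"tests/b.t", "tests/d.t"}

# Set difference and intersection give the three buckets of interest.
only_on_master = sorted(master_failures - release_failures)
only_on_release = sorted(release_failures - master_failures)
on_both = sorted(master_failures & release_failures)

print("fails only on master: ", only_on_master)
print("fails only on release:", only_on_release)
print("fails on both:        ", on_both)
```

The comparison is only meaningful when both branches run the same set of tests, which is why extra differences between the branches make it harder to read.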


> Merging patches on release branches is allowed by release owners only,
> and the usual practice is to keep the backlog low (merging weekly) in
> these cases, as per the dashboard [1].
>
> Allowing for the above 2 reasons this patch was found,
> - Not on master
> - Not stabilizing or destabilizing the release branch
> and hence was merged.
>
> If maintainers disagree I can revert the same.
>
> Shyam
>
> [1] Release 4.1 dashboard:
>
> https://review.gluster.org/#/projects/glusterfs,dashboards/dashboard:4-1-dashboard
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Release 5: Master branch health report (Week of 30th July)

2018-08-07 Thread Shyam Ranganathan
On 08/07/2018 02:58 PM, Yaniv Kaul wrote:
> The intention is to stabilize master and not add more patches that may
> destabilize it.
> 
> 
> https://review.gluster.org/#/c/20603/ has been merged.
> As far as I can see, it has nothing to do with stabilization and should
> be reverted.

Posted this on the gerrit review as well:


4.1 does not have nightly tests, those run on master only.

Stability of master does not (and will not), in the near term, guarantee
stability of release branches, unless patches that impact code already
on release branches get fixes on master and are backported.

Release branches get fixes backported (as is normal); this fix and its
merge should not impact current master stability in any way, nor the
stability of the 4.1 branch.


The current hold is on master, not on release branches. I agree that
merging further code changes on release branches (for example geo-rep
issues that are backported (see [1]), as there are tests that fail
regularly on master), may further destabilize the release branch. This
patch is not one of those.

Merging patches on release branches is allowed by release owners only,
and the usual practice is to keep the backlog low (merging weekly) in
these cases, as per the dashboard [1].

Allowing for the above 2 reasons this patch was found,
- Not on master
- Not stabilizing or destabilizing the release branch
and hence was merged.

If maintainers disagree I can revert the same.

Shyam

[1] Release 4.1 dashboard:
https://review.gluster.org/#/projects/glusterfs,dashboards/dashboard:4-1-dashboard


Re: [Gluster-devel] Release 5: Master branch health report (Week of 30th July)

2018-08-07 Thread Yaniv Kaul
On Mon, Aug 6, 2018 at 1:24 AM, Shyam Ranganathan wrote:

> On 07/31/2018 07:16 AM, Shyam Ranganathan wrote:
> > On 07/30/2018 03:21 PM, Shyam Ranganathan wrote:
> >> On 07/24/2018 03:12 PM, Shyam Ranganathan wrote:
> >>> 1) master branch health checks (weekly, till branching)
> >>>   - Expect every Monday a status update on various test runs
> >> See https://build.gluster.org/job/nightly-master/ for a report on
> >> various nightly and periodic jobs on master.
> > Thinking aloud, we may have to stop merges to master to get these test
> > failures addressed at the earliest and to continue maintaining them
> > GREEN for the health of the branch.
> >
> > I would give the above a week before we lock down the branch to fix the
> > failures.
> >
> > Let's try and get line-coverage and nightly regression tests addressed
> > this week (leaving mux-regression open), and if addressed not lock the
> > branch down.
> >
>
> Health on master as of the last nightly run [4] is still the same.
>
> Potential patches that rectify the situation (as in [1]) are bunched in
> a patch [2] that Atin and I have put through several regressions
> (mux, normal and line coverage), and these have also not passed.
>
> Till we rectify the situation, we are locking down master branch commit
> rights to the following people: Amar, Atin, Shyam, Vijay.
>
> The intention is to stabilize master and not add more patches that may
> destabilize it.
>

https://review.gluster.org/#/c/20603/ has been merged.
As far as I can see, it has nothing to do with stabilization and should be
reverted.
Y.


>
> Test cases that are tracked as failures and need action are present here
> [3].
>
> @Nigel, request you to apply the commit rights change as you see this
> mail and let the list know regarding the same as well.
>
> Thanks,
> Shyam
>
> [1] Patches that address regression failures:
> https://review.gluster.org/#/q/starredby:srangana%2540redhat.com
>
> [2] Bunched up patch against which regressions were run:
> https://review.gluster.org/#/c/20637
>
> [3] Failing tests list:
> https://docs.google.com/spreadsheets/d/1IF9GhpKah4bto19RQLr0y_Kkw26E_-crKALHSaSjZMQ/edit?usp=sharing
>
> [4] Nightly run dashboard: https://build.gluster.org/job/nightly-master/
>

Re: [Gluster-devel] Release 5: Master branch health report (Week of 30th July)

2018-08-05 Thread Shyam Ranganathan
On 07/31/2018 07:16 AM, Shyam Ranganathan wrote:
> On 07/30/2018 03:21 PM, Shyam Ranganathan wrote:
>> On 07/24/2018 03:12 PM, Shyam Ranganathan wrote:
>>> 1) master branch health checks (weekly, till branching)
>>>   - Expect every Monday a status update on various test runs
>> See https://build.gluster.org/job/nightly-master/ for a report on
>> various nightly and periodic jobs on master.
> Thinking aloud, we may have to stop merges to master to get these test
> failures addressed at the earliest and to continue maintaining them
> GREEN for the health of the branch.
> 
> I would give the above a week before we lock down the branch to fix the
> failures.
> 
> Let's try and get line-coverage and nightly regression tests addressed
> this week (leaving mux-regression open), and if addressed not lock the
> branch down.
> 

Health on master as of the last nightly run [4] is still the same.

Potential patches that rectify the situation (as in [1]) are bunched in
a patch [2] that Atin and I have put through several regressions
(mux, normal and line coverage), and these have also not passed.

Till we rectify the situation, we are locking down master branch commit
rights to the following people: Amar, Atin, Shyam, Vijay.

The intention is to stabilize master and not add more patches that may
destabilize it.

Test cases that are tracked as failures and need action are present here
[3].

@Nigel, request you to apply the commit rights change as you see this
mail and let the list know regarding the same as well.

Thanks,
Shyam

[1] Patches that address regression failures:
https://review.gluster.org/#/q/starredby:srangana%2540redhat.com

[2] Bunched up patch against which regressions were run:
https://review.gluster.org/#/c/20637

[3] Failing tests list:
https://docs.google.com/spreadsheets/d/1IF9GhpKah4bto19RQLr0y_Kkw26E_-crKALHSaSjZMQ/edit?usp=sharing

[4] Nightly run dashboard: https://build.gluster.org/job/nightly-master/


Re: [Gluster-devel] Release 5: Master branch health report (Week of 30th July)

2018-08-01 Thread Shyam Ranganathan
Below is a summary of failures over the last 7 days on the nightly
health check jobs. This is one test per line, sorted in descending order
of occurrence (IOW, most frequent failure is on top).

The list also includes spurious failures, that is, tests that passed on
a retry. This is because if we do not weed out the spurious errors,
failures may persist and make it difficult to gauge the health of the
branch.

The numbers at the end of each test line are the Jenkins job numbers
where the test failed. The job numbers run as follows:
- https://build.gluster.org/job/regression-test-burn-in/ ID: 4048 - 4053
- https://build.gluster.org/job/line-coverage/ ID: 392 - 407
- https://build.gluster.org/job/regression-test-with-multiplex/ ID: 811 - 817

So to get to job 4051 (say), use the link
https://build.gluster.org/job/regression-test-burn-in/4051
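
The mapping from a job number to its link can be sketched in a few lines of Python. This is only an illustration: the names `JOB_RANGES` and `job_url` are made up for this example, and the ID ranges are simply the ones listed above for this week, so they would need updating for any other period.

```python
# Hypothetical helper; the ID ranges are copied from the list in this mail.
JOB_RANGES = {
    "regression-test-burn-in": range(4048, 4054),        # IDs 4048 - 4053
    "line-coverage": range(392, 408),                    # IDs 392 - 407
    "regression-test-with-multiplex": range(811, 818),   # IDs 811 - 817
}

BASE = "https://build.gluster.org/job"


def job_url(job_id):
    """Return the Jenkins URL for job_id, or None if it is outside the listed ranges."""
    for job_name, ids in JOB_RANGES.items():
        if job_id in ids:
            return f"{BASE}/{job_name}/{job_id}"
    return None


print(job_url(4051))
```

Because the three ranges do not overlap, a bare job number is enough to pick the right job. A number outside all three ranges returns None rather than a guessed link.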

Atin has called out some folks for attention to some tests. Consider
this a call out to others: if you see a test against your component,
help with root-causing and fixing it is needed.

tests/bugs/core/bug-1432542-mpx-restart-crash.t, 4049, 4051, 4052, 405, 404, 403, 396, 392

tests/00-geo-rep/georep-basic-dr-tarssh.t, 811, 814, 817, 4050, 4053

tests/bugs/bug-1368312.t, 815, 816, 811, 813, 403

tests/bugs/distribute/bug-1122443.t, 4050, 407, 403, 815, 816

tests/bugs/glusterd/add-brick-and-validate-replicated-volume-options.t, 814, 816, 817, 812, 815

tests/bugs/replicate/bug-1586020-mark-dirty-for-entry-txn-on-quorum-failure.t, 4049, 812, 814, 405, 392

tests/bitrot/bug-1373520.t, 811, 816, 817, 813

tests/bugs/ec/bug-1236065.t, 812, 813, 815

tests/00-geo-rep/georep-basic-dr-rsync.t, 813, 4046

tests/basic/ec/ec-1468261.t, 817, 812

tests/bugs/glusterd/quorum-validation.t, 4049, 407

tests/bugs/quota/bug-1293601.t, 811, 812

tests/basic/afr/add-brick-self-heal.t, 407

tests/basic/afr/granular-esh/replace-brick.t, 392

tests/bugs/core/multiplex-limit-issue-151.t, 405

tests/bugs/distribute/bug-1042725.t, 405

tests/bugs/distribute/bug-1117851.t, 405

tests/bugs/glusterd/rebalance-operations-in-single-node.t, 405

tests/bugs/index/bug-1559004-EMLINK-handling.t, 405

tests/bugs/replicate/bug-1386188-sbrain-fav-child.t, 4048

tests/bugs/replicate/bug-1433571-undo-pending-only-on-up-bricks.t, 813  
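
For anyone wanting to regenerate this kind of ordering from raw "test, job, job, ..." lines, the following is a minimal sketch, not the script used to build the list above. The function names `parse_line` and `rank_failures` are hypothetical, and the sample input is three lines taken from the list.

```python
# Hypothetical sketch: order "test, job, job, ..." lines by failure count.
def parse_line(line):
    """Split 'path.t, 811, 812' into ('path.t', [811, 812])."""
    name, *ids = [part.strip() for part in line.split(",")]
    # Skip empty fields, e.g. from a trailing comma.
    return name, [int(i) for i in ids if i]


def rank_failures(lines):
    """Return (test, job_ids) pairs, most frequently failing test first."""
    parsed = [parse_line(line) for line in lines if line.strip()]
    # sorted() is stable, so tests with equal counts keep their input order.
    return sorted(parsed, key=lambda item: len(item[1]), reverse=True)


sample = [
    "tests/bugs/quota/bug-1293601.t, 811, 812",
    "tests/basic/afr/add-brick-self-heal.t, 407",
    "tests/bugs/ec/bug-1236065.t, 812, 813, 815",
]
for test, ids in rank_failures(sample):
    print(len(ids), test)
```

The sketch counts job numbers per test, so a test that failed in the same job more than once would need its IDs de-duplicated first if that is not the intended measure.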


Thanks,
Shyam


On 07/30/2018 03:21 PM, Shyam Ranganathan wrote:
> On 07/24/2018 03:12 PM, Shyam Ranganathan wrote:
>> 1) master branch health checks (weekly, till branching)
>>   - Expect every Monday a status update on various test runs
> 
> See https://build.gluster.org/job/nightly-master/ for a report on
> various nightly and periodic jobs on master.
> 
> RED:
> 1. Nightly regression (3/6 failed)
> - Tests that reported failure:
> ./tests/00-geo-rep/georep-basic-dr-rsync.t
> ./tests/bugs/core/bug-1432542-mpx-restart-crash.t
> ./tests/bugs/replicate/bug-1586020-mark-dirty-for-entry-txn-on-quorum-failure.t
> ./tests/bugs/distribute/bug-1122443.t
> 
> - Tests that needed a retry:
> ./tests/00-geo-rep/georep-basic-dr-tarssh.t
> ./tests/bugs/glusterd/quorum-validation.t
> 
> 2. Regression with multiplex (cores and test failures)
> 
> 3. line-coverage (cores and test failures)
> - Tests that failed:
> ./tests/bugs/core/bug-1432542-mpx-restart-crash.t (patch
> https://review.gluster.org/20568 does not fix the timeout entirely, as
> can be seen in this run,
> https://build.gluster.org/job/line-coverage/401/consoleFull )
> 
> Calling out to contributors to take a look at various failures, and post
> the same as bugs AND to the lists (so that duplication is avoided) to
> get this to a GREEN status.
> 
> GREEN:
> 1. cpp-check
> 2. RPM builds
> 
> IGNORE (for now):
> 1. clang scan (@nigel, this job requires clang warnings to be fixed to
> go green, right?)
> 
> Shyam
> 


Re: [Gluster-devel] Release 5: Master branch health report (Week of 30th July)

2018-07-31 Thread Shyam Ranganathan
On 07/30/2018 03:21 PM, Shyam Ranganathan wrote:
> On 07/24/2018 03:12 PM, Shyam Ranganathan wrote:
>> 1) master branch health checks (weekly, till branching)
>>   - Expect every Monday a status update on various test runs
> 
> See https://build.gluster.org/job/nightly-master/ for a report on
> various nightly and periodic jobs on master.

Thinking aloud, we may have to stop merges to master to get these test
failures addressed at the earliest and to continue maintaining them
GREEN for the health of the branch.

I would give the above a week before we lock down the branch to fix the
failures.

Let's try and get line-coverage and nightly regression tests addressed
this week (leaving mux-regression open), and if addressed not lock the
branch down.

> 
> RED:
> 1. Nightly regression (3/6 failed)
> - Tests that reported failure:
> ./tests/00-geo-rep/georep-basic-dr-rsync.t
> ./tests/bugs/core/bug-1432542-mpx-restart-crash.t
> ./tests/bugs/replicate/bug-1586020-mark-dirty-for-entry-txn-on-quorum-failure.t
> ./tests/bugs/distribute/bug-1122443.t
> 
> - Tests that needed a retry:
> ./tests/00-geo-rep/georep-basic-dr-tarssh.t
> ./tests/bugs/glusterd/quorum-validation.t
> 
> 2. Regression with multiplex (cores and test failures)
> 
> 3. line-coverage (cores and test failures)
> - Tests that failed:
> ./tests/bugs/core/bug-1432542-mpx-restart-crash.t (patch
> https://review.gluster.org/20568 does not fix the timeout entirely, as
> can be seen in this run,
> https://build.gluster.org/job/line-coverage/401/consoleFull )
> 
> Calling out to contributors to take a look at various failures, and post
> the same as bugs AND to the lists (so that duplication is avoided) to
> get this to a GREEN status.
> 
> GREEN:
> 1. cpp-check
> 2. RPM builds
> 
> IGNORE (for now):
> 1. clang scan (@nigel, this job requires clang warnings to be fixed to
> go green, right?)
> 
> Shyam
> 


Re: [Gluster-devel] Release 5: Master branch health report (Week of 30th July)

2018-07-30 Thread Shyam Ranganathan
On 07/24/2018 03:12 PM, Shyam Ranganathan wrote:
> 1) master branch health checks (weekly, till branching)
>   - Expect every Monday a status update on various test runs

See https://build.gluster.org/job/nightly-master/ for a report on
various nightly and periodic jobs on master.

RED:
1. Nightly regression (3/6 failed)
- Tests that reported failure:
./tests/00-geo-rep/georep-basic-dr-rsync.t
./tests/bugs/core/bug-1432542-mpx-restart-crash.t
./tests/bugs/replicate/bug-1586020-mark-dirty-for-entry-txn-on-quorum-failure.t
./tests/bugs/distribute/bug-1122443.t

- Tests that needed a retry:
./tests/00-geo-rep/georep-basic-dr-tarssh.t
./tests/bugs/glusterd/quorum-validation.t

2. Regression with multiplex (cores and test failures)

3. line-coverage (cores and test failures)
- Tests that failed:
./tests/bugs/core/bug-1432542-mpx-restart-crash.t (patch
https://review.gluster.org/20568 does not fix the timeout entirely, as
can be seen in this run,
https://build.gluster.org/job/line-coverage/401/consoleFull )

Calling out to contributors to take a look at various failures, and post
the same as bugs AND to the lists (so that duplication is avoided) to
get this to a GREEN status.

GREEN:
1. cpp-check
2. RPM builds

IGNORE (for now):
1. clang scan (@nigel, this job requires clang warnings to be fixed to
go green, right?)

Shyam