Copying a comment I made on a flaky test introduced by this PR: https://github.com/apache/incubator-mxnet/issues/11171
" @piiswrong <https://github.com/piiswrong> you introduced this test in this commit [WIP] Do Not Merge. Static memory allocation for cached_op (#10817 <https://github.com/apache/incubator-mxnet/pull/10817>) 2dbd143 <https://github.com/apache/incubator-mxnet/commit/2dbd143e4892bb9ad4aa1835c79f0046603e3531> and it seems to be flaky. I have seen it failing a few times in recent builds. Can you take a look? e.g http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/master/1002/pipeline/ Given all the talk about quality contributions going on the dev mailing list, I am a little unsettled by the fact this PR went undocumented (no design docs or explanation in the PR), unreviewed (1 question ignored), the optimization wasn't tested or benchmarked (it was actually making things slower), and the code was self-merged. Should we enforce that each PR , especially the ones that introduce a significant number of changes be properly documented and reviewed before merging? " My main gripe is that there is literally no description of what the PR is achieving, no design docs to refer to. I spent time benchmarking the different optimization because I thought I did something wrong when I found that it was slower with the optimization. I would have hoped that a significant change like that would have at least been benchmarked. In French we say "Il ne faut pas confondre vitesse et precipitation" which loosely translates to "Haste makes waste". I would advocate to take the time to document PRs properly, benchmark and thoroughly test significant changes. Because down the line, other users end up losing so much more time trying to figure out what is happening when things go wrong (like here). Thanks, Thomas 2018-06-15 14:27 GMT-07:00 Marco de Abreu < [email protected]>: > If it causes issues, I'd like to invite everybody to direct their requests > to Eric since he merged the PR prematurely. The committer who merges a PR > is responsible and can be held liable for any negative impact being the > result of their action [1]. > > [1]: https://www.apache.org/dev/committers.html#committer-responsibilities > > On Fri, Jun 15, 2018 at 2:23 PM Zheng, Da <[email protected]> > wrote: > > > +1 The PR has been merged a while ago, so it has been tested by many > > people. > > Other people's work now depends on this PR. Reverting it at this point > can > > cause a lot of problems for many other people. > > > > Best, > > Da > > > > On 6/15/18, 2:18 PM, "[email protected] on behalf of Tianqi Chen" < > > [email protected] on behalf of [email protected]> wrote: > > > > +1 We would be stuck at local minimums if we just keep reverting > the > > PR > > that brings improvements in the long term > > > > Tianqi > > > > On Fri, Jun 15, 2018 at 2:15 PM, Mu Li <[email protected]> wrote: > > > > > Why reverting instead of fixing the bugs? Static memory aims to > > reduce > > > memory allocation, it's a key feature to bridge the perf gap > between > > gluon > > > and symbol. > > > > > > On Fri, Jun 15, 2018 at 2:06 PM, Marco de Abreu < > > > [email protected]> wrote: > > > > > > > Hello, > > > > > > > > I'm reverting https://github.com/apache/ > incubator-mxnet/pull/10817 > > as of > > > > https://github.com/apache/incubator-mxnet/pull/11311 due to > > regressions > > > > described in > > https://github.com/apache/incubator-mxnet/issues/11171 and > > > > https://github.com/apache/incubator-mxnet/pull/10817. > > > > > > > > The pull request has been self-merged without proper review and > > > introduced > > > > regressions. Committers should act as role models in this project > > and > > > > adhere to software engineer best practices. > > > > > > > > Best regards, > > > > Marco > > > > > > > > > > > > > >
