Hi Haibin,

Auto scaling is currently not enabled on the MXNet Apache CI; this only happens in my test environment. Thanks for the hint about Scipy, I will definitely look into it!
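(A quick aside, not part of the original thread: a straightforward first check for the scipy hint is to print the exact library versions in both environments and diff the output. A minimal sketch:)

```python
# Print the exact interpreter and library versions in this environment.
# Run in both the CI container and the local test setup and diff the
# output; a scipy/numpy version mismatch is a common source of divergent
# numerical test results.
import importlib
import platform

print("python :", platform.python_version())
for mod_name in ("numpy", "scipy"):
    try:
        mod = importlib.import_module(mod_name)
        print("%-7s: %s" % (mod_name, mod.__version__))
    except ImportError:
        print("%-7s: not installed" % mod_name)
```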
That's a good idea. I have spoken to Steffen in the last few days and we brainstormed some ideas on how to handle test failures. We will let the community know once we have a more detailed plan.

Best regards,
Marco

On Wed, May 9, 2018 at 7:19 PM, Haibin Lin <haibin.lin....@gmail.com> wrote:
> Hi Marco,
>
> Is auto scaling already enabled on the mxnet apache CI, or does this only
> happen on your setup? I see the test is using scipy. Do both environments
> have the same version of scipy installed?
>
> I have recently seen lots of test failures on mxnet master. One thing on my
> wish list is a database that stores all occurrences of test failures and
> their commit ids, which would be very helpful for initially diagnosing which
> code changes potentially introduced bugs. Otherwise, clicking through all
> past test runs and reading those logs requires a lot of manual work.
>
> Best,
> Haibin
>
> On Wed, May 9, 2018 at 5:32 AM, Marco de Abreu <
> marco.g.ab...@googlemail.com> wrote:
>
> > Hello,
> >
> > I'm currently working on auto scaling and encountering a consistent test
> > failure on CPU. At the moment, I'm not really sure what's causing this,
> > considering the setup should be identical.
> >
> > http://jenkins.mxnet-ci-dev.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/ci-master/557/pipeline/694
> >
> > ======================================================================
> > FAIL: test_sparse_operator.test_sparse_mathematical_core
> > ----------------------------------------------------------------------
> > Traceback (most recent call last):
> >   File "/usr/local/lib/python3.5/dist-packages/nose/case.py", line 198, in runTest
> >     self.test(*self.arg)
> >   File "/work/mxnet/tests/python/unittest/common.py", line 157, in test_new
> >     orig_test(*args, **kwargs)
> >   File "/work/mxnet/tests/python/unittest/test_sparse_operator.py", line 1084, in test_sparse_mathematical_core
> >     density=density, ograd_density=ograd_density)
> >   File "/work/mxnet/tests/python/unittest/test_sparse_operator.py", line 1056, in check_mathematical_core
> >     density=density, ograd_density=ograd_density)
> >   File "/work/mxnet/tests/python/unittest/test_sparse_operator.py", line 698, in check_sparse_mathematical_core
> >     assert_almost_equal(arr_grad, input_grad, equal_nan=True)
> >   File "/work/mxnet/python/mxnet/test_utils.py", line 493, in assert_almost_equal
> >     raise AssertionError(msg)
> > AssertionError:
> > Items are not equal:
> > Error nan exceeds tolerance rtol=0.000010, atol=0.000000. Location of
> > maximum error: (0, 0), a=inf, b=-inf
> >  a: array([[inf],
> >        [inf],
> >        [inf],...
> >  b: array([[-inf],
> >        [-inf],
> >        [-inf],...
> > -------------------- >> begin captured stdout << ---------------------
> > pass 0
> > 0.0, 0.0, False
> > --------------------- >> end captured stdout << ----------------------
> > -------------------- >> begin captured logging << --------------------
> > common: INFO: Setting test np/mx/python random seeds, use
> > MXNET_TEST_SEED=2103230797 to reproduce.
> > --------------------- >> end captured logging << ---------------------
> >
> > Does this ring any bells?
> >
> > Thanks in advance!
> >
> > -Marco
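(An illustrative aside, not part of the original thread: the "Error nan" in the assertion message is what a relative-error check produces when the two arrays contain infinities of opposite sign. The sketch below uses a simplified error formula, not MXNet's actual assert_almost_equal implementation, to show the mechanism:)

```python
import numpy as np

# Simplified illustration of why a relative-error comparison reports
# "Error nan" when one side computes +inf and the other -inf:
# inf - (-inf) = inf, and inf / inf = nan.
a = np.array([np.inf])    # e.g. gradient from the operator under test
b = np.array([-np.inf])   # e.g. the reference gradient

with np.errstate(invalid="ignore"):
    abs_err = np.abs(a - b)                      # -> inf
    rel_err = abs_err / (np.abs(a) + np.abs(b))  # inf / inf -> nan

print(abs_err[0])  # inf
print(rel_err[0])  # nan
```

Note that equal_nan=True only makes nan compare equal to nan; infinities of opposite sign still fail. So the underlying question is why one environment produces +inf where the other produces -inf for the same seed, e.g. a sign difference in an overflowing gradient, which could plausibly be sensitive to the scipy/numpy version.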