ThomasDelteil commented on issue #11120: Address already in use during tutorial 
test
URL: 
https://github.com/apache/incubator-mxnet/issues/11120#issuecomment-400972980
 
 
   To reproduce the setup, I suggest looking at the Jenkins function for
   the tutorial tests.
   
   I had put a fix in my last PR before the tutorial tests were removed from
   CI, so the issue might be gone already. Can someone start a few hundred
   runs of the tutorial tests and see if it still happens? Note that each run
   takes ~25 min, so that could take a few days.
   
   Actually, commenting out all but three very fast tests might be a better
   idea, since the issue isn't related to a specific test and a simple test
   runs in 2-3 s including the Jupyter kernel overhead. To find out which
   ones are fast, check the tutorials: some do very little, like the NDArray
   ones.
   
   My current best guess is that the issue is related to the fact that the
   ports used by Jupyter's internal mechanism are chosen randomly, and that
   there is a linger=1000 hard-coded somewhere in the Jupyter code that keeps
   a port in use for 1 s. For every test there is a ~1/10000 chance that the
   same port will be reused (3 ports are picked between 1-100000), which
   makes it ~1/300 because we have 30 tests, and ~1/150 per CI run because we
   run on both python2 and python3. That seems roughly consistent with the
   number of reports we've had, about once every 150 CI runs.
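
As a rough check of that arithmetic, here is a minimal sketch. It takes the
guessed ~1/10000 per-test chance at face value and assumes the tests are
independent; the function name is made up for illustration.

```python
def collision_probability(per_test_chance, tests_per_run, interpreters):
    """Chance that at least one test in a CI run hits a port collision.

    Assumes every test is an independent trial with the same chance.
    """
    total_tests = tests_per_run * interpreters
    return 1.0 - (1.0 - per_test_chance) ** total_tests


# Figures from the guess above: ~1/10000 per test, 30 tests,
# run once on python2 and once on python3.
p_run = collision_probability(1 / 10000, 30, 2)
print(f"about 1 in {1 / p_run:.0f} CI runs")
```

Under those assumptions this comes out at roughly 1 in 167, the same order
of magnitude as the ~1/150 estimate.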
   
   There is no easy way to set the ports to fixed, deterministic values. My
   latest fix added a non-ideal 1.1 s sleep between tests. Let's see if that
   fixed it. The above explanation might be bogus too.
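
For illustration only, the idea behind the sleep can be sketched as below.
This is not the code from the PR; `run_with_pause` and its parameters are
hypothetical names, and the pause is simply chosen to be a little longer
than the suspected 1 s linger.

```python
import time

PAUSE_SECONDS = 1.1  # slightly longer than the suspected 1 s linger


def run_with_pause(tests, pause=PAUSE_SECONDS, sleep=time.sleep):
    """Run zero-argument callables in order, pausing between them.

    The pause gives a port held by the previous test time to be released
    before the next test picks its own ports.
    """
    results = []
    for index, test in enumerate(tests):
        if index > 0:
            sleep(pause)  # no pause is needed before the first test
        results.append(test())
    return results
```

The `sleep` argument is only there so the helper can be exercised without
actually waiting; in normal use it defaults to `time.sleep`.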
   
   I'm on my phone on a plane and can assist more from Friday onwards.
   
   Thanks for looking into it @reminisce <https://github.com/reminisce> and
   @access2rohit <https://github.com/access2rohit>!
   
   
   On Wed, Jun 27, 2018, 20:45 Anirudh Subramanian <[email protected]>
   wrote:
   
   > assigned to @reminisce <https://github.com/reminisce> @access2rohit
   > <https://github.com/access2rohit> is working on this.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
