I have looked into this a bit, and it seems the open source version in https://github.com/apache/incubator-mxnet-ci is older than what is already deployed. The root cause of the failure in the update job seems to be a hardcoded AMI that is no longer available. There now seems to be a way to query for the latest Windows AMI: https://aws.amazon.com/blogs/mt/query-for-the-latest-windows-ami-using-systems-manager-parameter-store/
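
To make the idea concrete, here is a rough sketch of how the hardcoded AMI could be replaced by a Parameter Store lookup. This is not what the deployed job does. The helper names are hypothetical, and the parameter name is an assumption about which Windows image the CI would want to track. The boto3 client is passed in so the lookup can be tried without AWS access.

```python
# Sketch only: look up the current Windows AMI ID from the public
# Systems Manager parameters instead of hardcoding an AMI.
# DEFAULT_PARAMETER is an assumption about which image the CI should use.
DEFAULT_PARAMETER = (
    "/aws/service/ami-windows-latest/Windows_Server-2019-English-Full-Base"
)


def latest_windows_ami(ssm_client, parameter_name=DEFAULT_PARAMETER):
    """Return the AMI ID stored under the given SSM parameter.

    ssm_client is expected to behave like boto3.client("ssm").
    """
    response = ssm_client.get_parameter(Name=parameter_name)
    return response["Parameter"]["Value"]


def main():
    # boto3 is third-party and needs AWS credentials and a region configured.
    import boto3

    print(latest_windows_ami(boto3.client("ssm")))
```

Calling main() prints the AMI ID for the region the client is configured for. AMI IDs differ per region, so the lookup would have to run in the same region as the automation that builds the base image.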

On Mon, Dec 30, 2019 at 3:12 PM Pedro Larroy <pedro.larroy.li...@gmail.com> wrote:

> It's automated but broken as the execution is in failed state. I think we
> will need an engineer to do repairs there.
>
> It's using systems manager automation to produce these AMIs.
>
> On Mon, Dec 30, 2019 at 1:44 PM Lausen, Leonard <lau...@amazon.com.invalid>
> wrote:
>
>> Some more background:
>>
>> Since a few days, CI downloads and installs a more recent cmake version
>> in the Windows job based on
>>
>> https://github.com/leezu/mxnet/blob/230ceee5d9e0e02e58be69dad1c4ffdadbaa1bd9/ci/build_windows.py#L148-L153
>>
>> This ad-hoc download and installation is not ideal and in fact a
>> workaround until the base Windows AMI used by the CI server is updated.
>> The script generating the base Windows AMI is tracked at
>> https://github.com/apache/incubator-mxnet-ci and Shiwen Hu recently
>> updated the script to include the updated cmake version:
>> https://github.com/apache/incubator-mxnet-ci/pull/17
>>
>> It seems that this change needs to be deployed manually, which Pedro is
>> attempting to do. But if I understand correctly Pedro found the public
>> version of the AMI generation script and some currently used script
>> diverged: http://ix.io/25WQ
>>
>> Questions:
>> 1) Is there a git history associated with the version of the script that
>> diverged?
>>
>> 2) According to
>> https://github.com/apache/incubator-mxnet-ci/tree/master/services/jenkins-slave-creation-windows
>> the Windows Base AMI should be created automatically. Why is it not done
>> automatically anymore / why does the documentation claim it happens
>> automatically but it doesn't?
>>
>> On Mon, 2019-12-30 at 12:11 -0800, Pedro Larroy wrote:
>> > Hi
>> >
>> > I was looking at a request from Leonard for updating CMake on windows,
>> > and I see that the post-install.py script which setups the windows
>> > environment in CI has diverged significantly from the
>> > incubator-mxnet-ci and the private repository that is used to deploy
>> > to production CI.
>> >
>> > https://github.com/apache/incubator-mxnet/pull/17031
>> >
>> > I see quite some patch of differences, there's also different
>> > directory structure which Marco committed to incubator-mxnet-ci and
>> > MKL seems to be removed. My question why has this diverged so much, I
>> > was expecting to transplant just a single patch to update CMake.
>> >
>> > http://ix.io/25WQ
>> >
>> > Pedro.