The time it takes to build and the network traffic would both be reduced.

On Fri, Jan 12, 2018 at 7:27 AM, Marco de Abreu <[email protected]> wrote:
> Okay, so the only disadvantage of deleting the entire workspace is that
> we’ll have to pull the subrepos every single time? If it’s not working
> 100%, shouldn’t we rather stick to the hard approach of deleting the
> workspace than risk a broken slave?
>
> If I understand it right, the only advantage of using your proposed
> approach is the fact that we won’t have to pull sub-repos, right?
>
> -Marco
>
> On Fri, Jan 12, 2018 at 3:19 PM, Chris Olivier <[email protected]> wrote:
>
> > btw i think a manual delete of some sort is still necessary, as we found
> > that git clean (with the proper options) does not work 100% of the time.
> > We found a reproducible situation at the time in which it does not, and
> > it was breaking every build on that machine.
> >
> > On Fri, Jan 12, 2018 at 5:30 AM, Marco de Abreu <[email protected]> wrote:
> >
> > > Seems right to me, but I will have to investigate. I noted it down.
> > >
> > > -Marco
> > >
> > > On 12.01.2018 at 1:21 PM, "Pedro Larroy" <[email protected]> wrote:
> > >
> > > > I think Chris is right, git clean with the right options plus proper
> > > > initialization of the submodules should not make any difference
> > > > versus deleting the entire workspace. Right?
> > > >
> > > > On Fri, Jan 12, 2018 at 8:56 AM, kellen sunderland <[email protected]> wrote:
> > > >
> > > > > Doing a few searches I see that llvm.org <http://apt.llvm.org>
> > > > > doesn't appear to be stable enough for CI. I'm going to write
> > > > > something to hopefully make it a little more stable today, while
> > > > > still allowing those at home to have easily reproducible build
> > > > > steps through docker. What I'd propose is we cache the 15 or so
> > > > > deb packages that get installed with clang in s3 in the CI env.
> > > > > For home users who can't reach the cached s3 bucket we fall back
> > > > > to apt.llvm.org installation.
> > > > > Sound like a reasonable plan, Marco?
> > > > >
> > > > > On Fri, Jan 12, 2018 at 8:21 AM, Marco de Abreu <[email protected]> wrote:
> > > > >
> > > > > > Aah I understand, you're right, we should revisit our decisions.
> > > > > > I'll put it into the backlog so I don't forget it.
> > > > > >
> > > > > > -Marco
> > > > > >
> > > > > > On 12.01.2018 at 2:48 AM, "Chris Olivier" <[email protected]> wrote:
> > > > > >
> > > > > > Yeah, I'm just saying the whole delete was done as a drastic
> > > > > > measure at the time. It may not be necessary to re-pull
> > > > > > everything. Instead of deleting everything, you could delete
> > > > > > everything *except* the .git dir and then checkout the commit
> > > > > > you want, and it'll regenerate the sources from the .git
> > > > > > database.
> > > > > >
> > > > > > This, of course, assuming the .git database is never wrong...
> > > > > > If something goes wrong, you can nuke the whole dir.
> > > > > >
> > > > > > On Thu, Jan 11, 2018 at 5:42 PM, Marco de Abreu <[email protected]> wrote:
> > > > > >
> > > > > > > Exactly
> > > > > > >
> > > > > > > -Marco
> > > > > > >
> > > > > > > On Fri, Jan 12, 2018 at 2:40 AM, Chris Olivier <[email protected]> wrote:
> > > > > > >
> > > > > > > > Actually, this is the commit related to it.
> > > > > > > > https://github.com/cjolivier01/mxnet/commit/573a010879583885a0193e30dc0b8c848d80869b
> > > > > > > >
> > > > > > > > Before, the workspace directory wasn't being deleted. Now it
> > > > > > > > is, correct? Everything under the top directory, right?
> > > > > > > >
> > > > > > > > So a git clone re-pulls everything?
> > > > > > > >
> > > > > > > > On Thu, Jan 11, 2018 at 4:51 PM, Marco de Abreu <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > deleteDir() deletes the content of the current workspace
> > > > > > > > >
> > > > > > > > > Okay, I haven't seen any errors related to lua-package not
> > > > > > > > > being deleted. Do you have a CI-link by any chance?
> > > > > > > > >
> > > > > > > > > -Marco
> > > > > > > > >
> > > > > > > > > On Fri, Jan 12, 2018 at 1:49 AM, Chris Olivier <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > > what is deleteDir() call doing in Jenkinsfile?
> > > > > > > > > > Yes, I mentioned the case where it wasn't getting cleaned.
> > > > > > > > > >
> > > > > > > > > > On Thu, Jan 11, 2018 at 4:41 PM, Marco de Abreu <[email protected]> wrote:
> > > > > > > > > >
> > > > > > > > > > > During git_init: First we're just using git clean, if
> > > > > > > > > > > checkout fails, we're deleting the entire workspace
> > > > > > > > > > > and retrying.
> > > > > > > > > > >
> > > > > > > > > > > During build: First we're using regular make. If build
> > > > > > > > > > > fails, we're using make clean before executing make
> > > > > > > > > > > again.
> > > > > > > > > > >
> > > > > > > > > > > During test: No cleanup happening in case of failure.
> > > > > > > > > > >
> > > > > > > > > > > So far, I haven't noticed any files not being deleted
> > > > > > > > > > > in the workspace. Do you know an example?
> > > > > > > > > > >
> > > > > > > > > > > -Marco
> > > > > > > > > > >
> > > > > > > > > > > On Fri, Jan 12, 2018 at 1:34 AM, Chris Olivier <[email protected]> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > What approach is used now?
> > > > > > > > > > > > I see in the Jenkinsfile that deleteDir() is called
> > > > > > > > > > > > at the top of init_git() and init_git_win(). That
> > > > > > > > > > > > deletes the whole directory, correct?
> > > > > > > > > > > >
> > > > > > > > > > > > Before there were problems with 'git clean -d -f'
> > > > > > > > > > > > *not* deleting some directories which were tracked
> > > > > > > > > > > > on one branch and not on another, which I believe is
> > > > > > > > > > > > why deleteDir() was put there. The directory I
> > > > > > > > > > > > recall was something like lua-package or something
> > > > > > > > > > > > that was in someone's private repo or something like
> > > > > > > > > > > > that...
> > > > > > > > > > > >
> > > > > > > > > > > > On Thu, Jan 11, 2018 at 4:02 PM, Marco de Abreu <[email protected]> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > While it's a quite harsh solution to delete the
> > > > > > > > > > > > > entire workspace, I think that it's a good way.
> > > > > > > > > > > > > Git checkout takes between 2 and 10 seconds, so I
> > > > > > > > > > > > > don't think we need to optimize in that regard.
> > > > > > > > > > > > >
> > > > > > > > > > > > > git clean is our 'soft' approach to clean up.
> > > > > > > > > > > > > Deleting the workspace is the 'hard' approach, so
> > > > > > > > > > > > > this shouldn't be an issue.
> > > > > > > > > > > > >
> > > > > > > > > > > > > But there is one catch: Windows builds are not
> > > > > > > > > > > > > containerized and while we delete the workspace,
> > > > > > > > > > > > > there could still be a lot of files which are not
> > > > > > > > > > > > > being tracked. In future I'd like to have at least
> > > > > > > > > > > > > a file-system-layer in between our tests and the
> > > > > > > > > > > > > host, but we will have to analyze if something
> > > > > > > > > > > > > like this exists. At the moment, we even got tests
> > > > > > > > > > > > > writing to system32.
> > > > > > > > > > > > >
> > > > > > > > > > > > > -Marco
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Fri, Jan 12, 2018 at 12:44 AM, Chris Olivier <[email protected]> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Ok, but still on that note. I remember before
> > > > > > > > > > > > > > that when some problems were being fixed in CI
> > > > > > > > > > > > > > (before your time), they switched to deleting
> > > > > > > > > > > > > > the entire source directory, ".git" subdirectory
> > > > > > > > > > > > > > and all. At the time, the CI was in such a
> > > > > > > > > > > > > > chaotic state that I didn't make an issue of it,
> > > > > > > > > > > > > > but now that it has stabilized (for the most
> > > > > > > > > > > > > > part, today's incident notwithstanding), I think
> > > > > > > > > > > > > > that we may want to revisit it if it is still
> > > > > > > > > > > > > > doing that.
> > > > > > > > > > > > > > You could, for example, just delete everything
> > > > > > > > > > > > > > except the .git directory and then do a
> > > > > > > > > > > > > > 'git reset --hard' to get back a baseline before
> > > > > > > > > > > > > > having to re-download everything every time
> > > > > > > > > > > > > > (also should speed up the builds).
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Note that 'git clean' was not working as it
> > > > > > > > > > > > > > doesn't delete 'unknown' directories, which was
> > > > > > > > > > > > > > the problem.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > WDYT?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Thu, Jan 11, 2018 at 3:26 PM, Marco de Abreu <[email protected]> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > This happens because we just merged the clang
> > > > > > > > > > > > > > > compilation
> > > > > > > > > > > > > > > https://github.com/apache/incubator-mxnet/commit/2b73aac527a3439ec0dc9b1e76c6df09ea347eb1.
> > > > > > > > > > > > > > > This means that clang has to get installed on
> > > > > > > > > > > > > > > all slaves and after some time, the docker
> > > > > > > > > > > > > > > images will be cached. The problem right now
> > > > > > > > > > > > > > > is that their apt-server is unavailable, which
> > > > > > > > > > > > > > > means the initial installation to create the
> > > > > > > > > > > > > > > docker cache doesn't succeed. In future, this
> > > > > > > > > > > > > > > will be cached.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > -Marco
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Thu, Jan 11, 2018 at 11:45 PM, Chris Olivier <[email protected]> wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > do we download all submodules from scratch
> > > > > > > > > > > > > > > > every build? if we do then we should
> > > > > > > > > > > > > > > > probably find a way not to; suggest just
> > > > > > > > > > > > > > > > doing git reset or something like that
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Thu, Jan 11, 2018 at 1:47 PM, Marco de Abreu <[email protected]> wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Hello,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > we're currently experiencing a CI outage
> > > > > > > > > > > > > > > > > caused by http://apt.llvm.org not being
> > > > > > > > > > > > > > > > > reachable.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Best regards,
> > > > > > > > > > > > > > > > > Marco
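
For illustration, the idea Chris raises in the thread (delete everything except the .git directory, regenerate the sources from the local database, and nuke the whole directory if something goes wrong) could be sketched roughly as below. This is only a sketch under stated assumptions, not the project's Jenkinsfile logic: the names `clear_except_git` and `reset_workspace` are hypothetical, the `git` CLI is assumed to be on the PATH, and the wanted commit and the submodule repositories are assumed to be present in the local .git database already.

```python
import shutil
import subprocess
from pathlib import Path


def clear_except_git(workspace: Path) -> None:
    """Delete every top-level entry in the workspace except the .git directory."""
    for entry in workspace.iterdir():
        if entry.name == ".git":
            continue
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()


def reset_workspace(workspace: Path, commit: str) -> None:
    """Rebuild the working tree from the local .git database.

    Hypothetical helper: on any failure the whole workspace is removed
    (the 'hard' approach) and the error is re-raised so the caller can
    fall back to a fresh clone.
    """
    try:
        clear_except_git(workspace)
        # Restore the tracked files of the wanted commit from the local database.
        subprocess.run(["git", "checkout", "--force", commit],
                       cwd=workspace, check=True)
        # Re-create the submodule working trees; assumed to reuse the
        # repositories kept under .git/modules when they are present.
        subprocess.run(["git", "submodule", "update", "--init", "--recursive"],
                       cwd=workspace, check=True)
    except (OSError, subprocess.CalledProcessError):
        shutil.rmtree(workspace, ignore_errors=True)
        raise
```

Because untracked directories are removed by the filesystem walk rather than by git clean, a leftover directory such as the lua-package one mentioned above would be deleted regardless of which branch tracked it.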

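
The cleanup policy Marco describes (try the 'soft' path first, and only on failure apply the 'hard' cleanup and retry) follows a simple pattern that could be sketched as below. This is a generic illustration, not code from the Jenkinsfile: `run_with_hard_fallback` and its parameters are hypothetical names, and the step and cleanup callables stand in for things like a checkout after git clean and a full workspace delete, or make and make clean.

```python
from typing import Callable


def run_with_hard_fallback(step: Callable[[], None],
                           hard_cleanup: Callable[[], None]) -> bool:
    """Run step once; if it raises, run hard_cleanup and retry step once.

    Returns True when the hard fallback was needed, False otherwise.
    A failure of the retry propagates to the caller.
    """
    try:
        step()
        return False
    except Exception:
        # Soft path failed: apply the drastic cleanup, then try again.
        hard_cleanup()
        step()
        return True
```

A caller could log the return value to track how often the soft path fails on a given machine, which is the kind of evidence the thread asks for when deciding whether the hard delete is still needed.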