I don't know of a way to eliminate the flyweight copy of the repository, but you can probably reduce the disc footprint dramatically if you use a reference repository (and a Linux or Unix machine to host it).
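To make the reference repository idea concrete, here is a minimal sketch of the two git invocations involved: a one-time bare clone on the build machine, and a later clone that borrows objects from it. The remote URL and the workspace name are hypothetical placeholders, and the sketch only prints the commands rather than running them (the `run` helper is there if you want to execute them yourself).

```python
import shlex
import subprocess
from typing import List

# Known location of the bare copy on each build machine.
REFERENCE_REPO = "/var/lib/git/our-repo.git"
# Hypothetical remote URL, used only for illustration.
REMOTE_URL = "https://example.com/our-repo.git"


def create_reference_cmd(remote_url: str, reference_repo: str) -> List[str]:
    """One-time setup per build machine: a bare copy in a known location."""
    return ["git", "clone", "--bare", remote_url, reference_repo]


def clone_with_reference_cmd(remote_url: str, reference_repo: str,
                             workspace: str) -> List[str]:
    """Clone that borrows objects from the reference repository where it can."""
    return ["git", "clone", "--reference", reference_repo, remote_url, workspace]


def run(cmd: List[str]) -> None:
    """Execute a command, raising CalledProcessError if git fails."""
    subprocess.run(cmd, check=True)


# Print the commands instead of running them, so the sketch has no side effects.
for cmd in (
    create_reference_cmd(REMOTE_URL, REFERENCE_REPO),
    clone_with_reference_cmd(REMOTE_URL, REFERENCE_REPO, "workspace"),
):
    print(shlex.join(cmd))
```

A reference copy only saves work for objects it already contains, so it is probably worth refreshing it from time to time.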

I have a large git repository (6 GB - unnatural, but it is what we have) and we save lots of time cloning that huge repository by creating a bare repository copy in a known location on each build machine (/var/lib/git/our-repo.git). Then we add the "Additional Behaviour" for "Advanced clone behaviours" and place that directory in the "Path of the reference repository" field. That causes "git clone" (on Linux) to point to the reference repository where it can, rather than creating a whole new copy in the .git directory of the working repository.

You may also be able to reduce your disc footprint if you use a sparse checkout (depending on whether you need the full set of files checked out or not). In our case, sparse checkout works well because our part of the system only needs a small subset of the total files, so we check out only that part using the "Sparse checkout" behaviour. The combination of those two things has dramatically reduced our time to clone and our disc use.

Refer to https://wiki.jenkins-ci.org/display/JENKINS/Building+a+matrix+project#Buildingamatrixproject-Executorsusedbyamulticonfigurationproject for a description of the flyweight task.

If you have a mix of Linux and Windows machines, you can open the Advanced section of the job definition and select "Restrict where this project can be run". The value entered there controls which host will hold the flyweight task's repository. Make that a Linux machine with a reference repository and you'll save time and space on the flyweight repository creation.

Mark Waite

On Thu, May 1, 2014 at 8:57 AM, Scott Evans <[email protected]> wrote:

> We have a bunch of matrix (multi-configuration) build jobs which are
> configured to use a Git plugin to pull source code into the job workspace.
> Unfortunately, what happens is that the parent job pulls the full set of
> code and expands it into the parent workspace, only to then fire off all of
> the child configuration jobs and redo all of the source code pulling from
> Git and populating for the actual builds against the configured targets
> into each of the child job workspaces.
>
> Is there a reason that the parent does the pull for no apparent reason,
> since the parent doesn't actually do anything other than just fire off the
> children and wait for them to be done? If there is no reason, is there any
> way to disable this functionality on the parent build? We have some builds
> which are pulling down 8-10 gig of content for a build, and if there is a
> way to turn that off for the parent, we'd be far ahead with build speed
> performance and disk space. In our environment, the parent build could run
> on any number of different nodes and there isn't a built-in way to clean up
> those old workspaces so we are ending up with several stale parent
> instances as time goes on.
>
> Thanks in advance!
> Scott
>
> --
> You received this message because you are subscribed to the Google Groups
> "Jenkins Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
>

--
Thanks!
Mark Waite
