Hi Chad,

Thank you for the feedback.

> Personally I don't think what you are trying to do is compatible with the conceptual design goals of either git or GoCD. If these various paths are really independent of one another and cannot be looked at within the scope of the wider repo or organised into a "simple" mono repo structure, should they possibly be independent repositories?

Part of our struggle is that we are moving from a TFVC-based monorepo to a set of slightly more segmented Git repos. TFVC allows checking out any arbitrary section of the file tree, and the above materials are direct translations of the existing TFVC materials into git-path materials. (I.e., the App Git repo is an export of TFVC path "$/Project/path/to/App" and the existing materials pull from "$/Project/path/to/App/Trunk" and "$/Project/path/to/App/Documents/Spec".)

In terms of cross-repository code, what I mean is that if I am working with an app that lives at ./Apps/App1, the project for this code is structured in such a way that it assumes someLib exists at ../../Libs/someLib/bin (relative to my App1 directory). This relative path is accurate within the TFVC monorepo, but with the move to Git, someLib has been moved into a separate Git repo. In order to ensure the builds continue to work without having too much impact on the developer workflow, we need to recreate that same relative pathing at checkout time.

Realistically, we probably should work with the development team to come up with some new workflows that are more aligned with Git and code separation principles, but the objective of our current initiative is to modernize our tech stack as cheaply as possible. In practical terms, that means we are primarily looking to recreate our existing workflows using the current version of GoCD (instead of 19.8.0), backed by Git on Azure DevOps rather than TFVC on TFS Server 2013.
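
The relative-path requirement described above can be checked before any pipeline runs. The following is only a minimal sketch: the function names `resolve_from` and `layout_is_preserved` are hypothetical, and the `source/...` destinations are assumed for illustration rather than taken from a real configuration. It resolves the relative reference a project uses (here `../../Libs/someLib/bin`) from the app's checkout destination and checks that it lands inside the library's checkout destination.

```python
import posixpath


def resolve_from(base_dir, relative_ref):
    """Resolve a relative reference the way a build would, starting at base_dir."""
    return posixpath.normpath(posixpath.join(base_dir, relative_ref))


def layout_is_preserved(app_destination, lib_destination, relative_ref):
    """Return True if relative_ref, followed from the app checkout,
    points at (or inside) the library checkout."""
    target = resolve_from(app_destination, relative_ref)
    lib = posixpath.normpath(lib_destination)
    return target == lib or target.startswith(lib + "/")


if __name__ == "__main__":
    # Assumed destinations: App1 and someLib are separate Git repos checked
    # out side by side so that the old monorepo-relative path still works.
    print(resolve_from("source/Apps/App1", "../../Libs/someLib/bin"))
    # → source/Libs/someLib/bin
    print(layout_is_preserved("source/Apps/App1",
                              "source/Libs/someLib",
                              "../../Libs/someLib/bin"))
    # → True
```

A check like this could run over every (app, library) pair in the generated configuration, so that a wrongly chosen `destination` is caught before a build fails on the agent.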

I think you are probably right and the only sane way of tackling this for now is to use no more than one git-path material for any given Git repo on a pipeline. I will have to think a bit on how to sanely name the materials to ensure that:

- When 2 pipelines should share a material, it is easy to give it the same name in both YAML files, and
- There are no naming conflicts between multiple pipelines that need different slices of any given repo.

Thank you again for taking the time to respond. As always, your insights are appreciated.

Cheers,
Jason

On Sunday, 29 December 2024 at 01:38:32 UTC-5 Chad Wilson wrote:

> Sorry, I should have mentioned that I am broadly aware of git sparse-checkout <https://github.blog/open-source/git/bring-your-monorepo-down-to-size-with-sparse-checkout/> (most similar to a git native approach for this) but have not gone into this in detail or evaluated whether it could make sense in the context of something like GoCD - or is really only something that could effectively be used by an end-user.
>
> This git feature was added to git subsequent to most of the rework I did on the GoCD git-path plugin and I haven't looked at whether other build automation tools have support for use of this server side.
>
> -Chad
>
> On Sun, Dec 29, 2024 at 2:32 PM Chad Wilson <ch...@thoughtworks.com> wrote:
>
>> Hiya Jason
>>
>> On Sun, Dec 29, 2024 at 8:46 AM Jason Smyth <jsm...@taqauto.com> wrote:
>>
>>> Hi everyone,
>>>
>>> I am starting with the git-path plugin and I am having trouble understanding how it should be configured to ensure the files end up where I expect them when the agent fetches them.
>>>
>>> I am working with a pipeline that uses the following materials (whittled down to what I understand to be the relevant bits):
>>>
>>> materials:
>>>   App.Trunk:
>>>     plugin_configuration:
>>>       id: "git-path"
>>>     options:
>>>       url: https://dev.azure.com/Org/Project/_git/App
>>>       path: Trunk
>>>     destination: source/Project/App/Trunk
>>>   App.Documents.Spec:
>>>     plugin_configuration:
>>>       id: "git-path"
>>>     options:
>>>       url: https://dev.azure.com/Org/Project/_git/App
>>>       path: Documents/Spec
>>>     destination: source/Project/App/Documents/Spec
>>>
>>> The intention was that the contents of App$/Trunk should be placed in source/Project/App/Trunk and the contents of App$/Documents/Spec should be placed in source/Project/App/Documents/Spec. Instead, the plugin seems to be fetching the entire repo into each of the destinations. Is this the expected behaviour?
>>>
>>
>> Yes, it is the expected behaviour. It clones the entire repo and leaves everything else behind in other paths, at the versions they are current to as of the specific `path` (aka git ref spec). To my knowledge there is no native git way to get part of a file system tree like you'd suggest (as a git ref represents the state of the *entire repo* at a given commit independent of file system knowledge), so the only other alternative for the plugin's implementation would likely be to do some file system level hijinx to remove paths not fetched, which from a practical perspective would likely mean removing ability to use all possible git ref specs (documented here <https://github.com/TWChennai/gocd-git-path-material-plugin?tab=readme-ov-file#constructing-path-expressions>) and instead allowing only simple path prefixes (which would be difficult to validate in its own right without diving into a git ref spec parser).
>>
>> Basically the git path plugin allows you to mitigate excessive triggering and reinterpret up-to-dateness for a subset of a repo (as opposed to the allowlist/denylist approach, which has other problems) - but doesn't introduce some fuller concept of only fetching part of the git repo. The remaining clone is still a fully functional git working directory and repository off a given commit, which it would cease to be if doing some non-git-native hijinx afterwards. This is somewhat discussed at https://github.com/TWChennai/gocd-git-path-material-plugin?tab=readme-ov-file#stale-data-in-agent-repository-clones and is why the language/examples focus on how it monitors for changes.
>>
>> The other problem with this approach here is that if a commit was made that changes contents of both "Trunk" and "Documents/Spec", the independent materials could detect this single commit at different times due to the way material polling works. A triggered build may kick off with only the changes for "Trunk" and the previous ref for "Documents/Spec" (or vice versa). If these paths are not sufficiently independent, modelling as separate materials is likely to hurt rather than help.
>>
>>> If so, are there any guidelines for how to deal with multiple git-path materials that need to poll different paths in a single repo, while ensuring that the relative paths remain intact on the agent at job run time?
>>>
>>> Things that I need to consider:
>>>
>>> - App.Trunk and App.Documents.Spec are likely to be reused across various pipelines, though not necessarily always together.
>>> - We probably do not want to configure a custom git-path material for every existing combination of paths.
>>> - There is a significant amount of cross-repository code, so relative paths both inside and across repositories can be relevant.
>>>   (I.e., for any given file, the right version of that file needs to be downloaded to "./Project/Repo/path/to/file".)
>>>
>>
>> Only sensible option IMHO is to use a single material off the same "wider" repo for both paths (violating your second requirement):
>>
>>     path: "Trunk, Documents/Spec"
>>
>> If you used something non-yaml to generate your config repo contents you could conceptually programmatically generate this.
>>
>> I'm not sure I understand what "cross repository code" means, but you could perhaps consider shifting some of that responsibility to git submodules - although I don't really like them personally due to complexity and the way they change developer flow. I also don't know how effectively submodules work with the git-path plugin specifically.
>>
>>> I'm thinking I will need to pull the git-path materials into a separate location, then copy the relevant files to the expected location in the first (few) task(s) of the job. (E.g., fetch them into ./git-path/<materialName>, then copy "./git-path/App.Trunk/Trunk" to "./source/Project/App/Trunk".) This feels incredibly hacky, though. Are there any cleaner options?
>>>
>>
>> Personally I don't think what you are trying to do is compatible with the conceptual design goals of either git or GoCD. If these various paths are really independent of one another and cannot be looked at within the scope of the wider repo or organised into a "simple" mono repo structure, should they possibly be independent repositories?
>>
>> Otherwise you are losing all the guarantees of GoCD that the materials are at consistent ref versions with one another, etc, and that a git repo at a given ref is a complete representation of the repo at that ref/sha.
>> The git-path plugin already slightly moves away from the GoCD integrity guarantee to "allow" for different subsets of a repo to be considered as independent materials and push more "risk" into the hands of the user - but there's probably a limit to how far you should consider pushing that compromise.
>>
>> But to answer your question, no, there are no cleaner options if trying to slice-and-dice a repository at various repository versions/refs and assemble it back together. Personally I would combine (and historically have combined) paths together if I still felt the plugin was useful enough to use in its current form.
>>
>> -Chad

--
You received this message because you are subscribed to the Google Groups "go-cd" group.
To unsubscribe from this group and stop receiving emails from it, send an email to go-cd+unsubscr...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/go-cd/c23f9c20-a7f3-4bd4-b58c-ba7348c1fa65n%40googlegroups.com.
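
The thread suggests two things that fit together: one combined git-path material per repo (`path: "Trunk, Documents/Spec"`), and generating the config repo contents programmatically. Below is a rough sketch of what such a generator might look like, not a definitive implementation. The helper names `material_name` and `git_path_material` are hypothetical, as is the `_` separator used to join name parts (check it against GoCD's naming rules before relying on it). The `plugin_configuration`, `options`, and `destination` keys mirror the material snippet quoted earlier in the thread.

```python
import json


def material_name(repo, paths):
    """Build a deterministic material name from a repo name and its paths.

    Paths are sorted so that two pipelines asking for the same slice of the
    same repo get the same name regardless of the order they list the paths.
    """
    parts = sorted(f"{repo}.{p.strip('/').replace('/', '.')}" for p in paths)
    return "_".join(parts)  # "_" as separator is an assumption


def git_path_material(url, repo, paths, destination):
    """Return (name, definition) for ONE git-path material covering all paths."""
    definition = {
        "plugin_configuration": {"id": "git-path"},
        "options": {
            "url": url,
            # Comma-separated form, as in the 'path: "Trunk, Documents/Spec"'
            # example from the thread.
            "path": ", ".join(sorted(p.strip("/") for p in paths)),
        },
        "destination": destination,
    }
    return material_name(repo, paths), definition


if __name__ == "__main__":
    name, definition = git_path_material(
        url="https://dev.azure.com/Org/Project/_git/App",
        repo="App",
        paths=["Trunk", "Documents/Spec"],
        destination="source/Project/App",
    )
    # Printed as JSON for inspection only; serialise to whatever format the
    # config repo actually uses.
    print(json.dumps({"materials": {name: definition}}, indent=2))
```

Deriving the name from the sorted path set gives a single-path material the same name as before (`App.Trunk`), while different slices of the same repo get different names. Since the plugin clones the whole repository into the destination (as explained in the thread), a destination of `source/Project/App` would leave `Trunk` at `source/Project/App/Trunk` and `Documents/Spec` at `source/Project/App/Documents/Spec`.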