On Tue, Jan 19, 2021 at 7:54 AM Jay Conrod <jaycon...@google.com> wrote:
> By the way, I'm not sure if you're already doing this, but if you can > batch all of the `go list` runs (or go/packages > <https://goto.google.com/packages>.Load) together per module (passing an > argument per package), that will be much faster than loading individual > packages with separate `go list` calls. It will save `go list` from having > to load the module graph and common sets of dependencies multiple times. > > If you need to load all the packages in a module, you can also use an > argument like ./... from the root directory of the module to load > everything there (excluding modules in subdirectories). > Yeah, I am trying that but tripping on the "does not contain package" error. > > On Tue, Jan 19, 2021 at 10:02 AM Jay Conrod <jaycon...@google.com> wrote: > >> > Interesting - is the difference the absolute paths vs relative? >> >> It looks like the bug has to do with whether the directory is below the >> main module root directory or not. If it is, the go command takes a path >> that assumes it's part of the main module, which it's not. >> >> > I hoped maybe `-modfile` would do the same trick, but alas not: >> >> -modfile lets you change the effective content of go.mod but not the >> module root directory. Unfortunately it doesn't look like that can be used >> to work around the issue. >> >> > It seems that is because the "main" (top-level dir) go.mod has >> > `replace` directives with relative paths, which kubernetes really >> > does. >> >> You may need to copy those over to the tmp go.mod and adjust the paths. >> Sorry this has gotten pretty involved. >> >> > Yeah, I noticed. When GO111MODULE=off, everything I am doing is much >> > faster. I'm wary of depending on that forever, though. >> >> Module-aware mode is quite a bit more complicated than GOPATH mode, so to >> some degree it's not surprising it's slower... it's surprising that it's a >> LOT slower though. I expect there's some optimization work for us to do in >> the next development cycle. >> >> We would eventually like to deprecate GOPATH mode though, so it's a good >> idea not to depend on it in new tooling today. 'go list' should be fine to >> get package dependency info in either module mode or GOPATH mode. >> go/packages <https://goto.google.com/packages> is useful if you need >> additional information on top of that (parsed syntax trees, type info). >> >> > I want to run a slow codegen process only if the packages it depends >> > on have ACTUALLY changed (mtime is a good enough proxy) and I don't >> > know a priori which packages need codegen. I want to scan the file >> > tree, find the files that need codegen, check their deps, and only >> > then run the codegen. >> >> How much dependency info do you need? If the codegen is only within >> packages with files that have changed, 'go list' might be overkill (it >> always loads dependencies, even if they aren't printed). If you need >> dependencies or reverse dependencies, 'go list' or go/packages >> <https://goto.google.com/packages> are probably the right tools. >> >> On Fri, Jan 15, 2021 at 6:43 PM Tim Hockin <thoc...@google.com> wrote: >> >>> On Fri, Jan 15, 2021 at 2:17 PM Jay Conrod <jaycon...@google.com> wrote: >>> > >>> > I was initially going to suggest adding the module subdirectories as >>> requirements to the main go.mod, then replacing those with the >>> subdirectories. >>> > >>> > module example.com/m >>> > >>> > go 1.15 >>> > >>> > require ( >>> > example.com/other1 v0.0.0 >>> > example.com/other2 v0.0.0 >>> > example.com/m/submod v0.0.0 >>> > ) >>> > >>> > replace ( >>> > example.com/other1 => ./staging/src/example.com/other1 >>> > example.com/other2 => ./staging/src/example.com/other2 >>> > example.com/m/submod v0.0.0 => ./submod >>> > ) >>> > >>> > >>> > I think you might have tried this already. It gives the same "main >>> module ... does not contain package" error. I believe that's a bug. I've >>> opened #43733 to track it. >>> >>> Interesting. If that's a bug, then maybe I'll be able to do what I >>> need once fixed. >>> >>> > In general, it should be possible to give 'go list' an absolute or >>> relative path (starting with ./ or ../) to any directory containing a >>> package which is part of any module in the build list. For example, some >>> tools list directories in the module cache to find out what package a .go >>> file belongs to. >>> > >>> > As a workaround, you could put a go.mod in an otherwise empty >>> directory (in /tmp or something), then require the relevant modules from >>> the repo and replace them with absolute paths. Then you can run 'go list' >>> in that directory with absolute paths of package directories. >>> >>> Interesting - is the difference the absolute paths vs relative? >>> >>> I hoped maybe `-modfile` would do the same trick, but alas not: >>> >>> ``` >>> $ (cd /tmp/gomodhack/; go list /tmp/go-list-modules/submod/used/) >>> example.com/m/submod/used >>> >>> $ go list --modfile /tmp/gomodhack/go.mod >>> /tmp/go-list-modules/submod/used/ >>> main module (tmp) does not contain package tmp/submod/used >>> ``` >>> >>> It also fails some cases: >>> >>> ``` >>> (cd /tmp/gomodhack/; go list /tmp/go-list-modules/submod/used/) >>> example.com/m/submod/used >>> thockin@thockin-glaptop4 go-list-modules main /$ (cd /tmp/gomodhack/; >>> go list /tmp/go-list-modules/staging/src/example.com/other1/used/) >>> go: finding module for package >>> example.com/m/staging/src/example.com/other1/used >>> cannot find module providing package >>> example.com/m/staging/src/example.com/other1/used: unrecognized import >>> path "example.com/m/staging/src/example.com/other1/used": reading >>> https://example.com/m/staging/src/example.com/other1/used?go-get=1: >>> 404 Not Found >>> ``` >>> >>> It seems that is because the "main" (top-level dir) go.mod has >>> `replace` directives with relative paths, which kubernetes really >>> does. >>> >>> > Incidentally, golang.org/x/tools/go/packages will call 'go list' >>> under the hood in module mode. go/build <https://goto.google.com/build> >>> might do the same, depending on how it's invoked. 'go list' may be the best >>> thing to use if it gives the information you need. >>> >>> Yeah, I noticed. When GO111MODULE=off, everything I am doing is much >>> faster. I'm wary of depending on that forever, though. >>> >>> Stepping back, I fear I am pushing the square peg into a round hole. >>> Let me restate what I am trying to do. >>> >>> I want to run a slow codegen process only if the packages it depends >>> on have ACTUALLY changed (mtime is a good enough proxy) and I don't >>> know a priori which packages need codegen. I want to scan the file >>> tree, find the files that need codegen, check their deps, and only >>> then run the codegen. >>> >>> We do this today with `go list` and GO111MODULE=off, but I was advised >>> at some point that x/tools/go/packages was the future-safe approach. >>> >>> If there's a better way, I am all ears. >>> >>> Tim >>> GO111MODULE=off >>> > On Fri, Jan 15, 2021 at 11:59 AM 'Tim Hockin' via golang-nuts < >>> golang-nuts@googlegroups.com> wrote: >>> >> >>> >> Hi. This isn't exactly burning urgent, but it is a long-term issue >>> >> for Kubernetes. If there's anything I can do to catalyze the >>> >> discussion - tests to run, info to dig up, etc - please let me know. >>> >> >>> >> On Wed, Dec 23, 2020 at 10:48 AM Tim Hockin <thoc...@google.com> >>> wrote: >>> >> > >>> >> > Hi Paul! >>> >> > >>> >> > On Wed, Dec 23, 2020 at 4:23 AM Paul Jolly <p...@myitcv.io> wrote: >>> >> > > >>> >> > > > I just can't figure out how to do this. Maybe it can't be done >>> in `go >>> >> > > > list` ? Or maybe we're just missing some detail of go modules.. >>> >> > > >>> >> > > go list operates in the context of a single module (in the mode >>> you >>> >> > > are interested in), so you cannot do this with a single command >>> across >>> >> > > multiple modules. >>> >> > >>> >> > This might be a real problem for us. For this post I am reducing it >>> >> > to `go list`, but in actuality we have a small program that we wrote >>> >> > which does what we need in terms of `go/build`. It works great when >>> >> > `GO111MODULE=off` but is more than 100x slower normally. I thought >>> it >>> >> > was finally time to rewrite it in terms of `go/packages` and get rid >>> >> > of GO111MODULE=off. That didn't pan out, hence this post. >>> >> > >>> >> > More inline and below >>> >> > >>> >> > > > First I do a `find` for any file that has a specific comment >>> tag, >>> >> > > > indicating that the package needs codegen. The results span >>> several >>> >> > > > of the in-repo submodules. >>> >> > > >>> >> > > Just to check, I'm assuming the results of this find command are >>> being >>> >> > > translated to a list of packages? Because the transitive >>> dependencies >>> >> > > of a list of packages within a module can be done via a single go >>> list >>> >> > > command. >>> >> > >>> >> > The trick is "within a module". I'll update >>> >> > https://github.com/thockin/go-list-modules to reflect the process >>> >> > more. I've added a >>> >> > get_codegen_deps.sh that models the behavior. Note that I really >>> want >>> >> > files, not packages, so I can express the dep-graph. >>> >> > >>> >> > What do you mean by "translated to a list of packages" - which >>> specific syntax? >>> >> > >>> >> > What I end up with is something like `go list ./path/to/dir1 >>> >> > ./path/to/dir2 ./path/to/dir3`. Any of those dirs might be in >>> >> > different modules. So `go list` tells me "main module ( >>> example.com/m) >>> >> > does not contain package example.com/m/path/to/dir1" and so on. >>> >> > Setting `GO111MODULE=off` does work, but I fear the future of that. >>> >> > >>> >> > > > For each target package, I want to get the list of all deps and >>> >> > > > extract the GoFiles. Then I can use that to determine if the >>> codegen >>> >> > > > needs to run. >>> >> > > >>> >> > > FWIW I wrote a tool to do just this: >>> >> > > >>> https://pkg.go.dev/myitcv.io@v0.0.0-20201125173645-a7167afc9e13/cmd/gogenerate >>> >> > > which might work in your situation. >>> >> > >>> >> > I will take a look - it seems I will need to restructure a bunch of >>> >> > tooling to prove it works for us or doesn't :) >>> >> > >>> >> > > > Where it breaks down is that I can't seem to `go list` all at >>> once: >>> >> > > > >>> >> > > > ``` >>> >> > > > # This works within the "root" module >>> >> > > > $ go list -f '{{.GoFiles}}' ./subdir >>> >> > > > [file.go] >>> >> > > >>> >> > > This will work. >>> >> > > >>> >> > > > # This does not work across modules >>> >> > > > $ go list -f '{{.GoFiles}}' ./submod/used ./submod/unused >>> >> > > > main module (example.com/m) does not contain package >>> example.com/m/submod/used >>> >> > > > main module (example.com/m) does not contain package >>> example.com/m/submod/unused >>> >> > > >>> >> > > Per above, this will not work across module boundaries. >>> >> > >>> >> > It works with `GO111MODULE=off` which means that introducing modules >>> >> > is a breaking change. Can I depend on GO111MODULE=off to work the >>> >> > same way forever? >>> >> > >>> >> > > > # Nor does this work, even with module replacements >>> >> > > > $ go list -f '{{.GoFiles}}' ./staging/src/ >>> example.com/other1/used >>> >> > > > ./staging/src/example.com/other1/unused >>> >> > > > main module (example.com/m) does not contain package >>> >> > > > example.com/m/staging/src/example.com/other1/used >>> >> > > > main module (example.com/m) does not contain package >>> >> > > > example.com/m/staging/src/example.com/other1/unused >>> >> > > > ``` >>> >> > > >>> >> > > With replace directives in place this should work, but you won't >>> be >>> >> > > able to use the relative path to the modules (which is in fact >>> >> > > interpreted as a directory): it will need to be the full >>> >> > > module/package path. >>> >> > >>> >> > Given a "./path/to/pkg" - how do I convert that to a module/package >>> >> > path? I can run `(cd $dir && go list -m)` but that is super slow. >>> >> > Running JUST that for each directory that needs codegen in >>> kubernetes >>> >> > takes 20+ seconds. Is there a better way, short of writing my own >>> >> > directory-climb and parsing go.mod? >>> >> > >>> >> > > > I can run `go list` multiple times, but that's INCREDIBLY slow >>> - most >>> >> > > > of these submodules have common deps that are large. This >>> re-parses >>> >> > > > everything over and over. It takes almost 60 seconds just to >>> do `cd >>> >> > > > $dir; go list` (on the real kubernetes repo). >>> >> > > >>> >> > > Do you have a repro of this taking 60 seconds? Because that really >>> >> > > shouldn't be the case with a populated local module cache. >>> >> > >>> >> > github.com/kubernetes/kubernetes >>> >> > >>> >> > ``` >>> >> > $ time \ >>> >> > find . -type f -name \*.go \ >>> >> > | xargs grep -l "^// *+k8s:" \ >>> >> > | xargs -n 1 dirname \ >>> >> > | sort \ >>> >> > | uniq \ >>> >> > | while read X; do \ >>> >> > (cd $X; go list -f '{{.Deps}}'); \ >>> >> > done \ >>> >> > > /dev/null >>> >> > >>> >> > real 0m50.488s >>> >> > user 0m46.686s >>> >> > sys 0m18.416s >>> >> > ``` >>> >> > >>> >> > Just running that inner `go list` with GO111MODULE=off cuts the run >>> >> > time in half. >>> >> > >>> >> > Compare to: >>> >> > >>> >> > ``` >>> >> > time \ >>> >> > ( \ >>> >> > export GO111MODULE=off; \ >>> >> > find . -type f -name \*.go \ >>> >> > | xargs grep -l "^// *+k8s:" \ >>> >> > | xargs -n 1 dirname \ >>> >> > | sort \ >>> >> > | uniq \ >>> >> > | xargs go list -e -f '{{.Deps}}' \ >>> >> > ) \ >>> >> > > /dev/null >>> >> > >>> >> > real 0m1.323s >>> >> > user 0m1.174s >>> >> > sys 0m0.567s >>> >> > ``` >>> >> > >>> >> > The model repo doesn't show so significantly because it is small. >>> >> > Kubernetes is not small. >>> >> > >>> >> > I'm happy to hear better approaches - I really don't like relying on >>> >> > GO111MODULE=off forever - it seems like the sort of thing that will >>> >> > eventually get removed. >>> >> >>> >> -- >>> >> You received this message because you are subscribed to the Google >>> Groups "golang-nuts" group. >>> >> To unsubscribe from this group and stop receiving emails from it, >>> send an email to golang-nuts+unsubscr...@googlegroups.com. >>> >> To view this discussion on the web visit >>> https://groups.google.com/d/msgid/golang-nuts/CAO_Rewa6rMW79iBHj2Jz6HfJ-tCFLFhNAhYiwDh%3DNy6M35Y91Q%40mail.gmail.com >>> . >>> >> -- You received this message because you are subscribed to the Google Groups "golang-nuts" group. To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/CAO_RewZX_gJDChjpJ9PgWqJdayP5KntHGj%3D%2BABjk_xa01%3D70qw%40mail.gmail.com.