Re: [pkg-go] Bug#811565: [uscan] git mode: allow for scanning repositories without tags
Thanks for your work on this!

I gave the attached patch a shot, and it seems to work fine for me. I tried it on two packages:

“upspin” (untagged git, non-GitHub host, Files-Excluded in debian/copyright):

    # uscan(1) configuration file.
    version=4
    opts="mode=git,pgpmode=none" \
    https://upspin.googlesource.com/upspin \
    HEAD debian uupdate

“golang-github-coreos-go-oidc” (typical Go package on GitHub):

    # uscan(1) configuration file.
    version=4
    opts="mode=git,pgpmode=none" \
    https://github.com/coreos/go-oidc.git \
    HEAD debian uupdate

So, from my end, this looks good. I hope these examples can serve as test cases? Let me know if you need anything else.

On Sat, Jan 13, 2018 at 8:32 PM, Osamu Aoki wrote:
> Hi,
>
> On Thu, Jan 11, 2018 at 09:56:01AM +0100, Michael Stapelberg wrote:
> > Happy new year!
> >
> > Osamu, any update on this? We’re still really interested.
>
> In progress ... partially working
>
> I need to clean up codes ;-)
> I also need couple test cases to test these out.
>
> git served from github, git over WEBDAV, ...
>
> Latest local diff is attached.
>
> Osamu

--
Best regards,
Michael
___
Pkg-go-maintainers mailing list
Pkg-go-maintainers@lists.alioth.debian.org
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/pkg-go-maintainers
Re: [pkg-go] Bug#811565: [uscan] git mode: allow for scanning repositories without tags
Happy new year!

Osamu, any update on this? We’re still really interested.

On Fri, Oct 13, 2017 at 7:06 PM, Michael Stapelberg wrote:
> Almost a month has passed. What’s the current status? The pkg-go group
> is still eagerly waiting for this. If there is any way we can help,
> please let us know.
>
> On Mon, Sep 18, 2017 at 3:04 AM, Osamu Aoki wrote:
> > Hi,
> >
> > On Mon, Sep 18, 2017 at 01:20:22PM +0800, Shengjing Zhu wrote:
> >> Hi,
> >>
> >> I want to know if it's still possible to support GitHub's commit via
> >> different way, rather than do a `git clone`.
> >>
> >> I find GitHub has RSS feed for the commit, the url is like
> >> https://github.com/Debian/dh-make-golang/commits/master.atom
> >> So uscan can parse that xml feed, and get the commit data, id, and
> >> finally forms the download link like
> >> https://github.com/Debian/dh-make-golang/archive/71736daa55a06e466cdcc6c0347f5b9489471fe3.tar.gz,
> >> and the version is 0.0~git.
> >> Besides, the RSS feed url is not an API url, and will not have the API
> >> rate problem.
> >
> > Good.
> >
> >> That's simpler than doing `git clone` locally, though the shortage is
> >> site specific.
> >> BTW, gitlab also provides such feed, the url format is like
> >> https://gitlab.com/inkscape/inkscape/commits/master?format=atom
> >
> > I have been busy with other higher priority issues with devscripts
> > lately.
> >
> > There are two types of git archives. I don't want to make unorganized
> > addition. Please let me have some time on this. It is living in my
> > private git now.
> >
> > Osamu
>
> --
> Best regards,
> Michael

--
Best regards,
Michael
Re: [pkg-go] Bug#811565: [uscan] git mode: allow for scanning repositories without tags
Almost a month has passed. What’s the current status? The pkg-go group is still eagerly waiting for this. If there is any way we can help, please let us know.

On Mon, Sep 18, 2017 at 3:04 AM, Osamu Aoki wrote:
> Hi,
>
> On Mon, Sep 18, 2017 at 01:20:22PM +0800, Shengjing Zhu wrote:
>> Hi,
>>
>> I want to know if it's still possible to support GitHub's commit via
>> different way, rather than do a `git clone`.
>>
>> I find GitHub has RSS feed for the commit, the url is like
>> https://github.com/Debian/dh-make-golang/commits/master.atom
>> So uscan can parse that xml feed, and get the commit data, id, and
>> finally forms the download link like
>> https://github.com/Debian/dh-make-golang/archive/71736daa55a06e466cdcc6c0347f5b9489471fe3.tar.gz,
>> and the version is 0.0~git.
>> Besides, the RSS feed url is not an API url, and will not have the API
>> rate problem.
>
> Good.
>
>> That's simpler than doing `git clone` locally, though the shortage is
>> site specific.
>> BTW, gitlab also provides such feed, the url format is like
>> https://gitlab.com/inkscape/inkscape/commits/master?format=atom
>
> I have been busy with other higher priority issues with devscripts
> lately.
>
> There are two types of git archives. I don't want to make unorganized
> addition. Please let me have some time on this. It is living in my
> private git now.
>
> Osamu

--
Best regards,
Michael
Re: [pkg-go] Bug#811565: [uscan] git mode: allow for scanning repositories without tags
Hi,

I want to know if it's still possible to support GitHub commits in a different way, rather than doing a `git clone`.

I found that GitHub has an RSS feed for commits; the URL looks like
https://github.com/Debian/dh-make-golang/commits/master.atom
So uscan can parse that XML feed, get the commit data and id, and finally form a download link like
https://github.com/Debian/dh-make-golang/archive/71736daa55a06e466cdcc6c0347f5b9489471fe3.tar.gz,
and the version is 0.0~git.
Besides, the RSS feed URL is not an API URL, so it will not run into the API rate limit.

That's simpler than doing a `git clone` locally, though the drawback is that it is site-specific.
BTW, GitLab also provides such a feed; the URL format is like
https://gitlab.com/inkscape/inkscape/commits/master?format=atom

Thanks
Shengjing Zhu
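The feed-based approach described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample feed is hand-written (its timestamps are invented, and the `tag:github.com,2008:Grit::Commit/<sha>` id layout is assumed), the first entry is assumed to be the newest commit, and the version layout borrows the `0.0~git%cd.%h` idea proposed elsewhere in this thread.

```python
import xml.etree.ElementTree as ET

ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Hand-written sample for illustration; timestamps are invented and a real
# commits feed carries many more fields per entry.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>tag:github.com,2008:Grit::Commit/71736daa55a06e466cdcc6c0347f5b9489471fe3</id>
    <updated>2017-09-15T08:30:00Z</updated>
  </entry>
  <entry>
    <id>tag:github.com,2008:Grit::Commit/0123456789abcdef0123456789abcdef01234567</id>
    <updated>2017-09-01T10:00:00Z</updated>
  </entry>
</feed>
"""


def latest_commit(feed_xml):
    """Return (sha, yyyymmdd) for the first entry, assumed to be the newest."""
    root = ET.fromstring(feed_xml)
    entry = root.find("atom:entry", ATOM_NS)
    if entry is None:
        raise ValueError("feed has no entries")
    # Assumption: the commit id is the last path component of <id>.
    sha = entry.findtext("atom:id", namespaces=ATOM_NS).rsplit("/", 1)[-1]
    updated = entry.findtext("atom:updated", namespaces=ATOM_NS)
    return sha, updated[:10].replace("-", "")


def archive_url(owner, repo, sha):
    """Build the tarball link in the form quoted in the message above."""
    return f"https://github.com/{owner}/{repo}/archive/{sha}.tar.gz"


def version(sha, date):
    """Assumed version layout: 0.0~git<date>.<7-character hash>."""
    return f"0.0~git{date}.{sha[:7]}"
```

Note that the feed's `<updated>` value is used here as a stand-in for the commit date; whether it always matches the committer date is not something this sketch verifies.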
Re: [pkg-go] Bug#811565: [uscan] git mode: allow for scanning repositories without tags
On Wed, Aug 9, 2017 at 6:54 PM, Shengjing Zhu wrote:
> Thanks for the comment.
>
> On Wed, Aug 9, 2017 at 10:54 PM, Michael Stapelberg wrote:
> > 1. I think that infrastructure which the pkg-go team critically and very
> > visibly depends on should eventually be hosted by DSA under debian.org. I
> > don’t see them hosting this special “workaround” service, when there already
> > is infrastructure in place to run uscan.
>
> Well it can be hosted by DSA, or even don't use web service. Maybe
> uscan can just call its cli tool.

As much as I’d like to see more Go code within Debian, I think it might be best to stick with Perl for uscan :).

> I do hope someone can implement it in perl and bring it to uscan. But
> it's hard for me to hack 4k lines perl.

I can understand that, and I’m not asking you to work on uscan; Osamu already seems to be on that.

> Anyway, it's an exploration for using API rather than `git clone` locally.
> And I intend to get it to support more Git services, maybe gopkg.in,
> gitlab, etc.
>
> PS, gopkg.in will point to some specific branch, and
> github.com///tags doesn't work well even I append a
> '?after=' suffix.
>
> > 2. I have concerns regarding the scalability of such a service if we
> > actually adopted this approach: the GitHub quota permits 5000 requests per
> > hour (when authenticated). This sounds like a lot at first glance, but
> > consider that we already have 845 Go packages. Your code does 4 requests per
> > repository (IIUC), so already we are fairly close to reaching the limit, if
> > we don’t take any precautions.
>
> I haven't considered rate-limit, but do we check so frequently indeed?

I don’t actually know what the rate of uscan checks is behind the Debian Package Tracker. I can imagine that other places do run uscan, too, though (think Ubuntu, or other Debian derivatives). In fact, I’m working on a dashboard myself which does run uscan fairly frequently.

--
Best regards,
Michael
Re: [pkg-go] Bug#811565: [uscan] git mode: allow for scanning repositories without tags
Thanks for the comment.

On Wed, Aug 9, 2017 at 10:54 PM, Michael Stapelberg wrote:
> 1. I think that infrastructure which the pkg-go team critically and very
> visibly depends on should eventually be hosted by DSA under debian.org. I
> don’t see them hosting this special “workaround” service, when there already
> is infrastructure in place to run uscan.

Well, it could be hosted by DSA, or it need not be a web service at all. Maybe uscan can just call its CLI tool.

I do hope someone can implement it in Perl and bring it to uscan. But it's hard for me to hack on 4k lines of Perl.

Anyway, it's an exploration of using the API rather than doing a `git clone` locally. And I intend to make it support more Git services, maybe gopkg.in, GitLab, etc.

PS, gopkg.in will point to some specific branch, and github.com///tags doesn't work well even if I append a '?after=' suffix.

> 2. I have concerns regarding the scalability of such a service if we
> actually adopted this approach: the GitHub quota permits 5000 requests per
> hour (when authenticated). This sounds like a lot at first glance, but
> consider that we already have 845 Go packages. Your code does 4 requests per
> repository (IIUC), so already we are fairly close to reaching the limit, if
> we don’t take any precautions.

I haven't considered the rate limit, but do we really check that frequently?

--
Best regards,
Shengjing Zhu
Re: [pkg-go] Bug#811565: [uscan] git mode: allow for scanning repositories without tags
Thanks for sharing your tool!

I also considered implementing such a tool, but ultimately decided against it for a number of reasons:

1. I think that infrastructure which the pkg-go team critically and very visibly depends on should eventually be hosted by DSA under debian.org. I don’t see them hosting this special “workaround” service, when there already is infrastructure in place to run uscan.

2. I have concerns regarding the scalability of such a service if we actually adopted this approach: the GitHub quota permits 5000 requests per hour (when authenticated). This sounds like a lot at first glance, but consider that we already have 845 Go packages. Your code does 4 requests per repository (IIUC), so already we are fairly close to reaching the limit, if we don’t take any precautions.

Most likely, point ② could be addressed with some careful limiting on our end, and changing the processing model from generating a response upon end-user request to iterating through all Go packages in Debian and querying GitHub in a rate-limited fashion. This significantly complicates the program, though, to the point where we duplicate the logic behind the Debian Package Tracker. Worse, it introduces accidental complexity, not inherent complexity :).

Hence, I think extending uscan is a much much more elegant route to achieve our goal, and I’d like to ask people to hold off providing/using custom services as a stop-gap measure.

Thanks!

On Wed, Aug 9, 2017 at 7:38 AM, Shengjing Zhu wrote:
> Hi all,
>
> I spent some time playing around GitHub api, and results a small tool,
> https://github.com/zhsj/git-watch
>
> I didn't implement it in uscan. But it can work well with uscan.
>
> A demo service is at https://watch.zhsj.me/
>
> Take one of packages I maintained,
> https://tracker.debian.org/golang-github-xiaq-persistent
> https://watch.zhsj.me/github/xiaq/persistent is the watch url.
> And the following d/watch works fine,
>
> version=4
> opts="filenamemangle=s%(?:.*?)?([^/]*)\.tar\.gz%golang-github-xiaq-persistent-$1.tar.gz%" \
> https://watch.zhsj.me/github/xiaq/persistent \
> (?:.*?/)?([^/]*)\.tar\.gz
>
> This tool works both as web service and cli.
>
> I hope you find this tool useful.
>
> Yours,
> Shengjing Zhu

--
Best regards,
Michael
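The quota concern in point 2 comes down to simple arithmetic, sketched below in Python. The three constants are the figures quoted in the message (the per-repository request count is the "IIUC" estimate, not a measured value); the function names are made up for this illustration.

```python
PACKAGES = 845          # Go packages in Debian, as stated in the message
REQUESTS_PER_REPO = 4   # estimated requests per repository ("IIUC")
HOURLY_QUOTA = 5000     # authenticated GitHub quota cited in the message


def requests_per_scan(packages=PACKAGES, per_repo=REQUESTS_PER_REPO):
    """Requests needed to check every package once."""
    return packages * per_repo


def full_scans_per_hour(quota=HOURLY_QUOTA):
    """How many complete passes over all packages fit into one hour."""
    return quota // requests_per_scan()


def min_seconds_between_requests(quota=HOURLY_QUOTA):
    """Spacing that keeps an evenly paced client within the hourly quota."""
    return 3600 / quota
```

With these figures a single pass costs 3380 requests, so only one full pass fits into an hour, and an evenly paced client would have to wait about 0.72 seconds between requests.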
Re: [pkg-go] Bug#811565: [uscan] git mode: allow for scanning repositories without tags
Hi all,

I spent some time playing around with the GitHub API, and the result is a small tool,
https://github.com/zhsj/git-watch

I didn't implement it in uscan, but it works well with uscan.

A demo service is at https://watch.zhsj.me/

Take one of the packages I maintain,
https://tracker.debian.org/golang-github-xiaq-persistent
https://watch.zhsj.me/github/xiaq/persistent is the watch URL.
And the following d/watch works fine,

    version=4
    opts="filenamemangle=s%(?:.*?)?([^/]*)\.tar\.gz%golang-github-xiaq-persistent-$1.tar.gz%" \
    https://watch.zhsj.me/github/xiaq/persistent \
    (?:.*?/)?([^/]*)\.tar\.gz

This tool works both as a web service and as a CLI.

I hope you find this tool useful.

Yours,
Shengjing Zhu
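To see what the filenamemangle rule in the watch file above does, the same substitution can be replayed in Python. This is a sketch, not uscan's own code: the pattern and replacement are transcribed from the watch file (Perl's `$1` becomes `\1`), and the example href is hypothetical, since the links actually served by the watch page are not shown in this thread.

```python
import re

# Transcribed from the filenamemangle rule in the watch file above.
PATTERN = r"(?:.*?)?([^/]*)\.tar\.gz"
REPLACEMENT = r"golang-github-xiaq-persistent-\1.tar.gz"

# Hypothetical href for illustration only.
EXAMPLE_HREF = "https://github.com/xiaq/persistent/archive/0123abc.tar.gz"


def mangle(href):
    """Apply the rule once, like a Perl s/// without the /g flag."""
    return re.sub(PATTERN, REPLACEMENT, href, count=1)
```

Because the pattern is unanchored and the lazy prefix can swallow everything up to the last slash, the whole href is replaced and only the final path component survives as the version-like part of the tarball name.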
Re: [pkg-go] Bug#811565: [uscan] git mode: allow for scanning repositories without tags
On 29/07/17 17:44, Michael Stapelberg wrote:
> Given that we are talking about repositories which do not use tags, we
> could specify --depth=1 when cloning to get a shallow clone, i.e. only
> the latest commit. That saves bandwidth and disk space, but has the
> downside that we cannot do any additional validation, i.e. we can’t
> detect if upstream ever starts using tags; unfortunately, that is a
> plausible scenario, so I would suggest doing a full clone.

As a data point, I wrote a script a while ago to do exactly this locally. I used the shallow clone on a temporary directory:

    backticks("git", "clone", "--quiet", "--bare", "--depth=1", $url, $dest);
    my $commit_data = backticks("git", "--git-dir=$dest", "log", "-1",
        "--date=format:%Y%m%d", "--format=%h %cd");
    chomp($commit_data);
    $commit_data =~ /^([0-9a-z]{7}) ([0-9]{8})$/m or
        die("Invalid git response: $commit_data");
    return ($1, $2);

--
Martín Ferrari (Tincho)
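For readers more comfortable with Python, the same shallow-clone idea might look roughly like the sketch below. It is not a translation of the script above in every detail: the function names are invented, and the parser accepts 7 to 40 hex digits instead of exactly 7, on the assumption that git may print a longer abbreviated hash for large repositories.

```python
import re
import subprocess
import tempfile


def parse_commit_line(line):
    """Split '<abbrev-hash> <YYYYMMDD>' as printed by --format='%h %cd'."""
    match = re.fullmatch(r"([0-9a-f]{7,40}) ([0-9]{8})", line.strip())
    if match is None:
        raise ValueError(f"Invalid git response: {line!r}")
    return match.group(1), match.group(2)


def latest_commit(url):
    """Shallow-clone url into a temporary directory and describe its tip."""
    with tempfile.TemporaryDirectory() as dest:
        # Bare, depth-1 clone: only the latest commit, no working tree.
        subprocess.run(
            ["git", "clone", "--quiet", "--bare", "--depth=1", url, dest],
            check=True,
        )
        out = subprocess.run(
            ["git", f"--git-dir={dest}", "log", "-1",
             "--date=format:%Y%m%d", "--format=%h %cd"],
            check=True, capture_output=True, text=True,
        ).stdout
    return parse_commit_line(out)
```

The temporary directory is removed when the `with` block exits, so nothing is left behind apart from the returned hash and date.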
Re: [pkg-go] Bug#811565: [uscan] git mode: allow for scanning repositories without tags
On Sun, Jul 30, 2017 at 6:10 AM, Osamu Aoki wrote:
> Hi,
>
> (I switched my ISP. No more osamua...@e01.itscom.net Thanks for the
> reminder)
>
> On Sat, Jul 29, 2017 at 06:44:43PM +0200, Michael Stapelberg wrote:
> > Hi Osamu,
> >
> > Sorry for the late reply, and thanks for looking into this! Replies
> > inline:
>
> It's good time to make feature enhancements now.
>
> > Osamu Aoki writes:
> > > How should we explicitly specify such variables, I guess it should be
> > > through "opts=..." such as:
> > >
> > > opts="mode=git, pretty=0.0~git%cd.%h, date=%Y%m%d%H%M"
> >
> > Sounds good.
>
> I had to read the whole thread to recall what I was thinking ... OK ;-)
>
> > > But this "git log" needs to have local clone of git repository.
> > >
> > > I wonder if I can do without cloning first.
> >
> > After reading the git protocol and searching on the web for a little
> > bit, my conclusion is that no, you cannot use “git log” without having a
> > clone of the repository.
> >
> > Given that we are talking about repositories which do not use tags, we
> > could specify --depth=1 when cloning to get a shallow clone, i.e. only
> > the latest commit. That saves bandwidth and disk space, but has the
> > downside that we cannot do any additional validation, i.e. we can’t
> > detect if upstream ever starts using tags; unfortunately, that is a
> > plausible scenario, so I would suggest doing a full clone.
>
> OK with FULL clone. (I need to rethink details though... I totally lost
> my memory on this topic)
>
> The thing to consider is what git local repository looks like and how
> you clone such remote tree. "upstream" branch used by git-buildpackage
> is not really the upstream git repository but its series of commits from
> the released upstream tarballs. Maybe clone it into "upstream-git"
> branch...

Wouldn’t it be cleaner to not modify the local repository at all, i.e. clone in a separate, temporary directory? Aside from a new orig tarball, uscan doesn’t leave files behind usually, does it?

> > For GitHub, we can apply an optimization: the GitHub HTTP API exposes
> > repository details, such as:
> >
> > 1. The default_branch of the repo, in
> >    https://developer.github.com/v3/repos/#get
> >
> > 2. The latest commit of the branch, in
> >    https://developer.github.com/v3/repos/branches/#get-branch
> >
> > For interactive use by individual developers, we could send these HTTP
> > requests unauthenticated. For a setup which does many uscan calls, we’d
> > need to create a GitHub account to get the higher rate limit. See
> > https://godoc.org/github.com/google/go-github/github#hdr-Rate_Limiting
> > for details.
>
> (This optimization is a bit more work than I can do immediately.)

That’s fair. I’m happy to help with a patch for uscan to apply this optimization, once the foundation for it is done.

> > > Adding support to the number of commits is complicated. Let's be happy
> > > to use hash to be unique commit. I do not think we upload more than 2
> > > Debian upstream tarball in a minute.
> >
> > In a day, not in a minute. But regardless, you are probably right. I
> > asked in the pkg-go IRC channel to see whether people are okay with
> > removing that part from the version number, so barring any objections,
> > we can probably get that done within the next few days.
>
> Why in a day?
>
> %cd is committer date and this format respects --date= option.
> --date option I suggested was %Y%m%d%H%M" which specified down to
> minutes;-)
> If you insist, I can add seconds ;-)

Ah, now I see where you’re coming from. We’re currently using day granularity, and don’t want to change that, so we’re restricted to 1 upload per day :).

> > > As for "git describe" like nearest tag feature, it's a interesting
> > > thought but it may make things more complicate. So unless someone
> > > strongly request with patch, I would like to skip it.
> >
> > Agreed; if we get rid of the number of commits, we shouldn’t need git
> > describe, not even in dh-make-golang.
> >
> > It seems like you have a good handle on implementing this in uscan. Do
> > you need any additional details? Do you prefer an external patch from
> > us over implementing this yourself? I’d be happy to give you feedback on
> > a proposed patch or git commit.
>
> OK. I guess this will be a nice project during My Debconf17 travel for
> me.

Sounds great! I can’t make it to this DebConf, but I wish you safe travels and a great conference!

Thanks in advance,

--
Best regards,
Michael
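The day-versus-minute exchange above is easier to follow with concrete strings. The Python sketch below mimics the proposed `pretty=0.0~git%cd.%h` layout with two date formats; the function name, the commit times and the hashes are all invented for illustration, and the 7-character hash is an assumption about how `%h` would be abbreviated.

```python
from datetime import datetime


def git_version(committed, sha, date_format="%Y%m%d", prefix="0.0~git"):
    """Compose <prefix><committer date>.<7-character hash>."""
    return f"{prefix}{committed.strftime(date_format)}.{sha[:7]}"


# Two hypothetical commits made on the same day.
MORNING = (datetime(2017, 7, 29, 9, 0), "f00dfee1234567890")
EVENING = (datetime(2017, 7, 29, 18, 30), "0badc0d1234567890")


def date_part(version_string):
    """Return only the date portion between the prefix and the hash."""
    return version_string[len("0.0~git"):].split(".", 1)[0]
```

With day granularity both commits produce the date portion `20170729`, so only the hash tells them apart and their relative order is no longer encoded; with `%Y%m%d%H%M` the portions become `201707290900` and `201707291830`. That is the sense in which day granularity restricts packaging to one upload per day.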
Re: [pkg-go] Bug#811565: [uscan] git mode: allow for scanning repositories without tags
Hi Osamu,

Sorry for the late reply, and thanks for looking into this! Replies inline:

Osamu Aoki writes:
> How should we explicitly specify such variables, I guess it should be
> through "opts=..." such as:
>
> opts="mode=git, pretty=0.0~git%cd.%h, date=%Y%m%d%H%M"

Sounds good.

> But this "git log" needs to have local clone of git repository.
>
> I wonder if I can do without cloning first.

After reading the git protocol and searching on the web for a little bit, my conclusion is that no, you cannot use “git log” without having a clone of the repository.

Given that we are talking about repositories which do not use tags, we could specify --depth=1 when cloning to get a shallow clone, i.e. only the latest commit. That saves bandwidth and disk space, but has the downside that we cannot do any additional validation, i.e. we can’t detect if upstream ever starts using tags; unfortunately, that is a plausible scenario, so I would suggest doing a full clone.

For GitHub, we can apply an optimization: the GitHub HTTP API exposes repository details, such as:

1. The default_branch of the repo, in
   https://developer.github.com/v3/repos/#get

2. The latest commit of the branch, in
   https://developer.github.com/v3/repos/branches/#get-branch

For interactive use by individual developers, we could send these HTTP requests unauthenticated. For a setup which does many uscan calls, we’d need to create a GitHub account to get the higher rate limit. See https://godoc.org/github.com/google/go-github/github#hdr-Rate_Limiting for details.

> Adding support to the number of commits is complicated. Let's be happy
> to use hash to be unique commit. I do not think we upload more than 2
> Debian upstream tarball in a minute.

In a day, not in a minute. But regardless, you are probably right. I asked in the pkg-go IRC channel to see whether people are okay with removing that part from the version number, so barring any objections, we can probably get that done within the next few days.

> As for "git describe" like nearest tag feature, it's a interesting
> thought but it may make things more complicate. So unless someone
> strongly request with patch, I would like to skip it.

Agreed; if we get rid of the number of commits, we shouldn’t need git describe, not even in dh-make-golang.

It seems like you have a good handle on implementing this in uscan. Do you need any additional details? Do you prefer an external patch from us over implementing this yourself? I’d be happy to give you feedback on a proposed patch or git commit.

Thank you very much!

--
Best regards,
Michael
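The two-request GitHub optimization mentioned in the message above could be sketched as follows in Python. This is an illustration rather than anything uscan does: the function names are invented, the `fetch` parameter exists only so the logic can be exercised without network access, and the response fields used (`default_branch`, `commit.sha`) are the ones the repository and branch endpoints are understood to return.

```python
import json
from urllib.request import Request, urlopen

API = "https://api.github.com"


def fetch_json(url, token=None):
    """GET url and decode the JSON body; token is optional (higher quota)."""
    headers = {}
    if token:
        headers["Authorization"] = f"token {token}"
    with urlopen(Request(url, headers=headers)) as resp:
        return json.load(resp)


def latest_commit(owner, repo, fetch=fetch_json):
    """Return (default branch, sha of its tip) using two API requests."""
    # Request 1: repository details, which include the default branch.
    repo_info = fetch(f"{API}/repos/{owner}/{repo}")
    branch = repo_info["default_branch"]
    # Request 2: branch details, which include the latest commit.
    branch_info = fetch(f"{API}/repos/{owner}/{repo}/branches/{branch}")
    return branch, branch_info["commit"]["sha"]
```

Calling `latest_commit("Debian", "dh-make-golang")` would issue two unauthenticated requests, which suits the interactive use described above; a setup doing many lookups could bind a token with `functools.partial(fetch_json, token=...)` and pass that as `fetch`.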