Re: [pkg-go] Bug#811565: [uscan] git mode: allow for scanning repositories without tags

2018-01-14 Thread Michael Stapelberg
Thanks for your work on this!

I gave the attached patch a shot, and it seems to work fine for me. I tried
it on two packages:

“upspin” (untagged git, non-GitHub host, Files-Excluded in
debian/copyright):

# uscan(1) configuration file.
version=4
opts="mode=git,pgpmode=none" \
https://upspin.googlesource.com/upspin \
HEAD debian uupdate

“golang-github-coreos-go-oidc” (typical Go package on GitHub):

# uscan(1) configuration file.
version=4
opts="mode=git,pgpmode=none" \
https://github.com/coreos/go-oidc.git \
HEAD debian uupdate

So, from my end, this looks good. I hope these examples can serve as test
cases?

Let me know if you need anything else.


On Sat, Jan 13, 2018 at 8:32 PM, Osamu Aoki wrote:

> Hi,
>
> On Thu, Jan 11, 2018 at 09:56:01AM +0100, Michael Stapelberg wrote:
> > Happy new year!
> >
> > Osamu, any update on this? We’re still really interested.
>
> In progress ... partially working.
>
> I need to clean up the code ;-)
> I also need a couple of test cases to try this out:
>
> git served from GitHub, git over WebDAV, ...
>
> Latest local diff is attached.
>
> Osamu
>



-- 
Best regards,
Michael
___
Pkg-go-maintainers mailing list
Pkg-go-maintainers@lists.alioth.debian.org
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/pkg-go-maintainers

Re: [pkg-go] Bug#811565: [uscan] git mode: allow for scanning repositories without tags

2018-01-11 Thread Michael Stapelberg
Happy new year!

Osamu, any update on this? We’re still really interested.

On Fri, Oct 13, 2017 at 7:06 PM, Michael Stapelberg wrote:

> Almost a month has passed. What’s the current status? The pkg-go group
> is still eagerly waiting for this. If there is any way we can help,
> please let us know.
>
> On Mon, Sep 18, 2017 at 3:04 AM, Osamu Aoki wrote:
> > Hi,
> >
> > On Mon, Sep 18, 2017 at 01:20:22PM +0800, Shengjing Zhu wrote:
> >> Hi,
> >>
> >> I would like to know whether it is still possible to support GitHub
> >> commits in a different way, rather than doing a `git clone`.
> >>
> >> I found that GitHub has an RSS feed for commits; the URL looks like
> >> https://github.com/Debian/dh-make-golang/commits/master.atom
> >> So uscan can parse that XML feed, get the commit data and id, and
> >> finally form a download link like
> >> https://github.com/Debian/dh-make-golang/archive/71736daa55a06e466cdcc6c0347f5b9489471fe3.tar.gz,
> >> and the version is 0.0~git.
> >> Besides, the RSS feed URL is not an API URL, so it is not subject to
> >> the API rate limit.
> >
> > Good.
> >
> >> That's simpler than doing `git clone` locally, though the drawback is
> >> that it is site-specific.
> >> BTW, GitLab also provides such a feed; the URL format is like
> >> https://gitlab.com/inkscape/inkscape/commits/master?format=atom
> >
> > I have been busy with other, higher-priority devscripts issues
> > lately.
> >
> > There are two types of git archives.  I don't want to make a
> > disorganized addition.  Please give me some time on this.  It lives in
> > my private git repository for now.
> >
> > Osamu
>
>
>
> --
> Best regards,
> Michael
>



-- 
Best regards,
Michael

Re: [pkg-go] Bug#811565: [uscan] git mode: allow for scanning repositories without tags

2017-10-13 Thread Michael Stapelberg
Almost a month has passed. What’s the current status? The pkg-go group
is still eagerly waiting for this. If there is any way we can help,
please let us know.

On Mon, Sep 18, 2017 at 3:04 AM, Osamu Aoki wrote:
> Hi,
>
> On Mon, Sep 18, 2017 at 01:20:22PM +0800, Shengjing Zhu wrote:
>> Hi,
>>
>> I would like to know whether it is still possible to support GitHub
>> commits in a different way, rather than doing a `git clone`.
>>
>> I found that GitHub has an RSS feed for commits; the URL looks like
>> https://github.com/Debian/dh-make-golang/commits/master.atom
>> So uscan can parse that XML feed, get the commit data and id, and
>> finally form a download link like
>> https://github.com/Debian/dh-make-golang/archive/71736daa55a06e466cdcc6c0347f5b9489471fe3.tar.gz,
>> and the version is 0.0~git.
>> Besides, the RSS feed URL is not an API URL, so it is not subject to
>> the API rate limit.
>
> Good.
>
>> That's simpler than doing `git clone` locally, though the drawback is
>> that it is site-specific.
>> BTW, GitLab also provides such a feed; the URL format is like
>> https://gitlab.com/inkscape/inkscape/commits/master?format=atom
>
> I have been busy with other, higher-priority devscripts issues
> lately.
>
> There are two types of git archives.  I don't want to make a
> disorganized addition.  Please give me some time on this.  It lives in
> my private git repository for now.
>
> Osamu



-- 
Best regards,
Michael


Re: [pkg-go] Bug#811565: [uscan] git mode: allow for scanning repositories without tags

2017-09-17 Thread Shengjing Zhu
Hi,

I would like to know whether it is still possible to support GitHub
commits in a different way, rather than doing a `git clone`.

I found that GitHub has an RSS feed for commits; the URL looks like
https://github.com/Debian/dh-make-golang/commits/master.atom
So uscan can parse that XML feed, get the commit data and id, and
finally form a download link like
https://github.com/Debian/dh-make-golang/archive/71736daa55a06e466cdcc6c0347f5b9489471fe3.tar.gz,
and the version is 0.0~git.
Besides, the RSS feed URL is not an API URL, so it is not subject to
the API rate limit.

That's simpler than doing `git clone` locally, though the drawback is
that it is site-specific.
BTW, GitLab also provides such a feed; the URL format is like
https://gitlab.com/inkscape/inkscape/commits/master?format=atom
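
To make the idea concrete, here is a rough sketch in Python (for brevity;
uscan itself is Perl). Several things in it are assumptions rather than
facts from this thread: the sample feed is hand-written with a made-up
date, the `tag:github.com,2008:Grit::Commit/<sha>` layout of the entry id
is what GitHub appears to emit, the version layout follows the
`pretty=0.0~git%cd.%h` format discussed earlier in this thread, and
`latest_from_feed` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

ATOM = "{http://www.w3.org/2005/Atom}"

# Hand-written stand-in for .../commits/master.atom; the date is made up.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>tag:github.com,2008:Grit::Commit/71736daa55a06e466cdcc6c0347f5b9489471fe3</id>
    <updated>2017-09-13T08:00:00Z</updated>
  </entry>
</feed>
"""


def latest_from_feed(feed_xml, repo_url):
    """Return (version, tarball_url) for the first entry in the feed."""
    root = ET.fromstring(feed_xml)
    entry = root.find(ATOM + "entry")
    if entry is None:
        raise ValueError("feed has no entries")
    # Assumption: the commit id is the last path component of <id>.
    sha = entry.findtext(ATOM + "id").rsplit("/", 1)[-1]
    updated = entry.findtext(ATOM + "updated")
    day = datetime.strptime(updated[:10], "%Y-%m-%d").strftime("%Y%m%d")
    # Assumption: day granularity and a 7-character abbreviated hash.
    version = "0.0~git{}.{}".format(day, sha[:7])
    return version, "{}/archive/{}.tar.gz".format(repo_url, sha)


if __name__ == "__main__":
    print(latest_from_feed(SAMPLE_FEED,
                           "https://github.com/Debian/dh-make-golang"))
```

The sketch only reads the first entry, on the assumption that the feed
lists the newest commit first; a GitLab variant would need its own id
parsing.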

Thanks
Shengjing Zhu



Re: [pkg-go] Bug#811565: [uscan] git mode: allow for scanning repositories without tags

2017-08-09 Thread Michael Stapelberg
On Wed, Aug 9, 2017 at 6:54 PM, Shengjing Zhu wrote:

> Thanks for the comment.
>
> On Wed, Aug 9, 2017 at 10:54 PM, Michael Stapelberg wrote:
> > 1. I think that infrastructure which the pkg-go team critically and very
> > visibly depends on should eventually be hosted by DSA under debian.org. I
> > don’t see them hosting this special “workaround” service, when there
> > already is infrastructure in place to run uscan.
>
> Well, it can be hosted by DSA, or we could even skip the web service. Maybe
> uscan can just call its CLI tool.
>

As much as I’d like to see more Go code within Debian, I think it might be
best to stick with Perl for uscan :).


> I do hope someone can implement it in Perl and bring it to uscan. But
> it's hard for me to hack on 4k lines of Perl.
>

I can understand that, and I’m not asking you to work on uscan — Osamu
already seems to be on that.


>
> Anyway, it's an exploration of using the API rather than `git clone` locally.
> And I intend to make it support more Git services, maybe gopkg.in,
> GitLab, etc.
>
> PS, gopkg.in will point to some specific branch, and
> github.com///tags doesn't work well even if I append a
> '?after=' suffix.
>
>
> >
> > 2. I have concerns regarding the scalability of such a service if we
> > actually adopted this approach: the GitHub quota permits 5000 requests per
> > hour (when authenticated). This sounds like a lot at first glance, but
> > consider that we already have 845 Go packages. Your code does 4 requests per
> > repository (IIUC), so already we are fairly close to reaching the limit, if
> > we don’t take any precautions.
>
> I haven't considered the rate limit, but do we really check that frequently?


I don’t actually know what the rate of uscan checks is behind the Debian
Package Tracker. I can imagine that other places do run uscan, too, though
(think Ubuntu, or other Debian derivatives). In fact, I’m working on a
dashboard myself which does run uscan fairly frequently.

-- 
Best regards,
Michael

Re: [pkg-go] Bug#811565: [uscan] git mode: allow for scanning repositories without tags

2017-08-09 Thread Shengjing Zhu
Thanks for the comment.

On Wed, Aug 9, 2017 at 10:54 PM, Michael Stapelberg wrote:
> 1. I think that infrastructure which the pkg-go team critically and very
> visibly depends on should eventually be hosted by DSA under debian.org. I
> don’t see them hosting this special “workaround” service, when there already
> is infrastructure in place to run uscan.

Well, it can be hosted by DSA, or we could even skip the web service. Maybe
uscan can just call its CLI tool.
I do hope someone can implement it in Perl and bring it to uscan. But
it's hard for me to hack on 4k lines of Perl.

Anyway, it's an exploration of using the API rather than `git clone` locally.
And I intend to make it support more Git services, maybe gopkg.in,
GitLab, etc.

PS, gopkg.in will point to some specific branch, and
github.com///tags doesn't work well even if I append a
'?after=' suffix.


>
> 2. I have concerns regarding the scalability of such a service if we
> actually adopted this approach: the GitHub quota permits 5000 requests per
> hour (when authenticated). This sounds like a lot at first glance, but
> consider that we already have 845 Go packages. Your code does 4 requests per
> repository (IIUC), so already we are fairly close to reaching the limit, if
> we don’t take any precautions.

I haven't considered the rate limit, but do we really check that frequently?


-- 
Best regards,
Shengjing Zhu


Re: [pkg-go] Bug#811565: [uscan] git mode: allow for scanning repositories without tags

2017-08-09 Thread Michael Stapelberg
Thanks for sharing your tool!

I also considered implementing such a tool, but ultimately decided against
it for a number of reasons:

1. I think that infrastructure which the pkg-go team critically and very
visibly depends on should eventually be hosted by DSA under debian.org. I
don’t see them hosting this special “workaround” service, when there
already is infrastructure in place to run uscan.

2. I have concerns regarding the scalability of such a service if we
actually adopted this approach: the GitHub quota permits 5000 requests per
hour (when authenticated). This sounds like a lot at first glance, but
consider that we already have 845 Go packages. Your code does 4 requests
per repository (IIUC), so already we are fairly close to reaching the
limit, if we don’t take any precautions.
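
As a quick sanity check of the numbers in point 2, here is a small sketch
in Python. The constants are the figures quoted above; the function name
`quota_usage` is made up for illustration.

```python
PACKAGES = 845          # Go packages in Debian, as quoted above
REQUESTS_PER_REPO = 4   # the "IIUC" estimate for the tool, as quoted above
HOURLY_QUOTA = 5000     # authenticated GitHub quota, as quoted above


def quota_usage(packages, requests_per_repo, quota):
    """Return (requests for one full pass, fraction of the hourly quota)."""
    needed = packages * requests_per_repo
    return needed, needed / quota


needed, fraction = quota_usage(PACKAGES, REQUESTS_PER_REPO, HOURLY_QUOTA)
print("{} requests per pass, {:.0%} of the hourly quota".format(needed, fraction))
```

One full pass over all packages costs 3380 requests, roughly two thirds of
the hourly quota, so a second pass within the same hour would already
exceed it.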

Most likely, point ② could be addressed with some careful limiting on our
end, and changing the processing model from generating a response upon
end-user request to iterating through all Go packages in Debian and
querying GitHub in a rate-limited fashion. This significantly complicates
the program, though, to the point where we duplicate the logic behind the
Debian Package Tracker. Worse, it introduces accidental complexity, not
inherent complexity :).
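
To illustrate what such a rate-limited processing model might look like,
here is a minimal sketch in Python. It is an assumption-laden illustration,
not a proposal: `check` is a hypothetical callback that performs the
upstream queries for one repository, and the pacing is a simple fixed
delay derived from the quota figures above.

```python
import time


def min_interval(quota_per_hour):
    """Seconds to wait between single requests to stay within the quota."""
    return 3600.0 / quota_per_hour


def poll_all(repos, check, quota_per_hour=5000, requests_per_repo=4,
             sleep=time.sleep):
    """Call check(repo) for every repo, pausing after each one.

    `check` is assumed to issue `requests_per_repo` requests, so the pause
    after each repository covers that many request slots.
    """
    pause = min_interval(quota_per_hour) * requests_per_repo
    results = {}
    for repo in repos:
        results[repo] = check(repo)
        sleep(pause)
    return results
```

With the figures above the pause is 2.88 seconds per repository, so one
pass over 845 packages would take about 41 minutes. Even this toy version
needs scheduling state, which hints at how quickly the real thing would
start to duplicate existing infrastructure.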

Hence, I think extending uscan is a much much more elegant route to achieve
our goal, and I’d like to ask people to hold off providing/using custom
services as a stop-gap measure. Thanks!

On Wed, Aug 9, 2017 at 7:38 AM, Shengjing Zhu wrote:

> Hi all,
>
> I spent some time playing around with the GitHub API, which resulted in a small tool,
> https://github.com/zhsj/git-watch
>
> I didn't implement it in uscan. But it can work well with uscan.
>
> A demo service is at https://watch.zhsj.me/
>
> Take one of the packages I maintain,
> https://tracker.debian.org/golang-github-xiaq-persistent
> https://watch.zhsj.me/github/xiaq/persistent is the watch url.
> And the following d/watch works fine,
>
> version=4
> opts="filenamemangle=s%(?:.*?)?([^/]*)\.tar\.gz%golang-github-xiaq-persistent-$1.tar.gz%" \
>   https://watch.zhsj.me/github/xiaq/persistent \
>   (?:.*?/)?([^/]*)\.tar\.gz
>
> This tool works both as a web service and as a CLI.
>
> I hope you find this tool useful.
>
> Yours,
> Shengjing Zhu
>



-- 
Best regards,
Michael

Re: [pkg-go] Bug#811565: [uscan] git mode: allow for scanning repositories without tags

2017-08-09 Thread Shengjing Zhu
Hi all,

I spent some time playing around with the GitHub API, which resulted in a small tool,
https://github.com/zhsj/git-watch

I didn't implement it in uscan. But it can work well with uscan.

A demo service is at https://watch.zhsj.me/

Take one of the packages I maintain,
https://tracker.debian.org/golang-github-xiaq-persistent
https://watch.zhsj.me/github/xiaq/persistent is the watch url.
And the following d/watch works fine,

version=4
opts="filenamemangle=s%(?:.*?)?([^/]*)\.tar\.gz%golang-github-xiaq-persistent-$1.tar.gz%" \
  https://watch.zhsj.me/github/xiaq/persistent \
  (?:.*?/)?([^/]*)\.tar\.gz
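
For readers unfamiliar with filenamemangle, here is a small sketch in
Python of what the substitution in the watch file above does to a
download link. The regular expression is the one from the watch file,
translated from Perl `s%...%...%` syntax; the sample link is made up,
since the actual links served by the service are not shown here, and
`mangle` is a hypothetical helper name.

```python
import re

# The Perl substitution from the watch file, with $1 written as \1.
PATTERN = r"(?:.*?)?([^/]*)\.tar\.gz"
REPLACEMENT = r"golang-github-xiaq-persistent-\1.tar.gz"


def mangle(name):
    """Apply the substitution once, as Perl's s/// does without /g."""
    return re.sub(PATTERN, REPLACEMENT, name, count=1)


if __name__ == "__main__":
    # Made-up link; only the last path component survives the mangling.
    print(mangle("https://example.org/dl/0.0~git20170809.1234567.tar.gz"))
```

The lazy prefix swallows everything up to the last slash, so the captured
group is the basename without `.tar.gz`, which is then prefixed with the
Debian source package name.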

This tool works both as a web service and as a CLI.

I hope you find this tool useful.

Yours,
Shengjing Zhu



Re: [pkg-go] Bug#811565: [uscan] git mode: allow for scanning repositories without tags

2017-07-30 Thread Martín Ferrari
On 29/07/17 17:44, Michael Stapelberg wrote:


> Given that we are talking about repositories which do not use tags, we
> could specify --depth=1 when cloning to get a shallow clone, i.e. only
> the latest commit. That saves bandwidth and disk space, but has the
> downside that we cannot do any additional validation, i.e. we can’t
> detect if upstream ever starts using tags — unfortunately, that is a
> plausible scenario, so I would suggest doing a full clone.

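As a data point, I wrote a script a while ago to do exactly this
locally. I used a shallow clone in a temporary directory:

# backticks() is a helper defined elsewhere in the script; it runs a
# command and returns its output.
backticks("git", "clone", "--quiet", "--bare", "--depth=1", $url,
$dest);

# Ask for the abbreviated hash and the committer date as YYYYMMDD.
my $commit_data = backticks("git", "--git-dir=$dest", "log", "-1",
"--date=format:%Y%m%d", "--format=%h %cd");

chomp($commit_data);
# %h can be longer than 7 characters when git needs more for uniqueness.
$commit_data =~ /^([0-9a-f]{7,40}) ([0-9]{8})$/m
or die("Invalid git response: $commit_data");
return ($1, $2);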
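
For comparison, the same approach can be sketched in Python. This is an
illustration under assumptions, not the script itself: the function names
are made up, and the parser accepts abbreviated hashes of 7 to 40 hex
digits because `%h` may be longer than 7 characters. The git options are
the same ones used in the Perl snippet.

```python
import re
import subprocess
import tempfile

COMMIT_LINE = re.compile(r"^([0-9a-f]{7,40}) ([0-9]{8})$")


def parse_commit_line(line):
    """Split '<abbrev-hash> <YYYYMMDD>' into (hash, date)."""
    match = COMMIT_LINE.match(line.strip())
    if match is None:
        raise ValueError("unexpected git output: {!r}".format(line))
    return match.group(1), match.group(2)


def latest_commit(url):
    """Shallow-clone url into a temporary directory and describe its HEAD."""
    with tempfile.TemporaryDirectory() as dest:
        subprocess.run(
            ["git", "clone", "--quiet", "--bare", "--depth=1", url, dest],
            check=True,
        )
        out = subprocess.run(
            ["git", "--git-dir=" + dest, "log", "-1",
             "--date=format:%Y%m%d", "--format=%h %cd"],
            check=True, capture_output=True, text=True,
        ).stdout
    return parse_commit_line(out)
```

The temporary directory is removed when the `with` block exits, so nothing
is left behind apart from the returned hash and date.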

-- 
Martín Ferrari (Tincho)


Re: [pkg-go] Bug#811565: [uscan] git mode: allow for scanning repositories without tags

2017-07-30 Thread Michael Stapelberg
On Sun, Jul 30, 2017 at 6:10 AM, Osamu Aoki wrote:

> Hi,
>
> (I switched my ISP.  No more osamua...@e01.itscom.net  Thanks for the
> reminder)
>
> On Sat, Jul 29, 2017 at 06:44:43PM +0200, Michael Stapelberg wrote:
> > Hi Osamu,
> >
> > Sorry for the late reply, and thanks for looking into this! Replies
> > inline:
>
> It's a good time to make feature enhancements now.
>
> > Osamu Aoki writes:
> > > How should we explicitly specify such variables? I guess it should be
> > > through "opts=...", such as:
> > >
> > >  opts="mode=git, pretty=0.0~git%cd.%h, date=%Y%m%d%H%M"
> >
> > Sounds good.
>
> I had to read the whole thread to recall what I was thinking ... OK ;-)
>
> > > But this "git log" needs a local clone of the git repository.
> > >
> > > I wonder if I can do without cloning first.
> >
> > After reading the git protocol and searching on the web for a little
> > bit, my conclusion is that no, you cannot use “git log” without having a
> > clone of the repository.
> >
> > Given that we are talking about repositories which do not use tags, we
> > could specify --depth=1 when cloning to get a shallow clone, i.e. only
> > the latest commit. That saves bandwidth and disk space, but has the
> > downside that we cannot do any additional validation, i.e. we can’t
> > detect if upstream ever starts using tags — unfortunately, that is a
> > plausible scenario, so I would suggest doing a full clone.
>
> OK with a FULL clone.  (I need to rethink the details though... I have
> totally forgotten about this topic.)
>
> The thing to consider is what the local git repository looks like and how
> you clone such a remote tree.  The "upstream" branch used by
> git-buildpackage is not really the upstream git repository but a series of
> commits made from the released upstream tarballs.  Maybe clone it into an
> "upstream-git" branch...
>

Wouldn’t it be cleaner to not modify the local repository at all, i.e.
clone in a separate, temporary directory? Aside from a new orig tarball,
uscan doesn’t leave files behind usually, does it?


>
> > For GitHub, we can apply an optimization: the GitHub HTTP API exposes
> > repository details, such as:
> >
> > 1. The default_branch of the repo, in
> >    https://developer.github.com/v3/repos/#get
> >
> > 2. The latest commit of the branch, in
> >    https://developer.github.com/v3/repos/branches/#get-branch
> >
> > For interactive use by individual developers, we could send these HTTP
> > requests unauthenticated. For a setup which does many uscan calls, we’d
> > need to create a GitHub account to get the higher rate limit. See
> > https://godoc.org/github.com/google/go-github/github#hdr-Rate_Limiting
> > for details.
>
> (This optimization is a bit more work than I can do immediately.)
>

That’s fair. I’m happy to help with a patch for uscan to apply this
optimization, once the foundation for it is done.


>
> > > Adding support for the number of commits is complicated.  Let's be happy
> > > to use the hash to identify a unique commit.  I do not think we upload
> > > more than 2 Debian upstream tarballs in a minute.
> >
> > In a day, not in a minute. But regardless, you are probably right. I
> > asked in the pkg-go IRC channel to see whether people are okay with
> > removing that part from the version number, so barring any objections,
> > we can probably get that done within the next few days.
>
> Why in a day?
>
> %cd is the committer date, and this format respects the --date= option.
> The --date option I suggested was "%Y%m%d%H%M", which goes down to
> minutes ;-)
> If you insist, I can add seconds ;-)
>

Ah, now I see where you’re coming from. We’re currently using day
granularity, and don’t want to change that, so we’re restricted to 1 upload
per day :).
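
To make the difference concrete, here is a small sketch in Python of the
two granularities. The commits and hashes are made up, and `git_version`
is a hypothetical helper that mimics the `pretty=0.0~git%cd.%h` layout
quoted above.

```python
from datetime import datetime, timezone


def git_version(commit_time, short_hash, date_format="%Y%m%d"):
    """Build a 0.0~git<date>.<hash> style version string."""
    return "0.0~git{}.{}".format(commit_time.strftime(date_format), short_hash)


# Two made-up commits on the same day.
first = datetime(2017, 7, 30, 9, 15, tzinfo=timezone.utc)
second = datetime(2017, 7, 30, 17, 40, tzinfo=timezone.utc)

day_a = git_version(first, "aaaaaaa")
day_b = git_version(second, "bbbbbbb")
minute_a = git_version(first, "aaaaaaa", "%Y%m%d%H%M")
minute_b = git_version(second, "bbbbbbb", "%Y%m%d%H%M")

if __name__ == "__main__":
    print(day_a, day_b)
    print(minute_a, minute_b)
```

With day granularity both versions share the date part `20170730`, so
their relative order would be left to the hash, which carries no
chronological meaning. With minute granularity the date parts
`201707300915` and `201707301740` already sort correctly.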


>
> > > As for a "git describe"-like nearest-tag feature, it's an interesting
> > > thought, but it may make things more complicated.  So unless someone
> > > strongly requests it with a patch, I would like to skip it.
> >
> > Agreed — if we get rid of the number of commits, we shouldn’t need git
> > describe, not even in dh-make-golang.
> >
> > It seems like you have a good handle on implementing this in uscan. Do
> > you need any additional details? Do you prefer an external patch from
> > us over implementing this yourself? I’d be happy to give you feedback on
> > a proposed patch or git commit.
>
> OK.  I guess this will be a nice project for me during my DebConf17
> travel.


Sounds great! I can’t make it to this DebConf, but I wish you safe travels
and a great conference!

Thanks in advance,

-- 
Best regards,
Michael

Re: [pkg-go] Bug#811565: [uscan] git mode: allow for scanning repositories without tags

2017-07-29 Thread Michael Stapelberg
Hi Osamu,

Sorry for the late reply, and thanks for looking into this! Replies
inline:

Osamu Aoki writes:
> How should we explicitly specify such variables? I guess it should be
> through "opts=...", such as:
>
>  opts="mode=git, pretty=0.0~git%cd.%h, date=%Y%m%d%H%M"

Sounds good.

>
> But this "git log" needs a local clone of the git repository.
>
> I wonder if I can do without cloning first.

After reading the git protocol and searching on the web for a little
bit, my conclusion is that no, you cannot use “git log” without having a
clone of the repository.

Given that we are talking about repositories which do not use tags, we
could specify --depth=1 when cloning to get a shallow clone, i.e. only
the latest commit. That saves bandwidth and disk space, but has the
downside that we cannot do any additional validation, i.e. we can’t
detect if upstream ever starts using tags — unfortunately, that is a
plausible scenario, so I would suggest doing a full clone.

For GitHub, we can apply an optimization: the GitHub HTTP API exposes
repository details, such as:

1. The default_branch of the repo, in
   https://developer.github.com/v3/repos/#get

2. The latest commit of the branch, in
   https://developer.github.com/v3/repos/branches/#get-branch

For interactive use by individual developers, we could send these HTTP
requests unauthenticated. For a setup which does many uscan calls, we’d
need to create a GitHub account to get the higher rate limit. See
https://godoc.org/github.com/google/go-github/github#hdr-Rate_Limiting
for details.
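
As a rough sketch of that optimization, in Python for brevity rather than
Perl: the two endpoints are the ones linked above, and the JSON field
names (`default_branch`, `commit.sha`, `commit.commit.committer.date`)
are what those endpoints return to my knowledge. The function names and
the injectable `fetch` parameter are made up for illustration, and the
requests are unauthenticated.

```python
import json
import urllib.request

API = "https://api.github.com"


def fetch_json(url):
    """GET url and decode the JSON body (unauthenticated)."""
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github.v3+json"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def latest_commit(owner, repo, fetch=fetch_json):
    """Return (default_branch, sha, committer_date) using two API requests."""
    info = fetch("{}/repos/{}/{}".format(API, owner, repo))
    branch = info["default_branch"]
    head = fetch("{}/repos/{}/{}/branches/{}".format(API, owner, repo, branch))
    commit = head["commit"]
    return branch, commit["sha"], commit["commit"]["committer"]["date"]
```

This costs two requests per repository and no clone at all, but it only
works for GitHub, so the clone-based path would still be needed as the
general fallback.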

> Adding support for the number of commits is complicated.  Let's be happy
> to use the hash to identify a unique commit.  I do not think we upload
> more than 2 Debian upstream tarballs in a minute.

In a day, not in a minute. But regardless, you are probably right. I
asked in the pkg-go IRC channel to see whether people are okay with
removing that part from the version number, so barring any objections,
we can probably get that done within the next few days.

> As for a "git describe"-like nearest-tag feature, it's an interesting
> thought, but it may make things more complicated.  So unless someone
> strongly requests it with a patch, I would like to skip it.

Agreed — if we get rid of the number of commits, we shouldn’t need git
describe, not even in dh-make-golang.

It seems like you have a good handle on implementing this in uscan. Do
you need any additional details? Do you prefer an external patch from
us over implementing this yourself? I’d be happy to give you feedback on
a proposed patch or git commit.

Thank you very much!

-- 
Best regards,
Michael
