Bug#811565: [pkg-go] Bug#811565: [uscan] git mode: allow for scanning repositories without tags

2017-08-10 Thread Michael Stapelberg
On Thu, Aug 10, 2017 at 9:34 AM, Osamu Aoki  wrote:

> Hi,
>
> On Thu, Aug 10, 2017 at 08:19:20AM +0200, Michael Stapelberg wrote:
> > On Thu, Aug 10, 2017 at 6:50 AM, Osamu Aoki  wrote:
> >
> > The functionality in the tool is exactly what was already communicated in
> > the bug :).
> >
> > If you need any help with that bug, just let me know.
>
> #1: Please give me an exact example URL for the use case.  (A project that
> will hopefully still exist after 4 years.)
>

Here’s an example: https://github.com/Debian/dh-make-golang/. The
corresponding Debian binary package is
https://packages.debian.org/buster/dh-make-golang; there’s no debian/watch
file yet because of the current lack of support for git :).

Please let me know if you need anything else. Thanks!


>
> #2: Yes, I see ... but regarding backticks, I think I need to do it
> slightly differently.
>
> backticks("git", "clone", "--quiet", "--bare", "--depth=1", $url,
> $dest);
>
> my $commit_data = backticks("git", "--git-dir=$dest", "log", "-1",
> "--date=format:%Y%m%d", "--format=%h %cd");
>
> chomp($commit_data);
> $commit_data =~ /^([0-9a-z]{7}) ([0-9]{8})$/m
> or die("Invalid git response: $commit_data");
> return ($1, $2);
>
> Thanks
>
> Osamu
>
>


-- 
Best regards,
Michael


Bug#811565: [pkg-go] Bug#811565: [uscan] git mode: allow for scanning repositories without tags

2017-08-10 Thread Osamu Aoki
Hi,

On Thu, Aug 10, 2017 at 08:19:20AM +0200, Michael Stapelberg wrote:
> On Thu, Aug 10, 2017 at 6:50 AM, Osamu Aoki  wrote:
> 
> The functionality in the tool is exactly what was already communicated in the
> bug :).
> 
> If you need any help with that bug, just let me know.

#1: Please give me an exact example URL for the use case.  (A project that
will hopefully still exist after 4 years.)

#2: Yes, I see ... but regarding backticks, I think I need to do it
slightly differently.

backticks("git", "clone", "--quiet", "--bare", "--depth=1", $url,
$dest);

my $commit_data = backticks("git", "--git-dir=$dest", "log", "-1",
"--date=format:%Y%m%d", "--format=%h %cd");

chomp($commit_data);
$commit_data =~ /^([0-9a-z]{7}) ([0-9]{8})$/m
or die("Invalid git response: $commit_data");
return ($1, $2);

Thanks

Osamu



Bug#811565: [pkg-go] Bug#811565: [uscan] git mode: allow for scanning repositories without tags

2017-08-10 Thread Michael Stapelberg
On Thu, Aug 10, 2017 at 6:50 AM, Osamu Aoki  wrote:

> Hi,
>
>
> On Wed, Aug 09, 2017 at 07:54:05AM -0700, Michael Stapelberg wrote:
> > Thanks for sharing your tool!
>
> Thank you, but I use neither Go nor Perl regularly :-(  (I wish uscan were
> in Python.)
>
> But supporting GitHub for Go and other projects is high on my list.
>
>
> > Hence, I think extending uscan is a much much more elegant route to
> > achieve our goal, and I’d like to ask people to hold off providing/using
> > custom services as a stop-gap measure. Thanks!
>
> I am thinking of adding features to support the GitHub-for-Go use case,
> as far as possible as a generic solution that users can extend.
> (Normally, this would be the last thing on my list, since uscan is
> already too complicated.  The last thing we need is a project- or
> site-specific brute-force workaround.)
>
> Please explain what you want in plain English (or as a patch).
> (Please don't send me complicated Perl code.  I am no Perl speaker; my
> Perl is worse than my English.)
>
> I know there is a bug report on uscan support of github which I should
> be working on.  :-)
>

The functionality in the tool is exactly what was already communicated in
the bug :).

If you need any help with that bug, just let me know.

Thanks!

-- 
Best regards,
Michael


Bug#811565: [pkg-go] Bug#811565: [uscan] git mode: allow for scanning repositories without tags

2017-08-09 Thread Osamu Aoki
Hi,


On Wed, Aug 09, 2017 at 07:54:05AM -0700, Michael Stapelberg wrote:
> Thanks for sharing your tool!

Thank you, but I use neither Go nor Perl regularly :-(  (I wish uscan were
in Python.)

But supporting GitHub for Go and other projects is high on my list.
  
 
> Hence, I think extending uscan is a much much more elegant route to achieve
> our goal, and I’d like to ask people to hold off providing/using custom
> services as a stop-gap measure. Thanks!

I am thinking of adding features to support the GitHub-for-Go use case,
as far as possible as a generic solution that users can extend.
(Normally, this would be the last thing on my list, since uscan is
already too complicated.  The last thing we need is a project- or
site-specific brute-force workaround.)

Please explain what you want in plain English (or as a patch).
(Please don't send me complicated Perl code.  I am no Perl speaker; my
Perl is worse than my English.)

I know there is a bug report on uscan support of github which I should
be working on.  :-)

Osamu



Bug#811565: [pkg-go] Bug#811565: [uscan] git mode: allow for scanning repositories without tags

2017-08-09 Thread Michael Stapelberg
On Wed, Aug 9, 2017 at 6:54 PM, Shengjing Zhu  wrote:

> Thanks for the comment.
>
> On Wed, Aug 9, 2017 at 10:54 PM, Michael Stapelberg
>  wrote:
> > 1. I think that infrastructure which the pkg-go team critically and very
> > visibly depends on should eventually be hosted by DSA under debian.org. I
> > don’t see them hosting this special “workaround” service, when there
> > already is infrastructure in place to run uscan.
>
> Well, it could be hosted by DSA, or not run as a web service at all. Maybe
> uscan could just call its CLI tool.
>

As much as I’d like to see more Go code within Debian, I think it might be
best to stick with Perl for uscan :).


> I do hope someone can implement it in Perl and bring it to uscan. But
> it's hard for me to hack on 4k lines of Perl.
>

I can understand that, and I’m not asking you to work on uscan — Osamu
already seems to be on that.


>
> Anyway, it's an exploration of using the API rather than a local `git clone`.
> And I intend to get it to support more Git services, maybe gopkg.in,
> GitLab, etc.
>
> PS: gopkg.in will point to some specific branch, and
> github.com///tags doesn't work well even if I append a
> '?after=' suffix.
>
>
> >
> > 2. I have concerns regarding the scalability of such a service if we
> > actually adopted this approach: the GitHub quota permits 5000 requests
> > per hour (when authenticated). This sounds like a lot at first glance,
> > but consider that we already have 845 Go packages. Your code does 4
> > requests per repository (IIUC), so already we are fairly close to
> > reaching the limit, if we don’t take any precautions.
>
> I hadn't considered the rate limit, but do we really check that frequently?


I don’t actually know what the rate of uscan checks is behind the Debian
Package Tracker. I can imagine that other places do run uscan, too, though
(think Ubuntu, or other Debian derivatives). In fact, I’m working on a
dashboard myself which does run uscan fairly frequently.

-- 
Best regards,
Michael


Bug#811565: [pkg-go] Bug#811565: [uscan] git mode: allow for scanning repositories without tags

2017-08-09 Thread Shengjing Zhu
Thanks for the comment.

On Wed, Aug 9, 2017 at 10:54 PM, Michael Stapelberg
 wrote:
> 1. I think that infrastructure which the pkg-go team critically and very
> visibly depends on should eventually be hosted by DSA under debian.org. I
> don’t see them hosting this special “workaround” service, when there already
> is infrastructure in place to run uscan.

Well, it could be hosted by DSA, or not run as a web service at all. Maybe
uscan could just call its CLI tool.
I do hope someone can implement it in Perl and bring it to uscan. But
it's hard for me to hack on 4k lines of Perl.

Anyway, it's an exploration of using the API rather than a local `git clone`.
And I intend to get it to support more Git services, maybe gopkg.in,
GitLab, etc.

PS: gopkg.in will point to some specific branch, and
github.com///tags doesn't work well even if I append a
'?after=' suffix.


>
> 2. I have concerns regarding the scalability of such a service if we
> actually adopted this approach: the GitHub quota permits 5000 requests per
> hour (when authenticated). This sounds like a lot at first glance, but
> consider that we already have 845 Go packages. Your code does 4 requests per
> repository (IIUC), so already we are fairly close to reaching the limit, if
> we don’t take any precautions.

I hadn't considered the rate limit, but do we really check that frequently?


-- 
Best regards,
Shengjing Zhu



Bug#811565: [pkg-go] Bug#811565: [uscan] git mode: allow for scanning repositories without tags

2017-08-09 Thread Michael Stapelberg
Thanks for sharing your tool!

I also considered implementing such a tool, but ultimately decided against
it for a number of reasons:

1. I think that infrastructure which the pkg-go team critically and very
visibly depends on should eventually be hosted by DSA under debian.org. I
don’t see them hosting this special “workaround” service, when there
already is infrastructure in place to run uscan.

2. I have concerns regarding the scalability of such a service if we
actually adopted this approach: the GitHub quota permits 5000 requests per
hour (when authenticated). This sounds like a lot at first glance, but
consider that we already have 845 Go packages. Your code does 4 requests
per repository (IIUC), so already we are fairly close to reaching the
limit, if we don’t take any precautions.

Most likely, point ② could be addressed with some careful limiting on our
end, and changing the processing model from generating a response upon
end-user request to iterating through all Go packages in Debian and
querying GitHub in a rate-limited fashion. This significantly complicates
the program, though, to the point where we duplicate the logic behind the
Debian Package Tracker. Worse, it introduces accidental complexity, not
inherent complexity :).
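The quota arithmetic above can be checked with a short sketch. It is written in Python purely for illustration (it is not part of uscan or git-watch), the function names are hypothetical, and the figures are the ones quoted in this message; the "4 requests per repository" value was given with an "IIUC" and should be treated as approximate.

```python
# Figures quoted in the message; REQUESTS_PER_REPOSITORY is approximate ("IIUC").
GITHUB_REQUESTS_PER_HOUR = 5000  # authenticated quota
GO_PACKAGES = 845
REQUESTS_PER_REPOSITORY = 4


def requests_per_sweep(packages, per_repository):
    """Requests needed to check every package exactly once."""
    return packages * per_repository


def full_sweeps_per_hour(quota, sweep_cost):
    """How many complete passes over all packages fit into one hourly quota."""
    return quota // sweep_cost


def min_interval_seconds(quota):
    """Smallest even spacing between requests that stays within the quota."""
    return 3600 / quota


if __name__ == "__main__":
    cost = requests_per_sweep(GO_PACKAGES, REQUESTS_PER_REPOSITORY)
    print("requests per sweep:", cost)
    print("full sweeps per hour:", full_sweeps_per_hour(GITHUB_REQUESTS_PER_HOUR, cost))
    print("seconds between requests:", min_interval_seconds(GITHUB_REQUESTS_PER_HOUR))
```

Under these assumptions a single pass over all packages costs 3380 requests, roughly two thirds of the hourly quota, so a second pass within the same hour would exceed it.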

Hence, I think extending uscan is a much much more elegant route to achieve
our goal, and I’d like to ask people to hold off providing/using custom
services as a stop-gap measure. Thanks!

On Wed, Aug 9, 2017 at 7:38 AM, Shengjing Zhu  wrote:

> Hi all,
>
> I spent some time playing around with the GitHub API, which resulted in a
> small tool: https://github.com/zhsj/git-watch
>
> I didn't implement it in uscan. But it can work well with uscan.
>
> A demo service is at https://watch.zhsj.me/
>
> Take one of the packages I maintain,
> https://tracker.debian.org/golang-github-xiaq-persistent:
> https://watch.zhsj.me/github/xiaq/persistent is the watch URL,
> and the following d/watch works fine:
>
> version=4
> opts="filenamemangle=s%(?:.*?)?([^/]*)\.tar\.gz%golang-github-xiaq-persistent-$1.tar.gz%" \
>   https://watch.zhsj.me/github/xiaq/persistent \
>   (?:.*?/)?([^/]*)\.tar\.gz
>
> This tool works both as a web service and as a CLI.
>
> I hope you find this tool useful.
>
> Yours,
> Shengjing Zhu
>



-- 
Best regards,
Michael


Bug#811565: [pkg-go] Bug#811565: [uscan] git mode: allow for scanning repositories without tags

2017-08-09 Thread Shengjing Zhu
Hi all,

I spent some time playing around with the GitHub API, which resulted in a
small tool: https://github.com/zhsj/git-watch

I didn't implement it in uscan. But it can work well with uscan.

A demo service is at https://watch.zhsj.me/

Take one of the packages I maintain,
https://tracker.debian.org/golang-github-xiaq-persistent:
https://watch.zhsj.me/github/xiaq/persistent is the watch URL,
and the following d/watch works fine:

version=4
opts="filenamemangle=s%(?:.*?)?([^/]*)\.tar\.gz%golang-github-xiaq-persistent-$1.tar.gz%" \
  https://watch.zhsj.me/github/xiaq/persistent \
  (?:.*?/)?([^/]*)\.tar\.gz
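To see what the filenamemangle rule in this watch file does, here is a minimal Python rendering of the same substitution. It is only a sketch: uscan itself applies the rule as a Perl `s%...%...%` expression, the function name `mangle_filename` is hypothetical, and the example link shown in the usage note is invented, since the actual links served by the demo service are not shown in this thread.

```python
import re

# Python rendering of s%PATTERN%REPLACEMENT% from the watch file;
# Perl's $1 becomes \1 in Python's replacement syntax.
PATTERN = re.compile(r"(?:.*?)?([^/]*)\.tar\.gz")
REPLACEMENT = r"golang-github-xiaq-persistent-\1.tar.gz"


def mangle_filename(href):
    """Rewrite a tarball link into the package-prefixed tarball file name."""
    # count=1 mirrors Perl's s/// without the /g flag (first match only).
    return PATTERN.sub(REPLACEMENT, href, count=1)


if __name__ == "__main__":
    # Hypothetical link; the real service may use a different layout.
    print(mangle_filename(
        "https://watch.zhsj.me/github/xiaq/persistent/0.0~git20170709.abc1234.tar.gz"))
```

For a link of that shape, the lazy prefix group consumes everything up to the last slash, so only the final path component (minus `.tar.gz`) ends up in the capture group and the result is a bare file name prefixed with the package name.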

This tool works both as a web service and as a CLI.

I hope you find this tool useful.

Yours,
Shengjing Zhu



Bug#811565: [pkg-go] Bug#811565: [uscan] git mode: allow for scanning repositories without tags

2017-07-30 Thread Martín Ferrari
On 29/07/17 17:44, Michael Stapelberg wrote:


> Given that we are talking about repositories which do not use tags, we
> could specify --depth=1 when cloning to get a shallow clone, i.e. only
> the latest commit. That saves bandwidth and disk space, but has the
> downside that we cannot do any additional validation, i.e. we can’t
> detect if upstream ever starts using tags — unfortunately, that is a
> plausible scenario, so I would suggest doing a full clone.

As a data point, I wrote a script a while ago to do exactly this
locally. I used the shallow clone on a temporary directory:

backticks("git", "clone", "--quiet", "--bare", "--depth=1", $url,
$dest);

my $commit_data = backticks("git", "--git-dir=$dest", "log", "-1",
"--date=format:%Y%m%d", "--format=%h %cd");

chomp($commit_data);
$commit_data =~ /^([0-9a-z]{7}) ([0-9]{8})$/m
or die("Invalid git response: $commit_data");
return ($1, $2);
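For readers who want to try the idea without the surrounding script (the `backticks` helper is not shown in this message), the following is a self-contained Python sketch of the same steps: shallow bare clone, one `git log` call, then parsing the "hash date" line. The function names are hypothetical. One deliberate difference from the snippet above: the pattern accepts 7 to 40 hexadecimal digits rather than exactly 7 characters, because git may print a longer abbreviated hash when 7 digits would be ambiguous.

```python
import re
import subprocess
from typing import Tuple

# "<abbreviated hash> <YYYYMMDD>", as produced by --format="%h %cd"
# together with --date=format:%Y%m%d.
COMMIT_LINE = re.compile(r"^([0-9a-f]{7,40}) ([0-9]{8})$")


def parse_commit_data(output: str) -> Tuple[str, str]:
    """Split git's one-line answer into (abbreviated hash, commit date)."""
    line = output.strip()
    match = COMMIT_LINE.match(line)
    if match is None:
        raise ValueError("Invalid git response: %r" % line)
    return match.group(1), match.group(2)


def latest_commit(url: str, dest: str) -> Tuple[str, str]:
    """Shallow-clone url into dest (bare) and describe its newest commit."""
    subprocess.run(
        ["git", "clone", "--quiet", "--bare", "--depth=1", url, dest],
        check=True,
    )
    result = subprocess.run(
        ["git", "--git-dir=" + dest, "log", "-1",
         "--date=format:%Y%m%d", "--format=%h %cd"],
        check=True, capture_output=True, text=True,
    )
    return parse_commit_data(result.stdout)
```

Keeping the parsing separate from the git invocation means the format check can be exercised without network access; `latest_commit` needs a git binary and a reachable repository.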

-- 
Martín Ferrari (Tincho)