Jeff King <[email protected]> wrote:
> On Fri, Aug 12, 2016 at 10:42:55PM +0000, Eric Wong wrote:
> > Junio C Hamano <[email protected]> wrote:
> > > is still available. An alternative
> > >
> > > nntp://news.public-inbox.org/inbox.comp.version-control.git
> > >
> > > will become usable once it catches up with old messages.
> >
> > Mostly caught up, I injected 33 more today which were
> > cross-posted (which tripped up some of my anti-spam rules) or
> > simply missed by gmane.
> >
> > There may be more in some personal archives gmane doesn't
> > have...
>
> Is there an easy way to get _just_ the list of message-ids you are
> storing (I know I can download the whole archive, but it's big)?
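
XHDR (or HDR) over NNTP should do it (that's how I checked
against gmane):

--------8<-----
use strict;
use warnings;
use Net::NNTP;

my $nntp = Net::NNTP->new($ENV{NNTPSERVER} || 'news.public-inbox.org')
	or die "failed to connect: $@\n";
# group() returns (article count, first article number, last article number)
my ($num, $first, $last) = $nntp->group('inbox.comp.version-control.git')
	or die "GROUP failed: ", $nntp->message;
my $batch = 10000;

# "<=" so the last article is still fetched when $i lands exactly on $last
for (my $i = $first; $i <= $last; $i += $batch) {
	my $j = $i + $batch - 1;
	$j = $last if $j > $last;
	# xhdr() returns a hashref mapping article number => header value
	my $num2mid = $nntp->xhdr('Message-ID', "$i-$j") or next;
	for my $n ($i..$j) {
		defined(my $mid = $num2mid->{$n}) or next;
		print "$mid\n";
	}
}
$nntp->quit;
# and I forgot to optimize XHDR/HDR further in public-inbox-nntpd.
# Oh well, it seems to work, at least.
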
> Then I can cross-reference with my archive. I doubt I'll have anything
> significant that you don't. My archive of the early days was pulled from
> gmane, though I have been collecting steadily via mailing list delivery
> since 2007 or so.
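
The cross-referencing step itself could be sketched roughly as below.
This assumes each side has been dumped to a list of Message-IDs, one
per entry; the helper name missing_from and the sample IDs are made up
for illustration and are not part of public-inbox.

```perl
use strict;
use warnings;

# Hypothetical helper: return the Message-IDs that appear in @$mine but
# not in @$theirs, de-duplicated and in the order they appear in @$mine.
sub missing_from {
	my ($theirs, $mine) = @_;
	my %seen;
	for my $mid (@$theirs) {
		(my $k = $mid) =~ s/\A\s+|\s+\z//g;	# trim whitespace/CRLF
		$seen{$k} = 1 if length $k;
	}
	my @missing;
	for my $mid (@$mine) {
		(my $k = $mid) =~ s/\A\s+|\s+\z//g;
		next unless length $k;
		# post-increment also skips repeats within @$mine
		push @missing, $k unless $seen{$k}++;
	}
	return @missing;
}

# Made-up sample data standing in for the two dumps
my @theirs = ('<a@example.com>', '<b@example.com>');
my @mine = ("<b\@example.com>\n", '<c@example.com>', '<c@example.com>');
print "$_\n" for missing_from(\@theirs, \@mine);
```

With real data, the two lists would come from files instead, e.g. the
output of an XHDR dump on one side and the IDs extracted from a local
archive on the other.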

What's odd is that there are some messages with two Message-ID fields
from gmane from the old days, too. I'll dig a bit another time.
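
Finding such messages might look something like the sketch below. It
only counts Message-ID fields in a raw header block; the function name
message_ids and the sample header are invented for illustration, and
nothing here reflects how public-inbox or gmane actually handle it.

```perl
use strict;
use warnings;

# Hypothetical helper: return the value of every Message-ID field found
# in a raw message (only the header, up to the first blank line, is used).
sub message_ids {
	my ($head) = @_;
	$head =~ s/(\r?\n)\r?\n.*\z/$1/s;	# drop the body, if any
	$head =~ s/\r?\n[ \t]+/ /g;		# unfold continuation lines
	my @mids;
	for my $line (split /\r?\n/, $head) {
		# field names are case-insensitive
		push @mids, $1 if $line =~ /\AMessage-ID:\s*(.*?)\s*\z/i;
	}
	return @mids;
}

# Made-up header with two Message-ID fields, one of them folded
my $head = join("\n",
	'From: someone@example.com',
	'Message-ID: <first@example.com>',
	'Subject: test',
	'Message-Id:',
	' <second@example.com>',
	'');
my @mids = message_ids($head);
print scalar(@mids), " Message-ID fields: @mids\n" if @mids > 1;
```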