Re: sqlports info available as a json in the CDN/mirrors, yay or nay?

2019-08-01 Thread Juan Francisco Cantero Hurtado
On Mon, Jul 15, 2019 at 09:00:46PM -0400, Daniel Jakots wrote:
> On Tue, 2 Jul 2019 17:42:07 +0200, Juan Francisco Cantero Hurtado
>  wrote:
> 
> > I've the python script but I've not started yet with the port. The
> > question for the porters (and in special people running the bulks,
> > signing/uploading packages) is, are you against this idea?
> 
> It looks like to me no one has any objection :)
> 

The script:
https://github.com/juanfra684/ports-json

The JSON:
https://juanfra.keybase.pub/tmp-ports/take3/packages.json

The size of the file is 14MB without compression and 2.7MB with gzip -1.
Some of our mirrors support gzip compression for some file types.

The file is mostly the same as in previous versions but I added a new
object at the start of the file to add some basic info about the build,
like the date, which I would like to use in the future. Also, the file
now includes the BROKEN info.

The columns used by the script are:

+-+---+
| PathId  | DISCARDED |
+-+---+
| FullPkgPath | SELECTED  |
+-+---+
| AUTOCONF_VERSION| DISCARDED |
+-+---+
| AUTOMAKE_VERSION| DISCARDED |
+-+---+
| BUILD_DEPENDS   | SELECTED  |
+-+---+
| CATEGORIES  | SELECTED  |
+-+---+
| COMES_WITH  | DISCARDED |
+-+---+
| COMMENT | SELECTED  |
+-+---+
| COMPILER| DISCARDED |
+-+---+
| COMPILER_LANGS  | DISCARDED |
+-+---+
| COMPILER_LINKS  | DISCARDED |
+-+---+
| CONFIGURE_ARGS  | DISCARDED |
+-+---+
| CONFIGURE_STYLE | DISCARDED |
+-+---+
| DESCR   | DISCARDED |
+-+---+
| DESCR_CONTENTS  | SELECTED  |
+-+---+
| DISTFILES   | DISCARDED |
+-+---+
| DISTNAME| SELECTED  |
+-+---+
| DIST_SUBDIR | DISCARDED |
+-+---+
| DPB_PROPERTIES  | DISCARDED |
+-+---+
| EPOCH   | DISCARDED |
+-+---+
| FIX_EXTRACT_PERMISSIONS | DISCARDED |
+-+---+
| FLAVORS | DISCARDED |
+-+---+
| FULLPKGNAME | SELECTED  |
+-+---+
| GH_ACCOUNT  | DISCARDED |
+-+---+
| GH_COMMIT   | DISCARDED |
+-+---+
| GH_PROJECT  | DISCARDED |
+-+---+
| GH_TAGNAME  | DISCARDED |
+-+---+
| HOMEPAGE| SELECTED  |
+-+---+
| IGNORE  | DISCARDED |
+-+---+
| IS_INTERACTIVE  | DISCARDED |
+-+---+
| LIB_DEPENDS | SELECTED  |
+-+---+
| MAINTAINER  | SELECTED  |
+-+---+
| MAKEFILE_LIST   | DISCARDED |
+-+---+
| MASTER_SITES| DISCARDED |
+-+---+
| MASTER_SITES0   | DISCARDED |
+-+---+
| MASTER_SITES1   | DISCARDED |
+-+---+
| MASTER_SITES2   | DISCARDED |
+-+---+
| MASTER_SITES3   | DISCARDED |
+-+---+
| MASTER_SITES4   | DISCARDED |
+-+---+
| MASTER_SITES5   | DISCARDED |
+-+---+
| MASTER_SITES6   | DISCARDED |
+-+---+
| MASTER_SITES7   | DISCARDED |
+-+---+
| MASTER_SITES8   | DISCARDED |
+-+---+
| MASTER_SITES9   | DISCARDED |
+-+---+
| MODULES | SELECTED  |
+-+---+
| NOT_FOR_ARCHS   | SELECTED  |
+-+---+
| NO_BUILD| SELECTED  |
+-+---+
| NO_TEST | SELECTED  |
+-+---+
| ONLY_FOR_ARCHS  | SELECTED  |
+-+---+
| PATCHFILES  | DISCARDED |
+-+---+
| 

Re: sqlports info available as a json in the CDN/mirrors, yay or nay?

2019-07-15 Thread Daniel Jakots
On Tue, 2 Jul 2019 17:42:07 +0200, Juan Francisco Cantero Hurtado
 wrote:

> I've the python script but I've not started yet with the port. The
> question for the porters (and in special people running the bulks,
> signing/uploading packages) is, are you against this idea?

It looks like to me no one has any objection :)



Re: sqlports info available as a json in the CDN/mirrors, yay or nay?

2019-07-03 Thread Marc Espie
On Tue, Jul 02, 2019 at 09:44:51PM +0100, Stuart Henderson wrote:
> On 2019/07/02 21:48, Hiltjo Posthuma wrote:
> > On Tue, Jul 02, 2019 at 07:02:04PM +0200, Landry Breuil wrote:
> > > On Tue, Jul 02, 2019 at 12:21:50PM -0400, Daniel Jakots wrote:
> > > > I think it would be indeed nice to provide a way for people to easily
> > > > see the content of the ports tree without having to take the sqlite3 db
> > > > out from sqlports.tgz.
> > > > 
> > > > Regarding the JSON, I think only one, with as much info as
> > > > possible with avoiding bloat, should be provided. And they should have
> > > > the field names ;)
> > > 
> > > as for repology, in the meantime i've hacked my way to extract
> > > ports-INDEX from portslist package in
> > > http://ftp.fr.openbsd.org/pub/ports-INDEX to hopefully fix
> > > https://github.com/repology/repology/issues/149, because right now it's
> > > showing misleading info.
> 
> "That's the best I can do in the current situation." -> well, removing
> OpenBSD from the list would be better than that... Surely with the amount
> of parsing they're doing they noticed the old INDEX file in cvs was junk
> with duplicate entries etc anyway?

Those guys just don't want to be bothered.
We don't matter enough for them to do some decent job out of it.



Re: sqlports info available as a json in the CDN/mirrors, yay or nay?

2019-07-02 Thread Stuart Henderson
On 2019/07/02 21:48, Hiltjo Posthuma wrote:
> On Tue, Jul 02, 2019 at 07:02:04PM +0200, Landry Breuil wrote:
> > On Tue, Jul 02, 2019 at 12:21:50PM -0400, Daniel Jakots wrote:
> > > I think it would be indeed nice to provide a way for people to easily
> > > see the content of the ports tree without having to take the sqlite3 db
> > > out from sqlports.tgz.
> > > 
> > > Regarding the JSON, I think only one, with as much info as
> > > possible with avoiding bloat, should be provided. And they should have
> > > the field names ;)
> > 
> > as for repology, in the meantime i've hacked my way to extract
> > ports-INDEX from portslist package in
> > http://ftp.fr.openbsd.org/pub/ports-INDEX to hopefully fix
> > https://github.com/repology/repology/issues/149, because right now it's
> > showing misleading info.

"That's the best I can do in the current situation." -> well, removing
OpenBSD from the list would be better than that... Surely with the amount
of parsing they're doing they noticed the old INDEX file in cvs was junk
with duplicate entries etc anyway?

> I also use a similar method at the moment and wrote the shellscript below for
> my own use:
> 
> 
> #!/bin/sh
> # fetch generated ports INDEX from databases/sqlports (portslist) from
> # packages snapshots.
> 
> mkdir -p dl
> cd dl || exit 1
> 
> pubkey="/etc/signify/openbsd-65-pkg.pub"
> url="$(cat /etc/installurl)/snapshots/packages/$(uname -m)/"
> 
> indextxt="${url}index.txt"
> ftp "${indextxt}"
> 
> portslistpkg=$(grep -m 1 -o '\(portslist-[0-9p\.]*\.tgz\)$' < index.txt)
> 
> if test x"${portslistpkg}" = x""; then
>   echo "cannot find portslist package in index.txt" >&2
>   exit 1
> fi
> 
> portslist="${url}${portslistpkg}"
> ftp "${portslist}"
> 
> # verify
> ftp "${url}SHA256.sig"
> if ! signify -C -p "${pubkey}" -x SHA256.sig "${portslistpkg}"; then
>   exit 1
> fi
> 
> tar -xzvf "${portslistpkg}" share/ports-INDEX
> cp -v share/ports-INDEX ../INDEX

I took the simpler approach

pkg_add portslist
ln -s /usr/local/share/ports-INDEX /usr/ports/INDEX



Re: sqlports info available as a json in the CDN/mirrors, yay or nay?

2019-07-02 Thread Hiltjo Posthuma
On Tue, Jul 02, 2019 at 07:02:04PM +0200, Landry Breuil wrote:
> On Tue, Jul 02, 2019 at 12:21:50PM -0400, Daniel Jakots wrote:
> > I think it would be indeed nice to provide a way for people to easily
> > see the content of the ports tree without having to take the sqlite3 db
> > out from sqlports.tgz.
> > 
> > Regarding the JSON, I think only one, with as much info as
> > possible with avoiding bloat, should be provided. And they should have
> > the field names ;)
> 
> as for repology, in the meantime i've hacked my way to extract
> ports-INDEX from portslist package in
> http://ftp.fr.openbsd.org/pub/ports-INDEX to hopefully fix
> https://github.com/repology/repology/issues/149, because right now it's
> showing misleading info.
> 
> Landry
> 

I also use a similar method at the moment and wrote the shellscript below for
my own use:


#!/bin/sh
# fetch generated ports INDEX from databases/sqlports (portslist) from
# packages snapshots.

mkdir -p dl
cd dl || exit 1

pubkey="/etc/signify/openbsd-65-pkg.pub"
url="$(cat /etc/installurl)/snapshots/packages/$(uname -m)/"

indextxt="${url}index.txt"
ftp "${indextxt}"

portslistpkg=$(grep -m 1 -o '\(portslist-[0-9p\.]*\.tgz\)$' < index.txt)

if test x"${portslistpkg}" = x""; then
echo "cannot find portslist package in index.txt" >&2
exit 1
fi

portslist="${url}${portslistpkg}"
ftp "${portslist}"

# verify
ftp "${url}SHA256.sig"
if ! signify -C -p "${pubkey}" -x SHA256.sig "${portslistpkg}"; then
exit 1
fi

tar -xzvf "${portslistpkg}" share/ports-INDEX
cp -v share/ports-INDEX ../INDEX

-- 
Kind regards,
Hiltjo



Re: sqlports info available as a json in the CDN/mirrors, yay or nay?

2019-07-02 Thread Juan Francisco Cantero Hurtado
On Tue, Jul 02, 2019 at 12:21:50PM -0400, Daniel Jakots wrote:
> I think it would be indeed nice to provide a way for people to easily
> see the content of the ports tree without having to take the sqlite3 db
> out from sqlports.tgz.
> 
> Regarding the JSON, I think only one, with as much info as
> possible with avoiding bloat, should be provided. And they should have
> the field names ;)

Done.

3,0MB
https://juanfra.keybase.pub/tmp-ports/take2/packages-depends.json

1,8MB
https://juanfra.keybase.pub/tmp-ports/take2/packages-info.json

1,7MB
https://juanfra.keybase.pub/tmp-ports/take2/packages-versions.json


> On Tue, 2 Jul 2019 17:42:07 +0200, Juan Francisco Cantero Hurtado
>  wrote:
> 
> > More than a year ago I talked with espie@ about making the sqlports
> > info available as a json file in the CDNs/mirrors. My idea was to
> > provide that info for people who want to create simple/static
> > javascript web pages, external tools for monitoring the ports tree or
> > some type of automation system.
> > 
> > Nobody wants to maintain a dynamic web page to show info about the
> > ports. We had one (i can't even remember the domain) and was
> > decommissioned long time ago. A static web page reading the json from
> > the CDN with javascript would be quite easy to maintain. Also, other
> > webs could use the info (like repology.org). Or tools like [1]. You
> > could compare packages between versions of OpenBSD without uncompress
> > various copies of sqlports.
> > 
> > Everything would be generated by a simple python port during the bulk.
> > The script is now 20 lines and takes less than 10 seconds. The files
> > would be copied to pub/OpenBSD/{snapshots,version}/packages/arch.
> > Maybe only to amd64, because the info is the same. I abandoned the
> > idea one year ago because I wanted to convert the table Ports and the
> > json was 27MB[2]. Now I'm splitting a limited part of Ports in small
> > json files.
> > 
> > Examples:
> > 
> > 1,3MB, FullPkgPath, COMMENT, MAINTAINER
> > https://juanfra.keybase.pub/tmp-ports/packages-info.json
> > 
> > 1,1MB, FullPkgPath, DISTNAME, FULLPKGNAME, PKGNAME
> > https://juanfra.keybase.pub/tmp-ports/packages-versions.json
> > 
> > 1,7MB, FullPkgPath, BUILD_DEPENDS, LIB_DEPENDS, RUN_DEPENDS,
> > TEST_DEPENDS
> > https://juanfra.keybase.pub/tmp-ports/packages-depends.json
> > 
> > 
> > I've the python script but I've not started yet with the port. The
> > question for the porters (and in special people running the bulks,
> > signing/uploading packages) is, are you against this idea?.
> > 
> > 
> > 1.
> > https://github.com/danieljakots/obsd-ports-tools/blob/master/showvictims.py
> > 
> > 2. https://juanfra.keybase.pub/tmp-ports/sqlports-everything.json
> > 
> > 
> 

-- 
Juan Francisco Cantero Hurtado http://juanfra.info



Re: sqlports info available as a json in the CDN/mirrors, yay or nay?

2019-07-02 Thread Landry Breuil
On Tue, Jul 02, 2019 at 12:21:50PM -0400, Daniel Jakots wrote:
> I think it would be indeed nice to provide a way for people to easily
> see the content of the ports tree without having to take the sqlite3 db
> out from sqlports.tgz.
> 
> Regarding the JSON, I think only one, with as much info as
> possible with avoiding bloat, should be provided. And they should have
> the field names ;)

as for repology, in the meantime i've hacked my way to extract
ports-INDEX from portslist package in
http://ftp.fr.openbsd.org/pub/ports-INDEX to hopefully fix
https://github.com/repology/repology/issues/149, because right now it's
showing misleading info.

Landry



Re: sqlports info available as a json in the CDN/mirrors, yay or nay?

2019-07-02 Thread Daniel Jakots
I think it would be indeed nice to provide a way for people to easily
see the content of the ports tree without having to take the sqlite3 db
out from sqlports.tgz.

Regarding the JSON, I think only one, with as much info as
possible with avoiding bloat, should be provided. And they should have
the field names ;)

Cheers,
Daniel

On Tue, 2 Jul 2019 17:42:07 +0200, Juan Francisco Cantero Hurtado
 wrote:

> More than a year ago I talked with espie@ about making the sqlports
> info available as a json file in the CDNs/mirrors. My idea was to
> provide that info for people who want to create simple/static
> javascript web pages, external tools for monitoring the ports tree or
> some type of automation system.
> 
> Nobody wants to maintain a dynamic web page to show info about the
> ports. We had one (i can't even remember the domain) and was
> decommissioned long time ago. A static web page reading the json from
> the CDN with javascript would be quite easy to maintain. Also, other
> webs could use the info (like repology.org). Or tools like [1]. You
> could compare packages between versions of OpenBSD without uncompress
> various copies of sqlports.
> 
> Everything would be generated by a simple python port during the bulk.
> The script is now 20 lines and takes less than 10 seconds. The files
> would be copied to pub/OpenBSD/{snapshots,version}/packages/arch.
> Maybe only to amd64, because the info is the same. I abandoned the
> idea one year ago because I wanted to convert the table Ports and the
> json was 27MB[2]. Now I'm splitting a limited part of Ports in small
> json files.
> 
> Examples:
> 
> 1,3MB, FullPkgPath, COMMENT, MAINTAINER
> https://juanfra.keybase.pub/tmp-ports/packages-info.json
> 
> 1,1MB, FullPkgPath, DISTNAME, FULLPKGNAME, PKGNAME
> https://juanfra.keybase.pub/tmp-ports/packages-versions.json
> 
> 1,7MB, FullPkgPath, BUILD_DEPENDS, LIB_DEPENDS, RUN_DEPENDS,
> TEST_DEPENDS
> https://juanfra.keybase.pub/tmp-ports/packages-depends.json
> 
> 
> I've the python script but I've not started yet with the port. The
> question for the porters (and in special people running the bulks,
> signing/uploading packages) is, are you against this idea?.
> 
> 
> 1.
> https://github.com/danieljakots/obsd-ports-tools/blob/master/showvictims.py
> 
> 2. https://juanfra.keybase.pub/tmp-ports/sqlports-everything.json
> 
>