Re: sqlports info available as a json in the CDN/mirrors, yay or nay?
On Mon, Jul 15, 2019 at 09:00:46PM -0400, Daniel Jakots wrote: > On Tue, 2 Jul 2019 17:42:07 +0200, Juan Francisco Cantero Hurtado > wrote: > > > I've the python script but I've not started yet with the port. The > > question for the porters (and in special people running the bulks, > > signing/uploading packages) is, are you against this idea? > > It looks like to me no one has any objection :) > The script: https://github.com/juanfra684/ports-json The JSON: https://juanfra.keybase.pub/tmp-ports/take3/packages.json The size of the file is 14MB without compression and 2.7MB with gzip -1. Some of our mirrors support gzip compression for some file types. The file is mostly the same as in previous versions but I added a new object at the start of the file to add some basic info about the build, like the date, which I would like to use in the future. Also, the file now includes the BROKEN info. The columns used by the script are: +-+---+ | PathId | DISCARDED | +-+---+ | FullPkgPath | SELECTED | +-+---+ | AUTOCONF_VERSION| DISCARDED | +-+---+ | AUTOMAKE_VERSION| DISCARDED | +-+---+ | BUILD_DEPENDS | SELECTED | +-+---+ | CATEGORIES | SELECTED | +-+---+ | COMES_WITH | DISCARDED | +-+---+ | COMMENT | SELECTED | +-+---+ | COMPILER| DISCARDED | +-+---+ | COMPILER_LANGS | DISCARDED | +-+---+ | COMPILER_LINKS | DISCARDED | +-+---+ | CONFIGURE_ARGS | DISCARDED | +-+---+ | CONFIGURE_STYLE | DISCARDED | +-+---+ | DESCR | DISCARDED | +-+---+ | DESCR_CONTENTS | SELECTED | +-+---+ | DISTFILES | DISCARDED | +-+---+ | DISTNAME| SELECTED | +-+---+ | DIST_SUBDIR | DISCARDED | +-+---+ | DPB_PROPERTIES | DISCARDED | +-+---+ | EPOCH | DISCARDED | +-+---+ | FIX_EXTRACT_PERMISSIONS | DISCARDED | +-+---+ | FLAVORS | DISCARDED | +-+---+ | FULLPKGNAME | SELECTED | +-+---+ | GH_ACCOUNT | DISCARDED | +-+---+ | GH_COMMIT | DISCARDED | +-+---+ | GH_PROJECT | DISCARDED | +-+---+ | GH_TAGNAME | DISCARDED | +-+---+ | HOMEPAGE| SELECTED | +-+---+ | IGNORE | DISCARDED | +-+---+ | IS_INTERACTIVE | DISCARDED | +-+---+ | LIB_DEPENDS | SELECTED | +-+---+ | MAINTAINER | SELECTED | +-+---+ | MAKEFILE_LIST | DISCARDED | +-+---+ | MASTER_SITES| DISCARDED | +-+---+ | MASTER_SITES0 | DISCARDED | +-+---+ | MASTER_SITES1 | DISCARDED | +-+---+ | MASTER_SITES2 | DISCARDED | +-+---+ | MASTER_SITES3 | DISCARDED | +-+---+ | MASTER_SITES4 | DISCARDED | +-+---+ | MASTER_SITES5 | DISCARDED | +-+---+ | MASTER_SITES6 | DISCARDED | +-+---+ | MASTER_SITES7 | DISCARDED | +-+---+ | MASTER_SITES8 | DISCARDED | +-+---+ | MASTER_SITES9 | DISCARDED | +-+---+ | MODULES | SELECTED | +-+---+ | NOT_FOR_ARCHS | SELECTED | +-+---+ | NO_BUILD| SELECTED | +-+---+ | NO_TEST | SELECTED | +-+---+ | ONLY_FOR_ARCHS | SELECTED | +-+---+ | PATCHFILES | DISCARDED | +-+---+ |
Re: sqlports info available as a json in the CDN/mirrors, yay or nay?
On Tue, 2 Jul 2019 17:42:07 +0200, Juan Francisco Cantero Hurtado wrote: > I've the python script but I've not started yet with the port. The > question for the porters (and in special people running the bulks, > signing/uploading packages) is, are you against this idea? It looks like to me no one has any objection :)
Re: sqlports info available as a json in the CDN/mirrors, yay or nay?
On Tue, Jul 02, 2019 at 09:44:51PM +0100, Stuart Henderson wrote: > On 2019/07/02 21:48, Hiltjo Posthuma wrote: > > On Tue, Jul 02, 2019 at 07:02:04PM +0200, Landry Breuil wrote: > > > On Tue, Jul 02, 2019 at 12:21:50PM -0400, Daniel Jakots wrote: > > > > I think it would be indeed nice to provide a way for people to easily > > > > see the content of the ports tree without having to take the sqlite3 db > > > > out from sqlports.tgz. > > > > > > > > Regarding the JSON, I think only one, with as much info as > > > > possible with avoiding bloat, should be provided. And they should have > > > > the field names ;) > > > > > > as for repology, in the meantime i've hacked my way to extract > > > ports-INDEX from portslist package in > > > http://ftp.fr.openbsd.org/pub/ports-INDEX to hopefully fix > > > https://github.com/repology/repology/issues/149, because right now it's > > > showing misleading info. > > "That's the best I can do in the current situation." -> well, removing > OpenBSD from the list would be better than that... Surely with the amount > of parsing they're doing they noticed the old INDEX file in cvs was junk > with duplicate entries etc anyway? Those guys just don't want to be bothered. We don't matter enough for them to do some decent job out of it.
Re: sqlports info available as a json in the CDN/mirrors, yay or nay?
On 2019/07/02 21:48, Hiltjo Posthuma wrote: > On Tue, Jul 02, 2019 at 07:02:04PM +0200, Landry Breuil wrote: > > On Tue, Jul 02, 2019 at 12:21:50PM -0400, Daniel Jakots wrote: > > > I think it would be indeed nice to provide a way for people to easily > > > see the content of the ports tree without having to take the sqlite3 db > > > out from sqlports.tgz. > > > > > > Regarding the JSON, I think only one, with as much info as > > > possible with avoiding bloat, should be provided. And they should have > > > the field names ;) > > > > as for repology, in the meantime i've hacked my way to extract > > ports-INDEX from portslist package in > > http://ftp.fr.openbsd.org/pub/ports-INDEX to hopefully fix > > https://github.com/repology/repology/issues/149, because right now it's > > showing misleading info. "That's the best I can do in the current situation." -> well, removing OpenBSD from the list would be better than that... Surely with the amount of parsing they're doing they noticed the old INDEX file in cvs was junk with duplicate entries etc anyway? > I also use a similar method at the moment and wrote the shellscript below for > my own use: > > > #!/bin/sh > # fetch generated ports INDEX from databases/sqlports (portslist) from > # packages snapshots. > > mkdir -p dl > cd dl || exit 1 > > pubkey="/etc/signify/openbsd-65-pkg.pub" > url="$(cat /etc/installurl)/snapshots/packages/$(uname -m)/" > > indextxt="${url}index.txt" > ftp "${indextxt}" > > portslistpkg=$(grep -m 1 -o '\(portslist-[0-9p\.]*\.tgz\)$' < index.txt) > > if test x"${portslistpkg}" = x""; then > echo "cannot find portslist package in index.txt" >&2 > exit 1 > fi > > portslist="${url}${portslistpkg}" > ftp "${portslist}" > > # verify > ftp "${url}SHA256.sig" > if ! signify -C -p "${pubkey}" -x SHA256.sig "${portslistpkg}"; then > exit 1 > fi > > tar -xzvf "${portslistpkg}" share/ports-INDEX > cp -v share/ports-INDEX ../INDEX I took the simpler approach pkg_add portslist ln -s /usr/local/share/ports-INDEX /usr/ports/INDEX
Re: sqlports info available as a json in the CDN/mirrors, yay or nay?
On Tue, Jul 02, 2019 at 07:02:04PM +0200, Landry Breuil wrote: > On Tue, Jul 02, 2019 at 12:21:50PM -0400, Daniel Jakots wrote: > > I think it would be indeed nice to provide a way for people to easily > > see the content of the ports tree without having to take the sqlite3 db > > out from sqlports.tgz. > > > > Regarding the JSON, I think only one, with as much info as > > possible with avoiding bloat, should be provided. And they should have > > the field names ;) > > as for repology, in the meantime i've hacked my way to extract > ports-INDEX from portslist package in > http://ftp.fr.openbsd.org/pub/ports-INDEX to hopefully fix > https://github.com/repology/repology/issues/149, because right now it's > showing misleading info. > > Landry > I also use a similar method at the moment and wrote the shellscript below for my own use: #!/bin/sh # fetch generated ports INDEX from databases/sqlports (portslist) from # packages snapshots. mkdir -p dl cd dl || exit 1 pubkey="/etc/signify/openbsd-65-pkg.pub" url="$(cat /etc/installurl)/snapshots/packages/$(uname -m)/" indextxt="${url}index.txt" ftp "${indextxt}" portslistpkg=$(grep -m 1 -o '\(portslist-[0-9p\.]*\.tgz\)$' < index.txt) if test x"${portslistpkg}" = x""; then echo "cannot find portslist package in index.txt" >&2 exit 1 fi portslist="${url}${portslistpkg}" ftp "${portslist}" # verify ftp "${url}SHA256.sig" if ! signify -C -p "${pubkey}" -x SHA256.sig "${portslistpkg}"; then exit 1 fi tar -xzvf "${portslistpkg}" share/ports-INDEX cp -v share/ports-INDEX ../INDEX -- Kind regards, Hiltjo
Re: sqlports info available as a json in the CDN/mirrors, yay or nay?
On Tue, Jul 02, 2019 at 12:21:50PM -0400, Daniel Jakots wrote: > I think it would be indeed nice to provide a way for people to easily > see the content of the ports tree without having to take the sqlite3 db > out from sqlports.tgz. > > Regarding the JSON, I think only one, with as much info as > possible with avoiding bloat, should be provided. And they should have > the field names ;) Done. 3,0MB https://juanfra.keybase.pub/tmp-ports/take2/packages-depends.json 1,8MB https://juanfra.keybase.pub/tmp-ports/take2/packages-info.json 1,7MB https://juanfra.keybase.pub/tmp-ports/take2/packages-versions.json > On Tue, 2 Jul 2019 17:42:07 +0200, Juan Francisco Cantero Hurtado > wrote: > > > More than a year ago I talked with espie@ about making the sqlports > > info available as a json file in the CDNs/mirrors. My idea was to > > provide that info for people who want to create simple/static > > javascript web pages, external tools for monitoring the ports tree or > > some type of automation system. > > > > Nobody wants to maintain a dynamic web page to show info about the > > ports. We had one (i can't even remember the domain) and was > > decommissioned long time ago. A static web page reading the json from > > the CDN with javascript would be quite easy to maintain. Also, other > > webs could use the info (like repology.org). Or tools like [1]. You > > could compare packages between versions of OpenBSD without uncompress > > various copies of sqlports. > > > > Everything would be generated by a simple python port during the bulk. > > The script is now 20 lines and takes less than 10 seconds. The files > > would be copied to pub/OpenBSD/{snapshots,version}/packages/arch. > > Maybe only to amd64, because the info is the same. I abandoned the > > idea one year ago because I wanted to convert the table Ports and the > > json was 27MB[2]. Now I'm splitting a limited part of Ports in small > > json files. > > > > Examples: > > > > 1,3MB, FullPkgPath, COMMENT, MAINTAINER > > https://juanfra.keybase.pub/tmp-ports/packages-info.json > > > > 1,1MB, FullPkgPath, DISTNAME, FULLPKGNAME, PKGNAME > > https://juanfra.keybase.pub/tmp-ports/packages-versions.json > > > > 1,7MB, FullPkgPath, BUILD_DEPENDS, LIB_DEPENDS, RUN_DEPENDS, > > TEST_DEPENDS > > https://juanfra.keybase.pub/tmp-ports/packages-depends.json > > > > > > I've the python script but I've not started yet with the port. The > > question for the porters (and in special people running the bulks, > > signing/uploading packages) is, are you against this idea?. > > > > > > 1. > > https://github.com/danieljakots/obsd-ports-tools/blob/master/showvictims.py > > > > 2. https://juanfra.keybase.pub/tmp-ports/sqlports-everything.json > > > > > -- Juan Francisco Cantero Hurtado http://juanfra.info
Re: sqlports info available as a json in the CDN/mirrors, yay or nay?
On Tue, Jul 02, 2019 at 12:21:50PM -0400, Daniel Jakots wrote: > I think it would be indeed nice to provide a way for people to easily > see the content of the ports tree without having to take the sqlite3 db > out from sqlports.tgz. > > Regarding the JSON, I think only one, with as much info as > possible with avoiding bloat, should be provided. And they should have > the field names ;) as for repology, in the meantime i've hacked my way to extract ports-INDEX from portslist package in http://ftp.fr.openbsd.org/pub/ports-INDEX to hopefully fix https://github.com/repology/repology/issues/149, because right now it's showing misleading info. Landry
Re: sqlports info available as a json in the CDN/mirrors, yay or nay?
I think it would be indeed nice to provide a way for people to easily see the content of the ports tree without having to take the sqlite3 db out from sqlports.tgz. Regarding the JSON, I think only one, with as much info as possible with avoiding bloat, should be provided. And they should have the field names ;) Cheers, Daniel On Tue, 2 Jul 2019 17:42:07 +0200, Juan Francisco Cantero Hurtado wrote: > More than a year ago I talked with espie@ about making the sqlports > info available as a json file in the CDNs/mirrors. My idea was to > provide that info for people who want to create simple/static > javascript web pages, external tools for monitoring the ports tree or > some type of automation system. > > Nobody wants to maintain a dynamic web page to show info about the > ports. We had one (i can't even remember the domain) and was > decommissioned long time ago. A static web page reading the json from > the CDN with javascript would be quite easy to maintain. Also, other > webs could use the info (like repology.org). Or tools like [1]. You > could compare packages between versions of OpenBSD without uncompress > various copies of sqlports. > > Everything would be generated by a simple python port during the bulk. > The script is now 20 lines and takes less than 10 seconds. The files > would be copied to pub/OpenBSD/{snapshots,version}/packages/arch. > Maybe only to amd64, because the info is the same. I abandoned the > idea one year ago because I wanted to convert the table Ports and the > json was 27MB[2]. Now I'm splitting a limited part of Ports in small > json files. > > Examples: > > 1,3MB, FullPkgPath, COMMENT, MAINTAINER > https://juanfra.keybase.pub/tmp-ports/packages-info.json > > 1,1MB, FullPkgPath, DISTNAME, FULLPKGNAME, PKGNAME > https://juanfra.keybase.pub/tmp-ports/packages-versions.json > > 1,7MB, FullPkgPath, BUILD_DEPENDS, LIB_DEPENDS, RUN_DEPENDS, > TEST_DEPENDS > https://juanfra.keybase.pub/tmp-ports/packages-depends.json > > > I've the python script but I've not started yet with the port. The > question for the porters (and in special people running the bulks, > signing/uploading packages) is, are you against this idea?. > > > 1. > https://github.com/danieljakots/obsd-ports-tools/blob/master/showvictims.py > > 2. https://juanfra.keybase.pub/tmp-ports/sqlports-everything.json > >