On Jul 19, 2016 8:44 AM, "Nick Coghlan" <ncogh...@gmail.com> wrote:
>
> On 19 July 2016 at 18:13, Wes Turner <wes.tur...@gmail.com> wrote:
> > so, there's a need for specifying the {PyPI} package URI in setup.py
>
> Not really - tools can make a reasonable guess about the source PyPI
> URL based purely on the name and version. For non-PyPI hosted
> packages, the extra piece of info needed is the index server URL.
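For reference, the index server URL can already be supplied out-of-band rather than in setup.py. A sketch of the usual places (the index URL shown is hypothetical; the `--index-url` command-line option is the equivalent for a one-off `pip install`):

```
# pip.conf (e.g. ~/.config/pip/pip.conf on Linux)
[global]
index-url = https://example.com/simple/
```

```
# setup.cfg equivalent, read by easy_install-era tooling
[easy_install]
index_url = https://example.com/simple/
```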
So, the index server URL is in pip.conf or .pydistutils.cfg or setup.cfg, OR specified on the command line?

> > and then generating meta.jsonld from setup.py
>
> No, a JSON-LD generator would start with a rendered metadata format,
> not the raw setup.py.

"pydist.json", my mistake:
https://github.com/pypa/interoperability-peps/issues/31#issuecomment-139657247

- pydist.json
- metadata.json (wheel)
- pydist.jsonld

> > and then generating JSONLD in a warehouse/pypa view; because that's where
> > they keep the actual metadata (package platform versions, checksums,
> > potentially supersededBy redirects)
>
> No, there is no requirement for this to be a PyPI feature. Absolutely none.
>
> > and then a signing key for a) package maintainer-supplied metadata and b)
> > package repository metadata (which is/would be redundant but comforting)
>
> This is already covered (thoroughly) in PEPs 458 and 480, and has
> nothing to do with metadata linking.

ld-signatures can be used to sign {RDF, JSONLD, RDFa} and attach the signature to the document:
https://web-payments.org/specs/source/ld-signatures/

- JWS only works with JSON formats (and not RDF)

https://www.python.org/dev/peps/pep-0480/

- Does this yet include signing the potentially cached JSON metadata used by actual tools like e.g. pip?
- How do you feel about redirects, for when a package is superseded and nobody can convince the maintainer to update the long_description?

> > and then third-party services like NVD, CVEdetails, and stack metadata
> > aggregation services
>
> And this is the other reason why it doesn't make sense to do this on
> PyPI itself - the publisher-provided metadata from PyPI is only one
> piece of the project metadata puzzle (issue trackers and source code
> repositories are another one, as are the communication metrics
> collected by the likes of Bitergia).

AFAIU, the extra load of fielding vulnerability reports for responsibly PyPI-hosted packages is beyond the scope of the PyPI and Warehouse projects.
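To make the "signature attached to the document" idea above concrete, here is a minimal stand-in sketch. A real ld-signatures implementation canonicalizes the RDF dataset (e.g. with a graph normalization algorithm) and uses a public-key signature suite; this toy version substitutes sorted-key JSON serialization and HMAC-SHA256, and the field names in the embedded signature block are hypothetical, loosely modeled on the spec's examples. It only demonstrates the shape: sign the document, embed the signature in it, verify later.

```python
import hashlib
import hmac
import json


def sign_document(doc: dict, secret: bytes) -> dict:
    """Return a copy of `doc` with an embedded signature block.

    Naive canonicalization: sorted-key, compact JSON. Real ld-signatures
    normalizes the RDF dataset instead, so the same graph serialized two
    different ways still verifies.
    """
    canonical = json.dumps(doc, sort_keys=True, separators=(",", ":"))
    sig = hmac.new(secret, canonical.encode("utf-8"), hashlib.sha256).hexdigest()
    signed = dict(doc)
    signed["signature"] = {          # hypothetical block, for illustration only
        "type": "HmacSha256Signature2016",
        "signatureValue": sig,
    }
    return signed


def verify_document(signed: dict, secret: bytes) -> bool:
    """Re-canonicalize the document minus its signature block and compare."""
    doc = {k: v for k, v in signed.items() if k != "signature"}
    canonical = json.dumps(doc, sort_keys=True, separators=(",", ":"))
    expected = hmac.new(secret, canonical.encode("utf-8"), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signed["signature"]["signatureValue"])


release = {"name": "example-pkg", "version": "1.0"}   # hypothetical metadata record
signed = sign_document(release, b"shared-secret")
assert verify_document(signed, b"shared-secret")
assert not verify_document(signed, b"wrong-secret")
```

Note the contrast with JWS: JWS signs an opaque byte payload, so it is tied to one concrete JSON serialization, whereas a graph-normalizing signature survives re-serialization of the same RDF data.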
>
> For a data aggregator, supporting multiple language ecosystems, and
> multiple issue trackers, and multiple code hosting sites is an M+N+O
> scale problem (where M is the number of language ecosystems supported,
> etc). By contrast, if you try to solve this problem in the package
> publication services for each individual language, you turn it into an
> M*(N+O) scale problem, where you need to give each language-specific
> service the ability to collect metadata from all those other sources.

Are you saying that, for release-monitoring.org (a service you are somehow financially associated with), you have already invested the time to read the existing PyPI metadata, but not e.g. the 'python' or 'python-dev' OS package metadata?

Debian has an RDF endpoint:

- https://packages.qa.debian.org/p/python-defaults.html
- https://packages.qa.debian.org/p/python-defaults.ttl
- But there's not yet an easy way to JOIN metadata down the graph from downstream OS packages to PyPI archives to source repository changesets; not without RDF, and not without writing unnecessary language/packaging-community-specific {INI, JSON, TOML, YAML-LD} parsers.

O-estimations aside, when a data publisher publishes web-standard data, everyone can benefit, because the upper bound on network effects is N**2 (Metcalfe's Law).

> This means that since we don't have a vested interest in adding more
> functionality to PyPI that doesn't specifically *need* to be there
> (and in fact actively want to avoid doing so), we can say "Conformance
> to semantic web standards is a problem for aggregation services like
> libraries.io and release-monitoring.org to solve, not for us to
> incorporate directly into PyPI".

A view producing JSONLD.
Probably right about here:
https://github.com/pypa/warehouse/blob/master/warehouse/packaging/views.py

Because there are a few (possibly backwards-compatible) changes that could be made here, so that we could just add @context to the existing JSON record (thus making it JSONLD, which anyone can read and index without a domain-specific parser):
https://github.com/pypa/warehouse/blob/master/warehouse/legacy/api/json.py

IIRC: https://github.com/pypa/interoperability-peps/issues/31#issuecomment-233195564

> > sorry to hijack the thread; i hear "more links and metadata in an
> > auxiliary schema" and think 'RDF is the semantic web solution for this
> > graph problem'
>
> I know, and you're not wrong about that. Where you're running into
> trouble is that you're trying to insist that it is the responsibility
> of the initial data *publishers* to conform to the semantic web
> standards, and it *isn't* - that job is one for the data aggregators
> that have an interest in making it easier for people to work across
> multiple data sets managed by different groups of people.

No, after-the-fact transformation is wasteful and late. A bit of advice for data publishers: http://5stardata.info/en/

> For publication platforms managing a single subgraph, native support
> for JSON-LD and RDFa introduces unwanted complexity by expanding the
> data model to incorporate all of the relational concepts defined in
> those standards. Well funded platforms may have the development
> capacity to spare to spend time on such activities, but PyPI isn't
> such a platform.

This is Warehouse: https://github.com/pypa/warehouse

It is maintainable. https://www.pypa.io/en/latest/help/

> By contrast, for aggregators managing a graph-of-graphs problem,
> JSON-LD and RDFa introduce normalisation across data sets that
> *reduces* overall complexity, since most of the details of the
> subgraphs can be ignored, as you focus instead on the links between
> the entities they contain.
>
> Cheers,
> Nick.
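To illustrate the "just add @context" idea from earlier in this message: a context is an additive top-level member, so plain-JSON consumers of the existing API payload are unaffected, while JSON-LD consumers gain shared-vocabulary terms. A minimal sketch, assuming a trimmed, hypothetical record in the shape of PyPI's legacy JSON API; the term mappings onto schema.org are illustrative, not a proposed vocabulary:

```python
import json

# Hypothetical record in the shape of the /pypi/<name>/json legacy API.
record = {
    "info": {
        "name": "example-pkg",
        "version": "1.0",
        "home_page": "https://example.org/example-pkg",
    },
}

# Context mapping existing keys onto schema.org terms (illustrative only).
# Contexts propagate to nested objects, so "name" under "info" is covered.
context = {
    "schema": "http://schema.org/",
    "name": "schema:name",
    "version": "schema:softwareVersion",
    "home_page": {"@id": "schema:url", "@type": "@id"},
}

# Additive change: same payload, plus "@context" at the top level.
jsonld_record = {"@context": context, **record}
print(json.dumps(jsonld_record, indent=2))
```

An aggregator can then expand this document with any generic JSON-LD processor and join it against other data sets on the shared terms, with no PyPI-specific parser.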
>
> --
> Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
_______________________________________________
Distutils-SIG maillist - Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig