On Wed, Feb 15, 2017 at 5:33 AM, Nick Coghlan <ncogh...@gmail.com> wrote:
> On 14 February 2017 at 21:21, Vinay Sajip via Distutils-SIG
> <distutils-sig@python.org> wrote:
> >> I thought the current status was that it's called metadata.json
> >> exactly *because* it's not standardized, and you *shouldn't* look at
> >> it?
> >
> > Well, it was work-in-progress-standardised according to PEP 426 (since
> > sometimes implementations have to work in parallel with working out the
> > details of specifications). Given that PEP 426 wasn't done and dusted
> > but was being progressed, I would have thought it perfectly acceptable
> > to use "pydist.json", as the only things that would be affected would
> > be packaging tools working to the PEP.
>
> I asked Daniel to *stop* using pydist.json, since wheel was emitting a
> point-in-time snapshot of PEP 426 (which includes a lot of
> potentially-nice-to-have things that nobody has actually implemented
> so far, like the semantic dependency declarations and the enhancements
> to the extras syntax), rather than the final version of the spec.

Would you send a link to the source for this?

> >> It's too bad that the JSON thing didn't work out, but I think we're
> >> better off working on better specifying the one source of truth
> >> everything already uses (METADATA) instead of bringing in *new*
> >> partially-incompatible-and-poorly-specified formats.
> >
> > When you say "everything already uses", do you mean setuptools and
> > wheel? If nobody else is allowed to play, that's one thing. But
> > otherwise, there need to be standards for interoperability. The
> > METADATA file, now - exactly which standard does it follow? The one
> > in the dateutil wheel that Jim referred to doesn't appear to conform
> > to any of the metadata PEPs. It was rejected by old metadata code in
> > distlib (which came out of the Python 3.3 era "packaging" package -
> > not to be confused with Donald's of the same name - which is strict
> > in its interpretation of those earlier PEPs).
> > The METADATA format (key-value) is not really flexible enough for
> > certain things which were in PEP 426 (e.g. dependency descriptions),
> > and for these JSON seems a reasonable fit.
>
> The current de facto standard set by setuptools and bdist_wheel is:
>
> - dist-info/METADATA as defined at
>   https://packaging.python.org/specifications/#package-distribution-metadata
> - dist-info/requires.txt runtime dependencies as defined at
>   http://setuptools.readthedocs.io/en/latest/formats.html#requires-txt
> - dist-info/setup_requires.txt build time dependencies as defined at
>   http://setuptools.readthedocs.io/en/latest/formats.html#setup-requires-txt
>
> The dependency fields in METADATA itself unfortunately aren't really
> useful for anything.

Graph: Nodes and edges.

> There's definitely still a place for a pydist.json created by going
> through PEP 426, comparing it to what bdist_wheel already does to
> populate metadata.json, and either changing the PEP to match the
> existing practice, or else agreeing that we prefer what the PEP
> recommends, that we want to move in that direction, and that there's a
> definite commitment to implement the changes in at least setuptools
> and bdist_wheel (plus a migration strategy that allows for reasonably
> sensible consumption of old metadata).

Which function reads metadata.json? Which function reads pydist.json?

> Such an update would necessarily be a fairly ruthless process, where
> we defer everything that can possibly be deferred. I already made one
> pass at that when I split out the metadata extensions into PEP 459,
> but at least one more such pass is needed before we can sign off on
> the spec as metadata 2.0 - even beyond any "open for discussion"
> questions, there are still things in there which were extracted and
> standardised separately in PEP 508.
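The key-value METADATA format discussed above is RFC 822-style, so it can
be read with the stdlib email parser. A minimal sketch (the package name
and field values here are made up for illustration):

```python
# Sketch: parse a dist-info/METADATA document with the stdlib email
# parser. METADATA is a set of RFC 822-style headers, some of which
# (like Requires-Dist) are multiple-use.
from email.parser import HeaderParser

METADATA = """\
Metadata-Version: 2.0
Name: example-dist
Version: 1.0.0
Requires-Dist: requests (>=2.0)
Provides-Extra: test
Requires-Dist: pytest; extra == 'test'
"""

msg = HeaderParser().parsestr(METADATA)
print(msg["Name"])                   # example-dist
print(msg.get_all("Requires-Dist"))  # both Requires-Dist values, in order
```

This also illustrates why the format strains under PEP 426-style
dependency descriptions: everything is a flat list of header strings.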
> > There's no technical reason why "the JSON thing didn't work out", as
> > far as I can see - it was just given up on for a more incremental
> > approach (which has got no new PEPs other than 440, AFAICT).
>
> Yep, it's a logistical problem rather than a technical problem per se
> - new metadata formats need software publisher adoption to ensure the
> design is sensible before we commit to them long term, but software
> publishers are understandably reluctant to rely on new formats that
> limit their target audience to folks running the latest versions of
> the installation tools (outside constrained cases where the software
> publisher is also the main consumer of that software).

An RDFS vocabulary contains Classes and Properties with rdfs:range and
rdfs:domain. There are many representations for RDF: RDF/XML, Turtle/N3,
JSONLD. RDF is implementation-neutral; JSONLD is implementation-neutral.

> For PEP 440 (version specifiers) and PEP 508 (dependency specifiers),
> this was handled by focusing on documenting practices that people
> already used (and checking existing PyPI projects for compatibility),
> rather than trying to actively change those practices.
>
> For pyproject.toml (e.g. enscons), the idea is to provide a setup.py
> shim that can take care of bootstrapping the new approach for the
> benefit of older tools that assume the use of setup.py (similar to
> what was done with setup.cfg and d2to1).
>
> The equivalent for PEP 426 would probably be legacy-to-pydist and
> pydist-to-legacy converters that setuptools, bdist_wheel and other
> publishing tools can use to ship legacy metadata alongside the
> standardised format (and I believe Daniel already has at least the
> former in order to generate metadata.json in bdist_wheel). With PEP
> 426 as currently written, a pydist-to-legacy converter isn't really
> feasible, since pydist proposes new concepts that can't be readily
> represented in the old format.
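A legacy-to-pydist converter could start as little more than a
key-value-to-dict transform. A minimal sketch (the lowercased,
dash-free key names and the multiple-use field set are assumptions for
illustration, not PEP 426's actual schema):

```python
# Sketch of the legacy-to-pydist direction: read RFC 822-style METADATA
# text and emit a JSON-compatible dict. Not a full converter.
import json
from email.parser import HeaderParser

# Fields that may legitimately appear more than once (illustrative subset)
MULTI_USE = {"Requires-Dist", "Provides-Extra", "Classifier"}

def legacy_to_dict(metadata_text):
    msg = HeaderParser().parsestr(metadata_text)
    out = {}
    for key in set(msg.keys()):
        values = msg.get_all(key)
        name = key.lower().replace("-", "_")  # assumed naming convention
        out[name] = values if key in MULTI_USE else values[0]
    return out

print(json.dumps(
    legacy_to_dict("Metadata-Version: 2.0\nName: example-dist\nVersion: 1.0.0\n"),
    indent=2))
```

The reverse (pydist-to-legacy) direction is where information loss
occurs, as noted below, since the old format has no place for the
decomposed structures.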
pydist-to-legacy would be a lossy transformation.

> > I understand that social reasons are often more important than
> > technical reasons when it comes to success or failure of an approach;
> > I'm just not sure that in this case, it wasn't given up on too early.
>
> I think of PEP 426 as "deferred indefinitely pending specific
> practical problems to provide clearer design constraints" rather than
> abandoned :)

Is it too late to request lowercased property names without dashes?

If we're (I'm?) going to create @context URIs, compare:

    https://schema.python.org/v1#Provides-Extra

    {
      "@context": {
        "default": "https://schema.python.org/#",
        "schema": "http://schema.org/",
        # "name": "http://schema.org/name",
        # "url": "http://schema.org/url",
        # "verstr":
        # "extra":
        # "requirements"
        # "requirementstr"
      },
      "@type": ["py:PythonPackage"],
      "name": "IPython",
      "url": ["https://pypi.python.org/pypi/IPython",
              "https://pypi.org/project/IPython"],
      "Provides-Extra": [
        {"@type": "Requirement",
         "name": "notebook",
         "extra": ["notebook"],
         "requirements": [],  # TODO
         "requirementstr": "extra == 'notebook'"},
        {"name": "numpy",
         "extra": ["test"],
         "requirements": [],  # TODO
         "requirementstr": "python_version >= \"3.4\" and extra == 'test'"},
        ...
      ]
    }

> There are two recent developments that I think may provide those
> missing design constraints and hence motivation to finalise a metadata
> 2.0 specification:
>
> 1. the wheel-to-egg support in humpty (and hence zc.buildout). That
> makes humpty a concrete non-traditional installer that would benefit
> from both a modernised standard metadata format, as well as common
> tools both to convert legacy metadata to the agreed modern format and
> to convert the modern format back to the legacy format for inclusion
> in the generated egg files (as then humpty could just re-use the
> shared tools, rather than having to maintain those capabilities
> itself).
    import collections

    class PackageMetadata:
        def __init__(self):
            self.data = collections.OrderedDict()

        # readers
        def read_legacy(self): ...
        def read_metadata_json(self): ...
        def read_pydist_json(self): ...
        def read_pyproject_toml(self): ...
        def read_jsonld(self): ...

        # writers
        def to_legacy(self): ...
        def to_metadata_json(self): ...
        def to_pydist_json(self): ...
        def to_pyproject_toml(self): ...
        def to_jsonld(self): ...

        # alternate constructors
        @classmethod
        def Legacy(cls, *args, **kwargs): ...
        @classmethod
        def MetadataJson(cls, *args, **kwargs): ...
        @classmethod
        def PydistJson(cls, *args, **kwargs): ...
        @classmethod
        def PyprojectToml(cls, *args, **kwargs): ...

        @classmethod
        def Jsonld(cls, *args, **kwargs):
            obj = cls()
            obj.read_jsonld(*args, **kwargs)
            return obj

        # or this ('from' is a reserved word, hence 'from_'):
        @classmethod
        def from_(cls, path,
                  format='legacy|metadatajson|pydistjson|pyprojecttoml|jsonld'):
            ...

... for maximum reusability, we really shouldn't need an adapter
registry here;

> 2. the new pipenv project to provide a simpler alternative to the
> pip+virtualenv+pip-tools combination for environment management in web
> service development (and similar layered application architectures).
> As with the "install vs setup" split in setuptools, pipenv settled on
> an "only two kinds of requirement (deployment and development)" model
> for usability reasons, but it also distinguishes abstract dependencies
> stored in Pipfile from pinned concrete dependencies stored in
> Pipfile.lock.

Does the Pipfile/Pipfile.lock distinction overlap with 'integrates' as
a replacement for meta_requires?

> If we put those together with the existing interest in automating
> generation of policy compliant operating system distribution packages,

Downstream OS packaging could easily (and without permission) include
extra attributes (properties specified with full URIs) in JSONLD
metadata.

> it makes it easier to go through the proposed semantic dependency
> model in PEP 426 and ask "How would we populate these fields based on
> the metadata that projects *already* publish?".

See 'class PackageMetadata' above.

> - "run requires": straightforward, as these are the standard
> dependencies used in most projects. Not entirely clear how to gently
> (or strongly!)
> discourage dependency pinning when publishing to PyPI
> (although the Pipfile and Pipfile.lock model used in pipenv may help
> with this)
> - "meta requires": not clear at all, as this was added to handle cases
> like PyObjC, where the main package is just a metapackage that makes a
> particular set of versioned subpackages easy to install. This may be
> better modeled as a separate "integrates" field, using a declaration
> syntax more akin to that used for Pipfile.lock rather than that used
> for normal requirements declarations.
> - "dev requires": corresponds to "dev-packages" in pipenv
> - "build requires": corresponds to "setup_requires" in setuptools,
> "build-system.requires" + any dynamic build dependencies in PEP 518
> - "test requires": corresponds to "test" extra in
> https://packaging.python.org/specifications/#provides-extra-multiple-use
>
> The "doc" extra in
> https://packaging.python.org/specifications/#provides-extra-multiple-use
> would map to "build requires", but there's potential benefit to
> redistributors in separating it out, as we often split the docs out
> from the built software components (since there's little reason to
> install documentation on headless servers that are only going to be
> debugged remotely).
>
> The main argument against "test requires" and "doc requires" is that
> the extras system already works fine for those - "pip install
> MyProject[test]" and "pip install MyProject[doc]" are both already
> supported, so metadata 2.0 just needs to continue to reserve those as
> semantically significant extras names.
>
> "dev" requires could be handled the same way - anything you actually
> need to *build* an sdist or wheel archive from a source repository
> should be in "setup_requires" (setuptools) or "build-system.requires"
> (pyproject.toml), so "dev" would just be a conventional extra name
> rather than a top level field.
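The "conventional extras" model being described can be sketched in a few
lines: dev, test, doc, and build become ordinary extras alongside the
unconditional runtime dependencies. The package data below is made up
for illustration:

```python
# Sketch: dev/test/doc/build as ordinary extras rather than top-level
# requirement fields. Resolving "pip install pkg[extra,...]" then just
# means appending the named extras' dependencies to the runtime list.
requires = ["requests (>=2.0)"]        # unconditional runtime deps
extras_require = {                     # illustrative extras
    "test": ["pytest"],
    "doc": ["sphinx"],
    "build": ["setuptools", "wheel"],
    "dev": ["flake8"],
}

def deps_for(extras=()):
    """Flat dependency list for an install with the given extras."""
    deps = list(requires)
    for name in extras:
        deps.extend(extras_require.get(name, []))
    return deps

print(deps_for())          # runtime deps only
print(deps_for(["test"]))  # runtime deps + test extra
```

Under this model the reserved names carry semantics by convention only;
nothing in the resolution logic treats "test" differently from any other
extra.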
> That just leaves "build_requires", which turns out to interact
> awkwardly with the "extras" system: if you write "pip install
> MyProject[test]", does it install all the "test" dependencies,
> regardless of whether they're listed in run_requires or
> build_requires?
>
> If yes: then why are run_requires and build_requires separate?
> If no: then how do you request installation of the "test" build extra?
> Or are build extras prohibited entirely?
>
> That suggests that perhaps "build" should just be a conventional extra
> as well, and considered orthogonal to the other conventional extras.
> (I'm sure this idea has been suggested before, but I don't recall who
> suggested it or when)
>
> And if build, test, doc, and dev are all handled as extras, then the
> top level name "run_requires" no longer makes sense, and the field
> name should go back to just being "requires".
> Under that evaluation, we'd be left with only the following top level
> fields defined for dependency declarations:
>
> - "requires": list where entries are either a string containing a PEP
> 508 dependency specifier or else a hash map containing a "requires"
> key plus "extra" or "environment" fields as qualifiers

+1

> - "integrates": replacement for "meta_requires" that only allows
> pinned dependencies (i.e. hash maps with "name" & "version" fields, or
> direct URL references, rather than a general PEP 508 specifier as a
> string)

Pipfile.lock?

What happens here when something is listed in both requires and
integrates? Where/do these get merged on the "name" attr as a key,
given a presumed namespace URI prefix (https://pypi.org/project/)?

> For converting old metadata, any concrete dependencies that are
> compatible with the "integrates" field format would be mapped that
> way, while everything else would be converted to "requires" entries.

What heuristic would help identify compatibility with the integrates
field?
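One possible heuristic, sketched below: treat exact pins and direct URL
references as "integrates"-compatible, and leave everything else in
"requires". The pin-detection regex is a deliberate simplification of
PEP 508, and the output shapes are assumptions, not the PEP 426 schema:

```python
# Sketch of a requires/integrates classifier for old metadata.
# "integrates" only admits fully pinned dependencies (name == version)
# or direct URL references; ranges and bare names stay in "requires".
import re

PINNED = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*==\s*([^,;\s]+)\s*$")

def classify(requirement):
    m = PINNED.match(requirement)
    if m:
        return ("integrates", {"name": m.group(1), "version": m.group(2)})
    if "@" in requirement:  # crude check for PEP 508 'name @ url' form
        return ("integrates", {"requires": requirement})
    return ("requires", {"requires": requirement})

print(classify("pyobjc-core==3.2.1"))  # exact pin  -> integrates
print(classify("requests>=2.0"))       # open range -> requires
```

A real converter would need to parse full PEP 508 specifiers (markers,
multiple clauses) rather than pattern-match strings, but the decision
boundary would be the same.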
> The semantic differences between normal runtime dependencies and
> "dev", "test", "doc" and "build" requirements would be handled as
> extras, regardless of whether you were using the old metadata format
> or the new one.

+1 from me. I can't recall whether I've used {"dev", "test", "doc",
"build"} as extras names in the past, though I can remember thinking
"wouldn't it be more intuitive to do it [that way]".

Is this backward compatible? Extras still work as extras?

> Going the other direction would be similarly straightforward since
> (excluding extensions) the set of required conceptual entities has
> been reduced back to the set that already exists in the current
> metadata formats. While "requires" and "integrates" would be distinct
> fields in pydist.json, the decomposed fields in the latter would map
> back to their string-based counterparts in PEP 508 when converted to
> the legacy metadata formats.
>
> Cheers,
> Nick.
>
> P.S. I'm definitely open to a PR that amends the PEP 426 draft along
> these lines. I'll get to it eventually myself, but there are some
> other things I see as higher priority for my open source time at the
> moment (specifically the C locale handling behaviour of Python 3.6 in
> Fedora 26 and the related upstream proposal for Python 3.7 in PEP 538)

I need to find a job; my time commitment here is inconsistent. I'm
working on a project (nbmeta) for generating, displaying, and embedding
RDFa and JSONLD in Jupyter notebooks (w/ _repr_html_() and an
OrderedDict) which should refresh the JSONLD @context-writing skills
necessary to define the RDFS vocabulary we could/should have at
https://schema.python.org/ .

- [ ] JSONLD PEP (<- PEP 426)
  - [ ] examples / test cases
    - I've referenced IPython as an example package; are there other
      hard test cases for python packaging metadata conversion (i.e.
      one that uses every feature of each metadata format)?
- [ ] JSONLD @context
- [ ] class PackageMetadata
- [ ] wheel: (additionally) generate JSONLD metadata
- [ ] schema.python.org: master, gh-pages (or e.g. "https://www.pypa.io/ns#")
- [ ] warehouse: add a ./jsonld view (to legacy?)

https://github.com/pypa/interoperability-peps/issues/31

> --
> Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
> _______________________________________________
> Distutils-SIG maillist - Distutils-SIG@python.org
> https://mail.python.org/mailman/listinfo/distutils-sig
_______________________________________________
Distutils-SIG maillist - Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig