On Thu, Oct 18, 2012 at 10:55 PM, Stephen J. Turnbull <step...@xemacs.org> wrote:
> Executive summary:
>
> You probably should include a full ABNF grammar....
>
> Daniel Holth writes:
>
> > To support empty lines and lines with indentation with respect to
> > the RFC 822 format, any CRLF character has to be suffixed by 7 spaces
> > followed by a pipe ("|") char. [...]
> > This encoding implies that any occurrences of a CRLF followed by 7 spaces
> > and a pipe char have to be replaced by a single CRLF when the field
> > is unfolded using a RFC822 reader.
>
> This isn't RFC 822 unfolding at all. An RFC 822 "reader" will simply
> remove the CRLF and optionally "canonicalize" the spaces (the latter
> is not allowed by RFC 822, but sometimes it's observed). This implies
> that if you use an RFC 822 reader, you need to replace instances of the
> regexp r"\s+\|" with a newline. (If you have a conforming reader, you
> can use the regexp r"\s{7}\|" instead.) And of course you have to
> RFC-2047-encode non-ASCII in an RFC-822 field.
>
> So please don't refer to the basic format ("field-name: field-body"
> followed by optional continuation lines) as "RFC822". "Inspired by
> RFC 822" maybe. Better "chosen to resemble the familiar RFC 822
> header format used in email and netnews." (Note that RFC 822 is
> actually ambiguous even about the basic format; section 3.4.2 implies
> that "name :body" would be an acceptable field, although section
> 3.1.2 doesn't seem to allow space before the colon. Referring to RFC
> 822 as a standard here is a bad idea. There is a reason why that
> standard gets revised/replaced periodically!)
>
> I don't understand why you specify that the newline is represented by
> CRLF *after* unfolding. Once unfolded, these fields are all what
> RFC822 would call "unstructured fields" (in the context of that RFC).
> They will contain text followed by a terminating CRLF, but no others.
> In fact that CRLF is redundant, and may as well be stripped (and
> probably will be, in most implementations).
>
> I don't understand why you specify newline as CRLF here, except to
> pretend that you're respecting RFC 822. But all you're using are the
> division of a field into field-name and field-body by a colon, and the
> convention that a newline followed by folding whitespace is a
> continuation line. These are both trivial to implement, and almost
> all implementations will undoubtedly read the file as *text* in
> universal newline mode. I see no reason to specify a binary format.
>
> > Author-email (optional)
> > :::::::::::::::::::::::
> >
> > A string containing the author's e-mail address. It can contain
> > a name and e-mail address in the legal forms for a RFC-822
> > ``From:`` header.
>
> Heavens above, no! From RFC 822, this:
>
>     Wilt . (the Stilt) chamberl...@nba.us
>
> is a legal email address, which probably would be represented
> conventionally as
>
>     "Wilt (the Stilt) Chamberlain" <wilt.chamberl...@nba.us>
>
> However, it's not at all clear that all mail clients, let alone just
> plain folks, will interpret the first form correctly. And there are
> worse examples given in that RFC. Is there a reason why you can't
> require these to be in the form recommended by RFC 5322 (ie, the
> "conventional representation" above)? Or you could relax this so that
> the quotes are prohibited.
>
> > License (optional)
> > ::::::::::::::::::
> >
> > Text indicating the license covering the distribution where the license
> > is not a selection from the "License" Trove classifiers. See
> > "Classifier" below. This field may also be used to specify a
> > particular version of a licencse which is named via the ``Classifier``
>                                  A
> typo----------------------------+
>
> > field, or to indicate a variation or exception to such a license.
>
> This won't do as is. It doesn't exclude the possibility of including
> a complete license, and if that is intentional, this field needs to be
> in the same format as "Distribution". Licenses are complex documents,
> needing at least some of the power of something like ReST. You may as
> well give them all of it.
>
> > Project-URL (multiple-use)
> > Provides-Extra (multiple use)
>
> Hyphen or no hyphen? Consistency is good.
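
For reference, the post-processing step described in the quoted message can be sketched as follows. This is a minimal sketch, assuming the reader has already removed the line breaks so that each original newline survives as whitespace followed by a pipe. The two patterns are the ones given in the message; the function name `restore_newlines` and the `strict` flag are hypothetical. Neither pattern can tell a folding marker from a literal pipe that happens to follow whitespace in the text.

```python
import re

# Patterns quoted from the message: the lenient one tolerates a reader that
# collapsed the 7 spaces, the strict one expects all 7 to be intact.
LENIENT = re.compile(r"\s+\|")
STRICT = re.compile(r"\s{7}\|")


def restore_newlines(body: str, strict: bool = False) -> str:
    """Turn each whitespace-plus-pipe folding marker back into a newline.

    Hypothetical helper for illustration only; it is not part of any
    packaging library.
    """
    pattern = STRICT if strict else LENIENT
    return pattern.sub("\n", body)


if __name__ == "__main__":
    pad = " " * 7
    folded = "line one" + pad + "|line two" + pad + "|" + pad + "|  indented"
    print(restore_newlines(folded, strict=True))
```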
I will include or remove the hyphen.

Your other comments are also true of the predecessor Metadata 1.2. The |
folding discussion could probably die. Personally I do not respect RFC822
at all (in this format). I rather expect the pragmatic implementer to do
more or less

    [line.split(':', 1) for line in open('METADATA') if line[0].isalpha()]

The fields that matter at runtime (Name, Version, Requires-Dist,
Provides-Extra) are all single-line only. Basically everything else is a
curiosity for the human reader.

The .dist-info (PEP 376) or the wheel spec should gain a well-known file
package-1.0.dist-info/LICENSE. Many open source licenses require that you
include the license with every copy of the program.

Thanks,

Daniel Holth
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
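
The one-liner in the reply can be spelled out a little more defensively. The following is only a sketch of that pragmatic approach, not the behaviour of any existing tool: the function name `read_fields` and the sample field values are hypothetical. It keeps every occurrence of a multiple-use field, and it silently drops continuation lines, which matches the point that the fields needed at runtime are single-line.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def read_fields(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Collect "Name: value" lines, ignoring continuation and blank lines.

    Hypothetical helper: a slightly more defensive version of the list
    comprehension in the message.
    """
    fields: Dict[str, List[str]] = defaultdict(list)
    for line in lines:
        # Field lines start with a letter; continuation lines start with
        # whitespace and blank lines are empty, so both are skipped here.
        if not line[:1].isalpha() or ":" not in line:
            continue
        name, value = line.split(":", 1)
        fields[name.strip()].append(value.strip())
    return dict(fields)


if __name__ == "__main__":
    # Made-up sample content, for illustration only.
    sample = [
        "Name: example\n",
        "Version: 1.0\n",
        "Requires-Dist: foo (>=1.0)\n",
        "Requires-Dist: bar\n",
        "Provides-Extra: docs\n",
        "Description: first line\n",
        "       |second line\n",
    ]
    print(read_fields(sample))
```

Passing an open file object works as well, since iterating over a text file yields its lines.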