Executive summary: You probably should include a full ABNF grammar....
Daniel Holth writes:

> To support empty lines and lines with indentation with respect to
> the RFC 822 format, any CRLF character has to be suffixed by 7 spaces
> followed by a pipe ("|") char.
[...]
> This encoding implies that any occurences of a CRLF followed by 7 spaces
> and a pipe char have to be replaced by a single CRLF when the field
> is unfolded using a RFC822 reader.

This isn't RFC 822 unfolding at all. An RFC 822 "reader" will simply
remove the CRLF and optionally "canonicalize" the spaces (the latter is
not allowed by RFC 822, but sometimes it's observed). This implies that
if you use an RFC 822 reader, you need to replace instances of the
regexp r"\s+\|" with a newline. (If you have a conforming reader, you
can use the regexp r"\s{7}\|" instead.) And of course you have to
RFC-2047-encode non-ASCII in an RFC-822 field.

So please don't refer to the basic format ("field-name: field-body"
followed by optional continuation lines) as "RFC822". "Inspired by
RFC 822" maybe. Better "chosen to resemble the familiar RFC 822 header
format used in email and netnews." (Note that RFC 822 is actually
ambiguous even about the basic format; section 3.4.2 implies that
"name :body" would be an acceptable field, although section 3.1.2
doesn't seem to allow space before the colon. Referring to RFC 822 as
a standard here is a bad idea. There is a reason why that standard gets
revised/replaced periodically!)

I don't understand why you specify that the newline is represented by
CRLF *after* unfolding. Once unfolded, these fields are all what RFC 822
would call "unstructured fields" (in the context of that RFC). They
will contain text followed by a terminating CRLF, but no other CRLFs.
In fact that CRLF is redundant, and may as well be stripped (and
probably will be, in most implementations).

I don't understand why you specify newline as CRLF here, except to
pretend that you're respecting RFC 822.
But all you're using are the division of a field into field-name and
field-body by a colon, and the convention that a newline followed by
folding whitespace is a continuation line. These are both trivial to
implement, and almost all implementations will undoubtedly read the
file as *text* in universal newline mode. I see no reason to specify a
binary format.

> Author-email (optional)
> :::::::::::::::::::::::
>
> A string containing the author's e-mail address. It can contain
> a name and e-mail address in the legal forms for a RFC-822
> ``From:`` header.

Heavens above, no! From RFC 822, this:

    Wilt . (the Stilt) chamberl...@nba.us

is a legal email address, which probably would be represented
conventionally as

    "Wilt (the Stilt) Chamberlain" <wilt.chamberl...@nba.us>

However, it's not at all clear that all mail clients, let alone just
plain folks, will interpret the first form correctly. And there are
worse examples given in that RFC.

Is there a reason why you can't require these to be in the form
recommended by RFC 5322 (ie, the "conventional representation" above)?
Or you could relax this so that the quotes are prohibited.

> License (optional)
> ::::::::::::::::::
>
> Text indicating the license covering the distribution where the license
> is not a selection from the "License" Trove classifiers. See
> "Classifier" below. This field may also be used to specify a
> particular version of a licencse which is named via the ``Classifier``
A typo----------------------------+
> field, or to indicate a variation or exception to such a license.

This won't do as is. It doesn't exclude the possibility of including a
complete license, and if that is intentional, this field needs to be in
the same format as "Distribution". Licenses are complex documents,
needing at least some of the power of something like ReST. You may as
well give them all of it.

> Project-URL (multiple-use)
> Provides-Extra (multiple use)

Hyphen or no hyphen? Consistency is good.
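
To make the unfolding point above concrete, here is a minimal Python
sketch of what a conforming RFC 822-style reader does with the proposed
"7 spaces plus pipe" encoding, and the extra step needed to get the
newlines back. The function names (fold, rfc822_unfold,
restore_newlines) are hypothetical, not taken from the PEP or from any
library. The sketch assumes the text is handled with "\n" newlines (or
"\r\n", which the unfold step also accepts), and it matches seven
literal spaces rather than \s{7}.

```python
import re

# Proposed continuation marker: newline, 7 spaces, pipe (assumed from the quoted text).
CONTINUATION = "\n" + " " * 7 + "|"


def fold(text):
    """Encode a multi-line value by suffixing every newline with 7 spaces and a pipe."""
    return text.replace("\n", CONTINUATION)


def rfc822_unfold(folded):
    """Unfold as a conforming reader would: drop a line break that is
    followed by whitespace, leaving that whitespace in place."""
    return re.sub(r"\r?\n(?=[ \t])", "", folded)


def restore_newlines(unfolded):
    """Extra, non-RFC-822 step: turn each '7 spaces + pipe' back into a newline."""
    return re.sub(r" {7}\|", "\n", unfolded)


sample = "First line\n\n    indented line\nlast line"
folded = fold(sample)
unfolded = rfc822_unfold(folded)      # line breaks gone, markers still present
restored = restore_newlines(unfolded)  # original text recovered
```

In this sketch the plain unfold alone does not give back the original
text; only the additional substitution does, which is why the encoding
is not simply "RFC 822 unfolding".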