Re: Why the normative form of IETF Standards is ASCII
Martin,

Thanks for your great review!

On 10-03-26 4:17 PM, Martin Rex m...@sap.com wrote:

> I downloaded the WG document ASCII I-D (14-pages) from http://tools.ietf.org/id/draft-ietf... loaded it into NRoffEdit, selected Edit-Convert Text to NRoff, spent about 30 minutes fixing the Table Of Contents, I-D header and some minor formatting defects from the conversion along with several existing spelling errors reported by NRoffEdit and formatting issues like new sections starting very close to the bottom of pages. ... the original author is likely an xml2rfc user without a spell checker in his tool-chain.

Just a short comment on this to clarify. You don't need to fix the table of contents in NroffEdit. You just delete the present one, then select Edit - Paste managed 'Table of Contents', and NroffEdit will generate a new one for you that is automatically updated as you edit the draft.

/Stefan

___ Ietf mailing list Ietf@ietf.org https://www.ietf.org/mailman/listinfo/ietf
Re: Why the normative form of IETF Standards is ASCII
On 27.03.2010 00:17, Martin Rex wrote:

> ...
> If an I-D author has issues with idnits complaining about formatting, then the toolchain of that author is likely responsible for this shortcoming.
> ...

Indeed; or the lack of a tool chain :-)

> ...
> IMHO, being able to do this without chasing around for an authoring version of someone else's draft is neat. For various reasons, asking the original I-D editor for an authoring format version of his I-D was not an option -- and an XML-based authoring format would have been entirely useless to me anyway.
> ...

Just clarifying: but it would have been helpful for other authors that use xml2rfc. Thus, it's good to submit it with the Internet Draft when available. (But PLEASE submit standalone versions that do not require additional files; xml2rfc's toxml mode is your friend).

> ...
> Personally, I know very little about XML. I don't use it myself, the code that I'm writing and maintaining neither uses nor creates XML. All of my Editors are plain text editors and I don't know or care how any of my Browsers (MSIE6 or FF3.5) could be made to display XML.
> ...

I'm editing XML code with a text editor. This is not a problem.

I realize that you don't care about the XML format, and doing it in browsers, but for those who might be interested:

- Get rfc2629.xslt (distribution archive at http://greenbytes.de/tech/webdav/rfc2629xslt.zip)
- Add <?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?> to the top of the source file (but after the XML declaration)
- Point your browser to the source file (works with the two browsers Martin mentioned, and all recent ones anyway).
- Caveats: do not use the PI-based inclusion mechanism; more documentation at http://greenbytes.de/tech/webdav/rfc2629xslt/rfc2629xslt.html.

> ...
> (and the editing process I use must be entirely offline capable for policy reasons that are otherwise not relevant to this discussion).
> ...

Yes, that's a given. You don't need to repeat that :-)

Best regards, Julian
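The step of adding the stylesheet processing instruction after the XML declaration can be scripted. What follows is only a minimal sketch in Python, assuming the draft source is available as a text string and that rfc2629.xslt sits in the same directory as the source file. The helper name add_stylesheet_pi is made up for illustration; it is not part of xml2rfc or rfc2629.xslt.

```python
import re

# Processing instruction from the steps above; the href assumes that
# rfc2629.xslt is in the same directory as the draft source.
STYLESHEET_PI = "<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>"

# Matches an XML declaration at the very start of the document,
# including the line break that follows it (if any).
XML_DECL = re.compile(r"\A<\?xml\s[^>]*\?>[ \t]*\r?\n?")


def add_stylesheet_pi(source: str) -> str:
    """Return source with the stylesheet PI placed after the XML declaration.

    Hypothetical helper: if there is no XML declaration the PI goes first,
    and if a stylesheet PI is already present the text is returned unchanged.
    """
    if "<?xml-stylesheet" in source:
        return source
    match = XML_DECL.match(source)
    if match is None:
        return STYLESHEET_PI + "\n" + source
    head = match.group(0)
    if not head.endswith("\n"):
        head += "\n"
    return head + STYLESHEET_PI + "\n" + source[match.end():]
```

Calling the function a second time on its own output leaves the text unchanged, so it should be safe to run repeatedly on the same file.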
Re: Why the normative form of IETF Standards is ASCII
Andrew Sullivan wrote:

> On Sat, Mar 20, 2010 at 02:55:56PM -0800, Bob Braden wrote:
>
>> Drafts. That always seemed counter-productive to me. I am not sure I would characterize the problem as serious, but it does seem to warp common sense for the sake of bureaucratic uniformity.)
>
> I got some mail off-list about calling the problem serious, too, so I thought I should justify myself. I had, in the past year, two different DNSEXT participants send me frustrated email because of the idnits checks. The people in question were both long-time contributors to the IETF with perhaps idiosyncratic toolchains. Neither of them was using xml2rfc, and neither of them had well-maintained *roff templates that just did the right thing. My co-chair spent some time one day fiddling with the draft of one of these people in order to make it pass the submission checks for a -00 draft, mostly because the author was about to give up in frustration.

You SHOULD have tried NRoffEdit. It would likely have solved all your problems in a matter of minutes.

If an I-D author has issues with idnits complaining about formatting, then the toolchain of that author is likely responsible for this shortcoming.

Just a few weeks ago, I used NroffEdit in order to make a suggestion for a document update in a fashion that makes it easy for others to make an assessment.

I downloaded the WG document ASCII I-D (14-pages) from http://tools.ietf.org/id/draft-ietf... loaded it into NRoffEdit, selected Edit-Convert Text to NRoff, spent about 30 minutes fixing the Table Of Contents, I-D header and some minor formatting defects from the conversion along with several existing spelling errors reported by NRoffEdit and formatting issues like new sections starting very close to the bottom of pages. ... the original author is likely an xml2rfc user without a spell checker in his tool-chain.

Then I edited my changes into the document, uploaded the resulting ASCII TXT output to our internet-accessible FTP server and sent a URL to the WG mailing list with a prefilled

    http://tools.ietf.org/rfcdiff?url1=http://tools.ietf.org/id/draft-ietf-...
        url2=ftp://my-server/...draft-with-my-suggestions

IMHO, being able to do this without chasing around for an authoring version of someone else's draft is neat. For various reasons, asking the original I-D editor for an authoring format version of his I-D was not an option -- and an XML-based authoring format would have been entirely useless to me anyway.

And this was the first time that I tried this feature of NRoffEdit!

The availability of decent tools to make your I-D authoring task simple is important, and some of the existing solutions appear to be more difficult to install and more difficult to use than others.

Personally, I know very little about XML. I don't use it myself, the code that I'm writing and maintaining neither uses nor creates XML. All of my Editors are plain text editors and I don't know or care how any of my Browsers (MSIE6 or FF3.5) could be made to display XML.

Being able to see right away in the right output pane of NRoffEdit how the stuff that I type comes out formatted while typing is nice. I'm writing with 10 fingers and usually have ~50 app windows open at the same time. I _really_ prefer to get things done _without_ switching between apps constantly when I'm working on something, because that will considerably slow down my work.

(and the editing process I use must be entirely offline capable for policy reasons that are otherwise not relevant to this discussion).

-Martin
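A prefilled comparison link of the kind described above can be assembled programmatically. The following is a rough sketch in Python; the base URL and the url1/url2 parameter names are taken from the message, while the helper name rfcdiff_link and the example.org locations are placeholders invented for illustration. Whether the service expects the embedded URLs to be percent-encoded is an assumption here.

```python
from urllib.parse import urlencode

# Base URL of the rfcdiff service as quoted in the message above.
RFCDIFF_BASE = "http://tools.ietf.org/rfcdiff"


def rfcdiff_link(old_url: str, new_url: str) -> str:
    """Build a prefilled rfcdiff link comparing two drafts (hypothetical helper)."""
    # urlencode percent-encodes the embedded URLs and joins the pairs with "&".
    query = urlencode({"url1": old_url, "url2": new_url})
    return f"{RFCDIFF_BASE}?{query}"


# Placeholder locations; substitute the real draft URLs.
link = rfcdiff_link(
    "http://example.org/draft-original.txt",
    "ftp://example.org/draft-with-my-suggestions.txt",
)
print(link)
```

Readers then only need to follow one link to see the proposed changes side by side with the current draft.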
Re: Why the normative form of IETF Standards is ASCII
Actually, there seems to be one here: http://sourceforge.net/projects/rfc2xml/

Not sure how good a job it does.

/Stefan

On 10-03-24 5:10 PM, Julian Reschke julian.resc...@gmx.de wrote:

> On 25.03.2010 00:56, Stefan Santesson wrote:
>
>> Julian, One minor question. How do you use xml2rfc to edit a document when you don't have that document in xml format?
>
> You don't.
>
>> For example, if it was not originally created using xml2rfc.
>
> Somebody might have converted it (you may want to google for it, or ask the RFC Editor). Otherwise, you need to convert.
>
> Anticipating the next question: no, I'm not aware of a tool that does that well; in my experience, to get good XML (with proper markup of artwork, lists, references...), you really have to do it manually.
>
> Best regards, Julian
Re: Why the normative form of IETF Standards is ASCII
Maybe it's just me, but I couldn't find any files there.

On Mar 25, 2010, at 12:03 AM, Stefan Santesson wrote:

> Actually, there seems to be one here: http://sourceforge.net/projects/rfc2xml/
>
> Not sure how much of a good work it does.
>
> /Stefan
>
> On 10-03-24 5:10 PM, Julian Reschke julian.resc...@gmx.de wrote:
>
>> On 25.03.2010 00:56, Stefan Santesson wrote:
>>
>>> Julian, One minor question. How do you use xml2rfc to edit a document when you don't have that document in xml format?
>>
>> You don't.
>>
>>> For example, if it was not originally created using xml2rfc.
>>
>> Somebody might have converted it (you may want to google for it, or ask the RFC Editor). Otherwise, you need to convert.
>>
>> Anticipating the next question: no, I'm not aware of a tool that does that well; in my experience, to get good XML (with proper markup of artwork, lists, references...), you really have to do it manually.
>>
>> Best regards, Julian
Re: Why the normative form of IETF Standards is ASCII
On Wed, Mar 24, 2010 at 11:56 PM, Stefan Santesson ste...@aaa-sec.com wrote:

> One minor question. How do you use xml2rfc to edit a document when you don't have that document in xml format?

I've had luck with converting using xml2rfc-xxe (http://xml2rfc-xxe.googlecode.com/); you select the entire document, use paste as paragraphs to get it into xxe, and insert sections / reflow paragraphs / etc. as needed.

Bill
Re: Why the normative form of IETF Standards is ASCII
Julian,

One minor question. How do you use xml2rfc to edit a document when you don't have that document in xml format? For example, if it was not originally created using xml2rfc.

/Stefan

On 10-03-22 2:58 PM, Julian Reschke julian.resc...@gmx.de wrote:

> On 22.03.2010 22:28, Martin Rex wrote:
>
>> ...
>> With xml2rfc 1, 2, 3, 4, 5 and 6 are all separate, manual and painful steps that require all sorts of unspecified other software and require you to search around for information and read lots of stuff in order to get it working. The specific tools, their usage and the options to automate some steps from the above list vary significantly between OS.
>> ...
>
> Hi Martin,
>
> it's clear that you're very happy with nroffedit. I have no problem with that.
>
> But you paint a picture of xml2rfc that isn't totally accurate. There are editors with spell checking. There are editors with direct preview. Also, you need two things, exactly as for Nroffedit (xml2rfc.tcl + a TCL impl, instead of nroffedit + Java). There is an alternate impl, rfc2629.xslt, which requires exactly one file, and a browser.
>
>> ...
>> My first encounter with TeX was in 1990 on an Amiga, and it came _with_ an editor where you could run your document through the TeX processor and get it displayed (DVI graphics output on screen) with a single keypress. Not WYSIWYG, but still quite usable. And comparing the quality of the printed output, none of the existing WYSIWYG solutions came even close.
>> ...
>
> I liked TeX, too. Although I ran it on that other 68K machine.
>
>> In absence of an easy-to-use, single tool (preferably platform-independent) to take care of most of the document editing, INCLUDING visualizing the output, I see no value in considering an XML-based authoring format for I-Ds and RFCs.
>
> Again: use a text editor plus a web browser.
>
>> ...
>> Even reflowing the existing paginated ASCII output should be fairly simple. What is missing in that output is the information about the necessary amount of vertical whitespace between pages when removing page breaks/header/footer lines.
>> ...
>
> What's also missing is the information whether text is reflowable or not (think ASCII artwork, program listing, tables...).
>
> Best regards, Julian
Re: Why the normative form of IETF Standards is ASCII
On 10-03-12 8:34 PM, Julian Reschke julian.resc...@gmx.de wrote:

>> Because of the page breaks and the consistent presence of these headers and footers just before and after the page breaks, an accessibility tool should be able to recognize them as such.
>
> I agree it would be nice if they did that. Do they?

Both NroffEdit and the Idnits tool do. It's not that hard. Pages are separated by a form feed character.

/Stefan
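As an illustration of how a tool might use the form feed separator, here is a small Python sketch. It is not how NroffEdit or idnits actually do it; the function names are hypothetical, and it assumes that every page carries exactly one header line at the top and one footer line at the bottom, which does not hold for the first page of a real draft.

```python
FORM_FEED = "\f"


def split_pages(text: str) -> list[str]:
    """Split a paginated draft into pages at form feed characters."""
    return text.split(FORM_FEED)


def strip_header_footer(page: str) -> list[str]:
    """Drop the first and last non-blank line of a page.

    Assumption: each page carries exactly one header line and one footer
    line. The first page of a real draft has no running header, so a
    caller may want to treat it separately.
    """
    lines = page.splitlines()
    nonblank = [i for i, line in enumerate(lines) if line.strip()]
    if len(nonblank) < 2:
        return lines
    first, last = nonblank[0], nonblank[-1]
    return lines[first + 1:last]
```

The blank lines around the body are kept on purpose, since a reflowing tool would still have to decide how much vertical whitespace belongs between the joined pages.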
Re: Why the normative form of IETF Standards is ASCII
On 25.03.2010 00:56, Stefan Santesson wrote:

> Julian, One minor question. How do you use xml2rfc to edit a document when you don't have that document in xml format?

You don't.

> For example, if it was not originally created using xml2rfc.

Somebody might have converted it (you may want to google for it, or ask the RFC Editor). Otherwise, you need to convert.

Anticipating the next question: no, I'm not aware of a tool that does that well; in my experience, to get good XML (with proper markup of artwork, lists, references...), you really have to do it manually.

Best regards, Julian
Re: Why the normative form of IETF Standards is ASCII
On 25.03.2010 01:02, Stefan Santesson wrote:

> On 10-03-12 8:34 PM, Julian Reschke julian.resc...@gmx.de wrote:
>
>>> Because of the page breaks and the consistent presence of these headers and footers just before and after the page breaks, an accessibility tool should be able to recognize them as such.
>>
>> I agree it would be nice if they did that. Do they?
>
> Both NroffEdit and the Idnits tool do. It's not that hard. Pages are separated by a form feed character.

I meant accessibility tools such as screen readers.

Best regards, Julian
Using xml2rfc (was: Re: Why the normative form of IETF Standards is ASCII)
Martin Rex mrex at sap dot com wrote:

> If anything deserves the description 60's style document editing then it is the current xml2rfc processing, which requires a whole bunch of extra software, lots of manual processing steps, reading of lots of documentation and plenty of time and desire for humiliation in order to test all those features through the manual self-torture process.

I'm probably not a good data point, since I've only contributed to two RFCs, and as a software developer I don't have much problem with using multiple tools to get the job done (or with writing XML). But I have to take issue with the humiliation and torture scenario described by Martin.

I have never attempted to install xml2rfc on my local machine; I have only used the online version at http://xml.resource.org/, and while I would not lie and say the experience was completely trouble-free or that the documentation was always perfect, on no account was it torture, certainly not on the level that would discourage me from writing another I-D. (Endless WG lily-gilding and tolerance of career trolls might, though.)

In particular, with my heavily text-based drafts, in most cases I found xml2rfc perfectly adequate to handle formatting. If you really care about the exact formatting, you're going to have to go back and forth between the editor output and your browser anyway, since browsers can differ in their output.

One phenomenon that always emerges from this joint character-set/RFC format discussion, every time it comes up, is that someone feels there should be one and only one process and tool set for writing I-Ds, and someone else feels the need to wave the Don't Tread on Me flag in protest.

--
Doug Ewell | Thornton, Colorado, USA | http://www.ewellic.org
RFC 5645, 4645, UTN #14 | ietf-languages @ http://is.gd/2kf0s
Re: Using xml2rfc (was: Re: Why the normative form of IETF Standards is ASCII)
On Mar 23, 2010, at 6:12 AM, Doug Ewell wrote:

> Martin Rex mrex at sap dot com wrote:
>
>> If anything deserves the description 60's style document editing then it is the current xml2rfc processing, which requires a whole bunch of extra software, lots of manual processing steps, reading of lots of documentation and plenty of time and desire for humiliation in order to test all those features through the manual self-torture process.
>
> I'm probably not a good data point, since I've only contributed to two RFCs, and as a software developer I don't have much problem with using multiple tools to get the job done (or with writing XML). But I have to take issue with the humiliation and torture scenario described by Martin.

I have been staying out of this, as one of those pointless debates that happen. But here I will chime in.

I do indeed use xml2rfc, and yes I installed an editor that I found helpful. It happens to be XMLMind with Bill Fenner's WYSIKN plugins. I do in fact keep a directory of current work, and I do in fact run xml2rfc on my system. It does all mostly work.

The one thing that really makes it a little harder to use than, say, Word or Pages, is that I draw ASCII Art in one of a couple of other applications and have to drop it into the documents in a separate tool. It's not too hard.

http://www.ipinc.net/IPv4.GIF
Re: Why the normative form of IETF Standards is ASCII
What I found strange is that for many, many years it was difficult to get started on writing an I-D, because of the lack of a decent tool to facilitate writing I-Ds.

The processing steps in producing an I-D are roughly these:

1. get document template
2. edit document
3. spell checking
4. convert authoring -> publishing format
5. display formatted output
6. make adjustments of document for nicer formatted output
7. idnits
8. submission

NRoffEdit does 1+2+3+4+5+6 for you automatically, intuitively, WYSIWYG. You do *NOT* need to download, install or compile _any_ other software besides a Java Runtime. Just unzip and start the NroffEdit.jar. And because it is Java, the usage is going to be fairly independent of the underlying OS, which is a big plus for heterogeneous environments.

With xml2rfc 1, 2, 3, 4, 5 and 6 are all separate, manual and painful steps that require all sorts of unspecified other software and require you to search around for information and read lots of stuff in order to get it working. The specific tools, their usage and the options to automate some steps from the above list vary significantly between OS.

Every author who wants to start writing his first I-D will have to go through this manual self-torture process when choosing xml2rfc:

1. enter what you want to test into your document within your editor
2. save file
3. switch from editor to command line
4. run xml2rfc processor on document
5. switch between xml2rfc output and editor to fix xml2rfc reported errors
6. save fixed file
7. switch from editor to command line
8. run xml2rfc processor on document
9. switch to visualization app
10. load xml2rfc output into visualization app
11. change your document based on output in visualization app

The original nroff approach was about as complicated, though there were probably more undesired formatting results than nroff errors (saving you 5+6), and the nroff output could be used directly as visualization (|more), obviating an additional app for that (saving you 9+10).

But with the advent of NRoffEdit, that awkward processing issue can be entirely avoided with the easy and intuitive nroff authoring format -- you even get an English spell checker included. All you need to know is described in the one-page summary Help-Supported features of NRoffEdit plus the stuff that File-New draft from template creates for you.

If anything deserves the description 60's style document editing then it is the current xml2rfc processing, which requires a whole bunch of extra software, lots of manual processing steps, reading of lots of documentation and plenty of time and desire for humiliation in order to test all those features through the manual self-torture process.

My first encounter with TeX was in 1990 on an Amiga, and it came _with_ an editor where you could run your document through the TeX processor and get it displayed (DVI graphics output on screen) with a single keypress. Not WYSIWYG, but still quite usable. And comparing the quality of the printed output, none of the existing WYSIWYG solutions came even close.

In absence of an easy-to-use, single tool (preferably platform-independent) to take care of most of the document editing, INCLUDING visualizing the output, I see no value in considering an XML-based authoring format for I-Ds and RFCs.

I do not mind that some people prefer using the xml2rfc approach, and I am fine if they continue to use it. But I do not like to use that awkward document editing process, and I do not want the IETF to force any authors to subject themselves to the xml2rfc torture when there is running code (NRoffEdit) readily available that significantly facilitates document editing in the existing format.

And btw. creating flowable HTML that might be easier to render on mobile devices from the .nroff source should be very easy. Even reflowing the existing paginated ASCII output should be fairly simple. What is missing in that output is the information about the necessary amount of vertical whitespace between pages when removing page breaks/header/footer lines.

-Martin
Re: Why the normative form of IETF Standards is ASCII
On 22.03.2010 22:28, Martin Rex wrote:

> ...
> With xml2rfc 1, 2, 3, 4, 5 and 6 are all separate, manual and painful steps that require all sorts of unspecified other software and require you to search around for information and read lots of stuff in order to get it working. The specific tools, their usage and the options to automate some steps from the above list vary significantly between OS.
> ...

Hi Martin,

it's clear that you're very happy with nroffedit. I have no problem with that.

But you paint a picture of xml2rfc that isn't totally accurate. There are editors with spell checking. There are editors with direct preview. Also, you need two things, exactly as for Nroffedit (xml2rfc.tcl + a TCL impl, instead of nroffedit + Java). There is an alternate impl, rfc2629.xslt, which requires exactly one file, and a browser.

> ...
> My first encounter with TeX was in 1990 on an Amiga, and it came _with_ an editor where you could run your document through the TeX processor and get it displayed (DVI graphics output on screen) with a single keypress. Not WYSIWYG, but still quite usable. And comparing the quality of the printed output, none of the existing WYSIWYG solutions came even close.
> ...

I liked TeX, too. Although I ran it on that other 68K machine.

> In absence of an easy-to-use, single tool (preferably platform-independent) to take care of most of the document editing, INCLUDING visualizing the output, I see no value in considering an XML-based authoring format for I-Ds and RFCs.

Again: use a text editor plus a web browser.

> ...
> Even reflowing the existing paginated ASCII output should be fairly simple. What is missing in that output is the information about the necessary amount of vertical whitespace between pages when removing page breaks/header/footer lines.
> ...

What's also missing is the information whether text is reflowable or not (think ASCII artwork, program listing, tables...).

Best regards, Julian
Re: Why the normative form of IETF Standards is ASCII
Doug Ewell wrote:

>> As many Japanese type the Yen sign when they actually want to input a back slash, the JIS character for the Yen sign is converted to the Unicode character for the Yen sign, which is not the back slash that was intended.
>
> I think this means that the user's kludge, in typing a yen sign to get a backslash, is not matched by Unicode with an equal and opposite kludge of converting the yen sign back to a backslash. I guess in the 1960s one could consider this a fault.

That is simply a reality, though it does not match your opinion.

It should also be noted that, in the Japanese encoding JIS C 6226, the back slash and the Yen sign were already separated in 1978, which means Unicode adds nothing.

> Why don't we ask one of the scores of software vendors that have deployed Unicode, at least as fully as this thread is about, just how disastrous their experience has been and how much better things would be if they had stuck with ISO 2022 instead?

See above.

Masataka Ohta
Re: Why the normative form of IETF Standards is ASCII
On Sat, Mar 20, 2010 at 02:55:56PM -0800, Bob Braden wrote:

> Drafts. That always seemed counter-productive to me. I am not sure I would characterize the problem as serious, but it does seem to warp common sense for the sake of bureaucratic uniformity.)

I got some mail off-list about calling the problem serious, too, so I thought I should justify myself.

I had, in the past year, two different DNSEXT participants send me frustrated email because of the idnits checks. The people in question were both long-time contributors to the IETF with perhaps idiosyncratic toolchains. Neither of them was using xml2rfc, and neither of them had well-maintained *roff templates that just did the right thing. My co-chair spent some time one day fiddling with the draft of one of these people in order to make it pass the submission checks for a -00 draft, mostly because the author was about to give up in frustration.

Now, one might think that such people could be counselled to use different tools, or to maintain their templates, or so on. But I think that's completely wrong: the point here is to make it easy to produce proto-specifications, and easy for the kinds of (possibly cranky) technical contributors the IETF tends to attract. We're supposed to be working on interoperability, and the formatting of the documents ought therefore to be a contribution to that goal, not some barrier that ensures only those using certain tools get to send in their contributions.

Best regards,

A

--
Andrew Sullivan a...@shinkuro.com Shinkuro, Inc.
Re: Why the normative form of IETF Standards is ASCII
Andrew Sullivan wrote:

> I had, in the past year, two different DNSEXT participants send me frustrated email because of the idnits checks. The people in question were both long-time contributors to the IETF with perhaps idiosyncratic toolchains. Neither of them was using xml2rfc, and neither of them had well-maintained *roff templates that just did the right thing.

While I recently spent a few extra hours editing nroff source, a solution for those who are less familiar with nroff is to provide an official nroff template.

Anyway, it's an issue caused by complex legal requirements, and the fundamental solution is to loosen the requirements, at least for I-Ds.

With regard to the subject, introduction of non-ASCII characters will make the matter a lot worse.

Masataka Ohta
Re: Why the normative form of IETF Standards is ASCII
On 20.03.2010 00:45, Martin Rex wrote:

> Julian Reschke wrote:
>
>> I don't buy that. We've got something like 1 billion people on the planet running web browsers, and I'm pretty confident we can find a few non-ASCII characters everybody can display which could be used in examples.
>
> What exactly is the purpose of a few non-ASCII characters everybody can display?
>
> And while the environments that I use are mostly capable to display ISO-Latin-1, I do _NOT_ know names for the majority of symbols from 128, and would have severe difficulties discussing stuff with such symbols in speech, like in-person, at a bar, over lunch or on the phone, and therefore don't want to have any of them in RFCs.

A few characters should be sufficient with specs that deal with I18N.

> Discussing non-ASCII characters often requires the use of unicode codepoints to avoid ambiguities and the lack of familiarity of most people of this planet with the glyphs on most unicode codepoints. Describing a unicode codepoint by its numeric value with characters from IA5/US-ASCII, on the other hand, is fairly simple and straightforward.

It is, but displaying an actual *example* isn't.

Best regards, Julian
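To make the point about describing code points with ASCII characters concrete, here is a short Python sketch that uses the standard unicodedata module. The function name describe and the sample string are chosen purely for illustration.

```python
import unicodedata


def describe(text: str) -> str:
    """Describe each non-ASCII character of text using only ASCII."""
    parts = []
    for ch in text:
        if ord(ch) < 128:
            continue  # plain ASCII needs no description
        # Fall back to a marker for code points that have no name.
        name = unicodedata.name(ch, "<unnamed>")
        parts.append(f"U+{ord(ch):04X} {name}")
    return "; ".join(parts)


# The sample is written with escapes so that this listing itself stays ASCII.
print(describe("na\u00efve \u00a5100"))
```

The description is unambiguous and survives an ASCII-only document, but it does not show the reader what the character looks like, which is the gap pointed out in the reply above.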
Re: Why the normative form of IETF Standards is ASCII
On 19.03.2010 09:47, Arnt Gulbrandsen wrote:

> ...
> (That would also help with the other kind of cross references, see [19] section 4.2 when [19] is updated. The likelihood that 4.2 is renumbered shrinks, since xml2rfc can warn when it happens.)
> ...

The preferred RFC editor style is symbolic names, btw.

rfc2629.xslt already supports an extension where the section number becomes part of the markup, making all kinds of interesting extensions possible.

Best regards, Julian
Re: Why the normative form of IETF Standards is ASCII
On 20.03.2010 00:26, Martin Rex wrote:

> ...
> When I submitted my very first I-D last November, it took me about 10 minutes to fix the few issues that idnits reported. If you have significantly more problems, then maybe you are using the wrong tool to write I-Ds. Try NRoffEdit. It will take care of many of these issues for you. :-)
> ...

I'm sure it's a nice tool.

> As previously mentioned, I gave up on trying to _install_ xml2rfc one hour after downloading it. I was writing the third page of my I-D one hour after downloading NRoffEdit.

1) I run xml2rfc locally. The installation time was close to zero; copy a single file. That's probably because my Cygwin installation had TCL preinstalled.

2) Alternative: test locally with rfc2629.xslt (and maybe a validating XML editor), then use the web service to produce ASCII.

Best regards, Julian
Re: Why the normative form of IETF Standards is ASCII
Julian Reschke wrote:

>> What exactly is the purpose of a few non-ASCII characters everybody can display?
>>
>> And while the environments that I use are mostly capable to display ISO-Latin-1, I do _NOT_ know names for the majority of symbols from 128, and would have severe difficulties discussing stuff with such symbols in speech, like in-person, at a bar, over lunch or on the phone, and therefore don't want to have any of them in RFCs.
>
> A few characters should be sufficient with specs that deal with I18N.

Yes, but the ASCII back slash is already a little too much for us Japanese, because in Japan JIS Latin, which assigns the Yen sign to the code point of the back slash, is so widely used.

Yes, we can and do accept it, but no Latin-1 please.

Most people in Japan can recognize the Yen sign as a back slash but can't recognize Latin-1 specific characters with fancy diacritical marks at all.

Masataka Ohta
Re: Why the normative form of IETF Standards is ASCII
Masataka Ohta mohta at necom830 dot hpcl dot titech dot ac dot jp wrote:

> Yes, but the ASCII back slash is already a little too much for us Japanese, because in Japan JIS Latin, which assigns the Yen sign to the code point of the back slash, is so widely used.

See, if you use any encoding of Unicode, you won't have this problem, because U+005C is unequivocally the backslash and U+00A5 is unequivocally the yen sign. There are no context-dependent duals in Unicode.

--
Doug Ewell | Thornton, Colorado, USA | http://www.ewellic.org
RFC 5645, 4645, UTN #14 | ietf-languages @ http://is.gd/2kf0s
Re: Why the normative form of IETF Standards is ASCII
Doug Ewell wrote:

> See, if you use any encoding of Unicode, you won't have this problem, because U+005C is unequivocally the backslash and U+00A5 is unequivocally the yen sign. There are no context-dependent duals in Unicode.

Character issues are a lot more complicated than you can imagine.

As many Japanese type the Yen sign when they actually want to input a back slash, the JIS character for the Yen sign is converted to the Unicode character for the Yen sign, which is not the back slash that was intended.

Here, the problem lies not in the complexity of Unicode but in the comparatively simple ASCII and JIS.

Even I can't fully predict how disastrous full deployment of Unicode could be.

Masataka Ohta
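The mismatch under discussion can be shown with a few lines of Python. This is only a sketch: the table below is hand-written to reflect the two positions where the JIS X 0201 Roman set differs from ASCII (0x5C and 0x7E), it is not one of Python's built-in codecs, and the function names are invented for the example.

```python
# JIS X 0201 Roman differs from ASCII at two positions.
JIS_ROMAN_OVERRIDES = {0x5C: "\u00a5", 0x7E: "\u203e"}  # yen sign, overline


def decode_ascii(data: bytes) -> str:
    """Interpret the bytes as US-ASCII."""
    return data.decode("ascii")


def decode_jis_roman(data: bytes) -> str:
    """Interpret 7-bit bytes as JIS X 0201 Roman (hand-written table, not a codec)."""
    if any(b > 0x7F for b in data):
        raise ValueError("only 7-bit input is handled in this sketch")
    return "".join(JIS_ROMAN_OVERRIDES.get(b, chr(b)) for b in data)


path = b"C:\\temp"  # the byte 0x5C sits between "C:" and "temp"
print(decode_ascii(path))      # 0x5C becomes U+005C, the backslash
print(decode_jis_roman(path))  # 0x5C becomes U+00A5, the yen sign
```

The same byte sequence yields two different Unicode strings depending on which 7-bit set the converter assumes, so the ambiguity sits in the conversion step and not in Unicode, where the two characters have distinct code points.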
Re: Why the normative form of IETF Standards is ASCII
Masataka Ohta mohta at necom830 dot hpcl dot titech dot ac dot jp wrote:
> Many Japanese users type the yen sign when they actually want to input a backslash; the JIS yen sign is then converted to the Unicode yen sign, which is not the backslash that was intended.
I think this means that the user's kludge, in typing a yen sign to get a backslash, is not matched by Unicode with an equal and opposite kludge of converting the yen sign back to a backslash. I guess in the 1960s one could consider this a fault.
> Here, the problem lies not in the complicated Unicode but in the rather simple ASCII and JIS. Even I can't fully predict how disastrous full deployment of Unicode could be.
Why don't we ask one of the scores of software vendors that have deployed Unicode, at least as fully as this thread is about, just how disastrous their experience has been and how much better things would be if they had stuck with ISO 2022 instead?
-- Doug Ewell | Thornton, Colorado, USA | http://www.ewellic.org RFC 5645, 4645, UTN #14 | ietf-languages @ http://is.gd/2kf0s
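The yen sign and backslash exchange above can be made concrete with a small sketch. The two-entry table below is a hand-written stand-in for the JIS X 0201 Roman set, written for illustration only and not a real codec; the names JIS_ROMAN_OVERRIDES, decode_jis_roman and code_points are hypothetical. It shows that the same byte 0x5C yields a different character depending on which 7-bit set the reader assumes, while Unicode keeps the two characters at separate code points.

```python
# Hand-written stand-in for the two positions where JIS X 0201 Roman
# differs from ASCII (illustrative table, not a real codec).
JIS_ROMAN_OVERRIDES = {0x5C: "\u00a5", 0x7E: "\u203e"}


def decode_jis_roman(data: bytes) -> str:
    """Decode 7-bit bytes as JIS X 0201 Roman (hypothetical helper)."""
    return "".join(JIS_ROMAN_OVERRIDES.get(b, chr(b)) for b in data)


def code_points(text: str) -> list:
    """Spell out a string as ASCII-only code point labels."""
    return [f"U+{ord(ch):04X}" for ch in text]


raw = b"a\\b"  # three bytes: 0x61 0x5C 0x62

print(code_points(raw.decode("ascii")))       # byte 0x5C read as ASCII
print(code_points(decode_jis_roman(raw)))     # same byte read as JIS Roman
print(decode_jis_roman(raw).encode("utf-8"))  # yen sign needs two bytes in UTF-8
```

Under these assumptions the middle character comes out as U+005C in the first case and U+00A5 in the second, which is the ambiguity both posters are describing from opposite sides.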
Re: Why the normative form of IETF Standards is ASCII
We issue errata for RFCs. Most errata address a substantive defect in the text that would affect the protocol. The RFC may be 'authoritative' (whatever that is meant to mean), but the errata are almost certainly what someone would want to actually implement to make the protocol work. I remember a case in the X.25 protocol specs which turned out to be so wrong that the only way to make them work was to look at bits on the wire and work out what real X.25 boxen did.
If I had the xml2rfc source, and the errata in appropriate form, I could then write a piece of code that shows the errata in context of the RFC, so that I can then see that there is a correction as I am looking at the page.
RFC stands for Request For Comments. The idea of an authoritative request is a little strange. This is not scripture. It is an engineering tool. There are plenty of IETF specs where best practice on the net is to ignore the MUST requirements of RFCs. A case in point is the ludicrous requirement in the FTP spec that the user interface default to ASCII mode rather than binary. That may have made sense in the mid 70s, but I somewhat doubt it; EBCDIC was already on its way out. The chance of meeting a system that uses a character set that could be matched through the naive transcoding mechanism in FTP clients is many orders of magnitude less than that of a straight guy getting lucky on chatroulette.
On Fri, Mar 19, 2010 at 1:06 PM, Henning Schulzrinne h...@cs.columbia.edu wrote:
> Maybe I'm not enough of an amateur lawyer, but has authoritative been a practical issue, i.e., has there been confusion or legal action because one rendition (say, PDF) differed in some trivial aspect from another (e.g., ASCII)?
> Pragmatically, one could simply state that one form (say, good-ol ASCII, to avoid endless debates and for historical reasons) was authoritative and that others were best effort versions of the same text and that any deviations and omissions were accidental and should be brought to the attention of the appropriate authorities. I'm sure we can come up with more legal boilerplate to phrase this more precisely - we seem to be getting good at this boilerplate thing...
> With that caveat, in the case of a (presumably exceedingly rare) production error, the non-authoritative version could then be updated, in the same manner that the auto-generated pseudo-HTML versions we have today on the IETF site change occasionally as the rendering program is improved. This doesn't seem to have caused a significant protocol interoperability problem.
> Henning
> On Mar 18, 2010, at 12:42 PM, Bob Braden wrote:
>> John R. Levine wrote:
>>> between the XML and the final output. If we could agree that the final XML was authoritative,
>> John, What, precisely, do you mean here? Do you mean that there would be NO text form of an RFC that was authoritative, or do you mean that BOTH the xml2rfc form and some text-equivalent form (say, .txt or .pdf) would be authoritative? I don't quite understand how either choice would work. I am asking about RFCs here, not Internet Drafts, BTW.
>> Thanks, Bob Braden
-- New Website: http://hallambaker.com/ View Quantum of Stupid podcasts, Tuesday and Thursday each week, http://quantumofstupid.com/
Re: Why the normative form of IETF Standards is ASCII
+1
Bob Braden
(PS: The IESG has chosen to impose the RFC editing rules on all Internet Drafts. That always seemed counter-productive to me. I am not sure I would characterize the problem as serious, but it does seem to warp common sense for the sake of bureaucratic uniformity.)
> In my view, we have an actual serious problem in that there is an increasingly high barrier to I-D submission because idnits has a large number of rules, nearly all of which are about formatting. I don't believe that authors of documents or WG-appointed editors ought to have to worry terribly much about that, except maybe near the time when the document is ready for publication. It's absurd, given the tools available, that document authors need to worry as much about line lengths and number of pages (!) in initial submissions as they need to worry about completeness and clarity of their text.
> A
Re: Why the normative form of IETF Standards is ASCII
There are two versions of ID-nits that should be used: 1) one when you're working on getting the words right, and 2) one when you're working on getting the formatting right.
#1 should be used when you're at the beginning of the lifecycle. The requirements for it *should* *be* *minimal*.
#2 should be used when you're getting close to submitting the document to the IESG for further processing. That's when the majority of those picky things should be highlighted.
Perhaps the I-D uploader can ask which type of document this is, and act accordingly. For *-00 and *-01 files, it could even *assume* that it's a #1-style document.
Tony Hansen t...@att.com
On 3/20/2010 6:55 PM, Bob Braden wrote:
> +1
> Bob Braden
> (PS: The IESG has chosen to impose the RFC editing rules on all Internet Drafts. That always seemed counter-productive to me. I am not sure I would characterize the problem as serious, but it does seem to warp common sense for the sake of bureaucratic uniformity.)
>> In my view, we have an actual serious problem in that there is an increasingly high barrier to I-D submission because idnits has a large number of rules, nearly all of which are about formatting. I don't believe that authors of documents or WG-appointed editors ought to have to worry terribly much about that, except maybe near the time when the document is ready for publication. It's absurd, given the tools available, that document authors need to worry as much about line lengths and number of pages (!) in initial submissions as they need to worry about completeness and clarity of their text.
>> A
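The two-stage idea above could be sketched roughly as follows. This is not idnits and does not reproduce its rules; the names check_draft and stage_for, the stage labels, and the limits of 72 columns and 58 lines per page are all assumptions chosen for illustration. The point is only that formatting checks can be switched off for early revisions.

```python
MAX_COLUMNS = 72      # assumed line-length limit
MAX_PAGE_LINES = 58   # assumed number of lines per page


def check_draft(text: str, stage: str = "early") -> list:
    """Return a list of complaints (hypothetical checker, not idnits).

    stage "early": only report problems with the content itself.
    stage "final": additionally report formatting problems.
    """
    if stage not in ("early", "final"):
        raise ValueError(f"unknown stage: {stage!r}")

    problems = []
    if not text.strip():
        problems.append("document is empty")

    if stage == "final":
        # Assume pages are separated by form feeds in the paginated text.
        for page_no, page in enumerate(text.split("\f"), start=1):
            lines = page.split("\n")
            if len(lines) > MAX_PAGE_LINES:
                problems.append(f"page {page_no}: {len(lines)} lines")
            for line_no, line in enumerate(lines, start=1):
                if len(line) > MAX_COLUMNS:
                    problems.append(
                        f"page {page_no}, line {line_no}: {len(line)} columns"
                    )
    return problems


def stage_for(draft_name: str) -> str:
    """Guess the stage from the revision suffix, as suggested for -00 and -01."""
    return "early" if draft_name.endswith(("-00", "-01")) else "final"


sample = "Short line\n" + "x" * 80 + "\n"
print(check_draft(sample, stage_for("draft-example-00")))
print(check_draft(sample, stage_for("draft-example-07")))
```

With this sample, the -00 revision passes silently and the -07 revision gets one complaint about the 80-column line.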
Re: Why the normative form of IETF Standards is ASCII
On 03/19/2010 01:49 AM, Tony Finch wrote:
> Boggle. A major advantage of xml2rfc compared to HTML is that it does the numbering for you, and you don't have to manually maintain cross references. I don't have any problem editing the source in one window while viewing the presentation document in another.
(I do. Short version: One window good, two good, three bad, and there's already the email I'm answering and an editor.)
Let me restate your message in unkind terms: A major advantage of xml2rfc is that it handles the numbering for you. You handle the numbering by opening an extra window.
That's unkind phrasing, but hopefully not bad enough to offend. If a tool handles something for me, then I expect not to have to handle that same thing. In my opinion, xml2rfc just changes that problem instead of solving it, and the changed problem isn't noticeably better _for_me_.
It could be solved within xml2rfc, e.g. by having it edit the source and record the number there. But xml2rfc doesn't do that now.
(That would also help with the other kind of cross references, see [19] section 4.2 when [19] is updated. The likelihood that 4.2 is renumbered shrinks, since xml2rfc can warn when it happens.)
Arnt
Make HTML and PDF more prominent, was: Re: Why the normative form of IETF Standards is ASCII
On 19 mrt 2010, at 5:05, John Levine wrote:
> xml2rfc does a pretty good job of capturing what needs to be in an RFC, so that is the strawman I would start from.
The virtues (or lack thereof) of xml2rfc are a separate discussion. The question isn't how we generate the normative output, but what the normative output should be.
On 19 mrt 2010, at 2:04, Tim Bray wrote:
> On Thu, Mar 18, 2010 at 12:24 PM, Iljitsch van Beijnum iljit...@muada.com wrote:
>> So far the only thing I hear is assertions offered without any foundation that the current format is problematic
> OK, one more time, let me enumerate the problems with the current format. I agree that you may not perceive them as problems, but they are problems for me: 1. I cannot print them correctly on either Windows or Mac. 2. I cannot view them at all on the mobile device
These two issues can easily be solved by using the PDF or HTML versions. Any paginated ASCII can be turned into a PDF easily and automatically. There are different HTMLizations of RFCs, some better, some worse. Without an xml2rfc source, creating an HTML version is harder than creating a PDF version, but for most RFCs there is a decent HTML version available somewhere. The PDF versions can be obtained from the RFC Editor if you search specifically for them, but in most places only the text versions show up. It would help a lot if the HTML and PDF versions were easier to find. Maybe the secretariat could put this on their todo list?
> 3. I cannot enter the name of an author correctly if that name includes non-ASCII characters.
But even if you could, would you? I can't do anything useful with names written in anything other than Latin characters (well, maybe also Greek). I wouldn't even know how to type them if I wanted to search for them. So at the very least all names would still have to appear in Latin script and the non-Latin form would be extra. Is the tiny benefit of having the real name there as a non-normative extra really enough to change what we've been doing for 40 years?
> 4. I cannot provide an actual illustrative working example of the use of non-ASCII text in Internet Protocols.
Correct interpretation of things like UTF-8 is highly dependent on context. On many systems a plain text file with non-7bit-ASCII characters won't be displayed as intended by default. So it would be necessary to go to HTML with &#; encodings of these characters or PDF to be reasonably sure they show up correctly. To me, PDF is unacceptable because it's even harder to display on devices other than computers with large screens or paper and it can't be decoded without complex tools. And switching to HTML just for this purpose isn't worth it to me. But then, I've never written a draft that required non-ASCII characters so that's easy for me to say.
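The numeric character references mentioned above can be produced mechanically, which is one way to keep a file pure ASCII while still carrying non-ASCII examples. The sketch below uses Python's standard "xmlcharrefreplace" error handler and html.unescape; the helper name to_ascii_html and the sample string are made up for illustration. Note that it does not escape a literal "&" already present in the input, so it is not a complete HTML escaper.

```python
import html


def to_ascii_html(text: str) -> str:
    """Replace every non-ASCII character with a numeric character reference."""
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


name = "na\u00efve caf\u00e9"  # sample input with two non-ASCII letters
encoded = to_ascii_html(name)

print(encoded)                         # pure ASCII, safe in a 7-bit file
print(html.unescape(encoded) == name)  # a browser-style decode restores it
```

The encoded form survives any 7-bit channel, at the cost of being unreadable until something HTML-aware renders it, which is the trade-off being weighed in the message.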
Re: Make HTML and PDF more prominent, was: Re: Why the normative form of IETF Standards is ASCII
On Fri Mar 19 10:29:04 2010, Iljitsch van Beijnum wrote:
> On 19 mrt 2010, at 5:05, John Levine wrote:
>> xml2rfc does a pretty good job of capturing what needs to be in an RFC, so that is the strawman I would start from.
> The virtues (or lack thereof) of xml2rfc are a separate discussion. The question isn't how we generate the normative output, but what the normative output should be.
Why care about a normative output? You change the subject to talk about using non-normative representations already, why care about a normative output *at all*? Let's concentrate on a normative format, and ideally making that format editable directly. For display purposes, I don't care if you want HTML, PDF, or RO-33.
>> 2. I cannot view them at all on the mobile device
> These two issues can easily be solved by using the PDF or HTML versions. Any paginated ASCII can be turned into a PDF easily and automatically. There are different HTMLizations of RFCs, some better, some worse. Without an xml2rfc source, creating an HTML version is harder than creating a PDF version, but for most RFCs there is a decent HTML version available somewhere. The PDF versions can be obtained from the RFC Editor if you search specifically for them, but in most places only the text versions show up. It would help a lot if the HTML and PDF versions were easier to find. Maybe the secretariat could put this on their todo list?
Or maybe do as the XSF does, with a normative XML format, but generating HTML and PDF from it. In the IETF, we'd even generate RO-33 format text, too.
>> 3. I cannot enter the name of an author correctly if that name includes non-ASCII characters.
> But even if you could, would you? I can't do anything useful with names written in anything other than Latin characters (well, maybe also Greek). I wouldn't even know how to type them if I wanted to search for them. So at the very least all names would still have to appear in Latin script and the non-Latin form would be extra. Is the tiny benefit of having the real name there as a non-normative extra really enough to change what we've been doing for 40 years?
I don't think in itself it's a huge deal. I just think it's crushingly embarrassing. The IAB made a clear statement that we need i18n support, yet over a decade after RFC 2130 or RFC 2825, the RFCs themselves still have a strict ASCII limitation. Sure, that wasn't mentioned at the time, but does nobody else find this plain shameful?
>> 4. I cannot provide an actual illustrative working example of the use of non-ASCII text in Internet Protocols.
> Correct interpretation of things like UTF-8 is highly dependent on context. On many systems a plain text file with non-7bit-ASCII characters won't be displayed as intended by default. So it would be necessary to go to HTML with &#; encodings of these characters or PDF to be reasonably sure they show up correctly. To me, PDF is unacceptable because it's even harder to display on devices other than computers with large screens or paper and it can't be decoded without complex tools. And switching to HTML just for this purpose isn't worth it to me. But then, I've never written a draft that required non-ASCII characters so that's easy for me to say.
So drop the non-ASCII characters from the text representation, just as we do already. I'm okay with that.
Dave.
-- Dave Cridland - mailto:d...@cridland.net - xmpp:d...@dave.cridland.net - acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/ - http://dave.cridland.net/ Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade
Re: Make HTML and PDF more prominent, was: Re: Why the normative form of IETF Standards is ASCII
Iljitsch van Beijnum wrote:
>> 1. I cannot print them correctly on either Windows or Mac. 2. I cannot view them at all on the mobile device
> These two issues can easily be solved by using the PDF or HTML versions.
Simple plain ASCII text is just fine.
>> 3. I cannot enter the name of an author correctly if that name includes non-ASCII characters.
> But even if you could, would you?
That's why non-ASCII characters MUST NOT be used.
> I can't do anything useful with names written in anything other than Latin characters (well, maybe also Greek).
Not merely Latin but pure ASCII and definitely NOT Greek.
> Is the tiny benefit of having the real name there as a non-normative extra really enough to change what we've been doing for 40 years?
Not at all. But say that to the people who insist on using Latin-1 diacritical marks in IETF ML discussions as if they were internationally usable.
> Correct interpretation of things like UTF-8 is highly dependent on context. On many systems a plain text file with non-7bit-ASCII characters won't be displayed as intended by default.
The problem is that Unicode is broken in that it does not carry the necessary context information, which was expected to be supplied magically.
> So it would be necessary to go to HTML with &#; encodings of these characters or PDF to be reasonably sure they show up correctly. To me, PDF is unacceptable because it's even harder to display on devices other than computers with large screens or paper and it can't be decoded without complex tools. And switching to HTML just for this purpose isn't worth it to me.
Just rely on ISO 2022 and everything works just fine.
> But then, I've never written a draft that required non-ASCII characters so that's easy for me to say.
I did write an RFC and drafts on how character encoding could have been internationalized. Moreover, over the past 10 years, I have developed a simple and straightforward theory of what unification actually is. I can elaborate if you are interested.
Masataka Ohta
Re: Make HTML and PDF more prominent, was: Re: Why the normative form of IETF Standards is ASCII
Ohta san, Let me guess: You're not a big fan of IDNs either, right? Ole J. Jacobsen Editor and Publisher, The Internet Protocol Journal Cisco Systems Tel: +1 408-527-8972 Mobile: +1 415-370-4628 E-mail: o...@cisco.com URL: http://www.cisco.com/ipj On Fri, 19 Mar 2010, Masataka Ohta wrote: Simple plain ASCII text is just fine. Masataka Ohta ___ Ietf mailing list Ietf@ietf.org https://www.ietf.org/mailman/listinfo/ietf
A state of spin ... presented in ASCII (was: Make HTML and PDF more prominent, was: Re: Why the normative form of IETF Standards is ASCII)
At 04:02 19-03-10, Dave Cridland wrote: The IAB made a clear statement that we need i18n support, yet over a decade after RFC 2130 or RFC 2825, the RFCs themselves still have a strict ASCII limitation. Sure, that wasn't mentioned at the time, but does nobody else find this plain shameful? As seen in an I-D: The IETF is an international organization with open participation. It is important that the IETF leadership be a reflection of the diversity of its participants. The IETF is an open organization as such one would expect that no single company or funder of participants would dominate the leadership positions. What if those positions were a non-diverse set in terms of geographic region, ethnicity and gender? What if the number of positions was not limited to two for a single company or funding source and the positions for an area had the same funding source? What if there was a dominance of vendors in those positions? A person could also ask whether it is shameful. There is absolutely no relationship between the question asked (see quoted text) and this response. It is up to the reader to assess the state of spin. Regards, -sm ___ Ietf mailing list Ietf@ietf.org https://www.ietf.org/mailman/listinfo/ietf
Re: Why the normative form of IETF Standards is ASCII
Maybe I'm not enough of an amateur lawyer, but has authoritative been a practical issue, i.e., has there been confusion or legal action because one rendition (say, PDF) differed in some trivial aspect from another (e.g., ASCII)?
Pragmatically, one could simply state that one form (say, good-ol ASCII, to avoid endless debates and for historical reasons) was authoritative and that others were best effort versions of the same text and that any deviations and omissions were accidental and should be brought to the attention of the appropriate authorities. I'm sure we can come up with more legal boilerplate to phrase this more precisely - we seem to be getting good at this boilerplate thing...
With that caveat, in the case of a (presumably exceedingly rare) production error, the non-authoritative version could then be updated, in the same manner that the auto-generated pseudo-HTML versions we have today on the IETF site change occasionally as the rendering program is improved. This doesn't seem to have caused a significant protocol interoperability problem.
Henning
On Mar 18, 2010, at 12:42 PM, Bob Braden wrote:
> John R. Levine wrote:
>> between the XML and the final output. If we could agree that the final XML was authoritative,
> John, What, precisely, do you mean here? Do you mean that there would be NO text form of an RFC that was authoritative, or do you mean that BOTH the xml2rfc form and some text-equivalent form (say, .txt or .pdf) would be authoritative? I don't quite understand how either choice would work. I am asking about RFCs here, not Internet Drafts, BTW.
> Thanks, Bob Braden
Re: Make HTML and PDF more prominent, was: Re: Why the normative form of IETF Standards is ASCII
On 3/19/2010 3:29 AM, Iljitsch van Beijnum wrote:
> On 19 mrt 2010, at 5:05, John Levine wrote:
>> xml2rfc does a pretty good job of capturing what needs to be in an RFC, so that is the strawman I would start from.
> The virtues (or lack thereof) of xml2rfc are a separate discussion. The question isn't how we generate the normative output, but what the normative output should be.
Agreed, the issue is twofold - the data type and the data format for submission. The content itself is irrelevant to this conversation since that is controlled by the publishing desk.
> On 19 mrt 2010, at 2:04, Tim Bray wrote:
>> On Thu, Mar 18, 2010 at 12:24 PM, Iljitsch van Beijnum iljit...@muada.com wrote:
>>> So far the only thing I hear is assertions offered without any foundation that the current format is problematic
The assertion is that something more than TEXT needs to be acceptable as far as the packaging goes, especially with regard to diagramming and drawings inside of work product.
>> OK, one more time, let me enumerate the problems with the current format. I agree that you may not perceive them as problems, but they are problems for me: 1. I cannot print them correctly on either Windows or Mac.
Why not? Text printing is text printing. PDFs on Mac look just like PDFs on Windows do too AFAIK... and Adobe would be in a world of hurt if this was not so, so...
>> 2. I cannot view them at all on the mobile device
Why do you need to? Do you do design review work on your mobile device? Most of them don't have enough screen space to make that efficient, so again, what is the real issue here? Or is it about being able to just utter the words "IETF processes allow collaboration for every device on the Internet"...
> These two issues can easily be solved by using the PDF or HTML versions. Any paginated ASCII can be turned into a PDF easily and automatically. There are different HTMLizations of RFCs, some better, some worse. Without an xml2rfc source, creating an HTML version is harder than creating a PDF version, but for most RFCs there is a decent HTML version available somewhere. The PDF versions can be obtained from the RFC Editor if you search specifically for them, but in most places only the text versions show up. It would help a lot if the HTML and PDF versions were easier to find. Maybe the secretariat could put this on their todo list?
>> 3. I cannot enter the name of an author correctly if that name includes non-ASCII characters.
> But even if you could, would you? I can't do anything useful with names written in anything other than Latin characters (well, maybe also Greek). I wouldn't even know how to type them if I wanted to search for them. So at the very least all names would still have to appear in Latin script and the non-Latin form would be extra. Is the tiny benefit of having the real name there as a non-normative extra really enough to change what we've been doing for 40 years?
Yes... but that also is another issue.
>> 4. I cannot provide an actual illustrative working example of the use of non-ASCII text in Internet Protocols.
> Correct interpretation of things like UTF-8 is highly dependent on context. On many systems a plain text file with non-7bit-ASCII characters won't be displayed as intended by default. So it would be necessary to go to HTML with &#; encodings of these characters or PDF to be reasonably sure they show up correctly. To me, PDF is unacceptable because it's even harder to display on devices other than computers with large screens or paper and it can't be decoded without complex tools. And switching to HTML just for this purpose isn't worth it to me. But then, I've never written a draft that required non-ASCII characters so that's easy for me to say.
attachment: tglassey.vcf
Re: Make HTML and PDF more prominent, was: Re: Why the normative form of IETF Standards is ASCII
> The virtues (or lack thereof) of xml2rfc are a separate discussion. The question isn't how we generate the normative output, but what the normative output should be.
Seems to me that this discussion has reached the point at which running code is needed in order to get any further. May I suggest that those interested in changing the normative format come up with an example based on a couple of RFCs, one recent and one ancient.
For instance, if you believe that an XML format is the right one, present us with an example RFC in normative XML format along with some XSLT transformations that can be used to produce HTML and ASCII text format versions. PDF shouldn't be an issue since it is easy to change just about anything into a PDF file, but it might be useful to document the workflow and toolchain required to go from normative XML to archival PDF/A, since it seems sensible to maintain archive copies of all RFCs as well as normative ones. Note that a PDF/A document could contain an appendix with the source code of the normative XML document, thus archiving that as well.
If it can be demonstrated that an XML normative format is workable and can be easily transformed into other needed formats using a variety of common tools, then there is some point in extending the discussion to editing and submission formats.
I do believe that one can trivially export a normative XML document into formats suitable for viewing in all the contexts discussed in this and previous threads on the topic. It is therefore trivial for the IETF to offer a download tool for every RFC that allows the user to choose a set of formats and receive a package of files in their choice of .ZIP, .7z, .tar.gz and other formats. Each file would have a copy of the specified RFC in the chosen formats such as HTML, HTML with printable CSS, ASCII text, UNICODE text, specified column width text, paginated or unpaginated text with specified page length, PDF, PDF/A, XML, .doc, .docx, .sdw, .odt, etc...
If nobody is willing to produce a sample normative XML format RFC, then let's drop the whole topic.
--Michael Dillon
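As a toy version of the experiment proposed above, the sketch below renders one small XML source into both wrapped plain text and HTML. It uses the Python standard library rather than XSLT, and the XML vocabulary (doc, title, section, t) is invented for the sketch; it is not the xml2rfc schema, and the function names are hypothetical. It only illustrates the shape of a "one normative source, several derived outputs" toolchain.

```python
import html
import textwrap
import xml.etree.ElementTree as ET

# Invented miniature vocabulary for illustration; NOT the xml2rfc schema.
SOURCE = """
<doc>
  <title>Example Protocol</title>
  <section name="Introduction">
    <t>This document describes an example.</t>
  </section>
  <section name="Security Considerations">
    <t>None known.</t>
  </section>
</doc>
"""


def render_text(root, width=72):
    """Render the document as numbered, wrapped plain text."""
    out = [root.findtext("title"), ""]
    for number, section in enumerate(root.findall("section"), start=1):
        out.append(f"{number}.  {section.get('name')}")
        out.append("")
        for para in section.findall("t"):
            out.append(textwrap.fill(para.text, width=width,
                                     initial_indent="   ",
                                     subsequent_indent="   "))
            out.append("")
    return "\n".join(out)


def render_html(root):
    """Render the same document as a minimal HTML fragment."""
    parts = [f"<h1>{html.escape(root.findtext('title'))}</h1>"]
    for number, section in enumerate(root.findall("section"), start=1):
        parts.append(f"<h2>{number}. {html.escape(section.get('name'))}</h2>")
        for para in section.findall("t"):
            parts.append(f"<p>{html.escape(para.text)}</p>")
    return "\n".join(parts)


root = ET.fromstring(SOURCE.strip())
print(render_text(root))
print(render_html(root))
```

Section numbers are computed at render time in both outputs, so the source never stores them. A real demonstration would still need to handle references, figures, pagination and boilerplate, none of which are attempted here.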
Re: Why the normative form of IETF Standards is ASCII
It would be good if RFC authors put at least as much care into the clarity and organization of their contents as you are devoting to a discussion of the formatting. The contents are what matter, and fancy formatting may (or may not) be a distraction from the more important issues of contents.
Bob Braden
Re: Make HTML and PDF more prominent, was: Re: Why the normative form of IETF Standards is ASCII
On 19 mrt 2010, at 12:02, Dave Cridland wrote:
> Why care about a normative output? You change the subject to talk about using non-normative representations already, why care about a normative output *at all*?
You have a point. But it's in the subject line...
> Let's concentrate on a normative format, and ideally making that format editable directly.
Right, because that is the thing you need to do most often with the normative form of the document. /sarcasm
The most important feature of the normative version is that it is unambiguous. That means that the software layers to view that version must be as few and as simple as possible.
>>> 3. I cannot enter the name of an author correctly if that name includes non-ASCII characters.
>> But even if you could, would you?
> I don't think in itself it's a huge deal. I just think it's crushingly embarrassing. The IAB made a clear statement that we need i18n support, yet over a decade after RFC 2130 or RFC 2825, the RFCs themselves still have a strict ASCII limitation. Sure, that wasn't mentioned at the time, but does nobody else find this plain shameful?
Not at all. Requiring all users around the world to use Latin script in their URL bars and email addresses is a very bad thing. So all user-serviceable parts of internet standards must support scripts used around the world. But that's a very different thing from what the IETF should do for its internal business. We already restrict our communications to the English language. The further restriction that publications only contain 7-bit ASCII can be argued on both sides, but is hardly shameful. It's just a matter of efficient operation. I'm named after Пётр Ильич Чайковский, but the Dutch government only accepts names in Latin characters. If countries can impose that restriction, the IETF certainly can.
On 19 mrt 2010, at 18:06, Henning Schulzrinne wrote:
> Pragmatically, one could simply state that one form (say, good-ol ASCII, to avoid endless debates and for historical reasons) was authoritative and that others were best effort versions of the same text and that any deviations and omissions were accidental and should be brought to the attention of the appropriate authorities.
Exactly. And then provide links to the PDF and HTML versions on an equal footing with the text version so people can easily select the version that best suits their needs of the moment.
Re: Why the normative form of IETF Standards is ASCII
On Fri, Mar 19, 2010 at 01:39:38PM -0700, Bob Braden wrote:
> It would be good if RFC authors put at least as much care into the clarity and organization of their contents as you are devoting to a discussion of the formatting. The contents are what matter, and fancy formatting may (or may not) be a distraction from the more important issues of contents.
I fully agree, and it is why I was so vexed by Donald Eastlake's initial claim that the I-D and RFC format is plain ASCII.
In my view, we have an actual serious problem in that there is an increasingly high barrier to I-D submission because idnits has a large number of rules, nearly all of which are about formatting. I don't believe that authors of documents or WG-appointed editors ought to have to worry terribly much about that, except maybe near the time when the document is ready for publication. It's absurd, given the tools available, that document authors need to worry as much about line lengths and number of pages (!) in initial submissions as they need to worry about completeness and clarity of their text.
A
-- Andrew Sullivan a...@shinkuro.com Shinkuro, Inc.
Re: Why the normative form of IETF Standards is ASCII
Andrew Sullivan wrote:
> On Fri, Mar 19, 2010 at 01:39:38PM -0700, Bob Braden wrote:
>> It would be good if RFC authors put at least as much care into the clarity and organization of their contents as you are devoting to a discussion of the formatting. The contents are what matter, and fancy formatting may (or may not) be a distraction from the more important issues of contents.
> I fully agree, and it is why I was so vexed by Donald Eastlake's initial claim that the I-D and RFC format is plain ASCII.
> In my view, we have an actual serious problem in that there is an increasingly high barrier to I-D submission because idnits has a large number of rules, nearly all of which are about formatting.
When I submitted my very first I-D last November, it took me about 10 minutes to fix the few issues that idnits reported. If you have significantly more problems, then maybe you are using the wrong tool to write I-Ds. Try NRoffEdit. It will take care of many of these issues for you. :-)
As previously mentioned, I gave up on trying to _install_ xml2rfc one hour after downloading it. I was writing the third page of my I-D one hour after downloading NRoffEdit.
-Martin
Re: Why the normative form of IETF Standards is ASCII
On Mar 19, 2010, at 3:26 PM, Martin Rex wrote: As previously mentioned, I gave up on trying to _install_ xml2rfc one hour after downloading it. I was writing the third page of my I-D one hour after downloading NRoffEdit. Even if you're one of those rare birds who has difficulty installing xml2rfc, that question is largely moot since there's a web interface to a functioning implementation at xml.resource.org. Isn't it nice that all these different tools can produce conforming documents? I think it's nice. Melinda
Re: Why the normative form of IETF Standards is ASCII
Julian Reschke wrote: I don't buy that. We've got something like 1 billion people on the planet running web browsers, and I'm pretty confident we can find a few non-ASCII characters everybody can display which could be used in examples. What exactly is the purpose of a few non-ASCII characters everybody can display? And while the environments that I use are mostly capable of displaying ISO-Latin-1, I do _NOT_ know names for the majority of symbols from 128, and would have severe difficulties discussing stuff with such symbols in speech, like in-person, at a bar, over lunch or on the phone, and therefore don't want to have any of them in RFCs. Discussing non-ASCII characters often requires the use of Unicode codepoints, both to avoid ambiguities and because most people on this planet are unfamiliar with the glyphs for most Unicode codepoints. Describing a Unicode codepoint by its numeric value with characters from IA5/US-ASCII, on the other hand, is fairly simple and straightforward. -Martin
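The point about describing a codepoint by its numeric value using only ASCII can be illustrated with a small sketch. This is only an illustration, not something posted in the thread: the helper names `describe` and `describe_text` are hypothetical, and the U+XXXX notation plus the character name from Python's standard `unicodedata` module are assumed to be an acceptable ASCII-only description.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return an ASCII-only description of a single character.

    Hypothetical helper: combines the conventional U+XXXX notation with
    the character name from the Unicode database, when one exists.
    """
    name = unicodedata.name(ch, "<unnamed>")
    return f"U+{ord(ch):04X} {name}"


def describe_text(text: str) -> list[str]:
    """Describe every character outside the 7-bit ASCII range."""
    return [describe(ch) for ch in text if ord(ch) > 127]


if __name__ == "__main__":
    # The source itself stays pure ASCII by using an escape sequence.
    for line in describe_text("caf\u00e9"):
        print(line)
```

The description can be read aloud or typed on any keyboard, which is the property argued for above; whether that outweighs showing the actual glyph is the question the thread is debating.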
Re: Why the normative form of IETF Standards is ASCII
Martin Rex wrote: Discussing non-ASCII characters often requires the use of unicode codepoints to avoid ambiguities and the lack of familiarity of most people of this planet with the glyphs on most unicode codepoints. Avoid ambiguities with unicode? Describing a unicode codepoint by its numeric value with characters from IA5/US-ASCII, on the other hand, is fairly simple and straightforward. The problem of unicode is that its codepoint does not disambiguate Chinese and Japanese characters, which makes unicode useless for multilingual communication. Language tag, if ever supplied, does not help here. Masataka Ohta ___ Ietf mailing list Ietf@ietf.org https://www.ietf.org/mailman/listinfo/ietf
Re: Why the normative form of IETF Standards is ASCII
That would meet most of my issues, provided of course that the XML2RFC format was published. Zero time spent going to an editable format is better than any amount of 'easy conversion'. On Wed, Mar 17, 2010 at 9:03 PM, Tony Hansen t...@att.com wrote: +1 On 3/17/2010 12:18 PM, John R. Levine wrote: If we could agree that the final XML was authoritative, and if necessary let them hire someone to fix xmlrfc so it can produce the text version without hand editing or postprocessing, that would be a big step forward. -- -- New Website: http://hallambaker.com/ View Quantum of Stupid podcasts, Tuesday and Thursday each week, http://quantumofstupid.com/
Re: Why the normative form of IETF Standards is ASCII
On Thu Mar 18 03:27:30 2010, Phillip Hallam-Baker wrote: That would meet most of my issues, provided of course that the XML2RFC format was published. There's a rfc2629bis at/as http://xml.resource.org/authoring/draft-mrose-writing-rfcs.html Is there anything you feel that's not covering? (I agree much of http://xml.resource.org/authoring/draft-mrose-writing-rfcs.html#anchor19 is now in such common usage a formal I-D submission would be useful). Zero time spent going to an editable format is better than any amount of 'easy conversion'. Indeed. On Wed, Mar 17, 2010 at 9:03 PM, Tony Hansen t...@att.com wrote: +1 On 3/17/2010 12:18 PM, John R. Levine wrote: If we could agree that the final XML was authoritative, and if necessary let them hire someone to fix xmlrfc so it can produce the text version without hand editing or postprocessing, that would be a big step forward. I'm in agreement 99%. I just think we've accumulated a lot of working experience with XML editing forms both in the IETF and elsewhere, and it'd be useful to attempt to consolidate that at this point in time before moving to full adoption. Dave. -- Dave Cridland - mailto:d...@cridland.net - xmpp:d...@dave.cridland.net - acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/ - http://dave.cridland.net/ Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade ___ Ietf mailing list Ietf@ietf.org https://www.ietf.org/mailman/listinfo/ietf
Re: Why the normative form of IETF Standards is ASCII
John R. Levine wrote: between the XML and the final output. If we could agree that the final XML was authoritative, John, What, precisely, do you mean here? Do you mean that there would be NO text form of an RFC that was authoritative, or do you mean that BOTH the xml2rfc form and some text-equivalent form (say, .txt or .pdf) would be authoritative? I don't quite understand how either choice would work. I am asking about RFCs here, not Internet Drafts, BTW. Thanks, Bob Braden ___ Ietf mailing list Ietf@ietf.org https://www.ietf.org/mailman/listinfo/ietf
Re: Why the normative form of IETF Standards is ASCII
between the XML and the final output. If we could agree that the final XML was authoritative, What, precisely, do you mean here? Do you mean that there would be NO text form of an RFC that was authoritative, or do you mean that BOTH the xml2rfc form and some text-equivalent form (say, .txt or .pdf) would be authoritative? The XML is authoritative, the text is derived from it. This presumes that we improve xml2rfc so it produces text comparable to the stuff we have now, and has sufficient change control and regression testing that we can count on future versions of xml2rfc to produce the same output with the same input. As I expect you know, multiple forms are quite common in other SDOs. The ITU typically publishes authoritative PDFs, but also provides Word documents for people who want to cut and paste. R's, John
Re: Why the normative form of IETF Standards is ASCII
On 18 mrt 2010, at 2:43, Richard Barnes wrote: +1 Making the XML normative would be an abomination. The XML in itself can't be interpreted by a human to the level needed to create a compliant implementation, although it deceptively looks like maybe it could. Of course human readability also doesn't exist for pretty much anything other than text or the simplest of HTML, so in itself this isn't a show stopper. But there is no standard way of converting xml2rfc into something that humans can interpret unambiguously. Practically, the only way to do this is with the xml2rfc tool, which is non-standard, only partially documented and very hard to run for most people. There have also been times during which the released version was unable to convert the XML files that were actually being used inside the IETF. And of course there are no existing RFCs for which there is an xml2rfc XML file that you can run through xml2rfc and obtain the exact ASCII version of that RFC. Older RFCs are formatted in ways which are completely incompatible with xml2rfc, so it would be impossible to have all RFCs be available in one format if XML is adopted for future RFCs. If we really want to do something in this space first of all we need to agree on the problem, then on the requirements and THEN we can have a useful discussion. So far the only thing I hear is assertions offered without any foundation that the current format is problematic, while every personal computer operating system sold (or given away for free) in the past decade can display it without the need to install additional software. That's a pretty good result for files which date back as long as 40 years. Good luck finding any other document format of the same age with that property.
Re: Why the normative form of IETF Standards is ASCII
On 18.03.2010 20:24, Iljitsch van Beijnum wrote: On 18 mrt 2010, at 2:43, Richard Barnes wrote: +1 Making the XML normative would be an abomination. The XML in itself can't be interpreted by a human to the level needed to create a compliant implementation, although it deceptively looks like maybe it could. Of course human readability also doesn't exist for pretty much anything other than text or the simplest of HTML, in itself this isn't a show stopper. That is simply incorrect, which can easily be checked by looking at the XML source of a spec. But there is no standard way of converting xml2rfc into something that humans can interpret unambigously. Practically, the only way to do this is with the xml2rfc tool, which is non-standard, only partially documented and very hard to run for most people. There have also been times during which the released version was unable to convert the XML files that were actually being used inside the IETF. Again incorrect. There is at least one other implementation that can be used by everybody who's got a current browser (which means, everybody), assuming that the source file actually is valid, and doesn't use non-standard extensions (as opposed to what RFC 2629 defines). And of course there are no existing RFC for which there is an xml2rfc XML file that you can run through xml2rfc and obtain the exact ASCII version of that RFC. Older RFCs are formatted in ways which are completely incompatible with xml2rfc, so it would be impossible to have all RFCs be available in one format if XML is adopted for future RFCs. Yes. How is that a problem, exactly? Just don't try to change the past. If we really want to do something in this space first of all we need to agree on the problem, then on the requirements and THEN we can have a useful discussion. 
So far the only thing I hear is assertions offered without any foundation that the current format is problematic, while every personal computer operating system sold (or given away for free) the past decade can display it without the need to install additional software. That's a pretty good result for files which date back as long as 40 years. Good luck finding any other document format of the same age with that property. That may be true, but that feature comes with drawbacks. Best regards, Julian
Re: Why the normative form of IETF Standards is ASCII
On 18 mrt 2010, at 20:59, Julian Reschke wrote: The XML in itself can't be interpreted by a human to the level needed to create a compliant implementation, although it deceptively looks like maybe it could. Of course human readability also doesn't exist for pretty much anything other than text or the simplest of HTML, in itself this isn't a show stopper. That is simply incorrect, which can easily be checked by looking at the XML source of a spec. People make mistakes implementing today's text. If they have to implement from XML source where they have to interpret things like escape codes and numbered lists (just to mention the first two things that come to mind) in their head this is going to be much worse. There is at least one other implementation that can be used by everybody who's got a current browser So now I have to have a working network connection to read an RFC? What if the site that hosts all this goes down? What if someone wants to read a spec 20 years from now? it would be impossible to have all RFCs be available in one format if XML is adopted for future RFCs. Yes. How is that a problem, exactly? Because this doubles the amount of effort needed to be able to read RFCs. And if old RFCs can be in text, why not new ones? And if some RFCs are text, why not derive text versions of the XML ones too, so there are text versions of all of them? And if there are text versions for all RFCs and XML versions of only some, why not make the text version authoritative? Oh wait... That's a pretty good result for files which date back as long as 40 years. Good luck finding any other document format of the same age with that property. That may be true, but that features comes with drawbacks. Drawbacks that we can all agree on? Sure, RFCs don't look too pretty, and their hard line and page endings are very annoying because they never fit the screen or paper that you happen to use. (Aside: PDF is much worse in this regard.) 
But pretty much all RFCs can be viewed in HTML versions which don't have these problems by anyone who cares. Being able to have names and examples in non-latin characters would be nice, but for names this is just a cosmetic thing with compatibility issues that make it not worth the trouble, and with examples it's dangerous to depend on correct display of anything that isn't 7-bit ASCII because it's still quite easy to end up with something that's incorrect or doesn't show. The ability to use graphics would be helpful but would have severe consequences for the file format, having to use multiple files to make up a single RFC would be problematic (ASCII, HTML with images) and single file formats aren't trivially decoded. Images are also very hard to edit, making collaboration and especially updating RFCs much more difficult. And the inclusion of images reduces the number of devices that can display an RFC significantly. (Line printers, text only displays and remote login sessions are out, hand held devices also to a large degree because the screens probably don't have enough resolution.) I guess plain ASCII isn't so bad after all. Would be nice if we could get rid of the pagination, though. ___ Ietf mailing list Ietf@ietf.org https://www.ietf.org/mailman/listinfo/ietf
Re: Why the normative form of IETF Standards is ASCII
On 18.03.2010 21:25, Iljitsch van Beijnum wrote: ... That is simply incorrect, which can easily be checked by looking at the XML source of a spec. People make mistakes implementing today's text. If they have to implement from XML source where they have to interpret things like escape codes and numbered lists (just to mention the first two things that come to mind) in their head this is going to be much worse. I don't believe that the few escape codes (essentially two) are really a problem. And how are numbered lists a problem? There is at least one other implementation that can be used by everybody who's got a current browser So now I have to have a working network connection to read an RFC? What if the site that hosts all this goes down? What if someone wants to read a spec 20 years from now? No, you don't need a network connection. it would be impossible to have all RFCs be available in one format if XML is adopted for future RFCs. Yes. How is that a problem, exactly? Because this doubles the amount of effort needed to be able to read RFCs. And if old RFCs can be in text, why not new ones? And if some RFCs are text, why not derive text versions of the XML ones too, so there are text versions of all of them? And if there are text versions for all RFCs and XML versions of only some, why not make the text version authoritative? Oh wait... We've done it for 40 years, why not continue that way? :-) That's a pretty good result for files which date back as long as 40 years. Good luck finding any other document format of the same age with that property. That may be true, but that features comes with drawbacks. Drawbacks that we can all agree on? Sure, RFCs don't look too pretty, and their hard line and page endings are very annoying because they never fit the screen or paper that you happen to use. (Aside: PDF is much worse in this regard.) But pretty much all RFCs can be viewed in HTML versions which don't have these problems by anyone who cares. Aha! 
So how about: 1) asking the RFC Editor to always archive the XML when present (I believe this is already the case), and to ensure the XML is actually valid (to be done). 2) asking the RFC Editor to publish HTML *in addition* to the TXT version, when available? Being able to have names and examples in non-latin characters would be nice, but for names this is just a cosmetic thing with compatibility issues that make it not worth the trouble, and with examples it's dangerous to depend on correct display of anything that isn't 7-bit ASCII because it's still quite easy to end up with something that's incorrect or doesn't show. I don't buy that. We've got something like 1 billion people on the planet running web browsers, and I'm pretty confident we can find a few non-ASCII characters everybody can display which could be used in examples. The ability to use graphics would be helpful but would have severe consequences for the file format, having to use multiple files to make up a single RFC would be problematic (ASCII, HTML with images) and single file formats aren't trivially decoded. Images are also very hard to edit, making collaboration and especially updating RFCs much more difficult. And the inclusion of images reduces the number of devices that can display an RFC significantly. (Line printers, text only displays and remote login sessions are out, hand held devices also to a large degree because the screens probably don't have enough resolution.) That's an orthogonal problem. I agree it's non-trivial. MHTML would come to mind, if it had more implementations. I guess plain ASCII isn't so bad after all. Would be nice if we could get rid of the pagination, though. As far as I can tell, all RFCs today are published from XML or NROFF source. I know xml2rfc can produce unpaginated text, and I'm confident NROFF can as well. Best regards, Julian ___ Ietf mailing list Ietf@ietf.org https://www.ietf.org/mailman/listinfo/ietf
Re: Why the normative form of IETF Standards is ASCII
On 03/18/2010 09:37 PM, Julian Reschke wrote: And how are numbered lists a problem? I thought it was a pain because I got comments referring to x and the file I edited contained no x. xml2rfc generated numbers, people used them to me, I didn't see them in the source. In general I think the RFC format should use author-visible numbers in the cases where those numbers are used in email, and might benefit from being unchanged in the next revision of the RFC: Sections, list items. Not references, people don't often refer to those by number in email. Arnt ___ Ietf mailing list Ietf@ietf.org https://www.ietf.org/mailman/listinfo/ietf
Re: Why the normative form of IETF Standards is ASCII
On 18.03.2010 21:41, Arnt Gulbrandsen wrote: On 03/18/2010 09:37 PM, Julian Reschke wrote: And how are numbered lists a problem? I thought it was a pain because I got comments referring to x and the file I edited contained no x. xml2rfc generated numbers, people used them to me, I didn't see them in the source. In general I think the RFC format should use author-visible numbers in the cases where those numbers are used in email, and might benefit from being unchanged in the next revision of the RFC: Sections, list items. Not references, people don't often refer to those by number in email. It would be a simple exercise to write a tool that augments the source with the generated section/list item numbers. Damn, now I have to write it :-) Best regards, Julian ___ Ietf mailing list Ietf@ietf.org https://www.ietf.org/mailman/listinfo/ietf
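The tool suggested above, one that shows authors the section numbers a formatter would generate, could look roughly like the following sketch. It is not Julian's tool and not xml2rfc's actual numbering code: it assumes RFC 2629-style source in which numbered sections are `section` elements with a `title` attribute nested under `middle`, it ignores appendices and list items, and it only reports the numbers rather than writing them back into the source. The function names and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET


def number_sections(parent, prefix=""):
    """Yield (number, title) for nested <section> elements, depth-first."""
    index = 0
    for child in parent:
        if child.tag != "section":
            continue
        index += 1
        number = f"{prefix}{index}"
        yield number, child.get("title", "")
        # Subsections extend the parent's number: 2 -> 2.1, 2.2, ...
        yield from number_sections(child, prefix=number + ".")


def annotate(xml_text: str) -> list[str]:
    """Return one 'number. title' line per section found under <middle>."""
    root = ET.fromstring(xml_text)
    middle = root.find("middle")
    if middle is None:
        return []
    return [f"{num}. {title}" for num, title in number_sections(middle)]


# Hypothetical, heavily trimmed RFC 2629-style source used for illustration.
SAMPLE = """<rfc><middle>
  <section title="Introduction"/>
  <section title="Protocol">
    <section title="Messages"/>
    <section title="Errors"/>
  </section>
</middle></rfc>"""

if __name__ == "__main__":
    print("\n".join(annotate(SAMPLE)))
```

Keeping the numbers derived rather than stored preserves the automatic renumbering that the formatter provides, at the cost of authors having to run the tool to see what reviewers are referring to.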
Re: Why the normative form of IETF Standards is ASCII
On 03/18/2010 01:52 PM, Julian Reschke wrote: On 18.03.2010 21:41, Arnt Gulbrandsen wrote: On 03/18/2010 09:37 PM, Julian Reschke wrote: And how are numbered lists a problem? I thought it was a pain because I got comments referring to x and the file I edited contained no x. xml2rfc generated numbers, people used them to me, I didn't see them in the source. In general I think the RFC format should use author-visible numbers in the cases where those numbers are used in email, and might benefit from being unchanged in the next revision of the RFC: Sections, list items. Not references, people don't often refer to those by number in email. It would be a simple exercise to write a tool that augments the source with the generated section/list item numbers. Damn, now I have to write it :-) This is another proof that we need to define a subset of rfc2629-bis for the canonical XML source (no include; day/month attribute mandatory; figures, tables and lists numbered[1]. etc...), so the XML source submitted can be used to generate the various formats people want. -- Marc Petit-Huguenin Personal email: m...@petit-huguenin.org Professional email: petit...@acm.org Blog: http://blog.marc.petit-huguenin.org ___ Ietf mailing list Ietf@ietf.org https://www.ietf.org/mailman/listinfo/ietf
Re: Why the normative form of IETF Standards is ASCII
On 18 Mar 2010, at 20:41, Arnt Gulbrandsen a...@gulbrandsen.priv.no wrote: On 03/18/2010 09:37 PM, Julian Reschke wrote: And how are numbered lists a problem? I thought it was a pain because I got comments referring to x and the file I edited contained no x. xml2rfc generated numbers, people used them to me, I didn't see them in the source. Boggle. A major advantage of xml2rfc compared to HTML is that it does the numbering for you, and you don't have to manually maintain cross references. I don't have any problem editing the source in one window while viewing the presentation document in another. Tony (on his iPod). -- f.anthony.n.finch d...@dotat.at http://dotat.at/ ___ Ietf mailing list Ietf@ietf.org https://www.ietf.org/mailman/listinfo/ietf
Re: Why the normative form of IETF Standards is ASCII
On Thu, Mar 18, 2010 at 12:24 PM, Iljitsch van Beijnum iljit...@muada.com wrote: If we really want to do something in this space first of all we need to agree on the problem, then on the requirements and THEN we can have a useful discussion. So far the only thing I hear is assertions offered without any foundation that the current format is problematic, while every personal computer operating system sold (or given away for free) the past decade can display it without the need to install additional software. OK, one more time, let me enumerate the problems with the current format. I agree that you may not perceive them as problems, but they are problems for me: 1. I cannot print them correctly on either Windows or Mac. 2. I cannot view them at all on the mobile device with a highly competent web browser with which I do an increasing proportion of my information consumption. 3. I cannot enter the name of an author correctly if that name includes non-ASCII characters. 4. I cannot provide an actual illustrative working example of the use of non-ASCII text in Internet Protocols. I have no problem with people disagreeing on the way forward, but I'm getting a bit short-tempered when I hear people claim that nobody's said what the current problems are. The following are things that I am NOT asking for: 1. Graphics 2. PDF 3. The use of non-English languages in spec text - Tim ___ Ietf mailing list Ietf@ietf.org https://www.ietf.org/mailman/listinfo/ietf
Re: Why the normative form of IETF Standards is ASCII
I don't have any problem editing the source in one window while viewing the presentation document in another. Window? My ASR-33 doesn't have any windows. R's, John
Re: Why the normative form of IETF Standards is ASCII
If we really want to do something in this space first of all we need to agree on the problem, then on the requirements and THEN we can have a useful discussion. I thought the waterfall model of software design was discredited in about 1975. Rough consensus and running code, throwing darts at strawman proposals, works a lot better. xml2rfc does a pretty good job of capturing what needs to be in an RFC, so that is the strawman I would start from. R's, John
Re: Why the normative form of IETF Standards is ASCII
On 12 mrt 2010, at 6:58, John Levine wrote: Indeed, I know plenty of people these days who have no idea today how to produce an ASCII file with only tab, CR, and LF formatting characters. Type. Save as text. How hard is that? I have actually written a few drafts that way. The text part isn't hard, but the hard breaks at every line are, and the hard breaks at every page even more so. Tools to create those don't exist in today's world. The current process uses input and output formats that are similar enough that people wrongly think they're the same, even though of course they are not. Many people seem to assume that if we picked a new output format, we would necessarily change the input format to be the same as the output format, which I think would be a terrible idea. The input formats need to be reasonably easy for non-experts to create, Which it is not. xml2rfc is very hard to use for anyone who has otherwise no experience with XML just because it's XML (the proper nesting and terminating are hell) and also because at least 50% of the xml2rfc commands aren't documented. I don't understand why we would even need to discuss the output formats, you can get HTML and PDF without trouble, even though the text version is authoritative. It's the input that's the problem.
Re: Why the normative form of IETF Standards is ASCII
On 2010-3-17, at 8:48, Iljitsch van Beijnum wrote: I have actually written a few drafts that way. The text part isn't hard, but the hard breaks at every line are, and the hard breaks at every page even more so. Tools to create those don't exist in today's world. They do, e.g., something like fmt -w 72 draft.txt gsed '0~50G' draft.txt Lars
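For readers without fmt or GNU sed at hand, the same idea (hard line breaks at a fixed width, then a break every fixed number of lines) can be sketched in Python. This is an illustrative sketch, not a reproduction of the commands above: the function names are hypothetical, the page separator is assumed to be a form feed rather than the blank line that the sed expression appends, and the re-wrapping would mangle ASCII art or other preformatted text, so it only suits plain paragraphs.

```python
import textwrap


def wrap_paragraphs(text: str, width: int = 72) -> list[str]:
    """Re-wrap blank-line-separated paragraphs to at most `width` columns."""
    lines: list[str] = []
    for para in text.split("\n\n"):
        # Collapse internal whitespace, then break at word boundaries.
        lines.extend(textwrap.wrap(" ".join(para.split()), width=width))
        lines.append("")  # blank line between paragraphs
    if lines and lines[-1] == "":
        lines.pop()  # no trailing blank line
    return lines


def paginate(lines: list[str], page_length: int = 50) -> str:
    """Join lines into pages of `page_length` lines separated by form feeds."""
    pages = [lines[i:i + page_length] for i in range(0, len(lines), page_length)]
    return "\f\n".join("\n".join(page) + "\n" for page in pages)


if __name__ == "__main__":
    draft = "This is a long paragraph of draft text. " * 30
    print(paginate(wrap_paragraphs(draft)))
```

Headers, footers and page numbers are left out entirely; a real formatter would need to reserve lines for them on every page.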
Re: Why the normative form of IETF Standards is ASCII
Indeed, I know plenty of people these days who have no idea today how to produce an ASCII file with only tab, CR, and LF formatting characters. Type. Save as text. How hard is that? Good guess, but wrong. If you do that, you will still generally get various non-ASCII quotes and punctuation marks unless you carefully configure your WP program not to insert smart quotes or whatever they call it. I have actually written a few drafts that way. The text part isn't hard, but the hard breaks at every line are, and the hard breaks at every page even more so. Tools to create those don't exist in today's world. Right. Again, you've figured it out, but most people haven't. I write books in emacs with nroff-like markup, but my editors consider me pretty strange. Lucky for them, I have scripts to turn my stuff into RTF which works with the tools they use. As many people have pointed out, the world has moved on since 1980. (No, I'm not suggesting the IETF use Word.) xml2rfc is very hard to use for anyone who has otherwise no experience with XML just because it's XML (the proper nesting and terminating are hell) and also because at least 50% of the xml2rfc commands aren't documented. Are you assuming that the only way to write XML is by hand? Aw, come on. I don't understand why we would even need to discuss the output formats, you can get HTML and PDF without trouble, even though the text version is authoritative. It's the input that's the problem. Now I'm confused. Even though the RFC Editor mostly uses XML internally, they don't publish the XML, and there is a hand-edit step between the XML and the final output. If we could agree that the final XML was authoritative, and if necessary let them hire someone to fix xmlrfc so it can produce the text version without hand editing or postprocessing, that would be a big step forward. R's, John
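The smart-quote problem described above is easy to check for mechanically. The following is a minimal sketch, assuming the draft has already been decoded into a Python string; the helper names and the replacement table are hypothetical and cover only a few common word-processor substitutions.

```python
# Hypothetical mapping of common word-processor punctuation to ASCII.
REPLACEMENTS = {
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
    "\u2013": "-",   # en dash
    "\u2014": "--",  # em dash
}


def find_non_ascii(text: str) -> list[tuple[int, int, str]]:
    """Return (line, column, codepoint) for each character above 7-bit ASCII."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                hits.append((lineno, col, f"U+{ord(ch):04X}"))
    return hits


def asciify(text: str) -> str:
    """Replace the characters listed in REPLACEMENTS; leave all others alone."""
    return "".join(REPLACEMENTS.get(ch, ch) for ch in text)


if __name__ == "__main__":
    sample = "plain line\n\u201cquoted\u201d text"
    print(find_non_ascii(sample))
    print(asciify(sample))
```

Anything `find_non_ascii` still reports after `asciify` has run needs a human decision, which is roughly the situation an author is in after "Save as text".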
Re: Why the normative form of IETF Standards is ASCII
+1 On 3/17/2010 12:18 PM, John R. Levine wrote: If we could agree that the final XML was authoritative, and if necessary let them hire someone to fix xmlrfc so it can produce the text version without hand editing or postprocessing, that would be a big step forward.
Re: Why the normative form of IETF Standards is ASCII
+1 On Mar 17, 2010, at 9:03 PM, Tony Hansen wrote: +1 On 3/17/2010 12:18 PM, John R. Levine wrote: If we could agree that the final XML was authoritative, and if necessary let them hire someone to fix xmlrfc so it can produce the text version without hand editing or postprocessing, that would be a big step forward.
Re: Why the normative form of IETF Standards is ASCII
On 15.03.2010 22:34, Julian Reschke wrote: On 15.03.2010 22:19, Martin Rex wrote: ... It needs a painful lot of work to make free-floating formatting not come out with poor results. When I do the above, a piece of ASCII art with 3 lines of text and a box around it is broken across pages 8 and 9 for http://greenbytes.de/tech/webdav/rfc2616.html ... Yes. (And thanks for retrying). What you are observing is missing support for CSS3 paged media hints in most browsers. It would be great if the UAs would get better on that. (PrinceXML (http://www.princexml.com/) is a great program that gets this right). ... And, as it seems, IE8 and Opera... Best regards, Julian
Re: Why the normative form of IETF Standards is ASCII
Funny, I don't think anyone was suggesting PDF/A. The format most people have been suggesting is HTML. Donald brought up PDF/A as a strawman at the start of this discussion. And the fact is that even though many, many people submit HTML versions of their drafts it is not possible to retrieve them from the IETF site. And that is a really bad, really insulting waste of my time and the main reason I don't feel very inclined to work on RFCs. If you don't care about the format of the RFC then let those of us who care a very great deal change it. Or admit that what you really want is the fact that forcing everyone to use the teletype format and rubbing their noses in it is what makes you feel big and important and that the plain fact is that you really don't care what other people might think about this organization, all that really matters to you is that you are seen to get your way. The teletype format really does hurt my eyes to look at. On Sun, Mar 14, 2010 at 3:59 AM, Jari Arkko jari.ar...@piuha.net wrote: Running code, actual interest to deploy, and an incremental deployment model would probably take this matter further than the annual religious argument :-) Those who feel the pain should build/select tools and demonstrate that (a) they can produce high-quality PDF/A, (b) that it provides additional value, and (c) that you can start publishing drafts and RFCs that use such a format without causing a problem for those who still read ASCII or use rfcdiff. No one prevents you from publishing .pdf in addition to .txt as a draft. Just do it. For RFC 5534 we also got permission to publish the .pdf version alongside .txt, so the process does not prevent you from doing it. (But I ran out of time to fine-tune the RFC version of the PDF, so for now at least there is only the text version.) In short, just go do it.
Jari -- -- New Website: http://hallambaker.com/ View Quantum of Stupid podcasts, Tuesday and Thursday each week, http://quantumofstupid.com/
Re: Why the normative form of IETF Standards is ASCII
You can submit the HTML; the problem is that it seems to go in the bit bucket.

Since the preferred submission formats are XML or nroff, I see no reason that the HTML version could not be generated from the XML. The problem seems to be that the RFC Editor insists on using the XML to generate nroff and then makes all edits to the nroff; this is then used to generate the teletype version and a PDF of the teletype version.

The net effect appears to be that there is no HTML version available, even if the authors exclusively used HTML for the production process. Hence what I consider to be entirely justified anger on this point.

On Sun, Mar 14, 2010 at 2:35 PM, Jari Arkko jari.ar...@piuha.net wrote:

> Phillip,
>
> I don't want to enter a discussion about the merits of PDF/A over HTML at this time. However, I do agree with you that it would be nice if you could submit HTML. If it's true that it's currently prohibited to look at it or submit it, then that is something that could be fixed. I can take it up with the IESG. Or does someone see a reason why this should be prohibited (as long as you also submit ASCII)?
>
> Jari

--
New Website: http://hallambaker.com/
View Quantum of Stupid podcasts, Tuesday and Thursday each week, http://quantumofstupid.com/
Re: Why the normative form of IETF Standards is ASCII
On 14.03.2010 19:45, Phillip Hallam-Baker wrote:

> You can submit the HTML, the problem is that it seems to go in the bit bucket. Since the preferred submission formats are XML or nroff, I see no reason that the HTML version could not be generated from the XML. The problem seems to be that the RFC editor insists on using the XML to generate nroff and then makes all edits to the nroff, this is then used to generate the teletype version and a PDF of the teletype version.

Again: that's not true anymore; in many cases, most of AUTH48 happens to the XML version, and NROFF is only generated in the final step. Of course it would be great if we could avoid that final step, as it's (IMHO) a waste of time, and also adds the risk of subtle bugs (like whitespace in artwork/XML/ABNF).

> The net effect appears to be that there is no HTML version available, even if the authors exclusively used HTML for the production process. Hence, what I consider to be entirely justified anger on this point.

I think we should encourage the future RFC-Editor to publish the (X)HTML version as well, when available (*).

BR, Julian

(*) I do have a different preference about the tool to translate from RFC2629-XML to (X)HTML, but that's a separate topic.
Re: Why the normative form of IETF Standards is ASCII
Julian Reschke wrote:

> On 14.03.2010 19:45, Phillip Hallam-Baker wrote:
>
>> Since the preferred submission formats are XML or nroff, I see no reason that the HTML version could not be generated from the XML.

Are there numbers available from the RFC Editor about the use of XML vs nroff for document submissions during the past 1/2 years?

I would not be surprised if the preferred document format also varied with the document size, and might be related to some authors' desire to publish the content elsewhere as well.

>> The net effect appears to be that there is no HTML version available, even if the authors exclusively used HTML for the production process. Hence, what I consider to be entirely justified anger on this point.
>
> I think we should encourage the future RFC-Editor to publish the (X)HTML version as well, when available (*).

For some documents, there seems to be an alternative version of the document available, at least in PDF format. tools.ietf.org should probably add a URL for a floating HTML version in addition to the floating PDF version when an alternative document format is available.

Compare http://tools.ietf.org/pdf/rfc2616 with http://tools.ietf.org/pdf/rfc2817

If you look at the floating PDF for rfc2616, it has some formatting flaws (like a missing non-breaking space in the expression (STD 1) at the very beginning of the document and defective spacing in section 1.4 around the ASCII art diagram). These are problems that do not exist in the plain text ASCII version of that RFC.

So the big plus for the ASCII document version is that an author can spend his time entirely on the content and doesn't have to worry as much about formatting as with XML/HTML with arbitrary output page sizes and font sizes.

And a document format that does not, by default, limit the line length to at most 100 chars should *NOT* be used. The most preferred line length is ~75 chars per line. (The printed copy of ISO-C 9899:1990 that I have on my desk uses 96 cpl.)

And I firmly believe that the RFC Editor should continue to provide RFCs preformatted in the original ASCII text format so that they can still be displayed and used without any kinds of tools in the existing environments, and contents of RFCs can be copied into source code comments without any reformatting.

And just for the fun of it, I just sent a plain RFC to our laser printer here (HP M5035MFP) with netcat (nc). The printer honours the Formfeed just fine! The left margin is a little narrow, but that's an easy one:

  Unix:   sed 's/^/ /' rfc.txt | nc -w 1 hp-net-printer 9100
  Cygwin: sed -b 's/^/ /' rfc.txt | nc -w 1 hp-net-printer 9100

One thing that is odd, though: the RFC Editor seems to constantly produce title pages of incorrect length. The lines with the form feed are at lines (59,115,171,227), which means that there are two excess CR+LF on the title page.

Printing the documents with Microsoft Word is not that difficult. Load it as .txt, remove two newlines at the beginning of the title page, select page margins at 1/1 left/right; font Courier New and font size 10 throughout should work on A4 paper. Printing them 2-up probably makes sense, and may be easier on your eye than a free-floating 1-column printout of an HTML version of the document.

And on something like an iPhone, with a screen resolution of 320x480, a traditional ASCII RFC should display just fine in fullscreen landscape mode. xterm -fn fixed -geometry 80x24 creates a window which, according to xwininfo, has a size of 484x316 pixels (i.e. a 6x13 pixel font with a one-pixel border all around).

-Martin
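The form-feed positions reported above can be checked mechanically. The following is only a rough sketch in Python, not anything the poster ran: the helper names `form_feed_lines` and `page_lengths` are made up for illustration, and the only data used is the list of line numbers quoted in the message. It shows that the first form feed arrives later than the constant spacing of the following pages would suggest; how many of those lines count as excess depends on the expected page layout, which the message does not spell out.

```python
def form_feed_lines(text):
    """Return the 1-based numbers of the lines that contain a form feed.

    The text is split on "\n" only, because str.splitlines() would also
    treat the form feed itself as a line boundary.
    """
    return [n for n, line in enumerate(text.split("\n"), start=1) if "\f" in line]


def page_lengths(ff_lines):
    """Count the lines from each page start through its form-feed line."""
    lengths = []
    previous = 0
    for n in ff_lines:
        lengths.append(n - previous)
        previous = n
    return lengths


# Line numbers quoted in the message above (not measured here).
reported = [59, 115, 171, 227]
print(page_lengths(reported))  # [59, 56, 56, 56]
```

To run the same check on a local copy of an RFC, read the file as text and pass its contents to `form_feed_lines` first.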
Re: Why the normative form of IETF Standards is ASCII
On 15.03.2010 21:01, Martin Rex wrote:

> ... Are there numbers available from the RFC Editor about the use of XML vs nroff for document submissions during the past 1/2 years? ...

That would be interesting.

> ... So the big plus for the ASCII document version is that an author can spend his time entirely on the content and doesn't have to worry as much about formatting as with XML/HTML with arbitrary output page sizes and font sizes. And a document format that does not, ...

Using the XML format separates you more from formatting than plain text. In particular, list formatting, page breaks and indentations become non-issues. Font sizes and page sizes just do not matter at all.

> by default, limit the line length to at most 100 chars should *NOT* be used. The most preferred line length is ~75 chars per line. (the printed copy of ISO-C 9899:1990 that I have on my desk uses 96 cpl).

A document format should not have an inherently fixed line width at all. Prose needs to be reflowable; ASCII art/diagrams/code should not. This in itself requires some amount of metadata in the document.

> And I firmly believe that the RFC Editor should continue to provide RFCs preformatted in the original ASCII text format so that they can still be displayed and used without any kinds of tools in the existing environments, and contents of RFCs can be copied into source code comments without any reformatting.

I have no problem with that.

> ... Printing the documents with Microsoft Word is not that difficult. Load it as .txt, remove two newlines at the beginning of the title page, select page margins at 1/1 left/right; font Courier New and font size 10 throughout should work on A4 paper. Printing them 2-up probably makes sense, and may be easier on your eye than a free-floating 1-column printout of an HTML version of the document. ...

Open properly formatted HTML in browser, make sure shrink-to-fit is disabled and zoom level is 100%, print.

...

Best regards, Julian
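The point about reflowable prose versus fixed artwork can be illustrated with a small sketch. This is an assumption-laden toy in Python, not how xml2rfc or any tool named in the thread works: the `render` function and the `("prose" | "artwork", text)` block tuples are hypothetical stand-ins for the metadata the message says a document needs.

```python
import textwrap


def render(blocks, width=72):
    """Reflow prose blocks to `width`; pass artwork blocks through untouched."""
    out = []
    for kind, text in blocks:
        if kind == "prose":
            # Collapse the original line breaks and spacing, then re-wrap.
            out.append(textwrap.fill(" ".join(text.split()), width=width))
        elif kind == "artwork":
            # Diagrams and code keep their exact layout.
            out.append(text)
        else:
            raise ValueError(f"unknown block kind: {kind!r}")
    return "\n\n".join(out)


# Hypothetical two-block document: one paragraph, one ASCII diagram.
doc = [
    ("prose", "Client and server exchange   hello messages\n"
              "before any application data is sent."),
    ("artwork", "  Client          Server\n"
                "    | --- hello ---> |\n"
                "    | <--- hello --- |"),
]
print(render(doc, width=40))
```

Without the per-block kind, a renderer has no reliable way to tell which runs of text may be rewrapped, which is the metadata requirement the message describes.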
Re: Why the normative form of IETF Standards is ASCII
Julian Reschke wrote:

>> Printing the documents with Microsoft Word is not that difficult. Load it as .txt, remove two newlines at the beginning of the title page, select page margins at 1/1 left/right; font Courier New and font size 10 throughout should work on A4 paper. Printing them 2-up probably makes sense, and may be easier on your eye than a free-floating 1-column printout of an HTML version of the document. ...
>
> Open properly formatted HTML in browser, make sure shrink-to-fit is disabled and zoom level is 100%, print.

It needs a painful lot of work to make free-floating formatting not come out with poor results.

When I do the above, an ASCII art with 3 lines of text and a box around it is broken across pages 8 and 9 for http://greenbytes.de/tech/webdav/rfc2616.html

In order to support free-float formatting, you will have to tag *ALL* paragraphs, bullet points, drawings, sections and what have you with non-breakability information (in TeX it is called badness) so that formatting doesn't break badly as it currently does. HTML may be OK for screen rendering, but it is very poor at printing.

This particular free-float HTML version of rfc2616 also creates some close-to "intentionally left blank" pages (and the purpose of page 120 in the above document is a mystery to me).

And while the Headers/Footers that a browser puts on a printout might make sense for printing web pages, I strongly prefer the original headers and footers on RFCs and I-Ds!

-Martin
Re: Why the normative form of IETF Standards is ASCII
On 15.03.2010 22:19, Martin Rex wrote:

> ... It needs a painful lot of work to make free-floating formatting not come out with poor results. When I do the above, an ASCII art with 3 lines of text and a box around it is broken across pages 8 and 9 for http://greenbytes.de/tech/webdav/rfc2616.html ...

Yes. (And thanks for retrying.) What you are observing is missing support for CSS3 paged media hints in most browsers. It would be great if the UAs would get better at that. (PrinceXML (http://www.princexml.com/) is a great program that gets this right.)

> In order to support free-float formatting, you will have to tag *ALL* paragraphs, bullet points, drawings, sections and what have you with non-breakability information (in TeX it is called badness) so that formatting doesn't break badly as it currently does. HTML may be OK for screen rendering, but it is very poor at printing.

HTML has nothing to do with that; CSS does. The necessary specs are there, but implementations lag behind. Probably because fewer and fewer people actually care about printing stuff (sic).

> This particular free-float HTML version of rfc2616 also creates some close-to "intentionally left blank" pages (and the purpose of page 120 in the above document is a mystery to me).

I have no idea what page 120 is for you (that's the point of a reflowable format ;-). If you get me some more details (what's on 119 and on 121?), I may be able to offer a theory :-).

> And while the Headers/Footers that a browser puts on a printout might make sense for printing web pages, I strongly prefer the original headers and footers on RFCs and I-Ds!

Again, something that is specified in the CSS paged media spec, and at least one implementation gets it right already.

Best regards, Julian
Re: Why the normative form of IETF Standards is ASCII
On 14.03.2010 01:39, Doug Ewell wrote:

> ... A different PDF creation program other than Word might not insert the registered-trademark symbol anyway. Or it could be edited out, if this is truly a deal-breaker. But I thought the contents were what was important. PDF is a binary format and there are lots of other bytes in this file with the high bit set, if we want to bring that up too. ...

Actually, it can be plain ASCII, see http://greenbytes.de/tech/tc2231/inlwithasciifilenamepdf.asis.

Best regards, Julian
Re: Why the normative form of IETF Standards is ASCII
Phillip,

I don't want to enter a discussion about the merits of PDF/A over HTML at this time. However, I do agree with you that it would be nice if you could submit HTML. If it's true that it's currently prohibited to look at it or submit it, then that is something that could be fixed. I can take it up with the IESG. Or does someone see a reason why this should be prohibited (as long as you also submit ASCII)?

Jari
Re: Why the normative form of IETF Standards is ASCII
On 3/14/2010 11:35 AM, Jari Arkko wrote:

> I do agree with you that it would be nice if you could submit HTML. If it's true that it's currently prohibited to look at it or submit it, then that is something that could be fixed.

That's certainly a reasonable view, but it raises the concern about supporting multiple revisable submission forms. For each form supported, there should be very high benefit, since there is real, incremental cost for each one.

Given xml2rfc, and given that modern browsers can render XML source (given the correct supporting file), I guess I do not see the substantial benefit of supporting HTML as a submission format.

d/
--
Dave Crocker
Brandenburg InternetWorking
bbiw.net
Re: Why the normative form of IETF Standards is ASCII
Are you asking about I-Ds or RFCs? I cannot see any HTML version of any I-D in the directory, and I think it should be easy to enable that (just as we have enabled PDF submission). Getting the RFC Editor to publish your own HTML is a different matter, because then we are talking about permanent documents.

But my point was: start doing this for I-Ds and show others that it works; THEN it becomes easier to talk about making RFCs use PDF/A or HTML or whatever.

Jari
Re: Why the normative form of IETF Standards is ASCII
On Sun, Mar 14, 2010 at 10:35 AM, Jari Arkko jari.ar...@piuha.net wrote:

> I don't want to enter a discussion about the merits of PDF/A over HTML at this time.

For the record, if the IETF were to entertain the notion of blessing a format other than legacy-ASCII, I'd be strongly against any form of PDF. It seems obvious that a simple constrained dialect of HTML is maximally portable, accessible, and usable compared to anything else.

-Tim
Re: Why the normative form of IETF Standards is ASCII
Since we are destined to keep pretending that character sets and document formats are one and the same...

Martin Rex mrex at sap dot com wrote:

>>> all unicode codepoints from their glyphs (and a number of them can not be distinguished by their glyphs), and even worse, most machines/environments do not even have fonts to display glyphs for most of the unicode codepoints.
>>
>> That is an argument for not allowing *all* Unicode characters.
>
> Which languages do you want to discriminate against, and on what grounds? Anything beyond US-ASCII is unfair to those that are not in the set.

Since there are some users who cannot display 100,000 discrete characters, let's restrict all users to 95 characters, discriminating against every language except English and Swahili. Brilliant.

> Some people think internationalized domain names are a good idea. I think they are a pretty stupid idea, because they're a significant roadblock for international communication. Lots of people around the globe will have severe difficulties accessing some Web-Site that uses a fancy internationalized domain name, or someone using a fancy internationalized email address. If you don't happen to know the language, recognize the glyphs and have a platform where you can actually create/type that on your keyboard, then you will not be able to read/use such web server or email addresses from paper or television ads or from a paper business card.

Domain names in Cyrillic or Arabic or Han aren't intended for you and me; they're intended for users who know those fancy scripts and are able to input those fancy characters. If the owners of those Web sites and e-mail addresses want to reach Latin-script users, they will have a Latin-script domain name as well.

> But more often than not, the screen-oriented formatting in HTML results in the printouts being truncated at the right border or filled with white spaces.

This is true only when the page author has included a big, wide graphic image that redefines the minimum width of the page to be wider than can fit on the printed page. It is not an inherent flaw of HTML.

--
Doug Ewell | Thornton, Colorado, USA | http://www.ewellic.org
RFC 5645, 4645, UTN #14 | ietf-languages @ http://is.gd/2kf0s
Re: Why the normative form of IETF Standards is ASCII
Masataka Ohta mohta at necom830 dot hpcl dot titech dot ac dot jp wrote:

>> The problem with email is people use html way too much. TXT -> HTML -> TXT does not work reliably. Too many one way transformations.
>
> That's enough to deny the following statement of Doug Ewell;
>
>> You could have HTML or PDF-A that uses nothing but ASCII characters.

I don't know how these simple statements (annotated here for reference):

  You could have [1] plain text encoded in UTF-8. You could have [2] HTML or [3] PDF-A that uses nothing but ASCII characters.

came to be twisted by Ohta-san so imaginatively.

[1] Here is an example of plain text encoded in UTF-8: http://www.ewellic.org/non-ascii.txt

[2] Here is an example of HTML that uses nothing but ASCII characters: http://www.ewellic.org/ascii-only.html

[3] Here is an example of PDF-A that uses nothing but ASCII characters: http://www.ewellic.org/ascii-only.pdf

Note that my domain-name provider purposely inserts HTML banner-ad header and/or footer material into all text and HTML documents displayed on this site, as a condition of getting free Web hosting. You should download the first two files instead of displaying them directly.

This demonstrates that you can have plain text encoded in UTF-8, and that you can have HTML and PDF-A that use nothing but ASCII characters. Q.E.D.

--
Doug Ewell | Thornton, Colorado, USA | http://www.ewellic.org
RFC 5645, 4645, UTN #14 | ietf-languages @ http://is.gd/2kf0s
Re: Why the normative form of IETF Standards is ASCII
Doug Ewell wrote:

> came to be twisted by Ohta-san so imaginatively.

I'm simply realistic.

> [3] Here is an example of PDF-A that uses nothing but ASCII characters: http://www.ewellic.org/ascii-only.pdf

I'm afraid the PDF file contains a non-ASCII character, a circled R, in the metadata for pdf:Creator.

Thank you for a convincing demonstration to deny yourself.

Masataka Ohta
Re: Why the normative form of IETF Standards is ASCII
Masataka Ohta mohta at necom830 dot hpcl dot titech dot ac dot jp wrote:

>> [3] Here is an example of PDF-A that uses nothing but ASCII characters: http://www.ewellic.org/ascii-only.pdf
>
> I'm afraid the PDF file contains non-ASCII character of circled R in metadata for pdf:Creator. Thank you for a convincing demonstration to deny yourself.

Metadata? Is that what we're talking about? Not the contents of the text?

A different PDF creation program other than Word might not insert the registered-trademark symbol anyway. Or it could be edited out, if this is truly a deal-breaker. But I thought the contents were what was important.

PDF is a binary format and there are lots of other bytes in this file with the high bit set, if we want to bring that up too.

--
Doug Ewell | Thornton, Colorado, USA | http://www.ewellic.org
RFC 5645, 4645, UTN #14 | ietf-languages @ http://is.gd/2kf0s
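Whether a given file contains "bytes with the high bit set" is easy to verify directly rather than argue about. The sketch below is a minimal Python illustration under stated assumptions: the helper names `non_ascii_offsets` and `is_pure_ascii` are invented here, and the sample byte strings are made up for the example; they are not the contents or metadata of the PDF discussed in the thread.

```python
def non_ascii_offsets(data, limit=10):
    """Return up to `limit` (offset, byte) pairs for bytes with the high bit set."""
    found = []
    for offset, byte in enumerate(data):
        if byte >= 0x80:
            found.append((offset, byte))
            if len(found) == limit:
                break
    return found


def is_pure_ascii(data):
    """True when no byte in `data` has its high bit set."""
    return not non_ascii_offsets(data, limit=1)


# Made-up samples: an ASCII-only header line, and a string carrying the
# registered-trademark sign encoded as UTF-8.
ascii_sample = b"%PDF-1.4 plain"
marked_sample = "Creator: Word\u00ae".encode("utf-8")

print(is_pure_ascii(ascii_sample))       # True
print(non_ascii_offsets(marked_sample))  # [(13, 194), (14, 174)]
```

For a real file, the same functions can be applied to the result of reading it in binary mode, which reports the offending offsets regardless of whether they fall in metadata or content.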
Re: Why the normative form of IETF Standards is ASCII
> [3] Here is an example of PDF-A that uses nothing but ASCII characters: http://www.ewellic.org/ascii-only.pdf

I've replaced this with another PDF file created by a program (Acrobat Distiller 6.0.1) whose name, as displayed in the Properties dialog, doesn't include a non-ASCII symbol.

Of course my old version of Distiller doesn't support PDF/A, as I'm sure newer versions do, but you should get the idea: allowing RFCs to contain non-ASCII does not imply changing the format from plain text, and vice versa.

--
Doug Ewell | Thornton, Colorado, USA | http://www.ewellic.org
RFC 5645, 4645, UTN #14 | ietf-languages @ http://is.gd/2kf0s
Re: Why the normative form of IETF Standards is ASCII
Doug Ewell wrote:

>> I'm afraid the PDF file contains non-ASCII character of circled R in metadata for pdf:Creator. Thank you for a convincing demonstration to deny yourself.
>
> Metadata? Is that what we're talking about?

Yes.

> PDF is a binary format and there are lots of other bytes in this file with the high bit set, if we want to bring that up too.

As we are discussing text formats, let's ignore PDF. PERIOD.

Masataka Ohta
Re: Why the normative form of IETF Standards is ASCII
Running code, actual interest to deploy, and an incremental deployment model would probably take this matter further than the annual religious argument :-)

Those who feel the pain should build/select tools and demonstrate that (a) they can produce high-quality PDF/A, (b) that it provides additional value, and (c) that you can start publishing drafts and RFCs that use such a format without causing a problem for those who still read ASCII or use rfcdiff.

No one prevents you from publishing .pdf in addition to .txt as a draft. Just do it. For RFC 5534 we also got a permission to publish the .pdf version alongside .txt, so the process does not prevent you from doing it. (But I ran out of time to fine-tune the RFC version of the PDF, so for now at least there is only the text version.)

In short, just go do it.

Jari
Re: Why the normative form of IETF Standards is ASCII
On 12.03.2010 01:11, Martin Rex wrote:

> Jorge Amodio wrote:
>
>>> I'd potentially agree if the format we actually use wouldn't have useless page breaks that leave 25% of the pages unused. At least over here. I'd also agree if that format would actually be usable on small devices like ebook readers (where it's essential that you can reflow the text).
>>
>> Agree, lot of white space.
>
> Actually, the page breaks _are_ useful. Like when referencing specific parts/paragraph in a document with an URL in a long section, e.g. http://tools.ietf.org/html/rfc5246#page-36 which contains the message flow of a full TLS handshake. And that message flow is just perfect in ASCII art.

That URL points to an HTML document, not a TXT document. There is (unfortunately) no fragment identifier syntax for text/plain (at least not one that UAs actually support).

And guess what: if we go directly to HTML, we'd have anchors as well, but not only for section numbers, but also figures, tables, or even individual paragraphs.

> Discussing parts of documents that are not ASCII text is extremely difficult on IETF mailing lists and wastes huge amounts of network bandwidth if put in graphics attachments.

That's an argument about image formats, which is orthogonal. Just because other formats would *allow* more complex graphics formats doesn't imply we would have to use them.

> Unicode characters are also a Royal PITA in specs, because they're non-discussable. There are extremely few people who can recognize

non-discussable?

> all unicode codepoints from their glyphs (and a number of them can not be distinguished by their glyphs), and even worse, most machines/environments do not even have fonts to display glyphs for most of the unicode codepoints.

That is an argument for not allowing *all* Unicode characters.

> Having stuff that can only be copy-pasted for a large part of the internet population, but neither typed nor displayed, is nothing that we need in our specs.
>
> Using HTML or PDF for RFCs is about the same as moving from English language RFCs to Mandarin language RFCs. There is a huge number of people who can read it, but there is also a large number of current RFC and I-D consumers and producers who can not and do not want to use Mandarin.

Sorry? Are you implying anybody is unable to display HTML?

> I do not doubt that there are tools available for heavy graphical user interfaces and specific platforms that can deal with Mandarin just fine. But I do not understand Mandarin, my tools can not cope with it and a lot of my platforms and my environments can not cope with it. HTML and PDF are only marginally better than Mandarin.

Sorry? With all due respect, this statement is really ridiculous; at least in the context of HTML.

> Btw. printing out I-Ds and RFCs on paper (even 2-up and double sided) has always been working just fine for me with tools like a2ps.

Sounds great if you have a Postscript printer.

However, what typically happens is: open in browser, print, fail. Ask for advice, and if you're lucky get pointed to a program that understands form feeds in text/plain, such as Wordpad. Retry. Get a printout with large parts of the page being empty (depending on locale).

> Printing out HTML is a royal pain and waste of paper. I don't know what it is, but in 80% of the time when I print out Web content, content is cut off at the right side (for both MSIE and Firefox on Windows).

Yes, there are bugs. And yes, there is complex HTML out there that is tricky to print. Use something simple, such as the HTML produced by rfc2html or rfc2629.xslt, and there shouldn't be a problem.

> Printing PDF works, but reading PDF on screen is a royal PITA.

Yes.

> Even with a 1600x1200 screen, displaying a single page is

The concept of a page is part of the problem - for the current format as well. Pages are good when the primary output format is paper, and the paper size is known in advance. Today, that's the edge case.

> difficult to achieve -- and hardly legible with many PDFs that I've come across -- and if you choose a legible size, then the page doesn't fit and you have to page down to see the bottom of the page. Getting 62 lines of pure ASCII text with a legible font displayed on a 1024x768 screen is much easier and much more legible. And doing a 2-up of an RFC or I-D has a predictable legibility. ...

How is that different for (a properly selected subset of) HTML?

Best regards, Julian
Re: Why the normative form of IETF Standards is ASCII
Hi

A favourite topic revisited :-)

Frankly speaking, all other standards fora use MS-Word (or similar) format, and as far as I can see they seem to manage it; for instance, OpenOffice can be selected as the document tool.

Plain ASCII worked well when RFC768 was specified. Today protocols and algorithms are much more complex. You can easily find RFCs with flowcharts that span two pages; they easily get difficult to follow. Don't even think about formulating complex equations...

If the intention is that the RFCs should survive a global nuclear war, then plain ASCII on stone tablets stored in some cave on Svalbard is likely the best choice, but I would not believe that people care about RFCs if sh-t hits the fan.

I strongly believe that it is at some stage time to consider more modern document formats.

Regards
Ingemar

Message: 3
Date: Thu, 11 Mar 2010 20:24:58 +0100
From: Julian Reschke julian.resc...@gmx.de
Subject: Re: Why the normative form of IETF Standards is ASCII
To: Jorge Amodio jmamo...@gmail.com
Cc: ietf@ietf.org
Message-ID: 4b99438a.8010...@gmx.de
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

On 11.03.2010 19:44, Jorge Amodio wrote:

> On Thu, Mar 11, 2010 at 11:12 AM, Julian Reschke julian.resc...@gmx.de wrote:
>
>> On 11.03.2010 17:54, Jorge Amodio wrote:
>>
>>> Besides your eyes, (only one in some cases), you don't need any extra junkware to be able to read the RFCs, even better, without eyes you still can do it since text to speech works very nicely with ASCII. ...
>>
>> I'd claim that accessibility for properly authored HTML will actually be better, for instance the markup can express whether something is prose or artwork.
>
> HTML uses ASCII as far as I remember, some tags, URIs and URLs may be impossible to decipher these days but still ASCII (I've to admit that some folks still use-abuse extended ASCII on HTML pages instead of proper encoding and lang selection).

HTML actually uses Unicode. All current element and attribute names are ASCII, in case you meant that.

I don't understand the second statement; you appear to mix up character sets, encodings (and their declarations) with language information.

> About text to speech, it only takes a forward or going through one of the stupid no context aware robo-translators and you will get your t2s interface reciting gee tee ampersand semicolon greater than eich ref equal lower than bee greater than ... I guess you get the point.

I believe this to be not true, as long as you use the right tools (such as an HTML UA instead of a text editor).

> And I agree with Martin, all other formats add a lot of unnecessary crap to the documents, embedded fonts, meta-crap data, hooks to track document changes.

That's why we would need to talk about a profile of the available features.

> And ASCII is more eco-friendly :-)

I'd potentially agree if the format we actually use wouldn't have useless page breaks that leave 25% of the pages unused. At least over here.

I'd also agree if that format would actually be usable on small devices like ebook readers (where it's essential that you can reflow the text).

Best regards, Julian
RE: Why the normative form of IETF Standards is ASCII
Just for the record, because I think it's a data point that's relevant to this discussion: I have never been able to get text-format RFCs to print properly on either Windows or Mac.

I have found that a painless way to do so is to open the .txt file with Word (Office 2003). Uses Courier and understands the page breaks. Maybe not a purists' solution, but it works with no extra effort at all (at least as Word is set up here, which I have no reason to believe unusual). Right click, pick off sub-menu, print. (Some drafts require an extra click to manage non-standard ends of line, but very few.)

What I don't recall is why I even tried it in the first place, possibly by accident. Does Open Office work?
Re: Why the normative form of IETF Standards is ASCII
On Thu Mar 11 16:54:21 2010, Martin Rex wrote: Richard Shockey wrote: I do get the arguments in favour of ASCII, though I think there are some pretty serious countervailing arguments (like, for instance, that we can't spell many contributors' names, to take an easy one). But the RFC format _is not_ plain ASCII. Just ask anyone whose draft has failed the increasingly stringent and lengthy list of IDNits tests due to bad pagination in their I-D. The difficulty to spell contributors' names is a completely ridiculous reason. If there is anyone competent to specify how to spell his name in plain ASCII, then it is the authors and contributors themselves -- and if they are available at all, then it is during the process of their contribution and the document creation. Right, so how would one spell, for example, Tronçon in ASCII? Leaving off the diacritical leaves the suffix con, which is somewhat less than polite French. I accept that in some cases, there are generally recognised ASCIIfications of names, but that's not universal, and certainly not always acceptable to those people suffering from non-ASCII names. Really, the year is 2010, and I see absolutely no excuse for the IETF's inability to support writing author's names properly - this is plain embarassing. The existing plaintext ASCII format is easy and univerval. Any more fancy document formats come with plenty of problems and infinitesimal close to zero benefit. I would argue that the ubiquity of HTML and the comparitive rarity of text/plain viewers of similar capability actually reverses this argument at this point in time. I certainly now view the RFCs and I-Ds *solely* through a translation layer to HTML, strongly suspect this is the case for the vast majority of people working with RFCs and I-Ds, and I'd find it very significantly more useful if there were ways for authors to provide more metadata. 
Certainly in the XSF, use of an XML format for the XEPs, complete with anchors for Examples, Sections, etc in the HTML form, and metadata sufficient to syntax-highlight in the PDF form, has been a tangible benefit on many occasions, speaking personally. Creating, displaying and printing, processing and updating the I-D and RFCs in the current form was possible 30 years ago, is possible and quite easy today (just try NRoffEdit once), and will be possible and easy in 30 years from now. All other formats will come with a varying number of problems. Taking an existing formatted ASCII RFC or I-D (which you did not author yourself) and putting it back into authoring format is around 1 hour of work with Nroffedit. Having taken quite large, NROFF edited documents and transformed them into RFC 2629 format, for further editing, I'd comfortably argue that this can be independent of the authoring format, and moreover, this is a moot point if the authoring format is the archival format. This is the case at the XSF, where the http://xmpp.org/extensions/ page points to rendered versions of the archival format. (In two presentation formats, incidentally). Diffing various revisions of documents is fairly easy with existing tools e.g. http://tools.ietf.org/rfcdiff Only because those tools are written. Diffing tools have been written for the much smaller community in the XSF, so again, I dispute your implicit assertion that this is made possible by the formatting choice. The problem with basically all of the fancy formats is that none of your existing tools can cope with them; the possibility to create such a format is often limited to specific platforms, environments or tools.
Diffing with previous versions of documents is difficult, converting a published document back into authoring format is EXTREMELY difficult, the size of the document often grows by factors, and searching and displaying such documents may require specific new tools and platforms and be therefore impossible for a number of platforms and environments where RFCs and I-Ds are currently displayed, searched and processed. I have no objection whatsoever to maintaining the text format as one presentation format. I do think it's time to look at lessons learnt from xml2rfc and the XSF's XML format, and consider developing a new specification format from both that then provides a solid base for a new archival format, which can be used to consistently generate presentation formats that take full advantage of existing and widely deployed technology. Both these formats have existed now for a decade or so, and that's surely long enough that we have some clear ideas about how to proceed. And searching for and comparing characteristics of graphics or graphical drawings instead of text is a field that needs another two decades of research. I firmly agree that we should prohibit the use of graphical diagrams at this stage. (It may be that in the future, SVG may
Re: Why the normative form of IETF Standards is ASCII
Hi Dave, why don't you write a draft? Some possible section headlines:

0. Introduction, abstract, boilerplate
1. Lessons learned from
  1.1. xml2rfc
  1.2. XSF
    1.2.1. Why the XYZ doesn't use RFCs
  1.3. W3C
2. Tools to be leeched
3. Generating ASCII
  3.1. Limitations on the source
4. Turning existing documents into the new format
5. Why HTML and unicode instead of...
  5.1. PDF/A
  5.2. Microsoft Word
  5.3. 72-column ASCII
  5.4. 72-column UTF-8
  5.5. Undocumented running code
6. Author, references, acknowledgments
7. More boilerplate

And so on. Personally, I see the sense in moving from ASCII to UTF-8. Unicode has beaten off everything else now, it's a clear and safe choice, and non-ASCII characters are used more and more often. It's not so clear to me that bold/italics/hyperlinks are worth the change, not to mention graphics. Btw, when you say the authoring format, I think you may overestimate IETFers' willingness to march in step. Arnt
Re: Why the normative form of IETF Standards is ASCII
Masataka Ohta mohta at necom830 dot hpcl dot titech dot ac dot jp wrote: These are two separate topics. You could have plain text encoded in UTF-8. You could have HTML or PDF-A that uses nothing but ASCII characters. As was demonstrated by Tim Bray against Martin Rex's ASCII message, they are strongly interrelated topics. Sorry, I don't understand this at all. A space character (which looks like an ordinary U+0020 space to me, in both the plain-text message I received and in the Web archive) got erroneously converted to a question mark in Tim's plain-text mail. How does this demonstrate anything about PDF? -- Doug Ewell | Thornton, Colorado, USA | http://www.ewellic.org RFC 5645, 4645, UTN #14 | ietf-languages @ http://is.gd/2kf0s
Re: Why the normative form of IETF Standards is ASCII
I wrote: A space character (which looks like an ordinary U+0020 space to me, in both the plain-text message I received and in the Web archive) got erroneously converted to a question mark in Tim's plain-text mail. Actually, I take that back. All the spaces in Tim's message look fine to me; only in Ohta-san's quoting of Tim's message do I see the question mark. Regardless of where the question mark came from, this still has nothing to do with PDF. It has only to do with tools that misinterpret character sets or transform plain text. -- Doug Ewell | Thornton, Colorado, USA | http://www.ewellic.org RFC 5645, 4645, UTN #14 | ietf-languages @ http://is.gd/2kf0s
Re: Why the normative form of IETF Standards is ASCII
Julian Reschke wrote: Actually, the page breaks _are_ useful. Like when referencing specific parts/paragraph in a document with an URL in a long section, e.g. http://tools.ietf.org/html/rfc5246#page-36 which contains the message flow of a full TLS handshake. And that message flow is just perfect in ASCII arts. That URL points to an HTML document, not a TXT document. There is (unfortunately) no fragment identifier syntax for text/plain (at least not one that UAs actually support) Wrong. It points to a TXT document that is rendered as HTML. If you abide to certain conventions in your plain ASCII text, then everyone can recognize and use them (RFC/ID - HTML or - PDF converters, accessibility tools like text-speech). And it still renders just fine on pure text environments and over very low bandwidth links. I-Ds and RFCs are not publish and forget documents, but instead they're vivid snapshots of working group discussions in constant motion and under discussion, and one of the most important aspects is that others can easily build a derivative work from an existing document (especially for expired I-Ds). So it is extremely important that the published format is easy to quote in ASCII-Emails, can be easily quoted in ASCII code comments, and easily incorporated into new documents. Just try NRoffEdits conversion I-D - authoring nroff source and see how easy that is. It's a single all-in-one tool written in Java, basically wysiwyg with spell checker included and makes I-D editing extremely easy. And guess what: if we go directly to HTML, we'd have anchors as well, but not only for section numbers, but also figures, tables, or even individual paragraphs. Anchors in plain-ASCII text that are human-comprehensible can be automatically converted into real URLs and anchors with simple tools. These tools exist and work just fine with the existing plain-ascii text documents. 
all unicode codepoints from their glyphs (and a number of them can not be distinguished by their glyphs), and even worse, most machines/environments do not even have fonts to display glyphs for most of the unicode codepoints. That is an argument for not allowing *all* Unicode characters. Which languages do you want to discriminate against, and on what grounds? Anything beyond US-ASCII is unfair to those that are not in the set. I'm typing on a German keyboard, but there are a significant number of characters in ISO-8859-1 (or ISO-8859-15) that I can not type. Over the years I've become accustomed to not using any Umlauts in Emails, even when writing in German. The employee names in our Outlook address book also do not use any non-ascii characters in consideration to our world-wide subsidiaries. Adding names into an Email Addressbook that most people can not type on their keyboards just doesn't make any sense. Have you ever heard of X.400 Mail? The story goes like this: Two people meet at a conference, they both have Internet-Email, exchange their email addresses and happily communicate thereafter. Two other people meet at a conference, one of them has Internet-Email, the other X.400. The X.400-person takes the Internet-Email address home, sends an EMail and both hope that the reply function will work. Yet two other people meet at a conference, both of them have X.400. They exchange their phone numbers. If you want people to communicate, they need to share a common language. If they don't happen to already know one common language (in which case the difficulty/complexity of that language doesn't matter), then they should probably standardize on a language that is easy to learn for both of them and has a high likelihood to be useful for communication with other people as well. And it makes perfect sense to not only standardize on that single language in spoken communication, but also in written communication.
Anyone who enters IETF discussions, which are Email-based for a large part, should provide a description of his own name with letters from the US-ASCII alphabet, rather than forcing others to make guesses how to do it given some kind of gibberish codepoints from awkward codepages. Some people think internationalized domain names are a good idea. I think they are a pretty stupid idea, because they're a significant roadblock for international communication. Lots of people around the globe will have severe difficulties accessing some Web-Site that uses a fancy internationalized domain name, or someone using a fancy internationalized email address. If you don't happen to know the language, recognize the glyphs and have a platform where you can actually create/type that on your keyboard, then you will not be able to read or use such web server or email addresses from paper or television ads or from a paper business card. We are really lucky that the world standardized on a single set of glyphs to represent digits, use of the decimal system and the same orientation for the
Re: Why the normative form of IETF Standards is ASCII
On 12.03.2010 16:58, Martin Rex wrote: Julian Reschke wrote: Actually, the page breaks _are_ useful. Like when referencing specific parts/paragraph in a document with an URL in a long section, e.g. http://tools.ietf.org/html/rfc5246#page-36 which contains the message flow of a full TLS handshake. And that message flow is just perfect in ASCII arts. That URL points to an HTML document, not a TXT document. There is (unfortunately) no fragment identifier syntax for text/plain (at least not one that UAs actually support) Wrong. It points to a TXT document that is rendered as HTML. No, it does not. It points to an HTML document that was converted from the original TXT version (on the server, by Henrik's rfcmarkup script). If you abide to certain conventions in your plain ASCII text, then everyone can recognize and use them (RFC/ID - HTML or - PDF converters, accessibility tools like text-speech). And it still renders just fine on pure text environments and over very low bandwidth links. So what do accessibility tools do when they encounter page breaks, with the header footer lines? What does a screen reader to with ASCII art? I-Ds and RFCs are not publish and forget documents, but instead they're vivid snapshots of working group discussions in constant motion and under discussion, and one of the most important aspects is that others can easily build a derivative work from an existing document (especially for expired I-Ds). Yes. That's an argument to require an easy to re-use *submission* format. But the submission format doesn't need to be identical to the publication format. The submission tool already allows you to supply additional source files, such as XML. I recommend to use that. So it is extremely important that the published format is easy to quote in ASCII-Emails, can be easily quoted in ASCII code comments, and easily incorporated into new documents. Yes. Just try NRoffEdits conversion I-D - authoring nroff source and see how easy that is. 
It's a single all-in-one tool written in Java, basically wysiwyg with spell checker included and makes I-D editing extremely easy. It's probably a nice tool for people willing to use NROFF. There are other nice tools, based on the RFC2629 XML format. And guess what: if we go directly to HTML, we'd have anchors as well, but not only for section numbers, but also figures, tables, or even individual paragraphs. Anchors in plain-ASCII text that are human-comprehensible can be automatically converted into real URLs and anchors with simple tools. These tools exist and work just fine with the existing plain-ascii text documents. Example? ... If you want people to communicate, they need to share a common language. If they don't happen to already know one common language (in which case the difficulty/complexity of that language doesn't matter), then they should probably standardize on a language that is easy to learn for both of them and has a high likelyhood to be useful for communication with other people as well. And it makes perfect sense to not only standardize on that single language in spoken communication, but also in written communication. Anyone who enters IETF discussions, which are Email-based for a large part, should provide a description of his own name with letters from the US-ASCII alphabet, rather than forcing others to make guesses how to do it given some kind of gibberish codepoints from awkward codepages. ... I don't have any problem with that. But requiring an ASCII transcription doesn't imply that the real name can't be used in *addition*. (We had a discussion about this a few years ago, and that's exactly what was proposed). However, contact information is just one use case. Another one are examples for I18N in specs which are incredibly hard to write unless you can actually use a few example characters. See RFC 3987, for example. Some people think internationalized domain names are a good idea. 
I think they are a pretty stupid idea, because they're a significant .. That's a completely orthogonal discussion. Using HTML or PDF for RFCs is about the same as moving from English language RFCs to mandarin language RFCs. There is a huge number of people who can read it, but there is a also a large number of current RFC and I-D consumers and producers which can not and does not want to use mandarin. Sorry? Are you implying anybody is unable to display HTML? Yes, of course. The majority of devices and a huge number of environments is completely unable to display HTML. And those *can* display text/plain with form feeds in a sane manner? Example? ... I do not doubt that there are tools available for heavy graphical user interfaces and specific platforms that can deal with mandarin just fine. But I do not understand mandarin, my tools can not cope with it and a lot of my platforms and my environments can not cope with it. HTML and pdf are only marginally better than
Re: Why the normative form of IETF Standards is ASCII
On Thu, Mar 11, 2010 at 11:37:55PM -0500, Donald Eastlake wrote: PDF/A is a deliberately-limited format designed specifically for archival purposes. And is clearly a non-starter because I have no idea how to produce PDF so limited, not idea how to test a PDF to see if its PDF/A, etc. On the other hand, since I produced my first ID sometime in 1992, I've had no particular problem producing them with nroff and I've never had to hunt for, write, debug, or install a single piece of software. It's just there already, including in Mac OS X. Wait a minute. That argument just boils down to, I don't know how to do this, so it's obviously wrong. First, that you don't know how to do something is by no means evidence that it can't be done, and done easily. Second, I'm sure it won't come as a complete surprise that many people find nroff to be cryptic, arcane, hard to use, and designed for an era when the primary publication mechanism was ink on paper using output mechanisms with limited capabilities. If people have such trouble, then the same argument form (I have no idea . . .) can be used by them. And I dare say that, in this day and age, more people have trouble using nroff than have trouble producing PDF/A, since OpenOffice.org includes a little button that generates such PDFs. Third -- and this is a point since made in this thread by others more clearly than I originally made it -- the IETF format _is not_ plain ASCII. It's a page layout format that happens to restrict itself also to ASCII characters only. So there are completely separate issues to address here, and we shouldn't conflate them. There is the archival format issue. In my view, if we really want to have a format for archival purposes, then something other than files made for printing on a printer (with paper not even widely available in parts of the world) would be an improvement. 
PDF/A is one candidate format, standardized by another SDO and apparently embraced by a community (librarians) that really know about long-term archives and who already have extensive experience with the pain of supporting old computer formats. So it seems to me to be a useful candidate for archival purposes. It isn't the only one. Pointing and laughing at an implementation of a viewer of such files because it happens to be riddled with bugs is in no way an argument that the standard itself is somehow dangerous, any more than noting the mess that many home gateways make of DNS packets is an argument that we should go back to distributing the hosts.txt file via FTP. If we turn our attention to the utility for readers and reviewers and those wanting to incorporate parts of text into other contexts, then the official format that idnits permits (never mind exactly what the RFC Editor ends up with) is _still_ inconvenient. You can't rewrap lines for small screens. You can't anchor to particular sections or diagrams (which are, anyway, hard to use because they have to be produced as ASCII art, as though the IETF was some sort of giant warlording fan club). Complex equations are hard to represent and hard to read. And so on: the reasons why the format doesn't even work reliably for the community actually using the documents today are legion; repeated; and when raised often simply denied, as though such were an argument. Moreover, from a process point of view, I've had at least one contributor in DNSEXT recently refuse to update a draft because the idnits tool checks for both form and content. This makes the exact formatting conventions of the page into a problem that contributors have to worry about when trying to hammer out technical details of a protocol. Every contributor has to be an amateur typesetter, only still targeting a technology that was a significant step backwards in typeset quality compared to things professional typesetters had been doing for centuries.
(This is not a criticism of the idnits maintainers, who are doing the necessary thing to support a broken rule we have in place. If you find you have to adjust the boilerplate manually, and that moves things around, then suddenly you have to start counting lines on a page in order to get everything right.) Finally, as someone noted in this thread, the underlying assumption that the input format has to be perfectly aligned to exactly one output format is wrong. We have more than one purpose in consumption of the final product. Why would we insist on having exactly one format for it? None of the above distinctions are new. The last time this frustrating topic heaved into view, I recall those distinctions being made too. I'd find it really nice if, in future when this topic comes up, we at least stop making demonstrably false claims that the RFC format is plain ASCII. I'm not so optimistic as to imagine we'll really address the different issues and find a way forward, but not misrepresenting the way things are would be nice. A -- Andrew Sullivan a...@shinkuro.com
Re: Why the normative form of IETF Standards is ASCII
Andrew Sullivan wrote: On Thu, Mar 11, 2010 at 11:37:55PM -0500, Donald Eastlake wrote: PDF/A is a deliberately-limited format designed specifically for archival purposes. And is clearly a non-starter because I have no idea how to produce PDF so limited, not idea how to test a PDF to see if its PDF/A, etc. On the other hand, since I produced my first ID sometime in 1992, I've had no particular problem producing them with nroff and I've never had to hunt for, write, debug, or install a single piece of software. It's just there already, including in Mac OS X. Wait a minute. That argument just boils down to, I don't know how to do this, so it's obviously wrong. First, that you don't know how to do something is by no means evidence that it can't be done, and done easily. What it means, is that it is neither easy, nor intuitive, and will require new tools that have to be built and are therefore only going to be for a small number of new platforms, but not for a huge part of the installed base. Second, I'm sure it won't come as a complete surprise that many people find nroff to be cryptic, arcane, hard to use, and designed for an era when the primary publication mechanism was ink on paper using output mechanisms with limited capabilities. The use of the nroff authoring format doesn't mean that you have to do everything yourself manually, as with xml2rfc. Just try NRoffEdit once. It's simple and intuitive, and it will convert an existing ascii-formatted I-D back into nroff-authoring with ease. I was amazed just how simple and powerful this tool is when I wrote my first I-D. With xml2rfc, I gave up on trying to install an hour after downloading it. With NRoffEdit, I was writing page 3 an hour after downloading it. Third -- and this is a point since made in this thread by others more clearly than I originally made it -- the IETF format _is not_ plain ASCII. It's a page layout format that happens to restrict itself also to ASCII characters only. 
So there are completely separate issues to address here, and we shouldn't conflate them. The format is plain ASCII, in that all formatting is done with only two (three) ASCII control characters. End-of-line is represented by CR+LF, page breaks by an FF (Form Feed) character, and all other positioning with ASCII space characters. There is the archival format issue. In my view, if we really want to have a format for archival purposes, then something other than files made for printing on a printer (with paper not even widely available in parts of the world) would be an improvement. PDF/A is one candidate format, standardized by another SDO and apparently embraced by a community (librarians) that really know about long-term archives and who already have extensive experience with the pain of supporting old computer formats. A number of the other SDOs are in the publishing business. They charge you for every copy of the document and you're not allowed to redistribute or to create derivative works at your leisure from their publications. And for their working documents, the circulation/distribution may even be limited, occasionally by some kind of NDA. This is a significant difference to IETF documents, RFCs and I-Ds, where the purpose is explicitly that they're distributed, discussed, improved, parts re-used, and derivative works created. PDF and PDF/A may be OK for publish and forget documents, but I think they're not useful for IETF documents because of the severe limitations to anything beyond pretty publication. Moreover, from a process point of view, I've had at least one contributor in DNSEXT recently refuse to update a draft because the idnits tool checks for both form and content. This makes the exact formatting conventions of the page into a problem that contributors have to worry about when trying to hammer out technical details of a protocol.
Every contributor has to be an amateur typesetter, only still targeting a technology that was a significant step backwards in typeset quality compared to things professional typesetters had been doing for centuries. If we were using a more feature rich document format, the constraints that the submission tool would have to enforce would make this one-time software logistics lapse a recurring experience for each and every I-D author and every submission. If you have troubles with idnits, you should really try NRoffEdit. There were a number of things that I bothered with when writing my first I-D last November. Getting started, formatting the document and passing idnits were complete non-issues with NRoffEdit. -Martin
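The layout conventions described in the message above (CR+LF line endings, form-feed page breaks, positioning with spaces only) are simple enough to process in a few lines. What follows is a minimal Python sketch under stated assumptions, not any existing IETF tool: the function names split_pages and overlong_lines are hypothetical, and the 72-column limit is an assumption borrowed from the "72-column ASCII" wording used elsewhere in this thread.

```python
FORM_FEED = "\f"  # ASCII 0x0C, the page-break control character


def split_pages(text):
    """Split a paginated plain-text document into pages on form feeds."""
    # Normalise CR+LF to LF so the result does not depend on the platform.
    normalised = text.replace("\r\n", "\n")
    # Drop empty pages, e.g. the one left by a form feed at the very end.
    return [page for page in normalised.split(FORM_FEED) if page.strip()]


def overlong_lines(text, limit=72):
    """Return (page_number, line) pairs for lines longer than `limit` columns.

    The default of 72 is an assumption for illustration only.
    """
    hits = []
    for number, page in enumerate(split_pages(text), start=1):
        for line in page.split("\n"):
            if len(line) > limit:
                hits.append((number, line))
    return hits


sample = "Page one, line one\r\nPage one, line two\r\n\fPage two\r\n"
print(len(split_pages(sample)))  # 2
```

Whether this simplicity settles the argument is a separate question; the sketch only shows that pagination in this format is recoverable without any format-specific library.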
Re: Why the normative form of IETF Standards is ASCII
On 12.03.2010 19:43, Martin Rex wrote: ... No, it does not. It points to an HTML document that was converted from the original TXT version (on the server, by Henrik's rfcmarkup script). Whether the converted document is long-term cached or converted on the fly is an insignificant implementation detail. The point is, Indeed. What's relevant is *that* it was converted to HTML. that the original document is a plain ASCII text file. And the existing standardization for I-D and RFC formatting enables a simple tool to recognize references and anchors and create HTML tags from it. *Most* references and anchors. It would be better if we had a text based format that actually marks up these things, so you don't need heuristics to get them back. Oh, wait... ... So what do accessibility tools do when they encounter page breaks, with the header footer lines? What does a screen reader to with ASCII art? Because of the page breaks and the consistent presence of these headers and footers just before and after the page breaks, an accessibility tool should be able to recognize them as such. I agree it would be nice if they did that. Do they? ... Yes. That's an argument to require an easy to re-use *submission* format. But the submission format doesn't need to be identical to the publication format. The submission tool already allows you to supply additional source files, such as XML. I recommend to use that. I'm so glad that NroffEdit will happily convert a plain ASCII RFC or I-D into suitable authoring format. It would be horror if I would want to suggest changes to a document and had to jump hoops to get hold of the XML-type authoring format of a document first and use tools like xml2rfc in order to create an updated version of the document myself. Hoops as in replacing .txt by .xml in the URL, and just download it? Just try NRoffEdits conversion I-D - authoring nroff source and see how easy that is. 
It's a single all-in-one tool written in Java, basically wysiwyg with spell checker included and makes I-D editing extremely easy. It's probably a nice tool for people willing to use NROFF. There are other nice tools, based on the RFC2629 XML format. willing to use nroff? You have likely never looked at NRoffEdit. It's an all-in-one wysiwyg tool written in Java, that uses nroff-formatting commands for authoring, Instant(!) preview and output is formatted ASCII, spell checker is built-in. You do _NOT_ need any extra tools, and in particular you do not need to figure out how to combine a bunch of tools from various different sources somehow into a productive workflow, as with xml2rfc. When writing spec text, the *least* I'm interested in is the final text/plain output by xml2rfc. What I do is that I press F5 in the browser of my choice. That's my workflow. ... Anchors in plain-ASCII text that are human-comprehensible can be automatically converted into real URLs and anchors with simple tools. These tools exist and work just fine with the existing plain-ascii text documents. Example? Section headings, page breaks, normative/informative references to sections of other RFC documents, in-document references to other sections. When I wrote my first I-D, I tried to make sure that the conversion tool on tools.ietf.org correctly recognizes and interprets all my references. (I was actually surprised that it picked up most of my references to sections of other documents before I even knew that it was doing this). Oh. I thought you meant *named* anchors. ... Another one are examples for I18N in specs which are incredibly hard to write unless you can actually use a few example characters. See RFC 3987, for example. If something is difficult to accomplish, maybe it should not be done in the first place.
Personally, I think that the use of I18N at the protocol level (I18N hostnames, I18N Email addresses, I18N URIs) is a huge mistake and will result in lots of needless pain when used. ... Oh, so the IETF shouldn't care about I18N. Right. ... A few years ago a support call was forwarded to me from a japanese customer having problems with Single Sign-On configuration based on the Windows domain authentication. So I asked the customer about his current settings in the Local Security Policy (the communication is translated through our support organization). I got back a screen shot, but all the text that was shown was completely incomprehensible (I can not read japanese) -- and because of a different lexical order, counting from the top didn't work either... ... I know these kinds of problems (actually, from the same organization you are in), but not liking the negative effects of localization doesn't make it go away. ... Yes, of course. The majority of devices and a huge number of environments is completely unable to display HTML. And those *can* display text/plain with form feeds in a sane manner? Example? The presence of form feeds in text/plain doesn't
Re: Why the normative form of IETF Standards is ASCII
Doug Ewell wrote: A space character (which looks like an ordinary U+0020 space to me, in both the plain-text message I received and in the Web archive) got erroneously converted to a question mark in Tim's plain-text mail. It is not a question mark character but some strange non-ASCII character, though it is displayed as a question mark in some environments. How does this demonstrate anything about PDF? As it occurred with pure ASCII profile of non-ASCII-capable e-mail, the same will occur with pure ASCII profile of non-ASCII-capable PDF. Q.E.D. Masataka Ohta
Re: Why the normative form of IETF Standards is ASCII
On 12.03.2010 21:41, Masataka Ohta wrote: Doug Ewell wrote: A space character (which looks like an ordinary U+0020 space to me, in both the plain-text message I received and in the Web archive) got erroneously converted to a question mark in Tim's plain-text mail. It is not a question mark character but some strange non-ASCII character, though it is displayed as a question mark in some environments. How does this demonstrate anything about PDF? As it occurred with pure ASCII profile of non-ASCII-capable e-mail, the same will occur with pure ASCII profile of non-ASCII-capable PDF. ... What *is* a pure ASCII profile of non-ASCII-capable PDF? A PDF viewer that doesn't understand, for example, code point 160? Best regards, Julian
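One plausible mechanism for the stray question mark discussed above, offered only as an illustration and not as a diagnosis of what actually happened to the mail in question: a no-break space (code point 160, U+00A0) passes through a tool that forces text into ASCII and substitutes a placeholder for anything it cannot represent. A minimal Python sketch, in which the helper name to_ascii_lossy is hypothetical:

```python
def to_ascii_lossy(text):
    """Force text into ASCII, replacing every non-ASCII character with '?'."""
    return text.encode("ascii", errors="replace").decode("ascii")


original = "plain\u00a0text"  # contains U+00A0 NO-BREAK SPACE (code point 160)

print(to_ascii_lossy(original))    # plain?text
print(original.encode("utf-8"))    # b'plain\xc2\xa0text'
```

The first line of output shows the lossy, one-way transformation; the second shows that the same character survives intact as two bytes in UTF-8. Nothing in this depends on the container format, which is consistent with the point that the problem lies in tools that misinterpret or transform character sets.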
Re: Why the normative form of IETF Standards is ASCII
Julian Reschke wrote: I'm at the end of your mail, but you haven't told me how printing the example document I pointed to worked for you. Did you try? If not, why not? You mean this one: It would be nice if you could elaborate on what the problem is. Try, for instance, printing http://greenbytes.de/tech/webdav/rfc2616.html. OUCH! Printing: On the printer, it comes out with a line length of 131 characters (paper size A4, portrait, one-up). US-letter is wider, so that might be even worse. The font size appears to be like ~6 pt and very thin, it comes out grey. Comparing it to an I-D that I printed 2-up in 1996, it uses much smaller characters. The I-D is 2 columns because of the 2-up printing. The use of a sans-serif font also makes this html-rfc harder to read on printout than my a2ps formatted I-D from 1996. That's an absolute disaster! Screen-Reading: Opening it in my Tabbed browser (firefox) renders it in a similarly illegible fashion (I have a 1600x1200 monitor). For resizing it to a legible line length, I have to resize all my other open tabs as well, which is a nuisance (Zooming is worse, because it also reduces the number of visible lines). The version at http://tools.ietf.org/html/rfc2616 comes up with a perfect rendering instantly in my browser. The line length is easy on the eyes, the font is more readable (to my eyes anyway). So the ASCII-format has a significant lead over that particularly poor example of HTML. -Martin
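The tools.ietf.org rendering mentioned above is produced from the plain-text RFC, and earlier in the thread both sides refer to heuristics that recognize references in that text and turn them into links. The following Python sketch shows the general idea only. It is not the actual conversion script; the function name link_section_refs is hypothetical, the pattern covers just one phrasing ("Section N of RFC NNNN"), and the "#section-N" fragment is an assumption modelled on the "#page-36" URL quoted in the thread.

```python
import re

# Matches e.g. "Section 7.3 of RFC 5246"; deliberately narrow for illustration.
SECTION_REF = re.compile(r"Section (\d+(?:\.\d+)*) of RFC ?(\d+)")


def link_section_refs(text):
    """Wrap recognised section references in HTML links (heuristic sketch)."""

    def to_link(match):
        section, rfc = match.group(1), match.group(2)
        # Assumed URL layout; a real converter would also HTML-escape the text.
        url = f"http://tools.ietf.org/html/rfc{rfc}#section-{section}"
        return f'<a href="{url}">{match.group(0)}</a>'

    return SECTION_REF.sub(to_link, text)


print(link_section_refs("See Section 7.3 of RFC 5246 for the handshake."))
```

A pattern like this finds most references but not all of them, which is the trade-off at issue in the thread: heuristics over conventions in plain text versus explicit markup in the source.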
Re: Why the normative form of IETF Standards is ASCII
In message 4b9aa7ff.4000...@gmx.de, Julian Reschke writes: On 12.03.2010 21:41, Masataka Ohta wrote: Doug Ewell wrote: A space character (which looks like an ordinary U+0020 space to me, in both the plain-text message I received and in the Web archive) got erroneously converted to a question mark in Tim's plain-text mail. It is not a question mark character but some strange non-ASCII character, though it is displayed as a question mark in some environments. How does this demonstrate anything about PDF? As it occurred with pure ASCII profile of non-ASCII-capable e-mail, the same will occur with pure ASCII profile of non-ASCII-capable PDF. ... What *is* a pure ASCII profile of non-ASCII-capable PDF? A PDF viewer that doesn't understand, for example, code point 160? The problem with email is people use html way too much. TXT - HTML - TXT does not work reliably. Too many one way transformations. Best regards, Julian -- Mark Andrews, ISC 1 Seymour St., Dundas Valley, NSW 2117, Australia PHONE: +61 2 9871 4742 INTERNET: ma...@isc.org
Re: Why the normative form of IETF Standards is ASCII
On 12.03.2010 22:25, Mark Andrews wrote: ... The problem with email is people use html way too much. TXT - HTML - TXT does not work reliably. Too many one way transformations. ... Yes, agreed. I don't like HTML email either. But what does this have to do with IETF specification formats? Best regards, Julian