Kent Watsen <[email protected]> wrote:
>
>
> > On Mar 4, 2019, at 11:04 AM, Martin Bjorklund <[email protected]> wrote:
> >
> > Kent Watsen <[email protected]> wrote:
> >>
> >>
> >>> But note that figures in RFCs are normally indented with 3 spaces
> >>> (they _can_ be outdented, if the lines are long enough).
> >>
> >>
> >> The days of scraping from plain-text RFCs are over [1].  Extracting,
> >> if needed at all, should be from the XML, where there are no such
> >> issues.  Extracting from the plain-text output makes about as much
> >> sense as extracting from the HTML or PDF outputs.
> >
> > I am confused.  Are you saying that the unfolding algorithm only is
> > supposed to work on data extracted from the XML version of the I-D or
> > RFC?  If so, I think this needs to be clarified in the draft.
>
> The unfolding algorithm works as long as the input == the output.  The
> problem is that plain-text RFCs introduce a lot of artifacts that makes
> lossless extraction difficult.  I don't believe we should try to design a
> solution for input != output.
>
> Now that IETF has officially moved to XML as the sole format

I'm not sure what you mean, can you provide a pointer?  AFAICT, the
latest published RFC is still only available as txt and pdf.

If the only format was XML, why bother with any line breaking at all?

>, there
> is no longer a need to support extracting from plain-text.  In general,
> folks are advised to always extract from XML.  I support adding a
> statement to this affect.
>
> >
> >> Lossless extractions are critical for formal verifications (e.g.,
> >> doctor reviews, shepherd reviews, AUTH48 reviews).  Both the
> >> double-backslash approach we currently have, and the single-backslash
> >> approach we had originally (where the continuation-line begins on
> >> column 1, as it has been in programming languages for decades) provide
> >> lossless extractions.
> >
> > ... as does the single-backslash with leading space removal.
>
> No, there are cases where this fails.  We went thru this before.

Only if you have data with > 69 spaces in a row that needs to be
preserved.


/martin

> This is why
> we adopted the double-backslash approach.
>
>
> Kent // contributor (also on my previous emails in this thread)
>
>
>
_______________________________________________
netmod mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/netmod
