Re: RFC: Better formatting for long descriptions
I haven't read the whole long thread, so perhaps this has been mentioned by someone else. Python has recently decided to convert their documentation to reStructuredText [1]. It would make a lot of sense for Debian to use that de facto standard (or some subset of it) for text typesetting in the long descriptions, rather than reinventing the wheel.

Cheers,
Morten

[1] http://docutils.sourceforge.net/rst.html
Re: RFC: Better formatting for long descriptions
On Wed, 13 May 2009, Morten Kjeldgaard wrote:

> I haven't read the whole long thread, so perhaps this has been mentioned by someone else. Python has recently decided to convert their documentation to reStructuredText [1]. It would make a lot of sense for Debian to use that de-facto standard (or some subset of it) for text typesetting in the long descriptions, rather than re-inventing the wheel.

My understanding is that we currently have the choice between Markdown and reStructuredText, and from my point of view the right place to continue the discussion is

  http://lists.debian.org/debian-devel/2009/04/msg01132.html

As far as I understand, it has been decided to use a formatting library, and I tried to compare two potential candidates. Further investigation should be done, preferably by people who really know these libraries. My time for such things is currently limited.

Kind regards
Andreas.

--
http://fam-tille.de
Re: RFC: Better formatting for long descriptions
Andreas Tille wrote:

> But what exactly do I have to do to get the item lists marked?

Remove the first space, remove the '.' that are alone on their line, and add a blank line before each enumeration. This last point seems the most annoying to me: it can be difficult to automatically find where to insert a blank line.

> grep-available -s Description -F Package airport-utils | markdown

grep-aptavail -s Description -F Package airport-utils | sed -e 's/^ \(.$\)\?//' -e '/: *$/a\\ ' | markdown

<p>Description: configuration and management utilities for Apple AirPort base stations
This package contains various utilities to manage the Apple AirPort base stations.</p>

<p>Be aware that Apple released several versions of the AirPort base station; the original AirPort (Graphite) was a rebranded Lucent RG-1000 base station, doing 802.11a/b. The AirPort Extreme (Snow) is an Apple-built 802.11a/b/g base station.</p>

<p>For the original Apple AirPort and the Lucent RG-1000 base stations only:</p>

<ul>
<li>airport-config: base station configurator</li>
<li>airport-linkmon: wireless link monitor, gives information on the wireless link quality between the base station and the associated hosts</li>
</ul>

<p>For the Apple AirPort Extreme base stations only:</p>

<ul>
<li>airport2-config: base station configurator</li>
<li>airport2-portinspector: port maps monitor</li>
<li>airport2-ipinspector: WAN interface monitoring utility</li>
</ul>

<p>For all:</p>

<ul>
<li>airport-modem: modem control utility, displays modem state, starts/stops modem connections, displays the approximate connection time (Extreme only)
<ul>
<li>airport-hostmon: wireless hosts monitor, lists wireless hosts connected to the base station (see airport2-portinspector for the Snow)</li>
</ul></li>
</ul>

Regards,
Vincent

--
Vincent Danjean GPG key ID 0x9D025E87 vdanj...@debian.org
GPG key fingerprint: FC95 08A6 854D DB48 4B9A 8A94 0BF7 7867 9D02 5E87
Unofficial packages: http://moais.imag.fr/membres/vincent.danjean/deb.html
APT repo: deb http://perso.debian.org/~vdanjean/debian unstable main
Re: RFC: Better formatting for long descriptions
Peter Pentchev r...@ringlet.net writes:

> Just as a kind of clarification: Manoj, I think that Giacomo's comments were only to the *last* item of the text he quoted, not to the whole portion above it :) Thus, IMHO his first "really needed?" question referred specifically to the "ordered lists" item, and the "I don't think they are needed" referred specifically to the underlines and strike-throughs, not to the emphasis, strong emphasis, etc.

Traps for new players: one must remember to trim irrelevant quoted material so it's clear what the context of one's responses is.

--
"You can't have everything; where would you put it?" (Steven Wright)
Ben Finney
Re: RFC: Better formatting for long descriptions
On Sat, 18 Apr 2009, Vincent Danjean wrote:

> Remove the first space, remove the '.' that are alone on their line,

That's cheap.

> add a blank line before enumeration (this last point seems the more annoying to me: it can be difficult to automatically find where to insert a blank line).

Well - here is the crux which makes me wonder whether Manoj was right in his posting [1] when he claimed:

> > If you make a suggestion please answer the following question:
> >
> > A. Does the suggestion enable parsing logical structures like two level itemize lists? (This is what I want to approach and what is IMHO needed)
>
> Markdown and ReST, trivially.
>
> > B. Does the suggestion enable keeping the majority of description untouched and enables keeping the currently existing tools? (This is important to gain any acceptance)
>
> Yes, for both.

It is neither trivial to detect the point where the needed blank line has to be added, nor would it be a solution to advise people always to enclose lists in blank lines, because people will tell you that this looks ugly in the existing interfaces. So I would rather tend to "No" for both, and this is the crux here.

So while I perfectly agree with Manoj that voting on technical decisions is a bad idea, I come back to my initial suggestion, because my suggestions are technically equivalent but express a matter of taste of the developers, which might lead to better acceptance.

I would love it if somebody could prove me wrong and show that there is a reliable way to turn long descriptions into proper markdown input so that the lists can *really* be detected. If not, I think I will continue with my intention as described. [2]

Kind regards
Andreas.

[1] http://lists.debian.org/debian-devel/2009/04/msg00652.html
[2] http://lists.debian.org/debian-devel/2009/04/msg00643.html

--
http://fam-tille.de
Re: RFC: Better formatting for long descriptions
On Sat, Apr 18 2009, Andreas Tille wrote:

> On Sat, 18 Apr 2009, Vincent Danjean wrote:
>
> > Remove the first space, remove the '.' that are alone on their line,
>
> That's cheap.
>
> > add a blank line before enumeration (this last point seems the more annoying to me: it can be difficult to automatically find where to insert a blank line).
>
> Well - here is the crux which makes me wonder whether Manoj was right in his posting [1] when he claimed:
>
> > > If you make a suggestion please answer the following question:
> > >
> > > A. Does the suggestion enable parsing logical structures like two level itemize lists? (This is what I want to approach and what is IMHO needed)
> >
> > Markdown and ReST, trivially.
> >
> > > B. Does the suggestion enable keeping the majority of description untouched and enables keeping the currently existing tools? (This is important to gain any acceptance)
> >
> > Yes, for both.
>
> It is neither trivial to detect the point where to add the needed blank line nor would it be a solution to advise people always to

Actually, it is pretty trivial. It is a second chapter exercise in K&R; it is a first month exercise in computer science 101. Here is an algorithm:

--8<---cut here---start--->8---
we are not in a list
while reading each line, do
  remove leading space
  if the only non white space character on the line is a single '.'
    remove the dot
  if the line matches the regexp: '^\s+[\*\+\-]\s+'
    if we are not in a list
      emit blank line first
      record we are not in a list
  else if we are in a list
    record we are not in a list
  emit line
--8<---cut here---end--->8---

People who can not convert this 13 line pseudocode into real code should not be writing stuff to pretty print descriptions.

> enclose lists in blank lines because people will tell you that this will look ugly in the existing interfaces. So I would rather tend to No for both and this is the crux here.

Frankly, I think this is very wrong.

> So while I perfectly agree with Manoj that voting on technical decisions is a bad idea I come back to my initial suggestion because my suggestions are technically equivalent but express a matter of taste of the developers which might lead to better acceptance. I would love if somebody could provide a proof that I'm wrong and there is a reliable way to turn long descriptions into proper markdown input to *really* be able to detect the lists. If not I think I continue with my intention as described. [2]

Is the above algorithm proof enough for you? Or do I have to write that into real code in your favourite programming language before you can see it?

manoj

--
The minority is always right. Henrik Ibsen 1828-1906
Manoj Srivastava sriva...@debian.org http://www.debian.org/~srivasta/
1024D/BF24424C print 4966 F272 D093 B493 410B 924B 21BA DABB BF24 424C
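For readers who want to try the pseudocode above, here is a minimal sketch as a POSIX shell function wrapping awk. It is not code from the thread: the name `prep_description` is hypothetical, and the sketch assumes the first branch is meant to record that we *are* in a list. As posted, the pseudocode records "not in a list" in both branches, which would put a blank line before every item.

```shell
#!/bin/sh
# Sketch of the list pre-processor described in the pseudocode.
# prep_description is a hypothetical name, not something from the thread.
prep_description() {
    awk '
        {
            sub(/^ /, "")                    # remove one leading space
            if ($0 ~ /^\.[ \t]*$/) $0 = ""   # a lone "." marks a paragraph break
            if ($0 ~ /^[ \t]+[-*+][ \t]+/) {
                if (!inlist) print ""        # blank line before the first item only
                inlist = 1                   # assumption: we are now inside a list
            } else {
                inlist = 0
            }
            print
        }
    '
}
```

With the tools already used in this thread, it would sit in the pipeline as `grep-aptavail -s Description -P airport-utils | prep_description | markdown`.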
Re: RFC: Better formatting for long descriptions
On Sat, 18 Apr 2009, Manoj Srivastava wrote:

> Here is an algorithm:
>
> --8<---cut here---start--->8---
> we are not in a list
> while reading each line, do
>   remove leading space
>   if the only non white space character on the line is a single '.'
>     remove the dot
>   if the line matches the regexp: '^\s+[\*\+\-]\s+'
>     if we are not in a list
>       emit blank line first
>       record we are not in a list
>   else if we are in a list
>     record we are not in a list
>   emit line
> --8<---cut here---end--->8---
>
> People who can not convert this 13 line pseudocode into real code should not be writing stuff to pretty print descriptions.

Thanks for the trust in the programming skills of your fellow developers. You obviously are able to write the code to detect a list *without* using a library. Wasn't it you who told me we should use a library to *avoid* inventing our own code? So if you have this code, which works perfectly on the input I have been suggesting for two weeks, why do you want to add an additional library on top of it?

I feel a little bit bored by this discussion, which is running in circles, starts to become personal without any real reason (I hope I did not give any) and finally leads to nothing (at least this is my impression).

> > enclose lists in blank lines because people will tell you that this will look ugly in the existing interfaces. So I would rather tend to No for both and this is the crux here.
>
> Frankly, I think this is very wrong.

The solution does not work without the code you wrote above. But you need this code anyway to detect lists in the long descriptions, and so I wonder where the real profit of an additional library is.

> Is the above algorithm proof enough for you? Or do I have to write that into real code in your favourite programming language before you can see it?

I hope you would not code the bug in line no. 9. What you basically tried to prove is that you are keen on teaching your fellow developers programming.

Your time would be much better spent if you brought the effort forward to finally reach a consensus on how we should change the best practices for debian/control to enable the parsing of lists. The suggestions I presented [1] are not in conflict with markdown, and what you finally use for the description parsing tools - the algorithm above or a library on top of it - does not matter at all if we agree on some simple standard.

It would be really helpful if you would return to the constructive way of discussion I observed in former times instead of blurring the issue with distracting discussions.

Kind regards
Andreas.

[1] http://lists.debian.org/debian-devel/2009/04/msg00643.html

--
http://fam-tille.de
Re: RFC: Better formatting for long descriptions
On Sat, Apr 18 2009, Andreas Tille wrote:

> On Sat, 18 Apr 2009, Manoj Srivastava wrote:
>
> > Here is an algorithm:
> >
> > --8<---cut here---start--->8---
> > we are not in a list
> > while reading each line, do
> >   remove leading space
> >   if the only non white space character on the line is a single '.'
> >     remove the dot
> >   if the line matches the regexp: '^\s+[\*\+\-]\s+'
> >     if we are not in a list
> >       emit blank line first
> >       record we are not in a list

        s/not//

> >   else if we are in a list
> >     record we are not in a list
> >   emit line
> > --8<---cut here---end--->8---
> >
> > People who can not convert this 13 line pseudocode into real code should not be writing stuff to pretty print descriptions.
>
> Thanks for the trust in the programming skills of your fellow developers. You obviously are able to write the code to detect a list *without* using a library. Wasn't it you who told me we should use a library to *avoid* inventing our own code? So if you have this code which works perfectly on the input I'm suggesting since two weeks why you want to add an additional library on top of this. I feel a little bit bored by this discussion which is running several circles starts to become personal without any real reason (I hope I did not gave any) and finally leads to nothing (at least this is my impression).

Frankly, I have no idea where this thread is going. With a 6 line pre-processor, you can feed the grep-dctrl provided Description fields to Markdown.

So, seems like we have come somewhere -- we have had one investigation that leads one to believe that there is a small fraction of packages using o as a bullet that need to be changed, and apart from that fewer than 50 packages are affected (if we want to specify markdown as the markup language for descriptions -- and these are the ones where we have some unwanted emphasis, a non-fatal result). There is a mechanism to pre-process the description for markdown (Perl implementation below).

What more is needed for you to think this is leading somewhere?

> > > enclose lists in blank lines because people will tell you that this will look ugly in the existing interfaces. So I would rather tend to No for both and this is the crux here.
> >
> > Frankly, I think this is very wrong.
>
> The solution does not work without the code you wrote above. But you need this code anyway to detect lists in the long descriptions and so I wonder where the real profit of an additional library is.

*Sigh*. All I am doing with the code is inserting a line before the lists. I am not generating html. I am also not handling the _other_ markup that markdown handles, which I presented as something that will make the description more readable too. The markdown library does all the heavy lifting for the html generation. If you think my little perl snippet is the equivalent of what markdown does, you have not looked at markdown.

I am not re-inventing the wheel when it comes to markup languages. We know we needed _some_ pre-processing because we have the paragraphs separated by ' .', but the code is pretty minimal.

--8<---cut here---start--->8---
my $in=0;
while(<>) {
  chomp;
  s/^ //g;
  s/^\.\s*$//;
  if(/^\s+[\*\+\-]\s+/) { print "\n" unless $in++;}
  else                  { $in=0; }
  print "$_\n"
}
--8<---cut here---end--->8---

manoj

ps: This can easily become a shell function.

__ grep-aptavail -s Description -P airport-utils | perl -e '
my $in=0;
while(<>) {
  chomp;
  s/^ //g;
  s/^\.\s*$//;
  if(/^\s+[\*\+\-]\s+/) { print "\n" unless $in++;}
  else{ $in=0; }
  print "$_\n"
}' | markdown

<p>Description: configuration and management utilities for Apple AirPort base stations
This package contains various utilities to manage the Apple AirPort base stations.</p>

<p>Be aware that Apple released several versions of the AirPort base station; the original AirPort (Graphite) was a rebranded Lucent RG-1000 base station, doing 802.11a/b. The AirPort Extreme (Snow) is an Apple-built 802.11a/b/g base station.</p>

<p>For the original Apple AirPort and the Lucent RG-1000 base stations only:</p>

<ul>
<li>airport-config: base station configurator</li>
<li>airport-linkmon: wireless link monitor, gives information on the wireless link quality between the base station and the associated hosts</li>
</ul>

<p>For the Apple AirPort Extreme base stations only:</p>

<ul>
<li>airport2-config: base station configurator</li>
<li>airport2-portinspector: port maps monitor</li>
<li>airport2-ipinspector: WAN interface monitoring utility</li>
</ul>

<p>For all:</p>

<ul>
<li><p>airport-modem: modem control utility, displays modem state, starts/stops modem connections, displays the approximate connection time (Extreme only)</p>
<ul>
<li>airport-hostmon: wireless hosts monitor, lists wireless hosts
Re: RFC: Better formatting for long descriptions
On Sat, 18 Apr 2009, Manoj Srivastava wrote:

> Frankly, I have no idea where this thread is going.

IMHO the problem is that you assume our suggestions are in conflict with each other - but they are not. I wanted to iron out suggestions on how to format the input in a standardised way. What will be done afterwards is the choice of the people who are working with this input. I don't care whether they choose markdown, reStructuredText, or just take your perl code and use <ul> </ul> instead of the additional blank lines and wrap the lines in lists in <li> </li> tags if they need HTML output. But this is NOT to be discussed HERE (even if it does no harm). The point is that our input should ENABLE this, which needs a better standardisation of long descriptions. You are one step after this - and your input is welcome - but there is no contradiction.

> With a 6 line pre-processor, you can feed the grep-dctrl provided Description fields to Markdown.

BTW, your pre-processor will need some additional lines when it comes to second level lists (and yes, I'm sure this can easily be done - but this is not, and never was, the point).

> So, seems like we have come somewhere -- we have had one investigation that leads one to believe that there are a small fraction of packages using o as a bullet that need to be changed, and apart from that there are less than 50 packages are affected

Great - let's iron out the advice on how to format long descriptions in our docs to enable us to write lintian checks and file bug reports. Manoj, we really reached a point here!

> (if we want to specify markdown as the markup language for descriptions -- and these are the one where we have some unwanted emphasis, a non-fatal result).

Please let's move this to a different discussion. People who are responsible for packages.debian.org might be interested and adopt your idea.

> There is a mechanism to pre-process the description for markdown (Perl implementation below). What more is needed for you to think this is leading somewhere?

Did I give the impression that I wanted more? Honestly, I'd be interested to know which part of my mails led you to that conclusion, so that I can improve my communication skills.

> All I am doing with the code is inserting a line before the lists. I am not generating html. I am not also handling the _other_ markup that markdown handles, that I presented as something that will make the description more readable too. The markdown library does all the heavy lifting for the html generation. If you think my little perl snippet is the equivalent for what markdown does, you have not looked at markdown.

In the whole discussion I was talking about structuring the input to ENABLE turning it into html (or whatever structured output you need). You were discussing how to actually *do* the step for which I just wanted to provide the precondition. I was just asking whether you need a library on top of a preprocessor when you could reach a similar result by tweaking the preprocessor a little bit. I just do not want to force any programmer to use markdown (even if it admittedly has advantages, as I also agreed). This was a *sidenote*, because this whole processing of the input is just not my point.

> I am not re-inventing the wheel when it comes to markup languages.

Same for me - or am I writing in delirium??? And your divergence from the original topic just blurs the issue - would you mind rereading my initial mail? [1] Do you agree that long descriptions need enhancement or not?

> We know we needed _some_ pre-processing because we have the paragraphs separated by ' .', but the code is pretty minimal.
>
> --8<---cut here---start--->8---
> my $in=0;
> while(<>) {
>   chomp;
>   s/^ //g;
>   s/^\.\s*$//;
>   if(/^\s+[\*\+\-]\s+/) { print "\n" unless $in++;}
>   else                  { $in=0; }
>   print "$_\n"
> }
> --8<---cut here---end--->8---
>
> manoj
>
> ps: This can easily become a shell function.

Again: please assume for the rest of this thread that I'm not stupid and know how scripts can be used.

> __ grep-aptavail -s Description -P airport-utils | perl -e '
> my $in=0;
> while(<>) {
>   chomp;
>   s/^ //g;
>   s/^\.\s*$//;
>   if(/^\s+[\*\+\-]\s+/) { print "\n" unless $in++;}
>   else{ $in=0; }
>   print "$_\n"
> }' | markdown
>
> <p>Description: configuration and management utilities for Apple AirPort base stations
> This package contains various utilities to manage the Apple AirPort base stations.</p>
>
> <p>Be aware that Apple released several versions of the AirPort base station; the original AirPort (Graphite) was a rebranded Lucent RG-1000 base station, doing 802.11a/b. The AirPort Extreme (Snow) is an Apple-built 802.11a/b/g base station.</p>
>
> <p>For the original Apple AirPort and the Lucent RG-1000 base stations only:</p>
>
> <ul>
> <li>airport-config: base station configurator</li>
> <li>airport-linkmon: wireless link monitor, gives information on the wireless link quality between the base station and the associated
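The aside in the message above, that a tool could emit list tags directly instead of inserting blank lines for Markdown, might look roughly like the following. This is only a sketch under strong assumptions that are not from the thread: the name `lists_to_html` is hypothetical, every item is assumed to fit on one line, nesting is ignored, and no HTML escaping is done.

```shell
#!/bin/sh
# Hypothetical helper: turn single-level bullet lines into <ul>/<li> markup.
# Assumes one item per line, no nesting, and no HTML escaping of the text.
lists_to_html() {
    awk '
        {
            sub(/^ /, "")                      # drop the control-file leading space
            if ($0 ~ /^[ \t]+[-*+][ \t]+/) {
                if (!inlist) print "<ul>"      # open the list at the first item
                inlist = 1
                sub(/^[ \t]+[-*+][ \t]+/, "")  # strip the bullet itself
                print "<li>" $0 "</li>"
                next
            }
            if (inlist) print "</ul>"          # first non-item line closes the list
            inlist = 0
            print
        }
        END { if (inlist) print "</ul>" }      # close a list that runs to end of input
    '
}
```

Real descriptions wrap long items over several lines and nest lists, which is exactly the part a Markdown or reStructuredText library would handle and this sketch does not.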
Re: RFC: Better formatting for long descriptions
On Thu, 16 Apr 2009, Manoj Srivastava wrote:

> Which is good, since Markdown/ReST rules for lists will only make the lists using o as the bullet out of whack.

Fine.

> None of which are mandatory. All the package descriptions I read in /var/lib/dpkg/available seems to pass, though a couple had italics in strange places. This is not a fatal flaw.

No - this perfectly fits my intention that some descriptions have to be fixed. We just need guidelines for developers to follow.

> I find the descriptions on packages.d.o just fine right now.
>
> > IMHO it is no argument that a specific person is happy with the layout everybody else is.
>
> Just like it is no argument that someone think something is ugly that means everyone thinks so too.

If a text has a certain logic, it should be supported by the means a certain output style has. HTML can express a list, and so it should if we want to express lists.

> > Please do not split my paragraphs to blur my arguing. Thanks.
>
> Heh. Ever heard of inline answers?

In most cases I manage to ignore this kind of question. Try reading my mail again to find a reasonable answer to your question yourself.

> I suggest you try it out, before handwaving vague FUD around. Even tnftp description works fine with either. There are very few descriptions (about 24 or so) where we might have unwanted emphasis. I think we can have that fixed.

But what exactly do I have to do to get the item lists marked?

$ grep-available -s Description -F Package airport-utils | markdown
<p>Description: configuration and management utilities for Apple AirPort base stations
 This package contains various utilities to manage the Apple AirPort base stations.
 .
 Be aware that Apple released several versions of the AirPort base station; the original AirPort (Graphite) was a rebranded Lucent RG-1000 base station, doing 802.11a/b. The AirPort Extreme (Snow) is an Apple-built 802.11a/b/g base station.
 .
 For the original Apple AirPort and the Lucent RG-1000 base stations only:
 - airport-config: base station configurator
 - airport-linkmon: wireless link monitor, gives information on the wireless link quality between the base station and the associated hosts
 .
 For the Apple AirPort Extreme base stations only:
 - airport2-config: base station configurator
 - airport2-portinspector: port maps monitor
 - airport2-ipinspector: WAN interface monitoring utility
 .
 For all:
 - airport-modem: modem control utility, displays modem state, starts/stops modem connections, displays the approximate connection time (Extreme only)
 - airport-hostmon: wireless hosts monitor, lists wireless hosts connected to the base station (see airport2-portinspector for the Snow)</p>

$ grep-available -s Description -F Package tnftp | markdown
<p>Description: The enhanced ftp client
 tnftp is what many users affectionately call the enhanced ftp client in NetBSD (http://www.netbsd.org).
 .
 This package is a <code>port' of the NetBSD ftp client to other systems.
 .
 The enhancements over the standard ftp client in 4.4BSD include:
 * command-line editing within ftp
 * command-line fetching of URLS, including support for:
 - http proxies (c.f: $http_proxy, $ftp_proxy)
 - authentication
 * context sensitive command and filename completion
 * dynamic progress bar
 * IPv6 support (from the WIDE project)
 * modification time preservation
 * paging of local and remote files, and of directory listings (c.f:</code>lpage', <code>page',</code>pdir')
 * passive mode support, with fallback to active mode
 * <code>set option' override of ftp environment variables
 * TIS Firewall Toolkit gate ftp proxy support (c.f:</code>gate')
 * transfer-rate throttling (c.f: <code>-T',</code>rate')</p>

> I would simplify the rule, as opposed to having a trivial library call in the tool. Indeed, reusing the libraries provided is *less* work for the parser, than a NIH new parser.

I'm really in favour of reusing a library (and I wonder whether I wrote anything to the contrary). I just fail to see any effect when using markdown, except that the description is now enclosed in <p></p> and some other markup appears which could be fixed. But the intended result, getting list markup, is not reached. Or did I miss something?

> I think we need the emphasis almost as much as we need lists; and people are already using *word* for emphasis in descriptions (though not all that many).

I'm not against implementing emphasis, which might also be an interesting enhancement, and if only a small number of packages need to be fixed, these most probably need to be fixed in plain text anyway. So if you enlighten me how the lists could work, I'm perfectly happy.

Kind regards
Andreas.

--
http://fam-tille.de
Re: RFC: Better formatting for long descriptions
On Thu, Apr 16, 2009 at 03:01:30PM -0500, Manoj Srivastava wrote:

> On Thu, Apr 16 2009, Giacomo Catenazzi wrote:
>
> > Manoj Srivastava wrote:
> >
> > > - Ability to recognize and render the following logical entities, in decreasing order of importance:
> > >   + unordered lists
> > >   + ordered lists
> >
> > really needed?
>
> I would think these are the guts of this proposal. Or else what are we discussing here?
>
> > >   + emphasis
> > >   + strong emphasis
> > >   + definition lists
> > >   + hypertext links
> > >   + underlines, and strike throughs
> >
> > I don't think they are needed.
>
> Why not? If rendering a description in a manner that makes it easier to read is the goal, I fail to see why emphasis and strong emphasis is a bad idea (think of text-to-speech mechanisms). This is not just opinions we are discussing here, we should be looking at use cases for marking up a textual description.
>
> > Underlines is generally bad, strike throughs are worse ;-)
>
> So you say. Don't use them, then. There are cases where either one of these constructs have value; and you should not impose your personal aesthetics on a general policy discussion.

Just as a kind of clarification: Manoj, I think that Giacomo's comments were only to the *last* item of the text he quoted, not to the whole portion above it :) Thus, IMHO his first "really needed?" question referred specifically to the "ordered lists" item, and the "I don't think they are needed" referred specifically to the underlines and strike-throughs, not to the emphasis, strong emphasis, etc.

G'luck,
Peter

--
Peter Pentchev r...@ringlet.net r...@space.bg r...@freebsd.org
PGP key: http://people.FreeBSD.org/~roam/roam.key.asc
Key fingerprint FDBA FD79 C26F 3C51 C95E DF9E ED18 B68D 1619 4553
If this sentence didn't exist, somebody would have invented it.
Re: RFC: Better formatting for long descriptions
On Thu, 16 Apr 2009, Manoj Srivastava wrote:

> > Manoj Srivastava wrote:
> >
> > > - Ability to recognize and render the following logical entities, in decreasing order of importance:
> > >   + unordered lists
> > >   + ordered lists
> >
> > really needed?
>
> I would think these are the guts of this proposal. Or else what are we discussing here?

In this thread it was mentioned that ordered lists are not really needed. Despite this opinion they are actually *used*, and thus there seems to be some need. Another thing that is actually used is description lists (which are IMHO needed more than ordered lists), but if we at least get the two above working there is a big win.

> > >   + emphasis
> > >   + strong emphasis
> > >   + definition lists
> > >   + hypertext links
> > >   + underlines, and strike throughs
> >
> > I don't think they are needed.
>
> Why not? If rendering a description in a manner that makes it easier to read is the goal, I fail to see why emphasis and strong emphasis is a bad idea (think of text-to-speech mechanisms). This is not just opinions we are discussing here, we should be looking at use cases for marking up a textual description.

As Peter Pentchev wrote in his mail, I think what is not needed is "underlines, and strike throughs" - but I would not forcibly restrict their use if the lib we decide to use provides this feature.

Kind regards
Andreas.

--
http://fam-tille.de
Re: RFC: Better formatting for long descriptions
On Thu, 16 Apr 2009, Guillem Jover wrote:

> ,---- count-bullet-chars.sh ----
> #!/bin/sh
> lists=/var/lib/apt/lists/*_sid_main_*_Packages
> total=`grep "^ *[-+\*o] " $lists | wc -l`
> for tag in '\*' - + o; do
>   items=`grep "^ *$tag " $lists | wc -l`
>   percent=`echo "scale=4; $items / $total * 100" | bc`
>   echo "Tag $tag was used $items times ($percent%)"
> done
> `----
>
> Tag \* was used 9277 times (68.0900%)
> Tag - was used 3837 times (28.1600%)
> Tag + was used 120 times (.8800%)
> Tag o was used 390 times (2.8600%)
>
> Regardless of the numbers though (which have moved lately slightly in favour of '-' due to the recommendations from the Smith reviewing project),

I have not found any recommendation regarding this on the SRP Wiki page [1]. I vaguely remember that this Smith project was initially driven by a French guy who might try to push a French habit into the English world. ;-) Do you have a link to those recommendations, which perhaps should be fixed in the first place? IMHO the Smith Review Project would be a first place where we could start some kind of standardisation of this issue - it seems there is no stronger place to move this suggestion to.

> I've always found the asterisk the obvious character to use for bulleted lists, as it's the one resembling the most a bullet, and it's the one we use in changelog entries and similar.

I perfectly agree here. Even if I tend towards an "I do not care about the actual character we use as long as it is a defined one" opinion, the statistics above clearly show a preference, and we should turn this preference into a recommendation and ask people to stick to it.

So could we settle on the agreement: ' * ' for first order lists and '- ' for second order lists? I would like to push this to SRP *and* to "6.2. Best practices for debian/control" of the Developer's Reference. This would finally allow us to file wishlist bug reports against packages which do not follow this recommendation.

Kind regards
Andreas.

[1] http://wiki.debian.org/I18n/SmithReviewProject

--
http://fam-tille.de
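If the recommendation proposed above were adopted, a check along the following lines could help find candidates for such wishlist bugs. It is a rough heuristic sketch, not an existing lintian check: the name `check_bullets` is hypothetical, and it only looks for indented lines that start with `+` or `o` followed by a space, so prose lines that happen to begin that way would be false positives.

```shell
#!/bin/sh
# Hypothetical check: report description lines whose bullet is "+" or "o"
# rather than the "*" / "-" pair proposed above. Exit status 1 if any found.
check_bullets() {
    awk '
        /^ +[+o] / { printf "line %d: unexpected bullet: %s\n", NR, $0; bad = 1 }
        END { exit bad }
    '
}
```

It reads one description on standard input, for example `grep-aptavail -s Description -P tnftp | check_bullets`, and the exit status makes it usable in a loop over package names.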
Item lists bulletting (was: Re: RFC: Better formatting for long descriptions)
Andreas Tille wrote:

> I have not found any recommendation regarding this at the SRP Wiki page [1]. I vaguely remember that this Smith project was initially driven by a French guy who might try to push a French habit into the English world. ;-)

Of course. Because, contrary to the English-speaking world, we *do* have written rules for such cases. From the "Lexique des règles typographiques en usage à l'Imprimerie Nationale" (which is the reference for all typographic conventions for the French language and the reference book of all French TeXnicians):

  Les énumérations
  - elles sont introduites par un deux-points ;
  - les énumérations de premier rang sont introduites par un tiret et se terminent par un point-virgule, sauf la dernière par un point final ;
  - les énumérations de second rang sont introduites par un tiret décalé et se terminent par une virgule.

Which (badly) translates to:

  Itemizations:
  - they are introduced by a colon;
  - first-level itemizations are preceded by a dash and end with a semicolon, except the last one, which ends with a full stop;
  - second-level itemizations are preceded by an indented dash and end with a comma.

I have never been able to find any such solid reference for English. There is probably something in the Chicago Manual of Style, which is generally accepted as the Right Reference for en_US.

Maybe more input from our experts on debian-l10n-english?
Re: RFC: Better formatting for long descriptions
On Wed, Apr 15 2009, Guillem Jover wrote:

> Tag \* was used 9277 times (68.0900%)
> Tag - was used 3837 times (28.1600%)
> Tag + was used 120 times (.8800%)
> Tag o was used 390 times (2.8600%)
>
> Regardless of the numbers though (which have moved lately slightly in favour of '-' due to the recommendations from the Smith reviewing project), I've always found the asterisk the obvious character to use for bulleted lists, as it's the one most resembling a bullet, and it's the one we use in changelog entries and similar.

The primary goal of the description is to convey to the user why they should install the package. The maintainer can use an unsorted list to help convey the information, and any means that makes it clear to the user that they are looking at a list is good enough.

Anything beyond that seems like striving for a foolish consistency; and the basic assumption being made (which does not, in my opinion, hold) is that a rigid monotonic conformity is aesthetically pleasing. I think a variety in the symbols used for bullets is better, in that it breaks the monotony.

Do we really have nothing better to do than to impose bureaucratic rules on what characters to use as bullet symbols in long descriptions, even if the user can tell that the character is a bullet?

manoj
-- 
Slowly and surely the unix crept up on the Nintendo user ...
Manoj Srivastava sriva...@debian.org http://www.debian.org/~srivasta/
1024D/BF24424C print 4966 F272 D093 B493 410B 924B 21BA DABB BF24 424C
Re: Item lists bulletting (was: Re: RFC: Better formatting for long descriptions)
On Thu, 2009-04-16 at 08:42 +0200, Christian Perrier wrote:

> I have never been able to find any such solid reference for English. There is probably something in the Chicago Manual of Style, that's generally accepted as the Right Reference for en_US.
>
> Maybe more input from our experts on debian-l10n-english?

I'm not an expert, but I have the 14th edition of the CMS. It says both bullets and dashes are acceptable (8.77, page 314, for reference).

(I am not expressing an opinion for or against the normalization of long description markup.)
Re: RFC: Better formatting for long descriptions
On Thu, 16 Apr 2009, Manoj Srivastava wrote:

> Do we really have nothing better to do than to impose bureaucratic rules on what characters to use as bullet symbols in long descriptions even if the user can tell that the character is a bullet?

The user can tell, but scripts can't reliably. Long descriptions are used in several places and some of these could render a better layout. A good layout is pleasing for users. So it is not stupid bureaucracy but making our descriptions more readable (for instance on packages.d.o and other places).

Kind regards

Andreas.

-- 
http://fam-tille.de
Re: RFC: Better formatting for long descriptions
On Thu, Apr 16 2009, Andreas Tille wrote:

> On Thu, 16 Apr 2009, Manoj Srivastava wrote:
>
>> Do we really have nothing better to do than to impose bureaucratic rules on what characters to use as bullet symbols in long descriptions even if the user can tell that the character is a bullet?
>
> The user can tell, but scripts can't reliably.

Any script should be able to take the top 4 symbols currently used and detect them. I think *, +, - and o cover most packages, and the scripts in question can be readily expanded. All kinds of markup languages already do something similar (markdown, Emacs org-mode, mediawiki, etc.).

> Long descriptions are used in several places and some of these could render a better layout.

Functionally, just rendering the description as written would suffice; the rest is aesthetics.

> A good layout is pleasing for users. So it

Pleasing is in the eye of the beholder, no?

> is not stupid bureaucracy but making our descriptions more readable (for instance on packages.d.o and other places).

I find the descriptions on packages.d.o just fine right now.

Having said that, I would not be averse to specifying that leading white space and *, +, and - would be acceptable as bullet marks (I thought specifying which mark at which level was overspecification).

manoj
-- 
A man convinced against his will is of the same opinion still. -- Butler
Manoj Srivastava sriva...@debian.org http://www.debian.org/~srivasta/
1024D/BF24424C print 4966 F272 D093 B493 410B 924B 21BA DABB BF24 424C
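The claim that a script can recognise the four common markers can be sketched in a few lines of shell. This is only an illustration under stated assumptions: the function name `find_bullets` is made up here, and a "bullet line" is assumed to be optional leading spaces, then one of `*`, `-`, `+` or `o`, then a space.

```sh
#!/bin/sh
# find_bullets is a hypothetical helper written for this illustration.
# It reads description text on stdin and prints "<marker> <count>" for
# each of the four markers discussed in the thread.
find_bullets() {
    awk '
        # optional leading spaces, one marker character, then a space
        /^ *[-+*o] / {
            line = $0
            sub(/^ */, "", line)           # drop the indentation
            count[substr(line, 1, 1)]++    # first remaining character is the marker
        }
        END {
            n = split("* - + o", order, " ")
            for (i = 1; i <= n; i++)
                printf "%s %d\n", order[i], count[order[i]] + 0
        }
    '
}
```

Something like `find_bullets < /var/lib/dpkg/available` would give a rough per-marker tally. Note that this only finds the markers; it says nothing yet about nesting levels.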
Re: RFC: Better formatting for long descriptions
On Thu, 16 Apr 2009, Manoj Srivastava wrote:

> Any script should be able to take the top 4 symbols currently used, and be able to detect them. I think *, +, - and o cover most packages, and the scripts in question can be readily expanded. All kinds of markup languages already do something similar. (markdown, Emacs org-mode, mediawiki, etc)

Perhaps you missed the point that it is not only the character used but also the inconsistent spacing which prevents scripts from detecting the levels of an itemized list. Yes, we have itemizations with more than one level in our descriptions (see my initial posting). Detecting these would need either a defined character or a defined spacing (IMHO an 'and' would be better than a non-exclusive 'or' here).

> I find the descriptions on packages.d.o just fine right now.

IMHO it is no argument that because a specific person is happy with the layout, everybody else is. If a text has a certain logical structure, it should be supported by the means the output format offers. HTML can express a list, and so it should if we want to express lists.

> Having said that, I would not be averse to specifying that leading white space and *, +, and - would be acceptable as bullet marks (I thought specifying which mark at which level was overspecification).

So you would be in favour of specifying only the amount of white space to define a level? If this could be accepted as a rough consensus, it would at least help tools detect what they need to detect. Even if my aesthetic sense goes beyond this, I can accept it. But you also specified three characters (*, +, and -), so do you want to restrict the acceptable set yourself (for instance, not accept 'o')?

Kind regards

Andreas.

-- 
http://fam-tille.de
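The concern here is about levels rather than characters. One possible way to infer the level from indentation alone is sketched below. The function name `list_levels` is invented for this illustration, and the rule "an item indented deeper than the enclosing item opens a new level" is an assumption, not something the thread agreed on.

```sh
#!/bin/sh
# list_levels is a hypothetical helper: it prints "<level> <item text>"
# for every bullet line on stdin, deriving the level from indentation only.
list_levels() {
    awk '
        /^ *[-+*] / {
            match($0, /^ */)               # RLENGTH is the number of leading spaces
            indent = RLENGTH
            # leave every level that is indented at least as deeply as this item
            while (depth > 0 && stack[depth] >= indent) depth--
            stack[++depth] = indent
            # skip the indentation, the marker and one space
            printf "%d %s\n", depth, substr($0, indent + 3)
            next
        }
        # a lone " ." is the paragraph separator in a description: list ends
        /^ \.$/ { depth = 0 }
    '
}
```

With this approach the marker character does not matter at all, which is close to the "whitespace only" reading of Manoj's position; the price is that inconsistent indentation inside one list silently produces extra levels.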
Re: RFC: Better formatting for long descriptions
On Thu, Apr 16, 2009 at 02:34:52AM -0500, Manoj Srivastava wrote:

> Having said that, I would not be averse to specifying that leading white space and *, +, and - would be acceptable as bullet marks (I thought specifying which mark at which level was overspecification).

Why don't we say binaries are fine in /usr/bin, /usr/local/bin and /opt while we are at it, to provide some refreshing alternatives to our users?

Michael
Re: RFC: Better formatting for long descriptions
On Thu, Apr 16 2009, Andreas Tille wrote:

> On Thu, 16 Apr 2009, Manoj Srivastava wrote:
>
>> Any script should be able to take the top 4 symbols currently used, and be able to detect them. I think *, +, - and o cover most packages, and the scripts in question can be readily expanded. All kinds of markup languages already do something similar. (markdown, Emacs org-mode, mediawiki, etc)
>
> Perhaps you missed the point that it is not only the very character which is used but also the broken spacing which prevents scripts from detecting levels of itemizing list. Yes, we have more than one level itemizings in our descriptions (see my initial posting). Detecting these would need either a defined character or a defined spacing (IMHO an 'and' would be better than a non-exclusive 'or' here).

Umm. I am not sure that follows. I am also not convinced we need to invent our own rules. Text::Markdown or Text::MultiMarkdown could help. And they do not seem to have issues with recognizing indentation/different characters as denoting levels of lists.

>> I find the descriptions on packages.d.o just fine right now.
>
> IMHO it is no argument that a specific person is happy with the layout everybody else is.

Just like it is no argument that because someone thinks something is ugly, everyone thinks so too.

> If a text has a certain logic it should be supported by the means a certain output style has. HTML can express a list and so it should if we want to express lists.

And we do not need to specify any more rigid rules than established systems like markdown do in order to achieve that. Indeed, we can just pipe the description through markdown, and use the html.

>> Having said that, I would not be averse to specifying that leading white space and *, +, and - would be acceptable as bullet marks (I thought specifying which mark at which level was overspecification).
>
> So you would be in favour of specifying only the amount of white space to define a level?

You do not have to specify the level. Just that the indentation be sufficient for the user or markdown to be able to differentiate what level the item is at.

> If this might be accepted as a rough consensus it is at least helpful to enable tools detecting what they need to detect. Even if my esthetical feeling goes beyond this I can accept this. But you also specified three characters (*, +, and -) so do you want to restrict the acceptable set yourself (for instance not accept 'o')?

I suggest we follow a convention and tool set already in place, with multiple language bindings, if you must insist on adding rules to the long description. There are alternatives (Text::Textile comes to mind), but Markdown has better language support, so long description parsers might have an easier time.

I suggest, for readability, to use a subset of markdown; the link and image tags are not that human readable.

manoj

http://en.wikipedia.org/wiki/Markdown
http://markdown.infogami.com/
http://daringfireball.net/projects/markdown/syntax

-- 
Man's horizons are bounded by his vision.
Manoj Srivastava sriva...@debian.org http://www.debian.org/~srivasta/
1024D/BF24424C print 4966 F272 D093 B493 410B 924B 21BA DABB BF24 424C
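Piping a description through markdown needs one small step first, because the control-file encoding (a leading space on every line, a lone ` .` standing for a blank line) is not Markdown input. A minimal sketch of that step follows; the function name `undebianize` is hypothetical, and the description lines are assumed to have been extracted already.

```sh
#!/bin/sh
# undebianize is a hypothetical helper: it turns the body of a Description
# field (read from stdin) into plain text that a Markdown tool could read.
undebianize() {
    # 1. a line consisting of " ." marks an empty line
    # 2. every other line loses its single leading space
    sed -e 's/^ \.$//' -e 's/^ //'
}
```

The output could then be handed to a `markdown` command, assuming one is installed. As Vincent Danjean noted earlier in the thread, Markdown also wants a blank line before a list starts, which this sketch does not add.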
Re: RFC: Better formatting for long descriptions
(following up on IRC discussion)

Manoj Srivastava sriva...@debian.org writes:

> I suggest we follow a convention and tool set already in place, with multiple language bindings, if you must insist on adding rules to the long description. There are alternatives (Text::Textile comes to mind), but Markdown has better language support, so long description parsers might have an easier time.
>
> I suggest, for readability, to use a subset of markdown; the link and image tags are not that human readable.

reStructuredText <URL:http://docutils.sourceforge.net/rst.html> (reST) is, I argue, a superior choice to Markdown for our existing format.

Markdown explicitly assumes the writer is going to punt to HTML for anything not covered by Markdown, which severely limits its future flexibility in contexts where we don't want to put HTML in the source.

reST, on the other hand, makes no such assumptions about enclosing context; it was initially designed for documentation in program source code, which is much closer to our needs for text in a control field.

It also helps that the simple bullet lists that are the most common case are perfectly valid in reST too.

-- 
 \     “Never express yourself more clearly than you are able to |
  `\                                       think.” —Niels Bohr |
_o__)                                                            |
Ben Finney
Re: RFC: Better formatting for long descriptions
On Thu, 16 Apr 2009, Manoj Srivastava wrote:

>> my initial posting). Detecting these would need either a defined character or a defined spacing (IMHO an 'and' would be better than a non-exclusive 'or' here).
>
> Umm. I am not sure that follows. I am also not convinced we need to invent our own rules.

I tried to suggest *any* rule which works. I'm not in favour of inventing new rules. But the rules should be simple enough not to break any existing tool.

> Text::Markdown or Text::MultiMarkdown could help. And they do not seem to have issues with recognizing indentation/different characters as denoting levels of lists.

If I interpret your first link [1] correctly, these are even *more* rules than I suggested.

>>> I find the descriptions on packages.d.o just fine right now.
>>
>> IMHO it is no argument that a specific person is happy with the layout everybody else is.
>
> Just like it is no argument that someone think something is ugly that means everyone thinks so too.
>
>> If a text has a certain logic it should be supported by the means a certain output style has. HTML can express a list and so it should if we want to express lists.

Please do not split my paragraphs to blur my argument. Thanks.

> And we do not need to specify any more rigid rules than established systems like markdown do in order to achieve that. Indeed, we can just pipe the description through markdown, and use the html

Have you tested whether the current long descriptions render correctly with this suggestion?

>> So you would be in favour of specifying only the amount of white space to define a level?
>
> You do not have to specify the level. Just that the indentation be sufficient for the user or markdown to be able to differentiate what level the item is at.

I'm sorry, I do not know whether markdown is clever enough to render the lists in all long descriptions. But as long as the hint "please make sure that your long description renders with markdown" is not written in any of our documents, I really doubt that. May I draw the conclusion that you are also in favour of some rules but not really happy with the rules I suggested? That's really fine for me. I just want *any* rule which *works* and is written down somewhere, to enable us to file bug reports against packages which do not follow this rule. I think I mentioned this in my postings in this thread.

> I suggest we follow a convention and tool set already in place, with multiple language bindings, if you must insist on adding rules to the long description. There are alternatives (Text::Textile comes to mind), but Markdown has better language support, so long description parsers might have an easier time.

I do not want any complicated tool to parse our long descriptions. In principle they are really easy to parse. I want the simplest possible rule set which enables us to reliably parse the logical structure of our long descriptions. While you claim to be against rules, you propose rules that are even harder to apply. At least for me, your suggestions are confusing and just blur the issue.

> I suggest, for readability, to use a subset of markdown; the link and image tags are not that human readable.

Yes, that's perfectly fine. We are actually just using a subset of markdown, a much simpler one than the one suggested, without features like italics and strong, headings etc. And we do not really need those; we should just keep it simple so as not to break any existing tool. If there is a library which can reliably detect the logical structure of the current long descriptions, probably nothing has to be changed. But I doubt there is one, and I really wonder why anybody who is happy with the current rendering is suggesting even more complex things.

Kind regards

Andreas.

[1] http://en.wikipedia.org/wiki/Markdown

-- 
http://fam-tille.de
Re: RFC: Better formatting for long descriptions
On Thu, Apr 16, 2009 at 04:01:20AM -0500, Manoj Srivastava wrote:

> Umm. I am not sure that follows. I am also not convinced we need to invent our own rules. Text::Markdown or Text::MultiMarkdown could help. And they do not seem to have issues with recognizing indentation/different characters as denoting levels of lists.

Character-level formatting of markdown as well? Two examples:

* From abcmidi:

  This package contains the programs `abc2midi' and `midi2abc', which

* From alltray:

  KDE, XFCE 4*, Fluxbox* and WindowMaker*.
  (*) No drag 'n drop support. Enable with -nm option.

-- 
Tzafrir Cohen         | tzaf...@jabber.org | VIM is
http://tzafrir.org.il |                    | a Mutt's
tzaf...@cohens.org.il |                    | best
ICQ# 16849754         |                    | friend
Re: RFC: Better formatting for long descriptions
Ben Finney ben+deb...@benfinney.id.au writes:

> (following up on IRC discussion)
>
> Manoj Srivastava sriva...@debian.org writes:
>
>> I suggest, for readability, to use a subset of markdown; the link and image tags are not that human readable.
>
> reStructuredText <URL:http://docutils.sourceforge.net/rst.html> (reST) is, I argue, a superior choice to Markdown for our existing format.

Note that, like Manoj, I'm suggesting only a *subset*, not the full specification.

-- 
 \“Like the creators of sitcoms or junk food or package tours, |
  `\   Java's designers were consciously designing a product for |
_o__)         people not as smart as them.” —Paul Graham |
Ben Finney
Re: RFC: Better formatting for long descriptions
On Thu, 16 Apr 2009, Ben Finney wrote:

> Note that, like Manoj, I'm suggesting only a *subset*, not the full specification.

Well, in this thread we have had several suggestions, ranging from a complete change to a different format to subsets of other formats that were not specified in detail. IMHO this does not bring us forward a single step. If we want to move forward, we have to make sure that we will not be forced to touch every single package, because such an attempt is bound to fail and every minute spent in discussion here would simply be wasted. So if you suggest a subset of a specification, please state clearly which subset and whether it works with the currently existing descriptions. I'd volunteer to set up a doodle poll with suggestions.

If you make a suggestion, please answer the following questions:

 A. Does the suggestion enable parsing logical structures like two-level itemized lists? (This is what I want to achieve and what is IMHO needed.)
 B. Does the suggestion allow keeping the majority of descriptions untouched and keeping the currently existing tools? (This is important to gain any acceptance.)

If one of the questions above is answered with "no", please mention whether you are volunteering to do the work needed to port the existing stuff to match your suggestion.

Currently I would feed the poll with 4 suggestions:

 0. Keep everything as unstructured as it is.
    Answer to A: no
    Answer to B: yes

 1. Use '*' for first order item lists and '-' for second order item lists, with '  ' (exactly two spaces) before the '*' and '    ' (exactly four spaces) before the '-'. After '*' and '-' exactly one space should be used, and continued lines should start in the same column as the text above.
    Answer to A: yes
    Answer to B: yes

 2. Use '*' for first order item lists and '-' for second order item lists. Spacing does not matter as long as continued lines start in the same column as the text above.
    Answer to A: yes
    Answer to B: yes

 3. Use any character of ('*', '-', '+') to start a list and mark the level of the list by strictly following spacing rules: '  ' (exactly two spaces) before the selected character to start a first order list and '    ' (exactly four spaces) before the character to start a second order list. After the marker symbol exactly one space should be used, and continued lines should start in the same column as the text above.
    Answer to A: yes
    Answer to B: yes

If you want to make further suggestions, just append to this list. I'll start a doodle poll next Monday. Depending on the outcome of this poll I will submit a patch for "6.2. Best practices for debian/control".

Does this sound reasonable?

Kind regards

Andreas.

-- 
http://fam-tille.de
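Suggestion 1 is strict enough to be checked mechanically. The following is a rough sketch of such a check, not an existing lintian test. The function name `check_suggestion1` is made up, and the sketch assumes the spaces are counted literally in the text it is given (whether the mandatory leading space of the control field is included in the "two spaces" is not settled in the thread).

```sh
#!/bin/sh
# check_suggestion1 is a hypothetical checker for suggestion 1 of the poll.
# It prints every bullet-like line that breaks the rule and returns a
# non-zero status if it found any.
check_suggestion1() {
    awk '
        /^  \* [^ ]/   { next }   # level 1: two spaces, "*", one space
        /^    - [^ ]/  { next }   # level 2: four spaces, "-", one space
        # anything else that still looks like a bullet is a violation
        /^ *[-+*o] /   { printf "line %d: %s\n", NR, $0; bad = 1 }
        END { exit bad + 0 }
    '
}
```

A checker for suggestion 3 would differ only in the first two patterns, accepting any of `*`, `-`, `+` at either depth. Continuation-line alignment is not checked here.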
Re: RFC: Better formatting for long descriptions
On Thu, Apr 16 2009, Ben Finney wrote:

> (following up on IRC discussion)
>
> Manoj Srivastava sriva...@debian.org writes:
>
>> I suggest we follow a convention and tool set already in place, with multiple language bindings, if you must insist on adding rules to the long description. There are alternatives (Text::Textile comes to mind), but Markdown has better language support, so long description parsers might have an easier time.
>>
>> I suggest, for readability, to use a subset of markdown; the link and image tags are not that human readable.
>
> reStructuredText <URL:http://docutils.sourceforge.net/rst.html> (reST) is, I argue, a superior choice to Markdown for our existing format.

I can live with reStructuredText. I would like to point out, though, that the language support is more mature in markdown, and the subset of features we care about is identical in markdown and reST.

> It also helps that the simple bullet lists that are the most common case are perfectly valid in reST too.

Right.

manoj
-- 
Patageometry, n.: The study of those mathematical properties that are invariant under brain transplants.
Manoj Srivastava sriva...@debian.org http://www.debian.org/~srivasta/
1024D/BF24424C print 4966 F272 D093 B493 410B 924B 21BA DABB BF24 424C
Re: RFC: Better formatting for long descriptions
On Thu, Apr 16 2009, Andreas Tille wrote:

> On Thu, 16 Apr 2009, Ben Finney wrote:
>
>> Note that, like Manoj, I'm suggesting only a *subset*, not the full specification.
>
> Well, in this thread we had several suggestions reaching from complete change to different format up to not in detail specified subsets of other formats. IMHO this does not bring us forward a single step. If we want to move forward we have to make sure that we will not be forced to touch every single package because such an intent will be

This is exactly why I like markdown or reStructuredText: most packages conform already.

> bound to fail and every minute spent in discussion here is simply wasted. So if you suggest a subset of a specification please state clearly which subset and whether it works with currently existing descriptions. I'd volunteer to set up a doodle poll with suggestions.

Voting is a piss poor means of making a technical decision.

At this point, I would say rules for lists and bold/italics should not be any more restrictive than markdown/ReST, and should not impose any more burdens on the description writer.

> If you make a suggestion please answer the following question:
>
>  A. Does the suggestion enable parsing logical structures like two level itemize lists? (This is what I want to approach and what is IMHO needed)

Markdown and ReST, trivially.

>  B. Does the suggestion enable keeping the majority of description untouched and enables keeping the currently existing tools? (This is important to gain any acceptance)

Yes, for both. The one issue I have seen raised is that of using *italics* and **bold** text; there are package descriptions where italics will suddenly appear.

Me, I like org mode, where we have /italics/, *bold*, +strikethrough+, _underline_; but I doubt that org-mode will be popular as an interpreter.

manoj
-- 
It is better to have loved and lost -- much better.
Manoj Srivastava sriva...@debian.org http://www.debian.org/~srivasta/
1024D/BF24424C print 4966 F272 D093 B493 410B 924B 21BA DABB BF24 424C
Re: RFC: Better formatting for long descriptions
On Thu, Apr 16 2009, Tzafrir Cohen wrote:

> On Thu, Apr 16, 2009 at 04:01:20AM -0500, Manoj Srivastava wrote:
>
>> Umm. I am not sure that follows. I am also not convinced we need to invent our own rules. Text::Markdown or Text::MultiMarkdown could help. And they do not seem to have issues with recognizing indentation/different characters as denoting levels of lists.
>
> Character-level formatting of markdown as well? Two examples:
>
> * From abcmidi:
>
>   This package contains the programs `abc2midi' and `midi2abc', which

Yup, this one is a problem.

<p>This package contains the programs <code>abc2midi\' and</code>midi2abc\', which</p>

So using ` as a quote seems to be an issue.

__ egrep '`' /var/lib/dpkg/available | wc -l
149

Less than 150 instances.

> * From alltray:
>
>   KDE, XFCE 4*, Fluxbox* and WindowMaker*.
>   (*) No drag 'n drop support. Enable with -nm option.

__ echo "KDE, XFCE 4*, Fluxbox* and WindowMaker*. (*) No drag 'n drop support. Enable with -nm option." | markdown
<p>KDE, XFCE 4*, Fluxbox* and WindowMaker*. (*) No drag 'n drop support. Enable with -nm option.</p>

Hmm. Looks fine to me.

manoj
-- 
If Diet Coke did not exist it would have been necessary to invent it. Karl Lehenbauer
Manoj Srivastava sriva...@debian.org http://www.debian.org/~srivasta/
1024D/BF24424C print 4966 F272 D093 B493 410B 924B 21BA DABB BF24 424C
Re: RFC: Better formatting for long descriptions
On Thu, Apr 16 2009, Andreas Tille wrote:

> On Thu, 16 Apr 2009, Manoj Srivastava wrote:
>
>>> my initial posting). Detecting these would need either a defined character or a defined spacing (IMHO an 'and' would be better than a non-exclusive 'or' here).
>>
>> Umm. I am not sure that follows. I am also not convinced we need to invent our own rules.
>
> I tried to suggest *any* rule which works. I'm not in favour of inventing new rules. But the rules should be simple enough to not break any existing tool.

Which is good, since Markdown/ReST rules for lists will only throw the lists using o as the bullet out of whack.

>> Text::Markdown or Text::MultiMarkdown could help. And they do not seem to have issues with recognizing indentation/different characters as denoting levels of lists.
>
> If I interpret your first link [1] right this are even *more* rules as I suggested.

None of which are mandatory. All the package descriptions I read in /var/lib/dpkg/available seem to pass, though a couple had italics in strange places. This is not a fatal flaw.

>>>> I find the descriptions on packages.d.o just fine right now.
>>>
>>> IMHO it is no argument that a specific person is happy with the layout everybody else is.
>>
>> Just like it is no argument that someone think something is ugly that means everyone thinks so too.
>>
>>> If a text has a certain logic it should be supported by the means a certain output style has. HTML can express a list and so it should if we want to express lists.
>
> Please do not split my paragraphs to blur my arguing. Thanks.

Heh. Ever heard of inline answers?

>> And we do not need to specify any more rigid rules than established systems like markdown do in order to achieve that. Indeed, we can just pipe the description through markdown, and use the html
>
> Have you tested this suggestion whether the current long descriptions will render correctly?

Yup.

>>> So you would be in favour of specifying only the amount of white space to define a level?
>>
>> You do not have to specify the level. Just that the indentation be sufficient for the user or markdown to be able to differentiate what level the item is at.
>
> I'm sorry - I do not know markdown whether it is clever enough to render the lists in all long descriptions. But as long as the hint "please make sure that your long description renders with markdown" is not written in any of our documents I really doubt that. May I draw

Doubt is fine. Actually reading the package descriptions would have been better.

Tag \* was used 9277 times (68.0900%)
Tag - was used 3837 times (28.1600%)
Tag + was used 120 times (.8800%)

These work.

Tag o was used 390 times (2.8600%)

These do not.

Now, using *italic* had a few issues. There are 99 lines in available where * is not used as a list item tag. In 27 of these 99 lines the *word* form is used for emphasis, meaning that in 72 places in the available file * is used as a wildcard. But not all of these are an issue:

--8<---cut here---start->8---
__ echo ' bsd* and others.' | markdown
<p>bsd* and others.</p>
--8<---cut here---end--->8---

Of those 72 places, in only 24 descriptions did a second * show up to anchor the other end of the mistaken emphasis.

> the conclusion that you are also in favour of some rules but not really happy with the rules I suggested? That's really fine for me. I just want *any* rule which *works* and is written down somewhere to enable us filing bug reports against packages which do not follow this rule. I think I mentioned this in my postings of this thread.

I suggest you try it out before handwaving vague FUD around. Even the tnftp description works fine with either. There are very few descriptions (about 24 or so) where we might have unwanted emphasis. I think we can have that fixed.

>> I suggest we follow a convention and tool set already in place, with multiple language bindings, if you must insist on adding rules to the long description. There are alternatives (Text::Textile comes to mind), but Markdown has better language support, so long description parsers might have an easier time.
>
> I do not want any complicated tool to parse our long descriptions. In principle they are really easy to parse. I want to have the simplest possible rule set which enables us to reliably parse the logic of our long descriptions. While you claim to be against rules you propose even harder to apply rules. At least for me your suggestions are confusing and just blurring the issue.

I would simplify the rule, as opposed to having a trivial library call in the tool. Indeed, reusing the libraries provided is *less* work for the parser than a NIH new parser.

>> I suggest, for readability, to use a subset of markdown; the link and image tags are not
Re: RFC: Better formatting for long descriptions
Hi,

Oh, markdown is only confused when you have `two' `words' quoted like this; when there is only one such quote in the package, we are fine.

<p>This package contains the programs `abc2midi' which</p>

So, less than 149 instances of the code tag where we want none.

manoj
finding fewer problems in the descriptions than expected
-- 
Slime is the agony of water. Jean-Paul Sartre
Manoj Srivastava sriva...@debian.org http://www.debian.org/~srivasta/ 1024D/BF24424C print 4966 F272 D093 B493 410B 924B 21BA DABB BF24 424C
-- 
To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
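The count of possibly unintended emphasis discussed above can be approximated with a few lines of shell. This is only a sketch, not the command Manoj actually ran: the function name `find_star_pairs` is made up for illustration, and it works per line rather than per description, so a pair of asterisks split across two lines is missed.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print the lines in which
# markdown could pair up two asterisks as emphasis.
find_star_pairs() {
    awk '{
        line = $0
        # Ignore a leading "*" that is used as a list item tag.
        sub(/^ *\* /, "", line)
        # Two asterisks left on the same line can anchor *emphasis*,
        # whether intended or not (e.g. two shell wildcards).
        if (line ~ /\*.*\*/) print $0
    }'
}
```

Fed with description lines on standard input, it prints both intended `*emphasis*` and wildcard pairs such as `*.c or *.h`; telling the two apart still needs a human reader.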
Re: Item lists bulletting (was: Re: RFC: Better formatting for long descriptions)
Quoting Lars Wirzenius (l...@liw.fi):
On Thu, 2009-04-16 at 08:42 +0200, Christian Perrier wrote:

I have never been able to find any such solid reference for English. There is probably something in the Chicago Manual of Style, that's generally accepted as the Right Reference for en_US. Maybe more input from our experts on debian-l10n-english?

I'm not an expert, but I have the 14th edition of the CMS. It says both bullets and dashes are acceptable (8.77, page 314, for reference).

Well, based on that discussion, these facts and the current practice, I think that, in Smith reviews, we will from now on recommend the use of asterisks for first-level items in item lists, in package descriptions and debconf templates (these are the texts we review).

Please note that this is not *enforcing* things on maintainers. All Smith reviews are suggestions made to maintainers and they are associated with the whole discussion/review. When maintainers insist on some practice (or even spelling|wording) we always follow their advice in the end, even for maintainers who insist on using first-person sentences (hint hint). The same will happen for item lists.
Re: RFC: Better formatting for long descriptions
Hi,

I think we need to enumerate some goals for this proposed change. Here is a start:

- Minimal disruption for current packages. The impact should be measured by the number of packages impacted
  + Any specification of which of *, +, - to use as the first level item will impact more packages than not specifying it, by several hundred
  + The same is true for specifying the mark used for second level list items
  + Specifying the exact number of spaces will also hit current packages, and will be a source of errors in the future.
- Ability to recognize and render the following logical entities, in decreasing order of importance:
  + unordered lists
  + ordered lists
  + emphasis
  + strong emphasis
  + definition lists
  + hypertext links
  + underlines, and strike throughs
- Readability for people looking at non-enhanced renditions, i.e., using less on the Packages file. Sticking to widely known conventions, using the same conventions that people are used to using in email, and Wikis, is a plus.
- Ease of use for description writers. Again, sticking with standards that people already know and use is better than making our own, more restrictive standards
- Not adding hugely to bloat for the Packages file. This kinda excludes verbose markup like XML (which would have failed the readability test too)

At this point, I would say that Markdown/reStructuredText meets most of the goals above, as long as we restrict the markup to the list above:

* unordered lists
* ordered lists
* emphasis
* strong emphasis
* definition lists
* hypertext links
* underlines, and strike throughs

manoj
-- 
If we can't fix it -- we'll fix it so nobody can. Gibbons
Manoj Srivastava sriva...@debian.org http://www.debian.org/~srivasta/ 1024D/BF24424C print 4966 F272 D093 B493 410B 924B 21BA DABB BF24 424C
Re: RFC: Better formatting for long descriptions
Manoj Srivastava wrote:

- Ability to recognize and render the following logical entities, in decreasing order of importance:
  + unordered lists
  + ordered lists

really needed?

  + emphasis
  + strong emphasis
  + definition lists
  + hypertext links
  + underlines, and strike throughs

I don't think they are needed. Underlines are generally bad, strike throughs are worse ;-) Possibly also monospace, e.g. for commands, but I really prefer to have as simple a language as possible.

At this point, I would say that Markdown/reStructuredText meets most of the goals above, as long as we restrict the markup to the list above:

Could you provide us an example of reStructuredText for the basic constructs?

* unordered lists
* ordered lists
* emphasis
* strong emphasis
* definition lists
* hypertext links
* underlines, and strike throughs

I also like creole (standardized wiki language, moinmoin supports it), but it has no definition lists, underline, or strike throughs. So for creole:

* unordered lists: \n * \n **
* ordered lists: \n # \n ##
* emphasis: //foo//
* strong emphasis: **bar**
* definition lists: missing, possibly \n **spam** is spam
* hypertext links: normal url
* underlines, and strike throughs: missing, missing

ciao
cate
Re: RFC: Better formatting for long descriptions
On Thu, Apr 16 2009, Giacomo Catenazzi wrote:
Manoj Srivastava wrote:

- Ability to recognize and render the following logical entities, in decreasing order of importance:
  + unordered lists
  + ordered lists

really needed?

I would think these are the guts of this proposal. Or else what are we discussing here?

  + emphasis
  + strong emphasis
  + definition lists
  + hypertext links
  + underlines, and strike throughs

I don't think they are needed.

Why not? If rendering a description in a manner that makes it easier to read is the goal, I fail to see why emphasis and strong emphasis are a bad idea (think of text-to-speech mechanisms). These are not just opinions we are discussing here; we should be looking at use cases for marking up a textual description.

Underlines are generally bad, strike throughs are worse ;-)

So you say. Don't use them, then. There are cases where either one of these constructs has value; and you should not impose your personal aesthetics on a general policy discussion.

Possibly also monospace, e.g. for commands, but I really prefer to have as simple a language as possible.

At this point, I would say that Markdown/reStructuredText meets most of the goals above, as long as we restrict the markup to the list above:

Could you provide us an example of reStructuredText for the basic constructs?

* unordered lists
* ordered lists
* emphasis
* strong emphasis
* definition lists
* hypertext links
* underlines, and strike throughs

I also like creole (standardized wiki language, moinmoin supports it), but it has no definition lists, underline, or strike throughs.

What kind of language bindings are present for creole libraries? markdown has a shell interpreter, has python, perl, ruby, C, c++, lisp, and is widely supported and used by wikis et al.

So for creole:
* unordered lists: \n * \n **

This fails the "Do not impact large numbers of packages" test, since we have lots of packages using + and - for list items.

* ordered lists: \n # \n ##
* emphasis: //foo//

This also fails the test above -- lots of people are using *emphasis*.

* strong emphasis: **bar**
* definition lists: missing, possibly \n **spam** is spam

Hmm

* hypertext links: normal url
* underlines, and strike throughs: missing, missing

ok.

manoj
-- 
There's just something I don't like about Virginia; the state.
Manoj Srivastava sriva...@debian.org http://www.debian.org/~srivasta/ 1024D/BF24424C print 4966 F272 D093 B493 410B 924B 21BA DABB BF24 424C
Re: RFC: Better formatting for long descriptions
On Thu, Apr 16, 2009 at 12:50:12PM -0500, Manoj Srivastava wrote: I think we need to enumerate some goals for this proposed change. Here is a start: - Minimal disruption for current packages. The impact should be measured by numbers of packages impacted snip At this point, I would say that Markdown/Resstructued text meets most of the goals above, as long as we restrict the markup to the list above: I agree with the goals and thanks for resetting the discussion on their grounds. According to the goals you pointed out, it looks like that Markdown would be a more than suitable choice in terms of availability of implementations, matching of mail-like markup (which is actually one of the design goal of the language), and minimal disruption. [ Markdown would also be my choice in term of personal tastes. Not that it matters, but I mention it to it make clear which is my church in this respect :) ] However, markdown would not be directly applicable to the content of the long description field, as a RFC822 parser would give you, due to '.'s used as paragraph separators. Sure the needed pre-processing to fix that would be trivial, but it is *some kind* of pre-processing. One can then wonder to which extent we would allow pre-processing before the markup processor without considering that need a disruption of current long descriptions. I just felt like pointing that out, because it can put back into play some other language which can be considered non disrupting by allowing some extra pre-processing bits. ... nevertheless I completely agree that something like Markdown + the minimal paragraph separator pre-processing looks like a completely reasonable implementation plan. Out of curiosity, would restructured text be immune to this problem? Cheers. -- Stefano Zacchiroli -o- PhD in Computer Science \ PostDoc @ Univ. Paris 7 z...@{upsilon.cc,pps.jussieu.fr,debian.org} -- http://upsilon.cc/zack/ Dietro un grande uomo c'è ..| . |. 
Et ne m'en veux pas si je te tutoie sempre uno zaino ...| ..: | Je dis tu à tous ceux que j'aime
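The paragraph-separator pre-processing mentioned above can be sketched as follows. This is an assumption about what "minimal" pre-processing would look like, modelled on the sed call posted earlier in the thread: `strip_longdesc` is a hypothetical name, and the input is assumed to be only the continuation lines of a Description field (without the synopsis line).

```shell
#!/bin/sh
# Hypothetical pre-processor (not an existing tool): turn the continuation
# lines of a long description into text a Markdown processor can read.
strip_longdesc() {
    # 1st expression: a line holding only " ." is a paragraph separator,
    #                 so it becomes an empty line.
    # 2nd expression: drop the single leading blank that marks a
    #                 continuation line; deeper indentation is preserved.
    sed -e 's/^ \.$//' -e 's/^ //'
}
```

The output could then be piped into the `markdown` command used elsewhere in the thread. Note that this sketch does not insert the blank line Markdown expects before a list, which was reported earlier as the more annoying part.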
Re: RFC: Better formatting for long descriptions
Hi!

On Mon, 2009-03-23 at 13:26:36 +0100, Andreas Tille wrote:
On Mon, 23 Mar 2009, Michael Banck wrote:

So it would be great if some numbers could be brought up first (maybe Andreas has a rough overview now, because he looked at the different kinds of itemizations).

Well, I had not but you can get it somehow by

for tag in '\*' - + o ; do
    count=`grep "^ $tag " /var/lib/dpkg/available | wc -l`
    echo "Tag $tag was used $count times"
done

Tag \* was used 5647 times
Tag - was used 2710 times
Tag + was used 85 times
Tag o was used 282 times

which only counts those who have proper spacing - but for a rough estimation '*' wins definitely.

Even if we'd have to fix all the entries with wrong spacing anyway to reach correctness, I was curious to see numbers for all spacing variants for a wider representation of the characters used:

,-- count-bullet-chars.sh --
#!/bin/sh
# Count list item tags in the sid/main Packages files, with any amount of leading blanks.
lists=/var/lib/apt/lists/*_sid_main_*_Packages
total=`grep "^ *[-+\*o] " $lists | wc -l`
for tag in '\*' - + o; do
  items=`grep "^ *$tag " $lists | wc -l`
  percent=`echo "scale=4; $items / $total * 100" | bc`
  echo "Tag $tag was used $items times ($percent%)"
done
`--

Tag \* was used 9277 times (68.0900%)
Tag - was used 3837 times (28.1600%)
Tag + was used 120 times (.8800%)
Tag o was used 390 times (2.8600%)

Regardless of the numbers though (which have lately moved slightly in favour of '-' due to the recommendations from the Smith reviewing project), I've always found the asterisk the obvious character to use for bulleted lists, as it's the one most resembling a bullet, and it's the one we use in changelog entries and similar.

regards,
guillem
Re: RFC: Better formatting for long descriptions
On Wed, 8 Apr 2009, Guillem Jover wrote:

There's been a wiki page trying to track this, including packages whose formatting was proving problematic: http://wiki.debian.org/Aptitude::Parse-Description-Bullets=true

Great. The most important information from this page for me is that there are actually other tools (not the one I intended to write for Blends) which would profit from a more standardized formatting of descriptions. IMHO this justifies filing bug reports against packages that try to implement a list but fail to use the form:

has_list |= ( line =~ /^\s+-/ )    # a line starts with -
has_list |= ( line =~ /^\s+\+/ )   # +
has_list |= ( line =~ /^\s+\*/ )   # *
has_list |= ( line =~ /^\s+o\s+/ ) # o

BTW, why are you checking for \s after the itemizing symbol only after 'o'? IMHO it should always follow each itemizing symbol. I also see no good chance of detecting multi-level lists, and thus I would like to come back to stricter rules regarding the itemizing symbol and the spacing. In contrast to the comment at the end, the check also allows - and I would rather like to force /^ - / or /^ + / (yes, not checking for any space but really the character ' ' = blank). IMHO this would increase the reliability of detecting a list, and if there are tools like aptitude which are actually making use of it, it should be worth the effort.

For the sake of interest: What programming language is the script above?

Kind regards
Andreas.
-- 
http://fam-tille.de
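The stricter rule argued for above could be checked mechanically, for instance as below. This is a sketch under assumptions, not an existing lintian check: the function name `report_loose_items` is hypothetical, and the strict form is taken literally as "one blank, one of - + *, one blank", which may not be the exact spacing that was meant.

```shell
#!/bin/sh
# Hypothetical check: print lines that look like list items (any leading
# whitespace, a tag out of - + * o, then whitespace) but do not follow
# the assumed strict form "blank, tag out of - + *, blank".
report_loose_items() {
    grep -E '^[[:space:]]+[-+*o][[:space:]]' | grep -Ev '^ [-+*] '
}
```

The function reads description lines on standard input. Its exit status is non-zero when nothing is reported, as usual for grep, so a caller can use it directly in an `if`.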
Re: RFC: Better formatting for long descriptions
Hi! On Mon, 2009-03-23 at 16:23:12 -0700, Daniel Burrows wrote: I don't have the energy to push this any more, but I should probably at least refer to my previous attempt to standardize bulleted lists: http://lists.debian.org/debian-devel/2005/12/msg00531.html You might find it useful, or not. At least it more or less documents current practice in aptitude (I think there have been some tweaks since then; if anyone cares I could go research what they are and dig them up). There's been a wiki page trying to track this, including packages which formatting was proving problematic: http://wiki.debian.org/Aptitude::Parse-Description-Bullets=true regards, guillem -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Re: RFC: Better formatting for long descriptions
On Mon, Mar 23, 2009 at 07:18:07PM +0100, Michael Banck wrote: Uh, what are you saying here? That we should use * to prepend items in itemized lists, so that it can be converted to HTML lists by packages.debian.org et al.? If not, what else? Yes. More generally, I believe we can benefit in the long run of some simple text-based markup that support the basic emphasis stuff we are used to use in emails; markdown is just an example of such a language. Having to choose a syntax for itemized list, it would be wise to choose one which is future compatible with such a language. Cheers. -- Stefano Zacchiroli -o- PhD in Computer Science \ PostDoc @ Univ. Paris 7 z...@{upsilon.cc,pps.jussieu.fr,debian.org} -- http://upsilon.cc/zack/ Dietro un grande uomo c'è ..| . |. Et ne m'en veux pas si je te tutoie sempre uno zaino ...| ..: | Je dis tu à tous ceux que j'aime signature.asc Description: Digital signature
Re: RFC: Better formatting for long descriptions
Quoting Michael Banck (mba...@debian.org):

Please note that debian-l10n-english suggests using the enumeration style you mention for a2ps, when we're reviewing package descriptions...

What's the rationale? So far, I was under the impression that *

A not very strong one, I'm afraid..:-)

IIRC, we once found some reference indicating a tendency for dashed enumerations to be an accepted standard but I can't quote this.

Another reason is the fact that we're using this in French translations... which is a bad reason..:-)

Another is that we had to choose something and, based on purely personal impressions, we were thinking that dashed enumerations were the majority (nobody really verified).

I think this was never the only change we proposed. Most of the time, there are several other changes... particularly when enumerations are involved because, in such cases:

- they're often too long (enumerating each and every feature of the software)
- they have formatting issues (punctuation, often)
- they have consistency issues (mixing verb sentences and noun sentences for instance)
Re: RFC: Better formatting for long descriptions
On Mon, 23 Mar 2009, Christian Perrier wrote:

What's the rationale? So far, I was under the impression that *

A not very strong one, I'm afraid..:-) IIRC, we once found some reference indicating a tendency for dashed enumerations to be an accepted standard but I can't quote this.

Could you please clarify whether you mean *enumeration* (in the sense of LaTeX's enumerate environment or HTML's ol) or would you rather mean *itemize* (in the sense of LaTeX's itemize environment or HTML's ul)? IMHO these are things which should be handled differently. I don't care whether a ' *' or a ' -' is finally used - it just should be used in the same way for all descriptions.

- they're often too long (enumerating each and every feature of the software)
- they have formatting issues (punctuation, often)
- they have consistency issues (mixing verb sentences and noun sentences for instance)

I completely agree that this should be fixed as well - but it is hard to code such tests in a lintian check or something like this.

Kind regards
Andreas.
-- 
http://fam-tille.de
Re: RFC: Better formatting for long descriptions
On Mon, Mar 23, 2009 at 07:24:45AM +0100, Christian Perrier wrote: Quoting Michael Banck (mba...@debian.org): Please note that debian-l10n-english suggests using the enumeration style you mention for a2ps, when we're reviewing package descriptions... What's the rationale? So far, I was under the impression that * A not very strong one, I'm afraid..:-) IIRC, we once found some reference indicating a tendency for dashed enumerations to be an accepted standard but I can't quote this. Another reason is the fact that we're using this in French translationswhich is a bad reason..:-) Another is that we had to choose something and, based on purely personal impressions, we were thinking that dashed enumerations were the majority (nobody really verified). Well, ok; but your initial post to this thread made it sound like some semi-or-mostly official description review process, so having to change all my long descriptions to - (after all, standardizing on one format is the point of this thread) does not fill me with pure joy. So if I have to do that, I'd prefer having a reason like 80% of the packages do it like that or this is the preferred form of itemization in english according to ..., or something. The above reasons do not look very convincing to me. So it would be great if some numbers could be brought up first (maybe Andreas has a rough overview now, because he looked at the different kinds of itemizations). Again, I don't think enumerations are used that much (and if they are, a lot of them are really itemizations I guess), but standardizing on itemizations strikes me as useful. Not just for packages.d.o HTML output, but also for apt-cache show consistence etc. Michael -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Re: RFC: Better formatting for long descriptions
On Mon, 23 Mar 2009, Michael Banck wrote:

So it would be great if some numbers could be brought up first (maybe Andreas has a rough overview now, because he looked at the different kinds of itemizations).

Well, I had not but you can get it somehow by

for tag in '\*' - + o ; do
    count=`grep "^ $tag " /var/lib/dpkg/available | wc -l`
    echo "Tag $tag was used $count times"
done

Tag \* was used 5647 times
Tag - was used 2710 times
Tag + was used 85 times
Tag o was used 282 times

which only counts those who have proper spacing - but for a rough estimation '*' wins definitely.

Again, I don't think enumerations are used that much (and if they are, a lot of them are really itemizations I guess)

Just recommending "There is no real need for enumerations - let's use itemize in any case" might be a valid point as well. But IMHO we need descriptions (in the sense of the LaTeX description environment or HTML dl).

Kind regards
Andreas.
-- 
http://fam-tille.de
Re: RFC: Better formatting for long descriptions
On Fri, Mar 20, 2009 at 02:45:09PM +0100, Andreas Tille wrote: I do not propose drastic changes but a start for Best practices might be reasonable and perhaps some lintian warnings might help to remind developers to move to some standard. Laudable initiative, thanks for raising the issue. The current handling of list is dumb at best. I agree with Martin that we should avoid the NIH syndrome though, but that does not necessarily mean that we should switch entirely control files to a new format. It just means that we should think big. In particular, I observe that we (IIRC) already have psuedo-parsing code which is used at least by packages.d.o to render as proper HTML lists the pseudo-lists which come from long descriptions. That makes evident, at least to me, that long descriptions need some kind of formatting for most of their use cases (packages.d.o is one, the interface of a GUI package manager is another one). In that respect, resisting the NIH syndrome just means choose an already existing text-based markup language and adopt its convention. For instance, we can just say that long description lists have to be formatted as Markdown lists (modulo some extra bits needed to not violate 822 parsing). That would be synergistic with a possible future switch to Markdown for the whole markup of long descriptions. Note that I don't care in particular about Markdown, it can also be restructured text for what I care. But please check that your convention matches such a markup language and please say explicitly so in your proposal. That would also implement a somewhat principle of least surprise for people coming from those languages. Thanks! Cheers. -- Stefano Zacchiroli -o- PhD in Computer Science \ PostDoc @ Univ. Paris 7 z...@{upsilon.cc,pps.jussieu.fr,debian.org} -- http://upsilon.cc/zack/ Dietro un grande uomo c'è ..| . |. Et ne m'en veux pas si je te tutoie sempre uno zaino ...| ..: | Je dis tu à tous ceux que j'aime signature.asc Description: Digital signature
Re: RFC: Better formatting for long descriptions
On Mon, 23 Mar 2009, Stefano Zacchiroli wrote:

In particular, I observe that we (IIRC) already have pseudo-parsing code which is used at least by packages.d.o to render as proper HTML lists the pseudo-lists which come from long descriptions.

Not that I know of. IMHO it is just set verbatim (<pre>); just checking the a2ps example which was mentioned here:

<h2>GNU a2ps - 'Anything to PostScript' converter and pretty-printer</h2>
<p> GNU a2ps converts files into PostScript for printing or viewing. It uses a nice default format, usually two pages on each physical page, borders surrounding pages, headers with useful information (page number, printing date, file name or supplied header), line numbering, symbol substitution as well as pretty printing for a wide range of programming languages.
<p> Historically, a2ps started as a text to PostScript converter, but thanks to powerful delegations it is able to let you use it for any kind of files, ie it can also digest manual pages, dvi files, texinfo,
<p> Among the other most noticeable features of a2ps are:
<pre> - various encodings (all the Latins and others),
 - various fonts (automatic font down loading),
 - various medias,
 - various printer interfaces,
 - various output styles,
 - various programming languages,
 - various helping applications,
 - and various spoken languages.
</pre>

But please check that your convention matches such a markup language and please say explicitly so in your proposal.

This is definitely intended but I'm not an expert in those markup languages. That's why I said:

1. Defines some kind of standard which can be parsed automatically.
2. Does not break any existing tool

If there is an existing markup language which meets these requirements I'd definitely vote for it.

Kind regards
Andreas.
-- 
http://fam-tille.de
Re: RFC: Better formatting for long descriptions
Quoting Andreas Tille (til...@rki.de): Could you please clarify whether you mean *enumeration* (in the sense I meant itemization, actually, so more ul than ol. There are certainly very few cases where ordered lists are really useful in packages' description. Sorry for the approximative English, here.. -- signature.asc Description: Digital signature
Re: RFC: Better formatting for long descriptions
On Mon, Mar 23, 2009 at 02:32:17PM +0100, Stefano Zacchiroli wrote: In that respect, resisting the NIH syndrome just means choose an already existing text-based markup language and adopt its convention. For instance, we can just say that long description lists have to be formatted as Markdown lists (modulo some extra bits needed to not violate 822 parsing). That would be synergistic with a possible future switch to Markdown for the whole markup of long descriptions. Note that I don't care in particular about Markdown, it can also be restructured text for what I care. Uh, what are you saying here? That we should use * to prepend items in itemized lists, so that it can be converted to HTML lists by packages.debian.org et al.? If not, what else? Michael -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Re: RFC: Better formatting for long descriptions
I don't have the energy to push this any more, but I should probably at least refer to my previous attempt to standardize bulleted lists: http://lists.debian.org/debian-devel/2005/12/msg00531.html You might find it useful, or not. At least it more or less documents current practice in aptitude (I think there have been some tweaks since then; if anyone cares I could go research what they are and dig them up). Daniel -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Re: Extended descriptions size (was Re: RFC: Better formatting for long descriptions)
On Sun, 22 Mar 2009, Michael Bramer wrote:

if we like to remove the long description from the package file, we must change apt in some way and use some other rules to select the right description (a new 'Description-md5sum' or the Version-Nr)

I'd call the Version-Nr. a sensible choice. ;-)

Kind regards
Andreas.
-- 
http://fam-tille.de
Re: RFC: Better formatting for long descriptions
On Sat, 21 Mar 2009, Christian Perrier wrote: Please note that debian-l10n-english suggests using the enumeration style you mention for a2ps, when we're reviewing package descriptions... BTW, once you answered in this thread: Shouldn't we make the suggested enhancements part of the Smith-Project? Kind regards Andreas. -- http://fam-tille.de -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Re: RFC: Better formatting for long descriptions
Quoting Andreas Tille (til...@rki.de):
On Sat, 21 Mar 2009, Christian Perrier wrote:

Please note that debian-l10n-english suggests using the enumeration style you mention for a2ps, when we're reviewing package descriptions...

BTW, once you answered in this thread: Shouldn't we make the suggested enhancements part of the Smith-Project?

Certainly. I currently refrain from reading -devel (it seems like we are in this state of the release cycle where flame wars and complicated discussions increase... and I try saving my own time for productive work) but I would appreciate a summary in case things and ideas converge (good luck for this..:-))

Another thing we encourage in Smith is the use of good boilerplates in package descriptions, for multi-binary packages... The point is having a repetitive part common to all packages of a given source package, that is the description of the general use of the framework, and 1 or 2 specific paragraphs for each binary package saying things like "This package provides the development files for foo", etc.

A good example of this is the recent review of nut templates... that was one of the most complicated reviews we did (mostly because this is one of the few where the maintainer gave advice...:-))

That review starts at http://lists.debian.org/debian-l10n-english/2009/03/msg00025.html ...and turned into #520591

I suggest interested parties look at debian/control for nut before and after the review..:-)
Re: RFC: Better formatting for long descriptions
On Sat, Mar 21, 2009 at 10:52:10PM +0100, Andreas Tille wrote:

I agree that some descriptions are definitely too long. I wonder who should really read some descriptions to the end. Bad examples can be viewed here: http://debian-med.alioth.debian.org/tasks/typesetting.html

The very long lengths seem to come mostly from lists of CTAN packages in a Debian package; I find these useful, as I can apt-cache search CTAN_package to find it in Debian.

-- 
Lionel
Re: RFC: Better formatting for long descriptions
On Sun, 22 Mar 2009, Lionel Elie Mamane wrote: http://debian-med.alioth.debian.org/tasks/typesetting.html The very long lengths seem to come mostly from lists of CTAN packages in a Debian package; I find these useful, as I can apt-cache search CTAN_package to find it in Debian. Yes, I'm sure there are reasons for just putting everything into the description of a package - but as this thread shows there are also reasons against - and I wonder how many users are bored about overlongish descriptions compared to those who grep apt-cache output. Kind regards Andreas. -- http://fam-tille.de -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Re: RFC: Better formatting for long descriptions
Lionel Elie Mamane lio...@mamane.lu writes: The very long lengths seem to come mostly from lists of CTAN packages in a Debian package; I find these useful, as I can apt-cache search CTAN_package to find it in Debian. For that purpose, it would seem ‘apt-file’ can do the job better, obviating the need for that listing to bloat the Packages file. Or am I missing something? -- \“I bought a dog the other day. I named him Stay. It's fun to | `\ call him. ‘Come here, Stay! Come here, Stay!’ He went insane. | _o__) Now he just ignores me and keeps typing.” —Steven Wright | Ben Finney -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Re: RFC: Better formatting for long descriptions
Neil Williams wrote: If large numbers of package descriptions are to change collectively, it's best to make that one change with two aims rather than two separate changes. Less work for everyone involved. But Andreas' RFC affects the source packages, yours only affects the infrastructure that builds and uses Packages. IOW: maintainers need to do something to go ahead with Andrea's proposal and do nothing to see package descriptions go away from Packages. Just looking for a bit of consideration for those situations where the Packages file is already too large. Cheers, Raphael Geissert -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Re: RFC: Better formatting for long descriptions
On Sat, Mar 21, 2009 at 11:13:54PM +0100, Christian Perrier wrote:

> Quoting Andreas Tille (til...@rki.de):
> > Package: a2ps
> >   - various encodings (all the Latins and others),
> >   - various fonts (automatic font down loading),
> >   - various medias,
> >   ^^ (two spaces)
>
> Please note that debian-l10n-english suggests using the enumeration style you mention for a2ps, when we're reviewing package descriptions...

What's the rationale? So far, I was under the impression that * was the most used enumeration style in long descriptions.

Michael
Re: Extended descriptions size (was Re: RFC: Better formatting for long descriptions)
On Fri, 20 Mar 2009 19:15:00 -0400 Filipus Klutiero chea...@gmail.com wrote:

> On Fri, 20 Mar 2009 14:45:09 +0100 (CET) Andreas Tille til...@rki.de wrote:
> > > I tried to find clear advice on how to reasonably format lists inside long descriptions of packages. The only thing I know is that lines with two leading spaces are considered verbatim.
> >
> > Packages.gz is already 26Mb - I'd like to find ways to shorten the package descriptions, not lengthen it. :-(
>
> Current squeeze main Packages.gz is 7 MB: http://ftp.ca.debian.org/debian/dists/squeeze/main/binary-i386/

Bah, my fault - 26Mb uncompressed. I was looking at /var/lib/apt/lists/ Sorry.

> > Can the long description be trimmed to only such data necessary to identify the package compared to similar packages? We have debtags for lots of other facets of a package description, maybe it is time that the long description itself is trimmed so that it does not repeat any information already encoded as debtags?
>
> debtags is not yet at a stage where this should be done (for one thing, Synaptic, for example, does not support debtags). Even if it would be possible, I doubt this would help much.

Any reduction, replicated across 13,000 packages (or even just the ones from that 13,000 that have verbose long descriptions currently), is only going to help reduce the size of the file.

> > What about a way of having a really long, detailed, nicely formatted description on packages.debian.org but a much shorter, more basic version in the Packages.gz file?
>
> The extended description needs to be available to APT

Only for use by apt-search, the rest of apt doesn't care about it. apt understands debtags, why duplicate that information? (Frontends can be adapted or just rely on apt-cache search underneath.)

> , not only via packages.d.o. I seem to remember that Mandrake Linux (or some other RPM-based distribution) used two Packages-like files, a fat one about 5 times our Packages and a slim one about a fifth of Debian's Packages. I remember finding the slim index cool, but now that there's Packages.diff, I think that developing Mandrake-like Packages files and seeing the results in, perhaps, 2 years, would not benefit much to the kind of hardware Debian will run on by then.

Debian is not exclusively for power-hungry servers and mega-powerful workstations, Debian also runs on very small hardware and not necessarily old stuff either. It is a mistake to think that Debian should require more and more powerful hardware for the basic system. Yes, there is software in Debian that needs a powerful machine, there is also a LOT of software in Debian specifically designed for low resource machines where the benefits of a 1Mb Packages.gz file are appreciable.

--
Neil Williams
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
Re: RFC: Better formatting for long descriptions
On Fri, 20 Mar 2009 23:32:51 +0100 Michael Banck mba...@debian.org wrote:

> On Fri, Mar 20, 2009 at 07:20:43PM +, Neil Williams wrote:
> > I'd like to get the longest descriptions out of Packages.gz completely, so encouraging their retention is not ideal. It's not about whether 2 or 3 spaces should be used, it's about whether such detailed content deserves to be in Packages.gz in the first place.
>
> Then I wonder why you hijacked this thread and did not rather start a new one?

If large numbers of package descriptions are to change collectively, it's best to make that one change with two aims rather than two separate changes. Less work for everyone involved.

Just looking for a bit of consideration for those situations where the Packages file is already too large.

--
Neil Williams
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
Re: Extended descriptions size (was Re: RFC: Better formatting for long descriptions)
On Sat, 21 Mar 2009 12:28:36 +0900 Paul Wise p...@debian.org wrote:

> On Sat, Mar 21, 2009 at 8:15 AM, Filipus Klutiero chea...@gmail.com wrote:
> > The extended description needs to be available to APT, not only via packages.d.o.
>
> I agree with Neil Williams's comment in the other thread about removing long descriptions from the Packages files. I think the obvious place to put them is in dists/unstable/main/i18n/Translations-en (or C) like the descriptions from DDTP.

Now that's a good idea - thanks Paul. That way, the long descriptions can be moved aside without needing changes by lots of maintainers and other formatting changes like the original thread can proceed independently.

It's another instance of duplication - why retain the long description in the Packages file while a translated version also exists from DDTP? Probably better for the description to be removed from the Packages file completely, with the DDTP file containing the translated versions plus the English ones for those with missing or outdated translations. That way, apt spends less time parsing the (smaller) Packages file when doing ordinary stuff like package installation and only needs to look at the DDTP information when specifically called as 'apt-cache search'.

CC:'ing debian-i18n to see if there are problems with this approach.

--
Neil Williams
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
Re: Extended descriptions size (was Re: RFC: Better formatting for long descriptions)
On Sat, Mar 21, 2009 at 4:58 PM, Neil Williams codeh...@debian.org wrote:

> It's another instance of duplication - why retain the long description in the Packages file while a translated version also exists from DDTP? Probably better for the description to be removed from the Packages file completely and the DDTP one contains the translated version and English ones for those with missing or outdated translations. That way, apt spends less time parsing the (smaller) Packages file when doing ordinary stuff like package installation and only needs to look at the DDTP information when specifically called as 'apt-cache search'.

One issue is that many people will have disabled downloading translations so they'll need to change their configuration from none to en:

  APT::Acquire::Translation none;

Since en will now be a Translation, perhaps a different config item is more appropriate:

  APT::Acquire::Description en;

--
bye, pabs
http://wiki.debian.org/PaulWise
Re: RFC: Better formatting for long descriptions
On Fri, 20 Mar 2009, Neil Williams wrote:

> Packages.gz is already 26Mb - I'd like to find ways to shorten the package descriptions, not lengthen it. :-(

Please read again. Chances are good that Packages files might become shorter.

> > The rationale behind this is that with some better standard formatting some tools which display descriptions on web pages might be enhanced to use li, ol and dl tags which finally makes a better reading.
>
> Oh no, please don't let Packages.gz get to 40Mb or 50Mb or more. There has to be a limit somewhere.

You should definitely read again: how would removing or adding a few spaces, and using defined characters instead of random ones, have such an effect?

Kind regards

Andreas.

-- http://fam-tille.de
Re: RFC: Better formatting for long descriptions
On Fri, 20 Mar 2009, Neil Williams wrote:

> My comment for this RFC is, therefore, that better formatting for long descriptions should include a review of whether the long description deserves to be that long in the first place, whether the long description merely duplicates data already available via debtags and whether the long description should be trimmed for the package in question *as well as* standardising the formatting of what remains.

I agree that some descriptions are definitely too long. I wonder who would really read some of them to the end. Bad examples can be viewed here:

http://debian-med.alioth.debian.org/tasks/typesetting.html

Kind regards

Andreas.

-- http://fam-tille.de
Re: RFC: Better formatting for long descriptions
On Fri, 20 Mar 2009, Filipus Klutiero wrote:

> > 2. Does not break any existing tool
>
> I tend to agree with Martin. Do you have a particular reason that makes this change urgent?

Just to give the suggestion a small chance. I'm not against a better format but I have read enough suggestions that ended in nothing. BTW, getting the descriptions into some standard shape might make an automatic transition to a better format easier.

Kind regards

Andreas.

-- http://fam-tille.de
Re: RFC: Better formatting for long descriptions
Quoting Andreas Tille (til...@rki.de):

> Package: a2ps
>   - various encodings (all the Latins and others),
>   - various fonts (automatic font down loading),
>   - various medias,
>   ^^ (two spaces)
>
> Package: acerhk-source
>    * controlling LEDs (Mail, Wireless)
>    * enable/disable wireless hardware
>   ^^^ (three spaces)

.../...

Please note that debian-l10n-english suggests using the enumeration style you mention for a2ps, when we're reviewing package descriptions...

Of course, that triggers rewrites, but these are generally coupled with many more very good improvement suggestions (the team features an artist of the English language, and that's not /me, which is obvious to everybody).
Re: Extended descriptions size (was Re: RFC: Better formatting for long descriptions)
Paul Wise wrote:

> On Sat, Mar 21, 2009 at 4:58 PM, Neil Williams codeh...@debian.org wrote:
> > It's another instance of duplication - why retain the long description in the Packages file while a translated version also exists from DDTP? Probably better for the description to be removed from the Packages file completely and the DDTP one contains the translated version and English ones for those with missing or outdated translations. That way, apt spends less time parsing the (smaller) Packages file when doing ordinary stuff like package installation and only needs to look at the DDTP information when specifically called as 'apt-cache search'.
>
> One issue is that many people will have disabled downloading translations so they'll need to change their configuration from none to en:
>
>   APT::Acquire::Translation none;
>
> Since en will now be a Translation, perhaps a different config item is more appropriate:
>
>   APT::Acquire::Description en;

This will not work: apt uses an md5sum of the short and long description (from the Packages file) to find the right 'translation'. If you remove the long description from the Packages file, apt can't do this task...

If we want to remove the long description from the Packages file, we must change apt in some way and use some other rule for selecting the right description (a new 'Description-md5sum' field, or the version number).

Regards
Grisu
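The lookup problem described above can be sketched in Python. This is only an illustration of the idea, not apt's implementation: the exact bytes apt feeds into the hash are an assumption here, and the names `description_md5`, `find_translation` and the sample texts are hypothetical.

```python
import hashlib


def description_md5(description: str) -> str:
    """Hash the full description text (short line plus long text).

    Assumption: the key is an MD5 over the whole description; the exact
    byte layout apt uses is not taken from the thread.
    """
    return hashlib.md5(description.encode("utf-8")).hexdigest()


def find_translation(description: str, translations: dict):
    """Return the translated text keyed by the description hash, or None."""
    return translations.get(description_md5(description))


# Hypothetical sample data for illustration only.
FULL = "page formatter\n Converts many file types to PostScript.\n"
SHORT_ONLY = "page formatter\n"
TRANSLATIONS = {description_md5(FULL): "Seitenformatierer ..."}

# With the long description present in Packages, the key can be computed.
found = find_translation(FULL, TRANSLATIONS)

# Once the long text is removed, the recomputed key no longer matches.
missing = find_translation(SHORT_ONLY, TRANSLATIONS)

# The suggested fix: ship the hash as an explicit field and look it up directly.
stored_field = description_md5(FULL)
found_via_field = TRANSLATIONS.get(stored_field)
```

Carrying the hash as its own field keeps the lookup key stable even when the long text no longer lives in the Packages file, which is the point of the 'Description-md5sum' suggestion.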
Re: Re: Extended descriptions size (was Re: RFC: Better formatting for long descriptions)
Neil Williams wrote:

> On Fri, 20 Mar 2009 19:15:00 -0400 Filipus Klutiero chea...@gmail.com wrote:
> [...]
> > > What about a way of having a really long, detailed, nicely formatted description on packages.debian.org but a much shorter, more basic version in the Packages.gz file?
> >
> > The extended description needs to be available to APT
>
> Only for use by apt-search, the rest of apt doesn't care about it. apt understands debtags, why duplicate that information? (Frontends can be adapted or just rely on apt-cache search underneath.)

I don't understand what you mean. Where would apt-cache get the extended description from? Again, debtags is not mature enough yet to shrink descriptions.

> > , not only via packages.d.o. I seem to remember that Mandrake Linux (or some other RPM-based distribution) used two Packages-like files, a fat one about 5 times our Packages and a slim one about a fifth of Debian's Packages. I remember finding the slim index cool, but now that there's Packages.diff, I think that developing Mandrake-like Packages files and seeing the results in, perhaps, 2 years, would not benefit much to the kind of hardware Debian will run on by then.
>
> Debian is not exclusively for power-hungry servers and mega-powerful workstations, Debian also runs on very small hardware and not necessarily old stuff either. It is a mistake to think that Debian should require more and more powerful hardware for the basic system.

Actually, I was only saying that I thought such a reduction of the hardware requirements would not help much.

> Yes, there is software in Debian that needs a powerful machine, there is also a LOT of software in Debian specifically designed for low resource machines where the benefits of a 1Mb Packages.gz file are appreciable.

I agree, after reading Paul's comment, that if we get a Translations-en file via DDTP, removing the extended description from Packages would be less work, and thus more interesting. I tested the gain with awk '$0 !~ /^(Description| )/' and the result loses close to half of its compressed size (about 43%, going by the file sizes below):

  -rw-r--r-- 1 chealer chealer 4224356 mar 21 20:12 nodesc.tar.gz
  -rw-r--r-- 1 chealer chealer 7350583 mar 21 15:56 debian.savoirfairelinux.net_debian_dists_testing_main_binary-i386_Packages.tar.gz
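For readers who want to repeat the measurement without awk, here is a rough Python equivalent of the filter above. It is a sketch: `strip_descriptions` and the two-stanza `SAMPLE` (including its version number) are made up for illustration, and like the awk expression it drops every line starting with a space, so continuation lines of other multi-line fields would be removed as well.

```python
def strip_descriptions(packages_text: str) -> str:
    """Drop Description fields and all space-indented continuation lines,
    mirroring awk '$0 !~ /^(Description| )/'."""
    kept = [
        line
        for line in packages_text.splitlines()
        if not (line.startswith("Description") or line.startswith(" "))
    ]
    return "\n".join(kept) + "\n"


# Hypothetical two-stanza sample, not real archive data.
SAMPLE = (
    "Package: a2ps\n"
    "Version: 1.0\n"
    "Description: GNU a2ps\n"
    " Long text line one.\n"
    " .\n"
    "  - various encodings\n"
    "\n"
    "Package: acerhk-source\n"
    "Description: Acer hotkeys\n"
    "   * controlling LEDs\n"
)

stripped = strip_descriptions(SAMPLE)

# Relative saving computed from the two compressed sizes quoted in the thread.
reported_saving = 1 - 4224356 / 7350583
```

The saving works out to roughly 0.425, which is where the "close to half" figure comes from.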
Re: RFC: Better formatting for long descriptions
Andreas Tille til...@rki.de wrote [2009.03.20.1445 +0100]:

> I tried to find clear advice on how to reasonably format lists inside long descriptions of packages. The only thing I know is that lines with two leading spaces are considered verbatim. This leaves a lot of freedom to simulate for instance itemize lists. I'd like to give some examples for package names starting with 'a' and stopped with the first package names of 'b'. If you are bored by these examples continue reading below the -- line.

What we really should do, instead of clinging to the NIH behaviour, reinventing the wheel, and polishing it over and over again, is ditch the pseudo-RFC822 format we have and use YAML instead.

http://www.yaml.org/start.html
http://yaml.org/spec/1.2/

--
martin f. krafft madd...@d.o
proud Debian developer
http://people.debian.org/~madduck
Related projects: http://debiansystem.info http://vcs-pkg.org
Debian - when you have better things to do than fixing systems

to improve the style means to improve the thought. - friedrich nietzsche
Re: RFC: Better formatting for long descriptions
On Fri, 20 Mar 2009, martin f krafft wrote:

> What we really should do, instead of clinging to the NIH-behaviour, reinventing the wheel, and polishing it over and over again is ditch the pseudo-RFC822 format we have and use Yaml instead.
>
> http://www.yaml.org/start.html
> http://yaml.org/spec/1.2/

And most probably somebody else will revive the switch to XML suggestion. I know the pros and cons for different formats but I want a solution *now* and that's the reason why I wrote:

  2. Does not break any existing tool

Kind regards

Andreas.

-- http://fam-tille.de
Re: RFC: Better formatting for long descriptions
On Fri, Mar 20, 2009 at 02:45:09PM +0100, Andreas Tille wrote:

> 1. Itemize lists: (li)
> 2. Enumerate lists: (ol)
> 3. Description lists: (dl)
>
> This suggestion is far from complete and should be enhanced.

Well, not sure this should be over-engineered; I guess itemize lists already cover most of the cases (most enumerations could probably be changed to itemizations I guess).

So a +1 from me.

Michael
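As a rough illustration of what a web frontend could do once itemize lists follow one convention, the sketch below turns long-description lines into HTML. It is not code from packages.debian.org; `render_description`, the accepted markers (-, *, +) and the sample lines are assumptions, and item continuation lines are deliberately not handled.

```python
import html
import re

# Assumed convention: indentation, one marker character, then the item text.
ITEM = re.compile(r"^\s+[-*+]\s+(.*)$")


def render_description(lines):
    """Render long-description lines as a list of HTML fragments."""
    out = []
    in_list = False
    for line in lines:
        match = ITEM.match(line)
        if match:
            if not in_list:
                out.append("<ul>")
                in_list = True
            out.append(f"<li>{html.escape(match.group(1))}</li>")
            continue
        if in_list:
            out.append("</ul>")
            in_list = False
        text = line.strip()
        # A lone "." is the blank-line marker in long descriptions; skip it.
        if text and text != ".":
            out.append(f"<p>{html.escape(text)}</p>")
    if in_list:
        out.append("</ul>")
    return out


rendered = render_description([
    " Features:",
    "  - various encodings",
    "   * controlling LEDs",
    " .",
    " More text.",
])
```

Only itemize lists are covered here, in line with the remark that they handle most cases; enumerations and description lists would need their own patterns.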
Re: RFC: Better formatting for long descriptions
On Fri, 20 Mar 2009 14:45:09 +0100 (CET) Andreas Tille til...@rki.de wrote:

> I tried to find clear advice on how to reasonably format lists inside long descriptions of packages. The only thing I know is that lines with two leading spaces are considered verbatim.

Packages.gz is already 26Mb - I'd like to find ways to shorten the package descriptions, not lengthen it. :-(

> This leaves a lot of freedom to simulate for instance itemize lists. I'd like to give some examples for package names starting with 'a' and stopped with the first package names of 'b'. If you are bored by these examples continue reading below the -- line.
>
> I think we should try to implement some stricter formatting rules for our long descriptions.

Maybe starting with a way to provide extra long descriptions by some means *other* than Packages.gz - which in turn means maintainers deciding which bits of the long description *really* need to be visible before download and which can wait until the user has decided to download the package.

Can the long description be trimmed to only such data necessary to identify the package compared to similar packages? We have debtags for lots of other facets of a package description, maybe it is time that the long description itself is trimmed so that it does not repeat any information already encoded as debtags?

> The rationale behind this is that with some better standard formatting some tools which display descriptions on web pages might be enhanced to use li, ol and dl tags which finally makes a better reading.

Oh no, please don't let Packages.gz get to 40Mb or 50Mb or more. There has to be a limit somewhere.

What about a way of having a really long, detailed, nicely formatted description on packages.debian.org but a much shorter, more basic version in the Packages.gz file?

> This suggestion is far from complete and should be enhanced.

I think the entire suggestion should be redirected away from the Packages.gz file.

--
Neil Williams
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
Re: RFC: Better formatting for long descriptions
On Fri, 2009-03-20 at 19:03 +, Neil Williams wrote:

> On Fri, 20 Mar 2009 14:45:09 +0100 (CET) Andreas Tille til...@rki.de wrote:
> > I tried to find clear advice on how to reasonably format lists inside long descriptions of packages. The only thing I know is that lines with two leading spaces are considered verbatim.
>
> Packages.gz is already 26Mb - I'd like to find ways to shorten the package descriptions, not lengthen it. :-(

Yeah, I'm sure being consistent about whether we use 2 or 3 spaces for indented lists in descriptions is going to make that file a lot harder to compress.

Cheers,
Julien
Re: RFC: Better formatting for long descriptions
Neil Williams wrote:

> On Fri, 20 Mar 2009 14:45:09 +0100 (CET) Andreas Tille til...@rki.de wrote:
> > I tried to find clear advice on how to reasonably format lists inside long descriptions of packages. The only thing I know is that lines with two leading spaces are considered verbatim.
>
> Packages.gz is already 26Mb - I'd like to find ways to shorten the package descriptions, not lengthen it. :-(

AFAICS he's not talking about lengthening the descriptions at all, but about standardizing the way lists are formatted in long descriptions. That is, formalizing whether we should be using 2 or 3 spaces, dashes or plus signs for items in the lists...

Cheers,
Emilio
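Standardising on one indentation and one marker is a mechanical change, which a small script could apply across descriptions. The Python sketch below assumes a target style of two spaces and a dash purely for illustration; the thread has not settled on any style, and `normalize_list_line` is a hypothetical helper.

```python
import re

# Matches an indented list item that starts with -, * or +.
LIST_LINE = re.compile(r"^\s+[-*+]\s+(\S.*)$")


def normalize_list_line(line: str, indent: int = 2, marker: str = "-") -> str:
    """Rewrite a list item to the chosen indent and marker.

    Lines that do not look like list items are returned unchanged.
    """
    match = LIST_LINE.match(line)
    if not match:
        return line
    return " " * indent + marker + " " + match.group(1)


three_spaces = normalize_list_line("   * enable/disable wireless hardware")
already_ok = normalize_list_line("  - various medias,")
plain_text = normalize_list_line(" Plain paragraph text.")
```

Since only whitespace and the marker character change, such a rewrite would alter each item by a few bytes at most.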
Re: RFC: Better formatting for long descriptions
On Fri, 20 Mar 2009 20:08:43 +0100 Julien Cristau jcris...@debian.org wrote:

> On Fri, 2009-03-20 at 19:03 +, Neil Williams wrote:
> > On Fri, 20 Mar 2009 14:45:09 +0100 (CET) Andreas Tille til...@rki.de wrote:
> > > I tried to find clear advice on how to reasonably format lists inside long descriptions of packages. The only thing I know is that lines with two leading spaces are considered verbatim.
> >
> > Packages.gz is already 26Mb - I'd like to find ways to shorten the package descriptions, not lengthen it. :-(
>
> Yeah, I'm sure being consistent about whether we use 2 or 3 spaces for indented lists in descriptions is going to make that file a lot harder to compress.

I'd like to get the longest descriptions out of Packages.gz completely, so encouraging their retention is not ideal. It's not about whether 2 or 3 spaces should be used, it's about whether such detailed content deserves to be in Packages.gz in the first place.

If there is going to be discussion on standardising on some form of indentation, it's worth considering whether there isn't a better way of providing the data itself to achieve other benefits. Indents would need changes in all affected packages - it might be easier to provide a different means that also reduces the size of the Packages.gz file at the same time so that packages only need to be changed once.

My comment for this RFC is, therefore, that better formatting for long descriptions should include a review of whether the long description deserves to be that long in the first place, whether the long description merely duplicates data already available via debtags and whether the long description should be trimmed for the package in question *as well as* standardising the formatting of what remains.

Better can be construed to mean more - I merely want maintainers to consider whether better actually means less.

--
Neil Williams
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
Re: RFC: Better formatting for long descriptions
On Fri, Mar 20, 2009 at 07:20:43PM +, Neil Williams wrote:

> I'd like to get the longest descriptions out of Packages.gz completely, so encouraging their retention is not ideal. It's not about whether 2 or 3 spaces should be used, it's about whether such detailed content deserves to be in Packages.gz in the first place.

Then I wonder why you hijacked this thread and did not rather start a new one?

Michael
Extended descriptions size (was Re: RFC: Better formatting for long descriptions)
On Fri, 20 Mar 2009 14:45:09 +0100 (CET) Andreas Tille til...@rki.de wrote:

> > I tried to find clear advice on how to reasonably format lists inside long descriptions of packages. The only thing I know is that lines with two leading spaces are considered verbatim.
>
> Packages.gz is already 26Mb - I'd like to find ways to shorten the package descriptions, not lengthen it. :-(

Current squeeze main Packages.gz is 7 MB: http://ftp.ca.debian.org/debian/dists/squeeze/main/binary-i386/

> Can the long description be trimmed to only such data necessary to identify the package compared to similar packages? We have debtags for lots of other facets of a package description, maybe it is time that the long description itself is trimmed so that it does not repeat any information already encoded as debtags?

debtags is not yet at a stage where this should be done (for one thing, Synaptic, for example, does not support debtags). Even if it would be possible, I doubt this would help much.

> > The rationale behind this is that with some better standard formatting some tools which display descriptions on web pages might be enhanced to use li, ol and dl tags which finally makes a better reading.
>
> Oh no, please don't let Packages.gz get to 40Mb or 50Mb or more. There has to be a limit somewhere.

I don't understand the proposal as something affecting the size of Packages significantly.

> What about a way of having a really long, detailed, nicely formatted description on packages.debian.org but a much shorter, more basic version in the Packages.gz file?

The extended description needs to be available to APT, not only via packages.d.o. I seem to remember that Mandrake Linux (or some other RPM-based distribution) used two Packages-like files, a fat one about 5 times our Packages and a slim one about a fifth of Debian's Packages. I remember finding the slim index cool, but now that there's Packages.diff, I think that developing Mandrake-like Packages files and seeing the results in, perhaps, 2 years, would not benefit much to the kind of hardware Debian will run on by then.
Re: Re: RFC: Better formatting for long descriptions
> On Fri, 20 Mar 2009, martin f krafft wrote:
> > What we really should do, instead of clinging to the NIH-behaviour, reinventing the wheel, and polishing it over and over again is ditch the pseudo-RFC822 format we have and use Yaml instead.
> >
> > http://www.yaml.org/start.html
> > http://yaml.org/spec/1.2/
>
> And most probably somebody else will revive the switch to XML suggestion. I know the pros and cons for different formats but I want a solution *now* and that's the reason why I wrote:
>
>   2. Does not break any existing tool

I tend to agree with Martin. Do you have a particular reason that makes this change urgent? At worst, a format for extended descriptions could be usable by Debian 7. While checking whether packages.debian.org rendered the current descriptions decently, I noticed that acidlab's description is rendered pretty badly, but AFAICS that's just a packages.d.o bug. FWIW, I had never noticed such an issue.

> Kind regards
> Andreas.
Re: Extended descriptions size (was Re: RFC: Better formatting for long descriptions)
On Sat, Mar 21, 2009 at 8:15 AM, Filipus Klutiero chea...@gmail.com wrote:

> The extended description needs to be available to APT, not only via packages.d.o.

I agree with Neil Williams's comment in the other thread about removing long descriptions from the Packages files. I think the obvious place to put them is in dists/unstable/main/i18n/Translations-en (or C) like the descriptions from DDTP.

--
bye, pabs
http://wiki.debian.org/PaulWise