Re: RFC: Better formatting for long descriptions

2009-05-13 Thread Morten Kjeldgaard
I haven't read the whole long thread, so perhaps this has been mentioned
by someone else. Python has recently decided to convert their
documentation to reStructuredText [1]. It would make a lot of sense for
Debian to use that de-facto standard (or some subset of it) for text
typesetting in the long descriptions, rather than re-inventing the wheel.

Cheers,
Morten

[1] http://docutils.sourceforge.net/rst.html


-- 
To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org



Re: RFC: Better formatting for long descriptions

2009-05-13 Thread Andreas Tille

On Wed, 13 May 2009, Morten Kjeldgaard wrote:


I haven't read the whole long thread, so perhaps this has been mentioned
by someone else. Python has recently decided to convert their
documentation to reStructuredText [1]. It would make a lot of sense for
Debian to use that de-facto standard (or some subset of it) for text
typesetting in the long descriptions, rather than re-inventing the wheel.


My understanding is that we currently have the choice between Markdown and
reStructuredText and from my point of view the right place to continue
the discussion is

 http://lists.debian.org/debian-devel/2009/04/msg01132.html

As I understand it, the decision is to use a formatting library, and I tried
to compare two potential candidates.  Further investigation should be
done - preferably by people who really know these libraries.  My time for
such things is currently limited.

Kind regards

   Andreas.

--
http://fam-tille.de





Re: RFC: Better formatting for long descriptions

2009-04-18 Thread Vincent Danjean
Andreas Tille wrote:
 But what exactly do I have to do to get the item lists marked?

Remove the first space, remove the '.' that are alone on their line,
add a blank line before enumeration (this last point seems the more
annoying to me: it can be difficult to automatically find where to
insert a blank line).

 grep-available -s Description -F Package airport-utils | markdown

grep-aptavail -s Description -F Package airport-utils | sed -e 's/^ \(.$\)\?//' -e '/: *$/a\\
' | markdown
<p>Description: configuration and management utilities for Apple AirPort base
stations
This package contains various utilities to manage the Apple AirPort base
stations.</p>

<p>Be aware that Apple released several versions of the AirPort base station;
the original AirPort (Graphite) was a rebranded Lucent RG-1000 base
station, doing 802.11a/b. The AirPort Extreme (Snow) is an Apple-built
802.11a/b/g base station.</p>

<p>For the original Apple AirPort and the Lucent RG-1000 base stations only:</p>

<ul>
<li>airport-config: base station configurator</li>
<li>airport-linkmon: wireless link monitor, gives information on the wireless
link quality between the base station and the associated hosts</li>
</ul>

<p>For the Apple AirPort Extreme base stations only:</p>

<ul>
<li>airport2-config: base station configurator</li>
<li>airport2-portinspector: port maps monitor</li>
<li>airport2-ipinspector: WAN interface monitoring utility</li>
</ul>

<p>For all:</p>

<ul>
<li>airport-modem: modem control utility, displays modem state, starts/stops
modem connections, displays the approximate connection time (Extreme only)
<ul>
<li>airport-hostmon: wireless hosts monitor, lists wireless hosts connected
to the base station (see airport2-portinspector for the Snow)</li>
</ul></li>
</ul>


  Regards,
Vincent

-- 
Vincent Danjean   GPG key ID 0x9D025E87 vdanj...@debian.org
GPG key fingerprint: FC95 08A6 854D DB48 4B9A  8A94 0BF7 7867 9D02 5E87
Unofficial packages: http://moais.imag.fr/membres/vincent.danjean/deb.html
APT repo:  deb http://perso.debian.org/~vdanjean/debian unstable main





Re: RFC: Better formatting for long descriptions

2009-04-18 Thread Ben Finney
Peter Pentchev r...@ringlet.net writes:

 Just as a kind of clarification: Manoj, I think that Giacomo's
 comments were only to the *last* item of the text he quoted, not to
 the whole portion above it :) Thus, IMHO his first "really needed?"
 question referred specifically to the "ordered lists" item, and the "I
 don't think they are needed" referred specifically to the underlines
 and strike-throughs, not to the emphasis, strong emphasis, etc.

Traps for new players: One must remember to trim irrelevant quoted
material so it's clear what the context of one's responses is.

-- 
 \“You can't have everything; where would you put it?” —Steven |
  `\Wright |
_o__)  |
Ben Finney




Re: RFC: Better formatting for long descriptions

2009-04-18 Thread Andreas Tille

On Sat, 18 Apr 2009, Vincent Danjean wrote:


Remove the first space, remove the '.' that are alone on their line,


That's cheap.


add a blank line before enumeration (this last point seems the more
annoying to me: it can be difficult to automatically find where to
insert a blank line).


Well - here is the crux which lets me wonder whether Manoj was
right in his posting[1] when he claimed:


 If you make a suggestion please answer the following question:

   A. Does the suggestion enable parsing logical structures like
  two level itemize lists?
  (This is what I want to approach and what is IMHO needed)

Markdown and ReST, trivially.

   B. Does the suggestion enable keeping the majority of description
  untouched and enables keeping the currently existing tools?
  (This is important to gain any acceptance)

Yes, for both.


It is neither trivial to detect the point where to add the needed
blank line nor would it be a solution to advise people always to
enclose lists in blank lines because people will tell you that
this will look ugly in the existing interfaces.  So I would rather
tend to No for both and this is the crux here.

So while I perfectly agree with Manoj that voting on technical
decisions is a bad idea I come back to my initial suggestion because
my suggestions are technically equivalent but express a matter of
taste of the developers which might lead to better acceptance.

I would love it if somebody could provide a proof that I'm wrong and
there is a reliable way to turn long descriptions into proper markdown
input to *really* be able to detect the lists.  If not, I think I
will continue with my intention as described. [2]

Kind regards

Andreas.


[1] http://lists.debian.org/debian-devel/2009/04/msg00652.html
[2] http://lists.debian.org/debian-devel/2009/04/msg00643.html

--
http://fam-tille.de





Re: RFC: Better formatting for long descriptions

2009-04-18 Thread Manoj Srivastava
On Sat, Apr 18 2009, Andreas Tille wrote:

 On Sat, 18 Apr 2009, Vincent Danjean wrote:

 Remove the first space, remove the '.' that are alone on their line,

 That's cheap.

 add a blank line before enumeration (this last point seems the more
 annoying to me: it can be difficult to automatically find where to
 insert a blank line).

 Well - here is the crux which let's me wonder whether Manoj was
 right in his posting[1] when he claimed:

  If you make a suggestion please answer the following question:
 
A. Does the suggestion enable parsing logical structures like
   two level itemize lists?
   (This is what I want to approach and what is IMHO needed)

 Markdown and ReST, trivially.

B. Does the suggestion enable keeping the majority of description
   untouched and enables keeping the currently existing tools?
   (This is important to gain any acceptance)

 Yes, for both.

 It is neither trivial to detect the point where to add the needed
 blank line nor would it be a solution to advise people alwasy to

Actually, it is pretty trivial. It is a second chapter
 exercise in K&R; it is a first month exercise in computer science 101.

Here is an algorithm:
--8---cut here---start-8---
 we are not in a list
 while reading each line, do
   remove leading space
   if the only non white space character on the line is a single .
     remove the dot
   if the line matches the regexp: '^\s+[\*\+\-]\s+'
     if we are not in a list
       emit blank line first
       record we are not in a list
   else
     if we are in a list
       record we are not in a list
   emit line
--8---cut here---end---8---

People who cannot convert this 13 line pseudocode into real
 code should not be writing stuff to pretty print descriptions.
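
For readers who prefer something executable, here is a minimal Python sketch of the pseudocode above. It is an illustration, not code posted in this thread: the function name `preprocess` is hypothetical, and the sketch assumes that the state is set to "in a list" once the blank line has been emitted.

```python
import re

# Same pattern as in the pseudocode: indent, a bullet character, whitespace.
BULLET = re.compile(r"^\s+[*+\-]\s+")

def preprocess(lines):
    """Yield description lines with a blank line inserted before each list."""
    in_list = False
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(" "):
            line = line[1:]          # remove the mandatory leading space
        if line.strip() == ".":
            line = ""                # a lone '.' marks a paragraph break
        if BULLET.match(line):
            if not in_list:
                yield ""             # Markdown needs a blank line before a list
            in_list = True           # assumption: record that we ARE in a list
        else:
            in_list = False
        yield line
```

As in the pseudocode, any line that is not a bullet (including a wrapped continuation line of a list item) ends the list state, so the next bullet gets another blank line in front of it.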

 enclose lists in blank lines because people will tell you that
 this will look ugly in the existing interfaces.  So I would rather
 tend to No for both and this is the crux here.

Frankly, I think this is very wrong.


 So while I perfectly agree with Manoj that voting on technical
 decisions is a bad idea I come back to my initial suggestion because
 my suggestions are technically equivalent but express a matter of
 taste of the developers which might lead to better acceptance.

 I would love if somebody could provide a proof that I'm wrong and
 there is a reliable way to turn long descriptions into proper markdown
 input to *really* be able to detect the lists.  If not I think I
 continue with my intention as described. [2]

Is the above algorithm proof enough for you? Or do I have to
 write that into real code in your favourite programming language
 before you can see it?

manoj
-- 
The minority is always right. Henrik Ibsen 1828-1906
Manoj Srivastava sriva...@debian.org http://www.debian.org/~srivasta/  
1024D/BF24424C print 4966 F272 D093 B493 410B  924B 21BA DABB BF24 424C





Re: RFC: Better formatting for long descriptions

2009-04-18 Thread Andreas Tille

On Sat, 18 Apr 2009, Manoj Srivastava wrote:


   Here is an algorithm:
--8---cut here---start-8---
we are not in a list
while reading each line, do
  remove leading space
  if the only non white space character on the line is a single .
    remove the dot
  if the line matches the regexp: '^\s+[\*\+\-]\s+'
    if we are not in a list
      emit blank line first
      record we are not in a list
  else
    if we are in a list
      record we are not in a list
  emit line
--8---cut here---end---8---

   People who can not convert this 13 line Psuedocode into a real
code should not be writing stuff to pretty print descriptions.


Thanks for the trust in the programming skills of your fellow
developers.  You are obviously able to write the code to detect
a list *without* using a library.  Wasn't it you who told me we
should use a library to *avoid* inventing our own code?  So if
you have this code, which works perfectly on the input I have been
suggesting for two weeks, why do you want to add an additional library
on top of it?  I am getting a little tired of this discussion, which
is running in circles and starting to become personal without any
real reason (I hope I did not give any), and which finally leads to
nothing (at least this is my impression).


enclose lists in blank lines because people will tell you that
this will look ugly in the existing interfaces.  So I would rather
tend to No for both and this is the crux here.


   Frankly, I think this is very wrong.


The solution does not work without the code you wrote above.  But you
need this code anyway to detect lists in the long descriptions and so
I wonder where the real profit of an additional library is.


   Is the above algorithm proof enough for you? Or do I have to
write that into real code in your favourite porogramming language
before you can see it?


I hope you would not code the bug in line no. 9.

What you basically tried to prove is that you are keen on teaching your
fellow developers programming.  Your time would be much better spent if
you brought the effort forward to finally reach a consensus on how we
should change best practices for debian/control to enable the parsing
of lists.  The suggestions I presented [1] are not in contrast to markdown,
and what you finally use for the description parsing tools -
the algorithm above or a library on top of it - does not matter at all
if we agree on some simple standard.

It would be really helpful if you would return to the constructive way
of discussion I observed in former times instead of blurring the issue
with distracting discussions.

Kind regards

  Andreas.

[1] http://lists.debian.org/debian-devel/2009/04/msg00643.html


--
http://fam-tille.de





Re: RFC: Better formatting for long descriptions

2009-04-18 Thread Manoj Srivastava
On Sat, Apr 18 2009, Andreas Tille wrote:

 On Sat, 18 Apr 2009, Manoj Srivastava wrote:

Here is an algorithm:
 --8---cut here---start-8---
 we are not in a list
 while reading each line, do
   remove leading space
   if the only non white space character on the line is a single .
     remove the dot
   if the line matches the regexp: '^\s+[\*\+\-]\s+'
     if we are not in a list
       emit blank line first
       record we are not in a list
       s/not//
   else
     if we are in a list
       record we are not in a list
   emit line
 --8---cut here---end---8---

People who can not convert this 13 line Psuedocode into a real
 code should not be writing stuff to pretty print descriptions.

 Thanks for the trust in the programming skills of your fellow
 developers.  You obviosely are able to write the code to detect
 a list *without* using a library.  Wasn't it you who told me we
 should use a library to *avoid* inventing our own code?  So if
 you have this code which works perfectly on the input I'm
 suggesting since two weeks why you want to add an additional library
 on top of this.  I feel a little bit bored by this discussion which
 is running several circles starts to become personal without any
 real reason (I hope I did not gave any) and finally leads to nothing
 (at least this is my impression).

Frankly, I have no idea where this trade is going.

With a 6 line pre-processor, you can feed the grep-dctrl
 provided Description fields to Markdown. So, it seems like we have come
 somewhere -- we have had one investigation that leads one to believe
 that there is a small fraction of packages using 'o' as a bullet that
 need to be changed, and apart from that fewer than 50 packages
 are affected (if we want to specify markdown as the markup language for
 descriptions -- and these are the ones where we have some unwanted
 emphasis, a non-fatal result).

There is a mechanism to pre-process  the description for
 markdown (Perl implementation below). What more is needed for you to
 think this is leading somewhere?

 enclose lists in blank lines because people will tell you that
 this will look ugly in the existing interfaces.  So I would rather
 tend to No for both and this is the crux here.

Frankly, I think this is very wrong.

 The solution does not work without the code you wrote above.  But you
 need this code anyway to detect lists in the long descriptions and so
 I wonder where the real profit of an additional library is.

*Sigh*.

All I am doing with the code is inserting a line before the
 lists. I am not generating html. I am also not handling the _other_
 markup that markdown handles, that I presented as something that will
 make the description more readable too. The markdown library does all
 the heavy lifting for the html generation. If you think my little perl
 snippet is the equivalent of what markdown does, you have not looked
 at markdown.

I am not re-inventing the wheel when it comes to markup
 languages. 

We know we needed _some_ pre-processing because we have the
 paragraphs separated by ' .', but the code is pretty minimal.

--8---cut here---start-8---
 my $in=0;
 while(<>) {
  chomp;  s/^ //g;  s/^\.\s*$//;   # strip leading space; blank out a lone dot
  if(/^\s+[\*\+\-]\s+/) { print "\n" unless $in++;}   # blank line before a list
  else  { $in=0; }
  print "$_\n"
 }
--8---cut here---end---8---

manoj

ps: This can easily become a shell function.

__ grep-aptavail -s Description -P airport-utils | perl -e '
 my $in=0;
 while(<>) {
  chomp;  s/^ //g;  s/^\.\s*$//;
  if(/^\s+[\*\+\-]\s+/) { print "\n" unless $in++;}
  else{ $in=0; }
  print "$_\n"
}' | markdown
<p>Description: configuration and management utilities for Apple AirPort base
stations
This package contains various utilities to manage the Apple AirPort base
stations.</p>

<p>Be aware that Apple released several versions of the AirPort base station;
the original AirPort (Graphite) was a rebranded Lucent RG-1000 base
station, doing 802.11a/b. The AirPort Extreme (Snow) is an Apple-built
802.11a/b/g base station.</p>

<p>For the original Apple AirPort and the Lucent RG-1000 base stations only:</p>

<ul>
<li>airport-config: base station configurator</li>
<li>airport-linkmon: wireless link monitor, gives information on the wireless
link quality between the base station and the associated hosts</li>
</ul>

<p>For the Apple AirPort Extreme base stations only:</p>

<ul>
<li>airport2-config: base station configurator</li>
<li>airport2-portinspector: port maps monitor</li>
<li>airport2-ipinspector: WAN interface monitoring utility</li>
</ul>

<p>For all:</p>

<ul>
<li><p>airport-modem: modem control utility, displays modem state, starts/stops
modem connections, displays the approximate connection time (Extreme only)</p>

<ul>
<li>airport-hostmon: wireless hosts monitor, lists wireless hosts 

Re: RFC: Better formatting for long descriptions

2009-04-18 Thread Andreas Tille

On Sat, 18 Apr 2009, Manoj Srivastava wrote:


   Frankly, I have no idea where this trade is going.


IMHO the problem is that you assume our suggestions are in contrast to
each other - but they are not.  I wanted to iron out suggestions on how
to format the input in a standardised way.  What will be done afterwards
is the choice of the people who are working with this input.  I don't care
whether they choose markdown, restructured text or just take your
perl code and use <ul> / </ul> instead of the additional blank lines
and wrap the lines in lists in <li> / </li> tags if they need HTML
output.  But this is NOT to be discussed HERE (even if it does no
harm).  The point is that our input should ENABLE this, which needs
a better standardisation of long descriptions.
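
To make the "emit list tags directly" alternative concrete, here is a rough Python sketch. It is only an illustration under assumptions: the function name `to_html` is hypothetical, it handles a single list level, and it treats a wrapped continuation line as ordinary text that closes the list.

```python
import html
import re

# Indent, a bullet character, whitespace, then the item text.
BULLET = re.compile(r"^\s+[*+\-]\s+(.*)$")

def to_html(lines):
    """Return output lines with bullet items wrapped in <ul>/<li> tags."""
    out = []
    in_list = False
    for line in lines:
        if line.startswith(" "):
            line = line[1:]                  # drop the mandatory leading space
        m = BULLET.match(line)
        if m:
            if not in_list:
                out.append("<ul>")           # open the list instead of a blank line
                in_list = True
            out.append("<li>" + html.escape(m.group(1)) + "</li>")
            continue
        if in_list:
            out.append("</ul>")              # first non-bullet line closes the list
            in_list = False
        if line.strip() != ".":              # skip the paragraph separator
            out.append(html.escape(line))
    if in_list:
        out.append("</ul>")
    return out
```

The list detection is the same as in the pre-processor; only the action taken at the list boundaries differs.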

You are one step after this - and your input is welcome - but there
is no contradiction.


   With a 6 line pre-processor, you can feed the grep-dctrl
provided Description fields to Markdown.


BTW, your pre-processor will need some additional lines when it comes to
second level lists (and yes, I'm sure this can easily be done - but
this is not, and never was, the point)
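
As a rough idea of what those additional lines might look like, the following Python sketch derives a nesting level from the indentation of each bullet. It is an assumption-laden illustration: the name `list_levels` is hypothetical, and wrapped continuation lines are simply reported as level 0 and left to the caller.

```python
import re

# Capture the indentation in front of the bullet and the item text after it.
BULLET = re.compile(r"^(\s*)[*+\-]\s+(.*)$")

def list_levels(lines):
    """Yield (level, text): level 0 for plain text, 1, 2, ... for list items."""
    indents = []                      # indentation width of each open list level
    for line in lines:
        m = BULLET.match(line)
        if not m:
            if line.strip() in ("", "."):
                indents = []          # a paragraph break closes all open lists
            yield (0, line.strip())
            continue
        width = len(m.group(1))
        while indents and indents[-1] > width:
            indents.pop()             # back out to a shallower list
        if not indents or indents[-1] < width:
            indents.append(width)     # a deeper indent opens a new level
        yield (len(indents), m.group(2))
```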


So, seems like we have come
somewhere -- we have had one investigation that leads one to believe
that there are a small fraction of packages using o as a bullet that
need to be changed, and apart fro that there are less than 50 packages
are affected


Great - let's iron out the advice on how to format long descriptions
in our docs to enable us to write lintian checks and file bug reports.
Manoj, we really reached a point here!


(if we want to specify markdown as the markup language for
descriptions -- and these are the one where we have some unwanted
emphasis, a non-fatal result).


Please let's move this to a separate discussion.  People who are
responsible for packages.debian.org might be interested and adopt
your idea.


   There is a mechanism to pre-process  the description for
markdown (Perl implementation below). What more is needed for you to
think this is leading somewhere?


Did I give the impression that I wanted more?  Honestly, I'd be
interested to know from what part of my mails you are drawing the
conclusion that I need to improve my communication skills.


   All I am doing with the code is inserting a line before the
lists. I am not generating html. I am not also handling the _other_
markup that markdown handles, that I presented as something that will
make the description more readable too. The markdown librarys does all
the heavy lifting fro the html generation. If you think my little perl
snippet is the equivalent for what markdown does, you have not looked
at markdown.


In the whole discussion I was talking about structuring the input
to ENABLE turning it into html (or whatever structured output you need).
You were discussing how to actually *do* the step I just wanted to
provide the precondition for.  I was just saying that if you need a
preprocessor for a library anyway, you could reach a similar result
by tweaking the preprocessor a little bit.  I just do not want to
force any programmer to use markdown (even if it admittedly has
advantages, as I also agreed).  This was a *sidenote* because this
whole processing of the input is just not my point.


   I am not re-inventing the wheel when it comes to markup
languages.


Same for me - or am I writing in delirium???

And your divergence from the original topic just blurs the issue -
would you mind rereading my initial mail? [1] Do you agree that
long descriptions need enhancement or not?


   We know we needed _some_ pre-processing because we have the
paragraphs separated by ' .', but the code is pretty minimal.

--8---cut here---start-8---
my $in=0;
while(<>) {
 chomp;  s/^ //g;  s/^\.\s*$//;
 if(/^\s+[\*\+\-]\s+/) { print "\n" unless $in++;}
 else  { $in=0; }
 print "$_\n"
}
--8---cut here---end---8---

   manoj

ps: This can easily become a shell function.


Again: Please assume for the rest of this thread that I'm not stupid
and know how scripts can be used.


__ grep-aptavail -s Description -P airport-utils | perl -e '
my $in=0;
while(<>) {
 chomp;  s/^ //g;  s/^\.\s*$//;
 if(/^\s+[\*\+\-]\s+/) { print "\n" unless $in++;}
 else{ $in=0; }
 print "$_\n"
}' | markdown
<p>Description: configuration and management utilities for Apple AirPort base
stations
This package contains various utilities to manage the Apple AirPort base
stations.</p>

<p>Be aware that Apple released several versions of the AirPort base station;
the original AirPort (Graphite) was a rebranded Lucent RG-1000 base
station, doing 802.11a/b. The AirPort Extreme (Snow) is an Apple-built
802.11a/b/g base station.</p>

<p>For the original Apple AirPort and the Lucent RG-1000 base stations only:</p>

<ul>
<li>airport-config: base station configurator</li>
<li>airport-linkmon: wireless link monitor, gives information on the wireless
link quality between the base station and the associated 

Re: RFC: Better formatting for long descriptions

2009-04-17 Thread Andreas Tille

On Thu, 16 Apr 2009, Manoj Srivastava wrote:


   Which is good, since Markdown/ReST rules for lists will only
make the lists using o as the bullet out of whack.


Fine.


   None of which are mandatory. All the package descriptions I read
in /var/lib/dpkg/available seems to pass, though a couple had italics
in strange places. This is not a fatal flaw.


No - this perfectly fits my intention that some descriptions have to be fixed.
We just need guidelines for developers to follow.


   I find the descriptions on packages.d.o just fine right now.


IMHO it is no argument that because a specific person is happy with the
layout everybody else is too.


   Just like it is no argument that because someone thinks something is
ugly everyone thinks so too.


 If a text has a certain logic it should be
supported by the means a certain output style has.  HTML can express a
list and so it should if we want to express lists.


Please do not split my paragraphs to blur my arguing.  Thanks.


   Heh. Ever heard of inline answers?


In most cases I manage to ignore this kind of questions.  Try reading my
mail again to find out a reasonable answer to your question yourself.


   I suggest you try it out, before handwaving vague FUD
around. Even tnftp description works fine with either. There are very
few descriptions (about 24 or so) where we might have unwanted
emphasis.  I think we can have that fixed.


But what exactly do I have to do to get the item lists marked?

$ grep-available -s Description -F Package airport-utils | markdown
<p>Description: configuration and management utilities for Apple AirPort base
stations
 This package contains various utilities to manage the Apple AirPort base
 stations.
 .
 Be aware that Apple released several versions of the AirPort base station;
 the original AirPort (Graphite) was a rebranded Lucent RG-1000 base
 station, doing 802.11a/b. The AirPort Extreme (Snow) is an Apple-built
 802.11a/b/g base station.
 .
 For the original Apple AirPort and the Lucent RG-1000 base stations only:
   - airport-config: base station configurator
   - airport-linkmon: wireless link monitor, gives information on the wireless
 link quality between the base station and the associated hosts
 .
 For the Apple AirPort Extreme base stations only:
   - airport2-config: base station configurator
   - airport2-portinspector: port maps monitor
   - airport2-ipinspector: WAN interface monitoring utility
 .
 For all:
  - airport-modem: modem control utility, displays modem state, starts/stops
 modem connections, displays the approximate connection time (Extreme only)
   - airport-hostmon: wireless hosts monitor, lists wireless hosts connected
 to the base station (see airport2-portinspector for the Snow)</p>

$ grep-available -s Description -F Package tnftp | markdown
<p>Description: The enhanced ftp client
 tnftp is what many users affectionately call the enhanced ftp
 client in NetBSD (http://www.netbsd.org).
 .
 This package is a <code>port' of the NetBSD ftp client to other systems.
 .
 The enhancements over the standard ftp client in 4.4BSD include:
* command-line editing within ftp
* command-line fetching of URLS, including support for:
- http proxies (c.f: $http_proxy, $ftp_proxy)
- authentication
* context sensitive command and filename completion
* dynamic progress bar
* IPv6 support (from the WIDE project)
* modification time preservation
* paging of local and remote files, and of directory listings
  (c.f:</code>lpage', <code>page',</code>pdir')
* passive mode support, with fallback to active mode
* <code>set option' override of ftp environment variables
* TIS Firewall Toolkit gate ftp proxy support (c.f:</code>gate')
* transfer-rate throttling (c.f: <code>-T',</code>rate')</p>


   I would simplify the rule, as opposed to having a trivial
library call in the tool. Indeed, reusing the libraries provided is
*less* work for the parser, than a NIH  new parser.


I'm really in favour of reusing a library (and I wonder whether I wrote
anything in contrast to this).  I just fail to see any effect when using
markdown except that the description is now enclosed in <p></p> and
some other markup appears which could be fixed.  But the intended result
of getting list markup is not reached.  Or did I miss something?


   I think we need the emphasis almost as much as we need lists;
and people are already using *word* for emphasis in descriptions
(though not all that many).


I'm not against implementing emphasis which might be also an interesting
enhancement and if it is a small amount of packages which need to be
fixed these most probably need to be fixed in plain text anyway.  So if
you enlighten me how the lists could work I'm perfectly happy.

Kind regards

Andreas.

--
http://fam-tille.de





Re: RFC: Better formatting for long descriptions

2009-04-17 Thread Peter Pentchev
On Thu, Apr 16, 2009 at 03:01:30PM -0500, Manoj Srivastava wrote:
 On Thu, Apr 16 2009, Giacomo Catenazzi wrote:
 
  Manoj Srivastava wrote:
   - Ability to recognize and render the following logical entities, in
 decreasing order of importance:
 + unordered lists
 + ordered lists
 
  really needed?
 
 I would think these are the guts of this proposal. Or else what
  are we discussing here?
 
 
 + emphasis
 + strong emphasis
 + definition lists
 + hypertext links
 + underlines, and strike throughs
 
  I don't think they are needed.
 
 Why not? If rendering a description in a manner that makes it
  easier to read is the goal, I fail to see why emphasis and strong
  emphasis is a bad idea (think of text-to-speech mechanisms). This is
  not just opinions we are discussing here, we should be looking at use
  cases for marking up a textual description.
 
  Underlines is generally bad, strike throughs are worse ;-)
 
 So you say. Don't use them, then. There are cases where either
  one of these constructs have value; and you should not impose your
  personal aesthetics on a general policy discussion.

Just as a kind of clarification: Manoj, I think that Giacomo's comments
were only to the *last* item of the text he quoted, not to the whole
portion above it :)  Thus, IMHO his first "really needed?" question
referred specifically to the "ordered lists" item, and the "I don't think
they are needed" referred specifically to the underlines and
strike-throughs, not to the emphasis, strong emphasis, etc.

G'luck,
Peter

-- 
Peter Pentchev  r...@ringlet.netr...@space.bgr...@freebsd.org
PGP key:http://people.FreeBSD.org/~roam/roam.key.asc
Key fingerprint FDBA FD79 C26F 3C51 C95E  DF9E ED18 B68D 1619 4553
If this sentence didn't exist, somebody would have invented it.




Re: RFC: Better formatting for long descriptions

2009-04-17 Thread Andreas Tille

On Thu, 16 Apr 2009, Manoj Srivastava wrote:


Manoj Srivastava wrote:

 - Ability to recognize and render the following logical entities, in
   decreasing order of importance:
   + unordered lists
   + ordered lists


really needed?


   I would think these are the guts of this proposal. Or else what
are we discussing here?


In this thread it was mentioned that ordered lists are not really needed.
Despite this opinion they are actually *used* and thus there seems to be
some need.

Another thing that is actually used is description lists (which are
IMHO needed more than ordered lists) but if we at least get the two
above working there is a big win.


   + emphasis
   + strong emphasis
   + definition lists
   + hypertext links
   + underlines, and strike throughs


I don't think they are needed.


   Why not? If rendering a description in a manner that makes it
easier to read is the goal, I fail to see why emphasis and strong
emphasis is a bad idea (think of text-to-speech mechanisms). This is
not just opinions we are discussing here, we should be looking at use
cases for marking up a textual description.


As Peter Pentchev wrote in his mail I think what is not needed is
underlines, and strike throughs - but I would not forcibly restrict
the use if the lib we decide to use provides this feature.

Kind regards

  Andreas.

--
http://fam-tille.de





Re: RFC: Better formatting for long descriptions

2009-04-16 Thread Andreas Tille

On Thu, 16 Apr 2009, Guillem Jover wrote:


,-- count-bullet-chars.sh --
#!/bin/sh
lists=/var/lib/apt/lists/*_sid_main_*_Packages
total=`grep "^ *[-+\*o] " $lists | wc -l`
for tag in '\*' - + o; do
  items=`grep "^ *$tag " $lists | wc -l`
  percent=`echo "scale=4; $items / $total * 100" | bc`
  echo "Tag $tag was used $items times ($percent%)"
done
`--

Tag \* was used 9277 times (68.0900%)
Tag - was used 3837 times (28.1600%)
Tag + was used 120 times (.8800%)
Tag o was used 390 times (2.8600%)
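
The same count can be sketched in Python. This is an illustration rather than the script used for the numbers above: the function names are hypothetical, and the caller is assumed to supply the lines of the Packages files.

```python
import re
from collections import Counter

# Optional indentation, one candidate bullet character, then a space.
BULLET = re.compile(r"^ *([-+*o]) ")

def count_bullets(lines):
    """Count how often each candidate bullet character starts a line."""
    counts = Counter()
    for line in lines:
        m = BULLET.match(line)
        if m:
            counts[m.group(1)] += 1
    return counts

def report(counts):
    """Format the counts as one summary line per bullet character."""
    total = sum(counts.values())
    rows = []
    for tag in "*-+o":
        share = 100 * counts[tag] / total if total else 0.0
        rows.append(f"Tag {tag} was used {counts[tag]} times ({share:.2f}%)")
    return rows
```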


Regardless of the numbers though (which have moved lately slightly in
favour of '-' due to the recommendations from the Smith reviewing
project),


I have not found any recommendation regarding this at the SRP Wiki page [1].
I vaguely remember that this Smith project was initially driven by a French
guy who might try to push a French habit into the English world. ;-)
Do you have any link to those recommendations, which perhaps should be fixed
in the first place?  IMHO the Smith Review Project would be the first place
where we could start a kind of standardisation of this issue - it seems there
is no stronger place to move this suggestion to.


I've always found the asterisk the obvious character to use
for bulleted lists, as it's the one most resembling a bullet, and
it's the one we use in changelog entries and similar.


I perfectly agree here.  Even if I tend towards an "I do not care about the
actual character we use as long as it is a defined one" opinion, the
statistics above clearly show a preference, and we should turn this
preference into a recommendation and ask people to stick to it.

So could we settle down with the agreement:

  '  * '   for first order lists and
  '    - ' for second order lists.

I would like to push this to SRP *and* to section 6.2, "Best practices for
debian/control", of the Developer's Reference.  This would finally allow us
to file wishlist bug reports against packages which do not follow this
recommendation.

Kind regards

 Andreas.


[1] http://wiki.debian.org/I18n/SmithReviewProject

--
http://fam-tille.de





Item lists bulletting (was: Re: RFC: Better formatting for long descriptions)

2009-04-16 Thread Christian Perrier
Andreas Tille wrote:

 I have not found any recommendation regarding this at the SRP Wiki page
 [1].
 I vaguely remember that this Smith project was initially driven by a French
 guy who might try to push a French habit into the English world. ;-)


Of course. Because, contrary to the world of English language, we *do*
have written rules for such cases. From the Lexique des règles
typographiques en usage à l'Imprimerie Nationale (which is the
reference for all typographic conventions for the French language, the
reference book of all French TeXnicians):

Les énumérations

- elles sont introduites par un deux-points ;
- les énumérations de premier rang sont introduites par un tiret et
se terminent par un point-virgule, sauf la dernière par un point final ;
- les énumérations de second rang sont introduites par un tiret
décalé et se terminent par une virgule.


Which (badly) translates to:

Itemizations:
- they're introduced by a colon;
- first degree itemizations are preceded by a dash and end with a
semi-colon, except the last one, which ends with a full stop;
- second degree itemizations are preceded by an indented dash and end
with a comma.


I have never been able to find any such solid reference for English.
There is probably something in the Chicago Manual of Style, that's
generally accepted as the Right Reference for en_US.

Maybe more input from our experts on debian-l10n-english?





Re: RFC: Better formatting for long descriptions

2009-04-16 Thread Manoj Srivastava
On Wed, Apr 15 2009, Guillem Jover wrote:


 Tag \* was used 9277 times (68.0900%)
 Tag - was used 3837 times (28.1600%)
 Tag + was used 120 times (.8800%)
 Tag o was used 390 times (2.8600%)

 Regardless of the numbers though (which have moved lately slightly in
 favour of '-' due to the recommendations from the Smith reviewing
 project), I've always found the asterisk the obvious character to use
 for bulleted lists, as it's the one most resembling a bullet, and
 it's the one we use in changelog entries and similar.

The primary goal of the description is to convey to the user why
 they should install the package. The maintainer can use an unsorted
 list to help convey the information; and any means that make it clear
 to the user that they are looking at a list is good enough.

Anything beyond that seems like striving for a foolish
 consistency; and the basic assumption being made (which does
 not, in my opinion, hold) is that a rigid monotonic conformity is
 aesthetically pleasing. I think a variety in the symbols used for
 bullets is better, in that it breaks the monotony.

Do we really have nothing better to do than to impose
 bureaucratic rules on what characters to use as bullet symbols in long
 descriptions even if the user can tell that the character is a bullet?

manoj

-- 
Slowly and surely the unix crept up on the Nintendo user ...
Manoj Srivastava sriva...@debian.org http://www.debian.org/~srivasta/  
1024D/BF24424C print 4966 F272 D093 B493 410B  924B 21BA DABB BF24 424C





Re: Item lists bulletting (was: Re: RFC: Better formatting for long descriptions)

2009-04-16 Thread Lars Wirzenius
On Thu, 2009-04-16 at 08:42 +0200, Christian Perrier wrote:
 I have never been able to find any such solid reference for English.
 There is probably something in the Chicago Manual of Style, that's
 generally accepted as the Right Reference for en_US.
 
 Maybe more input from our experts on debian-l10n-english?

I'm not an expert, but I have the 14th edition of the CMS. It says both
bullets and dashes are acceptable (8.77, page 314, for reference).

(I am not expressing an opinion for or against the normalization of long
description markup.)





Re: RFC: Better formatting for long descriptions

2009-04-16 Thread Andreas Tille

On Thu, 16 Apr 2009, Manoj Srivastava wrote:


   Do we really have nothing better to do than to impose
bureaucratic rules on what characters to use as bullet symbols in long
descriptions even if the user can tell that the character is a bullet?


The user can tell, but scripts can't reliably.  Long descriptions are
used in several places and some of these could render a better layout.
A good layout is pleasing for users.  So it is not stupid bureaucracy
but a way of making our descriptions more readable (for instance on
packages.d.o and other places).

Kind regards

Andreas.

--
http://fam-tille.de





Re: RFC: Better formatting for long descriptions

2009-04-16 Thread Manoj Srivastava
On Thu, Apr 16 2009, Andreas Tille wrote:

 On Thu, 16 Apr 2009, Manoj Srivastava wrote:

Do we really have nothing better to do than to impose
 bureaucratic rules on what characters to use as bullet symbols in long
 descriptions even if the user can tell that the character is a bullet?

 The user can tell, but scripts can't reliably.


Any script should be able to take the top 4 symbols currently
 used, and be able to detect them. I think *, +, - and o  cover most
 packages, and the scripts in question can be readily expanded. All
 kinds of markup languages already do something similar. (markdown,
 Emacs org-mode, mediawiki, etc)

 Long descriptions are used in several places and some of these could
 render a better layout.

Functionally, just rendering the description as written would
 suffice; the rest is aesthetics.

  A good layout is pleasing for users.  So it

Pleasing is in the eye of the beholder, no?

 is not stupid bureaucracy but making our descriptions better readable
 (for instance on packages.d.o and other places).

I find the descriptions on packages.d.o just fine right now.

Having said that, I would not be averse to specifying that leading
 white space and  *, +, and -  would be acceptable as bullet marks (I
 thought specifying which mark at which level was overspecification).

manoj
-- 
A man convinced against his will is of the same opinion still.  --
Butler
Manoj Srivastava sriva...@debian.org http://www.debian.org/~srivasta/  
1024D/BF24424C print 4966 F272 D093 B493 410B  924B 21BA DABB BF24 424C





Re: RFC: Better formatting for long descriptions

2009-04-16 Thread Andreas Tille

On Thu, 16 Apr 2009, Manoj Srivastava wrote:


   Any script should be able to take the top 4 symbols currently
used, and be able to detect them. I think *, +, - and o  cover most
packages, and the scripts in question can be readily expanded. All
kinds of markup languages already do something similar. (markdown,
Emacs org-mode, mediawiki, etc)


Perhaps you missed the point that it is not only the character used
but also the inconsistent spacing which prevents scripts from
detecting the levels of an itemized list.

Yes, we have itemized lists with more than one level in our descriptions
(see my initial posting).  Detecting these would need either a defined
character or a defined spacing (IMHO an 'and' would be better than
a non-exclusive 'or' here).
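
To make the point about levels concrete, here is a minimal sketch (an illustration only, not a tool mentioned in the thread) of what a script can do when only the spacing is reliable: it derives a nesting level from how the indentation of each bullet compares with the bullets seen before it. The function name list_levels is hypothetical, and continuation lines of an item are simply skipped.

```shell
# list_levels: hypothetical helper. Reads description text on stdin and
# prints "<level><TAB><item text>" for every bullet line.
list_levels() {
    awk '
        match($0, /^ *[-+*] /) {
            indent = RLENGTH - 2            # spaces before the bullet
            # Pop deeper levels until the current indent fits.
            while (depth > 0 && stack[depth] > indent) depth--
            if (depth == 0 || stack[depth] < indent) stack[++depth] = indent
            printf "%d\t%s\n", depth, substr($0, RLENGTH + 1)
            next
        }
        /^ *$/ || /^ *\.$/ { depth = 0 }    # blank line or lone "." ends the list
    '
}
```

With irregular spacing this still yields relative levels, but it cannot tell a deliberately nested item from a carelessly indented one, which is where a written rule would help.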


   I find the descriptions on packages.d.o just fine right now.


IMHO the fact that a specific person is happy with the layout is no
argument that everybody else is.  If a text has a certain logical structure,
it should be supported by the means a certain output style offers.  HTML can
express a list and so it should if we want to express lists.
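
As a rough sketch of what "HTML can express a list" could mean for a renderer (an assumption about how such a tool might work, not how packages.d.o actually does it): bullet lines become list items inside a single-level list, and everything else is passed through with HTML special characters escaped. The name desc_to_html is made up, and each item is assumed to fit on one line.

```shell
# desc_to_html: hypothetical helper. Reads description text on stdin and
# prints HTML, turning runs of bullet lines into a <ul>.
desc_to_html() {
    awk '
        function esc(s) {                     # escape HTML special characters
            gsub(/&/, "\\&amp;", s)
            gsub(/</, "\\&lt;", s)
            gsub(/>/, "\\&gt;", s)
            return s
        }
        function end_list() {
            if (in_list) { print "</ul>"; in_list = 0 }
        }
        match($0, /^ *[-+*] /) {              # a list item
            if (!in_list) { print "<ul>"; in_list = 1 }
            print "<li>" esc(substr($0, RLENGTH + 1)) "</li>"
            next
        }
        /^ *$/ { end_list(); next }           # blank line closes the list
        {                                     # anything else is plain text
            end_list()
            line = $0
            sub(/^ +/, "", line)
            print esc(line)
        }
        END { end_list() }
    '
}
```

Paragraph markup and nested lists are deliberately left out to keep the sketch short.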


   Having said that, I would not be averse to specifying that leading
white space and  *, +, and -  would be acceptable as bullet marks (I
thought specifying which mark at which level was overspecification).


So you would be in favour of specifying only the amount of white space
to define a level?  If this were accepted as a rough consensus, it
would at least help tools detect what they need to detect.
Even if my aesthetic preferences go beyond this, I can accept it.  But
you also specified three characters (*, +, and -), so do you want to
restrict the acceptable set yourself (for instance, not accept 'o')?

Kind regards

Andreas.

--
http://fam-tille.de





Re: RFC: Better formatting for long descriptions

2009-04-16 Thread Michael Banck
On Thu, Apr 16, 2009 at 02:34:52AM -0500, Manoj Srivastava wrote:
 Having said that, I would not be averse to specifying that leading
  white space and  *, +, and -  would be acceptable as bullet marks (I
  thought specifying which mark at which level was overspecification).

Why don't we say binaries are fine in /usr/bin, /usr/local/bin and /opt
while we are at it, to provide some refreshing alternatives to our
users?


Michael





Re: RFC: Better formatting for long descriptions

2009-04-16 Thread Manoj Srivastava
On Thu, Apr 16 2009, Andreas Tille wrote:

 On Thu, 16 Apr 2009, Manoj Srivastava wrote:

Any script should be able to take the top 4 symbols currently
 used, and be able to detect them. I think *, +, - and o  cover most
 packages, and the scripts in question can be readily expanded. All
 kinds of markup languages already do something similar. (markdown,
 Emacs org-mode, mediawiki, etc)

 Perhaps you missed the point that it is not only the very character
 which is used but also the broken spacing which prevents scripts from
 detecting levels of itemizing list.

 Yes, we have more than one level itemizings in our descriptions (see
 my initial posting.  Detecting these would need either a defined
 character or a defined spacing (IMHO an 'and' would be better than
 a non-exclusive 'or' here).

Umm. I am not sure that follows. I am also not convinced we need
 to invent our own rules. Text::Markdown or Text::MultiMarkdown could
 help. And they do not seem to have issues with recognizing
 indentation/different characters as denoting levels of lists.


I find the descriptions on packages.d.o just fine right now.

 IMHO it is no argument that a specific person is happy with the layout
 everybody else is.

Just as it is no argument that, because someone thinks something is
 ugly, everyone thinks so too.

  If a text has a certain logic it should to be
 supported by the means a certain output style has.  HTML can express a
 list and so it should if we want to express lists.

And we do not need to specify any more rigid rules than
 established systems like markdown do in order to achieve that. Indeed,
 we can just pipe the description through markdown, and use the html


Having said that, I would not be averse to specifying that leading
 white space and  *, +, and -  would be acceptable as bullet marks (I
 thought specifying which mark at which level was overspecification).

 So you would be in favour of specifying only the amount of white space
 to define a level?

You do not have to specify the level. Just that the indentation
 be sufficient for the user or markdown to be able to differentiate what
 level the item is at. 

 If this might be accepted as a rough consensus it is at least helpful
 to enable tools detecting what they need to detect.  Even if my
 esthetical feeling goes beyond this I can accept this.  But you also
 specified three characters (*, +, and -) so do you want to restrict
 the acceptable set yourself (for instance not accept 'o')?

I suggest we follow a convention and tool set already in place,
 with multiple language bindings, if you must insist on adding rules to
 the long description.

There are alternatives (Text::Textile comes to mind), but
 Markdown has better language support, so long description parsers might
 have an easier time.

I suggest, for readability, to use a subset of markdown; the
 link and image tags are not that human readable.

manoj

 http://en.wikipedia.org/wiki/Markdown
 http://markdown.infogami.com/
 http://daringfireball.net/projects/markdown/syntax

-- 
Man's horizons are bounded by his vision.
Manoj Srivastava sriva...@debian.org http://www.debian.org/~srivasta/  
1024D/BF24424C print 4966 F272 D093 B493 410B  924B 21BA DABB BF24 424C





Re: RFC: Better formatting for long descriptions

2009-04-16 Thread Ben Finney
(following up on IRC discussion)

Manoj Srivastava sriva...@debian.org writes:

 I suggest we follow a convention and tool set already in place,
  with multiple language bindings, if you must insist on adding rules to
  the long description.
 
 There are alternatives (Text::Textile comes to mind), but
  Markdown has better language support, so long description parsers might
  have an easier time.
 
 I suggest, for readability, to use a subset of markdown; the
  link and image tags are not that human readable.

reStructuredText <URL:http://docutils.sourceforge.net/rst.html> (reST)
is, I argue, a superior choice to Markdown for our existing format.

Markdown explicitly assumes the writer is going to punt to HTML for
anything not covered by Markdown, which severely limits its future
flexibility in contexts where we don't want to put HTML in the source.

reST, on the other hand, makes no such assumptions about enclosing
context; it was initially designed for documentation in program source
code, which is much closer to our needs for text in a control field.

It also helps that the simple bullet lists that are the most common case
are perfectly valid in reST too.

-- 
 \   “Never express yourself more clearly than you are able to |
  `\   think.” —Niels Bohr |
_o__)  |
Ben Finney





Re: RFC: Better formatting for long descriptions

2009-04-16 Thread Andreas Tille

On Thu, 16 Apr 2009, Manoj Srivastava wrote:


my initial posting.  Detecting these would need either a defined
character or a defined spacing (IMHO an 'and' would be better than
a non-exclusive 'or' here).


   Umm. I am not sure that follows. I am also not convinced we need
to invent our own rules.


I tried to suggest *any* rule which works.  I'm not in favour of inventing
new rules.  But the rules should be simple enough to not break any existing
tool.


Text::Markdown or Text::MultiMarkdown could
help. And they do not seem to have issues with recognizing
indentation/different characters as denoting levels of lists.


If I interpret your first link [1] correctly, these are even *more* rules
than I suggested.


   I find the descriptions on packages.d.o just fine right now.


IMHO it is no argument that a specific person is happy with the layout
everybody else is.


   Just like it  is no argument that someone think something is ugly
that means everyone thinks so too.


 If a text has a certain logic it should to be
supported by the means a certain output style has.  HTML can express a
list and so it should if we want to express lists.


Please do not split my paragraphs to blur my argument.  Thanks.


   And we do not need to specify any more rigid rules than
established systems like markdown do in order to achieve that. Indeed,
we can just pipe the description through markdown, and use the html


Have you tested whether the current long descriptions will render
correctly with this suggestion?


So you would be in favour of specifying only the amount of white space
to define a level?


   You do not have to specify the level. Just that the indentation
be sufficient for the user or markdown to be able to differentiate what
level the item is at.


I'm sorry - I do not know whether markdown is clever enough to render
the lists in all long descriptions.  But as long as the hint "please
make sure that your long description renders with markdown" is not
written in any of our documents, I really doubt that.  May I draw the
conclusion that you are also in favour of some rules but not really
happy with the rules I suggested?  That's really fine for me.  I just
want *any* rule which *works* and is written down somewhere to enable
us to file bug reports against packages which do not follow this rule.
I think I mentioned this in my postings in this thread.


   I suggest we follow a convention and tool set already in place,
with multiple language bindings, if you must insist on adding rules to
the long description.

   There are alternatives (Text::Textile comes to mind), but
Markdown has better language support, so long description parsers might
have an easier time.


I do not want any complicated tool to parse our long descriptions.
In principle they are really easy to parse.  I want to have the
simplest possible rule set which enables us to reliably parse the
logical structure of our long descriptions.  While you claim to be against
rules, you propose rules that are even harder to apply.  At least for me your
suggestions are confusing and just blur the issue.


   I suggest, for readability, to use a subset of markdown; the
link and image tags are not that human readable.


Yes - that's perfectly fine.  We are just using a subset of markdown
actually - a much simpler one than the suggested, without features like
italics and strong, headings etc.  And we do not really need it - we
just should keep it simple to not break any existing tool.  If there
is a library which reliably can detect the logic of the current long
descriptions probably nothing has to be changed.  But I doubt there is
one and I really wonder why anybody who is happy with the current rendering
is suggesting even more complex things.

Kind regards

Andreas.

[1] http://en.wikipedia.org/wiki/Markdown

--
http://fam-tille.de





Re: RFC: Better formatting for long descriptions

2009-04-16 Thread Tzafrir Cohen
On Thu, Apr 16, 2009 at 04:01:20AM -0500, Manoj Srivastava wrote:

 Umm. I am not sure that follows. I am also not convinced we need
  to invent our own rules. Text::Markdown or Text::MultiMarkdown could
  help. And they do not seem to have issues with recognizing
  indentation/different characters as denoting levels of lists.

Character-level formatting of markdown as well?

Two examples:

* From abcmidi:

 This package contains the programs `abc2midi' and `midi2abc',  which

* From alltray:

 KDE, XFCE 4*, Fluxbox* and WindowMaker*.
 (*) No drag 'n drop support. Enable with -nm option.

-- 
Tzafrir Cohen | tzaf...@jabber.org | VIM is
http://tzafrir.org.il || a Mutt's
tzaf...@cohens.org.il ||  best
ICQ# 16849754 || friend





Re: RFC: Better formatting for long descriptions

2009-04-16 Thread Ben Finney
Ben Finney ben+deb...@benfinney.id.au writes:

 (following up on IRC discussion)
 
 Manoj Srivastava sriva...@debian.org writes:
 
  I suggest, for readability, to use a subset of markdown; the
   link and image tags are not that human readable.
 
 reStructuredText <URL:http://docutils.sourceforge.net/rst.html> (reST)
 is, I argue, a superior choice to Markdown for our existing format.

Note that, like Manoj, I'm suggesting only a *subset*, not the full
specification.

-- 
 \“Like the creators of sitcoms or junk food or package tours, |
  `\ Java's designers were consciously designing a product for |
_o__)   people not as smart as them.” —Paul Graham |
Ben Finney





Re: RFC: Better formatting for long descriptions

2009-04-16 Thread Andreas Tille

On Thu, 16 Apr 2009, Ben Finney wrote:


Note that, like Manoj, I'm suggesting only a *subset*, not the full
specification.


Well, in this thread we had several suggestions, ranging from a complete
change to a different format up to subsets of other formats that were not
specified in detail.  IMHO this does not bring us forward a single step.
If we want to move forward we have to make sure that we will not be
forced to touch every single package, because such an attempt would be
bound to fail and every minute spent in discussion here would simply be
wasted.  So if you suggest a subset of a specification please state
clearly which subset and whether it works with currently existing
descriptions.  I'd volunteer to set up a doodle poll with suggestions.

If you make a suggestion please answer the following questions:

  A. Does the suggestion enable parsing logical structures like
     two-level itemized lists?
     (This is what I want to achieve and what is IMHO needed)
  B. Does the suggestion allow keeping the majority of descriptions
     untouched and keeping the currently existing tools?
     (This is important to gain any acceptance)

If one of the questions above is answered with "no", please mention
whether you are volunteering to do the work which is needed to
port the existing stuff to match your suggestion.

Currently I would feed the poll with 4 suggestions:

  0. Keep anything as unstructured as it is.
     Answer to A: no
     Answer to B: yes

  1. Use '*' for first order item lists, '-' for second order
     item lists and use '  ' (exactly two spaces) before the
     '*' and '    ' (exactly four spaces) before the '-'. After
     '*' and '-' exactly one space should be used and continued
     lines should start in the same column as the text starts
     above.
     Answer to A: yes
     Answer to B: yes

  2. Use '*' for first order item lists, '-' for second order
     item lists.  Spacing does not matter as long as continued
     lines will start in the same column as the text above.
     Answer to A: yes
     Answer to B: yes

  3. Use any character of ('*', '-', '+') to start a list and
     mark the level of the list by strictly following spacing
     rules and use '  ' (exactly two spaces) before the selected
     character for starting first order list and '    ' (exactly
     four spaces) before the character for starting second order
     list. After the marker symbol exactly one space should be
     used and continued lines should start in the same column as
     the text starts above.
     Answer to A: yes
     Answer to B: yes
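
A check for suggestion 3 could be sketched roughly as follows. This is only an illustration under stated assumptions: the spaces are counted literally from the start of each input line (raw Packages lines carry one extra leading space, so the pattern would need adjusting for them), the function name check_option3 is made up, and the rule about continued lines is not checked at all.

```shell
# check_option3: hypothetical helper. Reads description text on stdin and
# prints "<line number>: <line>" for every bullet-like line that does not
# have exactly two or four leading spaces and one space after the marker.
check_option3() {
    awk '
        /^ *[-+*]( |$)/ {                     # looks like a list item
            if ($0 !~ /^(  |    )[-+*] [^ ]/)
                printf "%d: %s\n", NR, $0
        }
    '
}
```

Suggestion 1 would be the same check with a stricter pattern, for example `^(  \* |    - )[^ ]`, since it also ties the marker to the level.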

If you want to make further suggestions just append to this list.
I'll start a doodle poll next Monday.  Depending on the outcome
of this poll I will submit a patch for section 6.2, "Best practices for
debian/control".

Does this sound reasonable?

Kind regards

Andreas.

--
http://fam-tille.de





Re: RFC: Better formatting for long descriptions

2009-04-16 Thread Manoj Srivastava
On Thu, Apr 16 2009, Ben Finney wrote:

 (following up on IRC discussion)

 Manoj Srivastava sriva...@debian.org writes:

 I suggest we follow a convention and tool set already in place,
  with multiple language bindings, if you must insist on adding rules to
  the long description.
 
 There are alternatives (Text::Textile comes to mind), but
  Markdown has better language support, so long description parsers might
  have an easier time.
 
 I suggest, for readability, to use a subset of markdown; the
  link and image tags are not that human readable.

 reStructuredText <URL:http://docutils.sourceforge.net/rst.html> (reST)
 is, I argue, a superior choice to Markdown for our existing format.

I can live with restructured text. I would like to point out,
 though, that the language support is more mature in markdown, and the
 subset of features we care about is identical in markdown and reST.

 It also helps that the simple bullet lists that are the most common case
 are perfectly valid in reST too.

Right.

manoj

-- 
Patageometry, n.: The study of those mathematical properties that are
invariant under brain transplants.
Manoj Srivastava sriva...@debian.org http://www.debian.org/~srivasta/  
1024D/BF24424C print 4966 F272 D093 B493 410B  924B 21BA DABB BF24 424C





Re: RFC: Better formatting for long descriptions

2009-04-16 Thread Manoj Srivastava
On Thu, Apr 16 2009, Andreas Tille wrote:

 On Thu, 16 Apr 2009, Ben Finney wrote:

 Note that, like Manoj, I'm suggesting only a *subset*, not the full
 specification.

 Well, in this thread we had several suggestions reaching from complete
 change to different format up to not in detail specified subsets of
 other formats.  IMHO this does not bring us forward a single step.
 If we want to move forward we have to make sure that we will not be
 forced to touch every single package because such an attempt will be

This is exactly why I like markdown or restructured text, most
 packages conform already.

 bound to fail and every minute spent in discussion here is simply
 wasted.  So if you suggest a subset of a specification please state
 clearly which subset and whether it works with currently existing
 descriptions.  I'd volunteer to set up a doodle poll with suggestions.


Voting is a piss poor means of making a technical decision.

At this point, I would say rules for lists, and bold/italics
 should not be any more restrictive than markdown/ReST, and not impose
 any more burdens on the description writer.

 If you make a suggestion please answer the following question:

   A. Does the suggestion enable parsing logical structures like
  two level itemize lists?
  (This is what I want to approach and what is IMHO needed)

Markdown and ReST, trivially.

   B. Does the suggestion enable keeping the majority of description
  untouched and enables keeping the currently existing tools?
  (This is important to gain any acceptance)

Yes, for both.


The one issue I have seen raised is that of using *italics* and
 **bold** text; there are package descriptions where italics will
 suddenly appear. Me, I like org mode, where we have /italics/, *bold*,
 +strikethrough+, _underline_; but I doubt that org-mode will be popular
 as an interpreter.

manoj
-- 
It is better to have loved and lost -- much better.
Manoj Srivastava sriva...@debian.org http://www.debian.org/~srivasta/  
1024D/BF24424C print 4966 F272 D093 B493 410B  924B 21BA DABB BF24 424C





Re: RFC: Better formatting for long descriptions

2009-04-16 Thread Manoj Srivastava
On Thu, Apr 16 2009, Tzafrir Cohen wrote:

 On Thu, Apr 16, 2009 at 04:01:20AM -0500, Manoj Srivastava wrote:

 Umm. I am not sure that follows. I am also not convinced we need
  to invent our own rules. Text::Markdown or Text::MultiMarkdown could
  help. And they do not seem to have issues with recognizing
  indentation/different characters as denoting levels of lists.

 Character-level formatting of markdown as well?

 Two examples:

 * From abcmidi:

  This package contains the programs `abc2midi' and `midi2abc',  which

Yup, this one is a problem.
<p>This package contains the programs <code>abc2midi\' and</code>midi2abc\',  
which</p>

So using ` as a quote seems to be an issue.
__ egrep '`' /var/lib/dpkg/available | wc -l 
149
Less than 150 instances.

 * From alltray:

  KDE, XFCE 4*, Fluxbox* and WindowMaker*.
  (*) No drag 'n drop support. Enable with -nm option.
__ echo "KDE, XFCE 4*, Fluxbox* and WindowMaker*.
 (*) No drag 'n drop support. Enable with -nm option." | markdown
<p>KDE, XFCE 4*, Fluxbox* and WindowMaker*.
 (*) No drag 'n drop support. Enable with -nm option.</p>

Hmm. Looks fine to me.

manoj
-- 
If Diet Coke did not exist it would have been necessary to invent it.
Karl Lehenbauer
Manoj Srivastava sriva...@debian.org http://www.debian.org/~srivasta/  
1024D/BF24424C print 4966 F272 D093 B493 410B  924B 21BA DABB BF24 424C





Re: RFC: Better formatting for long descriptions

2009-04-16 Thread Manoj Srivastava
On Thu, Apr 16 2009, Andreas Tille wrote:

 On Thu, 16 Apr 2009, Manoj Srivastava wrote:

 my initial posting.  Detecting these would need either a defined
 character or a defined spacing (IMHO an 'and' would be better than
 a non-exclusive 'or' here).

Umm. I am not sure that follows. I am also not convinced we need
 to invent our own rules.

 I tried to suggest *any* rule which works.  I'm not in favour of inventing
 new rules.  But the rules should be simple enough to not break any existing
 tool.

Which is good, since Markdown/ReST rules for lists will only
 make the lists using o as the bullet out of whack.


 Text::Markdown or Text::MultiMarkdown could
 help. And they do not seem to have issues with recognizing
 indentation/different characters as denoting levels of lists.

 If I interpret your first link [1] right this are even *more* rules as
 I suggested.

None of which are mandatory. All the package descriptions I read
 in /var/lib/dpkg/available seem to pass, though a couple had italics
 in strange places. This is not a fatal flaw.


I find the descriptions on packages.d.o just fine right now.

 IMHO it is no argument that a specific person is happy with the layout
 everybody else is.

Just like it  is no argument that someone think something is ugly
 that means everyone thinks so too.

  If a text has a certain logic it should to be
 supported by the means a certain output style has.  HTML can express a
 list and so it should if we want to express lists.

 Please do not split my paragraphs to blur my arguing.  Thanks.

Heh. Ever heard of inline answers?  

And we do not need to specify any more rigid rules than
 established systems like markdown do in order to achieve that. Indeed,
 we can just pipe the description through markdown, and use the html

 Have you tested this suggestion whether the current long descriptions will
 render correctly?

Yup.

 So you would be in favour of specifying only the amount of white space
 to define a level?

You do not have to specify the level. Just that the indentation
 be sufficient for the user or markdown to be able to differentiate what
 level the item is at.

 I'm sorry - I do not know whether markdown is clever enough to
 render the lists in all long descriptions.  But as long as the hint
 "please make sure that your long description renders with markdown" is
 not written in any of our documents I really doubt that.  May I draw

  Doubt is fine. Actually reading the package descriptions would
 have been better. 

 Tag \* was used 9277 times (68.0900%)
 Tag - was used 3837 times (28.1600%)
 Tag + was used 120 times (.8800%)

These work.

 Tag o was used 390 times (2.8600%)

These do not.
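
The split between the markers that work and the one that does not can be checked mechanically. What follows is only a minimal sketch, assuming that a list item is "leading blanks, one marker character, at least one blank, then text"; the names ITEM_RE, bullet_marker and is_markdown_bullet are made up for this illustration and do not come from any existing tool.

```python
import re

# Markers that Markdown itself treats as unordered-list bullets.
MARKDOWN_BULLETS = {"*", "-", "+"}

# A candidate list item: leading blanks, one marker, blank(s), then text.
ITEM_RE = re.compile(r"^ +([*+o-]) +\S")


def bullet_marker(line):
    """Return the marker character if the line looks like a list item, else None."""
    match = ITEM_RE.match(line)
    return match.group(1) if match else None


def is_markdown_bullet(line):
    """True only when the marker is one that Markdown recognizes as a bullet."""
    return bullet_marker(line) in MARKDOWN_BULLETS


sample = [
    "  * various encodings",
    "  - various fonts",
    "  + various medias",
    "  o various printer interfaces",
    " This is ordinary text.",
]
for line in sample:
    print(repr(line), bullet_marker(line), is_markdown_bullet(line))
```

Lines using o are still detected as list items here, so a checker could report them separately instead of silently losing them.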

Now, using *italic* had a few issues. There are 99 lines in
 available where * is not used as a list item tag.

Of these 99 lines, in 27 places the *word* is used for emphasis,
 meaning that in 72 places in the available file * is used as a
 wildcard. But not all of these are an issue:

--8<---------------cut here---------------start------------->8---
__> echo ' bsd* and others.' | markdown
<p>bsd* and others.</p>
--8<---------------cut here---------------end--------------->8---


Of those 72 places, in only 24 descriptions did a second * show up
 to anchor the other end of the mistaken emphasis.
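
The "second * to anchor the other end" condition can be approximated with a regular expression. The sketch below is modelled loosely on the emphasis rule as I understand it (an opening * followed by a non-blank, a closing * preceded by a non-blank); treat the exact pattern as an assumption, not as what any Markdown implementation really does. The function names are invented for the example.

```python
import re

# A leading "* " is a list bullet, not emphasis, so it is dropped first.
BULLET_RE = re.compile(r"^[ \t]*\*[ \t]+", re.MULTILINE)

# Opening '*' followed by a non-blank, closing '*' preceded by a non-blank.
SPAN_RE = re.compile(r"\*(?=\S)(.+?)(?<=\S)\*", re.DOTALL)


def emphasis_spans(text):
    """Return the pieces of text that would end up emphasized."""
    text = BULLET_RE.sub("", text)
    return [match.group(1) for match in SPAN_RE.finditer(text)]


def suspicious_spans(text):
    """Spans containing blanks are unlikely to be deliberate *word* emphasis."""
    return [span for span in emphasis_spans(text) if re.search(r"\s", span)]


print(emphasis_spans(" bsd* and others."))
print(emphasis_spans("This is *really* fast"))
print(suspicious_spans("use foo*bar or baz*qux"))
```

A lone wildcard asterisk yields no span at all, which matches the bsd* case shown above; only a description with a second, suitably placed asterisk would need fixing.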

 the conclusion that you are also in favour of some rules but not
 really happy with the rules I suggested?  That's really fine for me.
 I just want *any* rule which *works* and is written down somewhere to
 enable us filing bug reports against packages which do not follow this
 rule.  I think I mentioned this in my postings of this thread.

I suggest you try it out, before handwaving vague FUD
 around. Even tnftp description works fine with either. There are very
 few descriptions (about 24 or so) where we might have unwanted
 emphasis.  I think we can have that fixed. 


I suggest we follow a convention and tool set already in place,
 with multiple language bindings, if you must insist on adding rules to
 the long description.

There are alternatives (Text::Textile comes to mind), but
 Markdown has better language support, so long description parsers might
 have an easier time.

 I do not want any complicated tool to parse our long descriptions.  In
 principle they are really easy to parse.  I want to have the simplest
 possible rule set which enables us to reliably parse the logic of our
 long descriptions.  While you claim to be against rules you propose
 even harder to apply rules.  At least for me your suggestions are
 confusing and just blurring the issue.

I would simplify the rule, as opposed to having a trivial
 library call in the tool. Indeed, reusing the libraries provided is
 *less* work for the parser, than a NIH  new parser.


I suggest, for readability, to use a subset of markdown; the
 link and image tags are not 

Re: RFC: Better formatting for long descriptions

2009-04-16 Thread Manoj Srivastava
Hi,

Oh, markdown is only confused when you have `two' `words'
 quoted like this; when there is only one such quote in the package, we
 are fine.

<p>This package contains the programs `abc2midi'  which</p>

So, less than 149 instances of the <code> tag where we want none.
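
To find the descriptions that do get an unwanted code span, it should be enough to look for paragraphs with at least two backticks. This is a rough sketch under the assumption that any pair of backticks in one paragraph is turned into a code span; CODE_SPAN_RE and accidental_code_spans are names invented here.

```python
import re

# Assumption: the text between one backtick and the next becomes a code span.
CODE_SPAN_RE = re.compile(r"`(.+?)`", re.DOTALL)


def accidental_code_spans(paragraph):
    """Return the text a Markdown processor would wrap in a code span."""
    return CODE_SPAN_RE.findall(paragraph)


one_quote = "This package contains the programs `abc2midi' which"
two_quotes = "This package contains `abc2midi' and `midi2abc' which"

print(accidental_code_spans(one_quote))
print(accidental_code_spans(two_quotes))
```

With a single `quoted' word nothing is captured; with two, everything from the first backtick to the second one is.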

manoj
 finding fewer problems in the descriptions than expected
-- 
Slime is the agony of water. Jean-Paul Sartre
Manoj Srivastava sriva...@debian.org http://www.debian.org/~srivasta/  
1024D/BF24424C print 4966 F272 D093 B493 410B  924B 21BA DABB BF24 424C





Re: Item lists bulletting (was: Re: RFC: Better formatting for long descriptions)

2009-04-16 Thread Christian Perrier
Quoting Lars Wirzenius (l...@liw.fi):
 On Thu, 2009-04-16 at 08:42 +0200, Christian Perrier wrote:
  I have never been able to find any such solid reference for English.
  There is probably something in the Chicago Manual of Style, that's
  generally accepted as the Right Reference for en_US.
  
  Maybe more input from our experts on debian-l10n-english?
 
 I'm not an expert, but I have the 14th edition of the CMS. It says both
 bullets and dashes are acceptable (8.77, page 314, for reference).


Well, based on that discussion, these facts and the current practice,
I think that, in Smith reviews, we will, from now on, recommend the use
of asterisks for 1st level items in item lists, in package
descriptions and debconf templates (these are the texts we review).

Please note that this is not *enforcing* things on maintainers. All
Smith reviews are suggestions made to maintainers and they are
associated to the whole discussion/review. When maintainers insist on
some practice (or even spelling|wording) we always follow their advice
in the end... even for maintainers who insist on using first person
sentences (hint hint).

The same will happen for item lists.






Re: RFC: Better formatting for long descriptions

2009-04-16 Thread Manoj Srivastava
Hi,

I think we need to enumerate some goals for this proposed
 change. Here is a start:

 - Minimal disruption for current packages. The impact should be
   measured by numbers of packages impacted
   + Any specification of which of *, +, - to use as the first level item
 will impact more packages than not specifying it, by several
 hundred
   + The same is true for specifying the mark used for second level list
 items
   + Specifying exact number of spaces will also hit current packages,
 and will be a source of errors in the future. 
 - Ability to recognize and render the following logical entities, in
   decreasing order of importance:
   + unordered lists
   + ordered lists
   + emphasis
   + strong emphasis
   + definition lists
   + hypertext links
   + underlines, and strike throughs
 - Readability for people looking at non-enhanced renditions, i.e.,
   using less on the Packages file. Sticking to widely known
   conventions, using the same conventions that people are used to using
   in email, and Wikis, is a plus.
 - Ease of use for description writers.
   Again, sticking with standards that people already know and use is
   better than making our own, more restrictive standards
 - Not adding hugely to bloat for the Packages file
   This kinda excludes verbose markup like XML (which would have failed
   the readability test too)

At this point, I would say that Markdown/reStructuredText meets
 most of the goals above, as long as we restrict the markup to the list
 above:
   * unordered lists
   * ordered lists
   * emphasis
   * strong emphasis
   * definition lists
   * hypertext links
   * underlines, and strike throughs


manoj
-- 
If we can't fix it -- we'll fix it so nobody can. Gibbons
Manoj Srivastava sriva...@debian.org http://www.debian.org/~srivasta/  
1024D/BF24424C print 4966 F272 D093 B493 410B  924B 21BA DABB BF24 424C





Re: RFC: Better formatting for long descriptions

2009-04-16 Thread Giacomo Catenazzi

Manoj Srivastava wrote:
  - Ability to recognize and render the following logical entities, in
decreasing order of importance:
+ unordered lists
+ ordered lists

really needed?

+ emphasis
+ strong emphasis
+ definition lists
+ hypertext links
+ underlines, and strike throughs

I don't think they are needed. Underlining is generally bad,
strike throughs are worse ;-)

Possibly also monospace, e.g. for commands, but I really prefer to have
as simple a language as possible.

 At this point, I would say that Markdown/reStructuredText meets
  most of the goals above, as long as we restrict the markup to the list
  above:

Could you provide us an example of reStructuredText for the basic constructs?

* unordered lists
* ordered lists
* emphasis
* strong emphasis
* definition lists
* hypertext links
* underlines, and strike throughs

I also like creole (a standardized wiki language; moinmoin supports it), but
it has no definition lists, underlines or strike throughs.

So for creole:

* unordered lists                   \n *  \n **
* ordered lists                     \n #  \n ##
* emphasis                          //foo//
* strong emphasis                   **bar**
* definition lists                  missing  ev. \n **spam** is spam
* hypertext links                   normal url
* underlines, and strike throughs   missing, missing

ciao
cate





Re: RFC: Better formatting for long descriptions

2009-04-16 Thread Manoj Srivastava
On Thu, Apr 16 2009, Giacomo Catenazzi wrote:

 Manoj Srivastava wrote:
  - Ability to recognize and render the following logical entities, in
decreasing order of importance:
+ unordered lists
+ ordered lists

 really needed?

I would think these are the guts of this proposal. Or else what
 are we discussing here?


+ emphasis
+ strong emphasis
+ definition lists
+ hypertext links
+ underlines, and strike throughs

 I don't think they are needed.

Why not? If rendering a description in a manner that makes it
 easier to read is the goal, I fail to see why emphasis and strong
 emphasis is a bad idea (think of text-to-speech mechanisms). This is
 not just opinions we are discussing here, we should be looking at use
 cases for marking up a textual description.

 Underlining is generally bad, strike throughs are worse ;-)

So you say. Don't use them, then. There are cases where either
 one of these constructs has value; and you should not impose your
 personal aesthetics on a general policy discussion.

 Possibly also monospace, e.g. for commands, but I really prefer to have
 as simple a language as possible.

 At this point, I would say that Markdown/reStructuredText meets
  most of the goals above, as long as we restrict the markup to the list
  above:

 Could you provide us an example of reStructuredText for the basic constructs?



* unordered lists
* ordered lists
* emphasis
* strong emphasis
* definition lists
* hypertext links
* underlines, and strike throughs

 I also like creole (a standardized wiki language; moinmoin supports it),
 but it has no definition lists, underlines or strike throughs.

What kind of language bindings are present for creole libraries?
 markdown has a shell interpreter, has python, perl, ruby, C, c++, lisp,
 and is widely supported and used by wikis et al.

 So for creole:

 * unordered lists \n *  \n **

This fails the "Do not impact large numbers of packages" test,
 since we have lots of packages using + and - for list items.

 * ordered lists   \n #  \n ##
 * emphasis//foo//

This also fails the test above -- lots of people are using
 *emphasis*.

 * strong emphasis **bar**
 * definition listsmissing  ev. \n **spam** is spam

Hmm

 * hypertext links normal url
 * underlines, and strike throughs missing, missing

ok.

manoj

-- 
There's just something I don't like about Virginia; the state.
Manoj Srivastava sriva...@debian.org http://www.debian.org/~srivasta/  
1024D/BF24424C print 4966 F272 D093 B493 410B  924B 21BA DABB BF24 424C





Re: RFC: Better formatting for long descriptions

2009-04-16 Thread Stefano Zacchiroli
On Thu, Apr 16, 2009 at 12:50:12PM -0500, Manoj Srivastava wrote:
 I think we need to enumerate some goals for this proposed
 change. Here is a start:
 
  - Minimal disruption for current packages. The impact should be
measured by numbers of packages impacted
snip
 At this point, I would say that Markdown/reStructuredText meets
 most of the goals above, as long as we restrict the markup to the
 list above:

I agree with the goals and thanks for resetting the discussion on
their grounds.

According to the goals you pointed out, it looks like Markdown
would be a more than suitable choice in terms of availability of
implementations, matching of mail-like markup (which is actually one
of the design goals of the language), and minimal disruption.

[ Markdown would also be my choice in terms of personal taste. Not
  that it matters, but I mention it to make clear which is my
  church in this respect :) ]

However, markdown would not be directly applicable to the content of
the long description field, as an RFC822 parser would give it to you, due
to the '.'s used as paragraph separators. Sure, the needed pre-processing
to fix that would be trivial, but it is *some kind* of
pre-processing. One can then wonder how much pre-processing we would
allow before the markup processor without considering it
a disruption of current long descriptions.

I just felt like pointing that out, because it can put back into play
some other language which can be considered non disrupting by
allowing some extra pre-processing bits. ... nevertheless I completely
agree that something like Markdown + the minimal paragraph separator
pre-processing looks like a completely reasonable implementation
plan. Out of curiosity, would restructured text be immune to this
problem?
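
To give an idea of how small that pre-processing is, here is a minimal sketch. It assumes the input is the list of continuation lines of a Description field exactly as they appear in the control file (each starting with one blank, with " ." as the paragraph separator); the function name is invented for the example.

```python
def description_to_markup_input(field_lines):
    """Strip the control-file framing so a markup processor sees plain text."""
    out = []
    for line in field_lines:
        if line.rstrip() == " .":
            out.append("")            # paragraph separator becomes a blank line
        elif line.startswith(" "):
            out.append(line[1:])      # drop the one blank required by the format
        else:
            out.append(line)          # not a continuation line, left untouched
    return "\n".join(out)


field = [
    " First paragraph.",
    " .",
    " Features:",
    "  * one",
    "  * two",
]
print(description_to_markup_input(field))
```

Only the single leading blank is removed, so the relative indentation of list items survives and is still there for the markup processor to interpret.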

Cheers.

-- 
Stefano Zacchiroli -o- PhD in Computer Science \ PostDoc @ Univ. Paris 7
z...@{upsilon.cc,pps.jussieu.fr,debian.org} -- http://upsilon.cc/zack/
Dietro un grande uomo c'è ..|  .  |. Et ne m'en veux pas si je te tutoie
sempre uno zaino ...| ..: | Je dis tu à tous ceux que j'aime




Re: RFC: Better formatting for long descriptions

2009-04-15 Thread Guillem Jover
Hi!

On Mon, 2009-03-23 at 13:26:36 +0100, Andreas Tille wrote:
 On Mon, 23 Mar 2009, Michael Banck wrote:
  So it would be great if some numbers could be brought up first (maybe
  Andreas has a rough overview now, because he looked at the different
  kinds of itemizations).

 Well, I had not but you can get it somehow by

 for tag in "\*" - + o ; do
     echo "Tag $tag was used $(grep "^  $tag " /var/lib/dpkg/available | wc -l) times"
 done

 Tag \* was used 5647 times
 Tag - was used 2710 times
 Tag + was used 85 times
 Tag o was used 282 times

 which only counts those who have proper spacing - but for a rough estimation
 '*' wins definitely.

Even if we'd have to fix all the entries with wrong spacing anyway to
reach correctness, I was curious to see numbers for all spacing variants
for a wider representation of the characters used:


,-- count-bullet-chars.sh --
#!/bin/sh
# Count list-item markers in the Packages files, whatever their indentation.
lists=/var/lib/apt/lists/*_sid_main_*_Packages
total=`grep "^ *[-+\*o] " $lists | wc -l`
for tag in "\*" - + o; do
  items=`grep "^ *$tag " $lists | wc -l`
  percent=`echo "scale=4; $items / $total * 100" | bc`
  echo "Tag $tag was used $items times ($percent%)"
done
`--

Tag \* was used 9277 times (68.0900%)
Tag - was used 3837 times (28.1600%)
Tag + was used 120 times (.8800%)
Tag o was used 390 times (2.8600%)
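
For tools that would rather not shell out to grep and bc, the same tally can be sketched in Python. This is only an approximation of the script above: it takes the lines as an iterable instead of reading the apt lists itself, and the names count_bullets and percentages are invented for the example.

```python
import re
from collections import Counter

# Same idea as the grep pattern: optional blanks, one marker, one blank.
ITEM_RE = re.compile(r"^ *([-+*o]) ")


def count_bullets(lines):
    """Count how often each marker character starts a list item."""
    counts = Counter()
    for line in lines:
        match = ITEM_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def percentages(counts):
    """Share of each marker in percent, rounded to two decimals."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tag: round(100.0 * n / total, 2) for tag, n in counts.items()}


# The figures reported above, fed back in as a consistency check.
reported = Counter({"*": 9277, "-": 3837, "+": 120, "o": 390})
print(percentages(reported))
```

Feeding in the reported counts reproduces the percentages of the shell script, which suggests that the total there was indeed the sum of the four markers.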


Regardless of the numbers though (which have moved lately slightly in
favour of '-' due to the recommendations from the Smith reviewing
project), I've always found the asterisk the obvious character to use
for bulleted lists, as it's the one most resembling a bullet, and
it's the one we use in changelog entries and similar.

regards,
guillem





Re: RFC: Better formatting for long descriptions

2009-04-08 Thread Andreas Tille

On Wed, 8 Apr 2009, Guillem Jover wrote:


There's been a wiki page trying to track this, including packages
whose formatting was proving problematic:

 http://wiki.debian.org/Aptitude::Parse-Description-Bullets=true


Great.  The most important information from this page for myself is that
there are actually other tools (not the one I intended to write for
Blends) which actually would profit from a more standardized formatting
of descriptions.  IMHO this justifies filing bug reports against packages
that try to implement a list but fail to use the form:

  has_list |= ( line =~ /^\s+-/ )# a line starts with   -
  has_list |= ( line =~ /^\s+\+/ )   #  +
  has_list |= ( line =~ /^\s+\*/ )   #  *
  has_list |= ( line =~ /^\s+o\s+/ ) #  o 

BTW, why are you checking for \s after the itemizing symbol only after
'o'?  IMHO it should always follow each itemizing symbol.  I also see
no good chances to detect multi level lists and thus I would like to
come back to more strict rules regarding the itemizing symbol and the
spacing.  In contrast to the comment in the end the check also allows

  -

and I would rather like to force

   /^  - /  or  /^  + /

(yes, not checking for any space but really the character ' ' = blank).
IMHO this would increase the reliability of detecting a list and if there
are tools like aptitude which actually make use of it, it should be
worth the effort.
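
The difference between the two kinds of check can be made concrete. The sketch below puts the four tests quoted from the wiki page into one lenient pattern and compares it with a strict one; the strict pattern (exactly two blanks, one of -, + or *, exactly one blank, then text) is my reading of the proposal and should be taken as an assumption.

```python
import re

# Lenient: any whitespace, then -, + or *, or an 'o' followed by whitespace.
LENIENT_RE = re.compile(r"^\s+(?:[-+*]|o\s+)")

# Strict: exactly two blanks, one marker, exactly one blank, then text.
STRICT_RE = re.compile(r"^  [-+*] \S")


def is_list_item_lenient(line):
    return LENIENT_RE.match(line) is not None


def is_list_item_strict(line):
    return STRICT_RE.match(line) is not None


for line in ["  - item", "  -item", "\t- item", "   - item", "  o item"]:
    print(repr(line), is_list_item_lenient(line), is_list_item_strict(line))
```

Every line the strict check accepts is also accepted by the lenient one, so tightening the rule would only affect descriptions that tools already have to guess about.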

For the sake of interest: What programming language is the script above?

Kind regards

   Andreas.

--
http://fam-tille.de





Re: RFC: Better formatting for long descriptions

2009-04-07 Thread Guillem Jover
Hi!

On Mon, 2009-03-23 at 16:23:12 -0700, Daniel Burrows wrote:
   I don't have the energy to push this any more, but I should probably
 at least refer to my previous attempt to standardize bulleted lists:
 
 http://lists.debian.org/debian-devel/2005/12/msg00531.html
 
   You might find it useful, or not.  At least it more or less documents
 current practice in aptitude (I think there have been some tweaks since
 then; if anyone cares I could go research what they are and dig them up).

There's been a wiki page trying to track this, including packages
whose formatting was proving problematic:

  http://wiki.debian.org/Aptitude::Parse-Description-Bullets=true

regards,
guillem





Re: RFC: Better formatting for long descriptions

2009-03-24 Thread Stefano Zacchiroli
On Mon, Mar 23, 2009 at 07:18:07PM +0100, Michael Banck wrote:
 Uh, what are you saying here?  That we should use   *  to prepend
 items in itemized lists, so that it can be converted to HTML lists by
 packages.debian.org et al.?  If not, what else?

Yes.

More generally, I believe we can benefit in the long run from some
simple text-based markup that supports the basic emphasis we are
used to using in emails; markdown is just an example of such a language.

Having to choose a syntax for itemized list, it would be wise to
choose one which is future compatible with such a language.

Cheers.

-- 
Stefano Zacchiroli -o- PhD in Computer Science \ PostDoc @ Univ. Paris 7
z...@{upsilon.cc,pps.jussieu.fr,debian.org} -- http://upsilon.cc/zack/
Dietro un grande uomo c'è ..|  .  |. Et ne m'en veux pas si je te tutoie
sempre uno zaino ...| ..: | Je dis tu à tous ceux que j'aime




Re: RFC: Better formatting for long descriptions

2009-03-23 Thread Christian Perrier
Quoting Michael Banck (mba...@debian.org):

  Please note that debian-l10n-english suggests using the enumeration
  style you mention for a2ps, when we're reviewing package
  descriptions...
 
 What's the rationale?  So far, I was under the impression that   * 


A not very strong one, I'm afraid..:-)

IIRC, we once found some reference indicating a tendency for dashed
enumerations to be an accepted standard but I can't quote this.

Another reason is the fact that we're using this in French
translations... which is a bad reason..:-)

Another is that we had to choose something and, based on purely
personal impressions, we were thinking that dashed enumerations were
the majority (nobody really verified).

I think that we never really proposed this as the only
change. Most of the time, there are several other
changes...particularly when enumerations are involved because, in such
cases:

- they're often too long (enumerating each and every feature of the
  software)
- they have formatting issues (punctuation, often)
- they have consistency issues (mixing verb sentences and noun
  sentences for instance)






Re: RFC: Better formatting for long descriptions

2009-03-23 Thread Andreas Tille

On Mon, 23 Mar 2009, Christian Perrier wrote:



What's the rationale?  So far, I was under the impression that   * 


A not very strong one, I'm afraid..:-)

IIRC, we once found some reference indicating a tendency for dashed
enumerations to be an accepted standard but I can't quote this.


Could you please clarify whether you mean *enumeration* (in the sense
of LaTeX's enumerate environment or HTML's <ol>) or would you rather
mean *itemize* (in the sense of LaTeX's itemize environment or HTML's
<ul>)?  IMHO these are things which should be handled differently.
I don't care whether a '  *' or a '  -' is finally used - it just
should be used in the same way for all descriptions.


- they're often too long (enumerating each and every feature of the
 software)
- they have formatting issues (punctuation, often)
- they have consistency issues (mixing verb sentences and noun
 sentences for instance)


I completely agree that this should be fixed as well - but it is hard
to code such tests in a lintian check or something like this.

Kind regards

  Andreas.

--
http://fam-tille.de





Re: RFC: Better formatting for long descriptions

2009-03-23 Thread Michael Banck
On Mon, Mar 23, 2009 at 07:24:45AM +0100, Christian Perrier wrote:
 Quoting Michael Banck (mba...@debian.org):
 
   Please note that debian-l10n-english suggests using the enumeration
   style you mention for a2ps, when we're reviewing package
   descriptions...
  
  What's the rationale?  So far, I was under the impression that   * 
 
 
 A not very strong one, I'm afraid..:-)
 
 IIRC, we once found some reference indicating a tendency for dashed
 enumerations to be an accepted standard but I can't quote this.
 
 Another reason is the fact that we're using this in French
 translations... which is a bad reason..:-)
 
 Another is that we had to choose something and, based on purely
 personal impressions, we were thinking that dashed enumerations were
 the majority (nobody really verified).

Well, ok; but your initial post to this thread made it sound like some
semi-or-mostly official description review process, so having to change
all my long descriptions to "  - " (after all, standardizing on one
format is the point of this thread) does not fill me with pure joy.  So
if I have to do that, I'd prefer having a reason like "80% of the
packages do it like that" or "this is the preferred form of itemization
in English according to ...", or something.  The above reasons do not
look very convincing to me.

So it would be great if some numbers could be brought up first (maybe
Andreas has a rough overview now, because he looked at the different
kinds of itemizations).

Again, I don't think enumerations are used that much (and if they are, a
lot of them are really itemizations I guess), but standardizing on
itemizations strikes me as useful.  Not just for packages.d.o HTML
output, but also for apt-cache show consistency etc.


Michael





Re: RFC: Better formatting for long descriptions

2009-03-23 Thread Andreas Tille

On Mon, 23 Mar 2009, Michael Banck wrote:


So it would be great if some numbers could be brought up first (maybe
Andreas has a rough overview now, because he looked at the different
kinds of itemizations).


Well, I had not but you can get it somehow by

for tag in "\*" - + o ; do
    echo "Tag $tag was used $(grep "^  $tag " /var/lib/dpkg/available | wc -l) times"
done

Tag \* was used 5647 times
Tag - was used 2710 times
Tag + was used 85 times
Tag o was used 282 times


which only counts those who have proper spacing - but for a rough estimation
'*' wins definitely.


Again, I don't think enumerations are used that much (and if they are, a
lot of them are really itemizations I guess)


Just recommending "There is no real need for enumerations - let's use
itemize in any case" might be a valid point as well.  But IMHO we need
descriptions (in the sense of LaTeX's description environment or HTML's <dl>).

Kind regards

   Andreas.

--
http://fam-tille.de





Re: RFC: Better formatting for long descriptions

2009-03-23 Thread Stefano Zacchiroli
On Fri, Mar 20, 2009 at 02:45:09PM +0100, Andreas Tille wrote:
 I do not propose drastic changes but a start for "Best practices"
 might be reasonable and perhaps some lintian warnings might help to
 remind developers to move to some standard.

Laudable initiative, thanks for raising the issue. The current
handling of lists is dumb at best.

I agree with Martin that we should avoid the NIH syndrome though, but
that does not necessarily mean that we should switch control files
entirely to a new format. It just means that we should think big.

In particular, I observe that we (IIRC) already have pseudo-parsing
code which is used at least by packages.d.o to render as proper HTML
lists the pseudo-lists which come from long descriptions. That makes
it evident, at least to me, that long descriptions need some kind of
formatting for most of their use cases (packages.d.o is one, the
interface of a GUI package manager is another one).

In that respect, resisting the NIH syndrome just means choosing an
already existing text-based markup language and adopting its
conventions. For instance, we can just say that long description lists
have to be formatted as Markdown lists (modulo some extra bits needed
to not violate 822 parsing). That would be synergistic with a possible
future switch to Markdown for the whole markup of long
descriptions. Note that I don't care in particular about Markdown, it
can also be reStructuredText for all I care.

But please check that your convention matches such a markup language
and please say explicitly so in your proposal. That would also
implement a sort of principle of least surprise for people coming
from those languages.

Thanks!
Cheers.

-- 
Stefano Zacchiroli -o- PhD in Computer Science \ PostDoc @ Univ. Paris 7
z...@{upsilon.cc,pps.jussieu.fr,debian.org} -- http://upsilon.cc/zack/
Dietro un grande uomo c'è ..|  .  |. Et ne m'en veux pas si je te tutoie
sempre uno zaino ...| ..: | Je dis tu à tous ceux que j'aime




Re: RFC: Better formatting for long descriptions

2009-03-23 Thread Andreas Tille

On Mon, 23 Mar 2009, Stefano Zacchiroli wrote:


In particular, I observe that we (IIRC) already have pseudo-parsing
code which is used at least by packages.d.o to render as proper HTML
lists the pseudo-lists which come from long descriptions.


Not that I know of.  IMHO it is just set verbatim (<pre>), judging from
the a2ps example which was mentioned here:

-
<h2>GNU a2ps - 'Anything to PostScript' converter and pretty-printer</h2>
<p>
GNU a2ps converts files into PostScript for printing or viewing. It uses a
nice default format, usually two pages on each physical page, borders
surrounding pages, headers with useful information (page number, printing
date, file name or supplied header), line numbering, symbol substitution
as well as pretty printing for a wide range of programming languages.
<p>
Historically, a2ps started as a text to PostScript converter, but thanks
to powerful delegations it is able to let you use it for any kind of files,
ie it can also digest manual pages, dvi files, texinfo, 
<p>
Among the other most noticeable features of a2ps are:
<pre>

 - various encodings (all the Latins and others),
 - various fonts (automatic font down loading),
 - various medias,
 - various printer interfaces,
 - various output styles,
 - various programming languages,
 - various helping applications,
 - and various spoken languages.
</pre>



But please check that your convention matches such a markup language
and please say explicitly so in your proposal.


This is definitely intended but I'm not an expert in those markup
languages.  That's why I said:

1. Defines some kind of standard which can be parsed automatically.
2. Does not break any existing tool

If there is an existing markup language which fits these requirements I'd
definitely vote for it.
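
To illustrate what "can be parsed automatically" could buy us for a description like the a2ps one, here is a minimal sketch that renders list items as a real HTML list instead of a preformatted block. It is not the packages.d.o code; it assumes one-line items introduced by -, * or +, takes the raw continuation lines as input, and uses an invented function name.

```python
import html
import re

# One-line list item: leading blanks, a marker, blank(s), then the item text.
ITEM_RE = re.compile(r"^ +[-*+] +(.*)$")


def render_description(lines):
    """Render Description continuation lines as HTML paragraphs and lists."""
    out = []         # finished HTML fragments
    para = []        # lines of the paragraph currently being collected
    in_list = False

    def flush_para():
        if para:
            out.append("<p>" + html.escape(" ".join(para)) + "</p>")
            para.clear()

    for line in lines:
        match = ITEM_RE.match(line)
        if match:
            flush_para()
            if not in_list:
                out.append("<ul>")
                in_list = True
            out.append("<li>" + html.escape(match.group(1)) + "</li>")
            continue
        if in_list:
            out.append("</ul>")
            in_list = False
        text = line.strip()
        if text in ("", "."):
            flush_para()          # " ." separates paragraphs
        else:
            para.append(text)

    flush_para()
    if in_list:
        out.append("</ul>")
    return "\n".join(out)


print(render_description([
    " Among the features of a2ps are:",
    " .",
    "  - various encodings,",
    "  - various fonts.",
]))
```

Items wrapped over several lines and nested lists are deliberately left out; handling them is exactly where an agreed rule for spacing would be needed.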

Kind regards

  Andreas.

--
http://fam-tille.de





Re: RFC: Better formatting for long descriptions

2009-03-23 Thread Christian Perrier
Quoting Andreas Tille (til...@rki.de):

 Could you please clarify whether you mean *enumeration* (in the sense

I meant itemization, actually, so more <ul> than <ol>. There are
certainly very few cases where ordered lists are really useful in
package descriptions.

Sorry for the approximative English, here..





Re: RFC: Better formatting for long descriptions

2009-03-23 Thread Michael Banck
On Mon, Mar 23, 2009 at 02:32:17PM +0100, Stefano Zacchiroli wrote:
 In that respect, resisting the NIH syndrome just means choosing an
 already existing text-based markup language and adopting its
 conventions. For instance, we can just say that long description lists
 have to be formatted as Markdown lists (modulo some extra bits needed
 to not violate 822 parsing). That would be synergistic with a possible
 future switch to Markdown for the whole markup of long
 descriptions. Note that I don't care in particular about Markdown, it
 can also be reStructuredText for all I care.

Uh, what are you saying here?  That we should use   *  to prepend
items in itemized lists, so that it can be converted to HTML lists by
packages.debian.org et al.?  If not, what else?


Michael





Re: RFC: Better formatting for long descriptions

2009-03-23 Thread Daniel Burrows
  I don't have the energy to push this any more, but I should probably
at least refer to my previous attempt to standardize bulleted lists:

http://lists.debian.org/debian-devel/2005/12/msg00531.html

  You might find it useful, or not.  At least it more or less documents
current practice in aptitude (I think there have been some tweaks since
then; if anyone cares I could go research what they are and dig them up).

  Daniel





Re: Extended descriptions size (was Re: RFC: Better formatting for long descriptions)

2009-03-22 Thread Andreas Tille

On Sun, 22 Mar 2009, Michael Bramer wrote:

if we want to remove the long description from the package file, we must 
change apt in some way and use some other rule to select the right 
description (a new 'Description-md5sum' or the Version-Nr)


I'd call the Version-Nr. a sensible choice. ;-)

Kind regards

  Andreas.
--
http://fam-tille.de





Re: RFC: Better formatting for long descriptions

2009-03-22 Thread Andreas Tille

On Sat, 21 Mar 2009, Christian Perrier wrote:


Please note that debian-l10n-english suggests using the enumeration
style you mention for a2ps, when we're reviewing package
descriptions...


BTW, once you answered in this thread: Shouldn't we make the suggested
enhancements part of the Smith-Project?

Kind regards

  Andreas.

--
http://fam-tille.de





Re: RFC: Better formatting for long descriptions

2009-03-22 Thread Christian Perrier
Quoting Andreas Tille (til...@rki.de):
 On Sat, 21 Mar 2009, Christian Perrier wrote:

 Please note that debian-l10n-english suggests using the enumeration
 style you mention for a2ps, when we're reviewing package
 descriptions...

 BTW, once you answered in this thread: Shouldn't we make the suggested
 enhancements part of the Smith-Project?


Certainly. I currently refrain from reading -devel (it seems
like we are at the stage of the release cycle where flame wars and
complicated discussions increase... and I try to save my own time for
productive work), but I would appreciate a summary in case things and
ideas converge (good luck with this... :-))

Another thing we encourage in Smith is the use of good boilerplate in
package descriptions for multi-binary packages... The point is to have
a repetitive part common to all packages of a given source package,
that is, the description of the general use of the framework, and 1 or
2 specific paragraphs for each binary package saying things like "This
package provides the development files for foo", etc.


A good example of this is the recent review of the nut templates... that
was one of the most complicated reviews we did (mostly because this is
one of the few where the maintainer gave advice... :-))

That review starts at
http://lists.debian.org/debian-l10n-english/2009/03/msg00025.html

...and turned into #520591. I suggest interested parties
look at debian/control for nut before and after the review... :-)






Re: RFC: Better formatting for long descriptions

2009-03-22 Thread Lionel Elie Mamane
On Sat, Mar 21, 2009 at 10:52:10PM +0100, Andreas Tille wrote:

 I agree that some descriptions are definitely too long.  I wonder who
 would really read some descriptions to the end.  Bad examples can
 be viewed here:

http://debian-med.alioth.debian.org/tasks/typesetting.html

The very long lengths seem to come mostly from lists of CTAN packages
in a Debian package; I find these useful, as I can apt-cache search
CTAN_package to find it in Debian.

-- 
Lionel





Re: RFC: Better formatting for long descriptions

2009-03-22 Thread Andreas Tille

On Sun, 22 Mar 2009, Lionel Elie Mamane wrote:


   http://debian-med.alioth.debian.org/tasks/typesetting.html


The very long lengths seem to come mostly from lists of CTAN packages
in a Debian package; I find these useful, as I can apt-cache search
CTAN_package to find it in Debian.


Yes, I'm sure there are reasons for just putting everything into
the description of a package - but as this thread shows there are
also reasons against - and I wonder how many users are bored by
overlong descriptions compared to those who grep apt-cache
output.

Kind regards

 Andreas.

--
http://fam-tille.de





Re: RFC: Better formatting for long descriptions

2009-03-22 Thread Ben Finney
Lionel Elie Mamane lio...@mamane.lu writes:

 The very long lengths seem to come mostly from lists of CTAN
 packages in a Debian package; I find these useful, as I can
 apt-cache search CTAN_package to find it in Debian.

For that purpose, it would seem ‘apt-file’ can do the job better,
obviating the need for that listing to bloat the Packages file. Or am
I missing something?

-- 
 \“I bought a dog the other day. I named him Stay. It's fun to |
  `\ call him. ‘Come here, Stay! Come here, Stay!’ He went insane. |
_o__) Now he just ignores me and keeps typing.” —Steven Wright |
Ben Finney





Re: RFC: Better formatting for long descriptions

2009-03-22 Thread Raphael Geissert
Neil Williams wrote:
 
 If large numbers of package descriptions are to change collectively,
 it's best to make that one change with two aims rather than two separate
 changes. Less work for everyone involved.

But Andreas' RFC affects the source packages, yours only affects the
infrastructure that builds and uses Packages.

IOW: maintainers need to do something to go ahead with Andreas' proposal
and do nothing to see package descriptions go away from Packages.

 
 Just looking for a bit of consideration for those situations where the
 Packages file is already too large.
 

Cheers,
Raphael Geissert





Re: RFC: Better formatting for long descriptions

2009-03-22 Thread Michael Banck
On Sat, Mar 21, 2009 at 11:13:54PM +0100, Christian Perrier wrote:
 Quoting Andreas Tille (til...@rki.de):
 
  Package: a2ps
- various encodings (all the Latins and others),
- various fonts (automatic font down loading),
- various medias,
  ^^ (two spaces)

 Please note that debian-l10n-english suggests using the enumeration
 style you mention for a2ps, when we're reviewing package
 descriptions...

What's the rationale?  So far, I was under the impression that   * 
was the most used enumeration style in long descriptions.


Michael





Re: Extended descriptions size (was Re: RFC: Better formatting for long descriptions)

2009-03-21 Thread Neil Williams
On Fri, 20 Mar 2009 19:15:00 -0400
Filipus Klutiero chea...@gmail.com wrote:

  On Fri, 20 Mar 2009 14:45:09 +0100 (CET)
  Andreas Tille til...@rki.de wrote:
 
   I tried to find clear advice on how to reasonably format lists inside long
   descriptions of packages.  The only thing I know is that lines with two
   leading spaces are considered verbatim. 
 
  Packages.gz is already 26Mb - I'd like to find ways to shorten the
  package descriptions, not lengthen it. :-(

 Current squeeze main Packages.gz is 7 MB: 
 http://ftp.ca.debian.org/debian/dists/squeeze/main/binary-i386/

Bah, my fault - 26Mb uncompressed. I was looking at /var/lib/apt/lists/
Sorry.

  Can the long description be trimmed to only such data necessary to
  identify the package compared to similar packages? We have debtags for
  lots of other facets of a package description, maybe it is time that
  the long description itself is trimmed so that it does not repeat any
  information already encoded as debtags?

 debtags is not yet at a stage where this should be done (for one thing, 
 Synaptic, for example, does not support debtags). Even if it would be 
 possible, I doubt this would help much.

Any reduction, replicated across 13,000 packages (or even just the
ones from that 13,000 that have verbose long descriptions currently), is
only going to help reduce the size of the file.

  What about a way of having a really long, detailed, nicely formatted
  description on packages.debian.org but a much shorter, more basic
  version in the Packages.gz file?

 The extended description needs to be available to APT

Only for use by apt-search, the rest of apt doesn't care about it. apt
understands debtags, why duplicate that information? (Frontends can be
adapted or just rely on apt-cache search underneath.)

, not only via 
 packages.d.o. I seem to remember that Mandrake Linux (or some other 
 RPM-based distribution) used two Packages-like files, a fat one about 5 
 times our Packages and a slim one about a fifth of Debian's Packages. I 
 remember finding the slim index cool, but now that there's 
 Packages.diff, I think that developing Mandrake-like Packages files and 
 seeing the results in, perhaps, 2 years, would not benefit much to the 
 kind of hardware Debian will run on by then.

Debian is not exclusively for power-hungry servers and mega-powerful
workstations, Debian also runs on very small hardware and not
necessarily old stuff either. It is a mistake to think that Debian
should require more and more powerful hardware for the basic system.

Yes, there is software in Debian that needs a powerful machine, there
is also a LOT of software in Debian specifically designed for low
resource machines where the benefits of a 1Mb Packages.gz file are
appreciable.

-- 


Neil Williams
=
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/





Re: RFC: Better formatting for long descriptions

2009-03-21 Thread Neil Williams
On Fri, 20 Mar 2009 23:32:51 +0100
Michael Banck mba...@debian.org wrote:

 On Fri, Mar 20, 2009 at 07:20:43PM +, Neil Williams wrote:
  I'd like to get the longest descriptions out of Packages.gz completely,
  so encouraging their retention is not ideal. It's not about whether 2
  or 3 spaces should be used, it's about whether such detailed content
  deserves to be in Packages.gz in the first place.
 
 Then I wonder why you hijacked this thread and did not rather start a
 new one?

If large numbers of package descriptions are to change collectively,
it's best to make that one change with two aims rather than two separate
changes. Less work for everyone involved.

Just looking for a bit of consideration for those situations where the
Packages file is already too large.

-- 


Neil Williams
=
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/





Re: Extended descriptions size (was Re: RFC: Better formatting for long descriptions)

2009-03-21 Thread Neil Williams
On Sat, 21 Mar 2009 12:28:36 +0900
Paul Wise p...@debian.org wrote:

 On Sat, Mar 21, 2009 at 8:15 AM, Filipus Klutiero chea...@gmail.com wrote:
 
  The extended description needs to be available to APT, not only via
  packages.d.o.
 
 I agree with Neil Williams' comment in the other thread about removing
 long descriptions from the Packages files. I think the obvious place
 to put them is in dists/unstable/main/i18n/Translations-en (or C) like
 the descriptions from DDTP.

Now that's a good idea - thanks Paul. That way, the long descriptions
can be moved aside without needing changes by lots of maintainers and
other formatting changes like the original thread can proceed
independently.

It's another instance of duplication - why retain the long description
in the Packages file while a translated version also exists from DDTP?
Probably better for the description to be removed from the Packages
file completely and the DDTP one contains the translated version and
English ones for those with missing or outdated translations. That way,
apt spends less time parsing the (smaller) Packages file when doing
ordinary stuff like package installation and only needs to look at the
DDTP information when specifically called as 'apt-cache search'.

CC:'ing debian-i18n to see if there are problems with this approach.

-- 


Neil Williams
=
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/





Re: Extended descriptions size (was Re: RFC: Better formatting for long descriptions)

2009-03-21 Thread Paul Wise
On Sat, Mar 21, 2009 at 4:58 PM, Neil Williams codeh...@debian.org wrote:

 It's another instance of duplication - why retain the long description
 in the Packages file while a translated version also exists from DDTP?
 Probably better for the description to be removed from the Packages
 file completely and the DDTP one contains the translated version and
 English ones for those with missing or outdated translations. That way,
 apt spends less time parsing the (smaller) Packages file when doing
 ordinary stuff like package installation and only needs to look at the
 DDTP information when specifically called as 'apt-cache search'.

One issue is that many people will have disabled downloading
translations so they'll need to change their configuration from none
to en:

APT::Acquire::Translation none;

Since en will now be a Translation, perhaps a different config item
is more appropriate:

APT::Acquire::Description en;

-- 
bye,
pabs

http://wiki.debian.org/PaulWise





Re: RFC: Better formatting for long descriptions

2009-03-21 Thread Andreas Tille

On Fri, 20 Mar 2009, Neil Williams wrote:


Packages.gz is already 26Mb - I'd like to find ways to shorten the
package descriptions, not lengthen it. :-(


Please read again.  Chances are good that packages files might
become shorter.


The rationale behind this is that with some
better standard formatting some tools which display descriptions on web
pages might be enhanced to use <li>, <ol> and <dl> tags, which finally
makes for better reading.


Oh no, please don't let Packages.gz get to 40Mb or 50Mb or more. There
has to be a limit somewhere.


You should definitely read again - how would removing or adding some spaces
and using defined characters instead of random ones have such an
effect?

Kind regards

Andreas.

--
http://fam-tille.de





Re: RFC: Better formatting for long descriptions

2009-03-21 Thread Andreas Tille

On Fri, 20 Mar 2009, Neil Williams wrote:


My comment for this RFC is, therefore, that better formatting for long
descriptions should include a review of whether the long description
deserves to be that long in the first place, whether the long
description merely duplicates data already available via debtags and
whether the long description should be trimmed for the package in
question *as well as* standardising the formatting of what remains.


I agree that some descriptions are definitely too long.  I wonder who
would really read some descriptions to the end.  Bad examples can be
viewed here:

   http://debian-med.alioth.debian.org/tasks/typesetting.html

Kind regards

   Andreas.

--
http://fam-tille.de





Re: RFC: Better formatting for long descriptions

2009-03-21 Thread Andreas Tille

On Fri, 20 Mar 2009, Filipus Klutiero wrote:


   2. Does not break any existing tool

I tend to agree with Martin. Do you have a particular reason that makes this
change urgent?


Just to give the suggestion a small chance.  I'm not against a better
format but I have read enough suggestions that ended in nothing.  BTW,
getting the descriptions in some standard shape might make an automatic
transition to a better format easier.

Kind regards

   Andreas.

--
http://fam-tille.de





Re: RFC: Better formatting for long descriptions

2009-03-21 Thread Christian Perrier
Quoting Andreas Tille (til...@rki.de):

 Package: a2ps
   - various encodings (all the Latins and others),
   - various fonts (automatic font down loading),
   - various medias,
 ^^ (two spaces)

 Package: acerhk-source
* controlling LEDs (Mail, Wireless)
* enable/disable wireless hardware
 ^^^ (three spaces)


.../...

Please note that debian-l10n-english suggests using the enumeration
style you mention for a2ps, when we're reviewing package
descriptions...

Of course, that triggers rewrites but these are generally coupled with
much more very good improvement suggestions (the team features an
artist of the English language and that's not /me... which is obvious
for everybody).






Re: Extended descriptions size (was Re: RFC: Better formatting for long descriptions)

2009-03-21 Thread Michael Bramer



Paul Wise wrote:

On Sat, Mar 21, 2009 at 4:58 PM, Neil Williams codeh...@debian.org wrote:


It's another instance of duplication - why retain the long description
in the Packages file while a translated version also exists from DDTP?
Probably better for the description to be removed from the Packages
file completely and the DDTP one contains the translated version and
English ones for those with missing or outdated translations. That way,
apt spends less time parsing the (smaller) Packages file when doing
ordinary stuff like package installation and only needs to look at the
DDTP information when specifically called as 'apt-cache search'.


One issue is that many people will have disabled downloading
translations so they'll need to change their configuration from none
to en:

APT::Acquire::Translation none;

Since en will now be a Translation, perhaps a different config item
is more appropriate:

APT::Acquire::Description en;


This will not work:

apt uses an md5sum of the short and long description (from the Packages
file) to find the right 'translation'. If you remove the long
description from the Packages file, apt can't do this task...


If we want to remove the long description from the Packages file, we must
change apt in some way and use some other rule to select the right
description (a new 'Description-md5sum' field or the Version-Nr).
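
To make the dependency concrete, here is a minimal sketch of a checksum-keyed
lookup. It assumes the checksum is an MD5 over the complete Description field
value (short line plus long lines) followed by one newline; the exact bytes apt
hashes are an assumption here, and description_md5 and pick_translation are
hypothetical names, not apt functions.

```python
import hashlib


def description_md5(description_value):
    """MD5 hex digest of a Description field value plus one trailing newline.

    Assumption: the whole field value (short and long description) is hashed.
    """
    text = description_value.rstrip("\n") + "\n"
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def pick_translation(description_value, translations):
    """Return the translation keyed by the checksum, or None if there is none."""
    return translations.get(description_md5(description_value))
```

Without the English text in the Packages file there is nothing left to hash on
the client side, which is why a stored checksum field or the version number
would be needed as the lookup key instead.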


Regards
Grisu





Re: Re: Extended descriptions size (was Re: RFC: Better formatting for long descriptions)

2009-03-21 Thread Filipus Klutiero

Neil Williams wrote:

On Fri, 20 Mar 2009 19:15:00 -0400
Filipus Klutiero chea...@gmail.com wrote:

[...]

  What about a way of having a really long, detailed, nicely formatted
  description on packages.debian.org but a much shorter, more basic
  version in the Packages.gz file?

 The extended description needs to be available to APT


Only for use by apt-search, the rest of apt doesn't care about it. apt
understands debtags, why duplicate that information? (Frontends can be
adapted or just rely on apt-cache search underneath.)
  
I don't understand what you mean. Where would apt-cache get the extended 
description from? Again, debtags is not mature enough yet to shrink 
descriptions.
, not only via 
 packages.d.o. I seem to remember that Mandrake Linux (or some other 
 RPM-based distribution) used two Packages-like files, a fat one about 5 
 times our Packages and a slim one about a fifth of Debian's Packages. I 
 remember finding the slim index cool, but now that there's 
 Packages.diff, I think that developing Mandrake-like Packages files and 
 seeing the results in, perhaps, 2 years, would not benefit much to the 
 kind of hardware Debian will run on by then.


Debian is not exclusively for power-hungry servers and mega-powerful
workstations, Debian also runs on very small hardware and not
necessarily old stuff either. It is a mistake to think that Debian
should require more and more powerful hardware for the basic system.
  
Actually, I was only saying that I thought such a reduction of the 
hardware requirements would not help much.

Yes, there is software in Debian that needs a powerful machine, there
is also a LOT of software in Debian specifically designed for low
resource machines where the benefits of a 1Mb Packages.gz file are
appreciable.
I agree, after reading Paul's comment, that if we get a Translations-en 
file via DDTP, removing the extended description from Packages would be 
less work, and thus more interesting.


I tested the gain with
awk '$0 !~ /^(Description| )/'
and the result loses close to half of its compressed size.
-rw-r--r-- 1 chealer chealer  4224356 mar 21 20:12 nodesc.tar.gz
-rw-r--r-- 1 chealer chealer  7350583 mar 21 15:56 debian.savoirfairelinux.net_debian_dists_testing_main_binary-i386_Packages.tar.gz
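
For anyone who wants to repeat the measurement without awk, a rough Python
equivalent might look like this. It mirrors the awk filter above by dropping
every line that starts with "Description" or with a space; the helper names
strip_descriptions and gzip_size are made up for this sketch, and the numbers
it produces will depend on the Packages file used.

```python
import gzip


def strip_descriptions(packages_text):
    """Drop Description lines and every line starting with a space.

    Like the awk filter, this also drops continuation lines of other
    multi-line fields, so it slightly overestimates the gain.
    """
    kept = [
        line
        for line in packages_text.splitlines()
        if not line.startswith("Description") and not line.startswith(" ")
    ]
    return "\n".join(kept) + "\n"


def gzip_size(text):
    """Size in bytes of the gzip-compressed UTF-8 encoding of text."""
    return len(gzip.compress(text.encode("utf-8")))
```

Comparing gzip_size(original) with gzip_size(strip_descriptions(original)) on
a real Packages file gives the two figures to set against each other.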






Re: RFC: Better formatting for long descriptions

2009-03-20 Thread martin f krafft
also sprach Andreas Tille til...@rki.de [2009.03.20.1445 +0100]:
 I tried to find clear advice on how to reasonably format lists inside long
 descriptions of packages.  The only thing I know is that lines with two
 leading spaces are considered verbatim.  This leaves a lot of freedom to
 simulate for instance itemize lists.  I'd like to give some examples for
 package names starting with 'a' and stopped with the first package names
 of 'b'.  If you are bored by these examples continue reading below the
   -- line.

What we really should do, instead of clinging to the NIH-behaviour,
reinventing the wheel, and polishing it over and over again is ditch
the pseudo-RFC822 format we have and use Yaml instead.

http://www.yaml.org/start.html
http://yaml.org/spec/1.2/

-- 
 .''`.   martin f. krafft madd...@d.o  Related projects:
: :'  :  proud Debian developer   http://debiansystem.info
`. `'`   http://people.debian.org/~madduck    http://vcs-pkg.org
  `-  Debian - when you have better things to do than fixing systems
 
to improve the style means to improve the thought.
 - friedrich nietzsche




Re: RFC: Better formatting for long descriptions

2009-03-20 Thread Andreas Tille

On Fri, 20 Mar 2009, martin f krafft wrote:


What we really should do, instead of clinging to the NIH-behaviour,
reinventing the wheel, and polishing it over and over again is ditch
the pseudo-RFC822 format we have and use Yaml instead.

http://www.yaml.org/start.html
http://yaml.org/spec/1.2/


And most probably somebody else will revive the switch to XML suggestion.
I know the pros and cons for different formats but I want a solution *now*
and that's the reason why I wrote:


   2. Does not break any existing tool


Kind regards

   Andreas.

--
http://fam-tille.de





Re: RFC: Better formatting for long descriptions

2009-03-20 Thread Michael Banck
On Fri, Mar 20, 2009 at 02:45:09PM +0100, Andreas Tille wrote:
 1. Itemize lists: (li)
 

 2. Enumerate lists: (ol)
 --

 3. Description lists: (dl)
 

 This suggestion is far from complete and should be enhanced.

Well, not sure this should be over-engineered; I guess itemize lists
already cover most of the cases (most enumerations could probably be
changed to itemizations I guess).

So a +1 from me.


Michael





Re: RFC: Better formatting for long descriptions

2009-03-20 Thread Neil Williams
On Fri, 20 Mar 2009 14:45:09 +0100 (CET)
Andreas Tille til...@rki.de wrote:

 I tried to find clear advice on how to reasonably format lists inside long
 descriptions of packages.  The only thing I know is that lines with two
 leading spaces are considered verbatim. 

Packages.gz is already 26Mb - I'd like to find ways to shorten the
package descriptions, not lengthen it. :-(

 This leaves a lot of freedom to
 simulate for instance itemize lists.  I'd like to give some examples for
 package names starting with 'a' and stopped with the first package names
 of 'b'.  If you are bored by these examples continue reading below the
-- line.
 -
 
 I think we should try to implement some stricter formatting rules
 for our long descriptions. 

Maybe starting with a way to provide extra long descriptions by some
means *other* than Packages.gz - which in turn means maintainers
deciding which bits of the long description *really* need to be visible
before download and which can wait until the user has decided to
download the package.

Can the long description be trimmed to only such data necessary to
identify the package compared to similar packages? We have debtags for
lots of other facets of a package description, maybe it is time that
the long description itself is trimmed so that it does not repeat any
information already encoded as debtags?

 The rationale behind this is that with some
 better standard formatting some tools which display descriptions on web
 pages might be enhanced to use <li>, <ol> and <dl> tags, which finally
 makes for better reading.

Oh no, please don't let Packages.gz get to 40Mb or 50Mb or more. There
has to be a limit somewhere. 

What about a way of having a really long, detailed, nicely formatted
description on packages.debian.org but a much shorter, more basic
version in the Packages.gz file?

 This suggestion is far from complete and should be enhanced. 

I think the entire suggestion should be redirected away from the
Packages.gz file.

-- 


Neil Williams
=
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/





Re: RFC: Better formatting for long descriptions

2009-03-20 Thread Julien Cristau
On Fri, 2009-03-20 at 19:03 +, Neil Williams wrote:
 On Fri, 20 Mar 2009 14:45:09 +0100 (CET)
 Andreas Tille til...@rki.de wrote:
 
  I tried to find clear advice on how to reasonably format lists inside long
  descriptions of packages.  The only thing I know is that lines with two
  leading spaces are considered verbatim. 
 
 Packages.gz is already 26Mb - I'd like to find ways to shorten the
 package descriptions, not lengthen it. :-(

Yeah, I'm sure being consistent about whether we use 2 or 3 spaces for
indented lists in descriptions is going to make that file a lot harder
to compress.

Cheers,
Julien





Re: RFC: Better formatting for long descriptions

2009-03-20 Thread Emilio Pozuelo Monfort
Neil Williams wrote:
 On Fri, 20 Mar 2009 14:45:09 +0100 (CET)
 Andreas Tille til...@rki.de wrote:
 
 I tried to find clear advice on how to reasonably format lists inside long
 descriptions of packages.  The only thing I know is that lines with two
 leading spaces are considered verbatim. 
 
 Packages.gz is already 26Mb - I'd like to find ways to shorten the
 package descriptions, not lengthen it. :-(

AFAICS he's not talking about lengthening the descriptions at all, but about
standardizing the way lists are formatted in long descriptions. That is, formalizing
whether we should be using 2 or 3 spaces, dashes or plus signs for items in the
lists...

Cheers,
Emilio





Re: RFC: Better formatting for long descriptions

2009-03-20 Thread Neil Williams
On Fri, 20 Mar 2009 20:08:43 +0100
Julien Cristau jcris...@debian.org wrote:

 On Fri, 2009-03-20 at 19:03 +, Neil Williams wrote:
  On Fri, 20 Mar 2009 14:45:09 +0100 (CET)
  Andreas Tille til...@rki.de wrote:
  
   I tried to find clear advice on how to reasonably format lists inside long
   descriptions of packages.  The only thing I know is that lines with two
   leading spaces are considered verbatim. 
  
  Packages.gz is already 26Mb - I'd like to find ways to shorten the
  package descriptions, not lengthen it. :-(
 
 Yeah, I'm sure being consistent about whether we use 2 or 3 spaces for
 indented lists in descriptions is going to make that file a lot harder
 to compress.

I'd like to get the longest descriptions out of Packages.gz completely,
so encouraging their retention is not ideal. It's not about whether 2
or 3 spaces should be used, it's about whether such detailed content
deserves to be in Packages.gz in the first place.

If there is going to be discussion on standardising on some form of
indentation, it's worth considering whether there isn't a better way of
providing the data itself to achieve other benefits. Indents would need
changes in all affected packages - it might be easier to provide a
different means that also reduces the size of the Packages.gz file
at the same time so that packages only need to be changed once.

My comment for this RFC is, therefore, that better formatting for long
descriptions should include a review of whether the long description
deserves to be that long in the first place, whether the long
description merely duplicates data already available via debtags and
whether the long description should be trimmed for the package in
question *as well as* standardising the formatting of what remains.

"Better" can be construed to mean "more" - I merely want maintainers to
consider whether "better" actually means "less".

-- 


Neil Williams
=
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/





Re: RFC: Better formatting for long descriptions

2009-03-20 Thread Michael Banck
On Fri, Mar 20, 2009 at 07:20:43PM +, Neil Williams wrote:
 I'd like to get the longest descriptions out of Packages.gz completely,
 so encouraging their retention is not ideal. It's not about whether 2
 or 3 spaces should be used, it's about whether such detailed content
 deserves to be in Packages.gz in the first place.

Then I wonder why you hijacked this thread and did not rather start a
new one?


Michael





Extended descriptions size (was Re: RFC: Better formatting for long descriptions)

2009-03-20 Thread Filipus Klutiero


On Fri, 20 Mar 2009 14:45:09 +0100 (CET)
Andreas Tille til...@rki.de wrote:

 I tried to find clear advice on how to reasonably format lists inside long
 descriptions of packages.  The only thing I know is that lines with two
 leading spaces are considered verbatim. 


Packages.gz is already 26Mb - I'd like to find ways to shorten the
package descriptions, not lengthen it. :-(
  
Current squeeze main Packages.gz is 7 MB: 
http://ftp.ca.debian.org/debian/dists/squeeze/main/binary-i386/

Can the long description be trimmed to only such data necessary to
identify the package compared to similar packages? We have debtags for
lots of other facets of a package description, maybe it is time that
the long description itself is trimmed so that it does not repeat any
information already encoded as debtags?
  
debtags is not yet at a stage where this should be done (for one thing, 
Synaptic, for example, does not support debtags). Even if it would be 
possible, I doubt this would help much.

 The rationale behind this is that with some
 better standard formatting some tools which display descriptions on web
 pages might be enhanced to use <li>, <ol> and <dl> tags, which finally
 makes for better reading.

Oh no, please don't let Packages.gz get to 40Mb or 50Mb or more. There
has to be a limit somewhere.
  
I don't understand the proposal as something affecting Packages's size 
significantly.

What about a way of having a really long, detailed, nicely formatted
description on packages.debian.org but a much shorter, more basic
version in the Packages.gz file?
  
The extended description needs to be available to APT, not only via 
packages.d.o. I seem to remember that Mandrake Linux (or some other 
RPM-based distribution) used two Packages-like files, a fat one about 5 
times our Packages and a slim one about a fifth of Debian's Packages. I 
remember finding the slim index cool, but now that there's
Packages.diff, I think that developing Mandrake-like Packages files and
seeing the results in, perhaps, 2 years would not much benefit the
kind of hardware Debian will run on by then.






Re: Re: RFC: Better formatting for long descriptions

2009-03-20 Thread Filipus Klutiero


On Fri, 20 Mar 2009, martin f krafft wrote:

  


What we really should do, instead of clinging to NIH behaviour,
reinventing the wheel, and polishing it over and over again, is ditch
the pseudo-RFC822 format we have and use YAML instead.

http://www.yaml.org/start.html
http://yaml.org/spec/1.2/



And most probably somebody else will revive the switch to XML suggestion.
I know the pros and cons for different formats but I want a solution *now*
and that's the reason why I wrote:

  


   2. Does not break any existing tool


I tend to agree with Martin. Do you have a particular reason that makes
this change urgent? At worst, a format for extended descriptions could
be usable by Debian 7.
I noticed while checking if packages.debian.org rendered the current 
descriptions decently that acidlab's description is rendered pretty 
badly, but AFAICS that's just a packages.d.o bug. FWIW, I had never 
noticed such an issue.

Kind regards

   Andreas.
  






Re: Extended descriptions size (was Re: RFC: Better formatting for long descriptions)

2009-03-20 Thread Paul Wise
On Sat, Mar 21, 2009 at 8:15 AM, Filipus Klutiero chea...@gmail.com wrote:

 The extended description needs to be available to APT, not only via
 packages.d.o.

I agree with Neil Williams's comment in the other thread about removing
long descriptions from the Packages files. I think the obvious place
to put them is in dists/unstable/main/i18n/Translations-en (or C) like
the descriptions from DDTP.
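
To make the idea concrete, here is a rough sketch of how one Packages
stanza could be split into a slim stanza (short description plus a
checksum) and a separate entry holding the full text. It is only an
illustration under stated assumptions: the function name is invented,
and the field names Description-md5 and Description-en and the way the
checksum is computed are assumptions for this sketch, not a statement
of how DDTP or the archive software actually does it.

```python
import hashlib


def split_stanza(stanza):
    """Split one Packages stanza into (slim, translation) strings.

    Hypothetical helper: the slim stanza keeps only the short
    description and a checksum; the translation stanza carries the
    full text keyed by that same checksum.
    """
    package = None
    slim = []
    description = []
    in_description = False
    for line in stanza.strip("\n").split("\n"):
        if line.startswith("Description:"):
            in_description = True
            description.append(line[len("Description:"):].strip())
            continue
        if in_description and line.startswith(" "):
            # Continuation line: part of the long description.
            description.append(line)
            continue
        in_description = False
        if line.startswith("Package:"):
            package = line[len("Package:"):].strip()
        slim.append(line)
    if package is None or not description:
        raise ValueError("stanza needs Package and Description fields")

    # Assumed checksum scheme, for illustration only.
    text = "\n".join(description) + "\n"
    digest = hashlib.md5(text.encode("utf-8")).hexdigest()

    slim.append("Description: " + description[0])
    slim.append("Description-md5: " + digest)
    translation = [
        "Package: " + package,
        "Description-md5: " + digest,
        "Description-en: " + description[0],
    ] + description[1:]
    return "\n".join(slim), "\n".join(translation)


sample = (
    "Package: hello\n"
    "Version: 2.2-2\n"
    "Description: example greeting program\n"
    " This is the long part.\n"
    " .\n"
    " Second paragraph.\n"
)
slim, translation = split_stanza(sample)
print(slim)
print()
print(translation)
```

APT could then match the two by package name and checksum, so the long
text stays available to it without being carried in Packages itself.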

-- 
bye,
pabs

http://wiki.debian.org/PaulWise

