Re: Extended descriptions size (was Re: RFC: Better formatting for long descriptions)

2009-03-22 Thread Andreas Tille

On Sun, 22 Mar 2009, Michael Bramer wrote:

If we want to remove the long description from the Packages file, we must
change apt in some way and use some other rule to select the right
description (a new 'Description-md5sum' field or the Version-Nr).


I'd call the Version-Nr. a sensible choice. ;-)

Kind regards

  Andreas.
--
http://fam-tille.de


--
To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org



Re: Extended descriptions size (was Re: RFC: Better formatting for long descriptions)

2009-03-21 Thread Neil Williams
On Fri, 20 Mar 2009 19:15:00 -0400
Filipus Klutiero chea...@gmail.com wrote:

  On Fri, 20 Mar 2009 14:45:09 +0100 (CET)
  Andreas Tille til...@rki.de wrote:
 
   I tried to find clear advice on how to reasonably format lists inside long
   descriptions of packages.  The only thing I know is that lines with two
   leading spaces are considered verbatim.
 
  Packages.gz is already 26Mb - I'd like to find ways to shorten the
  package descriptions, not lengthen them. :-(

 Current squeeze main Packages.gz is 7 MB: 
 http://ftp.ca.debian.org/debian/dists/squeeze/main/binary-i386/

Bah, my fault - 26Mb uncompressed. I was looking at /var/lib/apt/lists/
Sorry.

  Can the long description be trimmed to only such data necessary to
  identify the package compared to similar packages? We have debtags for
  lots of other facets of a package description, maybe it is time that
  the long description itself is trimmed so that it does not repeat any
  information already encoded as debtags?

 debtags is not yet at a stage where this should be done (for one thing,
 Synaptic does not support debtags). Even if it were possible, I doubt
 this would help much.

Any reduction, replicated across 13,000 packages (or even just the
ones from that 13,000 that have verbose long descriptions currently), is
only going to help reduce the size of the file.

  What about a way of having a really long, detailed, nicely formatted
  description on packages.debian.org but a much shorter, more basic
  version in the Packages.gz file?

 The extended description needs to be available to APT

Only for use by apt-cache search; the rest of apt doesn't care about it.
apt understands debtags, so why duplicate that information? (Frontends can
be adapted or just rely on apt-cache search underneath.)

, not only via 
 packages.d.o. I seem to remember that Mandrake Linux (or some other 
 RPM-based distribution) used two Packages-like files, a fat one about 5 
 times our Packages and a slim one about a fifth of Debian's Packages. I 
 remember finding the slim index cool, but now that there's 
 Packages.diff, I think that developing Mandrake-like Packages files and 
 seeing the results in, perhaps, 2 years, would not much benefit the
 kind of hardware Debian will run on by then.

Debian is not exclusively for power-hungry servers and mega-powerful
workstations, Debian also runs on very small hardware and not
necessarily old stuff either. It is a mistake to think that Debian
should require more and more powerful hardware for the basic system.

Yes, there is software in Debian that needs a powerful machine, there
is also a LOT of software in Debian specifically designed for low
resource machines where the benefits of a 1Mb Packages.gz file are
appreciable.

-- 


Neil Williams
=
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/





Re: Extended descriptions size (was Re: RFC: Better formatting for long descriptions)

2009-03-21 Thread Neil Williams
On Sat, 21 Mar 2009 12:28:36 +0900
Paul Wise p...@debian.org wrote:

 On Sat, Mar 21, 2009 at 8:15 AM, Filipus Klutiero chea...@gmail.com wrote:
 
  The extended description needs to be available to APT, not only via
  packages.d.o.
 
 I agree with Neil Williams's comment in the other thread about removing
 long descriptions from the Packages files. I think the obvious place
 to put them is in dists/unstable/main/i18n/Translation-en (or C) like
 the descriptions from DDTP.

Now that's a good idea - thanks Paul. That way, the long descriptions
can be moved aside without needing changes by lots of maintainers and
other formatting changes like the original thread can proceed
independently.

It's another instance of duplication - why retain the long description
in the Packages file while a translated version also exists from DDTP?
It is probably better to remove the description from the Packages file
completely and have the DDTP file carry the translated descriptions, plus
the English ones for packages with missing or outdated translations. That
way, apt spends less time parsing the (smaller) Packages file when doing
ordinary stuff like package installation and only needs to look at the
DDTP information when specifically called as 'apt-cache search'.
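
A minimal sketch of the lookup described above (translated description
where one exists, English otherwise). It is only an illustration under
stated assumptions, not how apt works: the DESCRIPTIONS table, the
describe() function and the sample text are hypothetical stand-ins for
data that would be read from per-language description files, and the
"outdated translation" case is ignored here.

```python
# Hypothetical stand-in for per-language description files (not apt's format).
DESCRIPTIONS = {
    "en": {"hello": "example package based on GNU hello"},
    "de": {},  # no German translation for "hello" yet
}


def describe(package, lang, descriptions=DESCRIPTIONS):
    """Return the description in `lang`, falling back to English.

    Returns None when the package has no description at all.
    """
    for candidate in (lang, "en"):
        text = descriptions.get(candidate, {}).get(package)
        if text is not None:
            return text
    return None


if __name__ == "__main__":
    # German is requested, but only the English text exists.
    print(describe("hello", "de"))
```

With this layout the Packages data is never consulted for descriptions;
only a search-style operation would need to load the description tables.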

CC:'ing debian-i18n to see if there are problems with this approach.

-- 


Neil Williams
=
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/





Re: Extended descriptions size (was Re: RFC: Better formatting for long descriptions)

2009-03-21 Thread Paul Wise
On Sat, Mar 21, 2009 at 4:58 PM, Neil Williams codeh...@debian.org wrote:

 It's another instance of duplication - why retain the long description
 in the Packages file while a translated version also exists from DDTP?
 It is probably better to remove the description from the Packages file
 completely and have the DDTP file carry the translated descriptions, plus
 the English ones for packages with missing or outdated translations. That
 way, apt spends less time parsing the (smaller) Packages file when doing
 ordinary stuff like package installation and only needs to look at the
 DDTP information when specifically called as 'apt-cache search'.

One issue is that many people will have disabled downloading
translations so they'll need to change their configuration from none
to en:

APT::Acquire::Translation "none";

Since en will now be a Translation, perhaps a different config item
is more appropriate:

APT::Acquire::Description "en";

-- 
bye,
pabs

http://wiki.debian.org/PaulWise





Re: Extended descriptions size (was Re: RFC: Better formatting for long descriptions)

2009-03-21 Thread Michael Bramer



Paul Wise wrote:

On Sat, Mar 21, 2009 at 4:58 PM, Neil Williams codeh...@debian.org wrote:


It's another instance of duplication - why retain the long description
in the Packages file while a translated version also exists from DDTP?
It is probably better to remove the description from the Packages file
completely and have the DDTP file carry the translated descriptions, plus
the English ones for packages with missing or outdated translations. That
way, apt spends less time parsing the (smaller) Packages file when doing
ordinary stuff like package installation and only needs to look at the
DDTP information when specifically called as 'apt-cache search'.


One issue is that many people will have disabled downloading
translations so they'll need to change their configuration from none
to en:

APT::Acquire::Translation "none";

Since en will now be a Translation, perhaps a different config item
is more appropriate:

APT::Acquire::Description "en";


This will not work:

apt uses an md5sum of the short and long description (from the Packages
file) to find the right 'translation'. If you remove the long
description from the Packages file, apt can't do this task...


If we want to remove the long description from the Packages file, we must
change apt in some way and use some other rule to select the right
description (a new 'Description-md5sum' field or the Version-Nr).
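
The matching described above can be sketched as follows. This is a
simplified illustration, not apt's code: the thread does not say which
exact bytes apt hashes, so hashing the raw description text is an
assumption, and description_key(), build_index() and find_translation()
are hypothetical names.

```python
import hashlib


def description_key(english_description):
    """Hash the full English description (short line plus long text)."""
    return hashlib.md5(english_description.encode("utf-8")).hexdigest()


def build_index(pairs):
    """Map the hash of each English description to its translation."""
    return {description_key(en): translated for en, translated in pairs}


def find_translation(english_description, index):
    """Return the matching translation, or None if the text has changed."""
    return index.get(description_key(english_description))


if __name__ == "__main__":
    english = "example package\n Prints a friendly greeting.\n"
    index = build_index([(english, "Beispielpaket\n Gibt einen Gruss aus.\n")])
    print(find_translation(english, index))               # translation found
    print(find_translation(english + " More.\n", index))  # None: text changed
```

The key can only be computed while the English text is present in the
Packages file. Once the long description is removed, the hash would have
to be shipped in a field of its own (the 'Description-md5sum' idea), or
the lookup would need a different key such as the version number.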


Regards,
Grisu





Re: Re: Extended descriptions size (was Re: RFC: Better formatting for long descriptions)

2009-03-21 Thread Filipus Klutiero

Neil Williams wrote:

On Fri, 20 Mar 2009 19:15:00 -0400
Filipus Klutiero chea...@gmail.com wrote:

[...]

  What about a way of having a really long, detailed, nicely formatted
  description on packages.debian.org but a much shorter, more basic
  version in the Packages.gz file?

 The extended description needs to be available to APT


Only for use by apt-cache search; the rest of apt doesn't care about it.
apt understands debtags, so why duplicate that information? (Frontends can
be adapted or just rely on apt-cache search underneath.)
  
I don't understand what you mean. Where would apt-cache get the extended
description from? Again, debtags is not mature enough yet to shrink
descriptions.

 , not only via
 packages.d.o. I seem to remember that Mandrake Linux (or some other 
 RPM-based distribution) used two Packages-like files, a fat one about 5 
 times our Packages and a slim one about a fifth of Debian's Packages. I 
 remember finding the slim index cool, but now that there's 
 Packages.diff, I think that developing Mandrake-like Packages files and 
 seeing the results in, perhaps, 2 years, would not much benefit the
 kind of hardware Debian will run on by then.


Debian is not exclusively for power-hungry servers and mega-powerful
workstations, Debian also runs on very small hardware and not
necessarily old stuff either. It is a mistake to think that Debian
should require more and more powerful hardware for the basic system.
  
Actually, I was only saying that I thought such a reduction of the 
hardware requirements would not help much.

Yes, there is software in Debian that needs a powerful machine, there
is also a LOT of software in Debian specifically designed for low
resource machines where the benefits of a 1Mb Packages.gz file are
appreciable.

I agree, after reading Paul's comment, that if we get a Translation-en
file via DDTP, removing the extended description from Packages would be
less work, and thus more interesting.


I tested the gain with
awk '$0 !~ /^(Description| )/'
and the result loses close to half of its compressed size (about 43%):
-rw-r--r-- 1 chealer chealer  4224356 mar 21 20:12 nodesc.tar.gz
-rw-r--r-- 1 chealer chealer  7350583 mar 21 15:56 debian.savoirfairelinux.net_debian_dists_testing_main_binary-i386_Packages.tar.gz
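
The same measurement can be sketched in Python. This is a rough
equivalent of the awk filter above, not the exact procedure used for the
numbers quoted: strip_descriptions() and gzip_size() are hypothetical
helper names, and SAMPLE is made-up data standing in for a real Packages
file. Like the awk pattern, it drops every line that starts with a space,
so continuation lines of other multi-line fields would be removed too.

```python
import gzip

# Made-up sample stanzas; a real test would read an actual Packages file.
SAMPLE = (
    "Package: hello\n"
    "Version: 2.2-2\n"
    "Description: example package\n"
    " The GNU hello program produces a familiar, friendly greeting.\n"
    " .\n"
    " It is mostly useful as a packaging example.\n"
    "\n"
    "Package: sl\n"
    "Version: 3.03-15\n"
    "Description: corrects typing mistakes\n"
    " Displays a steam locomotive when ls is mistyped.\n"
)


def strip_descriptions(packages_text):
    """Drop lines starting with 'Description' or a space (as the awk filter does)."""
    kept = [
        line
        for line in packages_text.splitlines(keepends=True)
        if not line.startswith(("Description", " "))
    ]
    return "".join(kept)


def gzip_size(text):
    """Return the size in bytes of the gzip-compressed text."""
    return len(gzip.compress(text.encode("utf-8")))


if __name__ == "__main__":
    stripped = strip_descriptions(SAMPLE)
    print("with descriptions:   ", gzip_size(SAMPLE), "bytes")
    print("without descriptions:", gzip_size(stripped), "bytes")
```

On a real Packages file the ratio of the two sizes gives the saving; the
tiny sample here only shows the mechanics.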






Extended descriptions size (was Re: RFC: Better formatting for long descriptions)

2009-03-20 Thread Filipus Klutiero


On Fri, 20 Mar 2009 14:45:09 +0100 (CET)
Andreas Tille til...@rki.de wrote:

 I tried to find clear advice on how to reasonably format lists inside long
 descriptions of packages.  The only thing I know is that lines with two
 leading spaces are considered verbatim.


Packages.gz is already 26Mb - I'd like to find ways to shorten the
package descriptions, not lengthen them. :-(
  
Current squeeze main Packages.gz is 7 MB: 
http://ftp.ca.debian.org/debian/dists/squeeze/main/binary-i386/

Can the long description be trimmed to only such data necessary to
identify the package compared to similar packages? We have debtags for
lots of other facets of a package description, maybe it is time that
the long description itself is trimmed so that it does not repeat any
information already encoded as debtags?
  
debtags is not yet at a stage where this should be done (for one thing,
Synaptic does not support debtags). Even if it were possible, I doubt
this would help much.

 The rationale behind this is that with some better standard formatting,
 some tools which display descriptions on web pages might be enhanced to
 use <li>, <ol> and <dl> tags, which finally makes for better reading.

Oh no, please don't let Packages.gz get to 40Mb or 50Mb or more. There
has to be a limit somewhere.
  
I don't see the proposal as something that would affect the size of
Packages significantly.

What about a way of having a really long, detailed, nicely formatted
description on packages.debian.org but a much shorter, more basic
version in the Packages.gz file?
  
The extended description needs to be available to APT, not only via 
packages.d.o. I seem to remember that Mandrake Linux (or some other 
RPM-based distribution) used two Packages-like files, a fat one about 5 
times our Packages and a slim one about a fifth of Debian's Packages. I 
remember finding the slim index cool, but now that there's 
Packages.diff, I think that developing Mandrake-like Packages files and 
seeing the results in, perhaps, 2 years, would not much benefit the
kind of hardware Debian will run on by then.






Re: Extended descriptions size (was Re: RFC: Better formatting for long descriptions)

2009-03-20 Thread Paul Wise
On Sat, Mar 21, 2009 at 8:15 AM, Filipus Klutiero chea...@gmail.com wrote:

 The extended description needs to be available to APT, not only via
 packages.d.o.

I agree with Neil Williams's comment in the other thread about removing
long descriptions from the Packages files. I think the obvious place
to put them is in dists/unstable/main/i18n/Translation-en (or C) like
the descriptions from DDTP.

-- 
bye,
pabs

http://wiki.debian.org/PaulWise

