Re: [gentoo-dev] Re: [RFC] Moving HOMEPAGE out of ebuilds for the future

2008-12-02 Thread James Cloos
 Jan == Jan Kundrát [EMAIL PROTECTED] writes:

 - less data in metadata cache;

Jan> Isn't it in the cache for some reason? Really, I'm just asking.

If for nothing else, so that update-eix can get it to allow searching on
homepage.  And, yes, that is an important feature.  And, no, opening
every metadata.xml file during update-eix is in no way acceptable.

For eix above, of course, read your favourite query tool.

-JimC
-- 
James Cloos [EMAIL PROTECTED] OpenPGP: 1024D/ED7DAEA6



Re: [gentoo-dev] Re: [RFC] Moving HOMEPAGE out of ebuilds for the future

2008-12-02 Thread James Cloos
 Diego == Diego 'Flameeyes' Pettenò [EMAIL PROTECTED] writes:

 But also the need to replicate http://www.kde.org/ to metadata.xml of
 all KDE split ebuilds -- right now, this is set by an eclass.

Diego> The usefulness of this is IMHO debatable; why not just writing it one
Diego> package (say kde-base/kde or kde-meta) and just there? Having each
Diego> mini-package express itself as having that as its homepage is not very
Diego> useful to me, but I guess it's debatable.

Searching is an important reason for every package to specify its homepage.

-JimC
-- 
James Cloos [EMAIL PROTECTED] OpenPGP: 1024D/ED7DAEA6



[gentoo-dev] Re: [RFC] Moving HOMEPAGE out of ebuilds for the future

2008-12-02 Thread Diego 'Flameeyes' Pettenò
James Cloos [EMAIL PROTECTED] writes:

 Searching is an important reason for every package to specify its homepage.

And?

metadata.xml already contains data that eix and other software should be
able to search in (like longdescriptions), and having each package in
kde-base report http://www.kde.org/ as its homepage is kinda pointless
if you think about search, since that's not data, it's noise.

Which only adds to my point.

-- 
Diego Flameeyes Pettenò
http://blog.flameeyes.eu/




[gentoo-dev] Re: [RFC] Moving HOMEPAGE out of ebuilds for the future

2008-12-02 Thread Ryan Hill
On Mon, 01 Dec 2008 10:00:33 +0100
[EMAIL PROTECTED] (Diego 'Flameeyes' Pettenò) wrote:

 Alec Warner [EMAIL PROTECTED] writes:
 
  That being said I still don't see the usefulness here.
 
  You seem to think that using the existing APIs for this data is
  wrong, and I think the opposite, so I guess we will agree to
  disagree on this matter.
 
 Yeah I still think that there is no point in requiring using of a
 specific API when the same data can easily be available in a format
 that is more or less parsable with ease in any modern (and non)
 programming language.
 
 Beside, I find expanding the HOMEPAGE syntax to allow more than one
 link a bit ... overkill, if the same thing can be achieved in
 metadata.xml...

I find moving HOMEPAGE out of ebuilds to be a bit overkill.


-- 
gcc-porting,  by design, by neglect
treecleaner,  for a fact or just for effect
wxwidgets @ gentoo EFFD 380E 047A 4B51 D2BD C64F 8AA8 8346 F9A4 0662




Re: [gentoo-dev] Re: [RFC] Moving HOMEPAGE out of ebuilds for the future

2008-12-02 Thread Marius Mauch
On Wed, 03 Dec 2008 02:05:31 +0100
[EMAIL PROTECTED] (Diego 'Flameeyes' Pettenò) wrote:

 metadata.xml already contains data that eix and other software should
 be able to search in (like longdescriptions), and having each package
 in kde-base report http://www.kde.org/ as its homepage is kinda
 pointless if you think about search, since that's not data, it's
 noise.

So you're saying if I'm interested in a url to look for information
about kalarm, I should search for it in metadata.xml of random kde
packages? Sorry, but that doesn't make any sense to me.

While I'm not necessarily against your primary goal here, your
argumentation is very subjective to say the least (e.g. just because
you find xml easier to read/parse than ebuilds doesn't mean the same
holds true for everyone else, ignoring the whole cache issue). It
feels a bit like you're looking for problems to justify your solution
rather than the other way round.

Marius



[gentoo-dev] Re: [RFC] Moving HOMEPAGE out of ebuilds for the future

2008-12-01 Thread Diego 'Flameeyes' Pettenò
Alec Warner [EMAIL PROTECTED] writes:

 - Space savings.  Certainly your scheme may be smaller, but the XML
 tag overhead may eat into the savings.  You should do some estimates
 to show the community how much smaller the tree will be from this
 proposal.

Sorry but you lost me on any point you might have brought across since
after this I feel like you were trying to put words in my mouth.

Besides, if you really want to go down that road, you should take into
account that, apart from ReiserFS with tail packing, I don't remember any
other Linux FS with blocks smaller than 512 bytes, which means that each
file in the metadata cache takes up much more than its size in characters.

All your math is thus wrong.
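
A minimal sketch of the block-size argument, assuming a filesystem that
allocates whole blocks and does no tail packing. The 4096-byte block size
and the sample file sizes are assumptions chosen for illustration, not
measurements from any real cache.

```python
def allocated_bytes(file_size, block_size=4096):
    """Bytes a file occupies when space is handed out in whole blocks.

    block_size is an assumed value; tail packing and inline data are ignored.
    """
    # Ceiling division without floating point.
    blocks = -(-file_size // block_size)
    return blocks * block_size

# Hypothetical cache entry sizes in bytes.
for size in (300, 1100, 4096, 4097):
    print(size, "->", allocated_bytes(size))
```

Under these assumptions a 300-byte cache entry still costs one full block,
so removing a few dozen characters from it frees nothing on disk.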

-- 
Diego Flameeyes Pettenò
http://blog.flameeyes.eu/




[gentoo-dev] Re: [RFC] Moving HOMEPAGE out of ebuilds for the future

2008-12-01 Thread Diego 'Flameeyes' Pettenò
Jan Kundrát [EMAIL PROTECTED] writes:

 But also the need to replicate http://www.kde.org/ to metadata.xml of
 all KDE split ebuilds -- right now, this is set by an eclass.

The usefulness of this is IMHO debatable; why not just writing it one
package (say kde-base/kde or kde-meta) and just there? Having each
mini-package express itself as having that as its homepage is not very
useful to me, but I guess it's debatable.

 - allows proper handling of packages lacking a HOMEPAGE;

 Could you elaborate a bit about how different is handling of an
 empty/uninitialized shell variable from an empty XML element?

That you can provide _other_ links besides a homepage, like
"unmaintained", "gentoo:userguide" and so on, so that users are not
left with no homepage at all, and are not misdirected by the
homepage being http://www.gentoo.org/ or something.

 - users can check the metadata much more easily by just opening the xml
   file or interfacing to that rather than having to skim through the
   ebuild, the xml files are probably more user readable than ebuilds
   using multiple eclasses;

 Haven't we already agreed that accessing ebuilds/... directly is
 broken by design?

For software, sure, but as a user I am naturally inclined to just
look at the files if I'm looking for the homepage of a package I know,
and on seeing a metadata.xml file I'm more likely to look at that rather
than the metadata cache in /var/db/... .

And an XML file is certainly more user-readable than HOMEPAGE with
depend-like syntax for labels and conditionals and whatever else
Alec seems to be proposing for EAPI=3.

 - webapps like packages.gentoo.org would be able to display basic
   information without having to parse the ebuilds or the metadata cache.

 Except for the ebuilds which still use the old format (that is 100% of
 the tree right now)

This of course is meant for when this is fully implemented.

-- 
Diego Flameeyes Pettenò
http://blog.flameeyes.eu/




Re: [gentoo-dev] Re: [RFC] Moving HOMEPAGE out of ebuilds for the future

2008-12-01 Thread Alec Warner
On Mon, Dec 1, 2008 at 12:24 AM, Diego 'Flameeyes' Pettenò
[EMAIL PROTECTED] wrote:
 Alec Warner [EMAIL PROTECTED] writes:

 - Space savings.  Certainly your scheme may be smaller, but the XML
 tag overhead may eat into the savings.  You should do some estimates
 to show the community how much smaller the tree will be from this
 proposal.

 Sorry but you lost me on any point you might have brought across since
 after this I feel like you were trying to put words in my mouth.

Sorry for that, I never meant to imply that you said space savings.

That being said I still don't see the usefulness here.

You seem to think that using the existing APIs for this data is wrong,
and I think the opposite, so I guess we will agree to disagree on this
matter.


 Besides, if you really want to go down that road, you should take into
 account that, apart from ReiserFS with tail packing, I don't remember any
 other Linux FS with blocks smaller than 512 bytes, which means that each
 file in the metadata cache takes up much more than its size in characters.

 All your math is thus wrong.

As was pointed out on IRC, UTF8 characters are not a fixed size,
making my math even more wrong ;)


 --
 Diego Flameeyes Pettenò
 http://blog.flameeyes.eu/



[gentoo-dev] Re: [RFC] Moving HOMEPAGE out of ebuilds for the future

2008-12-01 Thread Diego 'Flameeyes' Pettenò
Alec Warner [EMAIL PROTECTED] writes:

 That being said I still don't see the usefulness here.

 You seem to think that using the existing APIs for this data is wrong,
 and I think the opposite, so I guess we will agree to disagree on this
 matter.

Yeah I still think that there is no point in requiring using of a
specific API when the same data can easily be available in a format that
is more or less parsable with ease in any modern (and non) programming
language.

Beside, I find expanding the HOMEPAGE syntax to allow more than one link
a bit ... overkill, if the same thing can be achieved in metadata.xml...

 Besides, if you really want to go down that road, you should take into
 account that, apart from ReiserFS with tail packing, I don't remember any
 other Linux FS with blocks smaller than 512 bytes, which means that each
 file in the metadata cache takes up much more than its size in characters.

 All your math is thus wrong.

 As was pointed out on IRC, UTF8 characters are not a fixed size,
 making my math even more wrong ;)

If we consider HOMEPAGE, the assumption that characters are fixed size
to 1 byte is good enough; URLs are usually encoded in pure ascii
character space for compatibility; while IDN would break that
assumption, we can't even assume that IDN is always available and so on.

For description it may be different, because there is room there for
UTF-8 characters, but that takes us even further away from the
point.
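
A small sketch of the character-versus-byte distinction, using only the
Python standard library. The strings are examples picked for illustration;
"bücher.example" is a made-up internationalized host name.

```python
def utf8_size(text):
    """Number of bytes the string occupies once encoded as UTF-8."""
    return len(text.encode("utf-8"))

# A pure-ASCII URL: character count and byte count are the same.
ascii_url = "http://www.kde.org/"
print(len(ascii_url), utf8_size(ascii_url))

# Free-form text may contain multi-byte characters.
name = "Petten\u00f2"
print(len(name), utf8_size(name))

# An internationalized host name is normally carried in its ASCII form.
host = "b\u00fccher.example"
print(host.encode("idna"))
```

For ASCII-only URLs the one-byte-per-character assumption holds; it only
breaks for free-form fields, or for host names kept in their non-ASCII form.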

-- 
Diego Flameeyes Pettenò
http://blog.flameeyes.eu/




[gentoo-dev] Re: [RFC] Moving HOMEPAGE out of ebuilds for the future

2008-11-30 Thread Diego 'Flameeyes' Pettenò
Jan Kundrát [EMAIL PROTECTED] writes:

 I believe the reason was that HOMEPAGE might change with new versions
 and that metadata.xml didn't (doesn't?) support version-specific data.

At least the maintainer and (iirc, at least that's how we proposed
it the first time around) flag tags support a restrict attribute.

But I really expect that as long as the package is the same, homepage is
unlikely to change with version; maybe with slot I guess, but even that
is debatable and somewhat rare I think.

-- 
Diego Flameeyes Pettenò
http://blog.flameeyes.eu/




[gentoo-dev] Re: [RFC] Moving HOMEPAGE out of ebuilds for the future

2008-11-30 Thread Diego 'Flameeyes' Pettenò
[EMAIL PROTECTED] (Diego 'Flameeyes' Pettenò) writes:

 I have a very quick proposal: why don't we move the packages' homepage
 in metadata.xml (since it's usually unique for all the versions) and we
 get rid of the variable for the next EAPI version?

I forgot to say that this also addresses, for the future EAPI, the
problem of what to do with missing HOMEPAGE. We still have to find a
solution for that on the EAPI 0, 1 and 2 though since it's a bit of a
big problem when we point to domain squatters.

If it was feasible to just make missing HOMEPAGE a softfail for the
other three it would be even better.

-- 
Diego Flameeyes Pettenò
http://blog.flameeyes.eu/




[gentoo-dev] Re: [RFC] Moving HOMEPAGE out of ebuilds for the future

2008-11-30 Thread Diego 'Flameeyes' Pettenò
Tobias Scherbaum [EMAIL PROTECTED] writes:

 But what about additional slot or version attributes like
 <link type="homepage" slot="1.4">http://java.sun.com/j2se/1.4.2/</link>
 (or a version attribute)? If slot and version aren't specified they'd be
 interpreted as wildcards.

<link type="homepage" restrict="dev-java/sun-jdk:1.4">

The restrict attribute exists already and it's better to reuse the same
code, isn't it?
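
A rough sketch of how a tool might pick a homepage from such a file. The
link element and its attributes are the proposal under discussion, not an
existing schema; the unrestricted second link and the plain string
comparison are assumptions made for illustration. Real restrict values are
dependency atoms, which only a package manager can evaluate properly.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata.xml fragment using the proposed <link> element.
SAMPLE = """\
<pkgmetadata>
  <link type="homepage" restrict="dev-java/sun-jdk:1.4">http://java.sun.com/j2se/1.4.2/</link>
  <link type="homepage">http://java.sun.com/</link>
</pkgmetadata>
"""

def homepages(xml_text, atom):
    """Return homepage links restricted to atom, else the unrestricted ones.

    Matching is plain string equality, which is a simplification.
    """
    root = ET.fromstring(xml_text)
    links = [l for l in root.iter("link") if l.get("type") == "homepage"]
    exact = [(l.text or "").strip() for l in links if l.get("restrict") == atom]
    if exact:
        return exact
    return [(l.text or "").strip() for l in links if l.get("restrict") is None]

print(homepages(SAMPLE, "dev-java/sun-jdk:1.4"))
print(homepages(SAMPLE, "dev-java/sun-jdk:1.6"))
```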

-- 
Diego Flameeyes Pettenò
http://blog.flameeyes.eu/




Re: [gentoo-dev] Re: [RFC] Moving HOMEPAGE out of ebuilds for the future

2008-11-30 Thread Tobias Scherbaum
Diego 'Flameeyes' Pettenò wrote:
 Tobias Scherbaum [EMAIL PROTECTED] writes:
 
  But what about additional slot or version attributes like
  <link type="homepage" slot="1.4">http://java.sun.com/j2se/1.4.2/</link>
  (or a version attribute)? If slot and version aren't specified they'd be
  interpreted as wildcards.
 
 <link type="homepage" restrict="dev-java/sun-jdk:1.4">
 
 The restrict attribute exists already and it's better to reuse the same
 code, isn't it?

In general yes, but in that case you're duplicating info like
dev-java/sun-jdk unnecessarily. Reducing this to restrict=1.4 isn't
easily readable as you'd need to know that restrict would specify a
slot. If your plan is to make it easier to find useful information about
a package (without using a fancy frontend, just reading the metadata.xml
with $EDITOR) slot=1.4 (or a version attribute) might be a tad more
human readable. 

  Tobias




[gentoo-dev] Re: [RFC] Moving HOMEPAGE out of ebuilds for the future

2008-11-30 Thread Diego 'Flameeyes' Pettenò
Tobias Scherbaum [EMAIL PROTECTED] writes:

 dev-java/sun-jdk unnecessarily. Reducing this to restrict=1.4 isn't
 easily readable as you'd need to know that restrict would specify a
 slot. If your plan is to make it easier to find useful information about
 a package (without using a fancy frontend, just reading the metadata.xml
 with $EDITOR) slot=1.4 (or a version attribute) might be a tad more
 human readable. 

Well if we go to these things we should just apply the same to the other
attributes using restrict, since we want to have something coherent,
don't we? ;)

-- 
Diego Flameeyes Pettenò
http://blog.flameeyes.eu/




[gentoo-dev] Re: [RFC] Moving HOMEPAGE out of ebuilds for the future

2008-11-30 Thread Diego 'Flameeyes' Pettenò
Alec Warner [EMAIL PROTECTED] writes:

 Diego, What are the concrete benefits of your proposal?

As I said:

- no need to replicate homepage data between versions; even though forks
  can change homepage, I would expect that to at worst split in two a
  package, or have to be different by slot, like Java;
- allows proper handling of packages lacking a HOMEPAGE;
- less data in metadata cache;
- users can check the metadata much more easily by just opening the xml
  file or interfacing to that rather than having to skim through the
  ebuild, the xml files are probably more user readable than ebuilds
  using multiple eclasses;
- displaying info about the package does not require parsing the full
  ebuild file, with its eclasses;
- extensible to provide more links than just the homepage (forums,
  trackers, gentoo-specific documentation, ...);
- if we also move DESCRIPTION, search software can ignore everything
  about ebuild parsing, and just use the metadata.xml files; considering
  how many people actually use or used eix, it would make sense to allow
  third-party applications to be able to search through the tree;
- webapps like packages.gentoo.org would be able to display basic
  information without having to parse the ebuilds or the metadata cache.
- as much as people might think metadata is easier to parse than
  anything, XML has one huge advantage: there are plenty of parsers for
  any language without having to actually write one, even as easy as it
  can be, and it's easily interfaced with anything; I wrote a simple XSL
  file that outputs the basic metadata details for packages without
  having any parser or executable code but xsltproc (or any other XSLT
  software), correlating data with herds.xml too;
- it really is metadata, and it makes very little sense to need parsing
  of eclasses and EAPI handling to get some data from a package that is
  non-functional in nature and free form (just like DESCRIPTION, and
  unlike LICENSE like Alec said), and that changes at worst once each
  slot (unlike LICENSE that can change at any given version).

Disadvantages:

- it requires user-interface software to parse metadata.xml to show
  data for a package; which is already needed to show per-package USE
  flag meaning;
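
To illustrate how little code a stock XML parser needs for this, here is a
minimal sketch using Python's standard library. The sample document and its
values are hypothetical; only the element names (pkgmetadata, herd,
longdescription, use, flag) follow the existing metadata.xml layout.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata.xml content, for illustration only.
SAMPLE = """\
<pkgmetadata>
  <herd>kde</herd>
  <longdescription>Example long description.</longdescription>
  <use>
    <flag name="gtk">Build the optional GTK+ front end</flag>
  </use>
</pkgmetadata>
"""

def summarize(xml_text):
    """Collect herds, long description and per-package USE flag descriptions."""
    root = ET.fromstring(xml_text)
    return {
        "herds": [h.text for h in root.findall("herd")],
        "longdescription": root.findtext("longdescription", default=""),
        "use": {f.get("name"): f.text for f in root.findall("use/flag")},
    }

print(summarize(SAMPLE))
```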

General points:

- it does not solve unrelated problems like code replication;

Can someone come up with any other point besides "I don't like XML"
(which I already said is a puny answer) and "it can theoretically be 10
different homepages for 10 different versions" (which I sincerely have
some beef with myself, since if you fork a piece of software you might as
well change its name)?

As I said, moving out the HOMEPAGE field is, from a package manager
perspective, non-functional; if you're showing the user some data
about a package you might as well show as much as you can, like long
descriptions, other links, and USE flags. And the fact that you can ask
the package manager for something is for me not a valid reason to avoid
moving something to a more approachable place for other software.

-- 
Diego Flameeyes Pettenò
http://blog.flameeyes.eu/




Re: [gentoo-dev] Re: [RFC] Moving HOMEPAGE out of ebuilds for the future

2008-11-30 Thread Jan Kundrát

Diego 'Flameeyes' Pettenò wrote:

- no need to replicate homepage data between versions; even though forks
  can change homepage, I would expect that to at worst split in two a
  package, or have to be different by slot, like Java;


But also the need to replicate http://www.kde.org/ to metadata.xml of 
all KDE split ebuilds -- right now, this is set by an eclass.



- allows proper handling of packages lacking a HOMEPAGE;


Could you elaborate a bit about how different is handling of an 
empty/uninitialized shell variable from an empty XML element?



- less data in metadata cache;


Isn't it in the cache for some reason? Really, I'm just asking.


- users can check the metadata much more easily by just opening the xml
  file or interfacing to that rather than having to skim through the
  ebuild, the xml files are probably more user readable than ebuilds
  using multiple eclasses;


Haven't we already agreed that accessing ebuilds/... directly is broken 
by design?



- webapps like packages.gentoo.org would be able to display basic
  information without having to parse the ebuilds or the metadata cache.


Except for the ebuilds which still use the old format (that is 100% of 
the tree right now)


Cheers,
-jkt

--
cd /local/pub && more beer > /dev/mouth





Re: [gentoo-dev] Re: [RFC] Moving HOMEPAGE out of ebuilds for the future

2008-11-30 Thread Ciaran McCreesh
On Mon, 01 Dec 2008 00:12:23 +0100
[EMAIL PROTECTED] (Diego 'Flameeyes' Pettenò) wrote:
 - no need to replicate homepage data between versions; even though
 forks can change homepage, I would expect that to at worst split in
 two a package, or have to be different by slot, like Java;

You mean no way of handling generated homepages, use conditional
homepages, per version homepages or common homepages.

 - allows proper handling of packages lacking a HOMEPAGE;

Uh, we can do that using in-ebuild HOMEPAGE too. Just need to decide on
a convention.

 - less data in metadata cache;

Entirely a non-issue. Heck, we want more in there, not less. 

 - users can check the metadata much more easily by just opening the
 xml file or interfacing to that rather than having to skim through the
   ebuild, the xml files are probably more user readable than ebuilds
   using multiple eclasses;

...or they can just use a decent tool. Try 'paludis --query' for an
example.

 - displaying info about the package does not require parsing the full
   ebuild file, with its eclasses;

Uhm. It doesn't anyway, because of the metadata cache.

 - extensible to provide more links than just the homepage (forums,
   trackers, gentoo-specific documentation, ...);

So's HOMEPAGE. You could extend the syntax to allow annotations:

HOMEPAGE="
http://example.com/
http://forums.example.com/ [[ role = forums ]]
http://www.gentoo.org/example [[ role = [ Gentoo specific docs ] ]]
gtk+? ( http://gui.example.com/ [[ role = [ Optional GUI docs ] ]] )"
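
A rough sketch of what consuming such an annotated value could look like.
The syntax is only the suggestion made above, so everything here is an
assumption: the parser handles just what the example shows (URLs, one
optional annotation after a URL, and non-nested use-conditional groups)
and ignores the annotation key.

```python
def parse_homepage(value):
    """Parse the annotated HOMEPAGE sketch into (url, role, use_flag) tuples."""
    tokens = value.split()
    entries = []
    flag = None
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok.endswith("?"):        # start of a use-conditional group
            flag = tok[:-1]
            i += 2                   # skip the flag and the "(" after it
        elif tok == ")":             # end of the group
            flag = None
            i += 1
        elif tok == "[[":            # annotation for the previous URL
            end = tokens.index("]]", i)
            # Skip "[[", the key and "=", then drop the inner brackets.
            words = [t for t in tokens[i + 3:end] if t not in ("[", "]")]
            url, _, use = entries[-1]
            entries[-1] = (url, " ".join(words), use)
            i = end + 1
        else:                        # a plain URL
            entries.append((tok, None, flag))
            i += 1
    return entries

SAMPLE = """
http://example.com/
http://forums.example.com/ [[ role = forums ]]
http://www.gentoo.org/example [[ role = [ Gentoo specific docs ] ]]
gtk+? ( http://gui.example.com/ [[ role = [ Optional GUI docs ] ]] )
"""

for entry in parse_homepage(SAMPLE):
    print(entry)
```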


 - if we also move DESCRIPTION, search software can ignore everything
   about ebuild parsing, and just use the metadata.xml files;
 considering how many people actually use or used eix, it would make
 sense to allow third-party applications to be able to search through
 the tree;

Except that any decent search client needs to be aware of masks,
visibility and so on anyway.

 - webapps like packages.gentoo.org would be able to display basic
   information without having to parse the ebuilds or the metadata
 cache.

But they already display complex information.

 - as much as people might think metadata is easier to parse than
   anything, XML has one huge advantage: there are plenty of parsers
 for any language without having to actually write one, even as easy
 as it can be, and it's easily interfaced with anything; I wrote a
 simple XSL file that outputs the basic metadata details for packages
 without having any parser or executable code but xsltproc (or any
 other XSLT software), correlating data with herds.xml too;

...or you could use a proper ebuild-aware tool that displays metadata
details, including things like visibility. Again, paludis --query.

 - it really is metadata, and it makes very little sense to need
 parsing of eclasses and EAPI handling to get some data from a package
 that is non-functional in nature and free form (just like
 DESCRIPTION, and unlike LICENSE like Alec said), and that changes at
 worse once each slot (unlike LICENSE that can change at any given
 version).

It isn't non-functional.

 And the fact that you can ask the package manager for something is
 for me not a valid reason to avoid moving something to a more
 approachable place for other software.

More approachable is a decent package manager API. If you had that
you wouldn't need to mess around with XML APIs.

-- 
Ciaran McCreesh




Re: [gentoo-dev] Re: [RFC] Moving HOMEPAGE out of ebuilds for the future

2008-11-30 Thread Alec Warner
On Sun, Nov 30, 2008 at 3:12 PM, Diego 'Flameeyes' Pettenò
[EMAIL PROTECTED] wrote:
 Alec Warner [EMAIL PROTECTED] writes:

 Diego, What are the concrete benefits of your proposal?

 As I said:

 - no need to replicate homepage data between versions; even though forks
  can change homepage, I would expect that to at worst split in two a
  package, or have to be different by slot, like Java;
 - allows proper handling of packages lacking a HOMEPAGE;
 - less data in metadata cache;
 - users can check the metadata much more easily by just opening the xml
  file or interfacing to that rather than having to skim through the
  ebuild, the xml files are probably more user readable than ebuilds
  using multiple eclasses;
 - displaying info about the package does not require parsing the full
  ebuild file, with its eclasses;
 - extensible to provide more links than just the homepage (forums,
  trackers, gentoo-specific documentation, ...);
 - if we also move DESCRIPTION, search software can ignore everything
  about ebuild parsing, and just use the metadata.xml files; considering
  how many people actually use or used eix, it would make sense to allow
  third-party applications to be able to search through the tree;
 - webapps like packages.gentoo.org would be able to display basic
  information without having to parse the ebuilds or the metadata cache.
 - as much as people might think metadata is easier to parse than
  anything, XML has one huge advantage: there are plenty of parsers for
  any language without having to actually write one, even as easy as it
  can be, and it's easily interfaced with anything; I wrote a simple XSL
  file that outputs the basic metadata details for packages without
  having any parser or executable code but xsltproc (or any other XSLT
  software), correlating data with herds.xml too;
 - it really is metadata, and it makes very little sense to need parsing
  of eclasses and EAPI handling to get some data from a package that is
  non-functional in nature and free form (just like DESCRIPTION, and
  unlike LICENSE like Alec said), and that changes at worst once each
  slot (unlike LICENSE that can change at any given version).

 Disadvantages:

 - it requires user-interface software to parse metadata.xml to show
  data for a package; which is already needed to show per-package USE
  flag meaning;

 General points:

 - it does not solve unrelated problems like code replication;

 Can someone come up with any other point besides "I don't like XML"
 (which I already said is a puny answer) and "it can theoretically be 10
 different homepages for 10 different versions" (which I sincerely have
 some beef with myself, since if you fork a piece of software you might as
 well change its name)?

 As I said, moving out the HOMEPAGE field is, from a package manager
 perspective, non-functional; if you're showing the user some data
 about a package you might as well show as much as you can, like long
 descriptions, other links, and USE flags. And the fact that you can ask
 the package manager for something is for me not a valid reason to avoid
 moving something to a more approachable place for other software.

Ciaran covered most of my points already.

Third party programs should not parse ebuilds and eclasses by hand.
I'd expect half of them to get it wrong if they tried.
Ebuild parsing is hard, that is why we have three complex software
packages that for the most part do it properly.

Why is 'ask the package manager' an invalid reason for not making
something more accessible?
How accessible must this data be?

Writing an XML parser is not accessible enough (for me), we should
just put it in plain text on the hard drive, perhaps in
/var/cache/edb/dep/${PORTDIR}/$C/$PV

Oh wait, we do that already[1].
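
As a sketch of how approachable that plain text is, the function below
reads one cache entry, assuming the entry is a list of KEY=value lines.
That layout is an assumption; the exact cache format depends on the
package manager and its configuration, and the sample entry is made up.

```python
def read_cache_entry(text):
    """Parse KEY=value lines into a dict (assumed cache entry layout)."""
    entry = {}
    for line in text.splitlines():
        # Split at the first "=" only, since URLs may contain "=" too.
        key, sep, value = line.partition("=")
        if sep:
            entry[key] = value
    return entry

# Hypothetical cache entry contents.
SAMPLE = (
    "DESCRIPTION=An example package\n"
    "HOMEPAGE=http://example.com/?page=home\n"
    "SLOT=0\n"
)

print(read_cache_entry(SAMPLE)["HOMEPAGE"])
```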

So really this is where I'm confused.
If third parties are using the package manager APIs to get at this
data; the only rationale to move it out of ebuilds is:

- Space savings.  Certainly your scheme may be smaller, but the XML
tag overhead may eat into the savings.  You should do some estimates
to show the community how much smaller the tree will be from this
proposal.

Randomly looking:

cd /var/cache/edb/dep/usr/portage
grep -hR HOMEPAGE . | wc -m
yields 1.1 million characters.  Each character is 1 byte (is that so in UTF-8?)
So at best you could save the 1.2GB tree 2.2 million bytes (about 2
megs) if your scheme was (more than) 100% efficient.
The extra 1.1 million characters comes from the space freed in the
cache (since we don't cache metadata.xml).

2 megs into 1200 megs is about 0.16% of the tree.  As I thought, not
very compelling.

Looking at DESCRIPTION:

grep -hR DESCRIPTION . | wc -m
yields ~1.5 million characters.  Nice!

So if we purge that from the cache and replace it with a (more than)
100% efficient metadata.xml solution we could save: 3 megs

3 megs saved + 2 megs saved = 5 megs saved.  5 / 1200 = about 0.41% of
the tree.  Still not very compelling.
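
The arithmetic above can be redone in a few lines. The inputs are the
rounded figures from this message (a 1200 meg tree, about 2 megs for
HOMEPAGE and 3 megs for DESCRIPTION); nothing here is a new measurement.

```python
TREE_MB = 1200        # tree size used in this message
HOMEPAGE_MB = 2       # ~1.1 million characters, counted for tree and cache
DESCRIPTION_MB = 3    # ~1.5 million characters, counted for tree and cache

def share(saved_mb, total_mb=TREE_MB):
    """Saved space as a percentage of the whole tree."""
    return 100 * saved_mb / total_mb

print(f"HOMEPAGE only: {share(HOMEPAGE_MB):.2f}%")
print(f"HOMEPAGE + DESCRIPTION: {share(HOMEPAGE_MB + DESCRIPTION_MB):.2f}%")
```

Rounded to two decimals this gives 0.17% and 0.42%; the percentages quoted
in the text are the same values truncated rather than rounded.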

- Extra Tags.  Extending HOMEPAGE is harder than