Hello,

I would like to propose changes that add new functionality to IPS.
Basically, the changes add a new file (or files) on the server side
containing specific metadata about FMRIs. Other distributions already
provide this kind of metadata [1].

The file for Ubuntu gutsy is around 1.3M gzipped and 6.3M uncompressed,
but it contains much more information than we would put into the
cataloginfo file, and it covers many more packages (5425).
I've made sample cataloginfo data and it's around 30K gzipped.
Benefits:
- Package Manager/Update Manager startup time with fully loaded
  descriptions will be reduced. Our goal is to have a fully
  functional GUI application with its data loaded in less than 10
  seconds. This needs to be done in a scalable way, so that the user
  does not lose performance even with many authorities.
- The number of client-server operations will be reduced.
- Ability to search package descriptions/names using web-based
  search. The fix for 3014 could be extended to allow users to
  search the descriptions of packages.
- "pkg list -s" performance would be much improved.
When the cataloginfo file will be created/re-created:
The idea is that the server will create/re-create the cataloginfo file
from the metadata:
- while packages are being sent to the server
- pkg.depotd:
  - add "--rebuild-cataloginfo"
Synchronization of the cataloginfo files:
pkg(1):
- the --refresh operation should allow the user to specify that the
  cataloginfo file should not be fetched; by default we will
  always get the cataloginfo file.
api:
- extend current refresh operation:
def refresh(self, full_refresh, auths=None, locales=["C",]):
    """Refreshes the catalogs. full_refresh controls
    whether to do a full retrieval of the catalog and
    cataloginfo from the authority or only update the
    existing catalog and cataloginfo files. auths is a list
    of authorities to refresh. Passing an empty list or using
    the default value means all known authorities will be
    refreshed. locales is a list of the locale-specific
    cataloginfo files which will be downloaded during the
    refresh operation. Passing an empty list will skip
    getting locale-specific cataloginfo files. By default
    only the "C" cataloginfo file is downloaded.
    While it currently returns an image object,
    this is an expedient for allowing existing code to work
    while the rest of the API is put into place."""
Package Manager:
- any operation that involves a call to the api refresh.
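
To make the locales argument more concrete, here is a minimal sketch of
how a client might turn it into the list of files to fetch. The helper
name cataloginfo_paths is hypothetical (it is not existing pkg API), and
the file names are an assumption based on the repository layout proposed
in "The cataloginfo" section of this mail.

```python
def cataloginfo_paths(locales=("C",)):
    """Return the relative paths of the cataloginfo files to download.

    Hypothetical helper: "C" selects the default file, any other locale
    selects i18n/cataloginfo_<locale>. An empty sequence selects nothing.
    """
    paths = []
    for locale in locales:
        if locale == "C":
            path = "cataloginfo"
        else:
            path = "i18n/cataloginfo_" + locale
        if path not in paths:  # ignore duplicate locales
            paths.append(path)
    return paths
```

The sketch uses a tuple as the default value; a list default such as
locales=["C",] is shared between calls in Python, so it is only safe as
long as nobody modifies it.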
The cataloginfo:
There should be one default-language cataloginfo file per
repository and corresponding i18n files. The cataloginfo files can be
stored in a compressed binary format, for example one generated using
cPickle.
Example:
    catalog/
        attrs
        catalog
        cataloginfo
        i18n/
            cataloginfo_de
            cataloginfo_pl
            cataloginfo_en_GB
    cfg_cache
    file/
    index/
    pkg/
    trans/
    updatelog/
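
As a rough illustration of the compressed binary storage mentioned
above, the sketch below pickles the data and gzips the result. The
function names are hypothetical, and it uses the Python 3 pickle module
(the counterpart of cPickle); the actual on-disk format is still open.

```python
import gzip
import pickle


def pack_cataloginfo(data):
    """Serialize cataloginfo data to a gzip-compressed pickle (bytes)."""
    raw = pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)
    return gzip.compress(raw)


def unpack_cataloginfo(blob):
    """Restore the data written by pack_cataloginfo."""
    return pickle.loads(gzip.decompress(blob))
```

One thing to keep in mind: unpickling can execute arbitrary code, so a
client using this format has to trust the authority it downloads from.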
The catalog/attrs file should contain information about the last
modification of cataloginfo, similar to the "S Last-Modified:
[timespec]" attribute for the catalog file.
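
A minimal sketch of how a client could use such an attribute to decide
whether the cataloginfo needs to be fetched again. The attribute name
"Cataloginfo-Last-Modified" is made up for this example, and the
timespec is assumed to be an ISO 8601 timestamp; neither is decided in
this proposal.

```python
from datetime import datetime


def parse_attrs(text):
    """Parse 'S <name>: <value>' lines into a dict; other lines are skipped."""
    attrs = {}
    for line in text.splitlines():
        if not line.startswith("S "):
            continue
        name, sep, value = line[2:].partition(":")
        if sep:
            attrs[name.strip()] = value.strip()
    return attrs


def cataloginfo_is_stale(local_attrs, remote_attrs,
                         key="Cataloginfo-Last-Modified"):
    """Return True when the server copy is newer than the local one."""
    remote = remote_attrs.get(key)
    if remote is None:
        return False  # server publishes no cataloginfo timestamp
    local = local_attrs.get(key)
    if local is None:
        return True   # nothing cached yet
    # Assumption: both values are ISO 8601 timestamps.
    return datetime.fromisoformat(remote) > datetime.fromisoformat(local)
```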
cataloginfo may contain the following information about a package:
- FMRI
- display name
- display description
- categories
- ??????????????????????????????????????????
Because we can have multiple FMRIs for each package, the
cataloginfo file should store only those values that change
from one version to the next.
Example:
The server has 4 versions of the same package PKG_NAME:
    A_FMRI4
    A_FMRI3
    A_FMRI2
    A_FMRI1
The values were first set in A_FMRI1, the display description was
then updated in A_FMRI3, and the categories were updated in
A_FMRI4, so we should have:
    PKG_NAME
        A_FMRI4
            categories: Applications/CoolGnomeApps
        A_FMRI3
            display description: here is version 3.45 of some
                fancy app
        A_FMRI1
            categories: Applications/CoolApps
            display name: fancy package
            display description: here is some fancy application
This makes it possible to get the attributes of a specific version,
as well as the newest ones for packages that are not installed,
while reducing the size of the file.
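
To show how a client could read such a file, here is a minimal sketch
that rebuilds the full attribute set of one version from the stored
changes. The function name and the data structures are assumptions made
for this example; in particular it assumes the client already knows the
ordered list of versions from the catalog.

```python
def resolve_attrs(versions, deltas, fmri=None):
    """Return the effective attributes of one version of a package.

    versions: all FMRIs of the package, oldest first (from the catalog).
    deltas:   maps an FMRI to only the attributes that changed in it.
    fmri:     version to resolve; None means the newest one.
    """
    if fmri is None:
        fmri = versions[-1]
    if fmri not in versions:
        raise KeyError(fmri)
    attrs = {}
    for version in versions:
        # Later versions override values set by earlier ones.
        attrs.update(deltas.get(version, {}))
        if version == fmri:
            break
    return attrs


# The PKG_NAME example from above.
VERSIONS = ["A_FMRI1", "A_FMRI2", "A_FMRI3", "A_FMRI4"]
DELTAS = {
    "A_FMRI1": {
        "categories": "Applications/CoolApps",
        "display name": "fancy package",
        "display description": "here is some fancy application",
    },
    "A_FMRI3": {
        "display description": "here is version 3.45 of some fancy app",
    },
    "A_FMRI4": {
        "categories": "Applications/CoolGnomeApps",
    },
}

newest = resolve_attrs(VERSIONS, DELTAS)             # attributes of A_FMRI4
second = resolve_attrs(VERSIONS, DELTAS, "A_FMRI2")  # same as A_FMRI1
```

A_FMRI2 has no entry of its own, so it simply inherits everything from
A_FMRI1.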
The corresponding i18n/cataloginfo_* file may contain:
- FMRI
- translated display name
- translated short description
- translated categories
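
One possible way for a client to combine the default file with a locale
file is sketched below. Falling back to the default value when a
translation is missing or empty is an assumption of this example, not
something this proposal specifies, and the function name is
hypothetical.

```python
def localized_attrs(default_attrs, translated_attrs):
    """Overlay translated values on the default ("C") values.

    Missing or empty translations keep the default value. The inputs
    are left unmodified.
    """
    merged = dict(default_attrs)
    merged.update((k, v) for k, v in translated_attrs.items() if v)
    return merged
```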
[1]
Gentoo
    Gentoo stores the metadata for each package in a separate file,
    which makes refreshing the catalog not very efficient:
    http://sources.gentoo.org/viewcvs.py/gentoo-x86/app-office/openoffice/metadata.xml?view=markup
Ubuntu
    A solution similar to the proposed one, but Ubuntu stores all
    information about the packages in a flat file (similar to a manifest):
    http://archive.ubuntu.com/ubuntu/dists/gutsy/main/
    http://archive.ubuntu.com/ubuntu/dists/gutsy/main/binary-i386/
    http://archive.ubuntu.com/ubuntu/dists/gutsy/main/i18n/
--
best
Michal Pryc
http://blogs.sun.com/migi