On 03/31/2014 01:02 AM, Denis Silakov wrote:
Hi all,
As many of you have likely noticed, package publishing in ABF usually
takes a significant amount of time - several minutes (sometimes
up to 10 minutes).
One of the time-consuming publishing tasks is generation of the
hdlist.cz file - this is a huge file containing the internal urpm
representation of metadata, including file lists, package
descriptions, etc. This file seems to be redundant - nowadays we
generate additional xml files (changelog.xml, info.xml, etc.) which in
combination with lightweight synthesis.hdlist.cz provide the same
information. However, I am not so familiar with hdlist.cz and can't
guarantee that nothing will be lost if we completely drop it.
So the question is - can somebody say what we will lose (if anything) if
we drop the hdlist.cz files? Or maybe we should just try and see?
Hi,
I hope someone can make some sense of my experience wrt hdlist.cz etc.
Many years ago, iirc, synthesis.hdlist.cz was introduced as an optional
source of media info that could be chosen by those with a slow internet
connection, as it was smaller and took less time to download than the
traditional, default hdlist.cz. I always chose hdlist.cz as my
connection was relatively quick and, I think, there was more information
included, naturally. Also, whenever I used urpmq/urpmf to query the
database, which is my primary usage of this important tool for
investigating package capabilities, the result was almost immediate,
since the data was already on my computer. At some point, hdlist.cz was
no longer available as a way to configure urpmi. Every time I search
for a package containing a file of interest, I have to wait a long time
for xml files to be downloaded from each media source. Also, when I
look for a changelog in MandrivaUpdate or with urpmq, it must be
retrieved. IIANM, the package info would already be available on my
machine when using hdlist.cz as urpmi.update had already been done.
Admittedly, this is an accounting of the events of a number of years
that is challenging for an aged memory, but my recollected experience is
that the functionality of urpm was better with hdlist.cz than with
anything that has come since. Maybe generation of the others could be
dropped to gain publishing speed? :) Alternatively, perhaps an option
could be provided where all the current information is downloaded when
urpmi.update is run and/or with command-line switches.
^^^Those words reminded me of the policy option to always download xml
information in the rpmdrake media manager, which I recall trying before
without improvement. I see this in the urpmi.cfg manual about
global options:
xml-info
    For remote media, specify when files.xml.lzma,
    changelog.xml.lzma and info.xml.lzma are downloaded:

    never

    on-demand
        (This is the default).
        The specific xml info file is downloaded when
        urpmq/urpmf/rpmdrake ask for it. urpmi.update will remove
        outdated xml info file.
        nb: if urpmq/urpmf/rpmdrake is not run by root, the xml
        info file is downloaded into /tmp/.urpmi-<uid>/

    update-only
        urpmi.update will update xml info files already required
        at least once by urpmq/urpmf/rpmdrake.
        nb: with update-only, urpmi.update will not update
        /tmp/.urpmi-<uid>/ xml info files

    always
        all xml info files are downloaded when doing
        urpmi.addmedia and urpmi.update
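
Read literally, that excerpt amounts to a small decision table for what
urpmi.update does with each xml info file. A minimal Python sketch of
that reading follows. The function name, the parameter and the returned
labels are hypothetical and not part of urpmi; treating "never" as "do
nothing" is an assumption, since the excerpt gives no description for it.

```python
def update_action(policy: str, in_system_cache: bool) -> str:
    """Model what urpmi.update does with one xml info file.

    policy          -- value of the xml-info option in urpmi.cfg
    in_system_cache -- True if an earlier query run as root already
                       fetched this file (copies under
                       /tmp/.urpmi-<uid>/ do not count, per the manual)
    """
    if policy == "always":
        # All xml info files are downloaded on urpmi.update.
        return "download"
    if policy == "update-only":
        # Only files requested at least once before are refreshed.
        return "download" if in_system_cache else "nothing"
    if policy == "on-demand":
        # Nothing is fetched here; an outdated copy is removed instead.
        return "remove-if-outdated" if in_system_cache else "nothing"
    if policy == "never":
        # Assumption: no xml info file is ever fetched or refreshed.
        return "nothing"
    raise ValueError(f"unknown xml-info policy: {policy!r}")


if __name__ == "__main__":
    for policy in ("never", "on-demand", "update-only", "always"):
        print(policy, update_action(policy, in_system_cache=True))
```

This models only the urpmi.update side of the manual text. It says
nothing about what a later urpmq/urpmf query does.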
I checked and no global policy was defined, so I set it to Always in
media manager, which is now reflected in urpmi.cfg:
{
downloader: curl
verify-rpm: 1
xml-info: always
}
I guess this stanza was empty before, by default. I then ran a urpmf
query and watched as xml files began to be downloaded, so I quit. This
is what I recall of my previous attempt(s) to replicate the hdlist.cz
behavior. Just in case, I ran urpmi.update (as root), followed by urpmf
(as user). The following lists were all downloaded before the query
finished:
http://mirror.rosalab.ru/rosa/rosa2012.1/repository/x86_64/media/main/updates/media_info/files.xml.lzma
http://mirror.rosalab.ru/rosa/rosa2012.1/repository/i586/media/main/updates/media_info/files.xml.lzma
http://mirror.rosalab.ru/rosa/rosa2012.1/repository/x86_64/media/non-free/updates/media_info/files.xml.lzma
http://mirror.rosalab.ru/rosa/rosa2012.1/repository/i586/media/non-free/updates/media_info/files.xml.lzma
http://mirror.rosalab.ru/rosa/rosa2012.1/repository/x86_64/media/restricted/updates/media_info/files.xml.lzma
http://mirror.rosalab.ru/rosa/rosa2012.1/repository/i586/media/restricted/updates/media_info/files.xml.lzma
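
The six URLs above follow one pattern: one files.xml.lzma per
architecture and media section. A small Python sketch that rebuilds the
list makes it easier to see how many downloads a single urpmf query
implies. The path layout is inferred only from the URLs listed here, and
the function and constant names are hypothetical.

```python
from itertools import product

# Values taken from the URLs listed above.
BASE = "http://mirror.rosalab.ru/rosa/rosa2012.1/repository"
SECTIONS = ["main", "non-free", "restricted"]
ARCHES = ["x86_64", "i586"]


def files_xml_urls(base=BASE, sections=SECTIONS, arches=ARCHES,
                   repo="updates"):
    """Return one files.xml.lzma URL per (section, arch) pair."""
    return [
        f"{base}/{arch}/media/{section}/{repo}/media_info/files.xml.lzma"
        for section, arch in product(sections, arches)
    ]


if __name__ == "__main__":
    for url in files_xml_urls():
        print(url)
```

With three sections and two architectures this yields the same six
files, in the same order as the list above.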
A urpmq --changelog query is quicker, since urpmi knows where the single
package comes from, but that source's changelog.xml.lzma is still
downloaded, even after the previous configurations, whereas the old
behavior with hdlist.cz was to immediately use the info on the
computer. As I say, my accounting might be distorted by
misunderstanding, time, or nostalgic prejudices but that's my story and
I'm sticking to it!
Thanks,
Rolf