Just a suggestion, but would it make sense to generate it just before release? It could also be regenerated on demand after a certain number of packages have been updated. That is, of course, assuming any of its functionality is still useful. Historical changelogs, perhaps.
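To make the suggestion concrete, here is a rough sketch of the trigger logic: regenerate just before a release, or once enough packages have been updated. The class name, the method names and the threshold value are all hypothetical and only illustrate the idea; none of them correspond to actual ABF settings or code.

```python
from dataclasses import dataclass


@dataclass
class HdlistPolicy:
    """Hypothetical policy object; nothing here is an existing ABF setting."""

    threshold: int = 200            # assumed number of updated packages that triggers a rebuild
    updated_since_rebuild: int = 0  # packages published since hdlist.cz was last generated

    def record_publish(self, package_count: int) -> None:
        # Count packages published since the last regeneration.
        self.updated_since_rebuild += package_count

    def should_regenerate(self, is_release: bool = False) -> bool:
        # Regenerate just before a release, or once enough packages have changed.
        return is_release or self.updated_since_rebuild >= self.threshold

    def mark_regenerated(self) -> None:
        # Reset the counter after hdlist.cz has been rebuilt.
        self.updated_since_rebuild = 0
```

With something like this, ordinary publishing would skip the expensive hdlist.cz step most of the time and only pay for it at release time or after a batch of updates.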
On Monday 31 Mar 2014 13:36:34 Tomasz Paweł Gajc wrote:
> hdlist.cz is bigger than 70 MB for the main repo. Disabling generation of
> this can really save time, space and bandwidth.
>
> Sent from BlackBerry® on Orange
>
> -----Original Message-----
> From: Denis Silakov <[email protected]>
> Sender: [email protected]
> Date: Mon, 31 Mar 2014 17:30:06
> To: <[email protected]>
> Reply-To: Cooker OpenMandriva <[email protected]>
> Subject: Re: [OM Cooker] Disabling generation of old hdlist.cz in ABF?
>
> Ok, thanks a lot for the info.
>
> So at least handling of xml files can be improved.
>
> As for dropping their generation - actually this won't save much
> time/space, unlike hdlist.cz.
>
> On 03/31/2014 05:27 PM, Rolf Pedersen wrote:
> >
> > On 03/31/2014 01:02 AM, Denis Silakov wrote:
> >> Hi all,
> >>
> >> As many of you have likely noticed, package publishing in ABF usually
> >> takes a relatively significant time - about several minutes (sometimes
> >> up to 10 minutes).
> >>
> >> One of the time-consuming tasks in the publishing is generation of
> >> the hdlist.cz file - this is a huge file containing the internal urpm
> >> representation of metadata, including file lists, package
> >> descriptions, etc. This file seems to be redundant - nowadays we
> >> generate additional xml files (changelog.xml, info.xml, etc.) which
> >> in combination with the lightweight synthesis.hdlist.cz provide the same
> >> information. However, I am not so familiar with hdlist.cz and can't
> >> guarantee that nothing will be lost if we completely drop it.
> >>
> >> So the question is - can somebody say what we will lose (if anything) if
> >> we drop hdlist.cz files? Or maybe we should just try and see?
> >>
> > Hi,
> > I hope someone can make some sense of my experience wrt hdlist.cz
> > etc. Many years ago, iirc, synthesis.hdlist.cz was introduced as an
> > optional source of media info that could be chosen by those with a
> > slow internet connection, as it was smaller and took less time to
> > download than the traditional, default hdlist.cz. I always chose
> > hdlist.cz as my connection was relatively quick and, I think, there
> > was more information included, naturally. Also, whenever I used
> > urpmq/urpmf to query the database, which is my primary usage of this
> > important tool for investigating package capabilities, the result
> > was almost immediate, since the data was already on my computer. At
> > some point, hdlist.cz was no longer available as a way to configure
> > urpmi. Every time I search for a package containing a file of
> > interest, I have to wait a long time for xml files to be downloaded
> > from each media source. Also, when I look for a changelog in
> > MandrivaUpdate or with urpmq, it must be retrieved. IIANM, the
> > package info would already be available on my machine when using
> > hdlist.cz as urpmi.update had already been done. Admittedly, this is
> > an accounting of the events of a number of years that is challenging
> > for an aged memory, but my recollected experience is that the
> > functionality of urpm was better with hdlist.cz than with anything
> > that has come since. Maybe generation of the others could be dropped
> > to gain publishing speed? :) Alternately, perhaps an option could be
> > provided where all the current information is downloaded when
> > urpmi.update is run and/or with CL switches.
> >
> > ^^^Those words reminded me of the policy option to Always download xml
> > information in the rpmdrake media manager, which I recall trying,
> > before, without improvement. I see this in the urpmi.cfg manual about
> > global options:
> >
> > xml-info
> >     For remote media, specify when files.xml.lzma,
> >     changelog.xml.lzma and info.xml.lzma are downloaded:
> >
> >     never
> >     on-demand
> >         (This is the default).
> >
> >         The specific xml info file is downloaded when
> >         urpmq/urpmf/rpmdrake ask for it. urpmi.update will remove
> >         outdated xml info file.
> >
> >         nb: if urpmq/urpmf/rpmdrake is not run by root, the xml
> >         info file is downloaded into /tmp/.urpmi-<uid>/
> >
> >     update-only
> >         urpmi.update will update xml info files already
> >         required at least once by urpmq/urpmf/rpmdrake.
> >
> >         nb: with update-only, urpmi.update will not update
> >         /tmp/.urpmi-<uid>/ xml info files
> >
> >     always
> >         all xml info files are downloaded when doing
> >         urpmi.addmedia and urpmi.update
> >
> > I checked and no global policy was defined, so I set it to Always in
> > media manager, which is reflected, now, in urpmi.cfg:
> >
> > {
> >   downloader: curl
> >   verify-rpm: 1
> >   xml-info: always
> > }
> >
> > This stanza was empty, before, by default, I guess. I then ran a
> > urpmf query and watched as xml files began to be downloaded, so I
> > quit. This is what I recall of my previous attempt(s) to replicate
> > the hdlist.cz behavior. Just in case, I ran urpmi.update (as root),
> > followed by urpmf (as user). The following lists were all downloaded
> > before the query finished:
> >
> > http://mirror.rosalab.ru/rosa/rosa2012.1/repository/x86_64/media/main/updates/media_info/files.xml.lzma
> > http://mirror.rosalab.ru/rosa/rosa2012.1/repository/i586/media/main/updates/media_info/files.xml.lzma
> > http://mirror.rosalab.ru/rosa/rosa2012.1/repository/x86_64/media/non-free/updates/media_info/files.xml.lzma
> > http://mirror.rosalab.ru/rosa/rosa2012.1/repository/i586/media/non-free/updates/media_info/files.xml.lzma
> > http://mirror.rosalab.ru/rosa/rosa2012.1/repository/x86_64/media/restricted/updates/media_info/files.xml.lzma
> > http://mirror.rosalab.ru/rosa/rosa2012.1/repository/i586/media/restricted/updates/media_info/files.xml.lzma
> >
> > A urpmq --changelog query is quicker, since urpmi knows where the
> > single package comes from, but that source's changelog.xml.lzma is
> > still downloaded, even after the previous configurations, whereas the
> > old behavior with hdlist.cz was to immediately use the info on the
> > computer. As I say, my accounting might be distorted by
> > misunderstanding, time, or nostalgic prejudices but that's my story
> > and I'm sticking to it!
> > Thanks,
> > Rolf
> >
>
> --
> Denis Silakov, ROSA Laboratory.
> www.rosalab.ru
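
For anyone wanting to check which xml-info policy a machine is actually using, the following is a minimal Python sketch that reads the global stanza from urpmi.cfg-style text. The function name is made up for illustration, and the parsing is an assumption based only on the stanza and manual excerpt quoted above: it treats a leading brace block with no media name in front of it as the global options and ignores everything else, so it is not a full urpmi.cfg parser.

```python
import re

# Policy names as listed in the urpmi.cfg manual excerpt quoted in this thread.
VALID_POLICIES = {"never", "on-demand", "update-only", "always"}


def global_xml_info_policy(cfg_text: str) -> str:
    """Return the xml-info policy from the leading global block, else the default."""
    # Assumption: the global block is a brace block at the very start of the
    # file, with no media name or URL in front of the opening brace.
    match = re.match(r"\s*\{(.*?)\}", cfg_text, flags=re.DOTALL)
    if match:
        for line in match.group(1).splitlines():
            key, sep, value = line.partition(":")
            if sep and key.strip() == "xml-info":
                policy = value.strip()
                if policy in VALID_POLICIES:
                    return policy
    # The quoted manual text names on-demand as the default.
    return "on-demand"


sample = """
{
  downloader: curl
  verify-rpm: 1
  xml-info: always
}
"""
print(global_xml_info_policy(sample))
```

An empty global stanza, or a file that starts directly with a media entry, falls through to on-demand, which matches the default described in the manual excerpt.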
