Ulrich Windl wrote:
> On 23 May 2006 at 14:09, jdd wrote:
> 
>> If I follow well the thread it seems this is a meta-data 
>> download problem.
> 
> If it's a download problem, why would the CPU be at 100%?

That sounds a lot like XML repository metadata parsing.
libzypp/ZMD is most probably not parsing the data stream as it is
being downloaded, so I presume it downloads everything (for one
repository) first, then parses it.

It would be interesting to do some profiling on parse-metadata.
Is there a profiler available for Mono?

Which XML parsing model is being used there: SAX, DOM or StAX?

Most probably not DOM...
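
To make the difference concrete, here is a minimal sketch of the streaming (SAX/StAX-like) approach, in Python purely for illustration; it says nothing about how libzypp/ZMD is actually implemented. It decompresses and parses incrementally and throws away each <package> element once it has been handled, so memory use stays roughly flat instead of growing with the file. The function names are mine, and matching on local tag names (ignoring the XML namespace) is a simplification:

```python
import gzip
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def iter_package_names(gz_stream):
    """Yield package names from a gzip-compressed RPM-MD style XML stream.

    gz_stream is any binary file-like object. The data is decompressed and
    parsed piece by piece; the full tree is never kept in memory.
    """
    with gzip.GzipFile(fileobj=gz_stream) as xml_stream:
        context = ET.iterparse(xml_stream, events=("start", "end"))
        _, root = next(context)  # first event is the start of the root element
        for event, elem in context:
            if event == "end" and local_name(elem.tag) == "package":
                name = next(
                    (child.text for child in elem if local_name(child.tag) == "name"),
                    None,
                )
                yield name
                # Drop the finished <package> subtree so memory does not pile up.
                root.clear()
```

With an open binary file for primary.xml.gz, `for name in iter_package_names(f): ...` walks the packages one at a time. A DOM parser would instead build every node of the document before handing anything back.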

When I look at my ("guru") RPM-MD repository for 10.0 (which is large,
but a lot smaller than the FTP tree):
== compressed:
primary.xml.gz   =   1,058,326 (bytes)
filelists.xml.gz =   1,029,756
other.xml.gz     =     497,642

== uncompressed:
primary.xml      =   6,174,750
filelists.xml    =  11,032,393
other.xml        =   2,950,989

== compression ratio:
primary.xml   =  5.83
filelists.xml = 10.71
other.xml     =  5.93

Now when I look at SL-10.1/inst-source/suse/repodata:
== compressed:
primary.xml.gz   =   8,056,136 (bytes)
filelists.xml.gz =  17,474,199
other.xml.gz     =  53,265,854

BTW, other.xml.gz is *huge* (it contains the %changelog information). I
don't know whether libzypp/ZMD downloads and/or uses "other.xml.gz",
though. smart (http://smartpm.org) doesn't.

== uncompressed:
primary.xml      =  47,127,500
filelists.xml    = 211,884,423
other.xml        = 206,534,422

== compression ratio:
primary.xml    =  5.85
filelists.xml  = 12.13
other.xml      =  3.88
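
The ratios are simply uncompressed size divided by compressed size. A small Python snippet to recompute them from the byte counts quoted in this mail (the dictionary layout and the function name are mine):

```python
# Byte counts copied from the listings in this mail: (compressed, uncompressed).
SIZES = {
    "guru 10.0": {
        "primary.xml": (1_058_326, 6_174_750),
        "filelists.xml": (1_029_756, 11_032_393),
        "other.xml": (497_642, 2_950_989),
    },
    "SL-10.1": {
        "primary.xml": (8_056_136, 47_127_500),
        "filelists.xml": (17_474_199, 211_884_423),
        "other.xml": (53_265_854, 206_534_422),
    },
}


def compression_ratio(compressed, uncompressed):
    """Return how many times larger the uncompressed file is."""
    return uncompressed / compressed


if __name__ == "__main__":
    for repo, files in SIZES.items():
        print(f"== {repo}")
        for name, (compressed, uncompressed) in files.items():
            ratio = compression_ratio(compressed, uncompressed)
            print(f"{name:<14} = {ratio:6.2f}")
```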

Assuming that other.xml is not being used by libzypp/ZMD, I would
guess the following memory usage with DOM:
- primary.xml:    47MB on disk => 150-200MB RAM
- filelists.xml: 210MB on disk => 600-800MB RAM

Hmm.. after all... maybe it _is_ DOM ;)
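
Those RAM figures assume a DOM tree takes roughly 3 to 4 times the on-disk size, which is a guess and not something I measured. One way to check such a guess is sketched below in Python with xml.dom.minidom and tracemalloc. The overhead of whatever XML library Mono or libzypp uses will differ, so only the method carries over, not the resulting number. The function name is hypothetical:

```python
import tracemalloc
import xml.dom.minidom


def dom_overhead_factor(xml_bytes):
    """Return peak traced memory during a DOM parse divided by the input size.

    Only allocations visible to tracemalloc are counted, so treat the result
    as a rough indication rather than an exact figure.
    """
    tracemalloc.start()
    try:
        doc = xml.dom.minidom.parseString(xml_bytes)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    doc.unlink()  # break reference cycles so the tree can be freed
    return peak / len(xml_bytes)
```

Feeding it the contents of a (smaller) primary.xml gives a factor to compare against the 3 to 4 assumed above.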

Could someone with the mentioned libzypp/ZMD problems have a look at
memory/swap usage as well ?
- vmstat -n 5 999
- sar -r 5 999 (even better; sar is part of the sysstat package)

The yast2 format is possibly more efficient wrt memory and CPU.
Maybe worth investigating whether the memory+CPU problems happen with
RPM-MD repos but not with yast2 repos... ?

I'd say disable (or even remove) all repositories, then just add the
10.1 FTP tree metadata, and run vmstat or sar to monitor CPU+memory
usage. With sysstat, it can be done like this:

sar -r -X `pidof parse-metadata` 5 999

(5 = 5 second interval, 999 = number of iterations)

You can even draw graphs from that data when you store it into a file
(-o option, it's a binary format), but you can't use -o in conjunction
with -X, so the stats would be system-wide:

mkdir ~/sar
sar -o ~/sar/sa.$(date '+%Y_%m_%d') -r -u 5 999
isag -p ~/sar

... then choose the file (click on the "-" button) and choose memory
or CPU graph (and you can even save the picture).
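
Since -o can't be combined with -X, a per-process log needs something else. Here is a rough sketch in Python that samples the resident set size from /proc/<pid>/status (Linux only) and writes one CSV line per sample. The function names and the CSV layout are my own choice and not part of sysstat:

```python
import sys
import time


def parse_vm_rss_kb(status_text):
    """Extract the VmRSS value (in kB) from /proc/<pid>/status content."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return None  # kernel threads and zombies have no VmRSS line


def log_rss(pid, interval=5, count=999, out=sys.stdout):
    """Write 'unix_timestamp,rss_kb' lines for pid until count or process exit."""
    for _ in range(count):
        try:
            with open(f"/proc/{pid}/status") as status_file:
                rss_kb = parse_vm_rss_kb(status_file.read())
        except FileNotFoundError:
            break  # the process has gone away
        out.write(f"{int(time.time())},{rss_kb}\n")
        out.flush()
        time.sleep(interval)
```

Call it as `log_rss(pid)` with the PID reported by `pidof parse-metadata`, and pass an open file as `out` to keep the samples for plotting later.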

cheers
-- 
  -o) Pascal Bleser     http://linux01.gwdg.de/~pbleser/
  /\\ <[EMAIL PROTECTED]>       <[EMAIL PROTECTED]>
 _\_v http://www.fosdem.org          http://opensuse.org
