Problem with current 00INDEX.rdf

2003-09-16 Thread Matthias Kurz

Hi.

When I run 'openpkg build bash', I get "FATAL: cannot find package".

The 00INDEX.rdf.bz2 looks somewhat strange. Try
lesspipe .../current/SRC/00INDEX.rdf.bz2|grep Description|grep about|less
and compare the about= and href= attributes, e.g.:
<rdf:Description about="perl-time-20030821-20030821" href="perl-time-20030821-20030821.src.rpm">
<rdf:Description about="--" href="kerberos-1.3.1-20030910.src.rpm">
<rdf:Description about="--" href="coreutils-5.0.91-20030912.src.rpm">
                        ^^
<rdf:Description about="libnet-1.1.0-20030721" href="dailystrips-1.0.28-20030825.src.rpm">
                        ^                            ^

When I do an openpkg build locally, the generated 00INDEX.rdf.bz2 looks
better. No 'about=--'.
$ rpm -q perl
perl-5.8.0-20030903


   (mk)

-- 
Matthias Kurz; Fuldastr. 3; D-28199 Bremen; VOICE +49 421 53 600 47
In the premotor cortex, anyone can be a hero. (bdw)
__
The OpenPKG Project            www.openpkg.org
Developer Communication List   [EMAIL PROTECTED]


Re: Problem with current 00INDEX.rdf

2003-09-16 Thread Michael van Elst
On Tue, Sep 16, 2003 at 09:33:40AM +0200, Matthias Kurz wrote:
 
> <rdf:Description about="--" href="kerberos-1.3.1-20030910.src.rpm">
>
> When I do an openpkg build locally, the generated 00INDEX.rdf.bz2 looks
> better. No 'about=--'.

That's a corrupted specfile cache database.

The href attribute is computed from the RPM filename.
The about attribute is computed from the parsed specfile.

Specfiles are extracted from the source RPMs and stored in a
database to avoid unpacking the RPMs all the time. From the
index you can see that sometimes even wrong specfiles are
retrieved from the database, e.g.:

<rdf:Description about="--" href="kerberos-1.3.1-20030910.src.rpm">
...
<Description>
Various Perl modules for Date and Time handling:
...
</rdf:Description>

Database corruption usually occurs when the indexer is killed;
Berkeley DB then quickly becomes inconsistent and corrupted.
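
The relation between the two attributes can be checked mechanically. The
following is a minimal Python sketch, not part of OpenPKG. It assumes that
each entry in the source index opens with a tag of the form
<rdf:Description about="NAME-VERSION-RELEASE" href="FILE.src.rpm"> and that
a healthy entry satisfies href == about + ".src.rpm". The function names and
the regular expression are illustrative assumptions.

```python
import bz2
import re

# Assumed shape of an entry's opening tag in the source index (hypothetical regex).
ENTRY_RE = re.compile(r'<rdf:Description\s+about="([^"]*)"\s+href="([^"]*)"')


def find_suspect_entries(rdf_text):
    """Return (about, href) pairs where href is not about + '.src.rpm'."""
    suspects = []
    for about, href in ENTRY_RE.findall(rdf_text):
        # about comes from the parsed specfile, href from the RPM filename;
        # for a source RPM the two are assumed to differ only by the suffix.
        if href != about + ".src.rpm":
            suspects.append((about, href))
    return suspects


def find_suspect_entries_in_file(path):
    """Run the same check on a bzip2-compressed index such as 00INDEX.rdf.bz2."""
    with bz2.open(path, "rt", encoding="utf-8", errors="replace") as fh:
        return find_suspect_entries(fh.read())
```

Any pair this reports, such as an about of "--" next to a kerberos href,
points at a specfile that was parsed wrongly or fetched from the wrong cache
slot.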


Greetings,
-- 
Michael van Elst
Internet: [EMAIL PROTECTED]
A potential Snark may lurk in every tree.


Re: Problem with current 00INDEX.rdf

2003-09-16 Thread Matthias Kurz
On Tue, Sep 16, 2003 at 09:53:47AM +0200, Michael van Elst wrote:
> On Tue, Sep 16, 2003 at 09:33:40AM +0200, Matthias Kurz wrote:
> >
> > <rdf:Description about="--" href="kerberos-1.3.1-20030910.src.rpm">
> >
> > When I do an openpkg build locally, the generated 00INDEX.rdf.bz2 looks
> > better. No 'about=--'.
>
> That's a corrupted specfile cache database.
[...]
> Database corruption usually occurs when the indexer is killed;
> Berkeley DB then quickly becomes inconsistent and corrupted.

I hope it is clear that this happens with the 00INDEX.rdf.bz2 on
openpkg.org, not here.
How does one clean up such corruption?


   (mk)



Re: Problem with current 00INDEX.rdf

2003-09-16 Thread Michael van Elst
On Tue, Sep 16, 2003 at 10:11:02AM +0200, Matthias Kurz wrote:

> I hope it is clear that this happens with the 00INDEX.rdf.bz2 on
> openpkg.org, not here.

Sure.

> How does one clean up such corruption?

Just delete the cache file. It will be recreated (slowly). I don't
think there is a reliable method to repair a broken database.

Greetings,


Re: Problem with current 00INDEX.rdf

2003-09-16 Thread Matthias Kurz
On Tue, Sep 16, 2003 at 11:03:10AM +0200, Michael van Elst wrote:
> On Tue, Sep 16, 2003 at 10:11:02AM +0200, Matthias Kurz wrote:
[...]
> > How does one clean up such corruption?
>
> Just delete the cache file. It will be recreated (slowly). I don't
> think there is a reliable method to repair a broken database.

What is its name?


   (mk)



Re: Problem with current 00INDEX.rdf

2003-09-16 Thread Ralf S. Engelschall
On Tue, Sep 16, 2003, Michael van Elst wrote:

> > <rdf:Description about="--" href="kerberos-1.3.1-20030910.src.rpm">
> >
> > When I do an openpkg build locally, the generated 00INDEX.rdf.bz2 looks
> > better. No 'about=--'.
>
> That's a corrupted specfile cache database.
>
> The href attribute is computed from the RPM filename.
> The about attribute is computed from the parsed specfile.
>
> Specfiles are extracted from the source RPMs and stored in a
> database to avoid unpacking the RPMs all the time. From the
> index you can see that sometimes even wrong specfiles are
> retrieved from the database, e.g.:
>
> <rdf:Description about="--" href="kerberos-1.3.1-20030910.src.rpm">
> ...
> <Description>
> Various Perl modules for Date and Time handling:
> ...
> </rdf:Description>
>
> Database corruption usually occurs when the indexer is killed;
> Berkeley DB then quickly becomes inconsistent and corrupted.

Yesterday we had some other breakage in the index of CURRENT, related to
apache. I removed the index.current.cache on master.openpkg.org and it
was regenerated. I have now removed it again in the hope that the index
is regenerated correctly this time.

   Ralf S. Engelschall
   [EMAIL PROTECTED]
   www.engelschall.com



Re: Problem with current 00INDEX.rdf

2003-09-16 Thread Michael van Elst
On Tue, Sep 16, 2003 at 11:31:00AM +0200, Matthias Kurz wrote:

> > Just delete the cache file. It will be recreated (slowly). I don't
> > think there is a reliable method to repair a broken database.
>
> What is its name?

You should know :) You pass the name to 'openpkg index' with the -C
option.

Without -C there is no cache.

The cache is only used when indexing source RPMs because unpacking
the RPMs to get to the specfiles is rather slow.
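
The idea behind such a cache can be sketched in a few lines of Python. This
is an illustration only, not the code used by 'openpkg index': the function
name cached_specfile, the extract_spec callable, and the choice of keying
entries on path plus modification time are all assumptions made for the
example.

```python
import os


def cached_specfile(cache, rpm_path, extract_spec):
    """Return the specfile text for rpm_path, unpacking the RPM only on a miss.

    cache        -- any dict-like mapping from str to str, for example a
                    database opened with shelve.open()
    extract_spec -- hypothetical callable that unpacks the source RPM and
                    returns its specfile text (the slow step)
    """
    # Key on path and modification time so that a replaced RPM does not
    # return the specfile of its predecessor (an assumption of this sketch).
    key = "%s:%d" % (rpm_path, int(os.stat(rpm_path).st_mtime))
    if key in cache:
        return cache[key]
    spec = extract_spec(rpm_path)
    cache[key] = spec
    return spec
```

With an empty cache every source RPM takes the slow path once, which is why
a freshly deleted cache is recreated slowly.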

Greetings,


Re: Problem with current 00INDEX.rdf

2003-09-16 Thread Matthias Kurz
On Tue, Sep 16, 2003 at 11:35:26AM +0200, Michael van Elst wrote:
> On Tue, Sep 16, 2003 at 11:31:00AM +0200, Matthias Kurz wrote:
>
> > > Just delete the cache file. It will be recreated (slowly). I don't
> > > think there is a reliable method to repair a broken database.
> >
> > What is its name?
>
> You should know :) You pass the name to 'openpkg index' with the -C
> option.

Hey, I never used -C. Perhaps _that_ is why the index creation takes _so_
much time :-))

> Without -C there is no cache.
>
> The cache is only used when indexing source RPMs because unpacking
> the RPMs to get to the specfiles is rather slow.

See, I learned something again :)


   (mk)



Re: Problem with current 00INDEX.rdf

2003-09-16 Thread Thomas Lotterer
On Tue, Sep 16, 2003, Ralf S. Engelschall wrote:

> On Tue, Sep 16, 2003, Michael van Elst wrote:
>
> > [...]
> > Database corruption usually occurs when the indexer is killed;
> > Berkeley DB then quickly becomes inconsistent and corrupted.
>
> Yesterday we had some other breakage in the index of CURRENT, related to
> apache. I removed the index.current.cache on master.openpkg.org and it
> was regenerated. I have now removed it again in the hope that the index
> is regenerated correctly this time.
>
The problem persisted, but it has now been fixed manually on openpkg.org
as a one-time measure. The index breaks not only when the indexer is
killed but also when two or more instances of the indexer run
simultaneously. We schedule an index rebuild every quarter of an hour.
This works most of the time. With the current size of the ftp area,
however, a broken index cannot be repaired by just deleting the cache:
the next scheduled run then starts from scratch, which takes longer
than 15 minutes, so the run after it launches a second instance, which
destroys the cache again.

We have to find out what the best practice is to repair such damage in
the future.

--
[EMAIL PROTECTED]
Development Team, Operations Northern Europe, Cable & Wireless


Re: Problem with current 00INDEX.rdf

2003-09-16 Thread Michael van Elst
On Tue, Sep 16, 2003 at 12:09:01PM +0200, Thomas Lotterer wrote:

> The problem persisted, but it has now been fixed manually on openpkg.org
> as a one-time measure. The index breaks not only when the indexer is
> killed but also when two or more instances of the indexer run
> simultaneously.

Correct. Berkeley-DB as used by DB_File doesn't lock the database.

I guess the best solution is to add support for a separate lock file
to openpkg index.
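
A separate lock file of this kind could work roughly as follows. This is a
minimal Python sketch of the technique, not a patch for 'openpkg index'
(which is not written in Python). The names exclusive_lock and
AlreadyRunning and the lock file path are made up for the example, and it
assumes a Unix system where fcntl.flock() is available.

```python
import fcntl
import os
from contextlib import contextmanager


class AlreadyRunning(Exception):
    """Raised when another process already holds the lock."""


@contextmanager
def exclusive_lock(lock_path):
    """Hold an exclusive, non-blocking advisory lock for the enclosed block."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        try:
            # LOCK_NB makes flock fail immediately instead of waiting.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            raise AlreadyRunning(lock_path)
        yield
    finally:
        # Closing the descriptor releases the lock if it was acquired.
        os.close(fd)
```

An indexer run would wrap all cache access in "with
exclusive_lock(...):" and simply exit on AlreadyRunning, so a scheduled run
that starts while the previous one is still rebuilding never touches the
cache. Because the kernel drops the lock when the process exits, a killed
indexer leaves no stale lock behind.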
