Ralf,

Thanks for the second test program. Interestingly, it gives no errors at all
in a situation where rpm 4.2.1 is reporting mmap EAGAIN errors.

All target files were on an NFS-mounted filesystem, but this time the
filesystem is hosted by a RedHat 9 server (since our usual Solaris 8 NFS
system was running out of locks again). The behavior is identical to the
behavior with the Solaris NFS server, so I don't think the choice of server
is a factor right now.

Output from your test program, both before and after running rpm 4.2.1:
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
> ./testlock2
open() -> 3, 0
fcntl() -> 0, 0
write() -> 8192, 0
mmap() -> ff3a0000, 0
munmap() -> 0, 0
fcntl() -> 0, 0
close() -> 0, 0
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Partial output from rpm:
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
15701:  stat64("/", 0xFFBFF698)                         = 0
15701:  stat64("/usr/", 0xFFBFF698)                     = 0
15701:  stat64("/usr/psr.oit/", 0xFFBFF698)             = 0
15701:  stat64("/usr/psr.oit/solaris9/", 0xFFBFF698)    = 0
15701:  stat64("/usr/psr.oit/solaris9/RPM/", 0xFFBFF698) = 0
15701:  stat64("/usr/psr.oit/solaris9/RPM/DB", 0xFFBFF698) = 0
15701:  access("/usr/psr.oit/solaris9/RPM/DB", 2)       = 0
15701:  stat64("/usr/psr.oit/solaris9/RPM/DB/__db.001", 0xFFBFF7C0) = 0
15701:  access("/usr/psr.oit/solaris9/RPM/DB/__db.001", 0) = 0
15701:  access("/usr/psr.oit/solaris9/RPM/DB/Packages", 0) = 0
15701:  stat("/usr/psr.oit/solaris9/RPM/DB/DB_CONFIG", 0xFFBFF3C8) Err#2 ENOENT
15701:  open("/usr/psr.oit/solaris9/RPM/DB/DB_CONFIG", O_RDONLY) Err#2 ENOENT
15701:  stat("/usr/psr.oit/solaris9/RPM/DB/__db.001", 0xFFBFF448) = 0
15701:  open("/usr/psr.oit/solaris9/RPM/DB/__db.001", O_RDWR|O_CREAT|O_EXCL, 0644) Err#17 EEXIST
15701:  open("/usr/psr.oit/solaris9/RPM/DB/__db.001", O_RDWR) = 3
15701:  fcntl(3, F_SETFD, 0x00000001)                   = 0
15701:  ioctl(3, 0x2000664C, 0x00000001)                = 0
15701:  fstat(3, 0xFFBFF4C0)                            = 0
15701:  open("/usr/psr.oit/solaris9/RPM/DB/__db.001", O_RDWR|O_CREAT, 0644) = 4
15701:  fcntl(4, F_SETFD, 0x00000001)                   = 0
15701:  ioctl(4, 0x2000664C, 0x00000001)                = 0
15701:  lseek(4, 0, SEEK_END)                           = 0
15701:  lseek(4, 0, SEEK_CUR)                           = 0
15701:  write(4, "\0\0\0\0\0\0\0\0\0\0\0\0".., 8192)    = 8192
15701:  mmap(0x00000000, 8192, PROT_READ|PROT_WRITE, MAP_SHARED, 4, 0) Err#11 EAGAIN
15701:  fstat64(2, 0xFFBFE478)                          = 0
15701:  write(2, " r p m d b", 5)                       = 5
15701:  write(2, " :  ", 2)                             = 2
15701:  write(2, " m m a p :  ", 6)                     = 6
15701:  write(2, " R e s o u r c e   t e m".., 32)      = 32
15701:  write(2, "\n", 1)                               = 1
15701:  close(4)                                        = 0
15701:  close(3)                                        = 0
15701:  write(2, " e r r o r :  ", 7)                   = 7
15701:  write(2, " d b 4   e r r o r ( 1 1".., 65)      = 65
15701:  write(2, " e r r o r :  ", 7)                   = 7
15701:  write(2, " c a n n o t   o p e n  ".., 27)      = 27
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

So I changed your program to use "__db.001" as the filename to open, and lo and behold:
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
> ./testlock2.db.001
open() -> 3, 0
fcntl() -> 0, 0
write() -> 8192, 0
mmap() -> ffffffff, 11
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(BTW, I had to change your prog to test for MAP_FAILED instead of NULL for
mmap failure.)
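
For reference, the changed check can be sketched roughly as follows. This is
a minimal illustration, not the exact modified program: the helper name
try_map() and its path parameter are assumptions introduced here, and the
fcntl() locking step of the original is left out to keep it short. The point
is only that mmap() signals failure with MAP_FAILED, i.e. (void *)-1, which
is why the output above shows ffffffff together with errno 11 (EAGAIN).

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of the original test program):
 * open `path`, write 8192 zero bytes, then try to map them.
 * Returns 0 on success, or the errno of the first failing call. */
int try_map(const char *path)
{
    char buf[8192];
    void *vp;
    int fd, err;

    fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd == -1)
        return errno;

    memset(buf, 0, sizeof(buf));
    if (write(fd, buf, sizeof(buf)) == -1) {
        err = errno;            /* save errno before close() can change it */
        close(fd);
        return err;
    }

    vp = mmap(NULL, sizeof(buf), PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (vp == MAP_FAILED) {     /* failure is (void *)-1, never NULL */
        err = errno;
        printf("mmap() -> MAP_FAILED, %d (%s)\n", err, strerror(err));
        close(fd);
        return err;
    }
    printf("mmap() -> %p, 0\n", vp);

    munmap(vp, sizeof(buf));
    close(fd);
    return 0;
}
```

Calling try_map("foo.db") and try_map("__db.001") from the same directory
gives a side-by-side comparison of the two file names.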

So I saved and cleaned out the DB directory and ran rpm --db-build and got
loads of EAGAINs:
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
> truss -f -o/tmp/rpmdebug.build rpm --db-build
rpmdb: BUILDING NEW RPM DATABASE FROM SCRATCH (/usr/psr.oit/solaris9/RPM/DB)
rpmdb: removing (possibly existing) old RPM database DB files
rpmdb: creating new RPM database (built-in RPM procedure)
rpmdb: mmap: Resource temporarily unavailable
error: db4 error(11) from dbenv->open: Resource temporarily unavailable
error: cannot open Packages index using db3 - Resource temporarily unavailable (11)
rpmdb: operating on new RPM database
rpmdb: mmap: Resource temporarily unavailable
error: db4 error(11) from dbenv->open: Resource temporarily unavailable
error: cannot open Packages index using db3 - Resource temporarily unavailable (11)
error: cannot open Packages database in /usr/psr.oit/solaris9/RPM/DB
error: /usr/psr.oit/solaris9/etc/openpkg/openpkg.pgp: import failed.
rpmdb: mmap: Resource temporarily unavailable
error: db4 error(11) from dbenv->open: Resource temporarily unavailable
error: cannot open Packages index using db3 - Resource temporarily unavailable (11)
error: cannot open Packages database in /usr/psr.oit/solaris9/RPM/DB
error: package gpg-pubkey-63c4cb9f-3c591eda is not installed
rpmdb: mmap: Resource temporarily unavailable
error: db4 error(11) from dbenv->open: Resource temporarily unavailable
error: cannot open Packages database in /usr/psr.oit/solaris9/RPM/DB
rpmdb: rebuilding new RPM database (built-in RPM procedure)
rpmdb: mmap: Resource temporarily unavailable
error: db4 error(11) from dbenv->open: Resource temporarily unavailable
error: cannot open Packages index
rpmdb: making sure RPM database contains all possible DB files
db_load: mmap: Resource temporarily unavailable
db_load: mmap: Resource temporarily unavailable
db_load: mmap: Resource temporarily unavailable
db_load: mmap: Resource temporarily unavailable
db_load: mmap: Resource temporarily unavailable
db_load: mmap: Resource temporarily unavailable
db_load: mmap: Resource temporarily unavailable
db_load: mmap: Resource temporarily unavailable
db_load: mmap: Resource temporarily unavailable
db_load: mmap: Resource temporarily unavailable
db_load: mmap: Resource temporarily unavailable
db_load: mmap: Resource temporarily unavailable
db_load: mmap: Resource temporarily unavailable
db_load: mmap: Resource temporarily unavailable
db_load: mmap: Resource temporarily unavailable
db_load: mmap: Resource temporarily unavailable
db_load: mmap: Resource temporarily unavailable
rpmdb: rebuilding RPM database (built-in RPM procedure)
rpmdb: mmap: Resource temporarily unavailable
error: db4 error(11) from dbenv->open: Resource temporarily unavailable
error: cannot open Packages index
rpmdb: performing read/write operation on RPM database
rpmdb: mmap: Resource temporarily unavailable
error: db4 error(11) from dbenv->open: Resource temporarily unavailable
error: cannot open Packages index using db3 - Resource temporarily unavailable (11)
error: cannot open Packages database in /usr/psr.oit/solaris9/RPM/DB
rpmdb: mmap: Resource temporarily unavailable
error: db4 error(11) from dbenv->open: Resource temporarily unavailable
error: cannot open Packages index using db3 - Resource temporarily unavailable (11)
error: cannot open Packages database in /usr/psr.oit/solaris9/RPM/DB
rpmdb: mmap: Resource temporarily unavailable
error: db4 error(11) from dbenv->open: Resource temporarily unavailable
error: cannot open Packages index using db3 - Resource temporarily unavailable (11)
error: cannot open Packages database in /usr/psr.oit/solaris9/RPM/DB
error: /usr/psr.oit/solaris9/etc/openpkg/openpkg.pgp: import failed.
rpmdb: making sure RPM database files have consistent attributes
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This is what the directory contains afterward:
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
> ls -l
total 1152
-rw-r--r--   1 psr      psr        49152 Oct 17 15:29 Basenames
-rw-r--r--   1 psr      psr        49152 Oct 17 15:29 Conflictname
-rw-r--r--   1 psr      psr        49152 Oct 17 15:29 Depends
-rw-r--r--   1 psr      psr        32768 Oct 17 15:29 Dirnames
-rw-r--r--   1 psr      psr        49152 Oct 17 15:29 Filemd5s
-rw-r--r--   1 psr      psr        49152 Oct 17 15:29 Group
-rw-r--r--   1 psr      psr        32768 Oct 17 15:29 Installtid
-rw-r--r--   1 psr      psr        49152 Oct 17 15:29 Name
-rw-r--r--   1 psr      psr        49152 Oct 17 15:29 Packages
-rw-r--r--   1 psr      psr        49152 Oct 17 15:29 Providename
-rw-r--r--   1 psr      psr        32768 Oct 17 15:29 Provideversion
-rw-r--r--   1 psr      psr        49152 Oct 17 15:29 Pubkeys
-rw-r--r--   1 psr      psr        49152 Oct 17 15:29 Requirename
-rw-r--r--   1 psr      psr        32768 Oct 17 15:29 Requireversion
-rw-r--r--   1 psr      psr        49152 Oct 17 15:29 Sha1header
-rw-r--r--   1 psr      psr        49152 Oct 17 15:29 Sigmd5
-rw-r--r--   1 psr      psr        49152 Oct 17 15:29 Triggername
-rw-r--r--   1 psr      psr         8192 Oct 17 15:29 __db.001
-rw-r--r--   1 psr      psr            0 Oct 17 15:29 __db.002
-rw-r--r--   1 psr      psr            0 Oct 17 15:29 __db.003
-rw-r--r--   1 psr      psr            0 Oct 17 15:29 __db.004
-rw-r--r--   1 psr      psr            0 Oct 17 15:29 __db.005
-rw-r--r--   1 psr      psr            0 Oct 17 15:29 __db.006
-rw-r--r--   1 psr      psr            0 Oct 17 15:29 __db.007
-rw-r--r--   1 psr      psr            0 Oct 17 15:29 __db.008
-rw-r--r--   1 psr      psr            0 Oct 17 15:29 __db.009
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

And this is the first EAGAIN from the truss output. It looks a lot like the one before:
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
16417:  stat64("/", 0xFFBFF730)                         = 0
16417:  stat64("/usr/", 0xFFBFF730)                     = 0
16417:  stat64("/usr/psr.oit/", 0xFFBFF730)             = 0
16417:  stat64("/usr/psr.oit/solaris9/", 0xFFBFF730)    = 0
16417:  stat64("/usr/psr.oit/solaris9/RPM/", 0xFFBFF730) = 0
16417:  stat64("/usr/psr.oit/solaris9/RPM/DB", 0xFFBFF730) = 0
16417:  access("/usr/psr.oit/solaris9/RPM/DB", 2)       = 0
16417:  stat64("/usr/psr.oit/solaris9/RPM/DB/__db.001", 0xFFBFF858) Err#2 ENOENT
16417:  access("/usr/psr.oit/solaris9/RPM/DB/__db.001", 0) Err#2 ENOENT
16417:  stat("/usr/psr.oit/solaris9/RPM/DB/DB_CONFIG", 0xFFBFF460) Err#2 ENOENT
16417:  open("/usr/psr.oit/solaris9/RPM/DB/DB_CONFIG", O_RDONLY) Err#2 ENOENT
16417:  stat("/usr/psr.oit/solaris9/RPM/DB/__db.001", 0xFFBFF4E0) Err#2 ENOENT
16417:  open("/usr/psr.oit/solaris9/RPM/DB/__db.001", O_RDWR|O_CREAT|O_EXCL, 0644) = 3
16417:  fcntl(3, F_SETFD, 0x00000001)                   = 0
16417:  ioctl(3, 0x2000664C, 0x00000001)                = 0
16417:  open("/usr/psr.oit/solaris9/RPM/DB/__db.001", O_RDWR|O_CREAT, 0644) = 4
16417:  fcntl(4, F_SETFD, 0x00000001)                   = 0
16417:  ioctl(4, 0x2000664C, 0x00000001)                = 0
16417:  lseek(4, 0, SEEK_END)                           = 0
16417:  lseek(4, 0, SEEK_CUR)                           = 0
16417:  write(4, "\0\0\0\0\0\0\0\0\0\0\0\0".., 8192)    = 8192
16417:  mmap(0x00000000, 8192, PROT_READ|PROT_WRITE, MAP_SHARED, 4, 0) Err#11 EAGAIN
16417:  fstat64(2, 0xFFBFE510)                          = 0
16417:  write(2, " r p m d b", 5)                       = 5
16417:  write(2, " :  ", 2)                             = 2
16417:  write(2, " m m a p :  ", 6)                     = 6
16417:  write(2, " R e s o u r c e   t e m".., 32)      = 32
16417:  write(2, "\n", 1)                               = 1
16417:  close(4)                                        = 0
16417:  close(3)                                        = 0
16417:  write(2, " e r r o r :  ", 7)                   = 7
16417:  write(2, " d b 4   e r r o r ( 1 1".., 65)      = 65
16417:  write(2, " e r r o r :  ", 7)                   = 7
16417:  write(2, " c a n n o t   o p e n  ".., 77)      = 77
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

At this point, your original test program still reports no problems in
creating or opening an existing foo.db. But the modified test program that
opens __db.001 still reports an EAGAIN, even though __db.001 was just now
created by the --db-build process.
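
Since this thread started with fcntl(2) locking, one possible follow-up check
is to ask the kernel whether some other process holds a lock on __db.001 that
foo.db does not have. The sketch below is only an assumption about what might
be worth looking at, not a known cause of the EAGAIN; the helper name
probe_lock() is made up for illustration. It uses F_GETLK, which reports a
conflicting lock without acquiring one.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: report whether another process holds a lock that
 * would conflict with a whole-file write lock on `path`.
 * Returns 1 if a conflicting lock exists, 0 if none, -1 on error. */
int probe_lock(const char *path)
{
    struct flock l;
    int fd, rv;

    fd = open(path, O_RDWR);
    if (fd == -1) {
        printf("open(%s) failed, errno %d\n", path, errno);
        return -1;
    }

    l.l_type   = F_WRLCK;   /* the lock we would like to take */
    l.l_whence = SEEK_SET;
    l.l_start  = 0;
    l.l_len    = 0;         /* 0 means "to end of file" */
    l.l_pid    = 0;

    rv = fcntl(fd, F_GETLK, &l);
    if (rv == -1) {
        printf("fcntl(F_GETLK) failed, errno %d\n", errno);
        close(fd);
        return -1;
    }
    close(fd);

    if (l.l_type == F_UNLCK) {
        printf("%s: no conflicting lock\n", path);
        return 0;
    }
    /* On NFS the reported pid may belong to a process on another host. */
    printf("%s: conflicting %s lock held by pid %ld\n", path,
           l.l_type == F_WRLCK ? "write" : "read", (long)l.l_pid);
    return 1;
}
```

Locks held by the calling process itself are not reported, so this only
shows locks owned by other processes.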

Any ideas as to what I could try next?

Thanks,
       Dennis

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] Behalf Of Ralf S. Engelschall
> Sent: Friday, October 17, 2003 9:06 AM
> To: [EMAIL PROTECTED]
> Subject: Re: Problems running openpkg-20031006-20031006 on Solaris and
> Linux (was Two problems building/running openpkg-20031006-20031006 on
> RedHat 9)
>
>
> On Thu, Oct 16, 2003, Dennis McRitchie wrote:
>
> > 3) When I tried to run "rpm --db-rebuild" on our Solaris 9 system (as
> > always, the DB files were on the NFS system), I got the same problem
> > as before: EAGAIN on the mmap calls. (Yet your test program still runs
> > successfully.) Stderr output is:
> > [...]
> > 9887:       open("/usr/psr.oit/solaris9/RPM/DB/__db.001", O_RDWR|O_CREAT, 0644) = 4
> > 9887:       fcntl(4, F_SETFD, 0x00000001)                   = 0
> > 9887:       ioctl(4, 0x2000664C, 0x00000001)                = 0
> > 9887:       lseek(4, 0, SEEK_END)                           = 0
> > 9887:       lseek(4, 0, SEEK_CUR)                           = 0
> > 9887:       write(4, "\0\0\0\0\0\0\0\0\0\0\0\0".., 8192)    = 8192
> > 9887:       mmap(0x00000000, 8192, PROT_READ|PROT_WRITE, MAP_SHARED, 4, 0) Err#11 EAGAIN
> > [...]
> > Any idea why this might still be happening even though your test
> > program seems to suggest that locking is working?
>
> Well, my test program is for the fcntl(2) call only. Your problem now
> is the failing mmap(2): EAGAIN on mmap(2) is something completely
> different. That's interesting. So, we're now at least one step further
> for your situation.
>
> So, next round: what about the following extended test program "foo.c"?
>
> -------------------------------------------------------------------
> #include <stdlib.h>
> #include <stdio.h>
> #include <unistd.h>
> #include <string.h>
> #include <fcntl.h>
> #include <errno.h>
> #include <sys/types.h>
> #include <sys/mman.h>
>
> int main(int argc, char *argv[])
> {
>     int fd;
>     struct flock l;
>     int rv;
>     char buf[8192];
>     void *vp;
>
>     fd = open("foo.db", O_RDWR|O_CREAT, 0644);
>     printf("open() -> %d, %d\n", fd, fd == -1 ? errno : 0);
>     if (fd == -1) { exit(1); }
>
>     l.l_type   = F_WRLCK;
>     l.l_whence = SEEK_SET;
>     l.l_start  = 0;
>     l.l_len    = 0;
>     rv = fcntl(fd, F_SETLKW, &l);
>     printf("fcntl() -> %d, %d\n", rv, rv == -1 ? errno : 0);
>     if (rv == -1) { exit(1); }
>
>     memset(buf, 0, 8192);
>     rv = write(fd, buf, 8192);
>     printf("write() -> %d, %d\n", rv, rv == -1 ? errno : 0);
>     if (rv == -1) { exit(1); }
>
>     vp = mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
>     printf("mmap() -> %lx, %d\n", (unsigned long)vp, vp == NULL ? errno : 0);
>     if (vp == NULL) { exit(1); }
>
>     rv = munmap(vp, 8192);
>     printf("munmap() -> %d, %d\n", rv, rv == -1 ? errno : 0);
>     if (rv == -1) { exit(1); }
>
>     l.l_type = F_UNLCK;
>     rv = fcntl(fd, F_SETLKW, &l);
>     printf("fcntl() -> %d, %d\n", rv, rv == -1 ? errno : 0);
>     if (rv == -1) { exit(1); }
>
>     rv = close(fd);
>     printf("close() -> %d, %d\n", rv, rv == -1 ? errno : 0);
>     if (rv == -1) { exit(1); }
>
>     return 0;
> }
> -------------------------------------------------------------------
>
> What is the output when run on multiple systems while staying (with
> current working directory) on NFS, UFS and MFS/TEMPFS filesystems?
> Hopefully here we see the EAGAIN again...
>
>                                        Ralf S. Engelschall
>                                        [EMAIL PROTECTED]
>                                        www.engelschall.com
>
> ______________________________________________________________________
> The OpenPKG Project                                    www.openpkg.org
> User Communication List                      [EMAIL PROTECTED]
>

