Resend... I think this might have been lost in the switchover.
************************************************************************
Thanks so much for your prompt reply!

On Wed, 2003-09-17 at 01:44, Strahl, Torsten wrote:
> Hi,
> you are absolute right. The problem is the content of the device
> /dev/raw/raw20. The DB kernel expects to find the USM Page on
> Data Devspace 10, offset 0, but he can find only a data index
> page with the pageNo 1467308!
>
>      Page Header         Page Trailer
> page AC63160001070200...AC63160001070200
>      |       | |_ Index Page
>      |       |
>      |       |_ Data Page
>      |
>      |_ PageNo 1467308
>
> 1. May be you modified the data base configuration file and you changed
> the assignment of data devspaces and raw devices after you have restored
> the hostfile?

Sorry, I failed to mention that this was a "migration" rather than a
restore. We made the backup on our project machine and use that to do
the automated test runs of dbt2 on our STP lab systems. The test runs
have migrated that database backup to the STP test machines many times
over under normal conditions.

>
> 2. May be your experimental kernel did not execute the write order of the USM PAGE?

Sorry to be clueless. Can you tell me what a "USM Page" is? I've seen
"USM" mentioned as a type of memory, so I'm confused about what it's
doing on disk.

>
> 3. May be you have a hardware problem ?

At this point it appears to be a bug in the OS (one of the Linux 2.6
test kernels). We have a patched kernel working now, and our kernel
guys are making sure it is a universal solution.

Interestingly, we were running this dbt2 test on RedHat 9.0. In addition
to the STP lab systems, we were also using our project machine to track
down the bug. The error was a little different (I think because it was
a restore there, not a migration). The strace we had was faulty with
the new NPTL threads library. We could see a bad write (0 bytes written
for an 8192-byte request), but couldn't find the open anywhere for that
file descriptor. We started vserver with LD_ASSUME_KERNEL=2.4.1 to
revert to the old threads, and strace produced everything! Seeing the
open call was the key.
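For anyone chasing a similar symptom, the condition that strace made visible (a write that returns fewer bytes than requested) can also be checked in the calling code. This is only a minimal Python sketch: `is_short_write` and `write_page` are hypothetical helper names, the temporary file is a stand-in for the raw device, and none of it reflects how the SAP DB kernel actually performs its I/O.

```python
import os
import tempfile

PAGE_SIZE = 8192  # size of the write request seen in the strace output


def is_short_write(requested: int, written: int) -> bool:
    """True when fewer bytes were written than requested (e.g. 0 of 8192)."""
    return written < requested


def write_page(fd: int, page: bytes) -> int:
    """Write one page; raise OSError if the write comes back short."""
    written = os.write(fd, page)
    if is_short_write(len(page), written):
        raise OSError(f"short write: {written} of {len(page)} bytes")
    return written


# Usage against an ordinary temporary file (a stand-in for the raw device).
with tempfile.TemporaryFile() as f:
    n = write_page(f.fileno(), bytes(PAGE_SIZE))
```

A check like this only reports the short write; finding out which file the descriptor refers to still needs the matching open call, which is what the strace run under LD_ASSUME_KERNEL=2.4.1 provided.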
Otherwise we haven't had trouble running with RH9, except that the
address space is different with the new NPTL threads, and we can't get
quite as much database buffer cache as we used to, due to the ~3GB
limitation.

Thanks again for the help!

>
> I prefer no. 1.
>
> Regards,
> Torsten
>
> SAP DB, SAP Labs Berlin
>
>
>
> -----Original Message-----
> From: Mary Edie Meredith [mailto:[EMAIL PROTECTED]
> Sent: Dienstag, 16. September 2003 21:51
> To: [EMAIL PROTECTED]
> Subject: In need of enlightenment...
>
>
> We are running the OSDL dbt2-1tier test (on sapdb 7.3.0.25) and are
> getting the failure below. This is running an experimental Linux kernel
> (with a bug we feel), but we are stumped at what is causing it.
>
> The failure occurs during the test run as we start to migrate a copy of
> the database to the test machine (ie during the restore). With very
> similar kernels the process works just fine, so I don't think it is our
> kit.
>
> I think the message means that the first page of /dev/raw/raw20 is read
> and does not contain what the thread expects (some header, perhaps?), so
> it assumes the disk is corrupt and errors out.
> If anyone can enlighten us further, it would be greatly appreciated:
>
>
>
> 2003-09-12 16:13:53 4245 ERR 54001 I/O page AC63160001070200...AC63160001070200
> 2003-09-12 16:13:53 4245 ERR 54001 I/O BAD USM PAGE 0
> 2003-09-12 16:13:53 4245 ERR 54001 I/O on DEVNO 10 DEV_OFFSET 0
> 2003-09-12 16:13:53 4245 ERR 53016 I/O /dev/raw/raw20
> 2003-09-12 16:13:55 4245 ERR 54001 I/O page AC63160001070200...AC63160001070200
> 2003-09-12 16:13:55 4245 ERR 54001 I/O BAD USM PAGE 0
> 2003-09-12 16:13:55 4245 ERR 54001 I/O on DEVNO 10 DEV_OFFSET 0
> 2003-09-12 16:13:55 4245 ERR 53016 I/O /dev/raw/raw20
> 2003-09-12 16:13:57 4245 ERR 54001 I/O page AC63160001070200...AC63160001070200
> 2003-09-12 16:13:57 4245 ERR 54001 I/O BAD USM PAGE 0
> 2003-09-12 16:13:57 4245 ERR 54001 I/O on DEVNO 10 DEV_OFFSET 0
> 2003-09-12 16:13:57 4245 ERR 53016 I/O /dev/raw/raw20
>
> The whole diag file is in:
> http://khack.osdl.org/stp/279795/results/db_knldiag.out
>
--
Mary Edie Meredith <[EMAIL PROTECTED]>
Open Source Development Lab

--
MaxDB Discussion Mailing List
For list archives: http://lists.mysql.com/maxdb
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]
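
As a cross-check on the page dump in the log, the first four bytes of the header (AC 63 16 00) can be read as a little-endian unsigned 32-bit integer, which gives the page number Torsten quotes. A minimal Python sketch follows; treating bytes 4 and 5 as the two page-type fields is an assumption drawn from his diagram, not from SAP DB documentation, and the variable names are illustrative only.

```python
import struct

# First eight bytes of the page as printed in the knldiag "I/O page" line.
header = bytes.fromhex("AC63160001070200")

# Bytes 0-3: page number, little-endian unsigned 32-bit.
(page_no,) = struct.unpack_from("<I", header, 0)

# Assumption: bytes 4 and 5 are the type fields that the diagram labels
# "Data Page" and "Index Page"; the numeric codes are just what the dump shows.
page_type, page_type2 = header[4], header[5]

print(page_no)  # 1467308
```

The decoded value matches the "pageNo 1467308" in Torsten's reply, which is consistent with his reading that an ordinary data index page, not the USM page, sits at offset 0 of the device.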
