Re: [ADMIN] ERROR: could not read block

2011-05-22 Thread Diego Fernández Slezak
Thanks Tom. I executed a REINDEX DATABASE and received the error: . . . NOTICE: table "pg_enum" was reindexed NOTICE: table "pg_namespace" was reindexed NOTICE: table "pg_conversion" was reindexed NOTICE: table "pg_depend" was reindexed NOTICE: table "users" was reindexed NOTICE: table "resu

[ADMIN] ERROR: could not read block

2011-05-21 Thread Diego Fernández Slezak
Hello everybody, I had a hard drive failure last week. After lots of effort I've been able to backup a 700GB database, with only one file with corruption. When I do some big queries, it throws me errors on this faulty file: could not read block 390041 of relation 1663/350994/351212: read only 0 o
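
Editorial note: the error text above can be turned into a file name and byte offset, which helps when checking which on-disk file is damaged. The sketch below is illustrative only and not from the thread. It assumes the three numbers after "relation" are tablespace OID / database OID / relfilenode, and that the server uses the default 8 kB block size and 1 GB segment files; both are build-time settings, so verify them for your installation. The function name `locate_block` and the returned keys are hypothetical.

```python
import re

# Assumptions (defaults, not confirmed by the thread): 8 kB blocks, 1 GB segments.
BLOCK_SIZE = 8192
BLOCKS_PER_SEGMENT = (1024 * 1024 * 1024) // BLOCK_SIZE  # 131072

# Matches the message format quoted in the thread.
ERROR_RE = re.compile(
    r"could not read block (\d+) of relation (\d+)/(\d+)/(\d+)"
)


def locate_block(message):
    """Return where the failing block would live on disk, under the assumptions above."""
    match = ERROR_RE.search(message)
    if match is None:
        raise ValueError("not a 'could not read block' message")
    block, tablespace, database, relfilenode = map(int, match.groups())
    # Large relations are split into segment files: N, N.1, N.2, ...
    segment, block_in_segment = divmod(block, BLOCKS_PER_SEGMENT)
    if segment == 0:
        segment_file = str(relfilenode)
    else:
        segment_file = "%d.%d" % (relfilenode, segment)
    return {
        "tablespace_oid": tablespace,
        "database_oid": database,
        "relfilenode": relfilenode,
        "segment_file": segment_file,
        "byte_offset": block_in_segment * BLOCK_SIZE,
    }


if __name__ == "__main__":
    msg = "could not read block 390041 of relation 1663/350994/351212"
    print(locate_block(msg))
```

The relfilenode can then be matched against `pg_class.relfilenode` in the affected database to find out which table or index the file belongs to.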

Re: [ADMIN] ERROR: could not read block

2011-05-21 Thread Tom Lane
Diego Fernández Slezak writes: > Hello everybody, > I had a hard drive failure last week. After lots of effort I've been able to > backup a 700GB database, with only one file with corruption. > When I do some big queries, it throws me errors on this faulty file: > could not rea

Re: [HACKERS] [ADMIN] ERROR: could not read block

2005-11-21 Thread Jim C. Nasby
On Thu, Nov 17, 2005 at 07:56:21PM +0100, Magnus Hagander wrote: > The way I read it, a delay should help. It's basically running out of > kernel buffers, and we just delay, somebody else (another process, or an > IRQ handler, or whatever) should get finished with their I/O, free up > the buffer, a

Re: [HACKERS] [ADMIN] ERROR: could not read block

2005-11-17 Thread Kevin Grittner
A couple clarifications: There were only a few network sockets open. I'm told that the eventlog was reviewed for any events which might be related to the failures before it was cleared. They found none, so that makes it fairly certain there was no 2020 event. -Kevin >>> "Kevin Grittner" <[EMA

Re: [HACKERS] [ADMIN] ERROR: could not read block

2005-11-17 Thread Kevin Grittner
There weren't a large number of connections -- it seemed to be that the one big update query, by itself, would do this. It seemed to get through a lot of rows before failing. This table is normally "insert only" -- so it would likely be getting most or all of the space for inserting the updated r

Re: [HACKERS] [ADMIN] ERROR: could not read block

2005-11-17 Thread Magnus Hagander
> None of this seems material, however. It's pretty clear that > the problem was exhaustion of the Windows page pool. Our > Windows experts have reconfigured the machine (which had been > tuned for Sybase ASE). Their changes have boosted the page > pool from 20,000 entries to 180,000 entries

Re: [HACKERS] [ADMIN] ERROR: could not read block

2005-11-17 Thread Magnus Hagander
> >>> Tom Lane <[EMAIL PROTECTED]> >>> > "Kevin Grittner" <[EMAIL PROTECTED]> writes: > > None of this seems material, however. It's pretty clear that the > > problem was exhaustion of the Windows page pool. > > ... > > If we don't want to tell Windows users to make highly technical > > changes

Re: [HACKERS] [ADMIN] ERROR: could not read block

2005-11-17 Thread Kevin Grittner
I'm not an expert on that, but it seems reasonable to me that the page pool would free space as the I/O system caught up with the load. Also, I'm going on what was said by Qingqing and in one of the pages he referenced: http://support.microsoft.com/default.aspx?scid=kb;en-us;274310 -Kevin >>>

Re: [HACKERS] [ADMIN] ERROR: could not read block

2005-11-17 Thread Tom Lane
"Kevin Grittner" <[EMAIL PROTECTED]> writes: > None of this seems material, however. It's pretty clear that the > problem was exhaustion of the Windows page pool. > ... > If we don't want to tell Windows users to make highly technical > changes to the Windows registry in order to use PostgreSQL, >

Re: [HACKERS] [ADMIN] ERROR: could not read block

2005-11-17 Thread Kevin Grittner
1) We run a couple Java applications on the same box to provide middle tier access. When the box is heavily loaded, I think I've seen about 80% PostgreSQL, 20% Java load. 2) I checked that no antivirus software was running, and had the techs pare down the services running on that box to the absol

Re: [ADMIN] ERROR: could not read block

2005-11-17 Thread Magnus Hagander
[copying this one over to hackers] > Our DBAs reviewed the Microsoft documentation you referenced, > modified the registry, and rebooted the OS. We've been > beating up on the database without seeing the error so far. > We'll keep at it for a while. Very interesting. As this seems to be a re

Re: [ADMIN] ERROR: could not read block

2005-11-16 Thread Kevin Grittner
Our DBAs reviewed the Microsoft documentation you referenced, modified the registry, and rebooted the OS. We've been beating up on the database without seeing the error so far. We'll keep at it for a while. -Kevin >>> Qingqing Zhou <[EMAIL PROTECTED]> >>> On Wed, 16 Nov 2005, Kevin Grittner

Re: [ADMIN] ERROR: could not read block

2005-11-16 Thread Tom Lane
Qingqing Zhou <[EMAIL PROTECTED]> writes: > On Wed, 16 Nov 2005, Kevin Grittner wrote: >> [2005-11-16 11:59:29.015 ] 4904 LOG: >> read failed on relation 1663/16385/1494810: -1 bytes, 1450 > 1450 ERROR_NO_SYSTEM_RESOURCES > Insufficient system resources exist to complete the requested service Hm

Re: [ADMIN] ERROR: could not read block

2005-11-16 Thread Qingqing Zhou
On Wed, 16 Nov 2005, Kevin Grittner wrote: > Ran with this change. Didn't take long to hit it. > > [2005-11-16 11:59:29.015 ] 4904 LOG: > read failed on relation 1663/16385/1494810: -1 bytes, 1450 > [2005-11-16 11:59:29.015 ] 4904 ERROR: > could not read block 25447 of relation 1663/16385/149

Re: [ADMIN] ERROR: could not read block

2005-11-16 Thread Kevin Grittner
Ran with this change. Didn't take long to hit it. Let me know if there's anything else I can do. [2005-11-16 11:59:29.015 ] 4904 LOG: read failed on relation 1663/16385/1494810: -1 bytes, 1450 [2005-11-16 11:59:29.015 ] 4904 ERROR: could not read block 25447 of relation 1663/16385/1494810: I

Re: [ADMIN] ERROR: could not read block

2005-11-16 Thread Tom Lane
"Kevin Grittner" <[EMAIL PROTECTED]> writes: > On Linux: > md.c:445: warning: implicit declaration of function `GetLastError' Of course. This is a Windows-only hack. > On Windows: > md.c:445: warning: int format, DWORD arg (arg 6) > md.c:457: warning: int format, DWORD arg (arg 7) I think this

Re: [ADMIN] ERROR: could not read block

2005-11-16 Thread Kevin Grittner
This code generates warnings on both Linux and Windows. My C is too rusty to feel confident of what to do. On Linux: md.c:445: warning: implicit declaration of function `GetLastError' On Windows: md.c:445: warning: int format, DWORD arg (arg 6) md.c:457: warning: int format, DWORD arg (arg 7)

Re: [ADMIN] ERROR: could not read block

2005-11-16 Thread Tom Lane
"Kevin Grittner" <[EMAIL PROTECTED]> writes: > Is there anything you would like me to include in my build for my > test runs, or any steps you would like me to take during the tests? You might want to insert some debugging elog's into mdread() in md.c, rather than in its caller smgrread. I'm conc

Re: [ADMIN] ERROR: could not read block

2005-11-16 Thread Kevin Grittner
Is there anything you would like me to include in my build for my test runs, or any steps you would like me to take during the tests? -Kevin >>> Tom Lane <[EMAIL PROTECTED]> >>> As I said before, we really really need to find out what the Windows-level error code is --- "Invalid argument" isn'

Re: [ADMIN] ERROR: could not read block

2005-11-16 Thread Kevin Grittner
I will patch, build, and run similar updates to try to hit the problem. Hopefully I can have something to post later today. -Kevin >>> Qingqing Zhou <[EMAIL PROTECTED]> >>> On Tue, 15 Nov 2005, Kevin Grittner wrote: > > Is there anything that anyone wants me to do at this point, to try > to p

Re: [ADMIN] ERROR: could not read block

2005-11-15 Thread Tom Lane
"Kevin Grittner" <[EMAIL PROTECTED]> writes: > ERROR: could not read block 1482762 of relation 1663/16385/16483: > Invalid argument > So the block number is increasing each time. I'm inclined to think > that this is the result of the scan passing over rows added by itself. It's just about impos

Re: [ADMIN] ERROR: could not read block

2005-11-15 Thread Qingqing Zhou
On Tue, 15 Nov 2005, Kevin Grittner wrote: > I got the error log working on Windows (with redirect_stderr). I had > to stop and restart postgres to do so. I ran the query (for the fourth > time), and it completed successfully. Strange - the physical read for the 2nd, 3rd, 4th time should be t

Re: [ADMIN] ERROR: could not read block

2005-11-15 Thread Kevin Keith
If I have followed the chain correctly, I saw that you were trying to run an update statement on a large number of records in a large table right? I have changed my strategy in the past for this type of problem. I don't know if it would have fixed this problem or not, but I have seen with Postg

Re: [ADMIN] ERROR: could not read block

2005-11-15 Thread Kevin Grittner
I got the error log working on Windows (with redirect_stderr). I had to stop and restart postgres to do so. I ran the query (for the fourth time), and it completed successfully. I'm not inclined to believe that changing the redirect_stderr setting would change this behavior, so I guess that eit

Re: [ADMIN] ERROR: could not read block

2005-11-15 Thread Kevin Grittner
Correction: dtr=> select count(*) from "DbTranRepository" dtr-> WHERE ( dtr(> ("userId" <> UPPER("userId")) AND dtr(> ("timestampValue" BETWEEN '2005-10-28' AND '2005-11-15')); count 611255 (1 row) I'm becoming more convinced that this happens as the UPDATE runs into rows inser

Re: [ADMIN] ERROR: could not read block

2005-11-15 Thread Kevin Grittner
The table has about 23.3 million rows, of which about 200,000 will be affected by this update. Run time is about an hour. During the first run, the table was the target of about 45,000 inserts. This rerun was done as the only task. A third run (also by itself) gave this: ERROR: could not read

Re: [ADMIN] ERROR: could not read block

2005-11-15 Thread Qingqing Zhou
> > I reran the query. Same error, same relation, different block. > > dtr=> UPDATE > dtr-> "DbTranRepository" > dtr-> SET "userId" = UPPER("userId") > dtr-> WHERE ( > dtr(> ("userId" <> UPPER("userId")) AND > dtr(> ("timestampValue" BETWEEN '2005-10-28' AND '2005-11-15')); > ERROR

Re: [ADMIN] ERROR: could not read block

2005-11-15 Thread Joshua Marsh
On 11/15/05, Kevin Grittner <[EMAIL PROTECTED]> wrote: Could my issue be the same problem as this thread?: http://archives.postgresql.org/pgsql-bugs/2005-11/msg00114.php The references to "Invalid Argument" caught my eye. That thread did start from a very different point, though. -Kevin It's possible

Re: [ADMIN] ERROR: could not read block

2005-11-15 Thread Kevin Grittner
Could my issue be the same problem as this thread?: http://archives.postgresql.org/pgsql-bugs/2005-11/msg00114.php The references to "Invalid Argument" caught my eye. That thread did start from a very different point, though. -Kevin >>> "Kevin Grittner" <[EMAIL PROTECTED]> >>> It appears tha

Re: [ADMIN] ERROR: could not read block

2005-11-15 Thread Kevin Grittner
It appears that the log file is not being written -- I'll start a separate thread on that issue. I reran the query. Same error, same relation, different block. dtr=> UPDATE dtr-> "DbTranRepository" dtr-> SET "userId" = UPPER("userId") dtr-> WHERE ( dtr(> ("userId" <> UPPER("userId")) AND

Re: [ADMIN] ERROR: could not read block

2005-11-15 Thread Andrew Sullivan
On Mon, Nov 14, 2005 at 06:19:16PM -0600, Kevin Grittner wrote: > the moment aren't sure. The current machines are "transitional", > and it may not be too late to set the permanent servers up with ECC > memory. Is it something I should fight for? Yes. Always. A -- Andrew Sullivan | [EMAIL P

Re: [ADMIN] ERROR: could not read block

2005-11-14 Thread Tom Lane
Scott Marlowe <[EMAIL PROTECTED]> writes: > On Mon, 2005-11-14 at 17:20, Kevin Grittner wrote: >> ERROR: could not read block 649847 of relation 1663/16385/16483: >> Invalid argument > When a block is unreadable, this means that the OS is experiencing a > read error from the hard drive. I'd beli

Re: [ADMIN] ERROR: could not read block

2005-11-14 Thread Kevin Grittner
Both machines are IBM xSeries 346 model 884042U with 6 drives in a RAID 5 array through an IBM battery backed controller. We had a couple of these lying around after replacing them with better, but they have been pretty stable workhorses for us. I'm checking on whether the RAM is ECC -- the techs

Re: [ADMIN] ERROR: could not read block

2005-11-14 Thread Joshua Marsh
On 11/14/05, Scott Marlowe <[EMAIL PROTECTED]> wrote: If you were running on top of a RAID 1+0 or RAID 5 array, such an error would likely never have happened, since it would have been detected by the controller, and either the bad block would be mapped out or the drive would be kicked out of the arra

Re: [ADMIN] ERROR: could not read block

2005-11-14 Thread Scott Marlowe
On Mon, 2005-11-14 at 17:20, Kevin Grittner wrote: > A programmer ran a query to fix some data against two "identical" > databases -- one on Linux and one on Windows. They are both 8.1.0, > running on dual hyperthreaded Xeons, with data on RAID5. The Linux > update went fine. The Windows attempt

[ADMIN] ERROR: could not read block

2005-11-14 Thread Kevin Grittner
A programmer ran a query to fix some data against two "identical" databases -- one on Linux and one on Windows. They are both 8.1.0, running on dual hyperthreaded Xeons, with data on RAID5. The Linux update went fine. The Windows attempt gave this: dtr=> UPDATE dtr-> "DbTranRepository" dtr