Re: [HACKERS] [ADMIN] ERROR: could not read block

2005-12-01 Thread Kevin Grittner
Due to its size, in the Windows environment we can't dump this
database in any format except plain text, so the zlib issues don't
apply here.

-Kevin


 Qingqing Zhou [EMAIL PROTECTED]  

 By they way, they found that they were getting this on a pg_dump,
 too.  We will test both failure cases.  If the test goes OK, we would
 be happy to leave it in production with this patch.


I can believe that pg_dump faces the similar situtation, i.e., running
out
of kernel buffers. But seems pg_dump supports -Z 0..9 option which
uses
some external I/O functions from zlib. This part may be not easy to
retry.


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [ADMIN] ERROR: could not read block

2005-12-01 Thread Kevin Grittner
[Apologies for the delayed response; fighting through a backlog.]

I checked with out DBAs, and they are willing to test it.

By they way, they found that they were getting this on a pg_dump,
too.  We will test both failure cases.  If the test goes OK, we would
be happy to leave it in production with this patch.

-Kevin


 Qingqing Zhou [EMAIL PROTECTED]  

Tom Lane [EMAIL PROTECTED] wrote

 Would a simple retry loop actually help?  It's not clear to me how
 persistent such a failure would be.


[with reply to all followup threads] Yeah, this is the key and we
definitely 
have no 100% guarantee that several retries will solve the problem -
just as 
the situation in pg_unlink/pg_rename. But shall we do something now? If 
Kevin could help on testing(you may have to revert the registry changes
:-() 
, I would like to send a patch in the retry style.

Regards,
Qingqing 



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] [ADMIN] ERROR: could not read block

2005-12-01 Thread Tom Lane
Qingqing Zhou [EMAIL PROTECTED] writes:
 I come up with a patch to fix server-side problem.

Applied.

 For windows, I set a one second waiting time -

The code actually does one millisecond; I left it that way since it
seems a reasonable value.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [ADMIN] ERROR: could not read block

2005-11-30 Thread Qingqing Zhou


On Wed, 30 Nov 2005, Kevin Grittner wrote:


 I checked with out DBAs, and they are willing to test it.

Thanks, that's very nice!

 By they way, they found that they were getting this on a pg_dump,
 too.  We will test both failure cases.  If the test goes OK, we would
 be happy to leave it in production with this patch.


I can believe that pg_dump faces the similar situtation, i.e., running out
of kernel buffers. But seems pg_dump supports -Z 0..9 option which uses
some external I/O functions from zlib. This part may be not easy to
retry.

Magnus, do you want to work on this? If not, I will give it a try.

Regards,
Qingqing

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [ADMIN] ERROR: could not read block

2005-11-30 Thread Qingqing Zhou

I come up with a patch to fix server-side problem. The basic idea is to
convert ERROR_NO_SYSTEM_RESOURCES to EINTR and add code to do retry unless
a new error encountered or successfully done. I tweak the FileRead() logic
on returnCode = 0 a little bit by separating it to 0 and ==0
parts. This is because if our read passed EOF, read() will not set errno
which may cause a dead loop if a previous read() is interrupted.

For windows, I set a one second waiting time - this should be ok since the
problem is very rare. If the error is permenate, you can always SIGINT the
process since the waiting is done by pg_usleep().

Regards,
Qingqing

---

Index: src/backend/storage/file/fd.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/storage/file/fd.c,v
retrieving revision 1.122
diff -c -r1.122 fd.c
*** src/backend/storage/file/fd.c   22 Nov 2005 18:17:20 -  1.122
--- src/backend/storage/file/fd.c   1 Dec 2005 01:09:59 -
***
*** 1009,1019 
if (returnCode  0)
return returnCode;

!   returnCode = read(VfdCache[file].fd, buffer, amount);
!   if (returnCode  0)
!   VfdCache[file].seekPos += returnCode;
!   else
!   VfdCache[file].seekPos = FileUnknownPos;

return returnCode;
  }
--- 1009,1052 
if (returnCode  0)
return returnCode;

!   for (;;)
!   {
!   returnCode = read(VfdCache[file].fd, buffer, amount);
!
!   if (returnCode  0)
!   VfdCache[file].seekPos += returnCode;
!   else if (returnCode == 0)
!   VfdCache[file].seekPos = FileUnknownPos;
!   else
!   {
! #ifdef WIN32
!   DWORD error = GetLastError();
!
!   switch (error)
!   {
!   /*
!* Since we are using buffered IO now, so 
windows may run
!* out of kernel buffer and return a 
Insufficient system
!* resources error. Retry to solve it.
!*/
!   case ERROR_NO_SYSTEM_RESOURCES:
!   pg_usleep(1000);
!   errno = EINTR;
!   break;
!   default:
!   _dosmaperr(error);
!   Assert(errno != EINTR);
!   }
! #endif
!   /* Ok if interrupted and retry */
!   if (errno == EINTR)
!   continue;
!
!   VfdCache[file].seekPos = FileUnknownPos;
!   }
!
!   break;
!   }

return returnCode;
  }
***
*** 1033,1049 
if (returnCode  0)
return returnCode;

!   errno = 0;
!   returnCode = write(VfdCache[file].fd, buffer, amount);

!   /* if write didn't set errno, assume problem is no disk space */
!   if (returnCode != amount  errno == 0)
!   errno = ENOSPC;

!   if (returnCode  0)
!   VfdCache[file].seekPos += returnCode;
!   else
!   VfdCache[file].seekPos = FileUnknownPos;

return returnCode;
  }
--- 1066,1108 
if (returnCode  0)
return returnCode;

!   for (;;)
!   {
!   errno = 0;
!   returnCode = write(VfdCache[file].fd, buffer, amount);

!   /* if write didn't set errno, assume problem is no disk space */
!   if (returnCode != amount  errno == 0)
!   errno = ENOSPC;

!   if (returnCode  0)
!   VfdCache[file].seekPos += returnCode;
!   else
!   {
! #ifdef WIN32
!   DWORD error = GetLastError();
!
!   switch (error)
!   {
!   /* see comments in FileRead() */
!   case ERROR_NO_SYSTEM_RESOURCES:
!   pg_usleep(1000);
!   errno = EINTR;
!   break;
!   default:
!   _dosmaperr(error);
!   Assert(errno != EINTR);
!   }
! #endif
!   /* Ok if interrupted and retry */
!   if (errno == EINTR)
!   continue;
!
!   VfdCache[file].seekPos = FileUnknownPos;
!   }
!
!   break;
!   }

return returnCode;
  }

---(end of broadcast)---
TIP 6: explain 

Re: [HACKERS] [ADMIN] ERROR: could not read block

2005-11-30 Thread Tom Lane
Qingqing Zhou [EMAIL PROTECTED] writes:
 ! default:
 ! _dosmaperr(error);
 ! Assert(errno != EINTR);

What's the point of that ... didn't it already happen inside read()?

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [ADMIN] ERROR: could not read block

2005-11-30 Thread Qingqing Zhou


On Thu, 1 Dec 2005, Tom Lane wrote:

 Qingqing Zhou [EMAIL PROTECTED] writes:
  !   default:
  !   _dosmaperr(error);
  !   Assert(errno != EINTR);

 What's the point of that ... didn't it already happen inside read()?


Recall that we have some reports that read() failed to convert some
windows error number to some meaningful errno. For example, the
ERROR_SHARING_VIOLATION error was converted to EINVAL. So we do it
ourselves here and we can get better diagnostic information if this error
is reported again.

Regards,
Qingqing

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [ADMIN] ERROR: could not read block

2005-11-21 Thread Qingqing Zhou

Tom Lane [EMAIL PROTECTED] wrote

 Would a simple retry loop actually help?  It's not clear to me how
 persistent such a failure would be.


[with reply to all followup threads] Yeah, this is the key and we definitely 
have no 100% guarantee that several retries will solve the problem - just as 
the situation in pg_unlink/pg_rename. But shall we do something now? If 
Kevin could help on testing(you may have to revert the registry changes :-() 
, I would like to send a patch in the retry style.

Regards,
Qingqing 



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [ADMIN] ERROR: could not read block

2005-11-21 Thread Jim C. Nasby
On Thu, Nov 17, 2005 at 07:56:21PM +0100, Magnus Hagander wrote:
 The way I read it, a delay should help. It's basically running out of
 kernel buffers, and we just delay, somebody else (another process, or an
 IRQ handler, or whatever) should get finished with their I/O, free up
 the buffer, and let us have it. Looking around a bit I see several
 references that you should retry on it, but nothing in the API docs.
 I do think it's probably a good idea to do a short delay before retrying
 - at least to yield the CPU for one slice. That would greatly increase
 the probability of someone else finishing their I/O...

If that makes it into code, ISTM it would be good if it also threw a
NOTICE so that users could see if this was happening; kinda like the
notice about log files being recycled frequently.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [ADMIN] ERROR: could not read block

2005-11-21 Thread Qingqing Zhou

Magnus Hagander [EMAIL PROTECTED] wrote

 The way I read it, a delay should help. It's basically running out of
 kernel buffers, and we just delay, somebody else (another process, or an
 IRQ handler, or whatever) should get finished with their I/O, free up
 the buffer, and let us have it. Looking around a bit I see several
 references that you should retry on it, but nothing in the API docs.
 I do think it's probably a good idea to do a short delay before retrying
 - at least to yield the CPU for one slice. That would greatly increase
 the probability of someone else finishing their I/O...


More I read on the second thread:

 NTBackupread and NTBackupwrite both use buffered I/O. This means that 
Windows NT caches the I/O that is performed against the stream. It is also 
the only API that will back up the metadata of a file. This cache is pulled 
from limited resources: namely, pool and nonpaged pool. Because of this, 
extremely large numbers of files or files that are very large may cause the 
pool resources to run low. 

So does it imply that if we use unbuffered I/O in Windows system will 
elminate this problem? If so, just add FILE_FLAG_NO_BUFFERING when we open 
data file will solve the problem -- but this change in fact very invasive, 
because it will make the strategy of server I/O optimization totally 
different from *nix.

Regards,
Qingqing



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] [ADMIN] ERROR: could not read block

2005-11-17 Thread Magnus Hagander
[copying this one over to hackers]

 Our DBAs reviewed the Microsoft documentation you referenced, 
 modified the registry, and rebooted the OS.  We've been 
 beating up on the database without seeing the error so far.  
 We'll keep at it for a while.

Very interesting. As this seems to be a resource error, a couple of
questions. Sorry if you've already answered some of them, couldn't find
it in the archives.
 
1) Is this a dedicated pg server, or does it have something else on it?

2) We have to ask this - do you run any antivirus on it, that might nto
be releasing resources the right way? Anything else that might stick in
a kernel driver?

3) Are you hitting the database with many connections, or is this a
single/few connection scenario? Are the other connections typically
active when this shows up?


Seems like we could just retry when we get this failure. The question is
we need to do a small amount of sleep before we do? Also, we can't just
retry forever, there has to be some kind of end to it...
(If you read the SQL kb, it can be read as retrying is the correct
thing, because the bug in sql was that it didn't retry)

//Magnus

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [ADMIN] ERROR: could not read block

2005-11-17 Thread Qingqing Zhou

Magnus Hagander [EMAIL PROTECTED] wrote

 Seems like we could just retry when we get this failure. The question is
 we need to do a small amount of sleep before we do? Also, we can't just
 retry forever, there has to be some kind of end to it...
 (If you read the SQL kb, it can be read as retrying is the correct
 thing, because the bug in sql was that it didn't retry)


Agree on the retry solution. Yes, two important factors are: intervals, 
times.  I suspect if it is a dedicated server, serveral retry can handle it. 
But for a server might running backup together, who knows how long we need. 
But in either way, I don't think an endless loop is needed -- at most 3 
minutes (since s_lock() does this :-)).

Also, this is a partial solution to the invalid parameter win32 IO 
problem. There are some other cases like ACESS_VIOLATION error need more 
evidence to pin down.

Regards,
Qingqing

P.s. Go to be out of town for several days ...




---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] [ADMIN] ERROR: could not read block

2005-11-17 Thread Kevin Grittner
1) We run a couple Java applications on the same box to provide
middle tier access.  When the box is heavily loaded, I think I've
seen about 80% PostgreSQL, 20% Java load.

2) I checked that no antivirus software was running, and had the
techs pare down the services running on that box to the absolute
minimum after the second failure, so that we could eliminate such
issues as possible causes.

3) The aforementioned Java apps hold open 21 database
connections.  (One for a software publisher to query a list of jar
files for access to the database, and 20 for a connection pool in
the middle tier.)  The way the pool is configured, six of those are
used for queries of normal priority, so we rarely have more than
six connections doing anything an any one moment.  During the
initial failure, the middle tier was under normal load, so 45,000
inserts were made to the table in question during the ujpdate.
After we hit the problem, we removed that middle tier from the
list of targets, so it was running, but totally idle during the
remaining tests.

None of this seems material, however.  It's pretty clear that the
problem was exhaustion of the Windows page pool.  Our Windows
experts have reconfigured the machine (which had been tuned
for Sybase ASE).  Their changes have boosted the page pool
from 20,000 entries to 180,000 entries.  We're continuing to test
to ensure that the problem is not showing up with this
configuration; but, so far, it looks good.

If we don't want to tell Windows users to make highly technical
changes to the Windows registry in order to use PostgreSQL,
it does seem wise to use retries, as has already been discussed
on this thread.

-Kevin


 Magnus Hagander [EMAIL PROTECTED]  
[copying this one over to hackers]

 Our DBAs reviewed the Microsoft documentation you referenced, 
 modified the registry, and rebooted the OS.  We've been 
 beating up on the database without seeing the error so far.  
 We'll keep at it for a while.

Very interesting. As this seems to be a resource error, a couple of
questions. Sorry if you've already answered some of them, couldn't find
it in the archives.
 
1) Is this a dedicated pg server, or does it have something else on it?

2) We have to ask this - do you run any antivirus on it, that might nto
be releasing resources the right way? Anything else that might stick in
a kernel driver?

3) Are you hitting the database with many connections, or is this a
single/few connection scenario? Are the other connections typically
active when this shows up?


Seems like we could just retry when we get this failure. The question is
we need to do a small amount of sleep before we do? Also, we can't just
retry forever, there has to be some kind of end to it...
(If you read the SQL kb, it can be read as retrying is the correct
thing, because the bug in sql was that it didn't retry)

//Magnus

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [ADMIN] ERROR: could not read block

2005-11-17 Thread Tom Lane
Kevin Grittner [EMAIL PROTECTED] writes:
 None of this seems material, however.  It's pretty clear that the
 problem was exhaustion of the Windows page pool.
 ...
 If we don't want to tell Windows users to make highly technical
 changes to the Windows registry in order to use PostgreSQL,
 it does seem wise to use retries, as has already been discussed
 on this thread.

Would a simple retry loop actually help?  It's not clear to me how
persistent such a failure would be.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [ADMIN] ERROR: could not read block

2005-11-17 Thread Kevin Grittner
I'm not an expert on that, but it seems reasonable to me that the
page pool would free space as the I/O system caught up with
the load.  Also, I'm going on what was said by Qingqing and
in one of the pages he referenced:

http://support.microsoft.com/default.aspx?scid=kb;en-us;274310

-Kevin


 Tom Lane [EMAIL PROTECTED]  
Kevin Grittner [EMAIL PROTECTED] writes:
 None of this seems material, however.  It's pretty clear that the
 problem was exhaustion of the Windows page pool.
 ...
 If we don't want to tell Windows users to make highly technical
 changes to the Windows registry in order to use PostgreSQL,
 it does seem wise to use retries, as has already been discussed
 on this thread.

Would a simple retry loop actually help?  It's not clear to me how
persistent such a failure would be.

regards, tom lane


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] [ADMIN] ERROR: could not read block

2005-11-17 Thread Magnus Hagander
  Tom Lane [EMAIL PROTECTED]  
 Kevin Grittner [EMAIL PROTECTED] writes:
  None of this seems material, however.  It's pretty clear that the 
  problem was exhaustion of the Windows page pool.
  ...
  If we don't want to tell Windows users to make highly technical 
  changes to the Windows registry in order to use PostgreSQL, it does 
  seem wise to use retries, as has already been discussed on this 
  thread.
 
 Would a simple retry loop actually help?  It's not clear to 
 me how persistent such a failure would be.

(Not sure why I didn't get Toms mail - lists acting up again? Anyway, I
got Kevins response, but am responding primarily to Tom)

The way I read it, a delay should help. It's basically running out of
kernel buffers, and we just delay, somebody else (another process, or an
IRQ handler, or whatever) should get finished with their I/O, free up
the buffer, and let us have it. Looking around a bit I see several
references that you should retry on it, but nothing in the API docs.
I do think it's probably a good idea to do a short delay before retrying
- at least to yield the CPU for one slice. That would greatly increase
the probability of someone else finishing their I/O...

That's how I read it, but I'm not 100% sure.

//Magnus

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [ADMIN] ERROR: could not read block

2005-11-17 Thread Magnus Hagander
 None of this seems material, however.  It's pretty clear that 
 the problem was exhaustion of the Windows page pool.  Our 
 Windows experts have reconfigured the machine (which had been 
 tuned for Sybase ASE).  Their changes have boosted the page 
 pool from 20,000 entries to 180,000 entries.  We're 
 continuing to test to ensure that the problem is not showing 
 up with this configuration; but, so far, it looks good.

Nope, with these numbers it doesn't. I was looking for a reason as to
why it would exhaust the pool - such as a huge number of connections.
Which doesn't appear to be so :-(

Another thing that will affect this is if you have a lot of network
sockets open. Anything like that?


BTW; do you get any eventid 2020 in your eventlog?


 If we don't want to tell Windows users to make highly 
 technical changes to the Windows registry in order to use 
 PostgreSQL, it does seem wise to use retries, as has already 
 been discussed on this thread.

Yeah, I think it's at least worth a try at that.

//Magnus

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] [ADMIN] ERROR: could not read block

2005-11-17 Thread Kevin Grittner
There weren't a large number of connections -- it seemed to be
that the one big update query, by itself, would do this.  It seemed
to get through a lot of rows before failing.  This table is normally
insert only -- so it would likely be getting most or all of the space
for inserting the updated rows from extending the table.  Also, the
only reasonable plan for this update would be a table scan, so
it is possible that the failure occurred some time after the scan got
to rows added by the update statement.

It appears that the techs cleared the eventlog when they
reconfigured the machine, so I can no longer check for events
from the failures.   :-(

-Kevin


 Magnus Hagander [EMAIL PROTECTED]  
 None of this seems material, however.  It's pretty clear that 
 the problem was exhaustion of the Windows page pool.  Our 
 Windows experts have reconfigured the machine (which had been 
 tuned for Sybase ASE).  Their changes have boosted the page 
 pool from 20,000 entries to 180,000 entries.  We're 
 continuing to test to ensure that the problem is not showing 
 up with this configuration; but, so far, it looks good.

Nope, with these numbers it doesn't. I was looking for a reason as to
why it would exhaust the pool - such as a huge number of connections.
Which doesn't appear to be so :-(

Another thing that will affect this is if you have a lot of network
sockets open. Anything like that?


BTW; do you get any eventid 2020 in your eventlog?


 If we don't want to tell Windows users to make highly 
 technical changes to the Windows registry in order to use 
 PostgreSQL, it does seem wise to use retries, as has already 
 been discussed on this thread.

Yeah, I think it's at least worth a try at that.

//Magnus


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [ADMIN] ERROR: could not read block

2005-11-17 Thread Kevin Grittner
A couple clarifications:

There were only a few network sockets open.

I'm told that the eventlog was reviewed for any events which
mgiht be related to the failures before it was cleared.  They
found none, so that makes it fairly certain there was no 2020
event.

-Kevin


 Kevin Grittner [EMAIL PROTECTED]  
There weren't a large number of connections -- it seemed to be
that the one big update query, by itself, would do this.  It seemed
to get through a lot of rows before failing.  This table is normally
insert only -- so it would likely be getting most or all of the space
for inserting the updated rows from extending the table.  Also, the
only reasonable plan for this update would be a table scan, so
it is possible that the failure occurred some time after the scan got
to rows added by the update statement.

It appears that the techs cleared the eventlog when they
reconfigured the machine, so I can no longer check for events
from the failures.   :-(

-Kevin


 Magnus Hagander [EMAIL PROTECTED]  
 None of this seems material, however.  It's pretty clear that 
 the problem was exhaustion of the Windows page pool.  Our 
 Windows experts have reconfigured the machine (which had been 
 tuned for Sybase ASE).  Their changes have boosted the page 
 pool from 20,000 entries to 180,000 entries.  We're 
 continuing to test to ensure that the problem is not showing 
 up with this configuration; but, so far, it looks good.

Nope, with these numbers it doesn't. I was looking for a reason as to
why it would exhaust the pool - such as a huge number of connections.
Which doesn't appear to be so :-(

Another thing that will affect this is if you have a lot of network
sockets open. Anything like that?


BTW; do you get any eventid 2020 in your eventlog?


 If we don't want to tell Windows users to make highly 
 technical changes to the Windows registry in order to use 
 PostgreSQL, it does seem wise to use retries, as has already 
 been discussed on this thread.

Yeah, I think it's at least worth a try at that.

//Magnus


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


---(end of broadcast)---
TIP 6: explain analyze is your friend