Re: [PATCHES] Partial match in GIN

2008-04-09 Thread Heikki Linnakangas

Alvaro Herrera wrote:

Well, LIKE %foo% is supposed to match foo unanchored, but with a btree
(or two btrees) you can only get 'foo' anchored to either end of the
string (or both).


Ah, true.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] EXPLAIN progress info

2008-04-09 Thread Heikki Linnakangas

Tom Lane wrote:

Gregory Stark [EMAIL PROTECTED] writes:
There are downsides: 


Insurmountable ones at that.  This one already makes it a non-starter:


a) the overhead of counting rows and loops is there for every query execution,
even if you don't do explain analyze.


and you are also engaging in a flight of fantasy about what the
client-side code might be able to handle.  Particularly if it's buried
inside, say, httpd or some huge Java app.  Yeah, you could possibly make
it work for the case that the problem query was manually executed in
psql, but that doesn't cover very much real-world territory.


I think there's two different use cases here. The one that Greg's 
proposal would be good for is a GUI, like pgAdmin. It would be cool to 
see how a query progresses through the EXPLAIN tree when you run it from 
the query tool. That would be great for visualizing the executor; a 
great teaching tool.


But I agree it's no good for use by a DBA to monitor a live system 
running a real-world application. For that we do need something else.



You'd be far more likely to get somewhere with a design that involves
looking from another session to see if anything's happening.  In the
case of queries that are making database changes, pgstattuple is
certainly a usable option.  For SELECT-only queries, I agree it's
harder, but it's still possible.  I seem to recall some discussion of
including a low-overhead progress counter of some kind in the
pg_stat_activity state exposed by a backend.  The number of rows so far
processed by execMain.c in the current query might do for the
definition.


Yeah, something like this would be better for monitoring a live system.

The number of rows processed by execMain.c would only count the number 
of rows processed by the top node of the tree, right? For a query that 
for example performs a gigantic sort, that would be 0 until the sort is 
done, which is not good. It's hard to come up with a single counter 
that's representative :-(.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] EXPLAIN progress info

2008-04-09 Thread Gregory Stark

Heikki Linnakangas [EMAIL PROTECTED] writes:

 Tom Lane wrote:
 Gregory Stark [EMAIL PROTECTED] writes:
 There are downsides: 

 Insurmountable ones at that.  This one already makes it a non-starter:

 a) the overhead of counting rows and loops is there for every query 
 execution,
 even if you don't do explain analyze.

Note that this doesn't include the gettimeofdays. It's just a couple integer
increments and assigments per tuple.

 and you are also engaging in a flight of fantasy about what the
 client-side code might be able to handle.  Particularly if it's buried
 inside, say, httpd or some huge Java app.  Yeah, you could possibly make
 it work for the case that the problem query was manually executed in
 psql, but that doesn't cover very much real-world territory.

 I think there's two different use cases here. The one that Greg's proposal
 would be good for is a GUI, like pgAdmin. It would be cool to see how a query
 progresses through the EXPLAIN tree when you run it from the query tool. That
 would be great for visualizing the executor; a great teaching tool.

It also means if a query takes suspiciously long you don't have to run explain
in another session (possibly getting a different plan) and if it takes way too
long such that it's too long to wait for results you can get an explain
analyze for at least partial data.

 Yeah, something like this would be better for monitoring a live system.

 The number of rows processed by execMain.c would only count the number of rows
 processed by the top node of the tree, right? For a query that for example
 performs a gigantic sort, that would be 0 until the sort is done, which is not
 good. It's hard to come up with a single counter that's representative :-(.

Alternately you could count the number of records which went through
ExecProcNode. That would at least get something which gives you a holistic
view of the query. I don't see how you would know what the expected end-point
would be though.

I think a better way to get a real percentage done would be to add a method
to each node which estimates its percentage done based on the percentage done
its children report and its actual and expected rows and its costs.

So for example a nested loop would calculate P1-(1-P2)/ER1 where P1 is the
percentage done of the first child and P2 is the percentage done of the second
child and ER1 is the expected number of records from the first child. Hash
Join would calculate (P1*C1 + P2*C2)/(C1+C2).

That could get a very good estimate of the percentage done, basically as good
as the estimated number of records.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


[PATCHES] Improve shutdown during online backup, take 2

2008-04-09 Thread Albe Laurenz
This patch replaces the previous version
http://archives.postgresql.org/pgsql-patches/2008-04/msg4.php

As suggested by Simon Riggs in
http://archives.postgresql.org/pgsql-patches/2008-04/msg00013.php ,
a smart shutdown will now wait for online backup mode to finish.
The functions that touch backup_label have been moved to xlog.c.

As suggested by Heikki Linnakangas in
http://archives.postgresql.org/pgsql-patches/2008-04/msg00180.php ,
I have introduced a new postmaster state PM_WAIT_BACKUP.
In this state the postmaster will still accept connections.

Thanks for the feedback!
I'll try to add a link at the Wiki page.

Yours,
Laurenz Albe



backup-shut.doc.patch
Description: backup-shut.doc.patch


backup-shut.patch
Description: backup-shut.patch

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] EXPLAIN progress info

2008-04-09 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 I think a better way to get a real percentage done would be to add a method
 to each node which estimates its percentage done based on the percentage done
 its children report and its actual and expected rows and its costs.

You can spend a week inventing some complicated method, and the patch
will be rejected because it adds too much overhead.  Anything we do here
has to be cheap enough that no one will object to having it turned on
all the time --- else it'll be useless exactly when they need it.

regards, tom lane

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


[PATCHES] Fix for win32 stat() problems

2008-04-09 Thread Magnus Hagander
Attached is a patch that attempts to fix the issues with stat() not
properly updating st_size on win32, as reported in this thread:
http://archives.postgresql.org/pgsql-hackers/2008-03/msg01181.php

It has to have a chance to affect things beyond just the
pg_relation_size() function, so this patch fixes all cases where we do
stat() and use the st_size member. It doesn't change all occurances of
stat() since I didn't want to incur the double filesystem lookups
unnecessary cases.

Any objections?

//Magnus
Index: src/backend/access/transam/xlog.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/access/transam/xlog.c,v
retrieving revision 1.296
diff -c -r1.296 xlog.c
*** src/backend/access/transam/xlog.c	5 Apr 2008 01:34:06 -	1.296
--- src/backend/access/transam/xlog.c	9 Apr 2008 17:01:50 -
***
*** 2560,2566 
--- 2560,2570 
  		 * isn't one, and we'd incorrectly conclude we've reached the end of
  		 * WAL and we're done recovering ...
  		 */
+ #ifndef WIN32
  		if (stat(xlogpath, stat_buf) == 0)
+ #else
+ 		if (pgwin32_safestat(xlogpath, stat_buf) == 0)
+ #endif
  		{
  			if (expectedSize  0  stat_buf.st_size != expectedSize)
  ereport(FATAL,
Index: src/backend/storage/file/fd.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/storage/file/fd.c,v
retrieving revision 1.144
diff -c -r1.144 fd.c
*** src/backend/storage/file/fd.c	10 Mar 2008 20:06:27 -	1.144
--- src/backend/storage/file/fd.c	9 Apr 2008 17:00:22 -
***
*** 999,1005 
--- 999,1009 
  		vfdP-fdstate = ~FD_TEMPORARY;
  		if (log_temp_files = 0)
  		{
+ #ifndef WIN32
  			if (stat(vfdP-fileName, filestats) == 0)
+ #else
+ 			if (pgwin32_safestat(vfdP-fileName, filestats) == 0)
+ #endif
  			{
  if (filestats.st_size = log_temp_files)
  	ereport(LOG,
Index: src/backend/utils/adt/dbsize.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/utils/adt/dbsize.c,v
retrieving revision 1.18
diff -c -r1.18 dbsize.c
*** src/backend/utils/adt/dbsize.c	31 Mar 2008 01:31:43 -	1.18
--- src/backend/utils/adt/dbsize.c	9 Apr 2008 17:09:44 -
***
*** 52,58 
--- 52,62 
  
  		snprintf(filename, MAXPGPATH, %s/%s, path, direntry-d_name);
  
+ #ifndef WIN32
  		if (stat(filename, fst)  0)
+ #else
+ 		if (pgwin32_safestat(filename, fst)  0)
+ #endif
  		{
  			if (errno == ENOENT)
  continue;
***
*** 199,205 
--- 203,213 
  
  		snprintf(pathname, MAXPGPATH, %s/%s, tblspcPath, direntry-d_name);
  
+ #ifndef WIN32
  		if (stat(pathname, fst)  0)
+ #else
+ 		if (pgwin32_safestat(pathname, fst)  0)
+ #endif
  		{
  			if (errno == ENOENT)
  continue;
***
*** 268,274 
--- 276,286 
  			snprintf(pathname, MAXPGPATH, %s.%u,
  	 relationpath, segcount);
  
+ #ifndef WIN32
  		if (stat(pathname, fst)  0)
+ #else
+ 		if (pgwin32_safestat(pathname, fst)  0)
+ #endif
  		{
  			if (errno == ENOENT)
  break;
Index: src/backend/utils/adt/genfile.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/utils/adt/genfile.c,v
retrieving revision 1.19
diff -c -r1.19 genfile.c
*** src/backend/utils/adt/genfile.c	31 Mar 2008 01:31:43 -	1.19
--- src/backend/utils/adt/genfile.c	9 Apr 2008 16:59:25 -
***
*** 161,167 
--- 161,171 
  
  	filename = convert_and_check_filename(filename_t);
  
+ #ifndef WIN32
  	if (stat(filename, fst)  0)
+ #else
+ 	if (pgwin32_safestat(filename, fst)  0)
+ #endif
  		ereport(ERROR,
  (errcode_for_file_access(),
   errmsg(could not stat file \%s\: %m, filename)));
Index: src/include/port.h
===
RCS file: /projects/cvsroot/pgsql/src/include/port.h,v
retrieving revision 1.118
diff -c -r1.118 port.h
*** src/include/port.h	29 Feb 2008 15:31:33 -	1.118
--- src/include/port.h	9 Apr 2008 16:35:26 -
***
*** 298,303 
--- 298,306 
  #define popen(a,b) _popen(a,b)
  #define pclose(a) _pclose(a)
  
+ /* stat() is not guaranteed to report the current size of the file */
+ extern int pgwin32_safestat(const char *path, struct stat *buf);
+ 
  /* Missing rand functions */
  extern long lrand48(void);
  extern void srand48(long seed);
Index: src/port/dirmod.c
===
RCS file: /projects/cvsroot/pgsql/src/port/dirmod.c,v
retrieving revision 1.51
diff -c -r1.51 dirmod.c
*** src/port/dirmod.c	1 Jan 2008 19:46:00 -	1.51
--- src/port/dirmod.c	9 Apr 2008 17:06:58 -
***
*** 447,449 
--- 447,480 
  	pgfnames_cleanup(filenames);
  	return false;
  }
+ 
+ 
+ /*
+  * The stat() function in win32 is not guaranteed to report the st_size
+  * field properly. So we define our own version 

Re: [PATCHES] Fix for win32 stat() problems

2008-04-09 Thread Tom Lane
Magnus Hagander [EMAIL PROTECTED] writes:
 + #ifndef WIN32
   if (stat(xlogpath, stat_buf) == 0)
 + #else
 + if (pgwin32_safestat(xlogpath, stat_buf) == 0)
 + #endif

Ick.  Please do this the way we normally do things when we have to
override broken Windows syscalls, that is put something like

#define stat(...)  pgwin32_stat(...)

into the win32 port header file, so the calls don't have to be
nonstandard.

regards, tom lane

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] Fix for win32 stat() problems

2008-04-09 Thread Magnus Hagander
Tom Lane wrote:
 Magnus Hagander [EMAIL PROTECTED] writes:
  + #ifndef WIN32
  if (stat(xlogpath, stat_buf) == 0)
  + #else
  +   if (pgwin32_safestat(xlogpath, stat_buf) == 0)
  + #endif
 
 Ick.  Please do this the way we normally do things when we have to
 override broken Windows syscalls, that is put something like
 
   #define stat(...)  pgwin32_stat(...)
 
 into the win32 port header file, so the calls don't have to be
 nonstandard.

The reason not to do so was to avoid having to do the two filesystem
calls for *every* stat, instead just calling them both when we actually
need to use the st_size member. I take it you don't think that's a good
enough reason, but I just want to be sure you're aware that's why I did
it the way I did.

Or do you by any chance now a better way to accomplish that goal? :-)

//Magnus

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] Fix for win32 stat() problems

2008-04-09 Thread Tom Lane
Magnus Hagander [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 Ick.

 The reason not to do so was to avoid having to do the two filesystem
 calls for *every* stat, instead just calling them both when we actually
 need to use the st_size member.

I don't think that's worth (a) the code uglification, or (b) the chance
of a bug later due to someone not knowing they need the special version
of stat().  Are there any stat() calls that are in sufficiently
performance-critical paths that it really matters?  How much slower is
the second call, anyway?

regards, tom lane

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] Fix for win32 stat() problems

2008-04-09 Thread Magnus Hagander
Tom Lane wrote:
 Magnus Hagander [EMAIL PROTECTED] writes:
  Tom Lane wrote:
  Ick.
 
  The reason not to do so was to avoid having to do the two filesystem
  calls for *every* stat, instead just calling them both when we
  actually need to use the st_size member.
 
 I don't think that's worth (a) the code uglification, or (b) the
 chance of a bug later due to someone not knowing they need the
 special version of stat().  Are there any stat() calls that are in
 sufficiently performance-critical paths that it really matters?  How
 much slower is the second call, anyway?

Not sure really, the VM I'm working on now is so slow I can't measure
these things properly anyway. Probably not enough to matter compared to
other things - I'll try it that way to see if I can notice any
significant difference.

Trying to prepare a patch that does it the normal way, but so far I'm
failing rather miserably. The *struct* stat is already redefined on
win32, so whenever I try #undef or so it conflicts with that :-( Since
there is no way to #undef only the parametrized version.

Though right now I have the backend linking properly even though a
bunch of files refer to pgwin32_safestat() and I've actually removed
the implementation of the function, so maybe I just need to go to bed...

//Magnus

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] Fix for win32 stat() problems

2008-04-09 Thread Tom Lane
Magnus Hagander [EMAIL PROTECTED] writes:
 Trying to prepare a patch that does it the normal way, but so far I'm
 failing rather miserably. The *struct* stat is already redefined on
 win32, so whenever I try #undef or so it conflicts with that :-( Since
 there is no way to #undef only the parametrized version.

I don't follow ... there's no such redefinition in our code AFAICS.
Do you mean that the system header files declare it as a macro?

regards, tom lane

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] Fix for win32 stat() problems

2008-04-09 Thread Magnus Hagander
Tom Lane wrote:
 Magnus Hagander [EMAIL PROTECTED] writes:
  Trying to prepare a patch that does it the normal way, but so far
  I'm failing rather miserably. The *struct* stat is already
  redefined on win32, so whenever I try #undef or so it conflicts
  with that :-( Since there is no way to #undef only the parametrized
  version.
 
 I don't follow ... there's no such redefinition in our code AFAICS.
 Do you mean that the system header files declare it as a macro?

Yes.

//Magnus

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] Fix for win32 stat() problems

2008-04-09 Thread Andrew Dunstan



Magnus Hagander wrote:

Tom Lane wrote:
  

Magnus Hagander [EMAIL PROTECTED] writes:


Trying to prepare a patch that does it the normal way, but so far
I'm failing rather miserably. The *struct* stat is already
redefined on win32, so whenever I try #undef or so it conflicts
with that :-( Since there is no way to #undef only the parametrized
version.
  

I don't follow ... there's no such redefinition in our code AFAICS.
Do you mean that the system header files declare it as a macro?



Yes.


  


How about #defining safe_stat to be pg_win32_safe_stat on Windows and 
simply stat elsewhere? Then use safe_stat at the places you consider 
critical.


cheers

andrew

--
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches