Re: [HACKERS] Hint Bits and Write I/O

2008-05-28 Thread Simon Riggs

On Tue, 2008-05-27 at 19:32 -0400, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  My proposal is to have this as a two-stage process. When we set the hint
  on a tuple in a clean buffer we mark it BM_DIRTY_HINTONLY, if not
  already dirty. If we set a hint on a buffer that is BM_DIRTY_HINTONLY
  then we mark it BM_DIRTY.
 
 I wonder if it is worth actually counting the number of newly set hint
 bits, rather than just having a counter that saturates at two.  We could
 steal a byte from usage_count without making the buffer headers bigger.

That's the right place to start. We can instrument the backend like that
and then get some data about what actually happens. 
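For illustration only, here is a toy standalone sketch of that instrumentation idea: a spare buffer-header byte counting newly set hint bits, with the dirty flag deferred until a threshold is reached. The struct, field names and threshold below are invented for the sketch, not actual bufmgr code.

#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>

/* Toy stand-in for a buffer header; hint_count models the spare byte
 * that could be stolen from usage_count. */
typedef struct ToyBufferDesc
{
    bool    dirty;          /* models BM_DIRTY */
    uint8_t hint_count;     /* newly set hint bits since last write */
} ToyBufferDesc;

#define HINT_DIRTY_THRESHOLD 4  /* arbitrary threshold for the sketch */

/* Called each time a hint bit is newly set on a tuple in this buffer. */
static void
toy_set_hint(ToyBufferDesc *buf)
{
    if (buf->hint_count < UINT8_MAX)
        buf->hint_count++;

    /* Only mark the buffer dirty once enough hints have accumulated
     * to justify the extra write I/O. */
    if (!buf->dirty && buf->hint_count >= HINT_DIRTY_THRESHOLD)
        buf->dirty = true;
}

int
main(void)
{
    ToyBufferDesc buf = {false, 0};

    for (int i = 0; i < 6; i++)
    {
        toy_set_hint(&buf);
        printf("hints=%d dirty=%d\n", buf.hint_count, buf.dirty);
    }
    return 0;
}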

The other stuff is probably me just explaining it badly, so let's leave
it for now. You're right, it was too complex for a first cut.

  If the bgwriter has time, it will write out BM_DIRTY_HINTONLY buffers,
  though on a consistently busy server this should not occur.
 
 What do you mean by if it has time?  How would it know that?
 
  This won't change the behaviour of first-read-after-copy. To improve
  that behaviour, I suggest that we only move from BM_DIRTY_HINTONLY to
  BM_DIRTY when we are setting the hint for a new xid. If we are just
  setting the same xid over-and-over again then we should avoid setting
  the page dirty. So when data has been loaded via COPY, we will just
  check the status of the xid once, then scan the whole page using the
  single-item transaction cache.
 
 This doesn't make any sense to me.  What is a new xid?  And what is
 setting the same xid over and over?  If a page is full of occurrences
 of the same xid, that doesn't really mean that it's less useful to
 correctly hint each occurrence.
 
 The whole proposal seems a bit overly complicated.  What we talked about
 at PGCon was simply not setting the dirtybit when setting a hint bit.
 There's a certain amount of self-optimization there: if a page
 continually receives hint bit updates, that also means it is getting
 pinned and hence its usage_count stays high, thus it will tend to stay
 in shared buffers until something happens to make it really dirty.
 (Although that argument might not hold water for a bulk seqscan: you'll
 have hinted all the tuples and then very possibly throw the page away
 immediately.  So counting the hints and eventually deciding we did
 enough to justify dirtying the page might be worth doing.)

Yes, we probably need to do something different for bulk seqscans.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Remove redundant extra_desc info for enum GUC variables?

2008-05-28 Thread Magnus Hagander
Tom Lane wrote:
 Magnus Hagander [EMAIL PROTECTED] writes:
  One point of interest is that for client_min_messages and
  log_min_messages, the ordering of the values has significance, and
  it's different for the two cases.
 
  Is there any actual reason why they're supposed to be treated
  differently?
 
 Yeah: LOG level sorts differently in the two cases; it's fairly high
 priority for server log output and much lower for client output.

Ok, easy fix if we break them apart. Should we continue to accept
values that we're not going to care about, or should I change that at
the same time? (for example, client_min_messages doesn't use INFO,
but we do accept that in <= 8.3 anyway)
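For illustration, a standalone toy sketch of what breaking them apart could look like: separate entry arrays per setting, with LOG ordered differently in each. The names and numeric values below are made up for the sketch, not the actual guc code.

#include <stdio.h>

/* Simplified stand-in for guc's config_enum_entry (toy code). */
typedef struct ToyEnumEntry
{
    const char *name;
    int         val;
} ToyEnumEntry;

/* For the server log, LOG sorts as fairly high priority... */
static const ToyEnumEntry log_min_messages_options[] = {
    {"debug", 1}, {"info", 2}, {"notice", 3}, {"warning", 4},
    {"error", 5}, {"log", 6}, {"fatal", 7}, {"panic", 8},
    {NULL, 0}
};

/* ...while for the client it sorts much lower. */
static const ToyEnumEntry client_min_messages_options[] = {
    {"debug", 1}, {"log", 2}, {"notice", 3}, {"warning", 4},
    {"error", 5},
    {NULL, 0}
};

static void
print_options(const char *guc, const ToyEnumEntry *opt)
{
    printf("%s:", guc);
    for (; opt->name != NULL; opt++)
        printf(" %s", opt->name);
    printf("\n");
}

int
main(void)
{
    print_options("log_min_messages", log_min_messages_options);
    print_options("client_min_messages", client_min_messages_options);
    return 0;
}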

//Magnus

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Add dblink function to check if a named connection exists

2008-05-28 Thread Tommy Gildseth
I have locked down access to all dblink_* functions, so that only 
certain privileged users have access to them, and instead provide a set 
of SRF functions defined as security definer functions, where I connect 
to the remote server, fetch some data, disconnect from remote server, 
and return the data.
One obvious disadvantage of this approach, is that I need to connect and 
disconnect in every function. A possible solution to this, would be 
having a function f.ex dblink_exists('connection_name') that returns 
true/false depending on whether the  connection already exists. This 
way, I could just check if a named connection exists, and establish a 
connection if not, and wait until the end of the session to disconnect 
all established connections.


I've attached a patch with a suggested implementation of such a function.


--
Tommy Gildseth
Index: dblink.c
===
RCS file: /projects/cvsroot/pgsql/contrib/dblink/dblink.c,v
retrieving revision 1.73
diff -c -c -r1.73 dblink.c
*** dblink.c	4 Apr 2008 17:02:56 -	1.73
--- dblink.c	28 May 2008 08:06:23 -
***
*** 192,208 
  freeconn = true; \
  			} \
  	} while (0)
- 
  #define DBLINK_GET_NAMED_CONN \
  	do { \
! 			char *conname = text_to_cstring(PG_GETARG_TEXT_PP(0)); \
  			rconn = getConnectionByName(conname); \
  			if(rconn) \
  conn = rconn->conn; \
- 			else \
- DBLINK_CONN_NOT_AVAIL; \
  	} while (0)
  
  #define DBLINK_INIT \
  	do { \
  			if (!pconn) \
--- 192,214 
  freeconn = true; \
  			} \
  	} while (0)
  #define DBLINK_GET_NAMED_CONN \
  	do { \
! 			char *conname = NULL; \
! 			DBLINK_GET_NAMED_CONN_IF_EXISTS; \
! 			if(!rconn) \
! DBLINK_CONN_NOT_AVAIL; \
! 	} while (0)
! 
! #define DBLINK_GET_NAMED_CONN_IF_EXISTS \
! 	do { \
! 			conname = text_to_cstring(PG_GETARG_TEXT_PP(0)); \
  			rconn = getConnectionByName(conname); \
  			if(rconn) \
  conn = rconn->conn; \
  	} while (0)
  
+ 
  #define DBLINK_INIT \
  	do { \
  			if (!pconn) \
***
*** 1056,1061 
--- 1062,1090 
  	PG_RETURN_INT32(PQisBusy(conn));
  }
  
+ 
+ /*
+  * Checks if a given named remote connection exists
+  *
+  * Returns true if the connection exists, false otherwise
+  * Params:
+  *	text connection_name - name of the connection to check
+  *
+  */
+ PG_FUNCTION_INFO_V1(dblink_exists);
+ Datum
+ dblink_exists(PG_FUNCTION_ARGS)
+ {
+ 	PGconn	   *conn = NULL;
+ 	remoteConn *rconn = NULL;
+ 	char *conname = NULL;
+ 
+ 	DBLINK_INIT;
+ 	DBLINK_GET_NAMED_CONN_IF_EXISTS;
+ 
+ 	PG_RETURN_BOOL(conn != NULL);
+ }
+ 
  /*
   * Cancels a running request on a connection
   *
Index: dblink.h
===
RCS file: /projects/cvsroot/pgsql/contrib/dblink/dblink.h,v
retrieving revision 1.20
diff -c -c -r1.20 dblink.h
*** dblink.h	4 Apr 2008 16:57:21 -	1.20
--- dblink.h	28 May 2008 08:06:23 -
***
*** 49,54 
--- 49,55 
  extern Datum dblink_get_result(PG_FUNCTION_ARGS);
  extern Datum dblink_get_connections(PG_FUNCTION_ARGS);
  extern Datum dblink_is_busy(PG_FUNCTION_ARGS);
+ extern Datum dblink_exists(PG_FUNCTION_ARGS);
  extern Datum dblink_cancel_query(PG_FUNCTION_ARGS);
  extern Datum dblink_error_message(PG_FUNCTION_ARGS);
  extern Datum dblink_exec(PG_FUNCTION_ARGS);
Index: dblink.sql.in
===
RCS file: /projects/cvsroot/pgsql/contrib/dblink/dblink.sql.in,v
retrieving revision 1.17
diff -c -c -r1.17 dblink.sql.in
*** dblink.sql.in	5 Apr 2008 02:44:42 -	1.17
--- dblink.sql.in	28 May 2008 08:06:23 -
***
*** 178,183 
--- 178,188 
  AS 'MODULE_PATHNAME', 'dblink_is_busy'
  LANGUAGE C STRICT;
  
+ CREATE OR REPLACE FUNCTION dblink_exists(text)
+ RETURNS boolean
+ AS 'MODULE_PATHNAME', 'dblink_exists'
+ LANGUAGE C STRICT;
+ 
  CREATE OR REPLACE FUNCTION dblink_get_result(text)
  RETURNS SETOF record
  AS 'MODULE_PATHNAME', 'dblink_get_result'
Index: expected/dblink.out
===
RCS file: /projects/cvsroot/pgsql/contrib/dblink/expected/dblink.out,v
retrieving revision 1.23
diff -c -c -r1.23 dblink.out
*** expected/dblink.out	6 Apr 2008 16:54:48 -	1.23
--- expected/dblink.out	28 May 2008 08:06:23 -
***
*** 731,736 
--- 731,748 
0
  (1 row)
  
+ SELECT dblink_exists('dtest1');
+  dblink_exists
+ ---
+  t
+ (1 row)
+ 
+ SELECT dblink_exists('doesnotexist');
+  dblink_exists
+ ---
+  f
+ (1 row)
+ 
  SELECT dblink_disconnect('dtest1');
   dblink_disconnect 
  ---
Index: sql/dblink.sql
===
RCS file: /projects/cvsroot/pgsql/contrib/dblink/sql/dblink.sql,v
retrieving revision 1.20
diff -c -c -r1.20 dblink.sql
*** sql/dblink.sql	6 Apr 2008 16:54:48 -	1.20
--- sql/dblink.sql	28 May 2008 08:06:23 

Re: [HACKERS] Hiding undocumented enum values?

2008-05-28 Thread Magnus Hagander
Alex Hunsaker wrote:
 On Tue, May 27, 2008 at 12:05 PM, Magnus Hagander
 [EMAIL PROTECTED] wrote:
  Alex Hunsaker wrote:
  On Tue, May 27, 2008 at 10:20 AM, Tom Lane [EMAIL PROTECTED]
  wrote:
   I am wondering if it's a good idea to hide the redundant entries
   to reduce clutter in the pg_settings display.  (We could do this
   by adding a hidden boolean to struct config_enum_entry.)
   Thoughts?
 
  +1
 
  regards, tom lane
 
  Maybe something like the attached patch?
 
  Oops, missed that there was a patch posted already. Looks like the
  way to do it (except I'd move the comment :-P) if that's the way we
  go.
 
 OK, the updated patch is on pg_patches under "guc config_enum_entry
 add hidden field"

Thanks, I've reviewed and applied.


 -moved the comment into config_enum_get_options()

I moved it again, to the header :-)

 -fixed a possible buffer underrun if every option was hidden

That fix didn't take into account the possibility of having different
prefixes. Since it is a pretty stupid thing to have a GUC enum with
*only* hidden entries, I just made it do nothing in this case and
updated the comment. The buffer underrun check is still there.
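For anyone following along, a standalone toy sketch of the hidden-flag idea (simplified, not the committed guc code): hidden entries are still accepted as input, but skipped when building the visible list of values.

#include <stdio.h>
#include <string.h>
#include <stdbool.h>

typedef struct ToyEnumEntry
{
    const char *name;
    int         val;
    bool        hidden;     /* accepted as input, not advertised */
} ToyEnumEntry;

/* Toy version of a backslash_quote-style entry list. */
static const ToyEnumEntry backslash_quote_options[] = {
    {"safe_encoding", 0, false},
    {"on", 1, false},
    {"off", 2, false},
    {"true", 1, true},      /* redundant aliases: hidden from pg_settings */
    {"false", 2, true},
    {"yes", 1, true},
    {"no", 2, true},
    {"1", 1, true},
    {"0", 2, true},
    {NULL, 0, false}
};

/* Build the visible list of values, skipping hidden aliases. */
static void
print_visible_options(const ToyEnumEntry *opt)
{
    for (; opt->name != NULL; opt++)
        if (!opt->hidden)
            printf("%s ", opt->name);
    printf("\n");
}

/* Lookup still accepts hidden aliases. */
static bool
lookup(const ToyEnumEntry *opt, const char *value, int *val)
{
    for (; opt->name != NULL; opt++)
        if (strcmp(opt->name, value) == 0)
        {
            *val = opt->val;
            return true;
        }
    return false;
}

int
main(void)
{
    int val;

    print_visible_options(backslash_quote_options);
    if (lookup(backslash_quote_options, "yes", &val))
        printf("'yes' accepted, maps to %d\n", val);
    return 0;
}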

  I looked into just making it a string so we could use parse_bool...
  because backslash_quote seems to be the exception not the rule.
  But I decided having a hidden flag seems more useful anyway...
 
  It used to be a string. We don't want that, because then we can't
  tell the client which possible values are available. That's the
  whole reason for the creation of the enum type gucs...
 
 Well its good i did not go that route then :)

Yup :)

//Magnus

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hint Bits and Write I/O

2008-05-28 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 (Although that argument might not hold water for a bulk seqscan: you'll
 have hinted all the tuples and then very possibly throw the page away
 immediately.  

That seems like precisely the case where we don't want to dirty the buffer.

 So counting the hints and eventually deciding we did
 enough to justify dirtying the page might be worth doing.)

What if we counted how many hint bits were *not* set? I feel like the goal
should be to dirty the buffer precisely once when all the bits can be set. The
problem case is when we dirty the page but still have some hint bits to be set
on a subsequent iteration.

Of course that doesn't deal with the case where tuples are being touched
continuously. Perhaps the idea should be to treat the page as dirty every n
hint bit settings, where n is the number of tuples on the page, or the highest
number of unset hint bits seen on the page, or something like that.
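As a toy standalone illustration of the count-the-unset-hints idea (invented names, not real heapam/bufmgr code): the page is dirtied only once, when no settable hint bits remain.

#include <stdio.h>
#include <stdbool.h>

#define TUPLES_PER_PAGE 8

/* Toy model: each tuple either has its commit hints set or not. */
typedef struct ToyPage
{
    bool hinted[TUPLES_PER_PAGE];
    bool dirty;
} ToyPage;

/* Count tuples that could still receive a hint bit later. */
static int
unset_hints(const ToyPage *page)
{
    int n = 0;

    for (int i = 0; i < TUPLES_PER_PAGE; i++)
        if (!page->hinted[i])
            n++;
    return n;
}

/* Set a hint on one tuple; dirty the page only once no hints remain
 * unset, so the page is written at most once for hint-bit reasons. */
static void
toy_hint_tuple(ToyPage *page, int i)
{
    page->hinted[i] = true;
    if (!page->dirty && unset_hints(page) == 0)
        page->dirty = true;
}

int
main(void)
{
    ToyPage page = {{false}, false};

    for (int i = 0; i < TUPLES_PER_PAGE; i++)
    {
        toy_hint_tuple(&page, i);
        printf("after tuple %d: %d unset, dirty=%d\n",
               i, unset_hints(&page), page.dirty);
    }
    return 0;
}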

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Avoiding second heap scan in VACUUM

2008-05-28 Thread Pavan Deolasee
Tom brought this up during the PGCon developer meet. After thinking a
bit about it, I think it's actually possible to avoid the second heap
scan, especially now that we've HOT. If we can remove the second pass,
not only would that speed up vacuum, but also reduce lots of redundant
read and write IO.

Currently the second heap scan is required to remove the dead tuples from
the heap. We can not do this in the first scan because we haven't yet
removed the index pointers pointing to them. HOT now prunes and
defrags the pages in the first phase itself and what is left behind is
just a bunch of DEAD line pointers. The line pointers are marked
UNUSED in the second heap scan. Since we don't repair any line
pointer bloat, no additional free space is created in the second pass.
So frankly there is not much left to be done in the second phase. Of
course we also update the FSM information at the end of second pass.

If we want to remove the second pass, what we need is a mechanism to
reclaim the DEAD line pointers. But to do this correctly, we must ensure
that the DEAD line pointers are reclaimed only after the
index entries pointing to them are removed.

Tom's idea was to store the vacuum-xid in the tuple header and check
that xid to see if the vacuum successfully removed the index pointers
or not. Heikki had some brilliant idea to store the xid in the line
pointer itself. These ideas are good, but would require xid wraparound
handling. I am thinking of a solution on the following lines to handle
DEAD line pointers. Other ideas are welcome too.

1. Before VACUUM starts, it updates the pg_class row of the target
table, noting VACUUM_IN_PROGRESS for that table.
2. It then waits for all the existing transactions to finish, to make
sure that everyone can see the change in the pg_class row.
3. It then scans the heap, prunes and defrags the pages. The normal
pruning would reclaim all the dead tuples and mark their line pointers
as DEAD. Since VACUUM is going to remove the index pointers pointing
to these DEAD line pointers, it now marks these DEAD line pointers
with an additional flag, say DEAD_RECLAIMED.
4. At the end of first scan, VACUUM updates FSM information for heap pages.
5. It then proceeds with the index scan and removes index pointers
pointing to the DEAD line pointers collected in the heap scan.
6. Finally, it again updates the pg_class row and clears the
VACUUM_IN_PROGRESS flag.

Any other backend, when it invokes page pruning, would check whether a
VACUUM is in progress by looking at the VACUUM_IN_PROGRESS flag. Note
that if the previous vacuum had failed or database crashed before
vacuum completed, the VACUUM_IN_PROGRESS flag would remain set until
the next vacuum successfully completes on the table and resets the
flag (VACUUM_NOT_IN_PROGRESS state).

Since vacuum waits for the existing transactions to finish before
marking any DEAD line pointers DEAD_RECLAIMED, for a backend which
sees VACUUM_NOT_IN_PROGRESS,  any DEAD_RECLAIMED line pointer it finds
must be left over from the previously successfully completed vacuum.
Since the previous vacuum must have removed the index pointers
pointing to it, the backend can now safely reclaim the line pointer
itself. The backend can potentially do this any time it sees a
DEAD_RECLAIMED line pointer, but we may restrict this only during the
pruning activity to keep things simple. This operation need not be WAL
logged if we appropriately handle DEAD_RECLAIMED line pointers during
redo recovery (if they're reused for some other insert/update activity).

I think this scheme guarantees that a backend would always see
VACUUM_IN_PROGRESS if vacuum is currently in progress on the table or
the last vacuum has failed. There might be situations when a backend
sees VACUUM_IN_PROGRESS when if fact there is no vacuum is progress
and the last vacuum finished successfully, but that won't have any
correctness implication, but would only delay reclaiming
DEAD_RECLAIMED line pointers.
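For illustration, a toy sketch of that pruning-time decision (the state and flag names are hypothetical, standalone code rather than real heap pruning):

#include <stdio.h>
#include <stdbool.h>

/* Toy line-pointer states mirroring the proposal (names hypothetical). */
typedef enum
{
    LP_TOY_NORMAL,
    LP_TOY_DEAD,            /* dead, index pointers may still exist */
    LP_TOY_DEAD_RECLAIMED,  /* dead, and a vacuum is removing (or has
                             * removed) its index pointers */
    LP_TOY_UNUSED
} ToyLpState;

/*
 * Pruning-time decision: a DEAD_RECLAIMED pointer may be recycled only
 * if pg_class says no vacuum is currently in progress (and the last one
 * completed), because only then are its index pointers known to be gone.
 */
static ToyLpState
maybe_reclaim(ToyLpState lp, bool vacuum_in_progress)
{
    if (lp == LP_TOY_DEAD_RECLAIMED && !vacuum_in_progress)
        return LP_TOY_UNUSED;
    return lp;
}

int
main(void)
{
    printf("%d\n", maybe_reclaim(LP_TOY_DEAD_RECLAIMED, true));   /* stays DEAD_RECLAIMED */
    printf("%d\n", maybe_reclaim(LP_TOY_DEAD_RECLAIMED, false));  /* becomes UNUSED */
    return 0;
}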

Comments ?

Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hint Bits and Write I/O

2008-05-28 Thread Simon Riggs

On Wed, 2008-05-28 at 06:08 -0400, Gregory Stark wrote:
 Tom Lane [EMAIL PROTECTED] writes:
 
  (Although that argument might not hold water for a bulk seqscan: you'll
  have hinted all the tuples and then very possibly throw the page away
  immediately.  
 
 That seems like precisely the case where we don't want to dirty the buffer.

(1)

  So counting the hints and eventually deciding we did
  enough to justify dirtying the page might be worth doing.)
 
 What if we counted how many hint bits were *not* set? I feel like the goal
 should be to dirty the buffer precisely once when all the bits can be set. 

(2)

Agreed. I think the difficulty is that (1) and (2) are contradictory
goals, and since those conditions frequently occur together, they come
into conflict.

When we fully scan a buffer this will result in 1 or more actual clog
lookups, L.  L is often less than the number of tuples on the page
because of the single-item xid cache. If L = 1 then there is a high
probability that when we do a seq scan the clog blocks will be cached
also, so although we do one clog lookup per table block we would seldom
do clog I/O during a SeqScan. So what I tried to say in a previous post
was that if L > 1 then we should dirty the buffer, because the
single-item cache becomes less effective and we may need to access other
clog blocks, which may result in clog I/O.
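As a standalone toy illustration of why L matters, loosely in the spirit of the single-item xid status cache (this is not the real transam.c code; the "clog lookup" here is only counted, not performed):

#include <stdio.h>
#include <stdint.h>

typedef uint32_t ToyXid;

/* Toy single-item cache of the last xid whose status was looked up. */
static ToyXid cached_xid = 0;
static int    clog_lookups = 0;

/* Pretend clog lookup: we only count how often one would be needed. */
static void
toy_xid_status(ToyXid xid)
{
    if (xid != cached_xid)
    {
        clog_lookups++;         /* would read a clog page here */
        cached_xid = xid;
    }
}

int
main(void)
{
    /* A freshly COPY-ed page: every tuple carries the same xid, so L
     * (clog lookups for the page) stays at 1. */
    ToyXid bulk_page[] = {100, 100, 100, 100, 100};
    /* A page touched by many transactions: L > 1. */
    ToyXid busy_page[] = {100, 101, 102, 101, 103};

    for (int i = 0; i < 5; i++)
        toy_xid_status(bulk_page[i]);
    printf("bulk page: L = %d\n", clog_lookups);

    clog_lookups = 0;
    cached_xid = 0;
    for (int i = 0; i < 5; i++)
        toy_xid_status(busy_page[i]);
    printf("busy page: L = %d\n", clog_lookups);
    return 0;
}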

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] pg_regress: referencing shared objects from tests

2008-05-28 Thread Jorgen Austvik - Sun Norway

Hi,

we would like to be able to use and ship pg_regress and the PostgreSQL 
test suite independently of the PostgreSQL build environment, for 
testing and maybe even as a separate package to be built and shipped 
with the OS for others to test their setup. Does this sound like a sane 
and OK thing to do?


I have a problem with one of the tests (create_function_1.source):

-8<8<8<8<---
 20 CREATE FUNCTION int44out(city_budget)
 21RETURNS cstring
 22AS '@abs_builddir@/[EMAIL PROTECTED]@'
 23LANGUAGE C STRICT;
 24
 25 CREATE FUNCTION check_primary_key ()
 26 RETURNS trigger
 27 AS '@abs_builddir@/../../../contrib/spi/[EMAIL PROTECTED]@'
 28 LANGUAGE C;
...
 35 CREATE FUNCTION autoinc ()
 36 RETURNS trigger
 37 AS '@abs_builddir@/../../../contrib/spi/[EMAIL PROTECTED]@'
 38 LANGUAGE C;
-8<8<8<8<---

(The ../../../contrib/spi-path does not exist outside of the build 
environment, so to be able to run the test you need to have source code, 
compilers, ...)


I could work around this problem by copying the needed shared objects to 
@abs_builddir@ as part of make or make check, I could add a 
"--look-for-shared-objects-here" parameter to pg_regress, and you 
probably have other suggestions.


Is this something we want to fix, and what would be the right way to do 
it? (I have no problem providing a patch.)


-Jørgen
--

Jørgen Austvik, Software Engineering - QA
Sun Microsystems Database Group

http://blogs.sun.com/austvik, http://www.autvik.net/

Sun Microsystems AS
Haakon VII gt. 7b
N-7485 Trondheim, Norway



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [JDBC] How embarrassing: optimization of a one-shot query doesn't work

2008-05-28 Thread Dave Cramer


On 23-May-08, at 9:20 AM, Tom Lane wrote:


Dave Cramer [EMAIL PROTECTED] writes:

Any word on 8.3.2 ?


Obviously, nothing is happening during PGCon ;-)

There was some discussion a week or so back about scheduling a set of
releases in early June, but it's not formally decided.



Now that PGCon is over has there been any more discussion ?

Dave

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] pg_regress: dbname in PostgreSQL test suite

2008-05-28 Thread Jorgen Austvik - Sun Norway

Hi.

pg_regress has a --dbname option (which actually takes a list of database 
names):


  --dbname=DB   use database DB (default "regression")

... but the PostgreSQL regression test suite does not really support this:

[EMAIL PROTECTED]:regress] ggrep -R regression sql/* | grep -v 
regression_ | grep -v :--

sql/prepare.sql:EXECUTE q2('regression');
sql/privileges.sql:\c regression
sql/temp.sql:\c regression

I suggest we replace @dbname@ with the first element in the dblist 
linked list in convert_sourcefiles_in(). What do you think?


(I can provide a patch if you think it is an acceptable solution.)
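For illustration, a minimal standalone sketch of the substitution itself (toy code; the real change would go through convert_sourcefiles_in() and pg_regress's existing string replacement, and this toy only handles the first occurrence per line):

#include <stdio.h>
#include <string.h>

/*
 * Minimal stand-in for substituting @dbname@ with the first entry of
 * the --dbname list while converting *.source files.
 */
static void
convert_line(const char *line, const char *first_dbname,
             char *out, size_t outlen)
{
    const char *token = "@dbname@";
    const char *hit = strstr(line, token);

    if (hit == NULL)
        snprintf(out, outlen, "%s", line);
    else
        snprintf(out, outlen, "%.*s%s%s",
                 (int) (hit - line), line, first_dbname,
                 hit + strlen(token));
}

int
main(void)
{
    char buf[256];

    convert_line("\\c @dbname@", "regression", buf, sizeof(buf));
    printf("%s\n", buf);    /* prints: \c regression */
    return 0;
}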

-J
--

Jørgen Austvik, Software Engineering - QA
Sun Microsystems Database Group

http://blogs.sun.com/austvik/
http://www.austvik.net/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_regress: referencing shared objects from tests

2008-05-28 Thread Tom Lane
Jorgen Austvik - Sun Norway [EMAIL PROTECTED] writes:
 we would like to be able to use and ship pg_regress and the PostgreSQL 
 test suite independently of the PostgreSQL build environment, for 
 testing and maybe even as a separate package to be built and shipped 
 with the OS for others to test their setup. Does this sound like a sane 
 and OK thing to do?

The RPM packages have done this since approximately forever.  You might
want to look at the patches used there.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_regress: dbname in PostgreSQL test suite

2008-05-28 Thread Tom Lane
Jorgen Austvik - Sun Norway [EMAIL PROTECTED] writes:
 pg_regress has a --dbname option (which actually takes a list of database 
 names):

    --dbname=DB   use database DB (default "regression")

 ... but the PostgreSQL regression test suite does not really support this:

That option is intended for running other sets of regression tests
(eg, the contrib ones are customarily run in contrib_regression).
I see zero value in trying to make the standard tests run under
some other database name.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_regress: dbname in PostgreSQL test suite

2008-05-28 Thread Andrew Dunstan



Jorgen Austvik - Sun Norway wrote:

Hi.

pg_regress has a --dbname option (which actually takes a list of 
database names):


  --dbname=DB   use database DB (default "regression")

... but the PostgreSQL regression test suite does not really support 
this:


[EMAIL PROTECTED]:regress] ggrep -R regression sql/* | grep -v 
regression_ | grep -v :--

sql/prepare.sql:EXECUTE q2('regression');
sql/privileges.sql:\c regression
sql/temp.sql:\c regression

I suggest we replace @dbname@ with the first element in the dblist 
linked list in convert_sourcefiles_in(). What do you think?


(I can provide a patch if you think it is an acceptable solution.)




We have more than one set of regression tests. This feature is used by 
the PL regression tests and the contrib regression tests to run using a 
different database name.


I'm not quite sure why it's a list.

cheers

andrew






--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Upcoming back-branch update releases

2008-05-28 Thread Tom Lane
Yup, we're overdue for that, so:

After some discussion among core and the packagers list, we have
tentatively set June 9 as the release date for minor updates of
all supported PG release branches (back to 7.4).  As has been the
recent practice, code freeze will occur the preceding Thursday, June 5.

If you've got any bug fixes you've been working on, now is a good time
to get them finished up and sent in...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [JDBC] How embarrassing: optimization of a one-shot query doesn't work

2008-05-28 Thread Tom Lane
Dave Cramer [EMAIL PROTECTED] writes:
 On 23-May-08, at 9:20 AM, Tom Lane wrote:
 There was some discussion a week or so back about scheduling a set of
 releases in early June, but it's not formally decided.

 Now that PGCon is over has there been any more discussion ?

Yeah, I just posted an announcement about it on -hackers.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add dblink function to check if a named connection exists

2008-05-28 Thread Tom Lane
Tommy Gildseth [EMAIL PROTECTED] writes:
 One obvious disadvantage of this approach, is that I need to connect and 
 disconnect in every function. A possible solution to this, would be 
 having a function f.ex dblink_exists('connection_name') that returns 
 true/false depending on whether the  connection already exists.

Can't you do this already?

SELECT 'myconn' = ANY (dblink_get_connections());

A dedicated function might be a tad faster, but it probably isn't going
to matter compared to the overhead of sending a remote query.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Remove redundant extra_desc info for enum GUC variables?

2008-05-28 Thread Tom Lane
Magnus Hagander [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 Yeah: LOG level sorts differently in the two cases; it's fairly high
 priority for server log output and much lower for client output.

 Ok, easy fix if we break them apart. Should we continue to accept
 values that we're not going to care about, or should I change that at
 the same time? (for example, client_min_messages doesn't use INFO,
 but we do accept that in <= 8.3 anyway)

I'd be inclined to keep the actual behavior the same as it was.
We didn't document INFO for this variable, perhaps, but it's accepted
and has a well-defined behavior.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoiding second heap scan in VACUUM

2008-05-28 Thread Gregory Stark
Pavan Deolasee [EMAIL PROTECTED] writes:

 1. Before VACUUM starts, it updates the pg_class row of the target
 table, noting that VACUUM_IN_PROGRESS for the target table.

If I understand correctly nobody would be able to re-use any line-pointers
when a vacuum is in progress? I find that a bit scary since for large tables
you may actually always be running a vacuum. Perhaps the DSM will fix that but
for heavily updated tables I think you might still be pretty much continuously
running vacuum. 

On the other hand it would just result in line pointer bloat. And I think
VACUUM could still safely remove old dead line pointers if it noted that the
table had a clean vacuum status when it started.

 2. It then waits for all the existing transactions to finish to make
 sure that everyone can see the change in the pg_class row,

I'm a bit scared of how many "wait for all transactions to finish" steps we're
accumulating. It seemed safe enough when we had only one, but I'm not sure what
the consequences of this action are when there are several of them. Are we
perhaps creating deadlocks?

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BUG #4204: COPY to table with FK has memory leak

2008-05-28 Thread Gregory Stark

[moving to -hackers]

Tom Lane [EMAIL PROTECTED] writes:

 Tomasz Rybak [EMAIL PROTECTED] writes:
 I tried to use COPY to import 27M rows to table:
 CREATE TABLE sputnik.ccc24 (
 station CHARACTER(4) NOT NULL REFERENCES sputnik.station24 (id),
 moment INTEGER NOT NULL,
 flags INTEGER NOT NULL
 ) INHERITS (sputnik.sputnik);
 COPY sputnik.ccc24(id, moment, station, strength, sequence, flags)
 FROM '/tmp/24c3' WITH DELIMITER AS ' ';

 This is expected to take lots of memory because each row-requiring-check
 generates an entry in the pending trigger event list.  Even if you had
 not exhausted memory, the actual execution of the retail checks would
 have taken an unreasonable amount of time.  The recommended way to do
 this sort of thing is to add the REFERENCES constraint *after* you load
 all the data; that'll be a lot faster in most cases because the checks
 are done in bulk using a JOIN rather than one-at-a-time.

Hm, it occurs to me that we could still do a join against the pending event
trigger list... I wonder how feasible it would be to store the pending trigger
event list in a temporary table instead of in ram.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Packages in oracle Style

2008-05-28 Thread Pavel Stehule
2008/5/27 Zdenek Kotala [EMAIL PROTECTED]:
 Coutinho wrote:

 this is listed on TODO:
 http://www.postgresql.org/docs/faqs.TODO.html

 Add features of Oracle-style packages (Pavel)


My last idea was only global variables for plpgsql. It needs a hack of
plpgsql :(. But it can be simple work.

Pavel


 I see. Sorry I overlooked it. I think Pavel Stehule will help you. He has
 an idea how to do it.

Zdenek

 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BUG #4204: COPY to table with FK has memory leak

2008-05-28 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
 This is expected to take lots of memory because each row-requiring-check
 generates an entry in the pending trigger event list.

 Hm, it occurs to me that we could still do a join against the pending event
 trigger list... I wonder how feasible it would be to store the pending trigger
 event list in a temporary table instead of in ram.

We could make that list spill to disk, but the problem remains that
verifying the rows one at a time will take forever.

The idea that's been kicked around occasionally is that once you get
past N pending events, throw them all away and instead queue a single
operation to do a bulk verify (just like initial establishment of the
FK constraint).  I'm not sure how to do the queue management for this
though.
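As a toy sketch of that threshold idea (standalone code with invented names and an arbitrary threshold; it deliberately ignores the open question of how the real queue management would work):

#include <stdio.h>
#include <stdbool.h>

#define MAX_PENDING_EVENTS 4    /* arbitrary threshold for the sketch */

typedef struct ToyTriggerQueue
{
    int  npending;              /* row-level RI checks queued so far */
    bool bulk_verify;           /* fall back to one table-level check */
} ToyTriggerQueue;

/* Queue one row-level check, or switch to bulk mode past the threshold. */
static void
toy_queue_ri_check(ToyTriggerQueue *q)
{
    if (q->bulk_verify)
        return;                 /* already decided on a bulk verify */
    if (q->npending >= MAX_PENDING_EVENTS)
    {
        q->npending = 0;        /* discard the per-row events */
        q->bulk_verify = true;  /* one JOIN-style check at the end */
    }
    else
        q->npending++;
}

int
main(void)
{
    ToyTriggerQueue q = {0, false};

    for (int row = 0; row < 1000000; row++)
        toy_queue_ri_check(&q);

    printf("pending=%d bulk_verify=%d\n", q.npending, q.bulk_verify);
    return 0;
}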

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoiding second heap scan in VACUUM

2008-05-28 Thread Simon Riggs

On Wed, 2008-05-28 at 16:56 +0530, Pavan Deolasee wrote:

 2. It then waits for all the existing transactions to finish to make
 sure that everyone can see the change in the pg_class row

I'm not happy that the VACUUM waits. It might wait a very long time and
cause worse overall performance than the impact of the second scan.

Happily, I think we already have a solution to this overall problem
elsewhere in the code. When we VACUUM away all the index entries on a
page we don't yet remove it. We only add it to the FSM on the second
pass of that page on the *next* VACUUM.

So the idea is to have one pass per VACUUM, but make that one pass do
the first pass of *this* VACUUM and the second pass of the *last*
VACUUM.

We mark the xid of the VACUUM in pg_class as you suggest, but we do it
after VACUUM has completed the pass.

In a single pass we mark DEAD line pointers as RECENTLY_DEAD. If the last
VACUUM xid is old enough we mark RECENTLY_DEAD as UNUSED as well,
during this first pass. If the last xid is not old enough we do a second pass
to remove them.

That has the effect that large tables that are infrequently VACUUMed
will need only a single scan. Smaller tables that require almost
continual VACUUMing will probably do two scans, but who cares?

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BUG #4204: COPY to table with FK has memory leak

2008-05-28 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 Gregory Stark [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
 This is expected to take lots of memory because each row-requiring-check
 generates an entry in the pending trigger event list.

 Hm, it occurs to me that we could still do a join against the pending event
 trigger list... I wonder how feasible it would be to store the pending 
 trigger
 event list in a temporary table instead of in ram.

 We could make that list spill to disk, but the problem remains that
 verifying the rows one at a time will take forever.

Well I was thinking if we did a join between a temporary table and the fk
target then it wouldn't have to be a one-by-one operation. It could be a merge
join if the planner thought that was better. How to get accurate stats into
the planner at that point would be a missing detail though.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoiding second heap scan in VACUUM

2008-05-28 Thread Gregory Stark
Simon Riggs [EMAIL PROTECTED] writes:

 So the idea is to have one pass per VACUUM, but make that one pass do
 the first pass of *this* VACUUM and the second pass of the *last*
 VACUUM.

I think that's exactly the same as the original suggestion of having HOT
pruning do the second pass of the last vacuum. The trick is to know whether
the last vacuum committed or not. If it didn't commit then it's not safe to
remove those line pointers yet.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoiding second heap scan in VACUUM

2008-05-28 Thread Simon Riggs

On Wed, 2008-05-28 at 16:55 -0400, Gregory Stark wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
 
  So the idea is to have one pass per VACUUM, but make that one pass do
  the first pass of *this* VACUUM and the second pass of the *last*
  VACUUM.
 
 I think that's exactly the same as the original suggestion of having HOT
 pruning do the second pass of the last vacuum. The trick is to know whether
 the last vacuum committed or not. If it didn't commit then it's not safe to
 remove those line pointers yet.

Perhaps, though I'm not suggesting storing extra xids on-block.

I think if we have to wait for a VACUUM to run before marking the line
pointers then we may as well wait for two. Having something wait for a
VACUUM and then be removed by HOT afterwards gives you the worst of both
worlds: a long wait for a VACUUM, then more overhead and extra code during
HOT pruning.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BUG #4204: COPY to table with FK has memory leak

2008-05-28 Thread Simon Riggs

On Wed, 2008-05-28 at 16:28 -0400, Tom Lane wrote:
 Gregory Stark [EMAIL PROTECTED] writes:
  Tom Lane [EMAIL PROTECTED] writes:
  This is expected to take lots of memory because each row-requiring-check
  generates an entry in the pending trigger event list.
 
  Hm, it occurs to me that we could still do a join against the pending event
  trigger list... I wonder how feasible it would be to store the pending 
  trigger
  event list in a temporary table instead of in ram.
 
 We could make that list spill to disk, but the problem remains that
 verifying the rows one at a time will take forever.
 
 The idea that's been kicked around occasionally is that once you get
 past N pending events, throw them all away and instead queue a single
 operation to do a bulk verify (just like initial establishment of the
 FK constraint).  I'm not sure how to do the queue management for this
 though.

Neither of those approaches is really suitable. Just spilling to disk is
O(N) of the number of rows loaded, the second one is O(N) at least on
the number of rows (loaded + existing). The second one doesn't help
either since if the table was empty you'd have added the FK afterwards,
so we must assume there are already rows in there, and in most cases rows
already loaded will exceed those being added by the bulk operation.

AFAICS we must aggregate the trigger checks. We would need a special
property of triggers that allowed them to be aggregated when two similar
checks arrived. We can then use hash aggregation to accumulate them. We
might conceivably need to spill to disk also, since the aggregation may
not always be effective. But in most cases the tables against which FK
checks are made are significantly smaller than the tables being loaded.
Once we have hash aggregated them, that is then the first part of a hash
join to the target table.
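For illustration, a standalone toy sketch of aggregating queued FK checks so that duplicate key values collapse before the batch check (integer keys and a tiny fixed-size hash table, purely for brevity):

#include <stdio.h>
#include <string.h>

#define NBUCKETS 64

/* Toy open-addressing set of FK key values. */
typedef struct ToyKeySet
{
    int keys[NBUCKETS];
    int used[NBUCKETS];
    int ndistinct;
} ToyKeySet;

/* Aggregate a queued check: identical keys collapse to one entry.
 * (No resize handling; the sketch stays well under NBUCKETS.) */
static void
toy_add_check(ToyKeySet *set, int key)
{
    unsigned h = (unsigned) key % NBUCKETS;

    while (set->used[h])
    {
        if (set->keys[h] == key)
            return;             /* duplicate check, nothing to add */
        h = (h + 1) % NBUCKETS;
    }
    set->used[h] = 1;
    set->keys[h] = key;
    set->ndistinct++;
}

int
main(void)
{
    ToyKeySet set;
    int loaded_fk_values[] = {7, 7, 7, 3, 3, 9, 7, 3};

    memset(&set, 0, sizeof(set));
    for (int i = 0; i < 8; i++)
        toy_add_check(&set, loaded_fk_values[i]);

    /* 8 queued checks collapse to 3 distinct probes of the PK table. */
    printf("distinct keys to verify: %d\n", set.ndistinct);
    return 0;
}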

We certainly need a TODO item for improve RI checks during bulk
operations.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BUG #4204: COPY to table with FK has memory leak

2008-05-28 Thread Gregory Stark

Simon Riggs [EMAIL PROTECTED] writes:

 AFAICS we must aggregate the trigger checks. We would need a special
 property of triggers that allowed them to be aggregated when two similar
 checks arrived. We can then use hash aggregation to accumulate them. We
 might conceivably need to spill to disk also, since the aggregation may
 not always be effective. But in most cases the tables against which FK
 checks are made are significantly smaller than the tables being loaded.
 Once we have hash aggregated them, that is then the first part of a hash
 join to the target table.

Well we can't aggregate them as they're created because later modifications
could delete or update the original records. The SQL spec requires that FK
checks be effective at the end of the command. 

I admit off the top of my head I can't actually come up with any situations
which would be covered by the spec. All the instances I can think of involve
either Postgres's UPDATE FROM or plpgsql functions or some other postgres
specific functionality. But I do seem to recall there were some situations
where it mattered.

But we could aggregate them when it comes time to actually check them. Or we
could hash the FK keys and scan the event list. Or we could sort the two and
merge join them.

 We certainly need a TODO item for improve RI checks during bulk
 operations.

I have a feeling it's already there. Hm. There's a whole section on RI
triggers but the closest I see is this, neither of the links appear to refer
to bulk operations:

 Optimize referential integrity checks

 http://archives.postgresql.org/pgsql-performance/2005-10/msg00458.php
 http://archives.postgresql.org/pgsql-hackers/2007-04/msg00744.php

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL 
training!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BUG #4204: COPY to table with FK has memory leak

2008-05-28 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Simon Riggs [EMAIL PROTECTED] writes:
 We certainly need a TODO item for improve RI checks during bulk
 operations.

 I have a feeling it's already there. Hm. There's a whole section on RI
 triggers but the closest I see is this, neither of the links appear to refer
 to bulk operations:

  Optimize referential integrity checks
  http://archives.postgresql.org/pgsql-performance/2005-10/msg00458.php
  http://archives.postgresql.org/pgsql-hackers/2007-04/msg00744.php

No, both of those are talking about the same thing, ie, (1) making the
are-the-keys-unchanged optimization work when NULLs are present,
and (2) not testing for this case twice.

There's an entry in the Triggers section

* Add deferred trigger queue file

  Right now all deferred trigger information is stored in backend
  memory.  This could exhaust memory for very large trigger queues.
  This item involves dumping large queues into files.

but as already noted, this is a pretty myopic answer (at least for
RI triggers).

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PERFORM] Memory question on win32 systems

2008-05-28 Thread Sabbiolina
Hello, in my particular case I need to configure Postgres to handle only a
few concurrent connections, but I need it to be blazingly fast, so I need it
to cache everything possible. I've changed the config file and multiplied
all memory-related values by 10, still Postgres uses only less than 50 Mb of
my RAM.

I have 4 Gigs of RAM, how do I force Postgres to use a higher part of such
memory in order to cache more indexes, queries and so on?

Thanks!


Re: [HACKERS] Hint Bits and Write I/O

2008-05-28 Thread Florian G. Pflug

Simon Riggs wrote:
Hmm, I think the question is: How many hint bits need to be set 
before we mark the buffer dirty? (N)


Should it be 1, as it is now? Should it be never? Never is a long 
time. As N increases, clog accesses increase. So it would seem there 
is likely to be an optimal value for N.


After further thought, I begin to think that the number of times we set
a dirty hint-bit shouldn't influence the decision of whether to dirty
the page too much. Instead, we should look at the *age* of the last xid
which modified the tuple. The idea is that the clog pages showing the
status of young xids are far more likely to be cached than the pages
for older xids. This makes a lost hint-bit update much cheaper for
young than for old xids, because we probably won't waste any IO if we
have to set the hint-bit again later, because the buffer was evicted
from shared_buffers before being written out. Additionally, I think we
should put some randomness into the decision, to spread the IO caused by
hint-bit updates after a batch load.

All in all, I envision a formula like
chance_of_dirtying = min(1,
  alpha
  *floor((next_xid - last_modifying_xid)/clog_page_size)
  /clog_buffers
)

This means that a hint-bit update never triggers dirtying if the last
modifying xid belongs to the same clog page as the next unused xid -
which sounds good, since that clog page gets touched on every commit and
abort, and therefore is cached nearly for sure.

For xids on older pages, the chance of dirtying grows (more aggressively
for larger alpha values). For alpha = 1, a hint-bit update dirties a
buffer for sure only if the xid is older than clog_page_size*clog_buffers.
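For what it's worth, a standalone transcription of that formula into C (the constants are toy stand-ins, not the real clog parameters):

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

/* Toy constants; real values would come from clog internals / GUCs. */
#define CLOG_XACTS_PER_PAGE 32768
#define CLOG_BUFFERS 8
#define ALPHA 1.0

typedef uint32_t ToyXid;

/* Transcription of the proposed formula; integer division gives floor(). */
static double
chance_of_dirtying(ToyXid next_xid, ToyXid last_modifying_xid)
{
    double pages_back =
        (double) ((next_xid - last_modifying_xid) / CLOG_XACTS_PER_PAGE);
    double p = ALPHA * pages_back / CLOG_BUFFERS;

    return p < 1.0 ? p : 1.0;
}

/* Use it to decide, with some randomness, whether a hint-bit-only
 * change should dirty the buffer. */
static int
should_dirty(ToyXid next_xid, ToyXid last_modifying_xid)
{
    double p = chance_of_dirtying(next_xid, last_modifying_xid);

    return (double) rand() / RAND_MAX < p;
}

int
main(void)
{
    ToyXid next_xid = 2000000;

    /* Same clog page as next_xid: never dirties. */
    printf("recent xid:  p=%.2f\n", chance_of_dirtying(next_xid, 1990000));
    /* Several clog pages back: dirties with increasing probability. */
    printf("older xid:   p=%.2f\n", chance_of_dirtying(next_xid, 1800000));
    printf("ancient xid: dirty=%d\n", should_dirty(next_xid, 100000));
    return 0;
}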

regards,
Florian Pflug

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] intercepting WAL writes

2008-05-28 Thread Mike
Hello,

I'm new to the core PostgreSQL code, so pardon the question if the answer is
really obvious, and I'm just missing it, but I've got a relatively large web
application that uses PostgreSQL as a back-end database, and we're heavily
using memcached to cache frequently accessed data.

I'm looking at modifying PostgreSQL (in some way) to push changes directly
to our memcache servers, in hopes of moving towards a system where only
writes are actually sent to the databases, and reads are exclusively sent to
the memcache servers.

I'm guessing that I could intercept the WAL writes, and use this information
to push out to my memcache servers, similar to a replication model.

Can somebody point to the most logical place in the code to intercept the
WAL writes? (just a rough direction would be enough)- or if this doesn't
make sense at all, another suggestion on where to get the data? (I'm trying
to avoid doing it using triggers).

Thanks,

Mike



Re: [HACKERS] intercepting WAL writes

2008-05-28 Thread Jonah H. Harris
On Wed, May 28, 2008 at 7:11 PM, Mike [EMAIL PROTECTED] wrote:
 Can somebody point to the most logical place in the code to intercept the
 WAL writes? (just a rough direction would be enough)

XLogInsert

 or if this doesn't make sense at all, another suggestion on where to get
 the data? (I'm trying to avoid doing it using triggers).

Without triggers, you don't have many options.  With triggers, you
could use pg_memcache.  If you take it from the WAL, you'll have to do
a bit of decoding to make it usable in the context you're looking for,
which is quite a bit of work.

-- 
Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324
EnterpriseDB Corporation | fax: 732.331.1301
499 Thornall Street, 2nd Floor | [EMAIL PROTECTED]
Edison, NJ 08837 | http://www.enterprisedb.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PERFORM] Memory question on win32 systems

2008-05-28 Thread Douglas McNaught
On Wed, May 28, 2008 at 7:05 PM, Sabbiolina [EMAIL PROTECTED] wrote:
 Hello, in my particular case I need to configure Postgres to handle only a
 few concurrent connections, but I need it to be blazingly fast, so I need it
 to cache everything possible. I've changed the config file and multiplied
 all memory-related values by 10, still Postgres uses only less than 50 Mb of
 my RAM.

How are you measuring this?

 I have 4 Gigs of RAM, how do I force Postgres to use a higher part of such
 memory in order to cache more indexes, queries and so on?

Post the settings values you're using and people will be better able
to help you.

-Doug

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] intercepting WAL writes

2008-05-28 Thread Jeff Davis
On Wed, 2008-05-28 at 19:11 -0400, Mike wrote:
 Can somebody point to the most logical place in the code to intercept
 the WAL writes? (just a rough direction would be enough)- or if this
 doesn’t make sense at all, another suggestion on where to get the
 data? (I’m trying to avoid doing it using triggers).

Why are you avoiding triggers? One solution might be to use Slony to
just create the log, and then read the log of events into memcached
rather than another PostgreSQL instance.

http://slony.info/documentation/logshipping.html

Those logs might be easier to process than the WAL.

Also, why do you need to intercept the WAL writes, and not just read
from a WAL archive? Does this need to be synchronous?

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Catching exceptions from COPY

2008-05-28 Thread Darren Reed

Is it feasible to add the ability to catch exceptions from COPY?

Darren


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] intercepting WAL writes

2008-05-28 Thread Jonah H. Harris
On Wed, May 28, 2008 at 8:30 PM, Mike [EMAIL PROTECTED] wrote:
 When you say "a bit of decoding", is that because the data written to the logs
 is after the query parser/planner? Or because it's written in several
 chunks? Or?

Because that's the actual recovery record.  There is no SQL text, just
the WAL record type (XLOG_HEAP_INSERT, XLOG_HEAP_UPDATE,
XLOG_XACT_COMMIT, ...) and the data as it relates to that operation.
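For illustration, a toy sketch of the kind of dispatch such a decoder would end up doing (the record layout and constants below are invented stand-ins; the real definitions live in the xlog and heapam headers, and xlogdump shows the real decoding):

#include <stdio.h>
#include <stdint.h>

/* Toy shape of a decoded WAL record, not the real XLogRecord layout. */
typedef struct ToyWalRecord
{
    uint8_t  rmid;      /* resource manager, e.g. heap, btree, xact */
    uint8_t  info;      /* operation within that resource manager */
    uint32_t xid;
} ToyWalRecord;

/* Toy ids standing in for RM_HEAP_ID / XLOG_HEAP_INSERT etc. */
#define TOY_RM_HEAP     1
#define TOY_RM_XACT     2
#define TOY_HEAP_INSERT 0x00
#define TOY_HEAP_UPDATE 0x20
#define TOY_XACT_COMMIT 0x00

/* Dispatch on record type; a cache invalidator would collect keys here
 * and only push them to memcached once it sees the commit record. */
static void
toy_handle_record(const ToyWalRecord *rec)
{
    if (rec->rmid == TOY_RM_HEAP && rec->info == TOY_HEAP_INSERT)
        printf("xid %u inserted a heap tuple\n", (unsigned) rec->xid);
    else if (rec->rmid == TOY_RM_HEAP && rec->info == TOY_HEAP_UPDATE)
        printf("xid %u updated a heap tuple\n", (unsigned) rec->xid);
    else if (rec->rmid == TOY_RM_XACT && rec->info == TOY_XACT_COMMIT)
        printf("xid %u committed: flush its pending invalidations\n",
               (unsigned) rec->xid);
}

int
main(void)
{
    ToyWalRecord stream[] = {
        {TOY_RM_HEAP, TOY_HEAP_INSERT, 42},
        {TOY_RM_HEAP, TOY_HEAP_UPDATE, 42},
        {TOY_RM_XACT, TOY_XACT_COMMIT, 42},
    };

    for (int i = 0; i < 3; i++)
        toy_handle_record(&stream[i]);
    return 0;
}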

-- 
Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324
EnterpriseDB Corporation | fax: 732.331.1301
499 Thornall Street, 2nd Floor | [EMAIL PROTECTED]
Edison, NJ 08837 | http://www.enterprisedb.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Catching exceptions from COPY

2008-05-28 Thread Greg Smith

On Wed, 28 May 2008, Darren Reed wrote:


Is it feasible to add the ability to catch exceptions from COPY?


Depends on what you consider feasible.  There's a start to a plan for that 
on the TODO list:  http://www.postgresql.org/docs/faqs.TODO.html but it's 
not trivial to implement.


It's also possible to do this right now using pgloader: 
http://pgfoundry.org/projects/pgloader/ That requires some setup and 
there's overhead to passing through that loading layer.


A third possibility is to write a short script specifically aimed at your 
copy need that breaks your input files into smaller chunks and loads them, 
kicking back the ones that don't load, or breaking them into even smaller 
chunks until you've found the problem line or lines.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] intercepting WAL writes

2008-05-28 Thread Mike
On Wed, May 28, 2008 at 8:30 PM, Mike [EMAIL PROTECTED] wrote:
 When you say "a bit of decoding", is that because the data written to the
logs
 is after the query parser/planner? Or because it's written in several
 chunks? Or?

Because that's the actual recovery record.  There is no SQL text, just
the WAL record type (XLOG_HEAP_INSERT, XLOG_HEAP_UPDATE,
XLOG_XACT_COMMIT, ...) and the data as it relates to that operation.

Oh- right- that makes sense.

I installed and started looking at the source code for xlogviewer and
xlogdump; seems like a reasonable place to start.

Thanks for your help,

Mike


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [JDBC] Re: [HACKERS] How embarrassing: optimization of a one-shot query doesn't work

2008-05-28 Thread Jignesh K. Shah

Are there any head fixes proposed for it?

I am seeing some scaling problems with EAStress, which uses JDBC, with 
8.3.0, and this one could be the reason why I am seeing some problems. I 
will be happy to try it out and report on it. The setup is ready right 
now if someone can point me to a patch that I can try out and 
hopefully see if the patch fixes my problem.


-Jignesh


Dave Cramer wrote:

It's pretty easy to test.

prepare the query
and
run explain analyze on the prepared statement.

Dave
On 10-Apr-08, at 5:47 AM, Thomas Burdairon wrote:


Is there any patch available for this one?

I'm encountering troubles with some JDBC queries and I'd like to test 
it before asking some help on the JDBC list.


Thanks.
Tom

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [JDBC] Re: [HACKERS] How embarrassing: optimization of a one-shot query doesn't work

2008-05-28 Thread Jignesh K. Shah



Tom Lane wrote:

Jignesh K. Shah [EMAIL PROTECTED] writes:
  

Are there any head fixes proposed for it?



It's been fixed in CVS for a month.  We just haven't pushed a release yet.
  


Let me try it out and see what I find out in my EAStress workload.

Regards,
Jignesh


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [JDBC] Re: [HACKERS] How embarrassing: optimization of a one-shot query doesn't work

2008-05-28 Thread Tom Lane
Jignesh K. Shah [EMAIL PROTECTED] writes:
 Are there any head fixes proposed for it?

It's been fixed in CVS for a month.  We just haven't pushed a release yet.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] intercepting WAL writes

2008-05-28 Thread Koichi Suzuki
And you will have a chance to encounter full-page writes (whole-page
images), which can be produced during a hot backup and on the first
modification to a data page after a checkpoint (if you turn the full
page write option on via GUC).

2008/5/29 Mike [EMAIL PROTECTED]:
 On Wed, May 28, 2008 at 8:30 PM, Mike [EMAIL PROTECTED] wrote:
 When you say a bit of decoding, is that because the data written to the
 logs
 is after the query parser/planner? Or because it's written in several
 chunks? Or?

Because that's the actual recovery record.  There is no SQL text, just
the WAL record type (XLOG_HEAP_INSERT, XLOG_HEAP_UPDATE,
XLOG_XACT_COMMIT, ...) and the data as it relates to that operation.

 Oh- right- that makes sense.

 I installed and started looking at the source code for xlogviewer and
 xlogdump; seems like a reasonable place to start.

 Thanks for your help,

 Mike


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers




-- 
--
Koichi Suzuki

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCHES] WITH RECURSIVE patch V0.1

2008-05-28 Thread Josh Berkus

Tom,


I think this patch is plenty complicated enough without adding useless
restrictive options.


+1 for no additional GUC options.

--Josh Berkus


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Estimating recursive query cost

2008-05-28 Thread Josh Berkus

All,

I'm really uncomfortable with just having recursive queries return a 
cost of 1000 or some similar approach.  That's always been a problem 
for SRFs and it looks to be a bigger problem for WR.


However, it doesn't seem like the computer science establishment has 
made a lot of headway in this regard either.  Most approaches I found 
abstracts for would cost more CPU to calculate than the query was likely 
to take to execute.  Several of the stupider-looking ones are, for no 
apparent reason, patented.


However, since we know for certain that the recursive query is going to 
be executed, and we don't have multiple choices of execution paths for 
it, it seems like our primary concern for estimation purposes is what 
portion of the table will be returned by the query, i.e. should we use a 
table scan or an index scan, if an appropriate index is available?


Or will that not be calculated at the recursive query level?

--Josh Berkus

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoiding second heap scan in VACUUM

2008-05-28 Thread Pavan Deolasee
On Thu, May 29, 2008 at 2:02 AM, Simon Riggs [EMAIL PROTECTED] wrote:


 I'm not happy that the VACUUM waits. It might wait a very long time and
 cause worse overall performance than the impact of the second scan.


Let's not get too paranoid about the wait. It's a minor detail in the
whole theory. I would suggest that the benefit of avoiding the second scan
would be huge. Remember, it's not just a scan; it also dirties those
blocks again, forcing them to be written to disk. Also, if you really have a
situation where vacuum needs to wait for very long, then you are
already in trouble. The long running transactions would prevent
vacuuming many tuples.

I think we can easily tweak the wait so that it doesn't wait
indefinitely. If the wait times out, vacuum can still proceed, but
it can mark the DEAD line pointers as DEAD_RECLAIMED. It would then
have a choice of making a second pass and reclaiming the DEAD line
pointers (like it does today).



 So the idea is to have one pass per VACUUM, but make that one pass do
 the first pass of *this* VACUUM and the second pass of the *last*
 VACUUM.

 We mark the xid of the VACUUM in pg_class as you suggest, but we do it
 after VACUUM has completed the pass.


The trick is to correctly know if the last vacuum removed the index
pointers or not. There could be several ways to do that. But you need
to explain in detail how it would work in cases of vacuum failures and
database crash.

 In single pass we mark DEAD line pointers as RECENTLY_DEAD. If the last
 VACUUM xid is old enough we mark RECENTLY_DEAD as UNUSED, as well,
 during this first pass. If last xid is not old enough we do second pass
 to remove them.


Let's not call them RECENTLY_DEAD :-) DEAD is already stricter than
that. We need something even more strong. That's why I used
DEAD_RECLAIMED, to note that the line pointer is DEAD and the index
pointer may have been removed as well.


 That has the effect that large tables that are infrequently VACUUMed
 will need only a single scan. Smaller tables that require almost
 continual VACUUMing will probably do two scans, but who cares?


Yeah, I think we need to target the large table case. The second pass
is obviously much more costly for large tables. I think the timed-wait
answers your concern.

Thanks,
Pavan


-- 
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BUG #4204: COPY to table with FK has memory leak

2008-05-28 Thread Simon Riggs

On Wed, 2008-05-28 at 18:17 -0400, Gregory Stark wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
 
  AFAICS we must aggregate the trigger checks. We would need a special
  property of triggers that allowed them to be aggregated when two similar
  checks arrived. We can then use hash aggregation to accumulate them. We
  might conceivably need to spill to disk also, since the aggregation may
  not always be effective. But in most cases the tables against which FK
  checks are made are significantly smaller than the tables being loaded.
  Once we have hash aggregated them, that is then the first part of a hash
  join to the target table.
 
 Well we can't aggregate them as they're created because later modifications
 could delete or update the original records. The SQL spec requires that FK
 checks be effective at the end of the command. 

Well, that's what we need to do. We just need to find a way...

Currently, we store trigger entries by htid. I guess we need to
aggregate them on the actual values looked up.

The SQL spec also says that the contents of the FK check table should be
taken as at the start of the command, so we should be safe to aggregate
the values prior to the check.

As already suggested in work on Read Only Tables, we could optimise them
away to being constraint checks.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoiding second heap scan in VACUUM

2008-05-28 Thread Simon Riggs

On Thu, 2008-05-29 at 09:57 +0530, Pavan Deolasee wrote:
 On Thu, May 29, 2008 at 2:02 AM, Simon Riggs [EMAIL PROTECTED] wrote:
 
 
  I'm not happy that the VACUUM waits. It might wait a very long time and
  cause worse overall performance than the impact of the second scan.
 
 
 Lets not get too paranoid about the wait. It's a minor detail in the
 whole theory. I would suggest that the benefit of avoiding second scan
 would be huge. Remember, its just not a scan, it also dirties those
 blocks again, forcing them write to disk. Also, if you really have a
 situation where vacuum needs to wait for very long, then you are
 already in trouble. The long running transactions would prevent
 vacuuming many tuples.
 
 I think we can easily tweak the wait so that it doesn't wait
 indefinitely. If the wait times out, vacuum can still proceed, but
 it can mark the DEAD line pointers as DEAD_RECLAIMED. It would then
 have a choice of making a second pass and reclaiming the DEAD line
 pointers (like it does today).

Which is exactly what I suggested. Don't wait, just check.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers