Re: [HACKERS] vacuumlo patch

2011-08-07 Thread Josh Kupershmidt
On Sun, Aug 7, 2011 at 12:41 AM, Tim elatl...@gmail.com wrote:
 Hi Josh,

 Thanks for help. Attached is a patch including changes suggested in your
 comments.

 Excerpts from Josh's message On Sat, Aug 6, 2011 at 9:57 PM:

  1. It wasn't clear to me whether you're OK with Aron's suggested
 tweak, please include it in your patch if so.


 I decided to include Aron's tweak.
 I'm not sure whether the PostgreSQL project prefers simpler code over faster
 run time (if only under rare conditions).
 Who knows, maybe the compiler will re-optimize it anyway.

Thanks for the quick update.

It was pretty hard for me to compare your initial versions with
Aron's, since I had trouble with those patches. But if it's just a
minor change to how transaction_limit is handled, I wouldn't worry
about it; the vast majority of vacuumlo's time is going to be spent in
the database AFAICT.

Incidentally, if someone wants to optimize vacuumlo, I see some
low-hanging fruit there, such as maybe avoiding that expensive loop of
DELETEs out of vacuum_l entirely with a bit smarter queries. But
that's a problem for another day.

   2. I think it might be better to use INT_MAX instead of hardcoding
 2147483647.

 Fixed, though I'm not sure if I put the include in the preferred order.

Hrm yeah.. maybe keep it so that all the system headers are together
(i.e. put limits.h after fcntl.h). At least, that's how similar
header files like pg_upgrade.h seem to be structured.

   5. transaction_limit is an int, yet you're using strtol() which
 returns long. Maybe you want to use atoi() or make transaction_limit a
 long?

 The other int in the code uses strtol(), so I also used strtol() instead of
 changing more code.
 I'm not sure if there are any advantages or disadvantages;
 maybe it makes it easier to prevent/detect/report overflow wraparound.

Ugh, yeah, you're right. I think this vacuumlo.c code is not such a
great role model for clean code :-) Probably not a big deal, since you
are checking for param.transaction_limit < 0, which would detect
overflow.

I'm not sure if the other half of that check, (param.transaction_limit
> INT_MAX), has any meaning, i.e. how can an int ever be > INT_MAX?
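For what it's worth, strtol() does let the caller detect overflow cleanly via errno. Here is a minimal sketch of parsing the option value defensively before assigning it to an int — a hypothetical helper for illustration, not code from the patch:

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parse a non-negative int option, rejecting
 * overflow and trailing garbage.  Returns -1 on any parse error. */
static int
parse_limit(const char *arg)
{
	char   *end;
	long	val;

	errno = 0;
	val = strtol(arg, &end, 10);
	if (errno == ERANGE || end == arg || *end != '\0')
		return -1;				/* overflow, empty, or trailing garbage */
	if (val < 0 || val > INT_MAX)
		return -1;				/* out of range for the int lvalue */
	return (int) val;
}
```

With this shape, the caller never sees a silently-wrapped value, on either 32-bit or 64-bit long.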

 I can't think of a good reason anyone would want to limit transactions to
 something more than INT_MAX.
 Implementing that would best be done in a separate and larger patch, as
 PQntuples() returns an int and is used on 271 lines across PostgreSQL.

Right, I wasn't suggesting that would actually be useful - I just
thought the return type of strtol() or atoi() should agree with its
lvalue.

I've only spent a little bit of time with this patch and LOs so far,
but one general question/idea I have is whether the -l LIMIT option
could be made smarter in some way. Say, instead of making the user
figure out how many LOs he can feasibly delete in a single transaction
(how is he supposed to know anyway, trial and error?) could we figure
out what that limit should be based on max_locks_per_transaction, and
handle deleting all the orphan LOs in several transactions for the user
automatically?

Josh

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] vacuumlo patch

2011-08-07 Thread Peter Eisentraut
On sön, 2011-08-07 at 00:41 -0400, Tim wrote:
 Thanks for help. Attached is a patch including changes suggested in your
 comments.

Please put the new option 'l' in some sensible order in the code and the
help output (normally alphabetical).  Also, there should probably be
some update to the documentation.





Re: [HACKERS] vacuumlo patch

2011-08-07 Thread Tim
Excerpts from Josh's message On Sun, Aug 7, 2011 at 2:36 AM:

 could we figure out what that limit should be based on
 max_locks_per_transaction?


It would be nice to implement via -l max instead of making users do it
manually or something like this -l $(grep max_locks_per_transaction.*=
postgresql.conf | perl -p -e 's/#.*//g;s/^.*?([0-9]+).*?$/$1/g' | grep .
|head -1 ).
For this patch I just want to enable the limit functionality, leaving higher
level options like max to the user for now.


 handle deleting all the orphan LOs in several transactions for the user
 automatically?


I addressed this option before and basically said it is an undesirable
alternative, because it is a less flexible option that is easily implemented
in a shell script.
Again, it would be a welcome extra option, but it can be left to the user for
now.


[HACKERS] Will switchover still need a checkpoint in 9.1 SR Hot Standby

2011-08-07 Thread Hannu Krosing
In 9.0 (as in earlier versions) a former standby host has to do a full
checkpoint before becoming available as an independent database instance
in either switchover or failover scenarios. 

For most combinations of bigger-than-minimal shared buffers and
non-memory-speed disks, this can take from several seconds to tens of
minutes on busy systems.

Is the pre-activation checkpoint still required in 9.1 ?

The reason I ask is that I'm looking into making planned switchovers and
adding partitions more instantaneous. 

Streaming replication provides a replica lag of only milliseconds, vs. the
high tens or even hundreds of milliseconds achievable with trigger-based
replication systems like Londiste.

Where Londiste currently wins is the ability to become available
instantaneously, without any checkpoint or other lengthy maintenance
operations.

-- 
---
Hannu Krosing
PostgreSQL Infinite Scalability and Performance Consultant
PG Admin Book: http://www.2ndQuadrant.com/books/




Re: [HACKERS] [RFC] Common object property boards

2011-08-07 Thread Kohei KaiGai
 So add a bunch of macros on top for the two or three (five?) most common
 cases -- say those that occur 3 times or more.

 I could go for that.

 OK, I'll try to implement according to the idea.

I'm implementing this code according to the suggestion.
However, I'm not sure whether it is really a portable approach (at least GCC
accepts it), or whether the interface is simpler than what Robert suggested at first.

extern void get_object_property(ObjectType objtype,
const char **objtype_text,
Oid *relationId,
int *catid_oidlookup,
int *catid_namelookup,
AttrNumber *attnum_name,
AttrNumber *attnum_namespace,
AttrNumber *attnum_owner);

#define get_object_property_attnum_name(objtype)\
({  AttrNumber temp;\
get_object_property((objtype), NULL, NULL, NULL, NULL,  \
&temp, NULL, NULL); \
temp; })
#define get_object_property_attnum_namespace(objtype)   \
({  AttrNumber temp;\
get_object_property((objtype), NULL, NULL, NULL, NULL,  \
NULL, &temp, NULL); \
temp; })
#define get_object_property_attnum_ownership(objtype)   \
({  AttrNumber temp;\
get_object_property((objtype), NULL, NULL, NULL, NULL,  \
NULL, NULL, &temp); \
temp; })

If get_object_property() instead returns an entry of the big property table,
the bunch of macros becomes simple, as follows:

extern ObjectProperty *get_object_property(ObjectType objtype);

#define get_object_property_attnum_name(objtype) \
(get_object_property(objtype)->attnum_name)

2011/8/2 Kohei KaiGai kai...@kaigai.gr.jp:
 2011/8/2 Robert Haas robertmh...@gmail.com:
 On Mon, Aug 1, 2011 at 4:27 PM, Alvaro Herrera
 alvhe...@commandprompt.com wrote:
 Excerpts from Robert Haas's message of lun ago 01 16:12:56 -0400 2011:
 On Mon, Aug 1, 2011 at 4:02 PM, Alvaro Herrera
 alvhe...@commandprompt.com wrote:
  Excerpts from Kohei KaiGai's message of dom jul 31 02:21:55 -0400 2011:
  2011/7/29 Tom Lane t...@sss.pgh.pa.us:
 
  It would likely be better to not expose the struct type, just
  individual lookup functions.
  
  If so, individual functions to expose a certain property of the supplied
  object type should be provided.
 
    int get_object_property_catid_oidlookup(ObjectType);
    int get_object_property_catid_namelookup(ObjectType);
    Oid get_object_property_relation_id(ObjectType);
    AttrNumber get_object_property_nameattnum(ObjectType);
    AttrNumber get_object_property_namespacenum(ObjectType);
    AttrNumber get_object_property_ownershipnum(ObjectType);
 
  Maybe a single lookup function that receives pointers that the lookup
  routine can fill with the appropriate information; allowing for a NULL
  pointer in each, meaning caller is not interested in that property.

 That seems like a lot of extra notational complexity for no particular
 benefit.  Every time someone wants to add a new property to this
 array, they're going to have to touch every caller, and all
 third-party code using this interface will have to be rejiggered.

 So add a bunch of macros on top for the two or three (five?) most common
 cases -- say those that occur 3 times or more.

 I could go for that.

 OK, I'll try to implement according to the idea.

 Thanks,
 --
 KaiGai Kohei kai...@kaigai.gr.jp




-- 
KaiGai Kohei kai...@kaigai.gr.jp



Re: [HACKERS] vacuumlo patch

2011-08-07 Thread Tim
Excerpts from Peter's message On Sun, Aug 7, 2011 at 3:49 AM:

 Please put the new option 'l' in some sensible order in the code and the
 help output (normally alphabetical).  Also, there should probably be
 some update to the documentation.


I have alphabetized the help output, and one piece of code.
I'm hesitant to alphabetize some portions of the code because they are
grouped by type and not already alphabetized.
If I left something un-alphabetized, please be explicit about what
should be changed and why.
Documentation is now also in the patch.
Thanks for the tips.


vacuumlo.patch
Description: Binary data



Re: [HACKERS] Will switchover still need a checkpoint in 9.1 SR Hot Standby

2011-08-07 Thread Simon Riggs
On Sun, Aug 7, 2011 at 8:57 AM, Hannu Krosing ha...@2ndquadrant.com wrote:
 In 9.0 (as in earlier versions) a former standby host has to do a full
 checkpoint before becoming available as an independent database instance
 in either switchover or failover scenarios.

 For most combinations of of bigger than minimal shared buffers and
 non-memory-speed disks this can take from several seconds to tens of
 minutes on busy systems.

For switchover, you issue a checkpoint first, to reduce this time as
much as possible.

 Is the pre-activation checkpoint still required in 9.1 ?

Yes, but I've found a way to remove them in 9.2 and will be patching that soon.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services



Re: [HACKERS] Transient plans versus the SPI API

2011-08-07 Thread Hannu Krosing
On Wed, 2011-08-03 at 15:19 -0400, Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
  This seems like a good design.  Now what would be really cool is if
  you could observe a stream of queries like this:
 
  SELECT a, b FROM foo WHERE c = 123
  SELECT a, b FROM foo WHERE c = 97
  SELECT a, b FROM foo WHERE c = 236
 
  ...and say, hey, I could just make a generic plan and use it every
  time I see one of these.  It's not too clear to me how you'd make
  recognition of such queries cheap enough to be practical, but maybe
  someone will think of a way...
 
 Hm, you mean reverse-engineering the parameterization of the query?

Yes, basically re-generate the query after (or while) parsing, replacing
constants and arguments with another set of generated arguments and
printing the list of these arguments at the end. It may be easiest to do
this in parallel with parsing.

 Interesting thought, but I really don't see a way to make it practical.

Another place where this could be really useful is logging & monitoring.

If there were an option to log the above queries as 

SELECT a, b FROM foo WHERE c = $1, (123)
SELECT a, b FROM foo WHERE c = $1, (97)
SELECT a, b FROM foo WHERE c = $1, (236)

it would make all kinds of general performance monitoring tasks also
much easier, not to mention that this form would actually be something
that can be cached internally.
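As a toy illustration of the normalization idea (purely hypothetical code; a real implementation would work on the parse tree, not the query text), a sketch that rewrites integer literals as $n placeholders collapses the three example queries above into a single shape:

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Toy normalizer: replace each integer literal with $1, $2, ... so that
 * queries differing only in their constants hash to the same text.
 * Deliberately naive: it doesn't handle string literals, floats, etc. */
static void
normalize_query(const char *src, char *dst, size_t dstlen)
{
	int		nparam = 0;
	size_t	o = 0;

	for (size_t i = 0; src[i] != '\0' && o + 16 < dstlen;)
	{
		if (isdigit((unsigned char) src[i]) &&
			(i == 0 || !isalnum((unsigned char) src[i - 1])))
		{
			/* an integer literal: emit the next placeholder instead */
			o += (size_t) snprintf(dst + o, dstlen - o, "$%d", ++nparam);
			while (isdigit((unsigned char) src[i]))
				i++;
		}
		else
			dst[o++] = src[i++];
	}
	dst[o] = '\0';
}
```

With this, `... WHERE c = 123` and `... WHERE c = 97` both normalize to `... WHERE c = $1`, which is exactly the key a log aggregator or a statement cache would want.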

For some users this feature might even be worth using on its own,
without it providing repeating-plan recognition.

 In any case, it would amount to making up for a bad decision on the
 application side, ie, not transmitting the query in the parameterized
 form that presumably exists somewhere in the application.  I think
 we'd be better served all around by encouraging app developers to rely
 more heavily on parameterized queries ... but first we have to fix the
 performance risks there.
 
   regards, tom lane
 





Re: [HACKERS] Transient plans versus the SPI API

2011-08-07 Thread Hannu Krosing
On Sun, 2011-08-07 at 11:15 +0200, Hannu Krosing wrote:
 On Wed, 2011-08-03 at 15:19 -0400, Tom Lane wrote:

  Hm, you mean reverse-engineering the parameterization of the query?
 
 Yes, basically re-generate the query after (or while) parsing, replacing
 constants and arguments with another set of generated arguments and
 printing the list of these arguments at the end. It may be easiest to do
 this in parallel with parsing.
 
  Interesting thought, but I really don't see a way to make it practical.
 
 Another place where this could be really useful is logging & monitoring.
 
 If there were an option to log the above queries as 
 
 SELECT a, b FROM foo WHERE c = $1, (123)
 SELECT a, b FROM foo WHERE c = $1, (97)
 SELECT a, b FROM foo WHERE c = $1, (236)

The main monitoring use case would be pg_stat_statements,
http://developer.postgresql.org/pgdocs/postgres/pgstatstatements.html
which is currently pretty useless for queries without parameters.

 it would make all kinds of general performance monitoring tasks also
 much easier, not to mention that this form would actually be something
 that can be cached internally.
 
 For some users this feature might even be worth using on its own,
 without it providing repeating-plan recognition.
 
  In any case, it would amount to making up for a bad decision on the
  application side, ie, not transmitting the query in the parameterized
  form that presumably exists somewhere in the application.  I think
  we'd be better served all around by encouraging app developers to rely
  more heavily on parameterized queries ... but first we have to fix the
  performance risks there.
  
  regards, tom lane
  
 
 
 





Re: [HACKERS] cataloguing NOT NULL constraints

2011-08-07 Thread Peter Eisentraut
On lör, 2011-08-06 at 12:58 +0100, Dean Rasheed wrote:
 Right now \d gives:
 
   Table public.foo
  Column |  Type   | Modifiers
 +-+---
  a  | integer | not null
  b  | integer |
  c  | integer |
 Check constraints:
 foo_b_check CHECK (b IS NOT NULL)
 foo_c_check CHECK (NOT c IS NULL)
 
 With this approach, one change would be that you'd gain an extra not
 null in the Modifiers column for b.
 
 But how many CHECK constraints would show? I guess it would show 3,
 although it could be changed to just show 1. But it certainly couldn't
 continue to show 2, since nothing in the catalogs could distinguish
 the constraints on a from those on b.

I'd say we would show all (what is currently known as) NOT NULL
constraints under "Check constraints:".





Re: [HACKERS] Transient plans versus the SPI API

2011-08-07 Thread Simon Riggs
On Sat, Aug 6, 2011 at 7:29 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
 Tom Lane t...@sss.pgh.pa.us writes:
 I think we'll be a lot better off with the framework discussed last
 year: build a generic plan, as well as custom plans for the first few
 sets of parameter values, and then observe whether there's a significant
 reduction in estimated costs for the custom plans.

 Another way here would be to cache more than a single plan and to keep
 execution time samples or some other relevant runtime characteristics.
 Then what we need would be a way to switch from a plan to another at run
 time on some conditions, like realizing that the reason why the planner
 thought a nestloop would be perfect is obviously wrong, or maybe just
 based on runtime characteristics.

Tom and I discussed storing multiple sub-plans on a node back in '05
IIRC, and Tom later put in support for that.

That wasn't followed up on.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services



Re: [HACKERS] Will switchover still need a checkpoint in 9.1 SR Hot Standby

2011-08-07 Thread PostgreSQL - Hans-Jürgen Schönig

On Aug 7, 2011, at 11:01 AM, Simon Riggs wrote:

 On Sun, Aug 7, 2011 at 8:57 AM, Hannu Krosing ha...@2ndquadrant.com wrote:
 In 9.0 (as in earlier versions) a former standby host has to do a full
 checkpoint before becoming available as an independent database instance
 in either switchover or failover scenarios.
 
 For most combinations of bigger-than-minimal shared buffers and
 non-memory-speed disks this can take from several seconds to tens of
 minutes on busy systems.
 
 For switchover, you issue a checkpoint first, to reduce this time as
 much as possible.
 
 Is the pre-activation checkpoint still required in 9.1 ?
 
 Yes, but I've found a way to remove them in 9.2 and will be patching that 
 soon.



hi simon,

this is highly interesting. this is an important issue for big iron.
can you share the idea you have in mind?

many thanks,

hans


--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de




[HACKERS] [PL/pgSQL] %TYPE and array declaration

2011-08-07 Thread Wojciech Muła
Hi all, is anybody working on this TODO item?
http://wiki.postgresql.org/wiki/Todo#PL.2FpgSQL

I didn't find any related posting/bug report.

w.



Re: [HACKERS] [RFC] Common object property boards

2011-08-07 Thread Tom Lane
Kohei KaiGai kai...@kaigai.gr.jp writes:
 I'm implementing this code according to the suggestion.
 However, I'm not sure whether it is really a portable approach (at least GCC
 accepts it), or whether the interface is simpler than what Robert suggested at first.

 #define get_object_property_attnum_name(objtype)\
 ({  AttrNumber temp;\
 get_object_property((objtype), NULL, NULL, NULL, NULL,  \
 &temp, NULL, NULL); \
 temp; })

Blocks within expressions are a gcc-ism and will fail on any other
compiler, so you can't do it that way.
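For illustration, the same one-expression call site can be had portably with a plain function instead of a gcc statement expression. A minimal sketch with stubbed-out types and lookup follows — the typedefs and the dummy value 42 are assumptions for the sake of a self-contained example, not real catalog data:

```c
#include <stddef.h>

typedef int AttrNumber;		/* stand-ins for the real PostgreSQL typedefs */
typedef int ObjectType;
typedef unsigned int Oid;

/* Stub of the proposed all-in-one lookup so the wrapper below compiles;
 * the real function would consult the object-property table. */
static void
get_object_property(ObjectType objtype, const char **objtype_text,
					Oid *relationId, int *catid_oidlookup,
					int *catid_namelookup, AttrNumber *attnum_name,
					AttrNumber *attnum_namespace, AttrNumber *attnum_owner)
{
	if (attnum_name)
		*attnum_name = 42;	/* dummy value, for illustration only */
}

/* A plain function usable in expression context on every C compiler,
 * unlike the ({ ... }) statement-expression macro. */
static AttrNumber
get_object_property_attnum_name(ObjectType objtype)
{
	AttrNumber	temp;

	get_object_property(objtype, NULL, NULL, NULL, NULL,
						&temp, NULL, NULL);
	return temp;
}
```

The compiler can inline this just as well as the macro, with none of the portability baggage.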

regards, tom lane



[HACKERS] Yes, WaitLatch is vulnerable to weak-memory-ordering bugs

2011-08-07 Thread Tom Lane
I suspected $SUBJECT from the beginning, and I've now put in enough work
to be able to prove it.  The attached test program reliably fails within
a few minutes of being started, when run with 8 worker processes on an
8-core PPC machine.  It's a pretty simple token passing ring protocol,
and at some point one of the processes sees its latch set without seeing
its flag set, so it goes back to sleep and the token stops getting passed.

The original worry about the latch code was that there was some internal
race condition of this sort, but I think we were failing to see the forest
for the trees.  The specification of how to use a latch reads like this:

 * The correct pattern to wait for an event is:
 *
 * for (;;)
 * {
 * ResetLatch();
 * if (work to do)
 * Do Stuff();
 *
 * WaitLatch();
 * }
 *
 * It's important to reset the latch *before* checking if there's work to
 * do. Otherwise, if someone sets the latch between the check and the
 * ResetLatch call, you will miss it and Wait will block.
 *
 * To wake up the waiter, you must first set a global flag or something
 * else that the main loop tests in the if (work to do) part, and call
 * SetLatch *after* that.

Any protocol of that sort has *obviously* got a race condition between
changes of the latch state and changes of the separate flag state, if run
in a weak-memory-ordering machine.

At least on the hardware I'm testing, it seems that the key requirement
for a failure to be seen is that there's a lot of contention for the cache
line containing the flag, but not for the cache line containing the latch.
This allows the write to the flag to take longer to propagate to main
memory than the write to the latch.  I could not get it to fail until
I added the extra shared-memory traffic in the delay loop around line 175.
This probably also explains why it doesn't fail with small numbers of
workers --- you need enough contending CPUs to saturate the memory
hardware.  (On Red Hat's test machines, 7 or 8 workers would reliably
fail within a few minutes, but 6 or fewer didn't always.)

I'm not sure whether we need to do anything about this for 9.1; it may be
that none of the existing uses of latches are subject to the kind of
contention that would create a risk.  But I'm entirely convinced that we
need to insert some memory-barrier instructions into the latch code before
we expand our uses of latches much more.  Since we were already finding
that memory barriers will be needed for performance reasons elsewhere,
I think it's time to bite the bullet and develop that infrastructure.
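To make the required ordering concrete, here is a sketch of the flag/latch pair from the protocol above in C11-atomics notation. This is illustration only, not the PostgreSQL latch code — PostgreSQL would need its own barrier primitives; the C11 fences are just a convenient way to express them:

```c
#include <stdatomic.h>

/* The shared flag ("work to do") and the latch, as in the protocol above. */
static _Atomic int work_flag = 0;
static _Atomic int latch_is_set = 0;

static void
set_latch_with_barrier(void)
{
	atomic_store_explicit(&work_flag, 1, memory_order_relaxed);
	/* write barrier: the flag must become visible before the latch does */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&latch_is_set, 1, memory_order_relaxed);
}

static int
check_after_wakeup(void)
{
	int		latch = atomic_load_explicit(&latch_is_set, memory_order_relaxed);

	/* read barrier: don't let the flag read be hoisted above the latch read */
	atomic_thread_fence(memory_order_acquire);
	return latch && atomic_load_explicit(&work_flag, memory_order_relaxed);
}
```

Without the two fences, a weak-memory machine is free to let the latch store reach memory before the flag store — exactly the failure the test program demonstrates.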

regards, tom lane


/*
 * Simple test program to exercise SetLatch/WaitLatch.
 *
 * This just passes a token flag around among some number of worker
 * processes.
 *
 * To use: compile this into a .so, then execute
 *
 * CREATE FUNCTION start_worker() RETURNS void AS '/path/to/.so' LANGUAGE c;
 *
 * Then, for each of up to MAX_WORKERS sessions, execute
 *
 * psql -c "SELECT start_worker()" dbname &
 *
 * About one session per physical CPU should be good.
 *
 * Watch the postmaster log for awhile to confirm nobody drops the ball.
 * If anyone does, all the processes will exit with a timeout failure
 * after about a minute.  If you get bored, pg_ctl stop -m immediate
 * is the easiest way out.
 *
 * Be sure to restart the postmaster between attempts, since there's no
 * logic for re-initializing a workspace that's already present.
 *
 * Note: this will NOT work with the Windows implementation of latches, since
 * we are cheesy about how we initialize the shared memory.  Doesn't matter
 * for now, since Windows doesn't run on any machines with weak memory
 * ordering anyway.
 */

#include "postgres.h"

#include "fmgr.h"
#include "miscadmin.h"
#include "storage/latch.h"
#include "storage/shmem.h"

PG_MODULE_MAGIC;


#define MAX_WORKERS 16

#define BIAS 50				/* to reduce time-to-failure */


/*
 * Arrays have padding to try to ensure active values are in different cache
 * lines
 */
typedef struct
{
	int			flag;
	int			pad2[4];
	int			junk;
	char		pad[333];
} TokStruct;

typedef struct
{
	Latch		latch;
	char		pad[333];
} LatchStruct;

typedef struct
{
	int			num_workers;

	TokStruct	flags[MAX_WORKERS];

	LatchStruct	latches[MAX_WORKERS];
} MySharedSpace;

volatile long	gsum = 0;

Datum		start_worker(PG_FUNCTION_ARGS);

PG_FUNCTION_INFO_V1(start_worker);

Datum
start_worker(PG_FUNCTION_ARGS)
{
	volatile MySharedSpace *workspace;
	bool		found;
	int			i;
	int			my_number;
	long		counter = 0;
	unsigned short xseed[3];

	/* Set up non-interlocked RNG */
	xseed[0] = xseed[1] = xseed[2] = 42;

	/* Create or access the shared data */
	workspace = (MySharedSpace *)
		ShmemInitStruct("Latch Testing", sizeof(MySharedSpace), &found);

	if (!found)
	{
		/* I'm the first, initialize */
		memset((void *) workspace, 0, sizeof(MySharedSpace));

		for (i = 0; i < MAX_WORKERS; i++)
			InitSharedLatch(&(workspace->latches[i].latch));

		/* Give the token to the 

Re: [HACKERS] WIP: Fast GiST index build

2011-08-07 Thread Alexander Korotkov
Hi!

Here is the latest version of the patch. The most significant changes
in comparison with your version of the patch are:
1) Reference counting of path items was added. It should help against
possible accumulation of path items.
2) Neighbor relocation was added.
3) Subtree prefetching was added.
4) The final emptying algorithm was reverted to the original one. My experiments
show that the typical number of passes in final emptying in your version of the
patch is about 5. That may be significant in itself. Also, I have no estimate of
the number of passes, and no guarantee that it will not be high in some corner
cases. I.e., I prefer the more predictable single-pass algorithm, even though
it's a little more complex.
5) Even the last page of a node buffer is unloaded from main memory to disk.
Imagine that with levelstep = 1 each inner node has a buffer. It seems to
me that keeping one page of each buffer in memory may be memory-consuming.

Open items I see at this moment:
1) I dislike my criterion for switching to the buffering build method, because
it's based on very unproven assumptions, and I haven't found more reliable
assumptions in scientific papers so far. I would like to replace it with
something much simpler, for example switching to the buffering build when the
regular build actually starts to produce a lot of IO. To implement that
approach I need to somehow detect actual IO (not just buffer reads, but misses
of the OS cache).
2) I'm worried about the possible size of nodeBuffersTab and the path items. If
we imagine an extremely large tree with levelstep = 1, the size of these data
structures will be significant. And it's hard to predict that size without
knowing the tree size.

--
With best regards,
Alexander Korotkov.


gist_fast_build-0.10.0.patch.gz
Description: GNU Zip compressed data



Re: [HACKERS] vacuumlo patch

2011-08-07 Thread Josh Kupershmidt
On Sun, Aug 7, 2011 at 3:54 AM, Tim elatl...@gmail.com wrote:

 Excerpts from Josh's message On Sun, Aug 7, 2011 at 2:36 AM:

 could we figure out what that limit should be based on
 max_locks_per_transaction?

 It would be nice to implement via -l max instead of making users do it
 manually or something like this -l $(grep max_locks_per_transaction.*=
 postgresql.conf | perl -p -e 's/#.*//g;s/^.*?([0-9]+).*?$/$1/g' | grep .
 |head -1 ).
 For this patch I just want to enable the limit functionality, leaving higher
 level options like max to the user for now.


 handle deleting all the orphan LOs in several transactions for the user
 automatically?

 I addressed this option before and basically said it is an undesirable
 alternative, because it is a less flexible option that is easily implemented
 in a shell script.
 Again it would be a welcome extra option but it can be left to the user for
 now.

As a preface, I appreciate the work you're putting into this module,
and I am all for keeping the scope of this change as small as possible
so that we actually get somewhere. Having said that, it's a bit
unfortunate that this module seems to be rather neglected, and has
some significant usability problems.

For instance, if you do blow out the max_locks_per_transaction limit,
you get a very reasonable error message and hint like:

  Failed to remove lo 44459: ERROR:  out of shared memory
  HINT:  You might need to increase max_locks_per_transaction.

Unfortunately, the code doesn't 'break;' after that, it just keeps
plowing through the lo_unlink() calls, generating a ton of rather
unhelpful messages like:

  Failed to remove lo 47087: ERROR:  current transaction is aborted,
  commands ignored until end of transaction block

which clog up stderr, and make it easy to miss the one helpful error
message at the beginning. Now, here's where your patch might really
help things with a minor adjustment. How about structuring that
lo_unlink() call so that users will see only a reasonably helpful
error message if they happen to come across this problem, like this in
non-verbose mode:

  WARNING:  out of shared memory

  Failed to remove lo 46304: ERROR:  out of shared memory
  HINT:  You might need to increase max_locks_per_transaction.
  Bailing out. Try using -l LIMIT flag, with a LIMIT of 1845

(Side note: I was asking upthread about how to figure out what LIMIT
value to use, because I don't understand how max_locks_per_transaction
relates to the number of lo_unlink() calls one can make in a
transaction... I seem to be able to use a limit of 1845, but 1846 will
error out, with max_locks_per_transaction = 64. Anyone know why this
is?)

A related problem I noticed that's not really introduced by your
script, but which might easily be fixed, is the return value from
vacuumlo(). The connection and query failures at the top of the
function all return -1 upon failure, but if an lo_unlink() call fails
and the entire transaction gets rolled back, vacuumlo() happily
returns 0.
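To illustrate the bail-out structure being proposed — a control-flow sketch only, where try_unlink() is a stub standing in for the libpq lo_unlink() call, not the real vacuumlo.c code:

```c
#include <stdio.h>

/* Stub for the libpq lo_unlink() call: succeeds three times, then fails,
 * the way a transaction starts rejecting commands once
 * max_locks_per_transaction is exhausted.  Nonzero means success. */
static int calls = 0;

static int
try_unlink(unsigned int lo)
{
	(void) lo;
	return ++calls <= 3;
}

/* Bail out on the first failure instead of plowing through an aborted
 * transaction, and report failure to the caller instead of returning 0. */
static int
delete_lo_batch(const unsigned int *los, int nlos, int *ndeleted)
{
	*ndeleted = 0;
	for (int i = 0; i < nlos; i++)
	{
		if (!try_unlink(los[i]))
		{
			fprintf(stderr, "Failed to remove lo %u; bailing out.\n", los[i]);
			return -1;			/* propagate failure to the exit status */
		}
		(*ndeleted)++;
	}
	return 0;
}
```

The two changes — breaking on the first error and returning nonzero — are exactly what keeps stderr readable and makes the failure visible to scripts checking the exit code.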

I've put together an updated version of your patch (based off your
latest version downstream with help output alphabetized) showing how I
envision these two problems being fixed. Also, a small adjustment to
your SGML changes to blend in better. Let me know if that all looks OK
or if you'd rather handle things differently. The new error messages
might well need some massaging. I didn't edit the INT_MAX stuff
either, will leave that for you.

Josh


vacuumlo.v5.patch
Description: Binary data



Re: [HACKERS] plperl crash with Debian 6 (64 bit), pl/perlu, libwww and https

2011-08-07 Thread Tim Bunce
[I've included a summary of the thread and Bcc'd this to perl5-porters
for a sanity check. Please trim heavily when replying.]

On Thu, Aug 04, 2011 at 09:42:31AM -0400, Andrew Dunstan wrote:
 
 So doesn't this look like a bug in the perl module that sets the
 signal handler and doesn't restore it?
 
 What happens if you wrap the calls to the module like this?:
 
 {
 local $SIG{ALRM};
 # do LWP stuff here
 }
 return 'OK';
 
 That should restore the old handler on exit from the block.
 
 I think if you use a perl module that monkeys with the signal
 handlers for any signal postgres uses all bets are off.

On Thu, Aug 04, 2011 at 10:28:45AM -0400, Tom Lane wrote:
 
  Sure, but how expensive would it be for pl/perl to do this
  automatically ?
 
 How can anything like that possibly work with any reliability
 whatsoever?  If the signal comes in, you don't know whether it was
 triggered by the event Postgres expected, or the event the perl module
 expected, and hence there's no way to deliver it to the right signal
 handler (not that the code you're describing is even trying to do that).
 
 What *I'd* like is a way to prevent libperl from touching the host
 application's signal handlers at all.  Sadly, Perl does not actually
 think of itself as an embedded library, and therefore thinks it owns all
 resources of the process and can diddle them without anybody's
 permission.

The PERL_IMPLICIT_SYS mechanism addresses this. Unfortunately it only
works with USE_ITHREADS on Windows currently.
http://perldoc.perl.org/perlguts.html#Future-Plans-and-PERL_IMPLICIT_SYS

On Thu, Aug 04, 2011 at 04:09:47PM -0600, Alex Hunsaker wrote:
 
 Well we can't prevent perl XS (aka C) from messing with signals (and
 other modules like POSIX that expose things like sigprocmask,
 siglongjmp, etc.), but we could prevent plperl(u) from playing with
 signals at the Perl level, a la %SIG.
 
 [ IIRC I proposed doing something about this when we were talking
 about the whole Safe mess, but I think there was too much other
 discussion going on at the time :-) ]
 
 Mainly the options I'm thinking about are:
 1) if anyone touches %SIG die
 2) turn %SIG into a regular hash so people can set/play with %SIG, but
 it has no real effect.
 3) local %SIG before we call their trigger function. This lets signals
 still work while in trigger scope (like we do for %_TD)
 4) if we can't get any of the above to work we can save each %SIG
 handler before and restore them after each trigger call. (mod_perl
 does something similar so I'm fairly certain we should be able to get
 that to work)

On Thu, Aug 4, 2011 at 16:34, David E. Wheeler da...@kineticode.com wrote:

 1) if anyone touches %SIG die
 2) turn %SIG into a regular hash so people can set/play with %SIG, but
 it has no real effect.

 These would disable stuff like $SIG{__WARN__} and $SIG{__DIE__}, which would 
 be an unfortunate side-effect.

On Thu, Aug 04, 2011 at 07:52:45PM -0400, Tom Lane wrote:
 
 The scenario I was imagining was:
 
 1. perl temporarily takes over SIGALRM.
 
 2. while perl function is running, statement_timeout expires, causing
 SIGALRM to be delivered.
 
 3. perl code is probably totally confused, and even if it isn't,
 statement_timeout will not be enforced since Postgres won't ever get the
 interrupt.
 
 Even if you don't think statement_timeout is a particularly critical
 piece of functionality, similar interference with the delivery of, say,
 SIGUSR1 would be catastrophic.
 
 How do you propose to prevent this sort of problem?

I don't think there's a complete solution for that particular scenario.
[Though redirecting the perl alarm() opcode to code that checks the
argument against the remaining seconds before statement_timeout expires
might get close.]
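Tom's failure mode is easy to reproduce in miniature with any signal API. A hedged Python sketch (the handler names are illustrative, not Postgres code):

```python
import signal

seen = []

# The host installs a SIGALRM handler to enforce statement_timeout.
signal.signal(signal.SIGALRM, lambda s, f: seen.append("postgres"))

# Embedded code later takes over SIGALRM without chaining to the old handler.
signal.signal(signal.SIGALRM, lambda s, f: seen.append("perl"))

# statement_timeout expires and SIGALRM is delivered...
signal.raise_signal(signal.SIGALRM)

# ...but only the usurper's handler runs; the timeout is never enforced.
assert seen == ["perl"]
```

Since the OS delivers each signal to exactly one handler, there is no way after the fact to route it to "whichever party the event was really for", which is the core of Tom's objection.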


On Thu, Aug 04, 2011 at 06:44:18PM -0600, Alex Hunsaker wrote:
 
  How do you propose to prevent this sort of problem?
 
 Well, I think that makes it unworkable.
 
 So back to #1 or #2.
 
 For plperlu it sounds like we are going to need to disallow setting _any_
 signals (minus __DIE__ and __WARN__). I should be able to make it so
 that when you try, it gives you a warning along the lines of
 "plperl can't set signal handlers, ignoring".
 
 For plperl I think we should probably do the same. It seems like
 Andrew might disagree, though? Anyone else want to chime in on whether
 plperl should let you muck with signal handlers?
 
 I'm not entirely sure how much of this is workable; I still need to go
 through perl's guts and see. At the very worst I think we can mark
 each signal handler that is exposed in %SIG readonly (which would mean
 we would die instead of warn), but I think I can make the warning
 variant workable as well.
 
 I also have not dug deep enough to know how to handle __WARN__ and
 __DIE__ (and exactly what limitations allowing those will impose). I
 still have some work at $day_job before I can really dig into this.

On Thu, Aug 04, 2011 at 09:40:57PM -0400, Andrew Dunstan wrote:
 
 

Re: [HACKERS] plperl crash with Debian 6 (64 bit), pl/perlu, libwww and https

2011-08-07 Thread Andrew Dunstan



On 08/07/2011 07:06 PM, Tim Bunce wrote:


After a little digging and some discussion on the #p5p channel [thanks
to ilmari++ leont++ and sorear++ for their help] it seems that local(%SIG)
doesn't do what you might expect. The %SIG hash does become empty, but the
OS-level handlers, even those installed by perl, *aren't changed*:

$ perl -wE '$SIG{INT} = sub { say "Foo" }; { local %SIG; kill "INT", $$; };'
Foo
And, even worse, they're not reset at scope exit:

$ perl -wE '$SIG{INT} = sub { say "Foo" }; { local %SIG; $SIG{INT} = sub { say "Bar" } }
kill "INT", $$;'
Bar

That sure seems like a bug (I'll check with the perl5-porters list).


Yeah, that seems very bad. :-(


Localizing an individual element of %SIG works fine.
In C that's something like this (untested):

 hv = gv_fetchpv("SIG", 0, SVt_PVHV);
 keysv = ...SV containing "ALRM"...
 he = hv_fetch_ent(hv, keysv, 0, 0);
 if (he) {  /* arrange to restore existing elem */
     save_helem_flags(hv, keysv, &HeVAL(he), SAVEf_SETMAGIC);
 }
 else {     /* arrange to delete a new elem */
     SAVEHDELETE(hv, keysv);
 }
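The per-element save-and-restore shape Tim describes is, outside Perl, just a scope guard around one handler. A Python sketch for comparison (the `local_handler` helper is illustrative, not part of any proposed patch):

```python
import signal
from contextlib import contextmanager

@contextmanager
def local_handler(signum):
    # Save a single signal's handler and restore it at scope exit,
    # like localizing one %SIG element rather than all of %SIG.
    saved = signal.getsignal(signum)
    try:
        yield
    finally:
        signal.signal(signum, saved)

seen = []
signal.signal(signal.SIGUSR1, lambda s, f: seen.append("outer"))
with local_handler(signal.SIGUSR1):
    signal.signal(signal.SIGUSR1, lambda s, f: seen.append("inner"))

# The inner handler did not leak past the block.
signal.raise_signal(signal.SIGUSR1)
assert seen == ["outer"]
```

Guarding individual elements avoids the local(%SIG) pitfall shown above: only the named entry is touched, and the OS-level handler is genuinely reinstated at exit.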




Hmm. I think we'll need to test how much it's going to cost to add that 
to every plperl (or maybe just every plperlu) function call for the six 
or so signals we use.


cheers

andrew



Re: [HACKERS] vacuumlo patch

2011-08-07 Thread Tim
Thanks Josh,
I like the ability to bail out on PQTRANS_INERROR, and I think it's a small
enough fix to be appropriate to include in this patch.
I did consider it before but did not implement it, because I am still new to
pgsql-hackers and did not know how to do it off-the-cuff.
So thanks for the big improvement.


[HACKERS] psql document fix about showing FDW options

2011-08-07 Thread Shigeru Hanada
I noticed that the psql documentation wrongly says that the \d+ command shows
per-table FDW options of a foreign table; in fact, per-table FDW
options are shown only in the output of the \det+ command.  Attached patch
removes this wrong description.

This fix should be applied to 9.1 too.

Regards,
-- 
Shigeru Hanada
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index e2e2abe..926ee60 100644
*** a/doc/src/sgml/ref/psql-ref.sgml
--- b/doc/src/sgml/ref/psql-ref.sgml
*************** testdb=&gt;
*** 899,906 ****
  The command form <literal>\d+</literal> is identical, except that
  more information is displayed: any comments associated with the
  columns of the table are shown, as is the presence of OIDs in the
! table, the view definition if the relation is a view, and the generic
! options if the relation is a foreign table.
  </para>
  
  <para>
--- 899,905 ----
  The command form <literal>\d+</literal> is identical, except that
  more information is displayed: any comments associated with the
  columns of the table are shown, as is the presence of OIDs in the
! table, and the view definition if the relation is a view.
  </para>
  
  <para>
