Re: [HACKERS] Where to Host Project

2008-09-22 Thread Dave Page
On Sat, Sep 20, 2008 at 8:44 PM, Dimitri Fontaine
[EMAIL PROTECTED] wrote:

 I suppose the plan is to upgrade to a newer GForge. Is there still time to
 propose something completely different? I have heard really good feedback
 about VHFFS, a Perl-based clean-room re-implementation of it, if you want
 to see it that way.
  http://www.vhffs.org/wiki/index.php
  http://fr.wikipedia.org/wiki/VHFFS (easy to grasp keywords)

Certainly not an idea I want to entertain - migrating to a new project
site would be a massive undertaking, and liable to drag on for far
longer than any of us want. It took long enough to migrate from GBorg
:-(


-- 
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch: Collation support

2008-09-22 Thread Heikki Linnakangas

Martijn van Oosterhout wrote:

On Fri, Sep 19, 2008 at 10:13:43AM +0300, Heikki Linnakangas wrote:
It's not like the patch is going to disappear from planet Earth if it 
doesn't get committed for 8.4. It's still valuable and available when 
the new catalogs are needed.


I just prefer it as it was because it takes care of a useful subset of
the features people want in a way that is compatible with the future.
Whereas with the stripped-down version, I'm not sure it gets us anywhere.


It gives the capability to have different collations in different 
databases within the same cluster. IOW, the same feature as the original 
patch. Finer-grained collation would be even better, of course, but 
database-level collation is a valuable feature on its own.


The critical question is how much compatibility trouble we're going to 
get by having to support the extension to CREATE DATABASE in the 
stripped-down patch, when the pg_collation catalog is introduced in a 
later version in one form or another. So let's investigate that a bit 
further:


In the stripped down version, the CREATE DATABASE syntax is:

CREATE DATABASE name WITH COLLATE=<locale name> CTYPE=<locale name>

In the original patch, the CREATE DATABASE syntax is:

CREATE DATABASE name WITH COLLATE=<collation name>

The first thing that we see is that the COLLATE keyword means different 
things, so it's probably best to change that into:


CREATE DATABASE name WITH LC_COLLATE=<locale name> LC_CTYPE=<locale name>

in the stripped-down version. Then we need a way to map the 
stripped-down syntax into the one in the original patch. That's just a 
matter of looking up the collation in the pg_collation catalog with the 
right LC_COLLATE and LC_CTYPE.


Things get slightly more complicated if there is no such collation in 
the pg_collation catalog. One option is to simply create it at that point.


BTW, the original patch didn't have any provision for creating rows in 
pg_collation reflecting the locales available in the OS, but I think 
we'd need that. Otherwise the DBA would need to manually run CREATE 
COLLATION for every collation they want users to be able to use. 
Assuming we do that, the situation that we can't find a row with given 
LC_COLLATE and LC_CTYPE should not arise in practice.
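
For illustration, a rough SQL sketch of that mapping; the pg_collation column 
names and the CREATE COLLATION spelling below are assumptions for the example, 
not the actual catalog layout:

-- Stripped-down syntax: per-database locale settings only.
CREATE DATABASE swedish_db WITH LC_COLLATE = 'sv_SE.UTF-8' LC_CTYPE = 'sv_SE.UTF-8';

-- With a pg_collation catalog, the same request becomes a lookup
-- (column names are hypothetical):
SELECT oid FROM pg_collation
 WHERE collcollate = 'sv_SE.UTF-8' AND collctype = 'sv_SE.UTF-8';

-- If no row is found, one option is to create it on the fly, roughly:
CREATE COLLATION "sv_SE.UTF-8" (LC_COLLATE = 'sv_SE.UTF-8', LC_CTYPE = 'sv_SE.UTF-8');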


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: move column defaults into pg_attribute along with attacl

2008-09-22 Thread Asko Oja
On Mon, Sep 22, 2008 at 5:41 AM, Stephen Frost [EMAIL PROTECTED] wrote:

 * Tom Lane ([EMAIL PROTECTED]) wrote:
  Stephen Frost [EMAIL PROTECTED] writes:
   If we were to accept the pg_attrdef approach, why aren't we
   doing a pg_attracl table instead of adding a column to pg_attribute?
 
  That's actually not an unreasonable question.  If you were to do that
  then you could attach OIDs to the attribute ACLs, which might be a nicer
  representation in pg_shdepend than you were thinking of using.

 What bugs me about this is that it comes across as poor database design -
 both of these really are attributes of a column.  We're creating
 separate tables for each so we can induce a cleaner ID for them, which
 just isn't the right approach imv.  This would also be another table to
 go deal with when a column is removed, and a less-than-obvious place to
 look for this information from the user's perspective.  It's also the
 case that the items in these tables and the columns they're attached to
 really are one-to-one; there's no many-to-one or one-to-many
 relationship between them.

That's exactly the impression I get as well :)


 At the end of the day, this approach feels like more of a kludge to me
 to keep the dependency system simple rather than making the dependency
 system support the real-world system layout, which is that columns don't
 have their own IDs.  Maybe we could approach this another way- what
 about creating a new table which is pg_attrcolids that has both
 pg_attrdef and pg_attracl rolled into it?  Then at least we're accepting
 that we need a distinct ID for columns, but keeping them in one specific
 place?  Is there a reason we would need a separate ID for each?

 It also strikes me to wonder about possible future support for
 re-ordering columns, though I don't immediately see a way to use this as
 a step towards supporting that.

Thanks,

Stephen





Re: [HACKERS] macport for libpqxx

2008-09-22 Thread Dave Page
On Sat, Sep 20, 2008 at 5:25 PM, Darren Weber
[EMAIL PROTECTED] wrote:
 Hi Dave,

 Thanks for getting back to me.  Please find attached a draft Portfile
 for libpqxx-2.6.9 (the stable version).  It's easy to read the
 Portfile to see what is going on.  I think it should work fine, but I
 would appreciate any advice about any configure options that should be
 enabled.

There's nothing much to configure from what I can see.

 I've got a problem within macports (not specific to pg or libpqxx).
 MacPorts will not locate the pg_config.sh file during the macport
 build.  I can't just modify the $PATH env because the macport build
 ignores it.  There is an internal variable called $binpath in
 macports, but it's read-only.  I can't figure out how to get the
 macport configure process to find the right pg_config.  Any help
 appreciated.

Can you do something like:

configure   { system "cd ${workpath}/${distname}-${version} && \
                ${configure.env} PATH=/foo/bar:$PATH ./configure ${configure.args}" }

Alternatively, the libpqxx docs say you can set ${PG_CONFIG} to the
path to pg_config, so perhaps you can set that in configure.env (that
sounds like the cleanest option).

-- 
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Toasted table not deleted when no out of line columns left

2008-09-22 Thread Simon Riggs

On Sun, 2008-09-21 at 12:05 -0400, Tom Lane wrote:

 ... and it goes on to point out how to force immediate space reclamation
 if you need that.  These statements apply independently of whether any
 particular value is toasted or not.
 
 The reason for this choice is that reclaiming the space immediately
 would turn DROP COLUMN from a quick operation into a slow one, as it
 would have to grovel over every row of the table looking for TOAST
 pointers.
 
  Judging from that, the toasted table
  cleanup may be part of ALTER TABLE DROP COLUMN.

I thought Hans meant cleanup, not drop?

Perhaps there is room for a function that scans a toast table to remove
unreferenced toast data? It could be done much more efficiently than the
UPDATE and VACUUM FULL technique. No need to add it into DROP COLUMN,
but that doesn't mean it shouldn't be available somewhere, somehow.

Hans is likely to write this anyway for his customer, so it seems worth
defining how it should look so we can accept it into core. VACUUM TOAST
perhaps?
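
For reference, the technique being compared against, sketched with made-up
object names:

-- Dropping the last toastable column leaves the toast data in place:
ALTER TABLE measurements DROP COLUMN raw_payload;

-- Existing workaround: rewrite every remaining row, then reclaim the space.
-- This is what takes long, heavy locks and what a targeted toast-cleanup
-- command could avoid:
UPDATE measurements SET id = id;   -- no-op update forcing new row versions
VACUUM FULL measurements;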

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel pg_restore

2008-09-22 Thread Dimitri Fontaine
On Monday 22 September 2008, Andrew Dunstan wrote:
  You'd really want the latter anyway for some cases, ie, when you don't
  want the restore trying to hog the machine.  Maybe the right form for
  the extra option is just a limit on how many connections to use.  Set it
  to one to force the exact restore order, and to other values to throttle
  how much of the machine the restore tries to eat.

 My intention is to have single-thread restore remain the default, at
 least for this go round, and have the user be able to choose
 --multi-thread=nn to specify the number of concurrent connections to use.

What about the famous make -j option?

   -j [jobs], --jobs[=jobs]
Specifies the number of jobs (commands) to run simultaneously.  If
there  is  more than one -j option, the last one is effective.  If
the -j option is given without an argument, make  will  not  limit
the number of jobs that can run simultaneously.

Regards,
-- 
dim


signature.asc
Description: This is a digitally signed message part.


Re: [HACKERS] Toasted table not deleted when no out of line columns left

2008-09-22 Thread Hans-Jürgen Schönig


On Sep 22, 2008, at 9:46 AM, Simon Riggs wrote:



On Sun, 2008-09-21 at 12:05 -0400, Tom Lane wrote:

... and it goes on to point out how to force immediate space reclamation
if you need that.  These statements apply independently of whether any
particular value is toasted or not.

The reason for this choice is that reclaiming the space immediately
would turn DROP COLUMN from a quick operation into a slow one, as it
would have to grovel over every row of the table looking for TOAST
pointers.


Judging from that, the toasted table
cleanup may be part of ALTER TABLE DROP COLUMN.


I thought Hans meant cleanup, not drop?

Perhaps there is room for a function that scans a toast table to remove
unreferenced toast data? It could be done much more efficiently than the
UPDATE and VACUUM FULL technique. No need to add it into DROP COLUMN,
but that doesn't mean it shouldn't be available somewhere, somehow.

Hans is likely to write this anyway for his customer, so it seems worth
defining how it should look so we can accept it into core. VACUUM TOAST
perhaps?




hello simon,

we definitely have to do something about this problem. VACUUM FULL is
not an option at all.
once the last text column (toastable column) is gone we definitely
have to reclaim the space.
we just cannot afford to lose hundreds of gigs of good storage because
of this missing feature.


so, to comment on tom's answer - it is not about not understanding "no";
it was more a request for how best to do it, because we have to
do it somehow.


best regards,

hans


--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: www.postgresql-support.de


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Initial prefetch performance testing

2008-09-22 Thread Greg Smith
The complicated patch I've been working with for a while now is labeled 
"sequential scan posix fadvise" in the CommitFest queue.  There are a lot 
of parts to that, going back to last December, and I've added the most 
relevant links to the September CommitFest page.

The first message there on this topic is 
http://archives.postgresql.org/message-id/[EMAIL PROTECTED] 
which is a program from Greg Stark that measures how much prefetching 
advisory information improves the overall transfer speed on a synthetic 
random read benchmark.  The idea is that you advise the OS about up to n 
requests at a time, where n goes from 1 (no prefetch at all) to 8192.  As 
n goes up, the total net bandwidth usually goes up as well.  You can 
basically divide the bandwidth at any prefetch level by the baseline (1=no 
prefetch) to get a speedup multiplier.  The program allows you to submit 
both unsorted and sorted requests, and the speedup is pretty large and 
similarly distributed (but of different magnitude) in both cases.


While not a useful PostgreSQL patch on its own, this program lets one 
figure out if the basic idea here, advise about blocks ahead of time to 
speed up the whole thing, works on a particular system without having to 
cope with a larger test.  What I have to report here are some results from 
many systems running both Linux and Solaris with various numbers of disk 
spindles.  The Linux systems use the posix fadvise call, while the Solaris 
ones use its aio library.


Using the maximum prefetch working set tested, 8192, here's the speedup 
multiplier on this benchmark for both sorted and unsorted requests using an 
8GB file:


   OS               Spindles   Unsorted X   Sorted X
1: Linux             1          2.3          2.1
2: Linux             1          1.5          1.0
3: Solaris           1          2.6          3.0
4: Linux             3          6.3          2.8
5: Linux (Stark)     3          5.3          3.6
6: Linux            10          5.4          4.9
7: Solaris*         48         16.9          9.2

Systems (1)-(3) are standard single-disk workstations with various speed 
and size disks.  (4) is a 3-disk software RAID0 (on an Areca card in JBOD 
mode).  (5) is the system Greg Stark originally reported his results on, 
which is also a 3-disk array of some sort.  (6) uses a Sun 2640 disk array 
with a 10 disk RAID0+1 setup, while (7) is a Sun Fire X4500 with 48 disks 
in a giant RAID-Z array.


The Linux systems drop the OS cache after each run; they're all running 
kernel 2.6.18 or higher, which provides that feature.  Solaris system (3) is 
using the UFS filesystem with the default tuning, which doesn't cache enough 
information for that to be necessary[1]--the results look very similar to 
the Linux case even without explicitly dropping the cache.


* For (7) the results there showed obvious caching (150MB/s), as I 
expected from Solaris's ZFS which does cache aggressively by default.  In 
order to get useful results with the server's 16GB of RAM, I increased the 
test file to 64GB, at which point the results looked reasonable.


Comparing with a prefetch working set of 256, which I eyeballed on the 
results spreadsheet I made as the best return on prefetch effort before 
improvements leveled off, the speedups looked like this:


   OS               Spindles   Unsorted X   Sorted X
1: Linux             1          2.3          2.0
2: Linux             1          1.5          0.9
3: Solaris           1          2.5          3.3
4: Linux             3          5.8          2.6
5: Linux (Stark)     3          5.6          3.7
6: Linux            10          5.7          5.1
7: Solaris          48         10.0          7.8

Observations:

-For the most part, using the fadvise/aio technique was a significant win 
even on single disk systems.  The worst result, on system (2) with sorted 
blocks, was basically break even within the measurement tolerance here: 
94% of the no prefetch rate is the worst result I saw, but all these 
bounced around about +/- 5% so I wouldn't read too much into that.  In 
every other case, there was at least a 50% speed increase even with a 
single disk.


-As Greg Stark suggested, the larger the spindle count the larger the 
speedup, and the larger the prefetch size that might make sense.  His 
suggestion to model the user GUC as effective_spindle_count looks like a 
good one.  The sequential scan fadvise implementation patch submitted uses 
the earlier preread_pages name for that parameter, which I agree seems 
less friendly.


-The Solaris aio implementation seems to perform a bit better relative to 
no prefetch than the Linux fadvise one.  I'm left wondering a bit about 
whether that's just a Solaris vs. Linux thing, in particular whether 
that's just some lucky caching on Solaris where the cache isn't completely 
cleared, or whether Linux's aio library might work better than its fadvise 
call does.


The attached 

Re: [HACKERS] Toasted table not deleted when no out of line columns left

2008-09-22 Thread Simon Riggs

On Mon, 2008-09-22 at 10:59 +0200, Hans-Jürgen Schönig wrote:
 On Sep 22, 2008, at 9:46 AM, Simon Riggs wrote:

  I thought Hans meant cleanup, not drop?

 we definitely have to do something about this problem. 

I think the issue is identifying the problem. Reading the title of the
post, I think Tom says no to *deleting* the toast table. He also says
no to cleaning the table as part of DROP COLUMN. That still leaves you
an opening for an out-of-line command/function to perform a clean,
without deleting the table completely.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Initial prefetch performance testing

2008-09-22 Thread Simon Riggs

On Mon, 2008-09-22 at 04:57 -0400, Greg Smith wrote:

 -As Greg Stark suggested, the larger the spindle count the larger the 
 speedup, and the larger the prefetch size that might make sense.  His 
 suggestion to model the user GUC as effective_spindle_count looks like a 
 good one.  The sequential scan fadvise implementation patch submitted uses 
 the earlier preread_pages name for that parameter, which I agree seems 
 less friendly.

Good news about the testing.

I'd prefer to set this as a tablespace-level storage parameter, since
that is where it would need to live when we have multiple tablespaces.
Specifically as a storage parameter, so we have the same syntax for
table-level and tablespace-level storage parameters. That would also
allow us to have tablespace-level defaults for table-level settings.
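
As a sketch of what that might look like (both the tablespace-level form and
the prefetch reloption are hypothetical here):

-- Tablespace-level default, using the same storage-parameter syntax ...
ALTER TABLESPACE archive_space SET (prefetch_depth = 256);

-- ... that tables already use for their own overrides:
ALTER TABLE archive_events SET (prefetch_depth = 64);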

prefetch_... is a much better name since it's an existing industry term.
I'm not in favour of introducing the concept of spindles, since I can
almost hear the questions about ramdisks and memory-based storage. Plus
I don't ever want to discover that the best setting for
effective_spindles is 7 (or 5) when I have 6 disks because of some
technology shift or postgres behaviour change in the future.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel pg_restore

2008-09-22 Thread Simon Riggs

On Mon, 2008-09-22 at 09:53 +0200, Dimitri Fontaine wrote:

  My intention is to have single-thread restore remain the default, at
  least for this go round, and have the user be able to choose
  --multi-thread=nn to specify the number of concurrent connections to use.
 
 What about the famous make -j option?
 
-j [jobs], --jobs[=jobs]
 Specifies the number of jobs (commands) to run simultaneously.  If
 there  is  more than one -j option, the last one is effective.  If
 the -j option is given without an argument, make  will  not  limit
 the number of jobs that can run simultaneously.

+1

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel pg_restore

2008-09-22 Thread Simon Riggs

On Sun, 2008-09-21 at 18:15 -0400, Tom Lane wrote:
 Andrew Dunstan [EMAIL PROTECTED] writes:
  I am working on getting parallel pg_restore working. I'm currently 
  getting all the scaffolding working, and hope to have a naive prototype 
  posted within about a week.
 
  The major question is how to choose the restoration order so as to 
  maximize efficiency both on the server and in reading the archive.
 
 One of the first software design principles I ever learned was to
 separate policy from mechanism.  ISTM in this first cut you ought to
 concentrate on mechanism and let the policy just be something dumb
 (but coded separately from the infrastructure).  We can refine it after
 that.

Agreed. We mustn't make too many built-in assumptions about the best way
to parallelise the restore.

For example, running all CREATE INDEX at same time may help I/O on the
scan but it may also swamp memory and force additional I/O as a result.

We might need a setting for total memory available, so pg_restore can
try not to run tasks that will exceed that across settings. Preferably
this wouldn't be just a pg_restore setting.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] fix dblink security hole

2008-09-22 Thread Tom Lane
Joe Conway [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 What do you think about getting rid of the password_from_string state
 variable?  It was always a bit of a kluge, and we don't seem to need
 it anymore with this approach.

 It is still used in PQconnectionUsedPassword(). That is still needed to 
 prevent a non-superuser from logging in as the superuser if the server 
 does not require authentication.

No, the test to see if the server actually *asked* for the password is
the important part at that end.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Toasted table not deleted when no out of line columns left

2008-09-22 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 I think the issue is identifying the problem. Reading the title of the
 post, I think Tom says no to *deleting* the toast table. He also says
 no to cleaning the table as part of DROP COLUMN. That still leaves you
 an opening for an out-of-line command/function to perform a clean,

... see CLUSTER ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] HOWTO: FK: BIGINT[] -> BIGINT (Theoretically AnyElem[] -> AnyElem)

2008-09-22 Thread Oleg Serov
(Sorry for my bad English in the comments)
Demo SQL:

BEGIN; -- Begins the magic
CREATE TABLE pk ( -- pk table
id BIGINT,
CONSTRAINT pk2_id PRIMARY KEY (id)
);

CREATE TABLE fk ( -- fk table
fk_ids BIGINT[]
);

CREATE FUNCTION bia2bi(bigint[]) RETURNS bigint -- temp type cast
AS $$
SELECT $1[array_lower($1, 1)];
$$
LANGUAGE sql;

CREATE FUNCTION bi_fk(bigint, bigint[]) RETURNS boolean -- FK check function for table pk
AS $$
BEGIN
IF (SELECT count(*) FROM pk WHERE id = ANY($2)) = (array_upper($2, 1) -
array_lower($2, 1) + 1) THEN
RETURN TRUE;
END IF;
RAISE EXCEPTION 'NO FK![%, %]', $1, $2; -- raise an error instead of scanning the remaining records
END;
$$
LANGUAGE 'plpgsql';

CREATE OPERATOR == ( -- Custom operator for calling bi_fk
PROCEDURE = bi_fk,
LEFTARG = bigint,
RIGHTARG = bigint[]);

CREATE CAST (bigint[] AS bigint) -- TEMP CAST FOR INDEX CREATION
WITH FUNCTION bia2bi(bigint[])
AS IMPLICIT;

ALTER TABLE fk -- CREATE FK
  ADD CONSTRAINT fk_id FOREIGN KEY (fk_ids)
REFERENCES pk(id)
ON DELETE NO ACTION
ON UPDATE NO ACTION
NOT DEFERRABLE;
DROP CAST (bigint[] AS bigint); -- remove the cast (recommended, but not needed)

UPDATE pg_constraint
SET conpfeqop = ARRAY(SELECT OID FROM pg_operator WHERE oprname = '==')
WHERE conname = 'fk_id'; -- swap in the custom comparison operator for the FK


INSERT INTO pk SELECT k FROM GENERATE_series(1, 10) as k; -- MAKE DATA

INSERT INTO fk VALUES(ARRAY[1,2, 3]); -- TESTING
INSERT INTO fk VALUES(ARRAY[6,3,5]); -- TESTING
INSERT INTO fk VALUES(ARRAY[6,3,45]); -- this one violates the FK
ROLLBACK; -- Magic rollback =)


Re: [HACKERS] pg_dump feature

2008-09-22 Thread Tom Lane
Naz [EMAIL PROTECTED] writes:
 ... It would be far easier if there were a mechanism in pg_dump 
 that allowed you to dump the two parts of the schema separately, 

Feel free to fix up and resubmit the incomplete patch for that
that was rejected during the last commitfest.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] fix dblink security hole

2008-09-22 Thread Joe Conway

Tom Lane wrote:

Joe Conway [EMAIL PROTECTED] writes:

Tom Lane wrote:

What do you think about getting rid of the password_from_string state
variable?  It was always a bit of a kluge, and we don't seem to need
it anymore with this approach.


It is still used in PQconnectionUsedPassword(). That is still needed to 
prevent a non-superuser from logging in as the superuser if the server 
does not require authentication.


No, the test to see if the server actually *asked* for the password is
the important part at that end.


Oh, I see that now. So yes, as far as I can tell, password_from_string 
is not used for anything anymore and should be removed.


Joe

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch: Collation support

2008-09-22 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes:
 The first thing that we see is that the COLLATE keyword means different 
 things, so it's probably best to change that into:

 CREATE DATABASE name WITH LC_COLLATE=<locale name> LC_CTYPE=<locale name>

 in the stripped-down version. Then we need a way to map the 
 stripped-down syntax into the one in the original patch. That's just a 
 matter of looking up the collation in the pg_collation catalog with the 
 right LC_COLLATE and LC_CTYPE.

It seems to me that in an installation using libc-based collation
support, the collation names are likely to be the same as allowed values
of LC_COLLATE anyway.  So inventing different keywords doesn't really
seem necessary.

What might be sensible to ask is whether it is ever actually reasonable
for LC_COLLATE and LC_CTYPE to have different settings.  If we were
willing to enforce that they be the same, we could reduce this to just
the standard syntax COLLATE=something and be done with it.  Not being
much of a user of anything except C locale, I might be the wrong person
to opine on this; but it seems to me that having them different is far
more likely to be a mistake than desirable.
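
That is, the reduced form would be just, e.g. (locale name shown only for
illustration):

CREATE DATABASE name WITH COLLATE = 'de_DE.UTF-8';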

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] fix dblink security hole

2008-09-22 Thread Tom Lane
Joe Conway [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 No, the test to see if the server actually *asked* for the password is
 the important part at that end.

 Oh, I see that now. So yes, as far as I can tell, password_from_string 
 is not used for anything anymore and should be removed.

Okay.  I just committed the patch without that change, but I'll go back
and add it.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch: Collation support

2008-09-22 Thread Heikki Linnakangas

Tom Lane wrote:

What might be sensible to ask is whether it is ever actually reasonable
for LC_COLLATE and LC_CTYPE to have different settings.  If we were
willing to enforce that they be the same, we could reduce this to just
the standard syntax COLLATE=something and be done with it.  Not being
much of a user of anything except C locale, I might be the wrong person
to opine on this; but it seems to me that having them different is far
more likely to be a mistake than desirable.


Agreed, it doesn't make much sense. I find it hard to imagine anyone 
doing that on purpose, but we have supported it at initdb time for ages.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Toasted table not deleted when no out of line columns left

2008-09-22 Thread Simon Riggs

On Mon, 2008-09-22 at 07:53 -0400, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  I think the issue is identifying the problem. Reading the title of the
  post, I think Tom says no to *deleting* the toast table. He also says
  no to cleaning the table as part of DROP COLUMN. That still leaves you
  an opening for an out-of-line command/function to perform a clean,
 
 ... see CLUSTER ...

It's possible we'd want to do this even with no indexes on a table and
we definitely might want to do it without taking lengthy locks. 

It's good that DROP COLUMN is very quick, but it's not good that it
doesn't remove the space, and there's no way to make it do that without
requiring locks to be held for long periods.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Initial prefetch performance testing

2008-09-22 Thread Hans-Jürgen Schönig


On Sep 22, 2008, at 12:02 PM, Simon Riggs wrote:



On Mon, 2008-09-22 at 04:57 -0400, Greg Smith wrote:


-As Greg Stark suggested, the larger the spindle count the larger the
speedup, and the larger the prefetch size that might make sense.  His
suggestion to model the user GUC as effective_spindle_count looks like a
good one.  The sequential scan fadvise implementation patch submitted uses
the earlier preread_pages name for that parameter, which I agree seems
less friendly.


Good news about the testing.



absolutely; we ran tests and got similar figures.
also, I/O is much more stable and steady with the patch.




I'd prefer to set this as a tablespace level storage parameter. Since
that is where it would need to live when we have multiple tablespaces.
Specifically as a storage parameter, so we have same syntax for
table-level and tablespace-level storage parameters. That would also
allow us to have tablespace-level defaults for table-level settings.




+1


prefetch_... is a much better name since it's an existing industry term.
I'm not in favour of introducing the concept of spindles, since I can
almost hear the questions about ramdisks and memory-based storage. Plus
I don't ever want to discover that the best setting for
effective_spindles is 7 (or 5) when I have 6 disks because of some
technology shift or postgres behaviour change in the future.




i would definitely avoid the use of spindles.
i totally agree with simon here. once mature SSD storage or some
in-memory stuff is available to the masses, this is not suitable anymore.
the best thing would be to simply use the parameter as it was in the
original patch.
maybe we should simply make the parameter adjustable per table and per
index. this would automatically cover 95% of all cases such as
clustered tables and so on.


many thanks and best regards,

hans

--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: www.postgresql-support.de


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Toasted table not deleted when no out of line columns left

2008-09-22 Thread Gregory Stark

Tom Lane [EMAIL PROTECTED] writes:

 Simon Riggs [EMAIL PROTECTED] writes:
 I think the issue is identifying the problem. Reading the title of the
 post, I think Tom says no to *deleting* the toast table. He also says
 no to cleaning the table as part of DROP COLUMN. That still leaves you
 an opening for an out-of-line command/function to perform a clean,

 ... see CLUSTER ...

Hmm, I wonder if this doesn't have the same problems you're describing with
the toaster. If someone has a cursor WITH HOLD against the table, they don't
get a session-level lock against the tables which fed the cursor, do they? In
which case it's possible for there to be toast pointers in the cursor which
will be expanded much later. If someone else has run CLUSTER in the intervening
time the user will get an error.
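
A sketch of the sequence in question, with made-up names:

-- Session 1: the held cursor's rows are stored, but locks are released at commit.
BEGIN;
DECLARE c CURSOR WITH HOLD FOR SELECT big_text FROM docs;
COMMIT;

-- Session 2: rewrites the table and its toast table under new relfilenodes.
CLUSTER docs USING docs_pkey;

-- Session 1: if the stored rows still hold un-detoasted pointers,
-- expanding them now could fail to find the old toast data.
FETCH ALL FROM c;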

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch: Collation support

2008-09-22 Thread Zdenek Kotala

Heikki Linnakangas napsal(a):


For anyone counting, Firebird added support for ICU more than three
years ago.


ICU is orthogonal to this patch. This patch didn't provide ICU support, 
and we could start using ICU without the catalog changes.


This patch should allow the use of both the system locales and ICU. The pg_collation 
catalog contains the comparison function that is called for string comparison, and if 
somebody creates a function which uses ICU, then ICU will be supported. That is the 
advantage of the catalog. Without it you can have the system locales or ICU, but 
probably not both.


Zdenek




--
Zdenek Kotala  Sun Microsystems
Prague, Czech Republic http://sun.com/postgresql


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Toasted table not deleted when no out of line columns left

2008-09-22 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Hmm I wonder if this doesn't have the same problems you're describing with
 the toaster. If someone has a cursor WITH HOLD against the table they don't
 get a session level lock against the tables which fed the cursor do
 they?

Hmm, interesting point.  The tuples are read out and saved in a
tuplestore before we release locks, but I'm not sure if there is
anything in that codepath that would detoast toast references.
Seems like you're right that there would need to be.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch: Collation support

2008-09-22 Thread Heikki Linnakangas

Zdenek Kotala wrote:

Heikki Linnakangas napsal(a):


For anyone counting, Firebird added support for ICU more than three
years ago.


ICU is orthogonal to this patch. This patch didn't provide ICU 
support, and we could start using ICU without the catalog changes.


This patch should allow to use both system catalog and ICU. 


Not without another patch that actually introduces ICU support. What 
that would look like, how that would be stored in the catalogs, and 
whether we want that is a whole other topic. Without that, the STRCOLFN 
part of the original patch is pointless, and I would've ripped that out 
anyway even if we decided to add the pg_collation catalog in this release.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel pg_restore

2008-09-22 Thread Andrew Dunstan



Simon Riggs wrote:

On Mon, 2008-09-22 at 09:53 +0200, Dimitri Fontaine wrote:

  

My intention is to have single-thread restore remain the default, at
least for this go round, and have the user be able to choose
--multi-thread=nn to specify the number of concurrent connections to use.
  

What about the famous make -j option?

   -j [jobs], --jobs[=jobs]
Specifies the number of jobs (commands) to run simultaneously.  If
there  is  more than one -j option, the last one is effective.  If
the -j option is given without an argument, make  will  not  limit
the number of jobs that can run simultaneously.



+1

  


If that's the preferred name I have no problem. I'm not sure about the 
default argument part, though.


First, I'm not sure our getopt infrastructure actually provides for 
optional arguments, and I am not going to remove it in pg_restore to get 
around such a problem, at least for now.


More importantly, I'm not convinced it's a good idea. It seems more like 
a footgun that will potentially try to launch thousands of simultaneous 
restore connections. I should have thought that optimal performance 
would be reached at some small multiple (say maybe 2?) of the number of 
CPUs on the server. You could achieve unlimited parallelism by saying 
something like --jobs=9, but I'd rather that were done very 
explicitly instead of as the default value of the parameter.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FSM patch - performance test

2008-09-22 Thread Zdenek Kotala

Heikki Linnakangas napsal(a):

Zdenek Kotala wrote:

Zdenek Kotala napsal(a):

Heikki Linnakangas napsal(a):

Zdenek Kotala wrote:
My conclusion is that new implementation is about 8% slower in OLTP 
workload.


Can you do some analysis of why that is?


I tested it several times, and the last test was a surprise for me. I ran 
the original server (with the old FSM) on a database which had been created 
by the new server (with the new FSM), and performance is similar (maybe the 
new implementation is a little bit better):


MQThL (Maximum Qualified Throughput LIGHT): 1348.90 tpm
MQThM (Maximum Qualified Throughput MEDIUM): 2874.76 tpm
MQThH (Maximum Qualified Throughput HEAVY): 2422.20 tpm

The question is why? There could be two reasons for that. One is 
related to the OS/FS or HW. The filesystem could be fragmented, or the HDD 
could be slower in some regions...


Ugh. Could it be autovacuum kicking in at different times? Do you get 
any other metrics than the TPM out of it.


I don't think it is an autovacuum problem. I ran the test several times and the result 
was the same. But today I created a fresh database and got similar throughput for 
the original and new FSM implementations. It seems to me that I hit a HW/OS 
anomaly. I'll verify it tomorrow.


I noticed only a slight slowdown during index creation (4:11 min vs. 
3:47 min), but I tested it only once.


Zdenek


--
Zdenek Kotala  Sun Microsystems
Prague, Czech Republic http://sun.com/postgresql


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Initial prefetch performance testing

2008-09-22 Thread Gregory Stark

Simon Riggs [EMAIL PROTECTED] writes:

 On Mon, 2008-09-22 at 04:57 -0400, Greg Smith wrote:

 -As Greg Stark suggested, the larger the spindle count the larger the 
 speedup, and the larger the prefetch size that might make sense.  His 
 suggestion to model the user GUC as effective_spindle_count looks like a 
 good one.  The sequential scan fadvise implementation patch submitted uses 
 the earlier preread_pages name for that parameter, which I agree seems 
 less friendly.

 Good news about the testing.

 I'd prefer to set this as a tablespace level storage parameter. 

Sounds like a good idea, except... what's a tablespace-level storage parameter?


 prefetch_... is a much better name since its an existing industry term.
 I'm not in favour of introducing the concept of spindles, since I can
 almost hear the questions about ramdisks and memory-based storage. Plus
 I don't ever want to discover that the best setting for
 effective_spindles is 7 (or 5) when I have 6 disks because of some
 technology shift or postgres behaviour change in the future.

In principle I quite strongly disagree with this.

Someone might very well want to set spindle_count to 6 when he actually has 7
but at least he can have an intuitive feel for what he's doing -- he's setting
it to slightly less than Postgres thinks is optimal.

The number of blocks to prefetch is an internal implementation detail for which the DBA
has absolutely no way to know the correct value. That's how we get the
cargo cult configuration tweaks we've seen in the past where people follow
recommendations with no idea what the consequences are or whether they apply.

In an ideal world we would have a half-dozen parameters to tell Postgres how
much memory is available, how many disks available, etc and Postgres would
know how best to use the resources. I think if we expose internal knobs like
you propose then we end up with hundreds of parameters and to adjust them
you'll have to be an expert in Postgres internals.

That said, there is a place for these internal knobs when we don't really know
how to best make use of resources. At this point we only have results from a
few systems and the results don't seem to jibe with the theory.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel pg_restore

2008-09-22 Thread Simon Riggs

On Mon, 2008-09-22 at 11:38 -0400, Andrew Dunstan wrote:
 
 Simon Riggs wrote:
  On Mon, 2008-09-22 at 09:53 +0200, Dimitri Fontaine wrote:
 

  My intention is to have single-thread restore remain the default, at
  least for this go round, and have the user be able to choose
  --multi-thread=nn to specify the number of concurrent connections to use.

  What about the famous make -j option?
 
 -j [jobs], --jobs[=jobs]
  Specifies the number of jobs (commands) to run simultaneously. 
   If
  there  is  more than one -j option, the last one is effective. 
   If
  the -j option is given without an argument, make  will  not  
  limit
  the number of jobs that can run simultaneously.
  
 
  +1
 

 
 If that's the preferred name I have no problem. I'm not sure about the 
 default argument part, though.
 
 First, I'm not sure out getopt infrastructure actually provides for 
 optional arguments, and I am not going to remove it in pg_restore to get 
 around such a problem, at least now.
 
 More importantly, I'm not convinced it's a good idea. It seems more like 
 a footgun that will potentially try to launch thousands of simultaneous 
 restore connections. I should have thought that optimal performance 
 would be reached at some small multiple (say maybe 2?) of the number of 
 CPUs on the server. You could achieve unlimited parallelism by saying 
 something like --jobs=9, but I'd rather that were done very 
 explicitly instead of as the default value of the parameter.

OK, sounds best.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Initial prefetch performance testing

2008-09-22 Thread Simon Riggs

On Mon, 2008-09-22 at 16:46 +0100, Gregory Stark wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
 
  On Mon, 2008-09-22 at 04:57 -0400, Greg Smith wrote:
 
  -As Greg Stark suggested, the larger the spindle count the larger the 
  speedup, and the larger the prefetch size that might make sense.  His 
  suggestion to model the user GUC as effective_spindle_count looks like a 
  good one.  The sequential scan fadvise implementation patch submitted uses 
  the earlier preread_pages name for that parameter, which I agree seems 
  less friendly.
 
  Good news about the testing.
 
  I'd prefer to set this as a tablespace level storage parameter. 
 
 Sounds, like a good idea, except... what's a tablespace level storage 
 parameter?

A storage parameter, just at tablespace level.

WITH (storage_parameter = value)

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel pg_restore

2008-09-22 Thread Joshua Drake
On Mon, 22 Sep 2008 17:24:28 +0100
Simon Riggs [EMAIL PROTECTED] wrote:

  More importantly, I'm not convinced it's a good idea. It seems more
  like a footgun that will potentially try to launch thousands of
  simultaneous restore connections. I should have thought that
  optimal performance would be reached at some small multiple (say
  maybe 2?) of the number of CPUs on the server. You could achieve
  unlimited parallelism by saying something like --jobs=9, but
  I'd rather that were done very explicitly instead of as the default
  value of the parameter.
 
 OK, sounds best.
 

I will not argue vehemently here but I will say that "jobs" doesn't
seem correct. The term "workers" seems more appropriate.

Sincerely,

Joshua D. Drake

-- 
The PostgreSQL Company since 1997: http://www.commandprompt.com/ 
PostgreSQL Community Conference: http://www.postgresqlconference.org/
United States PostgreSQL Association: http://www.postgresql.us/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel pg_restore

2008-09-22 Thread Dimitri Fontaine
On Monday 22 September 2008, Joshua Drake wrote:
 I will not argue vehemently here but I will say that "jobs" doesn't
 seem correct. The term "workers" seems more appropriate.

Mmmm, it sounds like it depends on the implementation (and whether all workers 
will share the same serializable transaction or just be independent jobs), 
but my point here is more about giving the user a name they are used to.
Like in "oh, pg_restore -j, I see, thanks".

Now, if your argument is that the make concept of a "job" does not match the 
parallel pg_restore concept of "workers", I'll simply bow to your choice: 
barring other limits, English not being my native language makes it hard 
for me to follow there ;)

Regards,
-- 
dim


signature.asc
Description: This is a digitally signed message part.


Re: [HACKERS] Initial prefetch performance testing

2008-09-22 Thread Greg Smith

On Mon, 22 Sep 2008, Simon Riggs wrote:


I'd prefer to set this as a tablespace level storage parameter.


That seems reasonable, but I'm not working at that level yet.  There are 
still larger open questions about how the buffer manager interaction 
will work here, and I'd like to have a better view of that first before 
getting into the exact syntax used to set the parameter.  For now, a GUC 
works well enough, but you're right that something finer-grained may make 
sense before this actually hits the codebase.



prefetch_... is a much better name since its an existing industry term.
I'm not in favour of introducing the concept of spindles, since I can
almost hear the questions about ramdisks and memory-based storage.


It's possible to make a case for exposing the internal number that's 
getting varied here, naming the parameter something like prefetch_depth, 
and letting people set that to whatever they want.  Based on the current 
data I might suggest a default of 256, using 0 to turn the feature off 
altogether, and a maximum of at least 8192 and possibly more.


In practice I expect there to only be a couple of popular values and the 
idea of fine-tuning is a bit questionable.  I think that's what Greg Stark 
was driving at with how the value was re-spun.  Instead of using 
effective_spindle_count, you could just as easily make a case for an enum 
like [off,low,medium,high] mapping to [0,16,256,8192].  From what I've 
seen so far, that would reduce tweaking time in the field considerably 
while not really changing the range of available behavior very much.
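
For example, the two styles of knob might look like this (both GUC names and
values are hypothetical):

SET prefetch_depth = 256;        -- raw prefetch distance, 0 disables it
SET prefetch_depth = 'medium';   -- or an enum where off/low/medium/high
                                 -- map to 0/16/256/8192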


I will be running a set of tests on a fast SSD device before I'm done; 
that's another one that I'll try once I've got the database-level tests 
ready to run.  What I expect is that it will favor 0: presumably you 
might as well just read the blocks rather than advise about them when the 
seek overhead is close to zero.  Should be able to do a RAM disk run as 
well.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Where to Host Project

2008-09-22 Thread Stefan Kaltenbrunner

Dave Page wrote:

On Sat, Sep 20, 2008 at 8:37 AM, Joshua D. Drake [EMAIL PROTECTED] wrote:

Dave Page wrote:


Well that's not strictly true - I persuaded one of the GForge
developers to work on the upgrade. As far as I'm aware, we're still
waiting for the hardware/OS platform to be sorted out after some
initial problems. I suspect JD will tell me something different though
- that being the case, perhaps we can work out the issues and get on
with the upgrade.

The machine is ready to go and as far as I know even has a jail. Stefan
would know more.


OK, cool. Stefan; what's your take on where we're at?


yeah there is a box and a jail I set up a while ago but for various 
reasons the actual migration (planning and testing) never happened.
I'm still prepared to handle the required sysadmin level work but I 
don't have time for anything more fancy right now.




Stefan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Where to Host Project

2008-09-22 Thread David E. Wheeler

On Sep 22, 2008, at 10:08, Stefan Kaltenbrunner wrote:

The machine is ready to go and as far as I know even has a jail.
Stefan would know more.

OK, cool. Stefan; what's your take on where we're at?


yeah there is a box and a jail I set up a while ago but for various
reasons the actual migration (planning and testing) never happened.
I'm still prepared to handle the required sysadmin level work but I
don't have time for anything more fancy right now.


If this upgrade happens, and I can use SVN with pgFoundry, that's  
exactly where I'll stay. That would make me happy.


Whether or not it was a good idea to get into the hosting business,  
since we do, as a community, have a hosting platform, it behooves us  
to try to keep it up-to-date. I'd be willing to give a bit of time for  
this.


But I do agree with Robert that we *should* get into the indexing  
business. This is CPAN's secret: It doesn't host anything, but  
provides a distributed index of Perl modules. What would be useful is  
to make it easy for people to add their stuff to the index; and if  
that could be automated with pgFoundry, so much the better for those  
who host there.


My $0.02. Thanks for the discussion, folks.

Best,

David

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] get_relation_stats_hook()

2008-09-22 Thread Gregory Stark

Hm, I assume we want to be able to turn on and off plugins in a running
session? I think the free_using_plugin flag:


!   if (get_relation_stats_hook)
!       vardata->statsTuple = (*get_relation_stats_hook) (rte->relid,
!                                                         var->varattno);
!
!   if (vardata->statsTuple)
!       vardata->free_using_plugin = true;
!   else
!       vardata->statsTuple = SearchSysCache(STATRELATT,

is insufficient to handle this. vardata->free_using_plugin could be true but
by the time the variable is released the plugin pointer could have been
cleared. Or worse, set to a different plugin.

The easiest way to fix this seems like also the best way: instead of storing a
boolean, store the pointer to the release function.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] allow has_table_privilege(..., 'usage') on sequences

2008-09-22 Thread Jaime Casanova
On Sun, Sep 7, 2008 at 10:55 PM, Tom Lane [EMAIL PROTECTED] wrote:
 Abhijit Menon-Sen [EMAIL PROTECTED] writes:
 (I can't help but think that the USAGE privilege is a bit unfortunate.
 If granting SELECT rights allowed currval(), INSERT allowed nextval(),
 and UPDATE allowed nextval() and setval(), then has_table_privilege()
 would have been sufficient and there would be no need to invent a new
 set of functions just to check USAGE.

 That train left the station already, and anyway you are failing to
 consider SELECT * FROM sequence, which definitely needs to have
 different privileges from nextval()/currval().


Can we say there is consensus on creating a new has_sequence_privilege()?
Abhijit, will you write it? If not, I can give it a try...
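
For concreteness, a sketch of how it might be used; has_sequence_privilege()
does not exist yet, so the signature below is an assumption modeled on
has_table_privilege():

-- The grant being tested:
GRANT USAGE ON SEQUENCE orders_id_seq TO app_user;

-- Proposed check, by analogy with has_table_privilege():
SELECT has_sequence_privilege('app_user', 'orders_id_seq', 'USAGE');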

-- 
regards,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel pg_restore

2008-09-22 Thread Simon Riggs

On Mon, 2008-09-22 at 09:30 -0700, Joshua Drake wrote:
 On Mon, 22 Sep 2008 17:24:28 +0100
 Simon Riggs [EMAIL PROTECTED] wrote:
 
   More importantly, I'm not convinced it's a good idea. It seems more
   like a footgun that will potentially try to launch thousands of
   simultaneous restore connections. I should have thought that
   optimal performance would be reached at some small multiple (say
   maybe 2?) of the number of CPUs on the server. You could achieve
   unlimited parallelism by saying something like --jobs=9, but
   I'd rather that were done very explicitly instead of as the default
   value of the parameter.
  
  OK, sounds best.
  
 
 I will not argue vehemently here but I will say that jobs doesn't
 seem correct. The term workers seems more appropriate.

Agreed, but most utilities have j free but not w, p, t or other
letters that might be synonyms.

j is at least used for exactly this purpose in other tools.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Initial prefetch performance testing

2008-09-22 Thread Simon Riggs

On Mon, 2008-09-22 at 13:06 -0400, Greg Smith wrote:

  prefetch_... is a much better name since its an existing industry term.
  I'm not in favour of introducing the concept of spindles, since I can
  almost hear the questions about ramdisks and memory-based storage.
 
 It's possible to make a case for exposing the internal number that's 
 getting varied here, naming the parameter something like prefetch_depth, 
 and letting people set that to whatever they want.  Based on the current 
 data I might suggest a default of 256, using 0 to turn the feature off 
 altogether, and a maximum of at least 8192 and possibly more.
 
 In practice I expect there to only be a couple of popular values and the 
 idea of fine-tuning is a bit questionable.  I think that's what Greg Stark 
 was driving at with how the value was re-spun.  Instead of using 
 effective_spindle_count, you could just as easily make a case for an enum 
 like [off,low,medium,high] mapping to [0,16,256,8192].  From what I've 
 seen so far, that would reduce tweaking time in the field considerably 
 while not really changing the range of available behavior very much.

Tuning Postgres I/O already involves quite a few parameters called
buffer size, segment width, stripe size, etc. I've never heard a disk
manufacturer say this is wrong and we should just have
spindle equivalents. I don't think we should dress this up too much,
that's all. We aren't going to make anybody's life any easier. But we
will probably generate lots of annoying phone calls to disk
manufacturers asking "so how many spindles is your subsystem worth in
Postgres terms?", to which they will shrug and say "no idea".

Is the behaviour of this sufficiently linear to be able to say that 3
spindles = 3 effective_spindles, 6 = 6, etc.? I would guess it won't be,
and you're left with a name more misleading than useful.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Initial prefetch performance testing

2008-09-22 Thread Ron Mayer

Gregory Stark wrote:

Simon Riggs [EMAIL PROTECTED] writes:

I'm not in favour of introducing the concept of spindles


In principle I quite strongly disagree with this.
The number of blocks to prefetch is an internal implementation detail, and the DBA
has absolutely no way to know what the correct value should be.


Even more often on the systems I see these days, spindles are
an implementation detail, and the DBA has no way to know
what the correct value is.

For example, on our sites hosted with Amazon's compute cloud (a great
place to host web sites), I know nothing about spindles, but I do know
about Amazon Elastic Block Store[2] and the Instance Store[1].   I
have some specs and am able to run benchmarks on them, but I couldn't
guess how many spindles correspond to my X% of the N-disk device.
For another example, some of our salesguys with SSD drives
have 0 spindles on their demo machines.

I'd rather have a parameter that expressed things more in terms of
measurable quantities -- perhaps seeks/second?  Perhaps
random-access vs. sequential-access times?



[1] http://www.amazon.com/gp/browse.html?node=201590011
[2] 
http://www.amazon.com/b/ref=sc_fe_c_0_201590011_1?ie=UTF8node=689343011no=201590011me=A36L942TSJ2AJA


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] get_relation_stats_hook()

2008-09-22 Thread Simon Riggs

On Mon, 2008-09-22 at 18:41 +0100, Gregory Stark wrote:

 The easiest way to fix this also seems like the best way: instead of storing a
 boolean, store a pointer to the release function.

OK, I like that better anyhow.

Hadn't thought about turning the plugin off, but I can see the requirement
now you raise it.
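
To sketch the idea with made-up names (the real hook and struct will differ):
return a release callback alongside the plugin-supplied stats, and the caller
never needs to carry a separate "came from the plugin" boolean:

/*
 * Sketch only -- illustrative names, not the actual hook code.
 */
#include <stdio.h>
#include <stdlib.h>

typedef struct PluginStats
{
    double      rows;           /* whatever the plugin provides */
} PluginStats;

typedef struct StatsLookupResult
{
    PluginStats *stats;                     /* data supplied by a plugin, or NULL */
    void        (*release)(PluginStats *);  /* how to free it; NULL means "not ours to free" */
} StatsLookupResult;

static void
release_stats(StatsLookupResult *result)
{
    /* the caller needn't know whether a plugin was involved */
    if (result->release != NULL)
        result->release(result->stats);
    result->stats = NULL;
    result->release = NULL;
}

/* a plugin's release callback might just free its own allocation */
static void
plugin_release(PluginStats *stats)
{
    free(stats);
}

int
main(void)
{
    StatsLookupResult r;

    r.stats = malloc(sizeof(PluginStats));
    r.stats->rows = 42.0;
    r.release = plugin_release;

    printf("rows estimate: %g\n", r.stats->rows);
    release_stats(&r);          /* safe whether or not a plugin supplied the stats */
    return 0;
}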

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_regress inputdir

2008-09-22 Thread Peter Eisentraut

Peter Eisentraut wrote:

Tom Lane wrote:

But I think Alvaro is worried about something
at a higher level: the regression test process as a whole has some
directory layout assumptions built into it, particularly in regards
to where to find .so's.


The only information about the location of the .so's is in the test 
files themselves, which seems reasonable, because they are created and 
installed at the same time as the .so's that they are presumably 
supposed to test.  So I see no problem here.


Here is a more involved patch that fixes all these issues.  The major 
simplification is that the input files are looked for in both the build 
tree and the source tree (like a vpath search), which allowed me to 
remove a lot of redundant makefile code.  I could also remove the 
--srcdir option but added --dlpath to address the above mentioned issue 
and changed some option defaults.  Now you can run pg_regress inside and 
outside of the build tree.  It isn't quite ready for the general public, 
but a packager that wants to adopt this can use it.  Currently, you need 
to create the directories sql, expected, and testtablespace yourself, 
when running outside the build tree.  We can attempt to sort that out 
later, but SELinux might make it difficult.
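
For what it's worth, the vpath-style lookup described above amounts to
something like this toy standalone sketch (not pg_regress's actual code;
the directory names are made up):

/*
 * Prefer a file from the build tree, fall back to the source tree.
 */
#include <stdio.h>
#include <sys/stat.h>

static int
file_exists(const char *path)
{
    struct stat st;

    return stat(path, &st) == 0;
}

/* write the first existing candidate into "found"; return 1 on success */
static int
find_input_file(const char *builddir, const char *srcdir,
                const char *name, char *found, size_t len)
{
    snprintf(found, len, "%s/%s", builddir, name);
    if (file_exists(found))
        return 1;
    snprintf(found, len, "%s/%s", srcdir, name);
    return file_exists(found);
}

int
main(void)
{
    char        path[1024];

    if (find_input_file("./build/sql", "./source/sql", "boolean.sql",
                        path, sizeof(path)))
        printf("using %s\n", path);
    else
        printf("input file not found in either tree\n");
    return 0;
}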
diff -cr -x TAGS ../cvs-pgsql/src/makefiles/pgxs.mk ./src/makefiles/pgxs.mk
*** ../cvs-pgsql/src/makefiles/pgxs.mk  2008-04-07 17:15:58.0 +0300
--- ./src/makefiles/pgxs.mk 2008-09-22 20:40:39.0 +0300
***
*** 232,254 
  # where to find psql for running the tests
  PSQLDIR = $(bindir)
  
- # When doing a VPATH build, must copy over the test .sql and .out
- # files so that the driver script can find them.  We have to use an
- # absolute path for the targets, because otherwise make will try to
- # locate the missing files using VPATH, and will find them in
- # $(srcdir), but the point here is that we want to copy them from
- # $(srcdir) to the build directory.
- 
- ifdef VPATH
- abs_builddir := $(shell pwd)
- test_files_src := $(wildcard $(srcdir)/sql/*.sql) $(wildcard 
$(srcdir)/expected/*.out) $(wildcard $(srcdir)/data/*.data)
- test_files_build := $(patsubst $(srcdir)/%, $(abs_builddir)/%, 
$(test_files_src))
- 
- all: $(test_files_build)
- $(test_files_build): $(abs_builddir)/%: $(srcdir)/%
-   ln -s $< $@
- endif # VPATH
- 
  .PHONY: submake
  submake:
  ifndef PGXS
--- 232,237 
diff -cr -x TAGS ../cvs-pgsql/src/pl/plperl/GNUmakefile 
./src/pl/plperl/GNUmakefile
*** ../cvs-pgsql/src/pl/plperl/GNUmakefile  2008-04-07 17:15:58.0 
+0300
--- ./src/pl/plperl/GNUmakefile 2008-09-22 20:29:07.0 +0300
***
*** 50,76 
  SPI.c: SPI.xs
$(PERL) $(perl_privlibexp)/ExtUtils/xsubpp -typemap 
$(perl_privlibexp)/ExtUtils/typemap $ $@
  
- # When doing a VPATH build, copy over the .sql and .out files so that the
- # test script can find them.  See comments in src/test/regress/GNUmakefile.
- ifdef VPATH
- 
- ifneq ($(PORTNAME),win32)
- abs_srcdir := $(shell cd $(srcdir) && pwd)
- abs_builddir := $(shell pwd)
- else
- abs_srcdir := $(shell cd $(srcdir) && pwd -W)
- abs_builddir := $(shell pwd -W)
- endif
- 
- test_files_src := $(wildcard $(srcdir)/sql/*.sql) $(wildcard 
$(srcdir)/expected/*.out)
- test_files_build := $(patsubst $(srcdir)/%, $(abs_builddir)/%, 
$(test_files_src))
- 
- all: $(test_files_build)
- $(test_files_build): $(abs_builddir)/%: $(srcdir)/%
-   ln -s $< $@
- 
- endif
- 
  install: all installdirs install-lib
  
  installdirs: installdirs-lib
--- 50,55 
diff -cr -x TAGS ../cvs-pgsql/src/pl/plpython/Makefile 
./src/pl/plpython/Makefile
*** ../cvs-pgsql/src/pl/plpython/Makefile   2008-04-07 17:15:58.0 
+0300
--- ./src/pl/plpython/Makefile  2008-09-22 20:30:40.0 +0300
***
*** 66,92 
  
  all: all-lib
  
- # When doing a VPATH build, copy over the .sql and .out files so that the
- # test script can find them.  See comments in src/test/regress/GNUmakefile.
- ifdef VPATH
- 
- ifneq ($(PORTNAME),win32)
- abs_srcdir := $(shell cd $(srcdir) && pwd)
- abs_builddir := $(shell pwd)
- else
- abs_srcdir := $(shell cd $(srcdir) && pwd -W)
- abs_builddir := $(shell pwd -W)
- endif
- 
- test_files_src := $(wildcard $(srcdir)/sql/*.sql) $(wildcard 
$(srcdir)/expected/*.out)
- test_files_build := $(patsubst $(srcdir)/%, $(abs_builddir)/%, 
$(test_files_src))
- 
- all: $(test_files_build)
- $(test_files_build): $(abs_builddir)/%: $(srcdir)/%
-   ln -s $< $@
- 
- endif
- 
  install: all installdirs install-lib
  
  installdirs: installdirs-lib
--- 66,71 
diff -cr -x TAGS ../cvs-pgsql/src/pl/tcl/Makefile ./src/pl/tcl/Makefile
*** ../cvs-pgsql/src/pl/tcl/Makefile2008-04-07 17:15:58.0 +0300
--- ./src/pl/tcl/Makefile   2008-09-22 20:31:13.0 +0300
***
*** 49,75 
  all: all-lib
$(MAKE) -C modules $@
  
- # When doing a VPATH build, copy over the .sql and .out files so that the
- # test script can find them.  See comments 

Re: [HACKERS] FSM patch - performance test

2008-09-22 Thread Heikki Linnakangas

Tom Lane wrote:

What this means is that if we start with next pointing at a page
without enough space (quite likely considering that we now index all
pages not only those with free space), then it is highly possible that
the search will end on a page *before* where next was.  The most trivial
case is that we have an even-numbered page with a lot of free space and
its odd-numbered successor has none --- in this case, far from spreading
out the backends, all comers will be handed back that same page!  (Until
someone reports that it's full.)  In general it seems that this behavior
will tend to concentrate the returned pages in a small area rather than
allowing them to range over the whole FSM page as was intended.


Good point.


So the bottom line is that the next addition doesn't actually work and
needs to be rethought.  It might be possible to salvage it by paying
attention to next during the descent phase and preferentially trying
to descend to the right of next; but I'm not quite sure how to make
that work efficiently, and even less sure how to wrap around cleanly
when the starting value of next is near the last slot on the page.


Yeah, I think it can be salvaged like that. See the patch I just posted 
on a separate thread.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Initial prefetch performance testing

2008-09-22 Thread Greg Smith

On Mon, 22 Sep 2008, Gregory Stark wrote:


Hm, I'm disappointed with the 48-drive array here. I wonder why it maxed out
at only 10x the bandwidth of one drive. I would expect more like 24x or more.


The ZFS RAID-Z implementation doesn't really scale that linearly.  It's 
rather hard to get the full bandwidth out of an X4500 with any single 
process, and I haven't done any filesystem tuning to improve 
things--everything is at the defaults.


I'm quite surprised Solaris doesn't support posix_fadvise -- perhaps 
it's in some other version of Solaris?


Both the systems I used were standard Solaris 10 boxes and I'm not aware 
of any changes in this area in the later OpenSolaris releases (which is 
where I'd expect something like this to change first).  The test program I 
tried failed to find #ifdef POSIX_FADV_WILLNEED, and the message I saw 
from you at 
http://archives.postgresql.org/message-id/[EMAIL PROTECTED] 
suggested you didn't find any fadvise either so I didn't look much 
further.


The above is a cue for someone from Sun to chime in on this subject.
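
For reference, the kind of check involved is tiny -- roughly this standalone
sketch (an illustration only, not the actual test program):

/*
 * Does this platform define POSIX_FADV_WILLNEED, and does posix_fadvise()
 * succeed on an ordinary file?
 */
#define _XOPEN_SOURCE 600
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int
main(void)
{
#ifdef POSIX_FADV_WILLNEED
    int         fd = open("/etc/hosts", O_RDONLY);

    if (fd >= 0)
    {
        /* hint that the first 8kB will be needed soon; 0 means success */
        int         rc = posix_fadvise(fd, 0, 8192, POSIX_FADV_WILLNEED);

        printf("posix_fadvise(WILLNEED) returned %d\n", rc);
        close(fd);
    }
#else
    printf("POSIX_FADV_WILLNEED is not defined here\n");
#endif
    return 0;
}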


I have an updated patch I'll be sending along shortly. You might want to test
with that?


Obviously I've got everything set up to test right now, and am currently 
analyzing your earlier patch and the sequential scan fork that derived 
from it.  If you've got a later version of the bitmap heap scan one as 
well, I'll replace the one I had been planning to test (your 
bitmap-preread-v9) with that one when it's available.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel pg_restore

2008-09-22 Thread Andrew Dunstan



Simon Riggs wrote:

I will not argue vehemently here but I will say that jobs doesn't
seem correct. The term workers seems more appropriate.



Agreed, but most utilities have j free but not w, p, t or other
letters that might be synonyms.

j is at least used for exactly this purpose in other tools.

  


There are in fact very few letters available, as we've been fairly 
profligate in our use of option letters in the pg_dump suite.


j and m happen to be two of those that are available.

I honestly don't have a terribly strong opinion about what it should be 
called. I can live with jobs or multi-threads.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Interval literal rounding bug(?) and patch.

2008-09-22 Thread Ron Mayer

I think it's a bug that these 3 different ways of writing 0.7 seconds
produce different results from each other on HEAD.

head=# select interval '0:0:0.7', interval '@ 0.70 secs', interval '0.7 
seconds';
  interval   |interval |interval
-+-+-
 00:00:00.70 | 00:00:00.69 | 00:00:00.69
(1 row)

The attached patch will make all of those output 00:00:00.70.

Postgres 8.3 tended to output 00:00:00.70, like this patch does, I believe
because it didn't default to HAVE_INT64_TIMESTAMP like HEAD does.   The patch
seems to pass the existing regression tests.
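
For illustration, the arithmetic at issue can be reproduced in a few lines of
standalone C (a sketch of the double-to-microseconds conversion only, not the
DecodeInterval code itself):

/*
 * 0.7 has no exact binary representation, so fval * 1000000 comes out a
 * hair under 700000; plain truncation gives 699999 microseconds, which is
 * why the last two columns above fall short of .70, while rint() gives
 * 700000.
 */
#include <math.h>
#include <stdio.h>

int
main(void)
{
    double      fval = 0.7;     /* "0.7 seconds" parsed into a double */
    long long   truncated = (long long) (fval * 1000000);
    long long   rounded = (long long) rint(fval * 1000000);

    printf("truncated: %lld  rounded: %lld\n", truncated, rounded);
    return 0;
}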

Does this seem reasonable?

   Ron


*** a/src/backend/utils/adt/datetime.c
--- b/src/backend/utils/adt/datetime.c
***
*** 2888,2894  DecodeInterval(char **field, int *ftype, int nf, int range,
  {
  	case DTK_MICROSEC:
  #ifdef HAVE_INT64_TIMESTAMP
! 		*fsec += val + fval;
  #else
  		*fsec += (val + fval) * 1e-6;
  #endif
--- 2888,2894 
  {
  	case DTK_MICROSEC:
  #ifdef HAVE_INT64_TIMESTAMP
! 		*fsec += rint(val + fval);
  #else
  		*fsec += (val + fval) * 1e-6;
  #endif
***
*** 2897,2903  DecodeInterval(char **field, int *ftype, int nf, int range,
  
  	case DTK_MILLISEC:
  #ifdef HAVE_INT64_TIMESTAMP
! 		*fsec += (val + fval) * 1000;
  #else
  		*fsec += (val + fval) * 1e-3;
  #endif
--- 2897,2903 
  
  	case DTK_MILLISEC:
  #ifdef HAVE_INT64_TIMESTAMP
! 		*fsec += rint((val + fval) * 1000);
  #else
  		*fsec += (val + fval) * 1e-3;
  #endif
***
*** 2907,2913  DecodeInterval(char **field, int *ftype, int nf, int range,
  	case DTK_SECOND:
  		tm->tm_sec += val;
  #ifdef HAVE_INT64_TIMESTAMP
! 		*fsec += fval * 1000000;
  #else
  		*fsec += fval;
  #endif
--- 2907,2913 
  	case DTK_SECOND:
  		tm->tm_sec += val;
  #ifdef HAVE_INT64_TIMESTAMP
! 		*fsec += rint(fval * 1000000);
  #else
  		*fsec += fval;
  #endif
***
*** 2932,2938  DecodeInterval(char **field, int *ftype, int nf, int range,
  			sec = fval;
  			tm->tm_sec += sec;
  #ifdef HAVE_INT64_TIMESTAMP
! 			*fsec += (fval - sec) * 1000000;
  #else
  			*fsec += fval - sec;
  #endif
--- 2932,2938 
  			sec = fval;
  			tm->tm_sec += sec;
  #ifdef HAVE_INT64_TIMESTAMP
! 			*fsec += rint((fval - sec) * 1000000);
  #else
  			*fsec += fval - sec;
  #endif
***
*** 2950,2956  DecodeInterval(char **field, int *ftype, int nf, int range,
  			sec = fval;
  			tm->tm_sec += sec;
  #ifdef HAVE_INT64_TIMESTAMP
! 			*fsec += (fval - sec) * 1000000;
  #else
  			*fsec += fval - sec;
  #endif
--- 2950,2956 
  			sec = fval;
  			tm->tm_sec += sec;
  #ifdef HAVE_INT64_TIMESTAMP
! 			*fsec += rint((fval - sec) * 1000000);
  #else
  			*fsec += fval - sec;
  #endif
***
*** 2969,2975  DecodeInterval(char **field, int *ftype, int nf, int range,
  			sec = fval;
  			tm->tm_sec += sec;
  #ifdef HAVE_INT64_TIMESTAMP
! 			*fsec += (fval - sec) * 1000000;
  #else
  			*fsec += fval - sec;
  #endif
--- 2969,2975 
  			sec = fval;
  			tm->tm_sec += sec;
  #ifdef HAVE_INT64_TIMESTAMP
! 			*fsec += rint((fval - sec) * 1000000);
  #else
  			*fsec += fval - sec;
  #endif
***
*** 2995,3001  DecodeInterval(char **field, int *ftype, int nf, int range,
  sec = fval;
  tm->tm_sec += sec;
  #ifdef HAVE_INT64_TIMESTAMP
! *fsec += (fval - sec) * 1000000;
  #else
  *fsec += fval - sec;
  #endif
--- 2995,3001 
  sec = fval;
  tm->tm_sec += sec;
  #ifdef HAVE_INT64_TIMESTAMP
! *fsec += rint((fval - sec) * 1000000);
  #else
  *fsec += fval - sec;
  #endif
***
*** 3022,3028  DecodeInterval(char **field, int *ftype, int nf, int range,
  sec = fval;
  tm->tm_sec += sec;
  #ifdef HAVE_INT64_TIMESTAMP
! *fsec += (fval - sec) * 1000000;
  #else
  *fsec += fval - sec;
  #endif
--- 3022,3028 
  sec = fval;
  tm->tm_sec += sec;
  #ifdef HAVE_INT64_TIMESTAMP
! *fsec += rint((fval - sec) * 1000000);
  #else
  *fsec += fval - sec;
  #endif

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Initial prefetch performance testing

2008-09-22 Thread Gregory Stark

Ron Mayer [EMAIL PROTECTED] writes:

 For example, on our sites hosted with Amazon's compute cloud (a great
 place to host web sites), I know nothing about spindles, but know
 about Amazon Elastic Block Store[2] and the Instance Store[1].   I
 have some specs and am able to run benchmarks on them, but I couldn't
 guess how many spindles correspond to my X% of the N-disk device.

Well I don't see how you're going to guess how much prefetching is optimal for
those environments either... 

 For another example, some of our salesguys with SSD drives
 have 0 spindles on their demo machines.

Sounds to me like you're finding it pretty intuitive. Actually you would want
1 because it can handle one request at a time. Actually if you have a
multipath array I imagine you would want to think of each interface as a
spindle because that's the bottleneck and you'll want to keep all the
interfaces busy.

 I'd rather a parameter that expressed things more in terms of
 measurable quantities -- perhaps seeks/second?  perhaps
 random-access/sequential-access times?

Well that's precisely what I'm saying. Simon et al want a parameter to control
how much prefetching to do. That's *not* a measurable quantity. I'm suggesting
effective_spindle_count which *is* a measurable quantity even if it might be a
bit harder to measure in some environments than others.

The two other quantities you describe are both currently represented by our
random_page_cost (or random_page_cost/sequential_page_cost). What we're
dealing with now is an entirely orthogonal property of your system: how many
concurrent requests can the system handle.

If you have ten spindles then you really want to send enough requests to
ensure there are ten concurrent requests being processed on ten different
drives (assuming you want each scan to make maximum use of the resources which
is primarily true in DSS but might not be true in OLTP). That's a lot more
than ten requests though because if you sent ten requests many of them would
end up on the same devices.

In theory my logic led me to think for ten drives it would be about 30.
Experiments seem to show it's more like 300-400. That discrepancy might be a
reason to put this debate aside for now anyways and expose the internal
implementation until we understand better what's going on there.

Ironically I'm pretty happy to lose this argument because EDB is interested in
rolling this into its dynamic tuning module. If there's a consensus -- by my
count three people have spoken up already which is more than usual -- then
I'll gladly concede. Anyone object to going back to preread_pages? Or should
it be prefetch_pages? prefetch_blocks? Something else?

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Initial prefetch performance testing

2008-09-22 Thread Ron Mayer

Gregory Stark wrote:

Ron Mayer [EMAIL PROTECTED] writes:

I'd rather a parameter that expressed things more in terms of
measurable quantities [...]


...What we're
dealing with now is an entirely orthogonal property of your system: how many
concurrent requests can the system handle.


Really?  I'd have thought you'd want to give the OS enough guesses
about the future that its elevator algorithms for the drive heads
don't keep seeking back-and-forth but rather do as much per sweep
across a device as they can.


Ironically I'm pretty happy to lose this argument because EDB is interested in
rolling this into its dynamic tuning module. If there's a consensus -- by my
count three people have spoken up already which is more than usual -- then
I'll gladly concede. Anyone object to going back to preread_pages? Or should
it be prefetch_pages? prefetch_blocks? Something else?


Well - as you pointed out, I'm not on their side of the debate either.
I'm not sure what a relevant measurable parameter would be so I'm not
being too helpful in the conversation either.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] fix dblink security hole

2008-09-22 Thread Tommy Gildseth

Tom Lane wrote:

Joe Conway [EMAIL PROTECTED] writes:

Tom Lane wrote:

No, the test to see if the server actually *asked* for the password is
the important part at that end.


Oh, I see that now. So yes, as far as I can tell, password_from_string 
is not used for anything anymore and should be removed.


Okay.  I just committed the patch without that change, but I'll go back
and add it.



I'm not quite sure I fully understand the consequence of this change. 
Does it basically mean that it's not possible to use .pgpass with dblink 
for authentication?
The alternative then would be to hardcode the password in your stored 
procedures, or store it in a separate table somehow?



--
Tommy Gildseth

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] get_relation_stats_hook()

2008-09-22 Thread Simon Riggs

On Mon, 2008-09-22 at 18:41 +0100, Gregory Stark wrote:

 The easiest way 

Did you have further review comments? If so, I'll wait for those before
making further mods. Thanks for the ones so far.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [PATCHES] [HACKERS] Infrastructure changes for recovery

2008-09-22 Thread Simon Riggs

On Thu, 2008-09-18 at 10:09 -0400, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  On Thu, 2008-09-18 at 09:06 -0400, Tom Lane wrote:
  Do we really need a checkpoint there at all?
 
  Timelines only change at shutdown checkpoints.
 
 Hmm.  I *think* that that is just a debugging crosscheck rather than a
 critical property.  But yeah, it would take some close investigation,
 which maybe isn't warranted if you have a less-invasive solution.

OK, new patch, version 6. Some major differences to previous patch.

* new IsRecoveryProcessingMode() in shmem
* padding in XLogCtl to ensure above call is cheap
* specific part of bgwriter shmem for passing restartpoint data
* avoid Shutdown checkpoint at end of recovery, with carefully
considered positioning of statements (beware!)
* only one new postmaster mode, PM_RECOVERY
* bgwriter changes state without stopping/starting

Modes I have tested so far
* make check
* Start, Stop
* Crash Recovery
* Archive Recovery
* Archive Recovery, switch in middle of restartpoint

Modes not yet tested
* EXEC_BACKEND

Ready for serious review prior to commit. I will be performing further
testing also.

 backend/access/transam/multixact.c |2 
 backend/access/transam/xlog.c  |  328 ---
 backend/postmaster/bgwriter.c  |  371 +---!
 backend/postmaster/postmaster.c|   62 !!
 backend/storage/buffer/README  |5 
 backend/storage/buffer/bufmgr.c|   34 +!!
 include/access/xlog.h  |   14 !
 include/access/xlog_internal.h |3 
 include/catalog/pg_control.h   |2 
 include/postmaster/bgwriter.h  |2 
 include/storage/bufmgr.h   |2 
 include/storage/pmsignal.h |1 
 12 files changed, 279 insertions(+), 56 deletions(-), 491 mods(!)

There are a few subtle points along the way. I've tried to explain them
all in code comments, but questions welcome. At v6, most things are now
done a particular way for a specific reason.

Look especially at InRecovery, which is used extensively in different
parts of the code. The meaning of this has been subdivided into two
meanings, so only *some* of the places that use it have been changed.
All have been checked.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support
Index: src/backend/access/transam/multixact.c
===
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/access/transam/multixact.c,v
retrieving revision 1.28
diff -c -r1.28 multixact.c
*** src/backend/access/transam/multixact.c	1 Aug 2008 13:16:08 -	1.28
--- src/backend/access/transam/multixact.c	22 Sep 2008 19:28:56 -
***
*** 1543,1549 
  	 * SimpleLruTruncate would get confused.  It seems best not to risk
  	 * removing any data during recovery anyway, so don't truncate.
  	 */
! 	if (!InRecovery)
  		TruncateMultiXact();
  
  	TRACE_POSTGRESQL_MULTIXACT_CHECKPOINT_DONE(true);
--- 1543,1549 
  	 * SimpleLruTruncate would get confused.  It seems best not to risk
  	 * removing any data during recovery anyway, so don't truncate.
  	 */
! 	if (!IsRecoveryProcessingMode())
  		TruncateMultiXact();
  
  	TRACE_POSTGRESQL_MULTIXACT_CHECKPOINT_DONE(true);
Index: src/backend/access/transam/xlog.c
===
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/access/transam/xlog.c,v
retrieving revision 1.317
diff -c -r1.317 xlog.c
*** src/backend/access/transam/xlog.c	11 Aug 2008 11:05:10 -	1.317
--- src/backend/access/transam/xlog.c	22 Sep 2008 21:30:24 -
***
*** 119,124 
--- 119,125 
  
  /* Are we doing recovery from XLOG? */
  bool		InRecovery = false;
+ bool		reachedSafeStopPoint = false;
  
  /* Are we recovering using offline XLOG archives? */
  static bool InArchiveRecovery = false;
***
*** 131,137 
  static bool recoveryTarget = false;
  static bool recoveryTargetExact = false;
  static bool recoveryTargetInclusive = true;
- static bool recoveryLogRestartpoints = false;
  static TransactionId recoveryTargetXid;
  static TimestampTz recoveryTargetTime;
  static TimestampTz recoveryLastXTime = 0;
--- 132,137 
***
*** 286,295 
--- 286,297 
  /*
   * Total shared-memory state for XLOG.
   */
+ #define	XLOGCTL_BUFFER_SPACING	128
  typedef struct XLogCtlData
  {
  	/* Protected by WALInsertLock: */
  	XLogCtlInsert Insert;
+ 	char	InsertPadding[XLOGCTL_BUFFER_SPACING - sizeof(XLogCtlInsert)];
  
  	/* Protected by info_lck: */
  	XLogwrtRqst LogwrtRqst;
***
*** 297,305 
--- 299,314 
  	uint32		ckptXidEpoch;	/* nextXID  epoch of latest checkpoint */
  	TransactionId ckptXid;
  	XLogRecPtr	asyncCommitLSN; /* LSN of newest async commit */
+ 	/* add data structure padding for above info_lck declarations */
+ 	char	InfoPadding[XLOGCTL_BUFFER_SPACING - sizeof(XLogwrtRqst) 
+ - 

Re: [HACKERS] [patch] fix dblink security hole

2008-09-22 Thread Joe Conway

Tommy Gildseth wrote:

Tom Lane wrote:

Okay.  I just committed the patch without that change, but I'll go back
and add it.


I'm not quite sure I fully understand the consequence of this change. 
Does it basically mean that it's not possible to use .pgpass with dblink 
for authentication?


It only applies to 8.4 (which is not yet released) and beyond.

dblink will still work as before for superusers.

The alternative then would be to hardcode the password in your stored 
procedures, or store it in a separate table somehow?


Trusted non-superusers can be granted permission to use dblink_connect_u().

Joe

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposed patch: make SQL interval-literal syntax work per spec

2008-09-22 Thread Bruce Momjian
Tom Lane wrote:
  There is still some unfinished business if anyone wants to make it
  really exactly 100% spec compliant ...
  
  I agree.
 
 I committed the patch as presented, and I think I might go take a quick

Tom, which Interval TODO items did you complete with this patch?

http://wiki.postgresql.org/wiki/Todo#Dates_and_Times

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] pg_type.h regression?

2008-09-22 Thread Greg Sabino Mullane
Looks like the box-array semicolon got changed to a comma at some point -
attached patch changes it back (\054 to \073)

-- 
Greg Sabino Mullane [EMAIL PROTECTED]
PGP Key: 0x14964AC8 20080927
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
Index: src/include/catalog/pg_type.h
===
RCS file: /projects/cvsroot/pgsql/src/include/catalog/pg_type.h,v
retrieving revision 1.199
diff -c -r1.199 pg_type.h
*** src/include/catalog/pg_type.h	30 Jul 2008 19:35:13 -	1.199
--- src/include/catalog/pg_type.h	23 Sep 2008 02:26:35 -
***
*** 446,452 
  DATA(insert OID = 1017 (  _point	 PGNSP PGUID -1 f b A f t \054 0 600 0 array_in array_out array_recv array_send - - - d x f 0 -1 0 _null_ _null_ ));
  DATA(insert OID = 1018 (  _lseg		 PGNSP PGUID -1 f b A f t \054 0 601 0 array_in array_out array_recv array_send - - - d x f 0 -1 0 _null_ _null_ ));
  DATA(insert OID = 1019 (  _path		 PGNSP PGUID -1 f b A f t \054 0 602 0 array_in array_out array_recv array_send - - - d x f 0 -1 0 _null_ _null_ ));
! DATA(insert OID = 1020 (  _box		 PGNSP PGUID -1 f b A f t \054 0 603 0 array_in array_out array_recv array_send - - - d x f 0 -1 0 _null_ _null_ ));
  DATA(insert OID = 1021 (  _float4	 PGNSP PGUID -1 f b A f t \054 0 700 0 array_in array_out array_recv array_send - - - i x f 0 -1 0 _null_ _null_ ));
  #define FLOAT4ARRAYOID 1021
  DATA(insert OID = 1022 (  _float8	 PGNSP PGUID -1 f b A f t \054 0 701 0 array_in array_out array_recv array_send - - - d x f 0 -1 0 _null_ _null_ ));
--- 446,452 
  DATA(insert OID = 1017 (  _point	 PGNSP PGUID -1 f b A f t \054 0 600 0 array_in array_out array_recv array_send - - - d x f 0 -1 0 _null_ _null_ ));
  DATA(insert OID = 1018 (  _lseg		 PGNSP PGUID -1 f b A f t \054 0 601 0 array_in array_out array_recv array_send - - - d x f 0 -1 0 _null_ _null_ ));
  DATA(insert OID = 1019 (  _path		 PGNSP PGUID -1 f b A f t \054 0 602 0 array_in array_out array_recv array_send - - - d x f 0 -1 0 _null_ _null_ ));
! DATA(insert OID = 1020 (  _box		 PGNSP PGUID -1 f b A f t \073 0 603 0 array_in array_out array_recv array_send - - - d x f 0 -1 0 _null_ _null_ ));
  DATA(insert OID = 1021 (  _float4	 PGNSP PGUID -1 f b A f t \054 0 700 0 array_in array_out array_recv array_send - - - i x f 0 -1 0 _null_ _null_ ));
  #define FLOAT4ARRAYOID 1021
  DATA(insert OID = 1022 (  _float8	 PGNSP PGUID -1 f b A f t \054 0 701 0 array_in array_out array_recv array_send - - - d x f 0 -1 0 _null_ _null_ ));


signature.asc
Description: PGP signature


Re: [HACKERS] [patch] fix dblink security hole

2008-09-22 Thread Tom Lane
Joe Conway [EMAIL PROTECTED] writes:
 Tommy Gildseth wrote:
 I'm not quite sure I fully understand the consequence of this change. 
 Does it basically mean that it's not possible to use .pgpass with dblink 
 for authentication?

 It only applies to 8.4 (which is not yet released) and beyond.
 dblink will still work as before for superusers.

The visible, documented behavior actually is not any different from what
it's been in recent PG releases.  This change only plugs a possible
security issue that we weren't aware of before, ie, that dblink might
send a password to a server before failing the connect attempt.  It will
fail the connect attempt either way, though, so no functionality
changes.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposed patch: make SQL interval-literal syntax work per spec

2008-09-22 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Tom, which Interval TODO items did you complete with this patch?
   http://wiki.postgresql.org/wiki/Todo#Dates_and_Times

I think we've at least mostly fixed

* Support ISO INTERVAL syntax if units cannot be determined from the string, 
and are supplied after the string

* Add support for year-month syntax, INTERVAL '50-6' YEAR TO MONTH

There might be a few glitches left but they are at much smaller grain
than the TODO is talking about.


... while I'm looking: I am not sure that I think either of these TODO
items are sane or standards-compliant:

* Interpret INTERVAL '1 year' MONTH as CAST (INTERVAL '1 year' AS INTERVAL 
MONTH), and this should return '12 months'

* Support precision, CREATE TABLE foo (a INTERVAL MONTH(3))

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_type.h regression?

2008-09-22 Thread Tom Lane
Greg Sabino Mullane [EMAIL PROTECTED] writes:
 Looks like the box-array semicolon got changed to a comma at some point -
 attached patch changes it back (\054 to \073)

[ scratches head... ]  I seem to have done that in rev 1.198, but I
don't recall why.  It's late here though ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FSM, now without WAL-logging

2008-09-22 Thread Simon Riggs

On Mon, 2008-09-22 at 20:43 +0300, Heikki Linnakangas wrote:

 Attached is a revamped version of the FSM rewrite. WAL-logging is now 
 gone. Any inconsistencies between levels of the FSM are fixed during 
 vacuum, and by searchers when they run into a dead end because of a 
 discrepancy. Corruption within FSM pages is likewise fixed by vacuum and 
 searchers.
 
 The FSM in a warm standby gets updated by replay of heap CLEAN WAL 
 records. That means that the FSM will reflect the situation after the 
 last vacuum, which is better than what we have now, but not quite 
 up-to-date. I'm worried that this might not be enough...

I hadn't realised you would remove it completely. Did you identify WAL
as the bottleneck?

Is there some mid-way point between every time and almost never
(VACUUM!)? 

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel pg_restore

2008-09-22 Thread Joshua D. Drake

Andrew Dunstan wrote:

There are in fact very few letters available, as we've been fairly 
profligate in our use of option letters in the pg_dump suite.


j and m happen to be two of those that are available.


--max-workers

Max makes sense because the number of active workers won't be constant; a 
worker may not have a job to do. It is also consistent with 
autovacuum_max_workers.


Sincerely,

Joshua D. Drake


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers