Re: [HACKERS] Anyone for SSDs?

2010-12-11 Thread Vaibhav Kaushal
On Fri, 2010-12-10 at 18:07 -0800, Josh Berkus wrote: 
 On 12/10/10 5:06 PM, Daniel Loureiro wrote:
  An quicksort method in
  sequential disk its just awful to be thinking in a non SSD world, but
  its possible in an SSD.
 
 So, code it.  Shouldn't be hard to write a demo comparison.  I don't
 believe that SSDs make quicksort-on-disk feasible, but would be happy to
 be proven wrong.

I too do not believe it in normal case. However, considering the 'types'
of SSDs, it may be feasible! Asking for 'the next page and getting it'
has a time delay in the process. While on a regular HDD with spindles,
the question is where is that page located, with SSDs, the question
disappears, because the access time is uniform in case of SSDs. Also,
the access time is about 100 times fasterm which would change quite a
few things about the whole process.

I would like to do that, but I do not have a SSD on my machine! :( Would
it be impractical to try it for me?

May be some of those I meet on the chat, and are into the enterprise may
do it, but I would like to be a part of it.

-Vaibhav (*_*) 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Anyone for SSDs?

2010-12-11 Thread Daniel Loureiro
 You can believe whatever you want, that doesn't make it true.
completely agree. Like yours, Its just my point of view, not the reality.

I agree with most points here, but I wondering how many good ideas are
killed with the thought: this will be a performance killer with so
many random access, lets discarded it. If in 80's the sequential
access has more cost compared with random access (ok, there's not the
SSD case), will be the PostgreSQL in the same design that it have
nowadays ?

--
Daniel Loureiro.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [COMMITTERS] pgsql: Use symbolic names not octal constants for file permission flags

2010-12-11 Thread Magnus Hagander
This broke the buildfarm on Windows, and I'm not sure of the best way to fix it.

We currently define read and write permissions in port/win32.h
specifically for windows. A quick-fix to just add these new ones as
aliases won't work, because they are used in both open and umask
calls.

Since umask() turns off permissions, they can't be defined to the
same, or all files are created readonly. For umask, it would work to
define them to 0. But for open() and other such calls that turn on
permissions, it needs to be defined to the same value as read
permissions for user.

Not sure of the best way to fix this. Perhaps we need to invent
PG_UMASK_xyz macros?

//Magnus

On Fri, Dec 10, 2010 at 23:35, Tom Lane t...@sss.pgh.pa.us wrote:
 Use symbolic names not octal constants for file permission flags.

 Purely cosmetic patch to make our coding standards more consistent ---
 we were doing symbolic some places and octal other places.  This patch
 fixes all C-coded uses of mkdir, chmod, and umask.  There might be some
 other calls I missed.  Inconsistency noted while researching tablespace
 directory permissions issue.

 Branch
 --
 master

 Details
 ---
 http://git.postgresql.org/gitweb?p=postgresql.git;a=commitdiff;h=04f4e10cfc158239ca00a6ed6a84428c7acb1e6d

 Modified Files
 --
 src/backend/access/transam/xlog.c   |    2 +-
 src/backend/commands/copy.c         |    2 +-
 src/backend/commands/tablespace.c   |    2 +-
 src/backend/libpq/be-fsstubs.c      |    7 ---
 src/backend/postmaster/postmaster.c |    4 ++--
 src/backend/postmaster/syslogger.c  |    6 +++---
 src/backend/storage/file/copydir.c  |    2 +-
 src/backend/storage/ipc/ipc.c       |    4 ++--
 src/bin/initdb/initdb.c             |   16 
 src/bin/pg_ctl/pg_ctl.c             |    2 +-
 10 files changed, 24 insertions(+), 23 deletions(-)


 --
 Sent via pgsql-committers mailing list (pgsql-committ...@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-committers




-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Final(?) proposal for wal_sync_method changes

2010-12-11 Thread Magnus Hagander
On Wed, Dec 8, 2010 at 15:40, Tom Lane t...@sss.pgh.pa.us wrote:
 Magnus Hagander mag...@hagander.net writes:
 On Wed, Dec 8, 2010 at 02:07, Tom Lane t...@sss.pgh.pa.us wrote:
 [ win32.h says ]
 #define fsync(fd) _commit(fd)

 What this means is that switching to a simple preference order
 fdatasync, then fsync will result in choosing fsync on Windows (since
 it hasn't got fdatasync), meaning _commit, meaning Windows users see
 a behavioral change after all.

 _commit() != fsync()

 Um, the macro quoted above makes them the same, no?  One of us
 is confused.

Uh, yeah. Sorry, that was the unclear:ness from being too preoccupied
with the conference.. Pretty sure I'm the confused one.
.
_commit() is definitely the same as fsync() on the API level. And this
correspond to fsync_writethrough, not fsync, when you talk about the
wal_sync_method parameter. It will always sync through the write
cache, even if it's hardware BBU'ed cache.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: auxiliary functions for record type

2010-12-11 Thread Florian Pflug
On Dec11, 2010, at 06:20 , Pavel Stehule wrote:
 I wrote a few functions for record type - record_expand,
 record_get_fields, record_get_field, record_set_fields.

Just FYI, I've created something similar a while ago. The code can be found at
https://github.com/fgp/pg_record_inspect

The main difference seems to be that my code allows you to retrieve fields from 
a record without casting it to text. As a consequence, however, it lacks the 
set-returning functions that your's provides. It also doesn't provide a way to 
modify fields.

Still, maybe we could join efforts here? I certain would *love* to see 
something like this either as a contrib module or in core.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PS display and standby query conflict

2010-12-11 Thread Simon Riggs
On Thu, 2010-12-09 at 22:13 +0900, Fujii Masao wrote:

 When I created the conflict between recovery and many read-only
 transactions in the standby server for test purpose, I found that the
 keyword waiting disappeared from PS display for just a moment
 even though the conflict had not been resolved yet. This seems
 strange to me.
 
 This problem happens because ResolveRecoveryConflictWithVirtualXIDs
 resets PS display for each read-only transactions that recovery
 waits for. Why do we need to reset that each time even though
 the conflict has not been resolved yet? The attached patch
 suppresses such a needless reset. Comments?

The reset occurs at most each 500ms, so not much problem there.

But if it annoys you, it seems OK to change it. Don't see a reason to backpatch 
though?

 BTW, ResolveRecoveryConflictWithVirtualXIDs calls
 pgstat_report_waiting(), which is also needless since the startup
 process doesn't have the shared memory entry (i.e., MyBEEntry)
 for pg_stat_activity. The attached patch removes that call.

IIRC that wasn't added by me, so not sure why its there. Not harming anything 
either.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] libpq changes for synchronous replication

2010-12-11 Thread Robert Haas
On Mon, Dec 6, 2010 at 12:54 AM, Fujii Masao masao.fu...@gmail.com wrote:
 On Mon, Dec 6, 2010 at 3:07 AM, Greg Smith g...@2ndquadrant.com wrote:
 The one time this year top-posting seems appropriate...this patch seems
 stalled waiting for some sort of response to the concerns Alvaro raised
 here.

 Sorry for the delay. I didn't have the time.

 I gave this a look.  It seems good, but I'm not sure about this bit:

 Thanks for the review!

 I guess this was OK when this was conceived as CopyXlog, but since it's
 now a generic mechanism, this seems a bit unwise.  Should this be
 reconsidered so that it's possible to change the format or number of
 columns?

 I changed CopyBothResponse message so that it includes the format
 and number of columns of copy data. Please see the attached patch.

 (The paragraph added to the docs is also a bit too specific about this
 being used exclusively in streaming replication, ISTM)

 Yes. But it seems difficult to generalize the docs more because currently
 only SR uses Copy-both. So I had to write that, for example, the condition
 to get into the state is only START REPLICATION command.

 While modifying the code, it occurred to me that we might have to add new
 ExecStatusType like PGRES_COPY_BOTH and use that for CopyBoth mode,
 for the sake of consistency. But since it's just alias of PGRES_COPY_BOTH
 for now, i.e., there is no specific behavior for that ExecStatusType, I
 don't
 think that it's worth adding that yet.


 I'm not so sure about this.  If we think that it's worth adding a new
 possible state, we should do so now; we will not be able to change this
 behavior later.

 OK. I added that new state.

Committed with just a few changes to the documentation.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: auxiliary functions for record type

2010-12-11 Thread Pavel Stehule
2010/12/11 Florian Pflug f...@phlo.org:
 On Dec11, 2010, at 06:20 , Pavel Stehule wrote:
 I wrote a few functions for record type - record_expand,
 record_get_fields, record_get_field, record_set_fields.

 Just FYI, I've created something similar a while ago. The code can be found at
 https://github.com/fgp/pg_record_inspect

 The main difference seems to be that my code allows you to retrieve fields 
 from a record without casting it to text. As a consequence, however, it lacks 
 the set-returning functions that your's provides. It also doesn't provide a 
 way to modify fields.

Casting to text is necessary for PL/pgSQL. I am not happy from this,
but there are not other way than using a common type - text - because
you don't know a target type.


 Still, maybe we could join efforts here? I certain would *love* to see 
 something like this either as a contrib module or in core.


I'll look on your module,

Regards

Pavel Stehule

 best regards,
 Florian Pflug



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: auxiliary functions for record type

2010-12-11 Thread Florian Pflug
On Dec11, 2010, at 16:03 , Pavel Stehule wrote:
 2010/12/11 Florian Pflug f...@phlo.org:
 On Dec11, 2010, at 06:20 , Pavel Stehule wrote:
 I wrote a few functions for record type - record_expand,
 record_get_fields, record_get_field, record_set_fields.
 
 Just FYI, I've created something similar a while ago. The code can be found 
 at
 https://github.com/fgp/pg_record_inspect
 
 The main difference seems to be that my code allows you to retrieve fields 
 from a record without casting it to text. As a consequence, however, it 
 lacks the set-returning functions that your's provides. It also doesn't 
 provide a way to modify fields.
 
 Casting to text is necessary for PL/pgSQL. I am not happy from this,
 but there are not other way than using a common type - text - because
 you don't know a target type.


I use the anyarray/anyelement machinery to cheat there, at least a bit. My 
function fieldvalue() returns anyelement and takes a parameter defval of type 
anyelement, which serves two purposes.

First, by virtue of the anyelement machinery, the return type of fieldvalue() 
is that of defval. If the actual type of the requested field matches that 
type, the value is returned. If they don't match, the parameter coerce 
decided whether fieldvalue() tries to cast the value to the requested type, or 
simply raises an error.

Second, to also give the *value*, not only the *type* of defval a meaning, it 
serves as the default return value. If requested field contains NULL, 
defvalue is returned instead. You are, of course, free to pass NULL for 
defvalue itself to turn that mapping into a NOP.

Note that the returned value's type is always the same as defval's type, so 
the whole thing is perfectly type-safe from the point of view of the rest of 
the system.

As long as you know all possible types than can appear in your record's fields, 
you can do in Pl/PgSQL something along the line of 
declare
v_value_type1 type1;
v_value_type2 type2;
...
v_value_typeN typeN;
begin
for v_field in select * from fieldinfos(myrec) loop
case
when v_field.fieldtype = 'type1'::regtype then
v_value_type1 := fieldvalue(myrec, NULL::type1, 
false);
Do something with v_value_type1
...
when v_field.fieldtype = 'typeN'::regtype then
v_value_typeN := fieldvalue(myrec, NULL::typeN, 
false);
Do something with v_value_typeN
else raise exception 'Unexpected type % in record %', 
v_field.fieldtype, myrec;
end case;
end loop;
end;

It works pretty well for me...

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Use symbolic names not octal constants for file permission flags

2010-12-11 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 This broke the buildfarm on Windows, and I'm not sure of the best way to fix 
 it.
 We currently define read and write permissions in port/win32.h
 specifically for windows. A quick-fix to just add these new ones as
 aliases won't work, because they are used in both open and umask
 calls.

Hm, those symbols are already in use elsewhere in the code; I would
assume it's just a matter of missing #includes in these particular
files.  Where does Windows define 'em?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: auxiliary functions for record type

2010-12-11 Thread Dmitriy Igrishin
Hey Pavel,

Everything looks fine, but I propose you to add record_set_field
just for convenience and as opposite of record_get_field.

2010/12/11 Pavel Stehule pavel.steh...@gmail.com

 Hello

 I wrote a few functions for record type - record_expand,
 record_get_fields, record_get_field, record_set_fields.

 A usage of this functions is described in my blog
 http://okbob.blogspot.com/2010/12/iteration-over-record-in-plpgsql.html

 Do you think, so these functions can be in core? These are relative
 strong and enough general with zero dependency. Sure, these functions
 are not defined in ANSI SQL.

 Regards

 Pavel Stehule

 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers




-- 
// Dmitriy.


Re: [HACKERS] create tablespace fails silently, or succeeds improperly

2010-12-11 Thread Tom Lane
Stephen Frost sfr...@snowman.net writes:
 * Robert Haas (robertmh...@gmail.com) wrote:
 On Fri, Dec 10, 2010 at 10:33 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Anybody have a problem with adopting this behavior?

 Seems a bit surprising.

 Yeahh..  I'm not really sure about mkdir -p type actions from a SQL
 command.  Not entirely sure why but it doesn't feel 'right' to me.  I'd
 rather have PG complain that directory doesn't exist.

OK.  Is there any value in doing mkdir -p in WAL-recovery execution of
CREATE TABLESPACE but not regular execution?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Use symbolic names not octal constants for file permission flags

2010-12-11 Thread Tom Lane
I wrote:
 Magnus Hagander mag...@hagander.net writes:
 This broke the buildfarm on Windows, and I'm not sure of the best way to fix 
 it.

 Hm, those symbols are already in use elsewhere in the code; I would
 assume it's just a matter of missing #includes in these particular
 files.  Where does Windows define 'em?

Ah, I have a theory: fcntl.h.  Seems that ancient Unix specs say
S_IRUSR etc should be defined there, rather than sys/stat.h which
is the modern expectation.  Count on M$ to find creative ways of being
incompatible ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Use symbolic names not octal constants for file permission flags

2010-12-11 Thread Magnus Hagander
On Sat, Dec 11, 2010 at 17:13, Tom Lane t...@sss.pgh.pa.us wrote:
 I wrote:
 Magnus Hagander mag...@hagander.net writes:
 This broke the buildfarm on Windows, and I'm not sure of the best way to 
 fix it.

 Hm, those symbols are already in use elsewhere in the code; I would
 assume it's just a matter of missing #includes in these particular
 files.  Where does Windows define 'em?

 Ah, I have a theory: fcntl.h.  Seems that ancient Unix specs say
 S_IRUSR etc should be defined there, rather than sys/stat.h which
 is the modern expectation.  Count on M$ to find creative ways of being
 incompatible ...

Nope, not there. I can't find S_IWGRP in any of the files.

But that symbol, OTOH, is *not* in use anywhere else in the code.
(only in zic.c, but it's ifdef'ed out on win32)

I guess I need to go look for each individual one that breaks and
split this into sub-problems.
-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Use symbolic names not octal constants for file permission flags

2010-12-11 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 On Sat, Dec 11, 2010 at 17:13, Tom Lane t...@sss.pgh.pa.us wrote:
 Ah, I have a theory: fcntl.h.

 Nope, not there. I can't find S_IWGRP in any of the files.

Well, I notice that copydir.c compiled before, and still compiles after,
despite this change:

-   if (mkdir(todir, S_IRUSR | S_IWUSR | S_IXUSR) != 0)
+   if (mkdir(todir, S_IRWXU) != 0)

I think the reason it's not failing is that it includes fcntl.h.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Use symbolic names not octal constants for file permission flags

2010-12-11 Thread Magnus Hagander
On Sat, Dec 11, 2010 at 17:23, Tom Lane t...@sss.pgh.pa.us wrote:
 Magnus Hagander mag...@hagander.net writes:
 On Sat, Dec 11, 2010 at 17:13, Tom Lane t...@sss.pgh.pa.us wrote:
 Ah, I have a theory: fcntl.h.

 Nope, not there. I can't find S_IWGRP in any of the files.

 Well, I notice that copydir.c compiled before, and still compiles after,
 despite this change:

 -   if (mkdir(todir, S_IRUSR | S_IWUSR | S_IXUSR) != 0)
 +   if (mkdir(todir, S_IRWXU) != 0)

 I think the reason it's not failing is that it includes fcntl.h.

S_IRWXU is defined in port/win32.h...

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Use symbolic names not octal constants for file permission flags

2010-12-11 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 On Sat, Dec 11, 2010 at 17:23, Tom Lane t...@sss.pgh.pa.us wrote:
 I think the reason it's not failing is that it includes fcntl.h.

 S_IRWXU is defined in port/win32.h...

No, it isn't.  There's an apparently-useless definition of _S_IRWXU
there, but no S_IRWXU.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Use symbolic names not octal constants for file permission flags

2010-12-11 Thread Magnus Hagander
On Sat, Dec 11, 2010 at 17:38, Tom Lane t...@sss.pgh.pa.us wrote:
 Magnus Hagander mag...@hagander.net writes:
 On Sat, Dec 11, 2010 at 17:23, Tom Lane t...@sss.pgh.pa.us wrote:
 I think the reason it's not failing is that it includes fcntl.h.

 S_IRWXU is defined in port/win32.h...

 No, it isn't.  There's an apparently-useless definition of _S_IRWXU
 there, but no S_IRWXU.

Hmm. You're right, of course.

A search on my windows box finds the text string S_IRWXU in the
following *.h files across the whole filesystem:
c:\perl\lib\CORE\perl.h
c:\perl64\lib\CORE\perl.h
c:\pgsql\src\include\pg_config_os.h
c:\pgsql\src\include\port\win32.h

that's it.



-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Use symbolic names not octal constants for file permission flags

2010-12-11 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 On Sat, Dec 11, 2010 at 17:38, Tom Lane t...@sss.pgh.pa.us wrote:
 No, it isn't.  There's an apparently-useless definition of _S_IRWXU
 there, but no S_IRWXU.

 Hmm. You're right, of course.

 A search on my windows box finds the text string S_IRWXU in the
 following *.h files across the whole filesystem:
 c:\perl\lib\CORE\perl.h
 c:\perl64\lib\CORE\perl.h
 c:\pgsql\src\include\pg_config_os.h
 c:\pgsql\src\include\port\win32.h

 that's it.

OK, now I'm really confused.  We have at least two questions:

1. How did all those pre-existing references to S_IRXWU compile?

2. Why didn't the previously hard-wired constants passed to chmod
and umask fail on Windows?  The M$ documentation I can find at the
moment suggests that *only* _S_IREAD and _S_IWRITE bits are allowed
in the inputs to those functions, which apparently is untrue or none
of this code would have executed successfully.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: auxiliary functions for record type

2010-12-11 Thread Pavel Stehule
2010/12/11 Florian Pflug f...@phlo.org:
 On Dec11, 2010, at 16:03 , Pavel Stehule wrote:
 2010/12/11 Florian Pflug f...@phlo.org:
 On Dec11, 2010, at 06:20 , Pavel Stehule wrote:
 I wrote a few functions for record type - record_expand,
 record_get_fields, record_get_field, record_set_fields.

 Just FYI, I've created something similar a while ago. The code can be found 
 at
 https://github.com/fgp/pg_record_inspect

 The main difference seems to be that my code allows you to retrieve fields 
 from a record without casting it to text. As a consequence, however, it 
 lacks the set-returning functions that your's provides. It also doesn't 
 provide a way to modify fields.

 Casting to text is necessary for PL/pgSQL. I am not happy from this,
 but there are not other way than using a common type - text - because
 you don't know a target type.


 I use the anyarray/anyelement machinery to cheat there, at least a bit. My 
 function fieldvalue() returns anyelement and takes a parameter defval of 
 type anyelement, which serves two purposes.

 First, by virtue of the anyelement machinery, the return type of fieldvalue() 
 is that of defval. If the actual type of the requested field matches that 
 type, the value is returned. If they don't match, the parameter coerce 
 decided whether fieldvalue() tries to cast the value to the requested type, 
 or simply raises an error.

It same trick, that I use in record_set_fields. But I don't want to
use it for reading of value. I don't like it. You don't need to know a
value, you have to know a type - NULL::type. it is just not nice :). I
though about it too, and maybe is a time for new polymorphic type
anytype - and then you don't need to write a litte bit strange
NULL::type

it can be fieldvalue(myrec, type1, false)

Regards

Pavel Stehule




 Second, to also give the *value*, not only the *type* of defval a meaning, 
 it serves as the default return value. If requested field contains NULL, 
 defvalue is returned instead. You are, of course, free to pass NULL for 
 defvalue itself to turn that mapping into a NOP.

 Note that the returned value's type is always the same as defval's type, so 
 the whole thing is perfectly type-safe from the point of view of the rest of 
 the system.

 As long as you know all possible types than can appear in your record's 
 fields, you can do in Pl/PgSQL something along the line of
 declare
        v_value_type1 type1;
        v_value_type2 type2;
        ...
        v_value_typeN typeN;
 begin
        for v_field in select * from fieldinfos(myrec) loop
                case
                        when v_field.fieldtype = 'type1'::regtype then
                                v_value_type1 := fieldvalue(myrec, 
 NULL::type1, false);
                                Do something with v_value_type1
                        ...
                        when v_field.fieldtype = 'typeN'::regtype then
                                v_value_typeN := fieldvalue(myrec, 
 NULL::typeN, false);
                                Do something with v_value_typeN
                        else raise exception 'Unexpected type % in record %', 
 v_field.fieldtype, myrec;
                end case;
        end loop;
 end;

 It works pretty well for me...

 best regards,
 Florian Pflug



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Use symbolic names not octal constants for file permission flags

2010-12-11 Thread Magnus Hagander
On Sat, Dec 11, 2010 at 17:55, Tom Lane t...@sss.pgh.pa.us wrote:
 Magnus Hagander mag...@hagander.net writes:
 On Sat, Dec 11, 2010 at 17:38, Tom Lane t...@sss.pgh.pa.us wrote:
 No, it isn't.  There's an apparently-useless definition of _S_IRWXU
 there, but no S_IRWXU.

 Hmm. You're right, of course.

 A search on my windows box finds the text string S_IRWXU in the
 following *.h files across the whole filesystem:
 c:\perl\lib\CORE\perl.h
 c:\perl64\lib\CORE\perl.h
 c:\pgsql\src\include\pg_config_os.h
 c:\pgsql\src\include\port\win32.h

 that's it.

 OK, now I'm really confused.  We have at least two questions:

 1. How did all those pre-existing references to S_IRXWU compile?

Yeah, that's weird. IIRC (I stopped looking for the moment, need a
step back) some of them were protected by #ifndef WIN32, but not all
of them..


 2. Why didn't the previously hard-wired constants passed to chmod
 and umask fail on Windows?  The M$ documentation I can find at the
 moment suggests that *only* _S_IREAD and _S_IWRITE bits are allowed
 in the inputs to those functions, which apparently is untrue or none
 of this code would have executed successfully.

Probably it ignores any flags it doesn't know about?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Use symbolic names not octal constants for file permission flags

2010-12-11 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 On Sat, Dec 11, 2010 at 17:55, Tom Lane t...@sss.pgh.pa.us wrote:
 2. Why didn't the previously hard-wired constants passed to chmod
 and umask fail on Windows?  The M$ documentation I can find at the
 moment suggests that *only* _S_IREAD and _S_IWRITE bits are allowed
 in the inputs to those functions, which apparently is untrue or none
 of this code would have executed successfully.

 Probably it ignores any flags it doesn't know about?

Maybe, but unless _S_IREAD and _S_IWRITE are defined as 0400 and 0200,
the code would have been doing the wrong thing altogether.

I'm not sure why you think the group/other macros couldn't be #define'd
as 0?  But in any case there's still the question of where S_IRWXU is
coming from, since it clearly does work in several places.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Use symbolic names not octal constants for file permission flags

2010-12-11 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 On Sat, Dec 11, 2010 at 17:55, Tom Lane t...@sss.pgh.pa.us wrote:
 1. How did all those pre-existing references to S_IRXWU compile?

 Yeah, that's weird. IIRC (I stopped looking for the moment, need a
 step back) some of them were protected by #ifndef WIN32, but not all
 of them..

The lightbulb just went on: in win32.h,

#define mkdir(a,b)  mkdir(a)

I didn't go through in complete detail, but I'll bet all the working
instances are in mkdir calls, or else inside #ifndef WIN32.

I think we can just #define the other cases as zeroes.  I'm not sure why
you think that's an issue for open --- the privileges don't exist.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Use symbolic names not octal constants for file permission flags

2010-12-11 Thread Magnus Hagander
On Sat, Dec 11, 2010 at 18:46, Tom Lane t...@sss.pgh.pa.us wrote:
 Magnus Hagander mag...@hagander.net writes:
 On Sat, Dec 11, 2010 at 17:55, Tom Lane t...@sss.pgh.pa.us wrote:
 1. How did all those pre-existing references to S_IRXWU compile?

 Yeah, that's weird. IIRC (I stopped looking for the moment, need a
 step back) some of them were protected by #ifndef WIN32, but not all
 of them..

 The lightbulb just went on: in win32.h,

 #define mkdir(a,b)      mkdir(a)

 I didn't go through in complete detail, but I'll bet all the working
 instances are in mkdir calls, or else inside #ifndef WIN32.

Ah, that certainly looks like a smoking gun.


 I think we can just #define the other cases as zeroes.  I'm not sure why
 you think that's an issue for open --- the privileges don't exist.

Hmm. I was/am worried about any case that specifies *just* one of the
permissions that don't exist. That'll leave it at zero, whereas the
correct one might be the user-only version of whatever (read/write)
was given.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Use symbolic names not octal constants for file permission flags

2010-12-11 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 On Sat, Dec 11, 2010 at 18:46, Tom Lane t...@sss.pgh.pa.us wrote:
 I think we can just #define the other cases as zeroes.  I'm not sure why
 you think that's an issue for open --- the privileges don't exist.

 Hmm. I was/am worried about any case that specifies *just* one of the
 permissions that don't exist. That'll leave it at zero, whereas the
 correct one might be the user-only version of whatever (read/write)
 was given.

If we didn't specify the user read or write privilege, we shouldn't
get it.

What are the values of _S_IREAD and _S_IWRITE, anyway?  I'm still
wondering how come the previous coding with hardwired constants
behaved correctly.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] create tablespace fails silently, or succeeds improperly

2010-12-11 Thread Robert Haas
On Sat, Dec 11, 2010 at 10:56 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Stephen Frost sfr...@snowman.net writes:
 * Robert Haas (robertmh...@gmail.com) wrote:
 On Fri, Dec 10, 2010 at 10:33 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Anybody have a problem with adopting this behavior?

 Seems a bit surprising.

 Yeahh..  I'm not really sure about mkdir -p type actions from a SQL
 command.  Not entirely sure why but it doesn't feel 'right' to me.  I'd
 rather have PG complain that directory doesn't exist.

 OK.  Is there any value in doing mkdir -p in WAL-recovery execution of
 CREATE TABLESPACE but not regular execution?

I don't think so.  If someone creates a directory that is not fsync'd,
and then creates a subdirectory and puts a tablespace on it, and then
crashes after this has been WAL-logged but before the directory
entries have hit the disk, well, unlucky for them, but that's a
vanishingly rare situation.  There's no guarantee that we'd set
properties on the parent directory that would match the user's
expectation anyway, especially if SE-Linux or something is involved.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] create tablespace fails silently, or succeeds improperly

2010-12-11 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Sat, Dec 11, 2010 at 10:56 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 OK.  Is there any value in doing mkdir -p in WAL-recovery execution of
 CREATE TABLESPACE but not regular execution?

 I don't think so.  If someone creates a directory that is not fsync'd,
 and then creates a subdirectory and puts a tablespace on it, and then
 crashes after this has been WAL-logged but before the directory
 entries have hit the disk, well, unlucky for them, but that's a
 vanishingly rare situation.

I'm not really thinking about crash recovery, but replication slaves.
Omitting to create the tablespace location directories on slaves
doesn't seem far-fetched at all.  Of course, it's likely that
the slave server will lack permissions to create in the location
directory's parent; but if it can, the outcome shouldn't be too
unreasonable.

 There's no guarantee that we'd set
 properties on the parent directory that would match the user's
 expectation anyway, especially if SE-Linux or something is involved.

If SELinux doesn't like it, it's going to fail no matter what we do.
I'm just suggesting that trying to create the directory path will fail
in fewer cases than not trying.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] create tablespace fails silently, or succeeds improperly

2010-12-11 Thread Robert Haas
On Sat, Dec 11, 2010 at 1:20 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Sat, Dec 11, 2010 at 10:56 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 OK.  Is there any value in doing mkdir -p in WAL-recovery execution of
 CREATE TABLESPACE but not regular execution?

 I don't think so.  If someone creates a directory that is not fsync'd,
 and then creates a subdirectory and puts a tablespace on it, and then
 crashes after this has been WAL-logged but before the directory
 entries have hit the disk, well, unlucky for them, but that's a
 vanishingly rare situation.

 I'm not really thinking about crash recovery, but replication slaves.
 Omitting to create the tablespace location directories on slaves
 doesn't seem far-fetched at all.  Of course, it's likely that
 the slave server will lack permissions to create in the location
 directory's parent; but if it can, the outcome shouldn't be too
 unreasonable.

Creating the tablespace directory itself would be reasonable, but
recursing all the way up seems dubious.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] create tablespace fails silently, or succeeds improperly

2010-12-11 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Sat, Dec 11, 2010 at 1:20 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I'm not really thinking about crash recovery, but replication slaves.
 Omitting to create the tablespace location directories on slaves
 doesn't seem far-fetched at all.  Of course, it's likely that
 the slave server will lack permissions to create in the location
 directory's parent; but if it can, the outcome shouldn't be too
 unreasonable.

 Creating the tablespace directory itself would be reasonable, but
 recursing all the way up seems dubious.

Well, it's *very* unlikely that the slave server would have permissions
to create in the root directory or close to it.  If you grant that it's
reasonable to create the location directory itself, why not the parent
too, if that's still in a directory that's writable?  I agree that the
reasonableness of the behavior drops off the more you go up, but so does
the probability of having the needed permissions.  I don't agree that
it's a binary choice where creating exactly one directory is reasonable
but exactly two isn't.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] keeping a timestamp of the last stats reset (for a db, table and function)

2010-12-11 Thread tv
Hi everyone,

I just wrote my first patch, and I need to know whether I missed something
or not. I haven't used C for a really long time, so sickbags on standby,
and if you notice something really stupid don't hesitate to call me an
asshole (according to Simon Phipps that proves we are a healthy open
community).

So what the patch does (or should do)? It tracks when were the stats for a
given object (database, table or function) reset for the last time. This
is useful when you do snapshots of the stats for analysis - when comparing
two snapshots, you have to know whether the stats were reset (in that case
the analysis usually yields random noise and automatic tools get confused
by this).

Tom Lane already recommended a workaround - firing the DBA who randomly
resets statistics, but that's not a good solution I think. First, you have
to be superior to the DBA to be able to fire him ;-) Second, everyone
makes a mistake from time to time. Third, when there are functions to
reset stats, it's nice to provide such info as it makes life much easier.

And there are cases when you don't reset the stats explicitly but the data
are actually gone - e.g. when after a restore or upgrade (OK, this is
solvable using pg_postmaster_start_time).

In short, I think it's a useful feature (I need it and I think there are
others). And I think it's not disruptive.

So what the patch actually does:

- extends PgStat_StatDBEntry, PgStat_StatTableEntry and
PgStat_StatFuncEntry with a new field (stat_reset_timestamp)

- adds functions to read current value from these fields
(pg_stat_get_db_last_stat_reset_time, pg_stat_get_last_stat_reset_time and
pg_stat_get_function_last_stat_reset_time)

- extends the system views with calls to these functions
(pg_stat_database, pg_stat_user_functions and pg_stat_all_tables)

The values are set like this:

- when a database is created, current timestamp is stored in
PgStat_StatDBEntry.stat_reset_timestamp
- by default all tables/functions inherit this timestamp
- when stats for a given table / function are reset, current timestamp is
stored in the stat_reset_timestamp (this happens in
pgstat_recv_resetsinglecounter)
- when stats for the whole database are reset, everything starts from
scratch (this happens in pgstat_recv_resetcounter)

What I'm not sure about:

- I really am not sure about the changes made in pg_proc.h. I'm not sure
how to assign OIDs to the new functions (I've simply chosen values that
are were not used in this file), and I'm not sure about the other columns
(I've copied and modified another function with the same parameter/return
types)

- I'm not sure if there are any other ways how the stat entries can be
created. I've found two ways - directly (when asked for the stats e.g.
from pgstat_get_tab_entry), and indirectly (when processing stats from a
backend e.g. in pgstat_recv_tabstat).

regards
Tomasdiff --git a/src/backend/catalog/system_views.sql 
b/src/backend/catalog/system_views.sql
index 346eaaf..0ee59b1 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -310,6 +310,7 @@ CREATE VIEW pg_stat_all_tables AS
 pg_stat_get_last_autovacuum_time(C.oid) as last_autovacuum,
 pg_stat_get_last_analyze_time(C.oid) as last_analyze,
 pg_stat_get_last_autoanalyze_time(C.oid) as last_autoanalyze,
+   pg_stat_get_last_stat_reset_time(C.oid) as last_stat_reset,
 pg_stat_get_vacuum_count(C.oid) AS vacuum_count,
 pg_stat_get_autovacuum_count(C.oid) AS autovacuum_count,
 pg_stat_get_analyze_count(C.oid) AS analyze_count,
@@ -502,7 +503,8 @@ CREATE VIEW pg_stat_database AS
 pg_stat_get_db_tuples_fetched(D.oid) AS tup_fetched,
 pg_stat_get_db_tuples_inserted(D.oid) AS tup_inserted,
 pg_stat_get_db_tuples_updated(D.oid) AS tup_updated,
-pg_stat_get_db_tuples_deleted(D.oid) AS tup_deleted
+pg_stat_get_db_tuples_deleted(D.oid) AS tup_deleted,
+pg_stat_get_db_last_stat_reset_time(D.oid) AS last_stat_reset
 FROM pg_database D;
 
 CREATE VIEW pg_stat_user_functions AS
@@ -512,7 +514,8 @@ CREATE VIEW pg_stat_user_functions AS
 P.proname AS funcname,
 pg_stat_get_function_calls(P.oid) AS calls,
 pg_stat_get_function_time(P.oid) / 1000 AS total_time,
-pg_stat_get_function_self_time(P.oid) / 1000 AS self_time
+pg_stat_get_function_self_time(P.oid) / 1000 AS self_time,
+pg_stat_get_function_last_stat_reset_time(P.oid) AS last_stat_reset
 FROM pg_proc P LEFT JOIN pg_namespace N ON (N.oid = P.pronamespace)
 WHERE P.prolang != 12  -- fast check to eliminate built-in functions
   AND pg_stat_get_function_calls(P.oid) IS NOT NULL;
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index c3c136a..f0b3453 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -249,6 

Re: [HACKERS] Anyone for SSDs?

2010-12-11 Thread Jeff Janes
On Fri, Dec 10, 2010 at 8:09 PM, Vaibhav Kaushal
vaibhavkaushal...@gmail.com wrote:
 On Fri, 2010-12-10 at 18:07 -0800, Josh Berkus wrote:
 On 12/10/10 5:06 PM, Daniel Loureiro wrote:
  An quicksort method in
  sequential disk its just awful to be thinking in a non SSD world, but
  its possible in an SSD.

 So, code it.  Shouldn't be hard to write a demo comparison.  I don't
 believe that SSDs make quicksort-on-disk feasible, but would be happy to
 be proven wrong.

 I too do not believe it in normal case. However, considering the 'types'
 of SSDs, it may be feasible! Asking for 'the next page and getting it'
 has a time delay in the process. While on a regular HDD with spindles,
 the question is where is that page located, with SSDs, the question
 disappears, because the access time is uniform in case of SSDs. Also,
 the access time is about 100 times fasterm which would change quite a
 few things about the whole process.

I don't understand what it is you are proposing.  Quicksort is usually
swap based, and so the records would need to be the same size.  Are
you proposing to do an in-memory sort of pointers, which reference
on-disk records?  Or an on-disk sort of pointers which reference
on-disk (but somewhere else) records?  If you are swapping pointers on
disk, you have to consider the write performance, not just the read
performance.


 I would like to do that (coding), but I do not have a SSD on my
 machine!

That doesn't mean you can't do the coding, it just means you can't
test the performance.
The barrier to get someone else to performance test it for you is a
lot lower than the
barrier to get someone else to write it for you and then performance
test it for you.
(But I can't test-drive it, as I don't have any computer which has
both an SSD and
a hard drive, just one or the other.  And the one with SSD would be
hard to compile
PG on.)


 :( Would it be impractical to try it for me? Again I do not
 know how to test PG :(

Yeah, I think it would be impractical.  If I thought it would likely
work, it would be different (at least, it might be if I had the right
hardware).
But I think it would likely not work.

Cheers,

Jeff

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unlogged tables

2010-12-11 Thread Jeff Janes
On Wed, Dec 8, 2010 at 6:52 AM, Marti Raudsepp ma...@juffo.org wrote:
 A very useful feature for unlogged tables would be the ability to
 switch them back to normal tables -- this way you could do bulk
 loading into an unlogged table and then turn it into a regular table
 using just fsync(), bypassing all the WAL-logging overhead.

If archive_mode is off, then you can often find a way to bypass
WAL-logging during bulk loading anyway.

If archive_mode is on, then I don't see how this can work without
massive changes.

One possibility would be to create a mechanism to inject entire large
files into the archive log stream. (Such a facility might be useful
for other purposes too).  So the transaction that changes the mode
from unlogged to logged would have to take an exclusive lock on the
unlogged table and make sure shared buffers for it are written out,
then it would just copy the backing file(s) for that table into the
archive steam with a special header that tells the recovery process
Set these aside, I'll explain later. Once that is done, it would
just have to ensure the WAL segment it is currently on will come after
the injected files in the archive stream, and write a WAL record
explaining where those bulk files it sent early are supposed to go.

I don't know, sound like a lot of work and lot of pitfalls.

Cheers,

Jeff

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER EXTENSION ... UPGRADE;

2010-12-11 Thread Dimitri Fontaine
Hi,

I've been reading through the entire thread and it seems like this is
the best mail to choose to answer.

Tom Lane t...@sss.pgh.pa.us writes:
 Maybe I misread David's meaning, but I thought he was saying that
 there's no value in inventing all those control file entries in the
 first place.  Just hard-wire in ALTER EXTENSION UPGRADE the convention
 that the name of an upgrade script to upgrade from prior version VVV is
 EXTNAME-upgrade.VVV.sql (or any variant spelling of that you care for).

Yeah that works, as soon as VVV is the version we upgrade from.

That said, we need to find a way to lighten the process for extensions
where it's easy to have a single script to support upgrade from more
than once past release.

What about having the following keys supported in the control file:

  upgrade_version = 'script.version.sql'
  upgrade_all = 'script.sql'

Where the version here is the version you're upgrading *from* (to is
known and static when you distribute the files after all), and where
upgrade_all is applied last no matter what got applied before.

Also, do we want a subdirectory per extension to host all those files?

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unlogged tables

2010-12-11 Thread Jeff Janes
On Fri, Dec 10, 2010 at 8:16 PM, Robert Haas robertmh...@gmail.com wrote:

 As I was working on the hash index support, it occurred to me that at
 some point in the future, we might want to allow an unlogged index on
 a permanent table.

That is the feature I would be most excited about.

 With the current patch, an index is unlogged if
 and only if the corresponding table is unlogged, and both the table
 and the index are reset to empty on restart.  But we could have a
 slightly different flavor of index that, instead of being reset to
 empty, just gets marked invalid, perhaps by truncating the file to
 zero-length (and adding some code to treat that as something other
 than a hard error).  Perhaps you could even arrange for autovacuum to
 kick off an automatic rebuild,

Or just have rebuilding the index as part of crash recovery.  I
wouldn't use the feature anyway on indexes that would take more than a
few seconds to rebuild, And wouldn't want to advertise the database as
being available when it is essentially crippled from missing indexes.
I'd rather bite the bullet up front.

I would think of it is as declaring that, instead of making the index
recoverable via WAL logging and replay, instead make it recoverable by
rebuilding.  So in that way it is quit unlike unlogged tables, in that
we are not risking any data, just giving the database a hint about
what the most expeditious way to maintain the index might be.  Well,
more of an order than a hint, I guess.

Cheers,

Jeff

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: auxiliary functions for record type

2010-12-11 Thread Florian Pflug
On Dec11, 2010, at 17:55 , Pavel Stehule wrote:
 It same trick, that I use in record_set_fields.
Oh, I see. Sorry, must haven missed that when I read your blog entry.

 But I don't want to
 use it for reading of value. I don't like it. You don't need to know a
 value, you have to know a type - NULL::type. it is just not nice :).
Well, no, it's not the most elegant API on earth, that's for sure. But I my 
opinion, it isn't so bad that it rectifies casting everything to text. 

 I
 though about it too, and maybe is a time for new polymorphic type
 anytype - and then you don't need to write a litte bit strange
 NULL::type
 
 it can be fieldvalue(myrec, type1, false)
Hm, I don't think the ability to write just type1 instead of NULL::type1 is 
worth the necessary effort.

If anything, I'd allow function to use anyelement as their return value 
*without* having any polymorphic arguments. You'd need to surround calls to 
such function with a CAST() expression, unless something else determines the 
type. In pl/pgsql, for example, one could allow the CAST() to be skipped for 
assignment to variables. Then, you could write
v_value := fieldvalue(myrec, 'f1', true)

That'd at least free you from having to specify the type in some cases. But 
still, even this seems to be a lot of effort for quite little gain...

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [COMMITTERS] pgsql: Reduce spurious Hot Standby conflicts from never-visible records

2010-12-11 Thread Heikki Linnakangas

(Moving to pgsql-hackers)

On 10.12.2010 20:21, Tom Lane wrote:

Simon Riggssi...@2ndquadrant.com  writes:

Reduce spurious Hot Standby conflicts from never-visible records.
Hot Standby conflicts only with tuples that were visible at
some point. So ignore tuples from aborted transactions or for
tuples updated/deleted during the inserting transaction when
generating the conflict transaction ids.



Following detailed analysis and test case by Noah Misch.
Original report covered btree delete records, correctly observed
by Heikki Linnakangas that this applies to other cases also.
Fix covers all sources of cleanup records via common code.
Includes additional fix compared to commit on HEAD


ISTM HeapTupleHeaderAdvanceLatestRemovedXid is still pretty broken,
in that it's examining xmax without having checked that xmax is (a)
valid or (b) a lock rather than a deletion xmax.


In current use, it's only called for tuples that are known to be dead, 
so either xmax is a valid deletion, or xmin didn't commit in which case 
the function doesn't use xmax for anything. So I think it actually works 
as it is.


I agree it doesn't look right, though. At the very least it needs 
comments explaining that, but preferably it should do something sane 
when faced with a tuple that's not dead after all. Perhaps throw an 
error (though that would be bad during recovery), or an Assert, or just 
refrain from advancing latestRemovedXid (or advance it, that would be 
the conservative stance given the current use).


Also, I'm not totally convinced it's correct when xmin  xmax, despite 
Simon's follow-up commit to fix that. Shouldn't it advance 
latestRemovedXid to xmin in that case? Or maybe it's ok as it is because 
we know that xmax committed after xmin. The impression I get from the 
comment above the function now is that it advances latestRemovedXid to 
the highest XID present in the tuple, but that's not what it does in the 
xmin  xmax case. That comment needs clarification.


While we're at it, perhaps it would be better to move this function to 
tqual.c. And I feel that a more natural interface would be something like:


TransactionId
HeapTupleHeaderGetLatestRemovedXid(HeapTupleHeader tuple);

IOW, instead bumping up the passed-in latestRemovedXid value, return the 
highest XID on the tuple (if it was dead).


PS. it would be good to set hint bits in that function like in 
HeapTupleSatisfies* functions.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Wildcard search support for pg_trgm

2010-12-11 Thread Alexander Korotkov
Hackers,

Here is first version of patch, which enable index support of wildcard
search in pg_trgm contrib module. The idea of the patch is to extract from
wildcard trigrams which should occurs in wildcard matching string. For
example, for '%sector%' wildcard such trigrams would be: 'sec', 'ect',
'tor'.

create table words (word text);
copy words from '/usr/share/dict/american-english';

test=# explain analyze select * from words where word ilike '%independ%';
  QUERY PLAN

--
 Seq Scan on words  (cost=0.00..1703.11 rows=10 width=9) (actual
time=18.818..174.146 rows=7 loops=1)
   Filter: (word ~~* '%independ%'::text)
 Total runtime: 174.200 ms
(3 rows)

CREATE INDEX trgm_idx ON words USING gist (word gist_trgm_ops);

test=# explain analyze select * from words where word ilike '%independ%';
QUERY PLAN


--
 Bitmap Heap Scan on words  (cost=4.36..40.11 rows=10 width=9) (actual
time=2.445..2.529 rows=7 loops=1)
   Recheck Cond: (word ~~* '%independ%'::text)
   -  Bitmap Index Scan on trgm_idx  (cost=0.00..4.35 rows=10 width=0)
(actual time=2.406..2.406 rows=7 loops=1)
 Index Cond: (word ~~* '%independ%'::text)
 Total runtime: 2.612 ms
(5 rows)

CREATE INDEX trgm_idx ON words USING gin (word gin_trgm_ops);

test=# explain analyze select * from words where word ilike '%independ%';
QUERY PLAN


---
 Bitmap Heap Scan on words  (cost=76.08..111.83 rows=10 width=9) (actual
time=2.675..2.755 rows=7 loops=1)
   Recheck Cond: (word ~~* '%independ%'::text)
   -  Bitmap Index Scan on trgm_idx  (cost=0.00..76.07 rows=10 width=0)
(actual time=2.642..2.642 rows=7 loops=1)
 Index Cond: (word ~~* '%independ%'::text)
 Total runtime: 2.839 ms
(5 rows)

I've encountered with following problems:
1) Indexing support for ilike is possible only with case-insensetive
wildcards, e.g. when IGNORECASE macro is enabled. But I can't use this macro
in pg_trgm.sql.in, where list of operators is defined. Probably, is it
enuogh to put comment near IGNORECASE, which tells that if one disable this
macro he should also remove oparators from pg_trgm.sql.in?
2) I found gist index not very useful with default SIGLENINT = 3. I've
changed this value to 15 and I found gist index performs very good on
dictionary. But on longer strings greater values of SIGLENINT may be
required (probably even SIGLENINT  122 will give benefit in some cases in
spite of TOAST).


With best regards,
Alexander Korotkov.
*** a/contrib/pg_trgm/pg_trgm.sql.in
--- b/contrib/pg_trgm/pg_trgm.sql.in
***
*** 113,118  FOR TYPE text USING gist
--- 113,120 
  AS
  OPERATOR1   % (text, text),
  OPERATOR2   - (text, text) FOR ORDER BY pg_catalog.float_ops,
+ OPERATOR3   ~~ (text, text),
+ OPERATOR4   ~~* (text, text),
  FUNCTION1   gtrgm_consistent (internal, text, int, oid, internal),
  FUNCTION2   gtrgm_union (bytea, internal),
  FUNCTION3   gtrgm_compress (internal),
***
*** 144,149  CREATE OPERATOR CLASS gin_trgm_ops
--- 146,153 
  FOR TYPE text USING gin
  AS
  OPERATOR1   % (text, text),
+ OPERATOR3   ~~ (text, text),
+ OPERATOR4   ~~* (text, text),
  FUNCTION1   btint4cmp (int4, int4),
  FUNCTION2   gin_extract_trgm (text, internal),
  FUNCTION3   gin_extract_trgm (text, internal, int2, internal, internal),
*** a/contrib/pg_trgm/trgm.h
--- b/contrib/pg_trgm/trgm.h
***
*** 19,24 
--- 19,26 
  /* operator strategy numbers */
  #define	SimilarityStrategyNumber	1
  #define	DistanceStrategyNumber		2
+ #define LikeStrategyNumber			3
+ #define ILikeStrategyNumber			4
  
  
  typedef char trgm[3];
***
*** 53,59  typedef struct
  
  /* gist */
  #define BITBYTE 8
! #define SIGLENINT  3			/* 122 = key will toast, so very slow!!! */
  #define SIGLEN	( sizeof(int)*SIGLENINT )
  
  #define SIGLENBIT (SIGLEN*BITBYTE - 1)	/* see makesign */
--- 55,61 
  
  /* gist */
  #define BITBYTE 8
! #define SIGLENINT  15			/* 122 = key will toast, so very slow!!! */
  #define SIGLEN	( sizeof(int)*SIGLENINT )
  
  #define SIGLENBIT (SIGLEN*BITBYTE - 1)	/* see makesign */
***
*** 89,94  typedef char *BITVECP;
--- 91,101 
  extern float4 trgm_limit;
  
  TRGM	   *generate_trgm(char *str, int slen);
+ TRGM	   *generate_wildcard_trgm(char *str, int slen);
  float4		

[HACKERS] pg_execute_from_file, patch v10

2010-12-11 Thread Dimitri Fontaine
Hi,

The other infrastructure patch that has been mark ready for commit then
commented further upon by Tom is $subject, even if the function provided
as been renamed to pg_execute_sql_file().

Please find attached the newer version that fixes Tom concerns, removing
the VARIADIC forms of the functions (those placeholders idea).

The git tree already contains a fixed extension code, but the next patch
for that one will have to wait some more (psql refactoring).

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***
*** 14449,14466  postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
/row
row
 entry
! literalfunctionpg_read_file(parameterfilename/ typetext/, parameteroffset/ typebigint/, parameterlength/ typebigint/)/function/literal
 /entry
 entrytypetext/type/entry
 entryReturn the contents of a text file/entry
/row
row
 entry
  literalfunctionpg_stat_file(parameterfilename/ typetext/)/function/literal
 /entry
 entrytyperecord/type/entry
 entryReturn information about a file/entry
/row
   /tbody
  /tgroup
 /table
--- 14449,14488 
/row
row
 entry
! literalfunctionpg_read_file(parameterfilename/ typetext/, parameteroffset/ typebigint/ [, parameterlength/ typebigint/])/function/literal
 /entry
 entrytypetext/type/entry
 entryReturn the contents of a text file/entry
/row
row
 entry
+ literalfunctionpg_read_binary_file(parameterfilename/ typetext/, parameteroffset/ typebigint/ [, parameterlength/ typebigint/])/function/literal
+/entry
+entrytypebytea/type/entry
+entryReturn the contents of a file/entry
+   /row
+   row
+entry
  literalfunctionpg_stat_file(parameterfilename/ typetext/)/function/literal
 /entry
 entrytyperecord/type/entry
 entryReturn information about a file/entry
/row
+   row
+entry
+ literalfunctionpg_execute_sql_string(parametersql/ typetext/) )/function/literal
+/entry
+entrytypevoid/type/entry
+entryExecute the given string as acronymSQL/ commands./entry
+   /row
+   row
+entry
+ literalfunctionpg_execute_sql_file(parameterfilename/ typetext/ [, parameterencoding/parameter typename/type]) )/function/literal
+/entry
+entrytypevoid/type/entry
+entryExecute contents of the given file as acronymSQL/ commands,
+expected in either database encoding or given encoding./entry
+   /row
   /tbody
  /tgroup
 /table
***
*** 14482,14487  postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
--- 14504,14533 
  at the given parameteroffset/, returning at most parameterlength/
  bytes (less if the end of file is reached first).  If parameteroffset/
  is negative, it is relative to the end of the file.
+ When the parameter parameterlength/ is omitted,
+ functionpg_read_file/ reads until the end of the file.
+ The part of a file must be a valid text in the server encoding.
+/para
+ 
+indexterm
+ primarypg_read_binary_file/primary
+/indexterm
+para
+ functionpg_read_binary_file/ returns part of a file as like as
+ functionpg_read_file/, but the result is a bytea value.
+ programlisting
+ SELECT convert_from(pg_read_binary_file('postgresql.conf', -69), 'utf8');
+  convert_from  
+ ---
+  #custom_variable_classes = ''   # list of custom variable class names+
+  
+ (1 row)
+ /programlisting
+/para
+para
+ When the parameter parameterlength/ is
+ omited, functionpg_read_binary_file/ reads until the end of the
+ file.
 /para
  
 indexterm
***
*** 14499,14504  SELECT (pg_stat_file('filename')).modification;
--- 14545,14582 
  /programlisting
 /para
  
+indexterm
+ primarypg_execute_sql_string/primary
+/indexterm
+para
+ functionpg_execute_sql_string/ makes the server execute the given string
+ as acronymSQL/ commands.
+ The script may contain placeholders that will be replaced by the optional
+ parameters, that are pairs of variable names and values. Here's an example:
+ programlisting
+ SELECT pg_execute_sql_string('CREATE SCHEMA utils');
+  pg_execute_sql_string 
+ ---
+  
+ (1 row)
+ 
+ SELECT oid, * from pg_namespace where nspname = 'utils';
+   oid  | nspname | nspowner | nspacl 
+ ---+-+--+
+  16387 | utils   |   10 | 
+ (2 rows)
+ /programlisting
+/para
+ 
+indexterm
+ 

Re: [HACKERS] Extensions, patch v16

2010-12-11 Thread David Fetter
On Fri, Dec 10, 2010 at 11:24:27AM -0500, Tom Lane wrote:
 Dimitri Fontaine dimi...@2ndquadrant.fr writes:
  Tom Lane t...@sss.pgh.pa.us writes:
  Are there any actual remaining use-cases for that sed step?
 
  The goal here is to allow extension authors to maintain their version
  number in the Makefile rather than in the Makefile and in the control
  file separately. Having the same version number in more than one place
  never eases maintenance.
 
 Why is it in the makefile at all?  If the makefile does need to know it,
 why don't we have it scrape the number out of the control file?  Or even
 more to the point, since when do we need version numbers in extensions?

We *absolutely* need version numbers in extensions.  People will want
to have a certain version, or a certain minimum version, etc., etc.,
etc., just as they do for any other software.

Seriously, are you OK?

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: auxiliary functions for record type

2010-12-11 Thread Pavel Stehule
2010/12/11 Florian Pflug f...@phlo.org:
 On Dec11, 2010, at 17:55 , Pavel Stehule wrote:
 It same trick, that I use in record_set_fields.
 Oh, I see. Sorry, must haven missed that when I read your blog entry.

 But I don't want to
 use it for reading of value. I don't like it. You don't need to know a
 value, you have to know a type - NULL::type. it is just not nice :).
 Well, no, it's not the most elegant API on earth, that's for sure. But I my 
 opinion, it isn't so bad that it rectifies casting everything to text.

yes, there are two no good solution now.


 I
 though about it too, and maybe is a time for new polymorphic type
 anytype - and then you don't need to write a litte bit strange
 NULL::type


 it can be fieldvalue(myrec, type1, false)
 Hm, I don't think the ability to write just type1 instead of NULL::type1 
 is worth the necessary effort.

 If anything, I'd allow function to use anyelement as their return value 
 *without* having any polymorphic arguments. You'd need to surround calls to 
 such function with a CAST() expression, unless something else determines the 
 type. In pl/pgsql, for example, one could allow the CAST() to be skipped for 
 assignment to variables. Then, you could write
        v_value := fieldvalue(myrec, 'f1', true)


yes, this is probably way and I like it. But now with zero support.
a) there isn't mechanism for propagation target type from outer to
function - function have to have to know target type
b) plpgsql uses only IO cast - so values are translated to text and
back everytime

but CAST has own parser part and probably executor part too, so this
is possible and too hard.

so is possible to call: SELECT CAST(fieldvalue(myrec, 'f1', true) AS int)

yes, this task should not be hard. But who will comunicate about this with Tom?

Regards

Pavel



 That'd at least free you from having to specify the type in some cases. But 
 still, even this seems to be a lot of effort for quite little gain...

 best regards,
 Florian Pflug



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] keeping a timestamp of the last stats reset (for a db, table and function)

2010-12-11 Thread Pavel Stehule
Hello

you have to respect pg coding style:

a) not too long lines
b) not C++ line comments

Zdar

Pavel

2010/12/11  t...@fuzzy.cz:
 Hi everyone,

 I just wrote my first patch, and I need to know whether I missed something
 or not. I haven't used C for a really long time, so sickbags on standby,
 and if you notice something really stupid don't hesitate to call me an
 asshole (according to Simon Phipps that proves we are a healthy open
 community).

 So what the patch does (or should do)? It tracks when were the stats for a
 given object (database, table or function) reset for the last time. This
 is useful when you do snapshots of the stats for analysis - when comparing
 two snapshots, you have to know whether the stats were reset (in that case
 the analysis usually yields random noise and automatic tools get confused
 by this).

 Tom Lane already recommended a workaround - firing the DBA who randomly
 resets statistics, but that's not a good solution I think. First, you have
 to be superior to the DBA to be able to fire him ;-) Second, everyone
 makes a mistake from time to time. Third, when there are functions to
 reset stats, it's nice to provide such info as it makes life much easier.

 And there are cases when you don't reset the stats explicitly but the data
 are actually gone - e.g. when after a restore or upgrade (OK, this is
 solvable using pg_postmaster_start_time).

 In short, I think it's a useful feature (I need it and I think there are
 others). And I think it's not disruptive.

 So what the patch actually does:

 - extends PgStat_StatDBEntry, PgStat_StatTableEntry and
 PgStat_StatFuncEntry with a new field (stat_reset_timestamp)

 - adds functions to read current value from these fields
 (pg_stat_get_db_last_stat_reset_time, pg_stat_get_last_stat_reset_time and
 pg_stat_get_function_last_stat_reset_time)

 - extends the system views with calls to these functions
 (pg_stat_database, pg_stat_user_functions and pg_stat_all_tables)

 The values are set like this:

 - when a database is created, current timestamp is stored in
 PgStat_StatDBEntry.stat_reset_timestamp
 - by default all tables/functions inherit this timestamp
 - when stats for a given table / function are reset, current timestamp is
 stored in the stat_reset_timestamp (this happens in
 pgstat_recv_resetsinglecounter)
 - when stats for the whole database are reset, everything starts from
 scratch (this happens in pgstat_recv_resetcounter)

 What I'm not sure about:

 - I really am not sure about the changes made in pg_proc.h. I'm not sure
 how to assign OIDs to the new functions (I've simply chosen values that
 are were not used in this file), and I'm not sure about the other columns
 (I've copied and modified another function with the same parameter/return
 types)

 - I'm not sure if there are any other ways how the stat entries can be
 created. I've found two ways - directly (when asked for the stats e.g.
 from pgstat_get_tab_entry), and indirectly (when processing stats from a
 backend e.g. in pgstat_recv_tabstat).

 regards
 Tomas

 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extensions, patch v16

2010-12-11 Thread David E. Wheeler
On Dec 11, 2010, at 1:09 PM, David Fetter wrote:

 Why is it in the makefile at all?  If the makefile does need to know it,
 why don't we have it scrape the number out of the control file?  Or even
 more to the point, since when do we need version numbers in extensions?
 
 We *absolutely* need version numbers in extensions.  People will want
 to have a certain version, or a certain minimum version, etc., etc.,
 etc., just as they do for any other software.
 
 Seriously, are you OK?

One of the biggest mistakes in the creation of CPAN was allowing modules 
without extensions. It makes figuring out what to upgrade extremely difficult. 
Learning from that, PGXN requires version numbers for all extensions.

Best,

David


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER EXTENSION ... UPGRADE;

2010-12-11 Thread David E. Wheeler
On Dec 11, 2010, at 12:09 PM, Dimitri Fontaine wrote:

 Yeah that works, as soon as VVV is the version we upgrade from.
 
 That said, we need to find a way to lighten the process for extensions
 where it's easy to have a single script to support upgrade from more
 than once past release.
 
 What about having the following keys supported in the control file:
 
  upgrade_version = 'script.version.sql'
  upgrade_all = 'script.sql'

Why not just use an upgrade script naming convention? Think: Convention over 
configuration.

 Where the version here is the version you're upgrading *from* (to is
 known and static when you distribute the files after all), and where
 upgrade_all is applied last no matter what got applied before.
 
 Also, do we want a subdirectory per extension to host all those files?

How are things currently arranged?

Best,

David



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] would hw acceleration help postgres (databases in general) ?

2010-12-11 Thread Jim Nasby
On Dec 10, 2010, at 6:18 PM, Jeff Janes wrote:
 On Fri, Dec 10, 2010 at 3:09 PM, Hamza Bin Sohail hsoh...@purdue.edu wrote:
 
 Hello hackers,
 
 I think i'm at the right place to ask this question.
 
 Based on your experience and the fact that you have written the Postgres 
 code,
 can you tell what a rough break-down - in your opinion - is for the time the
 database spends time just fetching and writing  stuff to memory and the
 actual computation.
 
 The database is a general purpose tool.  Pick a bottleneck you wish to have,
 and probably someone uses it in a way that causes that bottleneck to occur.

A common bottleneck we run into is sorting of text data. Unfortunately, I doubt 
that a GPU would be able to help with that.
--
Jim C. Nasby, Database Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] keeping a timestamp of the last stats reset (for a db, table and function)

2010-12-11 Thread tv
 Hello

 you have to respect pg coding style:

 a) not too long lines
 b) not C++ line comments

OK, thanks for the notice. I've fixed those two problems.

regards
Tomasdiff --git a/src/backend/catalog/system_views.sql 
b/src/backend/catalog/system_views.sql
index 346eaaf..0ee59b1 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -310,6 +310,7 @@ CREATE VIEW pg_stat_all_tables AS
 pg_stat_get_last_autovacuum_time(C.oid) as last_autovacuum,
 pg_stat_get_last_analyze_time(C.oid) as last_analyze,
 pg_stat_get_last_autoanalyze_time(C.oid) as last_autoanalyze,
+   pg_stat_get_last_stat_reset_time(C.oid) as last_stat_reset,
 pg_stat_get_vacuum_count(C.oid) AS vacuum_count,
 pg_stat_get_autovacuum_count(C.oid) AS autovacuum_count,
 pg_stat_get_analyze_count(C.oid) AS analyze_count,
@@ -502,7 +503,8 @@ CREATE VIEW pg_stat_database AS
 pg_stat_get_db_tuples_fetched(D.oid) AS tup_fetched,
 pg_stat_get_db_tuples_inserted(D.oid) AS tup_inserted,
 pg_stat_get_db_tuples_updated(D.oid) AS tup_updated,
-pg_stat_get_db_tuples_deleted(D.oid) AS tup_deleted
+pg_stat_get_db_tuples_deleted(D.oid) AS tup_deleted,
+pg_stat_get_db_last_stat_reset_time(D.oid) AS last_stat_reset
 FROM pg_database D;
 
 CREATE VIEW pg_stat_user_functions AS
@@ -512,7 +514,8 @@ CREATE VIEW pg_stat_user_functions AS
 P.proname AS funcname,
 pg_stat_get_function_calls(P.oid) AS calls,
 pg_stat_get_function_time(P.oid) / 1000 AS total_time,
-pg_stat_get_function_self_time(P.oid) / 1000 AS self_time
+pg_stat_get_function_self_time(P.oid) / 1000 AS self_time,
+pg_stat_get_function_last_stat_reset_time(P.oid) AS last_stat_reset
 FROM pg_proc P LEFT JOIN pg_namespace N ON (N.oid = P.pronamespace)
 WHERE P.prolang != 12  -- fast check to eliminate built-in functions
   AND pg_stat_get_function_calls(P.oid) IS NOT NULL;
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index c3c136a..f20a9ea 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -249,6 +249,8 @@ static void pgstat_sighup_handler(SIGNAL_ARGS);
 static PgStat_StatDBEntry *pgstat_get_db_entry(Oid databaseid, bool create);
 static PgStat_StatTabEntry *pgstat_get_tab_entry(PgStat_StatDBEntry *dbentry,
 Oid tableoid, bool create);
+static PgStat_StatFuncEntry *pgstat_get_func_entry(PgStat_StatDBEntry *dbentry,
+Oid funcoid, bool create);
 static void pgstat_write_statsfile(bool permanent);
 static HTAB *pgstat_read_statsfile(Oid onlydb, bool permanent);
 static void backend_read_statsfile(void);
@@ -3129,7 +3131,9 @@ pgstat_get_db_entry(Oid databaseid, bool create)
result-n_tuples_updated = 0;
result-n_tuples_deleted = 0;
result-last_autovac_time = 0;
-
+   
+   result-stat_reset_timestamp = GetCurrentTimestamp();
+   
memset(hash_ctl, 0, sizeof(hash_ctl));
hash_ctl.keysize = sizeof(Oid);
hash_ctl.entrysize = sizeof(PgStat_StatTabEntry);
@@ -3196,11 +3200,48 @@ pgstat_get_tab_entry(PgStat_StatDBEntry *dbentry, Oid 
tableoid, bool create)
result-autovac_vacuum_count = 0;
result-analyze_count = 0;
result-autovac_analyze_count = 0;
+   
+   /* inherit stat time reset from dbentry */
+   result-stat_reset_timestamp = dbentry-stat_reset_timestamp;
+   
}
 
return result;
 }
 
+/*
+ * Lookup the hash table entry for the specified function. If no hash
+ * table entry exists, initialize it, if the create parameter is true.
+ * Else, return NULL.
+ */
+static PgStat_StatFuncEntry *
+pgstat_get_func_entry(PgStat_StatDBEntry *dbentry, Oid tableoid, bool create)
+{
+   PgStat_StatFuncEntry *result;
+   boolfound;
+   HASHACTION  action = (create ? HASH_ENTER : HASH_FIND);
+
+   /* Lookup or create the hash table entry for this table */
+   result = (PgStat_StatFuncEntry *) hash_search(dbentry-functions,
+   
 tableoid,
+   
 action, found);
+
+   if (!create  !found)
+   return NULL;
+
+   /* If not found, initialize the new one. */
+   if (!found)
+   {
+   result-f_numcalls = 0;
+   result-f_time = 0;
+   result-f_time_self = 0;
+   
+   /* inherit stat time reset from dbentry */
+   result-stat_reset_timestamp = 

Re: [HACKERS] proposal: auxiliary functions for record type

2010-12-11 Thread Jim Nasby
On Dec 11, 2010, at 9:24 AM, Florian Pflug wrote:
 begin
   for v_field in select * from fieldinfos(myrec) loop
   case
   when v_field.fieldtype = 'type1'::regtype then
   v_value_type1 := fieldvalue(myrec, NULL::type1, 
 false);
   Do something with v_value_type1
   ...
   when v_field.fieldtype = 'typeN'::regtype then
   v_value_typeN := fieldvalue(myrec, NULL::typeN, 
 false);
   Do something with v_value_typeN
   else raise exception 'Unexpected type % in record %', 
 v_field.fieldtype, myrec;
   end case;
   end loop;
 end;

All that casting looks horribly painful. :(

This is a case where what we really need is a data type that can accept data of 
any type and tell you what the type actually is. Another place that could 
really use that are crosstabs, though that case is more complex in that you're 
dealing with a recordset of fields and you don't know the type of any of them.

I wonder if it would be possible to create such a type without major backend 
surgery... I'm thinking the type would have a structure that stores the OID of 
the original type and then stores the text representation of that type. CAST 
functions to this type would know what the input data was and could store that 
information. The challenge is creating a way to cast back to the original 
type... I don't see a way to do that with current infrastructure.
--
Jim C. Nasby, Database Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: auxiliary functions for record type

2010-12-11 Thread Pavel Stehule
Hello

I looked on your module now and I am can't to say, what is better -
both modules has little bit different target if I understand, and I
don't know if they can be mixed. There are two strategies - rich
contrib module or a few functions in core. Probably we can't to have a
too rich API for record type like regexp has - transform result to
table, to array, read values to array, to table, read tuple desc to
array, to table, ... there should be one strategy - both strategies
(based on array, based on tables) are legal. I prefer a table based
solution, because I don't need a one unnest, but other preferences
are valid too. I dissatisfied with your design of explicit target type
via unused value.  I think, so we are not a infrastructure for it now
- from my view is better to use a common type, that is text now. It's
nothing new - plpgsql use it too.

I see one well design of explicit target type based on polymorphic
types that respect a PostgreSQL fmgr practice:

We have to allow a polymorphic functions without polymorphic
parameters. These functions shoud be designed to return value in
unknown type format when this function has not outer information.
This information can be passed in function context. When function
context isn't null, then function has to read target type and should
to return value in target type. Who can fill a function context? It is
task for executor. And when CAST contains just function call, then we
can recheck, if function is polymorphic, and if it is, then we can set
function context to target type, and then we don't need to call a
conversion function, because polymorphic function must returns data in
correct format. So when polymorphic function is used without
polymorphic parameters and without explicit CAST, then returns
unknown.

Without described functionality we can design a not polymorphic
function, that can returns unknown type. When similar functionality
will be implemented, then this function will be changed to
polymorphic, but from user's perspective, there isn't a change.

Regards

Pavel Stehule

2010/12/11 Florian Pflug f...@phlo.org:
 On Dec11, 2010, at 17:55 , Pavel Stehule wrote:
 It same trick, that I use in record_set_fields.
 Oh, I see. Sorry, must haven missed that when I read your blog entry.

 But I don't want to
 use it for reading of value. I don't like it. You don't need to know a
 value, you have to know a type - NULL::type. it is just not nice :).
 Well, no, it's not the most elegant API on earth, that's for sure. But I my 
 opinion, it isn't so bad that it rectifies casting everything to text.

 I
 though about it too, and maybe is a time for new polymorphic type
 anytype - and then you don't need to write a litte bit strange
 NULL::type

 it can be fieldvalue(myrec, type1, false)
 Hm, I don't think the ability to write just type1 instead of NULL::type1 
 is worth the necessary effort.

 If anything, I'd allow function to use anyelement as their return value 
 *without* having any polymorphic arguments. You'd need to surround calls to 
 such function with a CAST() expression, unless something else determines the 
 type. In pl/pgsql, for example, one could allow the CAST() to be skipped for 
 assignment to variables. Then, you could write
        v_value := fieldvalue(myrec, 'f1', true)

 That'd at least free you from having to specify the type in some cases. But 
 still, even this seems to be a lot of effort for quite little gain...

 best regards,
 Florian Pflug



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] function attributes

2010-12-11 Thread Andrew Dunstan
Yesterday I did a bit of work on allowing bytea values to be passed into 
and out of plperl in binary format, effectively removing the need to 
escape and de-escape them. (The work can be seen on he plperlargs branch 
of my development repo at 
https://github.com/adunstan/postgresql-dev/commits/plperlargs/).


At the moment the behaviour is triggered by a custom setting 
(plperl.pass_binary_bytea), but this isn't really satisfactory. We could 
turn it on  permanently, but that would break a lot of legacy code. What 
we really need is a way of marking a function with some attributes. Of 
course, we could put it in the program text like plpgsql's 
#variable_conflict, but that's really rather ugly. The grammar already 
has an attribute mechanism for functions, and ISTM we just need to 
extend that a bit to allow setting of function attributes reasonably 
flexibly, much as we can now specify format options on EXPLAIN or we'll 
soon be able to specify options for foreign tables.


Thoughts?

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] function attributes

2010-12-11 Thread David E. Wheeler
On Dec 11, 2010, at 2:27 PM, Andrew Dunstan wrote:

 Yesterday I did a bit of work on allowing bytea values to be passed into and 
 out of plperl in binary format, effectively removing the need to escape and 
 de-escape them. (The work can be seen on he plperlargs branch of my 
 development repo at 
 https://github.com/adunstan/postgresql-dev/commits/plperlargs/).

andrew++ # Woo!

 At the moment the behaviour is triggered by a custom setting 
 (plperl.pass_binary_bytea), but this isn't really satisfactory. We could turn 
 it on  permanently, but that would break a lot of legacy code. What we really 
 need is a way of marking a function with some attributes. Of course, we could 
 put it in the program text like plpgsql's #variable_conflict, but that's 
 really rather ugly. The grammar already has an attribute mechanism for 
 functions, and ISTM we just need to extend that a bit to allow setting of 
 function attributes reasonably flexibly, much as we can now specify format 
 options on EXPLAIN or we'll soon be able to specify options for foreign table

What does the existing attribute grammar for functions look like? An example 
perhaps?

Best,

David


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] create tablespace fails silently, or succeeds improperly

2010-12-11 Thread Robert Haas
On Sat, Dec 11, 2010 at 1:38 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Sat, Dec 11, 2010 at 1:20 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I'm not really thinking about crash recovery, but replication slaves.
 Omitting to create the tablespace location directories on slaves
 doesn't seem far-fetched at all.  Of course, it's likely that
 the slave server will lack permissions to create in the location
 directory's parent; but if it can, the outcome shouldn't be too
 unreasonable.

 Creating the tablespace directory itself would be reasonable, but
 recursing all the way up seems dubious.

 Well, it's *very* unlikely that the slave server would have permissions
 to create in the root directory or close to it.  If you grant that it's
 reasonable to create the location directory itself, why not the parent
 too, if that's still in a directory that's writable?  I agree that the
 reasonableness of the behavior drops off the more you go up, but so does
 the probability of having the needed permissions.  I don't agree that
 it's a binary choice where creating exactly one directory is reasonable
 but exactly two isn't.

I'm just giving you my opinion.  Take it for what it's worth.  :-)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] function attributes

2010-12-11 Thread Andrew Dunstan



On 12/11/2010 08:01 PM, David E. Wheeler wrote:



At the moment the behaviour is triggered by a custom setting 
(plperl.pass_binary_bytea), but this isn't really satisfactory. We could turn 
it on  permanently, but that would break a lot of legacy code. What we really 
need is a way of marking a function with some attributes. Of course, we could 
put it in the program text like plpgsql's #variable_conflict, but that's really 
rather ugly. The grammar already has an attribute mechanism for functions, and 
ISTM we just need to extend that a bit to allow setting of function attributes 
reasonably flexibly, much as we can now specify format options on EXPLAIN or 
we'll soon be able to specify options for foreign table

What does the existing attribute grammar for functions look like? An example 
perhaps?




create function foo() . with ( /attribute/ [, ...] )

Currently allowed attributes are isStrict and isCachable. The mechanism 
is effectively obsolete right now, but we could use it for what I have 
in mind quite nicely.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unlogged tables

2010-12-11 Thread Robert Haas
On Sat, Dec 11, 2010 at 2:53 PM, Jeff Janes jeff.ja...@gmail.com wrote:
 On Wed, Dec 8, 2010 at 6:52 AM, Marti Raudsepp ma...@juffo.org wrote:
 A very useful feature for unlogged tables would be the ability to
 switch them back to normal tables -- this way you could do bulk
 loading into an unlogged table and then turn it into a regular table
 using just fsync(), bypassing all the WAL-logging overhead.

 If archive_mode is off, then you can often find a way to bypass
 WAL-logging during bulk loading anyway.

 If archive_mode is on, then I don't see how this can work without
 massive changes.

Well,  you'd need to work your way through the heap and all of its
indices and XLOG every page.  And you've got to do that in a way
that's transaction-safe, and I don't have a design in mind for that
off the top of my head.  But I think massive changes is probably an
overstatement.  We can already handle ALTER TABLE operations that
involve a full relation rewrite, and that already does the
full-table-XLOG thing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unlogged tables

2010-12-11 Thread Robert Haas
On Sat, Dec 11, 2010 at 3:18 PM, Jeff Janes jeff.ja...@gmail.com wrote:
 On Fri, Dec 10, 2010 at 8:16 PM, Robert Haas robertmh...@gmail.com wrote:

 As I was working on the hash index support, it occurred to me that at
 some point in the future, we might want to allow an unlogged index on
 a permanent table.

 That is the feature I would be most excited about.

 With the current patch, an index is unlogged if
 and only if the corresponding table is unlogged, and both the table
 and the index are reset to empty on restart.  But we could have a
 slightly different flavor of index that, instead of being reset to
 empty, just gets marked invalid, perhaps by truncating the file to
 zero-length (and adding some code to treat that as something other
 than a hard error).  Perhaps you could even arrange for autovacuum to
 kick off an automatic rebuild,

 Or just have rebuilding the index as part of crash recovery.  I
 wouldn't use the feature anyway on indexes that would take more than a
 few seconds to rebuild, And wouldn't want to advertise the database as
 being available when it is essentially crippled from missing indexes.
 I'd rather bite the bullet up front.

I don't think you can rebuild the indexes during crash recovery; I
believe you need to be bound to the database that contains the index,
and, as we've been over before, binding to a database is irrevocable,
so the startup process can't bind to the database, rebuild the index,
and then unbind.  It would need to signal the postmaster to fire up
other backends to do this work, and at that point I think you may as
well piggyback on autovacuum rather than designing a similar mechanism
from scratch.

Also, while YOU might use such a feature only for indexes that can be
rebuilt in a few seconds, I strongly suspect that other people might
use it in other ways.  In particular, it seems that it would be
possibly sensible to use a feature like this for an index that's only
used for reporting queries.  If the database crashes, we'll still have
our primary key so we can continue operating, but we'll need to
reindex before running the nightly reports.

 I would think of it is as declaring that, instead of making the index
 recoverable via WAL logging and replay, instead make it recoverable by
 rebuilding.  So in that way it is quit unlike unlogged tables, in that
 we are not risking any data, just giving the database a hint about
 what the most expeditious way to maintain the index might be.  Well,
 more of an order than a hint, I guess.

I think it's six of one, half a dozen of the other.  An index by its
nature only contains data that is duplicated in a table, so by
definition loss of an index isn't risking any data.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] function attributes

2010-12-11 Thread Robert Haas
On Sat, Dec 11, 2010 at 5:27 PM, Andrew Dunstan
andrew.duns...@pgexperts.com wrote:
 Yesterday I did a bit of work on allowing bytea values to be passed into and
 out of plperl in binary format, effectively removing the need to escape and
 de-escape them. (The work can be seen on he plperlargs branch of my
 development repo at
 https://github.com/adunstan/postgresql-dev/commits/plperlargs/).

 At the moment the behaviour is triggered by a custom setting
 (plperl.pass_binary_bytea), but this isn't really satisfactory. We could
 turn it on  permanently, but that would break a lot of legacy code. What we
 really need is a way of marking a function with some attributes. Of course,
 we could put it in the program text like plpgsql's #variable_conflict, but
 that's really rather ugly. The grammar already has an attribute mechanism
 for functions, and ISTM we just need to extend that a bit to allow setting
 of function attributes reasonably flexibly, much as we can now specify
 format options on EXPLAIN or we'll soon be able to specify options for
 foreign tables.

 Thoughts?

Well, you could set that GUC (plperl.pass_binary_bytea) on a function
without changing any syntax on at all.

CREATE FUNCTION name (args) ... SET plperl.pass_binary_bytea = true;

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unlogged tables

2010-12-11 Thread Jeff Janes
On Fri, Dec 10, 2010 at 8:16 PM, Robert Haas robertmh...@gmail.com wrote:

 I think the first patch (relpersistence-v4.patch) is ready to commit,
 and the third patch to allow synchronous commits to become
 asynchronous when it doesn't matter (relax-sync-commit-v1.patch)
 doesn't seem to be changing much either, although I would appreciate
 it if someone with more expertise than I have with our write-ahead
 logging system would give it a quick once-over.

I don't understand what the point of the relax-sync-commit patch is.

If XactLastRecEnd.xrecoff == 0, then calling
XLogFlush(XactLastRecEnd) is pretty much a null operation anyway
because it will short-circuit at the early statement:

if (XLByteLE(record, LogwrtResult.Flush)) return

Or at least it had better return at that point, or we might have a
serious problem.  If XactLastRecEnd.xrecoff == 0 then the only way to
keep going is if XactLastRecEnd.xlogid is ahead of
LogwrtResult.Flush.xlogid.

I guess that could happen legitimately if the logs have recently
rolled over the 4GB boundary, and XactLastRecEnd is aware of this
while LogwrtResult is not yet aware of it.  I don't know if that is a
possible state of affairs.  If it is, then the result would be that on
very rare occasion your patch removes a spurious, but not harmful
other than performance,  fsync.

If somehow XactLastRecEnd gets a falsely advanced value of xlogid,
then calling XLogFlush with it would cause a PANIC xlog write request
%X/%X is past end of log %X/%X.  So unless people have been seeing
this, that must not be able to happen.  And looking at the only places
XactLastRecEnd.xlogid get set, I don't see how it could happen.


So maybe in your patch:

if ((wrote_xlog  XactSyncCommit) || forceSyncCommit || nrels  0)

should be

if (wrote_xlog  (XactSyncCommit || forceSyncCommit || nrels  0) )

It seems like on general principles we should not be passing to
XLogFlush a structure which is by definition invalid.




But even if XLogFlush is going to return immediately, that doesn't
negate the harm caused by commit_delay doing its thing needlessly.
Perhaps that was the original motivation for your patch.

Cheers,

Jeff

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] function attributes

2010-12-11 Thread Andrew Dunstan



On 12/11/2010 09:16 PM, Robert Haas wrote:


Well, you could set that GUC (plperl.pass_binary_bytea) on a function
without changing any syntax on at all.

CREATE FUNCTION name (args) ... SET plperl.pass_binary_bytea = true;


Oh, good point. I'd forgotten about that. I'll experiment and see if I 
can break it :-)


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unlogged tables

2010-12-11 Thread Robert Haas
On Sat, Dec 11, 2010 at 9:21 PM, Jeff Janes jeff.ja...@gmail.com wrote:
 On Fri, Dec 10, 2010 at 8:16 PM, Robert Haas robertmh...@gmail.com wrote:

 I think the first patch (relpersistence-v4.patch) is ready to commit,
 and the third patch to allow synchronous commits to become
 asynchronous when it doesn't matter (relax-sync-commit-v1.patch)
 doesn't seem to be changing much either, although I would appreciate
 it if someone with more expertise than I have with our write-ahead
 logging system would give it a quick once-over.

 I don't understand what the point of the relax-sync-commit patch is.

Suppose we begin a transaction, write a bunch of data to a temporary
table, and commit.  Suppose further that synchronous_commit = off.  At
transaction commit time, we haven't written any XLOG records yet, but
we do have an XID assigned because of the writes to the temporary
tables.  So we'll issue a commit record.  Without this patch, since
synchronous_commit = off, we'll force that commit record to disk
before acknowledging the commit to the user.  However, that's not
really necessary because if we crash after acknowledging the commit to
the user, the temporary tables will disappear anyway, and our XID
doesn't exist on disk any place else - thus, whether the commit makes
it to disk before the crash or not will be immaterial on restart.

If you have a bunch of transactions that make write to temporary (or
unlogged) tables but not to any permanent tables, this makes it muuuch
faster.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] function attributes

2010-12-11 Thread Robert Haas
On Sat, Dec 11, 2010 at 9:28 PM, Andrew Dunstan and...@dunslane.net wrote:
 On 12/11/2010 09:16 PM, Robert Haas wrote:

 Well, you could set that GUC (plperl.pass_binary_bytea) on a function
 without changing any syntax on at all.

 CREATE FUNCTION name (args) ... SET plperl.pass_binary_bytea = true;

 Oh, good point. I'd forgotten about that. I'll experiment and see if I can
 break it :-)

I guess one problem with that approach is that the scoping of the
parameter change will be dynamic rather than lexical.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] keeping a timestamp of the last stats reset (for a db, table and function)

2010-12-11 Thread Robert Haas
On Sat, Dec 11, 2010 at 4:40 PM,  t...@fuzzy.cz wrote:
 Hello

 you have to respect pg coding style:

 a) not too long lines
 b) not C++ line comments

 OK, thanks for the notice. I've fixed those two problems.

Please add this to the currently-open commitfest.

https://commitfest.postgresql.org/action/commitfest_view/open

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] proposal : cross-column stats

2010-12-11 Thread Tomas Vondra
Hi everyone,

one of the ssesion I've attended on PgDay last week was Heikki's session
about statistics in PostgreSQL. One of the issues he mentioned (and one
I regularly run into) is the absence of cross-column stats. When the
columns are not independent, this usually result in poor estimates (and
then in suboptimal plans).

I was thinking about this issue before, but that session was the last
impulse that pushed me to try to hack up a proof of concept. So here it
is ;-)

Lets talk about one special case - I'll explain how the proposed
solution works, and then I'll explain how to make it more general, what
improvements are possible, what issues are there. Anyway this is by no
means a perfect or complete solution - it's just a starting point.

 Short introduction 

Say we have a table with two INT columns - col_a and col_b, and we want
to estimate number of rows for a condition involving both columns:

   WHERE (col_a BETWEEN m AND n) AND (col_b BETWEEN p AND q)

When the columns are independent, doing the estimate is just a matter of
multiplication. When the columns are dependent, the estimate may be way off.

Lets assume there are histograms with 5 bins for both columns. What I
propose is basically building a 2D histogram. It kind of resembles
contingency table.

So we do have a table like this

col_b \ col_a  || 20% | 20% | 20% | 20% | 20% |
===||==
  20%  ||  4% |  4% |  4% |  4% |  4% |
  20%  ||  4% |  4% |  4% |  4% |  4% |
  20%  ||  4% |  4% |  4% |  4% |  4% |
  20%  ||  4% |  4% |  4% |  4% |  4% |
  20%  ||  4% |  4% |  4% |  4% |  4% |
===||==

where each column / row represents a bin in the original histograms, and
each cell represents an expected number of rows in it (for really
independent columns, it's 4%).

For dependent columns the actual values may be actually very different,
of course - e.g. for strongly dependent columns it might be like this

col_b \ col_a  || 20% | 20% | 20% | 20% | 20% |
===||==
  20%  || 20% |  0% |  0% |  0% |  0% |
  20%  ||  0% | 20% |  0% |  0% |  0% |
  20%  ||  0% |  0% | 20% |  0% |  0% |
  20%  ||  0% |  0% |  0% | 20% |  0% |
  20%  ||  0% |  0% |  9% |  0% | 20% |
===||==

To estimate the number of rows matching the condition, you'd sum
estimates for cells matching the condition. I.e. when the condition on
col_a matches the lowest 20% (the first histogram bin) and the condition
on col_b matches the lowest 20% of values, this corresponds to the first
cell (20% of rows).

Current optimizer estimates this to be 4% as it believes the columns are
independent.

I'm not sure whether I've explained it well enough, but the essence of
the proposal is to build N-dimensional histograms (where N is the number
of columns covered by the statistics) just as we are building histograms
today.

--- Proof of concept ---

I've hacked a nasty proof of concept in PL/pgSQL (see the attachment).
It creates two tables - test_table and cross_stats. The former contains
the data (in two columns), the latter is used to store cross-column
statistics of test_table.

Then there are several PL/pgSQL functions - two of them are really
important:

collect_stats(p_bins_a INT, p_bins_b INT)
  - this one is used to collect the stats, the parameters represent
number of bins for the columns (sides of the contingency table)
  - collect_stats(10,10) will build contingency table with 100 cells

get_estimate(p_from_a INT, p_to_a INT, p_from_b INT, p_to_b INT)
  - computes estimate for the condition listed above (ranges on both
columns)

So to run the PoC, just do something like this:

1) fill the table with some data

   INSERT INTO test_table SELECT round(random()*1000),
   round(random()*1000) FROM generate_series(1,10);

2) collect the cross-column statistics

   SELECT collect_stats(10,10);

3) see current estimated and actual number of rows

   EXPLAIN ANALYZE SELECT * FROM test_table
   WHERE (col_a BETWEEN 30 AND 129)
 AND (col_b BETWEEN 234 AND 484);

4) see the estimate based on contingency table

   SELECT get_estimate(30, 129, 234, 484);

Just two very simple tests for now - col_a/col_b contain the range from
the query, then there are actual number of matching rows and a current
estimate, And finally the new estimate based on contingency table with
various number of bins.

A) independent columns (proof that it produces just as good estimates
   as the current code)

   col_a   |   col_b   | actual | expected | 10x10 | 20x20 |
  [50,70]  |  [50,70]  |41  | 40   |   41  |47 |
  [50,250] |  [50,250] |  4023  |   4024   | 4436  |  3944 |
  [50,250] | [750,950] |  4023  |   3955   | 4509  |  

Re: [HACKERS] function attributes

2010-12-11 Thread David E. Wheeler
On Dec 11, 2010, at 5:58 PM, Andrew Dunstan wrote:

 create function foo() . with ( /attribute/ [, ...] )
 
 Currently allowed attributes are isStrict and isCachable. The mechanism is 
 effectively obsolete right now, but we could use it for what I have in mind 
 quite nicely.

Makes sense, though I think I might also like some sort of global default to 
toggle.

David
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers