Re: [HACKERS] Anyone for SSDs?
On Fri, 2010-12-10 at 18:07 -0800, Josh Berkus wrote:
> On 12/10/10 5:06 PM, Daniel Loureiro wrote:
>> A quicksort method on sequential disk is just awful to think about in a non-SSD world, but it's possible on an SSD.
> So, code it. Shouldn't be hard to write a demo comparison. I don't believe that SSDs make quicksort-on-disk feasible, but would be happy to be proven wrong.

I too do not believe it in the normal case. However, considering the 'types' of SSDs, it may be feasible! Asking for the next page and getting it involves a time delay. On a regular HDD with spindles, the question is where that page is located; with SSDs, the question disappears, because the access time is uniform. Also, the access time is about 100 times faster, which would change quite a few things about the whole process.

I would like to try that, but I do not have an SSD on my machine! :( Would it be impractical for someone to try it for me? Maybe some of those I meet on the chat who are in enterprise settings could do it, but I would like to be a part of it.

-Vaibhav (*_*)

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Anyone for SSDs?
> You can believe whatever you want, that doesn't make it true.

Completely agreed. Like yours, it's just my point of view, not the reality. I agree with most points here, but I wonder how many good ideas are killed with the thought: "this will be a performance killer with so much random access, let's discard it." If in the 80's sequential access had been the expensive one compared with random access (ok, that's not quite the SSD case), would PostgreSQL have the same design it has today?

--
Daniel Loureiro.
[HACKERS] Re: [COMMITTERS] pgsql: Use symbolic names not octal constants for file permission flags
This broke the buildfarm on Windows, and I'm not sure of the best way to fix it. We currently define read and write permissions in port/win32.h specifically for Windows. A quick fix that just adds these new ones as aliases won't work, because they are used in both open() and umask() calls. Since umask() turns *off* permissions, they can't be defined to the same values as the user bits, or all files would be created read-only. For umask(), it would work to define them to 0. But for open() and other such calls that turn *on* permissions, each needs to be defined to the same value as the corresponding user permission. Not sure of the best way to fix this. Perhaps we need to invent PG_UMASK_xyz macros?

//Magnus

On Fri, Dec 10, 2010 at 23:35, Tom Lane t...@sss.pgh.pa.us wrote:
> Use symbolic names not octal constants for file permission flags.
> Purely cosmetic patch to make our coding standards more consistent --- we were doing symbolic some places and octal other places. This patch fixes all C-coded uses of mkdir, chmod, and umask. There might be some other calls I missed. Inconsistency noted while researching tablespace directory permissions issue.
> Branch
> ------
> master
>
> Details
> -------
> http://git.postgresql.org/gitweb?p=postgresql.git;a=commitdiff;h=04f4e10cfc158239ca00a6ed6a84428c7acb1e6d
>
> Modified Files
> --------------
> src/backend/access/transam/xlog.c   | 2 +-
> src/backend/commands/copy.c         | 2 +-
> src/backend/commands/tablespace.c   | 2 +-
> src/backend/libpq/be-fsstubs.c      | 7 ---
> src/backend/postmaster/postmaster.c | 4 ++--
> src/backend/postmaster/syslogger.c  | 6 +++---
> src/backend/storage/file/copydir.c  | 2 +-
> src/backend/storage/ipc/ipc.c       | 4 ++--
> src/bin/initdb/initdb.c             | 16
> src/bin/pg_ctl/pg_ctl.c             | 2 +-
>
> 10 files changed, 24 insertions(+), 23 deletions(-)

--
Sent via pgsql-committers mailing list (pgsql-committ...@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] Final(?) proposal for wal_sync_method changes
On Wed, Dec 8, 2010 at 15:40, Tom Lane t...@sss.pgh.pa.us wrote:
> Magnus Hagander mag...@hagander.net writes:
>> On Wed, Dec 8, 2010 at 02:07, Tom Lane t...@sss.pgh.pa.us wrote:
>>> [ win32.h says ]
>>> #define fsync(fd) _commit(fd)
>>> What this means is that switching to a simple preference order "fdatasync, then fsync" will result in choosing fsync on Windows (since it hasn't got fdatasync), meaning _commit, meaning Windows users see a behavioral change after all.
>> _commit() != fsync()
> Um, the macro quoted above makes them the same, no? One of us is confused.

Uh, yeah. Sorry, that unclearness came from being too preoccupied with the conference; pretty sure I'm the confused one. _commit() is definitely the same as fsync() at the API level. And this corresponds to fsync_writethrough, not fsync, when you talk about the wal_sync_method parameter. It will always sync through the write cache, even if it's a hardware BBU'ed cache.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] proposal: auxiliary functions for record type
On Dec11, 2010, at 06:20 , Pavel Stehule wrote:
> I wrote a few functions for record type - record_expand, record_get_fields, record_get_field, record_set_fields.

Just FYI, I've created something similar a while ago. The code can be found at https://github.com/fgp/pg_record_inspect

The main difference seems to be that my code allows you to retrieve fields from a record without casting them to text. As a consequence, however, it lacks the set-returning functions that yours provides. It also doesn't provide a way to modify fields.

Still, maybe we could join efforts here? I certainly would *love* to see something like this either as a contrib module or in core.

best regards, Florian Pflug
Re: [HACKERS] PS display and standby query conflict
On Thu, 2010-12-09 at 22:13 +0900, Fujii Masao wrote:
> When I created a conflict between recovery and many read-only transactions in the standby server for test purposes, I found that the keyword "waiting" disappeared from the PS display for just a moment even though the conflict had not been resolved yet. This seems strange to me. This problem happens because ResolveRecoveryConflictWithVirtualXIDs resets the PS display for each read-only transaction that recovery waits for. Why do we need to reset it each time even though the conflict has not been resolved yet? The attached patch suppresses such a needless reset. Comments?

The reset occurs at most every 500ms, so not much of a problem there. But if it annoys you, it seems OK to change it. Don't see a reason to backpatch, though?

> BTW, ResolveRecoveryConflictWithVirtualXIDs calls pgstat_report_waiting(), which is also needless since the startup process doesn't have the shared memory entry (i.e., MyBEEntry) for pg_stat_activity. The attached patch removes that call.

IIRC that wasn't added by me, so I'm not sure why it's there. Not harming anything either.

--
Simon Riggs http://www.2ndQuadrant.com/books/
PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] libpq changes for synchronous replication
On Mon, Dec 6, 2010 at 12:54 AM, Fujii Masao masao.fu...@gmail.com wrote:
> On Mon, Dec 6, 2010 at 3:07 AM, Greg Smith g...@2ndquadrant.com wrote:
>> The one time this year top-posting seems appropriate... this patch seems stalled waiting for some sort of response to the concerns Alvaro raised here.
> Sorry for the delay. I didn't have the time.
>> I gave this a look. It seems good, but I'm not sure about this bit:
> Thanks for the review!
>> I guess this was OK when this was conceived as CopyXlog, but since it's now a generic mechanism, this seems a bit unwise. Should this be reconsidered so that it's possible to change the format or number of columns?
> I changed the CopyBothResponse message so that it includes the format and number of columns of the copy data. Please see the attached patch.
>> (The paragraph added to the docs is also a bit too specific about this being used exclusively in streaming replication, ISTM)
> Yes. But it seems difficult to generalize the docs further, because currently only SR uses copy-both. So I had to write, for example, that the only way to get into that state is the START REPLICATION command.
> While modifying the code, it occurred to me that we might have to add a new ExecStatusType like PGRES_COPY_BOTH and use that for copy-both mode, for the sake of consistency. But since it would just be an alias for now, i.e., there is no specific behavior for that ExecStatusType, I don't think it's worth adding yet.
>> I'm not so sure about this. If we think that it's worth adding a new possible state, we should do so now; we will not be able to change this behavior later.
> OK. I added that new state.

Committed with just a few changes to the documentation.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] proposal: auxiliary functions for record type
2010/12/11 Florian Pflug f...@phlo.org:
> On Dec11, 2010, at 06:20 , Pavel Stehule wrote:
>> I wrote a few functions for record type - record_expand, record_get_fields, record_get_field, record_set_fields.
> Just FYI, I've created something similar a while ago. The code can be found at https://github.com/fgp/pg_record_inspect
> The main difference seems to be that my code allows you to retrieve fields from a record without casting them to text. As a consequence, however, it lacks the set-returning functions that yours provides. It also doesn't provide a way to modify fields.

Casting to text is necessary for PL/pgSQL. I am not happy with this, but there is no other way than using a common type - text - because you don't know the target type.

> Still, maybe we could join efforts here? I certainly would *love* to see something like this either as a contrib module or in core.

I'll look at your module.

Regards, Pavel Stehule
Re: [HACKERS] proposal: auxiliary functions for record type
On Dec11, 2010, at 16:03 , Pavel Stehule wrote:
> 2010/12/11 Florian Pflug f...@phlo.org:
>> On Dec11, 2010, at 06:20 , Pavel Stehule wrote:
>>> I wrote a few functions for record type - record_expand, record_get_fields, record_get_field, record_set_fields.
>> Just FYI, I've created something similar a while ago. The code can be found at https://github.com/fgp/pg_record_inspect
>> The main difference seems to be that my code allows you to retrieve fields from a record without casting them to text. As a consequence, however, it lacks the set-returning functions that yours provides. It also doesn't provide a way to modify fields.
> Casting to text is necessary for PL/pgSQL. I am not happy with this, but there is no other way than using a common type - text - because you don't know the target type.

I use the anyarray/anyelement machinery to cheat there, at least a bit. My function fieldvalue() returns anyelement and takes a parameter defval of type anyelement, which serves two purposes.

First, by virtue of the anyelement machinery, the return type of fieldvalue() is that of defval. If the actual type of the requested field matches that type, the value is returned. If they don't match, the parameter coerce decides whether fieldvalue() tries to cast the value to the requested type, or simply raises an error.

Second, to also give the *value*, not only the *type*, of defval a meaning, it serves as the default return value: if the requested field contains NULL, defval is returned instead. You are, of course, free to pass NULL for defval itself to turn that mapping into a no-op.

Note that the returned value's type is always the same as defval's type, so the whole thing is perfectly type-safe from the point of view of the rest of the system. As long as you know all possible types that can appear in your record's fields, you can do something along these lines in PL/pgSQL:

declare
  v_value_type1 type1;
  v_value_type2 type2;
  ...
  v_value_typeN typeN;
begin
  for v_field in select * from fieldinfos(myrec) loop
    case
      when v_field.fieldtype = 'type1'::regtype then
        v_value_type1 := fieldvalue(myrec, NULL::type1, false);
        -- Do something with v_value_type1
      ...
      when v_field.fieldtype = 'typeN'::regtype then
        v_value_typeN := fieldvalue(myrec, NULL::typeN, false);
        -- Do something with v_value_typeN
      else
        raise exception 'Unexpected type % in record %', v_field.fieldtype, myrec;
    end case;
  end loop;
end;

It works pretty well for me.

best regards, Florian Pflug
Re: [HACKERS] [COMMITTERS] pgsql: Use symbolic names not octal constants for file permission flags
Magnus Hagander mag...@hagander.net writes:
> This broke the buildfarm on Windows, and I'm not sure of the best way to fix it. We currently define read and write permissions in port/win32.h specifically for Windows. A quick-fix to just add these new ones as aliases won't work, because they are used in both open and umask calls.

Hm, those symbols are already in use elsewhere in the code; I would assume it's just a matter of missing #includes in these particular files. Where does Windows define 'em?

regards, tom lane
Re: [HACKERS] proposal: auxiliary functions for record type
Hey Pavel,

Everything looks fine, but I propose adding record_set_field just for convenience, and as the opposite of record_get_field.

2010/12/11 Pavel Stehule pavel.steh...@gmail.com
> Hello
> I wrote a few functions for record type - record_expand, record_get_fields, record_get_field, record_set_fields. A usage of these functions is described on my blog: http://okbob.blogspot.com/2010/12/iteration-over-record-in-plpgsql.html
> Do you think these functions could be in core? They are relatively powerful and general enough, with zero dependencies. Sure, these functions are not defined in ANSI SQL.
> Regards, Pavel Stehule

--
// Dmitriy.
Re: [HACKERS] create tablespace fails silently, or succeeds improperly
Stephen Frost sfr...@snowman.net writes:
> * Robert Haas (robertmh...@gmail.com) wrote:
>> On Fri, Dec 10, 2010 at 10:33 PM, Tom Lane t...@sss.pgh.pa.us wrote:
>>> Anybody have a problem with adopting this behavior?
>> Seems a bit surprising.
> Yeah... I'm not really sure about "mkdir -p"-type actions from a SQL command. Not entirely sure why, but it doesn't feel 'right' to me. I'd rather have PG complain that the directory doesn't exist.

OK. Is there any value in doing "mkdir -p" in WAL-recovery execution of CREATE TABLESPACE but not regular execution?

regards, tom lane
Re: [HACKERS] [COMMITTERS] pgsql: Use symbolic names not octal constants for file permission flags
I wrote:
> Magnus Hagander mag...@hagander.net writes:
>> This broke the buildfarm on Windows, and I'm not sure of the best way to fix it.
> Hm, those symbols are already in use elsewhere in the code; I would assume it's just a matter of missing #includes in these particular files. Where does Windows define 'em?

Ah, I have a theory: fcntl.h. Seems that ancient Unix specs say S_IRUSR etc. should be defined there, rather than in sys/stat.h, which is the modern expectation. Count on M$ to find creative ways of being incompatible ...

regards, tom lane
Re: [HACKERS] [COMMITTERS] pgsql: Use symbolic names not octal constants for file permission flags
On Sat, Dec 11, 2010 at 17:13, Tom Lane t...@sss.pgh.pa.us wrote:
> Ah, I have a theory: fcntl.h. Seems that ancient Unix specs say S_IRUSR etc. should be defined there, rather than in sys/stat.h, which is the modern expectation. Count on M$ to find creative ways of being incompatible ...

Nope, not there. I can't find S_IWGRP in any of the files. But that symbol, OTOH, is *not* in use anywhere else in the code (only in zic.c, where it's ifdef'ed out on win32).

I guess I need to go look at each individual one that breaks and split this into sub-problems.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] [COMMITTERS] pgsql: Use symbolic names not octal constants for file permission flags
Magnus Hagander mag...@hagander.net writes:
> On Sat, Dec 11, 2010 at 17:13, Tom Lane t...@sss.pgh.pa.us wrote:
>> Ah, I have a theory: fcntl.h.
> Nope, not there. I can't find S_IWGRP in any of the files.

Well, I notice that copydir.c compiled before, and still compiles after, despite this change:

-   if (mkdir(todir, S_IRUSR | S_IWUSR | S_IXUSR) != 0)
+   if (mkdir(todir, S_IRWXU) != 0)

I think the reason it's not failing is that it includes fcntl.h.

regards, tom lane
Re: [HACKERS] [COMMITTERS] pgsql: Use symbolic names not octal constants for file permission flags
On Sat, Dec 11, 2010 at 17:23, Tom Lane t...@sss.pgh.pa.us wrote:
> Well, I notice that copydir.c compiled before, and still compiles after, despite this change:
>
> -   if (mkdir(todir, S_IRUSR | S_IWUSR | S_IXUSR) != 0)
> +   if (mkdir(todir, S_IRWXU) != 0)
>
> I think the reason it's not failing is that it includes fcntl.h.

S_IRWXU is defined in port/win32.h...

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] [COMMITTERS] pgsql: Use symbolic names not octal constants for file permission flags
Magnus Hagander mag...@hagander.net writes:
> On Sat, Dec 11, 2010 at 17:23, Tom Lane t...@sss.pgh.pa.us wrote:
>> I think the reason it's not failing is that it includes fcntl.h.
> S_IRWXU is defined in port/win32.h...

No, it isn't. There's an apparently-useless definition of _S_IRWXU there, but no S_IRWXU.

regards, tom lane
Re: [HACKERS] [COMMITTERS] pgsql: Use symbolic names not octal constants for file permission flags
On Sat, Dec 11, 2010 at 17:38, Tom Lane t...@sss.pgh.pa.us wrote:
> No, it isn't. There's an apparently-useless definition of _S_IRWXU there, but no S_IRWXU.

Hmm. You're right, of course. A search on my Windows box finds the text string S_IRWXU in the following *.h files across the whole filesystem:

c:\perl\lib\CORE\perl.h
c:\perl64\lib\CORE\perl.h
c:\pgsql\src\include\pg_config_os.h
c:\pgsql\src\include\port\win32.h

That's it.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] [COMMITTERS] pgsql: Use symbolic names not octal constants for file permission flags
Magnus Hagander mag...@hagander.net writes:
> On Sat, Dec 11, 2010 at 17:38, Tom Lane t...@sss.pgh.pa.us wrote:
>> No, it isn't. There's an apparently-useless definition of _S_IRWXU there, but no S_IRWXU.
> Hmm. You're right, of course. A search on my Windows box finds the text string S_IRWXU in the following *.h files across the whole filesystem:
> c:\perl\lib\CORE\perl.h
> c:\perl64\lib\CORE\perl.h
> c:\pgsql\src\include\pg_config_os.h
> c:\pgsql\src\include\port\win32.h
> That's it.

OK, now I'm really confused. We have at least two questions:

1. How did all those pre-existing references to S_IRWXU compile?

2. Why didn't the previously hard-wired constants passed to chmod and umask fail on Windows? The M$ documentation I can find at the moment suggests that *only* _S_IREAD and _S_IWRITE bits are allowed in the inputs to those functions, which apparently is untrue, or none of this code would have executed successfully.

regards, tom lane
Re: [HACKERS] proposal: auxiliary functions for record type
2010/12/11 Florian Pflug f...@phlo.org:
> On Dec11, 2010, at 16:03 , Pavel Stehule wrote:
>> Casting to text is necessary for PL/pgSQL. I am not happy with this, but there is no other way than using a common type - text - because you don't know the target type.
> I use the anyarray/anyelement machinery to cheat there, at least a bit. My function fieldvalue() returns anyelement and takes a parameter defval of type anyelement, which serves two purposes. First, by virtue of the anyelement machinery, the return type of fieldvalue() is that of defval. If the actual type of the requested field matches that type, the value is returned. If they don't match, the parameter coerce decides whether fieldvalue() tries to cast the value to the requested type, or simply raises an error.

It's the same trick I use in record_set_fields, but I don't want to use it for reading a value; I don't like it. You don't need to know a value, you only have to know a type, and NULL::type is just not nice :). I thought about it too, and maybe it is time for a new polymorphic type "anytype" - then you don't need to write the slightly strange NULL::type; it could be fieldvalue(myrec, type1, false).

Regards, Pavel Stehule

> Second, to also give the *value*, not only the *type*, of defval a meaning, it serves as the default return value: if the requested field contains NULL, defval is returned instead. You are, of course, free to pass NULL for defval itself to turn that mapping into a no-op.
>
> Note that the returned value's type is always the same as defval's type, so the whole thing is perfectly type-safe from the point of view of the rest of the system. As long as you know all possible types that can appear in your record's fields, you can do something along these lines in PL/pgSQL:
>
> declare
>   v_value_type1 type1;
>   v_value_type2 type2;
>   ...
>   v_value_typeN typeN;
> begin
>   for v_field in select * from fieldinfos(myrec) loop
>     case
>       when v_field.fieldtype = 'type1'::regtype then
>         v_value_type1 := fieldvalue(myrec, NULL::type1, false);
>         -- Do something with v_value_type1
>       ...
>       when v_field.fieldtype = 'typeN'::regtype then
>         v_value_typeN := fieldvalue(myrec, NULL::typeN, false);
>         -- Do something with v_value_typeN
>       else
>         raise exception 'Unexpected type % in record %', v_field.fieldtype, myrec;
>     end case;
>   end loop;
> end;
>
> It works pretty well for me.
>
> best regards, Florian Pflug
Re: [HACKERS] [COMMITTERS] pgsql: Use symbolic names not octal constants for file permission flags
On Sat, Dec 11, 2010 at 17:55, Tom Lane t...@sss.pgh.pa.us wrote:
> OK, now I'm really confused. We have at least two questions:
> 1. How did all those pre-existing references to S_IRWXU compile?

Yeah, that's weird. IIRC (I stopped looking for the moment, need to step back) some of them were protected by #ifndef WIN32, but not all of them...

> 2. Why didn't the previously hard-wired constants passed to chmod and umask fail on Windows? The M$ documentation I can find at the moment suggests that *only* _S_IREAD and _S_IWRITE bits are allowed in the inputs to those functions, which apparently is untrue, or none of this code would have executed successfully.

Probably it ignores any flags it doesn't know about?

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] [COMMITTERS] pgsql: Use symbolic names not octal constants for file permission flags
Magnus Hagander mag...@hagander.net writes:
> On Sat, Dec 11, 2010 at 17:55, Tom Lane t...@sss.pgh.pa.us wrote:
>> 2. Why didn't the previously hard-wired constants passed to chmod and umask fail on Windows? The M$ documentation I can find at the moment suggests that *only* _S_IREAD and _S_IWRITE bits are allowed in the inputs to those functions, which apparently is untrue, or none of this code would have executed successfully.
> Probably it ignores any flags it doesn't know about?

Maybe, but unless _S_IREAD and _S_IWRITE are defined as 0400 and 0200, the code would have been doing the wrong thing altogether. I'm not sure why you think the group/other macros couldn't be #define'd as 0? But in any case there's still the question of where S_IRWXU is coming from, since it clearly does work in several places.

regards, tom lane
Re: [HACKERS] [COMMITTERS] pgsql: Use symbolic names not octal constants for file permission flags
Magnus Hagander mag...@hagander.net writes:
> On Sat, Dec 11, 2010 at 17:55, Tom Lane t...@sss.pgh.pa.us wrote:
>> 1. How did all those pre-existing references to S_IRWXU compile?
> Yeah, that's weird. IIRC (I stopped looking for the moment, need to step back) some of them were protected by #ifndef WIN32, but not all of them...

The lightbulb just went on: in win32.h,

#define mkdir(a,b) mkdir(a)

I didn't go through it in complete detail, but I'll bet all the working instances are in mkdir calls, or else inside #ifndef WIN32. I think we can just #define the other cases as zeroes. I'm not sure why you think that's an issue for open --- the privileges don't exist.

regards, tom lane
Re: [HACKERS] [COMMITTERS] pgsql: Use symbolic names not octal constants for file permission flags
On Sat, Dec 11, 2010 at 18:46, Tom Lane t...@sss.pgh.pa.us wrote:
> The lightbulb just went on: in win32.h,
>
> #define mkdir(a,b) mkdir(a)
>
> I didn't go through it in complete detail, but I'll bet all the working instances are in mkdir calls, or else inside #ifndef WIN32.

Ah, that certainly looks like a smoking gun.

> I think we can just #define the other cases as zeroes. I'm not sure why you think that's an issue for open --- the privileges don't exist.

Hmm. I was/am worried about any case that specifies *just* one of the permissions that don't exist. That'll leave it at zero, whereas the correct one might be the user-only version of whatever (read/write) was given.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] [COMMITTERS] pgsql: Use symbolic names not octal constants for file permission flags
Magnus Hagander mag...@hagander.net writes:
> On Sat, Dec 11, 2010 at 18:46, Tom Lane t...@sss.pgh.pa.us wrote:
>> I think we can just #define the other cases as zeroes. I'm not sure why you think that's an issue for open --- the privileges don't exist.
> Hmm. I was/am worried about any case that specifies *just* one of the permissions that don't exist. That'll leave it at zero, whereas the correct one might be the user-only version of whatever (read/write) was given.

If we didn't specify the user read or write privilege, we shouldn't get it. What are the values of _S_IREAD and _S_IWRITE, anyway? I'm still wondering how the previous coding with hardwired constants behaved correctly.

regards, tom lane
Re: [HACKERS] create tablespace fails silently, or succeeds improperly
On Sat, Dec 11, 2010 at 10:56 AM, Tom Lane t...@sss.pgh.pa.us wrote:
> Stephen Frost sfr...@snowman.net writes:
>> * Robert Haas (robertmh...@gmail.com) wrote:
>>> On Fri, Dec 10, 2010 at 10:33 PM, Tom Lane t...@sss.pgh.pa.us wrote:
>>>> Anybody have a problem with adopting this behavior?
>>> Seems a bit surprising.
>> Yeah... I'm not really sure about "mkdir -p"-type actions from a SQL command. Not entirely sure why, but it doesn't feel 'right' to me. I'd rather have PG complain that the directory doesn't exist.
> OK. Is there any value in doing "mkdir -p" in WAL-recovery execution of CREATE TABLESPACE but not regular execution?

I don't think so. If someone creates a directory that is not fsync'd, then creates a subdirectory and puts a tablespace on it, and then crashes after this has been WAL-logged but before the directory entries have hit the disk, well, unlucky for them, but that's a vanishingly rare situation. There's no guarantee that we'd set properties on the parent directory that would match the user's expectations anyway, especially if SE-Linux or something is involved.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] create tablespace fails silently, or succeeds improperly
Robert Haas robertmh...@gmail.com writes: On Sat, Dec 11, 2010 at 10:56 AM, Tom Lane t...@sss.pgh.pa.us wrote: OK. Is there any value in doing mkdir -p in WAL-recovery execution of CREATE TABLESPACE but not regular execution? I don't think so. If someone creates a directory that is not fsync'd, and then creates a subdirectory and puts a tablespace on it, and then crashes after this has been WAL-logged but before the directory entries have hit the disk, well, unlucky for them, but that's a vanishingly rare situation. I'm not really thinking about crash recovery, but replication slaves. Omitting to create the tablespace location directories on slaves doesn't seem far-fetched at all. Of course, it's likely that the slave server will lack permissions to create in the location directory's parent; but if it can, the outcome shouldn't be too unreasonable. There's no guarantee that we'd set properties on the parent directory that would match the user's expectation anyway, especially if SE-Linux or something is involved. If SELinux doesn't like it, it's going to fail no matter what we do. I'm just suggesting that trying to create the directory path will fail in fewer cases than not trying. regards, tom lane
Re: [HACKERS] create tablespace fails silently, or succeeds improperly
On Sat, Dec 11, 2010 at 1:20 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Sat, Dec 11, 2010 at 10:56 AM, Tom Lane t...@sss.pgh.pa.us wrote: OK. Is there any value in doing mkdir -p in WAL-recovery execution of CREATE TABLESPACE but not regular execution? I don't think so. If someone creates a directory that is not fsync'd, and then creates a subdirectory and puts a tablespace on it, and then crashes after this has been WAL-logged but before the directory entries have hit the disk, well, unlucky for them, but that's a vanishingly rare situation. I'm not really thinking about crash recovery, but replication slaves. Omitting to create the tablespace location directories on slaves doesn't seem far-fetched at all. Of course, it's likely that the slave server will lack permissions to create in the location directory's parent; but if it can, the outcome shouldn't be too unreasonable. Creating the tablespace directory itself would be reasonable, but recursing all the way up seems dubious. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] create tablespace fails silently, or succeeds improperly
Robert Haas robertmh...@gmail.com writes: On Sat, Dec 11, 2010 at 1:20 PM, Tom Lane t...@sss.pgh.pa.us wrote: I'm not really thinking about crash recovery, but replication slaves. Omitting to create the tablespace location directories on slaves doesn't seem far-fetched at all. Of course, it's likely that the slave server will lack permissions to create in the location directory's parent; but if it can, the outcome shouldn't be too unreasonable. Creating the tablespace directory itself would be reasonable, but recursing all the way up seems dubious. Well, it's *very* unlikely that the slave server would have permissions to create in the root directory or close to it. If you grant that it's reasonable to create the location directory itself, why not the parent too, if that's still in a directory that's writable? I agree that the reasonableness of the behavior drops off the more you go up, but so does the probability of having the needed permissions. I don't agree that it's a binary choice where creating exactly one directory is reasonable but exactly two isn't. regards, tom lane
[HACKERS] keeping a timestamp of the last stats reset (for a db, table and function)
Hi everyone, I just wrote my first patch, and I need to know whether I missed something or not. I haven't used C for a really long time, so sickbags on standby, and if you notice something really stupid don't hesitate to call me an asshole (according to Simon Phipps that proves we are a healthy open community). So what does the patch do (or what should it do)? It tracks when the stats for a given object (database, table or function) were last reset. This is useful when you take snapshots of the stats for analysis - when comparing two snapshots, you have to know whether the stats were reset in between (in that case the analysis usually yields random noise, and automatic tools get confused by this). Tom Lane already recommended a workaround - firing the DBA who randomly resets statistics - but I don't think that's a good solution. First, you have to be superior to the DBA to be able to fire him ;-) Second, everyone makes a mistake from time to time. Third, when there are functions to reset stats, it's nice to provide such info as it makes life much easier. And there are cases when you don't reset the stats explicitly but the data are actually gone anyway - e.g. after a restore or upgrade (OK, this is solvable using pg_postmaster_start_time). In short, I think it's a useful feature (I need it, and I think others do too). And I think it's not disruptive. 
So what the patch actually does: - extends PgStat_StatDBEntry, PgStat_StatTableEntry and PgStat_StatFuncEntry with a new field (stat_reset_timestamp) - adds functions to read the current value from these fields (pg_stat_get_db_last_stat_reset_time, pg_stat_get_last_stat_reset_time and pg_stat_get_function_last_stat_reset_time) - extends the system views with calls to these functions (pg_stat_database, pg_stat_user_functions and pg_stat_all_tables) The values are set like this: - when a database is created, the current timestamp is stored in PgStat_StatDBEntry.stat_reset_timestamp - by default all tables/functions inherit this timestamp - when stats for a given table / function are reset, the current timestamp is stored in its stat_reset_timestamp (this happens in pgstat_recv_resetsinglecounter) - when stats for the whole database are reset, everything starts from scratch (this happens in pgstat_recv_resetcounter) What I'm not sure about: - I really am not sure about the changes made in pg_proc.h. I'm not sure how to assign OIDs to the new functions (I've simply chosen values that were not used in this file), and I'm not sure about the other columns (I've copied and modified another function with the same parameter/return types) - I'm not sure whether there are any other ways the stat entries can be created. I've found two ways - directly (when asked for the stats, e.g. from pgstat_get_tab_entry), and indirectly (when processing stats from a backend, e.g. in pgstat_recv_tabstat). 
regards Tomasdiff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql index 346eaaf..0ee59b1 100644 --- a/src/backend/catalog/system_views.sql +++ b/src/backend/catalog/system_views.sql @@ -310,6 +310,7 @@ CREATE VIEW pg_stat_all_tables AS pg_stat_get_last_autovacuum_time(C.oid) as last_autovacuum, pg_stat_get_last_analyze_time(C.oid) as last_analyze, pg_stat_get_last_autoanalyze_time(C.oid) as last_autoanalyze, + pg_stat_get_last_stat_reset_time(C.oid) as last_stat_reset, pg_stat_get_vacuum_count(C.oid) AS vacuum_count, pg_stat_get_autovacuum_count(C.oid) AS autovacuum_count, pg_stat_get_analyze_count(C.oid) AS analyze_count, @@ -502,7 +503,8 @@ CREATE VIEW pg_stat_database AS pg_stat_get_db_tuples_fetched(D.oid) AS tup_fetched, pg_stat_get_db_tuples_inserted(D.oid) AS tup_inserted, pg_stat_get_db_tuples_updated(D.oid) AS tup_updated, -pg_stat_get_db_tuples_deleted(D.oid) AS tup_deleted +pg_stat_get_db_tuples_deleted(D.oid) AS tup_deleted, +pg_stat_get_db_last_stat_reset_time(D.oid) AS last_stat_reset FROM pg_database D; CREATE VIEW pg_stat_user_functions AS @@ -512,7 +514,8 @@ CREATE VIEW pg_stat_user_functions AS P.proname AS funcname, pg_stat_get_function_calls(P.oid) AS calls, pg_stat_get_function_time(P.oid) / 1000 AS total_time, -pg_stat_get_function_self_time(P.oid) / 1000 AS self_time +pg_stat_get_function_self_time(P.oid) / 1000 AS self_time, +pg_stat_get_function_last_stat_reset_time(P.oid) AS last_stat_reset FROM pg_proc P LEFT JOIN pg_namespace N ON (N.oid = P.pronamespace) WHERE P.prolang != 12 -- fast check to eliminate built-in functions AND pg_stat_get_function_calls(P.oid) IS NOT NULL; diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c index c3c136a..f0b3453 100644 --- a/src/backend/postmaster/pgstat.c +++ b/src/backend/postmaster/pgstat.c @@ -249,6
Re: [HACKERS] Anyone for SSDs?
On Fri, Dec 10, 2010 at 8:09 PM, Vaibhav Kaushal vaibhavkaushal...@gmail.com wrote: On Fri, 2010-12-10 at 18:07 -0800, Josh Berkus wrote: On 12/10/10 5:06 PM, Daniel Loureiro wrote: A quicksort method on sequential disk is just awful to contemplate in a non-SSD world, but it's possible on an SSD. So, code it. Shouldn't be hard to write a demo comparison. I don't believe that SSDs make quicksort-on-disk feasible, but would be happy to be proven wrong. I too do not believe it in the normal case. However, considering the 'types' of SSDs, it may be feasible! Asking for 'the next page and getting it' has a time delay in the process. While on a regular HDD with spindles the question is where that page is located, with SSDs the question disappears, because the access time is uniform. Also, the access time is about 100 times faster, which would change quite a few things about the whole process. I don't understand what it is you are proposing. Quicksort is usually swap-based, and so the records would need to be the same size. Are you proposing to do an in-memory sort of pointers which reference on-disk records? Or an on-disk sort of pointers which reference on-disk (but somewhere else) records? If you are swapping pointers on disk, you have to consider the write performance, not just the read performance. I would like to do that (coding), but I do not have an SSD on my machine! That doesn't mean you can't do the coding, it just means you can't test the performance. The barrier to getting someone else to performance-test it for you is a lot lower than the barrier to getting someone else to write it for you and then performance-test it for you. (But I can't test-drive it, as I don't have any computer which has both an SSD and a hard drive, just one or the other. And the one with the SSD would be hard to compile PG on.) :( Would it be impractical to try it for me? Again, I do not know how to test PG :( Yeah, I think it would be impractical. 
If I thought it would likely work, it would be different (at least, it might be if I had the right hardware). But I think it would likely not work. Cheers, Jeff
Re: [HACKERS] unlogged tables
On Wed, Dec 8, 2010 at 6:52 AM, Marti Raudsepp ma...@juffo.org wrote: A very useful feature for unlogged tables would be the ability to switch them back to normal tables -- this way you could do bulk loading into an unlogged table and then turn it into a regular table using just fsync(), bypassing all the WAL-logging overhead. If archive_mode is off, then you can often find a way to bypass WAL-logging during bulk loading anyway. If archive_mode is on, then I don't see how this can work without massive changes. One possibility would be to create a mechanism to inject entire large files into the archive log stream. (Such a facility might be useful for other purposes too.) So the transaction that changes the mode from unlogged to logged would have to take an exclusive lock on the unlogged table and make sure shared buffers for it are written out, then it would just copy the backing file(s) for that table into the archive stream with a special header that tells the recovery process "set these aside, I'll explain later". Once that is done, it would just have to ensure the WAL segment it is currently on will come after the injected files in the archive stream, and write a WAL record explaining where those bulk files it sent early are supposed to go. I don't know, it sounds like a lot of work and a lot of pitfalls. Cheers, Jeff
Re: [HACKERS] ALTER EXTENSION ... UPGRADE;
Hi, I've been reading through the entire thread and it seems like this is the best mail to choose to answer. Tom Lane t...@sss.pgh.pa.us writes: Maybe I misread David's meaning, but I thought he was saying that there's no value in inventing all those control file entries in the first place. Just hard-wire in ALTER EXTENSION UPGRADE the convention that the name of an upgrade script to upgrade from prior version VVV is EXTNAME-upgrade.VVV.sql (or any variant spelling of that you care for). Yeah, that works, as long as VVV is the version we upgrade from. That said, we need to find a way to lighten the process for extensions where it's easy to have a single script that supports upgrades from more than one past release. What about having the following keys supported in the control file: upgrade_version = 'script.version.sql' upgrade_all = 'script.sql' Where the version here is the version you're upgrading *from* (the version you're upgrading to is known and static when you distribute the files, after all), and where upgrade_all is applied last no matter what got applied before. Also, do we want a subdirectory per extension to host all those files? Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] unlogged tables
On Fri, Dec 10, 2010 at 8:16 PM, Robert Haas robertmh...@gmail.com wrote: As I was working on the hash index support, it occurred to me that at some point in the future, we might want to allow an unlogged index on a permanent table. That is the feature I would be most excited about. With the current patch, an index is unlogged if and only if the corresponding table is unlogged, and both the table and the index are reset to empty on restart. But we could have a slightly different flavor of index that, instead of being reset to empty, just gets marked invalid, perhaps by truncating the file to zero length (and adding some code to treat that as something other than a hard error). Perhaps you could even arrange for autovacuum to kick off an automatic rebuild. Or just make rebuilding the index part of crash recovery. I wouldn't use the feature anyway on indexes that would take more than a few seconds to rebuild, and I wouldn't want to advertise the database as being available when it is essentially crippled by missing indexes. I'd rather bite the bullet up front. I would think of it as declaring that, instead of making the index recoverable via WAL logging and replay, we make it recoverable by rebuilding. So in that way it is quite unlike unlogged tables, in that we are not risking any data, just giving the database a hint about what the most expeditious way to maintain the index might be. Well, more of an order than a hint, I guess. Cheers, Jeff
Re: [HACKERS] proposal: auxiliary functions for record type
On Dec 11, 2010, at 17:55, Pavel Stehule wrote: It's the same trick that I use in record_set_fields. Oh, I see. Sorry, I must have missed that when I read your blog entry. But I don't want to use it for reading a value. I don't like it. You don't need to know a value, you have to know a type - NULL::type. It is just not nice :). Well, no, it's not the most elegant API on earth, that's for sure. But in my opinion, it isn't so bad that it justifies casting everything to text. I thought about it too, and maybe it is time for a new polymorphic type anytype - then you don't need to write the slightly strange NULL::type; it can be fieldvalue(myrec, type1, false). Hm, I don't think the ability to write just type1 instead of NULL::type1 is worth the necessary effort. If anything, I'd allow functions to use anyelement as their return value *without* having any polymorphic arguments. You'd need to surround calls to such functions with a CAST() expression, unless something else determines the type. In pl/pgsql, for example, one could allow the CAST() to be skipped for assignments to variables. Then, you could write v_value := fieldvalue(myrec, 'f1', true) That'd at least free you from having to specify the type in some cases. But still, even this seems to be a lot of effort for quite little gain... best regards, Florian Pflug
[HACKERS] Re: [COMMITTERS] pgsql: Reduce spurious Hot Standby conflicts from never-visible records
(Moving to pgsql-hackers) On 10.12.2010 20:21, Tom Lane wrote: Simon Riggs si...@2ndquadrant.com writes: Reduce spurious Hot Standby conflicts from never-visible records. Hot Standby conflicts only with tuples that were visible at some point. So ignore tuples from aborted transactions or for tuples updated/deleted during the inserting transaction when generating the conflict transaction ids. Following detailed analysis and test case by Noah Misch. Original report covered btree delete records; correctly observed by Heikki Linnakangas that this applies to other cases also. Fix covers all sources of cleanup records via common code. Includes additional fix compared to commit on HEAD. ISTM HeapTupleHeaderAdvanceLatestRemovedXid is still pretty broken, in that it's examining xmax without having checked that xmax is (a) valid or (b) a lock rather than a deletion xmax. In current use, it's only called for tuples that are known to be dead, so either xmax is a valid deletion, or xmin didn't commit, in which case the function doesn't use xmax for anything. So I think it actually works as it is. I agree it doesn't look right, though. At the very least it needs comments explaining that, but preferably it should do something sane when faced with a tuple that's not dead after all. Perhaps throw an error (though that would be bad during recovery), or an Assert, or just refrain from advancing latestRemovedXid (or advance it, that would be the conservative stance given the current use). Also, I'm not totally convinced it's correct when xmin > xmax, despite Simon's follow-up commit to fix that. Shouldn't it advance latestRemovedXid to xmin in that case? Or maybe it's ok as it is because we know that xmax committed after xmin. The impression I get from the comment above the function now is that it advances latestRemovedXid to the highest XID present in the tuple, but that's not what it does in the xmin > xmax case. That comment needs clarification. 
While we're at it, perhaps it would be better to move this function to tqual.c. And I feel that a more natural interface would be something like: TransactionId HeapTupleHeaderGetLatestRemovedXid(HeapTupleHeader tuple); IOW, instead of bumping up the passed-in latestRemovedXid value, return the highest XID on the tuple (if it was dead). PS. It would be good to set hint bits in that function like in the HeapTupleSatisfies* functions. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
[HACKERS] Wildcard search support for pg_trgm
Hackers, Here is the first version of a patch which enables index support for wildcard search in the pg_trgm contrib module. The idea of the patch is to extract from the wildcard the trigrams which must occur in any matching string. For example, for the '%sector%' wildcard such trigrams would be: 'sec', 'ect', 'tor'. create table words (word text); copy words from '/usr/share/dict/american-english'; test=# explain analyze select * from words where word ilike '%independ%'; QUERY PLAN -- Seq Scan on words (cost=0.00..1703.11 rows=10 width=9) (actual time=18.818..174.146 rows=7 loops=1) Filter: (word ~~* '%independ%'::text) Total runtime: 174.200 ms (3 rows) CREATE INDEX trgm_idx ON words USING gist (word gist_trgm_ops); test=# explain analyze select * from words where word ilike '%independ%'; QUERY PLAN -- Bitmap Heap Scan on words (cost=4.36..40.11 rows=10 width=9) (actual time=2.445..2.529 rows=7 loops=1) Recheck Cond: (word ~~* '%independ%'::text) -> Bitmap Index Scan on trgm_idx (cost=0.00..4.35 rows=10 width=0) (actual time=2.406..2.406 rows=7 loops=1) Index Cond: (word ~~* '%independ%'::text) Total runtime: 2.612 ms (5 rows) CREATE INDEX trgm_idx ON words USING gin (word gin_trgm_ops); test=# explain analyze select * from words where word ilike '%independ%'; QUERY PLAN --- Bitmap Heap Scan on words (cost=76.08..111.83 rows=10 width=9) (actual time=2.675..2.755 rows=7 loops=1) Recheck Cond: (word ~~* '%independ%'::text) -> Bitmap Index Scan on trgm_idx (cost=0.00..76.07 rows=10 width=0) (actual time=2.642..2.642 rows=7 loops=1) Index Cond: (word ~~* '%independ%'::text) Total runtime: 2.839 ms (5 rows) I've encountered the following problems: 1) Indexing support for ilike is possible only with case-insensitive wildcards, i.e. when the IGNORECASE macro is enabled. But I can't use this macro in pg_trgm.sql.in, where the list of operators is defined. 
Probably it is enough to put a comment near IGNORECASE which says that if one disables this macro, he should also remove the operators from pg_trgm.sql.in? 2) I found the gist index not very useful with the default SIGLENINT = 3. I've changed this value to 15 and found the gist index performs very well on the dictionary. But on longer strings greater values of SIGLENINT may be required (probably even SIGLENINT > 122 will give benefit in some cases in spite of TOAST). With best regards, Alexander Korotkov. *** a/contrib/pg_trgm/pg_trgm.sql.in --- b/contrib/pg_trgm/pg_trgm.sql.in *** *** 113,118 FOR TYPE text USING gist --- 113,120 AS OPERATOR1 % (text, text), OPERATOR2 - (text, text) FOR ORDER BY pg_catalog.float_ops, + OPERATOR3 ~~ (text, text), + OPERATOR4 ~~* (text, text), FUNCTION1 gtrgm_consistent (internal, text, int, oid, internal), FUNCTION2 gtrgm_union (bytea, internal), FUNCTION3 gtrgm_compress (internal), *** *** 144,149 CREATE OPERATOR CLASS gin_trgm_ops --- 146,153 FOR TYPE text USING gin AS OPERATOR1 % (text, text), + OPERATOR3 ~~ (text, text), + OPERATOR4 ~~* (text, text), FUNCTION1 btint4cmp (int4, int4), FUNCTION2 gin_extract_trgm (text, internal), FUNCTION3 gin_extract_trgm (text, internal, int2, internal, internal), *** a/contrib/pg_trgm/trgm.h --- b/contrib/pg_trgm/trgm.h *** *** 19,24 --- 19,26 /* operator strategy numbers */ #define SimilarityStrategyNumber 1 #define DistanceStrategyNumber 2 + #define LikeStrategyNumber 3 + #define ILikeStrategyNumber 4 typedef char trgm[3]; *** *** 53,59 typedef struct /* gist */ #define BITBYTE 8 ! #define SIGLENINT 3 /* 122 = key will toast, so very slow!!! */ #define SIGLEN ( sizeof(int)*SIGLENINT ) #define SIGLENBIT (SIGLEN*BITBYTE - 1) /* see makesign */ --- 55,61 /* gist */ #define BITBYTE 8 ! #define SIGLENINT 15 /* 122 = key will toast, so very slow!!! 
*/ #define SIGLEN ( sizeof(int)*SIGLENINT ) #define SIGLENBIT (SIGLEN*BITBYTE - 1) /* see makesign */ *** *** 89,94 typedef char *BITVECP; --- 91,101 extern float4 trgm_limit; TRGM *generate_trgm(char *str, int slen); + TRGM *generate_wildcard_trgm(char *str, int slen); float4
[HACKERS] pg_execute_from_file, patch v10
Hi, The other infrastructure patch that has been marked ready for commit and then commented on further by Tom is $subject, even if the function provided has been renamed to pg_execute_sql_file(). Please find attached the newer version that fixes Tom's concerns, removing the VARIADIC forms of the functions (the placeholders idea). The git tree already contains fixed extension code, but the next patch for that one will have to wait some more (psql refactoring). Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support *** a/doc/src/sgml/func.sgml --- b/doc/src/sgml/func.sgml *** *** 14449,14466 postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup()); /row row entry ! literalfunctionpg_read_file(parameterfilename/ typetext/, parameteroffset/ typebigint/, parameterlength/ typebigint/)/function/literal /entry entrytypetext/type/entry entryReturn the contents of a text file/entry /row row entry literalfunctionpg_stat_file(parameterfilename/ typetext/)/function/literal /entry entrytyperecord/type/entry entryReturn information about a file/entry /row /tbody /tgroup /table --- 14449,14488 /row row entry ! 
literalfunctionpg_read_file(parameterfilename/ typetext/, parameteroffset/ typebigint/ [, parameterlength/ typebigint/])/function/literal /entry entrytypetext/type/entry entryReturn the contents of a text file/entry /row row entry + literalfunctionpg_read_binary_file(parameterfilename/ typetext/, parameteroffset/ typebigint/ [, parameterlength/ typebigint/])/function/literal +/entry +entrytypebytea/type/entry +entryReturn the contents of a file/entry + /row + row +entry literalfunctionpg_stat_file(parameterfilename/ typetext/)/function/literal /entry entrytyperecord/type/entry entryReturn information about a file/entry /row + row +entry + literalfunctionpg_execute_sql_string(parametersql/ typetext/) )/function/literal +/entry +entrytypevoid/type/entry +entryExecute the given string as acronymSQL/ commands./entry + /row + row +entry + literalfunctionpg_execute_sql_file(parameterfilename/ typetext/ [, parameterencoding/parameter typename/type]) )/function/literal +/entry +entrytypevoid/type/entry +entryExecute contents of the given file as acronymSQL/ commands, +expected in either database encoding or given encoding./entry + /row /tbody /tgroup /table *** *** 14482,14487 postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup()); --- 14504,14533 at the given parameteroffset/, returning at most parameterlength/ bytes (less if the end of file is reached first). If parameteroffset/ is negative, it is relative to the end of the file. + When the parameter parameterlength/ is omitted, + functionpg_read_file/ reads until the end of the file. + The part of a file must be a valid text in the server encoding. +/para + +indexterm + primarypg_read_binary_file/primary +/indexterm +para + functionpg_read_binary_file/ returns part of a file as like as + functionpg_read_file/, but the result is a bytea value. 
+ programlisting + SELECT convert_from(pg_read_binary_file('postgresql.conf', -69), 'utf8'); + convert_from + --- + #custom_variable_classes = '' # list of custom variable class names+ + + (1 row) + /programlisting +/para +para + When the parameter parameterlength/ is + omited, functionpg_read_binary_file/ reads until the end of the + file. /para indexterm *** *** 14499,14504 SELECT (pg_stat_file('filename')).modification; --- 14545,14582 /programlisting /para +indexterm + primarypg_execute_sql_string/primary +/indexterm +para + functionpg_execute_sql_string/ makes the server execute the given string + as acronymSQL/ commands. + The script may contain placeholders that will be replaced by the optional + parameters, that are pairs of variable names and values. Here's an example: + programlisting + SELECT pg_execute_sql_string('CREATE SCHEMA utils'); + pg_execute_sql_string + --- + + (1 row) + + SELECT oid, * from pg_namespace where nspname = 'utils'; + oid | nspname | nspowner | nspacl + ---+-+--+ + 16387 | utils | 10 | + (2 rows) + /programlisting +/para + +indexterm +
Re: [HACKERS] Extensions, patch v16
On Fri, Dec 10, 2010 at 11:24:27AM -0500, Tom Lane wrote: Dimitri Fontaine dimi...@2ndquadrant.fr writes: Tom Lane t...@sss.pgh.pa.us writes: Are there any actual remaining use-cases for that sed step? The goal here is to allow extension authors to maintain their version number in the Makefile rather than in the Makefile and in the control file separately. Having the same version number in more than one place never eases maintenance. Why is it in the makefile at all? If the makefile does need to know it, why don't we have it scrape the number out of the control file? Or even more to the point, since when do we need version numbers in extensions? We *absolutely* need version numbers in extensions. People will want to have a certain version, or a certain minimum version, etc., etc., etc., just as they do for any other software. Seriously, are you OK? Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: [HACKERS] proposal: auxiliary functions for record type
2010/12/11 Florian Pflug f...@phlo.org: On Dec 11, 2010, at 17:55, Pavel Stehule wrote: It's the same trick that I use in record_set_fields. Oh, I see. Sorry, I must have missed that when I read your blog entry. But I don't want to use it for reading a value. I don't like it. You don't need to know a value, you have to know a type - NULL::type. It is just not nice :). Well, no, it's not the most elegant API on earth, that's for sure. But in my opinion, it isn't so bad that it justifies casting everything to text. yes, there are two not-so-good solutions now. I thought about it too, and maybe it is time for a new polymorphic type anytype - then you don't need to write the slightly strange NULL::type; it can be fieldvalue(myrec, type1, false). Hm, I don't think the ability to write just type1 instead of NULL::type1 is worth the necessary effort. If anything, I'd allow functions to use anyelement as their return value *without* having any polymorphic arguments. You'd need to surround calls to such functions with a CAST() expression, unless something else determines the type. In pl/pgsql, for example, one could allow the CAST() to be skipped for assignments to variables. Then, you could write v_value := fieldvalue(myrec, 'f1', true) yes, this is probably the way, and I like it. But for now it has zero support: a) there isn't a mechanism for propagating the target type from the outside into the function - the function would have to know the target type b) plpgsql uses only IO casts - so values are translated to text and back every time. But CAST has its own parser part and probably an executor part too, so this is possible and not too hard. So it is possible to call: SELECT CAST(fieldvalue(myrec, 'f1', true) AS int) yes, this task should not be hard. But who will communicate about this with Tom? Regards Pavel That'd at least free you from having to specify the type in some cases. But still, even this seems to be a lot of effort for quite little gain... 
best regards, Florian Pflug
Re: [HACKERS] keeping a timestamp of the last stats reset (for a db, table and function)
Hello you have to respect pg coding style: a) not too long lines b) not C++ line comments Zdar Pavel 2010/12/11 t...@fuzzy.cz: Hi everyone, I just wrote my first patch, and I need to know whether I missed something or not. I haven't used C for a really long time, so sickbags on standby, and if you notice something really stupid don't hesitate to call me an asshole (according to Simon Phipps that proves we are a healthy open community). So what the patch does (or should do)? It tracks when were the stats for a given object (database, table or function) reset for the last time. This is useful when you do snapshots of the stats for analysis - when comparing two snapshots, you have to know whether the stats were reset (in that case the analysis usually yields random noise and automatic tools get confused by this). Tom Lane already recommended a workaround - firing the DBA who randomly resets statistics, but that's not a good solution I think. First, you have to be superior to the DBA to be able to fire him ;-) Second, everyone makes a mistake from time to time. Third, when there are functions to reset stats, it's nice to provide such info as it makes life much easier. And there are cases when you don't reset the stats explicitly but the data are actually gone - e.g. when after a restore or upgrade (OK, this is solvable using pg_postmaster_start_time). In short, I think it's a useful feature (I need it and I think there are others). And I think it's not disruptive. 
So what the patch actually does:

- extends PgStat_StatDBEntry, PgStat_StatTableEntry and PgStat_StatFuncEntry with a new field (stat_reset_timestamp)
- adds functions to read the current value from these fields (pg_stat_get_db_last_stat_reset_time, pg_stat_get_last_stat_reset_time and pg_stat_get_function_last_stat_reset_time)
- extends the system views with calls to these functions (pg_stat_database, pg_stat_user_functions and pg_stat_all_tables)

The values are set like this:

- when a database is created, the current timestamp is stored in PgStat_StatDBEntry.stat_reset_timestamp
- by default all tables/functions inherit this timestamp
- when stats for a given table / function are reset, the current timestamp is stored in stat_reset_timestamp (this happens in pgstat_recv_resetsinglecounter)
- when stats for the whole database are reset, everything starts from scratch (this happens in pgstat_recv_resetcounter)

What I'm not sure about:

- I really am not sure about the changes made in pg_proc.h. I'm not sure how to assign OIDs to the new functions (I've simply chosen values that were not used in this file), and I'm not sure about the other columns (I've copied and modified another function with the same parameter/return types).
- I'm not sure whether there are any other ways the stat entries can be created. I've found two - directly (when asked for the stats, e.g. from pgstat_get_tab_entry) and indirectly (when processing stats from a backend, e.g. in pgstat_recv_tabstat).

regards Tomas
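With the system-view changes described above, a monitoring tool could detect a reset between two snapshots roughly like this (a sketch, assuming pg_stat_database gains the last_stat_reset column as described; the cutoff timestamp is illustrative):

```sql
-- If last_stat_reset is newer than the previous snapshot, the delta
-- between the two snapshots is meaningless and should be discarded.
SELECT datname, last_stat_reset
  FROM pg_stat_database
 WHERE last_stat_reset > '2010-12-01 00:00:00'::timestamptz;
```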
Re: [HACKERS] Extensions, patch v16
On Dec 11, 2010, at 1:09 PM, David Fetter wrote: Why is it in the makefile at all? If the makefile does need to know it, why don't we have it scrape the number out of the control file? Or even more to the point, since when do we need version numbers in extensions? We *absolutely* need version numbers in extensions. People will want to have a certain version, or a certain minimum version, etc., etc., etc., just as they do for any other software. Seriously, are you OK? One of the biggest mistakes in the creation of CPAN was allowing modules without version numbers. It makes figuring out what to upgrade extremely difficult. Learning from that, PGXN requires version numbers for all extensions. Best, David
Re: [HACKERS] ALTER EXTENSION ... UPGRADE;
On Dec 11, 2010, at 12:09 PM, Dimitri Fontaine wrote: Yeah, that works, as long as VVV is the version we upgrade from. That said, we need to find a way to lighten the process for extensions where it's easy to have a single script that supports upgrades from more than one past release. What about having the following keys supported in the control file: upgrade_version = 'script.version.sql' upgrade_all = 'script.sql' Why not just use an upgrade-script naming convention? Think: convention over configuration. Where the version here is the version you're upgrading *from* (the version you're upgrading to is known and static when you distribute the files, after all), and where upgrade_all is applied last no matter what got applied before. Also, do we want a subdirectory per extension to host all those files? How are things currently arranged? Best, David
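One possible shape of the naming convention David suggests, sketched for a hypothetical extension (all file names illustrative; the version in each upgrade script's name is the version being upgraded *from*):

```
myext/
    myext.control        # declares version = '1.2'
    myext.sql            # fresh install of 1.2
    myext-1.0.sql        # upgrade from 1.0 to 1.2
    myext-1.1.sql        # upgrade from 1.1 to 1.2
    myext-all.sql        # applied last, whatever was applied before
```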
Re: [HACKERS] would hw acceleration help postgres (databases in general) ?
On Dec 10, 2010, at 6:18 PM, Jeff Janes wrote: On Fri, Dec 10, 2010 at 3:09 PM, Hamza Bin Sohail hsoh...@purdue.edu wrote: Hello hackers, I think i'm at the right place to ask this question. Based on your experience and the fact that you have written the Postgres code, can you tell what a rough breakdown - in your opinion - is between the time the database spends just fetching and writing stuff to memory and the time spent on actual computation? The database is a general-purpose tool. Pick a bottleneck you wish to have, and probably someone uses it in a way that causes that bottleneck to occur. A common bottleneck we run into is sorting of text data. Unfortunately, I doubt that a GPU would be able to help with that. -- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net
Re: [HACKERS] keeping a timestamp of the last stats reset (for a db, table and function)
Hello you have to respect pg coding style: a) not too long lines b) no C++ line comments

OK, thanks for the notice. I've fixed those two problems.

regards Tomas

diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 346eaaf..0ee59b1 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -310,6 +310,7 @@ CREATE VIEW pg_stat_all_tables AS
             pg_stat_get_last_autovacuum_time(C.oid) as last_autovacuum,
             pg_stat_get_last_analyze_time(C.oid) as last_analyze,
             pg_stat_get_last_autoanalyze_time(C.oid) as last_autoanalyze,
+            pg_stat_get_last_stat_reset_time(C.oid) as last_stat_reset,
             pg_stat_get_vacuum_count(C.oid) AS vacuum_count,
             pg_stat_get_autovacuum_count(C.oid) AS autovacuum_count,
             pg_stat_get_analyze_count(C.oid) AS analyze_count,
@@ -502,7 +503,8 @@ CREATE VIEW pg_stat_database AS
             pg_stat_get_db_tuples_fetched(D.oid) AS tup_fetched,
             pg_stat_get_db_tuples_inserted(D.oid) AS tup_inserted,
             pg_stat_get_db_tuples_updated(D.oid) AS tup_updated,
-            pg_stat_get_db_tuples_deleted(D.oid) AS tup_deleted
+            pg_stat_get_db_tuples_deleted(D.oid) AS tup_deleted,
+            pg_stat_get_db_last_stat_reset_time(D.oid) AS last_stat_reset
     FROM pg_database D;

 CREATE VIEW pg_stat_user_functions AS
@@ -512,7 +514,8 @@ CREATE VIEW pg_stat_user_functions AS
             P.proname AS funcname,
             pg_stat_get_function_calls(P.oid) AS calls,
             pg_stat_get_function_time(P.oid) / 1000 AS total_time,
-            pg_stat_get_function_self_time(P.oid) / 1000 AS self_time
+            pg_stat_get_function_self_time(P.oid) / 1000 AS self_time,
+            pg_stat_get_function_last_stat_reset_time(P.oid) AS last_stat_reset
     FROM pg_proc P LEFT JOIN pg_namespace N ON (N.oid = P.pronamespace)
     WHERE P.prolang != 12 -- fast check to eliminate built-in functions
         AND pg_stat_get_function_calls(P.oid) IS NOT NULL;
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index c3c136a..f20a9ea 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -249,6 +249,8 @@ static void pgstat_sighup_handler(SIGNAL_ARGS);
 static PgStat_StatDBEntry *pgstat_get_db_entry(Oid databaseid, bool create);
 static PgStat_StatTabEntry *pgstat_get_tab_entry(PgStat_StatDBEntry *dbentry,
                     Oid tableoid, bool create);
+static PgStat_StatFuncEntry *pgstat_get_func_entry(PgStat_StatDBEntry *dbentry,
+                    Oid funcoid, bool create);
 static void pgstat_write_statsfile(bool permanent);
 static HTAB *pgstat_read_statsfile(Oid onlydb, bool permanent);
 static void backend_read_statsfile(void);
@@ -3129,7 +3131,9 @@ pgstat_get_db_entry(Oid databaseid, bool create)
         result->n_tuples_updated = 0;
         result->n_tuples_deleted = 0;
         result->last_autovac_time = 0;
-
+
+        result->stat_reset_timestamp = GetCurrentTimestamp();
+
         memset(&hash_ctl, 0, sizeof(hash_ctl));
         hash_ctl.keysize = sizeof(Oid);
         hash_ctl.entrysize = sizeof(PgStat_StatTabEntry);
@@ -3196,11 +3200,48 @@ pgstat_get_tab_entry(PgStat_StatDBEntry *dbentry, Oid tableoid, bool create)
         result->autovac_vacuum_count = 0;
         result->analyze_count = 0;
         result->autovac_analyze_count = 0;
+
+        /* inherit stat time reset from dbentry */
+        result->stat_reset_timestamp = dbentry->stat_reset_timestamp;
+
     }

     return result;
 }

+/*
+ * Lookup the hash table entry for the specified function. If no hash
+ * table entry exists, initialize it, if the create parameter is true.
+ * Else, return NULL.
+ */
+static PgStat_StatFuncEntry *
+pgstat_get_func_entry(PgStat_StatDBEntry *dbentry, Oid tableoid, bool create)
+{
+    PgStat_StatFuncEntry *result;
+    bool        found;
+    HASHACTION  action = (create ? HASH_ENTER : HASH_FIND);
+
+    /* Lookup or create the hash table entry for this table */
+    result = (PgStat_StatFuncEntry *) hash_search(dbentry->functions,
+                    &tableoid,
+                    action, &found);
+
+    if (!create && !found)
+        return NULL;
+
+    /* If not found, initialize the new one. */
+    if (!found)
+    {
+        result->f_numcalls = 0;
+        result->f_time = 0;
+        result->f_time_self = 0;
+
+        /* inherit stat time reset from dbentry */
+        result->stat_reset_timestamp =
Re: [HACKERS] proposal: auxiliary functions for record type
On Dec 11, 2010, at 9:24 AM, Florian Pflug wrote:

begin
  for v_field in select * from fieldinfos(myrec) loop
    case
      when v_field.fieldtype = 'type1'::regtype then
        v_value_type1 := fieldvalue(myrec, NULL::type1, false);
        -- Do something with v_value_type1
      ...
      when v_field.fieldtype = 'typeN'::regtype then
        v_value_typeN := fieldvalue(myrec, NULL::typeN, false);
        -- Do something with v_value_typeN
      else
        raise exception 'Unexpected type % in record %', v_field.fieldtype, myrec;
    end case;
  end loop;
end;

All that casting looks horribly painful. :( This is a case where what we really need is a data type that can accept data of any type and tell you what the type actually is. Another place that could really use that is crosstabs, though that case is more complex in that you're dealing with a recordset of fields and you don't know the type of any of them. I wonder if it would be possible to create such a type without major backend surgery... I'm thinking the type would have a structure that stores the OID of the original type and then stores the text representation of that type. CAST functions to this type would know what the input data was and could store that information. The challenge is creating a way to cast back to the original type... I don't see a way to do that with current infrastructure. -- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net
Re: [HACKERS] proposal: auxiliary functions for record type
Hello, I have looked at your module now and I can't say which is better - the two modules have slightly different targets, if I understand correctly, and I don't know whether they can be mixed. There are two strategies - a rich contrib module, or a few functions in core. We probably can't have an API for the record type as rich as the one regexps have - transform the result to a table, to an array, read values into an array, into a table, read the tuple descriptor into an array, into a table, ... there should be one strategy - though both approaches (array-based, table-based) are legal. I prefer a table-based solution, because then I don't need an extra unnest, but other preferences are valid too.

I am dissatisfied with your design of passing the explicit target type via an unused value. I think we don't have the infrastructure for it now - from my point of view it is better to use a common type, which is text for now. That's nothing new - plpgsql uses it too. I can see one good design for an explicit target type, based on polymorphic types, that respects PostgreSQL fmgr practice: we have to allow polymorphic functions without polymorphic parameters. These functions should be designed to return the value in unknown-type format when the function has no outer type information. This information can be passed in the function context. When the function context isn't null, the function has to read the target type and should return the value in the target type.

Who can fill the function context? That is a task for the executor. When a CAST contains just a function call, we can check whether the function is polymorphic, and if it is, we can set the function context to the target type - and then we don't need to call a conversion function, because the polymorphic function must return data in the correct format. So when a polymorphic function is used without polymorphic parameters and without an explicit CAST, it returns unknown. Without the described functionality we can design a non-polymorphic function that returns the unknown type. 
When similar functionality is implemented, this function can be changed to polymorphic, but from the user's perspective nothing changes. Regards Pavel Stehule
[HACKERS] function attributes
Yesterday I did a bit of work on allowing bytea values to be passed into and out of plperl in binary format, effectively removing the need to escape and de-escape them. (The work can be seen on the plperlargs branch of my development repo at https://github.com/adunstan/postgresql-dev/commits/plperlargs/). At the moment the behaviour is triggered by a custom setting (plperl.pass_binary_bytea), but this isn't really satisfactory. We could turn it on permanently, but that would break a lot of legacy code. What we really need is a way of marking a function with some attributes. Of course, we could put it in the program text like plpgsql's #variable_conflict, but that's really rather ugly. The grammar already has an attribute mechanism for functions, and ISTM we just need to extend that a bit to allow setting of function attributes reasonably flexibly, much as we can now specify format options on EXPLAIN or we'll soon be able to specify options for foreign tables. Thoughts? cheers andrew
Re: [HACKERS] function attributes
On Dec 11, 2010, at 2:27 PM, Andrew Dunstan wrote: Yesterday I did a bit of work on allowing bytea values to be passed into and out of plperl in binary format, effectively removing the need to escape and de-escape them. (The work can be seen on the plperlargs branch of my development repo at https://github.com/adunstan/postgresql-dev/commits/plperlargs/). andrew++ # Woo! At the moment the behaviour is triggered by a custom setting (plperl.pass_binary_bytea), but this isn't really satisfactory. We could turn it on permanently, but that would break a lot of legacy code. What we really need is a way of marking a function with some attributes. Of course, we could put it in the program text like plpgsql's #variable_conflict, but that's really rather ugly. The grammar already has an attribute mechanism for functions, and ISTM we just need to extend that a bit to allow setting of function attributes reasonably flexibly, much as we can now specify format options on EXPLAIN or we'll soon be able to specify options for foreign tables. What does the existing attribute grammar for functions look like? An example perhaps? Best, David
Re: [HACKERS] create tablespace fails silently, or succeeds improperly
On Sat, Dec 11, 2010 at 1:38 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Sat, Dec 11, 2010 at 1:20 PM, Tom Lane t...@sss.pgh.pa.us wrote: I'm not really thinking about crash recovery, but replication slaves. Omitting to create the tablespace location directories on slaves doesn't seem far-fetched at all. Of course, it's likely that the slave server will lack permissions to create in the location directory's parent; but if it can, the outcome shouldn't be too unreasonable. Creating the tablespace directory itself would be reasonable, but recursing all the way up seems dubious. Well, it's *very* unlikely that the slave server would have permissions to create in the root directory or close to it. If you grant that it's reasonable to create the location directory itself, why not the parent too, if that's still in a directory that's writable? I agree that the reasonableness of the behavior drops off the more you go up, but so does the probability of having the needed permissions. I don't agree that it's a binary choice where creating exactly one directory is reasonable but exactly two isn't. I'm just giving you my opinion. Take it for what it's worth. :-) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] function attributes
On 12/11/2010 08:01 PM, David E. Wheeler wrote: At the moment the behaviour is triggered by a custom setting (plperl.pass_binary_bytea), but this isn't really satisfactory. We could turn it on permanently, but that would break a lot of legacy code. What we really need is a way of marking a function with some attributes. Of course, we could put it in the program text like plpgsql's #variable_conflict, but that's really rather ugly. The grammar already has an attribute mechanism for functions, and ISTM we just need to extend that a bit to allow setting of function attributes reasonably flexibly, much as we can now specify format options on EXPLAIN or we'll soon be able to specify options for foreign tables. What does the existing attribute grammar for functions look like? An example perhaps?

create function foo() ... with ( attribute [, ...] )

Currently allowed attributes are isStrict and isCachable. The mechanism is effectively obsolete right now, but we could use it for what I have in mind quite nicely. cheers andrew
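For reference, the legacy attribute syntax Andrew describes looks like this in full (isStrict is one of the two attributes the grammar currently accepts; the function itself is just an illustration):

```sql
CREATE FUNCTION add_one(integer) RETURNS integer
    AS 'SELECT $1 + 1'
    LANGUAGE sql
    WITH (isStrict);
```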
Re: [HACKERS] unlogged tables
On Sat, Dec 11, 2010 at 2:53 PM, Jeff Janes jeff.ja...@gmail.com wrote: On Wed, Dec 8, 2010 at 6:52 AM, Marti Raudsepp ma...@juffo.org wrote: A very useful feature for unlogged tables would be the ability to switch them back to normal tables -- this way you could do bulk loading into an unlogged table and then turn it into a regular table using just fsync(), bypassing all the WAL-logging overhead. If archive_mode is off, then you can often find a way to bypass WAL-logging during bulk loading anyway. If archive_mode is on, then I don't see how this can work without massive changes. Well, you'd need to work your way through the heap and all of its indices and XLOG every page. And you've got to do that in a way that's transaction-safe, and I don't have a design in mind for that off the top of my head. But I think massive changes is probably an overstatement. We can already handle ALTER TABLE operations that involve a full relation rewrite, and that already does the full-table-XLOG thing. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] unlogged tables
On Sat, Dec 11, 2010 at 3:18 PM, Jeff Janes jeff.ja...@gmail.com wrote: On Fri, Dec 10, 2010 at 8:16 PM, Robert Haas robertmh...@gmail.com wrote: As I was working on the hash index support, it occurred to me that at some point in the future, we might want to allow an unlogged index on a permanent table. That is the feature I would be most excited about. With the current patch, an index is unlogged if and only if the corresponding table is unlogged, and both the table and the index are reset to empty on restart. But we could have a slightly different flavor of index that, instead of being reset to empty, just gets marked invalid, perhaps by truncating the file to zero-length (and adding some code to treat that as something other than a hard error). Perhaps you could even arrange for autovacuum to kick off an automatic rebuild. Or just have rebuilding the index as part of crash recovery. I wouldn't use the feature anyway on indexes that would take more than a few seconds to rebuild, and wouldn't want to advertise the database as being available when it is essentially crippled from missing indexes. I'd rather bite the bullet up front. I don't think you can rebuild the indexes during crash recovery; I believe you need to be bound to the database that contains the index, and, as we've been over before, binding to a database is irrevocable, so the startup process can't bind to the database, rebuild the index, and then unbind. It would need to signal the postmaster to fire up other backends to do this work, and at that point I think you may as well piggyback on autovacuum rather than designing a similar mechanism from scratch. Also, while YOU might use such a feature only for indexes that can be rebuilt in a few seconds, I strongly suspect that other people might use it in other ways. In particular, it seems that it would be sensible to use a feature like this for an index that's only used for reporting queries. 
If the database crashes, we'll still have our primary key so we can continue operating, but we'll need to reindex before running the nightly reports. I would think of it as declaring that, instead of making the index recoverable via WAL logging and replay, we make it recoverable by rebuilding. So in that way it is quite unlike unlogged tables, in that we are not risking any data, just giving the database a hint about what the most expeditious way to maintain the index might be. Well, more of an order than a hint, I guess. I think it's six of one, half a dozen of the other. An index by its nature only contains data that is duplicated in a table, so by definition loss of an index isn't risking any data. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] function attributes
On Sat, Dec 11, 2010 at 5:27 PM, Andrew Dunstan andrew.duns...@pgexperts.com wrote: Yesterday I did a bit of work on allowing bytea values to be passed into and out of plperl in binary format, effectively removing the need to escape and de-escape them. (The work can be seen on the plperlargs branch of my development repo at https://github.com/adunstan/postgresql-dev/commits/plperlargs/). At the moment the behaviour is triggered by a custom setting (plperl.pass_binary_bytea), but this isn't really satisfactory. We could turn it on permanently, but that would break a lot of legacy code. What we really need is a way of marking a function with some attributes. Of course, we could put it in the program text like plpgsql's #variable_conflict, but that's really rather ugly. The grammar already has an attribute mechanism for functions, and ISTM we just need to extend that a bit to allow setting of function attributes reasonably flexibly, much as we can now specify format options on EXPLAIN or we'll soon be able to specify options for foreign tables. Thoughts? Well, you could set that GUC (plperl.pass_binary_bytea) on a function without changing any syntax at all. CREATE FUNCTION name (args) ... SET plperl.pass_binary_bytea = true; -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
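Spelled out, Robert's suggestion would look something like this (plperl.pass_binary_bytea is the custom setting from Andrew's experimental branch, not a released GUC; the function body is purely illustrative):

```sql
CREATE FUNCTION flip(val bytea) RETURNS bytea AS $$
    return scalar reverse $_[0];  # raw bytes in, raw bytes out, no escaping
$$ LANGUAGE plperl
SET plperl.pass_binary_bytea = true;
```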
Re: [HACKERS] unlogged tables
On Fri, Dec 10, 2010 at 8:16 PM, Robert Haas robertmh...@gmail.com wrote: I think the first patch (relpersistence-v4.patch) is ready to commit, and the third patch to allow synchronous commits to become asynchronous when it doesn't matter (relax-sync-commit-v1.patch) doesn't seem to be changing much either, although I would appreciate it if someone with more expertise than I have with our write-ahead logging system would give it a quick once-over. I don't understand what the point of the relax-sync-commit patch is. If XactLastRecEnd.xrecoff == 0, then calling XLogFlush(XactLastRecEnd) is pretty much a null operation anyway, because it will short-circuit at the early statement: if (XLByteLE(record, LogwrtResult.Flush)) return; Or at least it had better return at that point, or we might have a serious problem. If XactLastRecEnd.xrecoff == 0, then the only way to keep going is if XactLastRecEnd.xlogid is ahead of LogwrtResult.Flush.xlogid. I guess that could happen legitimately if the logs have recently rolled over the 4GB boundary, and XactLastRecEnd is aware of this while LogwrtResult is not yet aware of it. I don't know if that is a possible state of affairs. If it is, then the result would be that on very rare occasions your patch removes a spurious fsync - harmless except for performance. If somehow XactLastRecEnd gets a falsely advanced value of xlogid, then calling XLogFlush with it would cause a PANIC: xlog write request %X/%X is past end of log %X/%X. So unless people have been seeing this, that must not be able to happen. And looking at the only places XactLastRecEnd.xlogid gets set, I don't see how it could happen. So maybe in your patch: if ((wrote_xlog && XactSyncCommit) || forceSyncCommit || nrels > 0) should be if (wrote_xlog && (XactSyncCommit || forceSyncCommit || nrels > 0)) It seems like on general principles we should not be passing to XLogFlush a structure which is by definition invalid. 
But even if XLogFlush is going to return immediately, that doesn't negate the harm caused by commit_delay doing its thing needlessly. Perhaps that was the original motivation for your patch. Cheers, Jeff
Re: [HACKERS] function attributes
On 12/11/2010 09:16 PM, Robert Haas wrote: Well, you could set that GUC (plperl.pass_binary_bytea) on a function without changing any syntax at all. CREATE FUNCTION name (args) ... SET plperl.pass_binary_bytea = true; Oh, good point. I'd forgotten about that. I'll experiment and see if I can break it :-) cheers andrew
Re: [HACKERS] unlogged tables
On Sat, Dec 11, 2010 at 9:21 PM, Jeff Janes jeff.ja...@gmail.com wrote: On Fri, Dec 10, 2010 at 8:16 PM, Robert Haas robertmh...@gmail.com wrote: I think the first patch (relpersistence-v4.patch) is ready to commit, and the third patch to allow synchronous commits to become asynchronous when it doesn't matter (relax-sync-commit-v1.patch) doesn't seem to be changing much either, although I would appreciate it if someone with more expertise than I have with our write-ahead logging system would give it a quick once-over. I don't understand what the point of the relax-sync-commit patch is. Suppose we begin a transaction, write a bunch of data to a temporary table, and commit. Suppose further that synchronous_commit = off. At transaction commit time, we haven't written any XLOG records yet, but we do have an XID assigned because of the writes to the temporary tables. So we'll issue a commit record. Without this patch, since synchronous_commit = off, we'll force that commit record to disk before acknowledging the commit to the user. However, that's not really necessary because if we crash after acknowledging the commit to the user, the temporary tables will disappear anyway, and our XID doesn't exist on disk any place else - thus, whether the commit makes it to disk before the crash or not will be immaterial on restart. If you have a bunch of transactions that make write to temporary (or unlogged) tables but not to any permanent tables, this makes it muuuch faster. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
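The scenario Robert describes can be sketched as follows (with the patch, the COMMIT below would not have to wait on an XLOG flush, since the transaction wrote only temporary data; row counts are illustrative):

```sql
SET synchronous_commit = off;

BEGIN;
CREATE TEMP TABLE scratch (id int, payload text);
INSERT INTO scratch SELECT g, 'x' FROM generate_series(1, 100000) g;
COMMIT;  -- acknowledged without forcing the commit record to disk
```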
Re: [HACKERS] function attributes
On Sat, Dec 11, 2010 at 9:28 PM, Andrew Dunstan and...@dunslane.net wrote: On 12/11/2010 09:16 PM, Robert Haas wrote: Well, you could set that GUC (plperl.pass_binary_bytea) on a function without changing any syntax at all. CREATE FUNCTION name (args) ... SET plperl.pass_binary_bytea = true; Oh, good point. I'd forgotten about that. I'll experiment and see if I can break it :-) I guess one problem with that approach is that the scoping of the parameter change will be dynamic rather than lexical. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] keeping a timestamp of the last stats reset (for a db, table and function)
On Sat, Dec 11, 2010 at 4:40 PM, t...@fuzzy.cz wrote: Hello you have to respect pg coding style: a) not too long lines b) not C++ line comments OK, thanks for the notice. I've fixed those two problems. Please add this to the currently-open commitfest. https://commitfest.postgresql.org/action/commitfest_view/open -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
[HACKERS] proposal : cross-column stats
Hi everyone, one of the sessions I attended at PgDay last week was Heikki's session about statistics in PostgreSQL. One of the issues he mentioned (and one I regularly run into) is the absence of cross-column stats. When the columns are not independent, this usually results in poor estimates (and thus in suboptimal plans). I had been thinking about this issue before, but that session was the final impulse that pushed me to try to hack up a proof of concept. So here it is ;-)

Let's talk about one special case - I'll explain how the proposed solution works, and then I'll explain how to make it more general, what improvements are possible, and what issues there are. Anyway, this is by no means a perfect or complete solution - it's just a starting point.

Short introduction

Say we have a table with two INT columns - col_a and col_b - and we want to estimate the number of rows for a condition involving both columns:

WHERE (col_a BETWEEN m AND n) AND (col_b BETWEEN p AND q)

When the columns are independent, the estimate is just a matter of multiplication. When the columns are dependent, the estimate may be way off. Let's assume there are histograms with 5 bins for both columns. What I propose is basically building a 2D histogram. It kind of resembles a contingency table. So we have a table like this

col_b \ col_a || 20% | 20% | 20% | 20% | 20% |
==============||=====|=====|=====|=====|=====|
          20% ||  4% |  4% |  4% |  4% |  4% |
          20% ||  4% |  4% |  4% |  4% |  4% |
          20% ||  4% |  4% |  4% |  4% |  4% |
          20% ||  4% |  4% |  4% |  4% |  4% |
          20% ||  4% |  4% |  4% |  4% |  4% |
==============||=====|=====|=====|=====|=====|

where each column / row represents a bin in the original histograms, and each cell holds the expected fraction of rows in it (for truly independent columns, it's 4%). For dependent columns the actual values may be very different, of course - e.g.
for strongly dependent columns it might look like this:

   col_b \ col_a || 20% | 20% | 20% | 20% | 20% |
   ==============||=====|=====|=====|=====|=====|
        20%      || 20% |  0% |  0% |  0% |  0% |
        20%      ||  0% | 20% |  0% |  0% |  0% |
        20%      ||  0% |  0% | 20% |  0% |  0% |
        20%      ||  0% |  0% |  0% | 20% |  0% |
        20%      ||  0% |  0% |  0% |  0% | 20% |
   ==============||=====|=====|=====|=====|=====|

To estimate the number of rows matching the condition, you sum the estimates for the cells matching the condition. I.e. when the condition on col_a matches the lowest 20% of values (the first histogram bin) and the condition on col_b matches the lowest 20% of values, this corresponds to the first cell (20% of rows). The current optimizer estimates this at 4%, as it believes the columns are independent.

I'm not sure whether I've explained it well enough, but the essence of the proposal is to build N-dimensional histograms (where N is the number of columns covered by the statistics), just as we build histograms today.

--- Proof of concept ---

I've hacked a nasty proof of concept in PL/pgSQL (see the attachment). It creates two tables - test_table and cross_stats. The former contains the data (in two columns), the latter is used to store cross-column statistics of test_table.
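To make the estimation step concrete, here is a minimal sketch of the idea in Python (this is an illustration only, not the PL/pgSQL PoC from the attachment; the bin counts, data ranges, and variable names are all invented). It builds an equi-width 2D histogram, estimates a range predicate by summing the fractions of the matching cells, and compares that against the independence assumption (product of the marginal selectivities):

```python
import random

def build_hist2d(rows, bins, lo=0, hi=1000):
    # equi-width 2D histogram: counts[i][j] = number of rows whose
    # (col_a, col_b) pair falls into bin (i, j)
    width = (hi - lo) / bins
    counts = [[0] * bins for _ in range(bins)]
    for a, b in rows:
        i = min(int((a - lo) / width), bins - 1)
        j = min(int((b - lo) / width), bins - 1)
        counts[i][j] += 1
    return counts

def estimate_selectivity(counts, a_range, b_range, lo=0, hi=1000):
    # sum the cells overlapping the query box; assume uniformity inside a bin
    bins = len(counts)
    width = (hi - lo) / bins
    total = sum(map(sum, counts))
    est = 0.0
    for i in range(bins):
        a_lo, a_hi = lo + i * width, lo + (i + 1) * width
        if a_hi <= a_range[0] or a_lo >= a_range[1]:
            continue
        fa = (min(a_hi, a_range[1]) - max(a_lo, a_range[0])) / width
        for j in range(bins):
            b_lo, b_hi = lo + j * width, lo + (j + 1) * width
            if b_hi <= b_range[0] or b_lo >= b_range[1]:
                continue
            fb = (min(b_hi, b_range[1]) - max(b_lo, b_range[0])) / width
            est += counts[i][j] * fa * fb
    return est / total

random.seed(1)
# strongly dependent columns: col_b equals col_a plus a little noise
rows = [(a, min(999, a + random.randint(0, 20)))
        for a in (random.randint(0, 999) for _ in range(10000))]

hist = build_hist2d(rows, 10)
sel_2d = estimate_selectivity(hist, (0, 100), (0, 100))
# independence assumption: multiply the marginal selectivities
sel_a = sum(1 for a, b in rows if a < 100) / len(rows)
sel_b = sum(1 for a, b in rows if b < 100) / len(rows)
true_sel = sum(1 for a, b in rows if a < 100 and b < 100) / len(rows)
print(sel_2d, sel_a * sel_b, true_sel)
```

On this kind of correlated data the 2D estimate stays close to the true selectivity, while the product of the marginals is roughly an order of magnitude too low - which is exactly the failure mode described above.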
Then there are several PL/pgSQL functions - two of them are really important:

collect_stats(p_bins_a INT, p_bins_b INT) - collects the stats; the parameters set the number of bins for each column (the sides of the contingency table), so collect_stats(10,10) builds a contingency table with 100 cells

get_estimate(p_from_a INT, p_to_a INT, p_from_b INT, p_to_b INT) - computes the estimate for the condition listed above (ranges on both columns)

So to run the PoC, just do something like this:

1) fill the table with some data

   INSERT INTO test_table SELECT round(random()*1000), round(random()*1000) FROM generate_series(1,10);

2) collect the cross-column statistics

   SELECT collect_stats(10,10);

3) see the current estimated and actual number of rows

   EXPLAIN ANALYZE SELECT * FROM test_table WHERE (col_a BETWEEN 30 AND 129) AND (col_b BETWEEN 234 AND 484);

4) see the estimate based on the contingency table

   SELECT get_estimate(30, 129, 234, 484);

Just two very simple tests for now - col_a/col_b contain the ranges from the query, then there is the actual number of matching rows and the current estimate, and finally the new estimates based on contingency tables with various numbers of bins.

A) independent columns (proof that it produces just as good estimates as the current code)

      col_a |     col_b | actual | expected | 10x10 | 20x20 |
    [50,70] |   [50,70] |     41 |       40 |    41 |    47 |
   [50,250] |  [50,250] |   4023 |     4024 |  4436 |  3944 |
   [50,250] | [750,950] |   4023 |     3955 |  4509 |
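Test A's claim - that for independent columns the contingency table is no worse than the current product-of-selectivities estimate - can also be checked with a quick self-contained sketch (Python rather than the PoC's PL/pgSQL; the table size, bin width, and names are made up for illustration):

```python
import random

random.seed(2)
n = 10000
# genuinely independent columns, both uniform on [0, 999]
rows = [(random.randint(0, 999), random.randint(0, 999)) for _ in range(n)]

# 10x10 contingency table with bin width 100
table = [[0] * 10 for _ in range(10)]
for a, b in rows:
    table[a // 100][b // 100] += 1

# selectivity of (col_a BETWEEN 0 AND 99) AND (col_b BETWEEN 0 AND 99)
joint = table[0][0] / n                                       # contingency-table estimate
marg = (sum(table[0]) / n) * (sum(r[0] for r in table) / n)   # independence estimate
actual = sum(1 for a, b in rows if a < 100 and b < 100) / n
print(joint, marg, actual)
```

For independent data both estimates land near the true 1% selectivity, so the extra structure costs nothing in accuracy here; the payoff only shows up once the columns are correlated.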
Re: [HACKERS] function attributes
On Dec 11, 2010, at 5:58 PM, Andrew Dunstan wrote:
> create function foo() ... with ( attribute [, ...] )
>
> Currently allowed attributes are isStrict and isCachable. The mechanism is effectively obsolete right now, but we could use it for what I have in mind quite nicely.

Makes sense, though I think I might also like some sort of global default to toggle.

David