Re: [HACKERS] [PATCHES] contrib/pgbench bugfix
Good catch! Thanks. I have committed your fix.
--
Tatsuo Ishii
SRA OSS, Inc. Japan

> I found a buffer overflow bug in contrib/pgbench.
> This occurs when -c >= 2.
>
> The type of 'state' is CState *, so we should use state + 1 or &state[1],
> not state + sizeof(*state).
>
> *** pgbench.c	Mon Jul 31 13:18:45 2006
> --- pgbench.fixed.c	Mon Jul 31 13:18:10 2006
> *** main(int argc, char **argv)
> *** 1344,1350 ****
>   		exit(1);
>   	}
>
> ! 	memset(state + sizeof(*state), 0, sizeof(*state) * (nclients - 1));
>
>   	for (i = 1; i < nclients; i++)
>   	{
> --- 1344,1350 ----
>   		exit(1);
>   	}
>
> ! 	memset(state + 1, 0, sizeof(*state) * (nclients - 1));
>
>   	for (i = 1; i < nclients; i++)
>   	{
>
> Regards,
> ---
> ITAGAKI Takahiro
> NTT Open Source Software Center
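The bug above is the classic C pointer-arithmetic pitfall: adding N to a CState * already advances by N elements, not N bytes, so state + sizeof(*state) skips sizeof(CState) elements and the memset scribbles far past the allocation. A minimal standalone illustration, using a hypothetical Item struct rather than pgbench's actual CState:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct
{
    int     id;
    double  payload[4];
} Item;                         /* stand-in for pgbench's CState */

int
main(void)
{
    int     nclients = 4;
    Item   *state = malloc(sizeof(Item) * nclients);

    if (state == NULL)
        return 1;

    memset(state, 0, sizeof(*state));   /* zero the first element */

    /*
     * WRONG: pointer + integer scales by the element size, so this
     * starts the write sizeof(Item) ELEMENTS in, not one element in:
     *
     *     memset(state + sizeof(*state), 0, sizeof(*state) * (nclients - 1));
     */

    /* RIGHT: state + 1 (equivalently &state[1]) is the second element */
    memset(state + 1, 0, sizeof(*state) * (nclients - 1));

    printf("element stride = %zu bytes\n", sizeof(*state));
    free(state);
    return 0;
}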
Re: [HACKERS] DTrace enabled build fails
Peter Eisentraut wrote:
> That rings a bell. Can we get a more precise designation on what
> version of DTrace we support? And where can one get that required
> update?

Peter,

The problem with static functions was fixed recently and is now available
in Solaris Express (the development version of Solaris). You can get the
bits from http://www.sun.com/software/solaris/solaris-express/get.jsp.
I forgot to mention this known issue in my previous emails! I was told by
the DTrace engineer that this fix will be in the next update of Solaris 10.

Regards,
-Robert
Re: [HACKERS] Let psql process files with > 4,294,967,295 lines
David Fetter <[EMAIL PROTECTED]> writes:
> + #include "pg_config.h"

You should not need that. All PG code assumes that c.h and its
inclusions have already been read.

			regards, tom lane
Re: [HACKERS] Let psql process files with > 4,294,967,295 lines
On Sun, Jul 30, 2006 at 05:40:16PM -0400, Tom Lane wrote:
> Alvaro Herrera <[EMAIL PROTECTED]> writes:
> > David Fetter wrote:
> >> This patch changes the data type from unsigned int to unsigned
> >> long long, which is probably not the correct thing in order to
> >> get 64-bit arithmetic, but I figure it's good enough to get a
> >> discussion started.
>
> > The only thing I can tell you is that you should use INT64_FORMAT
> > instead of %lld.
>
> And the datatype should be declared int64, not "long long" which
> doesn't exist everywhere.
>
> Actually you probably want uint64 and UINT64_FORMAT...
>
> 			regards, tom lane

I think this fixes it, but I'm unsure how to test it. Two of the
methods mentioned in IRC -- attaching with gdb and setting the counter
to a value > 2^32, or setting it directly in some code -- seem like OK
approaches.

Cheers,
D
--
David Fetter <[EMAIL PROTECTED]> http://fetter.org/
phone: +1 415 235 3778  AIM: dfetter666  Skype: davidfetter

Remember to vote!

Index: src/bin/psql/common.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/bin/psql/common.c,v
retrieving revision 1.122
diff -c -r1.122 common.c
*** src/bin/psql/common.c	14 Jul 2006 14:52:26 -0000	1.122
--- src/bin/psql/common.c	31 Jul 2006 01:57:42 -0000
***************
*** 188,194 ****
  	fflush(pset.queryFout);
  	if (pset.inputfile)
! 		fprintf(stderr, "%s:%s:%u: ", pset.progname, pset.inputfile, pset.lineno);
  	va_start(ap, fmt);
  	vfprintf(stderr, _(fmt), ap);
  	va_end(ap);
--- 188,194 ----
  	fflush(pset.queryFout);
  	if (pset.inputfile)
! 		fprintf(stderr, "%s:%s:" UINT64_FORMAT ": ", pset.progname, pset.inputfile, pset.lineno);
  	va_start(ap, fmt);
  	vfprintf(stderr, _(fmt), ap);
  	va_end(ap);
Index: src/bin/psql/mainloop.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/bin/psql/mainloop.c,v
retrieving revision 1.81
diff -c -r1.81 mainloop.c
*** src/bin/psql/mainloop.c	14 Jul 2006 14:52:26 -0000	1.81
--- src/bin/psql/mainloop.c	31 Jul 2006 01:57:42 -0000
***************
*** 44,50 ****
  	/* Save the prior command source */
  	FILE	   *prev_cmd_source;
  	bool		prev_cmd_interactive;
! 	unsigned int prev_lineno;

  	/* Save old settings */
  	prev_cmd_source = pset.cur_cmd_source;
--- 44,50 ----
  	/* Save the prior command source */
  	FILE	   *prev_cmd_source;
  	bool		prev_cmd_interactive;
! 	uint64		prev_lineno;

  	/* Save old settings */
  	prev_cmd_source = pset.cur_cmd_source;
Index: src/bin/psql/settings.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/bin/psql/settings.h,v
retrieving revision 1.27
diff -c -r1.27 settings.h
*** src/bin/psql/settings.h	5 Mar 2006 15:58:52 -0000	1.27
--- src/bin/psql/settings.h	31 Jul 2006 01:57:42 -0000
***************
*** 12,17 ****
--- 12,18 ----
  #include "variables.h"
  #include "print.h"

+ #include "pg_config.h"

  #define DEFAULT_FIELD_SEP "|"
  #define DEFAULT_RECORD_SEP "\n"
***************
*** 50,56 ****
  	char	   *inputfile;		/* for error reporting */
  	char	   *dirname;		/* current directory for \s display */
! 	unsigned	lineno;			/* also for error reporting */

  	bool		timing;			/* enable timing of all queries */
--- 51,57 ----
  	char	   *inputfile;		/* for error reporting */
  	char	   *dirname;		/* current directory for \s display */
! 	uint64		lineno;			/* also for error reporting */

  	bool		timing;			/* enable timing of all queries */
Re: [HACKERS] Hash indexes (was: On-disk bitmap index patch)
On Fri, Jul 28, 2006 at 12:14:49PM -0500, Jim C. Nasby wrote:
> On Thu, Jul 27, 2006 at 01:46:01PM -0400, Alvaro Herrera wrote:
> > Jim Nasby wrote:
> > > On Jul 25, 2006, at 3:31 PM, Tom Lane wrote:
> > > > Hannu Krosing <[EMAIL PROTECTED]> writes:
> > > > > What would be the use-case for hash indexes ? And what should be
> > > > > done to make them faster than btree ?
> > > >
> > > > If we knew, we'd do it ;-)  But no one's put enough effort into it
> > > > to find out.
> > >
> > > Do they use the same hash algorithm as hash joins/aggregation? If so,
> > > wouldn't hash indexes be faster for those operations than regular
> > > indexes?
> >
> > The main problem doesn't seem to be in the hash algorithm (which I
> > understand to mean the hashing function), but in the protocol for
> > concurrent access of index pages, and the distribution of keys in pages
> > of a single hash key.
> >
> > This is described in a README file or a code comment somewhere in the
> > hash AM code. Someone needs to do some profiling to find out what the
> > bottleneck really is, and ideally find a way to fix it.
>
> What I'm getting at is that I've never seen any explanation for the
> theoretical use cases where a hash index would outperform a btree. If we
> knew what kind of problems hash indexes were supposed to solve, we could
> try and interest people who are solving those kinds of problems in
> fixing hash indexes.

The big win for hash indexes is the idea that searching for a single
value should take only 1 I/O operation in a perfect world. A btree
cannot do that.

Ken
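To put a number on Ken's "1 I/O" claim: in a hash index the key alone determines the bucket page, while a btree has to visit one page per level from root to leaf. A toy back-of-the-envelope sketch -- hypothetical arithmetic only, not the real hash AM code:

#include <inttypes.h>
#include <stdio.h>

/* Toy hash: the bucket page is computed from the key, not searched for. */
static uint32_t
toy_hash(uint32_t key)
{
    return key * 2654435761u;       /* Knuth multiplicative hash */
}

/* Hash index: one page read in the ideal (no overflow pages) case. */
static uint32_t
hash_lookup_page(uint32_t key, uint32_t nbuckets)
{
    return toy_hash(key) % nbuckets;    /* 1 I/O: fetch this page */
}

/* Btree: one page per tree level (root -> internal -> leaf). */
static int
btree_pages_visited(uint64_t ntuples, int fanout)
{
    int levels = 1;

    for (uint64_t reach = fanout; reach < ntuples; reach *= (uint64_t) fanout)
        levels++;
    return levels;                      /* O(log_fanout n) I/Os */
}

int
main(void)
{
    printf("hash:  page %" PRIu32 " (1 I/O in the ideal case)\n",
           hash_lookup_page(42, 1024));
    printf("btree: %d page visits for 10M tuples, fanout 256\n",
           btree_pages_visited(10 * 1000 * 1000, 256));
    return 0;
}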
Re: [HACKERS] 64-bit integers for GUC
Robert Treat <[EMAIL PROTECTED]> writes:
> On Tuesday 25 July 2006 14:28, Josh Berkus wrote:
> > To be quite frank, current PostgreSQL can't effectively use more than
> > 256mb of work_mem anyway. We'd like to fix that, but it's not fixed yet
> > AFAIK.

> Josh, can you clarify this statement for me?

Perhaps I shouldn't put words in Josh's mouth, but I *think* what he
meant is that the tuplesort code does not get any faster once work_mem
exceeds a few hundred meg. I believe we've addressed that to some extent
in CVS HEAD, but it's a fair gripe against the existing release branches.

I'm not aware that anyone has done any work to characterize performance
vs work_mem setting for any of the other uses of work_mem (such as hash
table sizes).

			regards, tom lane
Re: [HACKERS] 64-bit integers for GUC
On Tuesday 25 July 2006 14:28, Josh Berkus wrote:
> Peter,
>
> > I wonder whether platforms with INT64_IS_BROKEN can address more than
> > 2GB of memory anyway.
>
> To be quite frank, current PostgreSQL can't effectively use more than
> 256mb of work_mem anyway. We'd like to fix that, but it's not fixed yet
> AFAIK.

Josh, can you clarify this statement for me? Using work_mem higher than
256MB is common practice in certain cases (a database restore, for
example). Are you speaking in a high-volume OLTP sense, or something
beyond that?

--
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
Re: [HACKERS] [PATCHES] extension for sql update
On Sun, 2006-07-30 at 20:20 -0400, Robert Treat wrote:
> On Thursday 27 July 2006 09:28, Bruce Momjian wrote:
> > Tom Lane wrote:
> > > Bruce Momjian <[EMAIL PROTECTED]> writes:
> > > > Tom Lane wrote:
> > > > > UPDATE mytab SET (foo, bar, baz) =
> > > > > (SELECT alpha, beta, gamma FROM othertab WHERE key = mytab.key);
> > > >
> > > > That UPDATE example is interesting because I remember when using
> > > > Informix that I had to do a separate SELECT statement for each
> > > > UPDATE column I wanted to update. I didn't realize that you could
> > > > group columns and assign them from a single select --- clearly
> > > > that is a powerful syntax we should support some day.
> > >
> > > No question. The decision at hand is whether we want to look like
> > > we support it, when we don't yet. I'd vote not, because I think the
> > > main use-case for the row-on-the-left syntax is exactly this, and
> > > so I fear people will just get frustrated if they see it in the
> > > syntax synopsis and try to use it.
>
> I'm not a big fan of implementing partial solutions (remember
> "left-joins are not implemented" messages :-) way back when), however
> in my experience with this form of the update command, the primary
> usage is not to use a subselect to derive the values, but to make it
> easier to generate sql, using a single

I disagree.

UPDATE mytab SET (foo, bar, baz) = (SELECT ...) is the specification's
way of doing an update with a join. That is its primary purpose.

UPDATE ... FROM is a PostgreSQL alternative to the above.

--
Re: [HACKERS] [PATCHES] extension for sql update
On Thursday 27 July 2006 09:28, Bruce Momjian wrote:
> Tom Lane wrote:
> > Bruce Momjian <[EMAIL PROTECTED]> writes:
> > > Tom Lane wrote:
> > > > UPDATE mytab SET (foo, bar, baz) =
> > > > (SELECT alpha, beta, gamma FROM othertab WHERE key = mytab.key);
> > >
> > > That UPDATE example is interesting because I remember when using
> > > Informix that I had to do a separate SELECT statement for each UPDATE
> > > column I wanted to update. I didn't realize that you could group
> > > columns and assign them from a single select --- clearly that is a
> > > powerful syntax we should support some day.
> >
> > No question. The decision at hand is whether we want to look like
> > we support it, when we don't yet. I'd vote not, because I think the
> > main use-case for the row-on-the-left syntax is exactly this, and
> > so I fear people will just get frustrated if they see it in the
> > syntax synopsis and try to use it.

I'm not a big fan of implementing partial solutions (remember
"left-joins are not implemented" messages :-) way back when), however in
my experience with this form of the update command, the primary usage is
not to use a subselect to derive the values, but to make it easier to
generate sql, using a single update statement, based on an array of
passed-in values (in languages like perl/php/etc...). This solution
would solve that problem for us, so I would lean toward including it. I
would be interested in hearing from actual users who really need the
subselect version, though; right now my thinking is that group is a
small minority of who would benefit from this version of the update
command.

--
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
Re: [HACKERS] request: support of array in plperl OUT arguments
On Fri, Jul 28, 2006 at 10:42:49AM +0200, Pavel Stehule wrote:
> Hello,
>
> I miss better support for OUT arguments in plperlu:
>
> create or replace function foo(out p varchar[]) as $$
>   return { p => [pavel, jana] };
> $$ language plperlu;
>
> postgres=# select foo();
> ERROR:  array value must start with "{" or dimension information
> postgres=#
>
> I'm starting work on it. I hope it will be done before the current
> feature freeze.
>
> Regards
> Pavel Stehule

It seems Pavel missed sending the preliminary patch, so here it is :)

Cheers,
D
--
David Fetter <[EMAIL PROTECTED]> http://fetter.org/
phone: +1 415 235 3778  AIM: dfetter666  Skype: davidfetter

Remember to vote!

*** ./plperl.c.orig	2006-07-29 21:07:09.000000000 +0200
--- ./plperl.c	2006-07-30 22:50:56.000000000 +0200
***************
*** 52,57 ****
--- 52,58 ----
  	FmgrInfo	result_in_func;	/* I/O function and arg for result type */
  	Oid			result_typioparam;
  	int			nargs;
+ 	int			num_out_args;	/* number of out arguments */
  	FmgrInfo	arg_out_func[FUNC_MAX_ARGS];
  	bool		arg_is_rowtype[FUNC_MAX_ARGS];
  	SV		   *reference;
***************
*** 117,122 ****
--- 118,126 ----
  static void plperl_init_shared_libs(pTHX);
  static HV  *plperl_spi_execute_fetch_result(SPITupleTable *, int, int);

+ static SV  *plperl_convert_to_pg_array(SV *src);
+
+
  /*
   * This routine is a crock, and so is everyplace that calls it.  The problem
   * is that the cached form of plperl functions/queries is allocated permanently
***************
*** 412,418 ****
  					(errcode(ERRCODE_UNDEFINED_COLUMN),
  					 errmsg("Perl hash contains nonexistent column \"%s\"",
  							key)));
! 		if (SvOK(val) && SvTYPE(val) != SVt_NULL)
  			values[attn - 1] = SvPV(val, PL_na);
  	}
  	hv_iterinit(perlhash);
--- 416,427 ----
  					(errcode(ERRCODE_UNDEFINED_COLUMN),
  					 errmsg("Perl hash contains nonexistent column \"%s\"",
  							key)));
!
! 		/* if the value is a ref to an array, do Perl-to-pg string array conversion */
! 		if (SvTYPE(val) == SVt_RV &&
! 			SvTYPE(SvRV(val)) == SVt_PVAV)
! 			values[attn - 1] = SvPV(plperl_convert_to_pg_array(val), PL_na);
! 		else if (SvOK(val) && SvTYPE(val) != SVt_NULL)
  			values[attn - 1] = SvPV(val, PL_na);
  	}
  	hv_iterinit(perlhash);
***************
*** 691,702 ****
  	HeapTuple	tuple;
  	Form_pg_proc proc;
  	char		functyptype;
- 	int			numargs;
- 	Oid		   *argtypes;
- 	char	  **argnames;
- 	char	   *argmodes;
  	bool		istrigger = false;
- 	int			i;

  	/* Get the new function's pg_proc entry */
  	tuple = SearchSysCache(PROCOID,
--- 700,706 ----
***************
*** 724,740 ****
  						format_type_be(proc->prorettype))));
  	}

- 	/* Disallow pseudotypes in arguments (either IN or OUT) */
- 	numargs = get_func_arg_info(tuple,
- 								&argtypes, &argnames, &argmodes);
- 	for (i = 0; i < numargs; i++)
- 	{
- 		if (get_typtype(argtypes[i]) == 'p')
- 			ereport(ERROR,
- 					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- 					 errmsg("plperl functions cannot take type %s",
- 							format_type_be(argtypes[i]))));
- 	}

  	ReleaseSysCache(tuple);
--- 728,733 ----
***************
*** 1014,1019 ****
--- 1007,1065 ----
  	return retval;
  }

+ /*
+  * Verify the type of the result if the proc has out params, and transform
+  * it to a scalar if the proc has only one out parameter
+  */
+ static SV *
+ plperl_transform_result(plperl_proc_desc *prodesc, SV *result)
+ {
+ 	bool		exactly_one_field = false;
+ 	HV		   *hvr;
+ 	SV		   *val;
+ 	char	   *key;
+ 	I32			klen;
+
+ 	if (prodesc->num_out_args > 0)
+ 	{
+ 		if (!SvOK(result) || SvTYPE(result) != SVt_RV ||
+ 			SvTYPE(SvRV(result)) != SVt_PVHV)
+ 		{
+ 			ereport(ERROR,
+ 					(errcode(ERRCODE_DATATYPE_MISMATCH),
+ 					 errmsg("Perl function with OUT arguments"
+ 							" must return reference to hash")));
+ 		}
Re: [HACKERS] Let psql process files with > 4,294,967,295 lines
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> David Fetter wrote:
> > This patch changes the data type from unsigned int to unsigned long
> > long, which is probably not the correct thing in order to get 64-bit
> > arithmetic, but I figure it's good enough to get a discussion started.

> The only thing I can tell you is that you should use INT64_FORMAT
> instead of %lld.

And the datatype should be declared int64, not "long long" which
doesn't exist everywhere.

Actually you probably want uint64 and UINT64_FORMAT...

			regards, tom lane
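The pattern Tom recommends relies on C's adjacent-string-literal concatenation: the macro expands to whatever conversion spec ("%llu", "%lu", "%I64u", ...) the platform needs, chosen at configure time. A self-contained sketch using the standard PRIu64 from <inttypes.h> in place of PostgreSQL's configure-selected UINT64_FORMAT; the file name is illustrative:

#include <inttypes.h>
#include <stdio.h>

int
main(void)
{
    uint64_t lineno = 5000000000ULL;    /* > 2^32, overflows unsigned int */

    /*
     * Adjacent string literals are concatenated at compile time, so the
     * macro splices the platform's 64-bit conversion spec into the format
     * string.  PostgreSQL's UINT64_FORMAT plays the same role as PRIu64.
     */
    fprintf(stderr, "%s:%s:%" PRIu64 ": some error\n",
            "psql", "huge_dump.sql", lineno);

    /* Hard-coding "%u" here would truncate the value to 32 bits. */
    return 0;
}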
Re: [HACKERS] Let psql process files with > 4,294,967,295 lines
David Fetter wrote:

Hi,

> I just ran across an issue where in psql, people can get the line
> number in the file so long as it is under 2^32-1 lines long, but once
> it gets larger than that, it's hosed.
>
> This patch changes the data type from unsigned int to unsigned long
> long, which is probably not the correct thing in order to get 64-bit
> arithmetic, but I figure it's good enough to get a discussion started.

The only thing I can tell you is that you should use INT64_FORMAT
instead of %lld.

--
Alvaro Herrera                         http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
[HACKERS] Let psql process files with > 4,294,967,295 lines
Folks,

I just ran across an issue where in psql, people can get the line number
in the file so long as it is under 2^32-1 lines long, but once it gets
larger than that, it's hosed.

This patch changes the data type from unsigned int to unsigned long
long, which is probably not the correct thing in order to get 64-bit
arithmetic, but I figure it's good enough to get a discussion started.

Cheers,
D
--
David Fetter <[EMAIL PROTECTED]> http://fetter.org/
phone: +1 415 235 3778  AIM: dfetter666  Skype: davidfetter

Remember to vote!

Index: src/bin/psql/common.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/bin/psql/common.c,v
retrieving revision 1.122
diff -c -r1.122 common.c
*** src/bin/psql/common.c	14 Jul 2006 14:52:26 -0000	1.122
--- src/bin/psql/common.c	30 Jul 2006 21:08:39 -0000
***************
*** 188,194 ****
  	fflush(pset.queryFout);
  	if (pset.inputfile)
! 		fprintf(stderr, "%s:%s:%u: ", pset.progname, pset.inputfile, pset.lineno);
  	va_start(ap, fmt);
  	vfprintf(stderr, _(fmt), ap);
  	va_end(ap);
--- 188,194 ----
  	fflush(pset.queryFout);
  	if (pset.inputfile)
! 		fprintf(stderr, "%s:%s:%lld: ", pset.progname, pset.inputfile, pset.lineno);
  	va_start(ap, fmt);
  	vfprintf(stderr, _(fmt), ap);
  	va_end(ap);
Index: src/bin/psql/mainloop.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/bin/psql/mainloop.c,v
retrieving revision 1.81
diff -c -r1.81 mainloop.c
*** src/bin/psql/mainloop.c	14 Jul 2006 14:52:26 -0000	1.81
--- src/bin/psql/mainloop.c	30 Jul 2006 21:08:40 -0000
***************
*** 44,50 ****
  	/* Save the prior command source */
  	FILE	   *prev_cmd_source;
  	bool		prev_cmd_interactive;
! 	unsigned int prev_lineno;

  	/* Save old settings */
  	prev_cmd_source = pset.cur_cmd_source;
--- 44,50 ----
  	/* Save the prior command source */
  	FILE	   *prev_cmd_source;
  	bool		prev_cmd_interactive;
! 	unsigned long long prev_lineno;

  	/* Save old settings */
  	prev_cmd_source = pset.cur_cmd_source;
Index: src/bin/psql/settings.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/bin/psql/settings.h,v
retrieving revision 1.27
diff -c -r1.27 settings.h
*** src/bin/psql/settings.h	5 Mar 2006 15:58:52 -0000	1.27
--- src/bin/psql/settings.h	30 Jul 2006 21:08:40 -0000
***************
*** 50,56 ****
  	char	   *inputfile;		/* for error reporting */
  	char	   *dirname;		/* current directory for \s display */
! 	unsigned	lineno;			/* also for error reporting */

  	bool		timing;			/* enable timing of all queries */
--- 50,56 ----
  	char	   *inputfile;		/* for error reporting */
  	char	   *dirname;		/* current directory for \s display */
! 	unsigned long long lineno;	/* also for error reporting */

  	bool		timing;			/* enable timing of all queries */
Re: [HACKERS] Relation locking and relcache load (was Re: Going for "all green" buildfarm results)
I wrote:
> ... This means the only thing stopping us from
> taking lock before we invoke relcache is lack of knowledge about the
> rel's relisshared status. Given that the set of shared relations is
> pretty small, and fixed in any given backend version, it wouldn't be
> unreasonable to handle this by keeping a hardwired list of shared
> relation OIDs somewhere in the backend.

On further investigation, there is one small stumbling block in that
plan. We currently have hard-wired OIDs for shared catalogs and their
indexes ... but not for their toast tables (nor the indexes thereon).

I think the best solution for this might be to put the responsibility
for creating system catalogs' toast tables into the bootstrap phase
instead of making initdb do it afterwards. This would be a Good Thing
anyway, since currently we are incapable of dealing with bootstrap-time
insertions of values large enough to need toasting.

I'm imagining adding macros to the include/catalog/*.h files along the
lines of

	TOAST_CATALOG(pg_class, 4242, 4243)

where the numbers are hand-assigned OIDs for the toast table and index.
The existing mechanisms for creating the .bki bootstrap script would
turn these into commands in the .bki script.

We could then get rid of ALTER TABLE ... CREATE TOAST TABLE as a SQL
command altogether, which seems like a good thing to me. Anyone want to
argue for keeping it? There really shouldn't be any case where a user
needs to invoke it.

Thoughts, objections?

			regards, tom lane
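For context on the mechanism Tom describes: macros of this kind in the catalog headers are not meant for the C compiler at all; the scripts that generate the .bki bootstrap file scan the headers for them. A sketch of one plausible shape -- the macro definition and the OIDs here are illustrative only, not a committed design:

/*
 * To the C compiler the macro is a harmless no-op declaration; the .bki
 * generation tooling greps the headers for TOAST_CATALOG(...) lines and
 * emits bootstrap commands that create the toast table and its index
 * with the hand-assigned OIDs.
 */
#ifndef TOAST_CATALOG
#define TOAST_CATALOG(name, toastoid, toastindexoid) \
	extern int no_such_variable
#endif

/* hypothetical usage, one line per catalog that needs a toast table: */
TOAST_CATALOG(pg_class, 4242, 4243);
TOAST_CATALOG(pg_proc, 4244, 4245);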
Re: [HACKERS] problem with volatile functions in subselects ?
"Sergey E. Koposov" <[EMAIL PROTECTED]> writes: > I see the very strange behaviour with the following set of queries: > wsdb=# select na,nb, na::double precision as da, nb::double precision as db > from ( select random()::numeric as na,random()::numeric as nb from > generate_series(1,2)) as xx; The planner "flattens" this query to a single level of SELECT, so what you effectively have is select random()::numeric as na, random()::numeric as nb, random()::numeric::double precision as da, random()::numeric::double precision as db from generate_series(1,2) as xx; There's been some talk about prohibiting flattening if there are any volatile functions in the subselect's targetlist, but nothing's been done about that. Disabling flattening is a sufficiently big hit to the planner's optimization ability that it shouldn't be done lightly. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] [PATCHES] PATCH to allow concurrent VACUUMs to not lock each
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
> > Knew I should have taken time to review that patch before it went in ...

> Which one? The one I applied doesn't have this change.

Never mind --- I misunderstood the context of the discussion and thought
you had made larger changes in the last version of the patch than I was
expecting ...

The patch as committed looks fine to me, modulo a couple of comments
which I've fixed. One thing that slightly troubles me is that
GetOldestXmin will now ignore a lazy vacuum's *own* xmin, which is not
like the previous behavior. Offhand I can't see a reason why this is not
safe, but maybe it'd have been better for it to do

+ 		if (ignoreVacuum && proc->inVacuum && proc != MyProc)
+ 			continue;

Thoughts?

			regards, tom lane
[HACKERS] 64 bits bitwise operations support
Is there a way to tell if there is support for 64-bit bitwise
operations, since c.h apparently defines INT64_IS_BUSTED when there is
no support for 64-bit variables? 10x.

--
Regards,
Tzahi.
--
Tzahi Fadida
Blog: http://tzahi.blogsite.org | Home Site: http://tzahi.webhop.info
WARNING TO SPAMMERS: see at
http://members.lycos.co.uk/my2nis/spamwarning.html
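One hedged answer, assuming the c.h conventions of this era (the INT64_IS_BUSTED spelling matches c.h of that period; treat the rest as illustrative): once int64/uint64 work at all, bitwise operations on them are plain C, so code that needs them can simply refuse to build, or fall back, when the macro is defined. A sketch:

#include "postgres.h"	/* brings in c.h and the int64/uint64 typedefs */

#ifdef INT64_IS_BUSTED
#error "this module requires working 64-bit integers"
#else

/* Bitwise ops on uint64 need no special support beyond a working int64. */
static uint64
set_bit64(uint64 mask, int bit)
{
	return mask | (((uint64) 1) << bit);
}

static bool
test_bit64(uint64 mask, int bit)
{
	return (mask & (((uint64) 1) << bit)) != 0;
}

#endif   /* INT64_IS_BUSTED */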
[HACKERS] Relation locking and relcache load (was Re: Going for "all green" buildfarm results)
I wrote:
> I bet Alvaro's spotted the problem. ALTER INDEX RENAME doesn't seem to
> take any lock on the index's parent table, only on the index itself.
> That means that a query on "onek" could be trying to read the pg_class
> entries for onek's indexes concurrently with someone trying to commit
> a pg_class update to rename an index. If the query manages to visit
> the new and old versions of the row in that order, and the commit
> happens between, *neither* of the versions would look valid. MVCC
> doesn't save us because this is all SnapshotNow.

On further reflection, it seems this is a special case of the general
problem that we ought to lock a relation *before* we attempt to load the
relcache entry, not after. We've seen previous complaints about this,
for instance a process erroring out with a message about pg_trigger
having a different number of entries than it expected, because someone
had committed an ADD/DROP TRIGGER between it reading pg_class and
reading pg_trigger.

A long time ago (30 Oct 2002) I wrote:
> Thinking further, it's really kinda bogus that LockRelation() works on
> an already-opened Relation; if possible we should acquire the lock
> before attempting to create a relcache entry. (We only need to know the
> OID and the relisshared status before we can make a locktag, so it'd be
> possible to acquire the lock using only the contents of the pg_class
> row.) Not sure how much code restructuring might be involved to make
> this happen, but it'd be worth thinking about for 7.4.

but that never got off the back burner. The current example shows that
"read the pg_class row and then lock" doesn't work anyway, because we
might fail to find any version of the pg_class row that passes
SnapshotNow.

One thing that has improved since 7.3 is that we only do relcache loads
by OID, never by name. This means the only thing stopping us from taking
lock before we invoke relcache is lack of knowledge about the rel's
relisshared status. Given that the set of shared relations is pretty
small, and fixed in any given backend version, it wouldn't be
unreasonable to handle this by keeping a hardwired list of shared
relation OIDs somewhere in the backend.

Anyway, I really think we need to move to a coding rule that says thou
shalt not access a relcache entry without *already* having some kind of
lock on the relation. Comments?

			regards, tom lane
[HACKERS] problem with volatile functions in subselects ?
Hello Hackers,

I see very strange behaviour with the following set of queries:

wsdb=# select na, nb, na::double precision as da, nb::double precision as db
       from (select random()::numeric as na, random()::numeric as nb
             from generate_series(1,2)) as xx;
        na         |        nb         |        da         |        db
-------------------+-------------------+-------------------+-------------------
 0.756045001445359 | 0.505602368389071 | 0.283893094995941 | 0.160685719065687
 0.792114335015469 | 0.416411793053342 | 0.342387438445532 | 0.531201674850286
(2 rows)

To my understanding, that should produce an "na" column equal to "da"?
When I do the same with a select from a table, the result is similar:

wsdb=# select na, nb, na::double precision as da, nb::double precision as db
       from (select random()::numeric as na, random()::numeric as nb
             from pg_proc) as xx;
        na         |         nb         |        da        |         db
-------------------+--------------------+------------------+--------------------
 0.125243402610181 | 0.620239329347498  | 0.64666960465101 | 0.257827353318141
 0.934299875951512 | 0.0322264223509591 | 0.96565025298188 | 0.0439542480949099

But when I limit the select, I get the expected result:

wsdb=# select na, nb, na::double precision as da, nb::double precision as db
       from (select random()::numeric as na, random()::numeric as nb
             from pg_proc limit 2) as xx;
        na         |        nb         |        da         |        db
-------------------+-------------------+-------------------+-------------------
 0.543030349324937 | 0.925069289712733 | 0.543030349324937 | 0.925069289712733
 0.934251406665077 | 0.292522935332974 | 0.934251406665077 | 0.292522935332974
(2 rows)

Is that a bug, or am I missing something? PG version is 8.1.4 or 8.2dev.

Regards,
	Sergey

*******************************************************************
Sergey E. Koposov
Max Planck Institute for Astronomy/Sternberg Astronomical Institute
Tel: +49-6221-528-349
Web: http://lnfm1.sai.msu.ru/~math
E-mail: [EMAIL PROTECTED]
Re: [PATCHES] [HACKERS] 8.2 features?
Tom Lane wrote:
> So what I'm currently thinking is
>
> 1. Implement ValuesScan.
>
> 2. Convert all existing uses of Result without a child node into
> ValuesScan.
>
> 3. Rename Result to Filter and rip out whatever code is only used for
> the no-child-node case.
>
> Steps 2 and 3 are just in the nature of housekeeping and can wait till
> after the VALUES feature is in.

Sounds good to me.

> As far as avoiding overhead goes, here's what I'm thinking:
>
> * The Values RTE node should contain a list of lists of bare
> expressions, without TargetEntry decoration (you probably do not need
> ResTarget in the raw parse tree for VALUES, either).
>
> * The ValuesScan plan node will just reference this list-of-lists
> (avoiding making a copy). It will need to contain a targetlist because
> all plan nodes do, but the base version of that will just be a trivial
> "Var 1", "Var 2", etc. (The planner might replace that with a
> nontrivial targetlist in cases such as the example above.)

I'll work on that today.

> * At runtime, ValuesScan evaluates each sublist of expressions and
> stores the results into a virtual tuple slot which is returned as the
> "scan tuple" to execScan. If the targetlist is nontrivial then it is
> evaluated with this tuple as input. If the targetlist is a trivial Var
> list then the existing "physical tuple" optimization kicks in and
> execScan will just return the scan tuple unmodified. So for INSERT ...
> VALUES, the execScan layer will cost us nothing in memory space and not
> much in execution time.
>
> There are still some things I don't like about the way you did
> ValuesScan but I'll work on improving that.

OK.

Thanks,

Joe
Re: [HACKERS] Do we need multiple forms of the SQL2003 statistics
On Sat, 29 Jul 2006, Tom Lane wrote:
> Peter Eisentraut <[EMAIL PROTECTED]> writes:
> > ... From a mathematician's point of view, however, some of these
> > functions normally produce irrational numbers anyway, so it seems
> > unlikely that numeric will be useful. But looking at the definition
> > of, say, regr_avgx(Y, X), if all the input values are integers, it
> > might be useful if I could get an exact integer or rational number as
> > output, instead of a float, that is. The question is whether this is
> > useful enough to justify adding a thousand lines to numeric.c.
>
> I believe also that the numeric forms of the aggregates will be
> enormously slower than the float forms, at least on most modern
> machines with decent floating-point performance. I don't have time to
> do any performance testing though.

Based on just simple tests on my laptop (1.8GHz Centrino), the numeric
versions are 5-10 times slower (depending on which two-argument
function is used).

The example:

template1=# \d test
      Table "public.test"
 Column |       Type       | Modifiers
--------+------------------+-----------
 na     | numeric          |
 nb     | numeric          |
 da     | double precision |
 db     | double precision |

template1=# select count(*) from test;
 count
-------
 100
(1 row)

template1=# select regr_avgx(da,db) from test;
    regr_avgx
-----------------
 0.5002412120227
(1 row)

Time: 1052,893 ms

template1=# select regr_avgx(na,nb) from test;
        regr_avgx
-------------------------
 0.500040167263887822939
(1 row)

Time: 4459,739 ms

template1=# select regr_sxx(da,db) from test;
     regr_sxx
------------------
 83303.6317359119
(1 row)

Time: 1043,891 ms

template1=# select regr_sxx(na,nb) from test;
                     regr_sxx
--------------------------------------------------
 83342.044294954140912267902323775495680113567986
(1 row)

Time: 8514,843 ms

template1=# select corr(da,db) from test;
         corr
----------------------
 0.000527588261283456
(1 row)

Time: 1074,948 ms

template1=# select corr(na,nb) from test;
                                          corr
----------------------------------------------------------------------------------------
 0.000759857150984988517883855238363403977440313567465424735082001422354119457463407737
(1 row)

Time: 18327,376 ms

Yes, that's significantly slower, but I don't think it makes the
functions unusable...

Regards,
	Sergey

*******************************************************************
Sergey E. Koposov
Max Planck Institute for Astronomy/Sternberg Astronomical Institute
Tel: +49-6221-528-349
Web: http://lnfm1.sai.msu.ru/~math
E-mail: [EMAIL PROTECTED]
Re: [PATCHES] [HACKERS] 8.2 features?
I wrote:
> I still dislike the way you're doing things in the executor though.
> I don't see the point of using the execScan.c machinery; most of the
> time that'll be useless overhead. As I said before, I think the right
> direction here is to split Result into two single-purpose node types
> and make the non-filter version capable of taking a list of
> targetlists.

After more thought I've reconsidered this. The "ValuesScan" node is
still redundant with Result's non-filter case, but we should probably
get rid of the latter, not the former. The reason is that in the general
case of VALUES-in-FROM, we do need all the generality of execScan.
Consider

	SELECT x, y, x+y FROM (VALUES (1,2),(3,4),...) AS foo(x,y)
	WHERE x < y;

which AFAICS is perfectly legal SQL. We need a qual condition for the
WHERE and a projection step to form the x+y result. We could make a
non-filtering Result clause do all that, but it'd really be reinventing
the execScan wheel.

So what I'm currently thinking is

1. Implement ValuesScan.

2. Convert all existing uses of Result without a child node into
ValuesScan.

3. Rename Result to Filter and rip out whatever code is only used for
the no-child-node case.

Steps 2 and 3 are just in the nature of housekeeping and can wait till
after the VALUES feature is in.

As far as avoiding overhead goes, here's what I'm thinking:

* The Values RTE node should contain a list of lists of bare
expressions, without TargetEntry decoration (you probably do not need
ResTarget in the raw parse tree for VALUES, either).

* The ValuesScan plan node will just reference this list-of-lists
(avoiding making a copy). It will need to contain a targetlist because
all plan nodes do, but the base version of that will just be a trivial
"Var 1", "Var 2", etc. (The planner might replace that with a
nontrivial targetlist in cases such as the example above.)

* At runtime, ValuesScan evaluates each sublist of expressions and
stores the results into a virtual tuple slot which is returned as the
"scan tuple" to execScan. If the targetlist is nontrivial then it is
evaluated with this tuple as input. If the targetlist is a trivial Var
list then the existing "physical tuple" optimization kicks in and
execScan will just return the scan tuple unmodified. So for INSERT ...
VALUES, the execScan layer will cost us nothing in memory space and not
much in execution time.

There are still some things I don't like about the way you did
ValuesScan but I'll work on improving that.

			regards, tom lane
Re: [HACKERS] [PATCHES] New variable server_version_num
On Sun, Jul 30, 2006 at 12:17:57PM -0400, Jonah H. Harris wrote:
> On 7/30/06, David Fetter <[EMAIL PROTECTED]> wrote:
> > Failure to parse means the transaction bails out, which is just what
> > I want in my case, as it disallows people attempting to run the
> > programs--they're for DBI-Link--on too early a version of
> > PostgreSQL. As there are some subtleties to the implementation, I
> > need something that quickly returns boolean or fails entirely when
> > it detects same.
>
> From an application development standpoint, it would be nice to have
> a strictly numeric version returning function for checking server
> compatibility.

It sure would :)

Cheers,
D (whose boolean function is the output of a numeric comparison between
the required server version and the one at hand)
--
David Fetter <[EMAIL PROTECTED]> http://fetter.org/
phone: +1 415 235 3778  AIM: dfetter666  Skype: davidfetter

Remember to vote!
Re: [HACKERS] [PATCHES] New variable server_version_num
On 7/30/06, David Fetter <[EMAIL PROTECTED]> wrote:
> Failure to parse means the transaction bails out, which is just what I
> want in my case, as it disallows people attempting to run the
> programs--they're for DBI-Link--on too early a version of PostgreSQL.
> As there are some subtleties to the implementation, I need something
> that quickly returns boolean or fails entirely when it detects same.

From an application development standpoint, it would be nice to have a
strictly numeric version-returning function for checking server
compatibility.

--
Jonah H. Harris, Software Architect | phone: 732.331.1300
EnterpriseDB Corporation            | fax: 732.331.1301
33 Wood Ave S, 2nd Floor            | [EMAIL PROTECTED]
Iselin, New Jersey 08830            | http://www.enterprisedb.com/
Re: [HACKERS] [PATCHES] New variable server_version_num
On Sun, Jul 30, 2006 at 11:27:33AM -0400, Tom Lane wrote:
> David Fetter <[EMAIL PROTECTED]> writes:
> > On Sat, Jul 29, 2006 at 09:44:10PM -0400, Tom Lane wrote:
> > > The correct solution is for client-side libraries to provide the
> > > feature.
>
> > Not if the app is written in SQL, as the bootstrap, regression test,
> > etc. code for modules frequently is.
>
> SQL doesn't really have any conditional ability strong enough to deal
> with existence or non-existence of features. What are you hoping to
> do, a CASE expression? Both arms of the CASE still have to parse, so
> I remain unconvinced that there are real world uses.

Failure to parse means the transaction bails out, which is just what I
want in my case, as it disallows people attempting to run the
programs--they're for DBI-Link--on too early a version of PostgreSQL.
As there are some subtleties to the implementation, I need something
that quickly returns boolean or fails entirely when it detects same.

Cheers,
D
--
David Fetter <[EMAIL PROTECTED]> http://fetter.org/
phone: +1 415 235 3778  AIM: dfetter666  Skype: davidfetter

Remember to vote!
Re: [HACKERS] Going for "all green" buildfarm results
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> Stefan Kaltenbrunner wrote:
> > FYI: lionfish just managed to hit that problem again:
> > http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=lionfish&dt=2006-07-29%2023:30:06

> The test alter_table, which is on the same parallel group as limit (the
> failing test), contains these lines:
> ALTER INDEX onek_unique1 RENAME TO tmp_onek_unique1;
> ALTER INDEX tmp_onek_unique1 RENAME TO onek_unique1;

I bet Alvaro's spotted the problem. ALTER INDEX RENAME doesn't seem to
take any lock on the index's parent table, only on the index itself.
That means that a query on "onek" could be trying to read the pg_class
entries for onek's indexes concurrently with someone trying to commit a
pg_class update to rename an index. If the query manages to visit the
new and old versions of the row in that order, and the commit happens
between, *neither* of the versions would look valid. MVCC doesn't save
us because this is all SnapshotNow.

Not sure what to do about this. Trying to lock the parent table could
easily be a cure-worse-than-the-disease, because it would create
deadlock risks (we've already locked the index before we could look up
and lock the parent). Thoughts?

The path of least resistance might just be to not run these tests in
parallel. The chance of this issue causing problems in the real world
seems small.

			regards, tom lane
Re: [HACKERS] [PATCHES] New variable server_version_num
David Fetter <[EMAIL PROTECTED]> writes:
> On Sat, Jul 29, 2006 at 09:44:10PM -0400, Tom Lane wrote:
> > The correct solution is for client-side libraries to provide the
> > feature.

> Not if the app is written in SQL, as the bootstrap, regression test,
> etc. code for modules frequently is.

SQL doesn't really have any conditional ability strong enough to deal
with existence or non-existence of features. What are you hoping to do,
a CASE expression? Both arms of the CASE still have to parse, so I
remain unconvinced that there are real world uses.

			regards, tom lane
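On the client-side-library point from earlier in the thread: libpq does already expose this numerically. PQserverVersion() returns the server version packed into one integer (8.1.4 -> 80104), so a C client can gate on a version without parsing the output of version(). A minimal sketch; the connection string and the 8.1 cutoff are illustrative:

#include <stdio.h>
#include <stdlib.h>
#include <libpq-fe.h>

int
main(void)
{
    PGconn *conn = PQconnectdb("dbname=postgres");
    int     ver;

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        PQfinish(conn);
        return EXIT_FAILURE;
    }

    /* PQserverVersion packs major/minor/patch into one int: 8.1.4 -> 80104 */
    ver = PQserverVersion(conn);
    if (ver < 80100)
    {
        fprintf(stderr, "server version %d is too old; need at least 8.1\n",
                ver);
        PQfinish(conn);
        return EXIT_FAILURE;
    }

    printf("server version %d is new enough\n", ver);
    PQfinish(conn);
    return EXIT_SUCCESS;
}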
Re: [HACKERS] Three weeks left until feature freeze
Hi,

Tom Lane wrote:
> Zoltan Boszormenyi <[EMAIL PROTECTED]> writes:
> > I am working on adding a new column constraint, namely
> >
> >   GENERATED [ ALWAYS | BY DEFAULT ] AS
> >     [ IDENTITY ( sequence_options ) | ( expression ) ]
>
> Doesn't this still have the issue that we're taking over spec-defined
> syntax to represent behavior that does not quite match the spec-defined
> semantics? It's not clear to me how closely tied this syntax is to
> NEXT VALUE FOR, but if it's got any of the latter's action-at-a-distance
> subtleties then I think we ought to leave well enough alone till we
> have a solution for that.
>
> 			regards, tom lane

Sorry for not answering earlier, I was on a holiday.

I read again sections 6.13 and 9.21, about NEXT VALUE FOR and generation
of the next value of a sequence generator, respectively. If I see it
right, neither of them requires the other one. The IDENTITY COLUMN
syntax in section 11.4 only mentions section 9.21. Section 14.8, about
the INSERT statement, does not even mention it; it only refers to
"default clause" in section 11.5. And that also doesn't say anything
about either NEXT VALUE FOR or next-value generation of a sequence.

And I saw comments in the PostgreSQL documentation that go like this:
"Standard doesn't specify, so we are conforming." Hint, hint. ;-)

I think the IDENTITY COLUMN (and GENERATED ALWAYS AS) can stand on its
own without NEXT VALUE FOR.

Best regards,
Zoltán Böszörményi
Re: [HACKERS] Going for "all green" buildfarm results
Alvaro Herrera wrote:
> Stefan Kaltenbrunner wrote:
> > Tom Lane wrote:
> > > Stefan Kaltenbrunner <[EMAIL PROTECTED]> writes:
> > > > FWIW: lionfish had a weird make check error 3 weeks ago which I
> > > > (unsuccessfully) tried to reproduce multiple times after that:
> > > > http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=lionfish&dt=2006-05-12%2005:30:14
> > >
> > > Weird.
> > >
> > > SELECT ''::text AS eleven, unique1, unique2, stringu1
> > > FROM onek WHERE unique1 < 50
> > > ORDER BY unique1 DESC LIMIT 20 OFFSET 39;
> > > ! ERROR: could not open relation with OID 27035
> > >
> > > AFAICS, the only way to get that error in HEAD is if ScanPgRelation
> > > can't find a pg_class row with the mentioned OID. Presumably 27035
> > > belongs to "onek" or one of its indexes. The very next command also
> > > refers to "onek", and doesn't fail, so what we seem to have here is
> > > a transient lookup failure. We've found a btree bug like that once
> > > before ... wonder if there's still one left?
> >
> > FYI: lionfish just managed to hit that problem again:
> > http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=lionfish&dt=2006-07-29%2023:30:06
>
> The error message this time is
>
> ! ERROR: could not open relation with OID 27006

yeah, and before it was:

! ERROR: could not open relation with OID 27035

which looks quite related :-)

> It's worth mentioning that the portals_p2 test, which happens in the
> parallel group previous to where this test is run, also accesses the
> onek table successfully. It may be interesting to see exactly what
> relation is 27006.

sorry, but I don't have access to the cluster in question any more
(lionfish is quite resource-starved, and I only enabled keeping failed
builds on -HEAD after the last incident ...)

> The test alter_table, which is on the same parallel group as limit (the
> failing test), contains these lines:
>
> ALTER INDEX onek_unique1 RENAME TO tmp_onek_unique1;
> ALTER INDEX tmp_onek_unique1 RENAME TO onek_unique1;

hmm, interesting - lionfish is a slow box (250MHz MIPS) and particularly
low on memory (48MB + 140MB swap), so it is quite likely that the
parallel regression tests are driving it into swap - maybe some sort of
subtle timing issue?

Stefan