Re: [HACKERS] WIP: Join push-down for foreign tables
Hi Michael,

(2011/09/21 12:52), Michael Paquier wrote:
> I am interested in the development you are doing regarding join push-down and FDW stuff for remote PostgreSQL servers. Is there a way to get the postgres FDW you are providing here for stock 9.1? I saw that the tar you are providing here is adapted only for your patch.

As you say, the pgsql_fdw I posted requires my join-push-down patch. But, at least in the current revision, using PG_VERSION_NUM would make it compilable for both 9.1 and 9.2. I'm not sure, though, that the changes required during the 9.2 development cycle will stay small enough for this workaround. Anyway, I'm going to publish a recent pgsql_fdw for 9.1 on PGXN or elsewhere, though it will need some (hopefully little) time.

Regards,
--
Shigeru Hanada

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Online base backup from the hot-standby
On Wed, Sep 21, 2011 at 2:13 PM, Magnus Hagander <mag...@hagander.net> wrote:
> On Wed, Sep 21, 2011 at 04:50, Fujii Masao <masao.fu...@gmail.com> wrote:
>> 3. Copy the pg_control file from the cluster directory on the standby to the backup as follows:
>>
>> cp $PGDATA/global/pg_control /mnt/server/backupdir/global
>
> But this is done as part of step 2 already. I assume what this really means is that the pg_control file must be the last file backed up?

Yes. When we perform an archive recovery from a backup taken during normal processing, we get the backup end location from the backup-end WAL record written by pg_stop_backup(). But since no WAL writing is allowed during recovery, pg_stop_backup() on the standby cannot write a backup-end WAL record. So, in his patch, instead of a backup-end WAL record, the startup process uses the minimum recovery point recorded in the pg_control file included in the backup as the backup end location.

BTW, the backup end location is used to check whether recovery has reached a consistent state (i.e., end-of-backup). To use the minimum recovery point in pg_control as the backup end location safely, pg_control must be backed up last. Otherwise, a data page with a newer LSN than the minimum recovery point might be included in the backup.

> (Since there are certainly a lot of other ways to do the backup than just cp to a mounted directory..)

Yes. The command I described above is just an example.

>> 4. Execute pg_stop_backup on the standby.
>>
>> The backup taken by the above procedure is available for an archive recovery or standby server. If the standby is promoted during a backup, pg_stop_backup() detects the change of the server status and fails. The data backed up before the promotion is invalid and not available for recovery. Taking a backup from the standby by using pg_basebackup is still not possible. But we can relax that restriction after applying this patch.
> I think that this is going to be very important, particularly given the requirements on pt 3 above. (But yes, it certainly doesn't have to be done as part of this patch, but it really should be the plan to have this included in the same version)

Agreed.

>> To take a base backup during recovery safely, some parameters must be set properly. Hot standby must be enabled on the standby, i.e., wal_level and hot_standby must be enabled on the master and the standby, respectively. FPW (full page writes) is required for a base backup, so full_page_writes must be enabled on the master.
>
> Presumably pg_start_backup() will check this. And we'll somehow track this before pg_stop_backup() as well? (for such evil things as the user changing FPW from on to off and then back to on again during a backup, which will make it look correct both during start and stop, but incorrect in the middle - pg_stop_backup needs to fail in that case as well)

Right. As I suggested upthread, to address that problem, we need to log the change of full_page_writes on the master, and then check whether such a WAL record has been replayed on the standby during the backup. If it has, pg_stop_backup() should emit an error.

Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Re: [HACKERS] Inlining comparators as a performance optimisation
On 21.09.2011 02:53, Peter Geoghegan wrote:
> C stdlib quick-sort time elapsed: 2.092451 seconds
> Inline quick-sort time elapsed: 1.587651 seconds
>
> Does *that* look attractive to you?

Not really, to be honest. That's a 25% speedup in pure qsorting speed. How much of a gain in a real query do you expect to get from that, in the best case? There are so many other sources of overhead that I'm afraid this will be lost in the noise. If you find a query that spends, say, 50% of its time in qsort(), you will only get a 12.5% speedup on that query. And even 50% is really pushing it - I challenge you to find a query that spends any significant amount of time qsorting integers.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] Inlining comparators as a performance optimisation
On Tue, Sep 20, 2011 at 3:51 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> This performance patch differs from most in that it's difficult in principle to imagine a performance regression occurring.
>
> Really? N copies of the same code could lead to performance loss just due to code bloat (ie, less of a query's inner loops fitting in CPU cache). Not to mention the clear regression in maintainability. So I'm disinclined to consider this sort of change without a significantly bigger win than you're suggesting above (no, I don't even consider the -O0 number attractive, let alone what you're finding at -O2).

More copies of the code are somewhat annoying, but it's only 100 lines of code in one module, and we can easily have specific tests for each. The extra code size is minor in comparison to the reams of code we add elsewhere.

It's a surprisingly good win for such a common use case. Well done, Peter.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Inlining comparators as a performance optimisation
On Wed, Sep 21, 2011 at 7:51 AM, Heikki Linnakangas <heikki.linnakan...@enterprisedb.com> wrote:
> On 21.09.2011 02:53, Peter Geoghegan wrote:
>> C stdlib quick-sort time elapsed: 2.092451 seconds
>> Inline quick-sort time elapsed: 1.587651 seconds
>>
>> Does *that* look attractive to you?
>
> Not really, to be honest. That's a 25% speedup in pure qsorting speed. How much of a gain in a real query do you expect to get from that, in the best case? There's so many other sources of overhead that I'm afraid this will be lost in the noise. If you find a query that spends, say, 50% of its time in qsort(), you will only get a 12.5% speedup on that query. And even 50% is really pushing it - I challenge you to find a query that spends any significant amount of time qsorting integers.

How about almost every primary index creation?

I don't really see a reason for the negativity here. By that argument no performance gain is worth it, because all workloads are mixed. This is a marvellous win: a huge gain from a small, isolated and easily tested change. By far the smallest amount of additional code we will have added to sorting, and yet one of the best gains.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Inlining comparators as a performance optimisation
On 21.09.2011 10:01, Simon Riggs wrote:
> On Wed, Sep 21, 2011 at 7:51 AM, Heikki Linnakangas <heikki.linnakan...@enterprisedb.com> wrote:
>> On 21.09.2011 02:53, Peter Geoghegan wrote:
>>> C stdlib quick-sort time elapsed: 2.092451 seconds
>>> Inline quick-sort time elapsed: 1.587651 seconds
>>>
>>> Does *that* look attractive to you?
>>
>> Not really, to be honest. That's a 25% speedup in pure qsorting speed. How much of a gain in a real query do you expect to get from that, in the best case? There's so many other sources of overhead that I'm afraid this will be lost in the noise. If you find a query that spends, say, 50% of its time in qsort(), you will only get a 12.5% speedup on that query. And even 50% is really pushing it - I challenge you to find a query that spends any significant amount of time qsorting integers.
>
> How about almost every primary index creation?

Nope. Swamped by everything else.

Also note that as soon as the sort grows big enough not to fit in maintenance_work_mem, you switch to the external sort algorithm, which doesn't use qsort. Perhaps you could do similar inlining in the heap-sort merge passes done in the external sort, but it's unlikely to be as big a win there.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] WIP: Join push-down for foreign tables
2011/9/21 Shigeru Hanada <shigeru.han...@gmail.com>:
> Hi Michael,
>
> (2011/09/21 12:52), Michael Paquier wrote:
>> I am interested in the development you are doing regarding join push-down and FDW stuff for remote PostgreSQL servers. Is there a way to get the postgres FDW you are providing here for stock 9.1? I saw that the tar you are providing here is adapted only for your patch.
>
> As you say, the pgsql_fdw I posted requires my join-push-down patch. But, at least in the current revision, using PG_VERSION_NUM would make it compilable for both 9.1 and 9.2. But I'm not sure that the changes required during the 9.2 development cycle are small enough for this workaround.

OK, don't worry. I may be able to change that if necessary.

> Anyway, I'm going to publish a recent pgsql_fdw for 9.1 on PGXN or elsewhere, though it needs some (hopefully little) time.

Thanks, that will be helpful. I think this work is very valuable for PostgreSQL and will keep an eye on it. Btw, please don't feel that you have to rush in doing that. I am not in a hurry at all, so of course do it at your own rhythm.

Regards,
--
Michael Paquier
http://michael.otacoo.com
Re: [HACKERS] Range Types - typo + NULL string constructor
On Mon, 2011-09-19 at 18:32 +0200, Florian Pflug wrote:
> No, but the more similar the formats are, the easier it gets to at least factor the hairy parts of such a parser into a common subroutine. Assume that we don't support NULL as an alias for INF. What would then be the result of '[A,NULL)'::textrange?

I think that the range input should *parse* NULL in a similar way, but reject it. So, to make it the range between two definite strings, you'd do '[A,"NULL")'::textrange, which would be equal to textrange('A','NULL','[)'). Without the quotes, it would detect the NULL and give an error. Open to suggestion here, though.

> Presumably, it'd be the same as textrange('A','NULL','[)'). Which I think is a bit surprising, since '{A,NULL}'::text[] produces ARRAY['A',NULL], *NOT* ARRAY['A','NULL'].

> BTW, we currently represent infinity for floating point values as 'Infinity', not 'INF'. Shouldn't we do the same for ranges, i.e. make int4range(0,NULL,'[)')::text return '[0,Infinity)'?

I'm open to that; if you think it's an improvement I'll do it (but we should probably pick one identifiable string and stick with it). What I'd like to avoid is adding to the NULL/infinity confusion.

Regards,
	Jeff Davis
Re: [HACKERS] Range Types - typo + NULL string constructor
On Mon, 2011-09-19 at 12:26 -0400, Robert Haas wrote:
> What I really care about is that we don't talk ourselves into needing a zillion constructor functions. Making things work with a single constructor function seems to me to simplify life quite a bit, and allowing NULL there seems essential for that.

I think we pretty much all agree on that. However, you did see the note about the difficulty of using default parameters in built-in functions, right? I ultimately ended up with 4 constructors, each with the same name but 0, 1, 2, and 3 parameters. Suggestions welcome.

> (I am also vaguely wondering what happens if you have a text range: is (nubile, null) ambiguous?)

There are a few ways to handle that. I would lean toward parsing the NULL as a special keyword, and then rejecting it (does it matter if it's upper case?).

Regards,
	Jeff Davis
Re: [HACKERS] Inlining comparators as a performance optimisation
On 21 September 2011 01:48,  <karave...@mail.bg> wrote:
> All -O2 versions show 42% speedup with the inlined qsort. -O0 showed 25% speedup.

Thanks. Looks like the figures I posted last night were fairly conservative. Does anyone else care to report results?

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
[HACKERS] Shared sequence-like objects in PostgreSQL
Hello all,

I'm writing a C-language function that is similar to nextval() but should return the next member of the recurrent sequence T(n+1) = f(T(n), T(n-1), ..., T(n-k)), where f is some function and k is a constant. The state of this object should persist between database restarts and should be easily recovered if the database crashes.

So the first problem I encountered was where to store the current state of this object (n and the values T(n), T(n-1), ..., T(n-k)). I believe that TopMemoryContext is not shared between processes, therefore I must use the shmem functions from backend/storage/ipc/shmem.c to create a structure in shared memory.

The next issue is how to synchronize backends' reads and writes to this chunk of shared memory. I suppose there must be something to handle this with semaphores in the Postgres code.

Then I periodically need to persist the state of this object to the database, for example every 100 generated values, as well as on the postmaster's shutdown. What is the best method for doing that?

Please let me know if this problem has been solved before. Thanks for your help.
Re: [HACKERS] Online base backup from the hot-standby
On Wed, Sep 21, 2011 at 08:23, Fujii Masao <masao.fu...@gmail.com> wrote:
> On Wed, Sep 21, 2011 at 2:13 PM, Magnus Hagander <mag...@hagander.net> wrote:
>> On Wed, Sep 21, 2011 at 04:50, Fujii Masao <masao.fu...@gmail.com> wrote:
>>> 3. Copy the pg_control file from the cluster directory on the standby to the backup as follows:
>>>
>>> cp $PGDATA/global/pg_control /mnt/server/backupdir/global
>>
>> But this is done as part of step 2 already. I assume what this really means is that the pg_control file must be the last file backed up?
>
> Yes. When we perform an archive recovery from a backup taken during normal processing, we get the backup end location from the backup-end WAL record written by pg_stop_backup(). But since no WAL writing is allowed during recovery, pg_stop_backup() on the standby cannot write a backup-end WAL record. So, in his patch, instead of a backup-end WAL record, the startup process uses the minimum recovery point recorded in the pg_control file included in the backup as the backup end location.
>
> BTW, the backup end location is used to check whether recovery has reached a consistent state (i.e., end-of-backup). To use the minimum recovery point in pg_control as the backup end location safely, pg_control must be backed up last. Otherwise, a data page with a newer LSN than the minimum recovery point might be included in the backup.

Ah, check.

>> (Since there are certainly a lot of other ways to do the backup than just cp to a mounted directory..)
>
> Yes. The command I described above is just an example.

ok.

>>> 4. Execute pg_stop_backup on the standby.
>>>
>>> The backup taken by the above procedure is available for an archive recovery or standby server. If the standby is promoted during a backup, pg_stop_backup() detects the change of the server status and fails. The data backed up before the promotion is invalid and not available for recovery. Taking a backup from the standby by using pg_basebackup is still not possible.
>>> But we can relax that restriction after applying this patch.
>>
>> I think that this is going to be very important, particularly given the requirements on pt 3 above. (But yes, it certainly doesn't have to be done as part of this patch, but it really should be the plan to have this included in the same version)
>
> Agreed.
>
>>> To take a base backup during recovery safely, some parameters must be set properly. Hot standby must be enabled on the standby, i.e., wal_level and hot_standby must be enabled on the master and the standby, respectively. FPW (full page writes) is required for a base backup, so full_page_writes must be enabled on the master.
>>
>> Presumably pg_start_backup() will check this. And we'll somehow track this before pg_stop_backup() as well? (for such evil things as the user changing FPW from on to off and then back to on again during a backup, which will make it look correct both during start and stop, but incorrect in the middle - pg_stop_backup needs to fail in that case as well)
>
> Right. As I suggested upthread, to address that problem, we need to log the change of full_page_writes on the master, and then check whether such a WAL record has been replayed on the standby during the backup. If it has, pg_stop_backup() should emit an error.

I somehow missed this thread completely, so I didn't catch your previous comments - oops, sorry. The important point being that we need to track if/when this happens, even if it has been reset to a valid value. So we can't just check the state of the variable at the beginning and at the end.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] Improve lseek scalability v3
2011/9/19 Matthew Wilcox <matt...@wil.cx>:
> On Mon, Sep 19, 2011 at 08:31:00AM -0400, Stephen Frost wrote:
>> * Benjamin LaHaise (b...@kvack.org) wrote:
>>> For such tables, can't Postgres track the size of the file internally? I'm assuming it's keeping file descriptors open on the tables it manages, in which case when it writes to a file to extend it, the internally stored size could be updated. Not making a syscall at all would scale far better than even a modified lseek() will perform.
>>
>> We'd have to have it in shared memory and have a lock around it, it wouldn't be cheap at all.
>
> Yep, that makes perfect sense. After all, the kernel does basically the same thing to maintain this information; why should we have userspace duplicating the same infrastructure?
>
> I must admit, I'd never heard of this usage of lseek to get the current size of a file before; I'd assumed everybody used fstat. Given this legitimate reason for high-frequency calling of lseek, I withdraw my earlier objection to the patch series.
>
> --
> Matthew Wilcox                Intel Open Source Technology Centre
> "Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step."

I really don't understand the approach here. An improvement is an improvement; do we need a use case to add an improvement to the kernel? We are not talking about adding a new syscall or making an ABI change in this case. So my absolute ack for these patches.

Marco
[HACKERS] [PATCH] POC: inline int4 comparison in tuplesort
This attempts to be as simple as it gets while reducing function call depth, and should be viewed as a proof of concept. It is also untested as of now, but I will try to do that and report back. I'm hoping I followed the rabbit hole correctly and am correctly comparing the right pointers to each other in order to short-circuit the case where we are using the int4 comparison operator.

Peter, if you want to compare stock vs. your patch vs. this patch, we might be able to get some sort of read on where the maintainability vs. performance curve lies. Note that this version should still allow sorting of anything, and simply shifts gears for int4 tuples...

---
 src/backend/utils/sort/tuplesort.c |   23 +++++++++++++++++++++--
 1 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 3505236..ddd5ced 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -2652,6 +2652,22 @@ myFunctionCall2Coll(FmgrInfo *flinfo, Oid collation, Datum arg1, Datum arg2)
 	return result;
 }
 
+static inline
+int int4cmp(Datum first, Datum second)
+{
+	int32		a = DatumGetInt32(first);
+	int32		b = DatumGetInt32(second);
+
+	if (a > b)
+		return 1;
+	else if (a == b)
+		return 0;
+	else
+		return -1;
+}
+
+extern Datum btint4cmp(PG_FUNCTION_ARGS);
+
 /*
  * Apply a sort function (by now converted to fmgr lookup form)
  * and return a 3-way comparison result.  This takes care of handling
@@ -2683,8 +2699,11 @@ inlineApplySortFunction(FmgrInfo *sortFunction, int sk_flags, Oid collation,
 	}
 	else
 	{
-		compare = DatumGetInt32(myFunctionCall2Coll(sortFunction, collation,
-													datum1, datum2));
+		if (sortFunction->fn_addr == btint4cmp)
+			compare = int4cmp(datum1, datum2);
+		else
+			compare = DatumGetInt32(myFunctionCall2Coll(sortFunction, collation,
+														datum1, datum2));
 
 		if (sk_flags & SK_BT_DESC)
 			compare = -compare;
--
1.7.6.3
Re: [HACKERS] Inlining comparators as a performance optimisation
> Recent discussions on the threads "Double sorting split patch" and "CUDA sorting" raised the possibility that there could be significant performance optimisation low-hanging fruit picked by having the executor treat integers and floats as a special case during sorting, avoiding going to the trouble of calling a comparator using the built-in SQL function machinery

Why only for integers and floats - why not for char/varchar? But I believe this can make the code less maintainable, as similar things can be done in other places to avoid the SQL function machinery.

> Once the cache has been warmed, explain analyze very consistently reports a runtime of 123ms for this query on master/HEAD, which varies +/- 1 ms, with a few outliers of maybe +/- 2ms. However, when I apply this patch, that goes down to 107ms +/- 1ms at -O0.

The 123ms time without your change is with which optimization level, -O2 or -O0?
-----Original Message-----
From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Peter Geoghegan
Sent: Tuesday, September 20, 2011 7:26 AM
To: PG Hackers
Subject: [HACKERS] Inlining comparators as a performance optimisation

Recent discussions on the threads "Double sorting split patch" and "CUDA sorting" raised the possibility that there could be significant performance optimisation low-hanging fruit picked by having the executor treat integers and floats as a special case during sorting, avoiding going to the trouble of calling a comparator using the built-in SQL function machinery, and taking advantage of inlining of the comparator, which has been shown to have a considerable performance advantage (at least compared to a general-purpose C stdlib qsort() that takes a function pointer as its comparator, much like tuplesort).

I've hacked together a sloppy POC implementation in a hurry (basically, some code is shifted around), which is attached - I felt that it would be useful to determine whether the community feels that this is a worthwhile undertaking in advance of a business trip that I'm leaving on tomorrow, lasting until Friday, during which I will be mostly unavailable. The patch breaks the Postgres sorting executor node (at least when it quicksorts) for any type other than int4. I apologise for how rough the patch is, but the code itself isn't important right now - the idea is. I anticipate that the value state->datumType or something similar will be set in a near-future revision, so that tuplesort_performsort will know which type-specific optimisation it can use for the type, while falling back on the existing generic qsort_arg + qsort_arg_comparator, so sorting won't actually be broken.

I've been doing some preliminary testing using the dellstore2 sample database. I increased work_mem to '50MB' to ensure that a quicksort will be performed for sorting (otherwise, I'm using the postgresql.conf that initdb gave me). The query is:

explain analyze select * from orderlines order by prod_id;

Once the cache has been warmed, explain analyze very consistently reports a runtime of 123ms for this query on master/HEAD, which varies +/- 1 ms, with a few outliers of maybe +/- 2ms. However, when I apply this patch, that goes down to 107ms +/- 1ms at -O0. I think that that's a pretty good start. Funnily enough, the difference/advantage vanishes at -O2 (I'm guessing that the higher optimisation level of GCC 4.5 hyper-corrects away the inlining, but I don't have time to check that right now).

I imagine the version that I actually submit for patch review will have a macro-based infrastructure for inlining the sorting of various built-in types, initially integers and floats. It will most likely have some other optimisations - I haven't even used a profiler yet. This performance patch differs from most in that it's difficult in principle to imagine a performance regression occurring.

Thoughts?
--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Range Types - typo + NULL string constructor
On Sep21, 2011, at 09:23 , Jeff Davis wrote:
> On Mon, 2011-09-19 at 18:32 +0200, Florian Pflug wrote:
>> No, but the more similar the formats are, the easier it gets to at least factor the hairy parts of such a parser into a common subroutine. Assume that we don't support NULL as an alias for INF. What would then be the result of '[A,NULL)'::textrange?
>
> I think that the range input should *parse* NULL in a similar way, but reject it. So, to make it the range between two definite strings, you'd do '[A,"NULL")'::textrange, which would be equal to textrange('A','NULL','[)'). Without the quotes, it would detect the NULL and give an error. Open to suggestion here, though.

Hm, that seems like a reasonable compromise. As long as range types and arrays agree on the same basic lexical rules regarding quoting and whitespace (i.e. that spaces outside of double quotes are non-significant, that keywords like NULL and INF/Infinity are case-insensitive, ...), I'm happy, I guess.

>> BTW, we currently represent infinity for floating point values as 'Infinity', not 'INF'. Shouldn't we do the same for ranges, i.e. make int4range(0,NULL,'[)')::text return '[0,Infinity)'?
>
> I'm open to that; if you think it's an improvement I'll do it (but we should probably pick one identifiable string and stick with it). What I'd like to avoid is adding to the NULL/infinity confusion.

I've thought about this some more, and came to realize that the question here really is whether floatrange(0, 'Infinity'::float, '[)') and floatrange(0, NULL, '[)') are the same thing or not. If they're not, then obviously using Infinity to represent omitted bounds is going to be very confusing. If they are, then using Infinity seems preferable. Maybe bounds should be restricted to numeric float values (i.e. +/-Infinity and NaN should be rejected), though I don't know if the range type infrastructure supports that. Thoughts?
best regards,
Florian Pflug
Re: [HACKERS] Range Types - typo + NULL string constructor
On Wed, Sep 21, 2011 at 3:29 AM, Jeff Davis <pg...@j-davis.com> wrote:
> On Mon, 2011-09-19 at 12:26 -0400, Robert Haas wrote:
>> What I really care about is that we don't talk ourselves into needing a zillion constructor functions. Making things work with a single constructor function seems to me to simplify life quite a bit, and allowing NULL there seems essential for that.
>
> I think we pretty much all agree on that. However, you did see the note about the difficulty of using default parameters in built-in functions, right? I ultimately ended up with 4 constructors, each with the same name but 0, 1, 2, and 3 parameters. Suggestions welcome.
>
>> (I am also vaguely wondering what happens if you have a text range: is (nubile, null) ambiguous?)
>
> There are a few ways to handle that. I would lean toward parsing the NULL as a special keyword, and then rejecting it (does it matter if it's upper case?).

Boy, that seems really weird to me. If you're going to do it, it ought to be case-insensitive, but I think detecting the case only for the purpose of rejecting it is probably a mistake. I mean, if (nubile, nutty) is OK, then (nubile, null) and (null, nutty) don't really seem like they ought to be any different. Otherwise, anyone who wants to construct these strings programmatically is going to need to escape everything and always write ("cat","dog") or however you do that, and that seems like an unnecessary imposition.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Shared sequence-like objects in PostgreSQL
On Wed, Sep 21, 2011 at 8:19 AM, Vlad Arkhipov <arhi...@dc.baikal.ru> wrote:
> I'm writing a C-language function that is similar to nextval() but should return the next member of the recurrent sequence T(n+1) = f(T(n), T(n-1), ..., T(n-k)), where f is some function and k is a constant. The state of this object should persist between database restarts and should be easily recovered if the database crashes.

The purpose of nextval() is to provide an escape hatch from the normal transactional guarantees, which would otherwise serialize everything using it. Avoiding the performance impact of that is the only reason it needs to use shared memory and so on. If this function isn't performance-critical and doesn't need to be highly concurrent, then you would be better off storing this information in a table and updating it using regular database updates.

The way you've defined it also makes me wonder whether you can afford to skip values. If not, then you don't really get the option of avoiding the serialization. If you can, one shortcut you could consider would be to populate a table with the values of the sequence, and periodically populate more values when you run short of unused ones. Then you can use a regular Postgres sequence to generate indexes into that table. That would not perform quite as well as a shared-memory native implementation like you describe, but wouldn't require nearly as much Postgres-specific C code.

Perhaps if you can explain the problem you're actually trying to solve, it might be clearer whether it justifies working at such a low level.

--
greg
Re: [HACKERS] Range Types - typo + NULL string constructor
On Sep21, 2011, at 14:00 , Robert Haas wrote: On Wed, Sep 21, 2011 at 3:29 AM, Jeff Davis pg...@j-davis.com wrote: On Mon, 2011-09-19 at 12:26 -0400, Robert Haas wrote: (I am also vaguely wondering what happens if you have a text range: is (nubile, null) ambiguous?) There are a few ways to handle that. I would lean toward parsing the NULL as a special keyword, and then rejecting it (does it matter if it's upper case?). Boy, that seems really weird to me. If you're going to do it, it ought to be case-insensitive, but I think detecting the case only for the purpose of rejecting it is probably a mistake. I mean, if (nubile, nutty) is OK, then (nubile, null) and (null, nutty) don't really seem like they ought to be any different. But that's exactly how arrays behave too. '{null,nutty}' is interpreted as ARRAY[NULL,'nutty'] while '{nubile,nutty}' is interpreted as ARRAY['nubile','nutty']. Otherwise, anyone who wants to construct these strings programmatically is going to need to escape everything and always write ("cat","dog") or however you do that, and that seems like an unnecessary imposition. Unless you fully depart from what arrays do, you'll have to do that anyway because leading and trailing spaces aren't considered to be significant in non-quoted elements. In other words, '( cat , dog )' represents textrange('cat', 'dog', '()'), *not* textrange(' cat ', ' dog ', '()'). Also, as long as we need to recognize at least one special value meaning a non-existing bound ('INF' or 'Infinity' or whatever), I don't see a way around the need for quotes in the general case. Well, except making the representation of range(X, NULL, '[)') be '[X)', the one of range(NULL, X, '(]') be '(X]' and the one of range(NULL, NULL, '()') be '()', but I'm not sure that's an improvement. And even if it was, you'd still need to quote X if it contained one of '(', ')', '[', ']' or ','. So most clients would probably still choose to quote unconditionally, instead of detecting whether it was necessary or not. 
best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range Types - typo + NULL string constructor
On Wed, Sep 21, 2011 at 8:41 AM, Florian Pflug f...@phlo.org wrote: Boy, that seems really weird to me. If you're going to do it, it ought to be case-insensitive, but I think detecting the case only for the purpose of rejecting it is probably a mistake. I mean, if (nubile, nutty) is OK, then (nubile, null) and (null, nutty) don't really seem like they ought to be any different. But that's exactly how arrays behave too. '{null,nutty}' is interpreted as ARRAY[NULL,'nutty'] while '{nubile,nutty}' is interpreted as ARRAY['nubile','nutty']. Oh. Well, never mind then. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inlining comparators as a performance optimisation
On Wed, Sep 21, 2011 at 8:08 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: How about almost every primary index creation? Nope. Swamped by everything else. Really? I think it's pretty common for shops to be able to dedicate large amounts of RAM to building initial indexes on data loads or reindex operations. Enough that they can cache the entire table for the short time they're doing the index builds even if they're quite large. Witness the recent pleas to allow maintenance_work_mem on the order of tens of gigabytes. And it's also pretty common that shops can dedicate very large I/O bandwidth, in many cases enough to saturate the memory bandwidth, for doing these kinds of batch operations when they get large enough to need to do an external sort. There's still overhead of reading the pages, the tuples, finding the sort keys in the tuple, etc. But I think the actual qsort or heap operations in tapesort are pretty big portions of the work. This is pretty easy to measure. Just run oprofile or gprof and see what percentage of time for a big index build is spent in qsort. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inlining comparators as a performance optimisation
On Wed, Sep 21, 2011 at 8:47 AM, Greg Stark st...@mit.edu wrote: On Wed, Sep 21, 2011 at 8:08 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: How about almost every primary index creation? Nope. Swamped by everything else. Really? I think it's pretty common for shops to be able to dedicate large amounts of RAM to building initial indexes on data loads or reindex operations. Enough that they can cache the entire table for the short time they're doing the index builds even if they're quite large. Witness the recent pleas to allow maintenance_work_mem on the order of tens of gigabytes. And it's also pretty common that shops can dedicate very large I/O bandwidth, in many cases enough to saturate the memory bandwidth, for doing these kinds of batch operations when they get large enough to need to do an external sort. There's still overhead of reading the pages, the tuples, finding the sort keys in the tuple, etc. But I think the actual qsort or heap operations in tapesort are pretty big portions of the work. This is pretty easy to measure. Just run oprofile or gprof and see what percentage of time for a big index build is spent in qsort. +1 for some actual measurements. I don't think anyone on this thread is saying that if we can get big performance gains from doing this we still shouldn't do it. But at this point it's unclear that we can get a consistent speedup that isn't heavily dependent on the choice of compiler flags (to say nothing of compiler and OS), and even if we can, it's not clear that it will still be noticeable when you measure the run time of an entire query rather than just the speed of qsort(). 
Like Tom and Heikki, I'm a bit skeptical: it wouldn't surprise me to find out that qsort() is 5% of the runtime in any realistic test case and the average qsort() speedup based on tests on a couple different platforms is 10% and so on average we're looking at a 0.5% improvement, in which case it might be more trouble than it's worth, especially if it turns out that there are OS/platform combinations where the inlined version is (for some crazy reason) slower. I've seen performance differences of up to 3% from minor code rearrangements that don't seem like they should matter at all, just because code and data shifts around across cache-line boundaries and the new arrangement is slightly better or worse than the old one. So if the performance improvement turns out to be very small, then validating that it actually IS an improvement in general is likely to be kind of a pain in the ass. On the other hand, the performance improvement might turn out to be large. Maybe there's a test case where, as Heikki suggests, 50% of the time is spent in qsort(). If we can reliably make that 25% faster, I wouldn't dismiss that out of hand; I think that would be pretty good, assuming it didn't require massive amounts of spaghetti code to make it work. I don't see that that would be any more marginal than the sorts of things we've optimized in, say, commit 4fc115b2e981f8c63165ca86a23215380a3fda66, or commit f4d242ef94730c447d87b9840a40b0ec3371fe0f. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inlining comparators as a performance optimisation
On 21 September 2011 07:51, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 21.09.2011 02:53, Peter Geoghegan wrote: C stdlib quick-sort time elapsed: 2.092451 seconds Inline quick-sort time elapsed: 1.587651 seconds Does *that* look attractive to you? Not really, to be honest. That's a 25% speedup in pure qsorting speed. How much of a gain in a real query do you expect to get from that, in the best case? There's so many other sources of overhead that I'm afraid this will be lost in the noise. I'm surprised that you're dismissive of this. After all, we have in the past indulged in micro-optimisation of qsort, or so it would seem from this comment: * We have modified their original by adding a check for already-sorted input, * which seems to be a win per discussions on pgsql-hackers around 2006-03-21. "Makes affected queries radically faster" (in the best case, a speedup somewhat greater than 12.5%) is an unreasonably high standard for a performance optimisation of the executor in general (such a high standard might be sensible if it was due to a particular maintainability downside, but you didn't mention one). Even still, I think that the 12.5% figure is pretty pessimistic - I've already sped up the dell store query by almost that much, and that's with a patch that was, due to circumstances, cobbled together. Not only are we benefiting from the effects of inlining, we're also benefiting from the removal of unnecessary indirection. As Tom said, "In concrete terms, there would be no reason to have tuplesort.c's myFunctionCall2Coll, and maybe not inlineApplySortFunction either, if the datatype-specific comparison functions had APIs that were closer to what sorting wants rather than following the general SQL-callable-function API." 
He was just referring to the benefits of removing indirection here, so ISTM that this is really two performance optimisations rolled into one - it's conceivable that the total performance improvement will even exceed the isolated inlining comparator benchmark. As I've said, I believe this patch can be committed without compromising the maintainability of the tuplesort code to an extent that is not clearly worth it, through the use of a clean, macro-based abstraction. Concerns about bloated binaries are probably not well founded, because what I'm proposing is to a certain extent emulating C++ templates, while using a very common pattern used with C++ templates. In the C++ world, algorithms are often generalised as templates, so that they can be used equally well with any datatype (that supports the interface of the template), while availing of compiler optimisations per template instantiation (instance of using a given type with a given template). I actually got the idea for this patch in part from a book that I read years ago that described the fact that counter-intuitively, std::sort() consistently outperforms qsort(), because the comparator is often inlined, and the compiler can generally avail of optimisations from knowing the comparator at compile-time. On 21 September 2011 13:47, Greg Stark st...@mit.edu wrote: This is pretty easy to measure. Just run oprofile or gprof and see what percentage of time for a big index build is spent in qsort. I'll do so soon. I intend to get to this on Friday evening, and perhaps have a proper patch to show next week. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
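The template-emulation Peter describes can be sketched in C: a macro stamps out a sort specialized for one element type, so the comparison is ordinary inline code in the loop, whereas qsort() must reach its comparator through a function pointer on every call. This is only an illustrative toy (an insertion sort stands in for the macro-generated quicksort of the actual patch):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* The generic path: qsort() calls the comparator through a function
 * pointer on every comparison, which the compiler cannot inline. */
static int
cmp_int(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;
    return (x > y) - (x < y);
}

void
sort_generic(int *v, size_t n)
{
    qsort(v, n, sizeof(int), cmp_int);
}

/* The template-emulating path: the macro generates a sort specialized
 * for one type and comparison, so the comparison compiles to plain
 * inline code inside the loop. */
#define DEFINE_SORT(name, type, lt)                 \
    void name(type *v, size_t n)                    \
    {                                               \
        for (size_t i = 1; i < n; i++)              \
        {                                           \
            type key = v[i];                        \
            size_t j = i;                           \
            while (j > 0 && lt(key, v[j - 1]))      \
            {                                       \
                v[j] = v[j - 1];                    \
                j--;                                \
            }                                       \
            v[j] = key;                             \
        }                                           \
    }

#define INT_LT(a, b) ((a) < (b))
DEFINE_SORT(sort_int_inline, int, INT_LT)
```

The macro plays the role of a C++ template instantiation: one expansion per type/comparator pair, each one fully visible to the optimizer.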
Re: [HACKERS] Inlining comparators as a performance optimisation
On 21.09.2011 17:20, Peter Geoghegan wrote: Even still, I think that the 12.5% figure is pretty pessimistic - I've already sped up the dell store query by almost that much, and that's with a patch that was, due to circumstances, cobbled together. I'm not against making things faster, it's just that I haven't seen solid evidence yet that this will help. Just provide a best-case test case for this that shows a huge improvement, and I'll shut up. If the improvement is only modest, then let's discuss how big it is and whether it's worth the code ugliness this causes. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range Types - typo + NULL string constructor
Florian Pflug f...@phlo.org writes: On Sep21, 2011, at 14:00 , Robert Haas wrote: Otherwise, anyone who wants to construct these strings programmatically is going to need to escape everything and always write ("cat","dog") or however you do that, and that seems like an unnecessary imposition. Unless you fully depart from what arrays do, you'll have to do that anyway because leading and trailing spaces aren't considered to be significant in non-quoted elements. In other words, '( cat , dog )' represents textrange('cat', 'dog', '()'), *not* textrange(' cat ', ' dog ', '()'). Keep in mind that the array I/O behavior is widely considered to suck. When we defined the record I/O behavior, we did not emulate that whitespace weirdness, nor a number of other weirdnesses. I would argue that ranges ought to model their I/O behavior on records not arrays, because that's not as much of a legacy syntax. Also, as long as we need to recognize at least one special value meaning a non-existing bound ('INF' or 'Infinity' or whatever), I don't see a way around the need for quotes in the general case. Right. In the record case, we used an empty string for NULL, and then had to insist on quotes for actual empty strings. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inlining comparators as a performance optimisation
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: On 21.09.2011 17:20, Peter Geoghegan wrote: Even still, I think that the 12.5% figure is pretty pessimistic - I've already sped up the dell store query by almost that much, and that's with a patch that was, due to circumstances, cobbled together. I'm not against making things faster, it's just that I haven't seen solid evidence yet that this will help. Just provide a best-case test case for this that shows a huge improvement, and I'll shut up. If the improvement is only modest, then let's discuss how big it is and whether it's worth the code ugliness this causes. The other question that I'm going to be asking is whether it's not possible to get most of the same improvement with a much smaller code footprint. I continue to suspect that getting rid of the SQL function impedance-match layer (myFunctionCall2Coll etc) would provide most of whatever gain is to be had here, without nearly as large a cost in code size and maintainability, and with the extra benefit that the speedup would also be available to non-core datatypes. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Hot Backup with rsync fails at pg_clog if under load
Hello, * Context * I'm observing problems with provisioning a standby from the master by following the basic, documented Making a Base Backup [1] procedure with rsync if, in the meantime, heavy load is applied on the master. After searching the archives, the most similar issue I found was hit by Daniel Farina in the thread hot backups: am I doing it wrong, or do we have a problem with pg_clog? [2], but, it seems, the issue was discarded because of the non-standard backup procedure Daniel used. However, I'm observing the same error with a simple procedure, hence this message. * Details * Procedure: 1. Start a load generator on the master (WAL archiving enabled). 2. Prepare a Streaming Replication standby (accepting WAL files too): 2.1. pg_switch_xlog() on the master; 2.2. pg_start_backup('backup_under_load') on the master (this will take a while as the master is loaded); 2.3. rsync data/global/pg_control to the standby; 2.4. rsync all other data/ (without pg_xlog) to the standby; 2.5. pg_stop_backup() on the master; 2.6. Wait to receive all WAL files, generated during the backup, on the standby; 2.7. Start the standby PG instance. The last step will usually fail with an error like: 2011-09-21 13:41:05 CEST LOG: database system was interrupted; last known up at 2011-09-21 13:40:50 CEST Restoring 0014.history mv: cannot stat `/opt/PostgreSQL/9.1/archive/0014.history': No such file or directory Restoring 0013.history 2011-09-21 13:41:05 CEST LOG: restored log file 0013.history from archive 2011-09-21 13:41:05 CEST LOG: entering standby mode Restoring 0013000600DC 2011-09-21 13:41:05 CEST LOG: restored log file 0013000600DC from archive Restoring 0013000600DB 2011-09-21 13:41:05 CEST LOG: restored log file 0013000600DB from archive 2011-09-21 13:41:05 CEST FATAL: could not access status of transaction 1188673 2011-09-21 13:41:05 CEST DETAIL: Could not read from file pg_clog/0001 at offset 32768: Success. 
2011-09-21 13:41:05 CEST LOG: startup process (PID 13819) exited with exit code 1 2011-09-21 13:41:05 CEST LOG: aborting startup due to startup process failure The procedure works very reliably if there is little or no load on the master, but fails very often with the pg_clog error when the load generator (a few thousand SELECTs, ~60 INSERTs, ~60 DELETEs and ~60 UPDATEs per second) is started up. I assumed that a file system backup taken between pg_start_backup and pg_stop_backup is guaranteed to be consistent and that the missing pieces will be taken from the WAL files generated and shipped during the backup, but is that really so? Is this procedure missing some steps? Or maybe this is a known issue? Thank you, Linas [1] http://www.postgresql.org/docs/current/static/continuous-archiving.html [2] http://archives.postgresql.org/pgsql-hackers/2011-04/msg01132.php -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inlining comparators as a performance optimisation
On 09/21/2011 10:50 AM, Tom Lane wrote: The other question that I'm going to be asking is whether it's not possible to get most of the same improvement with a much smaller code footprint. I continue to suspect that getting rid of the SQL function impedance-match layer (myFunctionCall2Coll etc) would provide most of whatever gain is to be had here, without nearly as large a cost in code size and maintainability, and with the extra benefit that the speedup would also be available to non-core datatypes. Can we get a patch so we can do benchmarks on this? cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] sequence locking
Hi, I find the current behaviour of locking of sequences rather problematic. Multiple things: - First and foremost, I find it highly dangerous that ALTER SEQUENCE ... is for the biggest part not transactional. I think about the only transactional parts are the name, owner and schema. Sure, it's documented, but ... The cited reasons for wanting that behaviour look a bit bogus to me. Why should concurrency be important when doing an ALTER SEQUENCE? - It's impossible to emulate proper locking yourself, because locking is not allowed for sequences. The first one looks rather hard to solve to me with my passing knowledge of the sequence code, but probably worthy of a TODO entry. The second one looks easier. Any arguments against allowing it again? It seems to have been allowed in prehistoric times. Greetings, Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
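The second complaint is easy to reproduce; on current releases, something like the following is rejected (exact error text aside), leaving no way to take an explicit lock on a sequence:

```sql
CREATE SEQUENCE myseq;

BEGIN;
LOCK TABLE myseq;   -- rejected: explicit LOCK is not allowed on sequences
```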
Re: [HACKERS] Range Types - typo + NULL string constructor
On Sep21, 2011, at 16:41 , Tom Lane wrote: Florian Pflug f...@phlo.org writes: On Sep21, 2011, at 14:00 , Robert Haas wrote: Otherwise, anyone who wants to construct these strings programmatically is going to need to escape everything and always write ("cat","dog") or however you do that, and that seems like an unnecessary imposition. Unless you fully depart from what arrays do, you'll have to do that anyway because leading and trailing spaces aren't considered to be significant in non-quoted elements. In other words, '( cat , dog )' represents textrange('cat', 'dog', '()'), *not* textrange(' cat ', ' dog ', '()'). Keep in mind that the array I/O behavior is widely considered to suck. When we defined the record I/O behavior, we did not emulate that whitespace weirdness, nor a number of other weirdnesses. I would argue that ranges ought to model their I/O behavior on records not arrays, because that's not as much of a legacy syntax. Interesting. Funnily enough, I always assumed it was the other way around. Probably because I don't care much for the empty-unquoted-string-is-NULL behaviour of records. But leaving that personal opinion aside, yeah, in that case we should make range I/O follow record I/O. Also, as long as we need to recognize at least one special value meaning a non-existing bound ('INF' or 'Infinity' or whatever), I don't see a way around the need for quotes in the general case. Right. In the record case, we used an empty string for NULL, and then had to insist on quotes for actual empty strings. Hm, so we'd have '(X,)' for range(X, NULL, '()'), '(,X)' for range(NULL, X, '()') and '(,)' for range(NULL, NULL, '()'). We'd then have the choice of either declaring '(X,]' to mean '(X,)', '[,X)' to mean '(,X)' and '[,]' to mean '(,)' or to forbid the use of '[' and ']' for unspecified bounds. 
(Leaving out the ',' in the case of only one bound as in my reply to Robert's mail somewhere else in this thread doesn't actually work, since it'd be ambiguous whether '(X)' means range(X, NULL, '()') or range(NULL, X, '()').) One nice property is that, apart from the different brackets used, this representation is identical to the one used by records while still avoiding the infinity vs. NULL confusion. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
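A hypothetical client-side formatter for the record-style convention sketched above (an omitted bound prints as an empty unquoted string; quoting is required for a genuinely empty string, significant whitespace, bracket/comma characters, or a literal "null") also shows why many clients would simply quote unconditionally. None of these function names exist anywhere; this is purely an illustration of the rules being discussed:

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Case-insensitive check for the bare word "null". */
static int
is_null_literal(const char *s)
{
    const char *t = "null";

    for (; *s && *t; s++, t++)
        if (tolower((unsigned char) *s) != *t)
            return 0;
    return *s == '\0' && *t == '\0';
}

/* Append one bound to 'out', quoting it (and doubling embedded quotes)
 * whenever it would be ambiguous unquoted.  NULL means an omitted
 * bound, which prints as an empty string. */
static void
append_bound(char *out, const char *bound)
{
    size_t len;

    if (bound == NULL)
        return;
    len = strlen(bound);
    if (len > 0 &&
        !isspace((unsigned char) bound[0]) &&
        !isspace((unsigned char) bound[len - 1]) &&
        strpbrk(bound, "(),[]\"") == NULL &&
        !is_null_literal(bound))
    {
        strcat(out, bound);     /* safe to leave unquoted */
        return;
    }
    strcat(out, "\"");
    for (; *bound; bound++)
    {
        if (*bound == '"')
            strcat(out, "\"");  /* double embedded quotes */
        strncat(out, bound, 1);
    }
    strcat(out, "\"");
}

/* Assemble the textual form; 'buf' must be large enough (no bounds
 * checking -- this is a sketch, not production code). */
char *
format_range(char *buf, const char *lo, const char *hi, const char *bounds)
{
    buf[0] = '\0';
    strncat(buf, bounds, 1);
    append_bound(buf, lo);
    strcat(buf, ",");
    append_bound(buf, hi);
    strncat(buf, bounds + 1, 1);
    return buf;
}
```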
Re: [HACKERS] Inlining comparators as a performance optimisation
Simon Riggs si...@2ndquadrant.com writes: This is a marvellous win, a huge gain from a small, isolated and easily tested change. By far the smallest amount of additional code to sorting we will have added and yet one of the best gains. I think you forgot your cheerleader uniform. A patch along these lines is not going to be small, isolated, easily maintained, nor beneficial for any but a small number of predetermined datatypes. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Is there really no interest in SQL Standard?
Hello Alvaro, On 16.09.2011 15:08, Alvaro Herrera wrote: It's certainly possible to create a private mailing list to support this idea. How the membership would be approved, however, is not clear to me. Would we let only well-known names from other pgsql lists into it? (I, for one, had no idea you were on the SQL committee.) It will be me personally who gets the penalty if something gets outside the group that is supporting me. Members should be people that I can trust. Besides the core team, I trust everyone the core team trusts. And of course there are people outside the core team whom I trust. Susanne -- Susanne Ebrecht - 2ndQuadrant PostgreSQL Development, 24x7 Support, Training and Services www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inlining comparators as a performance optimisation
On 21 September 2011 15:50, Tom Lane t...@sss.pgh.pa.us wrote: Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: I'm not against making things faster, it's just that I haven't seen solid evidence yet that this will help. Just provide a best-case test case for this that shows a huge improvement, and I'll shut up. If the improvement is only modest, then let's discuss how big it is and whether it's worth the code ugliness this causes. Fair enough. The other question that I'm going to be asking is whether it's not possible to get most of the same improvement with a much smaller code footprint. That's a reasonable question, and I hope to be able to come up with a good answer. I continue to suspect that getting rid of the SQL function impedance-match layer (myFunctionCall2Coll etc) would provide most of whatever gain is to be had here, without nearly as large a cost in code size and maintainability, and with the extra benefit that the speedup would also be available to non-core datatypes. I'm fairly surprised that your view on that is mostly or entirely unchanged, even after I've demonstrated a considerable performance advantage from a macro-based qsort implementation over my OS vendor's c std lib qsort(), using an isolated test-case, that does not have anything to do with that impedance mismatch. I'm not sure why you doubt that the same thing is happening within tuplesort. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inlining comparators as a performance optimisation
Andrew Dunstan and...@dunslane.net writes: On 09/21/2011 10:50 AM, Tom Lane wrote: The other question that I'm going to be asking is whether it's not possible to get most of the same improvement with a much smaller code footprint. I continue to suspect that getting rid of the SQL function impedance-match layer (myFunctionCall2Coll etc) would provide most of whatever gain is to be had here, without nearly as large a cost in code size and maintainability, and with the extra benefit that the speedup would also be available to non-core datatypes. Can we get a patch so we can do benchmarks on this? Well, we'd have to negotiate what the API ought to be. What I'm envisioning is that datatypes could provide alternate comparison functions that are designed to be qsort-callable rather than SQL-callable. As such, they could not have entries in pg_proc, so it seems like there's no ready way to represent them in the catalogs. The idea that I was toying with was to allow the regular SQL-callable comparison function to somehow return a function pointer to the alternate comparison function, so that the first comparison in a given sort run would be done the traditional way but then we'd notice the provided function pointer and start using that. It would not be too hard to pass back the pointer using FunctionCallInfoData.context, say. The downside is adding cycles to unoptimized cases to uselessly check for a returned function pointer that's not there. Perhaps it could be hacked so that we only add cycles to the very first call, but I've not looked closely at the code to see what would be involved. Has anyone got a better idea for getting hold of the alternate function? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
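Tom's back-channel idea can be sketched in plain C. Everything below is a mock-up — the struct merely stands in for FunctionCallInfoData, and none of these names exist in PostgreSQL — but it shows the shape of "the first call returns a pointer to a qsort-callable comparator":

```c
#include <assert.h>
#include <stddef.h>

/* A comparator with a plain qsort-callable signature. */
typedef int (*fast_cmp_fn) (const void *, const void *);

static int
int_fast_cmp(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;
    return (x > y) - (x < y);
}

/* Stand-in for FunctionCallInfoData: the callee can stash a pointer to
 * its fast-path comparator in 'context' for the caller to pick up. */
typedef struct
{
    const void *arg1;
    const void *arg2;
    fast_cmp_fn context;        /* hypothetical back-channel */
} fcinfo_t;

/* SQL-callable-style comparator: performs this one comparison the slow
 * way, but also advertises the alternate function through 'context'. */
static int
int_sql_cmp(fcinfo_t *fcinfo)
{
    fcinfo->context = int_fast_cmp;
    return int_fast_cmp(fcinfo->arg1, fcinfo->arg2);
}

/* Caller side: the first comparison goes through the impedance-match
 * layer; if a fast comparator came back, later comparisons would call
 * it directly, so the extra check costs cycles only on the first call. */
fast_cmp_fn
discover_fast_cmp(const int *a, const int *b, int *first_result)
{
    fcinfo_t    fc = {a, b, NULL};

    *first_result = int_sql_cmp(&fc);
    return fc.context;
}
```

A datatype that provides no fast path would simply leave the context field NULL, and the caller would keep using the traditional route.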
Re: [HACKERS] Inlining comparators as a performance optimisation
On 21.09.2011 18:46, Tom Lane wrote: Andrew Dunstan and...@dunslane.net writes: On 09/21/2011 10:50 AM, Tom Lane wrote: The other question that I'm going to be asking is whether it's not possible to get most of the same improvement with a much smaller code footprint. I continue to suspect that getting rid of the SQL function impedance-match layer (myFunctionCall2Coll etc) would provide most of whatever gain is to be had here, without nearly as large a cost in code size and maintainability, and with the extra benefit that the speedup would also be available to non-core datatypes. Can we get a patch so we can do benchmarks on this? Well, we'd have to negotiate what the API ought to be. What I'm envisioning is that datatypes could provide alternate comparison functions that are designed to be qsort-callable rather than SQL-callable. As such, they could not have entries in pg_proc, so it seems like there's no ready way to represent them in the catalogs. The idea that I was toying with was to allow the regular SQL-callable comparison function to somehow return a function pointer to the alternate comparison function, so that the first comparison in a given sort run would be done the traditional way but then we'd notice the provided function pointer and start using that. It would not be too hard to pass back the pointer using FunctionCallInfoData.context, say. The downside is adding cycles to unoptimized cases to uselessly check for a returned function pointer that's not there. Perhaps it could be hacked so that we only add cycles to the very first call, but I've not looked closely at the code to see what would be involved. You could have a new function with a pg_proc entry, that just returns a function pointer to the qsort-callback. Or maybe the interface should be an even more radical replacement of qsort, not just the comparison function. Instead of calling qsort, tuplesort.c would call the new datatype-specific sort-function (which would be in pg_proc). 
The implementation could use an inlined version of qsort, like Peter is suggesting, or it could do something completely different, like a radix sort or a GPU-assisted sort or whatever. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inlining comparators as a performance optimisation
On Wed, Sep 21, 2011 at 4:46 PM, Tom Lane t...@sss.pgh.pa.us wrote: As such, they could not have entries in pg_proc, so it seems like there's no ready way to represent them in the catalogs. Why couldn't they be in pg_proc with a bunch of opaque arguments like the GIST opclass support functions? I'm a bit puzzled what the arguments would look like. They would still need to know the collation, nulls first/last flags, etc. And calling it would still not be inlinable. So they would have to check those flags on each invocation instead of having a piece of straightline code that hard codes the behaviour with the right behaviour inline. ISTM the hope for a speedup from the inlining mostly came from the idea that the compiler might be able to hoist this logic outside the loop (and I suppose implement n specialized loops depending on the behaviour needed). -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inlining comparators as a performance optimisation
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: On 21.09.2011 18:46, Tom Lane wrote: The idea that I was toying with was to allow the regular SQL-callable comparison function to somehow return a function pointer to the alternate comparison function, You could have a new function with a pg_proc entry, that just returns a function pointer to the qsort-callback. Yeah, possibly. That would be a much more invasive change, but cleaner in some sense. I'm not really prepared to do all the legwork involved in that just to get to a performance-testable patch though. Or maybe the interface should be an even more radical replacement of qsort, not just the comparison function. Instead of calling qsort, tuplesort.c would call the new datatype-specific sort-function (which would be in pg_proc). The implementation could use an inlined version of qsort, like Peter is suggesting, or it could do something completely different, like a radix sort or a GPU-assisted sort or whatever. No. In the first place, that only helps for in-memory sorts. In the second, it would absolutely destroy our ability to change the behavior of sorting ever again. Considering that we've added ASC/DESC, NULLS FIRST/LAST, and collation support over the years, are you really prepared to bet that the sort code will never need any more feature upgrades? (This concern is in fact the source of my beef with the whole inlining proposal to begin with, but allowing the inlining to occur into third-party code that we don't control at all would be a hundred times worse.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inlining comparators as a performance optimisation
On 21.09.2011 18:46, Tom Lane wrote: Well, we'd have to negotiate what the API ought to be. What I'm envisioning is that datatypes could provide alternate comparison functions that are designed to be qsort-callable rather than SQL-callable. As such, they could not have entries in pg_proc, so it seems like there's no ready way to represent them in the catalogs. Quite aside from this qsort-thing, it would be nice to have versions of all simple functions that could be called without the FunctionCall overhead. So instead of: FunctionCall2(flinfo_for_int4pl, 1, 2) you could do simply int4pl_fastpath(1,2) I'm not sure how big an effect this would have, but it seems like it could shave some cycles across the system. We could have an extended version of the PG_FUNCTION_INFO_V1 macro that would let you register the fastpath function: PG_FUNCTION_INFO_V1(int4pl, int4pl_fastpath); -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] DECLARE CURSOR must not contain data-modifying statements in WITH
Hi all, What's the reason for disallowing cursors on wCTEs? I am not sure I can follow the comment: /* * We also disallow data-modifying WITH in a cursor. (This could be * allowed, but the semantics of when the updates occur might be * surprising.) */ if (result->hasModifyingCTE) ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), errmsg("DECLARE CURSOR must not contain data-modifying statements in WITH"))); Given that cursors are about the only sensible way to return larger amounts of data, that behaviour reduces the usefulness of wCTEs a bit. What's the exact cause of concern here? I personally don't think there is a problem documenting that you should fetch the cursor fully before relying on the updated tables to be in a sensible state. But that may be just me. Thanks, Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inlining comparators as a performance optimisation
Greg Stark st...@mit.edu writes: On Wed, Sep 21, 2011 at 4:46 PM, Tom Lane t...@sss.pgh.pa.us wrote: As such, they could not have entries in pg_proc, so it seems like there's no ready way to represent them in the catalogs. Why couldn't they be in pg_proc with a bunch of opaque arguments like the GIST opclass support functions? That does not mean the same thing at all. Everything in pg_proc is meant to be called through the V0 or V1 function call info protocols. I'm a bit puzzled what the arguments would look like. They would still need to know the collation, nulls first/last flags, etc. No, I wasn't thinking that we should do that. The datatype comparison functions should have the exact same semantics they do now, just a lower-overhead call mechanism. If you try to push stuff like NULLS FIRST/LAST into the per-datatype code, then you are up against a problem when you want to add a new flag: you have to touch lots of code, not all of which you even control. And calling it would still not be inlinable. So they would have to check those flags on each invocation instead of having a piece of straight-line code that hard-codes the right behaviour inline. ISTM the hope for a speedup from the inlining mostly came from the idea that the compiler might be able to hoist this logic outside the loop (and I suppose implement n specialized loops depending on the behaviour needed). None of that stuff is inlinable or constant-foldable today, nor would it be with the patch that Peter was proposing AFAICS, because none of the flags will ever be compile-time constant values. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
On 21-09-2011 11:44, Linas Virbalas wrote: [This question doesn't belong to -hackers. Please post it in -general or -admin] Procedure: 1. Start load generator on the master (WAL archiving enabled). 2. Prepare a Streaming Replication standby (accepting WAL files too): 2.1. pg_switch_xlog() on the master; You don't need this. 2.2. pg_start_backup('backup_under_load') on the master (this will take a while as master is loaded up); No. If you use pg_start_backup('foo', true) it will be fast. Check the manual. 2.3. rsync data/global/pg_control to the standby; Why are you doing this? If ... 2.4. rsync all other data/ (without pg_xlog) to the standby; you will copy it again or no? Don't understand your point. 2.5. pg_stop_backup() on the master; 2.6. Wait to receive all WAL files, generated during the backup, on the standby; 2.7. Start the standby PG instance. The last step will, usually, fail with a similar error: The problem could be that the minimum recovery point (step 2.3) is different from the end of rsync if you are under load. -- Euler Taveira de Oliveira - Timbira http://www.timbira.com.br/ PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inlining comparators as a performance optimisation
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: On 21.09.2011 18:46, Tom Lane wrote: Well, we'd have to negotiate what the API ought to be. What I'm envisioning is that datatypes could provide alternate comparison functions that are designed to be qsort-callable rather than SQL-callable. As such, they could not have entries in pg_proc, so it seems like there's no ready way to represent them in the catalogs. Quite aside from this qsort-thing, it would be nice to have versions of all simple functions that could be called without the FunctionCall overhead. Hmm, that's an interesting idea. I think probably the important aspects are (1) known number of arguments and (2) no null argument or result values are allowed. Not sure what we'd do with collations though. We could have an extended version of the PG_FUNCTION_INFO_V1 macro that would let you register the fastpath function: PG_FUNCTION_INFO_V1(int4pl, int4pl_fastpath); We don't use PG_FUNCTION_INFO_V1 for built-in functions ... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] unaccent contrib
Hi guys! I know PostgreSQL 9.x includes the unaccent contrib module in its delivery package. unaccent is compatible with PostgreSQL 8.4 (but is not in that version's contrib distribution). What's the best way to set up the unaccent module on a PostgreSQL 8.4 production server? Copy contrib/unaccent from 9.x to the 8.4 installation and build it there? Can we do that? Other ideas? Could the Global Development Group make some cumulative RPM for contrib modules that are backward compatible? thx!
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
On Sep 21, 2011, at 16:44 , Linas Virbalas wrote: After searching the archives, the most similar issue I found was hit by Daniel Farina in a thread hot backups: am I doing it wrong, or do we have a problem with pg_clog? [2], but, it seems, the issue was discarded because of a non-standard backup procedure Daniel used. That's not the way I read that thread. In fact, Robert Haas confirmed that Daniel's backup procedure was sound in theory. The open question was whether the error occurred because of a bug in Daniel's backup code or PostgreSQL's restore code. The thread then petered out without that question being answered. Procedure: 1. Start load generator on the master (WAL archiving enabled). 2. Prepare a Streaming Replication standby (accepting WAL files too): 2.1. pg_switch_xlog() on the master; 2.2. pg_start_backup('backup_under_load') on the master (this will take a while as master is loaded up); 2.3. rsync data/global/pg_control to the standby; 2.4. rsync all other data/ (without pg_xlog) to the standby; 2.5. pg_stop_backup() on the master; 2.6. Wait to receive all WAL files, generated during the backup, on the standby; 2.7. Start the standby PG instance. Looks good. (2.1) and (2.3) seem redundant (as Euler already noticed), but shouldn't cause any errors. Could you provide us with the exact rsync version and parameters you use? 
The last step will, usually, fail with a similar error:

2011-09-21 13:41:05 CEST LOG: database system was interrupted; last known up at 2011-09-21 13:40:50 CEST
Restoring 0014.history
mv: cannot stat `/opt/PostgreSQL/9.1/archive/0014.history': No such file or directory
Restoring 0013.history
2011-09-21 13:41:05 CEST LOG: restored log file 0013.history from archive
2011-09-21 13:41:05 CEST LOG: entering standby mode
Restoring 0013000600DC
2011-09-21 13:41:05 CEST LOG: restored log file 0013000600DC from archive
Restoring 0013000600DB
2011-09-21 13:41:05 CEST LOG: restored log file 0013000600DB from archive
2011-09-21 13:41:05 CEST FATAL: could not access status of transaction 1188673
2011-09-21 13:41:05 CEST DETAIL: Could not read from file pg_clog/0001 at offset 32768: Success.

What's the size of the file (pg_clog/0001) on both the master and the slave? best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inlining comparators as a performance optimisation
On Wed, Sep 21, 2011 at 5:23 PM, Tom Lane t...@sss.pgh.pa.us wrote: None of that stuff is inlinable or constant-foldable today, nor would it be with the patch that Peter was proposing AFAICS, because none of the flags will ever be compile-time constant values. I was referring to transformations like this which I believe compilers are already capable of doing: v = ...; while (...) { if (v) { if (a < b) ... else ...; } else { if (a < b) ... else ...; } } turning it into code that looks like: if (v) { while (...) if (a < b) ... else ...; } else { while (...) if (a < b) ... else ...; } which may not look like much -- especially with branch prediction -- but then it's much more likely to be able to unroll the loop and do clever instruction scheduling and so on than if there's an extra branch in the middle of the loop. But if there's a function call to an external function called through a function pointer in the middle of the loop then the whole endeavour would be for naught. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Is there really no interest in SQL Standard?
Excerpts from Peter Eisentraut's message of mié sep 21 00:27:53 -0300 2011: On tis, 2011-09-20 at 11:12 -0300, Alvaro Herrera wrote: +1 for a closed mailing list. It's a bit annoying to have to do such a thing, but it's not like we haven't got other closed lists for appropriate purposes. Well, that much we've already decided a few years ago. The blocking issues are: (1) do we have enough interest, and (2) where to put it (I'm looking at you, pgfoundry). I don't see why we wouldn't put it in @postgresql.org. One nice thing about pgfoundry would be the document manager. Also, at least at some point in the past, a pgfoundry project was easier to manage than getting anything done about a @postgresql.org mailing list. The document manager might be useful, true. I cannot speak about past administrators of the Majordomo installation that serves the @postgresql.org lists, though. For all intents and purposes, it seems I'm in charge of it now. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Is there really no interest in SQL Standard?
On Wed, Sep 21, 2011 at 5:49 PM, Alvaro Herrera alvhe...@commandprompt.com wrote: Excerpts from Peter Eisentraut's message of mié sep 21 00:27:53 -0300 2011: On tis, 2011-09-20 at 11:12 -0300, Alvaro Herrera wrote: +1 for a closed mailing list. It's a bit annoying to have to do such a thing, but it's not like we haven't got other closed lists for appropriate purposes. Well, that much we've already decided a few years ago. The blocking issues are: (1) do we have enough interest, and (2) where to put it (I'm looking at you, pgfoundry). I don't see why we wouldn't put it in @postgresql.org. One nice thing about pgfoundry would be the document manager. Also, at least at some point in the past, a pgfoundry project was easier to manage than getting anything done about a @postgresql.org mailing list. The document manager might be useful, true. I cannot speak about past administrators of the Majordomo installation that serves the @postgresql.org lists, though. For all intents and purposes, it seems I'm in charge of it now. Only seems? :-) -- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] sequence locking
Andres Freund and...@anarazel.de wrote: - Its impossible to emulate proper locking yourself because locking is not allowed for sequences Any arguments against allowing it again? It seems to have been allowed in prehistoric times. It would be nice to allow it. I've had to create a dummy table just to use for locking a sequence (by convention). -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Remastering using streaming only replication?
Fujii, I haven't really been following your latest patches about taking backups from the standby and cascading replication, but I wanted to see if it fulfills another TODO: the ability to remaster (that is, designate the lead standby as the new master) without needing to copy WAL files. Supporting remastering using steaming replication only was on your TODO list when we closed 9.1. It seems like this would get solved as a side-effect, but I wanted to confirm that. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] sequence locking
On Wed, Sep 21, 2011 at 11:51 AM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Andres Freund and...@anarazel.de wrote: - Its impossible to emulate proper locking yourself because locking is not allowed for sequences Any arguments against allowing it again? It seems to have been allowed in prehistoric times. It would be nice to allow it. I've had to create a dummy table just to use for locking a sequence (by convention). another (better?) way is advisory locks... merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unite recovery.conf and postgresql.conf
Robert, Josh is arguing that we ought to use the term replication, but it Actually, no. I'm arguing that we should use the term standby, since that term is consistent with how we refer to replica servers throughout the docs, and the term recovery is not. seems to me that's just as misleading - maybe moreso, since recovery is sufficiently a term of art to make you at least think about reading the manual, whereas you know (or think you know) what replication is. Nope. What it means is that users see stuff relating to recovery and say oh, that's not right, the replication stuff must be somewhere else. I've taught a half-dozen classes on PostgreSQL binary replication now, and the recovery nomenclature *always* confuses students. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unite recovery.conf and postgresql.conf
On Wed, Sep 21, 2011 at 12:55 PM, Josh Berkus j...@agliodbs.com wrote: Josh is arguing that we ought to use the term replication, but it Actually, no. I'm arguing that we should use the term standby, since that term is consistent with how we refer to replica servers throughout the docs, and the term recovery is not. seems to me that's just as misleading - maybe moreso, since recovery is sufficiently a term of art to make you at least think about reading the manual, whereas you know (or think you know) what replication is. Nope. What it means is that users see stuff relating to recovery and say oh, that's not right, the replication stuff must be somewhere else. I've taught a half-dozen classes on PostgreSQL binary replication now, and the recovery nomenclature *always* confuses students. Yeah, I get it. But I think standby would confuse them, too, just in a different set of situations. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range Types - typo + NULL string constructor
On Wed, 2011-09-21 at 17:20 +0200, Florian Pflug wrote: Hm, so we'd have '(X,)' for range(X, NULL, '()'), '(,X)' for range(NULL, X, '()') and '(,)' for range(NULL, NULL, '()'). We'd then have the choice of either declaring '(X,]' to mean '(X,)', '[,X)' to mean '(,X)' and '[,]' to mean '(,)' or to forbid the use of '[' and ']' for unspecified bounds. Right now, I just canonicalize it to round brackets if infinite. Seems pointless to reject it, but I can if someone thinks it's better. (Leaving out the ',' in the case of only one bound as in my reply to Robert's mail somewhere else in this thread doesn't actually work, since it'd be ambiguous whether '(X)' means range(X, NULL, '()') or range(NULL, X, '()').) One nice property is that, apart from the different brackets used, this representation is identical to the one used by records while still avoiding the infinity vs. NULL confusion. OK, I like that. Slightly strange to require quoting empty strings, but not stranger than the alternatives. While we're at it, any suggestions on the text representation of an empty range? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range Types - typo + NULL string constructor
On Wed, 2011-09-21 at 13:24 +0200, Florian Pflug wrote: I've thought about this some more, and came to realize that the question here really is whether floatrange(0, 'Infinity'::float, '[)') and floatrange(0, NULL, '[)') are the same thing or not. The unbounded side of a range is never equal to a value in the data type's domain, so no, it's not the same. I think that we pretty much settled on just using an empty string for infinity in the other thread, right? So that makes this a non-issue. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] sequence locking
Merlin Moncure mmonc...@gmail.com wrote: On Wed, Sep 21, 2011 at 11:51 AM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Andres Freund and...@anarazel.de wrote: - Its impossible to emulate proper locking yourself because locking is not allowed for sequences Any arguments against allowing it again? It seems to have been allowed in prehistoric times. It would be nice to allow it. I've had to create a dummy table just to use for locking a sequence (by convention). another (better?) way is advisory locks... Not under 9.0 or earlier if you want the lock to last until the end of the transaction. Also, the fact that advisory locks are only on numbers, without any mechanism for mapping those to character strings, makes them poorly suited to many tasks. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unite recovery.conf and postgresql.conf
Yeah, I get it. But I think standby would confuse them, too, just in a different set of situations. Other than PITR, what situations? PITR is used by a minority of our users. Binary replication, if not already used by a majority, will be in the future (it's certainly the majority of my professional clients). Further, PITR is usually something which is either handled by vendor backup management software, or by professional DBAs, whereas replication is used by developers with little or no DBA support. Why should we make terminology obscure for the majority use case to make it clear for the minority one? Especially since the majority use case has almost all the newbies? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unite recovery.conf and postgresql.conf
On Wed, Sep 21, 2011 at 1:03 PM, Josh Berkus j...@agliodbs.com wrote: Yeah, I get it. But I think standby would confuse them, too, just in a different set of situations. Other than PITR, what situations? Hot backup? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unite recovery.conf and postgresql.conf
On 9/21/11 10:07 AM, Robert Haas wrote: On Wed, Sep 21, 2011 at 1:03 PM, Josh Berkus j...@agliodbs.com wrote: Yeah, I get it. But I think standby would confuse them, too, just in a different set of situations. Other than PITR, what situations? Hot backup? Hot backup == PITR. You're just not bothering to accumulate WAL logs. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PERFORM] Constraint exclusion on UNION ALL subqueries with WHERE conditions
Gunnlaugur Þór Briem gunnlau...@gmail.com writes: On Monday, September 19, 2011 3:59:30 AM UTC, Tom Lane wrote: Works for me in 8.4.8. Do you have constraint_exclusion set to ON? I did try with constraint_exclusion set to on, though the docs suggest partition should be enough (examine constraints only for ... UNION ALL subqueries) Here's a minimal test case (which I should have supplied in the original post, sorry), tried just now in 8.4.8: CREATE OR REPLACE VIEW v_heavy_view AS SELECT (random()*1e5)::integer col FROM generate_series(1, 1e6::integer); CREATE OR REPLACE VIEW v_test_constraint_exclusion AS SELECT col FROM v_heavy_view WHERE col < 3 UNION ALL SELECT col FROM v_heavy_view WHERE col >= 3; EXPLAIN SELECT * FROM v_test_constraint_exclusion WHERE col=2; Hmm. The reason this particular case doesn't work is that we don't apply relation_excluded_by_constraints() to functions-in-FROM. It's only used for plain-table RTEs, not subqueries, functions, etc. I suspect the complainant's real case involved an unflattenable subquery. Probably the rationale for that coding was that only plain tables could have CHECK constraints; but the portion of the logic that looks for mutually contradictory scan constraints could apply to non-table relations. Should we change the code to make such checks in these cases? The default behavior (with constraint_exclusion = partition) would still be to do nothing extra, but it would add planning expense when constraint_exclusion = on. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unite recovery.conf and postgresql.conf
On Wed, Sep 21, 2011 at 1:08 PM, Josh Berkus j...@agliodbs.com wrote: On 9/21/11 10:07 AM, Robert Haas wrote: On Wed, Sep 21, 2011 at 1:03 PM, Josh Berkus j...@agliodbs.com wrote: Yeah, I get it. But I think standby would confuse them, too, just in a different set of situations. Other than PITR, what situations? Hot backup? Hot backup == PITR. You're just not bothering to accumulate WAL logs. Well, I don't think of it that way, but YMMV, of course. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] sequence locking
On Wednesday 21 Sep 2011 19:03:17 Kevin Grittner wrote: Merlin Moncure mmonc...@gmail.com wrote: On Wed, Sep 21, 2011 at 11:51 AM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Andres Freund and...@anarazel.de wrote: - Its impossible to emulate proper locking yourself because locking is not allowed for sequences Any arguments against allowing it again? It seems to have been allowed in prehistoric times. It would be nice to allow it. I've had to create a dummy table just to use for locking a sequence (by convention). another (better?) way is advisory locks... Not under 9.0 or earlier if you want the lock to last until the end of the transaction. Also, the fact that advisory locks are only on numbers, without any mechanism for mapping those to character strings, makes them poorly suited to many tasks. The usual trick is to lock on the oid of some database object. But I agree, its a poor workaround for this specific problem. Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
On Wed, Sep 21, 2011 at 12:22 PM, Euler Taveira de Oliveira eu...@timbira.com wrote: [This question doesn't belong to -hackers. Please post it in -general or -admin] -hackers or -bugs seems appropriate to me; I think this is a bug. 2.2. pg_start_backup('backup_under_load') on the master (this will take a while as master is loaded up); No. If you use pg_start_backup('foo', true) it will be fast. Check the manual. If the server is sufficiently heavily loaded that a checkpoint takes a nontrivial amount of time, the OP is correct that this will not be fast, regardless of whether you choose to force an immediate checkpoint. 2.3. rsync data/global/pg_control to the standby; Why are you doing this? If ... 2.4. rsync all other data/ (without pg_xlog) to the standby; you will copy it again or no? Don't understand your point. His point is that exercising the bug depends on doing the copying in a certain order. Any order of copying the data theoretically ought to be OK, as long as it's all between starting the backup and stopping the backup, but apparently it isn't. The problem could be that the minimum recovery point (step 2.3) is different from the end of rsync if you are under load. It seems pretty clear that some relevant chunk of WAL isn't getting replayed, but it's not at all clear to me why not. It seems like it would be useful to compare the LSN returned by pg_start_backup() with the location at which replay begins when you fire up the clone. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] sequence locking
Kevin Grittner kevin.gritt...@wicourts.gov writes: Andres Freund and...@anarazel.de wrote: - Its impossible to emulate proper locking yourself because locking is not allowed for sequences Any arguments against allowing it again? It seems to have been allowed in prehistoric times. If you think that it used to be allowed, it'd be a good idea to see if you can find the archived discussions about changing it. It would be nice to allow it. I've had to create a dummy table just to use for locking a sequence (by convention). One question is what you think the lock means. I believe for example that taking a non-exclusive regular table lock on a sequence would not prevent other sessions from doing nextval(); even an exclusive one would not prevent them from doing so if they had pre-cached values. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unite recovery.conf and postgresql.conf
On Wed, Sep 21, 2011 at 1:13 PM, Robert Haas robertmh...@gmail.com wrote: On Wed, Sep 21, 2011 at 1:08 PM, Josh Berkus j...@agliodbs.com wrote: On 9/21/11 10:07 AM, Robert Haas wrote: On Wed, Sep 21, 2011 at 1:03 PM, Josh Berkus j...@agliodbs.com wrote: Yeah, I get it. But I think standby would confuse them, too, just in a different set of situations. Other than PITR, what situations? Hot backup? Hot backup == PITR. You're just not bothering to accumulate WAL logs. Well, I don't think of it that way, but YMMV, of course. I think that the major differentiating factor is the intended action when caught up, and the definition of caught up, and trying to use a single term for both of them is always going to cause confusion. So I tend to think of the use cases by their continuation. A slave is intended to continually keep trying to get more once it's retrieved and applied all the changes it can. It can be hot or cold, streaming or archive, etc. And recovery is intended to stop recovering and become normal once it's finished retrieving and applying all the changes it can. Again, it has multiple ways to retrieve its WAL too. And I think Tom touched on this point in the recovery.conf/recovery.done thread a bit too. Maybe we need to really start talking about the different when done, do ... distinctions, and use that distinction to help our nomenclature. Both recovery and slave (whether hot or cold) use the same retrieve/apply machinery (and thus configuration options), but because of the different caught-up action, they are different features. a. -- Aidan Van Dyk Create like a god, ai...@highrise.ca command like a king, http://www.highrise.ca/ work like a slave. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unite recovery.conf and postgresql.conf
On Wed, Sep 21, 2011 at 1:34 PM, Aidan Van Dyk ai...@highrise.ca wrote: And I think Tom touched on this point in the recovery.conf/recovery.done thread a bit too. Doh! That's this thread /me slinks away, ashamed for not even taking a close look at the to/cc list... -- Aidan Van Dyk Create like a god, ai...@highrise.ca command like a king, http://www.highrise.ca/ work like a slave. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] sequence locking
On Wednesday 21 Sep 2011 19:24:55 Tom Lane wrote: Kevin Grittner kevin.gritt...@wicourts.gov writes: Andres Freund and...@anarazel.de wrote: - Its impossible to emulate proper locking yourself because locking is not allowed for sequences Any arguments against allowing it again? It seems to have been allowed in prehistoric times. If you think that it used to be allowed, it'd be a good idea to see if you can find the archived discussions about changing it. The message I was thinking about was http://archives.postgresql.org/pgsql-hackers/2001-10/msg00930.php It would be nice to allow it. I've had to create a dummy table just to use for locking a sequence (by convention). One question is what you think the lock means. I believe for example that taking a non-exclusive regular table lock on a sequence would not prevent other sessions from doing nextval(); even an exclusive one would not prevent them from doing so if they had pre-cached values. I don't see what a non-exclusive lock on a sequence should sensibly do, so I don't see a problem with not supporting them. That already-cached values are not affected by the lock seems pretty logical to me - and not really problematic. At least in my cases I would look at last_value from the sequence after locking it, which includes the cached values, so it's fine that they can be used. The case that somebody has already acquired a sequence value that is not visible to other sessions has to be taken into account anyway. Greetings, Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unaccent contrib
On 21-09-2011 13:28, Daniel Vázquez wrote: unaccent is compatible with postgresql 8.4 (but it is not in their contrib distribution) No, it is not. AFAICS it is necessary to add some backend code that is not in 8.4. -- Euler Taveira de Oliveira - Timbira http://www.timbira.com.br/ PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] HeapTupleSatisfiesToast not setting XMIN_COMMITTED?
Hi, I notice that HeapTupleSatisfiesToast is not setting the HEAP_XMIN_COMMITTED bit, though it is reading it. Is there a reason for this? It seems to me that it'd make sense to have it set ... unless it's set by some other routine, somehow? -- Álvaro Herrera alvhe...@alvh.no-ip.org -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] sequence locking
Andres Freund and...@anarazel.de wrote: On Wednesday 21 Sep 2011 19:24:55 Tom Lane wrote: One question is what you think the lock means. I believe for example that taking a non-exclusive regular table lock on a sequence would not prevent other sessions from doing nextval(); even an exclusive one would not prevent them from doing so if they had pre-cached values. I don't see what a non-exclusive lock on a sequence should sensibly do, so I don't see a problem with not supporting them. That already-cached values are not affected by the lock seems pretty logical to me - and not really problematic. At least in my cases I would look at last_value from the sequence after locking it - which includes the cached values, so it's fine that they can be used. The case that somebody has already acquired a sequence value that is not visible to other sessions has to be taken into account anyway. I think all of that holds for us, as well. Our only real use for this (so far, anyway) is in our trigger-based replication -- a deferred AFTER INSERT trigger assigns a strictly monotonically increasing commit number which must match the order of commit. I don't see how getting an exclusive lock on the sequence itself could introduce any bugs which we wouldn't have using a dummy table created only to serve as a lock target. Given that I can't think of any other uses for this feature, I guess it would be pretty low on my list of priorities. As I said earlier, it would be nice. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] sequence locking
On Wed, Sep 21, 2011 at 12:03 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Merlin Moncure mmonc...@gmail.com wrote: On Wed, Sep 21, 2011 at 11:51 AM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Andres Freund and...@anarazel.de wrote: - Its impossible to emulate proper locking yourself because locking is not allowed for sequences Any arguments against allowing it again? It seems to have been allowed in prehistoric times. It would be nice to allow it. I've had to create a dummy table just to use for locking a sequence (by convention). another (better?) way is advisory locks... Not under 9.0 or earlier if you want the lock to last until the end of the transaction. Also, the fact that advisory locks are only on numbers, without any mechanism for mapping those to character strings, makes them poorly suited to many tasks. hm, this was one of the things I used advisory locks for -- in fact, not having to hold the lock for the duration of the transaction was useful for cases of sequence locking. Basically, you steal the sequence oid for a lock id and wrap nextval() with an advisory sharelock. Then, if you need to do some type of heavy duty operation, like reserve a contiguous block of identifiers, you can full lock the same lock and block everyone. If the locks were full transaction locks, that would essentially serialize all transactions that sharelocked the sequence...no bueno. So, considering all that, what are the actual use cases for proper locking of sequence locks (outside of the metadata stuff you can lock now)? merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
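Merlin's advisory-lock scheme maps onto classic reader/writer locking: every nextval() caller takes a share lock keyed by the sequence's OID, and an operation that must reserve a contiguous block of identifiers takes the same lock exclusively, blocking everyone. As a rough sketch of that idea only - the lock implementation and every name below are illustrative stand-ins, not PostgreSQL code or the pg_advisory_lock API:

```c
#include <stddef.h>

/* A minimal reader/writer spinlock built on GCC __sync builtins, standing
 * in for pg_advisory_lock_shared()/pg_advisory_lock() keyed by the
 * sequence's OID.  All names here are illustrative, not PostgreSQL APIs. */
typedef struct
{
    volatile int readers;   /* number of share-lockers inside */
    volatile int writer;    /* 1 while an exclusive locker holds it */
} rwlock_t;

static rwlock_t seq_lock;   /* one lock per sequence in the real scheme */
static long seq_value = 0;

static void share_lock(rwlock_t *l)
{
    for (;;)
    {
        while (l->writer)
            ;               /* wait out any exclusive holder */
        __sync_fetch_and_add(&l->readers, 1);
        if (!l->writer)
            break;          /* no writer slipped in: share lock held */
        __sync_fetch_and_sub(&l->readers, 1);
    }
}

static void share_unlock(rwlock_t *l)
{
    __sync_fetch_and_sub(&l->readers, 1);
}

static void exclusive_lock(rwlock_t *l)
{
    while (__sync_lock_test_and_set(&l->writer, 1))
        ;                   /* one writer at a time */
    while (l->readers > 0)
        ;                   /* wait for share-lockers to drain */
}

static void exclusive_unlock(rwlock_t *l)
{
    __sync_lock_release(&l->writer);
}

/* Ordinary nextval(): share lock, so concurrent callers do not block
 * each other; released immediately, not held to end of transaction. */
static long next_val(void)
{
    share_lock(&seq_lock);
    long v = __sync_add_and_fetch(&seq_value, 1);
    share_unlock(&seq_lock);
    return v;
}

/* Reserve a contiguous block of n identifiers: the exclusive lock
 * blocks all nextval() callers for the duration. */
static long reserve_block(int n)
{
    exclusive_lock(&seq_lock);
    long start = seq_value + 1;
    seq_value += n;
    exclusive_unlock(&seq_lock);
    return start;
}
```

This is exactly why the short-lived advisory locks fit Merlin's case: the share lock is taken and dropped around each nextval() call, whereas transaction-duration locks would serialize every transaction that touched the sequence.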
Re: [HACKERS] unaccent contrib
... ok. No alternatives for unaccent on 8.4? 2011/9/21 Euler Taveira de Oliveira eu...@timbira.com On 21-09-2011 13:28, Daniel Vázquez wrote: unaccent is compatible with postgresql 8.4 (but it is not in their contrib distribution) No, it is not. AFAICS it is necessary to add some backend code that is not in 8.4. -- Euler Taveira de Oliveira - Timbira http://www.timbira.com.br/ PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers -- Daniel Vázquez SICONET (A Bull Group Company) Torre Agbar. Avda. Diagonal, 211 - planta 23 08018 - Barcelona telf: + 34 93 2272727 (Ext. 2952) fax: + 34 93 2272728 www.bull.es - www.siconet.es daniel.vazq...@bull.es
Re: [HACKERS] memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)
On Thu, Sep 15, 2011 at 11:57 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: s/atomic/barrier/ Oops. +/* + * A compiler barrier need not (and preferably should not) emit any actual + * machine code, but must act as an optimization fence: the compiler must not + * reorder loads or stores to main memory around the barrier. However, the + * CPU may still reorder loads or stores at runtime, if the architecture's + * memory model permits this. + * + * A memory barrier must act as a compiler barrier, and in addition must + * guarantee that all loads and stores issued prior to the barrier are + * completed before any loads or stores issued after the barrier. Unless + * loads and stores are totally ordered (which is not the case on most + * architectures) this requires issuing some sort of memory fencing + * instruction. This seems like a strange way to explain the problem. I would suggest structuring those paragraphs along the lines of: A PostgreSQL memory barrier guarantees that any loads/stores before the barrier are completely finished and visible to other CPUs, before the loads/stores after the barrier are performed. That involves two things: 1. We must stop the compiler from rearranging loads/stores across the barrier. 2. We must stop the CPU from reordering the loads/stores across the barrier. That doesn't seem much different than I wrote? One thing to keep in mind about whatever language we use here is that memory barrier instructions need not (and often do not) compel the CPU to completely finish anything before doing the next thing. Execution is heavily pipelined, and on a sequence like STORE/WRITE BARRIER/STORE the CPU is perfectly well entitled to begin the second store before the first one is done. It just can't let the second STORE get far enough for other CPUs to see it. Do we have any use for compiler barriers that are not also memory barriers? 
If not, I would suggest not exposing the pg_compiler_barrier() macro, but keep that as an implementation detail in the implementations of pg_memory_barrier(). I think there might be some use for a pure compiler barrier, but I'm not sure yet. It's probably not worth having a fallback implementation involving a spinlock, though, because odds are good that any code that is performance-critical enough to be attempting to safely use a compiler barrier can't survive having a spinlock acquire-and-release shoved in there, so any such code should probably be #ifdef'd. Some examples on the correct usage of these barriers would be nice, too. That's a big topic. A trivial example is that if you write: a[*x] = i; ++*x; ...where x and i are pointers into shared memory, another backend might loop over a from 0 to *x - 1 and accidentally read off the end of the array. But even a full explanation of that case seems like almost too much for the comment of a header file, and there are certainly far more complex cases. I think anyone who is using these primitives is going to have to do some independent reading... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
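The hazard in that trivial example can be sketched concretely. The macro names below mirror the pg_compiler_barrier()/pg_memory_barrier() names under discussion, but the GCC-style definitions and the queue structure are my own illustrative assumptions, not the patch's actual code:

```c
#include <stddef.h>

/* Hypothetical stand-ins for the barrier macros being discussed; these
 * GCC-style definitions are an assumption, not the patch's own. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")
#define memory_barrier()   __sync_synchronize()

#define QUEUE_MAX 8

static int queue[QUEUE_MAX];            /* the shared array 'a' from the text */
static volatile size_t queue_len = 0;   /* the shared counter '*x' */

/* Publish one element.  Without the barrier, the compiler or the CPU is
 * free to make the increment of queue_len visible before the element
 * store, and a concurrent reader that trusts queue_len would then read
 * a slot that has not been written yet. */
static void publish(int v)
{
    queue[queue_len] = v;
    memory_barrier();   /* element store completes before the length store */
    queue_len = queue_len + 1;
}

/* A reader looping over queue[0 .. queue_len-1] would pair this with a
 * read barrier between loading queue_len and loading the elements. */
```

A single-threaded run cannot exhibit the reordering, of course; the point of the barrier is only visible to a second CPU, which is exactly what makes these bugs hard to reproduce.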
Re: [HACKERS] memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)
Robert Haas robertmh...@gmail.com wrote: But even a full explanation of that case seems like almost too much for the comment of a header file, and there are certainly far more complex cases. I think anyone who is using these primitives is going to have to do some independent reading... Maybe a URL or two in the header comments, pointing to relevant papers for the techniques used? After all, years from now someone might be familiar with other techniques from newer papers and wonder what the techniques in the code are based on. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] HeapTupleSatisfiesToast not setting XMIN_COMMITTED?
On Wed, Sep 21, 2011 at 1:03 PM, Alvaro Herrera alvhe...@alvh.no-ip.org wrote: Hi, I notice that HeapTupleSatisfiesToast is not setting the HEAP_XMIN_COMMITTED bit, though it is reading it. Is there a reason for this? It seems to me that it'd make sense to have it set ... unless it's set by some other routine, somehow? I think it's implied from the untoasted row. Notice that it's not checking visibility either except in binary upgrade cases. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)
On Wed, Sep 21, 2011 at 2:48 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Robert Haas robertmh...@gmail.com wrote: But even a full explanation of that case seems like almost too much for the comment of a header file, and there are certainly far more complex cases. I think anyone who is using these primitives is going to have to do some independent reading... Maybe a URL or two in the header comments, pointing to relevant papers for the techniques used? After all, years from now someone might be familiar with other techniques from newer papers and wonder what the techniques in the code are based on. If there are any academic papers on this topic, I haven't found them. Mostly, I've found lots of articles written by people who were coding for the Linux kernel, and a lot of those articles are extremely Linux-specific (you could use the smp_rmb() macro here, but it's better to instead use this RCU-related macro, because it'll do it for you, blah blah). I'm happy to link to any sources anyone thinks we ought to link to, but I've mostly had to piece this together bit by bit from blog posts and (sometimes buggy) examples. It hasn't been a particularly thrilling exercise. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] HeapTupleSatisfiesToast not setting XMIN_COMMITTED?
Excerpts from Merlin Moncure's message of mié sep 21 16:02:34 -0300 2011: On Wed, Sep 21, 2011 at 1:03 PM, Alvaro Herrera alvhe...@alvh.no-ip.org wrote: Hi, I notice that HeapTupleSatisfiesToast is not setting the HEAP_XMIN_COMMITTED bit, though it is reading it. Is there a reason for this? It seems to me that it'd make sense to have it set ... unless it's set by some other routine, somehow? I think it's implied from the untoasted row. Notice that it's not checking visibility either except in binary upgrade cases. Yeah, so toast visibility is highly dependent on the referencing row. Which makes sense. But then, if the XMIN_COMMITTED bit is not set, you're forced to check the other bits every time. I guess the tradeoff is that if you set it, the page is dirtied and you're forced to write it out, which is even worse. More interesting, however, is the fact that the XMAX_COMMITTED bit is never set either. I guess the rows are deleted by a different mechanism (tuptoaster probably) -- it isn't obvious how this works just by looking at tqual.c. It seems to do nothing at all. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inlining comparators as a performance optimisation
On Wed, Sep 21, 2011 at 4:22 PM, Tom Lane t...@sss.pgh.pa.us wrote: Simon Riggs si...@2ndquadrant.com writes: This is a marvellous win, a huge gain from a small, isolated and easily tested change. By far the smallest amount of additional code to sorting we will have added and yet one of the best gains. I think you forgot your cheerleader uniform. LOL. I'm happy whoever and whenever we get large wins like that. Go Postgres! A patch along these lines is not going to be small, isolated, easily maintained, nor beneficial for any but a small number of predetermined datatypes. That was the starting premise. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] HeapTupleSatisfiesToast not setting XMIN_COMMITTED?
On Wed, Sep 21, 2011 at 2:14 PM, Alvaro Herrera alvhe...@commandprompt.com wrote: Excerpts from Merlin Moncure's message of mié sep 21 16:02:34 -0300 2011: On Wed, Sep 21, 2011 at 1:03 PM, Alvaro Herrera alvhe...@alvh.no-ip.org wrote: Hi, I notice that HeapTupleSatisfiesToast is not setting the HEAP_XMIN_COMMITTED bit, though it is reading it. Is there a reason for this? It seems to me that it'd make sense to have it set ... unless it's set by some other routine, somehow? I think it's implied from the untoasted row. Notice that it's not checking visibility either except in binary upgrade cases. Yeah, so toast visibility is highly dependant to the referencing row. Which makes sense. But then, if the XMIN_COMMITTED bit is not set, you're forced to check the other bits every time. I guess the tradeoff is that if you set it, the page is dirtied and you're forced to write it down, which is even worse. yeah -- there's no way it's worth the i/o in that case, since there's no visibility check to protect yourself from. More interesting, however, is the fact that the XMAX_COMMITTED bit is never set either. I guess the rows are deleted by a different mechanism (tuptoaster probably) -- it isn't obvious how this works just by looking at tqual.c. It seems to do nothing at all. yup -- probably the only reason it exists at all is vacuum issues which all visibility routines have to handle. otherwise, it's a fancy 'return true'. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: SP-GiST, Space-Partitioned GiST
On 05.09.2011 09:39, Oleg Bartunov wrote: I attached wrong patch in previous message, sorry ! Here is a last version. One little detail caught my eye: In spgSplitNodeAction, you call SpGistGetBuffer() within a critical section. That should be avoided, SpGistGetBuffer() can easily fail if you e.g run out of disk space. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)
Robert Haas robertmh...@gmail.com wrote: On Wed, Sep 21, 2011 at 2:48 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Robert Haas robertmh...@gmail.com wrote: But even a full explanation of that case seems like almost too much for the comment of a header file, and there are certainly far more complex cases. I think anyone who is using these primitives is going to have to do some independent reading... Maybe a URL or two in the header comments, pointing to relevant papers for the techniques used? After all, years from now someone might be familiar with other techniques from newer papers and wonder what the techniques in the code are based on. If there are any academic papers on this topic, I haven't found them. Mostly, I've found lots of articles written by people who were coding for the Linux kernel, and a lot of those articles are extremely Linux-specific (you could use the smp_rmb() macro here, but it's better to instead use this RCU-related macro, because it'll do it for you, blah blah). I'm happy to link to any sources anyone thinks we ought to link to, but I've mostly had to piece this together bit by bit from blog posts and (sometimes buggy) examples. It hasn't been a particularly thrilling exercise. Well, if it really is that hard to piece together the relevant techniques, it seems cruel not to check in the results of your efforts to work it out somewhere. Perhaps a README file? On the other hand, a search turned up these two papers (which I haven't yet read, but will): http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.97.2397&rep=rep1&type=pdf http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.153.6657&rep=rep1&type=pdf On a quick scan, they both look promising in themselves, and reference a lot of other promising-sounding papers. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unaccent contrib
On 21-09-2011 15:23, Daniel Vázquez wrote: No alternatives for unaccent on 8.4? Not that I know of. -- Euler Taveira de Oliveira - Timbira http://www.timbira.com.br/ PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] HeapTupleSatisfiesToast not setting XMIN_COMMITTED?
Alvaro Herrera alvhe...@alvh.no-ip.org writes: I notice that HeapTupleSatisfiesToast is not setting the HEAP_XMIN_COMMITTED bit, though it is reading it. Is there a reason for this? It seems to me that it'd make sense to have it set ... unless it's set by some other routine, somehow? Hmm ... I wonder if that has something to do with the performance complaints we sometimes get about TOAST. VACUUM would presumably set the bit, but in a never-vacuumed toast table this could be a problem. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] HeapTupleSatisfiesToast not setting XMIN_COMMITTED?
Alvaro Herrera alvhe...@commandprompt.com writes: More interesting, however, is the fact that the XMAX_COMMITTED bit is never set either. I guess the rows are deleted by a different mechanism (tuptoaster probably) -- it isn't obvious how this works just by looking at tqual.c. It seems to do nothing at all. I have some vague recollection that the only reason we do any xmin/xmax checking at all for a TOAST tuple is to support tuple motion caused by old-style VACUUM FULL. Jan might remember better. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)
On Wed, Sep 14, 2011 at 4:29 PM, Robert Haas robertmh...@gmail.com wrote: On Mon, Aug 8, 2011 at 7:47 AM, Robert Haas robertmh...@gmail.com wrote: I've been thinking about this too and actually went so far as to do some research and put together something that I hope covers most of the interesting cases. The attached patch is pretty much entirely untested, but reflects my present belief about how things ought to work. And, here's an updated version, with some of the more obviously broken things fixed. You declare the dummy_spinlock variable as extern and use it, but it is not defined anywhere. Wouldn't that be a linker error? -- Gurjeet Singh EnterpriseDB Corporation The Enterprise PostgreSQL Company
Re: [HACKERS] memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)
On Wed, Sep 21, 2011 at 4:19 PM, Gurjeet Singh singh.gurj...@gmail.com wrote: On Wed, Sep 14, 2011 at 4:29 PM, Robert Haas robertmh...@gmail.com wrote: On Mon, Aug 8, 2011 at 7:47 AM, Robert Haas robertmh...@gmail.com wrote: I've been thinking about this too and actually went so far as to do some research and put together something that I hope covers most of the interesting cases. The attached patch is pretty much entirely untested, but reflects my present belief about how things ought to work. And, here's an updated version, with some of the more obviously broken things fixed. You declare the dummy_spinlock variable as extern and use it, but it is not defined anywhere. Wouldn't that be a linker error? Yeah, we need to add that somewhere, maybe s_lock.c. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
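For what it's worth, the fix being discussed might look like the sketch below: the extern declaration in the header needs exactly one matching definition in some .c file, and the fallback memory barrier is just an acquire/release pair on that dummy lock. The spinlock here is a GCC-builtin stand-in, not PostgreSQL's slock_t/S_LOCK from s_lock.h:

```c
#include <stddef.h>

/* Stand-in spinlock type and primitives; PostgreSQL's real ones are
 * slock_t with S_LOCK()/S_UNLOCK() in s_lock.h. */
typedef volatile int slock_t;

static void spin_lock(slock_t *lock)
{
    while (__sync_lock_test_and_set(lock, 1))
        ;                   /* spin until the lock is free */
}

static void spin_unlock(slock_t *lock)
{
    __sync_lock_release(lock);
}

/* The 'extern slock_t dummy_spinlock;' declaration in the header needs
 * exactly one definition like this in a .c file, or linking fails. */
slock_t dummy_spinlock = 0;

/* Fallback pg_memory_barrier(): acquiring and releasing a spinlock acts
 * as a full fence on any platform where spinlocks work at all, at the
 * cost of being far slower than a real fence instruction. */
static void fallback_memory_barrier(void)
{
    spin_lock(&dummy_spinlock);
    spin_unlock(&dummy_spinlock);
}
```

The cost Robert mentions upthread is visible here: code hot enough to want a bare compiler barrier probably cannot afford this acquire-and-release pair, hence the suggestion to #ifdef such code instead.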
Re: [HACKERS] unaccent contrib
On Wed, 2011-09-21 at 18:28 +0200, Daniel Vázquez wrote: Can Global Development Group, make some acumulative rpm for contrib modules that are backward compatible??? No (as the RPM maintainer). -- Devrim GÜNDÜZ Principal Systems Engineer @ EnterpriseDB: http://www.enterprisedb.com PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer Community: devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr http://www.gunduz.org Twitter: http://twitter.com/devrimgunduz
Re: [HACKERS] unaccent contrib
Euler Taveira de Oliveira eu...@timbira.com writes: On 21-09-2011 13:28, Daniel Vázquez wrote: unaccent is compatible with postgresql 8.4 (but not is in their contrib version distribution) No, it is not. AFAICS it is necessary to add some backend code that is not in 8.4. [ pokes at it ] Yeah, you are right. The version of unaccent that is in our source tree is a filtering dictionary, and therefore cannot possibly work with backends older than 9.0 (when the filtering dictionary feature was added). So I'm wondering where the OP read that it was compatible with 8.4. Our own documentation about it certainly does not say that. It's possible that Oleg and Teodor had some prototype version, different from what got committed to our tree, that would work in 8.4. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)
On Wed, Sep 21, 2011 at 3:39 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Robert Haas robertmh...@gmail.com wrote: On Wed, Sep 21, 2011 at 2:48 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Robert Haas robertmh...@gmail.com wrote: But even a full explanation of that case seems like almost too much for the comment of a header file, and there are certainly far more complex cases. I think anyone who is using these primitives is going to have to do some independent reading... Maybe a URL or two in the header comments, pointing to relevant papers for the techniques used? After all, years from now someone might be familiar with other techniques from newer papers and wonder what the techniques in the code are based on. If there are any academic papers on this topic, I haven't found them. Mostly, I've found lots of articles written by people who were coding for the Linux kernel, and a lot of those articles are extremely Linux-specific (you could use the smb_rb() macro here, but it's better to instead use this RCU-related macro, because it'll do it for you, blah blah). I'm happy to link to any sources anyone thinks we ought to link to, but I've mostly had to piece this together bit by bit from blog posts and (sometimes buggy) examples. It hasn't been a particularly thrilling exercise. Well, if it really is that hard to piece together the relevant techniques, it seems cruel not to check in the results of your efforts to work it out somewhere. Perhaps a README file? I don't know if techniques is the right word. I mean, there are three questions that you might want to answer here: 1. I have a new architecture and I want barrier.h to support it. What do I need to do? 2. What is the behavior of the various constructs provided by barrier.h? 3. Why would I want that behavior and how can I use it to do cool stuff? I intended the comment in that file to be enough to answer questions #1 and #2. 
What you and Heikki are asking about is really #3, and that seems to me to be setting the bar awfully high. I mean, lwlock.c explains what a lightweight lock does, but it doesn't explain all of the ways that a lightweight lock can be used to solve performance problems, nor should it. That would be recapitulating the documentation that is hopefully present everywhere that LWLocks are used as well as speculating about future applications. It just doesn't seem sensible to me to try to enumerate all the ways that a fundamental primitive can potentially be used down the line. What I found hard about memory barriers is basically this: I didn't understand that the CPU is allowed to perform operations out of order. And I couldn't understand what the consequences of that fact were. I sort of understood - but hadn't really internalized - the idea that execution is highly pipelined, so the single moment at which an execution is performed is not well defined. Before I really got my head around it, I had to read the explanations of what a memory barrier actually does over and over again. I probably read ten different explanations saying the same thing in different words about twenty times a piece, and slowly the light dawned. I did my best to explain that in the existing comment; I'm happy to expand the comment if people have suggestions for what to put in there; but basically I think this is a hard concept and if you haven't done this stuff before it's going to take a while to get up to speed. On the other hand, a search turned up these two papers (which I haven't yet read, but will): http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.97.2397&rep=rep1&type=pdf http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.153.6657&rep=rep1&type=pdf On a quick scan, they both look promising in themselves, and reference a lot of other promising-sounding papers.
I'm not sure these are much help for learning how to program with memory barriers, but if somebody really wants them (or something else) included, I'm not going to fight too hard. I don't expect this to be perfect the first time through; I am just trying to get a basic infrastructure in place. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)
Robert Haas robertmh...@gmail.com wrote: there are three questions that you might want to answer here: 1. I have a new architecture and I want barrier.h to support it. What do I need to do? 2. What is the behavior of the various constructs provided by barrier.h? 3. Why would I want that behavior and how can I use it to do cool stuff? I intended the comment in that file to be enough to answer questions #1 and #2. What you and Heikki are asking about is really #3, and that seems to me to be setting the bar awfully high. OK, put that way, I mostly agree. A general overview of the known usage patterns in a header or README file still doesn't seem out of line to me. With LW locks, for example, I've only seen two patterns used in PostgreSQL: (1) Get a shared lock for reads and an exclusive lock for writes, or (2) get a shared lock to read or write your own data, but an exclusive lock to read anyone else's data. In both cases, there must be a defined order of lock acquisition to avoid deadlock, with a strong recommendation that locks be released in the reverse order. Mentioning that much doesn't preclude other uses of LW locks, but might help people new to the code. That's the level of summary *I* would like to see included. What I found hard about memory barriers is basically this: I didn't understand that the CPU is allowed to perform operations out of order. And I couldn't understand what the consequences of that fact were. I sort of understood - but hadn't really internalized - the idea that execution is highly pipelined, so the single moment at which an execution is performed is not well defined. Before I really got my head around it, I had to read the explanations of what a memory barrier actually does over and over again. I probably read ten different explanations saying the same thing in different words about twenty times a piece, and slowly the light dawned. 
I did my best to explain that in the existing comment; I'm happy to expand the comment if people have suggestions for what to put in there; but basically I think this is a hard concept and if you haven't done this stuff before it's going to take a while to get up to speed. That's the sort of thing where it would be helpful to provide one or two URLs for cogent explanations of this. Even if it takes repeated readings and meditations on the explanations for it to sink in, this is worth it. (For SSI I had to read the paper many times, and then go read several referenced papers, before I really had my head around it, and I've had others say the same thing. But having a link to the material gives someone a chance to *do* that.) -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
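Kevin's summary of the LWLock usage patterns - a defined acquisition order to avoid deadlock, with release in the reverse order - can be illustrated with a toy example. The locks and all names below are illustrative stand-ins (GCC-builtin spinlocks), not PostgreSQL's LWLock API:

```c
#include <stdint.h>

/* Two locks standing in for LWLocks.  The deadlock-avoidance rule from
 * the text: every caller acquires in one fixed global order (here, by
 * address) and releases in reverse.  Names are illustrative only. */
typedef volatile int lwlock_t;

static lwlock_t lock_a = 0;
static lwlock_t lock_b = 0;

static int counter_a = 0;
static int counter_b = 0;

static void lw_acquire(lwlock_t *l)
{
    while (__sync_lock_test_and_set(l, 1))
        ;                   /* spin until the lock is free */
}

static void lw_release(lwlock_t *l)
{
    __sync_lock_release(l);
}

/* Touches both protected counters, so it must hold both locks.  Because
 * every caller sorts the locks the same way before acquiring, two
 * concurrent callers can never each hold one lock while waiting for the
 * other - the classic deadlock. */
static void transfer(int amount)
{
    lwlock_t *first = &lock_a;
    lwlock_t *second = &lock_b;

    if ((uintptr_t) first > (uintptr_t) second)
    {
        lwlock_t *tmp = first;
        first = second;
        second = tmp;
    }

    lw_acquire(first);
    lw_acquire(second);

    counter_a -= amount;
    counter_b += amount;

    lw_release(second);     /* release in reverse order of acquisition */
    lw_release(first);
}
```

The same discipline works for any number of locks: sort the set you need by some global key, acquire in that order, release in reverse.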
Re: [HACKERS] EXPLAIN and nfiltered, take two
Robert Haas robertmh...@gmail.com writes: On Tue, Sep 20, 2011 at 8:19 PM, Tom Lane t...@sss.pgh.pa.us wrote: I find it useless and probably confusing to put out Rows Removed by Recheck Cond: 0 unless we're dealing with a lossy index. I don't really see the point of this. I think printing it always is both clear and appropriate; it would be even nicer if we also had a line for Rows Rechecked. I think you'll be singing a different tune as soon as you find out that every single indexscan in your plans has a generally-useless Rows Removed line added to it. The patch as I have it now is doing this:

regression=# explain analyze select * from tenk1 where unique1 = 42;
                                                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 Index Scan using tenk1_unique1 on tenk1  (cost=0.00..8.27 rows=1 width=244) (actual time=0.137..0.153 rows=1 loops=1)
   Index Cond: (unique1 = 42)
   Rows Removed by Index Cond: 0
 Total runtime: 0.635 ms
(4 rows)

and personally I object to a 50% increase in the vertical space occupied by a plan item in order to tell me not a damn thing. For an ordinary btree indexscan, that line WILL NOT EVER be nonzero. I do not own monitors large enough to accept that sort of overhead without complaint. Now the entry *is* useful when considering a lossy indexscan. I just don't want it there otherwise. (Note that I'm not proposing suppressing such entries from the machine-readable formats, just text format.) I think we are getting to the point where EXPLAIN is complex enough that it should really have its own chapter in the documentation. Yeah, the lack of documentation in the submitted patch is another big problem with it, but it's not especially relevant to arguing about what the behavior should be. I'll see about documentation after I'm done with the code. Also, upthread it was argued that we shouldn't measure the effects of joinqual filtering. I don't buy this for a minute, ... Yep, I agree. We should measure everything we possibly can.
> I don't have a clear idea how this ought to be represented, but leaving
> it out doesn't seem like the right answer.

After playing with it for a while, I'm inclined to just put out the measured count and let the user interpret it. Here are some examples using table tenk1 from the regression database, with most of the planner's enable_ flags turned off to force it to produce the same shape of plan for each case:

regression=# explain analyze select * from tenk1 a join tenk1 b on a.thousand = b.hundred;
                                QUERY PLAN
---
 Merge Join  (cost=0.00..3172.38 rows=10 width=488) (actual time=0.449..1678.464 rows=10 loops=1)
   Merge Cond: (b.hundred = a.thousand)
   ->  Index Scan using tenk1_hundred on tenk1 b  (cost=0.00..1702.16 rows=1 width=244) (actual time=0.228..117.637 rows=1 loops=1)
   ->  Materialize  (cost=0.00..1727.20 rows=1 width=244) (actual time=0.154..431.251 rows=11 loops=1)
         ->  Index Scan using tenk1_thous_tenthous on tenk1 a  (cost=0.00..1702.20 rows=1 width=244) (actual time=0.115..12.033 rows=1001 loops=1)
 Total runtime: 2057.743 ms
(6 rows)

regression=# explain analyze select * from tenk1 a join tenk1 b on a.thousand = b.hundred and a.four = b.ten;
                                QUERY PLAN
---
 Merge Join  (cost=0.00..3422.38 rows=1 width=488) (actual time=0.657..1314.394 rows=2 loops=1)
   Merge Cond: (b.hundred = a.thousand)
   Join Filter: (a.four = b.ten)
   Rows Removed by Join Filter: 8
   ->  Index Scan using tenk1_hundred on tenk1 b  (cost=0.00..1702.16 rows=1 width=244) (actual time=0.395..110.506 rows=1 loops=1)
   ->  Materialize  (cost=0.00..1727.20 rows=1 width=244) (actual time=0.165..390.867 rows=11 loops=1)
         ->  Index Scan using tenk1_thous_tenthous on tenk1 a  (cost=0.00..1702.20 rows=1 width=244) (actual time=0.121..11.680 rows=1001 loops=1)
 Total runtime: 1386.446 ms
(8 rows)

OK, that's pretty straightforward, what about a left join?

regression=# explain analyze select * from tenk1 a left join tenk1 b on a.thousand = b.hundred;
                                QUERY PLAN
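[Editor's illustration, not part of the original thread: a hedged sketch of the lossy case Tom describes, where a removed-rows counter *is* informative. The exact work_mem threshold needed to make the bitmap lossy depends on the build and data.]

```sql
-- Sketch only: with work_mem this small, the bitmap for a scan returning
-- many rows degrades to page granularity ("lossy"), so every tuple on a
-- matching page must be rechecked against the original condition, and a
-- "Rows Removed by Recheck Cond" count becomes meaningful rather than
-- being the constant zero of an ordinary btree indexscan.
SET work_mem = '64kB';
SET enable_seqscan = off;
EXPLAIN ANALYZE SELECT * FROM tenk1 WHERE hundred < 50;
-- Expect a Bitmap Heap Scan with a Recheck Cond line over a
-- Bitmap Index Scan; only in this lossy case can the recheck
-- actually remove rows.
```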
Re: [HACKERS] Double sorting split patch
I've processed the results of the tests with double sorting split which I've scheduled for buffering build. I've updated the wiki page with them: http://wiki.postgresql.org/wiki/Fast_GiST_index_build_GSoC_2011#Testing_results

Raw results of the query speed measures are in the attachment. There, the number of page accesses is presented in dependency on table, buffering build, and split method. Note that the tests were run with a version of the fast build patch that is not the latest (gist_fast_build-heikki-0.14.1.1.patch was used). Therefore, build time with buffering can be better.

-- With best regards, Alexander Korotkov.

 dataset |      build method      |  split method  |                predicate                  | page accesses
---------+------------------------+----------------+-------------------------------------------+---------------
 uniform | buffering (1/366/on)   | new linear     | point @ '(0.505,0.505),(0.5,0.5)'::box    |           492
 uniform | buffering (1/366/on)   | new linear     | point @ '(0.501,0.501),(0.5,0.5)'::box    |           125
 uniform | buffering (1/366/on)   | new linear     | point @ '(0.51,0.51),(0.5,0.5)'::box      |           876
 uniform | buffering (1/366/on)   | new linear     | point @ '(0.21,0.21),(0.2,0.2)'::box      |          1453
 uniform | buffering (1/366/on)   | new linear     | point @ '(0.103,0.903),(0.1,0.9)'::box    |           394
 uniform | buffering (1/366/auto) | new linear     | point @ '(0.505,0.505),(0.5,0.5)'::box    |           333
 uniform | buffering (1/366/auto) | new linear     | point @ '(0.501,0.501),(0.5,0.5)'::box    |           134
 uniform | buffering (1/366/auto) | new linear     | point @ '(0.51,0.51),(0.5,0.5)'::box      |           570
 uniform | buffering (1/366/auto) | new linear     | point @ '(0.21,0.21),(0.2,0.2)'::box      |           720
 uniform | buffering (1/366/auto) | new linear     | point @ '(0.103,0.903),(0.1,0.9)'::box    |           520
 uniform | ordinal                | new linear     | point @ '(0.505,0.505),(0.5,0.5)'::box    |           563
 uniform | ordinal                | new linear     | point @ '(0.501,0.501),(0.5,0.5)'::box    |           159
 uniform | ordinal                | new linear     | point @ '(0.51,0.51),(0.5,0.5)'::box      |           917
 uniform | ordinal                | new linear     | point @ '(0.21,0.21),(0.2,0.2)'::box      |           809
 uniform | ordinal                | new linear     | point @ '(0.103,0.903),(0.1,0.9)'::box    |           330
 uniform | buffering (1/366/on)   | double sorting | point @ '(0.505,0.505),(0.5,0.5)'::box    |            43
 uniform | buffering (1/366/on)   | double sorting | point @ '(0.501,0.501),(0.5,0.5)'::box    |             8
 uniform | buffering (1/366/on)   | double sorting | point @ '(0.51,0.51),(0.5,0.5)'::box      |           128
 uniform | buffering (1/366/on)   | double sorting | point @ '(0.21,0.21),(0.2,0.2)'::box      |           120
 uniform | buffering (1/366/on)   | double sorting | point @ '(0.103,0.903),(0.1,0.9)'::box    |            18
 uniform | buffering (1/366/auto) | double sorting | point @ '(0.505,0.505),(0.5,0.5)'::box    |            38
 uniform | buffering (1/366/auto) | double sorting | point @ '(0.501,0.501),(0.5,0.5)'::box    |             7
 uniform | buffering (1/366/auto) | double sorting | point @ '(0.51,0.51),(0.5,0.5)'::box      |           119
 uniform | buffering (1/366/auto) | double sorting | point @ '(0.21,0.21),(0.2,0.2)'::box      |           116
 uniform | buffering (1/366/auto) | double sorting | point @ '(0.103,0.903),(0.1,0.9)'::box    |            20
 uniform | ordinal                | double sorting | point @ '(0.505,0.505),(0.5,0.5)'::box    |            37
 uniform | ordinal                | double sorting | point @ '(0.501,0.501),(0.5,0.5)'::box    |             7
 uniform | ordinal                | double sorting | point @ '(0.51,0.51),(0.5,0.5)'::box      |           116
 uniform | ordinal                | double sorting | point @ '(0.21,0.21),(0.2,0.2)'::box      |           123
 uniform | ordinal                | double sorting | point @ '(0.103,0.903),(0.1,0.9)'::box    |            19
 usnoa2  | buffering (1/366/on)   | new linear     | point @ '(275.8,6.8),(275.5,6.5)'::box    |            70
 usnoa2  | buffering (1/366/on)   | new linear     | point @ '(300.8,6.8),(300.5,6.5)'::box    |            80
 usnoa2  | buffering (1/366/on)   | new linear     | point @ '(240,-42.9),(239.9,-43)'::box    |            15
 usnoa2  | buffering (1/366/on)   | new linear     | point @ '(259,-44.5),(258.9,-45.6)'::box  |            89
 usnoa2  | buffering (1/366/on)   | new linear     | point @ '(309.6,27),(309.5,26.9)'::box    |
[HACKERS] citext operator precedence fix
All, I just tripped over this:

select 'josh'::varchar(32) = 'Josh'::citext;
 ?column?
----------
 f

While it's clear why it's that way, it's not how people would expect citext to behave. Users expect case-insensitive text to be case-insensitive for *all* comparisons, not just for comparisons with the same data type. I would also think that folks migrating from SQL Server and MySQL would get bitten by this.

I'd like to patch the citext contrib module for 9.2 to fix this by creating four new = operators to establish the comparison function for text and varchar. Before I clean up my ad-hoc fix code for submission, are there strong objections to this idea? Or are there other data types I'd need to cover?

-- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
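[Editor's illustration, not part of the original thread: a hypothetical sketch of what one of the four proposed cross-type operators could look like. The function name and its SQL-level body are placeholders; a real patch would reuse citext's C-level comparison and collation handling rather than plain lower().]

```sql
-- Hypothetical sketch only: a citext = text operator that compares
-- case-insensitively, so 'josh'::varchar(32) = 'Josh'::citext could
-- resolve to a case-insensitive comparison.
CREATE FUNCTION citext_eq_text(citext, text)
RETURNS boolean AS $$
    SELECT lower($1::text) = lower($2)
$$ LANGUAGE sql IMMUTABLE STRICT;

CREATE OPERATOR = (
    LEFTARG    = citext,
    RIGHTARG   = text,
    PROCEDURE  = citext_eq_text,
    COMMUTATOR = =,
    NEGATOR    = <>,
    RESTRICT   = eqsel,
    JOIN       = eqjoinsel
);
```

A varchar argument would reach this operator via its binary-coercible cast to text; the mirror-image text = citext operator would be declared the same way, which is how the count of four operators arises for = alone.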
Re: [HACKERS] citext operator precedence fix
Josh Berkus j...@agliodbs.com writes:
> I'd like to patch the citext contrib module for 9.2 to fix this by
> creating four new = operators to establish the comparison function for
> text and varchar.

I think you'll find that's easier said than done (problem 1 is going to be ambiguity, and problem 2 is going to be that comparisons involving these operators won't get indexed).

regards, tom lane
Re: [HACKERS] citext operator precedence fix
> I think you'll find that's easier said than done (problem 1 is going to
> be ambiguity,

Ambiguity?

> and problem 2 is going to be that comparisons involving these operators
> won't get indexed).

Yeah, that's acceptable, since it's not any worse than the behavior of the comparisons now.

-- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] Range Types - typo + NULL string constructor
On Sep21, 2011, at 19:00 , Jeff Davis wrote:
> While we're at it, any suggestions on the text representation of an
> empty range?

My personal favourite would be '0', since it resembles the symbol used for empty sets in mathematics, and we already decided to use mathematical notation for ranges. If we're concerned that most of our users won't get that, then 'empty' would be a viable alternative I think.

From a consistency POV it'd make sense to use a bracket-based syntax also for empty ranges. But the only available options would be '()' and '[]', which are too easily confused with '(,)' and '[,]' (which we already decided should represent the full range).

best regards, Florian Pflug
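[Editor's illustration, not part of the original thread: a sketch of how the spellings under discussion would read side by side, assuming the 'empty' keyword form and the bracket forms mentioned above. Range-type input syntax was still being settled at this point, so none of these literals should be taken as final.]

```sql
SELECT 'empty'::int4range;   -- the empty range, keyword form
SELECT '(,)'::int4range;     -- the full, unbounded range
-- The rejected bracket forms '()' and '[]' would differ from the full
-- range only by the comma, which is the confusion Florian describes.
```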
Re: [HACKERS] [v9.2] make_greater_string() does not return a string in some cases
On Tue, Sep 13, 2011 at 10:13 PM, Kyotaro HORIGUCHI horiguchi.kyot...@oss.ntt.co.jp wrote:
> This is a rebased patch of `Allow encoding specific character
> incrementer' (https://commitfest.postgresql.org/action/patch_view?id=602).
> In addition to the patch, an increment sanity check program for the new
> functions pg_generic_charinc and pg_utf8_increment is attached.

I took a look at this patch ... and the thread ... and the previous thread with the same subject line: http://archives.postgresql.org/pgsql-bugs/2010-06/msg00303.php

As I understand it, the issue here is that when we try to generate suitable >= and < quals for a LIKE expression, we need to find a string which is greater than the prefix we're searching for, and the existing algorithm sometimes fails. But there seems to be some dispute over how likely this is to occur. Tom argues that the case is so rare that we shouldn't worry about it: http://archives.postgresql.org/pgsql-bugs/2010-06/msg00336.php

...while Kyotaro Horiguchi clearly feels otherwise, citing the statistic that about 100 out of 7000 Japanese characters fail to work properly: http://archives.postgresql.org/pgsql-bugs/2011-07/msg00064.php

That statistic seems to justify some action, but what? Ideas:

1. Adopt the patch as proposed, or something like it.

2. Instead of installing encoding-specific character incrementing functions, we could try to come up with a more reliable generic algorithm. Not sure exactly what, though.

3. Come up with some way to avoid needing to do this in the first place. One random idea I have is - instead of generating >= and < clauses, could we define a prefix match operator - i.e. a ### b iff substr(a, 1, length(b)) = b? We'd need to do something about the selectivity, but I don't see why that would be a problem.

Thoughts?
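[Editor's illustration, not part of the original thread: idea 3 sketched in SQL. The operator name ### and the function name come straight from the proposal above, but the SQL-level definition is a placeholder; a real implementation would need selectivity estimation and index support, as Robert notes.]

```sql
-- Hypothetical prefix-match operator: a ### b iff b is a prefix of a,
-- avoiding the need to construct a string greater than the prefix.
CREATE FUNCTION prefix_match(a text, b text)
RETURNS boolean AS $$
    SELECT substr(a, 1, length(b)) = b
$$ LANGUAGE sql IMMUTABLE STRICT;

CREATE OPERATOR ### (
    LEFTARG   = text,
    RIGHTARG  = text,
    PROCEDURE = prefix_match
);

-- 'foobar' ### 'foo' is true; 'foobar' ### 'bar' is false.
```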
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Range Types - typo + NULL string constructor
On Sep21, 2011, at 19:02 , Jeff Davis wrote:
> On Wed, 2011-09-21 at 13:24 +0200, Florian Pflug wrote:
>> I've thought about this some more, and came to realize that the
>> question here really is whether floatrange(0, 'Infinity'::float, '[)')
>> and floatrange(0, NULL, '[)') are the same thing or not.
> The unbounded side of a range is never equal to a value in the data
> type's domain, so no, it's not the same. I think that we pretty much
> settled on just using an empty string for infinity in the other thread,
> right? So that makes this a non-issue.

Agreed.

best regards, Florian Pflug
Re: [HACKERS] citext operator precedence fix
Josh Berkus j...@agliodbs.com writes:
>> I think you'll find that's easier said than done (problem 1 is going to
>> be ambiguity,
> Ambiguity?

Yeah, I'm worried about the possibility of the parser failing to resolve which operator is meant.

>> and problem 2 is going to be that comparisons involving these operators
>> won't get indexed).
> Yeah, that's acceptable, since it's not any worse than the behavior of
> the comparisons now.

No, I don't think so. For people for whom the right thing is happening, you'll risk making it (a) wrong and (b) lots slower. For people for whom the wrong thing is happening, maybe you'll fix it so it's semantically right, but if indexes don't work they still won't be happy.

regards, tom lane
Re: [HACKERS] citext operator precedence fix
>> Ambiguity?
> Yeah, I'm worried about the possibility of the parser failing to resolve
> which operator is meant.

Any idea of how to test for that? What kind of situations would make things difficult for the parser? Also, how is this any different from any optional data type which overrides = ?

> No, I don't think so. For people for whom the right thing is happening,
> you'll risk making it (a) wrong and (b) lots slower.

Well, I'm dubious that the current behavior is the right thing for anybody. The best I could do to answer that would be an informal community survey.

> For people for whom the wrong thing is happening, maybe you'll fix it so
> it's semantically right, but if indexes don't work they still won't be
> happy.

So I'd need to add operator classes and indexing support functions as well, then, presumably. Annoying, but not impossible.

-- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] Separating bgwriter and checkpointer
On 09/20/2011 09:35 AM, Heikki Linnakangas wrote:
> Yeah, I was thinking of write()s, not fsyncs. I agree this might have
> some effect during the fsync phase.

Right; that's where the most serious problems seem to pop up anyway now. All the testing I did earlier this year suggested Linux at least is happy to do a granular fsync, and it can also use things like barriers when appropriate to schedule I/O. The hope here is that the background writer work to clean ahead of the strategy point is helpful to backends, and that should keep going even during the sync phase--which currently doesn't pause for anything else once it's started. The cleaner writes should all queue up into RAM in a lazy way rather than block the true I/O, which is being driven by sync calls.

There is some risk here that the cleaner writes happen faster than the true rate at which backends really need buffers, since it has a predictive component it can be wrong about. Those could in theory result in the write cache filling faster than it would in the current environment, such that writes truly block that would have been cached in the current code. If you're that close to the edge though, backends should really benefit from the cleaner--that same write done by a client would turn into a serious stall. From that perspective, when things have completely filled the write cache, any writes the cleaner can get out of the way in advance of when a backend needs it should be the biggest win most of the time.

-- Greg Smith 2ndQuadrant US g...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us