Re: [HACKERS] libpq support for arrays and composites
Andrew Chernow wrote:
> libpqtypes already implemented this. It is a different approach but provides the same functionality, with the inclusion of being able to handle every data type. libpqtypes uses the PGresult API for composites and arrays, rather than adding a new set of functions.

Yes, I thought you'd say that :-) This has some significant limitations - for example (quoting from your docs):

> Arrays are only handled using binary format. This means that any type used as an array element must be put and gotten in binary format. If a user-defined type does not implement a send and recv function in the backend, it can not be used as an array element.

That makes it quite useless for my intended purpose. I also am not particularly enamoured of the libpqtypes way of doing things, which feels rather foreign to me. Lastly, the idea is to provide extra facilities to libpq clients without requiring any extra library.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq support for arrays and composites
Andrew Dunstan [EMAIL PROTECTED] writes:
> Lastly, the idea is to provide extra facilities to libpq clients without requiring any extra library.

Or more to the point, without requiring boatloads of new code that only some libpq users would have any use for. To my mind, the point of the present proposal is to provide some client-side code that understands how to invert the data formatting/escaping rules implemented by array_out, record_out, and perhaps array_in/record_in. We can make that happen without taking a quantum jump in libpq's API complexity --- and offhand it seems that Andrew D's proposal is at about the right level of complication. libpqtypes has its place also, but I think it's addressing a different level of problem complexity.

regards, tom lane
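As a rough illustration of what "inverting array_out" involves on the client side - delimiters, double-quoting, backslash escapes, bare NULLs - here is a toy parser sketch (Python purely for brevity; the function name and the one-dimensional, comma-delimiter-only simplifications are mine, not part of the proposal):

```python
def parse_array_literal(s):
    """Parse a one-dimensional array_out-style literal such as
    {a,NULL,"x \"y\" z"} into a list of strings (None for SQL NULL).
    Sketch only: no multidimensional arrays, ',' delimiter assumed."""
    if not (s.startswith("{") and s.endswith("}")):
        raise ValueError("not an array literal")
    body = s[1:-1]
    if body == "":
        return []
    elems, buf = [], []
    in_quotes = was_quoted = False

    def flush():
        nonlocal buf, was_quoted
        text = "".join(buf)
        # an unquoted bare NULL is SQL NULL; a quoted "NULL" is the string
        elems.append(None if (not was_quoted and text == "NULL") else text)
        buf, was_quoted = [], False

    i = 0
    while i < len(body):
        c = body[i]
        if in_quotes:
            if c == "\\":              # backslash escapes the next char
                buf.append(body[i + 1])
                i += 2
                continue
            if c == '"':
                in_quotes = False
            else:
                buf.append(c)
        elif c == '"':
            in_quotes = was_quoted = True
        elif c == ",":                 # element delimiter
            flush()
        else:
            buf.append(c)
        i += 1
    flush()
    return elems
```

Even this stripped-down version shows the escaping rules a client library would have to get exactly right, which is the code Tom is suggesting libpq should ship.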
Re: [HACKERS] Automating our version-stamping a bit better
Andrew Dunstan wrote:
> Tom Lane wrote:
>> I'm tempted to suggest letting the script invoke autoconf, too, but that would require standardizing where to find the correct version of autoconf for each branch; so it might not be such a great idea.
>
> Unfortunately that's true. Maybe we could agree on using an alias for the right version of autoconf, but it seems likely to be error prone.

Or we could sidestep the issue by not running autoconf, but search-replace the version strings in configure directly with the perl script.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] libpq support for arrays and composites
> That makes it quite useless for my intended purpose.

I found no more use cases for text results after libpqtypes started to take shape, even though libpqtypes supports all data types in text & binary, excluding arrays and composites. Because of this, adding a text parser for arrays and composites felt like a lot of work for little gain. libpqtypes is really designed to be a binary interface. The text support offered allows existing applications to use the new interface with results generated by PQexec(), meaning you can use PQgetf() without having to change code to use PQputf(). If you take another glance at libpqtypes, you may see that result format decisions are pretty well abstracted and there really is no need for text results anymore (okay, I'll categorize that as an opinion).

> I also am not particularly enamoured of the libpqtypes way of doing things, which feels rather foreign to me.

Not sure we can fix this issue. We made every attempt to keep things familiar ... printf/scanf style. It's a new approach for libpq but an old one for C hacks.

--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/
[HACKERS] proposal: add window function to 8.4
This topic has been discussed on this list and many users expect that PostgreSQL implements it. I'd like to work on this feature and hope that we can include it in 8.4. Former discussions are here:

http://archives.postgresql.org/pgsql-hackers/2004-11/msg01093.php
http://archives.postgresql.org/pgsql-hackers/2007-01/msg00861.php

How it works and examples:

SELECT dept, empno,
       RANK() OVER(PARTITION BY dept ORDER BY age) as age_rank,
       RANK() OVER(PARTITION BY dept ORDER BY salary) as salary_rank,
       SUM(salary) OVER(PARTITION BY dept ORDER BY age) as run_total
FROM employees ORDER BY 1, 3, 4;

 dept | empno | age_rank | salary_rank | run_total
 ENG  |     2 |        1 |           2 |         4
 ENG  |     1 |        2 |           1 |         9
 QA   |     3 |        1 |           2 |         3
 QA   |     4 |        2 |           1 |     65000

(ref.: http://www.gavinsherry.org/talks/window_talk.pdf)

My current idea and concept:
- add window functions and treat them specially, much as aggregate functions and SETOF functions are treated.
- some features may be dropped at the first release, considering supporting them later.
- to formulate it and to make it work properly are primary; performance optimization is secondary.

From my survey around the web and the list archive, the points are:

- what is a window function: rank(), dense_rank(), lead() and others?

First of all, we should define the window function much as we define the aggregate function. In my opinion, there are two types of functions in an OVER () call context. One of them is the aggregate, and the other is the window (ranking) function. sum() in a query like

SELECT empno, sum(salary) OVER (PARTITION BY depno) FROM empsalary;

is obviously an aggregate function. This type of function can be used as it is now; only the executor will change its behavior. Current pgsql feature sets lack window functions like rank(). This type of function must 1) have a context such as SETOF functions have, 2) return values until the executor says DONE, rather than the function itself saying DONE as in a SETOF function, and 3) know about other tuples (mainly the ORDER BY clause), as rank() doesn't take any arguments but knows the ranking boundary.
I suppose that pgsql should have a new function system for these, called window functions. Once we can define window functions, users have extensibility for this type of function.

- do we really need the FRAME clause?

From my survey, Oracle and DB2 have the FRAME clause:

SELECT empno, sum(salary) OVER (ORDER BY empno ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) FROM empsalary;

while MS SQL Server doesn't (note that the literal from ROWS to CURRENT ROW is called the FRAME clause). Implementing the FRAME clause is much more complicated than supporting only PARTITION and ORDER clauses, because we need random access to result tuples. Though we will be expected to support the FRAME clause in the long term, for the first release it could be put off. Even if rank() doesn't support the FRAME clause (only PARTITION and ORDER), it is still very useful - more so than what we have now, at least.

- execution order

In SQL:2003, the execution order is defined as

where -> group by -> having -> (windowing) * N -> order by (outer, currently existing one)

where windowing is

partition by -> order by -> (framing) * N

But Oracle seems to have another order:

(windowing) * N -> where -> group by ... and so on.

Which is better for us? With Oracle's one you can say

SELECT empno, rank() OVER (PARTITION BY depno ORDER BY salary) AS topsalary
FROM empsalary WHERE topsalary <= 3

to get the top 3 people taking the highest salary. In the SQL standard, you have to nest the query. I insist the first (standard) one is better because we may want to use the result of a normal aggregate in the OVER clause.

- plan and node

Currently in my mind the execution strategy could be:

1. Where / GroupBy / Having
   |
2. SortBy partitionClause, orderByClause
   |
3. Window
   foreach partition:
       if not there_is_frame():
           aggvalue = null
           foreach row in partition:
               aggvalue = agg_trans_func(aggvalue)
           aggvalue = agg_final_func(aggvalue)
       foreach row in partition:
           if has frame clause:
               aggvalue = null
               frame = make_frame()
               foreach row in frame:
                   aggvalue = aggregate_trans_func(aggvalue)
               aggvalue = aggregate_final_func(aggvalue)
               set aggvalue to row
           val = window_func()
           set val to row
   goto 2. if another window remains
   |
4. SortBy ORDER BY clause (outer) / Limit
   |
5. Output

This pseudo code is quite simple and stupid. We may optimize it by splitting tasks with MergeJoin, etc., or think about making process 2 collect identical PARTITION clauses to reduce sort operations. Or use Hash Agg to create PARTITIONs. But let's be stupid at first. Optimization is secondary.

References:
description by Stefan DeBloch
http://wwwdvs.informatik.uni-kl.de/courses/NEDM/WS0607/Vorlesungsunterlagen/NEDM.Chapter.06.Windows_and_Query_Functions_in_SQL.pdf
via Wikipedia [Select (SQL)]
http://en.wikipedia.org/wiki/Select_(SQL)

BTW, what about Bizgres?
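The strategy above - sort by (partition, order), then walk each partition - can be sketched as runnable code. Here is a toy version (Python purely for brevity) computing rank() and a running sum per partition; all function names are my own, it ignores the peer-group subtlety of SQL's default RANGE frame, and it is in no way a proposed executor design:

```python
from itertools import groupby

def window_eval(rows, partition_key, order_key):
    """Compute rank() and a running sum(salary), per partition:
    sort by (partition, order), then walk each partition in order."""
    out = []
    rows = sorted(rows, key=lambda r: (r[partition_key], r[order_key]))
    for _, part in groupby(rows, key=lambda r: r[partition_key]):
        run_total = 0
        rank, prev = 0, object()        # sentinel: no previous order value
        for i, row in enumerate(part, start=1):
            if row[order_key] != prev:  # standard rank(): ties share a rank
                rank, prev = i, row[order_key]
            run_total += row["salary"]  # running aggregate over the partition
            out.append(dict(row, rank=rank, run_total=run_total))
    return out
```

Note how rank() needs no arguments yet depends on the ORDER BY column of its neighbours - exactly the "knows about other tuples" property that makes it different from an aggregate or a SETOF function.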
Re: [HACKERS] TODO, FAQs to Wiki?
Alvaro Herrera wrote:
> Actually, now that I try it, it seems that the MediaWiki markup is not completely helpful here -- right now, on some items we have a one-line header and then possibly a longer description, and it seems the only way to do that in MediaWiki is like this:
>
> * Set proper permissions on non-system schemas during db creation<br> Currently all schemas are owned by the super-user because they are copied from the template1 database. However, since all objects are inherited from the template database, it is not clear that setting schemas to the db owner is correct.
>
> Note the dumb <br> thing in the middle. Personally I find that ugly enough as to be unacceptable; what do others think?

How about using definition lists?

//Magnus
Re: [HACKERS] pg_dump restore time and Foreign Keys
On Jun 7, 2008, at 2:00 PM, Andrew Dunstan wrote:
> If we go down this road then I would far rather we tried to devise some safe (or semi-safe) way of doing it instead of simply providing expert (a.k.a. footgun) mode. For instance, I'm wondering if we could do something with checksums of the input lines or something else that would make this difficult to do in circumstances other than pg_restore.

Yes, but that provides no help at all outside of pg_dump. Being able to add a FK with NO CHECK would be tremendously useful outside of pg_dump. Actually, in the interest of stating the problem and not the solution, what we need is a way to add FKs that doesn't lock everything up to perform the key checks. Perhaps there is some semi-safe way that the constraint could be added and the checks done in the background...

As for the footgun aspect, are we the enterprise-class OSS database or the one that caters itself to noobs that will go out of their way to make life hard on themselves? I'm all in favor of not adding footguns that don't have value, but this one holds a lot of value for anyone trying to maintain a large database in a 24/7 environment. To put this in perspective, the amount of revenue we would lose from adding just one FK to one of our larger tables would more than cover paying someone to develop this feature.

--
Decibel!, aka Jim C. Nasby, Database Architect [EMAIL PROTECTED]
Give your computer some brain candy! www.distributed.net Team #1828
Re: [HACKERS] proposal: add window function to 8.4
On Jun 9, 2008, at 7:32 AM, H.Harada wrote:
> This topic has been discussed on this list and many users expect that PostgreSQL implements it. I'd like to work on this feature and hope that we can include it in 8.4.

I can't really comment on the technical aspects of your proposal, but yes, please, windowing functions would be great to have even if not fully implemented.

--
Decibel!, aka Jim C. Nasby, Database Architect [EMAIL PROTECTED]
Give your computer some brain candy! www.distributed.net Team #1828
[HACKERS] Strange issue with GiST index scan taking far too long
Hi there,

I'm currently looking at a bug report in PostGIS where we are getting extremely long index scan times querying an index in one case, but the same scan can take much less time if the input geometry is calculated as the result of another function. First of all, I include the EXPLAIN ANALYZE of the basic query which looks like this:

postgis=# explain analyze select count(*) from geography where centroid && (select the_geom from geography where id=69495);

                                    QUERY PLAN
-----------------------------------------------------------------------------------
 Aggregate  (cost=7157.29..7157.30 rows=1 width=0) (actual time=2691.783..2691.784 rows=1 loops=1)
   InitPlan
     ->  Seq Scan on geography  (cost=0.00..7149.00 rows=1 width=4559) (actual time=60.987..61.373 rows=1 loops=1)
           Filter: (id = 69495::numeric)
   ->  Index Scan using geography_geom_centroid_idx on geography  (cost=0.00..8.28 rows=1 width=0) (actual time=79.241..2645.722 rows=32880 loops=1)
         Index Cond: (centroid && $0)
         Filter: (centroid && $0)
 Total runtime: 2692.288 ms
(8 rows)

The only real thing to know about the query is that the id field within the geography table is a primary key, and hence only a single geometry is being returned from within the subselect. Note that most of the time is disappearing into the index scan. Where things start getting really strange is when we add an extra function called force_2d() into the mix. All this function does is scan through the single geometry returned from the subselect and remove any 3rd dimension coordinates.
Now the resulting EXPLAIN ANALYZE for this query looks like this:

postgis=# explain analyze select count(*) from geography where centroid && (select force_2d(the_geom) from geography where id=69495);

                                    QUERY PLAN
-----------------------------------------------------------------------------------
 Aggregate  (cost=7157.29..7157.30 rows=1 width=0) (actual time=343.004..343.005 rows=1 loops=1)
   InitPlan
     ->  Seq Scan on geography  (cost=0.00..7149.00 rows=1 width=4559) (actual time=48.714..49.016 rows=1 loops=1)
           Filter: (id = 69495::numeric)
   ->  Index Scan using geography_geom_centroid_idx on geography  (cost=0.00..8.28 rows=1 width=0) (actual time=49.367..235.296 rows=32880 loops=1)
         Index Cond: (centroid && $0)
         Filter: (centroid && $0)
 Total runtime: 343.084 ms
(8 rows)

So by adding in an extra function around the subselect result, we have speeded up the index lookup by several orders of magnitude, and the speedup appears to be coming from somewhere within the index scan?! I've spent a little bit of time playing with this and it seems even writing a function that returns a copy of the input geometry is enough.
Here is my test function below:

Datum LWGEOM_mcatest(PG_FUNCTION_ARGS);

PG_FUNCTION_INFO_V1(LWGEOM_mcatest);
Datum LWGEOM_mcatest(PG_FUNCTION_ARGS)
{
	PG_LWGEOM *pgl = (PG_LWGEOM *) PG_DETOAST_DATUM(PG_GETARG_DATUM(0));
	void *mem;

	/* Copy somewhere else */
	mem = palloc(VARSIZE(pgl));
	memcpy(mem, pgl, VARSIZE(pgl)-VARHDRSZ);

	PG_RETURN_POINTER(mem);
}

CREATE OR REPLACE FUNCTION mcatest(geometry)
	RETURNS geometry
	AS '$libdir/lwpostgis','LWGEOM_mcatest'
	LANGUAGE 'C';

And then here is the resulting EXPLAIN ANALYZE:

postgis=# explain analyze select count(*) from geography where centroid && (select mcatest(the_geom) from geography where id=69495);

                                    QUERY PLAN
-----------------------------------------------------------------------------------
 Aggregate  (cost=7157.29..7157.30 rows=1 width=0) (actual time=283.126..283.127 rows=1 loops=1)
   InitPlan
     ->  Seq Scan on geography  (cost=0.00..7149.00 rows=1 width=4559) (actual time=48.712..49.040 rows=1 loops=1)
           Filter: (id = 69495::numeric)
   ->  Index Scan using geography_geom_centroid_idx on geography  (cost=0.00..8.28 rows=1 width=0) (actual time=49.321..215.524 rows=32880 loops=1)
         Index Cond: (centroid && $0)
         Filter: (centroid && $0)
 Total runtime: 283.221 ms
(8 rows)

Unfortunately I can't seem to work out why the extra time is disappearing into the index scan when my extra mcatest() function is not present, especially as sprof doesn't seem to want to run at the moment :( I'm wondering if it's related to excess TOAST/palloc/pfree activity somewhere in the code, but I'd definitely appreciate any pointers. All these tests were done using PostgreSQL 8.3.1 and the latest PostGIS SVN.

Many thanks,

Mark.

--
Mark Cave-Ayland
Sirius Corporation - The Open Source Experts
http://www.siriusit.co.uk
T: +44 870 608 0063
Re: [HACKERS] pg_dump restore time and Foreign Keys
Decibel! wrote:
> Yes, but that provides no help at all outside of pg_dump. Being able to add a FK with NO CHECK would be tremendously useful outside of pg_dump. Actually, in the interest of stating the problem and not the solution, what we need is a way to add FKs that doesn't lock everything up to perform the key checks. Perhaps there is some semi-safe way that the constraint could be added and the checks done in the background...

I had some thoughts along the same lines. But how do you propose to recover when the check fails? What should pg_restore do if the dump is corrupt, causing an FK check to fail? I suppose we could have some sort of marking for FK constraints along the lines of {checked, unchecked, invalid}.

> As for the footgun aspect, are we the enterprise-class OSS database or the one that caters itself to noobs that will go out of their way to make life hard on themselves?

We are the database that tries very hard to keep its promises. If you want to change or relax those promises then the implications need to be very very clear.

> I'm all in favor of not adding footguns that don't have value, but this one holds a lot of value for anyone trying to maintain a large database in a 24/7 environment. To put this in perspective, the amount of revenue we would lose from adding just one FK to one of our larger tables would more than cover paying someone to develop this feature.

Come up with a good proposal and I'm your man :-) I haven't seen one yet.

cheers

andrew
Re: [HACKERS] Strange issue with GiST index scan taking far too long
On Mon, 09.06.2008 at 14:18:50 +0100, Mark Cave-Ayland wrote the following:
> Hi there,
>
> I'm currently looking at a bug report in PostGIS where we are getting extremely long index scan times querying an index in one case, but the same scan can take much less time if the input geometry is calculated as the result of another function. First of all, I include the EXPLAIN ANALYZE of the basic query which looks like this:
>
> postgis=# explain analyze select count(*) from geography where centroid && (select the_geom from geography where id=69495);
>
>  Aggregate  (cost=7157.29..7157.30 rows=1 width=0) (actual time=2691.783..2691.784 rows=1 loops=1)
>    InitPlan
>      ->  Seq Scan on geography  (cost=0.00..7149.00 rows=1 width=4559) (actual time=60.987..61.373 rows=1 loops=1)
>            Filter: (id = 69495::numeric)
>    ->  Index Scan using geography_geom_centroid_idx on geography  (cost=0.00..8.28 rows=1 width=0) (actual time=79.241..2645.722 rows=32880 loops=1)
>          Index Cond: (centroid && $0)
>          Filter: (centroid && $0)
>  Total runtime: 2692.288 ms
> (8 rows)

There is a BIG difference between estimated rows and real rows (1 vs. 32880). Why? Do you have recent statistics?

> The only real thing to know about the query is that the id field within the geography table is a primary key, and hence only a single geometry

Sure? I can't believe this, because of the rows=32880.

> is being returned from within the subselect. Note that most of the time is disappearing into the index scan. Where things start getting really strange is when we add an extra function called force_2d() into the mix. All this function does is scan through the single geometry returned from the subselect and remove any 3rd dimension coordinates.
> Now the resulting EXPLAIN ANALYZE for this query looks like this:
>
> postgis=# explain analyze select count(*) from geography where centroid && (select force_2d(the_geom) from geography where id=69495);
>
>  Aggregate  (cost=7157.29..7157.30 rows=1 width=0) (actual time=343.004..343.005 rows=1 loops=1)
>    InitPlan
>      ->  Seq Scan on geography  (cost=0.00..7149.00 rows=1 width=4559) (actual time=48.714..49.016 rows=1 loops=1)
>            Filter: (id = 69495::numeric)
>    ->  Index Scan using geography_geom_centroid_idx on geography  (cost=0.00..8.28 rows=1 width=0) (actual time=49.367..235.296 rows=32880 loops=1)
>          Index Cond: (centroid && $0)
>          Filter: (centroid && $0)
>  Total runtime: 343.084 ms
> (8 rows)
>
> So by adding in an extra function around the subselect result, we have speeded up the index lookup by several orders of magnitude, and the

Wild guess: you have a big cache. But I'm not a PostGIS expert.

Regards, Andreas
--
Andreas Kretschmer
Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (more: -> headers)
GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA http://wwwkeys.de.pgp.net
Re: [HACKERS] Strange issue with GiST index scan taking far too long
On Mon, 2008-06-09 at 14:18 +0100, Mark Cave-Ayland wrote:
> Unfortunately I can't seem to work out why the extra time is disappearing into the index scan when my extra mcatest() function is not present,

Hmmm, perhaps implicit casting? Try this to see if it works better:

select count(*) from geography where centroid && (select the_geom::geometry from geography where id=69495);

--
Simon Riggs
www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Re: [HACKERS] Strange issue with GiST index scan taking far too long
Simon Riggs wrote:
> Hmmm, perhaps implicit casting? Try this to see if it works better:
>
> select count(*) from geography where centroid && (select the_geom::geometry from geography where id=69495);

Hi Simon,

Unfortunately that seems to take the slow runtime path too. I did initially think about casting being involved (since the underlying index storage type is actually box2d rather than geometry), however my mcatest() function is also declared as returning geometry too. Interestingly enough, forcing a cast to box2d instead of geometry seems to take the faster path, i.e.:

postgis=# explain analyze select count(*) from geography where centroid && (select the_geom::box2d from geography where id=69495);

                                    QUERY PLAN
-----------------------------------------------------------------------------------
 Aggregate  (cost=7157.29..7157.30 rows=1 width=0) (actual time=376.033..376.034 rows=1 loops=1)
   InitPlan
     ->  Seq Scan on geography  (cost=0.00..7149.00 rows=1 width=4559) (actual time=42.853..43.051 rows=1 loops=1)
           Filter: (id = 69495::numeric)
   ->  Index Scan using geography_geom_centroid_idx on geography  (cost=0.00..8.28 rows=1 width=0) (actual time=43.218..286.535 rows=32880 loops=1)
         Index Cond: (centroid && ($0)::geometry)
         Filter: (centroid && ($0)::geometry)
 Total runtime: 376.117 ms
(8 rows)

ATB,

Mark.

--
Mark Cave-Ayland
Sirius Corporation - The Open Source Experts
http://www.siriusit.co.uk
T: +44 870 608 0063
Re: [HACKERS] pg_dump restore time and Foreign Keys
Decibel! [EMAIL PROTECTED] writes:
> Actually, in the interest of stating the problem and not the solution, what we need is a way to add FKs that doesn't lock everything up to perform the key checks.

Ah, finally a useful comment. I think it might be possible to do an "add FK concurrently" type of command that would take exclusive lock for just long enough to add the triggers, then scan the tables with just AccessShareLock to see if the existing rows meet the constraint, and if so finally mark the constraint valid. Meanwhile the constraint would be enforced against newly-added rows by the triggers, so nothing gets missed. You'd still get a small hiccup in system performance from the transient exclusive lock, but nothing like as bad as it is now. Would that solve your problem?

regards, tom lane
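Mechanically, the sequence Tom describes can be modelled with a toy sketch (Python purely for brevity; the function names and the {checked, unchecked, invalid} state labels are illustrative, borrowed from the earlier message in this thread, and are not an actual PostgreSQL API): trigger enforcement for new rows starts immediately, validation of pre-existing rows happens afterwards, and only then is the constraint marked trustworthy.

```python
def add_fk_concurrently(child_rows, parent_keys, incoming_inserts):
    """Toy model of the proposed sequence: (1) briefly take the exclusive
    lock and install the enforcing trigger, (2) enforce the FK on every
    new row from then on, (3) validate pre-existing rows (in reality
    under AccessShareLock only), (4) mark the constraint accordingly."""
    state = "unchecked"

    def trigger_insert(row):           # steps 1-2: new rows checked at once
        if row["fk"] not in parent_keys:
            raise ValueError("insert violates foreign key")
        child_rows.append(row)

    for row in incoming_inserts:       # traffic arriving during validation
        trigger_insert(row)

    # step 3: background scan of the existing (plus newly added) rows
    if all(r["fk"] in parent_keys for r in child_rows):
        state = "checked"              # step 4: constraint now trusted
    else:
        state = "invalid"              # e.g. corrupt dump: flag it, don't lie
    return state
```

The key property is that no window exists in which an unchecked row can slip in: anything inserted after the trigger is installed is validated on the way in, while the long scan over old rows proceeds without the exclusive lock.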
[HACKERS] Potential deadlock with auto-analyze
It says here that CVS HEAD has a deadlock risk:
http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=chinchilla&dt=2008-06-09%2008:16:01

The problem manifested while attempting to drop a GIN index (though I doubt GIN is specially at fault):

[484ceb40.47da:169] ERROR:  deadlock detected
[484ceb40.47da:170] DETAIL:  Process 18394 waits for AccessExclusiveLock on relation 41759 of database 41648; blocked by process 18401.
	Process 18401 waits for AccessShareLock on relation 41766 of database 41648; blocked by process 18394.
	Process 18394: DROP INDEX text_idx;
	Process 18401: autovacuum: ANALYZE public.test__int
[484ceb40.47da:171] HINT:  See server log for query details.
[484ceb40.47da:172] STATEMENT:  DROP INDEX text_idx;

We need to fix that --- having DDL randomly fail is not going to go over well with autovacuum users. I don't have time to look into it just now, but I rather suspect that what's happening is that the DROP tries to lock the index first and the table second, since it can't find out which table is involved without looking at the index. So it can deadlock against anything going in the more-normal direction. If that's the right diagnosis, we probably ought to try to fix DROP INDEX to not do that, since it could deadlock against any number of things, not just autovac. But there's another question here: can we make autovac lose the deadlock war, instead of the user process?

regards, tom lane
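The suspected fix is the classic remedy for this class of deadlock: impose a single global lock ordering. A minimal sketch of what "DROP INDEX locking table-first" means (Python purely for illustration; the names are mine, not PostgreSQL's lock manager API):

```python
def lock_order_for_drop_index(index_oid, index_to_table):
    """DROP INDEX has to consult the index to discover its owning table,
    but it can peek at that mapping first and then *acquire* locks
    table-then-index - the same order every other code path uses - so no
    lock cycle with concurrent table-first operations can form."""
    table_oid = index_to_table[index_oid]   # look up the owner without locking
    return [("AccessExclusiveLock", table_oid),
            ("AccessExclusiveLock", index_oid)]
```

In the deadlock log above, the DROP holds the index lock (41766) while waiting for the table (41759), against a process that went table-then-index; acquiring in the uniform order removes the cycle.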
Re: [HACKERS] pg_dump restore time and Foreign Keys
On Mon, 2008-06-09 at 10:57 -0400, Tom Lane wrote:
> Decibel! [EMAIL PROTECTED] writes:
>> Actually, in the interest of stating the problem and not the solution, what we need is a way to add FKs that doesn't lock everything up to perform the key checks.
>
> Ah, finally a useful comment. I think it might be possible to do an "add FK concurrently" type of command that would take exclusive lock for just long enough to add the triggers, then scan the tables with just AccessShareLock to see if the existing rows meet the constraint, and if so finally mark the constraint valid. Meanwhile the constraint would be enforced against newly-added rows by the triggers, so nothing gets missed. You'd still get a small hiccup in system performance from the transient exclusive lock, but nothing like as bad as it is now. Would that solve your problem?

That's good, but it doesn't solve the original user complaint about needing to re-run many, many large queries to which we already know the answer.

--
Simon Riggs
www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Re: [HACKERS] Proposal - improve eqsel estimates by including histogram bucket numdistinct statistics
On Sun, 2008-06-08 at 19:03 -0400, Tom Lane wrote:
> Your argument seems to consider only columns having a normal distribution. How badly does it fall apart for non-normal distributions? (For instance, Zipfian distributions seem to be pretty common in database work, from what I've seen.)

If using "Idea 1: Keep an array of stadistinct that correspond to each bucket size", I would expect it to still be a better estimate than it is currently, because it's keeping a separate ndistinct for each histogram bucket.

Regards,
	Jeff Davis
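A minimal sketch of the per-bucket idea being discussed (Python purely for brevity; the helper names and the exclusive upper bucket bound are my simplifications, not the patch or pg_statistic's actual layout): split a sampled column on histogram bucket boundaries, keep a distinct count per bucket, and let the equality estimate use the containing bucket's local density instead of one table-wide stadistinct.

```python
from bisect import bisect_right

def per_bucket_ndistinct(sample, boundaries):
    """Distinct-value count per histogram bucket; bucket i spans
    boundaries[i] (inclusive) to boundaries[i+1] (exclusive, as a
    simplification of the real closed histogram bounds)."""
    buckets = [set() for _ in range(len(boundaries) - 1)]
    for v in sample:
        i = bisect_right(boundaries, v) - 1
        if 0 <= i < len(buckets):
            buckets[i].add(v)
    return [len(b) for b in buckets]

def eqsel_estimate(sample, boundaries, value):
    """Selectivity of `col = value`: spread the containing bucket's rows
    evenly over that bucket's own ndistinct, not a global figure."""
    nd = per_bucket_ndistinct(sample, boundaries)
    i = bisect_right(boundaries, value) - 1
    if not (0 <= i < len(nd)) or nd[i] == 0:
        return 0.0
    rows_in_bucket = sum(1 for v in sample
                         if boundaries[i] <= v < boundaries[i + 1])
    return (rows_in_bucket / nd[i]) / len(sample)
```

For a skewed (e.g. Zipfian) column, the dense head and sparse tail land in different buckets, so each bucket's local ndistinct tracks its own density - which is why the estimate should degrade less than a single global stadistinct does.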
Re: [HACKERS] pg_dump restore time and Foreign Keys
Simon Riggs wrote:
> On Mon, 2008-06-09 at 10:57 -0400, Tom Lane wrote:
>> Ah, finally a useful comment. I think it might be possible to do an "add FK concurrently" type of command that would take exclusive lock for just long enough to add the triggers, then scan the tables with just AccessShareLock to see if the existing rows meet the constraint, and if so finally mark the constraint valid. Meanwhile the constraint would be enforced against newly-added rows by the triggers, so nothing gets missed. You'd still get a small hiccup in system performance from the transient exclusive lock, but nothing like as bad as it is now. Would that solve your problem?
>
> That's good, but it doesn't solve the original user complaint about needing to re-run many, many large queries to which we already know the answer.

But we don't know it for dead sure, we only think we do. What if the data for one or other of the tables is corrupted? We'll end up with data we believe is consistent but in fact is not, ISTM. If you can somehow guarantee the integrity of data in both tables then we might be justified in assuming that the FK constraint will be consistent - that's why I suggested some sort of checksum mechanism might serve the purpose.

cheers

andrew
Re: [HACKERS] Potential deadlock with auto-analyze
Tom Lane wrote:
> If that's the right diagnosis, we probably ought to try to fix DROP INDEX to not do that, since it could deadlock against any number of things, not just autovac. But there's another question here: can we make autovac lose the deadlock war, instead of the user process?

Hmm, I thought autovacuum was already supposed to lose on deadlock. Perhaps it's only considering VACUUM though and not analyze ... [checks] nope, that doesn't seem to be it.

--
Alvaro Herrera                        http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] pg_dump restore time and Foreign Keys
Simon Riggs [EMAIL PROTECTED] writes:
> On Mon, 2008-06-09 at 10:57 -0400, Tom Lane wrote:
>> Ah, finally a useful comment. I think it might be possible to do an "add FK concurrently" type of command that would take exclusive lock
>
> That's good, but it doesn't solve the original user complaint about needing to re-run many, many large queries to which we already know the answer.

No, we are running a large query to which the user *thinks* he knows the answer. There are any number of reasons why he might be wrong.

regards, tom lane
Re: [HACKERS] pg_dump restore time and Foreign Keys
On Mon, 2008-06-09 at 11:23 -0400, Andrew Dunstan wrote:
> But we don't know it for dead sure, we only think we do. What if the data for one or other of the tables is corrupted? We'll end up with data we believe is consistent but in fact is not, ISTM. If you can somehow guarantee the integrity of data in both tables then we might be justified in assuming that the FK constraint will be consistent - that's why I suggested some sort of checksum mechanism might serve the purpose.

Agreed. Can we get COPY to output the checksum of its output as part of the command tag? How else can we return the checksum? In $file.cksum for any given output file? We can then use an explicit checksum option in the COPY when we reload, with a CHECKSUM option.

--
Simon Riggs
www.2ndQuadrant.com
PostgreSQL Training, Services and Support
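To make the idea concrete, here is a rough sketch, in Python for brevity, of checksumming a COPY-format data stream on dump and verifying it on reload. The function names, the CRC-32 choice, and the $file.cksum convention are illustrative assumptions, not an actual COPY feature:

```python
import zlib

def checksum_copy_stream(lines):
    """Running CRC-32 over the raw COPY text lines - the kind of value
    COPY could report in its command tag or write to $file.cksum."""
    crc = 0
    for line in lines:
        crc = zlib.crc32(line.encode("utf-8"), crc)  # chain the CRC
    return crc & 0xffffffff

def verify_copy_stream(lines, expected):
    """Reload-side check: recompute and compare before trusting the data
    enough to skip re-validating constraints against it."""
    return checksum_copy_stream(lines) == expected
```

Any single-bit corruption of the dump flips the CRC, so a reload that skipped FK re-checking would at least refuse to silently accept a damaged file; it does not, of course, protect against a dump that was already inconsistent when taken.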
Re: [HACKERS] pg_dump restore time and Foreign Keys
On Mon, 2008-06-09 at 11:33 -0400, Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: On Mon, 2008-06-09 at 10:57 -0400, Tom Lane wrote: Ah, finally a useful comment. I think it might be possible to do an add FK concurrently type of command that would take exclusive lock That's good, but it doesn't solve the original user complaint about needing to re-run many, many large queries to which we already know the answer. No, we are running a large query to which the user *thinks* he knows the answer. There are any number of reasons why he might be wrong. Of course. I should have said to which we already know the answer to indicate I'm passing on others' criticisms of us. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Strange issue with GiST index scan taking far too long
Mark Cave-Ayland [EMAIL PROTECTED] writes: So by adding in an extra function around the subselect result, we have speeded up the index lookup by several orders of magnitude, and the speedup appears to be coming from somewhere within the index scan?! Is the value you are fetching from the geography table large enough to be toasted? I'm thinking you might be looking at the cost of repeated de-toastings. BTW, that mcatest function is buggy --- it's not copying the last 4 bytes of the source value. I don't know enough about PostGIS data types to know what effect that would have. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Strange issue with GiST index scan taking far too long
Tom Lane wrote: Is the value you are fetching from the geography table large enough to be toasted? I'm thinking you might be looking at the cost of repeated de-toastings. Yeah, it's a fairly large geometry field so it will definitely be getting toasted. So is it a case of with the mcatest function in place, we're effectively caching the de-TOASTED value? BTW, that mcatest function is buggy --- it's not copying the last 4 bytes of the source value. I don't know enough about PostGIS data types to know what effect that would have. It would probably lose that last point in the coordinate sequence, but nothing major. I've removed the -VARHDRSZ part just to check and it doesn't make any difference. Many thanks, Mark. -- Mark Cave-Ayland Sirius Corporation - The Open Source Experts http://www.siriusit.co.uk T: +44 870 608 0063 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump restore time and Foreign Keys
Simon Riggs [EMAIL PROTECTED] writes: On Mon, 2008-06-09 at 11:33 -0400, Tom Lane wrote: No, we are running a large query to which the user *thinks* he knows the answer. There are any number of reasons why he might be wrong. Of course. I should have said to which we already know the answer to indicate I'm passing on others' criticisms of us. [ shrug... ] We don't know the answer either, and anyone who says we do is merely betraying his ignorance of the number of ways to load a foot-gun. I don't have any confidence in the checksum proposal either, as it's still naively assuming that changes in the data are the only possible problem. Consider that you are loading the data into a new database, which might be running under a different locale setting, might contain a new implementation of a datatype with subtly (or not so subtly) different semantics, or might just already contain data in the target tables. pg_dump scripts are not nearly bright enough to defend against these types of threats. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Strange issue with GiST index scan taking far too long
Mark Cave-Ayland [EMAIL PROTECTED] writes: Tom Lane wrote: Is the value you are fetching from the geography table large enough to be toasted? I'm thinking you might be looking at the cost of repeated de-toastings. Yeah, it's a fairly large geometry field so it will definitely be getting toasted. So is it a case of with the mcatest function in place, we're effectively caching the de-TOASTED value? Well, yeah, because the first thing it does is pg_detoast_datum. Just as a cross-check, try changing it to copy the value without forcibly detoasting --- I'll bet it's still slow then. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump restore time and Foreign Keys
2008/6/9 Simon Riggs [EMAIL PROTECTED]: On Mon, 2008-06-09 at 10:57 -0400, Tom Lane wrote: Decibel! [EMAIL PROTECTED] writes: Actually, in the interest of stating the problem and not the solution, what we need is a way to add FKs that doesn't lock everything up to perform the key checks. Ah, finally a useful comment. I think it might be possible to do an add FK concurrently type of command that would take exclusive lock for just long enough to add the triggers, then scan the tables with just AccessShareLock to see if the existing rows meet the constraint, and if so finally mark the constraint valid. Meanwhile the constraint would be enforced against newly-added rows by the triggers, so nothing gets missed. You'd still get a small hiccup in system performance from the transient exclusive lock, but nothing like as bad as it is now. Would that solve your problem? That's good, but it doesn't solve the original user complaint about needing to re-run many, many large queries to which we already know the answer. just a guess, but maybe create FK concurrently feature combined with synchronized scan feature _does_ resolve original problem. if you run many create FK concurrently one after another, wouldn't the seq scan be reused? -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers -- Filip Rembiałkowski
Re: [HACKERS] pg_dump restore time and Foreign Keys
Simon Riggs wrote: But we don't know it for dead sure, we only think we do. What if the data for one or other of the tables is corrupted? We'll end up with data we believe is consistent but in fact is not, ISTM. If you can somehow guarantee the integrity of data in both tables then we might be justified in assuming that the FK constraint will be consistent - that's why I suggested some sort of checksum mechanism might serve the purpose. Agreed. Can we get COPY to output the checksum of its output as part of the command tag? How else can we return the checksum? In $file.cksum for any given output file? It seems a reasonable idea to use the command tag, unless that's going to break lots of stuff. I think the only thing we can usefully checksum is the output lines in the client encoding. We can then use an explicit checksum option in the COPY when we reload, with CHECKSUM option. We need rather more than this to make sure your facility isn't abused. That's the part that I haven't been able to think of a good answer for (yet). cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
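The mechanics Andrew describes (checksumming the COPY output lines in the client encoding) can be sketched roughly as below. This is a toy illustration only (a plain running CRC32 over the output lines, with invented names), not a proposed COPY wire format:

```python
import zlib

def copy_checksum(lines, encoding="utf-8"):
    """Running CRC32 over COPY text-format output lines, as seen in the client encoding."""
    crc = 0
    for line in lines:
        crc = zlib.crc32((line + "\n").encode(encoding), crc)
    return crc & 0xFFFFFFFF

dump = ["1\tAlice", "2\tBob"]
assert copy_checksum(dump) == copy_checksum(["1\tAlice", "2\tBob"])  # same data, same sum
assert copy_checksum(dump) != copy_checksum(["1\tAlice", "2\tB0b"])  # one flipped byte is caught
```

Checksumming the encoded output lines rather than on-disk bytes is what makes the value reproducible at reload time, which is the property the proposed CHECKSUM option would need.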
[HACKERS] Proposal: GiST constraints
I would like to consider adding constraints to GiST indexes. I think it is possible to add constraints that are more sophisticated than just UNIQUE. My use case is a non-overlapping constraint, but I think it's possible for this to be more general. The idea is to make an array in shared memory with size equal to max_connections. Each element of the array would hold the oid of an index, and a tid. When inserting into a GiST index, take out a lock on the array, and scan it for other tuples being concurrently inserted into the same index, and execute some function to determine if any tuples conflict (using the tid stored in the array). Then, check the visibility of the tuple at that tid. If the conflicting tuple is live, release the lock on the array and abort. If the tuple is dead, set the array entry to be invalid, make an entry in the array, and release the lock. If the tuple has an xmin or xmax that's still in progress, release the lock on the array, block waiting on the appropriate xid, and then try again. If no conflicts exist in the array, make an entry in the array, and release the lock. Then, search the GiST index using the same function to determine if conflicts exist in the index. If conflicts exist in the index, check the visibility information for the tuple and proceed, wait or abort (in the same way as above). If no conflicts exist, insert. This should work fine for multi-column indexes where the constraints for each column are different. For instance, unique and non-overlapping could be mixed. I spoke about this idea with several people at EAST and PGCon. In particular, Teodor had the good idea to store the tid in the array, rather than the value, to make the idea more general to types of different sizes. Thoughts, ideas? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
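As a toy, single-process model of the proposed shared-memory array (ignoring the visibility checks and the wait-on-xid cases, and with all names invented), the conflict-scan step might look like:

```python
import threading

# Toy model of the proposed array: one slot per backend, each holding
# (index_oid, tid) for an insertion that is still in flight.
class InsertArray:
    def __init__(self, max_connections):
        self.lock = threading.Lock()
        self.slots = [None] * max_connections

    def try_insert(self, backend_id, index_oid, value, conflicts, heap):
        """Scan the array for concurrent conflicting inserts; on success,
        record our own entry. Visibility checks and waiting are omitted."""
        with self.lock:
            for slot in self.slots:
                if slot and slot[0] == index_oid and conflicts(heap[slot[1]], value):
                    return False  # a live concurrent insert conflicts: wait or abort
            heap.append(value)  # the tid is just the heap position in this toy
            self.slots[backend_id] = (index_oid, len(heap) - 1)
            return True

# A non-overlapping constraint on intervals: "conflict" means the ranges overlap.
overlaps = lambda a, b: a[0] < b[1] and b[0] < a[1]

heap = []
arr = InsertArray(4)
assert arr.try_insert(0, 1234, (1, 5), overlaps, heap)
assert not arr.try_insert(1, 1234, (4, 6), overlaps, heap)  # overlaps (1, 5)
assert arr.try_insert(1, 1234, (5, 9), overlaps, heap)      # touching, not overlapping
```

Storing the tid rather than the value, as Teodor suggested, is what keeps the array entries fixed-size regardless of the datatype.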
[HACKERS] proposal: new contrib module - session variables
Hello all, session variables have been a missing feature for a long time. Currently nobody is working on a package or SQL/PSM module implementation, so there is a gap. The best-known solutions at present are session variables based on plperl hash tables or Joe Conway's C implementation. I propose including Joe's code in the contrib set: http://www.joeconway.com/sessfunc.tar.gz It's important mainly for Windows users, who have problems compiling it themselves. Objections? Regards Pavel Stehule -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Strange issue with GiST index scan taking far too long
On Mon, 2008-06-09 at 11:49 -0400, Tom Lane wrote: Mark Cave-Ayland [EMAIL PROTECTED] writes: So by adding in an extra function around the subselect result, we have speeded up the index lookup by several orders of magnitude, and the speedup appears to be coming from somewhere within the index scan?! Is the value you are fetching from the geography table large enough to be toasted? I'm thinking you might be looking at the cost of repeated de-toastings. So you are saying it is de-toasted 32880 times, in this case? If not, where are the repeated de-toastings happening? Sounds like we need some stats on how well toast is working for us. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump restore time and Foreign Keys
On Monday 09 June 2008 11:59:27 Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: On Mon, 2008-06-09 at 11:33 -0400, Tom Lane wrote: No, we are running a large query to which the user *thinks* he knows the answer. There are any number of reasons why he might be wrong. Of course. I should have said to which we already know the answer to indicate I'm passing on others' criticisms of us. [ shrug... ] We don't know the answer either, and anyone who says we do is merely betraying his ignorance of the number of ways to load a foot-gun. I think the more realistic scenario (based on the FK idea) is that you want to prevent any future rows from coming without validating the FK, and you're willing to clean up any violators after the fact, since you can make that an out of the critical path operation. if you extend this to a more general create constraint concurrently (to handle normal constraint, not null constraints, etc...), it would certainly be a big win, and i think most would see it as a reasonable compromise. -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: new contrib module - session variables
Pavel Stehule wrote: Hello all, session variables are missing feature long time. Currently nobody works on package or SQL/PSM module implementation so there is some gap.Currently best know solution are session variables based on plper hash tables or Joe Conway's C implementation. I propose include Joe's code into contrib set http://www.joeconway.com/sessfunc.tar.gz It's important mainly for window's users, that has problems with own compilation. Objections? The URL fails, so I can't comment on Joe's work. But I seriously doubt that any contrib module is really the right way to go about doing session variables. If we're going to have them then they need to be properly builtin, and available to all PLs. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Strange issue with GiST index scan taking far too long
Simon Riggs [EMAIL PROTECTED] writes: On Mon, 2008-06-09 at 11:49 -0400, Tom Lane wrote: Is the value you are fetching from the geography table large enough to be toasted? I'm thinking you might be looking at the cost of repeated de-toastings. So you are saying it is de-toasted 32880 times, in this case? If not, where are the repeated de-toastings happening? Inside the index support functions. I'm thinking we could fix this by forcibly detoasting values passed as index scan keys, but it's not quite clear where's the best place to do that. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal: GiST constraints
Jeff Davis [EMAIL PROTECTED] writes: I would like to consider adding constraints to GiST indexes. I think it is possible to add constraints that are more sophisticated than just UNIQUE. My use case is a non-overlapping constraint, but I think it's possible for this to be more general. I would like to see something that replaces the current btree-only kluge for UNIQUE, if we're going to try to do something general. IOW, don't think of this as GiST-specific. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: new contrib module - session variables
2008/6/9 Andrew Dunstan [EMAIL PROTECTED]: Pavel Stehule wrote: Hello all, session variables are missing feature long time. Currently nobody works on package or SQL/PSM module implementation so there is some gap.Currently best know solution are session variables based on plper hash tables or Joe Conway's C implementation. I propose include Joe's code into contrib set http://www.joeconway.com/sessfunc.tar.gz It's important mainly for window's users, that has problems with own compilation. Objections? The URL fails, so I can't comment on Joe's work. But I seriously doubt that any contrib module is really the right way to go about doing session variables. If we're going to have them then they need to be properly builtin, and available to all PLs. probably Joe's server is out currently :(. Joe's implementation was simply - hash table stored in session context. Pavel cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal - improve eqsel estimates by including histogram bucket numdistinct statistics
Your argument seems to consider only columns having a normal distribution. My example was based upon normally distributed data because people usually know what they are and they are reasonably common. How badly does it fall apart for non-normal distributions? This should work to the extent that there is a correlation between the number of distinct values in a histogram bin and the 'size' of the bin. I know that this isn't a very good answer, but it's a hard problem in the general case. However, for any non-uniform probability distribution there exists a measure under which there is a non-zero correlation between these. I wrote up a hand-wavie semi-formal argument at the end, but it's not tough to see intuitively. Just think about the graph of the cdf. The histogram boundaries are horizontal lines that are equally spaced from top to bottom. If we rescale the x-axis st there is a fixed distance between each distinct value, and define the distance as ( ndistinct in a given interval)/(total ndistinct) then the only place where this doesn't hold is when the CDF is f(x) = x, aka the dist is uniform. And even then we just get a 0 coefficient, which is exactly what we are always assuming now. Obviously we run into problems when a) we have a poor estimate for ndistinct - but then we have worse problems b) our length measure doesn't correspond well with ndistinct in an interval expanding on b)... your mention of Zipfian distributions is particularly good example of where this could be poor. Right now ( someone correct me if I'm wrong ) words are sorted alphabetically. However, if we wanted this estimator to do a good job, we would sort them by their length or, better yet, frequency in the english language ( which is highly correlated with length ). (For instance, Zipfian distributions seem to be pretty common in database work, from what I've seen.) This should work well for any power law distribution. 
If people are curious, I can rerun my previous example using a power law distribution instead of normal. However, the easy way of thinking about all of this is that we're building a linear model between ndistinct and histogram bucket width. It's intuitive to expect there to be a correlation between the two, and so the model should have some predictive power. -Nathan somewhat formal - probably will be difficult to read without some basic probability theory To see this, first note that we can alternatively define the uniform distribution on [a,b] as the distribution whose CDF is a straight line that passes through both (a,0) and (b,1) ( just integrate the PDF ). So any non-uniform distribution will have a CDF with slope that is both below and above 1/(b-a) at some set of points, implying the existence of an interval [i1, i2] s.t. ( CDF(i2) - CDF(i1) ) > ( i2 - i1 )/(b-a). Then, under the constraints of the probability measure, there exists a second disjoint interval s.t. ( CDF(i2') - CDF(i1') ) < ( i2' - i1' )/(b-a). In other words, the CDF rises faster than uniform over one interval and slower over another. Next, assume that the number of potential distinct values in our interval scales linearly with the length of the interval. Although it seems as if this assumption could be somewhat burdensome, there always exists a measure under which this is true for sets with a defined order relation. ( As remarked earlier by Tom, we are already making this assumption ). To see this, consider defining length(i1, i2) as ( the number of distinct values in [i1, i2] )/( total num distinct values ), where the number of distinct values is the size of the set { v | v >= i1 and v <= i2 }. Next, note that the joint distribution of identically distributed, independent random variables is multinomial with cell probabilities given by the value of the pdf at each distinct point. Next, I'll state without proof that for an IID RV the expected number of distinct values is maximized for a uniform distribution ( this is pretty easy to see: think about the binomial case.
Do you want your cell probabilities to be ( 1.0, 0 ) or ( 0.5, 0.5 )? ) Finally, note that the number of expected distinct values decreases faster than linearly in the length of the interval. This is pretty clear when we consider the sparse case. As the number of potential entries ( in this case, the interval length ) approaches infinity, the probability of a new entry being distinct approaches 1. This means that, in this limit, every new entry ends up being distinct, aka the number of distinct values scales linearly in the number of new entries. As the interval shrinks, new entries have some probability of being repeats. As the interval shrinks to 0, there is a zero probability of new entries being unique. Since a) there doesn't exist a linear relationship that contains the two boundary points, b) the multinomial distribution of the PDF is continuous, and c) the relationship is clearly decreasing, we can surmise that it is sub-linear. Therefore, we have two intervals that have sub- and super-linear slopes that cancel one another. However,
Re: [HACKERS] pg_dump restore time and Foreign Keys
On Mon, 2008-06-09 at 12:37 -0400, Robert Treat wrote: On Monday 09 June 2008 11:59:27 Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: On Mon, 2008-06-09 at 11:33 -0400, Tom Lane wrote: No, we are running a large query to which the user *thinks* he knows the answer. There are any number of reasons why he might be wrong. Of course. I should have said to which we already know the answer to indicate I'm passing on others' criticisms of us. [ shrug... ] We don't know the answer either, and anyone who says we do is merely betraying his ignorance of the number of ways to load a foot-gun. I think the more realistic scenario (based on the FK idea) is that you want to prevent any future rows from coming without validating the FK, and you're willing to clean up any violators after the fact, since you can make that an out of the critical path operation. if you extend this to a more general create constraint concurrently (to handle normal constraint, not null constraints, etc...), it would certainly be a big win, and i think most would see it as a reasonable compromise. Agreed. I think the out of the critical path action is more likely to be the intended path rather than the never check at all route. If we break down the action into two parts: ALTER TABLE ... ADD CONSTRAINT foo FOREIGN KEY ... NOVALIDATE; which holds an exclusive lock, but only momentarily. After this runs, any new data is validated at the moment of data change, but the older data has yet to be validated. ALTER TABLE ... VALIDATE CONSTRAINT foo which runs the lengthy check, though only grabs a lock as the last part of the action. This way we have the ability to add them concurrently if we choose by running one after the other, or we can run the first part only for now and run the other one at a more convenient moment. A full set of checks on a large complex database can easily take hours or even days. We should allow this. It's not a footgun, it's an honest attempt by people to add RI checks to their database.
The only other alternative for some people is to not add FKs at all, which is also a footgun, but we don't seem bothered that they might take that option. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal: GiST constraints
On Mon, 2008-06-09 at 13:28 -0400, Tom Lane wrote: Jeff Davis [EMAIL PROTECTED] writes: I would like to consider adding constraints to GiST indexes. I think it is possible to add constraints that are more sophisticated than just UNIQUE. My use case is a non-overlapping constraint, but I think it's possible for this to be more general. I would like to see something that replaces the current btree-only kluge for UNIQUE, if we're going to try to do something general. IOW, don't think of this as GiST-specific. I was concerned that the BTree kludge would outperform what I am suggesting for the case of UNIQUE, and might therefore still be necessary. My proposal requires an extra index lookup, because in GiST a conflicting tuple isn't necessarily found near the place it might be inserted. Maybe that cost is not too high, because for the case of BTree UNIQUE it would just be accessing the same pages twice (once to test for conflicts, and once to insert). I'm not sure exactly what you have in mind when you say kludge. My proposal doesn't solve the problem of update foo set a = a + 1, in which the UNIQUE constraint may fail when it should succeed. I don't see how that problem can be solved without deferring the constraint checking until the end of the statement, which sounds costly. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
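The transient-violation problem Jeff mentions with "update foo set a = a + 1" under an immediately-checked unique constraint can be seen in a toy model (invented names; the row visit order is what makes it fail or succeed):

```python
# Toy model: a unique column updated with "a = a + 1", constraint checked per row.
def update_immediate(values):
    s = set(values)
    for v in sorted(values):  # ascending visit order makes 1 -> 2 collide with the old 2
        s.discard(v)
        if v + 1 in s:
            raise ValueError(f"duplicate key value {v + 1}")
        s.add(v + 1)
    return s

assert update_immediate([10]) == {11}          # no neighbors, succeeds

try:
    update_immediate([1, 2, 3])                # 1 -> 2 hits the not-yet-updated 2
except ValueError:
    pass
else:
    raise AssertionError("expected a transient duplicate failure")

# Checking at end of statement would see the consistent final state:
assert {v + 1 for v in [1, 2, 3]} == {2, 3, 4}
```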
Re: [HACKERS] proposal: new contrib module - session variables
Pavel Stehule wrote: The URL fails, so I can't comment on Joe's work. But I seriously doubt that any contrib module is really the right way to go about doing session variables. If we're going to have them then they need to be properly builtin, and available to all PLs. probably Joe's server is out currently :(. Joe's implementation was simply - hash table stored in session context. Sorry -- I forgot to restart apache after upgrading my kernel yesterday :-( I don't have a strong opinion either way regarding including this in contrib or elsewhere, but if there is sufficient interest I'll find the time to update it for cvs head as well as ensure it represents the latest and greatest that I have available. I've used and improved it in minor ways since that tarball was put up. Joe -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] math error or rounding problem Money type
Gregory Stark wrote: Mark Kirkwood [EMAIL PROTECTED] writes: AFAIK (dimly recalling numerical analysis courses at university) SUM and ROUND can *never* be commuted. In general the recommended approach is to round as late as possible and as few times as possible - so your 1st query is the correct or best way to go. Justin, isn't your problem related precisely to what Tom said? Now, when you're casting to Money, you're doing a cast like original_type -> text -> money (that's from your trailbalance view). I suspect the original_type is NUMERIC (and I think it's a very good type to keep your monetary data in). My guess: what happens is that you have numbers with more than 2 fractional digits in your original table, and they're kept as NUMERIC values. If you round them to the 2nd fractional digit *before* summing them up, you can indeed get different results from what you'd get if you'd rounded them *after* doing the sum. Compare: =# select round(0.004 + 0.004, 2) ; round -- 0.01 (1 row) =# select round(0.004, 2) + round(0.004, 2) ; ?column? -- 0.00 Do you see what (could've) happened? The first query is computed like this: round(0.004 + 0.004, 2) = round(0.008, 2) = 0.01 whereas the second one is more like: round(0.004, 2) + round(0.004, 2) = 0.00 + 0.00 = 0.00 Fractional parts that have been thrown away by the rounding may, when added up, become significant when you're calculating the rounded value of the sum. So yes, probably the way to go is to do *all* computations in NUMERIC and only cast when you're about to generate a report or present the data to the end user. Otherwise you risk losing some cents like that (and you need to be aware that a cast to MONEY *is* in fact a truncation, and you will not get mathematically correct results).
Cheers, Jan -- Jan Urbanski GPG key ID: E583D7D2 ouden estin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
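Jan's rounding-order point can be reproduced with Python's decimal module as a stand-in for NUMERIC (the two-digit quantize below mimics a cast to a two-decimal money type; this is an illustration, not libpq or backend code):

```python
from decimal import Decimal, ROUND_HALF_UP

def money(x):
    """Round to two fractional digits, mimicking a cast to a two-decimal money type."""
    return x.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

vals = [Decimal("0.004"), Decimal("0.004")]

assert money(sum(vals)) == Decimal("0.01")             # round after summing
assert sum(money(v) for v in vals) == Decimal("0.00")  # round first, lose the cents
```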
Re: [HACKERS] pg_dump restore time and Foreign Keys
Simon Riggs wrote: If we break down the action into two parts. ALTER TABLE ... ADD CONSTRAINT foo FOREIGN KEY ... NOVALIDATE; which holds exclusive lock, but only momentarily After this runs any new data is validated at moment of data change, but the older data has yet to be validated. ALTER TABLE ... VALIDATE CONSTRAINT foo which runs lengthy check, though only grabs lock as last part of action The problem I see with this approach in general (two-phase FK creation) is that you have to keep the same transaction for the first and second command, but you really want concurrent backends to see the tuple for the not-yet-validated constraint row. Another benefit that could arise from this is that the hypothetical VALIDATE CONSTRAINT step could validate more than one constraint at a time, possibly processing all the constraints with a single table scan. Perhaps VALIDATE CONSTRAINT could be handled as an automatic commit-time action. -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Tom, Actually, the reason it's still 10 is that the effort expended to get it changed has been *ZERO*. I keep asking for someone to make some measurements, do some benchmarking, anything to make a plausible case for a specific higher value as being a reasonable place to set it. The silence has been deafening. Not surprising really. It is a simple adjustment to make and it is also easy to spot when it's a problem. However it is not trivial to test for (in terms of time and effort). I know 10 is wrong and so do you. If you don't, I am curious why I see so many posts from you saying, "Your estimates are off, what is your default_statistics_target?" with yet even more responses saying, "Uhh, 10." I tried (back in 7.4) to do some systematic testing of this. The problem is that the cases where higher d_s_t are required are specifically ones with complex, unbalanced data distributions and/or very large databases. This makes test cases extremely difficult and time-consuming to generate; further, I found that the test cases I had from my clients' databases were not portable (in addition to being confidential). Also, I'd actually assert that 10 seems to be perfectly adequate for the majority of users. That is, the number of users where I've recommended increasing d_s_t for the whole database is smaller than the number where I don't, and of course we never hear from most users at all. So I'm pretty happy recommending: "Leave the default. If you encounter problem queries, increase it to 100, and analyse the database. If you're running a data warehouse, increase it to 1000." Where analyze does systematically fall down is with databases over 500GB in size, but that's not a function of d_s_t but rather of our tiny sample size. --Josh -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] handling TOAST tables in autovacuum
Alvaro Herrera wrote: Hi, We've been making noises about dealing with TOAST tables as separate entities in autovacuum for some time now. So here's a proposal: Maybe a dumb idea, but what about making a queue of toast pointers ready for vacuum, and removing those toast items directly from the toast table and index? Zdenek -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Josh Berkus [EMAIL PROTECTED] writes: Where analyze does systematically fall down is with databases over 500GB in size, but that's not a function of d_s_t but rather of our tiny sample size. Speak to the statisticians. Our sample size is calculated using the same theory behind polls which sample 600 people to learn what 250 million people are going to do on election day. You do NOT need (significantly) larger samples for larger populations. In fact where those polls have difficulty is the same place we have some problems. For *smaller* populations like individual congressional races you need to have nearly the same 600 sample for each of those small races. That adds up to a lot more than 600 total. In our case it means when queries cover a range much less than a whole bucket then the confidence interval increases too. Also, our estimates for n_distinct are very unreliable. The math behind sampling for statistics just doesn't work the same way for properties like n_distinct. For that Josh is right, we *would* need a sample size proportional to the whole data set which would practically require us to scan the whole table (and have a technique for summarizing the results in a nearly constant sized data structure). -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support! -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
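Stark's two points (a fixed sample size works for value frequencies regardless of table size, but not for n_distinct) can be illustrated with a quick simulation. This is a toy with invented names, not PostgreSQL's actual sampling code:

```python
import random

random.seed(42)  # deterministic for illustration

def estimate(population, sample_size=600):
    """Estimate the frequency of value 'a' and count distinct values in a sample."""
    sample = random.sample(population, sample_size)
    frac_a = sum(1 for v in sample if v == "a") / sample_size
    return frac_a, len(set(sample))

# Frequency of a common value: a 600-row sample is about as accurate for a
# 10k-row table as for a 1M-row table, which is the polling analogy.
small = ["a"] * 5_000 + ["b"] * 5_000
large = ["a"] * 500_000 + ["b"] * 500_000
assert abs(estimate(small)[0] - 0.5) < 0.1
assert abs(estimate(large)[0] - 0.5) < 0.1

# n_distinct: the same sample is nearly useless for a mostly-unique column.
unique_col = list(range(1_000_000))
_, seen = estimate(unique_col)
assert seen == 600  # sample sees 600 distinct values; extrapolating to 1M is guesswork
```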
Re: [HACKERS] pg_dump restore time and Foreign Keys
Alvaro Herrera [EMAIL PROTECTED] writes: The problem I see with this approach in general (two-phase FK creation) is that you have to keep the same transaction for the first and second command, but you really want concurrent backends to see the tuple for the not-yet-validated constraint row. Do you? It seems like having a constraint which is enforced on any new operations but which doesn't guarantee that existing records satisfy it is a useful feature in itself -- separating the two concepts "this property is true for all records" and "any action taken must leave the record with this property". ISTM you can validate an invalid constraint using any snapshot taken at any time >= the original snapshot. As long as the constraint is being enforced for all transactions which start after the validating snapshot's xmin then when it's done it can know the constraint is valid. Taking a lock on the table to create the constraint certainly leaves that property fulfilled. Actually it seems we could not take any lock and just check when it comes time to do the validation that the snapshot's xmin is >= the xmin on the constraint. I'm starting to get leery of all these tightly argued bits of logic though. Each one on its own is safe but the resulting system is getting to be quite complex. Another benefit that could arise from this is that the hypothetical VALIDATE CONSTRAINT step could validate more than one constraint at a time, possibly processing all the constraints with a single table scan. Interesting. Perhaps VALIDATE CONSTRAINT could be handled as an automatic commit-time action. I don't really like this, at least not as the only option, because as I said above and Robert Treat also said, it could be useful to have the constraint in place for new operations but check it for the existing data at some later date. (Or even never) -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's Slony Replication support! 
Re: [HACKERS] Overhauling GUCS
On 6/9/08, Gregory Stark [EMAIL PROTECTED] wrote: Josh Berkus [EMAIL PROTECTED] writes: Where analyze does systematically fall down is with databases over 500GB in size, but that's not a function of d_s_t but rather of our tiny sample size. n_distinct. For that Josh is right, we *would* need a sample size proportional to the whole data set which would practically require us to scan the whole table (and have a technique for summarizing the results in a nearly constant sized data structure). Hi, is this (summarizing results in a constant sized data structure) something which could be achieved by Bloom filters? http://archives.postgresql.org/pgsql-general/2008-06/msg00076.php Kind regards Hakan Kocaman
Re: [HACKERS] Overhauling GUCS
Hakan Kocaman [EMAIL PROTECTED] writes: On 6/9/08, Gregory Stark [EMAIL PROTECTED] wrote: n_distinct. For that Josh is right, we *would* need a sample size proportional to the whole data set which would practically require us to scan the whole table (and have a technique for summarizing the results in a nearly constant sized data structure). is this (summarizing results in a constant sized data structure) something which could be achieved by Bloom filters? Uhm, it would be a bit of a strange application of them but actually it seems to me that would be a possible approach. It would need a formula for estimating the number of distinct values given the number of bits set in the bloom filter. That should be a tractable combinatorics problem (in fact it's pretty similar to the combinatorics I posted a while back about getting all the drives in a raid array busy). And if you have a dynamic structure where the filter size grows then it would overestimate because extra copied bits would be set. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services!
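[Editor's note: the formula Greg asks for does exist. With an m-bit filter, k hash functions, and X bits set, the distinct count can be estimated as n ≈ -(m/k) * ln(1 - X/m). A self-contained sketch (hypothetical code, not anything in ANALYZE) using SHA-256 as the hash family:]

```python
import hashlib
import math

def bloom_distinct_estimate(values, m=1 << 16, k=4):
    """Insert values into an m-bit Bloom filter using k salted hashes,
    then invert the expected fill rate to estimate the distinct count:
    n ~= -(m/k) * ln(1 - X/m), where X is the number of set bits."""
    bits = bytearray(m // 8)
    for v in values:
        for i in range(k):
            # Salt the hash with the function index i to get k "independent" hashes.
            h = int.from_bytes(hashlib.sha256(f"{i}:{v}".encode()).digest()[:8], "big") % m
            bits[h // 8] |= 1 << (h % 8)
    x = sum(bin(b).count("1") for b in bits)
    return -(m / k) * math.log(1 - x / m)

# 10,000 rows containing only 1,000 distinct values: duplicates set no new
# bits, so the estimate tracks the distinct count, not the row count.
rows = [i % 1000 for i in range(10000)]
print(round(bloom_distinct_estimate(rows)))
```

This also makes Greg's closing caveat concrete: the inversion assumes a fixed m, so copying bits into a grown filter would inflate X and overestimate n.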
Re: [HACKERS] Overhauling GUCS
Greg, Speak to the statisticians. Our sample size is calculated using the same theory behind polls which sample 600 people to learn what 250 million people are going to do on election day. You do NOT need (significantly) larger samples for larger populations. Your analogy is bad. For elections, the voters have only a few choices. In a 300 million row table, there could be 300 million different values, and the histogram becomes less accurate for every order of magnitude smaller than 300 million it is. Also, our estimates for n_distinct are very unreliable. The math behind sampling for statistics just doesn't work the same way for properties like n_distinct. For that Josh is right, we *would* need a sample size proportional to the whole data set which would practically require us to scan the whole table (and have a technique for summarizing the results in a nearly constant sized data structure). Actually, a number of papers have shown block-based algorithms which can arrive at reasonably confident (between 50% and 250% of the actual value) estimates based on scanning only 5% of *blocks*. Simon did some work on this a couple years ago, but he and I had difficulty convincing -hackers that a genuine problem existed. You're correct that we'd need to change pg_statistic, though. For one thing, we need to separate the sample size from the histogram size. Also, we seem to be getting pretty far away from the original GUC discussion. -- --Josh Josh Berkus PostgreSQL @ Sun San Francisco
Re: [HACKERS] pg_dump restore time and Foreign Keys
On Mon, 2008-06-09 at 14:07 -0400, Alvaro Herrera wrote: Simon Riggs wrote: If we break down the action into two parts. ALTER TABLE ... ADD CONSTRAINT foo FOREIGN KEY ... NOVALIDATE; which holds exclusive lock, but only momentarily After this runs any new data is validated at moment of data change, but the older data has yet to be validated. ALTER TABLE ... VALIDATE CONSTRAINT foo which runs lengthy check, though only grabs lock as last part of action The problem I see with this approach in general (two-phase FK creation) is that you have to keep the same transaction for the first and second command, but you really want concurrent backends to see the tuple for the not-yet-validated constraint row. Well, they *must* be in separate transactions if we are to avoid holding an AccessExclusiveLock while we perform the check. Plus the whole idea is to perform the second part at some other non-critical time, though we all agree that never performing the check at all is foolhardy. Maybe we say that you can defer the check, but after a while autovacuum runs it for you if you haven't done so. It would certainly be useful to run the VALIDATE part as a background task with vacuum wait enabled. Another benefit that could arise from this is that the hypothetical VALIDATE CONSTRAINT step could validate more than one constraint at a time, possibly processing all the constraints with a single table scan. Good thought, though not as useful for FK checks. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
Re: [HACKERS] pg_dump restore time and Foreign Keys
Simon Riggs wrote: Maybe we say that you can defer the check, but after a while autovacuum runs it for you if you haven't done so. It would certainly be useful to run the VALIDATE part as a background task with vacuum wait enabled. It would be useful if there were anywhere to report the error to, or an action that could be taken automatically. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] Proposal: GiST constraints
Jeff Davis [EMAIL PROTECTED] writes: On Mon, 2008-06-09 at 13:28 -0400, Tom Lane wrote: I would like to see something that replaces the current btree-only kluge for UNIQUE, if we're going to try to do something general. IOW, don't think of this as GiST-specific. I'm not sure exactly what you have in mind when you say kludge. Well, there are at least three things not to like about the btree UNIQUE implementation: 1. It's btree-specific and can't be shared by other index AMs that might wish to implement constraints. 2. It involves the index AM reaching into the heap, which is at the least a serious failure of modularity. 3. There's no way to implement a deferred uniqueness check, nor even to handle the within-statement conflict problem. It looks to me like the same knocks can be laid on your proposal. Now admittedly I don't have a solution that addresses these objections (much less one that does it without losing any performance) but I'm hesitant to see us building new features in this area without any idea how we will fix these things --- especially #3, which is a SQL-spec violation as well as a frequent user complaint. I'd like to have at least a design plan for fixing these things, so we know whether we are painting ourselves (further) into a corner. regards, tom lane
Re: [HACKERS] Core team statement on replication in PostgreSQL
Gurjeet Singh wrote: On Fri, May 30, 2008 at 10:40 AM, Tom Lane [EMAIL PROTECTED] wrote: But since you mention it: one of the plausible answers for fixing the vacuum problem for read-only slaves is to have the slaves push an xmin back upstream to the master to prevent premature vacuuming. The current design of pg_standby is utterly incapable of handling that requirement. So there might be an implementation dependency there, depending on how we want to solve that problem. I think it would be best to not make the slave interfere with the master's operations; that's only going to increase the operational complexity of such a solution. There could be multiple slaves following a master, some serving For the slave to not interfere with the master at all, we would need to delay application of WAL files on each slave until visibility on that slave allows the WAL to be applied, but in that case we would have long-running transactions delay data visibility of all slave sessions. -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] Core team statement on replication in PostgreSQL
Andreas 'ads' Scherbaum wrote: On Fri, 30 May 2008 16:22:41 -0400 (EDT) Greg Smith wrote: On Fri, 30 May 2008, Andreas 'ads' Scherbaum wrote: Then you ship 16 MB of binary stuff every 30 seconds or every minute but you only have a few kbytes of real data in the logfile. Not if you use pg_clearxlogtail ( http://www.2ndquadrant.com/replication.htm ), which got lost in the giant March commitfest queue but should probably wander into contrib as part of 8.4. Yes, this topic was discussed several times in the past but to solve this it needs a patch/solution which is integrated into PG itself, not contrib. Agreed. I realize why we are not zeroing those bytes (for performance), but can't we have the archiver zero those bytes before calling the 'archive_command'? -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] Core team statement on replication in PostgreSQL
Bruce Momjian wrote: Agreed. I realize why we are not zeroing those bytes (for performance), but can't we have the archiver zero those bytes before calling the 'archive_command'? Perhaps make the zeroing user-settable. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] Core team statement on replication in PostgreSQL
Bruce Momjian [EMAIL PROTECTED] writes: Gurjeet Singh wrote: There could be multiple slaves following a master, some serving For the slave to not interfere with the master at all, we would need to delay application of WAL files on each slave until visibility on that slave allows the WAL to be applied, but in that case we would have long-running transactions delay data visibility of all slave sessions. Right, but you could segregate out long-running queries to one slave server that could be further behind than the others. regards, tom lane
Re: [HACKERS] Core team statement on replication in PostgreSQL
Bruce Momjian [EMAIL PROTECTED] writes: Agreed. I realize why we are not zeroing those bytes (for performance), but can't we have the archiver zero those bytes before calling the 'archive_command'? The archiver doesn't know any more about where the end-of-data is than the archive_command does. Moreover, the archiver doesn't know whether the archive_command cares. I think the separate module is a fine solution. It should also be pointed out that the whole thing becomes uninteresting if we get real-time log shipping implemented. So I see absolutely no point in spending time integrating pg_clearxlogtail now. regards, tom lane
Re: [HACKERS] Core team statement on replication in PostgreSQL
On Mon, 9 Jun 2008, Tom Lane wrote: It should also be pointed out that the whole thing becomes uninteresting if we get real-time log shipping implemented. So I see absolutely no point in spending time integrating pg_clearxlogtail now. There are remote replication scenarios over a WAN (mainly aimed at disaster recovery) that want to keep a fairly updated database without putting too much traffic over the link. People in that category really want zeroed tail+compressed archives, but probably not the extra overhead that comes with shipping smaller packets in a real-time implementation. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD
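[Editor's note: a small demonstration of why "zeroed tail + compressed archives" matters for the WAN case. Recycled WAL segments carry stale, effectively random bytes past the end of the real data; zeroing that tail (as pg_clearxlogtail does) lets the archive compress down to roughly the real payload. This is an illustrative sketch, not PostgreSQL code, and the 64 kB "used" figure is an arbitrary assumption.]

```python
import os
import zlib

SEG = 16 * 1024 * 1024   # a 16 MB WAL-segment-sized buffer
used = 64 * 1024         # assume only 64 kB of real WAL data was written

payload = os.urandom(used)                 # stand-in for the real WAL records
recycled = payload + os.urandom(SEG - used)  # stale tail from a recycled segment
zeroed = payload + bytes(SEG - used)         # pg_clearxlogtail-style zeroed tail

recycled_z = len(zlib.compress(recycled))
zeroed_z = len(zlib.compress(zeroed))

# Random bytes barely compress; zeros nearly vanish, so the shipped
# archive shrinks from roughly the segment size to roughly the payload size.
print(recycled_z, zeroed_z)
```

The same effect is why the contrib-vs-core debate above is really about where the zeroing happens, not whether compression helps.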
[HACKERS] a question about exec_simple_query()
Hi, everyone: In function exec_simple_query(), why do we run a simple query wrapped in a portal? For instance: version 8.3.0, postgres.c, line 908 /* * Create unnamed portal to run the query or queries in. If there * already is one, silently drop it. */ portal = CreatePortal("", true, true); ... In other words, what's the benefit of using a portal to run a simple query? Thanks for your help! :)
Re: [HACKERS] Core team statement on replication in PostgreSQL
Just for information. In terms of archive compression, I have an archive log compression tool, which can be found at http://pgfoundry.org/projects/pglesslog/ This feature is also included in NTT's synchronized log shipping replication presented at the last PGCon. 2008/6/10 Greg Smith [EMAIL PROTECTED]: On Mon, 9 Jun 2008, Tom Lane wrote: It should also be pointed out that the whole thing becomes uninteresting if we get real-time log shipping implemented. So I see absolutely no point in spending time integrating pg_clearxlogtail now. There are remote replication scenarios over a WAN (mainly aimed at disaster recovery) that want to keep a fairly updated database without putting too much traffic over the link. People in that category really want zeroed tail+compressed archives, but probably not the extra overhead that comes with shipping smaller packets in a real-time implementation. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD -- -- Koichi Suzuki
Re: [HACKERS] Proposal: GiST constraints
On Mon, 2008-06-09 at 21:00 -0400, Tom Lane wrote: 1. It's btree-specific and can't be shared by other index AMs that might wish to implement constraints. This can be solved by my proposal, but I just don't know how it would apply to something like GIN, for instance. It could replace the unique constraint for BTree, but I'm not sure it would perform as well. It's not that my proposal is GiST-specific, it's just that that's the only use case I can think of that is an improvement. 2. It involves the index AM reaching into the heap, which is at the least a serious failure of modularity. We need to reach into the heap for visibility information, if we're to implement any constraints at all. Also, we have to test against values being inserted by other concurrent transactions, and those values can be variable in size. What other mechanism do we have to share those variable-sized values among several backends? 3. There's no way to implement a deferred uniqueness check, nor even to handle the within-statement conflict problem. This is the big one. Now admittedly I don't have a solution that addresses these objections (much less one that does it without losing any performance) but I'm hesitant to see us building new features in this area without any idea how we will fix these things --- especially #3, which is a SQL-spec violation as well as a frequent user complaint. I'd like to have at least a design plan for fixing these things, so we know whether we are painting ourselves (further) into a corner. I'll see if I can come up with something. I agree that's an important problem to solve. Does anyone know how other DBMSs do this? I found this thread from the TODO: http://archives.postgresql.org/pgsql-hackers/2006-09/msg01458.php Regards, Jeff Davis
Re: [HACKERS] a question about exec_simple_query()
=?GB2312?Q?=CD=F4=E7=F9?= [EMAIL PROTECTED] writes: In other words, what's the benefit of using a portal to run a simple query? er, because it doesn't work otherwise? regards, tom lane