Re: [HACKERS] machine-readable explain output v4
On Tue, 11 Aug 2009, Mike wrote:
> Have any tool authors stepped up and committed resources to utilizing this feature in the near term?

Even before the easier-to-read format was available, there were already multiple EXPLAIN analysis tools floating around, some of them web-based like you're considering; a list is at http://wiki.postgresql.org/wiki/Using_EXPLAIN  You might expect some of those tool authors to do the appropriate overhaul to import the new format data, and perhaps make things more powerful or simpler in the process. You might want to collaborate with someone writing one of those existing applications rather than start over on your own.

> The reason I would like to provide this tool in a web-based form is that no additional software installation would be necessary for the user, reducing any hurdles to using it to zero.

I personally hate having only a web-based tool for this style of application, because I'm always dealing with EXPLAIN output I can't paste into somebody else's site--that's a common hurdle that's impossible to clear given all the regulatory and business-secret restrictions people work under nowadays. Even when the source code is available for the web app, that puts you back to needing to install the tool locally, and I've found web apps tend to be more complicated to get running than a typical standalone app.

--
* Greg Smith  gsm...@gregsmith.com  http://www.gregsmith.com  Baltimore, MD
Re: [HACKERS] Hot standby and synchronous replication status
On Tue, 11 Aug 2009, Dimitri Fontaine wrote:
> We should somehow provide a default archive and restore command integrated into the main product, so that it's as easy as turning it 'on' in the configuration for users to have something trustworthy: PostgreSQL will keep past logs in a pg_xlog/archives subdir or some other default place, and will know about the setup at startup time when/if needed.

Wandering a little off topic here, because this plan reminded me of something else I've been meaning to improve... While most use cases require some sort of network transport for this to be useful, there is one obvious situation where it would be great to have a ready-to-roll setup by default. Right now, if people want to make a filesystem-level backup of their database, they first have to grapple with setting up the archive command to do so. If the system were shipped in a way that made that trivial to activate, perhaps using something like what you describe here, that would reduce the complaints that PostgreSQL doesn't have any easy way to grab a filesystem hot copy of the database. Those rightly pop up sometimes, and it would be great if the procedure were reduced to:

1) Enable archiving
2) pg_start_backup
3) rsync/tar/cpio/copy/etc.
4) pg_stop_backup
5) Disable archiving

because the default archive_command was something that supported a filesystem snapshot using a standard layout.

--
* Greg Smith  gsm...@gregsmith.com  http://www.gregsmith.com  Baltimore, MD
[HACKERS] Geometry RESTRICT and JOIN
I'm trying to add all the box op point operators. The C routines are written and working as advertised. The manual's descriptions of the RESTRICT and JOIN clauses of CREATE OPERATOR don't seem too clear. Are these samples correct, or am I totally off base here?

CREATE OPERATOR (
    LEFTARG   = box,
    RIGHTARG  = point,
    PROCEDURE = leftof,
    RESTRICT  = scalarltsel,    -- ?? UNSURE
    JOIN      = positionjoinsel -- ?? UNCLEAR
);

CREATE OPERATOR (
    LEFTARG   = box,
    RIGHTARG  = point,
    PROCEDURE = notleft,
    RESTRICT  = scalargtsel,    -- ?? UNSURE
    JOIN      = positionjoinsel -- ?? UNCLEAR
);

CREATE OPERATOR @ (
    LEFTARG   = box,
    RIGHTARG  = point,
    PROCEDURE = contains,
    RESTRICT  = eqsel,          -- ?? UNSURE
    JOIN      = contjoinsel     -- ?? UNCLEAR
);

...etc...
Re: [HACKERS] Alpha 1 release notes
Hi,

Josh Berkus j...@agliodbs.com writes:
> Will do. Teaching myself RST now

I've been doing a lot of RST editing before, and found it pretty straightforward. Except for default table handling, where ascii-art maintenance is a pain, or you have to use extended tools, like emacs table mode and such.

Or use list-table and rejoice :)
http://docutils.sourceforge.net/docs/ref/rst/directives.html#list-table

Regards,
--
dim
Re: [HACKERS] Alpha 1 release notes
Within "source code, build options" there is:

- Reserve the shared memory region during backend startup on Windows, so that memory allocated by starting third-party DLLs doesn't end up conflicting with it. Hopefully this solves the long-time issue with "could not reattach to shared memory" errors on Win32.

I suggest that it should also be pointed out that this fix will be backported to 8.3 and 8.4 (as much as I followed the ML), similar to the information at:

- Fast shutdown stop should forcibly disconnect any active backends, even if a smart shutdown is already in progress. Backpatched to 8.3.

best wishes

Harald

--
GHUM Harald Massa
persuadere et programmare
Harald Armin Massa
Spielberger Straße 49
70435 Stuttgart
0173/9409607
no fx, no carrier pigeon
-
LASIK good, steroids bad?
Re: [HACKERS] COPY speedup
>> But when I see a big red button, I just press it to see what happens. Ugly hacks are useful to know how fast the thing can go; then the interesting part is to reimplement it cleanly, trying to reach the same performance...
>
> Right -- now that you've shown a 6x speedup, it is clear that it makes sense to attempt a reimplementation. It also means it makes sense to have an additional pair or two of input/output functions.

Okay. Here are some numbers. The tables are the same as in the previous email, and they also contain the same results as "copy patch 4", aka the API hack, for reference. I benchmarked these:

* p5 = no API changes, COPY TO optimized:
  - Optimizations in COPY (fast buffer, many fewer fwrite() calls, etc.) remain.
  - SendFunction API reverted to original state (actually, the API changes are still there, but deactivated: fcinfo->context = NULL).
  => small performance gain; of course the lower per-row overhead is more visible on test_one_int, because that table has 1 column.
  => the (still huge) distance between p5 and the API hack is split between overhead in pq_send* + stringInfo (which we will tackle below) and palloc() overhead (which the API hack removed by passing the destination buffer directly).

* p6 = p5 + optimization of pq_send*
  - inlining strategic functions
  - probably benefits many other code paths
  => small incremental performance gain

* p7 = p6 + optimization of StringInfo
  - inlining strategic functions
  - probably benefits many other code paths
  => small incremental performance gain (they start to add up nicely)

* p8 = p7 + optimization of palloc()
  - actually this is extremely dumb:
  - int4send and int2send simply palloc() 16 bytes instead of 1024...
  - the initial size of the allocset is 64K instead of 8K
  => still it has interesting results...

The three patches above are quite simple (especially the inlines) and yet the speedup is already nice.

* p9 = p8 + monstrously ugly hack: COPY looks at the sendfunc, notices it's int{2,4}send, and replaces it with int{2,4}fastsend, which is called directly from C, bypassing the fmgr (urrrgghh). Of course it only works for ints. This gives information about fmgr overhead: fmgr is pretty damn fast.

* p10 = no copy: does everything except calling the SendFuncs; it writes dummy data instead. This gives the time used in everything except the SendFuncs: table scan, deform_tuple, file writes, etc., which is an interesting thing to know.
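To give an idea of what the p6/p7-style inlining amounts to, here is a minimal sketch (illustration only, not the actual patch: fast_sendint32 is an invented name, while StringInfo, enlargeStringInfo() and htonl() are real APIs) of appending a network-order int32 to a StringInfo without going through the generic pq_sendint()/appendBinaryStringInfo() call chain:

    #include "postgres.h"
    #include "lib/stringinfo.h"
    #include <arpa/inet.h>          /* htonl */

    static inline void
    fast_sendint32(StringInfo buf, int32 i)
    {
        uint32      n32 = htonl((uint32) i);

        /* grow the buffer only when actually needed */
        enlargeStringInfo(buf, sizeof(n32));
        memcpy(buf->data + buf->len, &n32, sizeof(n32));
        buf->len += sizeof(n32);
        buf->data[buf->len] = '\0'; /* StringInfo keeps a trailing NUL */
    }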
RESULTS :

COPY annonces TO '/dev/null' BINARY :

  Time  | Speedup |  Table |  KRows  | MTuples | Name
   (s)  |         |  MB/s  |   /s    |   /s    |
 -------+---------+--------+---------+---------+--------------------------------------
  2.149 | 2.60 x  | 151.57 |  192.40 |  7.50   | copy to patch 4
  3.055 | 1.83 x  | 106.64 |  135.37 |  5.28   | p8 = p7 + optimization of palloc()
  3.202 | 1.74 x  | 101.74 |  129.15 |  5.04   | p7 = p6 + optimization of StringInfo
  3.754 | 1.49 x  |  86.78 |  110.15 |  4.30   | p6 = p5 + optimization of pq_send*
  4.434 | 1.26 x  |  73.47 |   93.26 |  3.64   | p5 no api changes, COPY TO optimized
  5.579 |   ---   |  58.39 |   74.12 |  2.89   | compiled from source

COPY archive_data TO '/dev/null' BINARY :

  Time  | Speedup |  Table |  KRows  | MTuples | Name
   (s)  |         |  MB/s  |   /s    |   /s    |
 -------+---------+--------+---------+---------+--------------------------------------
  5.372 | 3.75 x  |  73.96 |  492.88 | 13.80   | copy to patch 4
  8.545 | 2.36 x  |  46.49 |  309.83 |  8.68   | p8 = p7 + optimization of palloc()
 10.229 | 1.97 x  |  38.84 |  258.82 |  7.25   | p7 = p6 + optimization of StringInfo
 12.869 | 1.57 x  |  30.87 |  205.73 |  5.76   | p6 = p5 + optimization of pq_send*
 15.559 | 1.30 x  |  25.54 |  170.16 |  4.76   | p5 no api changes, COPY TO optimized
 20.165 |   ---   |  19.70 |  131.29 |  3.68   | 8.4.0 / compiled from source

COPY test_one_int TO '/dev/null' BINARY :

  Time  | Speedup |  Table |  KRows  | MTuples | Name
   (s)  |         |  MB/s  |   /s    |   /s    |
 -------+---------+--------+---------+---------+--------------------------------------
  1.493 | 4.23 x  | 205.25 | 6699.22 |  6.70   | p10 no copy
  1.660 | 3.80 x  | 184.51 | 6022.33 |  6.02   | p9 monstrously ugly hack
  2.003 | 3.15 x  | 152.94 | 4991.87 |  4.99   | copy to patch 4
  2.803 | 2.25 x  | 109.32 | 3568.03 |  3.57   | p8 = p7 + optimization of palloc()
  2.976 | 2.12 x  | 102.94 | 3360.05 |  3.36   | p7 = p6 + optimization of StringInfo
  3.165 | 2.00 x  |  96.82 | 3160.05 |  3.16   | p6 = p5 + optimization of pq_send*
  3.698 | 1.71 x  |  82.86 | 2704.43 |  2.70   | p5 no api changes, COPY TO optimized
  6.318 |   ---   |  48.49 | 1582.85 |  1.58   | 8.4.0 / compiled from source

COPY test_many_ints TO '/dev/null' BINARY :

  Time  | Speedup |  Table |  KRows  | MTuples | Name
   (s)  |         |  MB/s  |   /s    |   /s    |
 -------+---------+--------+---------+---------+--------------------------------------
  1.007 | 8.80 x  | 127.23 |  993.34 |
[HACKERS] FDW-based dblink
Here is a proposal to integrate contrib/dblink and SQL/MED (foreign data wrappers). Dblink manages connections and transactions by itself at the moment, but there are some benefits to splitting the database connectors out into FDWs. Dblink would then use those multiple connectors. For example, we would be able to retrieve data from Oracle into PostgreSQL directly if we had an Oracle connector.

New syntax in SQL
-----------------
CREATE FOREIGN DATA WRAPPER postgres CONNECTOR pg_catalog.dblink_postgres;
or
CREATE FOREIGN DATA WRAPPER postgres OPTIONS (connector 'pg_catalog.dblink_postgres');

We don't have to modify gram.y if we take the latter syntax, but we need to modify the VALIDATORs to distinguish 'connector' from other options. The 'connector' option should not be passed as part of the connection string.

New interface in C
------------------
pg_catalog.dblink_postgres is a function that has the following prototype:

    Connection *connector(List *defElems);

The argument 'defElems' is the concatenated connection options from the FDW, server, and user mapping. Two new interfaces will also be introduced:

    interface Connection    /* represents PGconn */
    {
        void    disconnect(self);
        Cursor *open(self, query, fetchsize);   /* for SELECT */
        int64   exec(self, query);              /* for UPDATE, INSERT, DELETE */
        bool    transaction_command(self, type);
    }

    interface Cursor        /* represents PGresult and server-side cursor */
    {
        bool fetch(self, OUT values);
        void close(self);
    }

They have some methods implemented with function pointers. The benefit of using function pointers is that we only have to export one connector function in pg_proc. The Cursor interface represents both a result set and a server-side cursor. PostgreSQL has SQL-level cursors, but there are some databases that have protocol-level cursors. This abstraction layer is needed for dblink to handle connectors to other databases.

Other features
--------------
The present dblink is a thin wrapper around libpq, but some of my customers want automatic transaction management: remote transactions are committed with 2PC when the local transaction is committed. To achieve it, I think an on-commit trigger is needed, but that is hard to implement with the current infrastructure. (That is one of the reasons I proposed to merge dblink into core.)

Other considerations
--------------------
The proposed method is a SQL-based connector. There might be another approach -- a ScanKey-based connector. It is similar to the index access method interface (pg_am): it takes a relation id and scankeys instead of SQL text. The ScanKey-based approach would work better if we try to pass a WHERE clause down to an external database. However, I think we need the SQL-based interface in any case; ScanKeys would be converted to SQL and passed to the external database.

I have a prototype of the feature. I'd like to submit it for 8.5. Comments welcome.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
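Rendered as actual C declarations, the interfaces above might look roughly like this (a sketch of the proposal, not a finished header; the fetch() argument list and the ConnectorFn typedef name are guesses, not part of the proposal):

    #include "postgres.h"
    #include "nodes/pg_list.h"

    typedef struct Cursor Cursor;
    typedef struct Connection Connection;

    struct Cursor               /* represents PGresult and a server-side cursor */
    {
        /* fetch the next row into values[]; returns false at end of data */
        bool        (*fetch) (Cursor *self, Datum *values, bool *isnull);
        void        (*close) (Cursor *self);
    };

    struct Connection           /* represents PGconn */
    {
        void        (*disconnect) (Connection *self);
        /* for SELECT: opens a SQL-level or protocol-level cursor */
        Cursor     *(*open) (Connection *self, const char *query, int fetchsize);
        /* for UPDATE, INSERT, DELETE: returns the affected-row count */
        int64       (*exec) (Connection *self, const char *query);
        bool        (*transaction_command) (Connection *self, int type);
    };

    /* the one function exported through pg_proc, e.g. dblink_postgres */
    typedef Connection *(*ConnectorFn) (List *defElems);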
[HACKERS] trigger functions can only be called as triggers
Is there a reason why the function manager allows calling trigger functions outside of triggers and forces the PLs to catch this case themselves? Is there a case where calling trigger functions directly is useful?
Re: [HACKERS] FDW-based dblink
Itagaki Takahiro wrote:
> The present dblink is a thin wrapper around libpq, but some of my customers want automatic transaction management. Remote transactions are committed with 2PC when the local transaction is committed. To achieve it, I think an on-commit trigger is needed, but that is hard to implement with the current infrastructure. (That is one of the reasons I proposed to merge dblink into core.)

Quite aside from the requirement for an on-commit trigger, how exactly would you use 2PC with the remote database? When would you issue PREPARE TRANSACTION, and when COMMIT PREPARED? What if the local database crashes in between -- is the remote transaction left hanging in prepared state?

Making the remote transaction atomic with the local one is a lot harder than it may seem at first glance. It's doable, but I think you'll need to build a full-fledged transaction manager into dblink, or integrate with a 3rd-party one.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
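To make the ordering problem concrete in libpq terms (PQexec() and PQclear() are real libpq calls; the GID and the absent error handling are illustrative only):

    #include <libpq-fe.h>

    static void
    commit_with_remote(PGconn *remote)
    {
        PGresult   *res;

        /* at local pre-commit: prepare the remote transaction */
        res = PQexec(remote, "PREPARE TRANSACTION 'dblink_gid_42'");
        PQclear(res);

        /*
         * The local transaction commits here.  A crash at this point leaves
         * the remote side with an in-doubt prepared transaction that someone
         * must later resolve with COMMIT PREPARED or ROLLBACK PREPARED --
         * which is exactly the bookkeeping a transaction manager provides.
         */

        /* only after the local commit has succeeded: */
        res = PQexec(remote, "COMMIT PREPARED 'dblink_gid_42'");
        PQclear(res);
    }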
Re: [HACKERS] DECLARE doesn't set/reset sqlca after DECLARE cursor
On Wed, Aug 12, 2009 at 07:13:44PM +0200, Boszormenyi Zoltan wrote:
> a customer of ours complained about a behavioural difference ... The attached patch implements this. The only downside is that now DECLARE CURSOR cannot appear outside of a function, a change in test/preproc/variable.pgc reflects

DECLARE by definition is a declarative command and as such should be able to live outside a function.

Michael
--
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org
ICQ: 179140304, AIM/Yahoo/Skype: michaelmeskes, Jabber: mes...@jabber.org
Go VfL Borussia! Go SF 49ers! Use Debian GNU/Linux! Use PostgreSQL!
Re: [HACKERS] surprising trigger/foreign key interaction
Tom Lane wrote:
> Alvaro Herrera alvhe...@commandprompt.com writes:
>> However I'm guessing that what actually happens is that heap_update is returning HeapTupleSelfUpdated instead, which the code states as /* nothing to do */.
>
> Yeah.
>
>> I imagine this is so because of some old fiddling to get semantics just right for obscure corner cases, but it feels wrong nevertheless.
>
> I suspect it was reluctance to use the EvalPlanQual semantics (which are pretty bogus in their own way) for perfectly deterministic single-transaction cases.

Still, the current behaviour feels quite wrong, because even after the update the modified tuple still satisfies the WHERE clause of the DELETE, but it won't actually get deleted.

Stefan
Re: [HACKERS] surprising trigger/foreign key interaction
Tom Lane wrote:
> Alvaro Herrera alvhe...@commandprompt.com writes:
>> I imagine this is so because of some old fiddling to get semantics just right for obscure corner cases, but it feels wrong nevertheless.
>
> I suspect it was reluctance to use the EvalPlanQual semantics (which are pretty bogus in their own way) for perfectly deterministic single-transaction cases.

I suspect the FK trigger messing up the visibility is an obscure corner case too :-(

--
Alvaro Herrera    http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] Filtering dictionaries support and unaccent dictionary
On Tue, Aug 11, 2009 at 4:31 AM, Peter Eisentraut pete...@gmx.net wrote:
> On Tuesday 11 August 2009 08:28:24 Jaime Casanova wrote:
>> try to build the docs to see how to properly test this and seems like you have to teach contrib.sgml and bookindex.sgml about dict-unaccent... and when i did that i got this:
>>
>> openjade -wall -wno-unused-param -wno-empty -wfully-tagged -D . -c /usr/share/sgml/docbook/stylesheet/dsssl/modular/catalog -d stylesheet.dsl -t sgml -i output-html -V html-index postgres.sgml
>> openjade:dict-unaccent.sgml:48:1:E: non SGML character number 128
>> openjade:dict-unaccent.sgml:49:1:E: non SGML character number 129
>> openjade:dict-unaccent.sgml:50:1:E: non SGML character number 130
>> openjade:dict-unaccent.sgml:51:1:E: non SGML character number 131
>> openjade:dict-unaccent.sgml:52:1:E: non SGML character number 132
>> openjade:dict-unaccent.sgml:53:1:E: non SGML character number 133
>> openjade:dict-unaccent.sgml:54:1:E: non SGML character number 134
>> openjade:dict-unaccent.sgml:116:4:E: element B undefined
>> make: *** [HTML.index] Error 1
>> make: *** Se borra el archivo `HTML.index'
>
> You should escape the special characters, as well as the "b" that appears as part of the example output, using character entities (&amp; etc.).

Sounds like this patch needs a little bit of doc adjustment per the above and is then ready for committer?

...Robert
Re: [HACKERS] Alpha 1 release notes
Dimitri Fontaine wrote:
> Josh Berkus j...@agliodbs.com writes:
>> Will do. Teaching myself RST now
>
> I've been doing a lot of RST editing before, and found it pretty straightforward. Except for default table handling, where ascii-art maintenance is a pain, or you have to use extended tools, like emacs table mode and such.
>
> Or use list-table and rejoice :)
> http://docutils.sourceforge.net/docs/ref/rst/directives.html#list-table

Yeah, table handling in RST is pretty silly, particularly when you have to escape some character in a cell. I wonder if this format can be converted to SGML DocBook automatically.

--
Alvaro Herrera    http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] dependencies for generated header files
On Tue, Aug 11, 2009 at 9:56 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> Robert Haas robertmh...@gmail.com writes:
>> Given that the anum.h stuff is gone, "vastly" might be an overstatement. I'm pretty surprised to find out that people don't like the idea of having dependencies be correct from anywhere in the tree. Even if I'm the only developer who does partial builds, the cost seems to me to be next to nil, so I'm not quite sure what anyone gets out of rejecting this patch.
>
> It's not that having the dependencies be 100% up to date wouldn't be nice; it's that there's a limit to how much we're willing to uglify the Makefiles to have that. The makefiles need maintenance too, you know, and putting things far away from where they should be is not any better in the makefiles than it is in C code.

Well, I certainly agree that making a huge mess to address what is admittedly a corner case is not a good idea. But I also don't think this patch is all that messy. However, I guess we're getting to the point where we need to make a decision one way or the other so that we can close out this CommitFest.

> As far as I can tell, if you've used --enable-depend then things will get updated properly before you can ever attempt to run the code (ie, install a rebuilt postmaster). The only situation where you'd actually get an improvement from redoing the dependencies like this is where lack of an update to a derived file results in a compiler error/warning. But there aren't many such cases. The only one I can even think of offhand is lack of an fmgroids.h symbol for a newly-added function ... but we don't use F_XXX symbols enough to make that a convincing example. We've intentionally arranged things so that more-fragile cases like gram.h are not referenced outside their own directories.

Yes, that's definitely the best situation.

...Robert
Re: [HACKERS] FDW-based dblink
Itagaki Takahiro wrote:
> Two new interfaces will also be introduced:
>
>     interface Connection    /* represents PGconn */
>     {
>         void    disconnect(self);
>         Cursor *open(self, query, fetchsize);   /* for SELECT */
>         int64   exec(self, query);              /* for UPDATE, INSERT, DELETE */
>         bool    transaction_command(self, type);
>     }

It's not good to return int64 from exec(), because the query could have a RETURNING clause. (So it also needs a fetchsize.)

--
Alvaro Herrera    http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] Alpha 1 release notes
On Thursday 13 August 2009 17:07:38 Alvaro Herrera wrote:
> I wonder if this format can be converted to SGML DocBook automatically.

Yes, that's why I used it.
Re: [HACKERS] Alpha 1 release notes
"Massa, Harald Armin" c...@ghum.de writes:
> Within "source code, build options" there is:
>
> - Reserve the shared memory region during backend startup on Windows, so that memory allocated by starting third-party DLLs doesn't end up conflicting with it. Hopefully this solves the long-time issue with "could not reattach to shared memory" errors on Win32.
>
> I suggest that it should also be pointed out that this fix will be backported to 8.3 and 8.4 (as much as I followed the ML);

Normally, bug fixes that have been back-patched wouldn't be mentioned at all in a new major release's release notes. The implied base that we are comparing to in major-release notes is the end of the prior branch's updates. I'm not sure if this case should be an exception, or if we should have a different general rule for alpha releases. We'd like to get more testing on that fix, so I think it is reasonable to mention it for alpha1 --- but is that an exception specific to this bug fix, or does it indicate we want to handle bug fixes differently in general within alpha release notes?

In any case, it is not the function of the alpha release notes to discuss changes in earlier release branches. The reason the commit log points out the back-patch is to make it easier to extract the information when we prepare release notes for the back-branch updates.

regards, tom lane
Re: [HACKERS] trigger functions can only be called as triggers
Peter Eisentraut pete...@gmx.net writes:
> Is there a reason why the function manager allows calling trigger functions outside of triggers and forces the PLs to catch this case themselves? Is there a case where calling trigger functions directly is useful?

I think it's a matter of not wanting to slow down *all* function calls with an error check that's useless for most.

regards, tom lane
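For reference, the per-call check each PL implements today looks like this (CALLED_AS_TRIGGER is the real macro from commands/trigger.h; the handler name and error wording here are illustrative):

    #include "postgres.h"
    #include "fmgr.h"
    #include "commands/trigger.h"

    Datum
    plsample_call_handler(PG_FUNCTION_ARGS)
    {
        /* reject a direct call that did not come through trigger dispatch */
        if (!CALLED_AS_TRIGGER(fcinfo))
            ereport(ERROR,
                    (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                     errmsg("trigger functions can only be called as triggers")));

        /* ... normal trigger-data handling would follow ... */
        PG_RETURN_NULL();
    }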
Re: [HACKERS] schemapg.h
Alvaro Herrera alvhe...@commandprompt.com writes:
> Tom Lane escribió:
>> Indeed, and it fails to get rid of all the dull declarations :-(.
>
> Right. I don't think we're going to move forward if we only accept giant steps at a time, and we simultaneously reject patches that are too intrusive.

I'm okay with small steps as long as they're small steps in the right direction ;-). I'm not convinced that this script is the right direction.

>> I thought the idea was to generate all this stuff directly from the C struct declarations (plus some hardwired knowledge about the datatypes, comparable to what is in TypInfo in bootstrap.c already).
>
> Hmm, perhaps that's workable. I'll have a look around.

OK. It might be interesting to see if this can be unified somehow with what the bootstrap.c code does. (However, since that runs at initdb time, not during compilation, there may not be any reasonable way to unify the two.)

regards, tom lane
Re: [HACKERS] Filtering dictionaries support and unaccent dictionary
Peter, how do you write accented characters in SGML? Is it not allowed to write them as-is?

Oleg

On Tue, 11 Aug 2009, Peter Eisentraut wrote:
> On Tuesday 11 August 2009 08:28:24 Jaime Casanova wrote:
>> try to build the docs to see how to properly test this and seems like you have to teach contrib.sgml and bookindex.sgml about dict-unaccent... and when i did that i got this:
>>
>> openjade -wall -wno-unused-param -wno-empty -wfully-tagged -D . -c /usr/share/sgml/docbook/stylesheet/dsssl/modular/catalog -d stylesheet.dsl -t sgml -i output-html -V html-index postgres.sgml
>> openjade:dict-unaccent.sgml:48:1:E: non SGML character number 128
>> openjade:dict-unaccent.sgml:49:1:E: non SGML character number 129
>> openjade:dict-unaccent.sgml:50:1:E: non SGML character number 130
>> openjade:dict-unaccent.sgml:51:1:E: non SGML character number 131
>> openjade:dict-unaccent.sgml:52:1:E: non SGML character number 132
>> openjade:dict-unaccent.sgml:53:1:E: non SGML character number 133
>> openjade:dict-unaccent.sgml:54:1:E: non SGML character number 134
>> openjade:dict-unaccent.sgml:116:4:E: element B undefined
>> make: *** [HTML.index] Error 1
>> make: *** Se borra el archivo `HTML.index'
>
> You should escape the special characters, as well as the "b" that appears as part of the example output, using character entities (&amp; etc.).

Regards,
Oleg
_____
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
Re: [HACKERS] Geometry RESTRICT and JOIN
Paul Matthews p...@netspace.net.au writes:
> I'm trying to add all the box op point operators. The C routines are written and working as advertised. The manual's descriptions of the RESTRICT and JOIN clauses of CREATE OPERATOR don't seem too clear. Are these samples correct, or am I totally off base here?

Well, I'm pretty sure you don't want the scalar selectivity functions for any of these. IIRC the geometric selectivity functions already come in pairs, eg you should use contsel and contjoinsel for @.

regards, tom lane
Re: [HACKERS] Alpha 1 release notes
Tom Lane wrote:
> In any case, it is not the function of the alpha release notes to discuss changes in earlier release branches. The reason the commit log points out the back-patch is to make it easier to extract the information when we prepare release notes for the back-branch updates.

Hmm, isn't it enough to use cvs2cl --follow branch?

--
Alvaro Herrera    http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] Filtering dictionaries support and unaccent dictionary
Oleg Bartunov wrote:
> Peter, how do you write accented characters in SGML? Is it not allowed to write them as-is?

&aacute; for á, etc. You can't use characters that aren't in Latin-1, I think. Writing them literally is not allowed.

--
Alvaro Herrera    http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] Alpha 1 release notes
Alvaro Herrera alvhe...@commandprompt.com writes:
> Tom Lane wrote:
>> In any case, it is not the function of the alpha release notes to discuss changes in earlier release branches. The reason the commit log points out the back-patch is to make it easier to extract the information when we prepare release notes for the back-branch updates.
>
> Hmm, isn't it enough to use cvs2cl --follow branch?

Yeah, cvs will certainly tell you the same information, which is why I frequently don't bother mentioning the point at all in commit messages. I think the most useful reason for mentioning the branch(es) in a commit message is to explain why a particular patch goes back so far and no farther.

regards, tom lane
Re: [HACKERS] Hot standby and synchronous replication status
Robert Haas robertmh...@gmail.com wrote:
> *scratches head* I don't really know how you COULD pick a safe default location. Presumably any location that's in the default postgresql.conf file would be under $PGDATA, which kind of defeats the purpose of the whole thing. In other words, you're always going to have to move it anyway, so why bother with a default that is bound to be wrong?

Well, we want the WAL files to flow in two directions from the database server, so that if either target (or connectivity to it) is down, the WAL files still flow to the other target. The only sensible way to do that, as far as we've determined, is to have the archive script copy to a temporary directory and move to a publisher directory, then have once-a-minute crontab jobs rsync the directory to the targets. We figure that a WAL file is no more at risk in the publisher directory than in the pg_xlog directory on the same volume.

The other reason is what I think Greg Smith was mentioning -- simplifying the process of grabbing a usable PITR backup for novice users. That seems like it has merit.

-Kevin
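The "copy to a temporary name, then move into the publisher directory" step is safe against the rsync jobs seeing half-written files because rename(2) is atomic within a single filesystem. A plain-C sketch of that pattern (the paths, naming scheme, and omitted fsync/error details are illustrative, not from the actual setup described above):

    #include <stdio.h>

    static int
    publish_wal(const char *src, const char *pubdir, const char *fname)
    {
        char    tmppath[1024];
        char    dstpath[1024];
        char    buf[65536];
        size_t  n;
        FILE   *in, *out;

        snprintf(tmppath, sizeof(tmppath), "%s/.in-progress.%s", pubdir, fname);
        snprintf(dstpath, sizeof(dstpath), "%s/%s", pubdir, fname);

        if ((in = fopen(src, "rb")) == NULL)
            return -1;
        if ((out = fopen(tmppath, "wb")) == NULL)
        {
            fclose(in);
            return -1;
        }
        while ((n = fread(buf, 1, sizeof(buf), in)) > 0)
            fwrite(buf, 1, n, out);
        fclose(in);
        if (fclose(out) != 0)       /* a real archive script would fsync first */
            return -1;

        /* atomic publish: the rsync jobs never see a partial file */
        return rename(tmppath, dstpath);
    }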
Re: [HACKERS] DECLARE doesn't set/reset sqlca after DECLARE cursor
Michael Meskes írta:
> On Wed, Aug 12, 2009 at 07:13:44PM +0200, Boszormenyi Zoltan wrote:
>> a customer of ours complained about a behavioural difference ... The attached patch implements this. The only downside is that now DECLARE CURSOR cannot appear outside of a function, a change in test/preproc/variable.pgc reflects
>
> DECLARE by definition is a declarative command and as such should be able to live outside a function.

Okay, so it's a declarative command. But if we're in a function, we should still emit a call to ecpg_init, to be able to follow the Informix behaviour. We can limit it to compat mode, though. The attached patch does this, and detects being inside of a function by braces_open > 0. Short of rewriting ECPG into a full-fledged C/C++ preprocessor, we can't do better currently.

In compat mode, you cannot do

    DECLARE mycur CURSOR FOR SELECT ... INTO :var, ...
or
    DECLARE mycur CURSOR FOR SELECT ... WHERE field = ?

in the global scope, because adjust_informix() emits function calls outside of a function. Or is this declaration illegal? At least it should be documented in PostgreSQL.

Best regards,
Zoltán Böszörményi

--
Bible has answers for everything. Proof:
"But let your communication be, Yea, yea; Nay, nay: for whatsoever is more than these cometh of evil." (Matthew 5:37) - basics of digital technology.
"May your kingdom come" - superficial description of plate tectonics

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/

diff -dcrpN pgsql.describe/src/interfaces/ecpg/ecpglib/exports.txt pgsql.declare-reset-sqlca/src/interfaces/ecpg/ecpglib/exports.txt
*** pgsql.describe/src/interfaces/ecpg/ecpglib/exports.txt	2008-03-25 13:58:49.0 +0100
--- pgsql.declare-reset-sqlca/src/interfaces/ecpg/ecpglib/exports.txt	2009-08-12 18:13:11.0 +0200
*************** ECPGstatus 23
*** 26,28 ****
--- 26,29 ----
  ECPGtrans 24
  sqlprint 25
  ECPGget_PGconn 26
+ ECPGreset_sqlca 27

diff -dcrpN pgsql.describe/src/interfaces/ecpg/ecpglib/misc.c pgsql.declare-reset-sqlca/src/interfaces/ecpg/ecpglib/misc.c
*** pgsql.describe/src/interfaces/ecpg/ecpglib/misc.c	2009-08-07 13:06:28.0 +0200
--- pgsql.declare-reset-sqlca/src/interfaces/ecpg/ecpglib/misc.c	2009-08-12 18:22:35.0 +0200
*************** ecpg_gettext(const char *msgid)
*** 489,491 ****
--- 489,499 ----
  }
  #endif   /* ENABLE_NLS */
+
+ bool
+ ECPGreset_sqlca(int lineno, const char *connection_name)
+ {
+ 	struct connection *con = ecpg_get_connection(connection_name);
+
+ 	return ecpg_init(con, connection_name, lineno);
+ }

diff -dcrpN pgsql.describe/src/interfaces/ecpg/include/ecpglib.h pgsql.declare-reset-sqlca/src/interfaces/ecpg/include/ecpglib.h
*** pgsql.describe/src/interfaces/ecpg/include/ecpglib.h	2009-08-11 14:34:03.0 +0200
--- pgsql.declare-reset-sqlca/src/interfaces/ecpg/include/ecpglib.h	2009-08-12 18:21:06.0 +0200
*************** bool ECPGset_desc(int, const char *, in
*** 84,89 ****
--- 84,90 ----
  void ECPGset_noind_null(enum ECPGttype, void *);
  bool ECPGis_noind_null(enum ECPGttype, void *);
  bool ECPGdescribe(int, bool, const char *, const char *, ...);
+ bool ECPGreset_sqlca(int, const char *);

  /* dynamic result allocation */
  void ECPGfree_auto_mem(void);

diff -dcrpN pgsql.describe/src/interfaces/ecpg/preproc/ecpg.addons pgsql.declare-reset-sqlca/src/interfaces/ecpg/preproc/ecpg.addons
*** pgsql.describe/src/interfaces/ecpg/preproc/ecpg.addons	2009-08-11 14:34:03.0 +0200
--- pgsql.declare-reset-sqlca/src/interfaces/ecpg/preproc/ecpg.addons	2009-08-13 17:40:41.0 +0200
*************** ECPG: DeclareCursorStmtDECLAREcursor_nam
*** 318,324 ****
  	cur = this;
  	if (INFORMIX_MODE)
! 		$$ = cat_str(5, adjust_informix(this->argsinsert), adjust_informix(this->argsresult), make_str("/*"), mm_strdup(this->command), make_str("*/"));
  	else
  		$$ = cat_str(3, make_str("/*"), mm_strdup(this->command), make_str("*/"));
  }
--- 318,338 ----
  	cur = this;
  	if (INFORMIX_MODE)
! 	{
! 		char *comment;
! 		const char *con = connection ? connection : "NULL";
!
! 		comment = cat_str(3, make_str("/*"), mm_strdup(this->command), make_str("*/"));
!
! 		if (braces_open > 0) /* we're in a function */
! 		{
! 			char *command = (char *) mm_alloc(sizeof("ECPGreset_sqlca(__LINE__, );") + strlen(con));
! 			sprintf(command, "ECPGreset_sqlca(__LINE__, %s);", con);
! 			$$ = cat_str(4, adjust_informix(this->argsinsert), adjust_informix(this->argsresult), command, comment);
! 		}
! 		else
! 			$$ = cat_str(3, adjust_informix(this->argsinsert), adjust_informix(this->argsresult), comment);
! 	}
  	else
  		$$ = cat_str(3, make_str("/*"), mm_strdup(this->command), make_str("*/"));
  }

diff -dcrpN pgsql.describe/src/interfaces/ecpg/preproc/ecpg.trailer pgsql.declare-reset-sqlca/src/interfaces/ecpg/preproc/ecpg.trailer
***
Re: [HACKERS] Implementation of GROUPING SETS (T431: Extended grouping capabilities)
2009/8/8 Alvaro Herrera alvhe...@commandprompt.com:
> Олег Царев escribió:
>> Hello all! If no one objects (all agree, in other words) I continue work on the patch -- particularly, I want to support the second strategy (tuplestore instead of hash table) to preserve the order of the source (a cheaper solution in the case of grouping sets + order by), investigate and brainstorm other optimisations, and write regression tests and technical documentation. But I need some time to complete my investigation of the internals of PostgreSQL, particularly CTE.
>
> Where are we on this patch? Is it moving forward?

It seems to me that the patch goes backward. I looked through gsets-0.6.diff for about an hour, and found it is now only syntax sugar that builds multiple GROUP BY queries based on CTE functionality. There's no executor modification. If I remember correctly, the original patch touched executor parts. I'd buy it if GROUPING SETS touched the executor, but I don't if this is only syntax sugar, because you can write the same thing by yourself without the GROUPING SETS syntax. The motivation to push this forward is performance that cannot be had by rewriting the query, I guess.

Because the GROUP BY we have today is a subset of GROUPING SETS by definition, I suppose we'll refactor nodeAgg.c so that it is allowed to take multiple group definitions. And we must support both HashAgg and GroupAgg. For HashAgg, it is easier in any case, as the earlier patch does. For GroupAgg, it is a bit complicated since we sort by different key sets. When we want GROUPING SET(a, b), at first we sort by a and aggregate, then sort by b and aggregate. This is the same as:

    select a, null, count(*) from x group by a
    union all
    select null, b, count(*) from x group by b

so nothing is better than query rewriting unless we invent something new. But in the case of subtotals and grand totals like a ROLLUP query, GroupAgg can do it in a one-time scan by having multiple-life-cycle PerGroup states.

Anyway, before going ahead we need to find a rough sketch of how to implement this feature. Is only syntax sugar acceptable? Or is internal executor support necessary?

Regards,
--
Hitoshi Harada
Re: [HACKERS] Implementation of GROUPING SETS (T431: Extended grouping capabilities)
2009/8/13 Hitoshi Harada umi.tan...@gmail.com:
> Because the GROUP BY we have today is a subset of GROUPING SETS by definition, I suppose we'll refactor nodeAgg.c so that it is allowed to take multiple group definitions. And we must support both HashAgg and GroupAgg. For HashAgg, it is easier in any case, as the earlier patch does. For GroupAgg, it is a bit complicated since we sort by different key sets.

All right, except for the above: GROUP BY is an optimized version of GROUPING SETS. Of course we can extend the current implementation of GROUP BY, but we would regress its performance.

Some questions for you: how do you calculate a ROLLUP aggregation in a single pass?

The stupid way: store a separate aggregation buffer for every group, and accumulate every record into the calculator for every group. When a group has changed, return the key of that group to the output set, with NULL for the fields not contained in that group, and restart the current aggregation buffer.

The better way: add a "merge aggregations" operation and keep one buffer per group level; when a group has changed, merge this main buffer into the other one and return some intermediate result.

I think supporting this kind of grouping operation isn't simple, and a different implementation of ROLLUP is better.

Regards, Tsarev Oleg
Re: [HACKERS] Implementation of GROUPING SETS (T431: Extended grouping capabilities)
2009/8/13 Hitoshi Harada umi.tan...@gmail.com:
> It seems to me that the patch goes backward.

A little bit.

> I looked through gsets-0.6.diff for about an hour, and found it is now only syntax sugar that builds multiple GROUP BY queries based on CTE functionality. There's no executor modification.

I wrote the older version at a time when CTE wasn't implemented. The old patch had its own executor node, and was based on creating a hash table per group. That patch was maybe a little faster than 0.6, but it had lots of bugs. Whoever knows the planner code for grouping has to agree with me: this code isn't readable, and I wouldn't want to make it more complicated and less readable. So I had the idea to join grouping sets with CTE. Grouping sets are a subset of CTE, so it is possible. Grouping sets are generally non-recursive CTEs, and (I believe) the two should be optimized together. I preferred using CTE because it was the shortest path to a nearly bug-free prototype with full functionality.

> If I remember correctly, the original patch touched executor parts. I'd buy it if GROUPING SETS touched the executor, but I don't if this is only syntax sugar, because you can write the same thing by yourself without the GROUPING SETS syntax. The motivation to push this forward is performance that cannot be had by rewriting the query, I guess.

I don't think you can do a simple transformation from the GROUPING SETS syntax to CTE. And what's more: why would you optimize grouping sets and not optimize non-recursive CTE?

> Because the GROUP BY we have today is a subset of GROUPING SETS by definition, I suppose we'll refactor nodeAgg.c so that it is allowed to take multiple group definitions. And we must support both HashAgg and GroupAgg. For HashAgg, it is easier in any case, as the earlier patch does. For GroupAgg, it is a bit complicated since we sort by different key sets.

This way is possible too, but it needs an absolute reworking of the grouping planner and executor. Maybe it is time to do it. That is work for somebody other than me; my place is stored procedures.

> When we want GROUPING SET(a, b), at first we sort by a and aggregate, then sort by b and aggregate. This is the same as:
>
>     select a, null, count(*) from x group by a
>     union all
>     select null, b, count(*) from x group by b
>
> so nothing is better than query rewriting unless we invent something new.

The problem is when x is a subquery. Then it is better to use CTE, because we don't need to evaluate x twice. The most typical use case is one where x isn't a table.

> But in the case of subtotals and grand totals like a ROLLUP query, GroupAgg can do it in a one-time scan by having multiple-life-cycle PerGroup states. Anyway, before going ahead we need to find a rough sketch of how to implement this feature. Is only syntax sugar acceptable? Or is internal executor support necessary?

I think both ways are possible. Probably the cleanest way is a total refactoring of the grouping executor and grouping planner. I am not sure whether we need new nodes. They all exist, but these nodes cannot work in parallel now.

> Regards,
> --
> Hitoshi Harada
Re: [HACKERS] Implementation of GROUPING SETS (T431: Extended grouping capabilities)
2009/8/14 Олег Царев zabiva...@gmail.com:
> All right, except for the above: GROUP BY is an optimized version of GROUPING SETS. Of course we can extend the current implementation of GROUP BY, but we would regress its performance. Some questions for you: how do you calculate a ROLLUP aggregation in a single pass?

I'd imagine something like this, for

    select a, b, count(*) from x group by rollup(a, b);

    PerGroup all = init_agg(), a = init_agg(), ab = init_agg();

    while ((row = fetch()))
    {
        if (group_is_changed(ab, row))
        {
            result_ab = finalize_agg(ab);
            ab = init_agg();
        }
        if (group_is_changed(a, row))
        {
            result_a = finalize_agg(a);
            a = init_agg();
        }
        advance_agg(all, row);
        advance_agg(a, row);
        advance_agg(ab, row);
    }
    result_all = finalize_agg(all);

Of course you should take care of the best way to return a result row while continuing the aggregates, and the number of grouping keys varies from 1 to many, but it is quite possible. And the normal GROUP BY is just the case of key = a only, so there won't be a performance regression.

> The better way: add a "merge aggregations" operation and keep one buffer per group level; when a group has changed, merge this main buffer into the other one and return some intermediate result.

Merging aggregates sounds fascinating to me, not only for this feature but also for partitioned-table aggregates. But adding another function (a merge function?) to the current aggregate system is quite a long way off.

> I think supporting this kind of grouping operation isn't simple, and a different implementation of ROLLUP is better.

Surely not simple. Adding another node is one of the choices, but from a code-maintenance point of view I feel it is better to integrate it into nodeAgg. nodeWindowAgg and nodeAgg have similar aggregate processes but don't share them, so a bug fix in nodeAgg isn't complete in itself; we must re-check nodeWindowAgg as well. Adding another agg-like node *may* be a kind of nightmare.

Regards,
--
Hitoshi Harada
Re: [HACKERS] Implementation of GROUPING SETS (T431: Extended grouping capabilities)
2009/8/13 Олег Царев zabiva...@gmail.com:
> The better way: add a "merge aggregations" operation and keep one buffer per group level; when a group has changed, merge this main buffer into the other one and return some intermediate result.

I don't think this is possible for all operations. Don't forget, people can implement their own aggregates. Example: a weighted average.

regards
Pavel Stehule

> I think supporting this kind of grouping operation isn't simple, and a different implementation of ROLLUP is better.
>
> Regards, Tsarev Oleg
Re: [HACKERS] Hot standby and synchronous replication status
All,

> The other reason is what I think Greg Smith was mentioning -- simplifying the process of grabbing a usable PITR backup for novice users. That seems like it has merit.

While we're at this, can we add xlog_location as a file-location GUC? It seems inconsistent that we're still requiring people to symlink pg_xlog in order to move it. Or is that already part of this set of patches?

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com
Re: [HACKERS] surprising trigger/foreign key interaction
On 8/13/09 7:03 AM, Alvaro Herrera wrote:
> Tom Lane wrote:
>> Alvaro Herrera alvhe...@commandprompt.com writes:
>>> I imagine this is so because of some old fiddling to get semantics just right for obscure corner cases, but it feels wrong nevertheless.
>>
>> I suspect it was reluctance to use the EvalPlanQual semantics (which are pretty bogus in their own way) for perfectly deterministic single-transaction cases.
>
> I suspect the FK trigger messing up the visibility is an obscure corner case too :-(

Yes, but it's one which happens frequently. I've already had to debug a client case where a client had a before trigger, an after trigger, and a self-join FK. That seems like a bizarre arrangement, but for a proximity tree (which we're going to see a lot more of thanks to WITH RECURSIVE) it actually makes a lot of sense.

The result is that you can get a *successful* transaction, with no error, that nevertheless results in rows which are inconsistent with the FK -- silent data corruption. I had to tell the user to disable the FK and maintain consistency by trigger as well, which doesn't reflect well on our devotion to avoiding data corruption.

This is 100% reproducible; test case below my sig.

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com

-- create two tables, one of which is the master table (reftable), the other
-- of which is a child which contains a tree structure (treetab):

create table reftable (
    refid   int primary key,
    refname text
);

create table treetab (
    id     int primary key,
    parent int,
    refid  int not null references reftable (refid) on delete cascade,
    name   text
);

-- now create a trigger function to maintain the integrity of the trees in treetab
-- by pulling up each node to its parent if intermediate nodes get deleted
-- this trigger is inherently flawed and won't work with the FK below

create function treemaint () returns trigger as $t$
begin
    update treetab set parent = OLD.parent where parent = OLD.id;
    return OLD;
end;
$t$ language plpgsql;

create trigger treemaint_trg before delete on treetab
for each row execute procedure treemaint();

-- populate reftable
insert into reftable
select i, ( 'Ref' || i::TEXT ) from generate_series(1,100) as g(i);

-- populate treetab with 10 rows each pointing to reftable
insert into treetab (id, refid)
select i, (( i / 10::INT ) + 1 ) from generate_series(1,900) as g(i);

-- create trees in treetab. for this simple example each tree is just a chain
-- with each child node pointing to one higher node
update treetab set parent = ( id - 1 )
where id > ( select min(id) from treetab tt2 where tt2.refid = treetab.refid );

update treetab set name = ('tree' || parent::TEXT || '-' || id::TEXT);

-- now create a self-referential FK to enforce tree integrity. This logically breaks the trigger
alter table treetab add constraint selfref foreign key (parent) references treetab (id);

-- show tree for id 45
select * from treetab where refid = 45;

 id  | parent | refid |    name
-----+--------+-------+-------------
 440 |        |    45 |
 441 |    440 |    45 | tree440-441
 442 |    441 |    45 | tree441-442
 443 |    442 |    45 | tree442-443
 444 |    443 |    45 | tree443-444
 445 |    444 |    45 | tree444-445
 446 |    445 |    45 | tree445-446
 447 |    446 |    45 | tree446-447
 448 |    447 |    45 | tree447-448
 449 |    448 |    45 | tree448-449

-- now, we're going to delete the tree. This delete should fail with an error
-- because the trigger will violate selfref
delete from reftable where refid = 45;

-- however, it doesn't fail. it reports success, and some but not all rows from
-- treetab are deleted, leaving the database in an inconsistent state.

select * from treetab where refid = 45;

 id  | parent | refid |    name
-----+--------+-------+-------------
 441 |        |    45 | tree440-441
 443 |    441 |    45 | tree442-443
 445 |    443 |    45 | tree444-445
 447 |    445 |    45 | tree446-447
 449 |    447 |    45 | tree448-449

-- this means we now have rows in the table which
-- violate the FK to reftable.

postgres=# select * from reftable where refid = 45;
 refid | refname
-------+---------
(0 rows)
Re: [HACKERS] Hot standby and synchronous replication status
Josh Berkus j...@agliodbs.com writes: While we're at this, can we add xlog_location as a file-location GUC? That was proposed and rejected quite a long time ago. We don't *want* people to be able to just change a GUC and have their xlog go somewhere else, because of the foot-gun potential. You need to be sure that the existing WAL files get moved over when you do something like that, and the GUC infrastructure isn't up to ensuring that. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY speedup
In the previous mails I made a mistake, writing MTuples/s instead of MDatums/s, sorry about that. It is the number of rows x columns. The title was wrong, but the data was right.

I've been doing some tests on COPY FROM ... BINARY :
- inlines in various pg_get* etc
- a faster buffer handling for copy
- that's about it...

In the tables below, you have p17 (ie test patch 17, the last one) and straight postgres compared.

COPY annonces_2 FROM 'annonces.bin' BINARY :

  Time  | Speedup | Table | KRows | MDatums | Name
   (s)  |         | MB/s  |  /s   |   /s    |
--------|---------|-------|-------|---------|------------------------------
  8.417 | 1.40 x  | 38.70 | 49.13 |  1.92   | 8.4.0 / p17
 11.821 |   ---   | 27.56 | 34.98 |  1.36   | 8.4.0 / compiled from source

COPY archive_data_2 FROM 'archive_data.bin' BINARY :

  Time  | Speedup | Table | KRows  | MDatums | Name
   (s)  |         | MB/s  |  /s    |   /s    |
--------|---------|-------|--------|---------|------------------------------------
 15.314 | 1.93 x  | 25.94 | 172.88 |  4.84   | 8.4.0 / p17 COPY FROM BINARY all
 29.520 |   ---   | 13.46 |  89.69 |  2.51   | 8.4.0 / compiled from source

COPY test_one_int_2 FROM 'test_one_int.bin' BINARY :

  Time  | Speedup | Table | KRows  | MDatums | Name
   (s)  |         | MB/s  |  /s    |   /s    |
--------|---------|-------|--------|---------|------------------------------------
 10.003 | 1.39 x  | 30.63 | 999.73 |  1.00   | 8.4.0 / p17 COPY FROM BINARY all
 13.879 |   ---   | 22.08 | 720.53 |  0.72   | 8.4.0 / compiled from source

COPY test_many_ints_2 FROM 'test_many_ints.bin' BINARY :

  Time  | Speedup | Table | KRows  | MDatums | Name
   (s)  |         | MB/s  |  /s    |   /s    |
--------|---------|-------|--------|---------|------------------------------------
  6.009 | 2.08 x  | 21.31 | 166.42 |  4.33   | 8.4.0 / p17 COPY FROM BINARY all
 12.516 |   ---   | 10.23 |  79.90 |  2.08   | 8.4.0 / compiled from source

I thought it might be interesting to get split timings of the various steps in COPY FROM, so I simply commented out bits of code and ran tests. The delta columns are differences between two lines, that is the time taken in the step mentioned on the right.

- reading data only = reading all the data and parsing it into chunks, doing everything until the RecvFunc is called.
- RecvFuncs = same, + RecvFunc is called
- heap_form_tuple = same + heap_form_tuple is called
- triggers = same + triggers are applied
- insert = actual tuple insertion
- p17 = total time (post insert triggers, constraint check, etc)

COPY annonces_2 FROM 'annonces.bin' BINARY :

  Time  | Delta | Row delta | Datum delta | Name
   (s)  |  (s)  |   (us)    |    (us)     |
--------|-------|-----------|-------------|-------------------
  1.311 |  ---  |    ---    |     ---     | reading data only
  4.516 | 3.205 |   7.750   |    0.199    | RecvFuncs
  4.534 | 0.018 |   0.043   |    0.001    | heap_form_tuple
  5.323 | 0.789 |   1.908   |    0.049    | triggers
  8.182 | 2.858 |   6.912   |    0.177    | insert
  8.417 | 0.236 |   0.570   |    0.015    | p17

COPY archive_data_2 FROM 'archive_data.bin' BINARY :

  Time  | Delta | Row delta | Datum delta | Name
   (s)  |  (s)  |   (us)    |    (us)     |
--------|-------|-----------|-------------|-------------------
  4.729 |  ---  |    ---    |     ---     | reading data only
  8.508 | 3.778 |   1.427   |    0.051    | RecvFuncs
  8.567 | 0.059 |   0.022   |    0.001    | heap_form_tuple
 10.804 | 2.237 |   0.845   |    0.030    | triggers
 14.475 | 3.671 |   1.386   |    0.050    | insert
 15.314 | 0.839 |   0.317   |    0.011    | p17

COPY test_one_int_2 FROM 'test_one_int.bin' BINARY :

  Time  | Delta | Row delta | Datum delta | Name
   (s)  |  (s)  |   (us)    |    (us)     |
--------|-------|-----------|-------------|-------------------
  1.247 |  ---  |    ---    |     ---     | reading data only
  1.745 | 0.498 |   0.050   |    0.050    | RecvFuncs
  1.750 | 0.004 |   0.000   |    0.000    | heap_form_tuple
  3.114 | 1.364 |   0.136   |    0.136    | triggers
  9.984 | 6.870 |   0.687   |    0.687    | insert
 10.003 | 0.019 |   0.002   |    0.002    | p17

COPY test_many_ints_2 FROM 'test_many_ints.bin' BINARY :

  Time  | Delta | Row delta | Datum delta | Name
   (s)  |  (s)  |   (us)    |    (us)     |
--------|-------|-----------|-------------|-------------------
  1.701 |  ---  |    ---    |     ---     | reading data only
  3.122 | 1.421 |   1.421   |    0.055    | RecvFuncs
  3.129 | 0.008 |   0.008   |    0.000    | heap_form_tuple
  3.754 | 0.624 |   0.624   |    0.024    | triggers
  5.639 | 1.885 |   1.885   |    0.073    | insert
  6.009 | 0.370 |   0.370   |    0.014    | p17

We can see that :
- reading and parsing the data is still slow (actually, everything is copied something like 3-4 times)
- RecvFuncs take quite long, too
- triggers use some time, although the table has no triggers ? This is
Re: [HACKERS] Hot standby and synchronous replication status
Tom, That was proposed and rejected quite a long time ago. We don't *want* people to be able to just change a GUC and have their xlog go somewhere else, because of the foot-gun potential. You need to be sure that the existing WAL files get moved over when you do something like that, and the GUC infrastructure isn't up to ensuring that. Doesn't the same argument apply to data_directory? -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot standby and synchronous replication status
Josh Berkus j...@agliodbs.com writes: That was proposed and rejected quite a long time ago. We don't *want* people to be able to just change a GUC and have their xlog go somewhere else, because of the foot-gun potential. You need to be sure that the existing WAL files get moved over when you do something like that, and the GUC infrastructure isn't up to ensuring that. Doesn't the same argument apply to data_directory? No. Changing data_directory might result in failure to start (if you didn't move the actual data over there) but it's unlikely to result in irretrievable corruption of your data. The key issue here is the need to keep data and xlog in sync, and moving the whole data directory doesn't create risks of that sort. Now admittedly it's not hard to screw yourself with a careless manual move of xlog, either. But at least the database didn't hand you a knob that invites clueless frobbing. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Implementation of GROUPING SETS (T431: Extended grouping capabilities)
2009/8/14 Pavel Stehule pavel.steh...@gmail.com: I preferred using CTE, because this way was the shortest path to a small, less buggy prototype - with full functionality.

You could make it by query rewriting, but as you say the cleanest way is a total refactoring of the existing nodeAgg. How easy it is to implement is not convincing. When we want GROUPING SETS (a, b), at first we sort by a and aggregate, then sort by b and aggregate. This is the same as:

select a, null, count(*) from x group by a
union all
select null, b, count(*) from x group by b

so nothing is better than query rewriting unless we invent something new.

The problem is when x is a subquery. Then it is better to use a CTE, because we don't need to evaluate x twice. In the most typical use case, x isn't a table. So we need a single-scan aggregate as far as possible.

Buffering the subquery's result is possible without a CTE node. Tuplestore has that functionality, but I found the buffered result will be sorted multiple times; one way might be to allow tuplesort to perform the sort multiple times with different keys.

Regards, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
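For readers following the thread, the two rewrite strategies being compared can be sketched directly in SQL. This is only an illustration; the table x and columns a, b are invented:

  -- Plain query rewriting: correct, but scans x twice
  SELECT a, NULL AS b, count(*) FROM x GROUP BY a
  UNION ALL
  SELECT NULL, b, count(*) FROM x GROUP BY b;

  -- CTE-based rewriting: buffer the (possibly expensive) source once,
  -- then aggregate the buffered result for each grouping set
  WITH src AS (
      SELECT a, b FROM x   -- in practice x may be an arbitrary subquery
  )
  SELECT a, NULL AS b, count(*) FROM src GROUP BY a
  UNION ALL
  SELECT NULL, b, count(*) FROM src GROUP BY b;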
Re: [HACKERS] Hot standby and synchronous replication status
Now admittedly it's not hard to screw yourself with a careless manual move of xlog, either. But at least the database didn't hand you a knob that invites clueless frobbing. So really rather than a GUC we should have a utility for moving the xlog. -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot standby and synchronous replication status
Josh Berkus j...@agliodbs.com writes: Now admittedly it's not hard to screw yourself with a careless manual move of xlog, either. But at least the database didn't hand you a knob that invites clueless frobbing. So really rather than a GUC we should have a utility for moving the xlog. Yeah, that would work. Although it would probably take as much verbiage to document the utility as it does to document how to do it manually. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Implementation of GROUPING SETS (T431: Extended grouping capabilities)
2009/8/13 Hitoshi Harada umi.tan...@gmail.com: 2009/8/14 Pavel Stehule pavel.steh...@gmail.com: I preferred using CTE, because this way was the shortest path to a small, less buggy prototype - with full functionality. You could make it by query rewriting, but as you say the cleanest way is a total refactoring of the existing nodeAgg. How easy it is to implement is not convincing.

I agree. I simply don't have the time and energy to do it. I want to concentrate on finishing some plpgsql issues, and then I have to do some things other than PostgreSQL. There is a fully functional prototype, and everybody is welcome to continue this work.

When we want GROUPING SETS (a, b), at first we sort by a and aggregate, then sort by b and aggregate. This is the same as: select a, null, count(*) from x group by a union all select null, b, count(*) from x group by b, so nothing is better than query rewriting unless we invent something new. The problem is when x is a subquery. Then it is better to use a CTE, because we don't need to evaluate x twice. In the most typical use case, x isn't a table. So we need a single-scan aggregate as far as possible. Buffering the subquery's result is possible without a CTE node. Tuplestore has that functionality, but I found the buffered result will be sorted multiple times; one way might be to allow tuplesort to perform the sort multiple times with different keys.

Yes, I'm not afraid of multiple evaluation of aggregates - it's cheap. The problem is multiple table scans. I thought about some new kind of aggregate. Current aggregates process row by row, with a final operation. Some new kind of aggregate could work over a tuplestore: internally it would get a pointer to the tuplestore and the number of rows. This should be very fast for functions like median, or array_agg. I think it's similar to window functions - only it is not a window function. If you want to optimize for speed, then the fastest solution will be using hash tables - then you don't need a tuplestore. For rollup, maybe a single scan is possible - but I am not sure - there is an important fact: the final function cannot modify intermediate data.

Pavel

Regards, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] FDW-based dblink
On Thu, Aug 13, 2009 at 02:01:19PM +0300, Heikki Linnakangas wrote: Itagaki Takahiro wrote: Present dblink is a thin wrapper of libpq, but some of my customers want automatic transaction managements. Remote transactions are committed with 2PC when the local transaction is committed. To achieve it, I think we need on-commit trigger is needed, but it is hard to implement with current infrastructure. (That is one of the reason I proposed to merge dblink into core.) Quite aside from the requirement for on-commit trigger, how exactly would you use 2PC with the remote database? When would you issue PREPARE TRANSACTION, and when would COMMIT PREPARED? For what it's worth, in DBI-Link, I've allowed some of this by letting people pass commands like BEGIN, COMMIT and ROLLBACK through to the remote side. However, it doesn't--can't, as far as I know--implement the full 2PC. Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
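For reference, the two-phase commands Heikki asks about look like this when issued against the remote side. A minimal sketch only: the GID and table name are invented, and the remote server must have max_prepared_transactions > 0 for PREPARE TRANSACTION to succeed:

  -- remote side, driven by dblink during the local commit sequence
  BEGIN;
  UPDATE remote_accounts SET balance = balance - 100 WHERE id = 1;
  PREPARE TRANSACTION 'dblink_gid_42';  -- phase one: now durable on the remote

  -- after the local transaction commits:
  COMMIT PREPARED 'dblink_gid_42';
  -- or, if the local transaction aborted:
  -- ROLLBACK PREPARED 'dblink_gid_42';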
Re: [HACKERS] Filtering dictionaries support and unaccent dictionary
On Thursday 13 August 2009 18:07:51 Alvaro Herrera wrote: Oleg Bartunov wrote: Peter, how to write accented characters in sgml? Isn't it allowed to write them as is? &aacute; for á, etc. You can't use characters that aren't in Latin-1, I think. Writing them literally is not allowed.

It's somehow possible, but it's not as straightforward as, say, with XML. And you might get into a Latin-1 vs UTF-8 mixup. At least that's what I noticed in my limited testing the other day.

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot standby and synchronous replication status
Yeah, that would work. Although it would probably take as much verbiage to document the utility as it does to document how to do it manually. Yes, but it would *feel* less hackish to sysadmins and DBAs, and make them more confident about moving the xlogs. Getting it to work on windows will be a pita, though ... Andrew? -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot standby and synchronous replication status
Josh Berkus wrote: Yeah, that would work. Although it would probably take as much verbiage to document the utility as it does to document how to do it manually. Yes, but it would *feel* less hackish to sysadmins and DBAs, and make them more confident about moving the xlogs. Getting it to work on windows will be a pita, though ... Andrew? Why would it? All the tools are there - if not tablespaces wouldn't work. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Filtering dictionaries support and unaccent dictionary
Peter Eisentraut wrote: On Thursday 13 August 2009 18:07:51 Alvaro Herrera wrote: Oleg Bartunov wrote: Peter, how to write accented characters in sgml? Isn't it allowed to write them as is? &aacute; for á, etc. You can't use characters that aren't in Latin-1, I think. Writing them literally is not allowed. It's somehow possible, but it's not as straightforward as, say, with XML. And you might get into a Latin-1 vs UTF-8 mixup. At least that's what I noticed in my limited testing the other day.

The top of release.sgml has instructions on that, because that is often something we need to do for names in release notes:

  non-ASCII characters: convert to HTML4 entity (&...;) escapes

    official: http://www.w3.org/TR/html4/sgml/entities.html
    one page: http://www.zipcon.net/~swhite/docs/computers/browsers/entities_page.html
    other lists:
      http://www.zipcon.net/~swhite/docs/computers/browsers/entities.html
      http://www.zipcon.net/~swhite/docs/computers/browsers/entities_page.html
      http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references

  we cannot use UTF8 because SGML DocBook does not support it

  http://www.pemberley.com/janeinfo/latin1.html#latexta

-- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Alpha 1 release notes
Tom Lane wrote: Massa, Harald Armin c...@ghum.de writes: within source code, build options there is: - Reserve the shared memory region during backend startup on Windows, so that memory allocated by starting third party DLLs doesn't end up conflicting with it. Hopefully this solves the long-time issue with could not reattach to shared memory errors on Win32. I suggest that it should also be pointed out that this fix will be backported to 8.3 and 8.4 (as much as I followed the ML); Normally, bug fixes that have been back-patched wouldn't be mentioned at all in a new major release's release notes. The implied base that we are comparing to in major-release notes is the end of the prior branch's updates. I'm not sure if this case should be an exception, or if we should have a different general rule for alpha releases. We'd like to get more testing on that fix, so I think it is reasonable to mention it for alpha1 --- but is that an exception specific to this bug fix, or does it indicate we want to handle bug fixes differently in general within alpha release notes? In any case, it is not the function of the alpha release notes to discuss changes in earlier release branches. The reason the commit log points out the back-patch is to make it easier to extract the information when we prepare release notes for the back-branch updates. FYI, tools/pgcvslog -d removes backbranch commits automatically. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot standby and synchronous replication status
On Thu, Aug 13, 2009 at 1:49 PM, Josh Berkusj...@agliodbs.com wrote: Yeah, that would work. Although it would probably take as much verbiage to document the utility as it does to document how to do it manually. Yes, but it would *feel* less hackish to sysadmins and DBAs, and make them more confident about moving the xlogs.

and it's better for marketing... in fact, when I say we need to move them manually with a symlink, sysadmins look at me as if it's a strange bug ;)

Getting it to work on windows will be a pita, though ... Andrew?

mmm... is there a way to do this *manually* on Windows? Maybe that is reason enough for a tool to do it...

-- Atentamente, Jaime Casanova Soporte y capacitación de PostgreSQL Asesoría y desarrollo de sistemas Guayaquil - Ecuador Cel. +59387171157 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )
[ moving to -hackers ] If this topic has been discussed previously, please point me to the earlier threads. Why aren't we more opportunistic about freezing tuples? For instance, if we already have a dirty buffer in cache, we should be more aggressive about freezing those tuples than freezing tuples on disk. I looked at the code, and it looks like if we freeze one tuple on the page during VACUUM, we mark it dirty. Wouldn't that be a good opportunity to freeze all the other tuples on the page that we can? Or, perhaps when the bgwriter is flushing dirty buffers, it can look for opportunities to set hint bits or freeze tuples. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
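As background, tuple freezing can be observed and forced from SQL; a minimal sketch, assuming some_table is any populated table:

  -- age(xmin) is the number of transactions since the tuple's inserting XID;
  -- plain VACUUM freezes a tuple only once age(xmin) exceeds vacuum_freeze_min_age
  SELECT ctid, xmin, age(xmin) FROM some_table LIMIT 10;

  -- freeze every tuple now, ignoring vacuum_freeze_min_age
  VACUUM FREEZE some_table;

  -- the oldest unfrozen XID per table is tracked in pg_class
  SELECT relname, age(relfrozenxid) FROM pg_class WHERE relname = 'some_table';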
[HACKERS] Getting rid of the flat authentication file
I've been looking into what it would take to eliminate the flat file for pg_auth info. The implication of doing that is that authentication has to be postponed until inside InitPostgres(), where we can read the actual system catalogs instead. The easy way to do it would be to postpone authentication until after we have selected and entered a database. At that point we could use existing code such as is_member_of_role(). There is a security disadvantage to that: you would find out whether the database name you'd given was valid before any authentication check occurred. Since database names are often also user names, that would give a brute-force attacker a leg up on discovering valid user names. Plan B is to use the same techniques for reading pg_authid and pg_auth_members as InitPostgres is now using for reading pg_database. That's perfectly doable; the main downside to it is that if the shared relcache file were missing, we'd be reduced to seqscan searches of these files, which could be pretty darn unpleasant for role membership searches. However, the shared relcache file should hardly ever be missing, and standard pg_hba.conf setups (with the role column always ALL) don't result in role membership checks anyway. So I'm leaning to plan B here. Another issue is that currently, option switches supplied via PGOPTIONS are processed at entry to PostgresMain (unless they are for SUSET GUC variables). If we retained that behavior then they'd be applied before authentication occurred. This worries me, though I can't immediately point to a problem case. I'd be inclined to postpone the processing of all user-supplied switches until after InitPostgres. This would simplify the logic in PostgresMain, too, since we'd not have to process SUSET variables separately from others. The only real downside I can see is that it would make -W (post_auth_delay) pretty much useless for its intended purpose of assisting debugging of InitPostgres-time problems. We might as well remove it and just rely on pre_auth_delay. This point is only of interest to hackers, and not all that often even to us, so I don't feel that it's a critical objection. So the disadvantages of not using the flat file for authentication seem to boil down to * more cycles expended before we can reject a bad username/password * could be slow in the uncommon case that the shared relcache file is missing * debugging InitPostgres problems will get more inconvenient As against this, we'd be getting rid of a bunch of klugy, slow code with assorted failure points. Comments? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
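For orientation, the role data under discussion lives in shared catalogs that can be examined from SQL. The catalogs and the pg_has_role() function below are real; the role names are invented:

  -- the shared catalogs the backend would read instead of the flat file
  -- (pg_authid is superuser-only because it contains password hashes)
  SELECT rolname, rolcanlogin, rolsuper FROM pg_authid WHERE rolname = 'alice';
  SELECT roleid, member, admin_option FROM pg_auth_members;

  -- the membership test that is_member_of_role() performs, at the SQL level
  SELECT pg_has_role('alice', 'admins', 'member');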
Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )
Jeff Davis wrote: Why aren't we more opportunistic about freezing tuples? For instance, if we already have a dirty buffer in cache, we should be more aggressive about freezing those tuples than freezing tuples on disk. The most widely cited reason is that you lose forensics data. Although they are increasingly rare, there are still situations in which the heap tuple machinery messes up and the xmin/xmax/etc fields of the tuple are the best/only way to find out what happened and thus fix the bug. If you freeze early, there's just no way to know. -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[PERFORM] Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )
Alvaro Herrera alvhe...@commandprompt.com wrote: Jeff Davis wrote: Why aren't we more opportunistic about freezing tuples? For instance, if we already have a dirty buffer in cache, we should be more aggressive about freezing those tuples than freezing tuples on disk. The most widely cited reason is that you lose forensics data. Although they are increasingly rare, there are still situations in which the heap tuple machinery messes up and the xmin/xmax/etc fields of the tuple are the best/only way to find out what happened and thus fix the bug. If you freeze early, there's just no way to know.

Although I find it hard to believe that this is a compelling argument in the case where an entire table or database is loaded in a single database transaction. In the more general case, I'm not sure why this argument applies here but not to cassert and other diagnostic options. It wouldn't surprise me to find workloads where writing data three times (once for the data, once for hint bits, and once to freeze the xid) affects performance more than cassert does.

-Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )
On Thu, 2009-08-13 at 17:58 -0400, Alvaro Herrera wrote: The most widely cited reason is that you lose forensics data. Although they are increasingly rare, there are still situations in which the heap tuple machinery messes up and the xmin/xmax/etc fields of the tuple are the best/only way to find out what happened and thus fix the bug. If you freeze early, there's just no way to know.

As it stands, it looks like it's not just one extra write for each buffer, but potentially many (theoretically, as many as there are tuples on a page). I suppose the reasoning is that tuples on the same page have approximately the same xmin, and are likely to be frozen at the same time. But it seems entirely reasonable that the xmins on one page span several VACUUM runs, and that seems more likely with the FSM. That means that a few tuples on the page are older than 100M and get frozen, and the rest are only about 95M transactions old, so we have to come back later and dirty the page again to freeze them.

Let's say that we had a range like 50-100M, where if it's older than 100M, we freeze it, and if it's older than 50M we freeze it only if it's on a dirty page. We would still have forensic evidence, but we could make a range such that we avoid writing multiple times. And people who don't care about forensic evidence can set it to 0-100M.

Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [PERFORM] Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )
On Thu, 2009-08-13 at 17:17 -0500, Kevin Grittner wrote: It wouldn't surprise me to find workloads where writing data three times (once for the data, once for hint bits, and once to freeze the tid) I'm not sure that we're limited to 3 times, here. I could be missing something, but if you have tuples with different xmins on the same page, some might be older than 100M, which you freeze, and then you will have to come back later to freeze the rest. As far as I can tell, the maximum number of writes is the number of tuples that fit on the page. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )
On Thu, Aug 13, 2009 at 5:33 PM, Jeff Davispg...@j-davis.com wrote: Or, perhaps when the bgwriter is flushing dirty buffers, it can look for opportunities to set hint bits or freeze tuples. One of the tricky things here is that the time you are mostly likely to want to do this is when you are loading a lot of data. But in that case shared buffers are likely to be written back to disk before transaction commit, so it'll be too early to do anything. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [PERFORM] Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )
Why aren't we more opportunistic about freezing tuples? For instance, if we already have a dirty buffer in cache, we should be more aggressive about freezing those tuples than freezing tuples on disk. The most widely cited reason is that you lose forensics data. Although they are increasingly rare, there are still situations in which the heap tuple machinery messes up and the xmin/xmax/etc fields of the tuple are the best/only way to find out what happened and thus fix the bug. If you freeze early, there's just no way to know.

That argument doesn't apply. If the page is in memory and is being written anyway, and some of the rows are past vacuum_freeze_min_age, then why not freeze them rather than waiting for a vacuum process to read them off disk and rewrite them? We're not talking about freezing every tuple as soon as it's out of scope. Just the ones which are more than 100m (or whatever the setting is) old. I seriously doubt that anyone is doing useful forensics using xids which are 100m old.

-- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )
Jeff Davis pg...@j-davis.com writes: Let's say that we had a range like 50-100M, where if it's older than 100M, we freeze it, and if it's older than 50M we freeze it only if it's on a dirty page. We would still have forensic evidence, but we could make a range such that we avoid writing multiple times. Yeah, making the limit slushy would doubtless save some writes, with not a lot of downside. And people who don't care about forensic evidence can set it to 0-100M. Everybody *thinks* they don't care about forensic evidence. Until they need it. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )
On Thu, 2009-08-13 at 18:46 -0400, Tom Lane wrote: Yeah, making the limit slushy would doubtless save some writes, with not a lot of downside. OK, then should we make this a TODO? I'll make an attempt at this. And people who don't care about forensic evidence can set it to 0-100M. Everybody *thinks* they don't care about forensic evidence. Until they need it. We already allow setting vacuum_freeze_min_age to zero, so I don't see a solution here other than documentation. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )
Jeff Davis pg...@j-davis.com writes: On Thu, 2009-08-13 at 18:46 -0400, Tom Lane wrote: Everybody *thinks* they don't care about forensic evidence. Until they need it. We already allow setting vacuum_freeze_min_age to zero, so I don't see a solution here other than documentation.

Yeah, we allow it. I just don't want to encourage it ... and definitely not make it default. What are you envisioning exactly? If vacuum finds any reason to dirty a page (or it's already dirty), then freeze everything on the page that's got age > some lower threshold?

regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )
Jeff, Tom,

Let's say that we had a range like 50-100M, where if it's older than 100M, we freeze it, and if it's older than 50M we freeze it only if it's on a dirty page. We would still have forensic evidence, but we could make a range such that we avoid writing multiple times. Yeah, making the limit slushy would doubtless save some writes, with not a lot of downside.

This would mean two settings: vacuum_freeze_min_age and vacuum_freeze_dirty_age. And we'd need to add those to the autovacuum settings for each table as well. While we could just make one setting 1/2 of the other, that prevents me from saying: freeze this table aggressively if it's in memory, but wait a long time to vacuum if it's on disk. I can completely imagine a table which has a vacuum_freeze_dirty_age of 1 and a vacuum_freeze_min_age of 1m.

-- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
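Concretely, per-table tuning might look like this. A sketch only: autovacuum_freeze_min_age is a real 8.4 storage parameter, while vacuum_freeze_dirty_age is purely the hypothetical knob proposed above and does not exist:

  -- real today: per-table override of the freeze threshold
  ALTER TABLE hot_table SET (autovacuum_freeze_min_age = 1000000);

  -- hypothetical, per the proposal: freeze much sooner when the page
  -- is already dirty and in memory
  -- ALTER TABLE hot_table SET (vacuum_freeze_dirty_age = 1);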
Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )
What are you envisioning exactly? If vacuum finds any reason to dirty a page (or it's already dirty), then freeze everything on the page that's got age > some lower threshold?

I was envisioning: if the page is already dirty and in memory *for any reason*, then freeze the rows past some lower threshold.

-- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] pg_hba.conf: samehost and samenet
I love using postgresql, and have for a long time. I'm involved with almost a hundred postgresql installs. But this is the first time I've gotten into the code.

Renumbering networks happens often, and will happen more frequently as IPv4 space runs low. The IP-based restrictions in pg_hba.conf are one of the places where renumbering can break running installs. In addition, when postgresql is run in BSD jails, 127.0.0.1 is not available for use in pg_hba.conf.

It would be great if, in the cidr-address field of pg_hba.conf, we could specify samehost and samenet. These special values use the local host's network interface addresses. samehost allows an IP assigned to the local machine. samenet allows any host on the subnets connected to the local machine. This is similar to the sameuser value that's allowed in the database field. A change like this would enable admins like myself to distribute postgresql with something like this in the default pg_hba.conf file:

host    all all samenet   md5
hostssl all all 0.0.0.0/0 md5

I've attached an initial patch which implements samehost and samenet. The patch looks more invasive than it really is, due to a necessary indentation change (ie: an if block), and moving some code into a separate function. Thanks for your time. How can I help get a feature like this into postgresql?

Cheers, Stef

diff --git a/configure b/configure index 61b3c72..7bcfcec 100755 *** a/configure --- b/configure *** done *** 9642,9648 ! for ac_header in crypt.h dld.h fp_class.h getopt.h ieeefp.h langinfo.h poll.h pwd.h sys/ipc.h sys/poll.h sys/pstat.h sys/resource.h sys/select.h sys/sem.h sys/socket.h sys/shm.h sys/tas.h sys/time.h sys/un.h termios.h ucred.h utime.h wchar.h wctype.h kernel/OS.h kernel/image.h SupportDefs.h do as_ac_Header=`$as_echo ac_cv_header_$ac_header | $as_tr_sh` if { as_var=$as_ac_Header; eval test \\${$as_var+set}\ = set; }; then --- 9642,9649 ! ! for ac_header in crypt.h dld.h fp_class.h getopt.h ieeefp.h langinfo.h poll.h pwd.h sys/ipc.h sys/poll.h sys/pstat.h sys/resource.h sys/select.h sys/sem.h sys/socket.h sys/shm.h sys/tas.h sys/time.h sys/un.h termios.h ucred.h utime.h wchar.h wctype.h kernel/OS.h kernel/image.h SupportDefs.h ifaddrs.h do as_ac_Header=`$as_echo ac_cv_header_$ac_header | $as_tr_sh` if { as_var=$as_ac_Header; eval test \\${$as_var+set}\ = set; }; then *** fi *** 17278,17284 ! for ac_func in cbrt dlopen fcvt fdatasync getpeereid getpeerucred getrlimit memmove poll pstat readlink setproctitle setsid sigprocmask symlink sysconf towlower utime utimes waitpid wcstombs do as_ac_var=`$as_echo ac_cv_func_$ac_func | $as_tr_sh` { $as_echo $as_me:$LINENO: checking for $ac_func 5 --- 17279,17286 ! ! for ac_func in cbrt dlopen fcvt fdatasync getpeereid getpeerucred getrlimit memmove poll pstat readlink setproctitle setsid sigprocmask symlink sysconf towlower utime utimes waitpid wcstombs getifaddrs do as_ac_var=`$as_echo ac_cv_func_$ac_func | $as_tr_sh` { $as_echo $as_me:$LINENO: checking for $ac_func 5 diff --git a/configure.in b/configure.in index 505644a..bc37b1b 100644 *** a/configure.in --- b/configure.in *** AC_SUBST(OSSP_UUID_LIBS) *** 962,968 ## dnl sys/socket.h is required by AC_FUNC_ACCEPT_ARGTYPES !
AC_CHECK_HEADERS([crypt.h dld.h fp_class.h getopt.h ieeefp.h langinfo.h poll.h pwd.h sys/ipc.h sys/poll.h sys/pstat.h sys/resource.h sys/select.h sys/sem.h sys/socket.h sys/shm.h sys/tas.h sys/time.h sys/un.h termios.h ucred.h utime.h wchar.h wctype.h kernel/OS.h kernel/image.h SupportDefs.h]) # At least on IRIX, cpp test for netinet/tcp.h will fail unless # netinet/in.h is included first. --- 962,968 ## dnl sys/socket.h is required by AC_FUNC_ACCEPT_ARGTYPES ! AC_CHECK_HEADERS([crypt.h dld.h fp_class.h getopt.h ieeefp.h langinfo.h poll.h pwd.h sys/ipc.h sys/poll.h sys/pstat.h sys/resource.h sys/select.h sys/sem.h sys/socket.h sys/shm.h sys/tas.h sys/time.h sys/un.h termios.h ucred.h utime.h wchar.h wctype.h kernel/OS.h kernel/image.h SupportDefs.h ifaddrs.h]) # At least on IRIX, cpp test for netinet/tcp.h will fail unless # netinet/in.h is included first. *** PGAC_VAR_INT_TIMEZONE *** 1141,1147 AC_FUNC_ACCEPT_ARGTYPES PGAC_FUNC_GETTIMEOFDAY_1ARG ! AC_CHECK_FUNCS([cbrt dlopen fcvt fdatasync getpeereid getpeerucred getrlimit memmove poll pstat readlink setproctitle setsid sigprocmask symlink sysconf towlower utime utimes waitpid wcstombs]) # posix_fadvise() is a no-op on Solaris, so don't incur function overhead # by calling it, 2009-04-02 --- 1141,1147 AC_FUNC_ACCEPT_ARGTYPES PGAC_FUNC_GETTIMEOFDAY_1ARG ! AC_CHECK_FUNCS([cbrt dlopen fcvt fdatasync getpeereid getpeerucred getrlimit memmove poll pstat readlink setproctitle setsid sigprocmask symlink sysconf towlower utime utimes waitpid wcstombs getifaddrs]) # posix_fadvise() is a
Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )
On Thu, 2009-08-13 at 19:05 -0400, Tom Lane wrote: What are you envisioning exactly? If vacuum finds any reason to dirty a page (or it's already dirty), then freeze everything on the page that's got age > some lower threshold?

Yes. There are two ways to do the threshold:

1. A constant fraction of vacuum_freeze_min_age
2. An extra GUC

I lean toward #1, because it avoids an extra GUC*, and it avoids the awkwardness when the lower setting is higher than the higher setting. However, #2 might be nice for people who want to live on the edge or experiment with new values. But I suspect most of the advantage would be had just by saying that we opportunistically freeze tuples older than 50% of vacuum_freeze_min_age.

Regards, Jeff Davis

*: As an aside, these GUCs already have incredibly confusing names, and an extra variable would increase the confusion. For instance, they seem to use min and max interchangeably.

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )
On Thu, 2009-08-13 at 18:25 -0400, Robert Haas wrote: On Thu, Aug 13, 2009 at 5:33 PM, Jeff Davispg...@j-davis.com wrote: Or, perhaps when the bgwriter is flushing dirty buffers, it can look for opportunities to set hint bits or freeze tuples. One of the tricky things here is that the time you are mostly likely to want to do this is when you are loading a lot of data. But in that case shared buffers are likely to be written back to disk before transaction commit, so it'll be too early to do anything. I think it would be useful in other cases, like avoiding repeated freezing of different tuples on the same page. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )
Josh Berkus j...@agliodbs.com writes: What are you envisioning exactly? If vacuum finds any reason to dirty a page (or it's already dirty), then freeze everything on the page that's got age some lower threshold? I was envisioning, if the page is already dirty and in memory *for any reason*, the freeze rows at below some threshold. I believe we've had this discussion before. I do *NOT* want freezing operations pushed into any random page access, and in particular will do my best to veto any attempt to put them into the bgwriter. Freezing requires accessing the clog and emitting a WAL record, and neither is appropriate for low-level code like bgwriter. The deadlock potential alone is sufficient reason why not. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )
On Fri, Aug 14, 2009 at 12:07 AM, Josh Berkusj...@agliodbs.com wrote: freeze this table aggressively if it's in memory, but wait a long time to vacuum if it's on disk

Waitasec, in memory? There are two projects here:

1) Make vacuum, when it's freezing tuples, freeze every tuple older than the lesser age if it finds any tuples older than max_age (or, I suppose, if the page is already dirty due to vacuum or something else). Vacuum still has to read in all the pages before it finds out that they don't need freezing, so it doesn't mean distinguishing "in memory" from "needs to be read in".

2) Have something like bgwriter check if the page is dirty, and vacuum and freeze things based on the lesser threshold. This would effectively only be vacuuming things that are in memory.

However the latter is a more complex and fraught project. We looked at this a while back in EDB and we found that the benefits were less than we expected and the complexities more than we expected. I would recommend sticking with (1) for now and only looking at (2) if we have a more detailed plan and solid testable use cases.

-- greg http://mit.edu/~gsstark/resume.pdf -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )
On Fri, Aug 14, 2009 at 12:21 AM, Tom Lanet...@sss.pgh.pa.us wrote: I was envisioning: if the page is already dirty and in memory *for any reason*, then freeze the rows past some lower threshold. I believe we've had this discussion before. I do *NOT* want freezing operations pushed into any random page access, and in particular will do my best to veto any attempt to put them into the bgwriter.

It's possible Josh accidentally waved this red flag and really meant just to make it conditional on whether the page is dirty rather than on whether vacuum dirtied it. However he did give me a thought. With the visibility map, vacuum currently only covers pages that are known to have in-doubt tuples. That's why we have the anti-wraparound vacuums. However it could also check whether the pages it's skipping are in memory and process them if they are, even if they don't have in-doubt tuples.

Or it could first go through RAM and process any pages that are in cache before going to the visibility map and starting from page 0, which would hopefully avoid having to read them in later when we get to them and find they've been flushed out.

I'm just brainstorming here. I'm not sure if either of these is actually worth the complexity and danger of finding new bottlenecks in special-case optimization codepaths.

-- greg http://mit.edu/~gsstark/resume.pdf -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] FDW-based dblink
Alvaro Herrera alvhe...@commandprompt.com wrote: int64 exec(self, query);/* for UPDATE, INSERT, DELETE */ It's not good to return int64 in exec(), because it could have a RETURNING clause. (So it also needs a fetchsize). We should use open() for RETURNING query. It is just same as present dblink_exec(), that only returns a command tag. Regards, --- ITAGAKI Takahiro NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Implementation of GROUPING SETS (T431: Extended grouping capabilities)
2009/8/14 Pavel Stehule pavel.steh...@gmail.com: [...] I agree. I simply don't have the time and energy to do it. I want to concentrate on finishing some plpgsql issues, and then I have to do some things other than PostgreSQL. There is a fully functional prototype, and everybody is welcome to continue this work.

I see your situation. Actually, your prototype is in good shape to be discussed both ways. But since you've been focusing on this feature, it'd be better if you keep your eyes on this. So, Oleg, will you continue with this?

Regards, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Implementation of GROUPING SETS (T431: Extended grouping capabilities)
2009/8/14 Hitoshi Harada umi.tan...@gmail.com: [...] So, Oleg, will you continue with this? Regards, -- Hitoshi Harada

I'd imagine something like:

select a, b, count(*) from x group by rollup(a, b);

PerGroup all = init_agg(), a = init_agg(), ab = init_agg();

while (row = fetch())
{
    if (group_is_changed(ab, row))
    {
        result_ab = finalize_agg(ab);
        ab = init_agg();
    }
    if (group_is_changed(a, row))
    {
        result_a = finalize_agg(a);
        a = init_agg();
    }
    advance_agg(all, row);
    advance_agg(a, row);
    advance_agg(ab, row);
}
result_all = finalize_agg(all);

Fun =)

My implementation of rollup in the DBMS qd works just as you imagine there! =) Also, we use multiple sorts of the source for the CUBE implementation, but this is hard to support (sorting inside GROUP BY - it's bloat). As a result we have a merge implementation of GROUP BY, ROLLUP, and window functions with some common code - it's a way of grouping the source. A hash implementation groups on different hash tables (with different keys); it's very expensive (requires a lot of memory for keys).

I hope to continue my work after the time trouble at my job ends =( (bad TPC-H performance)

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] FDW-based dblink
Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Quite aside from the requirement for on-commit trigger, how exactly would you use 2PC with the remote database? When would you issue PREPARE TRANSACTION, and when would COMMIT PREPARED? What if the local database crashes in between - is the remote transaction left hanging in prepared state?

I'm thinking of preparing remote transactions just before committing the local transaction in CommitTransaction(). The pseudo code is something like:

1. Fire deferred triggers and do the just-before-commit work.
2. AtEOXact_dblink() -- prepare and commit remote transactions.
3. HOLD_INTERRUPTS() -- we cannot roll back the local transaction after this.
4. Do the commit work.

If we need more robust atomicity, we could use 2PC for the local transaction as well whenever there are remote transactions, i.e., expand the COMMIT command into PREPARE TRANSACTION and COMMIT PREPARED internally:

1. Fire deferred triggers and do the just-before-commit work.
2. AtEOXact_dblink_prepare()         -- prepare remotes
3. PrepareTransaction()              -- prepare local
4. AtEOXact_dblink_commit()          -- commit remotes
5. FinishPreparedTransaction(commit) -- commit local

I'm using a deferrable after trigger for this purpose in my present prototype, and it seems to work if the trigger is called at the end of the deferrable event and the local backend doesn't crash in the final commit work -- and we already have some must-not-fail operations in that final work (flushing WAL, etc.).

Regards, --- ITAGAKI Takahiro NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Custom geometry, why slow?
The story so far ... The provided polygon @> point routine does not work correctly when the points are close to the boundary. So we implemented a custom contains(poly, point) function. In order to stop all points being checked against all polygons, a separate bounding box is maintained. So the query has sections looking like:

boundbox @> box( thepoint, thepoint ) AND contains(boundary, thepoint)

You will notice that each point to be checked has to be promoted to a degenerate box. Working on the assumption that there is a cost associated with this (ie pmalloc), and that we will be passing 100's of millions of points through this in a single transaction, streaming this is important. At any rate it looked kludgy. The goal is to provide:

boundbox @> thepoint AND contains(boundary, thepoint)

So the whole family of box op point functions was provided (except for point <@ box, which already exists). The operators have been created, and the operators added to the box_ops operator family. Samples below:

CREATE OR REPLACE FUNCTION leftof(box, point)
RETURNS boolean LANGUAGE C IMMUTABLE STRICT
AS 'contains.so', 'box_point_leftof';
...etc...

DROP OPERATOR IF EXISTS << (box, point);
CREATE OPERATOR << (
    LEFTARG   = box,
    RIGHTARG  = point,
    PROCEDURE = leftof,
    RESTRICT  = positionsel,
    JOIN      = positionjoinsel
);
...etc...

ALTER OPERATOR FAMILY box_ops USING GiST ADD
    OPERATOR 1  <<  (box, point),
    OPERATOR 2  &<  (box, point),
    OPERATOR 3  &&  (box, point),
    OPERATOR 4  &>  (box, point),
    OPERATOR 5  >>  (box, point),
    OPERATOR 7  @>  (box, point),
    --OPERATOR 8  <@  (point, box),
    OPERATOR 9  &<| (box, point),
    OPERATOR 10 <<| (box, point),
    OPERATOR 11 |>> (box, point),
    OPERATOR 12 |&> (box, point);

The problem is, according to EXPLAIN, it still wants to do a sequential scan and not use the index. Any pointers as to why?

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
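One standard way to narrow this down is to check whether the planner can use the index at all once the sequential scan is penalized; a sketch, where the table name is invented and boundbox, boundary, and contains() are from the post above:

  -- if this flips the plan to an index scan, the operator family is wired up
  -- correctly and the planner merely estimates the seqscan as cheaper
  -- (then look at the RESTRICT selectivity functions)
  SET enable_seqscan = off;
  EXPLAIN SELECT *
  FROM regions
  WHERE boundbox @> point '(1,2)'
    AND contains(boundary, point '(1,2)');
  RESET enable_seqscan;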
[HACKERS] CommitFest 2009-07: Remaining Patches
OK, we have the following patches remaining for CommitFest 2009-07...

(1) Named and mixed notation for PL. Tom left this one as Waiting on Author because he didn't have much else left to work on, just in case Pavel could rework it in time. I'll move it to Returned with Feedback after tomorrow if it's not resubmitted before then.

(2) ECPG dynamic cursor, SQLDA support. I think we're still waiting on Michael Meskes to review this one.

(3) query cancel issues in dblink. Joe Conway is planning to review this, but not until this weekend.

(4) plpythonu datatype conversion improvements. Peter Eisentraut said that was in progress as of 7/24, and I've seen a few PL/python related commits go by, I think, but not this one, so I guess this one is still waiting on Peter.

(5, 6) dependencies for generated header files, autogenerating headers bki stuff - Tom doesn't seem to think any of this is moving in the right direction, I believe Alvaro likes it, and I think Peter is skeptical as well but I'm not totally sure. It might be nice to have a bit more discussion to figure out what WOULD be a good way to move forward with this project. What is good? What is bad? What is ugly? But the discussion seems to have trailed off, so maybe I should just move these to Rejected and give up.

(7) Prefix support for synonym dictionary - Just waiting on Oleg to commit this one, I think.

(8) Filtering dictionary support and unaccent dictionary - Needs some changes to the docs to handle high-bit characters, but other than that I think all objections that have been raised have been addressed, so this is also ready to commit once the doc issues are addressed.

Comments welcome.

...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Wisconsin benchmark
The Wisconsin Benchmark in src/test/bench is broken, probably since 8.2. Attached is a tested patch that fixes it. However, it might be better to just remove src/test/bench. The benchmark is quite useless, because it is a single-user test that runs in stand-alone mode, and because it is laughably small, taking just a couple of seconds on a six-year-old, not-all-that-good machine. And it doesn't seem to be all that faithful to the Wisconsin Benchmark, as the string fields are only 6 characters rather than the 42 (I think) of the benchmark.

Cheers, Jeff

WISC-patch Description: Binary data

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Implementation of GROUPING SETS (T431: Extended grouping capabilities)
2009/8/14 Олег Царев zabiva...@gmail.com: [...] A hash implementation groups on different hash tables (with different keys); it's very expensive (requires a lot of memory for keys). I hope to continue my work after the time trouble at my job ends =( (bad TPC-H performance)

I think you are worrying too much about memory. Look at current Postgres: any hash grouping is faster than sort grouping. Try and see. PostgreSQL isn't an embedded database, so using less memory is not the main goal. The goal is to have features with clean, readable, and maintainable source code.

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
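Pavel's "try and see" can be done directly from psql; a minimal sketch, assuming a populated table x(a int, b int):

  EXPLAIN ANALYZE SELECT a, count(*) FROM x GROUP BY a;  -- HashAggregate, typically

  SET enable_hashagg = off;
  EXPLAIN ANALYZE SELECT a, count(*) FROM x GROUP BY a;  -- Sort + GroupAggregate
  RESET enable_hashagg;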
Re: [HACKERS] CommitFest 2009-07: Remaining Patches
2009/8/14 Robert Haas robertmh...@gmail.com: OK, we have the following patches remaining for CommitFest 2009-07... (1) Named and mixed notation for PL. Tom left this one as Waiting on Author because he didn't have much else left to work on, just in case Pavel could rework it in time. I'll move it to Returned with Feedback after tomorrow if it's not resubmitted before then.

I'll work on it tonight. I want to address some of Tom's comments.

Pavel

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers