[HACKERS] Adding XSLT support to PostgreSQL core?
Dear PostgreSQL hackers, [ please CC to me as I'm not subscribed to the list ] For my projects, I find it very handy to apply XSLT transformations directly on database side. Unfortunately, this is only available in contrib/xml2 with an unpleasant interface. The documentation of contrib/xml2 of PostgreSQL 9.1 says: | If you find that some of the functionality of this module | is not available in an adequate form with the newer API, | please explain your issue to pgsql-hackers@postgresql.org | so that the deficiency can be addressed. So this is what I'm trying to do here. :-) While all xpath_*() functions seem to have been successfully collapsed into a generic xpath() function, and xml_is_well_formed() has been moved into the type check for the XML type, I wonder what happened to the XSLT support. What are the issues that prevent the xslt_process() function from being polished up and moved into core? Do you need some volunteer to put work into this? Even with the upcoming XQuery support, XSLT still has its place as a very useful templating language with lots of existing stylesheets in the wild. Greets, Volker -- Volker Grabsch ---(())--- -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] xlog location arithmetic
On Tue, Dec 6, 2011 at 05:19, Euler Taveira de Oliveira eu...@timbira.com wrote: Hi, A while ago when blogging about WAL [1], I noticed a function to deal with xlog location arithmetic is wanted. I remembered Depesz [2] mentioning it and after some questions during trainings and conferences I decided to translate my shell script function into C. The attached patch implements the function pg_xlog_location_diff (bikeshed colors are welcome). It calculates the difference between two given transaction log locations. Now that we have the pg_stat_replication view, it will be easy to get the lag just by passing columns as parameters. Also, monitoring tools could take advantage of it instead of relying on a fragile routine to get the lag. I've been considering similar things, as you can find in the archives, but what I was thinking of was converting the number to just a plain bigint, then letting the user apply whatever arithmetic they wanted at the SQL level. I never got around to actually coding it, though. It could easily be extracted from your patch of course - and I think that's a more flexible approach. Is there some advantage to your method that I'm missing? Also, why do you use DirectFunctionCall to do the simple math, and not just do the math right there in the function? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] Inlining comparators as a performance optimisation
2011/12/5 Tom Lane t...@sss.pgh.pa.us: What is bothering me is that this approach is going to cause substantial bloat of the executable code, and such bloat has got distributed costs, which we don't have any really good way to measure but for sure micro-benchmarks addressing only sort speed aren't going to reveal them. Cache lines filled with sort code take cycles to flush and replace with something else. I think it's possibly reasonable to have inlined copies of qsort for a small number of datatypes, but it seems much less reasonable to have multiple copies per datatype in order to obtain progressively tinier micro-benchmark speedups. We need to figure out what the tradeoff against executable size really is, but so far it seems like you've been ignoring the fact that there is any such tradeoff at all. [ Randomly spouting ideas here: ] Might it not be a good idea to decide whether to use the inlined copies vs. the normal version, based on how much data to sort? Surely for a very large sort, the cache blow-out doesn't matter that much relative to the amount of time that can be saved sorting? Assuming that all types would have an inlined sort function, although that would indeed result in a larger binary, most of that binary would never touch the cache, because corresponding large sorts are never performed. If they would sporadically occur (and assuming the points at which inline sorting starts to get used are chosen wisely), the possibly resulting cache blow-out would be a net win. I am also assuming here that instruction cache lines are small enough for cache line aliasing not to become a problem; putting all sort functions next to each other in the binary (so that they don't alias with the rest of the backend code that might be used more often) might alleviate that. Nicolas -- A. Because it breaks the logical sequence of discussion. Q. Why is top posting bad?
Re: [HACKERS] IDLE in transaction introspection
On Sat, Nov 19, 2011 at 02:55, Scott Mead sco...@openscg.com wrote: On Thu, Nov 17, 2011 at 11:58 AM, Scott Mead sco...@openscg.com wrote: On Wed, Nov 16, 2011 at 4:09 PM, Scott Mead sco...@openscg.com wrote: On Tue, Nov 15, 2011 at 1:18 PM, Robert Treat r...@xzilla.net wrote: On Tue, Nov 15, 2011 at 12:00 PM, Greg Smith g...@2ndquadrant.com wrote: On 11/15/2011 09:44 AM, Scott Mead wrote: Fell off the map last week, here's v2 that: * RUNNING = active * all states from CAPS to lower case This looks like what I was hoping someone would add here now. Patch looks good, only issue I noticed was a spaces-instead-of-a-tab goof where you set beentry_st_state at line 2419 in src/backend/postmaster/pgstat.c Missing pieces: -There is one regression test that uses pg_stat_activity that is broken now. -The documentation should list the new field and all of the states it might include. That's a serious doc update from the minimal info available right now. V3 Attached: * Updates Documentation -- Adds a separate table to cover each column of pg_stat_activity I like that a lot - we should've done that years ago :-) For consistency, either both datname and usename should refer to their respective oid, or none of them. application_name should probably have a link to the libpq documentation for said parameter. field is not set should probably be field is NULL Boolean, if the query is waiting because of a block / lock reads really strange to me. Boolean indicating if would be a good start, but I'm not sure what you're trying to say with block / lock at all? The way the list of states is built up looks really strange. I think it should be built out of variablelist like other such places instead - but I admit to not having checked what that actually looks like in the output. The use of single quotes in the descriptions, things like This is the 'state' of seems wrong. If we should use anything, it should be double quotes, but why do we need double quotes around that?
And the reference to think time is just strange, IMO. In some other cases, the single quotes should probably be replaced with literal. NB: should probably be note. * Adds 'state_change' (timestamp) column -- Tracks when the 'state' column is updated independently of when the 'query' column is updated Very useful. * renames procpid => pid Shouldn't we do this consistently if we do it? It's changed in pg_stat_get_activity() but not pg_stat_get_wal_senders()? I think we either change it in both functions, or in neither. * Bug Fixes -- If a backend had never before issued a query, another session looking at pg_stat_activity would print command string not enabled in the query column. -- query_start was being updated on a 'state' change, now only updated on query change * Fixed 'rules' regression failure -- Added new columns for pg_stat_activity Also, I think state=-1 needs a symbolic value, STATE_UNDEFINED or something like that. There's also a lot of random trailing whitespace in the patch - git diff (which you seem to be using already, that's why I mention it) will happily alert you about that - they should be removed. Or the committer can clean that up of course, but it makes for less work ;) The code is starting to look really good, but I think the docs definitely need another round of work. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] [REVIEW] pg_last_xact_insert_timestamp
On Thu, Oct 13, 2011 at 14:25, Robert Haas robertmh...@gmail.com wrote: On Tue, Oct 4, 2011 at 9:15 AM, Simon Riggs si...@2ndquadrant.com wrote: Simon, could you? If your proposal turns out to be better than mine, I'd be happy to agree to drop my patch and adopt yours. Yes, will do. Simon, I believe that we are still waiting for this. Are we going to hear anything back on this one for the current CF? If not, I suggest we go with Fujii's version for now - we can always change it for a potentially better version later. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] pg_upgrade if 'postgres' database is dropped
On Thu, Nov 3, 2011 at 11:20, Bruce Momjian br...@momjian.us wrote: Robert Haas wrote: On Wed, Nov 2, 2011 at 8:31 PM, Bruce Momjian br...@momjian.us wrote: Robert Haas wrote: If nobody objects, I'll go do that. Hopefully that should be enough to put this problem to bed more or less permanently. All right, I've worked up a (rather boring and tedious) patch to do this, which is attached. I wonder if we should bother using a flag for this. No one has asked for one, and the new code to conditionally connect to databases should function fine for most use cases. True, but OTOH we have such a flag for pg_dumpall, and I've already done the work. Well, every user-visible API option has a cost, and I am not sure there is enough usefulness to overcome the cost of this. I am not sure why you think this is worth the time it takes to argue about it, but if you want to whack the patch around or just forget the whole thing, go ahead. The difference between what you're proposing and what I'm proposing is about 25 lines of code, so it hardly needs an acre of justification. To me, making the tools consistent with each other and not dependent on the user's choice of database names is worth the tiny amount of code it takes to make that happen. Well, it would be good to get other opinions on this. The amount of code isn't really the issue for me, but rather keeping the user API as clean as possible. Seems reasonably clean to me. Not sure what would be unclean about it? Are you saying we need to explain the concept of maintenance db somewhere in the docs? Also, if we are going to add this flag, we should have pg_dumpall use it too and just deprecate the old options. I thought about that, but couldn't think of a compelling reason to break backward compatibility. Adding it to pg_dumpall without removing the old one doesn't cause a backwards compatibility break. Then mark the old one as deprecated, and remove it a few releases down the road.
We can't keep every switch around forever ;) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] pg_cancel_backend by non-superuser
On Sun, Oct 2, 2011 at 23:32, Tom Lane t...@sss.pgh.pa.us wrote: Noah Misch n...@leadboat.com writes: On Sun, Oct 02, 2011 at 06:55:51AM -0400, Robert Haas wrote: On Sat, Oct 1, 2011 at 10:11 PM, Euler Taveira de Oliveira eu...@timbira.com wrote: I see. What about passing this decision to DBA? I mean a GUC can_cancel_session = user, dbowner (default is '' -- only superuser). You can select one or both options. This GUC can only be changed by superuser. Or how about making it a grantable database-level privilege? I think either is overkill. You can implement any policy by interposing a SECURITY DEFINER wrapper around pg_cancel_backend(). I'm with Noah on this. If allowing same-user cancels is enough to solve 95% or 99% of the real-world use cases, let's just do that. There's no very good reason to suppose that a GUC or some more ad-hoc privileges will solve a large enough fraction of the rest of the cases to be worth their maintenance effort. In particular, I think both of the above proposals assume way too much about the DBA's specific administrative requirements. +1. Torello, are you up for updating your patch to do this, for now? If not, I'll be happy to create an updated patch that does just this, but since you got started on it... -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] Large number of open(2) calls with bulk INSERT into empty table
* Robert Haas: I tried whacking out the call to GetPageWithFreeSpace() in RelationGetBufferForTuple(), and also with the unpatched code, but the run-to-run randomness was way more than any difference the change made. Is there a better test case? I think that if you want to exercise file system lookup performance, you need a larger directory, which presumably means a large number of tables. -- Florian Weimer fwei...@bfk.de BFK edv-consulting GmbH http://www.bfk.de/ Kriegsstraße 100 tel: +49-721-96201-1 D-76133 Karlsruhe fax: +49-721-96201-99
Re: [HACKERS] pg_upgrade if 'postgres' database is dropped
Magnus Hagander wrote: [...] Adding it to pg_dumpall without removing the old one doesn't cause a backwards compatibility break. Then mark the old one as deprecated, and remove it a few releases down the road.
We can't keep every switch around forever ;) OK, good. I will work on this. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [HACKERS] pg_upgrade if 'postgres' database is dropped
Magnus Hagander wrote: [...] Seems reasonably clean to me. Not sure what would be unclean about it? Are you saying we need to explain the concept of maintenance db somewhere in the docs? My point is that no one can explain why they would need to specify an alternate database when using 'postgres' and falling back to 'template1' works for almost everyone. I say just make it automatic, as it was in Robert's patch, and be done with it. Also, if we are going to add this flag, we should have pg_dumpall use it too and just deprecate the old options.
I thought about that, but couldn't think of a compelling reason to break backward compatibility. Adding it to pg_dumpall without removing the old one doesn't cause a backwards compatibility break. Then mark the old one as deprecated, and remove it a few releases down the road. We can't keep every switch around forever ;) OK. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [HACKERS] pg_upgrade if 'postgres' database is dropped
On Tue, Dec 6, 2011 at 7:00 AM, Magnus Hagander mag...@hagander.net wrote: Seems reasonably clean to me. Not sure what would be unclean about it? Based on this feedback, I went ahead and committed my previous patch. This means that if pg_upgrade wants to accept a --maintenance-db option, it will be able to pass it through to any other commands it invokes. And if it doesn't do that, vacuumdb et al. should still work anyway, as long as either template1 or postgres is accessible. Also, if we are going to add this flag, we should have pg_dumpall use it too and just deprecate the old options. I thought about that, but couldn't think of a compelling reason to break backward compatibility. Adding it to pg_dumpall without removing the old one doesn't cause a backwards compatibility break. Then mark the old one as deprecated, and remove it a few releases down the road. We can't keep every switch around forever ;) I'm not really sold on tinkering with pg_dumpall; if it ain't broke, don't fix it. But we can bikeshed about it if we like. :-) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] [DOCS] Moving tablespaces
On Sun, Dec 4, 2011 at 18:07, Tom Lane t...@sss.pgh.pa.us wrote: Andrew Dunstan and...@dunslane.net writes: On 12/04/2011 11:41 AM, Tom Lane wrote: Hm, how portable is symlink-reading? If we can actually do that without big headaches, then +1. I wondered that, specifically about Windows junction points, but we seem to have support for it already in dirmod.c::pgreadlink(). Surely there's no other currently supported platform where it would even be a question? readlink is required by Single Unix Spec v2 (1997), which is what we've been treating as our baseline expectation for Unix-oid platforms for awhile now. Given that we dealt with the Windows side already, I don't see a problem with making this assumption. At worst we'd end up needing a couple more emulations in src/port, since surely there's *some* way to do it on any platform with symlinks. AFAICT, it should be as simple as the attached. Doesn't include the required fixes for pg_upgrade, I'll get on those next. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/

[Attached patch, mangled and truncated in this archive: it removes the spclocation column from pg_tablespace (catalogs.sgml, the xaggr.sgml examples, and CreateTableSpace() in src/backend/commands/tablespace.c) and adds a new pg_tablespace_location(oid) function to src/backend/utils/adt/misc.c, documented in func.sgml, which returns an empty string for the two built-in default tablespaces and otherwise resolves the pg_tblspc symlink for the given tablespace OID.]
Re: [HACKERS] [DOCS] Moving tablespaces
Magnus Hagander mag...@hagander.net writes: + snprintf(sourcepath, sizeof(sourcepath), "pg_tblspc/%d", tablespaceOid); %u for an OID, please. Otherwise seems reasonably sane on first glance. regards, tom lane
Re: [HACKERS] [DOCS] Moving tablespaces
Magnus Hagander mag...@hagander.net writes: AFAICT, it should be as simple as the attached. Oh, one other thought is that the function body has to be conditionalized on HAVE_READLINK (the fact that you forgot that somewhere else isn't an excuse for not doing it here). IIRC, only the two built-in tablespaces can exist when not HAVE_READLINK, so it seems sufficient to handle those cases and then PG_RETURN_NULL (or maybe throw error) when not HAVE_READLINK. regards, tom lane
Re: [HACKERS] [DOCS] Moving tablespaces
On Tue, Dec 6, 2011 at 16:12, Tom Lane t...@sss.pgh.pa.us wrote: Magnus Hagander mag...@hagander.net writes: AFAICT, it should be as simple as the attached. Oh, one other thought is that the function body has to be conditionalized on HAVE_READLINK (the fact that you forgot that somewhere else isn't an excuse for not doing it here). IIRC, only the two built-in tablespaces can exist when not HAVE_READLINK, so it seems sufficient to handle those cases and then PG_RETURN_NULL (or maybe throw error) when not HAVE_READLINK. Hmm, good point. Throwing an error seems a lot safer in this case than just returning NULL, since it's a situation that really shouldn't happen. Maybe an assert, but I think a regular ereport(ERROR) would be the best. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] [DOCS] Moving tablespaces
Magnus Hagander mag...@hagander.net writes: Throwing an error seems a lot safer in this case than just returning NULL, since it's a situation that really shouldn't happen. Maybe an assert, but I think a regular ereport(ERROR) would be the best. Not an assert, since it's trivially user-triggerable. regards, tom lane
Re: [HACKERS] [DOCS] Moving tablespaces
On Tue, Dec 6, 2011 at 16:17, Tom Lane t...@sss.pgh.pa.us wrote: Magnus Hagander mag...@hagander.net writes: Throwing an error seems a lot safer in this case than just returning NULL, since it's a situation that really shouldn't happen. Maybe an assert, but I think a regular ereport(ERROR) would be the best. Not an assert, since it's trivially user-triggerable. Right. There is some nice precedent in the CREATE TABLESPACE command (though dependent on HAVE_SYMLINK and not HAVE_READLINK), so I'm just going to copy the error message from there. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] pull_up_simple_subquery
Robert Haas robertmh...@gmail.com writes: While working on KaiGai's leaky views patch, I found myself scrutinizing the behavior of the function named in the subject line; and specifically the retest of is_simple_subquery(). I've been unable to make that fail. It might be that the is_simple_subquery conditions can't currently fail, though that has been possible in the past and could be again someday. The is_safe_append_member conditions can trivially fail after pullup, however. An example in the regression database: create or replace function foo1() returns setof int8 as ' select q2 from int8_tbl, tenk1 where q1 = unique1 ' language sql stable; select * from foo1() union all select q1 from int8_tbl; Like the comment says, I'd rather just retest the conditions than try to work out exactly what might be possible or impossible to happen. However, despite my best efforts, I can't figure out what scenario it's protecting us against, at least not on current sources. The original bug report is here: http://archives.postgresql.org/pgsql-general/2004-01/msg00375.php Tom's reply indicates that the v4 example shouldn't get flattened, but it looks to me like current sources do flatten it and I really don't see why they shouldn't. Poking around with git bisect and the patch shown above, I see that the test case stopped tickling this code with commit e6ae3b5dbf2c07bceb737c5a0ff199b1156051d1, which introduced PlaceHolderVars, apparently for the precise purpose of allowing joins of this type to be flattened. Yes, that was the point of PlaceHolderVars: we used to not be able to flatten subqueries underneath outer joins, if they had any non-nullable output expressions. Adding a PHV ensures that the expression will go to null if it's supposed to. regards, tom lane
Re: [HACKERS] [DOCS] Moving tablespaces
Magnus Hagander mag...@hagander.net writes: There is some nice precedent in the CREATE TABLESPACE command (though dependent on HAVE_SYMLINK and not HAVE_READLINK), so I'm just going to copy the error message from there. Fair enough. Looking at the existing readlink use in port/exec.c, it strikes me that another thing you'd better do is include a check for buffer overrun, ie the test needs to be more like rllen = readlink(fname, link_buf, sizeof(link_buf)); if (rllen < 0 || rllen >= sizeof(link_buf)) ... fail ... Also, you're assuming that the result is already null-terminated, which is incorrect. regards, tom lane
[HACKERS] Recover data....
Hi, Sorry, I am resending this email; there was a spelling mistake in the earlier one. Please help me. Due to some unknown reason, all contents of the PostgreSQL data folder (D:\Program Files\PostgreSQL\8.4\data) got deleted except the base and global folders. Please let me know if I can recover my whole data. Thanks Regards, Azghar Hussain Project Lead (Software) - Risk and Insurance (RI), RMSI Pvt Ltd, A - 7, Sector - 16, Noida - 201301, UP, India Ph: +91 - 120 - 2511102, Ext No. 2655
Re: [HACKERS] xlog location arithmetic
On Tue, Dec 6, 2011 at 5:14 AM, Magnus Hagander mag...@hagander.net wrote: I've been considering similar things, as you can find in the archives, but what I was thinking of was converting the number to just a plain bigint, then letting the user apply whatever arithmetic they wanted at the SQL level. I never got around to actually coding it, though. It could easily be extracted from your patch of course - and I think that's a more flexible approach. Is there some advantage to your method that I'm missing? I went so far as to put together an lsn data type. I didn't actually get all that far with it, which is why I haven't posted it sooner, but here's what I came up with. It's missing indexing support and stuff, but that could be added if people like the approach. It solves this problem by implementing -(lsn,lsn) => numeric (not int8, as that can overflow since it is not unsigned), which allows an lsn => numeric conversion by just subtracting '0/0'::lsn. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company [attachment: lsn.patch]
Re: [HACKERS] Recover data....
Azghar Hussain azghar.huss...@rmsi.com wrote: Due to some unknown reason, all contents of the PostgreSQL data folder (D:\Program Files\PostgreSQL\8.4\data) got deleted except the base and global folders. Please let me know if I can recover my whole data. Several of the other folders contain information crucial to storing the data -- they are an integral part of the database. It may be possible to rebuild some approximation of what is correct, but you will probably have some data loss or corruption if you go that way. The first thing is to copy what is left of the data directory tree to a backup location and keep it safe. http://wiki.postgresql.org/wiki/Corruption The next thing is probably to decide if you want to put a lot of time and resources into recovering what you can from what's left of your database, or just go to your last good backup. What backup technique are you using? How current is the copy? If you want to try to recover, you should provide a lot more information. This page provides some advice in that regard: http://wiki.postgresql.org/wiki/Guide_to_reporting_problems You might want to pay for expert help: http://www.postgresql.org/support/professional_support/ -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inlining comparators as a performance optimisation
On Sun, Dec 4, 2011 at 2:17 PM, Tom Lane t...@sss.pgh.pa.us wrote: * I invented a SortKey struct that replaces ScanKey for tuplesort's purposes. Right now that's local in tuplesort.c, but we're more than likely going to need it elsewhere as well. Should we just define it in sortsupport.h? Or perhaps we should just add the few additional fields to SortSupportInfoData, and not bother with two struct types? Before you answer, consider the next point. +1 for not bothering with two struct types. We might want to consider calling the resulting structure SortKey rather than SortSupportInfoData, however. * I wonder whether it would be worthwhile to elide inlineApplyComparator altogether, pushing what it does down to the level of the datatype-specific functions. That would require changing the comparator API to include isnull flags, and exposing the reverse/nulls_first sort flags to the comparators (presumably by including them in SortSupportInfoData). The main way in which that could be a win would be if the setup function could choose one of four comparator functions that are pre-specialized for each flags combination; but that seems like it would add a lot of code bulk, and the bigger problem is that we need to be able to change the flags after sort initialization (cf. the reversedirection code in tuplesort.c), so we'd also need some kind of re-select the comparator call API. On the whole this doesn't seem promising, but maybe somebody else has a different idea. I thought about this, too, but it didn't seem promising to me, either. * We're going to want to expose PrepareSortSupportComparisonShim for use outside tuplesort.c too, and possibly refactor tuplesort_begin_heap so that the SortKey setup logic inside it can be extracted for use elsewhere. Shall we just add those to tuplesort's API, or would it be better to create a sortsupport.c with these sorts of functions? Why are we going to want to do that? 
If it's because there are other places in the code that can make use of a fast comparator that don't go through tuplesort.c, then we should probably break it off into a separate file (sortkey.c?). But if it's because we think that clients of the tuplesort code are going to need it for some reason, then we may as well keep it in tuplesort.c. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [REVIEW] Patch for cursor calling with named parameters
Kevin Grittner kgri...@wicourts.gov wrote: Yeb Havinga yebhavi...@gmail.com wrote: I personally tend to believe it doesn't even need to be an error. There is no technical reason not to allow it. All the user needs to do is make sure that the combination of named parameters and the positional ones together are complete and not overlapping. If there are no objections, I suggest that Yeb implement the mixed notation for cursor parameters. Hearing no objections -- Yeb, are you OK with doing this, and do you feel this is doable for this CF? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inlining comparators as a performance optimisation
Robert Haas robertmh...@gmail.com writes: On Sun, Dec 4, 2011 at 2:17 PM, Tom Lane t...@sss.pgh.pa.us wrote: * We're going to want to expose PrepareSortSupportComparisonShim for use outside tuplesort.c too, and possibly refactor tuplesort_begin_heap so that the SortKey setup logic inside it can be extracted for use elsewhere. Shall we just add those to tuplesort's API, or would it be better to create a sortsupport.c with these sorts of functions? Why are we going to want to do that? If it's because there are other places in the code that can make use of a fast comparator that don't go through tuplesort.c, then we should probably break it off into a separate file (sortkey.c?). But if it's because we think that clients of the tuplesort code are going to need it for some reason, then we may as well keep it in tuplesort.c. My expectation is that nbtree, as well as mergejoin and mergeappend, would get converted over to use the fast comparator API. I looked at that a little bit but didn't push it far enough to be very sure about whether they'd be able to share the initialization code from tuplesort_begin_heap. But they're definitely going to need the shim function for backwards compatibility, and PrepareSortSupportComparisonShim was my first cut at a wrapper that would be generally useful. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] xlog location arithmetic
On 06-12-2011 07:14, Magnus Hagander wrote: On Tue, Dec 6, 2011 at 05:19, Euler Taveira de Oliveira eu...@timbira.com wrote: Hi, A while ago when blogging about WAL [1], I noticed a function to deal with xlog location arithmetic is wanted. I remembered Depesz [2] mentioning it and after some questions during trainings and conferences I decided to translate my shell script function in C. The attached patch implements the function pg_xlog_location_diff (bikeshed colors are welcome). It calculates the difference between two given transaction log locations. Now that we have pg_stat_replication view, it will be easy to get the lag just passing columns as parameters. Also, the monitoring tools could take advantage of it instead of relying on a fragile routine to get the lag. I've been considering similar things, as you can find in the archives, but what I was thinking of was converting the number to just a plain bigint, then letting the user apply whatever arithmetic wanted at the SQL level. I never got around to actually coding it, though. It could easily be extracted from your patch of course - and I think that's a more flexible approach. Is there some advantage to your method that I'm missing? The only advantage is that you don't expose the arithmetic, e.g., a user doesn't need to know the xlog internals (like I described in a recent blog post). If one day we consider changes in xlog arithmetic (for example, XLogFileSize), we don't need to worry too much about external tools. Also, why do you use DirectFunctionCall to do the simple math, and not just do the math right there in the function? I use it because I don't want to duplicate the overflow code. -- Euler Taveira de Oliveira - Timbira http://www.timbira.com.br/ PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inlining comparators as a performance optimisation
On Tue, Dec 6, 2011 at 12:06 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Sun, Dec 4, 2011 at 2:17 PM, Tom Lane t...@sss.pgh.pa.us wrote: * We're going to want to expose PrepareSortSupportComparisonShim for use outside tuplesort.c too, and possibly refactor tuplesort_begin_heap so that the SortKey setup logic inside it can be extracted for use elsewhere. Shall we just add those to tuplesort's API, or would it be better to create a sortsupport.c with these sorts of functions? Why are we going to want to do that? If it's because there are other places in the code that can make use of a fast comparator that don't go through tuplesort.c, then we should probably break it off into a separate file (sortkey.c?). But if it's because we think that clients of the tuplesort code are going to need it for some reason, then we may as well keep it in tuplesort.c. My expectation is that nbtree, as well as mergejoin and mergeappend, would get converted over to use the fast comparator API. I looked at that a little bit but didn't push it far enough to be very sure about whether they'd be able to share the initialization code from tuplesort_begin_heap. But they're definitely going to need the shim function for backwards compatibility, and PrepareSortSupportComparisonShim was my first cut at a wrapper that would be generally useful. OK. Well, then pushing it out to a separate file probably makes sense. Do you want to do that or shall I have a crack at it? If the latter, what do you think about using the name SortKey for everything rather than SortSupport? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] xlog location arithmetic
On 06-12-2011 13:11, Robert Haas wrote: On Tue, Dec 6, 2011 at 5:14 AM, Magnus Hagander mag...@hagander.net wrote: I've been considering similar things, as you can find in the archives, but what I was thinking of was converting the number to just a plain bigint, then letting the user apply whatever arithmetic wanted at the SQL level. I never got around to actually coding it, though. It could easily be extracted from your patch of course - and I think that's a more flexible approach. Is there some advantage to your method that I'm missing? I went so far as to put together an lsn data type. I didn't actually get all that far with it, which is why I haven't posted it sooner, but here's what I came up with. It's missing indexing support and stuff, but that could be added if people like the approach. It solves this problem by implementing -(lsn,lsn) => numeric (not int8, which can overflow since it is not unsigned), which allows an lsn => numeric conversion by just subtracting '0/0'::lsn. Interesting approach. I don't want to go that far. If so, you would want to change all of those functions that deal with LSNs and add some implicit conversion between the text and lsn data types (for backward compatibility). As for int8, I'm not aware of any modern platform where int8 is not 64 bits. I'm not against numeric use; I'm just saying that int8 is sufficient. -- Euler Taveira de Oliveira - Timbira http://www.timbira.com.br/ PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] xlog location arithmetic
On Tue, Dec 6, 2011 at 1:00 PM, Euler Taveira de Oliveira eu...@timbira.com wrote: On 06-12-2011 13:11, Robert Haas wrote: On Tue, Dec 6, 2011 at 5:14 AM, Magnus Hagander mag...@hagander.net wrote: I've been considering similar things, as you can find in the archives, but what I was thinking of was converting the number to just a plain bigint, then letting the user apply whatever arithmetic wanted at the SQL level. I never got around to actually coding it, though. It could easily be extracted from your patch of course - and I think that's a more flexible approach. Is there some advantage to your method that I'm missing? I went so far as to put together an lsn data type. I didn't actually get all that far with it, which is why I haven't posted it sooner, but here's what I came up with. It's missing indexing support and stuff, but that could be added if people like the approach. It solves this problem by implementing -(lsn,lsn) => numeric (not int8, which can overflow since it is not unsigned), which allows an lsn => numeric conversion by just subtracting '0/0'::lsn. Interesting approach. I don't want to go that far. If so, you would want to change all of those functions that deal with LSNs and add some implicit conversion between the text and lsn data types (for backward compatibility). As for int8, I'm not aware of any modern platform where int8 is not 64 bits. I'm not against numeric use; I'm just saying that int8 is sufficient. The point isn't that int8 might not be 64 bits - of course it has to be 64 bits; that's why it's called int8, i.e., 8 bytes. The point is that a large enough LSN, represented as an int8, will come out as a negative value. int8 can only represent 2^63 *non-negative* values, because one bit is reserved for the sign. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inlining comparators as a performance optimisation
Robert Haas robertmh...@gmail.com writes: OK. Well, then pushing it out to a separate file probably makes sense. Do you want to do that or shall I have a crack at it? If the latter, what do you think about using the name SortKey for everything rather than SortSupport? I'll take another crack at it. I'm not entirely sold yet on merging the two structs; I think first we'd better look and see what the needs are in the other potential callers I mentioned. If we'd end up cluttering the struct with half a dozen weird fields, it'd be better to stick to a minimal interface struct with various wrapper structs, IMO. OTOH it did seem that the names were getting a bit long. If we do keep the two-struct-levels approach, what do you think of s/SortSupportInfo/SortSupport/g ? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inlining comparators as a performance optimisation
On Tue, Dec 6, 2011 at 1:07 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: OK. Well, then pushing it out to a separate file probably makes sense. Do you want to do that or shall I have a crack at it? If the latter, what do you think about using the name SortKey for everything rather than SortSupport? I'll take another crack at it. I'm not entirely sold yet on merging the two structs; I think first we'd better look and see what the needs are in the other potential callers I mentioned. If we'd end up cluttering the struct with half a dozen weird fields, it'd be better to stick to a minimal interface struct with various wrapper structs, IMO. OK. I'll defer to whatever you come up with after looking at it. OTOH it did seem that the names were getting a bit long. If we do keep the two-struct-levels approach, what do you think of s/SortSupportInfo/SortSupport/g ? +1. I had that thought when you originally suggested that name, but it didn't seem worth arguing about. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ecmascript 5 DATESTYLE
I know it's been over a year without any activity on this thread, but did anything ever come of this? I'd really like to be able to get dates to match the format specified for date-time strings in ECMAScript 5. A generic way to specify the format would be ideal if it can be done securely. Have there been other threads discussing this more recently? thanks, ben... On Wed, May 19, 2010 at 6:21 AM, Pavel Stehule pavel.steh...@gmail.com wrote: I am not a security expert - you could simply disallow apostrophes and double quotes - but I am not sure if this can be safe. Simply - I am able to write this patch, but I am not able to ensure security. Regards Pavel
Re: [HACKERS] [COMMITTERS] pgsql: plpython: Add SPI cursor support
Peter Eisentraut pete...@gmx.net writes: plpython: Add SPI cursor support Buildfarm member narwhal does not like this patch. It looks like PyObject_SelfIter is not a compile-time constant on its version of python (2.5, apparently). regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: plpython: Add SPI cursor support
On 06/12/11 19:23, Tom Lane wrote: Peter Eisentraut pete...@gmx.net writes: plpython: Add SPI cursor support Buildfarm member narwhal does not like this patch. It looks like PyObject_SelfIter is not a compile-time constant on its version of python (2.5, apparently). Hm, I quickly tried with a self-compiled Python 2.5.6 from the upstream tarball and it compiled on my Linux machine. I see that PyObject_SelfIter is defined here: http://hg.python.org/cpython/file/b48e1b48e670/Include/object.h#l407 I'll try to dig around to see what can be causing that... Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] should Makefile.custom be in source tree or in build tree?
Makefile.custom is currently looked for in the source directory. This tripped me up recently when doing a vpath build. Should it be looked for in the build tree instead? Or both? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] should Makefile.custom be in source tree or in build tree?
Excerpts from Peter Eisentraut's message of mar dic 06 16:06:57 -0300 2011: Makefile.custom is currently looked for in the source directory. This tripped me up recently when doing a vpath build. Should it be looked for in the build tree instead? Or both? Hmm, interesting question. When I have used it, I put it in the source dir without thinking much about it. In the builddir it'd be painful I think because it'd be gone when the builddir is zapped to do a full rebuild (I don't do this often but it does happen from time to time). Nowadays I don't use it much though, since I learned the PROFILE trick when calling make, so it's not like I care all that much. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] should Makefile.custom be in source tree or in build tree?
Alvaro Herrera alvhe...@commandprompt.com writes: Excerpts from Peter Eisentraut's message of mar dic 06 16:06:57 -0300 2011: Makefile.custom is currently looked for in the source directory. This tripped me up recently when doing a vpath build. Should it be looked for in the build tree instead? Or both? Hmm, interesting question. When I have used it, I put it in the source dir without thinking much about it. In the builddir it'd be painful I think because it'd be gone when the builddir is zapped to do a full rebuild (I don't do this often but it does happen from time to time). Yeah, to the extent that I use vpath builds at all, it's with the expectation that I can rm -rf the build tree and not lose anything interesting. So I'd be sad if the ability to keep Makefile.custom in the source tree disappeared. However, I have no objection to looking first in the build tree and second in the source tree, if you can get it to do that easily. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Large number of open(2) calls with bulk INSERT into empty table
On Tue, Dec 6, 2011 at 7:12 AM, Florian Weimer fwei...@bfk.de wrote: * Robert Haas: I tried whacking out the call to GetPageWithFreeSpace() in RelationGetBufferForTuple(), and also with the unpatched code, but the run-to-run randomness was way more than any difference the change made. Is there a better test case? I think that if you want to exercise file system lookup performance, you need a larger directory, which presumably means a large number of tables. OK. I created 100,000 dummy tables, 10,000 at a time to avoid blowing up the lock manager. I then repeated my previous tests, and I still can't see any meaningful difference (on my MacBook Pro, running MacOS X v10.6.8). So at least on this OS, it doesn't seem to matter much. I'm inclined to defer putting any more work into it until such time as someone can demonstrate that it actually causes a problem and provides a reproducible test case. I don't deny that there's probably an effect and it would be nice to improve this, but it doesn't seem worth spending a lot of time on until we can find a case where the effect is measurable. On the other hand, the problem of the FSM taking up 24kB for an 8kB table seems clearly worth fixing, but I don't think I have the cycles for it at present. Maybe a TODO is in order. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ecmascript 5 DATESTYLE
On Tue, Dec 6, 2011 at 1:11 PM, Ben Hockey neonstalw...@gmail.com wrote: i know its been over a year without any activity on this thread but did anything ever come of this? i'd really like to be able to get dates to match the format specified for date time strings in ecmascript 5. a generic way to specify the format would be ideal if it can be done securely. has there been other threads discussing this more recently? Not to my knowledge, though I don't read pgsql-general. I think this is the sort of thing that really only gets done if someone cares enough about it to settle down and put together a detailed design proposal, get consensus, and write a patch. IOW, it's unlikely that anyone else will do this for you, but you can certainly make a try at doing it yourself, and get help from others along the way. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ecmascript 5 DATESTYLE
Robert Haas robertmh...@gmail.com writes: On Tue, Dec 6, 2011 at 1:11 PM, Ben Hockey neonstalw...@gmail.com wrote: i know its been over a year without any activity on this thread but did anything ever come of this? i'd really like to be able to get dates to match the format specified for date time strings in ecmascript 5. a generic way to specify the format would be ideal if it can be done securely. has there been other threads discussing this more recently? Not to my knowledge, though I don't read pgsql-general. I think this is the sort of thing that really only gets done if someone cares enough about it to settle down and put together a detailed design proposal, get consensus, and write a patch. IOW, it's unlikely that anyone else will do this for you, but you can certainly make a try at doing it yourself, and get help from others along the way. TBH, I think that inventing a new datestyle setting ECMA would be a more appropriate investment of effort. Generic format strings sound like a nightmare. Maybe I've just been turned off by the to_date/to_char mess, but I'm very down on the idea of anything like that propagating into the main datetime I/O code. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ecmascript 5 DATESTYLE
2011/12/6 Tom Lane t...@sss.pgh.pa.us: Robert Haas robertmh...@gmail.com writes: On Tue, Dec 6, 2011 at 1:11 PM, Ben Hockey neonstalw...@gmail.com wrote: i know its been over a year without any activity on this thread but did anything ever come of this? i'd really like to be able to get dates to match the format specified for date time strings in ecmascript 5. a generic way to specify the format would be ideal if it can be done securely. has there been other threads discussing this more recently? Not to my knowledge, though I don't read pgsql-general. I think this is the sort of thing that really only gets done if someone cares enough about it to settle down and put together a detailed design proposal, get consensus, and write a patch. IOW, it's unlikely that anyone else will do this for you, but you can certainly make a try at doing it yourself, and get help from others along the way. TBH, I think that inventing a new datestyle setting ECMA would be a more appropriate investment of effort. Generic format strings sound like a nightmare. Maybe I've just been turned off by the to_date/to_char mess, but I'm very down on the idea of anything like that propagating into the main datetime I/O code. I am for an ECMA datestyle; it is there but just is not public, if I remember well. Theoretically some custom output/input transform routine could be very interesting - for domains, for the boolean type - but on the other hand, the usage of this feature is minimal and there is risk for less advanced users - so an ECMA datestyle is a very adequate solution. Regards Pavel regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: SP-GiST, Space-Partitioned GiST
Oleg Bartunov o...@sai.msu.su writes: There is one annoying problem under MAC OS (Linux, FreeBSD have no problem), we just can't figure out how to find it, since we are not familiar with MAC OS - it fails to restart after 'kill -9' of a backend, but only if sources were compiled with the -O2 option (no problem occurred with -O0). Since the failure does not happen every time, we use the following script to reproduce the problem. We ask a MAC OS guru to help us debug this problem. I don't think it's Mac-specific at all; it looks to me like garden variety uninitialized data, specifically that there are paths through doPickSplit that don't set xlrec.newPage. The crash I'm seeing is TRAP: FailedAssertion(!(offset <= ((((PageHeader) (page))->pd_lower <= (__builtin_offsetof (PageHeaderData, pd_linp)) ? 0 : ((((PageHeader) (page))->pd_lower - (__builtin_offsetof (PageHeaderData, pd_linp))) / sizeof(ItemIdData))) + 1)), File: spgxlog.c, Line: 81) #0 0x7fff883f982a in __kill () #1 0x7fff85bdda9c in abort () #2 0x000103165a71 in ExceptionalCondition (conditionName=<value temporarily unavailable, due to optimizations>, errorType=<value temporarily unavailable, due to optimizations>, fileName=<value temporarily unavailable, due to optimizations>, lineNumber=<value temporarily unavailable, due to optimizations>) at assert.c:57 #3 0x000102eeec73 in addOrReplaceTuple (page=0x74cc <Address 0x74cc out of bounds>, tuple=0x7faa1182d64c, size=88, offset=70) at spgxlog.c:81 #4 0x000102eed4bc in spgRedoPickSplit [inlined] () at /Users/tgl/pgsql/src/backend/access/spgist/spgxlog.c:504 #5 0x000102eed4bc in spg_redo (record=0x7fff62a5ccf0) at spgxlog.c:803 #6 0x000102ec4f48 in StartupXLOG () at xlog.c:6534 #7 0x000103054378 in StartupProcessMain () at startup.c:220 #8 0x000102ef4449 in AuxiliaryProcessMain (argc=2, argv=0x7fff62a60030) at bootstrap.c:414 The xlog record it's working on is (gdb) p *(spgxlogPickSplit*)(0x7fcb20826600 + 32) $6 = { node = { spcNode = 1663, dbNode = 41578, relNode = 204800 }, nTuples = 75, nNodes =
4, blknoSrc = 988, nDelete = 74, blknoInner = 929, offnumInner = 70, newPage = 1 '\001', blknoParent = 929, offnumParent = 13, nodeI = 2, stateSrc = { attType_attlen = 16, fakeTupleSize = 32, isBuild = 1 } } Since newPage is set, addOrReplaceTuple gets called on a freshly initialized page, and not surprisingly complains that offset 70 is way out of range. Maybe there's something wrong with the replay logic, but what I'm thinking is that newPage should not have been true here, which means that doPickSplit failed to set it correctly, which doesn't look at all improbable. I added a memset at the top of doPickSplit to force the whole struct to zeroes, and so far haven't seen the crash again. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ecmascript 5 DATESTYLE
On 12/6/2011 3:20 PM, Pavel Stehule wrote: I am for an ECMA datestyle; it is there but just is not public, if I remember well. Theoretically some custom output/input transform routine could be very interesting - for domains, for the boolean type - but on the other hand, the usage of this feature is minimal and there is risk for less advanced users - so an ECMA datestyle is a very adequate solution. Regards Pavel I don't particularly need anything other than an ECMA datestyle - I was just under the impression that a more generic solution was preferred. So, ECMA is enough to stop me from making any more noise about this. Pavel, is there a way I can use this currently? If not, would it take much effort to make this public? thanks, ben... -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ecmascript 5 DATESTYLE
2011/12/6 ben hockey neonstalw...@gmail.com: On 12/6/2011 3:20 PM, Pavel Stehule wrote: I am for an ECMA datestyle; it is there but just is not public, if I remember well. Theoretically some custom output/input transform routine could be very interesting - for domains, for the boolean type - but on the other hand, the usage of this feature is minimal and there is risk for less advanced users - so an ECMA datestyle is a very adequate solution. Regards Pavel i don't particularly need anything other than ECMA datestyle - i was just under the impression that a more generic solution was preferred. so, ECMA is enough to stop me from making any more noise about this. pavel, is there a way i can use this currently? if not, would it take much effort to make this public? I am not sure if this patch is 100% correct, but it does something; the name is not ECMA but XSD - I hope both formats are the same postgres=# set datestyle TO 'XSD'; SET postgres=# select current_timestamp; now ── 2011-12-06T21:50:34.142933+01:00 (1 row) postgres=# select '2011-12-06T22:46:53.455866+01:00'::timestamp; timestamp 2011-12-06T22:46:53.455866 (1 row) but maybe this will need some more work, if the XSD format is not exactly ECMA Regards Pavel thanks, ben...
*** ./src/backend/commands/variable.c.orig	2011-09-22 23:57:57.0 +0200
--- ./src/backend/commands/variable.c	2011-12-06 21:46:03.489229819 +0100
***************
*** 123,128 ****
--- 123,135 ----
  			newDateOrder = DATEORDER_MDY;
  			have_order = true;
  		}
+ 		else if (pg_strcasecmp(tok, "XSD") == 0)
+ 		{
+ 			if (have_style && newDateStyle != USE_XSD_DATES)
+ 				ok = false;		/* conflicting styles */
+ 			newDateStyle = USE_XSD_DATES;
+ 			have_style = true;
+ 		}
  		else if (pg_strcasecmp(tok, "DEFAULT") == 0)
  		{
  			/*
***************
*** 191,196 ****
--- 198,206 ----
  		case USE_GERMAN_DATES:
  			strcpy(result, "German");
  			break;
+ 		case USE_XSD_DATES:
+ 			strcpy(result, "XSD");
+ 			break;
  		default:
  			strcpy(result, "Postgres");
  			break;
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ecmascript 5 DATESTYLE
2011/12/6 ben hockey neonstalw...@gmail.com: On 12/6/2011 3:20 PM, Pavel Stehule wrote: I am for ECMA datestyle it is there but just is not public, if I remember well Theoretically some custom output/input transform routine can be very interesting - for domains, for boolean type - but on second hand - the usage of this feature is minimal and there is risk for less advanced users - so ECMA datestyle is very adequate solution. Regards Pavel i don't particularly need anything other than ECMA datestyle - i was just under the impression that a more generic solution was preferred. so, ECMA is enough to stop me from making any more noise about this. a general solution is not simple - a SQL injection would be possible there, so the result must be escaped, and that means some overhead. Also, it is very common good style to use the to_char, to_date or to_timestamp functions explicitly; your application will then be more robust. Using the default datestyle is a user-friendly technique, but it can be a source of issues - it is better not to rely on it for a large and complex application. Regards Pavel pavel, is there a way i can use this currently? if not, would it take much effort to make this public? thanks, ben... -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] RangeVarGetRelid()
On Mon, Dec 5, 2011 at 2:09 AM, Noah Misch n...@leadboat.com wrote: Your committed patch looks great overall. A few cosmetic points: Thanks for the review. That last sentence needs a word around "might things". Fixed. AcceptInvalidationMessages(); The above call can go away, now. Doesn't that still protect us against namespace-shadowing issues? RangeVarGetRelid doesn't actually AcceptInvalidationMessages() at the top. That sentence needs a word around "so need". Fixed. Attached please find a patch with some more fixes on this same general theme. This one tackles renaming of relations, columns, and triggers; and changing the schema of relations. In these cases, the current code does a permissions check before locking the table (which is good) and uses RangeVarGetRelid() to guard against "cache lookup failure" errors caused by concurrent DDL (also good). However, if the referent of the name changes during the lock wait, we don't recheck permissions; we just allow the rename or schema change on the basis that the user had permission to do it to the relation that formerly had that name. While this is pretty minor as security concerns go, it seems best to clean it up, so this patch does that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company rangevargetrelid-callback-round2.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ecmascript 5 DATESTYLE
On 12/6/2011 3:53 PM, Pavel Stehule wrote: I am not sure, if this patch is 100% correct but it does something the name is not ECMA but XSD - I hope, so both formats are same that format works for me. in fact a simple test to see if it would do what i hope for would be to open the developer console (firebug, web developer, etc) of a browser and take the formatted output and pass it to the Date constructor - e.g. using chrome:

var a = new Date('2011-12-06T22:46:53.455866+01:00');
  Date
a.toDateString();
  Tue Dec 06 2011
a.toTimeString();
  16:46:53 GMT-0500 (Eastern Standard Time)

you can see that the Date was properly created with the time converted to my local time zone. this would be of great help to anyone passing data from postgres to a web browser since it eliminates the need to have to transform the format somewhere between the database and the browser. i'm not familiar with the process of getting this feature added to postgres - what needs to happen now? ben... -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
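The same round-trip can be checked outside a browser, too; for instance, Python 3.7+ happens to parse this exact format directly. This is purely an illustration of the format, not part of the proposed patch:

```python
from datetime import datetime, timezone

# The XSD/ISO 8601 timestamp from the psql session above.
ts = "2011-12-06T22:46:53.455866+01:00"

# fromisoformat understands the +01:00 numeric offset (Python 3.7+).
parsed = datetime.fromisoformat(ts)

# Normalize to UTC, analogous to what the browser's Date object does
# internally before rendering in the local zone.
as_utc = parsed.astimezone(timezone.utc)
print(as_utc.isoformat())
```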
Re: [HACKERS] ecmascript 5 DATESTYLE
2011/12/6 ben hockey neonstalw...@gmail.com: On 12/6/2011 3:53 PM, Pavel Stehule wrote: I am not sure, if this patch is 100% correct but it does something the name is not ECMA but XSD - I hope, so both formats are same that format works for me. in fact a simple test to see if it would do what i hope for would be to open the developer console (firebug, web developer, etc) of a browser and take the formatted output and pass it to the Date constructor - eg using chrome: var a = new Date('2011-12-06T22:46:53.455866+01:00'); Date a.toDateString(); Tue Dec 06 2011 a.toTimeString(); 16:46:53 GMT-0500 (Eastern Standard Time) you can see that the Date was properly created with the time converted to my local time zone. this would be of great help to anyone passing data from postrgres to a web browser since it eliminates the need to have to transform the format somewhere between the database and the browser. i'm not familiar with the process of getting this feature added to postgres - what needs to happen now? it can be in 9.2 (if will be accepted) - it will be release at summer 2012 http://wiki.postgresql.org/wiki/Submitting_a_Patch Regards Pavel Stehule ben... -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inlining comparators as a performance optimisation
Robert Haas robertmh...@gmail.com writes: On Tue, Dec 6, 2011 at 1:07 PM, Tom Lane t...@sss.pgh.pa.us wrote: I'll take another crack at it. I'm not entirely sold yet on merging the two structs; I think first we'd better look and see what the needs are in the other potential callers I mentioned. If we'd end up cluttering the struct with half a dozen weird fields, it'd be better to stick to a minimal interface struct with various wrapper structs, IMO. OK. I'll defer to whatever you come up with after looking at it. OK, it looks like nodeMergeAppend.c could use something exactly like the draft SortKey struct, while nodeMergejoin.c could embed such a struct in MergeJoinClauseData. The btree stuff needs something more nearly equivalent to a ScanKey, including a datum-to-compare-to and a flags field. I'm inclined to think the latter would be too specialized to put in the generic struct. On the other hand, including the reverse and nulls_first flags in the generic struct is clearly a win since it allows ApplyComparator() to be defined as a generic function. So the only thing that's really debatable is the attno field, and I'm not anal enough to insist on a separate level of struct just for that. I am however inclined to stick with the shortened struct name SortSupport rather than using SortKey. The presence of the function pointer fields (especially the inlined-qsort pointers, assuming we adopt some form of Peter's patch) changes the struct's nature in my view; it's not really describing just a sort key (ie an ORDER BY column specification). regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ecmascript 5 DATESTYLE
On 12/6/2011 4:19 PM, Pavel Stehule wrote: it can be in 9.2 (if will be accepted) - it will be release at summer 2012 http://wiki.postgresql.org/wiki/Submitting_a_Patch Regards Pavel Stehule ok, so i assume your patch is now considered submitted and is waiting to be reviewed. i'll wait to see what happens. thanks, ben... -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ecmascript 5 DATESTYLE
2011/12/6 ben hockey neonstalw...@gmail.com: On 12/6/2011 4:19 PM, Pavel Stehule wrote: it can be in 9.2 (if will be accepted) - it will be release at summer 2012 http://wiki.postgresql.org/wiki/Submitting_a_Patch Regards Pavel Stehule ok, so i assume your patch is now considered submitted and is waiting to be reviewed. i'll wait to see what happens. not yet :) there should be a proposal, and maybe more hacking - documentation is missing, and there are no regression tests. It needs half a day. The patch that I sent you is just a WIP patch. Regards Pavel thanks, ben... -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] PostgreSQL 9.1 poster picture ?
Hi there, I'm looking for the picture of the PostgreSQL 9.1 poster, which we all signed at the Developers Meeting. Does anybody know where it is now? Regards, Oleg _ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ecmascript 5 DATESTYLE
i may have spoken a little too soon about the format being right... i just took a look at the postgres source code and it would need one more change to completely meet my needs. EncodeDateTime should put a 'Z' for UTC timezone rather than '+0'. with this being the case, do you think there would need to be an ECMA datestyle or would XSD be compatible with this change? i haven't touched any c code in quite a while but the changes look simple enough that i could work towards an ECMA patch if that's the best way to go about this. thanks, ben... -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
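The difference ben describes is small but concrete when formatting a UTC value; a minimal sketch of the two spellings (Python is used purely for illustration here - the actual change would be in EncodeDateTime):

```python
from datetime import datetime, timezone

dt = datetime(2011, 12, 6, 21, 46, 53, tzinfo=timezone.utc)

# XSD-style rendering of a UTC value: a numeric zone offset.
xsd = dt.isoformat()

# ECMA-262's Date.prototype.toISOString instead uses the literal
# 'Z' designator for UTC.
ecma = xsd.replace("+00:00", "Z")

print(xsd)
print(ecma)
```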
Re: [HACKERS] ecmascript 5 DATESTYLE
Hello 2011/12/6 ben hockey neonstalw...@gmail.com: i may have spoken a little too soon about the format being right... i just took a look at the postgres source code and it would need one more change to completely meet my needs. EncodeDateTime should put a 'Z' for UTC timezone rather than '+0'. with this being the case, do you think there would need to be an ECMA datestyle or would XSD be compatible with this change? probably we can't change the XSD format - but a new format that is exactly ECMA should not be a problem. i haven't touched any c code in quite a while but the changes look simple enough that i could work towards an ECMA patch if that's the best way to go about this. you can become a PostgreSQL hacker :) - anybody starts on simple jobs, and PostgreSQL hacking is a good lecture Pavel thanks, ben... -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inlining comparators as a performance optimisation
On Tue, Dec 6, 2011 at 4:23 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Tue, Dec 6, 2011 at 1:07 PM, Tom Lane t...@sss.pgh.pa.us wrote: I'll take another crack at it. I'm not entirely sold yet on merging the two structs; I think first we'd better look and see what the needs are in the other potential callers I mentioned. If we'd end up cluttering the struct with half a dozen weird fields, it'd be better to stick to a minimal interface struct with various wrapper structs, IMO. OK. I'll defer to whatever you come up with after looking at it. OK, it looks like nodeMergeAppend.c could use something exactly like the draft SortKey struct, while nodeMergejoin.c could embed such a struct in MergeJoinClauseData. The btree stuff needs something more nearly equivalent to a ScanKey, including a datum-to-compare-to and a flags field. I'm inclined to think the latter would be too specialized to put in the generic struct. On the other hand, including the reverse and nulls_first flags in the generic struct is clearly a win since it allows ApplyComparator() to be defined as a generic function. So the only thing that's really debatable is the attno field, and I'm not anal enough to insist on a separate level of struct just for that. I am however inclined to stick with the shortened struct name SortSupport rather than using SortKey. The presence of the function pointer fields (especially the inlined-qsort pointers, assuming we adopt some form of Peter's patch) changes the struct's nature in my view; it's not really describing just a sort key (ie an ORDER BY column specification). Works for me. I think we should go ahead and get this part committed first, and then we can look at the inlining stuff as a further optimization for certain cases... 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
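For readers following the thread, a rough analogue of the struct being discussed - a bare comparator plus reverse/nulls_first flags, with a generic apply function wrapped around it - can be sketched in Python. Names and layout here are illustrative only, not the actual C definitions:

```python
import functools
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class SortSupport:
    # Bare comparator: returns <0, 0, or >0 for the non-null case.
    comparator: Callable[[Any, Any], int]
    reverse: bool = False
    nulls_first: bool = False

def apply_comparator(datum1, datum2, ssup):
    # Generic wrapper: NULL ordering and sort direction live here,
    # so the comparator itself stays oblivious to both.
    if datum1 is None or datum2 is None:
        if datum1 is None and datum2 is None:
            return 0
        if datum1 is None:
            return -1 if ssup.nulls_first else 1
        return 1 if ssup.nulls_first else -1
    result = ssup.comparator(datum1, datum2)
    return -result if ssup.reverse else result

# Descending integer sort with NULLs last, driven by one generic path.
ssup = SortSupport(lambda a, b: (a > b) - (a < b), reverse=True)
ordered = sorted([3, 1, None, 2],
                 key=functools.cmp_to_key(lambda a, b: apply_comparator(a, b, ssup)))
print(ordered)
```

The point of the wrapper, as in the thread, is that putting the ordering flags next to the function pointer lets one generic apply routine serve every caller.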
Re: [HACKERS] pg_upgrade and relkind filtering
On Mon, Dec 5, 2011 at 5:06 PM, Bruce Momjian br...@momjian.us wrote: Pg_upgrade has the following check to make sure the cluster is safe for upgrading: res = executeQueryOrDie(conn, SELECT n.nspname, c.relname, a.attname FROM pg_catalog.pg_class c, pg_catalog.pg_namespace n, pg_catalog.pg_attribute a WHERE c.oid = a.attrelid AND NOT a.attisdropped AND a.atttypid IN ( 'pg_catalog.regproc'::pg_catalog.regtype, 'pg_catalog.regprocedure'::pg_catalog.regtype, 'pg_catalog.regoper'::pg_catalog.regtype, 'pg_catalog.regoperator'::pg_catalog.regtype, /* regclass.oid is preserved, so 'regclass' is OK */ /* regtype.oid is preserved, so 'regtype' is OK */ 'pg_catalog.regconfig'::pg_catalog.regtype, 'pg_catalog.regdictionary'::pg_catalog.regtype) AND c.relnamespace = n.oid AND n.nspname != 'pg_catalog' AND n.nspname != 'information_schema'); Based on a report from EnterpriseDB, I noticed that we check all pg_class entries, while there are cases where this is unnecessary because there is no data behind the entry, e.g. views. Here are the relkinds supported: #define RELKIND_RELATION 'r' /* ordinary table */ #define RELKIND_INDEX 'i' /* secondary index */ #define RELKIND_SEQUENCE 'S' /* sequence object */ #define RELKIND_TOASTVALUE 't' /* for out-of-line values */ #define RELKIND_VIEW 'v' /* view */ #define RELKIND_COMPOSITE_TYPE 'c' /* composite type */ #define RELKIND_FOREIGN_TABLE 'f' /* foreign table */ #define RELKIND_UNCATALOGED 'u' /* not yet cataloged */ What types, other than views, can we skip in this query? It's not obvious to me that anything other than a table or index would matter. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and relkind filtering
Robert Haas wrote: On Mon, Dec 5, 2011 at 5:06 PM, Bruce Momjian br...@momjian.us wrote: Pg_upgrade has the following check to make sure the cluster is safe for upgrading:

res = executeQueryOrDie(conn,
        SELECT n.nspname, c.relname, a.attname
        FROM   pg_catalog.pg_class c,
               pg_catalog.pg_namespace n,
               pg_catalog.pg_attribute a
        WHERE  c.oid = a.attrelid AND
               NOT a.attisdropped AND
               a.atttypid IN (
                   'pg_catalog.regproc'::pg_catalog.regtype,
                   'pg_catalog.regprocedure'::pg_catalog.regtype,
                   'pg_catalog.regoper'::pg_catalog.regtype,
                   'pg_catalog.regoperator'::pg_catalog.regtype,
                   /* regclass.oid is preserved, so 'regclass' is OK */
                   /* regtype.oid is preserved, so 'regtype' is OK */
                   'pg_catalog.regconfig'::pg_catalog.regtype,
                   'pg_catalog.regdictionary'::pg_catalog.regtype) AND
               c.relnamespace = n.oid AND
               n.nspname != 'pg_catalog' AND
               n.nspname != 'information_schema');

Based on a report from EnterpriseDB, I noticed that we check all pg_class entries, while there are cases where this is unnecessary because there is no data behind the entry, e.g. views. Here are the relkinds supported:

#define RELKIND_RELATION        'r'   /* ordinary table */
#define RELKIND_INDEX           'i'   /* secondary index */
#define RELKIND_SEQUENCE        'S'   /* sequence object */
#define RELKIND_TOASTVALUE      't'   /* for out-of-line values */
#define RELKIND_VIEW            'v'   /* view */
#define RELKIND_COMPOSITE_TYPE  'c'   /* composite type */
#define RELKIND_FOREIGN_TABLE   'f'   /* foreign table */
#define RELKIND_UNCATALOGED     'u'   /* not yet cataloged */

What types, other than views, can we skip in this query? It's not obvious to me that anything other than a table or index would matter. Well, I assume the composite type could be referenced by another table, and the foreign table might have data stored in it that is now invalid. Toast and sequences are probably safely skipped, but also probably never a problem to check. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
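A hypothetical sketch of what the resulting filter might look like. The relkind set chosen here follows Bruce's reasoning (tables and indexes clearly matter; composite types and foreign tables are suspect; views, TOAST, and sequences look skippable), but it is a guess at the outcome, not the thread's settled answer:

```python
# Relkinds from pg_class, per the definitions quoted above.
RELKINDS = {
    'r': "ordinary table",
    'i': "secondary index",
    'S': "sequence object",
    't': "for out-of-line values (TOAST)",
    'v': "view",
    'c': "composite type",
    'f': "foreign table",
    'u': "not yet cataloged",
}

# Hypothetical set of relkinds pg_upgrade would still need to check.
CHECK = {'r', 'i', 'c', 'f'}

# The extra WHERE clause such a filter would add to the query above.
clause = "AND c.relkind IN (%s)" % ", ".join(
    "'%s'" % k for k in sorted(CHECK))
print(clause)
```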
Re: [HACKERS] Large number of open(2) calls with bulk INSERT into empty table
On Tuesday, December 06, 2011 08:53:42 PM Robert Haas wrote: On Tue, Dec 6, 2011 at 7:12 AM, Florian Weimer fwei...@bfk.de wrote: * Robert Haas: I tried whacking out the call to GetPageWithFreeSpace() in RelationGetBufferForTuple(), and also with the unpatched code, but the run-to-run randomness was way more than any difference the change made. Is there a better test case? I think that if you want to exercise file system lookup performance, you need a larger directory, which presumably means a large number of tables. OK. I created 100,000 dummy tables, 10,000 at a time avoid blowing up the lock manager. I then repeated my previous tests, and I still can't see any meaningful difference (on my MacBook Pro, running MacOS X v10.6.8). So at least on this OS, it doesn't seem to matter much. I'm inclined to defer putting any more work into it until such time as someone can demonstrate that it actually causes a problem and provides a reproducible test case. I don't deny that there's probably an effect and it would be nice to improve this, but it doesn't seem worth spending a lot of time on until we can find a case where the effect is measurable. I think, if at all, you're going to notice differences at high concurrency, because then you would start to hit the price of synchronizing the dcache between cpu cores in the kernel. Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inlining comparators as a performance optimisation
On 7 December 2011 00:18, Robert Haas robertmh...@gmail.com wrote: Works for me. I think we should go ahead and get this part committed first, and then we can look at the inlining stuff as a further optimization for certain cases... Do you mean just inlining, or inlining and the numerous other optimisations that my patch had? -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_stat_statements with query tree based normalization
On 14 November 2011 04:42, Greg Smith g...@2ndquadrant.com wrote: The approach Peter used adds a single integer to the Const structure in order to have enough information to substitute ? in place of those. Adding and maintaining that is the only change outside of the extension made here, and that overhead is paid by everyone--not just consumers of this new code. I've attempted to isolate that overhead, so far unsuccessfully. Attached are: 1. A simple python + psycopg2 script for repeatedly running a succession of similar queries that explain would show as containing a single Result node. They contain 300 Const integer nodes by default, which are simply selected. 2. The results of running the script on Greg's server, which has CPU frequency scaling disabled. That's an ODS spreadsheet. Out of consideration of filesize, I've deleted the query column in each sheet, which wasn't actually useful information. The results are...taking the median value of each set of runs as representative, my patch appears to run marginally faster than head. Of course, there is no reason to believe that it should, and I'm certain that the difference can be explained by noise, even though I've naturally strived to minimise noise. If someone could suggest a more telling test case, or even a worst-case, that would be useful. This was just my first run at this. I know that the overhead will also exist in code not well-exercised by these queries, but I imagine that any real-world query that attempts to exercise them all is going to add other costs that dwarf the additional overhead and further muddy the waters. I intend to work through the known issues with this patch in the next couple of days. 
-- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services

#!/usr/bin/env python
# Performance test for pg_stat_statements normalization
# Peter Geoghegan

# This test is intended to isolate the overhead of additional
# infrastructure in the grammar and plan tree, which is used to
# add a field to Const nodes so that the corresponding lexeme's
# length is known from the query tree.

# It is intended to be run without pg_stat_statements, because
# the really pertinent issue is if these changes impose any
# noticeable overhead on Postgres users that don't use
# pg_stat_statements, and the degree thereof.

# For best results, run with CPU frequency scaling disabled.

import psycopg2
import random
import time
import csv
from optparse import OptionParser

# store results of a given run in a dedicated csv file
def serialize_to_file(times, filename):
    wrt = csv.writer(open(filename, 'wb'), delimiter=',')
    # mark median run for runs of this query (or if there is an
    # even number of elements, near enough)
    median_i = (len(times) + 1) / 2 - 1
    for i, v in enumerate(times):
        wrt.writerow([v[0], time.ctime(v[1]), str(v[2]) + " seconds",
                      '*' if i == median_i else 'n'])

def run_test(conn, num_its, num_const_nodes):
    # A very unsympathetic query here is one with lots of Const nodes
    # that explain shows as a single Result node.  This is because
    # parsing has a large overhead relative to planning and execution,
    # and there is an unusually high number of Const nodes.
    # Use pseudo-random numbers with a consistent seed value - numbers
    # used are deterministic for absolute consistency, though I don't
    # believe that to be significant, at least for now.
    random.seed(55)
    cur = conn.cursor()
    times = []
    for i in range(0, num_its):
        # Generate new query with pseudo-random integer Const nodes
        qry = "select "
        for i in range(0, num_const_nodes):
            n = random.randint(0, 1000)
            qry += str(n) + (", " if i != num_const_nodes - 1 else ";")
        begin = time.time()
        cur.execute(qry)
        end = time.time()
        elapsed = end - begin
        times.append((qry, begin, elapsed))
    # Sort values for reference, and to locate the median value
    sort_vals = sorted(times, key=lambda tot_time: tot_time[2])
    serialize_to_file(sort_vals, "test_results.csv")

def main():
    parser = OptionParser(description="")
    parser.add_option('-c', '--conninfo', type=str,
        help="libpq-style connection info string of database to connect to. "
             "Can be omitted, in which case we get details from our "
             "environment. You'll probably want to put this in double-quotes, "
             "like this: --conninfo \"hostaddr=127.0.0.1 port=5432 "
             "dbname=postgres user=postgres\".",
        default="")
    parser.add_option('-n', '--num_its', type=int,
        help="Number of iterations (times a query is executed)",
        default=5000)
    parser.add_option('-s', '--num_const_nodes', type=int,
        help="Number of Const nodes that each query contains",
        default=300)
    args = parser.parse_args()[0]
    conn_str = args.conninfo
    num_its = args.num_its
    num_const_nodes = args.num_const_nodes
    conn = psycopg2.connect(conn_str)
    run_test(conn, num_its, num_const_nodes)

if __name__ == "__main__":
    main()

field_addition_results.ods Description: application/vnd.oasis.opendocument.spreadsheet --
[HACKERS] Timing overhead and Linux clock sources
Over in the add timing of buffer I/O requests thread I mentioned having a system where EXPLAIN ANALYZE of a modest COUNT(*) takes 10X as long as just executing the statement. Attached is a bit of SQL and a script that runs it multiple times that demonstrate the problem on systems that have it. It assumes you've installed the Dell Store 2 database. I'd expect it to work with simpler example too (just create something with 20,000 rows), but I haven't actually confirmed that yet. From what I've been able to measure so far, here's the state of things: -If you have a system with a working TSC clock source (timing data is pulled right from the CPU), timing overhead is reasonable enough that you might turn it on even for things that happen frequently, such as the buffer I/O timing patch enables. -Some systems have tsc, but it doesn't work right in multi-core setups. Newer kernel versions know this and disable it, older ones let you pick tsc anyway and bad things occur. -Using any of the other timing mechanism--hpt, acpi_pm, and jiffies--has extremely high timing overhead. This doesn't conflict with the comment we got from Martijn van Oosterhout in the other thread: Learned the hard way while tracking clock-skew on a multicore system. The hpet may not be the fastest (that would be the cpu timer), but it's the fastest (IME) that gives guaranteed monotonic time. There's more background about TSC, HPET, and an interesting caveat about VMs I came across: http://en.wikipedia.org/wiki/Time_Stamp_Counter http://en.wikipedia.org/wiki/High_Precision_Event_Timer http://kb.vmware.com/selfservice/microsites/search.do?language=en_UScmd=displayKCexternalId=1007020 I'm starting to envision what a new documentation section discussing timer overhead would look like now, for Linux at least. And I'm guessing other people can simulate this problem even on hardware that doesn't have it in the default config, simply by manually switching to one of the slower timing methods at boot. 
The test case is just doing this:

\timing
select count(*) from customers;
explain analyze select count(*) from customers;

Lots of times, with some patterns to try and avoid caching effects. The system with the worst clock timing issues I've found, the one that was so obvious it kicked off this investigation, is my Thinkpad T60 laptop. The overhead is worst on the old 32-bit Ubuntu 9.04 installation I have there. Here's some info about the system, starting with the 4 clock sources it supports:

$ cat /sys/devices/system/clocksource/clocksource0/available_clocksource
hpet acpi_pm jiffies tsc
$ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
hpet
$ uname -a
Linux meddle 2.6.28-19-generic #61-Ubuntu SMP Wed May 26 23:35:15 UTC 2010 i686 GNU/Linux
$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=9.04
DISTRIB_CODENAME=jaunty
DISTRIB_DESCRIPTION="Ubuntu 9.04"
$ psql -c "select version()"
PostgreSQL 8.3.11 on i486-pc-linux-gnu, compiled by GCC gcc-4.3.real (Ubuntu 4.3.3-5ubuntu4) 4.3.3

You change the time source by adding a stanza like this to the kernel boot options:

clocksource=acpi_pm

Martijn said you can just cat a new clock source into the setting instead; I haven't evaluated that yet. I wanted to get my baselines done in a situation where the kernel boots using the one I wanted, which turns out to be useful for a number of reasons. Boot time is one of the things that varied a lot when the timer was switched. The hpet time source is the default on my T60 system, but there are 3 others available. Of those, it appears that only acpi_pm is a viable alternative. I gave up on the system booting after 15 minutes when using jiffies. I could boot with tsc, but everything was erratic. I could barely use the mouse for example. tsc did have a much faster clock as measured by EXPLAIN overhead though.
Here are the results, two runs of the test program that I've de-interleaved so it's a block of fast (count only) runs then a block of slow (with EXPLAIN ANALYZE):

Time Source: hpet
Time: 14.783 ms
Time: 9.021 ms
Time: 9.170 ms
Time: 8.634 ms
Time: 5.632 ms
Time: 5.520 ms
Time: 15.276 ms
Time: 9.477 ms
Time: 9.485 ms
Time: 70.720 ms
Time: 67.577 ms
Time: 57.959 ms
Time: 74.661 ms
Time: 73.557 ms
Time: 74.963 ms
Time: 87.898 ms
Time: 82.125 ms
Time: 81.443 ms

Time Source: acpi_pm
Time: 8.124 ms
Time: 5.020 ms
Time: 4.887 ms
Time: 8.125 ms
Time: 5.029 ms
Time: 4.881 ms
Time: 82.986 ms
Time: 82.366 ms
Time: 82.609 ms
Time: 83.089 ms
Time: 82.438 ms
Time: 82.539 ms

Time Source: tsc
Time: 8.371 ms
Time: 4.673 ms
Time: 4.901 ms
Time: 8.409 ms
Time: 4.943 ms
Time: 4.722 ms
Time: 16.436 ms
Time: 16.349 ms
Time: 16.139 ms
Time: 19.871 ms
Time: 17.175 ms
Time: 16.540 ms

There's the 10:1 possible slowdown I was talking about when using the default hpet clock, while the faster (but fundamentally unstable) tsc clock does better, only around a 4:1 slowdown there. First
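The shape of that overhead is easy to reproduce in miniature. The sketch below is plain Python with nothing PostgreSQL-specific about it; `time.perf_counter` stands in for whatever clock source the kernel exposes, and the second loop adds a timestamp pair per iteration the way per-node instrumentation does:

```python
import time

N = 200000

# Baseline: a loop that just does trivial work.
start = time.perf_counter()
total = 0
for i in range(N):
    total += i
plain = time.perf_counter() - start

# Instrumented: the same work plus an "entry" and "exit" clock read
# per iteration, which is what EXPLAIN ANALYZE effectively pays.
start = time.perf_counter()
total = 0
for i in range(N):
    t0 = time.perf_counter()
    total += i
    _ = time.perf_counter() - t0
instrumented = time.perf_counter() - start

print("slowdown: %.1fx" % (instrumented / plain))
```

How big the slowdown factor is depends almost entirely on how expensive the clock read is, which is exactly the hpet/acpi_pm/tsc difference shown above.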
Re: [HACKERS] Large number of open(2) calls with bulk INSERT into empty table
On Tue, Dec 6, 2011 at 8:12 PM, Andres Freund and...@anarazel.de wrote: On Tuesday, December 06, 2011 08:53:42 PM Robert Haas wrote: On Tue, Dec 6, 2011 at 7:12 AM, Florian Weimer fwei...@bfk.de wrote: * Robert Haas: I tried whacking out the call to GetPageWithFreeSpace() in RelationGetBufferForTuple(), and also with the unpatched code, but the run-to-run randomness was way more than any difference the change made. Is there a better test case? I think that if you want to exercise file system lookup performance, you need a larger directory, which presumably means a large number of tables. OK. I created 100,000 dummy tables, 10,000 at a time avoid blowing up the lock manager. I then repeated my previous tests, and I still can't see any meaningful difference (on my MacBook Pro, running MacOS X v10.6.8). So at least on this OS, it doesn't seem to matter much. I'm inclined to defer putting any more work into it until such time as someone can demonstrate that it actually causes a problem and provides a reproducible test case. I don't deny that there's probably an effect and it would be nice to improve this, but it doesn't seem worth spending a lot of time on until we can find a case where the effect is measurable. I think if at all youre going to notice differences at a high concurrency because you then would start to hit the price of synchronizing the dcache between cpu cores in the kernel. Well, if the premise is that the table has been truncated in the same transaction, then it's going to be tough to get high concurrency. Maybe you could do it with multiple tables or with without truncation, but either way I think you're going to be primarily limited by I/O bandwidth or WALInsertLock contention, not kernel dcache synchronization. I might be wrong, of course, but that's what I think. I'm not saying this isn't worth improving, just that I don't see it as a priority for me personally to spend time on right now. 
If you or someone else wants to beat on it, or even just come up with a test case, great! -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Timing overhead and Linux clock sources
On Tue, Dec 6, 2011 at 9:58 PM, Greg Smith g...@2ndquadrant.com wrote: - If you have a system with a working TSC clock source (timing data is pulled right from the CPU), timing overhead is reasonable enough that you might turn it on even for things that happen frequently, such as what the buffer I/O timing patch enables. Even the TSC stuff looks expensive enough that you wouldn't want to pay the full overhead all the time on a busy system, but of course we probably wouldn't want to do that anyway. EXPLAIN ANALYZE is extremely expensive mostly because it's timing entry and exit into every plan node, and the way our executor works, those are very frequent operations. But you could probably gather more coarse-grained statistics, like separating parse, plan, and execute time for each query, without breaking a sweat. I'm not sure about buffer I/Os - on a big sequential scan, you might do quite a lot of those in a pretty tight loop. That's not an argument against adding the option, though, assuming that the default setting is off. And, certainly, I agree with you that it's worth trying to document some of this stuff so that people don't have to try to figure it out themselves (uggh!). One random thought: I wonder if there's a way for us to just time every N'th event or something like that, to keep the overhead low. The problem is that you might not get accurate results if, say, every 2N'th event takes much longer than normal - you'll either hit all the long ones, or miss them all. You could fix that by using a pseudorandom number generator to decide whether to time each event, but that's got its own overhead... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Inlining comparators as a performance optimisation
On Tue, Dec 6, 2011 at 8:13 PM, Peter Geoghegan pe...@2ndquadrant.com wrote: On 7 December 2011 00:18, Robert Haas robertmh...@gmail.com wrote: Works for me. I think we should go ahead and get this part committed first, and then we can look at the inlining stuff as a further optimization for certain cases... Do you mean just inlining, or inlining and the numerous other optimisations that my patch had? Whichever you like. But I think part of the point here is to disentangle those optimizations from each other and decide how broadly it makes sense to apply each one. Avoiding the FunctionCallInfo stuff is one, and it seems like we can apply that to a wide variety of data types (maybe all of them) for both in-memory and on-disk sorting, plus btree index ops, merge joins, and merge append. The gains may be modest, but they will benefit many use cases. Your original patch targets a much narrower use case (in-memory sorting of POD types) but the benefits are larger. We don't have to pick between a general but small optimization and a narrower but larger one; we can do both. In this regard, I think Heikki's remarks upthread are worth some thought. If inlining is a win just because it avoids saving and restoring registers or allows better instruction scheduling, then inlining is (probably?) the only way to get the benefit. But if most of the benefit is in having a separate path for the single-sort-key case, we can do that without duplicating the qsort() code and get the benefit for every data type without much code bloat. I'd like to see us dig into that a little, so that we get the broadest possible benefit out of this work. It doesn't bother me that not every optimization will apply to every case, and I don't object to optimizations that are intrinsically narrow (within some reasonable limits). But I'd rather not take what could be a fairly broad-based optimization and apply it only narrowly, all things being equal.
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] pg_upgrade and relkind filtering
Robert Haas robertmh...@gmail.com writes: On Mon, Dec 5, 2011 at 5:06 PM, Bruce Momjian br...@momjian.us wrote: Pg_upgrade has the following check to make sure the cluster is safe for upgrading: What types, other than views, can we skip in this query? It's not obvious to me that anything other than a table or index would matter. You'd better complain about composite types too, since one of them could be a column in a table. (Unless you want to test to see whether it actually is stored anywhere, but that seems like way overkill for this.) regards, tom lane
Re: [HACKERS] pgsql_fdw, FDW for PostgreSQL server
Sorry for the delayed response. 2011/11/29 Albe Laurenz laurenz.a...@wien.gv.at: I think that this is not always safe even from PostgreSQL to PostgreSQL. If two databases have different collations, comparisons on strings will behave differently. Indeed. I think that only the owner of a foreign table can keep collation consistent between foreign and local, like the data type of a column. We need to support per-column collation on foreign tables too, or should deny pushing down conditions which are collation-sensitive... Regards, -- Shigeru Hanada
Re: [HACKERS] Timing overhead and Linux clock sources
On 12/06/2011 10:20 PM, Robert Haas wrote: EXPLAIN ANALYZE is extremely expensive mostly because it's timing entry and exit into every plan node, and the way our executor works, those are very frequent operations. The plan for the query I was timing looks like this: Aggregate (cost=738.00..738.01 rows=1 width=0) (actual time=3.045..3.045 rows=1 loops=1) -> Seq Scan on customers (cost=0.00..688.00 rows=20000 width=0) (actual time=0.002..1.700 rows=20000 loops=1) That's then 2 * 20000 timing calls for the Seq Scan dominating the runtime. On the system with fast TSC, the fastest execution was 1.478ms, the slowest with timing 2.945ms. That's 1.467ms of total timing overhead, worst-case, so approximately 37ns per timing call. If you're executing something that is only ever hitting data in shared_buffers, you can measure that; any other case, probably not. Picking apart the one with the slow clock source on my laptop, fastest is 5.52ms, and the fastest with timing is 57.959ms. That makes for a minimum of 1311ns per timing call, best-case. I'm not sure about buffer I/Os - on a big sequential scan, you might do quite a lot of those in a pretty tight loop. To put this into perspective relative to the number of EXPLAIN timing calls, there are 488 pages in the relation my test is executing against. I think we need to be careful to keep timing calls from happening at every buffer allocation. I wouldn't expect sprinkling one around every buffer miss would be a problem on a system with a fast clock source. And that is what was shown by the testing Ants Aasma did before submitting the "add timing of buffer I/O requests" patch; his results make more sense to me now. He estimated 22ns per gettimeofday on the system with fast timing calls--presumably using TSC, and possibly faster than I saw because his system had fewer cores than mine to worry about. He got 990 ns on his slower system, and a worst case there of 3% overhead.
Whether people who are on one of these slower timing call systems would be willing to pay 3% overhead is questionable. But I now believe Ants's claim that it's below the noise level on ones with a good TSC-driven timer. I got a 35:1 ratio between fast and slow clock sources, he got 45:1. If we try to estimate the timing overhead that is too small to measure, I'd guess it's ~3% / 30 = 0.1%. I'd just leave that on all the time on a good TSC-driven system. Buffer hits and tuple-level operations you couldn't afford to time, but just about anything else would be fine. One random thought: I wonder if there's a way for us to just time every N'th event or something like that, to keep the overhead low. I'm predicting we'll get a lot of future demand for instrumentation features like this, where we want to make them available but would like to keep them from happening too often when the system is busy. Tossing a percentage of them might work. Caching them in a queue somewhere for processing by a background process, and not collecting the data if that queue fills, is another idea I've been thinking about recently. I'm working on some ideas for making "is the server busy?" something you can ask the background writer usefully too. There's a number of things that become practical for that process to do, when it's decoupled from doing the checkpoint sync job so its worst-case response time is expected to tighten up. -- Greg Smith 2ndQuadrant US g...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us