Re: [HACKERS] Adding new joining algorithm to postgresql
On Friday, July 19, 2013 7:17 PM tubadzin wrote:
> Hi. I'm a little confused.
> 1. I have the 9.2.4 source code from http://www.postgresql.org/ftp/source/
> 2. I want to add a new algorithm to index nested loop join, merge join and hash join.
> I have the executor directory in src, containing nodeHash.c, nodeHashjoin.c, nodeMergejoin.c and nodeNestloop.c.
> 3. After the changes, I want to compile PostgreSQL and use it.
> 4. The problem is:
> a) I do not know which library is responsible for this functionality.
> I understand that I have to compile the source and replace the library (I don't know which one) in the path where PostgreSQL is installed: C:\Program Files (x86)\PostgreSQL\9.2

I think you would need to copy postgres.exe. Ideally you need to copy all the libraries that changed due to your source code change. The link below even covers how to create an installation from source.

> b) I don't know which files/libraries to use with Visual Studio 2010 or how to compile them.

Find the instructions for building on Windows at the link below: http://www.postgresql.org/docs/devel/static/install-windows.html

With Regards,
Amit Kapila.

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: [COMMITTERS] pgsql: Allow background workers to be started dynamically.
Andres Freund escribió:
> On 2013-07-19 08:23:25 -0400, Robert Haas wrote:
> > And I'd also propose getting rid of bgw_sighup and bgw_sigterm in both branches, while we're at it. AFAICT, they don't add any functionality, and they're basically unusable for dynamically started background workers. Probably better not to get people used to using them.
>
> I don't have a problem with getting rid of those, it's easy enough to register them inside the worker and it's safe since we start with blocked signals. I seem to remember some discussion about why they were added but I can't find a reference anymore. Alvaro, do you remember?

I left them there because it was easy; but they were absolutely necessary only until we decided that we would start the worker's main function with signals blocked. I don't think there's any serious reason not to remove them now.

-- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CREATE EVENT TRIGGER syntax
Joe Abbate writes:
> What is the purpose of the [ AND ... ] at the end of the WHEN clause? Is that for later releases, when presumably additional filter_variables will be introduced? Right now, if I add "AND tag IN ..." I get an ERROR: filter variable "tag" specified more than once.

Yes. I had other filter variables in some versions of the patch, but we have yet to agree on a design for the things I wanted to solve with them.

See http://www.postgresql.org/message-id/m2txrsdzxa@2ndquadrant.fr for a worked-out example of the CONTEXT part of the Event Trigger proposal.

Regards,
-- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [ODBC] [HACKERS] getting rid of SnapshotNow
(2013/07/20 1:11), Tom Lane wrote: > Andres Freund writes: >> On 2013-07-20 00:49:11 +0900, Hiroshi Inoue wrote: >>> Using SnapshotSelf instead of SnapshotNow for currtid_ () wouldn't >>> matter. > >> I think it actually might. You could get into dicey situations if you >> use currtid_ in a query performing updates or inserts because it would >> see the to-be-inserted tuple... > > I'm pretty sure Hiroshi-san was only opining about whether it would > matter for ODBC's usage. IIUC, ODBC is using this function to re-fetch > rows that it inserted, updated, or at least selected-for-update in a > previous command of the current transaction, so actually any snapshot > would do fine. > > In any case, since I moved the goalposts by suggesting that SnapshotSelf > is just as dangerous as SnapshotNow, what we need to know is whether > it'd be all right to change this code to use a fresh MVCC snapshot; > and if not, why not. It's pretty hard to see a reason why client-side > code would want to make use of the results of a non-MVCC snapshot. OK I agree to replace SnapshotNow for currtid_xx() by a MVCC-snapshot. regards, Hiroshi Inoue -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Preventing tuple-table leakage in plpgsql
Noah Misch writes: > On Fri, Jul 05, 2013 at 02:47:06PM -0400, Tom Lane wrote: >> So I'm inclined to propose that SPI itself should offer some mechanism >> for cleaning up tuple tables at subtransaction abort. We could just >> have it automatically throw away tuple tables made in the current >> subtransaction, or we could allow callers to exercise some control, >> perhaps by calling a function that says "don't reclaim this tuple table >> automatically". I'm not sure if there's any real use-case for such a >> call though. > I suppose that would be as simple as making spi_dest_startup() put the > tuptabcxt under CurTransactionContext? The function to prevent reclamation > would then just do a MemoryContextSetParent(). I experimented with this, and found out that it's not quite that simple. In a SPI procedure that hasn't created a subtransaction (eg, a plpgsql function without an exception block), if we attach tuple tables to the outer transaction's CurTransactionContext then they fail to go away during SPI_finish(), creating leaks in code that was previously correct. However, we can use your idea when running inside a subtransaction, while still attaching the tuple table to the procedure's own procCxt when no subtransaction is involved. The attached draft patch does it that way, and successfully resolves the complained-of leakage case. I like this better than what I originally had in mind, because it produces no change in semantics in the case where a SPI procedure doesn't create any subtransactions, which I imagine is at least 99.44% of third-party SPI-using code. Also, because the tuple tables don't go away until CleanupSubTransaction(), code that explicitly frees them will still work as long as it does that before rolling back the subxact. Thus, the patch's changes to remove SPI_freetuptable() calls in plpy_cursorobject.c are not actually necessary for correctness, it's just that we no longer need them. Ditto for the change in AtEOSubXact_SPI. Unfortunately, the change in pl_exec.c *is* necessary to avoid a coredump, because that call was being made after rolling back the subxact. All in all, the risk of breaking anything outside core code seems very small, and I'm inclined to think that we don't need to provide an API function to reparent a tuple table. But having said that, I'm also inclined to not back-patch this further than 9.3, just in case. The patch still requires documentation changes, and there's also one more error-cleanup SPI_freetuptable() call that ought to go away, in PLy_spi_execute_fetch_result. But that one needs the fix posited in http://www.postgresql.org/message-id/21017.1374273...@sss.pgh.pa.us first. Comments? regards, tom lane diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c index f30b2cd93330d72e0a412d6517aec08c96cc753e..3f4d1e625d738ad0c08d0245a0169451d29a631d 100644 *** a/src/backend/executor/spi.c --- b/src/backend/executor/spi.c *** AtEOSubXact_SPI(bool isCommit, SubTransa *** 284,291 { /* free Executor memory the same as _SPI_end_call would do */ MemoryContextResetAndDeleteChildren(_SPI_current->execCxt); ! /* throw away any partially created tuple-table */ ! SPI_freetuptable(_SPI_current->tuptable); _SPI_current->tuptable = NULL; } } --- 284,291 { /* free Executor memory the same as _SPI_end_call would do */ MemoryContextResetAndDeleteChildren(_SPI_current->execCxt); ! /* forget any partially created tuple-table */ ! 
/* (we don't need to free it, subxact cleanup will do so) */ _SPI_current->tuptable = NULL; } } *** void *** 1641,1648 spi_dest_startup(DestReceiver *self, int operation, TupleDesc typeinfo) { SPITupleTable *tuptable; ! MemoryContext oldcxt; MemoryContext tuptabcxt; /* * When called by Executor _SPI_curid expected to be equal to --- 1641,1649 spi_dest_startup(DestReceiver *self, int operation, TupleDesc typeinfo) { SPITupleTable *tuptable; ! MemoryContext parentcxt; MemoryContext tuptabcxt; + MemoryContext oldcxt; /* * When called by Executor _SPI_curid expected to be equal to *** spi_dest_startup(DestReceiver *self, int *** 1656,1669 if (_SPI_current->tuptable != NULL) elog(ERROR, "improper call to spi_dest_startup"); ! oldcxt = _SPI_procmem(); /* switch to procedure memory context */ ! tuptabcxt = AllocSetContextCreate(CurrentMemoryContext, "SPI TupTable", ALLOCSET_DEFAULT_MINSIZE, ALLOCSET_DEFAULT_INITSIZE, ALLOCSET_DEFAULT_MAXSIZE); ! MemoryContextSwitchTo(tuptabcxt); _SPI_current->tuptable = tuptable = (SPITupleTable *) palloc(sizeof(SPITupleTable)); --- 1657,1681 if (_SPI_current->tuptable != NULL) elog(ERROR, "improper call to spi_dest_startup"); ! /* ! * If we're in the same subtransaction our SPI procedure was called in, ! * attach the tuple tab
Re: [HACKERS] Proposal: template-ify (binary) extensions
Markus Wanner writes: >> - per-installation (not even per-cluster) DSO availability >> >> If you install PostGIS 1.5 on a system, then it's just impossible to >> bring another cluster (of the same PostgreSQL major version), let > On Debian, that should be well possible. Certainly installing Postgres > 9.1 w/ postgis-1.5 in parallel to Postgres 9.2 w/ postgis-2.0 is. I > designed it to be. > > I think I'm misunderstanding the problem statement, here. (of the same PostgreSQL major version) > Can CREATE EXTENSION check if the standbys have the extension installed? > And refuse creation, if they don't? No, because we don't register standbies so we don't know who they are, and also because some things that we see connected and using the replication protocol could well be pg_basebackup or pg_receivexlog. Also, it's possible that the standby is only there for High Availability purposes and runs no user query. > I'm sure you are aware that even without this clear separation of roles, > the guarantee means we provide an additional level of security against > attackers. Given lo_import() to upload a file from the client to the server then LOAD with the absolute path where the file ended up imported (or any untrusted PL really), this argument carries no sensible weight in my opinion. > None the less, the "safe by default" has served us well, I think. That's true. We need to consider carefully the proposal at hand though. It's all about allowing the backend to automatically load a file that it finds within its own $PGDATA so that we can have per-cluster and per-database modules (DSO files). The only difference with before is the location where the file is read from, and the main security danger comes from the fact that we used to only consider root-writable places and now propose to consider postgres bootstrap user writable places. Having the modules in different places in the system when it's a template and when it's instanciated allows us to solve a problem I forgot to list: - upgrading an extension at the OS level Once you've done that, any new backend will load the newer module (DSO file), so you have to be real quick if installing an hot fix in production and the SQL definition must be changed to match the new module version… With the ability to "instanciate" the module in a per-cluster per-database directory within $PGDATA the new version of the DSO module would only put in place and loaded at ALTER EXTENSION UPDATE time. I'm still ok with allowing to fix those problems only when a security option that defaults to 'false' has been switched to 'true', by the way, so that it's an opt-in, but I will admit having a hard time swallowing the threat model we're talking about… >> I don't think the relaxed behaviour we're talking about is currently >> possible to develop as an extension, by the way. > > It's extensions that undermine the guarantee, at the moment. But yeah, > it depends a lot on what kind of "relaxed behavior" you have in mind. The ability to load a module from another place than the current Dynamic_library_path is what we're talking about here, and IIUC Robert mentioned that maybe it could be down to an extension to allow for changing the behavior. I didn't look at that in details but I would be surprised to be able to tweak that without having to patch the backend. > If the sysadmin wants to disallow arbitrary execution of native code to > postgres (the process), any kind of embedded compiler likely is equally > unwelcome. 
You already mentioned untrusted PL languages, and I don't see any difference in between offering PL/pythonu and PL/C on security grounds, really. I don't think we would consider changing the current model of the "LANGUAGE C" PL, we would add a new "LANGUAGE PL/C" option, either untrusted or with a nice sandbox. The problem of sandboxing PL/C is that it makes it impossible to address such use cases as uuid or PostGIS extensions even if it still allows for hstore or other datatypes styles of extensions. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Isn't PLy_spi_execute_fetch_result completely broken?
This function appears to believe that it can PG_CATCH any random error and then just carry on, without doing a subtransaction cleanup. This is as wrong as can be. It might be all right if we had sufficiently narrow constraints on what could happen inside the PG_TRY block, but in point of fact it's calling datatype input functions, which could do anything whatsoever. I think it needs to do PG_RE_THROW() not just "return NULL" at the end of the CATCH. It looks like both of the existing callers have cleanup logic, so this should be sufficient. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GSOC13 proposal - extend RETURNING syntax
New version: - fix returning "after" values if there are not "before" - add more regression tests I'd like to hear/read any feedback ;) Regards, Karol diff --git a/doc/src/sgml/ref/update.sgml b/doc/src/sgml/ref/update.sgml index 90b9208..eba35f0 100644 --- a/doc/src/sgml/ref/update.sgml +++ b/doc/src/sgml/ref/update.sgml @@ -194,12 +194,27 @@ UPDATE [ ONLY ] table_name [ * ] [ output_expression - An expression to be computed and returned by the UPDATE - command after each row is updated. The expression can use any - column names of the table named by table_name - or table(s) listed in FROM. - Write * to return all columns. + An expression to be computed and returned by the + UPDATE command either before or after (prefixed with + BEFORE. and AFTER., + respectively) each row is updated. The expression can use any + column names of the table named by table_name or table(s) listed in + FROM. Write AFTER.* to return all + columns after the update. Write BEFORE.* for all + columns before the update. Write * to return all + columns after update and all triggers fired (these values are in table + after command). You may combine BEFORE, AFTER and raw columns in the + expression. + + Mixing table names or aliases named before or after with the + above will result in confusion and suffering. If you happen to + have a table called before or + after, alias it to something else when using + RETURNING. + + @@ -287,15 +302,16 @@ UPDATE weather SET temp_lo = temp_lo+1, temp_hi = temp_lo+15, prcp = DEFAULT - Perform the same operation and return the updated entries: + Perform the same operation and return information on the changed entries: UPDATE weather SET temp_lo = temp_lo+1, temp_hi = temp_lo+15, prcp = DEFAULT WHERE city = 'San Francisco' AND date = '2003-07-03' - RETURNING temp_lo, temp_hi, prcp; + RETURNING temp_lo AS new_low, temp_hi AS new_high, BEFORE.temp_hi/BEFORE.temp_low AS old_ratio, AFTER.temp_hi/AFTER.temp_low AS new_ratio prcp; + Use the alternative column-list syntax to do the same update: diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c index d86e9ad..fafd311 100644 --- a/src/backend/commands/trigger.c +++ b/src/backend/commands/trigger.c @@ -2335,7 +2335,7 @@ ExecASUpdateTriggers(EState *estate, ResultRelInfo *relinfo) TupleTableSlot * ExecBRUpdateTriggers(EState *estate, EPQState *epqstate, ResultRelInfo *relinfo, - ItemPointer tupleid, TupleTableSlot *slot) + ItemPointer tupleid, TupleTableSlot *slot, TupleTableSlot **planSlot) { TriggerDesc *trigdesc = relinfo->ri_TrigDesc; HeapTuple slottuple = ExecMaterializeSlot(slot); @@ -2381,6 +2381,7 @@ ExecBRUpdateTriggers(EState *estate, EPQState *epqstate, if (newSlot != NULL) { slot = ExecFilterJunk(relinfo->ri_junkFilter, newSlot); + *planSlot = newSlot; slottuple = ExecMaterializeSlot(slot); newtuple = slottuple; } diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c index 15f5dcc..06ebaf3 100644 --- a/src/backend/executor/nodeModifyTable.c +++ b/src/backend/executor/nodeModifyTable.c @@ -603,7 +603,7 @@ ExecUpdate(ItemPointer tupleid, resultRelInfo->ri_TrigDesc->trig_update_before_row) { slot = ExecBRUpdateTriggers(estate, epqstate, resultRelInfo, - tupleid, slot); + tupleid, slot, &planSlot); if (slot == NULL) /* "do nothing" */ return NULL; @@ -737,6 +737,7 @@ lreplace:; hufd.xmax); if (!TupIsNull(epqslot)) { + planSlot = epqslot; *tupleid = hufd.ctid; slot = ExecFilterJunk(resultRelInfo->ri_junkFilter, epqslot); tuple = ExecMaterializeSlot(slot); diff --git 
a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c index a896d76..409b4d1 100644 --- a/src/backend/nodes/nodeFuncs.c +++ b/src/backend/nodes/nodeFuncs.c @@ -1928,6 +1928,7 @@ range_table_walker(List *rtable, { case RTE_RELATION: case RTE_CTE: + case RTE_BEFORE: /* nothing to do */ break; case RTE_SUBQUERY: @@ -2642,6 +2643,7 @@ range_table_mutator(List *rtable, { case RTE_RELATION: case RTE_CTE: + case RTE_BEFORE: /* we don't bother to copy eref, aliases, etc; OK? */ break; case RTE_SUBQUERY: diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c index 48cd9dc..79b03af 100644 --- a/src/backend/nodes/outfuncs.c +++ b/src/backend/nodes/outfuncs.c @@ -2366,6 +2366,7 @@ _outRangeTblEntry(StringInfo str, const RangeTblEntry *node) switch (node->rtekind) { case RTE_RELATION: + case RTE_BEFORE: WRITE_OID_FIELD(relid); WRITE_CHAR_FIELD(relkind); break; diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c index dc9cb3e..2ca29da 100644 --- a/src/backend/nodes/readfuncs.c +++ b/src/backend/nodes/read
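For illustration, usage of the syntax this patch proposes would look something like the sketch below, based on the weather example in the documentation diff above (not a tested query); BEFORE.* refers to the column values before the update and AFTER.* to the values after it:

    UPDATE weather SET temp_lo = temp_lo + 1
     WHERE city = 'San Francisco'
     RETURNING BEFORE.temp_lo AS old_low, AFTER.temp_lo AS new_low;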
[HACKERS] CREATE EVENT TRIGGER syntax
Hello, What is the purpose of the [ AND ... ] at the end of the WHEN clause? Is that for later releases, when presumably additional filter_variables will be introduced? Right now, if I add "AND tag IN ..." I get an ERROR: filter variable "tag" specified more than once Joe -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
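For reference, the form that does work today is a single filter variable with an IN list, e.g. (abort_command is a hypothetical function declared RETURNS event_trigger):

    CREATE EVENT TRIGGER block_some_ddl
        ON ddl_command_start
        WHEN TAG IN ('CREATE TABLE', 'DROP TABLE')
        EXECUTE PROCEDURE abort_command();

It is adding a second "AND tag IN ..." to the same WHEN clause that produces the "specified more than once" error quoted above.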
Re: [HACKERS] FKey not enforced resulting in broken Dump/Reload
On Fri, Jul 19, 2013 at 2:15 PM, Robert Haas wrote: > On Fri, Jul 19, 2013 at 1:45 PM, Tom Lane wrote: >> Robert Haas writes: >>> On Fri, Jul 19, 2013 at 12:58 PM, Rod Taylor wrote: A poorly coded trigger on the referencing table has the ability to break foreign keys, and as a result create a database which cannot be dumped and reloaded. >> >>> This is a known limitation of our foreign key machinery. It might >>> well be susceptible to improvement, but I wouldn't count on anyone >>> rewriting it in the near future. >> >> If we failed to fire triggers on foreign-key actions, that would not be >> an improvement. And trying to circumscribe the trigger's behavior so >> that it couldn't break the FK would be (a) quite expensive, and >> (b) subject to the halting problem, unless perhaps you circumscribed >> it so narrowly as to break a lot of useful trigger behaviors. Thus, >> there's basically no alternative that's better than "so don't do that". > > I think a lot of people would be happier if foreign keys were always > checked after all regular triggers and couldn't be disabled. But, > eh, that's not how it works. Please ignore this fuzzy thinking. You're right. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: Proposal/design feedback needed: WITHIN GROUP (sql standard ordered set aggregate functions)
"some Salesforce folks" that would be me! It looks like I didn't quite communicate to Tom just what I was looking for as I do indeed want to have a variable number of "any" types, as: CREATE AGGREGATE FOO ( ANYELEMENT, , VARIADIC "any") ( ... STYPE = ANYARRAY ...) so the corresponding transition function would be CREATE FUNCTION FOO_sfunc( ANYARRAY, ANYELEMENT, , VARIADIC "any") RETURNS ANYARRAY and the final func is CREATE FUNCTION FOO_ffunc( ANYARRAY ) RETURNS ANYELEMENT The functions are in C, and I cheat and actually use the ANYARRAY transition variable as a struct just keeping the varlena length correct (thanks to Tom for that idea). Currently I just support a fixed number of "any" args but really need to have that be variable. So supporting VARIADIC "any" for user defined aggregates would be most useful. On Thu, Jul 18, 2013 at 7:09 PM, Tom Lane wrote: > Noah Misch writes: > > (I don't know whether VARIADIC transition functions work today, but that > would > > become an orthogonal project.) > > Coincidentally enough, some Salesforce folk were asking me about allowing > VARIADIC aggregates just a few days ago. I experimented enough to find > out that if you make an array-accepting transition function, and then > force the aggregate's pg_proc entry to look like it's variadic (by > manually setting provariadic and some other fields), then everything > seems to Just Work: the parser and executor are both fine with it. > So I think all that's needed here is to add some syntax support to > CREATE AGGREGATE, and probably make some tweaks in pg_dump. I was > planning to go work on that sometime soon. > > Having said that, though, what Andrew seemed to want was VARIADIC ANY, > which is a *completely* different kettle of fish, since the actual > parameters can't be converted to an array. I'm not sure if that's > as easy to support. > > regards, tom lane > >
Re: [HACKERS] Review: UNNEST (and other functions) WITH ORDINALITY
On Wed, Jun 26, 2013 at 2:47 AM, Tom Lane wrote:
> Some of the rest of us would like to hear those reasons, because my immediate reaction is that the patch must be broken by design. WITH ORDINALITY should not be needing to mess with the fundamental evaluation semantics of SRFs, but it sure sounds like it is doing so if that case doesn't work as expected.

As Dan pointed out later, the patch does not affect whether this case works as expected. It only means that, since you can use WITH ORDINALITY on SRFs in the FROM list but not in the target list, you have to rewrite such a query to put the SRFs in the FROM list if you want to use it. I think we're ok with that, since SRFs in the target list already work kind of strangely and are probably something we would rather get rid of than work to extend. It would be hard to work ordinality into the grammar in the middle of the target list, let alone figure out how to implement it.

My only reservation with this patch is whether the WITH_ORDINALITY parser hack is the way we want to go. The precedent was already set with WITH TIME ZONE though, and I think this was the best option. The worst failure I can come up with is this, which doesn't seem like a huge problem:

postgres=# with o as (select 1 ) select * from o;
 ?column?
----------
        1
(1 row)

postgres=# with ordinality as (select 1 ) select * from ordinality;
ERROR:  syntax error at or near "ordinality"
LINE 1: with ordinality as (select 1 ) select * from ordinality;
             ^

-- greg
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
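For readers following along, the FROM-list form the patch provides is expected to look roughly like this (a sketch based on the syntax under review; column aliases are illustrative):

    SELECT * FROM unnest(ARRAY['a','b','c']) WITH ORDINALITY AS t(elem, ord);

which returns each array element alongside its 1-based position, with the ordinality column usable for ordering or joining like any other FROM-list column.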
Re: [HACKERS] Fatal error after starting postgres : sys identifiers must be different
*I am not the best specialist about logical replication, but as it looks to be a requirement to have 2 nodes with different system identifiers, you shouldn't link 1 node to another node whose data folder has been created from the base backup of the former (for example create the 2nd node based on a base backup of the 1st node). The error returned here would mean so.* ** *- *I didn't create the /data folders from any backup for both the nodes; they were created fresh by initdb individually. Is there anything specific that I need to do apart from initdb? *pgcontrol_data outputs different database system ids for the two nodes. So > don't understand why it says identifiers are same. Are you sure? Please re-ckeck.* ** *- *I re-verified and they are different. Pasting the snapshots below- irc1@ubuntuirc:~/bdr2/postgres-d6ed89e/postgres-bdr-bin/bin$ ./pg_controldata ./data2/ pg_control version number:937 Catalog version number: 201305061 Database system identifier: *5856368744762683487* Database cluster state: in production irc2@ubuntuirc2:~/bdr/postgres-bin/bin$ ./pg_controldata ./data/ pg_control version number:937 Catalog version number: 201305061 Database system identifier: *5901663435516996514* Database cluster state: in production Also, am pasting the below snapshot to let you know that postgres doesn't seem to be the su; in case this problem is due to that. irc1@ubuntuirc:~/bdr2/postgres-d6ed89e/postgres-bdr-bin/bin$ ./psql testdb2 psql (9.3beta1) Type "help" for help. testdb2=# \du List of roles Role name | Attributes | Member of ---++--- irc1 | Superuser, Create role, Create DB, Replication | {} irc2@ubuntuirc2:~/bdr/postgres-bin/bin$ ./psql testdb2 psql (9.3beta1) Type "help" for help. testdb2=# \du List of roles Role name | Attributes | Member of ---++--- irc2 | Superuser, Create role, Create DB, Replication | {} Please let me know which mailing list will be more appropriate to raise my concern regarding this postgres dev flavor. I am yet to receive any response from Andres. Thanks. On Thu, Jul 18, 2013 at 3:31 PM, Michael Paquier wrote: > On Fri, Jul 19, 2013 at 5:02 AM, Indrajit Roychoudhury > wrote: > > Could you please let me know what does the error "system identifiers are > > same" mean? Below is the snapshot from one of the masters. > > I am setting up BDR as per the wiki > > http://wiki.postgresql.org/wiki/BDR_User_Guide#Initial_setup > > and source @ > > git://git.postgresql.org/git/users/andresfreund/postgres.git > > > > irc1@ubuntuirc:~/bdr2/postgres-d6ed89e/postgres-bdr-bin/bin$ ./postgres > -D > > ~/bdr2/postgres-d6ed89e/postgres-bdr-bin/bin/data2/ > > LOG: bgworkers, connection: dbname=testdb2 > > LOG: option: dbname, val: testdb2 > > LOG: registering background worker: bdr apply: ubuntuirc2 > > LOG: loaded library "bdr" > > LOG: database system was shut down at 2013-03-17 10:56:52 PDT > > LOG: doing logical startup from 0/17B6410 > > LOG: starting up replication identifier with ckpt at 0/17B6410 > > LOG: autovacuum launcher started > > LOG: starting background worker process "bdr apply: ubuntuirc2" > > LOG: database system is ready to accept connections > > LOG: bdr apply: ubuntuirc2 initialized on testdb2, remote dbname=testdb2 > > replication=true fallback_application_name=bdr > > FATAL: system identifiers must differ between the nodes > > DETAIL: Both system identifiers are 5856368744762683487. 
> I am not the best specialist about logical replication, but as it > looks to be a requirement to have 2 nodes with different system > identifiers, you shouldn't link 1 node to another node whose data > folder has been created from the base backup of the former (for > example create the 2nd node based on a base backup of the 1st node). > The error returned here would mean so. > > > LOG: worker process: bdr apply: ubuntuirc2 (PID 16712) exited with exit > > code 1 > > ^CLOG: received fast shutdown request > > LOG: aborting any active transactions > > LOG: autovacuum launcher shutting down > > LOG: shutting down > > > > pgcontrol_data outputs different database system ids for the two nodes. > So > > don't understand why it says identifiers are same. > Are you sure? Please re-ckeck. > > > postgresql.conf content in one of the masters is like this- > > > > / > > shared_preload_libraries = 'bdr' > > bdr.connections = 'ubuntuirc2' > > bdr.ubuntuirc2.dsn = 'dbname=testdb2' > > / > > > > Two nodes are ubuntuirc and ubuntuirc2. Above is the output of the > > postgresql.conf from ubuntuirc. > If this behavior is confirmed, I think that this bug should be > reported di
Re: [HACKERS] FKey not enforced resulting in broken Dump/Reload
On Fri, Jul 19, 2013 at 1:45 PM, Tom Lane wrote: > Robert Haas writes: >> On Fri, Jul 19, 2013 at 12:58 PM, Rod Taylor wrote: >>> A poorly coded trigger on the referencing table has the ability to break >>> foreign keys, and as a result create a database which cannot be dumped and >>> reloaded. > >> This is a known limitation of our foreign key machinery. It might >> well be susceptible to improvement, but I wouldn't count on anyone >> rewriting it in the near future. > > If we failed to fire triggers on foreign-key actions, that would not be > an improvement. And trying to circumscribe the trigger's behavior so > that it couldn't break the FK would be (a) quite expensive, and > (b) subject to the halting problem, unless perhaps you circumscribed > it so narrowly as to break a lot of useful trigger behaviors. Thus, > there's basically no alternative that's better than "so don't do that". I think a lot of people would be happier if foreign keys were always checked after all regular triggers and couldn't be disabled. But, eh, that's not how it works. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fatal error after starting postgres : sys identifiers must be different
One more change was required to add both the users in each node's db as super users and replication started!! Thanks. On Thu, Jul 18, 2013 at 5:35 PM, Andres Freund wrote: > Hi! > > On 2013-07-19 07:31:07 +0900, Michael Paquier wrote: > > If this behavior is confirmed, I think that this bug should be > > reported directly to Andres and not this mailing list, just because > > logical replication is not integrated into Postgres core as of now. > > I think I agree, although I don't know where to report it, but to me > personally, so far. Hackers seems to be the wrong crowd for it, given > most of the people on it haven't even heard of bdr, much less read its > code. > We're definitely planning to propose it for community inclusion in some > form, but there are several rather large preliminary steps (like getting > changeset extraction in) that need to be done first. > > Not sure what's best. > > On 2013-07-19 07:31:07 +0900, Michael Paquier wrote: > > On Fri, Jul 19, 2013 at 5:02 AM, Indrajit Roychoudhury > > wrote: > > > Could you please let me know what does the error "system identifiers > are > > > same" mean? Below is the snapshot from one of the masters. > > > I am setting up BDR as per the wiki > > > http://wiki.postgresql.org/wiki/BDR_User_Guide#Initial_setup > > > and source @ > > > git://git.postgresql.org/git/users/andresfreund/postgres.git > > > > > > irc1@ubuntuirc:~/bdr2/postgres-d6ed89e/postgres-bdr-bin/bin$ > ./postgres -D > > > ~/bdr2/postgres-d6ed89e/postgres-bdr-bin/bin/data2/ > > > LOG: bgworkers, connection: dbname=testdb2 > > > LOG: option: dbname, val: testdb2 > > > LOG: registering background worker: bdr apply: ubuntuirc2 > > > LOG: loaded library "bdr" > > > LOG: database system was shut down at 2013-03-17 10:56:52 PDT > > > LOG: doing logical startup from 0/17B6410 > > > LOG: starting up replication identifier with ckpt at 0/17B6410 > > > LOG: autovacuum launcher started > > > LOG: starting background worker process "bdr apply: ubuntuirc2" > > > LOG: database system is ready to accept connections > > > LOG: bdr apply: ubuntuirc2 initialized on testdb2, remote > dbname=testdb2 > > > replication=true fallback_application_name=bdr > > > FATAL: system identifiers must differ between the nodes > > > DETAIL: Both system identifiers are 5856368744762683487. > > > I am not the best specialist about logical replication, but as it > > looks to be a requirement to have 2 nodes with different system > > identifiers, you shouldn't link 1 node to another node whose data > > folder has been created from the base backup of the former (for > > example create the 2nd node based on a base backup of the 1st node). > > The error returned here would mean so. > > Yes, that's correct. > > > > LOG: worker process: bdr apply: ubuntuirc2 (PID 16712) exited with > exit > > > code 1 > > > ^CLOG: received fast shutdown request > > > LOG: aborting any active transactions > > > LOG: autovacuum launcher shutting down > > > LOG: shutting down > > > > > > pgcontrol_data outputs different database system ids for the two > nodes. So > > > don't understand why it says identifiers are same. > > Are you sure? Please re-ckeck. > > > > > postgresql.conf content in one of the masters is like this- > > > > > > / > > > shared_preload_libraries = 'bdr' > > > bdr.connections = 'ubuntuirc2' > > > bdr.ubuntuirc2.dsn = 'dbname=testdb2' > > > / > > > > > > Two nodes are ubuntuirc and ubuntuirc2. Above is the output of the > > > postgresql.conf from ubuntuirc. 
> > The problem seems to be that your dsn doesn't include a hostname but > only a database name. So, what probably happens is both your hosts try > to connect to themselves since connecting to the local host is the > default when no hostname is specified. Which is one of the primary > reasons the requirement for differing system identifiers exist... > > Greetings, > > Andres Freund >
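If the dsn is indeed the problem, the presumable fix is to point each node at its peer explicitly rather than relying on the default local connection, e.g. on ubuntuirc (host value here is illustrative, taken from the node name in the report; additional connection options may be needed):

    shared_preload_libraries = 'bdr'
    bdr.connections = 'ubuntuirc2'
    bdr.ubuntuirc2.dsn = 'host=ubuntuirc2 dbname=testdb2'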
Re: [HACKERS] FKey not enforced resulting in broken Dump/Reload
Robert Haas writes: > On Fri, Jul 19, 2013 at 12:58 PM, Rod Taylor wrote: >> A poorly coded trigger on the referencing table has the ability to break >> foreign keys, and as a result create a database which cannot be dumped and >> reloaded. > This is a known limitation of our foreign key machinery. It might > well be susceptible to improvement, but I wouldn't count on anyone > rewriting it in the near future. If we failed to fire triggers on foreign-key actions, that would not be an improvement. And trying to circumscribe the trigger's behavior so that it couldn't break the FK would be (a) quite expensive, and (b) subject to the halting problem, unless perhaps you circumscribed it so narrowly as to break a lot of useful trigger behaviors. Thus, there's basically no alternative that's better than "so don't do that". regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Foreign Tables as Partitions
On Fri, Jul 19, 2013 at 11:41:14AM -0400, Robert Haas wrote: > On Thu, Jul 18, 2013 at 8:56 PM, David Fetter wrote: > > Please find attached a PoC patch to implement $subject. > > > > So far, with a lot of help from Andrew Gierth, I've roughed out an > > implementation which allows you to ALTER FOREIGN TABLE so it inherits > > a local table. > > > > TBD: CREATE FOREIGN TABLE ... INHERITS ..., docs, regression testing, > > etc., etc. > > I think this could be useful, but it's going to take more than a > three-line patch. Of course! > Removing code that prohibits things is easy; what's hard is > verifying that nothing else breaks as a result. And I bet it does. It may well. I see our relations eventually being as equivalent as their definitions permit (views, materialized or not, foreign tables, etc.), and this as work toward that. Yes, it's normative, and we may never fully get there, but I'm pretty sure it's worth working towards. > This functionality was actually present in the original submission > for foreign tables. I ripped it out before commit because I didn't > think all of the interactions with other commands had been > adequately considered. But I think they did consider it to some > degree, which this patch does not. A ref to the patch as submitted & patch as applied would help a lot :) Were there any particular things you managed to break with the patch as submitted? Places you thought would be soft but didn't poke at? Cheers, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] AGG_PLAIN thinks sorts are free
Jeff Janes writes: > On Thu, Jul 18, 2013 at 8:04 PM, Tom Lane wrote: >> DISTINCT (and also ORDER BY) properties of aggregates are implemented >> at runtime; the planner doesn't really do anything about them, except >> suppress the choice it might otherwise make of using hashed aggregation. > Couldn't a hash aggregate be superior to a sort one (for the distinct, > not the order by)? If it worked at all, it might be superior, but since it doesn't, it ain't. This isn't really a matter of lack of round tuits, but a deliberate judgment that it's probably not worth the trouble. Per the comment in choose_hashed_grouping: /* * Executor doesn't support hashed aggregation with DISTINCT or ORDER BY * aggregates.(Doing so would imply storing *all* the input values in * the hash table, and/or running many sorts in parallel, either of which * seems like a certain loser.) */ regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
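As a concrete illustration (table and column names hypothetical), a plain aggregate query like the one below sorts its input inside the Agg node at runtime, so EXPLAIN shows neither a separate Sort node nor any cost attributed to that sort:

    SELECT array_agg(val ORDER BY val), count(DISTINCT val) FROM measurements;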
Re: [HACKERS] AGG_PLAIN thinks sorts are free
On Thu, Jul 18, 2013 at 8:04 PM, Tom Lane wrote: > Jeff Janes writes: >> AGG_PLAIN sometimes does sorts, but it thinks they are free. Also, under >> explain analyze it does not explicitly report whether the sort was external >> or not, nor report the disk or memory usage, the way other sorts do. I >> don't know if those two things are related or not. > > DISTINCT (and also ORDER BY) properties of aggregates are implemented > at runtime; the planner doesn't really do anything about them, except > suppress the choice it might otherwise make of using hashed aggregation. > Since the behavior is entirely local to the Agg plan node, it's also > not visible to the EXPLAIN ANALYZE machinery. Couldn't a hash aggregate be superior to a sort one (for the distinct, not the order by)? > Arguably we should have the planner add on some cost factor for such > aggregates, but that would have no effect whatever on the current level > of plan, and could only be useful if this was a subquery whose cost > would affect choices in an outer query level. Which is a case that's > pretty few and far between AFAIK (do you have a real-world example where > it matters?). Not that I know of. It is mainly an analytical headache. I'm trying to figure out why the planner makes the choices it does on more complex queries, but one of the component queries I'm trying to build it up from suddenly falls into this plan, where I can't see the estimated costs and can't use "set enable_*" to shift it away from that into a more transparent one. Thanks for the explanation. Jeff -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] getting rid of SnapshotNow
Robert Haas escribió: > 4. If we use GetActiveSnapshot, all the comments about about a fresh > MVCC snapshot still apply. However, the snapshot in question could be > even more stale, especially in repeatable read or serializable mode. > However, this might be thought a more consistent behavior than what we > have now. And I'm guessing that this function is typically run as its > own transaction, so in practice this doesn't seem much different from > an MVCC snapshot, only cheaper. > > At the moment, I dislike #2 and slightly prefer #4 to #3. +1 for #4, and if we ever need more then we can provide a non-default way to get at #2. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] LOCK TABLE Permissions
On Fri, Jul 19, 2013 at 12:33 PM, Tom Lane wrote: > Stephen Frost writes: >> if (lockmode == AccessShareLock) >> aclresult = pg_class_aclcheck(reloid, GetUserId(), >> ACL_SELECT); >> + else if (lockmode == RowExclusiveLock) >> + aclresult = pg_class_aclcheck(reloid, GetUserId(), >> +ACL_INSERT | ACL_UPDATE | ACL_DELETE | >> ACL_TRUNCATE); >> else >> aclresult = pg_class_aclcheck(reloid, GetUserId(), >> ACL_UPDATE | ACL_DELETE | >> ACL_TRUNCATE); > > Perhaps it would be better to refactor with a local variable for the > aclmask and just one instance of the pg_class_aclcheck call. Also, I'm > pretty sure that the documentation work needed is more extensive > than the actual patch ;-). Otherwise, I don't see a problem with this. I don't really care one way or the other whether we change this in master, but I think back-patching changes that loosen security restrictions is a poor idea. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] FKey not enforced resulting in broken Dump/Reload
A poorly coded trigger on the referencing table has the ability to break foreign keys, and as a result create a database which cannot be dumped and reloaded.

The BEFORE DELETE trigger accidentally does RETURN NEW, which suppresses the DELETE action by the foreign key trigger. This allows the record in the referenced table to be deleted while the record in the referencing table remains in place.

While I don't expect Pg to do what the coder meant, it should throw an error rather than leave foreign-keyed data in an invalid state.

This applies to both 9.1 and 9.2. Please see attached bug.sql.

bug.sql Description: Binary data
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
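A minimal sketch of the failure mode being reported (table, column and function names are hypothetical). In a plpgsql DELETE trigger NEW is NULL, so RETURN NEW silently cancels each row deletion, including the ones the foreign key's ON DELETE CASCADE machinery tries to perform:

    CREATE TABLE parent (id int PRIMARY KEY);
    CREATE TABLE child (parent_id int REFERENCES parent (id) ON DELETE CASCADE);

    CREATE FUNCTION broken_del() RETURNS trigger LANGUAGE plpgsql AS $$
    BEGIN
        RETURN NEW;   -- bug: should be RETURN OLD; NEW is NULL in a DELETE trigger
    END $$;

    CREATE TRIGGER child_bd BEFORE DELETE ON child
        FOR EACH ROW EXECUTE PROCEDURE broken_del();

    INSERT INTO parent VALUES (1);
    INSERT INTO child VALUES (1);
    DELETE FROM parent WHERE id = 1;  -- succeeds
    SELECT * FROM child;              -- the referencing row survives, now orphaned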
Re: [HACKERS] getting rid of SnapshotNow
On Fri, Jul 19, 2013 at 12:51 PM, Alvaro Herrera wrote:
> Robert Haas escribió:
>> On Fri, Jul 19, 2013 at 12:29 PM, Alvaro Herrera wrote:
>> > I think seeing otherwise invisible rows is useful in pgrowlocks. It helps observe the effects on tuples written by concurrent transactions during experimentation. But then, maybe this functionality belongs in pageinspect instead.
>>
>> It does seem like it should be useful, at least as an option. But I feel like changing that ought to be separate from getting rid of SnapshotNow. It seems like too big of a behavior change to pass off as a harmless tweak.
>
> Agreed.

So any change we make to pgrowlocks is going to have some behavior consequences.

1. If we use SnapshotSelf, then nobody will notice the difference unless this is used as part of a query that locks or modifies tuples in the table being examined. But in that case you might see the results of the current query. Thus, I think this is the smallest possible behavior change, but Tom doesn't like SnapshotSelf any more than he likes SnapshotNow.

2. If we use SnapshotDirty, then the difference is probably noticeable, because you'll see the results of concurrent, uncommitted transactions. Maybe useful, but probably shouldn't be the new default.

3. If we use a fresh MVCC snapshot, then when you scan a table you'll see the state of play as of the beginning of your scan rather than the state of play as of when your scan reaches the target page. This might be noticeable on a large table. However, it might also be thought an improvement.

4. If we use GetActiveSnapshot, all the comments about a fresh MVCC snapshot still apply. However, the snapshot in question could be even more stale, especially in repeatable read or serializable mode. However, this might be thought a more consistent behavior than what we have now. And I'm guessing that this function is typically run as its own transaction, so in practice this doesn't seem much different from an MVCC snapshot, only cheaper.

At the moment, I dislike #2 and slightly prefer #4 to #3.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] getting rid of SnapshotNow
Robert Haas escribió: > On Fri, Jul 19, 2013 at 12:29 PM, Alvaro Herrera > wrote: > > I think seeing otherwise invisible rows is useful in pgrowlocks. It > > helps observe the effects on tuples written by concurrent transactions > > during experimentation. But then, maybe this functionality belongs in > > pageinspect instead. > > It does seem like it should be useful, at least as an option. But I > feel like changing that ought to be separate from getting rid of > SnapshotNow. It seems like too big of a behavior change to pass off > as a harmless tweak. Agreed. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] FKey not enforced resulting in broken Dump/Reload
On Fri, Jul 19, 2013 at 12:58 PM, Rod Taylor wrote: > A poorly coded trigger on the referencing table has the ability to break > foreign keys, and as a result create a database which cannot be dumped and > reloaded. > > The BEFORE DELETE trigger accidentally does RETURN NEW, which suppresses the > DELETE action by the foreign key trigger. This allows the record from the > referenced table to be deleted and the record in the referencing table to > remain in place. > > While I don't expect Pg to do what the coder meant, but it should throw an > error and not leave foreign key'd data in an invalid state. This is a known limitation of our foreign key machinery. It might well be susceptible to improvement, but I wouldn't count on anyone rewriting it in the near future. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Minmax indexes
On 07/18/2013 10:39 PM, Alvaro Herrera wrote: > To scan the index, we first obtain the TID of index tuple for page 0. If > this returns a valid TID, we read that tuple to determine the min/max bounds > for this page range. If it returns invalid, then the range is unsummarized, > and the scan must return the whole range as needing scan. After this > index entry has been processed, we obtain the TID of the index tuple for > page 0+pagesPerRange (currently this is a compile-time constant, but > there's no reason this cannot be a per-index property). Continue adding > pagesPerRange until we reach the end of the heap. Conceptually, this sounds like a good initial solution to the update problem. I still think we could do incremental updates to the minmax indexes per the idea I discussed, but that could be a later version. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] getting rid of SnapshotNow
On Fri, Jul 19, 2013 at 12:29 PM, Alvaro Herrera wrote: > Robert Haas escribió: >> On Fri, Jul 19, 2013 at 1:27 AM, Tom Lane wrote: > >> > Why not tell people to use SnapshotDirty if they need a >> > not-guaranteed-consistent result? At least then it's pretty obvious >> > that you're getting some randomness in with your news. > >> On further reflection, I think perhaps pgrowlocks should just register >> a fresh MVCC snapshot and use that. Using SnapshotDirty would return >> TIDs of unseen tuples, which does not seem to be what is wanted there. > > I think seeing otherwise invisible rows is useful in pgrowlocks. It > helps observe the effects on tuples written by concurrent transactions > during experimentation. But then, maybe this functionality belongs in > pageinspect instead. It does seem like it should be useful, at least as an option. But I feel like changing that ought to be separate from getting rid of SnapshotNow. It seems like too big of a behavior change to pass off as a harmless tweak. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: Request for Patch Feedback: Lag & Lead Window Functions Can Ignore Nulls
On 07/15/2013 10:19 AM, Jeff Davis wrote: > On Thu, 2013-07-11 at 10:51 -0400, Nicholas White wrote: >> I've attached a revised version that fixes the issues above: > > I'll get to this soon, sorry for the delay. > > Regards, > Jeff Davis So ... are you doing a final review of this for the CF, Jeff? We need to either commit it or bounce it to the next CF. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] LOCK TABLE Permissions
Stephen Frost writes: > if (lockmode == AccessShareLock) > aclresult = pg_class_aclcheck(reloid, GetUserId(), > ACL_SELECT); > + else if (lockmode == RowExclusiveLock) > + aclresult = pg_class_aclcheck(reloid, GetUserId(), > +ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE); > else > aclresult = pg_class_aclcheck(reloid, GetUserId(), > ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE); Perhaps it would be better to refactor with a local variable for the aclmask and just one instance of the pg_class_aclcheck call. Also, I'm pretty sure that the documentation work needed is more extensive than the actual patch ;-). Otherwise, I don't see a problem with this. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] getting rid of SnapshotNow
Robert Haas escribió: > On Fri, Jul 19, 2013 at 1:27 AM, Tom Lane wrote: > > Why not tell people to use SnapshotDirty if they need a > > not-guaranteed-consistent result? At least then it's pretty obvious > > that you're getting some randomness in with your news. > On further reflection, I think perhaps pgrowlocks should just register > a fresh MVCC snapshot and use that. Using SnapshotDirty would return > TIDs of unseen tuples, which does not seem to be what is wanted there. I think seeing otherwise invisible rows is useful in pgrowlocks. It helps observe the effects on tuples written by concurrent transactions during experimentation. But then, maybe this functionality belongs in pageinspect instead. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] getting rid of SnapshotNow
On 2013-07-20 00:49:11 +0900, Hiroshi Inoue wrote: > (2013/07/18 21:46), Robert Haas wrote: > >There seems to be a consensus that we should try to get rid of > >SnapshotNow entirely now that we have MVCC catalog scans, so I'm > >attaching two patches that together come close to achieving that goal: > > ... > > >With that done, the only remaining uses of SnapshotNow in our code > >base will be in currtid_byreloid() and currtid_byrelname(). So far no > >one on this list has been able to understand clearly what the purpose > >of those functions is, so I'm copying this email to pgsql-odbc in case > >someone there can provide more insight. If I were a betting man, I'd > >bet that they are used in contexts where the difference between > >SnapshotNow and SnapshotSelf wouldn't matter there, either. > > Using SnapshotSelf instead of SnapshotNow for currtid_ () wouldn't > matter. I think it actually might. You could get into dicey situations if you use currtid_ in a query performing updates or inserts because it would see the to-be-inserted tuple... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] getting rid of SnapshotNow
(2013/07/18 21:46), Robert Haas wrote: There seems to be a consensus that we should try to get rid of SnapshotNow entirely now that we have MVCC catalog scans, so I'm attaching two patches that together come close to achieving that goal: ... With that done, the only remaining uses of SnapshotNow in our code base will be in currtid_byreloid() and currtid_byrelname(). So far no one on this list has been able to understand clearly what the purpose of those functions is, so I'm copying this email to pgsql-odbc in case someone there can provide more insight. If I were a betting man, I'd bet that they are used in contexts where the difference between SnapshotNow and SnapshotSelf wouldn't matter there, either. Using SnapshotSelf instead of SnapshotNow for currtid_ () wouldn't matter. regards, Hiroshi Inoue -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Using ini file to setup replication
> I've even proposed that in the past in > 20130225211533.gd3...@awork2.anarazel.de . I plan to propose an updated > version of that patch (not allowing numeric 2nd level ids) for the next > CF. > > So you can just do stuff like: > > server.foo.grand_unified_config = value. Sounds good to me. I can see lots of uses for that. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [ODBC] [HACKERS] getting rid of SnapshotNow
Andres Freund writes: > On 2013-07-20 00:49:11 +0900, Hiroshi Inoue wrote: >> Using SnapshotSelf instead of SnapshotNow for currtid_ () wouldn't >> matter. > I think it actually might. You could get into dicey situations if you > use currtid_ in a query performing updates or inserts because it would > see the to-be-inserted tuple... I'm pretty sure Hiroshi-san was only opining about whether it would matter for ODBC's usage. IIUC, ODBC is using this function to re-fetch rows that it inserted, updated, or at least selected-for-update in a previous command of the current transaction, so actually any snapshot would do fine. In any case, since I moved the goalposts by suggesting that SnapshotSelf is just as dangerous as SnapshotNow, what we need to know is whether it'd be all right to change this code to use a fresh MVCC snapshot; and if not, why not. It's pretty hard to see a reason why client-side code would want to make use of the results of a non-MVCC snapshot. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
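For reference, these are exposed at the SQL level as currtid(reloid, tid) and currtid2(relname, tid); if memory serves, psqlODBC calls the latter along the lines of the sketch below (table name and TID are illustrative) to learn where the latest version of a previously fetched row now lives, and then re-reads it by ctid:

    SELECT currtid2('mytable', '(0,1)'::tid);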
Re: [HACKERS] Foreign Tables as Partitions
On Thu, Jul 18, 2013 at 8:56 PM, David Fetter wrote: > Please find attached a PoC patch to implement $subject. > > So far, with a lot of help from Andrew Gierth, I've roughed out an > implementation which allows you to ALTER FOREIGN TABLE so it inherits > a local table. > > TBD: CREATE FOREIGN TABLE ... INHERITS ..., docs, regression testing, > etc., etc. I think this could be useful, but it's going to take more than a three-line patch. Removing code that prohibits things is easy; what's hard is verifying that nothing else breaks as a result. And I bet it does. This functionality was actually present in the original submission for foreign tables. I ripped it out before commit because I didn't think all of the interactions with other commands had been adequately considered. But I think they did consider it to some degree, which this patch does not. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
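The intended usage is roughly the sketch below; the table names, the foreign server, and the exact ALTER FOREIGN TABLE ... INHERIT syntax are all assumptions here, since the posted patch is only a proof of concept:

    -- Local parent table acting as the partition root (illustrative names).
    CREATE TABLE measurements (ts timestamptz, val numeric);

    -- Child table living on a remote server; "remote_srv" is assumed to have
    -- been created already with CREATE SERVER and an FDW such as postgres_fdw.
    CREATE FOREIGN TABLE measurements_2013 (ts timestamptz, val numeric)
        SERVER remote_srv;

    -- What the proposal would allow: attach the foreign table as an
    -- inheritance child, so scans of "measurements" also reach the remote data.
    ALTER FOREIGN TABLE measurements_2013 INHERIT measurements;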
[HACKERS] LOCK TABLE Permissions
Greetings, We've run into a curious case and I'd like to solicit feedback regarding a possible change to the access rights required to acquire locks on a relation. Specifically, we have a process which normally INSERTs into a table and another process which Exclusive locks that same table in order to synchronize other processing. We then ran into a case where we didn't actually want to INSERT but still wanted the synchronization to happen. Unfortunately, we don't allow LOCK TABLE to acquire RowExclusive unless you have UPDATE, DELETE, or TRUNCATE privileges. My first impression is that the current code was just overly simplistic regarding what level of permissions is required for a given lock type, and that it wasn't intentional to deny processes which have INSERT privileges from acquiring RowExclusive (as they can do so anyway using an actual INSERT). Therefore, I'd like to propose the simple 3-line patch below to correct this. Thoughts? Objections to back-patching? Thanks, Stephen

diff --git a/src/backend/commands/lockcmds.c b/src/backend/commands/lockcmds.c
new file mode 100644
index 49950d7..60f54c5 100644
*** a/src/backend/commands/lockcmds.c
--- b/src/backend/commands/lockcmds.c
*** LockTableAclCheck(Oid reloid, LOCKMODE l
*** 174,179 ****
--- 174,182 ----
  	if (lockmode == AccessShareLock)
  		aclresult = pg_class_aclcheck(reloid, GetUserId(), ACL_SELECT);
+ 	else if (lockmode == RowExclusiveLock)
+ 		aclresult = pg_class_aclcheck(reloid, GetUserId(),
+ 					ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE);
  	else
  		aclresult = pg_class_aclcheck(reloid, GetUserId(),
  					ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE);
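To make the reported case concrete, a minimal reproduction looks roughly like this (hypothetical role and table names, not taken from the original report):

    -- A table and a role that is only allowed to INSERT.
    CREATE TABLE sync_target (id int);
    CREATE ROLE loader LOGIN;
    GRANT INSERT ON sync_target TO loader;

    -- Connected as "loader":
    INSERT INTO sync_target VALUES (1);           -- allowed, and implicitly takes RowExclusiveLock

    BEGIN;
    LOCK TABLE sync_target IN ROW EXCLUSIVE MODE; -- rejected with "permission denied"
                                                  -- before the proposed change
    ROLLBACK;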
Re: [HACKERS] pgindent behavior we could do without
On 07/18/2013 11:30 PM, Tom Lane wrote: > Noah Misch writes: >> On Thu, Jul 18, 2013 at 12:27:21AM -0400, Tom Lane wrote: >>> It's always annoyed me that pgindent insists on adjusting vertical whitespace around #else and related commands. This has, for example, rendered src/include/storage/barrier.h nigh illegible: you get things like >> Agreed. I've similarly disliked how pgindent adds a blank line between an #if and a multi-line comment, like at the top of get_restricted_token(). > AFAICT it forces a blank line before a multiline comment in most contexts; #if isn't special (though it does seem to avoid doing that just after a left brace). I too don't much care for that behavior, although it's not as detrimental to readability as removing blank lines can be. In general, pgindent isn't nearly as good about managing vertical whitespace as it is about horizontal spacing ... I agree. I should perhaps note that when I rewrote pgindent, I deliberately didn't editorialize about its behaviour - but I did intend to make it more maintainable and simpler to change stuff like this, and so it is :-) cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] LATERAL quals revisited
Ashutosh Bapat writes: > On Wed, Jun 26, 2013 at 1:30 AM, Tom Lane wrote: >> For there to *be* a unique "appropriate outer join", we need to require >> that a LATERAL-using qual clause that's under an outer join contain >> lateral references only to the outer side of the nearest enclosing outer >> join. There's no such restriction in the spec of course, but we can >> make it so by refusing to flatten a sub-select if pulling it up would >> result in having a clause in the outer query that violates this rule. >> There's already some code in prepjointree.c (around line 1300) that >> attempts to enforce this, though now that I look at it again I'm not >> sure it's covering all the bases. We may need to extend that check. > Why do we need this restriction? Wouldn't a place (specifically join qual > at such a place) in join tree where all the participating relations are > present, serve as a place where the clause can be applied. No. If you hoist a qual that appears below an outer join to above the outer join, you get wrong results in general: you might eliminate rows from the outer side of the join, which a qual from within the inner side should never be able to do. > select * from tab1 left join tab2 t2 using (val) left join lateral (select > val from tab2 where val2 = tab1.val * t2.val) t3 using (val); > Can't we apply (as a join qual) the qual val2 = tab1.val * t2.val at a > place where we are computing join between tab1, t2 and t3? This particular example doesn't violate the rule I gave above, since both tab1 and t2 are on the left side of the join to the lateral subquery, and the qual doesn't have to get hoisted *past* an outer join, only to the outer join of {tab1,t2} with {t3}. >> I'm inclined to process all LATERAL-using qual clauses this way, ie >> postpone them till we recurse back up to a place where they can >> logically be evaluated. That won't make any real difference when no >> outer joins are present, but it will eliminate the ugliness that right >> now distribute_qual_to_rels is prevented from sanity-checking the scope >> of the references in a qual when LATERAL is present. If we do it like >> this, we can resurrect full enforcement of that sanity check, and then >> throw an error if any "postponed" quals are left over when we're done >> recursing. > Parameterized nested loop join would always be able to evaluate a LATERAL > query. Instead of throwing error, why can't we choose that as the default > strategy whenever we fail to flatten subquery? I think you misunderstood. That error would only be a sanity check that we'd accounted for all qual clauses, it's not something a user should ever see. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
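To make the hoisting hazard concrete, here is a small illustration in the shape of the query from the original report (the table contents are invented for the example):

    CREATE TABLE i (n int);
    CREATE TABLE j (n int);
    INSERT INTO i VALUES (1), (2);
    INSERT INTO j VALUES (1);

    -- The lateral qual i.n = j.n logically belongs to the LEFT JOIN, so the
    -- row with i.n = 2 must survive, NULL-extended on the j side:
    SELECT * FROM i
      LEFT JOIN LATERAL (SELECT * FROM j WHERE i.n = j.n) j ON true;
    --  n | n
    -- ---+---
    --  1 | 1
    --  2 |
    -- The second row would be wrongly discarded if the qual were evaluated as
    -- a plain filter above the outer join instead of as a join filter.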
Re: [HACKERS] Simple documentation typo patch
On Thu, Jul 18, 2013 at 2:53 PM, David Christensen wrote: > Hey folks, this corrects typos going back to > 75c6519ff68dbb97f73b13e9976fb8075bbde7b8 where EUC_JIS_2004 and > SHIFT_JIS_2004 were first added. > > These typos are present in all supported major versions of PostgreSQL, back > through 8.3; I didn't look any further than that. :-) Committed and back-patched to all currently-supported versions. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] confusing typedefs in jsonfuncs.c
On 07/18/2013 09:20 PM, Peter Eisentraut wrote: > The new jsonfuncs.c has some confusing typedef scheme. For example, it has a bunch of definitions like this: > typedef struct getState { ... } getState, *GetState; > So GetState is a pointer to getState. I have never seen that kind of convention before. This then leads to code like > GetState state; > state = palloc0(sizeof(getState)); > which has useless mental overhead. But the fact that GetState is really a pointer isn't hidden at all, because state is then dereferenced with -> or cast from or to void*. So whatever abstraction might have been intended isn't there at all. And all of this is an intra-file interface anyway. And to make this even more confusing, other types such as ColumnIOData and JsonLexContext are not pointers but structs directly. I think a more typical PostgreSQL code convention is to use capitalized camelcase for structs, and use explicit pointers for pointers. I have attached a patch that cleans this up, in my opinion. I don't have a problem with this. Sometimes when you've stared at something for long enough you fail to notice such things, so I welcome more eyeballs on the code. Note that this is an externally visible API, so anyone who has written an extension against it will possibly find it broken. But that's all the more reason to clean it now, before it makes it to a released version. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.4] row level security
Dean Rasheed writes: > I took a look at this patch too. I didn't read all the code in detail, > but there was one area I wondered about --- is it still necessary to > add the new field rowsec_relid to RangeTblEntry? > As far as I can see, it is only used in the new function getrelid() > which replaces the old macro of the same name, so that it can work if > it is passed the index of the sourceRelation subquery RTE instead of > the resultRelation. This seems to be encoding a specific assumption > that a subquery RTE is always going to come from row-level security > expansion. I haven't read the patch at all, but I would opine that anything that's changing the behavior of getrelid() is broken by definition. Something is being done at the wrong level of abstraction if you think that you need to do that. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [ODBC] getting rid of SnapshotNow
(2013/07/19 22:03), Andres Freund wrote: > On 2013-07-19 08:57:01 +0900, Inoue, Hiroshi wrote: >>>> I had the idea they were used for a client-side implementation of WHERE CURRENT OF. Perhaps that's dead code and could be removed entirely? >>> It's been reported that ODBC still uses them. >> Though PostgreSQL's TID is similar to Orale's ROWID, it is transient and changed after update operations unfortunately. I implemented the currtid_xx functions to supplement the difference. For example >> currtid(relname, original tid) >> (hopefully) returns the current tid of the original row when it is updated. > That is only guaranteed to work though when you're in a transaction old enough to prevent removal of the old or intermediate row versions. E.g. Yes it's what I meant by (hopefully). At the time when I implemented currtid(), I was able to use TIDs in combination with OIDs. regards, Hiroshi Inoue -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
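As a rough illustration only (the SQL-callable wrapper currtid2(relname, tid) and its exact behaviour are stated here as assumptions, based on how the ODBC driver is described as using these functions):

    CREATE TABLE t (id int, val int);
    INSERT INTO t VALUES (1, 0);

    SELECT ctid FROM t WHERE id = 1;           -- say this returns '(0,1)'
    UPDATE t SET val = val + 1 WHERE id = 1;   -- the updated tuple gets a new ctid
    SELECT currtid2('t', '(0,1)');             -- asks for the row's current ctid,
                                               -- starting from the old one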
Re: [HACKERS] Improvement of checkpoint IO scheduler for stable transaction responses
On 7/19/13 3:53 AM, KONDO Mitsumasa wrote: > Recently, a user who think system availability is important uses synchronous replication cluster. If your argument for why it's OK to ignore bounding crash recovery on the master is that it's possible to failover to a standby, I don't think that is acceptable. PostgreSQL users certainly won't like it. > I want you to read especially point that is line 631, 651, and 656. MAX_WRITEBACK_PAGES is 1024 (1024 * 4096 byte). You should read http://www.westnet.com/~gsmith/content/linux-pdflush.htm to realize that everything you're telling me about the writeback code and its congestion logic, I knew back in 2007. The situation is even worse than you describe, because this section of Linux has gone through multiple, major revisions since then. You can't just say "here is the writeback source code"; you have to reference each of the commonly deployed versions of the writeback feature to tell how this is going to play out if released. There are four major ones I pay attention to. The old kernel style as seen in RHEL5/2.6.18--that's what my 2007 paper discussed--the similar code but with very different defaults in 2.6.22, the writeback method/tuning in RHEL6/Debian Squeeze/2.6.32, and then there are newer kernels. (The newer ones separate out into a few branches too, I haven't mapped those as carefully yet) If you tried to model your feature on Linux's approach here, what that means is that the odds of an ugly feedback loop here are even higher. You're increasing the feedback on what's already a bad situation that triggers trouble for people in the field. When Linux's congestion logic causes checkpoint I/O spikes to get worse than they otherwise might be, people panic because it seems like they stopped altogether. There are some examples of what really bad checkpoints look like in http://www.2ndquadrant.com/static/2quad/media/pdfs/talks/WriteStuff-PGCon2011.pdf if you want to see some of them. That's the talk I did around the same time I was trying out spreading the database fsync calls out over a longer period. When I did that, checkpoints became even less predictable, and that was a major reason behind why I rejected the approach. I think your suggestion will have the same problem. You just aren't generating test cases with really large write workloads yet to see it. You also don't seem afraid of how exceeding the checkpoint timeout is a very bad thing yet. > In addition, if you set a large value of a checkpoint_timeout or checkpoint_complete_taget, you have said that performance is improved, but is it true in all the cases? The timeout, yes. Throughput is always improved by increasing checkpoint_timeout. Less checkpoints per unit of time increases efficiency. Less writes of the most heavy accessed buffers happen per transaction. It is faster because you are doing less work, which on average is always faster than doing more work. And doing less work usually beats doing more work, but doing it smarter. If you want to see how much work per transaction a test is doing, track the numbers of buffers written at the beginning/end of your test via pg_stat_bgwriter. Tests that delay checkpoints will show a lower total number of writes per transaction. That seems more efficient, but it's efficiency mainly gained by ignoring checkpoint_timeout. > When a checkpoint complication target is actually enlarged, performance may fall in some cases. I think this as the last fsync having become heavy owing to having write in slowly. I think you're confusing throughput and latency here. 
Increasing the checkpoint timeout, or to a lesser extent the completion target, increases throughput on average. It results in less work, and the more/less work amount is much more important than worrying about scheduler details. No matter how efficient a given write is, whether you've sorted it across elevator horizon boundary A or boundary B, it's better not to do it at all. But having less checkpoints makes latency worse sometimes too. Whether latency or throughput is considered the more important thing is very complicated. Having checkpoint_completion_target as the knob to control the latency/throughput trade-off hasn't worked out very well. No one has done a really comprehensive look at this trade-off since the 8.3 development. I got halfway through it for 9.1, but we figured out that the fsync queue filling was actually responsible for most of my result variation, and then Robert fixed that. It was a big enough change that I had to throw out all my data from before it as no longer relevant. By the way: if you have a theory like "the last fsync having become heavy" for why something is happening, measure it. Set log_min_messages to debug2 and you'll get details about every single fsync in your logs. I did that for all my tests that led me to conclude fsync delaying on its own didn't help that problem. I was m
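The pg_stat_bgwriter bookkeeping suggested above can be captured with a query along these lines (a sketch; the column names are those of the 9.2/9.3-era view):

    -- Run once before and once after the benchmark, then diff the counters.
    SELECT checkpoints_timed, checkpoints_req,
           buffers_checkpoint, buffers_clean, buffers_backend
    FROM pg_stat_bgwriter;
    -- (buffers written during the run) / (transactions processed) gives the
    -- per-transaction write workload the two configurations are trading off.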
[HACKERS] Adding new joining alghoritm to postgresql
Hi. I'm a little confused. 1. I have source code 9.2.4 version from http://www.postgresql.org/ftp/source/ 2. I want to add new alghoritm to index nested loops join, merge join and hash join. I have Executor catalog in src catalag containing nodeHash.c, nodeHasjoin.c, nodeMergejoin and nodeNestloop.c 3. After changes, I want to compile postgresql and use it. 4. Problem is: a) I do not know which library is responsible for this functionality. I understand, that I have to compile src and replace library (I don't know which library) in path where Postgresql in installed: C:\Program Files (x86)\PostgreSQL\9.2 b) I don't know how use files/library (which library?) with visual studio 2010 and how compile it. Thanks for you all answers. Tom.
Re: [HACKERS] getting rid of SnapshotNow
On 2013-07-19 01:27:41 -0400, Tom Lane wrote: > Noah Misch writes: > > To me, the major advantage of removing SnapshotNow is to force all > > third-party code to reevaluate. But that could be just as well > > achieved by renaming it to, say, SnapshotImmediate. If there are > > borderline-legitimate SnapshotNow uses in our code base, I'd lean > > toward a rename instead. Even if we decide to remove every core use, > > third-party code might legitimately reach a different conclusion on > > similar borderline cases. I don't think there are many people that aren't active on -hackers that can actually understand the implications of using SnapshotNow. Given -hackers hasn't fully grasped them in several cases... And even if those borderline cases are safe, that's really only valid for a specific postgres version. Catering to that doesn't seem like a good idea to me. > Indeed, I'm thinking I don't believe in SnapshotSelf anymore either. > It's got all the same consistency issues as SnapshotNow. In fact, it > has *more* issues, because it's also vulnerable to weirdnesses caused by > inconsistent ordering of tuple updates among multiple tuples updated by > the same command. Hm. I kind of can see the point of it in constraint code where it probably would be rather hard to remove usage of it, but e.g. the sepgsql usage looks pretty dubious to me. At least in the cases where the constraint code uses them I don't think the SnapshotNow dangers apply since those specific rows should be locked et al. The selinux usage looks like a design flaw to me, but I don't really understand that code, so I very well may be wrong. > Why not tell people to use SnapshotDirty if they need a > not-guaranteed-consistent result? At least then it's pretty obvious > that you're getting some randomness in with your news. Especially if we're going to lower the lock level of some commands, but even now, that opens us to more issues due to nonmatching table definitions et al. That doesn't sound like a good idea to me. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Using ini file to setup replication
>> *for example: for failback safe standby.* > I think that introducing another configuration format is a pretty bad idea. While something new might not turn out to be as bad, we've seen how annoying a separate configuration format turned out for recovery.conf. It's not a totally different way of configuration: the ini file would be parsed in the same way as postgresql.conf. I just want to separate out the replication parameters, to make configuration simpler for future developments in the field of replication such as failback-safe standby. > So you can just do stuff like: > > server.foo.grand_unified_config = value. But with your approach, and considering the failback safe standby use case, the parameters in postgresql.conf will vary dynamically, and I don't think doing this in postgresql.conf is a good idea because it already contains a whole bunch of parameters. For example, if I want to configure 2 servers it will add 6 lines, for 3 servers 9 lines, and for 4 servers 12 lines. The settings for a particular server, written as guc.value pairs, would be: standby_name.'AAA', synchronous_transfer.commit, wal_sender_timeout.60. Regards, Samrat Revgade -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [ODBC] getting rid of SnapshotNow
On 2013-07-19 08:57:01 +0900, Inoue, Hiroshi wrote: > >>I had the idea they were used for a client-side implementation of WHERE > >>CURRENT OF. Perhaps that's dead code and could be removed entirely? > > > >It's been reported that ODBC still uses them. > > Though PostgreSQL's TID is similar to Orale's ROWID, it is transient > and changed after update operations unfortunately. I implemented > the currtid_xx functions to supplement the difference. For example > > currtid(relname, original tid) > > (hopefully) returns the current tid of the original row when it is > updated. That is only guaranteed to work though when you're in a transaction old enough to prevent removal of the old or intermediate row versions. E.g. BEGIN; INSERT INTO foo...; -- last tid (0, 1) COMMIT; BEGIN; SELECT currtid(foo, '(0, 1')); COMMIT; can basically return no or even an arbitrarily different row. Same with an update... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] getting rid of SnapshotNow
On Fri, Jul 19, 2013 at 1:27 AM, Tom Lane wrote: > Noah Misch writes: >> To me, the major advantage of removing SnapshotNow is to force all >> third-party code to reevaluate. But that could be just as well >> achieved by renaming it to, say, SnapshotImmediate. If there are >> borderline-legitimate SnapshotNow uses in our code base, I'd lean >> toward a rename instead. Even if we decide to remove every core use, >> third-party code might legitimately reach a different conclusion on >> similar borderline cases. > > Meh. If there is third-party code with a legitimate need for > SnapshotNow, all we'll have done is to create an annoying version > dependency for them. So if we think that's actually a likely scenario, > we shouldn't rename it. But the entire point of this change IMO is that > we *don't* think there is a legitimate use-case for SnapshotNow. > > Indeed, I'm thinking I don't believe in SnapshotSelf anymore either. > It's got all the same consistency issues as SnapshotNow. In fact, it > has *more* issues, because it's also vulnerable to weirdnesses caused by > inconsistent ordering of tuple updates among multiple tuples updated by > the same command. > > Why not tell people to use SnapshotDirty if they need a > not-guaranteed-consistent result? At least then it's pretty obvious > that you're getting some randomness in with your news. You know, I didn't really consider that before, but I kind of like it. I think that would be entirely suitable (and perhaps better) for pgstattuple and get_actual_variable_range(). On further reflection, I think perhaps pgrowlocks should just register a fresh MVCC snapshot and use that. Using SnapshotDirty would return TIDs of unseen tuples, which does not seem to be what is wanted there. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: [COMMITTERS] pgsql: Allow background workers to be started dynamically.
Hi, On 2013-07-19 08:23:25 -0400, Robert Haas wrote: > On Fri, Jul 19, 2013 at 7:24 AM, Andres Freund wrote: > > The changes here make it impossible to write a bgworker which properly > > works in 9.3 and 9.4. Was that intended? If so, the commit message > > should mention the compatibility break... > > Yeah, sorry, I probably should have mentioned that. The structure > needs to be fixed size for us to store it in shared memory. > > If it was intended I propose changing the signature for 9.3 as > > well. There's just no point in releasing 9.3 when we already know which > > trivial but breaking change will be required for 9.4 > > I think that would be a good idea. > And I'd also propose getting rid > of bgw_sighup and bgw_sigterm in both branches, while we're at it. > AFAICT, they don't add any functionality, and they're basically > unusable for dynamically started background workers. Probably better > not to get people to used to using them. I don't have a problem with getting rid of those, it's easy enough to register them inside the worker and it's safe since we start with blocked signals. I seem to remember some discussion about why they were added but I can't find a reference anymore. Alvaro, do you remember? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Allow background workers to be started dynamically.
On Fri, Jul 19, 2013 at 1:23 PM, Robert Haas wrote: > On Fri, Jul 19, 2013 at 7:24 AM, Andres Freund wrote: >> The changes here make it impossible to write a bgworker which properly >> works in 9.3 and 9.4. Was that intended? If so, the commit message >> should mention the compatibility break... > > Yeah, sorry, I probably should have mentioned that. The structure > needs to be fixed size for us to store it in shared memory. > >> If it was intended I propose changing the signature for 9.3 as >> well. There's just no point in releasing 9.3 when we already know which >> trivial but breaking change will be required for 9.4 > > I think that would be a good idea. And I'd also propose getting rid > of bgw_sighup and bgw_sigterm in both branches, while we're at it. > AFAICT, they don't add any functionality, and they're basically > unusable for dynamically started background workers. Probably better > not to get people to used to using them. +1. Much better to take that pain now, before we have made a production release. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Eliminating PD_ALL_VISIBLE, take 2
On Mon, Jul 15, 2013 at 1:41 PM, Jeff Davis wrote: >> I'm of the opinion that we ought to extract the parts of the patch >> that hold the VM pin for longer, review those separately, and if >> they're good and desirable, apply them. > > I'm confused. My patch holds a VM page pinned for those cases where > PD_ALL_VISIBLE is currently used -- scans or insert/update/delete. If we > have PD_ALL_VISIBLE, there's no point in the cache, right? Hmm. You might be right. I thought there might be some benefit there, but I guess we're not going to clear the bit more than once, so maybe not. > To check visibility from an index scan, you either need to pin a heap > page or a VM page. Why would checking the heap page be cheaper? Is it > just other code in the VM-testing path that's slower? Or is there > concurrency involved in your measurements which may indicate contention > over VM pages? Well, I have seen some data that hints at contention problems with VM pages, but it's not conclusive. But the real issue is that, if the index-only scan finds the VM page not set, it still has to check the heap page. Thus, it ends up accessing two pages rather than one, and it turns out that's more expensive. It's unfortunately hard to predict whether the cost of checking VM first will pay off. It's a big win if we learn that we don't need to look at the heap page (because we don't need to read, lock, pin, or examine it, in that case) but it's a loss if we do (because checking the VM isn't free). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
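The trade-off described here can be observed from SQL with something along these lines (an illustrative sketch, not taken from the original mail):

    CREATE TABLE t (id int PRIMARY KEY, val int);
    INSERT INTO t SELECT g, g FROM generate_series(1, 100000) g;
    VACUUM t;   -- sets all-visible bits so the index-only path can skip heap pages

    EXPLAIN (ANALYZE, BUFFERS)
    SELECT id FROM t WHERE id BETWEEN 1000 AND 2000;
    -- "Heap Fetches: 0" means the visibility-map check paid off; a nonzero
    -- count means the scan had to touch both the VM page and the heap page
    -- for those rows.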
Re: [HACKERS] [COMMITTERS] pgsql: Allow background workers to be started dynamically.
On Fri, Jul 19, 2013 at 7:24 AM, Andres Freund wrote: > The changes here make it impossible to write a bgworker which properly > works in 9.3 and 9.4. Was that intended? If so, the commit message > should mention the compatibility break... Yeah, sorry, I probably should have mentioned that. The structure needs to be fixed size for us to store it in shared memory. > If it was intended I propose changing the signature for 9.3 as > well. There's just no point in releasing 9.3 when we already know which > trivial but breaking change will be required for 9.4 I think that would be a good idea. And I'd also propose getting rid of bgw_sighup and bgw_sigterm in both branches, while we're at it. AFAICT, they don't add any functionality, and they're basically unusable for dynamically started background workers. Probably better not to get people to used to using them. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.4] row level security
On 19 July 2013 04:09, Greg Smith wrote: > On 7/18/13 11:03 PM, Stephen Frost wrote: >> >> Wasn't there a wiki page about this >> feature which might also help? Bigger question- is it correct for the >> latest version of the patch? > > > https://wiki.postgresql.org/wiki/RLS has accumulated a lot of older debris > that may or may not be useful here. > > This useful piece was just presented at PGCon: > http://www.pgcon.org/2013/schedule/attachments/273_PGcon2013-kaigai-row-level-security.pdf > That is very up to date intro to the big picture issues. > Hi, I took a look at this patch too. I didn't read all the code in detail, but there was one area I wondered about --- is it still necessary to add the new field rowsec_relid to RangeTblEntry? As far as I can see, it is only used in the new function getrelid() which replaces the old macro of the same name, so that it can work if it is passed the index of the sourceRelation subquery RTE instead of the resultRelation. This seems to be encoding a specific assumption that a subquery RTE is always going to come from row-level security expansion. Is it the case that this can only happen from expand_targetlist(), in which case why not pass both source_relation and result_relation to expand_targetlist(), which I think will make that code neater. As it stands, what expand_targetlist() sees as result_relation is actually source_relation, which getrelid() then handles specially. Logically I think expand_targetlist() should pass the actual result_relation to getrelid(), opening the target table, and then pass source_relation to makeVar() when building new TLEs. With that change to expand_targetlist(), the change to getrelid() may be unnecessary, and then also the new rowsec_relid field in RangeTblEntry may not be needed. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] LATERAL quals revisited
I have couple of questions. On Wed, Jun 26, 2013 at 1:30 AM, Tom Lane wrote: > I've been studying the bug reported at > > http://www.postgresql.org/message-id/20130617235236.GA1636@jeremyevans.local > that the planner can do the wrong thing with queries like > > SELECT * FROM > i LEFT JOIN LATERAL (SELECT * FROM j WHERE i.n = j.n) j ON true; > > I think the fundamental problem is that, because the "i.n = j.n" clause > appears syntactically in WHERE, the planner is treating it as if it were > an inner-join clause; but really it ought to be considered a clause of > the upper LEFT JOIN. That is, semantically this query ought to be > equivalent to > > SELECT * FROM > i LEFT JOIN LATERAL (SELECT * FROM j) j ON i.n = j.n; > > However, because distribute_qual_to_rels doesn't see the clause as being > attached to the outer join, it's not marked with the correct properties > and ends up getting evaluated in the wrong place (as a "filter" clause > not a "join filter" clause). The bug is masked in the test cases we've > used so far because those cases are designed to let the clause get > pushed down into the scan of the inner relation --- but if it doesn't > get pushed down, it's evaluated the wrong way. > > After some contemplation, I think that the most practical way to fix > this is for deconstruct_recurse and distribute_qual_to_rels to > effectively move such a qual to the place where it logically belongs; > that is, rather than processing it when we look at the lower WHERE > clause, set it aside for a moment and then add it back when looking at > the ON clause of the appropriate outer join. This should be reasonably > easy to do by keeping a list of "postponed lateral clauses" while we're > scanning the join tree. > > For there to *be* a unique "appropriate outer join", we need to require > that a LATERAL-using qual clause that's under an outer join contain > lateral references only to the outer side of the nearest enclosing outer > join. There's no such restriction in the spec of course, but we can > make it so by refusing to flatten a sub-select if pulling it up would > result in having a clause in the outer query that violates this rule. > There's already some code in prepjointree.c (around line 1300) that > attempts to enforce this, though now that I look at it again I'm not > sure it's covering all the bases. We may need to extend that check. > > Why do we need this restriction? Wouldn't a place (specifically join qual at such a place) in join tree where all the participating relations are present, serve as a place where the clause can be applied. E.g. in the query select * from tab1 left join tab2 t2 using (val) left join lateral (select val from tab2 where val2 = tab1.val * t2.val) t3 using (val); Can't we apply (as a join qual) the qual val2 = tab1.val * t2.val at a place where we are computing join between tab1, t2 and t3? I'm inclined to process all LATERAL-using qual clauses this way, ie > postpone them till we recurse back up to a place where they can > logically be evaluated. That won't make any real difference when no > outer joins are present, but it will eliminate the ugliness that right > now distribute_qual_to_rels is prevented from sanity-checking the scope > of the references in a qual when LATERAL is present. If we do it like > this, we can resurrect full enforcement of that sanity check, and then > throw an error if any "postponed" quals are left over when we're done > recursing. > > Parameterized nested loop join would always be able to evaluate a LATERAL query. 
Instead of throwing an error, why can't we choose that as the default strategy whenever we fail to flatten the subquery? Can we put the clause with lateral references at its appropriate place while flattening the subquery? IMO, that would be cleaner and less work than first pulling the clause out and then putting it back again. Right now, we do not have that capability in pull_up_subqueries(), but given its recursive structure, it might be easier to do it there. > Thoughts, better ideas? > > regards, tom lane > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: [HACKERS] WITH CHECK OPTION for auto-updatable views
On 18 July 2013 22:27, Stephen Frost wrote: > Dean, > > > * Stephen Frost (sfr...@snowman.net) wrote: >> Thanks! This is really looking quite good, but it's a bit late and I'm >> going on vacation tomorrow, so I didn't quite want to commit it yet. :) > > Apologies on this taking a bit longer than I expected, but it's been > committed and pushed now. Please take a look and let me know of any > issues you see with the changes that I made. > Excellent. Thank you! The changes look good to me. Cheers, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Using ini file to setup replication
On 2013-07-19 14:54:16 +0530, Samrat Revagade wrote: > I was going through the archives and there was a discussion about > using ini file to setup > replication.(http://www.postgresql.org/message-id/4c9876b4.9020...@enterprisedb.com). > I think if we work on this proposal and separate out the replication > setup from postgresql.conf file then we can provide more granularity > while setting up the replication parameters. > for example, we can set different values of wal_sender_timeout for > each standby sever. > > So i think it is good idea to separate out the replication settings > from postgresql.conf file and put into ini file. > Once it is confirmed then we can extend the ini file to support future > developments into replication. > *for example: for failback safe standby.* I think that introducing another configuration format is a pretty bad idea. While something new might not turn out to be as bad, we've seen how annoying a separate configuration format turned out for recovery.conf. I'd much rather go ahead and remove the nesting limit of GUCs. That should give us just about all that can be achieved with an ini file with a 1 line change. Sometime we might want to extend our format to add ini like sections but I think that *definitely* should be a separate proposal. I've even proposed that in the past in 20130225211533.gd3...@awork2.anarazel.de . I plan to propose an updated version of that patch (not allowing numeric 2nd level ids) for the next CF. So you can just do stuff like: server.foo.grand_unified_config = value. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Using ini file to setup replication
Please find updated hyperlinks: > Below I have explained how to to use ini file for failback safe stadby setup: > While discussing feature of fail-back safe standby > (CAF8Q-Gy7xa60HwXc0MKajjkWFEbFDWTG=ggyu1kmt+s2xcq...@mail.gmail.com) http://www.postgresql.org/message-id/CAF8Q-Gy7xa60HwXc0MKajjkWFEbFDWTG=ggyu1kmt+s2xcq...@mail.gmail.com > We have decided to use the *ini* file to configure the fail-back safe standby > here is the link for that: > cad21aocy2_bqvpzjey7s77amnccbxfj+1gphggdbulklav0...@mail.gmail.com http://www.postgresql.org/message-id/cad21aocy2_bqvpzjey7s77amnccbxfj+1gphggdbulklav0...@mail.gmail.com Regards, Samrat Revgade -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Using ini file to setup replication
Hi, I was going through the archives and there was a discussion about using an ini file to set up replication (http://www.postgresql.org/message-id/4c9876b4.9020...@enterprisedb.com). I think if we work on this proposal and separate out the replication setup from the postgresql.conf file, then we can provide more granularity while setting up the replication parameters; for example, we can set different values of wal_sender_timeout for each standby server. So I think it is a good idea to separate the replication settings out of postgresql.conf and put them into an ini file. Once that is confirmed, we can extend the ini file to support future developments in replication, *for example: for failback safe standby.* Below I have explained how to use an ini file for the failback-safe standby setup. While discussing the fail-back safe standby feature (CAF8Q-Gy7xa60HwXc0MKajjkWFEbFDWTG=ggyu1kmt+s2xcq...@mail.gmail.com) we decided to use an *ini* file to configure the fail-back safe standby; here is the link for that: cad21aocy2_bqvpzjey7s77amnccbxfj+1gphggdbulklav0...@mail.gmail.com But there has been no strong positive/negative feedback on the concept of introducing the ini file, so please have a look at it and give feedback. In today's scenario, replication only has two ways of configuring a standby server: 1. Asynchronous standby 2. Synchronous standby. With the failback safe standby patch we have more granularity in setting up the standby servers: 1. Default synchronous standby (AAA) 2. Default asynchronous standby (BBB) 3. Synchronous standby which is also a failback safe standby (CCC) 4. Asynchronous standby which is also a failback safe standby (DDD). For failback safe standby we are thinking of adding more granular settings of replication parameters, for example: 1. The user can set a separate value of wal_sender_timeout for each server. 2. The user can set a separate value of synchronous_transfer for each server. Consider the scenario where the user wants to set up the 4 standby servers as explained above; the settings for them would be:

- ini file -
[Server]
standby_name = 'AAA'
synchronous_transfer = commit
wal_sender_timeout = 60

[Server]
standby_name = 'BBB'
synchronous_transfer = none
wal_sender_timeout = 40

[Server]
standby_name = 'CCC'
synchronous_transfer = all
wal_sender_timeout = 50

[Server]
standby_name = 'DDD'
synchronous_transfer = data_flush
wal_sender_timeout = 50

Setting up such a scenario through the postgresql.conf file is impossible, and if we try to do that it will add a lot of complexity to the code, so using an ini file would be a very good choice in this case. Thank you, Samrat -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Improvement of checkpoint IO scheduler for stable transaction responses
(2013/07/19 0:41), Greg Smith wrote: > On 7/18/13 11:04 AM, Robert Haas wrote: > On a system where fsync is sometimes very very slow, that might result in the checkpoint overrunning its time budget - but SO WHAT? > Checkpoints provide a boundary on recovery time. That is their only purpose. You can always do better by postponing them, but you've now changed the agreement with the user about how long recovery might take. Recently, a user who think system availability is important uses synchronous replication cluster. And, as Robert saying, a user who cannot build cluster system will not use this function in GUC. When it became IO busy in calling fsync(), my patch does not take the over IO load in fsync(). Actually, it is the same as OS writeback structure. I read kernel source code which is fs/fs-writeback.c in linux-2.6.32-358.0.1.el6. It is latest RHEL6.4 kernel code. It seems that wb_writeback() controlled disk IO in OS-writeback function. Please see under source code. If OS think IO is busy, it does not write more IO for bail.

fs/fs-writeback.c @wb_writeback()
623                 /*
624                  * For background writeout, stop when we are below the
625                  * background dirty threshold
626                  */
627                 if (work->for_background && !over_bground_thresh())
628                         break;
629
630                 wbc.more_io = 0;
631                 wbc.nr_to_write = MAX_WRITEBACK_PAGES;
632                 wbc.pages_skipped = 0;
633
634                 trace_wbc_writeback_start(&wbc, wb->bdi);
635                 if (work->sb)
636                         __writeback_inodes_sb(work->sb, wb, &wbc);
637                 else
638                         writeback_inodes_wb(wb, &wbc);
639                 trace_wbc_writeback_written(&wbc, wb->bdi);
640                 work->nr_pages -= MAX_WRITEBACK_PAGES - wbc.nr_to_write;
641                 wrote += MAX_WRITEBACK_PAGES - wbc.nr_to_write;
642
643                 /*
644                  * If we consumed everything, see if we have more
645                  */
646                 if (wbc.nr_to_write <= 0)
647                         continue;
648                 /*
649                  * Didn't write everything and we don't have more IO, bail
650                  */
651                 if (!wbc.more_io)
652                         break;
653                 /*
654                  * Did we write something? Try for more
655                  */
656                 if (wbc.nr_to_write < MAX_WRITEBACK_PAGES)
657                         continue;
658                 /*
659                  * Nothing written. Wait for some inode to
660                  * become available for writeback. Otherwise
661                  * we'll just busyloop.
662                  */
663                 spin_lock(&inode_lock);
664                 if (!list_empty(&wb->b_more_io)) {
665                         inode = list_entry(wb->b_more_io.prev,
666                                            struct inode, i_list);
667                         trace_wbc_writeback_wait(&wbc, wb->bdi);
668                         inode_wait_for_writeback(inode);
669                 }
670                 spin_unlock(&inode_lock);
671         }
672
673         return wrote;

I want you to read especially point that is line 631, 651, and 656. MAX_WRITEBACK_PAGES is 1024 (1024 * 4096 byte). OS writeback scheduler does not write over MAX_WRITEBACK_PAGES. Because, if it write big data than MAX_WRITEBACK_PAGES, it will be IO-busy. And if it cannot write at all, OS think it needs recovery of IO performance. It is same as my patch's logic. In addition, if you set a large value of a checkpoint_timeout or checkpoint_complete_taget, you have said that performance is improved, but is it true in all the cases? Since the write of the dirty buffer which passed 30 seconds or more is carried out at intervals of 5 seconds, as there are many recesses of a write, a possibility of becoming an inefficient random write. For example, as for the worsening case, when the sleep time for 200 ms is inserted each time, since only 25 page (200 KB) can write in 5 seconds. I think it is bad efficiency to write. When a checkpoint complication target is actually enlarged, performance may fall in some cases. I think this as the last fsync having become heavy owing to having write in slowly. 
I would like to draw up, together with you, an itemized list of tests that could serve as proof of my patch, because the DBT-2 benchmark takes a lot of time, about 3 - 4 hours per test setting. Of course, I think it is important to obtain your consent on it. Best regards, -- Mitsumasa KONDO NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers