Re: [HACKERS] Autovacuum and Autoanalyze
On Tue, Sep 16, 2008 at 08:59:08PM -0400, Alvaro Herrera wrote:
> Simon Riggs wrote:
> > Disabling autovacuum can have catastrophic effects, since it disables
> > the ANALYZing of tables. Can we have a mode where we disable
> > autoVACUUM yet enable autoANALYZE?
>
> You mean something like autovacuum = on / off / analyze ?
> We can certainly do that, but is there buy-in?

+1

Having autovacuum on during bulk loads can really tank performance, but
having autoanalyze on is good :)

Cheers,
David.
--
David Fetter [EMAIL PROTECTED] http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: [EMAIL PROTECTED]

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: [HACKERS] Proposal of SE-PostgreSQL patches (for CommitFest:Sep)
Greg Smith wrote:
> On Wed, 17 Sep 2008, Peter Eisentraut wrote:
> > System-wide consistency in access controls could be nice to have in
> > some cases. But is it really achievable? In the typical three-tier
> > web application scenario, do you really have system-wide
> > consistency? Can you configure your application server using
> > SELinux?
>
> Each of the tiers ends up with a mapping layer similar to the one
> implemented here to map the SELinux permissions -> PostgreSQL. Java,
> for example, has a whole JVM security manager component that makes it
> straightforward to do such a mapping.
> http://articles.techrepublic.com.com/5100-10878_11-6178805.html is a
> good quick intro that shows how the call structure is similar to what
> the SE-PostgreSQL code does.

I guess these security architectures have the same origin. The
reference monitor concept requires all accesses to data objects to be
checked by a tamperproof, always-invoked module based on its policy.
  http://en.wikipedia.org/wiki/Reference_monitor
SE-PostgreSQL uses in-kernel SELinux as a reference monitor to check
all accesses to database objects via SQL.

> > And is SELinux really the desirable interface for a system-wide
> > access control facility? Why not port chmod or POSIX ACLs to
> > PostgreSQL, or port SQL roles back to the operating system, or
> > something else that captures what more people are actually using in
> > practice.
>
> The main feature of SELinux that this crowd likes is how it manages
> privilege escalation risk. I'm not sure if POSIX ACLs for example are
> as effective at limiting the damage an exploitable suid binary can
> cause. As for what people are actually using, as someone who lives
> near the US capital I can tell you that installs using SELinux are
> quite plentiful around here--there really is no other UNIX-based
> technology for this purpose that's gotten traction inside this
> government like SELinux has.
>
> Anyway, even though I think picking SELinux as the primary security
> mechanism to integrate with is a sensible choice and I'm confident
> that the rest of the software stack isn't a problem, I do share your
> concern that implementing row and column-level security would make
> more sense in a general way first.

Thanks for your explanation. The PGACE security framework can host an
OS-independent fine-grained access control feature, like Oracle Label
Security. However, one concern is that we have only one CommitFest
remaining. As I mentioned in the previous message, I don't think it is
strange behavior for different security subsystems to make different
decisions at different granularities.

> > Ultimately, I see this patch as an interesting proof of concept --
> > it got us on the NSA site anyway -- but I can't see more than three
> > people actually making use of it
>
> I take it you've never seen how big the NSA fort^H^H^H^Hfacility is?
> I'm not sure exactly how many orders of magnitude your estimate is
> off by, but I know it's at least 2 just based on conversations I've
> been involved in with companies around here. A lot of the US defense
> and homeland security related companies are adopting open-source
> software stacks because they can audit every level of the software,
> and there's a big void in that stack waiting for a database with the
> right security model to fill. You are right that getting code
> contributions back again is a challenge though.

I don't have statistically reliable information. :) However, I believe
there is potentially strong demand for a secure database, judging from
audience responses when I have presented SE-PostgreSQL on various
occasions. IIRC, Josh also said similar things.
Thanks,
--
OSS Platform Development Division, NEC
KaiGai Kohei [EMAIL PROTECTED]
Re: [HACKERS] Autovacuum and Autoanalyze
David Fetter wrote:
> On Tue, Sep 16, 2008 at 08:59:08PM -0400, Alvaro Herrera wrote:
> > Simon Riggs wrote:
> > > Disabling autovacuum can have catastrophic effects, since it
> > > disables the ANALYZing of tables. Can we have a mode where we
> > > disable autoVACUUM yet enable autoANALYZE?
> >
> > You mean something like autovacuum = on / off / analyze ?
> > We can certainly do that, but is there buy-in?
>
> +1
>
> Having autovacuum on during bulk loads can really tank performance,
> but having autoanalyze on is good :)

Isn't autoanalyze a waste of time during a bulk load? Seems better to
run ANALYZE manually at the end.

Adding that option feels natural to me, but it is a rather blunt
instrument. You can already do that with pg_autovacuum, though that
interface isn't very user-friendly. I whole-heartedly support the idea
of controlling autovacuum with storage options, e.g.
ALTER TABLE ... WITH (autoanalyze = on).

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
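To make the two alternatives concrete, a sketch (the pg_autovacuum
column list follows the 8.3 catalog as I recall it; treat the exact
names and the table name as assumptions, not gospel):

    -- The existing-but-unfriendly route: effectively disable vacuuming
    -- of one table by setting its thresholds absurdly high, while
    -- leaving the analyze thresholds at the defaults (-1).
    INSERT INTO pg_autovacuum
            (vacrelid, enabled,
             vac_base_thresh, vac_scale_factor,
             anl_base_thresh, anl_scale_factor,
             vac_cost_delay, vac_cost_limit,
             freeze_min_age, freeze_max_age)
    VALUES ('my_bulk_table'::regclass, true,
            2000000000, 1000,   -- vacuum thresholds: never reached
            -1, -1,             -- analyze thresholds: use the defaults
            -1, -1, -1, -1);

    -- The proposed storage-option syntax (not implemented at the time
    -- of this thread):
    ALTER TABLE my_bulk_table WITH (autoanalyze = on);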
Re: [HACKERS] Autovacuum and Autoanalyze
On Wed, 2008-09-17 at 10:09 +0300, Heikki Linnakangas wrote:
> David Fetter wrote:
> > Having autovacuum on during bulk loads can really tank performance,
> > but having autoanalyze on is good :)
>
> Isn't autoanalyze a waste of time during a bulk load? Seems better to
> run ANALYZE manually at the end.

It's not a waste of time, because it catches tables immediately after
they have been loaded, not just at the end of the bulk load. Running
ANALYZE is a waste of time if autoanalyze has already caught it, which
is why that's never been added onto the end of a pg_dump script. But
currently this is true only when we have both autoVACUUM and
autoANALYZE enabled.

> Adding that option feels natural to me, but it is a rather blunt
> instrument. You can already do that with pg_autovacuum, though that
> interface isn't very user-friendly. I whole-heartedly support the
> idea of controlling autovacuum with storage options, e.g.
> ALTER TABLE ... WITH (autoanalyze = on).

Yes, have that option also, since it is fine tuning. I definitely want
a blunt instrument! I don't want to have to run ALTER TABLE on *every*
table. Even if you think that's possible, it won't work in conjunction
with interfaces submitting standard SQL, plus it won't work if I forget
either.

This request comes from a real situation where a dump was reloaded
during the day when autovacuum was off, and so ANALYZE was missed. Not
my mistake, but it took time to resolve, time that could have been
avoided by the new option suggested here.

--
Simon Riggs
www.2ndQuadrant.com
PostgreSQL Training, Services and Support
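The missed step in Simon's scenario is just the following (the table
name is made up for illustration):

    ANALYZE;             -- whole database, after the reload completes
    ANALYZE my_table;    -- or per table, as each load finishes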
Re: [HACKERS] EXEC_BACKEND
On Tue, 2008-09-16 at 15:53 -0400, Tom Lane wrote:
> Simon Riggs [EMAIL PROTECTED] writes:
> > We keep talking about EXEC_BACKEND mode, though until recently I had
> > misunderstood what that meant. I also realised that I have more than
> > once neglected to take it into account when writing a patch - one
> > recent patch failed to do this. I can't find anything coherent in
> > docs/readme/comments to explain why it exists and what its
> > implications are.
>
> It exists because Windows doesn't have fork(), only the equivalent of
> fork-and-exec. Which means that no state variables will be inherited
> from the postmaster by its child processes, and any state that needs
> to be carried across has to be handled explicitly. You can define
> EXEC_BACKEND in a non-Windows build, for the purpose of testing code
> to see if it works in that environment.

OK, if it's that simple then I see why it's not documented. Thanks. I
thought there might be more to it than that.

--
Simon Riggs
www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch
2008/9/17 Tom Lane [EMAIL PROTECTED]:
> Tatsuo Ishii [EMAIL PROTECTED] writes:
> > > Do we really have to make RECURSIVE a fully reserved keyword?
> >
> > According to the standard, RECURSIVE is a reserved keyword, I
> > believe.
>
> Sure, but our general rule is to make keywords no more reserved than
> is absolutely necessary to make the bison grammar unambiguous. I
> haven't tested, but I'm thinking that if WITH is fully reserved then
> RECURSIVE shouldn't have to be.

I am not sure this rule is good. Somebody who develops on PostgreSQL
will have problems when porting to other databases in the future.
Reserved words in the standard should be respected.

regards
Pavel Stehule

> regards, tom lane
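For reference, the construct the keyword introduces; this simple
counting query follows the syntax in the SQL standard:

    WITH RECURSIVE t(n) AS (
        VALUES (1)          -- non-recursive term
      UNION ALL
        SELECT n + 1 FROM t -- recursive term, self-reference to t
        WHERE n < 100
    )
    SELECT sum(n) FROM t;

Only the keyword's degree of reservedness is in question here, not the
feature itself.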
Re: [HACKERS] text search patch status update?
Sushant Sinha wrote:
> Patch #2. I think this is a straightforward bug fix.

Yes, I think you're right. In hlCover(), *q is 0 when the only match is
the first item in the text, and we shouldn't bail out with "return
false" in that case.

But there seems to be something else going on here as well:

postgres=# select ts_headline('1 2 3 4 5', '2'::tsquery, 'MinWords=2, MaxWords=3');
 ts_headline
--------------
 <b>2</b> 3 4
(1 row)

postgres=# select ts_headline('aaa1 aaa2 aaa3 aaa4 aaa5','aaa2'::tsquery, 'MinWords=2, MaxWords=3');
   ts_headline
------------------
 <b>aaa2</b> aaa3
(1 row)

In the first example, you get three words, and in the 2nd, just two. It
must be because of the default ShortWord setting of 3.

Also, if only the last word matches, and it's a short word, you get the
whole text:

postgres=# select ts_headline('1 2 3 4 5','5'::tsquery, 'MinWords=2, MaxWords=3');
    ts_headline
-------------------
 1 2 3 4 <b>5</b>
(1 row)

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
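If the default ShortWord=3 really is what differs between the two
examples, overriding it should make them behave alike; a quick check
one could run (a sketch only, output not verified here):

    select ts_headline('1 2 3 4 5', '2'::tsquery,
                       'MinWords=2, MaxWords=3, ShortWord=0');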
Re: [HACKERS] [PATCHES] libpq events patch (with sgml docs)
> Now, it's questionable how tense we need to be about that as long as
> event proc failure aborts calling of later event procs. That means
> that procs have to be robust against getting DESTROY with no CREATE
> calls in any case. Should we try to make that less uncertain?

I attached a patch that adds a 'needDestroy' member to PGEvent. It is
set when resultcreate or resultcopy succeeds and checked during a
PQclear. That *should* resolve the issue of no resultcreate but gets a
resultdestroy.

> The general question of symmetry between RESULTCREATE and
> RESULTDESTROY callbacks is still bothering me. As committed,
> PQmakeEmptyPGresult will copy events into its result, but not fire
> RESULTCREATE for them ... but they'll get RESULTDESTROY when it's
> deleted. Is that what we want?

PQmakeEmptyPGresult was given thought. The problem is every internal
function that generates a result calls PQmakeEmptyPGresult, but those
cases should not fire a resultcreate. resultcreate should be called
when the result is fully constructed (tuples and all) so the eventproc
gets a useful PGresult. One solution is to do something like the below:

PGresult *
PQmakeEmptyPGresult(PGconn *conn, ExecStatusType status)
{
    return pqMakeEmptyPGresult(conn, status, TRUE);
}

PGresult *
pqMakeEmptyPGresult(PGconn *conn, ExecStatusType status, int fireEvents)
{
    /* existing function; the only change is handling fireEvents */
}

I am willing to create a patch for this. Is this an acceptable idea?

> I don't have a lot of faith that PQgetResult is the only place inside
> libpq that needs to fire RESULTCREATE, either.

I looked again and I didn't see anything. Is there something you're
thinking of? ISTM that PQgetResult is called everywhere a result is
created (outside of PQmakeEmptyPGresult).

--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/

Index: src/interfaces/libpq/fe-exec.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/interfaces/libpq/fe-exec.c,v
retrieving revision 1.198
diff -C6 -r1.198 fe-exec.c
*** src/interfaces/libpq/fe-exec.c    17 Sep 2008 04:31:08 -0000    1.198
--- src/interfaces/libpq/fe-exec.c    17 Sep 2008 10:40:41 -0000
***************
*** 356,367 ****
--- 356,369 ----
          if (!dest->events[i].proc(PGEVT_RESULTCOPY, &evt,
                                    dest->events[i].passThrough))
          {
              PQclear(dest);
              return NULL;
          }
+ 
+         dest->events[i].needDestroy = TRUE;
      }
  
      return dest;
  }
  
  /*
***************
*** 378,394 ****
          return NULL;
      newEvents = (PGEvent *) malloc(count * sizeof(PGEvent));
      if (!newEvents)
          return NULL;
-     memcpy(newEvents, events, count * sizeof(PGEvent));
- 
-     /* NULL out the data pointers and deep copy names */
      for (i = 0; i < count; i++)
      {
          newEvents[i].data = NULL;
          newEvents[i].name = strdup(newEvents[i].name);
          if (!newEvents[i].name)
          {
              while (--i >= 0)
                  free(newEvents[i].name);
--- 380,396 ----
          return NULL;
      newEvents = (PGEvent *) malloc(count * sizeof(PGEvent));
      if (!newEvents)
          return NULL;
      for (i = 0; i < count; i++)
      {
+         newEvents[i].proc = events[i].proc;
+         newEvents[i].passThrough = events[i].passThrough;
+         newEvents[i].needDestroy = FALSE;
          newEvents[i].data = NULL;
          newEvents[i].name = strdup(newEvents[i].name);
          if (!newEvents[i].name)
          {
              while (--i >= 0)
                  free(newEvents[i].name);
***************
*** 663,679 ****
      if (!res)
          return;
      for (i = 0; i < res->nEvents; i++)
      {
!         PGEventResultDestroy evt;
  
-         evt.result = res;
-         (void) res->events[i].proc(PGEVT_RESULTDESTROY, &evt,
-                                    res->events[i].passThrough);
          free(res->events[i].name);
      }
  
      if (res->events)
          free(res->events);
--- 665,685 ----
      if (!res)
          return;
      for (i = 0; i < res->nEvents; i++)
      {
!         if (res->events[i].needDestroy)
!         {
!             PGEventResultDestroy evt;
! 
!             evt.result = res;
!             (void) res->events[i].proc(PGEVT_RESULTDESTROY, &evt,
!                                        res->events[i].passThrough);
!         }
          free(res->events[i].name);
      }
  
      if (res->events)
          free(res->events);
***************
*** 1609,1620 ****
--- 1615,1628 ----
                        libpq_gettext("PGEventProc \"%s\" failed during PGEVT_RESULTCREATE event\n"),
                        res->events[i].name);
              pqSetResultError(res, conn->errorMessage.data);
              res->resultStatus = PGRES_FATAL_ERROR;
              break;
          }
+ 
+             res->events[i].needDestroy = TRUE;
          }
      }
      return res;
  }
Index: src/interfaces/libpq/libpq-int.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/interfaces/libpq/libpq-int.h,v
retrieving revision 1.132
diff -C6 -r1.132 libpq-int.h
*** src/interfaces/libpq/libpq-int.h    17 Sep 2008 04:31:08 -0000    1.132
--- src/interfaces/libpq/libpq-int.h    17 Sep 2008 10:40:41 -0000
***************
*** 153,164 ****
--- 153,165 ----
  typedef struct PGEvent
  {
      PGEventProc proc;        /* the function
Re: [HACKERS] [PATCHES] libpq events patch (with sgml docs)
Andrew Chernow wrote:
> > Now, it's questionable how tense we need to be about that as long as
> > event proc failure aborts calling of later event procs. That means
> > that procs have to be robust against getting DESTROY with no CREATE
> > calls in any case. Should we try to make that less uncertain?
>
> I attached a patch that adds a 'needDestroy' member to PGEvent. It is
> set when resultcreate or resultcopy succeeds and checked during a
> PQclear. That *should* resolve the issue of no resultcreate but gets a
> resultdestroy.

Shoot. I have a booboo in that last patch. Attached is the corrected
version.

--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/

Index: src/interfaces/libpq/fe-exec.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/interfaces/libpq/fe-exec.c,v
retrieving revision 1.198
diff -C6 -r1.198 fe-exec.c
*** src/interfaces/libpq/fe-exec.c    17 Sep 2008 04:31:08 -0000    1.198
--- src/interfaces/libpq/fe-exec.c    17 Sep 2008 10:49:10 -0000
***************
*** 356,367 ****
--- 356,369 ----
          if (!dest->events[i].proc(PGEVT_RESULTCOPY, &evt,
                                    dest->events[i].passThrough))
          {
              PQclear(dest);
              return NULL;
          }
+ 
+         dest->events[i].needDestroy = TRUE;
      }
  
      return dest;
  }
  
  /*
***************
*** 378,396 ****
          return NULL;
      newEvents = (PGEvent *) malloc(count * sizeof(PGEvent));
      if (!newEvents)
          return NULL;
-     memcpy(newEvents, events, count * sizeof(PGEvent));
- 
-     /* NULL out the data pointers and deep copy names */
      for (i = 0; i < count; i++)
      {
          newEvents[i].data = NULL;
!         newEvents[i].name = strdup(newEvents[i].name);
          if (!newEvents[i].name)
          {
              while (--i >= 0)
                  free(newEvents[i].name);
              free(newEvents);
              return NULL;
--- 380,398 ----
          return NULL;
      newEvents = (PGEvent *) malloc(count * sizeof(PGEvent));
      if (!newEvents)
          return NULL;
      for (i = 0; i < count; i++)
      {
+         newEvents[i].proc = events[i].proc;
+         newEvents[i].passThrough = events[i].passThrough;
+         newEvents[i].needDestroy = FALSE;
          newEvents[i].data = NULL;
!         newEvents[i].name = strdup(events[i].name);
          if (!newEvents[i].name)
          {
              while (--i >= 0)
                  free(newEvents[i].name);
              free(newEvents);
              return NULL;
***************
*** 663,679 ****
      if (!res)
          return;
      for (i = 0; i < res->nEvents; i++)
      {
!         PGEventResultDestroy evt;
  
-         evt.result = res;
-         (void) res->events[i].proc(PGEVT_RESULTDESTROY, &evt,
-                                    res->events[i].passThrough);
          free(res->events[i].name);
      }
  
      if (res->events)
          free(res->events);
--- 665,685 ----
      if (!res)
          return;
      for (i = 0; i < res->nEvents; i++)
      {
!         if (res->events[i].needDestroy)
!         {
!             PGEventResultDestroy evt;
! 
!             evt.result = res;
!             (void) res->events[i].proc(PGEVT_RESULTDESTROY, &evt,
!                                        res->events[i].passThrough);
!         }
          free(res->events[i].name);
      }
  
      if (res->events)
          free(res->events);
***************
*** 1609,1620 ****
--- 1615,1628 ----
                        libpq_gettext("PGEventProc \"%s\" failed during PGEVT_RESULTCREATE event\n"),
                        res->events[i].name);
              pqSetResultError(res, conn->errorMessage.data);
              res->resultStatus = PGRES_FATAL_ERROR;
              break;
          }
+ 
+             res->events[i].needDestroy = TRUE;
          }
      }
      return res;
  }
Index: src/interfaces/libpq/libpq-int.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/interfaces/libpq/libpq-int.h,v
retrieving revision 1.132
diff -C6 -r1.132 libpq-int.h
*** src/interfaces/libpq/libpq-int.h    17 Sep 2008 04:31:08 -0000    1.132
--- src/interfaces/libpq/libpq-int.h    17 Sep 2008 10:49:10 -0000
***************
*** 153,164 ****
--- 153,165 ----
  typedef struct PGEvent
  {
      PGEventProc proc;        /* the function to call on events */
      char       *name;        /* used only for error messages */
      void       *passThrough; /* pointer supplied at registration time */
      void       *data;        /* optional state (instance) data */
+     int         needDestroy; /* indicates a PGEVT_RESULTDESTROY is needed. */
  } PGEvent;
  
  struct pg_result
  {
      int         ntups;
      int         numAttributes;
Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch
> I am not sure this rule is good. Somebody who develops on PostgreSQL
> will have problems when porting to other databases in the future.
> Reserved words in the standard should be respected.

I disagree. I have never ported an app written for PostgreSQL to
another database system, and have no plans to start. The fact that some
other database system might barf on a particular bit of SQL is
insufficient reason for PostgreSQL to do the same thing.

If people want to write code that will work on multiple databases, they
should of course avoid using any SQL reserved words for anything other
than their reserved purposes. But there is no reason for the database
system to unilaterally shove that down everyone's throat. It is very
easy to overdo the idea of protecting users from themselves.

...Robert
Re: [HACKERS] [PATCHES] libpq events patch (with sgml docs)
Andrew Chernow [EMAIL PROTECTED] writes:
> > Now, it's questionable how tense we need to be about that as long as
> > event proc failure aborts calling of later event procs. That means
> > that procs have to be robust against getting DESTROY with no CREATE
> > calls in any case. Should we try to make that less uncertain?
>
> I attached a patch that adds a 'needDestroy' member to PGEvent. It is
> set when resultcreate or resultcopy succeeds and checked during a
> PQclear. That *should* resolve the issue of no resultcreate but gets a
> resultdestroy.

Some thought would need to be given to how that interacts with
RESULTCOPY. Presumably on the destination side, RESULTCOPY is the
equivalent of RESULTCREATE, ie, don't DESTROY if you didn't get COPY.
But what of the source side? Arguably you shouldn't invoke COPY on
events that were never initialized in this object.

I also think that a little bit of thought should go into whether or not
to call DESTROY on an event that *did* get a CREATE and failed it. You
could argue that one either way: should a failing CREATE operation be
expected to fully clean up after itself, or should it be expected to
leave things in a state where DESTROY can clean up properly? I'm not
entirely sure, but option A might force duplication of code between
CREATE's failure path and DESTROY. Whichever semantics we choose needs
to be documented.

Also, if we choose option B, then the same would hold for REGISTER
versus CONNDESTROY: failing REGISTER should still mean that you get a
CONNDESTROY. So maybe that's an argument for option A, because that's
how REGISTER works now.

Lastly, the idea that was in the back of my mind was to resolve this
issue by unconditionally calling all the event procs at CREATE time,
not by cutting off the later ones if an earlier one failed. Again, I do
not see a performance argument for skipping the extra steps, and it
seems to me that it might improve symmetry/robustness. I'm not
necessarily wedded to that approach but it bears thinking about.

			regards, tom lane
Re: [HACKERS] New FSM patch
Heikki Linnakangas wrote:
> Heikki Linnakangas wrote:
<snip>
> Let me describe this test case first:
> - The test program calls RecordAndGetPageWithFreeSpace in a tight
> loop, with random values. There's no activity to the heap. In normal
> usage, the time spent in RecordAndGetWithFreeSpace is minuscule
> compared to the heap and index updates that cause
> RecordAndGetWithFreeSpace to be called.
> - WAL was placed on a RAM drive. This is of course not how people set
> up their database servers, but the point of this test was to measure
> CPU speed and scalability. The impact of writing extra WAL is
> significant and needs to be taken into account, but that's a separate
> test and discussion, and needs to be considered in comparison to the
> WAL written by heap and index updates.
<snip>
> Another surprise was how badly both implementations scale. On CVS
> HEAD, I expected the performance to be roughly the same with 1 and 2
> clients, because all access to the FSM is serialized on the
> FreeSpaceLock. But adding the 2nd client not only didn't help, but it
> actually made the performance much worse than with a single client.
> Context switching or cache line contention, perhaps? The new FSM
> implementation shows the same effect, which was an even bigger
> surprise. At table sizes > 32 MB, the FSM no longer fits on a single
> FSM page, so I expected almost a linear speed up with bigger table
> sizes from using multiple clients. That's not happening, and I don't
> know why. Although, going from 33MB to 333 MB, the performance with 2
> clients almost doubles, but it still doesn't exceed that with 1
> client.

I tested it with DTrace on Solaris 10 on an 8-CPU SPARC machine, and I
got similar results to yours. The main problem in your new
implementation is locking. On small tables, where the FSM fits on one
page, clients spend about 3/4 of their time waiting on the page lock.
On medium tables (2-level FSM), the WALInsertLock becomes significant -
it takes 1/4 of the waiting time; page lock waits take only 1/3. I
think the main reason for the scalability problem is that the locking
enforces serialization.

Suggestions:

1) Remove WAL logging. I think FSM records should be recovered during
processing of other WAL records (like insert and update). Probably all
we need is a full-page write on the first modification after a
checkpoint.

2) Break up the lock - use only a share lock for page locking and
divide the page into smaller parts for exclusive locking (at least for
the root page).

However, your test case is too artificial. I'm going to run an OLTP
workload to test it with a real workload.

	Zdenek
Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch
Robert Haas [EMAIL PROTECTED] writes:
> > I am not sure this rule is good. Somebody who develops on PostgreSQL
> > will have problems when porting to other databases in the future.
> > Reserved words in the standard should be respected.
>
> I disagree. I have never ported an app written for PostgreSQL to
> another database system, and have no plans to start. The fact that
> some other database system might barf on a particular bit of SQL is
> insufficient reason for PostgreSQL to do the same thing.
>
> If people want to write code that will work on multiple databases,
> they should of course avoid using any SQL reserved words for anything
> other than their reserved purposes.

More than that, they have to actually test their SQL on each target DB.
Every DB (including us) is going to have some reserved words that are
not in the standard; so imagining that Postgres can all by itself
protect you from this type of problem is doomed to failure anyway.

			regards, tom lane
Re: [HACKERS] Patch for SQL-standard negative valued year-month literals
Tom Lane wrote:
> Ron Mayer [EMAIL PROTECTED] writes:
> > Tom Lane wrote:
> > If I read SQL 200N's spec correctly,
> >   select interval '-1 1:00:00';
> > should mean -1 days -1 hours, yet 8.3 sees it as -1 days +1 hours.
>
> I think we are kind of stuck on this one. If we change it, then how
> would one represent -1 days +1 hours? The spec's format is only sane
> if you assume all the fields must have the same sign, which is not the
> case for PG.

-1 days +1 hours = interval '-0 23:00:00'

Intervals are a scalar, not an addition of assorted values; alternating
signs between fields would be wrong.

--
Sincerely,
Stephen R. van den Berg.
"He did a quarter of the work in *half* the time!"
Re: [HACKERS] Patch for SQL-standard negative valued year-month literals
Stephen R. van den Berg [EMAIL PROTECTED] writes:
> Intervals are a scalar, not an addition of assorted values;
> alternating signs between fields would be wrong.

Sorry, you're the one who's wrong on that. We've treated intervals as
three independent fields for years now (and before that it was two
independent fields). We're not going to throw away that capability.

			regards, tom lane
Re: [HACKERS] per-table autovacuum configuration
Alvaro Herrera wrote:
> > any new thoughts on the matter? Perhaps someone is already working
> > on it?
>
> It is still a wanted feature, but a couple of people have offered
> patches and I'm waiting for them ...

Aha, good. I was considering taking a stab at it, but under the
circumstances will wait and see.

regards, Martin
Re: [HACKERS] Proposal of SE-PostgreSQL patches (for CommitFest:Sep)
KaiGai Kohei wrote:
> Peter, thanks for your comments. Let's review:
>
> > *) System-wide consistency in access controls could be nice to have
> > in some cases. But is it really achievable? In the typical
> > three-tier web application scenario, do you really have system-wide
> > consistency? Can you configure your application server using
> > SELinux? I'm no expert on these things, but I wonder, would it even
> > work in a useful way, over the network, with all the different
> > threads, processes, and sessions going on? Or how about a desktop,
> > pgAdmin with several database connections, can those be isolated
> > from each other or whatever the security setup may be?
>
> It's a good question. Yes, it is possible, needless to say. :)
>
> We can configure Apache to kick its contents handler with a proper
> security context. The contents handler is a sort of Apache module to
> handle various kinds of web contents like *.html, *.php, *.cgi and so
> on. The existing module (mod_selinux) enables invoking a CGI program
> with a proper security context based on HTTP authentication. In
> addition, the upcoming Linux kernel got a feature to assign built-in
> scripts their security context. SELinux applies its access controls
> based on the assigned security context for various kinds of objects
> like files, sockets, IPCs, tables, columns and so on.
>
> I can provide a demonstration; please wait a while for me to set it
> up.

The following URL shows the demonstration:
  http://kaigai.myhome.cx/index.php

It requires HTTP authentication, and you can choose one of "foo", "var"
or "baz". They can be authenticated with the same password: "sepgsql".
The web server assigns a per-user security context for its contents
handler, including the PHP script. It shows the result set of a SQL
query that depends on the security context of its client.

(note) This script always connects to the SE-PostgreSQL server with an
"apache" role that has privileged user rights.

Thanks,
--
KaiGai Kohei [EMAIL PROTECTED]
Re: [HACKERS] [PATCHES] libpq events patch (with sgml docs)
Tom Lane wrote:
> Some thought would need to be given to how that interacts with
> RESULTCOPY. Presumably on the destination side, RESULTCOPY is the
> equivalent of RESULTCREATE, ie, don't DESTROY if you didn't get COPY.
> But what of the source side? Arguably you shouldn't invoke COPY on
> events that were never initialized in this object.

You are correct. The resultcopy function needs to check if the src
result eventproc was initialized. BUT, that should not be possible
unless someone is not checking return values. Meaning, the only way to
have an uninitialized eventproc in this instance is if something failed
during a resultcreate. Actually, if you call PQmakeEmptyPGresult then
you can also run into this problem. That can be solved by adding an
argument to makeResult as I previously suggested.

> I also think that a little bit of thought should go into whether or
> not to call DESTROY on an event that *did* get a CREATE and failed
> it. You could argue that one either way: should a failing CREATE
> operation be expected to fully clean up after itself, or should it be
> expected to leave things in a state where DESTROY can clean up
> properly? I'm not entirely sure, but option A might force duplication
> of code between CREATE's failure path and DESTROY. Whichever
> semantics we choose needs to be documented.

If a resultcreate fails, then I think that resultdestroy should not be
delivered to that eventproc (same for resultcopy). That is how register
works and how I saw this patch working (even though it appears I have a
few logical errors). I don't think it makes sense to do it otherwise;
it would be like calling free after a malloc failure.

The needDestroy member should be called resultInitialized. Although the
conn doesn't reference the 'resultInitialized' member, I should
initialize it to FALSE. I did not do this in the last patch ...
register function.

--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/
Re: [HACKERS] NDirectFileRead and Write
ITAGAKI Takahiro [EMAIL PROTECTED] writes:
> - NDirectFile{Read|Write} are renamed to BufFile{Read|Write}Count.
> - They are still declared in execdebug.h and buffile.c includes it.

After some thought I concluded that the least ugly way to do this was
to declare and define the variables in the same places as the other
counters that are printed by ShowBufferUsage. It's not any worse from a
modularity standpoint than the other alternatives we considered, and it
seems to satisfy the principle of least surprise a little better.
Committed it that way.

> I did not touch messages in ShowBufferUsage() in the patch.

I didn't change the message either. Heikki's suggestion of "temp
blocks" doesn't seem very good to me because of the likelihood of
confusion with temp tables. I don't have a better alternative offhand.

			regards, tom lane
Re: [HACKERS] [PATCHES] libpq events patch (with sgml docs)
Andrew Chernow [EMAIL PROTECTED] writes:
> Tom Lane wrote:
> > Some thought would need to be given to how that interacts with
> > RESULTCOPY. Presumably on the destination side, RESULTCOPY is the
> > equivalent of RESULTCREATE, ie, don't DESTROY if you didn't get
> > COPY. But what of the source side? Arguably you shouldn't invoke
> > COPY on events that were never initialized in this object.
>
> You are correct. The resultcopy function needs to check if the src
> result eventproc was initialized. BUT, that should not be possible
> unless someone is not checking return values. Meaning, the only way to
> have an uninitialized eventproc in this instance is if something
> failed during a resultcreate. Actually, if you call
> PQmakeEmptyPGresult then you can also run into this problem. That can
> be solved by adding an argument to makeResult as I previously
> suggested.

I still think it would be a good idea to expend a couple more lines in
PQcopyResult to cover the case --- the likelihood of there being code
paths that make an empty result and never invoke RESULTCREATE just
seems too high. Defensive programming 'n all that.

In any case I'm not convinced that we should force a RESULTCREATE cycle
on external callers of PQmakeEmptyPGresult. If you guys didn't see a
need for it in your development then it probably doesn't exist.

> If a resultcreate fails, then I think that resultdestroy should not be
> delivered to that eventproc (same for resultcopy). That is how
> register works and how I saw this patch working (even though it
> appears I have a few logical errors). I don't think it makes sense to
> do it otherwise; it would be like calling free after a malloc failure.

I can live with that definition, but please document it.

> The needDestroy member should be called resultInitialized.

Yeah, probably, so that you can set it FALSE in conn events and
continue to use memcpy to move things over.

			regards, tom lane
Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch
Tom Lane [EMAIL PROTECTED] wrote:
> Robert Haas [EMAIL PROTECTED] writes:
> > > I am not sure this rule is good. Somebody who develops on
> > > PostgreSQL will have problems when porting to other databases in
> > > the future. Reserved words in the standard should be respected.
> >
> > If people want to write code that will work on multiple databases,
> > they should of course avoid using any SQL reserved words for
> > anything other than their reserved purposes.
>
> More than that, they have to actually test their SQL on each target
> DB. Every DB (including us) is going to have some reserved words that
> are not in the standard; so imagining that Postgres can all by itself
> protect you from this type of problem is doomed to failure anyway.

If someone wants portable code, they can use a development tool which
wraps ALL identifiers in quotes, every time. That's what we do.

The important thing is that, to the extent practicable, standard SQL
code is accepted and behaves in compliance with the standard. I don't
see that it does anything to compromise that if you support additional,
non-standard syntax for extensions.

-Kevin
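A small illustration of the quote-everything convention Kevin
describes: quoted identifiers stay valid on any conforming
implementation, even when a name collides with a reserved word
(the table and column names here are made up):

    CREATE TABLE "with" ("recursive" boolean, "select" integer);
    SELECT "recursive", "select" FROM "with";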
Re: [HACKERS] Patch for SQL-standard negative valued year-month literals
Tom Lane wrote:
> Stephen R. van den Berg [EMAIL PROTECTED] writes:
> > Intervals are a scalar, not an addition of assorted values;
> > alternating signs between fields would be wrong.
>
> Sorry, you're the one who's wrong on that. We've treated intervals as
> three independent fields for years now (and before that it was two
> independent fields). We're not going to throw away that capability.

+1  It's very useful.

Currently our terse input format that's similar to the SQL standard
rejects more mixed-sign intervals than I'd like. I'd be quite happy if
  '1 2:03:-04'
gave me '1 day 2 hours 3 minutes -4 seconds', but currently we reject
that mixed-sign literal.

I'd just like to find a way to have SQL-standard input produce
SQL-standard output in the cases where the input happened to match the
standard.

If we had a blank slate, my vote would be that
  '-1 2:03:04'  should mean what the SQL standard says it should,
  '-1 +2:03:04' should mean negative 1 days, plus 2 hours 3 minutes 4 seconds,
  '1 2:03:-04'  should mean 1 day 2 hours 3 minutes minus 4 seconds,
  '-1 2:03:+04' should mean negative 1 day 2 hours 3 minutes plus 4 seconds,
but I'm aware that there are backward compatibility issues.
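The 8.3 behavior described in this thread, restated as statements one
can try (taken from the claims above, not re-verified here):

    SELECT interval '-1 days +2 hours'; -- mixed signs: accepted (verbose form)
    SELECT interval '1 2:03:-04';       -- mixed-sign terse form: rejected today
    SELECT interval '-1 1:00:00';       -- today: -1 days +01:00:00, where the
                                        -- SQL spec would read -1 days -1 hours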
Re: [HACKERS] Autovacuum and Autoanalyze
Simon Riggs wrote:
> On Wed, 2008-09-17 at 10:09 +0300, Heikki Linnakangas wrote:
> > Isn't autoanalyze a waste of time during a bulk load? Seems better
> > to run ANALYZE manually at the end.
>
> It's not a waste of time, because it catches tables immediately after
> they have been loaded, not just at the end of the bulk load. Running
> ANALYZE is a waste of time if autoanalyze has already caught it, which
> is why that's never been added onto the end of a pg_dump script. But
> currently this is true only when we have both autoVACUUM and
> autoANALYZE enabled.

Hmm, one of the first complaints about defaulting autovacuum to on was
that it made restores so much longer *because* it was choosing to do
autoanalyzes on the tables as they were imported. It was then that the
auto-cancel mechanism was introduced.
http://pgsql.markmail.org/message/rqyjkafuw43426xy

Why doesn't this new request conflict with that one?

--
Alvaro Herrera                     http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: Column level privileges was:(Re: [HACKERS] Extending grant insert on tables to sequences)
Jaime,

* Stephen Frost ([EMAIL PROTECTED]) wrote:
> * Jaime Casanova ([EMAIL PROTECTED]) wrote:
> > On 7/25/08, Stephen Frost [EMAIL PROTECTED] wrote:
> > > Yes, I'm working on it
> >
> > hi, any work on it? may i help?
>
> If you look at the commitfest, I've posted my WIP so far there. Most
> of the grammar, parser, and catalog changes are there. There's a
> couple of bugs in that code that I'm working to run down but otherwise
> I think it's pretty good. I do need to add in the dependency tracking
> as well though, and that's what I'm planning to work on next.

I've now added dependency tracking and worked out a few kinks in the
code, both existing previously and from adding the dep tracking. I'd
really like to simplify things in aclchk.c, perhaps by factoring out
more common bits into functional pieces, but it's been kind of a bear
so far.

The dependency tracking is being done by continuing to treat the table
as a single entity and just figuring out the total set (including all
column-level permissions) of roles for the entire table, rather than
introducing the sub-object concept. This requires a bit of extra effort
when doing DDLs and GRANTs but simplifies the dependency tracking
itself, especially since we have to keep track of both table-level
permissions and column-level permissions separately. I'm open to other
suggestions/comments. If people feel the sub-object is a better
approach, it would get somewhat more awkward because we'd have to
handle the relation-level dependencies as well as the column-level
ones. Not impossible to do, of course, but a bit more complicated than
how it was done originally.

A piece which can be broken off pretty easily is adding support to
track the columns used through to the executor so we can check the
permissions in the right place. Jaime, have you had a chance to work on
this? It's next on my list and I'll start working on it tonight unless
you've had a chance to get to it. Please let me know.

	Thanks,

		Stephen
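For readers following the thread, the user-visible shape of the feature
is the SQL-standard column-privilege syntax (table and role names here
are made up for illustration):

    GRANT SELECT (id, name) ON customers TO reporting;
    GRANT UPDATE (status)   ON customers TO app_writer;
    REVOKE UPDATE (status)  ON customers FROM app_writer;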
Re: [HACKERS] Autovacuum and Autoanalyze
Alvaro Herrera wrote:
> Simon Riggs wrote:
> > On Wed, 2008-09-17 at 10:09 +0300, Heikki Linnakangas wrote:
> > > Isn't autoanalyze a waste of time during a bulk load? Seems better
> > > to run ANALYZE manually at the end.
> >
> > It's not a waste of time, because it catches tables immediately
> > after they have been loaded, not just at the end of the bulk load.
> > Running ANALYZE is a waste of time if autoanalyze has already caught
> > it, which is why that's never been added onto the end of a pg_dump
> > script. But currently this is true only when we have both autoVACUUM
> > and autoANALYZE enabled.
>
> Hmm, one of the first complaints about defaulting autovacuum to on was
> that it made restores so much longer *because* it was choosing to do
> autoanalyzes on the tables as they were imported. It was then that the
> auto-cancel mechanism was introduced.
> http://pgsql.markmail.org/message/rqyjkafuw43426xy
>
> Why doesn't this new request conflict with that one?

The problem back then was that a CREATE INDEX was waiting on the
autoanalyze to finish, and the autoanalyze took a long time to finish
because of vacuum_cost_delay. Now that we have the auto-cancel
mechanism, that's not a problem.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] [PATCHES] libpq events patch (with sgml docs)
Andrew Chernow wrote:
> New patch following our discussion with updated docs.
>
> > > few logical errors). I don't think it makes sense to do it
> > > otherwise; it would be like calling free after a malloc failure.
> >
> > I can live with that definition, but please document it.
>
> To build on this analogy, PGEVT_CONNRESET is like a realloc. Meaning,
> the initial malloc (PGEVT_REGISTER) worked but the realloc
> (PGEVT_CONNRESET) didn't ... you still have to free
> (PGEVT_CONNDESTROY) the initial. It's documented that way. Basically,
> if a register succeeds, a destroy will always be sent regardless of
> what happens with a reset.

I attached the wrong patch. I'm sorry.

--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/

Index: doc/src/sgml/libpq.sgml
===================================================================
RCS file: /projects/cvsroot/pgsql/doc/src/sgml/libpq.sgml,v
retrieving revision 1.261
diff -C6 -r1.261 libpq.sgml
*** doc/src/sgml/libpq.sgml    17 Sep 2008 04:31:08 -0000    1.261
--- doc/src/sgml/libpq.sgml    17 Sep 2008 14:19:29 -0000
***************
*** 4911,4923 ****
      When a <literal>PGEVT_REGISTER</literal> event is received, the
      <parameter>evtInfo</parameter> pointer should be cast to a
      <structname>PGEventRegister *</structname>. This structure contains a
      <structname>PGconn</structname> that should be in the
      <literal>CONNECTION_OK</literal> status; guaranteed if one calls
      <function>PQregisterEventProc</function> right after obtaining a good
!     <structname>PGconn</structname>.
     </para>
    </listitem>
   </varlistentry>

   <varlistentry>
    <term><literal>PGEVT_CONNRESET</literal></term>
--- 4911,4925 ----
      When a <literal>PGEVT_REGISTER</literal> event is received, the
      <parameter>evtInfo</parameter> pointer should be cast to a
      <structname>PGEventRegister *</structname>. This structure contains a
      <structname>PGconn</structname> that should be in the
      <literal>CONNECTION_OK</literal> status; guaranteed if one calls
      <function>PQregisterEventProc</function> right after obtaining a good
!     <structname>PGconn</structname>. When returning a failure code, all
!     cleanup must be performed as no <literal>PGEVT_CONNDESTROY</literal>
!     event will be sent.
     </para>
    </listitem>
   </varlistentry>

   <varlistentry>
    <term><literal>PGEVT_CONNRESET</literal></term>
***************
*** 4941,4953 ****
      When a <literal>PGEVT_CONNRESET</literal> event is received, the
      <parameter>evtInfo</parameter> pointer should be cast to a
      <structname>PGEventConnReset *</structname>. Although the contained
      <structname>PGconn</structname> was just reset, all event data
      remains unchanged. This event should be used to reset/reload/requery
!     any associated <literal>instanceData</literal>.
     </para>
    </listitem>
   </varlistentry>

   <varlistentry>
    <term><literal>PGEVT_CONNDESTROY</literal></term>
--- 4943,4956 ----
      When a <literal>PGEVT_CONNRESET</literal> event is received, the
      <parameter>evtInfo</parameter> pointer should be cast to a
      <structname>PGEventConnReset *</structname>. Although the contained
      <structname>PGconn</structname> was just reset, all event data
      remains unchanged. This event should be used to reset/reload/requery
!     any associated <literal>instanceData</literal>. A PGEVT_CONNDESTROY
!     event is always sent, regardless of the event procedure's return
!     value.
     </para>
    </listitem>
   </varlistentry>

   <varlistentry>
    <term><literal>PGEVT_CONNDESTROY</literal></term>
***************
*** 5000,5023 ****
      <structname>PGEventResultCreate *</structname>. The
      <parameter>conn</parameter> is the connection used to generate the
      result. This is the ideal place to initialize any
      <literal>instanceData</literal> that needs to be associated with the
      result. If the event procedure fails, the result will be cleared and
      the failure will be propagated. The event procedure must not try to
!     <function>PQclear</> the result object for itself.
     </para>
    </listitem>
   </varlistentry>

   <varlistentry>
    <term><literal>PGEVT_RESULTCOPY</literal></term>
    <listitem>
     <para>
      The result copy event is fired in response to
      <function>PQcopyResult</function>. This event will only be fired after
!     the copy is complete.
<synopsis>
typedef struct
{
    const PGresult *src;
    PGresult *dest;
--- 5003,5030 ----
      <structname>PGEventResultCreate *</structname>. The
      <parameter>conn</parameter> is the connection used to generate the
      result. This is the ideal place to initialize any
      <literal>instanceData</literal> that needs to be associated with the
      result. If the event procedure fails, the result will be cleared and
      the failure will be propagated. The event procedure must not try to
!
Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch
Is physical_tlist optimization sensible for RecursiveScan? We seem to
use it for every other Scan node type.

To enable the physical_tlist optimization, it seems
build_physical_tlist, use_physical_tlist and disuse_physical_tlist need
to be changed. build_physical_tlist and use_physical_tlist have already
been patched, and only disuse_physical_tlist still needs to be patched.
Is there any other place I've missed to enable the optimization?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
Re: [HACKERS] Autovacuum and Autoanalyze
Heikki Linnakangas [EMAIL PROTECTED] writes:
> Alvaro Herrera wrote:
> > Why doesn't this new request conflict with that one?
>
> The problem back then was that a CREATE INDEX was waiting on the
> autoanalyze to finish, and the autoanalyze took a long time to finish
> because of vacuum_cost_delay. Now that we have the auto-cancel
> mechanism, that's not a problem.

Define "not a problem". With auto-cancel, what will happen is that
whatever work the autoanalyze does will be wasted. It seems to me that
the current complaint is about background autovacuum/autoanalyze
wasting cycles during a bulk load, and there's certainly no purer waste
than an analyze cycle that gets aborted.

I tend to agree with Alvaro that there's not very much of a use case
for an analyze-only autovacuum mode.

Assuming that we get to the point of having a parallelizing pg_restore,
it would be interesting to give it an option to include ANALYZE for
each table it's loaded among the tasks that it schedules. (I'm
visualizing these commands as being made up by pg_restore itself, *not*
added to the pg_dump output.) Then you could have a reasonably optimal
total workflow, whereas allowing autovacuum to try to schedule the
ANALYZEs can't be.

			regards, tom lane
Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch
Tatsuo Ishii [EMAIL PROTECTED] writes:
> Is physical_tlist optimization sensible for RecursiveScan? We seem to
> use it for every other Scan node type.
> To enable the physical_tlist optimization, it seems
> build_physical_tlist, use_physical_tlist and disuse_physical_tlist
> need to be changed. build_physical_tlist and use_physical_tlist have
> already been patched, and only disuse_physical_tlist still needs to
> be patched. Is there any other place I've missed to enable the
> optimization?

IIRC, the comment for build_physical_tlist hadn't been patched, but
yeah that seems like about it.

			regards, tom lane
Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch
> > To enable the physical_tlist optimization, it seems
> > build_physical_tlist, use_physical_tlist and disuse_physical_tlist
> > need to be changed. build_physical_tlist and use_physical_tlist
> > have already been patched, and only disuse_physical_tlist still
> > needs to be patched. Is there any other place I've missed to enable
> > the optimization?
>
> IIRC, the comment for build_physical_tlist hadn't been patched, but
> yeah that seems like about it.

Yeah, I need to fix sloppy comments in the existing patches all over
the place :-)
--
Tatsuo Ishii
SRA OSS, Inc. Japan
Re: [HACKERS] Patch for SQL-standard negative valued year-month literals
Tom Lane wrote:
> Stephen R. van den Berg [EMAIL PROTECTED] writes:
> > Intervals are a scalar, not an addition of assorted values;
> > alternating signs between fields would be wrong.
>
> Sorry, you're the one who's wrong on that. We've treated intervals as
> three independent fields for years now (and before that it was two
> independent fields).

Ok, didn't know that. Let's put it differently then: I can understand
that the standard considers it a scalar and not an addition, but
apparently the addition characteristic is being used in Postgres code
already; that makes it undesirable to change it indeed.

--
Sincerely,
Stephen R. van den Berg.
"He did a quarter of the work in *half* the time!"
Re: Column level privileges was:(Re: [HACKERS] Extending grant insert on tables to sequences)
On 9/17/08, Stephen Frost [EMAIL PROTECTED] wrote:
> A piece which can be broken off pretty easily is adding support to
> track the columns used through to the executor so we can check the
> permissions in the right place. Jaime, have you had a chance to work
> on this? It's next on my list and I'll start working on it tonight
> unless you've had a chance to get to it. Please let me know.

Not really; I started to read the code... but was interrupted by a new
task... (if only we could send kill -9 signals to work tasks ;)

--
regards,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157
Re: [HACKERS] text search patch status update?
I remember about that, but right now I haven't time to do a final
review. Sorry. Will return soon.

Sushant Sinha wrote:
> Any status updates on the following patches?
>
> 1. Fragments in tsearch2 headlines:
> http://archives.postgresql.org/pgsql-hackers/2008-08/msg00043.php
>
> 2. Bug in hlCover:
> http://archives.postgresql.org/pgsql-hackers/2008-08/msg00089.php
>
> -Sushant.

--
Teodor Sigaev                   E-mail: [EMAIL PROTECTED]
                                WWW: http://www.sigaev.ru/
[HACKERS] Patch for SQL-Standard Interval output and decoupling DateStyle from IntervalStyle
Tom Lane wrote: In fact, given that we are now somewhat SQL-compliant on interval input, a GUC that selected PG traditional, SQL-standard, or ISO 8601 interval output format seems like it could be a good idea.

Short summary: The attached patch (1) adds a new GUC called IntervalStyle that decouples interval output from the DateStyle GUC, and (2) adds a new interval style that will match the SQL standard for interval literals when given interval data that meets the SQL standard (year-month or day-time only, and no mixed sign).

Background: Currently Postgres outputs intervals in one of two formats depending on DateStyle. When DateStyle is 'ISO', it outputs intervals like '1 year 2 mons 3 days -04:05:06.78' (though I know of no ISO interval standard that looks like that). When DateStyle is 'SQL', it outputs intervals like '@ 1 year 2 mons 3 days -4 hours -5 mins -6.78 secs' (though I know of no SQL interval standard that looks like that).

The feature: The SQL standard only specifies interval literal strings for the two kinds of intervals (year-month and day-time) that are defined by the SQL spec. It also doesn't account for Postgres intervals that can have mixed-sign components. I tried to make the output here a logical extension of the spec, concatenating the year-month and day-time interval strings and forcing signs in the output that could otherwise have been ambiguous. This little table shows examples of the output of this new style compared to the existing postgres output for (a) a year-month interval, (b) a day-time interval, and (c) a not-quite-standard Postgres interval with year-month and day-time components of varying signs.

  '1-2'                 | '@ 1 year 2 mons'
  '3 4:05:06.78'        | '@ 3 days 4 hours 5 mins 6.78 secs'
  '+1-2 -3 +4:05:06.78' | '@ 1 year 2 mons -3 days 4 hours 5 mins 6.78 secs'

The patch: Seems to work for me; and I believe I updated the docs. Many regression tests fail, though, because they assume the existing coupling of DateStyle and interval output styles. If people like where this is going I can update those regression tests and add new ones to test this style.

*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
***************
*** 4090,4095 **** SET XML OPTION { DOCUMENT | CONTENT };
--- 4090,4117 ----
        </listitem>
       </varlistentry>

+      <varlistentry id="guc-intervalstyle" xreflabel="IntervalStyle">
+       <term><varname>IntervalStyle</varname> (<type>string</type>)</term>
+       <indexterm>
+        <primary><varname>IntervalStyle</> configuration parameter</primary>
+       </indexterm>
+       <listitem>
+        <para>
+         Sets the display format for interval values.
+         The value <literal>sql_standard</> will output SQL standard
+         strings when given intervals that conform to the SQL
+         standard (either year-month only or day-time only, and no
+         mixing of positive and negative components).
+         The value <literal>postgres</> will output intervals in
+         a format that matches what old releases output when
+         <varname>DateStyle</> was set to <literal>'ISO'</>.
+         The value <literal>postgres_verbose</> will output intervals in
+         a format that matches what old releases output when
+         <varname>DateStyle</> was set to <literal>'SQL'</>.
+        </para>
+       </listitem>
+      </varlistentry>
+
       <varlistentry id="guc-timezone" xreflabel="timezone">
        <term><varname>timezone</varname> (<type>string</type>)</term>
        <indexterm>

*** a/doc/src/sgml/datatype.sgml
--- b/doc/src/sgml/datatype.sgml
***************
*** 2213,2218 **** January 8 04:05:06 1999 PST
--- 2213,2305 ----
     </para>
    </sect2>

+   <sect2 id="interval-output">
+    <title>Interval Output</title>
+
+    <indexterm>
+     <primary>interval</primary>
+     <secondary>output format</secondary>
+     <seealso>formatting</seealso>
+    </indexterm>
+
+    <para>
+     The output format of the interval types can be set to one of the three
+     styles <literal>sql_standard</>,
+     <literal>postgres</>, or <literal>postgres_verbose</>. The default
+     is the <literal>postgres</> format.
+     <xref linkend="interval-style-output-table"> shows examples of each
+     output style.
+    </para>
+
+    <para>
+     The <literal>sql_standard</> style will output SQL standard
+     interval literal strings where the value of the interval
+     consists of only a year-month component or a day-time
+     component (as required by the SQL standard). For an interval
+     containing both a year-month and a day-time component, the
+     output will be a SQL standard unquoted year-month literal
+     string joined to a SQL standard unquoted day-time literal
+     string with a space in between.
+    </para>
+
+    <para>
+     The <literal>postgres</> style will output intervals that match
+     the style PostgreSQL 8.3 output when the <xref
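To make the proposed behavior concrete, here is a hypothetical psql session against a server with this patch applied; the output strings follow the examples in the table above and are not verified against the actual patch:

    SET IntervalStyle = 'sql_standard';
    SELECT interval '1 year 2 months';    -- 1-2
    SELECT interval '3 days 4:05:06.78';  -- 3 4:05:06.78

    SET IntervalStyle = 'postgres';
    SELECT interval '1 year 2 months';    -- 1 year 2 mons   (old DateStyle = 'ISO' output)

    SET IntervalStyle = 'postgres_verbose';
    SELECT interval '1 year 2 months';    -- @ 1 year 2 mons (old DateStyle = 'SQL' output)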
Re: [HACKERS] Autovacuum and Autoanalyze
On Wed, 2008-09-17 at 10:52 -0400, Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: Alvaro Herrera wrote: Why doesn't this new request conflict with that one? The problem back then was that a CREATE INDEX was waiting on the autoanalyze to finish, and the autoanalyze took a long time to finish because of vacuum_cost_delay. Now that we have the auto-cancel mechanism, that's not a problem. Define "not a problem". With auto-cancel, what will happen is that whatever work the autoanalyze does will be wasted. It seems to me that the current complaint is about background autovacuum/autoanalyze wasting cycles during a bulk load, and there's certainly no purer waste than an analyze cycle that gets aborted. OK, but that's an argument against auto-anything, not just against splitting out autoanalyze and autovacuum. I tend to agree with Alvaro that there's not very much of a use case for an analyze-only autovacuum mode. Did he say that? I thought he said we could do that; what did that mean, Alvaro? I have a customer saying this would be a good thing, and I agree. The roles of autovacuum and autoanalyze are not exactly matched, so why do we force them to be run together or not at all? Why not allow the user to specify whether they want both or not? It's an option; we're not forcing anyone to do it that way if they don't want to. Assuming that we get to the point of having a parallelizing pg_restore, it would be interesting to give it an option to include ANALYZE for each table it's loaded among the tasks that it schedules. (I'm visualizing these commands as being made up by pg_restore itself, *not* added to the pg_dump output.) Then you could have a reasonably optimal total workflow, whereas allowing autovacuum to try to schedule the ANALYZEs can't be. That doesn't solve all problems, just the ones with pg_restore. That's nice and I won't turn it away, but what will we do about plain pg_dump, and about other table creations and loads? -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Autovacuum and Autoanalyze
I tend to agree with Alvaro that there's not very much of a use case for an analyze-only autovacuum mode. Assuming that we get to the point of having a parallelizing pg_restore, it would be interesting to give it an option to include ANALYZE for each table it's loaded among the tasks that it schedules. (I'm visualizing these commands as being made up by pg_restore itself, *not* added to the pg_dump output.) Then you could have a reasonably optimal total workflow, whereas allowing autovacuum to try to schedule the ANALYZEs can't be. In Simon's original email, he suggested forcing an automatic ANALYZE on the server side after CREATE TABLE AS. I objected on the grounds that this won't fix anything for people who are doing bulk data loads using any other mechanism. Here, you're proposing the exact same thing, except instead of restricting it to people who use CREATE TABLE AS, you're restricting it to people who use a hypothetical parallelized implementation of pg_restore. While either of these is better than doing nothing, ISTM it would be far better to give the database some smarts about what constitutes a bulk data load (a whole bunch of insert operations on a newly created table) and what to do about it (synchronous analyze just before the first operation on the table that isn't an insert - and perhaps not before). ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
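For concreteness, a minimal sketch of the bad-statistics window both proposals are trying to close, using only standard commands (the estimates shown by EXPLAIN will of course vary):

    -- bulk load: the table has data but no statistics yet
    CREATE TABLE bulk AS SELECT g AS id FROM generate_series(1, 1000000) g;
    EXPLAIN SELECT * FROM bulk WHERE id < 10;   -- estimate comes from defaults, not data
    ANALYZE bulk;                               -- the step the proposals want to automate
    EXPLAIN SELECT * FROM bulk WHERE id < 10;   -- estimate now reflects the actual distribution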
[HACKERS] proposal - function's default parameters
Hello, before any implementation of named parameters we have to implement parameters with default values; I don't see any sense in named parameters without it. It's orthogonal to variadic functions (variadic functions are called when we supply more parameters than the function has; default parameters are used when we supply fewer parameters than the function has). Firebird has a similar implementation. Default values are filled in from the right. Given

    create function foo(undefvar type, var1 type1 = expr1, var2 type2 = expr2, ... var_n type_n = expr_n)

the call

    select foo(cexpr1, cexpr2)

is transformed to

    select foo(cexpr1, cexpr2, expr2, ... expr_n)

Problems with identifying the right function are the same as with variadic functions. Implementation: In my prototype I used a special datatype, position_expression, that carries a position and a parsed, transformed expression. pg_proc has a new column, defaults, of type position_expression[]. The position will be used for named parameters in the future, and using this type protects us from compatibility problems.

postgres=# create or replace function x1(int = 1, int = 2, int = 3) returns int as $$ select $1+$2+$3; $$ language sql;
CREATE FUNCTION
postgres=# select x1();
 x1
----
  6
(1 row)

postgres=# select x1(10);
 x1
----
 15
(1 row)

postgres=# select x1(10,20);
 x1
----
 33
(1 row)

postgres=# select x1(10,20,30);
 x1
----
 60
(1 row)

This feature is implemented at the SQL level, so all PL languages will be supported. Any comments and ideas are welcome. Regards, Pavel Stehule -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
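To make the function-identification problem concrete, here is a hypothetical ambiguous case (my example, not taken from the patch):

    create function foo(int) returns int as $$ select $1; $$ language sql;
    create function foo(int, int = 2) returns int as $$ select $1 + $2; $$ language sql;
    select foo(1);  -- matches foo(int) directly, but also foo(int, int) with the default filled in

The resolution rules have to decide which candidate wins, exactly as they must when a variadic candidate such as foo(variadic int[]) also matches the call.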
[HACKERS] optimizing CleanupTempFiles
Hi, We've been profiling a large system (8 CPUs, 64 GB of memory, some dozens of disks) which seems rather more swamped than it should be. Part of the problem seems to come from CleanupTempFiles, the second entry in oprofile output. This is the top-3 symbol report for a 2 minute oprofile run:

$ opreport '*postgres' -l
CPU: AMD64 processors, speed 2600 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask of 0x00 (No unit mask) count 7000
samples  %        symbol name
1963666  5.4862   AllocSetAlloc
1629477  4.5525   CleanupTempFiles
1560543  4.3599   hash_search_with_hash_value

Annotating the function results in this:

                 :static void
                 :CleanupTempFiles(bool isProcExit)
   1334  0.0037  :{ /* CleanupTempFiles total: 1629477 4.5525 */
                 :    Index    i;
                 :
     90  2.5e-04 :    if (SizeVfdCache > 0)
                 :    {
                 :        Assert(FileIsNotOpen(0));   /* Make sure ring not corrupted */
 470706  1.3151  :        for (i = 1; i < SizeVfdCache; i++)
                 :        {
 281998  0.7879  :            unsigned short fdstate = VfdCache[i].fdstate;
                 :
 872073  2.4365  :            if ((fdstate & FD_TEMPORARY) && VfdCache[i].fileName != NULL)
                 :            {
                 :                /*
                 :                 * If we're in the process of exiting a backend process, close
                 :                 * all temporary files. Otherwise, only close temporary files
                 :                 * local to the current transaction.
                 :                 */
                 :                if (isProcExit || (fdstate & FD_XACT_TEMPORARY))
                 :                    FileClose(i);
                 :            }
                 :        }
                 :    }
                 :
   3198  0.0089  :    while (numAllocatedDescs > 0)
                 :        FreeDesc(&allocatedDescs[0]);
     78  2.2e-04 :}

So we're scanning the array lots of times (there are about 1300 transactions per second) and doing no useful work on each scan. There are about 8900 files in the database, and since this is using a connection pooler, it's quite likely that each session opens a large subset of them at least once at some point, and so the VfdCache gets very large. I can see two simple ways to solve this problem. One is to create a second array that keeps pointers to the temp files in VfdCache. Then, on CleanupTempFiles we scan that array instead of VfdCache directly. The second one is to use a separate VfdCache for temp files, but this seems much more involved, requiring touching almost all of fd.c. Of course, perhaps there's another solution which involves rethinking fd.c in a more thorough fashion, but I'm not sure how, right offhand. Thoughts? -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] optimizing CleanupTempFiles
On Wed, 2008-09-17 at 16:25 -0400, Alvaro Herrera wrote: We've been profiling a large system (8 CPUs, 64 GB of memory, some dozens of disks) which seems rather more swamped than it should be. Part of the problem seems to come from CleanupTempFiles, the second entry in oprofile output. I'm glad you've observed this also. I saw it about two years ago but wasn't able to convince anyone else it existed at the time. I can see two simple ways to solve this problem. One is to create a second array that keeps pointers to the temp files in VfdCache. Then, on CleanupTempFiles we scan that array instead of VfdCache directly. The second one is to use a separate VfdCache for temp files, but this seems much more involved, requiring touching almost all of fd.c. Of course, perhaps there's another solution which involves rethinking fd.c in a more thorough fashion, but I'm not sure how, right offhand. A simple solution is to have a state variable so you can see whether a backend has created any temp files in this transaction. Most don't, so I think the above two solutions are overkill. If we created any, scan for them; if not, don't. Just a simple boolean state, just as we have for AtEOXact_RelationCache(). -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] optimizing CleanupTempFiles
Simon Riggs wrote: On Wed, 2008-09-17 at 16:25 -0400, Alvaro Herrera wrote: We've been profiling a large system (8 CPUs, 64 GB of memory, some dozens of disks) which seems rather more swamped than it should be. Part of the problem seems to come from CleanupTempFiles, the second entry in oprofile output. I'm glad you've observed this also. I saw it about two years ago but wasn't able to convince anyone else it existed at the time. I couldn't find it in the archives. A simple solution is to have a state variable so you can see whether a backend has created any temp files in this transaction. Most don't, so I think the above two solutions are overkill. If we created any, scan for them; if not, don't. Just a simple boolean state, just as we have for AtEOXact_RelationCache(). Ah -- like this? -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.

Index: src/backend/storage/file/fd.c
===================================================================
RCS file: /home/alvherre/Code/cvs/pgsql/src/backend/storage/file/fd.c,v
retrieving revision 1.143
diff -c -p -r1.143 fd.c
*** src/backend/storage/file/fd.c	1 Jan 2008 19:45:51 -0000	1.143
--- src/backend/storage/file/fd.c	17 Sep 2008 21:33:28 -0000
*** static int	max_safe_fds = 32;	/* default
*** 121,126 ****
--- 121,132 ----
  #define FD_TEMPORARY		(1 << 0)	/* T = delete when closed */
  #define FD_XACT_TEMPORARY	(1 << 1)	/* T = delete at eoXact */

+ /*
+  * Flag to tell whether it's worth scanning VfdCache looking for temp files to
+  * close
+  */
+ static bool any_temporary_files = false;
+
  typedef struct vfd
  {
  	signed short fd;			/* current FD, or VFD_CLOSED if none */

*** OpenTemporaryFile(bool interXact)
*** 889,894 ****
--- 895,903 ----
  	{
  		VfdCache[file].fdstate |= FD_XACT_TEMPORARY;
  		VfdCache[file].create_subid = GetCurrentSubTransactionId();
+
+ 		/* ensure cleanup happens at eoxact */
+ 		any_temporary_files = true;
  	}

  	return file;

*** AtEOSubXact_Files(bool isCommit, SubTran
*** 1603,1609 ****
  {
  	Index		i;

! 	if (SizeVfdCache > 0)
  	{
  		Assert(FileIsNotOpen(0));	/* Make sure ring not corrupted */
  		for (i = 1; i < SizeVfdCache; i++)
--- 1612,1618 ----
  {
  	Index		i;

! 	if (SizeVfdCache > 0 && any_temporary_files)
  	{
  		Assert(FileIsNotOpen(0));	/* Make sure ring not corrupted */
  		for (i = 1; i < SizeVfdCache; i++)

*** CleanupTempFiles(bool isProcExit)
*** 1679,1685 ****
  {
  	Index		i;

! 	if (SizeVfdCache > 0)
  	{
  		Assert(FileIsNotOpen(0));	/* Make sure ring not corrupted */
  		for (i = 1; i < SizeVfdCache; i++)
--- 1688,1694 ----
  {
  	Index		i;

! 	if (SizeVfdCache > 0 && (isProcExit || any_temporary_files))
  	{
  		Assert(FileIsNotOpen(0));	/* Make sure ring not corrupted */
  		for (i = 1; i < SizeVfdCache; i++)

*** CleanupTempFiles(bool isProcExit)
*** 1697,1702 ****
--- 1706,1713 ----
  					FileClose(i);
  			}
  		}
+
+ 		any_temporary_files = false;
  	}

  	while (numAllocatedDescs > 0)

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] optimizing CleanupTempFiles
Alvaro Herrera wrote: Simon Riggs wrote: A simple solution is to have a state variable so you can see whether a backend has created any temp files in this transaction. Most don't, so I think the above two solutions are overkill. If we created any, scan for them; if not, don't. Just a simple boolean state, just as we have for AtEOXact_RelationCache(). Ah -- like this? BTW I wonder if there's any point in checking SizeVfdCache > 0 in the places where checking the new flag suffices. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] optimizing CleanupTempFiles
Alvaro Herrera [EMAIL PROTECTED] writes: Ah -- like this? +1, but there are two kinds of temp files in that module, and only one of them is relevant here. Call it something like have_xact_temporary_files to make things clearer. I concur that the explicit test on SizeVfdCache > 0 is a waste of effort, too. It'll nearly always be true anyway. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] optimizing CleanupTempFiles
On Wed, 2008-09-17 at 17:34 -0400, Alvaro Herrera wrote: Simon Riggs wrote: On Wed, 2008-09-17 at 16:25 -0400, Alvaro Herrera wrote: We've been profiling a large system (8 CPUs, 64 GB of memory, some dozens of disks) which seems rather more swamped than it should. Part of the problem seems to come from CleanupTempFiles, the second entry in oprofile output. I'm glad you've observed this also. I saw it about two years ago but wasn't able to convince anyone else it existed at the time. I couldn't find it in the archives. Perhaps it was a private conversation then, but the main point is I've seen it too and believe it is a real world problem. Simple solution is to have a state variable so you can see whether a backend has created an temp files in this transaction. Most don't, so I think the above two solutions are overkill. If we created any, scan for them, if not, don't. Just a simple boolean state, just as we have for AtEOXact_RelationCache(). Ah -- like this? Yeh, nice and simple. Might be better to call it need_eoxact_work to mirror relcache.c? any_temporary_files is fairly vague and could be misinterpreted. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] optimizing CleanupTempFiles
On Wed, 2008-09-17 at 17:34 -0400, Alvaro Herrera wrote: Ah -- like this? The if test should include || isProcExit, so you don't skip non-transactional temp files at proc exit when there weren't any created in the last transaction. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] optimizing CleanupTempFiles
Tom Lane wrote: Alvaro Herrera [EMAIL PROTECTED] writes: Ah -- like this? +1, but there are two kinds of temp files in that module, and only one of them is relevant here. Call it something like have_xact_temporary_files to make things clearer. OK, so that's what I called it. Simon wrote: The if test should include || isProcExit, so you don't skip non-transactional temp files at proc exit when there weren't any created in the last transaction. Yep, I noticed that too. Thanks. Should I backpatch this to 8.3? -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Index: src/backend/storage/file/fd.c
===================================================================
RCS file: /home/alvherre/Code/cvs/pgsql/src/backend/storage/file/fd.c,v
retrieving revision 1.143
diff -c -p -r1.143 fd.c
*** src/backend/storage/file/fd.c	1 Jan 2008 19:45:51 -0000	1.143
--- src/backend/storage/file/fd.c	17 Sep 2008 22:58:32 -0000
*** static int	max_safe_fds = 32;	/* default
*** 121,126 ****
--- 121,132 ----
  #define FD_TEMPORARY		(1 << 0)	/* T = delete when closed */
  #define FD_XACT_TEMPORARY	(1 << 1)	/* T = delete at eoXact */

+ /*
+  * Flag to tell whether it's worth scanning VfdCache looking for temp files to
+  * close
+  */
+ static bool have_xact_temporary_files = false;
+
  typedef struct vfd
  {
  	signed short fd;			/* current FD, or VFD_CLOSED if none */

*** OpenTemporaryFile(bool interXact)
*** 889,894 ****
--- 895,903 ----
  	{
  		VfdCache[file].fdstate |= FD_XACT_TEMPORARY;
  		VfdCache[file].create_subid = GetCurrentSubTransactionId();
+
+ 		/* ensure cleanup happens at eoxact */
+ 		have_xact_temporary_files = true;
  	}

  	return file;

*** AtEOSubXact_Files(bool isCommit, SubTran
*** 1603,1609 ****
  {
  	Index		i;

! 	if (SizeVfdCache > 0)
  	{
  		Assert(FileIsNotOpen(0));	/* Make sure ring not corrupted */
  		for (i = 1; i < SizeVfdCache; i++)
--- 1612,1618 ----
  {
  	Index		i;

! 	if (have_xact_temporary_files)
  	{
  		Assert(FileIsNotOpen(0));	/* Make sure ring not corrupted */
  		for (i = 1; i < SizeVfdCache; i++)

*** CleanupTempFiles(bool isProcExit)
*** 1679,1685 ****
  {
  	Index		i;

! 	if (SizeVfdCache > 0)
  	{
  		Assert(FileIsNotOpen(0));	/* Make sure ring not corrupted */
  		for (i = 1; i < SizeVfdCache; i++)
--- 1688,1698 ----
  {
  	Index		i;

! 	/*
! 	 * Careful here: at proc_exit we need extra cleanup, not just
! 	 * xact_temporary files.
! 	 */
! 	if (isProcExit || have_xact_temporary_files)
  	{
  		Assert(FileIsNotOpen(0));	/* Make sure ring not corrupted */
  		for (i = 1; i < SizeVfdCache; i++)

*** CleanupTempFiles(bool isProcExit)
*** 1697,1702 ****
--- 1710,1717 ----
  					FileClose(i);
  			}
  		}
+
+ 		have_xact_temporary_files = false;
  	}

  	while (numAllocatedDescs > 0)

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] optimizing CleanupTempFiles
BTW in testing this patch I was surprised by the fact that temp table files are removed at checkpoint time, rather than when the transaction ends (at first I thought I had broken the removal of temp files). Is this a recent feature? (I verified that this continues to work fine for WITH HOLD cursors too.) -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] New FSM patch
On Sep 17, 2008, at 9:30 AM, Heikki Linnakangas wrote: I think we'd still need to WAL-log operations that decrease the amount of free space on a page. Otherwise, after we have partial vacuum, we might never revisit a page and update the FSM, even though there's usable space on the page, leaving the space forgotten forever. ISTM that it would be better to deal with such corner cases via periodic non-partial vacuums, done by something like autovac, and probably done with a higher-than-normal vacuum_cost_delay setting so as to minimize the performance impact. That's likely a lot less wasteful than further compounding lock contention for the WAL. Even if it does result in more overall IO, you have to trade a *lot* of IO to balance out the impact of lock contention. -- Decibel!, aka Jim C. Nasby, Database Architect [EMAIL PROTECTED] Give your computer some brain candy! www.distributed.net Team #1828
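A sketch of what such a throttled, full (non-partial) pass could look like with the existing cost-based vacuum settings; the table name and values here are illustrative only:

    -- throttle this session's vacuum I/O, then do a full pass
    SET vacuum_cost_delay = 50;    -- ms to sleep each time the cost limit is reached; 0 disables
    SET vacuum_cost_limit = 200;   -- cost accumulated before each sleep
    VACUUM ANALYZE some_table;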
[HACKERS] 0x1A in control file on Windows
I found a bug: pg_controldata exits with an error if the control file contains 0x1A (Ctrl+Z) on Windows. We probably need to add PG_BINARY when we open control files, because 0x1A is an end-of-file marker in Windows text mode. This fix needs to be applied to the back branches (8.2 and 8.3) as well as HEAD.

Index: src/bin/pg_controldata/pg_controldata.c
===================================================================
--- src/bin/pg_controldata/pg_controldata.c	(head)
+++ src/bin/pg_controldata/pg_controldata.c	(pg_control_0x1A)
@@ -107,7 +107,7 @@

 	snprintf(ControlFilePath, MAXPGPATH, "%s/global/pg_control", DataDir);

-	if ((fd = open(ControlFilePath, O_RDONLY, 0)) == -1)
+	if ((fd = open(ControlFilePath, O_RDONLY | PG_BINARY, 0)) == -1)
 	{
 		fprintf(stderr, _("%s: could not open file \"%s\" for reading: %s\n"),
 				progname, ControlFilePath, strerror(errno));

Index: src/bin/pg_resetxlog/pg_resetxlog.c
===================================================================
--- src/bin/pg_resetxlog/pg_resetxlog.c	(head)
+++ src/bin/pg_resetxlog/pg_resetxlog.c	(pg_control_0x1A)
@@ -373,7 +373,7 @@
 	char	   *buffer;
 	pg_crc32	crc;

-	if ((fd = open(XLOG_CONTROL_FILE, O_RDONLY, 0)) < 0)
+	if ((fd = open(XLOG_CONTROL_FILE, O_RDONLY | PG_BINARY, 0)) < 0)
 	{
 		/*
 		 * If pg_control is not there at all, or we can't read it, the odds

Regards, --- ITAGAKI Takahiro NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
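A minimal standalone C sketch of the underlying CRT behavior (the file name is hypothetical; on non-Windows platforms PG_BINARY is 0, so the flag is harmless there):

    #include <fcntl.h>
    #include <stdio.h>
    #ifdef WIN32
    #include <io.h>
    #else
    #include <unistd.h>
    #endif

    /* PostgreSQL's c.h defines PG_BINARY as O_BINARY on Windows and 0 elsewhere */
    #ifndef PG_BINARY
    #ifdef O_BINARY
    #define PG_BINARY O_BINARY
    #else
    #define PG_BINARY 0
    #endif
    #endif

    int
    main(void)
    {
        char    buf[8192];

        /*
         * Without PG_BINARY, a Windows text-mode read() stops at the first
         * 0x1A byte, so a pg_control file containing that byte appears
         * truncated and its CRC check fails.
         */
        int     fd = open("pg_control.sample", O_RDONLY | PG_BINARY, 0);

        if (fd >= 0)
            printf("read %d bytes\n", (int) read(fd, buf, sizeof(buf)));
        return 0;
    }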
[HACKERS] Where is Aggregation Attribute
I want to get the attribute from an Aggregate (Agg) node; where is it?

typedef struct Agg
{
	Plan		plan;
	AggStrategy aggstrategy;
	int			numCols;		/* number of grouping columns */
	AttrNumber *grpColIdx;		/* their indexes in the target list */
	Oid		   *grpOperators;	/* equality operators to compare with */
	long		numGroups;		/* estimated number of groups in input */
} Agg;

-- Best Regards, Zhe HE TEL: (001) 646-789-3008 Address: 965 Amsterdam Avenue, New York, NY 10025 Master Student, CS Dept. Columbia University www.columbia.edu/~zh2132 --- 07 Alumni Bachelor of Eng, BUPT www.bupt.edu.cn
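Assuming the question is about the grouping columns: grpColIdx holds 1-based resno values pointing into the node's own target list, so something like the following sketch resolves them (helper names as found in the backend; the surrounding plumbing is assumed):

    #include "postgres.h"
    #include "nodes/plannodes.h"
    #include "optimizer/tlist.h"

    /* Sketch: walk an Agg node's grouping columns and find their targetlist entries */
    static void
    show_grouping_columns(Agg *agg)
    {
        int     i;

        for (i = 0; i < agg->numCols; i++)
        {
            AttrNumber  resno = agg->grpColIdx[i];
            TargetEntry *tle = get_tle_by_resno(agg->plan.targetlist, resno);

            if (tle != NULL)
                elog(DEBUG1, "grouping column %d is targetlist resno %d",
                     i, (int) resno);
        }
    }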
Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch
Tom, thanks for the review. Here is an in-progress report. Patches against CVS HEAD attached. (Uncommented items are work in progress.) -- Tatsuo Ishii SRA OSS, Inc. Japan Tatsuo Ishii [EMAIL PROTECTED] writes: Included are the latest patches against CVS HEAD. I spent some time reading this patch. Here are a few comments in no particular order: RangeRecursive node lacks copyfuncs/equalfuncs support. Functions added. Query.recursive is missing from equalfuncs.c. But rather than fix that, get rid of it entirely. AFAICS the only use is in qual_is_pushdown_safe, and what is the value of that test? The callers know perfectly well whether they are operating on a recursive RTE or not. You might as well just delete all the useless qual-pushdown-attempt code from set_recursion_pathlist, and not need to touch qual_is_pushdown_safe at all. Query.recursive removed and qual_is_pushdown_safe is untouched. Is physical_tlist optimization sensible for RecursiveScan? We seem to use it for every other Scan node type. Fixed and physical_tlist optimization is enabled for RecursiveScan, I believe. I dislike putting state into ExecutorState; that makes it impossible to have multiple recursion nodes in one plan tree. It would probably be better for the Recursion and RecursiveScan nodes to talk to each other directly (compare HashJoin/Hash); although since they are not adjacent in the plan tree I admit I'm not sure how to do that. es_disallow_tuplestore doesn't seem to need to be in ExecutorState at all, it could as well be in RecursionState. It was a workaround to avoid infinite recursion in some cases. Discussion came to the conclusion that it's the user's responsibility to avoid such a case (otherwise the semantics of our recursive query become different from those defined in the standard), I believe. es_disallow_tuplestore removed. I don't really like the way that Append nodes are being abused here. It would be better to allow nodeRecursion.c to duplicate a little code from nodeAppend.c, and have the child plans be direct children of the Recursion node. BTW, is it actually possible to have more than two children? I didn't spend enough time analyzing the restrictions in parse_cte to be sure. If there are always just two then you could simplify the representation by treating it like a join node instead of an append. (The RTE_RECURSIVE representation sure makes it look like there can be only two...) Mark/restore support seems useless ... note the comment on ExecSupportsMarkRestore (which should be updated if this code isn't removed). RecursiveScan claims to support backwards fetch, but does not in fact contain code to do so. (Given that it will always be underneath Recursion, which can't do backwards fetch, I see little point in adding such code; fix execAmi.c instead.) ExecInitRecursion doesn't seem to be on the same page about whether it supports backward scan as execAmi.c, either. This comment in nodeRecursivescan.c seems bogus:

	/*
	 * Do not initialize scan tuple type, result tuple type and
	 * projection info in ExecInitRecursivescan. These types are
	 * initialized after initializing Recursion node.
	 */

because the code seems to be doing exactly what the comment says it doesn't. Numerous comments appear to have been copied-and-pasted and not modified from the original. Somebody will have to go over all that text. ruleutils.c fails completely for non-recursive WITH. It *must* regenerate such a query with a WITH clause, not as a flattened subquery, which is what you seem to be doing.
This isn't negotiable because the semantics are different. This will mean at least some change in the parsetree representation. Perhaps we could add a bool to subquery RTEs to mark them as coming from a nonrecursive WITH? The tests added for RTE_RECURSIVE seem a bit ugly too. If it thinks that can't happen it should Assert so, not just fall through silently. commentary for ParseState.p_ctenamespace is gratuitously unlike the comment style for the other fields, and p_recursive_namespace isn't documented at all. ParseState.p_in_with_clause is unused, should be removed. Done. The WithClause struct definition is poorly commented. It should be stated that it is used only pre-parse-analysis (assuming that continues to be true after you get done fixing ruleutils.c...), and it doesn't say what the elements of the subquery list are (specifically, what node type). A lot of the other added structs and fields could use better commenting too. For that matter subquery is a poor name for WithClause's list of CTEs, especially so since it's hard to search for. It should be a plural name and I'd be inclined to use something like ctes not subqueries. The term subquery is too overloaded already, so any place you can refer to a WITH-list member as a CTE you should do so. WithClause node may need a
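To illustrate the ruleutils.c point about why a non-recursive WITH can't be deparsed as a flattened subquery: a CTE is a single evaluation shared by all its references, which a flattened rewrite would not preserve. A sketch of the intended semantics (behavior as specified for WITH, not verified against this particular patch):

    WITH s AS (SELECT random() AS r)
    SELECT a.r = b.r FROM s AS a, s AS b;
    -- true: both references see the one and only evaluation of s

    SELECT a.r = b.r
    FROM (SELECT random() AS r) AS a,
         (SELECT random() AS r) AS b;
    -- almost always false: two independent evaluations of random()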
Re: [HACKERS] optimizing CleanupTempFiles
Alvaro Herrera [EMAIL PROTECTED] writes: BTW in testing this patch I was surprised by the fact that temp tables files are removed at checkpoint time, [ blink... ] Doesn't look like that should happen. What is your test case? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 0x1A in control file on Windows
ITAGAKI Takahiro [EMAIL PROTECTED] writes: I found a bug that pg_controldata ends with error if control files contain 0x1A (Ctrl+Z) on Windows. We probably need to add PG_BINARY when we open control files because 0x1A is an end-of-file marker on Windows. Well, why is that a bug? If the platform is so silly as to define text files that way, who are we to argue? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers