Re: [HACKERS] Proposal: First step towards Intelligent, integrated database
I know the world's population. No two people think alike, and still many people's goals can be the same. Nothing is virgin in this world; if someone thinks like that, then it is a mistake. My aim is to prove that PostgreSQL can be the great leader if we put natural intelligence into the database, which is missing everywhere. And what is that intelligence? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] build problem
On ons, 2010-12-01 at 21:27 -0800, Jeff Davis wrote: On Wed, 2010-12-01 at 22:16 -0500, Tom Lane wrote: Jeff Davis pg...@j-davis.com writes: When I build HEAD using just make -s install, everything works fine. But when I add -j12, it appears to cause problems. This problem appeared very recently. Output below. Platform? Oops, sorry: $ uname -a Linux jdavis-ux 2.6.32-24-generic #43-Ubuntu SMP Thu Sep 16 14:58:24 UTC 2010 x86_64 GNU/Linux I'll look into it.
Re: [HACKERS] Hot Standby: too many KnownAssignedXids
On 01.12.2010 20:51, Heikki Linnakangas wrote: Another approach would be to revisit the way the running-xacts snapshot is taken. Currently, we first take a snapshot, and then WAL-log it. There is a small window between the steps where backends can begin/end transactions, and recovery has to deal with that. When this was designed, there was long discussion on whether we should instead grab WALInsertLock and ProcArrayLock at the same time, to ensure that the running-xacts snapshot represents an up-to-date situation at the point in WAL where it's inserted. We didn't want to do that because both locks can be heavily contended. But maybe we should after all. It would make the recovery code simpler. If we want to get fancy, we wouldn't necessarily need to hold both locks for the whole duration. We could first grab ProcArrayLock and construct the snapshot. Then grab WALInsertLock and release ProcArrayLock, and finally write the WAL record and release WALInsertLock. But that would require small changes to XLogInsert. I took a look at that approach. We don't actually need to hold ProcArrayLock while the WAL-record is written, we need to hold XidGenLock. I believe that's less severe than holding the ProcArrayLock as there's already precedence for writing a WAL record while holding that: we do that when we advance to a new clog page and write a zero-clog-page record. So this is what we should do IMHO. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com *** a/src/backend/access/transam/xlog.c --- b/src/backend/access/transam/xlog.c *** *** 6095,6102 StartupXLOG(void) StartupSUBTRANS(oldestActiveXID); StartupMultiXact(); - ProcArrayInitRecoveryInfo(oldestActiveXID); - /* * If we're beginning at a shutdown checkpoint, we know that * nothing was running on the master at this point. 
So fake-up an --- 6095,6100 *** a/src/backend/storage/ipc/procarray.c --- b/src/backend/storage/ipc/procarray.c *** *** 435,453 ProcArrayClearTransaction(PGPROC *proc) } /* - * ProcArrayInitRecoveryInfo - * - * When trying to assemble our snapshot we only care about xids after this value. - * See comments for LogStandbySnapshot(). - */ - void - ProcArrayInitRecoveryInfo(TransactionId oldestActiveXid) - { - latestObservedXid = oldestActiveXid; - TransactionIdRetreat(latestObservedXid); - } - - /* * ProcArrayApplyRecoveryInfo -- apply recovery info about xids * * Takes us through 3 states: Initialized, Pending and Ready. --- 435,440 *** *** 519,533 ProcArrayApplyRecoveryInfo(RunningTransactions running) Assert(standbyState == STANDBY_INITIALIZED); /* ! * OK, we need to initialise from the RunningTransactionsData record ! */ ! ! /* ! * Remove all xids except xids later than the snapshot. We don't know ! * exactly which ones that is until precisely now, so that is why we allow ! * xids to be added only to remove most of them again here. */ - ExpireOldKnownAssignedTransactionIds(running->nextXid); StandbyReleaseOldLocks(running->nextXid); /* --- 506,514 Assert(standbyState == STANDBY_INITIALIZED); /* ! * Release any locks belonging to old transactions that are not ! * running according to the running-xacts record. */ StandbyReleaseOldLocks(running->nextXid); /* *** *** 536,544 ProcArrayApplyRecoveryInfo(RunningTransactions running) LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE); /* - * Combine the running xact data with already known xids, if any exist. * KnownAssignedXids is sorted so we cannot just add new xids, we have to ! * combine them first, sort them and then re-add to KnownAssignedXids. * * Some of the new xids are top-level xids and some are subtransactions. * We don't call SubtransSetParent because it doesn't matter yet. If we --- 517,524 LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE); /* * KnownAssignedXids is sorted so we cannot just add new xids, we have to !
* sort them first. * * Some of the new xids are top-level xids and some are subtransactions. * We don't call SubtransSetParent because it doesn't matter yet. If we *** *** 547,597 ProcArrayApplyRecoveryInfo(RunningTransactions running) * xids to subtrans. If RunningXacts is overflowed then we don't have * enough information to correctly update subtrans anyway. */ /* ! * Allocate a temporary array so we can combine xids. The total of both ! * arrays should never normally exceed TOTAL_MAX_CACHED_SUBXIDS. */ ! xids = palloc(sizeof(TransactionId) * TOTAL_MAX_CACHED_SUBXIDS); /* ! * Get the remaining KnownAssignedXids. In most cases there won't be any ! * at all since this exists only to catch a theoretical race condition. ! */ ! nxids = KnownAssignedXidsGet(xids, InvalidTransactionId); ! if (nxids > 0) !
Re: [HACKERS] Hot Standby: too many KnownAssignedXids
On Thu, 2010-12-02 at 10:39 +0200, Heikki Linnakangas wrote: On 01.12.2010 20:51, Heikki Linnakangas wrote: Another approach would be to revisit the way the running-xacts snapshot is taken. Currently, we first take a snapshot, and then WAL-log it. There is a small window between the steps where backends can begin/end transactions, and recovery has to deal with that. When this was designed, there was long discussion on whether we should instead grab WALInsertLock and ProcArrayLock at the same time, to ensure that the running-xacts snapshot represents an up-to-date situation at the point in WAL where it's inserted. We didn't want to do that because both locks can be heavily contended. But maybe we should after all. It would make the recovery code simpler. If we want to get fancy, we wouldn't necessarily need to hold both locks for the whole duration. We could first grab ProcArrayLock and construct the snapshot. Then grab WALInsertLock and release ProcArrayLock, and finally write the WAL record and release WALInsertLock. But that would require small changes to XLogInsert. I took a look at that approach. We don't actually need to hold ProcArrayLock while the WAL-record is written, we need to hold XidGenLock. I believe that's less severe than holding the ProcArrayLock as there's already precedence for writing a WAL record while holding that: we do that when we advance to a new clog page and write a zero-clog-page record. So this is what we should do IMHO. Oh, thanks for looking at this. I've been looking at this also and as we might expect had a slightly different design. First, your assessment of the locking above is better than mine. I agree with your analysis so we should do it that way. The locking issue was the reason I haven't patched this yet so I'm glad you've improved this. 
In terms of the rest of the patch, it seems we have different designs; I think I have a much simpler, less invasive solution: The cause of the issue is that replay starts at one LSN and there is a delay until the RunningXacts WAL record occurs. If there was no delay, there would be no issue at all. In CreateCheckpoint() we start by grabbing the WALInsertLock and later recording that pointer as part of the checkpoint record. My proposal is to replace the grab-the-lock code with the insert of the RunningXacts WAL record (when wal_level is set), so that recovery always starts with that record type. -- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Hi- How frequently Postgres Poll for trigger file
Hey! I am able to change the switchover time. Thanks a lot. 2010/12/2 aaliya zarrin aaliya.zar...@gmail.com Ok. Thanks, I will try!!! 2010/12/2 Heikki Linnakangas heikki.linnakan...@enterprisedb.com Oh, and there's another value for the case where we're polling the archive, not streaming from the master. Search for 5 in the XLogPageRead function to catch them all. On 02.12.2010 08:21, Heikki Linnakangas wrote: It's the 500L argument in the WaitLatch call. It's expressed in microseconds. On 02.12.2010 06:39, aaliya zarrin wrote: Where is this 5 seconds time defined in the code? On Wed, Dec 1, 2010 at 5:53 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 01.12.2010 13:27, aaliya zarrin wrote: I want to know how frequently postgres searches for the trigger file to switch over. In 9.0, every 100ms while streaming replication is active and connected; 5 seconds otherwise. In the current git master branch, it's always 5 s. Can this switchover time be reduced? Not without hacking the sources and recompiling. Although, on many platforms, Linux included I believe, sending a signal to the startup process should wake it up from the sleep and make it check the trigger file immediately. pg_ctl reload for example should do it. So if you send a signal to the startup process immediately after creating the trigger file, it should take notice sooner. Please let me know where postgres polls for the trigger file. I could find it in backend/access/transam/xlog.c? Am I right? Yes, search for callers of the CheckForStandbyTrigger() function. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Thanks & Regards, Aaliya Zarrin (+91)-9160665888 -- Thanks & Regards, Aaliya Zarrin (+91)-9160665888
Re: [HACKERS] FK's to refer to rows in inheritance child
On 2010-12-01 16:05, Tom Lane wrote: Perhaps I should have said "possibly workable proposal". What you wrote doesn't even begin to cover the interesting part of the problem, namely how to ensure uniqueness is preserved in the face of concurrent insertions.

The usual page locking done by nbtree doesn't work in this case, since a conflict can be two inserts into two different indexes. This means that, one way or another, some extra kind of lock is needed. The question is which existing lock mechanism is suitable for this purpose; the answer may depend on how conflicts are checked for. Checking for conflicts can be done at row insert time or at commit time. With 'inheritance chain' I mean any set of more than one relation that is the transitive closure of the inheritance parent and child relation on a given relation.

- Check at row insert time:
  transaction A
  -- before insert of a row into an inheritance chain, suppose with key value 10
  -- lock X exclusively (this is to be a fine-grained lock; X could be a name derived from the inserted key value 10)
  -- query the inheritance chain for existing values with SnapshotNow
  -- on an existing value, release the 'X locks' taken by the current backend and error out
  -- else normal processing continues (actual insert etc.)
  -- after commit, release the 'X locks' taken by the current backend

  If another transaction B wants to insert the same key value 10, it will block on the lock taken by A, which A releases after commit; B then continues, and when it queries the inheritance chain it sees the committed record of A and reports an error. The drawback here is lock proliferation: a lock for each row seems unwanted.

- Check at commit time:
  transaction A
  -- inserts row into inheritance chain
  -- record row and tuple for later check
  -- at commit time, if any check was recorded
  --- for every inheritance chain with inserts, ordered by lowest relation oid, lock Y exclusively (where Y is a name derived from the inheritance chain, e.g.
lowest relation oid)
  --- for every inserted value, query for duplicate key values in the inheritance chain with SnapshotNow
  --- on error, release Y and abort the transaction (serialization error)
  --- after commit / heap tuples marked committed, release all Y locks.

  Transaction B may insert duplicate values into the inheritance chain, but at commit time it will either get the lock or block on it. It can continue after A releases its locks, and then it will see the changes committed by A.

Though this solution takes fewer locks than the per-row case, a deadlock can occur if the inheritance chains are not locked in order. This is avoided by ordering the chains on the lowest oid of their relations. Perhaps the biggest drawback is that a serialization error must be thrown at commit, even when the user did not set the transaction isolation level to SERIALIZABLE, and that might violate POLA. (My current feelings about this are that a general-purpose solution would probably cost more than it's worth. What people really care about is FK to a partitioned table, which is a structure in which we don't have to solve the general problem: if we know that the partitioning prevents the same key from appearing in multiple partitions, then we only have to look into one partition. So this is just another item that's pending introduction of real partitioning infrastructure.) While this is true for the partitioning use case, it will never be true for 'true' inheritance use cases, and therefore a real partitioning infrastructure doesn't help the inheritance use cases. (Such as ours, which is an implementation of HL7's reference information model. http://www.hl7.org/v3ballot/html/infrastructure/rim/rim.html) A lot can be said about object model implementations in relational databases. To keep it short: Codd stressed the value of the relational model in his ACM Turing Award lecture: a practical foundation for productivity.
It is not productive that users must implement foreign key checking in user space, instead of using a database primitive. regards, Yeb Havinga
Re: [HACKERS] FK's to refer to rows in inheritance child
On 2010-12-02 11:27, Yeb Havinga wrote: With 'inheritance chain' I mean any set of more than one relation that is the transitive closure of the inheritance parent and child relation on a given relation. s/transitive closure/transitive reflexive closure
Re: [HACKERS] Hot Standby: too many KnownAssignedXids
On 02.12.2010 11:02, Simon Riggs wrote: The cause of the issue is that replay starts at one LSN and there is a delay until the RunningXacts WAL record occurs. If there was no delay, there would be no issue at all. In CreateCheckpoint() we start by grabbing the WALInsertLock and later recording that pointer as part of the checkpoint record. My proposal is to replace the grab-the-lock code with the insert of the RunningXacts WAL record (when wal_level is set), so that recovery always starts with that record type. Oh, interesting idea. But AFAICS closing the gap between acquiring the running-xacts snapshot and writing it to the log is sufficient; I don't see what moving the running-xacts record buys us. Does it allow some further simplifications somewhere? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] pg_execute_from_file review
Hi, Please find attached the v8 version of the patch, which fixes the following: Itagaki Takahiro itagaki.takah...@gmail.com writes: * pg_read_binary_file_internal() should return not only the contents as char * but also the length, because the file might contain 0x00. In addition, null-termination of the contents buffer is useless. * The 1st argument of pg_convert must be bytea rather than cstring in pg_convert_and_execute_sql_file(). I think you can fix both the bug and the above one if pg_read_binary_file_internal() returns bytea. I've changed pg_read_binary_file_internal() to return bytea*, which is much cleaner, thanks for the suggestion! * pg_read_file() has stronger restrictions than pg_read_binary_file() (absolute paths not allowed), and -1 length is not supported. We could fix pg_read_file() to behave the same as pg_read_binary_file(). It's now using the _internal function directly, so that there's only one code definition to care about here. * (It was my suggestion, but) Is it reasonable to use -1 length to read the whole file? It might fit with C, but NULL might be better for SQL.
Well, thinking about it, omitting the length parameter altogether seems like the more natural SQL-level API to me, so I've made it happen:

=# \df pg_read_*|replace|pg_exe*
                                      List of functions
   Schema   |         Name          | Result data type |       Argument data types        |  Type
------------+-----------------------+------------------+----------------------------------+--------
 pg_catalog | pg_execute_sql_file   | void             | text                             | normal
 pg_catalog | pg_execute_sql_file   | void             | text, name                       | normal
 pg_catalog | pg_execute_sql_file   | void             | text, name, VARIADIC text        | normal
 pg_catalog | pg_execute_sql_string | void             | text                             | normal
 pg_catalog | pg_execute_sql_string | void             | text, VARIADIC text              | normal
 pg_catalog | pg_read_binary_file   | bytea            | text, bigint                     | normal
 pg_catalog | pg_read_binary_file   | bytea            | text, bigint, bigint             | normal
 pg_catalog | pg_read_file          | text             | text, bigint                     | normal
 pg_catalog | pg_read_file          | text             | text, bigint, bigint             | normal
 pg_catalog | replace               | text             | text, text, text                 | normal
 pg_catalog | replace               | text             | text, text, text, VARIADIC text  | normal
(11 rows)

* The doc says pg_execute_sql_string() is restricted to superusers, but it is not actually restricted. I think we don't have to. Agreed, fixed the doc. * In docs, the example of replace_placeholders() is replace('abcdefabcdef', 'cd', 'XX', 'ef', 'YY'). ~~~ BTW, I think we can call it just replace because the parser can handle them correctly even if we have both replace(text, text, text) and replace(text, VARIADIC text[]). We will need only one doc entry for them. Note that if we call replace() with 3 args, the non-VARIADIC version is called, so there is no performance penalty. The same idea occurred to me last night when reading through the patch before sending it. It's now done in the way you can see above. The idea is not to change the existing behavior at all, so I've not changed the non-VARIADIC version of the function. * We might rename pg_convert_and_execute_sql_file() to pg_execute_sql_file_with_encoding() or so, to have the same prefix.
Well, I think I prefer reading the verbs in the order that things happen in the code: it's actually convert then execute. But again, maybe some native speaker will have a say here, or maybe your proposed name fits better in PostgreSQL. I'd leave it to the committer :) Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support *** a/doc/src/sgml/func.sgml --- b/doc/src/sgml/func.sgml *** *** 1840,1846 </indexterm> <literal><function>replace(<parameter>string</parameter> <type>text</type>, <parameter>from</parameter> <type>text</type>, ! <parameter>to</parameter> <type>text</type>)</function></literal> </entry> <entry><type>text</type></entry> <entry>Replace all occurrences in <parameter>string</parameter> of substring --- 1840,1849 </indexterm> <literal><function>replace(<parameter>string</parameter> <type>text</type>, <parameter>from</parameter> <type>text</type>, ! <parameter>to</parameter> <type>text</type> ! [, <parameter>from</parameter> <type>text</type>, ! <parameter>to</parameter> <type>text</type>, ! [, ...] ])</function></literal> </entry> <entry><type>text</type></entry> <entry>Replace all occurrences in <parameter>string</parameter> of substring *** *** 14449,14466 postgres=# SELECT * FROM
Re: [HACKERS] WIP patch for parallel pg_dump
On 02.12.2010 07:39, Joachim Wieland wrote: On Sun, Nov 14, 2010 at 6:52 PM, Joachim Wieland j...@mcknight.de wrote: You would add a regular parallel dump with $ pg_dump -j 4 -Fd -f out.dir dbname So this is an updated series of patches for my parallel pg_dump WIP patch. Most importantly it now runs on Windows once you get it to compile there (I have added the new files to the respective project of Mkvcbuild.pm, but I wondered why the other archive formats do not need to be defined in that file...). So far nobody has volunteered to review this patch. It would be great if people could at least check it out, run it, and let me know if it works and if they have any comments. That's a big patch. I don't see the point of the sort-by-relpages code. The order the objects are dumped in should be irrelevant, as long as you obey the restrictions dictated by dependencies. Or is it only needed for the multiple-target-dirs feature? Frankly I don't see the point of that, so it would be good to cull it out at least in this first stage. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Hot Standby: too many KnownAssignedXids
On Thu, 2010-12-02 at 12:41 +0200, Heikki Linnakangas wrote: On 02.12.2010 11:02, Simon Riggs wrote: The cause of the issue is that replay starts at one LSN and there is a delay until the RunningXacts WAL record occurs. If there was no delay, there would be no issue at all. In CreateCheckpoint() we start by grabbing the WALInsertLock and later recording that pointer as part of the checkpoint record. My proposal is to replace the grab-the-lock code with the insert of the RunningXacts WAL record (when wal_level is set), so that recovery always starts with that record type. Oh, interesting idea. But AFAICS closing the gap between acquiring the running-xacts snapshot and writing it to the log is sufficient, I don't see what moving the running-xacts record buys us. Does it allow some further simplifications somewhere? Your patch is quite long and you do a lot more than just alter the locking. I don't think we need those changes at all and especially would not wish to backpatch that. Earlier on this thread, we discussed: On Wed, 2010-11-24 at 15:19 +, Simon Riggs wrote: On Wed, 2010-11-24 at 12:48 +0200, Heikki Linnakangas wrote: When recovery starts, we fetch the oldestActiveXid from the checkpoint record. Let's say that it's 100. We then start replaying WAL records from the Redo pointer, and the first record (heap insert in your case) contains an Xid that's much larger than 100, say 1. We call RecordKnownAssignedXids() to make note that all xids between that range are in-progress, but there isn't enough room in the array for that. Agreed. The current code fails because of the gap between the redo pointer and the XLOG_RUNNING_XACTS WAL record. If there is no gap, there is no problem. So my preferred solution would be to: * Log XLOG_RUNNING_XACTS while holding XidGenLock, as you suggest * Move logging to occur at the Redo pointer That is a much smaller patch with a smaller footprint.
-- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Hot Standby: too many KnownAssignedXids
On 02.12.2010 13:25, Simon Riggs wrote: On Thu, 2010-12-02 at 12:41 +0200, Heikki Linnakangas wrote: On 02.12.2010 11:02, Simon Riggs wrote: The cause of the issue is that replay starts at one LSN and there is a delay until the RunningXacts WAL record occurs. If there was no delay, there would be no issue at all. In CreateCheckpoint() we start by grabbing the WALInsertLock and later recording that pointer as part of the checkpoint record. My proposal is to replace the grab-the-lock code with the insert of the RunningXacts WAL record (when wal_level is set), so that recovery always starts with that record type. Oh, interesting idea. But AFAICS closing the gap between acquiring the running-xacts snapshot and writing it to the log is sufficient, I don't see what moving the running-xacts record buys us. Does it allow some further simplifications somewhere? Your patch is quite long and you do a lot more than just alter the locking. I don't think we need those changes at all and especially would not wish to backpatch that. Most of the changes to procarray.c were about removing code that's no longer necessary when we close the gap between acquiring and writing the running-xacts WAL record. You can leave it as it is as a historical curiosity, but I'd prefer to simplify it, given that we now know that it doesn't actually work correctly if the gap is not closed. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] crash-safe visibility map, take three
Robert Haas robertmh...@gmail.com writes: Or maybe I do. One other thing I've been thinking about with regard to hint bit updates is that we might choose to mark buffers that are hint-bit-updated as untidy rather than dirty. The background Please review the archives, you'll find the idea discussed, and some patches to implement it, by Simon. I suppose you could begin here: http://archives.postgresql.org/pgsql-patches/2008-06/msg00113.php Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] WIP patch for parallel pg_dump
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: I don't see the point of the sort-by-relpages code. The order the objects are dumped in should be irrelevant, as long as you obey the restrictions dictated by dependencies. Or is it only needed for the multiple-target-dirs feature? Frankly I don't see the point of that, so it would be good to cull it out at least in this first stage. From the talk at CHAR(10), and provided memory serves, it's an optimisation so that you dump the largest file in one process and all the little files in the other processes. In lots of cases the total pg_dump duration is then reduced to about the time to dump the biggest files. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
[HACKERS] When will 8.1 be EOLed?
IIRC we announced that we will EOL 8.1 in 2010 -- given that we are now in December, will it be done soon? (As a packager, I'll be happy to drop it ASAP, so that I can use those VMs for new OSes.) Regards, -- Devrim GÜNDÜZ PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer PostgreSQL RPM Repository: http://yum.pgrpms.org Community: devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr http://www.gunduz.org Twitter: http://twitter.com/devrimgunduz
Re: [HACKERS] When will 8.1 be EOLed?
2010/12/2 Devrim GÜNDÜZ dev...@gunduz.org: IIRC we announced that we will EOL 8.1 in 2010 -- given that we are now in December, will it be done soon? (As a packager, I'll be happy to drop it ASAP, so that I can use those VMs for new OSes). We even said November 2010 - see http://wiki.postgresql.org/wiki/PostgreSQL_Release_Support_Policy. The usual policy is that we'll put out one more minor release and that's the final one. So the answer is whenever we put out the next batch of minor releases. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] WIP patch for parallel pg_dump
On Thu, Dec 2, 2010 at 6:19 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: I don't see the point of the sort-by-relpages code. The order the objects are dumped in should be irrelevant, as long as you obey the restrictions dictated by dependencies. Or is it only needed for the multiple-target-dirs feature? Frankly I don't see the point of that, so it would be good to cull it out at least in this first stage. A guy called Dimitri Fontaine actually proposed the several-directories feature here and other people liked the idea. http://archives.postgresql.org/pgsql-hackers/2008-02/msg01061.php :-) The code doesn't change much with or without it, and if people are no longer in favour of it, I have no problem with taking it out. As Dimitri has already pointed out, the relpages sorting thing is there to start with the largest table(s) first. Joachim
Re: [HACKERS] When will 8.1 be EOLed?
Excerpts from Devrim GÜNDÜZ's message of Thu Dec 02 10:10:28 -0300 2010: IIRC we announced that we will EOL 8.1 in 2010 -- given that we are now in December, will it be done soon? (As a packager, I'll be happy to drop it ASAP, so that I can use those VMs for new OSes). My guess is that the next update will be the last one for 8.1. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] When will 8.1 be EOLed?
On 12/02/2010 08:10 AM, Devrim GÜNDÜZ wrote: IIRC we announced that we will EOL 8.1 in 2010 -- given that we are now in December, will it be done soon? (As a packager, I'll be happy to drop it ASAP, so that I can use those VMs for new OSes). EOL does not mean we simply drop the branch on the floor. The policy at http://wiki.postgresql.org/wiki/PostgreSQL_Release_Support_Policy states: The first version update released after the EOL date will normally be the final one for that version. Note that this can extend support shortly past the listed date. That "shortly" could amount to several months, and has in the past. (I'm not sure I understand why you need a separate VM for each branch, as your post seems to imply.) cheers andrew
Re: [HACKERS] FK's to refer to rows in inheritance child
On 2010-12-01 16:58, Florian Pflug wrote: On Dec 1, 2010, at 15:27, Tom Lane wrote: Indeed. This isn't even worth the time to review, unless you have a proposal for fixing the unique-index-across-multiple-tables problem. I've wondered in the past if a unique index is really necessary on the columns referenced by a FK in all cases, though. Me too, but Tom wrote, in one of the archive threads linked to from the todo wiki, that it was in the SQL spec, so.. Another idea might be to allow this if tableoid is part of the FK. In that case, unique indices on the remaining columns in the individual child tables would be enough to guarantee global uniqueness. Yes, that would work, but if the database user must include the tableoid column in the PK and FK's (i.e. not transparent for the user), then it would still be a workaround. BTW, my serializable_lock_consistency patch would allow you to do this purely within pl/pgsql in a race-condition-free way. So if that patch should get applied you might want to consider this as a workaround. Whether it will get applied is yet to be seen, though - currently there doesn't seem to be a unanimous vote one way or the other. Thanks for the reference, I will try and see how to use it for this purpose. regards, Yeb Havinga
Re: [HACKERS] is cachedFetchXid ever invalidated?
Excerpts from Jeff Davis's message of mié dic 01 22:48:36 -0300 2010: I can't see any place that cachedFetchXid is ever invalidated. Shouldn't it be invalidated before transaction ID wraparound? It's difficult to construct a test case to show whether this is a problem or not, Couldn't you just create a C function that advanced the Xid counter by, say, 100k? -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for parallel pg_dump
Joachim Wieland j...@mcknight.de writes: A guy called Dimitri Fontaine actually proposed the several-directories feature here and other people liked the idea. Hehe :) Reading that now, it could be that I didn't know at the time that, given a powerful enough disk subsystem, there's no way to saturate it with one CPU. So the use case of parallel dump to a bunch of user-given locations would be to use different mount points (disk subsystems) at the same time. Not sure how relevant it is. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] improving foreign key locks
Excerpts from Tom Lane's message of mié dic 01 14:44:03 -0300 2010: Florian Pflug f...@phlo.org writes: On Dec1, 2010, at 17:17 , Tom Lane wrote: There's not enough space in the infomask to record which columns (or which unique index) are involved. And if you're talking about data that could remain on disk long after the unique index is gone, that's not going to be good enough. We'd distinguish two cases A) The set of locked columns is a subset of the set of columns mentioned in *any* unique index. (In other words, for every locked column there is a unique index which includes that column, though not necessarily one index which includes them all) B) The set of locked columns does not satisfy (A) How's that fix it? The on-disk flags are still falsifiable by subsequent index changes. Creating indices shouldn't pose a problem, since it would only enlarge the set of locked columns for rows with HEAP_XMAX_SHARED_LOCK_KEY set. Not with that definition. I could create a unique index that doesn't contain some column that every previous unique index contained. This is not a problem, because a later UPDATE that tried to modify a locked tuple on a column only indexed by the new index, would consider it locked. Which is OK. There would only be a problem if we were allowed to drop an index (which would reduce the set of locked columns), but we don't allow this because DROP INDEX requires an exclusive lock on the table and thus there cannot be anyone with key-locked tuples in it. AFAICT Florian's definition is good. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
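Florian's case (A) above — the locked columns are a subset of the union of all unique-index column sets, so creating a new index can only enlarge the union and never invalidates an existing lock flag — can be illustrated with a small bitmask sketch. This is purely an illustration; the type and function names are invented here and do not reflect PostgreSQL's actual infomask representation:

```c
#include <assert.h>
#include <stdint.h>

/* One bit per table column; colset_t is an invented type for this sketch. */
typedef uint32_t colset_t;

/* Union of the column sets covered by all unique indexes on the table. */
static colset_t
union_of_indexes(const colset_t *indexes, int n)
{
    colset_t u = 0;
    for (int i = 0; i < n; i++)
        u |= indexes[i];
    return u;
}

/*
 * Case (A): every locked column appears in at least one unique index,
 * i.e. the locked set has no bits outside the union of index columns.
 */
static int
satisfies_case_a(colset_t locked, const colset_t *indexes, int n)
{
    return (locked & ~union_of_indexes(indexes, n)) == 0;
}
```

With unique indexes on {col0} (mask 0x1) and {col1, col2} (mask 0x6), locking {col0, col1} (mask 0x3) satisfies case (A), while locking {col0, col3} (mask 0x9) does not. Adding a third index can only add bits to the union, which is why CREATE INDEX is safe under this rule, and why DROP INDEX (which could shrink the union) is the operation that would need the exclusive table lock it already takes.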
Re: [HACKERS] FK's to refer to rows in inheritance child
On 2010-12-02 01:18, Jim Nasby wrote: On Dec 1, 2010, at 8:07 AM, Yeb Havinga wrote: FK's cannot refer to rows in inheritance children. We have partially solved this issue at work. In our scenario, we're not using inheritance for partitioning, we're using it for, well, inheriting. As part of that, we have a field in the parent table that tells you what type of object each row is, and constraints on the child tables that enforce that. We've created triggers that perform the same operations that the built-in RI triggers do, namely grabbing a share lock on the target row. The difference is that our trigger looks at the type field to determine exactly what table it needs to try and grab shared locks on (we need to do this because the backend doesn't allow you to SELECT ... FROM parent FOR SHARE). That part is exactly what the current WIP patch takes care of: grabbing share locks on the right relation. Our solution is not complete though. Offhand, I know it doesn't support cascade, but I think there's more stuff it doesn't do. AFAIK all of those shortcomings could be handled with what's available at the user level though, so someone with enough motivation could produce an entire RI framework that worked with inheritance (though the framework would need a way to work around the uniqueness issue). But is 'it can be solved at the user level' enough reason not to implement it in the server code? Foreign key and unique constraint checking are features the server should provide. regards, Yeb Havinga
Re: [HACKERS] V3: Idle in transaction cancellation
Andres Freund and...@anarazel.de wrote: On Tuesday 19 October 2010 16:18:29 Kevin Grittner wrote: For SSI purposes, it would be highly desirable to be able to set the SQLSTATE and message generated when the canceled transaction terminates. Ok, I implemented that capability, but the patch feels somewhat wrong to me, so it's a separate patch on top of the others: The patch is in context diff format, applies with minor offsets, compiles and passes regression tests. I have to admit that after reading the patch, I think I previously misunderstood the scope of it. Am I correct in reading that the main thrust of this is to improve error handling on standbys? Is there any provision for one backend to cause a *different* backend which is idle in a transaction to terminate cleanly when it attempts to process its next statement? (That is what I was hoping to find, for use in the SSI patch.) Anyway, if the third patch file is only there because of my request, I think it might be best to focus on the first two as a solution for the standby issues this was originally meant to address, and then to look at an API for the usage I have in mind after that is settled. -Kevin
Re: [HACKERS] crash-safe visibility map, take three
On Thu, Dec 2, 2010 at 6:37 AM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Robert Haas robertmh...@gmail.com writes: Or maybe I do. One other thing I've been thinking about with regard to hint bit updates is that we might choose to mark buffers that are hint-bit-updated as untidy rather than dirty. The background Please review the archives, you'll find the idea discussed and some patches to implement it, by Simon. I suppose you could begin here: http://archives.postgresql.org/pgsql-patches/2008-06/msg00113.php Thanks for the pointer. I guess that demonstrates that good ideas will keep floating back up to the surface. It seems like the idea was met with generally positive feedback, except that most people seemed to want a slightly simpler system than what Simon was proposing. I suspect that this is mostly suffering from a lack of round tuits. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] V3: Idle in transaction cancellation
On Thursday 02 December 2010 00:48:53 Kevin Grittner wrote: Andres Freund and...@anarazel.de wrote: On Tuesday 19 October 2010 16:18:29 Kevin Grittner wrote: For SSI purposes, it would be highly desirable to be able to set the SQLSTATE and message generated when the canceled transaction terminates. Ok, I implemented that capability, but the patch feels somewhat wrong to me, so it's a separate patch on top of the others: The patch is in context diff format, applies with minor offsets, compiles and passes regression tests. I have to admit that after reading the patch, I think I previously misunderstood the scope of it. Am I correct in reading that the main thrust of this is to improve error handling on standbys? Is there any provision for one backend to cause a *different* backend which is idle in a transaction to terminate cleanly when it attempts to process its next statement? (That is what I was hoping to find, for use in the SSI patch.) Do you want to terminate it immediately or on the next statement? You might want to check out SendProcSignal() et al. Anyway, if the third patch file is only there because of my request, I think it might be best to focus on the first two as a solution for the standby issues this was originally meant to address, and then to look at an API for the usage I have in mind after that is settled. Besides that I don't like the implementation very much, I think it's generally a good idea...
Re: [HACKERS] twitter_fdw-0.1
Hitoshi, I've written twitter_fdw, which creates a twitter table that fetches Twitter search results via a SELECT statement. This is done against the WIP SQL/MED version of PostgreSQL (git sha-1 is c5e027c32b8faf0e2d25b8deecb4d3c05399411c). To play with it, run make, make install, run the install script as you usually do with contrib modules and connect to the server, then, This is way cool! Will have to try it out. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com
Re: [HACKERS] Proposal: First step towards Intelligent, integrateddatabase
ghatpa...@vsnl.net wrote: If possible can you provide glimpses of History. http://en.wikipedia.org/wiki/Navigational_database http://en.wikipedia.org/wiki/MARK_IV_%28software%29 I don't want to go back there, myself. YMMV. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for parallel pg_dump
On 12/02/2010 05:50 AM, Dimitri Fontaine wrote: So the use case of parallel dump to a bunch of user-given locations would be to use different mount points (disk subsystems) at the same time. Not sure how relevant it is. I think it will complicate this feature unnecessarily for 9.1. Personally, I need this patch so much I'm thinking of backporting it. However, having all the data go to one directory/mount wouldn't trouble me at all. Now, if only I could think of some way to write a parallel dump to a set of pipes, I'd be in heaven. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com
Re: [HACKERS] Proposal: First step towards Intelligent, integrateddatabase
On 12/02/2010 12:44 PM, Kevin Grittner wrote: ghatpa...@vsnl.net wrote: If possible can you provide glimpses of History. http://en.wikipedia.org/wiki/Navigational_database http://en.wikipedia.org/wiki/MARK_IV_%28software%29 I don't want to go back there, myself. YMMV. Me either. BTDTGTTS. :-) cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for parallel pg_dump
On 12/02/2010 12:56 PM, Josh Berkus wrote: On 12/02/2010 05:50 AM, Dimitri Fontaine wrote: So the use case of parallel dump to a bunch of user-given locations would be to use different mount points (disk subsystems) at the same time. Not sure how relevant it is. I think it will complicate this feature unnecessarily for 9.1. Personally, I need this patch so much I'm thinking of backporting it. However, having all the data go to one directory/mount wouldn't trouble me at all. Now, if only I could think of some way to write a parallel dump to a set of pipes, I'd be in heaven. The only way I can see that working sanely would be to have a program gathering stuff at the other end of the pipes, and ensuring it was all coherent. That would be a huge growth in scope for this, and I seriously doubt it's worth it. cheers andrew
Re: [HACKERS] WIP patch for parallel pg_dump
Now, if only I could think of some way to write a parallel dump to a set of pipes, I'd be in heaven. The only way I can see that working sanely would be to have a program gathering stuff at the other end of the pipes, and ensuring it was all coherent. That would be a huge growth in scope for this, and I seriously doubt it's worth it. Oh, no question. And there's workarounds ... sshfs, for example. I'm just thinking of the ad-hoc parallel backup I'm running today, which relies heavily on pipes. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] V3: Idle in transaction cancellation
Andres Freund and...@anarazel.de wrote: Do you want to terminate it immediately or on the next statement? I want to have one backend terminate the transaction on another backend as soon as practicable. If a query is active, it would be best if it was canceled. It appears that if it is idle in transaction there is a need to wait for the next request. It would be a big plus for the backend requesting the cancellation to be able to specify the SQLSTATE, message, etc., used by the other backend on failure. You might want to check out SendProcSignal() et al. Will take a look. Besides that I don't like the implementation very much, I think it's generally a good idea... OK. While browsing around, I'll try to think of an alternative approach, but this is new territory for me -- I've been learning about areas in the code as the need arises so far. -Kevin
Re: [HACKERS] crash-safe visibility map, take three
On Wed, 2010-12-01 at 23:22 -0500, Robert Haas wrote: Well, let's think about what we'd need to do to make CRCs work reliably. There are two problems. 1. [...] If we CRC the entire page, the torn pages are never acceptable, so every action that modifies the page must be WAL-logged. 2. Currently, we allow hint bits on a page to be updated while holding [...] The way I see it, here are the rules we are breaking, and why: * We don't get an exclusive lock when dirtying a page with hint bits - Why: we write while reading, and we want good concurrency. - Why': because after a bulk load, we don't have any hint bits, and the only way to get them set without VACUUM is to write while reading. I've never been entirely sure why VACUUM isn't good enough in this case, aside from the fact that a user might not run VACUUM (and autovacuum might not either, if it was only a bulk load and no updates/deletes). * We don't WAL log setting hint bits (which dirties a page) - Why: because after a bulk load, we don't want to write the data a 4th time Hypothetically, if we had a bulk loading strategy, these problems would go away, and we could follow the rules. Right? Is there a case other than bulk loading which demands that we break these rules? And, if we had a bulk loading path, we could probably get away with writing the data only twice (today, we write it 3 times including the hint bits) or maybe once if WAL archiving is off. So, is there a case other than bulk loading for which we need to break these rules? If not, perhaps we should consider bulk loading a different problem, and simplify the design of all of these other features (and allow new storage-touching features to come about, like CRCs, without exponentially increasing the complexity with each one). Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for parallel pg_dump
On Thu, Dec 2, 2010 at 12:56 PM, Josh Berkus j...@agliodbs.com wrote: Now, if only I could think of some way to write a parallel dump to a set of pipes, I'd be in heaven. What exactly are you trying to accomplish with the pipes? Joachim -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] crash-safe visibility map, take three
Jeff Davis pg...@j-davis.com wrote: And, if we had a bulk loading path, we could probably get away with writing the data only twice (today, we write it 3 times including the hint bits) or maybe once if WAL archiving is off. If you're counting WAL writes, you're low. If you don't go out of your way to avoid it, you are likely to write the data to the table once during the bulk load, a second time on first read to set the hint bits, and a third time to freeze data to prevent wrap-around. The initial write may or may not be WAL-logged. The freezing typically is WAL-logged. So, you can easily write the data to disk four or five times. With luck these are spread out uniformly or happen during off-peak periods. Unmanaged, a WAL-logged freeze of bulk-loaded data is somewhat more likely to occur, however, during hours of heavy OLTP load, because transaction numbers are consumed so quickly. Currently, a VACUUM FREEZE after a bulk load collapses at least two of those writes to one. With luck, some pages might still be dirty in cache, and you can save two of the writes. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spread checkpoint sync
On Wed, Dec 1, 2010 at 4:25 AM, Greg Smith g...@2ndquadrant.com wrote: I ask because I don't have a mental model of how the pause can help. Given that this dirty data has been hanging around for many minutes already, what is a 3 second pause going to heal? The difference is that once an fsync call is made, dirty data is much more likely to be forced out. It's the one thing that bypasses all other ways the kernel might try to avoid writing the data. I had always assumed the problem was that other I/O had been done to the files in the meantime. I.e. the fsync is not just syncing the checkpoint but any other blocks that had been flushed since the checkpoint started. The longer the checkpoint is spread over, the more other I/O is included as well. Using sync_file_range you can specify the set of blocks to sync and then block on them only after some time has passed. But there's no documentation on how this relates to the I/O scheduler so it's not clear it would have any effect on the problem. We might still have to delay the beginning of the sync to allow the dirty blocks to be synced naturally, and then when we issue it still end up catching a lot of other I/O as well. -- greg
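The two-phase pattern described above — initiate writeback for a known block range early, then block on its completion later — can be sketched with the Linux sync_file_range() call. This is a minimal illustration, not anything from the checkpoint patch; the wrapper names are invented here. Note the caveat that sync_file_range() does not flush file metadata or the drive's write cache, so it complements rather than replaces fsync() for durability:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Phase 1: ask the kernel to start writeback of the range, without waiting. */
static int
start_writeback(int fd, off_t offset, off_t nbytes)
{
    return sync_file_range(fd, offset, nbytes, SYNC_FILE_RANGE_WRITE);
}

/*
 * Phase 2 (called some time later): wait for any in-flight writeback of
 * the range, write anything still dirty, and wait for that too.
 */
static int
wait_writeback(int fd, off_t offset, off_t nbytes)
{
    return sync_file_range(fd, offset, nbytes,
                           SYNC_FILE_RANGE_WAIT_BEFORE |
                           SYNC_FILE_RANGE_WRITE |
                           SYNC_FILE_RANGE_WAIT_AFTER);
}
```

Because the range is explicit, blocks dirtied by unrelated I/O after the checkpoint began would not be swept into the wait, which is the potential advantage over a whole-file fsync() that the message above is weighing against the undocumented I/O-scheduler interaction.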
Re: [HACKERS] Spread checkpoint sync
Using sync_file_range you can specify the set of blocks to sync and then block on them only after some time has passed. But there's no documentation on how this relates to the I/O scheduler so it's not clear it would have any effect on the problem. We might still have to delay the beginning of the sync to allow the dirty blocks to be synced naturally, and then when we issue it still end up catching a lot of other I/O as well. This *really* sounds like we should be working with the FS geeks on making the OS do this work for us. Greg, you wanna go to LinuxCon next year? -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com
Re: [HACKERS] directory archive format for pg_dump
Ok, committed, with some small cleanup since the last patch I posted. Could you update the directory-format patch on top of the committed version, please? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Another proposal for table synonyms
On Wed, Dec 1, 2010 at 3:55 PM, Alvaro Herrera alvhe...@commandprompt.com wrote: Excerpts from Josh Berkus's message of mié dic 01 17:13:35 -0300 2010: Well, porting applications from other database systems that support synonyms (i.e. Oracle, DB2, SQL Server). SQL Server supports synonyms? If it's not Oracle-only, it's a more powerful argument to have the feature. I think it's worth mentioning that in SQL Server, synonyms are not schema-qualified; they're global objects. Seems like they have more than one kind. http://download.oracle.com/docs/cd/B19306_01/server.102/b14200/statements_7001.htm The section entitled notes on public synonyms is particularly interesting, as it seems to imply that under some but not all conditions synonyms get materialized inside schemas that use them. The list of objects for which they support synonyms is also interesting. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Another proposal for table synonyms
Excerpts from Robert Haas's message of jue dic 02 17:27:01 -0300 2010: On Wed, Dec 1, 2010 at 3:55 PM, Alvaro Herrera alvhe...@commandprompt.com wrote: Excerpts from Josh Berkus's message of mié dic 01 17:13:35 -0300 2010: Well, porting applications from other database systems that support synonyms (i.e. Oracle, DB2, SQL Server). SQL Server supports synonyms? If it's not Oracle-only, it's a more powerful argument to have the feature. I think it's worth mentioning that in SQL Server, synonyms are not schema-qualified; they're global objects. Seems like they have more than one kind. http://download.oracle.com/docs/cd/B19306_01/server.102/b14200/statements_7001.htm Yeah, the Oracle system is a lot more complex than SQL Server's, but I was only talking about the latter, for which see here: http://www.databasejournal.com/features/mssql/article.php/3635426/SYNONYM-in-SQL-Server-2005.htm The list of objects for which they support synonyms is also interesting. The bit that allows a synonym to reference another synonym seems like worth considering further (either reject them altogether, or have some way to deal with possible cycles). I think the original proposal in this thread didn't mention them at all. (I don't think we should consider synonyms for either functions or stored procedures; that would make the current mess of function resolution rules a lot messier.) -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] V3: Idle in transaction cancellation
Andres Freund and...@anarazel.de wrote: On Thursday 02 December 2010 00:48:53 Kevin Grittner wrote: Is there any provision for one backend to cause a *different* backend which is idle in a transaction to terminate cleanly when it attempts to process its next statement? You might want to check out SendProcSignal() et al. Yeah, that was the missing link for me. Thanks! Anyway, if the third patch file is only there because of my request, I think it might be best to focus on the first two as a solution for the standby issues this was originally meant to address, and then to look at an API for the usage I have in mind after that is settled. Besides that I don't like the implementation very much, I think it's generally a good idea... Is it sane to leave the implementation of this for the specific areas which need it (like SSI), or do you think a generalized API for it is needed? I'll look at it more closely tonight, but at first scan it appears that just reserving one flag for PROCSIG_SERIALIZATION_FAILURE (or PROCSIG_SSI_CANCELLATION?) would allow me to code up the desired behavior in a function called from procsignal_sigusr1_handler. I can arrange for passing any needed detail through the SSI-controlled structures somehow. Would that allow you to skip the parts you didn't like? It looks as though this is something which could easily be split off as a separate patch within the SSI effort. -Kevin
Re: [HACKERS] directory archive format for pg_dump
Excerpts from Heikki Linnakangas's message of jue dic 02 16:52:27 -0300 2010: Ok, committed, with some small cleanup since the last patch I posted. I think the comments on _ReadBuf and friends need to be updated, since they are not just for headers and TOC stuff anymore. I'm not sure if they were already outdated before your patch ... -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] V3: Idle in transaction cancellation
Excerpts from Andres Freund's message of sáb oct 30 05:49:21 -0300 2010: I'll set this up for the next commitfest, I don't think I can do much more without further input. Are you reserving about 20 bits for levels, and 12 for flags? Given the relatively scarce growth of levels, I think we could live with about 6 or 7 bits for level, rest for flags. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] [PATCH] V3: Idle in transaction cancellation
On Thursday 02 December 2010 22:21:37 Alvaro Herrera wrote: Excerpts from Andres Freund's message of sáb oct 30 05:49:21 -0300 2010: I'll set this up for the next commitfest, I don't think I can do much more without further input. Are you reserving about 20 bits for levels, and 12 for flags? Given the relatively scarce growth of levels, I think we could live with about 6 or 7 bits for level, rest for flags. The number I picked was absolutely arbitrary, I admit. Neither did I think it was likely we'd see more levels, nor did I foresee many flags, so I just chose some number I liked in that certain moment ;-) Andres
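The level-versus-flags split Alvaro suggests amounts to packing two small fields into one integer. A minimal sketch, assuming the 7-bit level width he proposes (the macro and function names here are invented for illustration and are not the patch's actual definitions):

```c
#include <assert.h>
#include <stdint.h>

/* 7 bits for the error level leaves 25 bits of a uint32 for flags. */
#define LEVEL_BITS  7
#define LEVEL_MASK  ((1u << LEVEL_BITS) - 1)   /* 0x7f */

/* Pack a level (0..127) and a flag set into one 32-bit value. */
static uint32_t
pack_level_flags(uint32_t level, uint32_t flags)
{
    return (flags << LEVEL_BITS) | (level & LEVEL_MASK);
}

static uint32_t
get_level(uint32_t packed)
{
    return packed & LEVEL_MASK;
}

static uint32_t
get_flags(uint32_t packed)
{
    return packed >> LEVEL_BITS;
}
```

The trade-off being discussed is simply how the 32 bits are divided: 20/12 wastes level space that elevel values will never need, while 7/25 keeps room for many future flags, which is why Alvaro leans toward the narrower level field.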
Re: [HACKERS] crash-safe visibility map, take three
On Thu, Dec 2, 2010 at 2:01 PM, Jeff Davis pg...@j-davis.com wrote: * We don't get an exclusive lock when dirtying a page with hint bits - Why: we write while reading, and we want good concurrency. - Why': because after a bulk load, we don't have any hint bits, and the only way to get them set without VACUUM is to write while reading. I've never been entirely sure why VACUUM isn't good enough in this case, aside from the fact that a user might not run VACUUM (and autovacuum might not either, if it was only a bulk load and no updates/deletes). * We don't WAL log setting hint bits (which dirties a page) - Why: because after a bulk load, we don't want to write the data a 4th time Hypothetically, if we had a bulk loading strategy, these problems would go away, and we could follow the rules. Right? Is there a case other than bulk loading which demands that we break these rules? I'm not really convinced that this problem is confined to bulk loading. Every INSERT or UPDATE results in a new tuple that may need hint bits set and eventually to be frozen. A bulk load is just a time when you do lots of inserts all at once; it seems to me that a large update would cause all the same problems, plus bloat. The triple I/O problem exists for small transactions as well (and isn't desirable there either); it's just less noticeable because the second and third writes are, like the first one, small. And, if we had a bulk loading path, we could probably get away with writing the data only twice (today, we write it 3 times including the hint bits) or maybe once if WAL archiving is off. It seems to me that a COPY command executed in a transaction with no other open snapshots writing to a table created or truncated within the same transaction should be able to write frozen tuples from the get-go, regardless of anything else we do. So, is there a case other than bulk loading for which we need to break these rules?
If not, perhaps we should consider bulk loading a different problem, and simplify the design of all of these other features (and allow new storage-touching features to come about, like CRCs, without exponentially increasing the complexity with each one). I don't think we're exponentially increasing complexity - I think we're incrementally improving our algorithms. If you want to propose a bulk loading path, great. Propose away! But without something a bit more concrete, I don't think it would be appropriate to hold off making the visibility map crash-safe, on the off chance that our design for so doing might complicate something else we want to do later. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for parallel pg_dump
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: That's a big patch.. Not nearly big enough :-( In the past, proposals for this have always been rejected on the grounds that it's impossible to assure a consistent dump if different connections are used to read different tables. I fail to understand why that consideration can be allowed to go by the wayside now. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for parallel pg_dump
On 12/02/2010 05:01 PM, Tom Lane wrote: Heikki Linnakangasheikki.linnakan...@enterprisedb.com writes: That's a big patch.. Not nearly big enough :-( In the past, proposals for this have always been rejected on the grounds that it's impossible to assure a consistent dump if different connections are used to read different tables. I fail to understand why that consideration can be allowed to go by the wayside now. Well, snapshot cloning should allow that objection to be overcome, no? cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for parallel pg_dump
Andrew Dunstan and...@dunslane.net writes: On 12/02/2010 05:01 PM, Tom Lane wrote: In the past, proposals for this have always been rejected on the grounds that it's impossible to assure a consistent dump if different connections are used to read different tables. I fail to understand why that consideration can be allowed to go by the wayside now. Well, snapshot cloning should allow that objection to be overcome, no? Possibly, but we need to see that patch first not second. (I'm not actually convinced that snapshot cloning is the only problem here; locking could be an issue too, if there are concurrent processes trying to take locks that will conflict with pg_dump's. But the snapshot issue is definitely a showstopper.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for parallel pg_dump
Dimitri Fontaine wrote: Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: I don't see the point of the sort-by-relpages code. The order the objects are dumped should be irrelevant, as long as you obey the restrictions dictated by dependencies. Or is it only needed for the multiple-target-dirs feature? Frankly I don't see the point of that, so it would be good to cull it out at least in this first stage. From the talk at CHAR(10), and provided memory serves, it's an optimisation so that you're doing the largest file in one process and all the little files in other processes. In many cases the total pg_dump duration is then reduced to about the time to dump the biggest files. Seems there should be a comment in the code explaining why this is being done. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [HACKERS] crash-safe visibility map, take three
On Thu, 2010-12-02 at 17:00 -0500, Robert Haas wrote: I'm not really convinced that this problem is confined to bulk loading. Every INSERT or UPDATE results in a new tuple that may need hint bits set and eventually to be frozen. A bulk load is just a time when you do lots of inserts all at once; it seems to me that a large update would cause all the same problems, plus bloat. A big UPDATE does a lot of work; I don't see any near-term approach for solving that. Avoiding WAL (and full page writes) for the updates to PD_ALL_VISIBLE, hint bits, VM bits, etc., would probably be the least of my concerns in that case. The triple I/O problem exists for small transactions as well (and isn't desirable there either); it's just less noticeable because the second and third writes are, like the first one, small. Bulk loading poses some unique challenges because there is no opportunity to set PD_ALL_VISIBLE or hint bits before the loading is complete; and by that time, many checkpoints will have already happened, and the pages have already hit disk. That means we need to re-read them, modify them, and write them again (plus WAL, if we were following the rules). Small transactions don't suffer from the same problems. They generally begin and end without an intervening checkpoint. That means that you have an opportunity to set PD_ALL_VISIBLE or hint bits before the checkpoint happens, thus avoiding unnecessary extra writes. Additionally, small transaction workloads will generally have, to some approximation, some working set of pages. So, even if you do a read of a single tuple, write PD_ALL_VISIBLE and hint bits (thus dirtying the page), there is a reasonable chance that someone will come by later and do an insert/update/delete (thus forcing WAL anyway). And if the small transaction workload is completely random and you aren't touching the same pages between checkpoints, then setting hint bits one-at-a-time is not a good strategy anyway. 
It would be much better to do it in bulk with a VACUUM. And if VACUUM does anything significant to a page, it's going to WAL anyway. I'm having trouble seeing a case other than bulk-loading which causes a real problem. Maybe a small-transaction workload with a few long-running transactions? Even that doesn't seem so bad. It seems to me that a COPY command executed in a transaction with no other open snapshots writing to a table created or truncated within the same transaction should be able to write frozen tuples from the get-go, regardless of anything else we do. Well, some transaction might pick up a snapshot between the time you begin the copy and the time it commits. We'd need to prevent such a transaction from actually reading the table. I don't think it would be appropriate to hold off making the visibility map crash-safe, on the off chance that our design for so doing might complicate something else we want to do later. I'm not suggesting we hold off on it at all. To the contrary, I'm suggesting that we simply log updates of PD_ALL_VISIBLE as well as VM bits, at least until a performance problem presents itself. That will _simplify_ the design. Then, when a performance problem does present itself for a certain use case, we can see how to fix it. If many cases are affected, then we might choose one of these more creative solutions that breaks the rules in controlled ways, understanding the trade-offs. If only bulk loading is affected, we might choose to address that case directly. Regards, Jeff Davis
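The contrast Jeff draws between small transactions and bulk loads can be sketched with a toy write-coalescing model (hypothetical helper and times, not PostgreSQL code): dirtyings of a page coalesce until the next checkpoint flushes it, so what matters is how many checkpoints fall between the modifications.

```python
# Toy model: each modification of a page is physically written out at
# the first checkpoint at or after it; modifications that share a
# checkpoint coalesce into one write.
def physical_writes(modifications, checkpoints):
    """modifications/checkpoints: sorted lists of times."""
    flushes = set()
    for m in modifications:
        nxt = next((c for c in checkpoints if c >= m), None)
        if nxt is not None:
            flushes.add(nxt)
    return len(flushes)

# Small transaction: insert and hint-bit setting both land before the
# next checkpoint, so the page is written once.
print(physical_writes([1, 2], [10]))  # 1

# Bulk load: insert at t=1, hint bits on a later scan at t=15, freeze
# at t=95, with checkpoints every 10 units -- three separate writes.
print(physical_writes([1, 15, 95], list(range(10, 101, 10))))  # 3
```

This is the "triple I/O" shape for bulk loads: the same data is written once per modification pass because the passes straddle checkpoints.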
Re: [HACKERS] WIP patch for parallel pg_dump
On 12/02/2010 05:32 PM, Tom Lane wrote: Andrew Dunstan and...@dunslane.net writes: On 12/02/2010 05:01 PM, Tom Lane wrote: In the past, proposals for this have always been rejected on the grounds that it's impossible to assure a consistent dump if different connections are used to read different tables. I fail to understand why that consideration can be allowed to go by the wayside now. Well, snapshot cloning should allow that objection to be overcome, no? Possibly, but we need to see that patch first, not second. Yes, I agree with that. (I'm not actually convinced that snapshot cloning is the only problem here; locking could be an issue too, if there are concurrent processes trying to take locks that will conflict with pg_dump's. But the snapshot issue is definitely a showstopper.) Why is that more an issue with parallel pg_dump? cheers andrew
Re: [HACKERS] We really ought to do something about O_DIRECT and data=journalled on ext4
Andrew Dunstan wrote: On 11/30/2010 11:17 PM, Tom Lane wrote: Andrew Dunstan and...@dunslane.net writes: On 11/30/2010 10:09 PM, Tom Lane wrote: We should wait for the outcome of the discussion about whether to change the default wal_sync_method before worrying about this. we've just had a significant PGX customer encounter this with the latest Postgres on Redhat's freshly released flagship product. Presumably the default wal_sync_method will only change prospectively. I don't think so. The fact that Linux is changing underneath us is a compelling reason for back-patching a change here. Our older branches still have to be able to run on modern OS versions. I'm also fairly unclear on what you think a fix would look like if it's not effectively a change in the default. (Hint: this *will* be changing, one way or another, in Red Hat's version of 8.4, since that's what RH is shipping in RHEL6.) Well, my initial idea was that if PG_O_DIRECT is non-zero, we should test at startup time if we can use it on the WAL file system and inhibit its use if not. Incidentally, I notice it's not used at all in test_fsync.c - should it not be? test_fsync certainly should be using PG_O_DIRECT in the same places the backend does. Once we decide how to handle PG_O_DIRECT, I will modify test_fsync to match. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
[HACKERS] should we set hint bits without dirtying the page?
In a sleepy email late last night on the crash-safe visibility map thread, I proposed introducing a new buffer state BM_UNTIDY. When a page is dirtied by a hint bit update, we mark it untidy but not dirty. Untidy buffers would be treated as dirty by the background writer cleaning scan, but as clean by checkpoints and by backends doing emergency buffer cleaning to feed new allocations. This would have the effect of rate-limiting the number of buffers that we write just for hint-bit updates. With default settings, we'd write at most bgwriter_lru_maxpages * (1000 ms/second / bgwriter_delay) untidy pages per second, which works out to 4MB/second of write traffic with default settings. That seems like it might be enough to prevent the bulk load followed by SELECT access pattern from totally swamping the machine with write traffic, while still ensuring that all the hint bits eventually do get set. I then got to wondering whether we should even go a step further, and simply decree that a page with only hint bit updates is not dirty and won't be written, period. If your working set fits in RAM, this isn't really a big deal because you'll read the pages in once, set the hint bits, and those pages will just stick around. Where it's a problem is when you have a huge table that you're scanning over and over again, especially if data in that table was loaded by many different, widely spaced XIDs that require looking at many different CLOG pages. But maybe we could ameliorate that problem by freezing more aggressively. As soon as all tuples on the page are all-visible, VACUUM will freeze every tuple on the page (setting a HEAP_XMIN_FROZEN bit rather than actually overwriting XMIN, to preserve forensic information) and mark it all-visible in a single WAL-logged operation. Also, we could have the background writer (!) try to perform this same operation on pages evicted during the cleaning scan. 
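The "4MB/second of write traffic" figure above follows directly from the stock defaults (assumed here: bgwriter_lru_maxpages=100, bgwriter_delay=200ms, 8kB pages; none of these numbers are spelled out in the mail):

```python
# Back-of-envelope check of the untidy-page write cap described above.
bgwriter_lru_maxpages = 100   # pages written per cleaning-scan cycle
bgwriter_delay_ms = 200       # sleep between cleaning-scan cycles
page_kb = 8                   # BLCKSZ

cycles_per_sec = 1000 / bgwriter_delay_ms               # 5 cycles/s
pages_per_sec = bgwriter_lru_maxpages * cycles_per_sec  # 500 pages/s
mb_per_sec = pages_per_sec * page_kb / 1024

print(round(mb_per_sec, 2))  # ~3.91, i.e. the "4MB/second" in the text
```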
This would impose the same sort of I/O cap as the previous idea, although it would generate not only page writes but also WAL activity. The result would be not only to reduce the number of times we write the page (which, right now, can be as much as 3 * number_of_tuples, if we insert, hint-bit update, and then freeze each tuple separately), but also to make the freezing happen gradually over time rather than in a sudden spike when the XID age cut-off is reached. This would also be advantageous for index-only scans, because a large insert only table would gradually accumulate frozen pages without ever being vacuumed. The gradual freezing wouldn't apply in all cases - in particular, if you have a large insert-only table that you never actually read anything out of, you'd still get a spike when the XID age cut-off is reached. I'm inclined to think it would still be a big improvement over the status quo - you'd write the table twice instead of three times, and the second one would often be spread out rather than all at once. I foresee various objections. One is that freezing will force FPIs, so you'll still be writing the data three times. Of course, if you count FPIs, we're now writing the data four times, but under this scheme much more data would stick around long enough to get frozen, so the objection has merit. However, I think we can avoid this too, by allocating an additional bit in pd_flags, PD_FPI. Instead of emitting an FPI when the old LSN precedes the redo pointer, we'll emit an FPI when the FPI bit is set (in which case we'll also clear the bit) OR when the old LSN precedes the redo pointer. Upon emitting a WAL record that is torn-page safe (such as a freeze or all-visible record), we'll pass a flag to XLogInsert that arranges to suppress FPIs, bump the LSN, and set PD_FPI. That way, if the page is touched again before the next checkpoint by an operation that does NOT suppress FPI, one will be emitted then. 
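The proposed PD_FPI rule can be modeled as a small state machine (an illustrative sketch of the proposal, not xlog.c code; names like PD_FPI are taken from the mail): a torn-page-safe record suppresses its FPI, bumps the LSN, and sets the flag, so the next ordinary record pays the deferred FPI.

```python
# Model of the proposed rule: emit a full-page image if the page's
# PD_FPI flag is set, OR if its LSN precedes the redo pointer (the
# existing rule).
class Page:
    def __init__(self, lsn=0, pd_fpi=False):
        self.lsn, self.pd_fpi = lsn, pd_fpi

def need_fpi(page, redo_ptr):
    return page.pd_fpi or page.lsn < redo_ptr

def log_record(page, redo_ptr, new_lsn, torn_page_safe=False):
    """Returns True if a full-page image accompanies this record."""
    if torn_page_safe:
        # Suppress the FPI, bump the LSN, set PD_FPI to defer it.
        page.lsn, page.pd_fpi = new_lsn, True
        return False
    fpi = need_fpi(page, redo_ptr)
    page.lsn, page.pd_fpi = new_lsn, False
    return fpi

page = Page(lsn=5)  # page last touched before the current redo pointer
print(log_record(page, 10, 12, torn_page_safe=True))  # False: FPI suppressed
print(log_record(page, 10, 14))                       # True: deferred FPI paid now
print(log_record(page, 10, 16))                       # False: LSN past redo, flag clear
```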
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] crash-safe visibility map, take three
On Thu, Dec 2, 2010 at 6:37 PM, Jeff Davis pg...@j-davis.com wrote: It seems to me that a COPY command executed in a transaction with no other open snapshots writing to a table created or truncated within the same transaction should be able to write frozen tuples from the get-go, regardless of anything else we do. Well, some transaction might pick up a snapshot between the time you begin the copy and the time it commits. We'd need to prevent such a transaction from actually reading the table. Oh, hmm. That's awkward. I guess under present semantics it can see the table - but not its contents - once the inserting transaction has committed. That stinks. I don't think it would be appropriate to hold off making the visibility map crash-safe, on the off chance that our design for so doing might complicate something else we want to do later. I'm not suggesting we hold off on it at all. To the contrary, I'm suggesting that we simply log updates of PD_ALL_VISIBLE as well as VM bits, at least until a performance problem presents itself. That will _simplify_ the design. Then, when a performance problem does present itself for a certain use case, we can see how to fix it. If many cases are affected, then we might choose one of these more creative solutions that breaks the rules in controlled ways, understanding the trade-offs. If only bulk loading is affected, we might choose to address that case directly. I don't think that you can seriously suggest that emitting that volume of FPIs isn't going to be a problem immediately. We have to have some solution to that problem out of the gate. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Another proposal for table synonyms
On Thu, Dec 2, 2010 at 3:43 PM, Alvaro Herrera alvhe...@commandprompt.com wrote: Excerpts from Robert Haas's message of jue dic 02 17:27:01 -0300 2010: On Wed, Dec 1, 2010 at 3:55 PM, Alvaro Herrera alvhe...@commandprompt.com wrote: Excerpts from Josh Berkus's message of mié dic 01 17:13:35 -0300 2010: Well, porting applications from other database systems that support synonyms (i.e. Oracle, DB2, SQL Server). SQL Server supports synonyms? If it's not Oracle-only, it's a more powerful argument to have the feature. I think it's worth mentioning that in SQL Server, synonyms are not schema-qualified; they're global objects. Seems like they have more than one kind. http://download.oracle.com/docs/cd/B19306_01/server.102/b14200/statements_7001.htm Yeah, the Oracle system is a lot more complex than SQL Server's, but I was only talking about the latter, for which see here: http://www.databasejournal.com/features/mssql/article.php/3635426/SYNONYM-in-SQL-Server-2005.htm Well, that seems primarily designed to cut down on three and four part names. We don't have that problem anyway. The list of objects for which they support synonyms is also interesting. The bit that allows a synonym to reference another synonym seems like worth considering further (either reject them altogether, or have some way to deal with possible cycles). It's pretty trivial to do cycle-detection at runtime. I think the original proposal in this thread didn't mention them at all. (I don't think we should consider synonyms for either functions or stored procedures; that would make the current mess of function resolution rules a lot messier.) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Spread checkpoint sync
On Thu, Dec 2, 2010 at 2:24 PM, Greg Stark gsst...@mit.edu wrote: On Wed, Dec 1, 2010 at 4:25 AM, Greg Smith g...@2ndquadrant.com wrote: I ask because I don't have a mental model of how the pause can help. Given that this dirty data has been hanging around for many minutes already, what is a 3 second pause going to heal? The difference is that once an fsync call is made, dirty data is much more likely to be forced out. It's the one thing that bypasses all other ways the kernel might try to avoid writing the data. I had always assumed the problem was that other I/O had been done to the files in the meantime. I.e. the fsync is not just syncing the checkpoint but any other blocks that had been flushed since the checkpoint started. It strikes me that we might start the syncs of the files that the checkpoint isn't going to dirty further at the start of the checkpoint, and do the rest at the end. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
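Robert's scheduling idea can be sketched as a simple partition (invented file names; not server code): fsync up front the files the checkpoint will not write to again, deferring only the rest to the end, which shrinks the window in which unrelated dirty blocks pile onto the final fsyncs.

```python
# Split the files needing fsync into those the checkpoint will not
# dirty further (sync at checkpoint start) and those it will (sync at
# checkpoint end).
def plan_syncs(dirty_files, files_checkpoint_will_write):
    early = sorted(f for f in dirty_files if f not in files_checkpoint_will_write)
    late = sorted(f for f in dirty_files if f in files_checkpoint_will_write)
    return early, late

early, late = plan_syncs({"16384", "16390", "16401"}, {"16390"})
print(early)  # ['16384', '16401'] -- synced at checkpoint start
print(late)   # ['16390']          -- synced at checkpoint end
```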
Re: [HACKERS] DELETE with LIMIT (or my first hack)
Peter Eisentraut wrote: On tis, 2010-11-30 at 14:20 -0500, Andrew Dunstan wrote: I agree, that argument is completely misconceived. If the DBA is paying enough attention to use LIMIT, s/he should be paying enough attention not to do damage in the first place. If that were the only argument in its favor I'd be completely against the feature. I don't have any use for DELETE with LIMIT, but UPDATE with LIMIT could be very useful if you are doing full-table updates and you don't have enough space, so you do it in chunks. So should this now be a TODO item? Text? -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [HACKERS] WIP patch for parallel pg_dump
On Thu, Dec 2, 2010 at 5:32 PM, Tom Lane t...@sss.pgh.pa.us wrote: Andrew Dunstan and...@dunslane.net writes: On 12/02/2010 05:01 PM, Tom Lane wrote: In the past, proposals for this have always been rejected on the grounds that it's impossible to assure a consistent dump if different connections are used to read different tables. I fail to understand why that consideration can be allowed to go by the wayside now. Well, snapshot cloning should allow that objection to be overcome, no? Possibly, but we need to see that patch first, not second. Yes, by all means let's allow the perfect to be the enemy of the good. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] DELETE with LIMIT (or my first hack)
On Thu, Dec 2, 2010 at 7:12 PM, Bruce Momjian br...@momjian.us wrote: Peter Eisentraut wrote: On tis, 2010-11-30 at 14:20 -0500, Andrew Dunstan wrote: I agree, that argument is completely misconceived. If the DBA is paying enough attention to use LIMIT, s/he should be paying enough attention not to do damage in the first place. If that were the only argument in its favor I'd be completely against the feature. I don't have any use for DELETE with LIMIT, but UPDATE with LIMIT could be very useful if you are doing full-table updates and you don't have enough space so you do it in chunks. So should this now be a TODO item? Text? Allow DELETE and UPDATE to be used with LIMIT and ORDER BY. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Another proposal for table synonyms
Excerpts from Robert Haas's message of jue dic 02 21:10:48 -0300 2010: On Thu, Dec 2, 2010 at 3:43 PM, Alvaro Herrera alvhe...@commandprompt.com wrote: Excerpts from Robert Haas's message of jue dic 02 17:27:01 -0300 2010: Yeah, the Oracle system is a lot more complex than SQL Server's, but I was only talking about the latter, for which see here: http://www.databasejournal.com/features/mssql/article.php/3635426/SYNONYM-in-SQL-Server-2005.htm Well, that seems primarily designed to cut down on three and four part names. We don't have that problem anyway. Right. (My point here is that SQL Server is not a good guidance on what the synonym system should do.) The list of objects for which they support synonyms is also interesting. The bit that allows a synonym to reference another synonym seems like worth considering further (either reject them altogether, or have some way to deal with possible cycles). It's pretty trivial to do cycle-detection at runtime. No disagreement on that, but something needs to be decided. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] should we set hint bits without dirtying the page?
On 12/2/10 4:00 PM, Robert Haas wrote: As soon as all tuples on the page are all-visible, VACUUM will freeze every tuple on the page (setting a HEAP_XMIN_FROZEN bit rather than actually overwriting XMIN, to preserve forensic information) and mark it all-visible in a single WAL-logged operation. Also, we could have the background writer (!) try to perform this same operation on pages evicted during the cleaning scan. This would impose the same sort of I/O cap as the previous idea, although it would generate not only page writes but also WAL activity. I would love this. It would also help considerably with the freezing already cold data problem ... if we were allowed to treat the frozen bit as canonical and not update any of the tuples. While never needing to touch pages at all for freezing is my preference, updating them while they're in memory anyway is a close second. Hmm. That doesn't work, though; the page can contain tuples which are attached to rolledback XIDs. Also, autovacuum would have no way of knowing which pages are frozen without reading them. -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com
Re: [HACKERS] WIP patch for parallel pg_dump
On 12/02/2010 07:13 PM, Robert Haas wrote: On Thu, Dec 2, 2010 at 5:32 PM, Tom Lane t...@sss.pgh.pa.us wrote: Andrew Dunstan and...@dunslane.net writes: On 12/02/2010 05:01 PM, Tom Lane wrote: In the past, proposals for this have always been rejected on the grounds that it's impossible to assure a consistent dump if different connections are used to read different tables. I fail to understand why that consideration can be allowed to go by the wayside now. Well, snapshot cloning should allow that objection to be overcome, no? Possibly, but we need to see that patch first, not second. Yes, by all means let's allow the perfect to be the enemy of the good. That seems like a bit of an easy shot. Requiring that parallel pg_dump produce a dump that is as consistent as non-parallel pg_dump currently produces isn't unreasonable. It's not stopping us moving forward, it's just not wanting to go backwards. And it shouldn't be terribly hard. IIRC Joachim has already done some work on it. cheers andrew
Re: [HACKERS] DELETE with LIMIT (or my first hack)
Robert Haas wrote: On Thu, Dec 2, 2010 at 7:12 PM, Bruce Momjian br...@momjian.us wrote: Peter Eisentraut wrote: On tis, 2010-11-30 at 14:20 -0500, Andrew Dunstan wrote: I agree, that argument is completely misconceived. If the DBA is paying enough attention to use LIMIT, s/he should be paying enough attention not to do damage in the first place. If that were the only argument in its favor I'd be completely against the feature. I don't have any use for DELETE with LIMIT, but UPDATE with LIMIT could be very useful if you are doing full-table updates and you don't have enough space so you do it in chunks. So should this now be a TODO item? Text? Allow DELETE and UPDATE to be used with LIMIT and ORDER BY. Done: Allow DELETE and UPDATE to be used with LIMIT and ORDER BY * http://archives.postgresql.org/pgsql-hackers/2010-11/msg01997.php * http://archives.postgresql.org/pgsql-hackers/2010-12/msg00021.php -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [HACKERS] should we set hint bits without dirtying the page?
On Thu, Dec 2, 2010 at 7:19 PM, Josh Berkus j...@agliodbs.com wrote: On 12/2/10 4:00 PM, Robert Haas wrote: As soon as all tuples on the page are all-visible, VACUUM will freeze every tuple on the page (setting a HEAP_XMIN_FROZEN bit rather than actually overwriting XMIN, to preserve forensic information) and mark it all-visible in a single WAL-logged operation. Also, we could have the background writer (!) try to perform this same operation on pages evicted during the cleaning scan. This would impose the same sort of I/O cap as the previous idea, although it would generate not only page writes but also WAL activity. I would love this. It would also help considerably with the freezing already cold data problem ... if we were allowed to treat the frozen bit as canonical and not update any of the tuples. While never needing to touch pages at all for freezing is my preference, updating them while they're in memory anyway is a close second. Hmm. That doesn't work, though; the page can contain tuples which are attached to rolledback XIDs. Sure, well, any pages that are not all-visible will need to get vacuumed before they get marked all-visible. I can't fix that problem. But the more we freeze opportunistically before vacuum, the less painful vacuum will be when it finally kicks in. I don't anticipate this is going to be perfect; I'd be happy if we could achieve better. Also, autovacuum would have no way of knowing which pages are frozen without reading them. Well, reading them is still better than reading them and then writing them. But in the long term I imagine we can avoid even doing that much. If we have a crash-safe visibility map and an aggressive freezing policy that freezes all tuples on the page before marking it all-visible, then even an anti-wraparound vacuum needn't scan all-visible pages. We might not feel confident to rely on that right away, but I think over the long term we can hope to get there. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] crash-safe visibility map, take three
On Thu, 2010-12-02 at 19:06 -0500, Robert Haas wrote: I don't think that you can seriously suggest that emitting that volume of FPIs isn't going to be a problem immediately. We have to have some solution to that problem out of the gate. Fair enough. I think you understand my point, and it's easy enough to remove complexity later if it makes sense to do so. Regards, Jeff Davis
Re: [HACKERS] WIP patch for parallel pg_dump
On Thu, Dec 2, 2010 at 7:21 PM, Andrew Dunstan and...@dunslane.net wrote: In the past, proposals for this have always been rejected on the grounds that it's impossible to assure a consistent dump if different connections are used to read different tables. I fail to understand why that consideration can be allowed to go by the wayside now. Well, snapshot cloning should allow that objection to be overcome, no? Possibly, but we need to see that patch first, not second. Yes, by all means let's allow the perfect to be the enemy of the good. That seems like a bit of an easy shot. Requiring that parallel pg_dump produce a dump that is as consistent as non-parallel pg_dump currently produces isn't unreasonable. It's not stopping us moving forward, it's just not wanting to go backwards. I certainly agree that would be nice. But if Joachim thought the patch were useless without that, perhaps he wouldn't have bothered writing it at this point. In fact, he doesn't think that, and he mentioned the use cases he sees in his original post. But even supposing you wouldn't personally find this useful in those situations, how can you possibly say that HE wouldn't find it useful in those situations? I understand that people sometimes show up here and ask for ridiculous things, but I don't think we should be too quick to attribute ridiculousness to regular contributors. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] WIP patch for parallel pg_dump
On 12/02/2010 07:48 PM, Robert Haas wrote: On Thu, Dec 2, 2010 at 7:21 PM, Andrew Dunstan and...@dunslane.net wrote: In the past, proposals for this have always been rejected on the grounds that it's impossible to assure a consistent dump if different connections are used to read different tables. I fail to understand why that consideration can be allowed to go by the wayside now. Well, snapshot cloning should allow that objection to be overcome, no? Possibly, but we need to see that patch first, not second. Yes, by all means let's allow the perfect to be the enemy of the good. That seems like a bit of an easy shot. Requiring that parallel pg_dump produce a dump that is as consistent as non-parallel pg_dump currently produces isn't unreasonable. It's not stopping us moving forward, it's just not wanting to go backwards. I certainly agree that would be nice. But if Joachim thought the patch were useless without that, perhaps he wouldn't have bothered writing it at this point. In fact, he doesn't think that, and he mentioned the use cases he sees in his original post. But even supposing you wouldn't personally find this useful in those situations, how can you possibly say that HE wouldn't find it useful in those situations? I understand that people sometimes show up here and ask for ridiculous things, but I don't think we should be too quick to attribute ridiculousness to regular contributors. Umm, nobody has attributed ridiculousness to anyone. Please don't put words in my mouth. But I think this is a perfectly reasonable discussion to have. Nobody gets to come along and get the features they want without some sort of consensus, not me, not you, not Joachim, not Tom. cheers andrew
Re: [HACKERS] Instrument checkpoint sync calls
On Tue, Nov 30, 2010 at 12:15 PM, Jeff Janes jeff.ja...@gmail.com wrote: On Tue, Nov 30, 2010 at 8:38 AM, Greg Smith g...@2ndquadrant.com wrote: Hi Greg, Thanks for the update. This might be ready for some proper review now. I know there's at least one blatant bug still in here I haven't found yet, related to how the averages are computed. Yes, the blatant bug: average_sync_time = CheckpointStats.ckpt_longest_sync / CheckpointStats.ckpt_sync_rels; That should clearly be ckpt_agg_sync_time, not ckpt_longest_sync. I've attached a tiny patch to apply over yours, to deal with this and with the case where no files are synced. Combining this instrumentation patch with the backend sync one already committed, the net result under log_min_messages=debug1 is somewhat undesirable in that I can now see the individual sync times for the syncs done by the checkpoint writer, but I do not get times for the syncs done by the backend (I only get informed of their existence). I don't know what I would propose to fix this. Having the reportage of sync time of backend syncs be controlled by log_checkpoints seems somewhat perverse, but the only alternative I see is to have log_min_messages=debug1 always report times for the backend syncs. Or to have them go unreported altogether. Cheers, Jeff
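The corrected computation Jeff describes, including the no-files-synced guard, can be sketched like this (a model mirroring the field names quoted in the mail, not the actual C patch):

```python
# Average per-relation sync time: divide the *aggregate* sync time by
# the number of relations synced. The bug used ckpt_longest_sync (the
# single longest sync) as the numerator. Guard the zero-relations case
# so a checkpoint that synced nothing doesn't divide by zero.
def average_sync_time(ckpt_agg_sync_time, ckpt_sync_rels):
    if ckpt_sync_rels == 0:
        return 0.0
    return ckpt_agg_sync_time / ckpt_sync_rels

print(average_sync_time(3000, 3))  # 1000.0
print(average_sync_time(0, 0))     # 0.0 -- no files synced
```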
[HACKERS] Author names in source files
I found that there are author names in some of the source files. Will we keep their names? And will we add the names of new authors who write patches for those files? src/backend/postmaster/syslogger.c(15) : * Author: Andreas Pflug src/backend/utils/adt/genfile.c(9) : * Author: Andreas Pflug src/backend/utils/adt/oracle_compat.c(7): * Author: Edmund Mergl src/backend/utils/adt/txid.c(14): * Author: Jan Wieck, Afilias USA INC. -- Itagaki Takahiro
Re: [HACKERS] Hi- How frequently Postgres Poll for trigger file
On Thu, Dec 2, 2010 at 12:16 AM, Euler Taveira de Oliveira eu...@timbira.com wrote: As you said, there are platforms where a signal doesn't wake up a process, so I suggest (ii), but I'm fine with including (i) in the docs too. Can we use pg_ctl kill to send the signal on such platforms? Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: [HACKERS] WIP patch for parallel pg_dump
On Dec 2, 2010, at 8:11 PM, Andrew Dunstan and...@dunslane.net wrote: Umm, nobody has attributed ridiculousness to anyone. Please don't put words in my mouth. But I think this is a perfectly reasonable discussion to have. Nobody gets to come along and get the features they want without some sort of consensus, not me, not you, not Joachim, not Tom. I'm not disputing that we COULD reject the patch. I AM disputing that we've made a cogent argument for doing so. ...Robert
Re: [HACKERS] WIP patch for parallel pg_dump
Andrew Dunstan and...@dunslane.net writes: On 12/02/2010 05:32 PM, Tom Lane wrote: (I'm not actually convinced that snapshot cloning is the only problem here; locking could be an issue too, if there are concurrent processes trying to take locks that will conflict with pg_dump's. But the snapshot issue is definitely a showstopper.) Why is that more an issue with parallel pg_dump? The scenario that bothers me is 1. pg_dump parent process AccessShareLocks everything to be dumped. 2. somebody else tries to acquire AccessExclusiveLock on table foo. 3. pg_dump child process is told to dump foo, tries to acquire AccessShareLock. Now, process 3 is blocked behind process 2, which is blocked behind process 1, which is waiting for 3 to complete. Can you say undetectable deadlock? regards, tom lane
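Tom's three-step scenario can be sketched as a lock timeline (the table name `foo` and the session split are illustrative, not pg_dump's actual code):

```sql
-- Session 1 (pg_dump parent): takes AccessShareLock on every table to dump
BEGIN;
LOCK TABLE foo IN ACCESS SHARE MODE;      -- granted

-- Session 2 (concurrent DDL): queues behind the parent's lock
BEGIN;
LOCK TABLE foo IN ACCESS EXCLUSIVE MODE;  -- waits for session 1

-- Session 3 (pg_dump child): queues behind session 2's exclusive request
BEGIN;
LOCK TABLE foo IN ACCESS SHARE MODE;      -- waits for session 2

-- Session 1 is waiting for session 3 to finish dumping foo, so nothing
-- can proceed; the deadlock detector cannot see the cycle because the
-- "parent waits for child" edge exists outside the lock manager.
```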
Re: [HACKERS] WIP patch for parallel pg_dump
On 12/02/2010 09:09 PM, Tom Lane wrote: Andrew Dunstan and...@dunslane.net writes: On 12/02/2010 05:32 PM, Tom Lane wrote: (I'm not actually convinced that snapshot cloning is the only problem here; locking could be an issue too, if there are concurrent processes trying to take locks that will conflict with pg_dump's. But the snapshot issue is definitely a showstopper.) Why is that more an issue with parallel pg_dump? The scenario that bothers me is 1. pg_dump parent process AccessShareLocks everything to be dumped. 2. somebody else tries to acquire AccessExclusiveLock on table foo. hmm. 3. pg_dump child process is told to dump foo, tries to acquire AccessShareLock. Now, process 3 is blocked behind process 2, which is blocked behind process 1, which is waiting for 3 to complete. Can you say undetectable deadlock? Hmm. Yeah. Maybe we could get around it if we prefork the workers and they all acquire locks on everything to be dumped up front in nowait mode, right after the parent, and if they can't, the whole dump fails. Or something along those lines. cheers andrew
Re: [HACKERS] WIP patch for parallel pg_dump
Andrew Dunstan and...@dunslane.net writes: Umm, nobody has attributed ridiculousness to anyone. Please don't put words in my mouth. But I think this is a perfectly reasonable discussion to have. Nobody gets to come along and get the features they want without some sort of consensus, not me, not you, not Joachim, not Tom. In particular, this issue *has* been discussed before, and there was a consensus that preserving dump consistency was a requirement. I don't think that Joachim gets to bypass that decision just by submitting a patch that ignores it. regards, tom lane
Re: [HACKERS] WIP patch for parallel pg_dump
Andrew Dunstan and...@dunslane.net writes: On 12/02/2010 09:09 PM, Tom Lane wrote: Now, process 3 is blocked behind process 2, which is blocked behind process 1, which is waiting for 3 to complete. Can you say undetectable deadlock? Hmm. Yeah. Maybe we could get around it if we prefork the workers and they all acquire locks on everything to be dumped up front in nowait mode, right after the parent, and if they can't, the whole dump fails. Or something along those lines. [ thinks for a bit... ] Actually it might be good enough if a child simply takes the lock it needs in nowait mode, and reports failure on error. We know the parent already has that lock, so the only way that the child's request can fail is if something conflicting with AccessShareLock is queued up behind the parent's lock. So failure to get the child lock immediately proves that the deadlock case applies. regards, tom lane
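At the SQL level, the child's nowait acquisition could look roughly like this (an illustrative sketch only; the real implementation would live in pg_dump's C code):

```sql
-- pg_dump child, immediately before dumping table foo:
BEGIN;
LOCK TABLE foo IN ACCESS SHARE MODE NOWAIT;
-- Success: nothing conflicting is queued behind the parent's
-- AccessShareLock, so the child can safely dump foo.
-- Failure ("could not obtain lock"): a conflicting request is queued
-- behind the parent's lock -- exactly the undetectable-deadlock case --
-- so the child reports an error instead of blocking forever.
```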
Re: [HACKERS] should we set hint bits without dirtying the page?
Robert Haas robertmh...@gmail.com writes: I then got to wondering whether we should even go a step further, and simply decree that a page with only hint bit updates is not dirty and won't be written, period. This sort of thing has been discussed before. It seems fairly clear to me that any of these variations represents a performance tradeoff: some cases will get better and some will get worse. I think we are not going to get far unless we can agree on a set of benchmark cases that we'll use to decide whether the tradeoff is a win or not. How can we arrive at that? regards, tom lane
Re: [HACKERS] WIP patch for parallel pg_dump
On 12/02/2010 09:41 PM, Tom Lane wrote: Andrew Dunstan and...@dunslane.net writes: On 12/02/2010 09:09 PM, Tom Lane wrote: Now, process 3 is blocked behind process 2, which is blocked behind process 1, which is waiting for 3 to complete. Can you say undetectable deadlock? Hmm. Yeah. Maybe we could get around it if we prefork the workers and they all acquire locks on everything to be dumped up front in nowait mode, right after the parent, and if they can't, the whole dump fails. Or something along those lines. [ thinks for a bit... ] Actually it might be good enough if a child simply takes the lock it needs in nowait mode, and reports failure on error. We know the parent already has that lock, so the only way that the child's request can fail is if something conflicting with AccessShareLock is queued up behind the parent's lock. So failure to get the child lock immediately proves that the deadlock case applies. Yeah, that would be a whole lot simpler. It would avoid the deadlock, but it would have lots more chances for failure. But it would at least be a good place to start. cheers andrew
Re: [HACKERS] Author names in source files
Itagaki Takahiro wrote: I found that there are author names in some of the source files. Will we keep their names? And will we add the names of new authors who write patches for those files? src/backend/postmaster/syslogger.c(15) : * Author: Andreas Pflug src/backend/utils/adt/genfile.c(9) : * Author: Andreas Pflug src/backend/utils/adt/oracle_compat.c(7): * Author: Edmund Mergl src/backend/utils/adt/txid.c(14): * Author: Jan Wieck, Afilias USA INC. When someone writes a file from scratch, we usually keep their name on it so we remember the original author. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [HACKERS] unlogged tables
2nd) I can get the data to stick around after restart, though not reliably. In general: create and fill a table, vacuum it (not sure if it's important; I do it because that's what I'd done in my pgbench testing where I noticed the data stuck around), wait an hour (I usually left it for 12-24 hours, but recreated it with as little as half an hour), then restart pg. Sometimes the data is there... sometimes not. I also filled my table with more data than memory would hold so it would spill to disk, again, because it recreates my pgbench setup. I'm still working on finding the exact steps, but I wanted to get you #1 above. -Andy Ok, forget the time thing. It has nothing to do with it. (Which everyone already assumed, I imagine.) It's truncate. Create an unlogged table, fill it, truncate it, fill it again, restart pg, and the data will still be there. -Andy
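Andy's recipe boils down to the following sequence (a sketch using the CREATE UNLOGGED TABLE syntax from the patch under discussion; table name and row counts are illustrative):

```sql
CREATE UNLOGGED TABLE t (x int);
INSERT INTO t SELECT generate_series(1, 100000);
TRUNCATE t;                                -- gives the table a new relfilenode
INSERT INTO t SELECT generate_series(1, 100000);
-- After a server restart, the rows are still there, even though an
-- unlogged table is supposed to come back empty.
```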
Re: [HACKERS] Author names in source files
Excerpts from Bruce Momjian's message of Fri Dec 03 00:15:28 -0300 2010: Itagaki Takahiro wrote: I found that there are author names in some of the source files. Will we keep their names? And will we add the names of new authors who write patches for those files? src/backend/postmaster/syslogger.c(15) : * Author: Andreas Pflug src/backend/utils/adt/genfile.c(9) : * Author: Andreas Pflug src/backend/utils/adt/oracle_compat.c(7): * Author: Edmund Mergl src/backend/utils/adt/txid.c(14): * Author: Jan Wieck, Afilias USA INC. When someone writes a file from scratch, we usually keep their name on it so we remember the original author. The reason Takahiro-san is asking the question is that one of Dimitri's patches adds a function to genfile.c, and adds his name to the Author field while at it. So answering the second part of the question is important. I take it your answer to that is no. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] unlogged tables
Excerpts from Andy Colson's message of Fri Dec 03 00:37:17 -0300 2010: Ok, forget the time thing. It has nothing to do with it. (Which everyone already assumed, I imagine.) It's truncate. Create an unlogged table, fill it, truncate it, fill it again, restart pg, and the data will still be there. Hmm, presumably the table rewrite thing in truncate is not preserving the unlogged state (perhaps it's the swap-relfilenode business). Does CLUSTER have a similar effect? What about VACUUM FULL? If so, you know where the bug is. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] Author names in source files
On 12/02/2010 10:15 PM, Bruce Momjian wrote: Itagaki Takahiro wrote: I found that there are author names in some of the source files. Will we keep their names? And will we add the names of new authors who write patches for those files? src/backend/postmaster/syslogger.c(15) : * Author: Andreas Pflug src/backend/utils/adt/genfile.c(9) : * Author: Andreas Pflug src/backend/utils/adt/oracle_compat.c(7): * Author: Edmund Mergl src/backend/utils/adt/txid.c(14): * Author: Jan Wieck, Afilias USA INC. When someone writes a file from scratch, we usually keep their name on it so we remember the original author. That's probably reasonable for a while. But the code drifts. I certainly wouldn't object to my name being removed from initdb.c, after 7 years, although probably half the code or more is still mine. cheers andrew
Re: [HACKERS] Author names in source files
Alvaro Herrera alvhe...@commandprompt.com writes: Excerpts from Bruce Momjian's message of Fri Dec 03 00:15:28 -0300 2010: When someone writes a file from scratch, we usually keep their name on it so we remember the original author. The reason Takahiro-san is asking the question is that one of Dimitri's patches adds a function to genfile.c, and adds his name to the Author field while at it. So answering the second part of the question is important. I take it your answer to that is no. If everybody who had added any code to a file got to put their name on it, the author lists would be ridiculously long (and some of us would have our names plastered in hundreds of places). My answer is no, and I'm not sure those things should be there in the first place. regards, tom lane
Re: [HACKERS] Author names in source files
On Thu, Dec 2, 2010 at 10:57 PM, Tom Lane t...@sss.pgh.pa.us wrote: [...] My answer is no, and I'm not sure those things should be there in the first place. I agree. After all, the committers give the appropriate credit in the commit message, so if we want history, it's there. -- Jaime Casanova www.2ndQuadrant.com Professional PostgreSQL: Soporte y capacitación de PostgreSQL
Re: [HACKERS] Author names in source files
On Fri, Dec 3, 2010 at 13:28, Jaime Casanova ja...@2ndquadrant.com wrote: On Thu, Dec 2, 2010 at 10:57 PM, Tom Lane t...@sss.pgh.pa.us wrote: [...] My answer is no, and I'm not sure those things should be there in the first place. I agree. After all, the committers give the appropriate credit in the commit message, so if we want history, it's there. Hi, folks. Thanks for all of your replies. Dimitri, your name won't be added to the source files, but it will be recorded in the commit logs ;-) -- Itagaki Takahiro
Re: [HACKERS] WIP patch for parallel pg_dump
On Thu, Dec 2, 2010 at 9:33 PM, Tom Lane t...@sss.pgh.pa.us wrote: In particular, this issue *has* been discussed before, and there was a consensus that preserving dump consistency was a requirement. I don't think that Joachim gets to bypass that decision just by submitting a patch that ignores it. I am not trying to bypass anything here :) Regarding the locking issue, I probably haven't done sufficient research; at least I managed to miss the emails that mentioned it. Anyway, that fortunately seems to be solved now; I'm going to implement your idea over the weekend. Regarding snapshot cloning and dump consistency, I brought this up several months ago already and asked if the feature is considered useful even without snapshot cloning. And actually it was you who motivated me to work on it even without having snapshot consistency... http://archives.postgresql.org/pgsql-hackers/2010-03/msg01181.php In my patch pg_dump emits a warning when called with -j; if you feel better with an extra option --i-know-that-i-have-no-synchronized-snapshots, fine with me :-) In the end we provide a tool with limitations; it might not serve all use cases, but there are use cases that would benefit a lot. I personally think this is better than providing no tool at all... Joachim
Re: [HACKERS] should we set hint bits without dirtying the page?
On 03.12.2010 04:54, Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: I then got to wondering whether we should even go a step further, and simply decree that a page with only hint bit updates is not dirty and won't be written, period. This sort of thing has been discussed before. It seems fairly clear to me that any of these variations represents a performance tradeoff: some cases will get better and some will get worse. I think we are not going to get far unless we can agree on a set of benchmark cases that we'll use to decide whether the tradeoff is a win or not. How can we arrive at that? It's pretty easy to come up with a test case where that would be a win. I'd like to see some benchmark results of the worst case, to see how much loss we're talking about at most. Robert described the worst case: Where it's a problem is when you have a huge table that you're scanning over and over again, especially if data in that table was loaded by many different, widely spaced XIDs that require looking at many different CLOG pages. I'd like to add to that: and the table is big enough not to fit in shared_buffers, but small enough to fit in the OS cache. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] pg_execute_from_file review
On Thu, Dec 2, 2010 at 20:00, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Please find attached the v8 version of the patch, that fixes the following: I fixed and cleaned up some of the code in it; v9 patch attached. Please check my modifications, and set the status to Ready for Committer if you find no problems. I think the documentation and code comments might need to be checked by native English speakers. Well, thinking about it, omitting the length parameter altogether seems like the more natural SQL-level API to me, so I've made it happen: Good idea. I re-added negative length checks in the pg_read_file functions; negative length is used internally, but not exposed in the SQL functions. BTW, I think we can call it just replace The same idea occurred to me last night when reading through the patch to send. It's now done in the way you can see above. The idea is not to change the existing behavior at all, so I've not changed the non-VARIADIC version of the function. You added replace(text, text, text, VARIADIC text), but I think replace(text, VARIADIC text) also works. If we have the short form, we can use it easily from the execute functions with placeholders. Other changes: * Added some regression tests. * Int64GetDatum((int64) fst.st_size) was broken. * An error check for could not read file didn't work. * Read file contents into the bytea buffer directly to avoid memcpy. * Fixed bad usages of text and bytea values because they are not null-terminated. * I don't want to expose ArrayType in builtins.h, so I call replace_text_variadic() from the execute functions. * pg_execute_sql_file(path, NULL) won't work because it's a STRICT function; it returns NULL and does no work when any of the arguments is NULL. BTW, we have many text from/to cstring conversions in the new code. It's not an item for now, but we would need to improve them if those functions are heavily used, especially replace_text_variadic().
* We might rename pg_convert_and_execute_sql_file() to pg_execute_sql_file_with_encoding() or so to have the same prefix. Well, I think I prefer reading the verbs in the order that things happen in the code; it's actually convert then execute. But again, maybe some native speaker will have a say here, or maybe your proposed name fits better in PostgreSQL. I'd leave it to the committer :) Agreed. I also prefer pg_read_file_all rather than pg_read_whole_file :P -- Itagaki Takahiro