Re: [HACKERS] Companies Contributing to Open Source
On Tue, Dec 19, 2006 at 07:17:13PM -0800, Joshua D. Drake wrote: On Tue, 2006-12-19 at 22:04 -0500, Tom Lane wrote: Joshua D. Drake [EMAIL PROTECTED] writes: I remember the president of Great Bridge saying that the company needs the community, but not visa-vera --- if the company dies, the community keeps going (as it did after Great Bridge, without a hickup), but if the community dies, the company dies too. I 95% agree here. If EDB or CMD were go to down in flames, it could hurt the community quite a bit. Josh, I hate to burst your bubble, but Great Bridge employed a much larger fraction of the hacker-community-at-the-time than either EDB or CMD do today. We survived that, and if EDB or CMD or Greenplum or the entire lot went down tomorrow, we'd survive that too. I never once suggested that the community would not survive. I said it would hurt productivity for n amount of time. Let's just be realistic here: In one fails swoop: Devrim, Alvaro, Darcy, Heikki, Bruce, Simon, Greg, Dave, Marc and I are all suddenly looking for employment... and by Friday, all are working again, unless they want to take a few weeks off. :) You don't think there would be an issue that could cause some grief to the community? Not really. It might cause some personal grief to each, which might cause some temporary loss of productivity. Then again, it might be a stimulating shakeup, or creative destruction, to borrow a phrase. Is it surmountable? Of course, that isn't the point. The point is that it is not painless. Nothing is painless. Cheers, D -- David Fetter [EMAIL PROTECTED] http://fetter.org/ phone: +1 415 235 3778AIM: dfetter666 Skype: davidfetter Remember to vote! ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Companies Contributing to Open Source
On Tue, Dec 19, 2006 at 05:40:12PM -0500, Bruce Momjian wrote: Lukas Kahwe Smith wrote: Hi, I think another point you need to bring out more clearily is that the community is also often miffed if they feel they have been left out of the design and testing phases. This is sometimes just a reflex that is not always based on technical reasoning. Its just that as you correctly point out are worried of being high-jacked by companies. I hate to mention an emotional community reaction in this document. You don't have to name it that if you don't want to, although respect (or at least a good simulation of it) is crucial when dealing with any person or group. Handing the community a /fait accompli/ is a great way to convey disrespect, no matter how well-meaning the process originally was. We normally just highlight the inefficiency of a company doing things on their own, and the wasted effort of them having to make adjustments. Would it really hurt to touch on some of the whys of this? Cheers, D -- David Fetter [EMAIL PROTECTED] http://fetter.org/ phone: +1 415 235 3778AIM: dfetter666 Skype: davidfetter Remember to vote! ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[HACKERS] Doc bug
The documentation at http://www.postgresql.org/docs/8.2/interactive/config-setting.html states that: Boolean values may be written as ON, OFF, TRUE, FALSE, YES, NO, 1, 0 (all case-insensitive) or any unambiguous prefix of these. But the following doesn't work: postgres=# set enable_seqscan = of; ERROR: parameter enable_seqscan requires a Boolean value postgres=# 'of' is an unambiguous prefix of OFF, but it clearly doesn't work. Is it the documentation that needs fix or is it the code? I tried the following too: set enable_seqscan = of; -- doesn't work set enable_seqscan = off; -- works BTW, I tried TR, TRU, FA, FAL, FALS, YE. They all work fine. Best regards, -- [EMAIL PROTECTED] [EMAIL PROTECTED] gmail | hotmail | yahoo }.com
Re: [HACKERS] [PATCHES] Enums patch v2
On Wed, Dec 20, 2006 at 01:39:58AM +, Tom Dunstan wrote: Not with this patch, and AFAIK not possible generally, without writing separate I/O functions for each type. I'd love to be able to do that, but I don't think it's possible currently. The main stumbling block is the output function (and cast-to-text function), because output functions do not get provided the oid of the type that they're dealing with, for security reasons IIRC. It was never clear to me why I/O functions should ever be directly callable by a user (and hence open to security issues), but apparently it was enough to purge any that were designed like that from the system, so I wasn't going to go down that road with the patch. I worked around this in taggedtypes by indeed creating seperate copies of the i/o functions on demand and at execution time looking up the required type from the function signiture. The only solution indeed is to change the calling convention if the I/O functions so that the relevent datatype oid stored in a safe place, that isn't set for normal function calls. BTW, being able to call type i/o functions directly is very useful. For example date_in(text_out(blah)) is a form of cast between types that don't usually have a cast. If you change the calling convention as indicated, that trick will still work, just not for types with the restricted i/o functions. Also, it's not just I/O functions that are the issue, consider the enum-to-integer cast. Have a nice day, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ From each according to his ability. To each according to his ability to litigate. signature.asc Description: Digital signature
Re: [HACKERS] Load distributed checkpoint
Hello, Itagaki-san, all I have to report a sad result. Your patch didn't work. Let's consider the solution together. What you are addressing is very important for the system designers in the real world -- smoothing response time. Recall that unpatched PostgreSQL showed the following tps's in case (1) (i.e. with default bgwriter_* and checkpoint_* settings.) 235 80 226 77 240 The patched PostgreSQL showed the following tps's: 230 228 77 209 66 [disk usage] The same tendency can be seen as with the unpatched PostgreSQL. That is: When the tps is low, the %util of disk for data files is high, and %util of disk for WAL is low. Why is transaction logging is disturbed by cleaning and/or syncing activity? While the bgwriter is fsync()ing, it does not lock any data structures that the transactions want to access. Even though they share the same SCSI controller and bus, they are different disks. The bandwidth does not appear to be exhausted, since Ultra320 is said to have 256MB band width in practice. (Recall that WAL is on sdd and data files are on sde.) Device:rrqm/s wrqm/s r/s w/s rsec/s wsec/srkB/swkB/s avgrq-sz avgqu-sz await svctm %util sdd 0.00 810.78 0.00 102.200.00 7306.99 0.00 3653.4971.50 1.12 10.95 7.32 74.77 sde 0.00 25.35 0.00 6.190.00 252.30 0.00 126.15 40.77 0.50 81.32 5.94 3.67 sdd 0.00 884.20 0.00 126.000.00 8080.00 0.00 4040.0064.13 1.26 10.00 7.11 89.64 sde 0.00 21.40 0.00 5.000.00 211.20 0.00 105.60 42.24 0.31 62.56 6.52 3.26 sdd 0.00 924.80 0.00 116.200.00 8326.40 0.00 4163.2071.66 1.23 10.59 7.37 85.64 sde 0.00 27.60 0.00 26.600.00 433.60 0.00 216.80 16.30 4.24 159.29 2.44 6.50 sdd 0.00 721.20 0.00 102.400.00 6588.80 0.00 3294.4064.34 0.999.71 7.07 72.40 sde 0.00 1446.80 0.00 101.600.00 20289.60 0.00 10144.80 199.70 1192.40 572.45 2.29 23.30 sdd 0.00 0.00 0.00 0.200.001.60 0.00 0.80 8.00 0.11 539.00 539.00 10.80 sde 0.00 0.00 0.00 452.100.000.00 0.00 0.00 0.00 3829.57 3715.83 2.22 100.22 sdd 0.00 0.00 0.00 0.000.000.00 0.00 0.00 0.00 0.000.00 0.00 0.00 sde 0.00 0.00 0.00 349.800.000.00 0.00 0.00 0.00 1745.52 8515.74 2.86 100.02 sdd 0.00 442.40 0.00 51.000.00 3948.80 0.00 1974.40 77.43 0.60 11.73 7.54 38.46 sde 0.00 2.80 0.00 184.000.00 25.60 0.00 12.80 0.14 277.52 12629.41 3.19 58.74 sdd 0.00 898.00 0.00 124.800.00 8182.40 0.00 4091.2065.56 1.30 10.40 7.24 90.30 sde 0.00 19.20 0.00 3.800.00 184.00 0.0092.00 48.42 0.24 62.11 14.11 5.36 sdd 0.00 842.28 0.00 109.020.00 7612.02 0.00 3806.0169.82 1.33 12.26 8.35 91.02 sde 0.00 45.49 0.00 46.890.00 739.08 0.00 369.54 15.76 9.04 192.73 3.38 15.85 sdd 0.00 1198.41 0.00 71.510.00 10505.18 0.00 5252.59 146.90 128.19 99.76 13.48 96.43 sde 0.00 1357.77 0.00 199.800.00 19263.75 0.00 9631.8796.41 2251.09 1179.42 2.39 47.81 sdd 0.00 0.00 0.00 7.200.000.00 0.00 0.00 0.00 203.87 5671.83 138.92 100.02 sde 0.00 0.00 0.00 409.600.000.00 0.00 0.00 0.00 3171.04 4779.83 2.44 100.02 sdd 0.00 0.00 0.00 17.800.000.00 0.00 0.00 0.00 137.87 10240.90 56.19 100.02 sde 0.00 0.00 0.00 240.600.000.00 0.00 0.00 0.00 1573.85 9815.29 4.16 100.02 sdd 0.00 109.80 0.00 35.400.00 1012.80 0.00 506.40 28.6142.14 7974.47 27.86 98.64 sde 0.00 2.80 0.00 198.800.00 30.40 0.00 15.20 0.15 428.49 14474.39 4.30 85.56 sdd 0.00 466.20 0.00 62.800.00 4230.40 0.00 2115.20 67.36 0.599.49 6.79 42.62 sde 0.00 5.20 0.00 0.800.00 48.00 0.0024.00 60.00 0.01 16.25 11.25 0.90 sdd 0.00 0.00 0.00 0.200.001.60 0.00 0.80 8.00 0.01 35.00 35.00 0.70 sde 0.00 0.00 0.00 0.000.000.00 0.00 0.00 0.00 0.000.00 0.00 0.00 I suspect that fsync() is the criminal as I've been afraid. I'll show you an interesting data. I collected the stack traces of backend processes while a checkpoint is happening. [bgwriter] Oh, he is fsync()ing hard. #0 0x003a629bfbb2 in __fsync_nocancel () from /lib64/tls/libc.so.6 #1 0x005742a1 in mdsync () #2 0x005753d7 in smgrsync () #3 0x00564d65 in FlushBufferPool () ... [some backends] They are forced to wait for some lock pertaining to WAL when they try to insert a log record.
Re: [HACKERS] Load distributed checkpoint
On Wed, Dec 20, 2006 at 08:10:56PM +0900, Takayuki Tsunakawa wrote: One question is the disk utilization. While bgwriter is fsync()ing, %util of WAL disk drops to almost 0. But the the bandwidth of Ultra320 SCSI does not appear to be used fully. Any idea? That implies that fsyncing a datafile blocks fsyncing the WAL. That seems terribly unlikely (although...). What OS/Kernel/Filesystem is this. I note a sync bug in linux for ext3 that may have relevence. Have a nice day, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ From each according to his ability. To each according to his ability to litigate. signature.asc Description: Digital signature
Re: column ordering, was Re: [HACKERS] [PATCHES] Enums patch v2
Tom Lane wrote: Stephen Frost [EMAIL PROTECTED] writes: Force references to go through macros which implement the lookup for the appropriate type? ie: LOGICAL_COL(table_oid,2) vs. PHYSICAL_COL(table_oid,1) Perhaps that's too simplistic. It doesn't really address the question of how you know which one to use at any particular line of code; or even more to the point, what mechanism will warn you if you use the wrong one. My gut feeling about this is that we could probably enforce such a distinction if we were using C++, but while coding in C I have no confidence in it. (And no, that's not a vote to move to C++ ...) What about a comprimise... The 8.1 documentation for ALTER TABLE states the following. Adding a column with a non-null default or changing the type of an existing column will require the entire table to be rewritten. This may take a significant amount of time for a large table; and it will temporarily require double the disk space. Now, we are rewriting the table from scratch anyway, the on disk format is changing. What is stopping us from switching the column order at the same time. The only thing I can think is that the catalogs will need more work to update them. It's a middle sized price to pay for being able to reorder the columns in the table. One of the problems I have is wanting to add a column in the middle of the table, but FK constraints stop me dropping the table to do the reorder. If ALTER TABLE would let me stick it in the middle and rewrite the table on disk, I wouldn't care. It's likely that I would be rewriting the table anyway. And by specifying AT POSITION, or BEFORE/AFTER you know for big tables it's going to take a while. Not that I'm able to code this at all, but I'm interested in feedback on this option. Regards Russell Smith regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Fw: Fw: [HACKERS] choosing use an index or not
Hi, can you halp me? Thks, Felipe - Original Message - From: Martijn van Oosterhout kleptog@svana.org To: Felipe Rondon Rocha [EMAIL PROTECTED] Sent: Tuesday, December 19, 2006 8:39 AM Subject: Re: Fw: [HACKERS] choosing use an index or not No idea, ask on -hackers... I only know the concepts of how it works, I have no idea about the code itself. Have a nice day, On Tue, Dec 19, 2006 at 08:39:12AM -0300, Felipe Rondon Rocha wrote: Hi Martijn, my problem is that our group of study made a modification to always use the index on a select explain ... statement. But as a matter of fact we found some bugs that have to be fixed. I know how the choice of an index happens. What I need to find is the moment that PostgreSQL compares the costs of all trees (paths) and decide to use the index or not. Do you have any tips? Thank you, very much. Regards, Felipe - Original Message - From: Martijn van Oosterhout kleptog@svana.org To: Felipe Rondon Rocha [EMAIL PROTECTED] Cc: pgsql-hackers@postgresql.org Sent: Thursday, December 14, 2006 9:56 AM Subject: Re: [HACKERS] choosing use an index or not On Thu, Dec 14, 2006 at 10:06:45AM -0300, Felipe Rondon Rocha wrote: Hi everyone, Do you know in which part of the file analyze.c, PostgreSQL decides to use or not an index? I?m making a test and I can?t trace the moment that he makes the choice. It doesn't explicitly make the choice anywhere. The planner makes a set of paths that can be used. So for each table a path is made for a sequential scan and one or more paths for each index. For each a cost is calcualted and whichever has the lowest cost is the one chosen. It's quite dynamic, the choice also depends on what happens to the result. There's no point using an index if the result is in the wrong order for the join, for example. Have a nice day, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ From each according to his ability. To each according to his ability to litigate. -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ From each according to his ability. To each according to his ability to litigate. signature.asc Description: Binary data ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] psql: core dumped
On 19/12/06, Alvaro Herrera [EMAIL PROTECTED] wrote: I think the problem Mario is really trying to solve is quitting at psql's Password: prompt. Ctrl-C is ignored at that point apparently. SIGQUIT (thus Ctrl-\ in most people's setup) does it but it also dumps core. yes, that is true and also when you are is psql prompt too. But, I had no idea that a core dump is normal with a SIGQUIT signal. Thank you all for your time and patience; this was my first try with a postgres patch, I'll try with something else in other chance :-) -- http://www.advogato.org/person/mgonzalez/ ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] psql: core dumped
On 19/12/06, Andrew Dunstan [EMAIL PROTECTED] wrote: This normally a SIGQUIT, and on my machine at least the default action for that is a core dump. Perhaps you need to say what you are trying to do and why. I'd like to help :-) I wanted to avoid a core dumped but you told me that's a normal thing for a SIGQUIT signal. -- http://www.advogato.org/person/mgonzalez/ ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] [PATCHES] Load distributed checkpoint patch
From: ITAGAKI Takahiro [EMAIL PROTECTED] Bruce Momjian [EMAIL PROTECTED] wrote: Do you use the same delay autovacuum uses? What do you mean 'the same delay'? Autovacuum does VACUUM, not CHECKPOINT. If you think cost-based-delay, I think we cannot use it here. It's hard to estimate how much checkpoints delay by cost-based sleeping, but we should finish asynchronous checkpoints by the start of next checkpoint. So I gave priority to punctuality over load smoothing. I consider that smoothing the load (more meaningfully, response time) has higher priority over checkpoint punctuality in a practical sense, because the users of a system benefit from good steady response and give good reputation to the system. If the checkpoint processing is not punctual, crash recovery would take longer time. But which would you give higher priority, the unlikely event (=crash of the system) or likely event (=peek hours of the system)? I believe the latter should be regarded. The system can write dirty buffers after the peek hours pass. User experience should be taken much case of. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: Fw: Fw: [HACKERS] choosing use an index or not
As a starting point, you can look at these two functions: src/backend/optimizer/path/indxpath.c : find_usable_indexes() src/backend/optimizer/util/pathnode.c : add_path() Hope this helps. On 12/20/06, Felipe Rondon Rocha [EMAIL PROTECTED] wrote: Hi, can you halp me? Thks, Felipe - Original Message - From: Martijn van Oosterhout kleptog@svana.org To: Felipe Rondon Rocha [EMAIL PROTECTED] Sent: Tuesday, December 19, 2006 8:39 AM Subject: Re: Fw: [HACKERS] choosing use an index or not ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match -- [EMAIL PROTECTED] [EMAIL PROTECTED] gmail | hotmail | yahoo }.com
Re: column ordering, was Re: [HACKERS] [PATCHES] Enums patch v2
On Tue, Dec 19, 2006 at 11:29:24PM -0500, Tom Lane wrote: Stephen Frost [EMAIL PROTECTED] writes: Force references to go through macros which implement the lookup for the appropriate type? ie: LOGICAL_COL(table_oid,2) vs. PHYSICAL_COL(table_oid,1) Perhaps that's too simplistic. It doesn't really address the question of how you know which one to use at any particular line of code; or even more to the point, what mechanism will warn you if you use the wrong one. There's one method: Set it up so that when you create a table, it randomizes the order of the fields on disk. Obviously for production this isn't smart, but it would test the code a lot. Though in the regression tests many tables only have one column so they won't be affected. If we had unit tests you could create a function called heap_mangle_tuple which simply does physical reordering but logically does nothing and feed it in at each point to check the code is invarient. Another approach is to number logical columns starting at 1000. This would mean that at a glance you could tell what you're talking about. And code using the wrong one will do something obviously bad. If performance is an issue you could only enable the offset for --enable-assert builds. Personally I like this approach because it would encourage everyone to use the macro to access the fields, since not doing so will place a constant in an obvious place. It's also trivial for the system to check. Personally I'm unsure of the scope of the problem. AFAICS there's hardly anywhere that would use physical offsets... Have a nice day, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ From each according to his ability. To each according to his ability to litigate. signature.asc Description: Digital signature
Re: [HACKERS] Load distributed checkpoint
That implies that fsyncing a datafile blocks fsyncing the WAL. That seems terribly unlikely (although...). What OS/Kernel/Filesystem is this. I note a sync bug in linux for ext3 that may have relevence. Oh, really? What bug? I've heard that ext3 reports wrong data to iostat when it performs writes (the data is correct when performing reads.) My env is: OS: RHEL 4.0 for AMD64/EM64T kernel: 2.6.9-42.ELsmp The file system is ext3. Terribly unlikely? But I've seen the disk utilization quite often. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: column ordering, was Re: [HACKERS] [PATCHES] Enums patch v2
Am Mittwoch, 20. Dezember 2006 04:44 schrieb Tom Lane: If you can show me a reasonably bulletproof or machine-checkable way to keep the two kinds of column numbers distinct, I'd be all for it. The only way I can see is to make the domains of the numbers distinct. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] effective_cache_size vs units
Tom Lane wrote: (Hmm, I wonder what Tom Dunstan's enum patch does about case sensitivity...) Currently enum labels are case sensitive. I was a bit ambivalent about it... case insensitivity can lead to less surprises in some cases, but many programming languages that have enums are case sensitive, and so this wouldn't be a direct map for them. OTOH, if someone's doing evil things like sticking labels that differ only in case into an enum, perhaps they *should* be dissuaded. :) The question is where does it end, though? Should we treat letters with accents and umlauts as equivalent as well? Do we remove punctuation characters? It gets into a (for me) more murky localization issue, and I'm not familiar with the postgresql apis for handling that. Maybe it's easy. Since we basically accept any old thing into an enum label, I think we probably shouldn't muck with it. If we want to have some sort of normalized version, then we should probably restrict the characters that we accept fairly severely. Also note that enum values are far more likely to be set by application code than by a human typing the value in directly, so in that sense the need for case insensitivity seems somewhat diminished. I suppose we should think about mysql refugees at some point, though. I wonder what they do. The documentation is silent on the matter (and all their examples are in lower case). Mysql is generally case insensitive, right? Cheers Tom ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Interface for pg_autovacuum
Jim Nasby wrote: I'm teaching a class this week and a student asked me about OIDs. He related the story of how in Sybase, if you moved a database from one server from another, permissions got all screwed up because user IDs no longer matched. I explained that exposing something like an integer ID in a user interface or an API is just a bad idea and PostgreSQL doesn't do that. Then I got to pg_autovacuum So... is there any reason there isn't a prescribed interface to pg_autovacuum that doesn't expose vacrelid? Can we get that added to TODO? Wouldn't it be sufficient to change the type of vacrelid from oid to regclass? Then just dumping and restoring pg_autovacuum like any other table should Just Work. greetings, Florian Pflug ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: column ordering, was Re: [HACKERS] [PATCHES] Enums patch v2
On Wed, Dec 20, 2006 at 07:20:14AM -0600, Kenneth Marshall wrote: On Wed, Dec 20, 2006 at 01:26:59PM +0100, Peter Eisentraut wrote: Am Mittwoch, 20. Dezember 2006 04:44 schrieb Tom Lane: If you can show me a reasonably bulletproof or machine-checkable way to keep the two kinds of column numbers distinct, I'd be all for it. The only way I can see is to make the domains of the numbers distinct. Negative vs. positive numbers? Negative is used by system columns. Just adding some large constant (say 1) should be enough. Have a nice day, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ From each according to his ability. To each according to his ability to litigate. signature.asc Description: Digital signature
Re: [HACKERS] Load distributed checkpoint
On Wed, Dec 20, 2006 at 09:14:50PM +0900, Takayuki Tsunakawa wrote: That implies that fsyncing a datafile blocks fsyncing the WAL. That seems terribly unlikely (although...). What OS/Kernel/Filesystem is this. I note a sync bug in linux for ext3 that may have relevence. Oh, really? What bug? I've heard that ext3 reports wrong data to iostat when it performs writes (the data is correct when performing reads.) I was referring to this in the 2.6.6 changelog: http://www.linuxhq.com/kernel/changelog/v2.6/6/index.html ext3's fsync/fdatasync implementation is currently syncing the inode via a full journal commit even if it was unaltered. However you're running a later version so that's not it. Have a nice day, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ From each according to his ability. To each according to his ability to litigate. signature.asc Description: Digital signature
Re: column ordering, was Re: [HACKERS] [PATCHES] Enums patch v2
Am Mittwoch, 20. Dezember 2006 14:20 schrieb Kenneth Marshall: On Wed, Dec 20, 2006 at 01:26:59PM +0100, Peter Eisentraut wrote: Am Mittwoch, 20. Dezember 2006 04:44 schrieb Tom Lane: If you can show me a reasonably bulletproof or machine-checkable way to keep the two kinds of column numbers distinct, I'd be all for it. The only way I can see is to make the domains of the numbers distinct. Negative vs. positive numbers? That would be an obvious choice, but negative column numbers are already in use for system columns. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: column ordering, was Re: [HACKERS] [PATCHES] Enums patch v2
Martijn van Oosterhout wrote: On Wed, Dec 20, 2006 at 07:20:14AM -0600, Kenneth Marshall wrote: On Wed, Dec 20, 2006 at 01:26:59PM +0100, Peter Eisentraut wrote: Am Mittwoch, 20. Dezember 2006 04:44 schrieb Tom Lane: If you can show me a reasonably bulletproof or machine-checkable way to keep the two kinds of column numbers distinct, I'd be all for it. The only way I can see is to make the domains of the numbers distinct. Negative vs. positive numbers? Negative is used by system columns. Just adding some large constant (say 1) should be enough. Have a nice day, Or we could divide the positive number space in two, by starting at 2^14 (attnums are int2). Then a simple bitmask test would work to distinguish them. cheers andrew ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: column ordering, was Re: [HACKERS] [PATCHES] Enums patch v2
* Tom Lane ([EMAIL PROTECTED]) wrote: Stephen Frost [EMAIL PROTECTED] writes: Force references to go through macros which implement the lookup for the appropriate type? ie: LOGICAL_COL(table_oid,2) vs. PHYSICAL_COL(table_oid,1) Perhaps that's too simplistic. It doesn't really address the question of how you know which one to use at any particular line of code; or even more to the point, what mechanism will warn you if you use the wrong one. That'd be the point of doing the typing, you then declare functions as accepting the type and then if someone passes the wrong type to a function the compiler will complain. Inside of a particular function it would hopefully be easier to keep it clear. I'd think that most functions would deal with one type or the other (which would be declared in the arguments or in the local variables) and that functions which have to deal with both would be able to keep them straight. My gut feeling about this is that we could probably enforce such a distinction if we were using C++, but while coding in C I have no confidence in it. (And no, that's not a vote to move to C++ ...) I need to go research what Linux does for this because aiui it's pretty good about being able to enforce better type-checking than the stock C types. The only downside is that I *think* it might be a GCC-only thing. In that case I'd think we would still use it but build some macros which essentially disable it for non-GCC compilers. As a mainly-for-developers compile-time check I think as long as a build-farm member is running GCC and complaining when there are errors (and it can be disabled on non-GCC compilers) we won't lose any portability from it. Thanks, Stephen signature.asc Description: Digital signature
Re: column ordering, was Re: [HACKERS] [PATCHES] Enums patch v2
Russell Smith wrote: Tom Lane wrote: Stephen Frost [EMAIL PROTECTED] writes: Force references to go through macros which implement the lookup for the appropriate type? ie: LOGICAL_COL(table_oid,2) vs. PHYSICAL_COL(table_oid,1) Perhaps that's too simplistic. It doesn't really address the question of how you know which one to use at any particular line of code; or even more to the point, what mechanism will warn you if you use the wrong one. My gut feeling about this is that we could probably enforce such a distinction if we were using C++, but while coding in C I have no confidence in it. (And no, that's not a vote to move to C++ ...) What about a comprimise... The 8.1 documentation for ALTER TABLE states the following. Adding a column with a non-null default or changing the type of an existing column will require the entire table to be rewritten. This may take a significant amount of time for a large table; and it will temporarily require double the disk space. Now, we are rewriting the table from scratch anyway, the on disk format is changing. What is stopping us from switching the column order at the same time. The only thing I can think is that the catalogs will need more work to update them. It's a middle sized price to pay for being able to reorder the columns in the table. One of the problems I have is wanting to add a column in the middle of the table, but FK constraints stop me dropping the table to do the reorder. If ALTER TABLE would let me stick it in the middle and rewrite the table on disk, I wouldn't care. It's likely that I would be rewriting the table anyway. And by specifying AT POSITION, or BEFORE/AFTER you know for big tables it's going to take a while. This isn't really a compromise. Remember that this discussion started with consideration of optimal record layout (minimising space use by reducing or eliminating alignment padding). The above proposal really does nothing for that. cheers andrew ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] effective_cache_size vs units
On Tue, Dec 19, 2006 at 10:12:34PM +, Gregory Stark wrote: Tom Lane [EMAIL PROTECTED] writes: Magnus Hagander [EMAIL PROTECTED] writes: Oh, you mean MB vs Mb. Man, it had to be that simple :) ISTM we had discussed whether guc.c should accept units strings in a case-insensitive manner, and the forces of pedantry won the first round. Shall we reopen that argument? Nope, I just checked back in the archive and that's not what happened. There was an extended discussion about whether to force users to use the silly KiB, MiB, etc units. Thankfully the pedants lost that round soundly. There was no particular discussion about case sensitivity though Simon made the case for user-friendly behaviour: I think we are safe assume to that kB = KB = kb = Kb = 1024 bytes mB = MB = mb = Mb = 1024 * 1024 bytes gB = GB = gb = Gb = 1024 * 1024 * 1024 bytes There's no value in forcing the use of specific case and it will be just confusing for people. http://archives.postgresql.org/pgsql-hackers/2006-07/msg01253.php And Jim Nasby said something similar: Forcing people to use a specific casing scheme is just going to lead to confusion and user frustration. If there's not a very solid *functional* argument for it, we shouldn't do it. Wanting to enforce a convention that people rarely use isn't a good reason. http://archives.postgresql.org/pgsql-hackers/2006-07/msg01355.php There was a lone comment from Thomas Hallgren in favour of case sensitivity in the name of consistency. But Nasby's comment was directly in response and nobody else piped up after that. My one comment is that a little 'b' is used to indicate bits normally and a capital 'B' is used to indicate bytes. So kb = '1024 bits' kB = '1024 bytes' ... I do think that whether or not the k/m/g is upper case or lower case is immaterial. Ken -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: column ordering, was Re: [HACKERS] [PATCHES] Enums patch v2
On Wed, Dec 20, 2006 at 01:26:59PM +0100, Peter Eisentraut wrote: Am Mittwoch, 20. Dezember 2006 04:44 schrieb Tom Lane: If you can show me a reasonably bulletproof or machine-checkable way to keep the two kinds of column numbers distinct, I'd be all for it. The only way I can see is to make the domains of the numbers distinct. Negative vs. positive numbers? Ken ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
VS: [HACKERS] Companies Contributing to Open Source
Jonah Harris wrote: We could also mention all the Ingres-based offshoots that were commercial. Let me think of some other examples... but there may not be that many seeing as there aren't really that many PostgreSQL-like communities. I guess I could mention more if I had a clear understanding of what we mean when we say, community. That's something that bothers me as well: The article and the discussions talks about the community, but what's The Community? There really isn't any clear definition, the closest thing is the list of committers or core members, but I don't think anyone considers The Community to be just the committers, though that's the group of people whose opinions matter the most when trying to get a patch accepted. I don't think it's fruitful to spend time on a precise definition, but it's important to realize that there isn't one and that the community consists of individuals with different priorities, opinions and points of view. Therefore it's a bit meaningless to say that The Community thinks this or The Community says that. On one topic, some people might have a very strong opinion one way, and others might just not care at all. On some topics, everyone agrees. And on some topics, people strongly disagree. And that's ok. Respecting all the different viewpoints leads to a well-balanced product. How does that affect a company trying to get a patch accepted? First, do no harm. If you're proposing something that for example brakes someone else's application, your proposal is likely to be rejected. Or if you're proposing a patch that increases the performance of something, at a very high cost on some other things, your patch is likely to be rejected. Another kind of harm that many people miss is the maintainability of the codebase. Adding complexity for little gain is likely to be rejected, just because it'll make the code harder to read. Secondly, getting a large feature accepted is easier if you're not just dumping a large patch to pgsql-patches, but you're committed to maintainting it and developing it further. Remember, some things you might have ignored as not important might be crucial to other people. Also, a note to all Members of The Community: people like to work in different ways. Some might want to seek acceptance and commitment to a feature from others before starting development. Some might want to write a large up-front design document before proposing something. Some might want to write an experimental patch with a lot of quick hacks and no comments, and refine that according to feedback. And we, The Members of The Community, if I may count myself as one, don't get to choose how others prefer to work. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: column ordering, was Re: [HACKERS] [PATCHES] Enums patch v2
* Andrew Dunstan ([EMAIL PROTECTED]) wrote: This isn't really a compromise. Remember that this discussion started with consideration of optimal record layout (minimising space use by reducing or eliminating alignment padding). The above proposal really does nothing for that. While I agree that's how the discussion started the column ordering issue can stand on its own and any proposal which provides that feature should be considered. I don't think we should throw out the rewrite-the-table idea because it doesn't solve other problems. Thanks, Stephen signature.asc Description: Digital signature
Re: column ordering, was Re: [HACKERS] [PATCHES] Enums patch v2
Andrew Dunstan wrote: Or we could divide the positive number space in two, by starting at 2^14 (attnums are int2). Then a simple bitmask test would work to distinguish them. Perhaps divide-by-four, then it would be possible to have calculated columns (as mentioned recently on one of the lists). In particular, that would let you have FK constraints with a constant as part of the key. -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] [PATCHES] Load distributed checkpoint patch
ITAGAKI Takahiro wrote: Bruce Momjian [EMAIL PROTECTED] wrote: OK, if I understand correctly, instead of doing a buffer scan, write(), and fsync(), and recyle the WAL files at checkpoint time, you delay the scan/write part with the some delay. Exactly. Actual behavior of checkpoint is not changed by the patch. Compared with existing checkpoints, it just takes longer time in scan/write part. Do you use the same delay autovacuum uses? Sorry, I meant bgwriter delay, not autovauum. What do you mean 'the same delay'? Autovacuum does VACUUM, not CHECKPOINT. If you think cost-based-delay, I think we cannot use it here. It's hard to estimate how much checkpoints delay by cost-based sleeping, but we should finish asynchronous checkpoints by the start of next checkpoint. So I gave priority to punctuality over load smoothing. OK. As I remember, often the checkpoint is caused because we are using the last WAL file. Doesn't this delay the creation of new WAL files by renaming the old ones to higher numbers (we can't rename them until the checkpoint is complete)? Checkpoints should be done by the next one, so we need WAL files for two checkpoints. It is the same as now. Ah, OK, so we already reserve a full set of WAL files while we are waiting for the checkpoint to complete. -- Bruce Momjian [EMAIL PROTECTED] EnterpriseDBhttp://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] effective_cache_size vs units
Am Mittwoch, 20. Dezember 2006 13:42 schrieb Tom Dunstan: I suppose we should think about mysql refugees at some point, though. I wonder what they do. The documentation is silent on the matter (and all their examples are in lower case). Mysql is generally case insensitive, right? Maybe you can make sense of this, but I can't ... mysql create table test (a int, b enum ('x', 'X', 'y')); Query OK, 0 rows affected, 1 warning (0.02 sec) mysql show warnings; +---+--+-+ | Level | Code | Message | +---+--+-+ | Note | 1291 | Column 'b' has duplicated value 'x' in ENUM | +---+--+-+ 1 row in set (0.00 sec) mysql insert into test values (1, 'x'); Query OK, 1 row affected (0.00 sec) mysql insert into test values (1, 'X'); Query OK, 1 row affected (0.00 sec) mysql insert into test values (1, 'y'); Query OK, 1 row affected (0.00 sec) mysql insert into test values (1, 'Y'); Query OK, 1 row affected (0.00 sec) mysql insert into test values (1, 'z'); Query OK, 1 row affected, 1 warning (0.00 sec) ## You think that was funny -- now watch this: mysql select * from test; +--+--+ | a| b| +--+--+ |1 | x| |1 | x| |1 | y| |1 | y| |1 | | +--+--+ 5 rows in set (0.00 sec) mysql drop table test; Query OK, 0 rows affected (0.00 sec) mysql create table test (a int, b enum ('ä', 'Ä', ' ', ' ')); ## Above is a-diaeresis, A-diaeresis. Query OK, 0 rows affected, 2 warnings (0.00 sec) mysql show warnings; +---+--+-+ | Level | Code | Message | +---+--+-+ | Note | 1291 | Column 'b' has duplicated value '?' in ENUM | # literal ? | Note | 1291 | Column 'b' has duplicated value '' in ENUM | +---+--+-+ 2 rows in set (0.00 sec) mysql insert into test values (1, ' '); Query OK, 1 row affected (0.00 sec) mysql insert into test values (1, ' '); Query OK, 1 row affected (0.00 sec) mysql insert into test values (1, ' '); Query OK, 1 row affected (0.01 sec) mysql insert into test values (1, ' |'); Query OK, 1 row affected, 1 warning (0.00 sec) mysql show warnings; +-+--++ | Level | Code | Message| +-+--++ | Warning | 1265 | Data truncated for column 'b' at row 1 | +-+--++ 1 row in set (0.00 sec) mysql select distinct * from test; +--+--+ | a| b| +--+--+ |1 | ä| |1 | | |1 | | +--+--+ Better not imitate that. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] [PATCHES] Bundle of patches
Perhaps an array of int4 would be better? How much Done http://www.sigaev.ru/misc/user_defined_typmod-0.9.gz The patch needs more cleanup before applying, too, eg make comments match code, get rid of unused keywords added to gram.y. Cleaned. -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/ ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] [PATCHES] Bundle of patches
Teodor Sigaev [EMAIL PROTECTED] writes: Perhaps an array of int4 would be better? How much Done http://www.sigaev.ru/misc/user_defined_typmod-0.9.gz The patch needs more cleanup before applying, too, eg make comments match code, get rid of unused keywords added to gram.y. Cleaned. OK. I'm up to my rear in the opclass/opfamily rewrite, but will take another look at the typmod code as soon as I have a working HEAD again ;-) regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] [PATCHES] Enums patch v2
Martijn van Oosterhout wrote: Also, it's not just I/O functions that are the issue, consider the enum-to-integer cast. er, what cast? :-) IIRC Tom hasn't provided one. If you don't break the enum abstraction there should be no need for one, and given the implementation it's not quite trivial - probably the best way if this is needed would be to precalculate it at type creation time and store the value in an extra column in pg_enum. cheers andrew ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Companies Contributing to Open Source
David Fetter wrote: On Tue, Dec 19, 2006 at 05:40:12PM -0500, Bruce Momjian wrote: Lukas Kahwe Smith wrote: Hi, I think another point you need to bring out more clearily is that the community is also often miffed if they feel they have been left out of the design and testing phases. This is sometimes just a reflex that is not always based on technical reasoning. Its just that as you correctly point out are worried of being high-jacked by companies. I hate to mention an emotional community reaction in this document. You don't have to name it that if you don't want to, although respect (or at least a good simulation of it) is crucial when dealing with any person or group. Handing the community a /fait accompli/ is a great way to convey disrespect, no matter how well-meaning the process originally was. We normally just highlight the inefficiency of a company doing things on their own, and the wasted effort of them having to make adjustments. Would it really hurt to touch on some of the whys of this? Sure, I mention that once they deal with the community, it might requires extensive rewriting: This means the employee is not benefitting from community oversight and suggestions, often leading to a major rewrite when a patch is submitted to the community. -- Bruce Momjian [EMAIL PROTECTED] EnterpriseDBhttp://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: column ordering, was Re: [HACKERS] [PATCHES] Enums patch
Andrew Dunstan wrote: Now, we are rewriting the table from scratch anyway, the on disk format is changing. What is stopping us from switching the column order at the same time. The only thing I can think is that the catalogs will need more work to update them. It's a middle sized price to pay for being able to reorder the columns in the table. One of the problems I have is wanting to add a column in the middle of the table, but FK constraints stop me dropping the table to do the reorder. If ALTER TABLE would let me stick it in the middle and rewrite the table on disk, I wouldn't care. It's likely that I would be rewriting the table anyway. And by specifying AT POSITION, or BEFORE/AFTER you know for big tables it's going to take a while. This isn't really a compromise. Remember that this discussion started with consideration of optimal record layout (minimising space use by reducing or eliminating alignment padding). The above proposal really does nothing for that. I assume space waste will be mostly fixed when we have 0/1 byte headers for varlena data types. -- Bruce Momjian [EMAIL PROTECTED] EnterpriseDBhttp://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: column ordering, was Re: [HACKERS] [PATCHES] Enums patch v2
Bruce Momjian [EMAIL PROTECTED] writes: I assume space waste will be mostly fixed when we have 0/1 byte headers for varlena data types. Hardly. int float timestamp etc types will all still have alignment issues. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] effective_cache_size vs units
Peter Eisentraut wrote: Am Mittwoch, 20. Dezember 2006 13:42 schrieb Tom Dunstan: I suppose we should think about mysql refugees at some point, though. I wonder what they do. The documentation is silent on the matter (and all their examples are in lower case). Mysql is generally case insensitive, right? Maybe you can make sense of this, but I can't ... [lots of amusing non-orthogonal braindead stuff ...] Better not imitate that. The MySQL treatment of enums is generally quite reprehensible. The proposed patch by contrast fits quite well into our existing type system, I think. MySQL users migrating will have a bit of work to do. I don't think their experience has much to teach us (except how not to do enums). We should decide on case sensitivity without having reference to it. cheers andrew ---(end of broadcast)--- TIP 6: explain analyze is your friend
[HACKERS] Stats Collector Oddity
We're getting a bit of an anomaly relating to pg_stat_activity... oxrstld=# SELECT * from pg_stat_activity where current_query 'IDLE'; datid| datname | procpid | usesysid | usename | current_query | query_start +-+-+--+-+---+--- 1347729970 | oxrstld | 893094 | 122 | tldepp | commit| 2006-12-16 19:34:08.583978+00 (1 row) oxrstld=# select version(); version -- PostgreSQL 7.4.12 on powerpc-ibm-aix5.3.0.0, compiled by GCC gcc (GCC) 4.0.1 (1 row) That PID has been dead for several days, but this connection is marked as being open, still, after lo many days. This has *traditionally* been a sign that the stats collector has been clobbered, at which point pg_stat_activity becomes useless. That's not the case here; a less restricted query on pg_stat_activity shows other more recent data that keeps changing, seemingly consistent with system activity. I can track most of the lifecycle of that PID; the connection was established at 2006-12-16 18:46:38, and I see a few errors associated with the PID (violations of unique constraints; usual business stuff). I see nothing in the logs at around 19:34 which would suggest a reason for any strange behaviour. There are no records about that PID after 19:34... I'd like to get rid of this entry, if I can; it's blowing up tests that warn us about elderly transactions, causing a false positive... -- output = reverse(moc.enworbbc @ enworbbc) http://www3.sympatico.ca/cbbrowne/ Rules of the Evil Overlord #212. I will not send out battalions composed wholly of robots or skeletons against heroes who have qualms about killing living beings. http://www.eviloverlord.com/ ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Release 8.2.0 done, 8.3 development starts
I'm not sure, but as far as I remember, it will be a short release cycle for 8.3 in order to finish some big items that couldn't be ready in time for 8.2. But which items are more or less expected for 8.3? I recall - Hierarchical Queries - On disk bitmap index - Clustered/replication solutions being discussed. What are on people's minds? -- Med venlig hilsen Kaare Rasmussen, Jasonic Jasonic Telefon: +45 3816 2582 Nordre Fasanvej 12 2000 Frederiksberg Email: [EMAIL PROTECTED] ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: VS: [HACKERS] Companies Contributing to Open Source
Heikki Linnakangas wrote: Jonah Harris wrote: We could also mention all the Ingres-based offshoots that were commercial. Let me think of some other examples... but there may not be that many seeing as there aren't really that many PostgreSQL-like communities. I guess I could mention more if I had a clear understanding of what we mean when we say, community. That's something that bothers me as well: The article and the discussions talks about the community, but what's The Community? There really isn't any clear definition, the closest thing is the list of committers or core members, but I don't think anyone considers The Community to be just the committers, though that's the group of people whose opinions matter the most when trying to get a patch accepted. I Not really. The committers/core just weigh in on the patch, but it is the general discussion that determines if a patch gets in. Now, if all committers don't want a patch, it is unlikely it will be applied, but in most cases some committers want something, and some don't, and the discussion determines if it gets in or not, perhaps with modification so all community members are happy. The remainder of this email talks about general principles for all developers, not just for company employees. If we need more text in this area, it should be added to the developer's FAQ. Feel free to suggest additions to that. I am kind of stumped that we have so much text in the developer's FAQ, but it seems many people don't know what is in there. The Developer's FAQ is referenced in the main FAQ, and on the web site. I am inclinded to think that developers know the contents of the developer's FAQ, but think they know better, and go ahead and do what they want anyway. I have no idea how to fix that. :-( --- don't think it's fruitful to spend time on a precise definition, but it's important to realize that there isn't one and that the community consists of individuals with different priorities, opinions and points of view. Therefore it's a bit meaningless to say that The Community thinks this or The Community says that. On one topic, some people might have a very strong opinion one way, and others might just not care at all. On some topics, everyone agrees. And on some topics, people strongly disagree. And that's ok. Respecting all the different viewpoints leads to a well-balanced product. How does that affect a company trying to get a patch accepted? First, do no harm. If you're proposing something that for example brakes someone else's application, your proposal is likely to be rejected. Or if you're proposing a patch that increases the performance of something, at a very high cost on some other things, your patch is likely to be rejected. Another kind of harm that many people miss is the maintainability of the codebase. Adding complexity for little gain is likely to be rejected, just because it'll make the code harder to read. Secondly, getting a large feature accepted is easier if you're not just dumping a large patch to pgsql-patches, but you're committed to maintainting it and developing it further. Remember, some things you might have ignored as not important might be crucial to other people. Also, a note to all Members of The Community: people like to work in different ways. Some might want to seek acceptance and commitment to a feature from others before starting development. Some might want to write a large up-front design document before proposing something. Some might want to write an experimental patch with a lot of quick hacks and no comments, and refine that according to feedback. And we, The Members of The Community, if I may count myself as one, don't get to choose how others prefer to work. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Bruce Momjian [EMAIL PROTECTED] EnterpriseDBhttp://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: VS: [HACKERS] Companies Contributing to Open Source
The remainder of this email talks about general principles for all developers, not just for company employees. If we need more text in this area, it should be added to the developer's FAQ. Feel free to suggest additions to that. I am kind of stumped that we have so much text in the developer's FAQ, but it seems many people don't know what is in there. The Developer's FAQ is referenced in the main FAQ, and on the web site. I am inclinded to think that developers know the contents of the developer's FAQ, but think they know better, and go ahead and do what they want anyway. I have no idea how to fix that. :-( We don't accept patches from anyone who doesn't follow the developer FAQ rules. Which means of course, that I need to go read it ;) Sincerely, Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[HACKERS] tsearch in core patch, for review
We (Oleg and me) are glad to present tsearch2 in core of pgsql patch. In basic, layout, functions, methods, types etc are the same as in current tsearch2 with a lot of improvements: - pg_ts_* tables now are in pg_catalog - parsers, dictionaries, configurations now have owner and namespace similar to other pgsql's objects like tables, operator classes etc - current tsearch configuration is managed with a help of GUC variable tsearch_conf_name. - choosing of tsearch cfg by locale may be done for each schema separately - managing of tsearch configuration with a help of SQL commands, not with insert/update/delete statements. This allows to drive dependencies, correct dumping and dropping. - psql support with a help of \dF* commands - add all available Snowball stemmers and corresponding configuration - correct memory freeing by any dictionary Work is sponsored by EnterpriseDB's PostgreSQL Development Fund patch: http://www.sigaev.ru/misc/tsearch_core-0.27.gz Sorry, but documentation is under heavy development now and temporary placed at http://mira.sai.msu.su/~megera/pgsql/ftsdoc/ (changes quickly :) ) So, below there is a short overview of syntax. Comments, suggestions, objections, questions will be appreciated. We are planning to complete the patch and suggest to commit in HEAD soon. SQL syntax: CREATE FULLTEXT PARSER prsname ( START = funcname, GETTOKEN = funcname, END = funcname, LEXTYPES = funcname [ , HEADLINE = funcname ] ); CREATE FULLTEXT DICTIONARY dictname ( LEXIZE = funcname [ , INIT = funcname ] [ , OPT = text ] ); CREATE FULLTEXT DICTIONARY dictname [( { INIT = funcname | LEXIZE = funcname | OPT = text } [, ... ] )] LIKE template_dictname; ALTER FULLTEXT DICTIONARY dictname SET OPT=text; CREATE FULLTEXT CONFIGURATION cfgname ( PARSER = prsname [, LOCALE = localename] ) [AS DEFAULT]; CREATE FULLTEXT CONFIGURATION cfgname [( { LOCALE = localename | PARSER = prsname } [, ...] )] LIKE template_cfg [WITH MAP] [AS DEFAULT]; ALTER FULLTEXT CONFIGURATION cfgname SET { LOCALE=localename | PARSER=prsname } [, ...]; ALTER FULLTEXT CONFIGURATION cfgname SET AS DEFAULT; ALTER FULLTEXT CONFIGURATION cfgname DROP DEFAULT; CREATE FULLTEXT MAPPING ON cfgname FOR lexemetypename[, ...] WITH dictname1[, ...]; ALTER FULLTEXT MAPPING ON cfgname FOR lexemetypename[, ...] WITH dictname1[, ...]; DROP FULLTEXT MAPPING [IF EXISTS] ON cfgname FOR lexemetypename; DROP FULLTEXT { DICTIONARY | PARSER | CONFIGURATION } [IF EXISTS] objname [ CASCADE | RESTRICT ]; ALTER FULLTEXT { DICTIONARY | PARSER | CONFIGURATION } objname RENAME TO newobjname; ALTER FULLTEXT { DICTIONARY | PARSER | CONFIGURATION } objname OWNER TO newowner; COMMENT ON FULLTEXT { DICTIONARY | PARSER | CONFIGURATION } objname IS text; psql backslesh commands: \dF [PATTERN] list fulltext configurations (add + for more detail) \dFd [PATTERN] list fulltext dictionaries (add + for more detail) \dFp [PATTERN] list fulltext parsers (add + for more detail) -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/ ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
[HACKERS] Rare corruption of pg_class index
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 I'm encountering some disconcerting problems on a 8.1.3 database. Very occasionally, I get a could not open relation with OID xxx. This always occurs inside of a plpgsql function, and always refers to a normal, stable table that has not been dropped. The first time this occured, I reindexed the system indexes, but it has occured again, and on more than one database, which seems to lessen the chance of a hardware issue or a temporary index corruption issue. The functions in question are called many times, but the error only happens once in a blue moon. The last time it happened, the function ran with no problem one minute before the error, and again four minutes afterwards. The table name is hard-coded into the functions. This happens on average five times a day or so, on a very busy database ( 10M statements/day, the functions fire tens of thousands of times) Any clues on what could be causing this, or ways to go forward on debugging? Thanks. - -- Greg Sabino Mullane [EMAIL PROTECTED] PGP Key: 0x14964AC8 200612201252 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8 -BEGIN PGP SIGNATURE- iD8DBQFFiXmzvJuQZxSWSsgRAkr4AKDhTKTILjrVbitWTu4wzktejXD2egCfSk/H h+vBn6lJegKeRD+tsBFSYng= =svv7 -END PGP SIGNATURE- ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Rare corruption of pg_class index
On Wed, 2006-12-20 at 18:06 +, Greg Sabino Mullane wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 I'm encountering some disconcerting problems on a 8.1.3 database. Very occasionally, I get a could not open relation with OID xxx. This always occurs inside of a plpgsql function, and always refers to a normal, stable table that has not been dropped. The first time this occured, I reindexed the system indexes, but it has occured again, and on more than one database, which seems to lessen the chance of a hardware issue or a temporary index corruption issue. The functions in question are called many times, but the error only happens once in a blue moon. The last time it happened, the function ran with no problem one minute before the error, and again four minutes afterwards. The table name is hard-coded into the functions. This happens on average five times a day or so, on a very busy database ( 10M statements/day, the functions fire tens of thousands of times) It has to do with PL/pgSQL caching the plans of the statements inside. If there is some kind of change that causes that plan to become invalid, that could cause that error. You can almost certainly fix it by using EXECUTE in the function, which will not cache the plan, or telling all clients to reconnect when you make a change that could invalidate the plan. Regards, Jeff Davis ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Rare corruption of pg_class index
Greg Sabino Mullane wrote: I'm encountering some disconcerting problems on a 8.1.3 database. Very occasionally, I get a could not open relation with OID xxx. This always occurs inside of a plpgsql function, and always refers to a normal, stable table that has not been dropped. The first time this occured, I reindexed the system indexes, but it has occured again, and on more than one database, which seems to lessen the chance of a hardware issue or a temporary index corruption issue. The functions in question are called many times, but the error only happens once in a blue moon. The last time it happened, the function ran with no problem one minute before the error, and again four minutes afterwards. The table name is hard-coded into the functions. This happens on average five times a day or so, on a very busy database ( 10M statements/day, the functions fire tens of thousands of times) Well, if the error fixed by itself four minutes after the failure, then it's not likely to be a corrupted index. My first guess would be that there's some kind of race condition on the relcache or the sinval mechanism. It would help if you could get a stack trace at the moment of the problem, but I'm not sure how to do that. Maybe cause the backend to send a SIGSTOP to itself and then have an external program to quickly launch GDB on it and get a backtrace, then send a SIGCONT. -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Rare corruption of pg_class index
Alvaro Herrera [EMAIL PROTECTED] writes: Greg Sabino Mullane wrote: I'm encountering some disconcerting problems on a 8.1.3 database. Very occasionally, I get a could not open relation with OID xxx. Does the mentioned OID actually correspond to the OID of the table it's supposed to be opening, or is it wrong? Is anything being done to the table schema in parallel? If the table is occasionally dropped and recreated, there's a known race condition that could cause it: we lookup the table name in pg_class to get the OID, then lock the relation by OID, then try to finish opening the relation. By the time we obtain lock the original rel could be gone and the name now refers to some other OID, but we'll fail because the old OID is no longer anywhere to be found. I think this is fixed in 8.2. If the table is perfectly static then another explanation is needed ... It would help if you could get a stack trace at the moment of the problem, but I'm not sure how to do that. Perhaps insert an abort() call right before the elog(ERROR) that's reporting this. (I think there are three possibilities, but they're all in heapam.c so you might as well just hack them all.) regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Release 8.2.0 done, 8.3 development starts
On Wed, Dec 20, 2006 at 05:49:15PM +0100, Kaare Rasmussen wrote: I'm not sure, but as far as I remember, it will be a short release cycle for 8.3 in order to finish some big items that couldn't be ready in time for 8.2. But which items are more or less expected for 8.3? I recall - Hierarchical Queries - On disk bitmap index These two are happening - Clustered/replication solutions This one's stalled because no two replication solutions appear to need anything in the core. Instead, lots of things are going on outside the core in the clustering/replication/HA arena. being discussed. What are on people's minds? Other things coming down the pike are: * SQL/XML support per SQL:2003 * SQL/PSM support per SQL:2003 Hrm. While two things isn't much of a pattern, it's suggestive. Personally, I'd like to see MERGE, windowing functions, and better integration of (INSERT|UPDATE|DELETE) ... RETURNING. Cheers, D (SQL/MED, anybody?) -- David Fetter [EMAIL PROTECTED] http://fetter.org/ phone: +1 415 235 3778AIM: dfetter666 Skype: davidfetter Remember to vote! ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Release 8.2.0 done, 8.3 development starts
[EMAIL PROTECTED] (Kaare Rasmussen) writes: I'm not sure, but as far as I remember, it will be a short release cycle for 8.3 in order to finish some big items that couldn't be ready in time for 8.2. But which items are more or less expected for 8.3? I recall - Hierarchical Queries - On disk bitmap index - Clustered/replication solutions being discussed. What are on people's minds? - WITH (recursive) queries - Better handling of partitioning - SQL:2003 windowing queries - Discussion is under way on improvements to autovac - Simon Riggs has reuse of index tuples under way -- let name=cbbrowne and tld=acm.org in name ^ @ ^ tld;; http://linuxfinances.info/info/advocacy.html I think there is a world market for maybe five computers -- Tom Watson, CEO of IBM, 1943 ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Rare corruption of pg_class index
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 (Jeff Davis: I've not changed the function, so it's not the plan cache) Tom Lane wrote: Does the mentioned OID actually correspond to the OID of the table it's supposed to be opening, or is it wrong? Is anything being done to the table schema in parallel? Yes, it is the correct OID. No, nothing done to the schema in parallel, although there is a process that disables/re-enables triggers and rules on that table via pg_class tweaking (inside a txn, of course). If the table is occasionally dropped and recreated... Yeah, that's the first thing I thought of, but it's definitely not being dropped and recreated. Alvaro Herrera wrote: It would help if you could get a stack trace at the moment of the problem, but I'm not sure how to do that. Perhaps insert an abort() call right before the elog(ERROR) that's reporting this. (I think there are three possibilities, but they're all in heapam.c so you might as well just hack them all.) Argh, that will have to be a last resort measure, as this is a production system. Have not been able to duplicate yet on a dev box, but will look into adding the abort() for when/if I can duplicate it there. Thanks everyone. - -- Greg Sabino Mullane [EMAIL PROTECTED] PGP Key: 0x14964AC8 200612201412 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8 -BEGIN PGP SIGNATURE- iD8DBQFFiYtwvJuQZxSWSsgRArEbAJ9udj9/Kh2Vi45A8ej3YCC2RKESwgCcCkT8 5ZnN0+yCU0rQ6+PuHtJRtnw= =C1gB -END PGP SIGNATURE- ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Release 8.2.0 done, 8.3 development starts
Kaare Rasmussen wrote: I'm not sure, but as far as I remember, it will be a short release cycle for 8.3 in order to finish some big items that couldn't be ready in time for 8.2. But which items are more or less expected for 8.3? I recall - Hierarchical Queries - On disk bitmap index - Clustered/replication solutions being discussed. What are on people's minds? Items on my personal agenda for this time frame - all have been previously discussed: . notification payload messages . fix permissions properly on custom GUC vars . create a mechanism for plperl to load modules safely I am committed to getting the first of these done to meet a businees need. The last two will be best effort. cheers andrew ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Rare corruption of pg_class index
Tom Lane wrote: Alvaro Herrera [EMAIL PROTECTED] writes: Greg Sabino Mullane wrote: It would help if you could get a stack trace at the moment of the problem, but I'm not sure how to do that. Perhaps insert an abort() call right before the elog(ERROR) that's reporting this. Yeah, but doing this in a production environment is not likely to go very far ... (I think there are three possibilities, but they're all in heapam.c so you might as well just hack them all.) Maybe it would be good to run under log_error_verbosity='verbose' for a while to discover whether it's always the same one. -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Interface for pg_autovacuum
On Dec 20, 2006, at 7:56 AM, Florian G. Pflug wrote: Jim Nasby wrote: I'm teaching a class this week and a student asked me about OIDs. He related the story of how in Sybase, if you moved a database from one server from another, permissions got all screwed up because user IDs no longer matched. I explained that exposing something like an integer ID in a user interface or an API is just a bad idea and PostgreSQL doesn't do that. Then I got to pg_autovacuum So... is there any reason there isn't a prescribed interface to pg_autovacuum that doesn't expose vacrelid? Can we get that added to TODO? Wouldn't it be sufficient to change the type of vacrelid from oid to regclass? Then just dumping and restoring pg_autovacuum like any other table should Just Work. I think that would work, though as I mentioned we'd also want to set reasonable defaults on the table if we decide to keep that as our interface. On the other hand, this would be the only part of the system where the official interface/API is a system catalog table. Do we really want to expose the internal representation of something as our API? That doesn't seem wise to me... Additionally, AFAIK it is not safe to go poking data into catalogs willy-nilly. Having one table where this is the interface to the system seems like it could lead to some dangerous confusion. -- Jim Nasby [EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: column ordering, was Re: [HACKERS] [PATCHES] Enums patch v2
On Wed, Dec 20, 2006 at 09:15:05AM -0500, Stephen Frost wrote: It doesn't really address the question of how you know which one to use at any particular line of code; or even more to the point, what mechanism will warn you if you use the wrong one. That'd be the point of doing the typing, you then declare functions as accepting the type and then if someone passes the wrong type to a function the compiler will complain. Inside of a particular function it would hopefully be easier to keep it clear. I'd think that most functions would deal with one type or the other (which would be declared in the arguments or in the local variables) and that functions which have to deal with both would be able to keep them straight. I'm not sure how much you can do with typing. Things like heap_getattr are macros, and thus untyped. Most places use attr as an index to an array, which also can't be type checked. If you switched everything over to inline functions you might get it to work, but that's about it. IMHO the best solution is to offset the logical numbers by some constant... Have a nice day, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ From each according to his ability. To each according to his ability to litigate. signature.asc Description: Digital signature
Re: [HACKERS] Release 8.2.0 done, 8.3 development starts
Hi, I just collected all the items mentioned in this thread as well as what people quickly came up with on IRC and put it on a list in the developer wiki [1]. I tried to put names and links behind the items where ever possible. Let me know if there is something missing or if you know any other information (links, names) etc. that should be mentioned. The list is kinda scary long .. regards Lukas [1] http://developer.postgresql.org/index.php/Todo:WishlistFor83 ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Load distributed checkpoint
On 12/20/06, Takayuki Tsunakawa [EMAIL PROTECTED] wrote: [Conclusion] I believe that the problem cannot be solved in a real sense by avoiding fsync/fdatasync(). We can't ignore what commercial databases have done so far. The kernel does as much as he likes when PostgreSQL requests him to fsync(). I am new to the community and am very interested in the tests that you have done. I am also working on resolving the sudden IO spikes at checkpoint time. I agree with you that fsync() is the core issue here. Being a new member I was wondering if someone on this list has done testing with O_DIRECT and/or O_SYNC for datafiles as that seems to be the most logical way of dealing with fsync() flood at checkpoint time. If so, I'll be very interested in the results. As mentioned in this thread that a single bgwriter with O_DIRECT will not be able to keep pace with cleaning effort causing backend writes. I think (i.e. IMHO) multiple bgwriters and/or AsyncIO with O_DIRECT can resolve this issue. Talking of bgwriter_* parameters I think we are missing a crucial internal counter i.e. number of dirty pages. How much work bgwriter has to do at each wakeup call should be a function of total buffers and currently dirty buffers. Relying on both these values instead of just one static NBuffers should allow bgwriter to adapt more quickly to workload changes and ensure that not much work is accumulated for checkpoint. -- Inaam Rana EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Question about debugging bootstrapping and catalog
Zdenek Kotala wrote: Gregory Stark wrote: Martijn van Oosterhout kleptog@svana.org writes: Here's what I did: you can step over functions in initdb until it fails (although I alredy know which part it's failing I guess). Restart. Then you go into that function and step until the new backend has been started. At this point you attach another gdb to the backend and let it run. Hm, I suppose. Though starting a second gdb is a pain. What I've done in the past is introduce a usleep(3000) in strategic points in the backend to give me a chance to attach. I use dtrace which wait on write syscall for stderr output and if it is happen then stop(freeze) the process and I able to connect into the process with debugger and examine what happened. There is dtrace script which sitting on exec. It stops postgres process after exec. It works on Solaris. Different name of kernel function probably will be on other platform where is dtrace implemented (Freebsd,MacOS). ::exec_common:return /execname == initdb/ { exec_pg = 1; } syscall:::entry /execname == postgres exec_pg == 1/ { stop(); printf(Postgres is stopped.\n); exec_pg = 0; } ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: column ordering, was Re: [HACKERS] [PATCHES] Enums patch v2
Andrew Dunstan wrote: Russell Smith wrote: Tom Lane wrote: Stephen Frost [EMAIL PROTECTED] writes: Force references to go through macros which implement the lookup for the appropriate type? ie: LOGICAL_COL(table_oid,2) vs. PHYSICAL_COL(table_oid,1) Perhaps that's too simplistic. It doesn't really address the question of how you know which one to use at any particular line of code; or even more to the point, what mechanism will warn you if you use the wrong one. My gut feeling about this is that we could probably enforce such a distinction if we were using C++, but while coding in C I have no confidence in it. (And no, that's not a vote to move to C++ ...) What about a comprimise... The 8.1 documentation for ALTER TABLE states the following. Adding a column with a non-null default or changing the type of an existing column will require the entire table to be rewritten. This may take a significant amount of time for a large table; and it will temporarily require double the disk space. Now, we are rewriting the table from scratch anyway, the on disk format is changing. What is stopping us from switching the column order at the same time. The only thing I can think is that the catalogs will need more work to update them. It's a middle sized price to pay for being able to reorder the columns in the table. One of the problems I have is wanting to add a column in the middle of the table, but FK constraints stop me dropping the table to do the reorder. If ALTER TABLE would let me stick it in the middle and rewrite the table on disk, I wouldn't care. It's likely that I would be rewriting the table anyway. And by specifying AT POSITION, or BEFORE/AFTER you know for big tables it's going to take a while. This isn't really a compromise. Remember that this discussion started with consideration of optimal record layout (minimising space use by reducing or eliminating alignment padding). The above proposal really does nothing for that. cheers andrew This is partly true. If you have the ability to rewrite the table and put columns in a specific order you can manually minimize the alignment padding. However that will probably produce a table that is not in the logical order you would like. I still see plenty of use case for both my initial case as the alignment padding case, even without logical layout being different to disk layout. Also there has been a large about of discussion on performance relating to having firm numbers for proposals for different compiler options. Do anybody have tested numbers, and known information about where/how you can eliminate padding by column ordering? Tom suggests in this thread that lots of types have padding issues, so how much is it really going to buy us space wise if we re-order the table in optimal format. What is the optimal ordering to reduce disk usage? Russell. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Release 8.2.0 done, 8.3 development starts
Lukas Kahwe Smith wrote: Hi, I just collected all the items mentioned in this thread as well as what people quickly came up with on IRC and put it on a list in the developer wiki [1]. I tried to put names and links behind the items where ever possible. Let me know if there is something missing or if you know any other information (links, names) etc. that should be mentioned. The list is kinda scary long .. regards Lukas [1] http://developer.postgresql.org/index.php/Todo:WishlistFor83 That looks helpful. -- Bruce Momjian [EMAIL PROTECTED] EnterpriseDBhttp://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Release 8.2.0 done, 8.3 development starts
On 12/20/06, Lukas Kahwe Smith [EMAIL PROTECTED] wrote: wiki [1]. I tried to put names and links behind the items where ever possible. Let me know if there is something missing or if you know any other information (links, names) etc. that should be mentioned. I know Mark Cave-Ayland was interested in taking over hierarchical queries... let me see where he stands on it and whether he was going to complete it on an 8.3 deadline. If not, I'll do it. -- Jonah H. Harris, Software Architect | phone: 732.331.1324 EnterpriseDB Corporation| fax: 732.331.1301 33 Wood Ave S, 3rd Floor| [EMAIL PROTECTED] Iselin, New Jersey 08830| http://www.enterprisedb.com/ ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] psql: core dumped
Mario wrote: I'd like to help :-) I wanted to avoid a core dumped but you told me that's a normal thing for a SIGQUIT signal. Did you try running `ulimit -c 0` first? That should do what you want - prevent generation of the dump file. Regards, Philip. -- Philip Yarra Senior Software Engineer, Utiba Pty Ltd [EMAIL PROTECTED] ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[HACKERS] inet/cidr
I was looking at upgrading to 8.2, but I make extensive use of the IP4 module. It doesn't make properly due to the changes in the inet/cidr types, notably the removal of the type (is_cidr) field of the inet_struct, which the module uses when casting: ip4_cast_to_cidr, ip4r_cast_to_cidr, and a check in ip4r_cast_from_cidr. I read a bit about the cidr/inet types becoming binary compatible, but I wasn't sure if I needed to do more extensive conversion, or if I could just remove the references to the type field and be happy. -Mike ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Interface for pg_autovacuum
Jim Nasby wrote: On Dec 20, 2006, at 7:56 AM, Florian G. Pflug wrote: Jim Nasby wrote: I'm teaching a class this week and a student asked me about OIDs. He related the story of how in Sybase, if you moved a database from one server from another, permissions got all screwed up because user IDs no longer matched. I explained that exposing something like an integer ID in a user interface or an API is just a bad idea and PostgreSQL doesn't do that. Then I got to pg_autovacuum So... is there any reason there isn't a prescribed interface to pg_autovacuum that doesn't expose vacrelid? Can we get that added to TODO? Wouldn't it be sufficient to change the type of vacrelid from oid to regclass? Then just dumping and restoring pg_autovacuum like any other table should Just Work. I think that would work, though as I mentioned we'd also want to set reasonable defaults on the table if we decide to keep that as our interface. On the other hand, this would be the only part of the system where the official interface/API is a system catalog table. Do we really want to expose the internal representation of something as our API? That doesn't seem wise to me... Additionally, AFAIK it is not safe to go poking data into catalogs willy-nilly. Having one table where this is the interface to the system seems like it could lead to some dangerous confusion. I thought the plan was to change the ALTER TABLE command to allow vacuum settings to be set. I may be totally away from the mark. But if this was the case it would mean that dumps would just need an alter table statement to maintain autovacuum information. There is an advantage that if you only dump some tables, their autovac settings would go with them. But is that a good thing? Reagrds Russell Smith -- Jim Nasby [EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Load distributed checkpoint
On 12/20/06, Takayuki Tsunakawa [EMAIL PROTECTED] wrote: [Conclusion] I believe that the problem cannot be solved in a real sense by avoiding fsync/fdatasync(). We can't ignore what commercial databases have done so far. The kernel does as much as he likes when PostgreSQL requests him to fsync(). From: Inaam Rana I am new to the community and am very interested in the tests that you have done. I am also working on resolving the sudden IO spikes at checkpoint time. I agree with you that fsync() is the core issue here. Thank you for understanding my bad English correctly. Yes, what I've been insisting is that it is necessary to avoid fsync()/fdatasync() and to use O_SYNC (plus O_DIRECT if supported on the target platform) to really eliminate the big spikes. In my mail, the following sentence made a small mistake. I believe that the problem cannot be solved in a real sense by avoiding fsync/fdatasync(). The correct sentence is: I believe that the problem cannot be solved in a real sense without avoiding fsync/fdatasync(). Being a new member I was wondering if someone on this list has done testing with O_DIRECT and/or O_SYNC for datafiles as that seems to be the most logical way of dealing with fsync() flood at checkpoint time. If so, I'll be very interested in the results. Could you see the mail I sent on Dec 18? Its content was so long that I zipped the whole content and attached to the mail. I just performed the same test simply adding O_SYNC to open() in mdopen() and another function in md.c. I couldn't succeed in running with O_DIRECT because O_DIRECT requires the shared buffers to be aligned on the sector-size boundary. To perform O_DIRECT test, a little more modification is necessary to the code where the shared buffers are allocated. The result was bad. But that's just a starting point. We need some improvements that commercial databases have done. I think some approaches we should take are: (1) two-checkpoint (described in Jim Gray's textbook Transaction Processing: Concepts and Techniques (2) what Oracle suggests in its manual (see my previous mails) (3) write multiple contiguous buffers with one write() to decrease the count of write() calls As mentioned in this thread that a single bgwriter with O_DIRECT will not be able to keep pace with cleaning effort causing backend writes. I think (i.e. IMHO) multiple bgwriters and/or AsyncIO with O_DIRECT can resolve this issue. I agree with you. Oracle provides a parameter called DB_WRITER_PROCESSES to set the number of database writer processes. Oracle also provides asynchronous I/O to solve the problem you are saying about. Please see section 10.3.9 the following page: http://download-west.oracle.com/docs/cd/B19306_01/server.102/b14211/instance_tune.htm#sthref1049 Talking of bgwriter_* parameters I think we are missing a crucial internal counter i.e. number of dirty pages. How much work bgwriter has to do at each wakeup call should be a function of total buffers and currently dirty buffers. Relying on both these values instead of just one static NBuffers should allow bgwriter to adapt more quickly to workload changes and ensure that not much work is accumulated for checkpoint. I agree with you in the sense that the current bgwriter is a bit careless about the system load. I believe that PostgreSQL should be more gentle to OLTP transactions -- many users of the system as a result. I think the speed of WAL accumulation should also be taken into account. Let's list up the problems and ideas. --
Re: [HACKERS] New version of money type
On Thu, 12 Oct 2006 14:24:22 -0400 D'Arcy J.M. Cain darcy@druid.net wrote: On Thu, 12 Oct 2006 14:17:33 -0400 Tom Lane [EMAIL PROTECTED] wrote: D'Arcy J.M. Cain darcy@druid.net writes: Cool. So what do I do with the patch? Should I add the currency symbol back in and commit or should I resubmit the patch to hackers for further review? Well, one thing you definitely *don't* do is commit right now, because we're in feature freeze, not to mention trying to avoid forced initdbs now that beta has started. Sit on it till 8.3 is branched, and OK. I hadn't thought of it as a new feature per se but I understand the initdb issue. Holding at 30,000 feet, ground control. meanwhile think about what you want to do with the currency-symbol issue... Personally I don't see a need for it but I am currently in favour of adding it back in before committing just so that we can deal with the issue separately. The same as the other changes being discussed. Now that 8.3 has been branched shall I go ahead and commit? As discussed I will put the currency symbol back in just so that it can be discussed and worked on as a completely separate issue. I have attached the current patch against HEAD. -- D'Arcy J.M. Cain darcy@druid.net | Democracy is three wolves http://www.druid.net/darcy/| and a sheep voting on +1 416 425 1212 (DoD#0082)(eNTP) | what's for dinner. cash64_patch Description: Binary data ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Load distributed checkpoint
Takayuki Tsunakawa [EMAIL PROTECTED] wrote: I have to report a sad result. Your patch didn't work. Let's consider the solution together. What you are addressing is very important for the system designers in the real world -- smoothing response time. You were running the test on the very memory-depend machine. shared_buffers = 4GB / The scaling factor is 50, 800MB of data. Thet would be why the patch did not work. I tested it with DBT-2, 10GB of data and 2GB of memory. Storage is always the main part of performace here, even not in checkpoints. If you use Linux, it has very unpleased behavior in fsync(); It locks all metadata of the file being fsync-ed. We have to wait for the completion of fsync when we do read(), write(), and even lseek(). Almost of your data is in the accounts table and it was stored in a single file. All of transactions must wait for fsync to the single largest file, so you saw the bottleneck was in the fsync. [Conclusion] I believe that the problem cannot be solved in a real sense by avoiding fsync/fdatasync(). I think so, too. However, I assume we can resolve a part of the checkpoint spikes with smoothing of write() alone. BTW, can we use the same way to fsync? We call fsync()s to all modified files without rest in mdsync(), but it's not difficult at all to insert sleeps between fsync()s. Do you think it helps us? One of issues is that we have to sleep in file unit, which is maybe rough granularity. Regards, --- ITAGAKI Takahiro NTT Open Source Software Center ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Interface for pg_autovacuum
Jim Nasby [EMAIL PROTECTED] writes: On the other hand, this would be the only part of the system where the official interface/API is a system catalog table. I don't think it was ever intended by anyone that that would be the long-term solution. Where we are currently at is experimenting to find out what autovacuum's control knobs ought to be. The catalog table was a suitably low-effort way to expose a first cut at the knobs. The fact that pg_dump doesn't dump the settings is entirely deliberate: that's to avoid locking us into a forward compatibility commitment before we're ready. Once we are happy with the control design, we can think about what the long-term API ought to be. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Release 8.2.0 done, 8.3 development starts
On Wed, Dec 20, 2006 at 17:49:15 +0100, Kaare Rasmussen [EMAIL PROTECTED] wrote: I'm not sure, but as far as I remember, it will be a short release cycle for 8.3 in order to finish some big items that couldn't be ready in time for 8.2. I believe the point of the short release cycle was more to change the timing of the release to avoid lots of people being on holiday during a critical period. After the short release cycle, the tentative plan was to go back to yearly releases shifted a few months from where they typically used to occur in the past. Whether or not this actually works out, remains to be seen. ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Stats Collector Oddity
Chris Browne [EMAIL PROTECTED] writes: We're getting a bit of an anomaly relating to pg_stat_activity... ... That PID has been dead for several days, but this connection is marked as being open, still, after lo many days. This probably just means that the backend termination stats message got dropped due to heavy load. That's expected behavior in all pre-8.2 releases: the stats system was never intended to provide guaranteed-exactly-correct status. PG 8.2 has reimplemented the pg_stat_activity view to make it more trustworthy. (The other stuff is still probabilistic, but being just event counters, message loss isn't so obvious.) regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Question about debugging bootstrapping and catalog entries
Gurjeet Singh [EMAIL PROTECTED] writes: On 12/18/06, Tom Lane [EMAIL PROTECTED] wrote: There is already an option to sleep early in backend startup for the normal case. Not sure if it works for bootstrap, autovacuum, etc, but I could see making it do so. You are probably referring to the command-line switch -W to posrgres, that translates to 'PostAuthDelay' GUC variable; I think that kicks in a bit too late! No, I was thinking of PreAuthDelay. There might be cases where even that is too late in the procedure --- probably not on Unix, but on Windows there's a lot that happens before BackendInitialize. But offhand I don't know how we'd have a configurable delay much earlier ... custom insertions of hardwired delays into the source code are probably the only good approach if you find that, say, guc.c initialization fails in individual backends under Windows. Back at the ranch, though, the question was whether it'd be worth honoring PreAuthDelay in the other startup code paths such as BootstrapMain. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] ERROR: tuple concurrently updated
Stephen Frost [EMAIL PROTECTED] writes: Subject pretty much says it all. I've put up with this error in the past when it has caused me trouble but it's now starting to hit our clients on occation which is just unacceptable. Have you tracked down the exact scenario making it happen? If this is correct then the solution seems to be either add versioning to the SysCache data, You have provided no evidence that that would fix anything at all. To my mind a concurrently updated failure is more likely to mean insufficient locking around update operations. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] ERROR: tuple concurrently updated
* Tom Lane ([EMAIL PROTECTED]) wrote: Stephen Frost [EMAIL PROTECTED] writes: Subject pretty much says it all. I've put up with this error in the past when it has caused me trouble but it's now starting to hit our clients on occation which is just unacceptable. Have you tracked down the exact scenario making it happen? I think I might have confused two different issues, sorry. :/ If this is correct then the solution seems to be either add versioning to the SysCache data, You have provided no evidence that that would fix anything at all. To my mind a concurrently updated failure is more likely to mean insufficient locking around update operations. I havn't built a reliable test case yet but I *think* the tuple concurrently updated problem is with an analyze being run inside of a function and also being run by autovacuum. The SysCache stuff I was thinking about previously was actually for another problem that I hadn't seen in a long time (because I hadn't been doing a particular set of operations, not because it's that difficult to have happen) but just ran into again today: ERROR: cache lookup failed for relation ... I first started seeing this happen, iirc, when I created a function which went against pg_class/pg_attribute/pg_type and used pg_table_is_visible(). This function (and another which uses pg_class, pg_constraint and pg_attribute) gets used over and over again from another pl/pgsql function. Basically we are walking through a list of tables pulling the column names and primary keys and then running some checks on the tables. Anyhow, while this is running (and it takes upwards of half an hour to run) other activity on the database (this time it was creating a view in a seperate completely unrelated schema) can cause the lookup failure which kills the long-running function (which gets to be very frustrating...). Thanks, Stephen signature.asc Description: Digital signature