Re: [PERFORM] Filesystem benchmarking for pg 8.3.3 server
On Tue, 12 Aug 2008, Ron Mayer wrote:

> Really old software (notably 2.4 linux kernels) didn't send cache
> synchronizing commands for SCSI nor ATA; but it seems well thought
> through in the 2.6 kernels as described in the Linux kernel
> documentation.
> http://www.mjmwired.net/kernel/Documentation/block/barrier.txt

If you've drunk the kool-aid you might believe that. When I see people asking about this in early 2008 at http://thread.gmane.org/gmane.linux.kernel/646040 and serious disk driver hacker Jeff Garzik says "It's completely ridiculous that we default to an unsafe fsync", I don't know about you, but that barrier documentation doesn't make me feel warm and safe anymore.

If you do have a disk where you need to disable write caches, I'd love to know the name of the disk and see the output of hdparm -I /dev/sd*** to see if it claims to support such cache flushes. The disk below writes impossibly fast when I issue a sequence of fsync writes to it under the CentOS 5 Linux I was running on it. That should only be possible at most 120 times/second since it's a 7200 RPM drive, and if I poke it with hdparm -W0 first it behaves. The drive is a known piece of junk from circa 2004, and it's worth noting that it's an ext3 filesystem in a md0 RAID-1 array (aren't there issues with md and the barriers?)
# hdparm -I /dev/hde

/dev/hde:

ATA device, with non-removable media
	Model Number:       Maxtor 6Y250P0
	Serial Number:      Y62K95PE
	Firmware Revision:  YAR41BW0
Standards:
	Used: ATA/ATAPI-7 T13 1532D revision 0
	Supported: 7 6 5 4
Configuration:
	Logical		max	current
	cylinders	16383	65535
	heads		16	1
	sectors/track	63	63
	--
	CHS current addressable sectors:    4128705
	LBA    user addressable sectors:  268435455
	LBA48  user addressable sectors:  490234752
	device size with M = 1024*1024:      239372 MBytes
	device size with M = 1000*1000:      251000 MBytes (251 GB)
Capabilities:
	LBA, IORDY(can be disabled)
	Standby timer values: spec'd by Standard, no device specific minimum
	R/W multiple sector transfer: Max = 16	Current = 16
	Advanced power management level: unknown setting (0x)
	Recommended acoustic management value: 192, current value: 254
	DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6
	     Cycle time: min=120ns recommended=120ns
	PIO: pio0 pio1 pio2 pio3 pio4
	     Cycle time: no flow control=120ns  IORDY flow control=120ns
Commands/features:
	Enabled	Supported:
	   *	SMART feature set
		Security Mode feature set
	   *	Power Management feature set
	   *	Write cache
	   *	Look-ahead
	   *	Host Protected Area feature set
	   *	WRITE_VERIFY command
	   *	WRITE_BUFFER command
	   *	READ_BUFFER command
	   *	NOP cmd
	   *	DOWNLOAD_MICROCODE
		Advanced Power Management feature set
		SET_MAX security extension
	   *	Automatic Acoustic Management feature set
	   *	48-bit Address feature set
	   *	Device Configuration Overlay feature set
	   *	Mandatory FLUSH_CACHE
	   *	FLUSH_CACHE_EXT
	   *	SMART error logging
	   *	SMART self-test

--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] long transaction
> have you considered importing to a temporary 'holding' table with copy,
> then doing 'big' sql statements on it to check constraints, etc?

Yes, I considered it, but the problem is that the data is very tightly related between the different tables, and it is important to keep the import order of each entity into the database. In other words, the entity import serialization is mandatory. In fact the import script doesn't do just inserts but also deletes and updates for different entities. So copy is not enough. Also, using 'big' sql statements cannot guarantee the import order.

Sabin
Re: [PERFORM] Filesystem benchmarking for pg 8.3.3 server
On Tue, Aug 12, 2008 at 10:28 PM, Ron Mayer [EMAIL PROTECTED] wrote:

> Scott Marlowe wrote:
>> I can attest to the 2.4 kernel not being able to guarantee fsync on
>> IDE drives.
>
> Sure. But note that it won't for SCSI either; since AFAICT the write
> barrier support was implemented at the same time for both.

Tested both by pulling the power plug. The SCSI was pulled 10 times while running 600 or so concurrent pgbench threads, and so was the IDE. The SCSI came up clean every single time, the IDE came up corrupted every single time. I find it hard to believe there was no difference in write barrier behaviour with those two setups.
Re: [PERFORM] Filesystem benchmarking for pg 8.3.3 server
On Tue, 12 Aug 2008, Ron Mayer wrote:

> Really old software (notably 2.4 linux kernels) didn't send cache
> synchronizing commands for SCSI nor ATA;

Surely not true. Write cache flushing has been a known problem in the computer science world for several tens of years. The difference is that in the past we only had a "flush everything" command, whereas now we have a "flush everything before the barrier before everything after the barrier" command.

Matthew

--
To err is human; to really louse things up requires root privileges.
 -- Alexander Pope, slightly paraphrased
Re: [PERFORM] Filesystem benchmarking for pg 8.3.3 server
Scott Marlowe wrote:
> On Tue, Aug 12, 2008 at 10:28 PM, Ron Mayer ...wrote:
>> Scott Marlowe wrote:
>>> I can attest to the 2.4 kernel ...
>> ...SCSI...AFAICT the write barrier support...
>
> Tested both by pulling the power plug. The SCSI was pulled 10 times
> while running 600 or so concurrent pgbench threads, and so was the
> IDE. The SCSI came up clean every single time, the IDE came up
> corrupted every single time.

Interesting. With a pre-write-barrier 2.4 kernel I'd expect corruption in both. Perhaps all caches were disabled in the SCSI drives?

> I find it hard to believe there was no difference in write barrier
> behaviour with those two setups.

Skimming lkml it seems write barriers for SCSI were behind (in terms of implementation) those for ATA:

http://lkml.org/lkml/2005/1/27/94 (Jan 2005) "... scsi/sata write barrier support ... For the longest time, only the old PATA drivers supported barrier writes with journalled file systems. This patch adds support for the same type of cache flushing barriers that PATA uses for SCSI"
Re: [PERFORM] Filesystem benchmarking for pg 8.3.3 server
Greg Smith wrote:
> The below disk writes impossibly fast when I issue a sequence of fsync

'k. I've got some homework. I'll be trying to reproduce similar results with md raid, old IDE drives, etc. to see if I can reproduce them. I assume test_fsync in the postgres source distribution is a decent way to see?

> driver hacker Jeff Garzik says "It's completely ridiculous that we
> default to an unsafe fsync."

Yipes indeed. Still makes me want to understand why people claim IDE suffers more than SCSI, tho. Ext3 bugs seem likely to affect both to me.

> writes to it under the CentOS 5 Linux I was running on it.
> ... junk from circa 2004, and it's worth noting that it's an ext3
> filesystem in a md0 RAID-1 array (aren't there issues with md and the
> barriers?)

Apparently various distros vary a lot in how they're set up (SuSE apparently defaults to mounting ext3 with the barrier=1 option; other distros seemed not to, etc). I'll do a number of experiments with md, a few different drives, etc. today and see if I can find issues with any of the drives (and/or filesystems) around here. But I still am looking for any evidence that there were any widely shipped SATA (or even IDE) drives that were at fault, as opposed to filesystem bugs and poor settings of defaults.
Re: [PERFORM] Filesystem benchmarking for pg 8.3.3 server
On Wed, Aug 13, 2008 at 8:41 AM, Ron Mayer [EMAIL PROTECTED] wrote:

> Greg Smith wrote:
> But I still am looking for any evidence that there were any widely
> shipped SATA (or even IDE) drives that were at fault, as opposed to
> filesystem bugs and poor settings of defaults.

Well, if they're getting more than 150/166.6/250 transactions per second without a battery backed cache, then they're likely lying about fsync. And most SATA and IDE drives will give you way over that for a small data set.
Re: [PERFORM] query plan, index scan cost
On Jul 18, 2008, at 5:28 AM, Stefan Zweig wrote:

> CREATE TABLE nw_tla_2008_4_deu
> (
>   "ID" bigint NOT NULL,
>   "NET2CLASS" smallint,
>   "FOW" smallint,
>   CONSTRAINT nw_tla_2008_4_deu_pkey PRIMARY KEY ("ID")
> ) WITHOUT OIDS;

You might want to give up on the double-quotes... you'll have to use them everywhere. It'd drive me nuts... :)

> EXPLAIN ANALYZE SELECT nw."ID" AS id
> FROM nw_tla_2008_4_deu AS nw
> WHERE expand(st_pointfromtext('POINT(13.7328934 51.049476)',4326),
>   0.2476961598054) && nw.the_geom
> AND nw."FOW" IN (1,2,3,4,10,17)
> AND nw."NET2CLASS" IN (0,1,2,3)
> <snip>
> Total runtime: *13.332* ms
>
> running the next query, which is only slightly different and has one
> instead of two AND conditions, leads to the following result
>
> EXPLAIN ANALYZE SELECT nw."ID" AS id
> FROM nw_tla_2008_4_deu AS nw
> WHERE expand(st_pointfromtext('POINT(13.7328934 51.049476)',4326),
>   0.2476961598054) && nw.the_geom
> AND nw."FOW" IN (1,2,3,4,10,17)
> <snip>
> Total runtime: *109* ms
>
> so both queries have AND conditions: two AND conditions in the first
> query and one AND condition in the second query. unfortunately i am
> not an expert in reading the PostgreSQL query plan. basically i am
> wondering why in the first query a second index scan is done whereas
> in the second query the second index scan is not done. the second
> query runs a hundred times faster than the first one, which is
> surprising to me.

The second index scan wasn't done in the second query because you don't have the second IN clause. And it's actually the 1st query that was faster, because it returned fewer rows (15k instead of 45k).

--
Decibel!, aka Jim C. Nasby, Database Architect [EMAIL PROTECTED]
Give your computer some brain candy! www.distributed.net Team #1828
Re: [PERFORM] Filesystem benchmarking for pg 8.3.3 server
On Aug 11, 2008, at 9:01 AM, Jeff wrote:

> On Aug 11, 2008, at 5:17 AM, Henrik wrote:
>> OK, changed the SAS RAID 10 to RAID 5 and now my random writes are
>> handling 112 MB/sek. So it is almost twice as fast as the RAID10 with
>> the same disks. Any ideas why? Is the iozone test faulty?
>
> does IOzone disable the os caches? If not you need to use a size of
> 2xRAM for true results. regardless - the test only took 10 seconds of
> wall time - which isn't very long at all.

You'd probably want to run it longer anyway. Additionally, you need to be careful of what size writes you're using. If you're doing random writes that perfectly align with the raid stripe size, you'll see virtually no RAID5 overhead, and you'll get the performance of N-1 drives, as opposed to RAID10 giving you N/2.
Re: [PERFORM] Incorrect estimates on correlated filters
On Aug 12, 2008, at 4:59 PM, Chris Kratz wrote:

> Ran into a recurring performance problem with some report queries
> again today. In a nutshell, we have filters on either multiple joined
> tables, or multiple columns on a single table, that are highly
> correlated. So the estimates come out grossly incorrect (the planner
> has no way to know they are correlated): 2000:1 for one I'm looking at
> right now. Generally this doesn't matter, except in complex reporting
> queries like these when this is the first join of 40 other joins.
> Because the estimate is wrong at the lowest level, it snowballs up
> through the rest of the joins, causing the query to run very, very
> slowly. In many of these cases, forcing nested loops off for the
> duration of the query fixes the problem. But I have a couple that
> still are painfully slow and shouldn't be.
>
> I've been reading through the archives about others having similar
> problems (including myself a year ago). Am I right in assuming that at
> this point there is still little we can do in postgres to speed up
> this kind of query? Right now the planner has no way to know the
> correlation between different columns in the same table, let alone
> columns in different tables. So it just assumes no correlation and
> returns incorrectly low estimates in cases like these.
>
> The only solution I've come up with so far is to materialize portions
> of the larger query into subqueries with these correlated filters,
> which are indexed and analyzed before joining into the larger query.
> This would keep the incorrect estimates from snowballing up through
> the chain of joins. Are there any other solutions to this problem?

Well... you could try and convince certain members of the community that we actually do need some kind of a query hint mechanism... ;)

I did make a suggestion a few months ago that involved sorting a table on different columns and recording the correlation of other columns. The scheme isn't perfect, but it would help detect cases like a field populated by a sequence and another field that's an insert timestamp; those two fields would correlate highly, and you should even be able to correlate the two histograms. That would allow you to infer that most of the insert times for _id's between 100 and 200 will be between 2008-01-01 00:10 and 2008-01-01 00:20, for example.

--
Decibel!, aka Jim C. Nasby, Database Architect [EMAIL PROTECTED]
Give your computer some brain candy! www.distributed.net Team #1828
Re: [PERFORM] Filesystem benchmarking for pg 8.3.3 server
On Wed, 13 Aug 2008, Ron Mayer wrote:

> I assume test_fsync in the postgres source distribution is a decent
> way to see?

Not really. It takes too long (runs too many tests you don't care about) and doesn't spit out the results the way you want them--TPS, not average time. You can do it with pgbench (scale here really doesn't matter):

$ cat insert.sql
\set nbranches :scale
\set ntellers 10 * :scale
\set naccounts 10 * :scale
\setrandom aid 1 :naccounts
\setrandom bid 1 :nbranches
\setrandom tid 1 :ntellers
\setrandom delta -5000 5000
BEGIN;
INSERT INTO history (tid, bid, aid, delta, mtime)
  VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
END;

$ createdb pgbench
$ pgbench -i -s 20 pgbench
$ pgbench -f insert.sql -s 20 -c 1 -t 1 pgbench

You don't really need to ever rebuild that just to run more tests if all you care about is the fsync speed (no indexes in the history table to bloat or anything).

Or you can measure with sysbench; http://www.mysqlperformanceblog.com/2006/05/03/group-commit-and-real-fsync/ goes over that, but they don't have the syntax exactly right. Here's an example that works:

:~/sysbench-0.4.8/bin/bin$ ./sysbench run --test=fileio \
    --file-fsync-freq=1 --file-num=1 --file-total-size=16384 \
    --file-test-mode=rndwr

> But I still am looking for any evidence that there were any widely
> shipped SATA (or even IDE) drives that were at fault, as opposed to
> filesystem bugs and poor settings of defaults.

Alan Cox claims that until circa 2001, the ATA standard didn't require implementing the cache flush call at all. See http://www.kerneltraffic.org/kernel-traffic/kt20011015_137.html Since firmware is expensive to write and manufacturers are generally lazy here, I'd bet a lot of disks from that era were missing support for the call. Next time I'm digging through my disk graveyard I'll try and find such a disk. If he's correct that the standard changed around then, you wouldn't expect any recent drive to not support the call.

I feel it's largely irrelevant that most drives handle things just fine nowadays if you send them the correct flush commands, because there are so many other things that can make the system as a whole not work right. Even if the flush call works most of the time, disk firmware is turning increasingly into buggy software, and attempts to reduce how much of that firmware you're actually using can be viewed as helpful. This is why I usually suggest just turning the individual drive caches off; the caveats for when they might work fine in this context are just too numerous.

--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD
Re: [PERFORM] Incorrect estimates on correlated filters
On Wed, Aug 13, 2008 at 10:59 AM, Decibel! [EMAIL PROTECTED] wrote:

> On Aug 12, 2008, at 4:59 PM, Chris Kratz wrote:
>> [snip]
>> The only solution I've come up with so far is to materialize portions
>> of the larger query into subqueries with these correlated filters,
>> which are indexed and analyzed before joining into the larger query.
>> Are there any other solutions to this problem?
>
> Well... you could try and convince certain members of the community
> that we actually do need some kind of a query hint mechanism... ;)
>
> I did make a suggestion a few months ago that involved sorting a table
> on different columns and recording the correlation of other columns.

Thanks for the reply.

Yes, I know hints are frowned upon around here. Though I'd love to have them, or something equivalent, on this particular query just so the customer can run their important reports. As it is, it's unrunnable.

Unfortunately, I don't think the sorting idea would help in the one case I'm looking at, which involves filters on two tables that are joined together. The filters happen to be correlated such that about 95% of the rows from each filtered table are actually returned after the join. Unfortunately, the planner thinks we will get 1 row back.

I do have to find a way to make these queries runnable. I'll keep looking.

Thanks,
-Chris
Re: [PERFORM] Filesystem benchmarking for pg 8.3.3 server
13 aug 2008 kl. 17.13 skrev Decibel!:

> Additionally, you need to be careful of what size writes you're using.
> If you're doing random writes that perfectly align with the raid
> stripe size, you'll see virtually no RAID5 overhead, and you'll get
> the performance of N-1 drives, as opposed to RAID10 giving you N/2.

But it still needs to do 2 reads and 2 writes for every write, correct?

I did some bonnie++ tests just to give some new, more reasonable numbers. This is with RAID10 on 4 SAS 15k drives with write-back cache.

Version 1.03b       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
safecube04   32136M 73245  95 213092 16  89456 11  64923 81  219341 16 839.9  1
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16  6178  99 +++++ +++ +++++ +++  6452 100 +++++ +++ 20633  99

safecube04,32136M,73245,95,213092,16,89456,11,64923,81,219341,16,839.9,1,16,6178,99,+++++,+++,+++++,+++,6452,100,+++++,+++,20633,99
Re: [PERFORM] Incorrect estimates on correlated filters
Chris Kratz wrote:
> Unfortunately, I don't think the sorting idea would help in the one
> case I'm looking at, which involves filters on two tables that are
> joined together. The filters happen to be correlated such that about
> 95% of the rows from each filtered table are actually returned after
> the join. Unfortunately, the planner thinks we will get 1 row back.

Maybe you can wrap that part of the query in a SQL function and set its estimated cost to the real values with ALTER FUNCTION ... ROWS.

--
Alvaro Herrera                          http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
[PERFORM] autovacuum: use case for independent TOAST table autovac settings
Hackers and PG users,

Does anyone see a need for having TOAST tables be individually configurable for autovacuum? I've finally come around to looking at being able to use ALTER TABLE for autovacuum settings, and I'm wondering if we need to support that case.

--
Alvaro Herrera                          http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Re: [PERFORM] Filesystem benchmarking for pg 8.3.3 server
Scott Marlowe wrote:
> IDE came up corrupted every single time.

Greg Smith wrote:
> you've drank the kool-aid ... completely ridiculous ... unsafe fsync
> ... md0 RAID-1 array (aren't there issues with md and the barriers?)

Alright - I'll eat my words. Or mostly. I still haven't found IDE drives that lie; but after the testing I've done today, I'm starting to think that:

1a) ext3 fsync() seems to lie badly.
1b) but ext3 can be tricked not to lie (but not in the way you might think).
2a) md raid1 fsync() sometimes doesn't actually sync.
2b) I can't trick it not to.
3a) some IDE drives don't even pretend to support letting you know when their cache is flushed.
3b) but the kernel will happily tell you about any such devices, as well as about md raid ones.

In more detail: I tested on a number of systems and disks, including new (this year) and old (1997) IDE drives, and ext3 with and without the barrier=1 mount option.

First off - some IDE drives don't even support the relatively recent ATA command that apparently lets the software know when a cache flush is complete. Apparently on those you will get messages in your system logs:

  % dmesg | grep 'disabling barriers'
  JBD: barrier-based sync failed on md1 - disabling barriers
  JBD: barrier-based sync failed on hda3 - disabling barriers

and

  % hdparm -I /dev/hdf | grep FLUSH_CACHE_EXT

will not show you anything on those devices. IMHO that's cool, and doesn't count as a lying IDE drive since it didn't claim to support this.

Second of all - ext3 fsync() appears to me to be *extremely* stupid. It only seems to correctly do the flushing (and waiting) for a drive's cache when a file's inode has changed. For example, in the test program below, it will happily do a real fsync (i.e. the program takes a couple seconds to run) so long as I have the fchmod() statements in there. It will *NOT* wait on my system if I comment those fchmod()'s out.

Sadly, I get the same behavior with and without the ext3 barrier=1 mount option. :(

==
/*
** based on http://article.gmane.org/gmane.linux.file-systems/21373
** http://thread.gmane.org/gmane.linux.kernel/646040
*/
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
  if (argc < 2) {
    printf("usage: fs <filename>\n");
    exit(1);
  }
  int fd = open(argv[1], O_RDWR | O_CREAT | O_TRUNC, 0666);
  int i;
  for (i = 0; i < 100; i++) {
    char byte;
    pwrite(fd, &byte, 1, 0);
    fchmod(fd, 0644);
    fchmod(fd, 0664);
    fsync(fd);
  }
}
==

Since it does indeed wait when the inode's touched, I think this suggests that it's not the hard drive that's lying, but rather ext3. So I take back what I said about linux and write barriers being sane. They're not.

But AFAICT, all the (6 different) IDE drives I've seen work as advertised, and the kernel happily spews boot messages when it finds one that doesn't support knowing when a cache flush finished.
Re: [PERFORM] autovacuum: use case for independent TOAST table autovac settings
Alvaro Herrera [EMAIL PROTECTED] writes:
> Does anyone see a need for having TOAST tables be individually
> configurable for autovacuum? I've finally come around to looking at
> being able to use ALTER TABLE for autovacuum settings, and I'm
> wondering if we need to support that case.

It seems like we'll want to do it somehow. Perhaps the cleanest way is to incorporate toast-table settings in the reloptions of the parent table. Otherwise dump/reload is gonna be a mess.

regards, tom lane
Re: [HACKERS] [PERFORM] autovacuum: use case for independent TOAST table autovac settings
Tom Lane wrote:
> Alvaro Herrera [EMAIL PROTECTED] writes:
>> Does anyone see a need for having TOAST tables be individually
>> configurable for autovacuum?
>
> It seems like we'll want to do it somehow. Perhaps the cleanest way is
> to incorporate toast-table settings in the reloptions of the parent
> table. Otherwise dump/reload is gonna be a mess.

Yeah, Magnus was suggesting this syntax:

  ALTER TABLE foo SET toast_autovacuum_enable = false;

and the like. My question is whether there is interest in actually having support for this, or should we just inherit the settings from the main table. My gut feeling is that this may be needed in some cases, but perhaps I'm overengineering the thing.

--
Alvaro Herrera                          http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] [PERFORM] autovacuum: use case for independent TOAST table autovac settings
Alvaro Herrera [EMAIL PROTECTED] writes:
> Tom Lane wrote:
>> It seems like we'll want to do it somehow. Perhaps the cleanest way
>> is to incorporate toast-table settings in the reloptions of the
>> parent table. Otherwise dump/reload is gonna be a mess.
>
> My question is whether there is interest in actually having support
> for this, or should we just inherit the settings from the main table.
> My gut feeling is that this may be needed in some cases, but perhaps
> I'm overengineering the thing.

It seems reasonable to inherit the parent's settings by default, in any case. So you could do that now and then extend the feature later if there's real demand.

regards, tom lane
Re: [PERFORM] Incorrect estimates on correlated filters
Decibel! wrote:
> Well... you could try and convince certain members of the community
> that we actually do need some kind of a query hint mechanism... ;)

It strikes me that there are really two types of query hint possible here. One tells the planner (eg) "prefer a merge join here". The other gives the planner more information that it might not otherwise have to work with, so it can improve its decisions: "the values used in this join condition are highly correlated".

Is there anything wrong with the second approach? It shouldn't tend to suppress planner bug reports etc. Well, not unless people use it to lie to the planner, and I expect results from that would be iffy at best. It just provides information to supplement Pg's existing stats system to handle cases where it's not able to reasonably collect the required information.

--
Craig Ringer