Re: [PERFORM] Minimum hardware requirements for Postgresql db
scott.marlowe [EMAIL PROTECTED] writes:

>> 3) Estimated number of transactions to be written into the Postgresql db is
>> around 15000 records per day. The growth rate in terms of number of
>> connections is around 10% per year and the data retention is kept on average
>> at least for 18 months for the 2 databases.

> Like another poster pointed out, this is a walk in the park for postgresql.
> My workstation (1.1GHz Celeron, 40 gig IDE drive, 512 MB memory) could handle
> this load while still being my workstation.

Well, there's some info missing, like what you would actually be _doing_ with these data. 15,000 inserts per day is nothing, but after 18 months that's over 5M records, not including the 10% growth rate. 5M records isn't really all that much, but it's enough that it's possible to write slow queries against it. If you're doing big batch updates or complex reports against the data, that will be more interesting than the inserts.

-- greg

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your message
can get through to the mailing list cleanly
Re: [PERFORM] Update performance ... Recommended configuration changes?
Thanks to Greg Stark, Tom Lane and Stephan Szabo for their advice on rewriting my query... the revised query plan claims it should only take about half the time my original query did.

Now for a somewhat different question: how might I improve my DB performance by adjusting the various parameters in postgresql.conf and the kernel config? Again, TKA.

Here's what I've currently got (hardware, kernel config, and postgresql.conf):

Hardware: Mac iBook, G3 900MHz, 640MB memory (this is my research machine :p )
OS: OS X 10.2.6
Postgresql version: 7.3.2

Kernel config:
sysctl -w kern.sysv.shmmax=4194304
sysctl -w kern.sysv.shmmin=1
sysctl -w kern.sysv.shmmni=32
sysctl -w kern.sysv.shmseg=8
sysctl -w kern.sysv.shmall=1024

===== Snip of postgresql.conf =====
#
# Shared Memory Size
#
shared_buffers = 128            # min max_connections*2 or 16, 8KB each
max_fsm_relations = 2000        # min 10, fsm is free space map, ~40 bytes
max_fsm_pages = 2               # min 1000, fsm is free space map, ~6 bytes
max_locks_per_transaction = 128 # min 10
wal_buffers = 16                # min 4, typically 8KB each
#
# Non-shared Memory Sizes
#
sort_mem = 65535                # min 64, size in KB
vacuum_mem = 8192               # min 1024, size in KB
#
# Write-ahead log (WAL)
#
#checkpoint_segments = 3        # in logfile segments, min 1, 16MB each
#checkpoint_timeout = 300       # range 30-3600, in seconds
#
#commit_delay = 0               # range 0-100000, in microseconds
#commit_siblings = 5            # range 1-1000
#
fsync = false
#wal_sync_method = fsync        # the default varies across platforms:
#                               # fsync, fdatasync, open_sync, or open_datasync
#wal_debug = 0                  # range 0-16
===== End Snip =====

Saludos,
Erik Norvelle
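One sanity check worth doing on a config like this (a rough sketch, using the per-unit sizes from the postgresql.conf comments above and ignoring fixed overhead): the shared memory Postgres allocates at startup must fit under kern.sysv.shmmax.

```python
# Rough lower bound on Postgres shared memory demand from the settings
# above; real usage adds fixed overhead not counted here.
shmmax = 4_194_304                 # kern.sysv.shmmax, in bytes (4 MB)
shared_buffers = 128 * 8192        # 8 KB each, per the conf comment
wal_buffers = 16 * 8192            # typically 8 KB each
fsm = 2000 * 40 + 2 * 6            # ~40 bytes/relation, ~6 bytes/page
approx = shared_buffers + wal_buffers + fsm
print(approx, approx < shmmax)     # current settings fit under 4 MB
```

Note that raising shared_buffers substantially (1500 buffers is about 12 MB on its own) would require raising shmmax as well.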
Re: [PERFORM] Update performance ... Recommended configuration
> shared_buffers = 128            # min max_connections*2 or 16, 8KB each

Try 1500.

> sort_mem = 65535                # min 64, size in KB

I'd pull this in. You only have 640MB of RAM, which means only about 8 large sorts before you're swapping. How about 16000?

> fsync = false

I presume you understand the risks involved with this setting and data loss.
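The arithmetic behind that sort_mem advice, as a sketch (sort_mem is allocated per sort operation, in KB, and the machine has 640MB):

```python
# How many concurrent large sorts fit in RAM at each setting.
# Ignores memory used by the OS and Postgres itself, so real
# headroom is smaller than these figures suggest.
ram_kb = 640 * 1024
print(ram_kb // 65535)   # current setting: ~10 sorts fill all of RAM
print(ram_kb // 16000)   # suggested 16000: ~40 sorts
```

With the OS and shared buffers taking their share, the practical limit at 65535 is closer to the "about 8" quoted above.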
[PERFORM] Has anyone run on the new G5 yet
To all,

We are building a data warehouse composed of essentially click-stream data. The DB is growing fairly quickly, as is to be expected: currently at 90GB for one month's data. The idea is to keep 6 months of detailed data online and then start aggregating older data into summary tables. We have 2 fact tables currently, one with about 68 million rows and the other with about 210 million rows, plus numerous dimension tables ranging from a dozen rows to millions.

We are currently running on a Dell 2650 with 2 Xeon 2.8 processors in hyper-threading mode, 4GB of RAM, and 5 SCSI drives in a RAID 0, Adaptec PERC3/Di, configuration. I believe they are 10k drives. File system is EXT3. We are running RH9, Linux kernel 2.4.20-20.9SMP with bigmem turned on. This box is used only for the warehouse; all the ETL work is done on this machine as well. DB version is postgreSQL 7.4.

We are running into issues with IO saturation, obviously. Since this thing is only going to get bigger, we are looking for some advice on how to accommodate DBs of this size. First question: do we gain anything by moving to the RH Enterprise version of Linux in terms of performance, mainly in the IO realm, since we are not CPU bound at all? Second, and more radical: has anyone run postgreSQL on the new Apple G5 with an XRaid system? This seems like a great value combination: fast CPU, wide bus, Fibre Channel IO, 2.5TB, all for ~17k.

I keep seeing references to terabyte postgreSQL installations; I was wondering if anyone on this list is in charge of one of those and can offer some advice on how to position ourselves hardware-wise.

Thanks.

--sean
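A quick capacity sketch from the numbers above (this assumes the 90GB/month rate stays flat, which the post suggests it won't):

```python
# Detailed data kept online under the stated 6-month retention plan.
gb_per_month = 90
months_online = 6
print(gb_per_month * months_online)   # 540 GB of detail data online
```

That is before indexes on the summary tables and any growth in the monthly rate, so planning for well over half a terabyte of live data seems prudent.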
[PERFORM] sequence overhead
Just wondering if anyone has done any testing on the amount of insert overhead you might incur by adding a serial column to a table. I'm thinking of adding a few to some tables that get an average of 30-40 inserts per second, sometimes bursting over 100 inserts per second, and wondering if there will be any noticeable impact.

Robert Treat
--
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
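For context, a serial column is just an integer whose default draws from a sequence, so the per-insert cost is essentially one nextval() call. A minimal sketch of what it expands to (the table and sequence names here are hypothetical):

```sql
-- Adding a column "id" of type serial to a table "events" is
-- roughly equivalent to:
CREATE SEQUENCE events_id_seq;
ALTER TABLE events ADD COLUMN id integer
  DEFAULT nextval('events_id_seq');
```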
Re: [PERFORM] Has anyone run on the new G5 yet
> We are running into issues with IO saturation obviously. Since this thing is
> only going to get bigger we are looking for some advice on how to
> accommodate DB's of this size.

snip

> Second and more radical, has anyone run postgreSQL on the new Apple G5 with
> an XRaid system? This seems like a great value combination. Fast CPU, wide
> bus, Fibre Channel IO, 2.5TB all for ~17k.

snip

If you are going for I/O performance, you are best off with one of the Xserve competitors listed at http://www.apple.com/xserve/raid/. The Xserve is based on IDE drives, which have a higher seek time (say 8.9 ms) than SCSI (3.6 ms for a Seagate Cheetah). For small random read/write operations (like databases), SCSI will give you a noticeable improvement in performance over IDE drives. Also make sure to get as many drives as possible; more spindles equals better I/O performance.

> I keep seeing references to terabyte postgreSQL installations, I was
> wondering if anyone on this list is in charge of one of those and can offer
> some advice on how to position ourselves hardware wise.

I've gone to about half-terabyte size, and all I can say is you should plan for at least one quarter to one half of a rack of drive space (assuming 14 drives per 4U, that's 42 to 84 drives). Do yourself a favor and get more rather than less; you will really appreciate it.

I averaged about 2 MB/s per drive (via the RAID controller stats) on a 14-drive array during I/O-bound seek and update operations, in 2 RAID 10 arrays (half xlogs and half data). That comes out to around 2 hours for a terabyte with 70 drives, assuming constant scaling. You may be able to get more or less depending on your setup and query workload.
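The throughput claim above works out as follows (a sketch; it takes the ~2 MB/s-per-drive average at face value and assumes linear scaling, as the post itself cautions):

```python
# Time to stream one terabyte through 70 drives at 2 MB/s each.
aggregate_mb_s = 2 * 70            # 140 MB/s combined
seconds = (1024 * 1024) / aggregate_mb_s
print(round(seconds / 3600, 1))    # about 2.1 hours
```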
Re: [PERFORM] A question on the query planner
Tom Lane wrote:
> Greg Stark [EMAIL PROTECTED] writes:
>> Tom Lane [EMAIL PROTECTED] writes:
>>> Define no longer works well.
>>
>> Well it seems to completely bar the use of a straight merge join between
>> two index scans:
>
> Hmmm ... [squints] ... it's not supposed to do that ... [digs] ... yeah,
> there's something busted here. Will get back to you ...

LOL, but I am not sure why. :-)

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  [EMAIL PROTECTED]                    |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Re: [PERFORM] Has anyone run on the new G5 yet
Sean Shanny wrote:
> We are currently running on a Dell 2650 with 2 Xeon 2.8 processors in
> hyper-threading mode, 4GB of ram, and 5 SCSI drives in a RAID 0, Adaptec
> PERC3/Di, configuration. I believe they are 10k drives. File system is
> EXT3. We are running RH9 Linux kernel 2.4.20-20.9SMP with bigmem turned on.
> This box is used only for the warehouse. All the ETL work is done on this
> machine as well. DB version is postgreSQL 7.4.

Are you experiencing any improvement from using the hyper-threading?

Regards
Gaetano Mendola