Re: [PERFORM] PostgreSQL clustering VS MySQL clustering
fsync on.

Alex Turner
NetEconomist

On Fri, 28 Jan 2005 11:19:44 -0500, Merlin Moncure [EMAIL PROTECTED] wrote:
> > With the right configuration you can get very serious throughput. The
> > new system is processing over 2500 insert transactions per second. We
> > don't need more RAM with this config. The disks are fast enough.
>
> 2500 transactions/second is pretty damn fast. fsync on/off?
>
> Merlin

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] Automagic tuning
Hi, Christopher,

Christopher Kings-Lynne wrote:
> > Are there any tools that help with postgres/postgis performance
> > tuning? So they measure the actual tuple costs and CPU power, or
> > suggest optimal values for the index sample counts?
>
> Have you turned on the stat_* settings in postgresql.conf and then
> examined the pg_stat_* system views?

As far as I have examined, those views only count things like fetched rows and pages, and cache hits. I would like something that really measures values like random_page_cost or cpu_tuple_cost, which are hardware dependent. I assume such a thing does not exist?

Markus

--
markus schaber | dipl. informatiker
logi-track ag | rennweg 14-16 | ch 8001 zürich
phone +41-43-888 62 52 | fax +41-43-888 62 53
mailto:[EMAIL PROTECTED] | www.logi-track.com
Re: [PERFORM] Possibly slow query
Manfred,

Yeah, that was a typo. It should have been ASet.Value IS NULL. I have considered storing the setting names by key, since I do have a separate table with the names and a key as you suggest, but since my application is only ~75% finished, it's still pretty important to have human readable/editable tables.

Thanks,
Peter Darley

-----Original Message-----
From: Manfred Koizar [mailto:[EMAIL PROTECTED]]
Sent: Monday, January 31, 2005 3:06 AM
To: Peter Darley
Cc: Richard Huxton; Pgsql-Performance
Subject: Re: [PERFORM] Possibly slow query

On Wed, 26 Jan 2005 07:16:25 -0800, Peter Darley [EMAIL PROTECTED] wrote:
> SELECT User_ID
>   FROM Assignments A
>   NATURAL LEFT JOIN (SELECT * FROM Assignment_Settings
>                       WHERE Setting='Status') ASet
>  WHERE A.User_ID IS NOT NULL AND ASet.Assignment_ID IS NULL
>  GROUP BY User_ID;

ASet.Assignment_ID IS NULL and Value IS NULL, as you had in your original post, don't necessarily result in the same set of rows.

SELECT DISTINCT a.User_ID
  FROM Assignments a
  LEFT JOIN Assignment_Settings s
       ON (a.Assignment_ID = s.Assignment_ID AND s.Setting = 'Status')
 WHERE a.User_ID IS NOT NULL
   AND s.Value IS NULL;

Note how the join condition can contain subexpressions that only depend on columns from one table.

BTW,
| neo=# \d assignment_settings
| [...]
|  setting | character varying(250) | not null
| [...]
| Indexes:
| [...]
|     assignment_settings_assignment_id_setting unique, btree (assignment_id, setting)

storing the setting names in their own table and referencing them by id might speed up some queries (and slow down others). Certainly worth a try ...

Servus
Manfred
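Manfred's point — that filtering in the join condition and testing the joined table's value for NULL is not the same as testing its key for NULL — can be checked with a small script. This is a sketch using Python's sqlite3 as a stand-in for PostgreSQL; the table contents are invented to show the difference:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.executescript("""
    CREATE TABLE assignments (assignment_id INTEGER, user_id INTEGER);
    CREATE TABLE assignment_settings
        (assignment_id INTEGER, setting TEXT, value TEXT);
    -- assignment 1 has a 'Status' row whose value is NULL;
    -- assignment 2 has no 'Status' row at all
    INSERT INTO assignments VALUES (1, 10), (2, 20);
    INSERT INTO assignment_settings VALUES (1, 'Status', NULL);
""")

# Filter on Setting inside the join condition: rows with no match survive
# the LEFT JOIN with s.value = NULL, and so does the row whose stored
# value is NULL -- BOTH users are returned.
join_cond = cur.execute("""
    SELECT DISTINCT a.user_id
      FROM assignments a
      LEFT JOIN assignment_settings s
           ON a.assignment_id = s.assignment_id AND s.setting = 'Status'
     WHERE a.user_id IS NOT NULL AND s.value IS NULL
     ORDER BY a.user_id
""").fetchall()

# Testing the joined key (s.assignment_id IS NULL) instead only finds
# users with *no* 'Status' row; a stored NULL value no longer qualifies.
null_key = cur.execute("""
    SELECT DISTINCT a.user_id
      FROM assignments a
      LEFT JOIN assignment_settings s
           ON a.assignment_id = s.assignment_id AND s.setting = 'Status'
     WHERE a.user_id IS NOT NULL AND s.assignment_id IS NULL
     ORDER BY a.user_id
""").fetchall()
```

Here join_cond returns both users while null_key returns only the user whose assignment has no 'Status' row at all — exactly the discrepancy Manfred warned about.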
Re: [PERFORM] PostgreSQL clustering VS MySQL clustering
On Friday 21 January 2005 19:18, Marty Scholes wrote:
> The indexes can be put on a RAM disk tablespace and that's the end of
> index problems -- just make sure you have enough memory available. Also
> make sure that the machine can restart correctly after a crash: the
> tablespace is dropped and recreated, along with the indexes. This will
> cause a machine restart to take some time.

Tell me if I am wrong, but it sounds to me like an endless problem. This solution may work with small indexes (less than 4 GB), but what happens when the indexes grow? You would add more memory to your server? But there will be a moment when you cannot add more, so what's next?
Re: [PERFORM] Automagic tuning
Markus,

> As far as I examined, those views only count several things like
> fetched rows and pages, and cache hits. I would like something that
> really measures values like random_page_cost or cpu_tuple_cost that
> are hardware dependent. I assume such thing does not exist?

Nope. You gotta whip out your calculator and run some queries.

--
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco
Re: [PERFORM] Automagic tuning
Josh Berkus josh@agliodbs.com writes:
> > I would like something that really measures values like
> > random_page_cost or cpu_tuple_cost that are hardware dependent. I
> > assume such thing does not exist?
>
> Nope. You gotta whip out your calculator and run some queries.

Preferably a whole lot of queries. All the measurement techniques I can think of are going to have a great deal of noise, so you shouldn't twiddle these cost settings based on just a few examples.

			regards, tom lane
[PERFORM] High end server and storage for a PostgreSQL OLTP system
Hi all,

I've been following this list for nearly a year now. I've always managed to get PostgreSQL 7.1.x right for the job, which in my case is a large and complex OLTP system, run under Pg for 6 years now. We were already planning the switch from 7.1 to 7.4 (or even 8.0). The latest project we're facing has a transaction volume we've never dealt with before. By transaction I mean something involving 10 to 10,000 (and more) SQL queries (a complex mix of insert/update/delete/select).

I'd like to ask:

1) What kind of performance gain can I expect switching from 7.1 to 7.4 (or 8.0)? Obviously I'm doing my own testing, but I'm not very impressed by 8.0's speed; maybe I'm testing on a low-end server...

2) The goal is to make the db handle 100 tps (something like 100 users). What kind of server and storage should I provide? The servers our application currently runs on normally have 2 Intel Xeon processors, 2-4 GB RAM, and RAID 0/1/5 SCSI disk storage with hard drives at 10,000 rpm.

3) Highest-I/O-throughput SCSI adapters? Adaptec?

4) Is it correct to suppose that multiple RAID 1 arrays can provide the fastest I/O? I usually reserve one RAID 1 array for the db data directory, one RAID 1 array for the pg_xlog directory, and one RAID 1 array for OS and application needs.

5) OS and Pg-specific tuning? Usually I modify shared memory settings and most of the postgresql.conf settings available in 7.1, like `effective_cache', `shared_buffers', `wal_buffers', `wal_files', and so on.

--
Cosimo
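On question 5, a sketch of the 8.0-era postgresql.conf knobs most often discussed on this list. The values below are illustrative guesses for a dedicated ~4 GB machine, not measured recommendations; anyone migrating from 7.1 should test against their own workload:

```ini
# Illustrative starting points only -- tune against your own workload.
# Assumes PostgreSQL 8.0 on a dedicated box with ~4 GB RAM.
shared_buffers = 20000           # ~160 MB; counted in 8 kB pages
work_mem = 8192                  # kB per sort/hash; replaces 7.x sort_mem
maintenance_work_mem = 131072    # speeds up VACUUM and CREATE INDEX
effective_cache_size = 262144    # ~2 GB; tells the planner about OS cache
checkpoint_segments = 16         # fewer, larger checkpoints for write bursts
wal_buffers = 64                 # modest bump over the small default
```

Note that 8.0 renamed or dropped some 7.1 settings (sort_mem became work_mem, for instance), so a 7.1 config file cannot simply be carried over.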
[PERFORM] Very important choice
Hello my friends,

I'd like to know (based on your experience and technical details) which OS is recommended for running PostgreSQL, keeping in mind 3 indicators:

1 - Performance (OS, network, and I/O)
2 - OS stability
3 - File system integrity

Comparisons between Slackware, Gentoo, and FreeBSD are welcome. Which file system has the best performance and integrity: XFS (Linux) or UFS (FreeBSD)? I've read that UFS is not a journaling FS. Is this right? How much does this difference affect performance and integrity? I don't have experience with FreeBSD, so I'd like to know if it is possible to run XFS on FreeBSD 5.3.

Thank you,
Bruno Almeida do Lago
Re: [PERFORM] Automagic tuning
On Mon, Jan 31, 2005 at 03:26:12PM -0500, Tom Lane wrote:
> Josh Berkus josh@agliodbs.com writes:
> > > I would like something that really measures values like
> > > random_page_cost or cpu_tuple_cost that are hardware dependent. I
> > > assume such thing does not exist?
> >
> > Nope. You gotta whip out your calculator and run some queries.
>
> Preferably a whole lot of queries. All the measurement techniques I can
> think of are going to have a great deal of noise, so you shouldn't
> twiddle these cost settings based on just a few examples.

Are there any examples of how you can take numbers from pg_stats_* or EXPLAIN ANALYZE and turn them into configuration settings (such as random_page_cost)?

--
Jim C. Nasby, Database Consultant [EMAIL PROTECTED]
Give your computer some brain candy! www.distributed.net Team #1828

Windows: Where do you want to go today?
Linux: Where do you want to go tomorrow?
FreeBSD: Are you guys coming, or what?
Re: [PERFORM] High end server and storage for a PostgreSQL OLTP system
On Mon, Jan 31, 2005 at 09:41:32PM +0100, Cosimo Streppone wrote:
> 2) The goal is to make the db handle 100 tps (something like 100
> users). What kind of server and storage should I provide? The actual
> servers our application runs on normally have 2 Intel Xeon processors,
> 2-4 GB RAM, RAID 0/1/5 SCSI disk storage with hard drives @ 10,000 rpm

You might look at Opterons, which theoretically have higher data bandwidth. If you're doing anything data intensive, like a sort in memory, this could make a difference.

> 4) Is it correct to suppose that multiple RAID 1 arrays can provide
> the fastest I/O? I usually reserve one RAID 1 array for the db data
> directory, one RAID 1 array for the pg_xlog directory and one RAID 1
> array for OS and application needs.

RAID 10 will be faster than RAID 1. The key factor in a high-performance database is a high-performance I/O system. If you look in the archives you'll find people running PostgreSQL on 30- and 40-drive arrays.

--
Jim C. Nasby, Database Consultant [EMAIL PROTECTED]
Re: [PERFORM] Automagic tuning
Jim C. Nasby [EMAIL PROTECTED] writes:
> On Mon, Jan 31, 2005 at 03:26:12PM -0500, Tom Lane wrote:
> > Preferably a whole lot of queries. All the measurement techniques I
> > can think of are going to have a great deal of noise, so you
> > shouldn't twiddle these cost settings based on just a few examples.
>
> Are there any examples of how you can take numbers from pg_stats_* or
> EXPLAIN ANALYZE and turn them into configuration settings (such as
> random_page_cost)?

Well, the basic idea is to adjust random_page_cost so that the ratio of estimated cost to real elapsed time (as shown by EXPLAIN ANALYZE) is the same for seqscans and indexscans. What you have to watch out for is that the estimated cost model is oversimplified and doesn't take into account a lot of real-world factors, such as the activity of other concurrent processes. The reason for needing a whole lot of tests is essentially to try to average out the effects of those unmodeled factors, so that you have a number that makes sense within the planner's limited view of reality.

			regards, tom lane
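Tom's ratio-matching idea can be turned into a rough back-of-the-envelope estimator. This is a hedged sketch, not a real tool: it assumes random_page_cost scales the index plan's cost roughly linearly (it doesn't entirely, since other cost parameters contribute too), and the EXPLAIN ANALYZE numbers below are invented:

```python
def suggest_random_page_cost(current_rpc,
                             seq_est_cost, seq_ms,
                             idx_est_cost, idx_ms):
    """Rescale random_page_cost so that estimated-cost per elapsed
    millisecond matches between a forced seqscan and a forced indexscan
    of the same query (numbers read off EXPLAIN ANALYZE output)."""
    seq_ratio = seq_est_cost / seq_ms  # cost units per ms, sequential plan
    idx_ratio = idx_est_cost / idx_ms  # cost units per ms, index plan
    # If the index plan is overcharged relative to the seqscan
    # (idx_ratio > seq_ratio), random pages cost too much: scale down.
    return current_rpc * seq_ratio / idx_ratio

# Invented numbers: seqscan estimated at 1000 cost units, ran in 100 ms;
# indexscan estimated at 800 units, ran in 40 ms; random_page_cost = 4.0.
suggestion = suggest_random_page_cost(4.0, 1000.0, 100.0, 800.0, 40.0)
# 4.0 * (1000/100) / (800/20... i.e. 4.0 * 10 / 20 = 2.0
```

Per Tom's caveat, a single pair of runs is dominated by noise and unmodeled factors; you would average this suggestion over many representative queries before touching the setting.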
Re: [PERFORM] Index Slowing Insert 50x
Trevor Ball [EMAIL PROTECTED] writes:
> ... it doesn't seem reasonable to me that an index would slow an
> insert more than 50-fold, regardless of hardware or the nature of the
> index.

Seems pretty slow to me too. Can you provide a self-contained test case?

One possibility is that depending on your platform and locale setting, varchar comparisons can be a whole lot slower than a normal person would consider sane. If you're not using C locale, you might try C locale and see if it helps.

			regards, tom lane
Re: [PERFORM] Automagic tuning
On Tue, Feb 01, 2005 at 12:06:27AM -0500, Tom Lane wrote:
> > Are there any examples of how you can take numbers from pg_stats_* or
> > EXPLAIN ANALYZE and turn them into configuration settings (such as
> > random_page_cost)?
>
> Well, the basic idea is to adjust random_page_cost so that the ratio of
> estimated cost to real elapsed time (as shown by EXPLAIN ANALYZE) is
> the same for seqscans and indexscans. What you have to watch out for is
> that the estimated cost model is oversimplified and doesn't take into
> account a lot of real-world factors, such as the activity of other
> concurrent processes. The reason for needing a whole lot of tests is
> essentially to try to average out the effects of those unmodeled
> factors, so that you have a number that makes sense within the
> planner's limited view of reality.

Given that, I guess the next logical question is: what would it take to collect stats on queries so that such an estimate could be made? And would it be possible/make sense to gather stats useful for tuning the other parameters?

--
Jim C. Nasby, Database Consultant [EMAIL PROTECTED]
Re: [PERFORM] High end server and storage for a PostgreSQL OLTP system
Tom Lane wrote:
> Cosimo writes:
> > 1) What kind of performance gain can I expect switching from 7.1 to
> > 7.4 (or 8.0)? Obviously I'm doing my own testing, but I'm not very
> > impressed by 8.0 speed, may be I'm doing testing on a low end
> > server...
>
> Most people report a noticeable speedup in each new release [...] I'm
> surprised that you're not seeing any gain at all. What was your test
> case exactly? Have you perhaps tuned your app so specifically to 7.1
> that you need to detune it?

We tend to use the lowest common SQL features that will allow us to work with any db, so probably the problem is the opposite: there is no pg-specific overtuning. Also, the real pg load, which would be my ideal test case, is somewhat difficult to reproduce (~50 users with handhelds and browser clients). Another good test is a particular procedure that opens several (~1000) subsequent transactions, composed of many repeated selection queries with massive write loads on 6/7 different tables, as big as 300-400k tuples. Every transaction ends in either a commit or a rollback. Indexing here should be ok, for I've analyzed every single query, also under database stress.

Probably one big issue is that I need to vacuum/reindex too often to keep db performance at a good(tm) level. I realize that this has been addressed in several ways in newer PGs. However, I need to do a lot of application and performance tests, and do them more seriously. Then I'll report the results here.

--
Cosimo