Re: [PERFORM] PostgreSQL on VMWare vs Windows vs CoLinux
Vitaly Belman [EMAIL PROTECTED] writes: With all that said, VMWare is badly suited for running a database, while CoLinux can be run as a service (didn't try it yet though), VMWare always sits there, it is slow to go up, slow to go down and generally feels like a system hog. Uhm, it sounds like you're using VMWare Workstation? VMWare has a range of different versions including some that are specifically targeted towards server situations. I think they had the idea that hosting companies would run hundreds of virtual machines on a server and provide their hosting clients with a virtual machine to play with. That said, I'm curious why the emulated servers performed better than the Native Windows port. My first thought is that they probably aren't syncing every write to disk so effectively they're defeating the fsyncs, allowing the host OS to buffer disk writes. I would be curious to see better stats on things like a pgbench run which would give some idea of the context switch efficiency, and a large select or update, which would give some idea of the i/o throughput. Really there's no excuse for the Windows port to be slower than an emulator. Barring effects like the disk caching I mentioned, it should far outpace the emulators. -- greg ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PERFORM] PostgreSQL and Kernel 2.6.x
V i s h a l Kashyap @ [Sai Hertz And Control Systems] wrote: Dear all, Have anyone compiled PostgreSQL with kernel 2.6.x if YES 1. Was their any performance gains Else 1. Is it possible 2. What problems would keeping us away from compiling on kernel 2.6 We run pgsql on 2.6.6 there was upto 30% improvement in performance for certain queries. None, everything works just fine. Regds Mallah. ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [PERFORM] PostgreSQL on VMWare vs Windows vs CoLinux
Using VMware myself quite extensively, I wonder what the disk configuration was that you created for the VM. Where the disks preallocated and did you make sure that they are contiguous on the NTFS filesystem? Did you install the VMware tools in the guest operating system? What did you use to measure the performance? Jan On 6/1/2004 6:56 PM, Vitaly Belman wrote: Hello pgsql-performance, I was using the native windows PostgreSQL 7.5Dev and was adviced by several people to use an emulated PostgreSQL instead, as it is just a beta. Well, I give it a whirl and tried both commercial VMWare and the freeweare open-source CoLinux, both work under Windows and both emulate Linux, that's a quick review of my experience with them, may someone in need learn from it. This might be not the best place for such a post, but since the subject was brought up here, I'll post it here as well. If someone thinks it should be posted somewhere else, let me know. Installation Configuration VMWare: On the bright side, the installation went quite smoothly, VMWare configured all the network stuff by itself and I had no trouble using the net right away. On the grim side, the installation itself took ages, compared to the plug play feel of CoLinux. Installing PostgreSQL on VMWare was quite straightforward, just as the the PostgreSQL documention goes. CoLinux: As I said, with CoLinux the installation itself goes very quickly. To get Linux running you need to download practically less than 20mb which include the distribution (Debian in my case) and the CoLinux setup. Configuring CoLinux took a bit longer than VMWare, yet, not long as I thought it would take. In fact, it can be very easy if you just follow the documention of CoLinux Wiki stuff, there are some very easy to follow tutorials there. Installing PostgreSQL on CoLinux proved a little more difficult (again, Debian), but I posted a quick tutorial that should smooth the process: http://www.colinux.org/wiki/index.php/PostgreSQL. Performance --- This was a totally subjective test (especially since one of the participants is in a beta stage), yet, that's what I tested and that's what I needed to know. To make the test as fair as possible, I did an exact dump of the same database. I ran the SQLs (around 10) in the same order on all of them and repeated the test several times. I also did an EXPLAIN on the queries to make sure all the databases work on the query the same way. It wasn't a full test though, I didn't test mass select load, nor inserts, nor work under heavy load, nor I tried different types of joins. All I did was to run some heavy (in execution time) queries. So you should take these tests just for what they are. That's what I got: The native window port performed poorly lagging 30%-50% behind the VMWare/CoLinux solutions in execution times, rather sad, but not unexpected, I guess. CoLinux and VMWare give AROUND the same results, yet CoLinux did give slightly better performance (I'd say 5%-10%) but with such slight improvement and inconsistency I wouldn't count it as much. Conclusion -- With all that said, VMWare is badly suited for running a database, while CoLinux can be run as a service (didn't try it yet though), VMWare always sits there, it is slow to go up, slow to go down and generally feels like a system hog. I'll go on with CoLinux for now and hope it will act as good as it looks. http://www.vmware.com/ http://www.colinux.org/ Thanks to Bryan and Matthew for their advices regarding the emulations. Regards, Vitaly Belman ICQ: 1912453 AIM: VitalyB1984 MSN: [EMAIL PROTECTED] Yahoo!: VitalyBe ---(end of broadcast)--- TIP 8: explain analyze is your friend -- #==# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #== [EMAIL PROTECTED] # ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] PostgreSQL on VMWare vs Windows vs CoLinux
Greg Stark [EMAIL PROTECTED] writes: That said, I'm curious why the emulated servers performed better than the Native Windows port. My first thought is that they probably aren't syncing every write to disk so effectively they're defeating the fsyncs, allowing the host OS to buffer disk writes. It would be fairly easy to check this by repeating the comparisons with fsync = off in postgresql.conf. A performance number that doesn't change much would be a smoking gun ;-). The native port hasn't had any performance testing done on it yet, and I wouldn't be surprised to hear of a gotcha or two. Perhaps with the recent schedule change there will be some time for performance tuning before we go beta. regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Trigger Function
Title: RE: [PERFORM] Trigger Function Thanks for the response. I was pretty sure it couldn't be done the way I wanted to but felt I would ask anyway. Thanks again, Duane -Original Message- From: Mike Nolan [mailto:[EMAIL PROTECTED]] Sent: Tuesday, June 01, 2004 3:04 PM To: [EMAIL PROTECTED] Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED] Subject: Re: [PERFORM] Trigger Function My problem is I defined the before and after fields in the audit table as TEXT and when I try to move NEW or OLD into these fields I get the error NEW used in query that is not in a rule. You're trying to insert record data into a text field, that doesn't work. OLD and NEW can be used as either record identifiers (as in RETURN OLD) or column qualifiers (as in OLD.colname), but you can't intermingle them. I don't think postgres (pl/pgsql) has row-to-variable and variable-to-row functions like serialize and unserialize, that's probably what you'd need. It would probably be necessary to write something like that in C, since at this point pl/perl cannot be used for trigger functions. I've not tried using pl/php yet, the announcement for it says it can be used for trigger functions. My first thought is that even if there was a serialize/unserialize capabiity you might be able to write something using it that creates the log entry but not anything that allows you to query the log for specific column or row entries. It would probably require a MAJOR extension of SQL to add it to pg, as there would need to be qualifiers that can be mapped to specific tables and columns. Even if we had that, storing values coming from multiple tables into a single audit table would present huge challenges. I've found only two ways to implement audit logs: 1. Have separate log tables that match the structure of the tables they are logging. 2. Write a trigger function that converts columns to something you can store in a common log table. (I've not found a way to do this without inserting one row for each column being logged, though.) -- Mike Nolan
Re: [PERFORM] PostgreSQL on VMWare vs Windows vs CoLinux
I have colinux running on a Fedora Core 1 image. I have the rhdb 3 (or PostgreSQL RedHat Edition 3) on it running. Here are tests with fsync on and off: FSYNC OFF FSYNC ON RUN 136.9142.0 124.5149.1 1 122.1126.7 140.1169.7 2 125.7148.7 147.4180.4 3 103.3136.7 136.8166.3 4 126.5146.1 152.3187.9 5 114.4133.3 144.8176.7 6 124.0146.5 143.3175.0 7 121.7166.8 147.8180.5 8 127.3151.8 146.7180.0 9 124.6143.0 137.2167.5 10 -- 122.7144.2 142.1173.3 AVG I hope those numbers' formatting come through all right. This computer is an AMD Athlon 900MHz with 448MB Ram running XP Pro SP1 This is using Colinux 0.60 (not the recently released 0.61) and 96MB of RAM allocated to linux. The computer was idle but it was running Putty, Excel and Task Manager during the process. (I prefer to use Putty to SSH into the virtual computer than to run the fltk console) It occurs to me that the fsync may be performed to the linux filesystem, but this filesystem is merely a file on the windows drive. Would Windows cache this file? It's 2GB in size, so if it did, it would only be able to cache part of it. I'd like to run a more difficult test personally. It seems like this test goes too fast to be very useful. If someone would like me to try something more specific, e-mail me right away and I'll do it. I must leave my office at 4:15 EDT and will not return until Friday, although I can do another test on my home computer Thursday. Matthew Nuzum | Makers of Elite Content Management System www.followers.net | View samples of Elite CMS in action [EMAIL PROTECTED] | http://www.followers.net/portfolio/ -Original Message- From: [EMAIL PROTECTED] [mailto:pgsql-performance- [EMAIL PROTECTED] On Behalf Of Tom Lane Sent: Wednesday, June 02, 2004 11:25 AM To: Greg Stark Cc: Vitaly Belman; [EMAIL PROTECTED]; Bryan Encina; Matthew Nuzum Subject: Re: [PERFORM] PostgreSQL on VMWare vs Windows vs CoLinux Greg Stark [EMAIL PROTECTED] writes: That said, I'm curious why the emulated servers performed better than the Native Windows port. My first thought is that they probably aren't syncing every write to disk so effectively they're defeating the fsyncs, allowing the host OS to buffer disk writes. It would be fairly easy to check this by repeating the comparisons with fsync = off in postgresql.conf. A performance number that doesn't change much would be a smoking gun ;-). The native port hasn't had any performance testing done on it yet, and I wouldn't be surprised to hear of a gotcha or two. Perhaps with the recent schedule change there will be some time for performance tuning before we go beta. regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [PERFORM] PostgreSQL on VMWare vs Windows vs CoLinux
Hi All, I think it would actually be interesting to see the performance of the Cygwin version for these same benchmarks, then we've covered all ways to run PostgreSQL on Windows systems. (I expect though that performance of Cygwin-PostgreSQL will improve considerably when an updated version is released that uses Cygwin native IPC instead of the ipc-daemon.) regards, --Tim ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
[PERFORM] Context Switching issue: Spinlock doesn't fix.
Folks, I've been testing varying SPINS_PER_DELAY in a client's installation of PostgreSQL against a copy of a production database, to test varying this statistic as a way of fixing the issue. It does not seem to work. I've tested all of the following graduated levels: 100 (the original) 250 500 1000 2000 5000 1 2 3 5 None of these quantities seem to make any difference at all in the number of context switches -- neither down nor up. Seems to me like this is a dead end. Does anyone have test results that show otherwise? -- -Josh Berkus Aglio Database Solutions San Francisco ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
[PERFORM] Postgres query optimization with varchar fields
Dear reader, I am investigating whether it is useful to directly query a database containing a rather large text corpus (order of magnitude 100k - 1m newspaper articles, so around 100 million words), or whether I should use third party text indexing services. I want to know things such as: how often is a certain word (or pattern) mentioned in an article and how often it is mentioned with the condition that another word is nearby (same article or n words distant). I created a table listing the words one word per row, and created an index on the word and wordnr columns. An example query would be: simple: select articleid, count(*) as count from words w where articleid in (select id from articles where batchid in (84,85,100,101,118,121)) and (word like 'PATTERN') group by articleid complex: select articleid, count(*) as count from words w where articleid in (select id from articles where batchid in (84,85,100,101,118,121)) and (word like 'PATTERN') and exists (select * from words w2 where w.articleid = w2.articleid and (word like 'PATTERN2')) group by articleid According to the diagnostics, the database does use the indices for the query, but it is still rather slow (around 10 minutes for a 'simple query', x seconds for a complex one) It is important that the complex query only counts instances where the PATTERN is found and PATTERN2 only functions as a criterium and does not add to the count. My questions are: (technical details provided below) - Does anyone disagree with the general setup? - Is there a more sensible way to phrase my SQL? - Any other ideas to improve performance? Thanks, Wouter van Atteveldt Free University Amsterdam -- Technicalities: I am using a Postgresql 7.4.1 database on a linux machine (uname -a: Linux swpc450.cs.vu.nl 2.4.22-1.2115.nptl #1 Wed Oct 29 15:31:21 EST 2003 i686 athlon i386 GNU/Linux). The table of interest is: (lemma, pos, simplepos currently not used) Table public.words Column | Type | Modifiers ++-- - id | integer| not null default nextval('public.words_id_seq'::text) articleid | integer| not null sentencenr | integer| not null word | character varying(255) | not null lemma | character varying(255) | pos| character varying(255) | simplepos | character(1) | wordnr | integer| not null parnr | integer| not null Indexes: words_pkey primary key, btree (id) words_aid btree (articleid) words_word btree (word) words_word_ptrn btree (word varchar_pattern_ops) words_wordnr btree (wordnr) Query plans: anoko= explain select articleid, count(*) as count from words w where articleid in (select id from articles where batchid in (84,85,100,101,118,121)) and (word like 'integratie%') group by articleid; QUERY PLAN -- HashAggregate (cost=937959.21..937959.22 rows=2 width=4) - Hash IN Join (cost=95863.70..937816.01 rows=28640 width=4) Hash Cond: (outer.articleid = inner.id) - Index Scan using words_word_ptrn on words w (cost=0.00..836604.62 rows=208886 width=4) Index Cond: (((word)::text ~=~ 'integratie'::character varying) AND ((word)::text ~~ 'integratif'::character varying)) Filter: ((word)::text ~~ 'integratie%'::text) - Hash (cost=94998.60..94998.60 rows=146041 width=4) - Index Scan using articles_batchid, articles_batchid, articles_batchid, articles_batchid, articles_batchid, articles_batchid on articles (cost=0.00..94998.60 rows=146041 width=4) Index Cond: ((batchid = 84) OR (batchid = 85) OR (batchid = 100) OR (batchid = 101) OR (batchid = 118) OR (batchid = 121)) explain select articleid, count(*) as count from words w where articleid in (select id from articles where batchid in (84,85,100,101,118,121)) and (word like 'PATTERN') and exists (select * from words w2 where w.articleid = w2.articleid and (word like 'PATTERN2')) group by articleid anoko- ; QUERY PLAN -- GroupAggregate (cost=168253089.23..168254556.46 rows=1 width=4) - Merge IN Join (cost=168253089.23..168254484.85 rows=14320 width=4) Merge Cond: (outer.articleid = inner.id) - Sort (cost=168144438.23..168144699.33 rows=104443 width=4)
Re: [PERFORM] Hardware opinions wanted
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 I'm working through the aquisition process for a quad Opteron box right now. I'll be benchmarking it against a quad processor p630 as well as a quad Xeon after we get it and posting results here. But that's about a month or two from now. I expect that the results will be strongly in favour of the Opetron, especially the price / performance since the Opteron box is being quoted at about half the price of the p630 systems. One thing you may wish to consider is going with lots of 10kRPM SATA disks instead of 15kRPM SCSI disks. Two companies that I'm aware of offer quad Opteron solutions with SATA raid: http://www.quatopteron.com/ http://alltec.com/home.php Andrew Hammond DBA - Afilias -BEGIN PGP SIGNATURE- Version: GnuPG v1.2.4 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFAvjYKgfzn5SevSpoRAvq0AJkBDXOKL52HXg43mQ6rXe/i9RzFkQCfYQn8 HpHP2U0jvjfYIvihNLFLbzA= =LyqB -END PGP SIGNATURE- ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] PostgreSQL on VMWare vs Windows vs CoLinux
Greg Stark wrote: That said, I'm curious why the emulated servers performed better than the Native Windows port. My first thought is that they probably aren't syncing every write to disk so effectively they're defeating the fsyncs, allowing the host OS to buffer disk writes. I havn't tested it, and it's certanly possible. However, please bear in mind that it is also possible that it just gives better performance. The reason this may be possible is that the emulation layer gets the CPU (and other resources) from the OS in bulk, and decides on it's own how to allocate it to the various processes running within the emulation. Inparticular, this on it's own is done using the stock Linux kernel. As Postgresql works sufficiently better on Linux than on Windows, this yields better performance. Again - speculation only. Someone should defenitely make sure that no caching takes place where it shouldn't. As a side note, I have had a chance to talk to Dan Aloni (coLinux maintainer) about running PostgreSQL on coLinux. He said that he knows that this particular use is high on people's priority list, but he feels it is totally unsafe to run a production database on alpha grade software. Then again, free software projects being what they are, this is usually what a maintainer would say. Shachar -- Shachar Shemesh Lingnu Open Source Consulting http://www.lingnu.com/ ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PERFORM] Pl/Pgsql Functions running simultaneously
Hello all, I have an import function that I have been working on for some time now, and it performed well up until recently. It is doing a lot, and because the queries are not cached, I am not sure if that is what the problem is. If a function takes a while, does it lock any of the tables it is accessing, even for SELECT? Below is the bulk of the function: -- set sql statement variables create_import_file_sql := ''COPY '' || container_table || '' ('' || filtered_container_columns || '') TO '' || quote_literal(formatted_import_file) || '' WITH NULL AS '' || null_single_quotes; upload_to_import_table_sql := ''COPY '' || import_table || '' ('' || field_names || '') FROM '' || quote_literal(formatted_import_file) || '' WITH NULL AS '' || null_single_quotes; clean_personalization_fields_sql := ''UPDATE '' || import_table || '' SET emma_member_email = btrim(emma_member_email, '' || quote_literal(quoted_single_quote) || '') , emma_member_name_first = btrim(emma_member_name_first, '' || quote_literal(quoted_single_quote) || '') , emma_member_name_last = btrim(emma_member_name_last, '' || quote_literal(quoted_single_quote) || '') ;''; clean_personalization_fields_sql2 := ''UPDATE '' || import_table || '' SET emma_member_email = btrim(emma_member_email) , emma_member_name_first = btrim(emma_member_name_first) , emma_member_name_last = btrim(emma_member_name_last) ;''; set_account_id_sql := ''UPDATE '' || import_table || '' SET emma_account_id = '' || account_id; set_default_active_status_sql := ''UPDATE '' || import_table || '' SET emma_member_status_id = 1''; set_errors_for_null_email_sql := ''UPDATE '' || import_table || '' SET emma_member_status_id = 2 WHERE emma_member_email IS NULL''; record_null_email_count_sql := ''UPDATE '' || import_history_table || '' SET emma_import_null_email_count = (SELECT COUNT(*) FROM '' || import_table || '' WHERE emma_member_email IS NULL) WHERE emma_import_history_id ='' || import_history_id; set_errors_for_invalid_email_sql := ''UPDATE '' || import_table || '' SET emma_member_status_id = 2 WHERE emma_member_email !~* '' || email_regex; record_invalid_email_count_sql := ''UPDATE '' || import_history_table || '' SET emma_import_invalid_email_count = ( SELECT COUNT(*) FROM '' || import_table || '' WHERE emma_member_email !~* '' || email_regex || '' ) WHERE emma_import_history_id ='' || import_history_id; get_dupes_in_import_sql := ''SELECT emma_member_email, emma_member_status_id FROM '' || import_table || '' GROUP BY emma_member_email, emma_member_status_id having count(*) 1''; insert_dupes_sql := ''INSERT INTO '' || dupe_table || '' SELECT * FROM '' || import_table || '' WHERE LOWER(emma_member_email) = LOWER('' || member_table || ''.emma_member_email)''; record_table_dupe_count_sql := ''UPDATE '' || import_history_table || '' SET emma_import_table_dupe_email_count = (SELECT COUNT(*) FROM '' || import_table || '' WHERE emma_member_email = LOWER('' || member_table || ''.emma_member_email)) WHERE emma_import_history_id ='' || import_history_id; remove_dupes_from_import_table_sql := ''DELETE FROM '' || import_table || '' WHERE LOWER(emma_member_email) = LOWER('' || member_table || ''.emma_member_email)''; create_clean_import_file_sql := ''COPY '' || import_table || '' TO '' || quote_literal(clean_import_file) || '' WITH NULL AS '' || null_single_quotes; create_members_groups_ids_file_sql := ''COPY '' || import_table || '' (emma_member_id) TO '' || quote_literal(members_groups_ids_file) || '' WITH NULL AS '' || null_single_quotes; empty_import_table_sql := ''TRUNCATE '' || import_table; upload_clean_import_sql := ''COPY '' || member_table || '' FROM '' || quote_literal(clean_import_file) || '' WITH NULL AS '' || null_single_quotes; upload_members_groups_ids_sql := ''COPY '' || members_groups_ids_table || '' (emma_member_id) FROM '' || quote_literal(members_groups_ids_file) || '' WITH NULL AS '' || null_single_quotes; empty_members_groups_ids_sql := ''TRUNCATE '' || members_groups_ids_table; empty_members_dupes_sql := ''TRUNCATE '' || dupe_table; vacuum_sql := ''VACUUM '' || member_table || ''; VACUUM '' || import_table || ''; VACUUM '' || container_table || ''; VACUUM '' || members_groups_table || ''; VACUUM '' || members_groups_ids_table || ''; VACUUM '' || dupe_table; -- BEGIN ACTIVITY -- Create the filtered import file with the EXECUTE create_import_file_sql; -- Load data from the filtered file to the import table EXECUTE upload_to_import_table_sql; -- Set account id in import table EXECUTE set_account_id_sql; -- Set the status of all the records to 1 EXECUTE set_default_active_status_sql; -- Clean personalization data EXECUTE
Re: [PERFORM] PostgreSQL on VMWare vs Windows vs CoLinux
On 2 Jun 2004 at 16:45, Merlin Moncure wrote: 'better' does not mean 'faster'. Win32 has a pretty decent journaling filesytem (ntfs) and a good I/O subsystem which includes IPC. Process management is poor compared to newer linux kernels but this is unimportant except in extreme cases. Right now the win32 native does not sync() (but does fsync()). So, the performance is somewhere between fsync = off and fsync = on (probably much closer to fsync = on). It is reasonable to assume that the win32 port will outperform the unix versions at many tasks (at the expense of safety) until the new sync() code is put in. If tested on the same source base, 40-60% differences can only be coming from the I/O subsystem. There are other factors which aren't clear from this exchange like what version of gcc, etc. Hmm, interesting. I've been running the Win32 port for a couple of weeks now. Using the same database as a Linux 2.6 system. Same processor and memory but different disks. Linux system has 10K rpm SCSI disks Windows has 7200 rpm serial ATA disks. When a lot of IO is involved the performance differences are very mixed as I would expect. Sometimes Windows wins, sometimes Linux. BUT, very consistently, when NO IO is involved then the Win32 port is always around 20% slower than Linux. In cases where the EXPLAIN ANALYZE results are different I have disregarded. In all the cases that the EXPLAIN ANALYZE results are the same and no IO is involved the Win32 port is slower. Currently I am putting this down to the build/gcc differences. I can't see why there should be this difference otherwise. (memory management??) Regards, Gary. ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Inherited Tables Performance
Is there any crossover in performance with sibling inherited tables? For Ex. if I have a parent table called : people A child of 'people' called: Adults and A child of 'people' called: Kids Does the work I do to Adults, namely copies, huge updates and such ever affect the performance of Kids? Thanks. -- marcus whitney chief architect : cold feet creative www.coldfeetcreative.com 800.595.4401 cold feet presents emma email marketing for discriminating organizations everywhere visit www.myemma.com ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [PERFORM] PostgreSQL on VMWare vs Windows vs CoLinux
Matthew Nuzum [EMAIL PROTECTED] writes: I have colinux running on a Fedora Core 1 image. I have the rhdb 3 (or PostgreSQL RedHat Edition 3) on it running. Here are tests with fsync on and off: FSYNC OFF FSYNC ON RUN 136.9 142.0 124.5149.1 1 122.1 126.7 140.1169.7 2 125.7 148.7 147.4180.4 3 103.3 136.7 136.8166.3 4 126.5 146.1 152.3187.9 5 114.4 133.3 144.8176.7 6 124.0 146.5 143.3175.0 7 121.7 166.8 147.8180.5 8 127.3 151.8 146.7180.0 9 124.6 143.0 137.2167.5 10 -- 122.7 144.2 142.1173.3 AVG I hope those numbers' formatting come through all right. No, they didn't. You used tabs? Are they four space tabs or 8 space tabs? I assume 4 space tabs, but then what is the meaning of the four columns? You have two columns for each fsync setting? One's under Windows and one's under Vmware? Which is which? It occurs to me that the fsync may be performed to the linux filesystem, but this filesystem is merely a file on the windows drive. Would Windows cache this file? It's 2GB in size, so if it did, it would only be able to cache part of it. Well VMWare certainly doesn't know that the linux process called fsync. For all it knows the Linux kernel just schedule the i/o because it felt it was time. So the question is how does VMWare alway handle i/o normally. Does it always handle i/o from the Guest OS synchronously or does it buffer it via the Host OS's i/o system. I'm actually not sure which it does, it could be doing something strange. But does seem most likely that it lets Windows buffer the writes, or does so itself. It might also depend on whether you're using raw disks or a virtual disk file. Undoable disks would throw another wrench in the works entirely. Note that caching isn't really the question. It doesn't have to cache the entire 2GB file or even very much of it. It just has to store the block that linux wants to write and report success to linux without waiting for the disk to report success. Linux will then think the file is sync'd to disk and allow postgres to continue with the next transaction without actually waiting for the physical disk to spin around to the right place and the head to seek and perform the write. -- greg ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [PERFORM] PostgreSQL on VMWare vs Windows vs CoLinux
On Wed, 2004-06-02 at 17:39, Greg Stark wrote: Matthew Nuzum [EMAIL PROTECTED] writes: I have colinux running on a Fedora Core 1 image. I have the rhdb 3 (or PostgreSQL RedHat Edition 3) on it running. Here are tests with fsync on and off: FSYNC OFF FSYNC ON RUN 136.9142.0 124.5149.1 1 122.1126.7 140.1169.7 2 125.7148.7 147.4180.4 3 103.3136.7 136.8166.3 4 126.5146.1 152.3187.9 5 114.4133.3 144.8176.7 6 124.0146.5 143.3175.0 7 121.7166.8 147.8180.5 8 127.3151.8 146.7180.0 9 124.6143.0 137.2167.5 10 -- 122.7144.2 142.1173.3 AVG I hope those numbers' formatting come through all right. No, they didn't. You used tabs? Are they four space tabs or 8 space tabs? I assume 4 space tabs, but then what is the meaning of the four columns? You have two columns for each fsync setting? One's under Windows and one's under Vmware? Which is which? Sorry that wasn't clear. The pgbench program puts out two numbers, can't remember what they are, I think one number included the time to make the connection. Therefore, the first two columns represent the two values presented from pgbench with FSYNC off. The second two columns are those same to figures but with FSYNC ON. The 5th column is the run. I did 10 runs and included the output of all runs so that incase anything significant could be gleaned from the details, the data would be there. The executive summary is this: Tom was curious if colinux might be deceiving the applications that expect the fsync to occur. He suspected that pgbench run with and without fsync enabled might reveal something. Therefore: FSYNC ON: 142.1 FSYNC OFF: 122.7 Having FSYNC off seems to yield faster results. I'd like some input on a more demanding test though, because these tests run so quickly I can't help but be suspicious of their accuracy. When there are two OSs involved, it seems like the momentary activity of a background process could skew these results. It occurs to me that the fsync may be performed to the linux filesystem, but this filesystem is merely a file on the windows drive. Would Windows cache this file? It's 2GB in size, so if it did, it would only be able to cache part of it. Well VMWare certainly doesn't know that the linux process called fsync. For all it knows the Linux kernel just schedule the i/o because it felt it was time. So the question is how does VMWare alway handle i/o normally. Does it always handle i/o from the Guest OS synchronously or does it buffer it via the Host OS's i/o system. We probably will never know what the internal workings of VMWare are like because it is a closed source program. I'm not slighting them, I have purchased a license of VMWare and use it for my software testing. However, colinux is an open source project and we can easily find out how they handle this. I have little interest in this as I use this merely as a tool to speed up my application development and do not run any critical services what-so-ever. I'm actually not sure which it does, it could be doing something strange. But does seem most likely that it lets Windows buffer the writes, or does so itself. It might also depend on whether you're using raw disks or a virtual disk file. Undoable disks would throw another wrench in the works entirely. In these tests I'm using a virtual disk file. This is a 2GB file on the hard drive that linux sees as a disk partition. Colinux does not support undoable disks in the way that vmware does. Their wiky site does not mention anything tricky being done to force disk writes to actually be written; the implication therefore is that it leaves the i/o completely at the discretion of XP. Also note that XP Pro and 2000 Pro both offer different caching options for the user to choose so unless it does something to actually force a write the answer is probably who knows. Note that caching isn't really the question. It doesn't have to cache the entire 2GB file or even very much of it. It just has to store the block that linux wants to write and report success to linux without waiting for the disk to report success. Linux will then think the file is sync'd to disk and allow postgres to continue with the next transaction without actually waiting for the physical disk to spin around to the right place and the head to seek and perform the write. That's interesting to know. I wondered about that. So, my summary is this: If you develop applications in windows that run in linux and you need a testing platform you may like colinux a lot because of the following: * It's purchase price is 0 * It's seems to be capable of running any (or at least many) distribution based on 2.4 kernel * It appears to run much faster than VMWare (maybe because it
Re: [PERFORM] PostgreSQL on VMWare vs Windows vs CoLinux
Matthew Nuzum [EMAIL PROTECTED] writes: I'd like some input on a more demanding test though, because these tests run so quickly I can't help but be suspicious of their accuracy. So increase the number of transactions tested (-t switch to pgbench). Be aware also that you really want -s (database size scale factor) to exceed -c (number of concurrent clients) for meaningful results. regards, tom lane ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PERFORM] PostgreSQL on VMWare vs Windows vs CoLinux
Merlin Moncure [EMAIL PROTECTED] writes: ... Right now the win32 native does not sync() (but does fsync()). So, the performance is somewhere between fsync = off and fsync = on (probably much closer to fsync = on). It is reasonable to assume that the win32 port will outperform the unix versions at many tasks (at the expense of safety) until the new sync() code is put in. ... which was three days ago. Why are we still speculating? regards, tom lane ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html