Re: [pgsql-advocacy] [PERFORM] MySQL vs PG TPC-H benchmarks
...and on Thu, Apr 22, 2004 at 06:59:10AM -0700, Eduardo Almeida used the keyboard:

> [snip] For reference, Sun has 64-bit Java only for IA64 and Solaris SPARC 64, not for Opteron.

As I mentioned, that is true for the 1.4.x releases of the JVM. We have been testing some JCA builds of 1.5.0 on x86_64 so far, but it is too unstable for any kind of serious work.

Cheers,
--
Grega Bremec
Senior Administrator
Noviforum Ltd., Software Media
http://www.noviforum.si/
Re: [pgsql-advocacy] [PERFORM] MySQL vs PG TPC-H benchmarks
Folks,

I'm doing the 100GB TPC-H and I'll show the preliminary results to our community (Postgres) in 3 weeks, before finishing the study. My intention is to run a test with a VLDB on a low-cost platform (PostgreSQL, Linux and cheap hardware), not to compare with another DBMS.

So far I can tell you that PG 7.4.2 with kernel 2.6.5 on an Opteron 64 model 240, RAID 0 with 8 disks (960 GB), loaded the database in less than 24 hours: about 7h30min to load the data and 16:09:25 to create the indexes.

The power test is still running, and that's why I'll not present anything so far. For now I'll just send my environment configuration to the list.

- The configuration of the machine is:
  Dual Opteron 64-bit model 240
  4GB RAM
  960 GB on RAID 0
  Mandrake Linux 64 with kernel 2.6.5 (I compiled a kernel for this test)
  Java SDK version 1.4.2_04
  PostgreSQL JDBC pg74.1jdbc3.jar

- The TPC-H configuration is:
  TPC-H 2.0.0
  100GB
  load using flat files
  Refresh functions using Java

- The PostgreSQL 7.4.2 configuration is:
  add_missing_from               | on
  australian_timezones           | off
  authentication_timeout         | 60
  check_function_bodies          | on
  checkpoint_segments            | 128
  checkpoint_timeout             | 300
  checkpoint_warning             | 30
  client_encoding                | SQL_ASCII
  client_min_messages            | notice
  commit_delay                   | 0
  commit_siblings                | 5
  cpu_index_tuple_cost           | 0.001
  cpu_operator_cost              | 0.0025
  cpu_tuple_cost                 | 0.01
  DateStyle                      | ISO, MDY
  db_user_namespace              | off
  deadlock_timeout               | 1000
  debug_pretty_print             | off
  debug_print_parse              | off
  debug_print_plan               | off
  debug_print_rewritten          | off
  default_statistics_target      | 10
  default_transaction_isolation  | read committed
  default_transaction_read_only  | off
  dynamic_library_path           | $libdir
  effective_cache_size           | 15
  enable_hashagg                 | on
  enable_hashjoin                | on
  enable_indexscan               | on
  enable_mergejoin               | on
  enable_nestloop                | on
  enable_seqscan                 | on
  enable_sort                    | on
  enable_tidscan                 | on
  explain_pretty_print           | on
  extra_float_digits             | 0
  from_collapse_limit            | 8
  fsync                          | off
  geqo                           | on
  geqo_effort                    | 1
  geqo_generations               | 0
  geqo_pool_size                 | 0
  geqo_selection_bias            | 2
  geqo_threshold                 | 11
  join_collapse_limit            | 8
  krb_server_keyfile             | unset
  lc_collate                     | en_US
  lc_ctype                       | en_US
  lc_messages                    | C
  lc_monetary                    | C
  lc_numeric                     | C
  lc_time                        | C
  log_connections                | off
  log_duration                   | off
  log_error_verbosity            | default
  log_executor_stats             | off
  log_hostname                   | off
  log_min_duration_statement     | -1
  log_min_error_statement        | panic
  log_min_messages               | notice
  log_parser_stats               | off
  log_pid                        | off
  log_planner_stats              | off
  log_source_port                | off
  log_statement                  | off
  log_statement_stats            | off
  log_timestamp                  | off
  max_connections                | 10
  max_expr_depth                 | 1
  max_files_per_process          | 1000
  max_fsm_pages                  | 2
  max_fsm_relations              | 1000
  max_locks_per_transaction      | 64
  password_encryption            | on
  port                           | 5432
  pre_auth_delay                 | 0
  preload_libraries              | unset
  random_page_cost               | 1.25
  regex_flavor                   | advanced
  rendezvous_name                | unset
  search_path                    | $user,public
  server_encoding                | SQL_ASCII
  server_version                 | 7.4.2
  shared_buffers                 | 4
  silent_mode                    | off
  sort_mem                       | 65536
  sql_inheritance                | on
  ssl                            | off
  statement_timeout              | 1000
  stats_block_level              | off
  stats_command_string           | off
  stats_reset_on_server_start    | on
  stats_row_level                | off
  stats_start_collector          | on
  superuser_reserved_connections | 2
  syslog                         | 0
  syslog_facility                | LOCAL0
  syslog_ident                   | postgres
  tcpip_socket                   | on
  TimeZone                       | unknown
  trace_notify                   | off
  transaction_isolation          | read committed
  transaction_read_only
Re: [pgsql-advocacy] [PERFORM] MySQL vs PG TPC-H benchmarks
Grega,

That's why I used 32-bit Java and needed to compile the 2.6.5 kernel with the 32-bit modules. For reference, Sun has 64-bit Java only for IA64 and Solaris SPARC 64, not for Opteron.

regards,
Eduardo

--- Grega Bremec [EMAIL PROTECTED] wrote:
> ...and on Thu, Apr 22, 2004 at 05:53:18AM -0700, Eduardo Almeida used the keyboard:
>> - The configuration of the machine is: Dual Opteron 64-bit model 240, 4GB RAM, 960 GB on RAID 0, Mandrake Linux 64 with kernel 2.6.5 (I compiled a kernel for this test), Java SDK version 1.4.2_04, PostgreSQL JDBC pg74.1jdbc3.jar
>> - The TPC-H configuration is: TPC-H 2.0.0, 100GB, load using flat files, refresh functions using Java
>
> I'll just add, for reference, to those that aren't aware of it: the Java virtual machine for x86_64 only exists in the 1.5 branch so far, and it's so utterly unstable that almost every notable shuffling around in memory crashes it. :) Hence 1.4.2_04 is a 32-bit application running in 32-bit mode. I won't get into how much this affects the benchmarks, as I didn't really look into how CPU- and memory-intensive the refresh functions are here, so as I said - let's keep it a reference.

---(end of broadcast)---
TIP 8: explain analyze is your friend
Re: [pgsql-advocacy] [PERFORM] MySQL vs PG TPC-H benchmarks
Eduardo Almeida wrote:
> Folks, I'm doing the 100GB TPC-H and I'll show the preliminary results to our community (Postgres) in 3 weeks, before finishing the study. My intention is to run a test with a VLDB on a low-cost platform (PostgreSQL, Linux and cheap hardware), not to compare with another DBMS.

QphH and Price/QphH will be enough for us to see where in the list we are. Unfortunately, there are only Sybase and MS SQL results published in the 100 GB category. The 300 GB category has DB2 as well. Oracle starts at 1 TB, and in the 10 TB category Oracle and DB2 are the only players left.

Jan

> So far I can tell you that PG 7.4.2 with kernel 2.6.5 on an Opteron 64 model 240, RAID 0 with 8 disks (960 GB), loaded the database in less than 24 hours: about 7h30min to load the data and 16:09:25 to create the indexes. The power test is still running, and that's why I'll not present anything so far. For now I'll just send my environment configuration to the list. [...]
Re: [pgsql-advocacy] [PERFORM] MySQL vs PG TPC-H benchmarks
...and on Thu, Apr 22, 2004 at 05:53:18AM -0700, Eduardo Almeida used the keyboard:

> - The configuration of the machine is:
>   Dual Opteron 64-bit model 240
>   4GB RAM
>   960 GB on RAID 0
>   Mandrake Linux 64 with kernel 2.6.5 (I compiled a kernel for this test)
>   Java SDK version 1.4.2_04
>   PostgreSQL JDBC pg74.1jdbc3.jar
> - The TPC-H configuration is:
>   TPC-H 2.0.0
>   100GB
>   load using flat files
>   Refresh functions using Java

I'll just add, for reference, to those that aren't aware of it: the Java virtual machine for x86_64 only exists in the 1.5 branch so far, and it's so utterly unstable that almost every notable shuffling around in memory crashes it. :) Hence 1.4.2_04 is a 32-bit application running in 32-bit mode. I won't get into how much this affects the benchmarks, as I didn't really look into how CPU- and memory-intensive the refresh functions are here, so as I said - let's keep it a reference.

Cheers,
--
Grega Bremec
Senior Administrator
Noviforum Ltd., Software Media
http://www.noviforum.si/
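[Editor's aside, not part of the original thread: Grega's point is that the 1.4.2_04 JVM runs as a 32-bit process even on 64-bit hardware. The same kind of mismatch can be sketched for any process; in Python, the pointer size of the running interpreter reveals whether it is a 32- or 64-bit build.]

```python
import struct
import platform

# Pointer size (in bits) of the *current* process: 32 for a 32-bit
# build, 64 for a 64-bit build -- regardless of what the CPU supports.
process_bits = struct.calcsize("P") * 8

# The machine architecture reported by the OS (e.g. "x86_64").
machine = platform.machine()

print(f"{process_bits}-bit process on a {machine} machine")
```

A 32-bit JVM on an Opteron would show the analogous mismatch: a 64-bit machine string paired with a 32-bit process.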
Re: [pgsql-advocacy] [PERFORM] MySQL vs PG TPC-H benchmarks
Eduardo Almeida [EMAIL PROTECTED] writes:
> About 7h30min to load the data and 16:09:25 to create the indexes

You could probably improve the index-create time by temporarily increasing sort_mem. It wouldn't be unreasonable to give CREATE INDEX several hundred meg to work in. (You don't want sort_mem that big normally, because there may be many sorts happening in parallel, but in a data-loading context there'll just be one active sort.)

regards, tom lane
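[Editor's aside: Tom's suggestion can be sketched as a plain SQL session. The index and value below are illustrative only; on 7.4, sort_mem is measured in kilobytes and SET changes it for the current session only.]

```sql
-- Raise sort_mem for this session only (7.4 units: kilobytes).
-- 524288 KB = 512 MB, in the "several hundred meg" range suggested.
SET sort_mem = 524288;

-- The big external sort behind index creation now has far more
-- memory to work in, so it spills to disk much less.
CREATE INDEX i_l_orderkey ON lineitem (l_orderkey);

-- Back to the server default for normal concurrent work.
RESET sort_mem;
```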
Re: [pgsql-advocacy] [PERFORM] MySQL vs PG TPC-H benchmarks
Markus Bertheau [EMAIL PROTECTED] writes:
>> You could probably improve the index-create time by temporarily increasing sort_mem. It wouldn't be unreasonable to give CREATE INDEX several hundred meg to work in. (You don't want sort_mem that big normally, because there may be many sorts happening in parallel, but in a data-loading context there'll just be one active sort.)
> Doesn't this provide a reason for CREATE INDEX not to honour sort_mem?

Already done for 7.5:
http://archives.postgresql.org/pgsql-committers/2004-02/msg00025.php

regards, tom lane
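[Editor's aside: the 7.5 change Tom links to split the old per-sort setting into two, so maintenance operations such as CREATE INDEX and VACUUM get their own, separately sized memory budget. A postgresql.conf fragment under the new naming might look like this; the values are illustrative, not from the thread.]

```
# postgresql.conf fragment (7.5/8.0 naming; values illustrative)
work_mem = 65536               # per-sort memory for ordinary queries, in KB
maintenance_work_mem = 524288  # used by CREATE INDEX and VACUUM, in KB
```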
Re: [pgsql-advocacy] [PERFORM] MySQL vs PG TPC-H benchmarks
On 22.04.2004 at 17:54, Tom Lane wrote:
> Eduardo Almeida [EMAIL PROTECTED] writes:
>> About 7h30min to load the data and 16:09:25 to create the indexes
> You could probably improve the index-create time by temporarily increasing sort_mem. It wouldn't be unreasonable to give CREATE INDEX several hundred meg to work in. (You don't want sort_mem that big normally, because there may be many sorts happening in parallel, but in a data-loading context there'll just be one active sort.)

Doesn't this provide a reason for CREATE INDEX not to honour sort_mem?

--
Markus Bertheau [EMAIL PROTECTED]
Re: [pgsql-advocacy] [PERFORM] MySQL vs PG TPC-H benchmarks
Folks,

I forgot to mention that I used shell scripts to load the data and Java just to run the refresh functions.

Talking about the sort_mem config: I used 65000, but the TPC-H specification says you are not allowed to change the configuration once the benchmark starts. Is it a big problem to use 65000? In the 100GB TPC-H we run 5 streams in parallel for the throughput test! For the power test I think it is not a problem, because it runs one query after another.

Another thing is that I set statement_timeout = 1000. Some queries may exceed this timeout, and I'll send the EXPLAIN for those.

The last thing is that Jan forgot to mention that Teradata doesn't show up now, but older lists show 3TB and 10TB results.

regards
Eduardo

--- Jan Wieck [EMAIL PROTECTED] wrote:
> Eduardo Almeida wrote:
>> Folks, I'm doing the 100GB TPC-H and I'll show the preliminary results to our community (Postgres) in 3 weeks, before finishing the study. My intention is to run a test with a VLDB on a low-cost platform (PostgreSQL, Linux and cheap hardware), not to compare with another DBMS.
> QphH and Price/QphH will be enough for us to see where in the list we are. Unfortunately, there are only Sybase and MS SQL results published in the 100 GB category. The 300 GB category has DB2 as well. Oracle starts at 1 TB, and in the 10 TB category Oracle and DB2 are the only players left.
> Jan
>> So far I can tell you that PG 7.4.2 with kernel 2.6.5 on an Opteron 64 model 240, RAID 0 with 8 disks (960 GB), loaded the database in less than 24 hours: about 7h30min to load the data and 16:09:25 to create the indexes. The power test is still running, and that's why I'll not present anything so far. For now I'll just send my environment configuration to the list. [...]
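[Editor's aside, a back-of-the-envelope check not from the thread: with sort_mem measured in KB on 7.4, five parallel streams would claim roughly the following slice of the machine's 4GB of RAM, under the simplifying (hypothetical) assumption of one active sort per stream at the full budget.]

```python
# Rough memory estimate for sort_mem under the throughput test.
# Assumption (hypothetical): each of the 5 parallel streams runs
# one active sort at a time, each using the full sort_mem budget.
sort_mem_kb = 65000             # Eduardo's setting, in KB (7.4 units)
streams = 5                     # TPC-H 100GB throughput-test streams
total_ram_kb = 4 * 1024 * 1024  # 4 GB machine

worst_case_kb = sort_mem_kb * streams
share = worst_case_kb / total_ram_kb

print(f"worst case: {worst_case_kb / 1024:.0f} MB "
      f"({share:.1%} of RAM)")
```

A single query plan can contain more than one sort node, so the true worst case is higher; this is only a sanity check that 65000 KB per sort leaves plenty of headroom on a 4GB box.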
Re: [pgsql-advocacy] [PERFORM] MySQL vs PG TPC-H benchmarks
Josh Berkus wrote:
> Folks, I've sent a polite e-mail to Mr. Gomez offering our help. Please, nobody flame him!

Please keep in mind that the entire test has, other than maybe a similar database schema and query types, nothing to do with a TPC-H. I don't see any kind of SUT. Foreign key support at the DB level is not required by any of the TPC benchmarks. But the System Under Test, which is the combination of middleware application and database together with all computers and network components these parts are running on, must implement all the required semantics, like ACID properties, referential integrity etc. One could implement a TPC-H with flat files; it's just a major pain in the middleware.

A proper TPC benchmark implementation would, for example, be a complete PHP+DB application, where the user interaction is done by an emulated browser and what is measured is the HTTP response times, not anything going on between PHP and the DB. Assuming that all requirements of the TPC specification are implemented, either by using available DB features or by including appropriate workarounds in the PHP code, that would very well lead to something that can compare PHP+MySQL vs. PHP+PostgreSQL.

All TPC benchmarks I have seen are performed by timing such a system after a considerable ramp-up time, giving the DB system a chance to properly populate caches and so forth. Rebooting the machine just before the test is the wrong thing here and will especially kill any advanced cache algorithms like ARC.

Jan

--
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                [EMAIL PROTECTED] #