Re: [pgsql-advocacy] [PERFORM] MySQL vs PG TPC-H benchmarks

2004-05-06 Thread Grega Bremec
...and on Thu, Apr 22, 2004 at 06:59:10AM -0700, Eduardo Almeida used the keyboard:

snip

 To reference, Sun has java 64bits just to IA64 and
 Solaris Sparc 64 not to Opteron.
 

As I mentioned, that is true for the 1.4.x release of the JVMs. We have been
testing some JCA builds of 1.5.0 on x86_64 so far, but it is too unstable for
any kind of serious work.

Cheers,
-- 
Grega Bremec
Senior Administrator
Noviforum Ltd., Software  Media
http://www.noviforum.si/


pgpNaspEVJ49h.pgp
Description: PGP signature


Re: [pgsql-advocacy] [PERFORM] MySQL vs PG TPC-H benchmarks

2004-04-22 Thread Eduardo Almeida
Folks,

I’m doing the 100GB TPC-H and I’ll show the previous
results to our community (Postgres) in 3 weeks before
finishing the study.

My intention is to carry through a test with a VLDB in
a low cost platform (PostgreSQL, Linux and cheap HW)
and not to compare with another DBMS.

So far I can tell you that the load time on PG 7.4.2
with kernel 2.6.5 on Opteron 64 model 240 in RAID 0
with 8 disks (960 GB) loaded the database in less than
24 hours. 
About 7hs:30min to load the data and 16:09:25 to
create the indexes

The Power test still running and that’s why I’ll not
present anything so far. Now I´ll just send to the
list my environment configuration.

- The configuration of the machine is:
Dual opteron 64 bits model 240
4GB RAM
960 GB on RAID 0
Mandrake Linux 64 with Kernel 2.6.5  (I compiled a
kernel for this test)
Java SDK  java version 1.4.2_04
PostgreSQL JDBC pg74.1jdbc3.jar

- The TPC-H configuration is:
TPC-H 2.0.0
100GB
load using flat files
Refresh functions using java

- The PostgreSQL 7.4.2 configuration is:

add_missing_from   | on
 australian_timezones   | off
 authentication_timeout | 60
 check_function_bodies  | on
 checkpoint_segments| 128
 checkpoint_timeout | 300
 checkpoint_warning | 30
 client_encoding| SQL_ASCII
 client_min_messages| notice
 commit_delay   | 0
 commit_siblings| 5
 cpu_index_tuple_cost   | 0.001
 cpu_operator_cost  | 0.0025
 cpu_tuple_cost | 0.01
 DateStyle  | ISO, MDY
 db_user_namespace  | off
 deadlock_timeout   | 1000
 debug_pretty_print | off
 debug_print_parse  | off
 debug_print_plan   | off
 debug_print_rewritten  | off
 default_statistics_target  | 10
 default_transaction_isolation  | read committed
 default_transaction_read_only  | off
 dynamic_library_path   | $libdir
 effective_cache_size   | 15
 enable_hashagg | on
 enable_hashjoin| on
 enable_indexscan   | on
 enable_mergejoin   | on
 enable_nestloop| on
 enable_seqscan | on
 enable_sort| on
 enable_tidscan | on
 explain_pretty_print   | on
 extra_float_digits | 0
 from_collapse_limit| 8
 fsync  | off
 geqo   | on
 geqo_effort| 1
 geqo_generations   | 0
geqo_pool_size | 0
 geqo_selection_bias| 2
 geqo_threshold | 11
 join_collapse_limit| 8
 krb_server_keyfile | unset
 lc_collate | en_US
 lc_ctype   | en_US
 lc_messages| C
 lc_monetary| C
 lc_numeric | C
 lc_time| C
 log_connections| off
 log_duration   | off
 log_error_verbosity| default
 log_executor_stats | off
 log_hostname   | off
 log_min_duration_statement | -1
 log_min_error_statement| panic
 log_min_messages   | notice
 log_parser_stats   | off
 log_pid| off
 log_planner_stats  | off
 log_source_port| off
 log_statement  | off
 log_statement_stats| off
 log_timestamp  | off
 max_connections| 10
 max_expr_depth | 1
 max_files_per_process  | 1000
 max_fsm_pages  | 2
 max_fsm_relations  | 1000
 max_locks_per_transaction  | 64
 password_encryption| on
 port   | 5432
 pre_auth_delay | 0
 preload_libraries  | unset
 random_page_cost   | 1.25
 regex_flavor   | advanced
 rendezvous_name| unset
 search_path| $user,public
 server_encoding| SQL_ASCII
 server_version | 7.4.2
 shared_buffers | 4
 silent_mode| off
sort_mem   | 65536
 sql_inheritance| on
 ssl| off
 statement_timeout  | 1000
 stats_block_level  | off
 stats_command_string   | off
 stats_reset_on_server_start| on
 stats_row_level| off
 stats_start_collector  | on
 superuser_reserved_connections | 2
 syslog | 0
 syslog_facility| LOCAL0
 syslog_ident   | postgres
 tcpip_socket   | on
 TimeZone   | unknown
 trace_notify   | off
 transaction_isolation  | read committed
 transaction_read_only 

Re: [pgsql-advocacy] [PERFORM] MySQL vs PG TPC-H benchmarks

2004-04-22 Thread Eduardo Almeida
Grega,

That´s why I used java 32bits and needed to compile
the kernel 2.6.5 with the 32bits modules.
To reference, Sun has java 64bits just to IA64 and
Solaris Sparc 64 not to Opteron.

regards,
Eduardo
--- Grega Bremec [EMAIL PROTECTED] wrote:
 ...and on Thu, Apr 22, 2004 at 05:53:18AM -0700,
 Eduardo Almeida used the keyboard:
  
  - The configuration of the machine is:
  Dual opteron 64 bits model 240
  4GB RAM
  960 GB on RAID 0
  Mandrake Linux 64 with Kernel 2.6.5  (I compiled a
  kernel for this test)
  Java SDK  java version 1.4.2_04
  PostgreSQL JDBC pg74.1jdbc3.jar
  
  - The TPC-H configuration is:
  TPC-H 2.0.0
  100GB
  load using flat files
  Refresh functions using java
  
 
 I'll just add for the reference, to those that
 aren't aware of it, the Java
 virtual machine for x86_64 only exists in the 1.5
 branch so far, and it's so
 utterly unstable that most every notable shuffling
 around in the memory
 crashes it. :)
 
 Hence the 1.4.2_04 is a 32-bit application running
 in 32-bit mode.
 
 I won't be getting into how much this affects the
 benchmarks as I didn't
 really get into how CPU- and memory-intensive the
 refresh functions are in
 these, so as I said - let's keep it a reference.
 
 Cheers,
 -- 
 Grega Bremec
 Senior Administrator
 Noviforum Ltd., Software  Media
 http://www.noviforum.si/
 

 ATTACHMENT part 2 application/pgp-signature 






__
Do you Yahoo!?
Yahoo! Photos: High-quality 4x6 digital prints for 25¢
http://photos.yahoo.com/ph/print_splash

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [pgsql-advocacy] [PERFORM] MySQL vs PG TPC-H benchmarks

2004-04-22 Thread Jan Wieck
Eduardo Almeida wrote:

Folks,

Im doing the 100GB TPC-H and Ill show the previous
results to our community (Postgres) in 3 weeks before
finishing the study.
My intention is to carry through a test with a VLDB in
a low cost platform (PostgreSQL, Linux and cheap HW)
and not to compare with another DBMS.
QphH and Price/QphH will be enought for us to see where in the list we 
are. Unfortunately there are only Sybase and MS SQL results published in 
the 100 GB category. The 300 GB has DB2 as well. Oracle starts at 1 TB 
and in the 10 TB category Oracle and DB2 are the only players left.

Jan

So far I can tell you that the load time on PG 7.4.2
with kernel 2.6.5 on Opteron 64 model 240 in RAID 0
with 8 disks (960 GB) loaded the database in less than
24 hours. 
About 7hs:30min to load the data and 16:09:25 to
create the indexes

The Power test still running and thats why Ill not
present anything so far. Now Ill just send to the
list my environment configuration.
- The configuration of the machine is:
Dual opteron 64 bits model 240
4GB RAM
960 GB on RAID 0
Mandrake Linux 64 with Kernel 2.6.5  (I compiled a
kernel for this test)
Java SDK  java version 1.4.2_04
PostgreSQL JDBC pg74.1jdbc3.jar
- The TPC-H configuration is:
TPC-H 2.0.0
100GB
load using flat files
Refresh functions using java
- The PostgreSQL 7.4.2 configuration is:

add_missing_from   | on
 australian_timezones   | off
 authentication_timeout | 60
 check_function_bodies  | on
 checkpoint_segments| 128
 checkpoint_timeout | 300
 checkpoint_warning | 30
 client_encoding| SQL_ASCII
 client_min_messages| notice
 commit_delay   | 0
 commit_siblings| 5
 cpu_index_tuple_cost   | 0.001
 cpu_operator_cost  | 0.0025
 cpu_tuple_cost | 0.01
 DateStyle  | ISO, MDY
 db_user_namespace  | off
 deadlock_timeout   | 1000
 debug_pretty_print | off
 debug_print_parse  | off
 debug_print_plan   | off
 debug_print_rewritten  | off
 default_statistics_target  | 10
 default_transaction_isolation  | read committed
 default_transaction_read_only  | off
 dynamic_library_path   | $libdir
 effective_cache_size   | 15
 enable_hashagg | on
 enable_hashjoin| on
 enable_indexscan   | on
 enable_mergejoin   | on
 enable_nestloop| on
 enable_seqscan | on
 enable_sort| on
 enable_tidscan | on
 explain_pretty_print   | on
 extra_float_digits | 0
 from_collapse_limit| 8
 fsync  | off
 geqo   | on
 geqo_effort| 1
 geqo_generations   | 0
geqo_pool_size | 0
 geqo_selection_bias| 2
 geqo_threshold | 11
 join_collapse_limit| 8
 krb_server_keyfile | unset
 lc_collate | en_US
 lc_ctype   | en_US
 lc_messages| C
 lc_monetary| C
 lc_numeric | C
 lc_time| C
 log_connections| off
 log_duration   | off
 log_error_verbosity| default
 log_executor_stats | off
 log_hostname   | off
 log_min_duration_statement | -1
 log_min_error_statement| panic
 log_min_messages   | notice
 log_parser_stats   | off
 log_pid| off
 log_planner_stats  | off
 log_source_port| off
 log_statement  | off
 log_statement_stats| off
 log_timestamp  | off
 max_connections| 10
 max_expr_depth | 1
 max_files_per_process  | 1000
 max_fsm_pages  | 2
 max_fsm_relations  | 1000
 max_locks_per_transaction  | 64
 password_encryption| on
 port   | 5432
 pre_auth_delay | 0
 preload_libraries  | unset
 random_page_cost   | 1.25
 regex_flavor   | advanced
 rendezvous_name| unset
 search_path| $user,public
 server_encoding| SQL_ASCII
 server_version | 7.4.2
 shared_buffers | 4
 silent_mode| off
sort_mem   | 65536
 sql_inheritance| on
 ssl| off
 statement_timeout  | 1000
 stats_block_level  | off
 stats_command_string   | off
 stats_reset_on_server_start| on
 stats_row_level| off
 stats_start_collector  | on
 superuser_reserved_connections | 2
 

Re: [pgsql-advocacy] [PERFORM] MySQL vs PG TPC-H benchmarks

2004-04-22 Thread Grega Bremec
...and on Thu, Apr 22, 2004 at 05:53:18AM -0700, Eduardo Almeida used the keyboard:
 
 - The configuration of the machine is:
 Dual opteron 64 bits model 240
 4GB RAM
 960 GB on RAID 0
 Mandrake Linux 64 with Kernel 2.6.5  (I compiled a
 kernel for this test)
 Java SDK  java version 1.4.2_04
 PostgreSQL JDBC pg74.1jdbc3.jar
 
 - The TPC-H configuration is:
 TPC-H 2.0.0
 100GB
 load using flat files
 Refresh functions using java
 

I'll just add for the reference, to those that aren't aware of it, the Java
virtual machine for x86_64 only exists in the 1.5 branch so far, and it's so
utterly unstable that most every notable shuffling around in the memory
crashes it. :)

Hence the 1.4.2_04 is a 32-bit application running in 32-bit mode.

I won't be getting into how much this affects the benchmarks as I didn't
really get into how CPU- and memory-intensive the refresh functions are in
these, so as I said - let's keep it a reference.

Cheers,
-- 
Grega Bremec
Senior Administrator
Noviforum Ltd., Software  Media
http://www.noviforum.si/


pgp0.pgp
Description: PGP signature


Re: [pgsql-advocacy] [PERFORM] MySQL vs PG TPC-H benchmarks

2004-04-22 Thread Tom Lane
Eduardo Almeida [EMAIL PROTECTED] writes:
 About 7hs:30min to load the data and 16:09:25 to
 create the indexes

You could probably improve the index-create time by temporarily
increasing sort_mem.  It wouldn't be unreasonable to give CREATE INDEX
several hundred meg to work in.  (You don't want sort_mem that big
normally, because there may be many sorts happening in parallel,
but in a data-loading context there'll just be one active sort.)

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [pgsql-advocacy] [PERFORM] MySQL vs PG TPC-H benchmarks

2004-04-22 Thread Tom Lane
Markus Bertheau [EMAIL PROTECTED] writes:
 You could probably improve the index-create time by temporarily
 increasing sort_mem.  It wouldn't be unreasonable to give CREATE INDEX
 several hundred meg to work in.  (You don't want sort_mem that big
 normally, because there may be many sorts happening in parallel,
 but in a data-loading context there'll just be one active sort.)

 Doesn't this provide a reason for CREATE INDEX not to honour sort_mem?

Already done for 7.5.

http://archives.postgresql.org/pgsql-committers/2004-02/msg00025.php

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [pgsql-advocacy] [PERFORM] MySQL vs PG TPC-H benchmarks

2004-04-22 Thread Markus Bertheau
 , 22.04.2004,  17:54, Tom Lane :
 Eduardo Almeida [EMAIL PROTECTED] writes:
  About 7hs:30min to load the data and 16:09:25 to
  create the indexes
 
 You could probably improve the index-create time by temporarily
 increasing sort_mem.  It wouldn't be unreasonable to give CREATE INDEX
 several hundred meg to work in.  (You don't want sort_mem that big
 normally, because there may be many sorts happening in parallel,
 but in a data-loading context there'll just be one active sort.)

Doesn't this provide a reason for CREATE INDEX not to honour sort_mem?

-- 
Markus Bertheau [EMAIL PROTECTED]


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [pgsql-advocacy] [PERFORM] MySQL vs PG TPC-H benchmarks

2004-04-22 Thread Eduardo Almeida
Folks,

I forgot to mention that I used Shell scripts to load
the data and use Java just to run the refresh
functions.

Talking about sort_mem config, I used 65000 but in the
TPCH specification they said that you are not able to
change the configs when you start the benchmark, is
that a big problem to use 65000? In the TPCH 100GB we
run 5 streams in parallel for the throughput test!  To
power test I think is not a problem because it runs
one query after another.

Another thing is that I put statement_timeout =
1000 

Some queries may exceed this timeout and I’ll send the
EXPLAIN for this ones.
 
The last thing is that Jan forgets to mention that
Teradata doesn’t show up now but in older lists shows
3TB and 10TB results.

regards
Eduardo
--- Jan Wieck [EMAIL PROTECTED] wrote:
 Eduardo Almeida wrote:
 
  Folks,
  
  I’m doing the 100GB TPC-H and I’ll show the
 previous
  results to our community (Postgres) in 3 weeks
 before
  finishing the study.
  
  My intention is to carry through a test with a
 VLDB in
  a low cost platform (PostgreSQL, Linux and cheap
 HW)
  and not to compare with another DBMS.
 
 QphH and Price/QphH will be enought for us to see
 where in the list we 
 are. Unfortunately there are only Sybase and MS SQL
 results published in 
 the 100 GB category. The 300 GB has DB2 as well.
 Oracle starts at 1 TB 
 and in the 10 TB category Oracle and DB2 are the
 only players left.
 
 
 Jan
 
  
  So far I can tell you that the load time on PG
 7.4.2
  with kernel 2.6.5 on Opteron 64 model 240 in RAID
 0
  with 8 disks (960 GB) loaded the database in less
 than
  24 hours. 
  About 7hs:30min to load the data and 16:09:25 to
  create the indexes
  
  The Power test still running and that’s why I’ll
 not
  present anything so far. Now I´ll just send to the
  list my environment configuration.
  
  - The configuration of the machine is:
  Dual opteron 64 bits model 240
  4GB RAM
  960 GB on RAID 0
  Mandrake Linux 64 with Kernel 2.6.5  (I compiled a
  kernel for this test)
  Java SDK  java version 1.4.2_04
  PostgreSQL JDBC pg74.1jdbc3.jar
  
  - The TPC-H configuration is:
  TPC-H 2.0.0
  100GB
  load using flat files
  Refresh functions using java
  
  - The PostgreSQL 7.4.2 configuration is:
  
  add_missing_from   | on
   australian_timezones   | off
   authentication_timeout | 60
   check_function_bodies  | on
   checkpoint_segments| 128
   checkpoint_timeout | 300
   checkpoint_warning | 30
   client_encoding| SQL_ASCII
   client_min_messages| notice
   commit_delay   | 0
   commit_siblings| 5
   cpu_index_tuple_cost   | 0.001
   cpu_operator_cost  | 0.0025
   cpu_tuple_cost | 0.01
   DateStyle  | ISO, MDY
   db_user_namespace  | off
   deadlock_timeout   | 1000
   debug_pretty_print | off
   debug_print_parse  | off
   debug_print_plan   | off
   debug_print_rewritten  | off
   default_statistics_target  | 10
   default_transaction_isolation  | read committed
   default_transaction_read_only  | off
   dynamic_library_path   | $libdir
   effective_cache_size   | 15
   enable_hashagg | on
   enable_hashjoin| on
   enable_indexscan   | on
   enable_mergejoin   | on
   enable_nestloop| on
   enable_seqscan | on
   enable_sort| on
   enable_tidscan | on
   explain_pretty_print   | on
   extra_float_digits | 0
   from_collapse_limit| 8
   fsync  | off
   geqo   | on
   geqo_effort| 1
   geqo_generations   | 0
  geqo_pool_size | 0
   geqo_selection_bias| 2
   geqo_threshold | 11
   join_collapse_limit| 8
   krb_server_keyfile | unset
   lc_collate | en_US
   lc_ctype   | en_US
   lc_messages| C
   lc_monetary| C
   lc_numeric | C
   lc_time| C
   log_connections| off
   log_duration   | off
   log_error_verbosity| default
   log_executor_stats | off
   log_hostname   | off
   log_min_duration_statement | -1
   log_min_error_statement| panic
   log_min_messages   | notice
   log_parser_stats   | off
   log_pid| off
   log_planner_stats  | off
   log_source_port| off
   log_statement  | off
   log_statement_stats| off
   log_timestamp  | off
   max_connections| 10
   

Re: [pgsql-advocacy] [PERFORM] MySQL vs PG TPC-H benchmarks

2004-04-21 Thread Jan Wieck
Josh Berkus wrote:

Folks,

I've sent a polite e-mail to Mr. Gomez offering our help.  Please, nobody 
flame him!

Please keep in mind that the entire test has, other than a similar 
database schema and query types maybe, nothing to do with a TPC-H. I 
don't see any kind of SUT. Foreign key support on the DB level is not 
required by any of the TPC benchmarks. But the System Under Test, which 
is the combination of middleware application and database together with 
all computers and network components these parts are running on, must 
implement all the required semantics, like ACID properties, referential 
integrity c. One could implement a TPC-H with flat files, it's just a 
major pain in the middleware.

A proper TPC benchmark implementation would for example be a complete 
PHP+DB application, where the user interaction is done by an emulated 
browser and what is measured is the http response times, not anything 
going on between PHP and the DB. Assuming that all requirements of the 
TPC specification are implemented by either using available DB features, 
or including appropriate workarounds in the PHP code, that would very 
well lead to something that can compare PHP+MySQL vs. PHP+PostgreSQL.

All TPC benchmarks I have seen are performed by timing such a system 
after a considerable rampup time, giving the DB system a chance to 
properly populate caches and so forth. Rebooting the machine just before 
the test is the wrong thing here and will especially kill any advanced 
cache algorithms like ARC.

Jan

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html