Hi, I've just been referred here after a conversion on IRC and everybody seemed to think I've stumbled upon some strangeness.
The planner (in PG version 8.0.2) is choosing what it thinks is a more expensive plan. I've got a table of animals (about 3M rows) and their movements (about 16M rows), and I'm trying to execute this query: SELECT a.birthlocnid, m.locnid FROM animals a LEFT JOIN movements m ON (a.animalid = m.animalid AND m.mtypeid=0) LIMIT 10; If I have "work_mem" set to something small (1000) it uses this plan: QUERY PLAN Limit (cost=0.00..202.52 rows=10 width=8) (actual time=0.221..0.600 rows=10 loops=1) -> Merge Left Join (cost=0.00..66888828.30 rows=3302780 width=8) (actual time=0.211..0.576 rows=10 loops=1) Merge Cond: ("outer".animalid = "inner".animalid) -> Index Scan using animals_pkey on animals a (cost=0.00..10198983.91 rows=3302780 width=8) (actual time=0.112..0.276 rows=10 loops=1) -> Index Scan using movement_animal on movements m (cost=0.00..56642740.73 rows=3107737 width=8) (actual time=0.088..0.235 rows=10 loops=1) Filter: (mtypeid = 0) Total runtime: 0.413 ms But if I increase "work_mem" to 10000 it uses this plan: QUERY PLAN Limit (cost=565969.42..566141.09 rows=10 width=8) (actual time=27769.047..27769.246 rows=10 loops=1) -> Merge Right Join (cost=565969.42..57264070.77 rows=3302780 width=8) (actual time=27769.043..27769.228 rows=10 loops=1) Merge Cond: ("outer".animalid = "inner".animalid) -> Index Scan using movement_animal on movements m (cost=0.00..56642740.73 rows=3107737 width=8) (actual time=0.022..0.154 rows=10 loops=1) Filter: (mtypeid = 0) -> Sort (cost=565969.42..574226.37 rows=3302780 width=8) (actual time=27768.991..27769.001 rows=10 loops=1) Sort Key: a.animalid -> Seq Scan on animals a (cost=0.00..77086.80 rows=3302780 width=8) (actual time=0.039..5620.651 rows=3303418 loops=1) Total runtime: 27851.097 ms I've tried playing with the statistics as people suggested on IRC but to no effect. There was some discussion about why it would be doing this, but nothing obvious came out of it. SHOW ALL output is at the end of this mail but it should be pretty standard apart from: shared_buffers = 10000 work_mem = 8192 max_connections = 100 effective_cache_size = 10000 Hope that's enough information to be useful. Thanks. Sam name | setting --------------------------------+-------------------------------- add_missing_from | on archive_command | /home/postgres/pgarchive "%p" australian_timezones | off authentication_timeout | 60 bgwriter_delay | 200 bgwriter_maxpages | 100 bgwriter_percent | 1 block_size | 8192 check_function_bodies | on checkpoint_segments | 3 checkpoint_timeout | 300 checkpoint_warning | 30 client_encoding | SQL_ASCII client_min_messages | notice commit_delay | 0 commit_siblings | 5 config_file | /home/pgdata/postgresql.conf cpu_index_tuple_cost | 0.001 cpu_operator_cost | 0.0025 cpu_tuple_cost | 0.01 custom_variable_classes | unset data_directory | /home/pgdata DateStyle | ISO, MDY db_user_namespace | off deadlock_timeout | 1000 debug_pretty_print | off debug_print_parse | off debug_print_plan | off debug_print_rewritten | off debug_shared_buffers | 0 default_statistics_target | 10 default_tablespace | unset default_transaction_isolation | read committed default_transaction_read_only | off default_with_oids | on dynamic_library_path | $libdir effective_cache_size | 10000 enable_hashagg | on enable_hashjoin | on enable_indexscan | on enable_mergejoin | on enable_nestloop | on enable_seqscan | off enable_sort | on enable_tidscan | on explain_pretty_print | on external_pid_file | unset extra_float_digits | 0 from_collapse_limit | 8 fsync | on geqo | on geqo_effort | 5 geqo_generations | 0 geqo_pool_size | 0 geqo_selection_bias | 2 geqo_threshold | 12 hba_file | /home/pgdata/pg_hba.conf ident_file | /home/pgdata/pg_ident.conf integer_datetimes | off join_collapse_limit | 8 krb_server_keyfile | unset lc_collate | C lc_ctype | C lc_messages | C lc_monetary | C lc_numeric | C lc_time | C listen_addresses | * log_connections | on log_destination | stderr log_directory | pg_log log_disconnections | off log_duration | off log_error_verbosity | default log_executor_stats | off log_filename | postgresql-%Y-%m-%d_%H%M%S.log log_hostname | off log_line_prefix | %t %u log_min_duration_statement | -1 log_min_error_statement | panic log_min_messages | notice log_parser_stats | off log_planner_stats | off log_rotation_age | 1440 log_rotation_size | 10240 log_statement | all log_statement_stats | off log_truncate_on_rotation | off maintenance_work_mem | 256000 max_connections | 100 max_files_per_process | 1000 max_fsm_pages | 20000 max_fsm_relations | 1000 max_function_args | 32 max_identifier_length | 63 max_index_keys | 32 max_locks_per_transaction | 64 max_stack_depth | 2048 password_encryption | on port | 5432 pre_auth_delay | 0 preload_libraries | unset random_page_cost | 4 redirect_stderr | off regex_flavor | advanced rendezvous_name | unset search_path | $user,public server_encoding | SQL_ASCII server_version | 8.0.2 shared_buffers | 1000 silent_mode | off sql_inheritance | on ssl | off statement_timeout | 0 stats_block_level | off stats_command_string | off stats_reset_on_server_start | on stats_row_level | off stats_start_collector | on superuser_reserved_connections | 2 syslog_facility | LOCAL0 syslog_ident | postgres TimeZone | GMT trace_notify | off transaction_isolation | read committed transaction_read_only | off transform_null_equals | off unix_socket_directory | unset unix_socket_group | unset unix_socket_permissions | 511 vacuum_cost_delay | 0 vacuum_cost_limit | 200 vacuum_cost_page_dirty | 20 vacuum_cost_page_hit | 1 vacuum_cost_page_miss | 10 wal_buffers | 8 wal_sync_method | fdatasync work_mem | 128000 zero_damaged_pages | off ---------------------------(end of broadcast)--------------------------- TIP 4: Don't 'kill -9' the postmaster