Hello, I having a great deal of difficulty getting postgres to do a hash join. Even if I disable nestloop and mergejoin in postgres.conf, the optimizer still refuses to select hash join. This behavior is killing my performance.
Postgres version is 7.3.2 and relevant tables are vacuum analyzed. Here's an overview of what I'm doing: I have one table of network logs ordered by time values. The other table is a set of hosts (approximately 60) that are infected by a worm. I want to do this query on the dataset: standb=# explain SELECT count (allflow_tv_sobig.tv_s) FROM allflow_tv_sobig, blaster_set WHERE allflow_tv_sobig.src = blaster_set.label AND allflow_tv_sobig.tv_s >= 1060101118::bigint and allflow_tv_sobig.tv_s < 1060187518::bigint; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------ Aggregate (cost=185785.06..185785.06 rows=1 width=32) -> Merge Join (cost=174939.71..184986.38 rows=319472 width=32) Merge Cond: ("outer".label = "inner".src) -> Index Scan using blaster_set_x on blaster_set (cost=0.00..3.67 rows=66 width=12) -> Sort (cost=174939.71..178073.92 rows=1253684 width=20) Sort Key: allflow_tv_sobig.src -> Index Scan using allflow_tv_sobig_x on allflow_tv_sobig (cost=0.00..47955.63 rows=1253684 width=20) Index Cond: ((tv_s >= 1060101118::bigint) AND (tv_s < 1060187518::bigint)) (8 rows) Basically I just want to use the smaller table as a filtering mechanism so that I only get resulted for hosts in that table. Rather than do the sensible thing, which is scan the list of infected hosts, then scan the traffic table and ignore entries that aren't in the first list, the optimizer insists on SORTING the table of network traffic according to source address. Considering that this table is very large, these queries are taking forever. Doing it in a nested loop, while it doesn't require sorting, still takes a very long time as well. Is there anyway that I can force the optimizer to do this the right way, aside from adding each IP manually to a disgustingly bloated 'where' clause? Thanks, -S ---------------------------(end of broadcast)--------------------------- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html