Bruno Wolff III wrote:
There should be parenthesis around the list to test.
WHERE a_name, a_type, a_dir NOT IN (
should be
WHERE (a_name, a_type, a_dir) NOT IN (

That did it (I think)!


I believe that the NOT IN query should run comparably to the LEFT JOIN example supplied by the other person (at least in recent versions of Postgres). I would expect this to run faster than using NOT EXISTS. You probably want to try all 3. The semantics of the three ways of doing this are not all equivalent if there are NULLs in the data being used to eliminate rows. As you indicated you don't have NULLs this shouldn't be a problem.

Another way to write this is using set different (EXCEPT or EXCEPT ALL)
using the key fields and then joining back to table a to pick up the
other fields. However this will almost certianly be slower than the
other methods.

Something odd, now that I have the other method working (I think)...

tle-bu=> SELECT a.file_name, a.file_parent_dir, a.file_type FROM file_info_1 a LEFT JOIN file_set_1 b ON a.file_name=b.fs_name AND a.file_parent_dir=b.fs_parent_dir AND a.file_type=b.fs_type WHERE b.fs_name IS NULL;

returns the results in roughly 1 or 2 seconds on a test data set of 15,000 entries. I have an index on both 'file_info_1' covering 'fs_name', 'fs_parent_dir' and 'fs_type' and on 'file_set_1' covering 'file_name', 'file_parent_dir' and 'file_type'. When I try the seconds method though:

tle-bu=> SELECT file_name, file_parent_dir, file_type FROM file_info_1 WHERE (file_name, file_parent_dir, file_type) NOT IN (SELECT fs_name, fs_parent_dir, fs_type FROM file_set_1);

It took so long to process that after roughly three minutes I stopped the query for fear of overheating my laptop (which happend a while back forcing a thermal shut down).

  The indexes are:

CREATE INDEX file_info_#_display_idx ON file_info_# (file_type, file_parent_dir, file_name);
CREATE INDEX file_set_#_sync_idx ON file_set_# (fs_name, fs_parent_dir, fs_type)


Are these not effective for the second query? If not, what should I change or add? If so, would you have any insight into why there is such an incredible difference in performance?

  Thanks very much again!!

Madison

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
     subscribe-nomail command to [EMAIL PROTECTED] so that your
     message can get through to the mailing list cleanly

Reply via email to