the orginal query has more output columns. I reduced it for readability. Specifically it returns a persitent object (flatobj column) which needs to be processed by the application as the returned result. The problem of the huge sort space usage seems to be that the flatobj is part of the row, so it used always copied in the sort algorithm I guess. When I drop the flatobj from the output columns the size of the temp space file drops dramatically. So I'll probably need to read flatobj after the sorting from the limited return result in a subselect.
Tom Lane wrote:
[EMAIL PROTECTED] (Dirk Lutzebaeck) writes:
SELECT DISTINCT ON (df.val_9, df.created, df.flatid) df.docindex, df.flatobj, bi.oid, bi.en
ORDER BY df.val_9 ASC, df.created DESC
LIMIT 1000 OFFSET 0
Just out of curiosity, what is this query supposed to *do* exactly? It looks to me like it will give indeterminate results. Practical uses of DISTINCT ON generally specify more ORDER BY columns than there are DISTINCT ON columns, because the extra columns determine which rows have priority to survive the DISTINCT filter. With the above query, you have absolutely no idea which row will be output for a given combination of val_9/created/flatid.
regards, tom lane
---------------------------(end of broadcast)--------------------------- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]