Thanks very much Matthew , its more then my expectation... Without changing the query is there any way to optimize it, like by changing the pg configuration for handling these kind queries?
-Arvind S On Mon, Oct 12, 2009 at 6:31 PM, Matthew Wakeling <matt...@flymine.org>wrote: > On Mon, 12 Oct 2009, S Arvind wrote: > >> I can understand left join, actually can any one tell me why sort >> operation is carried >> out and wat Materialize means... >> Can anyone explain me the mentioned plan with reason(s)? >> > > Merge Left Join (cost=62451.86..67379.08 rows=286789 width=0) >> Merge Cond: (a.id = b.id) >> -> Sort (cost=18610.57..18923.27 rows=125077 width=8) >> Sort Key: a.id >> -> Seq Scan on a (cost=0.00..6309.77 rows=125077 width=8) >> -> Materialize (cost=43841.28..47426.15 rows=286789 width=8) >> -> Sort (cost=43841.28..44558.26 rows=286789 width=8) >> Sort Key: b.id >> -> Seq Scan on b (cost=0.00..13920.89 rows=286789 width=8) >> > > This is a merge join. A merge join joins together two streams of data, > where both streams are sorted, by placing the two streams side by side and > advancing through both streams finding matching rows. The algorithm can use > a pointer to a position in both of the streams, and advance the pointer of > the stream that has the earlier value according to the sort order, and > therefore get all the matches. > > You are performing a query over the whole of both of the tables, so the > cheapest way to obtain a sorted stream of data is to do a full sequential > scan of the whole table, bring it into memory, and sort it. An alternative > would be to follow a B-tree index if one was available on the correct > column, but that is usually more expensive unless the table is clustered on > the index or only a small portion of the table is to be read. If you had put > a "LIMIT 10" clause on the end of the query and had such an index, it would > probably switch to that strategy instead. > > The materialise step is effectively a buffer that allows one of the streams > to be rewound cheaply, which will be necessary if there are multiple rows > with the same value. > > Does that answer your question? > > Matthew > > -- > The only secure computer is one that's unplugged, locked in a safe, > and buried 20 feet under the ground in a secret location...and i'm not > even too sure about that one. --Dennis Huges, FBI