Re: [PERFORM] How to read query plan

Miroslav Šulc Sun, 13 Mar 2005 11:09:51 -0800

John Arbash Meinel wrote:

It's actually more of a question as to why you are doing left outer
joins, rather than simple joins.
Are the tables not fully populated? If so, why not?

Some records do not consist of full information (they are collected from different sources which use different approach to the data collection) so using INNER JOIN would cause some records wouldn't be displayed which is unacceptable.

How are you using this information? Why is it useful to get back rows
that don't have all of their information filled out?

Each row contains main information which are important. The other information are also important but may be missing. Information are display on lists of 30 rows or on a card. When using filter the query is much faster but the case without filter has these results.

Why is it useful to have so many columns returned? It seems like it most
cases, you are only going to be able to use *some* of the information,
why not create more queries that are specialized, rather than one get
everything query.

Many of the columns are just varchar(1) (because of the migration from MySQL enum field type) so the record is not so long as it could seem. These fields are just switches (Y(es) or N(o)). The problem is users can define their own templates and in different scenarios there might be displayed different information so reducing the number of fields would mean in some cases it wouldn't work as expected. But if we couldn't speed the query up, we will try to improve it other way. Is there any serious reason not to use so much fields except memory usage? It seems to me that it shouldn't have a great impact on the speed in this case.

Have you thought about using a cursor instead of using limit + offset?
This may not help the overall time, but it might let you split up when
the time is spent.
......

No. I come from MySQL world where these things are not common (at least when using MyISAM databases). The other reason (if I understand it well) is that the retrieval of the packages of 30 records is not sequential. Our app is web based and we use paging. User can select page 1 and then page 10, then go backward to page 9 etc.

And if I understand correctly, you consider all of these to be outer
joins. Meaning you want *all* of AdDevicesSites, and whatever info goes
along with it, but there are no restrictions as to what rows you want.
You want everything you can get.

Do you actually need *everything*? You mention only needing 30, what for?

For display of single page consisting of 30 rows. The reason I query all rows is that this is one of the filters users can use. User can display just bigboards or billboards (or specify more advanced filters) but he/she can also display AdDevices without any filter (page by page). Before I select the 30 row, I need to order them by a key and after that select the records, so this is also the reason why to ask for all rows. The key for sorting might be different for each run.

There is one possibility if we don't find anything nicer. Which is to
create a lazy materialized view. Basically, you run this query, and
store it in a table. Then when you want to do the SELECT, you just do
that against the unrolled table.
You can then create triggers, etc to keep the data up to date.
Here is a good documentation of it:
http://jonathangardner.net/PostgreSQL/materialized_views/matviews.html

It is basically a way that you can un-normalize data, in a safe way.

Also, another thing that you can do, is instead of using a cursor, you
can create a temporary table with the results of the query, and create a
primary key which is just a simple counter. Then instead of doing limit
+ offset, you can select * where id > 0 and id < 30; ... select * where
id > 30 and id < 60; etc.

It still requires the original query to be run, though, so it is not
necessarily optimal for you.

These might be the other steps in case we cannot speed-up the query. I would prefer to speed the query up :-)

Unfortunately, I don't really see any obvious problems with your query
in the way that you are using it. The problem is that you are not
applying any selectivity, so postgres has to go to all the tables, and
get all the rows, and then try to logically merge them together. It is
doing a hash merge, which is generally one of the faster ones and it
seems to be doing the right thing.

I would be curious to see how mysql was handling this query, to see if
there was something different it was trying to do. I'm also curious how
much of a difference there was.

In fact, on MySQL I didn't see any slow reactions so I didn't measure and inspect it. But I can try it if I figure out how to copy the database from PostgreSQL to MySQL.


John
=:->

Thank you for your time and help.

Miroslav

begin:vcard
fn;quoted-printable:Miroslav =C5=A0ulc
n;quoted-printable:=C5=A0ulc;Miroslav
org:StartNet s.r.o.
adr;quoted-printable;quoted-printable:;;Vrchlick=C3=A9ho 161/5;Praha 5;;150 00;=C4=8Cesk=C3=A1 republika
email;internet:[EMAIL PROTECTED]
title:CEO
tel;work:+420 257 225 602
tel;cell:+420 603 711 413
x-mozilla-html:TRUE
url:http://www.startnet.cz
version:2.1
end:vcard

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Re: [PERFORM] How to read query plan

Reply via email to