Hello again,

This question is related to my previous one (Unused table of view, see
http://archives.postgresql.org/pgsql-performance/2004-06/msg00043.php).
For the moment, I have queries where I join tables by hand. Since a
few tables are always joined together, I thought I could define a view
to centralize this and make my queries more readable. But I then
observe a drop in performances on some queries because it seems the
view is not "broken" by the planner, so some optimizations cannot
occur anymore. Maybe this assertion is plain wrong, it's just my
feeling of the situation.

I'm using postgresql 7.4.2 on Debian GNU/Linux.

Here are the details of my tables, queries and views:

CREATE TABLE pictures (
        PictureID serial PRIMARY KEY,
        RollID character varying(64) NOT NULL REFERENCES rolls,
        FrameID character varying(64) NOT NULL,
        Description character varying(255),
        Filename character varying(255),
        Owner integer NOT NULL REFERENCES users,
        EntryDate datetime DEFAULT now(),
        Date datetime,
        NbClick integer DEFAULT 0,
        NbRates integer DEFAULT 0,
        MaxRate int2,
        MinRate int2,
        AverageRate float4 DEFAULT 5,
        SumRates integer DEFAULT 0);

-- Each picture can belong to a number of topics
CREATE TABLE topicscontent (
        TopicID integer REFERENCES topics ON DELETE cascade,
        PictureID integer REFERENCES pictures ON DELETE cascade,
        Direct boolean NOT NULL,
        PRIMARY KEY (TopicID,PictureID) );

-- Each picture can be viewed by a number of groups
CREATE TABLE permissions (
        GroupID integer NOT NULL REFERENCES groups ON DELETE cascade,
        PictureID integer NOT NULL REFERENCES pictures ON DELETE cascade,
        UNIQUE (GroupID, PictureID));

-- Each user can belong to a number of groups
CREATE TABLE groupsdef (
        UserID integer REFERENCES users,
        GroupID integer REFERENCES groups,
        PRIMARY KEY (UserID,GroupID));

-- Each picture can have a number of keywords
CREATE TABLE keywords (
        Type integer,
        PictureID integer NOT NULL REFERENCES pictures ON DELETE cascade,
        Value character varying(128) NOT NULL,
        UNIQUE (Type,PictureID,Value));


Without views, if I want all the picture with a keyword value of
'laurent' that a user with ID of 2 can see, sorted by AverageRate:

SELECT DISTINCT ON (AverageRate,PictureID) P.*
       FROM Pictures AS P, GroupsDef AS G, Permissions AS A, Keywords AS K
       WHERE P.PictureID=A.PictureID 
             AND G.GroupID=A.GroupID 
             AND K.Value in ('laurent') 
             AND K.PictureID=P.PictureID 
             AND UserID=2
       ORDER BY AverageRate,PictureID;

QUERY PLAN                                                                     

 Unique  (cost=528.93..532.71 rows=504 width=97) (actual time=32.447..33.062 rows=274 
loops=1)
   ->  Sort  (cost=528.93..530.19 rows=504 width=97) (actual time=32.443..32.590 
rows=505 loops=1)
         Sort Key: p.averagerate, p.pictureid
         ->  Hash Join  (cost=297.36..506.31 rows=504 width=97) (actual 
time=12.495..29.312 rows=505 loops=1)
               Hash Cond: ("outer".groupid = "inner".groupid)
               ->  Hash Join  (cost=292.14..466.79 rows=900 width=101) (actual 
time=12.056..26.180 rows=750 loops=1)
                     Hash Cond: ("outer".pictureid = "inner".pictureid)
                     ->  Seq Scan on permissions a  (cost=0.00..125.05 rows=8305 
width=8) (actual time=0.007..6.271 rows=8305 loops=1)
                     ->  Hash  (cost=291.43..291.43 rows=285 width=101) (actual 
time=11.961..11.961 rows=0 loops=1)
                           ->  Hash Join  (cost=110.26..291.43 rows=285 width=101) 
(actual time=6.378..11.573 rows=274 loops=1)
                                 Hash Cond: ("outer".pictureid = "inner".pictureid)
                                 ->  Seq Scan on pictures p  (cost=0.00..68.33 
rows=2933 width=97) (actual time=0.007..2.426 rows=2933 loops=1)
                                 ->  Hash  (cost=109.55..109.55 rows=285 width=4) 
(actual time=6.163..6.163 rows=0 loops=1)
                                       ->  Seq Scan on keywords k  (cost=0.00..109.55 
rows=285 width=4) (actual time=0.032..5.929 rows=274 loops=1)
                                             Filter: ((value)::text = 'laurent'::text)
               ->  Hash  (cost=5.19..5.19 rows=12 width=4) (actual time=0.217..0.217 
rows=0 loops=1)
                     ->  Seq Scan on groupsdef g  (cost=0.00..5.19 rows=12 width=4) 
(actual time=0.038..0.197 rows=11 loops=1)
                           Filter: (userid = 2)
 Total runtime: 33.554 ms

Now, if I use the following view to abstract access rights:

CREATE VIEW userpictures (
       PictureID,RollID,FrameID,Description,Filename,
       Owner,EntryDate,Date,
       NbClick,NbRates,MaxRate,MinRate,AverageRate,SumRates,
       UserID) 
   AS SELECT DISTINCT ON (Permissions.PictureID,UserID) 
             Pictures.PictureID,RollID,FrameID,Description,Filename,Owner,
             EntryDate,Date,NbClick,NbRates,MaxRate,MinRate,AverageRate,SumRates,
             UserID 
         FROM Permissions 
              JOIN Groupsdef using (GroupID) 
              JOIN pictures using (PictureID);

The query I use is:

SELECT P.*
       FROM UserPictures AS P, Keywords AS K
       WHERE P.PictureID=K.PictureID 
             AND K.Value in ('laurent') 
             AND UserID=2
       ORDER BY AverageRate,PictureID;

QUERY PLAN                                                                             
                

 Sort  (cost=1126.98..1128.54 rows=622 width=438) (actual time=142.053..142.132 
rows=274 loops=1)
   Sort Key: p.averagerate, p.pictureid
   ->  Merge Join  (cost=995.75..1098.12 rows=622 width=438) (actual 
time=116.412..140.481 rows=274 loops=1)
         Merge Cond: ("outer".pictureid = "inner".pictureid)
         ->  Subquery Scan p  (cost=874.58..955.99 rows=4652 width=438) (actual 
time=108.709..130.049 rows=2361 loops=1)
               ->  Unique  (cost=874.58..909.47 rows=4652 width=105) (actual 
time=108.685..119.661 rows=2361 loops=1)
                     ->  Sort  (cost=874.58..886.21 rows=4652 width=105) (actual 
time=108.676..110.185 rows=4403 loops=1)
                           Sort Key: permissions.pictureid, groupsdef.userid
                           ->  Hash Join  (cost=388.35..591.19 rows=4652 width=105) 
(actual time=32.031..63.322 rows=5076 loops=1)
                                 Hash Cond: ("outer".pictureid = "inner".pictureid)
                                 ->  Seq Scan on pictures  (cost=0.00..68.33 rows=2933 
width=97) (actual time=0.011..2.836 rows=2933 loops=1)
                                 ->  Hash  (cost=376.72..376.72 rows=4652 width=8) 
(actual time=31.777..31.777 rows=0 loops=1)
                                       ->  Merge Join  (cost=5.40..376.72 rows=4652 
width=8) (actual time=0.285..27.805 rows=5076 loops=1)
                                             Merge Cond: ("outer".groupid = 
"inner".groupid)
                                             ->  Index Scan using 
permissions_groupid_key on permissions  (cost=0.00..280.77 rows=8305 width=8) (actual 
time=0.031..13.018 rows=7633 loops=1)
                                             ->  Sort  (cost=5.40..5.43 rows=12 
width=8) (actual time=0.237..1.762 rows=5078 loops=1)
                                                   Sort Key: groupsdef.groupid
                                                   ->  Seq Scan on groupsdef  
(cost=0.00..5.19 rows=12 width=8) (actual time=0.034..0.203 rows=11 loops=1)
                                                         Filter: (userid = 2)
         ->  Sort  (cost=121.17..121.88 rows=285 width=4) (actual time=6.987..7.065 
rows=274 loops=1)
               Sort Key: k.pictureid
               ->  Seq Scan on keywords k  (cost=0.00..109.55 rows=285 width=4) 
(actual time=0.056..6.656 rows=274 loops=1)
                     Filter: ((value)::text = 'laurent'::text)
 Total runtime: 144.045 ms

To me the only difference between the queries is that second one
includes a UserID column. One strange thing is that width from 105
jumps to 438 during "Subquery Scan p".

I think it's because of DISTINCT ON in the view. If I use this view:


CREATE VIEW userpictures (
       PictureID,RollID,FrameID,Description,Filename,
       Owner,EntryDate,Date,
       NbClick,NbRates,MaxRate,MinRate,AverageRate,SumRates,
       UserID) 
   AS SELECT Pictures.PictureID,RollID,FrameID,Description,Filename,Owner,
             EntryDate,Date,NbClick,NbRates,MaxRate,MinRate,AverageRate,SumRates,
             UserID 
         FROM Permissions 
              JOIN Groupsdef using (GroupID) 
              JOIN pictures using (PictureID);

and this query:

SELECT DISTINCT ON (AverageRate,PictureID) P.*
       FROM UserPictures AS P, Keywords AS K
       WHERE P.PictureID=K.PictureID 
             AND K.Value in ('laurent') 
             AND UserID=2
       ORDER BY AverageRate,PictureID;

The result is similar to the query without the view:

QUERY PLAN                                                                     

 Unique  (cost=525.39..529.17 rows=504 width=101) (actual time=34.689..35.287 rows=274 
loops=1)
   ->  Sort  (cost=525.39..526.65 rows=504 width=101) (actual time=34.686..34.828 
rows=505 loops=1)
         Sort Key: pictures.averagerate, pictures.pictureid
         ->  Hash Join  (cost=297.36..502.76 rows=504 width=101) (actual 
time=11.786..31.739 rows=505 loops=1)
               Hash Cond: ("outer".groupid = "inner".groupid)
               ->  Hash Join  (cost=292.14..466.79 rows=807 width=101) (actual 
time=11.409..28.389 rows=750 loops=1)
                     Hash Cond: ("outer".pictureid = "inner".pictureid)
                     ->  Seq Scan on permissions  (cost=0.00..125.05 rows=8305 
width=8) (actual time=0.004..9.177 rows=8305 loops=1)
                     ->  Hash  (cost=291.43..291.43 rows=285 width=101) (actual 
time=11.319..11.319 rows=0 loops=1)
                           ->  Hash Join  (cost=110.26..291.43 rows=285 width=101) 
(actual time=5.919..10.961 rows=274 loops=1)
                                 Hash Cond: ("outer".pictureid = "inner".pictureid)
                                 ->  Seq Scan on pictures  (cost=0.00..68.33 rows=2933 
width=97) (actual time=0.004..2.258 rows=2933 loops=1)
                                 ->  Hash  (cost=109.55..109.55 rows=285 width=4) 
(actual time=5.700..5.700 rows=0 loops=1)
                                       ->  Seq Scan on keywords k  (cost=0.00..109.55 
rows=285 width=4) (actual time=0.029..5.471 rows=274 loops=1)
                                             Filter: ((value)::text = 'laurent'::text)
               ->  Hash  (cost=5.19..5.19 rows=12 width=8) (actual time=0.198..0.198 
rows=0 loops=1)
                     ->  Seq Scan on groupsdef  (cost=0.00..5.19 rows=12 width=8) 
(actual time=0.031..0.178 rows=11 loops=1)
                           Filter: (userid = 2)
 Total runtime: 35.657 ms


-- 
Laurent Martelli
[EMAIL PROTECTED]                                Java Aspect Components
http://www.aopsys.com/                          http://jac.objectweb.org


---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Reply via email to