Re: [PERFORM] single index on more than two coulumns a bad thing?
another thing that I have all over the place is a hierarchy: index on grandfather_table(grandfather) index on father_table(grandfather, father) index on son_table(grandfather, father, son) almost all of my indices are composite. Are you thinking about composite indices with low cardinality leading columns? /Aaron - Original Message - From: Josh Berkus [EMAIL PROTECTED] To: Palle Girgensohn [EMAIL PROTECTED]; [EMAIL PROTECTED] Sent: Thursday, April 01, 2004 7:35 PM Subject: Re: [PERFORM] single index on more than two coulumns a bad thing? Palle, Is it always bad to create index xx on yy (field1, field2, field3); No, it seldom bad, in fact.I have some indexes that run up to seven columns, becuase they are required for unique keys. Indexes of 3-4 columns are often *required* for many-to-many join tables. I'm afraid that you've been given some misleading advice. I guess the problem is that the index might often grow bigger than the table, or at least big enough not to speed up the queries? Well, yes ... a 4-column index on a 5-column table could be bigger than the table if allowed to bloat and not re-indexed. But that's just a reason for better maintainence. -- -Josh Berkus Aglio Database Solutions San Francisco ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
[PERFORM] Spatial join insists on sequential scan of larger table
I am trying to do a spatial join between two tables each of which has a column of type BOX called ERRBOX, with R-TREE indices created on both. The smaller table, xmm1, has 56,711 rows, the larger one, twomass, has 177,757,299 rows. The most efficient way to join these is to do a sequential scan of the smaller table, and an R-tree lookup on the larger. However for a simple inner join the optimiser seems to want to do the reverse, for example: EXPLAIN SELECT x.ra AS xra, x.decl AS xdecl, t.ra AS tra, t.decl AS tdecl FROM xmm1 AS x INNER JOIN twomass AS t ON x.errbox t.errbox; QUERY PLAN -- Nested Loop (cost=0.00..196642756520.34 rows=49506496044 width=32) - Seq Scan on twomass t (cost=0.00..9560002.72 rows=177023872 width=48) - Index Scan using xmm1box on xmm1 x (cost=0.00..1107.28 rows=280 width=48) Index Cond: (x.errbox outer.errbox) Reversing the join condition (i.e. t.errbox x.errbox) and similar make no difference, nor does using the old implicit join syntax. If, however, I specify an outer join such as: EXPLAIN SELECT x.ra AS xra, x.decl AS xdecl, t.ra AS tra, t.decl AS tdecl FROM xmm1 AS x LEFT OUTER JOIN twomass AS t ON x.errbox t.errbox; QUERY PLAN Nested Loop Left Join (cost=0.00..198945259325.90 rows=49506496044 width=32) - Seq Scan on xmm1 x (cost=0.00..8592.32 rows=55932 width=48) - Index Scan using tbox on twomass t (cost=0.00..3545848.88 rows=885119 width=48) Index Cond: (outer.errbox t.errbox) This executes, it need hardly be said, a whole lot faster. I found that I can also force a sequential scan of the smaller table by dropping its R-tree index, but I may need this in other operations, so this isn't a very satisfactory solution. It's odd that an outer join should be faster than an inner one, or to put it another way, after dropping an index there is more than an order of magnitude speed increase. I'm using Postgres 7.4.1 on Red Hat Linux. Has anyone had similar problems with spatial joins? -- Clive Page Dept of Physics Astronomy, University of Leicester, Leicester, LE1 7RH, U.K. ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Spatial join insists on sequential scan of larger table
Clive Page [EMAIL PROTECTED] writes: This executes, it need hardly be said, a whole lot faster. Could we see EXPLAIN ANALYZE output? The estimated costs for the two cases are nearly the same, which says to me that there's something wrong with the cost model for r-tree lookups, but I don't know what it is. regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PERFORM] single index on more than two coulumns a bad thing?
On Fri, Apr 02, 2004 at 01:00:45 +0200, Palle Girgensohn [EMAIL PROTECTED] wrote: Is it always bad to create index xx on yy (field1, field2, field3); I guess the problem is that the index might often grow bigger than the table, or at least big enough not to speed up the queries? One place where you need them in postgres is enforcing unique multicolumn keys. These will get created implicitly from the unique (or primary key) constraint. It isn't all that unusual to have a table that describes a many to many (to many ...) relationship where the primary key is all of the columns. ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [PERFORM] Spatial join insists on sequential scan of larger
On Fri, 2 Apr 2004, Tom Lane wrote: Could we see EXPLAIN ANALYZE output? Certainly, but that's going to take a little time (as the ANALYZE causes it to run the actual query, which I only just discovered), so may have to wait until Monday if I don't get time to finish it this afternoon. -- Clive Page Dept of Physics Astronomy, University of Leicester, Leicester, LE1 7RH, U.K. ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings