### Re: [HACKERS] cross column correlation revisted

hello ... a view is already nice but i think it is still too narrow. the problem is: you don't want a view for every potential join. in addition to that - ideally there is not much left of a view when it comes to checking for costs. so, i think, this is not the kind of approach leading to total

### Re: [HACKERS] cross column correlation revisted

On Thu, Jul 15, 2010 at 12:04:21PM +0200, Hans-Jürgen Schönig wrote: hello ... a view is already nice but i think it is still too narrow. the problem is: you don't want a view for every potential join. in addition to that - ideally there is not much left of a view when it comes to

### Re: [HACKERS] cross column correlation revisted

On Thu, Jul 15, 2010 at 12:04:21PM +0200, Hans-Jürgen Schönig wrote: hello ... a view is already nice but i think it is still too narrow. One sure way to fail is to take on a problem in chunks too large. If we get even one of the cross-column issues solved by statistics, we'll be ahead of

### [HACKERS] cross column correlation revisted

hello everybody, we are currently facing some serious issues with cross correlation issue. consider: 10% of all people have breast cancer. we have 2 genders (50:50). if i select all the men with breast cancer, i will get basically nobody - the planner will overestimate the output. this is the

### Re: [HACKERS] cross column correlation revisted

On 14/07/10 13:12, PostgreSQL - Hans-Jürgen Schönig wrote: hello everybody, we are currently facing some serious issues with cross correlation issue. consider: 10% of all people have breast cancer. we have 2 genders (50:50). if i select all the men with breast cancer, i will get basically

### Re: [HACKERS] cross column correlation revisted

On Jul 14, 2010, at 12:40 PM, Heikki Linnakangas wrote: On 14/07/10 13:12, PostgreSQL - Hans-Jürgen Schönig wrote: hello everybody, we are currently facing some serious issues with cross correlation issue. consider: 10% of all people have breast cancer. we have 2 genders (50:50). if i

### Re: [HACKERS] cross column correlation revisted

Heikki Linnakangas wrote: However, the problem is how to represent and store the cross-correlation. For fields with low cardinality, like gender and boolean breast-cancer-or-not you can count the prevalence of all the different combinations, but that doesn't scale. Another often cited example

### Re: [HACKERS] cross column correlation revisted

Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: On 14/07/10 13:12, PostgreSQL - Hans-Jürgen Schönig wrote: maybe somehow like this ... ALTER TABLE x SET CORRELATION STATISTICS FOR (id = id2 AND id3=id4) +1 is my general feeling, it's good if you can tell the system to collect

### Re: [HACKERS] cross column correlation revisted

On Wed, Jul 14, 2010 at 01:21:19PM +0200, Yeb Havinga wrote: Heikki Linnakangas wrote: However, the problem is how to represent and store the cross-correlation. For fields with low cardinality, like gender and boolean breast-cancer-or-not you can count the prevalence of all the

### Re: [HACKERS] cross column correlation revisted

hello tom, i think that having stats on an index is a problem by itself for 2 reasons - for cross column correlation at least: a.) joins cannot be covered by an index on two tables - we would fix inside a table correlation problems but not joins. b.) who says that there is

### Re: [HACKERS] cross column correlation revisted

=?iso-8859-1?Q?PostgreSQL_-_Hans-J=FCrgen_Sch=F6nig?= postg...@cybertec.at writes: i think that having stats on an index is a problem by itself for 2 reasons - for cross column correlation at least: a.) joins cannot be covered by an index on two tables - we would fix inside a table

### Re: [HACKERS] cross column correlation revisted

Tom Lane wrote: If the combination of columns is actually interesting, there might well be an index in place, or the DBA might be willing to create it. I'm having a hard time imagining an interesting case where that wouldn't be so. For that matter, have you considered the idea of

### Re: [HACKERS] cross column correlation revisted

hello ... look at the syntax i posted in more detail: ALTER TABLE x SET CORRELATION STATISTICS FOR (x.id = y.id AND x.id2 = y.id2) it says X and Y ... the selectivity of joins are what i am most interested in. cross correlation of columns within the same table are just a byproduct.

### Re: [HACKERS] cross column correlation revisted

On Wed, Jul 14, 2010 at 04:41:01PM +0200, PostgreSQL - Hans-Jürgen Schönig wrote: hello ... look at the syntax i posted in more detail: ALTER TABLE x SET CORRELATION STATISTICS FOR (x.id = y.id AND x.id2 = y.id2) it says X and Y ... the selectivity of joins are what i am most

### Re: [HACKERS] cross column correlation revisted

2010/7/14 Tom Lane t...@sss.pgh.pa.us: If the combination of columns is actually interesting, there might well be an index in place, or the DBA might be willing to create it. Indexes aren't free, though, nor even close to it. Still, I think we should figure out the underlying mechanism first

### Re: [HACKERS] cross column correlation revisted

On Wed, Jul 14, 2010 at 5:13 PM, Robert Haas robertmh...@gmail.com wrote: 2010/7/14 Tom Lane t...@sss.pgh.pa.us: If the combination of columns is actually interesting, there might well be an index in place, or the DBA might be willing to create it. Indexes aren't free, though, nor even close

### Re: [HACKERS] cross column correlation revisted

Joshua Tolley eggyk...@gmail.com writes: ALTER TABLE x SET CORRELATION STATISTICS FOR (x.id =3D y.id AND x.id= 2 =3D y.id2) =20 it says X and Y ... the selectivity of joins are what i am most interested in. cross correlation of columns within the same table are just a byproduct. the core