subject:"\\\[HACKERS\\\] cross column correlation revisted"

Re: [HACKERS] cross column correlation revisted

2010-07-15 Thread Hans-Jürgen Schönig

hello ...

a view is already nice but i think it is still too narrow. 
the problem is: you don't want a view for every potential join.
in addition to that - ideally there is not much left of a view when it comes to 
checking for costs.
so, i think, this is not the kind of approach leading to total success here.

one side question: does anybody happen to know how this is one in oracle or db2?

many thanks,

hans



On Jul 15, 2010, at 1:33 AM, Dimitri Fontaine wrote:

 Joshua Tolley eggyk...@gmail.com writes:
   ALTER TABLE x SET CORRELATION STATISTICS FOR (x.id =3D y.id AND x.id=
 2 =3D y.id2)
 =20
 it says X and Y ...  the selectivity of joins are what i am most
 interested in. cross correlation of columns within the same table are
 just a byproduct.  the core thing is: how can i estimate the number
 of rows returned from a join?
 
 All the discussion of this topic that I've seen has been limited to the s=
 ingle
 table case. The hard problem in that case is coming up with something you=
 can
 precalculate that will actually be useful during query planning, without
 taking too much disk, memory, CPU, or something else. Expanding the discu=
 ssion
 to include join relations certainly still has valid use cases, but is even
 harder, because you've also got to keep track of precisely how the underl=
 ying
 relations are joined, so you know in what context the statistics remain v=
 alid.
 
 Well I've been proposing to handle the correlation problem in another
 way in some past mails here, and I've been trying to write it down too:
 
  http://archives.postgresql.org/pgsql-performance/2009-06/msg00118.php
  http://tapoueh.org/char10.html#sec13
 
 What I propose is to extend ANALYZE to be able to work on a VIEW too,
 rather than just a table. The hard parts seems to be:
 
 a. what stats to record, exploiting the view definition the best we can
 b. how to match a user query against the view definitions we have in
order to actually use the stats
 
 If you have answers or good ideas=C2=A0:)
 
 Regards,
 --=20
 dim
 
 
 -- 
 dim
 


--
Cybertec Schönig  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] cross column correlation revisted

2010-07-15 Thread Joshua Tolley

On Thu, Jul 15, 2010 at 12:04:21PM +0200, Hans-Jürgen Schönig wrote:
 hello ...
 
 a view is already nice but i think it is still too narrow. 
 the problem is: you don't want a view for every potential join.
 in addition to that - ideally there is not much left of a view when it comes 
 to checking for costs.
 so, i think, this is not the kind of approach leading to total success here.

The prolem is a very big one, and it's helpful to try solving it one piece at
a time, so the single table and view-based cases are probably good starting
points.

 one side question: does anybody happen to know how this is one in oracle or 
 db2?

Neither appear to handle multi-column statistics in any form.

[1] http://download.oracle.com/docs/cd/B13789_01/appdev.101/b10802/d_stats.htm
[2]
http://www.ibm.com/developerworks/data/library/techarticle/dm-0606fechner/index.html

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com


signature.asc
Description: Digital signature

Re: [HACKERS] cross column correlation revisted

2010-07-15 Thread David Fetter

On Thu, Jul 15, 2010 at 12:04:21PM +0200, Hans-Jürgen Schönig wrote:
 hello ...
 
 a view is already nice but i think it is still too narrow. 

One sure way to fail is to take on a problem in chunks too large.  If
we get even one of the cross-column issues solved by statistics, we'll
be ahead of all our competition, both free and proprietary.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] cross column correlation revisted

2010-07-14 Thread PostgreSQL - Hans-Jürgen Schönig

hello everybody,

we are currently facing some serious issues with cross correlation issue.
consider: 10% of all people have breast cancer. we have 2 genders (50:50).
if i select all the men with breast cancer, i will get basically nobody - the 
planner will overestimate the output.
this is the commonly known problem ...

this cross correlation problem can be quite nasty in many many cases.
underestimated nested loops can turn joins into a never ending nightmare and so 
on and so on.

my ideas is the following:
what if we allow users to specifiy cross-column combinations where we keep 
separate stats?
maybe somehow like this ...

ALTER TABLE x SET CORRELATION STATISTICS FOR (id = id2 AND id3=id4)

or ...

ALTER TABLE x SET CORRELATION STATISTICS FOR (x.id = y.id AND x.id2 = 
y.id2)

clearly we cannot store correlation for all combinations of all columns so we 
somehow have to limit it.

what is the general feeling about something like that?

many thanks,

hans

--
Cybertec Schönig  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] cross column correlation revisted

2010-07-14 Thread Heikki Linnakangas


On 14/07/10 13:12, PostgreSQL - Hans-Jürgen Schönig wrote:

hello everybody,

we are currently facing some serious issues with cross correlation issue.
consider: 10% of all people have breast cancer. we have 2 genders (50:50).
if i select all the men with breast cancer, i will get basically nobody - the 
planner will overestimate the output.
this is the commonly known problem ...

this cross correlation problem can be quite nasty in many many cases.
underestimated nested loops can turn joins into a never ending nightmare and so 
on and so on.

my ideas is the following:
what if we allow users to specifiy cross-column combinations where we keep 
separate stats?
maybe somehow like this ...

ALTER TABLE x SET CORRELATION STATISTICS FOR (id = id2 AND id3=id4)

or ...

ALTER TABLE x SET CORRELATION STATISTICS FOR (x.id = y.id AND x.id2 = 
y.id2)

clearly we cannot store correlation for all combinations of all columns so we 
somehow have to limit it.

what is the general feeling about something like that?


+1 is my general feeling, it's good if you can tell the system to 
collect additional statistics where needed. And once you have that, you 
can write an agent or something to detect automatically which extra 
statistics might be useful.


However, the problem is how to represent and store the 
cross-correlation. For fields with low cardinality, like gender and 
boolean breast-cancer-or-not you can count the prevalence of all the 
different combinations, but that doesn't scale. Another often cited 
example is zip code + street address. There's clearly a strong 
correlation between them, but how do you represent that?


For scalar values we currently store a histogram. I suppose we could 
create a 2D histogram for two columns, but that doesn't actually help 
with the zip code + street address problem.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] cross column correlation revisted

2010-07-14 Thread PostgreSQL - Hans-Jürgen Schönig


On Jul 14, 2010, at 12:40 PM, Heikki Linnakangas wrote:

 On 14/07/10 13:12, PostgreSQL - Hans-Jürgen Schönig wrote:
 hello everybody,
 
 we are currently facing some serious issues with cross correlation issue.
 consider: 10% of all people have breast cancer. we have 2 genders (50:50).
 if i select all the men with breast cancer, i will get basically nobody - 
 the planner will overestimate the output.
 this is the commonly known problem ...
 
 this cross correlation problem can be quite nasty in many many cases.
 underestimated nested loops can turn joins into a never ending nightmare and 
 so on and so on.
 
 my ideas is the following:
 what if we allow users to specifiy cross-column combinations where we keep 
 separate stats?
 maybe somehow like this ...
 
  ALTER TABLE x SET CORRELATION STATISTICS FOR (id = id2 AND id3=id4)
 
 or ...
 
  ALTER TABLE x SET CORRELATION STATISTICS FOR (x.id = y.id AND x.id2 = 
 y.id2)
 
 clearly we cannot store correlation for all combinations of all columns so 
 we somehow have to limit it.
 
 what is the general feeling about something like that?
 
 +1 is my general feeling, it's good if you can tell the system to collect 
 additional statistics where needed. And once you have that, you can write an 
 agent or something to detect automatically which extra statistics might be 
 useful.
 


it seems i can leave my bunker where i was hiding for cover when i was waiting 
for a reply ;).
yes, my idea was to have an agent as well - but this is just some follow up 
problem.


 However, the problem is how to represent and store the cross-correlation. For 
 fields with low cardinality, like gender and boolean breast-cancer-or-not 
 you can count the prevalence of all the different combinations, but that 
 doesn't scale. Another often cited example is zip code + street address. 
 There's clearly a strong correlation between them, but how do you represent 
 that?


we could play the same story with a table storing people including their home 
country and the color of their skin.
obviously we will have more black people in african countries..


 
 For scalar values we currently store a histogram. I suppose we could create a 
 2D histogram for two columns, but that doesn't actually help with the zip 
 code + street address problem.
 


i think we might go for a second relation here specifically for this issue and 
a boolean flag in the current stats table indicating that additional 
correlation stats exist (to avoid an additional lookup unless really necessary).
do you have a useful syntax in mind? the thing is: this issue can be isolated 
inside a table (e.g. WHERE a.id = a.id2 AND a.id3 = a.id4) or it might span two 
tables with an arbitrary number of fields.

many thanks,

hans


--
Cybertec Schönig  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] cross column correlation revisted

2010-07-14 Thread Yeb Havinga


Heikki Linnakangas wrote:
However, the problem is how to represent and store the 
cross-correlation. For fields with low cardinality, like gender and 
boolean breast-cancer-or-not you can count the prevalence of all the 
different combinations, but that doesn't scale. Another often cited 
example is zip code + street address. There's clearly a strong 
correlation between them, but how do you represent that?


For scalar values we currently store a histogram. I suppose we could 
create a 2D histogram for two columns, but that doesn't actually help 
with the zip code + street address problem.
In my head the neuron for 'principle component analysis' went on while 
reading this. Back in college it was used to prepare input data before 
feeding it into a neural network. Maybe ideas from PCA could be helpful?


regards,
Yeb Havinga



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] cross column correlation revisted

2010-07-14 Thread Tom Lane

Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 On 14/07/10 13:12, PostgreSQL - Hans-Jürgen Schönig wrote:
 maybe somehow like this ...
 ALTER TABLE x SET CORRELATION STATISTICS FOR (id = id2 AND id3=id4)

 +1 is my general feeling, it's good if you can tell the system to 
 collect additional statistics where needed.

The previous discussions about this went in the direction of
automatically collect stats if there is an index on that combination of
columns.  Do we really need a command?

 However, the problem is how to represent and store the 
 cross-correlation.

Yes, whatever the triggering mechanism is for collecting cross-column
stats, actually doing something useful is the hard part.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] cross column correlation revisted

2010-07-14 Thread Joshua Tolley

On Wed, Jul 14, 2010 at 01:21:19PM +0200, Yeb Havinga wrote:
 Heikki Linnakangas wrote:
 However, the problem is how to represent and store the  
 cross-correlation. For fields with low cardinality, like gender and  
 boolean breast-cancer-or-not you can count the prevalence of all the  
 different combinations, but that doesn't scale. Another often cited  
 example is zip code + street address. There's clearly a strong  
 correlation between them, but how do you represent that?

 For scalar values we currently store a histogram. I suppose we could  
 create a 2D histogram for two columns, but that doesn't actually help  
 with the zip code + street address problem.
 In my head the neuron for 'principle component analysis' went on while  
 reading this. Back in college it was used to prepare input data before  
 feeding it into a neural network. Maybe ideas from PCA could be helpful?

I've been playing off and on with an idea along these lines, which builds an
empirical copula[1] to represent correlations between columns where there
exists a multi-column index containing those columns. This copula gets stored
in pg_statistic. There are plenty of unresolved questions (and a crash I
introduced and haven't had time to track down), but the code I've been working
on is here[2] in the multicolstat branch. Most of the changes are in
analyze.c; no user-visible changes have been introduced. For that matter,
there aren't any changes yet actually to use the values once calculated (more
unresolved questions get in the way there), but it's a start.

[1] http://en.wikipedia.org/wiki/Copula_(statistics)
[2] http://git.postgresql.org/gitweb?p=users/eggyknap/postgres.git

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com


signature.asc
Description: Digital signature

Re: [HACKERS] cross column correlation revisted

2010-07-14 Thread PostgreSQL - Hans-Jürgen Schönig

hello tom,

i think that having stats on an index is a problem by itself for 2 reasons - 
for cross column correlation at least:

a.) joins cannot be covered by an index on two tables - we would fix 
inside a table correlation problems but not joins.
b.) who says that there is actually an index in place? assume you are 
doing some big seq scan to do analytics. you don't want it to be indexed for 10 
different types of queries.

i think i is pretty hard to determine automatically what to collect because we 
cannot know which permutations of cross-column magic people will use.
i was thinking along the line of having it automatic as well but i could not 
figure out how to do it.
i think we can suggest addition stats to the user and we can write tools to 
figure our somehow what would be useful but personally i cannot see anything 
which is better than a command here.

many thanks,

hans



On Jul 14, 2010, at 4:01 PM, Tom Lane wrote:

 Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 On 14/07/10 13:12, PostgreSQL - Hans-Jürgen Schönig wrote:
 maybe somehow like this ...
 ALTER TABLE x SET CORRELATION STATISTICS FOR (id = id2 AND id3=id4)
 
 +1 is my general feeling, it's good if you can tell the system to 
 collect additional statistics where needed.
 
 The previous discussions about this went in the direction of
 automatically collect stats if there is an index on that combination of
 columns.  Do we really need a command?
 
 However, the problem is how to represent and store the 
 cross-correlation.
 
 Yes, whatever the triggering mechanism is for collecting cross-column
 stats, actually doing something useful is the hard part.
 
   regards, tom lane
 
 -- 
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers
 


--
Cybertec Schönig  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] cross column correlation revisted

2010-07-14 Thread Tom Lane

=?iso-8859-1?Q?PostgreSQL_-_Hans-J=FCrgen_Sch=F6nig?= postg...@cybertec.at 
writes:
 i think that having stats on an index is a problem by itself for 2 reasons - 
 for cross column correlation at least:

   a.) joins cannot be covered by an index on two tables - we would fix 
 inside a table correlation problems but not joins.

Your proposed command didn't cover the two-table case either, and anyway
what the heck do you mean by cross-correlation across tables?
Cross-correlation is about the correlation between values in the same
row.

   b.) who says that there is actually an index in place?

If the combination of columns is actually interesting, there might well
be an index in place, or the DBA might be willing to create it.  For
that matter, have you considered the idea of examining the index
contents to derive the statistics?  Might work better than trying to get
numbers via ANALYZE.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] cross column correlation revisted

2010-07-14 Thread Andrew Dunstan




Tom Lane wrote:

If the combination of columns is actually interesting, there might well
be an index in place, or the DBA might be willing to create it.  


I'm having a hard time imagining an interesting case where that wouldn't 
be so.



For
that matter, have you considered the idea of examining the index
contents to derive the statistics?  Might work better than trying to get
numbers via ANALYZE.

  


Sounds like a good idea.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] cross column correlation revisted

2010-07-14 Thread PostgreSQL - Hans-Jürgen Schönig

hello ...

look at the syntax i posted in more detail:

  ALTER TABLE x SET CORRELATION STATISTICS FOR (x.id = y.id AND x.id2 = 
 y.id2)
 



it says X and Y ...
the selectivity of joins are what i am most interested in. cross correlation of 
columns within the same table are just a byproduct.
the core thing is: how can i estimate the number of rows returned from a join?

an example would be: you have a email accounts + messages. you know that each 
row will match in a join as you can assume that every account will have a 
message.
what we need is a syntax which covers the join case and the case where columns 
inside the same table correlate.
and the fact that an index cannot cover two tables leads me to the conclusion 
that stats on an index are not the solution to the join problem.

many thanks,

hans


On Jul 14, 2010, at 4:22 PM, Tom Lane wrote:

 =?iso-8859-1?Q?PostgreSQL_-_Hans-J=FCrgen_Sch=F6nig?= postg...@cybertec.at 
 writes:
 i think that having stats on an index is a problem by itself for 2 reasons - 
 for cross column correlation at least:
 
  a.) joins cannot be covered by an index on two tables - we would fix 
 inside a table correlation problems but not joins.
 
 Your proposed command didn't cover the two-table case either, and anyway
 what the heck do you mean by cross-correlation across tables?
 Cross-correlation is about the correlation between values in the same
 row.
 
  b.) who says that there is actually an index in place?
 
 If the combination of columns is actually interesting, there might well
 be an index in place, or the DBA might be willing to create it.  For
 that matter, have you considered the idea of examining the index
 contents to derive the statistics?  Might work better than trying to get
 numbers via ANALYZE.
 
   regards, tom lane
 
 -- 
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers
 


--
Cybertec Schönig  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] cross column correlation revisted

2010-07-14 Thread Joshua Tolley

On Wed, Jul 14, 2010 at 04:41:01PM +0200, PostgreSQL - Hans-Jürgen Schönig 
wrote:
 hello ...
 
 look at the syntax i posted in more detail:
 
 ALTER TABLE x SET CORRELATION STATISTICS FOR (x.id = y.id AND x.id2 = 
  y.id2)
  
 it says X and Y ...
 the selectivity of joins are what i am most interested in. cross correlation 
 of columns within the same table are just a byproduct.
 the core thing is: how can i estimate the number of rows returned from a join?

All the discussion of this topic that I've seen has been limited to the single
table case. The hard problem in that case is coming up with something you can
precalculate that will actually be useful during query planning, without
taking too much disk, memory, CPU, or something else. Expanding the discussion
to include join relations certainly still has valid use cases, but is even
harder, because you've also got to keep track of precisely how the underlying
relations are joined, so you know in what context the statistics remain valid.
So it makes sense to tackle the single table version first. Once it's
implemented somehow, and has been proven sufficiently effective to merit the
increased code size and complexity, we can consider expanding it to joined
relations.

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com


signature.asc
Description: Digital signature

Re: [HACKERS] cross column correlation revisted

2010-07-14 Thread Robert Haas

2010/7/14 Tom Lane t...@sss.pgh.pa.us:
 If the combination of columns is actually interesting, there might well
 be an index in place, or the DBA might be willing to create it.

Indexes aren't free, though, nor even close to it.

Still, I think we should figure out the underlying mechanism first and
then design the interface afterwards.  One idea I had was a way to say
compute the MCVs and histogram buckets for this table WHERE
predicate.  If you can prove predicate for a particular query, you
can use the more refined statistics in place of the full-table
statistics.  This is fine for the breast cancer case, but not so
useful for the zip code/street name case (which seems to be the really
tough one).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] cross column correlation revisted

2010-07-14 Thread marcin mank

On Wed, Jul 14, 2010 at 5:13 PM, Robert Haas robertmh...@gmail.com wrote:
 2010/7/14 Tom Lane t...@sss.pgh.pa.us:
 If the combination of columns is actually interesting, there might well
 be an index in place, or the DBA might be willing to create it.

 Indexes aren't free, though, nor even close to it.

 Still, I think we should figure out the underlying mechanism first and
 then design the interface afterwards.  One idea I had was a way to say
 compute the MCVs and histogram buckets for this table WHERE
 predicate.  If you can prove predicate for a particular query, you
 can use the more refined statistics in place of the full-table
 statistics.  This is fine for the breast cancer case, but not so
 useful for the zip code/street name case (which seems to be the really
 tough one).


One way of dealing with the zipcode problem is estimating NDST =
count(distinct row(zipcode, street))  - i.e. multi-column ndistinct.

Then the planner doesn`t have to assume that the selectivity of a
equality condition involving both zipcode and city is a multiple of
the respective selectivities. As a first cut it can assume that it
will get count(*) / NDST rows, but there are ways to improve it.

Greetings
Marcin Mańk

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] cross column correlation revisted

2010-07-14 Thread Dimitri Fontaine

Joshua Tolley eggyk...@gmail.com writes:
ALTER TABLE x SET CORRELATION STATISTICS FOR (x.id =3D y.id AND x.id=
2 =3D y.id2)
 =20
 it says X and Y ...  the selectivity of joins are what i am most
 interested in. cross correlation of columns within the same table are
 just a byproduct.  the core thing is: how can i estimate the number
 of rows returned from a join?

 All the discussion of this topic that I've seen has been limited to the s=
ingle
 table case. The hard problem in that case is coming up with something you=
can
 precalculate that will actually be useful during query planning, without
 taking too much disk, memory, CPU, or something else. Expanding the discu=
ssion
 to include join relations certainly still has valid use cases, but is even
 harder, because you've also got to keep track of precisely how the underl=
ying
 relations are joined, so you know in what context the statistics remain v=
alid.

Well I've been proposing to handle the correlation problem in another
way in some past mails here, and I've been trying to write it down too:

  http://archives.postgresql.org/pgsql-performance/2009-06/msg00118.php
  http://tapoueh.org/char10.html#sec13

What I propose is to extend ANALYZE to be able to work on a VIEW too,
rather than just a table. The hard parts seems to be:

a. what stats to record, exploiting the view definition the best we can
b. how to match a user query against the view definitions we have in
order to actually use the stats

If you have answers or good ideas=C2=A0:)

Regards,
--=20
dim


-- 
dim


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] cross column correlation revisted

Re: [HACKERS] cross column correlation revisted

Re: [HACKERS] cross column correlation revisted

[HACKERS] cross column correlation revisted

Re: [HACKERS] cross column correlation revisted

Re: [HACKERS] cross column correlation revisted

Re: [HACKERS] cross column correlation revisted

Re: [HACKERS] cross column correlation revisted

Re: [HACKERS] cross column correlation revisted

Re: [HACKERS] cross column correlation revisted

Re: [HACKERS] cross column correlation revisted

Re: [HACKERS] cross column correlation revisted

Re: [HACKERS] cross column correlation revisted

Re: [HACKERS] cross column correlation revisted

Re: [HACKERS] cross column correlation revisted

Re: [HACKERS] cross column correlation revisted

Re: [HACKERS] cross column correlation revisted

17 matches

Site Navigation

Mail list logo

Footer information