On Sun, Feb 12, 2017 at 10:35:04AM +, Dean Rasheed wrote:
> On 11 February 2017 at 01:17, Tomas Vondra
> wrote:
> > Thanks for the feedback, I'll fix this. I've allowed myself to be a bit
> > sloppy because the number of attributes in the statistics is currently
On 11 February 2017 at 01:17, Tomas Vondra wrote:
> Thanks for the feedback, I'll fix this. I've allowed myself to be a bit
> sloppy because the number of attributes in the statistics is currently
> limited to 8, so the overflows are currently not an issue. But it
On 02/08/2017 07:40 PM, Dean Rasheed wrote:
On 8 February 2017 at 16:09, David Fetter wrote:
Combinations are n!/(k! * (n-k)!), so computing those is more
along the lines of:
unsigned long long
choose(unsigned long long n, unsigned long long k) {
if (k > n) {
On 8 February 2017 at 16:09, David Fetter wrote:
> Combinations are n!/(k! * (n-k)!), so computing those is more
> along the lines of:
>
> unsigned long long
> choose(unsigned long long n, unsigned long long k) {
> if (k > n) {
> return 0;
> }
> unsigned long
On Wed, Feb 08, 2017 at 03:23:25PM +, Dean Rasheed wrote:
> On 6 February 2017 at 21:26, Alvaro Herrera wrote:
> > Tomas Vondra wrote:
> >> On 02/01/2017 11:52 PM, Alvaro Herrera wrote:
> >
> >> > Nearby, some auxiliary functions such as n_choose_k and
> >> >
On 6 February 2017 at 21:26, Alvaro Herrera wrote:
> Tomas Vondra wrote:
>> On 02/01/2017 11:52 PM, Alvaro Herrera wrote:
>
>> > Nearby, some auxiliary functions such as n_choose_k and
>> > num_combinations are not documented. What it is that they do? I'd
>> > move these
Still about 0003. dependencies.c comment at the top of the file should
contain some details about what is it implementing and a general
description of the algorithm and data structures. As before, it's best
to have the main entry point build_mv_dependencies at the top, the other
public
Looking at 0003, I notice that gram.y is changed to add a WITH ( .. )
clause. If it's not specified, an error is raised. If you create
stats with (ndistinct) then you can't alter it later to add
"dependencies" or whatever; unless I misunderstand, you have to drop the
statistics and create
Tomas Vondra wrote:
> On 02/01/2017 11:52 PM, Alvaro Herrera wrote:
> > Nearby, some auxiliary functions such as n_choose_k and
> > num_combinations are not documented. What it is that they do? I'd
> > move these at the end of the file, keeping the important entry points
> > at the top of the
On Thu, Feb 2, 2017 at 3:59 AM, Tomas Vondra
wrote:
> There's a subtle difference between pg_node_tree and the data types for
> statistics - pg_node_tree stores the value as a string (matching the
> nodeToString output), so the _in function is fairly simple. Of
On 02/01/2017 11:52 PM, Alvaro Herrera wrote:
Still looking at 0002.
pg_ndistinct_in disallows input, claiming that pg_node_tree does the
same thing. But pg_node_tree does it for security reasons: you could
crash the backend if you supplied a malicious value. I don't think
that applies to
Still looking at 0002.
pg_ndistinct_in disallows input, claiming that pg_node_tree does the
same thing. But pg_node_tree does it for security reasons: you could
crash the backend if you supplied a malicious value. I don't think that
applies to pg_ndistinct_in. Perhaps it will be useful to
On 01/31/2017 07:52 AM, Amit Langote wrote:
On 2017/01/31 6:57, Tomas Vondra wrote:
On 01/30/2017 09:37 PM, Alvaro Herrera wrote:
Looks good to me. I don't think we need to keep the names very short --
I would propose "standistinct", "stahistogram", "stadependencies".
Yeah, I got annoyed
On 2017/01/31 6:57, Tomas Vondra wrote:
> On 01/30/2017 09:37 PM, Alvaro Herrera wrote:
>> Looks good to me. I don't think we need to keep the names very short --
>> I would propose "standistinct", "stahistogram", "stadependencies".
>>
>
> Yeah, I got annoyed by the short names too.
>
> This
On Tue, Jan 31, 2017 at 6:57 AM, Tomas Vondra
wrote:
> This however reminds me that perhaps pg_mv_statistic is not the best name. I
> know others proposed pg_statistic_ext (and pg_stats_ext), and while I wasn't
> a big fan initially, I think it's a better name.
On 01/30/2017 09:37 PM, Alvaro Herrera wrote:
Tomas Vondra wrote:
The 'built' flags may be easily replaced with a check if the bytea-like
columns are NULL, and the 'enabled' columns may be replaced by the array of
char, just like you proposed.
That'd give us a single catalog looking like
Tomas Vondra wrote:
> The 'built' flags may be easily replaced with a check if the bytea-like
> columns are NULL, and the 'enabled' columns may be replaced by the array of
> char, just like you proposed.
>
> That'd give us a single catalog looking like this:
>
> pg_mv_statistics
> starelid
>
On 01/30/2017 05:12 PM, Alvaro Herrera wrote:
>
Hmm. So we have a catalog pg_mv_statistics which stores two things:
1. the configuration regarding mvstats that have been requested by user
via CREATE/ALTER STATISTICS
2. the actual values captured from the above, via ANALYZE
I think this
On 01/30/2017 05:55 PM, Alvaro Herrera wrote:
Minor nitpicks:
Let me suggest to use get_attnum() in CreateStatistics instead of
SearchSysCacheAttName for each column. Also, we use type AttrNumber for
attribute numbers rather than int16. Finally in the same function you
have an erroneous
Hello,
On 01/26/2017 10:03 AM, Ideriha, Takeshi wrote:
Though I pointed out these typoes and so on,
I believe these feedback are less priority compared to the source code itself.
So please work on my feedback if you have time.
I think getting the comments (and docs in general) right is
On 01/26/2017 10:43 AM, Dilip Kumar wrote:
histograms
--
+ if (matches[i] == MVSTATS_MATCH_FULL)
+ s += mvhist->buckets[i]->ntuples;
+ else if (matches[i] == MVSTATS_MATCH_PARTIAL)
+ s += 0.5 * mvhist->buckets[i]->ntuples;
Isn't it will be better that take some percentage of the
Minor nitpicks:
Let me suggest to use get_attnum() in CreateStatistics instead of
SearchSysCacheAttName for each column. Also, we use type AttrNumber for
attribute numbers rather than int16. Finally in the same function you
have an erroneous ERRCODE_UNDEFINED_COLUMN which should be
Tomas Vondra wrote:
> On 01/03/2017 05:22 PM, Tomas Vondra wrote:
> > On 01/03/2017 02:42 PM, Dilip Kumar wrote:
> ...
> > > I think it should be easily reproducible, in case it's not I can send
> > > call stack or core dump.
> > >
> >
> > Thanks for the report. It was trivial to reproduce and
Hello, I'll return on this since this should welcome more eyeballs.
At Thu, 26 Jan 2017 09:03:10 +, "Ideriha, Takeshi"
wrote in
<4E72940DA2BF16479384A86D54D0988A565822A9@G01JPEXMBKW04>
> Hi
>
> When you have time, could you rebase the pathes?
> Some
On Thu, Jan 5, 2017 at 3:27 AM, Tomas Vondra
wrote:
> Thanks. Those plans match my experiments with the TPC-H data set, although
> I've been playing with the smallest scale (1GB).
>
> It's not very difficult to make the estimation error arbitrary large, e.g.
> by
Hi
When you have time, could you rebase the pathes?
Some patches cannot be applied to the current HEAD.
0001 patch can be applied but the following 0002 patch cannot be.
I've just started reading your patch (mainly docs and README, not yet source
code.)
Though these are minor things, I've
On Wed, Jan 25, 2017 at 9:56 PM, Alvaro Herrera
wrote:
> Michael Paquier wrote:
>> And nothing has happened since. Are there people willing to review
>> this patch and help it proceed?
>
> I am going to grab this patch as committer.
Thanks, that's good to know.
--
Michael Paquier wrote:
> On Wed, Jan 4, 2017 at 11:35 AM, Tomas Vondra
> wrote:
> > Attached is v22 of the patch series, rebased to current master and fixing
> > the reported bug. I haven't made any other changes - the issues reported by
> > Petr are mostly minor,
On Wed, Jan 4, 2017 at 11:35 AM, Tomas Vondra
wrote:
> On 01/03/2017 05:22 PM, Tomas Vondra wrote:
>>
>> On 01/03/2017 02:42 PM, Dilip Kumar wrote:
>
> ...
>>>
>>> I think it should be easily reproducible, in case it's not I can send
>>> call stack or core dump.
>>>
On 01/04/2017 03:21 PM, Dilip Kumar wrote:
On Wed, Jan 4, 2017 at 8:05 AM, Tomas Vondra
wrote:
Attached is v22 of the patch series, rebased to current master and fixing
the reported bug. I haven't made any other changes - the issues reported by
Petr are mostly
On Wed, Jan 4, 2017 at 8:05 AM, Tomas Vondra
wrote:
> Attached is v22 of the patch series, rebased to current master and fixing
> the reported bug. I haven't made any other changes - the issues reported by
> Petr are mostly minor, so I've decided to wait a bit more
On 12/30/2016 02:12 PM, Petr Jelinek wrote:
On 12/12/16 22:50, Tomas Vondra wrote:
On 12/12/2016 12:26 PM, Amit Langote wrote:
Hi Tomas,
On 2016/10/30 4:23, Tomas Vondra wrote:
Hi,
Attached is v20 of the multivariate statistics patch series, doing
mostly
the changes outlined in the
On 01/03/2017 02:42 PM, Dilip Kumar wrote:
On Tue, Dec 13, 2016 at 3:20 AM, Tomas Vondra
wrote:
attached is v21 of the patch series, rebased to current master (resolving
the duplicate OID and a few trivial merge conflicts), and also fixing some
of the issues you
On Tue, Dec 13, 2016 at 3:20 AM, Tomas Vondra
wrote:
> attached is v21 of the patch series, rebased to current master (resolving
> the duplicate OID and a few trivial merge conflicts), and also fixing some
> of the issues you reported.
I wanted to test the grouping
On 12/12/16 22:50, Tomas Vondra wrote:
> On 12/12/2016 12:26 PM, Amit Langote wrote:
>>
>> Hi Tomas,
>>
>> On 2016/10/30 4:23, Tomas Vondra wrote:
>>> Hi,
>>>
>>> Attached is v20 of the multivariate statistics patch series, doing
>>> mostly
>>> the changes outlined in the preceding e-mail from
On 12/12/16 22:50, Tomas Vondra wrote:
>> +
>> +SELECT pg_mv_stats_dependencies_show(stadeps)
>> + FROM pg_mv_statistic WHERE staname = 's1';
>> +
>> + pg_mv_stats_dependencies_show
>> +---
>> + (1) => 2, (2) => 1
>> +(1 row)
>> +
>>
>> Couldn't this somehow show
Hi Tomas,
On 2016/10/30 4:23, Tomas Vondra wrote:
> Hi,
>
> Attached is v20 of the multivariate statistics patch series, doing mostly
> the changes outlined in the preceding e-mail from October 11.
>
> The patch series currently has these parts:
>
> * 0001 : (FIX) teach pull_varno about
Hi everyone,
thanks for the reviews. Let me sum the feedback so far, and outline my
plans for the next patch version that I'd like to submit for CF 2016-11.
1) syntax changes
I agree with the changes proposed by Dean, although only a subset of the
syntax is going to be supported until we
On 4 October 2016 at 09:15, Heikki Linnakangas wrote:
> However, for tables and views, each object you store in those views is a
> "table" or "view", but with this thing, the object you store is
> "statistics". Would you have a catalog table called "pg_scissor"?
>
No, probably
On 04/10/16 20:37, Dean Rasheed wrote:
On 4 October 2016 at 04:25, Michael Paquier wrote:
OK. A second thing was related to the use of schemas in the new system
catalogs. As mentioned in [1], those could be removed.
[1]:
On 10/04/2016 10:49 AM, Dean Rasheed wrote:
On 30 September 2016 at 12:10, Heikki Linnakangas wrote:
I fear that using "statistics" as the name of the new object might get a bit
awkward. "statistics" is a plural, but we use it as the name of a single
object, like "pants" or
On 30 September 2016 at 12:10, Heikki Linnakangas wrote:
> I fear that using "statistics" as the name of the new object might get a bit
> awkward. "statistics" is a plural, but we use it as the name of a single
> object, like "pants" or "scissors". Not sure I have any better
On 4 October 2016 at 04:25, Michael Paquier wrote:
> OK. A second thing was related to the use of schemas in the new system
> catalogs. As mentioned in [1], those could be removed.
> [1]:
>
On Mon, Oct 3, 2016 at 8:25 PM, Heikki Linnakangas wrote:
> Yeah. The idea was to use something like pg_node_tree to store all the
> different kinds of statistics, the histogram, the MCV, and the functional
> dependencies, in one datum. Or JSON, maybe. It sounds better than an
On 10/03/2016 04:46 AM, Michael Paquier wrote:
On Fri, Sep 30, 2016 at 8:10 PM, Heikki Linnakangas wrote:
This patch set is in pretty good shape, the only problem is that it's so big
that no-one seems to have the time or courage to do the final touches and
commit it.
Did you
On Fri, Sep 30, 2016 at 8:10 PM, Heikki Linnakangas wrote:
> This patch set is in pretty good shape, the only problem is that it's so big
> that no-one seems to have the time or courage to do the final touches and
> commit it.
Did you see my suggestions about simplifying its SQL
This patch set is in pretty good shape, the only problem is that it's so
big that no-one seems to have the time or courage to do the final
touches and commit it. If we just focus on the functional dependencies
part for now, I think we might get somewhere. I peeked at the MCV and
histogram
Hi,
Thanks for looking into this!
On 09/12/2016 04:08 PM, Dean Rasheed wrote:
On 3 August 2016 at 02:58, Tomas Vondra wrote:
Attached is v19 of the "multivariate stats" patch series
Hi,
I started looking at this - just at a very high level - I've not read
On 3 August 2016 at 02:58, Tomas Vondra wrote:
> Attached is v19 of the "multivariate stats" patch series
Hi,
I started looking at this - just at a very high level - I've not read
much of the detail yet, but here are some initial review comments.
I think the
On Wed, Aug 24, 2016 at 2:03 AM, Robert Haas wrote:
> ISTR that you were going to try to look at this patch set. It seems
> from the discussion that it's not really ready for serious
> consideration for commit yet, but also that some high-level design
> comments from you
On Tue, Aug 2, 2016 at 9:58 PM, Tomas Vondra
wrote:
> Attached is v19 of the "multivariate stats" patch series - essentially v18
> rebased on top of current master.
Tom:
ISTR that you were going to try to look at this patch set. It seems
from the discussion that
On 08/10/2016 06:41 AM, Michael Paquier wrote:
On Wed, Aug 3, 2016 at 10:58 AM, Tomas Vondra
wrote:
1) enriching the query tree with multivariate statistics info
Right now all the stuff related to multivariate statistics estimation
happens in clausesel.c -
On Thu, Aug 11, 2016 at 3:34 AM, Tomas Vondra
wrote:
> On 08/10/2016 02:23 PM, Michael Paquier wrote:
>>
>> On Wed, Aug 10, 2016 at 8:33 PM, Tomas Vondra
>> wrote:
>>> The idea is that the syntax should work even for statistics built on
On 08/10/2016 02:23 PM, Michael Paquier wrote:
On Wed, Aug 10, 2016 at 8:33 PM, Tomas Vondra
wrote:
On 08/10/2016 06:41 AM, Michael Paquier wrote:
Patch 0001: there have been comments about that before, and you have
put the checks on RestrictInfo in a couple of
On 08/10/2016 02:24 PM, Michael Paquier wrote:
On Wed, Aug 10, 2016 at 8:50 PM, Petr Jelinek wrote:
On 10/08/16 13:33, Tomas Vondra wrote:
On 08/10/2016 06:41 AM, Michael Paquier wrote:
On Wed, Aug 3, 2016 at 10:58 AM, Tomas Vondra
2) combining multiple statistics
On 08/10/2016 03:29 PM, Ants Aasma wrote:
On Wed, Aug 3, 2016 at 4:58 AM, Tomas Vondra
wrote:
2) combining multiple statistics
I think the ability to combine multivariate statistics (covering different
subsets of conditions) is important and useful, but I'm
On Wed, Aug 3, 2016 at 4:58 AM, Tomas Vondra
wrote:
> 2) combining multiple statistics
>
> I think the ability to combine multivariate statistics (covering different
> subsets of conditions) is important and useful, but I'm starting to think
> that the current
On Wed, Aug 10, 2016 at 8:50 PM, Petr Jelinek wrote:
> On 10/08/16 13:33, Tomas Vondra wrote:
>>
>> On 08/10/2016 06:41 AM, Michael Paquier wrote:
>>>
>>> On Wed, Aug 3, 2016 at 10:58 AM, Tomas Vondra
2) combining multiple statistics
I think the
On Wed, Aug 10, 2016 at 8:33 PM, Tomas Vondra
wrote:
> On 08/10/2016 06:41 AM, Michael Paquier wrote:
>> Patch 0001: there have been comments about that before, and you have
>> put the checks on RestrictInfo in a couple of variables of
>> pull_varnos_walker, so
On 10/08/16 13:33, Tomas Vondra wrote:
On 08/10/2016 06:41 AM, Michael Paquier wrote:
On Wed, Aug 3, 2016 at 10:58 AM, Tomas Vondra
2) combining multiple statistics
I think the ability to combine multivariate statistics (covering
different
subsets of conditions) is important and useful, but
On 08/10/2016 06:41 AM, Michael Paquier wrote:
On Wed, Aug 3, 2016 at 10:58 AM, Tomas Vondra
wrote:
...
But more importantly, I think we'll need to show some of the data in EXPLAIN
output. With per-column statistics it's fairly straightforward to determine
which
On Wed, Aug 3, 2016 at 10:58 AM, Tomas Vondra
wrote:
> 1) enriching the query tree with multivariate statistics info
>
> Right now all the stuff related to multivariate statistics estimation
> happens in clausesel.c - matching condition to statistics, selection of
>
On Sat, Aug 6, 2016 at 2:38 AM, Tomas Vondra
wrote:
> On 08/05/2016 06:24 AM, Michael Paquier wrote:
>>
>> On Wed, Aug 3, 2016 at 10:58 AM, Tomas Vondra
>> wrote:
>>>
>>> Attached is v19 of the "multivariate stats" patch series -
On 08/05/2016 06:24 AM, Michael Paquier wrote:
On Wed, Aug 3, 2016 at 10:58 AM, Tomas Vondra
wrote:
Attached is v19 of the "multivariate stats" patch series - essentially v18
rebased on top of current master. Aside from a few bug fixes, the main
improvement is
On Wed, Aug 3, 2016 at 10:58 AM, Tomas Vondra
wrote:
> Attached is v19 of the "multivariate stats" patch series - essentially v18
> rebased on top of current master. Aside from a few bug fixes, the main
> improvement is addition of SGML docs demonstrating the
65 matches
Mail list logo