Re: BRIN index which is much faster never chosen by planner

2019-10-10 Thread Michael Lewis
Since the optimizer is choosing a seq scan over index scan when it seems
like it has good row estimates in both cases, to me that may mean costs of
scanning index are expected to be high. Is this workload on SSD? Has the
random_page_cost config been decreased from default 4 (compared with cost
of 1 unit for sequential scan)?

Your buffer hits aren't great. What is shared_buffers set to? How much ram
on this cluster?

With this table being insert only, one assumes correlation is very high on
the data in this column as shown in pg_stats, but have your confirmed?

To me, distinct ON is often a bad code smell and probably can be re-written
to be much more efficient with GROUP BY, lateral & order by, or some other
tool. Same with the window function. It is a powerful tool, but sometimes
not the right one.

Is "source" a function that is called on field1? What is it doing/how is it
defined?


Re: BRIN index which is much faster never chosen by planner

2019-10-10 Thread Michael Lewis
On Thu, Oct 10, 2019 at 6:22 PM David Rowley 
wrote:

> The planner might be able to get a better estimate on the number of
> matching rows if the now() - interval '10 days' expression was
> replaced with 'now'::timestamptz - interval '10 days'. However, care
> would need to be taken to ensure the plan is never prepared since
> 'now' is evaluated during parse. The same care must be taken when
> creating views, functions, stored procedures and the like.
>
> The planner will just estimate the selectivity of now() - interval '10
> days'  by using DEFAULT_INEQ_SEL, which is 0., so it
> thinks it'll get 1/3rd of the table.  Using 'now' will allow the
> planner to lookup actual statistics on that column which will likely
> give a much better estimate, which by the looks of it, likely will
> result in one of those BRIN index being used.
>

This surprised me a bit, and would have significant implications. I tested
a few different tables in our system and get the same row count estimate
with either WHERE condition. Perhaps I am missing a critical piece of what
you said.

explain
select * from charges where posted_on > now() - interval '10 days';

explain
select * from charges where posted_on > 'now'::timestamptz  - interval '10
days';


Re: BRIN index which is much faster never chosen by planner

2019-10-15 Thread Michael Lewis
Thanks for closing the loop on the data correlation question. I've been
playing with BRIN indexes on a log table of sorts and this thread helped
clear up some of the behavior I have been seeing.

I am curious, would a partial btree index fit your needs? Perhaps the
maintenance overhead is too significant or this is too off-the-wall, but a
daily job to create new index and drop the old concurrently could give the
performance you need while still saving the extra disk space of the full
btree on the timestamp.

CREATE INDEX CONCURRENTLY log_table_rec_insert_time_partial_10_04 ON
log_table USING btree ( rec_insert_time ) WHERE rec_insert_time >
'2019-10-04'::DATE;
DROP INDEX CONCURRENTLY IF EXISTS log_table_rec_insert_time_partial_10_03;

I would consider including category column as well, but I suspect that
would increase the size of the index significantly. Of course, this depends
on the query planner evaluating that "l.rec_insert_time >= now() - interval
'10 days'" and determining that the index fulfills the need.

>


Proposal- GUC for max dead rows before autovacuum

2019-11-19 Thread Michael Lewis
To mitigate the need for per-table tuning of autovacuum configuration, I'd
like to propose a new GUC for autovacuum_vacuum_threshold_max.

Currently, it seems that I can either set autovacuum_vacuum_scale_factor
much smaller than default on tables with millions of rows, or set a value
globally that means small tables are auto vacuumed rarely.

The default value for this new setting value could be -1 or 0 to disable
the feature, or something like 100,000 perhaps so that tables with more
than 500, tuples are candidates for an autovacuum before they would
with current default values.