[PERFORM] CREATE STATISTICS and join selectivity

2017-11-15 Thread Justin Pryzby
I just noticed that PG10 CREATE STATISTICS (dependencies) doesn't seem to work for joins on multiple columns; is that right? With statistics on table for 2017 but not 20171110: ts=# CREATE STATISTICS x ON site_id,sect_id FROM eric_enodeb_cell_2017; ts=# ANALYZE VERBOSE eric_enodeb_cell_20

Re: [PERFORM] query performance issue

2017-11-15 Thread Justin Pryzby
On Wed, Nov 15, 2017 at 03:03:39PM +0530, Samir Magar wrote: > I am having performance issues with one of the query. > The query is taking 39 min to fetch 3.5 mil records. > > I want to reduce that time to 15 mins. > could you please suggest something to its performance? > "HashAggregate (cost=4

Re: [PERFORM] overestimate on empty table

2017-11-11 Thread Justin Pryzby
On Fri, Nov 10, 2017 at 04:19:41PM -0500, Tom Lane wrote: > Justin Pryzby writes: > > (or, the opposite of the more common problem) > > As the queued_alters table is typically empty (and autoanalyzed with > > relpages=0), I see "why": > > >

[PERFORM] overestimate on empty table

2017-11-10 Thread Justin Pryzby
(or, the opposite of the more common problem) I wrote this query some time ago to handle "deferred" table-rewriting type promoting ALTERs of a inheritence children, to avoid worst-case disk usage altering the whole heirarchy, and also locking the entire heirarchy against SELECT and INSERT. ts=# e

Re: [PERFORM] Performance loss upgrading from 9.3 to 9.6

2017-11-06 Thread Justin Pryzby
On 11/6/17, 9:21 AM, "Justin Pryzby" wrote: > see if statistics improve: > SELECT (SELECT sum(x) FROM unnest(most_common_freqs) x) frac_MCV, > tablename, attname, n_distinct, array_length(most_common_vals,1) n_mcv, > FROM pg_stats WHERE attname~'customers_

Re: [PERFORM] Performance loss upgrading from 9.3 to 9.6

2017-11-06 Thread Justin Pryzby
On Mon, Nov 06, 2017 at 01:18:00PM +, Adam Torres wrote: > Good morning all, > > We have a problem with performance after upgrading from 9.3 to 9.6 where > certain queries take 9 times longer to run. On our initial attempt to > upgrade, we noticed the system as a whole was taking longer to

Re: [PERFORM] performance drop after upgrade (9.6 > 10)

2017-10-26 Thread Justin Pryzby
On Tue, Oct 24, 2017 at 04:15:59PM +0200, johannes graën wrote: > Hi Pavel, *, > > you were right with ANALYZing the DB first. However, even after doing > so, I frequently see Seq Scans where an index was used before. This > usually cooccurs with parallelization and looked different before > upgra

Re: [PERFORM] performance drop after upgrade (9.6 > 10)

2017-10-24 Thread Justin Pryzby
On Tue, Oct 24, 2017 at 04:15:59PM +0200, johannes graën wrote: > upgrading to 10. I can provide an example for 10 [1], but I cannot > generate a query plan for 9.6 anymore. You could (re)install PG96 alongside PG10 and run a copy of the DB (even from your homedir, or on a difference server) and p

Re: [PERFORM] Partitioned table - scans through every partitions

2017-08-25 Thread Justin Pryzby
On Fri, Aug 25, 2017 at 03:36:29PM +, Aniko Belim wrote: > Hi, > > We have an issue with one of our partitioned tables. It has a column with > timestamp without time zone type, and we had to partition it daily. To do > that, we created the following constraints like this example: > CHECK (to

Re: [PERFORM] Performance Issue -- "Materialize"

2017-08-21 Thread Justin Pryzby
On Sat, Aug 19, 2017 at 10:37:56AM -0700, anand086 wrote: > +---+| > > QUER

Re: [PERFORM] Very poor read performance, query independent

2017-07-18 Thread Justin Pryzby
On Tue, Jul 18, 2017 at 02:13:58PM -0300, Claudio Freire wrote: > On Tue, Jul 18, 2017 at 1:01 PM, Claudio Freire > wrote: > > On Tue, Jul 18, 2017 at 6:20 AM, Charles Nadeau > > wrote: > >> Claudio, > >> > >> At one moment > >> during the query, there is a write storm to the swap drive (a bit l

estimate correlation of index separately from table (Re: [PERFORM] index fragmentation on insert-only table with non-unique column)

2017-07-07 Thread Justin Pryzby
Months ago I reported an issue with very slow index scan of tables with high "correlation" of its indexed column, due to (we concluded at the time) duplicate/repeated values of that column causing many lseek()s. References: https://www.postgresql.org/message-id/flat/20160524173914.GA11880%40telsa

Re: [PERFORM] Re: join under-estimates with ineq conditions

2017-06-15 Thread Justin Pryzby
I never heard back but was hoping for some feedback/discussion about this 2nd problem/patch. just a reminder - Thanks On Thu, Jun 08, 2017 at 11:05:38AM -0500, Justin Pryzby wrote: > On Mon, Jun 05, 2017 at 05:02:32PM -0400, Tom Lane wrote: > > Justin Pryzby writes: > > >

Re: [PERFORM] Re: join under-estimates with ineq conditions

2017-06-08 Thread Justin Pryzby
On Mon, Jun 05, 2017 at 05:02:32PM -0400, Tom Lane wrote: > Justin Pryzby writes: > > diff --git a/src/backend/utils/adt/selfuncs.c > > b/src/backend/utils/adt/selfuncs.c > > + if (nd1>vardata1->rel->rows) nd1=vardata1->rel->rows; > > + if (

[PERFORM] Re: join estimate of subqueries with range conditions and constraint exclusion

2017-05-30 Thread Justin Pryzby
On Wed, May 24, 2017 at 04:17:30PM -0500, Justin Pryzby wrote: > We got bitten again by what appears to be the same issue I reported (perhaps > poorly) here: > https://www.postgresql.org/message-id/20170326193344.GS31628%40telsasoft.com > I'm diagnosing a bad estimate/plan due to

[PERFORM] join estimate of subqueries with range conditions and constraint exclusion

2017-05-24 Thread Justin Pryzby
We got bitten again by what appears to be the same issue I reported (perhaps poorly) here: https://www.postgresql.org/message-id/20170326193344.GS31628%40telsasoft.com We have PG9.6.3 table heirarchies partitioned by time. Our reports use subqueries each with their own copies of a range clauses o

Re: [PERFORM] Performance issue in PostgreSQL server...

2017-04-27 Thread Justin Pryzby
On Mon, Mar 06, 2017 at 12:17:22PM +, Dinesh Chandra 12108 wrote: > Below is the output of Query SELECT * FROM pg_stats WHERE tablename='point' > AND attname='domain_class_id' ; > > > schemaname | tablename | attname | inherited | null_frac | avg_width > | n_distinct | most_common_v

[PERFORM] self join estimate and constraint exclusion

2017-04-14 Thread Justin Pryzby
I mailed last month [0] but didn't see any reponse .. (if I'm being naive, daft, or missing something simple, please just say so). [0] https://www.postgresql.org/message-id/20170326193344.GS31628%40telsasoft.com It seems when self (inner/equi) joining there's two bad alternatives: either specify

[PERFORM] self join estimate and constraint exclusion

2017-03-26 Thread Justin Pryzby
It seems when self (inner/equi) joining there's two bad alternatives: either specify a where clause for each self-joined table and incur poor estimate and plan, due to incorrect perceived independence of clauses, even though joined column ought to be known equal; or, specify where clause only once,

Re: [PERFORM] Performance issue in PostgreSQL server...

2017-03-05 Thread Justin Pryzby
On Sun, Mar 05, 2017 at 08:23:08PM -0800, Jeff Janes wrote: > On Fri, Mar 3, 2017 at 4:44 AM, Dinesh Chandra 12108 > wrote: > > The below is the output for psql=> EXPLAIN ANALYZE SELECT DISTINCT > > feature_id FROM evidence.point p INNER JOIN evidence.observation_evidence > > oe ON p.feature_id =

Re: [PERFORM] no MCV list of tiny table with unique columns

2016-11-05 Thread Justin Pryzby
On Wed, Nov 02, 2016 at 07:48:23PM -0400, Tom Lane wrote: > There's not a lot of point in worrying about your two-row table when these > other estimates are off by multiple orders of magnitude. In this > particular case my first bet would be that the planner has no idea about > the selectivity of

Re: [PERFORM] no MCV list of tiny table with unique columns

2016-11-02 Thread Justin Pryzby
On Wed, Nov 02, 2016 at 04:05:46PM -0400, Tom Lane wrote: > Justin Pryzby writes: > > I believe the join is being (badly) underestimated, leading to a crappy plan > > involving multiple nested loop joins, which takes 2.5 hours instead of a > > handful of seconds; I believe t

[PERFORM] no MCV list of tiny table with unique columns

2016-11-02 Thread Justin Pryzby
We have a report query which joins (multiple times, actually) against this trivial, tiny table: ts=# \d bsm_to_switch Table "public.bsm_to_switch" Column | Type | Modifiers +--+--- bsm| text | not null switch | text | not null ts=# SELECT length(bsm), length(switch) FR

Re: [PERFORM] index fragmentation on insert-only table with non-unique column

2016-08-13 Thread Justin Pryzby
Regarding this earlier thread: https://www.postgresql.org/message-id/flat/20160524173914.GA11880%40telsasoft.com#20160524173914.ga11...@telsasoft.com On Tue, May 24, 2016 at 10:39 AM, Justin Pryzby wrote: > Summary: Non-unique btree indices are returning CTIDs for rows with same > va

Re: [PERFORM] index fragmentation on insert-only table with non-unique column

2016-06-03 Thread Justin Pryzby
On Fri, Jun 03, 2016 at 06:26:33PM -0300, Claudio Freire wrote: > On Wed, May 25, 2016 at 11:00 AM, Justin Pryzby wrote: > >> > First, I found I was able to get 30-50min query results on full week's > >> > table by > >> > prefering a seq scan to a

Re: [PERFORM] index fragmentation on insert-only table with non-unique column

2016-05-25 Thread Justin Pryzby
On Tue, May 24, 2016 at 11:23:48PM -0700, Jeff Janes wrote: > > But note the non-uniqueness of the index column: > > ts=# SELECT recordopeningtime, COUNT(1) FROM cdrs_huawei_pgwrecord WHERE > > recordopeningtime>='2016-05-21' AND recordopeningtime<'2016-05-22' GROUP BY > > 1 ORDER BY 2 DESC; > >

Re: [PERFORM] index fragmentation on insert-only table with non-unique column

2016-05-25 Thread Justin Pryzby
On Tue, May 24, 2016 at 09:16:20PM -0700, Peter Geoghegan wrote: > On Tue, May 24, 2016 at 10:39 AM, Justin Pryzby wrote: > > Postgres seems to assume that the high degree of correlation of the table > > column seen in pg_stats is how it will get data from the index scan, which

[PERFORM] index fragmentation on insert-only table with non-unique column

2016-05-24 Thread Justin Pryzby
Summary: Non-unique btree indices are returning CTIDs for rows with same value of indexed column not in logical order, imposing a high performance penalty. Running PG 9.5.3 now, we have a time-based partitions of append-only tables with data loaded from other sources. The tables are partitioned b

Re: [PERFORM] Clarification on using pg_upgrade

2016-03-04 Thread Justin Pryzby
On Fri, Mar 04, 2016 at 02:27:59PM -0800, Tory M Blue wrote: > If my data is located in /data > > and I link to a new dir in /data1, what actually happens. do I end up with > 2 file systems and links and thus am not able to delete or cleanup any old > data, or how does this work? > > Also will t

[PERFORM] dump time increase by 1h with new kernel

2009-10-03 Thread Justin Pryzby
[I got no response on -general for a few days so I'm trying here] When we upgraded from linux-2.6.24 to linux-2.6.27, our pg_dump duration increased by 20% from 5 hours to 6. My first attempt at resolution was to boot with elevator=deadline. However that's actually the default IO scheduler in bo

[PERFORM] dump time increase by 1h with new kernel

2009-10-02 Thread Justin Pryzby
[I got no response on -general for a few days so I'm trying here] When we upgraded from linux-2.6.24 to linux-2.6.27, our pg_dump duration increased by 20% from 5 hours to 6. My first attempt at resolution was to boot with elevator=deadline. However that's actually the default IO scheduler in bo