Re: Long running INSERT+SELECT query

2018-04-27 Thread Vitaliy Garnashevich
Thanks, everybody, for the suggestions! We're going to try using "SELECT 1 FROM table FOR KEY SHARE" for each of the tables which are referenced by the results, before running the big query. That should be up to a million rows in total. It will probably not cover the case when a record is
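A minimal sketch of the locking approach described above, with hypothetical table and column names (the actual schema is not shown in the thread):

```sql
-- Take weak row locks (FOR KEY SHARE) on every referenced row up front.
-- Concurrent transactions may still UPDATE non-key columns of these rows,
-- but changes to the key columns (and DELETEs) will block until we commit.
BEGIN;

SELECT 1 FROM products WHERE id IN (SELECT DISTINCT product FROM usage_raw)
FOR KEY SHARE;

SELECT 1 FROM users WHERE id IN (SELECT DISTINCT user_id FROM usage_raw)
FOR KEY SHARE;

-- ... the big INSERT ... SELECT runs here, inside the same transaction ...

COMMIT;
```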

Re: Long running INSERT+SELECT query

2018-04-26 Thread Vitaliy Garnashevich
Have not worked through all of the above, but a first draft suggestion: Move the SELECT minus the aggregation functions into a sub-query that uses FOR UPDATE. Then do the aggregation on the results of the sub-query. The aggregated table has hundreds of millions of rows, and the query runs
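The restructuring suggested above — locking rows in a FOR UPDATE sub-query and aggregating outside it — could look roughly like the sketch below (hypothetical names). The split is necessary because FOR UPDATE is not allowed in a query level that uses aggregate functions or GROUP BY:

```sql
INSERT INTO usage_summary (product, user_id, keystrokes)
SELECT product, user_id, sum(keystrokes)
FROM (
    -- Inner query: plain row selection, so FOR UPDATE is legal here.
    SELECT u.product, u.user_id, u.keystrokes
    FROM usage_raw u
    JOIN products p ON p.id = u.product
    FOR UPDATE OF u          -- lock the detail rows, not the aggregate
) locked
GROUP BY product, user_id;   -- aggregation happens on the locked rows
```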

Re: Long running INSERT+SELECT query

2018-04-26 Thread Vitaliy Garnashevich
Without the query we are flying blind, so suggestions will have a ? Here is one such query:     INSERT INTO cmdb_sp_usage_history   (created_by, updated_by, created_on, updated_on, mod_count,   summary_on, quarter, product, used_from, "user",   keystrokes, minutes_in_use,

Long running INSERT+SELECT query

2018-04-26 Thread Vitaliy Garnashevich
Hi, We're trying to populate a table with aggregated data from other tables. For that we're running a huge INSERT+SELECT query which joins several tables, aggregates values, and then inserts the results into another table. The problem we're facing is that while the query is running , some
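The general shape of the populate query being described, sketched with hypothetical names:

```sql
-- One big INSERT ... SELECT: join several source tables, aggregate,
-- and write the results into the history table in a single statement.
INSERT INTO usage_summary (quarter, product, minutes_in_use)
SELECT date_trunc('quarter', s.used_from) AS quarter,
       s.product,
       sum(s.minutes)                     AS minutes_in_use
FROM sessions s
JOIN products p ON p.id = s.product
GROUP BY 1, 2;
```

While such a statement runs, it holds its snapshot and row locks for the whole duration, which is what makes concurrent modifications to the source tables problematic.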

Re: Tricking the optimizer

2018-04-18 Thread Vitaliy Garnashevich
We've tried to use "... WHERE coalesce(...)::bool;" on a test query: Before: ->  Seq Scan on public.sys_user sys_user_1  (cost=0.00..27140.46 rows=1 width=16) (actual time=0.105..38.627 rows=3289 loops=1)       Filter: (((SubPlan 7) = 0) AND ((SubPlan 8) = 0) AND ((SubPlan 9) = 0)) After: -> 
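The rewrite being tested — collapsing several separately estimated sub-plan filters into a single coalesce() expression so the planner treats them as one opaque boolean — looks roughly like this (filter bodies are hypothetical):

```sql
-- Before: three sub-plan filters, each given its own selectivity estimate,
-- which multiply together and drive the row estimate far too low.
SELECT * FROM sys_user u
WHERE (SELECT count(*) FROM acl1 a WHERE a.user_id = u.id) = 0
  AND (SELECT count(*) FROM acl2 a WHERE a.user_id = u.id) = 0;

-- After: one combined expression, estimated with the default selectivity
-- for an opaque boolean clause instead of the product of the parts.
SELECT * FROM sys_user u
WHERE coalesce(
        (SELECT count(*) FROM acl1 a WHERE a.user_id = u.id) = 0
    AND (SELECT count(*) FROM acl2 a WHERE a.user_id = u.id) = 0
      )::bool;
```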

Re: Tricking the optimizer

2018-04-18 Thread Vitaliy Garnashevich
We'd rather avoid writing C code. We're trying to stick with the DB functionality provided by default. Could something like the following, in theory, make things at least somewhat better in our case (without using the selectivity override)? ... WHERE coalesce(securityFilter1 AND

Tricking the optimizer

2018-04-18 Thread Vitaliy Garnashevich
Hi, In order to implement some security features in our application, we sometimes append additional WHERE filters to our queries, so the filter in the end looks like: SELECT ... FROM ... WHERE securityFilter1 AND securityFilter2 AND securityFilter3 In the EXPLAIN the filters look something
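A concrete instance of the pattern described above, with hypothetical filters standing in for securityFilter1..3:

```sql
-- Application-level row security: extra predicates appended to every query.
-- Each filter is cheap to evaluate but hard for the planner to estimate.
SELECT d.*
FROM documents d
WHERE d.owner_id  = current_setting('app.user_id')::int    -- securityFilter1
  AND d.tenant_id = current_setting('app.tenant_id')::int  -- securityFilter2
  AND NOT d.archived;                                      -- securityFilter3
```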

Re: Autovacuum Problem

2018-04-03 Thread Vitaliy Garnashevich
However no space seems to be freed to the system. Is there any way a bloody newbie can debug this behaviour? In our experience, autovacuum is able to contain bloating of table data, but not bloating of indexes. You could see where the bloating is by running the following queries: CREATE
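One way to locate where the bloat lives, assuming the contrib extension pgstattuple is available (the table and index names below are placeholders):

```sql
CREATE EXTENSION IF NOT EXISTS pgstattuple;

-- Table-level bloat: free_percent is space that autovacuum has made
-- reusable inside the table but has NOT returned to the operating system.
SELECT * FROM pgstattuple('public.mytable');

-- Index bloat: an avg_leaf_density well below ~90% suggests the index
-- has bloated and may need a REINDEX to shrink on disk.
SELECT * FROM pgstatindex('public.mytable_pkey');
```

Note that ordinary VACUUM only returns space to the OS when free pages happen to be at the physical end of the file; otherwise the file size stays the same even though the space is reusable.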

Re: shared_buffers 8GB maximum

2018-02-19 Thread Vitaliy Garnashevich
When we ran some analytic tasks, increasing shared_buffers had a negative impact on speed. Probably the hit ratio was too low after the change, but the maintenance of shared buffers (searching for free blocks) was slower. What was the size of shared buffers when the slowdown happened

Re: shared_buffers 8GB maximum

2018-02-19 Thread Vitaliy Garnashevich
Yes.  I don't know the exact reason, but reading a buffer from OS cache is quite a bit more expensive than just pinning a buffer already in the buffer_pool, about 5 times more expensive the last time I tested it, which was before Meltdown.  (And just pinning a buffer which is already in the

Re: shared_buffers 8GB maximum

2018-02-18 Thread Vitaliy Garnashevich
I certainly wouldn't recommend using 1/2 of RAM right away. There's a good chance it would be a waste of memory - for example due to double buffering, which effectively reduces "total" cache hit ratio. Double buffering is often mentioned in context of tuning shared buffers. Is there a tool
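Measuring the overlap between shared buffers and the OS page cache needs OS-side tooling, but the shared-buffers side of the question can at least be inspected with the contrib extension pg_buffercache (relation names below will vary per database):

```sql
CREATE EXTENSION IF NOT EXISTS pg_buffercache;

-- How much of shared_buffers each relation currently occupies,
-- assuming the default 8 kB block size.
SELECT c.relname,
       count(*) * 8192 / 1024 / 1024 AS buffered_mb
FROM pg_buffercache b
JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
GROUP BY c.relname
ORDER BY buffered_mb DESC
LIMIT 10;
```

Comparing this against OS-level cache inspection (e.g. fincore/pcstat on Linux) would show how much data is cached twice.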

Re: shared_buffers 8GB maximum

2018-02-16 Thread Vitaliy Garnashevich
Not necessarily - it depends on exactly what was changed ... which unfortunately I don't know for certain. Any filesystem call is a kernel transition. That's a Meltdown issue. Meltdown can be avoided by using trampoline functions to call the (real) kernel functions and isolating each

shared_buffers 8GB maximum

2018-02-16 Thread Vitaliy Garnashevich
Hi All, I've seen the shared_buffers 8GB maximum recommendation repeated many times. I have several questions in this regard. - Is this recommendation still true for recent versions of postgres? (e.g. wasn't it the case only for really old versions where the locks on shared buffers worked
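For reference, the setting under discussion as it would appear in postgresql.conf (the values are illustrative only, not a recommendation):

```sql
-- postgresql.conf fragment (shown here as a config sketch):
-- shared_buffers = 8GB          # the commonly repeated ceiling in question
-- effective_cache_size = 48GB   # planner hint: shared buffers + OS cache
```

Changing shared_buffers requires a server restart, which is part of why the "safe default" recommendations persist even as the underlying locking code has improved across versions.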

Re: EXPLAIN BUFFERS: dirtied

2018-01-29 Thread Vitaliy Garnashevich
the hint bits before writing the page, or the following SELECT/VACUUM will have to do that (possibly loading/updating/writing the page again)? Regards, Vitaliy On 2018-01-29 20:38, Tom Lane wrote: Vitaliy Garnashevich <vgarnashev...@gmail.com> writes: But what is "dirtied"

EXPLAIN BUFFERS: dirtied

2018-01-29 Thread Vitaliy Garnashevich
Hi, In EXPLAIN (ANALYZE, BUFFERS) for a SELECT query, I see the following statistics under an Index Scan node: Buffers: shared hit=8357288 read=6165444 dirtied=44820 written=5590 As far as I understand, that's the statistics for accesses to shared buffers during the query: - hit = required
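The counters in question come from a query of this shape (table name hypothetical; the meanings in the comments follow the documented semantics of the BUFFERS option):

```sql
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM big_table WHERE id BETWEEN 1 AND 100000;

-- Buffers: shared hit=... read=... dirtied=... written=...
--   hit     = pages found already resident in shared buffers
--   read    = pages that had to be read in from the OS / disk
--   dirtied = previously clean pages this query modified
--             (e.g. by setting hint bits on freshly committed rows)
--   written = dirty pages this backend itself wrote out to free buffers
```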

Number of rows returned by Sort node

2018-01-09 Thread Vitaliy Garnashevich
Hi, How, according to EXPLAIN ANALYZE, the Sort node could return more rows as output (rows=767662), than it had for input from its child node (rows=135627)? ->  Merge Join  (actual time=1977.388..333626.072 rows=725757 loops=1)   ->  Index Scan using (actual
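A likely explanation (not confirmed in this snippet): a merge join re-reads ("rescans") part of its sorted inner input whenever the outer side has duplicate join keys, and EXPLAIN ANALYZE reports a node's rows summed over all scans and rescans. A sketch of a query that exhibits this, with hypothetical tables:

```sql
-- If t1.k contains duplicates, the Merge Join repeatedly backs up and
-- re-reads the matching run of the sorted t2 input, so the Sort node's
-- reported "rows" can exceed the number of rows actually fed into it.
EXPLAIN ANALYZE
SELECT *
FROM t1
JOIN t2 ON t1.k = t2.k;
```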