Re: Benchmarking: How to identify bottleneck (limiting factor) and achieve "linear scalability"?

2019-01-24 Thread Saurabh Nanda
Is there any material on how to benchmark Postgres meaningfully? I'm
getting very frustrated with the numbers that `pgbench` is reporting:

-- allocating more resources to Postgres sometimes drops performance,
seemingly at random
-- there seems to be no repeatability in the benchmarking numbers [1]
-- there is no way to figure out what is causing a bottleneck and which
knob/setting is going to alleviate it.

How do the PG wizards figure all this out?
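
For example, is sampling wait events while the benchmark is running, along
these lines, a reasonable way to find the bottleneck? (Just a sketch -- I'm
not sure pg_stat_activity is the right place to look.)

    -- run repeatedly while pgbench is active, to see what backends wait on
    SELECT wait_event_type, wait_event, count(*)
    FROM pg_stat_activity
    WHERE state = 'active'
    GROUP BY 1, 2
    ORDER BY count(*) DESC;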

[1]
https://dba.stackexchange.com/questions/227790/pgbench-20-30-variation-in-benchmark-results-non-repeatable-benchmarks

-- Saurabh.

On Thu, Jan 24, 2019 at 12:46 AM Saurabh Nanda 
wrote:

> Hi,
>
> Please pardon me if this question is already answered in the
> documentation, Wiki, or the mailing list archive. The problem is that I
> don't know the exact term to search for - I've tried searching for "linear
> scalability" and "concurrency vs performance" but didn't find what I was
> looking for.
>
> ## MAIN QUESTION
>
> pgbench -c 1 achieves approx 80 TPS
> pgbench -c 6 should achieve approx 480 TPS, but only achieves 360 TPS
> pgbench -c 12, should achieve approx 960 TPS, but only achieves 610 TPS
>
> If pgbench is being run on a 4c/8t machine and pg-server is being run on a
> 6c/12t machine with 32GB RAM [1], and the two servers are connected with 1
> Gbit/s connection, I don't think either pgbench or pg-server is being
> constrained by hardware, right?
>
> *In that case why is it not possible to achieve linear scalability, at
> least till 12 concurrent connections (i.e. the thread-count of pg-server)?*
> What is an easy way to identify the limiting factor? Is it network
> connectivity? Disk IOPS? CPU load? Some config parameter?
>
> ## SECONDARY QUESTION
>
> *At what level of concurrent connections should settings like
> shared_buffers, effective_cache_size, max_wal_size start making a
> difference?* With my hardware [1], I'm seeing a difference only after 48
> concurrent connections. And even then it's only a 15-30% improvement over
> the default settings that ship with the Ubuntu 18.04 package. Is this
> expected? Isn't this allocating too many resources for too little gain?
>
> ## CONTEXT
>
> I am currently trying to benchmark PG 11 (via pgbench) to figure out the
> configuration parameters that deliver optimum performance for my hardware
> [1] and workload [2]
>
> Based on https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
> I've made the following relevant changes to the default PG config on Ubuntu
> 18.04:
>
> max_connections=400
> work_mem=4MB
> maintenance_work_mem=64MB
> shared_buffers=12288MB
> temp_buffers=8MB
> effective_cache_size=16GB
> wal_buffers=-1
> wal_sync_method=fsync
> max_wal_size=5GB
> autovacuum=off  # NOTE: Only for benchmarking
>
> [1] 32 GB RAM - 6 core/12 thread - 2x SSD in RAID1
> [2] SaaS webapp -- it's a mixed workload which looks a lot like TPC-B
>
> Thanks,
> Saurabh.
>


-- 
http://www.saurabhnanda.com


Re: Zero throughput on a query on a very large table.

2019-01-24 Thread David Rowley
On Fri, 25 Jan 2019 at 19:24, Tom Lane  wrote:
> PS: On the third hand, you mention having created new indexes on this
> table with apparently not a lot of pain, which is a tad surprising
> if you don't have the patience to wait for a sort to finish.  How
> long did those index builds take?

It would certainly be good to look at psql's \d tmp_outpatient_rev
output to ensure that the index is not marked as INVALID.
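
The same check can be scripted against the catalogs, if that's easier than
eyeballing \d output -- a quick sketch:

    -- any row with indisvalid = false is an index the planner won't use
    SELECT c.relname AS index_name, i.indisvalid
    FROM pg_index i
    JOIN pg_class c ON c.oid = i.indexrelid
    WHERE i.indrelid = 'tmp_outpatient_rev'::regclass;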


-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Zero throughput on a query on a very large table.

2019-01-24 Thread Andreas Kretschmer




On 25.01.19 at 06:20, l...@laurent-hasson.com wrote:


Hello,

We have been stuck for the past week on a query that simply won’t 
“execute”. We have a table with 1.2B rows that took around 14h to 
load, but a simple select takes forever, and after 10h still no records 
have come through.


Environment:

 - Table tmp_outpatient_rev with 41 VARCHAR columns (desy_sort_key, 
   claim_no, clm_line_num, clm_thru_dt, nch_clm_type_cd, rev_cntr, 
   rev_cntr_dt, …)
 - 1.2B rows (Billion with a ‘B’)
 - A single Unique Index on columns desy_sort_key, claim_no, clm_line_num
 - select pg_size_pretty(pg_relation_size('tmp_outpatient_rev')) --> 215GB
 - Database Server: 64GB RAM, 8 cores/16 threads, 10K RPM HDDs
 - Linux
 - PG 11.1

Query:

 select * from tmp_outpatient_rev order by desy_sort_key, claim_no

Plan:

    Gather Merge  (cost=61001461.16..216401602.29 rows=1242732290 width=250)
      Output: desy_sort_key, claim_no, clm_line_num, clm_thru_dt, nch_clm_type_cd, rev_cntr, rev_cntr_dt, …
      Workers Planned: 10
      ->  Sort  (cost=61000460.97..61311144.04 rows=124273229 width=250)
            Output: desy_sort_key, claim_no, clm_line_num, clm_thru_dt, nch_clm_type_cd, rev_cntr, rev_cntr_dt, …
            Sort Key: tmp_outpatient_rev.desy_sort_key, tmp_outpatient_rev.claim_no
            ->  Parallel Seq Scan on public.tmp_outpatient_rev  (cost=0.00..29425910.29 rows=124273229 width=250)
                  Output: desy_sort_key, claim_no, clm_line_num, clm_thru_dt, nch_clm_type_cd, rev_cntr, rev_cntr_dt, …

Method of access:

    - Using Pentaho Kettle (an ETL tool written in Java and using JDBC), 
      we simply issue the query and expect records to start streaming in 
      ASAP.

    - Issue was replicated with really basic JDBC code in a Java test 
      program.

    - The database doesn't have much other data and the table was loaded 
      from a CSV data source with LOAD over something like 14h (average 
      throughput of about 25K rows/s).

    - Settings:

          alter database "CMS_TMP" set seq_page_cost=1;
          alter database "CMS_TMP" set random_page_cost=4;
          alter database "CMS_TMP" set enable_seqscan=true;

          JDBC connection string with no extra params.
          Database has been generally configured properly.

Problem:

    - The plan shows a full table scan followed by a sort, and then a 
      gather merge. With 1.2B rows, it seems crazy to try to sort that 
      much data.

    - After 10h, the query is still "silent" and no record is streaming 
      in. IO is very high (80-90% disk throughput utilization) on the 
      machine (the sort…).

    - I have tried to hack the planner to force an index scan (which 
      would avoid the sort/gather steps and should start streaming data 
      right away), in particular, enable_seqscan=false or seq_page_cost=2. 
      To my surprise, this had ZERO impact on the plan.

    - I changed the “order by” to include all 3 columns from the index, 
      or created a non-unique index with only the first 2 columns, all to 
      no effect whatsoever either.

    - The table was written over almost 14h at about 25K rows/s, and it 
      seems to me I should be able to read the data back at least as fast.

Why is a simple index scan not used? Why are all our efforts to force 
the use of the index failing?





the query isn't that simple: there is no WHERE condition, so PG has to 
read the whole table, and for that the index is useless. Would it be 
enough to select only the columns covered by the index?
(Run a VACUUM on the table after loading the data; that can enable an 
index-only scan in this case.)
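
Something like this is what I mean -- a rough sketch, using the column 
names from your post:

    -- after loading, so the visibility map is up to date
    VACUUM ANALYZE tmp_outpatient_rev;

    -- only the indexed columns, so an index-only scan becomes possible
    SELECT desy_sort_key, claim_no, clm_line_num
    FROM tmp_outpatient_rev
    ORDER BY desy_sort_key, claim_no;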





Regards, Andreas

--
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com




Re: SELECT performance drop

2019-01-24 Thread Jan Nielsen
On Wed, Jan 23, 2019 at 12:37 PM legrand legrand <
legrand_legr...@hotmail.com> wrote:

> Hi,
> is there an index on
>  fm_order(session_id,type)?
>

There isn't at the moment:

   table_name   |          index_name          | column_name
----------------+------------------------------+-------------
 fm_account     | fm_account_pkey              | id
 fm_account     | uk_5p6qalvucbxmw9u64wf0aif9d | name
 fm_allocation  | fm_allocation_pkey           | id
 fm_approval    | fm_approval_pkey             | id
 fm_capital     | fm_capital_pkey              | id
 fm_grant       | fm_grant_pkey                | id
 fm_market      | fm_market_pkey               | id
 fm_marketplace | fm_marketplace_pkey          | id
 fm_order       | fm_order_pkey                | id
 fm_person      | fm_person_pkey               | id
 fm_session     | fm_session_pkey              | id
 fm_trader      | fm_trader_pkey               | id
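
A minimal sketch of the index PAscal is suggesting, in case it helps 
(the index name below is just a placeholder):

    CREATE INDEX fm_order_session_id_type_idx
        ON fm_order (session_id, type);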




>
> regards
> PAscal


Re: ERROR: found xmin from before relfrozenxid

2019-01-24 Thread Mariel Cherkassky
I'm checking the full version.
As you said, I saw that in 9.6.9 there was a fix for the following bug:

Avoid spuriously marking pages as all-visible (Dan Wood, Pavan Deolasee,
Álvaro Herrera)

This could happen if some tuples were locked (but not deleted). While
queries would still function correctly, vacuum would normally ignore such
pages, with the long-term effect that the tuples were never frozen. In
recent releases this would eventually result in errors such as "found
multixact n from before relminmxid n".

So basically, he just needs to upgrade in order to fix it? Or is there
something else that needs to be done?
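
For reference, this is roughly how I'm checking the exact point release 
and the table's freeze horizons (table name taken from the error message 
above):

    -- exact server version, to see whether the 9.6.9 fix is present
    SELECT version();

    -- freeze horizons for the table named in the error
    SELECT relname, relfrozenxid, relminmxid
    FROM pg_class
    WHERE relname = 'table_1';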

On Wed, Jan 23, 2019 at 21:51, Jerry Sievers <gsiever...@comcast.net> wrote:

> Mariel Cherkassky  writes:
>
> > Hey,
> > I'm trying to help a guy that is using pg9.6 but I'm not so familiar
> > with the error message :
> > ERROR:  found xmin 16804535 from before relfrozenxid 90126924
> > CONTEXT:  automatic vacuum of table db1.public.table_1"
>
> 9.6.?...
>
> That error or a very similar one was fixed in a recent point release.
>
> HTH
>
> --
> Jerry Sievers
> Postgres DBA/Development Consulting
> e: postgres.consult...@comcast.net
>


Re: Parallel stats in execution plans

2019-01-24 Thread David Rowley
On Thu, 17 Jan 2019 at 00:31, David Conlin  wrote:
> How the time values combine with parallelism. For example, each execution of 
> the sort node takes an average of 48.5s, over three loops. This makes a total 
> running time of 145.5s. Even if this was perfectly distributed between the 
> two workers, I would expect this to take 72.75s, which is more than the total 
> execution time, so it can't take this long.
>  How the row numbers combine with those in the "Workers" subkey. For example, 
> in the Sort node, worker #0 produces 3,277,980 rows, while worker #1 produces 
> 3,385,130 rows. The Sort node as a whole produces 3,333,337 rows per loop, 
> for a total of 10,000,010 (the value in the gather merge node). I would have 
> expected the number of rows produced by the two workers to sum to the number 
> produced by the Sort node as a whole, either per loop or in total.
> How the "Actual Loops" values combine with those in the "Workers" subkey. For 
> example, the "Sort" node has 3 loops, but each of the workers inside it have 
> 1 loop. I would have expected either:
>
> each of the workers to have done 3 loops (since the sort is executed 3 
> times), or
> the number of loops in the two workers to sum to three (if the three 
> executions of the sort are distributed across the two workers)

It's important to know that all of the actual row counts and actual
times are divided by the number of loops, which in this case is 3, one
per process working on that part of the plan.  There are two workers,
but the main process helps out too.
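
Put another way, a rough back-of-the-envelope using the numbers from your 
plan (the leader's share is just inferred by subtraction):

    rows:  3,277,980 (worker 0) + 3,385,130 (worker 1)
           + 3,336,900 (leader) = 10,000,010 in total;
           10,000,010 / 3 loops ≈ 3,333,337 "Actual Rows" per loop.

    time:  the 48.5s on the Sort node is the per-loop (per-process)
           average; 3 processes x 48.5s ≈ 145.5s of sort work in total,
           but because the three processes ran concurrently it still
           fits inside the ~60.6s wall-clock "Actual Total Time" of the
           Gather Merge node.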

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Parallel stats in execution plans

2019-01-24 Thread David Conlin
It seems like no-one has any ideas on this - does anyone know where else 
I could look or ask to find out more?


Is it possible that this is a bug?

Thanks

Dave

On 16/01/2019 11:31, David Conlin wrote:


Hi folks -

I'm having trouble understanding what some of the stats mean in the 
execution plan output when parallel workers are used. I've tried to 
read up about it, but I haven't been able to find anything that 
explains what I'm seeing. Apologies in advance if there's 
documentation I've been too stupid to find.


I've run the following query. The "towns" table is a massive table 
that I created in order to get some big numbers on a parallel query - 
don't worry, this isn't a real query I want to make faster, just a 
silly example I'd like to understand.


EXPLAIN (ANALYZE, FORMAT JSON, BUFFERS, VERBOSE)
SELECT name, code, article
FROM towns
ORDER BY name ASC,
  code DESC;

The output looks like this:

[
  {
    "Plan": {
      "Node Type": "Gather Merge", "Parallel Aware": false,
      "Startup Cost": 1013948.54, "Total Cost": 1986244.55,
      "Plan Rows": 884, "Plan Width": 77,
      "Actual Startup Time": 42978.838, "Actual Total Time": 60628.982,
      "Actual Rows": 10000010, "Actual Loops": 1,
      "Output": ["name", "code", "article"],
      "Workers Planned": 2, "Workers Launched": 2,
      "Shared Hit Blocks": 29, "Shared Read Blocks": 47641,
      "Shared Dirtied Blocks": 0, "Shared Written Blocks": 0,
      "Local Hit Blocks": 0, "Local Read Blocks": 0,
      "Local Dirtied Blocks": 0, "Local Written Blocks": 0,
      "Temp Read Blocks": 91342, "Temp Written Blocks": 91479,
      "Plans": [
        {
          "Node Type": "Sort", "Parent Relationship": "Outer", "Parallel Aware": false,
          "Startup Cost": 1012948.52, "Total Cost": 1023365.25,
          "Plan Rows": 4166692, "Plan Width": 77,
          "Actual Startup Time": 42765.496, "Actual Total Time": 48526.168,
          "Actual Rows": 3333337, "Actual Loops": 3,
          "Output": ["name", "code", "article"],
          "Sort Key": ["towns.name", "towns.code DESC"],
          "Sort Method": "external merge", "Sort Space Used": 283856, "Sort Space Type": "Disk",
          "Shared Hit Blocks": 170, "Shared Read Blocks": 142762,
          "Shared Dirtied Blocks": 0, "Shared Written Blocks": 0,
          "Local Hit Blocks": 0, "Local Read Blocks": 0,
          "Local Dirtied Blocks": 0, "Local Written Blocks": 0,
          "Temp Read Blocks": 273289, "Temp Written Blocks": 273700,
          "Workers": [
            {
              "Worker Number": 0,
              "Actual Startup Time": 42588.662, "Actual Total Time": 48456.662,
              "Actual Rows": 3277980, "Actual Loops": 1,
              "Shared Hit Blocks": 72, "Shared Read Blocks": 46794,
              "Shared Dirtied Blocks": 0, "Shared Written Blocks": 0,
              "Local Hit Blocks": 0, "Local Read Blocks": 0,
              "Local Dirtied Blocks": 0, "Local Written Blocks": 0,
              "Temp Read Blocks": 89067, "Temp Written Blocks": 89202
            },
            {
              "Worker Number": 1,
              "Actual Startup Time": 42946.705, "Actual Total Time": 48799.414,
              "Actual Rows": 3385130, "Actual Loops": 1,
              "Shared Hit Blocks": 69, "Shared Read Blocks": 48327,
              "Shared Dirtied Blocks": 0, "Shared Written Blocks": 0,
              "Local Hit Blocks": 0, "Local Read Blocks": 0,
              "Local Dirtied Blocks": 0, "Local Written Blocks": 0,
              "Temp Read Blocks": 92880, "Temp Written Blocks": 93019
            }
          ],
          "Plans": [
            {
              "Node Type": "Seq Scan", "Parent Relationship": "Outer", "Parallel Aware": true,
              "Relation Name": "towns", "Schema": "public", "Alias": "towns",
              "Startup Cost": 0.00, "Total Cost": 184524.92,
              "Plan Rows": 4166692, "Plan Width": 77,
              "Actual Startup Time": 0.322, "Actual Total Time": 8305.886,
              "Actual Rows": 3333337, "Actual Loops": 3,
              "Output": ["name", "code", "article"],
              "Shared Hit Blocks": 96, "Shared Read Blocks": 142762,
              "Shared Dirtied Blocks": 0, "Shared Written Blocks": 0,
              "Local Hit Blocks": 0, "Local Read Blocks": 0,
              "Local Dirtied Blocks": 0, "Local Written Blocks": 0,
              "Temp Read Blocks": 0, "Temp Written Blocks": 0,
              "Workers": [
                {
                  "Worker Number": 0,
                  "Actual Startup Time": 0.105, "Actual Total Time": 8394.629,
                  "Actual Rows": 3277980, "Actual Loops": 1,
                  "Shared Hit Blocks": 35, "Shared Read Blocks": 46794,
                  "Shared Dirtied Blocks": 0, "Shared Written Blocks": 0,
                  "Local Hit Blocks": 0, "Local Read Blocks": 0,
                  "Local Dirtied Blocks": 0, "Local Written Blocks": 0,
                  "Temp Read Blocks": 0, "Temp Written Blocks": 0
                },
                {
                  "Worker Number": 1,
                  "Actual Startup Time": 0.113, "Actual Total Time": 8139.382,
                  "Actual Rows": 3385130, "Actual Loops": 1,
                  "Shared Hit Blocks": 32, "Shared Read Blocks": 48327,
                  "Shared Dirtied Blocks": 0, "Shared Written Blocks": 0,
                  "Local Hit Blocks": 0, "Local Read Blocks": 0,
                  "Local Dirtied Blocks": 0, "Local Written Blocks": 0,
                  "Temp Read Blocks": 0, "Temp Written Blocks": 0
                }
              ]
            }
          ]
        }
      ]
    },
    "Planning Time": 22.898,
    "Triggers": [],
    "Execution Time": 61133.161
  }
]

Or a more slimmed-down version, with just the confusing fields:

[
   {