Re: [PERFORM] Odd sudden performance degradation related to temp object churn

2017-08-18 Thread Jeremy Finzel
On Tue, Aug 15, 2017 at 12:07 PM, Scott Marlowe wrote:

> So do iostat or iotop show you if / where your disks are working
> hardest? Or is this CPU overhead that's killing performance?
>

Sorry for the delayed reply.  I took a more detailed look at the query
plans from our problem query during this incident.  There are actually 6
plans, because there were 6 unique queries.  I traced one query through our
logs and found something really interesting: all of the first 5 queries
create temp tables, and each of them took upwards of 500ms to run.  The
final query, however, is a simple select from the last temp table, and that
query took 0.035ms!  This strongly suggests that somehow the issue had to
do with *writing* to the SAN, I think.  Of course this doesn't answer a
whole lot, because we had no other apparent issues with write performance
at all.
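
To make the comparison concrete, the split shows up when each step is
timed separately.  A minimal sketch of the pattern (the table name and
query here are hypothetical stand-ins, not our actual workload):

    -- psql: report per-statement timing
    \timing on

    -- the shape of the first 5 queries: each one materializes a temp table
    EXPLAIN (ANALYZE, BUFFERS)
    CREATE TEMP TABLE tmp_step1 AS
    SELECT id, payload
    FROM source_table
    WHERE updated_at > now() - interval '1 day';

    -- the shape of the final query: a plain read of the last temp table
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT * FROM tmp_step1;

If the CREATE TEMP TABLE ... AS steps dominate while the plain SELECT is
sub-millisecond, the time is going into writing out the temp relations
rather than computing the results.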

I also provide some graphs below.

7pm-3am on 8/10 (first incidents were around 10:30pm, other incidents ~1am
and ~2am):

- Local Disk IO: [graph: Screen Shot 2017-08-18 at 8.20.06 AM.png]
- SAN IO: [graph: Screen Shot 2017-08-18 at 8.16.59 AM.png]
- CPU: [graph: Screen Shot 2017-08-18 at 8.20.58 AM.png]

7-9pm on 8/10 (controlled attempts starting a little after 7pm):

- CPU: [graph: Screen Shot 2017-08-18 at 8.43.35 AM.png]
- Write IO on SAN: [graph: Screen Shot 2017-08-18 at 8.44.32 AM.png]
- Read IO on Local disk: [graph: Screen Shot 2017-08-18 at 8.46.27 AM.png]
- Write IO on Local disk: [graph: Screen Shot 2017-08-18 at 8.46.58 AM.png]


[PERFORM] Query plan for views and WHERE clauses, Luke is not using the index

2017-08-18 Thread kimaidou
Hi all,

I have come across an unexpected behavior.
You can see the full details in an issue on the QGEP project on GitHub:
https://github.com/QGEP/QGEP/issues/308#issuecomment-323122514

Basically, we have this view with some LEFT JOINs:
http://paste.debian.net/982003/

We have indexes on some fields (the foreign keys, plus a GIST index on the
PostGIS geometry field).
If I use the raw SQL defining the view and add a WHERE clause like:

WHERE "progression_geometry" &&
st_makeenvelope(1728327.03249295568093657,8240789.26074041239917278,1728608.10987572139129043,8240958.16933418624103069,3949)

the query plan is "as expected", as it is using the spatial index (and
others too). This query gets 100 lines from a "main" table containing 2
lines (and child tables having more). It is pretty fast and "low cost"
See the query plan:
https://explain.depesz.com/s/6Qgb
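
For context, the spatial index is of the usual PostGIS form; a sketch of
what it would look like (the table name here is a hypothetical stand-in -
the view exposes the column as "progression_geometry"):

    CREATE INDEX reach_progression_geometry_idx
        ON qgep.od_reach
        USING GIST (progression_geometry);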

When we apply the same WHERE clause to the view:

EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS)
SELECT *
FROM "qgep"."vw_qgep_reach"
WHERE "progression_geometry" &&
st_makeenvelope(1728327.03249295568093657,8240789.26074041239917278,1728608.10987572139129043,8240958.16933418624103069,3949)


The query plan is "wrong", as PostgreSQL seems to consider it should do a
seq scan on the tables, and only afterwards filter with the WHERE:
https://explain.depesz.com/s/wXV

The query takes about 1 second instead of less than 100ms.
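
One way to narrow this down (a sketch, reusing the exact query above) is
to temporarily discourage sequential scans and compare plans.  If the
index is then used and the query is much faster, the planner is simply
mis-costing the index path; if not, something in the view definition is
preventing the WHERE from being pushed down into it:

    -- session-local experiment only; not a production setting
    SET enable_seqscan = off;

    EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS)
    SELECT *
    FROM "qgep"."vw_qgep_reach"
    WHERE "progression_geometry" &&
          st_makeenvelope(1728327.03249295568093657, 8240789.26074041239917278,
                          1728608.10987572139129043, 8240958.16933418624103069, 3949);

    RESET enable_seqscan;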

Do you have any hints on this kind of issue?

Thanks in advance

Regards,

Michaël


Re: [PERFORM] Odd sudden performance degradation related to temp object churn

2017-08-18 Thread Mark Kirkwood



On 19/08/17 02:21, Jeremy Finzel wrote:
> On Tue, Aug 15, 2017 at 12:07 PM, Scott Marlowe
> <scott.marl...@gmail.com> wrote:
>
>> So do iostat or iotop show you if / where your disks are working
>> hardest? Or is this CPU overhead that's killing performance?
>
> Sorry for the delayed reply. I took a more detailed look at the query
> plans from our problem query during this incident. There are actually
> 6 plans, because there were 6 unique queries. I traced one query
> through our logs and found something really interesting: all of the
> first 5 queries create temp tables, and each of them took upwards of
> 500ms to run. The final query, however, is a simple select from the
> last temp table, and that query took 0.035ms! This strongly suggests
> that somehow the issue had to do with *writing* to the SAN, I think.
> Of course this doesn't answer a whole lot, because we had no other
> apparent issues with write performance at all.
>
> I also provide some graphs below.


Hi, graphs for latency (or await etc.) might be worth looking at too -
sometimes the troughs between the IO spikes are actually when the disks
have been overwhelmed with queued-up pending IOs...


Also, SANs are notorious for this sort of thing - typically they have a
big RAM cache that you are actually writing to, and everything is nice
and fast until your workload (along with everyone else's) fills up the
cache, and then performance drops off a cliff for a while (I've seen SAN
disks with iostat utilizations of 105% <-- Lol... and await numbers that
scroll off the page in that scenario)!


regards
Mark




Re: [PERFORM] Very poor read performance, query independent

2017-08-18 Thread Mark Kirkwood

Nice!

Pleased that the general idea worked well for you!

I'm also relieved that you did not follow my recommendation exactly -
I've been trialling a Samsung 960 Evo (256GB) and an Intel 600p (256GB),
and I've stumbled across the serious disadvantage of (consumer) M.2
drives using TLC NAND - terrible sustained write performance! While these
guys can happily do ~2GB/s reads, their write performance is only 'burst
capable'. They have small SLC NAND 'write caches' that do ~1GB/s for a
*limited time* (10-20s), and after that you get ~200 MB/s! Ouch - my old
Crucial 550 can do 350 MB/s sustained writes (so two of them in RAID0
are doing 700 MB/s for hours).


Bigger capacity drives can do better - but overall I'm not that 
impressed with the current trend of using TLC NAND.


regards

Mark


On 21/07/17 00:50, Charles Nadeau wrote:

Mark,

I received a second server yesterday, with 16 drive bays. Just for a
quick trial, I used 2 old 60GB SSDs (a Kingston V300 and an ADATA SP900)
to build a RAID0. To my surprise, my very picky RAID controller (HP
P410i) recognised them without a fuss (although as SATA II drives at
3Gb/s). A quick fio benchmark gives me 22000 random 4k read IOPS, more
than my 5 146GB 10k SAS disks in RAID0. I moved my most frequently used
index to this array and will try to do some benchmarks.
Knowing that SSDs based on the SandForce-2281 controller are recognised
by my server, I may buy a pair of bigger/newer ones to put my tables on.
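
The move itself is just a tablespace change; a sketch (the mount point
and index name are hypothetical stand-ins):

    -- tablespace on the SSD RAID0 mount point
    CREATE TABLESPACE ssd_raid0 LOCATION '/mnt/ssd_raid0/pgdata';

    -- relocate the hot index; note this copies the index files and
    -- holds an exclusive lock on the index while doing so
    ALTER INDEX my_hot_index SET TABLESPACE ssd_raid0;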


Thanks!

Charles

On Sat, Jul 15, 2017 at 1:57 AM, Mark Kirkwood
<mark.kirkw...@catalyst.net.nz> wrote:


Thinking about this a bit more - if somewhat more blazing
performance is needed, then this could be achieved by losing the
RAID card and spinning disks altogether and buying one of the NVMe
or SATA solid state products, e.g.:

- Samsung 960 Pro or Evo 2 TB (approx 1 or 2 GB/s seq scan speeds
and 200K IOPS)

- Intel S3610 or similar 1.2 TB (500 MB/s seq scan and 30K IOPS)


The Samsung needs an M.2 port on the mobo (but most should have
'em - and if not, PCIe X4 adapter cards are quite cheap). The Intel
is a bit more expensive compared to the Samsung, and is slower, but
has a longer lifetime. However, for your workload the Sammy is
probably fine.

regards

Mark

On 15/07/17 11:09, Mark Kirkwood wrote:

Ah yes - that seems more sensible (but still slower than I
would expect for 5 disks in RAID 0).









--
Charles Nadeau Ph.D.
http://charlesnadeau.blogspot.com/



