Mark,

I increased the read ahead to 16384 but it doesn't improve performance. My
RAID 0 uses a stripe size of 256k, the maximum size supported by the
controller.
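
For reference, the read ahead I mean is the usual Linux block device setting,
i.e. something like this (using /dev/sda as a placeholder for the array
device - note the two interfaces use different units):

    blockdev --getra /dev/sda                # current read ahead, in 512-byte sectors
    blockdev --setra 16384 /dev/sda          # e.g. 16384 sectors = 8MB
    cat /sys/block/sda/queue/read_ahead_kb   # the same setting, exposed in kB
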
Thanks!

Charles

On Sat, Jul 15, 2017 at 1:02 AM, Mark Kirkwood <
mark.kirkw...@catalyst.net.nz> wrote:

> Ah yes - that seems more sensible (but still slower than I would expect
> for a 5-disk RAID 0). You should be able to get something like 5 * (single
> disk speed), i.e. about 500MB/s.
>
> Might be worth increasing device read ahead (more than you have already).
> Some of these so-called 'smart' RAID cards need to be hit over the head
> before they will perform. E.g. I believe you have it set to 128 - I'd try
> 4096 or even 16384 (in the past I've used those settings on some extremely
> stupid cards that refused to max out their disks' known speeds).
>
> Also worth investigating is RAID stripe size - for DW work it makes sense
> for it to be reasonably big (256K to 1M), which again will help the speed
> of sequential scans.
>
> Cheers
>
> Mark
>
> P.S. I used to work for Greenplum, so this type of problem came up a lot
> :-). The best cards were the LSI and Areca!
>
>
>
> On 15/07/17 02:09, Charles Nadeau wrote:
>
>> Mark,
>>
>> First I must say that I changed my disk configuration from 4 disks in
>> RAID 10 to 5 disks in RAID 0 because I almost ran out of disk space during
>> the last ingest of data.
>> Here is the result of the test you asked for. It was done with a cold cache:
>>
>>     flows=# \timing
>>     Timing is on.
>>     flows=# explain select count(*) from flows;
>>                                               QUERY PLAN
>>     -----------------------------------------------------------------------------------------------
>>      Finalize Aggregate  (cost=17214914.09..17214914.09 rows=1 width=8)
>>        ->  Gather  (cost=17214914.07..17214914.09 rows=1 width=8)
>>              Workers Planned: 1
>>              ->  Partial Aggregate  (cost=17213914.07..17213914.07 rows=1 width=8)
>>                    ->  Parallel Seq Scan on flows  (cost=0.00..17019464.49 rows=388899162 width=0)
>>     (5 rows)
>>
>>     Time: 171.835 ms
>>     flows=# select pg_relation_size('flows');
>>      pg_relation_size
>>     ------------------
>>          129865867264
>>     (1 row)
>>
>>     Time: 57.157 ms
>>     flows=# select count(*) from flows;
>>     LOG:  duration: 625546.522 ms  statement: select count(*) from flows;
>>        count
>>     -----------
>>      589831190
>>     (1 row)
>>
>>     Time: 625546.662 ms
>>
>> The throughput reported by PostgreSQL is almost 198MB/s, and the
>> throughput as measured by dstat during the query execution was between 25
>> and 299MB/s. It is much better than what I had before! The I/O wait was
>> about 12% all through the query. One thing I noticed is the discrepancy
>> between the read throughput reported by pg_activity and the one reported by
>> dstat: pg_activity always reports a value lower than dstat.
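>>
>> (The 198MB/s figure is simply the relation size divided by the elapsed
>> time; as a sanity check, something like this reproduces it from the
>> numbers above:
>>
>>     psql -d flows -c "select pg_size_pretty((129865867264 / 625.5)::bigint);"
>>
>> which comes out to roughly 198 MB read per second.)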
>>
>> Besides the change of disk configuration, here is what contributed the
>> most to the improvement of the performance so far (a rough sketch of the
>> corresponding settings follows the list):
>>
>>     Using huge pages
>>     Increasing effective_io_concurrency to 256
>>     Reducing random_page_cost from 22 to 4
>>     Reducing min_parallel_relation_size to 512kB to have more workers
>>     when doing parallel sequential scans of my biggest table
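>>
>> A rough sketch of those settings via ALTER SYSTEM (the values are the ones
>> listed above; huge_pages only takes effect after a restart and needs
>> hugepages reserved in the kernel, the others apply on reload):
>>
>>     psql -d flows -c "alter system set huge_pages = 'on';"
>>     psql -d flows -c "alter system set effective_io_concurrency = 256;"
>>     psql -d flows -c "alter system set random_page_cost = 4;"
>>     psql -d flows -c "alter system set min_parallel_relation_size = '512kB';"
>>     psql -d flows -c "select pg_reload_conf();"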
>>
>>
>> Thanks for recommending this test, I now know what the real throughput
>> should be!
>>
>> Charles
>>
>> On Wed, Jul 12, 2017 at 4:11 AM, Mark Kirkwood <
>> mark.kirkw...@catalyst.net.nz>
>> wrote:
>>
>>     Hmm - how are you measuring that sequential scan speed of 4MB/s?
>>     I'd recommend doing a very simple test, e.g. here's one on my
>>     workstation - 13 GB single table on 1 SATA drive - cold cache
>>     after reboot, sequential scan using Postgres 9.6.2:
>>
>>     bench=#  EXPLAIN SELECT count(*) FROM pgbench_accounts;
>>                                          QUERY PLAN
>>     ------------------------------------------------------------------------------------
>>      Aggregate  (cost=2889345.00..2889345.01 rows=1 width=8)
>>        ->  Seq Scan on pgbench_accounts  (cost=0.00..2639345.00 rows=100000000 width=0)
>>     (2 rows)
>>
>>
>>     bench=#  SELECT pg_relation_size('pgbench_accounts');
>>      pg_relation_size
>>     ------------------
>>           13429514240
>>     (1 row)
>>
>>     bench=# SELECT count(*) FROM pgbench_accounts;
>>        count
>>     -----------
>>      100000000
>>     (1 row)
>>
>>     Time: 118884.277 ms
>>
>>
>>     So doing the math, the seq read speed is about 110MB/s (i.e. 13 GB in
>>     120 sec). Sure enough, while I was running the query iostat showed:
>>
>>     Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
>>     sda               0.00     0.00  926.00    0.00   114.89     0.00   254.10     1.90    2.03    2.03    0.00   1.08 100.00
>>
>>
>>     So it might be useful for us to see something like that from your
>>     system - note you need to check you really have flushed the cache,
>>     and that no other apps are using the db.
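>>
>>     To really start from a cold cache you can stop Postgres, drop the OS
>>     page cache and start it again (service names vary by distro - or just
>>     reboot, as I did in the test above), e.g.:
>>
>>         sudo systemctl stop postgresql                # releases shared_buffers
>>         sync                                          # flush dirty pages first
>>         echo 3 | sudo tee /proc/sys/vm/drop_caches    # drop the OS page cache
>>         sudo systemctl start postgresql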
>>
>>     regards
>>
>>     Mark
>>
>>
>>     On 12/07/17 00:46, Charles Nadeau wrote:
>>
>>         After reducing random_page_cost to 4 and testing more, I can
>>         report that the aggregate read throughput for parallel
>>         sequential scan is about 90MB/s. However, the throughput for
>>         sequential scan is still around 4MB/s.
>>
>>
>>
>>
>>
>> --
>> Charles Nadeau Ph.D.
>> http://charlesnadeau.blogspot.com/
>>
>
>


-- 
Charles Nadeau Ph.D.
http://charlesnadeau.blogspot.com/
