Igor, The 1st clause of the where statement won't select addresses the same way as the one I wrote using the extension for IPv6 and IPv6 data types.
flowstimestamp is a view: flows=# \d+ flowstimestamp View "public.flowstimestamp" Column | Type | Modifiers | Storage | Description -------------+--------------------------+-----------+---------+------------- flow_id | bigint | | plain | unix_secs | bigint | | plain | unix_nsecs | bigint | | plain | sysuptime | bigint | | plain | exaddr | ip4 | | plain | dpkts | integer | | plain | doctets | bigint | | plain | first | bigint | | plain | last | bigint | | plain | engine_type | smallint | | plain | engine_id | smallint | | plain | srcaddr | ip4 | | plain | dstaddr | ip4 | | plain | nexthop | ip4 | | plain | input | integer | | plain | output | integer | | plain | srcport | integer | | plain | dstport | integer | | plain | prot | smallint | | plain | tos | smallint | | plain | tcp_flags | smallint | | plain | src_mask | smallint | | plain | dst_mask | smallint | | plain | src_as | integer | | plain | dst_as | integer | | plain | timestamp | timestamp with time zone | | plain | View definition: SELECT flows.flow_id, flows.unix_secs, flows.unix_nsecs, flows.sysuptime, flows.exaddr, flows.dpkts, flows.doctets, flows.first, flows.last, flows.engine_type, flows.engine_id, flows.srcaddr, flows.dstaddr, flows.nexthop, flows.input, flows.output, flows.srcport, flows.dstport, flows.prot, flows.tos, flows.tcp_flags, flows.src_mask, flows.dst_mask, flows.src_as, flows.dst_as, to_timestamp((flows.unix_secs + flows.unix_nsecs / 1000000000)::double precision) AS "timestamp" FROM flows; And it can use the indexes of flows: Indexes: "flows_pkey" PRIMARY KEY, btree (flow_id) "flows_dstaddr_dstport" btree (dstaddr, dstport) "flows_srcaddr_dstaddr_idx" btree (srcaddr, dstaddr) "flows_srcaddr_srcport" btree (srcaddr, srcport) "flows_srcport_dstport_idx" btree (srcport, dstport) Thanks! Charles On Fri, Jul 14, 2017 at 10:18 PM, Igor Neyman <iney...@perceptron.com> wrote: > > > > > *From:* pgsql-performance-ow...@postgresql.org [mailto:pgsql-performance- > ow...@postgresql.org] *On Behalf Of *Igor Neyman > *Sent:* Friday, July 14, 2017 3:13 PM > *To:* Charles Nadeau <charles.nad...@gmail.com> > > *Cc:* Jeff Janes <jeff.ja...@gmail.com>; pgsql-performance@postgresql.org > *Subject:* Re: [PERFORM] Very poor read performance, query independent > > > > *From:* Charles Nadeau [mailto:charles.nad...@gmail.com > <charles.nad...@gmail.com>] > *Sent:* Friday, July 14, 2017 11:35 AM > *To:* Igor Neyman <iney...@perceptron.com> > *Cc:* Jeff Janes <jeff.ja...@gmail.com>; pgsql-performance@postgresql.org > *Subject:* Re: [PERFORM] Very poor read performance, query independent > > > > Igor, > > > > Initially temp_buffer was left to its default value (8MB). Watching the > content of the directory that stores the temporary files, I found that I > need at most 21GB of temporary files space. Should I set temp_buffer to > 21GB? > > Here is the explain you requested with work_mem set to 6GB: > > flows=# set work_mem='6GB'; > > SET > > flows=# explain (analyze, buffers) SELECT DISTINCT > > srcaddr, > > dstaddr, > > dstport, > > COUNT(*) AS conversation, > > SUM(doctets) / 1024 / 1024 AS mbytes > > FROM > > flowscompact, > > mynetworks > > WHERE > > mynetworks.ipaddr >>= flowscompact.srcaddr > > AND dstaddr IN > > ( > > SELECT > > dstaddr > > FROM > > dstexterne > > ) > > GROUP BY > > srcaddr, > > dstaddr, > > dstport > > ORDER BY > > mbytes DESC LIMIT 50; > > > QUERY PLAN > > > ------------------------------------------------------------ > ------------------------------------------------------------ > ------------------------------------------------------------------------ > > Limit (cost=48135680.07..48135680.22 rows=50 width=52) (actual > time=2227678.196..2227678.223 rows=50 loops=1) > > Buffers: shared hit=728798038 read=82974833, temp read=381154 > written=381154 > > -> Unique (cost=48135680.07..48143613.62 rows=2644514 width=52) > (actual time=2227678.194..2227678.217 rows=50 loops=1) > > Buffers: shared hit=728798038 read=82974833, temp read=381154 > written=381154 > > -> Sort (cost=48135680.07..48137002.33 rows=2644514 width=52) > (actual time=2227678.192..2227678.202 rows=50 loops=1) > > Sort Key: (((sum(flows.doctets) / '1024'::numeric) / > '1024'::numeric)) DESC, flows.srcaddr, flows.dstaddr, flows.dstport, > (count(*)) > > Sort Method: quicksort Memory: 654395kB > > Buffers: shared hit=728798038 read=82974833, temp > read=381154 written=381154 > > -> GroupAggregate (cost=48059426.65..48079260.50 > rows=2644514 width=52) (actual time=2167909.030..2211446.192 rows=5859671 > loops=1) > > Group Key: flows.srcaddr, flows.dstaddr, flows.dstport > > Buffers: shared hit=728798038 read=82974833, temp > read=381154 written=381154 > > -> Sort (cost=48059426.65..48060748.90 > rows=2644514 width=20) (actual time=2167896.815..2189107.205 rows=91745640 > loops=1) > > Sort Key: flows.srcaddr, flows.dstaddr, > flows.dstport > > Sort Method: external merge Disk: 3049216kB > > Buffers: shared hit=728798038 read=82974833, > temp read=381154 written=381154 > > -> Gather (cost=30060688.07..48003007.07 > rows=2644514 width=20) (actual time=1268989.000..1991357.232 rows=91745640 > loops=1) > > Workers Planned: 12 > > Workers Launched: 12 > > Buffers: shared hit=728798037 > read=82974833 > > -> Hash Semi Join > (cost=30059688.07..47951761.31 rows=220376 width=20) (actual > time=1268845.181..2007864.725 rows=7057357 loops=13) > > Hash Cond: (flows.dstaddr = > flows_1.dstaddr) > > Buffers: shared hit=728795193 > read=82974833 > > -> Nested Loop > (cost=0.03..17891246.86 rows=220376 width=20) (actual > time=0.207..723790.283 rows=37910370 loops=13) > > Buffers: shared hit=590692229 > read=14991777 > > -> Parallel Seq Scan on > flows (cost=0.00..16018049.14 rows=55094048 width=20) (actual > time=0.152..566179.117 rows=45371630 loops=13) > > Buffers: shared > hit=860990 read=14991777 > > -> Index Only Scan using > mynetworks_ipaddr_idx on mynetworks (cost=0.03..0.03 rows=1 width=8) > (actual time=0.002..0.002 rows=1 loops=589831190) > > Index Cond: (ipaddr >>= > (flows.srcaddr)::ip4r) > > Heap Fetches: 0 > > Buffers: shared > hit=589831203 > > -> Hash > (cost=30059641.47..30059641.47 rows=13305 width=4) (actual > time=1268811.101..1268811.101 rows=3803508 loops=13) > > Buckets: 4194304 (originally > 16384) Batches: 1 (originally 1) Memory Usage: 166486kB > > Buffers: shared hit=138102964 > read=67983056 > > -> HashAggregate > (cost=30059561.64..30059601.56 rows=13305 width=4) (actual > time=1265248.165..1267432.083 rows=3803508 loops=13) > > Group Key: > flows_1.dstaddr > > Buffers: shared > hit=138102964 read=67983056 > > -> Nested Loop Anti > Join (cost=0.00..29729327.92 rows=660467447 width=4) (actual > time=0.389..1201072.707 rows=125838232 loops=13) > > Join Filter: > (mynetworks_1.ipaddr >> (flows_1.dstaddr)::ip4r) > > Rows Removed by > Join Filter: 503353617 > > Buffers: shared > hit=138102964 read=67983056 > > -> Seq Scan on > flows flows_1 (cost=0.00..17836152.73 rows=661128576 width=4) (actual > time=0.322..343152.274 rows=589831190 loops=13) > > Buffers: > shared hit=138102915 read=67983056 > > -> Materialize > (cost=0.00..1.02 rows=4 width=8) (actual time=0.000..0.000 rows=2 > loops=7667805470) > > Buffers: > shared hit=13 > > -> Seq > Scan on mynetworks mynetworks_1 (cost=0.00..1.01 rows=4 width=8) (actual > time=0.006..0.007 rows=4 loops=13) > > > Buffers: shared hit=13 > > Planning time: 0.941 ms > > Execution time: 2228345.171 ms > > (48 rows) > > > > With a work_mem at 6GB, I noticed that for the first 20 minutes the query > was running, the i/o wait was much lower, hovering aroun 3% then it jumped > 45% until almost the end of the query. > > > > flowscompact and dstexterne are actually views. I use views to simplify > query writing and to "abstract" queries that are use often in other > queries. flowscompact is a view built on table flows (having about 590 > million rows), it only keeps the most often used fields. > > flows=# \d+ flowscompact; > > View "public.flowscompact" > > Column | Type | Modifiers | Storage | Description > > -----------+--------------------------+-----------+---------+------------- > > flow_id | bigint | | plain | > > sysuptime | bigint | | plain | > > exaddr | ip4 | | plain | > > dpkts | integer | | plain | > > doctets | bigint | | plain | > > first | bigint | | plain | > > last | bigint | | plain | > > srcaddr | ip4 | | plain | > > dstaddr | ip4 | | plain | > > srcport | integer | | plain | > > dstport | integer | | plain | > > prot | smallint | | plain | > > tos | smallint | | plain | > > tcp_flags | smallint | | plain | > > timestamp | timestamp with time zone | | plain | > > View definition: > > SELECT flowstimestamp.flow_id, > > flowstimestamp.sysuptime, > > flowstimestamp.exaddr, > > flowstimestamp.dpkts, > > flowstimestamp.doctets, > > flowstimestamp.first, > > flowstimestamp.last, > > flowstimestamp.srcaddr, > > flowstimestamp.dstaddr, > > flowstimestamp.srcport, > > flowstimestamp.dstport, > > flowstimestamp.prot, > > flowstimestamp.tos, > > flowstimestamp.tcp_flags, > > flowstimestamp."timestamp" > > FROM flowstimestamp; > > mynetworks is a table having one column and 4 rows; it contains a list of > our network networks: > > flows=# select * from mynetworks; > > ipaddr > > ---------------- > > 192.168.0.0/24 > > 10.112.12.0/30 > > 10.112.12.4/30 > > 10.112.12.8/30 > > (4 row) > > flows=# \d+ mynetworks; > > Table "public.mynetworks" > > Column | Type | Modifiers | Storage | Stats target | Description > > --------+------+-----------+---------+--------------+------------- > > ipaddr | ip4r | | plain | | > > Indexes: > > "mynetworks_ipaddr_idx" gist (ipaddr) > > dstexterne is a view listing all the destination IPv4 addresses not inside > our network; it has one column and 3.8 million rows. > > flows=# \d+ dstexterne; > > View "public.dstexterne" > > Column | Type | Modifiers | Storage | Description > > ---------+------+-----------+---------+------------- > > dstaddr | ip4 | | plain | > > View definition: > > SELECT DISTINCT flowscompact.dstaddr > > FROM flowscompact > > LEFT JOIN mynetworks ON mynetworks.ipaddr >> > flowscompact.dstaddr::ip4r > > WHERE mynetworks.ipaddr IS NULL; > > Thanks! > > > > Charles > > > > Charles, > > > > Also, let’s try to simplify your query and see if it performs better. > > You are grouping by srcaddr, dstaddr, dstport, that makes DISTINCT not > needed. > > And after simplifying WHERE clause (let me know if the result is not what > you want), the query looks like: > > > > SELECT srcaddr, dstaddr, dstport, > > COUNT(*) AS conversation, > > SUM(doctets) / 1024 / 1024 AS mbytes > > FROM flowscompact > > WHERE srcaddr IN (SELECT ipaddr FROM mynetworks) > > AND dstaddr NOT IN (SELECT ipaddr FROM mynetworks) > > GROUP BY srcaddr, dstaddr, dstport > > ORDER BY mbytes DESC > > LIMIT 50; > > > > Now, you didn’t provide the definition of flowstimestamp table. > > If this table doesn’t have an index on (srcaddr, dstaddr, dstport) > creating one should help (I think). > > > > Igor > > > > > > > -- Charles Nadeau Ph.D. http://charlesnadeau.blogspot.com/