Re: IoT/sensor data and B-Tree page splits

2019-08-26 Thread Arcadiy Ivanov

On 8/26/19 6:48 PM, Peter Geoghegan wrote:

Such data often consists of timestamps from a large number
of low cost devices -- event data that arrives *approximately* in
order. This is more or less the problem that the TimescaleDB extension
targets, so it seems likely that a fair number of users care about
getting it right, even if they don't know it.


This problem is not limited to IoT; it applies to real-time financial 
transaction ingestion as well.
I found BRIN indexes to work exceptionally well for that, while a B-tree 
takes an enormous amount of space with either no performance difference 
or the win going to BRIN.
The situation gets even worse when a B-tree index is subjected to 
identical tuples, which often happens when you get an avalanche of 
timestamps that are within less than 1 ms of each other (a frequent TS 
rounding resolution).
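As a concrete sketch of the comparison (table and column names here are made up for illustration, not from any real deployment):

```sql
-- Hypothetical append-mostly ingestion table.
CREATE TABLE ticks (
    ts   timestamptz NOT NULL,
    data jsonb       NOT NULL
);

-- B-tree on ts: precise lookups, but large and page-split-heavy when fed
-- nearly-ordered (and heavily duplicated) timestamps.
CREATE INDEX ticks_ts_btree ON ticks USING btree (ts);

-- BRIN on ts: one tiny summary per 128-page range (the default), effective
-- because ts correlates with physical row order in append-only data.
CREATE INDEX ticks_ts_brin ON ticks USING brin (ts);

-- Compare on-disk footprints after loading:
SELECT pg_size_pretty(pg_relation_size('ticks_ts_btree')) AS btree_size,
       pg_size_pretty(pg_relation_size('ticks_ts_brin'))  AS brin_size;
```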


--
Arcadiy Ivanov
arca...@gmail.com | @arcivanov | https://ivanov.biz
https://github.com/arcivanov





Re: IoT/sensor data and B-Tree page splits

2019-08-26 Thread Arcadiy Ivanov

On 8/26/19 7:49 PM, Peter Geoghegan wrote:

Not surprising, since the TPC-E benchmark models a financial trading
application.



The good news there is that that will almost certainly be a lot better
in Postgres 12. TPC-E also has a number of very low cardinality
indexes, despite being an OLTP benchmark.



But apart from TPC-E and having to perform well on it, is there any 
practical real-world usefulness in trying to have a B-tree index on 
TS-based data just to have a PK on it, as opposed to having a BRIN index 
on a TS field and calling it a day?
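To spell out the alternative I have in mind (with hypothetical names): keep whatever surrogate PK the application needs and put only a BRIN on the TS field.

```sql
-- Hypothetical schema: the PK stays a plain bigserial B-tree; the
-- timestamp column gets only a BRIN.
CREATE TABLE trades (
    id      bigserial   PRIMARY KEY,
    ts      timestamptz NOT NULL,
    payload jsonb       NOT NULL
);

-- pages_per_range trades index size for summary granularity (default 128).
CREATE INDEX trades_ts_brin ON trades USING brin (ts)
    WITH (pages_per_range = 32);

-- A typical range scan the BRIN index serves:
SELECT count(*) FROM trades
WHERE ts >= now() - interval '1 day';
```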







Re: Optimizer misses big in 10.4 with BRIN index

2018-08-07 Thread Arcadiy Ivanov
 Buffers: shared hit=56
 Planning time: 0.515 ms
 Execution time: 7339.772 ms
(19 rows)

**  "Rows Removed by Index Recheck: 1639294" - OK, statistics are 
pooched post-clustering. Let's reanalyze.

===
db0=# ANALYZE VERBOSE schema0.table0;
INFO:  analyzing "schema0.table0"
INFO:  "table0": scanned 3 of 1048881 pages, containing 12 live 
rows and 0 dead rows; 3 rows in sample, 4195524 estimated total rows

ANALYZE


**  Let's try this again...
===
db0=# EXPLAIN (analyze, verbose, costs, buffers) SELECT ctid, 
(data->>'date')::numeric as date FROM schema0.table0 WHERE 
(data->>'date')::numeric >= '1531267200'

    AND (data->>'date')::numeric <= '1531353600'
    ORDER BY ctid, (data->>'date')::numeric
    DESC LIMIT 1000;
QUERY PLAN
-------------------------------------------------------------------------------
 Limit  (cost=5444.33..5445.71 rows=555 width=38) (actual 
time=6533.824..6533.927 rows=1000 loops=1)

   Output: ctid, (((data ->> 'date'::text))::numeric)
   Buffers: shared hit=410808
   ->  Sort  (cost=5444.33..5445.71 rows=555 width=38) (actual 
time=6533.822..6533.862 rows=1000 loops=1)

 Output: ctid, (((data ->> 'date'::text))::numeric)
 Sort Key: table0.ctid, (((table0.data ->> 
'date'::text))::numeric) DESC

 Sort Method: top-N heapsort  Memory: 95kB
 Buffers: shared hit=410808
 ->  Bitmap Heap Scan on schema0.table0 (cost=162.22..5419.03 
rows=555 width=38) (actual time=59.316..6532.913 rows=3713 loops=1)

   Output: ctid, ((data ->> 'date'::text))::numeric
                   Recheck Cond: ((((table0.data ->> 'date'::text))::numeric >= '1531267200'::numeric) AND (((table0.data ->> 'date'::text))::numeric <= '1531353600'::numeric))

   Rows Removed by Index Recheck: 1639294
   Heap Blocks: lossy=410752
   Buffers: shared hit=410808
   ->  Bitmap Index Scan on date_idx (cost=0.00..162.08 
rows=1339 width=0) (actual time=30.859..30.859 rows=4107520 loops=1)
                     Index Cond: ((((table0.data ->> 'date'::text))::numeric >= '1531267200'::numeric) AND (((table0.data ->> 'date'::text))::numeric <= '1531353600'::numeric))

 Buffers: shared hit=56
 Planning time: 1.100 ms
 Execution time: 6534.026 ms
(19 rows)

** This cannot be possible! The table is supposed to be clustered on the 
very expression the BRIN index is defined on! Let's look at the data.

===
db0=# SELECT ctid, (data->>'date')::numeric as date FROM schema0.table0 
WHERE (data->>'date')::numeric >= '1531267200'

    AND (data->>'date')::numeric <= '1531353600'
    ORDER BY ctid, (data->>'date')::numeric
    DESC LIMIT 1000;
   ctid    |    date
-----------+------------
 (4155,2)  | 1531281600
...
 (4159,1)  | 1531281600
 (4188,4)  | 1531281600
 (4189,1)  | 1531281600
 (4189,2)  | 1531281600
 (4189,3)  | 1531281600
 (4189,4)  | 1531281600
 (4231,4)  | 1531281600
 (4247,3)  | 1531281600
 (4247,4)  | 1531281600
 (4248,1)  | 1531281600
 (4248,2)  | 1531281600
 (5367,3)  | 1531281600
...
 (5368,4)  | 1531281600
 (5376,3)  | 1531281600
...
 (5441,1)  | 1531281600
 (5518,2)  | 1531281600
 (5518,3)  | 1531281600
 (5518,4)  | 1531281600
 (7836,1)  | 1531281600
 (7836,2)  | 1531281600
 (7851,3)  | 1531281600
...
 (7913,4)  | 1531281600
 (7984,4)  | 1531281600
 (8014,1)  | 1531281600
 (8014,2)  | 1531281600
 (8014,3)  | 1531281600
 (8014,4)  | 1531281600
 (8035,1)  | 1531281600
 (8035,2)  | 1531281600
 (8035,3)  | 1531281600
 (8035,4)  | 1531281600
 (9190,3)  | 1531281600
 (9190,4)  | 1531281600
 (10307,4) | 1531281600
...
 (10349,3) | 1531281600
 (10371,3) | 1531281600
...
 (10392,2) | 1531281600
 (10445,4) | 1531281600
 (10446,1) | 1531281600
 (10446,2) | 1531281600
 (11580,3) | 1531281600
...
 (11586,2) | 1531281600
 (11595,3) | 1531281600
...
 (11598,2) | 1531281600
 (11600,1) | 1531281600
...
 (11601,2) | 1531281600
 (11609,1) | 1531281600
...
 (11621,2) | 1531281600
 (11621,3) | 1531281600
 (11664,3) | 1531281600
...
 (11684,3) | 1531281600
 (11726,2) | 1531281600
...
 (11770,4) | 1531281600
 (12891,2) | 1531281600
 (14057,1) | 1531281600
...
 (14058,2) | 1531281600
 (14064,3) | 1531281600
...
 (14071,3) | 1531281600
(1000 rows)



The issue seems to be with both clustering and BRIN stats.

Fillfactor on the table is not a factor: it can be 100 or 75; I'm able 
to reproduce (and fail to reproduce) the problem with both.
Clustering on `date_ord_idx` can be done with ANALYZE and without; in 
some cases PG picks a scan and sort, sometimes it uses the index 
regardless of stats.
Sometimes the table is properly clustered, sometime
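For completeness: when the range summaries do go stale after a CLUSTER or a bulk load, re-summarizing is one recovery path. A sketch using this thread's object names (whether `date_idx` lives in `schema0` is my assumption; verify against your own schema):

```sql
-- Refresh planner stats; plain VACUUM also summarizes unsummarized ranges.
VACUUM ANALYZE schema0.table0;

-- Explicitly summarize any page ranges the BRIN does not yet cover
-- (function available since PostgreSQL 9.6):
SELECT brin_summarize_new_values('schema0.date_idx'::regclass);

-- Or rebuild the summaries from scratch:
REINDEX INDEX schema0.date_idx;
```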

Optimizer misses big in 10.4 with BRIN index

2018-07-25 Thread Arcadiy Ivanov
 Buffers: shared hit=839 read=187353
 ->  Gather  (cost=1000.00..617614.68 rows=84 width=1179) 
(actual time=3.262..4819.362 rows=611 loops=1)

   Output: data
   Workers Planned: 2
   Workers Launched: 2
   Buffers: shared hit=839 read=187353
   ->  Parallel Seq Scan on schema0_lab.data_table 
(cost=0.00..616606.28 rows=35 width=1179) (actual time=6.135..4814.826 
rows=204 loops=3)

 Output: data
                   Filter: (((data_table.segment)::text = 'pb1'::text) AND ((data_table.data ->> 'tradeStatus'::text) = 'Replaced'::text) AND (((data_table.data ->> 'tradeDate'::text))::numeric >= '1531267200'::numeric) AND (((data_table.data ->> 'tradeDate'::text))::numeric <= '1531353600'::numeric))
 Rows Removed by Filter: 1125498
 Buffers: shared hit=2400 read=561439
                   Worker 0: actual time=11.414..4812.744 rows=220 loops=1
                     Buffers: shared hit=775 read=187333
                   Worker 1: actual time=4.249..4813.264 rows=220 loops=1
                     Buffers: shared hit=786 read=186753
 Planning time: 0.232 ms
 Execution time: 4823.412 ms
(24 rows)

schema0=# \d schema0_lab.data_table
                             Table "schema0_lab.data_table"
  Column  |          Type          | Collation | Nullable |                      Default
----------+------------------------+-----------+----------+----------------------------------------------------
 id       | integer                |           | not null | nextval('schema0_lab.data_table_id_seq'::regclass)
 address  | character varying(128) |           | not null |
 segment  | character varying(128) |           | not null |
 data     | jsonb                  |           | not null |
Indexes:
    "data_table_pkey" PRIMARY KEY, btree (id)
    "tradedate_idx" brin (((data ->> 'tradeDate'::text)::numeric)) WITH (autosummarize='true')

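One caveat with an expression BRIN index like `tradedate_idx` above: the planner only considers it when the query repeats the indexed expression exactly. A sketch against this schema:

```sql
-- Matches the indexed expression ((data ->> 'tradeDate')::numeric);
-- comparing the raw text value instead would not use the index.
EXPLAIN (ANALYZE, BUFFERS)
SELECT data
FROM schema0_lab.data_table
WHERE (data ->> 'tradeDate')::numeric >= 1531267200
  AND (data ->> 'tradeDate')::numeric <= 1531353600;
```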
