Re: JSONB index not in use, but is TOAST the real cause of slow query?

2022-05-28 Thread Shaheed Haque
Adrian, Tom, thanks for the input. Based on that, it occurred to me to
create some "dummy" rows and - almost magically - the index kicks in!
Before is 2500ms:


Seq Scan on paiyroll_payrun  (cost=0.00..52.43 rows=17 width=32)
(actual time=53.127..2567.024 rows=104 loops=1)
  Filter: ((snapshot -> 'employee'::text) ? '2209'::text)
  Rows Removed by Filter: 1835
Planning Time: 0.060 ms
Execution Time: 2567.044 ms
(5 rows)


After is 300ms:


Bitmap Heap Scan on paiyroll_payrun  (cost=36.11..64.67 rows=14 width=32)
  (actual time=4.189..311.932 rows=104 loops=1)
  Recheck Cond: ((snapshot -> 'employee'::text) ? '2209'::text)
  Rows Removed by Index Recheck: 1
  Heap Blocks: exact=8
  ->  Bitmap Index Scan on idx1  (cost=0.00..36.10 rows=14 width=0)
        (actual time=0.087..0.087 rows=105 loops=1)
        Index Cond: ((snapshot -> 'employee'::text) ? '2209'::text)
Planning Time: 0.167 ms
Execution Time: 311.962 ms
(8 rows)


Woot!

Unfortunately, the real query, which I think should behave very
similarly, is still at the several-seconds level despite using the
index. Before, 3600ms:


SELECT "paiyroll_payrun"."actual_t"
  FROM "paiyroll_payrun"
 WHERE ("paiyroll_payrun"."company_id" = 173 AND
("paiyroll_payrun"."snapshot" -> 'employee') ? '16376'
   AND NOT (("paiyroll_payrun"."snapshot" #>
ARRAY['employee','16376','last_run_of_employment']) = 'true'
   AND ("paiyroll_payrun"."snapshot" #>
ARRAY['employee','16376','pay_graph']) = '0'
   AND ("paiyroll_payrun"."snapshot" #>
ARRAY['employee','16376','state','employment','-1','2']) > '0'))
 ORDER BY "paiyroll_payrun"."actual_t" DESC
 LIMIT 1



Limit  (cost=31.33..31.33 rows=1 width=4)
       (actual time=3595.174..3595.176 rows=1 loops=1)
  ->  Sort  (cost=31.33..31.33 rows=3 width=4)
            (actual time=3595.174..3595.174 rows=1 loops=1)
        Sort Key: actual_t DESC
        Sort Method: top-N heapsort  Memory: 25kB
        ->  Bitmap Heap Scan on paiyroll_payrun  (cost=6.43..31.31 rows=3 width=4)
              (actual time=44.575..3595.082 rows=62 loops=1)
              Recheck Cond: (company_id = 173)
              Filter: (((snapshot -> 'employee'::text) ? '16376'::text) AND
                       (((snapshot #> '{employee,16376,last_run_of_employment}'::text[]) <> 'true'::jsonb) OR
                        ((snapshot #> '{employee,16376,pay_graph}'::text[]) <> '0'::jsonb) OR
                        ((snapshot #> '{employee,16376,state,employment,-1,2}'::text[]) <= '0'::jsonb)))
              Rows Removed by Filter: 242
              Heap Blocks: exact=9
              ->  Bitmap Index Scan on paiyroll_payrun_company_id_ce341888
                    (cost=0.00..6.43 rows=304 width=0) (actual time=0.013..0.013 rows=304 loops=1)
                    Index Cond: (company_id = 173)
Planning Time: 0.258 ms
Execution Time: 3595.195 ms


After, with the index in use, 2200ms:


Limit  (cost=30.92..30.93 rows=1 width=4)
       (actual time=2258.989..2258.990 rows=1 loops=1)
  ->  Sort  (cost=30.92..30.93 rows=1 width=4)
            (actual time=2258.988..2258.989 rows=1 loops=1)
        Sort Key: actual_t DESC
        Sort Method: top-N heapsort  Memory: 25kB
        ->  Bitmap Heap Scan on paiyroll_payrun  (cost=26.88..30.91 rows=1 width=4)
              (actual time=32.488..2258.891 rows=62 loops=1)
              Recheck Cond: ((company_id = 173) AND ((snapshot -> 'employee'::text) ? '16376'::text))
              Filter: (((snapshot #> '{employee,16376,last_run_of_employment}'::text[]) <> 'true'::jsonb) OR
                       ((snapshot #> '{employee,16376,pay_graph}'::text[]) <> '0'::jsonb) OR
                       ((snapshot #> '{employee,16376,state,employment,-1,2}'::text[]) <= '0'::jsonb))
              Heap Blocks: exact=5
              ->  BitmapAnd  (cost=26.88..26.88 rows=1 width=0)
                    (actual time=0.038..0.039 rows=0 loops=1)
                    ->  Bitmap Index Scan on paiyroll_payrun_company_id_ce341888
                          (cost=0.00..6.56 rows=304 width=0) (actual time=0.016..0.016 rows=304 loops=1)
                          Index Cond: (company_id = 173)
                    ->  Bitmap Index Scan on idx1  (cost=0.00..20.07 rows=9 width=0)
                          (actual time=0.021..0.021 rows=62 loops=1)
                          Index Cond: ((snapshot -> 'employee'::text) ? '16376'::text)
Planning Time: 0.245 ms
Execution Time: 2259.019 ms
===

IIUC, at the bottom, the indices are doing their thing, but a couple
of layers up, the "Bitmap Heap Scan" jumps from ~30ms to 2200ms. But I
cannot quite see why. Have I missed a needed index or what? I'm pretty
much a novice at SQL, so it is quite possible I've missed something
very basic.
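
One way to see where that time goes (a sketch, not something tried in the
thread): EXPLAIN's BUFFERS option reports how many blocks each node touches,
which includes the TOAST blocks read while the filter detoasts the large
snapshot values. For example, against a simplified form of the query above:

    -- Sketch: heavy "shared read" counts on the Bitmap Heap Scan node would
    -- point at detoasting of the snapshot column.
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT actual_t
      FROM paiyroll_payrun
     WHERE company_id = 173
       AND snapshot -> 'employee' ? '16376'
     ORDER BY actual_t DESC
     LIMIT 1;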

Thanks, Shaheed

On Sat, 28 May 2022 at 20:59, Tom Lane  wrote:
>
> Shaheed Haque  writes:
> > One last thought about TOAST. If the cost of the -> operator retrieving the
> > data cannot be obviated, is there any way to tweak how that works?
>
> The only thing that's readily tweakable is to disable data compression
> for the out-of-line values (see ALTER TABLE ... SET STORAGE, and note
> that you have to rebuild the table for already-stored values to be
> changed).  It seems unlikely that that will help you much though,
> since doing that would save CPU at the cost of more disk I/O, and
> it seems that the I/O side is your bottleneck already.

Re: Improve configurability for IO related behavior

2022-05-28 Thread Thomas Munro
On Sun, May 29, 2022 at 4:29 AM 浩辰 何  wrote:
> Furthermore, the results above are also related to the IO APIs supported by
> the OS. MySQL supports synchronized IO and Linux libaio. It seems that
> PostgreSQL only supports synchronized IO, so should we support more IO
> engines, like io_uring, which has become very popular in recent years?

Hi Haochen,

There is an active project to bring these things to PostgreSQL.
https://wiki.postgresql.org/wiki/AIO has some information and links.
The short version is that there is a development patch set to add
these GUCs:

io_method=worker,io_uring,posix_aio,iocp,...
io_data_direct=on,off
io_wal_direct=on,off

It also adds a bunch of specialised logic that knows how to initiate
IO in key places (scans, writeback, recovery, ...), because it's not
enough to just turn off kernel I/O buffering, we also have to do all
the work the kernel is doing for us or performance will be terrible.
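
For concreteness, a sketch of what enabling these might look like if the
patch set lands with the GUC names listed above (hypothetical: none of these
settings exist in a released PostgreSQL version, and the names could change):

    ALTER SYSTEM SET io_method = 'io_uring';
    ALTER SYSTEM SET io_data_direct = on;
    ALTER SYSTEM SET io_wal_direct = on;
    -- presumably followed by a server restart for settings of this kind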




Re: JSONB index not in use, but is TOAST the real cause of slow query?

2022-05-28 Thread Tom Lane
Shaheed Haque  writes:
> One last thought about TOAST. If the cost of the -> operator retrieving the
> data cannot be obviated, is there any way to tweak how that works?

The only thing that's readily tweakable is to disable data compression
for the out-of-line values (see ALTER TABLE ... SET STORAGE, and note
that you have to rebuild the table for already-stored values to be
changed).  It seems unlikely that that will help you much though,
since doing that would save CPU at the cost of more disk I/O, and
it seems that the I/O side is your bottleneck already.  It *would*
help if jsonb had logic for partial fetches, because that'd require
uncompressed storage to work.  But AFAICS that's not there yet.
I distinctly recall that that was planned for when we were designing
jsonb's on-disk layout, but I see no code actually doing anything
of that sort ... the -> operator certainly doesn't.

regards, tom lane
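
For illustration, a sketch of what that suggestion looks like against the
table from this thread (the UPDATE used to force re-storage of existing
values is an assumption for the sketch, not something prescribed above):

    -- Keep the JSONB values out of line but uncompressed.
    ALTER TABLE paiyroll_payrun ALTER COLUMN snapshot SET STORAGE EXTERNAL;

    -- Existing values keep their old storage until rewritten; building a new
    -- (equal) datum for each row re-stores it under the new setting.
    UPDATE paiyroll_payrun SET snapshot = snapshot || '{}'::jsonb;
    VACUUM paiyroll_payrun;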




Re: JSONB index not in use, but is TOAST the real cause of slow query?

2022-05-28 Thread Adrian Klaver

On 5/28/22 12:38, Shaheed Haque wrote:

Tom,

Thanks for the considered advice and insights. My takeaway is that
based on what I've said, you are mostly unsurprised by the results I
see. In the longer term, the number of rows will increase but I will
have to ponder options for the immediate future. I'll have a play with
the knobs you suggested and will report back with anything of note.

One last thought about TOAST. If the cost of the -> operator retrieving the
data cannot be obviated, is there any way to tweak how that works?


See:

https://www.postgresql.org/docs/current/storage-toast.html

and from there:

https://www.postgresql.org/docs/current/sql-altertable.html

Search for SET STORAGE.

It might help with (from Tom Lane):

"Unfortunately the planner doesn't account
for detoasting costs when making such estimates, ..."

but probably not much with:

" ...because anything at all that you do with
that big JSONB column is going to be expensive.  (Another thing
that's been on the to-do list for awhile is enabling partial
retrieval of large JSONB values, but AFAIK that hasn't happened
yet either.)"



Thanks, Shaheed




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: JSONB index not in use, but is TOAST the real cause of slow query?

2022-05-28 Thread Shaheed Haque
Tom,

Thanks for the considered advice and insights. My takeaway is that
based on what I've said, you are mostly unsurprised by the results I
see. In the longer term, the number of rows will increase but I will
have to ponder options for the immediate future. I'll have a play with
the knobs you suggested and will report back with anything of note.

One last thought about TOAST. If the cost of the -> operator retrieving the
data cannot be obviated, is there any way to tweak how that works?

Thanks, Shaheed

On Sat, 28 May 2022 at 19:41, Tom Lane  wrote:
>
> Shaheed Haque  writes:
> > ==
> > foo =# explain analyse SELECT snapshot ->'company'->'legal_name' FROM
> > paiyroll_payrun WHERE snapshot ->'employee' ? '2209';
> >   QUERY PLAN
> > -
> > Seq Scan on paiyroll_payrun  (cost=0.00..29.13 rows=9 width=32)
> > (actual time=50.185..2520.983 rows=104 loops=1)
> >   Filter: ((snapshot -> 'employee'::text) ? '2209'::text)
> >   Rows Removed by Filter: 835
> > Planning Time: 0.075 ms
> > Execution Time: 2521.004 ms
> > (5 rows)
> > ==
>
> > So, over 2.5 seconds to sequentially scan ~850 rows. Am I right to
> > presume the INDEX is not used because of the number of rows? Is there
> > a way to verify that?
>
> You could do "set enable_seqscan = off" and see if the EXPLAIN
> results change.  My guess is that you'll find that the indexscan
> alternative is costed at a bit more than 29.13 units and thus
> the planner thinks seqscan is cheaper.
>
> > And how can I understand the dreadful amount of
> > time (of course, this is just on my dev machine, but still...)?
>
> In the seqscan case, the -> operator is going to retrieve the whole
> JSONB value from each row, which of course is pretty darn expensive
> if it's a few megabytes.  Unfortunately the planner doesn't account
> for detoasting costs when making such estimates, so it doesn't
> realize that the seqscan case is going to be expensive.  (Fixing
> that has been on the to-do list for a long time, but we seldom
> see cases where it matters this much, so it hasn't gotten done.)
>
> The problem would likely go away by itself if your table had more
> than a few hundred rows, but if you don't anticipate that happening
> then you need some sort of band-aid.  I don't recommend turning
> enable_seqscan off as a production fix; it'd likely have negative
> effects on other queries.  Personally I'd experiment with reducing
> random_page_cost a bit to see if I could encourage use of the index
> that way.  The default value of 4.0 is tuned for spinning-rust
> storage and is not too appropriate for a lot of modern hardware,
> so there's probably room to fix it that way without detuning your
> setup for other queries.
>
> You should probably also rethink whether you really want to store
> your data in this format, because anything at all that you do with
> that big JSONB column is going to be expensive.  (Another thing
> that's been on the to-do list for awhile is enabling partial
> retrieval of large JSONB values, but AFAIK that hasn't happened
> yet either.)
>
> regards, tom lane




Re: JSONB index not in use, but is TOAST the real cause of slow query?

2022-05-28 Thread Tom Lane
Shaheed Haque  writes:
> ==
> foo =# explain analyse SELECT snapshot ->'company'->'legal_name' FROM
> paiyroll_payrun WHERE snapshot ->'employee' ? '2209';
>   QUERY PLAN
> -
> Seq Scan on paiyroll_payrun  (cost=0.00..29.13 rows=9 width=32)
> (actual time=50.185..2520.983 rows=104 loops=1)
>   Filter: ((snapshot -> 'employee'::text) ? '2209'::text)
>   Rows Removed by Filter: 835
> Planning Time: 0.075 ms
> Execution Time: 2521.004 ms
> (5 rows)
> ==

> So, over 2.5 seconds to sequentially scan ~850 rows. Am I right to
> presume the INDEX is not used because of the number of rows? Is there
> a way to verify that?

You could do "set enable_seqscan = off" and see if the EXPLAIN
results change.  My guess is that you'll find that the indexscan
alternative is costed at a bit more than 29.13 units and thus
the planner thinks seqscan is cheaper.
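
For example, a sketch using the query from the first message:

    -- Discourage seqscans for this session only, then compare the plans.
    SET enable_seqscan = off;
    EXPLAIN ANALYZE
    SELECT snapshot -> 'company' -> 'legal_name'
      FROM paiyroll_payrun
     WHERE snapshot -> 'employee' ? '2209';
    RESET enable_seqscan;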

> And how can I understand the dreadful amount of
> time (of course, this is just on my dev machine, but still...)?

In the seqscan case, the -> operator is going to retrieve the whole
JSONB value from each row, which of course is pretty darn expensive
if it's a few megabytes.  Unfortunately the planner doesn't account
for detoasting costs when making such estimates, so it doesn't
realize that the seqscan case is going to be expensive.  (Fixing
that has been on the to-do list for a long time, but we seldom
see cases where it matters this much, so it hasn't gotten done.)

The problem would likely go away by itself if your table had more
than a few hundred rows, but if you don't anticipate that happening
then you need some sort of band-aid.  I don't recommend turning
enable_seqscan off as a production fix; it'd likely have negative
effects on other queries.  Personally I'd experiment with reducing
random_page_cost a bit to see if I could encourage use of the index
that way.  The default value of 4.0 is tuned for spinning-rust
storage and is not too appropriate for a lot of modern hardware,
so there's probably room to fix it that way without detuning your
setup for other queries.
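
A sketch of experimenting with that (the 1.1 value is an illustrative
assumption for SSD-class storage, not a figure from this thread):

    -- Try a lower random_page_cost in the current session first.
    SET random_page_cost = 1.1;
    EXPLAIN ANALYZE
    SELECT snapshot -> 'company' -> 'legal_name'
      FROM paiyroll_payrun
     WHERE snapshot -> 'employee' ? '2209';

    -- If plans improve across the board, persist it for the cluster.
    ALTER SYSTEM SET random_page_cost = 1.1;
    SELECT pg_reload_conf();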

You should probably also rethink whether you really want to store
your data in this format, because anything at all that you do with
that big JSONB column is going to be expensive.  (Another thing
that's been on the to-do list for awhile is enabling partial
retrieval of large JSONB values, but AFAIK that hasn't happened
yet either.)

regards, tom lane




JSONB index not in use, but is TOAST the real cause of slow query?

2022-05-28 Thread Shaheed Haque
Hi,

I have a database table with a modest number of rows (<1000) but where
one column in the table is a JSONB "snapshot" which can be a few MB in
size. Generally, this is a great fit for the read-write access
patterns involved, but there is a read-query which is VERY slow.

I've searched via Google and in this mailing list archive and found
lots of material, but nothing that explains what is going on...or how
to fix it. The table columns look like this:

...a primary key...
...various other keys including a couple of FKs...
snapshot JSONB

and before I did anything, the indices looked like this:

   "paiyroll_payrun_pkey" PRIMARY KEY, btree (process_ptr_id)
   "paiyroll_payrun_company_id_ce341888" btree (company_id)
   "paiyroll_payrun_schedule_id_1593f55f" btree (schedule_id)

The data in one row's "snapshot" looks a bit like this:

{
    "stuff": {},
    "more other stuff": {},
    "employee": {
        "1234": {},
        "56789": {}
    }
}

The query that is slow can be approximated like this:

   SELECT snapshot ->'stuff'->'item' FROM paiyroll_payrun WHERE
snapshot ->'employee' ? '2209';

When I add this index:

CREATE INDEX idx1 ON paiyroll_payrun USING gin ((snapshot ->'employee'));

the analyser says this:

==
foo =# explain analyse SELECT snapshot ->'company'->'legal_name' FROM
paiyroll_payrun WHERE snapshot ->'employee' ? '2209';
  QUERY PLAN
-
Seq Scan on paiyroll_payrun  (cost=0.00..29.13 rows=9 width=32)
(actual time=50.185..2520.983 rows=104 loops=1)
  Filter: ((snapshot -> 'employee'::text) ? '2209'::text)
  Rows Removed by Filter: 835
Planning Time: 0.075 ms
Execution Time: 2521.004 ms
(5 rows)
==

So, over 2.5 seconds to sequentially scan ~850 rows. Am I right to
presume the INDEX is not used because of the number of rows? Is there
a way to verify that? And how can I understand the dreadful amount of
time (of course, this is just on my dev machine, but still...)? Is
there a way to see/tweak what TOAST costs or indeed to confirm if it
is even in use?
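
For example, a sketch of checking this with the built-in size functions
(table and column names as above):

    -- How large are the stored snapshot values?
    SELECT pg_column_size(snapshot) AS stored_bytes,
           octet_length(snapshot::text) AS text_bytes
      FROM paiyroll_payrun
     ORDER BY stored_bytes DESC
     LIMIT 5;

    -- Does the table have a TOAST relation ("-" would mean it does not)?
    SELECT reltoastrelid::regclass AS toast_table
      FROM pg_class
     WHERE relname = 'paiyroll_payrun';

    -- Heap size vs. total size including TOAST.
    SELECT pg_size_pretty(pg_relation_size('paiyroll_payrun')) AS heap_only,
           pg_size_pretty(pg_table_size('paiyroll_payrun'))    AS incl_toast;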

Any help appreciated.

Thanks, Shaheed




Improve configurability for IO related behavior

2022-05-28 Thread 浩辰 何
Dear developers:

I have some suggestions on PostgreSQL's IO behavior. I am not sure my opinions 
are right. Many thanks for your time!

It is documented that:
wal_sync_method (enum): The open_* options also use O_DIRECT if available.
Should PostgreSQL consider making O_DIRECT configurable? In MySQL,
innodb_flush_method controls whether or not O_DIRECT is used.
I did some benchmarking of O_DIRECT in MySQL, and I found that using
O_DIRECT can sometimes hurt performance.
https://ibb.co/BVWnwnh
(benchmark chart "003": https://i.ibb.co/1fyzbzC/003.png)

Comparing the blue (buffered IO) and green (direct IO) lines, I find that
buffered IO is faster. So I think having such a parameter in PostgreSQL
would be reasonable.

On the other hand, wal_sync_method only controls how WAL is written to
storage, while for data files I notice that PostgreSQL uses
sync_file_range(2). So should we also make that configurable? I also find
that on some systems, open with O_SYNC is much faster than write + fsync:

https://ibb.co/f1VsCC1
(benchmark chart "Picture1": https://i.ibb.co/gjbxDDj/Picture1.png)

Furthermore, the results above are also related to the IO APIs supported by
the OS. MySQL supports synchronized IO and Linux libaio. It seems that
PostgreSQL only supports synchronized IO, so should we support more IO
engines, like io_uring, which has become very popular in recent years?
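
As an aside, wal_sync_method can be changed without a restart, and the
pg_test_fsync utility shipped with PostgreSQL benchmarks the available sync
methods on a given system, which bears on the open+O_SYNC versus write+fsync
comparison above. A sketch (the open_datasync value is just an example):

    SHOW wal_sync_method;                               -- e.g. fdatasync on Linux
    ALTER SYSTEM SET wal_sync_method = 'open_datasync'; -- example value only
    SELECT pg_reload_conf();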

Thanks & best,
Haochen