Re: Printing LSN made easy

2021-01-19 Thread Ashutosh Bapat
On Wed, Jan 20, 2021 at 11:55 AM Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> wrote:

> On 2020-11-27 11:40, Ashutosh Bapat wrote:
> > The solution seems to be simple though. In the attached patch, I have
> > added two macros
> > #define LSN_FORMAT "%X/%X"
> > #define LSN_FORMAT_ARGS(lsn) (uint32) ((lsn) >> 32), (uint32) (lsn)
> >
> > which can be used instead.
>
> It looks like we are not getting any consensus on this approach.  One
> reduced version I would consider is just the second part, so you'd write
> something like
>
>  snprintf(lsnchar, sizeof(lsnchar), "%X/%X",
>   LSN_FORMAT_ARGS(lsn));
>
> This would still reduce notational complexity quite a bit but avoid any
> funny business with the format strings.
>

Thanks for looking into this. I would like to keep both LSN_FORMAT and
LSN_FORMAT_ARGS, but with a note that the former cannot be used in elog() or
in messages which require localization. We have many other places doing
snprintf() and similar things, which could use LSN_FORMAT. If we do so, the
functions that output the string representation will no longer be needed, so
they can be removed.
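
To make the intended usage concrete, here is a small self-contained sketch
(the names follow the proposal in this thread; this is only an illustration,
not the final patch):

```c
#include <stdio.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;    /* stand-in for PostgreSQL's XLogRecPtr */

/* Proposed helpers, as sketched in this thread */
#define LSN_FORMAT        "%X/%X"
#define LSN_FORMAT_ARGS(lsn)  (uint32_t) ((lsn) >> 32), (uint32_t) (lsn)

int
main(void)
{
	XLogRecPtr	lsn = ((XLogRecPtr) 0x1 << 32) | 0x2468ACE0;
	char		lsnchar[64];

	/* Fine for plain snprintf()/printf() calls ... */
	snprintf(lsnchar, sizeof(lsnchar), LSN_FORMAT, LSN_FORMAT_ARGS(lsn));
	printf("lsn = %s\n", lsnchar);	/* prints "lsn = 1/2468ACE0" */

	/*
	 * ... but a non-literal format like LSN_FORMAT cannot appear in
	 * elog()/ereport() messages that must be extracted for translation,
	 * which is why only LSN_FORMAT_ARGS with a literal "%X/%X" is
	 * uncontroversial.
	 */
	return 0;
}
```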

--
Best Wishes,
Ashutosh


Re: Discarding DISCARD ALL

2021-01-19 Thread Peter Eisentraut

On 2020-12-23 15:33, Simon Riggs wrote:

Poolers such as pgbouncer would then be able to connect transaction
mode pools by setting transaction_cleanup=on at time of connection,
avoiding any need to issue a server_reset_query, removing the DISCARD
ALL command from the normal execution path, while still achieving the
same thing.


PgBouncer does not send DISCARD ALL in transaction mode.  There is a 
separate setting to do that, but it's not the default, and it's more of 
a workaround for bad client code.  So I don't know if this feature would 
be of much use for PgBouncer.  Other connection poolers might have other 
opinions.






Re: Printing LSN made easy

2021-01-19 Thread Michael Paquier
On Wed, Jan 20, 2021 at 07:25:37AM +0100, Peter Eisentraut wrote:
> It looks like we are not getting any consensus on this approach.  One
> reduced version I would consider is just the second part, so you'd write
> something like
> 
> snprintf(lsnchar, sizeof(lsnchar), "%X/%X",
>  LSN_FORMAT_ARGS(lsn));
> 
> This would still reduce notational complexity quite a bit but avoid any
> funny business with the format strings.

That seems reasonable to me.  So +1.
--
Michael


signature.asc
Description: PGP signature


Re: Release SPI plans for referential integrity with DISCARD ALL

2021-01-19 Thread Peter Eisentraut

On 2021-01-13 09:47, yuzuko wrote:

But we are also considering another option to solve this problem, which
lets users release cached SPI plans for referential integrity, as well as
plain cached plans, with DISCARD ALL.  To do that, we added a new
function, RI_DropAllPreparedPlan(), which deletes all plans from
ri_query_cache, and modified DISCARD ALL to execute that function.


I don't have a comment on the memory management issue, but dropping all 
cached plans as part of DISCARD ALL seems a bit too extreme a solution.  
In the context of connection pooling, getting a new session with 
pre-cached plans seems like a good thing, and this change could have a 
performance impact without a practical way to opt out.






Re: a misbehavior of partition row movement (?)

2021-01-19 Thread Peter Eisentraut

On 2021-01-08 09:54, Amit Langote wrote:

I don't quite recall if the decision to implement it like this was
based on assuming that this is what users would like to see happen in
this case or the perceived difficulty of implementing it the other way
around, that is, of firing AFTER UPDATE triggers in this case.

I tried to look that up, but I couldn't find any discussion about this. Do you 
have any ideas in which thread that was handled?

It was discussed here:

https://www.postgresql.org/message-id/flat/CAJ3gD9do9o2ccQ7j7%2BtSgiE1REY65XRiMb%3DyJO3u3QhyP8EEPQ%40mail.gmail.com

It's a huge discussion, so you'll have to ctrl+f "trigger" to spot
relevant emails.  You might notice that the developers who
participated in that discussion gave various opinions and what we have
today got there as a result of a majority of them voting for the
current approach.  Someone also said this during the discussion:
"Regarding the trigger issue, I can't claim to have a terribly strong
opinion on this. I think that practically anything we do here might
upset somebody, but probably any halfway-reasonable thing we choose to
do will be OK for most people." So what we've got is that
"halfway-reasonable" thing, YMMV. :)


Could you summarize here what you are trying to do with respect to what 
was decided before?  I'm a bit confused looking through the patches you 
have posted.  The first patch you posted hard-coded FK trigger OIDs 
specifically; other patches talk about foreign key triggers in general, 
special-case internal triggers, or talk about all triggers.





Re: Boundary value check in lazy_tid_reaped()

2021-01-19 Thread Peter Eisentraut

On 2020-10-30 02:43, Masahiko Sawada wrote:

Using the integer set is very memory efficient (5MB vs. 114MB in the
base case) and there is no 1GB limitation. Looking at the execution
time, I had expected that using the integer set would be slower at
recording TIDs than using the simple array, but in this experiment there
is no big difference among the methods. Perhaps the result will vary if
vacuum needs to record many more dead tuple TIDs. From the results, I can
see a good improvement in the integer set case, and it is probably a good
start, but if we want to use it for lazy vacuum as well, we will need to
improve it so that it can be used on DSA for parallel vacuum.

I've attached the patch I used for the experiment that adds xx_vacuum
GUC parameter that controls the method of recording TIDs. Please note
that it doesn't support parallel vacuum.


How do you want to proceed here?  The approach of writing a wrapper for 
bsearch with a bound check sounded like a good start.  All the other ideas 
discussed here seem like larger projects and would probably be out of scope 
for this commit fest.
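
For what it's worth, the idea is roughly the following (my own sketch of the
bound-check wrapper, not the proposed patch):

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Illustration only: a bsearch() wrapper that rejects keys falling outside
 * the sorted array's bounds before doing the full binary search.  Assumes
 * "items" is sorted ascending according to cmp().
 */
static void *
bsearch_with_bound_check(const void *key, const void *items,
						 size_t nitems, size_t size,
						 int (*cmp) (const void *, const void *))
{
	if (nitems == 0)
		return NULL;

	/* key below the smallest or above the largest element: cannot match */
	if (cmp(key, items) < 0 ||
		cmp(key, (const char *) items + (nitems - 1) * size) > 0)
		return NULL;

	return bsearch(key, items, nitems, size, cmp);
}

static int
cmp_int(const void *a, const void *b)
{
	int			ia = *(const int *) a;
	int			ib = *(const int *) b;

	return (ia > ib) - (ia < ib);
}

int
main(void)
{
	int			dead[] = {3, 7, 42, 99};	/* sorted "dead TID" stand-ins */
	int			probe = 1;		/* below the range: rejected cheaply */

	printf("%s\n",
		   bsearch_with_bound_check(&probe, dead, 4, sizeof(int), cmp_int)
		   ? "found" : "not found");
	return 0;
}
```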





Re: list of extended statistics on psql

2021-01-19 Thread Tatsuro Yamada

Hi Tomas,

On 2021/01/20 11:35, Tatsuro Yamada wrote:

Apologies for all the extra work - I hadn't realized this flaw when pushing 
for showing more stuff :-(


Don't worry about it. We didn't notice the problem even when it was reviewed
by multiple people on -hackers. Let's keep moving forward. :-D

I'll include a regression test in the next patch.



I created the patches, and my test results on PG 10, 11, 12, and 14 are fine.

  0001:
- Fix query to use pg_statistic_ext only
- Replace statuses "required" and "built" with "defined"
- Remove the size columns
- Fix document
- Add schema name as a filter condition on the query

  0002:
- Fix all results of \dX
- Add new testcase by non-superuser

Please find attached files. :-D


Regards,
Tatsuro Yamada
From 1aac3df2af2f6c834ffab10ddd1be1dee5970eb3 Mon Sep 17 00:00:00 2001
From: Tatsuro Yamada 
Date: Wed, 20 Jan 2021 15:33:04 +0900
Subject: [PATCH 2/2] psql \dX regression test

Add a test by non-superuser
---
 src/test/regress/expected/stats_ext.out | 116 
 src/test/regress/sql/stats_ext.sql  |  37 ++
 2 files changed, 153 insertions(+)

diff --git a/src/test/regress/expected/stats_ext.out b/src/test/regress/expected/stats_ext.out
index f094731e32..0ff4e51055 100644
--- a/src/test/regress/expected/stats_ext.out
+++ b/src/test/regress/expected/stats_ext.out
@@ -1727,6 +1727,122 @@ INSERT INTO tststats.priv_test_tbl
 CREATE STATISTICS tststats.priv_test_stats (mcv) ON a, b
   FROM tststats.priv_test_tbl;
 ANALYZE tststats.priv_test_tbl;
+-- Check printing info about extended statistics by \dX
+create table stts_t1 (a int, b int);
+create statistics stts_1 (ndistinct) on a, b from stts_t1;
+create statistics stts_2 (ndistinct, dependencies) on a, b from stts_t1;
+create statistics stts_3 (ndistinct, dependencies, mcv) on a, b from stts_t1;
+create table stts_t2 (a int, b int, c int);
+create statistics stts_4 on b, c from stts_t2;
+create table stts_t3 (col1 int, col2 int, col3 int);
+create statistics stts_hoge on col1, col2, col3 from stts_t3;
+create schema stts_s1;
+create schema stts_s2;
+create statistics stts_s1.stts_foo on col1, col2 from stts_t3;
+create statistics stts_s2.stts_yama (dependencies, mcv) on col1, col3 from stts_t3;
+insert into stts_t1 select i,i from generate_series(1,100) i;
+analyze stts_t1;
+\dX
+                                          List of extended statistics
+  Schema  |          Name          |              Definition               | Ndistinct | Dependencies |   MCV   
+----------+------------------------+---------------------------------------+-----------+--------------+---------
+ public   | func_deps_stat         | a, b, c FROM functional_dependencies |           | defined      | 
+ public   | mcv_lists_arrays_stats | a, b, c FROM mcv_lists_arrays        |           |              | defined
+ public   | mcv_lists_bool_stats   | a, b, c FROM mcv_lists_bool          |           |              | defined
+ public   | mcv_lists_stats        | a, b, d FROM mcv_lists               |           |              | defined
+ public   | stts_1                 | a, b FROM stts_t1                    | defined   |              | 
+ public   | stts_2                 | a, b FROM stts_t1                    | defined   | defined      | 
+ public   | stts_3                 | a, b FROM stts_t1                    | defined   | defined      | defined
+ public   | stts_4                 | b, c FROM stts_t2                    | defined   | defined      | defined
+ public   | stts_hoge              | col1, col2, col3 FROM stts_t3        | defined   | defined      | defined
+ stts_s1  | stts_foo               | col1, col2 FROM stts_t3              | defined   | defined      | defined
+ stts_s2  | stts_yama              | col1, col3 FROM stts_t3              |           | defined      | defined
+ tststats | priv_test_stats        | a, b FROM tststats.priv_test_tbl     |           |              | defined
+(12 rows)
+
+\dX stts_?
+                       List of extended statistics
+ Schema |  Name  |    Definition     | Ndistinct | Dependencies |   MCV   
+--------+--------+-------------------+-----------+--------------+---------
+ public | stts_1 | a, b FROM stts_t1 | defined   |              | 
+ public | stts_2 | a, b FROM stts_t1 | defined   | defined      | 
+ public | stts_3 | a, b FROM stts_t1 | defined   | defined      | defined
+ public | stts_4 | b, c FROM stts_t2 | defined   | defined      | defined
+(4 rows)
+
+\dX *stts_hoge
+                               List of extended statistics
+ Schema |   Name    |          Definition           | Ndistinct | Dependencies |   MCV   
+--------+-----------+-------------------------------+-----------+--------------+---------
+ public | stts_hoge | col1, col2, col3 FROM stts_t3 | defined   | defined      | defined
+(1 row)
+
+\dX+
+  List of extended statistics
+  Schema  |   

Re: New IndexAM API controlling index vacuum strategies

2021-01-19 Thread Masahiko Sawada
On Wed, Jan 20, 2021 at 9:45 AM Peter Geoghegan  wrote:
>
> On Tue, Jan 19, 2021 at 2:57 PM Peter Geoghegan  wrote:
> > * Maybe it would be better if you just changed the definition such
> > that "MAXALIGN(SizeofHeapTupleHeader)" became "MAXIMUM_ALIGNOF", with
> > no other changes? (Some variant of this suggestion might be better,
> > not sure.)
> >
> > For some reason that feels a bit safer: we still have an "imaginary
> > tuple header", but it's just 1 MAXALIGN() quantum now. This is still
> > much less than the current 3 MAXALIGN() quantums (i.e. what
> > MaxHeapTuplesPerPage treats as the tuple header size). Do you think
> > that this alternative approach will be noticeably less effective
> > within vacuumlazy.c?
>
> BTW, I think that increasing MaxHeapTuplesPerPage will make it
> necessary to consider tidbitmap.c. Comments at the top of that file
> say that it is assumed that MaxHeapTuplesPerPage is about 256. So
> there is a risk of introducing performance regressions affecting
> bitmap scans here.
>
> Apparently some other DB systems make the equivalent of
> MaxHeapTuplesPerPage dynamically configurable at the level of heap
> tables. It usually doesn't matter, but it can matter with on-disk
> bitmap indexes, where the bitmap must be encoded from raw TIDs (this
> must happen before the bitmap is compressed -- there must be a simple
> mapping from every possible TID to some bit in a bitmap first). The
> item offset component of each heap TID is not usually very large, so
> there is a trade-off between keeping the representation of bitmaps
> efficient and not unduly restricting the number of distinct heap
> tuples on each heap page. I think that there might be a similar
> consideration here, in tidbitmap.c (even though it's not concerned
> about on-disk bitmaps).

That's a good point. With the patch, MaxHeapTuplesPerPage increases to
2042 with an 8k page, and to 8186 with a 32k page, whereas it's currently
291 with an 8k page and 1169 with a 32k page. So it is likely to be a
problem as you pointed out. If we change
"MAXALIGN(SizeofHeapTupleHeader)" to "MAXIMUM_ALIGNOF", it's 680 with an
8k page and 2728 with a 32k page, which seems much better.

The purpose of increasing MaxHeapTuplesPerPage in the patch is to let a
heap page accumulate more LP_DEAD line pointers. As I explained before,
because of MaxHeapTuplesPerPage we cannot calculate how many LP_DEAD line
pointers can be accumulated in the space reserved by fillfactor simply as
((the space reserved by fillfactor) / (size of a line pointer)). We need
to consider both how many line pointers are available for LP_DEAD and how
much space is available for LP_DEAD.

For example, suppose the tuple size is 50 bytes and fillfactor is 80.
Each page then has 1633 bytes (= (8192-24)*0.2) of free space reserved by
fillfactor, where 408 line pointers could fit. However, if we store 250
LP_DEAD line pointers in that space, the number of tuples that can be
stored on the page is only 41, even though we have 6534 bytes
(= (8192-24)*0.8) where 121 tuples (plus line pointers) can fit, because
MaxHeapTuplesPerPage is 291. In this case, where the tuple size is 50 and
fillfactor is 80, we can accumulate up to about 170 LP_DEAD line pointers
while storing 121 tuples. Increasing MaxHeapTuplesPerPage raises this 291
limit and lets us ignore it when calculating the maximum number of
LP_DEAD line pointers that can be accumulated on a single page.

An alternative approach would be to calculate it using the average
tuple size. I think if we know the tuple size, the maximum number of
LP_DEAD line pointers that can be accumulated in a single page is the
minimum of the following two formulas:

(1) MaxHeapTuplesPerPage - (((BLCKSZ - SizeOfPageHeaderData) *
(fillfactor/100)) / (sizeof(ItemIdData) + tuple_size)); // how many
line pointers are left over for LP_DEAD?

(2) ((BLCKSZ - SizeOfPageHeaderData) * ((100 - fillfactor)/100)) /
sizeof(ItemIdData); // how much space is available for LP_DEAD?
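
For concreteness, here is a tiny stand-alone sketch (hard-coded stand-ins for
the real PostgreSQL macros) that evaluates both formulas for the
50-byte-tuple, fillfactor-80 example above:

```c
#include <stdio.h>

int
main(void)
{
	const int	blcksz = 8192;
	const int	page_header = 24;	/* SizeOfPageHeaderData */
	const int	lp_size = 4;		/* sizeof(ItemIdData) */
	const int	max_tuples_per_page = 291;	/* MaxHeapTuplesPerPage, 8k page */
	const int	tuple_size = 50;
	const int	fillfactor = 80;

	int			usable = blcksz - page_header;

	/* (1) line pointers left over after the fillfactor-controlled tuples */
	int			by_count = max_tuples_per_page -
		(usable * fillfactor / 100) / (lp_size + tuple_size);

	/* (2) line pointers that fit in the space reserved by fillfactor */
	int			by_space = (usable * (100 - fillfactor) / 100) / lp_size;

	int			max_lp_dead = by_count < by_space ? by_count : by_space;

	/* prints "by_count=170 by_space=408 -> max LP_DEAD ~ 170" */
	printf("by_count=%d by_space=%d -> max LP_DEAD ~ %d\n",
		   by_count, by_space, max_lp_dead);
	return 0;
}
```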

But I'd prefer to increase MaxHeapTuplesPerPage in a way that doesn't
affect the bitmap much, rather than introducing a complex calculation like this.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: [PATCH 1/1] Initial mach based shared memory support.

2021-01-19 Thread Tom Lane
Peter Eisentraut  writes:
> This is the first I've heard in years of shared memory limits being a 
> problem on macOS.  What settings or what levels of concurrency do you 
> use to provoke these errors?

I suppose it wouldn't be too hard to run into shmmni with aggressive
parallel testing; the default is just 32.  But AFAIK it still works
to set the shm limits via /etc/sysctl.conf.  (I wonder if you have
to disable SIP to mess with that, though.)
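
(For reference, the knobs in question are the kern.sysv.* sysctls; the
settings our documentation has traditionally suggested for /etc/sysctl.conf
look roughly like the following, with the values here purely illustrative.
Raising kern.sysv.shmmni above its default of 32 would be the relevant part
for running many instances concurrently.)

```
kern.sysv.shmmax=4194304
kern.sysv.shmmin=1
kern.sysv.shmmni=32
kern.sysv.shmseg=8
kern.sysv.shmall=1024
```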

regards, tom lane




Re: Odd, intermittent failure in contrib/pageinspect

2021-01-19 Thread Michael Paquier
On Tue, Jan 19, 2021 at 05:03:49PM -0500, Tom Lane wrote:
> It looks to me like heap_surgery ought to be okay, because it's operating
> on a temp table; if there are any page access conflicts on that, we've
> got BIG trouble ;-)

Bah, of course.  I managed to miss this part.

> Poking around, I found a few other places where it looked like a skipped
> page could produce diffs in the expected output:
> contrib/amcheck/t/001_verify_heapam.pl
> contrib/pg_visibility/sql/pg_visibility.sql
> 
> There are lots of other vacuums of course, but they don't look like
> a missed page would have any effect on the visible results, so I think
> we should leave them alone.

Yeah, I got to wonder a bit about check_btree.sql on a second look,
but that's no big deal to leave it alone either.

> In short I propose the attached patch, which also gets rid of
> that duplicate query.

Agreed, +1.
--
Michael


signature.asc
Description: PGP signature


Re: Printing LSN made easy

2021-01-19 Thread Peter Eisentraut

On 2020-11-27 11:40, Ashutosh Bapat wrote:

The solution seems to be simple though. In the attached patch, I have
added two macros
#define LSN_FORMAT "%X/%X"
#define LSN_FORMAT_ARGS(lsn) (uint32) ((lsn) >> 32), (uint32) (lsn)

which can be used instead.


It looks like we are not getting any consensus on this approach.  One 
reduced version I would consider is just the second part, so you'd write 
something like


snprintf(lsnchar, sizeof(lsnchar), "%X/%X",
 LSN_FORMAT_ARGS(lsn));

This would still reduce notational complexity quite a bit but avoid any 
funny business with the format strings.


--
Peter Eisentraut
2ndQuadrant, an EDB company
https://www.2ndquadrant.com/




Re: [PATCH] postgres_fdw connection caching - cause remote sessions linger till the local session exit

2021-01-19 Thread Fujii Masao




On 2021/01/19 12:09, Bharath Rupireddy wrote:

On Mon, Jan 18, 2021 at 9:11 PM Fujii Masao  wrote:

Attaching v12 patch set. 0001 is for postgres_fdw_disconnect()
function, 0002 is for keep_connections GUC and 0003 is for
keep_connection server level option.


Thanks!



Please review it further.


+   server = GetForeignServerByName(servername, true);
+
+   if (!server)
+   ereport(ERROR,
+   (errcode(ERRCODE_CONNECTION_DOES_NOT_EXIST),
+errmsg("foreign server \"%s\" does not exist", servername)));

ISTM we can simplify this code as follows.

  server = GetForeignServerByName(servername, false);


Done.


+   hash_seq_init(, ConnectionHash);
+   while ((entry = (ConnCacheEntry *) hash_seq_search()))

When the server name is specified, even if its connection is successfully
closed, postgres_fdw_disconnect() scans through all the entries to check
whether there are active connections. But if "result" is true and
active_conn_exists is true, we can get out of this loop to avoid unnecessary
scans.


My initial thought was that it's possible to have two entries with the
same foreign server name but with different user mappings, but it looks
like that's not possible. I tried associating a foreign server with two
different user mappings [1]; the cache entry initially gets associated
with the user mapping that comes first in pg_user_mappings, and if that
user mapping is dropped the cache entry gets invalidated, so the second
user mapping is used next time.

Since there's no way we can have two cache entries with the same
foreign server name, we can get out of the loop when we find the cache
entry matching the given server. I made the changes.


So, furthermore, we can use hash_search() to find the target cached
connection, instead of using hash_seq_search(), when the server name
is given. This would simplify the code a bit more? Of course,
hash_seq_search() is necessary when closing all the connections, though.




[1]
postgres=# select * from pg_user_mappings ;
 umid  | srvid |  srvname  | umuser | usename | umoptions
-------+-------+-----------+--------+---------+-----------
 16395 | 16394 | loopback1 |     10 | bharath |            -> cache entry is initially made with this user mapping.
 16399 | 16394 | loopback1 |      0 | public  |            -> if the above user mapping is dropped, then the cache entry is made with this user mapping.


+   /*
+* Destroy the cache if we discarded all active connections i.e. if there
+* is no single active connection, which we can know while scanning the
+* cached entries in the above loop. Destroying the cache is better than to
+* keep it in the memory with all inactive entries in it to save some
+* memory. Cache can get initialized on the subsequent queries to foreign
+* server.

How much memory is assumed to be saved by destroying the cache in
many cases? I'm not sure if it's really worth destroying the cache to save
the memory.


I removed the cache-destroying code; if somebody complains in the
future (after the feature is committed), we can revisit it then.


+  a warning is issued and false is returned. false
+  is returned when there are no open connections. When there are some open
+  connections, but there is no connection for the given foreign server,
+  then false is returned. When no foreign server exists
+  with the given name, an error is emitted. Example usage of the function:

When a non-existent server name is specified, postgres_fdw_disconnect()
emits an error if there is at least one open connection, but just returns
false otherwise. At least for me, this behavior looks inconsistent and strange.
In that case, IMO the function should always emit an error.


Done.

Attaching v13 patch set, please review it further.


Thanks!

+ * 2) If no input argument is provided, then it tries to disconnect all the
+ *    connections.

I'm concerned that users can easily forget to specify the argument and
accidentally discard all the connections. So, IMO, to alleviate this situation,
what about changing the function name (only when closing all the connections)
to something like postgres_fdw_disconnect_all(), as we have
pg_advisory_unlock_all() alongside pg_advisory_unlock()?

+   if (result)
+   {
+   /* We closed at least one connection, others are in use. */
+   ereport(WARNING,
+   (errmsg("cannot close all connections because some of them are still in use")));
+   }

Sorry if this was already discussed upthread. Isn't it more helpful to
emit a warning for every connection that fails to be closed? For example,

WARNING:  cannot close connection for server "loopback1" because it is still in use

Re: [HACKERS] Custom compression methods

2021-01-19 Thread Dilip Kumar
On Wed, Jan 20, 2021 at 12:37 AM Justin Pryzby  wrote:
>
> Thanks for updating the patch.

Thanks for the review

> On Mon, Jan 4, 2021 at 6:52 AM Justin Pryzby  wrote:
> > The most recent patch doesn't compile --without-lz4:
> On Tue, Jan 05, 2021 at 11:19:33AM +0530, Dilip Kumar wrote:
> > On Mon, Jan 4, 2021 at 10:08 PM Justin Pryzby  wrote:
> > > I think I first saw it on cfbot and I reproduced it locally, too.
> > > http://cfbot.cputube.org/dilip-kumar.html
> > >
> > > I think you'll have to make --without-lz4 the default until the build
> > > environments include it, otherwise the patch checker will show red :(
> >
> > Oh ok,  but if we make by default --without-lz4 then the test cases
> > will start failing which is using lz4 compression.  Am I missing
> > something?
>
> The CIs are failing like this:
>
> http://cfbot.cputube.org/dilip-kumar.html
> |checking for LZ4_compress in -llz4... no
> |configure: error: lz4 library not found
> |If you have lz4 already installed, see config.log for details on the
> |failure.  It is possible the compiler isn't looking in the proper directory.
> |Use --without-lz4 to disable lz4 support.
>
> I thought that used to work (except for windows).  I don't see that anything
> changed in the configure tests...  Is it because the CI moved off travis 2
> weeks ago?  I don't know whether the travis environment had liblz4, and I
> don't remember if the build was passing or if it was failing for some other
> reason.  I'm guessing historic logs from travis are not available, if they 
> ever
> were.
>
> I'm not sure how to deal with that, but maybe you'd need:
> 1) A separate 0001 patch *allowing* LZ4 to be enabled/disabled;
> 2) Current patchset needs to compile with/without LZ4, and pass tests in both
> cases - maybe you can use "alternate test" output [0] to handle the "without"
> case.

Okay, let me think about how to deal with this.

> 3) Eventually, the CI and build environments may have LZ4 installed, and then
> we can have a separate debate about whether to enable it by default.
>
> [0] cp -iv src/test/regress/results/compression.out 
> src/test/regress/expected/compression_1.out
>
> On Tue, Jan 05, 2021 at 02:20:26PM +0530, Dilip Kumar wrote:
> > On Tue, Jan 5, 2021 at 11:19 AM Dilip Kumar  wrote:
> > > On Mon, Jan 4, 2021 at 10:08 PM Justin Pryzby  
> > > wrote:
> > > > I see the windows build is failing:
> > > > https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.123730
> > > > |undefined symbol: HAVE_LIBLZ4 at src/include/pg_config.h line 350 at 
> > > > src/tools/msvc/Mkvcbuild.pm line 852.
> > > > This needs to be patched: src/tools/msvc/Solution.pm
> > > > You can see my zstd/pg_dump patch for an example, if needed (actually 
> > > > I'm not
> > > > 100% sure it's working yet, since the windows build failed for another 
> > > > reason).
> > >
> > > Okay, I will check that.
>
> This still needs help.
> perl ./src/tools/msvc/mkvcbuild.pl
> ...
> undefined symbol: HAVE_LIBLZ4 at src/include/pg_config.h line 350 at 
> /home/pryzbyj/src/postgres/src/tools/msvc/Mkvcbuild.pm line 852.
>
> Fix like:
>
> +   HAVE_LIBLZ4 => $self->{options}->{zlib} ? 1 : 
> undef,

I will do that.

> Some more language fixes:
>
> commit 3efafee52414503a87332fa6070541a3311a408c
> Author: dilipkumar 
> Date:   Tue Sep 8 15:24:33 2020 +0530
>
> Built-in compression method
>
> +  If the compression method is not specified for the compressible type 
> then
> +  it will have the default compression method.  The default compression
>
> I think this should say:
> If no compression method is specified, then compressible types will have the
> default compression method (pglz).
>
> + *
> + * Since version 11 TOAST_COMPRESS_SET_RAWSIZE also marks compressed
>
> Should say v14 ??
>
> diff --git a/src/include/catalog/pg_attribute.h 
> b/src/include/catalog/pg_attribute.h
> index 059dec3647..e4df6bc5c1 100644
> --- a/src/include/catalog/pg_attribute.h
> +++ b/src/include/catalog/pg_attribute.h
> @@ -156,6 +156,14 @@ CATALOG(pg_attribute,1249,AttributeRelationId) 
> BKI_BOOTSTRAP BKI_ROWTYPE_OID(75,
> /* attribute's collation */
> Oid attcollation;
>
> +   /*
> +* Oid of the compression method that will be used for compressing 
> the value
> +* for this attribute.  For the compressible atttypid this must 
> always be a
>
> say "For compressible types, ..."
>
> +* valid Oid irrespective of what is the current value of the 
> attstorage.
> +* And for the incompressible atttypid this must always be an invalid 
> Oid.
>
> say "must be InvalidOid"
>
> @@ -685,6 +686,7 @@ typedef enum TableLikeOption
> CREATE_TABLE_LIKE_INDEXES = 1 << 5,
> CREATE_TABLE_LIKE_STATISTICS = 1 << 6,
> CREATE_TABLE_LIKE_STORAGE = 1 << 7,
> +   CREATE_TABLE_LIKE_COMPRESSION = 1 << 8,
>
> This is interesting...
> I have a patch to implement LIKE .. (INCLUDING ACCESS METHOD).
> 

Re: [PATCH 1/1] Initial mach based shared memory support.

2021-01-19 Thread Peter Eisentraut

On 2020-11-22 09:19, James Hilliard wrote:

OSX implements sysv shmem support via a mach wrapper, however the mach
sysv shmem wrapper has some severe restrictions that prevent us from
allocating enough memory segments in some cases.

These limits appear to be due to the way the wrapper itself is
implemented and not mach.

For example when running a test suite that spins up many postgres
instances I commonly see this issue:
DETAIL:  Failed system call was shmget(key=5432002, size=56, 03600).


This is the first I've heard in years of shared memory limits being a 
problem on macOS.  What settings or what levels of concurrency do you 
use to provoke these errors?






Re: pg_class.reltype -> pg_type.oid missing for pg_toast table

2021-01-19 Thread Joel Jacobson
On Tue, Jan 19, 2021, at 17:43, Tom Lane wrote:
>I'm too lazy to check the code right now, but my recollection is that we
>do not bother to make composite-type entries for toast tables.  However,
>they should have reltype = 0 if so, so I'm not quite sure where the
>above failure is coming from.

My apologies, false alarm.

The problem turned out to be due to doing

CREATE TABLE catalog_fks.%1$I AS
SELECT * FROM pg_catalog.%1$I

which causes changes to e.g. pg_catalog.pg_class while the command is running.

Solved by instead using COPY ... TO to first copy catalogs to files on disk,
which doesn't cause changes to the catalogs,
and then using COPY .. FROM to copy the data into the replicated table 
structures.

/Joel

Re: Deleting older versions in unique indexes to avoid page splits

2021-01-19 Thread Peter Geoghegan
On Tue, Jan 19, 2021 at 7:54 PM Amit Kapila  wrote:
> The worst cases could be (a) when there is just one such duplicate
> (indexval logically unchanged) on the page and that happens to be the
> last item and others are new insertions, (b) same as (a) and along
> with it let's say there is an open transaction due to which we can't
> remove even that duplicate version. Have we checked the overhead or
> results by simulating such workloads?

There is no such thing as a workload that has page splits caused by
non-HOT updaters, but almost no actual version churn from the same
non-HOT updaters. It's possible that a small number of individual page
splits will work out like that, of course, but they'll be extremely
rare, and impossible to see in any kind of consistent way.

That just leaves long running transactions. Of course it's true that
eventually a long-running transaction will make it impossible to
perform any cleanup, for the usual reasons. And at that point this
mechanism is bound to fail (which costs additional cycles -- the
wasted access to a single heap page, some CPU cycles). But it's still
a bargain to try. Even with a long running transaction there will be
a great many bottom-up deletion passes that still succeed earlier on
(because at least some of the dups are deletable, and we can still
delete those that became garbage right before the long running
snapshot was acquired).

Victor independently came up with a benchmark that ran over several
hours, with cleanup consistently held back by ~5 minutes by a long
running transaction:

https://www.postgresql.org/message-id/cagnebogatzs1mwmvx8fzzhmxzudecb10anvwwhctxtibpg3...@mail.gmail.com

This was actually one of the most favorable cases of all for the patch
-- the patch prevented logically unchanged indexes from growing (this
is a mix of inserts, updates, and deletes, not just updates, so it was
less than perfect -- we did see the indexes grow by a half of one
percent). Whereas without the patch each of the same 3 indexes grew by
30% - 60%.

So yes, I did think about long running transactions, and no, the
possibility of wasting one heap block access here and there when the
database is melting down anyway doesn't seem like a big deal to me.

> I feel unlike LP_DEAD optimization this new bottom-up scheme can cost
> us extra CPU and I/O because there seems to be not much consideration
> given to the fact that we might not be able to delete any item (or
> very few) due to long-standing open transactions except that we limit
> ourselves when we are not able to remove even one tuple from any
> particular heap page.

There was plenty of consideration given to that. It was literally
central to the design, and something I pored over at length. Why
don't you go read some of that now? Or, why don't you demonstrate an
actual regression using a tool like pgbench?

I do not appreciate being accused of having acted carelessly. You
don't have a single shred of evidence.

The design is counter-intuitive. I think that you simply don't understand it.

> Now, say due to open transactions, we are able
> to remove very few tuples (for the sake of argument say there is only
> 'one' such tuple) from the heap page then we will keep on accessing
> the heap pages without much benefit. I feel extending the deletion
> mechanism based on the number of LP_DEAD items sounds more favorable
> than giving preference to duplicate items. Sure, it will give equally
> good or better results if there are no long-standing open
> transactions.

As Andres says, LP_DEAD bits need to be set by index scans. Otherwise
nothing happens. The simple deletion case can do nothing without that
happening. It's good that it's possible to reuse work from index scans
opportunistically, but it's not reliable.

> > I personally will
> > never vote for a theoretical risk with only a theoretical benefit. And
> > right now that's what the idea of doing bottom-up deletions in more
> > marginal cases (the page flag thing) looks like.
> >
>
> I don't think we can say that it is purely theoretical because I have
> shown some basic computation where it can lead to fewer splits.

I'm confused. You realize that this makes it *more* likely that
bottom-up deletion passes will take place, right? It sounds like
you're arguing both sides of the issue at the same time.

-- 
Peter Geoghegan




Re: Deleting older versions in unique indexes to avoid page splits

2021-01-19 Thread Andres Freund
Hi,

On 2021-01-20 09:24:35 +0530, Amit Kapila wrote:
> I feel extending the deletion mechanism based on the number of LP_DEAD
> items sounds more favorable than giving preference to duplicate
> items. Sure, it will give equally good or better results if there are
> no long-standing open transactions.

There's a lot of workloads that never set LP_DEAD because all scans are
bitmap index scans. And there's no obvious way to address that. So I
don't think it's wise to purely rely on LP_DEAD.

Greetings,

Andres Freund




Re: Some coverage for DROP OWNED BY with pg_default_acl

2021-01-19 Thread Michael Paquier
On Tue, Jan 19, 2021 at 05:49:03PM -0300, Alvaro Herrera wrote:
> Heh, interesting case.  Added coverage is good, so +1.

Thanks.  I read through it again and applied the test.

> Since the role regress_priv_user2 is "private" to the privileges.sql
> script, there's no danger of a concurrent test getting the added lines
> in trouble AFAICS.

It seems to me that it could lead to some trouble if a test running in
parallel expects a set of ACLs with no extra noise, as this stuff adds
data to the catalogs for all objects created while the default
permissions are visible.  Perhaps that's an over-defensive position,
but it does not hurt to be careful either, similarly to the test run a
couple of lines above.
--
Michael


signature.asc
Description: PGP signature


Re: Stronger safeguard for archive recovery not to miss data

2021-01-19 Thread Fujii Masao




On 2021/01/20 1:05, Laurenz Albe wrote:

On Mon, 2021-01-18 at 07:34 +, osumi.takami...@fujitsu.com wrote:

I noticed that this message should cover both archive recovery modes,
which means in recovery mode and standby mode. Then, I combined your
suggestion above with this point of view. Have a look at the updated patch.
I also enriched the new tap tests to show this perspective.


Looks good, thanks.

I'll mark this patch as "ready for committer".


+errhint("Run recovery again from a new base backup 
taken after setting wal_level higher than minimal")));

Isn't it impossible to do this in the normal archive recovery case? In that case,
since the server crashed and the database got corrupted, we probably
cannot take a new base backup.

Originally, even when users accidentally set wal_level to minimal, they could
start the server from the backup by disabling hot_standby and salvage the data.
But with the patch, we lose that option. Right? I was wondering whether
WARNING was used there intentionally for that case.


if (ControlFile->wal_level < WAL_LEVEL_REPLICA)
    ereport(ERROR,
            (errmsg("hot standby is not possible because wal_level was not set to \"replica\" or higher on the primary server"),
             errhint("Either set wal_level to \"replica\" on the primary, or turn off hot_standby here.")));

With the patch, we never reach the above code?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Deleting older versions in unique indexes to avoid page splits

2021-01-19 Thread Amit Kapila
On Tue, Jan 19, 2021 at 3:03 AM Peter Geoghegan  wrote:
>
> On Mon, Jan 18, 2021 at 1:10 PM Victor Yegorov  wrote:
> > I must admit, that it's a bit difficult to understand you here (at least 
> > for me).
> >
> > I assume that by "bet" you mean flagged tuple, that we marked as such
> > (should we implement the suggested case).
> > As heapam will give up early in case there are no deletable tuples, why do 
> > say,
> > that "bet" is no longer low? At the end, we still decide between page split 
> > (and
> > index bloat) vs a beneficial space cleanup.
>
> Well, as I said, there are various ways in which our inferences (say
> the ones in nbtdedup.c) are likely to be wrong. You understand this
> already. For example, obviously if there are two duplicate index
> tuples pointing to the same heap page then it's unlikely that both
> will be deletable, and there is even a fair chance that neither will
> be (for many reasons). I think that it's important to justify why we
> use stuff like that to drive our decisions -- the reasoning really
> matters. It's very much not like the usual optimization problem thing.
> It's a tricky thing to discuss.
>
> I don't assume that I understand all workloads, or how I might
> introduce regressions. It follows that I should be extremely
> conservative about imposing new costs here. It's good that we
> currently know of no workloads that the patch is likely to regress,
> but absence of evidence isn't evidence of absence.
>

The worst cases could be (a) when there is just one such duplicate
(indexval logically unchanged) on the page and that happens to be the
last item and others are new insertions, (b) same as (a) and along
with it let's say there is an open transaction due to which we can't
remove even that duplicate version. Have we checked the overhead or
results by simulating such workloads?

I feel unlike LP_DEAD optimization this new bottom-up scheme can cost
us extra CPU and I/O because there seems to be not much consideration
given to the fact that we might not be able to delete any item (or
very few) due to long-standing open transactions except that we limit
ourselves when we are not able to remove even one tuple from any
particular heap page. Now, say due to open transactions, we are able
to remove very few tuples (for the sake of argument say there is only
'one' such tuple) from the heap page then we will keep on accessing
the heap pages without much benefit. I feel extending the deletion
mechanism based on the number of LP_DEAD items sounds more favorable
than giving preference to duplicate items. Sure, it will give equally
good or better results if there are no long-standing open
transactions.

> I personally will
> never vote for a theoretical risk with only a theoretical benefit. And
> right now that's what the idea of doing bottom-up deletions in more
> marginal cases (the page flag thing) looks like.
>

I don't think we can say that it is purely theoretical because I have
shown some basic computation where it can lead to fewer splits.

-- 
With Regards,
Amit Kapila.




Re: Add statistics to pg_stat_wal view for wal related parameter tuning

2021-01-19 Thread Masahiro Ikeda

On 2020-12-22 11:16, Masahiro Ikeda wrote:

Thanks for your comments.

On 2020-12-22 09:39, Andres Freund wrote:

Hi,

On 2020-12-21 13:16:50 -0800, Andres Freund wrote:

On 2020-12-02 13:52:43 +0900, Fujii Masao wrote:
> Pushed. Thanks!

Why are wal_records/fpi long, instead of uint64?
    long        wal_records;    /* # of WAL records produced */
    long        wal_fpi;        /* # of WAL full page images produced */
    uint64      wal_bytes;      /* size of WAL records produced */

long is only 4 bytes e.g. on windows, and it is entirely possible to wrap
a 4 byte record counter. It's also somewhat weird that wal_bytes is
unsigned, but the others are signed?

This is made doubly weird because on the SQL level you chose to make
wal_records, wal_fpi bigint. And wal_bytes numeric?


I'm sorry I don't know the reason.

The following thread is related to the patch and the type of wal_bytes
is changed from long to uint64 because XLogRecPtr is uint64.
https://www.postgresql.org/message-id/flat/20200402144438.GF64485%40nol#1f0127c98df430104c63426fdc285c20

I assumed that the reason why the type of wal_records/fpi is long is that
BufferUsage has members (e.g., shared_blks_hit) of the same type.


So, I think that if we change the type of wal_records/fpi from long to
uint64, it's better to change the types of BufferUsage's members too.


I've done a little more research, so I'll share the results.

IIUC, in theory this could lead to the statistics being under-counted,
but in practice it doesn't happen.

The above "wal_records", "wal_fpi" are accumulation values and when 
WalUsageAccumDiff()
is called, we can know how many wals are generated for specific 
executions,
for example, planning/executing a query, processing a utility command, 
and vacuuming one relation.


The following variable accumulates "wal_records" and "wal_fpi" per process.


```
typedef struct WalUsage
{
    long        wal_records;    /* # of WAL records produced */
    long        wal_fpi;        /* # of WAL full page images produced */
    uint64      wal_bytes;      /* size of WAL records produced */
} WalUsage;

WalUsage    pgWalUsage;
```

Although this may overflow, it doesn't affect calculating the difference
in WAL usage between two execution points. If more than 2 billion WAL
records were generated in one execution, 4 bytes would not be enough and
the collected statistics would be lost, but I don't think that happens.
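
As a minimal illustration of the "wraparound cancels out in differences"
point (using unsigned arithmetic here, since unsigned wraparound is well
defined in C; the actual counters are long, which is part of why switching
them to uint64 is being discussed):

```c
#include <stdio.h>
#include <stdint.h>

int
main(void)
{
	/* counter snapshot taken just before a 32-bit counter wraps ... */
	uint32_t	before = UINT32_MAX - 5;

	/* ... and after 10 more WAL records have been counted (wraps to 4) */
	uint32_t	after = before + 10;

	/*
	 * The accumulated value is now "wrong" (4), but the per-execution
	 * difference, computed as a plain subtraction the way
	 * WalUsageAccumDiff() does it, is still 10, because unsigned
	 * subtraction wraps the same way the addition did.
	 */
	printf("after=%u diff=%u\n", after, (uint32_t) (after - before));
	return 0;
}
```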


In addition, "wal_records" and "wal_fpi" values sent by processes are
accumulated in the statistic collector and their types are 
PgStat_Counter(int64).


```
typedef struct PgStat_WalStats
{
PgStat_Counter wal_records;
PgStat_Counter wal_fpi;
uint64  wal_bytes;
PgStat_Counter wal_buffers_full;
TimestampTz stat_reset_timestamp;
} PgStat_WalStats;
```



Some more things:
- There's both PgStat_MsgWal WalStats; and static PgStat_WalStats walStats;
  that seems *WAY* too confusing. And the former imo shouldn't be
  global.


Sorry for the confusing name.
I referenced the following variable name.

 static PgStat_MsgSLRU SLRUStats[SLRU_NUM_ELEMENTS];
 static PgStat_SLRUStats slruStats[SLRU_NUM_ELEMENTS];

How about changing "PgStat_MsgWal WalStats"
to "PgStat_MsgWal MsgWalStats"?

Is it better to change the following name too?
 "PgStat_MsgBgWriter BgWriterStats;"
 "static PgStat_MsgSLRU SLRUStats[SLRU_NUM_ELEMENTS];"

Since PgStat_MsgWal is used in xlog.c and pgstat.c,
I thought it should be global.


I've attached a patch that renames the above variables.
What do you think?

Regards,
--
Masahiro Ikeda
NTT DATA CORPORATIONFrom 8bde948e5e91dbfbcf79b091af51f022aa32191a Mon Sep 17 00:00:00 2001
From: Masahiro Ikeda 
Date: Wed, 20 Jan 2021 12:13:24 +0900
Subject: [PATCH] Refactor variable names of global statistics messages

Refactor the variable names of global statistics messages
for bgwriter, wal, and SLRU because the names are too confusing.

Currently, their names are BgWriterStats, WalStats, and SLRUStats. But
there are similar names defined in the statistics collector: walStats
and slruStats. Since this is confusing, this patch renames the message
variables to make it easy to see that they are messages.

Author: Masahiro Ikeda
Reviewed-by: Andres Freund
Discussion: https://postgr.es/m/20201222003935.47aoxfmokltlr...@alap3.anarazel.de
---
 src/backend/access/transam/xlog.c |  6 ++--
 src/backend/postmaster/checkpointer.c |  8 +++---
 src/backend/postmaster/pgstat.c   | 40 +--
 src/backend/storage/buffer/bufmgr.c   |  8 +++---
 src/include/pgstat.h  |  4 +--
 5 files changed, 33 insertions(+), 33 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 470e113b33..a64ad34f21 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ 

Re: Change default of checkpoint_completion_target

2021-01-19 Thread japin


On Wed, 20 Jan 2021 at 03:47, Stephen Frost  wrote:
> Greetings,
>
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>> Stephen Frost  writes:
>> > Any further comments or thoughts on this one?
>> 
>> This:
>> 
>> +total time between checkpoints. The default is 0.9, which spreads 
>> the
>> +checkpoint across the entire checkpoint timeout period of time,
>> 
>> is confusing because 0.9 is obviously not 1.0; people will wonder
>> whether the scale is something strange or the text is just wrong.
>> They will also wonder why not use 1.0 instead.  So perhaps more like
>> 
>>  ... The default is 0.9, which spreads the checkpoint across almost
>>  all the available interval, providing fairly consistent I/O load
>>  while also leaving some slop for checkpoint completion overhead.
>> 
>> The other chunk of text seems accurate, but there's no reason to let
>> this one be misleading.
>
> Good point, updated along those lines.
>
> In passing, I noticed that we have a lot of documentation like:
>
> This parameter can only be set in the postgresql.conf file or on the
> server command line.
>
> ... which hasn't been true since the introduction of ALTER SYSTEM.  I
> don't really think it's this patch's job to clean that up but it doesn't
> seem quite right that we don't include ALTER SYSTEM in that list either.
> If this was C code, maybe we could get away with just changing such
> references as we find them, but I don't think we'd want the
> documentation to be in an inconsistent state regarding that.
>

I have already mentioned this in [1], however it seems unattractive.

[1] - 
https://www.postgresql.org/message-id/flat/199703E4-A795-4FB8-911C-D0DE9F51519C%40hotmail.com

-- 
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.




Re: list of extended statistics on psql

2021-01-19 Thread Tatsuro Yamada

Hi Tomas,

On 2021/01/19 11:52, Tomas Vondra wrote:



I'm going to create the WIP patch to use the above queriy.
Any comments welcome. :-D


Yes, I think using this simpler query makes sense. If we decide we need 
something more elaborate, we can improve that in future PostgreSQL versions 
(after adding a view/function to core), but I'd leave that as work for the 
future.



I see, thanks!



Apologies for all the extra work - I hadn't realized this flaw when pushing 
for showing more stuff :-(



Don't worry about it. We didn't notice the problem even when it was reviewed
by multiple people on -hackers. Let's keep moving forward. :-D

I'll include a regression test in the next patch.

Regards,
Tatsuro Yamada





Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

2021-01-19 Thread Julien Rouhaud
Hello Yamada-san,

On Wed, Jan 20, 2021 at 10:06 AM Tatsuro Yamada
 wrote:
>
> Hi Julien,
>
>
> >> Rebase only, thanks to the cfbot!  V16 attached.
> >
> > I tested the v16 patch on a0efda88a by using "make installcheck-parallel", 
> > and
> > my result is the following. Attached file is regression.diffs.
>
>
> Sorry, my environment was not suitable for the test when I sent my previous 
> email.
> I fixed my environment and tested it again, and it was a success. See below:
>
> ===
>   All 202 tests passed.
> ===

No worries, thanks a lot for testing!




Re: Use boolean array for nulls parameters

2021-01-19 Thread japin


On Tue, 19 Jan 2021 at 23:45, Tom Lane  wrote:
> japin  writes:
>> When I reviewed [1], I found that the tuple's nulls array uses the char type.
>> However, there are many places that use a boolean array to represent the nulls
>> array, so I think we can replace the char-type nulls array with a boolean type.
>> This change will break the SPI_xxx API; I'm not sure whether this change causes
>> other problems or not.  Any thoughts?
>
> We have always considered that changing the APIs of published SPI
> interfaces is a non-starter.  The entire reason those calls still
> exist at all is for the benefit of third-party extensions.
>

Thanks for the clarification.  I agree that we should keep the APIs stable; maybe we
can modify this someday when the APIs must be changed.

-- 
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.




Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

2021-01-19 Thread Tatsuro Yamada

Hi Julien,



Rebase only, thanks to the cfbot!  V16 attached.


I tested the v16 patch on a0efda88a by using "make installcheck-parallel", and
my result is the following. Attached file is regression.diffs.



Sorry, my environment was not suitable for the test when I sent my previous 
email.
I fixed my environment and tested it again, and it was a success. See below:

===
 All 202 tests passed.
===

Regards,
Tatsuro Yamada






Re: POC: postgres_fdw insert batching

2021-01-19 Thread Amit Langote
On Wed, Jan 20, 2021 at 1:01 AM Tomas Vondra
 wrote:
> On 1/19/21 7:23 AM, Amit Langote wrote:
> > On Tue, Jan 19, 2021 at 2:06 PM tsunakawa.ta...@fujitsu.com
> >> Actually, I tried to do it (adding the GetModifyBatchSize() call after 
> >> BeginForeignModify()), but it failed.  Because 
> >> postgresfdwGetModifyBatchSize() wants to know if RETURNING is specified, 
> >> and ResultRelInfo->projectReturning is created after the above part.  
> >> Considering the context where GetModifyBatchSize() implementations may 
> >> want to know the environment, I placed the call as late as possible in the 
> >> initialization phase.  As for the future(?) multi-target DML statements, I 
> >> think we can change this together with other many(?) parts that assume a 
> >> single target table.
> >
> > Okay, sometime later then.
> >
> > I wasn't sure if bringing it up here would be appropriate, but there's
> > a patch by me to refactor ModfiyTable result relation allocation that
> > will have to remember to move this code along to an appropriate place
> > [1].  Thanks for the tip about the dependency on how RETURNING is
> > handled.  I will remember it when rebasing my patch over this.
> >
>
> Thanks. The last version (v12) should be addressing all the comments and
> seems fine to me, so barring objections I'll get that pushed shortly.

+1

> One thing that seems a bit annoying is that with the partitioned table
> the explain (verbose) looks like this:
>
>   QUERY PLAN
> -
>   Insert on public.batch_table
> ->  Function Scan on pg_catalog.generate_series i
>   Output: i.i
>   Function Call: generate_series(1, 66)
> (4 rows)
>
> That is, there's no information about the batch size :-( But AFAICS
> that's due to how explain shows (or rather does not) partitions in this
> type of plan.

Yeah.  Partition result relations are always lazily allocated for
INSERT, so EXPLAIN (without ANALYZE) has no idea what to show for
them, nor does it know which partitions will be used in the first
place.  With ANALYZE however, you could get them from
es_tuple_routing_result_relations and maybe list them if you want, but
that sounds like a project on its own.

-- 
Amit Langote
EDB: http://www.enterprisedb.com




Re: Odd, intermittent failure in contrib/pageinspect

2021-01-19 Thread Tom Lane
Andres Freund  writes:
> I think you don't even need checkpointer to be involved, normal buffer
> replacement would do the trick. We briefly pin the page in BufferAlloc()
> even if the page is clean. Longer when it's dirty, of course.

True, but it seems unlikely that the pages in question here would be
chosen as replacement victims.  These are non-parallel tests, so
there's little competitive pressure.  I could believe that a background
autovacuum is active, but not that it's dirtied so many pages that
tables the test script just created need to get swapped out.

The checkpointer theory seems good because it requires no assumptions
at all about competing demand for buffers.  If the clock sweep gets
to the table page (which we know is recently dirtied) at just the right
time, we'll see a failure.

regards, tom lane




Re: Odd, intermittent failure in contrib/pageinspect

2021-01-19 Thread Andres Freund
Hi,

On 2021-01-18 19:40:05 -0300, Alvaro Herrera wrote:
> > [ thinks for a bit... ]  Does the checkpointer pin pages it's writing
> > out?  I guess it'd have to ...
> 
> It does, per SyncOneBuffer(), called from BufferSync(), called from
> CheckPointBuffers().

I think you don't even need checkpointer to be involved, normal buffer
replacement would do the trick. We briefly pin the page in BufferAlloc()
even if the page is clean. Longer when it's dirty, of course.

Greetings,

Andres Freund




Re: [PATCH 1/1] Fix detection of pwritev support for OSX.

2021-01-19 Thread Tom Lane
James Hilliard  writes:
> Actually, this path looks wrong in general; the value for
> "xcrun --sdk macosx --show-sdk-path" should take precedence over
> "xcrun --show-sdk-path" as the latter may be used for iOS potentially.

What is "potentially"?  I've found no direct means to control the
SDK path at all, but so far it appears that "xcrun --show-sdk-path"
agrees with the compiler's default -isysroot path as seen in the
compiler's -v output.  I suspect that this isn't coincidental,
but reflects xcrun actually being used in the compiler launch
process.  If it were to flip over to using a IOS SDK, that would
mean that bare "cc" would generate nonfunctional executables,
which just about any onlooker would agree is broken.

I'm really not excited about trying to make the build work with
a non-native SDK as you are proposing.  I think that's just going
to lead to a continuing stream of problems, because of Apple's
opinions about how cross-version compatibility should work.
It also seems like unnecessary complexity, because there is always
(AFAICS) a native SDK version available.  We just need to find it.

regards, tom lane




RE: Parallel INSERT (INTO ... SELECT ...)

2021-01-19 Thread tsunakawa.ta...@fujitsu.com
From: Tang, Haiying 
> Execute EXPLAIN on Patched:
>  Insert on public.test_part  (cost=0.00..15.00 rows=0 width=0) (actual
> time=44.139..44.140 rows=0 loops=1)
>Buffers: shared hit=1005 read=1000 dirtied=3000 written=2000
>->  Seq Scan on public.test_data1  (cost=0.00..15.00 rows=1000
> width=8) (actual time=0.007..0.201 rows=1000 loops=1)
>  Output: test_data1.a, test_data1.b
>  Buffers: shared hit=5

> Execute EXPLAIN on non-Patched:
>  Insert on public.test_part  (cost=0.00..15.00 rows=0 width=0) (actual
> time=72.656..72.657 rows=0 loops=1)
>Buffers: shared hit=22075 read=1000 dirtied=3000 written=2000
>->  Seq Scan on public.test_data1  (cost=0.00..15.00 rows=1000
> width=8) (actual time=0.010..0.175 rows=1000 loops=1)
>  Output: test_data1.a, test_data1.b
>  Buffers: shared hit=5

I don't know if this is related to this issue, but I felt "shared hit=5" for 
the Seq Scan is strange.  This test case reads 1,000 rows from 1,000 partitions, 
one row per partition, so I thought the shared hits should be 1,000 in the Seq 
Scan.  I wonder if the 1,000 is included in the Insert node?


Regards
Takayuki Tsunakawa



RE: Add Nullif case for eval_const_expressions_mutator

2021-01-19 Thread Hou, Zhijie
Hi

Thanks for the review.

> It's a bit unfortunate now that between OpExpr, DistinctExpr, NullIfExpr,
> and to a lesser extent ScalarArrayOpExpr we will now have several different
> implementations of nearly the same thing, without any explanation why one
> approach was chosen here and another there.  We should at least document
> this.

I am not quite sure where to document the difference.
For now, I tried to add some comments for NullIf to explain why this 
one is different.

+   /*
+    * Since NullIf is not common enough to deserve specially
+    * optimized code, use ece_generic_processing and
+    * ece_evaluate_expr to simplify the code as much as possible.
+    */

Any suggestions ?

> Some inconsistencies I found: The code for DistinctExpr calls
> expression_tree_mutator() directly, but your code for NullIfExpr calls
> ece_generic_processing(), even though the explanation in the comment for
> DistinctExpr would apply there as well.
> 
> Your code for NullIfExpr doesn't appear to call set_opfuncid() anywhere.

IMO, set_opfuncid will be called inside ece_evaluate_expr,
via the following flow:

ece_evaluate_expr --> evaluate_expr --> fix_opfuncids --> fix_opfuncids_walker --> set_opfuncid

And we do not need the opfuncid until we call ece_evaluate_expr.
So, to simplify the code as much as possible, I did not call set_opfuncid in
eval_const_expressions_mutator again.


> I would move your new block for NullIfExpr after the block for DistinctExpr.
> That's the order in which these blocks appear elsewhere for generic node
> processing.
> 

Changed.


> Check your whitespace usage:
> 
>  if(!has_nonconst_input)
> 
> should have a space after the "if".  (It's easy to fix of course, but I
> figure I'd point it out here since you have submitted several patches with
> this style, so it's perhaps a habit to break.)

Changed.


> Perhaps add a comment to the tests like this so it's clear what they are
> for:
> 
> diff --git a/src/test/regress/sql/case.sql
> b/src/test/regress/sql/case.sql index 4742e1d0e0..98e3fb8de5 100644
> --- a/src/test/regress/sql/case.sql
> +++ b/src/test/regress/sql/case.sql
> @@ -137,6 +137,7 @@ CREATE TABLE CASE2_TBL (
> FROM CASE_TBL a, CASE2_TBL b
> WHERE COALESCE(f,b.i) = 2;
> 
> +-- Tests for constant subexpression simplification
>   explain (costs off)
>   SELECT * FROM CASE_TBL WHERE NULLIF(1, 2) = 2;

Added.

Attaching the v3 patch; please consider it for further review.

Best regards,
houzj





v3_0001-add-nullif-case-for-eval_const_expressions.patch
Description: v3_0001-add-nullif-case-for-eval_const_expressions.patch


Re: New IndexAM API controlling index vacuum strategies

2021-01-19 Thread Peter Geoghegan
On Tue, Jan 19, 2021 at 4:45 PM Peter Geoghegan  wrote:
> BTW, I think that increasing MaxHeapTuplesPerPage will make it
> necessary to consider tidbitmap.c. Comments at the top of that file
> say that it is assumed that MaxHeapTuplesPerPage is about 256. So
> there is a risk of introducing performance regressions affecting
> bitmap scans here.

More concretely, WORDS_PER_PAGE increases from 5 on the master branch
to 16 with the latest version of the patch series on most platforms
(while WORDS_PER_CHUNK is 4 with or without the patches).
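
For anyone who wants to double-check those numbers, here is a quick standalone
sanity check (assuming 8 kB blocks, 8-byte MAXALIGN, and 64-bit bitmapwords;
this is not code from the patch):

#include <stdio.h>

#define BLCKSZ 8192
#define PAGE_HEADER_SIZE 24					/* SizeOfPageHeaderData on 64-bit builds */
#define MAXALIGN(LEN) (((LEN) + 7) & ~7)	/* assumes MAXIMUM_ALIGNOF == 8 */
#define BITS_PER_BITMAPWORD 64

int
main(void)
{
	/* master: divisor is MAXALIGN(SizeofHeapTupleHeader) + sizeof(ItemIdData) */
	int			master = (BLCKSZ - PAGE_HEADER_SIZE) / (MAXALIGN(23) + 4);
	/* patched: divisor is just MAXALIGN(sizeof(ItemIdData)) */
	int			patched = (BLCKSZ - PAGE_HEADER_SIZE) / MAXALIGN(4);

	/* prints 291 -> 5 words and 1021 -> 16 words per exact page */
	printf("master:  MaxHeapTuplesPerPage=%d WORDS_PER_PAGE=%d\n",
		   master, (master - 1) / BITS_PER_BITMAPWORD + 1);
	printf("patched: MaxHeapTuplesPerPage=%d WORDS_PER_PAGE=%d\n",
		   patched, (patched - 1) / BITS_PER_BITMAPWORD + 1);
	return 0;
}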

-- 
Peter Geoghegan




Re: New IndexAM API controlling index vacuum strategies

2021-01-19 Thread Peter Geoghegan
On Tue, Jan 19, 2021 at 2:57 PM Peter Geoghegan  wrote:
> * Maybe it would be better if you just changed the definition such
> that "MAXALIGN(SizeofHeapTupleHeader)" became "MAXIMUM_ALIGNOF", with
> no other changes? (Some variant of this suggestion might be better,
> not sure.)
>
> For some reason that feels a bit safer: we still have an "imaginary
> tuple header", but it's just 1 MAXALIGN() quantum now. This is still
> much less than the current 3 MAXALIGN() quantums (i.e. what
> MaxHeapTuplesPerPage treats as the tuple header size). Do you think
> that this alternative approach will be noticeably less effective
> within vacuumlazy.c?

BTW, I think that increasing MaxHeapTuplesPerPage will make it
necessary to consider tidbitmap.c. Comments at the top of that file
say that it is assumed that MaxHeapTuplesPerPage is about 256. So
there is a risk of introducing performance regressions affecting
bitmap scans here.

Apparently some other DB systems make the equivalent of
MaxHeapTuplesPerPage dynamically configurable at the level of heap
tables. It usually doesn't matter, but it can matter with on-disk
bitmap indexes, where the bitmap must be encoded from raw TIDs (this
must happen before the bitmap is compressed -- there must be a simple
mapping from every possible TID to some bit in a bitmap first). The
item offset component of each heap TID is not usually very large, so
there is a trade-off between keeping the representation of bitmaps
efficient and not unduly restricting the number of distinct heap
tuples on each heap page. I think that there might be a similar
consideration here, in tidbitmap.c (even though it's not concerned
about on-disk bitmaps).

-- 
Peter Geoghegan




Re: Support for NSS as a libpq TLS backend

2021-01-19 Thread Jacob Champion
On Tue, 2021-01-19 at 21:21 +0100, Daniel Gustafsson wrote:
> There is something iffy with these certs (the test fails
> on mismatching ciphers and/or signature algorithms) that I haven't been able 
> to
> pin down, but to get more eyes on this I'm posting the patch with the test
> enabled.

Removing `--keyUsage keyEncipherment` from the native_server-* CSR
generation seems to let the tests pass for me, but I'm wary of just
pushing that as a solution because I don't understand why that would
have anything to do with the failure mode
(SSL_ERROR_NO_SUPPORTED_SIGNATURE_ALGORITHM).

> The NSS toolchain requires interactive input which makes the Makefile
> a bit hacky, ideas on cleaning that up are appreciated.

Hm. I got nothing, short of a feature request to NSS...

--Jacob


Re: Add table access method as an option to pgbench

2021-01-19 Thread David Zhang

On 2021-01-15 1:22 p.m., Andres Freund wrote:


Hi,

On 2020-11-25 12:41:25 +0900, Michael Paquier wrote:

On Tue, Nov 24, 2020 at 03:32:38PM -0800, David Zhang wrote:

But, providing another option for the end user may not be a bad idea, and it
might make the tests easier at some points.

My first thought is that we have no need to complicate pgbench with
this option because there is a GUC able to do that, but we do that for
tablespaces, so...  No objections from here.

I think that objection is right. All that's needed to change this from
the client side is to do something like
PGOPTIONS='-c default_table_access_method=foo' pgbench ...

Yeah, this is a better solution for me too. Thanks a lot for all the
feedback.

I don't think adding pgbench options for individual GUCs really is a
useful exercise?

Greetings,

Andres Freund

--
David

Software Engineer
Highgo Software Inc. (Canada)
www.highgo.ca




Re: WIP: BRIN multi-range indexes

2021-01-19 Thread Tomas Vondra

On 1/19/21 9:44 PM, John Naylor wrote:
On Tue, Jan 12, 2021 at 1:42 PM Tomas Vondra <tomas.von...@enterprisedb.com> wrote:

 > I suspect it's due to minmax having to decide which "ranges" to merge,
 > which requires repeated sorting, etc. I certainly don't dare to claim
 > the current algorithm is perfect. I wouldn't have expected such a big
 > difference, though - so definitely worth investigating.

It seems that monotonically increasing (or decreasing) values in a table 
are a worst case scenario for multi-minmax indexes, or basically, unique 
values within a range. I'm guessing it's because it requires many passes 
to fit all the values into a limited number of ranges. I tried using 
smaller pages_per_range numbers, 32 and 8, and that didn't help.


Now, with a different data distribution, using only 10 values that 
repeat over and over, the results are much more sympathetic to multi-minmax:


insert into iot (num, create_dt)
select random(), '2020-01-01 0:00'::timestamptz + (x % 10 || ' seconds')::interval
from generate_series(1,5*365*24*60*60) x;

create index cd_single on iot using brin(create_dt);
27.2s

create index cd_multi on iot using brin(create_dt timestamptz_minmax_multi_ops);
30.4s

create index cd_bt on iot using btree(create_dt);
61.8s

Circling back to the monotonic case, I tried running a simple perf 
record on a backend creating a multi-minmax index on a timestamptz 
column and these were the highest non-kernel calls:
+   21.98%    21.91%  postgres         postgres            [.] FunctionCall2Coll
+    9.31%     9.29%  postgres         postgres            [.] compare_combine_ranges
+    8.60%     8.58%  postgres         postgres            [.] qsort_arg
+    5.68%     5.66%  postgres         postgres            [.] brin_minmax_multi_add_value
+    5.63%     5.60%  postgres         postgres            [.] timestamp_lt
+    4.73%     4.71%  postgres         postgres            [.] reduce_combine_ranges
+    3.80%     0.00%  postgres         [unknown]           [.] 0x032001680004
+    3.51%     3.50%  postgres         postgres            [.] timestamp_eq

There's no one place that's pathological enough to explain the 4x 
slowness over traditional BRIN and nearly 3x slowness over btree when 
using a large number of unique values per range, so making progress here 
would have to involve a more holistic approach.




Yeah. It very much seems like the primary problem is in how we build
the ranges incrementally - with monotonic sequences, we end up having to
merge the ranges over and over again. I don't know what the structure of
the table was, but I guess it was kinda narrow (very few columns), which
exacerbates the problem further, because the number of rows per range
will be way higher than in the real world.


I do think the solution to this might be to allow more values during 
batch index creation, and only "compress" to the requested number at the 
very end (when serializing to on-disk format).


There are a couple of additional comments about possibly replacing a
sequential scan with a binary search; that could help a bit too.



regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

2021-01-19 Thread Tatsuro Yamada

Hi Julien,


Rebase only, thanks to the cfbot!  V16 attached.


I tested the v16 patch on a0efda88a by using "make installcheck-parallel", and
the result is as follows. The attached file is regression.diffs.


 1 of 202 tests failed.


The differences that caused some tests to fail can be viewed in the
file "/home/postgres/PG140/src/test/regress/regression.diffs".  A copy of the 
test summary that you see
above is saved in the file 
"/home/postgres/PG140/src/test/regress/regression.out".


src/test/regress/regression.diffs
-
diff -U3 /home/postgres/PG140/src/test/regress/expected/rules.out /home/postgres/PG140/src/test/regress/results/rules.out
--- /home/postgres/PG140/src/test/regress/expected/rules.out	2021-01-20 08:41:16.383175559 +0900
+++ /home/postgres/PG140/src/test/regress/results/rules.out	2021-01-20 08:43:46.589171774 +0900
@@ -1760,10 +1760,9 @@
 s.state,
 s.backend_xid,
 s.backend_xmin,
-s.queryid,
 s.query,
 s.backend_type
-   FROM ((pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, 
application_name, state, query, wait_event_type, wait_event, xact_start, 
query_start, backend_start, state_change, client_addr, client_hostname, 
client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, 
sslcipher, sslbits, sslcompression, ssl_client_dn, ssl_client_serial, 
ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid, queryid)
+   FROM ((pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, 
application_name, state, query, wait_event_type, wait_event, xact_start, 
query_start, backend_start, state_change, client_addr, client_hostname, 
client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, 
sslcipher, sslbits, sslcompression, ssl_client_dn, ssl_client_serial, 
ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid)
...

Thanks,
Tatsuro Yamada
diff -U3 /home/postgres/PG140/src/test/regress/expected/rules.out /home/postgres/PG140/src/test/regress/results/rules.out
--- /home/postgres/PG140/src/test/regress/expected/rules.out	2021-01-20 08:41:16.383175559 +0900
+++ /home/postgres/PG140/src/test/regress/results/rules.out	2021-01-20 08:52:10.891159065 +0900
@@ -1760,10 +1760,9 @@
 s.state,
 s.backend_xid,
 s.backend_xmin,
-s.queryid,
 s.query,
 s.backend_type
-   FROM ((pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, 
application_name, state, query, wait_event_type, wait_event, xact_start, 
query_start, backend_start, state_change, client_addr, client_hostname, 
client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, 
sslcipher, sslbits, sslcompression, ssl_client_dn, ssl_client_serial, 
ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid, queryid)
+   FROM ((pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, 
application_name, state, query, wait_event_type, wait_event, xact_start, 
query_start, backend_start, state_change, client_addr, client_hostname, 
client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, 
sslcipher, sslbits, sslcompression, ssl_client_dn, ssl_client_serial, 
ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid)
  LEFT JOIN pg_database d ON ((s.datid = d.oid)))
  LEFT JOIN pg_authid u ON ((s.usesysid = u.oid)));
 pg_stat_all_indexes| SELECT c.oid AS relid,
@@ -1875,7 +1874,7 @@
 s.gss_auth AS gss_authenticated,
 s.gss_princ AS principal,
 s.gss_enc AS encrypted
-   FROM pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, 
application_name, state, query, wait_event_type, wait_event, xact_start, 
query_start, backend_start, state_change, client_addr, client_hostname, 
client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, 
sslcipher, sslbits, sslcompression, ssl_client_dn, ssl_client_serial, 
ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid, queryid)
+   FROM pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, 
application_name, state, query, wait_event_type, wait_event, xact_start, 
query_start, backend_start, state_change, client_addr, client_hostname, 
client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, 
sslcipher, sslbits, sslcompression, ssl_client_dn, ssl_client_serial, 
ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid)
   WHERE (s.client_port IS NOT NULL);
 pg_stat_progress_analyze| SELECT s.pid,
 s.datid,
@@ -2032,7 +2031,7 @@
 w.sync_priority,
 w.sync_state,
 w.reply_time
-   FROM ((pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, 
application_name, state, query, wait_event_type, wait_event, xact_start, 
query_start, backend_start, state_change, client_addr, client_hostname, 
client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, 
sslcipher, sslbits, sslcompression, ssl_client_dn, ssl_client_serial, 
ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid, queryid)
+   FROM 

Re: [PATCH 1/1] Fix detection of pwritev support for OSX.

2021-01-19 Thread James Hilliard
On Tue, Jan 19, 2021 at 3:47 PM James Hilliard
 wrote:
>
> On Tue, Jan 19, 2021 at 1:54 PM Tom Lane  wrote:
> >
> > James Hilliard  writes:
> > > On Tue, Jan 19, 2021 at 10:17 AM Tom Lane  wrote:
> > >> Ah, got it.  So "xcrun --show-sdk-path" tells us the right thing (that
> > >> is, it *does* give us a symlink to a 10.15 SDK) but by refusing to
> > >> believe we've got the right thing, we end up picking MacOSX11.1.sdk.
> > >> Drat.  I suppose we could drop the heuristic about wanting a version
> > >> number in the SDK path, but I really don't want to do that.  Now I'm
> > >> thinking about trying to dereference the symlink after the first step.
> >
> > > The MacOSX11.1.sdk can build for a 10.15 target just fine when passed
> > > an appropriate MACOSX_DEPLOYMENT_TARGET, so that SDK should be
> > > fine.
> >
> > But our out-of-the-box default should be to build for the current
> > platform; we don't want users to have to set MACOSX_DEPLOYMENT_TARGET
> > for that case.  Besides, the problem we're having is exactly that Apple's
> > definition of "builds for a 10.15 target just fine" is different from
> > ours.  They think you should use a run-time test not a compile-time test
> > to discover whether preadv is available, and we don't want to do that.
> The default for MACOSX_DEPLOYMENT_TARGET is always the current
> running OS version from my understanding. So if I build with MacOSX11.1.sdk
> on 10.15 with default settings the binaries will work fine because the
> MACOSX_DEPLOYMENT_TARGET gets set to 10.15 automatically even
> if the same SDK is capable of producing incompatible binaries if you set
> MACOSX_DEPLOYMENT_TARGET to 11.0.
> >
> > In almost all of the cases I've seen so far, Apple's compiler actually
> > does default to using an SDK matching the platform.  The problem we
> > have is that we try to name the SDK explicitly, and the current
> > method is failing to pick the right one in your case.  There are
> > several reasons for using an explicit -isysroot rather than just
> > letting the compiler default:
> No, it's only the MACOSX_DEPLOYMENT_TARGET that matches the
> platform; the SDK can be more or less arbitrary, but it will work fine because
> the autoselected MACOSX_DEPLOYMENT_TARGET will force compatibility
> no matter what SDK version you use. This is always how it has worked
> from what I've seen.
> >
> > * We have seen cases in which the compiler acts as though it has
> > *no* default sysroot, and we have to help it out.
> >
> > * The explicit root reduces version-skew build hazards for extensions
> > that are not built at the same time as the core system.
> The deployment target is effectively entirely separate from SDK version,
> so it really shouldn't make a difference unless the SDK is significantly
> older or newer than the running version from what I can tell.
> >
> > * There are a few tests in configure itself that need to know the
> > sysroot path to check for files there.
> >
> > Anyway, the behavior you're seeing shows that 4823621db is still a
> > bit shy of a load.  I'm thinking about the attached as a further
> > fix --- can you verify it helps for you?
> Best I can tell it provides no change for me (this patch is tested on top of
> it)
> because it does not provide any MACOSX_DEPLOYMENT_TARGET
> based feature detection for pwritev at all.
Actually, this path looks wrong in general: the value from
"xcrun --sdk macosx --show-sdk-path" should take precedence over
"xcrun --show-sdk-path", as the latter may potentially be used for iOS.
On my system "xcodebuild -version -sdk macosx Path" and
"xcrun --sdk macosx --show-sdk-path" both point to the
correct latest MacOSX11.1.sdk SDK while "xcrun --show-sdk-path"
points to the older one.
> >
> > regards, tom lane
> >




Re: New IndexAM API controlling index vacuum strategies

2021-01-19 Thread Peter Geoghegan
On Sun, Jan 17, 2021 at 9:18 PM Masahiko Sawada  wrote:
> After more thought, I think that ambulkdelete needs to be able to
> refer the answer to amvacuumstrategy. That way, the index can skip
> bulk-deletion when lazy vacuum doesn't vacuum heap and it also doesn’t
> want to do that.

Makes sense.

BTW, your patch has bitrot already. Peter E's recent pageinspect
commit happens to conflict with this patch. It might make sense to
produce a new version that just fixes the bitrot, so that other people
don't have to deal with it each time.

> I’ve attached the updated version patch that includes the following changes:

Looks good. I'll give this version a review now. I will do a lot more
soon. I need to come up with a good benchmark for this, that I can
return to again and again during review as needed.

Some feedback on the first patch:

* Just so you know: I agree with you about handling
VACOPT_TERNARY_DISABLED in the index AM's amvacuumstrategy routine. I
think that it's better to do that there, even though this choice may
have some downsides.

* Can you add some "stub" sgml doc changes for this? Doesn't have to
be complete in any way. Just a placeholder for later, that has the
correct general "shape" to orientate the reader of the patch. It can
just be a FIXME comment, plus basic mechanical stuff -- details of the
new amvacuumstrategy_function routine and its signature.

Some feedback on the second patch:

* Why do you move around IndexVacuumStrategy in the second patch?
Looks like a rebasing oversight.

* Actually, do we really need the first and second patches to be
separate patches? I agree that the nbtree patch should be a separate
patch, but dividing the first two sets of changes doesn't seem like it
adds much. Did I miss some something?

* Is the "MAXALIGN(sizeof(ItemIdData)))" change in the definition of
MaxHeapTuplesPerPage appropriate? Here is the relevant section from
the patch:

diff --git a/src/include/access/htup_details.h
b/src/include/access/htup_details.h
index 7c62852e7f..038e7cd580 100644
--- a/src/include/access/htup_details.h
+++ b/src/include/access/htup_details.h
@@ -563,17 +563,18 @@ do { \
 /*
  * MaxHeapTuplesPerPage is an upper bound on the number of tuples that can
  * fit on one heap page.  (Note that indexes could have more, because they
- * use a smaller tuple header.)  We arrive at the divisor because each tuple
- * must be maxaligned, and it must have an associated line pointer.
+ * use a smaller tuple header.)  We arrive at the divisor because each line
+ * pointer must be maxaligned.
*** SNIP ***
 #define MaxHeapTuplesPerPage\
-((int) ((BLCKSZ - SizeOfPageHeaderData) / \
-(MAXALIGN(SizeofHeapTupleHeader) + sizeof(ItemIdData
+((int) ((BLCKSZ - SizeOfPageHeaderData) / (MAXALIGN(sizeof(ItemIdData)

It's true that ItemIdData structs (line pointers) are aligned, but
they're not MAXALIGN()'d. If they were then the on-disk size of line
pointers would generally be 8 bytes, not 4 bytes.

* Maybe it would be better if you just changed the definition such
that "MAXALIGN(SizeofHeapTupleHeader)" became "MAXIMUM_ALIGNOF", with
no other changes? (Some variant of this suggestion might be better,
not sure.)

For some reason that feels a bit safer: we still have an "imaginary
tuple header", but it's just 1 MAXALIGN() quantum now. This is still
much less than the current 3 MAXALIGN() quantums (i.e. what
MaxHeapTuplesPerPage treats as the tuple header size). Do you think
that this alternative approach will be noticeably less effective
within vacuumlazy.c?
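
Spelled out, the alternative being suggested here would be roughly the
following (just a sketch of the idea, not text from the patch; with 8-byte
alignment it works out to about 680 items per 8 kB page, versus 291 on master
and 1021 with the patch as posted):

#define MaxHeapTuplesPerPage \
	((int) ((BLCKSZ - SizeOfPageHeaderData) / \
			(MAXIMUM_ALIGNOF + sizeof(ItemIdData))))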

Note that you probably understand the issue with MaxHeapTuplesPerPage
for vacuumlazy.c better than I do currently. I'm still trying to
understand your choices, and to understand what is really important
here.

* Maybe add a #define for the value 0.7? (I refer to the value used in
choose_vacuum_strategy() to calculate a "this is the number of LP_DEAD
line pointers that we consider too many" cut off point, which is to be
applied throughout lazy_scan_heap() processing.)
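
(Something as small as the following would do; the name is only a placeholder,
not taken from the patch:

/* placeholder name: fraction of LP_DEAD line pointers considered "too many" */
#define LAZY_VACUUM_DEAD_ITEM_FRACTION	0.7
)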

* I notice that your new lazy_vacuum_table_and_indexes() function is
the only place that calls lazy_vacuum_table_and_indexes(). I think
that you should merge them together -- replace the only remaining call
to lazy_vacuum_table_and_indexes() with the body of the function
itself. Having a separate lazy_vacuum_table_and_indexes() function
doesn't seem useful to me -- it doesn't actually hide complexity, and
might even be harder to maintain.

* I suggest thinking about what the last item will mean for the
reporting that currently takes place in
lazy_vacuum_table_and_indexes(), but will now go in an expanded
lazy_vacuum_table_and_indexes() -- how do we count the total number of
index scans now?

I don't actually believe that the logic needs to change, but some kind
of consolidation and streamlining seems like it might be helpful.
Maybe just a comment that says "note that all index scans might just
be no-ops because..." -- stuff like that.

* Any 

Re: [PATCH 1/1] Fix detection of pwritev support for OSX.

2021-01-19 Thread James Hilliard
On Tue, Jan 19, 2021 at 1:54 PM Tom Lane  wrote:
>
> James Hilliard  writes:
> > On Tue, Jan 19, 2021 at 10:17 AM Tom Lane  wrote:
> >> Ah, got it.  So "xcrun --show-sdk-path" tells us the right thing (that
> >> is, it *does* give us a symlink to a 10.15 SDK) but by refusing to
> >> believe we've got the right thing, we end up picking MacOSX11.1.sdk.
> >> Drat.  I suppose we could drop the heuristic about wanting a version
> >> number in the SDK path, but I really don't want to do that.  Now I'm
> >> thinking about trying to dereference the symlink after the first step.
>
> > The MacOSX11.1.sdk can build for a 10.15 target just fine when passed
> > an appropriate MACOSX_DEPLOYMENT_TARGET, so that SDK should be
> > fine.
>
> But our out-of-the-box default should be to build for the current
> platform; we don't want users to have to set MACOSX_DEPLOYMENT_TARGET
> for that case.  Besides, the problem we're having is exactly that Apple's
> definition of "builds for a 10.15 target just fine" is different from
> ours.  They think you should use a run-time test not a compile-time test
> to discover whether preadv is available, and we don't want to do that.
The default for MACOSX_DEPLOYMENT_TARGET is always the current
running OS version from my understanding. So if I build with MacOSX11.1.sdk
on 10.15 with default settings the binaries will work fine because the
MACOSX_DEPLOYMENT_TARGET gets set to 10.15 automatically even
if the same SDK is capable of producing incompatible binaries if you set
MACOSX_DEPLOYMENT_TARGET to 11.0.
>
> In almost all of the cases I've seen so far, Apple's compiler actually
> does default to using an SDK matching the platform.  The problem we
> have is that we try to name the SDK explicitly, and the current
> method is failing to pick the right one in your case.  There are
> several reasons for using an explicit -isysroot rather than just
> letting the compiler default:
No, it's only the MACOSX_DEPLOYMENT_TARGET that matches the
platform; the SDK can be more or less arbitrary, but it will work fine because
the autoselected MACOSX_DEPLOYMENT_TARGET will force compatibility
no matter what SDK version you use. This is always how it has worked
from what I've seen.
>
> * We have seen cases in which the compiler acts as though it has
> *no* default sysroot, and we have to help it out.
>
> * The explicit root reduces version-skew build hazards for extensions
> that are not built at the same time as the core system.
The deployment target is effectively entirely separate from SDK version,
so it really shouldn't make a difference unless the SDK is significantly
older or newer than the running version from what I can tell.
>
> * There are a few tests in configure itself that need to know the
> sysroot path to check for files there.
>
> Anyway, the behavior you're seeing shows that 4823621db is still a
> bit shy of a load.  I'm thinking about the attached as a further
> fix --- can you verify it helps for you?
Best I can tell it provides no change for me (this patch is tested on top of it)
because it does not provide any MACOSX_DEPLOYMENT_TARGET
based feature detection for pwritev at all.
>
> regards, tom lane
>




Re: Add primary keys to system catalogs

2021-01-19 Thread Mark Rofail
I'll post the latest rebased patch tomorrow, with a summary of the issues.
Let's continue this in the foreign key array thread, so as not to clutter this
one.

Regards, Mark Rofail.

On Tue, Jan 19, 2021, 11:00 PM Joel Jacobson  wrote:

> On Tue, Jan 19, 2021, at 18:25, Mark Rofail wrote:
> >Dear Joel,
> >
> >I would love to start working on this again, I tried to revive the patch
> again in 2019, however, I faced the same problems as last time (detailed
> in the thread) and I didn't get the support I needed to continue with this
> patch.
> >
> >Are you willing to help rebase and finally publish this patch?
>
> Willing, sure, but perhaps not able:
> if there are complex problems that need a deep understanding of the code base,
> I'm not sure I will be of much help, but I can for sure do testing and
> reviewing.
>
> /Joel
>


Re: Odd, intermittent failure in contrib/pageinspect

2021-01-19 Thread Tom Lane
Michael Paquier  writes:
> On Mon, Jan 18, 2021 at 05:47:40PM -0500, Tom Lane wrote:
>> So, do we have any other tests that are invoking a manual vacuum
>> and assuming it won't skip any pages?  By this theory, they'd
>> all be failures waiting to happen.

> check_heap.sql and heap_surgery.sql have one VACUUM FREEZE each and it
> seems to me that we had better be sure that no pages are skipped for
> their cases?

It looks to me like heap_surgery ought to be okay, because it's operating
on a temp table; if there are any page access conflicts on that, we've
got BIG trouble ;-)

Poking around, I found a few other places where it looked like a skipped
page could produce diffs in the expected output:
contrib/amcheck/t/001_verify_heapam.pl
contrib/pg_visibility/sql/pg_visibility.sql

There are lots of other vacuums of course, but they don't look like
a missed page would have any effect on the visible results, so I think
we should leave them alone.

In short I propose the attached patch, which also gets rid of
that duplicate query.

regards, tom lane

diff --git a/contrib/amcheck/expected/check_heap.out b/contrib/amcheck/expected/check_heap.out
index 882f853d56..1fb3823142 100644
--- a/contrib/amcheck/expected/check_heap.out
+++ b/contrib/amcheck/expected/check_heap.out
@@ -109,7 +109,7 @@ ERROR:  ending block number must be between 0 and 0
 SELECT * FROM verify_heapam(relation := 'heaptest', startblock := 1, endblock := 11000);
 ERROR:  starting block number must be between 0 and 0
 -- Vacuum freeze to change the xids encountered in subsequent tests
-VACUUM FREEZE heaptest;
+VACUUM (FREEZE, DISABLE_PAGE_SKIPPING) heaptest;
 -- Check that valid options are not rejected nor corruption reported
 -- for a non-empty frozen table
 SELECT * FROM verify_heapam(relation := 'heaptest', skip := 'none');
diff --git a/contrib/amcheck/sql/check_heap.sql b/contrib/amcheck/sql/check_heap.sql
index c10a25f21c..298de6886a 100644
--- a/contrib/amcheck/sql/check_heap.sql
+++ b/contrib/amcheck/sql/check_heap.sql
@@ -51,7 +51,7 @@ SELECT * FROM verify_heapam(relation := 'heaptest', startblock := 0, endblock :=
 SELECT * FROM verify_heapam(relation := 'heaptest', startblock := 1, endblock := 11000);
 
 -- Vacuum freeze to change the xids encountered in subsequent tests
-VACUUM FREEZE heaptest;
+VACUUM (FREEZE, DISABLE_PAGE_SKIPPING) heaptest;
 
 -- Check that valid options are not rejected nor corruption reported
 -- for a non-empty frozen table
diff --git a/contrib/amcheck/t/001_verify_heapam.pl b/contrib/amcheck/t/001_verify_heapam.pl
index 1581e51f3c..a2f65b826d 100644
--- a/contrib/amcheck/t/001_verify_heapam.pl
+++ b/contrib/amcheck/t/001_verify_heapam.pl
@@ -46,7 +46,7 @@ detects_heap_corruption(
 # Check a corrupt table with all-frozen data
 #
 fresh_test_table('test');
-$node->safe_psql('postgres', q(VACUUM FREEZE test));
+$node->safe_psql('postgres', q(VACUUM (FREEZE, DISABLE_PAGE_SKIPPING) test));
 corrupt_first_page('test');
 detects_heap_corruption("verify_heapam('test')",
 	"all-frozen corrupted table");
diff --git a/contrib/pageinspect/expected/page.out b/contrib/pageinspect/expected/page.out
index 4cd0db8018..4da28f0a1d 100644
--- a/contrib/pageinspect/expected/page.out
+++ b/contrib/pageinspect/expected/page.out
@@ -1,7 +1,7 @@
 CREATE EXTENSION pageinspect;
 CREATE TABLE test1 (a int, b int);
 INSERT INTO test1 VALUES (16777217, 131584);
-VACUUM test1;  -- set up FSM
+VACUUM (DISABLE_PAGE_SKIPPING) test1;  -- set up FSM
 -- The page contents can vary, so just test that it can be read
 -- successfully, but don't keep the output.
 SELECT octet_length(get_raw_page('test1', 'main', 0)) AS main_0;
@@ -87,18 +87,8 @@ SELECT * FROM fsm_page_contents(get_raw_page('test1', 'fsm', 0));
 (1 row)
 
 -- If we freeze the only tuple on test1, the infomask should
--- always be the same in all test runs. we show raw flags by
--- default: HEAP_XMIN_COMMITTED and HEAP_XMIN_INVALID.
-VACUUM FREEZE test1;
-SELECT t_infomask, t_infomask2, raw_flags, combined_flags
-FROM heap_page_items(get_raw_page('test1', 0)),
- LATERAL heap_tuple_infomask_flags(t_infomask, t_infomask2);
- t_infomask | t_infomask2 | raw_flags |   combined_flags   
-+-+---+
-   2816 |   2 | {HEAP_XMIN_COMMITTED,HEAP_XMIN_INVALID,HEAP_XMAX_INVALID} | {HEAP_XMIN_FROZEN}
-(1 row)
-
--- output the decoded flag HEAP_XMIN_FROZEN instead
+-- always be the same in all test runs.
+VACUUM (FREEZE, DISABLE_PAGE_SKIPPING) test1;
 SELECT t_infomask, t_infomask2, raw_flags, combined_flags
 FROM heap_page_items(get_raw_page('test1', 0)),
  LATERAL heap_tuple_infomask_flags(t_infomask, t_infomask2);
diff --git a/contrib/pageinspect/sql/page.sql b/contrib/pageinspect/sql/page.sql
index 01844cb629..d333b763d7 100644
--- a/contrib/pageinspect/sql/page.sql
+++ b/contrib/pageinspect/sql/page.sql

Re: cfbot building docs - serving results

2021-01-19 Thread Thomas Munro
On Wed, Jan 20, 2021 at 10:22 AM Erik Rijkers  wrote:
> I am wondering if the cfbot at the moment is building the docs
> (html+pdf), for the patches that it tests.  I suppose that it does?  If
> so, what happens with the resulting (doc)files? To /dev/null?   They are
> not available as far as I can see.  Would it be feasible to make them
> available, either serving the html, or to make docs html+pdf a
> downloadable zipfile?

It does build the docs as part of the Linux build.  I picked that
because Cirrus has more Linux horsepower available than the other
OSes, and there's no benefit to doing that on all the OSes.

That's a good idea, and I suspect it could be handled as an
"artifact", though I haven't looked into that:

https://cirrus-ci.org/guide/writing-tasks/#artifacts-instruction

It'd also be nice to (somehow) know which .html pages changed so you
could go straight to the new stuff without the intermediate step of
wondering where .sgml changes come out!

Another good use for artifacts that I used once or twice is the
ability to allow the results of the Windows build to be downloaded in
a .zip file and tested by non-developers without the build tool chain.

> (it would also be useful to be able see at a glance somewhere if the
> patch contains sgml-changes at all...)

True.  Basically you want to be able to find the diffstat output quickly.




Re: compression libraries and CF bot

2021-01-19 Thread Tom Lane
Thomas Munro  writes:
> On Wed, Jan 20, 2021 at 9:56 AM Justin Pryzby  wrote:
>> Also, what's the process for having new libraries installed in the CI
>> environment ?

> I have added lz4 to the FreeBSD and Ubuntu build tasks, so we'll see
> if that helps at the next periodic build or when a new patch is
> posted.  It's failing on Windows because there is no HAVE_LIBLZ4 in
> Solution.pm, and I don't know how to install that on a Mac.  Is this
> patch supposed to be adding a new required dependency, or a new
> optional dependency?

It had better be optional.

regards, tom lane




Re: compression libraries and CF bot

2021-01-19 Thread Thomas Munro
On Wed, Jan 20, 2021 at 9:56 AM Justin Pryzby  wrote:
> Do you know if the old travis build environment had liblz4 installed ?

It sounds like it.

> I'm asking regarding Dilip's patch, which was getting to "check world" 2 weeks
> ago but now failing to even compile, not apparently due to any change in the
> patch.  Also, are the historic logs available somewhere ?
> http://cfbot.cputube.org/dilip-kumar.html

I can find some of them but not that one, because Travis's "branches"
page truncates well before our ~250 active branches, and that one
isn't in there.

https://travis-ci.org/github/postgresql-cfbot/postgresql/branches

> Also, what's the process for having new libraries installed in the CI
> environment ?

I have added lz4 to the FreeBSD and Ubuntu build tasks, so we'll see
if that helps at the next periodic build or when a new patch is
posted.  It's failing on Windows because there is no HAVE_LIBLZ4 in
Solution.pm, and I don't know how to install that on a Mac.  Is this
patch supposed to be adding a new required dependency, or a new
optional dependency?

In general, you could ask for changes here, or send me a pull request for eg:

https://github.com/macdice/cfbot/blob/master/cirrus/.cirrus.yml

If we eventually think the CI control file is good enough, and can get
past the various political discussions required to put CI
vendor-specific material in our tree, it'd be just a regular patch
proposal and could even be tweaked as part of a feature submission.

> There's 3 compression patches going around, so I think eventually we'll ask to
> get libzstd-devel (for libpq and pg_dump) and liblz4-devel (for toast and
> libpq).  Maybe all compression methods would be supported in each place - I
> hope the patches will share common code.

+1, nice to see modern compression coming to PostgreSQL.




Re: Printing backtrace of postgres processes

2021-01-19 Thread Tom Lane
Robert Haas  writes:
> On Tue, Jan 19, 2021 at 12:50 PM Tom Lane  wrote:
>> I think it's got security hazards as well.  If we restricted the
>> feature to cause a trace of only one process at a time, and required
>> that process to be logged in as the same database user as the
>> requestor (or at least someone with the privs of that user, such
>> as a superuser), that'd be less of an issue.

> I am not sure I see a security hazard here. I think the checks for
> this should have the same structure as pg_terminate_backend() and
> pg_cancel_backend(); whatever is required there should be required
> here, too, but not more, unless we have a real clear reason for such
> an inconsistency.

Yeah, agreed.  The "broadcast" option seems inconsistent with doing
things that way, but I don't have a problem with being able to send
a trace signal to the same processes you could terminate.

>> I share your estimate that there'll be small-fraction-of-a-percent
>> hazards that could still add up to dangerous instability if people
>> go wild with this.

> Right. I was more concerned about whether we could, for example, crash
> while inside the function that generates the backtrace, on some
> platforms or in some scenarios. That would be super-sad. I assume that
> the people who wrote the code tried to make sure that didn't happen
> but I don't really know what's reasonable to expect.

Recursion is scary, but it should (I think) not be possible if this
is driven off CHECK_FOR_INTERRUPTS.  There will certainly be none
of those in libbacktrace.

One point here is that it might be a good idea to suppress elog.c's
calls to functions in the error context stack.  As we saw in another
connection recently, allowing that to happen makes for a *very*
large increase in the footprint of code that you are expecting to
work at any random CHECK_FOR_INTERRUPTS call site.

BTW, it also looks like the patch is doing nothing to prevent the
backtrace from being sent to the connected client.  I'm not sure
what I think about whether it'd be okay from a security standpoint
to do that on the connection that requested the trace, but I sure
as heck don't want it to happen on connections that didn't.  Also,
whatever you think about security concerns, it seems like there'd be
pretty substantial reentrancy hazards if the interrupt occurs
anywhere in code dealing with normal client communication.

Maybe, given all of these things, we should forget using elog at
all and just emit the trace with fprintf(stderr).  That seems like
it would decrease the odds of trouble by about an order of magnitude.
It would make it more painful to collect the trace in some logging
setups, of course.
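
For illustration, the dump-straight-to-stderr approach could be about as
simple as the following (a minimal sketch using <execinfo.h>, which glibc and
macOS provide; the patch itself presumably keeps using libbacktrace):

#include <execinfo.h>
#include <unistd.h>

/*
 * Minimal sketch: from the CHECK_FOR_INTERRUPTS-driven handler, dump the
 * raw backtrace straight to stderr, bypassing elog() entirely.
 */
static void
emit_backtrace_to_stderr(void)
{
	void	   *frames[100];
	int			nframes = backtrace(frames, 100);

	/* backtrace_symbols_fd() writes to the fd directly and does not malloc */
	backtrace_symbols_fd(frames, nframes, STDERR_FILENO);
}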

regards, tom lane




cfbot building docs - serving results

2021-01-19 Thread Erik Rijkers

Hi Thomas,

I am wondering if the cfbot at the moment is building the docs
(html+pdf) for the patches that it tests.  I suppose that it does?  If
so, what happens with the resulting (doc) files? Do they go to /dev/null?
They are not available as far as I can see.  Would it be feasible to make
them available, either by serving the html, or by making the docs (html+pdf)
a downloadable zipfile?


(it would also be useful to be able to see at a glance somewhere whether the
patch contains sgml changes at all...)



Thanks,

Erik Rijkers




Re: create table like: ACCESS METHOD

2021-01-19 Thread Justin Pryzby
On Wed, Dec 30, 2020 at 12:33:56PM +, Simon Riggs wrote:
> There are no tests for the new functionality, please could you add some?

Did you look at the most recent patch?

+CREATE ACCESS METHOD heapdup TYPE TABLE HANDLER heap_tableam_handler;
+CREATE TABLE likeam() USING heapdup;
+CREATE TABLE likeamlike(LIKE likeam INCLUDING ALL);

   

Also, I just realized that Dilip's toast compression patch adds "INCLUDING
COMPRESSION", which is stored in pg_am.  That's an implementation detail of
that patch, but it's not intuitive that "including access method" wouldn't
include the compression stored there.  So I think this should use "INCLUDING
TABLE ACCESS METHOD" not just ACCESS METHOD.  

-- 
Justin
>From f27fd6291aa10af1ca0be4bc72a656811c8e0c9f Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sun, 15 Nov 2020 16:54:53 -0600
Subject: [PATCH v3] create table (like .. including ACCESS METHOD)

---
 doc/src/sgml/ref/create_table.sgml  | 12 +++-
 src/backend/parser/gram.y   |  1 +
 src/backend/parser/parse_utilcmd.c  | 10 ++
 src/include/nodes/parsenodes.h  |  1 +
 src/test/regress/expected/create_table_like.out | 12 
 src/test/regress/sql/create_table_like.sql  |  8 
 6 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 569f4c9da7..e3c607f6b1 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -87,7 +87,7 @@ class="parameter">referential_action ] [ ON UPDATE and like_option is:
 
-{ INCLUDING | EXCLUDING } { COMMENTS | CONSTRAINTS | DEFAULTS | GENERATED | IDENTITY | INDEXES | STATISTICS | STORAGE | ALL }
+{ INCLUDING | EXCLUDING } { COMMENTS | CONSTRAINTS | DEFAULTS | GENERATED | IDENTITY | INDEXES | STATISTICS | STORAGE | TABLE ACCESS METHOD | ALL }
 
 and partition_bound_spec is:
 
@@ -689,6 +689,16 @@ WITH ( MODULUS numeric_literal, REM
 

 
+   
+INCLUDING TABLE ACCESS METHOD
+
+ 
+  The table's access method will be copied.  By default, the
+  default_table_access_method is used.
+ 
+
+   
+

 INCLUDING ALL
 
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 31c95443a5..719ac838e3 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -3708,6 +3708,7 @@ TableLikeOption:
 | INDEXES			{ $$ = CREATE_TABLE_LIKE_INDEXES; }
 | STATISTICS		{ $$ = CREATE_TABLE_LIKE_STATISTICS; }
 | STORAGE			{ $$ = CREATE_TABLE_LIKE_STORAGE; }
+| TABLE ACCESS METHOD { $$ = CREATE_TABLE_LIKE_TABLE_ACCESS_METHOD; }
 | ALL{ $$ = CREATE_TABLE_LIKE_ALL; }
 		;
 
diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c
index b31f3afa03..f34f42aae3 100644
--- a/src/backend/parser/parse_utilcmd.c
+++ b/src/backend/parser/parse_utilcmd.c
@@ -96,6 +96,7 @@ typedef struct
 	bool		ispartitioned;	/* true if table is partitioned */
 	PartitionBoundSpec *partbound;	/* transformed FOR VALUES */
 	bool		ofType;			/* true if statement contains OF typename */
+	char		*accessMethod;	/* table access method */
 } CreateStmtContext;
 
 /* State shared by transformCreateSchemaStmt and its subroutines */
@@ -252,6 +253,7 @@ transformCreateStmt(CreateStmt *stmt, const char *queryString)
 	cxt.ispartitioned = stmt->partspec != NULL;
 	cxt.partbound = stmt->partbound;
 	cxt.ofType = (stmt->ofTypename != NULL);
+	cxt.accessMethod = NULL;
 
 	Assert(!stmt->ofTypename || !stmt->inhRelations);	/* grammar enforces */
 
@@ -346,6 +348,9 @@ transformCreateStmt(CreateStmt *stmt, const char *queryString)
 	stmt->tableElts = cxt.columns;
 	stmt->constraints = cxt.ckconstraints;
 
+	if (cxt.accessMethod != NULL)
+		stmt->accessMethod = cxt.accessMethod;
+
 	result = lappend(cxt.blist, stmt);
 	result = list_concat(result, cxt.alist);
 	result = list_concat(result, save_alist);
@@ -1118,6 +1123,11 @@ transformTableLikeClause(CreateStmtContext *cxt, TableLikeClause *table_like_cla
 		cxt->likeclauses = lappend(cxt->likeclauses, table_like_clause);
 	}
 
+	/* ACCESS METHOD doesn't apply and isn't copied for partitioned tables */
+	if ((table_like_clause->options & CREATE_TABLE_LIKE_TABLE_ACCESS_METHOD) != 0 &&
+		!cxt->ispartitioned)
+		cxt->accessMethod = get_am_name(relation->rd_rel->relam);
+
 	/*
 	 * We may copy extended statistics if requested, since the representation
 	 * of CreateStatsStmt doesn't depend on column numbers.
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index dc2bb40926..600856c229 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -685,6 +685,7 @@ typedef enum TableLikeOption
 	

Re: Add primary keys to system catalogs

2021-01-19 Thread Joel Jacobson
On Tue, Jan 19, 2021, at 18:25, Mark Rofail wrote:
>Dear Joel,
>
>I would love to start working on this again, I tried to revive the patch again
>in 2019, however, I faced the same problems as last time (detailed in the
>thread) and I didn't get the support I needed to continue with this patch.
>
>Are you willing to help rebase and finally publish this patch?

Willing, sure, but perhaps not able:
if there are complex problems that need a deep understanding of the code base,
I'm not sure I will be of much help, but I can for sure do testing and
reviewing.

/Joel

Re: [PATCH v2 1/1] Fix detection of pwritev support for OSX.

2021-01-19 Thread Tom Lane
James Hilliard  writes:
> I also don't think this is a weak symbol.

> From the header file it does not have __attribute__((weak_import)):
> ssize_t pwritev(int, const struct iovec *, int, off_t)
> __DARWIN_NOCANCEL(pwritev) __API_AVAILABLE(macos(11.0), ios(14.0),
> watchos(7.0), tvos(14.0));

See the other thread.  I found by looking at the asm output that
what __API_AVAILABLE actually does is cause the compiler to emit
a ".weak_reference" directive when calling a function it thinks
might not be available.  So there's some sort of weak linking
going on, though it's certainly possible that it's not shaped
in a way that'd help us do this the way we'd prefer.

regards, tom lane




Re: {CREATE INDEX, REINDEX} CONCURRENTLY improvements

2021-01-19 Thread Matthias van de Meent
On Mon, 18 Jan 2021, 21:25 Álvaro Herrera,  wrote:
>
> On 2021-Jan-18, Matthias van de Meent wrote:
>
> > Example:
> >
> > 1.) RI starts
> > 2.) PHASE 2: filling the index:
> > 2.1.) scanning the heap (live tuple is cached)
> > < tuple is deleted
> > < last transaction other than RI commits, only snapshot of RI exists
> > < vacuum drops the tuple, and cannot remove it from the new index
> > because this new index is not yet populated.
> > 2.2.) sorting tuples
> > 2.3.) index filled with tuples, incl. deleted tuple
> > 3.) PHASE 3: wait for transactions
> > 4.) PHASE 4: validate does not remove the tuple from the index,
> > because it is not built to do so: it will only insert new tuples.
> > Tuples that are marked for deletion are removed from the index only
> > through VACUUM (and optimistic ALL_DEAD detection).
> >
> > According to my limited knowledge of RI, it requires VACUUM to not run
> > on the table during the initial index build process (which is
> > currently guaranteed through the use of a snapshot).
>
> VACUUM cannot run concurrently with CIC or RI in a table -- both acquire
> ShareUpdateExclusiveLock, which conflicts with itself, so this cannot
> occur.

Yes, you are correct. Vacuum indeed has a ShareUpdateExclusiveLock.
Are there no other ways that pages are optimistically pruned?

But the base case still stands: ignoring CIC snapshots in  would give
the all_dead semantics to tuples that are actually still considered
alive in some context, and that should not yet be deleted (you're deleting
data from an in-use snapshot). Any local pruning optimizations using
all_dead mechanics now cannot be run on the table unless they hold a
ShareUpdateExclusiveLock, though I'm unaware of any such mechanisms
(other than the one below).

> I do wonder if the problem you suggest (or something similar) can occur
> via HOT pruning, though.

It could not, at least not at the current HEAD, as only one tuple in a
HOT-chain can be alive at one point, and all indexes point to the root
of the HOT-chain, which is never HOT-pruned. See also the
src/backend/access/heap/README.HOT.

Regards,

Matthias van de Meent




Re: Printing backtrace of postgres processes

2021-01-19 Thread Robert Haas
On Tue, Jan 19, 2021 at 12:50 PM Tom Lane  wrote:
> The thing that is scaring me the most is the "broadcast" aspect.
> For starters, I think that that is going to be useless in the
> field because of the likelihood that different backends' stack
> traces will get interleaved in whatever log file the traces are
> going to.  Also, having hundreds of processes spitting dozens of
> lines to the same place at the same time is going to expose any
> little weaknesses in your logging arrangements, such as rsyslog's
> tendency to drop messages under load.

+1. I don't think broadcast is a good idea.

> I think it's got security hazards as well.  If we restricted the
> feature to cause a trace of only one process at a time, and required
> that process to be logged in as the same database user as the
> requestor (or at least someone with the privs of that user, such
> as a superuser), that'd be less of an issue.

I am not sure I see a security hazard here. I think the checks for
this should have the same structure as pg_terminate_backend() and
pg_cancel_backend(); whatever is required there should be required
here, too, but not more, unless we have a real clear reason for such
an inconsistency.

> Beyond that, well, maybe it's all right.  In theory anyplace that
> there's a CHECK_FOR_INTERRUPTS should be okay to call elog from;
> but it's completely untested whether we can do that and then
> continue, as opposed to aborting the transaction or whole session.

I guess that's a theoretical risk but it doesn't seem very likely.
And, if we do have such a problem, I think that'd probably be a case
of bad code that we would want to fix either way.

> I share your estimate that there'll be small-fraction-of-a-percent
> hazards that could still add up to dangerous instability if people
> go wild with this.

Right. I was more concerned about whether we could, for example, crash
while inside the function that generates the backtrace, on some
platforms or in some scenarios. That would be super-sad. I assume that
the people who wrote the code tried to make sure that didn't happen
but I don't really know what's reasonable to expect.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




compression libraries and CF bot

2021-01-19 Thread Justin Pryzby
Do you know if the old travis build environment had liblz4 installed ?

I'm asking regarding Dilip's patch, which was getting to "check world" 2 weeks
ago but is now failing to even compile, apparently not due to any change in the
patch.  Also, are the historic logs available somewhere?
http://cfbot.cputube.org/dilip-kumar.html

Also, what's the process for having new libraries installed in the CI
environment ?

There's 3 compression patches going around, so I think eventually we'll ask to
get libzstd-devel (for libpq and pg_dump) and liblz4-devel (for toast and
libpq).  Maybe all compression methods would be supported in each place - I
hope the patches will share common code.

-- 
Justin




Re: [PATCH 1/1] Fix detection of pwritev support for OSX.

2021-01-19 Thread Tom Lane
James Hilliard  writes:
> On Tue, Jan 19, 2021 at 10:17 AM Tom Lane  wrote:
>> Ah, got it.  So "xcrun --show-sdk-path" tells us the right thing (that
>> is, it *does* give us a symlink to a 10.15 SDK) but by refusing to
>> believe we've got the right thing, we end up picking MacOSX11.1.sdk.
>> Drat.  I suppose we could drop the heuristic about wanting a version
>> number in the SDK path, but I really don't want to do that.  Now I'm
>> thinking about trying to dereference the symlink after the first step.

> The MacOSX11.1.sdk can build for a 10.15 target just fine when passed
> an appropriate MACOSX_DEPLOYMENT_TARGET, so that SDK should be
> fine.

But our out-of-the-box default should be to build for the current
platform; we don't want users to have to set MACOSX_DEPLOYMENT_TARGET
for that case.  Besides, the problem we're having is exactly that Apple's
definition of "builds for a 10.15 target just fine" is different from
ours.  They think you should use a run-time test not a compile-time test
to discover whether preadv is available, and we don't want to do that.

In almost all of the cases I've seen so far, Apple's compiler actually
does default to using an SDK matching the platform.  The problem we
have is that we try to name the SDK explicitly, and the current
method is failing to pick the right one in your case.  There are
several reasons for using an explicit -isysroot rather than just
letting the compiler default:

* We have seen cases in which the compiler acts as though it has
*no* default sysroot, and we have to help it out.

* The explicit root reduces version-skew build hazards for extensions
that are not built at the same time as the core system.

* There are a few tests in configure itself that need to know the
sysroot path to check for files there.

Anyway, the behavior you're seeing shows that 4823621db is still a
bit shy of a load.  I'm thinking about the attached as a further
fix --- can you verify it helps for you?

regards, tom lane

diff --git a/src/template/darwin b/src/template/darwin
index 1868c147cb..e14d53b601 100644
--- a/src/template/darwin
+++ b/src/template/darwin
@@ -7,13 +7,20 @@
 if test x"$PG_SYSROOT" = x"" ; then
   # This is far more complicated than it ought to be.  We first ask
   # "xcrun --show-sdk-path", which seems to match the default -isysroot
-  # setting of Apple's compilers.  However, that may produce no result or
-  # a result that is not version-specific (i.e., just ".../SDKs/MacOSX.sdk").
-  # Using a version-specific sysroot seems desirable, so if there are not
-  # digits in the directory name, try "xcrun --sdk macosx --show-sdk-path";
-  # and if that still doesn't work, fall back to asking xcodebuild,
-  # which is often a good deal slower.
+  # setting of Apple's compilers.
   PG_SYSROOT=`xcrun --show-sdk-path 2>/dev/null`
+  # That may fail, or produce a result that is not version-specific (i.e.,
+  # just ".../SDKs/MacOSX.sdk").  Using a version-specific sysroot seems
+  # desirable, so if the path is a non-version-specific symlink, expand it.
+  if test -L "$PG_SYSROOT"; then
+if expr x"$PG_SYSROOT" : '.*[0-9]\.[0-9][^/]*$' >/dev/null ; then : okay
+else
+  PG_SYSROOT=`expr "$PG_SYSROOT" : '\(.*\)/'`/`readlink "$PG_SYSROOT"`
+fi
+  fi
+  # If there are still not digits in the directory name, try
+  # "xcrun --sdk macosx --show-sdk-path"; and if that still doesn't work,
+  # fall back to asking xcodebuild, which is often a good deal slower.
   if expr x"$PG_SYSROOT" : '.*[0-9]\.[0-9][^/]*$' >/dev/null ; then : okay
   else
 PG_SYSROOT=`xcrun --sdk macosx --show-sdk-path 2>/dev/null`


Re: Some coverage for DROP OWNED BY with pg_default_acl

2021-01-19 Thread Alvaro Herrera
On 2021-Jan-19, Michael Paquier wrote:

> And while reviewing the thing, I have spotted that there is a specific
> path for pg_default_acl in RemoveRoleFromObjectACL() that has zero
> coverage.  This can be triggered with DROP OWNED BY, and it is
> actually safe to run as long as this is done in a separate transaction
> to avoid any interactions with parallel regression sessions.
> privileges.sql already has similar tests, so I'd like to add some
> coverage as per the attached (the duplicated role name is wanted).

Heh, interesting case.  Added coverage is good, so +1.
Since the role regress_priv_user2 is "private" to the privileges.sql
script, there's no danger of a concurrent test getting the added lines
in trouble AFAICS.

> +SELECT count(*) FROM pg_shdepend
> +  WHERE deptype = 'a' AND
> +refobjid = 'regress_priv_user2'::regrole AND
> + classid = 'pg_default_acl'::regclass;
> + count 
> +---
> + 5
> +(1 row)

Shrug.  Seems sufficient.

-- 
Álvaro Herrera   Valdivia, Chile




Re: [Patch] ALTER SYSTEM READ ONLY

2021-01-19 Thread Robert Haas
On Thu, Jan 14, 2021 at 6:29 AM Amul Sul  wrote:
> To move development, testing, and the review forward, I have commented out the
> code acquiring CheckpointLock from CreateCheckPoint() in the 0003 patch and
> added the changes for the checkpointer so that system read-write state change
> request can be processed as soon as possible, as suggested by Robert[1].

I am extremely doubtful about SetWALProhibitState()'s claim that "The
final state can only be requested by the checkpointer or by the
single-user so that there will be no chance that the server is already
in the desired final state." It seems like there is an obvious race
condition: CompleteWALProhibitChange() is called with a cur_state_gen
argument which embeds the last state we saw, but there's nothing to
keep it from changing between the time we saw it and the time that
function calls SetWALProhibitState(), is there? We aren't holding any
lock. It seems to me that SetWALProhibitState() needs to be rewritten
to avoid this assumption.

On a related note, SetWALProhibitState() has only two callers. One
passes is_final_state as true, and the other as false: it's never a
variable. The two cases are handled mostly differently. This doesn't
seem good. A lot of the logic in this function should probably be
moved to the calling sites, especially because it's almost certainly
wrong for this function to be basing what it does on the *current* WAL
prohibit state rather than the WAL prohibit state that was in effect
at the time we made the decision to call this function in the first
place. As I mentioned in the previous paragraph, that's a built-in
race condition. To put that another way, this function should NOT feel
free to call GetWALProhibitStateGen().

I don't really see why we should have both an SQL callable function
pg_alter_wal_prohibit_state() and also a DDL command for this. If
we're going to go with a functional interface, and I guess the idea of
that is to make it so GRANT EXECUTE works, then why not just get rid
of the DDL?

RequestWALProhibitChange() doesn't look very nice. It seems like it's
basically the second half of pg_alter_wal_prohibit_state(), not being
called from anywhere else. It doesn't seem to add anything to separate
it out like this; the interface between the two is not especially
clean.

It seems odd that ProcessWALProhibitStateChangeRequest() returns
without doing anything if !AmCheckpointerProcess(), rather than having
that be an Assert(). Why is it like that?

I think WALProhibitStateShmemInit() would probably look more similar
to other functions if it did if (found) { stuff; } rather than if
(!found) return; stuff; -- but I might be wrong about the existing
precedent.

The SetLastCheckPointSkipped() and LastCheckPointIsSkipped() stuff
seems confusingly-named, because we have other reasons for skipping a
checkpoint that are not what we're talking about here. I think this is
talking about whether we've performed a checkpoint after recovery, and
the naming should reflect that. But I think there's something else
wrong with the design, too: why is this protected by a spinlock? I
have questions in both directions. On the one hand, I wonder why we
need any kind of lock at all. On the other hand, if we do need a lock,
I wonder why a spinlock that protects only the setting and clearing of
the flag and nothing else is sufficient. There are zero comments
explaining what the idea behind this locking regime is, and I can't
understand why it should be correct.

In fact, I think this area needs a broader rethink. Like, the way you
integrated that stuff into StartupXLog(), it sure looks to me like we
might skip the checkpoint but still try to write other WAL records.
Before we reach the offending segment of code, we call
UpdateFullPageWrites(). Afterwards, we call XLogReportParameters().
Both of those are going to potentially write WAL. I guess you could
argue that's OK, on the grounds that neither function is necessarily
going to log anything, but I don't think I believe that. If I make my
server read only, take the OS down, change some GUCs, and then start
it again, I don't expect it to PANIC.

Also, I doubt that it's OK to skip the checkpoint as this code does
and then go ahead and execute recovery_end_command and update the
control file anyway. It sure looks like the existing code is written
with the assumption that the checkpoint happens before those other
things. One idea I just had was: suppose that, if the system is READ
ONLY, we don't actually exit recovery right away, and the startup
process doesn't exit. Instead we just sit there and wait for the
system to be made read-write again before doing anything else. But
then if hot_standby=false, there's no way for someone to execute a
ALTER SYSTEM READ WRITE and/or pg_alter_wal_prohibit_state(), which
seems bad. So perhaps we need to let in regular connections *as if*
the system were read-write while postponing not just the
end-of-recovery checkpoint but also the other associated 

Re: WIP: BRIN multi-range indexes

2021-01-19 Thread John Naylor
On Tue, Jan 12, 2021 at 1:42 PM Tomas Vondra wrote:
> I suspect it'd due to minmax having to decide which "ranges" to merge,
> which requires repeated sorting, etc. I certainly don't dare to claim
> the current algorithm is perfect. I wouldn't have expected such big
> difference, though - so definitely worth investigating.

It seems that monotonically increasing (or decreasing) values in a table
are a worst case scenario for multi-minmax indexes, or basically, unique
values within a range. I'm guessing it's because it requires many passes
to fit all the values into a limited number of ranges. I tried using
smaller pages_per_range numbers, 32 and 8, and that didn't help.

Now, with a different data distribution, using only 10 values that repeat
over and over, the results are much more sympathetic to multi-minmax:

insert into iot (num, create_dt)
select random(), '2020-01-01 0:00'::timestamptz + (x % 10 || '
seconds')::interval
from generate_series(1,5*365*24*60*60) x;

create index cd_single on iot using brin(create_dt);
27.2s

create index cd_multi on iot using brin(create_dt
timestamptz_minmax_multi_ops);
30.4s

create index cd_bt on iot using btree(create_dt);
61.8s

Circling back to the monotonic case, I tried running a simple perf record
on a backend creating a multi-minmax index on a timestamptz column and
these were the highest non-kernel calls:
+   21.98%  21.91%  postgres  postgres   [.] FunctionCall2Coll
+    9.31%   9.29%  postgres  postgres   [.] compare_combine_ranges
+    8.60%   8.58%  postgres  postgres   [.] qsort_arg
+    5.68%   5.66%  postgres  postgres   [.] brin_minmax_multi_add_value
+    5.63%   5.60%  postgres  postgres   [.] timestamp_lt
+    4.73%   4.71%  postgres  postgres   [.] reduce_combine_ranges
+    3.80%   0.00%  postgres  [unknown]  [.] 0x032001680004
+    3.51%   3.50%  postgres  postgres   [.] timestamp_eq

There's no one place that's pathological enough to explain the 4x slowness
over traditional BRIN and nearly 3x slowness over btree when using a large
number of unique values per range, so making progress here would have to
involve a more holistic approach.

--
John Naylor
EDB: http://www.enterprisedb.com


Re: Support for NSS as a libpq TLS backend

2021-01-19 Thread Daniel Gustafsson
> On 4 Dec 2020, at 01:57, Jacob Champion  wrote:
> 
> On Nov 17, 2020, at 7:00 AM, Daniel Gustafsson  wrote:
>> 
>> Nice, thanks for the fix!  I've incorporated your patch into the attached v20
>> which also fixes client side error reporting to be more readable.
> 
> I was testing handshake failure modes and noticed that some FATAL
> messages are being sent through to the client in cleartext. The OpenSSL
> implementation doesn't do this, because it logs handshake problems at
> COMMERROR level. Should we switch all those ereport() calls in the NSS
> be_tls_open_server() to COMMERROR as well (and return explicitly), to
> avoid this? Or was there a reason for logging at FATAL/ERROR level?

The ERROR logging made early development easier but then stuck around; I've
changed them to COMMERROR, returning an error instead, in the v21 patch just
sent to the list.

> Related note, at the end of be_tls_open_server():
> 
>>...
>>port->ssl_in_use = true;
>>return 0;
>> 
>> error:
>>return 1;
>> }
> 
> This needs to return -1 in the error case; the only caller of
> secure_open_server() does a direct `result == -1` comparison rather than
> checking `result != 0`.

Fixed.
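
For the archives, the error path now has roughly this shape (a sketch with
placeholder names for the NSS-specific pieces, not the actual patch code):

#include "postgres.h"
#include "libpq/libpq-be.h"

int
be_tls_open_server(Port *port)
{
	bool		handshake_failed = false;	/* placeholder for the NSS status check */
	const char *error_text = "...";			/* placeholder for the NSS error message */

	/* ... set up the NSS context and run the handshake ... */

	if (handshake_failed)
	{
		/* COMMERROR goes to the server log only; nothing is sent in cleartext */
		ereport(COMMERROR,
				(errcode(ERRCODE_PROTOCOL_VIOLATION),
				 errmsg("could not accept SSL connection: %s", error_text)));
		return -1;				/* the value secure_open_server() checks for */
	}

	port->ssl_in_use = true;
	return 0;
}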

cheers ./daniel



Re: Add primary keys to system catalogs

2021-01-19 Thread Tom Lane
Laurenz Albe  writes:
> On Sun, 2021-01-17 at 17:07 -0500, Tom Lane wrote:
>> [...] do we really want to prefer
>> using the OID indexes as the primary keys?  In most cases there's some
>> other index that seems to me to be what a user would think of as the
>> pkey, for example pg_class_relname_nsp_index for pg_class or
>> pg_proc_proname_args_nsp_index for pg_proc.  Preferring OID where it
>> exists is a nice simple rule, which has some attractiveness, but the
>> OID indexes seem to me like a lookup aid rather than the "real" object
>> identity.

> I disagree.  The OID is the real, immutable identity of an object.
> The "relname" of a "pg_class" encatalogtry can change any time.
> Since there are no foreign keys that reference catalogs, that won't cause
> problems, but I still think that primary keys should change as little as
> possible.

Yeah, there's also the point that the OID is what we use for "foreign
key" references from other catalogs.  I don't deny that there are
reasons to think of OID as the pkey.  But as long as these constraints
are basically cosmetic, it seemed to me that we should consider the
other approach.  I'm not dead set on that, just wanted to discuss it.

regards, tom lane




Re: Change default of checkpoint_completion_target

2021-01-19 Thread Tom Lane
Stephen Frost  writes:
> In passing, I noticed that we have a lot of documentation like:

> This parameter can only be set in the postgresql.conf file or on the
> server command line.

> ... which hasn't been true since the introduction of ALTER SYSTEM.

Well, it's still true if you understand "the postgresql.conf file"
to cover whatever's included by postgresql.conf, notably
postgresql.auto.conf (and the include facility existed long before
that, too, so you needed the expanded interpretation even then).
Still, I take your point that it's confusing.

I like your suggestion of shortening all of these to be "can only be set
at server start", or maybe better "cannot be changed after server start".
I'm not sure whether or not we really need new text elsewhere; I think
section 20.1 is pretty long already.

regards, tom lane




Re: Change default of checkpoint_completion_target

2021-01-19 Thread Stephen Frost
Greetings,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Stephen Frost  writes:
> > Any further comments or thoughts on this one?
> 
> This:
> 
> +total time between checkpoints. The default is 0.9, which spreads the
> +checkpoint across the entire checkpoint timeout period of time,
> 
> is confusing because 0.9 is obviously not 1.0; people will wonder
> whether the scale is something strange or the text is just wrong.
> They will also wonder why not use 1.0 instead.  So perhaps more like
> 
>   ... The default is 0.9, which spreads the checkpoint across almost
>   all the available interval, providing fairly consistent I/O load
>   while also leaving some slop for checkpoint completion overhead.
> 
> The other chunk of text seems accurate, but there's no reason to let
> this one be misleading.

Good point, updated along those lines.

In passing, I noticed that we have a lot of documentation like:

This parameter can only be set in the postgresql.conf file or on the
server command line.

... which hasn't been true since the introduction of ALTER SYSTEM.  I
don't really think it's this patch's job to clean that up but it doesn't
seem quite right that we don't include ALTER SYSTEM in that list either.
If this was C code, maybe we could get away with just changing such
references as we find them, but I don't think we'd want the
documentation to be in an inconsistent state regarding that.

Anyone want to opine about what to do with that?  Should we consider
changing those to mention ALTER SYSTEM?  Or perhaps have a way of saying
"at server start" that then links to "how to set options at server
start", perhaps..

Thanks,

Stephen
From 97c24d92e4ae470a257aa2ac9501032aba5edd82 Mon Sep 17 00:00:00 2001
From: Stephen Frost 
Date: Tue, 19 Jan 2021 13:53:34 -0500
Subject: [PATCH] Change the default of checkpoint_completion_target to 0.9

Common recommendations are that the checkpoint should be spread out as
much as possible, provided we avoid having it take too long.  This
change updates the default to 0.9 (from 0.5) to match that
recommendation.

There was some debate about possibly removing the option entirely but it
seems there may be some corner-cases where having it set much lower to
try to force the checkpoint to be as fast as possible could result in
fewer periods of time of reduced performance due to kernel flushing.
General agreement is that the "spread more" is the preferred approach
though and those who need to tune away from that value are much less
common.

Reviewed-By: Michael Paquier, Peter Eisentraut, Tom Lane
Discussion: https://postgr.es/m/20201207175329.GM16415%40tamriel.snowman.net
---
 doc/src/sgml/config.sgml  | 12 ++--
 doc/src/sgml/wal.sgml | 29 ---
 src/backend/utils/misc/guc.c  |  2 +-
 src/backend/utils/misc/postgresql.conf.sample |  2 +-
 src/test/recovery/t/015_promotion_pages.pl|  1 -
 5 files changed, 29 insertions(+), 17 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 82864bbb24..666b467eda 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -3266,9 +3266,15 @@ include_dir 'conf.d'
   

 Specifies the target of checkpoint completion, as a fraction of
-total time between checkpoints. The default is 0.5.
-This parameter can only be set in the postgresql.conf
-file or on the server command line.
+total time between checkpoints. The default is 0.9, which spreads the
+checkpoint across almost all of the available interval, providing fairly
+consistent I/O load while also leaving some slop for checkpoint
+completion overhead.  Reducing this parameter is not recommended as that
+causes the I/O from the checkpoint to have to complete faster, resulting
+in a higher I/O rate, while then having a period of less I/O between the
+completion of the checkpoint and the start of the next scheduled
+checkpoint.  This parameter can only be set in the
+postgresql.conf file or on the server command line.

   
  
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index 66de1ee2f8..733eba22db 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -571,22 +571,29 @@
writing dirty buffers during a checkpoint is spread over a period of time.
That period is controlled by
, which is
-   given as a fraction of the checkpoint interval.
+   given as a fraction of the checkpoint interval (configured by using
+   checkpoint_timeout).
The I/O rate is adjusted so that the checkpoint finishes when the
given fraction of
checkpoint_timeout seconds have elapsed, or before
max_wal_size is exceeded, whichever is sooner.
-   With the default value of 0.5,
+   With the default value of 0.9,
PostgreSQL can be expected to complete each checkpoint
-   in about half the time before the 

Re: Change default of checkpoint_completion_target

2021-01-19 Thread Tom Lane
Stephen Frost  writes:
> Any further comments or thoughts on this one?

This:

+total time between checkpoints. The default is 0.9, which spreads the
+checkpoint across the entire checkpoint timeout period of time,

is confusing because 0.9 is obviously not 1.0; people will wonder
whether the scale is something strange or the text is just wrong.
They will also wonder why not use 1.0 instead.  So perhaps more like

... The default is 0.9, which spreads the checkpoint across almost
all the available interval, providing fairly consistent I/O load
while also leaving some slop for checkpoint completion overhead.

The other chunk of text seems accurate, but there's no reason to let
this one be misleading.

regards, tom lane




Re: Change default of checkpoint_completion_target

2021-01-19 Thread Stephen Frost
Greetings,

* Peter Eisentraut (peter.eisentr...@enterprisedb.com) wrote:
> On 2021-01-13 23:10, Stephen Frost wrote:
> >>Yes, I agree, and am involved in that thread as well- currently waiting
> >>feedback from others about the proposed approach.
> >I've tried to push that forward.  I'm happy to update this patch once
> >we've got agreement to move forward on that, to wit, adding to an
> >'obsolete' section in the documentation information about this
> >particular GUC and how it's been removed due to not being sensible or
> >necessary to continue to have.
> 
> Some discussion a few days ago was arguing that it was still necessary in
> some cases as a way to counteract the possible lack of tuning in the kernel
> flushing behavior.  I think in light of that we should go with your first
> patch that just changes the default, possibly with the documentation updated
> a bit.

Rebased and updated patch attached which moves back to just changing the
default instead of removing the option, with a more explicit call-out of
the '90%', as suggested by Michael on the other patch.

Any further comments or thoughts on this one?

Thanks,

Stephen
From 335b8e630fae6c229f27f70f85847e29dfc1b783 Mon Sep 17 00:00:00 2001
From: Stephen Frost 
Date: Tue, 19 Jan 2021 13:53:34 -0500
Subject: [PATCH] Change the default of checkpoint_completion_target to 0.9

Common recommendations are that the checkpoint should be spread out as
much as possible, provided we avoid having it take too long.  This
change updates the default to 0.9 (from 0.5) to match that
recommendation.

There was some debate about possibly removing the option entirely but it
seems there may be some corner-cases where having it set much lower to
try to force the checkpoint to be as fast as possible could result in
fewer periods of time of reduced performance due to kernel flushing.
General agreement is that the "spread more" is the preferred approach
though and those who need to tune away from that value are much less
common.

Reviewed-By: Michael Paquier, Peter Eisentraut
Discussion: https://postgr.es/m/20201207175329.GM16415%40tamriel.snowman.net
---
 doc/src/sgml/config.sgml  |  8 -
 doc/src/sgml/wal.sgml | 29 ---
 src/backend/utils/misc/guc.c  |  2 +-
 src/backend/utils/misc/postgresql.conf.sample |  2 +-
 src/test/recovery/t/015_promotion_pages.pl|  1 -
 5 files changed, 27 insertions(+), 15 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 82864bbb24..7e06d0febb 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -3266,7 +3266,13 @@ include_dir 'conf.d'
   

 Specifies the target of checkpoint completion, as a fraction of
-total time between checkpoints. The default is 0.5.
+total time between checkpoints. The default is 0.9, which spreads the
+checkpoint across the entire checkpoint timeout period of time,
+providing a consistent amount of I/O during the entire checkpoint.
+Reducing this parameter is not recommended as that causes the I/O from
+the checkpoint to have to complete faster, resulting in a higher I/O
+rate, while then having a period of less I/O between the completion of
+the checkpoint and the start of the next scheduled checkpoint.
 This parameter can only be set in the postgresql.conf
 file or on the server command line.

diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index 66de1ee2f8..733eba22db 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -571,22 +571,29 @@
writing dirty buffers during a checkpoint is spread over a period of time.
That period is controlled by
, which is
-   given as a fraction of the checkpoint interval.
+   given as a fraction of the checkpoint interval (configured by using
+   checkpoint_timeout).
The I/O rate is adjusted so that the checkpoint finishes when the
given fraction of
checkpoint_timeout seconds have elapsed, or before
max_wal_size is exceeded, whichever is sooner.
-   With the default value of 0.5,
+   With the default value of 0.9,
PostgreSQL can be expected to complete each checkpoint
-   in about half the time before the next checkpoint starts.  On a system
-   that's very close to maximum I/O throughput during normal operation,
-   you might want to increase checkpoint_completion_target
-   to reduce the I/O load from checkpoints.  The disadvantage of this is that
-   prolonging checkpoints affects recovery time, because more WAL segments
-   will need to be kept around for possible use in recovery.  Although
-   checkpoint_completion_target can be set as high as 1.0,
-   it is best to keep it less than that (perhaps 0.9 at most) since
-   checkpoints include some other activities besides writing dirty buffers.
+   a bit before the next scheduled checkpoint (at around 90% of the last checkpoint's
+   

Re: [HACKERS] Custom compression methods

2021-01-19 Thread Justin Pryzby
Thanks for updating the patch.

On Mon, Jan 4, 2021 at 6:52 AM Justin Pryzby  wrote:
> The most recent patch doesn't compile --without-lz4:
On Tue, Jan 05, 2021 at 11:19:33AM +0530, Dilip Kumar wrote:
> On Mon, Jan 4, 2021 at 10:08 PM Justin Pryzby  wrote:
> > I think I first saw it on cfbot and I reproduced it locally, too.
> > http://cfbot.cputube.org/dilip-kumar.html
> >
> > I think you'll have to make --without-lz4 the default until the build
> > environments include it, otherwise the patch checker will show red :(
> 
> Oh ok,  but if we make by default --without-lz4 then the test cases
> will start failing which is using lz4 compression.  Am I missing
> something?

The CIs are failing like this:

http://cfbot.cputube.org/dilip-kumar.html
|checking for LZ4_compress in -llz4... no
|configure: error: lz4 library not found
|If you have lz4 already installed, see config.log for details on the
|failure.  It is possible the compiler isn't looking in the proper directory.
|Use --without-lz4 to disable lz4 support.

I thought that used to work (except for windows).  I don't see that anything
changed in the configure tests...  Is it because the CI moved off travis 2
weeks ago?  I don't know whether the travis environment had liblz4, and I
don't remember if the build was passing or if it was failing for some other
reason.  I'm guessing historic logs from travis are not available, if they ever
were.

I'm not sure how to deal with that, but maybe you'd need:
1) A separate 0001 patch *allowing* LZ4 to be enabled/disabled;
2) Current patchset needs to compile with/without LZ4, and pass tests in both
cases - maybe you can use "alternate test" output [0] to handle the "without"
case.
3) Eventually, the CI and build environments may have LZ4 installed, and then
we can have a separate debate about whether to enable it by default.

[0] cp -iv src/test/regress/results/compression.out 
src/test/regress/expected/compression_1.out

On Tue, Jan 05, 2021 at 02:20:26PM +0530, Dilip Kumar wrote:
> On Tue, Jan 5, 2021 at 11:19 AM Dilip Kumar  wrote:
> > On Mon, Jan 4, 2021 at 10:08 PM Justin Pryzby  wrote:
> > > I see the windows build is failing:
> > > https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.123730
> > > |undefined symbol: HAVE_LIBLZ4 at src/include/pg_config.h line 350 at 
> > > src/tools/msvc/Mkvcbuild.pm line 852.
> > > This needs to be patched: src/tools/msvc/Solution.pm
> > > You can see my zstd/pg_dump patch for an example, if needed (actually I'm 
> > > not
> > > 100% sure it's working yet, since the windows build failed for another 
> > > reason).
> >
> > Okay, I will check that.

This still needs help.
perl ./src/tools/msvc/mkvcbuild.pl
...
undefined symbol: HAVE_LIBLZ4 at src/include/pg_config.h line 350 at 
/home/pryzbyj/src/postgres/src/tools/msvc/Mkvcbuild.pm line 852.

Fix like:

+   HAVE_LIBLZ4 => $self->{options}->{lz4} ? 1 : undef,

Some more language fixes:

commit 3efafee52414503a87332fa6070541a3311a408c
Author: dilipkumar 
Date:   Tue Sep 8 15:24:33 2020 +0530

Built-in compression method

+  If the compression method is not specified for the compressible type then
+  it will have the default compression method.  The default compression

I think this should say:
If no compression method is specified, then compressible types will have the
default compression method (pglz).

+ *
+ * Since version 11 TOAST_COMPRESS_SET_RAWSIZE also marks compressed

Should say v14 ??

diff --git a/src/include/catalog/pg_attribute.h b/src/include/catalog/pg_attribute.h
index 059dec3647..e4df6bc5c1 100644
--- a/src/include/catalog/pg_attribute.h
+++ b/src/include/catalog/pg_attribute.h
@@ -156,6 +156,14 @@ CATALOG(pg_attribute,1249,AttributeRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(75,
/* attribute's collation */
Oid attcollation;
 
+   /*
+* Oid of the compression method that will be used for compressing the value
+* for this attribute.  For the compressible atttypid this must always be a

say "For compressible types, ..."

+* valid Oid irrespective of what is the current value of the attstorage.
+* And for the incompressible atttypid this must always be an invalid Oid.

say "must be InvalidOid"
 
@@ -685,6 +686,7 @@ typedef enum TableLikeOption
CREATE_TABLE_LIKE_INDEXES = 1 << 5,
CREATE_TABLE_LIKE_STATISTICS = 1 << 6,
CREATE_TABLE_LIKE_STORAGE = 1 << 7,
+   CREATE_TABLE_LIKE_COMPRESSION = 1 << 8,

This is interesting...
I have a patch to implement LIKE .. (INCLUDING ACCESS METHOD).
I guess I should change it to say LIKE .. (TABLE ACCESS METHOD), right ?
https://commitfest.postgresql.org/31/2865/

Your first patch is large due to updating a large number of test cases to
include the "compression" column in \d+ output.  Maybe that column should be
hidden when HIDE_TABLEAM is set by pg_regress ?  I think that would allow
testing with alternate, default 

Re: Add docs stub for recovery.conf

2021-01-19 Thread Stephen Frost
Greetings,

* Stephen Frost (sfr...@snowman.net) wrote:
> * Craig Ringer (craig.rin...@enterprisedb.com) wrote:
> > On Thu, 14 Jan 2021 at 03:44, Stephen Frost  wrote:
> > > Alright, how does this look?  The new entries are all under the
> > > 'obsolete' section to keep it out of the main line, but should work to
> > > 'fix' the links that currently 404 and provide a bit of a 'softer'
> > > landing for the other cases that currently just forcibly redirect using
> > > the website doc alias capability.
> > 
> > Thanks for expanding the change to other high profile obsoleted or renamed
> > features and tools.
> 
> Thanks for taking the time to review it and comment on it!
> 
> > One minor point. I'm not sure this is quite the best way to spell the index
> > entries:
> > 
> > +   
> > + obsolete
> > + pg_receivexlog
> > +   
> > 
> > as it will produce an index term "obsolete" with a list of various
> > components under it. While that concentrates them nicely, it means people
> > won't actually find them if they're using the index alphabetically.
> 
> Ah, yeah, that's definitely a good point and one that I hadn't really
> spent much time thinking about.
> 
> > I'd slightly prefer
> > 
> > +   
> > + pg_receivexlog
> > + pg_receivewal
> > +   
> > 
> > even though that bulks the index up a little, because then people are a bit
> > more likely to find it.
> 
> Yup, makes sense, updated patch attached which makes that change.
> 
> > > I ended up not actually doing this for the catalog -> view change of
> > > pg_replication_slots simply because I don't really think folks will
> > > misunderstand or be confused by that redirect since it's still the same
> > > relation.  If others disagree though, we could certainly change that
> > > too.
> > 
> > I agree with you.
> 
> Ok, great.
> 
> How does the attached look then?
> 
> Bruce, did you want to review or comment on this as to if it addresses
> your concerns appropriately?  Would be great to get this in as there's
> the follow-on for default roles.

... really attached now, sorry about that. :)

Thanks,

Stephen
diff --git a/doc/src/sgml/appendix-obsolete-pgreceivexlog.sgml b/doc/src/sgml/appendix-obsolete-pgreceivexlog.sgml
new file mode 100644
index 00..391eb5dcb2
--- /dev/null
+++ b/doc/src/sgml/appendix-obsolete-pgreceivexlog.sgml
@@ -0,0 +1,26 @@
+
+
+
+
+  The pg_receivexlog command
+
+   
+ pg_receivexlog
+ pg_receivewal
+   
+
+   
+PostgreSQL 9.6 and below provided a command named
+pg_receivexlog
+pg_receivexlog
+to fetch write-ahead-log (WAL) files.  This command was renamed to pg_receivewal, see
+ for documentation of pg_receivewal and see
+the release notes for PostgreSQL 10 for details
+on this change.
+   
+
+
diff --git a/doc/src/sgml/appendix-obsolete-pgresetxlog.sgml b/doc/src/sgml/appendix-obsolete-pgresetxlog.sgml
new file mode 100644
index 00..44452b5627
--- /dev/null
+++ b/doc/src/sgml/appendix-obsolete-pgresetxlog.sgml
@@ -0,0 +1,26 @@
+
+
+
+
+  The pg_resetxlog command
+
+   
+ pg_resetxlog
+ pg_resetwal
+   
+
+   
+PostgreSQL 9.6 and below provided a command named
+pg_resetxlog
+pg_resetxlog
+to reset the write-ahead-log (WAL) files.  This command was renamed to pg_resetwal, see
+ for documentation of pg_resetwal and see
+the release notes for PostgreSQL 10 for details
+on this change.
+   
+
+
diff --git a/doc/src/sgml/appendix-obsolete-pgxlogdump.sgml b/doc/src/sgml/appendix-obsolete-pgxlogdump.sgml
new file mode 100644
index 00..325316b4e6
--- /dev/null
+++ b/doc/src/sgml/appendix-obsolete-pgxlogdump.sgml
@@ -0,0 +1,26 @@
+
+
+
+
+  The pg_xlogdump command
+
+   
+ pg_xlogdump
+ pg_waldump
+   
+
+   
+PostgreSQL 9.6 and below provided a command named
+pg_xlogdump
+pg_xlogdump
+to read write-ahead-log (WAL) files.  This command was renamed to pg_waldump, see
+ for documentation of pg_waldump and see
+the release notes for PostgreSQL 10 for details
+on this change.
+   
+
+
diff --git a/doc/src/sgml/appendix-obsolete-recovery-config.sgml b/doc/src/sgml/appendix-obsolete-recovery-config.sgml
new file mode 100644
index 00..3f858e5cb0
--- /dev/null
+++ b/doc/src/sgml/appendix-obsolete-recovery-config.sgml
@@ -0,0 +1,60 @@
+
+
+
+
+  The recovery.conf file
+
+   
+ recovery.conf
+   
+
+   
+PostgreSQL 11 and below used a configuration file named
+recovery.conf
+recovery.conf
+to manage replicas and standbys. Support for this file was removed in PostgreSQL 12. See
+the release notes for PostgreSQL 12 for details
+on this change.
+   
+
+   
+On PostgreSQL 12 and above,
+archive recovery, streaming replication, and PITR
+are configured using
+normal server configuration parameters.
+These are set in postgresql.conf or via
+ALTER SYSTEM
+like any other parameter.
+   
+
+   
+The server will not start if a recovery.conf exists.
+   

Re: Add docs stub for recovery.conf

2021-01-19 Thread Stephen Frost
Greetings,

* Craig Ringer (craig.rin...@enterprisedb.com) wrote:
> On Thu, 14 Jan 2021 at 03:44, Stephen Frost  wrote:
> > Alright, how does this look?  The new entries are all under the
> > 'obsolete' section to keep it out of the main line, but should work to
> > 'fix' the links that currently 404 and provide a bit of a 'softer'
> > landing for the other cases that currently just forcibly redirect using
> > the website doc alias capability.
> 
> Thanks for expanding the change to other high profile obsoleted or renamed
> features and tools.

Thanks for taking the time to review it and comment on it!

> One minor point. I'm not sure this is quite the best way to spell the index
> entries:
> 
> +   
> + obsolete
> + pg_receivexlog
> +   
> 
> as it will produce an index term "obsolete" with a list of various
> components under it. While that concentrates them nicely, it means people
> won't actually find them if they're using the index alphabetically.

Ah, yeah, that's definitely a good point and one that I hadn't really
spent much time thinking about.

> I'd slightly prefer
> 
> +   
> + pg_receivexlog
> + pg_receivewal
> +   
> 
> even though that bulks the index up a little, because then people are a bit
> more likely to find it.

Yup, makes sense, updated patch attached which makes that change.

> > I ended up not actually doing this for the catalog -> view change of
> > pg_replication_slots simply because I don't really think folks will
> > misunderstand or be confused by that redirect since it's still the same
> > relation.  If others disagree though, we could certainly change that
> > too.
> 
> I agree with you.

Ok, great.

How does the attached look then?

Bruce, did you want to review or comment on this as to if it addresses
your concerns appropriately?  Would be great to get this in as there's
the follow-on for default roles.

Thanks!

Stephen


signature.asc
Description: PGP signature


Re: [HACKERS] [PATCH] Generic type subscripting

2021-01-19 Thread Pavel Stehule
Hi

I found minor issues.

Doc - missing tag

and three whitespace issues

see attached patch

The following sentence is hard to read due to the long nested example:

If the
+   path contradicts structure of modified jsonb for any individual
+   value (e.g. path val['a']['b']['c'] assumes keys
+   'a' and 'b' have object values assigned to them, but if val['a'] or
+   val['b'] is null, a string, or a number, then the path
+   contradicts with the existing structure), an error is raised even if other
+   values do conform.

It could be divided into two sentences: first the general rule, then the example.

Regards

Pavel
commit 7ce4fbe2620a5d8efdff963b2368c3d0fd904c59
Author: ok...@github.com 
Date:   Tue Jan 19 19:37:02 2021 +0100

fix whitespaces

diff --git a/doc/src/sgml/json.sgml b/doc/src/sgml/json.sgml
index 924762e128..4e19fe4fb8 100644
--- a/doc/src/sgml/json.sgml
+++ b/doc/src/sgml/json.sgml
@@ -613,6 +613,7 @@ SELECT jdoc-'guid', jdoc-'name' FROM api WHERE jdoc @ '{"tags": ["qu
e.g. in case of arrays it is a 0-based operation or that negative integers
that appear in path count from the end of JSON arrays.
The result of subscripting expressions is always jsonb data type.
+  
 
   
UPDATE statements may use subscripting in the
diff --git a/src/backend/utils/adt/jsonbsubs.c b/src/backend/utils/adt/jsonbsubs.c
index 306c37b5a6..64979f3a5b 100644
--- a/src/backend/utils/adt/jsonbsubs.c
+++ b/src/backend/utils/adt/jsonbsubs.c
@@ -342,8 +342,8 @@ jsonb_subscript_fetch_old(ExprState *state,
 	{
 		Jsonb	*jsonbSource = DatumGetJsonbP(*op->resvalue);
 		sbsrefstate->prevvalue = jsonb_get_element(jsonbSource,
-	  			   sbsrefstate->upperindex,
-	  			   sbsrefstate->numupper,
+   sbsrefstate->upperindex,
+   sbsrefstate->numupper,
   &sbsrefstate->prevnull,
    false);
 	}
@@ -366,7 +366,7 @@ jsonb_exec_setup(const SubscriptingRef *sbsref,
 
 	/* Allocate type-specific workspace with space for per-subscript data */
 	workspace = palloc0(MAXALIGN(sizeof(JsonbSubWorkspace)) +
-	nupper * (sizeof(Datum) + sizeof(Oid)));
+		nupper * (sizeof(Datum) + sizeof(Oid)));
 	workspace->expectArray = false;
 	ptr = ((char *) workspace) + MAXALIGN(sizeof(JsonbSubWorkspace));
 	workspace->indexOid = (Oid *) ptr;
@@ -379,7 +379,7 @@ jsonb_exec_setup(const SubscriptingRef *sbsref,
 	foreach(lc, sbsref->refupperindexpr)
 	{
 		Node   *expr = lfirst(lc);
-		int 	i = foreach_current_index(lc);
+		int		i = foreach_current_index(lc);
 
 		workspace->indexOid[i] = exprType(expr);
 	}


Re: Rethinking plpgsql's assignment implementation

2021-01-19 Thread Pavel Stehule
Hi

Now I am testing the jsonb subscripting feature, and I found one case
that is not supported by the parser.

When the target is a scalar, everything is ok. But we can also have a plpgsql
array of jsonb values.

postgres=# do $$
declare j jsonb[];
begin
  j[1] = '{"b":"Ahoj"}';
  raise notice '%', j;
  raise notice '%', (j[1])['b'];
end
$$;
NOTICE:  {"{\"b\": \"Ahoj\"}"}
NOTICE:  "Ahoj"
DO

Parentheses work well in expressions, but are not supported on the left
side of an assignment.

postgres=# do $$
declare j jsonb[];
begin
  (j[1])['b'] = '"Ahoj"';
  raise notice '%', j;
  raise notice '%', j[1]['b'];
end
$$;
ERROR:  syntax error at or near "("
LINE 4:   (j[1])['b'] = '"Ahoj"';
  ^

Regards

Pavel


Re: Add primary keys to system catalogs

2021-01-19 Thread Mark Rofail
Dear Joel,

I would love to start working on this again. I tried to revive the patch
in 2019; however, I faced the same problems as last time (detailed in
the thread) and didn't get the support I needed to continue with this
patch.

Are you willing to help rebase and finally publish this patch?

Best Regards,
Mark Rofail

On Tue, 19 Jan 2021 at 2:22 PM Joel Jacobson  wrote:

> On Mon, Jan 18, 2021, at 18:23, Tom Lane wrote:
> > I realized that there's a stronger roadblock for
> > treating catalog interrelationships as SQL foreign keys.  Namely,
> > that we always represent no-reference situations with a zero OID,
> > whereas it'd have to be NULL to look like a valid foreign-key case.
>
> Another roadblock is perhaps the lack of foreign keys on arrays,
> which would be needed by the following references:
>
> SELECT * FROM oidjoins WHERE column_type ~ '(vector|\[\])$' ORDER BY 1,2,3;
>       table_name      |  column_name   | column_type | ref_table_name | ref_column_name
> ----------------------+----------------+-------------+----------------+-----------------
>  pg_constraint        | conexclop      | oid[]       | pg_operator    | oid
>  pg_constraint        | conffeqop      | oid[]       | pg_operator    | oid
>  pg_constraint        | confkey        | int2[]      | pg_attribute   | attnum
>  pg_constraint        | conkey         | int2[]      | pg_attribute   | attnum
>  pg_constraint        | conpfeqop      | oid[]       | pg_operator    | oid
>  pg_constraint        | conppeqop      | oid[]       | pg_operator    | oid
>  pg_extension         | extconfig      | oid[]       | pg_class       | oid
>  pg_index             | indclass       | oidvector   | pg_opclass     | oid
>  pg_index             | indcollation   | oidvector   | pg_collation   | oid
>  pg_index             | indkey         | int2vector  | pg_attribute   | attnum
>  pg_partitioned_table | partattrs      | int2vector  | pg_attribute   | attnum
>  pg_partitioned_table | partclass      | oidvector   | pg_opclass     | oid
>  pg_partitioned_table | partcollation  | oidvector   | pg_collation   | oid
>  pg_policy            | polroles       | oid[]       | pg_authid      | oid
>  pg_proc              | proallargtypes | oid[]       | pg_type        | oid
>  pg_proc              | proargtypes    | oidvector   | pg_type        | oid
>  pg_statistic_ext     | stxkeys        | int2vector  | pg_attribute   | attnum
>  pg_trigger           | tgattr         | int2vector  | pg_attribute   | attnum
> (18 rows)
>
> I see there is an old thread with work in this area,  "Re: GSoC 2017:
> Foreign Key Arrays":
>
>
> https://www.postgresql.org/message-id/flat/76a8d3d8-a22e-3299-7c4e-6e115dbf3762%40proxel.se#a3ddc34863465ef83adbd26022cdbcc0
>
> The last message in the thread is from 2018-10-02:
> >On Tue, 2 Oct, 2018 at 05:13:26AM +0200, Michael Paquier wrote:
> >>On Sat, Aug 11, 2018 at 05:20:57AM +0200, Mark Rofail wrote:
> >> I am still having problems rebasing this patch. I can not figure it out
> on
> >> my own.
> >Okay, it's been a couple of months since this last email, and nothing
> >has happened, so I am marking it as returned with feedback.
> >--
> >Michael
>
> Personally, I would absolutely *love* FKs on array columns.
> I always feel shameful when adding array columns to tables,
> when the elements are PKs in some other table.
> It's convenient and often the best design,
> but it feels dirty knowing there are no FKs to help detect application
> bugs.
>
> Let's hope the current FKs-on-catalog-discussion can ignite the old
> Foreign Key Arrays thread.
>
>
> /Joel
>


Re: [PATCH v2 1/1] Fix detection of pwritev support for OSX.

2021-01-19 Thread James Hilliard
On Tue, Jan 19, 2021 at 10:29 AM Tom Lane  wrote:
>
> James Hilliard  writes:
> > Fixes:
> > fd.c:3661:10: warning: 'pwritev' is only available on macOS 11.0 or newer 
> > [-Wunguarded-availability-new]
>
> It's still missing preadv, and it still has nonzero chance of breaking
> successful detection of pwritev on platforms other than yours, and it's
> still really ugly.
Setting -Werror=unguarded-availability-new should in theory always
ensure that configure checks fail if the symbol is unavailable or marked
as requiring a target newer than the MACOSX_DEPLOYMENT_TARGET.
>
> But the main reason I don't want to go this way is that I don't think
> it'll stop with preadv/pwritev.  If we make it our job to build
> successfully even when using the wrong SDK version for the target
> platform, we're going to be in for more and more pain with other
> kernel APIs.
This issue really has nothing to do with the SDK version at all; it's the
MACOSX_DEPLOYMENT_TARGET that matters, and it must be taken into account
during configure in some way. This is what my patch does, by triggering the
pwritev compile test error via setting
-Werror=unguarded-availability-new.

It's expected that MACOSX_DEPLOYMENT_TARGET=10.15 with a
MacOSX11.1.sdk will produce a binary that can run on OSX 10.15.

The MacOSX11.1.sdk is not the wrong SDK for a 10.15 target and
is fully capable of producing 10.15 compatible binaries.
>
> We could, of course, do what Apple wants us to do and try to build
> executables that work across versions.  I do not intend to put up
> with the sort of invasive, error-prone source-code-level runtime test
> they recommend ... but given that there is weak linking involved here,
> I wonder if there is a way to silently sub in src/port/pwritev.c
> when executing on a pre-11 macOS, by dint of marking it a weak
> symbol?
The check I added is strictly a compile time check still, not runtime.

I also don't think this is a weak symbol.

From the header file, it does not have __attribute__((weak_import)):
ssize_t pwritev(int, const struct iovec *, int, off_t)
__DARWIN_NOCANCEL(pwritev) __API_AVAILABLE(macos(11.0), ios(14.0),
watchos(7.0), tvos(14.0));
>
> regards, tom lane




Re: simplifying foreign key/RI checks

2021-01-19 Thread Corey Huinker
>
> I decided not to deviate from pk_ terminology so that the new code
> doesn't look too different from the other code in the file.  Although,
> I guess we can at least call the main function
> ri_ReferencedKeyExists() instead of ri_PrimaryKeyExists(), so I've
> changed that.
>

I agree with leaving the existing terminology where it is for this patch.
Changing the function name is probably enough to alert the reader that the
things that are called pks may not be precisely that.


Re: Printing backtrace of postgres processes

2021-01-19 Thread Tom Lane
Robert Haas  writes:
> On Sat, Jan 16, 2021 at 3:21 PM Tom Lane  wrote:
>> (Personally, I think this whole patch fails the safety-vs-usefulness
>> tradeoff, but I expect I'll get shouted down.)

> What do you see as the main safety risks here?

The thing that is scaring me the most is the "broadcast" aspect.
For starters, I think that that is going to be useless in the
field because of the likelihood that different backends' stack
traces will get interleaved in whatever log file the traces are
going to.  Also, having hundreds of processes spitting dozens of
lines to the same place at the same time is going to expose any
little weaknesses in your logging arrangements, such as rsyslog's
tendency to drop messages under load.

I think it's got security hazards as well.  If we restricted the
feature to cause a trace of only one process at a time, and required
that process to be logged in as the same database user as the
requestor (or at least someone with the privs of that user, such
as a superuser), that'd be less of an issue.
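
A per-target permission gate along those lines could follow what the
pg_signal_backend() machinery already does; roughly (the wrapper function
name here is made up):

#include "postgres.h"
#include "miscadmin.h"
#include "storage/proc.h"
#include "storage/procarray.h"
#include "utils/acl.h"

/* Hypothetical permission check for a single-target backtrace request. */
static void
check_backtrace_target(int target_pid)
{
	PGPROC	   *proc = BackendPidGetProc(target_pid);

	if (proc == NULL)
		ereport(ERROR,
				(errmsg("PID %d is not a PostgreSQL server process",
						target_pid)));

	if (!superuser() && !has_privs_of_role(GetUserId(), proc->roleId))
		ereport(ERROR,
				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
				 errmsg("must be a member of the role whose backend is being traced")));
}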

Beyond that, well, maybe it's all right.  In theory anyplace that
there's a CHECK_FOR_INTERRUPTS should be okay to call elog from;
but it's completely untested whether we can do that and then
continue, as opposed to aborting the transaction or whole session.
I share your estimate that there'll be small-fraction-of-a-percent
hazards that could still add up to dangerous instability if people
go wild with this.
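
For context, the mechanism under discussion is the usual procsignal shape:
the signal handler only sets a flag, and the actual reporting happens at the
next CHECK_FOR_INTERRUPTS().  A rough sketch (flag and function names are
made up, not the patch's):

#include "postgres.h"

#include <signal.h>

#include "miscadmin.h"
#include "storage/latch.h"

/* Hypothetical flag set from the procsignal handler. */
static volatile sig_atomic_t PrintBacktraceRequested = false;

/* Signal handler side: only set a flag and poke the latch. */
void
HandlePrintBacktraceInterrupt(void)
{
	PrintBacktraceRequested = true;
	SetLatch(MyLatch);
}

/* Called from ProcessInterrupts(), i.e. at the next CHECK_FOR_INTERRUPTS(). */
void
ProcessPrintBacktraceRequest(void)
{
	if (!PrintBacktraceRequested)
		return;
	PrintBacktraceRequested = false;

	ereport(LOG_SERVER_ONLY,
			(errmsg("logging backtrace of PID %d", MyProcPid),
			 errbacktrace()));
}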

regards, tom lane




Re: [PATCH 1/1] Fix detection of pwritev support for OSX.

2021-01-19 Thread James Hilliard
On Tue, Jan 19, 2021 at 10:17 AM Tom Lane  wrote:
>
> James Hilliard  writes:
> > On Tue, Jan 19, 2021 at 8:57 AM Tom Lane  wrote:
> >> It worked for me and for Sergey, so we need to figure out what's different
> >> about your setup.  What do you get from "xcrun --show-sdk-path" and
> >> "xcrun --sdk macosx --show-sdk-path"?  What have you got under
> >> /Library/Developer/CommandLineTools/SDKs ?
>
> > $ xcrun --show-sdk-path
> > /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk
> > $ xcrun --sdk macosx --show-sdk-path
> > /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk
> > $ ls -laht /Library/Developer/CommandLineTools/SDKs
> > total 0
> > drwxr-xr-x  5 root  wheel   160B Jan 14  2020 .
> > drwxr-xr-x  8 root  wheel   256B Jan 14  2020 MacOSX10.15.sdk
> > drwxr-xr-x  7 root  wheel   224B Jan 14  2020 MacOSX10.14.sdk
> > lrwxr-xr-x  1 root  wheel15B Jan 14  2020 MacOSX.sdk -> MacOSX10.15.sdk
>
> Ah, got it.  So "xcrun --show-sdk-path" tells us the right thing (that
> is, it *does* give us a symlink to a 10.15 SDK) but by refusing to
> believe we've got the right thing, we end up picking MacOSX11.1.sdk.
> Drat.  I suppose we could drop the heuristic about wanting a version
> number in the SDK path, but I really don't want to do that.  Now I'm
> thinking about trying to dereference the symlink after the first step.
The MacOSX11.1.sdk can build for a 10.15 target just fine when passed
an appropriate MACOSX_DEPLOYMENT_TARGET, so that SDK should be
fine.
>
> BTW, it's curious that you get a reference to the MacOSX.sdk symlink
> where both Sergey and I got references to the actual directory.
> Do you happen to recall the order in which you installed/upgraded
> Xcode and its command line tools?
I generally just upgrade to the latest as it becomes available.
>
> >> I don't think I believe that argument.  As a counterexample, supposing
> >> that somebody were intentionally cross-compiling on an older OSX platform
> >> but using a newer SDK, shouldn't they get an executable suited to the
> >> SDK's target version?
>
> > Yep, that's exactly what this should fix:
> > MACOSX_DEPLOYMENT_TARGET=11.0 ./configure
> > checking for pwritev... yes
> > Which fails at runtime on 10.15:
>
> Well yeah, exactly.  It should fail at run-time, because you
> cross-compiled an executable that's not built for the machine
> you're on.  What we need is to prevent configure from setting up
> a cross-compile situation by default.
The toolchain already selects the correct deployment target by default; the
issue is just that the configure test for pwritev was being done in a way
that ignored the deployment target version. I fixed that.
>
> regards, tom lane




Re: New Table Access Methods for Multi and Single Inserts

2021-01-19 Thread Jeff Davis
On Mon, 2021-01-18 at 08:58 +0100, Luc Vlaming wrote:
> You mean because the table modification API uses
> table_tuple_insert_speculative?  Just wondering if you think it
> generally cannot work, or whether you would like to see that path / more
> paths integrated into the patch.

I think the patch should support INSERT INTO ... SELECT, and it will be
easier to tell if we have the right API when that's integrated.

Regards,
Jeff Davis






Re: Printing backtrace of postgres processes

2021-01-19 Thread Robert Haas
On Sat, Jan 16, 2021 at 3:21 PM Tom Lane  wrote:
> I'd argue that backtraces for those processes aren't really essential,
> and indeed that trying to make the syslogger report its own backtrace
> is damn dangerous.

I agree. Ideally I'd like to be able to use the same mechanism
everywhere and include those processes too, but surely regular
backends and parallel workers are going to be the things that come up
most often.

> (Personally, I think this whole patch fails the safety-vs-usefulness
> tradeoff, but I expect I'll get shouted down.)

You and I are frequently on opposite sides of these kinds of
questions, but I think this is a closer call than many cases. I'm
convinced that it's useful, but I'm not sure whether it's safe. On the
usefulness side, backtraces are often the only way to troubleshoot
problems that occur on production systems. I wish we had better
logging and tracing tools instead of having to ask for this sort of
thing, but we don't. EDB support today frequently asks customers to
attach gdb and take a backtrace that way, and that has risks which
this implementation does not: for example, suppose you were unlucky
enough to attach during a spinlock protected critical section, and
suppose you didn't continue the stopped process before the 60 second
timeout expired and some other process caused a PANIC. Even if this
implementation were to end up emitting a backtrace with a spinlock
held, it would remove the risk of leaving the process stopped while
holding a critical lock, and would in that sense be safer. However, as
soon as you make something like this accessible via an SQL callable
function, some people are going to start spamming it. And, as soon as
they do that, any risks inherent in the implementation are multiplied.
If it carries an 0.01% chance of crashing the system, we'll have
people taking production systems down with this all the time. At that
point I wouldn't want the feature, even if the gdb approach had the
same risk (which I don't think it does).

What do you see as the main safety risks here?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: [PATCH v2 1/1] Fix detection of pwritev support for OSX.

2021-01-19 Thread Tom Lane
James Hilliard  writes:
> Fixes:
> fd.c:3661:10: warning: 'pwritev' is only available on macOS 11.0 or newer 
> [-Wunguarded-availability-new]

It's still missing preadv, and it still has nonzero chance of breaking
successful detection of pwritev on platforms other than yours, and it's
still really ugly.

But the main reason I don't want to go this way is that I don't think
it'll stop with preadv/pwritev.  If we make it our job to build
successfully even when using the wrong SDK version for the target
platform, we're going to be in for more and more pain with other
kernel APIs.

We could, of course, do what Apple wants us to do and try to build
executables that work across versions.  I do not intend to put up
with the sort of invasive, error-prone source-code-level runtime test
they recommend ... but given that there is weak linking involved here,
I wonder if there is a way to silently sub in src/port/pwritev.c
when executing on a pre-11 macOS, by dint of marking it a weak
symbol?
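
Roughly, that sort of run-time substitution might look like the sketch below
(pg_pwritev_fallback is a stand-in name for the src/port implementation, not
a real symbol, and the availability check is the one clang suggests):

#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical: the src/port emulation under a made-up name. */
extern ssize_t pg_pwritev_fallback(int fd, const struct iovec *iov,
								   int iovcnt, off_t offset);

ssize_t
pg_pwritev(int fd, const struct iovec *iov, int iovcnt, off_t offset)
{
#if defined(__APPLE__)
	/* Use the libc pwritev only when the running OS actually has it. */
	if (__builtin_available(macOS 11.0, *))
		return pwritev(fd, iov, iovcnt, offset);
	return pg_pwritev_fallback(fd, iov, iovcnt, offset);
#else
	return pwritev(fd, iov, iovcnt, offset);
#endif
}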

regards, tom lane




Re: [PATCH 1/1] Fix detection of pwritev support for OSX.

2021-01-19 Thread Tom Lane
James Hilliard  writes:
> On Tue, Jan 19, 2021 at 8:57 AM Tom Lane  wrote:
>> It worked for me and for Sergey, so we need to figure out what's different
>> about your setup.  What do you get from "xcrun --show-sdk-path" and
>> "xcrun --sdk macosx --show-sdk-path"?  What have you got under
>> /Library/Developer/CommandLineTools/SDKs ?

> $ xcrun --show-sdk-path
> /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk
> $ xcrun --sdk macosx --show-sdk-path
> /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk
> $ ls -laht /Library/Developer/CommandLineTools/SDKs
> total 0
> drwxr-xr-x  5 root  wheel   160B Jan 14  2020 .
> drwxr-xr-x  8 root  wheel   256B Jan 14  2020 MacOSX10.15.sdk
> drwxr-xr-x  7 root  wheel   224B Jan 14  2020 MacOSX10.14.sdk
> lrwxr-xr-x  1 root  wheel15B Jan 14  2020 MacOSX.sdk -> MacOSX10.15.sdk

Ah, got it.  So "xcrun --show-sdk-path" tells us the right thing (that
is, it *does* give us a symlink to a 10.15 SDK) but by refusing to
believe we've got the right thing, we end up picking MacOSX11.1.sdk.
Drat.  I suppose we could drop the heuristic about wanting a version
number in the SDK path, but I really don't want to do that.  Now I'm
thinking about trying to dereference the symlink after the first step.

BTW, it's curious that you get a reference to the MacOSX.sdk symlink
where both Sergey and I got references to the actual directory.
Do you happen to recall the order in which you installed/upgraded
Xcode and its command line tools?

>> I don't think I believe that argument.  As a counterexample, supposing
>> that somebody were intentionally cross-compiling on an older OSX platform
>> but using a newer SDK, shouldn't they get an executable suited to the
>> SDK's target version?

> Yep, that's exactly what this should fix:
> MACOSX_DEPLOYMENT_TARGET=11.0 ./configure
> checking for pwritev... yes
> Which fails at runtime on 10.15:

Well yeah, exactly.  It should fail at run-time, because you
cross-compiled an executable that's not built for the machine
you're on.  What we need is to prevent configure from setting up
a cross-compile situation by default.

regards, tom lane




Re: psql \df choose functions by their arguments

2021-01-19 Thread Greg Sabino Mullane
Ha ha ha, my bad, I am not sure why I left those out. Here is a new patch
with int2, int4, and int8. Thanks for the email.

Cheers,
Greg


v6-psql-df-pick-function-by-type.patch
Description: Binary data


[PATCH v2 1/1] Fix detection of pwritev support for OSX.

2021-01-19 Thread James Hilliard
Fixes:
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement 
-Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security 
-fno-strict-aliasing -fwrapv -Wno-unused-command-line-argument -O2 
-I../../../../src/include  -isysroot 
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk
-c -o fd.o fd.c
fd.c:3661:10: warning: 'pwritev' is only available on macOS 11.0 or newer 
[-Wunguarded-availability-new]
part = pg_pwritev(fd, iov, iovcnt, offset);
   ^~
../../../../src/include/port/pg_iovec.h:49:20: note: expanded from macro 
'pg_pwritev'
   ^~~
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk/usr/include/sys/uio.h:104:9:
 note: 'pwritev' has been marked as being introduced in macOS 11.0
  here, but the deployment target is macOS 10.15.0
ssize_t pwritev(int, const struct iovec *, int, off_t) 
__DARWIN_NOCANCEL(pwritev) __API_AVAILABLE(macos(11.0), ios(14.0), 
watchos(7.0), tvos(14.0));
^
fd.c:3661:10: note: enclose 'pwritev' in a __builtin_available check to silence 
this warning
part = pg_pwritev(fd, iov, iovcnt, offset);
   ^~
../../../../src/include/port/pg_iovec.h:49:20: note: expanded from macro 
'pg_pwritev'
   ^~~
1 warning generated.

This results in a runtime error:
running bootstrap script ... dyld: lazy symbol binding failed: Symbol not 
found: _pwritev
  Referenced from: /usr/local/pgsql/bin/postgres
  Expected in: /usr/lib/libSystem.B.dylib

dyld: Symbol not found: _pwritev
  Referenced from: /usr/local/pgsql/bin/postgres
  Expected in: /usr/lib/libSystem.B.dylib

child process was terminated by signal 6: Abort trap: 6

To fix this we set -Werror=unguarded-availability-new so that a compile
test for pwritev will fail if the symbol is unavailable on the requested
SDK version.
---
Changes v1 -> v2:
  - Add AC_LIBOBJ(pwritev) when pwritev not available
  - set -Werror=unguarded-availability-new for CXX flags as well
---
 configure| 145 ++-
 configure.ac |  21 +++-
 2 files changed, 152 insertions(+), 14 deletions(-)

diff --git a/configure b/configure
index 8af4b99021..662b0ae9ce 100755
--- a/configure
+++ b/configure
@@ -5373,6 +5373,98 @@ if test x"$pgac_cv_prog_CC_cflags__Werror_vla" = x"yes"; 
then
 fi
 
 
+  # Prevent usage of symbols marked as newer than our target.
+
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether ${CC} supports 
-Werror=unguarded-availability-new, for CFLAGS" >&5
+$as_echo_n "checking whether ${CC} supports 
-Werror=unguarded-availability-new, for CFLAGS... " >&6; }
+if ${pgac_cv_prog_CC_cflags__Werror_unguarded_availability_new+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  pgac_save_CFLAGS=$CFLAGS
+pgac_save_CC=$CC
+CC=${CC}
+CFLAGS="${CFLAGS} -Werror=unguarded-availability-new"
+ac_save_c_werror_flag=$ac_c_werror_flag
+ac_c_werror_flag=yes
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+int
+main ()
+{
+
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_compile "$LINENO"; then :
+  pgac_cv_prog_CC_cflags__Werror_unguarded_availability_new=yes
+else
+  pgac_cv_prog_CC_cflags__Werror_unguarded_availability_new=no
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+ac_c_werror_flag=$ac_save_c_werror_flag
+CFLAGS="$pgac_save_CFLAGS"
+CC="$pgac_save_CC"
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: 
$pgac_cv_prog_CC_cflags__Werror_unguarded_availability_new" >&5
+$as_echo "$pgac_cv_prog_CC_cflags__Werror_unguarded_availability_new" >&6; }
+if test x"$pgac_cv_prog_CC_cflags__Werror_unguarded_availability_new" = 
x"yes"; then
+  CFLAGS="${CFLAGS} -Werror=unguarded-availability-new"
+fi
+
+
+  { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether ${CXX} supports 
-Werror=unguarded-availability-new, for CXXFLAGS" >&5
+$as_echo_n "checking whether ${CXX} supports 
-Werror=unguarded-availability-new, for CXXFLAGS... " >&6; }
+if ${pgac_cv_prog_CXX_cxxflags__Werror_unguarded_availability_new+:} false; 
then :
+  $as_echo_n "(cached) " >&6
+else
+  pgac_save_CXXFLAGS=$CXXFLAGS
+pgac_save_CXX=$CXX
+CXX=${CXX}
+CXXFLAGS="${CXXFLAGS} -Werror=unguarded-availability-new"
+ac_save_cxx_werror_flag=$ac_cxx_werror_flag
+ac_cxx_werror_flag=yes
+ac_ext=cpp
+ac_cpp='$CXXCPP $CPPFLAGS'
+ac_compile='$CXX -c $CXXFLAGS $CPPFLAGS conftest.$ac_ext >&5'
+ac_link='$CXX -o conftest$ac_exeext $CXXFLAGS $CPPFLAGS $LDFLAGS 
conftest.$ac_ext $LIBS >&5'
+ac_compiler_gnu=$ac_cv_cxx_compiler_gnu
+
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+int
+main ()
+{
+
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_cxx_try_compile "$LINENO"; then :
+  pgac_cv_prog_CXX_cxxflags__Werror_unguarded_availability_new=yes
+else
+  pgac_cv_prog_CXX_cxxflags__Werror_unguarded_availability_new=no
+fi
+rm -f core 

Re: search_plan_tree(): handling of non-leaf CustomScanState nodes causes segfault

2021-01-19 Thread Tom Lane
I wrote:
> Now that I look at this, I strongly wonder whether whoever added
> MergeAppend support here understood what they were doing.  That
> looks broken, because child nodes will typically be positioned on
> tuples, whether or not the current top-level output came from them.
> So I fear we could get a false-positive confirmation that some
> tuple matches WHERE CURRENT OF.

Urgh, indeed it's buggy.  With the attached test script I get

...
BEGIN
DECLARE CURSOR
 f1 | f2  
+-
  1 | one
(1 row)

UPDATE 1
UPDATE 1
UPDATE 1
COMMIT
 f1 | f2  
+-
  1 | one updated
(1 row)

 f1 | f2  
+-
  2 | two updated
(1 row)

 f1 |  f2   
+---
  3 | three updated
(1 row)

where clearly only the row with f1=1 should have updated
(and if you leave off ORDER BY, so as to get a Merge not
MergeAppend plan, indeed only that row updates).

I shall go fix this, and try to improve the evidently-inadequate
comments in search_plan_tree.

regards, tom lane

drop table if exists t1,t2,t3;

create table t1 (f1 int unique, f2 text);
insert into t1 values (1, 'one');
create table t2 (f1 int unique, f2 text);
insert into t2 values (2, 'two');
create table t3 (f1 int unique, f2 text);
insert into t3 values (3, 'three');

explain
select * from t1 union all select * from t2 union all select * from t3
order by 1;

begin;

declare c cursor for
select * from t1 union all select * from t2 union all select * from t3
order by 1;

fetch 1 from c;

update t1 set f2 = f2 || ' updated' where current of c;
update t2 set f2 = f2 || ' updated' where current of c;
update t3 set f2 = f2 || ' updated' where current of c;

commit;

table t1;
table t2;
table t3;


Re: [PATCH 1/1] Fix detection of pwritev support for OSX.

2021-01-19 Thread James Hilliard
On Tue, Jan 19, 2021 at 8:57 AM Tom Lane  wrote:
>
> James Hilliard  writes:
> > On Tue, Jan 19, 2021 at 8:27 AM Tom Lane  wrote:
> >> We already dealt with that by not selecting an SDK newer than the
> >> underlying OS (see 4823621db).
>
> > Tried that, doesn't work; I'm not even sure how it could possibly fix this
> > issue at all.
>
> It worked for me and for Sergey, so we need to figure out what's different
> about your setup.  What do you get from "xcrun --show-sdk-path" and
> "xcrun --sdk macosx --show-sdk-path"?  What have you got under
> /Library/Developer/CommandLineTools/SDKs ?
$ xcrun --show-sdk-path
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk
$ xcrun --sdk macosx --show-sdk-path
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk
$ ls -laht /Library/Developer/CommandLineTools/SDKs
total 0
drwxr-xr-x  5 root  wheel   160B Jan 14  2020 .
drwxr-xr-x  8 root  wheel   256B Jan 14  2020 MacOSX10.15.sdk
drwxr-xr-x  7 root  wheel   224B Jan 14  2020 MacOSX10.14.sdk
lrwxr-xr-x  1 root  wheel15B Jan 14  2020 MacOSX.sdk -> MacOSX10.15.sdk
>
> > This cannot be fixed properly by selecting a specific SDK version
> > alone; it's the symbols valid for a specific target deployment version
> > that matter here.
>
> I don't think I believe that argument.  As a counterexample, supposing
> that somebody were intentionally cross-compiling on an older OSX platform
> but using a newer SDK, shouldn't they get an executable suited to the
> SDK's target version?
Yep, that's exactly what this should fix:

MACOSX_DEPLOYMENT_TARGET=11.0 ./configure
checking for pwritev... yes

Which fails at runtime on 10.15:
dyld: lazy symbol binding failed: Symbol not found: _pwritev
  Referenced from: /usr/local/pgsql/bin/postgres (which was built for
Mac OS X 11.0)
  Expected in: /usr/lib/libSystem.B.dylib

dyld: Symbol not found: _pwritev
  Referenced from: /usr/local/pgsql/bin/postgres (which was built for
Mac OS X 11.0)
  Expected in: /usr/lib/libSystem.B.dylib

child process was terminated by signal 6: Abort trap: 6

MACOSX_DEPLOYMENT_TARGET=10.15 ./configure
checking for pwritev... no

Noticed a couple small issues, I'll send a v2.
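
Purely to illustrate the point (this is not what the patch does -- the patch
instead makes the configure probe honor the deployment target via
-Werror=unguarded-availability-new): availability is a property of
MACOSX_DEPLOYMENT_TARGET, not of the SDK, so a hand-written compile-time
guard would have to look roughly like this.  PG_HAVE_PWRITEV is a made-up
name for the sketch; only Apple's Availability.h macros are real.

#if defined(__APPLE__)
#include <Availability.h>
#include <sys/uio.h>

/*
 * Sketch only: the macOS 11 SDK declares pwritev(), but calling it is only
 * safe when the deployment target is at least 11.0; on a 10.15 system it
 * otherwise fails at runtime with the dyld errors shown above.
 */
#if defined(__MAC_OS_X_VERSION_MIN_REQUIRED) && \
	__MAC_OS_X_VERSION_MIN_REQUIRED >= 110000
#define PG_HAVE_PWRITEV 1		/* hypothetical macro, illustration only */
#endif
#endif							/* defined(__APPLE__) */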
>
> (I realize that Apple thinks we ought to handle that through run-time
> not compile-time adaptation, but I have no interest in going there.)
>
> regards, tom lane




Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

2021-01-19 Thread Julien Rouhaud
On Fri, Jan 8, 2021 at 1:07 AM Julien Rouhaud  wrote:
>
> v15 that fixes recent conflicts.

Rebase only, thanks to the cfbot!  V16 attached.
From a0388c53d9755cfd706513f7f02a08b31a48aacb Mon Sep 17 00:00:00 2001
From: Julien Rouhaud 
Date: Mon, 18 Mar 2019 18:55:50 +0100
Subject: [PATCH v16 2/3] Expose queryid in pg_stat_activity and
 log_line_prefix

Similarly to other fields in pg_stat_activity, only the queryid from the top
level statements are exposed, and if the backends status isn't active then the
queryid from the last executed statements is displayed.

Also add a %Q placeholder to include the queryid in the log_line_prefix, which
will also only expose top level statements.

Author: Julien Rouhaud
Reviewed-by: Evgeny Efimkin, Michael Paquier, Yamada Tatsuro, Atsushi Torikoshi
Discussion: https://postgr.es/m/CA+8PKvQnMfOE-c3YLRwxOsCYXQDyP8VXs6CDtMZp1V4=d4l...@mail.gmail.com
---
 .../pg_stat_statements/pg_stat_statements.c   | 112 +++---
 doc/src/sgml/config.sgml  |  29 +++--
 doc/src/sgml/monitoring.sgml  |  16 +++
 src/backend/catalog/system_views.sql  |   1 +
 src/backend/executor/execMain.c   |   8 ++
 src/backend/executor/execParallel.c   |  14 ++-
 src/backend/executor/nodeGather.c |   3 +-
 src/backend/executor/nodeGatherMerge.c|   4 +-
 src/backend/parser/analyze.c  |   5 +
 src/backend/postmaster/pgstat.c   |  65 ++
 src/backend/tcop/postgres.c   |   5 +
 src/backend/utils/adt/pgstatfuncs.c   |   7 +-
 src/backend/utils/error/elog.c|   9 +-
 src/backend/utils/misc/postgresql.conf.sample |   1 +
 src/backend/utils/misc/queryjumble.c  |  29 +++--
 src/include/catalog/pg_proc.dat   |   6 +-
 src/include/executor/execParallel.h   |   3 +-
 src/include/pgstat.h  |   5 +
 src/include/utils/queryjumble.h   |   2 +-
 src/test/regress/expected/rules.out   |   9 +-
 20 files changed, 223 insertions(+), 110 deletions(-)

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 3db4fa2f7a..ce166f417e 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -65,6 +65,7 @@
 #include "tcop/utility.h"
 #include "utils/acl.h"
 #include "utils/builtins.h"
+#include "utils/queryjumble.h"
 #include "utils/memutils.h"
 #include "utils/timestamp.h"
 
@@ -99,6 +100,14 @@ static const uint32 PGSS_PG_MAJOR_VERSION = PG_VERSION_NUM / 100;
 #define USAGE_DEALLOC_PERCENT	5	/* free this % of entries at once */
 #define IS_STICKY(c)	((c.calls[PGSS_PLAN] + c.calls[PGSS_EXEC]) == 0)
 
+/*
+ * Utility statements that pgss_ProcessUtility and pgss_post_parse_analyze
+ * ignores.
+ */
+#define PGSS_HANDLED_UTILITY(n)		(!IsA(n, ExecuteStmt) && \
+	!IsA(n, PrepareStmt) && \
+	!IsA(n, DeallocateStmt))
+
 /*
  * Extension version number, for supporting older extension versions' objects
  */
@@ -307,7 +316,6 @@ static void pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
 ProcessUtilityContext context, ParamListInfo params,
 QueryEnvironment *queryEnv,
 DestReceiver *dest, QueryCompletion *qc);
-static uint64 pgss_hash_string(const char *str, int len);
 static void pgss_store(const char *query, uint64 queryId,
 	   int query_location, int query_len,
 	   pgssStoreKind kind,
@@ -804,16 +812,14 @@ pgss_post_parse_analyze(ParseState *pstate, Query *query, JumbleState *jstate)
 		return;
 
 	/*
-	 * Utility statements get queryId zero.  We do this even in cases where
-	 * the statement contains an optimizable statement for which a queryId
-	 * could be derived (such as EXPLAIN or DECLARE CURSOR).  For such cases,
-	 * runtime control will first go through ProcessUtility and then the
-	 * executor, and we don't want the executor hooks to do anything, since we
-	 * are already measuring the statement's costs at the utility level.
+	 * Clear queryId for prepared statements related utility, as those will
+	 * inherit from the underlying statement's one (except DEALLOCATE which is
+	 * entirely untracked).
 	 */
 	if (query->utilityStmt)
 	{
-		query->queryId = UINT64CONST(0);
+		if (pgss_track_utility && !PGSS_HANDLED_UTILITY(query->utilityStmt))
+			query->queryId = UINT64CONST(0);
 		return;
 	}
 
@@ -1055,6 +1061,23 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
 	DestReceiver *dest, QueryCompletion *qc)
 {
 	Node	   *parsetree = pstmt->utilityStmt;
+	uint64		saved_queryId = pstmt->queryId;
+
+	/*
+	 * Force utility statements to get queryId zero.  We do this even in cases
+	 * where the statement contains an optimizable statement for which a
+	 * queryId could be derived (such as EXPLAIN or DECLARE CURSOR).  For such
+	 * cases, runtime control will first go through ProcessUtility and then 

Re: pg_class.reltype -> pg_type.oid missing for pg_toast table

2021-01-19 Thread Tom Lane
"Joel Jacobson"  writes:
> When copying all tables in pg_catalog, to a separate schema with the purpose
> of testing if foreign keys could be added for all oid columns, I got an error 
> for a toast table:
> ERROR:  insert or update on table "pg_class" violates foreign key constraint 
> "pg_class_reltype_fkey"
> DETAIL:  Key (reltype)=(86987582) is not present in table "pg_type".

I'm too lazy to check the code right now, but my recollection is that we
do not bother to make composite-type entries for toast tables.  However,
they should have reltype = 0 if so, so I'm not quite sure where the
above failure is coming from.

regards, tom lane




Re: popcount

2021-01-19 Thread Isaac Morland
On Tue, 19 Jan 2021 at 11:38, David Fetter  wrote:

> You bring up an excellent point, which is that our builtin functions
> could use a lot more documentation directly to hand than they now
> have.  For example, there's a lot of needless ambiguity created by
> function comments which leave it up in the air as to which positional
> argument does what in functions like string_to_array, which take
> multiple arguments. I'll try to get a patch in for the next CF with a
> fix for that, and a separate one that doesn't put it on people to use
> \df+ to find the comments we do provide. There have been proposals for
> including an optional space for COMMENT ON in DDL, but I suspect that
> those won't fly unless and until they make their way into the
> standard.
>

Since you mention \df+, I wonder if this is the time to consider removing
the source code from \df+ (since we have \sf) and adding in the proacl
instead?


Re: popcount

2021-01-19 Thread David Fetter
On Tue, Jan 19, 2021 at 07:58:12AM -0500, Robert Haas wrote:
> On Tue, Jan 19, 2021 at 3:06 AM Peter Eisentraut
>  wrote:
> > On 2021-01-18 16:34, Tom Lane wrote:
> > > Peter Eisentraut  writes:
> > >> [ assorted nits ]
> > >
> > > At the level of bikeshedding ... I quite dislike using the name "popcount"
> > > for these functions.  I'm aware that some C compilers provide primitives
> > > of that name, but I wouldn't expect a SQL programmer to know that;
> > > without that context the name seems pretty random and unintuitive.
> > > Moreover, it invites confusion with SQL's use of "pop" to abbreviate
> > > "population" in the statistical aggregates, such as var_pop().
> >
> > I was thinking about that too, but according to
> > , popcount is an accepted
> > high-level term, with "pop" also standing for "population".
> 
> Yeah, I am not sure that it's going to be good to invent our own
> name for this, although maybe. But at least I think we should make
> sure there are some good comments in an easily discoverable place.
> Some people seem to think every programmer in the universe should
> know what things like popcount() and fls() and ffs() and stuff like
> that are, but it's far from obvious and I often have to refresh my
> memory.  Let's make it easy for someone to figure out, if they don't
> know already.

I went with count_set_bits() for the next version, and I hope that
helps clarify what it does.
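
In case a concrete definition helps: the semantics are just "how many 1 bits
does the value contain", e.g. 13 (binary 1101) has three.  A minimal,
self-contained C illustration of that definition (this is not the code from
the patch, just the classic bit-clearing loop):

#include <stdint.h>

/*
 * Count the 1 bits in x, e.g. count_set_bits64(13) == 3, since 13 is 0b1101.
 * Each iteration of the loop clears the lowest set bit.
 */
static int
count_set_bits64(uint64_t x)
{
	int			n = 0;

	while (x != 0)
	{
		x &= x - 1;
		n++;
	}
	return n;
}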

> Like just a comment that says "this returns the number of 1 bits in
> the integer supplied as an argument" or something can save somebody
> a lot of trouble.

You bring up an excellent point, which is that our builtin functions
could use a lot more documentation directly to hand than they now
have.  For example, there's a lot of needless ambiguity created by
function comments which leave it up in the air as to which positional
argument does what in functions like string_to_array, which take
multiple arguments. I'll try to get a patch in for the next CF with a
fix for that, and a separate one that doesn't put it on people to use
\df+ to find the comments we do provide. There have been proposals for
including an optional space for COMMENT ON in DDL, but I suspect that
those won't fly unless and until they make their way into the
standard.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




pg_class.reltype -> pg_type.oid missing for pg_toast table

2021-01-19 Thread Joel Jacobson
I have a memory of the catalog not being MVCC,
so maybe this is normal and expected,
but I wanted to report it in case it's not.

When copying all tables in pg_catalog, to a separate schema with the purpose
of testing if foreign keys could be added for all oid columns, I got an error 
for a toast table:

ERROR:  insert or update on table "pg_class" violates foreign key constraint 
"pg_class_reltype_fkey"
DETAIL:  Key (reltype)=(86987582) is not present in table "pg_type".
CONTEXT:  SQL statement "
ALTER TABLE catalog_fks.pg_class ADD FOREIGN KEY (reltype) REFERENCES 
catalog_fks.pg_type (oid)
  "

The copies of pg_catalog were executed in one and the same transaction,
but as separate queries in a PL/pgSQL function using EXECUTE.
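
For reference, this is roughly the kind of check I would expect to come up
empty if toast tables really always get reltype = 0 (illustrative query
written after the fact, not part of the function I ran):

SELECT oid, relname, reltype
FROM pg_catalog.pg_class
WHERE relkind = 't'   -- toast tables
  AND reltype <> 0;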

/Joel

Re: Stronger safeguard for archive recovery not to miss data

2021-01-19 Thread Laurenz Albe
On Mon, 2021-01-18 at 07:34 +, osumi.takami...@fujitsu.com wrote:
> I noticed that this message should cover both archive recovery modes,
> that is, plain archive recovery as well as standby mode. Then, I combined your
> suggestion above with this point of view. Have a look at the updated patch.
> I also extended the new TAP tests to show this perspective.

Looks good, thanks.

I'll mark this patch as "ready for committer".

Yours,
Laurenz Albe





Re: Is Recovery actually paused?

2021-01-19 Thread Dilip Kumar
On Tue, Jan 19, 2021 at 8:34 AM Dilip Kumar  wrote:
>
> On Tue, 19 Jan 2021 at 8:12 AM, Yugo NAGATA  wrote:
>>
>> On Sun, 17 Jan 2021 11:33:52 +0530
>> Dilip Kumar  wrote:
>>
>> > On Thu, Jan 14, 2021 at 6:49 PM Yugo NAGATA  wrote:
>> > >
>> > > On Wed, 13 Jan 2021 17:49:43 +0530
>> > > Dilip Kumar  wrote:
>> > >
>> > > > On Wed, Jan 13, 2021 at 3:35 PM Dilip Kumar  
>> > > > wrote:
>> > > > >
>> > > > > On Wed, Jan 13, 2021 at 3:27 PM Yugo NAGATA  
>> > > > > wrote:
>> > > > > >
>> > > > > > On Thu, 10 Dec 2020 11:25:23 +0530
>> > > > > > Dilip Kumar  wrote:
>> > > > > >
>> > > > > > > > > However, I suspect users don't expect pg_is_wal_replay_paused 
>> > > > > > > > > to wait.
>> > > > > > > > > Especially, if max_standby_streaming_delay is -1, this will 
>> > > > > > > > > be blocked forever,
>> > > > > > > > > although this setting may not be usual. In addition, some 
>> > > > > > > > > users may set
>> > > > > > > > > recovery_min_apply_delay to a large value.  If such users call 
>> > > > > > > > > pg_is_wal_replay_paused,
>> > > > > > > > > it could wait for a long time.
>> > > > > > > > >
>> > > > > > > > > At least, I think we need some descriptions on document to 
>> > > > > > > > > explain
>> > > > > > > > > pg_is_wal_replay_paused could wait while a time.
>> > > > > > > >
>> > > > > > > > Ok
>> > > > > > >
>> > > > > > > Fixed this, added some comments in .sgml as well as in function 
>> > > > > > > header
>> > > > > >
>> > > > > > Thank you for fixing this.
>> > > > > >
>> > > > > > Also, is it better to fix the description of pg_wal_replay_pause 
>> > > > > > from
>> > > > > > "Pauses recovery." to "Request to pause recovery." in according 
>> > > > > > with
>> > > > > > pg_is_wal_replay_paused?
>> > > > >
>> > > > > Okay
>> > > > >
>> > > > > >
>> > > > > > > > > Also, how about adding a new boolean argument to 
>> > > > > > > > > pg_is_wal_replay_paused to
>> > > > > > > > > control whether this waits for recovery to get paused or 
>> > > > > > > > > not? By setting its
>> > > > > > > > > default value to true or false, users can use the old format 
>> > > > > > > > > for calling this
>> > > > > > > > > and the backward compatibility can be maintained.
>> > > > > > > >
>> > > > > > > > So basically, if the wait_recovery_pause flag is false then we 
>> > > > > > > > will
>> > > > > > > > immediately return true if the pause is requested?  I agree 
>> > > > > > > > that it is
>> > > > > > > > good to have an API to know whether the recovery pause is 
>> > > > > > > > requested or
>> > > > > > > > not but I am not sure is it good idea to make this API serve 
>> > > > > > > > both the
>> > > > > > > > purpose?  Anyone else have any thoughts on this?
>> > > > > > > >
>> > > > > >
>> > > > > > I think the current pg_is_wal_replay_paused() already has another 
>> > > > > > purpose;
>> > > > > > this waits recovery to actually get paused. If we want to limit 
>> > > > > > this API's
>> > > > > > purpose only to return the pause state, it seems better to fix 
>> > > > > > this to return
>> > > > > > the actual state at the cost of lacking the backward 
>> > > > > > compatibility. If we want
>> > > > > > to know whether pause is requested, we may add a new API like
>> > > > > > pg_is_wal_replay_pause_requested(). Also, if we want to wait 
>> > > > > > recovery to actually
>> > > > > > get paused, we may add an option to pg_wal_replay_pause() for this 
>> > > > > > purpose.
>> > > > > >
>> > > > > > However, this might be a bikeshedding. If anyone don't care that
>> > > > > > pg_is_wal_replay_paused() can make user wait for a long time, I 
>> > > > > > don't care either.
>> > > > >
>> > > > > I don't think that it will be blocked ever, because
>> > > > > pg_wal_replay_pause is sending the WakeupRecovery() which means the
>> > > > > recovery process will not be stuck on waiting for the WAL.
>> > >
>> > > Yes, there is no stuck on waiting for the WAL. However, it can be stuck 
>> > > during resolving
>> > > a recovery conflict. The process could wait for 
>> > > max_standby_streaming_delay or
>> > > max_standby_archive_delay at most before recovery get completely paused.
>> >
>> > Okay, I agree that it is possible so for handling this we have a
>> > couple of options
>> > 1. pg_is_wal_replay_paused(): the interface will wait for recovery to
>> > actually get paused, but the user has an option to cancel that.  So I
>> > agree that there is currently no option to just know that a recovery
>> > pause has been requested without waiting for it to actually get paused.
>> > So one option is to provide another interface, as you mentioned,
>> > pg_is_wal_replay_pause_requested(), which can just
>> > return the request status.  I am not sure how useful it is.
>>
>> If it is acceptable that pg_is_wal_replay_paused() makes users wait,
>> I'm OK with the current interface. I don't feel the need for
>> pg_is_wal_replay_pause_requested().
>>
>> >
>> > 2. Pass an option to pg_is_wal_replay_paused whether to 

Re: POC: postgres_fdw insert batching

2021-01-19 Thread Tomas Vondra




On 1/19/21 7:23 AM, Amit Langote wrote:

On Tue, Jan 19, 2021 at 2:06 PM tsunakawa.ta...@fujitsu.com
 wrote:

From: Amit Langote 

I apologize in advance for being maybe overly pedantic, but I noticed
that, in ExecInitModifyTable(), you decided to place the call outside
the loop that goes over resultRelations (shown below), although my
intent was to ask to place it next to the BeginForeignModify() in that
loop.


Actually, I tried to do it (adding the GetModifyBatchSize() call after 
BeginForeignModify()), but it failed.  Because postgresfdwGetModifyBatchSize() 
wants to know if RETURNING is specified, and ResultRelInfo->projectReturning is 
created after the above part.  Considering the context where GetModifyBatchSize() 
implementations may want to know the environment, I placed the call as late as 
possible in the initialization phase.  As for the future(?) multi-target DML 
statements, I think we can change this together with other many(?) parts that 
assume a single target table.


Okay, sometime later then.

I wasn't sure if bringing it up here would be appropriate, but there's
a patch by me to refactor ModfiyTable result relation allocation that
will have to remember to move this code along to an appropriate place
[1].  Thanks for the tip about the dependency on how RETURNING is
handled.  I will remember it when rebasing my patch over this.



Thanks. The last version (v12) should be addressing all the comments and 
seems fine to me, so barring objections I'll get that pushed shortly.


One thing that seems a bit annoying is that with the partitioned table 
the explain (verbose) looks like this:


 QUERY PLAN
-
 Insert on public.batch_table
   ->  Function Scan on pg_catalog.generate_series i
 Output: i.i
 Function Call: generate_series(1, 66)
(4 rows)

That is, there's no information about the batch size :-( But AFAICS 
that's due to how explain shows (or rather does not show) partitions in 
this type of plan.
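
For anyone following along, batching is switched on through the batch_size
option this patch series adds; something like the following (the server and
table names are made up for the example):

-- enable batching for everything on one foreign server
ALTER SERVER loopback OPTIONS (ADD batch_size '100');

-- or override it for an individual foreign table
ALTER FOREIGN TABLE batch_table_p0 OPTIONS (ADD batch_size '66');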


regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [PATCH 1/1] Fix detection of pwritev support for OSX.

2021-01-19 Thread Tom Lane
James Hilliard  writes:
> On Tue, Jan 19, 2021 at 8:27 AM Tom Lane  wrote:
>> We already dealt with that by not selecting an SDK newer than the
>> underlying OS (see 4823621db).

> Tried that, doesn't work; I'm not even sure how it could possibly fix this
> issue at all.

It worked for me and for Sergey, so we need to figure out what's different
about your setup.  What do you get from "xcrun --show-sdk-path" and
"xcrun --sdk macosx --show-sdk-path"?  What have you got under
/Library/Developer/CommandLineTools/SDKs ?

> This cannot be fixed properly by selecting a specific SDK version
> alone; it's the symbols valid for a specific target deployment version
> that matter here.

I don't think I believe that argument.  As a counterexample, supposing
that somebody were intentionally cross-compiling on an older OSX platform
but using a newer SDK, shouldn't they get an executable suited to the
SDK's target version?

(I realize that Apple thinks we ought to handle that through run-time
not compile-time adaptation, but I have no interest in going there.)

regards, tom lane




Re: Use boolean array for nulls parameters

2021-01-19 Thread Tom Lane
japin  writes:
> While reviewing [1], I noticed that the tuple's nulls array uses the char type.
> However, many places use a boolean array to represent the nulls array,
> so I think we can replace the char-type nulls array with a boolean type.  This
> change will break the SPI_xxx API; I'm not sure whether this change causes
> other problems or not.  Any thoughts?

We have always considered that changing the APIs of published SPI
interfaces is a non-starter.  The entire reason those calls still
exist at all is for the benefit of third-party extensions.

regards, tom lane




Re: TOAST condition for column size

2021-01-19 Thread Tom Lane
Dilip Kumar  writes:
> On Tue, 19 Jan 2021 at 6:28 PM, Amit Kapila  wrote:
>> Won't it be safe because we don't align individual attrs of type
>> varchar where length is less than equal to 127?

> Yeah right,  I just missed that point.

Yeah, the minimum on biggest_size has nothing to do with alignment
decisions.  It's just a filter to decide whether it's worth trying
to toast anything.

Having said that, I'm pretty skeptical of this patch: I think its
most likely real-world effect is going to be to waste cycles (and
create TOAST-table bloat) on the way to failing anyway.  I do not
think that toasting a 20-byte field down to 18 bytes is likely to be
a productive thing to do in typical situations.  The given example
looks like a cherry-picked edge case rather than a useful case to
worry about.

IOW, if I were asked to review whether the current minimum is
well-chosen, I'd be wondering if we should increase it not
decrease it.

regards, tom lane




Re: [PATCH 1/1] Fix detection of pwritev support for OSX.

2021-01-19 Thread James Hilliard
On Tue, Jan 19, 2021 at 8:27 AM Tom Lane  wrote:
>
> James Hilliard  writes:
> > Fixes:
> > gcc -Wall -Wmissing-prototypes -Wpointer-arith 
> > -Wdeclaration-after-statement -Werror=vla -Wendif-labels 
> > -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv 
> > -Wno-unused-command-line-argument -O2 -I../../../../src/include  -isysroot 
> > /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk
> > -c -o fd.o fd.c
> > fd.c:3661:10: warning: 'pwritev' is only available on macOS 11.0 or newer 
> > [-Wunguarded-availability-new]
>
> We already dealt with that by not selecting an SDK newer than the
> underlying OS (see 4823621db).
Tried that, doesn't work; I'm not even sure how it could possibly fix this
issue at all. This cannot be fixed properly by selecting a specific SDK
version alone; it's the symbols valid for a specific target deployment
version that matter here.
> I do not believe that your proposal
> is more reliable than that approach, and it's surely uglier.  Are
> we really going to abandon Autoconf's built-in checking method every
> time Apple adds an API they should have had ten years ago?  If so,
> you forgot preadv ...
I didn't run into an issue there for some reason...but this was the cleanest fix
I could come up with that seemed to work.
>
> regards, tom lane




Re: [PATCH 1/1] Fix detection of pwritev support for OSX.

2021-01-19 Thread Tom Lane
James Hilliard  writes:
> Fixes:
> gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement 
> -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security 
> -fno-strict-aliasing -fwrapv -Wno-unused-command-line-argument -O2 
> -I../../../../src/include  -isysroot 
> /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk
> -c -o fd.o fd.c
> fd.c:3661:10: warning: 'pwritev' is only available on macOS 11.0 or newer 
> [-Wunguarded-availability-new]

We already dealt with that by not selecting an SDK newer than the
underlying OS (see 4823621db).  I do not believe that your proposal
is more reliable than that approach, and it's surely uglier.  Are
we really going to abandon Autoconf's built-in checking method every
time Apple adds an API they should have had ten years ago?  If so,
you forgot preadv ...

regards, tom lane




Re: search_plan_tree(): handling of non-leaf CustomScanState nodes causes segfault

2021-01-19 Thread Tom Lane
David Geier  writes:
> On 18.01.21 23:42, Tom Lane wrote:
>> OK, cool.  I was afraid you'd argue that you really needed your CustomScan
>> node to be transparent in such cases.  We could imagine inventing an
>> additional custom-scan-provider callback to embed the necessary knowledge,
>> but I'd rather not add the complexity until someone has a use-case.

> I was thinking about that. Generally, having such possibility would be 
> very useful. Unfortunately, I believe that in my specific case even 
> having such functionality wouldn't allow for the query to work with 
> CURRENT OF, because my CSP behaves similarly to a materialize node.
> My understanding is only such plans are supported which consist of nodes 
> that guarantee that the tuple returned by the plan is the unmodified 
> tuple a scan leaf node is currently positioned on?

Doesn't have to be *unmodified* --- a projection is fine, for example.
But we have to be sure that the current output tuple of the plan tree
is based on the current output tuple of the bottom-level table scan
node.  As an example of the hazards here, it's currently safe for
search_plan_tree to descend through a Limit node, but it did not use to
be, because the old implementation of Limit was such that it could return
a different tuple from the one the underlying scan node thinks it is
positioned on.

As another example, descending through Append is OK because only one
of the child scans will be positioned-on-a-tuple at all; the rest
will be at EOF or not yet started, so they can't produce a match
to whatever tuple ID the WHERE CURRENT OF is asking about.

Now that I look at this, I strongly wonder whether whoever added
MergeAppend support here understood what they were doing.  That
looks broken, because child nodes will typically be positioned on
tuples, whether or not the current top-level output came from them.
So I fear we could get a false-positive confirmation that some
tuple matches WHERE CURRENT OF.

Anyway, it seems clearly possible that some nonleaf CustomScans
would operate in a manner that would allow descending through them
while others wouldn't.  But I don't really want to write the docs
explaining what a callback for this should do ;-)

regards, tom lane




Re: Use boolean array for nulls parameters

2021-01-19 Thread Hamid Akhtar
I personally don't see any benefit in this change. The focus shouldn't be
on fixing things that aren't broken. Perhaps there is more value in using a
bitmap data type to keep track of NULL values, which is the typical
storage-vs-performance debate, and IMHO it's better to err on the side of
using slightly more storage for much better performance. IIRC, the bitmap
idea has been discussed and rejected before, too.

On Tue, Jan 19, 2021 at 7:07 PM japin  wrote:

>
> Hi,
>
> While reviewing [1], I noticed that the tuple's nulls array uses the char type.
> However, many places use a boolean array to represent the nulls array,
> so I think we can replace the char-type nulls array with a boolean type.  This
> change will break the SPI_xxx API; I'm not sure whether this change causes
> other problems or not.  Any thoughts?
>
> [1] -
> https://www.postgresql.org/message-id/flat/ca+hiwqgkfjfydeq5vhph6eqpkjsbfpddy+j-kxyfepqedts...@mail.gmail.com
>
> --
> Regrads,
> Japin Li.
> ChengDu WenWu Information Technology Co.,Ltd.
>
>

-- 
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
ADDR: 10318 WHALLEY BLVD, Surrey, BC
CELL:+923335449950  EMAIL: mailto:hamid.akh...@highgo.ca
SKYPE: engineeredvirus


Re: ResourceOwner refactoring

2021-01-19 Thread Heikki Linnakangas

On 18/01/2021 16:34, Alvaro Herrera wrote:

On 2021-Jan-18, Heikki Linnakangas wrote:


+static ResourceOwnerFuncs jit_funcs =
+{
+   /* relcache references */
+   .name = "LLVM JIT context",
+   .phase = RESOURCE_RELEASE_BEFORE_LOCKS,
+   .ReleaseResource = ResOwnerReleaseJitContext,
+   .PrintLeakWarning = ResOwnerPrintJitContextLeakWarning
+};


I think you mean jit_resowner_funcs here; "jit_funcs" is a bit
excessively vague.  Also, why did you choose not to define
ResourceOwnerRememberJIT?  You do that in other modules and it seems
better.


I did it in modules that had more than one ResourceOwnerRemember/Forget 
call. Didn't seem worth it in functions like IncrTupleDescRefCount(), 
for example.


Hayato Kuroda also pointed that out, though. So perhaps it's better to 
be consistent, to avoid the confusion. I'll add the missing wrappers.
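
Roughly like this, presumably -- just a sketch, assuming the
ResourceOwnerRemember/Forget(owner, value, kind) form used elsewhere in the
patch and Alvaro's suggested jit_resowner_funcs name:

static inline void
ResourceOwnerRememberJIT(ResourceOwner owner, LLVMJitContext *handle)
{
	ResourceOwnerRemember(owner, PointerGetDatum(handle), &jit_resowner_funcs);
}

static inline void
ResourceOwnerForgetJIT(ResourceOwner owner, LLVMJitContext *handle)
{
	ResourceOwnerForget(owner, PointerGetDatum(handle), &jit_resowner_funcs);
}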


- Heikki




Re: [PATCH] More docs on what to do and not do in extension code

2021-01-19 Thread Bharath Rupireddy
On Mon, Jan 18, 2021 at 1:27 PM Craig Ringer
 wrote:
> Hi folks
>
> The attached patch expands the xfunc docs and bgworker docs a little, 
> providing a starting point for developers to learn how to do some common 
> tasks the Postgres Way.
>
> It mentions in brief these topics:
>
> * longjmp() based exception handling with elog(ERROR), PG_CATCH() and 
> PG_RE_THROW() etc
> * Latches, spinlocks, LWLocks, heavyweight locks, condition variables
> * shm, DSM, DSA, shm_mq
> * syscache, relcache, relation_open(), invalidations
> * deferred signal handling, CHECK_FOR_INTERRUPTS()
> * Resource cleanup hooks and callbacks like on_exit, before_shmem_exit, the 
> resowner callbacks, etc
> * signal handling in bgworkers
>
> All very superficial, but all things I really wish I'd known a little about, 
> or even that I needed to learn about, when I started working on postgres.
>
> I'm not sure it's in quite the right place. I wonder if there should be a 
> separate part of xfunc.sgml that covers the slightly more advanced bits of 
> postgres backend and function coding like this, lists relevant README files 
> in the source tree, etc.
>
> I avoided going into details like how resource owners work. I don't want the 
> docs to have to cover all that in detail; what I hope to do is start 
> providing people with clear references to the right place in the code, 
> READMEs, etc to look when they need to understand specific topics.

Thanks for the patch.

Here are some comments:

[1]
   background worker's main function, and must be unblocked by it; this is to
allow the process to customize its signal handlers, if necessary.
-   Signals can be unblocked in the new process by calling
-   BackgroundWorkerUnblockSignals and blocked by calling
-   BackgroundWorkerBlockSignals.
+   It is important that all background workers set up and unblock signal
+   handling before they enter their main loops. Signal handling in background
+   workers is discussed separately in .
   

IMO, we can retain the statement about BackgroundWorkerUnblockSignals
and BackgroundWorkerBlockSignals, but mention the link to
"bgworker-signals" for more details and move the statement "it's
important to unblock signals before enter their main loop" to
"bgworker-signals" section and we can also reason there the
consequences if not done.

[2]
+   interupt-aware APIs for the purpose. Do not
usleep(),
+   system(), make blocking system calls, etc.
+  

Is it "Do not use usleep(),
system() or make blocking system calls etc." ?

[3] IMO, we can remove following from "bgworker-signals" if we retain
it where currently it is, as discussed in [1].
+Signals can be unblocked in the new process by calling
+BackgroundWorkerUnblockSignals and blocked by calling
+BackgroundWorkerBlockSignals.

[4] Can we say
+The default signal handlers set up for background workers do
+default background worker signal handlers, it should call

instead of
+The default signal handlers installed for background workers do
+default background worker signal handling it should call

[5] IMO, we can have something like below
+request, etc. Set up these handlers before unblocking signals as
+shown below:

instead of
+request, etc. To install these handlers, before unblocking interrupts
+run:

[6] I think either elog() or ereport() can be used for logs and errors, so how about
+Use elog() or ereport() for
+logging output or raising errors instead of any direct stdio calls.

instead of
+Use elog() and ereport() for
+logging output and raising errors instead of any direct stdio calls.

[7] Can we use "child processes" instead of "subprocesses"? If okay, in
other places in the patch as well.
+and should only use the main thread. PostgreSQL generally
uses child processes
+that coordinate over shared memory instead of threads - for
instance, see
+.

instead of
+and should only use the main thread. PostgreSQL generally
uses subprocesses
+that coordinate over shared memory instead of threads - see
+.

[8] Why should file descriptor manager API be used to execute
subprocesses/child processes?
+To execute subprocesses and open files, use the routines provided by
+the file descriptor manager like BasicOpenFile
+and OpenPipeStream instead of a direct

[9] "should always be"? even if it's a blocking extesion, does it
work? If our intention is to recommend the developers, maybe we should
avoid using the term "should" in the patch in other places as well.
+Extension code should always be structured as a non-blocking

[10] I think it is
+you should avoid using sleep() or
usleep()

instead of
+you should sleep() or
usleep()


[11] I think it is
+block if this happens. So cleanup of resources is not
entirely managed by PostgreSQL, it
+   must be handled using appropriate callbacks provided by PostgreSQL

instead of
+block if this 

Re: [HACKERS] [PATCH] Generic type subscripting

2021-01-19 Thread Dmitry Dolgov
> On Thu, Jan 14, 2021 at 12:02:42PM -0500, Dian M Fay wrote:
> > Other than that, since I've already posted the patch for returning an
> > error option, it seems that the only thing left is to decide with which
> > version to go.
>
> The trigger issue (which I did verify) makes the "no update" option
> unworkable imo, JavaScript's behavior notwithstanding. But it should be
> called out very clearly in the documentation, since it does depart from
> what people more familiar with that behavior may expect. Here's a quick
> draft, based on your v44 patch:
>
> 
>  jsonb data type supports array-style subscripting expressions
>  to extract or update particular elements. It's possible to use multiple
>  subscripting expressions to extract nested values. In this case, a chain of
>  subscripting expressions follows the same rules as the
>  path argument in jsonb_set function,
>  e.g. in case of arrays it is a 0-based operation or that negative integers
>  that appear in path count from the end of JSON arrays.
>  The result of subscripting expressions is always of the jsonb data type.
> 
> 
>  UPDATE statements may use subscripting in the
>  SET clause to modify jsonb values. Every
>  affected value must conform to the path defined by the subscript(s). If the
>  path cannot be followed to its end for any individual value (e.g.
>  val['a']['b']['c'] where val['a'] or
>  val['b'] is null, a string, or a number), an error is
>  raised even if other values do conform.
> 
> 
>  An example of subscripting syntax:

Yes, makes sense. I've incorporated your suggestion into the last patch,
thanks.
From c9143a620497dac5615c4de1d9349684e9af95b5 Mon Sep 17 00:00:00 2001
From: Dmitrii Dolgov <9erthali...@gmail.com>
Date: Fri, 18 Dec 2020 17:19:51 +0100
Subject: [PATCH v45 1/3] Subscripting for jsonb

Subscripting implementation for jsonb. It does not support slices, does
not have a limit for number of subscripts and for assignment expects a
replace value to be of jsonb type. There is also one functional
difference in assignment via subscripting from jsonb_set, when an
original jsonb container is NULL, subscripting replaces it with an empty
jsonb and proceed with assignment.

For the sake of code reuse, some parts of jsonb functionality were
rearranged to allow use the same functions for jsonb_set and assign
subscripting operation.

The original idea belongs to Oleg Bartunov.

Reviewed-by: Tom Lane, Arthur Zakirov, Pavel Stehule
---
 doc/src/sgml/json.sgml  |  48 
 src/backend/utils/adt/Makefile  |   1 +
 src/backend/utils/adt/jsonb_util.c  |  76 -
 src/backend/utils/adt/jsonbsubs.c   | 413 
 src/backend/utils/adt/jsonfuncs.c   | 180 ++--
 src/include/catalog/pg_proc.dat |   4 +
 src/include/catalog/pg_type.dat |   3 +-
 src/include/utils/jsonb.h   |   6 +-
 src/test/regress/expected/jsonb.out | 272 +-
 src/test/regress/sql/jsonb.sql  |  84 +-
 10 files changed, 982 insertions(+), 105 deletions(-)
 create mode 100644 src/backend/utils/adt/jsonbsubs.c

diff --git a/doc/src/sgml/json.sgml b/doc/src/sgml/json.sgml
index 5b9a5557a4..100d1a60f4 100644
--- a/doc/src/sgml/json.sgml
+++ b/doc/src/sgml/json.sgml
@@ -602,6 +602,54 @@ SELECT jdoc-'guid', jdoc-'name' FROM api WHERE jdoc @ '{"tags": ["qu
   
  
 
+ 
+  jsonb Subscripting
+  
+   jsonb data type supports array-style subscripting expressions
+   to extract or update particular elements. It's possible to use multiple
+   subscripting expressions to extract nested values. In this case, a chain of
+   subscripting expressions follows the same rules as the
+   path argument in jsonb_set function,
+   e.g. in case of arrays it is a 0-based operation or that negative integers
+   that appear in path count from the end of JSON arrays.
+   The result of subscripting expressions is always jsonb data type. An
+   example of subscripting syntax:
+
+-- Extract value by key
+SELECT ('{"a": 1}'::jsonb)['a'];
+
+-- Extract nested value by key path
+SELECT ('{"a": {"b": {"c": 1}}}'::jsonb)['a']['b']['c'];
+
+-- Extract element by index
+SELECT ('[1, "2", null]'::jsonb)[1];
+
+-- Update value by key, note the single quotes - the assigned value
+-- needs to be of jsonb type as well
+UPDATE table_name SET jsonb_field['key'] = '1';
+
+-- Select records using where clause with subscripting. Since the result of
+-- subscripting is jsonb and we basically want to compare two jsonb objects, we
+-- need to put the value in double quotes to be able to convert it to jsonb.
+SELECT * FROM table_name WHERE jsonb_field['key'] = '"value"';
+
+
+  Subscripting for jsonb does not support slice expressions,
+  even if it contains an array.
+
+  In case if source jsonb is NULL, assignment
+  via subscripting will proceed as if it was an empty JSON object:
+
+-- If jsonb_field here is NULL, the result is {"a": 1}
+UPDATE table_name SET jsonb_field['a'] = '1';
+
+-- If jsonb_field here is NULL, the result is [1]
