Custom explain options

2023-10-21 Thread Konstantin Knizhnik

Hi hackers,

The EXPLAIN statement has a list of options (e.g. ANALYZE, BUFFERS,
COSTS, ...) which help to provide useful details of query execution.
In Neon we have added a PREFETCH option which shows information about
page prefetching during query execution (prefetching is more critical
for the Neon architecture because of the separation of compute and
storage, so it is implemented not only for bitmap heap scan as in
vanilla Postgres, but also for seqscan, index scan and index-only
scan). Another possible candidate for an EXPLAIN option is the local
file cache (an extra caching layer above shared buffers which stands
in for the file system cache of standalone Postgres).


I think it would be nice to have a generic mechanism which allows
extensions to add their own options to EXPLAIN.
I have attached a patch implementing such a mechanism (also available
as a PR: https://github.com/knizhnik/postgres/pull/1 )


I have demonstrated this mechanism using the Bloom extension - just to
report the number of Bloom matches.
I am not sure this is really useful information; it is meant mostly as
an example:


explain (analyze,bloom) select * from t where pk=2000;
                                                        QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on t  (cost=15348.00..15352.01 rows=1 width=4) (actual time=25.244..25.939 rows=1 loops=1)
   Recheck Cond: (pk = 2000)
   Rows Removed by Index Recheck: 292
   Heap Blocks: exact=283
   Bloom: matches=293
   ->  Bitmap Index Scan on t_pk_idx  (cost=0.00..15348.00 rows=1 width=0) (actual time=25.147..25.147 rows=293 loops=1)
         Index Cond: (pk = 2000)
         Bloom: matches=293
 Planning:
   Bloom: matches=0
 Planning Time: 0.387 ms
 Execution Time: 26.053 ms
(12 rows)
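
To give an idea of the extension-side hookup, here is a rough sketch
of the registration step.  The callback names match the Bloom code in
the attached patch, but the exact RegisterCustomInstrumentation()
signature is not visible in the excerpt below, so the argument list
shown here is only an assumption, not the patch's actual API:

/*
 * Sketch only: the RegisterCustomInstrumentation() arguments are assumed;
 * callback definitions are as in the attached patch.
 */
#include "postgres.h"
#include "commands/explain.h"
#include "fmgr.h"

PG_MODULE_MAGIC;

typedef struct
{
	uint64		matches;
} BloomUsage;

static void bloomUsageAdd(CustomResourceUsage *dst, CustomResourceUsage const *add);
static void bloomUsageAccum(CustomResourceUsage *acc, CustomResourceUsage const *end,
							CustomResourceUsage const *start);
static void bloomUsageShow(ExplainState *es, CustomResourceUsage const *usage,
						   bool planning);

void
_PG_init(void)
{
	/*
	 * Hypothetical call: register the "bloom" EXPLAIN option and its
	 * callbacks; sizeof(BloomUsage) counts against the 128-byte limit on
	 * custom metrics mentioned below.
	 */
	RegisterCustomInstrumentation("bloom", sizeof(BloomUsage),
								  bloomUsageAdd, bloomUsageAccum,
								  bloomUsageShow);
}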

There are two known issues with this proposal:

1. I have to limit the total size of all custom metrics - right now it
is limited to 128 bytes. This is done to keep Instrumentation and some
other data structures fixed size; otherwise maintaining the varying
parts of these structures is ugly, especially in shared memory.


2. A custom instrumentation is added by means of the
RegisterCustomInstrumentation function, which is called from _PG_init.
But _PG_init is called when the extension library is loaded, and the
library is loaded on demand when one of the extension's functions is
called (unless the extension is included in the
shared_preload_libraries list, which the Bloom extension doesn't
require). So if the first statement executed in your session is:


 explain (analyze,bloom) select * from t where pk=2000;

...you will get an error:

ERROR:  unrecognized EXPLAIN option "bloom"
LINE 1: explain (analyze,bloom) select * from t where pk=2000;

This happens because at the moment the EXPLAIN statement parses its
options, the Bloom index has not yet been selected, so the bloom
extension is not loaded and RegisterCustomInstrumentation has not yet
been called. If we repeat the query, the proper result is displayed
(see above).
diff --git a/contrib/bloom/bloom.h b/contrib/bloom/bloom.h
index 330811ec60..af23ffe821 100644
--- a/contrib/bloom/bloom.h
+++ b/contrib/bloom/bloom.h
@@ -174,6 +174,13 @@ typedef struct BloomScanOpaqueData
 
 typedef BloomScanOpaqueData *BloomScanOpaque;
 
+typedef struct
+{
+   uint64 matches;
+} BloomUsage;
+
+extern BloomUsage bloomUsage;
+
 /* blutils.c */
 extern void initBloomState(BloomState *state, Relation index);
 extern void BloomFillMetapage(Relation index, Page metaPage);
diff --git a/contrib/bloom/blscan.c b/contrib/bloom/blscan.c
index 61d1f66b38..8890951943 100644
--- a/contrib/bloom/blscan.c
+++ b/contrib/bloom/blscan.c
@@ -166,6 +166,6 @@ blgetbitmap(IndexScanDesc scan, TIDBitmap *tbm)
CHECK_FOR_INTERRUPTS();
}
FreeAccessStrategy(bas);
-
+   bloomUsage.matches += ntids;
return ntids;
 }
diff --git a/contrib/bloom/blutils.c b/contrib/bloom/blutils.c
index f23fbb1d9e..d795875920 100644
--- a/contrib/bloom/blutils.c
+++ b/contrib/bloom/blutils.c
@@ -18,6 +18,7 @@
 #include "access/reloptions.h"
 #include "bloom.h"
 #include "catalog/index.h"
+#include "commands/explain.h"
 #include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "storage/bufmgr.h"
@@ -34,6 +35,55 @@
 
 PG_FUNCTION_INFO_V1(blhandler);
 
+BloomUsage bloomUsage;
+
+static void
+bloomUsageAdd(CustomResourceUsage* dst, CustomResourceUsage const* add)
+{
+   ((BloomUsage*)dst)->matches += ((BloomUsage*)add)->matches;
+}
+
+static void
+bloomUsageAccum(CustomResourceUsage* acc, CustomResourceUsage const* end, CustomResourceUsage const* start)
+{
+   ((BloomUsage*)acc)->matches += ((BloomUsage*)end)->matches - ((BloomUsage*)start)->matches;
+}
+
+static void
+bloomUsageShow(ExplainState* es, CustomResourceUsage const* usage, bool planning)
+{
+   if (es->format == EXPLAIN_FORMAT_TEXT)
+   {
+   if (planning)
+   {
+   ExplainIndentText(es);
+ 

Re: Remove extraneous break condition in logical slot advance function

2023-10-21 Thread Gurjeet Singh
On Fri, Oct 20, 2023 at 7:30 PM Bharath Rupireddy
 wrote:
>
> Hi,
>
> There exists an extraneous break condition in
> pg_logical_replication_slot_advance(). When the end of WAL or moveto
> LSN is reached, the main while condition helps to exit the loop, so no
> separate break condition is needed. Attached patch removes it.
>
> Thoughts?

+1 for the patch.

The only advantage I see of the code as it stands right now is that it
avoids one last call to CHECK_FOR_INTERRUPTS() by break'ing early. I
don't think we'd lose much in terms of performance by making one (very
cheap, in common case) extra call of this macro.

Best regards,
Gurjeet
http://Gurje.et




Re: Remove extraneous break condition in logical slot advance function

2023-10-21 Thread Tom Lane
Gurjeet Singh  writes:
> On Fri, Oct 20, 2023 at 7:30 PM Bharath Rupireddy
>  wrote:
>> There exists an extraneous break condition in
>> pg_logical_replication_slot_advance(). When the end of WAL or moveto
>> LSN is reached, the main while condition helps to exit the loop, so no
>> separate break condition is needed. Attached patch removes it.

> The only advantage I see of the code as it stands right now is that it
> avoids one last call to CHECK_FOR_INTERRUPTS() by break'ing early. I
> don't think we'd lose much in terms of performance by making one (very
> cheap, in common case) extra call of this macro.

Agreed, bypassing the last CHECK_FOR_INTERRUPTS() shouldn't save
anything noticeable.  Could there be a correctness argument for it
though?  Can't see what.  We should assume that CFIs might happen
down inside LogicalDecodingProcessRecord.

I wondered why the code looks like this, and whether there used
to be more of a reason for it.  "git blame" reveals the probable
answer: when this code was added, in 9c7d06d60, the loop
condition was different so the break was necessary.
38a957316 simplified the loop condition to what we see today,
but didn't notice that the break was thereby made pointless.

While we're here ... the comment above the loop seems wrong
already, and this makes it more so.  I suggest something like

-   /* Decode at least one record, until we run out of records */
+   /* Decode records until we reach the requested target */
while (ctx->reader->EndRecPtr < moveto)

regards, tom lane
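
For reference, here is a rough schematic of the loop shape under
discussion (paraphrased, not the exact upstream source of
pg_logical_replication_slot_advance()):

	/* Decode records until we reach the requested target */
	while (ctx->reader->EndRecPtr < moveto)
	{
		char	   *errm = NULL;
		XLogRecord *record = XLogReadRecord(ctx->reader, &errm);

		if (errm)
			elog(ERROR, "could not advance replication slot: %s", errm);

		if (record != NULL)
			LogicalDecodingProcessRecord(ctx, ctx->reader);

		/*
		 * Redundant: the while condition above already terminates the loop
		 * once EndRecPtr reaches moveto, so this early break only skips the
		 * final CHECK_FOR_INTERRUPTS().
		 */
		if (moveto <= ctx->reader->EndRecPtr)
			break;

		CHECK_FOR_INTERRUPTS();
	}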




Re: Improve WALRead() to suck data directly from WAL buffers when possible

2023-10-21 Thread Bharath Rupireddy
On Fri, Oct 20, 2023 at 10:19 PM Bharath Rupireddy
 wrote:
>
> On Thu, Oct 12, 2023 at 4:13 AM Andres Freund  wrote:
> >
> > On 2023-10-03 16:05:32 -0700, Jeff Davis wrote:
> > > On Sat, 2023-01-14 at 12:34 -0800, Andres Freund wrote:
> > > > One benefit would be that it'd make it more realistic to use direct
> > > > IO for WAL
> > > > - for which I have seen significant performance benefits. But when we
> > > > afterwards have to re-read it from disk to replicate, it's less
> > > > clearly a win.
> > >
> > > Does this patch still look like a good fit for your (or someone else's)
> > > plans for direct IO here? If so, would committing this soon make it
> > > easier to make progress on that, or should we wait until it's actually
> > > needed?
> >
> > I think it'd be quite useful to have. Even with the code as of 16, I see
> > better performance in some workloads with debug_io_direct=wal,
> > wal_sync_method=open_datasync compared to any other configuration. Except of
> > course that it makes walsenders more problematic, as they suddenly require
> > read IO. Thus having support for walsenders to send directly from wal 
> > buffers
> > would be beneficial, even without further AIO infrastructure.
>
> I'm attaching the v11 patch set with the following changes:
> - Improved input validation in the function that reads WAL from WAL
> buffers in 0001 patch.
> - Improved test module's code in 0002 patch.
> - Modernized meson build file in 0002 patch.
> - Added commit messages for both the patches.
> - Ran pgindent on both the patches.
>
> Any thoughts are welcome.

I'm attaching v12 patch set with just pgperltidy ran on the new TAP
test added in 0002. No other changes from that of v11 patch set.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From c240d914967261e462290b714ec6ae2803d72442 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Sat, 21 Oct 2023 13:50:15 +
Subject: [PATCH v12] Allow WAL reading from WAL buffers

This commit gives WALRead() the capability to read WAL from WAL
buffers when possible. When the requested WAL isn't available in
WAL buffers, the WAL is read from the WAL file as usual. It relies on
WALBufMappingLock so that no one replaces the WAL buffer page that
we're reading from. It skips reading from WAL buffers if
WALBufMappingLock can't be acquired immediately. In other words,
it doesn't wait for WALBufMappingLock to be available. This helps
reduce the contention on WALBufMappingLock.

This commit benefits the callers of WALRead(), namely walsenders
and pg_walinspect. They can now avoid reading WAL from the WAL
file (possibly avoiding disk IO). Tests show that the WAL buffers
hit ratio stood at 95% for 1 primary, 1 sync standby, 1 async
standby, with pgbench --scale=300 --client=32 --time=900. In other
words, the walsenders avoided reading from the WAL file (and the
corresponding pread system calls) 95% of the time:
https://www.postgresql.org/message-id/CALj2ACXKKK%3DwbiG5_t6dGao5GoecMwRkhr7GjVBM_jg54%2BNa%3DQ%40mail.gmail.com

This commit also benefits when direct IO is enabled for WAL.
Reading WAL from WAL buffers puts back the performance close to
that of without direct IO for WAL:
https://www.postgresql.org/message-id/CALj2ACV6rS%2B7iZx5%2BoAvyXJaN4AG-djAQeM1mrM%3DYSDkVrUs7g%40mail.gmail.com

This commit also paves the way for the following features in the
future:
- Improved synchronous replication performance by replicating
directly from WAL buffers.
- An opt-in way for walreceivers to receive unflushed WAL.
More details here:
https://www.postgresql.org/message-id/20231011224353.cl7c2s222dw3de4j%40awork3.anarazel.de

Author: Bharath Rupireddy
Reviewed-by: Dilip Kumar, Andres Freund
Reviewed-by: Nathan Bossart, Kuntal Ghosh
Discussion: https://www.postgresql.org/message-id/CALj2ACXKKK%3DwbiG5_t6dGao5GoecMwRkhr7GjVBM_jg54%2BNa%3DQ%40mail.gmail.com
---
 src/backend/access/transam/xlog.c   | 208 
 src/backend/access/transam/xlogreader.c |  45 -
 src/include/access/xlog.h   |   6 +
 3 files changed, 257 insertions(+), 2 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index cea13e3d58..9553a880f1 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -1706,6 +1706,214 @@ GetXLogBuffer(XLogRecPtr ptr, TimeLineID tli)
 	return cachedPos + ptr % XLOG_BLCKSZ;
 }
 
+/*
+ * Read WAL from WAL buffers.
+ *
+ * Read 'count' bytes of WAL from WAL buffers into 'buf', starting at location
+ * 'startptr', on timeline 'tli' and return total read bytes.
+ *
+ * Note that this function reads as much as it can from WAL buffers, meaning,
+ * it may not read all the requested 'count' bytes. Caller must be aware of
+ * this and deal with it.
+ *
+ * Note that this function is not available to frontend code, as WAL buffers
+ * are an internal mechanism of the server.
+ */
+Size
+XLogReadFromBuffers(XLogReaderState *state,
+	XL
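
For illustration, the caller side of this in WALRead() is intended to
look roughly like the following.  This is only a sketch of the approach
described in the commit message; the exact XLogReadFromBuffers()
argument order and the fallback helper named here are assumptions, not
the patch's code:

	/* Try the WAL buffers first; the function may return fewer bytes. */
	nread = XLogReadFromBuffers(state, startptr, tli, count, buf);

	if (nread < count)
	{
		/*
		 * The rest is no longer (or not yet) present in WAL buffers, so
		 * fall back to the usual read from the on-disk WAL segment files.
		 * read_rest_from_wal_file() is a hypothetical placeholder for that
		 * existing file-read path.
		 */
		read_rest_from_wal_file(buf + nread, startptr + nread,
								count - nread, tli);
	}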

Re: Switching XLog source from archive to streaming when primary available

2023-10-21 Thread Bharath Rupireddy
On Fri, Jul 21, 2023 at 12:38 PM Bharath Rupireddy
 wrote:
>
> Needed a rebase. I'm attaching the v13 patch for further consideration.

Needed a rebase. I'm attaching the v14 patch. It also has the following changes:

- Ran pgindent on the new source code.
- Ran pgperltidy on the new TAP test.
- Improved the newly added TAP test a bit. Used the new wait_for_log
core TAP function in place of custom find_in_log.

Thoughts?

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From e6010e3b2e4c52d32a5d2f3eb2d59954617b221b Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Sat, 21 Oct 2023 13:41:46 +
Subject: [PATCH v14] Allow standby to switch WAL source from archive to
 streaming

A standby typically switches to streaming replication (getting WAL
from the primary) only when receiving WAL from the archive finishes
(no more WAL left there) or fails for any reason. Reading WAL from
the archive may not always be as efficient and fast as reading from
the primary. This can be due to differences in disk types, IO
costs, network latencies etc., all impacting recovery performance
on the standby. And, while the standby is reading WAL from the
archive, the primary accumulates WAL because the standby's
replication slot stays inactive. To avoid these problems, one can
use this parameter to make the standby switch to stream mode
sooner.

This feature adds a new GUC that specifies the amount of time after
which the standby attempts to switch the WAL source from the WAL
archive to streaming replication (getting WAL from the primary).
However, the standby exhausts all the WAL present in pg_wal before
switching. If the standby fails to switch to stream mode, it falls
back to archive mode.

Reported-by: SATYANARAYANA NARLAPURAM
Author: Bharath Rupireddy
Reviewed-by: Cary Huang, Nathan Bossart
Reviewed-by: Kyotaro Horiguchi
Discussion: https://www.postgresql.org/message-id/CAHg+QDdLmfpS0n0U3U+e+dw7X7jjEOsJJ0aLEsrtxs-tUyf5Ag@mail.gmail.com
---
 doc/src/sgml/config.sgml  |  45 ++
 doc/src/sgml/high-availability.sgml   |  15 +-
 src/backend/access/transam/xlogrecovery.c | 146 --
 src/backend/utils/misc/guc_tables.c   |  12 ++
 src/backend/utils/misc/postgresql.conf.sample |   4 +
 src/include/access/xlogrecovery.h |   1 +
 src/test/recovery/meson.build |   1 +
 src/test/recovery/t/040_wal_source_switch.pl  | 130 
 8 files changed, 333 insertions(+), 21 deletions(-)
 create mode 100644 src/test/recovery/t/040_wal_source_switch.pl

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 3839c72c86..3a18ba9b26 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4791,6 +4791,51 @@ ANY num_sync ( 
+  streaming_replication_retry_interval (integer)
+  
+   streaming_replication_retry_interval configuration parameter
+  
+  
+  
+   
+Specifies amount of time after which standby attempts to switch WAL
+source from WAL archive to streaming replication (getting WAL from
+primary). However, standby exhausts all the WAL present in pg_wal
+before switching. If standby fails to switch to stream mode, it falls
+back to archive mode. If this parameter's value is specified without
+units, it is taken as milliseconds. Default is five minutes
+(5min). With a lower value for this parameter,
+standby makes frequent WAL source switch attempts. To avoid this, it is
+recommended to set a reasonable value. A setting of 0
+disables the feature. When disabled, standby typically switches to
+stream mode only when receive from WAL archive finishes (no more WAL
+left there) or fails for any reason. This parameter can only be set in
+the postgresql.conf file or on the server command
+line.
+   
+   
+
+ Standby may not always attempt to switch source from WAL archive to
+ streaming replication at exact streaming_replication_retry_interval
+ intervals. For example, if the parameter is set to 1min
+ and fetching WAL file from archive takes about 2min,
+ then the source switch attempt happens for the next WAL file after
+ current WAL file fetched from archive is fully applied.
+
+   
+   
+Reading WAL from archive may not always be as efficient and fast as
+reading from primary. This can be due to the differences in disk types,
+IO costs, network latencies etc., all impacting recovery performance on
+standby. And, while standby is reading WAL from archive, primary
+accumulates WAL because the standby's replication slot stays inactive.
+To avoid these problems, one can use this parameter to make standby
+switch to stream mode sooner.
+   
+  
+ 
+
  
   recovery_min_apply_delay (integer)
   
diff --git a/doc/src/sgml/high-
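
As a rough schematic of the behavior described in the commit message
(illustration only, not the patch's actual code; variable names other
than the GUC are hypothetical), the recovery side would do something
like this while it is fetching WAL from the archive:

	if (streaming_replication_retry_interval > 0 &&
		currentSource == XLOG_FROM_ARCHIVE &&
		TimestampDifferenceExceeds(last_switch_attempt_time,
								   GetCurrentTimestamp(),
								   streaming_replication_retry_interval))
	{
		/*
		 * The configured interval has elapsed: after finishing the WAL
		 * already available in pg_wal, try streaming from the primary for
		 * the next fetch.  If streaming fails, recovery falls back to the
		 * archive as before.
		 */
		currentSource = XLOG_FROM_STREAM;
		last_switch_attempt_time = GetCurrentTimestamp();
	}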

Re: Removing unneeded self joins

2023-10-21 Thread Alexander Korotkov
On Thu, Oct 19, 2023 at 6:16 AM Andrei Lepikhov
 wrote:
> On 19/10/2023 01:50, Alexander Korotkov wrote:
> > This query took 3778.432 ms with self-join removal disabled, and
> > 3756.009 ms with self-join removal enabled.  So, no measurable
> > overhead.  Similar to the higher number of joins.  Can you imagine
> > some extreme case when self-join removal could introduce significant
> > overhead in comparison with other optimizer parts?  If not, should we
> > remove self_join_search_limit GUC?
> Thanks,
> It was Zhihong Yu who worried about that case [1]. And my purpose was to
> show a method to avoid such a problem if it would be needed.
> I guess the main idea here is that we have a lot of self-joins, but
> only a few of them (or none) can be removed.
> I can't imagine a practical situation when we can be stuck in the
> problems here. So, I vote to remove this GUC.

I've removed the self_join_search_limit.  Anyway, there is
enable_self_join_removal if the self-join removal algorithm causes any
problems.  I also did some grammar corrections to the comments.  I
think the patch is getting into committable shape.  I noticed some
failures on commitfest.cputube.org.  I'd like to check how this
version passes there.

--
Regards,
Alexander Korotkov


0001-Remove-useless-self-joins-v46.patch
Description: Binary data


Re: LLVM 16 (opaque pointers)

2023-10-21 Thread Thomas Munro
On Sat, Oct 21, 2023 at 7:08 PM Andres Freund  wrote:
> I've attached a patch revision that I spent the last couple hours working
> on. It's very very roughly based on a patch Tom Stellard had written (which I
> think a few rpm packages use). But instead of encoding details about specific
> layout details, I made the code check if the data layout works and fall back
> to the cpu / features used for llvmjit_types.bc.  This way it's not s390x
> specific, future odd architecture behaviour would "automatically" be handled
> the same

The explanation makes sense and this seems like a solid plan to deal
with it.  I didn't try on a s390x, but I tested locally on our master
branch with LLVM 7, 10, 17, 18, and then I hacked your patch to take
the fallback path as if a layout mismatch had been detected, and it
worked fine:

2023-10-22 11:49:55.663 NZDT [12000] DEBUG:  detected CPU "skylake",
with features "...", resulting in layout
"e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
2023-10-22 11:49:55.664 NZDT [12000] DEBUG:  detected CPU / features
yield incompatible data layout, using values from module instead
2023-10-22 11:49:55.664 NZDT [12000] DETAIL:  module CPU "x86-64",
features "...", resulting in layout
"e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"

+To deal with that, check if data layouts match during JIT
initialization. If
+the runtime detected cpu / features result in a different layout,
try if the
+cpu/features recorded in in llvmjit_types.bc work.

s|try |check |
s| in in | in |

+   errmsg_internal("could not
determine working CPU / feature comination for JIT compilation"),

s|comination|combination|
s| / |/|g




Re: LLVM 16 (opaque pointers)

2023-10-21 Thread Thomas Munro
On Sat, Oct 21, 2023 at 2:45 PM Tom Lane  wrote:
> Thomas Munro  writes:
> > (It'd be nice if the
> > build farm logged "$LLVM_CONFIG --version" somewhere.)
>
> It's not really the buildfarm script's responsibility to do that,
> but feel free to make configure do so.

Done, copying the example of how we do it for perl and various other things.




Re: The documentation for storage type 'plain' actually allows single byte header

2023-10-21 Thread Bruce Momjian
On Fri, Oct 20, 2023 at 09:48:05PM -0400, Bruce Momjian wrote:
> Here is the original thread from pgsql-docs:
> 
>   
> https://www.postgresql.org/message-id/flat/167336599095.2667301.15497893107226841625%40wrigleys.postgresql.org
> 
> The report is about single-byte headers being used for varlena values
> with PLAIN storage.
> 
> Here is the reproducible report:
> 
>   CREATE EXTENSION pageinspect;
>   CREATE TABLE test(a VARCHAR(1) STORAGE PLAIN);
>   INSERT INTO test VALUES (repeat('A',10));
>   
>   Now peek into the page with pageinspect functions
>   
>   SELECT left(encode(t_data, 'hex'), 40) FROM
>   heap_page_items(get_raw_page('test', 0));
>   
>   This returned value of "1741414141414141414141".
>   Here the first byte 0x17 = 0001 0111 in binary.
>   Length + 1 is stored in the length bits (1-7). So Len = 0001011-1 = 
> (11-1)
>   [base-10] = 10 [base-10]
>   which exactly matches the expected length. Further the data "41" 
> repeated 10
>   times also indicates character A (65 or 0x41 in ASCII) repeated 10 
> times.
> 
> I researched this and thought it would be a case where we were lacking a
> check before creating a single-byte header, but I couldn't find anything
> missing.  I think the problem is that the _source_ tupleDesc attstorage
> attribute is being used to decide if we should use a short header, while
> it is really the storage type of the destination that we should be
> checking.  Unfortunately, I don't think the destination is accessible at
> the location where we are deciding about a short header.
> 
> I am confused how to proceed.  I feel we need to fully understand why
> this is happening before we adjust anything.  Here is a backtrace --- the
> short header is being created in fill_val() and the attstorage value
> there is 'x'/EXTENDED.

I did some more research.  It turns out that the source slot/planSlot is
populating its pg_attribute information via makeTargetEntry() and it
has no concept of a storage type.

Digging further, I found that we cannot get rid of the use of
att->attstorage != TYPSTORAGE_PLAIN in the ATT_IS_PACKABLE and
VARLENA_ATT_IS_PACKABLE macros in src/backend/access/common/heaptuple.c
because there are internal uses of fill_val() that can't handle packed
varlena headers.

I ended up with a doc patch that adds a C comment about this odd
behavior and removes doc text about PLAIN storage not using packed
headers.
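
For reference, the single-byte header arithmetic from the repro quoted
above, written out as code (this mirrors the 1-byte varlena layout on a
little-endian build; it is only an illustration, not part of any patch):

	unsigned char hdr = 0x17;	/* first byte of t_data from pageinspect */

	bool	is_short = (hdr & 0x01) != 0;	/* low bit set => 1-byte header */
	int		total_len = hdr >> 1;			/* length bits 1-7: 11, including the header byte */
	int		data_len = total_len - 1;		/* 10 bytes: 'A' repeated ten times */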

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: The documentation for storage type 'plain' actually allows single byte header

2023-10-21 Thread Bruce Momjian
On Sat, Oct 21, 2023 at 09:56:13PM -0400, Bruce Momjian wrote:
> I did some more research.  It turns out that the source slot/planSlot is
> populating its pg_attribute information via makeTargetEntry() and it
> has no concept of a storage type.
> 
> Digging further, I found that we cannot get rid of the use of
> att->attstorage != TYPSTORAGE_PLAIN in the ATT_IS_PACKABLE and
> VARLENA_ATT_IS_PACKABLE macros in src/backend/access/common/heaptuple.c
> because there are internal uses of fill_val() that can't handle packed
> varlena headers.
> 
> I ended up with a doc patch that adds a C comment about this odd
> behavior and removes doc text about PLAIN storage not using packed
> headers.

Oops, patch attached.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.
diff --git a/doc/src/sgml/storage.sgml b/doc/src/sgml/storage.sgml
index 148fb1b49d..3ea4e5526d 100644
--- a/doc/src/sgml/storage.sgml
+++ b/doc/src/sgml/storage.sgml
@@ -456,9 +456,7 @@ for storing TOAST-able columns on disk:
 
  
   PLAIN prevents either compression or
-  out-of-line storage; furthermore it disables use of single-byte headers
-  for varlena types.
-  This is the only possible strategy for
+  out-of-line storage.  This is the only possible strategy for
   columns of non-TOAST-able data types.
  
 
diff --git a/src/backend/access/common/heaptuple.c b/src/backend/access/common/heaptuple.c
index d6a4ddfd51..c52d40dce0 100644
--- a/src/backend/access/common/heaptuple.c
+++ b/src/backend/access/common/heaptuple.c
@@ -68,7 +68,16 @@
 #include "utils/memutils.h"
 
 
-/* Does att's datatype allow packing into the 1-byte-header varlena format? */
+/*
+ * Does att's datatype allow packing into the 1-byte-header varlena format?
+ * While functions that use TupleDescAttr() and assign attstorage =
+ * TYPSTORAGE_PLAIN cannot use packed varlena headers, functions that call
+ * TupleDescInitEntry() use typeForm->typstorage (TYPSTORAGE_EXTENDED) and
+ * can use packed varlena headers, e.g.:
+ * CREATE TABLE test(a VARCHAR(1) STORAGE PLAIN);
+ * INSERT INTO test VALUES (repeat('A',10));
+ * This can be verified with pageinspect.
+ */
 #define ATT_IS_PACKABLE(att) \
 	((att)->attlen == -1 && (att)->attstorage != TYPSTORAGE_PLAIN)
 /* Use this if it's already known varlena */
diff --git a/src/backend/utils/adt/rangetypes.c b/src/backend/utils/adt/rangetypes.c
index d65e5625c7..24bad52923 100644
--- a/src/backend/utils/adt/rangetypes.c
+++ b/src/backend/utils/adt/rangetypes.c
@@ -2608,7 +2608,8 @@ range_contains_elem_internal(TypeCacheEntry *typcache, const RangeType *r, Datum
  * values into a range object.  They are modeled after heaptuple.c's
  * heap_compute_data_size() and heap_fill_tuple(), but we need not handle
  * null values here.  TYPE_IS_PACKABLE must test the same conditions as
- * heaptuple.c's ATT_IS_PACKABLE macro.
+ * heaptuple.c's ATT_IS_PACKABLE macro.  See the comments there for more
+ * details.
  */
 
 /* Does datatype allow packing into the 1-byte-header varlena format? */


Re: Guiding principle for dropping LLVM versions?

2023-10-21 Thread Thomas Munro
Rebased.  I also noticed this woefully out of date line:

-  PGAC_PATH_PROGS(LLVM_CONFIG, llvm-config llvm-config-7
llvm-config-6.0 llvm-config-5.0 llvm-config-4.0 llvm-config-3.9)
+  PGAC_PATH_PROGS(LLVM_CONFIG, llvm-config llvm-config-17
llvm-config-16 llvm-config-15 llvm-config-14)
From 522a4f3744e691f42f43bb1ca91df60b12465bff Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Thu, 19 Oct 2023 04:45:46 +1300
Subject: [PATCH v3 1/2] jit: Require at least LLVM 14, if enabled.

Remove support for a lot of older LLVM versions dating back to 3.9.  The
default on common software distributions will be at least LLVM 14 when
PostgreSQL 17 ships.

Discussion: https://postgr.es/m/CA%2BhUKGLhNs5geZaVNj2EJ79Dx9W8fyWUU3HxcpZy55sMGcY%3DiA%40mail.gmail.com
---
 config/llvm.m4  |  10 +-
 configure   |  43 +--
 doc/src/sgml/installation.sgml  |   4 +-
 meson.build |   2 +-
 src/backend/jit/llvm/llvmjit.c  | 158 +---
 src/backend/jit/llvm/llvmjit_error.cpp  |  35 --
 src/backend/jit/llvm/llvmjit_expr.c |   6 +-
 src/backend/jit/llvm/llvmjit_inline.cpp |  51 +---
 src/backend/jit/llvm/llvmjit_wrap.cpp   |  65 --
 src/include/jit/llvmjit.h   |  17 ---
 src/include/pg_config.h.in  |  12 --
 src/tools/msvc/Solution.pm  |   3 -
 12 files changed, 15 insertions(+), 391 deletions(-)

diff --git a/config/llvm.m4 b/config/llvm.m4
index 21d8cd4f90..ede5e25e91 100644
--- a/config/llvm.m4
+++ b/config/llvm.m4
@@ -13,7 +13,7 @@ AC_DEFUN([PGAC_LLVM_SUPPORT],
   AC_REQUIRE([AC_PROG_AWK])
 
   AC_ARG_VAR(LLVM_CONFIG, [path to llvm-config command])
-  PGAC_PATH_PROGS(LLVM_CONFIG, llvm-config llvm-config-7 llvm-config-6.0 llvm-config-5.0 llvm-config-4.0 llvm-config-3.9)
+  PGAC_PATH_PROGS(LLVM_CONFIG, llvm-config llvm-config-17 llvm-config-16 llvm-config-15 llvm-config-14)
 
   # no point continuing if llvm wasn't found
   if test -z "$LLVM_CONFIG"; then
@@ -25,14 +25,14 @@ AC_DEFUN([PGAC_LLVM_SUPPORT],
 AC_MSG_ERROR([$LLVM_CONFIG does not work])
   fi
   # and whether the version is supported
-  if echo $pgac_llvm_version | $AWK -F '.' '{ if ([$]1 >= 4 || ([$]1 == 3 && [$]2 >= 9)) exit 1; else exit 0;}';then
-AC_MSG_ERROR([$LLVM_CONFIG version is $pgac_llvm_version but at least 3.9 is required])
+  if echo $pgac_llvm_version | $AWK -F '.' '{ if ([$]1 >= 14) exit 1; else exit 0;}';then
+AC_MSG_ERROR([$LLVM_CONFIG version is $pgac_llvm_version but at least 14 is required])
   fi
   AC_MSG_NOTICE([using llvm $pgac_llvm_version])
 
   # need clang to create some bitcode files
   AC_ARG_VAR(CLANG, [path to clang compiler to generate bitcode])
-  PGAC_PATH_PROGS(CLANG, clang clang-7 clang-6.0 clang-5.0 clang-4.0 clang-3.9)
+  PGAC_PATH_PROGS(CLANG, clang clang-17 clang-16 clang-15 clang-14)
   if test -z "$CLANG"; then
 AC_MSG_ERROR([clang not found, but required when compiling --with-llvm, specify with CLANG=])
   fi
@@ -115,8 +115,6 @@ AC_DEFUN([PGAC_CHECK_LLVM_FUNCTIONS],
   # Check which functionality is present
   SAVE_CPPFLAGS="$CPPFLAGS"
   CPPFLAGS="$CPPFLAGS $LLVM_CPPFLAGS"
-  AC_CHECK_DECLS([LLVMOrcGetSymbolAddressIn], [], [], [[#include ]])
-  AC_CHECK_DECLS([LLVMGetHostCPUName, LLVMGetHostCPUFeatures], [], [], [[#include ]])
   AC_CHECK_DECLS([LLVMCreateGDBRegistrationListener, LLVMCreatePerfJITEventListener], [], [], [[#include ]])
   CPPFLAGS="$SAVE_CPPFLAGS"
 ])# PGAC_CHECK_LLVM_FUNCTIONS
diff --git a/configure b/configure
index c2cb1b1b24..2d92e3dea3 100755
--- a/configure
+++ b/configure
@@ -5056,7 +5056,7 @@ if test "$with_llvm" = yes; then :
 
 
   if test -z "$LLVM_CONFIG"; then
-  for ac_prog in llvm-config llvm-config-7 llvm-config-6.0 llvm-config-5.0 llvm-config-4.0 llvm-config-3.9
+  for ac_prog in llvm-config llvm-config-17 llvm-config-16 llvm-config-15 llvm-config-14
 do
   # Extract the first word of "$ac_prog", so it can be a program name with args.
 set dummy $ac_prog; ac_word=$2
@@ -5120,8 +5120,8 @@ fi
 as_fn_error $? "$LLVM_CONFIG does not work" "$LINENO" 5
   fi
   # and whether the version is supported
-  if echo $pgac_llvm_version | $AWK -F '.' '{ if ($1 >= 4 || ($1 == 3 && $2 >= 9)) exit 1; else exit 0;}';then
-as_fn_error $? "$LLVM_CONFIG version is $pgac_llvm_version but at least 3.9 is required" "$LINENO" 5
+  if echo $pgac_llvm_version | $AWK -F '.' '{ if ($1 >= 14) exit 1; else exit 0;}';then
+as_fn_error $? "$LLVM_CONFIG version is $pgac_llvm_version but at least 14 is required" "$LINENO" 5
   fi
   { $as_echo "$as_me:${as_lineno-$LINENO}: using llvm $pgac_llvm_version" >&5
 $as_echo "$as_me: using llvm $pgac_llvm_version" >&6;}
@@ -5129,7 +5129,7 @@ $as_echo "$as_me: using llvm $pgac_llvm_version" >&6;}
   # need clang to create some bitcode files
 
   if test -z "$CLANG"; then
-  for ac_prog in clang clang-7 clang-6.0 clang-5.0 clang-4.0 clang-3.9
+  for ac_prog in clang clang-17 clang-16 clang-15 cl

Re: Row pattern recognition

2023-10-21 Thread Tatsuo Ishii
Attached is the v10 patch. This version enhances the performance of
pattern matching.  Previously it generated all possible pattern string
candidates, which resulted in an unnecessarily large number of
candidates. For example, if you have 2 pattern variables and the target
frame includes 100 rows, the number of candidates can reach 2^100
in the worst case. To avoid this, I do pruning in the v10
patch. Suppose you have:

PATTERN (A B+ C+)

Candidates like "BAC" "CAB" cannot survive because they never satisfy
the search pattern. To judge this, I assign sequence numbers (0, 1, 2)
to (A B C).  If the pattern generator tries to generate BA, this is
not allowed because the sequence number for B is 1 and for A is 0, and
0 < 1: B cannot be followed by A. Note that this technique can be
applied when the quantifiers are "+" or "*". Maybe other quantifiers
such as '?'  or '{n, m}' can be applied too but I don't confirm yet
because I have not implemented them yet.
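
To make the pruning rule concrete, here is a small self-contained
sketch of the check (illustration only; not the patch's actual code or
data structures):

#include <stdbool.h>
#include <stdio.h>

/*
 * Pattern variables are numbered in the order they appear in
 * PATTERN (A B+ C+): A=0, B=1, C=2.  A candidate assignment of rows to
 * variables (written as a string like "ABBC") can only match if the
 * sequence numbers never decrease, so "BAC" is pruned immediately.
 */
static bool
candidate_is_viable(const char *candidate)
{
	for (int i = 1; candidate[i] != '\0'; i++)
	{
		int		prev = candidate[i - 1] - 'A';
		int		cur = candidate[i] - 'A';

		if (cur < prev)
			return false;	/* e.g. 'A' right after 'B' can never satisfy A B+ C+ */
	}
	return true;
}

int
main(void)
{
	printf("ABBC: %s\n", candidate_is_viable("ABBC") ? "kept" : "pruned");
	printf("BAC:  %s\n", candidate_is_viable("BAC") ? "kept" : "pruned");
	return 0;
}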

Besides this improvement, I fixed a bug in the previous and older
patches: when an expression in DEFINE uses text operators, it errors
out:

ERROR:  could not determine which collation to use for string comparison
HINT:  Use the COLLATE clause to set the collation explicitly.

This was fixed by adding assign_expr_collations() in
transformDefineClause().

Also I have updated the documentation section "3.5. Window Functions":

- It still mentioned rpr(), which is no longer part of the patch.
- Enhanced the description of DEFINE and PATTERN.
- Mentioned that the quantifier '*' is supported.

Finally I have added more test cases to the regression test.
- same pattern variable appears twice
- case for quantifier '*'

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
>From a8b41dd155a2d9ce969083fcfc53bbae3c28b263 Mon Sep 17 00:00:00 2001
From: Tatsuo Ishii 
Date: Sun, 22 Oct 2023 11:22:10 +0900
Subject: [PATCH v10 1/7] Row pattern recognition patch for raw parser.

---
 src/backend/parser/gram.y   | 222 ++--
 src/include/nodes/parsenodes.h  |  56 
 src/include/parser/kwlist.h |   8 ++
 src/include/parser/parse_node.h |   1 +
 4 files changed, 273 insertions(+), 14 deletions(-)

diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c224df4ecc..e09eb061f8 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -251,6 +251,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 	DefElem	   *defelt;
 	SortBy	   *sortby;
 	WindowDef  *windef;
+	RPCommonSyntax	*rpcom;
+	RPSubsetItem	*rpsubset;
 	JoinExpr   *jexpr;
 	IndexElem  *ielem;
 	StatsElem  *selem;
@@ -278,6 +280,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 	MergeWhenClause *mergewhen;
 	struct KeyActions *keyactions;
 	struct KeyAction *keyaction;
+	RPSkipTo	skipto;
 }
 
 %type 	stmt toplevel_stmt schema_stmt routine_body_stmt
@@ -453,8 +456,12 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 TriggerTransitions TriggerReferencing
 vacuum_relation_list opt_vacuum_relation_list
 drop_option_list pub_obj_list
-
-%type 	opt_routine_body
+row_pattern_measure_list row_pattern_definition_list
+opt_row_pattern_subset_clause
+row_pattern_subset_list row_pattern_subset_rhs
+row_pattern
+%type 	 row_pattern_subset_item
+%type 	opt_routine_body row_pattern_term
 %type  group_clause
 %type 	group_by_list
 %type 	group_by_item empty_grouping_set rollup_clause cube_clause
@@ -551,6 +558,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 %type 	relation_expr_opt_alias
 %type 	tablesample_clause opt_repeatable_clause
 %type 	target_el set_target insert_column_item
+row_pattern_measure_item row_pattern_definition
+%type 	first_or_last
 
 %type 		generic_option_name
 %type 	generic_option_arg
@@ -633,6 +642,9 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 %type 	window_clause window_definition_list opt_partition_clause
 %type 	window_definition over_clause window_specification
 opt_frame_clause frame_extent frame_bound
+%type 	opt_row_pattern_common_syntax opt_row_pattern_skip_to
+%type 	opt_row_pattern_initial_or_seek
+%type 	opt_row_pattern_measures
 %type 	opt_window_exclusion_clause
 %type 		opt_existing_window_name
 %type  opt_if_not_exists
@@ -659,7 +671,6 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 json_object_constructor_null_clause_opt
 json_array_constructor_null_clause_opt
 
-
 /*
  * Non-keyword token types.  These are hard-wired into the "flex" lexer.
  * They must be listed first so that their numeric codes do not depend on
@@ -702,7 +713,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 	CURRENT_TIME CURRENT_TIMESTAMP CURRENT_USER CURSOR CYCLE
 
 	DATA_P DATABASE DAY_P DEALLOCATE DEC DECIMAL

Re: Guiding principle for dropping LLVM versions?

2023-10-21 Thread Xing Guo
Hi,

Can we also check whether clang's version is compatible with LLVM's
version in llvm.m4? I have multiple LLVM toolchains installed on my
system and I have to specify the $CLANG and $LLVM_CONFIG variables each
time I build the server against a toolchain that is not present in
$PATH. If one of the variables is missing, the build system will pick
up a default one whose version might not be compatible with the other.
E.g., if we use clang-16 and llvm-config-15, there will be issues when
creating the indexes for the bitcode files at the end of installation.

There will be errors that look like

```
LLVM ERROR: ThinLTO cannot create input file: Unknown attribute kind (86)
(Producer: 'LLVM16.0.6' Reader: 'LLVM 15.0.7')
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/
and include the crash backtrace.
Stack dump:
0.  Program arguments: /usr/lib/llvm15/bin/llvm-lto -thinlto
-thinlto-action=thinlink -o postgres.index.bc postgres/access/brin/brin.bc
postgres/access/brin/brin_bloom.bc postgres/acces
s/brin/brin_inclusion.bc postgres/access/brin/brin_minmax.bc
postgres/access/brin/brin_minmax_multi.bc
postgres/access/brin/brin_pageops.bc postgres/access/brin/brin_revmap.bc
postgres/acce
ss/brin/brin_tuple.bc postgres/access/brin/brin_validate.bc
postgres/access/brin/brin_xlog.bc postgres/access/common/attmap.bc
postgres/access/common/bufmask.bc postgres/access/common/detoa
st.bc postgres/access/common/heaptuple.bc
postgres/access/common/indextuple.bc postgres/access/common/printsimple.bc
postgres/access/common/printtup.bc postgres/access/common/relation.bc po
stgres/access/common/reloptions.bc postgres/access/common/scankey.bc
postgres/access/common/session.bc postgres/access/common/syncscan.bc
postgres/access/common/toast_compression.bc postgre
s/access/common/toast_internals.bc postgres/access/common/tupconvert.bc
postgres/access/common/tupdesc.bc postgres/access/gin/ginarrayproc.bc
postgres/access/gin/ginbtree.bc postgres/access
/gin/ginbulk.bc postgres/access/gin/gindatapage.bc
postgres/access/gin/ginentrypage.bc postgres/access/gin/ginfast.bc
postgres/access/gin/ginget.bc postgres/access/gin/gininsert.bc postgres
/access/gin/ginlogic.bc postgres/access/gin/ginpostinglist.bc
postgres/access/gin/ginscan.bc postgres/access/gin/ginutil.bc
postgres/access/gin/ginvacuum.bc postgres/access/gin/ginvalidate.
bc postgres/access/gin/ginxlog.bc postgres/access/gist/gist.bc
postgres/access/gist/gistbuild.bc postgres/access/gist/gistbuildbuffers.bc
postgres/access/gist/gistget.bc postgres/access/gis
t/gistproc.bc postgres/access/gist/gistscan.bc
postgres/access/gist/gistsplit.bc postgres/access/gist/gistutil.bc
postgres/access/gist/gistvacuum.bc postgres/access/gist/gistvalidate.bc pos
tgres/access/gist/gistxlog.bc postgres/access/hash/hash.bc
postgres/access/hash/hash_xlog.bc postgres/access/hash/hashfunc.bc
postgres/access/hash/hashinsert.bc postgres/access/hash/hashovf
l.bc postgres/access/hash/hashpage.bc postgres/access/hash/hashsearch.bc
postgres/access/hash/hashsort.bc postgres/access/hash/hashutil.bc
postgres/access/hash/hashvalidate.bc postgres/acce
ss/heap/heapam.bc postgres/access/heap/heapam_handler.bc
postgres/access/heap/heapam_visibility.bc postgres/access/heap/heaptoast.bc
postgres/access/heap/hio.bc postgres/access/heap/prunehe
```

If we can check the llvm-config and clang versions at the
configuration phase, we can detect the problem earlier.

Best Regards,
Xing

On Sun, Oct 22, 2023 at 10:07 AM Thomas Munro 
wrote:

> Rebased.  I also noticed this woefully out of date line:
>
> -  PGAC_PATH_PROGS(LLVM_CONFIG, llvm-config llvm-config-7
> llvm-config-6.0 llvm-config-5.0 llvm-config-4.0 llvm-config-3.9)
> +  PGAC_PATH_PROGS(LLVM_CONFIG, llvm-config llvm-config-17
> llvm-config-16 llvm-config-15 llvm-config-14)
>