Custom explain options
Hi hackers,

The EXPLAIN statement has a list of options (e.g. ANALYZE, BUFFERS, COSTS, ...)
which provide useful details about query execution. In Neon we have added a
PREFETCH option which shows information about page prefetching during query
execution (prefetching is more critical for the Neon architecture because of
the separation of compute and storage, so it is implemented not only for
bitmap heap scan, as in vanilla Postgres, but also for seqscan, index scan and
index-only scan). Another possible candidate for an EXPLAIN option is the
local file cache (an extra caching layer above shared buffers, used as a
partial replacement for the file system cache of standalone Postgres).

I think it would be nice to have a generic mechanism which allows extensions
to add their own options to EXPLAIN. I have attached a patch with an
implementation of such a mechanism (also available as a PR:
https://github.com/knizhnik/postgres/pull/1 ).

I have demonstrated this mechanism using the Bloom extension - just to report
the number of Bloom matches. I am not sure this is really useful information;
it is used mostly as an example:

explain (analyze,bloom) select * from t where pk=2000;
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on t  (cost=15348.00..15352.01 rows=1 width=4) (actual time=25.244..25.939 rows=1 loops=1)
   Recheck Cond: (pk = 2000)
   Rows Removed by Index Recheck: 292
   Heap Blocks: exact=283
   Bloom: matches=293
   ->  Bitmap Index Scan on t_pk_idx  (cost=0.00..15348.00 rows=1 width=0) (actual time=25.147..25.147 rows=293 loops=1)
         Index Cond: (pk = 2000)
         Bloom: matches=293
 Planning:
   Bloom: matches=0
 Planning Time: 0.387 ms
 Execution Time: 26.053 ms
(12 rows)

There are two known issues with this proposal:

1. I have to limit the total size of all custom metrics - right now it is
limited to 128 bytes. This is done to keep Instrumentation and some other data
structures fixed size; otherwise maintaining the varying parts of this
structure is ugly, especially in shared memory.

2.
A custom extension is registered by means of the RegisterCustomInsrumentation
function, which is called from _PG_init. But _PG_init is called when the
extension is loaded, and an extension is loaded on demand when one of its
functions is first called (unless the extension is included in the
shared_preload_libraries list, which the Bloom extension doesn't require). So
if the first statement executed in your session is:

explain (analyze,bloom) select * from t where pk=2000;

...you will get the error:

ERROR:  unrecognized EXPLAIN option "bloom"
LINE 1: explain (analyze,bloom) select * from t where pk=2000;

It happens because at the moment the EXPLAIN statement parses its options, the
Bloom index has not yet been selected, so the bloom extension is not loaded
and RegisterCustomInsrumentation has not yet been called. If we repeat the
query, the proper result is displayed (see above).

diff --git a/contrib/bloom/bloom.h b/contrib/bloom/bloom.h
index 330811ec60..af23ffe821 100644
--- a/contrib/bloom/bloom.h
+++ b/contrib/bloom/bloom.h
@@ -174,6 +174,13 @@ typedef struct BloomScanOpaqueData
 
 typedef BloomScanOpaqueData *BloomScanOpaque;
 
+typedef struct
+{
+    uint64      matches;
+} BloomUsage;
+
+extern BloomUsage bloomUsage;
+
 /* blutils.c */
 extern void initBloomState(BloomState *state, Relation index);
 extern void BloomFillMetapage(Relation index, Page metaPage);
diff --git a/contrib/bloom/blscan.c b/contrib/bloom/blscan.c
index 61d1f66b38..8890951943 100644
--- a/contrib/bloom/blscan.c
+++ b/contrib/bloom/blscan.c
@@ -166,6 +166,6 @@ blgetbitmap(IndexScanDesc scan, TIDBitmap *tbm)
         CHECK_FOR_INTERRUPTS();
     }
     FreeAccessStrategy(bas);
-
+    bloomUsage.matches += ntids;
     return ntids;
 }
diff --git a/contrib/bloom/blutils.c b/contrib/bloom/blutils.c
index f23fbb1d9e..d795875920 100644
--- a/contrib/bloom/blutils.c
+++ b/contrib/bloom/blutils.c
@@ -18,6 +18,7 @@
 #include "access/reloptions.h"
 #include "bloom.h"
 #include "catalog/index.h"
+#include "commands/explain.h"
 #include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "storage/bufmgr.h"
@@ -34,6 +35,55 @@
 PG_FUNCTION_INFO_V1(blhandler);
 
+BloomUsage bloomUsage;
+
+static void
+bloomUsageAdd(CustomResourceUsage* dst, CustomResourceUsage const* add)
+{
+    ((BloomUsage*)dst)->matches += ((BloomUsage*)add)->matches;
+}
+
+static void
+bloomUsageAccum(CustomResourceUsage* acc, CustomResourceUsage const* end, CustomResourceUsage const* start)
+{
+    ((BloomUsage*)acc)->matches += ((BloomUsage*)end)->matches - ((BloomUsage*)start)->matches;
+}
+
+static void
+bloomUsageShow(ExplainState* es, CustomResourceUsage const* usage, bool planning)
+{
+    if (es->format == EXPLAIN_FORMAT_TEXT)
+    {
+        if (planning)
+        {
+            ExplainIndentText(es);
Re: Remove extraneous break condition in logical slot advance function
On Fri, Oct 20, 2023 at 7:30 PM Bharath Rupireddy wrote:
>
> Hi,
>
> There exists an extraneous break condition in
> pg_logical_replication_slot_advance(). When the end of WAL or moveto
> LSN is reached, the main while condition helps to exit the loop, so no
> separate break condition is needed. Attached patch removes it.
>
> Thoughts?

+1 for the patch.

The only advantage I see of the code as it stands right now is that it
avoids one last call to CHECK_FOR_INTERRUPTS() by break'ing early. I
don't think we'd lose much in terms of performance by making one (very
cheap, in common case) extra call of this macro.

Best regards,
Gurjeet
http://Gurje.et
Re: Remove extraneous break condition in logical slot advance function
Gurjeet Singh writes:
> On Fri, Oct 20, 2023 at 7:30 PM Bharath Rupireddy wrote:
>> There exists an extraneous break condition in
>> pg_logical_replication_slot_advance(). When the end of WAL or moveto
>> LSN is reached, the main while condition helps to exit the loop, so no
>> separate break condition is needed. Attached patch removes it.

> The only advantage I see of the code as it stands right now is that it
> avoids one last call to CHECK_FOR_INTERRUPTS() by break'ing early. I
> don't think we'd lose much in terms of performance by making one (very
> cheap, in common case) extra call of this macro.

Agreed, bypassing the last CHECK_FOR_INTERRUPTS() shouldn't save
anything noticeable. Could there be a correctness argument for it
though? Can't see what. We should assume that CFIs might happen down
inside LogicalDecodingProcessRecord.

I wondered why the code looks like this, and whether there used to be
more of a reason for it. "git blame" reveals the probable answer: when
this code was added, in 9c7d06d60, the loop condition was different so
the break was necessary. 38a957316 simplified the loop condition to
what we see today, but didn't notice that the break was thereby made
pointless.

While we're here ... the comment above the loop seems wrong already,
and this makes it more so. I suggest something like

-	/* Decode at least one record, until we run out of records */
+	/* Decode records until we reach the requested target */
 	while (ctx->reader->EndRecPtr < moveto)

			regards, tom lane
Re: Improve WALRead() to suck data directly from WAL buffers when possible
On Fri, Oct 20, 2023 at 10:19 PM Bharath Rupireddy wrote:
>
> On Thu, Oct 12, 2023 at 4:13 AM Andres Freund wrote:
> >
> > On 2023-10-03 16:05:32 -0700, Jeff Davis wrote:
> > > On Sat, 2023-01-14 at 12:34 -0800, Andres Freund wrote:
> > > > One benefit would be that it'd make it more realistic to use direct
> > > > IO for WAL - for which I have seen significant performance benefits.
> > > > But when we afterwards have to re-read it from disk to replicate,
> > > > it's less clearly a win.
> > >
> > > Does this patch still look like a good fit for your (or someone else's)
> > > plans for direct IO here? If so, would committing this soon make it
> > > easier to make progress on that, or should we wait until it's actually
> > > needed?
> >
> > I think it'd be quite useful to have. Even with the code as of 16, I see
> > better performance in some workloads with debug_io_direct=wal,
> > wal_sync_method=open_datasync compared to any other configuration.
> > Except of course that it makes walsenders more problematic, as they
> > suddenly require read IO. Thus having support for walsenders to send
> > directly from wal buffers would be beneficial, even without further AIO
> > infrastructure.
>
> I'm attaching the v11 patch set with the following changes:
> - Improved input validation in the function that reads WAL from WAL
>   buffers in 0001 patch.
> - Improved test module's code in 0002 patch.
> - Modernized meson build file in 0002 patch.
> - Added commit messages for both the patches.
> - Ran pgindent on both the patches.
>
> Any thoughts are welcome.

I'm attaching the v12 patch set with just pgperltidy run on the new TAP
test added in 0002. No other changes from the v11 patch set.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

From c240d914967261e462290b714ec6ae2803d72442 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy
Date: Sat, 21 Oct 2023 13:50:15 +
Subject: [PATCH v12] Allow WAL reading from WAL buffers

This commit adds to WALRead() the capability to read WAL from WAL
buffers when possible. When the requested WAL isn't available in WAL
buffers, the WAL is read from the WAL file as usual.

It relies on WALBufMappingLock so that no one replaces the WAL buffer
page that we're reading from. It skips reading from WAL buffers if
WALBufMappingLock can't be acquired immediately. In other words, it
doesn't wait for WALBufMappingLock to be available. This helps reduce
the contention on WALBufMappingLock.

This commit benefits the callers of WALRead(), that are walsenders and
pg_walinspect. They can now avoid reading WAL from the WAL file
(possibly avoiding disk IO). Tests show that the WAL buffers hit ratio
stood at 95% for 1 primary, 1 sync standby, 1 async standby, with
pgbench --scale=300 --client=32 --time=900. In other words, the
walsenders avoided 95% of the time reading from the file/avoided pread
system calls:
https://www.postgresql.org/message-id/CALj2ACXKKK%3DwbiG5_t6dGao5GoecMwRkhr7GjVBM_jg54%2BNa%3DQ%40mail.gmail.com

This commit also benefits when direct IO is enabled for WAL. Reading
WAL from WAL buffers puts back the performance close to that of
without direct IO for WAL:
https://www.postgresql.org/message-id/CALj2ACV6rS%2B7iZx5%2BoAvyXJaN4AG-djAQeM1mrM%3DYSDkVrUs7g%40mail.gmail.com

This commit also paves the way for the following features in future:
- Improves synchronous replication performance by replicating directly
  from WAL buffers.
- An opt-in way for the walreceivers to receive unflushed WAL.
More details here:
https://www.postgresql.org/message-id/20231011224353.cl7c2s222dw3de4j%40awork3.anarazel.de

Author: Bharath Rupireddy
Reviewed-by: Dilip Kumar, Andres Freund
Reviewed-by: Nathan Bossart, Kuntal Ghosh
Discussion: https://www.postgresql.org/message-id/CALj2ACXKKK%3DwbiG5_t6dGao5GoecMwRkhr7GjVBM_jg54%2BNa%3DQ%40mail.gmail.com
---
 src/backend/access/transam/xlog.c       | 208
 src/backend/access/transam/xlogreader.c |  45 -
 src/include/access/xlog.h               |   6 +
 3 files changed, 257 insertions(+), 2 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index cea13e3d58..9553a880f1 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -1706,6 +1706,214 @@ GetXLogBuffer(XLogRecPtr ptr, TimeLineID tli)
 	return cachedPos + ptr % XLOG_BLCKSZ;
 }
 
+/*
+ * Read WAL from WAL buffers.
+ *
+ * Read 'count' bytes of WAL from WAL buffers into 'buf', starting at location
+ * 'startptr', on timeline 'tli' and return total read bytes.
+ *
+ * Note that this function reads as much as it can from WAL buffers, meaning,
+ * it may not read all the requested 'count' bytes. Caller must be aware of
+ * this and deal with it.
+ *
+ * Note that this function is not available for frontend code as WAL buffers is
+ * an internal mechanism to the server.
+ */
+Size
+XLogReadFromBuffers(XLogReaderState *state,
+					XL
Re: Switching XLog source from archive to streaming when primary available
On Fri, Jul 21, 2023 at 12:38 PM Bharath Rupireddy wrote:
>
> Needed a rebase. I'm attaching the v13 patch for further consideration.

Needed a rebase. I'm attaching the v14 patch. It also has the following
changes:
- Ran pgindent on the new source code.
- Ran pgperltidy on the new TAP test.
- Improved the newly added TAP test a bit. Used the new wait_for_log
  core TAP function in place of the custom find_in_log.

Thoughts?

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

From e6010e3b2e4c52d32a5d2f3eb2d59954617b221b Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy
Date: Sat, 21 Oct 2023 13:41:46 +
Subject: [PATCH v14] Allow standby to switch WAL source from archive to
 streaming

A standby typically switches to streaming replication (getting WAL
from the primary) only when receiving WAL from the archive finishes
(no more WAL left there) or fails for any reason. Reading WAL from the
archive may not always be as efficient and fast as reading from the
primary. This can be due to the differences in disk types, IO costs,
network latencies etc., all impacting recovery performance on the
standby. And, while the standby is reading WAL from the archive, the
primary accumulates WAL because the standby's replication slot stays
inactive. To avoid these problems, one can use this parameter to make
the standby switch to stream mode sooner.

This feature adds a new GUC that specifies the amount of time after
which the standby attempts to switch the WAL source from WAL archive
to streaming replication (getting WAL from the primary). However, the
standby exhausts all the WAL present in pg_wal before switching. If
the standby fails to switch to stream mode, it falls back to archive
mode.
Reported-by: SATYANARAYANA NARLAPURAM
Author: Bharath Rupireddy
Reviewed-by: Cary Huang, Nathan Bossart
Reviewed-by: Kyotaro Horiguchi
Discussion: https://www.postgresql.org/message-id/CAHg+QDdLmfpS0n0U3U+e+dw7X7jjEOsJJ0aLEsrtxs-tUyf5Ag@mail.gmail.com
---
 doc/src/sgml/config.sgml                      |  45 ++
 doc/src/sgml/high-availability.sgml           |  15 +-
 src/backend/access/transam/xlogrecovery.c     | 146 --
 src/backend/utils/misc/guc_tables.c           |  12 ++
 src/backend/utils/misc/postgresql.conf.sample |   4 +
 src/include/access/xlogrecovery.h             |   1 +
 src/test/recovery/meson.build                 |   1 +
 src/test/recovery/t/040_wal_source_switch.pl  | 130
 8 files changed, 333 insertions(+), 21 deletions(-)
 create mode 100644 src/test/recovery/t/040_wal_source_switch.pl

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 3839c72c86..3a18ba9b26 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4791,6 +4791,51 @@ ANY num_sync (
+ streaming_replication_retry_interval (integer)
+
+ streaming_replication_retry_interval configuration parameter
+
+
+
+
+Specifies amount of time after which standby attempts to switch WAL
+source from WAL archive to streaming replication (getting WAL from
+primary). However, standby exhausts all the WAL present in pg_wal
+before switching. If standby fails to switch to stream mode, it falls
+back to archive mode. If this parameter's value is specified without
+units, it is taken as milliseconds. Default is five minutes
+(5min). With a lower value for this parameter,
+standby makes frequent WAL source switch attempts. To avoid this, it is
+recommended to set a reasonable value. A setting of 0
+disables the feature. When disabled, standby typically switches to
+stream mode only when receive from WAL archive finishes (no more WAL
+left there) or fails for any reason. This parameter can only be set in
+the postgresql.conf file or on the server command
+line.
+
+
+
+ Standby may not always attempt to switch source from WAL archive to
+ streaming replication at exact streaming_replication_retry_interval
+ intervals. For example, if the parameter is set to 1min
+ and fetching WAL file from archive takes about 2min,
+ then the source switch attempt happens for the next WAL file after
+ current WAL file fetched from archive is fully applied.
+
+
+
+Reading WAL from archive may not always be as efficient and fast as
+reading from primary. This can be due to the differences in disk types,
+IO costs, network latencies etc., all impacting recovery performance on
+standby. And, while standby is reading WAL from archive, primary
+accumulates WAL because the standby's replication slot stays inactive.
+To avoid these problems, one can use this parameter to make standby
+switch to stream mode sooner.
+
+
+
 recovery_min_apply_delay (integer)
diff --git a/doc/src/sgml/high-
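As a concrete illustration of the proposed GUC (hypothetical until the patch is committed; the value is just an example), a standby's postgresql.conf could contain:

```
# Try switching the WAL source from archive back to streaming replication
# every minute; 0 keeps today's behaviour (switch only when the archive
# is exhausted or fetching from it fails).
streaming_replication_retry_interval = '1min'
```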
Re: Removing unneeded self joins
On Thu, Oct 19, 2023 at 6:16 AM Andrei Lepikhov wrote:
> On 19/10/2023 01:50, Alexander Korotkov wrote:
> > This query took 3778.432 ms with self-join removal disabled, and
> > 3756.009 ms with self-join removal enabled. So, no measurable
> > overhead. Similar to the higher number of joins. Can you imagine
> > some extreme case when self-join removal could introduce significant
> > overhead in comparison with other optimizer parts? If not, should we
> > remove self_join_search_limit GUC?
> Thanks,
> It was Zhihong Yu who worried about that case [1]. And my purpose was to
> show a method to avoid such a problem if it would be needed.
> I guess the main idea here is that we have a lot of self-joins, but only
> few of them (or no one) can be removed.
> I can't imagine a practical situation when we can be stuck in the
> problems here. So, I vote to remove this GUC.

I've removed the self_join_search_limit. Anyway, there is
enable_self_join_removal if the self-join removal algorithm causes any
problems. I also did some grammar corrections for the comments. I think
the patch is getting to a committable shape. I noticed some failures on
commitfest.cputube.org. I'd like to check how this version will pass it.

--
Regards,
Alexander Korotkov

0001-Remove-useless-self-joins-v46.patch
Description: Binary data
Re: LLVM 16 (opaque pointers)
On Sat, Oct 21, 2023 at 7:08 PM Andres Freund wrote:
> I've attached a patch revision that I spent the last couple hours working
> on. It's very very roughly based on a patch Tom Stellard had written (which I
> think a few rpm packages use). But instead of encoding details about specific
> layout details, I made the code check if the data layout works and fall back
> to the cpu / features used for llvmjit_types.bc. This way it's not s390x
> specific, future odd architecture behaviour would "automatically" be handled
> the same

The explanation makes sense and this seems like a solid plan to deal
with it. I didn't try on a s390x, but I tested locally on our master
branch with LLVM 7, 10, 17, 18, and then I hacked your patch to take
the fallback path as if a layout mismatch had been detected, and it
worked fine:

2023-10-22 11:49:55.663 NZDT [12000] DEBUG:  detected CPU "skylake", with features "...", resulting in layout "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
2023-10-22 11:49:55.664 NZDT [12000] DEBUG:  detected CPU / features yield incompatible data layout, using values from module instead
2023-10-22 11:49:55.664 NZDT [12000] DETAIL:  module CPU "x86-64", features "...", resulting in layout "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"

+To deal with that, check if data layouts match during JIT initialization. If
+the runtime detected cpu / features result in a different layout, try if the
+cpu/features recorded in in llvmjit_types.bc work.

s|try |check |
s| in in | in |

+			errmsg_internal("could not determine working CPU / feature comination for JIT compilation"),

s|comination|combination|

s| / |/|g
Re: LLVM 16 (opaque pointers)
On Sat, Oct 21, 2023 at 2:45 PM Tom Lane wrote: > Thomas Munro writes: > > (It'd be nice if the > > build farm logged "$LLVM_CONFIG --version" somewhere.) > > It's not really the buildfarm script's responsibility to do that, > but feel free to make configure do so. Done, copying the example of how we do it for perl and various other things.
Re: The documentation for storage type 'plain' actually allows single byte header
On Fri, Oct 20, 2023 at 09:48:05PM -0400, Bruce Momjian wrote:
> Here is the original thread from pgsql-docs:
>
> https://www.postgresql.org/message-id/flat/167336599095.2667301.15497893107226841625%40wrigleys.postgresql.org
>
> The report is about single-byte headers being used for varlena values
> with PLAIN storage. Here is the reproducible report:
>
>	CREATE EXTENSION pageinspect;
>	CREATE TABLE test(a VARCHAR(1) STORAGE PLAIN);
>	INSERT INTO test VALUES (repeat('A',10));
>
>	Now peek into the page with pageinspect functions:
>
>	SELECT left(encode(t_data, 'hex'), 40) FROM
>	heap_page_items(get_raw_page('test', 0));
>
>	This returned a value of "1741414141414141414141". Here the first
>	byte 0x17 = 0001 0111 in binary. Length + 1 is stored in the length
>	bits (1-7). So Len = 0001011 - 1 = (11-1) [base-10] = 10 [base-10],
>	which exactly matches the expected length. Further, the data "41"
>	repeated 10 times also indicates character A (65 or 0x41 in ASCII)
>	repeated 10 times.
>
> I researched this and thought it would be a case where we were lacking a
> check before creating a single-byte header, but I couldn't find anything
> missing. I think the problem is that the _source_ tupleDesc attstorage
> attribute is being used to decide if we should use a short header, while
> it is really the storage type of the destination that we should be
> checking. Unfortunately, I don't think the destination is accessible at
> the location where we are deciding about a short header.
>
> I am confused how to proceed. I feel we need to fully understand why
> this is happening before we adjust anything. Here is a backtrace --- the
> short header is being created in fill_val() and the attstorage value
> there is 'x'/EXTENDED.

I did some more research. It turns out that the source slot/planSlot is
populating its pg_attribute information via makeTargetEntry() and it has
no concept of a storage type.
Digging further, I found that we cannot get rid of the use of
att->attstorage != TYPSTORAGE_PLAIN in the ATT_IS_PACKABLE and
VARLENA_ATT_IS_PACKABLE macros in src/backend/access/common/heaptuple.c
because there are internal uses of fill_val() that can't handle packed
varlena headers.

I ended up with a doc patch that adds a C comment about this odd
behavior and removes the doc text about PLAIN storage not using packed
headers.

--
Bruce Momjian        https://momjian.us
EDB                  https://enterprisedb.com

Only you can decide what is important to you.
Re: The documentation for storage type 'plain' actually allows single byte header
On Sat, Oct 21, 2023 at 09:56:13PM -0400, Bruce Momjian wrote:
> I did some more research. It turns out that the source slot/planSlot is
> populating its pg_attribute information via makeTargetEntry() and it
> has no concept of a storage type.
>
> Digging further, I found that we cannot get rid of the use of
> att->attstorage != TYPSTORAGE_PLAIN in the ATT_IS_PACKABLE and
> VARLENA_ATT_IS_PACKABLE macros in src/backend/access/common/heaptuple.c
> because there are internal uses of fill_val() that can't handle packed
> varlena headers.
>
> I ended up with a doc patch that adds a C comment about this odd
> behavior and removes the doc text about PLAIN storage not using packed
> headers.

Oops, patch attached.

--
Bruce Momjian        https://momjian.us
EDB                  https://enterprisedb.com

Only you can decide what is important to you.

diff --git a/doc/src/sgml/storage.sgml b/doc/src/sgml/storage.sgml
index 148fb1b49d..3ea4e5526d 100644
--- a/doc/src/sgml/storage.sgml
+++ b/doc/src/sgml/storage.sgml
@@ -456,9 +456,7 @@ for storing TOAST-able columns on disk:
      PLAIN prevents either compression or
-     out-of-line storage; furthermore it disables use of single-byte headers
-     for varlena types.
-     This is the only possible strategy for
+     out-of-line storage. This is the only possible strategy for
      columns of non-TOAST-able data types.
diff --git a/src/backend/access/common/heaptuple.c b/src/backend/access/common/heaptuple.c
index d6a4ddfd51..c52d40dce0 100644
--- a/src/backend/access/common/heaptuple.c
+++ b/src/backend/access/common/heaptuple.c
@@ -68,7 +68,16 @@
 #include "utils/memutils.h"
 
 
-/* Does att's datatype allow packing into the 1-byte-header varlena format? */
+/*
+ * Does att's datatype allow packing into the 1-byte-header varlena format?
+ * While functions that use TupleDescAttr() and assign attstorage =
+ * TYPSTORAGE_PLAIN cannot use packed varlena headers, functions that call
+ * TupleDescInitEntry() use typeForm->typstorage (TYPSTORAGE_EXTENDED) and
+ * can use packed varlena headers, e.g.:
+ *		CREATE TABLE test(a VARCHAR(1) STORAGE PLAIN);
+ *		INSERT INTO test VALUES (repeat('A',10));
+ * This can be verified with pageinspect.
+ */
 #define ATT_IS_PACKABLE(att) \
 	((att)->attlen == -1 && (att)->attstorage != TYPSTORAGE_PLAIN)
 /* Use this if it's already known varlena */
diff --git a/src/backend/utils/adt/rangetypes.c b/src/backend/utils/adt/rangetypes.c
index d65e5625c7..24bad52923 100644
--- a/src/backend/utils/adt/rangetypes.c
+++ b/src/backend/utils/adt/rangetypes.c
@@ -2608,7 +2608,8 @@ range_contains_elem_internal(TypeCacheEntry *typcache, const RangeType *r, Datum
  * values into a range object. They are modeled after heaptuple.c's
  * heap_compute_data_size() and heap_fill_tuple(), but we need not handle
  * null values here. TYPE_IS_PACKABLE must test the same conditions as
- * heaptuple.c's ATT_IS_PACKABLE macro.
+ * heaptuple.c's ATT_IS_PACKABLE macro. See the comments there for more
+ * details.
  */
 
 /* Does datatype allow packing into the 1-byte-header varlena format? */
Re: Guiding principle for dropping LLVM versions?
Rebased. I also noticed this woefully out of date line:

-  PGAC_PATH_PROGS(LLVM_CONFIG, llvm-config llvm-config-7 llvm-config-6.0 llvm-config-5.0 llvm-config-4.0 llvm-config-3.9)
+  PGAC_PATH_PROGS(LLVM_CONFIG, llvm-config llvm-config-17 llvm-config-16 llvm-config-15 llvm-config-14)

From 522a4f3744e691f42f43bb1ca91df60b12465bff Mon Sep 17 00:00:00 2001
From: Thomas Munro
Date: Thu, 19 Oct 2023 04:45:46 +1300
Subject: [PATCH v3 1/2] jit: Require at least LLVM 14, if enabled.

Remove support for a lot of older LLVM versions dating back to 3.9.
The default on common software distributions will be at least LLVM 14
when PostgreSQL 17 ships.

Discussion: https://postgr.es/m/CA%2BhUKGLhNs5geZaVNj2EJ79Dx9W8fyWUU3HxcpZy55sMGcY%3DiA%40mail.gmail.com
---
 config/llvm.m4                          |  10 +-
 configure                               |  43 +--
 doc/src/sgml/installation.sgml          |   4 +-
 meson.build                             |   2 +-
 src/backend/jit/llvm/llvmjit.c          | 158 +---
 src/backend/jit/llvm/llvmjit_error.cpp  |  35 --
 src/backend/jit/llvm/llvmjit_expr.c     |   6 +-
 src/backend/jit/llvm/llvmjit_inline.cpp |  51 +---
 src/backend/jit/llvm/llvmjit_wrap.cpp   |  65 --
 src/include/jit/llvmjit.h               |  17 ---
 src/include/pg_config.h.in              |  12 --
 src/tools/msvc/Solution.pm              |   3 -
 12 files changed, 15 insertions(+), 391 deletions(-)

diff --git a/config/llvm.m4 b/config/llvm.m4
index 21d8cd4f90..ede5e25e91 100644
--- a/config/llvm.m4
+++ b/config/llvm.m4
@@ -13,7 +13,7 @@ AC_DEFUN([PGAC_LLVM_SUPPORT],
   AC_REQUIRE([AC_PROG_AWK])
 
   AC_ARG_VAR(LLVM_CONFIG, [path to llvm-config command])
-  PGAC_PATH_PROGS(LLVM_CONFIG, llvm-config llvm-config-7 llvm-config-6.0 llvm-config-5.0 llvm-config-4.0 llvm-config-3.9)
+  PGAC_PATH_PROGS(LLVM_CONFIG, llvm-config llvm-config-17 llvm-config-16 llvm-config-15 llvm-config-14)
 
   # no point continuing if llvm wasn't found
   if test -z "$LLVM_CONFIG"; then
@@ -25,14 +25,14 @@ AC_DEFUN([PGAC_LLVM_SUPPORT],
     AC_MSG_ERROR([$LLVM_CONFIG does not work])
   fi
   # and whether the version is supported
-  if echo $pgac_llvm_version | $AWK -F '.'
'{ if ([$]1 >= 4 || ([$]1 == 3 && [$]2 >= 9)) exit 1; else exit 0;}';then -AC_MSG_ERROR([$LLVM_CONFIG version is $pgac_llvm_version but at least 3.9 is required]) + if echo $pgac_llvm_version | $AWK -F '.' '{ if ([$]1 >= 14) exit 1; else exit 0;}';then +AC_MSG_ERROR([$LLVM_CONFIG version is $pgac_llvm_version but at least 14 is required]) fi AC_MSG_NOTICE([using llvm $pgac_llvm_version]) # need clang to create some bitcode files AC_ARG_VAR(CLANG, [path to clang compiler to generate bitcode]) - PGAC_PATH_PROGS(CLANG, clang clang-7 clang-6.0 clang-5.0 clang-4.0 clang-3.9) + PGAC_PATH_PROGS(CLANG, clang clang-17 clang-16 clang-15 clang-14) if test -z "$CLANG"; then AC_MSG_ERROR([clang not found, but required when compiling --with-llvm, specify with CLANG=]) fi @@ -115,8 +115,6 @@ AC_DEFUN([PGAC_CHECK_LLVM_FUNCTIONS], # Check which functionality is present SAVE_CPPFLAGS="$CPPFLAGS" CPPFLAGS="$CPPFLAGS $LLVM_CPPFLAGS" - AC_CHECK_DECLS([LLVMOrcGetSymbolAddressIn], [], [], [[#include ]]) - AC_CHECK_DECLS([LLVMGetHostCPUName, LLVMGetHostCPUFeatures], [], [], [[#include ]]) AC_CHECK_DECLS([LLVMCreateGDBRegistrationListener, LLVMCreatePerfJITEventListener], [], [], [[#include ]]) CPPFLAGS="$SAVE_CPPFLAGS" ])# PGAC_CHECK_LLVM_FUNCTIONS diff --git a/configure b/configure index c2cb1b1b24..2d92e3dea3 100755 --- a/configure +++ b/configure @@ -5056,7 +5056,7 @@ if test "$with_llvm" = yes; then : if test -z "$LLVM_CONFIG"; then - for ac_prog in llvm-config llvm-config-7 llvm-config-6.0 llvm-config-5.0 llvm-config-4.0 llvm-config-3.9 + for ac_prog in llvm-config llvm-config-17 llvm-config-16 llvm-config-15 llvm-config-14 do # Extract the first word of "$ac_prog", so it can be a program name with args. set dummy $ac_prog; ac_word=$2 @@ -5120,8 +5120,8 @@ fi as_fn_error $? "$LLVM_CONFIG does not work" "$LINENO" 5 fi # and whether the version is supported - if echo $pgac_llvm_version | $AWK -F '.' '{ if ($1 >= 4 || ($1 == 3 && $2 >= 9)) exit 1; else exit 0;}';then -as_fn_error $? 
"$LLVM_CONFIG version is $pgac_llvm_version but at least 3.9 is required" "$LINENO" 5 + if echo $pgac_llvm_version | $AWK -F '.' '{ if ($1 >= 14) exit 1; else exit 0;}';then +as_fn_error $? "$LLVM_CONFIG version is $pgac_llvm_version but at least 14 is required" "$LINENO" 5 fi { $as_echo "$as_me:${as_lineno-$LINENO}: using llvm $pgac_llvm_version" >&5 $as_echo "$as_me: using llvm $pgac_llvm_version" >&6;} @@ -5129,7 +5129,7 @@ $as_echo "$as_me: using llvm $pgac_llvm_version" >&6;} # need clang to create some bitcode files if test -z "$CLANG"; then - for ac_prog in clang clang-7 clang-6.0 clang-5.0 clang-4.0 clang-3.9 + for ac_prog in clang clang-17 clang-16 clang-15 cl
Re: Row pattern recognition
Attached is the v10 patch. This version enhances the performance of
pattern matching.

Previously it generated all possible pattern string candidates, which
resulted in an unnecessarily large number of candidates. For example,
if you have 2 pattern variables and the target frame includes 100 rows,
the number of candidates can reach 2^100 in the worst case. To avoid
this, I do a pruning in the v10 patch. Suppose you have:

PATTERN (A B+ C+)

Candidates like "BAC" or "CAB" cannot survive because they never
satisfy the search pattern. To judge this, I assign sequence numbers
(0, 1, 2) to (A B C). If the pattern generator tries to generate BA,
this is not allowed because the sequence number for B is 1 and for A is
0, and 0 < 1: B cannot be followed by A. Note that this technique can
be applied when the quantifiers are "+" or "*". Maybe other quantifiers
such as '?' or '{n, m}' can be handled too, but I cannot confirm that
yet because I have not implemented them.

Besides this improvement, I fixed a bug in the previous and older
patches: when an expression in DEFINE uses text operators, it errors
out:

ERROR:  could not determine which collation to use for string comparison
HINT:  Use the COLLATE clause to set the collation explicitly.

This was fixed by adding assign_expr_collations() in
transformDefineClause().

Also I have updated documentation "3.5. Window Functions":
- It still mentioned rpr(). That is not applicable anymore.
- Enhanced the description of DEFINE and PATTERN.
- Mentioned that quantifier '*' is supported.

Finally I have added more test cases to the regression test:
- same pattern variable appears twice
- case for quantifier '*'

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese: http://www.sraoss.co.jp

>From a8b41dd155a2d9ce969083fcfc53bbae3c28b263 Mon Sep 17 00:00:00 2001
From: Tatsuo Ishii
Date: Sun, 22 Oct 2023 11:22:10 +0900
Subject: [PATCH v10 1/7] Row pattern recognition patch for raw parser.
---
 src/backend/parser/gram.y       | 222 ++--
 src/include/nodes/parsenodes.h  |  56
 src/include/parser/kwlist.h     |   8 ++
 src/include/parser/parse_node.h |   1 +
 4 files changed, 273 insertions(+), 14 deletions(-)

diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c224df4ecc..e09eb061f8 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -251,6 +251,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 	DefElem	   *defelt;
 	SortBy	   *sortby;
 	WindowDef  *windef;
+	RPCommonSyntax	*rpcom;
+	RPSubsetItem	*rpsubset;
 	JoinExpr   *jexpr;
 	IndexElem  *ielem;
 	StatsElem  *selem;
@@ -278,6 +280,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 	MergeWhenClause *mergewhen;
 	struct KeyActions *keyactions;
 	struct KeyAction *keyaction;
+	RPSkipTo	skipto;
 }

 %type stmt toplevel_stmt schema_stmt routine_body_stmt
@@ -453,8 +456,12 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 				TriggerTransitions TriggerReferencing
 				vacuum_relation_list opt_vacuum_relation_list
 				drop_option_list pub_obj_list
-
-%type opt_routine_body
+				row_pattern_measure_list row_pattern_definition_list
+				opt_row_pattern_subset_clause
+				row_pattern_subset_list row_pattern_subset_rhs
+				row_pattern
+%type row_pattern_subset_item
+%type opt_routine_body row_pattern_term
 %type group_clause
 %type group_by_list
 %type group_by_item empty_grouping_set rollup_clause cube_clause
@@ -551,6 +558,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 %type relation_expr_opt_alias
 %type tablesample_clause opt_repeatable_clause
 %type target_el set_target insert_column_item
+				row_pattern_measure_item row_pattern_definition
+%type first_or_last
 %type generic_option_name
 %type generic_option_arg
@@ -633,6 +642,9 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 %type window_clause window_definition_list opt_partition_clause
 %type window_definition
				over_clause window_specification opt_frame_clause
				frame_extent frame_bound
+%type opt_row_pattern_common_syntax opt_row_pattern_skip_to
+%type opt_row_pattern_initial_or_seek
+%type opt_row_pattern_measures
 %type opt_window_exclusion_clause
 %type opt_existing_window_name
 %type opt_if_not_exists
@@ -659,7 +671,6 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 				json_object_constructor_null_clause_opt
 				json_array_constructor_null_clause_opt

-
 /*
  * Non-keyword token types.  These are hard-wired into the "flex" lexer.
  * They must be listed first so that their numeric codes do not depend on
@@ -702,7 +713,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 	CURRENT_TIME CURRENT_TIMESTAMP CURRENT_USER CURSOR CYCLE
 	DATA_P DATABASE DAY_P DEALLOCATE DEC DECIMAL
Re: Guiding principle for dropping LLVM versions?
Hi,

Can we also check whether clang's version is compatible with llvm's version in llvm.m4? I have multiple llvm toolchains installed on my system, and I have to specify the $CLANG and $LLVM_CONFIG variables each time I build the server against a toolchain that is not in $PATH. If one of the variables is missing, the build system will pick up a default whose version might not be compatible with the other. E.g., if we use clang-16 with llvm-config-15, there will be failures when creating the indexes for bitcodes at the end of installation, with errors that look like

```
LLVM ERROR: ThinLTO cannot create input file: Unknown attribute kind (86) (Producer: 'LLVM16.0.6' Reader: 'LLVM 15.0.7')
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0. Program arguments: /usr/lib/llvm15/bin/llvm-lto -thinlto -thinlto-action=thinlink -o postgres.index.bc postgres/access/brin/brin.bc postgres/access/brin/brin_bloom.bc postgres/access/brin/brin_inclusion.bc postgres/access/brin/brin_minmax.bc postgres/access/brin/brin_minmax_multi.bc postgres/access/brin/brin_pageops.bc postgres/access/brin/brin_revmap.bc postgres/access/brin/brin_tuple.bc postgres/access/brin/brin_validate.bc postgres/access/brin/brin_xlog.bc postgres/access/common/attmap.bc postgres/access/common/bufmask.bc postgres/access/common/detoast.bc postgres/access/common/heaptuple.bc postgres/access/common/indextuple.bc postgres/access/common/printsimple.bc postgres/access/common/printtup.bc postgres/access/common/relation.bc postgres/access/common/reloptions.bc postgres/access/common/scankey.bc postgres/access/common/session.bc postgres/access/common/syncscan.bc postgres/access/common/toast_compression.bc postgres/access/common/toast_internals.bc postgres/access/common/tupconvert.bc postgres/access/common/tupdesc.bc postgres/access/gin/ginarrayproc.bc postgres/access/gin/ginbtree.bc postgres/access/gin/ginbulk.bc
postgres/access/gin/gindatapage.bc postgres/access/gin/ginentrypage.bc postgres/access/gin/ginfast.bc postgres/access/gin/ginget.bc postgres/access/gin/gininsert.bc postgres/access/gin/ginlogic.bc postgres/access/gin/ginpostinglist.bc postgres/access/gin/ginscan.bc postgres/access/gin/ginutil.bc postgres/access/gin/ginvacuum.bc postgres/access/gin/ginvalidate.bc postgres/access/gin/ginxlog.bc postgres/access/gist/gist.bc postgres/access/gist/gistbuild.bc postgres/access/gist/gistbuildbuffers.bc postgres/access/gist/gistget.bc postgres/access/gist/gistproc.bc postgres/access/gist/gistscan.bc postgres/access/gist/gistsplit.bc postgres/access/gist/gistutil.bc postgres/access/gist/gistvacuum.bc postgres/access/gist/gistvalidate.bc postgres/access/gist/gistxlog.bc postgres/access/hash/hash.bc postgres/access/hash/hash_xlog.bc postgres/access/hash/hashfunc.bc postgres/access/hash/hashinsert.bc postgres/access/hash/hashovfl.bc postgres/access/hash/hashpage.bc postgres/access/hash/hashsearch.bc postgres/access/hash/hashsort.bc postgres/access/hash/hashutil.bc postgres/access/hash/hashvalidate.bc postgres/access/heap/heapam.bc postgres/access/heap/heapam_handler.bc postgres/access/heap/heapam_visibility.bc postgres/access/heap/heaptoast.bc postgres/access/heap/hio.bc postgres/access/heap/prunehe
```

If we can check the llvm-config version and the clang version at the configuration phase, we can detect the problem earlier.

Best Regards,
Xing

On Sun, Oct 22, 2023 at 10:07 AM Thomas Munro wrote:
> Rebased. I also noticed this woefully out of date line:
>
> - PGAC_PATH_PROGS(LLVM_CONFIG, llvm-config llvm-config-7
> llvm-config-6.0 llvm-config-5.0 llvm-config-4.0 llvm-config-3.9)
> + PGAC_PATH_PROGS(LLVM_CONFIG, llvm-config llvm-config-17
> llvm-config-16 llvm-config-15 llvm-config-14)
>