Re: Allow file inclusion in pg_hba and pg_ident files
On Mon, Oct 10, 2022 at 10:51:32AM +0800, Julien Rouhaud wrote:
> On Sun, Sep 18, 2022 at 01:06:12AM +0800, Julien Rouhaud wrote:
> > On Tue, Aug 16, 2022 at 02:10:30PM +0800, Julien Rouhaud wrote:
> > > On Sat, Jul 30, 2022 at 04:09:36PM +0800, Julien Rouhaud wrote:
> > > >
> > > > - 0001: the rule_number / mapping_number addition in the views in a
> > > >   separate commit
> > > > - 0002: the main file inclusion patch.  Only a few minor bugfixes since
> > > >   the previous version, discovered thanks to the tests (a bit more
> > > >   about it after), and documentation tweaks based on previous
> > > >   discussions
> > > > - 0003: the pg_hba_matches() POC, no changes

v12 attached, fixing multiple conflicts with recent activity.

From 3361344bc0e907b341df912cf8ffcf28440b2370 Mon Sep 17 00:00:00 2001
From: Julien Rouhaud
Date: Mon, 30 May 2022 10:59:51 +0800
Subject: [PATCH v12 1/3] Add rule_number / mapping_number to the
 pg_hba/pg_ident views.

Author: Julien Rouhaud
Discussion: https://postgr.es/m/20220223045959.35ipdsvbxcstrhya%40jrouhaud
---
 doc/src/sgml/system-views.sgml      | 22 +
 src/backend/utils/adt/hbafuncs.c    | 50 ++---
 src/include/catalog/pg_proc.dat     | 11 ---
 src/test/regress/expected/rules.out | 10 +++---
 4 files changed, 72 insertions(+), 21 deletions(-)

diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 1ca7c3f9bf..4723f712a7 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -991,6 +991,18 @@
+
+
+      rule_number int4
+
+
+      Rule number of this rule among all rules if the rule is valid, otherwise
+      null.  This indicates the order in which each rule will be considered
+      until the first matching one, if any, is used to perform authentication
+      with the client.
+
+
+
       line_number int4

@@ -1131,6 +1143,16 @@
+
+
+      mapping_number int4
+
+
+      Mapping number, in priority order, of this mapping if the mapping is
+      valid, otherwise null
+
+
+
       line_number int4

diff --git a/src/backend/utils/adt/hbafuncs.c b/src/backend/utils/adt/hbafuncs.c
index cfdc4d8b39..21a451e391 100644
--- a/src/backend/utils/adt/hbafuncs.c
+++ b/src/backend/utils/adt/hbafuncs.c
@@ -26,10 +26,12 @@
 static ArrayType *get_hba_options(HbaLine *hba);
 static void fill_hba_line(Tuplestorestate *tuple_store, TupleDesc tupdesc,
-						  int lineno, HbaLine *hba, const char *err_msg);
+						  int rule_number, int lineno, HbaLine *hba,
+						  const char *err_msg);
 static void fill_hba_view(Tuplestorestate *tuple_store, TupleDesc tupdesc);
 static void fill_ident_line(Tuplestorestate *tuple_store, TupleDesc tupdesc,
-							int lineno, IdentLine *ident, const char *err_msg);
+							int mapping_number, int lineno, IdentLine *ident,
+							const char *err_msg);
 static void fill_ident_view(Tuplestorestate *tuple_store, TupleDesc tupdesc);

@@ -157,7 +159,7 @@ get_hba_options(HbaLine *hba)
 }

 /* Number of columns in pg_hba_file_rules view */
-#define NUM_PG_HBA_FILE_RULES_ATTS	9
+#define NUM_PG_HBA_FILE_RULES_ATTS	10

 /*
  * fill_hba_line
@@ -165,6 +167,7 @@ get_hba_options(HbaLine *hba)
  *
  * tuple_store: where to store data
  * tupdesc: tuple descriptor for the view
+ * rule_number: unique rule identifier among all valid rules
 * lineno: pg_hba.conf line number (must always be valid)
 * hba: parsed line data (can be NULL, in which case err_msg should be set)
 * err_msg: error message (NULL if none)
@@ -174,7 +177,8 @@ get_hba_options(HbaLine *hba)
  */
 static void
 fill_hba_line(Tuplestorestate *tuple_store, TupleDesc tupdesc,
-			  int lineno, HbaLine *hba, const char *err_msg)
+			  int rule_number, int lineno, HbaLine *hba,
+			  const char *err_msg)
 {
 	Datum		values[NUM_PG_HBA_FILE_RULES_ATTS];
 	bool		nulls[NUM_PG_HBA_FILE_RULES_ATTS];
@@ -193,6 +197,11 @@ fill_hba_line(Tuplestorestate *tuple_store, TupleDesc tupdesc,
 	memset(nulls, 0, sizeof(nulls));
 	index = 0;

+	/* rule_number */
+	if (err_msg)
+		nulls[index++] = true;
+	else
+		values[index++] = Int32GetDatum(rule_number);

 	/* line_number */
 	values[index++] = Int32GetDatum(lineno);

@@ -336,7 +345,7 @@ fill_hba_line(Tuplestorestate *tuple_store, TupleDesc tupdesc,
 	else
 	{
 		/* no parsing result, so set relevant fields to nulls */
-		memset(&nulls[1], true,
Re: Crash after a call to pg_backup_start()
On Mon, Oct 24, 2022 at 11:42:58AM +0900, Kyotaro Horiguchi wrote:
> At Sat, 22 Oct 2022 09:56:06 +0200, Alvaro Herrera wrote in
>> My intention here was that the Assert should be inside the block, that
>> is, we already know that at least one is true, and we want to make sure
>> that they are not *both* true.
>>
>> AFAICT the attached patch also fixes the bug without making the assert
>> weaker.

On the contrary, it seems to me that putting the assertion within the
if() block makes the assertion weaker, because we would never check for
an incorrect state after do_pg_abort_backup() is registered (aka any
pg_backup_start() call) when this if() block is not entered.

That said, if you feel otherwise I am fine with your conclusion as well,
so feel free to solve this issue as you see fit. :p
--
Michael
Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation
On Fri, 2022-10-21 at 17:39 -0700, Peter Geoghegan wrote:
> One small thought on the presentation/docs side of this: maybe it
> would be better to invent a new kind of autovacuum

It's possible this would be easier for users to understand: one process
that does cleanup work over time in a way that minimizes interference;
and another process that activates in more urgent situations (perhaps
due to misconfiguration of the first process).

But we should be careful that we don't end up with more confusion. For
something like that to work, we'd probably want the second process to
not be configurable at all, and we'd want it to be issuing WARNINGs
pointing to what might be misconfigured, and otherwise just be
invisible.

> That way we wouldn't be fighting against the widely held perception
> that antiwraparound autovacuums are scary.

There's certainly a terminology problem there. Just to brainstorm on
some new names, we might want to call it something like "xid
reclamation" or "xid horizon advancement". When it starts to run out, we
can use words like "wraparound" or "exhaustion".

--
Jeff Davis
PostgreSQL Contributor Team - AWS
Question about savepoint level?
Hi, hackers

The TransactionStateData struct has a savepointLevel field; however, I
do not understand what a savepoint level is. It seems all savepoints
have the same savepointLevel, and I want to know how the savepoint
level changes.

--
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.
Re: [PoC] Reducing planning time when tables have many partitions
Hello,

On Wed, Sep 21, 2022 at 6:43 PM Yuya Watari wrote:
> 1.1. Revert one of our optimizations (v5)
>
> As I mentioned in the comment in
> v[5|6|7]-0002-Revert-one-of-the-optimizations.patch, I reverted one of
> our optimizations. This code tries to find EquivalenceMembers that do
> not satisfy the bms_overlap condition. We encounter such members early
> in the loop, so the linear search is enough, and our optimization is
> too excessive here. As a result of experiments, I found this
> optimization was a bottleneck, so I reverted it.

In the previous mail, I proposed a revert of one excessive
optimization. In addition, I found a new bottleneck and attached a new
version of the patch solving it to this email.

The new bottleneck exists in the select_outer_pathkeys_for_merge()
function. At the end of this function, we count EquivalenceMembers that
satisfy a specific condition. To count them, we have used Bitmapset
operations. Through experiments, I concluded that this optimization is
effective for larger cases but leads to some degradation for smaller
numbers of partitions. The new patch switches between two algorithms
depending on the problem size.

1. Experimental result

1.1. Join Order Benchmark

As in the previous email, I used the Join Order Benchmark to evaluate
the patches' performance. The correspondence between each version and
the patches is as follows.

v3: v8-0001-*.patch
v5 (v3 with revert): v8-0001-*.patch + v8-0002-*.patch
v8 (v5 with revert): v8-0001-*.patch + v8-0002-*.patch + v8-0003-*.patch

I show the speed-up of each method compared with the master branch in
Table 1. When the number of partitions is 1, performance degradation is
kept to 1.1% in v8, while it is 4.2% and 1.8% in v3 and v5. This result
indicates that the newly introduced revert is effective.

Table 1: Speedup of Join Order Benchmark (higher is better)
(n = the number of partitions)
----------------------------------------------------------
   n |     v3 | v5 (v3 with revert) | v8 (v5 with revert)
----------------------------------------------------------
   2 |  95.8% |               98.2% |               98.9%
   4 |  97.2% |               99.7% |               99.3%
   8 | 101.4% |              102.5% |              103.4%
  16 | 108.7% |              111.4% |              110.2%
  32 | 127.1% |              127.6% |              128.8%
  64 | 169.5% |              172.1% |              172.4%
 128 | 330.1% |              335.2% |              332.3%
 256 | 815.1% |              826.4% |              821.8%
----------------------------------------------------------

1.2. pgbench

The following table describes the result of pgbench. The v5 and v8
patches performed clearly better than the v3 patch. The difference
between v5 and v8 is not so significant, but v8's performance is close
to the master branch.

Table 2: The result of pgbench (tps)
---------------------------------------------------------------------
   n | Master |   v3 | v5 (v3 with revert) | v8 (v5 with revert)
---------------------------------------------------------------------
   1 |   7550 | 7422 |                7474 |                7521
   2 |   7594 | 7381 |                7536 |                7529
   4 |   7518 | 7362 |                7461 |                7524
   8 |   7459 | 7340 |                7424 |                7460
---------------------------------------------------------------------
 Avg |   7531 | 7377 |                7474 |                7509
---------------------------------------------------------------------

2. Conclusion and future works

The revert in the v8-0003-*.patch is effective in preventing
performance degradation for smaller numbers of partitions. However, I
don't think what I have done in the patch is the best or ideal
solution. As I mentioned in the comments in the patch, switching
between two algorithms may be ugly because it introduces code
duplication. We need a wiser solution to this problem.

--
Best regards,
Yuya Watari

Attachments:
v8-0001-Apply-eclass_member_speedup_v3.patch.patch
v8-0002-Revert-one-of-the-optimizations.patch
v8-0003-Use-conventional-algorithm-for-smaller-size-cases.patch
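The size-based switching the patch describes can be sketched generically. The following is an illustrative Python model, not the planner code: the function names, the set representation of relids, and the threshold value are all assumptions made for the example.

```python
# Hypothetical sketch of "use the conventional algorithm for smaller cases":
# below a cutoff, a plain linear scan has less constant overhead than
# set-based operations; above it, the set machinery amortizes better.

SWITCH_THRESHOLD = 32  # assumed cutoff; the real patch picks its own


def count_overlapping(members, wanted):
    """Count members whose relid set overlaps 'wanted'.

    Both branches compute the same answer; only the cost profile differs.
    """
    if len(members) < SWITCH_THRESHOLD:
        # Small input: simple per-element intersection test.
        return sum(1 for relids in members if relids & wanted)
    # Large input: hoist 'wanted' into a frozenset and use isdisjoint,
    # which short-circuits without materializing intersections.
    wanted = frozenset(wanted)
    return sum(1 for relids in members if not wanted.isdisjoint(relids))
```

The duplication between the two branches mirrors the "may be ugly because it introduces code duplication" concern above: both branches must be kept semantically identical by hand.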
Re: Patch proposal: make use of regular expressions for the username in pg_hba.conf
On Fri, Oct 21, 2022 at 02:10:37PM +0200, Drouvot, Bertrand wrote:
> On 10/21/22 2:58 AM, Michael Paquier wrote:
>> The same approach with keywords first, regex, and exact match could be
>> applied as well for the databases?  Perhaps it is just mainly a matter
>> of taste,
>
> Yeah, I think it is. ;)

Still it looks that this makes for less confusion with a minimal
footprint once the new additions are in place.

>> In the same fashion as load_ident(), it seems to me that we need two
>> extra things for this patch:
>> - if !ok (see where we do MemoryContextDelete(hbacxt)), we need to go
>> through new_parsed_lines and free for each line the AuthTokens for the
>> database and user lists.
>> - if ok and new_parsed_lines != NIL, the same cleanup needs to happen.
>
> Right, but I think that should be "parsed_hba_lines != NIL".

For the second case, where we need to free the past contents after a
success, yes.

> Right. To avoid code duplication in the !ok/ok cases, the function
> free_hba_line() has been added in v2: it goes through the list of
> databases and roles tokens and calls free_auth_token() for each of
> them.

Having a small-ish routine for that is fine.

I have spent a couple of hours doing a pass over v2, playing manually
with regex patterns, reloads, the system views and item lists.  The
logic was fine, but I have adjusted a few things related to the
comments and the documentation (particularly with the examples,
removing one example and updating one with a regex that has a comma,
needing double quotes).  The CI and all my machines were green, and the
test coverage looked sufficient.  So, applied.  I'll keep an eye on the
buildfarm.
--
Michael
Re: doubt about FullTransactionIdAdvance()
Hi, Andres

On Oct 24, 2022, 01:16 +0800, Andres Freund wrote:
> Hi,
>
> On 2022-10-22 11:32:47 +0800, Zhang Mingli wrote:
> > Hi, hackers
> >
> > I don't quite understand FullTransactionIdAdvance(), correct me if I'm
> > wrong.
> >
> > /*
> >  * Advance a FullTransactionId variable, stepping over xids that would appear
> >  * to be special only when viewed as 32bit XIDs.
> >  */
> > static inline void
> > FullTransactionIdAdvance(FullTransactionId *dest)
> > {
> > 	dest->value++;
> >
> > 	/* see FullTransactionIdAdvance() */
> > 	if (FullTransactionIdPrecedes(*dest, FirstNormalFullTransactionId))
> > 		return;
> >
> > 	while (XidFromFullTransactionId(*dest) < FirstNormalTransactionId)
> > 		dest->value++;
> > }
> >
> > #define XidFromFullTransactionId(x) ((x).value)
> > #define FullTransactionIdPrecedes(a, b) ((a).value < (b).value)
> >
> > Can the code reach the line: while (XidFromFullTransactionId(*dest) <
> > FirstNormalTransactionId)?  As we will return if
> > (FullTransactionIdPrecedes(*dest, FirstNormalFullTransactionId)), and
> > the two judgements seem equal.
> > Another confusion is the comment /* see FullTransactionIdAdvance() */,
> > which references itself.  Could anyone explain this?  Thanks in
> > advance.
>
> FullTransactionId is 64bit. An "old school" xid is 32bit. The first branch is
> to protect against the special fxids that are actually below
> FirstNormalFullTransactionId:
>
> 	if (FullTransactionIdPrecedes(*dest, FirstNormalFullTransactionId))
> 		return;
>
> The second branch is to protect against 64bit xids that would yield a 32bit
> xid below FirstNormalTransactionId after truncating to 32bit:
>
> 	while (XidFromFullTransactionId(*dest) < FirstNormalTransactionId)
> 		dest->value++;
>
> E.g. we don't want to modify the 64bit xid 0 (meaning InvalidTransactionId) as
> it has special meaning. But we have to make sure that e.g. the 64bit xid
> 0x100000000 won't exist, as it'd also result in InvalidTransactionId if
> truncated to 32bit.
>
> However, it looks like this comment:
>
> 	/* see FullTransactionIdAdvance() */
> 	if (FullTransactionIdPrecedes(*dest, FirstNormalFullTransactionId))
> 		return;
>
> is bogus, and it's my fault. Looks like it's intending to reference
> FullTransactionIdRetreat().
>
> Greetings,
>
> Andres Freund

Now I get it, thanks for your explanation.

Regards,
Zhang Mingli
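The two branches Andres describes can be illustrated with a small model. This is plain Python, not the real C code; the only assumed fact is that PostgreSQL's special 32-bit xids are 0 (invalid), 1 (bootstrap), and 2 (frozen), so FirstNormalTransactionId is 3.

```python
FIRST_NORMAL_XID = 3  # after InvalidXid (0), BootstrapXid (1), FrozenXid (2)


def full_xid_advance(fxid):
    """Model of FullTransactionIdAdvance(): advance a 64-bit xid while
    skipping values whose low 32 bits would collide with a special xid."""
    fxid += 1
    # First branch: the handful of 64-bit values actually below 3 *are*
    # the special fxids themselves and are left alone.
    if fxid < FIRST_NORMAL_XID:
        return fxid
    # Second branch: skip fxids whose *truncated* 32-bit value looks
    # special, e.g. 0x100000000 truncates to 0 (InvalidTransactionId).
    while (fxid & 0xFFFFFFFF) < FIRST_NORMAL_XID:
        fxid += 1
    return fxid
```

The key difference from the quoted simplified macros is the `& 0xFFFFFFFF` truncation: without it the two tests really would be equal, which is exactly the confusion raised in the original question.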
Re: csv_populate_recordset and csv_agg
Steve Chavez writes:
> CSV processing is also a common use case and PostgreSQL has the COPY ..
> FROM .. CSV form but COPY is not compatible with libpq pipeline mode and
> the interface is clunkier to use.
> I propose to include two new functions:
> - csv_populate_recordset ( base anyelement, from_csv text )
> - csv_agg ( anyelement )

The trouble with CSV is there are so many mildly-incompatible versions
of it.  I'm okay with supporting it in COPY, where we have the freedom
to add random sub-options (QUOTE, ESCAPE, FORCE_QUOTE, yadda yadda) to
cope with those variants.  I don't see a nice way to handle that issue
in the functions you propose --- you'd have to assume that there is One
True CSV, which sadly ain't so, or else complicate the functions beyond
usability.

Also, in the end CSV is a surface presentation layer, and as such it's
not terribly well suited as the calculation representation for
aggregates and other functions.  I think these proposed functions would
have pretty terrible performance as a consequence of the need to
constantly re-parse the surface format.  The same point could be made
about JSON ... which is why we prefer to implement processing functions
with JSONB.

			regards, tom lane
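The "no One True CSV" point is easy to demonstrate outside PostgreSQL: the same row renders as two different byte sequences under two equally legitimate dialect settings. A quick illustration using Python's csv module (not PostgreSQL code; the dialect choices below are arbitrary examples):

```python
import csv
import io

row = ['a "quoted" value', 'semi;colon', '']


def render(row, **fmt):
    """Write one row with the given csv.writer formatting parameters."""
    buf = io.StringIO()
    csv.writer(buf, **fmt).writerow(row)
    return buf.getvalue().rstrip("\r\n")


# Excel-style default: comma delimiter, embedded quotes doubled ("").
excel_style = render(row)

# A different but valid dialect: semicolon delimiter, backslash escaping,
# no quoting at all.
escaped = render(row, quoting=csv.QUOTE_NONE, escapechar='\\', delimiter=';')
```

Both outputs are "CSV", yet a parser hard-wired for one dialect misreads the other — which is why COPY grows sub-options like QUOTE and ESCAPE instead of assuming a single format.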
Re: pg_recvlogical prints bogus error when interrupted
On Fri, Oct 21, 2022 at 7:52 AM Kyotaro Horiguchi wrote:
> > +1. How about emitting a message like its friend pg_receivewal, like
> > the attached patch?
>
> I'm not a fan of treating SIGINT as an error in this case. It calls
> prepareToTerminate() when time_to_abort and everything goes fine after
> then. So I think we should do the same thing after receiving an
> interrupt. This also does file-sync naturally as a part of normal
> shutdown. I'm also not a fan of doing fsync at error.

I think pg_recvlogical can gracefully exit on both SIGINT and SIGTERM
to keep things simple.

> > > I also then noticed that we don't fsync the output file in cases of
> > > errors - that seems wrong to me? Looks to me like that block should
> > > be moved till after the error:?
> >
> > How about something like the attached patch?

The attached patch (pg_recvlogical_graceful_interrupt.text) has a
couple of problems, I believe. We're losing prepareToTerminate() with
keepalive true, and we're not skipping pg_log_error("unexpected
termination of replication stream: %s") upon interrupt; after all,
we're here discussing how to avoid it.

I came up with the attached v2 patch, please have a look.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

From 56e25373796b114254f5995701b07b05381f28ef Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy
Date: Sat, 22 Oct 2022 08:35:16 +
Subject: [PATCH v2] pg_recvlogical fixes

---
 src/bin/pg_basebackup/pg_recvlogical.c | 25 ++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/src/bin/pg_basebackup/pg_recvlogical.c b/src/bin/pg_basebackup/pg_recvlogical.c
index 5f2e6af445..849e9d9071 100644
--- a/src/bin/pg_basebackup/pg_recvlogical.c
+++ b/src/bin/pg_basebackup/pg_recvlogical.c
@@ -54,7 +54,8 @@ static const char *plugin = "test_decoding";

 /* Global State */
 static int	outfd = -1;
-static volatile sig_atomic_t time_to_abort = false;
+static bool time_to_abort = false;
+static volatile sig_atomic_t ready_to_exit = false;
 static volatile sig_atomic_t output_reopen = false;
 static bool output_isfile;
 static TimestampTz output_last_fsync = -1;
@@ -283,6 +284,23 @@ StreamLogicalLog(void)
 			copybuf = NULL;
 		}

+		/* When we get SIGINT/SIGTERM, we exit */
+		if (ready_to_exit)
+		{
+			/*
+			 * Try informing the server about our exit, but don't wait around
+			 * or retry on failure.
+			 */
+			(void) PQputCopyEnd(conn, NULL);
+			(void) PQflush(conn);
+			time_to_abort = ready_to_exit;
+
+			if (verbose)
+				pg_log_info("received interrupt signal, exiting");
+
+			break;
+		}
+
 		/*
 		 * Potentially send a status message to the primary.
 		 */
@@ -614,7 +632,8 @@ StreamLogicalLog(void)
 		res = PQgetResult(conn);
 	}

-	if (PQresultStatus(res) != PGRES_COMMAND_OK)
+	if (!ready_to_exit &&
+		PQresultStatus(res) != PGRES_COMMAND_OK)
 	{
 		pg_log_error("unexpected termination of replication stream: %s",
 					 PQresultErrorMessage(res));
@@ -656,7 +675,7 @@ error:
 static void
 sigexit_handler(SIGNAL_ARGS)
 {
-	time_to_abort = true;
+	ready_to_exit = true;
 }

--
2.34.1
Re: cross-platform pg_basebackup
Hi,

On Fri, Oct 21, 2022 at 09:15:39AM -0400, Robert Haas wrote:
>
> Committed and back-patched to all supported branches.

Are there any additional things to be taken care of, or should
https://commitfest.postgresql.org/40/3954/ be closed?
Re: Crash after a call to pg_backup_start()
At Sat, 22 Oct 2022 09:56:06 +0200, Alvaro Herrera wrote in
> On 2022-Oct-21, Michael Paquier wrote:
>
> > On Fri, Oct 21, 2022 at 05:53:25PM +0800, Richard Guo wrote:
> > > /* These conditions can not be both true */
> >
> > If you do that, it would be a bit easier to read as of the following
> > assertion instead?  Like:
> > Assert(!during_backup_start ||
> >        sessionBackupState == SESSION_BACKUP_NONE);
>
> My intention here was that the Assert should be inside the block, that
> is, we already know that at least one is true, and we want to make sure
> that they are not *both* true.
>
> AFAICT the attached patch also fixes the bug without making the assert
> weaker.

I'm fine with either of them, but..  The reason that works the same way
is that the if() block excludes the case of (!during_backup_start &&
S_B_RUNNING)<*1>.  In other words the strictness is a kind of illusion
[*a].  Actually the assertion does not detect the case <*1>.  In this
regard, moving the current assertion into the if() block might be
confusing.

regards,

<*1>: It's evidently superfluous but "strictness" and "illusion" share
exactly the same pronunciation in Japanese, "Ghen-Ka-Ku".

--
Kyotaro Horiguchi
NTT Open Source Software Center
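The point that an assertion placed inside the block never inspects the one state the block skips can be checked mechanically. This is an illustrative Python sketch, assuming (as described above) that the block's condition excludes exactly the (!during_backup_start && sessionBackupState == RUNNING) state:

```python
# Model the if() condition: it is assumed to be true for every state
# except (!during_backup_start && running).
def block_entered(during_backup_start, running):
    return during_backup_start or not running


# An Assert placed inside the block only fires for states that enter it.
# Enumerate all four (during_backup_start, running) combinations and list
# which ones escape such an in-block assertion entirely.
uninspected = [(d, r)
               for d in (False, True)
               for r in (False, True)
               if not block_entered(d, r)]
```

Exactly one state escapes: (during_backup_start=False, running=True), the case Horiguchi-san notes the in-block assertion cannot detect, whereas an unconditional assertion would inspect all four states.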
Re: [DOCS] Stats views and functions not in order?
A rebase was needed. PSA v3*. -- Kind Regards, Peter Smith. Fujitsu Australia
csv_populate_recordset and csv_agg
Hello hackers,

The `json_populate_recordset` and `json_agg` functions allow systems to
process/generate json directly on the database. This "cuts out the
middle tier"[1] and notably reduces the complexity of web applications.

CSV processing is also a common use case, and PostgreSQL has the COPY ..
FROM .. CSV form, but COPY is not compatible with libpq pipeline mode
and the interface is clunkier to use.

I propose to include two new functions:

- csv_populate_recordset ( base anyelement, from_csv text )
- csv_agg ( anyelement )

I would gladly implement these if it sounds like a good idea. I see
there's already some code that deals with CSV in:

- src/backend/commands/copyfromparse.c (CopyReadAttributesCSV)
- src/fe_utils/print.c (csv_print_field)
- src/backend/utils/error/csvlog (write_csvlog)

So perhaps a new csv module could benefit the codebase as well.

Best regards,
Steve

[1]: https://www.crunchydata.com/blog/generating-json-directly-from-postgres
Re: [PATCH] minor bug fix for pg_dump --clean
Frédéric Yhuel writes:
> When using pg_dump (or pg_restore) with option "--clean", there is some
> SQL code to drop every object at the beginning.

Yup ...

> The DROP statement for a view involving circular dependencies is:
> CREATE OR REPLACE VIEW [...]
> (see commit message of d8c05aff for a much better explanation)

> If the view is not in the "public" schema, and the target database is
> empty, this statement fails, because the schema hasn't been created yet.
> The attached patches are a TAP test which can be used to reproduce the
> bug, and a proposed fix. They apply to the master branch.

I am disinclined to accept this as a valid bug, because there's never
been any guarantee that a --clean script would execute error-free in a
database that doesn't match what the source database contains.  (The
pg_dump documentation used to say that in so many words.  I see that
whoever added the --if-exists option was under the fond illusion that
that fixes all cases, which it surely does not.  We need to back off
the promises a bit there.)

An example of a case that won't execute error-free is if the view
having a circular dependency includes a column of a non-built-in data
type.  If you try to run that in an empty database, you'll get an error
from the CREATE OR REPLACE VIEW's reference to that data type.  For
instance, if I adjust your test case to make the "payload" column be of
type hstore, I get something like

psql:dumpresult.sql:22: ERROR:  type "public.hstore" does not exist
LINE 4:     NULL::public.hstore AS payload;
                  ^

The same type of failure occurs for user-defined functions and
operators that use a non-built-in type, and I'm sure there are more
cases in the same vein.

But it gets *really* messy if the target database isn't completely
empty, but contains objects with different properties than the dump
script expects; for example, if the view being discussed here exists
with a different column set than the script thinks, or if the
dependency chains aren't all the same.

If this fix were cleaner I might be inclined to accept it anyway, but
it's not very clean at all --- for example, it's far from obvious to me
what are the side-effects of changing the filter in RestoreArchive like
that.  Nor am I sure that the schema you want to create is guaranteed
to get dropped again later in every use-case.

So I think mainly what we ought to do here is to adjust the
documentation to make it clearer that --clean is not guaranteed to work
without errors unless the target database has the same set of objects
as the source.  --if-exists can reduce the set of error cases, but not
eliminate it.  Possibly we should be more enthusiastic about
recommending --create --clean (ie, drop and recreate the whole
database) instead.

			regards, tom lane
Re: [PATCH] minor bug fix for pg_dump --clean
Hi,

Good catch! Here are several points for improvement:

1. pg_dump.c:17380
Maybe it's better to write this more simply:
appendPQExpBuffer(delcmd, "CREATE SCHEMA IF NOT EXISTS %s;\n",
				  tbinfo->dobj.namespace->dobj.name);
because there is a schema name inside `tbinfo->dobj.namespace->dobj.name`.

2. pg_backup_archiver.c:588
The spaces just inside the parentheses are unnecessary, and spaces are
missing around the '+' sign:
( strncmp(dropStmt, "CREATE SCHEMA IF NOT EXISTS", 27) == 0 &&
  strstr(dropStmt+29, "CREATE OR REPLACE VIEW") ))

Best regards,
Viktoria Shepard

Thu, 1 Sep 2022 at 12:13, Frédéric Yhuel:

> Hello,
>
> When using pg_dump (or pg_restore) with option "--clean", there is some
> SQL code to drop every object at the beginning.
>
> The DROP statement for a view involving circular dependencies is:
>
> CREATE OR REPLACE VIEW [...]
>
> (see commit message of d8c05aff for a much better explanation)
>
> If the view is not in the "public" schema, and the target database is
> empty, this statement fails, because the schema hasn't been created yet.
>
> The attached patches are a TAP test which can be used to reproduce the
> bug, and a proposed fix. They apply to the master branch.
>
> Best regards,
> Frédéric
Re: Use simplehash.h instead of dynahash in SMgr
Hi,

On 2021-04-25 03:58:38 +1200, David Rowley wrote:
> Currently, we use dynahash hash tables to store the SMgrRelation so we
> can perform fast lookups by RelFileNodeBackend. However, I had in mind
> that a simplehash table might perform better. So I tried it...

> The test case was basically inserting 100 million rows one at a time
> into a hash partitioned table with 1000 partitions and 2 int columns
> and a primary key on one of those columns. It was about 12GB of WAL. I
> used a hash partitioned table in the hope to create a fairly
> random-looking SMgr hash table access pattern. Hopefully something
> similar to what might happen in the real world.

A potentially stupid question: Do we actually need to do smgr lookups
in this path? Afaict nearly all of the buffer lookups here will end up
as cache hits in shared buffers, correct?

Afaict we'll do two smgropens in a lot of paths:
1) XLogReadBufferExtended() does smgropen so it can do smgrnblocks()
2) ReadBufferWithoutRelcache() does an smgropen()

It's pretty sad that we constantly do two smgropen()s to start with.
But in the cache hit path we don't actually need an smgropen in either
case afaict.

ReadBufferWithoutRelcache() does an smgropen, because that's
ReadBuffer_common()'s API. Which in turn has that API because it wants
to use RelationGetSmgr() when coming from ReadBufferExtended(). It
doesn't seem awful to allow smgr to be NULL and to pass in the rlocator
in addition.

XLogReadBufferExtended() does an smgropen() so it can do smgrcreate()
and smgrnblocks(). But neither is needed in the cache hit case, I
think. We could do a "read-only" lookup in s_b, and only do the current
logic in case that fails?

Greetings,

Andres Freund
Re: Multiple grouping set specs referencing duplicate alias
David Kimura writes:
> I think I may have stumbled across a case of wrong results on HEAD (same
> through version 9.6, though interestingly 9.5 produces different results
> altogether).

> test=# SELECT i AS ai1, i AS ai2 FROM generate_series(1,3)i GROUP BY
> ai2, ROLLUP(ai1) ORDER BY ai1, ai2;

Yeah, this is an instance of an issue we've known about for awhile:
when using grouping sets (ROLLUP), the planner fails to distinguish
between "ai1" and "ai1 as possibly nulled by the action of the grouping
node".  This has been discussed at, eg, [1] and [2].

The direction I'd like to take to fix it is to invent explicit labeling
of Vars that have been nulled by some operation such as outer joins or
grouping, and then represent grouping set outputs as either
PlaceHolderVars or Vars tied to a new RTE that represents the grouping
step.  I have been working on a patch that'd do the first half of that
[3], but it's been slow going, because we've indulged in a lot of
semantic squishiness in this area and cleaning it all up is a large
undertaking.

> I tinkered a bit and hacked together an admittedly ugly patch that
> triggers an explicit sort constructed from the parse tree.

I seriously doubt that that'll fix all the issues in this area.  We
really really need to understand that a PathKey based on the scan-level
value of a Var is different from a PathKey based on a post-nulling-step
value.

			regards, tom lane

[1] https://www.postgresql.org/message-id/flat/CAMbWs48AtQTQGk37MSyDk_EAgDO3Y0iA_LzvuvGQ2uO_Wh2muw%40mail.gmail.com
[2] https://www.postgresql.org/message-id/flat/7dbdcf5c-b5a6-ef89-4958-da212fe10176%40iki.fi
[3] https://www.postgresql.org/message-id/flat/830269.1656693...@sss.pgh.pa.us
Re: Re[2]: Possible solution for masking chosen columns when using pg_dump
Hi, Thank you, Oleg Tselebrovskiy, for your valuable review, here are the fixes Best regards, Viktoria Shepard ср, 12 окт. 2022 г. в 12:19, Виктория Шепард : > Hi, > > Here is an idea of how to read masking options from a file. Please, take a > look. > > пн, 10 окт. 2022 г. в 14:54, Олег Целебровский >: > >> Hi, >> >> I applied most of suggestions: used separate files for most of added >> code, fixed typos/mistakes, got rid of that pesky TODO that was already >> implemented, just not deleted. >> >> Added tests (and functionality) for cases when you need to mask columns >> in tables with the same name in different schemas. If schema is not >> specified, then columns in all tables with specified name are masked >> (Example - pg_dump -t ‘t0’ --mask-columns id --mask-function mask_int will >> mask all ids in all tables with names ‘t0’ in all existing schemas). >> >> Wrote comments for all ‘magic numbers’ >> >> About that >> >> >- Also it can be hard to use a lot of different functions for different >> fields, maybe it would be better to set up functions in a file. >> >> I agree with that, but I know about at least 2 other patches (both are >> WIP, but still) that are interacting with reading command-line options from >> file. And if everyone will write their own version of reading command-line >> options from file, it will quickly get confusing. >> >> A solution to that problem is another patch that will put all options >> from file (one file for any possible options, from existing to future ones) >> into **argv in main, so that pg_dump can process them as if they came form >> command line. >> >> >> Пятница, 7 октября 2022, 8:01 +07:00 от Виктория Шепард < >> we.vikt...@gmail.com>: >> >> Hi, >> I took a look, here are several suggestions for improvement: >> >> - Masking is not a main functionality of pg_dump and it is better to >> write most of the connected things in a separate file like parallel.c or >> dumputils.c. 
This will help slow down the growth of an already huge pg_dump >> file. >> >> - Also it can be hard to use a lot of different functions for different >> fields, maybe it would be better to set up functions in a file. >> >> - How will it work for the same field and tables in the different >> schemas? Can we set up the exact schema for the field? >> >> - misspelling in a word >> >/* >> >* Add all columns and funcions to list of MaskColumnInfo structures, >> >*/ >> >> - Why did you use 256 here? >> > char* table = (char*) pg_malloc(256 * sizeof(char)); >> Also for malloc you need malloc on 1 symbol more because you have to >> store '\0' symbol. >> >> - Instead of addFuncToDatabase you can run your query using something >> already defined from fe_utils/query_utils.c. And It will be better to set >> up a connection only once and create all functions. Establishing a >> connection is a resource-intensive procedure. There are a lot of magic >> numbers, better to leave some comments explaining why there are 64 or 512. >> >> - It seems that you are not using temp_string >> > char *temp_string = (char*)malloc(256 * sizeof(char)); >> >> - Grammar issues >> >/* >> >* mask_column_info_list contains info about every to-be-masked column: >> >* its name, a name its table (if nothing is specified - mask all columns >> with this name), >> >* name of masking function and name of schema containing this function >> (public if not specified) >> >*/ >> the name of its table >> >> >> пн, 3 окт. 2022 г. в 20:45, Julien Rouhaud > >: >> >> Hi, >> >> On Mon, Oct 03, 2022 at 06:30:17PM +0300, Олег Целебровский wrote: >> > >> > Hello, here's my take on masking data when using pg_dump >> > >> > The main idea is using PostgreSQL functions to replace data during a >> SELECT. >> > When table data is dumped SELECT a,b,c,d ... from ... 
query is >> generated, the columns that are marked for masking are replaced with the result >> of functions on those columns >> > Example: columns name, count are to be masked, so the query will look >> as such: SELECT id, mask_text(name), mask_int(count), date from ... >> > >> > So about the interface: I added 2 more command-line options: >> > >> > --mask-columns, which specifies what columns from what tables will be >> masked >> > usage example: >> > --mask-columns "t1.name, t2.description" - both columns >> will be masked with the same corresponding function >> > or --mask-columns name - ALL columns with name "name" from >> all dumped tables will be masked with the corresponding function >> > >> > --mask-function, which specifies what functions will mask data >> > usage example: >> > --mask-function mask_int - corresponding columns will be >> masked with function named "mask_int" from default schema (public) >> > or --mask-function my_schema.mask_varchar - same as above >> but with specified schema where the function is stored >> > or --mask-function somedir/filename - the function is >> "defined" here - more on the structure below
Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)
On Thu, Oct 20, 2022 at 10:31 AM Andres Freund wrote: > - "repossession" is a very unintuitive name for me. If we want something like > it, can't we just name it reuse_failed or such? +1, I think "repossessed" is awkward. I think "reuse_failed" works, but no strong opinions on an alternate name. > - Wonder if the column names should be reads, writes, extends, etc instead of > the current naming pattern Why? Lukas suggested alignment with existing views like pg_stat_database and pg_stat_statements. It doesn't make sense to use the blks_ prefix since it's not all blocks, but otherwise it seems like we should be consistent, no? > > "freelist_acquired" is confusing for local buffers but I wanted to > > distinguish between reuse/eviction of local buffers and initial > > allocation. "freelist_acquired" seemed more fitting because there is a > > clocksweep to find a local buffer and if it hasn't been allocated yet it > > is allocated in a place similar to where shared buffers acquire a buffer > > from the freelist. If I didn't count it here, I would need to make a new > > column only for local buffers called "allocated" or something like that. > > I think you're making this too granular. We need to have more detail than > today. But we don't necessarily need to catch every nuance. In general I agree that coarser granularity here may be easier to use. I do think the current docs explain what's going on pretty well, though, and I worry if merging too many concepts will make that harder to follow. But if a less detailed breakdown still communicates potential problems, +1. > > This seems not too bad at first, but if you consider that later we will > > add other kinds of IO -- eg WAL IO or temporary file IO, we won't be > > able to use these existing columns and will need to add even more > > columns describing the exact behavior in those cases. > > I think it's clearly not the right direction. +1, I think the existing approach makes more sense.
Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)
On Wed, Oct 19, 2022 at 12:27 PM Melanie Plageman wrote: > > v34 is attached. > I think the column names need discussion. Also, the docs need more work > (I added a lot of new content there). I could use feedback on the column > names and definitions and review/rephrasing ideas for the docs > additions. Nice! I think the expanded docs are great, and make this information much easier to interpret. >+ io_context bulkread, existing >+ dirty buffers in the ring requirng flush are "requiring" >+ shared buffers were acquired from the freelist and added to the >+ fixed-size strategy ring buffer. Shared buffers are added to the >+ strategy ring lazily. If the current buffer in the ring is pinned or in This is the first mention of the term "strategy" in these docs. It's not totally opaque, since there's some context, but maybe we should either try to avoid that term or define it more explicitly? >+ io_contexts. This is equivalent to >+ evicted for shared buffers in >+ io_context shared, as the >contents >+ of the buffer are evicted but refers to the case when >the I don't quite follow this: does this mean that I should expect 'reused' and 'evicted' to be equal in the 'shared' context, because they represent the same thing? Or will 'reused' just be null because it's not distinct from 'evicted'? It looks like it's null right now, but I find the wording here confusing. >+ future with a new shared buffer. A high number of >+ bulkread rejections can indicate a need for more >+ frequent vacuuming or more aggressive autovacuum settings, as buffers >are >+ dirtied during a bulkread operation when updating the hint bit or when >+ performing on-access pruning. This is great. Just wanted to re-iterate that notes like this are really helpful to understanding this view. > I've implemented a change using the same function pg_settings uses to > turn the build-time parameter BLCKSZ into 8kB (get_config_unit_name()) > using the flag GUC_UNIT_BLOCKS. 
I am unsure if this is better or worse > than "block_size". I am feeling very conflicted about this column. Yeah, I guess it feels less natural here than in pg_settings, but it still kind of feels like one way of doing this is better than two...
Re: Pluggable toaster
Hi! Aleksander, >Don't you think that this is an arguable design decision? Basically >all we know about the underlying TableAM is that it stores tuples >_somehow_ and that tuples have TIDs [1]. That's it. We don't know if >it even has any sort of pages, whether they are fixed in size or not, >whether it uses shared buffers, etc. It may not even require TOAST. >(Not to mention the fact that when you have N TOAST implementations >and M TableAM implementations now you have to run N x M compatibility >tests. And this doesn't account for different versions of Ns and Ms, >different platforms and different versions of PostgreSQL.) >I believe the proposed approach is architecturally broken from the beginning. The existing TOAST mechanics just works, but for certain types of data it does so very poorly, and, let's face it, it has strict limitations that constrain the overall capabilities of the DBMS, because TOAST was designed before today's usual amounts of data - tables with hundreds of billions of rows, with sizes measured in hundreds of GB and even in terabytes. But TOAST itself is a good solution to the problem of storing oversized attributes, and though it has some limitations, it is unwise to just throw it away; the better way is to bring it up to date by revising it, getting rid of the most painful limitations, and allowing different (custom) TOAST strategies for special cases. The main idea of Pluggable TOAST is to extend TOAST capabilities by providing a common API that allows uniformly applying different strategies to TOAST different data. By "TOAST" here I mean that data is stored externally to the source table, somewhere only its Toaster knows where and how - it may be regular heap tables, heap tables with a different structure, tables of some other AM, files outside the database, even files on different storage systems. 
Pluggable TOAST allows using advanced compression methods and complex operations on externally stored data, such as searching without fully de-TOASTing the data. Also, the existing TOAST is part of the heap AM and is restricted to heap only. To make it extensible, we have to separate TOAST from the heap AM. The default TOAST in Pluggable TOAST still uses heap, but heap knows nothing about TOAST; it fits well with OOP paradigms. >It looks like the idea should be actually turned inside out. I.e. what >would be nice to have is some sort of _framework_ that helps TableAM >authors to implement TOAST (alternatively, the rest of the TableAM >except for TOAST) if the TableAM is similar to the default one. In >other words the idea is not to implement alternative TOASTers that >will work with all possible TableAMs but rather to simplify the task >of implementing an alternative TableAM which is similar to the default >one except for TOAST. These TableAMs should reuse as much common code >as possible except for the parts where they differ. To implement different TOAST strategies you must have an API to plug them in; otherwise, for each strategy you'd have to change the core. The TOAST API allows plugging in custom TOAST strategies just by adding contrib modules, once the API is merged into core. I should point out that different TOAST strategies do not have to store data with other TAMs; they could store the data in heap but use knowledge of the data's internal structure or workload to store it in a more optimal way - e.g. fast, partially compressed and decompressed JSON, or lots of large chunks of binary data stored in the database (as you know, large objects are not of much help with this), and so on. Implementing another table AM just to implement another TOAST strategy seems too much: the TAM API is very heavy and complex, and you would have to add it as a contrib module anyway. 
Lots of different TAMs would cause many more problems than lots of Toasters, because such a solution results in data incompatibility between installations with different TAMs, and a minor change in a custom TAM contrib could lead to losing all data stored with that TAM; with a custom TOAST you could, in the worst case, lose just the TOASTed data and nothing else. We have lots of requests from clients, and tickets, related to TOAST limitations and to extending Postgres this way - this growing need is what made us develop Pluggable TOAST. On Sun, Oct 23, 2022 at 12:38 PM Aleksander Alekseev < aleksan...@timescale.com> wrote: > Hi Nikita, > > > Pluggable TOAST API was designed with storage flexibility in mind, and > Custom TOAST mechanics is > > free to use any storage methods > > Don't you think that this is an arguable design decision? Basically > all we know about the underlying TableAM is that it stores tuples > _somehow_ and that tuples have TIDs [1]. That's it. We don't know if > it even has any sort of pages, whether they are fixed in size or not, > whether it uses shared buffers, etc. It may not even require TOAST. > (Not to mention the fact that when you have N TOAST implementations > and M TableAM
Re: Interesting areas for beginners
On Sun, Oct 23, 2022 at 6:24 AM Aleksander Alekseev wrote: > > Hi Matheus, > > > Some months ago I've got my first patch accept [1], and I'm looking to try > > to > > make other contributions. > > In personal experience reviewing other people's code is a good > starting point. Firstly, IMO this is one of the most valuable > contributions, since the community is always short on reviewers. > Secondly, in the process you will learn what the rest of the community > is working on, which patches have good chances to be accepted, and > learn the implementation details of the system. > > Additionally I would like to recommend the following materials for self-study: > > * > https://www.amazon.com/Database-System-Concepts-Abraham-Silberschatz/dp/1260084507/ > ** Especially the chapter available online about PostgreSQL > * https://www.youtube.com/playlist?list=PLSE8ODhjZXjZaHA6QcxDfJ0SIWBzQFKEG > * > https://www.timescale.com/blog/how-and-why-to-become-a-postgresql-contributor/ > I would second the recommendation to help with patch reviewing because it is one of the most valuable contributions you can make to the project as well as a good way to start to build relationships with other contributors, which will be helpful the next time they are tearing apart one of your patches ;-) In addition, two other resources to be aware of: * Paul Ramsey has a really nice write up on his thoughts on getting started hacking Postgres: http://blog.cleverelephant.ca/2022/10/postgresql-links.html * I suspect you may have seen these, but in case not, the wiki has several key pages to be aware of, which are linked to from https://wiki.postgresql.org/wiki/Development_information Robert Treat https://xzilla.net
Re: Tracking last scan time
On Fri, Oct 14, 2022 at 2:55 PM Dave Page wrote: > On Fri, 14 Oct 2022 at 19:16, Andres Freund wrote: >> On 2022-10-13 14:38:06 +0100, Dave Page wrote: >> > Thanks for that. It looks good to me, bar one comment (repeated 3 times in >> > the sql and expected files): >> > >> > fetch timestamps from before the next test >> > >> > "from " should be removed. >> >> I was trying to say something with that from, but clearly it wasn't >> understandable :). Removed. >> >> With that I pushed the changes and marked the CF entry as committed. > > > Thanks! > Hey folks, I was looking at this a bit further (great addition btw) and noticed the following behavior (this is a mre of the original testing that uncovered this): pagila=# select * from pg_stat_user_tables ; relid | schemaname | relname | seq_scan | last_seq_scan | seq_tup_read | idx_scan | last_idx_scan | idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del | n_tup_hot_upd | n_live_tup | n_dead_tup | n_mod_since_analyze | n_ins_since_vacuum | last_vacuum | last_autovacuum | last_analyze | last_autoanalyze | vacuum_count | autovacuum_count | analyze_count | autoanalyze_count ---++-+--+---+--+--+---+---+---+---+---+---+++-++-+-+--+--+--+--+---+--- (0 rows) pagila=# create table x (xx int); CREATE TABLE Time: 2.145 ms pagila=# select * from pg_stat_user_tables ; relid | schemaname | relname | seq_scan | last_seq_scan | seq_tup_read | idx_scan | last_idx_scan | idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del | n_tup_hot_upd | n_live_tup | n_dead_tup | n_mod_since_analyze | n_ins_since_vacuum | last_vacuum | last_autovacuum | last_analyze | last_autoanalyze | vacuum_count | autovacuum_count | analyze_count | autoanalyze_count ---++-+--+---+--+--+---+---+---+---+---+---+++-++-+-+--+--+--+--+---+--- 16392 | public | x |0 | [null]| 0 | [null] | [null]|[null] | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | [null] | [null] | [null] | [null] |0 |0 | 0 | 0 (1 row) pagila=# insert into x select 1; INSERT 0 1 pagila=# select * from 
pg_stat_user_tables ; relid | schemaname | relname | seq_scan | last_seq_scan | seq_tup_read | idx_scan | last_idx_scan | idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del | n_tup_hot_upd | n_live_tup | n_dead_tup | n_mod_since_analyze | n_ins_since_vacuum | last_vacuum | last_autovacuum | last_analyze | last_autoanalyze | vacuum_count | autovacuum_count | analyze_count | autoanalyze_count ---++-+--++--+--+---+---+---+---+---+---+++-++-+-+--+--+--+--+---+--- 16392 | public | x |0 | 1999-12-31 19:00:00-05 | 0 | [null] | [null]|[null] | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | [null] | [null] | [null] | [null] |0 |0 | 0 | 0 (1 row) Normally we populate "last" columns with a NULL value when the corresponding marker is zero, which seems correct in the first query, but no longer matches in the second. I can see an argument that this is a necessary exception to that rule (I'm not sure I agree with it, but I see it) but even in that scenario, ISTM we should avoid populating the table with a "special value", which generally goes against observability best practices, and I believe we've been able to avoid it elsewhere. Beyond that, I also notice the behavior changes when adding a table with a PK, though not necessarily better... pagila=# drop table x; DROP TABLE Time: 2.896 ms pagila=# select * from pg_stat_user_tables ; relid | schemaname | relname | seq_scan | last_seq_scan | seq_tup_read | idx_scan | last_idx_scan | idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del | n_tup_hot_upd | n_live_tup | n_dead_tup | n_mod_since_analyze | n_ins_since_vacuum | last_vacuum | last_autovacuum | last_analyze | last_autoanalyze | vacuum_count | autovacuum_count | analyze_count |
Re: doubt about FullTransactionIdAdvance()
Hi, On 2022-10-22 11:32:47 +0800, Zhang Mingli wrote: > Hi, hackers > > I don't quite understand FullTransactionIdAdvance(), correct me if I’m wrong. > > > /* > * Advance a FullTransactionId variable, stepping over xids that would appear > * to be special only when viewed as 32bit XIDs. > */ > static inline void > FullTransactionIdAdvance(FullTransactionId *dest) > { > dest->value++; > > /* see FullTransactionIdAdvance() */ > if (FullTransactionIdPrecedes(*dest, FirstNormalFullTransactionId)) > return; > > while (XidFromFullTransactionId(*dest) < FirstNormalTransactionId) > dest->value++; > } > > #define XidFromFullTransactionId(x) ((x).value) > #define FullTransactionIdPrecedes(a, b) ((a).value < (b).value) > > Can the codes reach line: while (XidFromFullTransactionId(*dest) < > FirstNormalTransactionId)? > As we will return if (FullTransactionIdPrecedes(*dest, > FirstNormalFullTransactionId)), and the two judgements seem equal. > Another confusion is the comments: /* see FullTransactionIdAdvance() */, is > its own itself. > Could anyone explain this? Thanks in advance. FullTransactionId is 64bit. An "old school" xid is 32bit. The first branch is to protect against the special fxids that are actually below FirstNormalFullTransactionId: if (FullTransactionIdPrecedes(*dest, FirstNormalFullTransactionId)) return; The second branch is to protect against 64bit xids that would yield a 32bit xid below FirstNormalTransactionId after truncating to 32bit: while (XidFromFullTransactionId(*dest) < FirstNormalTransactionId) dest->value++; E.g. we don't want to modify the 64bit xid 0 (meaning InvalidTransactionId) as it has special meaning. But we have to make sure that e.g. the 64bit xid 0x1 won't exist, as it'd also result in InvalidTransactionId if truncated to 32bit. However, it looks like this comment: /* see FullTransactionIdAdvance() */ if (FullTransactionIdPrecedes(*dest, FirstNormalFullTransactionId)) return; is bogus, and it's my fault. 
Looks like it's intending to reference FullTransactionIdRetreat(). Greetings, Andres Freund
Re: [PATCH] Replace "implementor" with "implementer"
Hi Tom, > They're both valid according to the dictionaries I looked > at, eg [1]. I don't feel a need to change anything. OK, thanks. And sorry for the noise. -- Best regards, Aleksander Alekseev
Re: [PATCH] Replace "implementor" with "implementer"
Aleksander Alekseev writes: > I noticed that there are several places where we use the spelling > "implementOr" while the correct one seems to be "implementEr". Here is > the patch. They're both valid according to the dictionaries I looked at, eg [1]. I don't feel a need to change anything. regards, tom lane [1] https://www.merriam-webster.com/dictionary/implement
Re: [PATCH] Replace "implementor" with "implementer"
Hi hackers, > I noticed that there are several places where we use the spelling > "implementOr" while the correct one seems to be "implementEr". Here is > the patch. After a little more study I found evidence that both spellings can be acceptable [1]. As a non-native speaker I can't judge whether this is true or not, or which spelling is preferable. I believe we should unify the spelling though. I initially thought "implementOr" was an incorrect spelling because most spell-checking tools I personally use flagged it. [1] https://english.stackexchange.com/a/358111 -- Best regards, Aleksander Alekseev
Re: Interesting areas for beginners
Hi Matheus, > Some months ago I've got my first patch accept [1], and I'm looking to try to > make other contributions. In personal experience reviewing other people's code is a good starting point. Firstly, IMO this is one of the most valuable contributions, since the community is always short on reviewers. Secondly, in the process you will learn what the rest of the community is working on, which patches have good chances to be accepted, and learn the implementation details of the system. Additionally I would like to recommend the following materials for self-study: * https://www.amazon.com/Database-System-Concepts-Abraham-Silberschatz/dp/1260084507/ ** Especially the chapter available online about PostgreSQL * https://www.youtube.com/playlist?list=PLSE8ODhjZXjZaHA6QcxDfJ0SIWBzQFKEG * https://www.timescale.com/blog/how-and-why-to-become-a-postgresql-contributor/ -- Best regards, Aleksander Alekseev
[PATCH] Replace "implementor" with "implementer"
Hi hackers, I noticed that there are several places where we use the spelling "implementOr" while the correct one seems to be "implementEr". Here is the patch. -- Best regards, Aleksander Alekseev v1-0001-Fix-incorrect-spelling-of-the-word-implementer.patch Description: Binary data
Re: Difference between HeapTupleData and TupleTableSlot structures
Hi hackers, > TupleTableSlot is a more abstract concept, being a container > for a tuple that can be present in several different forms. > It can contain a concrete tuple (HeapTupleData), or a "virtual" > tuple that is just an array of Datum+isnull values. The executor > usually uses tuple slots to return tuples out of plan nodes; > they're not very common elsewhere. I came across another little piece of information about TupleTableSlots [1] and recalled this thread: """ To implement an access method, an implementer will typically need to implement an AM-specific type of tuple table slot (see src/include/executor/tuptable.h), which allows code outside the access method to hold references to tuples of the AM, and to access the columns of the tuple. """ Hopefully this is helpful. [1] https://www.postgresql.org/docs/current/tableam.html -- Best regards, Aleksander Alekseev
Re: Pluggable toaster
Hi Nikita, > Pluggable TOAST API was designed with storage flexibility in mind, and Custom > TOAST mechanics is > free to use any storage methods Don't you think that this is an arguable design decision? Basically all we know about the underlying TableAM is that it stores tuples _somehow_ and that tuples have TIDs [1]. That's it. We don't know if it even has any sort of pages, whether they are fixed in size or not, whether it uses shared buffers, etc. It may not even require TOAST. (Not to mention the fact that when you have N TOAST implementations and M TableAM implementations now you have to run N x M compatibility tests. And this doesn't account for different versions of Ns and Ms, different platforms and different versions of PostgreSQL.) I believe the proposed approach is architecturally broken from the beginning. It looks like the idea should be actually turned inside out. I.e. what would be nice to have is some sort of _framework_ that helps TableAM authors to implement TOAST (alternatively, the rest of the TableAM except for TOAST) if the TableAM is similar to the default one. In other words the idea is not to implement alternative TOASTers that will work with all possible TableAMs but rather to simplify the task of implementing an alternative TableAM which is similar to the default one except for TOAST. These TableAMs should reuse as much common code as possible except for the parts where they differ. Does it make sense? Sorry, I realize this will probably imply a complete rewrite of the patch. This is the reason why one should start proposing changes from gathering the requirements, writing an RFC and run it through several rounds of discussion. [1]: https://www.postgresql.org/docs/current/tableam.html -- Best regards, Aleksander Alekseev