Re: Should rolpassword be toastable?
Nathan Bossart writes: > Here is a v3 patch set that fixes the test comment and a compiler warning > in cfbot. Nitpick: the message should say "%d bytes" not "%d characters", because we're counting bytes. Passes an eyeball check otherwise. regards, tom lane
Re: Why mention to Oracle ?
Tomas Vondra writes: > On 9/20/24 14:36, Marcos Pegoraro wrote: >> Why PostgreSQL DOCs needs to show or compare the Oracle way of doing >> things ? > I didn't dig into all the places you mention, but I'd bet those places > reference Oracle simply because it was the most common DB people either > migrated from or needed to support in their application next to PG, and > thus were running into problems. The similarity of the interfaces and > SQL dialects also likely played a role. It's less likely to run into > subtle behavior differences e.g. SQL Server when you have to rewrite > T-SQL stuff from scratch anyway. As far as the mentions in "Data Type Formatting Functions" go, those are there because those functions are not in the SQL standard; we stole the API definitions for them from Oracle, lock stock and barrel. (Except for the discrepancies that are called out by referencing what Oracle does differently.) A number of the other references probably have similar origins. regards, tom lane
Re: Clock-skew management in logical replication
Nisha Moond writes: > While considering the implementation of timestamp-based conflict > resolution (last_update_wins) in logical replication (see [1]), there > was a feedback at [2] and the discussion on whether or not to manage > clock-skew at database level. FWIW, I cannot see why we would do anything beyond suggesting that people run NTP. That's standard anyway on the vast majority of machines these days. Why would we add complexity that we have to maintain (and document) in order to cater to somebody not doing that? regards, tom lane
Re: attndims, typndims still not enforced, but make the value within a sane threshold
Michael Paquier writes: > On Fri, Sep 20, 2024 at 11:51:49AM +0800, Junwang Zhao wrote: >> Should you also bump the catalog version? > No need to worry about that when sending a patch because committers > take care of that when merging a patch into the tree. Doing that in > each patch submitted just creates more conflicts and work for patch > authors because they'd need to resolve conflicts each time a > catversion bump happens. And that can happen on a daily basis > sometimes depending on what is committed. Right. Sometimes the committer forgets to do that :-(, which is not great but it's not normally a big problem either. We've concluded it's better to err in that direction than impose additional work on patch submitters. If you feel concerned about the point, best practice is to include a mention that a catversion bump is needed in your draft commit message. regards, tom lane
Rethinking parallel-scan relation identity checks
In [1] I whined about how the parallel heap scan machinery should have
noticed that the same ParallelTableScanDesc was being used to give out
block numbers for two different relations.  Looking closer, there are
Asserts that mean to catch this type of error --- but they are comparing
relation OIDs, whereas what would have been needed to detect the problem
was to compare RelFileLocators.

It seems to me that a scan is fundamentally operating at the physical
relation level, and therefore these tests should check RelFileLocators
not OIDs.  Hence I propose the attached.  (For master only, of course;
this would be an ABI break in the back branches.)  This passes
check-world and is able to catch the problem exposed in the other
thread.

Another possible view is that we should check both physical and logical
relation IDs, but that seems like overkill to me.

Thoughts?

			regards, tom lane

[1] https://www.postgresql.org/message-id/2042942.1726781733%40sss.pgh.pa.us

diff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c
index dcd04b813d..1859be614c 100644
--- a/src/backend/access/index/indexam.c
+++ b/src/backend/access/index/indexam.c
@@ -500,8 +500,8 @@ index_parallelscan_initialize(Relation heapRelation, Relation indexRelation,
 										EstimateSnapshotSpace(snapshot));
 	offset = MAXALIGN(offset);
 
-	target->ps_relid = RelationGetRelid(heapRelation);
-	target->ps_indexid = RelationGetRelid(indexRelation);
+	target->ps_locator = heapRelation->rd_locator;
+	target->ps_indexlocator = indexRelation->rd_locator;
 	target->ps_offset = offset;
 	SerializeSnapshot(snapshot, target->ps_snapshot_data);
@@ -544,7 +544,9 @@ index_beginscan_parallel(Relation heaprel, Relation indexrel, int nkeys,
 	Snapshot	snapshot;
 	IndexScanDesc scan;
 
-	Assert(RelationGetRelid(heaprel) == pscan->ps_relid);
+	Assert(RelFileLocatorEquals(heaprel->rd_locator, pscan->ps_locator));
+	Assert(RelFileLocatorEquals(indexrel->rd_locator, pscan->ps_indexlocator));
+
 	snapshot = RestoreSnapshot(pscan->ps_snapshot_data);
 	RegisterSnapshot(snapshot);
 	scan = index_beginscan_internal(indexrel, nkeys, norderbys, snapshot,
diff --git a/src/backend/access/table/tableam.c b/src/backend/access/table/tableam.c
index e57a0b7ea3..bd8715b679 100644
--- a/src/backend/access/table/tableam.c
+++ b/src/backend/access/table/tableam.c
@@ -168,7 +168,7 @@ table_beginscan_parallel(Relation relation, ParallelTableScanDesc pscan)
 	uint32		flags = SO_TYPE_SEQSCAN |
 		SO_ALLOW_STRAT | SO_ALLOW_SYNC | SO_ALLOW_PAGEMODE;
 
-	Assert(RelationGetRelid(relation) == pscan->phs_relid);
+	Assert(RelFileLocatorEquals(relation->rd_locator, pscan->phs_locator));
 
 	if (!pscan->phs_snapshot_any)
 	{
@@ -389,7 +389,7 @@ table_block_parallelscan_initialize(Relation rel, ParallelTableScanDesc pscan)
 {
 	ParallelBlockTableScanDesc bpscan = (ParallelBlockTableScanDesc) pscan;
 
-	bpscan->base.phs_relid = RelationGetRelid(rel);
+	bpscan->base.phs_locator = rel->rd_locator;
 	bpscan->phs_nblocks = RelationGetNumberOfBlocks(rel);
 	/* compare phs_syncscan initialization to similar logic in initscan */
 	bpscan->base.phs_syncscan = synchronize_seqscans &&
diff --git a/src/include/access/relscan.h b/src/include/access/relscan.h
index 521043304a..114a85dc47 100644
--- a/src/include/access/relscan.h
+++ b/src/include/access/relscan.h
@@ -18,6 +18,7 @@
 #include "access/itup.h"
 #include "port/atomics.h"
 #include "storage/buf.h"
+#include "storage/relfilelocator.h"
 #include "storage/spin.h"
 #include "utils/relcache.h"
@@ -62,7 +63,7 @@ typedef struct TableScanDescData *TableScanDesc;
  */
 typedef struct ParallelTableScanDescData
 {
-	Oid			phs_relid;		/* OID of relation to scan */
+	RelFileLocator phs_locator;	/* physical relation to scan */
 	bool		phs_syncscan;	/* report location to syncscan logic? */
 	bool		phs_snapshot_any;	/* SnapshotAny, not phs_snapshot_data? */
 	Size		phs_snapshot_off;	/* data for snapshot */
@@ -169,8 +170,8 @@ typedef struct IndexScanDescData
 /* Generic structure for parallel scans */
 typedef struct ParallelIndexScanDescData
 {
-	Oid			ps_relid;
-	Oid			ps_indexid;
+	RelFileLocator ps_locator;	/* physical table relation to scan */
+	RelFileLocator ps_indexlocator; /* physical index relation to scan */
 	Size		ps_offset;		/* Offset in bytes of am specific structure */
 	char		ps_snapshot_data[FLEXIBLE_ARRAY_MEMBER];
 } ParallelIndexScanDescData;
Re: Should rolpassword be toastable?
Nathan Bossart writes: > Oh, actually, I see that we are already validating the hash, but you can > create valid SCRAM-SHA-256 hashes that are really long. So putting an > arbitrary limit (patch attached) is probably the correct path forward. I'd > also remove pg_authid's TOAST table while at it. Shouldn't we enforce the limit in every case in encrypt_password, not just this one? (I do agree that encrypt_password is an okay place to enforce it.) I think you will get pushback from a limit of 256 bytes --- I seem to recall discussion of actual use-cases where people were using strings of a couple of kB. Whatever the limit is, the error message had better cite it explicitly. Also, the ereport call needs an errcode. ERRCODE_PROGRAM_LIMIT_EXCEEDED is probably suitable. regards, tom lane
Re: BUG #18545: \dt breaks transaction, calling error when executed in SET SESSION AUTHORIZATION
I wrote:
> Justin Pryzby writes:
>> This commit seems to trigger elog(), not reproducible in the
>> parent commit.
>> 6e086fa2e77 Allow parallel workers to cope with a newly-created session user ID.
>> postgres=# SET min_parallel_table_scan_size=0; CLUSTER pg_attribute USING pg_attribute_relid_attnum_index;
>> ERROR: pg_attribute catalog is missing 26 attribute(s) for relation OID 70321

> I've been poking at this all day, and I still have little idea what's
> going on.

Got it, after a good deal more head-scratching.  Here's the relevant
parts of ParallelWorkerMain:

	/*
	 * We've changed which tuples we can see, and must therefore invalidate
	 * system caches.
	 */
	InvalidateSystemCaches();

	/*
	 * Restore GUC values from launching backend.  We can't do this earlier,
	 * because GUC check hooks that do catalog lookups need to see the same
	 * database state as the leader.
	 */
	gucspace = shm_toc_lookup(toc, PARALLEL_KEY_GUC, false);
	RestoreGUCState(gucspace);

	...

	/* Restore relmapper state. */
	relmapperspace = shm_toc_lookup(toc, PARALLEL_KEY_RELMAPPER_STATE, false);
	RestoreRelationMap(relmapperspace);

InvalidateSystemCaches blows away the worker's relcache.  Then
RestoreGUCState causes some catalog lookups (tracing shows that
restoring default_text_search_config is what triggers this on my
setup), and in particular pg_attribute's relcache entry will get
constructed to support that.  Then we wheel in a new set of relation
map entries *without doing anything about what that might invalidate*.

In the given test case, the globally-visible relmap says that
pg_attribute's relfilenode is, say, X.  But we are busy rewriting it,
so the parent process has an "active" relmap entry that says
pg_attribute's relfilenode is Y.  Given the above, the worker process
will have built a pg_attribute relcache entry that contains X, and
even though it now knows Y is the value it should be using, that
information never makes it to the worker's relcache.
The upshot of this is that when the parallel heap scan machinery doles out some block numbers for the parent process to read, and some other block numbers for the worker to read, the worker is reading those block numbers from the pre-clustering copy of pg_attribute, which most likely doesn't match the post-clustering image. This accounts for the missing and duplicate tuples I was seeing in the scan output. Of course, the reason 6e086fa2e made this visible is that before that, any catalog reads triggered by RestoreGUCState were done in an earlier transaction, and then we would blow away the ensuing relcache entries in InvalidateSystemCaches. So there was no bug as long as you assume that the "..." code doesn't cause any catalog reads. I'm not too sure of that though --- it's certainly not very comfortable to assume that functions like SetCurrentRoleId and SetTempNamespaceState will never attempt a catalog lookup. The code has another hazard too, which is that this all implies that the GUC-related catalog lookups will be done against the globally-visible relmap state not whatever is active in the parent process. I have not tried to construct a POC showing that that can give incorrect answers (that is, different from what the parent thinks), but it seems plausible that it could. So the fix seems clear to me: RestoreRelationMap needs to happen before anything that could result in catalog lookups. I'm kind of inclined to move up the adjacent restores of non-transactional low-level stuff too, particularly RestoreReindexState which has direct impact on how catalog lookups are done. Independently of that, it's annoying that the parallel heap scan machinery failed to notice that it was handing out block numbers for two different relfilenodes. I'm inclined to see if we can put some Asserts in there that would detect that. 
This particular bug would have been far easier to diagnose that way, and it hardly seems unlikely that "worker is reading the wrong relation" could happen with other mistakes in future. regards, tom lane
Re: Should rolpassword be toastable?
Nathan Bossart writes: > Hm. It does seem like there's little point in giving pg_authid a TOAST > table, as rolpassword is the only varlena column, and it obviously has > problems. But wouldn't removing it just trade one unhelpful internal error > when trying to log in for another when trying to add a really long password > hash (which hopefully nobody is really trying to do in practice)? I wonder > if we could make this a little more user-friendly. We could put an arbitrary limit (say, half of BLCKSZ) on the length of passwords. regards, tom lane
Re: BUG #18545: \dt breaks transaction, calling error when executed in SET SESSION AUTHORIZATION
Justin Pryzby writes: > This commit seems to trigger elog(), not reproducible in the > parent commit. > 6e086fa2e77 Allow parallel workers to cope with a newly-created session user > ID. > postgres=# SET min_parallel_table_scan_size=0; CLUSTER pg_attribute USING > pg_attribute_relid_attnum_index; > ERROR: pg_attribute catalog is missing 26 attribute(s) for relation OID 70321 I've been poking at this all day, and I still have little idea what's going on. I've added a bunch of throwaway instrumentation, and have managed to convince myself that the problem is that parallel heap scan is broken. The scans done to rebuild pg_attribute's indexes seem to sometimes miss heap pages or visit pages twice (in different workers). I have no idea why this is, and even less idea how 6e086fa2e is provoking it. As you say, the behavior isn't entirely reproducible, but I couldn't make it happen at all after reverting 6e086fa2e's changes in transam/parallel.c, so apparently there is some connection. Another possibly useful data point is that for me it reproduces fairly well (more than one time in two) on x86_64 Linux, but I could not make it happen on macOS ARM64. If it's a race condition, which smells plausible, that's perhaps not hugely surprising. regards, tom lane
Re: detoast datum into the given buffer as an optimization.
Andy Fan writes: > * Note if caller provides a non-NULL buffer, it is the duty of caller > * to make sure it has enough room for the detoasted format (Usually > * they can use toast_raw_datum_size to get the size) This is a pretty awful, unsafe API design. It puts it on the caller to know how to get the detoasted length, and it implies double decoding of the toast datum. > One of the key point is we can always get the varlena rawsize cheaply > without any real detoast activity in advance, thanks to the existing > varlena design. This is not an assumption I care to wire into the API design. How about a variant like struct varlena * detoast_attr_cxt(struct varlena *attr, MemoryContext cxt) which promises to allocate the result in the specified context? That would cover most of the practical use-cases, I think. regards, tom lane
Re: Large expressions in indexes can't be stored (non-TOASTable)
Nathan Bossart writes:
> On Wed, Sep 04, 2024 at 03:20:33PM -0400, Jonathan S. Katz wrote:
>> On 9/4/24 3:08 PM, Tom Lane wrote:
>>> Nathan Bossart writes:
>>>> Thanks to commit 96cdeae, only a few catalogs remain that are missing
>>>> TOAST tables: pg_attribute, pg_class, pg_index, pg_largeobject, and
>>>> pg_largeobject_metadata.  I've attached a short patch to add one for
>>>> pg_index, which resolves the issue cited here.

> Any objections to committing this?

Nope.

			regards, tom lane
Re: Detailed release notes
Marcos Pegoraro writes: > Em qua., 18 de set. de 2024 às 06:02, Peter Eisentraut > escreveu: >> Maybe this shouldn't be done between RC1 and GA. This is clearly a more >> complex topic that should go through a proper review and testing cycle. > I think this is just a question of decision, not reviewing or testing. I'd say the opposite: the thing we lack is exactly testing, in the sense of how non-hackers will react to this. Nonetheless, I'm not upset about trying to do it now. We will get more feedback about major-release notes than minor-release notes. And the key point is that it's okay to consider this experimental. Unlike say a SQL feature, there are no compatibility concerns that put a premium on getting it right the first time. We can adjust the annotations or give up on them without much cost. regards, tom lane
Re: Detailed release notes
Jelte Fennema-Nio writes: > On Wed, 18 Sept 2024 at 02:55, Bruce Momjian wrote: >>> Also very clutter-y. I'm not convinced that any of this is a good >>> idea that will stand the test of time: I estimate that approximately >>> 0.01% of people who read the release notes will want these links. >> Yes, I think 0.01% is accurate. > I think that is a severe underestimation. I'm sure a very large fraction of the people commenting on this thread would like to have these links. But we are by no means representative of the average Postgres user. regards, tom lane
Re: attndims, typndims still not enforced, but make the value within a sane threshold
jian he writes: > Can we error out at the stage "create table", "create domain" > time if the attndims or typndims is larger than MAXDIM (6) ? The last time this was discussed, I think the conclusion was we should remove attndims and typndims entirely on the grounds that they're useless. I certainly don't see a point in adding more logic that could give the misleading impression that they mean something. regards, tom lane
Re: DROP OWNED BY fails to clean out pg_init_privs grants
Egor Chindyaskin writes: > This query does not expect that test database may already contain some > information about custom user that ran test_pg_dump-running. I'm perfectly content to reject this as being an abuse of the test case. Our TAP tests are built on the assumption that they use databases created within the test case. Apparently, you've found a way to use the meson test infrastructure to execute a TAP test in the equivalent of "make installcheck" rather than "make check" mode. I am unwilling to buy into the proposition that our TAP tests should be proof against doing that after making arbitrary changes to the database's initial state. If anything, the fact that this is possible is a bug in our meson scripts. regards, tom lane
Re: BUG #18545: \dt breaks transaction, calling error when executed in SET SESSION AUTHORIZATION
Justin Pryzby writes: > This commit seems to trigger elog(), not reproducible in the > parent commit. Yeah, I can reproduce that. Will take a look tomorrow. regards, tom lane
Re: Detailed release notes
Bruce Momjian writes: > On Tue, Sep 17, 2024 at 03:29:54PM -0300, Marcos Pegoraro wrote: >> Em ter., 17 de set. de 2024 às 05:12, Alvaro Herrera >> >>> Add backend support for injection points (Michael Paquier) [commit 1] [2] > I think trying to add text to each item is both redundant and confusing, Also very clutter-y. I'm not convinced that any of this is a good idea that will stand the test of time: I estimate that approximately 0.01% of people who read the release notes will want these links. But if we keep it I think the annotations have to be very unobtrusive. (Has anyone looked at the PDF rendering of this? It seems rather unfortunate to me.) regards, tom lane
Re: Regression tests fail with tzdata 2024b
Wolfgang Walther writes: > The core regression tests need to be run with a timezone that tests > special cases in the timezone handling code. But that might not be true > for extensions - all they want could be a stable output across major and > minor versions of postgres and versions of tzdata. It could be helpful > to set pg_regress' timezone to UTC, for example? I would not recommend that choice. It would mask simple errors such as failing to apply the correct conversion between timestamptz and timestamp values. Also, if you have test cases that are affected by this issue at all, you probably have a need/desire to test things like DST transitions. regards, tom lane
Re: Regression tests fail with tzdata 2024b
Sven Klemm writes: > On Mon, Sep 16, 2024 at 5:19 PM Tom Lane wrote: >> Configurable to what? If your test cases are dependent on the >> historical behavior of PST8PDT, you're out of luck, because that >> simply isn't available anymore (or won't be once 2024b reaches >> your platform, anyway). > I was wondering whether the timezone used by pg_regress could be made > configurable. Yes, I understood that you were suggesting that. My point is that it wouldn't do you any good: you will still have to change any regression test cases that depend on behavior PST8PDT has/had that is different from America/Los_Angeles. That being the case, I don't see much value in making it configurable. regards, tom lane
Re: Document DateStyle effect on jsonpath string()
"David E. Wheeler" writes: > BTW, will the back-patch to 17 (cc4fdfa) be included in 17.0 or 17.1? 17.0. If we were already past 17.0 I'd have a lot more angst about changing this behavior. regards, tom lane
Re: Psql meta-command conninfo+
Alvaro Herrera writes: > On 2024-Sep-16, Jim Jones wrote: >> * The value of "Current User" does not match the function current_user() >> --- as one might expect. It is a little confusing, as there is no >> mention of "Current User" in the docs. In case this is the intended >> behaviour, could you please add it to the docs? > It is intended. As Peter said[1], what we wanted was to display > client-side info, so PQuser() is the right thing to do. Now maybe > "Current User" is not the perfect column header, but at least the > definition seems consistent with the desired end result. Seems like "Session User" would be closer to being accurate, since PQuser()'s result does not change when you do SET ROLE etc. > Now, I think > the current docs saying to look at session_user() are wrong, they should > point to the libpq docs for the function instead; something like "The > name of the current user, as returned by PQuser()" and so on. Sure, but this does not excuse choosing a misleading column name when there are better choices readily available. regards, tom lane
Re: Regression tests fail with tzdata 2024b
Sven Klemm writes: > This is an unfortunate change as this will break extensions tests using > pg_regress for testing. We run our tests against multiple minor versions > and this getting backported means our tests will fail with the next minor > pg release. Is there a workaround available to make the timezone for > pg_regress configurable without going into every test? Configurable to what? If your test cases are dependent on the historical behavior of PST8PDT, you're out of luck, because that simply isn't available anymore (or won't be once 2024b reaches your platform, anyway). regards, tom lane
Re: Regression tests fail with tzdata 2024b
Wolfgang Walther writes: > Tom Lane: >> Also, as a real place to a greater extent >> than "PST8PDT" is, it's more subject to historical revisionism when >> somebody turns up evidence of local law having been different than >> TZDB currently thinks. > I now tried all versions of tzdata which we had in tree back to 2018g, > they all work fine with the same regression test output. 2018g was an > arbitrary cutoff, I just didn't try any further. Yeah, my belly-aching above is just about hypothetical future instability. In reality, I'm sure America/Los_Angeles is pretty well researched and so it's unlikely that there will be unexpected changes in its zone history. > In the end, we don't need a default timezone that will never change. We do, really. For example, there's a nonzero chance the USA will cancel DST altogether at some future time. (This would be driven by politicians who don't remember the last time, but there's no shortage of those.) That'd likely break some of our test results, and even if it happened not to, it'd still be bad because we'd probably lose some coverage of the DST-transition-related code paths in src/timezone/. So I'd really be way happier with a test timezone that never changes but does include DST behavior. I thought PST8PDT fit those requirements pretty well, and I'm annoyed at Eggert for having tossed it overboard for no benefit whatever. But I don't run tzdb, so here we are. > We just need one that didn't change in a reasonable number of > releases going backwards. We've had this sort of fire-drill before, e.g. commit 8d7af8fbe. It's no fun, and the potential surface area for unexpected changes is now much greater than the few tests affected by that change. But here we are, so I pushed your patch with a couple of other cosmetic bits. There are still a couple of references to PST8PDT in the tree, but they don't appear to care what the actual meaning of that zone is, so I left them be. regards, tom lane
Re: A starter task
sia kc writes: > About reply to all button, I think only sending to mailing list address > should suffice. Why including previous recipients too? It's a longstanding habit around here for a couple of reasons: * The mail list servers are occasionally slow. (Our infrastructure is way better than it once was, but that still happens sometimes.) If you directly cc: somebody, they can reply to that copy right away whether or not they get a copy from the list right away. * pgsql-hackers is a fire hose. cc'ing people who have shown interest in the thread is useful because they will get those copies separately from the list traffic, and so they can follow the thread without having to dig through all the traffic. regards, tom lane
Re: how to speed up 002_pg_upgrade.pl and 027_stream_regress.pl under valgrind
Thomas Munro writes: > (An interesting archeological detail about the regression tests is > that they seem to derive from the Wisconsin benchmark, famous for > benchmark wars and Oracle lawyers[1]. This is quite off-topic for the thread, but ... we actually had an implementation of the Wisconsin benchmark in src/test/bench, which we eventually removed (a05a4b478). It does look like the modern regression tests borrowed the definitions of "tenk1" and some related tables from there, but I think it'd be a stretch to say the tests descended from it. regards, tom lane
Re: A starter task
Tomas Vondra writes: > Presumably a new contributor will start by discussing the patch first, > and won't waste too much time on it. Yeah, that is a really critical piece of advice for a newbie: no matter what size of patch you are thinking about, a big part of the job will be to sell it to the rest of the community. It helps a lot to solicit advice while you're still at the design stage, before you spend a lot of time writing code you might have to throw away. Stuff that is on the TODO list has a bit of an advantage here, because that indicates there's been at least some interest and previous discussion. But that doesn't go very far, particularly if there was not consensus about how to do the item. Job 1 is to build that consensus. regards, tom lane
Re: A starter task
Tomas Vondra writes: > I think you can take a look at https://wiki.postgresql.org/wiki/Todo and > see if there's a patch/topic you would be interested in. It's really > difficult to "assign" a task based on a single sentence, with no info > about the person (experience with other projects, etc.). Beware that that TODO list is poorly maintained, so items may be out of date. Worse, most of what's there got there because it's hard, or there's not consensus about what the feature should look like, or both. So IMO it's not a great place for a beginner to start. > FWIW, maybe it'd be better to start by looking at existing patches and > do a bit of a review, learn how to apply/test those and learn from them. Yeah, this is a good way to get some exposure to our code base and development practices. regards, tom lane
Re: Trim the heap free memory
I wrote:
> The single test case you showed suggested that maybe we could
> usefully prod glibc to free memory at query completion, but we
> don't need all this interrupt infrastructure to do that.  I think
> we could likely get 95% of the benefit with about a five-line
> patch.

To try to quantify that a little, I wrote a very quick-n-dirty patch
to apply malloc_trim during finish_xact_command and log the effects.
(I am not asserting this is the best place to call malloc_trim;
it's just one plausible possibility.)  Patch attached, as well as
statistics collected from a run of the core regression tests
followed by

grep malloc_trim postmaster.log | sed 's/.*LOG:/LOG:/' | sort -k4n | uniq -c >trim_savings.txt

We can see that out of about 43K test queries, 32K saved nothing
whatever, and in only four was more than a couple of meg saved.
That's pretty discouraging IMO.  It might be useful to look closer
at the behavior of those top four though.  I see them as

2024-09-15 14:58:06.146 EDT [960138] LOG: malloc_trim saved 7228 kB
2024-09-15 14:58:06.146 EDT [960138] STATEMENT: ALTER TABLE delete_test_table ADD PRIMARY KEY (a,b,c,d);

2024-09-15 14:58:09.861 EDT [960949] LOG: malloc_trim saved 12488 kB
2024-09-15 14:58:09.861 EDT [960949] STATEMENT:
	with recursive search_graph(f, t, label, is_cycle, path) as (
		select *, false, array[row(g.f, g.t)] from graph g
		union distinct
		select g.*, row(g.f, g.t) = any(path), path || row(g.f, g.t)
		from graph g, search_graph sg
		where g.f = sg.t and not is_cycle
	)
	select * from search_graph;

2024-09-15 14:58:09.866 EDT [960949] LOG: malloc_trim saved 12488 kB
2024-09-15 14:58:09.866 EDT [960949] STATEMENT:
	with recursive search_graph(f, t, label) as (
		select * from graph g
		union distinct
		select g.*
		from graph g, search_graph sg
		where g.f = sg.t
	) cycle f, t set is_cycle to 'Y' default 'N' using path
	select * from search_graph;

2024-09-15 14:58:09.853 EDT [960949] LOG: malloc_trim saved 12616 kB
2024-09-15 14:58:09.853 EDT [960949] STATEMENT:
	with recursive search_graph(f, t, label) as (
		select * from graph0 g
		union distinct
		select g.*
		from graph0 g, search_graph sg
		where g.f = sg.t
	) search breadth first by f, t set seq
	select * from search_graph order by seq;

I don't understand why WITH RECURSIVE queries might be more prone to
leave non-garbage-collected memory behind than other queries, but
maybe that is worth looking into.

			regards, tom lane

diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 8bc6bea113..9efb4f7636 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -21,6 +21,7 @@
 #include
 #include
+#include
 #include
 #include
 #include
@@ -2797,6 +2798,16 @@ finish_xact_command(void)
 		MemoryContextStats(TopMemoryContext);
 #endif
 
+		{
+			char	   *old_brk = sbrk(0);
+			char	   *new_brk;
+
+			malloc_trim(0);
+			new_brk = sbrk(0);
+			elog(LOG, "malloc_trim saved %zu kB",
+				 (old_brk - new_brk + 1023) / 1024);
+		}
+
 		xact_started = false;
 	}
 }

  32293 LOG: malloc_trim saved 0 kB
      4 LOG: malloc_trim saved 4 kB
     12 LOG: malloc_trim saved 8 kB
      7 LOG: malloc_trim saved 12 kB
     57 LOG: malloc_trim saved 16 kB
      3 LOG: malloc_trim saved 20 kB
     22 LOG: malloc_trim saved 24 kB
     14 LOG: malloc_trim saved 28 kB
    288 LOG: malloc_trim saved 32 kB
     20 LOG: malloc_trim saved 36 kB
     26 LOG: malloc_trim saved 40 kB
     18 LOG: malloc_trim saved 44 kB
     26 LOG: malloc_trim saved 48 kB
     27 LOG: malloc_trim saved 52 kB
     37 LOG: malloc_trim saved 56 kB
     26 LOG: malloc_trim saved 60 kB
    218 LOG: malloc_trim saved 64 kB
     20 LOG: malloc_trim saved 68 kB
     44 LOG: malloc_trim saved 72 kB
     44 LOG: malloc_trim saved 76 kB
     45 LOG: malloc_trim saved 80 kB
     29 LOG: malloc_trim saved 84 kB
     91 LOG: malloc_trim saved 88 kB
     31 LOG: malloc_trim saved 92 kB
    191 LOG: malloc_trim saved 96 kB
     30 LOG: malloc_trim saved 100 kB
     81 LOG: malloc_trim saved 104 kB
     24 LOG: malloc_trim saved 108 kB
    214 LOG: malloc_trim saved 112 kB
     32 LOG: malloc_trim saved 116 kB
    178 LOG: malloc_trim saved 120 kB
     86 LOG: malloc_trim saved 124 kB
   4498 LOG: malloc_trim saved 128 kB
     29 LOG: malloc_trim saved 132 kB
    286 LOG: malloc_trim saved 136 kB
     34 LOG: malloc_trim saved 140 kB
    563 LOG: malloc_trim saved 144 kB
     20 LOG: malloc_trim saved 148 kB
    821 LOG: malloc_trim saved 152 kB
    987 LOG: malloc_trim saved 156 kB
    212 LOG: malloc_trim saved 160 kB
      8 LOG
Re: how to speed up 002_pg_upgrade.pl and 025_stream_regress.pl under valgrind
Tomas Vondra writes:
> [ 002_pg_upgrade and 027_stream_regress are slow ]
> I don't have a great idea how to speed up these tests, unfortunately.
> But one of the problems is that all the TAP tests run serially - one
> after each other. Could we instead run them in parallel? The tests set up
> their "private" clusters anyway, right?

But there's parallelism within those two tests already, or I would hope so at least. If you run them in parallel then you are probably causing 40 backends instead of 20 to be running at once (plus 40 valgrind instances). Maybe you have a machine beefy enough to make that useful, but I don't.

Really the way to fix those two tests would be to rewrite them to not depend on the core regression tests. The core tests do a lot of work that's not especially useful for the purposes of those tests, and it's not even clear that they are exercising all that we'd like to have exercised for those purposes. In the case of 002_pg_upgrade, all we really need to do is create objects that will stress all of pg_dump. It's a little harder to scope out what we want to test for 027_stream_regress, but it's still clear that the core tests do a lot of work that's not helpful.

			regards, tom lane
Re: Trim the heap free memory
shawn wang writes:
> I have successfully registered my patch for the commitfest. However, upon
> integration, I encountered several errors during the testing phase. I am
> currently investigating the root causes of these issues and will work on
> providing the necessary fixes.

I should think the root cause is pretty obvious: malloc_trim() is a glibc-ism.

I'm fairly doubtful that this is something we should spend time on. It can never work on any non-glibc platform. Even granting that a Linux-only feature could be worth having, I'm really doubtful that our memory allocation patterns are such that malloc_trim() could be expected to free a useful amount of memory mid-query. The single test case you showed suggested that maybe we could usefully prod glibc to free memory at query completion, but we don't need all this interrupt infrastructure to do that. I think we could likely get 95% of the benefit with about a five-line patch.

			regards, tom lane
Re: Regression tests fail with tzdata 2024b
Wolfgang Walther writes:
> Building --with-system-tzdata and the latest tzdata 2024b fails the
> regression tests for me (see attached .diffs). This seems to be because
> of [1], which changed the way "PST8PDT" is handled. This is the timezone
> that the regression tests are run with.

That's quite annoying, especially since it was not mentioned in the 2024b release notes. (I had read the notes and concluded that 2024b didn't require any immediate attention on our part.)

> From 2024b on, "PST8PDT" is the same as "America/Los_Angeles", so by
> changing the regression tests to use the latter as the default, we're
> getting consistent output on at least 2024a and 2024b.

I'm fairly un-thrilled with this answer, not least because it exposes that zone's idiosyncratic "LMT" offset of -7:52:58 for years before 1883. (I'm surprised that that seems to affect only one or two regression results.) Also, as a real place to a greater extent than "PST8PDT" is, it's more subject to historical revisionism when somebody turns up evidence of local law having been different than TZDB currently thinks.

We may not have a lot of choice though. I experimented with using full POSIX notation, that is "PST8PDT,M3.2.0,M11.1.0", but that is actually worse in terms of the number of test diffs, since it doesn't match the DST transition dates that the tests expect for years before 2007. Another objection is that the tests would then not exercise any of the mainstream tzdb-file-reading code paths within the timezone code itself.

Grumble.

			regards, tom lane
Cleaning up ERRCODE usage in our XML code
I noticed while working on bug #18617 [1] that we are fairly slipshod about which SQLSTATE to report when libxml2 returns an error. There are places using ERRCODE_INTERNAL_ERROR for easily-triggered errors; there are different places that use different ERRCODEs for exactly the same condition; and there are places that use different ERRCODEs for failures from xmlXPathCompile and xmlXPathCompiledEval.

I found that this last is problematic because some errors you might expect to be reported during XPath compilation are not detected till execution, notably namespace-related errors. That seems more like a libxml2 implementation artifact than something we should expect to be stable behavior, so I think we should avoid using different ERRCODEs.

A lot of this can be blamed on there not being any especially on-point SQLSTATE values back when this code was first written. I learned that recent revisions of SQL have a whole new SQLSTATE class, class 10 = "XQuery Error", so we have an opportunity to sync up with that as well as be more self-consistent.

The spec's subclass codes in this class seem quite fine-grained. It might be an interesting exercise to try to teach xml_errorHandler() to translate libxml2's error->code values into fine-grained SQLSTATEs, but I've not attempted that; I'm not sure whether there is a close mapping between what libxml2 reports and the set of codes the SQL committee chose. What I've done in the attached first-draft patch is just to select one relatively generic code in class 10, 10608 = invalid_argument_for_xquery, and use that where it seemed apropos.

This is pretty low-priority, so I'll stash it in the next CF.
			regards, tom lane

[1] https://www.postgresql.org/message-id/356363.1726333674%40sss.pgh.pa.us

diff --git a/contrib/xml2/xpath.c b/contrib/xml2/xpath.c
index f94b622d92..cc40fa5624 100644
--- a/contrib/xml2/xpath.c
+++ b/contrib/xml2/xpath.c
@@ -388,7 +388,7 @@ pgxml_xpath(text *document, xmlChar *xpath, xpath_workspace *workspace)
 	/* compile the path */
 	comppath = xmlXPathCompile(xpath);
 	if (comppath == NULL)
-		xml_ereport(xmlerrcxt, ERROR, ERRCODE_EXTERNAL_ROUTINE_EXCEPTION,
+		xml_ereport(xmlerrcxt, ERROR, ERRCODE_INVALID_ARGUMENT_FOR_XQUERY,
 					"XPath Syntax Error");
 
 	/* Now evaluate the path expression. */
@@ -652,7 +652,7 @@ xpath_table(PG_FUNCTION_ARGS)
 			comppath = xmlXPathCompile(xpaths[j]);
 			if (comppath == NULL)
 				xml_ereport(xmlerrcxt, ERROR,
-							ERRCODE_EXTERNAL_ROUTINE_EXCEPTION,
+							ERRCODE_INVALID_ARGUMENT_FOR_XQUERY,
 							"XPath Syntax Error");
 
 			/* Now evaluate the path expression. */
diff --git a/contrib/xml2/xslt_proc.c b/contrib/xml2/xslt_proc.c
index f30a3a42c0..e761ca5cb5 100644
--- a/contrib/xml2/xslt_proc.c
+++ b/contrib/xml2/xslt_proc.c
@@ -90,7 +90,7 @@ xslt_process(PG_FUNCTION_ARGS)
 							   XML_PARSE_NOENT);
 	if (doctree == NULL)
-		xml_ereport(xmlerrcxt, ERROR, ERRCODE_EXTERNAL_ROUTINE_EXCEPTION,
+		xml_ereport(xmlerrcxt, ERROR, ERRCODE_INVALID_XML_DOCUMENT,
 					"error parsing XML document");
 
 	/* Same for stylesheet */
@@ -99,14 +99,14 @@ xslt_process(PG_FUNCTION_ARGS)
 							 XML_PARSE_NOENT);
 	if (ssdoc == NULL)
-		xml_ereport(xmlerrcxt, ERROR, ERRCODE_EXTERNAL_ROUTINE_EXCEPTION,
+		xml_ereport(xmlerrcxt, ERROR, ERRCODE_INVALID_XML_DOCUMENT,
 					"error parsing stylesheet as XML document");
 
 	/* After this call we need not free ssdoc separately */
 	stylesheet = xsltParseStylesheetDoc(ssdoc);
 	if (stylesheet == NULL)
-		xml_ereport(xmlerrcxt, ERROR, ERRCODE_EXTERNAL_ROUTINE_EXCEPTION,
+		xml_ereport(xmlerrcxt, ERROR, ERRCODE_INVALID_ARGUMENT_FOR_XQUERY,
 					"failed to parse stylesheet");
 
 	xslt_ctxt = xsltNewTransformContext(stylesheet, doctree);
@@ -141,7 +141,7 @@ xslt_process(PG_FUNCTION_ARGS)
 								NULL, NULL, xslt_ctxt);
 	if (restree == NULL)
-		xml_ereport(xmlerrcxt, ERROR, ERRCODE_EXTERNAL_ROUTINE_EXCEPTION,
+		xml_ereport(xmlerrcxt, ERROR, ERRCODE_INVALID_ARGUMENT_FOR_XQUERY,
 					"failed to apply stylesheet");
 
 	resstat = xsltSaveResultToString(&resstr, &reslen, restree, stylesheet);
diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index 1a07876cd5..3eaf9f78a8 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -742,7 +742,7 @@ xmltotext_with_options(xmltype *data, XmlOptionType xmloption_arg, bool indent)
 	{
 		/* If it's a document, saving is easy. */
 		if (xmlSaveDoc(ctxt, doc) == -1 || xmlerrcxt->err_occurred)
-			xml_ereport(xmlerrcxt, ERROR, ERRCODE_INTERNAL_ERROR,
+			xml_ereport(xmlerrcxt, ERROR, ERRCODE_OUT_OF_MEMORY,
						"could not save document to xmlBuffer");
 	}
 	else if (content_nodes != NULL)
@@ -785,7 +785,7 @@ xmltotext_with_options(xmltype *data, XmlOptionType xmlo
Re: Psql meta-command conninfo+
Alvaro Herrera writes:
> I don't understand why this is printing half the information in
> free-form plain text and the other half in tabular format. All these
> items that you have in the free-form text lines should be part of the
> table, I think.

+1, that was my reaction as well. I can see the point of showing those items the same way as plain \conninfo does, but I think we're better off just making \conninfo+ produce a table and nothing else.

			regards, tom lane
Re: Obsolete comment in pg_stat_statements
Julien Rouhaud writes: > On Sat, 14 Sept 2024, 12:39 Tom Lane, wrote: >> Hmm ... I agree that para is out of date, but is there anything to >> salvage rather than just delete it? > I thought about it but I think that now that knowledge is in the else > branch, with the mention that we still have to bump the nesting level even > if it's not locally handled. After sleeping on it I looked again, and I think you're right, there's no useful knowledge remaining in this para. Pushed. regards, tom lane
Re: Mutable foreign key constraints
Andrew Dunstan writes: > On 2024-09-12 Th 5:33 PM, Tom Lane wrote: >> I'm inclined to propose rejecting FK constraints if the comparison >> operator is not immutable. > Isn't there an upgrade hazard here? People won't thank us if they can't > now upgrade their clusters. If we can get around that then +1. Yeah, they would have to fix the bad DDL before upgrading. It'd be polite of us to add a pg_upgrade precheck for such cases, perhaps. regards, tom lane
Re: Obsolete comment in pg_stat_statements
Julien Rouhaud writes: > While adapting in pg_stat_kcache the fix for buggy nesting level calculation, > I > noticed that one comment referencing the old approach was missed. Trivial > patch attached. Hmm ... I agree that para is out of date, but is there anything to salvage rather than just delete it? regards, tom lane
Re: Exporting float_to_shortest_decimal_buf(n) with Postgres 17 on Windows
Andrew Kane writes:
> With Postgres 17 RC1 on Windows, `float_to_shortest_decimal_buf` and
> `float_to_shortest_decimal_bufn` are no longer exported. This causes
> `unresolved external symbol` linking errors for extensions that rely on
> these functions (like pgvector). Can these functions be exported like
> previous versions of Postgres?

AFAICS it's in the exact same place it was in earlier versions. You might need to review your linking commands.

			regards, tom lane
Re: Add system column support to the USING clause
Denis Garsh writes:
> The patch adds support for system columns in JOIN USING clause.

I think this is an actively bad idea, and it was likely intentional that it's not supported today. A few reasons why:

* There are, fundamentally, no use-cases for joining on system columns. The only one that is stable enough to even consider using for the purpose is tableoid, and I'm not detecting a reason why that'd be useful. If there are any edge cases where people would actually wish to do that, it can be done easily enough with a standard JOIN ON clause.

* Non-heap table AMs may not provide the same system columns that heap does, further reducing the scope for plausible use-cases. (Yeah, I know we don't support that too well today.)

* This breaks ruleutils.c's mechanism for dealing with name conflicts across multiple USING clauses. That relies on being able to assign aliases to the USING inputs at the table level (that is, "FROM realtable AS aliastable(aliascolumn,...)"). There's no way to alias a system column in the FROM syntax.

			regards, tom lane
Re: Pgstattuple on Sequences: Seeking Community Feedback on Potential Patch
Nathan Bossart writes: > Here's a patch to make the sequence permanent and to make the output of > tuple_data_split() not depend on endianness. +1 --- I checked this on mamba's host and it does produce "\\x0101" regardless of endianness. regards, tom lane
Re: Mutable foreign key constraints
"David G. Johnston" writes: > On Thursday, September 12, 2024, Tom Lane wrote: >> A possible objection is that if anybody has such a setup and >> hasn't noticed a problem because they never change their >> timezone setting, they might not appreciate us breaking it. >> So I certainly wouldn't propose back-patching this. But >> maybe we should add it as a foot-gun defense going forward. > I’m disinclined to begin enforcing this. If they got a volatile data type > in a key column and don’t attempt to index the key, which would fail on the > volatile side, I’d be mighty surprised. Um, neither type is "volatile" and each can be indexed just fine. It's the cross-type comparison required by the FK that brings the hazard. > I suggest adding the commentary and queries used to check for just such a > situation to the “don’t do this page” of the wiki and there just explain > while allowed for backward compatibility it is definitely not a recommended > setup. Yeah, that's a possible approach. regards, tom lane
Mutable foreign key constraints
I happened to notice that Postgres will let you do

regression=# create table foo (id timestamp primary key);
CREATE TABLE
regression=# create table bar (ts timestamptz references foo);
CREATE TABLE

This strikes me as a pretty bad idea, because whether a particular timestamp is equal to a particular timestamptz depends on your timezone setting. Thus the constraint could appear to be violated after a timezone change.

I'm inclined to propose rejecting FK constraints if the comparison operator is not immutable. Among the built-in opclasses, the only instances of non-immutable btree equality operators are

regression=# select amopopr::regoperator from pg_amop
  join pg_operator o on o.oid = amopopr
  join pg_proc p on p.oid = oprcode
  where amopmethod=403 and amopstrategy=3 and provolatile != 'i';
                         amopopr                         
---------------------------------------------------------
 =(date,timestamp with time zone)
 =(timestamp without time zone,timestamp with time zone)
 =(timestamp with time zone,date)
 =(timestamp with time zone,timestamp without time zone)
(4 rows)

A possible objection is that if anybody has such a setup and hasn't noticed a problem because they never change their timezone setting, they might not appreciate us breaking it. So I certainly wouldn't propose back-patching this. But maybe we should add it as a foot-gun defense going forward.

Thoughts?

			regards, tom lane
Re: may be a mismatch between the construct_array and construct_md_array comments
Alena Rybakina writes: > I noticed that there is a comment that values with NULL are not > processed there, but in fact this function calls the construct_md_array > function, which > contains a comment that it can handle NULL values. Right. construct_md_array has a "bool *nulls" argument, but construct_array doesn't --- it passes NULL for that to construct_md_array, which will therefore assume there are no null array elements. regards, tom lane
Re: Remove shadowed declaration warnings
David Rowley writes:
> On Thu, 12 Sept 2024 at 12:33, Peter Smith wrote:
>> I normally build the code with warnings enabled (specifically,
>> -Wshadow) which exposes many "shadowed" declarations.

> 0fe954c28 did add -Wshadow=compatible-local to the standard set of
> compilation flags. I felt it was diminishing returns after that, but
> -Wshadow=local would be the next step before going full -Wshadow.

I think that insisting that local declarations not shadow globals is an anti-pattern, and I'll vote against any proposal to make that a standard warning. Impoverished as C is, it does have block structure; why would we want to throw that away by (in effect) demanding a single flat namespace for the entire program?

I do grant that sometimes shadowing of locals can cause bugs. I don't recall right now why we opted for -Wshadow=compatible-local over -Wshadow=local, but we could certainly take another look at that.

			regards, tom lane
Re: Accept invalidation messages before the query starts inside a transaction
Andrei Lepikhov writes: > I don't know whether to classify this as a bug. The sketch of the patch > with an example isolation test is attached. This seems like an extremely strange place (and an extremely brute-force way) to insert an AcceptInvalidationMessages call. Under what circumstances wouldn't we do one or more AIMs later on, eg while acquiring locks? regards, tom lane
Re: Document DateStyle effect on jsonpath string()
"David E. Wheeler" writes: > Should it use the database-native stringification standard or the jsonpath > stringification standard? In the case of the former, output should omit the > “T” time separator and simplify the time zone `07:00` to `07`. But if it’s > the latter case, then it’s good as is. Seems to me it should be the jsonpath convention. If the spec does require any specific spelling, surely it must be that one. regards, tom lane
Re: Document DateStyle effect on jsonpath string()
"David E. Wheeler" writes:
> On Sep 11, 2024, at 12:26, Tom Lane wrote:
>> Building on that thought, maybe we could fix it as attached?

> It looks like that's what datum_to_json_internal() in json.c does, which IIUC
> is the default stringification for date and time values.

Right. I actually lifted the code from convertJsonbScalar in jsonb_util.c.

Here's a more fleshed-out patch with docs and regression test fixes. I figured we could shorten the tests a bit now that the point is just to verify that datestyle *doesn't* affect it.

			regards, tom lane

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 1bde4091ca..aa1ac2c4fe 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -18017,16 +18017,15 @@ ERROR: jsonpath member accessor can only be applied to an object
         String value converted from a JSON boolean, number, string, or
-        datetime (the output format for datetimes is determined by
-        the  parameter)
+        datetime
         jsonb_path_query_array('[1.23, "xyz", false]', '$[*].string()')
         ["1.23", "xyz", "false"]
-        jsonb_path_query('"2023-08-15"', '$.datetime().string()')
-        "2023-08-15"
+        jsonb_path_query('"2023-08-15 12:34:56"', '$.timestamp().string()')
+        "2023-08-15T12:34:56"
diff --git a/src/backend/utils/adt/jsonpath_exec.c b/src/backend/utils/adt/jsonpath_exec.c
index e3ee0093d4..b9c2443b65 100644
--- a/src/backend/utils/adt/jsonpath_exec.c
+++ b/src/backend/utils/adt/jsonpath_exec.c
@@ -72,6 +72,7 @@
 #include "utils/datetime.h"
 #include "utils/float.h"
 #include "utils/formatting.h"
+#include "utils/json.h"
 #include "utils/jsonpath.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
@@ -1629,32 +1630,13 @@ executeItemOptUnwrapTarget(JsonPathExecContext *cxt, JsonPathItem *jsp,
 					break;
 				case jbvDatetime:
 					{
-						switch (jb->val.datetime.typid)
-						{
-							case DATEOID:
-								tmp = DatumGetCString(DirectFunctionCall1(date_out,
-																		  jb->val.datetime.value));
-								break;
-							case TIMEOID:
-								tmp = DatumGetCString(DirectFunctionCall1(time_out,
-																		  jb->val.datetime.value));
-								break;
-							case TIMETZOID:
-								tmp = DatumGetCString(DirectFunctionCall1(timetz_out,
-																		  jb->val.datetime.value));
-								break;
-							case TIMESTAMPOID:
-								tmp = DatumGetCString(DirectFunctionCall1(timestamp_out,
-																		  jb->val.datetime.value));
-								break;
-							case TIMESTAMPTZOID:
-								tmp = DatumGetCString(DirectFunctionCall1(timestamptz_out,
-																		  jb->val.datetime.value));
-								break;
-							default:
-								elog(ERROR, "unrecognized SQL/JSON datetime type oid: %u",
-									 jb->val.datetime.typid);
-						}
+						char		buf[MAXDATELEN + 1];
+
+						JsonEncodeDateTime(buf,
+										   jb->val.datetime.value,
+										   jb->val.datetime.typid,
+										   &jb->val.datetime.tz);
+						tmp = pstrdup(buf);
 					}
 					break;
 				case jbvNull:
diff --git a/src/test/regress/expected/jsonb_jsonpath.out b/src/test/regress/expected/jsonb_jsonpath.out
index 70eeb655a2..acdf7e436f 100644
--- a/src/test/regress/expected/jsonb_jsonpath.out
+++ b/src/test/regress/expected/jsonb_jsonpath.out
@@ -2652,30 +2652,30 @@
 select jsonb_path_query('"2023-08-15 12:34:56 +5:30"', '$.timestamp().string()');
 ERROR:  cannot convert value from timestamptz to timestamp without time zone usage
 HINT:  Use *_tz() function for time zone support.
 select jsonb_path_query_tz('"2023-08-15 12:34:56 +5:30"', '$.timestamp().string()'); -- should work
-    jsonb_path_query_tz     
-----------------------------
- "Tue Aug 15 00:04:56 2023"
+  jsonb_path_query_tz  
+-----------------------
+ "2023-08-15T00:04:56"
 (1 row)
 
 select jsonb_path_query('"2023-08-15 12:34:56"', '$.timestamp_tz().string()');
 ERROR:  cannot convert value from timestamp to timestamptz without time zone usage
 HINT:  Use *_tz() function for time zone support.
 select jsonb_path_query_tz('"2023-08-15 12:34:56"', '$.timestamp_tz().string()'); -- should work
-      jsonb_path_query_tz      
--------------------------------
- "Tue Aug 15 12:34:56 2023 PDT"
+     jsonb_path_query_tz     
+-----------------------------
+ "2023-08-15T12:34:56-07:00"
 (1 row)
 
 select jsonb_path_query('"2023-08-15 12:34:56 +5:30"', '$.
Re: Document DateStyle effect on jsonpath string()
I wrote:
> I think I'd be content to have string() duplicate that behavior
> --- in fact, it seems like it'd be odd if it doesn't match.

Building on that thought, maybe we could fix it as attached? This changes the just-committed test cases of course, and I did not look at whether there are documentation changes to make.

			regards, tom lane

diff --git a/src/backend/utils/adt/jsonpath_exec.c b/src/backend/utils/adt/jsonpath_exec.c
index e3ee0093d4..b9c2443b65 100644
--- a/src/backend/utils/adt/jsonpath_exec.c
+++ b/src/backend/utils/adt/jsonpath_exec.c
@@ -72,6 +72,7 @@
 #include "utils/datetime.h"
 #include "utils/float.h"
 #include "utils/formatting.h"
+#include "utils/json.h"
 #include "utils/jsonpath.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
@@ -1629,32 +1630,13 @@ executeItemOptUnwrapTarget(JsonPathExecContext *cxt, JsonPathItem *jsp,
 					break;
 				case jbvDatetime:
 					{
-						switch (jb->val.datetime.typid)
-						{
-							case DATEOID:
-								tmp = DatumGetCString(DirectFunctionCall1(date_out,
-																		  jb->val.datetime.value));
-								break;
-							case TIMEOID:
-								tmp = DatumGetCString(DirectFunctionCall1(time_out,
-																		  jb->val.datetime.value));
-								break;
-							case TIMETZOID:
-								tmp = DatumGetCString(DirectFunctionCall1(timetz_out,
-																		  jb->val.datetime.value));
-								break;
-							case TIMESTAMPOID:
-								tmp = DatumGetCString(DirectFunctionCall1(timestamp_out,
-																		  jb->val.datetime.value));
-								break;
-							case TIMESTAMPTZOID:
-								tmp = DatumGetCString(DirectFunctionCall1(timestamptz_out,
-																		  jb->val.datetime.value));
-								break;
-							default:
-								elog(ERROR, "unrecognized SQL/JSON datetime type oid: %u",
-									 jb->val.datetime.typid);
-						}
+						char		buf[MAXDATELEN + 1];
+
+						JsonEncodeDateTime(buf,
+										   jb->val.datetime.value,
+										   jb->val.datetime.typid,
+										   &jb->val.datetime.tz);
+						tmp = pstrdup(buf);
 					}
 					break;
 				case jbvNull:
Re: Document DateStyle effect on jsonpath string()
"David E. Wheeler" writes:
> On Sep 11, 2024, at 11:11, Tom Lane wrote:
>> What "let result be stringified" behavior are you thinking of,
>> exactly? AFAICS there's not sensitivity to timezone unless you
>> use the _tz variant, otherwise it just regurgitates the input.

> There is stringification of a time, date, or timestamp value, which
> has no TZ, but is still affected by DateStyle.

What I understood you to be referencing is what happens without string(), which AFAICS does not result in any timezone rotation:

regression=# set timezone = 'America/New_York';
SET
regression=# select jsonb_path_query('"2023-08-15 12:34:56-09"', '$.timestamp_tz()');
      jsonb_path_query       
-----------------------------
 "2023-08-15T12:34:56-09:00"
(1 row)

I think I'd be content to have string() duplicate that behavior --- in fact, it seems like it'd be odd if it doesn't match.

			regards, tom lane
Re: Document DateStyle effect on jsonpath string()
"David E. Wheeler" writes: > I wonder, then, whether .string() should be modified to use the ISO format in > UTC, and therefore be immutable. That’s the format you get if you omit > .string() and let result be stringified from a date/time/timestamp. What "let result be stringified" behavior are you thinking of, exactly? AFAICS there's not sensitivity to timezone unless you use the _tz variant, otherwise it just regurgitates the input. I agree that we should force ISO datestyle, but I'm not quite sure about whether we're in the clear with timezone handling. We already had a bunch of specialized rules about timezone handling in the _tz and not-_tz variants of these functions. It seems to me that simply forcing UTC would not be consistent with that pre-existing behavior. However, I may not have absorbed enough caffeine yet. regards, tom lane
Re: Document DateStyle effect on jsonpath string()
Peter Eisentraut writes:
> What I'm concerned about is that this makes the behavior of JSON_QUERY
> non-immutable. Maybe there are other reasons for it to be
> non-immutable, in which case this isn't important. But it might be
> worth avoiding that?

Fair point, but haven't we already bit that bullet with respect to timezones?

[ looks... ] Hmm, it looks like jsonb_path_exists_tz is marked stable while jsonb_path_exists is claimed to be immutable. So yeah, there's a problem here. I'm not 100% convinced that jsonb_path_exists was truly immutable before, but for sure it is not now, and that's bad.

regression=# select jsonb_path_query('"2023-08-15 12:34:56"', '$.timestamp().string()');
   jsonb_path_query    
-----------------------
 "2023-08-15 12:34:56"
(1 row)

regression=# set datestyle = postgres;
SET
regression=# select jsonb_path_query('"2023-08-15 12:34:56"', '$.timestamp().string()');
      jsonb_path_query       
-----------------------------
 "Tue Aug 15 12:34:56 2023"
(1 row)

			regards, tom lane
Re: Test improvements and minor code fixes for formatting.c.
Aleksander Alekseev writes: > I'm having difficulties applying the patch. Could you please do `git > format-patch` and resend it? patch(1) is generally far more forgiving than 'git am'. regards, tom lane
Re: Make printtup a bit faster
Andy Fan writes: > Just to be clearer, I'd like work on the out function only due to my > internal assignment. (Since David planned it for PG18, so it is better > say things clearer eariler). I'd put parts of out(print) function > refactor in the next 2 days. I think it deserves a double check before > working on *all* the out function. Well, sure. You *cannot* write a patch that breaks existing output functions. Not at the start, and not at the end either. You should focus on writing the infrastructure and, for starters, converting just a few output functions as a demonstration. If that gets accepted then you can work on converting other output functions a few at a time. But they'll never all be done, because we can't realistically force extensions to convert. There are lots of examples of similar incremental conversions in our project's history. I think the most recent example is the "soft error handling" work (d9f7f5d32, ccff2d20e, and many follow-on patches). regards, tom lane
Re: pg_trgm comparison bug on cross-architecture replication due to different char implementation
Masahiko Sawada writes:
> On Tue, Sep 10, 2024 at 11:57 AM Tom Lane wrote:
>> Yeah, that seems like it could work. But are we sure that replicas
>> get a copy of the primary's control file rather than creating their
>> own?

> Yes, I think so. Since at least the system identifiers of primary and
> replicas must be identical for physical replication, if replicas use
> their own control files then they cannot start the replication.

Got it. So now I'm wondering if we need all the complexity of storing stuff in the GIN metapages. Could we simply read the (primary's) signedness out of pg_control and use that? We might need some caching mechanism to make that cheap enough, but accessing the current index's metapage is far from free either.

			regards, tom lane
Re: Speeding up ruleutils' name de-duplication code, redux
David Rowley writes: > On Wed, 11 Sept 2024 at 03:06, Tom Lane wrote: >> We could accomplish what you suggest by re-ordering the calls so that >> we build the hash table before enlarging the array. 0001 attached >> is the same as before (modulo line number changes from being rebased >> up to HEAD) and then 0002 implements this idea on top. On the whole >> though I find 0002 fairly ugly and would prefer to stick to 0001. >> I really doubt that scanning any newly-created column positions is >> going to take long enough to justify intertwining things like this. > I'm fine with that. I did test the performance with and without > v2-0002 and the performance is just a little too noisy to tell. Both > runs I did with v2-0002, it was slower, so I agree it's not worth > making the code uglier for. > I've no more comments. Looks good. Thanks for the review! I'll go push just 0001. regards, tom lane
Re: [BUG?] XMLSERIALIZE( ... INDENT) won't work with blank nodes
Jim Jones writes: > [ xmlserialize patches ] Pushed with minor editorialization. Notably, I got rid of scribbling on xmlBufferContent's buffer --- I don't know how likely that is to upset libxml2, but it seems like a fairly bad idea given that they declare the result as "const xmlChar*". Casting away the const is poor form in any case. regards, tom lane
Re: Document DateStyle effect on jsonpath string()
"David E. Wheeler" writes: > On Sep 10, 2024, at 14:51, Tom Lane wrote: >> Pushed with a little additional polishing. > Thank you! Do you think it’d be worthwhile to back port to 17? Not as things stand. If we adopt Peter's nearby position that the current behavior is actually buggy, then probably back-patching a corrected version would be worthwhile as a part of fixing it. regards, tom lane
Re: Document DateStyle effect on jsonpath string()
Peter Eisentraut writes: > These JSON path functions are specified by the SQL standard, so they > shouldn't depend on PostgreSQL-specific settings. At least in new > functionality we should avoid that, no? Hmm ... but does the standard precisely define the output format? Since these conversions are built on our own timestamp I/O code, I rather imagine there is quite a lot of behavior there that's not to be found in the standard. That doesn't really trouble me as long as the spec's behavior is a subset of it (i.e., reachable as long as you've got the right parameter settings). regards, tom lane
Re: pg_trgm comparison bug on cross-architecture replication due to different char implementation
Masahiko Sawada writes: > An alternative way would be that we store the char signedness in the > control file, and gin_trgm_ops opclass reads it if the bytes in the > meta page shows 'unset'. The char signedness in the control file > doesn't mean to be used for the compatibility check for physical > replication but used as a hint. But it also could be a bit messy, > though. Yeah, that seems like it could work. But are we sure that replicas get a copy of the primary's control file rather than creating their own? regards, tom lane
Re: Document DateStyle effect on jsonpath string()
"David E. Wheeler" writes: > Rebase on 47c9803. I also changed the commitfest item[1] to “ready for > committer”, since jian reviewed it, though I couldn’t see a way to add jian > as a reviewer in the app. Hope that makes sense. Pushed with a little additional polishing. I thought the best way to address jian's complaint about DateStyle not being clearly locked down was to change horology.sql to verify the prevailing setting, as it has long done for TimeZone. That's the lead test script for related stuff, so it makes the most sense to do it there. Having done that, I don't feel a need to duplicate that elsewhere. regards, tom lane
Re: [PATCH] Add CANONICAL option to xmlserialize
Jim Jones writes: > This patch introduces the CANONICAL option to xmlserialize, which > serializes xml documents in their canonical form - as described in > the W3C Canonical XML Version 1.1 specification. This option can > be used with the additional parameter WITH [NO] COMMENTS to keep > or remove xml comments from the canonical xml output. While I don't object to providing this functionality in some form, I think that doing it with this specific syntax is a seriously bad idea. I think there's significant risk that at some point the SQL committee will either standardize this syntax with a somewhat different meaning or standardize some other syntax for the same functionality. How about instead introducing a plain function along the lines of "xml_canonicalize(xml, bool keep_comments) returns text" ? The SQL committee will certainly never do that, but we won't regret having created a plain function whenever they get around to doing something in the same space. regards, tom lane
Re: Converting README documentation to Markdown
Daniel Gustafsson writes: > Since there doesn't seem to be much interest in going all the way to Markdown, > the attached 0001 is just the formatting changes for achieving (to some > degree) > consistency among the README's. This mostly boils down to using a consistent > amount of whitespace around code, using the same indentation on bullet lists > and starting sections the same way. Inspecting the patch with git diff -w > reveals that it's not much left once whitespace is ignored. There might be a > few markdown hunks left which I'll hunt down in case anyone is interested in > this. > As an added bonus this still makes most READMEs render nicely as Markdown, > just > not automatically on Github as it doesn't know the filetype. I did not inspect the patch in detail, but this approach seems like a reasonable compromise. However, if we're not officially going to Markdown, how likely is it that these files will stay valid in future edits? I suspect most of us don't have those syntax rules wired into our fingers (I sure don't). regards, tom lane
Re: Speeding up ruleutils' name de-duplication code, redux
David Rowley writes: > On Tue, 30 Jul 2024 at 10:14, Tom Lane wrote: >> On my development machine, it takes over 14 minutes to pg_upgrade >> this, and it turns out that that time is largely spent in column >> name de-duplication while deparsing the CHECK constraints. The >> attached patch reduces that to about 3m45s. > I looked at the patch and tried it out. Thanks for looking! > This gives me what I'd expect to see. I wanted to ensure the point > where you're switching to the hashing method was about the right > place. It seems to be, at least for my test. Yeah, I was just going by gut feel there. It's good to have some numbers showing it's not a totally silly choice. > Perhaps you don't think it's worth the additional complexity, but I > see that in both locations you're calling build_colinfo_names_hash(), > it's done just after a call to expand_colnames_array_to(). I wondered > if it was worthwhile unifying both of those functions maybe with a new > name so that you don't need to loop over the always NULL element of > the colnames[] array when building the hash table. This is likely > quite a small overhead compared to the quadratic search you've > removed, so it might not move the needle any. I just wanted to point > it out as I've little else I can find to comment on. Hmm, but there are quite a few expand_colnames_array_to calls that are not associated with build_colinfo_names_hash. On the whole it feels like those are separate concerns that are better kept separate. We could accomplish what you suggest by re-ordering the calls so that we build the hash table before enlarging the array. 0001 attached is the same as before (modulo line number changes from being rebased up to HEAD) and then 0002 implements this idea on top. On the whole though I find 0002 fairly ugly and would prefer to stick to 0001. I really doubt that scanning any newly-created column positions is going to take long enough to justify intertwining things like this. 
regards, tom lane

[attachment: 0001 patch against src/backend/utils/adt/ruleutils.c — adds a names_hash hash table to deparse_columns (holding all the strings appearing in colnames, new_colnames, and parentUsing; used only for sufficiently wide relations, during set_relation_column_names and set_join_column_names) plus build_colinfo_names_hash, add_to_names_hash, and destroy_colinfo_names_hash helpers; the diff is mangled and truncated in this archive]
Re: pg_trgm comparison bug on cross-architecture replication due to different char implementation
Masahiko Sawada writes: > On Mon, Sep 9, 2024 at 4:42 PM Tom Lane wrote: >> Do you have an idea for how we'd get >> this to happen during pg_upgrade, exactly? > What I was thinking is that we have "pg_dump --binary-upgrade" emit a > function call, say "SELECT binary_upgrade_update_gin_meta_page()" for > each GIN index, and the meta pages are updated when restoring the > schemas. Hmm, but ... 1. IIRC we don't move the relation files into the new cluster until after we've run the schema dump/restore step. I think this'd have to be driven in some other way than from the pg_dump output. I guess we could have pg_upgrade start up the new postmaster and call a function in each DB, which would have to scan for GIN indexes by itself. 2. How will this interact with --link mode? I don't see how it doesn't involve scribbling on files that are shared with the old cluster, leading to possible problems if the pg_upgrade fails later and the user tries to go back to using the old cluster. It's not so much the metapage update that is worrisome --- we're assuming that that will modify storage that's unused in old versions. But the change would be unrecorded in the old cluster's WAL, which sounds risky. Maybe we could get away with forcing --copy mode for affected indexes, but that feels a bit messy. We'd not want to do it for unaffected indexes, so the copying code would need to know a great deal about this problem. regards, tom lane
Re: change "attnum <=0" to "attnum <0" for better reflect system attribute
jian he writes: > get_attnum says: > Returns InvalidAttrNumber if the attr doesn't exist (or is dropped). > So I conclude that "attnum == 0" is not related to the idea of a system > column. attnum = 0 is also used for whole-row Vars. This is a pretty unfortunate choice given the alternative meaning of "invalid", but cleaning it up would be a daunting task (with not a whole lot of payoff in the end, AFAICS). It's deeply embedded. That being the case, you have to tread *very* carefully when considering making changes like this. > for example, ATExecColumnDefault, following code snippet, > the second ereport should be "if (attnum < 0)" > /* Prevent them from altering a system attribute */ > if (attnum <= 0) I think that's just fine as-is. Sure, the == case is unreachable, but it is very very common to consider whole-row Vars as being more like system attributes than user attributes. In this particular case, for sure we don't want to permit attaching a default to a whole-row Var. So I'm content to allow the duplicative rejection. regards, tom lane
Re: pg_trgm comparison bug on cross-architecture replication due to different char implementation
Masahiko Sawada writes: > When do we set the byte on the primary server? If it's the first time > to use the GIN index, secondary servers would have to wait for the > primary to use the GIN index, which could be an unpredictable time or > it may never come depending on index usages. Probably we can make > pg_upgrade set the byte in the meta page of GIN indexes that use the > gin_trgm_ops. Hmm, perhaps. That plus set-it-during-index-create would remove the need for dynamic update like I suggested. So very roughly the amount of complexity would balance out. Do you have an idea for how we'd get this to happen during pg_upgrade, exactly? regards, tom lane
Re: pgstattuple: fix free space calculation
Frédéric Yhuel writes: > v4 patch attached. LGTM, pushed. regards, tom lane
Re: access numeric data in module
Robert Haas writes: > On Mon, Sep 9, 2024 at 1:25 PM Tom Lane wrote: >> IMO it'd be a lot better if numeric.c exposed whatever functionality >> Ed feels is missing, while keeping the contents of a numeric opaque. > We could certainly expose a bunch of functions, but I think that would > actually be a bigger maintenance burden for us than just exposing some > of the details that are currently private to numeric.c. This whole argument is contingent on details that haven't been provided, namely exactly what it is that Ed wants to do that he can't do today. I think we should investigate that before deciding that publishing previously-private detail is the best solution. > Also, this seems to me to be holding the numeric data type to a > different standard than other things. By that argument, we should move every declaration in every .c file into c.h and be done. I'd personally be happier if we had *not* exposed the other data structure details you mention, but that ship has sailed. If we do do what you're advocating, I'd at least insist that the declarations go into a new file numeric_internal.h, so that it's clear to all concerned that they're playing with fire if they depend on that stuff. regards, tom lane
Re: Remove hardcoded hash opclass function signature exceptions
Peter Eisentraut writes: > On 06.09.24 21:43, Tom Lane wrote: >> * I don't really like the new control structure, or rather lack of >> structure, in hashvalidate. In particular the uncommented >> s/break/continue/ changes look like typos. They aren't, but can't >> you do this in a less confusing fashion? Or at least add comments >> like "continue not break because the code below the switch doesn't >> apply to this case". > Ok, I cleaned that up a bit. That looks nicer. Thanks. >> I wish we could get rid of those, but according to >> codesearch.debian.net, postgis and a couple of other extensions are >> relying on them. If we remove them we'll break any convenient upgrade >> path for those extensions. > Those are using the C function, which is ok. I was thinking about > removing the SQL function (from pg_proc.dat), because you can't use that > for much anymore. (You can't call it directly, and the hash AM will no > longer accept it.) I have done that in this patch version and added > some code comments around it. No, it isn't okay. What postgis (and the others) is doing is equivalent to regression=# create function myhash(bytea) returns int as 'hashvarlena' LANGUAGE 'internal' IMMUTABLE STRICT PARALLEL SAFE; CREATE FUNCTION After applying the v2 patch, I get regression=# create function myhash(bytea) returns int as 'hashvarlena' LANGUAGE 'internal' IMMUTABLE STRICT PARALLEL SAFE; ERROR: there is no built-in function named "hashvarlena" The reason is that the fmgr_builtins table is built from pg_proc.dat, and only names appearing in it can be used as 'internal' function definitions. So you really can't remove the pg_proc entry. The other thing that's made from pg_proc.dat is the list of extern function declarations in fmgrprotos.h. That's why you had to add those cowboy declarations inside hashfunc.c, which are both ugly and not helpful for any external module that might wish to call those functions at the C level. 
Other than the business about removing those pg_proc entries, I think this is good to go. regards, tom lane
Re: access numeric data in module
Robert Haas writes: > On Mon, Sep 9, 2024 at 10:14 AM Tom Lane wrote: >> It's intentional that that stuff is not exposed, so no. >> What actual functionality do you need that numeric.h doesn't expose? > I don't agree with this response at all. It seems entirely reasonable > for third-party code to want to have a way to construct and interpret > numeric datums. Keeping the details private would MAYBE make sense if > the internal details were changing release to release, but that's > clearly not the case. We have changed numeric's internal representation in the past, and I'd like to keep the freedom to do so again. There's been discussion for example of reconsidering the choice of NBASE to make more sense on 64-bit hardware. Yeah, maintaining on-disk compatibility limits what we can do there, but not as much as if some external module is in bed with the representation. > Even if it were, an extension author is > completely entitled to say "hey, I'd rather have access to an unstable > API and update my code for new releases" and we should accommodate > that. If we don't, people don't give up on writing the code that they > want to write -- they just cut-and-paste private declarations/code > into their own source tree, which is WAY worse than if we just put the > stuff in a .h file. IMO it'd be a lot better if numeric.c exposed whatever functionality Ed feels is missing, while keeping the contents of a numeric opaque. regards, tom lane
Re: SPI_connect, SPI_connect_ext return type
I wrote: > I too think it's good to go. If no one complains or asks for > more time to review, I will push it Monday or so. And done. regards, tom lane
Re: Retire support for OpenSSL 1.1.1 due to raised API requirements
Daniel Gustafsson writes: > The patchset in https://commitfest.postgresql.org/49/5025/ which adds support > for configuring cipher suites in TLS 1.3 handshakes require an API available > in > OpenSSL 1.1.1 and onwards. With that as motivation I'd like to propose that > we > remove support for OpenSSL 1.1.0 and set the minimum required version to > 1.1.1. > OpenSSL 1.1.0 was EOL in September 2019 and was never an LTS version, so it's > not packaged in anything anymore AFAICT and should be very rare in production > use in conjunction with an updated postgres. 1.1.1 LTS will be 2 years EOL by > the time v18 ships so I doubt this will be all that controversial. Yeah ... the alternative would be to conditionally compile the new functionality. That doesn't seem like a productive use of developer time if it's supporting just one version that should be extinct in the wild by now. regards, tom lane
Re: access numeric data in module
Ed Behn writes: > I want to be able to have a function received and/or return numeric data. > However, I'm having trouble getting data from Datums and building Datums to > return. numeric.h does not contain the macros to do this. They are in > numeric.c. > Is there a way to access the values in the numeric structures? If not, > would a PR to move macros to numeric.h be welcome in the next commitfest? It's intentional that that stuff is not exposed, so no. What actual functionality do you need that numeric.h doesn't expose? regards, tom lane
Re: Jargon and acronyms on this mailing list
David Rowley writes: > I think HEAD is commonly misused to mean master instead of the latest > commit of the current branch. I see the buildfarm even does that. > Thanks for getting that right in your blog post. IIRC, HEAD *was* the technically correct term back when we were using CVS. Old habits die hard. regards, tom lane
Re: Test improvements and minor code fixes for formatting.c.
I wrote: > [ v1-formatting-test-improvements.patch ] Meh ... cfbot points out I did the float-to-int conversions wrong. v2 attached. regards, tom lane

[attachment: v2 patch against doc/src/sgml/func.sgml and src/backend/utils/adt/formatting.c — documents V as marking the position of an implicit decimal point, comments and restructures int_to_roman() (valid range 1..3999, otherwise a string of '#', matching Oracle's limit), and makes the numeric, int8, float4, and float8 to_char paths clamp out-of-range values to PG_INT32_MAX before calling int_to_roman(); the diff is mangled and truncated in this archive]
Re: [PATCH] Add roman support for to_number function
I wrote: > * Further to Aleksander's point about lack of test coverage for > the to_char direction, I see from > https://coverage.postgresql.org/src/backend/utils/adt/formatting.c.gcov.html > that almost none of the existing roman-number code paths are covered > today. While it's not strictly within the charter of this patch > to improve that, maybe we should try to do so --- at the very > least it'd raise formatting.c's coverage score a few notches. For the archives' sake: I created a patch and a separate discussion thread [1] for that. regards, tom lane [1] https://www.postgresql.org/message-id/flat/2956175.1725831...@sss.pgh.pa.us
Test improvements and minor code fixes for formatting.c.
Over in the thread about teaching to_number() to handle Roman numerals [1], it was noted that we currently have precisely zero test coverage for to_char()'s existing Roman-numeral code, except in the timestamp code path which shares nothing with the numeric code path. In looking at this, I found that there's also no test coverage for the EEEE, V, or PL format codes. Also, the possibility of overflow while converting an input value to int in order to pass it to int_to_roman was ignored. Attached is a patch that adds more test coverage and cleans up the Roman-numeral code a little bit. BTW, I also discovered that there is a little bit of support for a "B" format code: we can parse it, but then we ignore it. And it's not documented. Oracle defines this code as a flag that: Returns blanks for the integer part of a fixed-point number when the integer part is zero (regardless of zeros in the format model). It doesn't seem super hard to implement that, but given the complete lack of complaints about it being missing, maybe we should just rip out the incomplete support instead? regards, tom lane [1] https://www.postgresql.org/message-id/flat/CAMWA6ybh4M1VQqpmnu2tfSwO%2B3gAPeA8YKnMHVADeB%3DXDEvT_A%40mail.gmail.com

[attachment: v1 patch against doc/src/sgml/func.sgml and src/backend/utils/adt/formatting.c, covering the same ground as the superseding v2; the diff is mangled and truncated in this archive]
Re: SPI_connect, SPI_connect_ext return type
Stepan Neretin writes: >> This combines portions of Stepan's >> two patches with some additional work (mostly, that he'd missed fixing >> any of contrib/). > Thank you for the feedback! I believe the patch looks satisfactory. Should > we await a review? The changes seem straightforward to me. I too think it's good to go. If no one complains or asks for more time to review, I will push it Monday or so. regards, tom lane
Re: [PATCH] Add roman support for to_number function
Maciek Sakrejda writes: > Tested again, and the patch looks good. It does not accept leading or > trailing whitespace, which seems reasonable, given the unclear behavior of > to_number with other format strings. It also rejects less common Roman > spellings like "". I don't feel strongly about that one way or the other, > but perhaps a test codifying that behavior would be useful to make it clear > it's intentional. Yeah, I don't have a strong feeling about that either, but probably being strict is better. to_number has a big problem with "garbage in garbage out" already, and being lax will make that worse. A few notes from a quick read of the patch: * roman_to_int() should have a header comment, notably explaining its result convention. I find it fairly surprising that "0" means an error --- I realize that Roman notation can't represent zero, but wouldn't it be better to use -1? * Do *not* rely on toupper(). There are multiple reasons why not, but in this context the worst one is that in Turkish locales it's likely to translate "i" to "İ", on which you will fail. I'd use pg_ascii_toupper(). * I think roman_to_int() is under-commented internally too. To start with, where did the magic "15" come from? And why should we have such a test anyway --- what if the format allows for trailing stuff after the roman numeral? (That is, I think you need to fix this so that roman_to_int reports how much data it ate, instead of assuming that it must read to the end of the input.) The logic for detecting invalid numeral combinations feels a little opaque too. Do you have a source for the rules you're implementing, and if so could you cite it? * This code will get run through pgindent before commit, so you might want to revisit whether your comments will still look nice afterwards. There's not a lot of consistency in them about initial cap versus lower case or trailing period versus not, too. * roman_result could be declared where used, no? 
* I'm not really convinced that we need a new errcode ERRCODE_INVALID_ROMAN_NUMERAL rather than using a generic one like ERRCODE_INVALID_TEXT_REPRESENTATION. However, if we do do it like that, why did you pick the out-of-sequence value 22P07? * Might be good to have a few test cases demonstrating what happens when there's other stuff combined with the RN format spec. Notably, even if RN itself won't eat whitespace, there had better be a way to write the format string to allow that. * Further to Aleksander's point about lack of test coverage for the to_char direction, I see from https://coverage.postgresql.org/src/backend/utils/adt/formatting.c.gcov.html that almost none of the existing roman-number code paths are covered today. While it's not strictly within the charter of this patch to improve that, maybe we should try to do so --- at the very least it'd raise formatting.c's coverage score a few notches. regards, tom lane
Re: pgstattuple: fix free space calculation
I wrote: > Now alternatively you could argue that a "new" page isn't usable free > space yet and so we should count it as zero, just as we don't count > dead tuples as usable free space. You need VACUUM to turn either of > those things into real free space. But that'd be a bigger definitional > change, and I'm not sure we want it. Thoughts? On the third hand: the code in question is in statapprox_heap, which is presumably meant to deliver numbers comparable to pgstat_heap. And pgstat_heap takes no special care for "new" pages, it just applies PageGetHeapFreeSpace (or PageGetExactFreeSpace after this patch). So that leaves me feeling pretty strongly that this whole stanza is wrong and we should just do PageGetExactFreeSpace here. regards, tom lane
Re: Undocumented functions
Marcos Pegoraro writes: > Example, elem_contained_by_range is not documented. I know I can use > select 2 <@ '[1,3]'::int4range > But why is that function not documented ? Functions that are primarily meant to implement operators are normally not documented separately: we feel it would bloat the docs without adding a lot. There are pg_description entries for them, eg

regression=# \df+ elem_contained_by_range
                                                                                          List of functions
   Schema   |          Name           | Result data type | Argument data types  | Type | Volatility | Parallel |  Owner   | Security | Access privileges | Language |      Internal name      |          Description
------------+-------------------------+------------------+----------------------+------+------------+----------+----------+----------+-------------------+----------+-------------------------+-------------------------------
 pg_catalog | elem_contained_by_range | boolean          | anyelement, anyrange | func | immutable  | safe     | postgres | invoker  |                   | internal | elem_contained_by_range | implementation of <@ operator
(1 row)

I think pg_relation_is_updatable is primarily meant as support for the information_schema views, which may explain why it's not in the docs either. There's less of a formal policy about functions underlying system views, but the majority of them probably aren't documented. regards, tom lane
Re: pgstattuple: fix free space calculation
Rafia Sabih writes: > On Thu, 29 Aug 2024 at 16:53, Frédéric Yhuel > wrote: >> So I think we should just use PageGetExactFreeSpace(). >> >> Here is a v3 patch. It's the same as v2, I only removed the last >> paragraph in the commit message. > Thanks for the new patch. LGTM. I looked at this patch. I agree with making the change. However, I don't agree with the CF entry's marking of "target version: stable" (i.e., requesting back-patch). I think this falls somewhere in the gray area between a bug fix and a definitional change. Also, people are unlikely to be happy if they suddenly get new, not-comparable numbers after a minor version update. So I think we should just fix it in HEAD. As far as the patch itself goes, the one thing that is bothering me is this comment change

 	/*
-	 * It's not safe to call PageGetHeapFreeSpace() on new pages, so we
+	 * It's not safe to call PageGetExactFreeSpace() on new pages, so we
 	 * treat them as being free space for our purposes.
 	 */

which looks like it wasn't made with a great deal of thought. Now it seems to me that the comment was already bogus when written: there isn't anything uncertain about what will happen if you call either of these functions on a "new" page. PageIsNew checks for

	return ((PageHeader) page)->pd_upper == 0;

If pd_upper is 0, PageGet[Exact]FreeSpace is absolutely guaranteed to return zero, even if pd_lower contains garbage. And then PageGetHeapFreeSpace will likewise return zero. Perhaps there could be trouble if we got into the line-pointer-checking part of PageGetHeapFreeSpace, but we can't. So this comment is wrong, and is even more obviously wrong after the above change. I thought for a moment about removing the PageIsNew test altogether, but then I decided that it probably *is* what we want and is just mis-explained. I think the comment should read more like

	/*
	 * PageGetExactFreeSpace() will return zero for a "new" page,
	 * but it's actually usable free space, so count it that way.
	 */

Now alternatively you could argue that a "new" page isn't usable free space yet and so we should count it as zero, just as we don't count dead tuples as usable free space. You need VACUUM to turn either of those things into real free space. But that'd be a bigger definitional change, and I'm not sure we want it. Thoughts? Also, do we need any documentation change for this? I looked through https://www.postgresql.org/docs/devel/pgstattuple.html and didn't see anything that was being very specific about what "free space" means, so maybe it's fine as-is. regards, tom lane
Re: pg_trgm comparison bug on cross-architecture replication due to different char implementation
Noah Misch writes: > On Fri, Sep 06, 2024 at 06:36:53PM -0400, Tom Lane wrote: >> I feel like all of these are leaning heavily on users to get it right, > What do you have in mind? I see things for the pg_upgrade implementation to > get right, but I don't see things for pg_upgrade users to get right. Well, yeah, if you are willing to put pg_upgrade in charge of executing ALTER EXTENSION UPDATE, then that would be a reasonably omission-proof path. But we have always intended the pg_upgrade process to *not* auto-update extensions, so I'm concerned about the potential side-effects of drilling a hole in that policy. As an example: even if we ensure that pg_trgm 1.6 to 1.7 is totally transparent except for this fix, what happens if the user's old database is still on some pre-1.6 version? Is it okay to force an update that includes feature upgrades? regards, tom lane
Re: pg_trgm comparison bug on cross-architecture replication due to different char implementation
Noah Misch writes: > Yes, that's one way to make it work. If we do it that way, it would be > critical to make the ALTER EXTENSION UPDATE run before anything uses the > index. Otherwise, we'll run new v18 "signed char" code on a v17 "unsigned > char" data file. Running ALTER EXTENSION UPDATE early enough should be > feasible, so that's fine. Some other options: > - If v18 "pg_dump -b" decides to emit CREATE INDEX ... gin_trgm_ops_unsigned, > then make it also emit the statements to create the opclass. > - Ship pg_trgm--1.6--1.7.sql in back branches. If the upgrade wants to use > gin_trgm_ops_unsigned, require the user to ALTER EXTENSION UPDATE first. > (In back branches, the C code behind gin_trgm_ops_unsigned could just raise > an error if called.) I feel like all of these are leaning heavily on users to get it right, and therefore have a significant chance of breaking use-cases that were perfectly fine before. Remind me of why we can't do something like this:

* Add APIs that allow opclasses to read/write some space in the GIN metapage. (We could get ambitious and add such APIs for other AMs too, but doing it just for GIN is probably a prudent start.) Ensure that this space is initially zeroed.

* In gin_trgm_ops, read a byte of this space and interpret it as

	0 = unset
	1 = signed chars
	2 = unsigned chars

If the value is zero, set the byte on the basis of the native char-signedness of the current platform. (Obviously, a secondary server couldn't write the byte, and would have to wait for the primary to update the value. In the meantime, it's no more broken than today.)

* Subsequently, use either signed or unsigned comparisons based on that value.

This would automatically do the right thing in all cases that work today, and it'd provide the ability for cross-platform replication to work in future. 
You can envision cases where transferring a pre-existing index to a platform of the other stripe would misbehave, but they're the same cases that fail today, and the fix remains the same: reindex. regards, tom lane
Re: Remove hardcoded hash opclass function signature exceptions
Peter Eisentraut writes: > hashvalidate(), which validates the signatures of support functions for > the hash AM, contains several hardcoded exceptions. > ... > This patch removes those exceptions by providing new support functions > that have the proper declared signatures. They internally share most of > the C code with the "wrong" functions they replace, so the behavior is > still the same. +1 for cleaning this up. A couple of minor nitpicks: * I don't really like the new control structure, or rather lack of structure, in hashvalidate. In particular the uncommented s/break/continue/ changes look like typos. They aren't, but can't you do this in a less confusing fashion? Or at least add comments like "continue not break because the code below the switch doesn't apply to this case". * Hand-picking OIDs as you did in pg_proc.dat is kind of passé now. I guess it's all right as long as nobody else does the same thing in the near future, but ... > Not done here, but maybe hashvarlena() and hashvarlenaextended() should > be removed from pg_proc.dat, since their use as opclass support > functions is now dubious. I wish we could get rid of those, but according to codesearch.debian.net, postgis and a couple of other extensions are relying on them. If we remove them we'll break any convenient upgrade path for those extensions. regards, tom lane
Re: [PATCH] TODO “Allow LISTEN on patterns”
Alexander Cheshev writes: > [ v4-0001-Support-wildcards-in-LISTEN-command.patch ] I had not been paying much if any attention to this thread. I assumed from the title that it had in mind to allow something like LISTEN "foo%bar"; where the parameter would be interpreted similarly to a LIKE pattern. I happened to look at the current state of affairs and was rather astonished to find how far off those rails the proposal has gotten. Where is the field demand for N-part channel names? If we do accept that, how well do you think it's going to work to continue to constrain the total name length to NAMEDATALEN? Why, if you're building a thousand-line patch, would you have arbitrary pattern restrictions like "% can only appear at the end of a name part"? What makes you think it's okay to randomly change around unrelated parts of the grammar, scansup.c, etc? (The potential side-effects of that scare me quite a bit: even if you didn't break anything, the blast radius that a reviewer has to examine is very large.) I've also got serious doubts about the value of the trie structure you're building to try to speed up name matching. I haven't seen any evidence that channel name matching is a bottleneck in NOTIFY processing (in a quick test with "perf", it's not even visible). I do think the net effect of a patch like this would be to slow things down, but mostly because it would encourage use of overly-verbose channel names and thereby increase the amount of data passing through the notify SLRU. I think this is dramatically over-engineered and you ought to start over with a much simpler concept. The fact that one person ten years ago asked for something that used exactly ASP.NET's notation doesn't mean that that's exactly how we need to do it. (There's a separate discussion to be had about whether the whole issue is really worth bothering with, given the rather low field demand. But it'd be a lot easier to justify a hundred-line patch than this thing.) regards, tom lane
Re: [BUG?] XMLSERIALIZE( ... INDENT) won't work with blank nodes
Jim Jones writes: > mmh... xmlDocContentDumpOutput seems to add a trailing newline in the > end of a document by default, making the serialization of the same xml > string with DOCUMENT and CONTENT different: Does seem a bit inconsistent. > Or should we in this case consider something like this in > xmltotext_with_options()? > result = cstring_to_text_with_len((const char *) xmlBufferContent(buf), > xmlBufferLength(buf) - 1); I think it'd be quite foolish to assume that every extant and future version of libxml2 will share this glitch. Probably should use logic more like pg_strip_crlf(), although we can't use that directly. Would it ever be the case that trailing whitespace would be valid data? In a bit of testing, it seems like that could be true in CONTENT mode but not DOCUMENT mode. regards, tom lane
Re: Remove one TimestampTzGetDatum call in pg_stat_get_io()
I wrote: > Hmm, TimestampTzGetDatum is not a no-op on 32-bit machines. If you're > correct about this, why are our 32-bit BF animals not crashing? Are > we failing to test this code? Oh, I had the polarity backwards: this error doesn't result in trying to dereference something that's not a pointer, but rather in constructing an extra indirection layer, with the end result being that the timestamp displayed in the pg_stat_io view is garbage (I saw output like "1999-12-31 19:11:45.880208-05" while testing in a 32-bit VM). It's not so surprising that our regression tests are insensitive to the values being displayed there. I confirm that this fixes the problem. Will push shortly. regards, tom lane
Re: Remove one TimestampTzGetDatum call in pg_stat_get_io()
Bertrand Drouvot writes: > While working on the per backend I/O statistics patch ([1]), I noticed that > there is an unnecessary call to TimestampTzGetDatum() in pg_stat_get_io() ( > as the reset_time is already a Datum). Hmm, TimestampTzGetDatum is not a no-op on 32-bit machines. If you're correct about this, why are our 32-bit BF animals not crashing? Are we failing to test this code? regards, tom lane
Re: SPI_connect, SPI_connect_ext return type
Peter Eisentraut writes: > On 10.08.24 16:12, Tom Lane wrote: >> We go to a lot of effort to keep the SPI API as stable as we can >> across major versions, so I don't see why we'd just randomly make >> an API-breaking change like this. > Here is a previous discussion: > https://www.postgresql.org/message-id/flat/1356682025.20017.4.camel%40vanquo.pezone.net > I like the idea that we would keep the API but convert most errors to > exceptions. After thinking about this for awhile, one reason that it's practical to change it today for SPI_connect is that SPI_connect has not returned anything except SPI_OK_CONNECT since v10. So if we tell extension authors they don't need to check the result, it's unlikely that that will cause any new code they write to get used with PG versions where it would be wrong. I fear that we'd need a similar multi-year journey to get to a place where we could deprecate checking the result of any other SPI function. Nonetheless, there seems no reason not to do it now for SPI_connect. So attached is a patch that documents the result value as vestigial and removes the calling-code checks in our own code, but doesn't touch SPI_connect[_ext] itself. This combines portions of Stepan's two patches with some additional work (mostly, that he'd missed fixing any of contrib/). 
regards, tom lane

diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index 755293456f..c1c82eb4dd 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -2377,9 +2377,7 @@ get_tuple_of_interest(Relation rel, int *pkattnums, int pknumatts, char **src_pk
 	/*
 	 * Connect to SPI manager
 	 */
-	if ((ret = SPI_connect()) < 0)
-		/* internal error */
-		elog(ERROR, "SPI connect failure - returned %d", ret);
+	SPI_connect();
 
 	initStringInfo(&buf);
diff --git a/contrib/spi/refint.c b/contrib/spi/refint.c
index 18062eb1cf..e1aef7cd2a 100644
--- a/contrib/spi/refint.c
+++ b/contrib/spi/refint.c
@@ -108,9 +108,7 @@ check_primary_key(PG_FUNCTION_ARGS)
 	tupdesc = rel->rd_att;
 	/* Connect to SPI manager */
-	if ((ret = SPI_connect()) < 0)
-		/* internal error */
-		elog(ERROR, "check_primary_key: SPI_connect returned %d", ret);
+	SPI_connect();
 	/*
 	 * We use SPI plan preparation feature, so allocate space to place key
@@ -328,9 +326,7 @@ check_foreign_key(PG_FUNCTION_ARGS)
 	tupdesc = rel->rd_att;
 	/* Connect to SPI manager */
-	if ((ret = SPI_connect()) < 0)
-		/* internal error */
-		elog(ERROR, "check_foreign_key: SPI_connect returned %d", ret);
+	SPI_connect();
 	/*
 	 * We use SPI plan preparation feature, so allocate space to place key
diff --git a/contrib/tablefunc/tablefunc.c b/contrib/tablefunc/tablefunc.c
index 7d1b5f5143..2a25607a2a 100644
--- a/contrib/tablefunc/tablefunc.c
+++ b/contrib/tablefunc/tablefunc.c
@@ -385,9 +385,7 @@ crosstab(PG_FUNCTION_ARGS)
 	per_query_ctx = rsinfo->econtext->ecxt_per_query_memory;
 	/* Connect to SPI manager */
-	if ((ret = SPI_connect()) < 0)
-		/* internal error */
-		elog(ERROR, "crosstab: SPI_connect returned %d", ret);
+	SPI_connect();
 	/* Retrieve the desired rows */
 	ret = SPI_execute(sql, true, 0);
@@ -724,9 +722,7 @@ load_categories_hash(char *cats_sql, MemoryContext per_query_ctx)
 				HASH_ELEM | HASH_STRINGS | HASH_CONTEXT);
 	/* Connect to SPI manager */
-	if ((ret = SPI_connect()) < 0)
-		/* internal error */
-		elog(ERROR, "load_categories_hash: SPI_connect returned %d", ret);
+	SPI_connect();
 	/* Retrieve the category name rows */
 	ret = SPI_execute(cats_sql, true, 0);
@@ -806,9 +802,7 @@ get_crosstab_tuplestore(char *sql,
 	tupstore = tuplestore_begin_heap(randomAccess, false, work_mem);
 	/* Connect to SPI manager */
-	if ((ret = SPI_connect()) < 0)
-		/* internal error */
-		elog(ERROR, "get_crosstab_tuplestore: SPI_connect returned %d", ret);
+	SPI_connect();
 	/* Now retrieve the crosstab source rows */
 	ret = SPI_execute(sql, true, 0);
@@ -1151,15 +1145,11 @@ connectby(char *relname,
 				  AttInMetadata *attinmeta)
 {
 	Tuplestorestate *tupstore = NULL;
-	int			ret;
 	MemoryContext oldcontext;
-	int			serial = 1;
 	/* Connect to SPI manager */
-	if ((ret = SPI_connect()) < 0)
-		/* internal error */
-		elog(ERROR, "connectby: SPI_connect returned %d", ret);
+	SPI_connect();
 	/* switch to longer term context to create the tuple store */
 	oldcontext = MemoryContextSwitchTo(per_query_ctx);
diff --git a/contrib/xml2/xpath.c b/contrib/xml2/xpath.c
index b999b1f706..f94b622d92 100644
--- a/contrib/xml2/xpath.c
+++ b/contrib/xml2/xpath.c
@@ -560,8 +560,7 @@ xpath_table(PG_FUNCTION_ARGS)
 				relname, condition);
-	if ((ret = SPI_connect()) < 0)
-		elog(ERROR, "xpath_table: SPI_connect returned %d", ret);
+	SPI_connect();
 	if ((ret = SPI_exec(query_buf.dat
Re: Add GiST support for mixed-width integer operators
Paul Jungwirth writes: > On 7/6/24 05:04, Andrey M. Borodin wrote:>> On 5 Jul 2024, at 23:46, Paul > Jungwirth > wrote: >>> this commit adds support for all combinations of int2/int4/int8 for all >>> five btree operators (</<=/=/>=/>). Perhaps I'm missing something, but how can this possibly work without any changes to the C code? For example, gbt_int4_consistent assumes that the comparison value is always an int4. Due to the way we represent Datums in-memory, this will kind of work if it's really an int2 or int8 --- unless the comparison value doesn't fit in int4, and then you will get a completely wrong answer based on a value truncated to int4. (But I would argue that even the cases where it seems to work are a type violation, and getting the right answer is accidental if you have not applied the correct PG_GETARG conversion macro.) Plus, int4-vs-int8 comparisons will fail in very obvious ways, up to and including core dumps, on 32-bit machines where int8 is pass-by-reference. > Here is another patch adding float4/8 and also date/timestamp/timestamptz, in > the same combinations > as btree. Similar problems, plus comparing timestamp to timestamptz requires a timezone conversion that this code isn't doing. I think to make this work, you'd need to define a batch of new opclass-specific strategy numbers that convey both the kind of comparison to perform and the type of the RHS value. And then there would need to be a nontrivial amount of new code in the consistent functions to deal with cross-type cases. regards, tom lane
Re: change regexp_substr first argument make tests more easier to understand.
Ilia Evdokimov writes: > Current tests with regexp_instr() and regexp_substr() with string > 'abcabcabc' are really unreadable and you would spend time to understand > that happens in these tests and if they are really correct. I'd better > change them into "abcdefghi" just like in query > SELECT regexp_substr('abcdefghi', 'd.q') IS NULL AS t; On looking more closely at these test cases, I think the point of them is exactly to show the behavior of the functions with multiple copies of the target substring. Thus, what Jian is proposing breaks the tests: it's no longer perfectly clear whether the result is because the function did what we expect, or because the pattern failed to match anywhere else. (Sure, "a.c" *should* match "aXc", but if it didn't, you wouldn't discover that from this test.) What Ilia proposes would break them worse. I think we should just reject this patch, or at least reject the parts of it that change existing test cases. I have no opinion about whether the new test cases add anything useful. regards, tom lane
Re: Invalid "trailing junk" error message when non-English letters are used
Karina Litskevich writes: > On Thu, Sep 5, 2024 at 6:07 PM Karina Litskevich > wrote: >> In v3 of the patch I grouped all the *_junk rules together and included >> the suggested comment with a little added something. > Oops, I forgot to attach the patch, here it is. Pushed with a bit of further wordsmithing on the comment. I left out the proposed new test case "SELECT 1ä;". The trouble with that is it'd introduce an encoding dependency into the test. For example, it'd likely fail with some other error message in a server encoding that lacks an equivalent to UTF8 "ä". While we have methods for coping with such cases, it requires some pushups, and I didn't see the value. The changes in existing test case results are sufficient to show the patch does what we want. Also, while the bug exists in v15, the patch didn't apply at all. I got lazy and just did the minimal s/ident_start/identifier/ change in that branch, instead of back-patching the cosmetic aspects. regards, tom lane
Re: Make query cancellation keys longer
Jacob Champion writes: > Has there been any work/discussion around not sending the cancel key > in plaintext from psql? It's not a prerequisite or anything (the > longer length is a clear improvement either way), but it seems odd > that this longer "secret" is still just going to be exposed on the > wire when you press Ctrl+C. Wasn't this already addressed in v17, by Author: Alvaro Herrera 2024-03-12 [61461a300] libpq: Add encrypted and non-blocking query cancellation ? Perhaps we need to run around and make sure none of our standard clients use the old API anymore, but the libpq infrastructure is there already. regards, tom lane
Re: updatable view set default interact with base rel generated stored columns
jian he writes: > --- > drop table if exists base_tbl cascade; > CREATE TABLE base_tbl (a int, b int GENERATED ALWAYS AS (22) stored, d > int default 22); > create view rw_view1 as select * from base_tbl; > insert into rw_view1(a) values (12) returning *; > alter view rw_view1 alter column b set default 11.1; > insert into rw_view1(a,b,d) values ( 12, default,33) returning *; > insert into rw_view1(a,d) values (12,33) returning *; > insert into rw_view1 default values returning *; > SELECT events & 4 != 0 AS can_upd, > events & 8 != 0 AS can_ins, > events & 16 != 0 AS can_del > FROM pg_catalog.pg_relation_is_updatable('rw_view1'::regclass, false) > t(events); > --- I don't really see anything wrong here. Yeah, you broke insertions into the view yet it still claims to be updatable. But there is nothing about the view that makes it not-updatable; it's something that happens at runtime in the base table that is problematic. If we try to detect all such cases we'll be solving the halting problem. That is, I don't see any functional difference between this example and, say, a default value attached to the view that violates a CHECK constraint of the base table. regards, tom lane
Re: Role Granting Issues in PostgreSQL: Need Help
"David G. Johnston" writes: > On Wednesday, September 4, 2024, Muhammad Imtiaz > wrote: >> replication_expert | Cannot login > Those are not permissions, they are attributes, and attributes are not > inherited. Specifically: the NOLOGIN attribute on a role is a hard block on logging in with that role, independently of any and every other condition. regards, tom lane
Re: Invalid "trailing junk" error message when non-English letters are used
Karina Litskevich writes: > I see the two solutions here: either move the rule for decinteger_junk > below the rules for hexinteger_junk, octinteger_junk and bininteger_junk, > or just use a single rule decinteger_junk for all these cases, since the > error message is the same anyway. I implemented the latter in the second > version of the patch, also renamed this common rule to integer_junk. That seems reasonable, but IMO this code was unacceptably undercommented before and what you've done has made it worse. We really need a comment block associated with the flex macros, perhaps along the lines of

	/*
	 * An identifier immediately following a numeric literal is disallowed
	 * because in some cases it's ambiguous what is meant: for example,
	 * 0x1234 could be either a hexinteger or a decinteger "0" and an
	 * identifier "x1234". We can detect such problems by seeing if
	 * integer_junk matches a longer substring than any of the XXXinteger
	 * patterns. (One "junk" pattern is sufficient because this will match
	 * all the same strings we'd match with {hexinteger}{identifier} etc.)
	 * Note that the rule for integer_junk must appear after the ones for
	 * XXXinteger to make this work correctly.
	 */

(Hmm, actually, is that last sentence true? My flex is a bit rusty.) param_junk really needs a similar comment, or maybe we could put all the XXX_junk macros together and use one comment for all. > Additionally, I noticed that this patch is going to change error messages > in some cases, though I don't think it's a big deal. Example: > Without patch: > postgres=# select 0xyz; > ERROR: invalid hexadecimal integer at or near "0x" > With patch: > postgres=# select 0xyz; > ERROR: trailing junk after numeric literal at or near "0xyz" That's sort of annoying, but I don't really see a better way, or at least not one that's worth the effort. 
>> FWIW output of the whole string in the error message doesnt' look nice to >> me, but other places of code do this anyway e.g: >> select ('1'||repeat('p',100))::integer; >> This may be worth fixing. I think this is nonsense: we are already in the habit of repeating the whole failing query string in the STATEMENT field. In any case it's not something for this patch to worry about. regards, tom lane
Re: psql: fix variable existence tab completion
"Anton A. Melnikov" writes: > On 19.07.2024 01:10, Tom Lane wrote: >> With respect to the other hacks Alexander mentions, maybe we >> could clean some of those out too? I don't recall what platform >> we had in mind there, but we've moved our goalposts on what >> we support pretty far in the last couple years. > Agreed that no reason to save workarounds for non-supported systems. > Here is the patch that removes fixes for Buster bug mentioned above. Pushed. I shall now watch the buildfarm from a safe distance. regards, tom lane
Re: Large expressions in indexes can't be stored (non-TOASTable)
Nathan Bossart writes: > Thanks to commit 96cdeae, only a few catalogs remain that are missing TOAST > tables: pg_attribute, pg_class, pg_index, pg_largeobject, and > pg_largeobject_metadata. I've attached a short patch to add one for > pg_index, which resolves the issue cited here. This passes "check-world" > and didn't fail for a few ad hoc tests (e.g., VACUUM FULL on pg_index). I > haven't spent too much time investigating possible circularity issues, but > I'll note that none of the system indexes presently use the indexprs and > indpred columns. Yeah, the possibility of circularity seems like the main hazard, but I agree it's unlikely that the entries for system indexes could ever need out-of-line storage. There are many other things that would have to be improved before a system index could use indexprs or indpred. regards, tom lane
Re: gamma() and lgamma() functions
I wrote: > AFAICS this patch doesn't inspect signgam, so whether it gets > overwritten by a concurrent thread wouldn't matter. However, > it'd be a good idea to add a comment noting the hazard. Further to that ... I looked at POSIX issue 8 (I had been reading 7) and found this illuminating discussion:

	Earlier versions of this standard did not require lgamma(), lgammaf(),
	and lgammal() to be thread-safe because signgam was a global variable.
	They are now required to be thread-safe to align with the ISO C standard
	(which, since the introduction of threads in 2011, requires that they
	avoid data races), with the exception that they need not avoid data
	races when storing a value in the signgam variable. Since signgam is
	not specified by the ISO C standard, this exception is not a conflict
	with that standard.

So the other reason to avoid using signgam is that it might not exist at all in some libraries. regards, tom lane
Re: gamma() and lgamma() functions
Dean Rasheed writes: > On Fri, 23 Aug 2024 at 13:40, Peter Eisentraut wrote: >> What are examples of where this would be useful in a database context? > Of course, there's a somewhat fuzzy line between what is generally > useful enough, and what is too specialised for core Postgres, but I > would argue that these qualify, since they are part of C99, and > commonly appear in other general-purpose math libraries like the > Python math module. Yeah, I think any math function that's part of C99 or POSIX is arguably of general interest. >> I'm not sure why you are doing the testing for special values (NaN etc.) >> yourself when the C library function already does it. For example, if I >> remove the isnan(arg1) check in your dlgamma(), then it still behaves >> the same in all tests. > It's useful to do that so that we don't need to assume that every > platform conforms to the POSIX standard, and because it simplifies the > later overflow checks. This is consistent with the approach taken in > other functions, such as dexp(), dsin(), dcos(), etc. dexp() and those other functions date from the late stone age, before it was safe to assume that libm's behavior matched the POSIX specs. Today I think we can assume that, at least till proven differently. There's not necessarily anything wrong with what you've coded, but I don't buy this argument for it. >> Btw., I'm reading that lgamma() in the C library is not necessarily >> thread-safe. Is that a possible problem? > It's not quite clear what to do about that. Per the Linux man page, the reason lgamma() isn't thread-safe is

	The lgamma(), lgammaf(), and lgammal() functions return the natural
	logarithm of the absolute value of the Gamma function. The sign of the
	Gamma function is returned in the external integer signgam declared in
	<math.h>. It is 1 when the Gamma function is positive or zero, -1 when
	it is negative.

AFAICS this patch doesn't inspect signgam, so whether it gets overwritten by a concurrent thread wouldn't matter. 
However, it'd be a good idea to add a comment noting the hazard. (Presumably, the reason POSIX says "need not be thread-safe" is that an implementation could make it thread-safe by making signgam thread-local, but the standard doesn't wish to mandate that.) regards, tom lane