Re: range_agg
Hi

On Tue, Jul 2, 2019 at 0:38, Jeff Davis wrote:
> On Fri, 2019-05-03 at 15:56 -0700, Paul Jungwirth wrote:
> > Hello,
> >
> > I wrote an extension to add a range_agg function with similar behavior
> > to existing *_agg functions, and I'm wondering if folks would like to
> > have it in core? Here is the repo:
> > https://github.com/pjungwir/range_agg
>
> This seems like a very useful extension, thank you.
>
> For getting into core though, it should be a more complete set of
> related operations. The patch is implicitly introducing the concept of
> a "multirange" (in this case, an array of ranges), but it's not making
> the concept whole.
>
> What else should return a multirange? This patch implements the union-
> agg of ranges, but we also might want an intersection-agg of ranges
> (that is, the set of ranges that are subranges of every input). Given
> that there are other options here, the name "range_agg" is too generic
> to mean union-agg specifically.
>
> What can we do with a multirange? A lot of range operators still make
> sense, like "contains" or "overlaps"; but "adjacent" doesn't quite
> work. What about new operations like inverting a multirange to get the
> gaps?
>
> Do we want to continue with the array-of-ranges implementation of a
> multirange, or do we want a first-class multirange concept that might
> eliminate the boilerplate around defining all of these operations?
>
> If we have a more complete set of operators here, the flags for
> handling overlapping ranges and gaps will be unnecessary.

I think the scope of this patch is correct. Introducing a set-of-ranges data
type, whether based on an array or not, should be a separate topic.

The question is naming: should this aggregate be named "range_agg", with a
future multirange aggregate named "multirange_agg"? Personally, I have no
problem with "range_agg", and no problem with it being based on the union
operation. It is true that only the result of a union can be represented as
ranges simply. When I think about a multirange result, there is a really large
set of possible algorithms for operations over two or three values.

So personally, I am satisfied with the scope of a simple range_agg function,
because I see its benefits, and I don't think this implementation blocks any
more complex designs in the future. How to implement a multirange is a really
big question, and right now I think a dedicated data type will be better than
possibly unordered arrays.

Regards

Pavel

> Regards,
> Jeff Davis
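As a point of reference for this thread, here is a conceptual sketch of what a
union-style range aggregate does. It assumes a range_agg(anyrange) aggregate
returning an array of ranges, matching the "array of ranges" description
above; the exact signature and overlap/gap options are defined by the
extension itself and are not confirmed here.

SELECT range_agg(r)
FROM (VALUES (int4range(1, 5)),
             (int4range(3, 8)),
             (int4range(20, 30))) AS t(r);
-- conceptually yields something like {"[1,8)","[20,30)"}: overlapping inputs
-- are merged, while disjoint inputs remain separate ranges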
Declared but not defined functions
Hi,

I realized that TransactionIdAbort is declared in transam.h but there is no
function body for it. As far as I found, there are three such functions in
total, located by the following script:

for func in `git ls-files | egrep "\w+\.h$" | xargs cat | egrep "extern \w+ \w+\(.*\);" | sed -e "s/.* \(.*\)(.*);/\1(/g"`
do
    if [ `git grep "$func" -- "*.c" | wc -l` -lt 1 ]; then
        echo $func
    fi
done

I think the following functions are mistakenly left in the header files, so
the attached patch removes them.

dsa_startup()
TransactionIdAbort()
renameatt_type()

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 6cbb0c82c7..33fd052156 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -209,7 +209,6 @@ extern PGDLLIMPORT VariableCache ShmemVariableCache;
 extern bool TransactionIdDidCommit(TransactionId transactionId);
 extern bool TransactionIdDidAbort(TransactionId transactionId);
 extern bool TransactionIdIsKnownCompleted(TransactionId transactionId);
-extern void TransactionIdAbort(TransactionId transactionId);
 extern void TransactionIdCommitTree(TransactionId xid, int nxids, TransactionId *xids);
 extern void TransactionIdAsyncCommitTree(TransactionId xid, int nxids, TransactionId *xids, XLogRecPtr lsn);
 extern void TransactionIdAbortTree(TransactionId xid, int nxids, TransactionId *xids);
diff --git a/src/include/commands/tablecmds.h b/src/include/commands/tablecmds.h
index b09afa2775..ac2bfaff1e 100644
--- a/src/include/commands/tablecmds.h
+++ b/src/include/commands/tablecmds.h
@@ -60,8 +60,6 @@ extern void SetRelationHasSubclass(Oid relationId, bool relhassubclass);
 
 extern ObjectAddress renameatt(RenameStmt *stmt);
 
-extern ObjectAddress renameatt_type(RenameStmt *stmt);
-
 extern ObjectAddress RenameConstraint(RenameStmt *stmt);
 
 extern ObjectAddress RenameRelation(RenameStmt *stmt);
diff --git a/src/include/utils/dsa.h b/src/include/utils/dsa.h
index ddd3cef8c2..991b62d28c 100644
--- a/src/include/utils/dsa.h
+++ b/src/include/utils/dsa.h
@@ -99,8 +99,6 @@ typedef pg_atomic_uint64 dsa_pointer_atomic;
  */
 typedef dsm_handle dsa_handle;
 
-extern void dsa_startup(void);
-
 extern dsa_area *dsa_create(int tranche_id);
 extern dsa_area *dsa_create_in_place(void *place, size_t size,
                                      int tranche_id, dsm_segment *segment);
Re: Add client connection check during the execution of the query
On Sat, Feb 9, 2019 at 6:16 AM  wrote:
> The purpose of this patch is to stop the execution of continuous
> requests in case of a disconnection from the client. In most cases, the
> client must wait for a response from the server before sending new data
> - which means there should not remain unread data on the socket and we
> will be able to determine a broken connection.
> Perhaps exceptions are possible, but I could not think of such a use
> case (except COPY). I would be grateful if someone could offer such
> cases or their solutions.
> I added a test for the GUC variable when the client connects via SSL,
> but I'm not sure that this test is really necessary.

Hello Sergey,

This seems like a reasonable idea to me. There is no point in running a
monster 24 hour OLAP query if your client has gone away. It's using MSG_PEEK
which is POSIX, and I can't immediately think of any reason why it's not safe
to try to peek at a byte in that socket at any time. Could you add a comment
to explain why you have !PqCommReadingMsg && !PqCommBusy?

The tests pass on a couple of different Unixoid OSes I tried. Is it really
necessary to do a bunch of IO and busy CPU work in $long_query? pg_sleep(60)
can do the job, because it includes a standard CHECK_FOR_INTERRUPTS/latch loop
that will loop around on SIGALRM. I just tested that your patch correctly
interrupts pg_sleep() if I kill -9 my psql process.

Why did you call the timeout SKIP_CLIENT_CHECK_TIMEOUT (I don't understand the
"SKIP" part)? Why not CLIENT_CONNECTION_CHECK_TIMEOUT or something?

I wonder if it's confusing to users that you get "connection to client lost"
if the connection goes away while running a query, but nothing if the
connection goes away without saying goodbye ("X") while idle.

The build fails on Windows. I think it's because src/tools/msvc/Mkvcbuild.pm
is trying to find something to compile under src/test/modules/connection, and
I think the solution is to add that to the variable @contrib_excludes. (I
wonder if that script should assume there is nothing to build instead of dying
with "Could not determine contrib module type for connection", otherwise every
Unix hacker is bound to get this wrong every time.)

https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.45820

Aside from that problem, my Windows CI building thing isn't smart enough to
actually run those extra tests yet, so I don't know if it actually works on
that platform yet (I really need to teach that thing to use the full buildfarm
scripts...)

--
Thomas Munro
https://enterprisedb.com
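Since the mechanism under discussion is peeking at the client socket with
MSG_PEEK, here is a minimal standalone sketch of how a server can notice a
vanished client that way. This is illustrative only and is not the code from
the patch (which hooks into PostgreSQL's pqcomm layer and a timeout);
MSG_DONTWAIT is used here simply to avoid blocking and is assumed to be
available on the target platform.

#include <stdbool.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Returns true if the peer appears to have gone away. */
static bool
client_connection_lost(int sock)
{
    char    c;
    ssize_t r = recv(sock, &c, 1, MSG_PEEK | MSG_DONTWAIT);

    if (r == 0)
        return true;    /* orderly shutdown: peer closed the connection */
    if (r < 0 && errno != EAGAIN && errno != EWOULDBLOCK && errno != EINTR)
        return true;    /* hard error on the socket, e.g. ECONNRESET */
    return false;       /* data is pending, or simply nothing to read yet */
}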
Re: [proposal] de-TOAST'ing using an iterator
On Thu, Jun 20, 2019 at 1:51 AM Binguo Bao wrote:
> Hi hackers!
> This proposal aims to provide the ability to de-TOAST a fully TOAST'd and
> compressed field using an iterator and then update the appropriate parts of
> the code to use the iterator where possible instead of de-TOAST'ing and
> de-compressing the entire value. Examples where this can be helpful include
> using position() from the beginning of the value, or doing a pattern or
> substring match.
>
> de-TOAST iterator overview:
> 1. The caller requests the slice of the attribute value from the de-TOAST
> iterator.
> 2. The de-TOAST iterator checks if there is a slice available in the output
> buffer; if there is, return the result directly, otherwise go to step 3.
> 3. The de-TOAST iterator checks if there is a slice available in the input
> buffer; if there is, go to step 4. Otherwise, call fetch_datum_iterator to
> fetch datums from disk into the input buffer.
> 4. If the data in the input buffer is compressed, extract some data from the
> input buffer to the output buffer until the caller's needs are met.
>
> I've implemented the prototype and applied it to the position() function to
> test performance.

Hi Binguo,

Interesting work, and nice performance improvements so far.

Just by the way, the patch currently generates warnings:

https://travis-ci.org/postgresql-cfbot/postgresql/builds/554345719

--
Thomas Munro
https://enterprisedb.com
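A rough pseudocode rendering of the four-step flow described above; the
function and field names are invented for illustration and are not the
identifiers used in the actual patch.

/* Serve one slice request from the de-TOAST iterator. */
while (!slice_available_in_output_buffer(iter))
{
    if (!data_available_in_input_buffer(iter))
        fetch_datum_iterate(iter);      /* step 3: read more TOAST chunks from disk */

    if (iter->compressed)
        decompress_some(iter);          /* step 4: decompress only as much as the caller needs */
    else
        copy_input_to_output(iter);
}
return slice_from_output_buffer(iter);  /* steps 1-2: answer from the output buffer */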
Re: Conflict handling for COPY FROM
On Fri, Jun 28, 2019 at 10:57 PM Surafel Temesgen wrote: > In addition to the above change in the attached patch i also change > the syntax to ERROR LIMIT because it is no longer only skip > unique and exclusion constrain violation Hi Surafel, FYI copy.sgml has some DTD validity problems. https://travis-ci.org/postgresql-cfbot/postgresql/builds/554350168 -- Thomas Munro https://enterprisedb.com
Re: check_recovery_target_lsn() does a PG_CATCH without a throw
On Sun, Jun 30, 2019 at 09:35:52PM +0900, Michael Paquier wrote: > The refactoring looks good to me (including what you have just fixed > with PG_RETURN_LSN). Thanks for considering it. This issue was still listed as an open item for v12, so I have removed it. -- Michael signature.asc Description: PGP signature
Re: [PATCH] get rid of StdRdOptions, use individual binary reloptions representation for each relation kind instead
On Thu, Jul 04, 2019 at 07:44:42PM +0100, Dent John wrote: > I see that your patch removed that particular type, so I guess that > feature in vacuum.c has been added in the meantime. > > Would you have a more recent patch? I have switched the patch as waiting on author. -- Michael signature.asc Description: PGP signature
Re: errbacktrace
On Tue, Jun 25, 2019 at 11:08 PM Peter Eisentraut wrote:
> For the implementation, I support both backtrace() provided by the OS as
> well as using libunwind. The former seems to be supported by a number
> of platforms, including glibc, macOS, and FreeBSD, so maybe we don't
> need the libunwind support. I haven't found any difference in quality in
> the backtraces between the two approaches, but surely that is highly
> dependent on the exact configuration.
>
> I would welcome testing in all directions with this, to see how well it
> works in different circumstances.

I like it. Works out of the box on my macOS machine, but for FreeBSD I had to
add -lexecinfo to LIBS.

--
Thomas Munro
https://enterprisedb.com
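For reference, the OS-provided route mentioned above is the
backtrace()/backtrace_symbols() pair from <execinfo.h> (glibc, macOS, and
FreeBSD with -lexecinfo). A minimal standalone sketch, separate from however
the patch wires this into the error-reporting machinery:

#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>

static void
print_backtrace(void)
{
    void   *frames[100];
    int     nframes = backtrace(frames, 100);
    char  **symbols = backtrace_symbols(frames, nframes);

    if (symbols != NULL)
    {
        for (int i = 0; i < nframes; i++)
            fprintf(stderr, "%s\n", symbols[i]);
        free(symbols);
    }
}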
Re: Introduce MIN/MAX aggregate functions to pg_lsn
On Thu, Jul 04, 2019 at 01:48:24PM -0300, Fabrízio de Royes Mello wrote:
> On Thu, Jul 4, 2019 at 10:57 AM Robert Haas wrote:
>> It would be pretty silly to have one and not the other, regardless of
>> whether we can think of an immediate use case.
>
> +1

OK, applied with a catalog version bump. This is cool to have.

--
Michael
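Two examples of the kind of monitoring queries that min()/max() over pg_lsn
enable; the views and columns are the standard pg_replication_slots and
pg_stat_replication ones, and whether they fit depends on the deployment.

-- oldest WAL position any replication slot still needs to retain
SELECT min(restart_lsn) FROM pg_replication_slots;

-- furthest position any connected standby has flushed
SELECT max(flush_lsn) FROM pg_stat_replication;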
Re: base backup client as auxiliary backend process
On Sun, Jun 30, 2019 at 8:05 AM Peter Eisentraut wrote: > Attached is a very hackish patch to implement this. It works like this: > > # (assuming you have a primary already running somewhere) > initdb -D data2 --minimal > $EDITOR data2/postgresql.conf # set primary_conninfo > pg_ctl -D data2 start +1, very nice. How about --replica? FIY Windows doesn't like your patch: src/backend/postmaster/postmaster.c(1396): warning C4013: 'sleep' undefined; assuming extern returning int [C:\projects\postgresql\postgres.vcxproj] https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.45930 -- Thomas Munro https://enterprisedb.com
RE: [PATCH] Speedup truncates of relation forks
Hi, > I updated the patch based from comments, but it still fails the regression > test as indicated in (5) above. > Kindly verify if I correctly addressed the other parts as what you intended. > > Thanks again for the review! > I'll update the patch again after further comments. I updated the patch which is similar to V3 of the patch, but addressing my problem in (5) in the previous email regarding FreeSpaceMapVacuumRange. It seems to pass the regression test now. Kindly check for validation. Thank you! Regards, Kirk Jamison v4-0001-Speedup-truncates-of-relation-forks.patch Description: v4-0001-Speedup-truncates-of-relation-forks.patch
Re: Replacing the EDH SKIP primes
On Thu, Jul 04, 2019 at 08:24:13AM +0200, Daniel Gustafsson wrote: > LGTM, thanks. Okay, done, after rechecking the shape of the key. Thanks! -- Michael signature.asc Description: PGP signature
Re: Introduce timeout capability for ConditionVariableSleep
On Fri, Mar 22, 2019 at 7:21 AM Shawn Debnath wrote: > > On Sat, Mar 16, 2019 at 03:27:17PM -0700, Shawn Debnath wrote: > > > + * Track the current time so that we can calculate the > > > remaining timeout > > > + * if we are woken up spuriously. > > > > > > I think tha "track" means chasing a moving objects. So it might > > > be bettter that it is record or something? > > > > > > > * Wait for the given condition variable to be signaled or till > > > > timeout. > > > > * > > > > * Returns -1 when timeout expires, otherwise returns 0. > > > > * > > > > * See ConditionVariableSleep() for general usage. > > > > > > > > > +ConditionVariableTimedSleep(ConditionVariable *cv, long timeout, > > > > > > > > > > Counldn't the two-state return value be a boolean? > > > > I will change it to Record in the next iteration of the patch. > > Posting rebased and updated patch. Changed the word 'Track' to 'Record' > and also changed variable name rem_timeout to cur_timeout to match > naming in other use cases. Hi Shawn, I think this is looking pretty good and I'm planning to commit it soon. The convention for CHECK_FOR_INTERRUPTS() in latch wait loops seems to be to put it after the ResetLatch(), so I've moved it in the attached version (though I don't think it was wrong where it was). Also pgindented and with my proposed commit message. I've also attached a throw-away test module that gives you CALL poke() and SELECT wait_for_poke(timeout) using a CV. Observations that I'm not planning to do anything about: 1. I don't really like the data type "long", but it's already established that we use that for latches so maybe it's too late for me to complain about that. 2. I don't really like the fact that we have to do floating point stuff in INSTR_TIME_GET_MILLISEC(). That's not really your patch's fault and you've copied the timeout adjustment code from latch.c, which seems reasonable. -- Thomas Munro https://enterprisedb.com 0001-Introduce-timed-waits-for-condition-variables-v5.patch Description: Binary data 0002-Simple-module-for-waiting-for-other-sessions-v5.patch Description: Binary data
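To make the intended use concrete, here is a hedged usage sketch of a timed
condition-variable wait, following the return convention quoted in the review
above (-1 on timeout, 0 otherwise; the final API may use a boolean instead).
ConditionVariablePrepareToSleep/ConditionVariableCancelSleep and
PG_WAIT_EXTENSION are existing APIs; the timed-sleep call and the shared-state
names are per the patch and this illustration, not committed code.

ConditionVariablePrepareToSleep(&shared->cv);
while (!work_is_ready(shared))      /* re-check the condition after every wakeup */
{
    if (ConditionVariableTimedSleep(&shared->cv, timeout_ms,
                                    PG_WAIT_EXTENSION) == -1)
        break;                      /* timed out without being signaled */
}
ConditionVariableCancelSleep();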
Re: mcvstats serialization code is still shy of a load
Tomas Vondra writes: > I was about to push into REL_12_STABLE, when I realized that maybe we > need to do something about the catversion first. REL_12_STABLE is still > on 201906161, while master got to 201907041 thanks to commit > 7b925e12703. Simply cherry-picking the commits would get us to > 201907052 in both branches, but that'd be wrong as the catalogs do > differ. I suppose this is what you meant by "catversion differences to > cope with". Yeah, exactly. My recommendation is to use 201907051 on v12 and 201907052 on master (or whatever is $today for you). They need to be different now that the branches' catalog histories have diverged, and it seems to me that the back branch should be "older". regards, tom lane
Re: mcvstats serialization code is still shy of a load
On Tue, Jul 02, 2019 at 10:38:29AM +0200, Tomas Vondra wrote: On Sun, Jun 30, 2019 at 08:30:33PM -0400, Tom Lane wrote: Tomas Vondra writes: Attached is a slightly improved version of the serialization patch. I reviewed this patch, and tested it on hppa and ppc. I found one serious bug: in the deserialization varlena case, you need - dataptr += MAXALIGN(len); + dataptr += MAXALIGN(len + VARHDRSZ); (approx. line 1100 in mcv.c). Without this, the output data is corrupt, plus the Assert a few lines further down about dataptr having been advanced by the correct amount fires. (On one machine I tested on, that happened during the core regression tests. The other machine got through regression, but trying to do "select * from pg_stats_ext;" afterwards exhibited the crash. I didn't investigate closely, but I suspect the difference has to do with different MAXALIGN values, 4 and 8 respectively.) The attached patch (a delta atop your v2) corrects that plus some cosmetic issues. Thanks. If we're going to push this, it would be considerably less complicated to do so before v12 gets branched --- not long after that, there will be catversion differences to cope with. I'm planning to make the branch tomorrow (Monday), probably ~1500 UTC. Just sayin'. Unfortunately, I was travelling on Sunday and was quite busy on Monday, so I've been unable to push this before the branching :-( I'll push by the end of this week, once I get home. I've pushed the fix (along with the pg_mcv_list_item fix) into master, hopefully the buildfarm won't be upset about it. I was about to push into REL_12_STABLE, when I realized that maybe we need to do something about the catversion first. REL_12_STABLE is still on 201906161, while master got to 201907041 thanks to commit 7b925e12703. Simply cherry-picking the commits would get us to 201907052 in both branches, but that'd be wrong as the catalogs do differ. I suppose this is what you meant by "catversion differences to cope with". I suppose this is not the first time this happened - how did we deal with it in the past? I guess we could use some "past" non-conflicting catversion number in the REL_12_STABLE branch (say, 201907030x) but maybe that'd be wrong? regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
On Thu, Jul 4, 2019 at 10:38 AM Peter Geoghegan wrote: > I tried this on my own "UK land registry" test data [1], which was > originally used for the v12 nbtree work. My test case has a low > cardinality, multi-column text index. I chose this test case because > it was convenient for me. > > On v12/master, the index is 1100MB. Whereas with your patch, it ends > up being 196MB -- over 5.5x smaller! I also see a huge and consistent space saving for TPC-H. All 9 indexes are significantly smaller. The lineitem orderkey index is "just" 1/3 smaller, which is the smallest improvement among TPC-H indexes in my index bloat test case. The two largest indexes after the initial bulk load are *much* smaller: the lineitem parts supplier index is ~2.7x smaller, while the lineitem ship date index is a massive ~4.2x smaller. Also, the orders customer key index is ~2.8x smaller, and the order date index is ~2.43x smaller. Note that the test involved retail insertions, not CREATE INDEX. I haven't seen any regression in the size of any index so far, including when the number of internal pages is all that we measure. Actually, there seems to be cases where there is a noticeably larger reduction in internal pages than in leaf pages, probably because of interactions with suffix truncation. This result is very impressive. We'll need to revisit what the right trade-off is for the compression scheme, which Heikki had some thoughts on when we left off 3 years ago, but that should be a lot easier now. I am very encouraged by the fact that this relatively simple approach already works quite nicely. It's also great to see that bulk insertions with lots of compression are very clearly faster with this latest revision of your patch, unlike earlier versions from 2016 that made those cases slower (though I haven't tested indexes that don't really use compression). I think that this is because you now do the compression lazily, at the point where it looks like we may need to split the page. Previous versions of the patch had to perform compression eagerly, just like GIN, which is not really appropriate for nbtree. -- Peter Geoghegan
Re: Fix runtime errors from -fsanitize=undefined
On Mon, Jun 03, 2019 at 09:21:48PM +0200, Peter Eisentraut wrote:
> After many years of trying, it seems the -fsanitize=undefined checking
> in gcc is now working somewhat reliably. Attached is a patch that fixes
> all errors of the kind
>
> runtime error: null pointer passed as argument N, which is declared to
> never be null
>
> Most of the cases are calls to memcpy(), memcmp(), etc. with a length of
> zero, so one appears to get away with passing a null pointer.

I just saw this proposal. The undefined behavior in question is strictly
academic. These changes do remove the need for new users to discover
-fno-sanitize=nonnull-attribute, but they make the code longer and no clearer.
Given the variety of code this touches, I expect future commits will
reintroduce the complained-of usage patterns, prompting yet more commits to
restore the invariant achieved here. Hence, I'm -0 for this change.
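For context, the kind of change such a patch typically makes is a simple guard
so that a possibly-NULL pointer is never handed to memcpy() even when the
length is zero (illustrative pattern, not a specific hunk from the patch):

/* before: memcpy(dst, src, len);  -- UB if src is NULL, even with len == 0 */
if (len > 0)
    memcpy(dst, src, len);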
Re: [PATCH] Implement uuid_version()
On 4/7/19 17:30, Alvaro Herrera wrote: On 2019-Jul-04, Tom Lane wrote: A possible option 3 is to keep the function in pgcrypto but change its C code to call the core code. This seems most reasonable, and is what José Luis proposed upthread. We don't have to bump the pgcrypto extension version, as nothing changes for pgcrypto externally. Yes, indeed. ...which means I get another todo item if nobody else volunteers :) Thanks! / J.L.
Proposal to add GUC_REPORT to lc_monetary, lc_numeric and search_path
Hello, The first 2 lc_monetary and lc_numeric are useful if the client for some reason executes set lc_*. We don't get a report and in many cases can't continue to parse numerics or money. Now it it possible to get these at startup by issuing show or querying the catalog, but it seems much cleaner to just send them. The latter is important for similar reasons. JDBC caches prepared statements internally and if the user changes the search path without using setSchema or uses a function to change it then internally it would be necessary to invalidate the cache. Currently if this occurs these statements fail. This seems like a rather innocuous change as the protocol is not changed, rather the amount of information returned on startup is increased marginally. I've included the authors of the npgsql and the node drivers in the email for their input. Dave Cramer
RE: duplicate key entries for primary key -- need urgent help
Thanks a lot Tomas for the reply.

> Which version are you running, exactly? Which minor version?

[Pawan]: It's (PostgreSQL) 9.5.9

sai=> select version();
                                                 version
----------------------------------------------------------------------------------------------------------
 PostgreSQL 9.5.9 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11), 64-bit
(1 row)

> Why do you think it's the issue you linked?

[Pawan]: Because the thread which I shared also describes the same problem:
duplicate entries for a primary key. If this is also a known issue in this
version, I would greatly appreciate a workaround or configuration change.

In our production system, see the entries below: proc_id is the primary key,
and we can see duplicate entries. How is this possible?

sai=> select ctid,proc_id from etl_status where proc_id='2993229';
   ctid   | proc_id
----------+---------
 (381,20) | 2993229
 (388,28) | 2993229
(2 rows)

Any idea how this happened? I will be waiting for your reply.

WBR,
-Pawan

-----Original Message-----
From: Tomas Vondra
Sent: Thursday, July 04, 2019 10:18 PM
To: Kumar, Pawan (Nokia - IN/Bangalore)
Cc: and...@2ndquadrant.com; and...@dunslane.net; j...@agliodbs.com; pgsql-hack...@postgresql.org
Subject: Re: duplicate key entries for primary key -- need urgent help

On Thu, Jul 04, 2019 at 01:37:01PM +, Kumar, Pawan (Nokia - IN/Bangalore) wrote:
> Hi Guys,
>
> Can you please help here?
>
> Below reported issue in past about duplicate key entries for primary key.
> https://www.postgresql.org/message-id/534c8b33.9050...@pgexperts.com
>
> the solution was provided in 9.3 version of postgres but it seems issue is
> still there in 9.5 version which I am running currently.
>
> Can you please let me know if this is also known in 9.5? any fix or
> workaround please?

Which version are you running, exactly? Which minor version?

Why do you think it's the issue you linked?

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: duplicate key entries for primary key -- need urgent help
Thanks for reply. This has happened very often and at different production system. There is no version change. System running with same version since 1 year but duplicate key issue came quiet a time. And impact is big because of that and only way to fix is to delete the duplicate primary key. Any suggestions to check which logs? Any command to run to get more info during the issue? Any potential configuration to check? Plz suggest Wbr, Pk From: Tomas Vondra Sent: Thursday, July 4, 2019 11:31:48 PM To: Kumar, Pawan (Nokia - IN/Bangalore) Cc: and...@2ndquadrant.com; and...@dunslane.net; j...@agliodbs.com; pgsql-hack...@postgresql.org Subject: Re: duplicate key entries for primary key -- need urgent help On Thu, Jul 04, 2019 at 05:34:21PM +, Kumar, Pawan (Nokia - IN/Bangalore) wrote: >Thanks a lot Tomas for the reply. > >Which version are you running, exactly? Whih minor version? >[Pawan]: Its (PostgreSQL) 9.5.9 > You're missing 2 years of bugfixes, some of which are addressing data corruption issues and might have caused this. >sai=> select version(); > version >-- > PostgreSQL 9.5.9 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 > (Red Hat 4.8.5-11), 64-bit >(1 row) > >Why do you think it's the issue you linked? > >[Pawan]: Because the thread which I shared also has problem statement like >"Duplicate entries of Primary key" . >If this is also known to this version, I will be appreciating a lot if we have >some Workaround or config change. > Duplicate entries are clearly some sort of data corruption, but that might have happened in various ways - it does not mean it's the same issue. And yes, 9.5.9 has a fix for the issue in the thread you linked. >In our production: See below entries, proc_id is primary key and we can see >duplicate entries. How it is possible? > >sai=> select ctid,proc_id from etl_status where proc_id='2993229'; > ctid | proc_id >--+- > (381,20) | 2993229 > (388,28) | 2993229 >(2 rows) > >Any idea, how it happened? > No, that's impossible to say without you doing some more investigation. We need to know when those rows were created, on which version that happened (the system might have been updated and the corruption predates might have happened on the previous version), and so on. For example, if the system crashed or had any significant issues, that might be related to data corruption issues. We know nothing about your system, so you'll have to do a bit of investigation, look for suspicious things, etc. FWIW it might be a good idea to look for other cases of data corruption. Both to know the extent of the problem, and to gain insight. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Periods
Hello Vik, Paul, On 2019-Jul-04, Vik Fearing wrote: > I've been working on this as an extension instead. It allows me to do some > stuff in plpgsql which is much easier for me. > > https://github.com/xocolatl/periods/ Hmm, okay. > I would love to have all this in core (especially since MariaDB 10.4 > just became the first open source database to get all of this > functionality), but something is better than nothing. Agreed on all accounts. I think that the functionality in your patch is already integrated in Paul's patch for temporal PK/FK elsewhere ... is that correct, or is this patch orthogonal to that work? -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [PATCH] get rid of StdRdOptions, use individual binary reloptions representation for each relation kind instead
Hi Nikolay,

I have had a crack at re-basing the current patch against 12b2, but I didn't
trivially succeed. It's probably more my bodged fixing of the rejections than
a fundamental problem. But I now get compile failures in - and seemingly only
in - vacuum.c.

gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -Wno-unused-command-line-argument -g -O2 -I../../../src/include -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk -c -o vacuum.o vacuum.c
vacuum.c:1761:20: error: expected expression
        ((StdRdOptions *) onerel->rd_options)->vacuum_index_cleanup)
                          ^
vacuum.c:1761:6: error: use of undeclared identifier 'StdRdOptions'
        ((StdRdOptions *) onerel->rd_options)->vacuum_index_cleanup)
          ^
vacuum.c:1771:20: error: expected expression
        ((StdRdOptions *) onerel->rd_options)->vacuum_truncate)
                          ^
vacuum.c:1771:6: error: use of undeclared identifier 'StdRdOptions'
        ((StdRdOptions *) onerel->rd_options)->vacuum_truncate)
          ^
4 errors generated.

I see that your patch removed that particular type, so I guess that feature in
vacuum.c has been added in the meantime.

Would you have a more recent patch?

For what it is worth, I had a play at getting it to work, and only made things
much worse!

denty.

> On 1 Jul 2019, at 12:52, Thomas Munro wrote:
>
> On Mon, Apr 1, 2019 at 4:36 AM Nikolay Shaplov wrote:
>> In a message of Monday, 18 March 2019 3:03:04 MSK, Iwata, Aya wrote:
>>> This patch does not apply.
>> Oh... Sorry... here goes new version
>
> Hi Nikolay,
>
> Could we please have a new rebase?
>
> Thanks,
>
> --
> Thomas Munro
> https://enterprisedb.com
Re: duplicate key entries for primary key -- need urgent help
On Thu, Jul 04, 2019 at 06:28:03PM +, Kumar, Pawan (Nokia - IN/Bangalore) wrote: Thanks for reply. This has happened very often and at different production system. There is no version change. System running with same version since 1 year but duplicate key issue came quiet a time. And impact is big because of that and only way to fix is to delete the duplicate primary key. Any suggestions to check which logs? Any command to run to get more info during the issue? Any potential configuration to check? Plz suggest Well, I've already pointed out you're missing 2 years worth of fixes, so upgrading to current minor version is the first thing I'd do. (I doubt people will be rushing to help you in their free time when you're missing two years of fixes, possibly causing this issue.) If the issue happens even after upgrading, we'll need to see more details about an actual case - commands creating/modifying the duplicate rows, or anything you can find. It's impossible to help you when we only know there are duplicate values in a PK. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: range_agg
I noticed that this patch has a // comment about it segfaulting. Did you ever figure that out? Is the resulting code the one you intend as final? Did you make any inroads regarding Jeff Davis' suggestion about implementing "multiranges"? I wonder if that's going to end up being a Pandora's box. Stylistically, the code does not match pgindent's choices very closely. I can return a diff to a pgindented version of your v0002 for your perusal, if it would be useful for you to learn its style. (A committer would eventually run pgindent himself[1], but it's good to have submissions be at least close to the final form.) BTW, I suggest "git format-patch -vN" to prepare files for submission. [1] No female committers yet ... is this 2019? -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Periods
On Thursday, July 4, 2019 8:04 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: > Hello Vik, > On 2018-Jun-05, Vik Fearing wrote: > >>> I understand that your patch is just to allow defining periods, but I >>> thought I'd share my own hopes earlier rather than later, in case you >>> are doing more work on this. >> >> Yes, I plan on doing much more work on this. My goal is to implement >> (by myself or with help from other) the entire SQL:2016 spec on >> periods and system versioned tables. This current patch is just >> infrastructure. > > Have you had the chance to work on this? Hi Alvaro, I've been working on this as an extension instead. It allows me to do some stuff in plpgsql which is much easier for me. https://github.com/xocolatl/periods/ I would love to have all this in core (especially since MariaDB 10.4 just became the first open source database to get all of this functionality), but something is better than nothing. -- Vik Fearing
question about SnapBuild
I am somewhat puzzled by snapbuild.c: it seems some of its code can never be
reached.

SnapBuildCommitTxn
{
    /* can not reach here */
    if (builder->state == SNAPBUILD_START ||
        (builder->state == SNAPBUILD_BUILDING_SNAPSHOT &&
         TransactionIdPrecedes(xid, SnapBuildNextPhaseAt(builder))))
    ...
}

DecodeXactOp
{
    if (SnapBuildCurrentState(builder) < SNAPBUILD_FULL_SNAPSHOT)
        return;     /* already returned here */
}

ifquant
ifqu...@163.com
Re: Periods
Hello Vik, On 2018-Jun-05, Vik Fearing wrote: > > I understand that your patch is just to allow defining periods, but I > > thought I'd share my own hopes earlier rather than later, in case you > > are doing more work on this. > > Yes, I plan on doing much more work on this. My goal is to implement > (by myself or with help from other) the entire SQL:2016 spec on > periods and system versioned tables. This current patch is just > infrastructure. Have you had the chance to work on this? -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [PATCH] Incremental sort (was: PoC: Partial sort)
On Thu, Jul 4, 2019 at 10:46 AM Tomas Vondra wrote: > > On Thu, Jul 04, 2019 at 09:29:49AM -0400, James Coleman wrote: > >On Tue, Jun 25, 2019 at 7:22 PM Tomas Vondra > > wrote: > >> > >> On Tue, Jun 25, 2019 at 04:53:40PM -0400, James Coleman wrote: > >> > > >> >Unrelated: if you or someone else you know that's more familiar with > >> >the parallel code, I'd be interested in their looking at the patch at > >> >some point, because I have a suspicion it might not be operating in > >... > >> So I've looked into that, and the reason seems fairly simple - when > >> generating the Gather Merge paths, we only look at paths that are in > >> partial_pathlist. See generate_gather_paths(). > >> > >> And we only have sequential + index paths in partial_pathlist, not > >> incremental sort paths. > >> > >> IMHO we can do two things: > >> > >> 1) modify generate_gather_paths to also consider incremental sort for > >> each sorted path, similarly to what create_ordered_paths does > >> > >> 2) modify build_index_paths to also generate an incremental sort path > >> for each index path > >> > >> IMHO (1) is the right choice here, because it automatically does the > >> trick for all other types of ordered paths, not just index scans. So, > >> something like the attached patch, which gives me plans like this: > >... > >> But I'm not going to claim those are total fixes, it's the minimum I > >> needed to do to make this particular type of plan work. > > > >Thanks for looking into this! > > > >I intended to apply this to my most recent version of the patch (just > >sent a few minutes ago), but when I apply it I noticed that the > >partition_aggregate regression tests have several of these failures: > > > >ERROR: could not find pathkey item to sort > > > >I haven't had time to look into the cause yet, so I decided to wait > >until the next patch revision. > > > > FWIW I don't claim the patch I shared is complete and/or 100% correct. > It was more an illustration of the issue and the smallest patch to make > a particular query work. The test failures are a consequence of that. > > I'll try looking into the failures over the next couple of days, but I > can't promise anything. Yep, I understand, I just wanted to note that it was still an outstanding item and give a quick update on why so. Anything you can look at is much appreciated. James Coleman
Re: Optimize partial TOAST decompression
Of course, I forgot to attach the files, so here they are. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services t.data.gz Description: application/gzip (gdb) bt #0 toast_decompress_datum (attr=0x12572e0) at tuptoaster.c:2291 #1 toast_decompress_datum (attr=0x12572e0) at tuptoaster.c:2277 #2 0x004c3b08 in heap_tuple_untoast_attr_slice (attr=, sliceoffset=0, slicelength=-1) at tuptoaster.c:315 #3 0x0085c1e5 in pg_detoast_datum_slice (datum=, first=, count=) at fmgr.c:1767 #4 0x00833b7a in text_substring (str=133761519127512, start=0, length=, length_not_specified=) at varlena.c:956 #5 0x00600a5b in ExecInterpExpr (state=0x12549c8, econtext=0x1253728, isnull=) at execExprInterp.c:649 #6 0x0061511f in ExecEvalExprSwitchContext (isNull=0x7ffc45db755f, econtext=, state=) at ../../../src/include/executor/executor.h:307 #7 advance_aggregates (aggstate=0x1253500, aggstate=0x1253500) at nodeAgg.c:679 #8 agg_retrieve_direct (aggstate=0x1253500) at nodeAgg.c:1847 #9 ExecAgg (pstate=0x1253500) at nodeAgg.c:1572 #10 0x0060371b in ExecProcNode (node=0x1253500) at ../../../src/include/executor/executor.h:239 #11 ExecutePlan (execute_once=, dest=0x12595b8, direction=, numberTuples=0, sendTuples=, operation=CMD_SELECT, use_parallel_mode=, planstate=0x1253500, estate=0x12532c0) at execMain.c:1648 #12 standard_ExecutorRun (queryDesc=0x11be580, direction=, count=0, execute_once=) at execMain.c:365 #13 0x00748b2b in PortalRunSelect (portal=portal@entry=0x12043a0, forward=forward@entry=true, count=0, count@entry=9223372036854775807, dest=dest@entry=0x12595b8) at pquery.c:929 #14 0x00749f00 in PortalRun (portal=portal@entry=0x12043a0, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true, dest=dest@entry=0x12595b8, altdest=altdest@entry=0x12595b8, completionTag=0x7ffc45db77e0 "") at pquery.c:770 #15 0x0074611f in exec_simple_query (query_string=0x119c820 "select sum(length(substr(a,0))) from t;") at postgres.c:1215 #16 0x0074783e in PostgresMain (argc=, argv=argv@entry=0x11c84d0, dbname=, username=) at postgres.c:4249 #17 0x006d75d9 in BackendRun (port=0x11c09e0, port=0x11c09e0) at postmaster.c:4431 #18 BackendStartup (port=0x11c09e0) at postmaster.c:4122 #19 ServerLoop () at postmaster.c:1704 #20 0x006d840c in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x11963e0) at postmaster.c:1377
Re: duplicate key entries for primary key -- need urgent help
On Thu, Jul 04, 2019 at 05:34:21PM +, Kumar, Pawan (Nokia - IN/Bangalore) wrote:
> Thanks a lot Tomas for the reply.
>
> Which version are you running, exactly? Which minor version?
> [Pawan]: Its (PostgreSQL) 9.5.9

You're missing 2 years of bugfixes, some of which are addressing data
corruption issues and might have caused this.

> sai=> select version();
>                                                  version
> ----------------------------------------------------------------------------------------------------------
>  PostgreSQL 9.5.9 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11), 64-bit
> (1 row)
>
> Why do you think it's the issue you linked?
>
> [Pawan]: Because the thread which I shared also has a problem statement like
> "Duplicate entries of Primary key". If this is also known to this version, I
> would appreciate a workaround or config change.

Duplicate entries are clearly some sort of data corruption, but that might
have happened in various ways - it does not mean it's the same issue. And yes,
9.5.9 has a fix for the issue in the thread you linked.

> In our production: See below entries, proc_id is primary key and we can see
> duplicate entries. How is this possible?
>
> sai=> select ctid,proc_id from etl_status where proc_id='2993229';
>    ctid   | proc_id
> ----------+---------
>  (381,20) | 2993229
>  (388,28) | 2993229
> (2 rows)
>
> Any idea how it happened?

No, that's impossible to say without you doing some more investigation. We
need to know when those rows were created, on which version that happened (the
system might have been updated and the corruption might have happened on the
previous version), and so on. For example, if the system crashed or had any
significant issues, that might be related to data corruption issues.

We know nothing about your system, so you'll have to do a bit of
investigation, look for suspicious things, etc.

FWIW it might be a good idea to look for other cases of data corruption. Both
to know the extent of the problem, and to gain insight.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
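One way to act on that last suggestion is a quick scan for further duplicate
primary-key values; this query uses the table and column from this thread and
can be adapted for other primary keys:

SELECT proc_id, count(*)
FROM etl_status
GROUP BY proc_id
HAVING count(*) > 1;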
Re: Optimize partial TOAST decompression
On Thu, Jul 04, 2019 at 11:10:24AM +0200, Andrey Borodin wrote:
> On 3 July 2019, at 18:06, Binguo Bao wrote:
>> On Tue, Jul 2, 2019 at 10:46 PM, Paul Ramsey wrote:
>>> This looks good to me. A little commentary around why
>>> pglz_maximum_compressed_size() returns a universally correct answer
>>> (there's no way the compressed size can ever be larger than this
>>> because...) would be nice for peasants like myself.
>> ...
>> Thanks for your comment. I've updated the patch.
>
> Thanks Biguo and Paul! From my POV patch looks ready for committer, so I
> switched state on CF.

I've done a bit of testing and benchmarking on this patch today, and there's a
bug somewhere, making it look like there are corrupted data.

What I'm seeing is this:

CREATE TABLE t (a text);

-- attached is data for one row
COPY t FROM '/tmp/t.data';

SELECT length(substr(a,1000)) from t;
psql: ERROR:  compressed data is corrupted

SELECT length(substr(a,0,1000)) from t;
 length
--------
    999
(1 row)

SELECT length(substr(a,1000)) from t;
psql: ERROR:  invalid memory alloc request size 2018785106

That's quite bizarre behavior - it does work with a prefix, but not with a
suffix. And the exact ERROR changes after the prefix query. (Of course, on
master it works in all cases.)

The backtrace (with the patch applied) looks like this:

#0  toast_decompress_datum (attr=0x12572e0) at tuptoaster.c:2291
#1  toast_decompress_datum (attr=0x12572e0) at tuptoaster.c:2277
#2  0x004c3b08 in heap_tuple_untoast_attr_slice (attr=, sliceoffset=0, slicelength=-1) at tuptoaster.c:315
#3  0x0085c1e5 in pg_detoast_datum_slice (datum=, first=, count=) at fmgr.c:1767
#4  0x00833b7a in text_substring (str=133761519127512, start=0, length=, length_not_specified=) at varlena.c:956
...

I've only observed this with a very small number of rows (the data is
generated randomly with different compressibility etc.), so I'm only attaching
one row that exhibits this issue.

My guess is toast_fetch_datum_slice() gets confused by the headers or
something, or something like that.

FWIW the new code added to this function does not adhere to our code style,
and would deserve some additional explanation of what it's doing/why. Same for
heap_tuple_untoast_attr_slice, BTW.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
On Thu, Jul 4, 2019 at 5:06 AM Anastasia Lubennikova wrote: > i - number of distinct values in the index. > So i=1 means that all rows have the same key, > and i=1000 means that all keys are different. > > i / old size (MB) / new size (MB) > 121588 > 100021590 > 1021571 > 1000214214 > > For more, see the attached diagram with test results. I tried this on my own "UK land registry" test data [1], which was originally used for the v12 nbtree work. My test case has a low cardinality, multi-column text index. I chose this test case because it was convenient for me. On v12/master, the index is 1100MB. Whereas with your patch, it ends up being 196MB -- over 5.5x smaller! I also tried it out with the "Mouse genome informatics" database [2], which was already improved considerably by the v12 work on duplicates. This is helped tremendously by your patch. It's not quite 5.5x across the board, of course. There are 187 indexes (on 28 tables), and almost all of the indexes are smaller. Actually, *most* of the indexes are *much* smaller. Very often 50% smaller. I don't have time to do an in-depth analysis of these results today, but clearly the patch is very effective on real world data. I think that we tend to underestimate just how common indexes with a huge number of duplicates are. [1] https://https:/postgr.es/m/cah2-wzn_nayk4pr0hrwo0stwhmxjp5qyu+x8vppt030xpqr...@mail.gmail.com [2] http://www.informatics.jax.org/software.shtml -- Peter Geoghegan
Re: Introduce MIN/MAX aggregate functions to pg_lsn
On Thu, Jul 4, 2019 at 5:17 AM Michael Paquier wrote: > > On Tue, Jul 02, 2019 at 11:31:49AM -0300, Fabrízio de Royes Mello wrote: > > New version attached. > > This looks in pretty good shape to me, and no objections from me to > get those functions as the min() flavor is useful for monitoring WAL > retention for complex deployments. > > Do you have a particular use-case in mind for max() one? I can think > of at least one case: monitoring the flush LSNs of a set of standbys > to find out how much has been replayed at most. > I use min/max to measure the amount of generated WAL (diff) during some periods based on wal position stored in some monitoring system. Regards, -- Fabrízio de Royes Mello Timbira - http://www.timbira.com.br/ PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
Re: Introduce MIN/MAX aggregate functions to pg_lsn
On Thu, Jul 4, 2019 at 10:57 AM Robert Haas wrote: > > On Thu, Jul 4, 2019 at 4:17 AM Michael Paquier wrote: > > Do you have a particular use-case in mind for max() one? I can think > > of at least one case: monitoring the flush LSNs of a set of standbys > > to find out how much has been replayed at most. > > It would be pretty silly to have one and not the other, regardless of > whether we can think of an immediate use case. > +1 -- Fabrízio de Royes Mello Timbira - http://www.timbira.com.br/ PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
Re: duplicate key entries for primary key -- need urgent help
On Thu, Jul 04, 2019 at 01:37:01PM +, Kumar, Pawan (Nokia - IN/Bangalore) wrote: Hi Guys, Can you please help here? Below reported issue in past about duplicate key entries for primary key. https://www.postgresql.org/message-id/534c8b33.9050...@pgexperts.com the solution was provided in 9.3 version of postgres but it seems issue is still there in 9.5 version which I am running currently. Can you please let me know if this is also known in 9.5? any fix or Workaround please? Which version are you running, exactly? Whih minor version? Why do you think it's the issue you linked? regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
More tzdb fun: POSIXRULES is being deprecated upstream
Paul Eggert, in https://mm.icann.org/pipermail/tz/2019-June/028172.html:

> zic's -p option was intended as a transition from historical
> System V code that treated TZ="XXXnYYY" as meaning US
> daylight-saving rules in a time zone n hours west of UT,
> with XXX abbreviating standard time and YYY abbreviating DST.
> zic -p allows the tzdata installer to specify (say)
> Europe/Brussels's rules instead of US rules. This behavior
> is not well documented and often fails in practice; for example it
> does not work with current glibc for contemporary timestamps, and
> it does not work in tzdb itself for timestamps after 2037.
> So, document it as being obsolete, with the intent that it
> will be removed in a future version. This change does not
> affect behavior of the default installation.

As he says, this doesn't work for post-2038 dates:

regression=# set timezone = 'FOO5BAR';
SET
regression=# select now();
              now
-------------------------------
 2019-07-04 11:55:46.905382-04
(1 row)

regression=# select timeofday();
             timeofday
-----------------------------------
 Thu Jul 04 11:56:14.102770 2019 BAR
(1 row)

regression=# select '2020-07-04'::timestamptz;
      timestamptz
------------------------
 2020-07-04 00:00:00-04
(1 row)

regression=# select '2040-07-04'::timestamptz;
      timestamptz
------------------------
 2040-07-04 00:00:00-05    <<-- should be -04
(1 row)

and this note makes it clear that the IANA crew aren't planning on fixing
that. It does work if you write a full POSIX-style DST specification:

regression=# set timezone = 'FOO5BAR,M3.2.0,M11.1.0';
SET
regression=# select '2040-07-04'::timestamptz;
      timestamptz
------------------------
 2040-07-04 00:00:00-04
(1 row)

so I think what Eggert has in mind here is that they'll remove the
TZDEFRULES-loading logic and always fall back to TZDEFRULESTRING when
presented with a POSIX-style zone spec that lacks explicit transition date
rules.

So, what if anything should we do about this?

We do document posixrules, very explicitly, see datatype.sgml around line 2460:

    When a daylight-savings zone abbreviation is present, it is assumed to be
    used according to the same daylight-savings transition rules used in the
    IANA time zone database's posixrules entry. In a standard PostgreSQL
    installation, posixrules is the same as US/Eastern, so that POSIX-style
    time zone specifications follow USA daylight-savings rules. If needed, you
    can adjust this behavior by replacing the posixrules file.

One option is to do nothing until the IANA code actually changes, but as 2038
gets closer, people are more likely to start noticing that this "feature"
doesn't work as one would expect.

We could get out front of the problem and remove the TZDEFRULES-loading logic
ourselves. That would be a bit of a maintenance hazard, but perhaps not too
awful, because we already deviate from the IANA code in that area (we have our
own ideas about when/whether to try to load TZDEFRULES). I don't think we'd
want to change this behavior in the back branches, but it might be OK to do it
as a HEAD change. I think I'd rather do it like that than be forced into
playing catchup when the IANA code does change.

A more aggressive idea would be to stop supporting POSIX-style timezone specs
altogether, but I'm not sure I like that answer. Even if we could get away
with it from a users'-eye standpoint, I think we have some internal
dependencies on being able to use such specifications.

regards, tom lane
Re: [PATCH] Implement uuid_version()
On 2019-Jul-04, Tom Lane wrote: > A possible option 3 is to keep the function in pgcrypto but change > its C code to call the core code. This seems most reasonable, and is what José Luis proposed upthread. We don't have to bump the pgcrypto extension version, as nothing changes for pgcrypto externally. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
duplicate key entries for primary key -- need urgent help
Hi Guys, Can you please help here? Below reported issue in past about duplicate key entries for primary key. https://www.postgresql.org/message-id/534c8b33.9050...@pgexperts.com the solution was provided in 9.3 version of postgres but it seems issue is still there in 9.5 version which I am running currently. Can you please let me know if this is also known in 9.5? any fix or Workaround please? WBR, -Pawan
Re: [PATCH] Implement uuid_version()
Peter Eisentraut writes: > I think the alternatives are: > 1. We keep the code in both places. This is fine. There is no problem > with having the same C function or the same SQL function name in both > places. > 2. We remove the C function from pgcrypto and make an extension version > bump. This will create breakage for (some) current users of the > function from pgcrypto. > So option 2 would ironically punish the very users we are trying to > help. So I think just doing nothing is the best option. Hm. Option 1 means that it's a bit unclear which function you are actually calling. As long as the implementations behave identically, that seems okay, but I wonder if that's a constraint we want for the long term. A possible option 3 is to keep the function in pgcrypto but change its C code to call the core code. regards, tom lane
Re: [PATCH] Implement uuid_version()
On 2019-07-02 17:09, Tom Lane wrote: > Peter Eisentraut writes: >> On 2019-06-30 14:50, Fabien COELHO wrote: >>> I'm wondering whether pg_random_uuid() should be taken out of pgcrypto if >>> it is available in core? > >> That would probably require an extension version update dance in >> pgcrypto. I'm not sure if it's worth that. Thoughts? > > We have some previous experience with this type of thing when we migrated > contrib/tsearch2 stuff into core. I'm too caffeine-deprived to remember > exactly what we did or how well it worked. But it seems advisable to go > study that history, because we could easily make things a mess for users > if we fail to consider their upgrade experience. I think in that case we wanted users of the extension to transparently end up using the in-core code. This is not the case here: Both the extension and the proposed in-core code do the same thing and there is very little code duplication, so having them coexist would be fine in principle. I think the alternatives are: 1. We keep the code in both places. This is fine. There is no problem with having the same C function or the same SQL function name in both places. 2. We remove the C function from pgcrypto and make an extension version bump. This will create breakage for (some) current users of the function from pgcrypto. So option 2 would ironically punish the very users we are trying to help. So I think just doing nothing is the best option. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: "long" type is not appropriate for counting tuples
On 2019-07-02 22:56, Tom Lane wrote:
> I took a look through these and see nothing objectionable. There are
> probably more places that can be improved, but we need not insist on
> getting every such place in one go.
>
> Per Robert's position that variables ought to have well-defined widths,
> there might be something to be said for not touching the variable
> declarations that you changed from int64 to long long, and instead
> casting them to long long in the sprintf calls. But I'm not really
> convinced that that's better than what you've done.
>
> Marked CF entry as ready-for-committer.

committed

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
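For readers not following the whole thread: the underlying problem is that
"long" is only 32 bits on some platforms (notably 64-bit Windows), so it
cannot safely hold tuple counts. A small self-contained illustration of the
portable pattern Tom describes above, casting to long long and printing with
%lld:

#include <stdio.h>
#include <stdint.h>

int
main(void)
{
    int64_t ntuples = 5000000000LL;     /* would overflow a 32-bit long */

    /* cast to long long so %lld is always the matching format */
    printf("processed %lld tuples\n", (long long) ntuples);
    return 0;
}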
Re: [PATCH] Incremental sort (was: PoC: Partial sort)
On Thu, Jul 04, 2019 at 09:29:49AM -0400, James Coleman wrote: On Tue, Jun 25, 2019 at 7:22 PM Tomas Vondra wrote: On Tue, Jun 25, 2019 at 04:53:40PM -0400, James Coleman wrote: > >Unrelated: if you or someone else you know that's more familiar with >the parallel code, I'd be interested in their looking at the patch at >some point, because I have a suspicion it might not be operating in ... So I've looked into that, and the reason seems fairly simple - when generating the Gather Merge paths, we only look at paths that are in partial_pathlist. See generate_gather_paths(). And we only have sequential + index paths in partial_pathlist, not incremental sort paths. IMHO we can do two things: 1) modify generate_gather_paths to also consider incremental sort for each sorted path, similarly to what create_ordered_paths does 2) modify build_index_paths to also generate an incremental sort path for each index path IMHO (1) is the right choice here, because it automatically does the trick for all other types of ordered paths, not just index scans. So, something like the attached patch, which gives me plans like this: ... But I'm not going to claim those are total fixes, it's the minimum I needed to do to make this particular type of plan work. Thanks for looking into this! I intended to apply this to my most recent version of the patch (just sent a few minutes ago), but when I apply it I noticed that the partition_aggregate regression tests have several of these failures: ERROR: could not find pathkey item to sort I haven't had time to look into the cause yet, so I decided to wait until the next patch revision. FWIW I don't claim the patch I shared is complete and/or 100% correct. It was more an illustration of the issue and the smallest patch to make a particular query work. The test failures are a consequence of that. I'll try looking into the failures over the next couple of days, but I can't promise anything. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Memory-Bounded Hash Aggregation
On Wed, Jul 03, 2019 at 07:03:06PM -0700, Jeff Davis wrote: On Wed, 2019-07-03 at 02:17 +0200, Tomas Vondra wrote: What does "partitioned hash strategy" do? It's probably explained in one of the historical discussions, but I'm not sure which one. I assume it simply hashes the group keys and uses that to partition the data, and then passing it to hash aggregate. Yes. When spilling, it is cheap to partition on the hash value at the same time, which dramatically reduces the need to spill multiple times. Previous discussions: Unfortunately the second link does not work :-( It's supposed to be: https://www.postgresql.org/message-id/CAGTBQpa__-NP7%3DkKwze_enkqw18vodRxKkOmNhxAPzqkruc-8g%40mail.gmail.com I'm not going to block Approach 1, althought I'd really like to see something that helps with array_agg. I have a WIP patch that I just posted. It doesn't yet work with ARRAY_AGG, but I think it can be made to work by evicting the entire hash table, serializing the transition states, and then later combining them. Aren't all three approaches a way to "fix" hash aggregate? In any case, it's certainly reasonable to make incremental changes. The question is whether "approach 1" is sensible step towards some form of "approach 3" Disk-based hashing certainly seems like a reasonable algorithm on paper that has some potential advantages over sorting. It certainly seems sensible to me that we explore the disk-based hashing strategy first, and then we would at least know what we are missing (if anything) by going with the hybrid approach later. There's also a fair amount of design space to explore in the hybrid strategy. That could take a while to converge, especially if we don't have anything in place to compare against. Makes sense. I haven't thought about how the hybrid approach would be implemented very much, so I can't quite judge how complicated would it be to extend "approach 1" later. But if you think it's a sensible first step, I trust you. And I certainly agree we need something to compare the other approaches against. > * It means we have a hash table and sort running concurrently, each > using memory. Andres said this might not be a problem[3], but I'm > not convinced that the problem is zero. If you use small work_mem > for the write phase of sorting, you'll end up with a lot of runs > to > merge later and that has some kind of cost. > Why would we need to do both concurrently? I thought we'd empty the hash table before doing the sort, no? So you are saying we spill the tuples into a tuplestore, then feed the tuplestore through a tuplesort? Seems inefficient, but I guess we can. I think the question is whether we see this as "emergency fix" (for cases that are misestimated and could/would fail with OOM at runtime), or as something that is meant to make "hash agg" more widely applicable. I personally see it as an emergency fix, in which cases it's perfectly fine if it's not 100% efficient, assuming it kicks in only rarely. Effectively, we're betting on hash agg, and from time to time we lose. But even if we see it as a general optimization technique it does not have to be perfectly efficient, as long as it's properly costed (so the planner only uses it when appropriate). If we have a better solution (in terms of efficiency, code complexity, etc.) then sure - let's use that. But considering we've started this discussion in ~2015 and we still don't have anything, I wouldn't hold my breath. Let's do something good enough, and maybe improve it later. Do we actually need to handle that case? 
How many such aggregates are there? I think it's OK to just ignore that case (and keep doing what we do now), and require serial/deserial functions for anything better. Punting on a few cases is fine with me, if the user has a way to fix it. +1 to doing that regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
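For readers following along, a rough sketch of the situation discussed above (table and column names are made up): a grouped aggregate whose hash table outgrows work_mem at run time because the planner underestimated the number of groups, which is exactly where a spill strategy would have to kick in.

SET work_mem = '4MB';
-- If there are many more distinct sensor_id values than the planner expects,
-- the hash table built by HashAggregate can grow far past work_mem; today it
-- simply keeps allocating, and the proposals above are about spilling instead.
EXPLAIN (ANALYZE, COSTS OFF)
SELECT sensor_id, count(*), array_agg(reading)
FROM measurements
GROUP BY sensor_id;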
Re: Introduce MIN/MAX aggregate functions to pg_lsn
On Thu, Jul 4, 2019 at 4:17 AM Michael Paquier wrote: > Do you have a particular use-case in mind for max() one? I can think > of at least one case: monitoring the flush LSNs of a set of standbys > to find out how much has been replayed at most. It would be pretty silly to have one and not the other, regardless of whether we can think of an immediate use case. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Binary support for pgoutput plugin
On Wed, 5 Jun 2019 at 18:50, Andres Freund wrote: > Hi, > > On 2019-06-05 18:47:57 -0400, Dave Cramer wrote: > > So one of the things they would like added is to get not null information > > in the schema record. This is so they can mark the field Optional in Java. > > I presume this would also have some uses in other languages. As I > > understand it this would require a protocol bump. If this were to be > > accepted, are there any outstanding asks that would be useful to add if we > > were going to bump the protocol? > > I'm pretty strongly opposed to this. What's the limiting factor when > adding such information? I think clients that want something like this > ought to query the database for catalog information when getting schema > information. So, talking some more to the people who want to use this for Change Data Capture, they pointed out that if the schema changes while they are offline, there is no way to query the catalog information. Dave
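For what it's worth, a sketch of the catalog lookup Andres is alluding to, which a connected client can run today to learn the NOT NULL constraints instead of having them added to the schema message (the table name is hypothetical):

SELECT a.attname, a.attnotnull
FROM pg_attribute a
WHERE a.attrelid = 'public.my_table'::regclass   -- table of interest
  AND a.attnum > 0                               -- skip system columns
  AND NOT a.attisdropped;

Dave's point above is that this only works while the client can reach the catalogs, not when it is consuming previously captured changes offline.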
Re: SQL/JSON path issues/questions
On 7/3/19 11:59 PM, Alexander Korotkov wrote: Hi! On Wed, Jul 3, 2019 at 5:27 PM Liudmila Mantrova wrote: I have rechecked the standard and I agree that we should use "filter expression" whenever possible. "A filter expression must be enclosed in parentheses..." looks like an oversight, so I fixed it. As for what's actually enclosed, I believe we can still use the word "condition" here as it's easy to understand and is already used in our docs, e.g. in description of the WHERE clause that serves a similar purpose. The new version of the patch fixes the terminology, tweaks the examples, and provides some grammar and style fixes in the jsonpath-related chapters. It looks good to me. But this sentence looks a bit too complicated. "It can be followed by one or more accessor operators to define the JSON element on a lower nesting level by which to filter the result." Could we phrase this as following? "In order to filter the result by values lying on lower nesting level, @ operator can be followed by one or more accessor operators." -- Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company Thank you! I think we can make this sentence even shorter, the fix is attached: "To refer to a JSON element stored at a lower nesting level, add one or more accessor operators after @." -- Liudmila Mantrova Technical writer at Postgres Professional: http://www.postgrespro.com The Russian Postgres Company diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index 3a8581d..6d2aefb 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -11538,7 +11538,8 @@ table2-mapping from the JSON data, similar to XPath expressions used for SQL access to XML. In PostgreSQL, path expressions are implemented as the jsonpath - data type, described in . + data type and can use any elements described in + . JSON query functions and operators @@ -11585,7 +11586,7 @@ table2-mapping }, { "location": [ 47.706, 13.2635 ], "start time": "2018-10-14 10:39:21", -"HR": 130 +"HR": 135 } ] } } @@ -11637,23 +11638,33 @@ table2-mapping When defining the path, you can also use one or more - filter expressions, which work similar to - the WHERE clause in SQL. Each filter expression - can provide one or more filtering conditions that are applied - to the result of the path evaluation. Each filter expression must - be enclosed in parentheses and preceded by a question mark. - Filter expressions are evaluated from left to right and can be nested. - The @ variable denotes the current path evaluation - result to be filtered, and can be followed by one or more accessor - operators to define the JSON element by which to filter the result. - Functions and operators that can be used in the filtering condition - are listed in . - SQL/JSON defines three-valued logic, so the result of the filter - expression may be true, false, + filter expressions that work similar to the + WHERE clause in SQL. A filter expression begins with + a question mark and provides a condition in parentheses: + + +? (condition) + + + + + Filter expressions must be specified right after the path evaluation step + to which they are applied. The result of this step is filtered to include + only those items that satisfy the provided condition. SQL/JSON defines + three-valued logic, so the condition can be true, false, or unknown. The unknown value - plays the same role as SQL NULL. Further path + plays the same role as SQL NULL and can be tested + for with the is unknown predicate. 
Further path evaluation steps use only those items for which filter expressions - return true. + return true. + + + + Functions and operators that can be used in filter expressions are listed + in . The path + evaluation result to be filtered is denoted by the @ + variable. To refer to a JSON element stored at a lower nesting level, + add one or more accessor operators after @. @@ -11667,8 +11678,8 @@ table2-mapping To get the start time of segments with such values instead, you have to filter out irrelevant segments before returning the start time, so the - filter is applied to the previous step and the path in the filtering - condition is different: + filter expression is applied to the previous step, and the path used + in the condition is different: '$.track.segments[*] ? (@.HR 130)."start time"' @@ -11693,9 +11704,9 @@ table2-mapping - You can also nest filters within each other: + You can also nest filter expressions within each other: -'$.track ? (@.segments[*] ? (@.HR 130)).segments.size()' +'$.track ? (exists(@.segments[*] ? (@.HR 130))).segments.size()' This expression returns the size of the track if it contains any segments with high heart rate values, or an empty sequence otherwise. @@
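To make the wording concrete, a self-contained example of a filter expression of the kind described in the patched documentation (the data is abbreviated from the documentation example):

SELECT jsonb_path_query(
  '{"track": {"segments": [{"HR": 73,  "location": [47.763, 13.4034]},
                           {"HR": 135, "location": [47.706, 13.2635]}]}}',
  '$.track.segments[*] ? (@.HR > 130)'
);
-- returns only the segment objects whose HR value exceeds 130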
Re: [PATCH] Incremental sort (was: PoC: Partial sort)
On Tue, Jun 25, 2019 at 7:22 PM Tomas Vondra wrote: > > On Tue, Jun 25, 2019 at 04:53:40PM -0400, James Coleman wrote: > > > >Unrelated: if you or someone else you know that's more familiar with > >the parallel code, I'd be interested in their looking at the patch at > >some point, because I have a suspicion it might not be operating in ... > So I've looked into that, and the reason seems fairly simple - when > generating the Gather Merge paths, we only look at paths that are in > partial_pathlist. See generate_gather_paths(). > > And we only have sequential + index paths in partial_pathlist, not > incremental sort paths. > > IMHO we can do two things: > > 1) modify generate_gather_paths to also consider incremental sort for > each sorted path, similarly to what create_ordered_paths does > > 2) modify build_index_paths to also generate an incremental sort path > for each index path > > IMHO (1) is the right choice here, because it automatically does the > trick for all other types of ordered paths, not just index scans. So, > something like the attached patch, which gives me plans like this: ... > But I'm not going to claim those are total fixes, it's the minimum I > needed to do to make this particular type of plan work. Thanks for looking into this! I intended to apply this to my most recent version of the patch (just sent a few minutes ago), but when I apply it I noticed that the partition_aggregate regression tests have several of these failures: ERROR: could not find pathkey item to sort I haven't had time to look into the cause yet, so I decided to wait until the next patch revision. James Coleman
Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
The new version of the patch is attached. This version is even simpler than the previous one, thanks to the recent btree design changes and all the feedback I received. I consider it ready for review and testing. [feature overview] This patch implements the deduplication of btree non-pivot tuples on leaf pages in a manner similar to GIN index "posting lists". Non-pivot posting tuple has following format: t_tid | t_info | key values | posting_list[] Where t_tid and t_info fields are used to store meta info about tuple's posting list. posting list is an array of ItemPointerData. Currently, compression is applied to all indexes except system indexes, unique indexes, and indexes with included columns. On insertion, compression applied not to each tuple, but to the page before split. If the target page is full, we try to compress it. [benchmark results] idx ON tbl(c1); index contains 1000 integer values i - number of distinct values in the index. So i=1 means that all rows have the same key, and i=1000 means that all keys are different. i / old size (MB) / new size (MB) 1 215 88 1000 215 90 10 215 71 1000 214 214 For more, see the attached diagram with test results. [future work] Many things can be improved in this feature. Personally, I'd prefer to keep this patch as small as possible and work on other improvements after a basic part is committed. Though, I understand that some of these can be considered essential for this patch to be approved. 1. Implement a split of the posting tuples on a page split. 2. Implement microvacuum of posting tuples. 3. Add a flag into pg_index, which allows enabling/disabling compression for a particular index. 4. Implement posting list compression. -- Anastasia Lubennikova Postgres Professional:http://www.postgrespro.com The Russian Postgres Company diff --git a/src/backend/access/nbtree/nbtinsert.c b/src/backend/access/nbtree/nbtinsert.c index 602f884..fce499b 100644 --- a/src/backend/access/nbtree/nbtinsert.c +++ b/src/backend/access/nbtree/nbtinsert.c @@ -20,6 +20,7 @@ #include "access/tableam.h" #include "access/transam.h" #include "access/xloginsert.h" +#include "catalog/catalog.h" #include "miscadmin.h" #include "storage/lmgr.h" #include "storage/predicate.h" @@ -56,6 +57,8 @@ static void _bt_insert_parent(Relation rel, Buffer buf, Buffer rbuf, static bool _bt_pgaddtup(Page page, Size itemsize, IndexTuple itup, OffsetNumber itup_off); static void _bt_vacuum_one_page(Relation rel, Buffer buffer, Relation heapRel); +static bool insert_itupprev_to_page(Page page, BTCompressState *compressState); +static void _bt_compress_one_page(Relation rel, Buffer buffer, Relation heapRel); /* * _bt_doinsert() -- Handle insertion of a single index tuple in the tree. @@ -759,6 +762,12 @@ _bt_findinsertloc(Relation rel, _bt_vacuum_one_page(rel, insertstate->buf, heapRel); insertstate->bounds_valid = false; } + + /* + * If the target page is full, try to compress the page + */ + if (PageGetFreeSpace(page) < insertstate->itemsz) + _bt_compress_one_page(rel, insertstate->buf, heapRel); } else { @@ -806,6 +815,11 @@ _bt_findinsertloc(Relation rel, } /* + * Before considering moving right, try to compress the page + */ + _bt_compress_one_page(rel, insertstate->buf, heapRel); + + /* * Nope, so check conditions (b) and (c) enumerated above * * The earlier _bt_check_unique() call may well have established a @@ -2286,3 +2300,232 @@ _bt_vacuum_one_page(Relation rel, Buffer buffer, Relation heapRel) * the page. 
*/ } + +/* + * Add new item (compressed or not) to the page, while compressing it. + * If insertion failed, return false. + * Caller should consider this as compression failure and + * leave page uncompressed. + */ +static bool +insert_itupprev_to_page(Page page, BTCompressState *compressState) +{ + IndexTuple to_insert; + OffsetNumber offnum = PageGetMaxOffsetNumber(page); + + if (compressState->ntuples == 0) + to_insert = compressState->itupprev; + else + { + IndexTuple postingtuple; + /* form a tuple with a posting list */ + postingtuple = BTreeFormPostingTuple(compressState->itupprev, + compressState->ipd, + compressState->ntuples); + to_insert = postingtuple; + pfree(compressState->ipd); + } + + /* Add the new item into the page */ + offnum = OffsetNumberNext(offnum); + + elog(DEBUG4, "insert_itupprev_to_page. compressState->ntuples %d IndexTupleSize %zu free %zu", + compressState->ntuples, IndexTupleSize(to_insert), PageGetFreeSpace(page)); + + if (PageAddItem(page, (Item) to_insert, IndexTupleSize(to_insert), + offnum, false, false) == InvalidOffsetNumber) + { + elog(DEBUG4, "insert_itupprev_to_page. failed"); + /* + * this may happen if tuple is bigger than freespace + * fallback to uncompressed page case + */ + if (compressState->ntuples > 0) + pfree(to_insert); + return false; + }
run os command in pg_regress?
In my case, I want to sleep for 3 seconds in xxx.sql for the pg_regress program, but I don't want to run 'select pg_sleep(3)'. Is that possible with pg_regress? In psql I can run \! sleep(3); exit; but it looks like pg_regress doesn't support it.
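Since pg_regress feeds each test file through psql, psql meta-commands should generally be available, so something like the following sketch may work inside xxx.sql (note the shell spelling is "sleep 3", not "sleep(3)"):

-- xxx.sql
\! sleep 3
SELECT 1;   -- the rest of the test continues after the delay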
Re: Minimal logical decoding on standbys
On Thu, 4 Jul 2019 at 17:21, Amit Khandekar wrote: > > On Thu, 4 Jul 2019 at 15:52, tushar wrote: > > > > On 07/01/2019 11:04 AM, Amit Khandekar wrote: > > > > Also, in the updated patch (v11), I have added some scenarios that > > verify that slot is dropped when either master wal_level is > > insufficient, or when slot is conflicting. Also organized the test > > file a bit. > > > > One scenario where replication slot removed even after fixing the problem > > (which Error message suggested to do) > > Which specific problem are you referring to ? Removing a conflicting > slot, itself is the part of the fix for the conflicting slot problem. > > > > > Please refer this below scenario > > > > Master cluster- > > postgresql,conf file > > wal_level=logical > > hot_standby_feedback = on > > port=5432 > > > > Standby cluster- > > postgresql,conf file > > wal_level=logical > > hot_standby_feedback = on > > port=5433 > > > > both Master/Slave cluster are up and running and are in SYNC with each other > > Create a logical replication slot on SLAVE ( SELECT * from > > pg_create_logical_replication_slot('m', 'test_decoding'); ) > > > > change wal_level='hot_standby' on Master postgresql.conf file / restart the > > server > > Run get_changes function on Standby - > > postgres=# select * from pg_logical_slot_get_changes('m',null,null); > > ERROR: logical decoding on standby requires wal_level >= logical on master > > > > Correct it on Master postgresql.conf file ,i.e set wal_level='logical' > > again / restart the server > > and again fire get_changes function on Standby - > > postgres=# select * from pg_logical_slot_get_changes('m',null,null); > > ERROR: replication slot "m" does not exist > > > > This looks little weird as slot got dropped/removed internally . i guess it > > should get invalid rather than removed automatically. > > Lets user's delete the slot themself rather than automatically removed as > > a surprise. > > It was earlier discussed about what action should be taken when we > find conflicting slots. Out of the options, one was to drop the slot, > and we chose that because that was simple. See this : > https://www.postgresql.org/message-id/flat/20181212204154.nsxf3gzqv3gesl32%40alap3.anarazel.de Sorry, the above link is not the one I wanted to refer to. Correct one is this : https://www.postgresql.org/message-id/20181214005521.jaty2d24lz4nroil%40alap3.anarazel.de -- Thanks, -Amit Khandekar EnterpriseDB Corporation The Postgres Database Company
Re: POC: Cleaning up orphaned files using undo logs
On Mon, Jul 1, 2019 at 1:24 PM Thomas Munro wrote: > > Another small change/review: the function UndoLogGetNextInsertPtr() > previously took a transaction ID, but I'm not sure if that made sense, > I need to think about it some more. > The changes you have made related to UndoLogGetNextInsertPtr() doesn't seem correct to me. @@ -854,7 +854,9 @@ FindUndoEndLocationAndSize(UndoRecPtr start_urecptr, * has already started in this log then lets re-fetch the undo * record. */ - next_insert = UndoLogGetNextInsertPtr(slot->logno, uur->uur_xid); + next_insert = UndoLogGetNextInsertPtr(slot->logno); + + /* TODO this can't happen */ if (!UndoRecPtrIsValid(next_insert)) I think this is a possible case. Say while the discard worker tries to register the rollback request from some log and after it fetches the undo record corresponding to start location in this function, another backend adds the new transaction undo. The same is mentioned in comments as well. Can you explain what makes you think that this can't happen? If we don't want to pass the xid to UndoLogGetNextInsertPtr, then I think we need to get the insert location before fetching the record. I will think more on it to see if there is any other problem with the same. 2. @@ -167,25 +205,14 @@ UndoDiscardOneLog(UndoLogSlot *slot, TransactionId xmin, bool *hibernate) + if (!TransactionIdIsValid(wait_xid) && !pending_abort) { UndoRecPtr next_insert = InvalidUndoRecPtr; - /* - * If more undo has been inserted since we checked last, then - * we can process that as well. - */ - next_insert = UndoLogGetNextInsertPtr(logno, undoxid); - if (!UndoRecPtrIsValid(next_insert)) - continue; + next_insert = UndoLogGetNextInsertPtr(logno); This change is also not safe. It can lead to discarding the undo of some random transaction because a new undo records from some other transaction would have been added since we last fetched the undo record. This can be fixed by just removing the call to UndoLogGetNextInsertPtr. I have done so in undoprocessing branch and added the comment as well. I think the common problem with the above changes is that it assumes that new undo can't be added to undo logs while discard worker is processing them. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: Minimal logical decoding on standbys
On Thu, 4 Jul 2019 at 15:52, tushar wrote: > > On 07/01/2019 11:04 AM, Amit Khandekar wrote: > > Also, in the updated patch (v11), I have added some scenarios that > verify that slot is dropped when either master wal_level is > insufficient, or when slot is conflicting. Also organized the test > file a bit. > > One scenario where replication slot removed even after fixing the problem > (which Error message suggested to do) Which specific problem are you referring to ? Removing a conflicting slot, itself is the part of the fix for the conflicting slot problem. > > Please refer this below scenario > > Master cluster- > postgresql,conf file > wal_level=logical > hot_standby_feedback = on > port=5432 > > Standby cluster- > postgresql,conf file > wal_level=logical > hot_standby_feedback = on > port=5433 > > both Master/Slave cluster are up and running and are in SYNC with each other > Create a logical replication slot on SLAVE ( SELECT * from > pg_create_logical_replication_slot('m', 'test_decoding'); ) > > change wal_level='hot_standby' on Master postgresql.conf file / restart the > server > Run get_changes function on Standby - > postgres=# select * from pg_logical_slot_get_changes('m',null,null); > ERROR: logical decoding on standby requires wal_level >= logical on master > > Correct it on Master postgresql.conf file ,i.e set wal_level='logical' > again / restart the server > and again fire get_changes function on Standby - > postgres=# select * from pg_logical_slot_get_changes('m',null,null); > ERROR: replication slot "m" does not exist > > This looks little weird as slot got dropped/removed internally . i guess it > should get invalid rather than removed automatically. > Lets user's delete the slot themself rather than automatically removed as a > surprise. It was earlier discussed about what action should be taken when we find conflicting slots. Out of the options, one was to drop the slot, and we chose that because that was simple. See this : https://www.postgresql.org/message-id/flat/20181212204154.nsxf3gzqv3gesl32%40alap3.anarazel.de By the way, you are getting the "logical decoding on standby requires wal_level >= logical on master" error while using the slot, which is because we reject the command even before checking the existence of the slot. Actually the slot is already dropped due to master wal_level. Then when you correct the master wal_level, the command is not rejected, and proceeds to use the slot and then it is found that the slot does not exist. -- Thanks, -Amit Khandekar EnterpriseDB Corporation The Postgres Database Company
Re: Feature: Add Greek language fulltext search
On 2019-03-25 12:04, Panagiotis Mavrogiorgos wrote: > Last November snowball added support for Greek language [1]. Following > the instructions [2], I wrote a patch that adds fulltext search for > Greek in Postgres. The patch is attached. I have committed a full sync from the upstream snowball repository, which pulled in the new greek stemmer. Could you please clarify where you got the stopword list from? The README says those need to be downloaded separately, but I wasn't able to find the download location. It would be good to document this, for example in the commit message. I haven't committed the stopword list yet. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
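Assuming the stopword list lands and a text search configuration named 'greek' is created along with it (the configuration name is an assumption here), usage would mirror the existing language configurations:

-- stems the words and drops Greek stopwords such as the article
SELECT to_tsvector('greek', 'η γρήγορη αλεπού');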
RE: [PATCH] Speedup truncates of relation forks
On Tuesday, July 2, 2019 4:59 PM (GMT+9), Masahiko Sawada wrote: > Thank you for updating the patch. Here is the review comments for v2 patch. Thank you so much for review! I indicated the addressed parts below and attached the updated patch. --- visibilitymap.c: visibilitymap_truncate() > I don't think we should describe about the previous behavior here. > Rather we need to describe what visibilitymap_mark_truncate does and what > it returns to the caller. > > I'm not sure that visibilitymap_mark_truncate function name is appropriate > here since it actually truncate map bits, not only marking. Perhaps we can > still use visibilitymap_truncate or change to > visibilitymap_truncate_prepare, or something? Anyway, this function > truncates only tail bits in the last remaining map page and we can have a > rule that the caller must call smgrtruncate() later to actually truncate > pages. > > The comment of second paragraph is now out of date since this function no > longer sends smgr invalidation message. (1) I updated function name to "visibilitymap_truncate_prepare()" as suggested. I think that parameter name fits, unless other reviewers suggest a better name. I also updated its description more accurately: describing current behavior, caller must eventually call smgrtruncate() to actually truncate vm pages, and removed the outdated description. > Is it worth to leave the current visibilitymap_truncate function as a shortcut > function, instead of replacing? That way we don't need to change > pg_truncate_visibility_map function. (2) Yeah, it's kinda displeasing that I had to add lines in pg_truncate_visibility_map. By any chance, re: shortcut function, you meant to retain the function visibilitymap_truncate() and just add another visibilitymap_truncate_prepare(), isn't it? I'm not sure if it's worth the additional lines of adding another function in visibilitymap.c, that's why I just updated the function itself which just adds 10 lines to pg_truncate_visibility_map anyway. Hmm. If it's not wrong to do it this way, then I will retain this change. OTOH, if pg_visibility.c *must* not be modified, then I'll follow your advice. pg_visibility.c: pg_truncate_visibility_map() > @@ -383,6 +383,10 @@ pg_truncate_visibility_map(PG_FUNCTION_ARGS) > { > Oid relid = PG_GETARG_OID(0); > Relationrel; > + ForkNumber forks[MAX_FORKNUM]; > + BlockNumber blocks[MAX_FORKNUM]; > + BlockNumber newnblocks = InvalidBlockNumber; > + int nforks = 0; > > Why do we need the array of forks and blocks here? I think it's enough to > have one fork and one block number. (3) Thanks for the catch. Updated. freespace.c: MarkFreeSpaceMapTruncateRel() > The same comments are true for MarkFreeSpaceMapTruncateRel. > + BlockNumber new_nfsmblocks = InvalidBlockNumber;/* FSM > blocks */ > + BlockNumber newnblocks = InvalidBlockNumber;/* VM > blocks */ > + int nforks = 0; > > I think that we can have new_nfsmblocks and new_nvmblocks and wipe out the > comments. (4) I updated the description accordingly, describing only the current behavior. The caller must eventually call smgrtruncate() to actually truncate fsm pages. I also removed the outdated description and irrelevant comments. -- storage.c: RelationTruncate() > +* We might as well update the local smgr_fsm_nblocks and > smgr_vm_nblocks > +* setting. smgrtruncate sent an smgr cache inval message, > which will cause > +* other backends to invalidate their copy of smgr_fsm_nblocks and > +* smgr_vm_nblocks, and this one too at the next command > boundary. 
But this > +* ensures it isn't outright wrong until then. > +*/ > + if (rel->rd_smgr) > + { > + rel->rd_smgr->smgr_fsm_nblocks = new_nfsmblocks; > + rel->rd_smgr->smgr_vm_nblocks = newnblocks; > + } > > new_nfsmblocks and newnblocks could be InvalidBlockNumber when the forks are > already enough small. So the above code sets incorrect values to > smgr_{fsm,vm}_nblocks. > > Also, I wonder if we can do the above code in smgrtruncate. Otherwise we > always > need to set smgr_{fsm,vm}_nblocks after smgrtruncate, which is inconvenient. (5) In my patch, did you mean that there's a possibility that these values were already set to InvalidBlockNumber even before I did the setting, is it? I'm not sure if IIUC, the point of the above code is to make sure that smgr_{fsm,vm}_nblocks are not InvalidBlockNumber until the next command boundary, and while we don't reach that boundary yet, we make sure these values are valid within that window. Is my understanding correct? Maybe following your advice that putting it inside the smgrtruncate loop will make these values correct. For example, below? void smgrtruncate { ... CacheInvalidateSmgr(reln->smgr_rnode); /* Do
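For anyone wanting to exercise the code path being reviewed, the contrib/pg_visibility function mentioned above can be called directly (the table name is hypothetical); the open question is only how much extra plumbing pg_truncate_visibility_map() needs around the new prepare/smgrtruncate split:

CREATE EXTENSION IF NOT EXISTS pg_visibility;
-- truncates the visibility map fork of the given relation
SELECT pg_truncate_visibility_map('my_table'::regclass);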
Re: Index Skip Scan
On Wed, Jul 3, 2019 at 12:27 AM David Rowley wrote: > On Tue, 2 Jul 2019 at 21:00, Thomas Munro wrote: > > 2. SELECT i, MIN(j) FROM t GROUP BY i could benefit from this if > > you're allowed to go forwards. Same for SELECT i, MAX(j) FROM t GROUP > > BY i if you're allowed to go backwards. Those queries are equivalent > > to SELECT DISTINCT ON (i) i, j FROM t ORDER BY i [DESC], j [DESC] > > (though as Floris noted, the backwards version gives the wrong answers > > with v20). That does seem like a much more specific thing applicable > > only to MIN and MAX, and I think preprocess_minmax_aggregates() could > > be taught to handle that sort of query, building an index only scan > > path with skip scan in build_minmax_path(). > > For the MIN query you just need a path with Pathkeys: { i ASC, j ASC > }, UniqueKeys: { i, j }, doing the MAX query you just need j DESC. While updating the Loose Index Scan wiki page with links to other products' terminology on this subject, I noticed that MySQL can skip-scan MIN() and MAX() in the same query. Hmm. That seems quite desirable. I think it requires a new kind of skipping: I think you have to be able to skip to the first AND last key that has each distinct prefix, and then stick a regular agg on top to collapse them into one row. Such a path would not be so neatly describable by UniqueKeys, or indeed by the amskip() interface in the current patch. I mention all this stuff not because I want us to run before we can walk, but because to be ready to commit the basic distinct skip scan feature, I think we should know approximately how it'll handle the future stuff we'll need. -- Thomas Munro https://enterprisedb.com
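To keep the query shapes under discussion in one place, they look roughly like this on a hypothetical table with a composite index:

CREATE TABLE t (i int, j int);
CREATE INDEX ON t (i, j);

SELECT i, min(j) FROM t GROUP BY i;                 -- forward skip per distinct i
SELECT i, max(j) FROM t GROUP BY i;                 -- backward skip per distinct i
SELECT DISTINCT ON (i) i, j FROM t ORDER BY i, j;   -- equivalent to the MIN form
SELECT i, min(j), max(j) FROM t GROUP BY i;         -- needs both ends of each prefix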
Re: Minimal logical decoding on standbys
On 07/01/2019 11:04 AM, Amit Khandekar wrote: Also, in the updated patch (v11), I have added some scenarios that verify that slot is dropped when either master wal_level is insufficient, or when slot is conflicting. Also organized the test file a bit. One scenario where replication slot removed even after fixing the problem (which Error message suggested to do) Please refer this below scenario Master cluster- postgresql,conf file wal_level=logical hot_standby_feedback = on port=5432 Standby cluster- postgresql,conf file wal_level=logical hot_standby_feedback = on port=5433 both Master/Slave cluster are up and running and are in SYNC with each other Create a logical replication slot on SLAVE ( SELECT * from pg_create_logical_replication_slot('m', 'test_decoding'); ) change wal_level='hot_standby' on Master postgresql.conf file / restart the server Run get_changes function on Standby - postgres=# select * from pg_logical_slot_get_changes('m',null,null); ERROR: logical decoding on standby requires wal_level >= logical on master Correct it on Master postgresql.conf file ,i.e set wal_level='logical' again / restart the server and again fire get_changes function on Standby - postgres=# select * from pg_logical_slot_get_changes('m',null,null); *ERROR: replication slot "m" does not exist *This looks little weird as slot got dropped/removed internally . i guess it should get invalid rather than removed automatically. Lets user's delete the slot themself rather than automatically removed as a surprise. -- regards,tushar EnterpriseDB https://www.enterprisedb.com/ The Enterprise PostgreSQL Company
Re: Excessive memory usage in multi-statement queries w/ partitioning
On Tue, May 28, 2019 at 6:57 AM Amit Langote wrote: > > On 2019/05/27 21:56, Tom Lane wrote: > > Amit Langote writes: > >> On 2019/05/24 23:28, Tom Lane wrote: > >>> So my thought, if we want to do something about this, is not "find > >>> some things we can pfree at the end of planning" but "find a way > >>> to use a separate context for each statement in the query string". > > > >> Maybe like the attached? I'm not sure if we need to likewise be concerned > >> about exec_sql_string() being handed multi-query strings. The whole extension SQL script is passed to execute_sql_string(), so I think it's a good thing to have a similar workaround there. About the patch: -* Switch to appropriate context for constructing querytrees (again, -* these must outlive the execution context). +* Switch to appropriate context for constructing querytrees. +* Memory allocated during this construction is released before +* the generated plan is executed. The comment should mention query and plan trees; everything else seems OK to me.
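A sketch of the situation the thread is about (all names invented): a table with many partitions and a long multi-statement query string, where planning memory for every statement used to accumulate in one context until the whole string had been executed.

CREATE TABLE big (id int, v text) PARTITION BY HASH (id);
-- imagine a few hundred partitions along these lines:
CREATE TABLE big_0 PARTITION OF big FOR VALUES WITH (MODULUS 256, REMAINDER 0);
-- Sent as ONE query string (for example via psql -c, which uses the simple
-- query protocol), each statement is planned against all partitions before
-- any of that memory is released:
--   UPDATE big SET v = 'x' WHERE id = 1; UPDATE big SET v = 'x' WHERE id = 2; ...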
Re: Optimize partial TOAST decompression
> On 3 July 2019, at 18:06, Binguo Bao wrote: > > On Tue, 2 Jul 2019 at 22:46, Paul Ramsey wrote: > This looks good to me. A little commentary around why > pglz_maximum_compressed_size() returns a universally correct answer > (there's no way the compressed size can ever be larger than this > because...) would be nice for peasants like myself. > ... > > Thanks for your comment. I've updated the patch. Thanks Binguo and Paul! From my point of view the patch looks ready for committer, so I switched its state on the CF. Best regards, Andrey Borodin.
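For context, the access pattern this optimization targets looks something like the query below (hypothetical table): only a prefix of a large compressed value is needed, and pglz_maximum_compressed_size() bounds how many compressed bytes must be fetched and decompressed to produce that prefix.

-- Fetch just the beginning of a large compressed TOAST value; with the patch,
-- only enough of the compressed data to rebuild that slice is decompressed,
-- instead of the whole datum.
SELECT substr(payload, 1, 100) FROM documents WHERE id = 42;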
Re: pg_waldump and PREPARE
On Thu, Jul 4, 2019 at 9:45 AM Michael Paquier wrote: > > On Wed, Jul 03, 2019 at 08:23:44PM +0200, Julien Rouhaud wrote: > > So the patch compiles and works as intended. I don't have much to say > > about it, it all looks good to me, since the concerns about xactdesc.c > > aren't worth the trouble. > > > > I'm not sure that I understand Michael's objection though, as > > xl_xact_prepare is not a new definition and AFAICS it couldn't contain > > the records anyway. So I'll let him say if he has further objections > > or if it's ready for committer! > > This patch provides parsing information only for the header of the 2PC > record. Wouldn't it be interesting to get more information from the > various TwoPhaseRecordOnDisk's callbacks? We could also print much > more information in xact_desc_prepare(). Like the subxacts, the XID, > the invalidation messages and the delete-on-abort/commit rels. Most of those are already described in the COMMIT PREPARED message; wouldn't that be redundant? abortrels aren't displayed anywhere though, so +1 for adding them. I also see that the dbid isn't displayed in any of the 2PC messages; it would be useful to have it directly instead of looking for it in other messages for the same transaction.
Re: Introduce MIN/MAX aggregate functions to pg_lsn
On Tue, Jul 02, 2019 at 11:31:49AM -0300, Fabrízio de Royes Mello wrote: > New version attached. This looks in pretty good shape to me, and no objections from me to get those functions as the min() flavor is useful for monitoring WAL retention for complex deployments. Do you have a particular use-case in mind for max() one? I can think of at least one case: monitoring the flush LSNs of a set of standbys to find out how much has been replayed at most. -- Michael
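A sketch of the monitoring queries these aggregates make convenient, assuming the patch's min(pg_lsn)/max(pg_lsn) are in place (column names as in pg_stat_replication):

-- How far the slowest standby has flushed, and how much has been replayed
-- at most across the whole set of standbys:
SELECT min(flush_lsn)  AS slowest_flush,
       max(replay_lsn) AS most_replayed
FROM pg_stat_replication;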
Re: extension patch of CREATE OR REPLACE TRIGGER
On Wed, Jul 03, 2019 at 04:37:00AM +, osumi.takami...@fujitsu.com wrote: > I really appreciate your comments. > Recently, I'm very busy because of my work. > So, I'll fix this within a couple of weeks. Please note that I have switched the patch to waiting on author. -- Michael
Re: pg_waldump and PREPARE
On Wed, Jul 03, 2019 at 08:23:44PM +0200, Julien Rouhaud wrote: > So the patch compiles and works as intended. I don't have much to say > about it, it all looks good to me, since the concerns about xactdesc.c > aren't worth the trouble. > > I'm not sure that I understand Michael's objection though, as > xl_xact_prepare is not a new definition and AFAICS it couldn't contain > the records anyway. So I'll let him say if he has further objections > or if it's ready for committer! This patch provides parsing information only for the header of the 2PC record. Wouldn't it be interesting to get more information from the various TwoPhaseRecordOnDisk's callbacks? We could also print much more information in xact_desc_prepare(). Like the subxacts, the XID, the invalidation messages and the delete-on-abort/commit rels. -- Michael
Re: Refactoring base64 encoding and decoding into a safer interface
On Tue, Jul 02, 2019 at 09:56:03AM +0200, Daniel Gustafsson wrote: > Looks good, passes tests, provides value to the code. Bumping this to ready > for committer as I have no more comments to add. Thanks. I have spent more time testing the different error paths and the new checks in base64.c, and committed the thing. -- Michael
Re: Replacing the EDH SKIP primes
> On 04 Jul 2019, at 02:58, Michael Paquier wrote: > >> On Wed, Jul 03, 2019 at 08:56:42PM +0200, Daniel Gustafsson wrote: >> Agreed, I’ve updated the patch with a comment on this formulated such that it >> should stand the test of time even as OpenSSL changes etc. > > I'd like to think that we had better mention the warning issue > explicitly, so that people don't get surprised, like this for example: > > * This is the 2048-bit DH parameter from RFC 3526. The generation of the > * prime is specified in RFC 2412, which also discusses the design choice > * of the generator. Note that when loaded with OpenSSL this causes > * DH_check() to fail with DH_NOT_SUITABLE_GENERATOR, where leaking > * a bit is preferred. > > Now this makes an OpenSSL-specific issue pop up within a section of > the code where we want to make things more generic with SSL, so your > simpler version has good arguments as well. > > I have just rechecked the shape of the key, and we have an exact > match. LGTM, thanks. cheers ./daniel