Re: Performance degradation after upgrading from 9.5 to 14
On 4/17/24 19:13, Johnathan Tiamoh wrote:
> Hello,
>
> I performed an upgrade from postgresql-9.5 to postgresql-14 and the
> performance has degraded drastically.
>
> Please, is they any advice on getting performance back ?

There's very little practical advice we can provide based on this report, because it's missing any useful details. There are a number of things that might have caused this, but we'd have to speculate. For example:

1) How did you upgrade? pg_dump or pg_upgrade?

2) Did you run ANALYZE to collect statistics after the upgrade?

3) Did you transfer the configuration, or did you just create a new cluster with the default values?

4) What exactly is slower? Queries? Inserts?

5) Can you quantify the impact? Is it 2x slower? 100x slower?

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
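[Editor's note on point 2: pg_upgrade does not carry planner statistics over to the new cluster, so a missing ANALYZE is one of the most common causes of a drastic post-upgrade slowdown. A minimal sketch of the fix:]

```sql
-- Run in each upgraded database; until this completes the planner has
-- no statistics at all and may choose very poor plans.
ANALYZE;

-- Equivalent from the shell, for all databases and in stages
-- (this is what pg_upgrade's generated script does):
--   vacuumdb --all --analyze-in-stages
```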
Re: How do you optimize the disk IO when you cannot assume a file will start at a boundary then?
On 2/22/24 02:22, Siddharth Jain wrote:
> Hi All,
>
> I understand the storage layer in databases goes to great lengths to ensure:
> - a row does not cross a block boundary
> - read/writes/allocation happen in units of blocks
> etc. The motivation is that at the OS level, it reads and writes pages
> (blocks), not individual bytes. I am only concerned about SSDs but I think
> the principle applies to HDD as well.
>
> but how can we do all this when we are not even guaranteed that the
> beginning of a file will be aligned with a block boundary? refer this
> <https://stackoverflow.com/questions/8018449/is-it-guaranteed-that-the-beginning-of-a-file-is-aligned-with-pagesize-of-file-s>
>
> Further, I don't see any APIs exposing I/O operations in terms of blocks.
> All File I/O APIs I see expose a file as a randomly accessible contiguous
> byte buffer. Would it not have been easier if there were APIs that exposed
> I/O operations in terms of blocks?
>
> can someone explain this to me?

The short answer is that this is well outside our control. We do the best we can - split our data files into "our" 8kB pages - and hope that the OS / filesystem will do the right thing to map this to blocks at the storage level. The filesystems do the same thing, to some extent - they align stuff with respect to the beginning of the partition, but if the partition itself is not properly aligned, that won't really work.

As for the APIs, we work with what we have in POSIX - I don't think there are any APIs working with blocks, and it's not clear to me how they would fundamentally differ from the APIs we have now. Moreover, it's not really clear which of the "blocks" would matter. The Postgres 8kB page? The filesystem page? The storage block/sector size?

FWIW I think for SSDs this matters way more than for HDD, because SSDs have to erase the space before a rewrite, which makes it much more expensive. But that's not just about the alignment, but about the page size (with smaller pages being better).

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
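[Editor's illustration, not from the thread: a toy calculation of why alignment matters. An 8kB page that starts on a 4kB filesystem-block boundary touches exactly two blocks; the same page shifted by a few hundred bytes straddles three, turning every page I/O into an extra block read/write (and, on SSDs, an extra erase).]

```python
def blocks_touched(offset: int, length: int, block_size: int = 4096) -> int:
    """Number of fixed-size storage blocks an I/O of `length` bytes
    starting at byte `offset` overlaps."""
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return last - first + 1

# An aligned 8kB page maps onto exactly two 4kB blocks ...
print(blocks_touched(8192, 8192))        # -> 2
# ... but shift it by 512 bytes and it straddles three.
print(blocks_touched(8192 + 512, 8192))  # -> 3
```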
Re: Intermittent Issue with WAL Segment Removal in Logical Replication
On 12/29/23 22:28, Kaushik Iska wrote:
> I am unfortunately not really familiar with Google Cloud SQL internals
> as well. But we have seen this happen on Amazon RDS as well.

Do you have a reproducer for regular Postgres?

> Could it be possible that we are requesting a future WAL segment, say
> WAL upto X is written and we are asking for X + 1? It could be that the
> error message is misleading.

I don't think that should be possible. The LSN in the START_REPLICATION comes from the replica, where it's tracked as the last LSN received from the upstream. So that shouldn't be in the future. And it doesn't seem to be suspiciously close to a segment boundary either. In fact, the LSN in the message is 6/5AE67D79, but the "failed" segment is 000100060059, which is the *preceding* one. So it can't be in the future.

> I do not have the information from pg_replication_slots as I have
> terminated the test. I am fairly certain that I can reproduce this
> again. I will gather both the restart_lsn and contents of pg_wal for the
> failed segment. Is there any other information that would help debug
> this further?

Hard to say. The best thing would be to have a reproducer script, ofc. If that's not possible, the information already requested seems like a good start.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: Intermittent Issue with WAL Segment Removal in Logical Replication
On 12/27/23 16:31, Kaushik Iska wrote:
> Hi all,
>
> I'm including additional details, as I am able to reproduce this issue a
> little more reliably.
>
> Postgres Version: POSTGRES_14_9.R20230830.01_07
> Vendor: Google Cloud SQL
> Logical Replication Protocol version 1

I don't know much about Google Cloud SQL internals. Is it relatively close to Postgres (as e.g. RDS) or are the internals very different / modified for cloud environments?

> Here are the logs of attempt succeeding right after it fails:
>
> 2023-12-27 01:12:40.581 UTC [59790]: [6-1] db=postgres,user=postgres STATEMENT: START_REPLICATION SLOT peerflow_slot_wal_testing_2 LOGICAL 6/5AE67D79 (proto_version '1', publication_names 'peerflow_pub_wal_testing_2') <- FAILS
> 2023-12-27 01:12:41.087 UTC [59790]: [7-1] db=postgres,user=postgres ERROR: requested WAL segment 000100060059 has already been removed
> 2023-12-27 01:12:44.581 UTC [59794]: [3-1] db=postgres,user=postgres STATEMENT: START_REPLICATION SLOT peerflow_slot_wal_testing_2 LOGICAL 6/5AE67D79 (proto_version '1', publication_names 'peerflow_pub_wal_testing_2') <- SUCCEEDS
> 2023-12-27 01:12:44.582 UTC [59794]: [4-1] db=postgres,user=postgres LOG: logical decoding found consistent point at 6/5A31F050
>
> Happy to include any additional details of my setup.

I personally don't see how this could fail and then succeed, unless Google does something smart with the WAL segments under the hood. Surely we try to open the same WAL segment (given the LSN is the same), so how could it not exist and then exist?

As Ron already suggested, it might be useful to see information for the replication slot peerflow_slot_wal_testing_2 (especially the restart_lsn value). Also, maybe show the contents of pg_wal (especially for the segment referenced in the error message).

Can you reproduce this outside the Google cloud environment?

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
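[Editor's note: a sketch of the kind of query that would show the requested slot state - the slot name is taken from the logs above, and the column set is for stock Postgres 14 (a managed service may differ):]

```sql
SELECT slot_name,
       restart_lsn,          -- oldest WAL position the slot still needs
       confirmed_flush_lsn,  -- last LSN the subscriber has confirmed
       wal_status,           -- 'reserved' / 'extended' / 'lost'
       active
FROM pg_replication_slots
WHERE slot_name = 'peerflow_slot_wal_testing_2';

-- Which WAL file restart_lsn falls into, to compare against ls of pg_wal:
SELECT pg_walfile_name(restart_lsn)
FROM pg_replication_slots
WHERE slot_name = 'peerflow_slot_wal_testing_2';
```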
Re: How to grant read only functions execute permission to read only user
On 7/17/23 16:11, Yi Sun wrote:
> Hello guys,
>
> Our read only user is okay to be granted read only permission of tables.
>
> How to grant read only functions execute permission to read only user,
> is there a simple way to do it please? If not, how to get the list of
> read only functions please? Then can grant one by one based on the list,
> thanks

What is a read-only function? I don't think Postgres has anything like that.

Functions inherit the privileges of the user that executes them by default. So if the user is read-only (i.e. has just the SELECT privilege), then the function can't do any writes either.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
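[Editor's note: if the goal is simply to let such a user run functions at all, EXECUTE is granted separately from table privileges. A sketch with placeholder names (`readonly_user`, schema `public`); note that by default EXECUTE is already granted to PUBLIC anyway:]

```sql
-- Functions run with the caller's privileges by default (SECURITY INVOKER),
-- so a user with only SELECT on tables cannot write through them.
GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA public TO readonly_user;

-- And for functions created later in that schema:
ALTER DEFAULT PRIVILEGES IN SCHEMA public
    GRANT EXECUTE ON FUNCTIONS TO readonly_user;
```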
Re: Query take a long time and use no index
On 7/17/23 13:20, Tomas Vondra wrote:
> ...
>
> It's always going to be slow with the COUNT(DISTINCT), I'm afraid.
>
> Not sure how much you can modify the query / database, and how accurate
> results you need. If you're OK with estimates, you can try the postgres-hll
> extension [1] which estimates count(distinct). For exact results, I
> wrote the count_distinct extension [2] that uses a hashtable. Might be
> worth a try, I guess.

Damn, I forgot to add the links:

[1] https://github.com/citusdata/postgresql-hll
[2] https://github.com/tvondra/count_distinct

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: Query take a long time and use no index
137..2121.695 rows=5962263 loops=3)
> Planning Time: 0.247 ms
> JIT:
>   Functions: 11
>   Options: Inlining false, Optimization false, Expressions true, Deforming true
>   Timing: Generation 2.665 ms, Inlining 0.000 ms, Optimization 1.612 ms, Emission 48.580 ms, Total 52.857 ms
> Execution Time: 3525.656 ms
> (12 rows)
>
> Why are the indexes not used ?

It's a common misconception that indexes necessarily improve performance and that slow queries are slow because of missing indexes. In practice indexes are most useful for selective queries, i.e. queries that scan only a fraction of the table - say, a couple percent. But this query scans ~17M rows from the table, and my guess is that's a significant part of the table. In which case the index is useless. Also, notice the seqscan took ~4s. Even if that drops to ~1s with an index, the overall query will still take 12s.

> Is it a Problem of the DATE_TRUNC ?

My guess is it's more about the DISTINCT, which forces a sort before the aggregate function. Which also means the query can't be parallel. You can try simplifying the query step by step - remove the DISTINCT first, then the DATE_TRUNC.

> How can I optimize?

It's always going to be slow with the COUNT(DISTINCT), I'm afraid.

Not sure how much you can modify the query / database, and how accurate results you need. If you're OK with estimates, you can try the postgres-hll extension [1] which estimates count(distinct). For exact results, I wrote the count_distinct extension [2] that uses a hashtable. Might be worth a try, I guess.

Another option is to precalculate stuff, and do queries on that. That is, you're not going to add data with old timestamps, so you can pre-aggregate that.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
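[Editor's note: the "precalculate stuff" suggestion can look like this - a sketch with hypothetical table/column names (`events`, `ts`, `user_id`), since the original schema isn't shown in the thread:]

```sql
-- Pre-aggregate closed periods once, so COUNT(DISTINCT) only ever runs
-- over one day of data instead of the whole multi-million-row history.
CREATE MATERIALIZED VIEW daily_users AS
SELECT date_trunc('day', ts) AS day,
       count(DISTINCT user_id) AS distinct_users
FROM events
GROUP BY 1;

-- Cheap to query, and refreshable whenever a day closes:
REFRESH MATERIALIZED VIEW daily_users;
SELECT * FROM daily_users WHERE day >= date '2023-01-01';
```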
Re: [Beginner Question]What is query jumbling?
On 7/17/23 05:23, Wen Yi wrote:
> Hi community,
> I am reading the 'Generate code for query jumbling through gen_node_support.pl'
> (https://www.postgresql.org/message-id/E1pMk51-000puf-55%40gemulon.postgresql.org)
> to understand the principle of the postgres's node, but I really don't know
> the meaning of the word 'query jumbling'.
>
> Can someone provide me some advice?
> Thanks in advance!

Stuff like this is usually explained somewhere in the code base, so "git grep" is your friend. In this case it's probably good to read the comment in src/backend/nodes/queryjumblefuncs.c which says:

 * Normalization is implemented by fingerprinting queries, selectively
 * serializing those fields of each query tree's nodes that are judged
 * to be essential to the query.  This is referred to as a query jumble.

So "jumbling" is a process to generate a "normalized" query.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
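[Editor's illustration: a deliberately simplified sketch of the idea in Python. This is NOT how Postgres implements it - Postgres fingerprints the parsed query tree, not the SQL text - but it shows what normalization buys you: queries differing only in constants collapse to a single fingerprint.]

```python
import re

def toy_jumble(sql: str) -> str:
    """Crude text-level normalization: strip out literal constants so
    queries that differ only in constants share one normalized form.
    (Postgres jumbles the parse tree instead, which is far more robust.)"""
    sql = re.sub(r"'[^']*'", "?", sql)   # string literals -> placeholder
    sql = re.sub(r"\b\d+\b", "?", sql)   # numeric literals -> placeholder
    return re.sub(r"\s+", " ", sql).strip().lower()

print(toy_jumble("SELECT * FROM t WHERE id = 42"))
print(toy_jumble("select  *  from t where id = 7"))
# both print: select * from t where id = ?
```

This is essentially what tools like pg_stat_statements rely on: grouping statistics by the jumble rather than by the literal query text.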
Re: move databases from a MySQL server to Postgresql.
On 6/23/23 13:45, Alfredo Alcala wrote:
> Hello
>
> I need to move some databases from a MySQL server to Postgresql.
>
> Can someone tell me the migration procedure, tools and recommendations?

I'm not an expert on this, but migrations tend to be somewhat application-specific. I'd suggest you take a look at this wiki page:

https://wiki.postgresql.org/wiki/Converting_from_other_Databases_to_PostgreSQL#MySQL

and maybe try some of the tools mentioned there (pgloader, mysql2pgsql, and so on). You'll have to give it a try on your databases, and then ask questions about practical issues you run into.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: pb with join plan
On 6/21/23 00:26, Marc Millas wrote:
>
> On Tue, Jun 20, 2023 at 11:19 PM David Rowley <dgrowle...@gmail.com> wrote:
>
>> On Wed, 21 Jun 2023 at 08:34, Marc Millas <marc.mil...@mokadb.com> wrote:
>>>
>>> On Tue, Jun 20, 2023 at 10:14 PM David Rowley <dgrowle...@gmail.com> wrote:
>>>>
>>>> On Wed, 21 Jun 2023 at 07:42, Marc Millas <marc.mil...@mokadb.com> wrote:
>>>>> But if I do the same with clause one OR clause 2, I have to kill the
>>>>> request after an hour, seeing the filesystem showing more than 140 Mb
>>>>> of increased usage.
>>>
>>> link to the anonymized plan of the req with one clause:
>>> https://explain.depesz.com/s/TWp4
>>> link to the plan with the second clause alone:
>>> https://explain.depesz.com/s/byW5
>>> link to the plan with both clauses ORed (the one not finishing):
>>> https://explain.depesz.com/s/jHO2
>>
>> It's quite difficult to know what the problem is you want to fix here.
>> Your initial post indicated it was the query with the OR condition
>> that was causing you the problems, but the plan you've posted has no
>> OR condition?!
>>
>> You're more likely to get help here if you take time to properly
>> explain the situation and post the information that's actually
>> relevant to the problem you're having, or state the problem more
>> clearly, as there's a mismatch somewhere.
>>
>> It might also be worth having a look at
>> https://wiki.postgresql.org/wiki/Slow_Query_Questions . EXPLAIN is not
>> going to tell us what part of the query is slow. I'll let the wiki
>> page guide you into what to do instead.
>
> I know that page. obviously, as I have to kill the request, I cannot
> provide an explain analyze...

It's a bit weird the "victor" table is joined seemingly without any join conditions, leading to a cross join (which massively inflates the cost for joins above it). Maybe the anonymized plan mangles it somehow.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: Question regarding specifics of GIN and pg_trgm performance and potential use of show_trgm to improve it
...indexing the number of trigrams like this can help, and I'm not sure how to improve this (at least for the built-in GIN). It seems similarity searches are bound to be proportional to the most frequent trigram in the query.

I wonder if the "newer" GIN variants like RUM [1] could improve this, but I don't think it has trgm opclasses.

regards

[1] https://github.com/postgrespro/rum

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
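[Editor's note: show_trgm (from the pg_trgm extension) exposes the trigrams a value is indexed under, which is how you can spot the frequent-trigram problem in your own data. A sketch - `items` / `name` are placeholder names:]

```sql
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Lists the trigrams extracted from a value (the same set GIN indexes):
SELECT show_trgm('postgres');

-- Rough frequency check: which trigrams dominate a column?
SELECT unnest(show_trgm(name)) AS trgm, count(*)
FROM items
GROUP BY 1
ORDER BY 2 DESC
LIMIT 10;
```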
Re: Expr. extended stats are skipped with equality operator
On 8/7/22 19:28, Tom Lane wrote:
> Tomas Vondra writes:
>> Yeah, this seems like a clear bug - we should not fail queries like
>> this. It's a sign statext_is_compatible_clause() and the MCV code
>> disagree which clauses are compatible.
>
> Indeed. I attempted to fix that at e33ae53dd.

Thanks! That probably explains why I've been unable to reproduce it - I hadn't realized there's a fix already.

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: Expr. extended stats are skipped with equality operator
On 8/5/22 11:50, Danny Shemesh wrote:
> Hey all !
>
> I'm on a quest to help the planner (on pg14) use the best of several
> partial, expressional indices we have on some large tables (few TBs in
> size, billions of records).
>
> As we know, stats for expressions in partial indices aren't gathered by
> default - so I'm tinkering with expressional extended stats to cover for
> those.
>
> I've tackled two interesting points there:
> 1. Seems like expressional stats involving the equality operator are
> skipped or mismatched (fiddle
> <https://www.db-fiddle.com/f/4jyoMCicNSZpjMt4jFYoz5/5379>)
>
> Let's take the following naive example:
>
>   create table t1 (x integer[]);
>   insert into t1 select array[1]::integer[] from generate_series(1, 10, 1);
>   create statistics s1 on (x[1] = 1) from t1;
>   analyze t1;
>
>   explain analyze select * from t1 where x[1] = 1;
>   > Seq Scan on t1 (cost=0.00..1986.00 rows=500 width=29) (actual time=0.009..36.035 rows=10 loops=1)
>
> Now, of course one can just create the stat on x[1] directly in this
> case, but I have a more complex use case where an equality operator is
> beneficial; should the above case be supported ? feels like I'm just
> missing something fundamental.

Hmmm. The problem here is that the expression may be interpreted either as an operator clause (A=B), or as a boolean expression ((A=B)=TRUE). In principle we might check which option matches extended statistics, but that's not for free :-(

So what the current code does is try to match the more specific cases first, leaving the "bool expression" as a last resort. That seems like a reasonable trade-off to me, but I'll think about this a bit more. There are probably other ways to match expressions, and we can't possibly explore all of them. For example you may create statistics on (x=1 AND y=2) and I doubt we'll match that, because we'll try matching individual clauses, not some arbitrary combinations of clauses. (Maybe we shouldn't even allow creating such statistics ...)

> 2. Less important, just a minor note - feel free to ignore - although
> the eq. operator above seems to be skipped when matching the ext. stats,
> I can work around this by using a CASE expression (fiddle
> <https://www.db-fiddle.com/f/wJZNH1rNwJSo3D5aByQiWX/1>);
>
> Building on the above example, we can:
>
>   create statistics s2 on (case x[1] when 1 then true else false end) from t1;
>   explain analyze select * from t1 where (case x[1] when 1 then true else false end);
>   > Seq Scan on t1 (cost=0.00..1986.00 rows=10 width=25) (actual time=0.011..33.721 rows=10 loops=1)

Yes, because this ends up not being matched as an opclause, and therefore goes all the way to boolvarsel() in clause_selectivity_ext.

> What's a bit problematic here, though, is that if we mix other dependent
> columns to the extended stat, and specifically if we create an mcv,
> queries involving the CASE expression throw with `error: unknown clause
> type 130`, where clause type == T_CaseExpr.
>
> The second point for me would be that I've found it a bit non intuitive
> that creating an extended statistic can fail queries at query time; it
> makes sense that the mcv wouldn't work for case expressions, but it
> might've been a bit clearer to:
>
> a. Fail this at statistic creation time, potentially, or
> b. Convert the type numeric in the above error to its text
> representation, if we can extract it out at runtime somehow -
> I couldn't find a mapping of clause type numerics to their names, and as
> the node tags are generated at compile time, it could be build-dependent
> and a bit hard to track down if one doesn't control the build flags

Yeah, this seems like a clear bug - we should not fail queries like this. It's a sign statext_is_compatible_clause() and the MCV code disagree which clauses are compatible. Can you share an example triggering this?

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: PostgreSQL 14.4 ERROR: out of memory issues
On 7/14/22 21:25, Aleš Zelený wrote:
> st 13. 7. 2022 v 2:20 odesílatel Michael Paquier <mich...@paquier.xyz> napsal:
>
>> On Mon, Jul 11, 2022 at 10:50:23AM +0200, Aleš Zelený wrote:
>>> So far, it has happened three times (during a single week) from the 14.3 ->
>>> 14.4 upgrade, before 14.4 we haven't suffered from such an issue.
>>>
>>> Questions:
>>> 1) Can we safely downgrade from 14.4 to 14.3 by shutting down the instance
>>> and reinstalling 14.3 PG packages (to prove, that the issue disappear)?
>>> 2) What is the best way to diagnose what is the root cause?
>>
>> Hmm. 14.4 has nothing in its release notes that would point to a
>> change in the vacuum or autovacuum's code paths:
>> https://www.postgresql.org/docs/14/release-14-4.html#id-1.11.6.5.4
>>
>> There is nothing specific after a look at the changes as of, and I am
>> not grabbing anything that would imply a change in memory context
>> handling either:
>> `git log --stat REL_14_3..REL_14_4`
>> `git diff REL_14_3..REL_14_4 -- *.c`
>>
>> Saying that, you should be able to downgrade safely as there are no
>> changes in WAL format or such that would break things. Saying that,
>> the corruption issue caused by CONCURRENTLY is something you'd still
>> have to face.
>
> Thanks, good to know that, we can use it for a test case, since we
> already hit the CONCURRENTLY bug on 14.3.
>
>>> 2022-07-02 14:48:07 CEST [3930]: [3-1] user=,db=,host=,app= ERROR: out of memory
>>> 2022-07-02 14:48:07 CEST [3930]: [4-1] user=,db=,host=,app= DETAIL: Failed on request of size 152094068 in memory context "TopTransactionContext".
>>> 2022-07-02 14:48:07 CEST [3930]: [5-1] user=,db=,host=,app= CONTEXT: automatic vacuum of table "prematch.replication.tab_queue_tmp"
>>
>> This is the interesting part. Do you happen to use logical
>> replication in a custom C++ plugin?
>
> We are using logical replication to other instances (pg_output) and
> decoderbufs https://github.com/debezium/postgres-decoderbufs
> for other applications.

This is probably just a red herring - std::bad_alloc is what the process that runs into the overcommit limit gets. But the real issue (e.g. memory leak) is likely somewhere else - different part of the code, different process ...

> ...
>
> Checking the RssAnon from proc/pid/status I've found some points where
> RssAnon memory usage grew very steep for a minute, but no "suspicious"
> queries/arguments were found in the instance logfile.
>
> Any hint on how to get the root cause would be appreciated, since so far
> I've failed to isolate the issue into a reproducible testcase.
> At least I hope that looking for the RssAnon process memory is an
> appropriate metric, if not, let me know and I'll try to update the
> monitoring to get the root cause.
>
> I can imagine a workaround with client application regular reconnect...,
> but up to 14.3 it works, so I'd like to fix the issue either on our
> application side or at PG side if it is a PG problem.

I think it'd be interesting to get memory context stats from the processes consuming a lot of memory. If you know which processes are suspect (and it seems you know, because if a reconnect helps it's the backend handling the connection), you can attach a debugger and do

  $ gdb -p $PID
  call MemoryContextStats(TopMemoryContext)

which will log info about memory contexts, just like autovacuum. Hopefully that tells us which memory context is bloated, and that might point us to a particular part of the code.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: increasing effective_cache_size slows down join queries by a factor of 4000x
On 2/15/22 01:06, A Shaposhnikov wrote:
> Interestingly I have a second PG 14.2 database, with identical table
> definitions, but about 10% smaller row counts, and the exact same query
> works fast there without the 2nd condition:

Are you sure about the 10%? Because in the plans from the first machine I see this:

>  -> Index Scan using team_pkey on team t (cost=0.57..11382381.88 rows=78693167 width=175) (actual time=0.016..0.695 rows=854 loops=1)

while the second machine does this:

  -> Index Scan using team_pkey on team t (cost=0.57..2366113.83 rows=2807531 width=160) (actual time=0.031..0.801 rows=888 loops=1)

That's 2.8M vs. 78M, quite far from a "10% difference". Not sure about the team_aliases table, that's impossible to say from the plans.

This may matter a lot, because we use effective_cache_size to calculate the cache hit ratio for the query, with relation sizes as an input. So smaller relations (or a larger effective_cache_size) mean cheaper random I/O, hence a preference for the nested loop join. The other thing is data distribution - that may matter too.

IMO it's pointless to investigate this further - we know what's causing the issue. The optimizer is oblivious that the merge join will have to skip a large part of the second input, due to the implicit condition. Notice that adding the condition changes the cost from:

  Limit (cost=81.33..331.82 rows=1000 width=183) ...

to

  Limit (cost=81.33..720.48 rows=1000 width=183) ...

So it seems *more* expensive than the first plan. Taken to the extreme, the planner could theoretically have chosen to use the first plan (and delay the condition until after the join).

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: increasing effective_cache_size slows down join queries by a factor of 4000x
On 2/4/22 05:21, A Shaposhnikov wrote:
> Tomas, thank you! The query:
>
>   select 1 from data as d, data_class as dc
>   where dc.data_id = d.id
>     and d.id > 205284974
>     and dc.data_id > 205284974   -- new condition
>   order by d.id
>   limit 1000;
>
> totally solved it - it is now fast under all conditions! I thought that
> the optimizer would be able to infer it itself.

Unfortunately, the optimizer is not that smart - we can do that for equality conditions, but not for other operators. There was actually a thread [1] exploring a possibility to extend this to inequalities, but it went nowhere so far. It also explains why it's done only for equality operators. In short, it's fairly expensive, makes costing of joins more difficult, and most queries can't benefit from it (because conditions on join keys are not that common).

BTW what does the final query plan look like? Is it using the merge join or the nested loop? I wonder if this might be formulated as a costing issue, pushing the planner to use the nested loop.

[1] https://www.postgresql.org/message-id/flat/CAFQUnFhqkWuPCwQ1NmHYrisHJhYx4DoJak-dV%2BFcjyY6scooYA%40mail.gmail.com

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: increasing effective_cache_size slows down join queries by a factor of 4000x
On 2/2/22 22:10, Artyom Shaposhnikov wrote:
> the row estimate became ~1000x smaller with the stat in place, so it
> looks like it grossly miscalculates the query plans without the stats
> for large tables representing M:M relations.

Well, if the estimates are significantly off (and 3 orders of magnitude certainly qualifies), then all bets are off. There's no magical option that'd fix planning in such conditions. Ultimately, fixing the estimates (e.g. by creating extended statistics) is the right "fix" as it gives the optimizer the information needed to pick the right plan.

> On Wed, Feb 2, 2022 at 11:47 AM Michael Lewis wrote:
>> What does the row estimate look like on the scan of data table with
>> that statistic in place? Anytime the stats give a mis-estimate this
>> far off, I wouldn't expect that plans would be optimal except by luck.
>
> Index Scan using data_pkey on data t (cost=0.57..21427806.53 rows=58785023 width=131) (actual time=0.024..0.482 rows=854 loops=1)
>   Index Cond: (id > 205284974)

It'd be interesting to see the plans without the LIMIT, as that makes the "actual" values low simply by terminating early.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
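[Editor's note: the extended statistics being discussed are created like the sketch below. The statistics name and the second column (`class_id`) are assumptions for illustration - the actual schema isn't shown in the thread:]

```sql
-- Tell the planner about correlations between the columns of the
-- M:N link table, then re-ANALYZE so the statistics get built.
CREATE STATISTICS data_class_stats (ndistinct, dependencies, mcv)
    ON data_id, class_id FROM data_class;
ANALYZE data_class;
```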
Re: Create and access a dictionary type
On 1/5/22 14:36, Shaozhong SHI wrote:
> I just checked operators. I could not find any operator to set a new
> value given a key.

I'd say jsonb_set is what you're looking for. Postgres does not have a thing that'd match the Oracle feature exactly, you'll have to adjust your application to use what's available.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
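[Editor's note: a small sketch of jsonb_set and the related operators; the table and column names are made up for illustration:]

```sql
CREATE TABLE cfg (id int PRIMARY KEY, attrs jsonb);
INSERT INTO cfg VALUES (1, '{"color": "red", "size": 10}');

-- Read a key (->> returns text):
SELECT attrs->>'color' FROM cfg WHERE id = 1;

-- Set / replace a key with jsonb_set:
UPDATE cfg SET attrs = jsonb_set(attrs, '{size}', '12') WHERE id = 1;

-- Or merge in new keys with the || operator:
UPDATE cfg SET attrs = attrs || '{"weight": 3}' WHERE id = 1;
```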
Re: Create and access a dictionary type
On 1/5/22 17:24, Ron wrote:
> On 1/5/22 7:22 AM, Tomas Vondra wrote:
>> On 1/5/22 14:17, Shaozhong SHI wrote:
>>> Any examples in Postgres to create a dictionary type to store and
>>> access key value pairs?
>>
>> I'd say JSONB can be used as an associative array, and e.g. in Python
>> can map to the dict data type.
>
> The question confuses me. Why isn't "create a table with Primary Key"
> the answer to "Any examples in Postgres to create a dictionary type to
> store and access key value pairs?"

Well, yes - you can always decompose the dict and store it in an EAV, but sometimes it's more convenient to just use a "nested" dictionary at the row level. Which is what a JSONB column does.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: Create and access a dictionary type
On 1/5/22 14:17, Shaozhong SHI wrote:
> Any examples in Postgres to create a dictionary type to store and
> access key value pairs?

I'd say JSONB can be used as an associative array, and e.g. in Python can map to the dict data type.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: Postgis - geography function
On 1/5/22 09:48, Sameer TWAYANA wrote:
> Hi Sir,
>
> I need some help. I have been using the postgis extension for many
> years. Now in postgis version 3.2 using geom data in geography with
> ST_Length and ST_Distance function returns some weird results. Before I
> used 3.1.4, all was good. But I got different results in version 3.2.
> So can you suggest why I am getting different results in postgis
> version 3.2. I really appreciate for your help. Thank you.

You haven't shown a lot of detail about the issue, so it's hard to say what's going on, unfortunately :-( You'll need to show us some queries with expected/actual results (the simpler the better, of course). And provide info about the Postgres version, etc.

BTW if it's a problem related to PostGIS, maybe using a mailing list for that specific project [1] would yield better results.

[1] https://lists.osgeo.org/mailman/listinfo/postgis-users

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: Plans with bad estimates/paths
On 11/16/21 9:22 PM, Joe Wildish wrote: > Hello, > > I am struggling with a problem that appears planning related. I'm > hoping folk here may be able to advise on how best to tackle the > issue. > > We have a system that ingests JSON messages containing order data. > The data from these messages is inserted into a normalised table > structure; "order", "order_item", "order_discount", etc. Orders can > have multiple items, items can have multiple discounts and so forth. > Referential constraints exist between the tables in the expected way. > Fields in the tables are all declared NOT NULL. If the schema of the > JSON is such that a field is optional, we reflect that optionality in > another "order_" table, and make "order_" be a subset of > "order". Moreover, we ingest messages for different > clients/customers. Therefore each message-related table carries with > it a client identifier which forms part of the primary key on the > table. For example, "order" has a key of "(client_id, order-id)". > > We have written transformations that calculate various facts about > orders. For example, one of the transforms emits order item data > where we have calculated the overall discount for the item, and have > "filled in" some of the optional fields with defaults, and have > categorised the order item on the basis of some aspect of the order > (e.g. "this is an e-commerce order, this is retail order"). These > transforms are typically per client (e.g. `WHERE client_id = 123`) > although in some cases we transform over multiple clients (e.g. > `WHERE client_id = ANY (SELECT client_id FROM clients WHERE ...)`). > > The issue is that for some clients, or combination of clients, the > planner is choosing a path that takes substantially longer to > evaluate than the plan it typically chooses for other clients. The > number of tables being joined is in the region of 15. 
There is an > extended statistic object in place to help the one aggregation that > occurs (defined on the basis of the `GROUP BY` columns) to try and > get a better estimate of the likely number of rows emitted. However, > what I am often seeing in the explain plan is that the estimated rows > is small and the actuals are significantly larger e.g. > > Merge Join (cost=1.14..253250.32 rows=1099 width=69) (actual > time=1268.587..2400.353 rows=4282355 loops=1) > It sure smells like a case of correlated data for some clients but not others, but who knows ... Hard to say without seeing the nodes below the join. If the lower nodes are estimated accurately, then it's the join selectivity that is estimated poorly, and there's not much we can do about it :-( Do the "good" plans have the same underestimate? Maybe there's just much less data for those clients, and the "poor" plan ends up being fast anyway? > I am assuming this underestimation is the source of the planner > choosing the "wrong" path; in production, we have had to resort to > setting the join and from collapse limits to 1 to force a naive plan > to be generated. This is giving us execution times in the 10/20 > second range vs. >45m in some cases. > That may be happening, yes. So is it the join order that ends up being wrong, or the join methods? Have you tried increasing the collapse limit instead? Although, if it works for some queries but not others, that's likely not going to help. > (a) Do you have any suggestions on a general approach to tackling the > problem? For example, one option might be to pre-compute some of the > subqueries that are occurring in the transforms, write the results > into their own tables, and substitute those tables in place of the > subqueries in the main transform. Is this something people typically > do in this situation? > The first thing I'd do is reduce the query size as much as possible. In this case I'd try removing as many joins as possible until the issue disappears. 
The simpler the query, the easier it is to investigate. And yes, replacing parts of a query with a temporary table is a common solution, because it's possible to collect statistics on it, build indexes etc. That usually solves estimation issues in multi-tenancy. Sometimes even a CTE with materialization is enough. > (b) Do I need to provide a schema and explain plans to get any > concrete advice on how to proceed? > That would be helpful, particularly after making the query as small as possible. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
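A rough sketch of the temporary-table approach suggested above (all names are hypothetical):

```sql
CREATE TABLE order_discount (client_id int, order_id int, item_id int, discount numeric);

-- Materialize the problematic subquery, so the planner gets real statistics:
CREATE TEMP TABLE item_discounts AS
SELECT client_id, order_id, item_id, sum(discount) AS total_discount
FROM order_discount
WHERE client_id = 123
GROUP BY 1, 2, 3;

ANALYZE item_discounts;                         -- collect stats on the result
CREATE INDEX ON item_discounts (order_id, item_id);
-- ...then join item_discounts in place of the original subquery.
```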
Re: GIN index
Hi, Why exactly are you starting a new thread again, instead of continuing in the thread you started a couple days ago? A couple recommendations: 1) I find it unlikely you'll get a lot of help unless you provide a way to reproduce the issue easily. That is, something people can build and test on their own. Posting incomplete chunks of code means we'd have to spend time filling in the gaps, etc. Sorry, not going to happen. Create an extension that we can build/install easily, include a SQL script reproducing the issue. 2) The code is incomplete, e.g. hex2char/char2hex are missing. And the calls seem to be the other way around. I'd expect geomgrid_in to transform a hex string to binary, but it calls Char2Hex. 3) AFAIK there's nothing like CREATE FUNCTION ... NOT FENCED ..., so it seems you're using some sort of fork. In that case you should probably ask them. 4) Don't post backtraces as screenshots. Copy them as text, attach them as a text file. 5) Continue in the thread you already started, otherwise it's impossible to follow the discussion. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Avg/max size of these JSON docs in Postgres
On 10/12/21 21:21, Michael Lewis wrote: On Tue, Oct 12, 2021 at 12:51 PM Simon Riggs wrote: On Tue, 12 Oct 2021 at 18:53, Michael Lewis wrote: > > On Tue, Oct 12, 2021 at 1:43 AM Simon Riggs wrote: >> >> On Tue, 12 Oct 2021 at 08:14, Ram Pratap Maurya wrote: >> >> > Confirm what is Avg/max size of these JSON docs in Postgres. >> >> JSON and JSONB datatypes can both be max 1GB in size. > > That is per row. No, that is per column. Yes, sorry. My attempt at clarification only muddled things. Each column within each row can be up to 1GB in size is how I understand the limit. But you're kinda right, actually. It very much is per-tuple, because in various places we form tuples with all the data inline. Consider for example this: create table t (a text, b text); alter table t alter column a set storage extended; alter table t alter column b set storage extended; insert into t select repeat(md5('a'), 512*1024*1024/32), repeat(md5('b'), 512*1024*1024/32); ERROR: invalid memory alloc request size 1073741880 Clearly, both values are only 512MB (no compression). Yet it fails, simply because tts_virtual_copy_heap_tuple calls heap_form_tuple to form a tuple with all values and also the tuple header. But this succeeds, because the values are 64B shorter, leaving enough space for the tuple header etc. insert into t select repeat(md5('a'), 512*1024*1024/32 - 2), repeat(md5('b'), 512*1024*1024/32); And you can even select the data: select * from t; You can even do this: update t set a = repeat(md5('a'), 512*1024*1024/32); which works, so now you have a row with two 512MB values. But then you'll face this: select * from t; ERROR: out of memory DETAIL: Cannot enlarge string buffer containing 536870922 bytes by 536870912 more bytes.
because printtup() builds a huge string with all the data (and if the columns had compression, this would be decompressed, because it goes to the client). So yeah, there's an explicit 1GB limit per value, but having rows close to the 1GB limit is going to cause all sorts of unpredictable and rather painful issues :-( regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: create a new GIN index for my own type
On 10/4/21 3:32 PM, huangning...@yahoo.com wrote: I have already debugged the program according to this step, but I found that in the DataCopy function, the variable typlen should be 8, but it is -1, Well, if you have debugged this, it'd be nice if you could share more information (e.g. backtraces, etc.) otherwise others can just guess what you saw. And that makes it much harder to help you. I see you defined the data type as PASSEDBYVALUE, but you haven't specified INTERNALLENGTH, so it's -1 (i.e. variable length). Obviously, that can't be passed by value - not sure if this is intentional or just a case of CREATE TYPE not checking it. BTW it's customary not to top post - inline replies are much easier to follow, as it makes clearer which parts you respond to. And please make sure that you're responding to the mailing list, not just directly to the other person. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
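For illustration, a pass-by-value type has to declare a fixed internal length explicitly; a sketch modeled on the pattern used in PostgreSQL's own create_type regression tests (it reuses the built-in int8 I/O functions, which requires superuser; the type name is made up):

```sql
CREATE TYPE mytype;   -- shell type, so the I/O functions can reference it

CREATE FUNCTION mytype_in(cstring) RETURNS mytype
    AS 'int8in' LANGUAGE internal STRICT IMMUTABLE;
CREATE FUNCTION mytype_out(mytype) RETURNS cstring
    AS 'int8out' LANGUAGE internal STRICT IMMUTABLE;

CREATE TYPE mytype (
    INPUT = mytype_in,
    OUTPUT = mytype_out,
    INTERNALLENGTH = 8,   -- a fixed length is required...
    PASSEDBYVALUE         -- ...for pass-by-value to make sense
);
```

With INTERNALLENGTH left out (i.e. -1, variable length), PASSEDBYVALUE is contradictory, which matches the typlen = -1 observed in the debugger.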
Re: Problem in pg_basebackup of 1.7 TB database while replicating in PostgreSQL 9.4.( Master and slave both are in postgreSQL 9.4)
On 10/4/21 9:28 AM, Amal Chakravarty wrote: Hi all. I am working with a database of 1.7 TB size which is in PostgreSQL 9.4. Kindly suggest how to take the pg_basebackup of such a huge data while replicating from master to slave. And what exactly is the issue? Does it fail in some way or are you looking for a faster / more efficient way to transfer the data? regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: create a new GIN index for my own type
On 10/4/21 8:30 AM, huangning...@yahoo.com wrote: Hi: I created a new data type, and then I wanted to create a GIN index for it, but when I created the index, the program would crash. The version of postgresql is 9.6. The following is part of the code, and I also refer to the code of intarray. I doubt anyone is going to investigate this unless you provide a more complete example - something like an extension where people can do "make install" without having to fill in various pieces of code. To investigate the crash, you need to attach a debugger to the backend and run the CREATE INDEX (or whatever triggers the crash). The debugger should catch the segfault and you'll be able to identify where exactly it crashes and why (and investigate). 1) first get PID of the backend SELECT pg_backend_pid(); 2) then attach a debugger to the backend gdb -p $PID (gdb) c 3) run the CREATE INDEX query 4) get backtrace from the debugger (gdb) bt regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: ERROR: unrecognized node type
On 9/30/21 8:38 AM, Amine Tengilimoglu wrote: ... Remarkable thing the related error occurs when executing sql statements containing where. The sqls that do not contain a where are not getting an error. Location information as below; ERROR: XX000: unrecognized node type: 223 *LOCATION: exprType, nodeFuncs.c:263* STATEMENT: SELECT n.nspname as "Schema", c.relname as "Name", CASE c.relkind WHEN 'r' THEN 'table' WHEN 'v' THEN 'view' WHEN 'm' THEN 'materialized view' WHEN 'i' THEN 'index' WHEN 'S' THEN 'sequence' WHEN 's' THEN 'special' WHEN 'f' THEN 'foreign table' WHEN 'p' THEN 'partitioned table' WHEN 'I' THEN 'partitioned index' END as "Type", pg_catalog.pg_get_userbyid(c.relowner) as "Owner" FROM pg_catalog.pg_class c LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace WHERE c.relkind IN ('r','p','v','m','S','f','') AND n.nspname <> 'pg_catalog' AND n.nspname <> 'information_schema' AND n.nspname !~ '^pg_toast' AND pg_catalog.pg_table_is_visible(c.oid) ORDER BY 1,2; I'm unable to reproduce the issue, so it probably depends on what tables are created etc. But if you say it only happens with WHERE clause, that's interesting. It suggests the failure probably happens somewhere in transformWhereClause, but we can only speculate why and the query conditions look entirely reasonable. I suggest you do this: 1) start a session, identify the PID of the backend select pg_backend_pid(); 2) attach a debugger (e.g. gdb) to the pid gdb -p $PID 3) set breakpoint to the location in the error message (gdb) break nodeFuncs.c:263 (gdb) continue 4) run the query, the breakpoint should be triggered 5) extract full backtrace (gdb) bt full 6) print the expression (gdb) p nodeToString(expr) That should give us some hints about what might be wrong ... regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: ERROR: unrecognized node type
On 9/29/21 8:37 PM, Tom Lane wrote: Adrian Klaver writes: Unless folks are using plain text only email readers. That is why screenshots are generally a bad idea for textual information. For the record the Postgres version is 12.8. Hmm, if it's v12 then NodeTag 223 is T_List, which is ... a bit surprising. That's a common enough node type that "oversight" doesn't seem to fly as an explanation. I don't think we'll be able to make progress on this without a lot more detail. A self-contained example that triggers it would be very useful; or maybe you could get a backtrace from the point of the error? https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend A handy trick for similar cases (not mentioned in the wiki) is to run SET log_error_verbosity = verbose; and then trigger the error. This logs the exact location (file:line) where the error is printed, so you can then set a breakpoint at that place. That makes it easier to generate the backtrace. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: How to hash a large amount of data within Postgres?
On 6/23/21 7:39 PM, Peter J. Holzer wrote: > On 2021-06-21 15:53:09 +0200, Thorsten Schöning wrote: >> Some years ago I implemented some SQL to read all files, build a table >> of SHA256 hashes and tell me how much data is redundant. The goal was >> to have a look at which files share the same hash with different LOIDs >> and optionally change that, so that all those files are only stored >> once on the end. >> >> While the approach was pretty naive, because it simply read all files >> into memory to calculate the hashes, I'm somewhat sure it worked in >> the past with Postgres 9.6. The executing server had enough free RAM >> available as well to process the at most ~4 GiB large files one after >> another. >> >> I tried that SQL today with Postgres 11 on UB 18.04 and it failed: >> >>> [Code: 0, SQL State: XX000] FEHLER: invalid memory alloc request size >>> 1898107949 >>> Wobei: PL/pgSQL-Funktion loid_hash_calc(oid,text)[...] > [...] >> I searched regarding that issue and only found two relevant results: >> Corrupted rows for some reason and simply size restrictions when >> allocating memory. The latter is more likely than the former in my >> case, as the restrictions seems to be 1 GiB and I do have larger >> files. > > 1 GB is the maximum size of quite a few data structures in PostgreSQL. I > don't use PL/pgSQL, but I wouldn't be surprised if it was the maximum > size of whatever loread() returns (a bytea?). I would be surprised if > this limit was higher in version 9.6 than it is in version 11, however. > Well, it's actually a bit worse than that - the maximum allocation size is (1GB - 1B), as it's defined like this: #define MaxAllocSize ((Size) 0x3fffffff) /* 1 gigabyte - 1 */ And this includes both the "user data" and a small "header" used for the bytea value. Depending on what format you use to output the values there may be additional limits (e.g. 'hex' requires 2 characters per byte, so doubling the amount of memory needed).
For large objects this is not an issue, because we store them in small chunks, not as one large bytea value. > >> I'm doing the following simply currently, because I didn't find any >> interfaces allowing to forward blocks of data, LOIDs, file descriptors >> or anything like that working with smaller buffers or alike. >> >>> fd := lo_open( loid, INV_READ); >>> size := lo_lseek(fd, 0, SEEK_END); >>> PERFORM lo_lseek(fd, 0, SEEK_SET); >> >>> hashBin := digest(loread(fd, size), algorithm); >>> hashHex := encode(hashBin, 'hex'); >> >> So, is there any way to work around the problem I have currently? > > Normally, hash libraries have a way to feed chunks of data into a hash > computation to avoid having to keep the whole thing in RAM. > The pgcrypto extension seems to be lacking such functionality, however. > > I would build something similar to a Merkle tree: > > Choose a convenient chunk size (a few MB is probably ok), read the large > object in chunks of this size, computing the hash for each. Concatenate > all the hashes and compute the hash of that. Add intermediate levels if > the concatenated hashes are still too large to fit in memory. > Not sure where you searched, but there definitely are interfaces to read chunks of data from large objects - see this: 1) lo_get (loid, offset, length) https://www.postgresql.org/docs/13/lo-funcs.html 2) lo_seek() + lo_read() https://www.postgresql.org/docs/13/lo-interfaces.html Obviously, you can't do "loread(fd, size)" because that's going to attempt building one large bytea, failing because of the alloc limit. You have to stream the data into the hash. Doing that in plpgsql is possible, although possibly somewhat slow. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
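A sketch of the streaming approach described above in PL/pgSQL (assumes the pgcrypto extension for digest(); since digest() has no incremental API, this hashes the concatenation of per-chunk hashes - the simple two-level scheme Peter suggests - rather than the raw stream, so the result differs from a plain SHA256 of the whole file):

```sql
-- CREATE EXTENSION pgcrypto;  -- provides digest(bytea, text)

CREATE OR REPLACE FUNCTION loid_hash_chunked(p_loid oid, p_algo text)
RETURNS bytea LANGUAGE plpgsql AS $$
DECLARE
    c_chunk  CONSTANT int := 8 * 1024 * 1024;  -- 8MB, far below the 1GB limit
    v_off    bigint := 0;
    v_chunk  bytea;
    v_hashes bytea := ''::bytea;
BEGIN
    LOOP
        v_chunk := lo_get(p_loid, v_off, c_chunk);  -- read one chunk
        EXIT WHEN length(v_chunk) = 0;              -- past the end of the LO
        v_hashes := v_hashes || digest(v_chunk, p_algo);
        v_off := v_off + length(v_chunk);
    END LOOP;
    RETURN digest(v_hashes, p_algo);                -- hash of per-chunk hashes
END;
$$;
```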
Re: Strange behavior of function date_trunc
On 5/5/21 3:23 PM, Pavel Luzanov wrote: Hello, It is very likely that the date_trunc function in the following example is executed for each line of the query. Although it marked as a STABLE and could only be called once. It could, but that's just an option - the database may do that, but it's not required to do it. In this case it might be beneficial, but it'd make the planner more complex etc. EXPLAIN (ANALYZE) SELECT * FROM generate_series('2021-01-01', '2021-06-01', '1 s'::interval) AS g(x) WHERE g.x >= date_trunc('day', '2021-05-05'::timestamptz); QUERY PLAN --- Function Scan on generate_series g (cost=0.00..15.00 rows=333 width=8) (actual time=2801.884..3263.328 rows=2332801 loops=1) Filter: (x >= date_trunc('day'::text, '2021-05-05 00:00:00+03'::timestamp with time zone)) Rows Removed by Filter: 10713600 Planning Time: 0.040 ms Execution Time: 3336.657 ms When replacing date_trunc with now, the query is much faster: EXPLAIN (ANALYZE) SELECT * FROM generate_series('2021-01-01', '2021-06-01', '1 s'::interval) AS g(x) WHERE g.x >= now(); QUERY PLAN --- Function Scan on generate_series g (cost=0.00..15.00 rows=333 width=8) (actual time=1648.777..1845.430 rows=2275325 loops=1) Filter: (x >= now()) Rows Removed by Filter: 10771076 Planning Time: 0.039 ms Execution Time: 1918.767 ms The variant with now works almost as fast as with the constant. This suggests me that perhaps date_trunc is being executed for every line of the query: EXPLAIN (ANALYZE) SELECT * FROM generate_series('2021-01-01', '2021-06-01', '1 s'::interval) AS g(x) WHERE g.x >= '2021-05-05'::timestamptz; QUERY PLAN --- Function Scan on generate_series g (cost=0.00..12.50 rows=333 width=8) (actual time=1628.743..1826.841 rows=2332801 loops=1) Filter: (x >= '2021-05-05 00:00:00+03'::timestamp with time zone) Rows Removed by Filter: 10713600 Planning Time: 0.033 ms Execution Time: 1901.680 ms In this regard, I have two questions: 1. 
How can I find out exactly how many times the date_trunc function has been executed? So far, these are just my assumptions. 2. If date_trunc is indeed called multiple times, why is this happening? Well, it's not like date_trunc is executed for each row while now() is executed only once. The functions are executed for each row in both cases, but now() is simply much cheaper - it just returns a value that is already calculated, while date_trunc has to parse and truncate the value, etc. You can use a CTE to execute it just once, I think: with x as (select date_trunc('day', '2021-04-01'::timestamptz) as x) select * from t where a > (select x from x); regards Tomas
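As for question 1 above, one crude way to observe the per-row calls is to wrap the function in a PL/pgSQL shim with the same STABLE marking that emits a NOTICE per invocation (a debugging trick, not an official feature; names are made up):

```sql
CREATE FUNCTION noisy_date_trunc(f text, t timestamptz)
RETURNS timestamptz LANGUAGE plpgsql STABLE AS $$
BEGIN
    RAISE NOTICE 'date_trunc called';   -- one notice per invocation
    RETURN date_trunc(f, t);
END;
$$;

-- A small series (25 rows) keeps the notice flood manageable:
SELECT count(*)
FROM generate_series('2021-01-01', '2021-01-02', '1 hour'::interval) g(x)
WHERE g.x >= noisy_date_trunc('day', '2021-01-01 12:00'::timestamptz);
```

Counting the notices shows how many times the shim - and hence, presumably, date_trunc in the original query - actually ran.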
Re: Extended stats - value not in MCV list
Hi, this topic would probably be a better fit for pgsql-hackers, as it's about the internal implementation, but let's continue the discussion here for now. On 5/2/21 6:52 AM, Pedro Luis Guzmán Hernández wrote: Hi there, I've just started using extended stats cause the planner was giving me terrible estimates for a certain table. MCV extended stats solved my problem when values are in the extended MCV list, but estimates are still terrible when they are not in the MCV list. Limited as my knowledge of database internals is, I dug into the source code and found an important difference on how these not-MCV values are handled in the single-column and multi-column cases. Hmm, so which PostgreSQL version are you testing this on? I'm not sure which code branch you're looking at. Can you share a reproducer, i.e. a simple example demonstrating the issue? That'd be very helpful. For single columns, the estimate is calculated as follows: selectivity = (1 - sum(MCV_frequencies)) / (distinct_values - count(MCV)) Which seems to assume a uniform distribution of non-MCV values and looks like a sensible guess, at least to my amateur eyes. For multi-column statistics it seems to me that the estimate is calculated instead as: selectivity = 1 - sum(MCV_frequencies) Which instead seems to assume that the value could potentially be present in all the rows not covered by the MCV. This seems like an adequate upper bound, but is also quite conservative compared to the single-column estimate. In my specific case this yields a selectivity even higher than some of the least frequent MCV values, which is a condition that is actually checked for and compensated in the single-column estimate as an additional check. I have MCV and distinct extended stats, so I know the distinct_values stats is available. It shouldn't behave like that, and some quick experiments suggest it does not (at least on master). I can't rule out a bug, of course. A reproducer would be helpful.
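A minimal self-contained reproducer along these lines might look like the following (synthetic, perfectly correlated data; whether it actually shows the reported behavior depends on the version):

```sql
CREATE TABLE corr_demo (a int, b int);
-- b is functionally dependent on a:
INSERT INTO corr_demo SELECT mod(i, 1000), mod(i, 1000)
FROM generate_series(1, 1000000) s(i);

CREATE STATISTICS corr_demo_stats (mcv, ndistinct, dependencies)
    ON a, b FROM corr_demo;
ANALYZE corr_demo;

EXPLAIN SELECT * FROM corr_demo WHERE a = 1 AND b = 1;  -- combination in the MCV list
EXPLAIN SELECT * FROM corr_demo WHERE a = 1 AND b = 2;  -- combination absent from it
```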
So I hope my question is clear from the above. How come the estimates are calculated with such different approaches? I insist I have no experience with database/query planner development, so excuse me if I am overlooking some obvious conceptual difference between single-column and multi-column stats. The single-column estimate is actually described in the documentation, but the multi-column estimate is not. If there is indeed a good reason for this difference, I think it should be documented. As for the ndistinct estimates and multi-column MCV lists, it's not all that simple - there may be conditions on only some of the columns, in which case we don't know how many groups we actually matched, etc. TBH I'm not sure how much of those implementation details we want to put into the user docs - it may be a bit too much, and we may need to change some of it. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: BRIN index on timestamptz
On 4/23/21 10:31 AM, Mohan Radhakrishnan wrote: Hi, I am planning to use as I search based on timestamptz fields. There are millions of records. I refer https://www.percona.com/blog/2019/07/16/brin-index-for-postgresql-dont-forget-the-benefits I execute this on the AWS RDS instance. Is there something in the plan I should pay attention to? I notice the Execution Time. Thanks, Mohan INSERT INTO testtab (id, date, level, msg) SELECT g, CURRENT_TIMESTAMP + ( g || 'minute' ) :: interval, random() * 6, md5(g::text) FROM generate_series(1,800) as g; Two things: a) You need to do ANALYZE, otherwise there are no statistics the optimizer could use (which is why the row estimates in the plans are entirely bogus). b) BRIN indexes don't work on random data, because the whole idea is about eliminating large blocks of data (e.g. 1MB). But with random data that's not going to happen, because each such range will match anything. Which is why seqscan is a bit faster than when using BRIN index. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
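The point about physical ordering can be demonstrated by contrasting a column correlated with heap order against a random one (a sketch; timings and estimates will vary):

```sql
CREATE TABLE brin_demo (seq timestamptz, rnd timestamptz);
INSERT INTO brin_demo
SELECT now() + (g || ' minute')::interval,                          -- follows heap order
       now() + (trunc(random() * 1000000) || ' minute')::interval   -- random
FROM generate_series(1, 1000000) g;

CREATE INDEX ON brin_demo USING brin (seq);
CREATE INDEX ON brin_demo USING brin (rnd);
ANALYZE brin_demo;  -- without this, the row estimates are bogus

-- The "seq" index can skip most block ranges; the "rnd" index cannot,
-- because every range spans nearly the whole value domain:
EXPLAIN ANALYZE SELECT count(*) FROM brin_demo
WHERE seq > now() + interval '900000 minutes';
EXPLAIN ANALYZE SELECT count(*) FROM brin_demo
WHERE rnd > now() + interval '900000 minutes';
```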
Re: Is there a way to dump schema to files on disk and keep them in sync
On 12/13/20 6:34 PM, Adrian Klaver wrote: > On 12/12/20 6:58 PM, Tim Uckun wrote: >> I want to dump my postgres schema to disk in neat directories like >> pgadmin presents. Then I want to be able to edit the files and sync >> changes to the database and ideally if changes were made in the >> database to sync them back to the disk. >> >> Is there a tool that does this? Is there a tool that will dump the >> schema into separate directories and files like pgadmin does? > > pgAdmin does not create directories, it just organizes the contents of > the system catalogs into GUI elements. > > For schema management I would suggest a tool like the one I use > Sqitch(https://sqitch.org/). It will organize the process of schema > creation and management. > Yeah, that was my thought too. Or maybe look at the other schema versioning tools available - we have a list on the wiki: https://wiki.postgresql.org/wiki/Change_management_tools_and_techniques I'm sure it's incomplete, but it's helpful nevertheless. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Potential BRIN Index Corruption
On 12/9/20 12:07 AM, Huan Ruan wrote: > HI Alvaro > > Unfortunately those records were written a while ago and we no longer keep > their WAL logs. Thanks for your help anyway. > Can you estimate when roughly the records were written? E.g. by using a rough estimate of WAL or XIDs generated per day, or something like that. Maybe there was some I/O issue in that time period and a couple writes got lost, or something like that. Are there any other corrupted indexes on the table? regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Extended statistics for correlated columns, row estimates when values are not in MCVs list
On 12/4/20 11:53 PM, Michael Lewis wrote: > On version 12.5, I have a query similar to the below where I am getting > rather unfortunate row estimates and a sub-optimal index choice as a result. > > SELECT > id > FROM > messages > WHERE > client_id = 1234 > and site_id = 889977 > and message_type_id in ( 59, 62, 102, 162 ) > and sent_on > NOW() - INTERVAL '3 days'; > > I created extended statistics (all types) on client_id and site_id, > analyzed the columns, and in pg_stats_ext.dependencies (4 = client_id, 7 = > site_id) have: > {"4 => 7": 0.002997, "7 => 4": 0.896230} > > When I check row estimates like below, I get results that indicate the > columns are expected to be independent still. > explain SELECT id FROM messages; --889594304 > explain SELECT id FROM messages where client_id = 1234; --133439 > explain SELECT id FROM messages where site_id = 889977; --28800 > explain SELECT id FROM messages where client_id = 1234 and site_id = > 889977; --4 > I was a bit confused about this initially, because why would it still be estimated as independent clauses, with the extended stats defined? But I think the explanation is fairly simple - the combination of values is simply rare enough not to be included in the MCV list, so it probably gets estimated using the "regular" logic as if independent. But we still mark the clauses as estimated, so the functional dependencies are not really considered. (I wonder if the code in PG14 would work better, but that's of little use of course.) > However, I pick a client & site ID pair which show up in the MCVs list, > then I get the same estimate when querying for that site_id with or without > including the client_id. That is great. > I think this is consistent with the above explanation - in this case the MCV actually kicks in, significantly improving the estimate. 
> Is it reasonable to expect that if the correlation between two columns is > rather high, then the optimizer might figure the columns are not > independent and perhaps would give less weight to the value derived from > independent column statistics? With table statistics, it is possible to set > a static value or ratio for something like ndistinct. Any chance for > something similar on dependency someday? > > Perhaps I am expecting too much or have a poor understanding of what > extended statistics can or someday might do. I deal with under estimates > from correlations between client_id and sites or other similar > dependent objects a fair bit and am hopeful to resolve some of those > planning problems with extended stats, without the maintenance overhead of > migrating everything to a single client per database to get more specific > statistics. > What you might try is defining the statistics with only the functional dependencies. That should consider the column-level correlation even when the combination of values is not in the MCV. It might make the "good" estimate worse, but that's an inherent trade-off. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
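The dependencies-only statistics suggested above might be defined like this (the statistics name is made up; table and columns are from the query in the thread):

```sql
-- Keep only functional dependencies, so the column-level correlation is
-- applied even for combinations that don't make it into an MCV list:
DROP STATISTICS IF EXISTS messages_stats;
CREATE STATISTICS messages_stats (dependencies)
    ON client_id, site_id FROM messages;
ANALYZE messages;

EXPLAIN SELECT id FROM messages WHERE client_id = 1234 AND site_id = 889977;
```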
Re: Check constraints do not seem to be working!!!
On 11/11/20 10:06 AM, Jitendra Loyal wrote: > Thanks Nikolay > > I read that but is there a way to meet the above requirement. And I will > like to add that IS NULL and IS NOT NULL should evaluate to true/false. > These operators are made for this and should not be returning NULL. > This has nothing to do with IS [NOT] NULL, it's the first part of the expression (b = TRUE) causing trouble. Essentially, the constraint (b = true) and (c is not null) is evaluated in two steps. First we evaluate the two parts individually, and for (null, true) the results would look like this: (b = true) => null (c is not null) => true and then we combine those results using 'AND' null AND true => null which is considered as if the constraint matches. If you want to handle NULL for the first expression, you may do this, for example: (b is not null and b = true) and (c is not null) Or something like that. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
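The three-valued logic described above can be seen directly in a psql session (demo table names are made up):

```sql
SELECT (NULL::boolean = true);                       -- NULL, not false
SELECT (NULL::boolean = true) AND (1 IS NOT NULL);   -- still NULL

-- A CHECK constraint treats NULL as "not violated", so this insert succeeds:
CREATE TABLE check_demo1 (b boolean, c int,
                          CHECK ((b = true) AND (c IS NOT NULL)));
INSERT INTO check_demo1 VALUES (NULL, 1);            -- accepted

-- Guarding the nullable clause rejects the row:
CREATE TABLE check_demo2 (b boolean, c int,
                          CHECK (b IS NOT NULL AND b = true AND c IS NOT NULL));
INSERT INTO check_demo2 VALUES (NULL, 1);            -- ERROR: check constraint violated
```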
Re: DB running out of memory issues after upgrade
On Tue, Feb 18, 2020 at 05:46:28PM +, Nagaraj Raj wrote: after upgrade Postgres to v9.6.11 from v9.6.9 DB running out of memory issues no workload has changed before and after upgrade. spec: RAM 16gb,4vCore Any bug reported like this or suggestions on how to fix this issue? I appreciate the response..!! This bug report (in fact, we don't know if it's a bug, but OK) is woefully incomplete :-( The server log is mostly useless, unfortunately - it just says a bunch of processes were killed (by OOM killer, most likely) so the server has to restart. It tells us nothing about why the backends consumed so much memory etc. What would help us is knowing how much memory was the backend (killed by OOM) consuming, which should be in dmesg. And then MemoryContextStats output - you need to connect to a backend consuming a lot of memory using gdb (before it gets killed) and do (gdb) p MemoryContextStats(TopMemoryContext) (gdb) q and show us the output printed into server log. If it's a backend running a query, it'd help knowing the execution plan. It would also help knowing the non-default configuration, i.e. stuff tweaked in postgresql.conf. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: 1 Status of vertical clustered index - 2 Join using (fk_constraint) suggestion - 3 Status of pgsql's parser autonomization
On Sun, Feb 16, 2020 at 10:38:29PM +0100, maxzor wrote: Hello, 1. I was told that M$ SQLServer provides huge performance deltas over PostgreSQL when dealing with index-unaligned queries : create index i on t (a,b, c); select * from t where b=... and c=...; Perhaps index-only scans might help here, but that generally does not work for "SELECT *" queries. Columnar storage has been tried by various companies, CitusData, EnterpriseDB, 2ndQuadrant, Fujitsu Enterprise Postgres. It has been discussed quite a lot, last thread that I was able to find being in 2017, https://www.postgresql.org/message-id/CAJrrPGfaC7WC9NK6PTTy6YN-NN%2BhCy8xOLAh2doYhVg5d6HsAA%40mail.gmail.com where Fujitsu's patch made it quite far. What is the status on such a storage manager extension interface ? I think you're looking for threads about zheap and (especially) zedstore. Those are two "storage manager" implementations various people are currently working on. Neither of those is likely to make it into pg13, though :-( 2. What do you think of adding a new syntax : 'from t join t2 using (fk_constraint)' ? And further graph algorithms to make automatic joins ? Both 'natural join' and 'using (column_name)' are useless when the columns are not the same in source and destination. Plus it is often the case that the fk_constraints are over numerous columns, even though this is usually advised against. But when this case happens there will be a significant writing speedup. I'm not really sure what's the point / benefit here. Initially it seemed you simply propose a syntax saying "do a join using the columns in the FK constraint" but it's unclear to me how this implies any writing speedup? I have been bothered by this to the point that I developed a graphical-query-builder plugin for pgModeler, https://github.com/maxzor/plugins/tree/master/graphicalquerybuilder#automatic-join-mode , but I believe such a syntax would be much better in the core! Hm, maybe. 3. 
What is the status of making the internal parser of PostgreSQL less coupled to the core, and easier to cherry-pick from outside? It would be great to incorporate it into companion projects : pgAdmin4, pgModeler, pgFormatter... I have no idea what you mean by "less coupled" here. What are the requirements / use cases you're thinking about? FWIW I think it's pretty bad idea to post questions about three very different topics into a single pgsql-hackers thread. That'll just lead to a lot of confusion. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Avoiding out of date statistics / planner
On Wed, Feb 12, 2020 at 10:23:22AM -0700, Michael Lewis wrote: It may also be worth noting that it is possible to make autovacuum/analyze more aggressive, perhaps only on the tables that see large changes in data that might result in a statistics issue. If you could share a query, explain analyze output, and pseudo code or at least description of what sort of bulk operations are being done, then more insight could be offered. Another thing you can do is deploy auto_explain, and log explain plan for long-runnning queries. That won't fix the root cause, but it will help you with confirming the root cause - you'll see the query plan, which should give you enough context. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
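A sketch of the auto_explain setup described above, with assumed thresholds (for production you would typically put auto_explain into shared_preload_libraries rather than LOAD it per session):

```sql
LOAD 'auto_explain';
SET auto_explain.log_min_duration = '500ms';  -- log plans of queries slower than this
SET auto_explain.log_analyze = on;            -- include actual row counts and timings
SET auto_explain.log_buffers = on;            -- include buffer usage
```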
Re: jsonb_set() strictness considered harmful to data
On Wed, Jan 08, 2020 at 05:24:05PM +1030, Andrew Dunstan wrote: On Wed, Jan 8, 2020 at 7:08 AM Pavel Stehule wrote: Hi po 6. 1. 2020 v 22:34 odesílatel Andrew Dunstan napsal: Updated version including docco and better error message. cheers andrew I think so my objections are solved. I have small objection + errdetail("exception raised due to \"null_value_treatment := 'raise_exception'\""), + errhint("to avoid, either change the null_value_treatment argument or ensure that an SQL NULL is not used"))); "null_value_treatment := 'raise_exception'\"" it use proprietary PostgreSQL syntax for named parameters. Better to use ANSI/SQL syntax "null_value_treatment => 'raise_exception'\"" It is fixed in attached patch source compilation without warnings, compilation docs without warnings check-world passed without any problems I'll mark this patch as ready for commiter Thank you for your work Thanks for the review. I propose to commit this shortly. Now that this was committed, I've updated the patch status accordingly. Thanks! -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
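The committed function is jsonb_set_lax(); a quick illustration of the null_value_treatment argument discussed above:

```sql
-- Options: 'raise_exception', 'use_json_null' (the default),
-- 'delete_key', 'return_target'.
SELECT jsonb_set_lax('{"a": 1}', '{a}', NULL, true, 'use_json_null');
-- {"a": null}
SELECT jsonb_set_lax('{"a": 1}', '{a}', NULL, true, 'delete_key');
-- {}
```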
Re: BigSQL pgc alternative
On Fri, Dec 20, 2019 at 12:15:06PM -0300, Samuel Teixeira Santos wrote: Hi all. BigSQL still allows installing Postgres and other resources as a portable install. But today it only offers its tool (pgc) for newer PostgreSQL versions. I would like a portable install because it's easier to configure and use under my own user on my development workstation. What do you recommend as an alternative to the pgc tool for installing PostgreSQL 10 and the respective PostGIS version as a portable option?

I'm not familiar with BigSQL, so I'm not sure what exactly you mean when you say "portable option". Can you explain?

regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: How to import Apache parquet files?
On Tue, Nov 05, 2019 at 04:21:45PM +0100, Softwarelimits wrote: Hi Imre, thanks for the quick response - yes, I found that, but I was not sure if it is already production ready - also I would like to use the data with the timescale extension, that is why I need a full import.

Well, we're not in a position to decide if parquet_fdw is production ready, that's something you need to ask the author of the extension (and then also judge yourself).

That being said, I think FDW is probably the best way to do this. It's explicitly designed to work with foreign data, so using it to access parquet files seems somewhat natural. The alternative is probably transforming the data into COPY format, and then loading it into Postgres using COPY (either as a file, or stdin).

Which of these options is the right one depends on your requirements. FDW is more convenient, but row-based and probably significantly less efficient than COPY. So if you have a lot of these parquet files, I'd probably use COPY. But maybe the ability to query the parquet files directly (with FDW) is useful for you.

regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
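For the FDW route, the setup would look roughly like this (a sketch based on the parquet_fdw README; the file path and column list are hypothetical and must match your parquet schema):

```sql
CREATE EXTENSION parquet_fdw;

CREATE SERVER parquet_srv FOREIGN DATA WRAPPER parquet_fdw;

-- column definitions must match the parquet file's schema
CREATE FOREIGN TABLE measurements (
    ts    timestamp,
    value double precision
) SERVER parquet_srv
  OPTIONS (filename '/data/measurements.parquet');

-- query the file directly, or bulk-load it into a regular table:
INSERT INTO measurements_local SELECT * FROM measurements;
```

The final INSERT ... SELECT gives you a "full import" even when going through the FDW, at the cost of the row-by-row overhead mentioned above.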
Re: Barman
On Thu, Oct 31, 2019 at 05:29:34PM +, Daulat Ram wrote: Hi All, We have configured postgres 11.2 in streaming replication (primary & standby) on docker and I am looking to initiate the Postgres backup using barman. As I know, there are a few options for taking backups using barman:

- RSYNC backup
- Incremental backups
- Streaming backup with continuous WAL streaming
- Centralized and catalogued backups

Which is the best option for backup using barman, so that we can keep the database safe in case of disaster? I feel the incremental backups are most useful to perform PITR, but I want to know the experts' suggestions.

You're mixing a number of topics here. Firstly, all backups done by barman are centralized and catalogued, that's pretty much one of the main purposes of barman.

When it comes to backup methods, there are two basic methods: rsync and postgres (which means pg_basebackup). This is about creating the initial base backup. Both methods can then replicate WAL by either streaming or archive_command. So first you need to decide whether to use rsync or pg_basebackup, where rsync allows advanced features like incremental backup, parallel backup and deduplication. Then you need to decide whether to use archive_command or streaming (i.e. pg_receivexlog).

The "right" backup method very much depends on the size of your database, activity, and so on. You should probably go with the default option, described as "scenario 1" in the barman docs, i.e. pg_basebackup (backup_method = postgres) and WAL streaming.

regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
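A minimal barman server config for that "scenario 1" setup might look like this (server name, hostnames and users are placeholders):

```ini
; /etc/barman.d/pg.conf - base backup via pg_basebackup, WAL via streaming
[pg]
description = "Primary server"
conninfo = host=pg-primary user=barman dbname=postgres
streaming_conninfo = host=pg-primary user=streaming_barman
backup_method = postgres        ; i.e. pg_basebackup
streaming_archiver = on         ; WAL streaming via pg_receivewal
slot_name = barman              ; replication slot to keep WAL around
```

The rsync method would instead set backup_method = rsync plus an ssh_command, which is what unlocks the incremental/deduplication features mentioned above.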
Re: jsonb_set() strictness considered harmful to data
On Mon, Oct 21, 2019 at 08:06:46AM -0700, Adrian Klaver wrote: On 10/20/19 11:07 PM, Tomas Vondra wrote: On Sun, Oct 20, 2019 at 06:51:05PM -0400, Andrew Dunstan wrote: True. And AFAIK catching exceptions is not really possible in some code, e.g. in stored procedures (because we can't do subtransactions, so no exception blocks).

Can you explain the above to me as I thought there are exception blocks in stored functions and now sub-transactions in stored procedures.

Sorry for the confusion - I've not been particularly careful when writing that response. Let me illustrate the issue with this example:

    CREATE TABLE t (a int);

    CREATE OR REPLACE PROCEDURE test()
    LANGUAGE plpgsql
    AS $$
    DECLARE
      msg TEXT;
    BEGIN
      -- SAVEPOINT s1;
      INSERT INTO t VALUES (1);
      -- COMMIT;
    EXCEPTION WHEN others THEN
      msg := SUBSTR(SQLERRM, 1, 100);
      RAISE NOTICE 'error: %', msg;
    END;
    $$;

    CALL test();

If you uncomment the SAVEPOINT, you get

    NOTICE: error: unsupported transaction command in PL/pgSQL

because savepoints are not allowed in stored procedures. Fine. If you uncomment the COMMIT, you get

    NOTICE: error: cannot commit while a subtransaction is active

which happens because the EXCEPTION block creates a subtransaction, and we can't commit when it's active. But we can commit outside the exception block:

    CREATE OR REPLACE PROCEDURE test()
    LANGUAGE plpgsql
    AS $$
    DECLARE
      msg TEXT;
    BEGIN
      BEGIN
        INSERT INTO t VALUES (1);
      EXCEPTION WHEN others THEN
        msg := SUBSTR(SQLERRM, 1, 100);
        RAISE NOTICE 'error: %', msg;
      END;
      COMMIT;
    END;
    $$;

regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: jsonb_set() strictness considered harmful to data
On Sun, Oct 20, 2019 at 06:51:05PM -0400, Andrew Dunstan wrote: On 10/20/19 4:18 PM, Tomas Vondra wrote: On Sun, Oct 20, 2019 at 03:48:05PM -0400, Andrew Dunstan wrote: On 10/20/19 1:14 PM, David G. Johnston wrote: On Sun, Oct 20, 2019 at 5:31 AM Andrew Dunstan mailto:andrew.duns...@2ndquadrant.com>> wrote: And yet another is to raise an exception, which is easy to write but really punts the issue back to the application programmer who will have to decide how to ensure they never pass in a NULL parameter.

That's kinda the point - if they never pass NULL they won't encounter any problems, but as soon as the data and their application don't see eye-to-eye, the application developer has to decide what they want to do about it. We are in no position to decide for them, and making it obvious they have a decision to make and implement here doesn't seem like an improper position to take.

The app dev can avoid this problem today by making sure they don't pass a NULL as the value. Or they can use a wrapper function which does that for them. So frankly this doesn't seem like much of an advance. And, as has been noted, it's not consistent with what either MySQL or MSSQL do. In general I'm not that keen on raising an exception for cases like this.

I think the general premise of this thread is that the application developer does not realize that may be necessary, because it's a bit surprising behavior, particularly when having more experience with other databases that behave differently. It's also pretty easy to not notice this issue for a long time, resulting in significant data loss. Let's say you're used to the MSSQL or MySQL behavior, you migrate your application to PostgreSQL or whatever - how do you find out about this behavior? Users are likely to visit https://www.postgresql.org/docs/12/functions-json.html but that says nothing about how jsonb_set works with NULL values :-(

We should certainly fix that. I accept some responsibility for the omission.
+1 You're right raising an exception may not be the "right behavior" for whatever definition of "right". But I kinda agree with David that it's somewhat reasonable when we don't know what the "universally correct" thing is (or when there's no such thing). IMHO that's better than silently discarding some of the data. I'm not arguing against the idea of improving the situation. But I am arguing against a minimal fix that will not provide much of value to a careful app developer. i.e. I want to do more to support app devs. Ideally they would not need to use wrapper functions. There will be plenty of situations where it is mighty inconvenient to catch an exception thrown by jsonb_set(). And catching exceptions can be expensive. You want to avoid that if possible in your performance-critical plpgsql code. True. And AFAIK catching exceptions is not really possible in some code, e.g. in stored procedures (because we can't do subtransactions, so no exception blocks). FWIW I think the JSON/JSONB part of our code base is amazing, and the fact that various other databases adopted something very similar over the last couple of years just confirms that. And if this is the only speck of dust in the API, I think that's pretty amazing. TY. When I first saw the SQL/JSON spec I thought I should send a request to the SQL standards committee for a royalty payment, since it looked so familiar ;-) ;-) I'm not sure how significant this issue actually is - it's true we got a couple of complaints over the years (judging by a quick search for jsonb_set and NULL in the archives), but I'm not sure that's enough to justify any changes in backbranches. I'd say no, but I have no idea how many people are affected by this but don't know about it ... No, no backpatching. As I said upthread, this isn't a bug, but it is arguably a POLA violation, which is why we should do something for release 13. 
WFM regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: CPU SPIKE
On Mon, Oct 21, 2019 at 10:45:33AM +0530, Pawan Sharma wrote: Hi All, Having a real high CPU issue (95-98%) with SELECT statements; the queries contain multiple AND operators - will that cause any CPU spike? The apps team is using sub-partitions; PG11, CPU: 24, Mem: 16GB.

Any query execution uses CPU, so yes - it can cause a spike in CPU usage. The question is whether the query plan is efficient, or whether it's using the CPU unnecessarily. You need to show us EXPLAIN ANALYZE and perhaps explain why you think it shouldn't use this much CPU.

regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
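That is, something along these lines is what we'd need to see (the table and predicates are placeholders standing in for the actual query; BUFFERS adds I/O detail that often helps):

```sql
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
  FROM my_partitioned_table
 WHERE col_a = 42
   AND col_b = 'x'
   AND col_c > now() - interval '1 day';
```

Comparing the estimated vs. actual row counts in the output is usually the quickest way to spot an inefficient plan.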
Re: jsonb_set() strictness considered harmful to data
On Sun, Oct 20, 2019 at 03:48:05PM -0400, Andrew Dunstan wrote: On 10/20/19 1:14 PM, David G. Johnston wrote: On Sun, Oct 20, 2019 at 5:31 AM Andrew Dunstan mailto:andrew.duns...@2ndquadrant.com>> wrote: And yet another is to raise an exception, which is easy to write but really punts the issue back to the application programmer who will have to decide how to ensure they never pass in a NULL parameter.

That's kinda the point - if they never pass NULL they won't encounter any problems, but as soon as the data and their application don't see eye-to-eye, the application developer has to decide what they want to do about it. We are in no position to decide for them, and making it obvious they have a decision to make and implement here doesn't seem like an improper position to take.

The app dev can avoid this problem today by making sure they don't pass a NULL as the value. Or they can use a wrapper function which does that for them. So frankly this doesn't seem like much of an advance. And, as has been noted, it's not consistent with what either MySQL or MSSQL do. In general I'm not that keen on raising an exception for cases like this.

I think the general premise of this thread is that the application developer does not realize that may be necessary, because it's a bit surprising behavior, particularly when having more experience with other databases that behave differently. It's also pretty easy to not notice this issue for a long time, resulting in significant data loss. Let's say you're used to the MSSQL or MySQL behavior, you migrate your application to PostgreSQL or whatever - how do you find out about this behavior? Users are likely to visit https://www.postgresql.org/docs/12/functions-json.html but that says nothing about how jsonb_set works with NULL values :-(

You're right that raising an exception may not be the "right behavior" for whatever definition of "right".
But I kinda agree with David that it's somewhat reasonable when we don't know what the "universally correct" thing is (or when there's no such thing). IMHO that's better than silently discarding some of the data. FWIW I think the JSON/JSONB part of our code base is amazing, and the fact that various other databases adopted something very similar over the last couple of years just confirms that. And if this is the only speck of dust in the API, I think that's pretty amazing. I'm not sure how significant this issue actually is - it's true we got a couple of complaints over the years (judging by a quick search for jsonb_set and NULL in the archives), but I'm not sure that's enough to justify any changes in backbranches. I'd say no, but I have no idea how many people are affected by this but don't know about it ... regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
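For anyone skimming the thread, the data-loss scenario under discussion is easy to reproduce - jsonb_set() is declared STRICT, so an SQL NULL new_value makes the whole result NULL (table and key names here are just for illustration):

```sql
CREATE TABLE docs (data jsonb);
INSERT INTO docs VALUES ('{"a": 1}');

-- jsonb_set() is STRICT: a NULL argument yields a NULL result,
-- silently replacing the entire document
UPDATE docs SET data = jsonb_set(data, '{b}', NULL);

SELECT data FROM docs;   -- returns SQL NULL; the original document is gone
```

Compare with jsonb_set(data, '{b}', 'null'::jsonb), which sets the key to a JSON null and keeps the rest of the document intact.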
Re: releasing space
On Sat, Oct 19, 2019 at 04:34:32PM -0700, Adrian Klaver wrote: On 10/19/19 4:17 PM, Julie Nishimura wrote: Thank you, Thomas. Do you know if it is safe to replicate 9.6.2 (smaller) db to 9.6.15 (larger capacity) using pg_basebackup? Would it be considered as an upgrade?

pg_basebackup backs up an entire Postgres cluster, which may contain many databases. So when you say db, do you mean a Postgres cluster or an individual database?

My understanding is Julie wants to create a copy of a 9.6.2 cluster using pg_basebackup and then run 9.6.15 on it. That's OK, it's essentially a minor version upgrade.

FWIW Julie, please don't top post - it just makes it harder to follow the discussion. Also, this seems like a completely separate question, unrelated to the DROP DATABASE one. It might be better to start a new thread instead of repurposing an existing one.

regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: jsonb_set() strictness considered harmful to data
On Sat, Oct 19, 2019 at 12:47:39PM -0400, Andrew Dunstan wrote: On 10/19/19 12:32 PM, David G. Johnston wrote: On Sat, Oct 19, 2019 at 9:19 AM Tomas Vondra mailto:tomas.von...@2ndquadrant.com>> wrote:

> We invented jsonb_set() (credit to Dmitry Dolgov). And we've had it since 9.5. That's five releases ago. So it's a bit late to be coming to us telling us it's not safe (according to your preconceptions of what it should be doing).

There have been numerous complaints and questions about this behavior in those five years; and none of the responses to those defenses has actually made the current behavior sound beneficial but rather have simply said "this is how it works, deal with it".

I haven't seen a patch, which for most possible solutions should be fairly simple to code. This is open source. Code speaks louder than complaints.

IMHO that might be a bit too harsh - I'm not surprised no one sent a patch when we're repeatedly telling people "you're holding it wrong". Without a clear consensus on what the "correct" behavior is, I wouldn't send a patch either.

> We could change it prospectively (i.e. from release 13 on) if we choose. But absent an actual bug (i.e. acting contrary to documented behaviour) we do not normally backpatch such changes, especially when there is a simple workaround for the perceived problem. And it's that policy that is in large measure responsible for Postgres' deserved reputation for stability.

Yeah. Agreed, this is v13 material if enough people come on board to support making a change. We have changed such things in the past. But maybe a new function might be a better way to go. I haven't given it enough thought yet. I think the #1 thing we should certainly do is explaining the behavior in the docs.

> And if we were to change it I'm not at all sure that we should do it the way that's suggested here, which strikes me as no more intuitive than the current behaviour. Rather I think we should possibly fill in a json null in the indicated place.

Not sure, but that seems rather confusing to me, because it's mixing SQL NULL and JSON null, i.e. it's not clear to me why [...] But I admit it's quite subjective.

Providing SQL NULL to this function and asking it to do something with that is indeed subjective - with no obvious reasonable default, and I agree that "return a NULL" while possibly consistent is probably the least useful behavior that could have been chosen. We should never have allowed an SQL NULL to be an acceptable argument in the first place, and can reasonably safely and effectively prevent it going forward. Then people will have to explicitly code what they want to do if their data and queries present this invalid unknown data to the function.

How exactly do we prevent a NULL being passed as an argument? The only thing we could do would be to raise an exception, I think. That seems like a fairly ugly thing to do, I'd need a heck of a lot of convincing.

I don't know, but if we don't know what the "right" behavior with NULL is, is raising an exception really that ugly?

regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: jsonb_set() strictness considered harmful to data
On Sat, Oct 19, 2019 at 11:21:26AM -0400, Stephen Frost wrote: Greetings, * Dmitry Dolgov (9erthali...@gmail.com) wrote: If we want to change it, the question is where to stop? Essentially we have: update table set data = some_func(data, some_args_with_null); where some_func happened to be jsonb_set, but could be any strict function.

I don't think it makes any sense to try and extrapolate this out to other strict functions. Functions should be strict when it makes sense for them to be - in this case, it sounds like it doesn't really make sense for jsonb_set to be strict, and that's where we stop it.

Yeah. I think the issue here is (partially) that other databases adopted similar functions after us, but decided to use a different behavior. It might be more natural for the users, but that does not mean we should change the other strict functions. Plus I'm not sure if the SQL standard says anything about strict functions (I found nothing, but I looked only very quickly), but I'm pretty sure we can't change how basic operators behave, and we translate them to function calls (e.g. 1+2 is int4pl(1,2)).

I wonder if in this case it makes sense to think about an alternative? For example, there is the generic type subscripting patch, that allows to update a jsonb in the following way: update table set jsonb_data[key] = 'value'; It doesn't look like a function, so it's not a big deal if it will handle NULL values differently. And at the same time one can argue that people who are not aware of this caveat with jsonb_set and NULL values will most likely use it due to a bit simpler syntax (more similar to some popular programming languages).

This seems like an entirely independent thing ...

Right. Useful, but entirely separate feature.

regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: jsonb_set() strictness considered harmful to data
On Sat, Oct 19, 2019 at 11:26:50AM -0400, Andrew Dunstan wrote: ... The hyperbole here is misplaced. There is a difference between a bug and a POLA violation. This might be the latter, but it isn't the former. So please tone it down a bit. It's not the function that's unsafe, but the ill-informed use of it. We invented jsonb_set() (credit to Dmitry Dolgov). And we've had it since 9.5. That's five releases ago. So it's a bit late to be coming to us telling us it's not safe (according to your preconceptions of what it should be doing). We could change it prospectively (i.e. from release 13 on) if we choose. But absent an actual bug (i.e. acting contrary to documented behaviour) we do not normally backpatch such changes, especially when there is a simple workaround for the perceived problem. And it's that policy that is in large measure responsible for Postgres' deserved reputation for stability.

Yeah.

Incidentally, why is your function written in plpgsql? Wouldn't a simple SQL wrapper be better?

    create or replace function safe_jsonb_set(
        target jsonb,
        path text[],
        new_value jsonb,
        create_missing boolean default true)
    returns jsonb as
    $func$
    select case
             when new_value is null then target
             else jsonb_set(target, path, new_value, create_missing)
           end
    $func$ language sql;

And if we were to change it I'm not at all sure that we should do it the way that's suggested here, which strikes me as no more intuitive than the current behaviour. Rather I think we should possibly fill in a json null in the indicated place.

Not sure, but that seems rather confusing to me, because it's mixing SQL NULL and JSON null, i.e. it's not clear to me why

    jsonb_set(..., "...", NULL)

should do the same thing as

    jsonb_set(..., "...", 'null'::jsonb)

I'm not entirely surprised it's what MySQL does ;-) but I'd say treating it as a deletion of the key (just like MSSQL) is somewhat more sensible. But I admit it's quite subjective.
regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Has there been any discussion of custom dictionaries being defined in the database?
On Thu, Oct 17, 2019 at 11:52:39AM +0200, Tom Lane wrote: Morris de Oryx writes: Given that Amazon is bragging this week about turning off Oracle, it seems like they could kick some resources towards contributing something to the Postgres project. With that in mind, is the idea of defining dictionaries within a table somehow meritless, or unexpectedly difficult?

Well, it'd just be totally different. I don't think anybody cares to provide two separate definitions of common dictionaries (which'd have to somehow be kept in sync). As for why we did it with external text files in the first place --- for at least some of the dictionary types, the point is that you can drop in data files that are available from upstream sources, without any modification. Getting the same info into a table would require some nonzero amount of data transformation.

IMHO being able to load dictionaries from a table would be quite useful, and not just because of RDS. For example, it's not entirely true we're just using the upstream dictionaries verbatim - it's quite common to add new words, particularly in specialized fields. That's way easier when you can do that through a table and not through a file.

Having said that ... in the end a dictionary is really just a set of functions implementing the dictionary API; where they get their data from is their business. So in theory you could roll your own dictionary that gets its data out of a table. But the dictionary API would be pretty hard to implement except in C, and I bet RDS doesn't let you install your own C functions either :-(

Not sure. Of course, if we expect the dictionary to work just like the ispell one, with preprocessing the dictionary into shmem, then that requires C. I don't think that's entirely necessary, though - we could use the table directly. Yes, that would be slower, but maybe it'd be sufficient.
But I think the idea is ultimately that we'd implement a new dict type in core, and people would just specify which table to load data from. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
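For context, this is what the file-based approach being discussed looks like today - the ispell template reads its data from files under $SHAREDIR/tsearch_data, which is exactly what's hard to touch on managed services like RDS (the file names below are illustrative):

```sql
-- reads en_us.dict, en_us.affix and english.stop from the
-- tsearch_data directory on the server's filesystem
CREATE TEXT SEARCH DICTIONARY my_ispell (
    TEMPLATE  = ispell,
    DictFile  = en_us,
    AffFile   = en_us,
    StopWords = english
);
```

A table-backed dictionary type would presumably replace those file-name options with a table name, which is the core of the proposal.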
Re: releasing space
On Thu, Oct 17, 2019 at 05:20:09PM +, Julie Nishimura wrote: Hello everybody, We are running a PostgreSQL 9.6.2 cluster, master -> standby (streaming replication), 22 tb of space (constantly struggling with the space, pruning the old data, but not fast enough). The biggest db takes 16 tb. So, we've copied it to another server, and now we would like to delete it from our original source, to free up the space. What would be the right approach for this? Just issue a drop database command (16tb)? How long might it take? Should we do it gradually (drop biggest tables first)? Any suggestions? Caveats?

Generally speaking, DROP DATABASE simply recursively drops all the various objects - indexes, tables, etc. It's mostly just deleting the files, which should not be very expensive (we certainly don't need to delete the data row by row or anything), but there's a certain amount of I/O involved. But it does depend on the OS / filesystem / hardware whether that's an issue. So if you want to be on the safe side, you can drop the objects one by one, with a bit of delay between them, to throttle the I/O a bit.

FWIW the latest minor release for 9.6 is 9.6.15, you're 13 minor versions (~30 months) of fixes behind. You might want to consider upgrading ...

-- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
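Concretely, "one by one with a bit of delay" can be as simple as a psql script run against the doomed database before the final DROP DATABASE (the table names below are hypothetical; each DROP runs in its own transaction, so the files are unlinked as you go):

```sql
DROP TABLE big_table_1;
SELECT pg_sleep(30);   -- throttle the I/O a bit

DROP TABLE big_table_2;
SELECT pg_sleep(30);

-- ... remaining large tables ...

-- finally, from a session connected to a *different* database:
DROP DATABASE mydb;
```

By the time DROP DATABASE runs, most of the 16 TB is already gone, so the remaining burst of file deletions is small.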
Re: Replication of Replication
On Fri, Oct 18, 2019 at 05:54:34PM -0400, Edilmar Alves wrote: I have 3 servers running CentOS+PG 11 (postgresql11-server-11.5-1PGDG.rhel7.x86_64): - s1: main db + publication - s2: subscription of the main db all works fine until here... Now, I tried to config this: - s2: publication of the same db that was replicated from s1 - s3: subscription of the db from s2 When I did this, replication s1=>s2 stopped, and replication s2=>s3 never worked. Is this not possible? You'll have to share more details - error messages from the server log, how you check that the replication stopped, etc. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
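For reference, the cascading setup being attempted would look roughly like this with logical replication (hostnames, database and object names are placeholders; note the publication on s2 is a new, separate publication covering the tables that arrive via the s1 subscription):

```sql
-- on s2: re-publish the replicated tables
CREATE PUBLICATION pub_s2 FOR ALL TABLES;

-- on s3: subscribe to s2's publication
CREATE SUBSCRIPTION sub_s3
    CONNECTION 'host=s2 dbname=mydb user=repl'
    PUBLICATION pub_s2;
```

If that's essentially what was done, then the server logs on s2 and s3 (e.g. logical replication worker errors, or the initial table sync getting stuck) are what's needed to diagnose why s1=>s2 stopped.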
Re: timescaleDB & WAL replication
On Tue, Oct 15, 2019 at 09:28:27AM +0200, basti wrote: Hello, I have a master / slave Postgres setup with WAL replication. Now I want to add timescaleDB. I found this howto: https://docs.timescale.com/latest/tutorials/replication Do I understand it correctly that I just need to add the timescaleDB extension on the master side?

AFAIK timescale is "just" a regular extension, in which case yes, all you need to do is install it on the master side.

regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
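i.e. on the master, something like the following (per the timescale docs, the library also needs to be preloaded; the binary itself must be installed on the standby too, so it can serve the extension's objects after the WAL is replayed):

```sql
-- postgresql.conf on the master (requires a restart):
--   shared_preload_libraries = 'timescaledb'

CREATE EXTENSION IF NOT EXISTS timescaledb;
```

Since WAL replication copies everything physically, the extension's catalog entries and data replicate to the standby automatically.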
Re: temporary files
On Mon, Oct 07, 2019 at 01:52:41PM -0700, dangal wrote: Hello everyone, I have a question to see if you can help me. I have set work_mem to 100 MB but I have the following in pgbadger under "Queries generating the most temporary files":

    Count  Total size  Min size   Max size   Avg size
    58     3.24 GiB    57.15 MiB  57.15 MiB  57.15 MiB

How can it be that queries requiring 57 MB spill to disk when work_mem is 100 MB? thanks a lot

The on-disk and in-memory representations are not the same, and the on-disk one is often significantly more efficient. Data that needs a 57MB temporary file may need ~150MB of memory to perform an in-memory sort, for example.

Yet another reason may be that e.g. a hash join splits the memory into batches, and each one has to fit into work_mem. And whenever we hit the limit, we double the number of batches, i.e. we cut the batch size in half. Thus the size is somewhere between 50 and 100MB, with ~75MB on average. Furthermore, we also estimate the batch size before running the query, so we may start with unnecessarily many batches, pushing the average batch size down even more.

FWIW it's somewhat suspicious you have 58 temp files with almost no variability in size. That seems as if a single query was executed repeatedly. Maybe try looking into what query that is.

regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
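To track down which statement produces those files, logging temp file usage is usually enough (postgresql.conf; the duration threshold is illustrative):

```
log_temp_files = 0                    # log every temp file created, with its size
log_min_duration_statement = 1000     # also log statements slower than 1s
```

With log_temp_files = 0, each temp file shows up in the server log together with the statement that created it, which pgbadger then attributes to a specific query.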
Re: pgcrypto question
On Mon, Oct 07, 2019 at 02:51:30PM -0400, Erik Aronesty wrote: Good idea for "psycopg". It would be easy for a POC, but I think the only meaningful layer to operate at would be a libpq drop-in replacement that intercepts PQgetvalue, PQprepare, PQexecParams, PQexecPrepared ... etc. That way odbc, python, node, etc. would "just work" as long as you used LD_PRELOAD appropriately.

It's not clear to me how that would know which columns are encrypted, with what key, etc. Because those encrypted columns are essentially just regular bytea columns, there's no easy way to distinguish them.

I'm no psycopg2 expert, but it does have some infrastructure for casting PostgreSQL types to Python types, and I guess that could be used for the encryption.

regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: pgcrypto question
On Mon, Oct 07, 2019 at 12:05:16PM -0400, Erik Aronesty wrote: Currently, it is my understanding that the pgcrypto library requires the user to send a password or private key up to the server for decryption.

Correct. In the naive case the key is included in each SQL query, which does have various issues. Bruce Momjian has a nice extension that allows you to fix that by loading the key into backend memory: http://momjian.us/download/pgcryptokey/

Is there a notion of a client-side encrypt/decrypt plugin when doing a postgres query? For example, a user could query postgres, get back data of type "encrypted", and a "libpq" plugin could decode/decrypt those columns that are of data type "encrypted" in a manner transparent to the user of the client. Obviously I could write this by intercepting the relevant libpq calls using LD_PRELOAD or Microsoft's "Detours" ... but is there a better way to do that?

AFAIK that's usually done at the application level, i.e. the application is sending/receiving encrypted data, and the database simply sees bytea columns. I'm not aware of a driver doing that transparently, but it seems like an interesting idea - I wonder if it could be done e.g. in psycopg as an extension, or something like that.

regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
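For reference, this is what the server-side pgcrypto usage under discussion looks like - note how the key travels to the server inside the SQL text (table and key are placeholders):

```sql
CREATE EXTENSION pgcrypto;

-- the key 'mysecretkey' is visible in the SQL, and potentially in
-- server logs and pg_stat_activity
INSERT INTO secrets (payload)
VALUES (pgp_sym_encrypt('sensitive data', 'mysecretkey'));

SELECT pgp_sym_decrypt(payload, 'mysecretkey') FROM secrets;
```

The client-side scheme being proposed would instead encrypt before the INSERT and decrypt after the SELECT, so the server only ever sees opaque bytea values.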
Re: PG 12 not yet for mac
On Mon, Oct 07, 2019 at 11:11:43AM -0400, Ravi Krishna wrote: https://postgresapp.com/downloads.html The link which says PG 12 is actually PG 11.

Not sure if the link is correct or not (it seems to point to Postgres-2.3-12.dmg, which seems like it might be v12). But more importantly, that's not a project/page managed by the PostgreSQL community, you need to report the issue to the authors (most likely through github issues).

regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Install postgres on rhel 7
On Mon, Oct 07, 2019 at 01:00:32AM -0400, Mageshwaran Janarthanam wrote: Hi Team, I have a question on the postgres install process. Tried to read the archives but couldn't find it. I wanted to install the postgres server via rpm and wanted to store the binaries under a non-default location like /pgbin.

The rpm packages (at least those available at yum.postgresql.org) are not relocatable:

    rpm -qpi postgresql12-12.0-1PGDG.rhel7.x86_64.rpm | grep Relocations
    Relocations : (not relocatable)

So unfortunately, you'll have to build your own packages, or something like that.

regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: COPY command returns "ERROR: invalid XML content"
On Sun, Oct 06, 2019 at 08:45:40PM -0700, Konstantin Izmailov wrote: Hi, I'm using libpq (v10) to import lots of xml files into a PG10 table. I noticed if the number of records imported exceeds 2100 then the following error is returned: ERROR: invalid XML content DETAIL: line 1: Couldn't find end of Start Tag timeBasedFileNamingAndTriggerin line 1 logFile.%d{-MM-dd}.%i.html

My guess is this is an issue/limitation in libxml2, which we use to parse and process XML. What libxml2 version do you have installed? Can you share an example of an XML document to reproduce the issue?

regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: integrate Postgres Users Authentication with our own LDAP Server
On Thu, May 09, 2019 at 07:11:24AM +, M Tarkeshwar Rao wrote: We want to set up ldap authentication in pg_hba.conf, for PostgreSQL users (other than the postgres super user). We are getting an issue with special characters when following the steps given in the postgres documentation. It is not accepting any special characters, and special characters are mandatory in our use case. Can you please help us, or do you have any steps by which we can configure postgres with LDAP? Please don't cross-post - this is a fairly generic question, it has nothing to do with performance or development, so the right thing is to send it to pgsql-general. Likewise, it makes little sense to send questions to the "owner". I've removed the other lists from CC. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: backend crash on DELETE, reproducible locally
On 11/6/18 10:54 PM, Andres Freund wrote: > On 2018-11-06 16:47:20 -0500, Tom Lane wrote: >> =?UTF-8?Q?Ond=c5=99ej_Bouda?= writes: >>>> Ondřej, as a short-term workaround you could prevent the crash >>>> by setting that index's recheck_on_update property to false. >> >>> Thanks for the tip. I am unsuccessful using it, though: >>> # ALTER INDEX public.schedulecard_overlap_idx SET (recheck_on_update = >>> FALSE); >>> ERROR: unrecognized parameter "recheck_on_update" >> >> Oh, for crying out loud. That's yet a different bug. >> I'm not sure that it's the fault of the recheck_on_update >> feature proper though; it might be a pre-existing bug in >> the reloptions code. Looks like somebody forgot to list >> RELOPT_KIND_GIST in RELOPT_KIND_INDEX, but is that the >> fault of commit c203d6cf8 or was it busted before? > > Looks new: > + RELOPT_KIND_INDEX = > RELOPT_KIND_BTREE|RELOPT_KIND_HASH|RELOPT_KIND_GIN|RELOPT_KIND_SPGIST, > > there aren't any other "for all indexes" type options, so the whole > category didn't exist before. > > It also strikes me as a really bad idea, even if RELOPT_KIND_GIST > wouldn't have been omitted: It breaks index am extensibility. > Does it? The RELOPT_KIND_* stuff is hard-coded in reloptions.h anyway, so I'm not sure how this particular thing makes it less extensible? That being said, we also have RELOPT_KIND_BRIN, and that seems to be missing from RELOPT_KIND_INDEX too (and AFAICS the optimization works for all index types). regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
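The effect of leaving an AM out of the "all indexes" category can be seen with a toy bitmask model. The flag values below are made up; only the set membership mirrors the committed definition quoted above:

```python
# Toy model of the reloption "kind" bitmask (flag values are invented;
# the point is which members RELOPT_KIND_INDEX includes).
RELOPT_KIND_BTREE  = 1 << 0
RELOPT_KIND_HASH   = 1 << 1
RELOPT_KIND_GIN    = 1 << 2
RELOPT_KIND_SPGIST = 1 << 3
RELOPT_KIND_GIST   = 1 << 4
RELOPT_KIND_BRIN   = 1 << 5

# As committed: GiST (and BRIN) are missing from the category.
RELOPT_KIND_INDEX = (RELOPT_KIND_BTREE | RELOPT_KIND_HASH |
                     RELOPT_KIND_GIN | RELOPT_KIND_SPGIST)

def option_applies(option_kinds, am_kind):
    """An option is recognized for an AM iff the AM's kind bit is set."""
    return bool(option_kinds & am_kind)

# An option registered under RELOPT_KIND_INDEX is found for btree...
assert option_applies(RELOPT_KIND_INDEX, RELOPT_KIND_BTREE)
# ...but the lookup fails for GiST -> "unrecognized parameter", and
# BRIN is skipped the same way.
assert not option_applies(RELOPT_KIND_INDEX, RELOPT_KIND_GIST)
assert not option_applies(RELOPT_KIND_INDEX, RELOPT_KIND_BRIN)
```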
Re: Why the index is not used ?
Hi, On 10/08/2018 04:10 PM, ROS Didier wrote: > Hi Tomas > > Thank you for your answer and recommendation, which is very > interesting. I'm going to study the PCI DSS document right now. > > Here are my answers to your question: > >> What is your threat model? > > we want to prevent access to sensitive data for everyone except those > who have the encryption key. > in case of files theft, backups theft, dumps theft, we do not want > anyone to access sensitive data. > The thing is - encryption is not a panacea. The interesting question is whether this improves security compared to simply using FDE and regular access rights (which are grantable at the column level). Using those two pieces properly may very well be a better defense than a not-well-designed encryption scheme - and based on this discussion, it does not seem very polished / resilient. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Why the index is not used ?
Hi, On 10/07/2018 08:32 PM, ROS Didier wrote: > Hi Francisco > > Thank you for your remark. > You're right, but it's the only procedure I found to make search on > encrypted fields with good response times (using index) ! > Unfortunately, that kinda invalidates the whole purpose of in-database encryption - you'll have encrypted on-disk data in one place, and then plaintext right next to it. If you're dealing with credit card numbers, then you presumably care about PCI DSS, and this is likely a direct violation of that. > Regarding access to the file system, our servers are in protected network areas. few people can connect to it. > Then why do you need encryption at all? If you assume access to the filesystem / storage is protected, why do you bother with encryption? What is your threat model? > it's not the best solution, but we have data encryption needs and > good performance needs too. I do not know how to do it except the > specified procedure.. > > if anyone has any proposals to put this in place, I'm interested. > One thing you could do is hashing the value and then searching by the hash. So aside from having the encrypted column you'll also have a short hash, and you may use it in the query *together* with the original condition. It does not need to be unique (in fact it should not be to make it impossible to reverse the hash), but it needs to have enough distinct values to make the index efficient. Say, 10k values should be enough, because that means 0.01% selectivity. 
So the function might look like this, for example: CREATE FUNCTION cchash(text) RETURNS int AS $$ SELECT abs(hashtext($1)) % 10000; $$ LANGUAGE sql; and then be used like this: CREATE INDEX idx_cartedecredit_cc02 ON cartedecredit(cchash(cc)); and in the query SELECT pgp_sym_decrypt(cc, 'motdepasse') FROM cartedecredit WHERE pgp_sym_decrypt(cc, 'motdepasse')='test value 32' AND cchash(cc) = cchash('test value 32'); Obviously, this does not really solve the issues with having to pass the password to the query, making it visible in pg_stat_activity, various logs etc. Which is why people generally use FDE for the whole disk, which is transparent and provides the same level of protection. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
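The bucketing idea above can be sketched outside the database as well. This is a simplified stand-in (a generic hash in place of hashtext, hypothetical card-number strings), showing how an equality match on the bucket narrows the candidate set before the expensive decrypt-and-compare step:

```python
import hashlib

def cchash(value: str, buckets: int = 10000) -> int:
    """Stand-in for abs(hashtext($1)) % 10000 - any stable hash works."""
    h = int.from_bytes(hashlib.sha256(value.encode()).digest()[:4], "big")
    return h % buckets

# 100k hypothetical card numbers; an index on cchash(cc) lets the planner
# visit only the rows in one bucket before the pgp_sym_decrypt() recheck.
rows = [f"4000{i:012d}" for i in range(100_000)]
target = rows[1234]

bucket = cchash(target)
candidates = [r for r in rows if cchash(r) == bucket]  # what the index narrows to

assert target in candidates
# the bucket holds roughly rows/buckets entries - a tiny fraction of the table
assert len(candidates) < len(rows) * 0.01
```

Because many values share each bucket, the hash alone cannot be reversed to a single plaintext, which is exactly why the original equality condition must stay in the query.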
Re: Help to understand Actual Rows vs Plan Rows from the query planner output
> + > Plan Width: 0 > + > Actual Startup Time: 0.048 > + > Actual Total Time: 0.048 > + > Actual Rows: 1 > + > Actual Loops: 1 > + > Index Cond: "(lower((city)::text) = 'melanyfort'::text)" > + > - Node Type: "Bitmap Index Scan" > + > Parent Relationship: "Member" > + > Parallel Aware: false > + > Index Name: "lower_state_users_idx" > + > Startup Cost: 0.00 > + > Total Cost: 5.87 > + > Plan Rows: 211 > + > Plan Width: 0 > + > Actual Startup Time: 0.102 > + > Actual Total Time: 0.102 > + > Actual Rows: 211 > + > Actual Loops: 1 > + > Index Cond: "(lower((state)::text) = 'ohio'::text)" > + > Planning Time: 0.260 > + > Triggers: > + > Execution Time: 0.249 > (1 row) > > aruprakshit=# > > > -- > > In the first node type of "Bitmap Index Scan” on “users_lower_idx”, I > see the plan rows are 50, but actual rows it got 1. In the second node > type of "Bitmap Index Scan” on “ lower_state_users_idx”, I see the plan > rows are 211, and actual rows 211. Both are same. Based on what > conditions planner estimated the planned and actual rows count? > > > In node type “BitmapAnd”, I see again the actual rows 1, then why on the > final plan i.e. Node Type: "Bitmap Heap Scan” again planner estimated > rows 1? How does it counts these? What does the Loops count says us > about the query? > It's not very clear to me whether you're asking how the planner computes estimates in general, or how it computed these particular estimates (or what issues you see there). Perhaps this would give you at least some answers: https://www.postgresql.org/docs/11/static/row-estimation-examples.html regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
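As a rough illustration of where a "Plan Rows" figure like the 50 above can come from: for an equality condition with no matching most-common-value entry, the planner falls back to selectivity = 1/n_distinct (details in the linked chapter). The numbers below are hypothetical, chosen only to reproduce that estimate:

```python
# Hypothetical statistics for the table behind users_lower_idx.
reltuples = 10_000   # planner's row count for the table
n_distinct = 200     # distinct values of lower(city) per the statistics

# Fallback estimate for an equality condition on a non-MCV value.
selectivity = 1.0 / n_distinct
plan_rows = round(reltuples * selectivity)

assert plan_rows == 50  # the plan's estimate; the actual row count was 1
```

The estimate is an average over all values: a value rarer than average (here, one matching a single row) will show "Actual Rows" far below "Plan Rows" even with perfectly fresh statistics.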
Re: Code of Conduct
On 09/18/2018 01:47 PM, James Keener wrote: > following a long consultation process It's not a consultation if any dissenting voice is simply ignored. Don't sugar-coat or politicize it like this -- it was rammed down everyone's throats. That is core's right, but don't act as if everyone's opinions and concerns were taken into consideration. I respectfully disagree. I'm not sure which dissenting voices you think were ignored, but from what I've observed in the various CoC threads the core team took the time to respond to all comments. That does not necessarily mean the resulting CoC makes everyone happy, but unfortunately that's not quite possible. And it does not mean it was not an honest consultation. IMO the core team did a good job in listening to comments, tweaking the wording and/or explaining the reasoning. Kudos to them. There are a good number of folks who are concerned that this CoC is overreaching and is ripe for abuse. Those concerns were always simply, plainly, and purposely ignored. No, they were not. There were multiple long discussions about exactly these dangers. You may dislike the outcome, but it was not ignored. > Please take time to read and understand the CoC, which is intended to ensure that PostgreSQL remains an open and enjoyable project for anyone to join and participate in. I sincerely hope so, and that it doesn't become a tool to enforce social ideology like in other groups I've been part of. Especially since this is the main place to come to get help for PostgreSQL and not a social club. Ultimately, it's a matter of trust that the CoC committee and core team apply the CoC in a careful and cautious way. Based on my personal experience with most of the people involved in both groups I'm not worried about this part. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Fwd: is there any adverse effect on DB if I set autovacuum scale factor to zero?
On 08/13/2018 04:24 PM, Ashu Pachauri wrote: + pgsql-general Thanks and Regards, Ashu Pachauri -- Forwarded message - From: *Ashu Pachauri* <ashu210...@gmail.com> Date: Mon, Aug 13, 2018 at 7:53 PM Subject: Re: is there any adverse effect on DB if I set autovacuum scale factor to zero? To: <raghavendra...@gmail.com> The way I see *autovacuum_vacuum_scale_factor* is not in terms of an absolute number but as the percentage of any table that can consist of updated / deleted tuples to make it eligible for vacuuming. A factor of 0.1 ensures that your tables would be eligible for vacuuming if more than 10% of the tuples are deleted/updated. 1. If you think that 10% is too high for you in terms of storage cost, you can decrease the number or set it to zero. But, I would advise to increase the value of *autovacuum_vacuum_threshold* to something reasonable if you do that, otherwise you pay the CPU cost of frequent vacuuming across all tables. 2. However, if your issue is not the fixed 10% overhead but the lack of throughput i.e. you see the number of deleted/updated tuples keeps increasing in an unbounded fashion, the right way to deal with it is a) having a higher value of *autovacuum_max_workers* b) a lower value for *autovacuum_naptime*. Increasing autovacuum_max_workers is unlikely to solve the issue with throughput, because all the workers are throttled together - there's a limit on the amount of work that can be done per second. Increasing the number of workers is akin to allowing more cars on a highway, but also lowering the speed limit. You need to increase the limit on the amount of work, and lowering naptime is one way to do that. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
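The cars-on-a-highway point can be put in numbers with a deliberately simplified model of cost-based vacuum throttling (a shared cost budget split across active workers; this is a sketch of the argument, not the actual accounting code):

```python
import math

def worker_rate(cost_limit, cost_delay_ms, workers):
    """Simplified model: the pool may spend cost_limit units per
    cost_delay cycle, and that budget is balanced across all active
    workers, so each worker gets a slice."""
    total_per_second = cost_limit * (1000.0 / cost_delay_ms)
    return total_per_second / workers

# Doubling the workers halves each worker's speed; total is unchanged.
assert math.isclose(worker_rate(200, 20, 3) * 3,
                    worker_rate(200, 20, 6) * 6)

# Raising the cost limit is what actually raises throughput.
assert math.isclose(worker_rate(400, 20, 3),
                    2 * worker_rate(200, 20, 3))
```

More workers do help with *coverage* (more tables visited concurrently), but not with the aggregate amount of cleanup per second.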
Re: is there any adverse effect on DB if I set autovacuum scale factor to zero?
On 08/13/2018 03:41 PM, Raghavendra Rao J S V wrote: Hi Tomas, Thank you very much for your response. As we know, a table becomes a candidate for the autovacuum process based on the formula below. *Autovacuum VACUUM threshold for a table = autovacuum_vacuum_scale_factor * number of tuples + autovacuum_vacuum_threshold* *Current settings in my database are as follows.* autovacuum_vacuum_scale_factor = 0.1 autovacuum_vacuum_threshold = 40 Due to the above formula the dead tuples are accumulating based on the number of live tuples, as shown in the picture below. select relname,n_live_tup,n_dead_tup,(n_live_tup*.1+40) expected_to_autovacuum,* from pg_stat_user_tables where n_dead_tup>800 order by n_live_tup desc limit 100; In order to avoid the dead tuple accumulation I would like to change the auto vacuum settings in *"postgresql.conf"* as below. autovacuum_vacuum_scale_factor = 0.01 autovacuum_vacuum_threshold = 100 OK, so the tables apparently have enough dead tuples to trigger vacuum. That means the autovacuum throughput is insufficient to do all the cleanup. If you lower the scale factor, the amount of cleanup will *increase* (more tables being eligible for cleanup) making it less likely autovacuum can keep up. You need to increase the throughput, by increasing vacuum_cost_limit or something like that. *Kindly guide me your views. Does it cause any adverse effect on the DB?* Well, it forces the database to do more stuff / more often, so it may have an adverse impact, of course. It's hard to say if it's going to be a win overall, because we don't know how serious the bloat is. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
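The threshold formula quoted above, in executable form, with the current and proposed settings side by side (the 1M-row table size is just an example):

```python
def autovacuum_threshold(n_live_tup, scale_factor, base_threshold):
    """Dead tuples needed before autovacuum considers the table
    (autovacuum_vacuum_scale_factor * tuples + autovacuum_vacuum_threshold)."""
    return scale_factor * n_live_tup + base_threshold

# current settings: a 1M-row table is picked up after ~100k dead tuples
assert round(autovacuum_threshold(1_000_000, 0.1, 40)) == 100_040

# proposed settings: the same table becomes eligible after ~10k dead
# tuples - so many more tables qualify at any given moment, which means
# MORE total work for the throttled workers, not less
assert round(autovacuum_threshold(1_000_000, 0.01, 100)) == 10_100
```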
Re: is there any adverse effect on DB if I set autovacuum scale factor to zero?
On 08/13/2018 11:07 AM, Raghavendra Rao J S V wrote: Hi All, We are using postgres *9.2* on the *CentOS* operating system. We have around *1300+* tables. We have the following autovacuum settings enabled. Still, a few of the tables (84 tables) which are always busy are not vacuumed. Dead tuples in those tables number more than 5000. Due to that, tables are bloating and we observed performance degradation in a few areas. You don't say how large the tables are, so it's impossible to say whether 5000 dead tuples is excessive or not. IMHO it's a negligible amount and should not lead to excessive bloat or issues. A certain amount of wasted space is expected - it's a trade-off between immediate and delayed cleanup. If you delay the cleanup a bit, it's going to be more efficient overall. It's also unclear why the tables are not vacuumed - it may easily be due to all the autovacuum workers being constantly busy, unable to clean up all tables in a timely manner. In that case lowering the threshold is not going to help, on the contrary. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Shared buffers increased but cache hit ratio is still 85%
On 07/18/2018 10:43 AM, Andreas Kretschmer wrote: > > > Am 18.07.2018 um 10:26 schrieb Hans Schou: >> Am I doing something wrong or should some history be cleared? > > Reset the stats for that database. You can check the date of last reset > with: > > select stats_reset from pg_stat_database where datname = 'database_name'; > > and reset it with: > > pg_stat_reset() > > Reset all statistics counters for the current database to zero > (requires superuser privileges by default, but EXECUTE for this function > can be granted to others.) > It might be better to note current values of the counters somewhere, and compute a delta later (and use that to compute the cache hit ratio). The issue is that pg_stat_reset() throws away all sorts of interesting and important stats, including those driving autovacuum/autoanalyze. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
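The snapshot-and-delta approach suggested above might look like this. The column names are the real pg_stat_database counters; the snapshot values are hypothetical, standing in for two reads taken some time apart:

```python
def hit_ratio(blks_hit, blks_read):
    """Buffer cache hit ratio from pg_stat_database counters."""
    total = blks_hit + blks_read
    return blks_hit / total if total else 0.0

# Two snapshots of the counters (hypothetical values). The delta
# isolates recent behaviour from years of accumulated history,
# without throwing anything away via pg_stat_reset().
before = {"blks_hit": 9_000_000, "blks_read": 1_500_000}
after  = {"blks_hit": 9_900_000, "blks_read": 1_510_000}

lifetime = hit_ratio(after["blks_hit"], after["blks_read"])
recent = hit_ratio(after["blks_hit"] - before["blks_hit"],
                   after["blks_read"] - before["blks_read"])

# The lifetime number (~87%) hides that the cache is currently doing
# much better (~99%) - which is why the delta is the number to watch.
assert recent > lifetime
```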
Re: ERROR: found multixact from before relminmxid
On 04/09/2018 01:49 PM, Alexandre Arruda wrote: > > > 2018-04-06 13:11 GMT-03:00 Tomas Vondra <tomas.von...@2ndquadrant.com > <mailto:tomas.von...@2ndquadrant.com>>: > > > > On 04/06/2018 04:29 PM, Alexandre Arruda wrote: > > 2018-04-06 9:39 GMT-03:00 Tomas Vondra <tomas.von...@2ndquadrant.com > <mailto:tomas.von...@2ndquadrant.com> > > <mailto:tomas.von...@2ndquadrant.com > <mailto:tomas.von...@2ndquadrant.com>>>: > > > > > > > > On 04/06/2018 02:09 AM, Alexandre Arruda wrote: > > > Hi, > > > > > > Some time ago, I had this errors frequently showed in logs > after some > > > autovacuum in some tables(pg 9.6). VACUUM FULL or CLUSTER in > this tables > > > show the same and not complete the tasks (showed by some > table bloat > > > select). > > > Then, I did a full dump/restore into a new version (10.2) > and everything > > > is ok for a couple of months. Now, I have errors like this > again: > > > > > > db1=# cluster pc44t; > > > > > > ERROR: found multixact 134100944 from before relminmxid > 192042633 > > > > > > > > > > > > Like before, the only way to make the errors to desapear is by > > > dump/reload the whole table. > > > > > > > > > Thanks for any help. > > > > > > > That's going to be hard, unless you still have the cluster around. > > > > This surely seems like some sort of data corruption issue, but > without > > being able to inspect the data pages it's nearly impossible to > determine > > what went wrong. > > > > We'd also need more information about what happened to the > hardware and > > cluster before the issues started to appear - crashes, > hardware issues. > > And so on. > > > > regards > > > > -- > > Tomas Vondra http://www.2ndQuadrant.com > > PostgreSQL Development, 24x7 Support, Remote DBA, Training & > Services > > > > > > Hi Tomas, > > The old cluster are gone, unfortunatly. 
> > > > This server has 128GB ECC RAM with a dual redundant HBA fiber channel > > connected to a storage with RAID 6, and I don't have (apparently) any > > errors reported. > > Yesterday I did a test with one table: some sum aggregates, count(*), > > etc, then dump/reload and repeat the tests; the results (of queries) are > > the same, and regarding the vacuum problem > > that's disappeared. > > > > I'm not sure I understand correctly. So you can reproduce the issue? If > yes, how can you share the scripts/data you use (and how large is it)? > If we could reproduce it locally, it would make the investigation much > easier. > > BTW you mentioned you're using PostgreSQL 9.6 - which minor version, > specifically? > > > regards > > > Hi Tomas, > > No, I can't reproduce. What I did is a simple way to "validate" the > current table data to see if a dump/reload > preserves them. Old postgresql was 9.6.5. The problem returns now in the new > 10.3 installation. > I'm confused. First you say you can't reproduce the issue, then you say it got back on a new installation. So which is it? > Is there a way to correct these tables without a dump/reload? > > I'm thinking to reinstall the cluster doing an initdb --data-checksums, > but I'm afraid of a severe performance impact. > The performance impact should be non-measurable for most workloads. It might be a couple of percent on certain workloads/hardware, but that's probably not your case. The bigger question is whether this can actually detect the issue. If it's due to a storage issue, then perhaps yes. But if you only see multixact issues consistently and nothing else, it might easily be a PostgreSQL bug (in which case the checksum will be correct). regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Seems like there is an issue with reltuples showing twice the number of rows
On 04/03/2018 11:14 AM, Ranjith Ramachandra wrote: > I am relying on reltuples on my web app to get fast row counts. > > This was recommended by this article to get fast approx row > counts: https://wiki.postgresql.org/wiki/Count_estimate > > > However for some table I am getting twice as many values when I try to > do this. I did some more research and came up with this query. > > select reltuples, n_live_tup, n_dead_tup > from pg_stat_user_tables join pg_class using (relname) > where relname = > 'main_csv_0f247511f5c247c8813ba3cec90c6ae1_yf34fbb38d'; > > > it returns > > > reltuples | n_live_tup | n_dead_tup > ------------+------------+------------ > 2.7209e+06 | 1360448 | 1360448 > > > If I run analyze main_csv_0f247511f5c247c8813ba3cec90c6ae1_yf34fbb38d > > and I run the same query again, > > reltuples | n_live_tup | n_dead_tup > -------------+------------+------------ > 1.36045e+06 | 1360448 | 1360448 > > But after some time the value goes back to being double the value. This > is causing me a lot of problems since this inaccuracy does not make any > sense to me. > There was a difference between VACUUM and ANALYZE in handling recently dead rows (essentially deleted rows that can't be removed yet), causing similar changes to reltuples. Essentially if you do VACUUM and ANALYZE, it may set reltuples to rather different estimates. That is fixed now and should be in the next minor release. Hard to say if this is a case of that, but judging by the number of dead tuples chances are it is. It's probably better to use n_live_tup instead, though. I'd say that's closer to the "live tuples" definition. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
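The observed doubling is consistent with one code path counting only live tuples and the other also counting the (equally numerous) recently dead ones. Whether that is exactly what happened here depends on the fix mentioned above; the arithmetic below just checks that the reported numbers line up with that explanation:

```python
# Counter values from the reported query output.
n_live_tup = 1_360_448
n_dead_tup = 1_360_448

# ANALYZE-style estimate: live tuples only.
analyze_estimate = n_live_tup
# An estimate that also counts the recently dead tuples.
with_dead_estimate = n_live_tup + n_dead_tup

# The doubling the poster saw...
assert with_dead_estimate == 2 * analyze_estimate
# ...and it matches the reltuples value shown (2.7209e+06) to rounding.
assert abs(with_dead_estimate - 2.7209e6) / 2.7209e6 < 1e-3
```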
Re: Postgresql 10.3 , query never completes if LIMIT clause is specified and parallel processing is on
Hi, the perf profiles look really weird - clearly, there's a lot of lock contention, because the top item is this 13.49% 13.40% postmaster postgres [.] LWLockAcquire | ---LWLockAcquire That's a sign of lock contention - not sure which one, though. And then there's heap_hot_search_buffer 12.65% 7.92% postmaster postgres [.] heap_hot_search_buffer | ---heap_hot_search_buffer So either there's a table with many HOT-updated rows (repeatedly), or we search the HOT chains very frequently for some reason. Considering it only affects the non-LIMIT case, I'd guess the latter. There also seem to be quite a few page faults, for some reason. Trying to reproduce this without the actual data is rather futile. We need some sort of reproducer (sample data to test the query on). regards On 03/28/2018 10:30 AM, Alessandro Aste wrote: > Hello, any news ? > > Thank you, > > Alessandro. > > On Fri, Mar 23, 2018 at 8:22 PM, Alessandro Aste > <alessandro.a...@gmail.com> wrote: > > PS , in the meanwhile I discovered a 2nd workaround (beside disabling > parallel processing). I added OFFSET 0 to the subquery, and, > according to the documentation, “OFFSET 0 is the same as omitting > the OFFSET clause” > - https://www.postgresql.org/docs/current/static/queries-limit.html > > cmd3dev=# show max_parallel_workers_per_gather ; > > max_parallel_workers_per_gather > > --------------------------------- > > 8 > > (1 row) > > > > cmd3dev=# \timing > > Timing is on.
> > cmd3dev=# SELECT * FROM (SELECT seg.circuit_id AS id, > vendor_gtt_pop.vendor_id, CASE WHEN vendor_gtt_pop.push_to_gaa = 1 > THEN 'Y' END as gaa, pop.gii_circuitid AS pop_name, cst.label AS > pop_status, seg.a_company_name AS pop_location, seg.vendor_id AS > pop_provider_id, seg.vendor_name AS pop_provider, cs.address1 AS > pop_street, cs.city AS pop_city, cs.postal_code AS pop_postal_code, > cs.state AS pop_state, csc.code AS pop_country, cs.country_id AS > pop_country_id FROM ( SELECT c.gii_circuitid, max(so.id) AS service_order_id FROM service_order so join > circuit c on c.product_id=so.product_id join master_service_order > mso on mso.id=so.master_service_order_id WHERE NOT > (so.ordertype_id = 2 AND so.status <> 999) AND NOT (so.ordertype_id > = 3 AND so.status <> 999) AND c.status >= 20 AND c.status not in > (160,999) AND mso.client_id=11615 AND c.service_description=28 AND > c.status!=160 GROUP BY c.gii_circuitid ) pop JOIN service_order so > ON so.id = pop.service_order_id left JOIN client_site > cs on cs.id=so.a_site_id left JOIN country csc on > csc.id=cs.country_id JOIN circuit c ON > so.product_id=c.product_id JOIN circuit_status cst ON cst.id=c.status JOIN ( SELECT c.id AS > circuit_id, sg.id AS segment_id, > c.pop_support_vendor_id AS vendor_id, v.name AS > vendor_name, sg.a_company_name FROM segment sg JOIN circuit_layout > cl ON cl.segment_id = sg.id AND cl.ordinal = 1 JOIN > circuit c ON c.id = cl.circuit_id JOIN vendor v ON > v.id = c.pop_support_vendor_id ) seg ON seg.circuit_id > = c.id JOIN vendor_gtt_pop on > vendor_gtt_pop.gtt_pop_id = seg.circuit_id offset 0) foo where > vendor_id = 12346 AND (1=1) ORDER BY pop_name ASC,id desc LIMIT 10; > > id | vendor_id | gaa | pop_name | pop_status | > pop_location | pop_provider_id | pop_provider |
pop_street > | pop_city | pop > > _postal_code | pop_state | pop_country | pop_country_id > > > +---+-+---++--+-+--+-+--+ > > -+---+-+ > > 684807 | 12346 | Y | GTT/POP/LON1T | Active | LON1T > | 12288 | Telehouse UK | 14 Coriander Avenue | London | E14 > > 2AA | | GB | 219 > > (1 row) > > > > *Time: 2245.073 ms (00:02.245)* > > > > > On Fri, Mar 23, 2018 at 9:31 AM, Alessandro Aste > <alessandro.a...@gmail.com <mailto:ales
Re: Postgresql 10.3 , query never completes if LIMIT clause is specified and parallel processing is on
On 03/22/2018 11:29 PM, Alessandro Aste wrote: > Thanks Tomas. We're currently building postgres from source. In order to > enable symbols, you want me to re-configure postres with --enable-debug > then run perf? > Yes. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Postgresql 10.3 , query never completes if LIMIT clause is specified and parallel processing is on
On 03/22/2018 11:33 AM, Alessandro Aste wrote: > Tomas, thank you. This machine is a bare metal server running only a > staging postgresql 10.3 instance. Nobody is using it beside me. > > I'm attaching 4 files. > > every_30_seconds_top_stats_during_query.txt - this is a capture of > the top command every 30 seconds (more or less) for 10+ minutes while I'm > running the query. Let me know if this helps to answer your question. > EXPLAIN_WITH_LIMIT_AND_max_parallel_workers_per_gather_SET_TO_0.txt - > query plan with full query and max_parallel_workers_per_gather forced > to 0. Full output. > EXPLAIN_WITH_LIMIT_AND_DEFAULT_PARALLEL_PROCESSING_SETTINGS.txt - > query plan with full query and default parallel processing settings. > Full output. > EXPLAIN_WITHOUT_LIMIT_AND_DEFAULT_PARALLEL_PROCESSING_SETTINGS.tx - > query plan of the query omitting the LIMIT clause and default parallel > processing settings. Full output. > OK. Looking at the top output, I see this: PID USER VIRT RES SHR S %CPU %MEM TIME+ COMMAND 104880 postgres 30.8g 1.9g 1.9g R 92.0 1.5 15:15.60 postmaster 111732 postgres 30.8g 476m 473m R 88.2 0.4 0:00.47 postmaster 111730 postgres 30.8g 473m 470m R 86.4 0.4 0:00.46 postmaster 111731 postgres 30.8g 476m 473m R 86.4 0.4 0:00.46 postmaster 111733 postgres 30.8g 476m 473m R 86.4 0.4 0:00.46 postmaster 111734 postgres 30.8g 476m 473m R 86.4 0.4 0:00.46 postmaster 111728 root 15824 1912 828 R 3.8 0.0 0:00.04 top That means it certainly is not stuck, it's simply doing a lot of work on CPU. The question is why and what it's doing ... Can you collect some CPU profiles using perf? There's a howto here: https://wiki.postgresql.org/wiki/Profiling_with_perf But in short - install perf, install debuginfo packages for postgres, and then do perf record -a -g -s sleep 60 while running the query. Once the profile data is collected, do perf report > report.txt and share the report.txt with us (well, if it's very large you may need to only cut the first 1000 lines or so).
That should tell us in which functions most of the time is spent. That will give us some hints, hopefully. > > For what concerns the self-contained test case - I'll do my best to > prepare it. > Yeah, that would be helpful. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Postgresql 10.3 , query never completes if LIMIT clause is specified and parallel processing is on
On 03/21/2018 08:44 PM, Alessandro Aste wrote: > Thanks for your reply Tomas. The query just got stuck for forever. I > observed no CPU spikes, it is currently running and I see 89 of the CPU > idle. I'm using a relative powerfull machine (120 GB ram, 56 CPU). > That doesn't really answer the question, I'm afraid. I suppose "89 of CPU" means that 89% idle in total, but 11% with 56 CPUs still means about 6 cores 100% busy. But maybe you meant something else? Is there something else running on the machine? If you look at "top" are the processes (the one you're connected to and the parallel workers) doing something on the CPU? > > Plain analyze as requested. : > I don't see anything obviously broken with the query plan, and it's difficult to compare with the other plans because they are quite different. But there's one part of the plan interesting: Limit (cost=253523.56..253523.57 rows=1 width=176) -> Sort (cost=253523.56..253523.57 rows=1 width=176) Sort Key: c_2.gii_circuitid, c_1.id -> Nested Loop (cost=33190.89..253523.55 rows=1 width=176) Join Filter: (c_1.id = c.id) -> Nested Loop (cost=31724.87..31736.29 rows=1 ...) ... -> Gather (cost=1466.02..221787.23 rows=3 width=75) Workers Planned: 5 -> Hash Join (cost=466.02..220786.93 rows=1 ...) ... That is, there's a Gather on the inner side of a Nested Loop. I wonder if that might cause issues in case of under-estimate (in which case we'd be restarting the Gather many times) ... BTW one of the plans you sent earlier is incomplete, because it ends like this: -> Nested Loop (cost=42469.41..42480.82 rows=1 width=85) (...) Join Filter: (c.status = cst.id) Time: 3016.688 ms (00:03.017) That is, it's missing the part below the join. That being said, I'm not sure what's the issue here. Can you prepare a self-contained test case that we might use to reproduce the issue? For example by dumping the relevant part of the tables? 
regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Postgresql 10.3 , query never completes if LIMIT clause is specified and parallel processing is on
On 03/21/2018 05:09 PM, Alessandro Aste wrote: > Hi there, we are using postgresql 10.3 and we're facing an issue with a > query. The query (full query below) completes only when: > > 1 - LIMIT 10 is removed > or > 2 - show max_parallel_workers_per_gather is set to 0, so parallel > processing is disabled. > > With max_parallel_workers_per_gather set to the default value (8) I'm > not even able to get the query plan. > > Notes: > > * We're experiencing the issue in any server of ours but I've > reproduced the issue in a fresh restored database with full > vacuum/reindex of the tables. > * We didn't touch any parameter concering the parallel processing, > we're running the defaults: > > > cmdstaging=# show max_parallel_workers_per_gather ; > max_parallel_workers_per_gather > - > 8 > (1 row) > > cmdstaging=# show max_worker_processes ; > max_worker_processes > -- > 8 > (1 row) > > cmdstaging=# show max_parallel_workers; > max_parallel_workers > -- > 8 > (1 row) > > > > > The query completes only omitting the LIMIT clause or when I disable > parallel processing: > > id | vendor_id | gaa | pop_name | pop_status | pop_location | > pop_provider_id | pop_provider | pop_street | pop_city | pop > _postal_code | pop_state | pop_country | pop_country_id > +---+-+---++--+-+--+-+--+ > -+---+-+ > 684807 | 12346 | | GTT/POP/LON1T | Active | LON1T | > 12288 | Telehouse UK | 14 Coriander Avenue | London | E14 > 2AA | | GB | 219 > (1 row) > > Time: 4374.759 ms (00:04.375) > cmdstaging=# show max_parallel_workers_per_gather ; > max_parallel_workers_per_gather > - > 0 > (1 row) > > Time: 0.097 ms > > > Otherwise it just keep running for forever. > When you say "running forever" is it actually using CPU, or does it get stuck on something? 
> > This is the full query: > > > SELECT * FROM ( > SELECT > seg.circuit_id AS id, > vendor_gtt_pop.vendor_id, > CASE WHEN vendor_gtt_pop.push_to_gaa = 1 THEN 'Y' END as gaa, > pop.gii_circuitid AS pop_name, > cst.label AS pop_status, > seg.a_company_name AS pop_location, > seg.vendor_id AS pop_provider_id, > seg.vendor_name AS pop_provider, > cs.address1 AS pop_street, > cs.city AS pop_city, > cs.postal_code AS pop_postal_code, > cs.state AS pop_state, > csc.code AS pop_country, > cs.country_id AS pop_country_id > FROM ( > SELECT c.gii_circuitid, max(so.id) AS service_order_id > FROM service_order so > join circuit c on c.product_id=so.product_id > join master_service_order mso on mso.id=so.master_service_order_id > WHERE NOT (so.ordertype_id = 2 AND so.status <> 999) AND > NOT (so.ordertype_id = 3 AND so.status <> 999) AND > c.status >= 20 AND > c.status not in (160,999) AND > mso.client_id=11615 AND > c.service_description=28 AND > c.status!=160 > GROUP BY c.gii_circuitid > ) pop > JOIN service_order so ON so.id = pop.service_order_id > left JOIN client_site cs on cs.id=so.a_site_id > left JOIN country csc on csc.id=cs.country_id > JOIN circuit c ON so.product_id=c.product_id > JOIN circuit_status cst ON cst.id=c.status > JOIN ( > SELECT c.id AS circuit_id, sg.id AS segment_id, > c.pop_support_vendor_id AS vendor_id, > v.name AS vendor_name, sg.a_company_name > FROM segment sg > JOIN circuit_layout cl ON cl.segment_id = sg.id AND cl.ordinal > = 1 > JOIN circuit c ON c.id = cl.circuit_id > JOIN vendor v ON v.id = c.pop_support_vendor_id > ) seg ON seg.circuit_id = c.id > JOIN vendor_gtt_pop on vendor_gtt_pop.gtt_pop_id = seg.circuit_id > ) foo where vendor_id = 12346 AND (1=1) ORDER BY pop_name ASC,id LIMIT 10 > > > > Execution plan with max_parallel_workers_per_gather
=0 , > max_parallel_workers_per_gather =8 and no LIMIT clause : > > We really need to see the execution plan that actually causes the issue, i.e. the one with max_parallel_workers_per_gather=8 and the LIMIT clause. Plain EXPLAIN (without ANALYZE), at least. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
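A plain EXPLAIN of the problematic variant can be captured like this (a sketch reusing the query above; plain EXPLAIN only plans the query without running it, so it is safe even when execution is slow):

```sql
SET max_parallel_workers_per_gather = 8;

-- plan only, no execution; the inner query text is abbreviated here
EXPLAIN
SELECT * FROM ( /* ... inner query from the post ... */ ) foo
WHERE vendor_id = 12346 AND (1=1)
ORDER BY pop_name ASC, id
LIMIT 10;
```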
Re: dirty_ratio & dirty_background_ratio settings with huge memory
On 03/06/2018 01:16 PM, Laurenz Albe wrote: > pinker wrote: >> I've got a customer with really huge RAM, now it's: >> total used free shared buffers cached >> Mem: 3102111 3052596 49515 20880 1992 2961185 >> -/+ buffers/cache: 90418 3011693 >> Swap: 8191 1 8190 >> >> (free -m) >> >> and before it was twice more (6TB). >> >> and trying to figure out how to set dirty_ratio & dirty_background_ratio >> parameters. Even for a normal sized server those default sizes are too high, >> but now they would be ridiculously huge, respectively 1,24TB and 300GB. >> I'm thinking about 1 percent for dirty_ratio and not using >> dirty_background_ratio because it's not possible to set it below 1% but to >> set dirty_background_bytes instead to about 1-2GB. >> Maybe someone has got other experience with RAM of this size and those >> settings? > > Yes, you should set vm.dirty_background_bytes and vm.dirty_bytes > and not use the *_ratio settings. > > 2 GB for vm.dirty_background_bytes and 1 GB for vm.dirty_bytes sounds fine. > It should be the other way around: vm.dirty_background_bytes has to be smaller than vm.dirty_bytes (e.g. 1 GB and 2 GB, respectively). regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
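With the thresholds the right way around, the settings would look like this (a sketch assuming the 1-2 GB values discussed above; the right numbers depend on how fast the storage can drain dirty pages):

```
# /etc/sysctl.conf (or a file under /etc/sysctl.d/)
# background writeback starts at 1 GB of dirty data, blocking writeback
# at 2 GB - the background threshold must be the smaller one
vm.dirty_background_bytes = 1073741824
vm.dirty_bytes = 2147483648
# setting the *_bytes variants makes the kernel ignore the *_ratio ones
```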
Re: Not sure if I should CREATE INDEX for text columns on which I plan to filter later
On 02/21/2018 05:00 PM, Bill Moran wrote: > On Wed, 21 Feb 2018 13:33:18 +0100 > Alexander Farber <alexander.far...@gmail.com> wrote: > >> Hi Martin - >> >> On Wed, Feb 21, 2018 at 1:26 PM, Martin Moore <martin.mo...@avbrief.com> >> wrote: >> >>> I’m no expert but I’d think it unlikely an index would be considered for a >>> table with only 100 rows in. Also I’m pretty sure only one index per table >>> is used, so you’d want to put state1 and state2 in one index. >> >> I hope to have more records in the words_games table later when my game is >> launched (currently in soft launch/beta). > > To elaborate+clarify Martin's comments. > > The index will not be used while the table is small because it's actually > slower > to process an index than it is to just read the entire table. However, as the > table gets more rows, these timings will reverse and Postgres will start using > the indexes. It's probably best to just create them even though the table is > small. The performance improvement you'll get when the table grows will be > well worth it, and it avoids the problem of trying to remember to create it > later. > > However, Martin's other comment about only using a single index is incorrect. > Postgres can use multiple indexes per query, so it's often good practice to > put indexes on every column that might ever be used in a WHERE clause. > I call this practice "shotgun" and generally discourage people from using it. It seems attractive, but not every WHERE condition can be evaluated using an index, and every index has maintenance overhead. There are cases when it's somewhat reasonable (e.g. when you don't know which columns will be referenced by WHERE conditions, and data ingestion has lower priority than queries). But that doesn't seem to be the case here - you know the WHERE conditions, and people are likely sending a lot of inserts (and expecting low latency responses). 
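For the table in question, a single multicolumn index is usually a better starting point than one index per column (a sketch using the table and column names from the thread; whether it actually helps depends on the real WHERE clauses):

```sql
-- one index can serve predicates on (state1) alone and on (state1, state2)
CREATE INDEX words_games_states_idx ON words_games (state1, state2);

-- the "shotgun" alternative discouraged above would be:
-- CREATE INDEX ON words_games (state1);
-- CREATE INDEX ON words_games (state2);
-- ... one per column that might ever appear in a WHERE clause
```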
regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: query performance
On 02/18/2018 06:37 AM, David Rowley wrote: > On 18 February 2018 at 12:35, hmidi slim <hmidi.sl...@gmail.com> wrote: >> Is there another optimized solution for a query like this: >> select * from ( >> select e.name, e1.name, e.id >> from establishment as e, establishment as e1 >> where e.id <> e1.id >> and e1.id = 1 >> and ST_DWithin(geom, ST_MakePoint(e1.longitude, e1.latitude)::geography, >> 1000)) as tmp inner join product as p on p.establishment_id = tmp.id > > You really should state what is wrong with the performance of the > above version and also include the EXPLAIN (ANALYZE, BUFFERS) of that > query. > > The version of PostgreSQL that you're running it on is also a good > thing to share. > > Details of the indexes which you've defined on the tables are also > useful information. It would especially be good to know if you've > added an index on product (establishment_id), for example. > There's actually a wiki page with an overview of what should be included in "slow query" questions: https://wiki.postgresql.org/wiki/Slow_Query_Questions regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
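The output David asks for can be produced by prefixing the query from the post (a sketch; note that ANALYZE actually executes the query, so expect it to take as long as a normal run):

```sql
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM (SELECT e.name, e1.name, e.id
      FROM establishment AS e, establishment AS e1
      WHERE e.id <> e1.id
        AND e1.id = 1
        AND ST_DWithin(geom,
                       ST_MakePoint(e1.longitude, e1.latitude)::geography,
                       1000)) AS tmp
INNER JOIN product AS p ON p.establishment_id = tmp.id;
```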
Re: shared_buffers 8GB maximum
On 02/18/2018 02:41 PM, Vitaliy Garnashevich wrote: > >> I certainly wouldn't recommend using 1/2 of RAM right away. There's >> a good chance it would be a waste of memory - for example due to >> double buffering, which effectively reduces "total" cache hit >> ratio. > > Double buffering is often mentioned in the context of tuning shared > buffers. Is there a tool to actually measure the amount of double > buffering happening in the system? > I'm not aware of such a tool. But I suppose it could be done by integrating information from pg_buffercache and pgfincore [1]. [1] https://github.com/klando/pgfincore >> Those evictions are performed either by backends or bgwriter, both >> of which are less efficient than checkpointer. Not only can >> checkpointer perform various optimizations (e.g. sorting buffers to >> make the writes more sequential), but it also writes each dirty >> buffer just once. With smaller shared_buffers the page may have to be >> written multiple times. > > In the case when shared_buffers cover most of RAM, most of writes > should happen by checkpointer, and cache hit ratio should be high. So > a hypothetical question: Could shared_buffers=200GB on a 250 GB RAM > server ever be a reasonable setting? (assuming there are no other > applications running except postgres, and 50GB is enough for > allocating work_mem/maintenance_work_mem and for serving queries) > It depends on how large the active part of the data set is. If it fits into 200GB but not into smaller shared buffers (say, 100GB), then using 200GB may be a win. If the active set is much larger than RAM, smaller shared_buffer values work better in my experience. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
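Short of such a tool, pg_buffercache alone shows what currently occupies shared buffers, which is half of the picture (a simplified sketch: it matches on relfilenode only and ignores database/fork distinctions):

```sql
CREATE EXTENSION IF NOT EXISTS pg_buffercache;

-- top relations by number of 8kB buffers cached in shared_buffers
SELECT c.relname, count(*) AS buffers
FROM pg_buffercache b
JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
GROUP BY c.relname
ORDER BY buffers DESC
LIMIT 10;
```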
Re: shared_buffers 8GB maximum
On 02/17/2018 02:56 AM, George Neuner wrote: > On Sat, 17 Feb 2018 00:36:57 +0200, Vitaliy Garnashevich > <vgarnashev...@gmail.com> wrote: > ... > >> Could that be a reason for increasing the value of shared_buffers? >> >> - Could shared_buffers=128GB or more on a 250 GB RAM server be a >> reasonable setting? What downsides could there be? > > It depends. 8GB is pretty small for such a large server, but taking > 1/2 the RAM is not necessarily the right thing either. > I certainly wouldn't recommend using 1/2 of RAM right away. There's a good chance it would be a waste of memory - for example due to double buffering, which effectively reduces "total" cache hit ratio. Start with a lower value, increment it gradually and monitor the behavior of the server. > The size of shared buffers affects log size and the time to complete > checkpoints. If a large(ish) percentage of your workload is writes, > having a very large shared space could be bad for performance, or > bad for space on the log device. > The size of shared_buffers has pretty much no impact on the size of the WAL - that's flat out wrong. It also does not affect the time needed to perform a checkpoint. It may mean that the checkpoint has to write more dirty buffers, but that is actually a good thing because a checkpoint is about the most efficient way to do writes. By using smaller shared buffers you're making it more likely the database has to evict (dirty) buffers from shared buffers to make space for other buffers needed by queries/vacuum/whatever. Those evictions are performed either by backends or bgwriter, both of which are less efficient than checkpointer. Not only can checkpointer perform various optimizations (e.g. sorting buffers to make the writes more sequential), but it also writes each dirty buffer just once. With smaller shared_buffers the page may have to be written multiple times. What actually *does* matter is the active part of the data set, i.e. 
the part of the data that is actually accessed regularly. In other words, your goal is to achieve good cache hit ratio - say, 95% or more. This also helps reduce the number of syscalls (when reading data from page cache). What is the right shared_buffers size? I have no idea, as it's very dependent on the application. It might be 1GB or 100GB, hard to say. The best thing you can do is set shared buffers to some conservative value (say, 4-8GB), let the system run for a day or two, compute the cache hit ratio using metrics in pg_stat_database, and then decide if you need to resize shared buffers. >> PS. Some background. We had shared_buffers=8GB initially. In >> pg_stat_bgwriter we saw that dirty buffers were written to disk more >> frequently by backends than during checkpoints (buffers_clean > >> buffers_checkpoint, and buffers_backend > buffers_checkpoint). According >> to pg_buffercache extension, there was very small percentage of dirty >> pages in shared buffers. The percentage of pages with usagecount >= 3 >> was also low. Some of our more frequently used tables and indexes are >> more than 10 GB in size. This all suggested that probably the bigger >> tables and indexes, whenever scanned, are constantly flushing pages from >> the shared buffers area. After increasing shared_buffers to 32GB, the >> picture started looking healthier. There were 1GB+ of dirty pages in >> shared buffers (vs 40-200MB before), 30-60% of pages with usagecount >= >> 3 (vs 10-40% before), buffers_checkpoint started to grow faster than >> buffers_clean or buffers_backend. Still, not all frequently used >> data fits in shared_buffers, so we're considering increasing the >> parameter further. I wanted to have some idea about how big it could >> reasonably be. > > So now you know that 32GB is better for your workload than 8GB. But > that is not necessarily a reason immediately to go crazy with it. 
Try > increasing it gradually - e.g., adding 16GB at a time - and see if the > additional shared space provides any real benefit. > Gradual increases are a good approach in general. And yes, having buffers_checkpoint > buffers_clean > buffers_backend is a good goal too. Together with the cache hit ratio it's probably a more sensible metric than looking at usagecount directly. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
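The cache hit ratio mentioned above can be computed from pg_stat_database (a sketch for the current database; the counters are cumulative, so measure over a period of normal load, ideally by diffing two snapshots or after pg_stat_reset()):

```sql
SELECT blks_hit,
       blks_read,
       round(100.0 * blks_hit / nullif(blks_hit + blks_read, 0), 2)
         AS cache_hit_pct
FROM pg_stat_database
WHERE datname = current_database();
```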
Re: New Copy Formats - avro/orc/parquet
On 02/10/2018 04:30 PM, Nicolas Paris wrote: >>> I'd find it useful to be able to import/export from postgres to those modern >>> data >>> formats: >>> - avro (c writer=https://avro.apache.org/docs/1.8.2/api/c/index.html) >>> - parquet (c++ writer=https://github.com/apache/parquet-cpp) >>> - orc (all writers=https://github.com/apache/orc) >>> >>> Something like : >>> COPY table TO STDOUT ORC; >>> >>> Would be lovely. >>> >>> This would greatly enhance how postgres integrates in big-data ecosystem. >>> >>> Any thought ? >> >> https://www.postgresql.org/docs/10/static/sql-copy.html >> >> "PROGRAM >> >> A command to execute. In COPY FROM, the input is read from standard >> output of the command, and in COPY TO, the output is written to the standard >> input of the command. >> >> Note that the command is invoked by the shell, so if you need to pass >> any arguments to shell command that come from an untrusted source, you must >> be careful to strip or escape any special characters that might have a >> special meaning for the shell. For security reasons, it is best to use a >> fixed command string, or at least avoid passing any user input in it. >> " >> > > PROGRAM would involve overhead of transforming data from CSV or > BINARY to AVRO for example. > > Here, I am talking about native format exports/imports for > performance considerations. > That is true, but the question is how significant the overhead is. If it's 50% then reducing it would make perfect sense. If it's 1% then no one is going to be bothered by it. Without these numbers it's hard to make any judgments. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: New Copy Formats - avro/orc/parquet
On 02/10/2018 04:38 PM, David G. Johnston wrote: > On Saturday, February 10, 2018, Nicolas Paris <nipari...@gmail.com <mailto:nipari...@gmail.com>> wrote: > > Hello > > I'd find it useful to be able to import/export from postgres to those > modern data > formats: > - avro (c writer=https://avro.apache.org/docs/1.8.2/api/c/index.html > <https://avro.apache.org/docs/1.8.2/api/c/index.html>) > - parquet (c++ writer=https://github.com/apache/parquet-cpp > <https://github.com/apache/parquet-cpp>) > - orc (all writers=https://github.com/apache/orc > <https://github.com/apache/orc>) > > Something like : > COPY table TO STDOUT ORC; > > Would be lovely. > > This would greatly enhance how postgres integrates in big-data > ecosystem. > > > It would be written "... with (format 'orc')" and your best bet would be > to create an extension. I don't think that having such code in core (or > contrib) is desirable. > I don't think an extension can make this work at the moment (the formats supported by COPY are hard-coded in core). But I agree that if we are to add multiple new formats, it'd be nice to allow doing that in an extension. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: ERROR: invalid memory alloc request size 1073741824
On 01/31/2018 09:51 PM, Jan Wieck wrote: > > > On Wed, Jan 31, 2018 at 12:32 PM, Stefan Blanke > <stefan.bla...@framestore.com> wrote: > > > > > I'll bet you it's not that. It's quite unlikely that would fail with > > exactly 1GB request size. It seems much more like a buffer that we keep > > to be power of 2. The question is which one. > > I had dismissed corruption before writing in. It's exactly 1GB every > time this has happened - and we can dump the full dataset > periodically without issue. > > >> I have my money on a corrupted TOAST entry. Is this happening on > >> trustworthy hardware or beige box with no ECC or RAID? > > It's good quality commercial hardware in our colo - not exactly sure > what. > > > If it is a sporadic issue and you can dump the full dataset, then I just > lost my money (Tomas, you coming to PGConf in Jersey City?). > Unfortunately no, but I'm sure there will be other opportunities to buy me a beer ;-) Like pgcon, for example. > > But then, if this is a plain COPY to stdout ... I am wondering what > could possibly be in that path that wants to allocate 1GB. Or is this > not so plain but rather a COPY ... SELECT ...? > That's what I've been guessing, and why I was asking for a query plan. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Issue with WAL logs temporary not replaying
to tables, and yet > keep silence? There was no indication WAL is not being applied, > postgres log would be silent (even though it kept writing other > commodity messages, such as "incomplete startup packet" and > malformed query attempts). > > Due to this, we noticed the issue only hours after postgres > supposedly stopped applying the WAL logs to tables. > It wouldn't. But it's unclear how you came to that conclusion. > * Is there any other log (like some lower level system log?) where I > could see the status of writing / reading / applying WAL segments? > There's only one log for PostgreSQL, so it would have to be in that file depending on the log level. > * What measures could be done to prevent this in future, or at least > make the situation easier to recover from (e.g. without data loss)? > The jury is still out on the root cause, so it's too soon to say. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: EXPLAIN BUFFERS: dirtied
On 01/29/2018 08:21 PM, Vitaliy Garnashevich wrote: > I've read this article: https://wiki.postgresql.org/wiki/Hint_Bits > > It says: > >> A plain SELECT, count(*), or VACUUM on the entire table will check >> every tuple for visibility and set its hint bits. > > Suppose, a new page was created using many INSERTs, and then was written > to disk during a checkpoint. There were no SELECTs or VACUUM on the page > or table yet. Will the following SELECT of one tuple from the page > update hint bits for ALL tuples on the page? Is that correct? > Possibly, if there are no old transactions running. > When a page is initially created and then is being written to disk > during a checkpoint, does the checkpoint writer update the hint bits before > writing the page, or will the following SELECT/VACUUM have to do that > (possibly loading/updating/writing the page again)? > The checkpointer only deals with 8kB chunks of data. Hint bits are not set on a page, but on individual items (rows), so it's not up to the checkpoint process to tweak that - that's up to queries accessing the data. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
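Hint bits can be observed directly with the pageinspect extension (a sketch; `mytable` is a placeholder, and 0x0100 is the HEAP_XMIN_COMMITTED bit of t_infomask):

```sql
CREATE EXTENSION IF NOT EXISTS pageinspect;

-- per-tuple infomask of page 0: tuples that a SELECT has already checked
-- for visibility carry the "xmin committed" hint
SELECT lp, (t_infomask & x'0100'::int) <> 0 AS xmin_committed_hint
FROM heap_page_items(get_raw_page('mytable', 0));
```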
Re: Hardware advice
Hi, On 01/22/2018 09:46 PM, Alban Hertroys wrote: > Hi all, > > At work we are in the process of setting up a data-warehouse using PG > 10. I'm looking for a suitable server, but I hardly know anything > about server-grade hardware. > > Hence, we're looking for some advice, preferably with some > opportunity to discuss our situation and possibly things we did not > take into account etc. A face to talk to would be appreciated. Who > provides that in or near the eastern border of the Netherlands? > Coincidentally, there's a big conference (FOSDEM) in that area next week, with a smaller PostgreSQL conference (FOSDEM PgDay) the day before. Might be a good opportunity to talk to PostgreSQL people. If you're looking for actual consulting, there are multiple companies that might help you with this (including ours). > More details: > > We're planning to deploy on bare-metal hardware, with a fallback > server with similar or lesser specs for emergencies and upgrades and > perhaps some (read-only) load balancing of different kinds of loads. > > The server will be accessed for reporting and ETL (or ELT) mostly. > Both reporting servers (test/devel and production) are configured for > at most 40 agents, so that's max 40 connections each to the warehouse > for now. So far, we haven't reached that number in real loads, but > reports are requested ~40,000 times a month (we measure HTTP requests > minus static content). > > We will also be doing ETL of (very) remote (SAP) tables to the > warehouse server; in what we got so far in our limited test > environment we have tables of over 30GB, most of which is from the > last 4 to 5 years. > That's nice, but it does not really tell us how much work that means for the database :-( Those queries might be touching a tiny subset of the data, or they might touch the whole data set. That will have a significant impact on the hardware requirements. 
> The biggy though is that we also plan to store factory process > measurements on this server (temperatures, pressures, etc. at 5s > intervals). So, time series data. I wonder if timescale [1] would be appropriate here (never used it, but it seems to be designed for exactly this use case). And built on PostgreSQL. [1] http://www.timescale.com/ > Part of one factory has already been writing that data to > a different server, but that's already 4.3 billion records (140GB) > for about a year of measuring and that's not even half of the > factory. We will be required to retain 10-15 years of data from > several factories (on the short term, at least 2). The expectancy is > that this will grow to ~15TB for our factory alone. > Storing this amount of data is not that difficult - the DL360 machines can handle 40TB+ for example. The question is how intensive the data access and processing will be, so that you can pick the right storage configuration, size the amount of RAM etc. FWIW it makes no sense to size this for 10-15 years from the get go, because (a) you won't get it right anyway (things change over time), and (b) you're unlikely to keep the same hardware for 10+ years. Get smaller but more powerful hardware, and plan to replace it in a couple of years with newer machines. > We also want to keep our options for growth of this data warehouse > open. There are some lab databases, for example, that currently exist > as two separate brand database servers (with different major versions > of the lab software, so there are design differences as well), that > aren't exactly small either. > > I have been drooling over those shiny new AMD Epyc processors, which > look certainly adequate with a truckload of memory and a good RAID-10 > array and some SSD(s) for the WAL, but it's really hard to figure out > how many cores and memory we need. 
Sure, 2 7601's at 64 cores and 4TB > of memory (if that's even for sale) would probably do the trick, but > even I think that might be going a little overboard ;) > Well, you can either find out what your actual needs are (by looking at the current system and extrapolating it in some way) and size the hardware accordingly. Or you can squeeze as much money from management as possible and buy the shiniest stuff available. > Oh yeah, apparently we're married to HP or something… At least, IT > management told me to look at their offerings. > I'd say Proliant machines are pretty solid. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: postgresql 9.5 has ocuuered OOM
On 12/20/2017 04:08 PM, mark wrote: > I have set shared_buffers is 1/4 of memory. > work_mem is 2% of memory. > max_connections is 50. That means if you have all 50 connections active, they may easily consume 100% of memory, because 50 * 2% is 100%. It's even easier if the connections are executing complex queries, because each query may use multiple work_mem buffers. So 2% seems a bit too high. > memory size is 16GB. > postgresql process used over 70% of memory and occurred OOM. So, did a single process use 70% of memory, or all postgres processes combined? If just a single process, it might be a poor plan choice (e.g. hash aggregate may easily cause that). If all processes combined, then perhaps it's due to work_mem being too high. > what should I do to deal with this problem? > Hard to say, until you provide enough information. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
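The arithmetic behind that warning, as a quick check (the 16 GB / 2% / 50-connection figures come from the report above; one work_mem allocation per connection is the *best* case - complex queries use several):

```python
ram_mb = 16 * 1024            # 16 GB server
work_mem_mb = 0.02 * ram_mb   # work_mem at 2% of RAM, ~328 MB
max_connections = 50

# if every connection uses just one work_mem-sized buffer at once:
worst_case_gb = work_mem_mb * max_connections / 1024
print(worst_case_gb)          # 16.0 - the entire RAM, before shared_buffers etc.
```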
Re: Size of pg_multixact/members increases 11355
On 12/13/2017 01:42 PM, Yogesh Sharma wrote: > Dear Thomas , > > Thanks for sharing information. > > Is it possible to remove mentioned folder files in some time intervals > by some DB command? > Currently i can not upgrade to 3.6.20. > So please share if any solution is available. > There are no explicit commands to remove them. When the database decides those files are not needed, it'll remove them automatically. It's strongly discouraged to mess with those files directly, as it may easily lead to data loss or data corruption, particularly when it's unclear why the directory got so large. You could try setting autovacuum_multixact_freeze_max_age=0 and vacuuming the database, but it's hard to say if that will help. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
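Before touching any settings, it is worth checking which database is holding back the oldest multixact, since pg_multixact files needed by it cannot be removed by the server (a sketch; mxid_age() is available in PostgreSQL 9.5 and later):

```sql
-- databases ordered by how far their oldest needed multixact lags behind
SELECT datname, datminmxid, mxid_age(datminmxid) AS multixact_age
FROM pg_database
ORDER BY mxid_age(datminmxid) DESC;
```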
Re: large numbers of inserts out of memory strategy
On 11/28/2017 07:26 PM, Ted Toth wrote: > On Tue, Nov 28, 2017 at 12:01 PM, Tomas Vondra > <tomas.von...@2ndquadrant.com> wrote: >> >> ... >> >> That is, most of the memory is allocated for SPI (2.4GB) and PL/pgSQL >> procedure (500MB). How do you do the load? What libraries/drivers? >> > > I'm doing the load with 'psql -f'. I using 9.6 el6 rpms on a Centos VM > I downloaded from the postgres repo. > So what does the script actually do? Because psql certainly is not running pl/pgsql procedures on its own. We need to understand why you're getting OOM in the first place - just inserts alone should not cause failures like that. Please show us a more detailed explanation of what the load actually does, so that we can try reproducing it. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: large numbers of inserts out of memory strategy
Hi, On 11/28/2017 06:17 PM, Ted Toth wrote: > I'm writing a migration utility to move data from non-rdbms data > source to a postgres db. Currently I'm generating SQL INSERT > statements involving 6 related tables for each 'thing'. With 100k or > more 'things' to migrate I'm generating a lot of statements and when I > try to import using psql postgres fails with 'out of memory' when > running on a Linux VM with 4G of memory. If I break into smaller > chunks say ~50K statements then the import succeeds. I can change my > migration utility to generate multiple files each with a limited > number of INSERTs to get around this issue but maybe there's > another/better way? > The question is what exactly runs out of memory, and how did you modify the configuration (particularly related to memory). regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
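Until the root cause is known, the chunking workaround described above can be scripted instead of built into the migration utility, provided each statement sits on its own line (a sketch; `inserts.sql`, the chunk size and the `psql` invocation are placeholders):

```shell
# demo input: 120k one-line INSERT statements (stand-in for the real dump)
seq 1 120000 | sed 's/^/INSERT INTO t VALUES (/; s/$/);/' > inserts.sql

# split into files of at most 50k statements each (chunk_aa, chunk_ab, ...)
split -l 50000 inserts.sql chunk_
ls chunk_* | wc -l      # 3

# load one chunk per transaction (placeholder command):
# for f in chunk_*; do psql -1 -f "$f" mydb; done
```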
Re: query causes connection termination
On 11/23/2017 01:25 AM, Neto pr wrote: > Another fact is that when executing the query without the command > EXPLAIN ANALYZE, the result is usually returned after a few minutes. > I do not understand, because when using the EXPLAIN ANALYZE command the > dbms closes the connection. > Anyone have any tips on why this occurs? > Attach gdb to the backend, run the query and when it fails get us the backtrace. So something like 1) select pg_backend_pid() 2) gdb -p $PID 3) run the EXPLAIN ANALYZE again 4) watch the gdb session, when it fails do 'bt' You need to install debuginfo first, so that the backtrace makes sense. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
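Spelled out as a session, the four steps look like this (a sketch; 12345 stands for whatever pg_backend_pid() returned, and the debuginfo packages for your postgres build must be installed first):

```
-- session 1 (psql):
SELECT pg_backend_pid();   -- say it returns 12345; keep this session open

# session 2 (shell):
gdb -p 12345
(gdb) continue             # let the backend keep running

-- session 1: run the EXPLAIN ANALYZE; when the backend dies,
# gdb in session 2 regains control:
(gdb) bt                   # this backtrace is what we need
```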