Re: OOM Killer kills PostgreSQL
Greetings,

* Piotr Włodarczyk (piotrwlodarczy...@gmail.com) wrote:
> We met unexpected PostgreSQL shutdown. After a little investigation we've
> discovered that problem is in OOM killer which kills our PostgreSQL.

You need to configure your system to not overcommit.  Read up on the
overcommit_ratio and overcommit_memory Linux settings.

Thanks,

Stephen

signature.asc Description: PGP signature
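The kernel settings referred to above are applied roughly as follows (a sketch only — the ratio value here is an illustrative assumption to be tuned against your RAM and swap, and strict overcommit accounting can cause allocation failures for other processes if set too tight):

```shell
# Switch from heuristic overcommit to strict accounting so the kernel
# refuses allocations it cannot back, rather than OOM-killing later.
# The values below are examples, not recommendations.
sysctl -w vm.overcommit_memory=2
sysctl -w vm.overcommit_ratio=80

# Persist across reboots:
cat >> /etc/sysctl.conf <<'EOF'
vm.overcommit_memory = 2
vm.overcommit_ratio = 80
EOF
```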
Re: Best partition type for billions of addresses
Greetings,

* Arya F (arya6...@gmail.com) wrote:
> > * Arya F (arya6...@gmail.com) wrote:
> > > I need to store about 600 million rows of property addresses across
> > > multiple counties. I need to have partitioning setup on the table as
> > > there will be updates and inserts performed to the table frequently
> > > and I want the queries to have good performance.
> >
> > That's not what partitioning is for, and 600m rows isn't all *that*
> > many.
>
> But I have noticed that my updates and inserts have slowed down
> dramatically when I started going over about 20 million rows and the
> reason was because every time it has to update the index. When I
> removed the index, my insert performance stayed good no matter the
> size of the table.

Sure it does.

> So I should be able to achieve good performance with just one
> partition? Maybe I just need to get hardware with more memory?

Instead of jumping to partitioning, I'd suggest you post your actual
table structures, queries, and explain results here and ask for help.

https://wiki.postgresql.org/wiki/Slow_Query_Questions

Thanks,

Stephen
Re: Best partition type for billions of addresses
Greetings,

* Arya F (arya6...@gmail.com) wrote:
> I need to store about 600 million rows of property addresses across
> multiple counties. I need to have partitioning setup on the table as
> there will be updates and inserts performed to the table frequently
> and I want the queries to have good performance.

That's not what partitioning is for, and 600m rows isn't all *that*
many.

> From what I understand hash partitioning would not be the right
> approach in this case, since for each query PostgreSQL has to check
> the indexes of all partitions?
>
> Would list partitioning be suitable? if I want PostgreSQL to know
> which partition the row is it can directly load the relevant index
> without having to check other partitions. Should I be including the
> partition key in the where clause?
>
> I'd like to hear some recommendations on the best way to approach
> this. I'm using PostgreSQL 12

In this case, it sounds like "don't" is probably the best option.

Partitioning is good for data management, particularly when you have
data that "ages out" or should be removed/dropped at some point,
provided your queries use the partition key.  Partitioning doesn't
speed up routine inserts and updates that are using a proper index and
only updating a small set of rows at a time.

Thanks,

Stephen
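For reference, the list-partitioning scheme being asked about would look like the sketch below (table and column names are invented for illustration — and, as noted above, it's likely unnecessary for this workload).  With the partition key in the WHERE clause, the planner prunes down to a single partition's index:

```sql
-- Hypothetical layout: one partition per county code.
CREATE TABLE address (
    id          bigint GENERATED ALWAYS AS IDENTITY,
    county_code int    NOT NULL,
    street      text
) PARTITION BY LIST (county_code);

CREATE TABLE address_c1 PARTITION OF address FOR VALUES IN (1);
CREATE TABLE address_c2 PARTITION OF address FOR VALUES IN (2);

-- With the partition key in the WHERE clause, only address_c1 and its
-- indexes are considered:
EXPLAIN SELECT * FROM address
WHERE county_code = 1 AND street = 'Main St';
```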
Re: PostgreSQL does not choose my indexes well
Greetings,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Stephen Frost writes:
> > Turns out to be because what was provided wasn't actually what was being
> > used- there's a domain in there and that seems to gum up the works and
> > make it so we don't consider the partial index as being something we can
> > use (see the discussion at the end of the other sub-thread).
>
> Some simple experiments here don't find that a domain-type column prevents
> use of the partial index.  So it's still not entirely clear what's
> happening for the OP.  I concur with Jeff's suggestion to try forcing
> use of the desired index, and see whether it happens at all and what
> the cost estimate is.

Once burned, twice shy, I suppose- considering we weren't given the
actual DDL the first round, I'm guessing there's other differences.

> I'm also wondering exactly which Postgres version this is.

Also a good question.

Thanks,

Stephen
Re: PostgreSQL does not choose my indexes well
Greetings,

* Jeff Janes (jeff.ja...@gmail.com) wrote:
> In order to read 1409985 / 12839 = 109 rows per buffer page, the table must
> be extraordinarily well clustered on this index.  That degree of clustering
> is going to steal much of the thunder from the index-only scan.  But in my
> hands, it does still prefer the partial index with index-only scan by a
> cost estimate ratio of 3 to 1 (despite it actually being slightly slower)
> so I don't know why you don't get it being used.

Turns out to be because what was provided wasn't actually what was
being used- there's a domain in there and that seems to gum up the
works and make it so we don't consider the partial index as being
something we can use (see the discussion at the end of the other
sub-thread).

Thanks,

Stephen
Re: PostgreSQL does not choose my indexes well
Greetings,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Stephen Frost writes:
> > I do wonder if we are maybe missing a bet at times though, considering
> > that I'm pretty sure we'll always go through the index in order, and
> > therefore randomly, even when we don't actually need the results in
> > order..?  Has there been much consideration for just opening an index
> > and sequentially scanning it in cases like this where we have to go
> > through all of the index anyway and don't need the results in order?
>
> As I recall, it's unsafe to do so because of consistency considerations,
> specifically there's a risk of missing or double-visiting some entries due
> to concurrent index page splits.  VACUUM has some way around that, but it
> doesn't work for regular data-fetching cases.  (nbtree/README has more
> about this, but I don't feel like looking it up for you.)

That README isn't exactly small, but the mention of VACUUM having a
trick there helped me find this:

---
The tricky part of this is to avoid missing any deletable tuples in the
presence of concurrent page splits: a page split could easily move some
tuples from a page not yet passed over by the sequential scan to a
lower-numbered page already passed over.  (This wasn't a concern for the
index-order scan, because splits always split right.)  To implement this,
we provide a "vacuum cycle ID" mechanism that makes it possible to
determine whether a page has been split since the current btbulkdelete
cycle started.  If btbulkdelete finds a page that has been split since
it started, and has a right-link pointing to a lower page number, then
it temporarily suspends its sequential scan and visits that page instead.
It must continue to follow right-links and vacuum dead tuples until
reaching a page that either hasn't been split since btbulkdelete started,
or is above the location of the outer sequential scan.  Then it can
resume the sequential scan.  This ensures that all tuples are visited.
---

So the issue is with a page split happening and a tuple being moved to
an earlier leaf page, resulting in us potentially not seeing it even
though we should have during a sequential scan.

The trick that VACUUM does seems pretty involved and would be more
complicated for use for this as it's not ok to return the same tuples
multiple times (though perhaps in a BitmapIndexScan we could handle
that..).  Then again, maybe the skipping scan mechanism that's been
talked about recently would let us avoid having to scan the entire
index even in cases where the conditional doesn't include the initial
index columns, since it looks like that might be what we're doing now.

> My guess based on your results is that the OP's table *isn't* all-visible,
> or at least the planner doesn't know it is.

Hrmpf, even then I seem to end up with an IndexOnlyScan-

=# select * from pg_visibility_map('entidad') where all_visible;
 blkno | all_visible | all_frozen
-------+-------------+------------
(0 rows)

=# analyze entidad;

=# select relallvisible from pg_class where relname = 'entidad';
 relallvisible
---------------
             0
(1 row)

=# explain (analyze, buffers) select * from entidad where cod_tabla = 4;
                                                                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
 Index Only Scan using idx_entidad_tabla_4 on entidad  (cost=0.43..170908.14 rows=1657114 width=24) (actual time=0.312..3511.629 rows=1720668 loops=1)
   Heap Fetches: 3441336
   Buffers: shared hit=6444271 read=469499
 Planning Time: 2.831 ms
 Execution Time: 3563.413 ms
(5 rows)

I'm pretty suspicious that they've made some odd planner configuration
changes or something along those lines to end up with the plan they
got, or there's some reason we don't think we can use the partial
index.

Thanks,

Stephen
Re: PostgreSQL does not choose my indexes well
Greetings,

* David G. Johnston (david.g.johns...@gmail.com) wrote:
> On Thu, Apr 23, 2020 at 8:29 AM Tom Lane wrote:
> > "David G. Johnston" writes:
> > > On Thursday, April 23, 2020, Thomas Kellerer wrote:
> > >> Plus: scanning idx_tabla_entidad is more efficient because that index is
> > >> smaller.
> >
> > > Really?  The absence of 33 million rows in the partial index seems like
> > > it would compensate fully and then some for the extra included columns.
> >
> > On the other hand, an indexscan is likely to end up being effectively
> > random-access rather than the purely sequential access involved in
> > a seqscan.
>
> I feel like I'm missing something as the OP's query is choosing indexscan -
> just it is choosing to scan the full index containing the searched upon
> field instead of a partial index that doesn't contain the field but whose
> predicate matches the where condition - in furtherance of a count(*)
> computation where the columns don't really matter.

The actual query isn't a count(*) though, it's a 'select *'.

> I do get "its going to perform 1.4 million random index entries and heap
> lookup anyway - so it doesn't really matter" - but the first answer was
> "the full index is smaller than the partial" which goes against my
> intuition.

Yeah, I'm pretty sure the full index is quite a bit bigger than the
partial index- see my note from just a moment ago.

> The sequential scan that isn't being used would have to touch 25x the
> number of records - so its non-preference seems reasonable.

Agreed on that.

Thanks,

Stephen
Re: PostgreSQL does not choose my indexes well
Greetings,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> "David G. Johnston" writes:
> > On Thursday, April 23, 2020, Thomas Kellerer wrote:
> >> Plus: scanning idx_tabla_entidad is more efficient because that index is
> >> smaller.
>
> > Really?  The absence of 33 million rows in the partial index seems like
> > it would compensate fully and then some for the extra included columns.
>
> On the other hand, an indexscan is likely to end up being effectively
> random-access rather than the purely sequential access involved in
> a seqscan.

An indexscan is what was chosen though, so this doesn't really seem to
be a question of index scan vs. seq scan, it's a question of why one
index vs. another, though it seems a bit odd that we'd pick a regular
index scan instead of a BitmapHeap/Index scan.

> (If the index was built recently, then it might not be
> so bad --- but the planner doesn't know that, so it assumes that the
> index leaf pages are laid out pretty randomly.)  Moreover, unless the
> table is mostly marked all-visible, there will be another pile of
> randomized accesses into the heap to validate visibility of the index
> entries.

If the table *is* marked all visible, though, then certainly that
index will be better, and I think that's what a lot of this is coming
down to in this particular case.

Populating the tables provided based on the minimal info we got,
minimizing the numbers of pages that 'cod_tabla=4' is on:

insert into tabla select generate_series, 'abcdef' from generate_series(1,20);

insert into entidad select generate_series, 4, generate_series+1 from generate_series(1,1409985);

insert into entidad select generate_series+1409985, generate_series % 20 + 1, generate_series+1 from generate_series(1,34413354) where generate_series % 20 + 1 <> 4;

vacuum analyze entidad;

With this, the table is 1.7GB, idx_tabla_entidad is about 700MB, while
idx_entidad_tabla_4 is only 81MB.

With this, on v12-HEAD, PG will happily use the partial index:

=# explain (analyze, buffers) select * from entidad where cod_tabla = 4;
                                                                    QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
 Index Only Scan using idx_entidad_tabla_4 on entidad  (cost=0.43..55375.20 rows=1422085 width=24) (actual time=0.050..144.745 rows=1409985 loops=1)
   Heap Fetches: 0
   Buffers: shared hit=8497
 Planning Time: 0.338 ms
 Execution Time: 183.081 ms
(5 rows)

Dropping that index and then running it again shows:

=# explain (analyze, buffers) select * from entidad where cod_tabla = 4;
                                                                 QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on entidad  (cost=26641.72..608515.59 rows=1422085 width=24) (actual time=102.844..242.522 rows=1409985 loops=1)
   Recheck Cond: (cod_tabla = 4)
   Heap Blocks: exact=8981
   Buffers: shared read=12838
   ->  Bitmap Index Scan on idx_tabla_entidad  (cost=0.00..26286.20 rows=1422085 width=0) (actual time=101.969..101.969 rows=1409985 loops=1)
         Index Cond: (cod_tabla = 4)
         Buffers: shared read=3857
 Planning Time: 0.264 ms
 Execution Time: 277.854 ms
(9 rows)

If we spread out where the 'cod_tabla=4' tuples are, the partial index
is still used (note that we end up with more like 1.7M tuples instead
of 1.4M, but I don't think that's terribly relevant):

truncate entidad;

insert into entidad select generate_series, generate_series % 20 + 1, generate_series+1 from generate_series(1,34413354);

=# explain (analyze, buffers) select * from entidad where cod_tabla = 4;
                                                                    QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
 Index Only Scan using idx_entidad_tabla_4 on entidad  (cost=0.43..65231.31 rows=1664459 width=24) (actual time=0.036..185.171 rows=1720668 loops=1)
   Heap Fetches: 0
   Buffers: shared hit=10375
 Planning Time: 0.247 ms
 Execution Time: 233.205 ms
(5 rows)

Things get a lot worse when we drop that partial index:

drop index idx_entidad_tabla_4;

=# explain (analyze, buffers) select * from entidad where cod_tabla = 4;
                           QUERY PLAN
------------------------------------------------------------------
 Bitmap Heap Scan on entidad
Re: Fwd: upgrade to PG11 on secondary fails (no initdb was launched)
Greetings,

* Bruce Momjian (br...@momjian.us) wrote:
> On Fri, Jun 14, 2019 at 10:39:40AM -0400, Bruce Momjian wrote:
> > On Fri, Jun 14, 2019 at 03:12:29PM +0200, Fabio Pardi wrote:
> > > Using --size-only, tells rsync to only check the size of the blocks.
> > > That is: if the block is present on the destination, and is the same
> > > size as the origin, then skip.
> >
> > The files are _exactly_ the same on primary and standby, so we don't
> > need to check anything.  Frankly, it is really only doing hard linking
> > of the files.
>
> Here is the description from our docs:
>
>    What this does is to record the links created by pg_upgrade's
>    link mode that connect files in the old and new clusters on the
>    primary server.  It then finds matching files in the standby's old
>    cluster and creates links for them in the standby's new cluster.
>    Files that were not linked on the primary are copied from the
>    primary to the standby.  (They are usually small.)  This provides
>    rapid standby upgrades.  Unfortunately, rsync needlessly copies
>    files associated with temporary and unlogged tables because these
>    files don't normally exist on standby servers.
>
> The primary and standby have to be binary the same or WAL replay would
> not work on the standby.  (Yes, I sometimes forgot how this worked so I
> wrote it down in the docs.)  :-)

Right- this is *not* a general process for building a replica, this is
specifically *only* for when doing a pg_upgrade and *everything* is
shut down when it runs, and every step is checked to ensure that there
are no errors during the process.

Thanks!

Stephen
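The rsync invocation the pg_upgrade documentation describes for this is along the following lines (a sketch only — the cluster paths and standby hostname are placeholders, and it must be run on the primary with both old and new clusters cleanly shut down):

```shell
# Propagate pg_upgrade's hard links to the standby; --size-only is safe
# here precisely because everything is shut down and binary-identical.
rsync --archive --delete --hard-links --size-only --no-inc-recursive \
    /opt/pgsql/9.6 /opt/pgsql/11 \
    standby.example.com:/opt/pgsql
```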
Re: Shortest offline window on database migration
Greetings,

* Haroldo Kerry (hke...@callix.com.br) wrote:
> The bottleneck at dump is CPU (a single one, on a 44 thread server), as we
> are using the -Fc option, that does not allow multiple jobs.
> We tried some time ago to use the --jobs option of pg_dump but it was
> slower, even with more threads. Our guess is the sheer volume of files
> outweighs the processing gains of using a compressed file output. Also
> pg_dump even with multiple jobs spends a lot of time (1h+) on the "reading
> dependency data" section that seems to be single threaded (our application
> is multi-tenant and we use schemas to separate tenant data, hence we have a
> lot of tables).

You might want to reconsider using the separate-schemas-for-tenants
approach.  This isn't the only annoyance you can run into with lots
and lots of tables.  That said, are you using the newer version of
pg_dump (which is what you should be doing when migrating to a newer
version of PG, always)?  We've improved it over time, though I can't
recall off-hand if this particular issue was improved in-between the
releases being discussed here.

Of course, lots of little files and dealing with them could drag down
performance when working in parallel.  Still a bit surprised that it's
ending up slower than -Fc.

> We are creating the replica using :
> docker exec pg-2 pg_basebackup -h 192.168.0.107 -U replication -P --xlog -D
> /var/lib/postgresql/data_9.6
> and it is taking 1h10m , instead of the 2h I reported initially, because we
> were using rsync with checksums to do it, after experimenting with
> pg_basebackup we found out it is faster, rsync was taking 1h just to
> calculate all checksums. Thanks for your insight on this taking too long.

So, it's a bit awkward still, unfortunately, but you can use
pgbackrest to effectively give you a parallel-replica-build.  The
steps are something like:

Get pgbackrest WAL archiving up and going, with the repo on the
destination server/filesystem, but have 'compress=n' in the
pgbackrest.conf for the repo.

Run:

pgbackrest --stanza=mydb --type=full --process-max=8 backup

Once that's done, just do:

mv /path/to/repo/backup/mydb/20190605-12F/pg_data /new/pgdata
chmod -R g-rwx /new/pgdata

Then in /new/pgdata, create a recovery.conf file like:

restore_command = 'pgbackrest --stanza=mydb archive-get %f "%p"'

And start up the DB server.

We have some ideas about how to make that whole thing cleaner but the
rewrite into C has delayed our efforts; perhaps once that's done (this
fall), we can look at it.

Of course, you won't have an actual backup of the new database server
at that point yet, so you'll want to clean things up and make that
happen ASAP.

Another option, which is what I usually recommend, is just to take a
new backup (properly) and then do a restore from it, but that'll
obviously take longer since there's two copies being done instead of
one (though you can parallelize to your heart's content, so it can
still be quite fast if you have enough CPU and I/O).

Thanks,

Stephen
Re: Use Postgres as a column store by creating one table per column
Greetings,

* Lev Kokotov (lev.koko...@gmail.com) wrote:
> Is it efficient to use Postgres as a column store by creating one table per
> column?

Short answer is no, not in a traditional arrangement, anyway.  The
tuple overhead would be extremely painful.  It's possible to improve
on that, but it requires sacrificing what the tuple header gives you-
visibility information, along with some other things.  The question
will be if that's acceptable or not.

> I'm thinking since Postgres stores tables in continuous blocks of 16MB each
> (I think that's the default page size?) I would get efficient reads and
> with parallel queries I could benefit from multiple cores.

The page size in PG is 8k, not 16MB.

Thanks,

Stephen
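To make the tuple-overhead point concrete, here is what the one-table-per-column layout would look like (table and column names are invented for illustration): every single value pays the full per-row cost, and reassembling a logical row needs one join per column.

```sql
-- Hypothetical one-table-per-column layout: each stored "cell" carries
-- the full per-row tuple overhead (header, item pointer, visibility).
CREATE TABLE col_price (row_id bigint PRIMARY KEY, price numeric);
CREATE TABLE col_qty   (row_id bigint PRIMARY KEY, qty   integer);

-- Reassembling one logical row now requires a join per column:
SELECT p.row_id, p.price, q.qty
FROM col_price p
JOIN col_qty q USING (row_id);

-- The overhead is visible directly: a composite holding a single int4
-- is the ~24-byte tuple header on top of the 4-byte integer.
SELECT pg_column_size(ROW(1));
```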
Re: Fwd: upgrade to PG11 on secondary fails (no initdb was launched)
Greetings,

Please don't post these kinds of questions to this list, it's not the
right list.  Pick the correct list to use in the future, and don't
cross-post to multiple lists.

This list is specifically for performance issues and questions
regarding PostgreSQL, not about how to upgrade.  For that, I would
suggest either -general OR -admin (not both).

> Any idea how to handle it ? I'm sending it to the performance mail list
> because no one answered it in the admin list ..

This isn't an acceptable reason to forward it to another list.  These
lists have specific purposes and should be used for those purposes.
Further, no one is under any obligation to respond to questions posed
to these lists and any help provided is entirely at the discretion of
those on the list as to if they wish to, and have time to, help, or
not.

Thanks,

Stephen
Re: PostgreSQL VS MongoDB: a use case comparison
Greetings,

* Merlin Moncure (mmonc...@gmail.com) wrote:
> On Tue, Nov 20, 2018 at 11:28 AM Stephen Frost wrote:
> > Oh yes, having a dictionary would be a great start to reducing the size
> > of the jsonb data, though it could then become a contention point if
> > there's a lot of new values being inserted and such.  Naturally there
> > would also be a cost to pulling that data back out as well but likely it
> > would be well worth the benefit of not having to store the field names
> > repeatedly.
>
> Yes, the biggest concern with a shared dictionary ought to be
> concurrency type problems.

Hmmm, I wonder if we could do something like have a dictionary per
page..  Or perhaps based on some hash of the toast ID..  Not sure. :)

Thanks!

Stephen
Re: PostgreSQL VS MongoDB: a use case comparison
Greetings,

* Merlin Moncure (mmonc...@gmail.com) wrote:
> On Tue, Nov 20, 2018 at 10:43 AM Stephen Frost wrote:
> > * Merlin Moncure (mmonc...@gmail.com) wrote:
> > > On Mon, Nov 19, 2018 at 11:26 AM Stephen Frost wrote:
> > > > Looks like a lot of the difference being seen and the comments made
> > > > about one being faster than the other are because one system is
> > > > compressing *everything*, while PG (quite intentionally...) only
> > > > compresses the data sometimes- once it hits the TOAST limit.  That
> > > > likely also contributes to why you're seeing the on-disk size
> > > > differences that you are.
> > >
> > > Hm.  It may be intentional, but is it ideal?  Employing datum
> > > compression in the 1kb-8kb range with a faster but less compressing
> > > algorithm could give benefits.
> >
> > Well, pglz is actually pretty fast and not as good at compression as
> > other things.  I could certainly see an argument for allowing a column
> > to always be (or at least attempted to be) compressed.
> >
> > There's been a lot of discussion around supporting alternative
> > compression algorithms but making that happen is a pretty big task.
>
> Yeah; pglz is closer to zlib.  There's much faster stuff out
> there...Andres summed it up pretty well;
> https://www.postgresql.org/message-id/20130605150144.GD28067%40alap2.anarazel.de
>
> There are also some interesting discussions on jsonb specific
> discussion approaches.

Oh yes, having a dictionary would be a great start to reducing the
size of the jsonb data, though it could then become a contention point
if there's a lot of new values being inserted and such.  Naturally
there would also be a cost to pulling that data back out as well but
likely it would be well worth the benefit of not having to store the
field names repeatedly.

Then again, taken far enough, what you end up with are tables... :)

Thanks!

Stephen
Re: PostgreSQL VS MongoDB: a use case comparison
Greetings,

* Merlin Moncure (mmonc...@gmail.com) wrote:
> On Mon, Nov 19, 2018 at 11:26 AM Stephen Frost wrote:
> > Looks like a lot of the difference being seen and the comments made
> > about one being faster than the other are because one system is
> > compressing *everything*, while PG (quite intentionally...) only
> > compresses the data sometimes- once it hits the TOAST limit.  That
> > likely also contributes to why you're seeing the on-disk size
> > differences that you are.
>
> Hm.  It may be intentional, but is it ideal?  Employing datum
> compression in the 1kb-8kb range with a faster but less compressing
> algorithm could give benefits.

Well, pglz is actually pretty fast and not as good at compression as
other things.  I could certainly see an argument for allowing a column
to always be (or at least attempted to be) compressed.

There's been a lot of discussion around supporting alternative
compression algorithms but making that happen is a pretty big task.

Thanks!

Stephen
Re: PostgreSQL VS MongoDB: a use case comparison
Greetings,

* Fabio Pardi (f.pa...@portavita.eu) wrote:
> thanks for your feedback.

We prefer on these mailing lists to not top-post but instead to reply
inline, as I'm doing here.  This helps the conversation by eliminating
unnecessary dialogue and being able to make comments regarding
specific points clearly.

> I agree with you the compression is playing a role in the comparison.
> Probably there is a toll to pay when the load is high and the CPU
> stressed from de/compressing data. If we will be able to bring our
> studies that further, this is definitely something we would like to measure.

I was actually thinking of the compression as having more of an impact
with regard to the 'cold' cases because you're pulling fewer blocks
when it's compressed.  The decompression cost on CPU is typically
much, much less than the cost to pull the data off of the storage
medium.  When things are 'hot' and in cache then it might be
interesting to question if the compression/decompression is worth the
cost.

> I also agree with you that at the moment Postgres really shines on
> relational data. To be honest, after seeing the outcome of our research,
> we are actually considering to decouple some (or all) fields from their
> JSON structure. There will be a toll to be payed there too, since we are
> receiving data in JSON format.

PostgreSQL has tools to help with this, you might look into
'json_to_record' and friends.

> And the toll will be in time spent to deliver such a solution, and
> indeed time spent by the engine in doing the conversion. It might not be
> that convenient after all.

Oh, the kind of reduction you'd see in space from both an on-disk and
in-memory footprint would almost certainly be worth the tiny amount of
CPU overhead from this.

> Anyway, to bring data from JSON to a relational model is out of topic
> for the current discussion, since we are actually questioning if
> Postgres is a good replacement for Mongo when handling JSON data.

This narrow viewpoint isn't really sensible though- what you should be
thinking about is what's appropriate for your *data*.  JSON is just a
data format, and while it's alright as a system inter-exchange format,
it's rather terrible as a storage format.

> As per sharing the dataset, as mentioned in the post we are handling
> medical data. Even if the content is anonymized, we are not keen to
> share the data structure too for security reasons.

If you really want people to take your analysis seriously, others must
be able to reproduce your results.  I certainly appreciate that there
are very good reasons that you can't share this actual data, but your
testing could be done with completely generated data which happens to
be similar in structure to your data and have similar frequency of
values.

The way to approach generating such a data set would be to aggregate
up the actual data to a point where the appropriate committee/board
agree that it can be shared publicly, and then you build a randomly
generated set of data which aggregates to the same result and then use
that for testing.

> That's a pity I know but i cannot do anything about it.
> The queries we ran and the commands we used are mentioned in the blog
> post but if you see gaps, feel free to ask.

There were a lot of gaps that I saw when I looked through the article-
starting with things like the actual CREATE TABLE command you used,
and the complete size/structure of the JSON object, but really what a
paper like this should include is a full script which creates all the
tables, loads all the data, runs the analysis, calculates the results,
etc.

Thanks!

Stephen
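For the record, the 'json_to_record' approach mentioned above turns an incoming JSON document into typed columns in one step (the document and column list here are toy examples):

```sql
-- Expand a JSON document into a typed record.  Keys not named in the
-- AS clause are ignored; keys missing from the document come back NULL.
SELECT *
FROM json_to_record('{"patient_id": 42, "admitted": "2018-11-01"}'::json)
       AS t(patient_id int, admitted date);
```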
Re: PostgreSQL VS MongoDB: a use case comparison
Greetings,

* Fabio Pardi (f.pa...@portavita.eu) wrote:
> We are open to any kind of feedback and we hope you enjoy the reading.

Looks like a lot of the difference being seen and the comments made
about one being faster than the other are because one system is
compressing *everything*, while PG (quite intentionally...) only
compresses the data sometimes- once it hits the TOAST limit.  That
likely also contributes to why you're seeing the on-disk size
differences that you are.

Of course, if you want to see where PG will really shine, you'd stop
thinking of data as just blobs of JSON and actually define individual
fields in PG instead of just one 'jsonb' column, especially when you
know that field will always exist (which is obviously the case if
you're building an index on it, such as your MarriageDate) and then
remove those fields from the jsonb and just reconstruct the JSON when
you query.  Doing that you'll get the size down dramatically.

And that's without even going to that next-level stuff of actual
normalization where you pull out duplicate data from across the JSON
and have just one instance of that data in another, smaller, table and
use a JOIN to bring it all back together.  Even better is when you
realize that then you only have to update one row in this other table
when something changes in that subset of data, unlike when you
repeatedly store that data in individual JSON entries all across the
system and such a change requires rewriting every single JSON object
in the entire system...

Lastly, as with any performance benchmark, please include full
details- all scripts used, all commands run, all data used, so that
others can reproduce your results.  I'm sure it'd be fun to take your
json data and create actual tables out of it and see what it'd be like
then.

Thanks!

Stephen
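The field-extraction idea above can be sketched like this (table and column names are invented): the always-present MarriageDate lives in a real column, the rest stays in jsonb, and the original document is rebuilt on the way out.

```sql
CREATE TABLE person (
    id            bigint PRIMARY KEY,
    marriage_date date,    -- promoted out of the JSON document
    extra         jsonb    -- everything else stays as-is
);

-- Reconstruct the original document on read by merging the promoted
-- field back in:
SELECT extra || jsonb_build_object('MarriageDate', marriage_date) AS doc
FROM person;

-- The promoted column can carry an ordinary, much smaller btree index:
CREATE INDEX ON person (marriage_date);
```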
Re: need meta data table/command to find query log
Greetings,

These questions are not appropriate for the 'performance' mailing list
but should be either on 'admin' or 'general'.  Please use the
appropriate list for asking questions in the future.

* Rambabu V (ram.wis...@gmail.com) wrote:
> Please help us to get the query log details from meta data table/command in
> postgresql. aw we are not maintaining log files more than 2 days due to
> lack of space.

It's entirely unclear what you are asking for here when you say "meta
data."  Information about tables is stored in the system catalog,
particularly the "pg_class" and "pg_attribute" tables, but that's
independent from the WAL.  To read the WAL files, you can use
pg_waldump (or pg_xlogdump on older versions), though that's not
'meta' data.

> And also please provide document or sop for database upgrade from 9.3 to
> 9.6, as our database size was 4.5 tb and having table spaces as well. as it
> was production database system we do-not want to take any risk, please help
> us on this as well.

You'll likely want to use pg_upgrade to perform such an upgrade:

https://www.postgresql.org/docs/10/static/pgupgrade.html

Thanks!

Stephen
Re: by mistake dropped physical file dropped for one table.
Greetings,

* Rambabu V (ram.wis...@gmail.com) wrote:
> by mistake one physical file dropped for one of our table, as we do-not
> have backup for this table we are getting below error.
>
> ERROR: could not open file "base/12669/16394": No such file or directory
>
> please help us to recover the table.

You're not likely able to recover that table.

To do so would require completely stopping the system immediately and
attempting to perform filesystem manipulation to "undelete" the file,
or pull back chunks from the filesystem which contain pieces of the
file and attempting to reconstruct it.

If you've been keeping all WAL since the beginning of the cluster,
it's possible you could recover that way, but you claim to not have
any backups, so I'm guessing that's pretty unlikely.

Thanks!

Stephen
Re: Updating large tables without dead tuples
Greetings,

* l...@laurent-hasson.com (l...@laurent-hasson.com) wrote:
> This was done during a maintenance window, and that table is read-only except
> when we ETL data to it on a weekly basis, and so I was just wondering why I
> should pay the "bloat" penalty for this type of transaction. Is there a trick
> that could be use here?

Yes, create a new table and INSERT the data into that table, then swap
the new table into place as the old table.  Another option, if you
don't mind the exclusive lock taken on the table, is to dump the data
to another table, then TRUNCATE the current one and then INSERT into
it.

There's other options too, involving triggers and such to allow
updates and other changes to be captured during this process, avoiding
the need to lock the table, but that gets a bit complicated.

> More generally, I suspect that the MVCC architecture is so deep that
> something like LOCK TABLE, which would guarantee that there won't be
> contentions, couldn't be used as a heuristic to not create dead tuples? That
> would make quite a performance improvement for this type of work though.

I'm afraid it wouldn't be quite that simple, particularly you have to
think about what happens when you issue a rollback...

Thanks!

Stephen
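The create-new-and-swap approach described above looks roughly like this (table and column names are placeholders; note the renames take a brief exclusive lock, and dependent views/foreign keys would need handling too):

```sql
BEGIN;
CREATE TABLE mytable_new (LIKE mytable INCLUDING ALL);

-- The "update" happens as a fresh INSERT, so no dead tuples are left
-- behind in the table that ends up in service:
INSERT INTO mytable_new
SELECT id, upper(code), payload   -- transformed columns go here
FROM mytable;

ALTER TABLE mytable     RENAME TO mytable_old;
ALTER TABLE mytable_new RENAME TO mytable;
COMMIT;

DROP TABLE mytable_old;
```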