Re: [SPAM] [PERFORM] OS cache management
On 06/09/2017 10:12, Soni M wrote: Let's say I have 10 years of data, and the commonly used data is only the last 1 year. This data is quite big, so each table and index file is divided into several files in PGDATA/base. May not be relevant to what you asked, but if you want to keep the last year's data in a "small and fast" dataset separated (physically separated!) from the old data (which is still available, but response times may vary), IMHO you should consider partitioning... https://www.postgresql.org/docs/current/static/ddl-partitioning.html HTH, Moreno.- -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
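As a hedged sketch of the suggestion above (table and column names are hypothetical, and the declarative syntax requires PostgreSQL 10+), a range partition per year keeps the hot last-year data in its own small child table:

```sql
-- Hypothetical time-partitioned table: recent rows live in a small,
-- fast partition; old rows stay available in a separate one.
CREATE TABLE measurement (
    id      bigint NOT NULL,
    logdate date   NOT NULL,
    payload text
) PARTITION BY RANGE (logdate);

CREATE TABLE measurement_2017 PARTITION OF measurement
    FOR VALUES FROM ('2017-01-01') TO ('2018-01-01');

CREATE TABLE measurement_old PARTITION OF measurement
    FOR VALUES FROM ('2007-01-01') TO ('2017-01-01');
```

Queries constrained on logdate then only touch the matching partition, so scans over last-year data never read the old files.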
Re: [PERFORM] Rollback table data.
On 07/06/2017 13:33, Dinesh Chandra 12108 wrote: Dear Expert, Is there any way to rollback table data in PostgreSQL? Not knowing anything else about what you want to do and what context you're in, I can only say that AFAIK, once you have COMMITted a transaction, no rollback is possible. A very quick Google search gave me this: https://stackoverflow.com/questions/12472318/can-i-rollback-a-transaction-ive-already-committed-data-loss There's a long, detailed post from Craig Ringer that gives you some advice on how to proceed (given that a committed transaction cannot be ROLLBACKed). Take a look, I hope it's applicable to your scenario. HTH Moreno.-
Re: [PERFORM] Filter certain range of IP address.
On 07/04/2017 17:56, Rick Otten wrote: On Fri, Apr 7, 2017 at 11:29 AM, David G. Johnston wrote: On Fri, Apr 7, 2017 at 8:18 AM, Dinesh Chandra 12108 wrote: Dear Vinny, Thanks for your valuable reply, but I need a select query which selects only those records that start with IP "172.23.110" from the table below. xxx 172.23.110.175 yyy 172.23.110.178 zzz 172.23.110.177 aaa 172.23.110.176 bbb 172.23.111.180 ccc 172.23.115.26 SELECT ... WHERE substring(ip_addr::text, 1, 10) = '172.23.110' or select ... where ip_addr << '172.23.110/32'; /32 matches one address only (the fourth byte, which we want to ignore), so we need to use /24 (in CIDR notation), which is equal to a 255.255.255.0 subnet mask. My 2 cents Moreno
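A hedged sketch of the corrected query (table and column names are assumed, not from the thread): widening the pattern to a /24 network lets the `<<` containment operator match the whole 172.23.110.x range:

```sql
-- Matches 172.23.110.0 .. 172.23.110.255, ignoring the fourth byte.
SELECT name, ip_addr
FROM hosts                               -- hypothetical table
WHERE ip_addr << inet '172.23.110.0/24';
```

Unlike the substring() version, this stays correct regardless of the text formatting of the address and can use an index on the inet column.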
Re: [PERFORM] Big number of connections
On 04/04/2016 16:54, Artem Tomyuk wrote: 2016-04-04 17:43 GMT+03:00 Moreno Andreo <moreno.and...@evolu-s.it>: Is there a way to monitor active connections, or at least to report when they grow too much? (say, I have an 8-core system and want to track down if, and when, active connections grow over 80) You can achieve that just by running a simple query like select count(*) from pg_stat_activity where state = 'active' Thanks, but this way I get a "sample" at that actual moment: what I'd need is to monitor, or to have something warn me like "Hey, you've got 2000 active connections! Time to grow up!" :-) Cheers, Moreno.-
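One way to turn the one-off sample into a warning (a sketch only; the threshold of 80 comes from the 8-core example above) is to run something like this periodically from cron via psql and alert when the status column is not 'ok':

```sql
SELECT count(*) AS active,
       CASE WHEN count(*) > 80
            THEN 'WARNING: too many active connections'
            ELSE 'ok'
       END AS status
FROM pg_stat_activity
WHERE state = 'active';
```

A proper monitoring system (Zabbix, Nagios, etc.) can run the same query and handle the alerting and trending for you.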
Re: [PERFORM] Big number of connections
On 04/04/2016 15:33, Pavel Stehule wrote: PostgreSQL doesn't contain an integrated pooler - so any connection to Postgres enforces one PostgreSQL process. Performance benchmarks show maximum performance at about 10x cores. With a high number of connections you have to use a low work_mem, which can have a negative impact on performance too. Too high a number of active PostgreSQL processes increases the risk of performance problems with spin locks, etc. :-O I wasn't absolutely aware of this... is there a way to monitor active connections, or at least to report when they grow too much? (say, I have an 8-core system and want to track down if, and when, active connections grow over 80) Thanks Moreno.-
Re: [SPAM] Re: [PERFORM] Architectural question
On 23/03/2016 19:51, Jim Nasby wrote: On 3/23/16 4:14 AM, Moreno Andreo wrote: The main goal is to be *quick*. A doctor with a patient on the other side of his desk does not want to wait, say, 30 seconds for a clinical record to open. Let me explain the main problem (actually there are 2 problems). 1. I'm handling health data, and sometimes they store large images (say a hi-res image of an x-ray). When their team mates (spread all over the city, not in the same building) ask for that bitmap (that is, 20 megabytes), surely it can't be cached (images are loaded only if requested by the user), and searching a 35k-row, 22 GB table for the matching image should not be that fast, even with proper indexing (patient record number) Why wouldn't that be fast? Unless the TOAST table for that particular table is pretty fragmented, I'm running on Debian with an ext4 file system. I'm not expecting fragmentation. Am I wrong? pulling up thumbnails should be very fast. I'd expect it to be the cost of reading a few pages sequentially. I'm not extracting thumbnails. I have a layout that is similar to an email client, with all rows with data and, in a column, a clip that lets the user load the real image, not its thumbnail. If you're mixing all your blobs together, then you might end up with a problem. It might be worth partitioning the blob table based on the size of what you're storing. OK, I went to the documentation and read about partitioning :-) I knew about inheritance, but I was totally unaware of partitioning. Today is a good day, because I've learned something new. You're saying that it would be better to create, for example, a table for blobs < 1 MB, another for blobs between 1 and 5 MB and another for blobs > 5 MB? And what about the master table? Should it be one of these three? Blob data and size are unpredictable (from 2k RTF to 20 MB JPG), 2. 
When I load the patient list, their photos must be loaded as well, because when I click on a table row, a small preview is shown (including a small thumbnail of the patient's photo). Obviously I can't load all thumbs while loading the whole patient list (the list can be up to 4-5000 records and photo size is about 4-500 kBytes), so it would be an enormous amount of data to download. I would think a thumbnail would be 30-40k or less, not 500k. You have a point. We advised the users of that, but they don't care, or simply don't know what they are doing. We need to change the application to accept max 50k files. It sounds like part of the problem is you should keep the thumbnails separate from the high-res file. But really you should probably do that for everything... I suspect there are parts of the UI where you want to display a fairly low-res version of something like an x-ray, only pulling the raw image if someone actually needs it. That's what we are doing. Thumbnails are only patient portraits, while no other blob (clinical scans) is read until someone asks for it. Thanks Moreno.
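Jim's size-based split could look roughly like this on a current PostgreSQL (11+, since it partitions on an expression and can move rows between partitions on UPDATE); all names are hypothetical, and at the time of this thread the same idea would have required inheritance-based partitioning instead:

```sql
-- Sketch: partition blobs by payload size so small, frequently-read
-- blobs stay together in a compact partition.
CREATE TABLE blobs (
    patient_id bigint NOT NULL,
    data       bytea  NOT NULL
) PARTITION BY RANGE ((octet_length(data)));

CREATE TABLE blobs_small  PARTITION OF blobs
    FOR VALUES FROM (0) TO (1048576);           -- < 1 MB
CREATE TABLE blobs_medium PARTITION OF blobs
    FOR VALUES FROM (1048576) TO (5242880);     -- 1 to 5 MB
CREATE TABLE blobs_large  PARTITION OF blobs
    FOR VALUES FROM (5242880) TO (MAXVALUE);    -- >= 5 MB
```

The parent table (`blobs` here) holds no rows itself; the application keeps inserting into it and rows are routed to the right child automatically.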
Re: [PERFORM] Architectural question
On 23/03/2016 13:29, Mike Sofen wrote: -Original Message- From: Thomas Kellerer, Wednesday, March 23, 2016 2:51 AM: Jim Nasby wrote on 11.03.2016 at 17:37: If the blob is in the database then you have nothing extra to do. It's handled just like all your other data. If it's a file in a file system then you need to: - Have application code that knows how and where to get at the file - Have a way to make those files available on all your webservers - Have completely separate backup and recovery plans for those files That's a lot of extra work. Sometimes it's necessary, but many times it's not. Don't forget the code you need to write to properly handle transactional access (writing, deleting) to the files. You usually also need to distribute the files over many directories. Having millions of files in a single directory is usually not such a good idea. In my experience you also need some cleanup job that removes orphaned files from the file system. Because no matter how hard you try to get updates/writes to the file system right, at some point this fails. Also, from a security point of view, having this in the database is more robust than in the file system. The downside of bytea is that you can't stream it to the client. The application always needs to read the whole blob into memory before it can be used. This might put some memory pressure on the application server. Thomas This is really an excellent conversation, and highlights the never-ending contemplation of blob storage. That's like discussing politics or religion :-) I've had to go through this dialog in two different industries - healthcare and now genomics, creating a new EMR (electronic medical record) system and storing and manipulating huge genomic data sets. I have, in both cases, ended up leaving the blob-type data outside of the database. 
Even though, as Thomas mentioned, it requires more database and app code to manage, it ends up allowing both systems to be optimized for their respective duties. Our approach, still maintaining BLOBs in the database, is quite a hybrid, because BLOBs are not spread among DB tables; we have a dedicated table, with appropriate indexing, where 95% of our blobs (and 99% of blob storage) reside, so if we need a quick dump we can exclude the BLOBs table or treat it in a separate way (i.e. the backup utility in our app is made of two separate steps, clinical data and blobs). As I wrote in a previous post, we have our blobs encrypted, so it's handier keeping them in the DB rather than saving them to files (and, I think, quicker when the user requests one of them). In addition, the vastly smaller database sizes result in far faster backups and restores, transactional replication maintains its speed, and in general, I find the fault-tolerant behaviors to be excellent. Yes, losing track of a file would be very bad, and... we're only storing things like x-ray photos or CT scans (healthcare), or genomic processing results. In both cases, usually, the results can be recreated. That said, I've never lost a file so haven't needed to pull on that lever. In our case we have to assume that blob contents cannot be recreated. Patients can change family doctor... if a trial arises and a critical document is lost, he's on his own. That's why we have a daily automatic backup policy on the customer's local server. My latest model is placing large genomic data onto the AWS S3 file system, keeping all of the metadata inside the database. It's working very well so far, but we're still in development. Mike
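The two-step backup described above (clinical data first, blobs separately) can be approximated with pg_dump's table filters; the table and database names are hypothetical, not the poster's actual backup utility:

```
pg_dump --exclude-table-data='blobs' dbname > clinical_data.sql
pg_dump --table='blobs' --data-only  dbname > blobs.sql
```

--exclude-table-data keeps the blob table's schema in the first dump (so a restore recreates it empty), while the second dump carries only the blob rows and can run on a different schedule.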
Re: [PERFORM] Architectural question
On 23/03/2016 10:50, Thomas Kellerer wrote: Jim Nasby wrote on 11.03.2016 at 17:37: If the blob is in the database then you have nothing extra to do. It's handled just like all your other data. If it's a file in a file system then you need to: - Have application code that knows how and where to get at the file - Have a way to make those files available on all your webservers - Have completely separate backup and recovery plans for those files That's a lot of extra work. Sometimes it's necessary, but many times it's not. Don't forget the code you need to write to properly handle transactional access (writing, deleting) to the files. You usually also need to distribute the files over many directories. Having millions of files in a single directory is usually not such a good idea. In my experience you also need some cleanup job that removes orphaned files from the file system. Because no matter how hard you try to get updates/writes to the file system right, at some point this fails. Also, from a security point of view, having this in the database is more robust than in the file system. The downside of bytea is that you can't stream it to the client. The application always needs to read the whole blob into memory before it can be used. This might put some memory pressure on the application server. Thomas I just wrote about this in the message I sent a few minutes ago. We have blobs in a reserved table in each customer database, so we can keep up with privacy, since every blob is encrypted... so no extra work :-) Thanks Moreno.-
Re: [SPAM] Re: [PERFORM] Architectural question
On 11/03/2016 17:37, Jim Nasby wrote: On 2/22/16 8:40 AM, Moreno Andreo wrote: On 18/02/2016 21:33, Jim Nasby wrote: Depending on your needs, you could use synchronous replication as part of that setup. You can even do that at a per-transaction level, so maybe you use sync rep most of the time, and just turn it off when inserting or updating BLOBs. This sounds good, and when everything is OK we have I/O operations split across the two servers; a small delay in synchronizing blobs should not be a big deal, even if something bad happens (because of XLOG), right? It all depends on what you can tolerate. You also don't have to use synchronous replication; normal streaming replication is async, so if you can stand to lose some data if one of the servers dies then you can do that. I can't tolerate data loss, so synchronous replication is mandatory (I had a case this week of a customer asking for an old document that I couldn't find in the database, even though the "attach present" flag was true... and I had a bit of a hard time trying to convince the customer it was his fault... :-) ) Last thing: should blobs (or the whole database directory itself) go in a different partition, to optimize performance, or in a VM environment is this not a concern anymore? First: IMO concerns about blobs in the database are almost always overblown. In many places I've been they end up saying "BLOBs are slow". So I considered this as another point to analyze while designing the server architecture. If you say "don't mind", then I won't. It all depends. They're certainly a lot slower than handling a single int, but in many cases the difference just doesn't matter. The main goal is to be *quick*. A doctor with a patient on the other side of his desk does not want to wait, say, 30 seconds for a clinical record to open. Let me explain the main problem (actually there are 2 problems). 1. 
I'm handling health data, and sometimes they store large images (say a hi-res image of an x-ray). When their team mates (spread all over the city, not in the same building) ask for that bitmap (that is, 20 megabytes), surely it can't be cached (images are loaded only if requested by the user), and searching a 35k-row, 22 GB table for the matching image should not be that fast, even with proper indexing (patient record number) 2. When I load the patient list, their photos must be loaded as well, because when I click on a table row, a small preview is shown (including a small thumbnail of the patient's photo). Obviously I can't load all thumbs while loading the whole patient list (the list can be up to 4-5000 records and photo size is about 4-500 kBytes), so it would be an enormous amount of data to download. 30GB of blobs on modern hardware really isn't a big deal, and there's a *lot* to be said for not having to write the extra code to manage all that by hand. What do you mean? Extra code? If the blob is in the database then you have nothing extra to do. It's handled just like all your other data. If it's a file in a file system then you need to: - Have application code that knows how and where to get at the file - Have a way to make those files available on all your webservers - Have completely separate backup and recovery plans for those files That's a lot of extra work. Sometimes it's necessary, but many times it's not. In my case I think it's not necessary, since all blobs go into a bytea field in a table that's just for them. It's an approach that helps us keep up with privacy, since all blobs are encrypted, and can be accessed only by the application. When it comes to your disk layout, the first things I'd look at would be: - Move the temporary statistics directory to a RAM disk - Move pg_xlog to its own partition So I need another vDisk, not that big, for pg_xlog? Yeah, but note that with virtualization that may or may not help. I was afraid of that. 
With virtualization we are bound to the hardware lying behind us, which we can't see or control. Even if we create 2 vDisks, they could be bound to the same host spindles, and so having two vDisks would be completely useless. I'm thinking of increasing checkpoint_segments. In the next two weeks I should have the VM deployed, so I'll see what I'll have in terms of speed and response (looking at the amount we are paying, I hope it will be a very FAST machine... :-D) Thanks Moreno.-
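For reference, the checkpoint tuning mentioned above is a postgresql.conf change; a hedged fragment with illustrative values (and note that on 9.5+ checkpoint_segments was replaced by max_wal_size):

```
checkpoint_segments = 32            # pre-9.5 only; use max_wal_size on 9.5+
checkpoint_completion_target = 0.9  # spread checkpoint writes over more time
```

Raising these reduces checkpoint frequency and smooths write I/O, at the cost of a longer crash recovery and more WAL disk usage.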
Re: [SPAM] Re: [PERFORM] autovacuum disk IO
On 02/03/2016 19:40, Alvaro Herrera wrote: Scott Marlowe wrote: On Wed, Mar 2, 2016 at 9:11 AM, Moreno Andreo <moreno.and...@evolu-s.it> wrote: ... or maybe add some more RAM to have more disk caching (if you're on *nix); this worked for me in the past... even if IMHO it's more a temporary "patch" while upgrading (if it can't be done in a hurry) than a real solution... Oh yeah, definitely worth looking at. But RAM can't speed up writes, just reads, so it's very workload dependent. If your IO subsystem is maxing out on writes, faster drives / IO. If it's maxing out on reads, more memory. But if your dataset is much bigger than memory (say 64GB RAM and a 1TB data store) then more RAM isn't going to be the answer. In the particular case of autovacuum, it may be helpful to create a "ramdisk" and put the stats temp file in it. Definitely. In my new server (as I've been taught here :-) ) I'm going to put stats in a ramdisk and pg_xlog in another partition.
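The ramdisk-for-stats idea sketched above usually means a small tmpfs mount plus one postgresql.conf line; the path and size here are hypothetical:

```
# /etc/fstab
tmpfs  /var/run/pg_stats_tmp  tmpfs  size=64M,uid=postgres,gid=postgres  0  0

# postgresql.conf
stats_temp_directory = '/var/run/pg_stats_tmp'
```

The stats temp files are rewritten constantly but are disposable (PostgreSQL rebuilds them at startup), which is what makes a RAM disk safe here.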
Re: [SPAM] Re: [PERFORM] autovacuum disk IO
On 02/03/2016 16:49, Scott Marlowe wrote: On Wed, Mar 2, 2016 at 8:45 AM, Scott Marlowe wrote: On Wed, Mar 2, 2016 at 8:25 AM, Artem Tomyuk wrote: Hi. I've noticed that an autovacuum process ran for more than 10 minutes, during which Zabbix logged more than 90% IO disk utilization on the db volume ===>29237 2016-03-02 15:17:23 EET 0 [24-1]LOG: automatic vacuum of table "lb_upr.public._reference32": index scans: 1 pages: 0 removed, 263307 remain tuples: 298 removed, 1944753 remain, 0 are dead but not yet removable buffer usage: 67814 hits, 265465 misses, 15647 dirtied avg read rate: 3.183 MB/s, avg write rate: 0.188 MB/s system usage: CPU 5.34s/6.27u sec elapsed 651.57 sec Is it possible to log autovacuum's IO impact during its execution? Is there any way to limit or "nice" the autovacuum process? I'll assume you're running a fairly recent version of PostgreSQL. There are a few settings that adjust how hard autovacuum works when it's working. autovacuum_max_workers tells autovacuum how many threads to vacuum with. Lowering this will limit the impact of autovacuum, but generally the default setting of 3 is reasonable on most machines. autovacuum_vacuum_cost_delay sets how long to wait between internal rounds. Raising this makes autovacuum take bigger pauses internally. The default of 20ms is usually large enough to keep you out of trouble, but feel free to raise it and see if your IO utilization lowers. autovacuum_vacuum_cost_limit sets a limit to how much work to do between the pauses set by the cost delay above. Lowering this will cause autovacuum to do less work between pauses. Most of the time I'm adjusting these I'm making vacuum more aggressive, not less aggressive, because vacuum falling behind is a problem on the large, fast production systems I work on. In your case you want to watch for when autovacuum IS running, and using a tool like vmstat or iostat or iotop, watch it for % utilization. 
You can then adjust cost delay and cost limit to make it less aggressive and see if your IO utilization goes down. Note though that 90% utilization isn't 100%, so it's not likely flooding the IO. But if you, say, raise cost delay from 20 to 40ms, it might drop to 75% or so. The primary goal here is to arrive at numbers that let autovacuum keep up with reclaiming the discarded tuples in the database without getting in the way of the workload. If your workload isn't slowing down, or isn't slowing down very much, during autovacuum then you're OK. Just to add a point here: if your machine can't keep up with the production load AND the job of vacuuming, then your IO subsystem is too slow and needs upgrading. The difference between a pair of spinning 7200RPM drives and a pair of enterprise-class SSDs (always with power-off-safe writes etc.; consumer SSDs can eat your data on power off) can be truly huge. I've seen improvements from a few hundred transactions per second to thousands of transactions per second from a simple upgrade like that. ... or maybe add some more RAM to have more disk caching (if you're on *nix); this worked for me in the past... even if IMHO it's more a temporary "patch" while upgrading (if it can't be done in a hurry) than a real solution... Cheers Moreno.
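The knobs Scott walks through live in postgresql.conf; a hedged fragment showing the defaults he mentions (raise the delay or lower the limit to make autovacuum gentler on I/O):

```
autovacuum_max_workers       = 3     # concurrent autovacuum worker processes
autovacuum_vacuum_cost_delay = 20ms  # pause between internal vacuum rounds
autovacuum_vacuum_cost_limit = -1    # -1 = fall back to vacuum_cost_limit (200)
```

After changing these, a `pg_ctl reload` (or `SELECT pg_reload_conf();`) is enough; no restart is needed.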
Re: [SPAM] Re: [PERFORM] Architectural question
On 18/02/2016 21:33, Jim Nasby wrote: Just before we go on, I have to say that I've been working with PostgreSQL for about 10 years now, but while in the past "leave everything as it is" worked, in the last 15 months I began to research and study how to improve my server's performance, so I'm quite a novice at being a DBA (but a novice who, when needed, reads a lot of documentation :-) ) So, if some questions may sound "strange" or "noobish" to you, that's the reason. On 2/11/16 12:06 PM, Moreno Andreo wrote: Now, the actual question is: Having a VM that can be upgraded with a click on a panel and a reboot, and whose server faults are only related to an OS failure, should I keep a single-server solution (but I fear that I/O throughput will become even more inadequate) or is it convenient to migrate to a 2-server system? And, in case of a 2-server configuration, what would you recommend? Much of that depends on your disaster recovery strategy. I'm planning to have a cron job that backs up data (only data) overnight (I was thinking of something like pg_dumpall) and takes a snapshot of the whole server over the weekend (if I'm not wrong, VMWare allows live snapshots), so if something bad happens, I'll recover the snapshot from the last save and restore all databases from the latest backup. Scenario 1: Given 350 databases, I split them in 2, 175 on server 1 and 175 on server 2, having pgBouncer resolve the connections, and each server has its own workload Scenario 2: Server 1 -> Master, Server 2 -> Slave (replicated with Slony or...?), Server 1 for writes, Server 2 for reads Personally I'd do kind of a hybrid at this point. First, I'd split the masters across both servers, with a way to easily fail over if one of the servers dies. Next, I'd get streaming replication set up so that the half with masters on A has replicas on B and vice-versa. That way you can easily recover from one server or the other failing. 
Depending on your needs, you could use synchronous replication as part of that setup. You can even do that at a per-transaction level, so maybe you use sync rep most of the time, and just turn it off when inserting or updating BLOBs. This sounds good, and when everything is OK we have I/O operations split across the two servers; a small delay in synchronizing blobs should not be a big deal, even if something bad happens (because of XLOG), right? Last thing: should blobs (or the whole database directory itself) go in a different partition, to optimize performance, or in a VM environment is this not a concern anymore? First: IMO concerns about blobs in the database are almost always overblown. In many places I've been they end up saying "BLOBs are slow". So I considered this as another point to analyze while designing the server architecture. If you say "don't mind", then I won't. 30GB of blobs on modern hardware really isn't a big deal, and there's a *lot* to be said for not having to write the extra code to manage all that by hand. What do you mean? Extra code? When it comes to your disk layout, the first things I'd look at would be: - Move the temporary statistics directory to a RAM disk - Move pg_xlog to its own partition So I need another vDisk, not that big, for pg_xlog? Those don't always help, but frequently they do. And when they do, it usually makes a big difference. Beyond that, there might be some advantage to putting blobs on their own tablespace. Hard to say without trying it. I'm thinking about it, because while most of the blobs are < 1 MB, there are some that reach 20, 50 and even 100 megabytes, and I'm quite concerned about the overall performance of the whole system (even if it's on modern hardware, 100 megs to extract are not that fast...) when these have to be sent to whoever is requesting them... 
So, my ideas are clearer now, but the first step is to decide whether there's need for only one server (my budget will be happier, because they seem very good, but quite expensive, at GCP...) or it's best with two, using pgBouncer, and where to put pgBouncer... :-) Thanks Moreno
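Jim's per-transaction sync-rep idea maps to the synchronous_commit setting, which can be changed for a single transaction; the blob statement below is a hypothetical placeholder:

```sql
BEGIN;
-- Only this transaction skips waiting for the synchronous standby;
-- every other transaction keeps full synchronous replication.
SET LOCAL synchronous_commit = local;
-- ... large blob INSERT/UPDATE here ...
COMMIT;
```

SET LOCAL reverts automatically at COMMIT or ROLLBACK, so there's no risk of accidentally leaving the session in the relaxed mode.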
[PERFORM] Architectural question
Hi everyone, I have a question that I hope fits in this discussion group. I'm migrating my current server to a new, more powerful architecture on Google Cloud Platform. ATM the server is a VM with 4 vCPUs (the host has 4 Xeon E2xx 3.1 GHz CPUs, if I remember correctly) and 32 GB RAM, running Ubuntu Server 12.04 and PostgreSQL 9.1. The server contains about 350 DBs and the same number of roles (every role has its own database). Databases are made of about 75 tables that can contain blobs (one table in particular contains blobs), and single blob size can grow up to 20-25 megabytes. ATM our biggest DB is about 30 GB, 95% made of blobs. Apart from user growth, which means more resource consumption, we are starting new services that will have more and more impact on the databases. I read about how blobs are SLOW and I'm a bit worried about how to manage them. Now, the actual question is: Having a VM that can be upgraded with a click on a panel and a reboot, and whose server faults are only related to an OS failure, should I keep a single-server solution (but I fear that I/O throughput will become even more inadequate) or is it convenient to migrate to a 2-server system? And, in case of a 2-server configuration, what would you recommend? Scenario 1: Given 350 databases, I split them in 2, 175 on server 1 and 175 on server 2, having pgBouncer resolve the connections, and each server has its own workload Scenario 2: Server 1 -> Master, Server 2 -> Slave (replicated with Slony or...?), Server 1 for writes, Server 2 for reads Last thing: should blobs (or the whole database directory itself) go in a different partition, to optimize performance, or in a VM environment is this not a concern anymore? I tried to be as brief as possible; if you need some more details just ask :-) Thanks in advance, Moreno.
[PERFORM] How to find the culprit in server load spikes?
Hi everyone, I host a PostgreSQL server on Ubuntu 12.04 and I am facing server load spikes (if I run top, it goes up to 25-30 on a 4-core system)... In some cases, I have to restart the postgresql service because users call us complaining of the slowness, but in other cases I can leave things as they are and I see that after a bunch of minutes (about 5-10) the situation drops back to normal (0.50-2 load). The problem is, as in most cases, the I/O, but I need a hand with methods or tools that can help me investigate who or what is causing these spikes. Any help would be appreciated. Best regards, Moreno.
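A first step for pinning down such spikes (a sketch, not a full recipe) is to make PostgreSQL log its slow statements and checkpoints, and to enable the pg_stat_statements extension; in postgresql.conf:

```
log_min_duration_statement = 1000                 # log statements slower than 1s
log_checkpoints = on                              # checkpoints can cause I/O bursts
shared_preload_libraries = 'pg_stat_statements'   # cumulative per-query statistics
```

shared_preload_libraries requires a restart, followed by CREATE EXTENSION pg_stat_statements in the database; from then on, querying the pg_stat_statements view during a spike shows which statements are doing the most work, and tools like iotop or iostat show whether a backend, autovacuum, or a checkpoint is generating the I/O.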