Re: [PERFORM] Write workload is causing severe slowdown in Production
> Are those wrapped in a transaction or not? Each transaction forces a fsync
> when committing, and if each of those INSERT/UPDATE statements stands on
> it's own it may cause of lot of I/O.

Yes, it's wrapped inside a transaction. Maybe this could be a reason for the slowdown, as you've highlighted here. At least we've now got some guidance to troubleshoot this aspect as well.

> There are tools that claim to remove the object vs. relational discrepancy
> when accessing the database. They often generate queries on the fly, and
> some of the queries are pretty awful (depends on how well the ORM model is
> defined). There are various reasons why this may suck - loading too much
> data, using lazy fetch everywhere etc.

Thanks for the clarification.

> Are you using something like Hibernate, JPA, ... to handle persistence?

No, we're not using any persistence frameworks/libraries as such.

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
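To make the fsync point concrete: if the Java program runs in autocommit mode, every single INSERT/UPDATE is its own transaction and forces its own commit-time fsync, while an explicit transaction pays that cost once for the whole batch. A minimal sketch (table and column names here are hypothetical, just for illustration):

```sql
-- Autocommit: each statement is its own transaction => one fsync per row.
-- Batched: the whole group shares a single commit-time fsync.
BEGIN;
INSERT INTO work_items (id, payload) VALUES (1, 'a');
INSERT INTO work_items (id, payload) VALUES (2, 'b');
UPDATE work_items SET payload = 'c' WHERE id = 1;
COMMIT;
```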
Re: [PERFORM] Write workload is causing severe slowdown in Production
First off, thank you *so much* for that detailed explanation comparing with a real-world application performance benchmark; it was really enlightening for me.

> How are you handling concurrency? (Are you using FOR SHARE on your
> SELECT statements? Are you explicitly acquiring table locks before
> modifying data? Etc.) You might be introducing blocking somehow.

No, actually I'm not explicitly locking any tables -- all are *simple* SELECT, UPDATE, and INSERT statements only.

> In particular, I recommend that you *never* leave transactions open
> or hold locks while waiting for user response or input.

Again, we're not leaving any transactions open while waiting for user responses, etc.

> When you hit that issue, there is not a continual slowdown --
> queries which normally run very fast (a small fraction of a second)
> may periodically all take tens of seconds. Is that the pattern
> you're seeing?

Yes, you're correct. Queries that normally run fast become slow at the time of this slowdown.

> Besides the outdated PostgreSQL release and possible blocking, I
> would be concerned if you are using any sort of ORM for the update
> application. You want to watch that very closely because the
> default behavior of many of them does not scale well. There's
> usually a way to get better performance through configuration and/or
> bypassing automatic query generation for complex data requests.

I'm not able to clearly understand the statement above ("...any sort of ORM for the update application..."). Can you help me understand it?
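To rule blocking in or out while the slowdown is actually happening, I could also query the 8.2-era catalogs directly. A sketch (assuming the pg_stat_activity/pg_locks columns as they exist in 8.2; stats_command_string must be on for current_query to be populated):

```sql
-- Sessions currently waiting on a lock (8.2 uses "procpid" and a
-- boolean "waiting" column)
SELECT procpid, usename, waiting, current_query
FROM pg_stat_activity
WHERE waiting;

-- Ungranted lock requests and what they are waiting for
SELECT locktype, relation::regclass, mode, pid
FROM pg_locks
WHERE NOT granted;
```

If both queries come back empty during a slow period, the stalls are more likely I/O (e.g. checkpoints) than lock contention.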
Re: [PERFORM] Write workload is causing severe slowdown in Production
> There's a checkpoint_warning option. Set it to 3600 and you should get
> messages in the log.

I have a basic question about setting the "checkpoint_warning" parameter. The 8.2 documentation (http://www.postgresql.org/docs/8.2/interactive/runtime-config-wal.html) says:

"Write a message to the server log if checkpoints caused by the filling of checkpoint segment files happen closer together than this many seconds (which suggests that checkpoint_segments ought to be raised). The default is 30 seconds (30s)."

How does increasing the default 30s to 3600s (which is 1 hour) cause messages to be printed to the log? Even after reading the description in the docs, I'm not able to understand this point clearly. Can you help me understand it?
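My current reading of it (please correct me if this is wrong): the warning fires only when a segment-triggered checkpoint happens within checkpoint_warning seconds of the previous checkpoint. Widening the window to 3600 therefore logs essentially every checkpoint caused by filling checkpoint_segments, which makes it usable as a poor man's checkpoint log on 8.2. As a config fragment:

```ini
# postgresql.conf -- sketch of the suggested setting
checkpoint_warning = 3600   # log any segment-driven checkpoint occurring
                            # within an hour of the previous checkpoint
```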
Re: [PERFORM] Write workload is causing severe slowdown in Production
> So, what else is running on the system? Because if there's 35GB RAM and
> the shared buffers are 1.5GB, then there's about 33GB for page cache.
> Something like 16GB would be a conservative setting.

Yes, you guessed it right. Both the web server and the DB server are running on the same machine.

> I'm not saying this will fix the issues, but maybe it shows that something
> else is running on the box and maybe that's the culprit, not PostgreSQL?

Based on my observations and analysis, I'm sure that the database write activity *is* causing the slowdown, not the other applications running on the same server.

> Then install the app again and collect as much info as possible. Otherwise
> it's all just wild guesses.

OK.
Re: [PERFORM] Write workload is causing severe slowdown in Production
> There's a checkpoint_warning option. Set it to 3600 and you should get
> messages in the log. Correlate those to the issues (do they happen at the
> same time?).

After setting "checkpoint_warning" to 3600, can you explain how I should correlate the issues with the log messages?

> If you can, install iotop and watch the processes that cause the I/O.

I tried installing "iotop", but it failed to run because it requires Linux >= 2.6.20. Our CentOS 5.2 kernel is 2.6.18-8.

> What we need is more details about your setup, especially
> - checkpoint_segments
> - checkpoint_timeout
> - shared_buffers

# - Memory -
shared_buffers = 1536MB

# - Planner Cost Constants -
effective_cache_size = 4GB

# - Checkpoints -
checkpoint_segments = 32
checkpoint_timeout = 5min
checkpoint_warning = 270s

# - Background writer -
bgwriter_delay = 200ms
bgwriter_lru_percent = 1.0
bgwriter_lru_maxpages = 5
bgwriter_all_percent = 0.333
bgwriter_all_maxpages = 5

> also it'd be nice to have samples from the vmstat/iostat and messages from
> the log.

Unfortunately, I don't have the "exact" logs from when the problem actually happened.
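If checkpoint spikes turn out to be the culprit, one common 8.2-era mitigation (my own assumption here, not something settled in this thread) is to let the "all" background writer push more dirty pages out between checkpoints, so each checkpoint has less to flush at once. A config sketch with illustrative, untuned values:

```ini
# Hypothetical smoothing tweaks for 8.2 -- not advice from this thread,
# just the usual knobs for spreading writes out between checkpoints:
bgwriter_delay = 200ms
bgwriter_all_percent = 1.0      # scan more of shared_buffers per round
bgwriter_all_maxpages = 100     # allow more pages written per round
```

This trades steady background I/O for smaller checkpoint spikes, so it needs to be validated against iostat/vmstat samples.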
[PERFORM] Write workload is causing severe slowdown in Production
Hi,

We're running a web-based application powered by PostgreSQL. Recently, we developed a "new" separate Java-based standalone (daemon process) threaded program that performs both read and write operations heavily on 2 "huge" tables. One table has 5.4 million records and the other has 1.3 million records. Moreover, more than one read and/or write operation may be executing concurrently.

The issue we're currently facing in our Production server is that whenever this newly developed Java program is started, the entire web application immediately becomes very slow to respond. At that time, I can also see from the output of "iostat -tx" that "%util" crosses more than 80%. So, what I can infer based on my knowledge is that the write operations are creating heavy I/O traffic. Since it was slowing down the entire web application, we've temporarily stopped running this standalone application.

Meantime, I also read that "checkpoint spikes" could be a reason for slowdowns under a write-heavy workload. I'm also reading that starting in PostgreSQL 8.3, we can get verbose logging of the checkpoint process by turning on "log_checkpoints". My question is, how do I determine whether "checkpoint" occurrences are the root cause of this slowdown in my case?

We're running PostgreSQL v8.2.22 on CentOS 5.2 with 35 GB RAM. "log_checkpoints" is not available in PostgreSQL v8.2.22.

We want to optimize our Production database to handle both reads and writes; any suggestions/advice/guidelines on this are highly appreciated.
Some important "postgresql.conf" parameters are:

# - Memory -
shared_buffers = 1536MB

# - Planner Cost Constants -
effective_cache_size = 4GB

# - Checkpoints -
checkpoint_segments = 32
checkpoint_timeout = 5min
checkpoint_warning = 270s

# - Background writer -
bgwriter_delay = 200ms
bgwriter_lru_percent = 1.0
bgwriter_lru_maxpages = 5
bgwriter_all_percent = 0.333
bgwriter_all_maxpages = 5

Regards,
Gnanam
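Since log_checkpoints does not exist in 8.2, one indirect check I'm considering (a sketch, assuming the WAL-position function available in 8.2) is to sample the current WAL location periodically. With checkpoint_segments = 32 and 16 MB segments, a segment-driven checkpoint is forced roughly every 32 * 16 MB = 512 MB of WAL, so the WAL generation rate indicates how often such checkpoints must be occurring:

```sql
-- Run a few minutes apart while the Java program is active and diff
-- the positions; each WAL segment is 16 MB.
SELECT now(), pg_current_xlog_location();
```

If the slow periods line up with each ~512 MB of WAL written, checkpoints become a strong suspect.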
[PERFORM] Inner Join - Explicit vs Implicit Join Performance
Hi,

In PostgreSQL, is there any performance difference between queries written using "explicit join notation" vs "implicit join notation" in complex queries?

EXAMPLE: Simple "explicit join notation"

SELECT *
FROM employee
INNER JOIN department ON employee.DepartmentID = department.DepartmentID;

EXAMPLE: Simple "implicit join notation"

SELECT *
FROM employee, department
WHERE employee.DepartmentID = department.DepartmentID;

Regards,
Gnanam
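For what it's worth, one check I can run myself is to compare the plans directly; my understanding is that the planner normalizes both notations (up to the from_collapse_limit/join_collapse_limit settings), so for a simple two-table case the plans should come out identical:

```sql
EXPLAIN
SELECT * FROM employee
INNER JOIN department ON employee.DepartmentID = department.DepartmentID;

EXPLAIN
SELECT * FROM employee, department
WHERE employee.DepartmentID = department.DepartmentID;
-- If the two EXPLAIN outputs match, the notations are equivalent here.
```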
[PERFORM] Function execution consuming lot of memory and eventually making server unresponsive
Hi,

We're using PostgreSQL v8.2.3 on RHEL5. I'm developing a PL/pgSQL function for one of our application reports. When I run the function multiple times (even twice or thrice), I see a considerable amount of memory being taken up by PostgreSQL, and after some time the complete server comes to a standstill and stops responding; I'm not even able to log in to the server using the PuTTY client. I then end up physically restarting the server.

Pasted below is the function I'm developing. Is there something I'm doing in the function that would cause PostgreSQL to consume a lot of memory? In my experience, this is the first time I've seen PostgreSQL consuming/eating this much memory, causing a severe performance issue and eventually bringing the server to a standstill.

Also, the 2 functions I'm calling from within this function ("get_hours_worked" and "convert_hours_n_minutes_to_decimal") do not have any performance issues, since we're already using those 2 functions in other reports and have not found any performance issues there.

Experts' suggestions/recommendations on this are highly appreciated.

For example, I would call this function like:

SELECT hours_worked_day_wise_breakup(90204, 23893, 38921, '01-01-2010 00:00:00', '12-31-2010 23:59:59');

The output of this function will be like:

8.00-typ1,4.25-typ2,0.00-typ5,6.00-typ3,8.00-typ4

Logic of this function: given any 2 dates and filter inputs (input1, input2, input3), it returns the hours worked for each day (along with a suffix - typ[1,2,3,4,5]) in comma-separated form. In the above example, I'm running the function for one year.
CREATE OR REPLACE FUNCTION hours_worked_day_wise_breakup(numeric, numeric, numeric, varchar, varchar)
RETURNS VARCHAR AS '
DECLARE
    p_input1           ALIAS FOR $1;
    p_input2           ALIAS FOR $2;
    p_input3           ALIAS FOR $3;
    p_startdate        ALIAS FOR $4;
    p_enddate          ALIAS FOR $5;
    v_loopingdate      VARCHAR;
    v_cur_start_date   VARCHAR;
    v_cur_end_date     VARCHAR;
    v_hours_in_decimal NUMERIC := 0.00;
    v_returnvalue      VARCHAR := '''';
BEGIN
    v_loopingdate := TO_CHAR(DATE(p_startdate), ''mm-dd-yyyy'');
    WHILE (DATE(v_loopingdate) <= DATE(p_enddate)) LOOP
        v_cur_start_date := v_loopingdate || '' 00:00:00'';
        v_cur_end_date   := v_loopingdate || '' 23:59:59'';

        IF (LENGTH(TRIM(v_returnvalue)) > 0) THEN
            v_returnvalue := v_returnvalue || '','';
        END IF;

        v_hours_in_decimal := convert_hours_n_minutes_to_decimal(get_hours_worked(p_input1, p_input2, 7, 1, -1, p_input3, v_cur_start_date, v_cur_end_date));
        IF (v_hours_in_decimal > 0) THEN
            v_returnvalue := v_returnvalue || v_hours_in_decimal || ''-typ1'';
            v_loopingdate := TO_CHAR((DATE(v_loopingdate) + interval ''1 day''), ''mm-dd-yyyy'');
            CONTINUE;
        END IF;

        v_hours_in_decimal := convert_hours_n_minutes_to_decimal(get_hours_worked(p_input1, p_input2, 6, 1, -1, p_input3, v_cur_start_date, v_cur_end_date));
        IF (v_hours_in_decimal > 0) THEN
            v_returnvalue := v_returnvalue || v_hours_in_decimal || ''-typ2'';
            v_loopingdate := TO_CHAR((DATE(v_loopingdate) + interval ''1 day''), ''mm-dd-yyyy'');
            CONTINUE;
        END IF;

        v_hours_in_decimal := convert_hours_n_minutes_to_decimal(get_hours_worked(p_input1, p_input2, 4, 1, -1, p_input3, v_cur_start_date, v_cur_end_date));
        IF (v_hours_in_decimal > 0) THEN
            v_returnvalue := v_returnvalue || v_hours_in_decimal || ''-typ3'';
            v_loopingdate := TO_CHAR((DATE(v_loopingdate) + interval ''1 day''), ''mm-dd-yyyy'');
            CONTINUE;
        END IF;

        v_hours_in_decimal := convert_hours_n_minutes_to_decimal(get_hours_worked(p_input1, p_input2, 3, 1, -1, p_input3, v_cur_start_date, v_cur_end_date));
        IF (v_hours_in_decimal > 0) THEN
            v_returnvalue := v_returnvalue || v_hours_in_decimal || ''-typ4'';
            v_loopingdate := TO_CHAR((DATE(v_loopingdate) + interval ''1 day''), ''mm-dd-yyyy'');
            CONTINUE;
        END IF;

        v_hours_in_decimal := 0.00;
        v_returnvalue := v_returnvalue || v_hours_in_decimal || ''-typ5'';
        v_loopingdate := TO_CHAR((DATE(v_loopingdate) + interval ''1 day''), ''mm-dd-yyyy'');
    END LOOP;

    RETURN v_returnvalue;
END;
' LANGUAGE 'plpgsql';
Re: [PERFORM] How to boost performance of queries containing pattern matching characters
> The closest you can do is something like Artur's option (or the pg_trgm
> module - handy since you are looking at 3-chars and up) to select likely
> matches combined with a separate search on '%domain.com%' to confirm
> that fact.

Thanks for your suggestion. Our production server is currently running PostgreSQL v8.2.3, and I think the pg_trgm contrib module is not available for the 8.2 series.

Also, I read about WildSpeed - fast wildcard search for the LIKE operator. What is your opinion on that?

http://www.sai.msu.su/~megera/wiki/wildspeed
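If we do manage to upgrade, my understanding is that the pg_trgm approach would look roughly like the sketch below (hypothetical: the index method and operator class names are from my reading of the contrib docs, and wildcard LIKE/ILIKE support for trigram indexes only arrived in much later releases than our 8.2.3):

```sql
-- Assumes contrib/pg_trgm is installed on a sufficiently new release:
CREATE INDEX mytable_email_trgm_idx
    ON mytable USING gin (email gin_trgm_ops);

-- The trigram index then acts as a pre-filter for queries like:
DELETE FROM mytable WHERE email ILIKE '%domain.com%';
```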
Re: [PERFORM] How to boost performance of queries containing pattern matching characters
> If you really need to match all those options, you can't use an index. A
> substring-matching index would need to have multiple entries per
> character per value (since it doesn't know what you will search for).
> The index-size becomes unmanageable very quickly.
> That's why I asked what you really wanted to match.

To be more specific: our current application allows deleting email(s) by matching as few as 3 characters. There is also a note/warning shown to application users before deleting, explaining the implications of this delete action (it is partial & case-insensitive, and it could be wide-ranging too).

> So, I'll ask again: do you really want to match all of those options?

Yes, as explained above, I want to match all of those.
Re: [PERFORM] How to boost performance of queries containing pattern matching characters
> Is that really what you are after? Or, did you just want to match:
>us...@domain.com
>us...@sub.domain.com

I understand that because I have (%) at the beginning and the end, it's going to match unrelated domains, etc., which, as you rightly said, is wide-ranging. But my point here is: how can I improve the performance of queries containing pattern matching characters?
[PERFORM] How to boost performance of queries containing pattern matching characters
Hi,

How can we boost the performance of queries containing pattern matching characters? In our case, we're using a percent sign (%) that matches any string of zero or more characters.

QUERY: DELETE FROM MYTABLE WHERE EMAIL ILIKE '%domain.com%'

The EMAIL column is VARCHAR(256). As is clear from the above query, the email is matched "partially and case-insensitively", which our application requirement demands. If it were a full match, I could easily define a functional INDEX on the EMAIL column (lower(EMAIL)) and rewrite my DELETE criteria like lower(EMAIL) = 'someem...@domain.com'.

MYTABLE currently contains 2 million records and grows consistently.

Regards,
Gnanam
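For completeness, the full-match variant described above would look like the following (the email address is a made-up placeholder, not an actual value from our data):

```sql
-- Functional index supporting case-insensitive exact matches:
CREATE INDEX mytable_email_lower_idx ON mytable (lower(email));

-- This form can use the index; the ILIKE '%...%' form cannot:
DELETE FROM mytable WHERE lower(email) = 'someone@example.com';
```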
Re: [PERFORM] Database size growing over time and leads to performance impact
We're using pgpool-II version 2.0.1 for PostgreSQL connection management.

pgpool configurations are:

num_init_children = 450
child_life_time = 300
connection_life_time = 120
child_max_connections = 30

As you recommended, I ran "ps -ax|grep postgres" at almost a busy transaction time, and I can find "idle" entries:

[r...@newuser ~]# ps -ax|grep postgres
2664 ? Ss 0:00 postgres: newuser mydb 192.168.0.200(43545) idle
2783 ? Ss 0:00 postgres: newuser mydb 192.168.0.200(43585) idle
2806 ? Ss 0:02 postgres: newuser mydb 192.168.0.200(43588) idle
2807 ? Ss 0:01 postgres: newuser mydb 192.168.0.200(43589) idle
2818 ? Ss 0:00 postgres: newuser mydb 192.168.0.200(43601) idle
2819 ? Ss 0:00 postgres: newuser mydb 192.168.0.200(43602) idle
2833 ? Ss 0:02 postgres: newuser mydb 192.168.0.200(43603) idle
2856 ? Ss 0:03 postgres: newuser mydb 192.168.0.200(43614) idle

Based on the pgpool documentation, and also as far as I know, even though the application layer returns/closes the connection, pgpool handles the actual closing of connections based on the connection_life_time parameter. When that times out, the child goes back to the "wait for connection request" state.

Can you throw some light on this? Is there a better way we should re-configure our pgpool parameters?

-----Original Message-----
From: Andy Colson [mailto:a...@squeakycode.net]
Sent: Saturday, March 27, 2010 7:06 PM
To: Gnanakumar; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Database size growing over time and leads to performance impact

On 03/27/2010 08:00 AM, Gnanakumar wrote:
> Hi,
>
> We're using PostgreSQL 8.2. Recently, in our production database, there
> was a severe performance impact.. Even though, we're regularly doing both:
>
> 1. VACUUM FULL ANALYZE once in a week during low-usage time and
>
> 2. ANALYZE everyday at low-usage time
>
> Also, we noticed that the physical database size has grown upto 30 GB.
> But, if I dump the database in the form of SQL and import it locally in
> my machine, it was only 3.2 GB.
Then while searching in Google to > optimize database size, I found the following useful link: > > http://www.linuxinsight.com/optimize_postgresql_database_size.html > > It says that even vacuumdb or reindexdb doesn't really compact database > size, only dump/restore does because of MVCC architecture feature in > PostgreSQL and this has been proven here. > > So, finally we decided to took our production database offline and > performed dump/restore. After this, the physical database size has also > reduced from 30 GB to 3.5 GB and the performance was also very good than > it was before. > > Physical database size was found using the following command: > > du -sh /usr/local/pgsql/data/base/ > > I also cross-checked this size using > "pg_size_pretty(pg_database_size(datname))". > > Questions > > 1. Is there any version/update of PostgreSQL addressing this issue? > > 2. How in real time, this issues are handled by other PostgreSQL users > without taking to downtime? > > 3. Any ideas or links whether this is addressed in upcoming PostgreSQL > version 9.0 release? > The "issue" is not with PG's. Any newer version of PG will act exactly the same. I don't think you understand. Vacuum is not meant to reduce size of the db, its meant to mark pages for reuse. VACUUM FULL is almost never needed. The fact it didnt reduce your db size is probably because of something else, like an open transaction. If you have a transaction left open, then your db will never be able to shrink or re-use pages. You'd better fix that issue first. (run ps -ax|grep postgres and look for "idle in transaction") You need to vacuum way more often than once a week. Just VACUUM ANALYZE, two, three times a day. Or better yet, let autovacuum do its thing. (if you do have autovacuum enabled, then the only problem is the open transaction thing). Dont "VACUUM FULL", its not helping you, and is being removed in newer versions. 
-Andy
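To check Andy's "idle in transaction" theory from the database side as well (a sketch against the 8.2 pg_stat_activity columns; stats_command_string must be on for current_query to be populated):

```sql
-- Sessions holding a transaction open while doing nothing; these
-- prevent VACUUM from reclaiming dead rows:
SELECT procpid, usename, backend_start, query_start
FROM pg_stat_activity
WHERE current_query = '<IDLE> in transaction';
```

Any row that stays in this result for a long time points at an application or pooler connection that began a transaction and never committed or rolled back.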
[PERFORM] REINDEXing database-wide daily
Hi,

We're using PostgreSQL 8.2. I have a question in connection with this question posted by me earlier:

http://archives.postgresql.org/pgsql-performance/2010-03/msg00343.php

In our application, DML operations (INSERT/UPDATE/DELETE) are performed heavily over the day. I also read about pg_autovacuum & REINDEX at:

http://www.postgresql.org/docs/8.2/interactive/routine-vacuuming.html
http://www.postgresql.org/docs/8.2/static/sql-reindex.html

I do not want to run the pg_autovacuum daemon during busy hours. If I can afford to take my database offline at a low-usage time and REINDEX database-wide, manually or via a Linux cron job, to boost index performance, what is the community's answer/suggestion on the following:

1. Is it a good idea to perform this on a daily basis?
2. Are there any implications of doing this on a daily basis?
3. Is there a way to find out bloated indexes?
4. Is there any other maintenance command, like ANALYZE, that has to be executed before/after REINDEX?
5. Is there a way to find out when REINDEX was last run on an INDEX/TABLE/DATABASE?

NOTE: I've also seen from my past experience that REINDEXing database-wide greatly improves performance of the application.
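On question 3, a crude starting point I could script myself (a sketch only; it lists index sizes rather than measuring true bloat, which would require comparing against the expected size for the row count) is:

```sql
-- Largest indexes in the current database; candidates to watch
-- before/after a REINDEX to see how much space was reclaimed:
SELECT c.relname AS index_name,
       pg_size_pretty(pg_relation_size(c.oid)) AS index_size
FROM pg_class c
WHERE c.relkind = 'i'
ORDER BY pg_relation_size(c.oid) DESC
LIMIT 20;
```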
[PERFORM] Database size growing over time and leads to performance impact
Hi,

We're using PostgreSQL 8.2. Recently, in our production database, there was a severe performance impact, even though we're regularly doing both:

1. VACUUM FULL ANALYZE once a week during low-usage time and
2. ANALYZE every day at low-usage time

Also, we noticed that the physical database size had grown up to 30 GB, but if I dump the database in the form of SQL and import it locally on my machine, it is only 3.2 GB. While searching Google for ways to optimize database size, I found the following useful link:

http://www.linuxinsight.com/optimize_postgresql_database_size.html

It says that even vacuumdb or reindexdb doesn't really compact database size; only dump/restore does, because of the MVCC architecture in PostgreSQL, and this has been demonstrated there.

So, finally we decided to take our production database offline and performed a dump/restore. After this, the physical database size reduced from 30 GB to 3.5 GB, and the performance was also much better than before.

The physical database size was found using the following command:

du -sh /usr/local/pgsql/data/base/

I also cross-checked this size using "pg_size_pretty(pg_database_size(datname))".

Questions:

1. Is there any version/update of PostgreSQL addressing this issue?
2. How are these issues handled in real time by other PostgreSQL users without taking downtime?
3. Any ideas or links on whether this is addressed in the upcoming PostgreSQL 9.0 release?
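For reference, these are the 8.2 settings we would need to review if we switch from weekly VACUUM FULL to autovacuum (a config sketch; the scale-factor values shown are the stock defaults, not tuned recommendations):

```ini
# postgresql.conf (8.2) -- autovacuum is off by default in this release
autovacuum = on
stats_start_collector = on       # required for autovacuum to work
stats_row_level = on             # required for autovacuum to work
autovacuum_vacuum_scale_factor = 0.2
autovacuum_analyze_scale_factor = 0.1
```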