Re: [PERFORM] Write workload is causing severe slowdown in Production

2012-03-23 Thread Gnanakumar
> Are those wrapped in a transaction or not? Each transaction forces a fsync
> when committing, and if each of those INSERT/UPDATE statements stands on
> it's own it may cause of lot of I/O.

Yes, it's wrapped inside a transaction.  Maybe this could be a reason for the
slowdown, as you've highlighted here.  At least we now have some guidance on
troubleshooting this aspect as well.
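
As a reference point, this is the difference I understand you to mean (a
minimal sketch; the table and values are purely hypothetical):

-- Autocommit: each statement is its own transaction, so each one forces a
-- commit-time fsync.
INSERT INTO mytable (id, val) VALUES (1, 'a');
INSERT INTO mytable (id, val) VALUES (2, 'b');

-- Batched: one explicit transaction, a single commit-time fsync for the batch.
BEGIN;
INSERT INTO mytable (id, val) VALUES (1, 'a');
INSERT INTO mytable (id, val) VALUES (2, 'b');
UPDATE mytable SET val = 'c' WHERE id = 1;
COMMIT;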

> There are tools that claim to remove the object vs. relational discrepancy
> when accessing the database. They often generate queries on the fly, and
> some of the queries are pretty awful (depends on how well the ORM model is
> defined). There are various reasons why this may suck - loading too much
> data, using lazy fetch everywhere etc.

Thanks for the clarification.

> Are you using something like Hibernate, JPA, ... to handle persistence?

No, we're not using any persistence frameworks/libraries as such.





Re: [PERFORM] Write workload is causing severe slowdown in Production

2012-03-23 Thread Gnanakumar
First off, thank you *so much* for that detailed explanation comparing with a
real-world application performance benchmark, which was really enlightening
for me.

> How are you handling concurrency?  (Are you using FOR SHARE on your
> SELECT statements?  Are you explicitly acquiring table locks before
> modifying data?  Etc.)  You might be introducing blocking somehow. 

No, I'm not explicitly locking any tables -- they are all *simple*
SELECT, UPDATE and INSERT statements only.
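
For completeness, here is how I'm double-checking for blocking from within
the database itself -- a rough sketch against the 8.2 system views (these are
the pre-9.0 column names):

-- backends currently waiting on a lock
SELECT procpid, waiting, current_query
FROM pg_stat_activity
WHERE waiting;

-- any ungranted lock requests
SELECT pid, locktype, mode, granted
FROM pg_locks
WHERE NOT granted;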

> In particular, I recommend that you *never* leave transactions open
> or hold locks while waiting for user response or input.

Again, we're not leaving any transactions open while waiting for user
responses, etc.

> When you hit that issue, there is not a continual slowdown --
> queries which normally run very fast (a small fraction of a second)
> may periodically all take tens of seconds.  Is that the pattern
> you're seeing?

Yes, you're correct.  Queries that normally run fast become slow during this
slowdown.
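
One thing we can try on our side, sketched here with an illustrative
threshold, is logging every statement that crosses a latency limit, so the
periodic stalls show up with timestamps in the server log:

# postgresql.conf (illustrative value)
log_min_duration_statement = 1000    # log statements taking longer than 1000 ms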

> Besides the outdated PostgreSQL release and possible blocking, I
> would be concerned if you are using any sort of ORM for the update
> application.  You want to watch that very closely because the
> default behavior of many of them does not scale well.  There's
> usually a way to get better performance through configuration and/or
> bypassing automatic query generation for complex data requests.

I'm not able to clearly understand the above statement ("...any sort of ORM
for the update application...").  Can you help me understand this?






Re: [PERFORM] Write workload is causing severe slowdown in Production

2012-03-22 Thread Gnanakumar
> There's a checkpoint_warning option. Set it to 3600 and you should get 
> messages in the log.

I have a basic question about the "checkpoint_warning" configuration setting.  The 8.2 
documentation (http://www.postgresql.org/docs/8.2/interactive/runtime-config-wal.html) 
says:

"Write a message to the server log if checkpoints caused by the filling of 
checkpoint segment files happen closer together than this many seconds (which 
suggests that checkpoint_segments ought to be raised). The default is 30 
seconds (30s)."

How does increasing the default of 30s to 3600s (which is 1 hour) cause messages 
to be printed to the log?  Even after reading the description in the documentation 
above, I'm not able to understand this point clearly.  Can you help me understand it?
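
My tentative reading of the suggestion, sketched below so it can be corrected
if I've misunderstood: with checkpoint_timeout at 5min, checkpoints will always
be far less than an hour apart, so presumably every checkpoint would then log a
warning along the lines of "checkpoints are occurring too frequently (N seconds
apart)", and the reported gap would show how often they are really happening.

# postgresql.conf (suggested value)
checkpoint_warning = 3600    # warn if checkpoints complete less than 3600s apart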





Re: [PERFORM] Write workload is causing severe slowdown in Production

2012-03-22 Thread Gnanakumar
> So, what else is running on the system? Because if there's 35GB RAM and
> the shared buffers are 1.5GB, then there's about 33GB for page cache.
> Something like 16GB would be a conservative setting.

Yes, you guessed it right.  Both the web server and the DB server are running on the 
same machine.

> I'm not saying this will fix the issues, but maybe it shows that something
> else is running on the box and maybe that's the culprit, not PostgreSQL?

Based on my observations and analysis, I'm sure that the database write workload 
"is" causing the slowdown, and not the other applications running on the 
same server.

> Then install the app again and collect as much info as possible. Otherwise
> it's all just wild guesses.

OK.





Re: [PERFORM] Write workload is causing severe slowdown in Production

2012-03-22 Thread Gnanakumar
> There's a checkpoint_warning option. Set it to 3600 and you should get
> messages in the log. Correlate those to the issues (do they happen at the
> same time?).
After setting "checkpoint_warning" to 3600, can you explain on how do I 
correlate with the messages?

> If you can, install iotop and watch the processes that cause the I/O.
I tried installing "iotop", but it failed to run because it requires Linux >= 
2.6.20.  Our CentOS5.2 is 2.6.18-8.

> What we need is more details about your setup, especially
>   - checkpoint_segments
>   - checkpoint_timeout
>   - shared_buffers
# - Memory -
shared_buffers=1536MB

# - Planner Cost Constants -
effective_cache_size = 4GB

# - Checkpoints -
checkpoint_segments=32
checkpoint_timeout=5min
checkpoint_warning=270s

# - Background writer -
bgwriter_delay = 200ms
bgwriter_lru_percent = 1.0
bgwriter_lru_maxpages = 5
bgwriter_all_percent = 0.333
bgwriter_all_maxpages = 5

> also it'd be nice to have samples from the vmstat/iostat and messages from
> the log.
Unfortunately, I don't have the "exact" logs from when the problem actually happened.





[PERFORM] Write workload is causing severe slowdown in Production

2012-03-22 Thread Gnanakumar
Hi,

We're running a web-based application powered by PostgreSQL.  Recently, we
developed a "new" separate Java-based standalone (daemon process) threaded
program that performs both read and write operations heavily on 2 "huge"
tables.  One table has 5.4 million records and the other has 1.3 million.
Moreover, more than one read and/or write operation may be executing
concurrently.

The issue we're currently facing on our Production server is that whenever
this newly developed Java program is started, the entire web application
immediately becomes very slow to respond.  At that time, I can also see from
the output of "iostat -tx" that "%util" is crossing 80%.  What I infer from
this, based on my knowledge, is that the write operations are creating heavy
I/O traffic.  Since it was slowing down the entire web application, we have
temporarily stopped running this standalone program.

Meanwhile, I also read that "checkpoint spikes" could be a reason for
slowdowns in a write-heavy database.  I also read that, starting in
PostgreSQL 8.3, we can get verbose logging of the checkpoint process by
turning on "log_checkpoints".

My question is: how do I determine whether "checkpoint" occurrences are the
root cause of this slowdown in my case?  We're running PostgreSQL v8.2.22 on
CentOS 5.2 with 35 GB RAM.  "log_checkpoints" is not available in
PostgreSQL v8.2.22.

We want to optimize our Production database to handle both reads and writes;
any suggestions/advice/guidelines on this are highly appreciated.
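
In the meantime, one low-impact check I can run from SQL is to sample the
commit counters a minute or so apart (a rough sketch; 'mydb' is a placeholder
for our database name).  The delta in xact_commit between two samples should
approximate commits -- and therefore commit-time fsyncs -- per interval:

SELECT datname, numbackends, xact_commit, xact_rollback, blks_read, blks_hit
FROM pg_stat_database
WHERE datname = 'mydb';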

Some important "postgresql.conf" parameters are:
# - Memory -
shared_buffers=1536MB

# - Planner Cost Constants -
effective_cache_size = 4GB

# - Checkpoints -
checkpoint_segments=32
checkpoint_timeout=5min
checkpoint_warning=270s

# - Background writer -
bgwriter_delay = 200ms
bgwriter_lru_percent = 1.0
bgwriter_lru_maxpages = 5
bgwriter_all_percent = 0.333
bgwriter_all_maxpages = 5

Regards,
Gnanam





[PERFORM] Inner Join - Explicit vs Implicit Join Performance

2011-10-19 Thread Gnanakumar
Hi,

In PostgreSQL, is there any performance difference between queries written
using "explicit join notation" vs "implicit join notation" in complex
queries?

EXAMPLE: Simple "explicit join notation"
SELECT *
FROM   employee INNER JOIN department
ON employee.DepartmentID = department.DepartmentID;

EXAMPLE: Simple "implicit join notation"
SELECT *
FROM   employee, department 
WHERE  employee.DepartmentID = department.DepartmentID;
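
For what it's worth, I plan to compare the two with EXPLAIN on our side
(using the same example tables as above); as far as I understand, the planner
normally produces the same plan for both notations as long as the number of
joined tables stays below join_collapse_limit:

EXPLAIN SELECT *
FROM   employee INNER JOIN department
ON employee.DepartmentID = department.DepartmentID;

EXPLAIN SELECT *
FROM   employee, department
WHERE  employee.DepartmentID = department.DepartmentID;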

Regards,
Gnanam





[PERFORM] Function execution consuming lot of memory and eventually making server unresponsive

2011-02-24 Thread Gnanakumar
Hi,

We're using PostgreSQL v8.2.3 on RHEL5.

I'm developing a PL/pgSQL function for one of our application reports.  When
I run the function multiple times (even twice or thrice), I see a considerable
amount of memory being taken up by PostgreSQL, and after some time the whole
server comes to a standstill and stops responding; I'm not even able to log in
to the server using the PuTTY client.  I then end up physically restarting the
server.

The function I'm developing is pasted below.

Is there something I'm doing in the function that would cause PostgreSQL to
consume a lot of memory?  In my experience, this is the first time I've seen
PostgreSQL consume this much memory, cause a severe performance issue, and
eventually bring the server to a standstill.  Also, the two functions called
from within this function ("get_hours_worked" and
"convert_hours_n_minutes_to_decimal") do not have any performance issues;
we're already using them in some other reports and have not found any
problems with them.

Experts' suggestions/recommendations on this are highly appreciated.

For example, I would call this function like:
SELECT hours_worked_day_wise_breakup(90204, 23893, 38921, '01-01-2010 00:00:00',
'12-31-2010 23:59:59');
The output of this function will look like this:
8.00-typ1,4.25-typ2,0.00-typ5,6.00-typ3,8.00-typ4
Logic of this function: given any 2 dates and the filter inputs (input1, input2,
input3), it returns the hours worked for each day (along with a suffix -
typ[1,2,3,4]) in comma-separated form.  In the above example, I'm running the
function for one year.

CREATE OR REPLACE FUNCTION hours_worked_day_wise_breakup(numeric, numeric,
numeric, varchar, varchar) RETURNS VARCHAR AS '

DECLARE
    p_input1            ALIAS FOR $1;
    p_input2            ALIAS FOR $2;
    p_input3            ALIAS FOR $3;
    p_startdate         ALIAS FOR $4;
    p_enddate           ALIAS FOR $5;

    v_loopingdate       VARCHAR;
    v_cur_start_date    VARCHAR;
    v_cur_end_date      VARCHAR;
    v_hours_in_decimal  NUMERIC := 0.00;
    v_returnvalue       VARCHAR := '''';

BEGIN
    v_loopingdate := TO_CHAR(DATE(p_startdate), ''mm-dd-yyyy'');

    -- Walk the date range one day at a time, appending one entry per day.
    WHILE (DATE(v_loopingdate) <= DATE(p_enddate)) LOOP
        v_cur_start_date := v_loopingdate || '' 00:00:00'';
        v_cur_end_date   := v_loopingdate || '' 23:59:59'';

        IF (LENGTH(TRIM(v_returnvalue)) > 0) THEN
            v_returnvalue := v_returnvalue || '','';
        END IF;

        -- typ1
        v_hours_in_decimal := convert_hours_n_minutes_to_decimal(
            get_hours_worked(p_input1, p_input2, 7, 1, -1, p_input3,
                             v_cur_start_date, v_cur_end_date));
        IF (v_hours_in_decimal > 0) THEN
            v_returnvalue := v_returnvalue || v_hours_in_decimal || ''-typ1'';
            v_loopingdate := TO_CHAR((DATE(v_loopingdate) + interval ''1 day''), ''mm-dd-yyyy'');
            CONTINUE;
        END IF;

        -- typ2
        v_hours_in_decimal := convert_hours_n_minutes_to_decimal(
            get_hours_worked(p_input1, p_input2, 6, 1, -1, p_input3,
                             v_cur_start_date, v_cur_end_date));
        IF (v_hours_in_decimal > 0) THEN
            v_returnvalue := v_returnvalue || v_hours_in_decimal || ''-typ2'';
            v_loopingdate := TO_CHAR((DATE(v_loopingdate) + interval ''1 day''), ''mm-dd-yyyy'');
            CONTINUE;
        END IF;

        -- typ3
        v_hours_in_decimal := convert_hours_n_minutes_to_decimal(
            get_hours_worked(p_input1, p_input2, 4, 1, -1, p_input3,
                             v_cur_start_date, v_cur_end_date));
        IF (v_hours_in_decimal > 0) THEN
            v_returnvalue := v_returnvalue || v_hours_in_decimal || ''-typ3'';
            v_loopingdate := TO_CHAR((DATE(v_loopingdate) + interval ''1 day''), ''mm-dd-yyyy'');
            CONTINUE;
        END IF;

        -- typ4
        v_hours_in_decimal := convert_hours_n_minutes_to_decimal(
            get_hours_worked(p_input1, p_input2, 3, 1, -1, p_input3,
                             v_cur_start_date, v_cur_end_date));
        IF (v_hours_in_decimal > 0) THEN
            v_returnvalue := v_returnvalue || v_hours_in_decimal || ''-typ4'';
            v_loopingdate := TO_CHAR((DATE(v_loopingdate) + interval ''1 day''), ''mm-dd-yyyy'');
            CONTINUE;
        END IF;

        -- typ5: no hours found for any of the above types
        v_hours_in_decimal := 0.00;
        v_returnvalue := v_returnvalue || v_hours_in_decimal || ''-typ5'';
        v_loopingdate := TO_CHAR((DATE(v_loopingdate) + interval ''1 day''), ''mm-dd-yyyy'');
    END LOOP;

    RETURN v_returnvalue;

END;'
LANGUAGE 'plpgsql';

Re: [PERFORM] How to boost performance of queries containing pattern matching characters

2011-02-14 Thread Gnanakumar
> The closest you can do is something like Artur's option (or the pg_trgm 
> module - handy since you are looking at 3-chars and up) to select likely 
> matches combined with a separate search on '%domain.com%' to confirm 
> that fact.

Thanks for your suggestion.  Our production server is currently running
PostgreSQL v8.2.3.  I think the pg_trgm contrib module is not available for
the 8.2 series.
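
For the archives, my reading of the pg_trgm suggestion above, sketched against
our MYTABLE/EMAIL column (the operator and opclass names are as I understand
them from the contrib documentation, and this assumes the module could be
installed at all; as the suggestion notes, rows whose trigram similarity falls
below the threshold would be missed):

CREATE INDEX mytable_email_trgm_idx ON mytable USING gist (email gist_trgm_ops);

-- Use the trigram similarity operator to pick likely matches via the index,
-- then confirm with the exact pattern match:
DELETE FROM mytable
WHERE email % 'domain.com'
  AND email ILIKE '%domain.com%';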

Also, I read about WildSpeed - fast wildcard search for LIKE operator.  What
is your opinion on that?
http://www.sai.msu.su/~megera/wiki/wildspeed




Re: [PERFORM] How to boost performance of queries containing pattern matching characters

2011-02-13 Thread Gnanakumar
> If you really need to match all those options, you can't use an index. A 
> substring-matching index would need to have multiple entries per 
> character per value (since it doesn't know what you will search for). 
> The index-size becomes unmanageable very quickly.

> That's why I asked what you really wanted to match.
To be more specific, our current application allows deleting email(s) using a
search pattern with a minimum of 3 characters.  A note/warning is also shown
to application users before deleting, explaining the implications of this
delete action (it is partial & case-insensitive, and it could be wide-ranging
too).

> So, I'll ask again: do you really want to match all of those options?
Yes, as explained above, I want to match all those.




Re: [PERFORM] How to boost performance of queries containing pattern matching characters

2011-02-13 Thread Gnanakumar
> Is that really what you are after? Or, did you just want to match:
>us...@domain.com
>us...@sub.domain.com

I understand that because I have (%) at the beginning and end, it's going to
match unrelated domains, etc., which, as you rightly said, is wide-ranging.
But my point here is: how can I improve the performance of queries containing
pattern-matching characters?




[PERFORM] How to boost performance of queries containing pattern matching characters

2011-02-13 Thread Gnanakumar
Hi,

How can we boost performance of queries containing pattern matching
characters?  In my case, we're using a percent sign (%) that matches any
string of zero or more characters.

QUERY:  DELETE FROM MYTABLE WHERE EMAIL ILIKE '%domain.com%'

EMAIL column is VARCHAR(256).

As is clear from the above query, the email is matched "partially and
case-insensitively", which my application's requirements demand.

If it were a full match, I could easily define a functional INDEX on the
EMAIL column (lower(EMAIL)) and rewrite my DELETE WHERE criteria as
lower(EMAIL) = 'someem...@domain.com'.
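
Just to make that full-match case concrete, it would be something like this
(the index name is illustrative, and the address is the placeholder from
above):

CREATE INDEX mytable_email_lower_idx ON mytable (lower(email));

-- the DELETE then becomes an exact, index-friendly comparison:
DELETE FROM mytable WHERE lower(email) = 'someem...@domain.com';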

MYTABLE currently contains 2 million records and grows consistently.

Regards,
Gnanam




Re: [PERFORM] Database size growing over time and leads to performance impact

2010-03-30 Thread Gnanakumar
We're using pgpool-II version 2.0.1 for PostgreSQL connection management.

pgpool configurations are:
num_init_children = 450
child_life_time = 300
connection_life_time = 120
child_max_connections = 30

As you recommended, I ran "ps -ax|grep postgres" during a fairly busy
transaction period, and I can see "idle" entries:
[r...@newuser ~]# ps -ax|grep postgres
 2664 ?Ss 0:00 postgres: newuser mydb 192.168.0.200(43545) idle
 2783 ?Ss 0:00 postgres: newuser mydb 192.168.0.200(43585) idle
 2806 ?Ss 0:02 postgres: newuser mydb 192.168.0.200(43588) idle
 2807 ?Ss 0:01 postgres: newuser mydb 192.168.0.200(43589) idle
 2818 ?Ss 0:00 postgres: newuser mydb 192.168.0.200(43601) idle
 2819 ?Ss 0:00 postgres: newuser mydb 192.168.0.200(43602) idle
 2833 ?Ss 0:02 postgres: newuser mydb 192.168.0.200(43603) idle
 2856 ?Ss 0:03 postgres: newuser mydb 192.168.0.200(43614) idle

Based on the pgpool documentation, and as far as I know, even though the
application layer returns/closes the connection, pgpool only handles the
actual closing of backend connections based on the connection_life_time
parameter defined.  When that times out, the connection goes to the "wait
for connection request" state.

Can you throw some light on this?  Is there a better way we should be
configuring our pgpool parameters?
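
For what it's worth, the same check can also be done from inside the database
(a rough sketch using the 8.2 catalog column names, assuming
stats_command_string is enabled; sessions stuck in "<IDLE> in transaction"
would be the problematic ones):

SELECT procpid, usename, current_query
FROM pg_stat_activity
WHERE current_query IN ('<IDLE>', '<IDLE> in transaction');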

-Original Message-
From: Andy Colson [mailto:a...@squeakycode.net] 
Sent: Saturday, March 27, 2010 7:06 PM
To: Gnanakumar; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Database size growing over time and leads to
performance impact

On 03/27/2010 08:00 AM, Gnanakumar wrote:
> Hi,
>
> We're using PostgreSQL 8.2. Recently, in our production database, there
> was a severe performance impact, even though we're regularly doing both:
>
> 1. VACUUM FULL ANALYZE once in a week during low-usage time and
>
> 2. ANALYZE everyday at low-usage time
>
> Also, we noticed that the physical database size has grown upto 30 GB.
> But, if I dump the database in the form of SQL and import it locally in
> my machine, it was only 3.2 GB. Then while searching in Google to
> optimize database size, I found the following useful link:
>
> http://www.linuxinsight.com/optimize_postgresql_database_size.html
>
> It says that even vacuumdb or reindexdb doesn't really compact database
> size, only dump/restore does because of MVCC architecture feature in
> PostgreSQL and this has been proven here.
>
> So, finally we decided to take our production database offline and
> perform a dump/restore. After this, the physical database size was
> reduced from 30 GB to 3.5 GB and the performance was also much better
> than it was before.
>
> Physical database size was found using the following command:
>
> du -sh /usr/local/pgsql/data/base/
>
> I also cross-checked this size using
> "pg_size_pretty(pg_database_size(datname))".
>
> Questions
>
> 1. Is there any version/update of PostgreSQL addressing this issue?
>
> 2. How are these issues handled in practice by other PostgreSQL users
> without taking downtime?
>
> 3. Any ideas or links on whether this is addressed in the upcoming
> PostgreSQL 9.0 release?
>

The "issue" is not with PG's.  Any newer version of PG will act exactly the
same.  I don't think you understand.  Vacuum is not meant to reduce size of
the db, its meant to mark pages for reuse.  VACUUM FULL is almost never
needed.  The fact it didnt reduce your db size is probably because of
something else, like an open transaction.  If you have a transaction left
open, then your db will never be able to shrink or re-use pages.  You'd
better fix that issue first.  (run ps -ax|grep postgres  and look for "idle
in transaction")

You need to vacuum way more often than once a week.  Just VACUUM ANALYZE,
two, three times a day.  Or better yet, let autovacuum do its thing.  (if
you do have autovacuum enabled, then the only problem is the open
transaction thing).

Dont "VACUUM FULL", its not helping you, and is being removed in newer
versions.

-Andy




[PERFORM] REINDEXing database-wide daily

2010-03-30 Thread Gnanakumar
Hi,

 

We're using PostgreSQL 8.2.

 

I have a question related to the question I posted earlier:

http://archives.postgresql.org/pgsql-performance/2010-03/msg00343.php

 

In our application, DML operations (INSERT/UPDATE/DELETE) are performed
heavily throughout the day.

 

I also read about pg_autovacuum & REINDEX at:

http://www.postgresql.org/docs/8.2/interactive/routine-vacuuming.html

http://www.postgresql.org/docs/8.2/static/sql-reindex.html

 

I do not want to run the pg_autovacuum daemon during busy hours.

 

If I can afford to take my database offline at a low-usage time and perform a
database-wide REINDEX manually or via a Linux cron job, to boost index
performance, what is the community's answer/suggestion on the following:

1. Is it a good idea to perform this on a daily basis?

2. Any implications of doing this on a daily basis?

3. Is there a way to find out bloated indexes?

4. Any other maintenance command, like ANALYZE, that has to be executed
before/after REINDEX?

5. Is there a way to find out when REINDEX was last run on an
INDEX/TABLE/DATABASE?

 

NOTE: I've also seen from past experience that a database-wide REINDEX
greatly improves the performance of the application.
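
For context, the offline database-wide run I have in mind is essentially just
the following (the database name is a placeholder for ours, and it must be the
database the session is connected to; the reindexdb command-line utility could
be used from cron instead):

-- run from a psql session (or a script invoked by cron) during the offline window
REINDEX DATABASE mydb;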

 



[PERFORM] Database size growing over time and leads to performance impact

2010-03-27 Thread Gnanakumar
Hi,

 

We're using PostgreSQL 8.2.  Recently, our production database suffered a
severe performance impact, even though we're regularly doing both:

1. VACUUM FULL ANALYZE once a week during low-usage time, and

2. ANALYZE every day at low-usage time

 

Also, we noticed that the physical database size has grown up to 30 GB.  But
if I dump the database as SQL and import it locally on my machine, it is only
3.2 GB.  Then, while searching Google for ways to optimize database size, I
found the following useful link:

 

http://www.linuxinsight.com/optimize_postgresql_database_size.html

 

It says that even vacuumdb or reindexdb doesn't really compact the database
size; only dump/restore does, because of PostgreSQL's MVCC architecture, and
this was proven in our case.

 

So, finally, we decided to take our production database offline and perform a
dump/restore.  After this, the physical database size was reduced from
30 GB to 3.5 GB, and performance was also much better than it was before.

 

Physical database size was found using the following command:

du -sh /usr/local/pgsql/data/base/

 

I also cross-checked this size using
"pg_size_pretty(pg_database_size(datname))".

 

Questions

1. Is there any version/update of PostgreSQL addressing this issue?

2. How are these issues handled in practice by other PostgreSQL users
without taking downtime?

3. Any ideas or links on whether this is addressed in the upcoming PostgreSQL
9.0 release?