[PERFORM] Config Check

2004-12-06 Thread Bryan
   #   debug5, debug4, debug3, debug2, debug1,
#   log, info, notice, warning, error
log_min_messages = error        # Values, in order of decreasing detail:
                                #   debug5, debug4, debug3, debug2, debug1,
                                #   info, notice, warning, error, log, fatal,
                                #   panic

log_error_verbosity = terse   # terse, default, or verbose messages
#log_min_error_statement = panic # Values in order of increasing severity:
                                 #   debug5, debug4, debug3, debug2, debug1,
                                 #   info, notice, warning, error, panic(off)

#log_min_duration_statement = -1 # Log all statements whose
 # execution time exceeds the value, in
 # milliseconds.  Zero prints all queries.
 # Minus-one disables.
#silent_mode = false # DO NOT USE without Syslog!
# - What to Log -
debug_print_parse = false
debug_print_rewritten = false
debug_print_plan = false
debug_pretty_print = false
log_connections = false
log_duration = false
log_pid = false
log_statement = false
log_timestamp = true
log_hostname = true
log_source_port = false
#---
# RUNTIME STATISTICS
#---
# - Statistics Monitoring -
log_parser_stats = false
log_planner_stats = false
log_executor_stats = false
log_statement_stats = false
# - Query/Index Statistics Collector -
#stats_start_collector = true
#stats_command_string = false
#stats_block_level = false
#stats_row_level = false
#stats_reset_on_server_start = true
#---
# CLIENT CONNECTION DEFAULTS
#---
# - Statement Behavior -
#search_path = '$user,public'   # schema names
#check_function_bodies = true
#default_transaction_isolation = 'read committed'
#default_transaction_read_only = false
#statement_timeout = 0  # 0 is disabled, in milliseconds
# - Locale and Formatting -
#datestyle = 'iso, mdy'
#timezone = unknown             # actually, defaults to TZ environment setting
#australian_timezones = false
#extra_float_digits = 0         # min -15, max 2
#client_encoding = sql_ascii    # actually, defaults to database encoding

# These settings are initialized by initdb -- they may be changed
lc_messages = 'en_US.UTF-8'     # locale for system error message strings
lc_monetary = 'en_US.UTF-8' # locale for monetary formatting
lc_numeric = 'en_US.UTF-8'  # locale for number formatting
lc_time = 'en_US.UTF-8' # locale for time formatting

# - Other Defaults -
#explain_pretty_print = true
#dynamic_library_path = '$libdir'
#max_expr_depth = 1 # min 10
#---
# LOCK MANAGEMENT
#---
#deadlock_timeout = 1000        # in milliseconds
max_locks_per_transaction = 200 # min 10, ~260*max_connections bytes each
#---
# VERSION/PLATFORM COMPATIBILITY
#---
# - Previous Postgres Versions -
#add_missing_from = true
#regex_flavor = advanced        # advanced, extended, or basic
#sql_inheritance = true
# - Other Platforms & Clients -
#transform_null_equals = false
<--config-->
Thanks
--
---
Bryan Vest
ComNet Inc.
bright.net Network Administration/Network Operations
(888)-618-4638
[EMAIL PROTECTED]    Pager: [EMAIL PROTECTED]
---


[PERFORM] startup caching suggestions

2007-06-25 Thread Bryan Murphy

We have a search facility in our database that uses full text indexing to
search about 300,000 records spread across 2 tables.  Nothing fancy there.

The problem is, whenever we restart the database (system crash, lost
connectivity to SAN, upgrade, configuration change, etc.) our data is not
cached and query performance is really sketchy the first five to ten minutes
or so after the restart.  This is particularly problematic because the only
way the data gets cached in memory is if somebody actively searches for it,
and the first few people who visit our site after a restart are pretty much
screwed.

I'd like to know what are the recommended strategies for dealing with this
problem.  We need our search queries to be near instantaneous, and we just
can't afford the startup penalty.

I'm also concerned that Postgres may not be pulling data off the SAN as
efficiently as theory dictates.  What's the best way I can diagnose if the
SAN is performing up to spec?  I've been using iostat, and some of what I'm
seeing concerns me.  Here's a typical iostat output (iostat -m -d 1):

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sda               0.00         0.00         0.00          0          0
sdb             102.97         2.03         0.00          2          0
sdc               0.00         0.00         0.00          0          0
sdd               0.00         0.00         0.00          0          0

sda is the os partition (local), sdb is the primary database partition (SAN),
sdc is the log file partition (SAN), and sdd is used only for backups
(SAN).  I very rarely see sdb MB_read/s much above 2, and most of the time
it hovers around 1 or lower.  This seems awfully goddamn slow to me, but
maybe I just don't fully understand what iostat is telling me.  I've seen
sdc writes get as high as 10 during a database restore.
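As a rough sanity check of raw sequential read speed outside Postgres, a
plain dd against the data volume can be compared with the iostat numbers (a
sketch only; the device name and sizes here are illustrative, and it only
reads the device, it does not write to it):

  dd if=/dev/sdb of=/dev/null bs=8M count=1280    # ~10GB read; watch the MB/s dd reports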

A few bits of information about our setup:

Debian Linux 2.6.18-4-amd64 (stable)
4x Intel(R) Xeon(R) CPU 5110 @ 1.60GHz (100% dedicated to database)
RAID 1+0 iSCSI partitions over Gig/E MTU 9000 (99% dedicated to database)
8GB RAM
Postgres v8.1.9

The database is only about 4GB in size and the key tables total about 700MB.
Primary keys are CHAR(32) GUIDs

Thanks,
Bryan


Re: [PERFORM] startup caching suggestions

2007-06-25 Thread Bryan Murphy

No, but I was just informed of that trick earlier and intend to try it
soon.  Sometimes, the solution is so simple it's TOO obvious... :)

Bryan

On 6/25/07, Oleg Bartunov <[EMAIL PROTECTED]> wrote:


On Mon, 25 Jun 2007, Bryan Murphy wrote:

> We have a search facility in our database that uses full text indexing to
> search about 300,000 records spread across 2 tables.  Nothing fancy there.
>
> The problem is, whenever we restart the database (system crash, lost
> connectivity to SAN, upgrade, configuration change, etc.) our data is not
> cached and query performance is really sketchy the first five to ten minutes
> or so after the restart.  This is particularly problematic because the only
> way the data gets cached in memory is if somebody actively searches for it,
> and the first few people who visit our site after a restart are pretty much
> screwed.
>
> I'd like to know what are the recommended strategies for dealing with this
> problem.  We need our search queries to be near instantaneous, and we just
> can't afford the startup penalty.

Bryan, did you try the 'dd if=/path/to/your/table of=/dev/null' trick?
It will read your data into the kernel's buffers very quickly.
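For reference, a minimal sketch of that trick on an 8.1-era server (the
database, table name, and OIDs shown are illustrative; it needs shell access
to read $PGDATA, and tables over 1GB are split into additional .1, .2, ...
segment files):

  -- find the on-disk file behind the table
  SELECT d.oid AS db_oid, c.relfilenode
  FROM pg_class c, pg_database d
  WHERE c.relname = 'my_search_table'
    AND d.datname = current_database();
  -- say it returns db_oid = 16384, relfilenode = 16723

  dd if=$PGDATA/base/16384/16723 of=/dev/null bs=8M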

>
> I'm also concerned that Postgres may not be pulling data off the SAN as
> efficiently as theory dictates.  What's the best way I can diagnose if the
> SAN is performing up to spec?  I've been using iostat, and some of what I'm
> seeing concerns me.  Here's a typical iostat output (iostat -m -d 1):
>
> Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
> sda               0.00         0.00         0.00          0          0
> sdb             102.97         2.03         0.00          2          0
> sdc               0.00         0.00         0.00          0          0
> sdd               0.00         0.00         0.00          0          0
>
> sda is the os partition (local), sdb is the primary database partition (SAN),
> sdc is the log file partition (SAN), and sdd is used only for backups
> (SAN).  I very rarely see sdb MB_read/s much above 2, and most of the time
> it hovers around 1 or lower.  This seems awfully goddamn slow to me, but
> maybe I just don't fully understand what iostat is telling me.  I've seen
> sdc writes get as high as 10 during a database restore.
>
> A few bits of information about our setup:
>
> Debian Linux 2.6.18-4-amd64 (stable)
> 4x Intel(R) Xeon(R) CPU 5110 @ 1.60GHz (100% dedicated to database)
> RAID 1+0 iSCSI partitions over Gig/E MTU 9000 (99% dedicated to database)
> 8GB RAM
> Postgres v8.1.9
>
> The database is only about 4GB in size and the key tables total about 700MB.
> Primary keys are CHAR(32) GUIDs
>
> Thanks,
> Bryan
>

Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83



Re: [PERFORM] Two questions.. shared_buffers and long reader issue

2007-07-11 Thread Bryan Murphy

We have a few tables that we need to pull relatively accurate aggregate
counts from, and we found the performance of SELECT COUNT(*) to be
unacceptable.  We solved this by creating triggers on insert and delete to
update counts in a secondary table which we join to when we need the count
information.

This may or may not work in your scenario, but it was a reasonable trade off
for us.
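A minimal sketch of that approach (the table, function, and trigger names
here are made up, not our actual schema):

  CREATE TABLE item_count (n bigint NOT NULL);
  INSERT INTO item_count SELECT count(*) FROM items;

  CREATE FUNCTION item_count_trg() RETURNS trigger AS $$
  BEGIN
    IF TG_OP = 'INSERT' THEN
      UPDATE item_count SET n = n + 1;
    ELSE
      UPDATE item_count SET n = n - 1;
    END IF;
    RETURN NULL;
  END;
  $$ LANGUAGE plpgsql;

  CREATE TRIGGER item_count_trg AFTER INSERT OR DELETE ON items
    FOR EACH ROW EXECUTE PROCEDURE item_count_trg();

Reading the count is then a single-row select from item_count; the trade-off
is that every insert and delete serializes on updating that one counter row.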

Bryan

On 7/11/07, Patric de Waha <[EMAIL PROTECTED]> wrote:


Hi,
   I have two questions for which I haven't really found answers on the web.

   Intro:
   I've a Website with some traffic.
   2 Million queries a day, during daylight.
   Postgres is running on a dedicated server  P4 DualCore, 4 Gig Ram.
   Mainly updates on 1 tuple. And more or less complex SELECT statements.
I noticed that the overall performance of postgres is decreasing
when one or more long
   readers are present. Where a long reader here is already a Select
count(*) from table.

   As postgres gets slower and slower, users keep hammering on the reload
   button to get their page loaded. Postgres begins to reach max_connections,
   and the web site is stuck.
   It's not because of a bad schema or bad select statements. As I said,
   a select count(*) on a big table is already triggering this behaviour.

   Why do long readers influence the rest of the transactions in such a
heavy way?
   Any configuration changes which can help here?
   Is it a disc-IO bottleneck thing?

   Second question. What is the right choice for the shared_buffers size?
   On a dedicated postgres server with 4 Giga RAM. Is there any rule of
thumb?
   Actually I set it to +-256M.


thanks for any suggestions.

Patric


My Setup:

Debian Etch
PSQL: 8.1.4

WAL files are located on another disc than the dbase itself.

max_connections = 190
shared_buffers = 3
temp_buffers = 3000
work_mem = 4096
maintenance_work_mem = 16384
fsync = on
wal_buffers = 16
effective_cache_size = 5000





Re: [PERFORM] Optmal tags design?

2007-07-18 Thread Bryan Murphy

We store tags on our items like this:

Tag.ID INT NOT NULL PRIMARY KEY
Tag.Value TEXT LCASE NOT NULL UNIQUE

Item.ID INT NOT NULL PRIMARY KEY

ItemTagBinding.ItemID INT NOT NULL REFERENCES Item.ID
ItemTagBinding.TagID INT NOT NULL REFERENCES Tag.ID
ItemTagBinding.ItemID + ItemTagBinding.TagID UNIQUE

with appropriate indexes on the columns we need to frequently query.
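In concrete DDL that shorthand comes out roughly as follows (a sketch; the
LCASE attribute above is approximated with a CHECK constraint, since Postgres
has no such column modifier):

  CREATE TABLE tag (
    id    integer PRIMARY KEY,
    value text NOT NULL UNIQUE CHECK (value = lower(value))
  );

  CREATE TABLE item (
    id integer PRIMARY KEY
  );

  CREATE TABLE item_tag_binding (
    item_id integer NOT NULL REFERENCES item(id),
    tag_id  integer NOT NULL REFERENCES tag(id),
    UNIQUE (item_id, tag_id)
  );

  -- the UNIQUE constraint already provides an index led by item_id;
  -- add one led by tag_id for "all items with tag X" lookups
  CREATE INDEX item_tag_binding_tag_id_idx ON item_tag_binding (tag_id);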

We have about 3 million tag bindings right now, and have not run into any
performance issues related to tagging other than generating tag clouds
(which we pre-calculate anyway).

I'll have to get back to you when we get up to 10's, or even 100's of
millions and let you know how it scaled.

Bryan

On 7/18/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:


I am planning to add a tags (as in the "web 2.0" thing) feature to my web
based application. I would like some feedback from the experts here on
what the best database design for that would be.

The possibilities I have come up with are:
* A tags table containing the tag and id number of what it links to.
select pid from tags where tag='bla'
select tag from tags where pid=xxx.

* a tags table where each tag exists only once, and a table with the tag
ID and picture ID to link them together.

select pid from tags inner join picture_tags using(tag_id) where tag='bla'
select tag from tags inner join picture_tags using(tag_id) where pid='xxx'

* A full text index in the picture table containing the tags

select pid from pictures where tags @@ to_tsquery('bla')
(or the non-fti version)
select pid from pictures where tags ~* '.*bla.*'

select tags from pictures where pid=xxx;




[PERFORM] cpu throttling

2007-08-02 Thread Bryan Murphy
We have a complicated stored procedure that we run frequently.  It
pegs one of our postmaster processes at 100% CPU utilization for a few
hours.  This has the unfortunate side effect of causing increased
latency for our other queries.  We are currently planning a fix, but
because of the complicated nature of this procedure it is going to
take some time to implement.

I've noticed that if I renice the process that is running the query,
the other postmaster processes are able to respond to our other
queries in a timely fashion.

My question:  Is there a way I can decrease the priority of a specific
query, or determine the PID of the process it is running in?  I'd like
to throw together a quick shell script if at all possible, as right
now I have to monitor the process manually and we'll have fixed the
problem long before we have the chance to implement proper database
clustering.

Bryan



Re: [PERFORM] cpu throttling

2007-08-02 Thread Bryan Murphy
It's a 4 processor Intel xeon machine with more than enough ram.  The
entire database can fit in memory, and while the CPU is pegged,
nothing is chewing up I/O bandwidth, and nothing is getting swapped
out of RAM.

I'm running Debian stable with only a few tweaks to the kernel's
memory settings.  As far as I'm aware, I have not changed anything
that would impact scheduling.

Other queries do respond, but it's more like every couple of seconds
one query which normally takes 300ms might take 8000ms.  Nothing
terrible, but enough that our users will notice.

Bryan
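Combining the pg_stat_activity lookup quoted below with renice, a throwaway
wrapper might look like this (a sketch only: the procedure name is a
placeholder, procpid/current_query are the pre-9.2 column names, it needs
stats_command_string enabled, and renice changes CPU priority, not I/O
priority):

  #!/bin/sh
  # renice any backend currently running the expensive procedure
  for pid in $(psql -At -d mydb -c \
      "SELECT procpid FROM pg_stat_activity
       WHERE current_query LIKE '%my_expensive_proc%'
         AND current_query NOT LIKE '%pg_stat_activity%';"); do
    renice +10 -p "$pid"
  done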

On 8/2/07, Alan Hodgson <[EMAIL PROTECTED]> wrote:
> On Thursday 02 August 2007 09:02, "Bryan Murphy" <[EMAIL PROTECTED]>
> wrote:
> > My question:  Is there a way I can decrease the priority of a specific
> > query, or determine the PID of the process it is running in?  I'd like
> > to throw together a quick shell script if at all possible, as right
> > now I have to monitor the process manually and we'll have fixed the
> > problem long before we have the chance to implement proper database
> > clustering.
>
> select procpid from pg_stat_activity  where current_query
>like '%stored_proc%' and current_query not like '%pg_stat_activity%';
>
> requires stats_command_string to be enabled
>
> I'm surprised your operating system doesn't automatically lower the priority
> of the process, though ..
>
> --
> "Remember when computers were frustrating because they did exactly what
> you told them to?  That actually seems sort of quaint now." --J.D. Baldwin
>
>
> ---(end of broadcast)---
> TIP 6: explain analyze is your friend
>



Re: [PERFORM] How to ENABLE SQL capturing???

2007-08-08 Thread Bryan Murphy
we currently have logging enabled for all queries over 100ms, and keep
the last 24 hours of logs before we rotate them.  I've found this tool
very helpful in diagnosing new performance problems that crop up:

http://pgfouine.projects.postgresql.org/
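For reference, the logging setup described above looks roughly like this in
postgresql.conf (a sketch with 8.x-era parameter names; the exact rotation
settings are an assumption, not copied from our config):

  log_min_duration_statement = 100   # log anything slower than 100 ms
  redirect_stderr = on               # renamed logging_collector in 8.3+
  log_directory = 'pg_log'
  log_filename = 'postgresql-%a.log' # one file per weekday
  log_truncate_on_rotation = on
  log_rotation_age = 1440            # minutes, i.e. rotate daily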

Bryan

On 8/8/07, Steinar H. Gunderson <[EMAIL PROTECTED]> wrote:
> On Wed, Aug 08, 2007 at 01:02:24PM -0700, smiley2211 wrote:
> > I am trying to enable capturing of the submitted code via an
> > application...how do I do this in Postgres?  Performance is SLOW on my
> > server and I have autovacuum enabled as well as rebuilt indexes...whatelse
> > should be looked at?
>
> Try "log_min_duration_statement = 100" in postgresql.conf; it will show all
> statements that take more than 100ms. Set to 0 to log _all_ statements, or
> -1 to turn the logging back off.
>
> /* Steinar */
> --
> Homepage: http://www.sesse.net/
>
> ---(end of broadcast)---
> TIP 6: explain analyze is your friend
>



Re: [PERFORM] SAN vs Internal Disks

2007-09-07 Thread Bryan Murphy
We are currently running our database on a SAN share.  It looks like this:

2 x RAID 10 (4 disk SATA 7200 each)

Raid Group 0 contains the tables + indexes
Raid Group 1 contains the log files + backups (pg_dump)

Our database server connects to the san via iSCSI over Gig/E using
jumbo frames.  File system is XFS (noatime).

I believe our raid controller is an ARECA.  Whatever it is, it has the
option of adding a battery to it but I have not yet been able to
convince my boss that we need it.

Maintenance is nice, we can easily mess around with the drive shares,
expand and contract them, snapshot them, yadda yadda yadda.  All
things which we NEVER do to our database anyway. :)

Performance, however, is a mixed bag.  It handles concurrency very
well.  We have a number of shares (production shares, data shares, log
file shares, backup shares, etc. etc.) spread across the two raid
groups and it handles them with aplomb.

Throughput, however, kinda sucks.  I just can't get the kind of
throughput to it I was hoping to get.  When our memory cache is blown,
the database can be downright painful for the next few minutes as
everything gets paged back into the cache.

I'd love to try a single 8 disk RAID 10 with battery wired up directly
to our database, but given the size of our company and limited funds,
it won't be feasible any time soon.

Bryan

On 9/7/07, Matthew Schumacher <[EMAIL PROTECTED]> wrote:
> I'm getting a san together to consolidate my disk space usage for my
> servers.  It's iscsi based and I'll be pxe booting my servers from it.
> The idea is to keep spares on hand for one system (the san) and not have
> to worry about spares for each specific storage system on each server.
> This also makes growing filesystems and such pretty simple.  Redundancy
> is also good since I'll have two iscsi switches plugged into two cisco
> ethernet switches and two different raid controllers on the jbod.  I'll
> start plugging my servers into each switch for further redundancy.  In
> the end I could lose disks, ethernet switches, cables, iscsi switches,
> raid controller, whatever, and it keeps on moving.
>
> That said, I'm not putting my postgres data on the san.  The DB server
> will boot from the san and use it for its OS, but there are 6 15k SAS
> disks in it setup with raid-10 that will be used for the postgres data
> mount.  The machine is a dell 2950 and uses an LSI raid card.
>
> The end result is a balance of cost, performance, and reliability.  I'm
> using iscsi for the cost, reliability, and ease of use, but where I need
> performance I'm sticking to local disks.
>
> schu
>
> ---(end of broadcast)---
> TIP 1: if posting/reading through Usenet, please send an appropriate
>subscribe-nomail command to [EMAIL PROTECTED] so that your
>message can get through to the mailing list cleanly
>



Re: [PERFORM] query io stats and finding a slow query

2007-09-25 Thread Bryan Murphy
We use pgfouine (http://pgfouine.projects.postgresql.org/).

I currently have postgres log every query that takes longer than
100ms, roll the log files every 24 hours, and run pgfouine nightly.  I
check it every couple of mornings and this gives me a pretty good
picture of who misbehaved over the last 24 hours.  I can't even count
the # of times I've come in in the morning and some new query has
bubbled to the top.

It's very handy.  I don't know if it would have helped you identify
your problem, but it's saved our butts a few times.

Bryan

On 9/25/07, Kamen Stanev <[EMAIL PROTECTED]> wrote:
> Thanks for the reply.
>
> Here is what I found about my problem. When i set the
> log_min_duration_statement and in the moments when the server performance is
> degrading I can see that almost all queries run very slowly (10-100 times
> slower). At first I thought that there is exclusive lock on one of the
> tables but there wasn't any.
>
> The information from the log files becomes useless when almost every query
> on your server is logged and you can't tell which query ran after which. So
> I finally wrote a script to process the log file and graphically represent
> the timing of each query from the log (something like a gantt chart), and
> that way I found out what was the reason for the slowdowns. There was a
> query which actually reads all the data from one of the big tables and while
> it is running and some time after it finished the server is slowing down to
> death. I couldn't find it just looking at the log because it was not even
> the slowest query. After I examined the chart it was very clear what was
> happening. As I understand it, while this table was scanned all the disk i/o
> operations were slowed down, and maybe the data from that table was stored
> in the os cache, and hence all the other queries were so slow? After I
> removed the big query everything runs normally.
>
> However, I was wondering if there are any tools for such log analysis. I'm
> ready to provide my script if somebody is interested? I think it is very
> useful, but maybe someone has already done something better?
>
> Regards,
> Kamen
>
> On 9/21/07, Kevin Grittner <[EMAIL PROTECTED]> wrote:
> > >>> On Thu, Sep 20, 2007 at  4:36 PM, in message
> >
> <[EMAIL PROTECTED]>,
> "Kamen Stanev"
> > <[EMAIL PROTECTED]> wrote:
> > >
> > > Is there a way to find which query is doing large io operations and/or
> which
> > > is using cached data and which is reading from disk.
> >
> > A big part of your cache is normally in the OS, which makes that tough.
> >
> > > please share your experience on how do you decide which
> > > queries to optimize and how to reorganize your database?
> >
> > We base this on two things -- query metrics from our application framework
> > and user complaints about performance.
> >
> > > Is there any tools that you use to profile your database.
> >
> > Many people set log_min_duration_statement to get a look at long-running
> > queries.
> >
> > When you identify a problem query, running it with EXPLAIN ANALYZE in
> front
> > will show you the plan with estimated versus actual counts, costs, and
> time.
> > This does actually execute the query (unlike EXPLAIN without ANALYZE).
> >
> > -Kevin
> >
> >
> >
> >
>
>



Re: [PERFORM] Random Page Cost and Planner

2010-05-27 Thread Bryan Hinton
Agree with Tom on his point about avoidance of cost param adjustments to fit
specific test cases.
A few suggestions...as I assume you own this database...
- check out pg_statio_user_tables - optimize your cache hit ratio on blocks
read...different time durations... pg_stat_bgwriter (read from a script or
something and snapshot)
- pg_buffercache in contrib/
- /proc/meminfo on linux
- find out exactly what is going on with your kernel buffer cache (size, how
it is buffering) and if your controller or drive is using a read ahead
cache.
- might want to play around with partial indexes vs. and/or range
partitioning with exclusion constraints, etc.
- define the I/O characteristics of the dataset, taking into account index
clustering and index order on in-memory pages (i.e. re-cluster?); is there a
need for multiple indexes if the heap is clustered on one of them?
- solidify the referential integrity constraints between those tables on
paper... define the use cases before modifying the database tables... I
assume this is a dev database
- linux fs mount options to explore - i.e. noatime, writeback, etc.
- maybe look at prepared statements if you are running a lot of similar
queries from a single session? assuming web front end for your db - with say
frequently queried region/category/dates for large read-only dataset with
multiple join conditions?

There are some good presentations on pgcon.org from PGCon 2010 that was held
last week...
 http://www.pgcon.org/2010/schedule/events/218.en.html

If you take everything into account and model it correctly (not too loose,
not too tight), your solution will be reusable and will save time and
hardware expenses.

Regards -

Bryan



On Thu, May 27, 2010 at 2:43 AM, David Jarvis  wrote:

> Hi, Bryan.
>
> I was just about to reply to the thread, thanks for asking. Clustering was
> key. After rebooting the machine (just to make sure absolutely nothing was
> cached), I immediately ran a report on Toronto: 5.25 seconds!
>
> Here's what I did:
>
>1. Created a new set of tables that matched the old set, with
>statistics of 1000 on the station and taken (date) columns.
>2. Inserted the data from the old hierarchy into the new set, ordered
>by station id then by date (same seven child tables as before: one per
>category).
>   - I wanted to ensure a strong correlation between primary key and
>   station id.
>   3. Added three indexes per table: (a) station id; (b) date taken;
>and (c) station-taken-category.
>4. Set the station-taken-category index as CLUSTER.
>5. Vacuumed the new tables.
>6. Dropped the old tables.
>7. Set the following configuration values:
>   - shared_buffers = 1GB
>   - temp_buffers = 32MB
>   - work_mem = 32MB
>   - maintenance_work_mem = 64MB
>   - seq_page_cost = 1.0
>   - random_page_cost = 2.0
>   - cpu_index_tuple_cost = 0.001
>   - effective_cache_size = 512MB
>
> I ran a few more reports (no reboots, but reading vastly different data
> sets):
>
>- Vancouver: 4.2s
>- Yellowknife: 1.7s
>- Montreal: 6.5s
>- Trois-Riviers: 2.8s
>
> No full table scans. I imagine some indexes are not strictly necessary and
> will test to see which can be removed (my guess: the station and taken
> indexes). The problem was that the station ids were scattered and so
> PostgreSQL presumed a full table scan would be faster.
>
> Physically ordering the data by station ids triggers index use every time.
>
> Next week's hardware upgrade should halve those times -- unless anyone has
> further suggestions to squeeze more performance out of PG. ;-)
>
> Dave
>
>
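Spelled out in SQL, the re-clustering steps above come out roughly like this
(a sketch; table, column, and index names are placeholders for Dave's actual
schema):

  ALTER TABLE measurement ALTER COLUMN station_id SET STATISTICS 1000;
  ALTER TABLE measurement ALTER COLUMN taken      SET STATISTICS 1000;

  CREATE INDEX measurement_station_idx ON measurement (station_id);
  CREATE INDEX measurement_taken_idx   ON measurement (taken);
  CREATE INDEX measurement_stc_idx     ON measurement (station_id, taken, category_id);

  -- physically order the heap by the composite index and remember it
  ALTER TABLE measurement CLUSTER ON measurement_stc_idx;
  CLUSTER measurement;
  VACUUM ANALYZE measurement;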


Re: [PERFORM] Performance tuning for postgres

2010-06-04 Thread Bryan Hinton
Is this a bulk insert?  Are you wrapping your statements within a
transaction(s)?
How many columns in the table?  What do the table statistics look like?
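For context, the batching being asked about amounts to this kind of
difference (a sketch; table and file names are made up):

  -- unbatched: one transaction, and one commit, per row
  INSERT INTO widgets (id, name) VALUES (1, 'a');

  -- batched: a single commit for all 3000+ rows
  BEGIN;
  INSERT INTO widgets (id, name) VALUES (1, 'a');
  -- ... remaining rows ...
  COMMIT;

  -- or, usually fastest, one COPY for the whole load
  COPY widgets (id, name) FROM '/tmp/widgets.csv' WITH CSV;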



On Fri, Jun 4, 2010 at 9:21 AM, Michael Gould <
mgo...@intermodalsoftwaresolutions.net> wrote:

> In my opinion it depends on the application, the priority of the
> application
> and whether or not it is a commercially sold product, but depending on your
> needs you might want to consider having a 3rd party vendor who has
> expertise
> in this process review and help tune the application.  One vendor that I
> know does this is EnterpriseDB.  I've worked with other SQL engines and
> have
> a lot of experience tuning queries in a couple of the environments but
> PostGresql isn't one of them.  Having an experienced DBA review your system
> can make the difference between night and day.
>
> Best Regards
>
> Michael Gould
>
> "Kevin Grittner"  wrote:
> > Yogesh Naik  wrote:
> >
> >> I am performing a DB insertion and update for 3000+ records and
> >> while doing so i get CPU utilization to 100% with 67% of CPU used
> >> by postgres
> >>
> >> I have also done optimization on queries too...
> >>
> >> Is there any way to optimized the CPU utilization for postgres
> >
> > We'd need a lot more information before we could make useful
> > suggestions.  Knowing something about your hardware, OS, exact
> > PostgreSQL version, postgresql.conf contents, the table definition,
> > any foreign keys or other constraints, and exactly how you're doing
> > the inserts would all be useful.  Please read this and repost:
> >
> > http://wiki.postgresql.org/wiki/SlowQueryQuestions
> >
> > -Kevin
> >
> > --
> > Sent via pgsql-performance mailing list (
> pgsql-performance@postgresql.org)
> > To make changes to your subscription:
> > http://www.postgresql.org/mailpref/pgsql-performance
> >
>
> --
> Michael Gould, Managing Partner
> Intermodal Software Solutions, LLC
> 904.226.0978
> 904.592.5250 fax
>
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>


Re: [PERFORM] How filesystems matter with PostgreSQL

2010-06-04 Thread Bryan Hinton
UFS2 w/ soft updates on FreeBSD might be an interesting addition to the list
of test cases
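The barrier difference discussed in the quoted exchange below is easy to
toggle per mount when testing (a sketch; device and mount point are
placeholders, and running without barriers is only reasonable with a
battery-backed write cache):

  mount -o remount,barrier=1 /dev/sdb1 /data    # ext3 with write barriers (safe, slower)
  mount -o remount,barrier=0 /dev/sdb1 /data    # ext3 without barriers
  mount -o remount,nobarrier /dev/sdb1 /data    # ext4 equivalent of the above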

On Fri, Jun 4, 2010 at 9:33 AM, Andres Freund  wrote:

> On Friday 04 June 2010 16:25:30 Tom Lane wrote:
> > Andres Freund  writes:
> > > On Friday 04 June 2010 14:17:35 Jon Schewe wrote:
> > >> XFS (logbufs=8): ~4 hours to finish
> > >> ext4: ~1 hour 50 minutes to finish
> > >> ext3: 15 minutes to finish
> > >> ext3 on LVM: 15 minutes to finish
> > >
> > > My guess is that some of the difference comes from barrier differences.
> > > ext4 uses barriers by default, ext3 does not.
> > Or, to put it more clearly: the reason ext3 is fast is that it's unsafe.
> Jon: To verify you can enable it via the barrier=1 option during mounting..
>
> Andres
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>


Re: [PERFORM] How filesystems matter with PostgreSQL

2010-06-04 Thread Bryan Hinton
What types of journaling on each fs?


On Fri, Jun 4, 2010 at 1:26 PM, Jon Schewe  wrote:

> On 6/4/10 9:33 AM, Andres Freund wrote:
> > On Friday 04 June 2010 16:25:30 Tom Lane wrote:
> >
> >> Andres Freund  writes:
> >>
> >>> On Friday 04 June 2010 14:17:35 Jon Schewe wrote:
> >>>
>  XFS (logbufs=8): ~4 hours to finish
>  ext4: ~1 hour 50 minutes to finish
>  ext3: 15 minutes to finish
>  ext3 on LVM: 15 minutes to finish
> 
> >>> My guess is that some of the difference comes from barrier differences.
> >>> ext4 uses barriers by default, ext3 does not.
> >>>
> >> Or, to put it more clearly: the reason ext3 is fast is that it's unsafe.
> >>
> > Jon: To verify you can enable it via the barrier=1 option during
> mounting..
> >
> >
> >
> First some details:
> Linux kernel 2.6.31
> postgres version: 8.4.2
>
> More test results:
> reiserfs: ~1 hour 50 minutes
> ext3 barrier=1: ~15 minutes
> ext4 nobarrier: ~15 minutes
> jfs: ~15 minutes
>
> --
> Jon Schewe | http://mtu.net/~jpschewe
> If you see an attachment named signature.asc, this is my digital
> signature. See http://www.gnupg.org for more information.
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>


Re: [PERFORM] Architecting a database

2010-06-25 Thread Bryan Hinton
Interesting point you made about the read-to-write ratio of 1 to 15.
How frequently will you be adding new entities, or, if you store the
customers in one database table, how frequently will you be adding new
objects of a given entity type? How many entity types do you foresee
existing, i.e. is "Customer" the single entity in the database, or will
Customer have subtypes?
How frequent and how long are write operations, and are they heavily
transaction based?  What is the max number of customers?  How much data
(approximately) will a single customer record consume in bytes, and at what
rate does it grow?
Will your system need to support any type of complex reporting in the future
(despite it being write intensive)?

I'd take a look at memcached, plproxy, pgpool, and some of the other cool
stuff in the postgresql community.
At a minimum, it might help you architect the system in such a manner that
you don't box yourself in.
Last, KV stores for heavy write intensive operations in distributed
environments are certainly interesting - a hybrid solution could work.

Sounds like a fun project!

Bryan



On Fri, Jun 25, 2010 at 7:02 PM, Greg Smith  wrote:

> Kevin Grittner wrote:
>
>> A schema is a logical separation within a database.  Table
>> client1.account is a different table from client2.account.  While a
>> user can be limited to tables within a single schema, a user with
>> rights to all the tables can join between them as needed.  You could
>> put common reference data in a public schema which all users could
>> access in addition to their private schemas
>>
>
> My guess would be that this app will end up being best split by schema.  I
> wonder whether it *also* needs to be split by database, too.  2000 clusters
> is clearly a nightmare, and putting all the client data into one big table
> has both performance and security issues; that leaves database and schema as
> possible splits.  However, having 2000 databases in a cluster is probably
> too many; having 2000 schemas in a database might also be too many.  There
> are downsides to expanding either of those to such a high quantity.
>
> In order to keep both those in the domain where they perform well and are
> managable, it may be that what's needed is, say, 50 databases with 40
> schemas each, rather than 2000 of either.  Hard to say the ideal ratio.
>  However, I think that at the application design level, it would be wise to
> consider each client as having a database+schema pair unique to them, and
> with the assumption some shared data may need to be replicated to all the
> databases in the cluster.  Then it's possible to shift the trade-off around
> as needed once the app is built.  Building that level of flexibility in
> shouldn't be too hard if it's in the design from day one, but it would be
> painful bit of refactoring to do later.  Once there's a prototype, then some
> benchmark work running that app could be done to figure out the correct
> ratio between the two.  It might even make sense to consider full
> scalability from day one and make the unique client connection info
> host:port:database:schema.
>
> P.S. Very refreshing to get asked about this before rather than after a
> giant app that doesn't perform well is deployed.
>
> --
> Greg Smith  2ndQuadrant US  Baltimore, MD
> PostgreSQL Training, Services and Support
> g...@2ndquadrant.com   www.2ndQuadrant.us
>
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>
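A minimal sketch of the schema-per-client separation described above (all
names are illustrative):

  CREATE SCHEMA client1;
  CREATE SCHEMA client2;

  -- shared reference data stays in public
  CREATE TABLE public.country (code char(2) PRIMARY KEY, name text);

  CREATE TABLE client1.account (id serial PRIMARY KEY, balance numeric);
  CREATE TABLE client2.account (id serial PRIMARY KEY, balance numeric);

  CREATE ROLE client1_user LOGIN;
  GRANT USAGE ON SCHEMA client1 TO client1_user;
  GRANT SELECT ON public.country TO client1_user;
  -- grant table-level privileges (or make client1_user the owner) as needed
  ALTER ROLE client1_user SET search_path = client1, public;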


Re: [PERFORM] Hardware recommendations

2010-12-09 Thread alan bryan
On Wed, Dec 8, 2010 at 3:03 PM, Benjamin Krajmalnik  wrote:
> I need to build a new high performance server to replace our current 
> production database server.


We run FreeBSD 8.1 with PG 8.4 (soon to upgrade to PG 9).  Hardware is:

Supermicro 2u 6026T-NTR+
2x  Intel Xeon E5520 Nehalem 2.26GHz Quad-Core (8 cores total), 48GB RAM

We use ZFS and use SSDs for both the log device and L2ARC.  All disks
and SSDs are behind a 3ware with BBU in single disk mode.  This has
given us the capacity of the spinning disks with (mostly) the
performance of the SSDs.
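On the ZFS side the layout described above amounts to something like this (a
sketch; pool and device names are illustrative):

  zpool create tank mirror da1 da2 mirror da3 da4   # striped mirrors across the spinning disks
  zpool add tank log da5      # SSD as the ZIL / separate log device
  zpool add tank cache da6    # SSD as L2ARC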

The main issue we've had is that if the server is rebooted performance
is horrible for a few minutes until the various memory and ZFS caches
are warmed up.  Luckily, that doesn't happen very often.



Re: [PERFORM] Free PostgreSQL Training, Philadelphia, Oct 30

2004-10-13 Thread Bryan Encina
> All,
>   My company (Chariot Solutions) is sponsoring a day of free
> PostgreSQL training by Bruce Momjian (one of the core PostgreSQL
> developers).  The day is split into 2 sessions (plus a Q&A session):
>
>  * Mastering PostgreSQL Administration
>  * PostgreSQL Performance Tuning
>
>   Registration is required, and space is limited.  The location is
> Malvern, PA (suburb of Philadelphia) and it's on Saturday Oct 30.  For
> more information or to register, see
>
> http://chariotsolutions.com/postgresql.jsp
>
> Thanks,
>   Aaron

Wow, that's good stuff, too bad there's no one doing stuff like that in the
Los Angeles area.

-b




[PERFORM] Config review

2004-12-06 Thread Bryan Vest
   #   debug5, debug4, debug3, debug2, debug1,
#   log, info, notice, warning, error
log_min_messages = error        # Values, in order of decreasing detail:
                                #   debug5, debug4, debug3, debug2, debug1,
                                #   info, notice, warning, error, log, fatal,
                                #   panic

log_error_verbosity = terse   # terse, default, or verbose messages
#log_min_error_statement = panic # Values in order of increasing severity:
                                 #   debug5, debug4, debug3, debug2, debug1,
                                 #   info, notice, warning, error, panic(off)

#log_min_duration_statement = -1 # Log all statements whose
 # execution time exceeds the value, in
 # milliseconds.  Zero prints all queries.
 # Minus-one disables.
#silent_mode = false # DO NOT USE without Syslog!
# - What to Log -
debug_print_parse = false
debug_print_rewritten = false
debug_print_plan = false
debug_pretty_print = false
log_connections = false
log_duration = false
log_pid = false
log_statement = false
log_timestamp = true
log_hostname = true
log_source_port = false
#---
# RUNTIME STATISTICS
#---
# - Statistics Monitoring -
log_parser_stats = false
log_planner_stats = false
log_executor_stats = false
log_statement_stats = false
# - Query/Index Statistics Collector -
#stats_start_collector = true
#stats_command_string = false
#stats_block_level = false
#stats_row_level = false
#stats_reset_on_server_start = true
#---
# CLIENT CONNECTION DEFAULTS
#---
# - Statement Behavior -
#search_path = '$user,public'   # schema names
#check_function_bodies = true
#default_transaction_isolation = 'read committed'
#default_transaction_read_only = false
#statement_timeout = 0  # 0 is disabled, in milliseconds
# - Locale and Formatting -
#datestyle = 'iso, mdy'
#timezone = unknown             # actually, defaults to TZ environment setting
#australian_timezones = false
#extra_float_digits = 0         # min -15, max 2
#client_encoding = sql_ascii    # actually, defaults to database encoding

# These settings are initialized by initdb -- they may be changed
lc_messages = 'en_US.UTF-8'     # locale for system error message strings
lc_monetary = 'en_US.UTF-8' # locale for monetary formatting
lc_numeric = 'en_US.UTF-8'  # locale for number formatting
lc_time = 'en_US.UTF-8' # locale for time formatting

# - Other Defaults -
#explain_pretty_print = true
#dynamic_library_path = '$libdir'
#max_expr_depth = 1 # min 10
#---
# LOCK MANAGEMENT
#---
#deadlock_timeout = 1000        # in milliseconds
max_locks_per_transaction = 200 # min 10, ~260*max_connections bytes each
#---
# VERSION/PLATFORM COMPATIBILITY
#---
# - Previous Postgres Versions -
#add_missing_from = true
#regex_flavor = advanced        # advanced, extended, or basic
#sql_inheritance = true
# - Other Platforms & Clients -
#transform_null_equals = false
<--config-->
Thanks
--
---
Bryan Vest
ComNet Inc.
bright.net Network Administration/Network Operations
(888)-618-4638
[EMAIL PROTECTED]    Pager: [EMAIL PROTECTED]
---


Re: [PERFORM] column name is "LIMIT"

2005-03-14 Thread Bryan Encina
> Note also that the Slony-I replication system has problems 
> with column 
> names identical to reserved words. This is rooted in the fact 
> that the 
> quote_ident() function doesn't quote reserved words ... as it IMHO is 
> supposed to do.
> 
> 
> Jan
> 

Does this apply to table names as well or just columns?

Bryan



[PERFORM] Performance tuning on FreeBSD

2008-03-03 Thread alan bryan
I've got a new server and am myself new to tuning postgres.

Server is an 8 core Xeon 2.33GHz, 8GB RAM, RAID 10 on a 3ware 9550SX-4LP w/ BBU.

It's serving as the DB for a fairly write intensive (maybe 25-30%) Web
application in PHP.  We are not using persistent connections, thus the
high max connections.

I've done the following so far:

> cat /boot/loader.conf
kern.ipc.semmni=256
kern.ipc.semmns=512
kern.ipc.semmnu=256

> cat /etc/sysctl.conf
kern.ipc.shmall=393216
kern.ipc.shmmax=1610612736
kern.ipc.semmap=256
kern.ipc.shm_use_phys=1

postgresql.conf settings (changed from Default):
max_connections = 180
shared_buffers = 1024MB
maintenance_work_mem = 128MB
wal_buffers = 1024kB

I then set up a test database for running pgbench with scaling factor
100. I then ran:
> pgbench -c 100 -t 1000 testdb
and got:
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 100
number of clients: 100
number of transactions per client: 1000
number of transactions actually processed: 100000/100000
tps = 557.095867 (including connections establishing)
tps = 558.013714 (excluding connections establishing)

Just for testing, I tried turning off fsync and got:
> pgbench -c 100 -t 1000 testdb
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 100
number of clients: 100
number of transactions per client: 1000
number of transactions actually processed: 100000/100000
tps = 4014.075114 (including connections establishing)
tps = 4061.662041 (excluding connections establishing)

Do these numbers sound inline with what I should be seeing?  What else
can I do to try to get better performance in the more general sense
(knowing that specifics are tied to real world data and testing).  Any
hints for FreeBSD specific tuning would be helpful.



Re: [PERFORM] Performance tuning on FreeBSD

2008-03-03 Thread alan bryan
On Mon, Mar 3, 2008 at 4:26 PM, Bill Moran
<[EMAIL PROTECTED]> wrote:

>  > > cat /boot/loader.conf
>  > kern.ipc.semmni=256
>  > kern.ipc.semmns=512
>  > kern.ipc.semmnu=256
>  >
>  > > cat /etc/sysctl.conf
>  > kern.ipc.shmall=393216
>  > kern.ipc.shmmax=1610612736
>
>  I would just set this to 2G (which is the max).  It doesn't really hurt
>  anything if you don't use it all.

I'll try that and report back.


>  > kern.ipc.semmap=256
>  > kern.ipc.shm_use_phys=1
>  >
>  > postgresql.conf settings (changed from Default):
>  > max_connections = 180
>  > shared_buffers = 1024MB
>
>  Why not 2G, which would be 25% of total memory?


Ditto - I'll report back.
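Putting those two suggestions together, the revised settings would look
roughly like this (a sketch, not measured results; shmall assumes 4kB pages):

  # /etc/sysctl.conf
  kern.ipc.shmmax=2147483648    # 2GB, the stated maximum
  kern.ipc.shmall=524288        # shmmax / 4096

  # postgresql.conf
  shared_buffers = 2048MB       # ~25% of the 8GB of RAM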



>  Are you running FreeBSD 7?  If performance is of the utmost importance,
>  then you need to be running the 7.X branch.
>
>  Based on your pgbench results, I'm guessing you didn't get battery-backed
>  cache on your systems?  That makes a big difference no matter what OS
>  you're using.
>
>  Besides that, I can't think of any FreeBSD-specific things to do.  Basically,
>  general tuning advice applies to FreeBSD as well as to most other OS.

Yes, FreeBSD 7.0-Release.  Tried both the 4BSD and ULE schedulers and
didn't see much difference with this test.
I do have the Battery for the 3ware and it is enabled.  I'll do some
bonnie++ benchmarks and make sure disk is near where it should be.

Should turning off fsync make things roughly 8x-10x faster?  Or is
that indicative of something not being correct or tuned quite right in
the rest of the system?  I'll have to run in production with fsync on
but was just testing to see how much of an effect it had.

Thanks,
Alan



Re: [PERFORM] Performance tuning on FreeBSD

2008-03-04 Thread alan bryan
On Mon, Mar 3, 2008 at 5:11 PM, Greg Smith <[EMAIL PROTECTED]> wrote:
> On Mon, 3 Mar 2008, alan bryan wrote:
>
>  >> pgbench -c 100 -t 1000 testdb
>
> > tps = 558.013714 (excluding connections establishing)
>  >
>  > Just for testing, I tried turning off fsync and got:
>
> > tps = 4061.662041 (excluding connections establishing)
>
>  This is odd.  ~500 is what I expect from this test when there is no write
>  cache to accelerate fsync, while ~4000 is normal for your class of
>  hardware when you have such a cache.  Since you say your 3Ware card is
>  setup with a cache and a BBU, that's suspicious--you should be able to get
>  around 4000 with fsync on.  Any chance you have the card set to
>  write-through instead of write-back?  That's the only thing that comes to
>  mind that would cause this.
>
>  --
>  * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD
>


According to 3dm2 the cache is on.  I even tried setting the StorSave
preference to "Performance" with no real benefit.  There seems to be
something really wrong with disk performance.  Here are the results from
bonnie:

File './Bonnie.2551', size: 104857600
Writing with putc()...done
Rewriting...done
Writing intelligently...done
Reading with getc()...done
Reading intelligently...done
Seeker 1...Seeker 2...Seeker 3...start 'em...done...done...done...
              ---Sequential Output    ---Sequential Input-- --Random--
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
          100  9989  4.8  6739  1.0 18900  7.8 225973 98.5 1914662 99.9 177210.7 259.7

This is on FreeBSD 7.0-Release.  I tried ULE and 4BSD schedulers with
no difference.  Maybe I'll try FreeBSD 6.3 to see what that does?



[PERFORM] CPU bound at 99%

2008-04-22 Thread Bryan Buecking
#default_transaction_isolation = 'read committed'
#default_transaction_read_only = off
#statement_timeout = 0  # 0 is disabled
#vacuum_freeze_min_age = 1

# - Locale and Formatting -

datestyle = 'iso, mdy'
#timezone = unknown # actually, defaults to TZ 
# environment setting
#timezone_abbreviations = 'Default' # select the set of available timezone
# abbreviations. Currently, there are
#   Default
#   Australia
#   India
# However you can also create your own
# file in share/timezonesets/.
#extra_float_digits = 0 # min -15, max 2
#client_encoding = sql_ascii# actually, defaults to database
# encoding

# These settings are initialized by initdb -- they might be changed
lc_messages = 'C'   # locale for system error message 
# strings
lc_monetary = 'C'   # locale for monetary formatting
lc_numeric = 'C'# locale for number formatting
lc_time = 'C'   # locale for time formatting

# - Other Defaults -

#explain_pretty_print = on
#dynamic_library_path = '$libdir'
#local_preload_libraries = ''


#---
# LOCK MANAGEMENT
#---

#deadlock_timeout = 1s
#max_locks_per_transaction = 64 # min 10
# (change requires restart)
# Note: each lock table slot uses ~270 bytes of shared memory, and there are
# max_locks_per_transaction * (max_connections + max_prepared_transactions)
# lock table slots.


#---
# VERSION/PLATFORM COMPATIBILITY
#---

# - Previous Postgres Versions -

#add_missing_from = off
#array_nulls = on
#backslash_quote = safe_encoding# on, off, or safe_encoding
#default_with_oids = off
#escape_string_warning = on
#standard_conforming_strings = off
#regex_flavor = advanced# advanced, extended, or basic
#sql_inheritance = on

# - Other Platforms & Clients -

#transform_null_equals = off


#---
# CUSTOMIZED OPTIONS
#---

#custom_variable_classes = ''   # list of custom variable class names

-- 
Bryan Buecking



Re: [PERFORM] CPU bound at 99%

2008-04-22 Thread Bryan Buecking
On Tue, Apr 22, 2008 at 08:41:09AM -0700, Joshua D. Drake wrote:
> On Wed, 23 Apr 2008 00:31:01 +0900
> Bryan Buecking <[EMAIL PROTECTED]> wrote:
> 
> > at any given time there is about 5-6 postgres in startup 
> > (ps auxwww | grep postgres | grep startup | wc -l)
> > 
> > about 2300 connections in idle 
> > (ps auxwww | grep postgres | idle)
> > 
> > and loads of "FATAL: sorry, too many clients already" being logged.
> > 
> > The server that connects to the db is an apache server using
> > persistent connections. MaxClients is 2048 thus the high number of
> > connections needed. Application was written in PHP using the Pear DB
> > class.
> 
> Sounds like your pooler isn't reusing connections properly.

The persistent connections are working properly. The idle connections
are expected given that the Apache child process are not closing them
(A la non-persistent).  The connections do go away after 1000 requests
(MaxChildRequest).

I decided to move towards persistent connections since prior to
persistent connections the idle vs startup were reversed.

-- 
Bryan Buecking  http://www.starling-software.com



Re: [PERFORM] CPU bound at 99%

2008-04-22 Thread Bryan Buecking
On Tue, Apr 22, 2008 at 10:55:19AM -0500, Erik Jones wrote:
> On Apr 22, 2008, at 10:31 AM, Bryan Buecking wrote:
> 
> >max_connections = 2400
> 
> That is WAY too high.  Get a real pooler, such as pgpool, and drop  
> that down to 1000 and test from there.

I agree, but the number of idle connections doesn't seem to affect
performance, only memory usage. I'm trying to lessen the load of
connection setup. But it sounds like this tax is minimal?

When these issues started happening, max_connections was set to 1000 and
I was not using persistent connections.

> I see you mentioned 500 concurrent connections. Are each of those
> connections actually doing something?

Yes, out of the 2400-odd connections, 500 are either in SELECT or RESET.

> My guess that once you cut down on the number actual connections
> you'll find that each connection can get it's work done faster
> and you'll see that number drop significantly.

I agree, but not in this case.  I will look at using pooling. 
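As a concrete point of reference, a transaction-pooling setup might look like
this (the sketch below uses PgBouncer's config format; pgpool, which was
suggested above, is configured differently):

  [databases]
  mydb = host=127.0.0.1 port=5432 dbname=mydb

  [pgbouncer]
  listen_port = 6432
  pool_mode = transaction
  max_client_conn = 2400      ; what the web tier is allowed to open
  default_pool_size = 50      ; what actually runs against Postgres

The web tier then connects to port 6432 instead of 5432, and max_connections
in postgresql.conf can drop accordingly.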
-- 
Bryan Buecking  http://www.starling-software.com



Re: [PERFORM] CPU bound at 99%

2008-04-22 Thread Bryan Buecking
On Tue, Apr 22, 2008 at 10:55:19AM -0500, Erik Jones wrote:
> 
> Are you referring to PHP's persistent connections?  Do not use those.   
> Here's a thread that details the issues with why not:  
> http://archives.postgresql.org/pgsql-general/2007-08/msg00660.php .  

Thanks for that article, very informative and persuasive enough that
I've turned off persistent connections.

-- 
Bryan Buecking  http://www.starling-software.com



Re: [PERFORM] CPU bound at 99%

2008-04-22 Thread Bryan Buecking
On Tue, Apr 22, 2008 at 01:21:03PM -0300, Rodrigo Gonzalez wrote:
> Are tables vacuumed often?

How often is often?  Right now the db is vacuumed once a day.
-- 
Bryan Buecking  http://www.starling-software.com



Re: [PERFORM] Help Me Understand Why I'm Getting a Bad Query Plan

2009-03-24 Thread Bryan Murphy
I hate to nag, but could anybody help me with this?  We have a few
related queries that are causing noticeable service delays in our
production system.  I've tried a number of different things, but I'm
running out of ideas and don't know what to do next.

Thanks,
Bryan

On Mon, Mar 23, 2009 at 2:03 PM, Bryan Murphy  wrote:
> Hey Guys,
>
> I've got a query on our production system that isn't choosing a good
> plan.  I can't see why it's choosing to do a sequential scan on the
> ItemExperienceLog table.  That table is about 800mb and has about 2.5
> million records.  This example query only returns 4 records.  I've
> tried upping the statistics for ItemExperienceLog.VisitorId and
> ItemExperienceLog.ItemId to 1000 (from our default of 100) with no
> success.
>
> Our primary keys are guids stored as char(32) not null.  Our database
> uses UTF-8 encoding and is currently version v8.3.5.
>
> The queries:
>
>
>
> --SET enable_seqscan = off
> --SET enable_seqscan = on
>
> --ALTER TABLE ItemExperienceLog ALTER COLUMN VisitorId SET STATISTICS 1000
> --ALTER TABLE ItemExperienceLog ALTER COLUMN ItemId SET STATISTICS 1000
> --ANALYZE ItemExperienceLog
>
> SELECT MAX(l.Id) as Id, l.ItemId
> FROM ItemExperienceLog l
> INNER JOIN Items_Primary p ON p.Id = l.ItemId
> INNER JOIN Feeds f ON f.Id = p.FeedId
> INNER JOIN Visitors v ON v.Id = l.VisitorId
> WHERE
>        v.UserId = 'fbe2537f21d94f519605612c0bf7c2c5'
>        AND LOWER(f.Slug) = LOWER('Wealth_Building_by_NightingaleConant')
> GROUP BY l.ItemId
>
>
>
> Explain verbose output (set enable_seqscan = on):
>
>
>
> HashAggregate  (cost=124392.54..124392.65 rows=9 width=37) (actual
> time=7765.650..7765.654 rows=4 loops=1)
>  ->  Nested Loop  (cost=2417.68..124392.49 rows=9 width=37) (actual
> time=1706.703..7765.611 rows=11 loops=1)
>        ->  Nested Loop  (cost=2417.68..123868.75 rows=1807 width=70)
> (actual time=36.374..7706.677 rows=3174 loops=1)
>              ->  Hash Join  (cost=2417.68..119679.50 rows=1807
> width=37) (actual time=36.319..7602.221 rows=3174 loops=1)
>                    Hash Cond: (l.visitorid = v.id)
>                    ->  Seq Scan on itemexperiencelog l
> (cost=0.00..107563.09 rows=2581509 width=70) (actual
> time=0.010..4191.251 rows=2579880 loops=1)
>                    ->  Hash  (cost=2401.43..2401.43 rows=1300
> width=33) (actual time=3.673..3.673 rows=897 loops=1)
>                          ->  Bitmap Heap Scan on visitors v
> (cost=22.48..2401.43 rows=1300 width=33) (actual time=0.448..2.523
> rows=897 loops=1)
>                                Recheck Cond: (userid =
> 'fbe2537f21d94f519605612c0bf7c2c5'::bpchar)
>                                ->  Bitmap Index Scan on
> visitors_userid_index2  (cost=0.00..22.16 rows=1300 width=0) (actual
> time=0.322..0.322 rows=897 loops=1)
>                                      Index Cond: (userid =
> 'fbe2537f21d94f519605612c0bf7c2c5'::bpchar)
>              ->  Index Scan using items_primary_pkey on items_primary
> p  (cost=0.00..2.31 rows=1 width=66) (actual time=0.027..0.029 rows=1
> loops=3174)
>                    Index Cond: (p.id = l.itemid)
>        ->  Index Scan using feeds_pkey on feeds f  (cost=0.00..0.28
> rows=1 width=33) (actual time=0.016..0.016 rows=0 loops=3174)
>              Index Cond: (f.id = p.feedid)
>              Filter: (lower((f.slug)::text) =
> 'wealth_building_by_nightingaleconant'::text)
> Total runtime: 7765.767 ms
>
>
>
> Explain verbose output (set enable_seqscan = off):
>
>
>
> HashAggregate  (cost=185274.71..185274.82 rows=9 width=37) (actual
> time=185.024..185.028 rows=4 loops=1)
>  ->  Nested Loop  (cost=0.00..185274.67 rows=9 width=37) (actual
> time=0.252..184.989 rows=11 loops=1)
>        ->  Nested Loop  (cost=0.00..184751.21 rows=1806 width=70)
> (actual time=0.223..134.943 rows=3174 loops=1)
>              ->  Nested Loop  (cost=0.00..180564.28 rows=1806
> width=37) (actual time=0.192..60.214 rows=3174 loops=1)
>                    ->  Index Scan using visitors_userid_index2 on
> visitors v  (cost=0.00..2580.97 rows=1300 width=33) (actual
> time=0.052..2.342 rows=897 loops=1)
>                          Index Cond: (userid =
> 'fbe2537f21d94f519605612c0bf7c2c5'::bpchar)
>                    ->  Index Scan using
> itemexperiencelog__index__visitorid on itemexperiencelog l
> (cost=0.00..134.04 rows=230 width=70) (actual time=0.013..0.040 rows=4
> loops=897)
>                          Index Cond: (l.visitorid = v.id)
>              ->  Index Scan using items_primary_pkey on items_primary
> p  (cost=0.00..2.31 ro

Re: [PERFORM] Help Me Understand Why I'm Getting a Bad Query Plan

2009-03-24 Thread Bryan Murphy
On Tue, Mar 24, 2009 at 10:04 PM, marcin mank  wrote:
> There is one thing I don`t understand:
>
>              ->  Nested Loop  (cost=0.00..180564.28 rows=1806
> width=37) (actual time=0.192..60.214 rows=3174 loops=1)
>                    ->  Index Scan using visitors_userid_index2 on
> visitors v  (cost=0.00..2580.97 rows=1300 width=33) (actual
> time=0.052..2.342 rows=897 loops=1)
>                          Index Cond: (userid =
> 'fbe2537f21d94f519605612c0bf7c2c5'::bpchar)
>                    ->  Index Scan using
> itemexperiencelog__index__visitorid on itemexperiencelog l
> (cost=0.00..134.04 rows=230 width=70) (actual time=0.013..0.040 rows=4
> loops=897)
>                          Index Cond: (l.visitorid = v.id)
>
> If it expects 1300 visitors with the userid, and for each of them to
> have 230 entries in itemexperiencelog, how can it come up with 1806
> returned rows (and be about right!)?

I'm not sure I follow what you're saying.

One thing to keep in mind: due to a lapse in our judgement at the
time, this itemexperiencelog table serves as both a current-state
table and a log table.  It therefore potentially contains multiple
redundant entries per item, but we typically only look at the most
recent entry to determine an item's current state.
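
For reference, that "most recent entry" step is what the MAX(l.Id) ...
GROUP BY l.ItemId in the query above is doing.  An equivalent sketch of
just that piece, using PostgreSQL's DISTINCT ON (column names taken from
that query; the visitor id is only a placeholder):

SELECT DISTINCT ON (l.ItemId) l.ItemId, l.Id
FROM ItemExperienceLog l
WHERE l.VisitorId = 'some-visitor-id'   -- placeholder value
ORDER BY l.ItemId, l.Id DESC;           -- newest log row per item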

We're in the process of re-factoring this now, as well as
denormalizing some of the tables to eliminate unnecessary joins, but I
keep running into these problems and need to understand what is going
on so that I know we're fixing the correct things.

Thanks,
Bryan



Re: [PERFORM] Help Me Understand Why I'm Getting a Bad Query Plan

2009-03-24 Thread Bryan Murphy
On Tue, Mar 24, 2009 at 9:30 PM, Josh Berkus  wrote:
> For some reason, your first post didn't make it to the list, which is why
> nobody responded.

Weird... I've been having problems with Gmail and Google Reader all week.

>>> I've got a query on our production system that isn't choosing a good
>>> plan.  I can't see why it's choosing to do a sequential scan on the
>>> ItemExperienceLog table.  That table is about 800mb and has about 2.5
>>> million records.  This example query only returns 4 records.  I've
>>> tried upping the statistics for ItemExperienceLog.VisitorId and
>>> ItemExperienceLog.ItemId to 1000 (from our default of 100) with no
>>> success.
>
> Yes, that is kind of inexplicable.  For some reason, it's assigning a very
> high cost to the nestloops, which is why it wants to avoid them with a seq
> scan.  Can you try lowering cpu_index_cost to 0.001 and see how that affects
> the plan?

I'm assuming you meant cpu_index_tuple_cost.  I changed that to 0.001
as you suggested and forced Postgres to reload its configuration, but I'm
still getting the same execution plan.
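
(Incidentally, a planner cost setting like this can also be tried
per-session from psql before touching postgresql.conf, roughly:

SET cpu_index_tuple_cost = 0.001;   -- affects only the current session
SHOW cpu_index_tuple_cost;          -- confirm the value took effect
-- then re-run EXPLAIN ANALYZE on the query from earlier in the thread

That removes any doubt about whether the reload actually picked up the
change.)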

Looking through our configuration one more time, I see that at some
point I set random_page_cost to 2.0, but I don't see any other changes
to query planner settings from their default values.

Bryan



Re: [PERFORM] Help Me Understand Why I'm Getting a Bad Query Plan

2009-03-25 Thread Bryan Murphy
On Wed, Mar 25, 2009 at 8:40 AM, Robert Haas  wrote:
> On Tue, Mar 24, 2009 at 11:43 PM, Bryan Murphy  wrote:
>> Looking through our configuration one more time, I see that at some
>> point I set random_page_cost to 2.0, but I don't see any other changes
>> to query planner settings from their default values.
>
> You don't by any chance have enable_ set to "off", do you?
>
> ...Robert

Alas, I wish it were that simple.  Here's the whole query tuning
section in its entirety.  Every setting preceded by the comment #BPM
is one I changed from its default value.
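
(A quick way to double-check that none of the enable_* switches are
off, without grepping the config file, is to query pg_settings:

SELECT name, setting FROM pg_settings WHERE name LIKE 'enable%';

Everything below is straight out of postgresql.conf.)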

(sorry for any duplicates, I'm still having dropouts with gmail)

Thanks,
Bryan

#--
# QUERY TUNING
#--

# - Planner Method Configuration -

#enable_bitmapscan = on
#enable_hashagg = on
#enable_hashjoin = on
#enable_indexscan = on
#enable_mergejoin = on
#enable_nestloop = on
#enable_seqscan = on
#enable_sort = on
#enable_tidscan = on

# - Planner Cost Constants -

#seq_page_cost = 1.0# measured on an arbitrary scale

#BPM
#random_page_cost = 4.0 # same scale as above
random_page_cost = 2.0

#cpu_tuple_cost = 0.01  # same scale as above
#cpu_index_tuple_cost = 0.005   # same scale as above
#cpu_operator_cost = 0.0025 # same scale as above

#BPM
#effective_cache_size = 128MB
effective_cache_size = 12GB

# - Genetic Query Optimizer -

#geqo = on
#geqo_threshold = 12
#geqo_effort = 5# range 1-10
#geqo_pool_size = 0 # selects default based on effort
#geqo_generations = 0   # selects default based on effort
#geqo_selection_bias = 2.0  # range 1.5-2.0

# - Other Planner Options -

#BPM
#default_statistics_target = 10 # range 1-1000
default_statistics_target = 100

#constraint_exclusion = off

#from_collapse_limit = 8
#join_collapse_limit = 8# 1 disables collapsing of explicit
   # JOIN clauses



Re: [PERFORM] Help Me Understand Why I'm Getting a Bad Query Plan

2009-03-25 Thread Bryan Murphy
On Wed, Mar 25, 2009 at 4:55 PM, Josh Berkus  wrote:
> Oh, I see.  It thinks that it'll need to pull 260,000 redundant rows in
> order to get 1800 unique ones.  Only it's wrong; you're only pulling about
> 4000.
>
> Try increasing some stats still further: itemexperiencelog.visitorid and
> visitors.user_id both to 500.

I tried that already, but I decided to try again in case I messed up
something last time.  Here's what I ran.  As you can see, it still
chooses to do a sequential scan.  Am I changing the stats for those
columns correctly?

Thanks,
Bryan


First, the query:


SELECT MAX(l.Id) as Id, l.ItemId
FROM ItemExperienceLog l
INNER JOIN Items_Primary p ON p.Id = l.ItemId
INNER JOIN Feeds f ON f.Id = p.FeedId
INNER JOIN Visitors v ON v.Id = l.VisitorId
WHERE
v.UserId = 'fbe2537f21d94f519605612c0bf7c2c5'
AND LOWER(f.Slug) = LOWER('Wealth_Building_by_NightingaleConant')
GROUP BY l.ItemId


The query plan:


HashAggregate  (cost=130291.23..130291.35 rows=9 width=37) (actual
time=8385.428..8385.433 rows=4 loops=1)
  ->  Nested Loop  (cost=2649.02..130291.19 rows=9 width=37) (actual
time=3707.336..8385.388 rows=11 loops=1)
->  Nested Loop  (cost=2649.02..129744.01 rows=1888 width=70)
(actual time=8.881..8322.029 rows=3210 loops=1)
  ->  Hash Join  (cost=2649.02..125273.81 rows=1888
width=37) (actual time=8.836..8196.469 rows=3210 loops=1)
Hash Cond: (l.visitorid = v.id)
->  Seq Scan on itemexperiencelog l
(cost=0.00..112491.03 rows=2697303 width=70) (actual
time=0.048..4459.139 rows=2646177 loops=1)
->  Hash  (cost=2631.24..2631.24 rows=1422
width=33) (actual time=7.751..7.751 rows=899 loops=1)
  ->  Bitmap Heap Scan on visitors v
(cost=23.44..2631.24 rows=1422 width=33) (actual time=0.577..6.347
rows=899 loops=1)
Recheck Cond: (userid =
'fbe2537f21d94f519605612c0bf7c2c5'::bpchar)
->  Bitmap Index Scan on
visitors_userid_index2  (cost=0.00..23.08 rows=1422 width=0) (actual
time=0.419..0.419 rows=899 loops=1)
  Index Cond: (userid =
'fbe2537f21d94f519605612c0bf7c2c5'::bpchar)
  ->  Index Scan using items_primary_pkey on items_primary
p  (cost=0.00..2.36 rows=1 width=66) (actual time=0.024..0.025 rows=1
loops=3210)
Index Cond: (p.id = l.itemid)
->  Index Scan using feeds_pkey on feeds f  (cost=0.00..0.28
rows=1 width=33) (actual time=0.018..0.018 rows=0 loops=3210)
  Index Cond: (f.id = p.feedid)
  Filter: (lower((f.slug)::text) =
'wealth_building_by_nightingaleconant'::text)
Total runtime: 8385.538 ms


Bump up the stats:


ALTER TABLE ItemExperienceLog ALTER COLUMN VisitorId SET STATISTICS 500;
ALTER TABLE ItemExperienceLog ALTER COLUMN ItemId SET STATISTICS 500;
ANALYZE ItemExperienceLog;

ALTER TABLE Visitors ALTER COLUMN UserId SET STATISTICS 500;
ALTER TABLE Visitors ALTER COLUMN Id SET STATISTICS 500;
ANALYZE Visitors;


The new query plan:


HashAggregate  (cost=127301.63..127301.72 rows=7 width=37) (actual
time=11447.033..11447.037 rows=4 loops=1)
  ->  Nested Loop  (cost=1874.67..127301.60 rows=7 width=37) (actual
time=4717.880..11446.987 rows=11 loops=1)
->  Nested Loop  (cost=1874.67..126923.09 rows=1306 width=70)
(actual time=20.565..11345.756 rows=3210 loops=1)
  ->  Hash Join  (cost=1874.67..123822.53 rows=1306
width=37) (actual time=20.445..8292.235 rows=3210 loops=1)
Hash Cond: (l.visitorid = v.id)
->  Seq Scan on itemexperiencelog l
(cost=0.00..112010.04 rows=2646604 width=70) (actual
time=0.065..4438.481 rows=2646549 loops=1)
->  Hash  (cost=1862.32..1862.32 rows=988
width=33) (actual time=19.360..19.360 rows=899 loops=1)
  ->  Bitmap Heap Scan on visitors v
(cost=18.08..1862.32 rows=988 width=33) (actual time=0.666..17.788
rows=899 loops=1)
Recheck Cond: (userid =
'fbe2537f21d94f519605612c0bf7c2c5'::bpchar)
->  Bitmap Index Scan on
visitors_userid_index2  (cost=0.00..17.83 rows=988 width=0) (actual
time=0.520..0.520 rows=899 loops=1)
  Index Cond: (userid =
'fbe2537f21d94f519605612c0bf7c2c5'::bpchar)
  ->  Index Scan using items_primary_pkey on items_primary
p  (cost=0.00..2.36 rows=1 width=66) (actual time=0.944..0.945 rows=1
loops=3210)
Index Cond: (p.id = l.itemid)
->  Index Scan using feeds_pkey on feeds f  (cost=0.00..0.28
rows=1 width=33) (actual time=0.029..0.029 rows=0 loops=3210)
  Index Cond: (f.id = p.feedid)
  Filter: (lower((f.slug)::text) =
'wealth_building_by_nig

Re: [PERFORM] Help Me Understand Why I'm Getting a Bad Query Plan

2009-03-25 Thread Bryan Murphy
On Wed, Mar 25, 2009 at 9:15 PM, Tom Lane  wrote:
> I think what you should be doing is messing with the cost parameters
> ... and not in the direction you tried before.  I gather from
>        effective_cache_size = 12GB
> that you have plenty of RAM on this machine.  If the tables involved
> are less than 1GB then it's likely that you are operating in a fully
> cached condition, and the default cost parameters are not set up for
> that.  You want to be charging a lot less for page accesses relative to
> CPU effort.  Try reducing both seq_page_cost and random_page_cost to 0.5
> or even 0.1.  You'll need to watch your other queries to make sure
> nothing gets radically worse though ...
>
>                        regards, tom lane

Thanks Tom, I think that did the trick.  I'm going to have to keep an
eye on the database for a few days to make sure there are no
unintended consequences, but it looks good.  Here's the new query
plan:


HashAggregate  (cost=40906.58..40906.67 rows=7 width=37) (actual
time=204.661..204.665 rows=4 loops=1)
  ->  Nested Loop  (cost=0.00..40906.55 rows=7 width=37) (actual
time=0.293..204.628 rows=11 loops=1)
->  Nested Loop  (cost=0.00..40531.61 rows=1310 width=70)
(actual time=0.261..113.576 rows=3210 loops=1)
  ->  Nested Loop  (cost=0.00..39475.97 rows=1310
width=37) (actual time=0.232..29.484 rows=3210 loops=1)
->  Index Scan using visitors_userid_index2 on
visitors v  (cost=0.00..513.83 rows=1002 width=33) (actual
time=0.056..2.307 rows=899 loops=1)
  Index Cond: (userid =
'fbe2537f21d94f519605612c0bf7c2c5'::bpchar)
->  Index Scan using
itemexperiencelog__index__visitorid on itemexperiencelog l
(cost=0.00..37.43 rows=116 width=70) (actual time=0.013..0.021 rows=4
loops=899)
  Index Cond: (l.visitorid = v.id)
  ->  Index Scan using items_primary_pkey on items_primary
p  (cost=0.00..0.79 rows=1 width=66) (actual time=0.018..0.019 rows=1
loops=3210)
Index Cond: (p.id = l.itemid)
->  Index Scan using feeds_pkey on feeds f  (cost=0.00..0.27
rows=1 width=33) (actual time=0.023..0.023 rows=0 loops=3210)
  Index Cond: (f.id = p.feedid)
  Filter: (lower((f.slug)::text) =
'wealth_building_by_nightingaleconant'::text)
Total runtime: 204.759 ms


What I did was change seq_page_cost back to 1.0 and then changed
random_page_cost to 0.5

This also makes logical sense to me.  We've completely rewritten our
caching layer over the last three weeks and introduced Slony into our
architecture, so our usage patterns have transformed overnight.
Previously we were very I/O bound; now most of the actively used data
is actually in memory.  Just a few weeks ago there was so much churn
that almost nothing stayed cached for long.

This is great, thanks guys!

Bryan



Re: [PERFORM] Help Me Understand Why I'm Getting a Bad Query Plan

2009-03-25 Thread Bryan Murphy
On Wed, Mar 25, 2009 at 10:28 PM, Tom Lane  wrote:
> Bryan Murphy  writes:
>> What I did was change seq_page_cost back to 1.0 and then changed
>> random_page_cost to 0.5
>
> [ squint... ]  It makes no physical sense for random_page_cost to be
> less than seq_page_cost.  Please set them the same.
>
>                        regards, tom lane

Done.  I just saw the tip in the docs as well.  Both are set to 0.5 now,
and I'm still getting the good query plan.
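
For the archives, the relevant postgresql.conf lines now look roughly
like this (same #BPM convention as the rest of my config):

#BPM
#seq_page_cost = 1.0            # measured on an arbitrary scale
seq_page_cost = 0.5

#BPM
#random_page_cost = 4.0         # same scale as above
random_page_cost = 0.5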

Thanks!
Bryan



Re: [PERFORM] Limit I/O bandwidth of a certain backend

2009-05-05 Thread Bryan Murphy
On Tue, May 5, 2009 at 2:31 AM, Vlad Arkhipov  wrote:
> Is there a way to limit I/O bandwidth/CPU usage of a certain backend? It
> seems that ionice/renice makes no sense because of bgwriter/WAL writer
> processes are not a part of a backend. I have a periodic query (every
> hour) that make huge I/O load and should run in background. When this
> query runs all other fast queries slow down dramatically.

Could you use something like Slony to replicate the needed data to a
secondary database and run the query there?

Bryan
