Sorry to bother everyone with yet another "my query isn't using an index" problem, but I am in over my head on this one. I am open to ways of restructuring this query to perform better.
I have a table, 'ea', with 22 million rows in it. VACUUM ANALYZE has just been run on the table.
This is the
On May 8, 2005, at 6:51 PM, Russell Smith wrote:
On Mon, 9 May 2005 09:20 am, Dan Harris wrote:
You cannot use an index for %CORVETTE% or %RED%. There is no way for the index to know if a row had that in the middle without scanning the whole index. So it's much cheaper to do a sequence scan.
On May 8, 2005, at 8:06 PM, Josh Berkus wrote:
If I were to use tsearch2 for full-text indexing, would I need to
create another table that merges all of my recordtext rows into a
single 'text' field type?
No. Read the OpenFTS docs; they are fairly clear on how to set up a simple FTS index. (TSe
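As a concrete starting point for the full-text route, here is a minimal sketch written against the core tsearch syntax of later releases (the tsearch2/OpenFTS API of this era was similar in spirit but not identical; 'ea', 'recordtext', and 'incidentid' are the names used elsewhere in this thread):

    -- add a tsvector column, populate it, and index it
    ALTER TABLE ea ADD COLUMN recordtext_tsv tsvector;
    UPDATE ea SET recordtext_tsv = to_tsvector('english', recordtext);
    CREATE INDEX ea_recordtext_tsv_idx ON ea USING gist (recordtext_tsv);

    -- word searches can then hit the index instead of seq scanning
    SELECT incidentid
      FROM ea
     WHERE recordtext_tsv @@ to_tsquery('english', 'corvette & red');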
On May 8, 2005, at 8:32 PM, Russell Smith wrote:
I have run this, and while it is very fast, I'm concerned it's not
doing what I need.
How fast is very fast?
It took 35 seconds to complete versus ~450 my old way.
select distinct em.incidentid, ea.recordtext as retdata, eg.long,
eg.lat
FROM em JOIN
ption and the formatting of the log
file and the fact that EVERY query is logged is not what I'm after
for this project. The "infinite-running" queries are unpredictable
and may only happen once a week. Logging 24/7 in anticipation of one
of these o
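One way to catch only the long runners without logging everything 24/7 is the duration threshold setting; a hedged sketch (the database name is hypothetical and the threshold illustrative; the setting exists since 7.4 and needs superuser):

    -- log any statement that runs longer than 60 seconds
    ALTER DATABASE mydb SET log_min_duration_statement = 60000;  -- milliseconds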
Gurus,
A table in one of my databases has just crossed the 30 million row
mark and has begun to feel very sluggish for just about anything I do
with it. I keep the entire database vacuumed regularly. And, as
long as I'm not doing a sequential scan, things seem reasonably quick
most of t
So sorry, I forgot to mention I'm running version 8.0.1
Thanks
On Jul 13, 2005, at 1:11 PM, John A Meinel wrote:
I might be wrong, but there may be something substantially more wrong than slow I/O.
John
Yes, I'm afraid of that too. I just don't know what tools I should
use to figure that out. I have some 20 other databases on this
system, sa
On Jul 13, 2005, at 2:17 PM, Stephen Frost wrote:
Could you come up w/ a test case that others could reproduce where
explain isn't returning?
This was simply due to my n00bness :) I had always been doing
explain analyze, instead of just explain. Next time one of these
queries comes up,
On Jul 14, 2005, at 12:12 AM, Greg Stark wrote:
Dan Harris <[EMAIL PROTECTED]> writes:
I keep the entire database vacuumed regularly.
How often is "regularly"?
Well, once every day, but there aren't a ton of inserts or updates
going on a daily basis. Maybe 1,000
I'm trying to improve the speed of this query:
explain select recordtext from eventactivity inner join ( select
incidentid from k_r where id = 94 ) a using ( incidentid ) inner join
( select incidentid from k_b where id = 107 ) b using ( incidentid );
On Jul 14, 2005, at 9:47 AM, Alvaro Herrera wrote:
On Thu, Jul 14, 2005 at 12:28:05AM -0600, Dan Harris wrote:
. Ext3 must really be crappy for postgres, or at least it is on this box.
Were you using the default journal settings for ext3?
Yes, I was. Next time I get a chance to reboot
On Jul 14, 2005, at 9:42 AM, John A Meinel wrote:
You might try giving it a little bit more freedom with:
EXPLAIN ANALYZE
SELECT recordtext FROM eventactivity, k_r, k_b
WHERE eventactivity.incidentid = k_r.incidentid
AND eventactivity.incidentid = k_b.incidentid
AND k_r.id = 94
AND k_b.id = 107;
On Jul 14, 2005, at 5:12 PM, John A Meinel wrote:
Dan Harris wrote:
Well, postgres is estimating around 500 rows each; is that way off? Try just doing:
EXPLAIN ANALYZE SELECT incidentid FROM k_b WHERE id = 107;
EXPLAIN ANALYZE SELECT incidentid FROM k_r WHERE id = 94;
Once again, do
On Jul 14, 2005, at 7:15 PM, John A Meinel wrote:
Is the distribution of your rows uneven? Meaning do you have more rows
with a later id than an earlier one?
There are definitely some id's that will have many times more than
the others. If I group and count them, the top 10 are fairly
On Jul 14, 2005, at 10:12 PM, John A Meinel wrote:
My biggest question is why the planner thinks the Nested Loop would be so expensive.
Have you tuned any of the parameters? It seems like something is out of whack. (cpu_tuple_cost, random_page_cost, etc...)
here's some of my postgresql.co
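For readers experimenting along, these planner costs can be tried per-session before touching postgresql.conf; a sketch reusing the query from this thread (the value is illustrative, not a recommendation):

    SET random_page_cost = 2;  -- default is 4; lower makes index scans look cheaper
    EXPLAIN ANALYZE
    SELECT recordtext FROM eventactivity, k_r, k_b
     WHERE eventactivity.incidentid = k_r.incidentid
       AND eventactivity.incidentid = k_b.incidentid
       AND k_r.id = 94
       AND k_b.id = 107;
    RESET random_page_cost;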
On Jul 15, 2005, at 9:09 AM, Dan Harris wrote:
On Jul 14, 2005, at 10:12 PM, John A Meinel wrote:
My biggest question is why the planner thinks the Nested Loop would be so expensive.
Have you tuned any of the parameters? It seems like something is out of whack. (cpu_tuple_cost
On Jul 15, 2005, at 2:39 PM, Ron Wills wrote:
Hello all
I'm running postgres 7.4.5 on a dual 2.4GHz Athlon, 1GB RAM, and a 3Ware SATA RAID.
Operating System? Which file system are you using? I was having a
similar problem just a few days ago and learned that ext3 was the
culprit.
Lately, I've been reading a lot about these new Coraid AoE RAID
devices ( http://www.coraid.com ). They tout it as being fast and
cheap and better than iSCSI due to the lack of TCP/IP over the wire.
Is it likely that a 15-drive RAID 10 Linux software RAID would
outperform a 4-drive 10k SC
I am working on a process that will be inserting tens of millions of rows and need this to be as quick as possible.
The catch is that for each row I could potentially insert, I need to
look and see if the relationship is already there to prevent
multiple entries. Currently I am doing a SELECT
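One common shape for this, sketched with hypothetical table and column names: let a unique constraint guard the relationship and insert only the missing pairs in one set-based statement, instead of a SELECT per row:

    CREATE UNIQUE INDEX relationship_uniq ON relationship (a_id, b_id);

    INSERT INTO relationship (a_id, b_id)
    SELECT DISTINCT s.a_id, s.b_id
      FROM staging s
     WHERE NOT EXISTS (SELECT 1
                         FROM relationship r
                        WHERE r.a_id = s.a_id
                          AND r.b_id = s.b_id);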
On Jul 28, 2005, at 8:38 AM, Chris Hoover wrote:
I did some more testing, and ran the explain analyze on the problem.
In my session I did a set enable_hashjoin = false and then ran the
analyze. This caused it to use the indexes as I have been expecting
it to do.
Now, how can I get it to use
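For reference, the experiment Chris describes looks like this in a session (a sketch; the join below is a hypothetical stand-in for his problem query):

    SET enable_hashjoin = false;  -- session-local planner toggle, not a production setting
    EXPLAIN ANALYZE
    SELECT * FROM orders JOIN customers USING (customerid);
    RESET enable_hashjoin;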
I thought I would send this to pg-performance since so many people
helped me with my speed issues recently. I was definitely IO-
bottlenecked.
Since then, I have installed 2 RAID arrays with 7 15k drives in them
in RAID 0+1 as well as add a new controller card with 512MB of cache
on it.
On Aug 9, 2005, at 1:08 PM, Tom Lane wrote:
"Joshua D. Drake" <[EMAIL PROTECTED]> writes:
My experience is that when this type of thing happens it is typically
specific queries that cause the problem. If you turn on statement
logging you can get the exact queries and debug from there.
He
On Aug 10, 2005, at 12:49 AM, Steve Poe wrote:
Dan,
Do you mean you did RAID 1+0 (RAID 10) or RAID 0+1? Just a clarification, since RAID 0 is still a single point of failure even if RAID 1 is on top of RAID 0.
Well, you tell me if I stated incorrectly. There are two raid
enclosures with
On Aug 9, 2005, at 3:51 PM, John A Meinel wrote:
Dan Harris wrote:
On Aug 10, 2005, at 12:49 AM, Steve Poe wrote:
Dan,
Do you mean you did RAID 1+0 (RAID 10) or RAID 0+1? Just a clarification, since RAID 0 is still a single point of failure even if RAID 1 is on top of RAID 0.
Well
I have a web page for my customers that shows them count of records
and some min/max date ranges in each table of a database, as this is
how we bill them for service. They can log in and check the counts
at any time. I'd like for the counts to be as fresh as possible by
keeping this dynam
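A sketch of the usual answer here (trigger-maintained counts), with hypothetical table names; note that TG_TABLE_NAME needs 8.2+, and older releases spell it TG_RELNAME:

    CREATE TABLE rowcounts (tablename text PRIMARY KEY, n bigint NOT NULL);

    CREATE OR REPLACE FUNCTION bump_rowcount() RETURNS trigger AS $$
    BEGIN
        IF TG_OP = 'INSERT' THEN
            UPDATE rowcounts SET n = n + 1 WHERE tablename = TG_TABLE_NAME;
        ELSE  -- DELETE
            UPDATE rowcounts SET n = n - 1 WHERE tablename = TG_TABLE_NAME;
        END IF;
        RETURN NULL;  -- AFTER trigger; return value is ignored
    END;
    $$ LANGUAGE plpgsql;

    -- seed the counter, then keep it current with a trigger
    INSERT INTO rowcounts
    VALUES ('billing_records', (SELECT count(*) FROM billing_records));
    CREATE TRIGGER billing_records_count
        AFTER INSERT OR DELETE ON billing_records
        FOR EACH ROW EXECUTE PROCEDURE bump_rowcount();

Reading the count is then an indexed single-row select instead of a seq scan:

    SELECT n FROM rowcounts WHERE tablename = 'billing_records';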
Thanks for all the great ideas. I have more options to evaluate now.
-Dan
On Aug 19, 2005, at 12:55 AM, Jeffrey W. Baker wrote:
On Tue, 2005-08-16 at 10:46 -0700, Roger Hand wrote:
Have you considered booting your
machine with elevator=deadline?
Although I'm not the OP for this problem, I thought I'd try it out.
WOW.. this should be in a Pg tuning guide somewh
On Aug 19, 2005, at 3:01 PM, Jeremiah Jahn wrote:
Rebuild in progress with just ext3 on the raid array...will see if
this
helps the access times.
From my recent experiences, I can say ext3 is probably not a great
choice for Pg databases. If you check the archives you'll see
there's a l
Do you have any sources for that information? I am running dual
SmartArray 6402's in my DL585 and haven't noticed anything poor about
their performance.
On Sep 1, 2005, at 2:24 PM, Luke Lonergan wrote:
Are you using the built-in HP SmartArray RAID/SCSI controllers? If
so, that
could be
On Sep 28, 2005, at 8:32 AM, Arnau wrote:
Hi all,
I have been "googling" a bit, searching for info about a way to monitor postgresql (CPU & Memory, num processes, ... )
You didn't mention your platform, but I have an xterm open pretty
much continuously for my DB server that runs plain old t
On Oct 3, 2005, at 5:02 AM, Steinar H. Gunderson wrote:
I thought this might be interesting, not least due to the extremely low price ($150 + the price of regular DIMMs):
This has been posted before, and the main reason nobody got very
excited is that:
a) it only uses the PCI bus
On Oct 3, 2005, at 5:02 AM, Steinar H. Gunderson wrote:
I thought this might be interesting, not least due to the extremely low price ($150 + the price of regular DIMMs):
Replying before my other post came through... It looks like their benchmarks are markedly improved since the last
Arjen van der Meijden wrote:
But be aware that there can be substantial and unexpected differences on
this relatively new platform due to simply changing the OS, like we saw
when going from linux 2.6.15 to 2.6.18, as you can see here:
http://tweakers.net/reviews/657/2
Having upgraded to 2
I have a new task of automating the export of a very complex Crystal
Report. One thing I have learned in the last 36 hours is that the
export process to PDF is really, really, slooww..
Anyway, that is none of your concern. But, I am thinking that I can
somehow utilize some of PG's strengths
Thank you all for your ideas. I appreciate the quick response.
-Dan
I've found that it would be helpful to be able to tell how busy my
dedicated PG server is ( Linux 2.6 kernel, v8.0.3 currently ) before
pounding it with some OLAP-type queries. Specifically, I have a
multi-threaded client program that needs to run several thousand
sequential queries. I broke
Dan Harris wrote:
I've found that it would be helpful to be able to tell how busy my
dedicated PG server is ( Linux 2.6 kernel, v8.0.3 currently ) before
pounding it with some OLAP-type queries.
..snip
Thank you all for your great ideas! I'm going to try the perl function
as
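For the record, a crude SQL-side busyness check is possible on an 8.0-era server if stats_command_string is enabled (a sketch; '<IDLE>' is what idle backends report there):

    SELECT count(*) AS active_backends
      FROM pg_stat_activity
     WHERE current_query NOT LIKE '<IDLE>%';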
8.0.3 - Linux 2.6.18..
Freshly vacuumed and analyzed
This database has been humming along fine for a while now, but I've got one of
those sticky queries that is taking
much too long to finish.
After some digging, I've found that the planner is choosing to apply a
necessary seq scan to the tabl
Is there a pg_stat_* table or the like that will show how bloated an index is?
I am trying to squeeze some disk space and want to track down where the worst offenders are before performing a global REINDEX on all tables, as the database is roughly 400GB on disk and this takes a very long time to
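One option, assuming the contrib pgstattuple module is installed: its pgstatindex() function reports leaf density and fragmentation for a btree index, which is a decent bloat proxy (the index name is hypothetical):

    SELECT avg_leaf_density, leaf_fragmentation
      FROM pgstatindex('public.some_large_index');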
Michael Stone wrote:
On Fri, Apr 27, 2007 at 09:27:49AM -0400, Carlos Moreno wrote:
Notice that the second part of my suggestion covers this --- have
additional
switches to initdb
If the person knows all that, why wouldn't they know to just change the
config parameters?
Exactly.. What I
Bill Moran wrote:
In response to Dan Harris <[EMAIL PROTECTED]>:
Why does the user need to manually track max_fsm_pages and max_fsm_relations? I bet there are many users who have never taken the time to understand what this means and wonder why performance still stinks after vac
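For what it's worth, the era's usual way to check those settings is the tail of a database-wide VACUUM VERBOSE, which reports free-space-map usage and warns when it overflows (the INFO line below is paraphrased from memory, so treat it as illustrative):

    VACUUM VERBOSE;
    -- INFO:  free space map contains 123456 pages in 300 relations
    -- a warning follows if the page slots needed exceed max_fsm_pages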
Orhan Aglagul wrote:
Hi Everybody,
I was trying to see how many inserts per second my application could handle on various machines.
I read that postgres does have issues with MP Xeon (costly context switching). But I still think that with fsync=on, 65 seconds is ridiculous.
CPU is unlikel
Daniel Haensse wrote:
Dear list,
I'm running postgres on a tomcat server. The vacuum is run every hour (cronjob), which leads to a performance drop in the tomcat applications. I played around with the renice command and I think it is possible to reduce this effect with a renice. The problem is how c
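An alternative to renice that throttles vacuum's own I/O is the cost-based delay added in 8.0; a sketch (the value is illustrative and the table name hypothetical):

    SET vacuum_cost_delay = 20;  -- sleep 20 ms each time the cost limit is reached
    VACUUM ANALYZE app_table;
    RESET vacuum_cost_delay;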
Tyler Durden wrote:
Hi,
I'm having some performance problems with a simple select count(id). I have 700,000 records in one table, and when I do:
# explain select (id) from table_name;
-[ RECORD 1 ]----------
QUERY PLAN | Seq Scan on
Andrew Sullivan wrote:
On Thu, Jun 07, 2007 at 03:26:56PM -0600, Dan Harris wrote:
They don't always have to be in a single transaction. Breaking it up and vacuuming in between is a good idea; I'll consider that. Thanks
If you can do it this way, it helps _a lot_. I've
Thomas Finneid wrote:
Hi
During some tests I did, I noticed that it does not necessarily seem to be true that one needs the fastest disks to have a pg system that is fast. It seems to me that it's more important to:
- choose the correct methods to use for the operation
- tune the pg memory setti
Today, I looked at 'top' on my PG server and saw a pid that reported 270 hours
of CPU time. Considering this is a very simple query, I was surprised to say
the least. I was about to just kill the pid, but I figured I'd try and see
exactly what it was stuck doing for so long.
Here's the strac
Tom Lane wrote:
Dan Harris <[EMAIL PROTECTED]> writes:
Here's the strace summary as run for a few second sample:
% time     seconds  usecs/call     calls  errors  syscall
------ ----------- ----------- --------- ------- ---------
 97.25    0.671629           9
Kari Lavikka wrote:
Hello!
Some background info.. We have a blog table that contains about eight
million blog entries. Average length of an entry is 1200 letters.
Because each 8k page can accommodate only a few entries, every query
that involves several entries causes several random seeks to
My PG server came to a screeching halt yesterday. Looking at top, I saw a very large number of "startup waiting" tasks. A pg_dump was running and one of my scripts had issued a CREATE DATABASE command. It looks like the CREATE DATABASE was exclusive but was having to wait for the pg_dump to fin
On 3/22/10 4:36 PM, Carlo Stonebanks wrote:
Here we go again!
Can anyone see any obvious faults?
Carlo
maintenance_work_mem = 256MB
I'm not sure how large your individual tables are, but you might want to
bump this value up to get faster vacuums.
max_fsm_relations = 1000
I think this will d
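Worth noting: maintenance_work_mem can also be raised for a single session's manual vacuum, without editing postgresql.conf; a sketch (the value and table name are illustrative):

    SET maintenance_work_mem = '512MB';
    VACUUM ANALYZE big_table;
    RESET maintenance_work_mem;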
On 10/7/10 11:47 AM, Aaron Turner wrote:
Basically, each connection is taking about 100MB resident. As we need
to increase the number of threads to be able to query all the devices
in the 5 minute window, we're running out of memory.
I think the first thing to do is look into using a connecti
On 10/11/10 8:02 PM, Scott Carey wrote:
would give you a 1MB read-ahead. Also, consider XFS and its built-in
defragmentation. I have found that a longer lived postgres DB will get extreme
file fragmentation over time and sequential scans end up mostly random.
On-line file defrag helps treme
On 10/12/10 10:44 AM, Scott Carey wrote:
On Oct 12, 2010, at 8:39 AM, Dan Harris wrote:
On 10/11/10 8:02 PM, Scott Carey wrote:
would give you a 1MB read-ahead. Also, consider XFS and its built-in
defragmentation. I have found that a longer lived postgres DB will get extreme
file
On 10/12/10 4:33 PM, Neil Whelchel wrote:
On Tuesday 12 October 2010 08:39:19 Dan Harris wrote:
On 10/11/10 8:02 PM, Scott Carey wrote:
would give you a 1MB read-ahead. Also, consider XFS and its built-in
defragmentation. I have found that a longer lived postgres DB will get
extreme file
On 3/4/11 11:03 AM, Wayne Conrad wrote:
On 03/04/11 10:34, Glyn Astill wrote:
> I'm wondering (and this may be a can of worms) what people's opinions are on these schedulers?
When testing our new DB box just last month, we saw a big improvement
in bonnie++ random I/O rates when using the noop
explain analyze
select distinct eventmain.incidentid, eventmain.entrydate,
eventgeo.long, eventgeo.lat, eventgeo.geox, eventgeo.geoy
from eventmain, eventgeo
where
eventmain.incidentid = eventgeo.incidentid and
( long > -104.998027962962 and long < -104.985957781349 ) and
( lat > 39.707
Dan Harris wrote:
explain analyze
doh.. sorry to reply to my own post. But I messed up copying some
of the fields into the select statement that you'll see in the "Sort
Key" section of the analyze results. The mistake was mine. Everything
else is "normal"
Markus Bertheau wrote:
Have you tried using a GIST index on lat & long? These things are
meant for two-dimensional data, whereas btree doesn't handle
two-dimensional data that well. How many rows satisfy either of the
long / lat condition?
According to the analyze, less than 500 rows mat
Dan Harris wrote:
Markus Bertheau wrote:
Have you tried using a GIST index on lat & long? These things are
meant for two-dimensional data, whereas btree doesn't handle
two-dimensional data that well. How many rows satisfy either of the
long / lat condition?
According to th
Merlin Moncure wrote:
As others will probably mention, effective queries on lat/long, which is a spatial problem, will require r-tree or gist. I don't have a lot of experience with exotic indexes but this may be the way to go.
One easy optimization to consider making is to make an index on eithe
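To make the gist suggestion concrete, here is a minimal sketch using only the built-in geometric types, with column names taken from the query earlier in the thread (the bounding-box values are illustrative, since the original predicate is truncated above):

    CREATE INDEX eventgeo_pos_gist
        ON eventgeo USING gist (box(point(long, lat), point(long, lat)));

    SELECT incidentid
      FROM eventgeo
     WHERE box(point(long, lat), point(long, lat))
        && box(point(-104.998027, 39.707), point(-104.985958, 39.720));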
I have a table with 1 live row that I found has 115000 dead rows in it (
from a testing run ). I'm trying to VACUUM FULL the table and it has
run for over 18 hours without completion. Considering the hardware on
this box and the fact that performance seems reasonable in all other
aspects, I'm
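Given a single live row, rewriting the table is far cheaper than VACUUM FULL; a hedged sketch (the table name is hypothetical; like VACUUM FULL this takes an exclusive lock, and TRUNCATE will refuse if other tables hold foreign keys into it):

    BEGIN;
    CREATE TEMP TABLE keep AS SELECT * FROM bloated_table;
    TRUNCATE bloated_table;                        -- reclaims all dead space at once
    INSERT INTO bloated_table SELECT * FROM keep;
    COMMIT;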
I have a query that is intended to select from multiple "small tables"
to get a limited subset of "incidentid" and then join with a "very
large" table. One of the operations will require a sequential scan, but
the planner is doing the scan on the very large table before joining the
small ones,
Tom Lane wrote:
That's very strange --- the estimated cost of the seqscan is high enough
that the planner should have chosen a nestloop with inner indexscan on
the big table. I'm not sure about the join-order point, but the hash
plan for the first join seems wrong in any case.
Um, you do have a
Tom Lane wrote:
So it's estimating 5775 cost units per probe into eventactivity, which
is pretty high --- it must think that a lot of rows will be retrieved by
the index (way more than the 20 or so it thinks will get past the filter
condition).
What does the pg_stats entry for eventactivity
Tom Lane wrote:
What the stats entry is saying is that the most common entries occur
about 75000 times apiece (0.0017 * 45e6), which is what's scaring
the planner here ;-). I think those frequencies are artificially high
though. The default statistics sample size is 3000 rows (300 *
statist
My database is used primarily in an OLAP-type environment. Sometimes my
users get a little carried away and find some way to slip past the
sanity filters in the applications and end up bogging down the server
with queries that run for hours and hours. And, of course, what users
tend to do is
Tom Lane wrote:
You should be using SIGINT, not SIGTERM.
regards, tom lane
Thank you very much for clarifying this point! It works :)
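The same cancel can also be issued from SQL without shelling out: pg_cancel_backend() sends SIGINT to a backend (present in 8.0, superuser only; the pid is illustrative):

    SELECT pg_cancel_backend(12345);  -- cancels that backend's current query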
[EMAIL PROTECTED] wrote:
Both databases are live but are used for two different web apps. My company doesn't want to spend more to buy a new server, so I am thinking of implementing both under the same server and one instance..
Just as an anecdote, I am running 30 databases on a single i
. I've had to tune shmmax on Linux machines before, but I'm new to AIX and not sure if this is even required on that platform. Google has not been much help for specifics here.
Hoping someone else here has a similar platform and can offer some advice..
Thanks!
-
memory and run some benchmarks. For the price of these things,
they better be some good marks!
Thanks again
-Dan Harris
I wanted to solicit some opinions on architecture and performance from
you guys.
I am torn right now between these two systems to replace my aging DB server:
4 x 2.2 GHz Opteron
8GB RAM
Ultra320 15kRPM RAID5 with 128MB cache
and
2-way 1.2GHz POWER4+ IBM pSeries 615
8GB RAM
Ultra320 15kRPM RAID5 w
Greetings,
I have been beating myself up today trying to optimize indices for a
query that uses LIKE. In my research I have read that the locale
setting may affect PostgreSQL's choice of seq scan vs index scan. I am
running Fedora Core 2 and it appears when I run "locale" that it is set
to 'e
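The standard fix when the locale is not C is an operator-class index, which supports left-anchored LIKE regardless of locale; a sketch with hypothetical names:

    CREATE INDEX mytable_name_like ON mytable (name varchar_pattern_ops);

    -- left-anchored patterns can then use the index
    SELECT * FROM mytable WHERE name LIKE 'CORV%';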
I've been fighting with the common workarounds for inadequate response times on select count(*) and min()/max() on tables with tens of millions of rows for quite a while now, and I understand the reasons for the table scans.
I have applications that regularly poll a table ( ideally, the more
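For the min()/max() half, the classic workaround before the planner learned the rewrite in 8.1 is ORDER BY ... LIMIT 1, which walks one end of the index; a sketch (assumes an indexed, NOT NULL column on a hypothetical table):

    SELECT id FROM bigtable ORDER BY id ASC  LIMIT 1;  -- min(id)
    SELECT id FROM bigtable ORDER BY id DESC LIMIT 1;  -- max(id)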
Laszlo Nagy wrote:
Question 4. How to make the partitions? This is the hardest question.
Here is my plan:
- the OS resides on 2 disks, RAID 1
- the databases should go on 8 disks, RAID 0 + 1
Make sure you understand the difference between RAID 1+0 and RAID 0+1.
I suspect you'll end up going
[EMAIL PROTECTED] wrote:
3) Build a table with totals or maybe subtotals, updated by triggers. This requires serious changes in the application as well as in the database, but it solves the issues of 1) and may give you even better results.
Tomas
I have tried this. It's not a magic bullet. We do our bil
Kenneth Marshall wrote:
Dan,
Did you try this with 8.3 and its new HOT functionality?
Ken
I did not. I had to come up with the solution before we were able to move to 8.3. But Tom did mention that HOT might help, and I forgot about that when writing the prior message. I'm in the
My company finally has the means to install a new database server for
replication. I have Googled and found a lot of sparse information out
there regarding replication systems for PostgreSQL and a lot of it looks
very out-of-date. Can I please get some ideas from those of you that
are current
I'm looking for advice from the I/O gurus who have been in the SSD game
for a while now.
I understand that the majority of consumer-grade SSD drives lack the required capacitor to complete a write on a sudden power loss. But what about pairing up with a hardware controller with BBU write cac