Hello,
I'm the fellow who was interviewed in the fall about using PostgreSQL on
1-800-Save-A-Pet.com:
http://techdocs.postgresql.org/techdocs/interview-stosberg.php
The site traffic continues to grow, and we are now seeing parts of the
day where the CPU load (according to MRTG graphs) on the data
Neil Conway wrote:
> Mark Stosberg wrote:
>> I've used PQA to analyze my queries and am happy overall with how they are
>> running. About 55% of the query time is going to variations of the pet
>> searching query, which seems like where it should be going. The query is
Hello,
I'm working on setting up replication with Slony, and will soon have a
slave that a lot of SELECT traffic will be sent to (over 500k/day).
The primary query we need to run is somewhat complex, but seems to
complete on average in well under a second.
However, every so often (less than 1 in
Hello,
I'm using geo_distance() from contrib/earthdistance and would like to find a
way to speed up the geo distance calculation if possible. This is for a
proximity search: "Show me adoptable pets within 250 miles of this
zipcode".
I'm researched a number of approaches to this, but none seem as wor
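For reference, a minimal earthdistance-based version of that kind of search might look like the sketch below; the pets/shelters/zipcodes table and column names here are assumptions about the schema, not the actual one:

```sql
-- geo_distance(point, point) from contrib/earthdistance returns
-- statute miles, so a 250-mile radius filter is simply (sketch only):
SELECT p.pet_id
  FROM pets p
  JOIN shelters s USING (shelter_id)
  JOIN zipcodes z ON z.zipcode = s.zipcode
 WHERE geo_distance(z.lon_lat,
                    (SELECT lon_lat FROM zipcodes WHERE zipcode = '90210')
       ) <= 250;
```

The downside, as the thread goes on to explore, is that geo_distance() must be computed for every candidate row; it cannot use an index to prune by distance.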
Bruno Wolff III wrote:
> On Sat, Feb 03, 2007 at 14:00:26 -0500,
> Mark Stosberg <[EMAIL PROTECTED]> wrote:
>> I'm using geo_distance() from contrib/earthdistance and would like to find
>> a way to speed up the geo distance calculation if possible. This is for a
Merlin Moncure wrote:
> On 2/5/07, Mark Stosberg <[EMAIL PROTECTED]> wrote:
>> Bruno Wolff III wrote:
>> > On Sat, Feb 03, 2007 at 14:00:26 -0500,
>> > Mark Stosberg <[EMAIL PROTECTED]> wrote:
>> >> I'm using geo_distance() from contrib/ear
Bruno Wolff III wrote:
>
> Some people here may be able to tell you more if you show us explain
> analyze output.
Here is my explain analyze output. Some brief context of what's going
on. The goal is to find "Pets Near You".
We join the pets table on the shelters table to get a zipcode, and then
Mark Stosberg wrote:
>
> If I'm reading the explain analyze output correctly, nearly all of
> the time spent is related to the 'pets' table, but I can't see what to
> do about it.
Something about typing that message jarred my brain into thinking to try:
VACUUM FULL
Hello,
I wanted to share something else I learned in my proximity search work.
One of my requirements is to order by the distance that matches are found
from the center point.
When I did this using earth_distance(), the benefit of the earth_box()
technique over the old geo_distance() became minimal as
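The earth_box() variant being compared is roughly this shape (a sketch; the zipcodes table and earth_coords cube column are assumptions about the schema):

```sql
-- cube-based search: earth_box() prunes candidates via a GiST index,
-- then earth_distance() (in meters) computes the exact radius for sorting
SELECT zipcode,
       earth_distance(earth_coords, ll_to_earth(39.83, -84.89)) / 1609.344
           AS miles
  FROM zipcodes
 WHERE earth_box(ll_to_earth(39.83, -84.89), 250 * 1609.344) @> earth_coords
 ORDER BY miles;
```

The index only helps the WHERE clause; the ORDER BY still requires computing the exact distance for every row that survives the bounding-box filter, which is why the sorting step erodes the speed-up.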
[EMAIL PROTECTED] wrote:
> Hi all,
>
> I'm fairly new to SQL, so this is probably a dumb way to form this
> query, but I don't know another.
>
> I want to see the usernames of all the users who have logged on
> today. "users" is my table of users with id's and username's.
> "session_stats" is m
With the help of some of this list, I was able to successfully set up
and benchmark a cube-based replacement for geo_distance() calculations.
On a development box, the cube-based variations benchmarked consistently
running in about 1/3 of the time of the geo_distance() equivalents.
After settin
[EMAIL PROTECTED] wrote:
> I have this function in my C#.NET app that goes out to find the
> business units for each event and returns a string (for my report).
> I'm finding that for larger reports it takes too long and times out.
>
> Does anyone know how I can speed this process up? Is this cod
Merlin Moncure wrote:
> On 2/10/07, Mark Stosberg <[EMAIL PROTECTED]> wrote:
>>
>> With the help of some of this list, I was able to successfully set up
>> and benchmark a cube-based replacement for geo_distance() calculations.
>>
>> On a development box
Merlin--
Thanks so much for your help. Some follow-ups are below.
Merlin Moncure wrote:
>
>> Here's the basic query I'm using:
>> SELECT
>> -- 1609.344 is a constant for "meters per mile"
>> cube_distance( (SELECT earth_coords from zipcodes WHERE zipcode =
>> '90210') , earth_coords)/1609.344
>>
On Tue, Feb 13, 2007 at 09:31:18AM -0500, Merlin Moncure wrote:
>
> >my mistake, i misunderstood what you were trying to do...can you try
> >removing the 'order by radius' and see if it helps? if not, we can try
> >working on this query some more. There is a better, faster way to do
> >this, I'm s
Merlin Moncure wrote:
On 2/14/07, Tom Lane <[EMAIL PROTECTED]> wrote:
There are two things wrong here: first, that the estimated row
count is only 20% of actual; it should certainly not be that far
off for such a simple condition. I wonder if your vacuum/analyze
procedures are actually working.
On Wed, Feb 14, 2007 at 01:07:23PM -0500, Tom Lane wrote:
> Mark Stosberg <[EMAIL PROTECTED]> writes:
> > Your suggestion about the pet_state index was right on. I tried
> > "Analyze" on it, but still got the same bad estimate. However, I then
> > used "
Tom Lane wrote:
> Mark Stosberg <[EMAIL PROTECTED]> writes:
>> Your suggestion about the pet_state index was right on. I tried
>> "Analyze" on it, but still got the same bad estimate. However, I then
>> used "reindex" on that index, and that fixed
Ray Stell wrote:
> I'd like to have a toolbox prepared for when performance goes south.
> I'm clueless. Would someone mind providing some detail about how to
> measure these four items Craig listed:
>
> 1. The first thing is to find out which query is taking a lot of time.
>
> 2. A long-running
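For item 1, one common starting point is the statement-duration log combined with a look at live activity. A sketch, using the pre-9.2 column names of that era:

```sql
-- In postgresql.conf, log any statement running over 250 ms:
--   log_min_duration_statement = 250
-- Then inspect what is running right now:
SELECT procpid, now() - query_start AS runtime, current_query
  FROM pg_stat_activity
 ORDER BY runtime DESC;
```

The log output can then be aggregated with a tool like pgFouine or PQA to rank queries by total time consumed.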
Our application has a table that is only logged to, and infrequently
used for reporting. There are generally no deletes or updates.
Recently, the sheer size (an estimated 36 million rows) caused a serious
problem because it prevented a "vacuum analyze" on the whole database
from finishing in a timely
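One way out of that bind is to vacuum the big, mostly-append table on its own schedule instead of as part of a database-wide pass (the table name below is a stand-in):

```sql
-- Per-table vacuum can be scheduled independently and less often,
-- since an insert-only table accumulates little dead space:
VACUUM ANALYZE activity_log;
```

A database-wide "vacuumdb --analyze" then no longer has to wait on the one enormous table each night.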
When I upgraded a busy database system to PostgreSQL 8.1, I was excited
about AutoVacuum, and promptly enabled it, and turned off the daily
vacuum process.
(
I set the following, as well as the option to enable auto vacuuming
stats_start_collector = true
stats_row_level = true
)
I could see in t
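On 8.1 the full set of postgresql.conf settings needed is roughly the following; autovacuum silently does nothing without the row-level stats collector:

```
# 8.1 requires the stats collector for autovacuum to work at all
stats_start_collector = true
stats_row_level       = true
autovacuum            = on
```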
Alvaro Herrera wrote:
> Mark Stosberg wrote:
>> When I upgraded a busy database system to PostgreSQL 8.1, I was excited
>> about AutoVacuum, and promptly enabled it, and turned off the daily
>> vacuum process.
>>
>> (
>> I set the following, as well
Thanks to everyone for the feedback about vacuuming. It's been very
useful. The pointers to the pgstattuple and Pgfouine tools were also
helpful.
I'm now considering the following plan for trying Autovacuuming again
with 8.1. I'd like any peer review you have to offer of the following:
1. First,
Hello,
I'm trying to make sense of the memory usage reported by 'top', compared
to what "pg_database_size" shows. Here's one result:
select pg_size_pretty(pg_database_size('production'));
 pg_size_pretty
----------------
 6573 MB
Now, looking at memory use with "top", there is a lot of memory tha
Joshua D. Drake wrote:
> Mark Stosberg wrote:
>> Hello,
>>
>> I'm trying to make sense of the memory usage reported by 'top', compared
>> to what "pg_database_size" shows. Here's one result:
>
>
> You are missing the
Joshua D. Drake wrote:
> Madison Kelly wrote:
>> Hi all,
>>
>> I am asking in this list because, at the end of the day, this is a
>> performance question.
>>
>> I am looking at writing a search engine of sorts for my database. I
>> have only ever written very simple search engines before which
Madison Kelly wrote:
>
> I think the more direct question I was trying to get at is "How do you
> build a 'relevance' search engine? One where results are returned/sorted
> by relevance of some sort?". At this point, the best I can think of,
> would be to perform multiple queries; first matching
>
> Now see, this is exactly the kind of sagely advice I was hoping for! :)
>
> I'll look into tsearch2, and failing that for some reason, I love the
> keyword table idea.
For example keyword search code, you can try this package:
http://downloads.sourceforge.net/cascade/cascade-devel-pieces-1.1
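A minimal tsearch2-style relevance query, for flavor; the docs table and body_tsv tsvector column are assumptions, and 8.3+ builds the same functions in as core full-text search:

```sql
-- match with @@ and rank by relevance; in contrib/tsearch2 the
-- ranking function over a tsvector column is rank() (or rank_cd())
SELECT title, rank(body_tsv, q) AS relevance
  FROM docs, to_tsquery('adoptable & pets') AS q
 WHERE body_tsv @@ q
 ORDER BY relevance DESC;
```

A GiST or GIN index on the tsvector column makes the @@ match fast; the ranking is then computed only for matching rows.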
your help!
Mark
. . . . . . . . . . . . . . . . . . . . . . . . . . .
Mark Stosberg            Principal Developer
m...@summersault.com Summersault, LLC
765-939-9301 ext 202 database driven websites
. . . . . http://www.summersault.com/ . . . . . . . .
--
Sent via pgsql-performance mailing list (pgsql-performanc
> You can use TRUNCATE instead of DELETE. TRUNCATE is more efficient and
> faster than DELETE.
Thanks for the suggestion. However, TRUNCATE is not compatible with
Slony, and we also have some rows which need to remain in the table.
> Now, we need more information about your system to give you a certain
> so
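Since only part of the table is being cleared, the operation stays a plain conditional DELETE, which Slony replicates normally; the table and column names below are stand-ins:

```sql
-- TRUNCATE is not propagated by Slony here, and would also remove
-- the rows we need to keep, so a conditional DELETE is used instead:
DELETE FROM saved_searches
 WHERE last_run < now() - interval '90 days';
```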
Each night we run over a 100,000 "saved searches" against PostgreSQL
9.0.x. These are all complex SELECTs using "cube" functions to perform a
geo-spatial search to help people find adoptable pets at shelters.
All of our machines, in development and in production, have at least 2 cores
in them, and I'm
On 02/03/2011 10:54 AM, Oleg Bartunov wrote:
> Mark,
>
> you could try gevel module to get structure of GIST index and look if
> items distributed more or less homogenous (see different levels). You
> can visualize index like http://www.sai.msu.su/~megera/wiki/Rtree_Index
> Also, if your searches
On 02/03/2011 10:57 AM, gnuo...@rcn.com wrote:
> For PG to prosper in the future, it has to embrace the
> multi-core/processor/SSD machine at the query level
As the person who brought up the original concern, I'll add that
"multi-core at the query level" really isn't important for us. Most of
our
We perform over 1,000,000 searches each day for "adoptable shelter pets
near your zipcode". We already have adequate performance for these
searches using the "cube" contrib, but the new KNN work in 9.1 seemed
like it might be a promising way to speed this up even further.
I installed PostgreSQL 9
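The 9.1 KNN-GiST feature lets the distance operator appear directly in ORDER BY, so the index itself can return rows in distance order. A sketch, assuming a point column like the one shown later in the thread:

```sql
-- requires a GiST index on the point column; <-> is the
-- point-to-point distance operator the index can order by
CREATE INDEX zipcodes_lonlat_gist ON zipcodes USING gist (lonlat_point);

SELECT zipcode
  FROM zipcodes
 ORDER BY lonlat_point <-> point(-84.89, 39.83)
 LIMIT 20;
```

With a LIMIT, the index scan can stop after the nearest N rows, rather than computing distances for every candidate and sorting.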
> I thought the benefit of KNN was that you could retrieve the rows in
> distance order, so that a query for the closest 20 locations (for
> example) would be very fast. I wouldn't have expected it to be
> helpful when you're selecting all the rows regardless of distance.
Kevin,
Thanks for the
> PostGIS, geometry columns, and UTM.. I'm not sure where they are wrt
> adding KNN support, but it's something they've been anxious to have for
> a while, so I expect support will come quickly.
I've looked into this a little more.
One approach seems to be to project the lat/long pairs on to a f
I tried again to use KNN for a real-world query, and was able to get an
approximately 6x speed-up vs. the cube search or earthdistance methods
(from 300 ms to 50 ms).
I had to make some notable changes for the KNN index to be considered.
- Of course, I had to switch to using basic po
On 02/17/2011 03:17 PM, Oleg Bartunov wrote:
> Mark,
>
> we investigating pgsphere http://pgsphere.projects.postgresql.org/, if
> we could add KNN support.
Great, thanks Oleg.
I'll be happy to test it when something is ready.
Mark
Thanks!
Mark
The http://www.1-800-save-a-pet.com site is hosted with FreeBSD and
PostgreSQL, and has a geo-spatial search as a central feature.
One thing that made a substantial performance improvement was switching
from the "geo_distance()" search in the earthdistance contrib, to use
the "cube" based geo-spati
When traffic to our PostgreSQL-backed website spikes, the first resource
we see being exhausted is the DB slots on the master server (currently
set to about 400).
I expect that as new Apache/mod_perl children are being put to use, they
are creating new database connections.
I'm interested in rec
I would stick to proven postgresql technologies such as pgbouncer.
Thanks for the fast recommendation, Joshua. I'll consider it.
Our application is Slony-replicated web/db project with two slaves.
Does this design seem sensible?
- Run one pgbouncer server on the master, with settings to
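For context, the pgbouncer knobs involved look roughly like this; the values below are purely illustrative, not a recommendation for this workload:

```ini
; pgbouncer.ini sketch (hostnames and sizes are made up)
[databases]
production = host=127.0.0.1 port=5432 dbname=production

[pgbouncer]
pool_mode         = transaction
max_client_conn   = 1000
default_pool_size = 50
```

The key idea is that many Apache/mod_perl clients share a much smaller pool of real server connections, so a traffic spike no longer translates directly into exhausted DB slots.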
Oleg Bartunov wrote:
Mark,
do you know about our sky segmentation code Q3C,
see details http://www.sai.msu.su/~megera/wiki/SkyPixelization
We use it for billions of objects in the database and are quite happy.
Oleg,
Thanks for the response. That sounds interesting, but it's not clear to
me how I would
Christian Bourque wrote:
Any idea? Is there any other improvements I could do?
Are you using the "COPY" syntax in the import script or individual
insert statements? Using COPY will always be *much* faster.
I believe COPY always appends to tables rather than replacing the
contents; you can
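For comparison, the two load styles look like the following; the table, columns, and file path are made up:

```sql
-- row-by-row: per-statement parse/plan overhead for every row
INSERT INTO pets (name, shelter_id) VALUES ('Rex', 42);

-- bulk load: COPY appends the whole file in one command;
-- to replace existing contents, TRUNCATE (or DELETE) first
COPY pets (name, shelter_id) FROM '/tmp/pets.tsv';
```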
Under heavy load, Apache has the usual failure mode of spawning so
many threads/processes and database connections that it just exhausts
all the memory on the webserver and also kills the database.
As usual, I would use lighttpd as a frontend (also serving static
files) to handle the l
Thanks for the response, Matthew.
> On Thu, 10 Dec 2009, Mark Stosberg wrote:
> > What I'm noticing is that the while the FreeBSD server has 4 Gigs of
> > memory, there are rarely ever more than 2 in use -- the memory use
> > graphs as being rather constant. My goal is
             Table "public.zipcodes"
    Column    |         Type         | Modifiers
--------------+----------------------+-----------
 zipcode      | character varying(5) | not null
 lonlat_point | geometry(Point,4326) |
Indexes:
    "zipcodes_pkey" PRIMARY KEY, btree (zipco