[PERFORM] Is Vacuum/analyze destroying my performance?

2006-12-03 Thread Carlo Stonebanks
I have always been frustrated by the wildly erratic performance of our PostgreSQL 8 server. We run a program that does heavy data importing via a heuristics-based import program. Sometimes records being imported would just fly by, sometimes they would crawl. The import program imports records from

Re: [PERFORM] Is Vacuum/analyze destroying my performance?

2006-12-03 Thread Carlo Stonebanks
Update on this issue: I "solved" my problem by doing the following: 1) Stopped the import, and did a checkpoint backup on my import target schema 2) Dropped the import target schema 3) Restored a backup from a previous checkpoint when the tables were much smaller 4) Performed a VACUUM/ANALYZE on

Re: [PERFORM] Is Vacuum/analyze destroying my performance?

2006-12-04 Thread Carlo Stonebanks
""Matthew O'Connor"" wrote in message news:[EMAIL PROTECTED] > Just a wild guess, but the performance problem sounds like maybe as your > data changes, eventually the planner moves some query from an index scan > to a sequential scan, do you have any details on what queries are taking > so lon

[PERFORM] Fast tsearch2, trigram matching on short phrases

2007-08-22 Thread Carlo Stonebanks
I have read that trigram matching (similarity()) performance degrades when the matching is on longer strings such as phrases. I need to quickly match strings and rate them by similarity. The strings are typically one to seven words in length - and will often include unconventional abbreviations
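
A minimal sketch of the pg_trgm approach under discussion, assuming the contrib module is installed; the table and column names here are hypothetical, not from the thread:

  -- GiST index using the trigram operator class
  CREATE INDEX idx_provider_name_trgm
      ON provider USING gist (practice_name gist_trgm_ops);

  -- rank candidate rows by similarity to a short phrase
  SELECT practice_name,
         similarity(practice_name, 'intl assoc cardiology') AS score
    FROM provider
   WHERE practice_name % 'intl assoc cardiology'  -- % applies the similarity threshold
   ORDER BY score DESC
   LIMIT 20;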

Re: [PERFORM] Fast tsearch2, trigram matching on short phrases

2007-08-22 Thread Carlo Stonebanks
was oriented towards word matching, not phrase matching. Carlo "Steinar H. Gunderson" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] On Wed, Aug 22, 2007 at 12:02:54PM -0400, Carlo Stonebanks wrote: Any suggestions on where to go with this project to im

Re: [PERFORM] Fast tsearch2, trigram matching on short phrases

2007-08-22 Thread Carlo Stonebanks
Hi Oleg, you didn't show us explain analyze of your select. I didn't because I didn't expect any reaction to it - my understanding is that trigram matching for phrases is not recommended because of the performance. Do you believe that I SHOULD expect good performance from trigram matching o

Re: [PERFORM] Fast tsearch2, trigram matching on short phrases

2007-08-22 Thread Carlo Stonebanks
>> The problem is in idea, not in performance. Oh, I think we both agree on that! ;-D This is why I didn't post any EXPLAINs or anything like that. I thought the problem was in the entire method of how to best zero in on the set of records best suited for closer analysis by my phrase-matching f

[PERFORM] Performance on 8CPU's and 32GB of RAM

2007-09-04 Thread Carlo Stonebanks
A client is moving their postgresql db to a brand new Windows 2003 x64 server with 2 quad cores and 32GB of RAM. It is a dedicated server to run 8.2.4. The server typically will have less than 10 users. The primary use of this server is to host a database that is continuously being updated by

Re: [PERFORM] Performance on 8CPU's and 32GB of RAM

2007-09-05 Thread Carlo Stonebanks
ing RAID 1. -Original Message- From: Scott Marlowe [mailto:[EMAIL PROTECTED] Sent: September 4, 2007 7:15 PM To: Alvaro Herrera Cc: Carlo Stonebanks; pgsql-performance@postgresql.org Subject: Re: [PERFORM] Performance on 8CPU's and 32GB of RAM On 9/4/07, Alvaro Herrera <[EMAIL PROTECTE

Re: [PERFORM] Performance on 8CPU's and 32GB of RAM

2007-09-05 Thread Carlo Stonebanks
Right, additionally NTFS is really nothing to use on any serious disc array. Do you mean that I will not see any big improvement if I upgrade the disk subsystem because the client is using NTFS (i.e. Windows)?

Re: [PERFORM] Performance on 8CPU's and 32GB of RAM

2007-09-05 Thread Carlo Stonebanks
>> Large shared_buffers and Windows do not mix. Perhaps you should leave the shmem config low, so that the kernel can cache the file pages. << Is there a problem BESIDES the one that used to cause Windows to fail to allocate memory in blocks larger than 1.5GB? The symptom of this problem was tha

Re: [PERFORM] Performance on 8CPU's and 32GB of RAM

2007-09-06 Thread Carlo Stonebanks
Wow - it's nice to hear someone say that... out loud. Thanks, you gave me hope! -Original Message- From: James Mansion [mailto:[EMAIL PROTECTED] Sent: September 6, 2007 4:55 PM To: Carlo Stonebanks Cc: Scott Marlowe; Alvaro Herrera; pgsql-performance@postgresql.org Subject: Re: [PE

[PERFORM] How planner decides left-anchored LIKE can use index

2007-09-06 Thread Carlo Stonebanks
Exactly when does the planner decide that a left-anchored LIKE can use the index? I have replaced a WHERE lower(last_name) = 'smith' with WHERE lower(last_name) LIKE 'smith%' There is an index on lower(last_name). I have seen the planner convert the LIKE to lower(last_name) >= 'smith' and lower(
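
For reference, a sketch of the pattern being asked about (table name hypothetical). Note that the planner only performs the LIKE-to-range rewrite when the index collation is C or the index uses a pattern operator class:

  CREATE INDEX idx_person_lower_last_name
      ON person (lower(last_name) text_pattern_ops);

  -- left-anchored pattern the planner can turn into a range scan:
  SELECT * FROM person WHERE lower(last_name) LIKE 'smith%';
  -- roughly equivalent to:
  --   lower(last_name) >= 'smith' AND lower(last_name) < 'smiti'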

Re: [PERFORM] Performance on 8CPU's and 32GB of RAM

2007-09-06 Thread Carlo Stonebanks
<< If what you mean is that pg has a design that's heavily oriented towards things that tend to be cheap on POSIX and doesn't use the core Win32 features effectively, then let's track that as an optimisation opportunity for the Win32 port. >> Isn't it just easier to assume that Windows Server ca

[PERFORM] random_page_costs - are defaults of 4.0 realistic for SCSI RAID 1

2007-09-10 Thread Carlo Stonebanks
Can anyone answer this for me: Although I realize my client's disk subsystem (SCSI/RAID Smart Array E200 controller using RAID 1) is less than impressive - is the default setting of 4.0 realistic or could it be lower? Thanks!
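
Not an answer from the thread, just an illustration: the setting can be tried per session before committing it to postgresql.conf, and values around 2.0-3.0 are often suggested when most reads come from cache:

  SET random_page_cost = 2.0;  -- session-level, for comparing plans on a test query
  -- if it helps, make it permanent in postgresql.conf: random_page_cost = 2.0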

[PERFORM] Query works when kludged, but would prefer "best practice" solution

2007-09-17 Thread Carlo Stonebanks
Hi all, Please see the section marked as "PROBLEM" in "ORIGINAL QUERY" plan below. You can see it's pretty slow. Oddly enough, an index for facility_address_id is available but not being used, but I suspect it's questionable whether it would be an improvement. I knew that the filter was best

Re: [PERFORM] Query works when kludged, but would prefer "best practice" solution

2007-09-17 Thread Carlo Stonebanks
Well, there goes my dream of getting a recommendation that will deliver a blinding insight into how to speed up all of my queries a thousand-fold. Thanks Merlin! -Original Message- From: Merlin Moncure [mailto:[EMAIL PROTECTED] Sent: September 17, 2007 8:03 PM To: Carlo Stonebanks Cc

Re: [PERFORM] Query works when kludged, but would prefer "best practice" solution

2007-09-17 Thread Carlo Stonebanks
Thanks, it worked. Client happy. Big bonus in the mail. -Original Message- From: Merlin Moncure [mailto:[EMAIL PROTECTED] Sent: September 17, 2007 8:18 PM To: Carlo Stonebanks Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Query works when kludged, but would prefer "

Re: [PERFORM] Query works when kludged, but would prefer "best practice" solution

2007-09-17 Thread Carlo Stonebanks
(Yep, I have a dream -- and in this dream Tom writes a brilliant three-line code sample that makes it all clear to me, and I wake up a PostgreSQL guru) ;-) Carlo -Original Message- From: Tom Lane [mailto:[EMAIL PROTECTED] Sent: September 17, 2007 11:30 PM To: Merlin Moncure Cc: Carlo

[PERFORM] Nested loops row estimates always too high

2007-09-18 Thread Carlo Stonebanks
I am noticing that my queries are spending a lot of time in nested loops. The table/index row estimates are not bad, but the nested loops can be off by a factor of 50. In any case, they are always too high. If this is always occurring, is this an indication of a general configuration problem?

[PERFORM] Performance improves only after repeated VACUUM/ANALYZE

2007-09-18 Thread Carlo Stonebanks
My client "publishes" an "edition" of their DB from his production site to his hosted web/db server. This is done by FTPing a backup of the DB to his hosting provider. Immediately after a "publication" (restore to web/db server) we immediately run VACUUM ANALYZE to make sure the statistics and row

[PERFORM] REPOST: Nested loops row estimates always too high

2007-09-20 Thread Carlo Stonebanks
(SORRY FOR THE REPOST, I DON'T SEE MY ORIGINAL QUESTION OR ANY ANSWERS HERE) I am noticing that my queries are spending a lot of time in nested loops. The table/index row estimates are not bad, but the nested loops can be off by a factor of 50. In any case, they are always too high. Are the o

[PERFORM] REPOST: Performance improves only after repeated VACUUM/ANALYZE

2007-09-20 Thread Carlo Stonebanks
(SORRY FOR THE REPOST, I DON'T SEE MY ORIGINAL QUESTION OR ANY ANSWERS HERE) My client "publishes" an "edition" of their DB from their production site to their hosted web/db server. This is done by FTPing a backup of the DB to their hosting provider. Immediately after a "publication" (restore to we

Re: [PERFORM] REPOST: Nested loops row estimates always too high

2007-09-24 Thread Carlo Stonebanks
Has anyone offered any answers to you? No one else has replied to this post. "Ow Mun Heng" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] On Thu, 2007-09-20 at 11:02 -0400, Carlo Stonebanks wrote: (SORRY FOR THE REPOST, I DON'T SEE MY ORIGINAL QUESTION OR A

[PERFORM] Acceptable level of over-estimation?

2007-09-24 Thread Carlo Stonebanks
Is there a rule of thumb about when the planner's row estimates are too high? In particular, when should I be concerned that the planner's estimated number of rows for a nested loop is off? By a factor of 10? 100? 1000? Carlo

Re: [PERFORM] REPOST: Nested loops row estimates always too high

2007-09-24 Thread Carlo Stonebanks
a great way to place the enterprise's db-centric business logic at the server. Carlo -Original Message- From: Ow Mun Heng [mailto:[EMAIL PROTECTED] Sent: September 24, 2007 8:51 PM To: Carlo Stonebanks Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] REPOST: Nested loops

[PERFORM] default_statistics_target

2010-03-14 Thread Carlo Stonebanks
Hi people, The whole topic of messing with stats makes my head spin but I am concerned about some horridly performing queries that have had bad row estimates and others which always choose seq scans when indexes are available. Reading up on how to improve planner estimates, I have seen refere
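
For illustration only, the two levels at which the statistics target can be raised; the column and table names are hypothetical:

  -- globally, in postgresql.conf:
  --   default_statistics_target = 100
  -- or per column, which keeps ANALYZE cheap everywhere else:
  ALTER TABLE mdx_core.facility ALTER COLUMN postal_code SET STATISTICS 250;
  ANALYZE mdx_core.facility;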

Re: [PERFORM] default_statistics_target

2010-03-22 Thread Carlo Stonebanks
_counts = on vacuum_cost_delay = 5 # 0-1000 milliseconds wal_buffers = 4MB wal_sync_method = open_sync work_mem = 64MB Carlo "Greg Smith" wrote in message news:4b9e33af.2020...@2ndquadrant.com... Carlo Stonebanks wrote: The whole topic of messing with stats makes my head spin but I am concerned

[PERFORM] Got that new server, now it's time for config!

2010-03-22 Thread Carlo Stonebanks
Here we go again! Based on recommendations made here, I got my client to migrate off of our Windows 2003 Server x64 box to a new Linux box. # CENTOS 5.4 # Linux mdx_octo 2.6.18-164.el5 #1 SMP Thu Sep 3 03:28:30 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux # pgsql 8.3.10, 8 CPUs, 48GB RAM # RAID 1

[PERFORM] Function scan/Index scan to nested loop

2010-05-10 Thread Carlo Stonebanks
Hello all, A query run twice in succession performs VERY poorly the first time as it iterates through the nested loop. The second time, it rips. Please see SQL, SLOW PLAN and FAST PLAN below. I don't know why these nested loops are taking so long to execute. " -> Nested Loop (cost=0.00..42

Re: [PERFORM] Function scan/Index scan to nested loop

2010-05-11 Thread Carlo Stonebanks
Thanks Scott, >> This is almost always due to caching. First time the data aren't in the cache, second time they are. << I had assumed that it was caching, but I don't know from where because of the inexplicable delay. Hardware? O/S (Linux)? DB? From the function, which is IMMUTABLE? I am co

[PERFORM] Does FILTER in SEQSCAN short-circuit AND?

2010-05-27 Thread Carlo Stonebanks
Sample code: SELECT * FROM MyTable WHERE foo = 'bar' AND MySlowFunc('foo') = 'bar' Let's say this required a SEQSCAN because there were no indexes to support column foo. For every row where foo <> 'bar' would the filter on the SEQSCAN short-circuit the AND and return false right away, or would it
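
One commonly cited workaround (not from this thread) when the planner's ordering of the two conditions cannot be relied on: fence the cheap filter in a subquery with OFFSET 0 so it is applied before the expensive function is called:

  SELECT *
    FROM (SELECT * FROM MyTable WHERE foo = 'bar' OFFSET 0) AS t
   WHERE MySlowFunc('foo') = 'bar';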

[PERFORM] Very bad plan when using VIEW and IN (SELECT...*)

2010-08-12 Thread Carlo Stonebanks
Ref these two queries against a view: -- QUERY 1, executes < 0.5 secs SELECT * FROM mdx_core.vw_provider AS p WHERE provider_id IN (13083101) -- QUERY 2, executes > 13.5 secs SELECT * FROM mdx_core.vw_provider AS p WHERE provider_id IN (SELECT 13083101) I am using the simple IN (SELECT n) in QU
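
Two hedged rewrites that have avoided this kind of plan in similar cases; whether either helps here depends on how the view is defined:

  SELECT * FROM mdx_core.vw_provider AS p WHERE provider_id = 13083101;
  SELECT * FROM mdx_core.vw_provider AS p WHERE provider_id IN (VALUES (13083101));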

Re: [PERFORM] Very bad plan when using VIEW and IN (SELECT...*)

2010-08-13 Thread Carlo Stonebanks
didn't want to throw too much info as my concern was actually whether views were as clunky as on other DB platforms. Carlo -Original Message- From: Kevin Grittner [mailto:kevin.gritt...@wicourts.gov] Sent: August 13, 2010 9:29 AM To: pgsql-performance@postgresql.org; Carlo Stonebanks Subjec

[PERFORM] Performace Optimization for Dummies

2006-09-28 Thread Carlo Stonebanks
I am a software developer who is acting in a (temporary) dba role for a project. I had recommended PostgreSQL be brought in to replace the proposed MySQL DB - I chose PostgreSQL because of its reputation as a more stable solution than MySQL. At this early stage in the project, we are initializi

Re: [PERFORM] Performace Optimization for Dummies

2006-09-28 Thread Carlo Stonebanks
> How are you loading the tables? Copy? Insert? Once the data is transformed, it is inserted. I don't have stats, but the program's visual feedback does not spend a lot of time on the "inserting data" message. Then again, if there is an asynchronous component to an insert, perhaps I am not seein

Re: [PERFORM] Performace Optimization for Dummies

2006-09-28 Thread Carlo Stonebanks
> are you using the 'copy' interface? Straightforward inserts - the import data has to be transformed, normalised and de-duped by the import program. I imagine the copy interface is for more straightforward data importing. These are - by necessity - single-row inserts. > thats a tough question.

Re: [PERFORM] Performace Optimization for Dummies

2006-09-28 Thread Carlo Stonebanks
> Are you wrapping all this in a transaction? Yes, the transactions can typically wrap 1 to 10 single-table, single-row inserts and updates. > You're doing some dynamically generated selects as part of the > "de-duping" process? They're probably the expensive bit. What > do those queries tend

Re: [PERFORM] Performace Optimization for Dummies

2006-09-28 Thread Carlo Stonebanks
> So your program first transforms the data and then inserts it? And it is > the transforming process which is running select statements that is slow? There are cross-referencing and deduplication processes. Does this person have an office at this exact address? In a similarly named building in

Re: [PERFORM] Performace Optimization for Dummies

2006-09-28 Thread Carlo Stonebanks
it? Carlo ""Jim C. Nasby"" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > On Thu, Sep 28, 2006 at 01:53:22PM -0400, Carlo Stonebanks wrote: >> > are you using the 'copy' interface? >> >> Straightforward inserts - the impo

Re: [PERFORM] Performace Optimization for Dummies

2006-09-28 Thread Carlo Stonebanks
eries against the hash tables. This is where something like MySQL's in-memory tables would have come in handy... What is GDB? Carlo "Matthew Nuzum" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > Carlo Stonebanks wrote: >>> are you using the &

Re: [PERFORM] Performace Optimization for Dummies

2006-09-28 Thread Carlo Stonebanks
> Something else to consider... databases love doing bulk operations. It > might be useful to load prospective data into a temporary table, and > then do as many operations as you can locally (ie: within the database) > on that table, hopefully eliminating as many candidate rows as possible > along
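
A rough sketch of the staging-table idea being quoted; the table and column names are hypothetical and the dedup rule is only an example:

  CREATE TEMP TABLE staging_provider (LIKE mdx_core.provider INCLUDING DEFAULTS);
  COPY staging_provider FROM '/tmp/provider_batch.csv' WITH CSV;

  -- knock out obvious duplicates in bulk before touching the target table
  DELETE FROM staging_provider s
   USING mdx_core.provider p
   WHERE s.last_name = p.last_name
     AND s.address_line = p.address_line;

  INSERT INTO mdx_core.provider SELECT * FROM staging_provider;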

Re: [PERFORM] Performace Optimization for Dummies

2006-09-28 Thread Carlo Stonebanks
> imo, the key to high performance big data movements in postgresql is > mastering sql and pl/pgsql, especially the latter. once you get good > at it, your net time of copy+plpgsql is going to be less than > insert+tcl. If this implies bulk inserts, I'm afraid I have to consider something else.

Re: [PERFORM] Performace Optimization for Dummies

2006-09-28 Thread Carlo Stonebanks
> by the way, stats_command_string is a known performance killer that > iirc was improved in 8.2. just fyi. This is a handy fact, I will get on this right away. > bulk load denormalized tables into scratch tables into the postgresql > database. create indexes appropriate to the normalization proces

Re: [PERFORM] Performace Optimization for Dummies

2006-09-28 Thread Carlo Stonebanks
> Don't forget about EXPLAIN ANALYZE, which is related to EXPLAIN but has > nothing to do with the ANALYZE command. Ah, hence my confusion. Thanks for the clarification... I never knew about ANALYZE as a separate command.

Re: [PERFORM] Performace Optimization for Dummies

2006-09-28 Thread Carlo Stonebanks
>> indexes. I don't know whether autovacuum will also analyze tables >> for you automagically, but it would be a good idea to analyze the table > > It does. So, I have checked my log and I see an autovacuum running once every minute on our various databases being hosted on the server - once every

Re: [PERFORM] Performace Optimization for Dummies

2006-10-02 Thread Carlo Stonebanks
. "Heikki Linnakangas" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > Carlo Stonebanks wrote: >> We urgently need a major performance improvement. We are running the >> PostgreSQL 8.1.4 on a Windows 2003 x64 Server on a dual processor, dual >> core

Re: [PERFORM] Performace Optimization for Dummies

2006-10-02 Thread Carlo Stonebanks
> Did you think about putting the whole data into PostgreSQL using COPY in > a nearly unprocessed manner, index it properly, and then use SQL and > stored functions to transform the data inside the database to the > desired result? This is actually what we are doing. The slowness is on the row-by-

Re: [PERFORM] Performace Optimization for Dummies

2006-10-02 Thread Carlo Stonebanks
> My experience with that type of load process is that doing this > row-by-row is a very expensive approach and your results bear that out. I expected this, and had warned the client before the project started that this is exactly where SQL underperforms. > It is often better to write each step

Re: [PERFORM] Performace Optimization for Dummies

2006-10-02 Thread Carlo Stonebanks
> 1. fork your import somehow to get all 4 cores running This is already happening, albeit with only 3. No improvement - it appears we have taken the same problem, and divided it by 3. Same projected completion time. This is really curious, to say the least. > 2. write the code that actually does the

[PERFORM] Performance Optimization for Dummies 2 - the SQL

2006-10-03 Thread Carlo Stonebanks
Some very helpful people had asked that I post the troublesome code that was generated by my import program. I installed a SQL log feature in my import program. I have posted samples of the SQL statements that cause the biggest delays. Thanks for all of your help. Carlo -- Sample 1: Thi

Re: [PERFORM] Performace Optimization for Dummies

2006-10-03 Thread Carlo Stonebanks
> I still think that using a PL in the backend might be more performant > than having an external client, alone being the SPI interface more > efficient compared to the network serialization for external applications. I would actually love for this to work better, as this is technology that I wou

Re: [PERFORM] Performace Optimization for Dummies

2006-10-03 Thread Carlo Stonebanks
> Maybe "group by", "order by", "distinct on" and hand-written functions > and aggregates (like first() or best()) may help. We use these - we have lexical analysis functions which assign a rating to each row in a set, and the likelyhood that the data is a match, and then we sort our results. I

Re: [PERFORM] Performance Optimization for Dummies 2 - the SQL

2006-10-03 Thread Carlo Stonebanks
> explain analyze is more helpful because it prints the times. Sorry, this runs in-line in my code, and I didn't want to slow the already-slow program with explain analyze. I have run it outside of the code in its own query. The new results are below. > sample 1, couple questions: > what is the

Re: [PERFORM] Unsubscribe

2006-10-03 Thread Carlo Stonebanks
I got one of these last Christmas. It works great, but the device has no obvious power source and now I can't find my cat. God help me when I accidentally try to unsubscribe like that .. Carlo <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > On Mon, Oct 02, 2006 at 01:36:17PM -0400,

Re: [PERFORM] Performance Optimization for Dummies 2 - the SQL

2006-10-03 Thread Carlo Stonebanks
Please ignore sample 1 - now that I have the logging feature, I can see that my query generator algorithm made an error. The SQL of concern is now script 2.

Re: [PERFORM] Performance Optimization for Dummies 2 - the SQL

2006-10-04 Thread Carlo Stonebanks
> can you do explain analyze on the two select queries on either side of > the union separately? the subquery is correctly written and unlikely > to be a problem (in fact, good style imo). so let's have a look at > both sides of facil query and see where the problem is. Sorry for the delay, the se

Re: [PERFORM] Performance Optimization for Dummies 2 - the SQL

2006-10-04 Thread Carlo Stonebanks
;-> Index Scan using facility_pkey on facility f (cost=0.00..3.56 rows=1 width=35) (actual time=0.009..0.010 rows=1 loops=65)" " Index Cond: ("outer".facility_id = f.facility_id)" "Total runtime: 217.520 ms" ""Merlin Moncure"

Re: [PERFORM] Performance Optimization for Dummies 2 - the SQL

2006-10-05 Thread Carlo Stonebanks
> do we have a multi-column index on > facility_address(facility_id, address_id)? did you run analyze? There is an index on facility_address on facility_id. I didn't create an index on facility_address.address_id because I expected joins to go in the other direction (from facility_address to a

Re: [PERFORM] Performance Optimization for Dummies 2 - the SQL

2006-10-05 Thread Carlo Stonebanks
Just to clarify: if I join two tables that I expect to benefit from indexed scans, I should create indexes on the joined columns on BOTH sides? Carlo "Tom Lane" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > "Carlo Stonebanks" <[EMAIL
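
For reference, the missing index discussed here would look something like this (schema and table names assumed from the thread):

  CREATE INDEX idx_facility_address_address_id
      ON mdx_core.facility_address (address_id);
  -- together with the existing index on facility_address (facility_id) and the
  -- primary keys on facility and address, joins in either direction are covered.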

Re: [PERFORM] Performance Optimization for Dummies 2 - the SQL

2006-10-05 Thread Carlo Stonebanks
;[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > On 10/5/06, Carlo Stonebanks <[EMAIL PROTECTED]> wrote: >> > do we have an multi-column index on >> > facility_address(facility_id, address_id)? did you run analyze? >> >> There is an index

Re: [PERFORM] Performance Optimization for Dummies 2 - the SQL

2006-10-06 Thread Carlo Stonebanks
f for imports, how frequently should I VACUUM? ""Merlin Moncure"" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > On 10/5/06, Carlo Stonebanks <[EMAIL PROTECTED]> wrote: >> > do we have an multi-column index on >> > facil

Re: [PERFORM] Performance Optimization for Dummies 2 - the SQL

2006-10-06 Thread Carlo Stonebanks
> how did you determine that it is done every 500 rows? this is the The import program pages the import table - it is currently set at 500 rows per page. With each page, I run an ANALYZE. > default autovacuum paramater. if you followed my earlier > recommendations, you are aware that autovacuum

Re: [PERFORM] Performance Optimization for Dummies 2 - the SQL

2006-10-15 Thread Carlo Stonebanks
Hi Merlin, Well, I'm back. First of all, thanks for your dogged determination to help me out - it is much appreciated. I owe you a beer or twelve. The import has been running for a week. The import program got faster as I tuned things. I capture the dynamic SQL statements generated by the app,

Re: [PERFORM] Performance Optimization for Dummies 2 - the SQL

2006-10-15 Thread Carlo Stonebanks
able, 772747 nonremovable row versions in 7969 pages DETAIL: 0 dead row versions cannot be removed yet. There were 0 unused item pointers. 0 pages are entirely empty. CPU 0.39s/0.18u sec elapsed 10.70 sec. Query returned successfully with no result in 10765 ms. "Tom Lane" <[EMAIL PRO

Re: [PERFORM] Performance Optimization for Dummies 2 - the SQL

2006-10-16 Thread Carlo Stonebanks
> Can you try temporarily disabling bitmap scans and see what comes up? Well, that's slowing everything down. I've got a couple of results, below 1) Bitmap scan off, but seq scan enabled. 2) Bitmap scan and seq scan off 3) Bitmap scan back on, seq scan back on, and a new index created 4) VACUUM V

Re: [PERFORM] Performance Optimization for Dummies 2 - the SQL

2006-10-16 Thread Carlo Stonebanks
EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > On 10/15/06, Carlo Stonebanks <[EMAIL PROTECTED]> wrote: >> that contains full address data >> */ >> select >> f.facility_id, >> null as facility_address_id, >>

Re: [PERFORM] Performance Optimization for Dummies 2 - the SQL

2006-10-16 Thread Carlo Stonebanks
>I think there's 2 things that would help this case. First, partition on > country. You can either do this on a table level or on an index level > by putting where clauses on the indexes (index method would be the > fastest one to test, since it's just new indexes). That should shrink > the size of

Re: [PERFORM] Performance Optimization for Dummies 2 - the SQL

2006-10-16 Thread Carlo Stonebanks
of the union and you mentioned it. I was just under the impression that getting this sub-query to work would have produced the clearest, most straightforward ANALYZE results. Carlo "Shaun Thomas" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > On Monday 16 Octobe

Re: [PERFORM] Performance Optimization for Dummies 2 - the SQL

2006-10-16 Thread Carlo Stonebanks
> you have a two-part key on facility(country code, postal code), > right? Well, I'm glad you pointed it out, because I THOUGHT I had created it, but apparently I haven't -- I only noticed that it was missing after I listed all the other indexes. Looks like this query is one of the victims

[PERFORM] Is ODBC that slow?

2006-10-20 Thread Carlo Stonebanks
Our Windows-based db server has to integrate with users that work regularly with Access. When attempting to import users' data from Access MDB files to PostgreSQL, we try one of two things: either import using EMS SQL Manager's Data Import from Access utility, or export from Access to PostgreSQL

Re: [PERFORM] Is ODBC that slow?

2006-10-21 Thread Carlo Stonebanks
> carlo: please, please, get your mail server to quit telling me your > mailbox is full :) Merlin, sorry about that. This is the first I've heard of it. Carlo

Re: [PERFORM] Is ODBC that slow?

2006-10-24 Thread Carlo Stonebanks
> Try Command Prompt's ODBC driver. Lately it has been measured to be > consistently faster than psqlODBC. > > http://projects.commandprompt.com/public/odbcng Thanks, I tried this, but via Access it always reports a db login (username/password) failure. However, this is an Alpha - is there a

[PERFORM] commit so slow program looks frozen

2006-10-25 Thread Carlo Stonebanks
(I tried this question on the interface forum and got no result, but I don't know how to tell if it's an interface issue or not) I have a TCL app which typically takes hours to complete. I found out that it is taking longer than it should because it occasionally stalls inexplicably (for tens of

Re: [PERFORM] commit so slow program looks frozen

2006-10-25 Thread Carlo Stonebanks
> You may try to figure out what's the process doing (the backend > obviously, not the frontend (Tcl) process) by attaching to it with > strace. It's so sad when us poor Windows guys get helpful hints from people who assume that we're smart enough to run *NIX... ;-) > Maybe it's swamped by a context

Re: [PERFORM] commit so slow program looks frozen

2006-10-25 Thread Carlo Stonebanks
>> I have a question for you: did you have a long running query keeping open a transaction? I've just noticed the same problem here, but things cleaned up immediately when I aborted the long-running transaction. No, the only processes are those of the import applications themselves

Re: [PERFORM] commit so slow program looks frozen

2006-10-26 Thread Carlo Stonebanks
> I can just see the postgresql group getting together at the next > O'Reilley's conference and creating that band. And it will all be your > fault. Finally, a chance for me to wear my black leather pants. > A context switch storm is when your machine spends more time trying to > figure out what

Re: [PERFORM] commit so slow program looks frozen

2006-10-26 Thread Carlo Stonebanks
This is pretty interesting - where can I read more on this? Windows isn't actually hanging; one single command-line window is - from its behaviour, it looks like the TCL postgresql package is waiting for pg_exec to come back from the commit (I believe the commit has actually gone through). It c

Re: [PERFORM] commit so slow program looks frozen

2006-10-26 Thread Carlo Stonebanks
> when it happens, make sure to query pg_locks and see what is going on > there. lock issues are not supposed to manifest on a commit, which > releases locks, but you never know. There aren't any pending locks (assuming that pgAdmin is using pg_locks to display pending locks). > There have been r

Re: [PERFORM] commit so slow program looks frozen

2006-10-28 Thread Carlo Stonebanks
>Ben Trewern" <[EMAIL PROTECTED]> wrote in message >news:[EMAIL PROTECTED] > It might be worth turning off hyperthreading if your Xeons are using it. > There have been reports of this causing inconsistent behaviour with > PostgreSQL. Yes, this issue comes up often - I wonder if the Woodcrest Xe

Re: [PERFORM] commit so slow program looks frozen

2006-10-28 Thread Carlo Stonebanks
> I do think we need some better instrumentation for this kind of thing. Well, one thing's for sure - I have little other information to offer. The problem is that the lockups occur after hours of operation and thousands of rows being digested (which is the nature of the program). If "better in

[PERFORM] How to avoid hashjoin and mergejoin

2007-11-01 Thread Carlo Stonebanks
I am comparing the same query on two different PG 8.2 servers, one Linux (8GB RAM) and one Windows (32GB RAM). Both have similar drives and CPUs. The Windows postgresql.conf is pretty well tuned but it looks like someone had wiped out the Linux config so the default one was re-installed. A

Re: [PERFORM] How to avoid hashjoin and mergejoin

2007-11-01 Thread Carlo Stonebanks
:[EMAIL PROTECTED] Sent: November 1, 2007 5:42 PM To: Carlo Stonebanks Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] How to avoid hashjoin and mergejoin "Carlo Stonebanks" <[EMAIL PROTECTED]> writes: > Still, the Linux server did not create the same, fast

Re: [PERFORM] How to avoid hashjoin and mergejoin

2007-11-08 Thread Carlo Stonebanks
16MB) = 3.6GB total RAM eaten up under peak load for these two values alone. If we wanted to get more aggressive, we can raise work_mem. Carlo -Original Message- From: Scott Marlowe [mailto:[EMAIL PROTECTED] Sent: November 1, 2007 5:39 PM To: Carlo Stonebanks Cc: pgsql-perfor

[PERFORM] Massive table (500M rows) update nightmare

2010-01-06 Thread Carlo Stonebanks
Our DB has an audit table which is 500M rows and growing. (FYI the objects being audited are grouped semantically, not stored as individual field values). Recently we wanted to add a new feature and we altered the table to add a new column. We are backfilling this varchar(255) column by writing a TCL sc

Re: [PERFORM] Massive table (500M rows) update nightmare

2010-01-07 Thread Carlo Stonebanks
Got an explain analyze of the delete query? UPDATE mdx_core.audit_impt SET source_table = 'mdx_import.'||impt_name WHERE audit_impt_id >= 31941 AND audit_impt_id <= 319400010 AND coalesce(source_table, '') = '' Index Scan using audit_impt_pkey on audit_impt (cost=0.00..92.63 rows=1 width=

Re: [PERFORM] Massive table (500M rows) update nightmare

2010-01-07 Thread Carlo Stonebanks
If it is possible to lock this audit table exclusively (may be during off peak hours) I would look into - create new_audit_table as select col1, col2, col3 ... col9, 'new_col_value' from old_audit_table; - create all indexes - drop old_audit_table - rename new_audit_table to old_audit_table That

Re: [PERFORM] Massive table (500M rows) update nightmare

2010-01-07 Thread Carlo Stonebanks
What is the rationale behind this? How about doing 10k rows in 1 update, and committing every time? When we did 10K updates, the application would sometimes appear to have frozen, and we were concerned that there was a deadlock condition because of the number of locked rows. While we may have

Re: [PERFORM] Massive table (500M rows) update nightmare

2010-01-07 Thread Carlo Stonebanks
It might well be checkpoints. Have you tried cranking up checkpoint segments to something like 100 or more and seeing how it behaves then? No, I haven't, although it certainly makes sense - watching the process run, you get this sense that the system occasionally pauses to take a deep, long bre
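
A sketch of the settings being suggested, with illustrative values only (8.3-era parameter names; checkpoint_segments was later replaced by max_wal_size):

  checkpoint_segments = 100            # more WAL between checkpoints
  checkpoint_completion_target = 0.9   # spread checkpoint I/O over the interval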

Re: [PERFORM] Massive table (500M rows) update nightmare

2010-01-07 Thread Carlo Stonebanks
runtime: 372.141 ms ""Kevin Grittner"" wrote in message news:4b46256302250002d...@gw.wicourts.gov... "Carlo Stonebanks" wrote: An interesting idea, if I can confirm that the performance problem is because of the WHERE clause, not the UPDATE. If you cou

Re: [PERFORM] Massive table (500M rows) update nightmare

2010-01-08 Thread Carlo Stonebanks
crank it up more and delay the checkpoints as much as possible during these updates. 64 segments is already 1024M. We have 425M rows, total table size is 78GB, so we can imagine a worst case UPDATE write is less than 200 bytes * number of rows specified in the update (is that logic correct?).
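
To make the arithmetic explicit (the row width is only an estimate, so this is a rough bound): at 200 bytes per row, a 100,000-row batch writes on the order of 200 * 100,000 = 20 MB of new row versions, comfortably below the roughly 1 GB of WAL that 64 checkpoint segments represent (64 * 16 MB = 1024 MB).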

Re: [PERFORM] Massive table (500M rows) update nightmare

2010-01-08 Thread Carlo Stonebanks
I thought that post mentioned that the plan was one statement in an iteration, and that the cache would have been primed by a previous query checking whether there were any rows to update. If that was the case, it might be worthwhile to look at the entire flow of an iteration. This is the only

[PERFORM] New server to improve performance on our large and busy DB - advice?

2010-01-14 Thread Carlo Stonebanks
My client just informed me that new hardware is available for our DB server: Intel Core 2 Quad CPUs, 48 GB RAM, and a 4-disk RAID array (RAID level TBD). I have put the ugly details of what we do with our DB below, as well as the postgresql.conf settings. But, to summarize: we have a PostgreSQL 8.3

Re: [PERFORM] Massive table (500M rows) update nightmare

2010-01-14 Thread Carlo Stonebanks
-in-hand, looking for advice under the PERFORM post: "New server to improve performance on our large and busy DB - advice?" Thanks again! Carlo "Scott Marlowe" wrote in message news:dcc563d11001071740q572cdae2re410788fe790d...@mail.gmail.com... On Thu, Jan 7, 20

[PERFORM] New server to improve performance on our large and busy DB - advice? (v2)

2010-01-14 Thread Carlo Stonebanks
My client just informed me that new hardware is available for our DB server: Intel Core 2 Quad CPUs, 48 GB RAM, and a 4-disk RAID array (RAID level TBD). I have put the ugly details of what we do with our DB below, as well as the postgresql.conf settings. But, to summarize: we have a PostgreSQL 8.3.

Re: [PERFORM] Massive table (500M rows) update nightmare

2010-01-14 Thread Carlo Stonebanks
My bad - I thought I had, so it has been re-posted with a (v2) disclaimer in the title... like THAT will stop the flaming! Thanks for your patience! "Craig James" wrote in message news:4b4f8a49.7010...@emolecules.com... Carlo Stonebanks wrote: Guys, I want to thank you for

Re: [PERFORM] New server to improve performance on our large and busy DB - advice?

2010-01-14 Thread Carlo Stonebanks
Pretty much everyone thinks their requirements are exceptional. It's funny how infrequently that's actually true. The techniques that favor index-use aren't that unique: collect better stats, set basic parameters correctly, adjust random_page_cost, investigate plans that don't do what you wa

Re: [PERFORM] New server to improve performance on our large and busy DB - advice?

2010-01-19 Thread Carlo Stonebanks
Hi Scott, Sorry for the very late reply on this post, but I'd like to follow up. The reason I took so long to reply was this suggestion: < My first thought was, does he mean against the entire DB? That would take a week! But, since it was recommended, I decided to see what woul

Re: [PERFORM] New server to improve performance on our large and busy DB - advice?

2010-01-20 Thread Carlo Stonebanks
yeah, the values are at the end. Sounds like your vacuum settings are too non-aggressive. Generally this is the vacuum cost delay being too high. Of course, I have to ask: what's the down side? Yes! You can run vacuum verbose against the regular old postgres database (or just create one for

Re: [PERFORM] Re: New server to improve performance on our large and busy DB - advice? (v2)

2010-01-20 Thread Carlo Stonebanks
* A database that is of small to medium size (5 - 10 GB)? * Around 10 clients that perform constant write operations to the database (UPDATE/INSERT) * Around 10 clients that occasionally read from the database * Around 6000 tables in your database * A problem with tuning it all * Migration to ne
