[PERFORM] mis-estimation on data-warehouse aggregate creation

2004-11-16 Thread Kris Jurka
I've have a miniature data-warehouse in which I'm trying to rebuild pre-calcuated aggregate data directly in the database and I'm geting some poor plans due to a bad mis-estimation of the number of rows involved. In a standard star schema I have a sales fact table and dimensions product, customer

Re: [PERFORM] mis-estimation on data-warehouse aggregate creation

2004-11-16 Thread F. Senault
Tuesday, November 16, 2004, 10:10:17 AM, you wrote: > HashAggregate (cost=32869.33..32869.34 rows=1 width=36) ^ > (actual time=475182.855..475188.304 rows=911 loops=1) ^^^

[PERFORM] Tsearch2 really slower than ilike ?

2004-11-16 Thread Hervé Piedvache
Hi, I'm completly dispointed with Tsearch2 ... I have a table like this : Table "public.site" Column |Type | Modifiers ---+-+---

Re: [PERFORM] Tsearch2 really slower than ilike ?

2004-11-16 Thread Michael Fuhr
On Tue, Nov 16, 2004 at 03:55:58PM +0100, Hervé Piedvache wrote: > WHERE s.idx_site_name @@ to_tsquery('atari'); How much text does each site_name field contain? From the field name I'd guess only a few words. Based on my own experience, if the fields were documents containing thousands of word

Re: [PERFORM] Tsearch2 really slower than ilike ?

2004-11-16 Thread Hervé Piedvache
Michael, Le Mardi 16 Novembre 2004 16:32, Michael Fuhr a écrit : > On Tue, Nov 16, 2004 at 03:55:58PM +0100, Hervé Piedvache wrote: > > WHERE s.idx_site_name @@ to_tsquery('atari'); > > How much text does each site_name field contain? From the field > name I'd guess only a few words. Based on my

[PERFORM] Efficient way to remove OID data

2004-11-16 Thread James Gunzelman
Title: Message I have a table that has 2 columns of an OID type.  I would like to issue a truncate table command but my understanding is that the data pointed to by the OIDs is not removed and orphaned.  What would be the most efficient way to truncate the table and not have orphaned data?  

Re: [PERFORM] Tsearch2 really slower than ilike ?

2004-11-16 Thread Joshua D. Drake
or more. Yes site name ... is company names or web site name ... so not many word in each record ... but I don't understand why more words are more efficient than few words ?? sorry ... Well there are a couple of reasons but the easiest one is index size. An ILIKE btree index is in gener

Re: [PERFORM] Tsearch2 really slower than ilike ?

2004-11-16 Thread Joshua D. Drake
QUERY PLAN Seq Scan on site_rss s (cost=0.00..11863.16 rows=295 width=158) (actual time=17.414..791.937 rows=12 loops=1) Filter: (s

Re: [PERFORM] Tsearch2 really slower than ilike ?

2004-11-16 Thread Hervé Piedvache
Le Mardi 16 Novembre 2004 17:06, Joshua D. Drake a écrit : > > QUERY PLAN > >-- > >-- Seq Scan on site_rss s > > (cost=0.00..11863.16 rows=2

Re: [PERFORM] Tsearch2 really slower than ilike ?

2004-11-16 Thread Oleg Bartunov
could you provide me a dump of your table (just id and tsvector columns), so I could try on my computer. Also, plain query (simple and clean) which demonstrated your problem would be preferred next time ! Oleg On Tue, 16 Nov 2004, [iso-8859-15] Herv? Piedvache wrote: Michael, Le Mardi 16 Novemb

Re: [PERFORM] Tsearch2 really slower than ilike ?

2004-11-16 Thread Oleg Bartunov
On Tue, 16 Nov 2004, Joshua D. Drake wrote: or more. Yes site name ... is company names or web site name ... so not many word in each record ... but I don't understand why more words are more efficient than few words ?? sorry ... Well there are a couple of reasons but the easiest one is index s

Re: [PERFORM] Insertion puzzles

2004-11-16 Thread J. Andrew Rogers
On Sat, 2004-11-13 at 18:00, [EMAIL PROTECTED] wrote: > I ran into the exact same problem you did. I tried many, many changes to > the conf file, I tried O.S. tuning but performance stunk. I had a fairly > simple job that had a lot of updates and inserts that was taking 4 1/2 > hours. I re-wrote

Re: [PERFORM] mis-estimation on data-warehouse aggregate creation

2004-11-16 Thread Kris Jurka
On Tue, 16 Nov 2004, F. Senault wrote: > Let me guess... You've never run "analyze" on your tables ? > No, I have. I mentioned that I did in my email, but you can also tell by the exactly correct guesses for some other plan steps: -> Seq Scan on period (cost=0.00..90.88 rows=3288 width=54

[PERFORM] nuderstanding 'explain analyse'

2004-11-16 Thread Alexandre Leclerc
Hello, I build two SELECT queries, and in one I used COALESCE with a CASE, and in the second one I used only CASE statements. When analysing, I'm getting the exact same result, except the cost. (For now I have so few data that the results are too fragmented. If the plans for both queries are exa

Re: [PERFORM] Tsearch2 really slower than ilike ?

2004-11-16 Thread Oleg Bartunov
ok, I downloaded dump of table and here is what I found: zz=# select count(*) from tt; count 183956 (1 row) zz=# select * from stat('select tt from tt') order by ndoc desc, nentry desc,wo rd limit 10; word | ndoc | nentry --+---+ blog | 12710

Re: [PERFORM] mis-estimation on data-warehouse aggregate creation

2004-11-16 Thread Simon Riggs
On Tue, 2004-11-16 at 09:10, Kris Jurka wrote: > By rewriting the JOIN > conditions to LEFT JOIN we force the planner to recognize that there will > be a match for every row in the sales table: > You realise that returns a different answer (or at least it potentially does, depending upon your

[PERFORM] memcached and PostgreSQL

2004-11-16 Thread Michael Adler
http://pugs.postgresql.org/sfpug/archives/21.html I noticed that some of you left coasters were talking about memcached and pgsql. I'm curious to know what was discussed. In reading about memcached, it seems that many people are using it to circumvent the scalability problems of MySQL (lack o

[PERFORM] query plan question

2004-11-16 Thread David Parker
I have a query for which postgres is generating a different plan on different machines. The database schema is the same, the dataset is the same, the configuration is the same (e.g., pg_autovacuum running in both cases), both systems are Solaris 9. The main difference in the two systems is that

Re: [PERFORM] query plan question

2004-11-16 Thread Russell Smith
On Wed, 17 Nov 2004 02:54 pm, you wrote: > I have a query for which postgres is generating a different plan on different > machines. The database schema is the same, the dataset is the same, the > configuration is the same (e.g., pg_autovacuum running in both cases), both > systems are Solaris 9

[PERFORM] Table Partitions: To Inherit Or Not To Inherit

2004-11-16 Thread Don Drake
I've read the previous thread on the list regarding partitioning mechanisms and I just wrote a plpgsql function to create the partition tables (by date) as well as another function used to do the insert (it determines which table will be inserted). The creation of the partition tables uses the inh

Re: [PERFORM] memcached and PostgreSQL

2004-11-16 Thread Josh Berkus
Michael, > So What does memcached offer pgsql users? It would still seem to offer > the benefit of a multi-machined cache. Yes, and a very, very fast one too ... like, 120,000 operations per second. PostgreSQL can't match that because of the overhead of authentication, security, transaction v

Re: [PERFORM] Table Partitions: To Inherit Or Not To Inherit

2004-11-16 Thread Josh Berkus
Don, > What's really the difference between this and creating separate tables > with the same column definition without the inherit, and then create a > view to "merge" them together? Easier syntax for queries. If you created completely seperate tables and UNIONED them together, you'd have to

Re: [PERFORM] nuderstanding 'explain analyse'

2004-11-16 Thread Josh Berkus
Alexandre, > If the plans for both queries are exactly the same, should I assume > that the cost will also be the same? Nope. A seq scan over 1,000,000,000 rows is going to cost a LOT more than a seq scan over 1000 rows, even though it's the same plan. When you have the data sorted out, post

Re: [PERFORM] query plan question

2004-11-16 Thread Joshua D. Drake
David Parker wrote: I have a query for which postgres is generating a different plan on different machines. The database schema is the same, the dataset is the same, the configuration is the same (e.g., pg_autovacuum running in both cases), both systems are Solaris 9. The main difference in the

Re: [PERFORM] memcached and PostgreSQL

2004-11-16 Thread Troels Arvin
On Tue, 16 Nov 2004 21:47:54 -0800, Josh Berkus wrote: > So memcached becomes a very good place to stick data that's read often but > not > updated often, or alternately data that changes often but is disposable. An > example of the former is a user+ACL list; and example of the latter is web