[PERFORM] When to bump up statistics?

2004-11-19 Thread Dawid Kuroczko
ALTER TABLE foo ALTER COLUMN bar SET STATISTICS n; .

I wonder what the implications of using this statement are.
I know that by using, say, n=100, ANALYZE will take more time,
pg_statistic will be bigger, and the planner will take longer,
but on the other hand it will make better decisions... Etc., etc.
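
For reference, the current per-column target can be checked before changing
anything; the table and column names below are just placeholders:

    SELECT attname, attstattarget
    FROM pg_attribute
    WHERE attrelid = 'foo'::regclass AND attnum > 0;

(attstattarget = -1 means the column still follows the default_statistics_target
setting.)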

I wonder, however, when it is most useful to bump it up.
Please tell me what you think about it:

Is bumping up statistics only useful for indexed columns?

When is it most useful/beneficial to bump them up:

1) huge table with huge number of distinct values (_almost_
unique ;))

2) huge table with relatively equally distributed values
(like each value is in between, say, 30-50 rows).

3) huge table with unequally distributed values (some
values are in 1-5 rows, some are in 1000-5000 rows).

4) huge table with a small number of values (around ~100
distinct values, equally or unequally distributed).

5) boolean column.

I think SET STATISTICS 100 is very useful for case with
unequally distributed values, but I wonder what about
the other cases.  And as a side note -- what are the
reasonable bounds for statistics (between 10 and 100?)

What are the runtime implications of setting statistics
too large -- how much can it affect queries?

And finally -- how do other RDBMSes and RDBMS-likes deal
with this issue? :)

   Regards,
  Dawid



Re: [PERFORM] sort_mem affect on inserts?

2004-11-19 Thread Jan Wieck
On 11/17/2004 5:07 PM, Josh Berkus wrote:
David,
I understand that the sort_mem conf setting affects queries with order by,
etc., and the doc mentions that it is used in create index. Does sort_mem
affect the updating of indexes, i.e., can the sort_mem setting affect the
performance of inserts?
Only if the table has Foreign Keys whose lookup might require a large sort.
Otherwise, no.

Hmmm ... what type of foreign key lookup would that be? None of the RI 
generated queries has any order by clause.

Jan
--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #


[PERFORM] index use

2004-11-19 Thread Arshavir Grigorian
Hi,
I have a query that when run on similar tables in 2 different databases 
either uses the index on the column (primary key) in the where clause or 
does a full table scan. The structure of the tables is the same, except 
that the table where the index does not get used has an extra million 
rows (22mil vs 23mil).

The 2 boxes where these databases run are very different (a Sparc with SCSI 
disks and 2G RAM running Solaris 8, and a PC with 128M RAM and an 
IDE drive running Linux RH9 2.4.20-20.9). I am not sure why that would 
make a difference, but maybe it does.
Also, according to our dba both tables have been analyzed about the same 
time.

Any pointers would be much appreciated.
Arshavir

WORKS:
= explain analyze select num from document where num like 'EP100%';
  QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Index Scan using document_pkey on document  (cost=0.00..5.77 rows=1 width=14) (actual time=0.147..0.166 rows=2 loops=1)
  Index Cond: (((num)::text >= 'EP100'::character varying) AND ((num)::text < 'EP101'::character varying))
  Filter: ((num)::text ~~ 'EP100%'::text)
Total runtime: 0.281 ms
(4 rows) 

= \d document
   Table public.document
 Column   |  Type  | Modifiers 
---++---
num   | character varying(30)  | not null
titl  | character varying(500) | 
isscntry  | character varying(50)  | 
issdate   | date   | 
filedate  | date   | 
appnum| character varying(20)  | 
clnum | integer| 
exnum | integer| 
exmnr | character varying(300) | 
agent | character varying(300) | 
priodate  | date   | 
prionum   | character varying(100) | 
priocntry | character varying(50)  | 
legalstat | integer| 
Indexes:
   document_pkey primary key, btree (num)
Check constraints:
   document_legalstat CHECK (legalstat > 0 AND legalstat < 6)


DOES NOT WORK:
d5= EXPLAIN ANALYZE select num from document where num like 'EP100%';
 QUERY PLAN  
--
Seq Scan on document  (cost=0.00..804355.12 rows=1 width=14) (actual time=97.235..353286.781 rows=2 loops=1)
  Filter: ((num)::text ~~ 'EP100%'::text)
Total runtime: 353286.907 ms
(3 rows)

d5= \d document
   Table public.document
 Column   |  Type  | Modifiers 
---++---
num   | character varying(30)  | not null
titl  | character varying(500) | 
isscntry  | character varying(50)  | 
issdate   | date   | 
filedate  | date   | 
clnum | integer| 
exnum | integer| 
exmnr | character varying(300) | 
agent | character varying(300) | 
priodate  | date   | 
prionum   | character varying(100) | 
priocntry | character varying(50)  | 
legalstat | integer| 
appnum| character varying(20)  | 
Indexes:
   document_pkey primary key, btree (num)
Check constraints:
   $1 CHECK (legalstat > 0 AND legalstat < 6)



Re: [PERFORM] index use

2004-11-19 Thread Steinar H. Gunderson
On Fri, Nov 19, 2004 at 02:18:55PM -0500, Arshavir Grigorian wrote:
 The 2 boxes where these databases run are very different (a Sparc with SCSI 
 disks and 2G RAM running Solaris 8, and a PC with 128M RAM and an 
 IDE drive running Linux RH9 2.4.20-20.9). I am not sure why that would 
 make a difference, but maybe it does.

Are you having different locales on your systems?
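
(Assuming 7.4 or later, a quick way to compare the two boxes is:

    SHOW lc_collate;
    SHOW lc_ctype;

pg_controldata run against the data directory should report the same settings
on older releases. A non-C locale would explain the planner refusing the index
for the LIKE query.)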

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] When to bump up statistics?

2004-11-19 Thread Josh Berkus
Dawid,

 I wonder what the implications of using this statement are.
 I know that by using, say, n=100, ANALYZE will take more time,
 pg_statistic will be bigger, and the planner will take longer,
 but on the other hand it will make better decisions... Etc., etc.

Yep.  And pg_statistic will need to be vacuumed more often.

 Is bumping up statistics only useful for indexed columns?

No.   It's potentially useful for any queried column.

 1) huge table with huge number of distinct values (_almost_
 unique ;))

Yes.

 2) huge table with relatively equally distributed values
 (like each value is in between, say, 30-50 rows).

Not usually.

 3) huge table with unequally distributed values (some
 values are in 1-5 rows, some are in 1000-5000 rows).

Yes.

 4) huge table with a small number of values (around ~100
 distinct values, equally or unequally distributed).

Not usually, especially if they are equally distributed.

 5) boolean column.

Almost never, just as it is seldom useful to index a boolean column.

 I think SET STATISTICS 100 is very useful for case with
 unequally distributed values, but I wonder what about
 the other cases.  And as a side note -- what are the
 reasonable bounds for statistics (between 10 and 100?)

Oh, no, I've used values up to 500 in production, and we've tested up to the 
max on DBT-3.  In my experience, if the default (10) isn't sufficient, you 
often have to go up to 250 or more to get a different plan.
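
For example, bumping a single column and re-analyzing looks like this (the
table, column, and the value 250 are purely illustrative):

    ALTER TABLE orders ALTER COLUMN customer_id SET STATISTICS 250;
    ANALYZE orders;

The new target has no effect on plans until the table has been re-ANALYZEd.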

 What are the runtime implications of setting statistics
 too large -- how much can it affect queries?

It won't affect SELECT queries.  It will affect ANALYZE time (substantially 
in the aggregate) and maintenance on the pg_statistic table.

 And finally -- how do other RDBMSes and RDBMS-likes deal
 with this issue? :)

Most don't allow such fine-tuned adjustment.  MSSQL, for example, allows 
setting it only per-table or maybe even database-wide, and on that platform it 
doesn't seem to have much effect on query plans.  Oracle prefers to use 
HINTS, which are a brute-force method to manage query plans.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco



Re: [PERFORM] index use

2004-11-19 Thread Josh Berkus
Arshavir,

 I have a query that when run on similar tables in 2 different databases
 either uses the index on the column (primary key) in the where clause or
 does a full table scan. The structure of the tables is the same, except
 that the table where the index does not get used has an extra million
 rows (22mil vs 23mil).

Are both using the same version of PostgreSQL?   If so, what version?

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco



Re: [PERFORM] index use

2004-11-19 Thread Arshavir Grigorian
Thanks for all the replies. It actually has to do with the locales. The 
db where the index gets used is running with the C locale vs. the other one, 
which uses en_US.UTF-8. I guess the db with the wrong locale will need to be 
waxed and recreated with the correct locale settings. I wonder if there are 
any plans to make LIKE work with all locales.

Again, many thanks. You guys are great!

Arshavir


[PERFORM] help needed -- sequential scan problem

2004-11-19 Thread sarlav kumar
Hi All,

I am new to Postgres.

I have a query which does not use an index scan unless I force Postgres to use one. I don't want to force Postgres unless there is no way of optimizing this query.

The query:

select m.company_name,m.approved,cu.account_no,mbt.business_name,cda.country, 
(select count(*) from merchant_purchase mp left join data d on mp.data_id=d.id where mp.merchant_id=m.id and d.status=5) as Trans_count,
(select sum(total * 0.01) from merchant_purchase mp left join data d on mp.data_id=d.id where mp.merchant_id=m.id and d.status=5) as Trans_amount,
(select count(*) from merchant_purchase mp left join data d on mp.data_id=d.id where d.what=15 and d.status=5 and d.flags=7 and mp.merchant_id=m.id) as Reversal_count
from merchant m 
left join customer cu on cu.id=m.uid 
left join customerdata cda on cda.uid=cu.id 
left join merchant_business_types mbt on mbt.id=m.businesstype and
exists (select distinct(merchant_id) from merchant_purchase where m.id=merchant_id);

First Question: I know the way I have written the first two sub-selects is really bad, as they have the same conditions in the where clause. But I am not sure if there is a way to select two columns in a single sub-select query. When I tried to combine the two sub-select queries, I got an error saying that the sub-select can have only one column. Does anyone know any other efficient way of doing it?
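
One thing that might work around the one-column limit -- sketched here with the same table and column names as above, but untested -- is to compute both aggregates once per merchant in a derived table and join to it instead of using two correlated sub-selects:

select m.company_name, m.approved, cu.account_no, mbt.business_name, cda.country,
       agg.trans_count, agg.trans_amount
from merchant m
left join customer cu on cu.id = m.uid
left join customerdata cda on cda.uid = cu.id
left join merchant_business_types mbt on mbt.id = m.businesstype
left join (select mp.merchant_id,
                  count(*) as trans_count,
                  sum(total * 0.01) as trans_amount
           from merchant_purchase mp
           join data d on mp.data_id = d.id
           where d.status = 5
           group by mp.merchant_id) agg on agg.merchant_id = m.id;

In this form, merchants with no qualifying purchases come back with NULL rather than 0 (COALESCE can fix that), and the reversal count could be folded into the same derived table with a CASE expression.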

Second Question: The query plan is as follows:

                                                                QUERY PLAN
 Hash Join  (cost=901.98..17063.67 rows=619 width=88) (actual time=52.01..5168.09 rows=619 loops=1)
   Hash Cond: ("outer".businesstype = "inner".id)
   Join Filter: (subplan)
   ->  Merge Join  (cost=900.34..1276.04 rows=619 width=62) (actual time=37.00..97.58 rows=619 loops=1)
         Merge Cond: ("outer".id = "inner".uid)
         ->  Merge Join  (cost=900.34..940.61 rows=619 width=52) (actual time=36.91..54.66 rows=619 loops=1)
               Merge Cond: ("outer".id = "inner".uid)
               ->  Sort  (cost=795.45..810.32 rows=5949 width=17) (actual time=32.59..36.59 rows=5964 loops=1)
                     Sort Key: cu.id
                     ->  Seq Scan on customer cu  (cost=0.00..422.49 rows=5949 width=17) (actual time=0.02..15.69 rows=5964 loops=1)
               ->  Sort  (cost=104.89..106.44 rows=619 width=35) (actual time=4.27..5.10 rows=619 loops=1)
                     Sort Key: m.uid
                     ->  Seq Scan on merchant m  (cost=0.00..76.19 rows=619 width=35) (actual time=0.04..2.65 rows=619 loops=1)
         ->  Index Scan using customerdata_uid_idx on customerdata cda  (cost=0.00..311.85 rows=5914 width=10) (actual time=0.09..27.70 rows=5919 loops=1)
   ->  Hash  (cost=1.51..1.51 rows=51 width=26) (actual time=0.19..0.19 rows=0 loops=1)
         ->  Seq Scan on merchant_business_types mbt  (cost=0.00..1.51 rows=51 width=26) (actual time=0.04..0.12 rows=51 loops=1)
   SubPlan
     ->  Aggregate  (cost=269.89..269.89 rows=1 width=12) (actual time=2.70..2.70 rows=1 loops=619)
           ->  Nested Loop  (cost=0.00..269.78 rows=44 width=12) (actual time=2.40..2.69 rows=4 loops=619)
                 Filter: ("inner".status = 5)
                 ->  Seq Scan on merchant_purchase mp  (cost=0.00..95.39 rows=44 width=4) (actual time=2.37..2.58 rows=6 loops=619)
                       Filter: (merchant_id = $0)
                 ->  Index Scan using data_pkey on data d  (cost=0.00..3.91 rows=1 width=8) (actual time=0.01..0.01 rows=1 loops=3951)
                       Index Cond: ("outer".data_id = d.id)
     ->  Aggregate  (cost=269.89..269.89 rows=1 width=16) (actual time=2.73..2.73 rows=1 loops=619)
           ->  Nested Loop  (cost=0.00..269.78 rows=44 width=16) (actual time=2.42..2.70 rows=4 loops=619)
                 Filter: ("inner".status = 5)
                 ->  Seq Scan on merchant_purchase mp  (cost=0.00..95.39 rows=44 width=8) (actual time=2.39..2.60 rows=6 loops=619)
                       Filter: (merchant_id = $0)
                 ->  Index Scan using data_pkey on data d  (cost=0.00..3.91 rows=1 width=8) (actual time=0.01..0.01 rows=1 loops=3951)
                       Index Cond: ("outer".data_id = d.id)
     ->  Aggregate  (cost=270.12..270.12 rows=1 width=20) (actual time=2.72..2.72 rows=1 loops=619)
           ->  Nested Loop  (cost=0.00..270.00 rows=44 width=20) (actual time=2.63..2.72 rows=0 loops=619)
                 Filter: (("inner".what = 15) AND ("inner".status = 5) AND ("inner".flags = 7))
                 ->  Seq Scan on merchant_purchase mp  (cost=0.00..95.39 rows=44 width=4) (actual time=2.40..2.62 rows=6 loops=619)
                       Filter: (merchant_id = $0)
                 ->  Index Scan using data_pkey on data d  (cost=0.00..3.91 rows=1 width=16) (actual time=0.01..0.01 rows=1 loops=3951)
                       Index Cond: ("outer".data_id = d.id)
     ->  Unique  (cost=0.00..113.14 rows=4 width=4) (actual time=0.02..0.02 rows=0 loops=598)
           ->  Index Scan using merchant_purchase_merchant_id_idx on merchant_purchase  (cost=0.00..113.02 rows=44 width=4) (actual time=0.01..0.01 rows=0 loops=598)
                 Index Cond: ($0 = merchant_id)
 Total runtime: 5170.37 msec  (5.170 sec)
(42 rows)

As you can see, there are many sequential scans in the query plan. Postgres is not using the index defined, even though that leads to much better performance (0.2 sec!! when I force an index scan).

Is there something wrong in my query that makes Postgres use a seq scan as opposed to an index scan? Any help would be appreciated.
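
(For completeness, the forcing referred to above is presumably the usual per-session planner toggle, something like:

    SET enable_seqscan TO off;
    -- re-run the EXPLAIN ANALYZE from above here
    SET enable_seqscan TO on;

which is fine for testing which plan is faster, but not meant as a permanent fix.)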

Re: [PERFORM] sort_mem affect on inserts?

2004-11-19 Thread Josh Berkus
Jan,

 Hmmm ... what type of foreign key lookup would that be? None of the RI
 generated queries has any order by clause.

I was under the impression that work_mem would be used for the index if there 
was an index for the RI lookup.   Wrong?

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco



Re: [PERFORM] index use

2004-11-19 Thread Tom Lane
Arshavir Grigorian [EMAIL PROTECTED] writes:
 I have a query that when run on similar tables in 2 different databases 
 either uses the index on the column (primary key) in the where clause or 
 does a full table scan. The structure of the tables is the same, except 
 that the table where the index does not get used has an extra million 
 rows (22mil vs 23mil).

I'd say you initialized the second database in a non-C locale.  The
planner is clearly well aware that the seqscan is going to be expensive,
so the explanation has to be that it does not have a usable index available.

regards, tom lane



Re: [PERFORM] index use

2004-11-19 Thread Stephan Szabo
On Fri, 19 Nov 2004, Arshavir Grigorian wrote:

 Hi,

 I have a query that when run on similar tables in 2 different databases
 either uses the index on the column (primary key) in the where clause or
 does a full table scan. The structure of the tables is the same, except
 that the table where the index does not get used has an extra million
 rows (22mil vs 23mil).

 The 2 boxes where these databases run are very different (a Sparc with SCSI
 disks and 2G RAM running Solaris 8, and a PC with 128M RAM and an
 IDE drive running Linux RH9 2.4.20-20.9). I am not sure why that would
 make a difference, but maybe it does.

Is the second server running in C locale or a different locale? The
optimization for LIKE to use indexes involves either making an index with
a *_pattern_ops operator class or being in C locale.
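
A sketch of the operator-class route, using the table from the original post
(7.4 or later, untested):

    CREATE INDEX document_num_pattern_idx ON document (num varchar_pattern_ops);

With such an index the LIKE 'EP100%' filter can use an index scan regardless of
the database's locale, while the existing primary key index keeps handling
ordinary equality and ordering.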



[PERFORM] tablespace + RAM disk?

2004-11-19 Thread David Parker
We are using 7.4.5 on Solaris 9. 

We have a couple of tables (holding information about network sessions, for 
instance) which don't need to persist beyond the life of the server, but while 
the server is running they are heavily hit with inserts/updates/deletes.

Temporary tables won't work for us because they are per-connection; we are 
using a thread pool, and session data could be accessed from multiple 
connections.

Would 8.0 tablespaces, with a tablespace placed on a RAM disk, be a potential 
solution for this? I have used RAM disks for disk caches in the past, but I 
don't know if there are any special issues with defining a tablespace that way.
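
In other words, something along the lines of the following -- the path and names
are hypothetical, assuming a RAM-backed mount point owned by the postgres user:

    CREATE TABLESPACE scratch LOCATION '/ramdisk/pg_scratch';
    CREATE TABLE network_session (id integer, state text) TABLESPACE scratch;

with the obvious caveat that anything in that tablespace would have to be
recreated after a reboot.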

Thanks.

- DAP
--
David Parker    Tazz Networks    (401) 709-5130
 



Re: [PERFORM] When to bump up statistics?

2004-11-19 Thread Chris Browne
[EMAIL PROTECTED] (Dawid Kuroczko) writes:
 ALTER TABLE foo ALTER COLUMN bar SET STATISTICS n; .

 I wonder what the implications of using this statement are.
 I know that by using, say, n=100, ANALYZE will take more time,
 pg_statistic will be bigger, and the planner will take longer,
 but on the other hand it will make better decisions... Etc., etc.

 I wonder, however, when it is most useful to bump it up.
 Please tell me what you think about it:

 Is bumping up statistics only useful for indexed columns?

The main decision changes that result from this would occur then...

 When is it most useful/beneficial to bump them up:

 1) huge table with huge number of distinct values (_almost_
 unique ;))

 2) huge table with relatively equally distributed values
 (like each value is in between, say, 30-50 rows).

 3) huge table with unequally distributed values (some
 values are in 1-5 rows, some are in 1000-5000 rows).

 4) huge table with a small number of values (around ~100
 distinct values, equally or unequally distributed).

A hard and fast rule hasn't emerged, definitely not to distinguish
precisely between these cases.

There are two effects that come out of changing the numbers:

 1.  They increase the number of tuples examined.

 This would pointedly affect cases 3 and 4, increasing the
 likelihood that the statistics are more representative

 2.  They increase the number of samples that are kept, increasing the
 number of items recorded in the histogram.

 If you have on the order of 100 unique values (it would not be
 unusual for a company to have 100 main customers or suppliers),
 that allows there to be nearly a bin apiece, which makes
 estimates _way_ more representative both for common and less
 common cases amongst the top 100.

Both of those properties are useful for pretty much all of the above
cases.
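
A handy way to see what actually got collected for a column -- the
distinct-value estimate, the most-common-values list, and the histogram bins --
is the pg_stats view (table and column names are placeholders):

    SELECT n_distinct, most_common_vals, most_common_freqs, histogram_bounds
    FROM pg_stats
    WHERE tablename = 'foo' AND attname = 'bar';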

 5) boolean column.

A boolean column would more or less indicate SET STATISTICS 2; the only
point to having more would be if one of the values almost never
occurred, so that you'd need to collect more stats even to
pick up instances of the rare case.

A boolean column is seldom much use for indices anyways...

 I think SET STATISTICS 100 is very useful for case with unequally
 distributed values, but I wonder what about the other cases.  And as
 a side note -- what are the reasonable bounds for statistics
 (between 10 and 100?)

If there are, say, 200 unique values, then increasing from 10 to 100
would seem likely to be useful in making the histogram MUCH more
representative...

 What are the runtime implications of setting statistics too large --
 how much can it affect queries?

More stats would mean a bit more time evaluating query plans, but the
quality of the plans should be better.

 And finally -- how do other RDBMSes and RDBMS-likes deal with this issue? 
 :)

For Oracle and DB/2, the issues are not dissimilar.  Oracle somewhat
prefers the notion of collecting comprehensive statistics on the whole
table, which will be even more costly than PostgreSQL's sampling.
-- 
let name=cbbrowne and tld=cbbrowne.com in String.concat @ [name;tld];;
http://www.ntlug.org/~cbbrowne/linuxxian.html
A VAX is virtually a computer, but not quite.



Re: [PERFORM] Query Performance and IOWait

2004-11-19 Thread Andrew Janian
The data that we are accessing is via QLogic cards connected to an EMC CLARiiON. 
 We have tried it on local SCSI disks with the same (bad) results.

When the machine gets stuck in a 100% IOWAIT state it often crashes soon after 
that.

The disks are fine, have been replaced and checked.

Here are my results from hdparm -Tt /dev/sda1 (which is the EMC disk array)
/dev/sda1:
 Timing buffer-cache reads:   2976 MB in  2.00 seconds = 1488.00 MB/sec
 Timing buffered disk reads:   44 MB in  3.13 seconds =  14.06 MB/sec

-Original Message-
From: Dave Cramer [mailto:[EMAIL PROTECTED]
Sent: Thursday, November 18, 2004 11:14 AM
To: Andrew Janian
Cc: [EMAIL PROTECTED]
Subject: Re: [PERFORM] Query Performance and IOWait


Andrew,

Dells aren't well known for their disk performance; apparently most of 
the PERC controllers sold with Dells are actually Adaptec controllers. 
Also, they apparently do not come with the battery required to use the 
battery-backed write cache (in fact, according to some, Dell won't 
even sell the battery to you). Also, Dell's monitoring software is quite 
a memory hog.

Have you looked at top, and also at hdparm -Tt /dev/sd?

Dave

Andrew Janian wrote:

Hello All,

I have a setup with a Dell PowerEdge 2650 running Red Hat and Postgres 7.4.5, with 
a database of about 27GB of data.  The table in question has about 35 
million rows.

I am running the following query:

SELECT *
FROM mb_fix_message
WHERE msg_client_order_id IN (
   SELECT msg_client_order_id
   FROM mb_fix_message
   WHERE msg_log_time >= '2004-06-01'
   AND msg_log_time < '2004-06-01 13:30:00.000'
   AND msg_message_type IN ('D','G')
   AND mb_ord_type = '1'
   )
   AND msg_log_time > '2004-06-01'
   AND msg_log_time < '2004-06-01 23:59:59.999'
   AND msg_message_type = '8'
   AND (mb_raw_text LIKE '%39=1%' OR mb_raw_text LIKE '%39=2%');

with the following plan:

                                                           QUERY PLAN
Nested Loop IN Join  (cost=0.00..34047.29 rows=1 width=526)
  ->  Index Scan using mfi_log_time on mb_fix_message  (cost=0.00..22231.31 rows=2539 width=526)
        Index Cond: ((msg_log_time > '2004-06-01 00:00:00'::timestamp without time zone) AND (msg_log_time < '2004-06-01 23:59:59.999'::timestamp without time zone))
        Filter: (((msg_message_type)::text = '8'::text) AND (((mb_raw_text)::text ~~ '%39=1%'::text) OR ((mb_raw_text)::text ~~ '%39=2%'::text)))
  ->  Index Scan using mfi_client_ordid on mb_fix_message  (cost=0.00..445.56 rows=1 width=18)
        Index Cond: (("outer".msg_client_order_id)::text = (mb_fix_message.msg_client_order_id)::text)
        Filter: ((msg_log_time >= '2004-06-01 00:00:00'::timestamp without time zone) AND (msg_log_time < '2004-06-01 13:30:00'::timestamp without time zone) AND (((msg_message_type)::text = 'D'::text) OR ((msg_message_type)::text = 'G'::text)) AND ((mb_ord_type)::text = '1'::text))

While running, this query produces 100% iowait usage on its processor and 
takes an ungodly amount of time (about an hour).

The postgres settings are as follows:

shared_buffers = 32768  # min 16, at least max_connections*2, 8KB each
sort_mem = 262144   # min 64, size in KB

And the /etc/sysctl.conf has:
kernel.shmall = 274235392
kernel.shmmax = 274235392

The system has 4GB of RAM.

I am pretty sure of these settings, but only from my reading of the docs and 
others' recommendations online.

Thanks,

Andrew Janian
OMS Development
Scottrade Financial Services
(314) 965-1555 x 1513
Cell: (314) 369-2083



-- 
Dave Cramer
http://www.postgresintl.com
519 939 0336
ICQ#14675561




Re: [PERFORM] tablespace + RAM disk?

2004-11-19 Thread Josh Berkus
David,

 We have a couple tables (holding information about network sessions, for
 instance) which don't need to persist beyond the life of the server, but
 while the server is running they are heavily hit, insert/update/delete.

See the thread this last week on Memcached for a cheaper solution.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco



Re: [PERFORM] tablespace + RAM disk?

2004-11-19 Thread David Parker
Oh! I sort of started paying attention to that in the middle...and
couldn't make head or tail out of it. Will search back to the
beginning

Thanks.

- DAP



Re: [PERFORM] tablespace + RAM disk?

2004-11-19 Thread David Parker
But, I'm also still interested in the answer to my question: is there
any reason you could not put an 8.0 tablespace on a RAM disk? 

I can imagine doing it by having an initdb run at startup somehow, though
having a mix of tablespaces in a database would make this harder; I haven't
read enough about tablespaces yet. The problem with trying to mix a RAM
tablespace with a persistent tablespace would seem to be that you would have
to recreate certain data files at system startup before you could start the
database. That's why an initdb seems cleaner to me, but... I should stop
talking and go read about tablespaces and memcached.

I'd be interested to hear if anybody has tried this. And I will also
check out memcached, too, of course. Thanks for the pointer.

- DAP



Re: [PERFORM] help needed -- sequential scan problem

2004-11-19 Thread Tom Lane
sarlav kumar [EMAIL PROTECTED] writes:
 I have a query which does not use an index scan unless I force Postgres to use 
 one. I don't want to force Postgres unless there is no way of 
 optimizing this query.

The major issue seems to be in the sub-selects:

  ->  Seq Scan on merchant_purchase mp  (cost=0.00..95.39 rows=44 width=4) (actual time=2.37..2.58 rows=6 loops=619)
        Filter: (merchant_id = $0)

where the estimated row count is a factor of 7 too high.  If the
estimated row count were even a little lower, it'd probably have gone
for an indexscan.  You might get some results from increasing the
statistics target for merchant_purchase.merchant_id.  If that doesn't
help, I'd think about reducing random_page_cost a little bit.
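
Concretely, that would look something like the following; the target of 100 and
the cost of 3 are only illustrative starting points, not tuned values:

    ALTER TABLE merchant_purchase ALTER COLUMN merchant_id SET STATISTICS 100;
    ANALYZE merchant_purchase;
    -- only if the row estimate is still too far off:
    SET random_page_cost = 3;

SET only affects the current session; random_page_cost would go in
postgresql.conf to make it permanent.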

regards, tom lane
