Re: [PERFORM] increasing database connections

2007-03-01 Thread Magnus Hagander
On Thu, Mar 01, 2007 at 12:49:14AM -0500, Jonah H. Harris wrote:
 On 3/1/07, Shiva Sarna [EMAIL PROTECTED] wrote:
 I am sorry if it is a repeat question but I want to know if database
 performance will decrease if I increase the max-connections to 2000. At
 present it is 100.
 
 Most certainly.  Adding connections over 200 will degrade performance
 dramatically.  You should look into pgpool or connection pooling from
 the application.

Are you sure? I've heard of at least one installation which runs with
5000+ connections, and it works fine. (you know who you are - I don't
know if it's public info, so I can't put out the details - but feel free
to fill in :P)

That said, there's certainly some overhead, and using pgpool if possible
is good advice (depending on workload). I'm just wondering about
the "dramatically" part.
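
For reference, a minimal way to see how many of the configured connections
are actually in use before raising the limit (pg_stat_activity is standard;
the comparison against max_connections is up to you):

    -- current limit
    SHOW max_connections;

    -- backends currently connected
    SELECT count(*) AS connections_in_use FROM pg_stat_activity;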

//Magnus

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] increasing database connections

2007-03-01 Thread Florian Weimer
* Mark Kirkwood:

 Yeah - I thought that somewhere closer to 10000 connections is where
 you get hit with socket management related performance issues.

Huh?  These sockets aren't handled in a single process, are they?
Nowadays, this number of sockets does not pose any problem for most
systems, especially if you don't do I/O multiplexing.  Of course, if
you've got 10,000 connections which are active in parallel, most users
won't be content with 1/10,000th of your database performance. 8-/ (I
don't see why idle connections should be a problem from a socket
management POV, though.)

-- 
Florian Weimer[EMAIL PROTECTED]
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] increasing database connections

2007-03-01 Thread Joshua D. Drake
Magnus Hagander wrote:
 On Thu, Mar 01, 2007 at 12:49:14AM -0500, Jonah H. Harris wrote:
 On 3/1/07, Shiva Sarna [EMAIL PROTECTED] wrote:
 I am sorry if it is a repeat question but I want to know if database
 performance will decrease if I increase the max-connections to 2000. At
 present it is 100.
 Most certainly.  Adding connections over 200 will degrade performance
 dramatically.  You should look into pgpool or connection pooling from
 the application.
 
 Are you sure? I've heard of at least one installation which runs with
 5000+ connections, and it works fine.

We have one that high as well and it does fine. Although I wouldn't
suggest it on less than 8.1 ;). 8.2 handles it even better since 8.2
handles 8 cores better than 8.1.

Joshua D. Drake


-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Identical Queries

2007-03-01 Thread Stephan Szabo
On Thu, 1 Mar 2007, Rob Schall wrote:

 Question for anyone...

 I tried posting to the bugs list, and they said this is a better question for here.
 I have two queries. One runs in about 2 seconds. The other takes upwards
 of 2 minutes. I have a temp table that is created with 2 columns. This
 table is joined with the larger database of call detail records.
 However, these 2 queries are handled very differently.

How many rows are there in anitmp and how many rows in anitmp have
istf=true and how many have istf=false? If you don't currently analyze the
temp table after adding the rows, you might find that doing an analyze
helps, or at least makes the row estimates better.
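
A minimal sketch of that suggestion (anitmp and istf are the names from the
original report):

    -- after loading the rows into the temp table
    ANALYZE anitmp;
    -- then re-run EXPLAIN ANALYZE on both variants of the query and
    -- compare the estimated row counts against the actual ones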

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] stats collector process high CPU utilization

2007-03-01 Thread Tom Lane
Benjamin Minshall [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 It's sounding like what you had was just transient bloat, in which case
 it might be useful to inquire whether anything out-of-the-ordinary had
 been done to the database right before the excessive-CPU-usage problem
 started.

 I don't believe that there was any unusual activity on the server, but I 
 have set up some more detailed logging to hopefully identify a pattern 
 if the problem resurfaces.

A further report led us to realize that 8.2.x in fact has a nasty bug
here: the stats collector is supposed to dump its stats to a file at
most every 500 milliseconds, but the code was actually waiting only
500 microseconds :-(.  The larger the stats file, the more obvious
this problem gets.

If you want to patch this before 8.2.4, try this...

Index: pgstat.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/postmaster/pgstat.c,v
retrieving revision 1.140.2.2
diff -c -r1.140.2.2 pgstat.c
*** pgstat.c    26 Jan 2007 20:07:01 -0000      1.140.2.2
--- pgstat.c    1 Mar 2007 20:04:50 -0000
***************
*** 1689,1695 ****
/* Preset the delay between status file writes */
MemSet(write_timeout, 0, sizeof(struct itimerval));
write_timeout.it_value.tv_sec = PGSTAT_STAT_INTERVAL / 1000;
!   write_timeout.it_value.tv_usec = PGSTAT_STAT_INTERVAL % 1000;
  
/*
 * Read in an existing statistics stats file or initialize the stats to
--- 1689,1695 ----
/* Preset the delay between status file writes */
MemSet(write_timeout, 0, sizeof(struct itimerval));
write_timeout.it_value.tv_sec = PGSTAT_STAT_INTERVAL / 1000;
!   write_timeout.it_value.tv_usec = (PGSTAT_STAT_INTERVAL % 1000) * 1000;
  
/*
 * Read in an existing statistics stats file or initialize the stats to


regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


[PERFORM] strange performance regression between 7.4 and 8.1

2007-03-01 Thread Alex Deucher

Hello,

I have noticed a strange performance regression and I'm at a loss as
to what's happening.  We have a fairly large database (~16 GB).  The
original postgres 7.4 was running on a sun v880 with 4 CPUs and 8 GB
of ram running Solaris on local scsi discs.  The new server is a sun
Opteron box with 4 cores, 8 GB of ram running postgres 8.1.4 on Linux
(AMD64) on a 4 Gbps FC SAN volume.  When we created the new database
it was created from scratch rather than copying over the old one,
however the table structure is almost identical (UTF8 on the new one
vs. C on the old). The problem is queries are ~10x slower on the new
hardware.  I read several places that the SAN might be to blame, but
testing with bonnie and dd indicates that the SAN is actually almost
twice as fast as the scsi discs in the old sun server.  I've tried
adjusting just about every option in the postgres config file, but
performance remains the same.  Any ideas?

Thanks,

Alex

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] strange performance regression between 7.4 and 8.1

2007-03-01 Thread Jeff Frost

On Thu, 1 Mar 2007, Joshua D. Drake wrote:


Alex Deucher wrote:

Hello,

I have noticed a strange performance regression and I'm at a loss as
to what's happening.  We have a fairly large database (~16 GB).  The
original postgres 7.4 was running on a sun v880 with 4 CPUs and 8 GB
of ram running Solaris on local scsi discs.  The new server is a sun
Opteron box with 4 cores, 8 GB of ram running postgres 8.1.4 on Linux
(AMD64) on a 4 Gbps FC SAN volume.  When we created the new database
it was created from scratch rather than copying over the old one,
however the table structure is almost identical (UTF8 on the new one
vs. C on the old). The problem is queries are ~10x slower on the new
hardware.  I read several places that the SAN might be to blame, but
testing with bonnie and dd indicates that the SAN is actually almost
twice as fast as the scsi discs in the old sun server.  I've tried
adjusting just about every option in the postgres config file, but
performance remains the same.  Any ideas?


Vacuum? Analyze? default_statistics_target? How many shared_buffers?
effective_cache_size? work_mem?


Also, an explain analyze from both the 7.4 and 8.1 systems with one of the 
10x slower queries would probably be handy.


What do you mean by created from scratch rather than copying over the old 
one?  How did you put the data in?  Did you run analyze after loading it? 
Is autovacuum enabled and if so, what are the thresholds?
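
For reference, a minimal sketch of the kind of output being asked for here
(big_table is a hypothetical stand-in for one of the real tables):

    -- refresh planner statistics after the bulk load
    ANALYZE big_table;

    -- capture the plan and timings of one of the ~10x slower queries
    EXPLAIN ANALYZE SELECT count(*) FROM big_table;

Running the EXPLAIN ANALYZE on both the 7.4 and 8.1 servers lets the plans
be compared side by side.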


--
Jeff Frost, Owner   [EMAIL PROTECTED]
Frost Consulting, LLC   http://www.frostconsultingllc.com/
Phone: 650-780-7908 FAX: 650-649-1954

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [PERFORM] strange performance regression between 7.4 and 8.1

2007-03-01 Thread Alex Deucher

On 3/1/07, Joshua D. Drake [EMAIL PROTECTED] wrote:

Alex Deucher wrote:
 Hello,

 I have noticed a strange performance regression and I'm at a loss as
 to what's happening.  We have a fairly large database (~16 GB).  The
 original postgres 7.4 was running on a sun v880 with 4 CPUs and 8 GB
 of ram running Solaris on local scsi discs.  The new server is a sun
 Opteron box with 4 cores, 8 GB of ram running postgres 8.1.4 on Linux
 (AMD64) on a 4 Gbps FC SAN volume.  When we created the new database
 it was created from scratch rather than copying over the old one,
 however the table structure is almost identical (UTF8 on the new one
 vs. C on the old). The problem is queries are ~10x slower on the new
 hardware.  I read several places that the SAN might be to blame, but
 testing with bonnie and dd indicates that the SAN is actually almost
 twice as fast as the scsi discs in the old sun server.  I've tried
 adjusting just about every option in the postgres config file, but
 performance remains the same.  Any ideas?

Vacuum? Analyze? default_statistics_target? How many shared_buffers?
effective_cache_size? work_mem?



I'm running the autovacuum process on the 8.1 server.  vacuuming on
the old server was done manually.

default_statistics_target and effective_cache_size are set to the
defaults on both.

postgres 7.4 server:
# - Memory -
shared_buffers = 82000  # 1000 # min 16, at least max_connections*2, 8KB each
sort_mem = 8000         # 1024 # min 64, size in KB
vacuum_mem = 32000      # 8192 # min 1024, size in KB
# - Free Space Map -
#max_fsm_pages = 2  # min max_fsm_relations*16, 6 bytes each
#max_fsm_relations = 1000   # min 100, ~50 bytes each
# - Kernel Resource Usage -
#max_files_per_process = 1000   # min 25

postgres 8.1 server:
# - Memory -
shared_buffers = 10 # min 16 or max_connections*2, 8KB each
temp_buffers = 2000 #1000   # min 100, 8KB each
max_prepared_transactions = 100 #5  # can be 0 or more
# note: increasing max_prepared_transactions costs ~600 bytes of shared memory
# per transaction slot, plus lock space (see max_locks_per_transaction).
work_mem = 1#1024   # min 64, size in KB
maintenance_work_mem = 524288 #16384# min 1024, size in KB
#max_stack_depth = 2048 # min 100, size in KB

I've also tried using the same settings from the old server on the new
one; same performance issues.

Thanks,

Alex


Sincerely,

Joshua D. Drake



 Thanks,

 Alex

 ---(end of broadcast)---
 TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq



--

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/




---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] Identical Queries

2007-03-01 Thread Chad Wagner

On 3/1/07, Rob Schall [EMAIL PROTECTED] wrote:


There are 4 entries (wanted to make the playing field level for this
test). There are 2 with true for istf and 2 with false.



Then the difference here has to do with using orignum vs destnum as the join
criteria.  There must be more intersections for orignum than destnum, or
your statistics are so far out of whack.  It appears to be estimating 5M vs
500K for a result set, and naturally it chose a different plan.


Re: [PERFORM] strange performance regression between 7.4 and 8.1

2007-03-01 Thread Alex Deucher

On 3/1/07, Jeff Frost [EMAIL PROTECTED] wrote:

On Thu, 1 Mar 2007, Joshua D. Drake wrote:

 Alex Deucher wrote:
 Hello,

 I have noticed a strange performance regression and I'm at a loss as
 to what's happening.  We have a fairly large database (~16 GB).  The
 original postgres 7.4 was running on a sun v880 with 4 CPUs and 8 GB
 of ram running Solaris on local scsi discs.  The new server is a sun
 Opteron box with 4 cores, 8 GB of ram running postgres 8.1.4 on Linux
 (AMD64) on a 4 Gbps FC SAN volume.  When we created the new database
 it was created from scratch rather than copying over the old one,
 however the table structure is almost identical (UTF8 on the new one
 vs. C on the old). The problem is queries are ~10x slower on the new
 hardware.  I read several places that the SAN might be to blame, but
 testing with bonnie and dd indicates that the SAN is actually almost
 twice as fast as the scsi discs in the old sun server.  I've tried
 adjusting just about every option in the postgres config file, but
 performance remains the same.  Any ideas?

 Vacuum? Analyze? default_statistics_target? How many shared_buffers?
 effective_cache_size? work_mem?

Also, an explain analyze from both the 7.4 and 8.1 systems with one of the
10x slower queries would probably be handy.



I'll run some and get back to you.


What do you mean by created from scratch rather than copying over the old
one?  How did you put the data in?  Did you run analyze after loading it?
Is autovacuum enabled and if so, what are the thresholds?


Both the databases were originally created from xml files.  We just
re-created the new one from the xml rather than copying the old
database over.  I didn't manually run analyze on it, but we are
running the autovacuum process:

autovacuum = on #off# enable autovacuum subprocess?
autovacuum_naptime = 360 #60# time between autovacuum runs, in secs
autovacuum_vacuum_threshold = 1 #1000   # min # of tuple updates before
   # vacuum
autovacuum_analyze_threshold = 5000 #500# min # of tuple updates before

Thanks,

Alex



--
Jeff Frost, Owner   [EMAIL PROTECTED]
Frost Consulting, LLC   http://www.frostconsultingllc.com/
Phone: 650-780-7908 FAX: 650-649-1954



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] strange performance regression between 7.4 and 8.1

2007-03-01 Thread Jeff Frost

On Thu, 1 Mar 2007, Alex Deucher wrote:


Vacuum? Analyze? default_statistics_target? How many shared_buffers?
effective_cache_size? work_mem?



I'm running the autovacuum process on the 8.1 server.  vacuuming on
the old server was done manually.

 default_statistics_target and effective_cache_size are set to the
defaults on both.

postgres 7.4 server:
# - Memory -
shared_buffers = 82000 # 1000min 16, at least
max_connections*2, 8KB each
sort_mem = 8000# 1024min 64, size in KB
vacuum_mem = 32000 # 8192min 1024, size in KB
# - Free Space Map -
#max_fsm_pages = 2  # min max_fsm_relations*16, 6 bytes each
#max_fsm_relations = 1000   # min 100, ~50 bytes each
# - Kernel Resource Usage -
#max_files_per_process = 1000   # min 25

postgres 8.1 server:
# - Memory -
shared_buffers = 10 # min 16 or max_connections*2, 8KB 
each

temp_buffers = 2000 #1000   # min 100, 8KB each
max_prepared_transactions = 100 #5  # can be 0 or more
# note: increasing max_prepared_transactions costs ~600 bytes of shared 
memory

# per transaction slot, plus lock space (see max_locks_per_transaction).
work_mem = 1#1024   # min 64, size in KB
maintenance_work_mem = 524288 #16384# min 1024, size in KB
#max_stack_depth = 2048 # min 100, size in KB

I've also tried using the same settings from the old server on the new
one; same performance issues.



If this is a linux system, could you give us the output of the 'free' command? 
Postgresql might be choosing a bad plan because your effective_cache_size is 
way off (it's the default now right?).  Also, what was the block read/write 
speed of the SAN from your bonnie tests?  Probably want to tune 
random_page_cost as well if it's also at the default.



--
Jeff Frost, Owner   [EMAIL PROTECTED]
Frost Consulting, LLC   http://www.frostconsultingllc.com/
Phone: 650-780-7908 FAX: 650-649-1954

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] strange performance regression between 7.4 and 8.1

2007-03-01 Thread Alex Deucher

On 3/1/07, Jeff Frost [EMAIL PROTECTED] wrote:

On Thu, 1 Mar 2007, Alex Deucher wrote:

 Vacuum? Analyze? default_statistics_target? How many shared_buffers?
 effective_cache_size? work_mem?


 I'm running the autovacuum process on the 8.1 server.  vacuuming on
 the old server was done manually.

 default_statistics_target and effective_cache_size are set to the
 defaults on both.

 postgres 7.4 server:
 # - Memory -
 shared_buffers = 82000 # 1000min 16, at least
 max_connections*2, 8KB each
 sort_mem = 8000# 1024min 64, size in KB
 vacuum_mem = 32000 # 8192min 1024, size in KB
 # - Free Space Map -
 #max_fsm_pages = 2  # min max_fsm_relations*16, 6 bytes each
 #max_fsm_relations = 1000   # min 100, ~50 bytes each
 # - Kernel Resource Usage -
 #max_files_per_process = 1000   # min 25

 postgres 8.1 server:
 # - Memory -
 shared_buffers = 10 # min 16 or max_connections*2, 8KB
 each
 temp_buffers = 2000 #1000   # min 100, 8KB each
 max_prepared_transactions = 100 #5  # can be 0 or more
 # note: increasing max_prepared_transactions costs ~600 bytes of shared
 memory
 # per transaction slot, plus lock space (see max_locks_per_transaction).
 work_mem = 1#1024   # min 64, size in KB
 maintenance_work_mem = 524288 #16384# min 1024, size in KB
 #max_stack_depth = 2048 # min 100, size in KB

 I've also tried using the same settings from the old server on the new
 one; same performance issues.


If this is a linux system, could you give us the output of the 'free' command?


             total       used       free     shared    buffers     cached
Mem:       8059852    8042868      16984          0        228    7888648
-/+ buffers/cache:     153992    7905860
Swap:     15631224       2164   15629060



Postgresql might be choosing a bad plan because your effective_cache_size is
way off (it's the default now right?).  Also, what was the block read/write


yes it's set to the default.


speed of the SAN from your bonnie tests?  Probably want to tune
random_page_cost as well if it's also at the default.



                    --Sequential Output-- --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
luna12-san   16000M 58896  91 62931   9 35870   5 54869  82 145504 13  397.7   0

effective_cache_size is the default.

Alex

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] stats collector process high CPU utilization

2007-03-01 Thread Merlin Moncure

On 3/1/07, Tom Lane [EMAIL PROTECTED] wrote:

Benjamin Minshall [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 It's sounding like what you had was just transient bloat, in which case
 it might be useful to inquire whether anything out-of-the-ordinary had
 been done to the database right before the excessive-CPU-usage problem
 started.

 I don't believe that there was any unusual activity on the server, but I
 have set up some more detailed logging to hopefully identify a pattern
 if the problem resurfaces.

A further report led us to realize that 8.2.x in fact has a nasty bug
here: the stats collector is supposed to dump its stats to a file at
most every 500 milliseconds, but the code was actually waiting only
500 microseconds :-(.  The larger the stats file, the more obvious
this problem gets.


I think this explains the trigger that was blowing up my FC4 box.

merlin

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] strange performance regression between 7.4 and 8.1

2007-03-01 Thread Jeff Frost

On Thu, 1 Mar 2007, Alex Deucher wrote:


On 3/1/07, Jeff Frost [EMAIL PROTECTED] wrote:

On Thu, 1 Mar 2007, Alex Deucher wrote:

 Vacuum? Analyze? default_statistics_target? How many shared_buffers?
 effective_cache_size? work_mem?


 I'm running the autovacuum process on the 8.1 server.  vacuuming on
 the old server was done manually.

 default_statistics_target and effective_cache_size are set to the
 defaults on both.

 postgres 7.4 server:
 # - Memory -
 shared_buffers = 82000 # 1000min 16, at least
 max_connections*2, 8KB each
 sort_mem = 8000# 1024min 64, size in KB
 vacuum_mem = 32000 # 8192min 1024, size in KB
 # - Free Space Map -
 #max_fsm_pages = 2  # min max_fsm_relations*16, 6 bytes each
 #max_fsm_relations = 1000   # min 100, ~50 bytes each
 # - Kernel Resource Usage -
 #max_files_per_process = 1000   # min 25

 postgres 8.1 server:
 # - Memory -
 shared_buffers = 10 # min 16 or max_connections*2, 
8KB

 each
 temp_buffers = 2000 #1000   # min 100, 8KB each
 max_prepared_transactions = 100 #5  # can be 0 or more
 # note: increasing max_prepared_transactions costs ~600 bytes of shared
 memory
 # per transaction slot, plus lock space (see max_locks_per_transaction).
 work_mem = 1#1024   # min 64, size in KB
 maintenance_work_mem = 524288 #16384# min 1024, size in KB
 #max_stack_depth = 2048 # min 100, size in KB

 I've also tried using the same settings from the old server on the new
 one; same performance issues.


If this is a linux system, could you give us the output of the 'free' 
command?


             total       used       free     shared    buffers     cached
Mem:       8059852    8042868      16984          0        228    7888648
-/+ buffers/cache:     153992    7905860
Swap:     15631224       2164   15629060


So, I would set effective_cache_size = 988232 (7905860/8).
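
A minimal sketch of trying that value out (the SET form only affects the
current session; the permanent change goes in postgresql.conf):

    -- 7905860 kB free per "-/+ buffers/cache", divided by the 8 kB page size
    SET effective_cache_size = 988232;
    SHOW effective_cache_size;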



Postgresql might be choosing a bad plan because your effective_cache_size 
is

way off (it's the default now right?).  Also, what was the block read/write


yes it's set to the default.


speed of the SAN from your bonnie tests?  Probably want to tune
random_page_cost as well if it's also at the default.



                    --Sequential Output-- --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
luna12-san   16000M 58896  91 62931   9 35870   5 54869  82 145504 13  397.7   0




So, you're getting 62MB/s writes and 145MB/s reads.  Just FYI, that write 
speed is about the same as my single SATA drive write speed on my workstation, 
so not that great.  The read speed is decent, though and with that sort of 
read performance, you might want to lower random_page_cost to something like 
2.5 or 2 so the planner will tend to prefer index scans.
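
A minimal sketch of testing that per session before making it permanent
(the query is illustrative; use one of the actual slow queries):

    SET random_page_cost = 2.5;
    -- see whether the planner now prefers an index scan
    EXPLAIN ANALYZE SELECT count(*) FROM big_table;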


--
Jeff Frost, Owner   [EMAIL PROTECTED]
Frost Consulting, LLC   http://www.frostconsultingllc.com/
Phone: 650-780-7908 FAX: 650-649-1954

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] strange performance regression between 7.4 and 8.1

2007-03-01 Thread Alex Deucher

On 3/1/07, Jeff Frost [EMAIL PROTECTED] wrote:

On Thu, 1 Mar 2007, Alex Deucher wrote:

 On 3/1/07, Jeff Frost [EMAIL PROTECTED] wrote:
 On Thu, 1 Mar 2007, Alex Deucher wrote:

  Vacuum? Analyze? default_statistics_target? How many shared_buffers?
  effective_cache_size? work_mem?
 
 
  I'm running the autovacuum process on the 8.1 server.  vacuuming on
  the old server was done manually.
 
  default_statistics_target and effective_cache_size are set to the
  defaults on both.
 
  postgres 7.4 server:
  # - Memory -
  shared_buffers = 82000 # 1000min 16, at least
  max_connections*2, 8KB each
  sort_mem = 8000# 1024min 64, size in KB
  vacuum_mem = 32000 # 8192min 1024, size in KB
  # - Free Space Map -
  #max_fsm_pages = 2  # min max_fsm_relations*16, 6 bytes each
  #max_fsm_relations = 1000   # min 100, ~50 bytes each
  # - Kernel Resource Usage -
  #max_files_per_process = 1000   # min 25
 
  postgres 8.1 server:
  # - Memory -
  shared_buffers = 10 # min 16 or max_connections*2,
 8KB
  each
  temp_buffers = 2000 #1000   # min 100, 8KB each
  max_prepared_transactions = 100 #5  # can be 0 or more
  # note: increasing max_prepared_transactions costs ~600 bytes of shared
  memory
  # per transaction slot, plus lock space (see max_locks_per_transaction).
  work_mem = 1#1024   # min 64, size in KB
  maintenance_work_mem = 524288 #16384# min 1024, size in KB
  #max_stack_depth = 2048 # min 100, size in KB
 
  I've also tried using the same settings from the old server on the new
  one; same performance issues.
 

 If this is a linux system, could you give us the output of the 'free'
 command?

             total       used       free     shared    buffers     cached
Mem:       8059852    8042868      16984          0        228    7888648
-/+ buffers/cache:     153992    7905860
Swap:     15631224       2164   15629060

So, I would set effective_cache_size = 988232 (7905860/8).


 Postgresql might be choosing a bad plan because your effective_cache_size
 is
 way off (it's the default now right?).  Also, what was the block read/write

 yes it's set to the default.

 speed of the SAN from your bonnie tests?  Probably want to tune
 random_page_cost as well if it's also at the default.


                    --Sequential Output-- --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
luna12-san   16000M 58896  91 62931   9 35870   5 54869  82 145504 13  397.7   0


So, you're getting 62MB/s writes and 145MB/s reads.  Just FYI, that write
speed is about the same as my single SATA drive write speed on my workstation,
so not that great.  The read speed is decent, though and with that sort of
read performance, you might want to lower random_page_cost to something like
2.5 or 2 so the planner will tend to prefer index scans.



Right, but the old box was getting ~45MBps on both reads and writes,
so it's an improvement for me :)  Thanks for the advice, I'll let you
know how it goes.

Alex

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] Identical Queries

2007-03-01 Thread Stephan Szabo
On Thu, 1 Mar 2007, Rob Schall wrote:

 There are 4 entries (wanted to make the playing field level for this
 test). There are 2 with true for istf and 2 with false.

Then analyzing might help, because I think it's estimating many more rows
for both cases, and with 2 rows estimated to be returned the nested loop
should seem a lot more attractive than at 900+.

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PERFORM] strange performance regression between 7.4 and 8.1

2007-03-01 Thread Ron

At 07:36 PM 3/1/2007, Jeff Frost wrote:

On Thu, 1 Mar 2007, Alex Deucher wrote:

 Postgresql might be choosing a bad plan because your 
effective_cache_size

 is
 way off (it's the default now right?).  Also, what was the 
block read/write


 yes it's set to the default.

 speed of the SAN from your bonnie tests?  Probably want to tune
 random_page_cost as well if it's also at the default.


                    --Sequential Output-- --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
luna12-san   16000M 58896  91 62931   9 35870   5 54869  82 145504 13  397.7   0

So, you're getting 62MB/s writes and 145MB/s reads.  Just FYI, that write
speed is about the same as my single SATA drive write speed on my 
workstation,

so not that great.  The read speed is decent, though and with that sort of
read performance, you might want to lower random_page_cost to something like
2.5 or 2 so the planner will tend to prefer index scans.


Right, but the old box was getting ~45MBps on both reads and writes,
so it's an improvement for me :)  Thanks for the advice, I'll let you
know how it goes.


Do you think that is because you have a different interface between 
you and the SAN?  ~45MBps is pretty slow - your average 7200RPM 
ATA133 drive can do that and costs quite a bit less than a SAN.


Is the SAN being shared between the database servers and other 
servers?  Maybe it was just random timing that gave you the poor 
write performance on the old server which might be also yielding 
occasional poor performance on the new one.


Remember that pg, even pg 8.2.3, has a known history of very poor 
insert speed (see comments on this point by Josh Berkus,  Luke Lonergan, etc)


For some reason, the code changes that have resulted in dramatic 
improvements in pg's read speed have not had nearly the same efficacy 
for writes.


Bottom line: pg presently has a fairly low and fairly harsh upper 
bound on write performance.   What exactly that bound is has been the 
subject of some discussion, but IIUC the fact of its existence is 
well established.


Various proposals for improving the situation exist, I've even made 
some of them, but AFAIK this is currently considered one of the 
tough pg problems.


Cheers,
Ron Peacetree 



---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] stats collector process high CPU utilization

2007-03-01 Thread Merlin Moncure

On 3/2/07, Tom Lane [EMAIL PROTECTED] wrote:

Merlin Moncure [EMAIL PROTECTED] writes:
 On 3/1/07, Tom Lane [EMAIL PROTECTED] wrote:
 A further report led us to realize that 8.2.x in fact has a nasty bug
 here: the stats collector is supposed to dump its stats to a file at
 most every 500 milliseconds, but the code was actually waiting only
 500 microseconds :-(.  The larger the stats file, the more obvious
 this problem gets.

 I think this explains the trigger that was blowing up my FC4 box.

I dug in the archives a bit and couldn't find the report you're
referring to?


I was referring to this:
http://archives.postgresql.org/pgsql-hackers/2007-02/msg01418.php

Even though the fundamental reason was obvious (and btw, I inherited
this server less than two months ago), I was still curious what was
making 8.2 blow up a box that was handling a million tps/hour for over
a year. :-)

merlin

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] strange performance regression between 7.4 and 8.1

2007-03-01 Thread Joshua D. Drake
\
 Is the SAN being shared between the database servers and other
 servers?  Maybe
 it was just random timing that gave you the poor write performance on
 the old
 server which might be also yielding occasional poor performance on
 the new
 one.

 
 The direct attached scsi discs on the old database server were getting
 45MBps, not the SAN.  The SAN got 62/145Mbps, which is not as bad.

How many spindles you got in that SAN?

 We
 have 4 servers on the SAN, each with its own 4 GBps FC link via an FC
 switch.  I'll try and re-run the numbers when the servers are idle
 this weekend.
 
 Alex
 


-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] strange performance regression between 7.4 and 8.1

2007-03-01 Thread Alex Deucher

On 3/1/07, Joshua D. Drake [EMAIL PROTECTED] wrote:

\
 Is the SAN being shared between the database servers and other
 servers?  Maybe
 it was just random timing that gave you the poor write performance on
 the old
 server which might be also yielding occasional poor performance on
 the new
 one.


 The direct attached scsi discs on the old database server were getting
 45MBps, not the SAN.  The SAN got 62/145Mbps, which is not as bad.

How many spindles you got in that SAN?


105 IIRC.

Alex

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] strange performance regression between 7.4 and 8.1

2007-03-01 Thread Jeff Frost

On Thu, 1 Mar 2007, Alex Deucher wrote:


On 3/1/07, Jeff Frost [EMAIL PROTECTED] wrote:

On Thu, 1 Mar 2007, Alex Deucher wrote:

  Postgresql might be choosing a bad plan because your
 effective_cache_size
  is
  way off (it's the default now right?).  Also, what was the block
 read/write
 
  yes it's set to the default.
 
  speed of the SAN from your bonnie tests?  Probably want to tune
  random_page_cost as well if it's also at the default.
 
 
                    --Sequential Output-- --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
luna12-san   16000M 58896  91 62931   9 35870   5 54869  82 145504 13  397.7   0
 

 So, you're getting 62MB/s writes and 145MB/s reads.  Just FYI, that 
write

 speed is about the same as my single SATA drive write speed on my
 workstation,
 so not that great.  The read speed is decent, though and with that sort 
of

 read performance, you might want to lower random_page_cost to something
 like
 2.5 or 2 so the planner will tend to prefer index scans.


 Right, but the old box was getting ~45MBps on both reads and writes,
 so it's an improvement for me :)  Thanks for the advice, I'll let you
 know how it goes.

Do you think that is because you have a different interface between you and
the SAN?  ~45MBps is pretty slow - your average 7200RPM ATA133 drive can do
that and costs quite a bit less than a SAN.

Is the SAN being shared between the database servers and other servers? 
Maybe
it was just random timing that gave you the poor write performance on the 
old

server which might be also yielding occasional poor performance on the new
one.



The direct attached scsi discs on the old database server were getting
45MBps, not the SAN.  The SAN got 62/145Mbps, which is not as bad.  We
have 4 servers on the SAN, each with its own 4 GBps FC link via an FC
switch.  I'll try and re-run the numbers when the servers are idle
this weekend.


Sorry, I thought the old server was also attached to the SAN.  My fault for 
not hanging onto the entire email thread.


I think you're mixing and matching your capital and lower case Bs in your 
sentence above though. :-)


I suspect what you really mean is "The SAN got 62/145MBps (megabytes/sec)" and 
the FC link is 4Gbps (gigabits/sec) or 500MBps.  Is that correct?  If so, and 
seeing that you think there are 105 spindles on the SAN, I'd say you're either 
maxing out the switch fabric of the SAN with your servers or you have a 
really poorly performing SAN in general, or you just misunderstood the .


As a comparison, with 8 WD Raptors configured in a RAID10 with normal ext3 I 
get about 160MB/s write and 305MB/s read performance.  Hopefully the SAN has 
lots of other super nifty features that make up for the poor performance. :-(


--
Jeff Frost, Owner   [EMAIL PROTECTED]
Frost Consulting, LLC   http://www.frostconsultingllc.com/
Phone: 650-780-7908 FAX: 650-649-1954

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] strange performance regression between 7.4 and 8.1

2007-03-01 Thread Alex Deucher

On 3/1/07, Jeff Frost [EMAIL PROTECTED] wrote:

On Thu, 1 Mar 2007, Alex Deucher wrote:

 On 3/1/07, Jeff Frost [EMAIL PROTECTED] wrote:
 On Thu, 1 Mar 2007, Alex Deucher wrote:

   Postgresql might be choosing a bad plan because your
  effective_cache_size
   is
   way off (it's the default now right?).  Also, what was the block
  read/write
  
   yes it's set to the default.
  
   speed of the SAN from your bonnie tests?  Probably want to tune
   random_page_cost as well if it's also at the default.
  
  
                    --Sequential Output-- --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
luna12-san   16000M 58896  91 62931   9 35870   5 54869  82 145504 13  397.7   0
  
 
  So, you're getting 62MB/s writes and 145MB/s reads.  Just FYI, that
 write
  speed is about the same as my single SATA drive write speed on my
  workstation,
  so not that great.  The read speed is decent, though and with that sort
 of
  read performance, you might want to lower random_page_cost to something
  like
  2.5 or 2 so the planner will tend to prefer index scans.
 
 
  Right, but the old box was getting ~45MBps on both reads and writes,
  so it's an improvement for me :)  Thanks for the advice, I'll let you
  know how it goes.

 Do you think that is because you have a different interface between you and
 the SAN?  ~45MBps is pretty slow - your average 7200RPM ATA133 drive can do
 that and costs quite a bit less than a SAN.

 Is the SAN being shared between the database servers and other servers?
 Maybe
 it was just random timing that gave you the poor write performance on the
 old
 server which might be also yielding occasional poor performance on the new
 one.


 The direct attached scsi discs on the old database server were getting
 45MBps, not the SAN.  The SAN got 62/145Mbps, which is not as bad.  We
 have 4 servers on the SAN, each with its own 4 GBps FC link via an FC
 switch.  I'll try and re-run the numbers when the servers are idle
 this weekend.

Sorry, I thought the old server was also attached to the SAN.  My fault for
not hanging onto the entire email thread.

I think you're mixing and matching your capital and lower case Bs in your
sentence above though. :-)


whoops :)



I suspect what you really mean is "The SAN got 62/145MBps (megabytes/sec)" and
the FC link is 4Gbps (gigabits/sec) or 500MBps.  Is that correct?  If so, and
seeing that you think there are 105 spindles on the SAN, I'd say you're either
maxing out the switch fabric of the SAN with your servers or you have a
really poorly performing SAN in general, or you just misunderstood the .

As a comparison, with 8 WD Raptors configured in a RAID10 with normal ext3 I
get about 160MB/s write and 305MB/s read performance.  Hopefully the SAN has
lots of other super nifty features that make up for the poor performance. :-(



It's big and reliable (and compared to lots of others, relatively
inexpensive) which is why we bought it.  We bought it mostly as a huge
file store.  The RAID groups on the SAN were set up for maximum
capacity rather than for performance.  Using it for the databases just
came up recently.

Alex

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] strange performance regression between 7.4 and 8.1

2007-03-01 Thread Joshua D. Drake
Alex Deucher wrote:
 On 3/1/07, Joshua D. Drake [EMAIL PROTECTED] wrote:
 \
  Is the SAN being shared between the database servers and other
  servers?  Maybe
  it was just random timing that gave you the poor write performance on
  the old
 server which might be also yielding occasional poor performance on
  the new
  one.
 
 
  The direct attached scsi discs on the old database server were getting
  45MBps, not the SAN.  The SAN got 62/145Mbps, which is not as bad.

 How many spindles you got in that SAN?
 
 105 IIRC.

You have 105 spindles and you are only getting 62 megs on writes? That seems
about half of what you should be getting (at least).

Joshua D. Drake


 
 Alex
 


-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] strange performance regression between 7.4 and 8.1

2007-03-01 Thread Alex Deucher

On 3/1/07, Joshua D. Drake [EMAIL PROTECTED] wrote:

Alex Deucher wrote:
 On 3/1/07, Joshua D. Drake [EMAIL PROTECTED] wrote:
 \
  Is the SAN being shared between the database servers and other
  servers?  Maybe
  it was just random timing that gave you the poor write performance on
  the old
 server which might be also yielding occasional poor performance on
  the new
  one.
 
 
  The direct attached scsi discs on the old database server were getting
  45MBps, not the SAN.  The SAN got 62/145Mbps, which is not as bad.

 How many spindles you got in that SAN?

 105 IIRC.

You have 105 spindles and you are only getting 62 megs on writes? That seems
about half of what you should be getting (at least).



Take the numbers with a grain of salt.  They are by no means a thorough
evaluation.  I just ran bonnie a couple times to get a rough reference
point.  I can do a more thorough analysis.

Alex


Joshua D. Drake



 Alex




---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


[PERFORM] Improving query performance

2007-03-01 Thread David Leangen

Hello!

I'm new to performance tuning on postgres. I've read the docs on the
postgres site, as well as:

  http://www.revsys.com/writings/postgresql-performance.html
  http://www.powerpostgresql.com/PerfList

However, my query is really slow, and I'm not sure what the main cause
could be, as there are so many variables. I'm hoping people with more
experience could help out.

My machine has 8GB RAM, 2xCPU (2GHz, I think...)

Table has about 1M rows.

This is my postgres.conf:

listen_addresses = '*'
port = 5432
max_connections = 100
shared_buffers = 256000
effective_cache_size = 100
work_mem = 500
redirect_stderr = on
log_directory = 'pg_log'
log_filename = 'postgresql-%a.log'
log_truncate_on_rotation = on
log_rotation_age = 1440
log_rotation_size = 0
lc_messages = 'en_US.UTF-8'
lc_monetary = 'en_US.UTF-8'
lc_numeric = 'en_US.UTF-8'
lc_time = 'en_US.UTF-8'

This is the result of explain analyze:

 Aggregate  (cost=384713.17..384713.18 rows=1 width=4) (actual time=254856.025..254856.025 rows=1 loops=1)
   ->  Seq Scan on medline_articles t0  (cost=0.00..382253.00 rows=984068 width=4) (actual time=511.841..254854.981 rows=788 loops=1)
         Filter: long_ugly_query_here



And this is the actual query:

SELECT COUNT(t0.ID) FROM public.MY_TABLE t0 
WHERE ((POSITION('adenosine cyclic 35-monophosphate' IN LOWER(t0.TITLE))
- 1) =0 OR 
(POSITION('adenosine cyclic 55-monophosphate' IN LOWER(t0.TEXT)) - 1) =
0 OR 
(POSITION('cyclic amp, disodium salt' IN LOWER(t0.TITLE)) - 1) = 0 OR 
(POSITION('cyclic amp, disodium salt' IN LOWER(t0.TEXT)) - 1) = 0 OR 
(POSITION('cyclic amp, sodium salt' IN LOWER(t0.TEXT)) - 1) = 0 OR 
(POSITION('cyclic amp, sodium salt' IN LOWER(t0.TITLE)) - 1) = 0 OR 
(POSITION('cyclic amp' IN LOWER(t0.TEXT)) - 1) = 0 OR 
(POSITION('cyclic amp' IN LOWER(t0.TITLE)) - 1) = 0 OR 
(POSITION('cyclic amp, monopotassium salt' IN LOWER(t0.TEXT)) - 1) = 0
OR 
(POSITION('cyclic amp, monopotassium salt' IN LOWER(t0.TEXT)) - 1) = 0
OR 
(POSITION('adenosine cyclic-35-monophosphate' IN LOWER(t0.TEXT)) - 1) =
0 OR 
(POSITION('adenosine cyclic-35-monophosphate' IN LOWER(t0.TITLE)) - 1)
= 0 OR 
(POSITION('adenosine cyclic monophosphate' IN LOWER(t0.TEXT)) - 1) = 0
OR 
(POSITION('adenosine cyclic monophosphate' IN LOWER(t0.TITLE)) - 1) = 0
OR 
(POSITION('cyclic amp, monoammonium salt' IN LOWER(t0.TEXT)) - 1) = 0
OR 
(POSITION('cyclic amp, monoammonium salt' IN LOWER(t0.TITLE)) - 1) = 0
OR 
(POSITION('adenosine cyclic 3,5 monophosphate' IN LOWER(t0.TEXT)) - 1)
= 0 OR 
(POSITION('adenosine cyclic 3,5 monophosphate' IN LOWER(t0.TITLE)) - 1)
= 0 OR 
(POSITION('cyclic amp, monosodium salt' IN LOWER(t0.TEXT)) - 1) = 0 OR 
(POSITION('cyclic amp, monosodium salt' IN LOWER(t0.TITLE)) - 1) = 0
OR 
(POSITION('cyclic amp, (r)-isomer' IN LOWER(t0.TEXT)) - 1) = 0 OR 
(POSITION('cyclic amp, (r)-isomer' IN LOWER(t0.TEXT)) - 1) = 0)


Some more info:

pubmed= SELECT relpages, reltuples FROM pg_class WHERE relname =
'MY_TABLE';
 relpages | reltuples
--+---
   155887 |984200
(1 row)



Thanks for any suggestions!

Dave




PS - Yes! I did run vacuum analyze :-)




---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


[PERFORM] Array indexes, GIN?

2007-03-01 Thread Adam L Beberg
I need to cross reference 2 tables. There are O(10M) A's, each has an 
ordered set of 10 of the O(100K) B's associated with it. The dominant 
query will be finding the A's and their count associated with a given 
list of ~1k B's i.e. if 2 of the listed B's are in A's set of 10, it's 
(A,2), and we should get back ~100K rows. The good news is we only need 
to run this brutal query every couple minutes, but the row updates will 
flow fast.


Luckily this is PostgreSQL, so the simple solution seems to be

  CREATE TABLE xref( A bigint, B bigint[10] ); -- A is primary key

which cuts down the table overhead. O(10M) rows w/array.

On the surface, looks like a job for GIN, but GIN seems undocumented, 
specifically mentions it doesn't support the deletes we'll have many of 
since it's designed for word searching apparently, the performance 
implications are undocumented. I searched, I read, and even IRC'd, and 
it seems like GIN is just not used much.


Is GIN right? Will this work at all? Will it run fast enough to function?

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Array indexes, GIN?

2007-03-01 Thread Josh Berkus
Adam,

 On the surface, looks like a job for GIN, but GIN seems undocumented,
 specifically mentions it doesn't support the deletes we'll have many of
 since it's designed for word searching apparently, the performance
 implications are undocumented. I searched, I read, and even IRC'd, and
 it seems like GIN is just not used much.

It's new (as of 8.2).  And the authors, Oleg and Teodor, are notorious for 
skimpy documentation.

I'd start with the code in INTARRAY contrib module (also by Teodor) and bug 
them on pgsql-hackers about helping you implement a GIN index for arrays.

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Improving query performance

2007-03-01 Thread Tom Lane
David Leangen [EMAIL PROTECTED] writes:
 And this is the actual query:

 SELECT COUNT(t0.ID) FROM public.MY_TABLE t0 
 WHERE ((POSITION('adenosine cyclic 35-monophosphate' IN LOWER(t0.TITLE))
 - 1) =0 OR 
 (POSITION('adenosine cyclic 55-monophosphate' IN LOWER(t0.TEXT)) - 1) =
 0 OR 
 (POSITION('cyclic amp, disodium salt' IN LOWER(t0.TITLE)) - 1) = 0 OR 
 (POSITION('cyclic amp, disodium salt' IN LOWER(t0.TEXT)) - 1) = 0 OR 
 (POSITION('cyclic amp, sodium salt' IN LOWER(t0.TEXT)) - 1) = 0 OR 
 ...etc...

I think you need to look into full-text indexing (see tsearch2).
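
A minimal sketch of the tsearch2 approach (assuming the tsearch2 contrib
module is installed; column and index names mirror the query above and are
otherwise illustrative):

    -- precompute a search vector over the two text fields
    ALTER TABLE my_table ADD COLUMN fti tsvector;
    UPDATE my_table
       SET fti = to_tsvector(coalesce(title, '') || ' ' || coalesce(text, ''));
    CREATE INDEX my_table_fti_idx ON my_table USING gist (fti);
    VACUUM ANALYZE my_table;

    -- an indexed full-text search instead of the POSITION() scans
    SELECT count(id) FROM my_table WHERE fti @@ to_tsquery('cyclic & amp');

Note that full-text search matches stemmed words rather than exact
substrings, so it is not a drop-in replacement for the POSITION() tests.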

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Array indexes, GIN?

2007-03-01 Thread Oleg Bartunov

On Thu, 1 Mar 2007, Josh Berkus wrote:


Adam,


On the surface, looks like a job for GIN, but GIN seems undocumented,
specifically mentions it doesn't support the deletes we'll have many of
since it's designed for word searching apparently, the performance
implications are undocumented. I searched, I read, and even IRC'd, and
it seems like GIN is just not used much.


It's new (as of 8.2).  And the authors, Oleg and Teodor, are notorious for
skimpy documentation.


We're getting better, we have 72 pages written about new FTS :)



I'd start with the code in INTARRAY contrib module (also by Teodor) and bug
them on pgsql-hackers about helping you implement a GIN index for arrays.


GIN already has support for one-dimensional arrays, and intarray, in
particular, also has support for GIN.
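
A minimal sketch of the array-plus-GIN layout from the original post (8.2
syntax; the index name and the overlap query are illustrative):

    CREATE TABLE xref (
        a bigint PRIMARY KEY,
        b bigint[]            -- the ordered set of ~10 B's for each A
    );

    -- index the array elements
    CREATE INDEX xref_b_gin ON xref USING gin (b);

    -- A's whose set shares at least one element with a given list of B's
    SELECT a FROM xref WHERE b && ARRAY[101, 102, 103]::bigint[];

The index only narrows down the candidate rows; producing the per-A match
count from the original question still has to be done on top of this, and
the delete-heavy update pattern is exactly the open question in this thread.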


Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [PERFORM] Array indexes, GIN?

2007-03-01 Thread Adam L Beberg

Oleg Bartunov wrote on 3/1/2007 10:45 PM:

On Thu, 1 Mar 2007, Josh Berkus wrote:


Adam,


On the surface, looks like a job for GIN, but GIN seems undocumented,
specifically mentions it doesn't support the deletes we'll have many of
since it's designed for word searching apparently, the performance
implications are undocumented. I searched, I read, and even IRC'd, and
it seems like GIN is just not used much.


It's new (as of 8.2).  And the authors, Oleg and Teodor, are notorious for
skimpy documentation.


We're getting better, we have 72 pages written about new FTS :)


I'm guessing FTS is not quite done since you matched 'FTS' to 'GIN' ;)

GIN already has support for one-dimensional arrays, and intarray, in
particular, also has support for GIN.


Great, so can GIN handle my situation? I'm a little unsure what to make
of "Note: There is no delete operation for ET." in particular, since I'm
dealing with large numbers.


--
Adam L. Beberg
http://www.mithral.com/~beberg/

---(end of broadcast)---
TIP 6: explain analyze is your friend