Re: [PERFORM] Swapping on Solaris

2005-02-02 Thread Bruce Momjian
Andrew Sullivan wrote:
 On Wed, Jan 19, 2005 at 10:42:26AM -0500, Alan Stange wrote:
  
  I'm fairly sure that the pi and po numbers include file IO in Solaris, 
  because of the unified VM and file systems.
 
 That's correct.

I have seen cases on BSDs where 'pi' includes page-faulting in the
executables from the file system, but Solaris actually has 'po' as
filesystem I/O.  That is a new one to me.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Bitmap indexes

2005-02-02 Thread Bruce Momjian
PFC wrote:
  There's a great deal about this in the list archives (probably more in
  pgsql-hackers than in -performance).  Most of the current interest has
  to do with building in-memory bitmaps on the fly, as a way of decoupling
  index and heap scan processing.  Which is not quite what you're talking
  about but should be pretty effective for low-cardinality cases.  In
  particular it'd allow AND and OR combination of multiple indexes, which
  we do poorly or not at all at the moment.
 
   Is this called a star join?
 
   It would also allow accessing the data pages in a more sequential order
 if the rows are not required to be retrieved in index order, which would
 potentially be a large speedup for index scans concerning more than the
 usual very small percentage of rows in a table: if several rows to be
 retrieved are on the same page, it would visit this page only once.

Please see the TODO list for a summary of previous discussions and
directions.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


[PERFORM] Bad query optimizer misestimation because of TOAST tables

2005-02-02 Thread Markus Schaber
[This mail goes as X-Post to both pgsql-perform and postgis-users
because postgis users may suffer from this problem, but I would prefer
to keep the Discussion on pgsql-performance as it is a general TOAST
problem and not specific to PostGIS alone.]

Hello,

Running PostGIS 0.8.1 under PostgreSQL 7.4.6-7 (Debian), I struggled
over the following problem:

logigis=# explain analyze SELECT geom FROM adminbndy1 WHERE geom &&
setsrid('BOX3D(9.4835390946502 47.39365740740741,9.5164609053498
47.40634259259259)'::box3d,4326);
                               QUERY PLAN
--------------------------------------------------------------------------
 Seq Scan on adminbndy1  (cost=0.00..4.04 rows=1 width=121) (actual
time=133.591..7947.546 rows=5 loops=1)
   Filter: (geom && 'SRID=4326;BOX3D(9.4835390946502 47.3936574074074
0,9.5164609053498 47.4063425925926 0)'::geometry)
 Total runtime: 7947.669 ms
(3 rows)

logigis=# set enable_seqscan to off;
SET
logigis=# explain analyze SELECT geom FROM adminbndy1 WHERE geom &&
setsrid('BOX3D(9.4835390946502 47.39365740740741,9.5164609053498
47.40634259259259)'::box3d,4326);
                               QUERY PLAN
--------------------------------------------------------------------------
 Index Scan using adminbndy1_geom_idx on adminbndy1  (cost=0.00..4.44
rows=1 width=121) (actual time=26.902..27.066 rows=5 loops=1)
   Index Cond: (geom && 'SRID=4326;BOX3D(9.4835390946502
47.3936574074074 0,9.5164609053498 47.4063425925926 0)'::geometry)
 Total runtime: 27.265 ms
(3 rows)

So the query planner chooses to ignore the index, although it is
appropriate. My first idea was that the statistics were off, but that
turned out not to be the problem. As the above output shows, the query
optimizer already guesses a rowcount of 1, which is even smaller than the
actual number of 5 fetched rows, so this should really make the query
planner use the index.

Some amount of increasingly random tries later, I did the following:

logigis=# vacuum full freeze analyze verbose adminbndy1;
INFO:  vacuuming public.adminbndy1
INFO:  adminbndy1: found 0 removable, 83 nonremovable row versions in
3 pages
DETAIL:  0 dead row versions cannot be removed yet.
Nonremovable row versions range from 128 to 1968 bytes long.
There were 1 unused item pointers.
Total free space (including removable row versions) is 5024 bytes.
0 pages are or will become empty, including 0 at the end of the table.
3 pages containing 5024 free bytes are potential move destinations.
CPU 0.00s/0.00u sec elapsed 0.01 sec.
INFO:  index adminbndy1_geom_idx now contains 83 row versions in 1 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  adminbndy1: moved 0 row versions, truncated 3 to 3 pages
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  vacuuming pg_toast.pg_toast_19369
INFO:  pg_toast_19369: found 0 removable, 32910 nonremovable row
versions in 8225 pages
DETAIL:  0 dead row versions cannot be removed yet.
Nonremovable row versions range from 37 to 2034 bytes long.
There were 0 unused item pointers.
Total free space (including removable row versions) is 167492 bytes.
0 pages are or will become empty, including 0 at the end of the table.
66 pages containing 67404 free bytes are potential move destinations.
CPU 0.67s/0.04u sec elapsed 2.76 sec.
INFO:  index pg_toast_19369_index now contains 32910 row versions in
127 pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.01s/0.00u sec elapsed 0.14 sec.
INFO:  pg_toast_19369: moved 0 row versions, truncated 8225 to 8225 pages
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  analyzing public.adminbndy1
INFO:  adminbndy1: 3 pages, 83 rows sampled, 83 estimated total rows
VACUUM
logigis=#

IMHO, this tells the reason. The query planner has a table size of 3
pages, which clearly is a case for a seqscan. But during the seqscan,
the database has to fetch an additional amount of 8225 toast pages and
127 toast index pages, and rebuild the geometries contained therein.

And a total of 8355 pages = 68 MB is a rather huge amount of data to
fetch.

I think this problem bites every user who has rather large columns that
get stored in the TOAST table, when querying on those columns.

As a small workaround, I could imagine adding a small extra column to
the table that contains the geometry's bbox, and using the && operator
against that instead. This should avoid touching the TOAST table for the
skipped rows.

But the real fix should be to add the TOAST pages to the query planner's
estimate for the sequential scan. What do you think about it?
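To illustrate the kind of mismatch involved: the TOAST overhead that the
planner's page count misses can be seen in the system catalogs (a sketch
against pg_class; only the table name is taken from the example above):

```sql
-- Compare the heap's own page count with the page count of its
-- associated TOAST table, which the seqscan cost estimate ignores.
SELECT c.relname,
       c.relpages AS heap_pages,
       t.relname  AS toast_table,
       t.relpages AS toast_pages
FROM pg_class c
LEFT JOIN pg_class t ON t.oid = c.reltoastrelid
WHERE c.relname = 'adminbndy1';
```

In the case above this would show 3 heap pages against 8225 TOAST pages.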

Markus

-- 
markus schaber | dipl. informatiker
logi-track ag | rennweg 14-16 | ch 8001 zürich
phone +41-43-888 62 52 | fax +41-43-888 62 53
mailto:[EMAIL PROTECTED] | www.logi-track.com


Re: [PERFORM] High end server and storage for a PostgreSQL OLTP system

2005-02-02 Thread Bruce Momjian
William Yu wrote:
  Well, that would give you the most benefit, but the memory bandwidth is
  still greater than on a Xeon. There's really no issue with 64 bit if
  you're using open source software; it all compiles for 64 bits and
  you're good to go. http://stats.distributed.net runs on a dual opteron
  box running FreeBSD and I've had no issues.
 
 You can get 64-bit Xeons also, but it takes a hit in the I/O department due 
 to the lack of a hardware I/O MMU, which limits DMA transfers to 
 addresses below 4GB. This has a two-fold impact:
 
 1) transferring data to addresses above 4GB requires first a transfer to 
 below 4GB and then a copy to the final destination.
 
 2) You must allocate real memory 2X the address space of the devices to 
 act as bounce buffers. This is especially problematic for workstations 
 because if you put a 512MB Nvidia card in your computer for graphics 
 work -- you've just lost 1GB of memory. (I dunno how much the typical 
 SCSI/NIC/etc take up.)

I thought Intel was copying AMD's 64-bit API.  Is Intel's
implementation as poor as your description suggests?  Does Intel have
any better 64-bit offering other than the Itanium/Itanic?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Bad query optimizer misestimation because of TOAST tables

2005-02-02 Thread Tom Lane
Markus Schaber [EMAIL PROTECTED] writes:
 IMHO, this tells the reason. The query planner has a table size of 3
 pages, which clearly is a case for a seqscan. But during the seqscan,
 the database has to fetch an additional amount of 8225 toast pages and
 127 toast index pages, and rebuild the geometries contained therein.

I don't buy this analysis at all.  The toasted columns are not those in
the index (because we don't support out-of-line-toasted index entries),
so a WHERE clause that only touches indexed columns isn't going to need
to fetch anything from the toast table.  The only stuff it would fetch
is in rows that passed the WHERE and need to be returned to the client
--- and those costs are going to be the same either way.

I'm not entirely sure where the time is going, but I do not think you
have proven your theory about it.  I'd suggest building the backend
with -pg and getting some gprof evidence.

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] High end server and storage for a PostgreSQL OLTP system

2005-02-02 Thread William Yu
You can get 64-bit Xeons also, but it takes a hit in the I/O department due 
to the lack of a hardware I/O MMU, which limits DMA transfers to 
addresses below 4GB. This has a two-fold impact:

1) transferring data to addresses above 4GB requires first a transfer to 
below 4GB and then a copy to the final destination.

2) You must allocate real memory 2X the address space of the devices to 
act as bounce buffers. This is especially problematic for workstations 
because if you put a 512MB Nvidia card in your computer for graphics 
work -- you've just lost 1GB of memory. (I dunno how much the typical 
SCSI/NIC/etc take up.)

I thought Intel was copying AMD's 64-bit API.  Is Intel's
implementation as poor as your description suggests?  Does Intel have
any better 64-bit offering other than the Itanium/Itanic?
Unfortunately, there's no easy way for Intel to have implemented a 
64-bit IOMMU under their current restrictions. The memory controller 
resides on the chipset and to upgrade the functionality significantly, 
it would probably require changing the bus protocol. It's not that they 
couldn't do it -- it would just require all Intel chipset/MB 
vendors/partners to go through the process of creating and validating 
totally new products, a far lengthier process than just producing 64-bit 
CPUs that drop into current motherboards.

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [PERFORM] Bad query optimizer misestimation because of TOAST

2005-02-02 Thread Markus Schaber
Hi, Tom,

Tom Lane schrieb:

IMHO, this tells the reason. The query planner has a table size of 3
pages, which clearly is a case for a seqscan. But during the seqscan,
the database has to fetch an additional amount of 8225 toast pages and
127 toast index pages, and rebuild the geometries contained therein.

 I don't buy this analysis at all.  The toasted columns are not those in
 the index (because we don't support out-of-line-toasted index entries),
 so a WHERE clause that only touches indexed columns isn't going to need
 to fetch anything from the toast table.  The only stuff it would fetch
 is in rows that passed the WHERE and need to be returned to the client
 --- and those costs are going to be the same either way.

 I'm not entirely sure where the time is going, but I do not think you
 have proven your theory about it.  I'd suggest building the backend
 with -pg and getting some gprof evidence.

The column is a PostGIS column, and the index was created using GiST.
Those are lossy indices that do not store the whole geometry, but only
the bounding box corners of the geometry (2 points).

Without using the index, the && operator (which tests for bbox
overlapping) has to load the whole geometry from disk and extract the
bbox from it (as it cannot make use of partial fetch).

Some little statistics:

logigis=# select max(mem_size(geom)), avg(mem_size(geom))::int,
max(npoints(geom)) from adminbndy1;
   max    |   avg   |  max
----------+---------+--------
 20998856 | 1384127 | 873657
(1 row)

So the geometries average about 1.3 MB in size, with a maximum size of
20 MB. I'm pretty sure this cannot be stored without TOASTing.

Additionally, my suggested workaround using a separate bbox column
really works:

logigis=# alter table adminbndy1 ADD column bbox geometry;
ALTER TABLE
logigis=# update adminbndy1 set bbox = setsrid(box3d(geom)::geometry, 4326);
UPDATE 83
logigis=# explain analyze SELECT geom FROM adminbndy1 WHERE bbox &&
setsrid('BOX3D(9.4835390946502 47.39365740740741,9.5164609053498
47.40634259259259)'::box3d,4326);
                               QUERY PLAN
--------------------------------------------------------------------------
 Seq Scan on adminbndy1  (cost=1.00..10022.50 rows=1
width=32) (actual time=0.554..0.885 rows=5 loops=1)
   Filter: (bbox && 'SRID=4326;BOX3D(9.4835390946502 47.3936574074074
0,9.5164609053498 47.4063425925926 0)'::geometry)
 Total runtime: 0.960 ms
(3 rows)

Here, the sequential scan matching exactly the same 5 rows needs only
about 1/8000th of the time, because it does not have to touch the TOAST
pages at all.

logigis=# \o /dev/null
logigis=# \timing
Timing is on.
logigis=# SELECT geom FROM adminbndy1 WHERE geom &&
setsrid('BOX3D(9.4835390946502 47.39365740740741,9.5164609053498
47.40634259259259)'::box3d,4326);
Time: 11224.185 ms
logigis=# SELECT geom FROM adminbndy1 WHERE bbox &&
setsrid('BOX3D(9.4835390946502 47.39365740740741,9.5164609053498
47.40634259259259)'::box3d,4326);
Time: 7689.720 ms

So you can see that, when actually detoasting the 5 rows and
deserializing the geometries to WKT format (their canonical text
representation), the ratio between the two times gets better, but
there's still a noticeable difference.

Markus
--
markus schaber | dipl. informatiker
logi-track ag | rennweg 14-16 | ch 8001 zürich
phone +41-43-888 62 52 | fax +41-43-888 62 53
mailto:[EMAIL PROTECTED] | www.logi-track.com


signature.asc
Description: OpenPGP digital signature


Re: [PERFORM] High end server and storage for a PostgreSQL OLTP system

2005-02-02 Thread Bruce Momjian
William Yu wrote:
 You can get 64-bit Xeons also, but it takes a hit in the I/O department due 
 to the lack of a hardware I/O MMU, which limits DMA transfers to 
 addresses below 4GB. This has a two-fold impact:
 
 1) transferring data to addresses above 4GB requires first a transfer to 
 below 4GB and then a copy to the final destination.
 
 2) You must allocate real memory 2X the address space of the devices to 
 act as bounce buffers. This is especially problematic for workstations 
 because if you put a 512MB Nvidia card in your computer for graphics 
 work -- you've just lost 1GB of memory. (I dunno how much the typical 
 SCSI/NIC/etc take up.)

When you say "allocate real memory 2X", are you saying that if you have
16GB of RAM only 8GB is actually usable and the other 8GB is for
bounce buffers, or is it just address space being used up?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] High end server and storage for a PostgreSQL OLTP system

2005-02-02 Thread William Yu
Bruce Momjian wrote:
William Yu wrote:
You can get 64-bit Xeons also, but it takes a hit in the I/O department due 
to the lack of a hardware I/O MMU, which limits DMA transfers to 
addresses below 4GB. This has a two-fold impact:

1) transferring data to addresses above 4GB requires first a transfer to 
below 4GB and then a copy to the final destination.

2) You must allocate real memory 2X the address space of the devices to 
act as bounce buffers. This is especially problematic for workstations 
because if you put a 512MB Nvidia card in your computer for graphics 
work -- you've just lost 1GB of memory. (I dunno how much the typical 
SCSI/NIC/etc take up.)

When you say allocate real memory 2X are you saying that if you have
16GB of RAM only 8GB is actually usable and the other 8GB is for
bounce buffers, or is it just address space being used up?
It's 2x the memory space of the devices. E.g. an Nvidia graphics card w/ 
512MB of RAM would require 1GB of memory to act as bounce buffers. And 
it has to be real chunks of memory in 64-bit mode, since the DMA transfer 
must drop the data into real memory below 4GB before it can be copied to 
its destination above 4GB.

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faq


Re: [PERFORM] High end server and storage for a PostgreSQL OLTP system

2005-02-02 Thread Merlin Moncure
 By now, our system has never used the stored-procedure approach,
 due to the fact that we're staying with the minimum common SQL features
 that are supported by most db engines.
 I realize, though, that it would provide a heavy performance boost.

I feel your pain.  Well, sometimes you have to bite the bullet and do a
couple of implementation-specific hacks in especially time-sensitive
components.

  You also have the parse/bind interface
 
 This is something I have already engineered in our core classes
 (that use DBI + DBD::Pg), so that switching to 8.0 should
 automatically enable the single-prepare, multiple-execute behavior,
 saving a lot of query planner processing, if I understand correctly.

Yes. You save the planning step (which adds up, even for trivial plans).
The 'ExecPrepared' variant of prepared-statement execution also provides
substantial savings (in server CPU load and execution time) because the
statement does not have to be parsed. Oh, and network traffic is reduced
correspondingly.
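As a minimal SQL-level sketch of the same single-prepare,
multiple-execute idea (table and parameter names here are made up, not
from the thread):

```sql
-- Parse and plan once...
PREPARE get_log (integer) AS
    SELECT * FROM user_logs WHERE id = $1;

-- ...then execute repeatedly; each EXECUTE skips parsing and planning.
EXECUTE get_log(42);
EXECUTE get_log(43);
```

The protocol-level parse/bind interface does the same thing without
embedding PREPARE in the SQL text.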

I know that the Perl people were pushing for certain features in the
libpq library (describing prepared statements, IIRC).  I think this
stuff made it into 8.0 ... I have no clue about DBD::Pg.

If everything is working the way it's supposed to, 8.0 should be faster
than 7.1 (like, twice as fast) for what you are probably trying to do.
If it isn't, something else is wrong, and it's very likely a solvable
problem.

In short, in pg 8.0, statement-by-statement query execution is highly
optimizable at the driver level, much more so than in 7.1.  Law of
Unintended Consequences aside, this will translate into direct benefits
for your app if it uses this programming model.

Merlin

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] horizontal partition

2005-02-02 Thread Josh Berkus
Gaetano,

 I have a big table with ~10 million rows, and it is a very big
 pain to administer, so after years I convinced myself
 to partition it and replace the table usage (only for reading)
 with a view.

 Now my user_logs table is split into 4:

 user_logs
 user_logs_2002
 user_logs_2003
 user_logs_2004

Any reason you didn't use inheritance?
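For reference, the inheritance-based layout being alluded to might look
roughly like this (a sketch; the column names are hypothetical):

```sql
-- The parent defines the common schema; each year gets a child table.
CREATE TABLE user_logs_base (
    user_id   integer,
    logged_at timestamp,
    message   text
);

CREATE TABLE user_logs_2002 () INHERITS (user_logs_base);
CREATE TABLE user_logs_2003 () INHERITS (user_logs_base);

-- Selecting from the parent scans the parent and all children,
-- with no view or UNION needed.
SELECT count(*) FROM user_logs_base;
```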

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] High end server and storage for a PostgreSQL OLTP system

2005-02-02 Thread Cosimo Streppone
Merlin Moncure wrote:
 [...]
  (...DBI + DBD::Pg), so that switching to 8.0 should
 automatically enable the single-prepare, multiple-execute behavior,
 saving a lot of query planner processing, if I understand correctly.
[...]
I know that the Perl people were pushing for certain features in the
libpq library (describing prepared statements, IIRC).  I think this
stuff made it into 8.0 ... I have no clue about DBD::Pg.
For the record: yes, DBD::Pg in CVS (> 1.32) has support
for server-side prepared statements.
If everything is working the way it's supposed to, 8.0 should be faster
than 7.1 (like, twice as fast) for what you are probably trying to do.
In the coming days I will be testing the entire application with the
same database, only changing the backend from 7.1 to 8.0, so this is
a somewhat perfect setup for a real-world benchmark
of Pg 8.0 vs 7.1.x performance.
--
Cosimo
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [PERFORM] Bitmap indexes

2005-02-02 Thread Daniel Ceregatti
PFC wrote:


 contrib/intarray has an index type which could be what you need.


I've used intarray for a site that requires that I match multiple low
cardinality attributes with multiple search criteria. Here's an
(abridged) example:

The table:

\d person_attributes
  Table dm.person_attributes
 Column |   Type   | Modifiers
+--+
 attributes | integer[]| not null
 personid   | integer  | not null
Indexes:
person_attributes_pk PRIMARY KEY, btree (personid)
person_attributes_gist_attributes_index gist (attributes)

This table has about 1.1 million rows.

The index:

create index person_attributes_gist_attributes_index on
person_attributes using gist ((attributes) gist__int_ops);

The query:

select personid
from person_attributes
where attributes @@
'(1|3)&(900)&(902)&(1002)&(9002)&(11003)&(12002|12003)&(13003|13004|13005|13006|13007|13008|13009|13010)'::query_int

The explain analyze:

Index Scan using person_attributes_gist_search_index on
person_attributes pa  (cost=0.00..1221.26 rows=602 width=4) (actual
time=0.725..628.994 rows=1659 loops=1)
  Index Cond: (search @@ '( 1 | 3 ) & 900 & 902 & 1002 & 9002 & 11003 &
( 12002 | 12003 ) & ( ( ( ( ( ( ( 13003 | 13004 ) | 13005 ) | 13006 ) |
13007 ) | 13008 ) | 13009 ) | 13010 )'::query_int)
Total runtime: 431.843 ms

The query_int and what each number means:

1|3 means, only gather the people in site id 1 or 3.
900 is an arbitrary flag that means they are searchable.
902 is another arbitrary flag that means they have photos.
1002 is the flag for don't drink.
9002 is the flag for don't smoke.
11003 is the flag for female.
12002|12003 are the flags for straight|bisexual.
13003 through 13010 represent the age range 18 through 25.

In plain English: select all females who are straight or bisexual,
between the ages of 18 and 25 inclusive, that don't drink, that don't
smoke, who are searchable, who have photos, and belong to sites 1 or 3.

As you can see from the explain, this query is relatively fast, given
the number of criteria and the amount of data that has to be searched.

This site's predecessor used Oracle, and I used bitmap indexes for
performing these searches in Oracle. This intarray method is the closest
I've come to being able to reproduce the same functionality at the
required speed in Postgres.

The only problems I've run into with this method are: the non-concurrent
nature of GiST indexes, which makes doing any sort of bulk DML on them
extremely time-consuming (I usually have to drop the index, perform the
bulk DML, then re-create the index); dealing with intarray methods to
select particular attributes so I can then order by them; and dealing
with intarray methods for updating the attributes column. All of these
methods are detailed in the intarray README.
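For instance, the update idiom for toggling a single flag might look
like this (a sketch, assuming contrib/intarray's + and - operators for
integer arrays; the personid is made up, the flag value follows the
numbering above):

```sql
-- Add the "has photos" flag (902) to one person's attributes.
UPDATE person_attributes
   SET attributes = attributes + 902
 WHERE personid = 12345;

-- Remove the same flag again.
UPDATE person_attributes
   SET attributes = attributes - 902
 WHERE personid = 12345;
```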

I'm happy with the performance in production so far. I've yet to see any
gist concurrency issues affect performance with normal rates of DML.

Daniel

-- 

Daniel Ceregatti - Programmer
Omnis Network, LLC

Real Programmers don't eat quiche.  They eat Twinkies and Szechwan food.



---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Bad query optimizer misestimation because of TOAST

2005-02-02 Thread Tom Lane
Markus Schaber [EMAIL PROTECTED] writes:
 Tom Lane schrieb:
 I don't buy this analysis at all.  The toasted columns are not those in
 the index (because we don't support out-of-line-toasted index entries),
 so a WHERE clause that only touches indexed columns isn't going to need
 to fetch anything from the toast table.

 The column is a PostGIS column, and the index was created using GiST.
 Those are lossy indices that do not store the whole geometry, but only
 the bounding box corners of the geometry (2 points).
 Without using the index, the && operator (which tests for bbox
 overlapping) has to load the whole geometry from disk and extract the
 bbox from it (as it cannot make use of partial fetch).

Ah, I see; I forgot to consider the GIST storage option, which allows
the index contents to be something different from the represented column.
Hmm ...

What I would be inclined to do is to extend ANALYZE to make an estimate
of the extent of toasting of every toastable column, and then modify
cost_qual_eval to charge a nonzero cost for evaluation of Vars that are
potentially toasted.

This implies an initdb-forcing change in pg_statistic, which might or
might not be allowed for 8.1 ... we are still a bit up in the air on
what our release policy will be for 8.1.

My first thought about what stat ANALYZE ought to collect is average
number of out-of-line TOAST chunks per value.  Armed with that number
and size information about the TOAST table, it'd be relatively simple
for costsize.c to estimate the average cost of fetching such values.

I'm not sure if it's worth trying to model the cost of decompression of
compressed values.  Surely that's a lot cheaper than fetching
out-of-line values, so maybe we can just ignore it.  If we did want to
model it then we'd also need to make ANALYZE note the fraction of values
that require decompression, and maybe something about their sizes.

This approach would overcharge for operations that are able to work with
partially fetched values, but it's probably not reasonable to expect the
planner to account for that with any accuracy.

Given this we'd have a pretty accurate computation of the true cost of
the seqscan alternative, but what of indexscans?  The current
implementation charges one evaluation of the index qual(s) per
indexscan, which is not really right because actually the index
component is never evaluated at all.  This didn't matter when the index
component was a Var with zero eval cost, but if we're charging some eval
cost it might.  But ... since it's charging only one eval per scan
... the error is probably down in the noise in practice, and it may not
be worth trying to get it exactly right.

A bigger concern is what about lossy indexes?  We currently ignore the
costs of rechecking qual expressions for fetched rows, but this might be
too inaccurate for situations like yours.  I'm hesitant to mess with it
though.  For one thing, to get it right we'd need to understand how many
rows will be returned by the raw index search (which is the number of
times we'd need to recheck).  At the moment the only info we have is the
number that will pass the recheck, which could be a lot less ... and of
course, even that is probably a really crude estimate when we are
dealing with this sort of operator.

Seems like a bit of a can of worms ...

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] horizontal partition

2005-02-02 Thread Gaetano Mendola
Josh Berkus wrote:
Gaetano,

I have a big table with ~10 million rows, and it is a very big
pain to administer, so after years I convinced myself
to partition it and replace the table usage (only for reading)
with a view.
Now my user_logs table is splitted in 4:
user_logs
user_logs_2002
user_logs_2003
user_logs_2004

Any reason you didn't use inheritance?
I did it that way just to avoid using a PostgreSQL-specific feature.
I can give it a try and let you know; however, the question remains:
why is the index usage lost when it is used in that way?
Regards
Gaetano Mendola

---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [PERFORM] horizontal partition

2005-02-02 Thread Klint Gore
On Thu, 03 Feb 2005 02:10:15 +0100, Gaetano Mendola [EMAIL PROTECTED] wrote:
 why the index usage is lost if used in that way ?

This is how I interpret it (if anyone wants to set me straight or
improve on it feel free)

Views are implemented as rules.  

Rules are pretty much just a macro to the query builder.  When it sees
the view, it replaces it with the implementation of the view.

When you join a view to a table, it generates a subselect of the
implementation and joins that to the other table.

So the subselect will generate the entire set of data from the view
before it can use the join to eliminate rows.
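Concretely, for a partitioned-logs view like the one discussed in this
thread, the rewrite behaves roughly as follows (a sketch; the table and
column names are illustrative):

```sql
-- A view over the partitions:
CREATE VIEW user_logs_all AS
    SELECT * FROM user_logs_2002
    UNION ALL
    SELECT * FROM user_logs_2003;

-- A join such as
--   SELECT u.name FROM users u
--   JOIN user_logs_all l ON l.user_id = u.id;
-- is expanded by the rule system into roughly
--   SELECT u.name FROM users u
--   JOIN (SELECT * FROM user_logs_2002
--         UNION ALL
--         SELECT * FROM user_logs_2003) l ON l.user_id = u.id;
-- so the UNION subselect can be produced in full before the join
-- condition restricts it.
```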

I would like a way to make this work better as well.  One of my views is
32 joins of the same table (to get tree like data for reports).

klint.

+---+-+
: Klint Gore: Non rhyming:
: EMail   : [EMAIL PROTECTED]   :  slang - the:
: Snail   : A.B.R.I.:  possibilities  :
: Mail  University of New England   :  are useless   :
:   Armidale NSW 2351 Australia : L.J.J.  :
: Fax : +61 2 6772 5376 : :
+---+-+

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] High end server and storage for a PostgreSQL OLTP system

2005-02-02 Thread Christopher Browne
pgman@candle.pha.pa.us (Bruce Momjian) wrote:
 William Yu wrote:
  Well, that would give you the most benefit, but the memory
  bandwidth is still greater than on a Xeon. There's really no
  issue with 64 bit if you're using open source software; it all
  compiles for 64 bits and you're good to
  go. http://stats.distributed.net runs on a dual opteron box
  running FreeBSD and I've had no issues.
 
 You can get 64-bit Xeons also but it takes hit in the I/O
 department due to the lack of a hardware I/O MMU which limits DMA
 transfers to addresses below 4GB. This has a two-fold impact:
 
 1) transfering data to 4GB require first a transfer to 4GB and
 then a copy to the final destination.
 
 2) You must allocate real memory 2X the address space of the
 devices to act as bounce buffers. This is especially problematic
 for workstations because if you put a 512MB Nvidia card in your
 computer for graphics work -- you've just lost 1GB of memory. (I
 dunno how much the typical SCSI/NIC/etc take up.)

 I thought Intel was copying AMD's 64-bit API.  Is Intel's
 implementation as poor as your description suggests?  Does Intel have
 any better 64-bit offering other than the Itanium/Itanic?

From what I can see, the resulting copy of AMD64 amounts to little
more than a rushed project to glue a bag on the side of a
Xeon chip with some 64-bit parts in it.

I see no reason to expect what is billed as merely an extension
technology (http://www.eweek.com/article2/0,1759,1545734,00.asp) to
alleviate the deeply rooted memory bandwidth problems seen on the Xeon.
-- 
let name=cbbrowne and tld=acm.org in name ^ @ ^ tld;;
http://cbbrowne.com/info/advocacy.html
Q: What does the function NULL do?
A: The function NULL tests whether or not its argument is NIL or not.  If
   its argument is NIL the value of NULL is NIL.
-- Ken Tracton, Programmer's Guide to Lisp, page 73.

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


[PERFORM] GiST indexes and concurrency (tsearch2)

2005-02-02 Thread Marinos J. Yannikos
Hi,
according to 
http://www.postgresql.org/docs/8.0/interactive/limitations.html , 
concurrent access to GiST indexes isn't possible at the moment. I 
haven't read the thesis mentioned there, but I presume that concurrent 
read access is also impossible. Is there any workaround for this, esp. 
if the index is usually only read and not written to?

It seems to be a big problem with tsearch2, when multiple clients are 
hammering the db (we have a quad opteron box here that stays 75% idle 
despite an apachebench with concurrency 10 stressing the php script that 
uses tsearch2, with practically no disk accesses).

Regards,
 Marinos
--
Dipl.-Ing. Marinos Yannikos, CEO
Preisvergleich Internet Services AG
Obere Donaustraße 63/2, A-1020 Wien
Tel./Fax: (+431) 5811609-52/-55
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] GiST indexes and concurrency (tsearch2)

2005-02-02 Thread Tom Lane
Marinos J. Yannikos [EMAIL PROTECTED] writes:
 according to 
 http://www.postgresql.org/docs/8.0/interactive/limitations.html , 
 concurrent access to GiST indexes isn't possible at the moment. I 
 haven't read the thesis mentioned there, but I presume that concurrent 
 read access is also impossible.

You presume wrong ...

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


[PERFORM] Effect of database encoding on performance

2005-02-02 Thread Igor Postelnik
What's the effect of different encodings on database performance?

We're looking to switch the encoding of our database from SQL_ASCII to
UTF-8 to better handle international data. I expect that at least 90% of
our data will be in the ASCII range, with a few characters that need
double-byte encoding. Has anyone done an extensive comparison of the
performance of different encodings?
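One practical way to answer this for a specific workload is to benchmark it directly. A minimal sketch, assuming an 8.0-era server (database and table names are purely illustrative):

```sql
-- Two databases identical except for encoding (names are illustrative).
CREATE DATABASE enc_ascii WITH ENCODING = 'SQL_ASCII';
CREATE DATABASE enc_utf8  WITH ENCODING = 'UNICODE';  -- named 'UTF8' in later releases

-- Load the same representative data set into each, then time the
-- string-heavy operations you actually care about, e.g.:
EXPLAIN ANALYZE
SELECT * FROM docs WHERE title LIKE '%term%' ORDER BY title;
```

Comparing the EXPLAIN ANALYZE timings between the two databases for sorts, LIKE scans, and index builds should show whether the multi-byte overhead matters for your data mix.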



-Igor


Re: [PERFORM] horizontal partition

2005-02-02 Thread Tom Lane
Josh Berkus josh@agliodbs.com writes:
 The issue here is that the planner is capable of pushing down the WHERE 
 criteria into the first view, but not into the second, nested view, and so 
 postgres materializes the UNIONed data set before performing the join.

 Thing is, I seem to recall that this particular issue was something Tom fixed
 a while ago.  Which is why I wanted to know what version Gaetano is using.

It's still true that we can't generate a nestloop-with-inner-indexscan
join plan if the inner side is anything more complex than a single table
scan.  Since that's the only plan that gives you any chance of not
scanning the whole partitioned table, it's rather a hindrance :-(

It might be possible to fix this by treating the nestloop's join
conditions as push down-able criteria, instead of the present rather
ad hoc method for generating nestloop/indexscan plans.  It'd be quite
a deal of work though, and I'm concerned about how slow the planner
might run if we did do it like that.
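For readers following along, a sketch of the pattern under discussion (all table and column names hypothetical): a constant WHERE clause can be pushed into each arm of a UNION ALL view, but when that view is the inner side of a join, no nestloop-with-inner-indexscan plan is possible:

```sql
-- Hypothetical horizontal partitioning by year.
CREATE VIEW orders_all AS
    SELECT * FROM orders_2004
    UNION ALL
    SELECT * FROM orders_2005;

-- A constant restriction is pushed into both arms, so each
-- partition's index on order_date remains usable:
SELECT * FROM orders_all WHERE order_date = '2005-01-15';

-- But with the view as the inner side of a join, the whole UNION
-- has to be materialized before joining (the limitation described
-- above), so every partition gets scanned:
SELECT c.name, o.total
FROM customers c
JOIN orders_all o ON o.customer_id = c.id
WHERE c.name = 'ACME';
```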

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] [SQL] OFFSET impact on Performance???

2005-02-02 Thread Sean Davis
On Jan 26, 2005, at 5:36 AM, Leeuw van der, Tim wrote:
Hi,
What you could do is create a table containing all the fields from 
your SELECT, plus a per-session unique ID. Then you can store the 
query results in there, and use SELECT with OFFSET / LIMIT on that 
table. The WHERE clause for this temp-results table only needs to 
contain the per-session unique id.

This is what I do, but I use two columns for indexing the original 
query, a user_id (not session-id) and an index to the query_id that 
is unique within user.  This query_id is a foreign key to another 
table that describes the query (often just a name).  I allow the user 
only a fixed number of stored queries and recycle after hitting the 
maximum.  You can timestamp your queries so that when you recycle you 
drop the oldest one first.  If you don't need multiple stored query 
results, then using the user_id is probably adequate (assuming the user 
is not logged on in several locations simultaneously).
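A minimal sketch of the scheme described above, with all table and column names invented for illustration:

```sql
-- One row per stored query per user; recycle oldest-first (via
-- created_at) when the per-user maximum is reached.
CREATE TABLE stored_query (
    user_id    integer NOT NULL,
    query_id   integer NOT NULL,   -- unique within user
    name       text,
    created_at timestamp DEFAULT now(),
    PRIMARY KEY (user_id, query_id)
);

-- Materialized result rows, keyed back to the stored query.
CREATE TABLE stored_result (
    user_id  integer NOT NULL,
    query_id integer NOT NULL,
    row_num  integer NOT NULL,  -- position in the original ordering
    payload  text,              -- the SELECTed columns would go here
    PRIMARY KEY (user_id, query_id, row_num)
);

-- Paging then becomes a cheap range scan on the primary-key index:
SELECT payload
FROM stored_result
WHERE user_id = 1 AND query_id = 3
  AND row_num BETWEEN 101 AND 150
ORDER BY row_num;
```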

This of course gives you a new problem: cleaning stale data out of the 
temp-results table. And another new problem is that users will not see 
new data appear on their screen until somehow the query is re-run (... 
but that might even be desirable, actually, depending on how your 
users do their work and what their work is).

See above.  The query refresh issue remains.
And of course better performance cannot be guaranteed until you try it.
For the standard operating procedure of perform query -> view results, 
I have found this to be a nice system.  The user is accustomed to 
queries taking a bit of time to perform, but then wants to be able to 
manipulate and view data rather quickly; this paradigm is pretty well 
served by making a separate table of results, particularly if the 
original query is costly.


Would such a scheme give you any hope of improved performance, or 
would it be too much of a nightmare?

This question still applies.
Sean
-Original Message-
From: [EMAIL PROTECTED] on behalf of Andrei 
Bintintan
Sent: Wed 1/26/2005 11:11 AM
To: [EMAIL PROTECTED]; Greg Stark
Cc: Richard Huxton; pgsql-sql@postgresql.org; 
pgsql-performance@postgresql.org
Subject: Re: [PERFORM] [SQL] OFFSET impact on Performance???

The problem still stays open.
The thing is that I have about 20 - 30 clients that are using that SQL
query where the offset and limit are involved. So, I cannot create a
temp table, because that means that I'll have to make a temp table for
each session... which is a very bad idea. Cursors are somewhat the same.
In my application the Where conditions can be very different for each
user (session).

The only solution that I see at the moment is to work on the query, or
to write a more complex WHERE condition to limit the result output. So
there is no replacement for OFFSET/LIMIT.
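One common way to rework the query along those lines, without a temp table per session, is keyset pagination: remember the last sort-key value the client saw and filter on it instead of using OFFSET (table and column names hypothetical):

```sql
-- First page (per-session WHERE conditions plus an ORDER BY on a key):
SELECT id, title
FROM items
WHERE title LIKE 'a%'
ORDER BY id
LIMIT 20;

-- Next page: instead of OFFSET 20, restart after the last id the
-- client saw; this stays an indexed range scan no matter how deep
-- the page, whereas OFFSET must read and discard all skipped rows:
SELECT id, title
FROM items
WHERE title LIKE 'a%'
  AND id > 12345   -- 12345 = last id from the previous page
ORDER BY id
LIMIT 20;
```

The trade-off is that pages can only be stepped through sequentially, and the ORDER BY key must be unique (or made unique with a tiebreaker column).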

Best regards,
Andy.

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])