Re: [PERFORM] COUNT(*) again (was Re: [HACKERS] Index/Function organized table layout)

2003-10-05 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 I think that's not happening, conditionally or otherwise.  The atomicity
 problems alone are sufficient reason why not, even before you look at
 the performance issues.

 What are the atomicity problems of adding a create/expire xid to the
 index tuples?

You can't update a tuple's status in just one place ... you have to
update the copies in the indexes too.

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Index/Function organized table layout

2003-10-05 Thread Hannu Krosing
Greg Stark kirjutas P, 05.10.2003 kell 00:17:

 I've never seen anyone use this feature, and I never seriously considered it
 myself. It sort of has the feel of an antiquated feature that traded too much
 flexibility and abstraction for raw performance on very slow disk hardware. 

Read A Conversation with Jim Gray referenced from this slashdot
article: 
http://slashdot.org/article.pl?sid=03/09/17/1246255mode=threadtid=126
for info on how disk drives are slower than ever (relatively), and how
one should treat them as such, especially for large data volumes.

 However I wonder if the nested tables feature doesn't use it under the hood
 though. It seems they would both be useful for the same types of tables.
 
 I'm not sure what this means for Postgres. I'm not sure if Postgres should use
 a different name to avoid confusion and possibly to leave room in the future
 for the possibility of supporting something like this. Or perhaps something
 like this would be useful for Postgres now or in the near future? Or perhaps
 the consensus is as I said, that this is an old idea that no longer gets any
 respect and postgres should just pretend it doesn't exist?

We can't pretend CLUSTER does not exist until we have some better
technology to offer instead.


Hannu




---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] Index/Function organized table layout

2003-10-05 Thread Mike Mascari
Hannu Krosing wrote:
 Greg Stark kirjutas P, 05.10.2003 kell 00:17:
 
I've never seen anyone use this feature, and I never seriously considered it
myself. It sort of has the feel of an antiquated feature that traded too much
flexibility and abstraction for raw performance on very slow disk hardware. 
 
 
 Read A Conversation with Jim Gray referenced from this slashdot
 article: 
 http://slashdot.org/article.pl?sid=03/09/17/1246255mode=threadtid=126
 for info on how disk drives are slower than ever (relatively), and how
 one should treat them as such, especially for large data volumes.

Too bad PostgreSQL is misspelled (Postgress) and MySQL dominates the
open source discussion. And the MySQL questions are coming from:

David Patterson, who holds the Pardee Chair of Computer Science at
the University of California at Berkeley.

Outrageous! :-)

Mike Mascari
[EMAIL PROTECTED]


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Index/Function organized table layout (from Re:

2003-10-04 Thread Hannu Krosing
Christopher Browne kirjutas R, 03.10.2003 kell 00:57:
 [EMAIL PROTECTED] (Jean-Luc Lachance) writes:
  That's one of the draw back of MVCC.  
  I once suggested that the transaction number and other house keeping
  info be included in the index, but was told to forget it...
  It would solve once and for all the issue of seq_scan vs index_scan.
  It would simplify the aggregate problem.
 
 It would only simplify _one_ case, namely the case where someone cares
 about the cardinality of a relation, and it would do that at
 _considerable_ cost.
 
 A while back I outlined how this would have to be done, and for it to
 be done efficiently, it would be anything BUT simple.  

Could this be made a TODO item, perhaps with your attack plan. 
Of course as strictly optional feature useful only for special situations
(see below)

I cross-post this to [HACKERS] as it seem relevant to a problem recently
discussed there.

 It would be very hairy to implement it correctly, and all this would
 cover is the single case of SELECT COUNT(*) FROM SOME_TABLE;

Not really. Just yesterday there was a discussion on [HACKERS] about
implementing btree-organized tables, which would be much less needed if
the visibility info were kept in indexes. 

 If you had a single WHERE clause attached, you would have to revert to
 walking through the tuples looking for the ones that are live and
 committed, which is true for any DBMS.

If the WHERE clause could use the same index (or any index with
visibility info) then there would be no need for walking through the
tuples in data relation.

the typical usecase cited on [HACKERS] was time series data, where
inserts are roughly in (timestamp,id)order but queries in (id,timestamp)
order. Now if the index would include all relevant fields
(id,timestamp,data1,data2,...,dataN) then the query could run on index
only touching just a few pages and thus vastly improving performance. I
agree that this is not something everybody needs, but when it is needed
it is needed bad.

 And it still begs the same question, of why the result of this query
 would be particularly meaningful to anyone.  I don't see the
 usefulness; I don't see the value of going to the considerable effort
 of fixing this purported problem.

Being able to do fast count(*) is just a side benefit.


Hannu


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Index/Function organized table layout

2003-10-04 Thread James Rogers
On 10/2/03 11:34 PM, Hannu Krosing [EMAIL PROTECTED] wrote:
 James Rogers kirjutas N, 02.10.2003 kell 23:44:
 Not exactly. What you are describing is more akin to partitioning or
 hash-organized tables i.e. sorting insert/update tuples to various pages
 according to some hash function.
 
 What I actually thought I was describing is how CLUSTER should work in a
 postgres flavour of MVCC storage ;). Not the CLUSTER command, but the
 whole feature.


Yeah, I can see this.  Clustering doesn't really imply ordering, just
tuple proximity in the heap.  Loosely implemented by default (i.e. only
grouping a tuple if it is cheap to do so at insert/update time) on a table's
primary key might give measurable query performance improvement in the
typical case without impacting the average performance of queries that use
other indexes.  

Even if the tuples were not ordered in the heap, tuple proximity in the heap
would solve a significant percentage of the performance issue that
index-organized tables solve i.e. greatly improved cache efficiency.

 
 AFAICS we could resolve this problem (querying indexes only) by keeping
 a copy of visibility info (tmin,tmax,...) in index tuples. This would
 make index updates bigger and thus slower, so this should be optional.


It wouldn't be a feature you would want to use anyway if actually indexing a
table.  But yes, adding the necessary support to make an index page
masquerade as a proper heap page would be most of the effort.  That and
making the SQL execution efficiently make use of the fact that the index and
the table are the same relation with two different interfaces, such that you
don't have to modify or access both when using either one.

 
 If you then put all fields in primary key, then the main table could be
 dropped. If there is no data table then no other indexes would then be
 allowed, or they must be double-indexes referencing the primary key,
 not tuple and thus even bigger ...


Putting an index on an index is pretty horrible.  But quite frankly, if you
actually require a second index on a table, then B-Tree organized tables are
not what you need.

 
 But if we had clustered the table on (id, timestamp), then the data
 would be in right order for queries, if cluster worked well.


Right, and it doesn't even have to be in perfect index order really, as what
I'm really trying to do is significantly decrease the amount of buffers
required for a typical query.  Having to order the tuples later is a small
matter.

 
 Yeah, index-organized tables seems exact fit for your problem, but then
 my abstract idea of what clustering should do is exactly that - keep
 tuples in roughly the same order as an index ;)


It would be a good approximation of what index-organizing aims to
accomplish.

 
 So what really is needed is a smart tuple-placer which can keep tuples
 that are close (as defined by index) together in a small number of
 pages. These pages themselves need not be coninuous, they can be
 sprinkled around in the whole data table, but they need to stay clusters
 of index-close tuples.


More or less, yes.

Cheers,

-James Rogers
 [EMAIL PROTECTED]


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


COUNT(*) again (was Re: [HACKERS] Index/Function organized table layout)

2003-10-04 Thread Tom Lane
Hannu Krosing [EMAIL PROTECTED] writes:
 Christopher Browne kirjutas R, 03.10.2003 kell 00:57:
 A while back I outlined how this would have to be done, and for it to
 be done efficiently, it would be anything BUT simple.  

 Could this be made a TODO item, perhaps with your attack plan. 

If I recall that discussion correctly, no one including Christopher
thought the attack plan was actually reasonable.

What this keeps coming down to is that an optimization that helps only
COUNT(*)-of-one-table-with-no-WHERE-clause would be too expensive in
development and maintenance effort to justify its existence.

At least if you insist on an exact, MVCC-correct answer.  So far as I've
seen, the actual use cases for unqualified COUNT(*) could be handled
equally well by an approximate answer.  What we should be doing rather
than wasting large amounts of time trying to devise exact solutions is
telling people to look at pg_class.reltuples for approximate answers.
We could also be looking at beefing up support for that approach ---
maybe provide some syntactic sugar for the lookup, maybe see if we can
update reltuples in more places than we do now, make sure that the
autovacuum daemon includes keep reltuples accurate as one of its
design goals, etc etc.

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Index/Function organized table layout

2003-10-04 Thread Greg Stark

James Rogers [EMAIL PROTECTED] writes:

 On 10/2/03 11:34 PM, Hannu Krosing [EMAIL PROTECTED] wrote:
  
  What I actually thought I was describing is how CLUSTER should work in a
  postgres flavour of MVCC storage ;). Not the CLUSTER command, but the
  whole feature.
 
 
 Yeah, I can see this.  Clustering doesn't really imply ordering, just
 tuple proximity in the heap.  

So I'm a bit confused about the term Clustering. It seems Postgres uses it
to mean simply ordering the tuple storage within an otherwise normal table.

However in other databases it seems to mean something more complex. I've never
used it in Oracle, but from what I read it seems Oracle thinks clustering
means storing the tuples for one table in the heap for *another* table
entirely.

The typical usage scenario envisioned is to store the tuples for child records
near the parent record in a related table. Ie, say you have a users table with
an ACL list, each user has possibly many ACL entries on the ACL list. You
could cluster the ACL entry list onto the users table so that the entries for
a given user would be stored *in the users table heap* near the record for
that user.

I've never seen anyone use this feature, and I never seriously considered it
myself. It sort of has the feel of an antiquated feature that traded too much
flexibility and abstraction for raw performance on very slow disk hardware. 
However I wonder if the nested tables feature doesn't use it under the hood
though. It seems they would both be useful for the same types of tables.

I'm not sure what this means for Postgres. I'm not sure if Postgres should use
a different name to avoid confusion and possibly to leave room in the future
for the possibility of supporting something like this. Or perhaps something
like this would be useful for Postgres now or in the near future? Or perhaps
the consensus is as I said, that this is an old idea that no longer gets any
respect and postgres should just pretend it doesn't exist?

-- 
greg


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Index/Function organized table layout

2003-10-04 Thread Tom Lane
Greg Stark [EMAIL PROTECTED] writes:
 So I'm a bit confused about the term Clustering. It seems Postgres uses it
 to mean simply ordering the tuple storage within an otherwise normal table.

 However in other databases it seems to mean something more complex.

My take is that clustering means not only placing related tuples near
each other, but taking steps to preserve that organization over time.
PG is really misusing the term because our current CLUSTER command does
the first but not the second.

If tuple insertions were to try to preserve the heap ordering induced
by the latest CLUSTER command, then I think we'd have something closer
to what is usually meant by the term.

 I've never used it in Oracle, but from what I read it seems Oracle
 thinks clustering means storing the tuples for one table in the heap
 for *another* table entirely.

I think that's an implementation detail rather than the fundamental
meaning.

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] COUNT(*) again (was Re: [HACKERS] Index/Function organized table layout)

2003-10-04 Thread Tom Lane
Hannu Krosing [EMAIL PROTECTED] writes:
 The point I was trying to make was that faster count(*)'s is just a side
 effect. If we could (conditionally) keep visibility info in indexes,

I think that's not happening, conditionally or otherwise.  The atomicity
problems alone are sufficient reason why not, even before you look at
the performance issues.

regards, tom lane

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Index/Function organized table layout

2003-10-03 Thread Hannu Krosing
James Rogers kirjutas N, 02.10.2003 kell 23:44:
 On Thu, 2003-10-02 at 12:09, Hannu Krosing wrote:
  So what you really need is the CLUSTER command to leave pages half-empty
  and the tuple placement logic on inserts/updates to place new tuples
  near the place where they would be placed by CLUSTER. I.e. the code that
  does actual inserting should be aware of CLUSTERing.
 
 
 Not exactly. What you are describing is more akin to partitioning or
 hash-organized tables i.e. sorting insert/update tuples to various pages
 according to some hash function.

What I actually thought I was describing is how CLUSTER should work in a
postgres flavour of MVCC storage ;). Not the CLUSTER command, but the
whole feature.

 B-Tree organized tables basically make the table and the primary index
 the same thing, and tend to be most useful for tables that use a single
 multi-column primary key index for queries.  This has the effect of
 putting all the tuples for a typical query in the same small number of
 heap pages (and therefore buffers), allowing very efficient access in
 the typical case with the caveat that non-indexed queries will be quite
 slow.

AFAICS we could resolve this problem (querying indexes only) by keeping
a copy of visibility info (tmin,tmax,...) in index tuples. This would
make index updates bigger and thus slower, so this should be optional.

If you then put all fields in primary key, then the main table could be
dropped. If there is no data table then no other indexes would then be
allowed, or they must be double-indexes referencing the primary key,
not tuple and thus even bigger ...

 B-Tree organized tables are particularly useful when the insert order is
 orthogonal to the typical query order.  As I mentioned before, tables
 that store parallel collections of time-series data are classic
 examples.  Often the data is inserted into the pages in order that can
 roughly be described as (timestamp, id), but is queried using (id,
 timestamp) as the index.  If you have enough ids, you end up with the
 pathological case where you typically have one relevant tuple per page
 for a given query.

But if we had clustered the table on (id, timestamp), then the data
would be in right order for queries, if cluster worked well.

 The nuisance would be keeping track of which pages are collecting which
 tuples.  Knowing the CLUSTER index doesn't tell you much about which
 pages would currently be a good place to put a new tuple.  You could
 always markup the index that CLUSTER uses to keep track of good
 candidates (plus some additional structures), but the more I think about
 that, the more it looks like a nasty hack.

Yeah, index-organized tables seems exact fit for your problem, but then
my abstract idea of what clustering should do is exactly that - keep
tuples in roughly the same order as an index ;)

So what really is needed is a smart tuple-placer which can keep tuples
that are close (as defined by index) together in a small number of
pages. These pages themselves need not be coninuous, they can be
sprinkled around in the whole data table, but they need to stay clusters
of index-close tuples.
 

Hannu


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Index/Function organized table layout

2003-10-02 Thread James Rogers
On Wed, 2003-10-01 at 09:29, Alvaro Herrera wrote:
 On Wed, Oct 01, 2003 at 11:37:38AM -0400, Tom Lane wrote:
  Hm, are you sure that smarter buffer management wouldn't serve the
  purpose?
 
 It doesn't help when there a lot of access locality in searching.  In my
 case I want to select some thousands of records that were inserted very
 apart from each other, but are logically very near.  Having this
 pseudoheap that is ordered by definition helps very much with the
 selection; the current heap requires me to bring to buffers lots of
 uninteresting tuples, whichever buffer management algorithm is used,
 because they are in the same page as interesting tuples.


Yes, what Alvaro said.

For very large tables that routinely run modest range queries, it can be
very expensive in terms of cache efficiency if tuples that are closely
grouped and ordered logically are scattered throughout the heap.  The
requirement to buffer a lot of unrelated data for typical case queries
can greatly reduce the cache hit rate if the active portion of the data
is already quite large relative to the physical RAM available.

To give a real world example, a standard query on one of our tables that
has not been CLUSTER-ed recently (i.e. within the last several days)
generates an average of ~2,000 cache misses.  Recently CLUSTER-ed, it
generates ~0 cache misses on average.  Needless to say, one is *much*
faster than the other.  The problem is that the number of buffers
required to satisfy this query with the tuples scattered is enough to
make it swap out the buffers of another competing query on another table
that is also running.  The result is that performance grinds to a halt
as processes are competing with each other and trying to swap out each
others buffers, resulting in a lot less *actual* buffering than should
be occurring given the amount of data actually being queried.

In my case, not only does CLUSTER-ing increase the number of concurrent
queries possible without disk thrashing by an integer factor, but the
number of buffers touched on a query that generates a cache misses is
greatly reduced as well.  The problem is that CLUSTER-ing is costly and
index-organizing some of the tables would reduce the buffer needs, since
the index tuple in these cases are almost as large as the heap tuples
they reference.

The classic scenario for this is when you have a large collection of
time-series data stored in a table, with each series keyed to another
table.  The the typical tuple distribution creates pathological
behaviors when buffer space becomes tight.

Cheers,

-James Rogers
 [EMAIL PROTECTED]





---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Index/Function organized table layout

2003-10-02 Thread Hannu Krosing
James Rogers kirjutas N, 02.10.2003 kell 20:50:

 To give a real world example, a standard query on one of our tables that
 has not been CLUSTER-ed recently (i.e. within the last several days)
 generates an average of ~2,000 cache misses.  Recently CLUSTER-ed, it
 generates ~0 cache misses on average.  Needless to say, one is *much*
 faster than the other. 

So what you really need is the CLUSTER command to leave pages half-empty
and the tuple placement logic on inserts/updates to place new tuples
near the place where they would be placed by CLUSTER. I.e. the code that
does actual inserting should be aware of CLUSTERing.

I guess that similar behaviour (half-empty pages, or even each second
page empty which is better as it creates less dirty buffers) could also
significantly speed up updates on huge number of tuples, as then code
could always select a place near the old one and thus avoid needless
head-movements between reading and writing areas.

 In my case, not only does CLUSTER-ing increase the number of concurrent
 queries possible without disk thrashing by an integer factor, but the
 number of buffers touched on a query that generates a cache misses is
 greatly reduced as well.  The problem is that CLUSTER-ing is costly and
 index-organizing some of the tables would reduce the buffer needs, since
 the index tuple in these cases are almost as large as the heap tuples
 they reference.

True, but my above suggestion would be much easier to implement
near-term. It seems to be a nice incremental improvement just needs 
to touch places:

1. selecting where new tuples go : 

  * updated ones go near old ones if not clustered and near the place
CLUSTER would place them if clustered. 

  * inserted ones go to the less than half-empty pages if not clustered
and near the place CLUSTER would place them if clustered. 

2. making reorganization code (CLUSTER and VACUUM FULL) to leave space 
in pages for clustered updates/inserts.

the half above could of course mean anything from 10% to 95% depending
on access patterns.

-
Hannu


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Index/Function organized table layout

2003-10-02 Thread James Rogers
On Thu, 2003-10-02 at 12:09, Hannu Krosing wrote:
 So what you really need is the CLUSTER command to leave pages half-empty
 and the tuple placement logic on inserts/updates to place new tuples
 near the place where they would be placed by CLUSTER. I.e. the code that
 does actual inserting should be aware of CLUSTERing.


Not exactly. What you are describing is more akin to partitioning or
hash-organized tables i.e. sorting insert/update tuples to various pages
according to some hash function.  This works well if you a table is
composed of many well-defined working sets but the access and query
patterns for a given working set are varied.  A common example of this
is for large accounting tables, which are frequently partitioned by a
function which truncates the record timestamp to the year/month.  Note
that indexes can be partitioned as well, so that as long as you are
doing queries within a given month (using the accounting example), the
query performance is generally what you would expect if the entire table
was the size of one month's data, no matter how big the table actually
gets.

Partitions and hash-organized tables (for RDBMS that support them) are
often handled internally as multiple tables masquerading as one (kind of
like a view), with the hash function determining which table is actually
being accessed.  This allows the possibility of doing things like
locking individual partitions or setting them read-only, and generally
having the option of treating them as individual tables for
administrative purposes e.g. deleting a partition of old unused data
without adversely affecting the active partition in the way it would
if they were truly one table, even if they look like one table from a
SQL-level view.

What I really need in my specific case is B-Tree organized tables,
though hashing/partitioning would help too.  It *is* a similar kind of
problem.

B-Tree organized tables basically make the table and the primary index
the same thing, and tend to be most useful for tables that use a single
multi-column primary key index for queries.  This has the effect of
putting all the tuples for a typical query in the same small number of
heap pages (and therefore buffers), allowing very efficient access in
the typical case with the caveat that non-indexed queries will be quite
slow.

B-Tree organized tables are particularly useful when the insert order is
orthogonal to the typical query order.  As I mentioned before, tables
that store parallel collections of time-series data are classic
examples.  Often the data is inserted into the pages in order that can
roughly be described as (timestamp, id), but is queried using (id,
timestamp) as the index.  If you have enough ids, you end up with the
pathological case where you typically have one relevant tuple per page
for a given query.


 True, but my above suggestion would be much easier to implement
 near-term. It seems to be a nice incremental improvement just needs 
 to touch places:
 
 1. selecting where new tuples go : 
 
   * updated ones go near old ones if not clustered and near the place
 CLUSTER would place them if clustered. 
 
   * inserted ones go to the less than half-empty pages if not clustered
 and near the place CLUSTER would place them if clustered. 
 
 2. making reorganization code (CLUSTER and VACUUM FULL) to leave space 
 in pages for clustered updates/inserts.


The nuisance would be keeping track of which pages are collecting which
tuples.  Knowing the CLUSTER index doesn't tell you much about which
pages would currently be a good place to put a new tuple.  You could
always markup the index that CLUSTER uses to keep track of good
candidates (plus some additional structures), but the more I think about
that, the more it looks like a nasty hack.

Cheers,

-James Rogers
 [EMAIL PROTECTED]





---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Index/Function organized table layout

2003-10-02 Thread Alvaro Herrera Munoz
On Thu, Oct 02, 2003 at 10:09:12PM +0300, Hannu Krosing wrote:

 So what you really need is the CLUSTER command to leave pages half-empty
 and the tuple placement logic on inserts/updates to place new tuples
 near the place where they would be placed by CLUSTER. I.e. the code that
 does actual inserting should be aware of CLUSTERing.

Oh, you mean like what I half-proposed in

http://archives.postgresql.org/pgsql-general/2002-06/msg00968.php

BTW, this is the one message (the one on the CVS log I cite at the end)
that got me into Postgres hacking.  The little involvement I've had has
been great.  Thanks to all PostgreSQL hackers!

-- 
Alvaro Herrera ([EMAIL PROTECTED])
El realista sabe lo que quiere; el idealista quiere lo que sabe (Anonimo)

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Index/Function organized table layout

2003-10-01 Thread Tom Lane
James Rogers [EMAIL PROTECTED] writes:
 Now, I've actually hacked commercial MVCC engines back in the day, and am
 comfortable playing around in database internals.  I have an itch to
 scratch for improving the scalability of Really Large Tables by explicitly
 allowing control of table layouts as an optional property of the tables.  I
 would like some feedback as to whether this is practical given the current
 architecture; it appears that it is, but I'm not intimately familiar with it

I think you'd need to do some basic architectural work first.  Right now
we have a clean API for index access methods, but there is no comparable
abstraction layer for heaps (tables).  It'd probably be necessary to
create such a layer in order to allow different heap organizations to be
supported.  Another point of confusion is that heaps and indexes are
rigidly distinct.  Perhaps some heaps could be considered to be indexes
as well, in order to support your idea of b-tree-organized tables.
Doing that would undoubtedly break a few places though.

 Both of these things really are attempts to address the same basic problem,
 which is optimizing the number of buffers a given query uses by making the
 tables layout reflect typical queries.

Hm, are you sure that smarter buffer management wouldn't serve the
purpose?

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Index/Function organized table layout

2003-10-01 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 As for other indexes, I'm not sure why you say this precludes the use of
 other indexes.  The only thing they have to do is keep pointers to index
 elements, instead of heap elements.  Doesn't sound impossible to me.

However, btree feels free to move index entries around while inserting
other entries.  I'm not sure that you could usefully use ctid as an
identifier for tuples in a btree-organized heap.  This will break more
things than just other indexes :-( ... UPDATE chaining for one example.

 Another thing to keep in mind:  LY requires unique keys.  In the
 current btree implementation this is worked around using the
 pointers-to-heap (ctid?) to differentiate items that have the same key.
 If you use a clustered index you won't have this pointer; maybe it will
 be required that this index is marked UNIQUE.

IIRC this assumption is really only used when re-finding the parent
downlink after a page split, and so we were able to get rid of it by
looking for the downlink by child block number, without using the key at
all.  So that part doesn't bother me.  The mutability of index ctids
seems like a much worse problem.

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Index/Function organized table layout

2003-10-01 Thread Alvaro Herrera
On Wed, Oct 01, 2003 at 11:37:38AM -0400, Tom Lane wrote:
 James Rogers [EMAIL PROTECTED] writes:

  Both of these things really are attempts to address the same basic problem,
  which is optimizing the number of buffers a given query uses by making the
  tables layout reflect typical queries.
 
 Hm, are you sure that smarter buffer management wouldn't serve the
 purpose?

It doesn't help when there a lot of access locality in searching.  In my
case I want to select some thousands of records that were inserted very
apart from each other, but are logically very near.  Having this
pseudoheap that is ordered by definition helps very much with the
selection; the current heap requires me to bring to buffers lots of
uninteresting tuples, whichever buffer management algorithm is used,
because they are in the same page as interesting tuples.

-- 
Alvaro Herrera (alvherre[a]dcc.uchile.cl)
Linux transformó mi computadora, de una `máquina para hacer cosas',
en un aparato realmente entretenido, sobre el cual cada día aprendo
algo nuevo (Jaime Salinas)

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org