Wogster makes a good point: SYSCS_COMPRESS_TABLE may help make the data
in tables more contiguous.
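For reference, Derby exposes this as a system procedure callable from SQL. A call like the following reclaims unused space in a table and its indexes (the schema and table names here are placeholders):

```sql
-- Reclaim unused space in APP.MYTABLE and rebuild its indexes.
-- Third argument SEQUENTIAL = 1 compresses one index at a time,
-- using less memory at the cost of being slower; 0 does them at once.
CALL SYSCS_UTIL.SYSCS_COMPRESS_TABLE('APP', 'MYTABLE', 1);
```

Note that this takes an exclusive lock on the table while it runs, so it is best done during a quiet period.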
The Wogster wrote:
Øystein Grøvlen wrote:
"TW" == The Wogster <[EMAIL PROTECTED]> writes:
TW> Øystein Grøvlen wrote:
>> Is this also true for B-tree indexes?  I would imagine that if you
>> have an index on a monotonically increasing key (e.g., a timestamp)
>> and where you regularly delete old records, there may be a lot of
>> empty B-tree pages that will never be possible to reuse.
>>
TW> What happens in most databases is that the database has a fixed
TW> page size, say 8K.  When an index page is full, it splits that
TW> page into two half-full pages.  When an index page is empty, it's
TW> dropped from the index and added to the empty page pool.  Many
TW> databases will merge almost-empty neighbouring pages, but that
TW> doesn't matter for this discussion.
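The split-on-full and free-on-empty behaviour described above can be sketched in a small, deliberately simplified toy model (this is illustrative, not Derby's actual code):

```python
# Toy model of the page behaviour described above (not Derby's actual
# implementation): fixed-capacity pages, split a full page into two
# half-full pages, and return empty pages to a free pool.

PAGE_CAPACITY = 4  # stand-in for an 8K page holding N index entries

class Page:
    def __init__(self, keys=None):
        self.keys = keys or []

class Index:
    def __init__(self):
        self.pages = [Page()]   # a sorted run of pages
        self.free_pool = []     # empty pages awaiting reuse

    def insert(self, key):
        # Find the page this key belongs to (linear scan for simplicity).
        for i, page in enumerate(self.pages):
            last = i == len(self.pages) - 1
            if not page.keys or key <= page.keys[-1] or last:
                page.keys.append(key)
                page.keys.sort()
                if len(page.keys) > PAGE_CAPACITY:
                    # Split a full page into two half-full pages,
                    # reusing a pooled page if one is available.
                    mid = len(page.keys) // 2
                    new = self.free_pool.pop() if self.free_pool else Page()
                    new.keys = page.keys[mid:]
                    page.keys = page.keys[:mid]
                    self.pages.insert(i + 1, new)
                return

    def delete(self, key):
        for i, page in enumerate(self.pages):
            if key in page.keys:
                page.keys.remove(key)
                if not page.keys and len(self.pages) > 1:
                    # Empty page: drop it from the index, keep it pooled.
                    self.free_pool.append(self.pages.pop(i))
                return
```

Inserting a monotonically increasing key (as in the timestamp example above) always fills the last page and splits it, while deleting the oldest keys empties the leading pages, which go back to the pool for reuse.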
I know this.  The reason I asked is that I have the impression that in
Derby the only way to drop empty index pages is to do compression.
Derby should work in a similar way to other databases; the techniques
for developing a database were established years ago.  When it comes to
indexes, there are three technologies:
ISAM: The index is loaded into an array in memory; index entries are
added and deleted by moving pointers around, and the index is dumped
back to disk when the database is closed.  dBASE worked this way at one
time.
B-tree: the traditional B-tree isn't used much anymore, because it's
easy to end up with an off-balance tree, which is very inefficient.
Balanced B-tree: most databases use this one.  Logic in the indexing
code keeps shifting the tree around so that it stays in balance.  It
requires that the indexes at least be page based, so most page-based
databases use it.  It is the most efficient for reads, and can be
slower at adds and deletes because of the shifting around, but for most
databases there are 100 reads for every add or delete.
For dropping index pages, look at the source code and see what it does
on a delete; if it releases the page to the empty page pool (more like a
cache, actually), then it doesn't matter whether you compress or not.
One thing you need to remember is that shortening (if possible) or
lengthening a file is an expensive operation; this is why most databases
add a bunch of pages to the empty page pool rather than adding a single
page at a time.  They also reuse empty pages within the file, because
reusing an existing page means changing a pointer, while adding to or
deleting from the file means kernel operations, which are much slower.
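That allocation policy (reuse pooled pages first; when the file must grow, grow it by a whole batch of pages at once) can be sketched like so. The names and batch size are illustrative assumptions, not Derby's actual code:

```python
# Toy page allocator (illustrative, not Derby's implementation).
# Policy: hand out pages from the free pool when possible (cheap, just
# pointer updates); only extend the "file" when the pool is empty, and
# then extend by a whole batch, since growing the file is the expensive
# kernel-level operation.

BATCH_SIZE = 8  # pages added per file extension

class PageAllocator:
    def __init__(self):
        self.file_pages = 0   # current size of the file, in pages
        self.free_pool = []   # page numbers available for reuse
        self.extensions = 0   # how many times we had to grow the file

    def allocate(self):
        if not self.free_pool:
            # Expensive path: grow the file by a whole batch at once.
            start = self.file_pages
            self.file_pages += BATCH_SIZE
            self.extensions += 1
            self.free_pool.extend(range(start, start + BATCH_SIZE))
        # Cheap path: hand out an already-allocated page.
        return self.free_pool.pop(0)

    def free(self, page_no):
        # Freed pages go back into the pool; the file never shrinks here.
        self.free_pool.append(page_no)
```

With this policy, allocating eight pages costs one file extension, and a workload that frees and reallocates pages afterwards never has to touch the kernel again.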
Compression can be good, though.  As a database ages, an index can have
its first page on page 25, the next on page 234 535, the next on page
43, and these pages can be scattered all over the disk volume as well,
so you're moving the heads all over the place to find them.  Compressing
the database, followed by a disk defrag, can make the database faster.
W