Wogster makes a good point: SYSCS_COMPRESS_TABLE may help make the data
in tables more contiguous.
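For reference, compression is invoked through Derby's SYSCS_UTIL system procedure; something like the following (the schema and table names here are just placeholders):

```sql
-- Reclaims unused space in the table and rebuilds its indexes.
-- 'APP' and 'MYTABLE' are placeholder names; the third argument (1)
-- requests the sequential variant, which uses less memory and disk
-- at the cost of taking longer.
CALL SYSCS_UTIL.SYSCS_COMPRESS_TABLE('APP', 'MYTABLE', 1);
```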

The Wogster wrote:
Øystein Grøvlen wrote:

"TW" == The Wogster <[EMAIL PROTECTED]> writes:



    TW> Øystein Grøvlen wrote:

    >> Is this also true for B-tree indexes?  I would imagine that if you
    >> have an index on a monotonically increasing key (e.g., a timestamp)
    >> and where you regularly delete old records, there may be a lot of
    >> empty B-tree pages that will never be possible to reuse.
    >>

    TW> What happens in most databases is that the database has a fixed
    TW> page size, say 8K; when an index page is full, it splits that page
    TW> into two half pages.  When an index page is empty, it's dropped
    TW> from the index and added to the empty page pool.  Many will merge
    TW> almost empty neighbouring pages, but that doesn't matter for this
    TW> discussion.

I know this.  The reason I asked was because I have got the impression
that in Derby the only way to drop empty index pages is to do
compression.


Derby should work in a similar way to other databases; the techniques for developing a database were established years ago. When it comes to indexes, there are three technologies:

ISAM: The index is loaded into an array in memory; index members are added and deleted by moving pointers around, and the index is dumped back to disk when the database is closed. dBase worked this way at one time.

B-Tree: The traditional B-tree isn't used much anymore, because it's easy to end up with an unbalanced tree, which is very inefficient.

Balanced B-Tree: Most databases use this one; logic in the indexing code keeps shifting the tree around so that it stays in balance. It requires that the indexes at least be page based, so most page-based databases use it. It is the most efficient for reads, and can be slower for adds and deletes because of the shifting around, but in most databases there are 100 reads for every add or delete.
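The page behaviour described above (full pages split in two, empty pages go back to an empty page pool for reuse) can be sketched in a few lines. This is purely illustrative, not Derby's actual code: a real engine also maintains interior nodes, rebalancing, and latching, and its pages hold kilobytes rather than four keys.

```python
PAGE_CAPACITY = 4  # real engines use a byte size, e.g. 8K, not a key count

class Leaf:
    def __init__(self):
        self.keys = []

class Index:
    def __init__(self):
        self.leaves = [Leaf()]      # an ordered list stands in for the tree
        self.empty_page_pool = []   # freed pages, reusable without growing the file

    def insert(self, key):
        leaf = self._find(key)
        leaf.keys.append(key)
        leaf.keys.sort()
        if len(leaf.keys) > PAGE_CAPACITY:          # page full: split into two halves
            new = self.empty_page_pool.pop() if self.empty_page_pool else Leaf()
            mid = len(leaf.keys) // 2
            new.keys = leaf.keys[mid:]
            leaf.keys = leaf.keys[:mid]
            self.leaves.insert(self.leaves.index(leaf) + 1, new)

    def delete(self, key):
        leaf = self._find(key)
        leaf.keys.remove(key)
        if not leaf.keys and len(self.leaves) > 1:  # empty page: drop it, pool it
            self.leaves.remove(leaf)
            self.empty_page_pool.append(leaf)

    def _find(self, key):
        # Linear scan for the leaf covering this key (a real tree descends nodes).
        for leaf in self.leaves:
            if leaf.keys and key <= leaf.keys[-1]:
                return leaf
        return self.leaves[-1]
```

Note how a monotonically increasing key (the timestamp case above) always appends to the last leaf, so earlier leaves stay half full after splits; deleting old records then empties those leaves and returns them to the pool.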

For dropping index pages, look at the source code and see what it does on a delete; if it releases the page to the empty page pool (more like a cache, actually), then it doesn't matter whether you compress or not.

One thing to remember: shortening (where possible) or lengthening a file is an expensive operation, which is why most databases add a batch of pages to the empty page pool rather than a single page at a time. They also reuse empty pages within the file, because reusing an existing page only means changing a pointer, whereas adding to or deleting from the file means kernel operations, which are much slower.
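A toy allocator makes the trade-off concrete: the file is extended a chunk at a time (the slow kernel operation), and freed pages are reused from the pool before the file is ever grown again. Names and the chunk size are hypothetical, not Derby's actual allocator.

```python
CHUNK = 8  # pages added per file extension (hypothetical value)

class PageAllocator:
    def __init__(self):
        self.file_pages = 0      # current size of the on-disk file, in pages
        self.free_pool = []      # page numbers available for reuse
        self.extensions = 0      # count of (expensive) file-grow operations

    def alloc(self):
        if not self.free_pool:   # pool empty: pay for one file extension
            self.free_pool = list(range(self.file_pages,
                                        self.file_pages + CHUNK))
            self.file_pages += CHUNK
            self.extensions += 1
        return self.free_pool.pop(0)  # cheap: just a pointer change

    def free(self, page_no):
        # The file is never shrunk here; the page just becomes reusable.
        self.free_pool.append(page_no)
```

With this scheme, a burst of deletes followed by inserts touches only the in-memory pool and never the kernel, which is the point the paragraph above is making.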

Compression can be good, though. As a database ages, an index can have its first page at page 25, the next at page 234, the next at page 535, the next at page 43, and these pages can be scattered all over the disk volume as well, so you're moving the heads all over the place to find them. Compressing the database, followed by a disk defrag, can make it faster.

W
