Re: File extension independent H2 format

Dario Fassi Tue, 21 Dec 2010 09:01:39 -0800

Hi,

On 20/12/10 17:30, Thomas Mueller wrote:
>> would a single storage file slow down H2 engine from looking
>> for and getting or writing data since
> No. The only problem (I know) is what I have already described.


I doubt that this is really the case. In a database with many lob columns and 
rows, the size of this single file can easily grow to disadvantageous levels.
A single lob field can be as large as several full tables, or even exceed 
the size of the rest of the database.

Fragmentation at the file system (OS) level will have much more impact on large 
files, caching (at OS level) will be less effective, read-ahead will be less 
effective too, and finally the I/O load will inevitably increase.

It is easy to measure how the performance of a database degrades as the 
data volume increases significantly. I mean, if a db without lobs is 1 GB in 
size and with lobs goes over 10 GB, it would be very optimistic to think that 
the overall performance will not change. Just imagine defragmenting or 
compacting a file of that size.
In a two-file scenario, we would have a main file of 1 GB with almost all data 
+ indexes, and a lobs file of 9 GB with lobs only (not as bad as one file 
per lob, and not as big as everything in one file).

>> What do you think of using only 2 files, one for all normal columns and 
>> another specialized for long data types (LOBs)?
> I thought about this, but I would try to avoid it if possible. What
> would be the advantage?

I can think of everything stated above, and more:

1) The main file will hold almost all indexes and data (except lobs), plus 
references into the lobs file as the column values of lob columns.

2) The lobs file can have a different fileStore (much simpler and more 
specific), organized in variable-length extents or pages to take advantage of 
the sequential nature of its contents.

Such a fileStore only needs an avail-list and one index of pointers or 
references, to be used as column values in the main file; 
like the old xBase .DBT files, which use a simple and very effective format, or 
the .tar file format, which was designed for sequential access devices (or 
streaming, in Java parlance).
So a locator can be implemented easily (at file level) as the lob reference 
pointer + locator offset.
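
To make the idea more concrete, here is a minimal in-memory sketch of the 
bookkeeping such a lobs fileStore might need: an index from lob reference to 
extent, an avail-list of deleted extents, and a locator resolved as extent 
position + offset. All names (LobExtentStore, allocate, free, resolve) are 
hypothetical and not part of H2, and the first-fit reuse policy is only an 
assumption for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only: none of these names exist in H2.
class LobExtentStore {
    // One extent: a contiguous byte range inside the lobs file.
    static final class Extent {
        final long position;
        final int length;
        Extent(long position, int length) {
            this.position = position;
            this.length = length;
        }
    }

    private final Map<Long, Extent> index = new HashMap<>();  // lob reference -> extent
    private final List<Extent> availList = new ArrayList<>(); // deleted extents, reusable
    private long endOfFile = 0;
    private long nextRef = 1;

    // Reserves space for a lob and returns the reference that the main file
    // would store as the column value.
    long allocate(int length) {
        Extent chosen = null;
        // First fit: reuse the first deleted extent that is large enough.
        for (int i = 0; i < availList.size(); i++) {
            Extent free = availList.get(i);
            if (free.length >= length) {
                availList.remove(i);
                chosen = new Extent(free.position, length);
                if (free.length > length) {
                    // Keep the unused tail available for later lobs.
                    availList.add(new Extent(free.position + length, free.length - length));
                }
                break;
            }
        }
        if (chosen == null) {
            // Nothing reusable: append at the end of the file.
            chosen = new Extent(endOfFile, length);
            endOfFile += length;
        }
        long ref = nextRef++;
        index.put(ref, chosen);
        return ref;
    }

    // Deleting a lob just moves its extent to the avail-list.
    void free(long ref) {
        Extent e = index.remove(ref);
        if (e == null) {
            throw new IllegalArgumentException("unknown lob reference " + ref);
        }
        availList.add(e);
    }

    // Locator = lob reference + offset, resolved to an absolute file position.
    long resolve(long ref, long offset) {
        Extent e = index.get(ref);
        if (e == null) {
            throw new IllegalArgumentException("unknown lob reference " + ref);
        }
        if (offset < 0 || offset >= e.length) {
            throw new IndexOutOfBoundsException("offset " + offset);
        }
        return e.position + offset;
    }

    long size() {
        return endOfFile;
    }
}
```

A real store would also have to persist the index and the avail-list and merge 
adjacent free extents; this sketch leaves both out.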

To compact the contents of extents (if needed), a stream-oriented method like 
deflate or gzip can be used without harming streaming.
Each extent can have a header with a tag marker, length, checksum, etc., to 
make recovery of a broken file easier.
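
As a rough illustration of such an extent layout, the sketch below writes a 
16-byte header (tag marker, payload length, CRC-32 checksum) followed by the 
deflated lob, and checks all three fields when reading. The class name 
ExtentCodec, the tag value and the field order are assumptions made for this 
example, not an existing H2 format; only the java.util.zip and java.nio 
classes are real.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.util.zip.CRC32;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical sketch only: this is not an H2 file format.
class ExtentCodec {
    static final int TAG = 0x4C4F4245;  // the ASCII bytes "LOBE", an arbitrary marker
    static final int HEADER_SIZE = 16;  // tag (4) + length (4) + checksum (8)

    // Builds one extent: header followed by the deflated lob contents.
    static byte[] write(byte[] lob) {
        ByteArrayOutputStream packed = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(packed)) {
            out.write(lob);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] payload = packed.toByteArray();
        CRC32 crc = new CRC32();
        crc.update(payload);
        ByteBuffer extent = ByteBuffer.allocate(HEADER_SIZE + payload.length);
        extent.putInt(TAG).putInt(payload.length).putLong(crc.getValue()).put(payload);
        return extent.array();
    }

    // Verifies tag, length and checksum, then inflates the payload.
    static byte[] read(byte[] extent) {
        ByteBuffer buf = ByteBuffer.wrap(extent);
        if (buf.remaining() < HEADER_SIZE || buf.getInt() != TAG) {
            throw new IllegalStateException("bad extent tag");
        }
        int length = buf.getInt();
        long expected = buf.getLong();
        if (length < 0 || length > buf.remaining()) {
            throw new IllegalStateException("bad extent length");
        }
        byte[] payload = new byte[length];
        buf.get(payload);
        CRC32 crc = new CRC32();
        crc.update(payload);
        if (crc.getValue() != expected) {
            throw new IllegalStateException("checksum mismatch");
        }
        try (InflaterInputStream in =
                 new InflaterInputStream(new ByteArrayInputStream(payload))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a known tag at the start of every extent, a recovery tool could scan a 
damaged file for the marker and use the length and checksum to decide which 
extents are still intact.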

>> that can facilitate locator's implementation too
> How?

It is explained above, but once again:

If the lob fileStore is organized as a sequence of variable-length extents, 
with an index of pointers and a list of available (or deleted) extents, 
a locator can be implemented easily (at file level) as the lob reference 
pointer + the locator offset.

Streaming access to lob contents will be simpler and will benefit too.

regards,
Dario.
