Hi, this is a new thread split off from "File extension independent H2 format", since the discussion has moved on to a different subject:
What do you think of using only 2 files, one for all normal columns and another specialized for long data types (LOBs)?

On 20/12/10 17:30, Thomas Mueller wrote:
>> would a single storage file slow down H2 engine from looking
>> for and getting or writing data since
>
> No. The only problem (I know) is what I have already described.

I doubt that this is really the case. In a database with many LOB columns and rows, this single file can easily grow to a size where it becomes a disadvantage. A single LOB field can be as large as several full tables, or even exceed the size of the rest of the database.

File system (OS level) fragmentation has much more impact on large files, OS-level caching and read-ahead become less effective, and in the end the I/O load inevitably increases.

It is easy to measure how the performance of a database degrades as its data volume grows significantly. I mean, if a db without LOBs is 1 GB and with LOBs goes over 10 GB, it would be very optimistic to think that the overall performance will not change. As an example, imagine defragmenting or compacting a file of that size.

In a two-file scenario, we would have a main file of 1 GB with almost all data + indexes, and a LOB file of 9 GB with LOBs only (not as bad as one file per LOB, and not as big as everything in one file).

>> What think you of using only 2 files, one for all normal columns and
>> other specialized for long data types(LOBS).
>
> I thought about this, but I would try to avoid it if possible. What
> would be the advantage?

I can think of everything stated above, and more:

1) The main file will hold almost all indexes and data (except LOBs), with references into the LOB file stored as the column values of the LOB columns.
2) The LOB file can have a different fileStore (much simpler and more specific), organized in variable-length extents to take advantage of the sequential nature of its contents.
Such a fileStore only needs an avail-list and an index that maps references to pointers; the references are what get stored as column values in the main file. It would resemble the old xBase .DBT files, which use a simple and very effective format, or the .tar format, which was designed for sequential-access devices (or streaming, in Java parlance). So a locator can be implemented easily (at file level) as the LOB reference pointer + the locator offset.

If the contents of the extents need to be compressed, a stream-oriented method such as deflate or gzip can be used without harming streaming. Each extent can have a header with a tag marker, length, checksum, etc., to make recovery of a broken file easier.

>> that can facilitate locator's implementation too
>
> How?

It is explained above, but once more: if the LOB fileStore is organized as a sequence of variable-length extents, with an index of pointers and of available (or deleted) extents, a locator can be implemented easily at file level as the LOB reference pointer + the locator offset. Streaming access to the LOB contents would be simplified and would benefit too.

Regards,
Dario.
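
P.S. To make the extent idea a bit more concrete, here is a minimal sketch in Java of the layout described above. It is only an illustration under my own assumptions, not H2 code: the class name `LobExtentFile`, the tag value, and the header layout (4-byte tag, 4-byte length, 8-byte CRC32) are all invented for the example, and the avail-list and the reference index are left out, so extents are only ever appended.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.zip.CRC32;

/** Hypothetical sketch of a LOB file made of variable-length extents. */
final class LobExtentFile implements AutoCloseable {
    // Invented tag marker ("LOBX" in ASCII); any recognizable value would do.
    static final int TAG = 0x4C4F4258;
    // Assumed header layout: tag (4 bytes) + length (4 bytes) + CRC32 (8 bytes).
    static final int HEADER_SIZE = 4 + 4 + 8;

    private final RandomAccessFile file;

    LobExtentFile(String path) throws IOException {
        this.file = new RandomAccessFile(path, "rw");
    }

    /** Appends one extent; the returned start position is the LOB reference pointer. */
    long append(byte[] data) throws IOException {
        long pointer = file.length();
        file.seek(pointer);
        file.writeInt(TAG);
        file.writeInt(data.length);
        file.writeLong(checksum(data));
        file.write(data);
        return pointer;
    }

    /** Reads up to count bytes at the locator position: pointer + header + offset. */
    byte[] read(long pointer, int offset, int count) throws IOException {
        file.seek(pointer);
        if (file.readInt() != TAG) {
            throw new IOException("No extent header at " + pointer);
        }
        int length = file.readInt();
        if (offset < 0 || offset > length || count < 0) {
            throw new IOException("Locator out of range");
        }
        byte[] out = new byte[Math.min(count, length - offset)];
        file.seek(pointer + HEADER_SIZE + offset);
        file.readFully(out);
        return out;
    }

    /** Re-computes the checksum of one extent, e.g. while recovering a broken file. */
    boolean verify(long pointer) throws IOException {
        file.seek(pointer);
        if (file.readInt() != TAG) {
            return false;
        }
        byte[] data = new byte[file.readInt()];
        long expected = file.readLong();
        file.readFully(data);
        return checksum(data) == expected;
    }

    private static long checksum(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data);
        return crc.getValue();
    }

    @Override
    public void close() throws IOException {
        file.close();
    }
}
```

In this sketch the main file would store only the value returned by `append` as the column value, and a locator read is just a seek to that pointer plus the header size plus the offset. Compression of the extent contents (deflate or gzip) and reuse of deleted extents through an avail-list would go on top of this.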
