On Tuesday, 2007-09-18, 09:23 +0200, Frank Schönheit - Sun Microsystems Germany wrote:
> Hi Marc,
>
> >> Unfortunately not. This would be the only change which *really* allows
> >> to address a number of performance issues with the embedded HSQLDB.
> >> Amongst others, closing data views or forms becomes unacceptably slow
> >> (IMO) if the .odb exceeds a certain (relatively small) size limit. Also,
> >> opening the connection becomes slower as the database and thus the .odb
> >> grows. The only change to overcome this would be the single-file
> >> backend, but there has been no progress at this.
> >
> > Will a single file make such a big difference? And why?
>
> Because with the ZIP file architecture, every commit/write to the ZIP
> file (the .odb) requires a complete rewrite of the complete package.
> Technically, this is "solved" (not really) by working on a copy of all
> the streams in the ZIP package, and only re-packaging them when the
> document as a whole is finally saved.
>
> That is the reason for some oddities: For instance, if a form is saved,
> then the changes you made to the form are saved to the copy of the
> form's stream. Only when you then save the database document is the
> copy re-merged into the .odb file.
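The full-rewrite behaviour described above can be sketched with Python's `zipfile` module (the `.odb` path and entry names here are made up for illustration): there is no in-place update for a ZIP entry, so "changing" one stream means copying every other entry into a brand-new archive.

```python
import io
import zipfile

def replace_entry(odb_path, name, new_data):
    """'Update' one stream in a ZIP-based package.

    Every other entry has to be copied into a fresh archive -
    the ZIP format offers no in-place write for a single entry.
    """
    tmp = io.BytesIO()
    with zipfile.ZipFile(odb_path) as src, \
         zipfile.ZipFile(tmp, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = new_data if item.filename == name else src.read(item)
            dst.writestr(item.filename, data)  # full copy, entry by entry
    with open(odb_path, "wb") as f:           # replace the whole file
        f.write(tmp.getvalue())
```

The cost of this function is proportional to the size of the whole package, not to the size of the change, which is exactly the problem with committing database pages through a ZIP container.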
Now I understand - and because ZIP is a stream-oriented archive format, it has to be this way.

> This approach was dictated by the fact that on the medium term, the .odb
> format should be standardized at OASIS as well, and this means doing as
> the other applications/formats do.

And since ODF dictates this approach, it implicitly excludes high performance on larger data volumes. An idea here could be to re-package the .odb with the database part as a single entry stored with compression level 0 - ZIP does support uncompressed ("stored") entries, but even then the entry could not grow in place without rewriting the archive. Using ODF for the database part is debatable in itself anyway: there is currently no other application that could reuse it - and reuse is the goal of ODF.

> Also, this would automatically solve the problem of data changes not
> surviving a crash: Currently, when you enter data in say the table data
> view, this is (by the HSQL engine) immediately (well, with a
> configurable delay) written into the underlying files. This is how every
> reasonable database engine behaves - it means if you pull the plug just
> after changing the data, it will most probably still be there the next
> time you look at it (again, not counting possible write caches of
> the operating system).

Very valuable feature ...

> With a single-file back-end (which, when I say it, always implies a file
> with random access to it - other back-end file formats are useless for a
> DB engine), this would change, too.
>
> > I could easily think of lowering the workload when serializing the
> > database to disc by having some sort of background task prepare the
> > physical save, e.g. by building up the DOM model of the data or the
> > like.
>
> I'd suppose this is overkill, and will get into performance problems a
> little bit later, but soon enough. Still, the problem, imposed by the
> ZIP format, is that for the change of a single byte in the file (say you
> changed a single letter in a table row), the complete file has to be
> re-packaged and re-written.
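For what it's worth, mixing compression methods per entry is possible; a sketch with Python's `zipfile` (file and entry names are invented for illustration) shows the database part stored raw next to deflated XML streams - and also why this alone doesn't help with commits:

```python
import zipfile

# ZIP allows a compression method per entry: the database part could be
# "stored" (no compression) while the XML streams stay deflated.
with zipfile.ZipFile("mixed.odb", "w") as z:
    z.writestr("content.xml", b"<office:document/>",
               compress_type=zipfile.ZIP_DEFLATED)
    z.writestr("database/data", b"\x00" * 1024,
               compress_type=zipfile.ZIP_STORED)  # raw, offset-addressable

with zipfile.ZipFile("mixed.odb") as z:
    info = z.getinfo("database/data")
    print(info.compress_type == zipfile.ZIP_STORED)  # True
```

Even with a stored entry, growing it shifts every entry behind it plus the central directory at the end of the archive, so a commit would still mean rewriting the file.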
> This bottleneck can IMO only be removed
> with a change of the file format, away from ZIP, towards a
> random-access format.

So having a random-access file backend would be the way to go. Do you have something in mind (although I think this has to be solved in the HSQL source)?

<shouting "Jehovah" mode>
Other databases that use binary files and have a JDBC driver may fit this requirement, too. Firebird would be a candidate.
</shouting "Jehovah" mode>

Marc

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
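A minimal sketch of what "random access" buys a database engine, assuming a hypothetical fixed page size (real engines like Firebird or HSQLDB's cached tables do something far more elaborate, but the principle is the same): overwriting one page touches only that page, no matter how large the file is.

```python
import os

PAGE_SIZE = 4096  # hypothetical fixed page size

def write_page(f, page_no, payload):
    """Overwrite one fixed-size page of an open binary file in place.

    Only PAGE_SIZE bytes are written, regardless of the file's total
    size - the crucial property a ZIP container cannot offer.
    """
    assert len(payload) == PAGE_SIZE
    f.seek(page_no * PAGE_SIZE)
    f.write(payload)
    f.flush()
    os.fsync(f.fileno())  # survive a pulled plug (modulo disk caches)
```

With such a backend, committing a single changed table row costs one page write plus a log record, instead of re-packaging the whole document.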
