I think that was tried when the original change was made, and the system did not see the IO performance that was expected at log commit time. The problem this change addresses is that the log sync call at commit time was sometimes 2 to 3 times slower than other databases for single-user, short-transaction benchmarks. The difference depends on the OS, JVM, and disk type.

Prior to jdk1.4 the only pure Java way to sync data to disk was a call that told the JVM to basically sync the whole file. Our investigation determined that, at least on Windows, each extending write using this mechanism was causing multiple I/Os to the log file, likely one for the data and one for system metadata about the file.

In jdk1.4 a new interface was added which guaranteed a sync for each write executed. Unfortunately, on some JVMs in jdk1.4.1 it did not actually do the sync. This bug was fixed in jdk1.4.2 JVMs.
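To make the two mechanisms concrete, here is a minimal sketch, not the actual Derby code. It assumes the "whole file" call is FileDescriptor.sync() and that the jdk1.4 interface is the "rws"/"rwd" open modes of RandomAccessFile; the class and method names are made up for illustration, and it uses try-with-resources, so it needs a newer JVM than the ones discussed here.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical illustration only; names are not taken from the real code base.
class LogSyncSketch {

    // Older style: append with an ordinary "rw" file, then ask the JVM
    // to sync the file through its file descriptor.
    static void writeThenSync(File logFile, byte[] record) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(logFile, "rw")) {
            raf.seek(raf.length());   // append at the current end of the file
            raf.write(record);        // may only reach OS buffers at this point
            raf.getFD().sync();       // explicit sync call after the write
        }
    }

    // jdk1.4 style: open in "rwd" mode so each write to the file's content
    // is written synchronously, with no separate sync call.
    static void writeSynchronously(File logFile, byte[] record) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(logFile, "rwd")) {
            raf.seek(raf.length());   // still an extending write here
            raf.write(record);
        }
    }
}
```

Both methods above extend the file on every write, which is the case where the multiple I/Os were observed.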


So in jdk1.4.2 JVMs the new interface was used. But testing showed the system still got the multiple-I/O performance unless it wrote to a preallocated log file rather than an ever-growing one. So at the same time the code being discussed here was added to preallocate log files.
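Roughly, the preallocation idea looks like the sketch below. This is an illustration under assumptions, not the code being discussed: the block size, class name, and method names are hypothetical, and the commit write uses "rwd" mode only as an example of a synchronous write mode.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical illustration only; names and sizes are not from the real code base.
class LogPreallocationSketch {

    static final int BLOCK_SIZE = 4096; // assumed block size for the example

    // Write zeros over the whole file once, up front, so that later
    // commit-time writes overwrite existing bytes instead of extending the file.
    static void preallocate(File logFile, long sizeBytes) throws IOException {
        byte[] zeros = new byte[BLOCK_SIZE];
        try (RandomAccessFile raf = new RandomAccessFile(logFile, "rw")) {
            long remaining = sizeBytes;
            while (remaining > 0) {
                int n = (int) Math.min(zeros.length, remaining);
                raf.write(zeros, 0, n);
                remaining -= n;
            }
            raf.getFD().sync(); // one sync for the whole preallocation
        }
    }

    // Commit-time write into the already allocated region, using a
    // synchronous write mode so the write itself does the sync.
    static void commitWrite(File logFile, long offset, byte[] record) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(logFile, "rwd")) {
            raf.seek(offset);
            raf.write(record);
        }
    }
}
```

The point of the split is that preallocate() can run outside the commit path, while commitWrite() never changes the file length.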


I would worry that the suggested trick would not actually allocate all pages of the file (on all OSes/JVMs), and that subsequent sync writes at commit time would again see the multiple-I/O performance that we were trying to avoid.
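For reference, my understanding of the trick (quoted further down) is something like the following sketch; the class and method names are made up. It does produce a file of the requested length, but on file systems that support sparse files the skipped range may be left as a hole with no disk blocks behind it, which is exactly the case I am worried about.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical illustration of the "write only the last block" trick.
class SparseAllocationSketch {

    // Seek to where the last block would start and write just that block.
    // The file length becomes sizeBytes, but whether the skipped range
    // gets real disk blocks is up to the OS and file system.
    static void allocateByLastBlock(File logFile, long sizeBytes, int blockSize)
            throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(logFile, "rw")) {
            raf.seek(Math.max(0, sizeBytes - blockSize));
            raf.write(new byte[(int) Math.min(blockSize, sizeBytes)]);
        }
    }
}
```

A length check alone cannot tell the two cases apart, so this would need to be measured with the commit-time sync writes on each platform.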


Making allocation run fast is good, but only if the subsequent syncing writes done as part of commits are still optimized. Log file allocation can be made asynchronous to client application work, but syncing performance directly affects client performance.

It is unfortunate that the JVM documentation is so poor in this area; as you say, the behaviour in this situation is undocumented. From the documentation we can't even tell what "metadata" is being discussed when using the "rws" and "rwd" modes.


Jan Hlavatý wrote:
Suresh Thalamati wrote:
| . Preallocation of the log file by doing writes to a file opened in
| "rws" mode will be much slower than doing writes to file opened in
| "rw" mode.

I have a trick for this - instead of writing whole file,
write a single block on the end (last block) - that will create the big file faster, rws or not ;-)
Rest of file gets zeroed out on windows, dunno about other platforms (it's unspecified).


Jan
