Ryan:

The goodness of UFS direct I/O is highly application-specific. The benefits arise from two fundamental sources: avoidance of OS page cache scaling limitations, and removal of the POSIX single-writer lock constraint, which allows multiple I/O operations to a file to occur concurrently while a write is active. Of these factors, the latter usually has the broadest impact.
For Oracle online redo logs, the consensus is that UFS direct I/O is pretty much always a good thing. By default, Oracle's logging uses asynchronous writes (aio_write()) with data-synchronous completion criteria (because the logs are opened with O_DSYNC). Because these writes are synchronous, write coalescing at the filesystem level has no opportunity to help. Because the writes are asynchronous, they can benefit from improved throughput when the single-writer lock is removed.

(Yes - it's confusing. 'Synchronous' and 'asynchronous' are not plain-English opposites here, but rather different topics altogether. I/O that is not asynchronous is 'blocking' (eg: pwrite()), and I/O that is not synchronous is 'deferred' (ie: only flushed by fsync(), maxcontig fills, or moved along by fsflush).)

The only downside to using UFS direct I/O for online logs comes from the archiver losing the performance advantage of UFS filesystem pre-fetching when reading these files. However, since the archiver uses larger I/O sizes, I'm not aware that this has ever become anyone's constraining bottleneck. Therefore, the improved write throughput to logs with UFS direct I/O is pretty much always a good tradeoff.

There are also tradeoffs and limitations associated with the memory management overhead underlying the OS page cache, with and without filesystem buffering. At high throughput rates, these factors can absolutely be limiting, but most folks are far more impacted by the single-writer lock than by the cost of memory management, so I consider these impacts to be secondary.

Note that you would *never* want UFS direct I/O on log archive destinations, since the archiver does *not* use O_DSYNC on its output files, and expects to enjoy the performance benefit of deferred writes!
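To make the two axes described above concrete (blocking vs. asynchronous, synchronous vs. deferred), here is a rough sketch in Python rather than a definitive benchmark. It times small positioned writes to a pre-allocated file, with and without O_DSYNC, issued serially and from several threads. Several things here are assumptions for illustration only: the threads calling os.pwrite() are merely a stand-in for aio_write(), and the names WRITE_SIZE, NUM_WRITES, NUM_WORKERS, and fake_redo.log are hypothetical. It needs a Unix-like system, and it does not switch direct I/O on or off; for UFS that is done outside the program, typically with the forcedirectio mount option.

```python
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical parameters for illustration; not taken from any Oracle setup.
WRITE_SIZE = 8 * 1024   # size of each simulated log write
NUM_WRITES = 64         # writes issued per run
NUM_WORKERS = 4         # concurrent writers standing in for aio_write()

# O_DSYNC is not defined on every platform; fall back to O_SYNC, then to 0.
DSYNC_FLAG = getattr(os, "O_DSYNC", getattr(os, "O_SYNC", 0))


def preallocate(path, size):
    """Write real blocks up front so the timed writes do no space allocation."""
    with open(path, "wb") as f:
        f.write(b"\0" * size)
        f.flush()
        os.fsync(f.fileno())


def timed_writes(path, extra_flags, workers):
    """Issue NUM_WRITES positioned writes; return (seconds, bytes written)."""
    fd = os.open(path, os.O_WRONLY | extra_flags)
    buf = b"x" * WRITE_SIZE
    offsets = [i * WRITE_SIZE for i in range(NUM_WRITES)]
    try:
        start = time.perf_counter()
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # Each os.pwrite() blocks its own thread; with workers > 1,
            # several writes to the same file are in flight at once.
            written = sum(pool.map(lambda off: os.pwrite(fd, buf, off), offsets))
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return elapsed, written


def run_benchmark():
    """Return {label: (seconds, bytes)} for the four combinations."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "fake_redo.log")
        preallocate(path, WRITE_SIZE * NUM_WRITES)
        for sync_label, flags in (("deferred", 0), ("dsync", DSYNC_FLAG)):
            for conc_label, workers in (("serial", 1), ("concurrent", NUM_WORKERS)):
                label = f"{sync_label}/{conc_label}"
                results[label] = timed_writes(path, flags, workers)
    return results


if __name__ == "__main__":
    for label, (secs, nbytes) in run_benchmark().items():
        print(f"{label:20s} {nbytes:8d} bytes in {secs * 1000:8.2f} ms")
```

The timings from a run like this only mean something when the file lives on the filesystem and mount options you actually care about; a temporary directory is used here just to keep the sketch self-contained.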
The size threshold at which UFS direct I/O would be beneficial can depend on a great many factors - including UFS tunables; volume management factors; I/O multipathing factors; the actual APIs used by the application; whether or not space allocation is occurring; and backend configuration factors. For any given configuration, what's best is most reliably determined by I/O microbenchmarking. Formulating an appropriate microbenchmark requires an accurate understanding of the APIs and tuning factors used by your actual application.

For Oracle logging, a correct microbenchmark would use O_DSYNC on open() and aio_write() for writing - and the target files should be pre-allocated so that filesystem logging will not bias the results. Assuming a high transaction rate, Oracle itself will probably coalesce log writes to 8K operations, but for a single-stream workload of iterated single-row INSERT/COMMIT operations, log writes may be quite small. Unfortunately, in the area of I/O microbenchmarking, errors occur quite frequently due to inappropriate experiment design and incorrect interpretation of results - so be careful!

For each application and category of I/O, there are tradeoffs to consider in using UFS direct I/O. As a rule, high-end scaling requires use of some storage option with the essential characteristics of UFS direct I/O - and that would include RAW, QFS direct I/O with Q-writes, VxFS Quick I/O or VxFS ODM. All of these should be expected to perform 'similarly' - but the UFS option is free! When moving to one of these options from 'out-of-the-box' buffered I/O, it is typically necessary to do some Oracle tuning to make use of the system memory that is liberated when filesystem buffering is switched off. It is also typical that the impact of these options on backup and restore operations needs to be properly evaluated. The physics underlying these factors is all well-understood.
The problems come in making policy decisions around the tradeoffs associated with these factors. There is a load of misinformation available online. Beware any posting that says "you should always use UFS direct I/O". There is a complex set of tradeoffs here, including operational constraints and the logistics of changing from other options. The best guidance I can offer in a small space is to "make well-informed decisions regarding these factors".

To promote a better understanding of these factors, I wrote a paper a while ago called "Oracle I/O: Supply and Demand". That paper is due for an upgrade, and I hope to push it out this Fall - with the scope expanded to include RAC/Grid considerations and factors affecting 'direct path' and NOLOGGING write performance.

Shucks - this posting is getting way too close to *being* a whitepaper! ;-)

Hope this helps,
-- Bob Sneed

This message posted from opensolaris.org
_______________________________________________
perf-discuss mailing list
perf-discuss@opensolaris.org