I am looking at committing this. Knut Anders Hatlen (JIRA) wrote:
[ http://issues.apache.org/jira/browse/DERBY-733?page=all ]Knut Anders Hatlen updated DERBY-733: ------------------------------------- Attachment: DERBY-733.diff Mike, I agree that a pool of open file descriptors is a good idea, but you will run into the same problem with highly unstable response times when the number of threads accessing the same file exceeds the number of file descriptors. I think we should use ReentrantLock for Java 1.5 and higher, since the introduction of this class has allowed the implementers of JVMs to prioritize throughput over fairness in the handling of monitors. We should file a separate enhancement request for the file descriptor pool. I have attached a patch which invokes ReentrantLock.lock() and unlock() when reading in a page from disk. I did not build my own ReentrantLock replacement, as I said I would. Instead, I have used reflection to enable this feature if the JVM supports it. This seemed like an easier approach, and I also discovered that the handling of threads waiting for monitors had changed between 1.4 and 1.5 and that this issue was not so serious on 1.4. The maximum response time was drastically reduced in the disk-bound case. Derbyall ran successfully on both Sun JVM 1.4.2 and 1.5.0. I have also tested the performance, and I could not see any change in throughput or CPU usage. (The performance test was run with a very small page cache and with a database that was many times bigger than the page cache, but smaller than the file system cache. This way, Derby called readPage() very often, but it was CPU-bound since the requested page always was in the file system cache.) Could someone please review this patch? % svn stat M java/engine/org/apache/derby/impl/store/raw/data/RAFContainer.javaStarvation in RAFContainer.readPage() ------------------------------------- Key: DERBY-733 URL: http://issues.apache.org/jira/browse/DERBY-733 Project: Derby Type: Improvement Components: Performance, Store Versions: 10.1.2.1, 10.2.0.0, 10.1.3.0, 10.1.2.2 Environment: Solaris x86 and Linux with Sun JVM 1.5.0. Derby embedded and client/server. Reporter: Knut Anders Hatlen Assignee: Knut Anders Hatlen Attachments: DERBY-733.diff When Derby is completely disk bound, threads might be starved in RAFContainer.readPage(). This is a real problem when multiple clients are repeatedly accessing one or a small number of large tables. In cases like this, I have observed very high maximum response times (several minutes in the worst cases) on simple transactions. The average response time is not affected by this. The starvation is caused by a synchronized block in RAFContainer.readPage(): synchronized (this) { fileData.seek(pageOffset); fileData.readFully(pageData, 0, pageSize); } If many threads want to read pages from the same file, there will be a long queue of threads waiting for this monitor. Since the Java specification does not guarantee that threads waiting for monitors are treated fairly, some threads might have to wait for a long time before they get the monitor. (Usually, a couple of threads get full throughput while the others have to wait.)
