On Fri, Jun 26, 2009 at 09:31, Ulf Zibis <ulf.zi...@gmx.de> wrote:

> Martin, thanks for taking the time.
>
> On 26.06.2009 15:53, Martin Buchholz wrote:
>
> On Fri, Jun 26, 2009 at 01:37, Ulf Zibis <ulf.zi...@gmx.de> wrote:
>
>> 1. Hopefully some volunteer would be found to fix
>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6818737
>> before JDK7 API-freeze.
>> Especially, if jar is not compressed, as in case of normal JDK
>> installation, reading entries from jar should be much faster through
>> java.nio.channels, than via BufferedInputStream.
>
> The way to motivate us around here
> is to provide the prototype implementation that
> demonstrates the speedup.
>
> Sorry, I'm not the specialist on how to provide NIO buffers from native
> memory, and first, I will finish my work on charsets.
>
> Motivation:
> Xueming states:
> *"dat" based uses less disk space, but it has larger startup time, reading
> an additional "big" dat file during class loading/initializing actually
> takes much longer time than I expected (I hit the extreme when I worked on
> the EUC_TW, which I make the size only 30% of the existing one, but startup
> is a disaster regression, ...
> *
>
I'm surprised. I would expect startup to actually be faster.
I assume we're only reading the bytes that are necessary.

> If loading x bytes from dat file via getResourceAsStream() takes much
> longer time than loading x+30% bytes from class file, processing the UTF-8
> conversion, instantiating and initializing additional Class objects, I
> imperatively presume, that there must be a big chance for significantly
> improving read speed from uncompressed jar file (here charsets.jar), by
> using direct channels or however. I presume, enhancing reading from jar
> files would be a big fish in performance gain for the whole JDK, as it is
> a very common task in the JVM's daily work.
>
>>> While benchmarking, I discovered to my horror that the simple
>>>
>>> jar cf /tmp/t1 ...
>>> jar i /tmp/t1
>>>
>>> fails, because it tries to create the replacement jar in "." and then
>>> rename() it into place. Oops... Better refactor out the code that
>>> puts the replacement temp file in the same directory.
>>> Better write some tests for that, too.
>>>
>>
>> 2. I don't like to refactor out the code in case of only once used, and
>> only to better "comment" what the 2 lines are doing.
>> It blows up the code, and following the code demands annoying scrolling.
>> Better add additional comment to original code.
>>
>
> The original code created temp files in *two* places,
> and did it differently.
>
> Oops, at my first search on your code I only found *one* usage of
> createTempFileInSameDirectoryAs(). Did you add the 2nd later?
> But there is only one usage of directoryOf(). Shouldn't you inline this?
>

This is modern software engineering.
We are all encouraged to write many small methods.

> I think the name
> createTempFileInSameDirectoryAs
> makes the current code much clearer.
>
> Yes, this is pretty clear, but the cost is 19 lines against 2+2 plus
> demanding the reader for annoying scrolling.
> Thinking about directoryOf() I guess, following this strategy you would
> find tens of locations in Main.java where you could refactor out code into
> small well self-explaining methods, but wouldn't this end up in a mess of
> unreadable blown-up code?
>

Find suitable abstractions and refactor them into a separate piece of code.
The win is a lot bigger if you make the new abstractions
public supported parts of the API, but that is harder with the JDK.

> Also, JITs tend to be very good at inlining.
>
> (... after some loops), yes, I know
>
>> 3. What happens, if original file is exactly named "jartmp"
>> I think you would better add ".tmp" at the end of the filename, and remove
>> it later.
>> Does your new code work with? :
>> jar cf /jartmp/t1 ...
>> jar i /jartmp/t1
>>
>
> File.createTempFile doesn't literally create a file named jartmp.
> That's only the prefix. And it promises to return
> a freshly created empty file.
>
> Now I understand deeper. I just wondered why in fact just renaming "tmp" to
> "jartmp" would resolve your bug. I didn't recognize the 2nd location, where
> wrong "." was used for dir name.
>

The renaming jar -> jartmp is not significant.

Martin

>
> -Ulf
>
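
For readers following along, here is a minimal sketch of the two helpers named in this thread. It assumes only what is said above: the temp file is created next to the target rather than in ".", "jartmp" is just the prefix handed to File.createTempFile, and the result is a freshly created empty file. The method names come from the thread; the class name TempFiles and the method bodies are illustrative assumptions, not the code under review.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical holder class; the name is an assumption for this sketch.
final class TempFiles {
    private TempFiles() {}

    // Directory containing the given file, or "." when the path
    // has no parent component (e.g. a bare file name like "t1").
    static File directoryOf(File file) {
        File parent = file.getParentFile();
        return (parent != null) ? parent : new File(".");
    }

    // Creates a new, empty temp file in the same directory as the target.
    // "jartmp" is only a name prefix; createTempFile appends a unique part
    // and, because the suffix is null, the default ".tmp" suffix.
    static File createTempFileInSameDirectoryAs(File file) throws IOException {
        return File.createTempFile("jartmp", null, directoryOf(file));
    }
}
```

With this shape, a target such as /tmp/t1 gets its replacement created in /tmp, so the later rename() no longer starts from ".". A target that happens to live in a directory called /jartmp behaves the same way, since the prefix never has to match or avoid any existing name.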