May I add to the "Date thread", that in my own "ancient proprietary" logging system (back in 97, I think), one of the best speed improvements were a small detail that is easy to forget.
Reviewing the log, it is important to know the time rather well, and that the order of the events are correct. But having a accurately formatted date field for each event may not be necessary. I changed my logger so that the Date field was only formatted every 5 second, and had an additional field for millisecond since last format. Major improvement in those days. Niclas On Friday 04 October 2002 02:05, Ceki Gülcü wrote: > Hi Chris, > > Object reuse and optimizing memory usage was one of the themes I was > seriously considering for future log4j releases. > > Your results are somewhat surprising. I knew memory usage could be > improved but hadn't realized the extent... > > More comments below. > > At 10:42 03.10.2002 -0500, [EMAIL PROTECTED] wrote: > >I am currently on a project using log4j for the first time. We're using > >1.1.3. I had many concerns with the amount of transient objects created > >during logging to a DailyRollingFileAppender. I refactored log4j making > >memory usage improvements that I think could be very useful to log4j > > users. > > > >One recent test showed 350MB of transient memory created by log4j to log > > 33 MB of data consisting of 33500 log entries; the number of transient > > objects numbered in the millions and was uncountable by my tools. The > > layout pattern used was: > > Millions? Wow! > > >log4j.appender.R.layout.ConversionPattern=%d{MMM dd yyyy HH:mm:ss,SSS} %t > > % -5p %c - %m%n > > > >Without date formatting the total was ~275 MB. > > What happens with just %d (with nothing within the braces)? With %r? > With no time information? > > >I appreciate the functionality offered by this framework but I find it's > >disregard for memory consumption disturbing. I refactored it and I can > > now run the same test generating only 4 MB of transient data and 105000 > > transient objects for the same 33 MB data set. The remaining 4 MB and > > objects are JDK deficiencies and can't be overcome easily: > >1) The JDK's Thread class stores it's name as a char[]. Every time > >Thread.getName() is called that char[] is converted to a String resulting > >in 2 new objects, the String and it's associated char[]. I am going to > > see if I can get a message through to the Sun folks to look at this. Of > > course, this framework could store the name once in a local variable on > > first lookup, but this assumes the Thread name will never change > > We could circumvent calling Thread.getName() by caching the name of > the thread. In general however, I deem it bad practice to go try to be > smarter than the language. > > >2) The sun.io.CharToByteSingleByte.convert method leaks a byte[1] every > >time invoked. > > You must be kidding. :-) > > >This means for N messages logged, N * 3 objects are created from the JDK > >when outputting Thread name and using a FileAppender. I submit any other > >transient object creation currently created by log4j is unwarranted and a > >potential performance issue. > > Agreed. In log4j defense, I should say that log4j aims to be reliable, > fast and extensible, in that order of priority. It is easier to be > reliable with simpler but albeit less optimized code. > > >3) What if you knew logging a message would cost 1 KB of data for every > >message sent no matter the size? Well if you output the date using > > default log4j functionality it does. java.text.DateFormat.format > > generates 30 transient objects using ~1 KB for every call. I wouldn't > > recommend to use DateFormat to stream dates, it is not built into it's > > functionality to do it efficiently. It is possible to output the same > > information without using any transient objects. From my tests, a large > > % of the objects were created from date formatting. > > Do you mean SimpleDateFormat or just DateFormat. SimpleDateFormat is > very inefficient. However, ISO8601DateFormat, DateTimeDateFormat and > AbsoluteTimeDateFormat that ship with log4j and which all derive from > DateFormat are supposed to be much more efficient. So I reiterate my > question. Do you mean SimpleDateFormat or DateFormat? > > >LOG 4J Architecture Problems WRT Memory: > >The problem with Log4J is that it uses a StringBuffer to format messages. > >This means that the resulting formatted message is a transient object, it > >is ready for GC as soon as the entry is logged. This may highly degrade > >system performance. Not only does the message waste space, it creates > > many objects in the GC graph and makes GC run longer - wasted CPU cycles. > > Sure, although recent JVMs are supposed to be much more efficient with > respect to garbage collection. > > >One advancement I noticed in Log4J 1.2.6 was the use of a BufferedWriter > >around FileWriters in FileAppenders. This step saves the final formatted > >StringBuffer response from being copied into transient memory, a char[] > > the size of the StringBuffer, one last time in FileWriter.write(String). > > Instead it is copied into the BufferedWriter's reusable char[] which will > > call FileWriter.write(char[]). This stops creating the message 3 times > > in memory, it is now only 2. Can it be 1? It is 2 times in memory > > because the current architecture requires the copying into the > > StringBuffer. How about getting rid of the StringBuffer and directly > > appending to the BufferedWriter? Instead of formatting to memory that is > > ready for GC as soon as logging is over, format it in reusable memory in > > the > >BufferedWriter. It is cleaner, more efficient and is a first step towards > >0 transient object generation and 0 excess memory usage during logging. > > It would be a major step towards 0 transient object generation, 0 > excess memory usage during logging and more efficient, that much is > certain. > > IMHO, the current approach is clean and very generic but optimal it is > not. > > >To achieve this goal I added these methods to the following classes: > >public abstract class Layout implements OptionHandler { > > abstract public void stream(LoggingEvent event, Writer out) throws > >IOException; > >} > > > >public abstract class PatternConverter { > > public final void format(Writer out, LoggingEvent e) throws > >IOException { > > convert(e, out); > > } > > > > abstract protected void convert(LoggingEvent event, Writer out) > >throws IOException; > > // Convert and send directly to writer > > > > protected final void formatStringToWriter(String s, Writer out) > >throws IOException {} > > // Since most converters will simply output a String, this is > >added as a utility > > // and will format the String correctly to the writer, > >} > > Did you remove Layout.format(LoggingEvent) returning a String? > > >An appender would call the Layout.stream(LoggingEvent, Writer) method > >instead of format. If an appender was interested in a string of the > >layout, a StringWriter could be passed in. Since the synchronization of > >the logging event is at the AppenderSkeleton.doAppend method and before > > the formatting of the response, writing to the BufferedWriter instead of > > a StringBuffer is thread safe. It is the same effect except the char[] > > in BufferedWriter is reusable and requires one less copy. > > So you are suggesting that Layout.format(LoggingEvent) no longer be > used because it can be replaced by Layout.stream(LoggingEvent, > Writer)? > > Appenders such as SMTPAppender that handle the String returned by > Layout.format(LoggingEvent) can use Layout.stream(LoggingEvent, > Writer) where the writer is a StringWriter. Is that what you saying? > > >Objects that represent XML can be mapped to an ObjectFormatter and be > >streamed to disk instead of first having an extra copy in memory. This > > has been a huge savings on my project. If the XML is 170 KB on disk it > > costs NO transient JVM heap memory to stream the Object as XML to disk; > > there is never a complete in memory representation of the XML as a > > String, this is not possible with the default log4j implementation. Of > > course the XML Binding Framework I use is 0 transient object usage. With > > log4j 1.1.3 it costs at least 2 in memory representations and with 1.2.6 > > at least 1. > > I don't understand. What do you mean by Objects that represent XML? > > >Since formatting dates was a critical concern, I created a > > PatternConverter that creates 1 Date and 1 Calendar. These objects are > > used in conjunction with a set of numbers pre-converted to Strings to > > format the date with less CPU cycles and no transient memory compared to > > the default implementation. Since the set of numbers for date > > formatting(hours, min, secs, ms, date, month) is finite I pre-created a > > set of the numbers. I simply created a static array of these numbers, it > > is a simple lookup: number[10] = "10". > > Good one. You did this to the DateFormatter that ship with log4j? > > >Theoretically, to format the date with almost the same precision and > >flexibility as DateFormatter with no excess memory one could just use > > log4j parameter markers to format the date provided the correct > > Converters are created. I propose the Date object become a member of > > LoggingEvent where it can be initialized once per logging event and date > > parameters pulled from a Converter. > > That is very much thought provoking. I'm not 100% sure the approach is > problem free but it sure is very original. > > >One of the last pieces of memory to clean up was the LoggingEvents > >themselves. There is no reason why these objects can't be pooled. I > > added this functionality to an extended Category and LoggingEvent and > > removed the 1 time use of these objects. To accomplish this it was > > neccesary to make some changes to LoggingEvent to re-initialize itself. > > I view the logging event as a thread resource, but looking up in a pool > > is just as easy. > > Yes, although the gains are not necessary that significant because > there is just one LoggingEvent created for each, well..., logging > event. If as you say 33'000 logs generate millions of objects than > 33'000 less will not make that much of a difference. Would you agree? > > >I view these changes as a positive step to log4j. I don't know how the > >process works of proposing/making changes to the project. If I can be > >helpful in advancing these ideas in any way please let me know. > > A very positive step indeed! You are doing great. Posting on log4j-dev > was the right thing to do. > > >Thanks, > >Chris Nokes > >Perficient, Inc > > No, thank you. -- To unsubscribe, e-mail: <mailto:[EMAIL PROTECTED]> For additional commands, e-mail: <mailto:[EMAIL PROTECTED]>