May I add to the "Date thread", that in my own "ancient proprietary" logging 
system (back in 97, I think), one of the best speed improvements were a small 
detail that is easy to forget.

Reviewing the log, it is important to know the time rather well, and that the 
order of the events are correct. But having a accurately formatted date field 
for each event may not be necessary.
I changed my logger so that the Date field was only formatted every 5 second, 
and had an additional field for millisecond since last format.

Major improvement in those days.

Niclas

On Friday 04 October 2002 02:05, Ceki Gülcü wrote:
> Hi Chris,
>
> Object reuse and optimizing memory usage was one of the themes I was
> seriously considering for future log4j releases.
>
> Your results are somewhat surprising. I knew memory usage could be
> improved but hadn't realized the extent...
>
> More comments below.
>
> At 10:42 03.10.2002 -0500, [EMAIL PROTECTED] wrote:
> >I am currently on a project using log4j for the first time.  We're using
> >1.1.3.  I had many concerns with the amount of transient objects created
> >during logging to a DailyRollingFileAppender.  I refactored log4j making
> >memory usage improvements that I think could be very useful to log4j
> > users.
> >
> >One recent test showed 350MB of transient memory created by log4j to log
> > 33 MB of data consisting of 33500 log entries; the number of transient
> > objects numbered in the millions and was uncountable by my tools.  The
> > layout pattern used was:
>
> Millions? Wow!
>
> >log4j.appender.R.layout.ConversionPattern=%d{MMM dd yyyy HH:mm:ss,SSS} %t
> > % -5p %c - %m%n
> >
> >Without date formatting the total was ~275 MB.
>
> What happens with just %d (with nothing within the braces)? With %r?
> With no time information?
>
> >I appreciate the functionality offered by this framework but I find it's
> >disregard for memory consumption disturbing.  I refactored it and I can
> > now run the same test generating only 4 MB of transient data and 105000
> > transient objects for the same 33 MB data set.  The remaining 4 MB and
> > objects are JDK deficiencies and can't be overcome easily:
> >1)  The JDK's Thread class stores it's name as a char[].  Every time
> >Thread.getName() is called that char[] is converted to a String resulting
> >in 2 new objects, the String and it's associated char[].  I am going to
> > see if I can get a message through to the Sun folks to look at this.  Of
> > course, this framework could store the name once in a local variable on
> > first lookup, but this assumes the Thread name will never change
>
> We could circumvent calling Thread.getName() by caching the name of
> the thread. In general however, I deem it bad practice to go try to be
> smarter than the language.
>
> >2)  The sun.io.CharToByteSingleByte.convert method leaks a byte[1] every
> >time invoked.
>
> You must be kidding. :-)
>
> >This means for N messages logged, N * 3 objects are created from the JDK
> >when outputting Thread name and using a FileAppender.  I submit any other
> >transient object creation currently created by log4j is unwarranted and a
> >potential performance issue.
>
> Agreed. In log4j defense, I should say that log4j aims to be reliable,
> fast and extensible, in that order of priority. It is easier to be
> reliable with simpler but albeit less optimized code.
>
> >3) What if you knew logging a message would cost 1 KB of data for every
> >message sent no matter the size?  Well if you output the date using
> > default log4j functionality it does.  java.text.DateFormat.format
> > generates 30 transient objects using ~1 KB for every call.  I wouldn't
> > recommend to use DateFormat to stream dates, it is not built into it's
> > functionality to do it efficiently.  It is possible to output the same
> > information without using any transient objects.  From my tests, a large
> > % of the objects were created from date formatting.
>
> Do you mean SimpleDateFormat or just DateFormat. SimpleDateFormat is
> very inefficient. However, ISO8601DateFormat, DateTimeDateFormat and
> AbsoluteTimeDateFormat that ship with log4j and which all derive from
> DateFormat are supposed to be much more efficient. So I reiterate my
> question. Do you mean SimpleDateFormat or DateFormat?
>
> >LOG 4J Architecture Problems WRT Memory:
> >The problem with Log4J is that it uses a StringBuffer to format messages.
> >This means that the resulting formatted message is a transient object, it
> >is ready for GC as soon as the entry is logged.  This may highly degrade
> >system performance.  Not only does the message waste space, it creates
> > many objects in the GC graph and makes GC run longer - wasted CPU cycles.
>
> Sure, although recent JVMs are supposed to be much more efficient with
> respect to garbage collection.
>
> >One advancement I noticed in Log4J 1.2.6 was the use of a BufferedWriter
> >around FileWriters in FileAppenders.  This step saves the final formatted
> >StringBuffer response from being copied into transient memory, a char[]
> > the size of the StringBuffer, one last time in FileWriter.write(String).
> > Instead it is copied into the BufferedWriter's reusable char[] which will
> > call FileWriter.write(char[]).  This stops creating the message 3 times
> > in memory, it is now only 2.  Can it be 1?  It is 2 times in memory
> > because the current architecture requires the copying into the
> > StringBuffer.  How about getting rid of the StringBuffer and directly
> > appending to the BufferedWriter?  Instead of formatting to memory that is
> > ready for GC as soon as logging is over, format it in reusable memory in
> > the
> >BufferedWriter.  It is cleaner, more efficient and is a first step towards
> >0 transient object generation and 0 excess memory usage during logging.
>
> It would be a major step towards 0 transient object generation, 0
> excess memory usage during logging and more efficient, that much is
> certain.
>
> IMHO, the current approach is clean and very generic but optimal it is
> not.
>
> >To achieve this goal I added these methods to the following classes:
> >public abstract class Layout implements OptionHandler {
> >       abstract public void stream(LoggingEvent event, Writer out) throws
> >IOException;
> >}
> >
> >public abstract class PatternConverter {
> >       public final void format(Writer out, LoggingEvent e) throws
> >IOException {
> >             convert(e, out);
> >       }
> >
> >       abstract protected void convert(LoggingEvent event, Writer out)
> >throws IOException;
> >             // Convert and send directly to writer
> >
> >       protected final void formatStringToWriter(String s, Writer out)
> >throws IOException {}
> >             // Since most converters will simply output a String, this is
> >added as a utility
> >             // and will format the String correctly to the writer,
> >}
>
> Did you remove Layout.format(LoggingEvent) returning a String?
>
> >An appender would call the Layout.stream(LoggingEvent, Writer) method
> >instead of format.  If an appender was interested in a string of the
> >layout, a StringWriter could be passed in.  Since the synchronization of
> >the logging event is at the AppenderSkeleton.doAppend method and before
> > the formatting of the response, writing to the BufferedWriter instead of
> > a StringBuffer is thread safe.  It is the same effect except the char[]
> > in BufferedWriter is reusable and requires one less copy.
>
> So you are suggesting that Layout.format(LoggingEvent) no longer be
> used because it can be replaced by Layout.stream(LoggingEvent,
> Writer)?
>
> Appenders such as SMTPAppender that handle the String returned by
> Layout.format(LoggingEvent) can use Layout.stream(LoggingEvent,
> Writer) where the writer is a StringWriter. Is that what you saying?
>
> >Objects that represent XML can be mapped to an ObjectFormatter and be
> >streamed to disk instead of first having an extra copy in memory.  This
> > has been a huge savings on my project.  If the XML is 170 KB on disk it
> > costs NO transient JVM heap memory to stream the Object as XML to disk;
> > there is never a complete in memory representation of the XML as a
> > String, this is not possible with the default log4j implementation.  Of
> > course the XML Binding Framework I use is 0 transient object usage.  With
> > log4j 1.1.3 it costs at least 2 in memory representations and with 1.2.6
> > at least 1.
>
> I don't understand. What do you mean by Objects that represent XML?
>
> >Since formatting dates was a critical concern, I created a
> > PatternConverter that creates 1 Date and 1 Calendar.  These objects are
> > used in conjunction with a set of numbers pre-converted to Strings to
> > format the date with less CPU cycles and no transient memory compared to
> > the default implementation. Since the set of numbers for date
> > formatting(hours, min, secs, ms, date, month) is finite I pre-created a
> > set of the numbers.  I simply created a static array of these numbers, it
> > is a simple lookup: number[10] = "10".
>
> Good one. You did this to the DateFormatter that ship with log4j?
>
> >Theoretically, to format the date with almost the same precision and
> >flexibility as DateFormatter with no excess memory one could just use
> > log4j parameter markers to format the date provided the correct
> > Converters are created.  I propose the Date object become a member of
> > LoggingEvent where it can be initialized once per logging event and date
> > parameters pulled from a Converter.
>
> That is very much thought provoking. I'm not 100% sure the approach is
> problem free but it sure is very original.
>
> >One of the last pieces of memory to clean up was the LoggingEvents
> >themselves.  There is no reason why these objects can't be pooled.  I
> > added this functionality to an extended Category and LoggingEvent and
> > removed the 1 time use of these objects.  To accomplish this it was
> > neccesary to make some changes to LoggingEvent to re-initialize itself. 
> > I view the logging event as a thread resource, but looking up in a pool
> > is just as easy.
>
> Yes, although the gains are not necessary that significant because
> there is just one LoggingEvent created for each, well..., logging
> event. If as you say 33'000 logs generate millions of objects than
> 33'000 less will not make that much of a difference. Would you agree?
>
> >I view these changes as a positive step to log4j.  I don't know how the
> >process works of proposing/making changes to the project.  If I can be
> >helpful in advancing these ideas in any way please let me know.
>
> A very positive step indeed! You are doing great. Posting on log4j-dev
> was the right thing to do.
>
> >Thanks,
> >Chris Nokes
> >Perficient, Inc
>
> No, thank you.


--
To unsubscribe, e-mail:   <mailto:[EMAIL PROTECTED]>
For additional commands, e-mail: <mailto:[EMAIL PROTECTED]>

Reply via email to