Hi,
First of all - this was the most thought-provoking message I've seen on a
dev list in a long time, so thank you ;)

>Your results are somewhat surprising. I knew memory usage could be
>improved but hadn't realized the extent...

Same here.  But to preface my comments: I value speed and reliability far,
far more than memory usage.  Memory is cheap in the context I work in, so I
might be a bit biased...

>>One recent test showed 350MB of transient memory created by log4j to
>>log 33 MB of data consisting of 33500 log entries; the number of
>>transient objects numbered in the millions and was uncountable by my
>>tools.  The layout

Wow.  Huge numbers.  How did you measure the transient memory, given that
GC was running?  What JVM was this, and what GC parameters were you using?

>>log4j.appender.R.layout.ConversionPattern=%d{MMM dd yyyy HH:mm:ss,SSS} %t

Hmm.  I wonder if %d by itself would be better.
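
Roughly what I mean, though someone should double-check my reading of
PatternLayout: my understanding is that a custom date pattern like the
one above goes through SimpleDateFormat for every event, while plain %d
defaults to ISO8601 and uses log4j's own (cheaper) date formatter.

# custom date format - SimpleDateFormat under the hood, I believe
log4j.appender.R.layout.ConversionPattern=%d{MMM dd yyyy HH:mm:ss,SSS} %t

# plain %d defaults to ISO8601 and skips SimpleDateFormat
log4j.appender.R.layout.ConversionPattern=%d %t

(I've kept only the %d and %t conversions here, since the rest of the
original pattern was cut off in the quoting.)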

>>disregard for memory consumption disturbing.  I refactored it and I
>>can now run the same test generating only 4 MB of transient data and
>>105000 transient objects for the same 33 MB data set.  The remaining
>>4 MB and

Cool!  And thank you for posting the information.  The question is,
what's sacrificed to gain this memory improvement?  Did you benchmark
speed and/or CPU consumption before and after refactoring?

>>This means for N messages logged, N * 3 objects are created from the
>>JDK when outputting Thread name and using a FileAppender.  I submit
>>any other transient object creation currently created by log4j is
>>unwarranted and a potential performance issue.

Potential performance issue, maybe.  But I'd rather have log4j create a
few transient objects (a well-bounded number, not millions) and work very
fast, relying on the garbage collector to collect them, than have a design
that's tweaked for memory but very hard to understand and/or very complex.

>Agreed. In log4j defense, I should say that log4j aims to be reliable,
>fast and extensible, in that order of priority. It is easier to be
>reliable with simpler but albeit less optimized code.

Ditto.

>>3) What if you knew logging a message would cost 1 KB of data for
>>every message sent no matter the size?  Well if you output the date
>>using

1KB of data for how long?  If it's there for a few milliseconds before
the garbage collector takes it away, I don't care at all.

>>is ready for GC as soon as the entry is logged.  This may highly
>>degrade system performance.  Not only does the message waste space, it
>>creates many objects in the GC graph and makes GC run longer - wasted
>>CPU cycles.

This *may* degrade system performance.  It depends on the GC
implementation and tuning.  For example, we have several large JVMs
(each one long-running, with heaps in the GB range, multiple log files,
multiple appenders of multiple types, and tens if not hundreds of
thousands of log messages in the files), and yet we've found log4j to
use almost no appreciable memory compared to our apps.  Certainly not
hundreds of MBs.  But we do a lot of tuning of the JVM runtime args for
garbage collection performance.
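
For what it's worth, the kind of thing I mean (the numbers and the class
name are purely illustrative, not what we actually run - what works
depends entirely on the app and the JVM vendor/version):

java -server -Xms1024m -Xmx2048m -XX:NewRatio=3 \
     -verbose:gc -XX:+PrintGCDetails \
     com.example.OurApp

The point being that with generational GC sized sensibly, short-lived
logging garbage tends to die cheaply in the young generation.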

>>BufferedWriter.  It is cleaner, more efficient and is a first step
>>towards 0 transient object generation and 0 excess memory usage during
>>logging.

Very good suggestion.
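
For anyone following along, the shape of the change as I understand it
is roughly the sketch below (the class and method names are mine, not
log4j's - it just shows a FileWriter wrapped in a BufferedWriter):

import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.io.Writer;

// Sketch only: buffer writes in memory so each logging call appends to
// the BufferedWriter's char buffer instead of pushing a small write
// straight down to the file.
public class BufferedFileSink {
    private final Writer out;

    public BufferedFileSink(String fileName, int bufferSize) throws IOException {
        this.out = new BufferedWriter(new FileWriter(fileName, true), bufferSize);
    }

    public void write(String formattedEvent) throws IOException {
        out.write(formattedEvent);
    }

    public void flush() throws IOException {
        out.flush(); // e.g. when an immediate-flush option is set, or on close
    }
}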

>IMHO, the current approach is clean and very generic but optimal it is
>not.

That depends on the definition of "optimal" ;)  Which is an endless
argument, I know...  

But in my view:
- Disk space is dirt cheap
- Memory is cheap
- CPU cycles aren't very expensive
- A clean, easy to understand, easy to debug, easy to extend, generic
design is PRICELESS. 

I've seen the above over and over again.  We have to be careful to
strike the right balance between optimizing too much and optimizing too
little.

>>Objects that represent XML can be mapped to an ObjectFormatter and be
>>streamed to disk instead of first having an extra copy in memory.
>>This

Cool idea.  How do you know what objects represent XML?
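
If I understand the proposal, the contract would be something like the
sketch below (this interface is hypothetical - I'm writing it out mostly
to make sure I'm asking the right question):

import java.io.IOException;
import java.io.Writer;

// Hypothetical sketch of the ObjectFormatter idea as I read it: the
// message object is streamed straight to the appender's Writer instead
// of being rendered to an intermediate String first.
public interface ObjectFormatter {
    void format(Object message, Writer out) throws IOException;
}

But that still leaves the question above: how does the framework decide
which formatter a given message object maps to?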

>>created.  I propose the Date object become a member of LoggingEvent
>>where it can be initialized once per logging event and date parameters
>>pulled from a Converter.

What an interesting idea!  I'll have to think about that one.
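
Just to make sure I follow, something along these lines?  (Sketch only -
this is not log4j's LoggingEvent, just the fragment of it as I picture
the proposal.)

import java.util.Date;

// Sketch: the event captures its timestamp once and lazily creates a
// single Date that every date converter can reuse, instead of each
// converter allocating a fresh Date per formatted output.
public class LoggingEventSketch {
    private final long timeStamp = System.currentTimeMillis();
    private Date date; // created at most once, then shared

    public Date getDate() {
        if (date == null) {
            date = new Date(timeStamp);
        }
        return date;
    }
}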

>>One of the last pieces of memory to clean up was the LoggingEvents
>>themselves.  There is no reason why these objects can't be pooled.  I

Pooled how?  Isn't each logging event conceptually unique?
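
I can imagine something like the sketch below (entirely made up), but it
is exactly where my question comes from: when is it safe to put an event
back in the pool, given that an async appender may still be holding a
reference to it?

import java.util.ArrayList;
import java.util.List;

// Made-up sketch of what "pooling LoggingEvents" might mean: reuse
// event objects instead of allocating one per log call.
public class EventPool {

    // Stand-in for a reusable LoggingEvent; not log4j's actual class.
    public static class ReusableEvent {
        private String message;
        private long timeStamp;

        public void set(String message) {
            this.message = message;
            this.timeStamp = System.currentTimeMillis();
        }

        public void reset() {
            this.message = null;
            this.timeStamp = 0L;
        }
    }

    private final List pool = new ArrayList();

    public synchronized ReusableEvent borrow() {
        int last = pool.size() - 1;
        return last >= 0 ? (ReusableEvent) pool.remove(last) : new ReusableEvent();
    }

    public synchronized void release(ReusableEvent event) {
        event.reset(); // must only happen once no appender still holds it
        pool.add(event);
    }
}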

>I view these changes as a positive step to log4j.  I don't know how the

Possibly a HUGE positive step ;)

Thanks for posting everything!

Yoav Shapira
Millennium ChemInformatics
