Re: [jibx-users] (Un)MarshallingContext bad performance

Francois Valdy Wed, 29 Aug 2007 06:28:52 -0700

Hi Dennis,

Although I may be wrong on the subject (my mental model of memory comes
more from C/C++ than from Java), I thought object content was
guaranteed to be visible for final or volatile fields without the need
for synchronization.


I think that's more than enough for a Class object, but once again,
feel free to correct me.
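
For reference, the final/volatile guarantee applies to the field that
holds the array, not to later writes into the array's elements. Below is
a minimal sketch of a cache that keeps the lock-free lookup while giving
each slot volatile semantics. It assumes a hypothetical ClassCache class
and index scheme (neither is part of JIBX) and is an illustration, not
the patch discussed in this thread:

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

/**
 * Hypothetical lock-free cache of Class objects, indexed the way a
 * binding factory might index its mapped classes. Not part of JIBX.
 */
final class ClassCache {
    // Each slot is read and written with volatile semantics, so a Class
    // reference stored by one thread is visible to the others.
    private final AtomicReferenceArray<Class<?>> slots;

    ClassCache(int size) {
        slots = new AtomicReferenceArray<>(size);
    }

    /** Returns the cached class for the index, loading it on a miss. */
    Class<?> get(int index, String className, ClassLoader loader)
            throws ClassNotFoundException {
        Class<?> cached = slots.get(index);
        if (cached == null) {
            // Two threads may both miss and both load; the same loader
            // returns the same Class object, so the duplicate store is
            // harmless whichever one lands last.
            cached = loader.loadClass(className);
            slots.set(index, cached);
        }
        return cached;
    }
}
```

No lock is taken on either path; the only cost over a plain array is a
volatile read per lookup.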

On a separate note, here is the list of side-optimizations I also did
(only by overriding JIBX objects, not directly in JIBX code this
time):

 - DualXMLWriter to write a JDOM object and the corresponding String
in one go (not sure how it behaves with namespaces however because of
openNamespaces(...)).

 - JibxJDOMWriter:
  o Skips most of XML character validation (takes a hell of a time,
and not needed for me)
  o getNamespace() should check for empty prefix/uri and return
Namespace.NO_NAMESPACE (jdom version being .... damn slow)

 - JibxStringReader, non-synchronized version of StringReader (also
never closed, to allow multiple consecutive unmarshals from the same
StringReader, very useful for custom collection unmarshalling)

 - JibxStringWriter, non sync (StringBuilder instead of StringBuffer),
also never closed

 - JibxUTF8Escaper, escapes \n in attribute values ;-)
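
As an illustration of the two unsynchronized string classes in the list
above, here is a minimal sketch. The names UnsyncStringWriter and
UnsyncStringReader are hypothetical stand-ins; the actual
JibxStringWriter/JibxStringReader code is not shown in this thread and
may differ:

```java
import java.io.Reader;
import java.io.Writer;

/** Hypothetical unsynchronized writer backed by a StringBuilder. */
final class UnsyncStringWriter extends Writer {
    private final StringBuilder buffer = new StringBuilder();

    @Override
    public void write(char[] chars, int offset, int length) {
        buffer.append(chars, offset, length);
    }

    @Override
    public void write(String text) {
        buffer.append(text);
    }

    @Override
    public void write(int c) {
        buffer.append((char) c);
    }

    @Override
    public void flush() { }

    @Override
    public void close() { }  // deliberately a no-op: the writer stays usable

    @Override
    public String toString() {
        return buffer.toString();
    }
}

/** Hypothetical unsynchronized reader over a String; close() does nothing. */
final class UnsyncStringReader extends Reader {
    private final String source;
    private int position;

    UnsyncStringReader(String source) {
        this.source = source;
    }

    @Override
    public int read(char[] target, int offset, int length) {
        if (length == 0) return 0;
        int remaining = source.length() - position;
        if (remaining <= 0) return -1;  // end of input
        int count = Math.min(length, remaining);
        source.getChars(position, position + count, target, offset);
        position += count;
        return count;
    }

    @Override
    public void close() { }  // no-op, so a later read continues where this one stopped
}
```

Because close() is a no-op in both classes, code that closes the stream
after one document does not prevent the next read or write on the same
instance. Neither class is safe to share between threads.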

Regards,
Francois.

On 8/29/07, Dennis Sosnoski <[EMAIL PROTECTED]> wrote:
> Likewise...
>
>  - Dennis
>
>
> Francois Valdy wrote:
> > Hi Dennis,
> >
> > Same here, feedback inlined ;-)
> > Thanks for your answer.
> >
> > Francois.
> >
> > On 8/29/07, Dennis Sosnoski <[EMAIL PROTECTED]> wrote:
> >
> >> Hi Francois,
> >>
> >> Thanks for these interesting observations! Detailed responses inline.
> >>
> >>  - Dennis
> >>
> >> Dennis M. Sosnoski
> >> SOA and Web Services in Java
> >> Training and Consulting
> >> http://www.sosnoski.com - http://www.sosnoski.co.nz
> >> Seattle, WA +1-425-939-0576 - Wellington, NZ +64-4-298-6117
> >>
> >>
> >>
> >> Francois Valdy wrote:
> >>
> >>> Hi,
> >>>
> >>> Performance of MarshallingContext and its unmarshalling friend is
> >>> really poor compared to the effort put into the rest of JIBX.
> >>> It's not noticeable for large objects, but for small ones, between 50%
> >>> and 75% of the marshalling/unmarshalling time is taken by those classes.
> >>>
> >>>
> >> I think it's better to optimize for large objects/documents rather than
> >> small ones, but agree that ideally we'll have great performance for both.
> >>
> >>
> >
> > I do want both: I have small (20 chars) and large (1 MB) XML objects in
> > the same application, so rest assured I checked the tradeoffs of my
> > suggestions
> > (except for namespaces, which I do not use).
> >
>
> One of the other things I want to do for 2.0 is to provide optimized
> methods for no-namespace access. It probably won't give a major boost to
> performance, but probably another 5-10%.
>
> >
> >>> Marshalling:
> >>> loadClass result should be cached in an array inside the factory
> >>> (shared cache between MarshallingContext).
> >>>
> >>>
> >> This would involve using synchronization, which can be a real issue in
> >> multithreading systems (especially multiprocessor ones). Still, there'd
> >> also have to be synchronization at some level within the classloader
> >> checking loaded classes. It'd be interesting to check the actual
> >> performance tradeoffs on this.
> >>
> >>
> >
> > This Class object array doesn't need any synchronization: if the Class
> > is at the index, use it; if it's not, look it up. Two threads
> > simultaneously setting the value in this cache is not an issue either.
> > So no added sync here (and you're right that sync already happens at
> > the classloader level; I'm actually avoiding the classloader sync with
> > this array cache).
> >
>
> I agree that doing what you're suggesting would be safe under most
> situations, but unfortunately I think it violates the Java memory model
> rules. The problem is that without synchronization there's no guarantee
> that one thread's "view" of the class information referenced by the array
> would be accurate, if it had been set by another thread.
>
> > ...
> >> There is a potential way around the exception, I think, which is to use
> >> getResource() first to try to find the class file, and only load the
> >> class when the class file is found. I'm not sure that this would really
> >> provide any benefit, though.
> >>
> >>
> >>
> >>> I've updated the binding generation to add this array of nulls to the
> >>> factory, passed to the MarshallingContext constructor (null is
> >>> supported for backward compatibility).
> >>> Class object is cached in factory only if loaded from the factory 
> >>> classloader.
> >>> Result being a 50% performance increase for small objects.
> >>>
> >>>
> >> Are you using synchronization for access? And if so, have you tried it
> >> with multithreading/multiprocessors?
> >>
> >>
> >
> > My application is multithreaded, and I do use it in multi-cpu environments.
> > Like I said above, no sync is required for access on an array. In
> > worst case I'll end up with 2 threads finding null, loading the same
> > class, setting the same index in the array, no big deal no matter
> > which one goes first.
> >
>
> Again, I agree that for reasonable configurations this will work - but
> it does violate the Java memory model, and we couldn't count on it
> working across all JVMs and hardware platforms.
>
> >
> >>> Unmarshalling (improvement from marshalling above applies too):
> >>> for small objects unmarshalled from big factories, the time taken to
> >>> build the cache map is really BIG (and useless).
> >>>
> >>>
> >> If you know the type of object to be unmarshalled you should be able to
> >> avoid this overhead completely by instead using an instance of the
> >> object cast to IUnmarshallable, calling the unmarshal() method. But this
> >> approach is certainly not encouraged by the code samples and such, and
> >> probably wouldn't occur to users.
> >>
> >> There's one obvious optimization that could help with very large
> >> mappings, which is passing the size to the HashMap constructor. I've
> >> made that change in my code, but you may want to try it out to see if it
> >> makes any significant difference for your case.
> >>
> >>
> >
> > In most cases I do not know the type of my objects beforehand, but
> > I'll give the "cast-call-unmarshal" approach a go; I just never thought
> > about it (I'm not sure I'll gain any benefit from it because the map
> > will have to be built for nodes inside the first one, am I wrong?).
> >
>
> Only if you use untyped object references. If all your object references
> are typed correctly, the mapping information should be built into the
> generated code and the map will never be needed.
>
>
> -------------------------------------------------------------------------
> This SF.net email is sponsored by: Splunk Inc.
> Still grepping through log files to find problems?  Stop.
> Now Search log events and configuration files using AJAX and a browser.
> Download your FREE copy of Splunk now >>  http://get.splunk.com/
> _______________________________________________
> jibx-users mailing list
> jibx-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/jibx-users
>

