Re: [jibx-users] (Un)MarshallingContext bad performance

Dennis Sosnoski Thu, 30 Aug 2007 14:01:46 -0700

Right you are, Francois - I checked with Joe Bowbeer and, at his
suggestion, on the JavaMemoryModel list, and everyone agrees that using a
static array of classes will work properly due to the special nature of
the Class class.
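
For readers following along, here is a minimal sketch of the kind of
unsynchronized static Class cache being discussed. It is not the actual
JiBX code: the class name ClassCacheSketch, the NAMES table, and the
classFor method are hypothetical, chosen only for illustration. Two
threads may race to fill the same slot, but both store a reference to
the same Class object, so the race is assumed to be benign.

```java
// Illustrative sketch only; names are hypothetical, not JiBX API.
final class ClassCacheSketch {

    // Hypothetical table of class names, indexed the way a binding
    // factory might index its mapped classes.
    private static final String[] NAMES = {
        "java.lang.String",
        "java.util.ArrayList"
    };

    // Shared cache, deliberately accessed without synchronization.
    private static final Class<?>[] CACHE = new Class<?>[NAMES.length];

    static Class<?> classFor(int index) {
        Class<?> cached = CACHE[index];      // racy read: null or the Class
        if (cached == null) {
            try {
                // The classloader does its own locking here.
                cached = Class.forName(NAMES[index]);
            } catch (ClassNotFoundException e) {
                throw new IllegalStateException("Cannot load " + NAMES[index], e);
            }
            // Racy write: another thread may store the same reference.
            CACHE[index] = cached;
        }
        return cached;
    }
}
```

In the worst case two threads both see null, both load the class, and
both write the same reference into the slot, which matches the behavior
described in the quoted discussion below.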


I'll add the whitespace escapes into the attribute writing for 1.1.6. I
don't think the JDOM stuff is used much, so I'll pass on those unless
you're inclined to make sure it works properly with namespaces and then
contribute it.
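
As a rough illustration of what whitespace escaping in attribute values
involves (a sketch under stated assumptions, not the code that will go
into 1.1.6): an XML parser normalizes literal tab, newline, and carriage
return characters in attribute values to spaces, so a writer has to emit
them as character references for them to survive a round trip. The
class and method names below are hypothetical.

```java
// Illustrative sketch only; names are hypothetical, not JiBX API.
final class AttributeEscapeSketch {

    // Escapes markup characters and whitespace for use inside a
    // double-quoted XML attribute value.
    static String escapeAttribute(String value) {
        StringBuilder out = new StringBuilder(value.length() + 16);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '"':  out.append("&quot;"); break;
                // Whitespace that attribute-value normalization would
                // otherwise turn into a plain space on parsing.
                case '\n': out.append("&#xA;");  break;
                case '\r': out.append("&#xD;");  break;
                case '\t': out.append("&#x9;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }
}
```

With this sketch, escapeAttribute("a\nb") yields a&#xA;b, which a
parser reads back as the original two-line value.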

Thanks again for your research into the issues! I don't want major code
changes for 1.1.6, but will try to include some of the improvements in
1.2.

  - Dennis

Dennis M. Sosnoski
SOA and Web Services in Java
Training and Consulting
http://www.sosnoski.com - http://www.sosnoski.co.nz
Seattle, WA +1-425-939-0576 - Wellington, NZ +64-4-298-6117



Francois Valdy wrote:
> Hi Dennis,
>
> although I may be wrong on the subject (my memory model comes more
> from C/C++ understanding than Java knowledge), I thought object
> content was guaranteed for final or volatile fields without the need for
> synchronization.
>
> I think that's more than enough for a Class object, but once again,
> feel free to correct me.
>
> On a separate note, here is the list of side-optimizations I also did
> (only by overriding JIBX objects, not directly in JIBX code this
> time):
>
>  - DualXMLWriter to write a JDOM object and the corresponding String
> in one go (not sure how it behaves with namespaces however because of
> openNamespaces(...)).
>
>  - JibxJDOMWriter:
>   o Skips most of XML character validation (takes a hell of a time,
> and not needed for me)
>   o getNamespace() should check for empty prefix/uri and return
> Namespace.NO_NAMESPACE (jdom version being .... damn slow)
>
>  - JibxStringReader, non synchronized version of StringReader (also
> never closed, to allow multiple consecutive unmarshals from the same
> StringReader, very useful for custom collection unmarshalling)
>
>  - JibxStringWriter, non sync (StringBuilder instead of StringBuffer),
> also never closed
>
>  - JibxUTF8Escaper, escapes \n in attribute values ;-)
>
> Regards,
> Francois.
>
> On 8/29/07, Dennis Sosnoski <[EMAIL PROTECTED]> wrote:
>   
>> Likewise...
>>
>>  - Dennis
>>
>>
>> Francois Valdy wrote:
>>     
>>> Hi Dennis,
>>>
>>> Same here, feedback inlined ;-)
>>> Thanks for your answer.
>>>
>>> Francois.
>>>
>>> On 8/29/07, Dennis Sosnoski <[EMAIL PROTECTED]> wrote:
>>>
>>>       
>>>> Hi Francois,
>>>>
>>>> Thanks for these interesting observations! Detailed responses inline.
>>>>
>>>>  - Dennis
>>>>
>>>> Dennis M. Sosnoski
>>>> SOA and Web Services in Java
>>>> Training and Consulting
>>>> http://www.sosnoski.com - http://www.sosnoski.co.nz
>>>> Seattle, WA +1-425-939-0576 - Wellington, NZ +64-4-298-6117
>>>>
>>>>
>>>>
>>>> Francois Valdy wrote:
>>>>
>>>>         
>>>>> Hi,
>>>>>
>>>>> Performance of MarshallingContext and its unmarshalling friend are
>>>>> really poor compared to the effort done on the rest of JIBX.
>>>>> It's not noticeable for large objects, but for small ones, between 50%
>>>>> and 75% of the marsh/unmarsh time is taken by those classes.
>>>>>
>>>>>
>>>>>           
>>>> I think it's better to optimize for large objects/documents rather than
>>>> small ones, but agree that ideally we'll have great performance for both.
>>>>
>>>>
>>>>         
>>> I do want both: I have small (20 chars) and large (1 MB) XML objects in
>>> the same application, and rest assured I checked the tradeoffs of my
>>> suggestions
>>> (except for namespaces, which I do not use).
>>>
>>>       
>> One of the other things I want to do for 2.0 is to provide optimized
>> methods for no-namespace access. It probably won't give a major boost to
>> performance, but probably another 5-10%.
>>
>>     
>>>>> Marshalling:
>>>>> loadClass result should be cached in an array inside the factory
>>>>> (shared cache between MarshallingContext).
>>>>>
>>>>>
>>>>>           
>>>> This would involve using synchronization, which can be a real issue in
>>>> multithreading systems (especially multiprocessor ones). Still, there'd
>>>> also have to be synchronization at some level within the classloader
>>>> checking loaded classes. It'd be interesting to check the actual
>>>> performance tradeoffs on this.
>>>>
>>>>
>>>>         
>>> This Class object array doesn't need any synchronization: if the Class
>>> is at the index, use it; if it's not, look it up. Two threads
>>> simultaneously setting the value in this cache is not an issue either.
>>> So no added sync here (and you're right that sync is already done at
>>> classloader level; actually I'm avoiding the classloader sync with
>>> this array cache).
>>>
>>>       
>> I agree that doing what you're suggesting would be safe under most
>> situations, but unfortunately I think it violates the Java memory model
>> rules. The problem is that without synchronization there's no guarantee
>> that one thread's "view" of the class information referenced by the array
>> would be accurate, if it had been set by another thread.
>>
>>     
>>> ...
>>>       
>>>> There is a potential way around the exception, I think, which is to use
>>>> getResource() first to try to find the class file, and only load the
>>>> class when the class file is found. I'm not sure that this would really
>>>> provide any benefit, though.
>>>>
>>>>
>>>>
>>>>         
>>>>> I've updated the binding generation to add this array of nulls to the
>>>>> factory, passed to the MarshallingContext constructor (null is supported
>>>>> for backward compat).
>>>>> Class object is cached in factory only if loaded from the factory 
>>>>> classloader.
>>>>> Result being a 50% performance increase for small objects.
>>>>>
>>>>>
>>>>>           
>>>> Are you using synchronization for access? And if so, have you tried it
>>>> with multithreading/multiprocessors?
>>>>
>>>>
>>>>         
>>> My application is multithreaded, and I do use it in multi-cpu environments.
>>> Like I said above, no sync is required for access on an array. In
>>> worst case I'll end up with 2 threads finding null, loading the same
>>> class, setting the same index in the array, no big deal no matter
>>> which one goes first.
>>>
>>>       
>> Again, I agree that for reasonable configurations this will work - but
>> it does violate the Java memory model, and we couldn't count on it
>> working across all JVMs and hardware platforms.
>>
>>     
>>>>> Unmarshalling (improvement from marshalling above applies too):
>>>>> for small objects unmarshalled from big factories, the time taken to
>>>>> build the cache map is really BIG (and useless).
>>>>>
>>>>>
>>>>>           
>>>> If you know the type of object to be unmarshalled you should be able to
>>>> avoid this overhead completely by instead using an instance of the
>>>> object cast to IUnmarshallable, calling the unmarshal() method. But this
>>>> approach is certainly not encouraged by the code samples and such, and
>>>> probably wouldn't occur to users.
>>>>
>>>> There's one obvious optimization that could help with very large
>>>> mappings, which is passing the size to the HashMap constructor. I've
>>>> made that change in my code, but you may want to try it out to see if it
>>>> makes any significant difference for your case.
>>>>
>>>>
>>>>         
>>> In most cases I do not know the type of my objects beforehand, but
>>> I'll give the "cast-call-unmarshal" a go, I just never thought
>>> about it (I'm not sure I'll gain any benefit from it because the map
>>> will have to be built for nodes inside the first one, am I wrong?).
>>>
>>>       
>> Only if you use untyped object references. If all your object references
>> are typed correctly, the mapping information should be built into the
>> generated code and the map will never be needed.
>>
>>
>> -------------------------------------------------------------------------
>> This SF.net email is sponsored by: Splunk Inc.
>> Still grepping through log files to find problems?  Stop.
>> Now Search log events and configuration files using AJAX and a browser.
>> Download your FREE copy of Splunk now >>  http://get.splunk.com/
>> _______________________________________________
>> jibx-users mailing list
>> jibx-users@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/jibx-users
>>
>>     
>
