Bryce,
> public TypeStoreUser find_element_user ( QName name, int i )
> {
> for ( Xobj x = _firstChild ; x != null ; x = x._nextSibling )
> if (x.isElem() && x._name.equals( name ) && --i < 0)
> return x.getUser();
> return null;
> }
Why that is simply very bad programing. Why are named items in a linked list?
Because it is easy to program for small sets.
I think you have a perfect example of something that ought to be patched in
XMLBeans. There are two paths
(1) Figure out how to patch xmlbeans to use a HashMap for name references yet
retain the linked list for preserving the order. When done contribute it to
Apache xmlbeans.
(2) Ask the xmlbeans user list about this trouble.
Regards,
Dave
On May 7, 2010, at 10:48 AM, <[email protected]>
<[email protected]> wrote:
> Dave and others... ,
>
> Took your suggestion and pulled in XMLBeans 2.4
> Worked but contained the O(n^2) issue.
>
> I have spent some time further investigating this, and it is
> fairly simple to analyse and understand why it is O(n^2).
>
> The real question is what can we do.
> NOTE: My line numbers won't match with SVN because I have modified the code
> to put
> some timers and system outs in place.
>
>
> Just a Quick overview
> it seems like XmlComplexContentImpl.arraySetterHelper iterates over all
> elements in the spread sheet.
> then in side of that iteration, the Xobj.find_element_user(Qname,int) is an
> Order n operation.
>
> Just making Xobj.find_element_user a log(n) would completely solve the
> problem for me.
>
>
>
> ANALYSIS: O(n^2) in XML Beans: (Around Line 1153 or so....)
> -----------------------------------------------------------
> // Order N (this is invoked 1 time for each element in sources (~120,000
> records in my case)
>
>
> for ( i++, j++; i < sources.length; i++, j++)
> {
> long t1 = System.currentTimeMillis();
> XmlCursor c = sources[i].isImmutable() ? null : sources[i].newCursor();
> getCursorTime += (System.currentTimeMillis()-t1);
> long t2=System.currentTimeMillis();
> if (c != null && c.toParent() && c.getObject() == this)
> {
> ifCheckTime += (System.currentTimeMillis()-t2);
> long t3 = System.currentTimeMillis();
> c.dispose();
> long t4 = System.currentTimeMillis();
>
> // THIS CAUSES O(n^2) SEE CODE BELOW, but this on O(n) for each record.
> // THIS IS THE INEFFICIENCY in XMLBeans...
>
> current = store.find_element_user( elemName, j );
>
>
> findUserElement += (System.currentTimeMillis() - t4);
> if (current == sources[i])
> ; // advance
> else
> {
> // Fall back to the general case
> break;
> }
> insideIfTime += (System.currentTimeMillis()-t3);
> }
> else
> {
> ifCheckTime += (System.currentTimeMillis()-t2);
> c.dispose();
> // Insert before the current element
> TypeStoreUser user = store.insert_element_user( elemName, j );
> ((XmlObjectBase) user).set( sources[i] );
> }
> }
>
>
>
>
>
> ---------------- Method from store.find_element_user(QName,int i)---------
>
> // THIS METHOD IS 0(n)
> // but this is called n times by XmlComplexContentImpl.java around line
> 1153....
> public TypeStoreUser find_element_user ( QName name, int i )
> {
> for ( Xobj x = _firstChild ; x != null ; x = x._nextSibling )
> if (x.isElem() && x._name.equals( name ) && --i < 0)
> return x.getUser();
> return null;
> }
>
>
>
> Bryce,
>
> I have a suggestion.
>
> (1) See what happens if you use XMLBeans 2.4.0.
>
> (a) They may have fixed this strange algorithm.
>
> (b) Or, POI may be incompatible with XMLBeans 2.4.0.
>
> If that fails we can discuss the next steps - which would probably to start
> asking questions about this poor algorithm choice in XMLBeans.
>
> If it succeeds then we have a good reason to move the trunk of POI up to
> XMLBeans 2.4 and to produce a new version of OOXML.
>
> Without seeing the whole loop, I can't agree that it is definitely O(N^2),
> but it looks like it. -It looks like they implemented simple logic for small
> test cases without concern for large (or even huge) node lists.
>
> Here is XMLBean's JIRA -
> https://issues.apache.org/jira/secure/BrowseProject.jspa?id=10436
>
> This might be helpful - http://wiki.apache.org/xmlbeans/V2Features
>
> Regards,
> Dave
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]