Bryce, I have a suggestion.
(1) See what happens if you use XMLBeans 2.4.0. (a) They may have fixed this strange algorithm. (b) Or, POI may be incompatible with XMLBeans 2.4.0. If that fails we can discuss the next steps - which would probably to start asking questions about this poor algorithm choice in XMLBeans. If it succeeds then we have a good reason to move the trunk of POI up to XMLBeans 2.4 and to produce a new version of OOXML. Without seeing the whole loop, I can't agree that it is definitely O(N^2), but it looks like it. -It looks like they implemented simple logic for small test cases without concern for large (or even huge) node lists. Here is XMLBean's JIRA - https://issues.apache.org/jira/secure/BrowseProject.jspa?id=10436 This might be helpful - http://wiki.apache.org/xmlbeans/V2Features Regards, Dave On May 6, 2010, at 10:17 AM, <[email protected]> <[email protected]> wrote: > > Just in keeping anyone interested in-tune to what I am finding: > (Slowly plodding forward) > > The Code from xmlbeans (XmlComplexContentImpl.java) > has a tight loop, that scales as O(n) for a single match, > and the way XMLBeans is being used within POI causes a scaling of > O(n^2) because each row has hits this method. > > this is significant. > If This could be re-written to be O(1) or at least O(log n) then > POI would scale linearly (or log linearly) with spreadsheet size. > > We may need to engage the XMLBean folks on this matter. > OR SOMEONE with a really strong algorithms background... > > Please let me know if we can figure out how to proceed with this: > (I WOULD LIKE TO GET THE BEST PEOPLE ENGAGED as I think this could > have a significant impact on POI's usability) > > > CODE: STARTING at Line 1137 of > org.apache.xmlbeans.impl.values.XmlComplexContentImpl.java > ------------------------------------------------------------------------------------------- > // The new object matches what already exists in the array > // Heuristic: we optimize for the case where the new elements > // in the array are the same as the existing elements with > // potentially new elements inserted > > // First insert the new element in the array at position 0 > int j = 0; > for ( j = 0; j < i; j++ ) > { > TypeStoreUser user = store.insert_element_user( elemName, > j ); > ((XmlObjectBase) user).set( sources[j] ); > } > for ( i++, j++; i < sources.length; i++, j++) > { > XmlCursor c = sources[i].isImmutable() ? null : > sources[i].newCursor(); > if (c != null && c.toParent() && c.getObject() == this) > { > c.dispose(); > current = store.find_element_user( elemName, j ); > if (current == sources[i]) > ; // advance > else > { > // Fall back to the general case > break; > } > } > else > { > c.dispose(); > // Insert before the current element > TypeStoreUser user = store.insert_element_user( > elemName, j ); > ((XmlObjectBase) user).set( sources[i] ); > } > } > startDest = j; > startSrc = i; > m = store.count_elements( elemName ); > } > // Fall through > } > else > { > // All of the elements in the existing array are to > // be deleted and replaced with elements from the > // sources array > } > > > > Thanks > Bryce > > --------------------------------------------------------------------------- > Bryce Alcock * Performance Architect * SunGard * > ........................................................................... > http://www.brycealcock.com/ > --------------------------------------------------------------------------- > > > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [email protected] > For additional commands, e-mail: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
