POI uses XMLBeans, This is a problem to help make POI faster: There has been on going discussion in the dev.at.poi.apache.org list, However I will include back ground here. This issue is significant.
Took your suggestion and pulled XMLBeans 2.4 into POI, it worked but contained the O(n^2) issue. I have spent some time further investigating this, and it is fairly simple to analyse and understand why it is O(n^2). The real question is what can we do. NOTE: My line numbers won't match with SVN because I have modified the code to put some timers and system outs in place. Just a Quick overview it seems like XmlComplexContentImpl.arraySetterHelper iterates over all elements in the spread sheet. then in side of that iteration, the Xobj.find_element_user(Qname,int) is an Order n operation. Just making Xobj.find_element_user a O(log n) would completely solve the problem. ANALYSIS: O(n^2) in XML Beans: (Around Line 1153 or so....) ----------------------------------------------------------- // Order N (this is invoked 1 time for each element in sources (~120,000 records in my case) for ( i++, j++; i < sources.length; i++, j++) { long t1 = System.currentTimeMillis(); XmlCursor c = sources[i].isImmutable() ? null : sources[i].newCursor(); getCursorTime += (System.currentTimeMillis()-t1); long t2=System.currentTimeMillis(); if (c != null && c.toParent() && c.getObject() == this) { ifCheckTime += (System.currentTimeMillis()-t2); long t3 = System.currentTimeMillis(); c.dispose(); long t4 = System.currentTimeMillis(); // THIS CAUSES O(n^2) SEE CODE BELOW, but this on O(n) for each record. // THIS IS THE INEFFICIENCY in XMLBeans... current = store.find_element_user( elemName, j ); findUserElement += (System.currentTimeMillis() - t4); if (current == sources[i]) ; // advance else { // Fall back to the general case break; } insideIfTime += (System.currentTimeMillis()-t3); } else { ifCheckTime += (System.currentTimeMillis()-t2); c.dispose(); // Insert before the current element TypeStoreUser user = store.insert_element_user( elemName, j ); ((XmlObjectBase) user).set( sources[i] ); } } ---------------- Method from store.find_element_user(QName,int i)--------- // THIS METHOD IS 0(n) // but this is called n times by XmlComplexContentImpl.java around line 1153.... public TypeStoreUser find_element_user ( QName name, int i ) { for ( Xobj x = _firstChild ; x != null ; x = x._nextSibling ) if (x.isElem() && x._name.equals( name ) && --i < 0) return x.getUser(); return null; } Best regards Bryce ------------------------------------------------------------------------------------------ Bryce Alcock * Performance Architect * SunGard * Asset Arena * 377 E. Butterfield Road Suite 800, Lombard, IL 60148 * Tel 630-986-3006 * www.sungard.com/assetarena .......................................................................................... http://www.brycealcock.com/ ------------------------------------------------------------------------------------------ --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@xmlbeans.apache.org For additional commands, e-mail: dev-h...@xmlbeans.apache.org