Re: Performance Question with CTSheetDataImpl.java

David Fisher Thu, 06 May 2010 10:47:27 -0700

Bryce,

I have a suggestion.


(1) See what happens if you use XMLBeans 2.4.0.

(a) They may have fixed this strange algorithm.

(b) Or, POI may be incompatible with XMLBeans 2.4.0.

If that fails we can discuss the next steps - which would probably to start 
asking questions about this poor algorithm choice in XMLBeans.

If it succeeds then we have a good reason to move the trunk of POI up to 
XMLBeans 2.4 and to produce a new version of OOXML.

Without seeing the whole loop, I can't agree that it is definitely O(N^2), but 
it looks like it. -It looks like they implemented simple logic for small test 
cases without concern for large (or even huge) node lists.

Here is XMLBean's JIRA - 
https://issues.apache.org/jira/secure/BrowseProject.jspa?id=10436

This might be helpful - http://wiki.apache.org/xmlbeans/V2Features

Regards,
Dave

On May 6, 2010, at 10:17 AM, <[email protected]> 
<[email protected]> wrote:

> 
> Just in keeping anyone interested in-tune to what I am finding:
> (Slowly plodding forward)
> 
> The Code from xmlbeans  (XmlComplexContentImpl.java)
> has a tight loop, that scales as O(n) for a single match, 
> and the way XMLBeans is being used within POI causes a scaling of 
> O(n^2) because each row has hits this method.
> 
> this is significant.
> If This could be re-written to be O(1) or at least O(log n)  then 
> POI would scale linearly (or log linearly) with spreadsheet size.
> 
> We may need to engage the XMLBean folks on this matter.
> OR SOMEONE with a really strong algorithms background...
> 
> Please let me know if we can figure out how to proceed with this:
> (I WOULD LIKE TO GET THE BEST PEOPLE ENGAGED as I think this could 
> have a significant impact on POI's usability)
> 
> 
> CODE:   STARTING at Line 1137 of 
> org.apache.xmlbeans.impl.values.XmlComplexContentImpl.java
> -------------------------------------------------------------------------------------------
>                // The new object matches what already exists in the array
>                // Heuristic: we optimize for the case where the new elements
>                // in the array are the same as the existing elements with
>                // potentially new elements inserted
> 
>                // First insert the new element in the array at position 0
>                int j = 0;
>                for ( j = 0; j < i; j++ )
>                {
>                    TypeStoreUser user = store.insert_element_user( elemName, 
> j );
>                    ((XmlObjectBase) user).set( sources[j] );
>                }
>                for ( i++, j++; i < sources.length; i++, j++)
>                {
>                    XmlCursor c = sources[i].isImmutable() ? null : 
> sources[i].newCursor();
>                    if (c != null && c.toParent() && c.getObject() == this)
>                    {
>                        c.dispose();
>                        current = store.find_element_user( elemName, j );
>                        if (current == sources[i])
>                            ; // advance
>                        else
>                        {
>                            // Fall back to the general case
>                            break;
>                        }
>                    }
>                    else
>                    {
>                        c.dispose();
>                        // Insert before the current element
>                        TypeStoreUser user = store.insert_element_user( 
> elemName, j );
>                        ((XmlObjectBase) user).set( sources[i] );
>                    }
>                }
>                startDest = j;
>                startSrc = i;
>                m = store.count_elements( elemName );
>            }
>            // Fall through
>        }
>        else
>        {
>            // All of the elements in the existing array are to
>            // be deleted and replaced with elements from the
>            // sources array
>        }
> 
> 
> 
> Thanks
> Bryce
> 
> ---------------------------------------------------------------------------
> Bryce Alcock * Performance Architect * SunGard *
> ...........................................................................
> http://www.brycealcock.com/
> ---------------------------------------------------------------------------
> 
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: Performance Question with CTSheetDataImpl.java

Reply via email to