On Fri, 5 Nov 2010 21:24:08 -0400, Scott Deerwester
<scott.deerwes...@gmail.com> wrote:
> The following code (in Python):
> 
> for r in range(dataRange.StartRow, dataRange.EndRow):
>     for c in range(dataRange.StartColumn, dataRange.EndColumn):
>         cell = sheet.getCellByPosition(c,r)
> 
> 
> takes nearly two hours to run on a reasonably fast workstation for a
> spreadsheet with 32 columns and ~27,000 rows (about .2 seconds per row).
> By comparison, opening the file takes about 8 seconds and saving the file
> to a CSV (which is functionally equivalent to the above) takes a few
> seconds. The original file was written in Excel as an XLS (not XLSX).
> This seems impossibly slow. Am I misusing the API somehow? What can I do
> to speed it up?


First, I usually determine which cells are used. Depending on the
operation, this may be a coarse check, such as simply finding the largest
used block.
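
A minimal sketch of that check in Python, assuming "sheet" is a Calc sheet
object obtained through PyUNO. The helper name used_area is made up for
illustration; the cursor calls are the ones I would expect from
XUsedAreaCursor and XCellRangeAddressable.

```python
def used_area(sheet):
    """Return (start_col, start_row, end_col, end_row) of the used area.

    Assumes `sheet` supports createCursor(), and that the cursor supports
    gotoStartOfUsedArea / gotoEndOfUsedArea and getRangeAddress().
    """
    cursor = sheet.createCursor()
    # Collapse the cursor onto the first used cell ...
    cursor.gotoStartOfUsedArea(False)
    # ... then expand it so that it spans through the last used cell.
    cursor.gotoEndOfUsedArea(True)
    addr = cursor.getRangeAddress()
    return addr.StartColumn, addr.StartRow, addr.EndColumn, addr.EndRow
```

The end indexes in a range address are inclusive, so a loop written as
range(StartRow, EndRow) stops one row short; use EndRow + 1 if you loop.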

Next, if I must access the data in the cells, I obtain the data by getting
the entire range and then calling getDataArray() (or getData(), depending
on what I need). That returns all of the values in a single call rather
than making one API call per cell.
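
As a rough sketch in Python, assuming "sheet" is the same kind of PyUNO
sheet object as in your loop (the helper name iter_cells is hypothetical),
the nested getCellByPosition() calls could be replaced by one
getDataArray() call on the whole range:

```python
def iter_cells(sheet, start_col, start_row, end_col, end_row):
    """Yield (col, row, value) for every cell in the inclusive range.

    One call fetches the whole block; the loops below then run over
    plain Python tuples instead of calling into the office per cell.
    """
    cell_range = sheet.getCellRangeByPosition(
        start_col, start_row, end_col, end_row)
    rows = cell_range.getDataArray()  # sequence of rows, each a sequence of values
    for r, row in enumerate(rows, start=start_row):
        for c, value in enumerate(row, start=start_col):
            yield c, r, value
```

With your dataRange this would be called as
iter_cells(sheet, dataRange.StartColumn, dataRange.StartRow,
dataRange.EndColumn, dataRange.EndRow). getDataArray() hands back numbers
and strings; getData() returns numeric values only.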

I have a section on timing in AndrewMacro.odt where I search a Calc
document using different methods.


