https://bugs.documentfoundation.org/show_bug.cgi?id=148010

--- Comment #7 from Michael Meeks <[email protected]> ---
I'm afraid the ODS format does make life much more difficult for many
optimizations and for exposing parallelism - since it is a single, huge stream
for a whole workbook - so each sheet cannot be parsed in separate threads (as
with XLSX). It is certainly true that ODS uses more verbose tags too - but our
parsing (which is faster than populating our core model) is effectively free,
since it happens in another thread. So I don't think we need to get too hung up
on long attribute names until we have a profile that says that tokenizing them
is in fact really slow. Even in that case, we could possibly do something
clever to parallelize the XML parsing into chunks if we wrote our own XML
parser (not something I'm hyper-eager to do - though in reality we support a
rather small subset of the XML feature-bloat in ODF files).
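
To illustrate the point about per-sheet streams, here is a minimal Python
sketch (not LibreOffice code). It assumes a toy ZIP layout with one XML member
per sheet, loosely in the spirit of XLSX; the member names, the XML shape and
the helpers make_workbook / parse_sheet / parse_all are all hypothetical. Each
member can go to its own worker, which a single huge stream does not allow
without first finding safe split points.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor


def make_workbook(n_sheets, rows):
    """Build a toy ZIP in memory with one XML member per sheet (hypothetical layout)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for s in range(n_sheets):
            cells = "".join(
                f'<row><c v="{s * rows + r}"/></row>' for r in range(rows)
            )
            zf.writestr(f"sheet{s}.xml", f"<sheet>{cells}</sheet>")
    return buf.getvalue()


def parse_sheet(data, name):
    """Parse one sheet member and return (name, sum of its cell values)."""
    # Each worker opens its own ZipFile, so no file handle is shared across threads.
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        root = ET.fromstring(zf.read(name))
    return name, sum(int(c.get("v")) for c in root.iter("c"))


def parse_all(data):
    """Parse every sheet member of the toy workbook in a thread pool."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        names = sorted(zf.namelist())
    with ThreadPoolExecutor() as pool:
        return dict(pool.map(lambda n: parse_sheet(data, n), names))


if __name__ == "__main__":
    print(parse_all(make_workbook(3, 4)))
```

This only shows the structure: in CPython the GIL may limit the real speed-up
from threads, whereas a C++ importer would not have that constraint.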

Anyhow - there are a very large number of ways to continue to significantly
improve ODS import performance - but all of them are expensive in terms of
developer time; and (as yet) I don't see significant demand for this from
Collabora customers - but no problem with having this ticket open.

As/when someone has credible resources to invest here, and a profile - I'm
really happy to help out with some ideas of how we can try to improve things
substantially.

Beyond that, tweaking the ODF format to use e.g. 'R1C1' formulae, to split up
sheets, shared formulae etc. into more ZIP streams, and (ideally) to save
sheets vertically not horizontally would be great (though a standards process
is usually extremely slow). I don't really see us being able to tweak our
core to defer loading of sheets on any sensible horizon, and I'm not sure
that is even that useful in the modern world.
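
As a rough illustration of why 'R1C1' formulae would help, the Python sketch
below converts simple A1-style references to R1C1 notation relative to the
cell holding the formula. A formula filled down a column then becomes the same
string in every row, which makes shared formulae trivial to detect. The
function names are hypothetical, and the converter is deliberately naive: it
uses UI-style A1 references rather than the ODF formula syntax, and it ignores
sheet names, quoted strings and function names that end in digits.

```python
import re

# Optional '$', column letters, optional '$', row digits, e.g. A1, $B$12, C$3.
_REF = re.compile(r"(\$?)([A-Z]+)(\$?)(\d+)")


def col_to_num(col):
    """Convert column letters to a 1-based index: A -> 1, Z -> 26, AA -> 27."""
    n = 0
    for ch in col:
        n = n * 26 + (ord(ch) - ord("A") + 1)
    return n


def a1_to_r1c1(formula, row, col):
    """Rewrite simple A1 refs in `formula` as R1C1, relative to cell (row, col), 1-based."""
    def repl(m):
        col_abs, col_txt, row_abs, row_txt = m.groups()
        r, c = int(row_txt), col_to_num(col_txt)
        # Absolute parts keep their index; relative parts become offsets.
        r_part = f"R{r}" if row_abs else ("R" if r == row else f"R[{r - row}]")
        c_part = f"C{c}" if col_abs else ("C" if c == col else f"C[{c - col}]")
        return r_part + c_part
    return _REF.sub(repl, formula)


if __name__ == "__main__":
    # The same fill-down formula in C1 and C2 yields one identical R1C1 string.
    print(a1_to_r1c1("=A1+B1", 1, 3))
    print(a1_to_r1c1("=A2+B2", 2, 3))
```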
