https://bugs.freedesktop.org/show_bug.cgi?id=62656

--- Comment #14 from [email protected] ---
(I wrote the stuff below before seeing Joel's second comment, that in fact his
version of Office opens the file in 20 seconds.  I note that somebody else,
with a different unspecified version of Office, said it took five minutes. 
Therefore, my suspicion below that Adobe devs were generating *uniformly*
crappy docx output was wrong -- instead, it turns out that they were generating
docx output that works fine in every office suite they tested -- consisting
solely and entirely of a recent version of Microsoft Word.  Sigh.  However,
besides that useful correction, the rest of my commentary stands:  LibreOffice
should open that document in two seconds, not twenty seconds.  We should not
aim to be as good as Microsoft, but dramatically better.  @Alex -- what version
of Office were you using, on what Windows flavor, that took five minutes?)  

    First of all, I would encourage you not to be satisfied with being almost
as good as Microsoft Office -- that is not how we beat the pants off them, if
you catch my drift.  They have the advantage of getting their binaries
pre-installed (as trialware) on the vast majority of desktops nowadays.  We
need to be better than them, not just equivalent.  

    As for the meat of the question, my position is that the data itself is not
that complex, inherently.  It is 500 pages.  It has some indentation, some
footnotes, some images.  It takes roughly twenty times longer to load than a
roughly similar 1800-page document on the same hardware.  Converting to the ODT
file format cuts that 20x factor down to 3x.  Therefore, it makes sense that:

    1.  LibreOffice *can* load similar data much quicker than it currently does 

    Speeding up the load-time of this particular complex document will
undoubtedly also help speed up the load-times of non-pathological large &
complex documents (my alternative sample still takes about 17 seconds to load
-- why not aim for 2 seconds?  while we're on that topic, please load the
DeveloperGuide.odt into your msftOffice, so we can know how many seconds it
takes) 

    There is the question of why this particular datafile is so poorly encoded
into docx form... and the answer is, because Adobe is doing the encoding.  They
don't want you to export from PDF to DOCX, which permits editing with
LibreOffice; they want you to keep needing licenses for Acrobat Pro.  Almost
certainly, LibreOffice can be taught to clean up their pathological DOCX, and
if so, that gives us a competitive advantage over other DOCX suites -- we can
work with the crappy output of Adobe's pdf2docx converter, while the lesser
office suites cannot.  Making LibreOffice capable of re-encoding a crappy DOCX
into a cleaner-and-quicker DOCX is also, again, helpful to other users, not
just for this particular pathological docx.  (We should also see if LibreOffice
can handle the original PDF, if it is possible to obtain it -- perhaps the blame
lies not with Adobe's pdf2docx, but with the state of the original PDF.)  So:  

    2.  We ought to fix our docx2odt conversion process, for instant speed-up 
    3.  We ought to work on a libre pdf2docx conversion process, maybe 
    4.  We ought to investigate docx2docx cleanup, with speed & integrity in
mind

    "unless there is a real reason why you think it should be faster"  

    Umm... because LibreOffice is too damn slow?   :-)   There, that feels
better!  Seriously, though, the real reason that I think it should be faster is
simply first principles.  I have a document.  It is five pages long.  I load it
up.  Sub-second time.  Beautiful.  

    Replace 5 with 1800.  Replace sub-second with 17 seconds.  Replace
beautiful with... pause... drumming fingers... checking gkrellm... pause....
finally!  

    That's not even talking about pathological cases, like the 390 seconds you
have to wait for the 600-page docx this bug-report discusses.  
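
    To put the figures quoted in this comment side by side, here is a
back-of-the-envelope calculation.  It is only arithmetic on the numbers above
(17 seconds for the 1800-page sample, 390 seconds for the docx); the variable
names are made up for illustration, and since this comment gives the docx as
both 500 and 600 pages, the sketch assumes 600.

```python
# Figures quoted in this comment; the 600-page count is an assumption
# (the comment also mentions 500 pages for the same docx).
sample_seconds, sample_pages = 17, 1800
docx_seconds, docx_pages = 390, 600

# Ratio of total load times: how much longer the docx takes overall.
total_ratio = docx_seconds / sample_seconds

# Ratio of per-page load times: the docx is also the *shorter* document,
# so the slowdown per page is larger than the slowdown in total time.
per_page_ratio = (docx_seconds / docx_pages) / (sample_seconds / sample_pages)

print(f"total load time ratio: {total_ratio:.1f}x")
print(f"per-page load time ratio: {per_page_ratio:.1f}x")
```

    The total ratio lands in the neighborhood of the twenty-times figure
mentioned earlier; the per-page ratio is larger, because the slow document is
also the shorter one.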

    What do I actually *see* after loading, whether it is a 5-page document, or
an 1800-page document?  Page 1.  And maybe, page 2.  If I have a *really* big
dose of screen-real-estate, perhaps up to five pages might be displayed.  Even
ten!  But not more than that.  How long does LibreOffice need, to display five
pages of a document?  Sub-second times.  How long should it need, to display
the first few pages of an 1800-page document, and let me get to work while it
loads the rest in the background?  *That* is my point here.  

    5.  LibreOffice ought to load the user-visible pages (1 and 2 by default)
in less than a second, regardless of how large the document happens to be.  

    p.s.  This is valuable when working with large documents on a local drive,
like the 1800-pager mentioned above... but it is also useful when working with
a medium-sized 50-pager that is being downloaded across the network, say.
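
    A toy sketch of what point 5 is asking for:  lay out the visible pages
first, hand control back to the user, and finish the rest in the background. 
This is not how LibreOffice is structured internally; `LazyDocument`, `layout`
and `visible` are hypothetical names, and the sketch assumes each page can be
laid out independently, which a real word processor cannot assume, since page
breaks depend on everything that comes before them.

```python
import threading
from typing import Callable, Dict, List


class LazyDocument:
    """Toy model: visible pages are laid out up front, the rest in a thread."""

    def __init__(self, raw_pages: List[str],
                 layout: Callable[[str], str], visible: int = 2) -> None:
        self._raw = raw_pages
        self._layout = layout
        self._done: Dict[int, str] = {}
        self._lock = threading.Lock()
        # Synchronous part: only the pages the user can actually see.
        for i in range(min(visible, len(raw_pages))):
            self._done[i] = layout(raw_pages[i])
        # Asynchronous part: everything else, without blocking the caller.
        self._worker = threading.Thread(
            target=self._load_rest, args=(visible,), daemon=True)
        self._worker.start()

    def _load_rest(self, start: int) -> None:
        for i in range(start, len(self._raw)):
            with self._lock:
                if i in self._done:
                    continue  # already laid out on demand by page()
            result = self._layout(self._raw[i])
            with self._lock:
                self._done.setdefault(i, result)

    def page(self, i: int) -> str:
        """Return page i, laying it out on demand if the worker is not there yet."""
        with self._lock:
            if i in self._done:
                return self._done[i]
        result = self._layout(self._raw[i])
        with self._lock:
            return self._done.setdefault(i, result)

    def loaded_count(self) -> int:
        with self._lock:
            return len(self._done)

    def wait(self) -> None:
        """Block until the background worker has finished every page."""
        self._worker.join()


# Usage: a stand-in "layout" function that just upper-cases the page text.
doc = LazyDocument([f"page {n}" for n in range(1800)], str.upper, visible=2)
first = doc.page(0)       # available immediately
last = doc.page(1799)     # laid out on demand if not ready yet
doc.wait()
```

    The point of the design is that the time to first usable page depends on
`visible`, not on the total page count; jumping to a far-away page falls back
to on-demand layout rather than waiting for the whole background pass.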
