[Libreoffice-bugs] [Bug 151552] PDF import into writer messes up line justification

bugzilla-daemon Sun, 16 Oct 2022 14:50:04 -0700

https://bugs.documentfoundation.org/show_bug.cgi?id=151552


--- Comment #8 from Eyal Rozenberg <eyalr...@gmx.com> ---
(In reply to V Stuart Foote from comment #7)

> (In reply to Eyal Rozenberg from comment #6)
> No, please understand how our poppler based PDF import filtering functions.

I actually assumed everything you wrote in your post. I'm not that dense... :-)

But - it is irrelevant how the current filter works. Or rather, it's relevant
when evaluating whether or not a fix can be based on the current implementation
- it is not relevant for evaluating what the desired behavior is.

> PDF is not an editable format. 

First of all, of course it's an editable format. It's not _convenient_ to edit;
it expresses many things implicitly, sure; and still, it's editable.

... but I won't fall for the moving-of-the-goalposts you seem to be setting up
here. PDFs do not need to be editable to have editors. We've already described
what an editor does - and that does not require directly working on the format
it's an editor for. It is perfectly legitimate for an editor to
import-edit-export. GIMP and Photoshop do that for most image formats, because
those are also not editing-friendly.

> We do not Edit PDFs. 

I told you I wouldn't fall for that. You might as well say "We do not edit
OOXMLs"... ok, sure, but LO is still a DOCX editor, and one of the better ones.


> Even for a document being "round-tripped" LibreOffice's import filter(s),
> using external poppler and poppler-utils libraries, extracts the content
> streams from the published presentation, and converts each stream into a
> discrete draw Shape object.

This, at most, means that fixing this bug may require a lot of effort due
to the need for an alternative to the use of poppler (although - maybe not; I'm
not familiar with poppler's capabilities). Fine! I do not claim that this
issue should be the LO project's top priority.

> PDF Viewers don't need to do more with the content streams--they simply
> parse them and lay them out as described in the postscript pages.

Indeed, PDF viewers have it easier, and don't have to reach structural
conclusions. A PDF import filter for a textual document editor needs to work
much harder: reconstituting structure, deducing features and styles, etc.

I don't expect this to work perfectly for arbitrary PDFs. But I definitely
expect it to work well for the most straightforward of PDFs for us to import:
Paragraphs of text exported from LO Writer.
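
To illustrate what "deducing features" could mean for this bug, here is a
minimal sketch, in Python, of guessing a paragraph's alignment from the
horizontal extents of its lines. It is purely a hypothetical heuristic: the
names (Line, guess_alignment), the tolerance and the column bounds are all
assumptions made up for illustration, and this is not how LO's import filter or
poppler actually works.

```python
from dataclasses import dataclass


@dataclass
class Line:
    left: float   # x position where the line's text starts, in points
    right: float  # x position where the line's text ends, in points


def guess_alignment(lines, col_left, col_right, tol=1.0):
    """Guess paragraph alignment from line extents (hypothetical heuristic)."""
    if not lines:
        return "unknown"
    # The last line of a justified paragraph is normally not stretched,
    # so judge the right edge from the other lines when there are any.
    body = lines[:-1] if len(lines) > 1 else lines
    flush_left = all(abs(l.left - col_left) <= tol for l in lines)
    flush_right = all(abs(l.right - col_right) <= tol for l in body)
    if flush_left and flush_right and len(lines) > 1:
        return "justified"
    if flush_left:
        return "left"
    if all(abs(l.right - col_right) <= tol for l in lines):
        return "right"
    # Centered text has equal slack on both sides of every line.
    centered = all(
        abs((l.left - col_left) - (col_right - l.right)) <= tol for l in lines
    )
    return "centered" if centered else "unknown"
```

For a paragraph exported from Writer, where the column bounds are the same on
every line, a check like this would presumably be enough to tell "justified"
apart from "left"; arbitrary PDFs would of course need something more robust.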

> Put another way it is not justified to expend dev, QA and design resources
> working on the PDF import filters when we offer exceptional fidelity for PDF
> content using the pdfium based insert filters.

But you know that's not what a PDF import filter is for. The PDF import filter
for Writer is for editing PDFs in Writer, and that's not at all provided by
pdfium. So, the existence of pdfium does not constitute an argument against
investing effort in improving the Writer PDF import filter.

In fact, I must say that you're taking a rather myopic view of the matter.
Think about the promotion of LO as a product! Especially vis-a-vis MS Office.
If you could tell the user "Someone sent you a document as a PDF? With LO, you
can edit it! Either make it your own by modifying the text or use Track Changes
to treat it as a draft for discussion." - that would be very attractive
functionality that Microsoft doesn't offer.


> And again, LibreOffice is *not* a PDF editor.

I commend your valiant (?) attempt to try to argue this point. Unfortunately,
your argument was based on the false premise that an editor for a file format
must be able to manipulate that format's internal structure directly.
