https://bugs.documentfoundation.org/show_bug.cgi?id=49705

--- Comment #27 from Eyal Rozenberg <[email protected]> ---
(In reply to خالد حسني from comment #25)
> PDF does not have paragraphs or lines let alone justification options. Text
> in PDF is just a stream of absolutely positioned glyphs, it can come in any
> order and formation as long as it gives the desired visual output.

While that is true in one sense, in another sense it's false: when we look at
a PDF, we see paragraphs and justification. So they are there; they're just not
expressed explicitly.

> I don’t see LibreOffice ever being able to exactly replicate the PDF
> text spacing.

This bug is not about replicating the positioning exactly. It is about deciding
which alignment the text in a line (or a paragraph, if/when we reconstitute
paragraphs) has. We currently just choose "Left". Not "Right", not "Justified",
not "Centered". It is quite doable to make a much better, and usually-correct,
choice. To do so we need to:

1. Guesstimate where the text boundaries are for the page or part-of-the-page.
2. Determine how much extra space the paragraph has to its left and to its
right.
3. Estimate whether the line is the last line of its paragraph.
4. Try to decide whether the spacing of the words on the line is the result of
adding extra inter-word space (justification) or not. Also determine the
inter-glyph spacing and perhaps the glyph stretch factor (which can immediately
be set for the text in the paragraph/line, if it's uniform at least.)
5. Try to decide whether the paragraph in its entirety is indented and/or has a
first-line indent.
6. Based on our determinations and guesstimates, set the indentation,
alignment, and inter-word spacing for the paragraph.
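
To make steps 2, 3 and 6 concrete, here is a minimal sketch in Python of how
the alignment choice could be made once the text boundaries are known. It is
not LibreOffice code: the function name, the (x_start, x_end) line
representation and the tolerance value are all assumptions made up for
illustration. It also ignores first-line indents and does not look at
inter-word spacing (step 4), so it is only a rough heuristic.

```python
def classify_alignment(lines, left, right, tol=2.0):
    """Guess the alignment of one paragraph (hypothetical helper).

    lines: list of (x_start, x_end) extents, one per line, top to bottom.
    left, right: estimated text boundaries of the page or column (step 1).
    tol: how far from a boundary still counts as "touching" it (assumed units: points).
    """
    # The last line of a justified paragraph is normally not stretched,
    # so leave it out of the test when there is more than one line.
    body = lines[:-1] if len(lines) > 1 else lines

    flush_left = all(abs(x0 - left) <= tol for x0, _ in body)
    flush_right = all(abs(right - x1) <= tol for _, x1 in body)
    # Centered: the slack on both sides of each line is about equal.
    centered = all(abs((x0 - left) - (right - x1)) <= tol for x0, x1 in body)

    if flush_left and flush_right and len(lines) > 1:
        return "justified"
    if flush_left:
        return "left"
    if flush_right:
        return "right"
    if centered:
        return "centered"
    return "left"  # fall back to what we do today
```

For example, with boundaries at 72 and 523, a paragraph whose lines all start
at 72 but end at varying positions comes out as "left", while one whose
non-final lines all span 72 to 523 comes out as "justified".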

... and that would typically come after deciding what the paragraph boundaries
are; although some of it may need to be combined (e.g. distinguishing
paragraphs by indents when there is no inter-paragraph spacing.)
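
As an illustration of the "distinguishing paragraphs by indents" case, here is
a rough sketch, again in Python and again with made-up names and thresholds.
It assumes left-aligned or justified text where a new paragraph is marked only
by a first-line indent; hanging indents and centered text would need
different handling.

```python
def split_paragraphs_by_indent(line_starts, min_indent=8.0):
    """Group line indices into paragraphs using first-line indents.

    line_starts: x position of the first glyph of each line, top to bottom.
    min_indent: smallest offset from the left-most line start that is
                treated as a first-line indent (an assumed threshold).
    Returns a list of paragraphs, each a list of line indices.
    """
    if not line_starts:
        return []

    base = min(line_starts)  # left-most start, taken as the paragraph margin
    paragraphs = [[0]]       # the first line always opens a paragraph
    for i in range(1, len(line_starts)):
        if line_starts[i] - base >= min_indent:
            paragraphs.append([i])    # indented line: a new paragraph starts
        else:
            paragraphs[-1].append(i)  # continuation of the current paragraph
    return paragraphs
```

Given line starts of 90, 72, 72, 90, 72, this groups lines 0-2 into one
paragraph and lines 3-4 into the next.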

And I'll emphasize again that this may not always reconstitute the paragraph
alignment correctly - but it should be correct for typical cases (think: the
formal letter you exported as a PDF), and those make up the large majority of
cases.


---------------------------------

> Short of
> using the exact font with the exact glyphs and positioning every glyph
> individually to match the PDF positioning (which essentially means putting
> each glyph in its own text box, which I don’t think people will be thrilled
> about), 

That's outside the scope of this bug. Still, matching the PDF positioning does
not mean one text box per glyph: you can very well set the spacing and stretch
factors of individual glyphs within the same text box or in the same
paragraph. And - that should definitely be an
option when editing a PDF: Sometimes, what the user would want is a document
that's easy to read, not laden with a zillion formatting and positioning
specifications; but sometimes, what the user wants is a perfect reproduction of
what the PDF looks like, to the extent LO is capable of doing so. The second
case is for when one wants to make an edit to a PDF in LO (e.g. adding some
text or a signature), and save the result - passing through LO should cause
minimal distortion to the rendering of existing PDF content.

(Also, in the Writer filter, we typically don't want textboxes anyway, and the
text should just go in the body.)
