[Libreoffice-bugs] [Bug 117428] add an option to PDF export dialog to do ActualText per word

2023-08-25 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=117428

--- Comment #25 from Eyal Rozenberg  ---
Can someone summarize the state of this bug at the moment?

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Libreoffice-bugs] [Bug 117428] add an option to PDF export dialog to do ActualText per word

2023-01-13 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=117428

Eyal Rozenberg  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Blocks|                            |43808, 103378


Referenced Bugs:

https://bugs.documentfoundation.org/show_bug.cgi?id=43808
[Bug 43808] [META] Right-To-Left and Complex Text Layout language issues
(RTL/CTL)
https://bugs.documentfoundation.org/show_bug.cgi?id=103378
[Bug 103378] [META] PDF export bugs and enhancements

[Libreoffice-bugs] [Bug 117428] add an option to PDF export dialog to do ActualText per word

2022-11-20 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=117428

V Stuart Foote  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://bugs.documentfoundation.org/show_bug.cgi?id=152143

[Libreoffice-bugs] [Bug 117428] add an option to PDF export dialog to do ActualText per word

2021-07-21 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=117428

--- Comment #24 from stragu  ---
Created attachment 173742
  --> https://bugs.documentfoundation.org/attachment.cgi?id=173742&action=edit
PDF as exported by LO 7.3 on Ubuntu 18.04

Also attaching the resulting PDF for completeness' sake.

Libreoffice-bugs mailing list
Libreoffice-bugs@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/libreoffice-bugs


[Libreoffice-bugs] [Bug 117428] add an option to PDF export dialog to do ActualText per word

2021-07-21 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=117428

--- Comment #23 from stragu  ---
Created attachment 173741
  --> https://bugs.documentfoundation.org/attachment.cgi?id=173741&action=edit
results of testing on Ubuntu 18.04 with LO 7.3 alpha and Evince as PDF viewer

Interesting indeed!

Here are the results of my tests using:
- LO 7.3 alpha0+
- Ubuntu 18.04
- Evince 3.28.4
- gedit 3.28.1

I can't spot any difference with the original text.

This makes me wonder if the issue is specific to Windows, or if Acrobat Reader
is the culprit?
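
A visual check can miss differences in combining marks or stray spaces, so a
code-point comparison of the original and the pasted text may be more reliable.
A minimal sketch in Python, assuming both texts are available as strings;
`describe` and `first_difference` are hypothetical helper names, not part of
any tool mentioned in this report:

```python
from itertools import zip_longest
import unicodedata


def describe(ch):
    """Return a readable label for one character, or a marker if it is absent."""
    if ch is None:
        return "<missing>"
    return f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNNAMED')}"


def first_difference(original, pasted):
    """Return (index, original_char, pasted_char) at the first mismatch, else None."""
    for index, (a, b) in enumerate(zip_longest(original, pasted)):
        if a != b:
            return index, describe(a), describe(b)
    return None
```

Calling `first_difference(original_text, pasted_text)` returns `None` when the
round trip preserved every code point, and otherwise names the first character
that differs.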

Version: 7.3.0.0.alpha0+ / LibreOffice Community
Build ID: 113d308155e4b6a67a8510098a7db5f4a6632bdc
CPU threads: 8; OS: Linux 4.15; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
TinderBox: Linux-rpm_deb-x86_64@86-TDF, Branch:master, Time:
2021-07-16_21:27:22
Calc: threaded


[Libreoffice-bugs] [Bug 117428] add an option to PDF export dialog to do ActualText per word

2021-07-20 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=117428

--- Comment #22 from V Stuart Foote  ---
Created attachment 173712
  --> https://bugs.documentfoundation.org/attachment.cgi?id=173712&action=edit
Result of OP STR as pasted to Notepad++ UTF-8


[Libreoffice-bugs] [Bug 117428] add an option to PDF export dialog to do ActualText per word

2021-07-20 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=117428

--- Comment #21 from V Stuart Foote  ---
(In reply to stragu from comment #20)

> Not sure if something changed in PDF export along the way? Could you please
> test again with a recent version of LO?

Hmm, strange. Following the OP's STR with Writer 7.3.0alpha, I exported to PDF,
opened the file in Acrobat Reader (ver 2021.005.20058), and copied the text to
Notepad++ (bld 7.9.5) in UTF-8 encoding. I get exactly the same malformed
Devanagari.

The glyph clusters are not formed correctly, so the words cannot be copied out
of the PDF.

When present, the /ActualText structures would supplement the incorrect
ToUnicode strings, which drop lexical details.  Splitting the actual text runs
at Unicode word boundaries (with a word-break iterator) would, when the option
is enabled and the text embedded into the PDF export, give better fidelity to
the original text.
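
As a rough illustration of splitting a text run into word-sized units, here is
a minimal Python sketch. It is only an approximation: it treats whitespace as
the sole word boundary, which is far cruder than a real Unicode word-break
iterator, and `split_runs` and `words` are hypothetical names, not anything
from the LibreOffice code base:

```python
from itertools import groupby


def split_runs(text: str) -> list[tuple[str, bool]]:
    """Split text into (run, is_word) pairs, using whitespace as the only boundary."""
    return [
        ("".join(chars), not is_space)
        for is_space, chars in groupby(text, key=str.isspace)
    ]


def words(text: str) -> list[str]:
    """Return only the word runs, i.e. the units that would each get their own tag."""
    return [run for run, is_word in split_runs(text) if is_word]
```

Because the split never looks inside a run of non-space characters, combining
marks and conjuncts in a Devanagari word stay together in one unit.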

=-testing-=
Version: 7.3.0.0.alpha0+ (x64) / LibreOffice Community
Build ID: 213430e0bdac0786b30a76a68b43d35647e93912
CPU threads: 8; OS: Windows 10.0 Build 19043; UI render: Skia/Vulkan; VCL: win
Locale: en-US (en_US); UI: en-US
Calc: threaded


[Libreoffice-bugs] [Bug 117428] add an option to PDF export dialog to do ActualText per word

2021-07-20 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=117428

stragu  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |stephane.guil...@member.fsf.org

--- Comment #20 from stragu  ---
I just tested the steps described in the Description, and couldn't reproduce
the same issue:

On Ubuntu 18.04, using LO 7.0.6 and 7.3 alpha0+, I could copy the text, paste
it into Writer, export to PDF, open the PDF in Evince 3.28.4, copy the text and
paste it back into Writer or gedit: the result was the same as the original
text (as far as I can see).

Not sure if something changed in PDF export along the way? Could you please
test again with a recent version of LO?


[Libreoffice-bugs] [Bug 117428] add an option to PDF export dialog to do ActualText per word

2021-07-17 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=117428

V Stuart Foote  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://bugs.documentfoundation.org/show_bug.cgi?id=39667


[Libreoffice-bugs] [Bug 117428] add an option to PDF export dialog to do ActualText per word

2018-09-25 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=117428

V Stuart Foote  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://bugs.documentfoundation.org/show_bug.cgi?id=118370


[Libreoffice-bugs] [Bug 117428] add an option to PDF export dialog to do ActualText per word

2018-09-22 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=117428

Khaled Hosny  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|https://bugs.documentfoundation.org/show_bug.cgi?id=58941 |


[Libreoffice-bugs] [Bug 117428] add an option to PDF export dialog to do ActualText per word

2018-06-19 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=117428

--- Comment #19 from Khaled Hosny  ---
(In reply to flywire0 from comment #18)
> I consider it a serious bug that characters displayed in a PDF exported from
> LibreOffice go missing when the text is copied. In my instance the letter 'i'
> is displayed in the PDF file but is often missing when the text is copied and
> pasted into another program, e.g. computer commands are pasted incorrectly.

This should be fixed in bug 66597. If you still have an issue with builds that
include that fix, please open a new bug. This should be independent of the
issue being discussed here.


[Libreoffice-bugs] [Bug 117428] add an option to PDF export dialog to do ActualText per word

2018-06-19 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=117428

Khaled Hosny  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Depends on|117533                      |


Referenced Bugs:

https://bugs.documentfoundation.org/show_bug.cgi?id=117533
[Bug 117533] Problems with copying text from generated PDF (for Graphite font)

[Libreoffice-bugs] [Bug 117428] add an option to PDF export dialog to do ActualText per word

2018-06-01 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=117428

--- Comment #18 from flywi...@gmail.com ---
I consider it a serious bug that characters displayed in a PDF exported from
LibreOffice go missing when the text is copied. In my instance the letter 'i'
is displayed in the PDF file but is often missing when the text is copied and
pasted into another program, e.g. computer commands are pasted incorrectly.

I have also noticed that Text To Speech (TTS) does not work when characters are
missing in the PDF, especially when the missing character is a vowel!


[Libreoffice-bugs] [Bug 117428] add an option to PDF export dialog to do ActualText per word

2018-05-28 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=117428

--- Comment #17 from Heiko Tietze  ---
(In reply to Khaled Hosny from comment #16)
> Nothing is “non-latin”-specific about the proposed option.

What would you call CTL and the like in a way that average users understand?
IMHO, "Latin" is understood as A..Z, maybe including some special characters
like umlauts, but definitely not Arabic, Hebrew, and Asian scripts.


[Libreoffice-bugs] [Bug 117428] add an option to PDF export dialog to do ActualText per word

2018-05-27 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=117428

--- Comment #16 from Khaled Hosny  ---
(In reply to Heiko Tietze from comment #15)
> Putting all comments together, UX recommends implementing an option for this
> /ActualText feature. I suggest the caption "Improve non-latin text export"
> (off by default, so nothing changes for Western users), with the details
> explained in the help pages.

Nothing is “non-latin”-specific about the proposed option.


[Libreoffice-bugs] [Bug 117428] add an option to PDF export dialog to do ActualText per word

2018-05-27 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=117428

Heiko Tietze  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|needsUXEval                 |
                 CC|libreoffice-ux-advise@lists.freedesktop.org |olivier.hallot@documentfoundation.org, tietze.he...@gmail.com

--- Comment #15 from Heiko Tietze  ---
Putting all comments together, UX recommends implementing an option for this
/ActualText feature. I suggest the caption "Improve non-latin text export"
(off by default, so nothing changes for Western users), with the details
explained in the help pages.


[Libreoffice-bugs] [Bug 117428] add an option to PDF export dialog to do ActualText per word

2018-05-21 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=117428
Bug 117428 depends on bug 117533, which changed state.

Bug 117533 Summary: Problems with copying text from generated PDF (for Graphite 
font)
https://bugs.documentfoundation.org/show_bug.cgi?id=117533

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |RESOLVED
         Resolution|---                         |INVALID


[Libreoffice-bugs] [Bug 117428] add an option to PDF export dialog to do ActualText per word

2018-05-20 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=117428

--- Comment #14 from Shree Devi Kumar  ---
(In reply to Khaled Hosny from comment #12)
> (In reply to Shree Devi Kumar from comment #10)
> > (In reply to Khaled Hosny from comment #9)
> > > The keyword for the
> > > proposed change is “per word”: the new option would skip the algorithm
> > > and tag the glyphs of each word with its text, as a complete unit.
> > 
> > @Khaled Any update on this? Can you create a patch for this option so that
> > it can be tested?
> 
> I don’t currently have time to work on this, unfortunately.

OK. Thank you for your work on /ActualText; it is a step in the right direction
toward fully copyable text from PDFs.


[Libreoffice-bugs] [Bug 117428] add an option to PDF export dialog to do ActualText per word

2018-05-20 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=117428

--- Comment #13 from Shree Devi Kumar  ---
(In reply to V Stuart Foote from comment #11)
> I don't believe Khaled has volunteered to tackle the needed refactoring of
> the PDF export filter and GUI.  Check History--clearly not assigned, as
> Khaled removed himself; back to NEW.

OK. Since he had suggested opening a new bug for this, I had incorrectly
assumed that he was planning to work on it.

> 
> Otherwise, is there any objection to the fact that implementing an
> /ActualText flag "per word" will limit string selection when copying from the
> PDF to word bounds? Personally I think we need the tagging more than the
> partial string copy.
> 
> Ensuring correct handling of combining glyphs and Unicode scripts--and
> presumably OTF font features when implemented (as for bug 58941)--is the
> desired outcome.
> 
> Justified from a11y perspective, and needed for accuracy supporting CTL
> scripts. 
> 
> Is that the UX consensus?

As a user, the ability to copy text from PDFs is important to me. Currently,
apart from XeLaTeX, I am not aware of any other way of doing so for Devanagari
and other Indic scripts.

Please see
https://www.wikihow.com/index.php?title=Create-a-Searchable-Hindi-PDF-Using-Lyx-with-Xetex
which is a workaround for users who are not comfortable with XeLaTeX to create
these searchable/copyable PDFs.

It will be a great benefit to users if this option can be implemented in
LibreOffice.

Thank You!


[Libreoffice-bugs] [Bug 117428] add an option to PDF export dialog to do ActualText per word

2018-05-16 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=117428

--- Comment #12 from Khaled Hosny  ---
(In reply to Shree Devi Kumar from comment #10)
> (In reply to Khaled Hosny from comment #9)
> > The keyword for the
> > proposed change is “per word”: the new option would skip the algorithm and
> > tag the glyphs of each word with its text, as a complete unit.
> 
> @Khaled Any update on this? Can you create a patch for this option so that
> it can be tested?

I don’t currently have time to work on this, unfortunately.


[Libreoffice-bugs] [Bug 117428] add an option to PDF export dialog to do ActualText per word

2018-05-16 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=117428

V Stuart Foote  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEW
           See Also|                            |https://bugs.documentfoundation.org/show_bug.cgi?id=58941

--- Comment #11 from V Stuart Foote  ---
I don't believe Khaled has volunteered to tackle the needed refactoring of the
PDF export filter and GUI.  Check History--clearly not assigned, as Khaled
removed himself; back to NEW.

Otherwise, is there any objection to the fact that implementing an /ActualText
flag "per word" will limit string selection when copying from the PDF to word
bounds? Personally I think we need the tagging more than the partial string
copy.

Ensuring correct handling of combining glyphs and Unicode scripts--and
presumably OTF font features when implemented (as for bug 58941)--is the
desired outcome.

Justified from a11y perspective, and needed for accuracy supporting CTL
scripts. 

Is that the UX consensus?


[Libreoffice-bugs] [Bug 117428] add an option to PDF export dialog to do ActualText per word

2018-05-16 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=117428

Shree Devi Kumar  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED

--- Comment #10 from Shree Devi Kumar  ---
(In reply to Khaled Hosny from comment #9)
>
> We do export the text already, using a clever algorithm that minimizes the
> file size impact and keeps individual characters selectable (as much as
> possible). It fails in minor ways, though, with some readers second-guessing
> us and inserting random spaces in the middle of a word.

For Indic languages this was happening in ALL readers that I tested. 

> The keyword for the
> proposed change is “per word”: the new option would skip the algorithm and
> tag the glyphs of each word with its text, as a complete unit.

@Khaled Any update on this? Can you create a patch for this option so that it
can be tested?


[Libreoffice-bugs] [Bug 117428] add an option to PDF export dialog to do ActualText per word

2018-05-09 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=117428

Volga  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Depends on|                            |117533


Referenced Bugs:

https://bugs.documentfoundation.org/show_bug.cgi?id=117533
[Bug 117533] Problems with copying text from generated PDF (for Graphite font)

[Libreoffice-bugs] [Bug 117428] add an option to PDF export dialog to do ActualText per word

2018-05-09 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=117428

--- Comment #9 from Khaled Hosny  ---
(In reply to Heiko Tietze from comment #8)
>
> (In reply to Khaled Hosny from comment #4)
> > 2) What exact wording to use, /ActualText is a jargon
> "Export raw text", "Export actual text", "Export source"...

We do export the text already, using a clever algorithm that minimizes the file
size impact and keeps individual characters selectable (as much as possible).
It fails in minor ways, though, with some readers second-guessing us and
inserting random spaces in the middle of a word. The keyword for the proposed
change is “per word”: the new option would skip the algorithm and tag the
glyphs of each word with its text, as a complete unit. This fixes the issue but
introduces a new one: you can no longer select parts of a word, as it is now a
single unit. The option text needs to relay some of this to the user.
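
To make the per-word idea concrete: in PDF, replacement text is attached to a
marked-content sequence through an /ActualText entry, with the text string
written as UTF-16BE after a byte-order mark. The following Python sketch shows
what tagging one word could look like in a content stream. It is an
illustration under assumptions, not LibreOffice's exporter code: the function
names are hypothetical, and the glyph-showing operators passed in are
placeholders for whatever the exporter already emits for that word.

```python
def actual_text_hex(word: str) -> str:
    """Encode a word as a PDF hex string: UTF-16BE preceded by the FEFF byte-order mark."""
    return "<FEFF" + word.encode("utf-16-be").hex().upper() + ">"


def wrap_word(word: str, show_glyphs_ops: str) -> str:
    """Wrap the glyph-showing operators for one word in a /Span with /ActualText."""
    return (
        f"/Span <</ActualText {actual_text_hex(word)}>> BDC\n"
        f"{show_glyphs_ops}\n"
        "EMC"
    )
```

For example, `wrap_word("hi", "<00010002> Tj")` yields a three-line sequence in
which the placeholder glyph operator sits between `BDC` and `EMC`, so a reader
that honors /ActualText would copy the whole word as one unit.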


[Libreoffice-bugs] [Bug 117428] add an option to PDF export dialog to do ActualText per word

2018-05-08 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=117428

Khaled Hosny  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEW
           Assignee|khaledho...@eglug.org       |libreoffice-b...@lists.freedesktop.org


[Libreoffice-bugs] [Bug 117428] add an option to PDF export dialog to do ActualText per word

2018-05-04 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=117428

Shree Devi Kumar  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |khaledho...@eglug.org
           Assignee|libreoffice-b...@lists.freedesktop.org |khaledho...@eglug.org
