[Libreoffice-bugs] [Bug 62846] Incorrect glyph to Unicode mappings in PDFs (Graphite)

2019-03-21 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=62846

V Stuart Foote  changed:

   What    |Removed |Added

   See Also|        |https://bugs.documentfoundation.org/show_bug.cgi?id=124191

-- 
You are receiving this mail because:
You are the assignee for the bug.___
Libreoffice-bugs mailing list
Libreoffice-bugs@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/libreoffice-bugs

[Libreoffice-bugs] [Bug 62846] Incorrect glyph to Unicode mappings in PDFs (Graphite)

2018-01-25 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=62846

Khaled Hosny  changed:

   What      |Removed |Added

   Status    |NEW     |RESOLVED
   Blocks    |66597   |
   Resolution|---     |DUPLICATE

--- Comment #57 from Khaled Hosny  ---
We now have one common code path for Graphite and non-Graphite fonts, so
whatever the fix for bug 66597 turns out to be, it should work here too.

*** This bug has been marked as a duplicate of bug 66597 ***


Referenced Bugs:

https://bugs.documentfoundation.org/show_bug.cgi?id=66597
[Bug 66597] Problems with copying and extracting text from generated PDF


[Libreoffice-bugs] [Bug 62846] Incorrect glyph to Unicode mappings in PDFs (Graphite)

2018-01-18 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=62846

--- Comment #56 from martin_hos...@sil.org ---
(In reply to shreeshrii from comment #54)
> The problem of copying text from PDFs created with Unicode fonts for complex
> scripts has been solved by Jonathan Kew by use of ActualText in XeLaTeX.
> 
> 
> It uses the new \XeTeXgenerateactualtext feature - please see
> http://tug.org/pipermail/xetex/2016-February/026445.html for the
> announcement.
> 
> Is it possible to use a similar approach for LibreOffice?

No. XeTeX is XeTeX and libo is libo. They are completely different animals,
with completely different processing engines and PDF output mechanisms. There
is no overlap. All XeTeX is doing is inserting /ActualText entries, just as I
suggested a while back (see comment #48). This will require some programming
from someone who has the time to do it. Either that or you can pay one of the
consulting companies to do it. Since this is a new feature, no amount of
complaining, or of trying to call it a regression on some font or other, is
going to fix it.

The only way forward on this bug is for someone to commit code to add the
capability to libo.



[Libreoffice-bugs] [Bug 62846] Incorrect glyph to Unicode mappings in PDFs (Graphite)

2018-01-18 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=62846

--- Comment #55 from shreesh...@gmail.com ---
Please also see https://bugs.documentfoundation.org/show_bug.cgi?id=66597#c20

Comment # 20 on bug 66597 from Khaled Hosny

LibreOffice has limited support for actual text already, and I think it
shouldn't be hard to extend it and make it an option at least. If someone is
interested in giving this a try, check the SetActualText() calls in
sw/source/core/text/EnhancedPDFExportHelper.cxx.



[Libreoffice-bugs] [Bug 62846] Incorrect glyph to Unicode mappings in PDFs (Graphite)

2018-01-17 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=62846

--- Comment #54 from shreesh...@gmail.com ---
The problem of copying text from PDFs created with Unicode fonts for complex
scripts has been solved by Jonathan Kew by use of ActualText in XeLaTeX.


It uses the new \XeTeXgenerateactualtext feature - please see
http://tug.org/pipermail/xetex/2016-February/026445.html for the announcement.

Is it possible to use a similar approach for LibreOffice?



[Libreoffice-bugs] [Bug 62846] Incorrect glyph to Unicode mappings in PDFs (Graphite)

2017-09-06 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=62846

--- Comment #53 from Jonathan  ---
Thanks for the update martin_hos...@sil.org. Personally I concur with the
previous comment in that I don't have a strong preference. Neither space nor
time is a constraint, but having a searchable PDF is essential. Perhaps if it
came to it, getting the PDF right is more important than speed, so I'd go with
the slow and small option.

I might repeat that by manually editing my PDF (I forget how I did it, this was
years ago) I managed to fix the glyph mapping and make it correctly searchable.
I'm not sure what this says about the time/space trade-off you mention, but to
my naive interpretation, it does make the current implementation look more like
a bug than a design flaw.



[Libreoffice-bugs] [Bug 62846] Incorrect glyph to Unicode mappings in PDFs (Graphite)

2017-09-06 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=62846

--- Comment #52 from Volga  ---
(In reply to martin_hosken from comment #51)
> One of the difficulties with attaching text to a PDF text run is that the
> text has to be output before the glyphs that give the presentation. So there
> are a number of tradeoffs we can employ in resolving this. So I'll ask,
> which do you prefer:
No, I had no preference when I reported here. I just reproduced it by clicking
“Export to PDF” on the toolbar. Sorry.



[Libreoffice-bugs] [Bug 62846] Incorrect glyph to Unicode mappings in PDFs (Graphite)

2017-09-06 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=62846

--- Comment #51 from martin_hos...@sil.org ---
Sorry to be somewhat brutal, but until we get the PDF writer to produce the
PDF needed for data extraction, using tagged PDF, it doesn't matter what magic
we do with our fonts; it isn't going to work. You can give example after
example, but that won't help fix the problem.

One of the difficulties with attaching text to a PDF text run is that the text
has to be output before the glyphs that give the presentation. So there are a
number of tradeoffs we can employ in resolving this. So I'll ask, which do you
prefer:

Speed vs size? Do you want small PDFs that only output Unicode strings for
runs that really need them, but take a bit longer to produce (since the
strings have to be analysed to make the decision), or are you OK with having a
complete copy of the text in your PDF?

Do we want to make this an option that says "make me an extractable PDF", or
do we always want to generate extractable PDF even if the result is bigger or
slower to produce?
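
To make the speed vs size question concrete, here is a minimal Python sketch.
It is not LibreOffice code. The Cluster record, the needs_actual_text() rule
(a run needs its own Unicode copy if it is right-to-left or if any cluster is
not one character to one glyph) and the UTF-16 size estimate are all
assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Cluster:
    """Hypothetical shaping cluster: source characters and resulting glyph count."""
    text: str
    glyph_count: int


def needs_actual_text(clusters: List[Cluster], rtl: bool = False) -> bool:
    """Assumed heuristic: only plain left-to-right 1 char -> 1 glyph runs can
    be recovered from per-glyph mappings; everything else needs a text copy."""
    if rtl:
        return True
    return any(len(c.text) != 1 or c.glyph_count != 1 for c in clusters)


def extra_bytes(runs: List[Tuple[List[Cluster], bool]], always: bool) -> int:
    """Rough size cost: UTF-16BE bytes of the text copied into the PDF.

    always=True  models "complete copy of the text" (bigger, no analysis).
    always=False models "only runs that need it" (smaller, analysis per run).
    """
    total = 0
    for clusters, rtl in runs:
        if always or needs_actual_text(clusters, rtl):
            text = "".join(c.text for c in clusters)
            total += len(text.encode("utf-16-be"))
    return total
```

With one simple Latin run and one ligature run, the "always" mode copies both
runs while the selective mode copies only the ligature, at the price of
running the check on every run.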



[Libreoffice-bugs] [Bug 62846] Incorrect glyph to Unicode mappings in PDFs (Graphite)

2017-09-06 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=62846

--- Comment #50 from Volga  ---
Created attachment 136058
  --> https://bugs.documentfoundation.org/attachment.cgi?id=136058=edit
Problem with Cyrillic

The problem still appears with Cyrillic. I installed Ponomar Unicode and its
TTF version (Ponomar Unicode TT) on my computer, copied a sample text from
http://sci.ponomar.net/fonts.html twice, and set one copy to each font. After
I export to PDF and copy the text, I get the following result:

Ponomar Unicode
Хрⷭ ҇ то́ съ воскре́ се и҆ з̾ ме́ ртвыхъ, сме́
ртїю сме́ рть попра́ въ, и҆
сꙋ́
щымъ во гробѣхъ иво́ тъ дарова́ въ.
Ponomar Unicode TT
Хртоосъ воскреосе иизз еортвыхъ, с еортїю с еорть попраовъ, ии
сꙋ
о
щы ъ во гробѣхъ живоотъ дароваовъ.



[Libreoffice-bugs] [Bug 62846] Incorrect glyph to Unicode mappings in PDFs (Graphite)

2017-09-06 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=62846

--- Comment #49 from Volga  ---
Created attachment 136057
  --> https://bugs.documentfoundation.org/attachment.cgi?id=136057=edit
Awami Nastaliq Type Sample generated by LibreOffice 5.4

I have already got the font package from SIL (noted in comment 46). I
extracted the sample ODF, opened it with LibreOffice 5.4.1 and exported it as
PDF. When I copy the Urdu UDHR from the resulting PDF, the character mapping
seems better, but many words are deformed and their direction is not handled
correctly.

Version: 5.4.1.2 (x64)
Build ID: ea7cb86e6eeb2bf3a5af73a8fac570321527
CPU threads: 4; OS: Windows 6.19; UI render: default;
Locale: zh-CN (zh_CN); Calc: group

Here is what I copied from the self-generated PDF.
Awami Running Text
One paragraph from Urdu UDHR
ے ن
ن
ی ل ب
ڋ
مس
نرل ا
ن
ی جڋ ک حڋک ه ت
ق
م
قِوما
ق
ا ۰۱ ؍دسمڋنر ۸۴۹۱ ؁
با۔
ي
ا اعلان عا مکک ک ے اس ک
ک
ږ ک ک ظوک ر
ن
ن
ن
ور" م
ش
ش
ن
ن
م يم ل ا اع ک
قوک ق
ق
ح
ین
ن
اس
ن
ن
ا " ا ک ء ک
بږ زور الك پ ما ممڋنر مم مت
ق
ے
ي
پ
پ
ن
ے ا ن
ن
ی ل ب
ڋ
مس
عڋ ا ب ے ڋ
ک
ک ے
اب م
ن
را ک
خي ک
ن
ی
ي
بار
ق
ے۔ سا ہ
ن درج
ن
ت
ق
ل م
م ک
م ا ک ور ک ش
ش
ن
م بږ سا
ات پ
ح ف
ن
ص ے
ل گ کا
باں
ي
تما
ن
ے یہ ککہ ا س
ي
ً
بلا
ش
نن۔ م
ي
ہ ل
ّ
ص ح نت
ب م
ق
عا ش
ش
شر و ا
ن
ن
ی ک ین اور اس ک
ي
ږ ک عام ک
ِ
ا اعلان ک ہاں اس ک
ہ
ے
ي
پ
پ
ن
ے ا
ي
پ
پ
ھي ا ب
ڋ
با ککہ هو
ي
د
ی ک ے اور اس ک ن
◌
ا ج
ڋ
اب
ي
اب
ن
س ږ ک ک ھږ
ڋ
ب
ے پ
تن ا س
ي
مي اداروں م ی
ي
ل ع ب
ق
ولوں اور ک
ک
ش بږ ا اص طور پ ج
ن
ے۔ اور ن
◌
ا ج
ڋ
اب
ي
ږا ںکک
ن
ب
ي
بږ وآ قامات پ
ق
م
اب
ق
یہ ڋبږ
ن
باز
ي
ت
ق
م نی ا
◌
و ک ک ے اظ س ح ل ے
ک
ک ب
ق
ثيب
ش
یي
ح
بايس
س ی ک ک ے
ق
ق
با علا
ي
ملك
ي س ک
ک نت
ي
ن م
ن
م ض
ن
نن، اور سا
ي
ن
◌
ا ج
ڋ
ی ک ضج ک
ن
بلات او
ي
ص ف
ن
ت
ق
ے ن
◌
ا ج
ڋ



[Libreoffice-bugs] [Bug 62846] Incorrect glyph to Unicode mappings in PDFs (Graphite)

2017-06-28 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=62846

--- Comment #48 from martin_hos...@sil.org ---
I lied. It's not producing good text, even if it is somewhat Arabic-like. For
a start, the text seems to be backwards.

Here's what is going on. Inside the PDF there is a 1:n mapping between glyphs
and characters. That's destined for failure right there, because if you break
off your nuqtas, you are in for trouble. So, while libo does the best it can,
the results are going to be really bad regardless.

This has nothing to do with Graphite vs HarfBuzz, since by the time the PDF
writing is happening, everything has been shaped into the same structures. It's
just the nature of the problem that PDF cannot map n:1 glyphs:chars on output,
especially for the case [xy]:z and x:w. The only way to do this properly is to
output the Unicode text along with the glyphed text as part of the PDF page
stream.
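
The [xy]:z and x:w case can be illustrated with a small Python sketch. It
models text extraction as a plain per-glyph lookup table, which is a
simplification and not the actual PDF machinery. The glyph names are made up,
and the choice of BEH and DOTLESS BEH as the example pair is an assumption
about how a font with detached nuqtas might behave.

```python
BEH = "\u0628"          # ARABIC LETTER BEH (one dot below)
DOTLESS_BEH = "\u066E"  # ARABIC LETTER DOTLESS BEH

# Glyph runs such a font might produce for each character (hypothetical names).
RUNS = {
    BEH: ["beh_base", "dot_below"],  # [xy]:z, two glyphs for one character
    DOTLESS_BEH: ["beh_base"],       # x:w, the same base glyph on its own
}


def extract(glyphs, table):
    """Recover text the only way a per-glyph table allows: glyph by glyph."""
    return "".join(table[g] for g in glyphs)


def failures(table):
    """Return the characters whose glyph run does not extract back correctly."""
    return [ch for ch, glyphs in RUNS.items() if extract(glyphs, table) != ch]


# Two possible tables. The base glyph can only carry one meaning in each.
TABLE_A = {"beh_base": DOTLESS_BEH, "dot_below": ""}
TABLE_B = {"beh_base": BEH, "dot_below": ""}

if __name__ == "__main__":
    for name, table in (("A", TABLE_A), ("B", TABLE_B)):
        wrong = [f"U+{ord(ch):04X}" for ch in failures(table)]
        print(f"table {name} gets these characters wrong: {wrong}")
```

Whichever meaning is assigned to the base glyph, one of the two runs extracts
to the wrong character, which is why the text itself has to travel with the
glyphs.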

One way might be in vcl/source/gdi/pdf_impl.cxx to have another MARK() function
that takes an OUString&, nIndex and nLen and outputs that as the /ActualText as
part of the structure element dictionary in the /Span. This would only get
output if structured marking was turned on. I'm not sure if there would need to
be any other limiting factors, for example that the text contains CTL
codepoints.
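
As a rough idea of what such output could look like, here is a Python sketch
(not the vcl code, and not an existing LibreOffice function). It wraps some
glyph-showing operators in a /Span marked-content sequence whose property
dictionary carries /ActualText as a UTF-16BE hex string with a byte order
mark. The function name, the use of an inline property dictionary, and the
glyph operator string in the usage note are assumptions. The index and length
here count Python code points, whereas an OUString counts UTF-16 code units.

```python
def actual_text_span(text: str, n_index: int, n_len: int, glyph_ops: str) -> str:
    """Return a PDF content-stream fragment: /Span with /ActualText around glyph_ops.

    text      -- the source string the run was shaped from
    n_index   -- start of the run within text (code points in this sketch)
    n_len     -- length of the run (code points in this sketch)
    glyph_ops -- already-built text-showing operators for the run's glyphs
    """
    run = text[n_index:n_index + n_len]
    # PDF text strings in Unicode are UTF-16BE preceded by the BOM FE FF.
    hex_text = ("\ufeff" + run).encode("utf-16-be").hex().upper()
    return f"/Span <</ActualText <{hex_text}>>> BDC\n{glyph_ops}\nEMC"
```

For example, actual_text_span("\u0628\u0627", 0, 1, "<0012> Tj") wraps the
hypothetical glyph operator "<0012> Tj" between a BDC line carrying
/ActualText <FEFF0628> and a closing EMC.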

Suffice it to say that libo isn't up to handling CTL text for text export from
PDF. But let's not blame libo too much. This is really a bug in PDF, since the
PDF specification only allows 1:n glyph:char mapping. All very Latin-centric ;)



[Libreoffice-bugs] [Bug 62846] Incorrect glyph to Unicode mappings in PDFs (Graphite)

2017-06-28 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=62846

--- Comment #47 from martin_hos...@sil.org ---
Looks like this is fixed in 5.4. I ran a test, and for the three fonts
NotoNastaliqUrdu, Awami Nastaliq and Scheherazade, the text copied from the PDF
was Arabic (even with correct characters with nuqtas). Which is all pretty
amazing given that the Awami font doesn't have appropriately named glyphs and
also decomposes its nuqtas.



[Libreoffice-bugs] [Bug 62846] Incorrect glyph to Unicode mappings in PDFs (Graphite)

2017-06-24 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=62846

--- Comment #46 from Volga  ---
This bug still affects LO 5.3. The SIL Awami Nastaliq website has a font type
sample that was produced with LibreOffice 5.3.1.2. When I open the file and
copy the Urdu text from page 5, I get the following text:

Awami Running Text
One paragraph from Urdu UDHR
À
Ã
¢
Õ
Œ
œ
ö
–
—
ý 
“
”
¢
‘
ö
Õ
’
÷
 ô
◊
ÿ
∞
Ÿ
/
"
ý
⁄
¤
∞
ý
~
“
”
ö
‹
—
õ

áÇÜ
ï
x›
œ
û
fi
÷
"›
fl
! ‡
·
fl
ý 
›
‚
÷
 òý 
À
„
÷
∆
‰
÷
 ó
Â
Ê
¢
Á
¢
Ÿ
Ë
ó
Â
È
Í
Î
¢
Ï
ù
Ì
Ó
›
fl
›
‚
÷
Â
±
∞
Ô
Õ

¢
›
Ò
Ú
¢
ý 
Ë
›
‚
÷

ó
Û
∆
«
Ù
 ́
™
›
ı
ˆ
“
”
ö
‹
ß
"›
ı
 ̃
∞
À
†
Ù
 ̄
¢
ý 
À
Ã
¢
Õ
Œ
œ
ö
–
—
ý 
ô
 ̆
 ̇
ö
À
„
÷
À
 ̊
›
ú
¢
ó
›
‚
÷
ù
 ̧
¢
 ̋
û
ó
›
ú
∞
 òý 
xÀ
 ̨

 øóõ 
°
¢

∞

®
ı

÷

›
‚
÷
  ó
Â
È
Í
Î
¢
Ï
 òý 
∆
«
Ù
 ú
›
◊

¢

À

÷
ý
p›
ú
û
›
ı
 ̃
¢
À
ý 
÷
û
4
‡
·
œ
Í

x°
£
û

.


°

û

ƒ
∞

›

Í
ý 

“

Í
Ú
¢
Õ
’
÷
 òý ó

ý 
°

û
∆
‰
÷
"›
fl
/
! ‡
·
fl
ý 
›
‚
÷
 òý 
p›


À
†
Ù
 ̄
¢
ý 
À
†
Ù
 ̄
¢
ý 
ù


ö
 
÷
›
ú
û
õ
Õ
’
÷
 òý ó

ý 
À
Ã
+
›

ö
›
ú
û
›
œ
¢

∆
‰
÷
m∆
õ
«
Ù
À
ý 
°

û

p
óýõý 
ù
Ì

û

 ̆
 ̇
∞
 ó

ý 
p⁄

⁄

÷

ý 
∆
«
Ù
  ó
Â

›

¢
 ó

ý 
xÀ
Ã
+
›

ö
›
œ
û
fi
÷
p
ý
∆
¢
«
û

∆
«
Ù
 ú
›

›
∞
!
À
Ã
+
›

ö
›
ú
∞
∆
«
ö
¢
Û›
œ
û
"
∞
Ï
ý 
Õ

+
⁄
#
÷
À
›
◊
$
À
„
÷
ƒ
∞
≈
û
%
Í
&
û
'
ù
(
›
œ
û

Õ
’
÷
À
)
∞
‡
·
fl
›
ú
û
 ́
©

ù
*
+
÷
°

û

°
¢
,
-
¢
 òý ó

ý 
°
£
û
§
+
›

ö
Õ
’
÷
¡
.
¢
ý

 ú
‡
·
œ
û
/
0
¢
1
∞

This document is directly available at http://software.sil.org/awami/design/
and also on their download page, http://software.sil.org/awami/download/



[Libreoffice-bugs] [Bug 62846] Incorrect glyph to Unicode mappings in PDFs (Graphite)

2016-10-17 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=62846

--- Comment #45 from Vera  ---
I can confirm that the bug is present in LibreOffice 5.2.2.2 in Ubuntu 16.04.



[Libreoffice-bugs] [Bug 62846] Incorrect glyph to Unicode mappings in PDFs (Graphite)

2016-09-20 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=62846

--- Comment #44 from Jonathan  ---
I can confirm that the bug is present and the behaviour unchanged in version
5.2.0.4 (Debian build ID 1:5.2.0-2) which is the version installed on my work
notebook. I am away from my development machine and unable to test a more
recent or upstream version for another 2 weeks.



[Libreoffice-bugs] [Bug 62846] Incorrect glyph to Unicode mappings in PDFs (Graphite)

2016-09-20 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=62846

--- Comment #43 from QA Administrators  ---
** Please read this message in its entirety before responding **

To make sure we're focusing on the bugs that affect our users today,
LibreOffice QA is asking bug reporters and confirmers to retest open, confirmed
bugs which have not been touched for over a year.

There have been thousands of bug fixes and commits since anyone checked on this
bug report. During that time, it's possible that the bug has been fixed, or the
details of the problem have changed. We'd really appreciate your help in
getting confirmation that the bug is still present.

If you have time, please do the following:

Test to see if the bug is still present on a currently supported version of
LibreOffice (5.1.5 or 5.2.1): https://www.libreoffice.org/download/

If the bug is present, please leave a comment that includes the version of
LibreOffice and 
your operating system, and any changes you see in the bug behavior

If the bug is NOT present, please set the bug's Status field to
RESOLVED-WORKSFORME and leave 
a short comment that includes your version of LibreOffice and Operating System

Please DO NOT

Update the version field
Reply via email (please reply directly on the bug tracker)
Set the bug's Status field to RESOLVED - FIXED (this status has a particular
meaning that is not 
appropriate in this case)


If you want to do more to help you can test to see if your issue is a
REGRESSION. To do so:
1. Download and install oldest version of LibreOffice (usually 3.3 unless your
bug pertains to a feature added after 3.3)

http://downloadarchive.documentfoundation.org/libreoffice/old/

2. Test your bug
3. Leave a comment with your results.
4a. If the bug was present with 3.3 - set version to "inherited from OOo";
4b. If the bug was not present in 3.3 - add "regression" to keyword


Feel free to come ask questions or to say hello in our QA chat:
http://webchat.freenode.net/?channels=libreoffice-qa

Thank you for helping us make LibreOffice even better for everyone!

Warm Regards,
QA Team

MassPing-UntouchedBug-20160920



[Libreoffice-bugs] [Bug 62846] Incorrect glyph to Unicode mappings in PDFs (Graphite)

2015-08-10 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=62846

Gerry gerry.trep...@googlemail.com changed:

   What    |Removed                     |Added

   Summary |Incorrect glyph to Unicode  |Incorrect glyph to Unicode
           |mappings in PDFs            |mappings in PDFs (Graphite)

--- Comment #42 from Gerry gerry.trep...@googlemail.com ---
(In reply to László Németh from comment #41)
> @Martin, @Gerry: Many thanks for the tests. I think, it's possible to close
> this issue now, thanks to Martin's LibreOffice fix, and I will fix the
> Graphite font problem with numbers in the next Linux Libertine/Biolinum G
> release in the near future.

@László: I just wanted to ask you when you plan to update the Linux
Libertine/Biolinum G fonts to fix the wrong glyph mapping in the PDF output.

Shall the bug be closed already now or after the new font versions are out?

Thanks!
