[Bug 1772439] Re: Arabic text gets deformed when creating a PDF in LibreOffice Writer

2018-11-07 Thread Bug Watch Updater
** Changed in: df-libreoffice
   Status: New => Confirmed

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1772439

Title:
  Arabic text gets deformed when creating a PDF in LibreOffice Writer

To manage notifications about this bug go to:
https://bugs.launchpad.net/df-libreoffice/+bug/1772439/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1772439] Re: Arabic text gets deformed when creating a PDF in LibreOffice Writer

2018-08-31 Thread Bug Watch Updater
Launchpad has imported 2 comments from the remote bug at
https://bugs.documentfoundation.org/show_bug.cgi?id=119606.

If you reply to an imported comment from within Launchpad, your comment
will be sent to the remote bug automatically. Read more about
Launchpad's inter-bugtracker facilities at
https://help.launchpad.net/InterBugTracking.


On 2018-08-30T12:25:49+00:00 Miikka-Markus Alhonen wrote:

Description:
Creating a PDF from a document written in the Arabic script deforms the textual 
content of the document, although it looks fine on the screen.

For example, see the attached PDF created with Writer 5.4.6.2 on Ubuntu
17.10, where the example sentence "اشترى بلال خمسة آلاف كتاب وَأَنَا
اشْتَرَيْتُهَا مِنْهُ" looks as it should, but when you view it with any
PDF reader, such as evince, copying the text deforms most of the words.
Some characters are clearly visible but cannot be selected or searched
(such as ى at the end of the first word اشترى). If I search for the
second word بلال, evince tells me there are no matches in the document.
The same happens when converting the file with pdftotext, which produces
the following output:

‫اشتر‬

‫للا مسة لفا كتاب وَأَنَا ْ‬
‫ه‬
‫اشت َ َريْتُهَا ِ‬
‫من ْ ُ‬

Here only two of the eight words are intact, the rest are garbled in one
way or another. If the text is in Latin script, both evince and
pdftotext behave as expected, meaning that the textual content is
transferred correctly from Writer to the PDF.

On LO 6.0.3.2 on Ubuntu 18.04, the textual content is preserved a little
better but it is still quite garbled. This is the output from pdftotext:

ه‬
اشترى للا خمسة آفا كتاب وَأنَا اشْ ت َ َريْتُهَا ِ‬
من ْ ُ‬

Here four out of the eight words are intact, and for example the last
word of the sentence is divided so that the last full character is found
on the first line and the rest on the third line. Some diacritics are
found where they are supposed to be, others are not.

MS Word 2007 handles this case better, although it's not perfect either.
This is the output from pdftotext:

اشترى بالل خمسة آالف كتاب وأنا اشتريتها منه‬

Here all diacritics are dropped and all sequences of ل (U+0644) + ا
(U+0627) are reversed turning لا into ال. Otherwise the sentence is
intact.
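
The two differences described for the Word 2007 PDF (dropped diacritics
and reversed ل + ا pairs) can be reproduced on a plain string. The
following Python sketch is only an illustration of that description, not
of what Word or pdftotext actually do internally; the function names are
hypothetical, and it assumes the diacritics are the Unicode nonspacing
marks (category Mn).

```python
import unicodedata

LAM = "\u0644"   # ARABIC LETTER LAM
ALEF = "\u0627"  # ARABIC LETTER ALEF


def strip_diacritics(text: str) -> str:
    """Drop nonspacing marks (category Mn), which includes the Arabic short-vowel signs."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")


def swap_lam_alef(text: str) -> str:
    """Reverse every LAM + ALEF pair, imitating the reordering described above."""
    return text.replace(LAM + ALEF, ALEF + LAM)


def simulate_reported_output(text: str) -> str:
    """Apply both changes (hypothetical helper, for comparison only)."""
    return swap_lam_alef(strip_diacritics(text))
```

Applying simulate_reported_output() to the original sentence and
comparing the result with the text extracted from the PDF would show
whether those two changes account for every difference.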

This bug was first reported on Launchpad for LO 5.4.6.2 on Ubuntu 17.10
at: https://bugs.launchpad.net/ubuntu/+source/libreoffice/+bug/1772439 .
After my initial report, I have upgraded to LO 6.0.3.2 where the problem
persists, although the actual output is different. Another user on
Launchpad confirmed the bug on LO 6.0.3.2, as well.

Steps to Reproduce:
1. In a new Writer document, type some text in Arabic. My example sentence was: 
اشترى بلال خمسة آلاف كتاب وَأَنَا اشْتَرَيْتُهَا مِنْهُ
2. Create a PDF.
3. Open the created PDF with a PDF reader (such as evince) and type one of the 
words in the Search dialog, e.g. بلال. Alternatively select the word in the PDF 
reader and copy-paste it somewhere else. You can also convert the PDF to text 
using a utility like pdftotext.

Actual Results:
The PDF reader reports no matches for some of the words in the
document, although they are all clearly visible. Selecting and copy-pasting a
word garbles it. The output of pdftotext is garbled as well.

Expected Results:
All the words that are visible should also be searchable in a PDF reader, 
copy-pasting should preserve the text, and the output of pdftotext should match 
the original document.
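
The comparison between the original sentence and the pdftotext output
can be automated. Below is a minimal Python sketch, assuming pdftotext
from poppler-utils is installed and that a word counts as intact when it
appears unchanged in the extracted text. The helper names are
hypothetical. Bidi control characters are removed before comparing,
because extracted right-to-left text may contain them.

```python
import subprocess
import unicodedata

# Directional marks, embeddings, overrides and isolates; mapped to None so
# that str.translate() deletes them.
BIDI_CONTROLS = dict.fromkeys(
    map(ord, "\u200e\u200f\u202a\u202b\u202c\u202d\u202e\u2066\u2067\u2068\u2069")
)


def words(text: str) -> list[str]:
    """Normalize to NFC, drop bidi controls, split on whitespace."""
    cleaned = unicodedata.normalize("NFC", text).translate(BIDI_CONTROLS)
    return cleaned.split()


def missing_words(original: str, extracted: str) -> list[str]:
    """Return the words of `original` that do not occur intact in `extracted`."""
    found = set(words(extracted))
    return [w for w in words(original) if w not in found]


def extract_text(pdf_path: str) -> str:
    """Run pdftotext and return its output ("-" sends the text to stdout)."""
    result = subprocess.run(
        ["pdftotext", "-enc", "UTF-8", pdf_path, "-"],
        capture_output=True, check=True, encoding="utf-8",
    )
    return result.stdout
```

With the expected behaviour, missing_words(sentence, extract_text(path))
returns an empty list for the PDF exported from Writer.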


Reproducible: Always


User Profile Reset: No


Additional Info:

Reply at:
https://bugs.launchpad.net/ubuntu/+source/libreoffice/+bug/1772439/comments/6


On 2018-08-30T12:26:47+00:00 Miikka-Markus Alhonen wrote:

Created attachment 144554
PDF created with LO 5.4.6.2 where textual content is garbled

Reply at:
https://bugs.launchpad.net/ubuntu/+source/libreoffice/+bug/1772439/comments/7


** Changed in: df-libreoffice
   Status: Unknown => New

** Changed in: df-libreoffice
   Importance: Unknown => Medium

[Bug 1772439] Re: Arabic text gets deformed when creating a PDF in LibreOffice Writer

2018-08-30 Thread Olivier Tilloy
** Also affects: df-libreoffice via
   https://bugs.documentfoundation.org/show_bug.cgi?id=119606
   Importance: Unknown
   Status: Unknown

[Bug 1772439] Re: Arabic text gets deformed when creating a PDF in LibreOffice Writer

2018-08-30 Thread Miikka-Markus Alhonen
I filed an upstream report at
https://bugs.documentfoundation.org/show_bug.cgi?id=119606

** Bug watch added: Document Foundation Bugzilla #119606
   https://bugs.documentfoundation.org/show_bug.cgi?id=119606

[Bug 1772439] Re: Arabic text gets deformed when creating a PDF in LibreOffice Writer

2018-08-30 Thread Olivier Tilloy
Thanks for the confirmation, Miikka. I'm confirming the bug, based on what I
observe and on your feedback.

This is most likely an upstream issue. Would you mind filing a bug
report at
https://bugs.documentfoundation.org/enter_bug.cgi?product=LibreOffice&format=guided
and linking to it here so we can track its resolution? Thanks!

** Changed in: libreoffice (Ubuntu)
   Status: New => Confirmed

** Changed in: libreoffice (Ubuntu)
   Importance: Undecided => Medium

[Bug 1772439] Re: Arabic text gets deformed when creating a PDF in LibreOffice Writer

2018-08-30 Thread Miikka-Markus Alhonen
I tested the same example sentence with Ubuntu 18.04 and LibreOffice
6.0.3.2. Here’s the output from pdftotext:

ه‬
اشترى للا خمسة آفا كتاب وَأنَا اشْ ت َ َريْتُهَا ِ‬
من ْ ُ‬

Here four out of the eight words are intact, so it’s an improvement over
5.4.6 but still leaves a lot to be desired. The last word of the sentence
(مِنْهُ) is broken into pieces so that the last full character ه is
found on the first line and the two others on the last. Diacritical
marks are sometimes placed where they are supposed to be (such as the first
and the three last diacritics in the word اشْتَرَيْتُهَا) but sometimes
not (the middle of the same word and the last word of the sentence
مِنْهُ). This time ى is visible but the first letter of the following
word ب is not.

Here’s what MS Word 2007 (12.0.6787.5000, SP3 MSO 12.0.6785.5000) on
Windows 8.1 produces when processed by pdftotext:

اشترى بالل خمسة آالف كتاب وأنا اشتريتها منه‬

So Word 2007 drops all the diacritics, and mixes up the order of the
letters in the combination ل (U+0644) + ا (U+0627) producing ال instead
of لا. Otherwise the output is intact and definitely much better than
LO. I don't have any newer versions of MS Word at my disposal, so I
can't test it further.

[Bug 1772439] Re: Arabic text gets deformed when creating a PDF in LibreOffice Writer

2018-08-29 Thread Olivier Tilloy
I'm seeing different results from those you describe when testing on
Ubuntu 18.04 with LibreOffice 6.0.3, although it doesn't behave
exactly as one would expect, either. Can you test on 18.04 and report
here whether the situation is any better?

[Bug 1772439] Re: Arabic text gets deformed when creating a PDF in LibreOffice Writer

2018-08-29 Thread Olivier Tilloy
Sorry for the lack of feedback until now, Miikka.

Are you able to test whether the same thing works correctly in other
word processing software, such as AbiWord or MS Word?
