I tested the same example sentence with Ubuntu 18.04 and LibreOffice
6.0.3.2. Here’s the output from pdftotext:

ه‬
اشترى للا خمسة آفا كتاب وَأنَا اشْ ت َ َريْتُهَا ِ‬
من ْ ُ‬

Here four out of the eight words are intact, so it’s an improvement over
5.4.6 but still leaves a lot to be desired. The last word of the sentence
(مِنْهُ) is broken into pieces so that the last full character ه is
found on the first line and the two others on the last. Diacritical
marks are sometimes placed where they are supposed to be (such as the first
and the three last diacritics in the word اشْتَرَيْتُهَا) but sometimes
not (the middle of the same word and the last word of the sentence
مِنْهُ). This time ى is visible but the first letter of the following
word ب is not.

Here’s what MS Word 2007 (12.0.6787.5000, SP3 MSO 12.0.6785.5000) on
Windows 8.1 produces when processed by pdftotext:

اشترى بالل خمسة آالف كتاب وأنا اشتريتها منه‬

So Word 2007 drops all the diacritics, and mixes up the order of the
letters in the combination ل (U+0644) + ا (U+0627) producing ال instead
of لا. Otherwise the output is intact and definitely much better than
LO. I don't have any newer versions of MS Word at my disposal, so I
can't test it further.
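
For anyone who wants to repeat the comparison without counting words by
eye, here is a minimal Python sketch. It is not part of any existing tool:
the function names, the ignore_diacritics flag, and the set of bidi
control characters are my own assumptions. It takes the expected sentence
and the text that pdftotext produced, and lists the expected words that
appear unchanged in the output.

```python
import unicodedata

# Bidi control characters that may wrap right-to-left runs in extracted text
# (assumed set: LRM, RLM, LRE, RLE, PDF, LRO, RLO). Mapping to None deletes them.
BIDI_CONTROLS = dict.fromkeys(map(ord, "\u200e\u200f\u202a\u202b\u202c\u202d\u202e"))


def strip_diacritics(text):
    """Drop nonspacing marks (category Mn), which covers the Arabic harakat."""
    composed = unicodedata.normalize("NFC", text)
    return "".join(ch for ch in composed if unicodedata.category(ch) != "Mn")


def intact_words(expected, extracted, ignore_diacritics=False):
    """Return the words of `expected` that occur verbatim in `extracted`."""
    extracted = extracted.translate(BIDI_CONTROLS)
    if ignore_diacritics:
        expected = strip_diacritics(expected)
        extracted = strip_diacritics(extracted)
    found = set(extracted.split())
    return [word for word in expected.split() if word in found]
```

With ignore_diacritics=True the check tolerates output like Word's, where
the diacritics are dropped, and still flags words whose letters were
reordered. It only compares whole whitespace-separated words, so it says
nothing about which part of a word was damaged.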

-- 
You received this bug notification because you are a member of Desktop
Packages, which is subscribed to libreoffice in Ubuntu.
https://bugs.launchpad.net/bugs/1772439

Title:
  Arabic text gets deformed when creating a PDF in LibreOffice Writer

Status in libreoffice package in Ubuntu:
  New

Bug description:
  Creating a PDF from a document written in the Arabic script deforms
  the textual content of the document, although it looks fine on the
  screen.

  For example, see the attached PDF created with Writer, where the
  example sentence "اشترى بلال خمسة آلاف كتاب وَأَنَا اشْتَرَيْتُهَا
  مِنْهُ" looks as it should, but when you view it with any PDF reader,
  such as evince, copying the text deforms most of the words. Some
  characters are clearly visible but cannot be selected or searched
  (such as ى at the end of the first word اشترى). If I search for the
  second word بلال, evince tells me there are no matches in the
  document. The same happens when converting the file with pdftotext,
  which produces the following output:

  ‫اشتر‬

  ‫للا مسة لفا كتاب وَأَنَا ْ‬
  ‫ه‬
  ‫اشت َ َريْتُهَا ِ‬
  ‫من ْ ُ‬

  Here only two of the seven words are intact, the rest are garbled in
  one way or another. If the text is in Latin script, both evince and
  pdftotext behave as expected, meaning that the textual content is
  transferred correctly from Writer to the PDF.

  Description:  Ubuntu 17.10
  Release:      17.10

  libreoffice-writer:
    Installed: 1:5.4.6-0ubuntu0.17.10.1
    Candidate: 1:5.4.6-0ubuntu0.17.10.1
    Version table:
   *** 1:5.4.6-0ubuntu0.17.10.1 500
          500 http://mr.archive.ubuntu.com/ubuntu artful-updates/main amd64 Packages
          100 /var/lib/dpkg/status
       1:5.4.5-0ubuntu0.17.10.5 500
          500 http://security.ubuntu.com/ubuntu artful-security/main amd64 Packages
       1:5.4.1-0ubuntu1 500
          500 http://mr.archive.ubuntu.com/ubuntu artful/main amd64 Packages

  ProblemType: Bug
  DistroRelease: Ubuntu 17.10
  Package: libreoffice-writer 1:5.4.6-0ubuntu0.17.10.1
  ProcVersionSignature: Ubuntu 4.13.0-41.46-generic 4.13.16
  Uname: Linux 4.13.0-41-generic x86_64
  ApportVersion: 2.20.7-0ubuntu3.8
  Architecture: amd64
  CurrentDesktop: ubuntu:GNOME
  Date: Mon May 21 15:18:41 2018
  InstallationDate: Installed on 2017-02-13 (462 days ago)
  InstallationMedia: Ubuntu 16.10 "Yakkety Yak" - Release amd64 (20161012.2)
  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=<set>
   LANG=fi_FI.UTF-8
   SHELL=/bin/bash
  SourcePackage: libreoffice
  UpgradeStatus: Upgraded to artful on 2017-11-05 (196 days ago)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/libreoffice/+bug/1772439/+subscriptions

-- 
Mailing list: https://launchpad.net/~desktop-packages
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~desktop-packages
More help   : https://help.launchpad.net/ListHelp
