Public bug reported:

➜ example git:(master) ✗ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 16.04.2 LTS
Release: 16.04
Codename: xenial
➜ pdf2xml-viewer git:(master) ✗ pdftohtml
pdftohtml version 0.41.0
Copyright 2005-2016 The Poppler Developers - http://poppler.freedesktop.org
Copyright 1999-2003 Gueorgui Ovtcharov and Rainer Dorsch
Copyright 1996-2011 Glyph & Cog, LLC

poppler-data is already the newest version (0.4.7-7).

➜ example git:(master) ✗ pdffonts test.pdf
name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
OCVNVZ+KaiTi_GB2312                  TrueType          WinAnsi          yes yes yes     19  0
JSRZNG+SimSun                        TrueType          WinAnsi          yes yes yes      8  0
➜

➜ example git:(master) ✗ pdftohtml -c -hidden -enc UTF-8 -xml test.pdf test-utf8.xml
Page-1

I could not get the correct Chinese characters in the output.
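
One rough way to quantify this is to count how many CJK ideographs actually appear in the generated XML. The snippet below is only a sketch: the helper names `count_han` and `han_ratio` are hypothetical and not part of poppler, and it only looks at the basic CJK Unified Ideographs block (U+4E00 to U+9FFF). A count near zero for a document that visibly contains Chinese text would suggest the characters were not mapped to Unicode correctly.

```python
import re
import sys

# Basic CJK Unified Ideographs block only; extension blocks are ignored here.
HAN = re.compile(r"[\u4e00-\u9fff]")


def count_han(text: str) -> int:
    """Return the number of CJK Unified Ideographs found in text."""
    return len(HAN.findall(text))


def han_ratio(text: str) -> float:
    """Return the fraction of non-whitespace characters that are ideographs."""
    visible = [ch for ch in text if not ch.isspace()]
    if not visible:
        return 0.0
    return count_han("".join(visible)) / len(visible)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Assumed usage: pass the path of the XML file written by pdftohtml.
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        data = fh.read()
    print(f"Han characters: {count_han(data)}, ratio: {han_ratio(data):.3f}")
```

Assuming the script is saved under a name such as `check_han.py` (a made-up name), it could be run as `python3 check_han.py test-utf8.xml`.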

The test file is available here:
link: https://pan.baidu.com/s/1dFiSrDn
password: ai5u

** Affects: poppler (Ubuntu)
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Desktop
Packages, which is subscribed to poppler in Ubuntu.
https://bugs.launchpad.net/bugs/1678470

Title:
  pdftohtml could not get correct Chinese Characters

Status in poppler package in Ubuntu:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/poppler/+bug/1678470/+subscriptions
