Re: [NTG-context] searchable PDF with MinionPro under mkiv
> How can I generate a searchable PDF with mkiv, using a non standard font like MinionPro?
>
>   \definefontfeature [default] [default] [mode=node,script=latn,onum=yes]
>   \usemodule[simplefonts]
>   \setmainfont[minionpro]
>   \starttext
>   fi ff ffi ffl 1234567890
>   \stoptext
>
> Using pdftotext, I get this:
>
>   fi ff ffi ffl

Hi Oliver,

it works for me with the beta 2011.01.12 and 2011.01.14 and poppler-0.14.5/poppler-0.16.0. However, it turns out that pdftotext converts to

  fi ff ffi ffl 1234567890

splitting the fi ligature while leaving ff, ffi and ffl intact, which is strange. I did not try with Adobe Reader, but the PDF is searchable with Apple Preview and the pasted copy is still intact:

  fi ff ffi ffl 1234567890

Florian

___
If your question is of interest to others as well, please add an entry to the Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
archive  : http://foundry.supelec.fr/projects/contextrev/
wiki     : http://contextgarden.net
___
Re: [NTG-context] searchable PDF with MinionPro under mkiv
Hi Florian,

Florian Wobbe florian.wo...@awi.de writes:

> it works for me with the beta 2011.01.12 and 2011.01.14 and poppler-0.14.5/poppler-0.16.0. However, it turns out that pdftotext converts to
>
>   fi ff ffi ffl 1234567890
>
> splitting fi ligature while leaving ff, ffi and ffl intact, which is strange. I did not try with Adobe Reader but the pdf is searchable with Apple Preview and the pasted copy is still intact:
>
>   fi ff ffi ffl 1234567890

For me, it still doesn't work. I get oldstyle numbers in the text, but the numbers are not searchable in Adobe Reader, okular, evince or xpdf.

However, I figured out that it is my version of the font that causes the wrong result:

  $ otfinfo -i /usr/local/share/fonts/MinionPro_Regular.otf
  Family: Minion Pro
  Subfamily: Regular
  Full name: Minion Pro
  PostScript name: MinionPro-Regular
  Version: OTF 1.011;PS 001.000;Core 1.0.27;makeotf.lib1.3.1
  Unique ID: 1.011;ADBE;MinionPro-Regular
  Designer: Robert Slimbach
  Vendor URL: http://www.adobe.com/type/
  Trademark: Minion is either a registered trademark or a trademark of Adobe Systems Incorporated in the United States and/or other countries.
  Copyright: © 2000 Adobe Systems Incorporated. All Rights Reserved. U.S. Patent Des. 337,604. Other patents pending.
  License URL: http://www.adobe.com/type/legal.html

When using the MinionPro fonts shipped with Adobe Reader, I get the same results as you:

  $ otfinfo -i /usr/local/share/fonts/MinionPro-Regular.otf
  Family: Minion Pro
  Subfamily: Regular
  Full name: Minion Pro
  PostScript name: MinionPro-Regular
  Version: Version 2.068;PS 2.000;hotconv 1.0.57; makeotf.lib2.0.21895
  Unique ID: 2.068;ADBE;MinionPro-Regular
  Designer: Robert Slimbach
  Manufacturer: Adobe Systems Incorporated
  Vendor URL: http://www.adobe.com/type/
  Trademark: Minion is either a registered trademark or a trademark of Adobe Systems Incorporated in the United States and/or other countries.
  Copyright: © 1990, 1991, 1992, 1994, 1997, 1998, 2000, 2002, 2004 Adobe Systems Incorporated. All rights reserved.
  License URL: http://www.adobe.com/type/legal.html

Should this be considered a bug in the font?

Best regards,
olli

--
Oliver Heins he...@sopos.org http://www.sopos.org/olli
GPG: F27A BA8C 1CFB B905 65A8 2544 0F07 B675 9A00 D827
1024D/9A00D827 2004-09-24 -- gpg --recv-keys 0x9A00D827
Please avoid sending me Word or PowerPoint attachments:
http://www.gnu.org/philosophy/no-word-attachments.html
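The version comparison done by eye in this message can also be scripted. What follows is a minimal Python sketch and not something posted in the thread. It assumes that `otfinfo` (from lcdf-typetools) is on the PATH and prints `Key: value` lines as in the two listings in this message. The helper names `parse_otfinfo`, `font_version` and `read_font_info` are hypothetical.

```python
import re
import subprocess


def parse_otfinfo(output):
    """Turn the 'Key: value' lines printed by `otfinfo -i` into a dict.

    Lines without a colon (for example wrapped continuation text) are skipped.
    """
    info = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def font_version(info):
    """Return the first 'N.NNN' number in the Version field as a float, or None."""
    match = re.search(r"\d+\.\d+", info.get("Version", ""))
    return float(match.group(0)) if match else None


def read_font_info(path):
    """Run `otfinfo -i` on a font file and parse what it prints."""
    result = subprocess.run(
        ["otfinfo", "-i", path],
        capture_output=True,
        encoding="utf-8",
        check=True,
    )
    return parse_otfinfo(result.stdout)
```

With the two fonts from this message, `font_version(read_font_info(path))` would be expected to give 1.011 for the older file and 2.068 for the one shipped with Adobe Reader, so a build script could warn when it picks up the older font.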
Re: [NTG-context] searchable PDF with MinionPro under mkiv
>> However, it turns out that pdftotext converts to
>>
>>   fi ff ffi ffl 1234567890
>>
>> splitting fi ligature while leaving ff, ffi and ffl intact, which is strange. I did not try with Adobe Reader but the pdf is searchable with Apple Preview and the pasted copy is still intact:
>>
>>   fi ff ffi ffl 1234567890
>
> For me, it still doesn't work. I get oldstyle numbers in the text, and neither in Adobe Reader nor in okular, evince or xpdf the numbers are searchable.
>
> However, I figured out that it is my version of the font causing the wrong result.

You are right! I had not considered that. Depending on the font used, pdftotext expands (some of) the ligatures or not. With TeXGyre Pagella, for instance, there is no ligature expansion at all:

  fi ff ffi ffl 1234567890

and with Cambria I get a PDF that is not searchable with Preview:

  ũi ff fũi fũl 1234567890

Florian
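Because pdftotext expands ligatures for some fonts and leaves them as single code points for others, a plain substring search over extracted text can miss a match. One way to compare such text is Unicode NFKC normalization, which maps the Latin ligature code points U+FB00 to U+FB04 onto their letter sequences. This is offered as an assumption about how one might post-process the output, not as something proposed in the thread. The function names below are hypothetical.

```python
import unicodedata


def normalize_for_search(text):
    """Apply NFKC so that ligature code points become plain letter sequences."""
    return unicodedata.normalize("NFKC", text)


def contains(haystack, needle):
    """Substring test that ignores whether ligatures were expanded or not."""
    return normalize_for_search(needle) in normalize_for_search(haystack)
```

For example, `contains("\ufb01 \ufb00 \ufb03 \ufb04", "ffi")` is true even though the haystack holds only ligature code points. This helps only when post-processing extracted text: it does not change the PDF itself or what a viewer's search finds, and it does nothing for output like the Cambria case, where the extracted characters are simply wrong. NFKC also folds other compatibility characters, which may or may not be wanted.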
[NTG-context] searchable PDF with MinionPro under mkiv
How can I generate a searchable PDF with mkiv, using a non standard font like MinionPro?

  \definefontfeature [default] [default] [mode=node,script=latn,onum=yes]
  \usemodule[simplefonts]
  \setmainfont[minionpro]
  \starttext
  fi ff ffi ffl 1234567890
  \stoptext

Using pdftotext, I get this:

  fi ff ffi ffl

However, using Adobe Reader, these things won't be found. It should read:

  fi ff ffi ffl 1234567890

Using LaTeX, one would use

  \input glyphtounicode.tex
  \pdfgentounicode=1

but this doesn't seem to work with ConTeXt. ConTeXt used pdfr-def, but this seems to be mkii-only.

TIA,
olli
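The manual check in this message (run pdftotext, then see which strings survive) can be automated. The sketch below is a hedged illustration, not part of the thread: it assumes `pdftotext` is on the PATH, uses its `-enc UTF-8` option and `-` for standard output, and the names `extract_text` and `missing_strings` are hypothetical.

```python
import subprocess


def extract_text(pdf_path):
    """Run pdftotext on a PDF and return the text it writes to stdout."""
    result = subprocess.run(
        ["pdftotext", "-enc", "UTF-8", pdf_path, "-"],
        capture_output=True,
        encoding="utf-8",
        check=True,
    )
    return result.stdout


def missing_strings(extracted, expected):
    """Return the expected strings that do not occur in the extracted text."""
    return [s for s in expected if s not in extracted]
```

Called as `missing_strings(extract_text("test.pdf"), ["fi", "ff", "ffi", "ffl", "1234567890"])`, where `test.pdf` is a placeholder name, it would report `["1234567890"]` for output like the one quoted in this message, where the ligatures come through but the digits do not.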
Re: [NTG-context] searchable PDF with MinionPro under mkiv
2011/1/17 Oliver Heins o...@sopos.org:

> How can I generate a searchable PDF with mkiv, using a non standard font like MinionPro?
>
>   \definefontfeature [default] [default] [mode=node,script=latn,onum=yes]
>   \usemodule[simplefonts]
>   \setmainfont[minionpro]
>   \starttext
>   fi ff ffi ffl 1234567890
>   \stoptext
>
> Using pdftotext, I get this:
>
>   fi ff ffi ffl
>
> However, using Adobe Reader this things won't be found. It should read:
>
>   fi ff ffi ffl 1234567890
>
> Using latex, one would use \input glyphtounicode.tex \pdfgentounicode=1, but this doesn't seem to work with context. Context used pdfr-def, but this seems to be mkii-only.

Hi Oliver,

Your example works for me with the beta 2011.01.14 and pdftotext-0.16.0. Which version of ConTeXt MkIV are you using?

Your problem looks like http://www.ntg.nl/pipermail/ntg-context/2010/052259.html, but that one has been solved by Taco.

--
Best regards,
Li Yanrui (李延瑞)
Re: [NTG-context] searchable PDF with MinionPro under mkiv
Li Yanrui (李延瑞) liyanrui...@gmail.com writes:

> Your example works for me with the beta 2011.01.14 and pdftotext-0.16.0. Which version is your ConTeXt MkIV?
>
> Your problem looks like http://www.ntg.nl/pipermail/ntg-context/2010/052259.html; but that one has been solved by Taco.

Hi Li,

  ConTeXt ver: 2011.01.12 10:20 MKIV fmt: 2011.1.12

This is quite a recent version; nevertheless, I updated my minimals, so now I have:

  ConTeXt ver: 2011.01.14 14:44 MKIV fmt: 2011.1.17

The result stays the same. My pdftotext is an older version (0.12.4), but that shouldn't be a problem.

Adobe Reader, evince and xpdf are able to find the ligatures, but not the numbers. okular even fails to find the ligatures, but I would consider this a bug in okular.

Best regards,
olli