Hi

Yes, you can set up your own font, and TeX installations are a good source of Type 1 fonts. Here is an example (the paths are specific to my [Ubuntu 20.04] OS and TeX installation):


cmlgc <- Type1Font("cmlgc",
                   rep("/usr/share/texlive/texmf-dist/fonts/afm/public/cm-lgc/fcmr6z.afm", 4),
                   encoding="Cyrillic")
pdfFonts(cmlgc=cmlgc)

x <- '\u410\u411\u412'
pdf("cmlgc.pdf", family="cmlgc", encoding="Cyrillic")
plot(1:10, main = x)
dev.off()

embedFonts("cmlgc.pdf", out="cmlgc-embed.pdf",
           fontpaths="/usr/share/texlive/texmf-dist/fonts/type1/public/cm-lgc/")
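
If you register fonts like this more than once, the setup above could be wrapped in a small helper that fails early when the AFM file is missing. This is only a sketch: register_type1() is a hypothetical name, not part of grDevices, and it assumes a single AFM file is reused for all four faces, as in the example.

```r
# Hypothetical helper (not part of grDevices): register a Type 1 font
# for pdf() from a single AFM file, stopping early if the file is missing.
register_type1 <- function(family, afm, encoding = "default") {
  if (!file.exists(afm)) {
    stop("AFM file not found: ", afm)
  }
  # Reuse the same metrics file for plain, bold, italic and bold-italic
  font <- grDevices::Type1Font(family, rep(afm, 4), encoding = encoding)
  # pdfFonts() expects named arguments, so build a named list first
  args <- list(font)
  names(args) <- family
  do.call(grDevices::pdfFonts, args)
  invisible(font)
}
```

With the paths from my installation, the call would be register_type1("cmlgc", "/usr/share/texlive/texmf-dist/fonts/afm/public/cm-lgc/fcmr6z.afm", "Cyrillic").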


Final result attached.

Thanks for the patch for the unrelated memory problem; I will take a look at that.

Paul

On 24/09/23 09:43, Ivan Krylov wrote:
On Wed, 20 Sep 2023 12:39:50 +0200
Martin Maechler <maech...@stat.math.ethz.ch> wrote:

 > The problem is that some pdf *viewers*,
 > notably `evince` on Fedora Linux, for several years now,
 > do *not* show *some* of the UTF-8 glyphs because they do not use
 > the correct fonts

One more problem that makes it nontrivial to use Unicode with pdf() is
the graphics device not knowing some of the font metrics:

x <- '\u410\u411\u412'
pdf()
plot(1:10, main = x)
# Warning messages:
# 1: In title(...) : font width unknown for character 0xb0
# 2: In title(...) : font width unknown for character 0xe4
# 3: In title(...) : font width unknown for character 0xfc
# 4: In title(...) : font width unknown for character 0x7f
dev.off()

In the resulting PDF file, the three letters are visible, at least in
Evince 3.38.2, but they are all positioned in the same space.
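
To collect these warnings programmatically instead of reading them off the console, something like the following sketch should work. The names msgs and out are illustrative, and which warnings appear (if any) will depend on the locale and the R version.

```r
x <- "\u0410\u0411\u0412"
msgs <- character()
out <- tempfile(fileext = ".pdf")

pdf(out)
withCallingHandlers(
  plot(1:10, main = x),
  warning = function(w) {
    # Record the message, then stop it from reaching the console
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
dev.off()

# Show whatever the device complained about
print(msgs)
```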

I understand that this is strictly speaking not pdf()'s fault
(grDevices contains the font metrics for all standard Adobe fonts and a
few more), but I'm not sure what to do as a user. Should I call
pdfFonts(...), declaring a font with all symbols I need? Where does one
even get Type-1 Cyrillic Helvetica (or any other font) with separate
font metrics files for use with pdf()?
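
For reference, the fonts and encodings that grDevices already knows about can be inspected as sketched below. This only lists what is registered and shipped with R. It does not supply any new metrics, and the variable names are illustrative.

```r
# Families the pdf() device already has registered
known <- names(grDevices::pdfFonts())
print(known)

# The AFM metrics files behind one registered family
helv <- grDevices::pdfFonts("Helvetica")[[1]]
print(helv$metrics)

# Encoding files shipped with grDevices
enc_files <- list.files(system.file("enc", package = "grDevices"))
print(enc_files)
```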

Actually, the wrong number of sometimes random character codes reminds
me of stack garbage. In src/library/grDevices/src/devPS.c, function
static double PostScriptStringWidth, there's this bit of code:

    if(!strIsASCII((char *) str) &&
       /*
        * Every fifth font is a symbol font:
        * see postscriptFonts()
        */
       (face % 5) != 0) {
        R_CheckStack2(strlen((char *)str)+1);
        char buff[strlen((char *)str)+1];
        /* Output string cannot be longer */
        mbcsToSbcs((char *)str, buff, encoding, enc);
        str1 = (unsigned char *)buff;
    }

Later the characters in str1 are iterated over in order to calculate
the total width of the string. I didn't notice this myself until I saw
in the debugger that after a few iterations of the loop, the contents
of str1 are completely different from the result of mbcsToSbcs((char
*)str, buff, encoding, enc), and went to investigate. Only after the
debugger told me that there's no variable called "buff" I realised that
the VLA pointed to by str1 no longer exists.

--- src/library/grDevices/src/devPS.c (revision 85214)
+++ src/library/grDevices/src/devPS.c (working copy)
@@ -721,6 +721,8 @@
unsigned char p1, p2;

int status;
+ /* May be about to allocate */
+ void *alloc = vmaxget();
if(!metrics && (face % 5) != 0) {
/* This is the CID font case, and should only happen for
non-symbol fonts. So we assume monospaced with multipliers.
@@ -755,9 +757,8 @@
* Every fifth font is a symbol font:
* see postscriptFonts()
*/
- (face % 5) != 0) {
- R_CheckStack2(strlen((char *)str)+1);
- char buff[strlen((char *)str)+1];
+ (face % 5) != 0 && metrics) {
+ char *buff = R_alloc(strlen((char *)str)+1, 1);
/* Output string cannot be longer */
mbcsToSbcs((char *)str, buff, encoding, enc);
str1 = (unsigned char *)buff;
@@ -792,6 +793,7 @@
}
}
}
+ vmaxset(alloc);
return 0.001 * sum;
}



After this patch, I'm consistently getting the right character codes in
the warnings, but I still don't know how to set up the font metrics.

--
Best regards,
Ivan

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

--
Dr Paul Murrell
Te Kura Tatauranga | Department of Statistics
Waipapa Taumata Rau | The University of Auckland
Private Bag 92019, Auckland 1142, New Zealand
64 9 3737599 x85392
p...@stat.auckland.ac.nz
www.stat.auckland.ac.nz/~paul/

Attachment: cmlgc-embed.pdf
Description: Adobe PDF document

