Exactly what Adam said!

To add to that, there's no single font (or even font family) that has glyphs 
for every single Unicode character.  The Noto font family aims to do that 
"in the future," however, and it already includes a lot of fonts (including 
Noto Sans Thai and Noto Serif Thai) that one would have to install.  See 
https://www.google.com/get/noto/
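
As a rough way to see which scripts (and so which Noto fonts) a given piece of 
text would need, here is a minimal Python sketch.  It uses only the standard 
library and guesses the script from the first word of each character's Unicode 
name, which is a heuristic and not a real script lookup; the function names 
are my own, not anything in ASpace or FOP.

```python
import unicodedata
from collections import Counter


def script_guess(ch):
    """Rough script label: the first word of the character's Unicode name.

    This is only a heuristic. Digits and punctuation come back with labels
    such as "DIGIT" or "FULL", and combining marks come back as "COMBINING".
    """
    try:
        return unicodedata.name(ch).split()[0]
    except ValueError:
        # Code points without a name (e.g. control characters).
        return "UNKNOWN"


def scripts_in(text):
    """Count the non-whitespace characters in text per guessed script."""
    return Counter(script_guess(ch) for ch in text if not ch.isspace())


if __name__ == "__main__":
    # Latin letters followed by three Thai letters (written as escapes).
    sample = "Sapha \u0e2a\u0e20\u0e32"
    for label, count in scripts_in(sample).items():
        print(label, count)
```

Any label other than LATIN in the result is a hint that the default base-14 
fonts won't be enough for that text.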


In any event, ASpace should certainly be updated so that the staff-side PDFs 
have more coverage by default, but the out-of-the-box approach is never going 
to cover everything.  (I also think there needs to be a decision about whether 
the platform supports both EAD to PDF transformations and HTML/CSS to PDF 
transformations.)  Perhaps a good next step would be to update Apache FOP 
(since the version used by ASpace is pretty out of date right now), package 
ASpace with a few of the Noto fonts so that those could be used in place of the 
base-14 fonts (e.g. Times is used by FOP for its "any" font), and update the 
transformation process.  Even then, though, I believe that you would actually 
need to embed the fonts into the PDF file.  If you don't, there's no guarantee 
that whoever opens the PDF file has that font on their computer, so you might 
still wind up with character replacements.  The PDF standard does allow fonts 
to be embedded, though.


Last, EAD3 added language and script data attributes for precisely this sort of 
reason (e.g. if you have one paragraph in English, and another in Arabic, you'd 
need some reliable method to determine when to switch fonts and the direction 
of the text).  ASpace doesn't have that ability yet (although I'm pretty sure 
that AtoM does), but it would be a great addition, as well as a necessary one 
for this sort of reason.  Here's a note from EAD3's tag library:


"Support for multilingual description was addressed by adding @lang and @script 
attributes to all non-empty elements in EAD3, making it possible to explicitly 
state what language or script is used therein. Additionally, some elements were 
modified to allow them to repeat where previously they did not, thus enabling 
the inclusion of the same data in multiple languages."
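
To make the font and direction switching concrete, here is a minimal Python 
sketch of how a transformation could read @script and pick a font and a text 
direction per paragraph.  The XML fragment is simplified (no EAD3 namespace), 
the script-to-font mapping is my own assumption, and attribute inheritance 
from parent elements is not handled.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from ISO 15924 script codes to font families.
FONT_FOR_SCRIPT = {
    "Latn": "Noto Serif",
    "Thai": "Noto Serif Thai",
    "Arab": "Noto Naskh Arabic",
}
RTL_SCRIPTS = {"Arab", "Hebr"}  # scripts written right to left
DEFAULT_SCRIPT = "Latn"         # assumed when @script is absent


def render_hints(xml_text):
    """Return (text, font family, direction) for each <p> in the fragment."""
    hints = []
    for p in ET.fromstring(xml_text).iter("p"):
        script = p.get("script", DEFAULT_SCRIPT)
        # Unknown scripts fall back to the default font.
        font = FONT_FOR_SCRIPT.get(script, FONT_FOR_SCRIPT[DEFAULT_SCRIPT])
        direction = "rtl" if script in RTL_SCRIPTS else "ltr"
        hints.append(((p.text or "").strip(), font, direction))
    return hints


# Simplified, un-namespaced fragment in the spirit of EAD3.
fragment = """<scopecontent>
  <p lang="eng" script="Latn">Correspondence and reports.</p>
  <p lang="ara" script="Arab">مراسلات وتقارير</p>
</scopecontent>"""

if __name__ == "__main__":
    for text, font, direction in render_hints(fragment):
        print(font, direction, text)
```

Without those attributes in the data, the only alternative is to guess the 
script from the characters themselves, which is a lot less reliable.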


So, lots to do, but all worth doing.





________________________________
From: archivesspace_users_group-boun...@lyralists.lyrasis.org 
<archivesspace_users_group-boun...@lyralists.lyrasis.org> on behalf of Adam 
Jazairi <jaza...@bc.edu>
Sent: Thursday, January 24, 2019 1:45:38 PM
To: Archivesspace Users Group
Subject: Re: [Archivesspace_Users_Group] Thai names in Finding Aid PDF

Hi Ed,

This is a problem with the fonts included in the version of Apache FOP that 
ASpace uses. There's an open ticket here: 
https://archivesspace.atlassian.net/browse/ANW-473

We've encountered the same issue when we attempt to generate a PDF finding aid 
containing Irish or Japanese diacritics. An interim solution we've been using 
is to export the EAD, then run Saxon (http://saxon.sourceforge.net/#F9.9HE) 
on it to generate an FO file, then run FOP 1.0 
(https://xmlgraphics.apache.org/fop/1.0/) with the appropriate font on the FO 
file to generate the PDF. It's a bit cumbersome, but it's worked for us so far.

Here's the FOP conf file that we use: 
https://github.com/BCDigLib/bc-aspace/blob/master/fop/fop.xconf

The only catch is that you'll need a font that supports the Unicode characters 
you need. In your case, it looks like Arial v2.95 would work: 
https://en.wikipedia.org/wiki/Arial#TrueType/OpenType_version_history

Hope this helps.

Adam

On Thu, Jan 24, 2019 at 1:14 PM Tang, Lydia <lta...@lib.msu.edu> wrote:
Hi Ed,
The related ticket that I see is here: 
https://archivesspace.atlassian.net/browse/ANW-294?jql=text%20~%20%22pdf%20diacritics%22
It is “closed – completed.”  It doesn’t look like Marcella’s ticket was ever 
created.  Ed, please go ahead and create a new ticket!  Thanks for pointing 
this out!
Lydia – on behalf of Dev. Pri.

From: <archivesspace_users_group-boun...@lyralists.lyrasis.org> on behalf of 
"Busch, Edward" <busch...@msu.edu>
Reply-To: Archivesspace Users Group 
<archivesspace_users_group@lyralists.lyrasis.org>
Date: Thursday, January 24, 2019 at 12:53 PM
To: "'archivesspace_users_group@lyralists.lyrasis.org'" 
<archivesspace_users_group@lyralists.lyrasis.org>
Subject: [Archivesspace_Users_Group] Thai names in Finding Aid PDF

I’m not sure if there is an open ticket on this or not; a quick search didn’t 
reveal anything directly.

Agents with Thai names and diacritics look correct in ASpace, but they do not 
when generated into a PDF finding aid. They end up like:
Saph# K#ns#ks# h#ng Ch#t

I can create a ticket if needed.

Ed Busch, MLIS
Electronic Records Archivist
Michigan State University Archives
Conrad Hall
943 Conrad Road, Room 101
East Lansing, MI 48824
517-884-6438
busch...@msu.edu

_______________________________________________
Archivesspace_Users_Group mailing list
Archivesspace_Users_Group@lyralists.lyrasis.org
http://lyralists.lyrasis.org/mailman/listinfo/archivesspace_users_group


--
Adam Jazairi
Digital Repository Services
Boston College Libraries
(617) 552-1404
adam.jaza...@bc.edu