As a follow-up, we’ve had the same issue with Chinese characters. Our stop-gap 
workaround, until we can implement a better solution, is to use the New School 
Bulk Updater plugin to export a spreadsheet for the finding aid. That at least 
gives us an easy-to-generate, searchable file outside of AS, even though it’s 
not the same experience as a PDF.

Good luck and we’ll continue to tune into these language-based questions 
related to AS!

-Bailey

Bailey Hoffner, MLIS
Pronouns: she/her or they/them
Metadata and Collections Management Archivist
University of Oklahoma Libraries
bail...@ou.edu

From: <archivesspace_users_group-boun...@lyralists.lyrasis.org> on behalf of 
"Mayo, Dave" <dave_m...@harvard.edu>
Reply-To: Archivesspace Users Group 
<archivesspace_users_group@lyralists.lyrasis.org>
Date: Wednesday, September 14, 2022 at 10:01 PM
To: Archivesspace Users Group <archivesspace_users_group@lyralists.lyrasis.org>
Subject: Re: [Archivesspace_Users_Group] Missing Japanese charactires in a PUI 
generated PDF

Hi!

This is something we’ve recently had to deal with – I’m not 100% sure from what 
you’ve posted that it’s the same issue we had, but there are a few issues with 
the PUI’s current PDF generation support that make font handling challenging.

First of all: if you’re setting up a fallback hierarchy and the font with 
Japanese characters isn’t in the first position, the PDF generation library 
isn’t seeing it at all.  The Flying Saucer PDF library doesn’t support font 
fallback, which is a real problem if you need to support multiple languages.

So the first thing I’d try is making sure that the text in question is styled 
_solely_ with the font that supports Japanese, with no other fonts in the list.  
If the Japanese characters then render, that at least confirms the missing 
fallback is the cause.
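
A minimal sketch of that check in Ruby, assuming you just want a throwaway 
page to run through the PDF generator: it builds XHTML whose body names exactly 
one font family and no fallbacks. The method name and sample text are 
illustrative, and KozMinPro-Regular is simply the font named in the original 
question.

```ruby
# Build a tiny XHTML page whose body uses exactly one font family
# (no fallback list), so the renderer cannot silently pick another font.
# The family name and sample text are placeholders for illustration.
def single_font_test_html(family, sample_text)
  <<~HTML
    <html>
      <head>
        <meta charset="UTF-8"/>
        <style>
          body { font-family: '#{family}'; }
        </style>
      </head>
      <body><p>#{sample_text}</p></body>
    </html>
  HTML
end

html = single_font_test_html('KozMinPro-Regular', '日本語のテスト')
puts html
```

If the sample text shows up in the resulting PDF with a single font but 
disappears once a fallback list is used, that points at the fallback 
limitation rather than at the font file itself.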

Our solution, which I’m hoping to work up and submit as a pull request, was to 
replace the existing library with 
https://github.com/danfickle/openhtmltopdf, a project based on Flying Saucer 
but with several enhancements.  Implementing it is somewhat complex:

1. Openhtmltopdf and _all dependencies thereof_ need to be provided by putting 
them in the archivesspace/lib directory (the directory the MySQL connector goes 
in during install)

Currently we’re doing this in our Dockerfile via:

wget -P /archivesspace/lib https://repo1.maven.org/maven2/com/google/zxing/core/3.5.0/core-3.5.0.jar && \
wget -P /archivesspace/lib https://repo1.maven.org/maven2/junit/junit/4.13.2/junit-4.13.2.jar && \
wget -P /archivesspace/lib https://repo1.maven.org/maven2/com/openhtmltopdf/openhtmltopdf-core/1.0.10/openhtmltopdf-core-1.0.10.jar && \
wget -P /archivesspace/lib https://repo1.maven.org/maven2/com/openhtmltopdf/openhtmltopdf-pdfbox/1.0.10/openhtmltopdf-pdfbox-1.0.10.jar && \
wget -P /archivesspace/lib https://repo1.maven.org/maven2/de/rototor/pdfbox/graphics2d/0.34/graphics2d-0.34.jar && \
wget -P /archivesspace/lib https://repo1.maven.org/maven2/org/apache/pdfbox/pdfbox/2.0.26/pdfbox-2.0.26.jar && \
wget -P /archivesspace/lib https://repo1.maven.org/maven2/org/apache/pdfbox/xmpbox/2.0.26/xmpbox-2.0.26.jar && \
wget -P /archivesspace/lib https://repo1.maven.org/maven2/org/apache/pdfbox/fontbox/2.0.26/fontbox-2.0.26.jar && \
wget -P /archivesspace/lib https://repo1.maven.org/maven2/org/jfree/jfreechart/1.5.3/jfreechart-1.5.3.jar && \
wget -P /archivesspace/lib https://repo1.maven.org/maven2/org/freemarker/freemarker/2.3.27-incubating/freemarker-2.3.27-incubating.jar && \
wget -P /archivesspace/lib https://repo1.maven.org/maven2/org/apache/servicemix/bundles/org.apache.servicemix.bundles.rhino/1.7.10_1/org.apache.servicemix.bundles.rhino-1.7.10_1-sources.jar && \
wget -P /archivesspace/lib https://repo1.maven.org/maven2/org/openjdk/jmh/jmh-core/1.29/jmh-core-1.29.jar && \
wget -P /archivesspace/lib https://repo1.maven.org/maven2/org/codelibs/jhighlight/1.1.0/jhighlight-1.1.0.jar && \
wget -P /archivesspace/lib https://repo1.maven.org/maven2/org/thymeleaf/extras/thymeleaf-extras-java8time/3.0.4.RELEASE/thymeleaf-extras-java8time-3.0.4.RELEASE.jar && \
wget -P /archivesspace/lib https://repo1.maven.org/maven2/org/thymeleaf/thymeleaf/3.1.0.M2/thymeleaf-3.1.0.M2.jar && \
wget -P /archivesspace/lib https://repo1.maven.org/maven2/org/yaml/snakeyaml/1.26/snakeyaml-1.26.jar && \
wget -P /archivesspace/lib https://repo1.maven.org/maven2/com/ibm/icu/icu4j/59.1/icu4j-59.1.jar && \
wget -P /archivesspace/lib https://repo1.maven.org/maven2/org/apache/xmlgraphics/batik-codec/1.14/batik-codec-1.14.jar && \
wget -P /archivesspace/lib https://repo1.maven.org/maven2/org/apache/xmlgraphics/batik-ext/1.14/batik-ext-1.14.jar && \
wget -P /archivesspace/lib https://repo1.maven.org/maven2/org/apache/xmlgraphics/batik-transcoder/1.14/batik-transcoder-1.14.jar && \
wget -P /archivesspace/lib https://repo1.maven.org/maven2/org/apache/xmlgraphics/xmlgraphics-commons/2.7/xmlgraphics-commons-2.7.jar && \
wget -P /archivesspace/lib https://repo1.maven.org/maven2/org/verapdf/validation-model/1.18.8/validation-model-1.18.8.jar && \
wget -P /archivesspace/lib https://repo1.maven.org/maven2/de/rototor/snuggletex/snuggletex-core/1.3.0/snuggletex-core-1.3.0.jar && \
wget -P /archivesspace/lib https://repo1.maven.org/maven2/net/sourceforge/jeuclid/jeuclid-core/3.1.9/jeuclid-core-3.1.9.jar && \

2. Then, the code that generates the PDFs needs to be overridden with code 
based on the new library. We do this in our PUI customization plugin here:

https://github.com/harvard-library/aspace-hvd-pui/blob/bd4b1c3cf728674cc3445dee39a16282848c2cca/public/models/hvd_pdf.rb#L152
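
Going back to step 1 for a moment: since a single missing jar can break PDF 
generation, a quick sanity check of the lib directory may be worth having. 
The following is a minimal Ruby sketch, not part of ArchivesSpace or of the 
plugin linked above. The helper name missing_jars is made up for 
illustration, and the list is only a subset of the file names from the wget 
commands.

```ruby
# A subset of the jar file names from the wget list; extend or adjust
# the versions to match whatever you actually download.
REQUIRED_JARS = %w[
  openhtmltopdf-core-1.0.10.jar
  openhtmltopdf-pdfbox-1.0.10.jar
  pdfbox-2.0.26.jar
  fontbox-2.0.26.jar
  xmpbox-2.0.26.jar
  graphics2d-0.34.jar
].freeze

# Return the required jar names that are not present in lib_dir.
# A directory that does not exist is treated as empty.
def missing_jars(lib_dir, required = REQUIRED_JARS)
  present = Dir.exist?(lib_dir) ? Dir.children(lib_dir) : []
  required.reject { |jar| present.include?(jar) }
end

# Example: report on the directory used in the wget commands.
missing = missing_jars('/archivesspace/lib')
if missing.empty?
  puts 'All listed jars are present.'
else
  puts "Missing jars: #{missing.join(', ')}"
end
```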

We were already overriding PDF generation; the model in core ArchivesSpace is 
located here:

https://github.com/archivesspace/archivesspace/blob/ceeb72d1796a8b67104814065ffea23215403f78/public/app/models/finding_aid_pdf.rb#L94

I believe my co-worker Doug never really got a web font to work. We ended up 
using the Kurinto fonts (and some others) that are provided with ArchivesSpace 
and used by the XSLT PDF processing in the backend:
https://github.com/harvard-library/aspace-hvd-pui/blob/bd4b1c3cf728674cc3445dee39a16282848c2cca/public/models/hvd_pdf.rb#L165
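
For anyone who wants a feel for what the replacement involves before reading 
the linked code, here is a rough JRuby sketch of rendering XHTML to PDF with 
openhtmltopdf while registering a local font file. It is not the code from 
the plugin above. The method name and all arguments are hypothetical, and it 
assumes the openhtmltopdf jars and their dependencies are already on the 
classpath (for example in archivesspace/lib).

```ruby
# JRuby-only sketch: the Java classes are resolved when the method is
# called, so this needs JRuby with the openhtmltopdf jars on the classpath.
# The method name and parameters are illustrative, not ArchivesSpace API.
def render_pdf_with_font(xhtml, out_path, font_path, family)
  require 'java'
  builder = Java::ComOpenhtmltopdfPdfboxout::PdfRendererBuilder.new
  # Register a local font file under the family name used in the CSS.
  builder.useFont(Java::JavaIo::File.new(font_path), family)
  # The second argument is the base URI for relative links; nil if unused.
  builder.withHtmlContent(xhtml, nil)
  out = Java::JavaIo::FileOutputStream.new(out_path)
  begin
    builder.toStream(out)
    builder.run
  ensure
    out.close
  end
  out_path
end

# Hypothetical call; the font path and family name are placeholders:
#   render_pdf_with_font(xhtml, '/tmp/test.pdf',
#                        '/path/to/SomeJapaneseFont.ttf', 'SomeJapaneseFont')
```

The family name passed to useFont has to match the font-family used in the 
CSS of the page being rendered, otherwise the registered font is never 
selected.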

I hope this is somewhat helpful! I very much want to try and package this up in 
a less terrible way, either by getting this incorporated into core or through 
creating a plugin – a plugin would need to either copy the libraries into the 
right place on install or have a manual step of downloading and installing the 
libraries, so it’d be a bit inelegant.
If you have any questions, I’d be happy to try and answer them!

--
Dave Mayo (he/him)
Senior Digital Library Software Engineer
Harvard University > HUIT > LTS

From: <archivesspace_users_group-boun...@lyralists.lyrasis.org> on behalf of 松山 
ひとみ <matsuyam...@nakka-art.jp>
Reply-To: Archivesspace Users Group 
<archivesspace_users_group@lyralists.lyrasis.org>
Date: Tuesday, September 13, 2022 at 9:41 PM
To: "'archivesspace_users_group@lyralists.lyrasis.org'" 
<archivesspace_users_group@lyralists.lyrasis.org>
Subject: [Archivesspace_Users_Group] Missing Japanese charactires in a PUI 
generated PDF

Hi all.

We’ve been struggling with an issue in PUI-generated PDFs: no Japanese 
characters are displayed.
Could anyone tell us what we should try next, or whether anything is wrong in 
our procedure?

Here is what we have tried:

1. Created "./plugins/local/public/views/pdf/_header.html.erb" and edited it.
We confirmed that the CSS was applied.

2. In the style from step 1, we specified these 3 fonts: "serif", "sans-serif", 
and the font iText uses for Japanese:

body {
  font-family: KozMinPro-Regular;
}

3. In addition, we tried loading a font from Google Fonts. It didn’t work 
either.

@import url('https://fonts.googleapis.com/css2?family=Sawarabi+Gothic&display=swap');

body {
  font-family: 'Sawarabi Gothic', sans-serif;
}

We’ve looked through these previous Q&As:
http://lyralists.lyrasis.org/mailman/htdig/archivesspace_users_group/2017-August/005046.html
http://lyralists.lyrasis.org/mailman/htdig/archivesspace_users_group/2017-August/005047.html

We would greatly appreciate your assistance!

Hitomi Matsuyama, Audiovisual Archivist

Nakanoshima Museum of Art, Osaka
4-3-1 Nakanoshima, Kita-ku
Osaka 530-0005 JAPAN
tel. +81 (0)6 64 79 05 58
email. matsuyam...@nakka-art.jp

_______________________________________________
Archivesspace_Users_Group mailing list
Archivesspace_Users_Group@lyralists.lyrasis.org
http://lyralists.lyrasis.org/mailman/listinfo/archivesspace_users_group
