Hi,

Our DSpace 4.2 Discovery search results display snippets from each item's
full-text PDF extract, but we get mojibake (garbled characters) in the
summaries (see attached photo).  Browsing to the item's PDF-extracted text
bitstream does show the garbled characters, and Firefox's developer
tools report the encoding as ISO-8859-1.  What's strange is that if I
download the file, the resulting file is UTF-8 and these characters display
properly.
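
To illustrate the symptom: it looks consistent with UTF-8 bytes being
decoded as ISO-8859-1 at display time. A minimal Python sketch of that
effect (the sample string is made up for illustration, and this is not
DSpace code):

```python
# Illustration only: the sample text is hypothetical, not taken from our repository.
original = "café"

# The bytes as they would be stored if the extracted text is UTF-8.
utf8_bytes = original.encode("utf-8")

# Decoding those same bytes as ISO-8859-1 turns each multi-byte
# character into two separate characters (mojibake).
as_latin1 = utf8_bytes.decode("iso-8859-1")

# Decoding them as UTF-8 gives back the original text.
as_utf8 = utf8_bytes.decode("utf-8")

print(repr(as_latin1))
print(repr(as_utf8))
```

If that is what is happening, the stored bytes themselves would be fine
and only the charset assumed when reading them would be wrong, which
would match the downloaded file displaying correctly.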

I have tried the following:
- Confirmed our Tomcat connectors are using URIEncoding="UTF-8"
- Forced "-Dfile.encoding=UTF-8" in JAVA_OPTS and manually re-ran
`filter-media' as well as `index-discovery -b'

What could I be missing?

Thanks!

-- 
Alan Orth
alan.o...@gmail.com
https://alaninkenya.org
https://mjanja.ch
"In heaven all the interesting people are missing." -Friedrich Nietzsche
GPG public key ID: 0x8cb0d0acb5cd81ec209c6cdfbd1a0e09c2f836c0
_______________________________________________
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech
List Etiquette: https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette