Hi,
Just to let you know: our programmer has finally come up with a
workaround, which we are sharing in case it is useful to others.
All characters in added content with a code point above 127 (i.e. all
non-ASCII characters) are now converted to HTML entities :-).
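For anyone who wants to try the same idea, here is a minimal sketch in
Python. It is only an illustration and not our programmer's actual
code; the function name to_html_entities is made up for the example,
and it assumes the added content is already available as a correctly
decoded Unicode string.

```python
import html


def to_html_entities(text: str) -> str:
    """Replace every character above code point 127 with a numeric
    HTML entity (e.g. "á" becomes "&#225;"); ASCII is left untouched.

    Hypothetical helper for illustration only.
    """
    # "xmlcharrefreplace" turns each character that cannot be encoded
    # as ASCII into a decimal character reference of the form &#NNN;
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    sample = "Obálky knih"
    encoded = to_html_entities(sample)
    print(encoded)  # Ob&#225;lky knih
    # A browser (or html.unescape) turns the entities back into text.
    print(html.unescape(encoded) == sample)  # True
```

Since the result contains only ASCII characters, it should display the
same way regardless of which character encoding is applied to the page
later on.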
Linda
On 7.9.2016 08:37, Linda Jansova wrote:
Hi,
We are currently developing a module for pulling added content (book
jackets, tables of contents etc.) from our local provider obalkyknih.cz.
While working on it we have come across an encoding issue which we
have not been able to resolve so far. The problem is that all textual
added content is garbled when shown on a record webpage.
Our programmer has run a number of tests to make sure that the
encoding does not go wrong in the module we are developing (a list of
the tests performed is available at
https://bugs.launchpad.net/evergreen/+bug/1610678). We have also
tested Open Library: tables of contents with diacritics are garbled
there as well.
To investigate the issue further, could anyone give us any hints as
to:
* how (and where) the record webpage is actually put together?
* how the AddedContent.pm methods (which provide the added content) are
  called?
One more question: is there any developer documentation available that
describes the Evergreen architecture in more detail?
BTW, could our developer's remark that "interestingly, at a URL of the
form http://evergreen-server/opac/extras/ac/toc/html/r/23225 I could
see the TOC in the correct encoding" be of any use? It seems to me that
if we figured out how the data are processed differently for a record
webpage and for the sample URL above, we could actually hit the nail on
the head...
Thank you in advance for sharing any clues!
Linda