Hi Tom,

The ArchivesSpace (AS) API uses UTF-8 by default, and AS also checks the database and table encodings to make sure your database is set up correctly. As a data point: across dozens of migrations, making millions of calls to the AS API and sending data in both directions, I have yet to come across a single instance of AS inserting spurious characters into API responses. I have, however, had plenty of encoding issues at the data/database level in those same migrations. I'm fairly confident you'll find the source of those characters if you look at the raw data.
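To illustrate what to look for in the raw data, here is a minimal Python sketch of how \u00c3 followed by \u00a2 typically arises: text that was valid UTF-8 gets decoded as Latin-1 somewhere upstream and is then stored again. This is only an illustration, not ArchivesSpace code. The function names and the sample string are hypothetical, and the Latin-1 step is an assumption (Windows-1252 is another common culprit and behaves slightly differently for some bytes).

```python
import json


def mojibake(text: str) -> str:
    """Simulate the fault: UTF-8 bytes wrongly decoded as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def looks_double_encoded(text: str) -> bool:
    """Heuristic check: non-ASCII text whose Latin-1 bytes form valid UTF-8.

    Correctly stored accented text normally fails this round trip,
    while text damaged as in mojibake() passes it.
    """
    if text.isascii():
        return False
    try:
        text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return False
    return True


def repair(text: str) -> str:
    """Undo the damage, assuming the Latin-1 mix-up described above."""
    return text.encode("latin-1").decode("utf-8")


original = "ch\u00e2teau"      # hypothetical sample value containing "â"
damaged = mojibake(original)   # what ends up stored in the database

# json.dumps escapes non-ASCII by default, much like API output often does
print(json.dumps(damaged))     # "ch\u00c3\u00a2teau"
print(looks_double_encoded(damaged), looks_double_encoded(original))
print(repair(damaged) == original)
```

Running a check like `looks_double_encoded` over the values pulled straight from the database (rather than over the API output) should show whether the extra characters are already present before the API ever touches them.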
p

________________________________
From: archivesspace_users_group-boun...@lyralists.lyrasis.org on behalf of Tom Hanstra <hans...@nd.edu>
Sent: 03 September 2021 18:09
To: Archivesspace Users Group <archivesspace_users_group@lyralists.lyrasis.org>
Subject: Re: [Archivesspace_Users_Group] API output - extra unicode

Brian (and others),

The data in the database should be UTF-8 as far as I can tell. So, I think this has to be happening at the API export level. Is there anything specific that needs to be done to have the API know that this is UTF-8 data?

Tom

On Fri, Sep 3, 2021 at 11:42 AM Brian Harrington <brian.harring...@lyrasis.org> wrote:

Hi Tom,

In my experience \u00c3 appearing in anything is almost always a sign of encoding issues. I would make sure that everything is UTF-8 all the way through.

Brian

From: archivesspace_users_group-boun...@lyralists.lyrasis.org on behalf of Tom Hanstra <hans...@nd.edu>
Reply-To: Archivesspace Users Group <archivesspace_users_group@lyralists.lyrasis.org>
Date: Friday, September 3, 2021 at 11:06 AM
To: Archivesspace Users Group <archivesspace_users_group@lyralists.lyrasis.org>
Subject: [Archivesspace_Users_Group] API output - extra unicode

On our local version of ArchivesSpace, we are testing API output and are finding that we are getting extra Unicode characters on export. It looks like the data is right in the database, but doesn't quite come out right from the API extract. It looks like there is an extra unicode character added (in some of the code we reviewed, this was either \u00c3 or \u00a2). Where might we have something set incorrectly? Where might the extra data be coming from or have been introduced along the way?
Thanks,
Tom

--
Tom Hanstra
Sr. Systems Administrator
hans...@nd.edu

_______________________________________________
Archivesspace_Users_Group mailing list
Archivesspace_Users_Group@lyralists.lyrasis.org
http://lyralists.lyrasis.org/mailman/listinfo/archivesspace_users_group