https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=34549
--- Comment #16 from David Cook <[email protected]> ---
(In reply to Martin Renvoize from comment #15)
> Hmm, whilst this certainly resolves the core issue.. I'd have loved to have
> seen some form of warning to the end user that their input data has been
> manipulated.

I agree. I'm not sure of the best way to do that, but I was a bit surprised my patches got pushed without it 😅.

> I'm not close enough to the differences between MARC-8 and UTF-8 encodings
> to know exactly what we're losing during the save.. the test case
> highlighted here is simple.. just dropping a hidden character.. no harm
> done.. however, might there be cases where the mis-encoded string getting
> stripped would result in worse data from the human perspective? It would be
> good to somehow catch these sorts of misconfigurations and try to encourage
> end users to fix them.

Firstly, absolutely.

Secondly, you can find code tables for MARC-8 at
https://www.loc.gov/marc/specifications/specchartables.html

Here's a fun case I encountered the other day:

ö
UTF-8: C3B6
Latin-1: F6

I was accidentally outputting Latin-1 because I'd forgotten to tell Perl to print UTF-8. The ö was then being interpreted as an underscore (i.e. "_") because F6 in MARC-8 is an underscore. And F6 isn't a valid UTF-8 byte in any case, so it would disappear thanks to this change.
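
For anyone who wants to check the byte values in the ö example above, here is a minimal sketch. It is written in Python purely for illustration and is not Koha code; the helper name `is_valid_utf8` and the sample string are made up for this example, and the lenient decode at the end is only an analogy for "invalid bytes get dropped", not the actual mechanism used in the patches.

```python
# Illustrative sketch only; not Koha code.
text = "\u00f6"  # the character ö

utf8_bytes = text.encode("utf-8")      # two bytes: C3 B6
latin1_bytes = text.encode("latin-1")  # one byte: F6

print(utf8_bytes.hex().upper())    # C3B6
print(latin1_bytes.hex().upper())  # F6


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(is_valid_utf8(utf8_bytes))    # True
print(is_valid_utf8(latin1_bytes))  # False: F6 never appears in valid UTF-8

# A lenient decode that discards invalid bytes silently loses the character,
# which is the kind of data loss a warning to the end user would surface.
sample = b"M\xf6bius"  # "Möbius" encoded as Latin-1
print(sample.decode("utf-8", errors="ignore"))  # Mbius
```

The point of the last line is that nothing in the output tells you a character went missing, which is why some kind of warning would be worth having.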
