[Koha-bugs] [Bug 34549] The cataloguing editor allows you to input invalid data

bugzilla-daemon Sun, 22 Oct 2023 16:09:26 -0700

https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=34549


--- Comment #16 from David Cook <[email protected]> ---
(In reply to Martin Renvoize from comment #15)
> Hmm,  whilst this certainly resolves the core issue.. I'd have loved to have
> seen some form of warning to the end user that their input data has been
> manipulated.

I agree. I'm not sure of the best way to do that, but I was a bit surprised my
patches got pushed without it 😅.

> I'm not close enough to the differences between MARC-8 and UTF-8 encodings
> to know exactly what we're losing during the save.. the test case
> highlighted here is simple.. just dropping a hidden character.. no harm
> done.. however, might there be cases where the mis-encoded string getting
> stripped would result in worse data from the human perspective?  It would be
> good to somehow catch these sorts of misconfigurations and try to encourage
> end users to fix them.

Firstly, absolutely.

Secondly, you can find code tables for MARC-8 at
https://www.loc.gov/marc/specifications/specchartables.html 

Here's a fun case I encountered the other day:

ö
UTF-8: C3B6
Latin-1: F6

I was accidentally outputting Latin-1 as I'd forgotten to tell Perl to print
UTF-8. The ö was then being interpreted as an underscore (i.e. "_") because F6
in MARC-8 is an underscore.

And F6 isn't a valid UTF-8 byte in any case. So it would disappear thanks to
this change.
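
To make the byte values above concrete, here is a minimal sketch in Python (not the Perl code from the patches). The helper names are made up for illustration, and using errors="ignore" to stand in for the stripping behaviour is an assumption about how the change behaves, not a description of it.

```python
def encodings_of(ch: str) -> dict:
    """Return the hex bytes of a character in UTF-8 and in Latin-1."""
    return {
        "utf-8": ch.encode("utf-8").hex().upper(),
        "latin-1": ch.encode("latin-1").hex().upper(),
    }


def is_valid_utf8(raw: bytes) -> bool:
    """Report whether the byte string decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def strip_invalid_utf8(raw: bytes) -> str:
    """Decode as UTF-8, silently dropping any bytes that are not valid UTF-8.

    Hypothetical stand-in for the stripping discussed in this bug.
    """
    return raw.decode("utf-8", errors="ignore")


if __name__ == "__main__":
    # "Köln" written out as Latin-1 by mistake: the ö becomes the single byte F6.
    latin1_bytes = "K\u00f6ln".encode("latin-1")

    print(encodings_of("\u00f6"))            # hex bytes for ö in both encodings
    print(is_valid_utf8(latin1_bytes))       # F6 can never occur in valid UTF-8
    print(strip_invalid_utf8(latin1_bytes))  # the ö is dropped without warning
```

The last call is the worrying case from comment #15: the stripped string still looks plausible to a human, so a check like is_valid_utf8 before saving is one place a warning could hook in.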

