RE: UNICODE character identification

2015-02-10 Thread Kool,Wouter
...@gmail.com] Sent: Tuesday, 10 February 2015 13:27 To: perl4lib@perl.org Subject: UNICODE character identification Hello friendly folks, here is what I am trying to do, and I am looking for your help to find the cleverest way to achieve it: We have records that include typos like this: we

Re: UNICODE character identification

2015-02-10 Thread George Milten
15:56 *To:* Kool,Wouter *Cc:* perl4lib@perl.org *Subject:* Re: UNICODE character identification UTF-8, thank you 2015-02-10 16:54 GMT+02:00 Kool,Wouter wouter.k...@oclc.org: What encoding is your data in? UTF-8? Single-byte encoding? MARC-8? That information matters a lot

Re: UNICODE character identification

2015-02-10 Thread George Milten
:* Re: UNICODE character identification Yes, probably this is where I was also heading, but I thought there was a cleverer way. Also, is there a good Perl normaliser? I have not had any experience with: http://search.cpan.org/~sadahiro/Unicode-Normalize-1.18/Normalize.pm For starters
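
On the normaliser question, a small sketch may help set expectations. It is written in Python rather than Perl purely for illustration (Unicode::Normalize exports the analogous NFC/NFD/NFKC/NFKD functions), and the helper name `same_after_nfc` is hypothetical. The point it tries to show: normalization reconciles composed and decomposed forms of the same character, but it does not turn a Greek omicron into a Latin o, so it cannot fix this kind of typo on its own.

```python
import unicodedata

def same_after_nfc(a, b):
    """Compare two strings after NFC normalization (hypothetical helper)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

latin_word = "Plato"          # all Latin letters
mixed_word = "Plat\u03bf"     # last letter is U+03BF GREEK SMALL LETTER OMICRON

# Normalization handles composed vs. decomposed forms of the same character...
print(same_after_nfc("\u00e9", "e\u0301"))

# ...but a Greek omicron is a different character, not a variant of Latin o.
print(same_after_nfc(latin_word, mixed_word))

# None of the four normalization forms changes the omicron.
for form in ("NFC", "NFD", "NFKC", "NFKD"):
    print(form, unicodedata.normalize(form, "\u03bf") == "\u03bf")
```

Normalizing first is still reasonable housekeeping before any character-by-character comparison, since it removes differences that are only about encoding form.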

Re: UNICODE character identification

2015-02-10 Thread George Milten
http://www.oclc.org/ *From:* George Milten [mailto:george.mil...@gmail.com] *Sent:* Tuesday, 10 February 2015 13:27 *To:* perl4lib@perl.org *Subject:* UNICODE character identification Hello friendly folks, here is what I am trying to do, and I am

RE: UNICODE character identification

2015-02-10 Thread Kool,Wouter
that most characters match and you look for the exceptions. Would that help? From: George Milten [mailto:george.mil...@gmail.com] Sent: Tuesday, 10 February 2015 15:56 To: Kool,Wouter Cc: perl4lib@perl.org Subject: Re: UNICODE character identification UTF-8, thank you 2015-02-10 16:54 GMT+02:00

UNICODE character identification

2015-02-10 Thread George Milten
Hello friendly folks, here is what I am trying to do, and I am looking for your help to find the cleverest way to achieve it: We have records that include typos like this: we have a word, say Plato, where the last o was typed with the keyboard set to Greek, so we need
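
One way to read the suggestion made in the thread (most characters in a word match, so look for the exceptions) is sketched below. This is only an illustration under stated assumptions, not the method anyone in the thread actually used: it is in Python rather than Perl, the function names `script_of` and `odd_characters` are made up, and the "script" of a letter is approximated by the first word of its Unicode character name (LATIN, GREEK, CYRILLIC, ...). The input is assumed to be already decoded from UTF-8 into a Unicode string.

```python
import unicodedata
from collections import Counter

def script_of(ch):
    """Approximate a letter's script by the first word of its Unicode name.

    Returns None for non-letters and for characters without a name.
    """
    if not ch.isalpha():
        return None
    name = unicodedata.name(ch, "")
    return name.split(" ")[0] if name else None

def odd_characters(word):
    """Return (index, character, Unicode name) for letters whose script
    differs from the majority script of the word."""
    tagged = [(i, ch, script_of(ch)) for i, ch in enumerate(word)]
    counts = Counter(s for _, _, s in tagged if s)
    if len(counts) < 2:
        return []  # a single script (or no letters): nothing suspicious
    # On a tie, the script seen first in the word is treated as the majority.
    majority = counts.most_common(1)[0][0]
    return [(i, ch, unicodedata.name(ch))
            for i, ch, s in tagged if s and s != majority]

# "Plato" typed with a Greek omicron (U+03BF) as the last letter.
print(odd_characters("Plat\u03bf"))  # [(4, 'ο', 'GREEK SMALL LETTER OMICRON')]
print(odd_characters("Plato"))       # []
```

The flagged positions could then be reviewed by hand or fed to a lookup table of visually similar Greek and Latin letters. Legitimately mixed-script words would also be flagged, so the output is best treated as a list of candidates rather than confirmed typos.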