Re: UNICODE character identification

2015-02-10 Thread George Milten
I have not tested it, but I suppose /[^\p{Latin}]/ would match any non-Latin character. So you find the character class that most characters match, and you look for the exceptions. Would that help?
*From:* George Milten [mailto:george.mil...@gmail.com] *Sent:* Tuesday 10 February 2015
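A minimal sketch of that suggestion, untested against real catalogue records. Note that /[^\p{Latin}]/ on its own also matches spaces, digits and punctuation, so this version narrows the class to letters only. The helper name `non_latin_letters` is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return every letter in $text that is not in the
# Latin script. The class [^\p{Latin}\P{L}] reads as "neither Latin nor a
# non-letter", i.e. a letter from some other script.
sub non_latin_letters {
    my ($text) = @_;
    my @found = $text =~ /([^\p{Latin}\P{L}])/g;
    return @found;
}

# "Plato" with the final o replaced by GREEK SMALL LETTER OMICRON (U+03BF).
my $word = "Plat\x{03BF}";
printf "U+%04X\n", ord $_ for non_latin_letters($word);   # prints U+03BF
```

Printing the code point rather than the character itself makes the look-alike visible, since the Greek omicron and the Latin o are hard to tell apart on screen.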

Re: UNICODE character identification

2015-02-10 Thread George Milten
scripts and use the classifier to find hybrid cases. I have had quite satisfactory results with this approach in a slightly different use case.
*From:* George Milten [mailto:george.mil...@gmail.com] *Sent:* Tuesday 10 February 2015 16:09 *To:* Kool,Wouter *Cc:* perl4lib@perl.org *Subject

Re: UNICODE character identification

2015-02-10 Thread George Milten
http://www.oclc.org/
*From:* George Milten [mailto:george.mil...@gmail.com] *Sent:* Tuesday 10 February 2015 13:27 *To:* perl4lib@perl.org *Subject:* UNICODE character identification
Hello friendly folks, what follows is what I am trying to do, and I am

UNICODE character identification

2015-02-10 Thread George Milten
Hello friendly folks, what follows is what I am trying to do, and I am looking for your help in finding the cleverest way to achieve it. We have records that include typos like this: take a word, say Plato, where the last o was typed with the keyboard set to Greek, so we need
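One possible way to flag the kind of typo described above is to look for words that contain letters from both scripts. This is only a sketch: it assumes the records are already decoded into Perl character strings, it only checks for the Latin and Greek pair, and the helper name `mixed_script_words` is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return the words in $text that contain at least one
# Latin letter and at least one Greek letter.
sub mixed_script_words {
    my ($text) = @_;
    return grep { /\p{Latin}/ && /\p{Greek}/ } $text =~ /(\p{L}+)/g;
}

# The first word ends in GREEK SMALL LETTER OMICRON (U+03BF); the rest is Latin.
my @suspects = mixed_script_words("Plat\x{03BF} wrote the Republic");

binmode STDOUT, ':encoding(UTF-8)';
print "$_\n" for @suspects;
```

Words written entirely in Greek are left alone, so legitimately Greek titles or names in a record would not be reported.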

script help list all files in folders and subfolders

2015-03-31 Thread George Milten
Hello friendly folks, I would appreciate any help on the following. Say we have a folder with thousands of HTML files. Since the file browser crashes, I am looking at writing a script that would do the following: distribute all HTML files into folders, say 001, 002, 003, etc., sorted by the html
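A rough sketch of the distribution step, under stated assumptions: the message is cut off before it names the sort key, so this version simply sorts by file name, and the folder size is a free parameter. The helper names `html_files` and `plan_folders` are made up for illustration. The code only plans the layout and does not move anything.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: collect every .htm/.html file under $root,
# including subfolders.
sub html_files {
    my ($root) = @_;
    my @files;
    find(sub { push @files, $File::Find::name if -f $_ && /\.html?\z/i }, $root);
    return @files;
}

# Hypothetical helper: assign the sorted names to folders 001, 002, ...
# with at most $per_folder files each. Sorting by name is an assumption.
sub plan_folders {
    my ($per_folder, @files) = @_;
    my %plan;
    my $i = 0;
    for my $file (sort @files) {
        my $folder = sprintf '%03d', int($i++ / $per_folder) + 1;
        push @{ $plan{$folder} }, $file;
    }
    return \%plan;
}

# Demo with literal names; for real data use plan_folders(500, html_files($dir)).
my $plan = plan_folders(2, qw(c.html a.html b.html));
print "$_: @{ $plan->{$_} }\n" for sort keys %$plan;
```

Once the plan looks right, the actual move could be done with `make_path` from File::Path and `move` from File::Copy, one folder at a time.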