Hi all,

is it possible to process UTF-8 encoded text data in a smart way? I need to do something like:

- split text into words
- remove characters that are illegal for a specified language
- remove control characters
- ...

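To make the goal concrete, here is a minimal sketch of the kind of processing I mean. It assumes Perl 5.8 or later with the core Encode module. The helper name clean_words and the Czech letter set are only illustrative assumptions, not something taken from an existing module. The idea is to decode the UTF-8 bytes into a character string first, so that the regular expressions operate on whole characters rather than on single bytes.

```perl
use strict;
use warnings;
use utf8;                          # this source file itself is saved as UTF-8
use Encode qw(decode encode);

# Assumed alphabet for the target language (Czech here); adjust per language.
my $czech_letter = qr/[A-Za-záčďéěíňóřšťúůýžÁČĎÉĚÍŇÓŘŠŤÚŮÝŽ]/;

# Hypothetical helper: takes raw UTF-8 bytes and a regex matching one legal
# character, returns the list of cleaned words.
sub clean_words {
    my ($bytes, $legal) = @_;
    my $text = decode('UTF-8', $bytes);   # bytes -> Perl character string
    $text =~ s/\p{Cc}/ /g;                # control characters become spaces
    $text =~ s/(?!$legal)\S//g;           # drop non-space characters that are not legal
    return grep { length } split /\s+/, $text;
}

# Simulated input: UTF-8 bytes with a control character, digits and punctuation.
my $raw   = encode('UTF-8', "Příliš\x{07} žluťoučký kůň, 42 ódy!");
my @words = clean_words($raw, $czech_letter);

binmode(STDOUT, ':encoding(UTF-8)');
print join('|', @words), "\n";
```

Because matching happens after decoding, a pattern cannot start or end in the middle of a multibyte sequence, which is the kind of false match I would like to avoid.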
Which module do I need to use? There are a lot of modules for charset conversion. I found Unicode::String to be useful, but of the latin* encodings it supports only latin1.

How can I prevent false matches in regular expressions when working with multibyte characters?

--
best regards
Ing. Roman Vasicek
software developer
+----------------------------------------------------------------------------+
PetaMem s.r.o., Drahobejlova 27/1019, 190 00 Praha 9 - Liben, Czech republic
http://www.petamem.com/