Hi Chandramohan! I'm CCing this list on this private reply. Next time please hit "reply all" or make it the default using the Google Labs interface.
On Thursday 26 Nov 2009 16:29:36 Chandramohan Neelakantan wrote:
> Hi Schlomi,
>

It's "Shlomi" (English spelling) - not "Schlomi" (German spelling). Many
people make this mistake.

> Many thanks for your reply.

You're welcome.

>
> > 1. Linux Kernel 2.4.27 is incredibly old:
> > http://en.wikipedia.org/wiki/Linux_kernel#Timeline
> > There's already kernel 2.4.37 and kernel 2.6.31.6. Please upgrade, for
> > your own good. 2. perl-5.8.4 is very old as well. There's already 5.8.9
> > and 5.10.1.
> -----------
>
> I do not have an option here to change/upgrade the OS.
>

I see. We may not be able to help you with problems you encounter.

> > Which system is this? Is it RHEL / CentOS ?
>
> I think both of these should not pose a problem in what I'm saying.
>
> Debian.
>

Really? What was the last Debian that shipped with these versions of Perl
and the Linux kernel? What does "cat /etc/debian_version" say? It seems
likely that this is an old and unmaintained version. See:

http://community.livejournal.com/shlomif_tech/36125.html

> > See:
> > * http://perldoc.perl.org/perlunitut.html
>
> I have been going through the documentation for the last few days now.
> Unfortunately I am in a unique position.
> The text for me comes from various sources : which means that
> standardization of everything is also not an option as there are many
> many parties here.
> (For example, I would have a non-Danish speaker to write Danish text
> and send it to me).
>
> I hit on an idea that each text file coming from different sources will
> have the Unicode UTF8 hex string of all the special characters.
> See here. http://lwp.interglacial.com/appf_01.htm
>
> For example the Danish A-ring ( A with a ring on top ) in the text file
> will be written as : 0xc30x85. This was found to be acceptable by all
> parties.
>
> Now the question is, I will have a text file with lots of English
> language alphabets along with the special characters written as above.
> Using File i/o, I will have to dynamically identify the special
> characters and print the equivalent characters in PDF,HTML files.
>
> Any ideas?
>

You can use regexes to match specific characters. But generally your
converter (e.g: of a http://en.wikipedia.org/wiki/Lightweight_markup_language )
will do that for you. Do you need to guess the language of the document? I
still don't understand exactly what you want to do.

Regards,

	Shlomi Fish

> Your help much appreciated.
>
> Thanks & regards
> CM
>

-- 
-----------------------------------------------------------------
Shlomi Fish       http://www.shlomifish.org/
"The Human Hacking Field Guide" - http://shlom.in/hhfg

Chuck Norris read the entire English Wikipedia in 24 hours. Twice.

--
To unsubscribe, e-mail: beginners-unsubscr...@perl.org
For additional commands, e-mail: beginners-h...@perl.org
http://learn.perl.org/
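
P.S. In case it helps, here is a minimal sketch of the regex idea above. It
is written in Python only for illustration; the same pattern should carry
over to a Perl regex. It assumes every special character is written as a run
of consecutive "0xHH" tokens holding its UTF-8 bytes, as in the 0xc30x85
example, and that a run which is not valid UTF-8 should be left untouched.
The name decode_hex_runs is made up for this sketch.

```python
import re

# One or more consecutive "0xHH" tokens, e.g. "0xc30x85" (assumed input format).
HEX_RUN = re.compile(r"(?:0x[0-9A-Fa-f]{2})+")
# A single token, capturing only its two hex digits.
HEX_TOKEN = re.compile(r"0x([0-9A-Fa-f]{2})")


def decode_hex_runs(text: str) -> str:
    """Replace each run of 0xHH tokens with the character(s) it encodes in UTF-8."""

    def replace(match: re.Match) -> str:
        # Collect the hex digits of every token in the run and turn them into bytes.
        raw = bytes.fromhex("".join(HEX_TOKEN.findall(match.group(0))))
        try:
            return raw.decode("utf-8")
        except UnicodeDecodeError:
            # Not valid UTF-8: keep the original text rather than guess.
            return match.group(0)

    return HEX_RUN.sub(replace, text)


if __name__ == "__main__":
    print(decode_hex_runs("The city of 0xc30x85rhus"))
```

One caveat with this format: ordinary text that happens to contain something
like "0x41" would be decoded as well, so a more distinctive marker may be
safer. For HTML output the decoded text would still need escaping.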