2009/2/21 sanalkumar mr <[email protected]>:
> Dear all, please help me solve this problem.
>
> $ pdftotext -f 3 commonds.pdf rr.txt
>
> When I ran the command above, the file was converted to a text file. But
> when I tried it with a Malayalam PDF file, it did not give me the output.
>
> $ pdftotext -f 1 -layout -enc utf8 madhyam_first.pdf ff.txt
>
> That is the command I ran, and these are the errors it showed:
>
> Error: Couldn't find unicodeMap file for the 'utf8' encoding
> Error: Couldn't get text encoding
>
> I read the pdftotext man page and some sites, such as
> http://www.cyberciti.biz/faq/converter-pdf-files-to-text-format-command/
> but found no help on converting a Malayalam (UTF-8) PDF file.
> Then I tried this:
>
> $ pdftotext -f 1 -layout -enc UTF-8 madhyam_first.pdf ff.txt
>
> I got a file named ff.txt, but its contents are not Malayalam.
> Please help me solve this issue.
I think the text in the PDF is not encoded in Unicode (UTF-8). It may be set in an ASCII font, that is, a legacy font that draws Malayalam glyphs at ASCII code points, so the extracted text comes out as Latin characters. To check which fonts the file uses, open the document properties from the File menu of your PDF viewer. If it is not a Unicode font, you will have to convert the extracted text with a converter such as Payyans or http://uni.medhas.org

~vimal
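One rough way to confirm this diagnosis from the extracted text itself is to measure how many of its characters fall in the Malayalam Unicode block (U+0D00 to U+0D7F). The sketch below is only an illustration: the function names and the 0.5 threshold are assumptions of mine, not part of pdftotext or Payyans.

```python
# Sketch: does extracted text actually contain Unicode Malayalam?
# Function names and the default threshold are illustrative assumptions.

MALAYALAM_START = 0x0D00  # first code point of the Malayalam Unicode block
MALAYALAM_END = 0x0D7F    # last code point of the Malayalam Unicode block


def malayalam_ratio(text: str) -> float:
    """Return the fraction of non-whitespace characters in the Malayalam block."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    hits = sum(1 for c in chars if MALAYALAM_START <= ord(c) <= MALAYALAM_END)
    return hits / len(chars)


def looks_like_unicode_malayalam(text: str, threshold: float = 0.5) -> bool:
    """Guess whether the text is mostly Unicode Malayalam (threshold is arbitrary)."""
    return malayalam_ratio(text) >= threshold


def check_file(path: str) -> float:
    """Read a pdftotext output file as UTF-8 and return its Malayalam ratio."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return malayalam_ratio(handle.read())
```

Calling check_file("ff.txt") on the output of the last command would return a value near 0.0 if the PDF uses an ASCII font, in which case the text still has to go through a converter. A value near 1.0 would mean the text is already Unicode Malayalam and the problem lies elsewhere, for example in the font used to display the file.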
