Maybe you can try an alternative solution ;-) [1]. It was created by Google (I think ;-) ), and a contributor's e-mail address is visible there in case it does not work :-)

[1] https://code.google.com/p/hocr-tools/source/browse/hocr-pdf

Zdenko

On Sun, Jan 26, 2014 at 10:19 PM, universal reseller <[email protected]> wrote:

> Yes!
> But they have not answered me after 1 week, so I asked here!!
>
> Does anyone else have a problem with UTF-8!?
>
>
> On Mon, Jan 27, 2014 at 12:47 AM, zdenko podobny <[email protected]> wrote:
>
>> Did you ask the author of hocr2pdf this question?
>>
>> Zdenko
>>
>>
>> On Sun, Jan 26, 2014 at 9:25 PM, peiman F. <[email protected]> wrote:
>>
>>> Hi,
>>> I have a PDF file and I have to make it searchable.
>>> The PDF is in Arabic.
>>>
>>> I can OCR a single page of it with tesseract without any problem,
>>> but when I use hocr2pdf the output PDF file has some mistakes!!
>>>
>>> 1.pdf is my source file,
>>> and 123.pdf is the 4th page, made searchable by hocr2pdf.
>>>
>>> I used commands like this:
>>>
>>> 1)
>>> tesseract 1/out04.jpg 552233 -l ara hocr
>>>
>>> 2)
>>> hocr2pdf -i 1/out04.jpg -s -o 123.pdf < 552233.html
>>>
>>> Am I wrong, or does hocr2pdf not support UTF-8!?
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To post to this group, send email to [email protected]
>>> To unsubscribe from this group, send email to
>>> [email protected]
>>> For more options, visit this group at
>>> http://groups.google.com/group/tesseract-ocr?hl=en
>>>
>>> ---
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> For more options, visit https://groups.google.com/groups/opt_out.

