Hi, this should work (as far as how to use configs in pytesseract goes):

import os

import pytesseract
from PIL import Image

# configuration
pytesseract.pytesseract.tesseract_cmd = r"f:\win64_llvm\bin\tesseract.exe"
os.environ["TESSDATA_PREFIX"] = r"f:\Project-Personal\tessdata_best\tessdata"
custom_configs = r'-c hocr_font_type=1'

# OCR
img = Image.open(r"image.png")
hocr = pytesseract.image_to_pdf_or_hocr(img, extension='hocr', config=custom_configs)

But AFAIR, Tesseract 4.x does not provide information about the font type.

Zdenko

On Wed, 18 Sep 2019 at 6:37, 'Raphael Alabi' via tesseract-ocr <tesseract-ocr@googlegroups.com> wrote:

> How does one pass in the hocr_font_type 1 parameter to config to be able
> to get font type information through OCR?
> I am a bit lost as to how this is done.............
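
To check whether the hOCR output actually carries font information, one option is to look for the x_font and x_fsize properties in the title attribute of each ocrx_word span. Below is a minimal sketch using only the standard library. The names WordFontParser and extract_word_fonts are made up for illustration, and the SAMPLE string is hand-written, not real Tesseract output. Also, as far as I recall, the Tesseract parameter list names this option hocr_font_info rather than hocr_font_type; running `tesseract --print-parameters` should show which name your build accepts.

```python
from html.parser import HTMLParser


class WordFontParser(HTMLParser):
    """Collects text and optional font properties of hOCR ocrx_word spans."""

    def __init__(self):
        super().__init__()
        self.words = []       # one dict per word: text, font, size
        self._current = None  # word currently being read, if any
        self._depth = 0       # nested <span> depth inside the current word

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        if self._current is not None:
            # A span nested inside a word (e.g. per-character info).
            self._depth += 1
            return
        attrs = dict(attrs)
        if attrs.get("class") != "ocrx_word":
            return
        self._current = {"text": "", "font": None, "size": None}
        # The title attribute is a ';'-separated list of "key value" pairs.
        for prop in (attrs.get("title") or "").split(";"):
            parts = prop.strip().split(None, 1)
            if len(parts) != 2:
                continue
            key, value = parts
            if key == "x_font":
                self._current["font"] = value
            elif key == "x_fsize":
                self._current["size"] = value

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        if tag != "span" or self._current is None:
            return
        if self._depth > 0:
            self._depth -= 1
        else:
            self.words.append(self._current)
            self._current = None


def extract_word_fonts(hocr):
    """Return a list of word dicts from hOCR given as bytes or str."""
    if isinstance(hocr, bytes):
        hocr = hocr.decode("utf-8")
    parser = WordFontParser()
    parser.feed(hocr)
    parser.close()
    return parser.words


# Hand-written sample for illustration only (not produced by Tesseract).
SAMPLE = (
    b"<span class='ocrx_word' id='word_1_1' "
    b"title='bbox 36 92 96 116; x_wconf 96; x_font Arial_Bold; x_fsize 12'>"
    b"Hello</span> "
    b"<span class='ocrx_word' id='word_1_2' "
    b"title='bbox 100 92 160 116; x_wconf 95'>world</span>"
)

if __name__ == "__main__":
    for word in extract_word_fonts(SAMPLE):
        print(word["text"], word["font"], word["size"])
```

The bytes returned by pytesseract.image_to_pdf_or_hocr(..., extension='hocr') can be passed straight to extract_word_fonts. If every word comes back with font set to None, the output contains no font information for that Tesseract version and model.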