[tesseract-ocr] Any way to get bbox information only with tesseract or some other tool?

Finjon Kiang Sat, 03 May 2014 10:44:13 -0700

We are trying to process a large number of Traditional Chinese documents 
(scanned as images). The results from tesseract with the default 
configuration are not good (we are not yet familiar with tesseract 
training), so we are building a crowdsourcing website and inviting people 
to type in the text they identify. The problem is that we need the bbox 
information that tesseract produces with the hocr config, but we do not 
need the OCR result itself, and the OCR process for Chinese is very slow. 
Is there any way to get the bbox information directly, without running the 
OCR process (either with tesseract or with other tools)?
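
To make clear which data we are after, here is a minimal sketch of how the 
bbox values can be pulled out of hOCR output. It assumes each hOCR element 
carries a `title` attribute containing `bbox x0 y0 x1 y1`. The names 
`BBoxCollector` and `extract_bboxes` and the sample markup are hypothetical 
and only for illustration. This does not skip the OCR step; it only shows 
the part of the result we actually need.

```python
import re
from html.parser import HTMLParser

# hOCR stores coordinates in the title attribute, e.g. title="bbox 36 92 580 122"
BBOX_RE = re.compile(r"\bbbox\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)")


class BBoxCollector(HTMLParser):
    """Collect (class, (x0, y0, x1, y1)) for every hOCR element with a bbox."""

    def __init__(self):
        super().__init__()
        self.boxes = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        cls = attrs.get("class") or ""
        match = BBOX_RE.search(attrs.get("title") or "")
        # Keep only hOCR elements (ocr_page, ocr_line, ocrx_word, ...)
        if cls.startswith("ocr") and match:
            self.boxes.append((cls, tuple(int(v) for v in match.groups())))


def extract_bboxes(hocr_text):
    """Return bounding boxes in document order, ignoring the recognized text."""
    collector = BBoxCollector()
    collector.feed(hocr_text)
    collector.close()
    return collector.boxes


# Hypothetical sample, shaped like tesseract's hOCR output
SAMPLE = """<div class='ocr_page' id='page_1' title='image "scan.png"; bbox 0 0 800 600; ppageno 0'>
<span class='ocr_line' id='line_1' title="bbox 36 92 580 122">
<span class='ocrx_word' id='word_1' title="bbox 36 92 96 122">x</span>
</span></div>"""

for cls, box in extract_bboxes(SAMPLE):
    print(cls, box)
```

Filtering the result on `ocr_line` or `ocrx_word` would give the regions we 
want to show to contributors on the website.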


--- Environment: Ubuntu 13.10 ---
tesseract --version
tesseract 3.02.01
 leptonica-1.69
  libgif 4.1.6 : libjpeg 8d : libpng 1.2.49 : libtiff 4.0.2 : zlib 1.2.8

---
kiang

