Re: [tesseract-ocr] Are character bboxes trustworthy?

2020-07-25 Thread Zdenko Podobny
As I mentioned, if you need good bounding boxes you have to use the legacy engine. There are several issues and comments explaining why it is a problem to get accurate bounding boxes, e.g. https://github.com/tesseract-ocr/tesseract/issues/2825#issuecomment-579220987 Zdenko. On Sat, 25 Jul 2020 at 0:44

Re: [tesseract-ocr] Are character bboxes trustworthy?

2020-07-24 Thread 'robinw...@googlemail.com' via tesseract-ocr
> Do you use lstm or legacy engine?
lstm. I can find a couple of Noah Metzger patches: https://github.com/tesseract-ocr/tesseract/commit/c350077b96077fa50fefe97fbaed04014407f0f1 and https://github.com/tesseract-ocr/tesseract/pull/2576 etc., but they've all been merged into master. As far

Re: [tesseract-ocr] Are character bboxes trustworthy?

2020-07-24 Thread Zdenko Podobny
Do you use the lstm or the legacy engine? If lstm: search the issue tracker/PRs/(forum?) for the bounding box problem (and Noah Metzger's patches). There are rumours that if you need really good bounding boxes you have to use the latest 3.5 version, because changes in the 4.x version (and later) also affected

[tesseract-ocr] Are character bboxes trustworthy?

2020-07-24 Thread 'Robin Watts' via tesseract-ocr
Hi all, I'm using tesseract as a library, and broadly it seems to be working well. I am having some very strange problems with the character boxes I get back from the iterator though. The attached image is a png made from the 8bpp greyscale image that I feed it, overlaid with boxes to show
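
For anyone wanting to put a number on "strange" before switching engines, here is a rough sketch of a plausibility check on the character boxes. It is not part of tesseract. The function name `suspicious_boxes`, the `slack` tolerance and the (left, top, right, bottom) tuple layout are all assumptions made for illustration, and the ordering test assumes left-to-right text. Collecting the boxes from the iterator is left out.

```python
from typing import List, Tuple

# Assumed layout: (left, top, right, bottom) in pixels, origin at top-left.
Box = Tuple[int, int, int, int]


def suspicious_boxes(word: Box, chars: List[Box], slack: int = 2) -> List[Tuple[int, str]]:
    """Return (index, reason) for character boxes that look implausible.

    Hypothetical helper: flags boxes that are empty, that stick out of the
    enclosing word box by more than `slack` pixels, or that start to the
    left of the previous character (left-to-right text assumed).
    """
    problems = []
    wl, wt, wr, wb = word
    prev_left = None
    for i, (l, t, r, b) in enumerate(chars):
        if r <= l or b <= t:
            problems.append((i, "empty or inverted box"))
        elif l < wl - slack or t < wt - slack or r > wr + slack or b > wb + slack:
            problems.append((i, "outside word box"))
        elif prev_left is not None and l < prev_left:
            problems.append((i, "out of left-to-right order"))
        prev_left = l
    return problems
```

Running this over the boxes from an lstm run and from a legacy run on the same image, and comparing how many boxes get flagged, gives a crude but repeatable way to compare the two engines.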