Probably a per-file/per-request setting.

Peter Kronenberg  |  Senior AI Analytic ENGINEER
C: 703.887.5623
[Torch AI]<http://www.torch.ai/>
4303 W. 119th St., Leawood, KS 66209
WWW.TORCH.AI<http://www.torch.ai/>


From: Tim Allison <[email protected]>
Sent: Friday, October 29, 2021 1:58 PM
To: [email protected]
Subject: Re: OCR with bounding boxes


That's not currently supported, and in fact, I don't think we even support 
running OCR on specific pages within PDFs (and I do remember we've had that 
request occasionally).  Would this be a per-file configuration or would you 
want to specify something for all files?

On Fri, Oct 29, 2021 at 12:55 PM Peter Kronenberg <[email protected]> wrote:
I’m pretty sure this is a capability of Tesseract, but does Tika have the 
ability to specify a bounding box when OCR’ing a page?  For example, could we 
give it the coordinates of a single paragraph or section of a document?


Thanks
Peter


