Probably a per-file/per-request thing.

Peter Kronenberg | Senior AI Analytic Engineer
C: 703.887.5623
Torch AI <http://www.torch.ai/>
4303 W. 119th St., Leawood, KS 66209
WWW.TORCH.AI <http://www.torch.ai/>

From: Tim Allison <[email protected]>
Sent: Friday, October 29, 2021 1:58 PM
To: [email protected]
Subject: Re: OCR with bounding boxes

That's not currently supported, and in fact, I don't think we even support running OCR on specific pages within PDFs (and I do remember we've had that request occasionally).

Would this be a per-file configuration, or would you want to specify something for all files?

On Fri, Oct 29, 2021 at 12:55 PM Peter Kronenberg <[email protected]> wrote:

I'm pretty sure this is a capability of Tesseract, but does Tika have the ability to specify a bounding box when OCR'ing a page? For example, could we give it the coordinates of a single paragraph or section of a document?

Thanks,
Peter
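
Since the thread says Tika does not currently support OCR on a bounding box, one possible workaround outside Tika is to render the page to an image, crop the region, and OCR only the crop. The sketch below covers just the coordinate step and is not part of Tika or Tesseract. It assumes the box is given in PDF points (1/72 inch, origin at the bottom-left of the page) and that the page is rendered at a chosen DPI. The function name `pdf_box_to_pixels`, the `PixelRect` type, and the 300 DPI default are hypothetical choices for illustration.

```python
from typing import NamedTuple


class PixelRect(NamedTuple):
    """Rectangle in image space (origin at the top-left), in pixels."""
    left: int
    top: int
    width: int
    height: int


def pdf_box_to_pixels(x0, y0, x1, y1, page_height_pt, dpi=300):
    """Convert a box in PDF points (origin bottom-left) to a pixel
    rectangle (origin top-left) for a page rendered at `dpi`.

    Assumption: (x0, y0) is the lower-left corner and (x1, y1) the
    upper-right corner of the region, both in PDF points.
    """
    if x1 <= x0 or y1 <= y0:
        raise ValueError("box must have positive width and height")

    def to_px(points):
        # 72 PDF points per inch, `dpi` pixels per inch
        return round(points * dpi / 72)

    left = to_px(x0)
    right = to_px(x1)
    # Flip the y axis: the top edge in image space is measured from the
    # top of the page, so it comes from the box's upper y value.
    top = to_px(page_height_pt - y1)
    bottom = to_px(page_height_pt - y0)
    return PixelRect(left, top, right - left, bottom - top)
```

For a US Letter page (612 x 792 points), `pdf_box_to_pixels(72, 648, 540, 720, page_height_pt=792)` gives `PixelRect(left=300, top=300, width=1950, height=300)` at 300 DPI. A rectangle in this left/top/width/height form could then be used to crop the rendered image before handing it to an OCR step, or passed to something like Tesseract's C++ `TessBaseAPI::SetRectangle`, which takes the same four values.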
