Re: Image to text conversion
Sekhar,

There are a few open Jira issues for this. It would be a great contribution if you got this to work:

- CTAKES-189 https://issues.apache.org/jira/browse/CTAKES-189 GSoC: Implement OCR/Tika to standardize text input for cTAKES
- CTAKES-105 https://issues.apache.org/jira/browse/CTAKES-105 Add Apache Tika integration

On Thu, Apr 30, 2015 at 1:21 AM, Hari, Sekhar sekhar.h...@cgi.com wrote:

> Thanks. Let me try this, and I will let you know if I need any help.
>
> Cheers,
> Sekhar H.
Re: Image to text conversion
What about using Apache Tika within cTAKES for this? Tika supports OCR through Tesseract:

http://wiki.apache.org/tika/TikaOCR

Cheers,
Chris

Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory, Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattm...@nasa.gov
WWW: http://sunset.usc.edu/~mattmann/

Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA

-----Original Message-----
From: Hari, Sekhar sekhar.h...@cgi.com
Reply-To: dev@ctakes.apache.org dev@ctakes.apache.org
Date: Wednesday, April 29, 2015 at 10:11 PM
To: dev@ctakes.apache.org dev@ctakes.apache.org, u...@ctakes.apache.org u...@ctakes.apache.org
Subject: Image to text conversion

Hello All -

I am looking for an OCR capability in cTAKES. The requirement is to convert scanned image documents (e.g. scanned handwritten prescriptions) into text, then apply the usual NLP pipeline to convert the unstructured text into structured data.

Can cTAKES convert scanned image documents into text? If so, please help me understand this by sharing any documents or videos.

Many thanks,
Sekhar H.
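As a rough illustration of the OCR step suggested above, here is a minimal Python sketch that calls the Tesseract command-line tool directly. It does not use Tika or cTAKES; it only shows the image-to-text conversion that Tika delegates to Tesseract. The function names `build_tesseract_command` and `image_to_text` are hypothetical, and the sketch assumes the `tesseract` executable is installed and on the PATH with the requested language data available.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, lang="eng"):
    """Build the argument list for a Tesseract run that prints text to stdout.

    Hypothetical helper for illustration; not part of Tika or cTAKES.
    """
    # `tesseract <image> stdout -l <lang>` writes the recognized text
    # to standard output instead of to an output file.
    return ["tesseract", str(Path(image_path)), "stdout", "-l", lang]


def image_to_text(image_path, lang="eng"):
    """Run Tesseract on one image file and return the recognized text."""
    # Assumption: the tesseract binary is installed separately.
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        build_tesseract_command(image_path, lang),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if Tesseract fails
    )
    return result.stdout
```

The returned string could then be saved as a plain-text file and passed to the NLP pipeline like any other text document. How well this works on handwritten input would need to be checked on real samples.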
Image to text conversion
Hello All -

I am looking for an OCR capability in cTAKES. The requirement is to convert scanned image documents (e.g. scanned handwritten prescriptions) into text, then apply the usual NLP pipeline to convert the unstructured text into structured data.

Can cTAKES convert scanned image documents into text? If so, please help me understand this by sharing any documents or videos.

Many thanks,
Sekhar H.
RE: Image to text conversion
Thanks. Let me try this, and I will let you know if I need any help.

Cheers,
Sekhar H.

-----Original Message-----
From: Mattmann, Chris A (3980) [mailto:chris.a.mattm...@jpl.nasa.gov]
Sent: Thursday, April 30, 2015 10:44 AM
To: dev@ctakes.apache.org; u...@ctakes.apache.org
Subject: Re: Image to text conversion

What about using Apache Tika within cTAKES for this? Tika supports OCR through Tesseract:

http://wiki.apache.org/tika/TikaOCR

Cheers,
Chris