Craig Pfeifer created TIKA-2092:
-----------------------------------
Summary: Integrate Math equation image extraction
Key: TIKA-2092
URL: https://issues.apache.org/jira/browse/TIKA-2092
Project: Tika
Issue Type: Improvement
Components: detector, ocr
Affects Versions: 2.0
Reporter: Craig Pfeifer
"A general-purpose, deep learning-based system to decompile an image into
presentational markup."
As described in:
What You Get Is What You See: A Visual Markup Decompiler
Yuntian Deng, Anssi Kanervisto, and Alexander M. Rush
http://arxiv.org/pdf/1609.04938v1.pdf
Code here:
https://github.com/harvardnlp/im2markup
This would be useful as part of the OCR efforts in Tika.