[ https://issues.apache.org/jira/browse/TIKA-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Craig Pfeifer updated TIKA-2092:
--------------------------------
    Description: 
"A general-purpose, deep learning-based system to decompile an image into 
presentational markup."

As described in:

What You Get Is What You See: A Visual Markup Decompiler  
Yuntian Deng, Anssi Kanervisto, and Alexander M. Rush
http://arxiv.org/pdf/1609.04938v1.pdf

code here:
https://github.com/harvardnlp/im2markup

demo here:
http://lstm.seas.harvard.edu/latex/

Would be useful as part of the OCR efforts in Tika.

  was:
"A general-purpose, deep learning-based system to decompile an image into 
presentational markup."

As described in:

What You Get Is What You See: A Visual Markup Decompiler  
Yuntian Deng, Anssi Kanervisto, and Alexander M. Rush
http://arxiv.org/pdf/1609.04938v1.pdf

code here:
https://github.com/harvardnlp/im2markup

Would be useful as part of the OCR efforts in Tika.


> Integrate Math equation image extraction
> ----------------------------------------
>
>                 Key: TIKA-2092
>                 URL: https://issues.apache.org/jira/browse/TIKA-2092
>             Project: Tika
>          Issue Type: Improvement
>          Components: detector, ocr
>    Affects Versions: 2.0
>            Reporter: Craig Pfeifer
>              Labels: deeplearning, image, parse
>
> "A general-purpose, deep learning-based system to decompile an image into 
> presentational markup."
> As described in:
> What You Get Is What You See: A Visual Markup Decompiler  
> Yuntian Deng, Anssi Kanervisto, and Alexander M. Rush
> http://arxiv.org/pdf/1609.04938v1.pdf
> code here:
> https://github.com/harvardnlp/im2markup
> demo here:
> http://lstm.seas.harvard.edu/latex/
> Would be useful as part of the OCR efforts in Tika.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
