Dear all,
I am developing a tool to extract tables from images. It is a big
undertaking, but I hope to release a beta version soon.
The input to the tool is a PNG/JPG/PDF image and the output is a
CSV/ODT/XLS table.
I have some simple tables extracted from PDFs. If there are formats that
the government uses often and that people often need to digitize, I'd like
to have some samples. I am thinking of census data, GIS data, etc.
There is no plan to support multi-page tables. I could use some advice on
the OCR backend (for now I am using pytesseract, a Python wrapper around the
Tesseract OCR engine).
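
To make the OCR question concrete, here is a minimal sketch of the kind of pipeline I mean. It is not the tool's actual code. The helper names (words_to_rows, rows_to_csv, ocr_words) and the row_tol threshold are made up for illustration, and it assumes pytesseract, Pillow and the Tesseract binary are installed. Each OCR word becomes its own cell, so multi-word cells would still need real column detection.

```python
import csv
import io


def words_to_rows(words, row_tol=10):
    """Group OCR words into table rows by vertical position.

    `words` is a list of (text, left, top) tuples. A word joins the
    current row when its `top` is within `row_tol` pixels (an assumed
    threshold) of the top of the first word in that row.
    """
    rows = []
    for text, left, top in sorted(words, key=lambda w: (w[2], w[1])):
        if rows and abs(top - rows[-1]["top"]) <= row_tol:
            rows[-1]["cells"].append((left, text))
        else:
            rows.append({"top": top, "cells": [(left, text)]})
    # Order the cells of each row from left to right.
    return [[text for _, text in sorted(row["cells"])] for row in rows]


def rows_to_csv(rows):
    """Serialize a list of rows to CSV text."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


def ocr_words(image_path):
    """Run Tesseract on an image and return (text, left, top) tuples."""
    # Imported lazily so the helpers above work without OCR installed.
    import pytesseract
    from PIL import Image

    data = pytesseract.image_to_data(
        Image.open(image_path), output_type=pytesseract.Output.DICT
    )
    return [
        (text, left, top)
        for text, left, top in zip(data["text"], data["left"], data["top"])
        if text.strip()  # skip the empty entries Tesseract emits for blocks/lines
    ]
```

Usage would be something like rows_to_csv(words_to_rows(ocr_words("table.png"))), where "table.png" is a placeholder path. Keeping the grouping step separate from the OCR call should make it easier to swap in a different backend later.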
best,
Dilawar
--
Dilawar Singh, Ph.D.
LinkedIn <https://www.linkedin.com/in/dilawar-singh-ph-d-44b81b194/> ORCID
<https://orcid.org/0000-0002-4645-3211> Github <https://github.com/dilawar>
--
Datameet is a community of Data Science enthusiasts in India. Know more about
us by visiting http://datameet.org