Re: [PSF-Community] Python library to extract data tables from PDF files

Vinayak Mehta Fri, 28 Sep 2018 11:32:03 -0700

I've created a Jupyter notebook which shows an example of how Camelot makes
it easy to extract tables out of PDFs.



In the example, I scrape a PDF from an Indian disease outbreaks data
source[1] using requests, extract tables from each page of the PDF using
Camelot, and then concatenate those tables. Here's the gist:
https://gist.github.com/vinayak-mehta/e5949f7c2410a0e12f25d3682dc9e873
:)

[1] http://idsp.nic.in/index4.php?lang=1&level=0&linkid=406&lid=3689
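
For readers who want the shape of that workflow without opening the
notebook, here is a minimal sketch. It is not the code from the gist. It
assumes a Python 3 environment with requests, pandas and a current Camelot
release installed, and that every table in the PDF shares the same columns
so the frames can simply be stacked. The function names, the read_pdf
parameter and the "1-end" page range are illustrative choices, not taken
from the original post.

```python
def download_pdf(url, dest_path):
    """Save the PDF at `url` to `dest_path` and return the path."""
    import requests  # imported lazily; only needed for the download step

    response = requests.get(url, timeout=30)
    response.raise_for_status()  # fail loudly on HTTP errors
    with open(dest_path, "wb") as fh:
        fh.write(response.content)
    return dest_path


def extract_tables(pdf_path, pages="1-end", read_pdf=None):
    """Return one DataFrame per table that the reader finds in the PDF.

    `read_pdf` is a hypothetical hook added for this sketch so the step
    can be exercised without Camelot; by default camelot.read_pdf is used.
    """
    if read_pdf is None:
        import camelot  # assumption: Camelot is installed

        read_pdf = camelot.read_pdf
    tables = read_pdf(pdf_path, pages=pages)
    # Each Camelot table exposes its contents as a pandas DataFrame in .df
    return [table.df for table in tables]


def combine_tables(frames):
    """Stack the per-page DataFrames into a single DataFrame."""
    import pandas as pd

    # Assumption: all frames have the same column layout
    return pd.concat(frames, ignore_index=True)


def pdf_url_to_frame(url, dest_path="report.pdf"):
    """Run the three steps in order: download, extract, concatenate."""
    pdf_path = download_pdf(url, dest_path)
    return combine_tables(extract_tables(pdf_path))
```

Calling pdf_url_to_frame() with the URL of one of the PDF reports would
return a single DataFrame covering all pages. Real reports may need
per-page cleanup, such as dropping repeated header rows, before the
concatenation step.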


On Fri, Sep 28, 2018 at 12:01 PM Vinayak Mehta <[email protected]> wrote:

> Hello everyone!
>
> I recently released a Python library which lets users extract data tables
> out of PDF files, my first open source library! Here's the link:
> https://github.com/socialcopsdev/camelot
>
> I've created a wiki page
> <https://github.com/socialcopsdev/camelot/wiki/Comparison-with-other-PDF-Table-Extraction-libraries-and-tools>
> comparing it to other open source PDF table extraction tools. I'm currently
> working on porting it to Python3!
>
> I would be really grateful if you could check it out, see if it's useful
> to you, and give me any feedback that may help me improve it, by replying
> here or by opening an issue or a pull request!
>
> Looking forward to hearing from you all!
>
> Thanks for your time!
>
> Vinayak
>

_______________________________________________
PSF-Community mailing list
[email protected]
https://mail.python.org/mailman/listinfo/psf-community
