I've created a Jupyter notebook which shows an example of how Camelot makes it easy to extract tables out of PDFs.
In the example, I scrape a PDF from an Indian disease outbreaks data source [1] using requests, extract tables from each page of the PDF using Camelot and then concat those tables. Here's the gist!
https://gist.github.com/vinayak-mehta/e5949f7c2410a0e12f25d3682dc9e873 :)

[1] http://idsp.nic.in/index4.php?lang=1&level=0&linkid=406&lid=3689

On Fri, Sep 28, 2018 at 12:01 PM Vinayak Mehta <vmeht...@gmail.com> wrote:
> Hello everyone!
>
> I recently released a Python library which lets users extract data tables
> out of PDF files, my first open source library! Here's the link:
> https://github.com/socialcopsdev/camelot
>
> I've created a wiki page
> <https://github.com/socialcopsdev/camelot/wiki/Comparison-with-other-PDF-Table-Extraction-libraries-and-tools>
> comparing it to other open source PDF table extraction tools. I'm currently
> working on porting it to Python3!
>
> I would be really grateful if you could check it out and see if its useful
> to you and give me any feedback that may help me improve it, by replying
> here, opening an issue or a pull request!
>
> Looking forward to hearing from you all!
>
> Thanks for your time!
>
> Vinayak
>
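For anyone who wants the shape of the workflow before opening the gist, the three steps mentioned at the top of this message (download with requests, extract per-page tables with Camelot, concatenate) can be sketched roughly as below. This is an illustrative outline and not the notebook's actual code: the function names `download_pdf` and `extract_and_concat` are made up for this sketch, and it assumes the tables on every page share the same columns so that stacking them is meaningful.

```python
from pathlib import Path


def download_pdf(url, dest):
    """Fetch a PDF over HTTP and save it to dest (hypothetical helper)."""
    # Imported lazily so the module loads even where requests is not installed.
    import requests

    response = requests.get(url, timeout=60)
    response.raise_for_status()  # fail loudly on 4xx/5xx instead of saving an error page
    dest = Path(dest)
    dest.write_bytes(response.content)
    return dest


def extract_and_concat(pdf_path, pages="1-end"):
    """Extract tables from the given pages and stack them into one DataFrame.

    Assumes every extracted table has the same column layout.
    """
    import camelot
    import pandas as pd

    # pages="1-end" asks Camelot to parse every page of the document.
    tables = camelot.read_pdf(str(pdf_path), pages=pages)

    # Each extracted table exposes its contents as a pandas DataFrame via .df
    frames = [table.df for table in tables]
    if not frames:
        raise ValueError("no tables found in {}".format(pdf_path))

    return pd.concat(frames, ignore_index=True)
```

To try it, pass the direct link to one of the PDF reports listed at [1] to `download_pdf(url, "outbreaks.pdf")` (the filename is only a placeholder), then call `extract_and_concat("outbreaks.pdf")`. Header rows repeated on each page will still be present in the result and would need to be dropped separately.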
_______________________________________________
PSF-Community mailing list
PSF-Community@python.org
https://mail.python.org/mailman/listinfo/psf-community