Re: Read the table data from PDF files in Python

2019-04-24 Thread Mark Kettner
I heard about camelot a while ago:

https://camelot-py.readthedocs.io/

but I have never actually used it, so I cannot offer any support or
compare it with other data-extraction tools.
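
For readers who want a starting point, here is a minimal sketch of how camelot might be used, based on its documented `camelot.read_pdf` entry point and the `.df` attribute of each returned table. The helper names, the page number, the column indices and the condition are all hypothetical, and the import is deferred so the filtering helper works even without camelot installed.

```python
def tables_from_pdf(path, pages="1"):
    """Return each table found on the given pages as a list of rows."""
    import camelot  # third-party: pip install camelot-py (deferred import)

    tables = camelot.read_pdf(path, pages=pages)
    # Each table exposes a pandas DataFrame as .df; flatten it to plain lists.
    return [table.df.values.tolist() for table in tables]


def cells_where(rows, match_col, predicate, value_col):
    """Collect row[value_col] for every row whose row[match_col] passes predicate."""
    needed = max(match_col, value_col)
    return [
        row[value_col]
        for row in rows
        if len(row) > needed and predicate(row[match_col])  # skip short rows
    ]
```

Something like `cells_where(tables_from_pdf("somefile.pdf")[0], 1, lambda s: s.isdigit() and int(s) > 5, 0)` would then pick the first-column values of rows whose second column is a number above 5. How well the tables are detected depends on the PDF.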

--
Best Regards

Mark Kettner
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Read the table data from PDF files in Python

2019-04-24 Thread Peter Pearson
On Wed, 24 Apr 2019 02:36:27 -0700 (PDT), mrawat...@gmail.com wrote:
> Hello,
> Does anyone know how to fetch data in Python from a PDF file that
> contains tables mixed with other text? I need to fetch some cell
> values from such a table based on a condition.

You might find pdftotext useful.

The command

  pdftotext -layout somefile.pdf

produces a file named somefile.txt that keeps the physical layout
of each page, so table columns stay roughly aligned.

This will be completely useless if the original PDF is just
a PDF wrapper around an image.  That's what document scanners
tend to produce.
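
As a rough sketch of how the pdftotext output could be picked apart from Python: the functions below run the command and then split each line into cells wherever two or more spaces occur. The function names and the "two or more spaces" rule are assumptions for illustration, not part of pdftotext; passing "-" as the output file makes pdftotext write to stdout instead of somefile.txt.

```python
import re
import subprocess


def pdf_to_layout_text(pdf_path):
    """Run pdftotext -layout and return its output (needs pdftotext on PATH)."""
    result = subprocess.run(
        ["pdftotext", "-layout", pdf_path, "-"],  # "-" sends the text to stdout
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def split_columns(line):
    """Split one layout line into cells on runs of two or more whitespace characters."""
    return re.split(r"\s{2,}", line.strip())


def table_rows(text, n_columns):
    """Keep only the non-blank lines that split into exactly n_columns cells."""
    rows = (split_columns(line) for line in text.splitlines() if line.strip())
    return [row for row in rows if len(row) == n_columns]
```

Calling `table_rows(pdf_to_layout_text("somefile.pdf"), 3)` would keep the lines that look like three-column rows. The heuristic fails on empty cells and on cells that themselves contain double spaces, so treat the result with suspicion.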

-- 
To email me, substitute nowhere->runbox, invalid->com.


Re: Read the table data from PDF files in Python

2019-04-24 Thread Rhodri James

On 24/04/2019 10:36, mrawat...@gmail.com wrote:

> Does anyone know how to fetch data in Python from a PDF file that
> contains tables mixed with other text? I need to fetch some cell
> values from such a table based on a condition.


Hi there!

If you have any alternatives to doing this, use them.  Extracting data 
from PDFs like this is hugely unreliable because the order in which page 
elements show up in a PDF varies enormously.  What works for one PDF may 
give you complete nonsense for the next.


If you must do it this way, there are modules called PyPDF and PyPDF2 on
PyPI which will allow you to extract the text from the PDF.  You are on
your own for working out how to parse the tables out of that text,
though; the table structures you are hoping for simply don't exist in
the data.
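
To make the "you are on your own" part concrete, here is a hedged sketch: one function pulls the raw text out with PyPDF2 (assuming PyPDF2 2.x or later, where `PdfReader` and `extract_text` are available), and another uses a regular expression to pick out lines that look like table rows. The row shape (a label, an integer, a decimal amount) is purely hypothetical and would have to be adapted to each PDF.

```python
import re


def page_texts(pdf_path):
    """Return the extracted text of each page, in page order."""
    from PyPDF2 import PdfReader  # third-party; assumes PyPDF2 2.x or later

    reader = PdfReader(pdf_path)
    # The text arrives in whatever order the PDF stores it, not reading order.
    return [page.extract_text() or "" for page in reader.pages]


# Hypothetical row shape: a label, then an integer, then a decimal amount.
ROW = re.compile(r"^(?P<label>\S.*?)\s+(?P<count>\d+)\s+(?P<amount>\d+\.\d{2})$")


def matching_rows(text):
    """Yield (label, count, amount) for every line that fits the ROW pattern."""
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m:
            yield m.group("label"), int(m.group("count")), float(m.group("amount"))
```

A condition can then be applied in ordinary Python, for example `[label for label, count, _ in matching_rows(text) if count > 10]`. A pattern tuned for one PDF may match nothing, or the wrong lines, in the next.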


--
Rhodri James *-* Kynesim Ltd


Read the table data from PDF files in Python

2019-04-24 Thread mrawat213
Hello,

Does anyone know how to fetch data in Python from a PDF file that contains
tables mixed with other text? I need to fetch some cell values from such a
table based on a condition.



Thanks,
Mukesh