[Israel.pm] PDF handling

Yossi Itzkovich Wed, 24 Dec 2008 07:28:42 -0800

Hi,
A colleague here asked me the following question:

I have a PDF document that contains tables.
Is there a Perl module that provides an API for reading a PDF document,
identifying the tables in it, and reading each table cell separately, even
if the text in the cell is "broken" across several rows (wrapping)?


I tried to convert the PDF to text and work on the text, but the converted text
doesn't always behave as expected. For example, a cell with a "broken" line is
converted to several text lines that are NOT NECESSARILY ADJACENT: there might
be a blank row between the parts, which makes the analysis more difficult.


Can someone help with this issue?

Yossi
_______________________________________________
Perl mailing list
[email protected]
http://perl.org.il/mailman/listinfo/perl
