Hello,

filippo di natale:
> I need to parse "csv" or "fixed length" like documents that are
> unfortunately in pdf format, if anyone has any suggestion on how to parse
> them without translating them to text...

The library that Okular uses to read PDF files is Poppler - http://poppler.freedesktop.org

For "fixed length" like documents in PDF format, the recently implemented
"Table Selection Tool" might be useful - see very recent git master and/or
bugs 279859 and 283440. It lets you select the "fixed length" part of the
PDF document, divide it up into rows and columns, then paste the result into
a spreadsheet or other tabular document.
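
To illustrate what "dividing it up into rows and columns" amounts to for
fixed-length data, here is a minimal Python sketch. It is not Okular's or
Poppler's code: the function name split_fixed_width, the sample rows and the
column offsets 10 and 20 are all made up for the example, and it assumes you
already have the lines with their original spacing preserved.

```python
def split_fixed_width(line, boundaries):
    """Cut one line into fields at the given character offsets.

    `boundaries` lists the offsets where a new column starts
    (hypothetical values, chosen by hand for the sample data below).
    """
    edges = [0] + list(boundaries) + [len(line)]
    # Pair each edge with the next one and slice out the field between them.
    return [line[start:end].strip() for start, end in zip(edges, edges[1:])]


# Sample "fixed length" rows (invented data): name, first name, amount.
rows = [
    "Smith     John         1200.50",
    "Nguyen    Anh             7.25",
]

table = [split_fixed_width(row, [10, 20]) for row in rows]

for record in table:
    print(record)
```

The offsets play the same role as the column dividers you place by hand in
the selection tool; once the fields are separated, writing them out as real
CSV is straightforward.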

If you need automated processing, there are tools like TableSeer floating
around, but expect only moderate accuracy - sometimes it finds and extracts
the tables, sometimes it misses them or extracts them only partially. How
well it works will probably depend on your documents. http://tableseer.sf.net
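
To give a feel for why automated extraction is hit-and-miss, here is a rough
Python sketch of one simple heuristic: treat any character position that is
blank in every row as a gap between columns. This is only an assumption of
mine for illustration, not how TableSeer works, and the function names and
sample data are invented. It fails as soon as a field contains a space that
lines up across all rows, or a heading spans several columns.

```python
def find_column_starts(lines):
    """Guess column start offsets from positions that are blank in every row."""
    width = max(len(line) for line in lines)
    padded = [line.ljust(width) for line in lines]
    # blank[i] is True when no row has a visible character at offset i.
    blank = [all(row[i] == " " for row in padded) for i in range(width)]
    # A column starts where a non-blank offset follows a blank one (or at 0).
    return [i for i in range(width)
            if not blank[i] and (i == 0 or blank[i - 1])]


def split_at(line, starts):
    """Cut a line at the guessed column starts and trim the padding."""
    ends = list(starts[1:]) + [None]  # None slices to the end of the line
    return [line[start:end].strip() for start, end in zip(starts, ends)]


# Invented sample data with space-padded columns.
lines = [
    "apple      3    0.50",
    "kiwi      12    1.25",
    "banana     7   10.00",
]

starts = find_column_starts(lines)
table = [split_at(line, starts) for line in lines]

for record in table:
    print(record)
```

On clean, consistently padded data like this the guess works; on real
documents it is exactly the kind of thing that works sometimes and only
partially at other times.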


Jiri
-- 
Jiri Baum <j...@baum.com.au>                   http://www.baum.com.au/sabik
_______________________________________________
Okular-devel mailing list
Okular-devel@kde.org
https://mail.kde.org/mailman/listinfo/okular-devel
