Hello, filippo di natale:

> I need to parse "csv" or "fixed length" like documents that are
> unfortunately in pdf format, if anyone has any suggestion on how to parse
> them without translating them to text...

The library that Okular uses is Poppler - http://poppler.freedesktop.org

For "fixed length" like documents in PDF format, the recently implemented
"Table Selection Tool" might be useful - see very recent git master and/or
bugs 279859 and 283440. It lets you select the "fixed length" part of the
PDF document, divide it up into rows and columns, then paste the result into
a spreadsheet or other tabular document.

If you need automated processing, there are tools like TableSeer floating
around, but be prepared for only moderate results: sometimes it finds and
extracts the tables, sometimes it misses them or extracts them only
partially. It will probably depend on your documents.
http://tableseer.sf.net

Jiri
-- 
Jiri Baum <j...@baum.com.au>
http://www.baum.com.au/sabik
_______________________________________________
Okular-devel mailing list
Okular-devel@kde.org
https://mail.kde.org/mailman/listinfo/okular-devel
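
P.S. For illustration only: the "divide it up into rows and columns" step described above can be sketched in a few lines of Python. This is not Okular or Poppler code. It assumes the rows are already available as plain text (for example, pasted from a selection), and the function names, the sample data, and the column offsets are all hypothetical.

```python
def split_fixed_width(line, boundaries):
    """Slice one text row at the given column offsets and strip the padding."""
    edges = [0, *boundaries, len(line)]
    return [line[start:end].strip() for start, end in zip(edges, edges[1:])]


def rows_to_table(text, boundaries):
    """Apply the same column boundaries to every non-blank row."""
    return [
        split_fixed_width(row, boundaries)
        for row in text.splitlines()
        if row.strip()
    ]


# Hypothetical sample: three columns starting at offsets 0, 6 and 18.
sample = (
    "ID    NAME        AMOUNT\n"
    "0001  Widget       12.50\n"
    "0002  Gadget        3.00\n"
)

table = rows_to_table(sample, boundaries=[6, 18])
for row in table:
    print(row)
```

The offsets are given by hand here, much as you would place the column dividers by hand in the selection tool. Working them out automatically is the hard part, and that is what tools like TableSeer attempt.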