Re: Table Extraction

2020-10-13 Thread Peter Murray-Rust
A word of warning: extracting tables in general is very hard. I spent last year developing code based on PDFBox to extract data *automatically* from a very limited subset of tables. It may be easier if you can interact with each table manually, but that takes time. (Also see Tabula, which has pioneer

Table Extraction

2020-10-13 Thread Kaushlendra Singh
Hi, I need to extract meaningful text from tables in a PDF document. PDFBox doesn't provide an API for this directly, but while searching I found https://gist.github.com/beldaz/8ed6e7473bd228fcee8d4a3e4525be11, which helped me get meaningful text, which internally involves creating the