2016-07-11 15:34 GMT+05:30 Sankar P :
>> I don't know if it does what you want, but have you looked at
>> https://godoc.org/rsc.io/pdf ?
>
> It seems to be unmaintained. I tried loading a complex PDF with plenty
> of tables and it hung infinitely on Content() call in the first page.
> I lost intere
Using pdftohtml and then running regexes or a parser on top seems to be
the easiest solution as of now. I came across tabula-java, which also
seems interesting. Thank you everyone for the recommendations. I still
haven't got multiple tables on a single page, or tables overflowing
across pages, working cor
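
A minimal sketch of the "pdftohtml plus a parser on top" approach, in Go. It assumes the PDF was first converted with pdftohtml's -xml option and that the output has <page> elements containing <text> elements with integer top and left attributes. The struct names, the groupRows helper and the tolerance value are hypothetical, chosen for illustration. The sketch only groups text spans into rows by vertical position; it does not separate multiple tables on one page or stitch tables across pages.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"os"
	"sort"
	"strings"
)

// These structs mirror only the parts of the pdftohtml -xml output that
// this sketch assumes: pages holding positioned text spans.
type document struct {
	Pages []page `xml:"page"`
}

type page struct {
	Number int        `xml:"number,attr"`
	Texts  []textSpan `xml:"text"`
}

// Value holds only the direct character data of a <text> element, so text
// wrapped in nested markup such as <b> is not captured by this sketch.
type textSpan struct {
	Top   int    `xml:"top,attr"`
	Left  int    `xml:"left,attr"`
	Value string `xml:",chardata"`
}

// parseDocument decodes the XML into the structs above.
func parseDocument(data []byte) (document, error) {
	var doc document
	err := xml.Unmarshal(data, &doc)
	return doc, err
}

// groupRows treats spans whose top coordinate is within tolerance of the
// first span in the current row as one row, then orders cells left to right.
func groupRows(spans []textSpan, tolerance int) [][]string {
	sorted := append([]textSpan(nil), spans...)
	sort.SliceStable(sorted, func(i, j int) bool { return sorted[i].Top < sorted[j].Top })

	var rows [][]string
	var current []textSpan
	flush := func() {
		if len(current) == 0 {
			return
		}
		sort.SliceStable(current, func(i, j int) bool { return current[i].Left < current[j].Left })
		cells := make([]string, len(current))
		for i, s := range current {
			cells[i] = strings.TrimSpace(s.Value)
		}
		rows = append(rows, cells)
		current = nil
	}
	for _, s := range sorted {
		if len(current) > 0 && s.Top-current[0].Top > tolerance {
			flush()
		}
		current = append(current, s)
	}
	flush()
	return rows
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: xmlrows file.xml")
		os.Exit(2)
	}
	data, err := os.ReadFile(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	doc, err := parseDocument(data)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, p := range doc.Pages {
		for _, row := range groupRows(p.Texts, 2) {
			fmt.Printf("page %d\t%s\n", p.Number, strings.Join(row, "\t"))
		}
	}
}
```

Grouping by position rather than by regex keeps cells that contain spaces intact, at the cost of needing a tolerance that suits the document's line spacing.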
> I don't know if it does what you want, but have you looked at
> https://godoc.org/rsc.io/pdf ?
It seems to be unmaintained. I tried loading a complex PDF with plenty
of tables and it hung indefinitely on the Content() call for the first
page. I lost interest after that. Thanks.
--
Sankar P
http://ps
On Thu, 30 Jun 2016 11:22:00 -0400
Shawn Milochik wrote:
> I don't know of a Go solution, but if you are on Linux you could try
> pdftotext and parse the text. With the obvious caveat of "it depends
> on how the PDF was encoded."
I'm using this approach in one of my applications.
The only probl
On Thu, Jun 30, 2016 at 1:35 AM, Sankar wrote:
>
> Are there any stable/production-quality golang libraries that people are
> aware of which could read and extract tabular data out of PDF documents ?
I don't know if it does what you want, but have you looked at
https://godoc.org/rsc.io/pdf ?
Ian
I don't know of a Go solution, but if you are on Linux you could try
pdftotext and parse the text. With the obvious caveat of "it depends on how
the PDF was encoded." Worst-case you may be able to use tesseract OCR to
generate text and then do the same thing.
https://packages.debian.org/sid/poppl
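
A minimal sketch of the pdftotext approach described above, in Go. It assumes pdftotext (from poppler-utils) is on the PATH, uses its -layout option to preserve column alignment, and passes "-" as the output file to write to stdout. The parseRows and extractText helpers are hypothetical names, and splitting on runs of two or more spaces is a heuristic that will break on cells containing wide gaps or on tightly packed columns.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"regexp"
	"strings"
)

// columnGap matches runs of two or more spaces, which layout-preserved
// text usually leaves between table columns. This is a heuristic.
var columnGap = regexp.MustCompile(` {2,}`)

// parseRows splits layout-preserved text into rows of cells. Lines that
// yield fewer than two cells are skipped as non-tabular.
func parseRows(text string) [][]string {
	var rows [][]string
	for _, line := range strings.Split(text, "\n") {
		line = strings.TrimSpace(line)
		if line == "" {
			continue
		}
		cells := columnGap.Split(line, -1)
		if len(cells) < 2 {
			continue
		}
		rows = append(rows, cells)
	}
	return rows
}

// extractText runs pdftotext and returns what it writes to stdout.
func extractText(pdfPath string) (string, error) {
	out, err := exec.Command("pdftotext", "-layout", pdfPath, "-").Output()
	if err != nil {
		return "", fmt.Errorf("pdftotext: %w", err)
	}
	return string(out), nil
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: textrows file.pdf")
		os.Exit(2)
	}
	text, err := extractText(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, row := range parseRows(text) {
		fmt.Println(strings.Join(row, "\t"))
	}
}
```

Keeping the parsing in parseRows, separate from the external command, makes the heuristic easy to test against saved text and to swap for a stricter parser once the PDF's layout is known.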
Yes, I did come across your service when I was searching. The documents
contain some PII, so I did not try the service. Having an on-premise
solution is encouraging. I will play with it. Thanks.
2016-06-30 14:35 GMT+05:30 Peter Waller :
> Hi Sankar,
>
> It may not be exactly what you're looking for
Hi Sankar,
It may not be exactly what you're looking for but I can't resist the
opportunity to plug our product! PDFTables.com has a remote API, you can
see an example of how to use it here:
https://github.com/pdftables/api/blob/master/go/cmd/pdftables-api/main.go
You can get an API key and find
Hi
Are there any stable/production-quality golang libraries that people are
aware of which could read and extract tabular data out of PDF documents?
Thanks
Sankar
--
You received this message because you are subscribed to the Google Groups
"golang-nuts" group.