Re: [909linux] parsing PDF with Perl

Joel Brauer Wed, 11 Oct 2006 15:54:30 -0700 (PDT)

I would start with pdftotext and then parse the resulting text from there...

pdftotext is part of the poppler-utils package on my system (Ubuntu).
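
A rough sketch of what "parse from there" could look like. It is written in Python rather than the Perl asked about; the same idea (split each line on runs of whitespace) carries over to Perl's split. The function names, the "|" delimiter, and the guess that columns in the -layout output are separated by two or more spaces are illustrative assumptions, not something checked against Business_List.pdf.

```python
import re
import subprocess


def pdf_to_text(pdf_path):
    """Run pdftotext in layout mode and return the extracted text."""
    # -layout keeps the physical column layout of the page;
    # "-" as the output file sends the text to stdout.
    result = subprocess.run(
        ["pdftotext", "-layout", pdf_path, "-"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def to_delimited(text, delimiter="|"):
    """Convert column-aligned text into a list of delimited lines.

    Assumption: columns are separated by at least two spaces, while
    single spaces only occur inside a field (e.g. a business name).
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records or pages
        fields = re.split(r"\s{2,}", line)
        rows.append(delimiter.join(fields))
    return rows
```

Calling to_delimited(pdf_to_text("Business_List.pdf")) would give one delimited string per non-blank line. Page headers and wrapped fields would most likely still need cleaning up by hand or with extra rules.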


-joel

On Wed, 2006-10-11 at 15:13 -0700, Roger E. Rustad, Jr. wrote:
> I need to parse this PDF into a delimited text format
> 
> http://www.riversideca.gov/finance/pdf/Business_List.pdf
> 
> (Ideally, I'd like to do it in Perl b/c I hear Perl has some great
> scraping/parsing features that would benefit me later on when I need
> to do this kind of thing again.) 
> 
> Any suggestions?
> 
> (I can copy the text into a txt file first, in case that makes the
> scraping/parsing easier)
> _______________________________________________
> 909linux mailing list
> [email protected]
> http://909linux.org/cgi-bin/mailman/listinfo/909linux
-- 
Joel Brauer
Manager IS
Communications and Web Technologies
[email protected]
pager: [email protected]
office: 909-558-7713
cell: 909-534-1934

Only you can decide to be happy! The rest of life is working out the
details...
