On Wed, 28 Jan 2009, Alex Rufon wrote:
> PDF format, they have to go through these steps:
> 
> 1.       Export the PDF file into HTML
> 
> 2.       Parse the HTML file 
> 
> 3.       Insert/Update the databases

In theory you can parse the PDF file yourself, because it is largely a
plain text format, possibly with embedded graphics or text streams
compressed with zlib.  However, I think you can simply export the PDF
to txt directly using one of the existing utilities (please google for
one yourself).  IIRC, OpenOffice can also be used as a command-line
converter between various formats, including PDF and txt.
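
To illustrate the "parse it yourself" route, here is a minimal sketch in
Python (not J) of pulling the zlib-compressed streams out of a PDF.  The
function name inflate_streams and the regular expression are my own
assumptions for illustration.  It only inflates streams that happen to be
zlib data; it does not interpret the PDF object structure, fonts, or text
encodings, so the output is raw page-drawing operators rather than clean
text.

```python
import re
import zlib

# Raw bytes between the "stream" and "endstream" keywords of a PDF object.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def inflate_streams(pdf_bytes):
    """Return the decompressed body of every zlib-compressed stream found."""
    results = []
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            # decompressobj tolerates the end-of-line bytes that usually
            # sit between the compressed data and "endstream".
            results.append(zlib.decompressobj().decompress(raw))
        except zlib.error:
            # Not zlib data (uncompressed or another filter); skip it.
            continue
    return results
```

Typical use would be inflate_streams(open("report.pdf", "rb").read()),
where report.pdf is a hypothetical file name.  In practice a ready-made
converter is usually much less work, which is why I suggest that first.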

-- 
regards,
====================================================
GPG key 1024D/4434BAB3 2008-08-24
gpg --keyserver subkeys.pgp.net --recv-keys 4434BAB3
唐詩017 王維  西施詠
    艷色天下重  西施寧久微  朝為越溪女  暮作吳宮妃  賤日豈殊眾  貴來方悟稀
    邀人傅脂粉  不自著羅衣  君寵益嬌態  君憐無是非  當時浣紗伴  莫得同車歸
    持謝鄰家子  效顰安可希
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
