On Wed, 28 Jan 2009, Alex Rufon wrote:
> PDF format, they have to go through these steps:
>
> 1. Export the PDF file into HTML
>
> 2. Parse the HTML file
>
> 3. Insert/Update the databases
In theory you can parse the PDF file yourself: it is largely a plain
text format, possibly with embedded graphics or zlib-compressed text
streams. However, I think you can simply export the PDF to plain text
directly with one of the existing utilities (a web search will turn up
several). OpenOffice can also be used as a command-line converter
between various formats, including PDF and txt, IIRC.
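
To illustrate the "parse it yourself" route, here is a minimal Python
sketch that pulls the zlib-compressed streams out of a PDF and inflates
them. It is only a rough illustration, not a real PDF parser: the
function name inflate_streams is made up for this example, it assumes
each stream sits between literal "stream" and "endstream" keywords, and
it ignores object structure, encryption, and every filter other than
zlib/Flate. What comes out are raw PDF content-stream operators, so
further parsing is still needed to recover readable text.

```python
import re
import zlib

# Raw bytes between the "stream" and "endstream" keywords of each object.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def inflate_streams(pdf_bytes):
    """Return the inflated contents of every zlib-compressed stream found.

    Hypothetical helper for illustration only; streams that are not
    zlib data (images, other filters, uncompressed text) are skipped.
    """
    results = []
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            # decompressobj tolerates the end-of-line bytes that usually
            # sit between the compressed data and "endstream".
            results.append(zlib.decompressobj().decompress(raw))
        except zlib.error:
            continue
    return results
```

It could be used along the lines of
inflate_streams(open("report.pdf", "rb").read()), where report.pdf is
just a placeholder file name.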
--
regards,
====================================================
GPG key 1024D/4434BAB3 2008-08-24
gpg --keyserver subkeys.pgp.net --recv-keys 4434BAB3
唐詩017 王維 西施詠
艷色天下重 西施寧久微 朝為越溪女 暮作吳宮妃 賤日豈殊眾 貴來方悟稀
邀人傅脂粉 不自著羅衣 君寵益嬌態 君憐無是非 當時浣紗伴 莫得同車歸
持謝鄰家子 效顰安可希
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm