On Wed, 20 Mar 2002 22:00:27 +0800, allan wrote:

>this might sound naive but does someone know if it's at all
>possible to actually read a microsoft word file and convert
>it to some reasonable text format?
>i did a little search at cpan but didn't really find anything.

I don't think anybody has actually finished that job (or at least,
made it available), but IMO your best starting point would be
OLE::Storage or OLE::Storage_Lite. See the modules
Spreadsheet::WriteExcel and Spreadsheet::ParseExcel as examples of
what can be done with those. Plus, at least some minimal sample code
is provided that extracts the structure from a Word file.

Urm... Yeah: "lhalw" looks like it's a basic "get text from Word"
script; it comes with OLE::Storage.

BTW a bit of hunting around on the web brought me to this page:
<http://www.wvware.com/>. I doubt it'd be easy to make the lot
work on a classic Mac. OSX could well be an easier target.

-- 
	Bart.
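
If none of the above is an option, a crude fallback is to scan the raw
file for runs of printable characters, much like the Unix "strings"
tool. To be clear, this is NOT how OLE::Storage, lhalw or wvWare work:
it ignores the OLE container and the Word file structure entirely, so
it will also pick up field codes, font names and other junk. The
sketch below is in Python rather than Perl, and the function name
extract_text_runs and the min_len parameter are made up for
illustration. It assumes the text is stored either as 8-bit characters
or as UTF-16LE, and it only recognises plain ASCII.

```python
import re


def extract_text_runs(data, min_len=4):
    """Return printable ASCII runs found in raw bytes, in file order.

    Heuristic only: looks for runs stored as 8-bit characters or as
    UTF-16LE (each ASCII byte followed by a zero byte).
    """
    n = str(min_len).encode("ascii")
    wide = rb"(?:[\x20-\x7e]\x00){" + n + rb",}"   # UTF-16LE run
    narrow = rb"[\x20-\x7e]{" + n + rb",}"          # 8-bit run
    pattern = re.compile(wide + rb"|" + narrow)

    runs = []
    for match in pattern.finditer(data):
        chunk = match.group(0)
        # A wide run always has a zero byte in second position;
        # a narrow run never contains one.
        if chunk[1:2] == b"\x00":
            runs.append(chunk.decode("utf-16-le"))
        else:
            runs.append(chunk.decode("ascii"))
    return runs


if __name__ == "__main__":
    import sys

    # Usage: python extract_runs.py some.doc
    with open(sys.argv[1], "rb") as handle:
        for run in extract_text_runs(handle.read()):
            print(run)
```

Raising min_len cuts down on the noise at the cost of dropping short
words that sit between binary bytes. For anything beyond a quick look
at a file, a tool that actually understands the format is the better
bet.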