On Thu, Jul 24, 2014 at 03:11:32PM -0600, Gillian Densmore wrote:
> I'm not a Google fanboy by any stretch. However, of the tools I was
> familiar with, Google Docs was the only one able to extract raw text from a
> corrupted DOC-formatted file.

Back in the days of Word 6, the Unix tool "strings" extracted the raw
text from a Word file. I recovered many a corrupted .doc file that
way. I even recovered Word files from corrupted floppy disks the same
way for my wife (dd if=/dev/fd0 | strings). I'm not so sure it works
from Word 2003 onwards, but by then corrupted .doc files had become
rarer.

The other interesting thing to note is that you usually also recovered
all the revisions of the Word document, which I believe was the
technique that embarrassed the British Government during the invasion
of Iraq.

And .docx files? They are actually just .zip files in disguise, so you
can try to unzip them. What you then have should be a directory of
plain-text XML files (OOXML, anyone?) that you can load into any text
editor or process with standard XML processing tools. If the zip file
is corrupted, though, you are probably SOL.

--

----------------------------------------------------------------------------
Prof Russell Standish                  Phone 0425 253119 (mobile)
Principal, High Performance Coders
Visiting Professor of Mathematics      [email protected]
University of New South Wales          http://www.hpcoders.com.au

Latest project: The Amoeba's Secret
         (http://www.hpcoders.com.au/AmoebasSecret.html)
----------------------------------------------------------------------------

============================================================
FRIAM Applied Complexity Group listserv
Meets Fridays 9a-11:30 at cafe at St. John's College
to unsubscribe http://redfish.com/mailman/listinfo/friam_redfish.com
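
For anyone who wants to try the two recovery tricks described above without a Unix box to hand, here is a rough Python sketch. It is illustrative only, not a tested recovery tool. The function names extract_strings and docx_paragraphs are made up for this example, the first is only an approximation of what strings(1) does (runs of printable ASCII, default minimum length 4), and the second assumes the .docx keeps its main text in the usual word/document.xml part and that the zip itself is still readable.

```python
import re
import zipfile
from xml.etree import ElementTree

# WordprocessingML namespace used by the elements in word/document.xml.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Approximate strings(1): return runs of printable ASCII.

    Hypothetical helper; only bytes 0x20-0x7e count as printable here,
    so text stored in 16-bit encodings will be missed.
    """
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def docx_paragraphs(path_or_file) -> list[str]:
    """Return the text of each paragraph in a .docx file.

    Hypothetical helper; assumes the main document part lives at
    word/document.xml, which is the conventional location.
    """
    with zipfile.ZipFile(path_or_file) as zf:
        root = ElementTree.fromstring(zf.read("word/document.xml"))
    paragraphs = []
    for p in root.iter(W + "p"):
        # A paragraph's text is split across one or more <w:t> runs.
        paragraphs.append("".join(t.text or "" for t in p.iter(W + "t")))
    return paragraphs
```

Usage would be something like extract_strings(open("broken.doc", "rb").read()) for an old .doc, or docx_paragraphs("report.docx") for a .docx (both file names are placeholders). If the zip is damaged, zipfile.ZipFile raises zipfile.BadZipFile, which is the "probably SOL" case.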
