On Thu, Jul 24, 2014 at 03:11:32PM -0600, Gillian Densmore wrote:
> I'm not a Google fanboy by any stretch. However, of the tools I was
> familiar with, Google Docs was the only one able to extract raw text from a
> corrupted DOC-formatted file.


Back in the day of Word 6, the Unix tool "strings" extracted the raw
text from a Word file. I recovered many a corrupted .doc file that
way. I even recovered Word files from corrupted floppy disks the same
way for my wife (dd if=/dev/fd0|strings). I'm not so sure whether it
works from Word 2003 onwards, but then corrupted .doc files became rarer.
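
If you don't have "strings" to hand, something like the following
Python sketch does roughly the same job: it pulls out runs of printable
ASCII of at least four characters. The function name extract_strings
and the file name in the usage comment are my own inventions for
illustration, and it makes no attempt at 16-bit text, so treat it as a
starting point rather than a faithful clone of the real tool.

```python
import re


def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII (plus tab) at least min_len bytes long."""
    # Printable ASCII is 0x20..0x7e; tab is included as well.
    pattern = re.compile(rb"[\x20-\x7e\t]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


# Hypothetical usage (the file name is made up):
#
#     with open("damaged.doc", "rb") as fh:
#         for line in extract_strings(fh.read()):
#             print(line)
```

Reading the file in binary mode matters here, since a corrupted
document will rarely decode cleanly as text.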

The other interesting thing to note is that you usually also recovered
all the revisions of the Word document - which I believe was the
technique that embarrassed the British Government during the invasion
of Iraq.

And .docx files? They are actually just .zip files in disguise, so you
can try to unzip them. Then what you have should be a directory of
plain text XML files (OOXML anyone?) that you can load into any text
editor, or process with standard XML processing tools. If the zip file
is corrupted though, then you are probably SOL.
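
For what it's worth, here is a minimal Python sketch of the unzip-and-
read-the-XML approach using only the standard library. It assumes the
main body lives at word/document.xml inside the archive, which is the
usual location but not something I'd swear to for every file; the
function name docx_paragraphs is made up for the example.

```python
import xml.etree.ElementTree as ET
import zipfile

# Namespace used by WordprocessingML elements such as <w:p> and <w:t>.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_paragraphs(source) -> list[str]:
    """Return the text of each paragraph in a .docx file.

    source may be a path or a binary file-like object.
    Assumes the body is stored at word/document.xml.
    """
    with zipfile.ZipFile(source) as zf:
        xml_bytes = zf.read("word/document.xml")
    root = ET.fromstring(xml_bytes)
    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is split across one or more <w:t> runs.
        runs = (t.text or "" for t in para.iter(f"{{{W_NS}}}t"))
        paragraphs.append("".join(runs))
    return paragraphs
```

If the archive itself is damaged, zipfile.ZipFile will typically raise
zipfile.BadZipFile, which is the "SOL" case above. Paragraphs nested
inside other paragraphs (text boxes, for instance) may show up more
than once with this simple walk.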


-- 

----------------------------------------------------------------------------
Prof Russell Standish                  Phone 0425 253119 (mobile)
Principal, High Performance Coders
Visiting Professor of Mathematics      [email protected]
University of New South Wales          http://www.hpcoders.com.au

 Latest project: The Amoeba's Secret 
         (http://www.hpcoders.com.au/AmoebasSecret.html)
----------------------------------------------------------------------------

============================================================
FRIAM Applied Complexity Group listserv
Meets Fridays 9a-11:30 at cafe at St. John's College
to unsubscribe http://redfish.com/mailman/listinfo/friam_redfish.com