Hi Ian,

I couldn't see a difference in the file format between protected and 
non-protected documents. I did get "Microsoft Word 97-2003 Document" for `.doc` 
and "Microsoft Word Document" for `.docx`, though. Is what you're seeing based 
on the file extension, or definitely on the protection status?

Assuming that you can't tell without opening the files, here's what I'd do.

Using a machine with Word 2007 or 2010 on it, I would use VSTO to run through 
each of the 50,000+ documents and convert every `.doc` file to `.docx` (in a 
temporary folder, of course). Then I would use `System.IO.Packaging` to open 
each converted file, read the `/word/settings.xml` part within the package, and 
check whether it contains a `<w:documentProtection />` node (or similar).
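
As a rough illustration of the second step only, here is a minimal sketch in 
Python rather than `System.IO.Packaging`; it relies on the fact that a `.docx` 
file is a ZIP package. The function name `has_document_protection` is made up 
for this example, and the check only tests whether the element is present; it 
does not look at attributes such as `w:enforcement`, so treat a hit as "worth 
a closer look" rather than proof.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, normally bound to the "w:" prefix.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def has_document_protection(docx_path):
    """Return True if word/settings.xml contains a w:documentProtection element.

    Hypothetical helper for illustration: presence check only, attributes
    on the element are not inspected.
    """
    with zipfile.ZipFile(docx_path) as package:
        try:
            settings_xml = package.read("word/settings.xml")
        except KeyError:
            # No settings part in the package, so nothing to find.
            return False
    root = ET.fromstring(settings_xml)
    return root.find(f".//{{{W_NS}}}documentProtection") is not None
```

Calling this on each converted file in the temporary folder and collecting the 
paths that return `True` would give the first-pass list of candidates.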

Would that work for you?

Cheers.

James.

From: [email protected] [mailto:[email protected]] On 
Behalf Of Ian Thomas
Sent: Monday, 30 May 2011 22:13
To: [email protected]; 'ozDotNet'
Subject: Word VSTO question


I have 50000+ short Word documents, a proportion of which have a small 
protected section. As a first pass, I need to identify which of the files have 
a protected section. Can anyone help me with how to do that?
On the basis of a sample of one of each, the Word file format is "Microsoft 
Word 97-2003 Document" for the files without a protected section, and 
"Microsoft Word Document" for those that do have a protected section. (The 
machine I inspected these with has only Office 2003 installed.)
________________________________

Ian Thomas
Victoria Park, Western Australia
