Hi guys,
I'm sorry to bother you with this, but you seem to be the only people reachable 
outside Microsoft that know about the PPT file format. I have worked a lot 
with the XML versions and also made some minor enhancements to the OO XSLTs that 
convert WordML to OO, but now I need some help with the binary version.
I'm trying to write a comparison function that compares two versions of a 
document with each other and should return true if the documents have the same 
content and false otherwise. I'm using an MD5 hash to do this. 
The reason is that I want to eliminate versions of documents in SharePoint 
where only metadata has changed. Unfortunately, SharePoint is so clever that it 
writes metadata not only into its own database, but also inside the document 
itself, if it is an Office document type. 
Therefore I want to strip off the header (and trailer) that contains metadata. 
For doc files this is quite easy. I just had to remove (or overwrite with 
zeros) the first 2554 and the last 1520 bytes and compare the files afterwards. 
Unfortunately this strategy does not work with PPT files. It seems that every 
sheet inside the file has its own copy of metadata.
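To make the DOC strategy concrete, here is a rough Python sketch of what I 
mean. The function names are only for illustration, and the two offsets are 
the ones that happened to work for my files, not values taken from any 
specification.

```python
import hashlib

# Offsets that worked for my DOC files; found by trial, not from a spec.
DOC_HEADER_BYTES = 2554
DOC_TRAILER_BYTES = 1520


def masked_md5(data: bytes, header: int = DOC_HEADER_BYTES,
               trailer: int = DOC_TRAILER_BYTES) -> str:
    """MD5 hex digest of data with the first `header` and the last
    `trailer` bytes overwritten with zeros."""
    buf = bytearray(data)
    n = len(buf)
    head = min(header, n)            # never mask past the end of the file
    buf[:head] = bytes(head)
    tail = min(trailer, n - head)    # header and trailer must not overlap
    if tail:
        buf[n - tail:] = bytes(tail)
    return hashlib.md5(buf).hexdigest()


def same_content(a: bytes, b: bytes) -> bool:
    """True if both files have the same length and the same masked hash."""
    return len(a) == len(b) and masked_md5(a) == masked_md5(b)
```

Overwriting with zeros instead of cutting the bytes out keeps the file length 
in the comparison, so two versions of different size never compare as equal.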
Can you give me any advice on how to get rid of the metadata (just for the 
comparison)? Is there any byte sequence I can search for and then overwrite the 
next x bytes with zeros?
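In code, what I have in mind would look roughly like the sketch below. Both 
`marker` and `length` are placeholders: I do not know whether such a byte 
sequence exists in PPT files or how long the metadata after it would be, and 
the function name is made up for illustration.

```python
def zero_after_marker(data: bytes, marker: bytes, length: int) -> bytes:
    """Overwrite the `length` bytes that follow every occurrence of
    `marker` with zeros and return the result as a new bytes object."""
    if not marker:
        raise ValueError("marker must not be empty")
    if length < 0:
        raise ValueError("length must not be negative")
    buf = bytearray(data)
    pos = buf.find(marker)
    while pos != -1:
        start = pos + len(marker)
        end = min(start + length, len(buf))   # stop at the end of the file
        buf[start:end] = bytes(end - start)   # same-size replacement
        pos = buf.find(marker, end)           # continue after the zeroed span
    return bytes(buf)
```

The result could then be hashed with MD5 in the same way as for the DOC 
files.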
I would be really thankful for any help.
Thanks a lot and best regards
René
P.S.: Please accept my apologies to those of you also on the [EMAIL PROTECTED] 
mailing list. I've posted a similar question regarding XLS there.
