Hi Andreas,
I appreciate the offer. After some more digging I have found that the
assumption made by this code snippet (from
Ole10Native.createFromEmbeddedOleObject) is not 100% reliable:
    try {
        directory.getEntry("\u0001Ole10ItemName");
        plain = true;
    } catch (FileNotFoundException ex) {
        plain = false;
    }
What I have found is that some documents that do not contain this entry
(i.e. plain=false) are still extractable if you set plain=true.
So I have made the following (very similar) method to replace the call:
    private Ole10Native resilientCreateFromEmbeddedOleObject(DirectoryNode directory)
            throws IOException, Ole10NativeException {
        final String OLE10_NATIVE = "\u0001Ole10Native";

        // Initial guess: treat the stream as 'plain' if an Ole10ItemName entry exists.
        boolean plain;
        try {
            directory.getEntry("\u0001Ole10ItemName");
            plain = true;
        } catch (FileNotFoundException ex) {
            plain = false;
        }

        DocumentEntry nativeEntry = (DocumentEntry) directory.getEntry(OLE10_NATIVE);
        byte[] data = new byte[nativeEntry.getSize()];
        DocumentInputStream in = directory.createDocumentInputStream(nativeEntry);
        try {
            // readFully fills the whole buffer; a single read() may return fewer bytes.
            in.readFully(data);
        } finally {
            in.close();
        }

        // Have 2 goes at this - 'plain' can lie!
        try {
            return new Ole10Native(data, 0, plain);
        } catch (Ole10NativeException e) {
            // If the second attempt also fails, its exception goes to the caller.
            return new Ole10Native(data, 0, !plain);
        }
    }
This gives a higher success rate. I will let you know what else I find :-)
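For anyone who wants to reuse the idea outside POI, here is a minimal sketch of the same "try the guessed mode, then the opposite one" fallback. Nothing in it comes from POI: ModeFallbackSketch, ModeParser, ParseException and toyParse are made-up names for illustration, and the toy parser only stands in for the Ole10Native constructor.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Toy stand-in for the two-attempt fallback; no POI types are involved. */
final class ModeFallbackSketch {

    /** Hypothetical checked exception, playing the role of Ole10NativeException. */
    static final class ParseException extends Exception {
        ParseException(String message) { super(message); }
    }

    /** Hypothetical parser whose behaviour depends on a boolean mode flag. */
    interface ModeParser<T> {
        T parse(byte[] data, boolean plain) throws ParseException;
    }

    /**
     * Tries the guessed mode first and, if that fails, the opposite mode.
     * If both attempts fail, the second exception is thrown with the first
     * one attached as a suppressed exception.
     */
    static <T> T parseWithFallback(byte[] data, boolean guessedPlain, ModeParser<T> parser)
            throws ParseException {
        try {
            return parser.parse(data, guessedPlain);
        } catch (ParseException first) {
            try {
                return parser.parse(data, !guessedPlain);
            } catch (ParseException second) {
                second.addSuppressed(first);
                throw second;
            }
        }
    }

    /** Toy parser: framed payloads start with a 0x02 marker byte, plain ones must not. */
    static byte[] toyParse(byte[] data, boolean plain) throws ParseException {
        boolean hasMarker = data.length > 0 && data[0] == 0x02;
        if (plain) {
            if (hasMarker) throw new ParseException("looks framed, not plain");
            return data.clone();
        }
        if (!hasMarker) throw new ParseException("missing marker byte");
        return Arrays.copyOfRange(data, 1, data.length);
    }

    public static void main(String[] args) throws ParseException {
        byte[] framed = {0x02, 'h', 'i'};
        // The guess (plain=true) is wrong here, so the second attempt does the work.
        byte[] out = parseWithFallback(framed, true, ModeFallbackSketch::toyParse);
        System.out.println(new String(out, StandardCharsets.US_ASCII)); // prints "hi"
    }
}
```

Keeping the first failure as a suppressed exception is just one option; it means the reason the initial guess failed is not lost when both attempts fail.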
Kind regards,
- Chris
On 16 Jul 2014, at 23:00, Andreas Beeker wrote:
Hi Chris,
> On 16.07.2014 15:24, Chris Bamford wrote:
> Looking in the source of Ole10Native at the offending line I see:
>     if (totalSize < ofs) {
>         throw new Ole10NativeException("Invalid Ole10Native");
>     }
>
> Can anyone shed any light on what this means and why it happens?
The MS docs [1] are quite limited on that stream, so the code is just plain
guessing :|
There are Ole10Native streams without an actual data part - e.g. (some)
equation editor objects come without the data part, but somehow encode their
data within the filename.
But the OLE objects I have looked at so far had in common a label, a
filename and a command, or at least 3 length-prefixed byte arrays.
So this line checks whether there was an error with the length prefixes.
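To make that kind of check concrete, here is a minimal sketch of validating length prefixes against a declared total size. It assumes a made-up toy layout (a little-endian int32 total size, then fields that each consist of an int32 length and that many bytes), which is not the real Ole10Native layout; LengthPrefixSketch and skipFields are illustrative names only.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Toy format, NOT the real Ole10Native layout: int32 totalSize, then (int32 length, bytes) fields. */
final class LengthPrefixSketch {

    /**
     * Walks fieldCount length-prefixed fields and returns the number of bytes
     * consumed after the size header. Throws if a prefix points past the
     * buffer or past the declared total size.
     */
    static int skipFields(byte[] data, int fieldCount) {
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        if (buf.remaining() < 4) {
            throw new IllegalArgumentException("missing size header");
        }
        int totalSize = buf.getInt(); // declared size of everything after the header
        int ofs = 0;                  // bytes consumed after the header so far
        for (int i = 0; i < fieldCount; i++) {
            if (buf.remaining() < 4) {
                throw new IllegalArgumentException("field " + i + ": missing length prefix");
            }
            int len = buf.getInt();
            ofs += 4;
            if (len < 0 || len > buf.remaining()) {
                throw new IllegalArgumentException("field " + i + ": length overruns the buffer");
            }
            buf.position(buf.position() + len);
            ofs += len;
            // Same shape as the check quoted above: the running offset must
            // never exceed the size the stream claims to have.
            if (totalSize < ofs) {
                throw new IllegalArgumentException(
                        "offset " + ofs + " exceeds declared size " + totalSize);
            }
        }
        return ofs;
    }

    public static void main(String[] args) {
        ByteBuffer b = ByteBuffer.allocate(15).order(ByteOrder.LITTLE_ENDIAN);
        // totalSize = 11, then field "a" (length 1) and field "bc" (length 2)
        b.putInt(11).putInt(1).put((byte) 'a').putInt(2).put((byte) 'b').put((byte) 'c');
        System.out.println(skipFields(b.array(), 2)); // prints 11
    }
}
```

If the size header is too small, or the bytes are read under the wrong interpretation so that the wrong values are taken as lengths, the running offset overtakes the declared size and the sketch rejects the input.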
If you can share your file, please open a bug entry or alternatively send it to
my private email.
I would then try to figure out how the bin object could be handled.
Currently I don't have much time and my priority is to finish that xml
signature stuff, so that may take some time ... sorry
Andi.
[1] http://msdn.microsoft.com/en-us/library/dd942447.aspx
Chris Bamford
Senior Developer