Hi Nick,

> That suggests that it's stored in a different bit of the file (a different
> stream) to the ones we're expecting to find it in. The file format is
> documented, so you can look up what each different bit means, but there are
> a lot of duplicate fields for historical reasons. What we lack is a guide
> saying "outlook 200x stores the sent date as MAPI_???_DATE, while 200y uses
> OUTLOOK_DATE_MAPI_???_V3"

I see.  This makes sense.

> What'd be great is if you could use org.apache.poi.hsmf.dev.HSMFDump
> (contained within the poi-scratchpad jar, dependency on the main poi jar but
> I don't think anything else) to try to track down which chunk contains the
> date. You might need to combine that with a little bit of hacking of your
> ruby script, to have it print some debug logging of what fields it's
> printing from
>
> Once we know the field, we can look up the details on how it's stored, then
> add a fallback check of that field/chunk too

Sounds promising.  Is there a way to run the HSMFDump component of
poi-scratchpad.jar on my .msg file from the command line?  If so, could you
give me a pointer?  If not, I fear this may be getting beyond my
abilities, as I'm not a Java programmer.  (I'm generally
Unix-literate and work on Mac OS X; in terms of programming, I use
XQuery with eXist-db, which incorporates Tika as an extension
module.)  I'm happy to provide the .msg file in question off list, if
it would help, but I'd understand if you aren't able to help to that
extent.

Thanks,
Joe
