Hi again Nick,

This problem appears to be Mac-specific; I have had more luck with a .doc file created natively in Windows :-)

Now POIFSLister shows the ObjectPool and the item in it:

Root Entry -
  SummaryInformation <(0x05)SummaryInformation> [412 / 0x19c]
  DocumentSummaryInformation <(0x05)DocumentSummaryInformation> [280 / 0x118]
  WordDocument [4142 / 0x102e]
  1Table [2087 / 0x827]
  ObjectPool -
    _1432368106 -
      CompObj <(0x01)CompObj> [76 / 0x4c]
      ObjInfo <(0x03)ObjInfo> [6 / 0x6]
      Ole10Native <(0x01)Ole10Native> [568849 / 0x8ae11]
      EPRINT <(0x03)EPRINT> [5000 / 0x1388]
  CompObj <(0x01)CompObj> [113 / 0x71]
  Data [4096 / 0x1000]

Please can you point me to any resources which could help me to save the embedded file to another file (i.e. read all the bytes and save them somewhere)?

Thanks,

- Chris

On 10 Jun 2013, at 09:33, Chris Bamford wrote:

> Hi Nick,
>
> I created a .doc file with an embedded MP3 (that is, I dragged an MP3 file
> from Finder and dropped it into the document whereupon Word displayed a small
> image of a loudspeaker - I took this as a positive sign!).
> I then added some text for good measure and saved it, taking care to save it
> as "Word 97 - 2004".
> Then I ran POIFSLister -sizes on it and got:
>
> Root Entry -
>   SummaryInformation <(0x05)SummaryInformation> [4096 / 0x1000]
>   DocumentSummaryInformation <(0x05)DocumentSummaryInformation> [4096 / 0x1000]
>   WordDocument [9152 / 0x23c0]
>   1Table [7280 / 0x1c70]
>   CompObj <(0x01)CompObj> [96 / 0x60]
>
> Looking closer in the debugger, I discovered that none of the entries shown
> are of type DirectoryNode, so I cannot even start the process of finding /
> extracting the MP3.
> Any ideas what I might be doing wrong?
> Thanks,
>
> - Chris
>
>
> Thanks Nick, must have missed that. Will check it out.
> Chris
> On 7 Jun 2013, at 14:12, Nick Burch wrote:
>> On Fri, 7 Jun 2013, Chris Bamford wrote:
>>> Is there a way to extract files embedded into Word docs (.doc, not .docx),
>>> using the HWPF package?
>>
>> Does the information on http://poi.apache.org/poifs/embeded.html not cover
>> what you need?
>>
>> Nick
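
On the question above about saving the embedded file: the Ole10Native stream in the listing (568849 bytes) is presumably where the MP3 lives, wrapped in a small header. Below is a minimal sketch in plain Java, not a definitive implementation, that takes the raw bytes of that stream and pulls out the payload. The header layout it walks (size, flags, label, file name, two shorts, command length, command string, data size, data) is an assumption about how packaged files are commonly stored, not something confirmed in this thread. The names `Ole10NativeSketch`, `Embedded` and `parse` are hypothetical. Getting the stream bytes out of the .doc in the first place is left to POIFS, and POI's own `Ole10Native` class in `org.apache.poi.poifs.filesystem` is likely the more robust route if it handles your file.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Sketch: recover the embedded payload from the raw bytes of an Ole10Native stream. */
final class Ole10NativeSketch {

    /** File name and payload recovered from the stream. */
    static final class Embedded {
        final String fileName;
        final byte[] data;

        Embedded(String fileName, byte[] data) {
            this.fileName = fileName;
            this.data = data;
        }
    }

    /** Reads a NUL-terminated single-byte string; leaves the buffer just past the NUL. */
    private static String readCString(ByteBuffer buf) {
        int start = buf.position();
        while (buf.get() != 0) {
            // advance until the terminator has been consumed
        }
        int end = buf.position() - 1; // index of the NUL byte
        return new String(buf.array(), buf.arrayOffset() + start, end - start,
                StandardCharsets.ISO_8859_1);
    }

    /** Walks the ASSUMED header layout and returns the file name and payload. */
    static Embedded parse(byte[] stream) {
        ByteBuffer buf = ByteBuffer.wrap(stream).order(ByteOrder.LITTLE_ENDIAN);
        buf.getInt();                       // total size of what follows (not checked here)
        buf.getShort();                     // flags
        readCString(buf);                   // label shown in the document
        String fileName = readCString(buf); // original file name
        buf.getShort();                     // second flags field
        buf.getShort();                     // marker value (assumed)
        buf.getInt();                       // length of the command string (assumed)
        readCString(buf);                   // command string, typically a temp path
        int dataSize = buf.getInt();        // number of payload bytes
        if (dataSize < 0 || dataSize > buf.remaining()) {
            throw new IllegalArgumentException("Unexpected data size: " + dataSize);
        }
        byte[] data = new byte[dataSize];
        buf.get(data);
        return new Embedded(fileName, data);
    }

    /** args[0]: file holding the raw stream bytes; args[1]: where to write the payload. */
    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: Ole10NativeSketch <stream-bytes-file> <output-file>");
            return;
        }
        Embedded e = parse(Files.readAllBytes(Paths.get(args[0])));
        Files.write(Paths.get(args[1]), e.data);
        System.out.println("Wrote " + e.data.length + " bytes (original name: " + e.fileName + ")");
    }
}
```

If the data size check fails on a real file, that is a sign the assumed layout does not match and the header should be inspected in a hex viewer before trusting the output.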