----- Original Message -----
From: "Nick Burch" <[email protected]>
To: "POI Users List" <[email protected]>; "Zachary Mitchell" <[email protected]>
Sent: Friday, September 24, 2010 8:42 PM
Subject: Re: Find start and finish point in HWPFDocument bytes.


On Fri, 24 Sep 2010, Zachary Mitchell wrote:
> I wish to create the document and, based on a picture file read as a
> primitive byte [] array, insert these bytes, in the right way, into
> the document's byte [].

That won't work - Word doesn't store the raw picture data at the offset. Instead, at that offset you'll find a series of headers, and if you're lucky, the picture data somewhere after that...

See my earlier reply for more information on what you'd need to do if you wanted to add pictures
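
To illustrate the point about headers coming before the picture data, here is a minimal sketch using only the standard library. It assumes the embedded image is a PNG and scans forward from a record offset for the first four bytes of the PNG signature. The class name SignatureScan, the helper indexOf and the sample bytes are hypothetical; the six "header" bytes are made up and do not reflect the real Word record layout.

```java
public class SignatureScan {
    // First four bytes of the PNG file signature: 0x89 'P' 'N' 'G'.
    public static final byte[] PNG_MAGIC = {(byte) 0x89, 0x50, 0x4E, 0x47};

    /**
     * Returns the index of the first contiguous occurrence of needle in
     * haystack at or after position from, or -1 if there is none.
     */
    public static int indexOf(byte[] haystack, byte[] needle, int from) {
        if (needle.length == 0) {
            return -1; // nothing meaningful to look for
        }
        for (int i = Math.max(from, 0); i <= haystack.length - needle.length; i++) {
            int j = 0;
            while (j < needle.length && haystack[i + j] == needle[j]) {
                j++;
            }
            if (j == needle.length) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Hypothetical stream: six invented header bytes, then the start of a PNG.
        byte[] stream = {
            0x44, 0x00, 0x08, 0x00, 0x64, 0x00,
            (byte) 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A
        };
        int recordStart = 0; // where the picture record is said to begin
        int imageStart = indexOf(stream, PNG_MAGIC, recordStart);
        System.out.println("record starts at " + recordStart
                + ", image data at " + imageStart);
    }
}
```

With the sample bytes above this prints "record starts at 0, image data at 6": the image does not begin at the record offset itself. A signature match in a real file is only a hint, since the same byte values can also occur by chance in unrelated data.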

Nick

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]


I'm still confused. With my "array search" algorithm, I try to search my Word document byte [] for occurrences of my picture byte [].

I am also trying to reverse engineer from Picture and HWPFDocument.

I suspect that I am confusing myself.

Should I compare the byte [] read through a DataInputStream from the picture file or the Word file, or the byte [] obtained from HWPFDocument and Picture?

Any help at all would be appreciated (I have had a look at the POI source code).

//-------------------------------------------
import java.io.*;
import java.nio.*;
import java.util.*;
import java.util.concurrent.*;
import org.apache.poi.poifs.filesystem.*;
import org.apache.poi.poifs.storage.*;
//-------------------------------------------
import org.apache.poi.hwpf.*;
//import org.apache.poi.hwpf.model.*;
//import org.apache.poi.hwpf.model.io.*;
import org.apache.poi.hwpf.usermodel.*;
//-------------------------------------------
//import org.apache.poi.hssf.usermodel.*;
//-------------------------------------------
import java.lang.reflect.*;
//-------------------------------------------
//import javax.management.openmbean.*;
//-------------------------------------------
import javax.imageio.stream.*;
//-------------------------------------------
public class MSImageEmbedAttempt {


public static void main (String [] args)

{
try{
////////////////////////////////////////////////////////////////////////////
FileInputStream input = new FileInputStream(new File("demo.doc"));
POIFSFileSystem fileSystem = HWPFDocument.verifyAndBuildPOIFS(input);
HWPFDocument document = new HWPFDocument(fileSystem);
input.close();
Field dataStream = document.getClass().getDeclaredField("_mainStream");
dataStream.setAccessible(true);
byte [] fileArray = (byte [])dataStream.get(document);
////////////////////////////////////////////////////////////////////////////

//7828 bytes read. how big does File API say it is?
File flanders = new File("flanders.gif");
System.out.println("flanders.gif: " + flanders.length());
System.out.println("---------------------------------------------------------------------");
//Indeed, works for demonstration single file.
//How to write image file bytes out to file?

//Read the whole picture file into a byte array in one go.
DataInputStream inputTwo = new DataInputStream(new FileInputStream("flanders.gif"));
byte [] pictureArray = new byte[(int) new File("flanders.gif").length()];
inputTwo.readFully(pictureArray);
inputTwo.close();

Picture picture = new Picture(pictureArray);
pictureArray = picture.getContent();


//??????????????????????????????????????????????????????????????????????????????????????????
//picture << file => one is an aggregate of the other.
//THIS SECTION NEEDS DEBUGGING AND FURTHER WORK for multiple images in word file.

byte [] resultArray = new byte[pictureArray.length];
int a = -1; //index in fileArray where the match starts (-1 if none found)
int b = -1; //index in fileArray just past the end of the match

//Look for the whole of pictureArray as one contiguous run inside fileArray.
for (int i=0; i+pictureArray.length<=fileArray.length; i++)
{
int j = 0;
while (j<pictureArray.length && fileArray[i+j] == pictureArray[j])
{
j++;
}

if (j == pictureArray.length)
{
a = i;
b = i + pictureArray.length;
System.arraycopy(fileArray,a,resultArray,0,pictureArray.length);
break;
}
}
//??????????????????????????????????????????????????????????????????????????????????????????
//What about when the picture ends, with more file?



System.out.println("Search completed.");
System.out.println("a: " + a);
System.out.println("b: " + b);
System.out.println("Picture array, read by binary from GIF file:");
System.out.println(Arrays.toString(pictureArray)); //A
System.out.println("Word File array, read from HWPFDocument Word document file.");
System.out.println(Arrays.toString(fileArray)); //B
System.out.println("Image result array, extracted by binary from Word document:");
System.out.println(Arrays.toString(resultArray)); //C
System.out.println("---------------------------------------------------------------------");
System.out.println("Number of bytes: " + pictureArray.length);





//-119 is (byte)0x89, the first byte of a PNG signature; 80 is 'P', the second.
//Finding these is a possible sign that the file data is being reinterpreted.
for (int i=0;i<fileArray.length;i++)
{
if(fileArray[i] == -119)
{
System.out.println("Found start.");
//if((i+1<fileArray.length) && (fileArray[i+1] == 80))
//{System.out.println("Found start.");}
}
}


FileImageOutputStream output = new FileImageOutputStream(new File ("destination.gif"));
output.write(pictureArray,0,pictureArray.length);
output.close();


}

catch (Exception e)
{e.printStackTrace();}
}
}




