Re: [iText-questions] How can compare the content of two revision

OscarP Tue, 26 May 2009 08:51:41 -0700

Hi,

OK Michael, I was able to get the PDF contents, but my method doesn't work
for all PDF files. For instance, I can't get it to work with PDF files
generated with OpenOffice.


import java.io.*;
import java.util.*;

import com.lowagie.text.*;
import com.lowagie.text.pdf.*;
 
public class Example {
        public static void main(String[] args) {
                comprobar("d:\\pruebas\\PDFs\\textoF2I.pdf");
        }

        public static void comprobar(String fichero) {
                System.out.println("/////////////////////////////////////");
                System.out.println(fichero);
                System.out.println("/////////////////////////////////////");
                try {
                        PdfReader reader1 = new PdfReader(fichero);
                        System.out.println(obtenerPaginaPDF(reader1,1));
                }catch(Exception e){
                        e.printStackTrace();
                }
        }
        
        public static String obtenerPaginaPDF (PdfReader reader,int i){
                try{
                        PdfDictionary page = reader.getPageN(i);
                        byte[] streamBytes = getStreamBytes(page);              
        
                        PRTokeniser tokenizer = new PRTokeniser(streamBytes);
                        StringBuffer sb = new StringBuffer();
                        boolean arrayAbierto = false;
                        while (tokenizer.nextToken()) {
                                if (tokenizer.getTokenType() == PRTokeniser.TK_STRING) {
                                        if (tokenizer.getStringValue().equals(" ") && !arrayAbierto)
                                                sb.append("\n");
                                        else
                                                sb.append(tokenizer.getStringValue());
                                }
                                else if (tokenizer.getTokenType() == PRTokeniser.TK_START_ARRAY) {
                                        arrayAbierto=true;
                                }
                                else if (tokenizer.getTokenType() == PRTokeniser.TK_END_ARRAY) {
                                        arrayAbierto=false;
                                        sb.append("\n");
                                }
                        }
                        return sb.toString();
                } catch (IOException e) {
                        // TODO Auto-generated catch block
                        e.printStackTrace();
                }
                return null;
                
        }                       

        private static byte[] getStreamBytes(PdfDictionary page) throws IOException{
                PdfObject resources = page.get(PdfName.RESOURCES);

                byte[] streamBytes=null;                        
                if (resources instanceof PdfDictionary){
                        try{
                                PdfDictionary object = (PdfDictionary) ((PdfDictionary)resources).get(PdfName.XOBJECT);
                                if (object!=null){
                                        Set set = object.getKeys();
                                        Iterator it = set.iterator();
                                        while (it.hasNext()){
                                                PdfName s = (PdfName) it.next();
                                                if (object.get(s) instanceof PRIndirectReference){
                                                        PRIndirectReference objectReference = (PRIndirectReference) object.get(s);
                                                        PRStream stream = (PRStream) PdfReader.getPdfObject(objectReference);
                                                        streamBytes = PdfReader.getStreamBytes(stream);
                                                }
                                        }
                                }
                        }catch(Exception e){
                                e.printStackTrace();
                        }
                }
                else if (resources instanceof PRIndirectReference){
                        try{
                                PdfDictionary object = (PdfDictionary)PdfReader.getPdfObject(resources);
                                if (object!=null){
                                        Set set = object.getKeys();
                                        Iterator it = set.iterator();
                                        while (it.hasNext()){
                                                PdfName s = (PdfName) it.next();
                                                if (object.get(s) instanceof PRIndirectReference){
                                                        PRIndirectReference objectReference = (PRIndirectReference) object.get(s);
                                                        PRStream stream = (PRStream) PdfReader.getPdfObject(objectReference);
                                                        streamBytes = PdfReader.getStreamBytes(stream);
                                                }
                                        }
                                }
                        }catch(Exception e){
                        }
                }
                if (streamBytes==null){
                        PdfObject ob = page.get(PdfName.CONTENTS);
                        if (ob instanceof PRIndirectReference){
                                PRIndirectReference contents = (PRIndirectReference) page.get(PdfName.CONTENTS);
                                PRStream streamContents = (PRStream) PdfReader.getPdfObject(contents);
                                streamBytes = PdfReader.getStreamBytes(streamContents);
                        }
                        else if (ob instanceof PdfArray){
                                for (int j=0;j<((PdfArray)ob).size();j++){
                                        PRIndirectReference ir = (PRIndirectReference)((PdfArray)ob).getPdfObject(j);
                                        PRStream streamContents = (PRStream) PdfReader.getPdfObject(ir);
                                        streamBytes = PdfReader.getStreamBytes(streamContents);
                                }
                        }
                }
                return streamBytes;
        }
}

Probably this is not the right way to get the PDF contents, but I see no
other way to do it, and I don't know what else I can try.
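
One thing in getStreamBytes above that may be worth checking: when /Contents
is an array, the loop assigns each stream to streamBytes in turn, so only the
last stream survives. The PDF specification treats the streams of a /Contents
array as if they were concatenated, in order, into a single stream. Below is a
minimal, library-free sketch of that concatenation. The names ContentsConcat
and concat are hypothetical and not part of iText, and the byte arrays stand in
for what PdfReader.getStreamBytes would return for each array entry.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not an iText class: joins the decoded streams of a
// /Contents array instead of keeping only the last one.
class ContentsConcat {

    static byte[] concat(List<byte[]> streams) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] s : streams) {
            out.write(s, 0, s.length);
            // Add whitespace so that the last token of one stream and the
            // first token of the next cannot run together.
            out.write('\n');
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Stand-ins for two decoded content streams of the same page.
        List<byte[]> streams = Arrays.asList(
                "BT (Hello) Tj".getBytes(StandardCharsets.ISO_8859_1),
                "( world) Tj ET".getBytes(StandardCharsets.ISO_8859_1));
        System.out.println(new String(concat(streams), StandardCharsets.ISO_8859_1));
    }
}
```

In the loop over the PdfArray, each PdfReader.getStreamBytes result would be
collected and joined this way before being passed to PRTokeniser. iText's
PdfReader also appears to offer getPageContent(int pageNum), which returns the
combined content of a page; whether it is available should be checked against
the iText version in use.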

I ran this code with these files:
 - Generated with Acrobat Professional:
   http://www.nabble.com/file/p23723941/firmado2vecesOk.pdf
 - Generated with Ghostscript:
   http://www.nabble.com/file/p23723941/2274_2007_H_PROVISIONAL.pdf
 - Generated with MS Word:
   http://www.nabble.com/file/p23723941/Security%2BArchitecture.pdf
 - Generated with OpenOffice:
   http://www.nabble.com/file/p23723941/Prueba-para-Oscar.pdf

All the examples work "fine" except the OpenOffice one. I haven't tested
any of them with embedded images.

Could you please show me an example on how to do this? Could you at least
tell me what is going wrong?


Thank you very much in advance.



mkl wrote:
> 
> Oscar,
> 
> 
> OscarP wrote:
>> 
>> OK,
>> I have spent several days working on this, but I cannot find anything
>> out. How can I get those differences? I've analysed the binary of this
>> document, http://www.nabble.com/file/p23704652/textoF2IMod.pdf ,
>> but the object with the difference (70 0) returns null in iText
>> (reader.refObj[70]).
>> 
> 
> 70 0 contains a cross-reference stream. iText hides away the cross-reference
> streams it comes across when collecting cross-reference information by
> explicitly marking the matching entry in memory as a freed object ("if
> (thisStream < xref.length) xref[thisStream] = -1;" in
> PdfReader.readXRefStream).
> 
> (Actually 70 0 is the cross reference stream holding only the information
> about object 70 0...)
> 
> The rationale for this might be self-protection; usually you never
> tamper with any former cross-reference tables or streams. When trying to
> inspect a PDF in detail, this is a bit inconvenient, though.
> 
> 
> OscarP wrote:
>> 
>> To sum it all up, I need to know whether there are differences between
>> one signature and the other. I'd be very grateful if you could tell me
>> the way to get that result with iText.
>> 
> 
> Whether there are differences between the signatures? You refer to the
> signature containers or the whole signature dictionaries? Either way, they
> are directly available from the AcroFields, aren't they?
> 
> Regards,   Michael.
> 

-- 
View this message in context: 
http://www.nabble.com/How-can-compare-the-content-of-two-revision-tp23649348p23723941.html
Sent from the iText - General mailing list archive at Nabble.com.


_______________________________________________
iText-questions mailing list
iText-questions@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/itext-questions

