better understanding of what is what in pdfbox

José Rodolfo Carrijo de Freitas Fri, 22 Oct 2010 06:16:03 -0700

I'm trying to collect some notes on what I'm learning while working with
PDFBox.


I'm putting them here so you can correct me if I am wrong, and they might
help someone else.

 

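1)      The PDStream returned by page.getContents() holds the page's content
stream, that is, the operators that draw the page. Resources such as fonts
and images are referenced from it by name rather than stored in it.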

2)      A token in the list of tokens collected with the PDFStreamParser
represents one piece of data in a stream: either an operand or an operator.

3)      Removing a token from the list collected above and writing the
other tokens of the list to a PDStream can corrupt the stream.

4)      What appears as one object on the page is formed by more than one
token: each instruction is a run of operands followed by its operator.
(That is probably why point 3 holds.)

5)      A stream obtained with getUnfilteredStream() of an object gives the
decoded (for example, decompressed) data of that object. It does not carry
PDF information like coordinates and scale; those come from the operators in
the page's content stream.

6)      A stream obtained with getFilteredStream() of an object gives the
same data still encoded by the object's filters (for example, compressed),
as it is stored in the file. It does not add coordinates or scale either.
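To illustrate points 2 to 4, here is a minimal sketch in plain Java. It does
not call PDFBox at all: the class TokenGroupingSketch, its methods and the
small operator table are my own hypothetical names, and the tokenizer only
splits on whitespace, which a real content stream parser cannot do. It only
shows why dropping a single token can leave an operator with the wrong number
of operands, while dropping a whole instruction keeps the token list
consistent.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: hypothetical names, not part of PDFBox.
class TokenGroupingSketch {
    // Operand counts for a handful of content-stream operators (toy subset).
    static final Map<String, Integer> OPERAND_COUNT = new HashMap<>();
    static {
        OPERAND_COUNT.put("q", 0);   // save graphics state
        OPERAND_COUNT.put("Q", 0);   // restore graphics state
        OPERAND_COUNT.put("cm", 6);  // concatenate matrix (position and scale)
        OPERAND_COUNT.put("re", 4);  // rectangle: x y width height
        OPERAND_COUNT.put("f", 0);   // fill the current path
    }

    // Naive whitespace tokenizer; an assumption that only suits this toy input.
    static List<String> tokenize(String content) {
        return new ArrayList<>(Arrays.asList(content.trim().split("\\s+")));
    }

    // Groups tokens into instructions: operands first, operator last.
    static List<List<String>> group(List<String> tokens) {
        List<List<String>> instructions = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String token : tokens) {
            current.add(token);
            if (OPERAND_COUNT.containsKey(token)) {
                instructions.add(current);
                current = new ArrayList<>();
            }
        }
        if (!current.isEmpty()) {
            instructions.add(current); // dangling operands with no operator
        }
        return instructions;
    }

    // True if every instruction ends in a known operator with the right operand count.
    static boolean wellFormed(List<String> tokens) {
        for (List<String> instruction : group(tokens)) {
            String last = instruction.get(instruction.size() - 1);
            Integer expected = OPERAND_COUNT.get(last);
            if (expected == null || expected != instruction.size() - 1) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> tokens = tokenize("q 1 0 0 1 50 50 cm 0 0 100 20 re f Q");

        List<String> oneTokenRemoved = new ArrayList<>(tokens);
        oneTokenRemoved.remove(1); // drops one operand of "cm"

        List<String> instructionsRemoved = new ArrayList<>(tokens);
        instructionsRemoved.subList(8, 14).clear(); // drops "0 0 100 20 re" and "f"

        System.out.println("original:             " + wellFormed(tokens));
        System.out.println("one token removed:    " + wellFormed(oneTokenRemoved));
        System.out.println("instructions removed: " + wellFormed(instructionsRemoved));
    }
}
```

So when filtering the token list, it seems safer to remove an operator
together with all of its operands, never a single token on its own.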

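On points 5 and 6, as far as I understand it the difference between the two
calls is only whether the stream's filters have been applied, not whether
coordinates are included. The sketch below uses only java.util.zip and
assumes the stream uses a single Flate (zlib/deflate) filter. The class
FilteredVsUnfilteredSketch and its encode/decode helpers are hypothetical
stand-ins, not PDFBox calls: "encoded" plays the role of the filtered bytes
and "decoded" the role of the unfiltered ones.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Toy model only: hypothetical names, not part of PDFBox.
class FilteredVsUnfilteredSketch {
    // Compresses bytes with zlib/deflate, standing in for the filtered form.
    static byte[] encode(byte[] raw) {
        Deflater deflater = new Deflater();
        deflater.setInput(raw);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[256];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Decompresses zlib/deflate bytes, standing in for the unfiltered form.
    static byte[] decode(byte[] encoded) {
        Inflater inflater = new Inflater();
        inflater.setInput(encoded);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[256];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    break; // truncated or unsupported input; stop instead of looping
                }
                out.write(buffer, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("not valid zlib data", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] raw = "q 1 0 0 1 50 50 cm 0 0 100 20 re f Q"
                .getBytes(StandardCharsets.US_ASCII);
        byte[] encoded = encode(raw);     // what is stored in the file
        byte[] decoded = decode(encoded); // what a parser needs to read

        System.out.println("encoded equals raw: " + Arrays.equals(encoded, raw));
        System.out.println("decoded equals raw: " + Arrays.equals(decoded, raw));
    }
}
```

Both forms describe the same content; only the decoded form is readable as
operators and operands.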