Hello,

I've been using PDFBox for quite some time now, and I am very happy with
the flexibility and functionality it gives me for processing PDF documents.

Recently I decided to give back to the community, and in the process I am
trying to reverse engineer the library in order to understand how the flow
works. One thing I am stuck on is how and when the TextPosition objects are
created and appended to the charactersByArticle array. I can see that the
array is simply checked for content and iterated over in the writePage()
<https://github.com/apache/pdfbox/blob/trunk/pdfbox/src/main/java/org/apache/pdfbox/text/PDFTextStripper.java#L475>
function in the PDFTextStripper class, but I have been unable to figure out
how and when this array is populated with character values.

If someone could briefly explain how this flow works, it would be very
helpful.
