Hello, I have some experience developing PDF tools. I am the original author and maintainer of PDFresurrect (which needs a serious rewrite), and I am working on a new tool. All of the PDF parsing routines in PDFresurrect were homegrown, so that project had no external dependencies. I think it is awesome that GNU PDF aims to be the "go-to" library for PDF creation and parsing!
I have a question that has really been troubling me (maybe it is more appropriate for a PS-related list). I am decoding a PS stream that contains no explicit space characters; instead, all of the spacing is done by moving the text cursor. The amount the position moves varies greatly, and I am having a hard time deducing spaces from these values. Section 9.4 (Text Objects) of the PDF 32000 spec covers text spacing and helps somewhat, but I feel like my code is just guessing where a space is. I compare the last character position with the current one against a hard-coded threshold, and if the distance is large enough, I assume it is a space. I don't like that solution, since I do not know how large to make the threshold. Any tips/help would be appreciated. Thanks! -Matt
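
For readers following along, here is a minimal sketch of the kind of hard-coded threshold heuristic described above. It is not code from PDFresurrect or GNU PDF. The `Glyph` record, the `join_with_spaces` function, and the `SPACE_THRESHOLD` value are all hypothetical, and the sketch assumes each glyph's start position and advance width are already known in the same units and that all glyphs sit on one line.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    """Hypothetical record for one decoded character on a single text line."""
    char: str      # decoded character
    x: float       # horizontal start position of the glyph
    width: float   # advance width, in the same units as x


# Hard-coded guess; choosing this value well is exactly the open question.
SPACE_THRESHOLD = 2.0


def join_with_spaces(glyphs, threshold=SPACE_THRESHOLD):
    """Rebuild a string, inserting a space wherever the horizontal gap
    between the end of the previous glyph and the start of the current
    one exceeds the threshold."""
    out = []
    prev = None
    for g in glyphs:
        if prev is not None:
            gap = g.x - (prev.x + prev.width)
            if gap > threshold:
                out.append(" ")
        out.append(g.char)
        prev = g
    return "".join(out)


# Made-up positions: only the jump before "y" is wider than the threshold.
glyphs = [
    Glyph("H", 0.0, 7.0),
    Glyph("i", 7.2, 3.0),
    Glyph("y", 14.0, 5.0),
    Glyph("o", 19.1, 5.0),
]
print(join_with_spaces(glyphs))
```

The weakness is visible in the sketch: the result depends entirely on `threshold`, and a single fixed value is unlikely to suit every font size.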