Re: [pdf-devel] Re: Modifications on pdf_token_read to get token boundaries

David Vazquez Mon, 25 May 2009 18:18:16 -0700

Michael Gold <[email protected]> writes:

> On Tue, May 26, 2009 at 00:53:46 +0200, [email protected] wrote:
> ...
>> We need to be able to determine the boundaries of a token that has
>> been read, for error reporting. We cannot rely on the stm used by the
>> token reader to determine the beginning position of a read token,
>> since it is skipping white characters.
>
> This behaviour could be changed by
>  - adding a flag that causes token_read to return whitespace as a token;
>    or,
>  - adding a function/flag to advance to the beginning of the next token
>


I think it would be preferable for this not to be the job of the
tokeniser module.

Adding a new function for this seems fine to me. However, we could
simplify the API: if the `pdf_token_reader_new' function consumed
characters up to the first token, and `pdf_token_read' did the same
thing after reading each token, then we could assume that the next
token always starts at the current position of the stream. We could
then use the `pdf_stm_tell' function to get the beginning of each
token.
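
To illustrate the idea, here is a minimal sketch in C. It does not use
the real library types: `toy_stm_t', `toy_stm_tell',
`toy_token_reader_new' and `toy_token_read' are hypothetical stand-ins
I made up for the example, and tokens are simply runs of
non-whitespace characters (no PDF delimiters or comments). It only
shows that, if the reader skips whitespace on creation and after each
token, the stream position before each read is the token's start.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory stream; a stand-in for the library's stm. */
typedef struct
{
  const char *buf;
  size_t len;
  size_t pos;
} toy_stm_t;

/* Hypothetical analogue of `pdf_stm_tell': current read offset. */
static size_t
toy_stm_tell (const toy_stm_t *stm)
{
  return stm->pos;
}

/* Consume whitespace so the stream rests on the first byte of the
   next token, or at end of input. */
static void
toy_skip_white (toy_stm_t *stm)
{
  while (stm->pos < stm->len
         && isspace ((unsigned char) stm->buf[stm->pos]))
    stm->pos++;
}

typedef struct
{
  toy_stm_t *stm;
} toy_token_reader_t;

/* Analogue of `pdf_token_reader_new' under the proposed scheme:
   leading whitespace is skipped when the reader is created. */
static toy_token_reader_t
toy_token_reader_new (toy_stm_t *stm)
{
  toy_token_reader_t reader = { stm };
  toy_skip_white (stm);
  return reader;
}

/* Analogue of `pdf_token_read': copy one whitespace-delimited token
   into OUT (truncated to fit, always NUL-terminated), then skip the
   whitespace that follows it.  Returns 1 on success, 0 at end of
   input. */
static int
toy_token_read (toy_token_reader_t *reader, char *out, size_t outsz)
{
  toy_stm_t *stm = reader->stm;
  size_t n = 0;

  assert (outsz > 0);
  if (stm->pos >= stm->len)
    return 0;

  while (stm->pos < stm->len
         && !isspace ((unsigned char) stm->buf[stm->pos]))
    {
      if (n + 1 < outsz)
        out[n++] = stm->buf[stm->pos];
      stm->pos++;
    }
  out[n] = '\0';

  toy_skip_white (stm);   /* the next token now starts at stm->pos */
  return 1;
}
```

A caller would record `toy_stm_tell (&stm)' just before each call to
`toy_token_read' and use that value as the token's beginning offset
for error reporting.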

Finally, I think we will not need the ending offset of a token.
