Re: [Scheme-reports] Is bytevector a token?

Takashi Kato Tue, 29 Oct 2013 01:54:25 -0700

I think it's impossible to put <bytevector> into <token> as long as a 
<bytevector> contains the closing ')'. (Again, correct me if I'm wrong.)
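
To make the lexer/parser split concrete, here is a minimal sketch in Python. 
It is only an illustration, not taken from R7RS or from any implementation: 
the names tokenize and parse_bytevector are hypothetical, and the token 
pattern covers just '#u8(', '(', ')' and decimal integers.

```python
import re

# Simplified, hypothetical token pattern: '#u8(' and ')' are separate tokens,
# so the lexer never has to look inside a compound datum.
TOKEN_RE = re.compile(r"\s*(#u8\(|\(|\)|[0-9]+)")


def tokenize(text):
    """Lexer side: return a flat list of token strings."""
    text = text.rstrip()
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected input at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse_bytevector(tokens):
    """Parser side: build the compound datum from the flat token stream."""
    if not tokens or tokens[0] != "#u8(":
        raise SyntaxError("expected '#u8('")
    values = []
    for i in range(1, len(tokens)):
        tok = tokens[i]
        if tok == ")":
            if i != len(tokens) - 1:
                raise SyntaxError("unexpected tokens after ')'")
            return bytes(values)
        if not tok.isdigit() or int(tok) > 255:
            raise SyntaxError(f"not a byte: {tok}")
        values.append(int(tok))
    raise SyntaxError("missing ')'")
```

With this split, the lexer only announces that a bytevector is starting, and 
the closing ')' is consumed by the parser as an ordinary token.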



> On the other hand the rule <bytevector> is placed under section 7.1.1
> (Lexical structure), not under 7.1.2 (External representations). I
> think this is not related to the complexity of the lexer implementation.
Yes, you're right.



On Tuesday, 29 October 2013, 9:23, Yuichi Nishiwaki wrote:
Hi,

> As I understand it (correct me if I'm wrong), <token> is meant to be the 
> smallest unit that a lexer reads, and <bytevector> is a compound datum that 
> a parser needs to construct. Suppose your reader gets #u8(1 2 3) as its 
> input; then first your lexer needs to return a token to let your parser know 
> what type of datum needs to be constructed.

Even in that case, the lexer still is capable of scanning bytevectors,
though the implementation will be much more complex than otherwise.
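
As a rough illustration of that extra complexity, here is a minimal Python 
sketch of a lexer routine that scans a whole bytevector as a single token. 
The name scan_bytevector_token is hypothetical, and the sketch assumes 
decimal bytes only and treats just whitespace and ';' line comments as 
intertoken space; other forms of intertoken space are left out.

```python
def scan_bytevector_token(text, pos=0):
    """Scan one whole '#u8( ... )' starting at pos.

    Returns (value, end_pos), where value is a bytes object and end_pos is
    the index just past the closing ')'.
    """
    if not text.startswith("#u8(", pos):
        raise SyntaxError("expected '#u8('")
    pos += 4
    values = []
    while pos < len(text):
        ch = text[pos]
        if ch.isspace():
            pos += 1
        elif ch == ";":
            # Skip a line comment up to (not including) the newline.
            nl = text.find("\n", pos)
            pos = len(text) if nl == -1 else nl
        elif ch == ")":
            return bytes(values), pos + 1
        elif ch in "0123456789":
            start = pos
            while pos < len(text) and text[pos] in "0123456789":
                pos += 1
            n = int(text[start:pos])
            if n > 255:
                raise SyntaxError(f"not a byte: {n}")
            values.append(n)
        else:
            raise SyntaxError(f"unexpected character {ch!r} at position {pos}")
    raise SyntaxError("unterminated bytevector")
```

Note that the routine has to duplicate work the lexer already does between 
ordinary tokens (skipping whitespace and comments, reading numbers), which 
is where the added complexity comes from.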

> Now, first the lexer returns #u8(, and then the parser understands this is 
> a bytevector. However, if <bytevector> were in <token>, your lexer would 
> need to read the whole input as a bytevector. Inside the bytevector there 
> are delimiters and tokens, so the lexer would need to read your input 
> recursively. I'm not good with theory, but something seems wrong.

I'm really wondering which of '#u8(' and <bytevector> is a token.
Going by the definition of the <token> rule, '#u8(' is a token. On
the other hand the rule <bytevector> is placed under section 7.1.1
(Lexical structure), not under 7.1.2 (External representations). I
think this is not related to the complexity of the lexer
implementation.

-- Yuichi Nishiwaki



2013/10/29 Takashi Kato:
> I think it's on purpose.
>
> As I understand it (correct me if I'm wrong), <token> is meant to be the 
> smallest unit that a lexer reads, and <bytevector> is a compound datum that 
> a parser needs to construct. Suppose your reader gets #u8(1 2 3) as its 
> input; then first your lexer needs to return a token to let your parser know 
> what type of datum needs to be constructed. Now, first the lexer returns 
> #u8(, and then the parser understands this is a bytevector. However, if 
> <bytevector> were in <token>, your lexer would need to read the whole input 
> as a bytevector. Inside the bytevector there are delimiters and tokens, so 
> the lexer would need to read your input recursively. I'm not good with 
> theory, but something seems wrong.
>
>
> Hope it can help you.
>
> _/_/
> Takashi Kato
> E-mail: [email protected]
>
>
>
>
> On Tuesday, 29 October 2013, 5:48, Yuichi Nishiwaki wrote:
> Hi, all. I'm very excited to see the final R7RS draft published. Thank
> you all for the great work.
> Reading the final draft, I have one question about the formal syntax
> definition (7.1.1). <bytevector> is not listed in the <token> line. Is it
> on purpose, or just an omission?
>
> -- Yuichi Nishiwaki
>
> _______________________________________________
> Scheme-reports mailing list
> [email protected]
> http://lists.scheme-reports.org/cgi-bin/mailman/listinfo/scheme-reports

