It's great that Michael has detected the problem and even solved it. We've
also been testing some simple Chinese templates and didn't notice it.

Just for motivation, Geir: once these problems are solved, no other tool
comes even close to Velocity for multilingual web application development.
So it would be cool to have this fix already in Velocity 1.1/Turbine 2.1.
I don't mean to put any pressure on you; reliability always comes first.

But it would be cool :-)

-- Ilkka


----- Original Message -----
From: Geir Magnusson Jr. <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Wednesday, May 23, 2001 05:51
Subject: Re: Velocity v1.1-rc1 released


> If you read a message or two ahead, yes, I see why that doesn't work.
>
> I will stare at this a bit more so I can feel confident I understand it
> end to end, but yes, there is a big problem here because of the byte
> masking, and your work on the solution really helps.  I understand why
> your example inputs have the problem, and further understand that we
> didn't see this sooner because of sheer chance, although I want to look
> a little deeper.
>
> In the end, I think we will just make, as you suggest, a
> VelocityCharStream to keep the confusion to a bare minimum.  I don't
> like the idea of moving and renaming - that will just be too confusing
> down the road.
>
> I suspect this will be in 1.2 rather than 1.1, unless things turn out to
> be lucid and clean - I think we want to beat up any major changes before
> declaring production ready.
>
> I'll make a huge test template to make sure nothing slips by.
>
> geir
>
>
> Michael Zhou wrote:
> >
> > > Geir Magnusson Jr. wrote:
> > >
> > >
> > > Re the question about why to hack, I think I see why.
> > >
> > Yes, I have to hack the ASCII_CharStream instead of the generated
> > UCode_CharStream, because UCode_CharStream combines every 2 characters
> > into 1 character (see UCode_CharStream.ReadChar()), while
> > ASCII_CharStream masks the higher byte of every character (see
> > ASCII_CharStream.readChar()).  So no existing class can do the work.
> > A much more graceful solution is to set the option
> > USER_CHAR_STREAM=true in the Parser.jjt file and write a
> > VelocityCharStream.java that implements the generated CharStream.java
> > interface.  This requires modifying the constructor of Parser so that
> > it can instantiate the user-defined CharStream.
> >
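The core of the proposed VelocityCharStream, reading characters without masking or pairing them, might look roughly like the sketch below. This is only an illustration under assumptions: MinimalCharStream is a hypothetical stand-in for the generated CharStream interface (which declares more methods, such as line/column tracking and backup), and UnmaskedCharStream is an invented name, not the class that ended up in Velocity.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical stand-in for the generated CharStream interface; the real
// interface declares more methods than this sketch needs.
interface MinimalCharStream {
    char readChar() throws IOException;
}

// Returns each char exactly as the Reader delivers it: no high-byte masking
// and no combining of two chars into one.
class UnmaskedCharStream implements MinimalCharStream {
    private final Reader in;

    UnmaskedCharStream(Reader in) {
        this.in = in;
    }

    @Override
    public char readChar() throws IOException {
        int c = in.read();
        if (c < 0) {
            // This sketch signals end of input with an IOException.
            throw new IOException("end of input");
        }
        return (char) c;
    }
}

class UnmaskedCharStreamDemo {
    public static void main(String[] args) throws IOException {
        MinimalCharStream s = new UnmaskedCharStream(new StringReader("\u4e2d-"));
        // The CJK character and the ASCII hyphen stay distinct.
        System.out.println(s.readChar() == '\u4e2d');
        System.out.println(s.readChar() == '-');
    }
}
```

With USER_CHAR_STREAM=true, the parser would then be constructed with an instance of such a class instead of one of the generated stream classes.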
> > > Heh.  I wasn't arguing that there wasn't a problem - just inquiring
> > > about what was going on.
> > >
> > > I would have guessed that the higher byte came into play with the
> > > testcase too.
> > No, there IS a problem.  Velocity (JavaCC) ignores the higher byte of
> > every Unicode character, so it considers U+4E0D a match for "\r",
> > U+4E2D a match for "-", etc.  When the token manager returns the
> > token, it gets the characters directly from the buffer (in which the
> > higher byte of each character hasn't been masked), so in most cases
> > there seems to be no problem.
> >
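The collision Michael describes can be reproduced in a few lines. The sketch below is not the JavaCC code itself; maskHighByte is a hypothetical helper that assumes the masking amounts to keeping only the low 8 bits of each char. Under that assumption U+4E0D collapses to 0x0D (a carriage return, i.e. a line-break character) and U+4E2D collapses to 0x2D (the hyphen).

```java
class HighByteMaskDemo {
    // Hypothetical helper: keep only the low 8 bits of a char, as a
    // byte-masking char stream would.
    static char maskHighByte(char c) {
        return (char) (c & 0xFF);
    }

    public static void main(String[] args) {
        char[] samples = { '\u4e0d', '\u4e2d' };
        for (char c : samples) {
            char masked = maskHighByte(c);
            // Prints the original code point and the masked value in hex.
            System.out.printf("U+%04X -> 0x%02X%n", (int) c, (int) masked);
        }
    }
}
```

Any CJK character whose low byte happens to equal a character that is significant to the grammar can be mistaken for it during token matching, which would explain why only some templates trigger the bug.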
> > Did you test the encodingtest.vm I attached to my last mail?  It may
> > cause a parsing error precisely because the parser masks the higher byte!
> >
> > Looking forward to a future Velocity that corrects this.
> >
> > Best regards,
> > Michael Zhou
>
> --
> Geir Magnusson Jr.                           [EMAIL PROTECTED]
> System and Software Consulting
> Developing for the web?  See http://jakarta.apache.org/velocity/
> "still climbing up to the shoulders..."

