B2CConverter.java

Filip Hanik - Dev Lists Sun, 26 Aug 2007 08:13:46 -0700

Bill Barker wrote:

"Filip Hanik - Dev Lists" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
Bill Barker wrote:
"Filip Hanik - Dev Lists" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
Bill Barker wrote:
"Filip Hanik - Dev Lists" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
Bill Barker wrote:
"Remy Maucherat" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
Filip Hanik - Dev Lists wrote:
Test Case and 5.5.x patch can be found here.
http://people.apache.org/~fhanik/tomcat/b2c/

This is what is happening

int cnt=conv.read( result, 0, BUFFER_SIZE );
is called inside a "while (true)" loop.
When IntermediateInputStream.read returns -1, the above statement returns cnt==1. So to avoid calling conv.read, we must check whether we have more bytes to read by implementing the available() method, so that the input stream never returns -1.
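A minimal sketch of the guard described above, not the actual patch: ChunkStream is a hypothetical stand-in for IntermediateInputStream, and the 8192-char buffer stands in for BUFFER_SIZE. The loop asks available() before every conv.read, so the stream is never asked for bytes it does not have, and an incomplete multi-byte sequence at the end of one chunk is completed by the next chunk. A chunk that holds nothing but a partial sequence is not covered by this sketch.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class AvailableGuardDemo {

    /** Hypothetical stand-in for IntermediateInputStream: one refillable chunk of bytes. */
    static final class ChunkStream extends InputStream {
        private byte[] buf = new byte[0];
        private int pos;

        void setChunk(byte[] chunk) {
            buf = chunk;
            pos = 0;
        }

        @Override
        public int read() {
            return pos < buf.length ? (buf[pos++] & 0xff) : -1;
        }

        @Override
        public int read(byte[] b, int off, int len) {
            if (len == 0) return 0;
            if (pos >= buf.length) return -1;
            int n = Math.min(len, buf.length - pos);
            System.arraycopy(buf, pos, b, off, n);
            pos += n;
            return n;
        }

        @Override
        public int available() {
            // Number of bytes still unread in the current chunk.
            return buf.length - pos;
        }
    }

    /** Feeds the chunks one at a time and only calls conv.read while bytes are available. */
    static String convert(byte[][] chunks) {
        ChunkStream in = new ChunkStream();
        InputStreamReader conv = new InputStreamReader(in, StandardCharsets.UTF_8);
        char[] result = new char[8192]; // assumed BUFFER_SIZE
        StringBuilder out = new StringBuilder();
        try {
            for (byte[] chunk : chunks) {
                in.setChunk(chunk);
                while (in.available() > 0) {
                    int cnt = conv.read(result, 0, result.length);
                    if (cnt <= 0) break;
                    out.append(result, 0, cnt);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "a", e-acute, "b" in UTF-8, with the two-byte e-acute split across the chunks.
        byte[][] chunks = { { 0x61, (byte) 0xC3 }, { (byte) 0xA9, 0x62 } };
        System.out.println(convert(chunks).equals("a\u00e9b"));
    }
}
```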
It's possible, but I have a hard time understanding the issue.
The issue is that InputStreamReader reads 8192 bytes from IntermediateInputStream on the first go. It then translates them into 2734 chars, but thinks that the last few bytes represent an incomplete char, so holds onto them. On the next call, IntermediateInputStream returns -1, so InputStreamReader outputs the last char as best it can (resulting in returning 1). Then the IntermediateInputStream buffer is reset, and it can continue on reading (but from the wrong position, resulting in corruption).
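The effect can be reproduced outside Tomcat with a small sketch. It assumes each buffer fill is decoded by its own InputStreamReader that sees end-of-stream at the end of the chunk, which is a simplification of what B2CConverter does; PrematureEofDemo and its method names are made up for illustration. When a UTF-8 sequence straddles the chunk boundary, each half is decoded "as best it can" and comes out as U+FFFD.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class PrematureEofDemo {

    /** Decodes one chunk to the end, so the reader sees -1 right after the chunk's last byte. */
    static String decodeChunk(byte[] chunk) {
        StringBuilder out = new StringBuilder();
        char[] result = new char[8192];
        try (Reader conv = new InputStreamReader(
                new ByteArrayInputStream(chunk), StandardCharsets.UTF_8)) {
            int cnt;
            while ((cnt = conv.read(result, 0, result.length)) != -1) {
                out.append(result, 0, cnt);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    /** Splits the input at splitAt and decodes the two halves independently. */
    static String decodeSplit(byte[] all, int splitAt) {
        byte[] head = Arrays.copyOfRange(all, 0, splitAt);
        byte[] tail = Arrays.copyOfRange(all, splitAt, all.length);
        return decodeChunk(head) + decodeChunk(tail);
    }

    public static void main(String[] args) {
        byte[] bytes = "a\u00e9".getBytes(StandardCharsets.UTF_8); // 0x61 0xC3 0xA9
        // Split on a character boundary: decodes cleanly.
        System.out.println(decodeSplit(bytes, 3).equals("a\u00e9"));
        // Split inside the two-byte sequence: both halves become U+FFFD.
        System.out.println(decodeSplit(bytes, 2).equals("a\uFFFD\uFFFD"));
    }
}
```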
Filip's patch is inelegant (better would be to use the ByteChunk sink), but other than my looking for a better way to do it, I can't come up with the required technical reason to porting the base of it to 5.5 (of course, I could care less what he does in his sandbox :).
I've committed the fix to 5.5, if you find a more elegant way of solving the actual problem, feel free to revert it and commit another fix. I don't care about the how, as long as there is a fix that will be included in the tag 5.5.25 on Friday.
No problem. I can see how to do this better, but I'll wait until the weekend to commit (since it's not totally trivial, I don't want a one-day window for regression testing :). That way 5.5.25 can go out with your patch. It doesn't include the NIO dependency (which was my only concern), so it works well enough for me for now.
according to the KISS principle, your fix would have to be less than 4 lines changed to be "more elegant" :)
Yes, it is more than 4 lines, but most of them are deletes :). I've done it already on my local machine here, in case anybody wants RTC on the 5.5.x branch (and Filip's test case passes with flying colors :). I'm pretty much sure that there are no regressions for 5.5.x+, but I still need to look at 3.3.x, and 4.1.x.
If anyone is interested, I can post the patch files. Otherwise, I'll assume that CTR is still in place, and you can veto it when I commit over the w/e ;). Of course, if this message was meant as a pre-emptive veto, then I won't bother.
it's your choice if you want to commit it before or after the tag today.
If you wanna commit it before, then we are counting on your vote :)
I've noticed a problem with using Reader.mark with multi-byte charsets (we have a hack in place that works for single-byte charsets). I could just commit what I've got here (which should be no worse than before :), but I'd like to solve this once and for all first.
Using Filip's example servlet, if you modify it to do:
            reader = request.getReader();
+            reader.mark(5);   // content length + terminator
            while (true) {
                int c = reader.read();
                if (c == -1 || c == '/')
                    break;
                buf.append((char)c);
            }
+           reader.reset(); // throws IOException here
With the current code (and what I have), the first call to reader.read requests 8192 chars, and produces 2734 chars. The current code then results in throwing away the last 2729 chars and abandoning the mark. The best I've got until now preserves the 2729 chars, but still throws away the mark, and hence still throws an IOE when reset is called.
Long story short, I'm not now sure that I can promise to commit a fix this weekend :(.
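For comparison, here is a sketch of the mark/reset behaviour the modified servlet expects, using the JDK's own BufferedReader over an InputStreamReader rather than Tomcat's request reader. MarkResetDemo and the sample body are assumptions for illustration. Because the mark is kept on the decoded chars, a multi-byte charset does not disturb it as long as no more than the read-ahead limit is consumed before reset().

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class MarkResetDemo {

    /** Same loop as in the servlet above: read chars until '/' or end of stream. */
    static String readUntilSlash(BufferedReader reader) throws IOException {
        StringBuilder buf = new StringBuilder();
        while (true) {
            int c = reader.read();
            if (c == -1 || c == '/') break;
            buf.append((char) c);
        }
        return buf.toString();
    }

    /** Marks, reads up to the terminator, resets, and reads the same content again. */
    static String readTwice(String body, int readAheadLimit) {
        byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new ByteArrayInputStream(bytes), StandardCharsets.UTF_8))) {
            reader.mark(readAheadLimit);          // content length + terminator
            String first = readUntilSlash(reader);
            reader.reset();                       // back to the marked position
            String second = readUntilSlash(reader);
            return first + "|" + second;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Four two-byte UTF-8 chars followed by the '/' terminator: 5 chars, 9 bytes.
        String body = "\u00e9\u00e8\u00ea\u00eb/";
        System.out.println(readTwice(body, 5).equals("\u00e9\u00e8\u00ea\u00eb|\u00e9\u00e8\u00ea\u00eb"));
    }
}
```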

no worries, since UTF-8 can be anywhere between 1 and 6 bytes, wouldn't it just be easier to do

boolean markSupported() { return false; }
and not worry about parsing the bytes correctly during a mark?
the problem you are describing goes beyond that though, so take your time.
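A rough sketch of what that suggestion could look like, assuming a wrapper around whatever Reader the request hands out. NoMarkReader and the helper methods are hypothetical names, not Tomcat classes. Callers that check markSupported() first would then fall back to their own buffering instead of hitting the IOException on reset().

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

final class NoMarkReader extends FilterReader {

    NoMarkReader(Reader in) {
        super(in);
    }

    @Override
    public boolean markSupported() {
        // Advertise that mark/reset is unavailable, whatever the wrapped reader says.
        return false;
    }

    @Override
    public void mark(int readAheadLimit) throws IOException {
        throw new IOException("mark() not supported");
    }

    @Override
    public void reset() throws IOException {
        throw new IOException("reset() not supported");
    }

    /** Helper for illustration: reports whether mark() is rejected by the given reader. */
    static boolean markIsRejected(Reader reader) {
        try {
            reader.mark(5);
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    /** Helper for illustration: reads one char through the wrapper, or -2 on I/O failure. */
    static int firstChar(String text) {
        try (Reader reader = new NoMarkReader(new StringReader(text))) {
            return reader.read();
        } catch (IOException e) {
            return -2;
        }
    }

    public static void main(String[] args) {
        Reader reader = new NoMarkReader(new StringReader("abc"));
        System.out.println(reader.markSupported());   // false
        System.out.println(markIsRejected(reader));   // true
    }
}
```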

Filip


Re: Bug in B2C converter WAS: svn commit: r568307 - /tomcat/trunk/java/org/apache/tomcat/util/buf/B2CConverter.java
