Hi Eric,

I simply removed the AFTER_READ_STATUS_CODE state, and it seems to work fine. :) Please let me know if my fix works.
Thanks!
Trustin

On Jan 5, 2008 5:29 AM, Eric Gaumer <[EMAIL PROTECTED]> wrote:
> Hello All,
>
> I ran across another interesting problem today in handling status codes
> inside of:
>
> src/main/java/org/apache/mina/filter/codec/http/HttpResponseLineDecodingState.java
>
> This is a mina-2.0 snapshot.
>
> I'm building an RSS client that needs to poll thousands of RSS channels. I'm
> using conditional GETs so as not to waste bandwidth.
>
> I noticed a particular site was raising a "Bad Status Code" exception
> whenever the server responded with a 304 (Not Modified).
>
> It seems that the server (Apache version ??) was sending back a status code
> without the Reason Phrase.
>
> I found the BNF notation that describes the Reason Phrase in RFC2616:
>
> http://www.w3.org/Protocols/HTTP/1.1/rfc2616bis/issues/#i94
>
> Which states:
>
> TEXT = <any OCTET except CTLs, but including LWS>
> LWS = [CRLF] 1*( SP | HT )
> CRLF = CR LF
> Reason-Phrase = *<TEXT, excluding CR, LF>
>
> This means that a Reason Phrase can be empty and still be considered valid
> (i.e., zero or more octets).
>
> The state machine expects to see a Reason Phrase, and when it doesn't, it
> consumes part of the next header (Date) and then throws an exception trying
> to convert this value to an Integer.
>
> What I did was override the isTerminator() method for the
> ConsumeToLinearWhitespaceDecodingState, adding a check for a CR.
>
> This stops the scanner from pulling in excess bytes. I wasn't sure how the
> remaining states would handle this, but AFTER_READ_STATUS_CODE returns
> immediately, as does READ_REASON_PHRASE (since we left a remaining LF byte on
> the input buffer), and we cleanly move to a final acceptance state.
>
> I've run about 500 feeds through this and nothing seems to have broken.
>
> Here's a patch:
>
> --- HttpResponseLineDecodingState.orig.java     2008-01-04 14:29:25.000000000 -0500
> +++ HttpResponseLineDecodingState.java  2008-01-04 14:28:40.000000000 -0500
> @@ -80,6 +80,10 @@
>              }
>              return AFTER_READ_STATUS_CODE;
>          }
> +        @Override
> +        protected boolean isTerminator(byte b) {
> +            return b == 32 || b == 9 || b == 13;
> +        }
>      };
>
>      private final DecodingState AFTER_READ_STATUS_CODE = new LinearWhitespaceSkippingState() {
>
>
> This _should_ be safe since the response line has to be terminated with a
> CR/LF pair. As long as we leave the LF byte, it's enough to satisfy the
> state requirements for the trailing states.
>
> You can use this site for testing. If you set the ETag you should get a 304
> with no Reason Phrase.
>
> http://www.mattweber.org/feed/
>
> Thanks,
> -Eric
>
> P.S. Did I mention how much fun I've been having with mina? Love it!
>
>

--
what we call human nature is actually human habit
--
http://gleamynode.net/
--
PGP Key ID: 0x0255ECA6
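For anyone following the thread without a MINA checkout, here is a rough standalone sketch of the behaviour Eric describes. It does not use MINA's DecodingState classes at all. The class name, the readStatusCode() helper, the stopAtCr flag, and the sample Date header value are all made up for illustration. Only the byte test inside isTerminator() is taken from the patch.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical standalone sketch; not part of MINA.
public class StatusCodeScanSketch {

    // With stopAtCr == true this is the same byte test as the patch:
    // SP (32), HT (9), or CR (13) ends the status code token.
    // With stopAtCr == false only SP and HT end the token.
    static boolean isTerminator(byte b, boolean stopAtCr) {
        return b == 32 || b == 9 || (stopAtCr && b == 13);
    }

    // Reads the status code that follows "HTTP-Version SP" in a response.
    static int readStatusCode(byte[] in, boolean stopAtCr) {
        int pos = 0;
        // Skip the HTTP-Version token up to the first space.
        while (pos < in.length && in[pos] != 32) {
            pos++;
        }
        if (pos >= in.length) {
            throw new IllegalArgumentException("no space after HTTP-Version");
        }
        pos++; // step over the SP
        int start = pos;
        // Consume bytes until a terminator (or end of input) is reached.
        while (pos < in.length && !isTerminator(in[pos], stopAtCr)) {
            pos++;
        }
        String token = new String(in, start, pos - start, StandardCharsets.US_ASCII);
        return Integer.parseInt(token); // NumberFormatException on a bad token
    }

    public static void main(String[] args) {
        // Made-up sample responses; the second has no Reason Phrase.
        byte[] withPhrase = "HTTP/1.1 304 Not Modified\r\nDate: Sat, 05 Jan 2008 10:29:00 GMT\r\n"
                .getBytes(StandardCharsets.US_ASCII);
        byte[] bare = "HTTP/1.1 304\r\nDate: Sat, 05 Jan 2008 10:29:00 GMT\r\n"
                .getBytes(StandardCharsets.US_ASCII);

        System.out.println(readStatusCode(withPhrase, false)); // 304
        System.out.println(readStatusCode(bare, true));        // 304
        try {
            readStatusCode(bare, false);
        } catch (NumberFormatException e) {
            // The token ran on into the start of the Date header.
            System.out.println("bad status code");
        }
    }
}
```

Without the CR check, the token for the bare response runs past the CR/LF up to the space after "Date:", which is why the integer conversion fails. With the CR check, scanning stops right after "304" and leaves the LF on the input, as described above.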
