Hi Eric,

I simply removed the AFTER_READ_STATUS_CODE state and it seems to work fine. :)
Please let me know if my fix works.

Thanks!
Trustin

On Jan 5, 2008 5:29 AM, Eric Gaumer <[EMAIL PROTECTED]> wrote:
> Hello All,
>
> I ran across another interesting problem today in handling status codes
> inside of:
>
> src/main/java/org/apache/mina/filter/codec/http/HttpResponseLineDecodingState.java
>
> This is mina-2.0 snapshot.
>
> I'm building an RSS client that needs to poll thousands of RSS channels. I'm
> using conditional GETs so as not to waste bandwidth.
>
> I noticed a particular site was raising a "Bad Status Code" exception
> whenever the server responded with a 304 (Not Modified).
>
> It seems that the server (Apache version ??) was sending back a status code
> without the Reason Phrase.
>
> I found the BNF notation that describes the Reason Phrase in RFC2616:
>
> http://www.w3.org/Protocols/HTTP/1.1/rfc2616bis/issues/#i94
>
> Which states:
>
> TEXT           = <any OCTET except CTLs, but including LWS>
> LWS            = [CRLF] 1*( SP | HT )
> CRLF           = CR LF
> Reason-Phrase  = *<TEXT, excluding CR, LF>
>
> This means that a Reason Phrase could be empty and still be considered valid
> (i.e., zero or more octets).
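
To make the grammar point concrete, here is a minimal standalone sketch (not MINA code) of a status-line check that treats the Reason Phrase as optional. The class name StatusLineSketch, the parse() helper, and the regular expression are all made up for illustration. The pattern is deliberately lenient: it also accepts a line with no space after the status code, which appears to be what the server in question sent.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; it is not part of MINA.
final class StatusLineSketch {
    // Reason-Phrase = *<TEXT, excluding CR, LF>, so zero characters is allowed.
    // Assumption: we also tolerate a missing SP after the status code.
    private static final Pattern STATUS_LINE =
            Pattern.compile("HTTP/(\\d+)\\.(\\d+) (\\d{3})(?: ([^\\r\\n]*))?");

    /** Returns { statusCode, reasonPhrase }; reasonPhrase may be empty. */
    static String[] parse(String line) {
        Matcher m = STATUS_LINE.matcher(line);
        if (!m.matches()) {
            throw new IllegalArgumentException("Bad status line: " + line);
        }
        // Group 4 is null when the whole " reason" part is absent.
        String reason = m.group(4) == null ? "" : m.group(4);
        return new String[] { m.group(3), reason };
    }

    public static void main(String[] args) {
        System.out.println(parse("HTTP/1.1 304 Not Modified")[1]);
        System.out.println(parse("HTTP/1.1 304")[1].isEmpty());
    }
}
```

Both "HTTP/1.1 304 Not Modified" and a bare "HTTP/1.1 304" parse to status code 304; only the reason phrase differs.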
>
> The state machine expects to see a Reason Phrase and when it doesn't, it
> consumes part of the next header (Date) and then throws an exception trying
> to convert this value to an Integer.
>
> What I did was override the isTerminator() method for the
> ConsumeToLinearWhitespaceDecodingState adding a check for a CR.
>
> This stops the scanner from pulling in excess bytes. I wasn't sure how the
> remaining states would handle this but AFTER_READ_STATUS_CODE returns
> immediately as does READ_REASON_PHRASE (since we left a remaining LF byte on
> the input buffer) and we cleanly move to a final acceptance state.
>
> I've run about 500 feeds through this and nothing seems to have broken.
>
> Here's a patch:
>
> --- HttpResponseLineDecodingState.orig.java     2008-01-04 14:29:25.000000000 -0500
> +++ HttpResponseLineDecodingState.java  2008-01-04 14:28:40.000000000 -0500
> @@ -80,6 +80,10 @@
>              }
>              return AFTER_READ_STATUS_CODE;
>          }
> +        @Override
> +        protected boolean isTerminator(byte b) {
> +            return b == 32 || b == 9 || b == 13;
> +        }
>      };
>
>      private final DecodingState AFTER_READ_STATUS_CODE = new LinearWhitespaceSkippingState() {
>
>
> This _should_ be safe since the response line has to be terminated with a
> CR/LF pair. As long as we leave the LF byte, it's enough to satisfy the
> state requirements for the trailing states.
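
The effect of the extra terminator can be sketched outside MINA with a plain byte scan. This is only an illustration under two assumptions: that the unpatched state stops on SP (32) and HT (9) only, as the patch implies, and that the response has no space after the status code. TerminatorSketch and consumeToken() are made-up names, not MINA classes.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the token-consuming state; not MINA code.
final class TerminatorSketch {
    /** Assumed original behaviour: stop only on SP (32) or HT (9). */
    static boolean isLwsTerminator(byte b) {
        return b == 32 || b == 9;
    }

    /** Patched behaviour from the diff above: also stop on CR (13). */
    static boolean isPatchedTerminator(byte b) {
        return b == 32 || b == 9 || b == 13;
    }

    /** Consumes bytes from start until a terminator (or end of input). */
    static String consumeToken(byte[] in, int start, boolean patched) {
        int end = start;
        while (end < in.length) {
            byte b = in[end];
            boolean stop = patched ? isPatchedTerminator(b) : isLwsTerminator(b);
            if (stop) {
                break;
            }
            end++;
        }
        return new String(in, start, end - start, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] in = "HTTP/1.1 304\r\nDate: Fri, 04 Jan 2008 19:29:25 GMT\r\n"
                .getBytes(StandardCharsets.US_ASCII);
        int afterVersion = 9; // index just past "HTTP/1.1 "
        // Unpatched: the token runs on into the Date header.
        System.out.println(consumeToken(in, afterVersion, false).length());
        // Patched: the token is just the status code.
        System.out.println(Integer.parseInt(consumeToken(in, afterVersion, true)));
    }
}
```

With the unpatched terminators, the token swallows the CR/LF and the start of the Date header, so converting it to an integer fails. With CR added, the scan stops right after the three digits and leaves the LF on the input.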
>
> You can use this site for testing. If you set the ETag you should get a 304
> with no Reason Phrase.
>
>     http://www.mattweber.org/feed/
>
> Thanks,
> -Eric
>
> P.S. Did I mention how much fun I've been having with mina? Love it!
>
>
>



-- 
what we call human nature is actually human habit
--
http://gleamynode.net/
--
PGP Key ID: 0x0255ECA6
