-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

André,

On 1/21/2010 9:21 AM, André Warnier wrote:
> But then, such header field values MUST be encoded according to the
> rules of RFC 2047.

Unfortunately, Tomcat does not follow RFC2047, at least not according to
http://stackoverflow.com/questions/324470/http-headers-encoding-decoding-in-java
and not according to my simple test:

$ wget -O - \
    --header "Test-Value: =?iso-8859-1?q?this=20is=20some=20text?=" \
    http://myhost/SessionSnooper.jsp | grep -C 1 "some=20text"

   <td>

        =?iso-8859-1?q?this=20is=20some=20text?=<br />

    </td>

The value is preserved as-is. (The SessionSnooper.jsp file referenced
above can be found here: http://www.christopherschultz.net/projects/java/).

Fortunately, the value /is/ passed through without modification. That
means that we can decode it ourselves!

Let's figure out how to decode the string
"=?iso-8859-1?q?this=20is=20some=20text?=":

1. Check that the string matches the pattern "=\?[^?]*\?(B|Q)\?[^?]*\?="
   (case-insensitively, since the encoding letter may be "q" or "b").
2. Extract the charset and encoding
3. If encoding is 'Q', convert value characters to bytes:
      "=HL" -> 0xHL
      "_"   -> 0x20 (space)
      others direct
4. If encoding is 'B', base64 decode value into bytes
5. Convert bytes to characters using charset:
     new String(bytes, charset)
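
For what it's worth, the steps above can be sketched in plain Java
roughly like this. It is only a minimal sketch, assuming Java 8 or later
(for java.util.Base64) and a header value that consists of exactly one
encoded-word. The class name Rfc2047Sketch and its decode method are
made up for illustration, and malformed input (a bad hex pair, an
unknown charset) simply throws.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Tomcat or JavaMail.
final class Rfc2047Sketch {
    // Step 1: =?charset?encoding?encoded-text?= with a B or Q encoding letter.
    private static final Pattern ENCODED_WORD =
        Pattern.compile("=\\?([^?]+)\\?([BbQq])\\?([^?]*)\\?=");

    static String decode(String value) {
        Matcher m = ENCODED_WORD.matcher(value);
        if (!m.matches()) {
            return value; // not a single encoded-word: return it unchanged
        }
        // Step 2: extract the charset and the encoding.
        Charset charset = Charset.forName(m.group(1));
        String text = m.group(3);
        // Steps 3 and 4: turn the encoded text into raw bytes.
        byte[] bytes = m.group(2).equalsIgnoreCase("B")
            ? Base64.getDecoder().decode(text)
            : decodeQ(text);
        // Step 5: interpret the bytes in the declared charset.
        return new String(bytes, charset);
    }

    private static byte[] decodeQ(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '=' && i + 2 < text.length()) {
                // "=HL" is one byte written as two hex digits.
                out.write(Integer.parseInt(text.substring(i + 1, i + 3), 16));
                i += 2;
            } else if (c == '_') {
                out.write(0x20); // "_" stands for a space in Q encoding
            } else {
                out.write((byte) c); // everything else maps directly
            }
        }
        return out.toByteArray();
    }
}
```

Calling Rfc2047Sketch.decode(request.getHeader("Test-Value")) from a
servlet or JSP would be one way to use it. A real header may contain
several encoded-words mixed with plain text, which this sketch does not
handle.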

As I started to write code to do this, it occurred to me that it must
already exist. Googling for "java rfc2047 decode" shows that the
javax.mail.internet.MimeUtility class (packaged with the JavaMail API)
already has a method called "decodeText" that will do this for us.

I wrote a simple wrapper around that method, and you can see that it works:

$ java -classpath javamail-1.4.2.jar:. RFC2047Codec \
    '=?iso-8859-1?q?this=20is=20some=20text?='
this is some text
$ java -classpath javamail-1.4.2.jar:. RFC2047Codec \
    '=?UTF-8?q?this=20is=20some=20text?='
this is some text
$ java -classpath javamail-1.4.2.jar:. RFC2047Codec \
    '=?utf-8?q?this=20is=20some=20text?='
this is some text
$ java -classpath javamail-1.4.2.jar:. RFC2047Codec \
    '=?utf-8?q?this=20is=20a=20pi:=20=cf=80?='
this is a pi: #

Er.... the pi wouldn't copy correctly from my terminal, but I assure you
that the pi character was dumped to my terminal.

So, if you have to decode RFC2047-compliant values, MimeUtility can do
that for you. It can encode them as well.

It sounds like you have everything you need at this point, as long as
AAI recognizes RFC2047-formatted HTTP header values.

Good luck,
- -chris
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAktYq7AACgkQ9CaO5/Lv0PAW5wCbBZM3AKhY23dp4OqYm927gM40
Ty0AoJOwpJlLZ/f3IiCNfzSaimyMnRHB
=Vf7P
-----END PGP SIGNATURE-----
