Re: Encoding pb? -- Can't parse http response from babelfish.altavista.com

Roland Weber Fri, 27 Oct 2006 10:36:00 -0700

Hi Franck,

> But when i try to parse babelfish response to my request for a
> translation to russian or greek, i get little squares instead of
> russian or greek characters.


HttpClient never gives you little squares. It gives you bytes or
characters, where the characters might be correct or not. If you
*print* the characters to some stream, then they get *rendered*
by some font. Little squares usually indicate one of these:
1. The character is a little square.
2. The character is correct, but not supported by the font.
   For example, a Japanese character printed with an ISO-Latin font.
3. The character is wrong.

> So i tried to use the getResponseBody()
> on the post method to get an array of bytes so i could convert it
> using UTF-8 or ISO-8859-1 or UTF-16. No matter what character encoding
> i use i get those annoying little squares.

Try printing the code points:
System.out.println("character: " + ((int)c));
Then you know which value the character has in memory, and can
verify whether the problem is in character decoding or in the
range of characters supported by the font.
You can also try to write the characters into a binary file as
UTF-16, then open that file with a text editor that supports
UTF-16, such as WordPad (if you use Windows).
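
A minimal sketch of both checks, assuming a recent JDK (Java 8 or
later). The class name CodePointDump, the sample text and the output
file name response-utf16.txt are made up for illustration; in your
code the string would be the decoded response.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper class, not part of HttpClient.
class CodePointDump {

    // Formats every code point of the text as "U+XXXX", separated by spaces,
    // so the values in memory can be compared against a Unicode chart.
    static String describe(String text) {
        StringBuilder sb = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    // Encodes the text as UTF-16. Java's "UTF-16" charset writes
    // big-endian bytes preceded by a byte order mark (FE FF).
    static byte[] toUtf16Bytes(String text) {
        return text.getBytes(StandardCharsets.UTF_16);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the decoded response: the Russian word "Privet".
        String sample = "\u041F\u0440\u0438\u0432\u0435\u0442";

        // Independent of any font: shows what is really in memory.
        System.out.println(describe(sample));

        // Assumed file name; open the result in a UTF-16 capable editor.
        Files.write(Paths.get("response-utf16.txt"), toUtf16Bytes(sample));
    }
}
```

If the printed values fall in the Cyrillic (U+0400 to U+04FF) or
Greek (U+0370 to U+03FF) range, the decoding worked and the font is
the likely culprit.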

>      String encoding = post.getResponseCharSet ();
> 
>      String altavistaResponse /*= new String(post.getResponseBody(),
> "ISO-8859-1")*/;
>      //altavistaResponse = new String(post.getResponseBody(),
> "ISO-8859-7");
>      altavistaResponse = new String( post.getResponseBody(), "UTF-8");
>      //altavistaResponse = new String(post.getResponseBody(), "UTF-16");
>      //altavistaResponse = post.getResponseBodyAsString();

Well, what *is* the value of "encoding"? What does a browser use
for displaying the page when you visit it directly?
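
To see why the answer matters, here is a small sketch that decodes
the same bytes with two different charsets and prints the resulting
code points. It assumes, purely for illustration, that the server
sent UTF-8; the byte array stands in for post.getResponseBody(), and
the class name CharsetCheck is made up.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of HttpClient.
class CharsetCheck {

    // Decodes the raw body with the named charset.
    static String decode(byte[] body, String charsetName) {
        return new String(body, Charset.forName(charsetName));
    }

    // Formats every code point of the text as "U+XXXX", separated by spaces.
    static String codePoints(String text) {
        StringBuilder sb = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in for the response body: Greek alpha and beta,
        // encoded as UTF-8 (an assumption about the server).
        byte[] body = "\u03B1\u03B2".getBytes(StandardCharsets.UTF_8);

        for (String name : new String[] {"UTF-8", "ISO-8859-1"}) {
            System.out.println(name + ": " + codePoints(decode(body, name)));
        }
    }
}
```

With the matching charset you get U+03B1 U+03B2, the two Greek
letters. With ISO-8859-1 the same four bytes become
U+00CE U+00B1 U+00CE U+00B2, which is case 3 from the list above:
the characters are wrong, no matter which font renders them.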

hope that helps,
  Roland


