Hello,
I use the following code to find the charset of a page, but it does not work for
the page
"http://www.annahar.com/content.php?priority=1&table=main&type=main&day=Mon"
Code:
[code]
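import java.io.IOException;
import org.apache.http.Header;
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.ClientProtocolException;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.DefaultHttpClient;

// The class name is arbitrary; it only wraps the snippet so that it compiles.
public class CharsetCheck {
    public static void main(String[] args) {
        try {
            HttpClient httpclient = new DefaultHttpClient();
            String url = "http://www.annahar.com/content.php?priority=1&table=main&type=main&day=Mon";
            HttpGet httpget = new HttpGet(url);
            HttpResponse response = httpclient.execute(httpget);
            HttpEntity entity = response.getEntity();
            if (entity != null) {
                // Print the raw Content-Type header exactly as the server sent it.
                Header[] allHeaders = response.getHeaders("Content-Type");
                if (allHeaders.length > 0) {
                    System.out.println(allHeaders[0].getValue());
                }
            }
        } catch (ClientProtocolException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}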
[/code]
The output of the above code is: text/html.
But I think the output should be "text/html; charset=windows-1256". Am I right?
When I use
"http://bigbrowser.blog.lemonde.fr/2011/08/03/iran-le-mossad-derriere-le-meurtre-dun-scientifique-spiegel"
as the URL in the code, it returns "text/html; charset=UTF-8", which I think is correct.
It seems to work for some pages but not all of them. Why does this happen?
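One possibility, which I have not confirmed for this site, is that the server leaves the charset parameter out of the Content-Type header and the page declares it only in an HTML <meta> tag. Below is a minimal sketch of that fallback using only the JDK. The class name CharsetSniffer and its methods are hypothetical, not part of HttpClient, and the regular expressions are a rough approximation rather than a full HTML parser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of HttpClient: looks for a charset in the
// Content-Type header first, then in <meta> tags of the HTML.
class CharsetSniffer {
    // Matches charset=VALUE, with or without quotes around VALUE.
    private static final Pattern CHARSET = Pattern.compile(
            "charset\\s*=\\s*[\"']?([^\\s\"';>/]+)", Pattern.CASE_INSENSITIVE);
    // Matches a single <meta ...> tag.
    private static final Pattern META = Pattern.compile(
            "<meta[^>]*>", Pattern.CASE_INSENSITIVE);

    // Returns the charset parameter of a Content-Type value, or null if absent.
    static String fromContentType(String contentType) {
        if (contentType == null) {
            return null;
        }
        Matcher m = CHARSET.matcher(contentType);
        return m.find() ? m.group(1) : null;
    }

    // Returns the first charset declared in a <meta> tag, or null if none.
    static String fromHtml(String html) {
        if (html == null) {
            return null;
        }
        Matcher meta = META.matcher(html);
        while (meta.find()) {
            Matcher m = CHARSET.matcher(meta.group());
            if (m.find()) {
                return m.group(1);
            }
        }
        return null;
    }

    // Prefers the header; falls back to the HTML only when the header has no charset.
    static String detect(String contentType, String html) {
        String fromHeader = fromContentType(contentType);
        return fromHeader != null ? fromHeader : fromHtml(html);
    }

    public static void main(String[] args) {
        // Sample markup made up for illustration; not copied from the real page.
        String html = "<html><head><meta http-equiv=\"Content-Type\" "
                + "content=\"text/html; charset=windows-1256\"></head></html>";
        System.out.println(detect("text/html", html));
        System.out.println(detect("text/html; charset=UTF-8", html));
    }
}
```

With HttpClient, the header value could come from allHeaders[0].getValue() and the HTML from the first part of the response body, read with a single-byte encoding such as ISO-8859-1 so that the ASCII <meta> tag survives decoding.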
Khosro.