Polish Web sites typically use Cp1250 (windows-1250) or ISO-8859-2 (or UTF-8,
of course). Check whether diacritics like these:

ęółąśćżń

look all right when the page is decoded with each of the above encodings, and
use whichever one renders them correctly.
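
The check above can be sketched in a few lines of Python. This is only an illustration using Python's built-in codecs, not what the HTML parser does internally; the names SAMPLE, CANDIDATES, decode_with_candidates and matching_encodings are made up for this example.

```python
# Sketch: decode the same raw bytes with each candidate encoding and see
# which one yields the expected Polish diacritics.
# All names here are hypothetical helpers, not part of any crawler API.

SAMPLE = "ęółąśćżń"

# Python codec names for windows-1250, ISO-8859-2 and UTF-8.
CANDIDATES = ["cp1250", "iso8859_2", "utf-8"]


def decode_with_candidates(raw: bytes) -> dict:
    """Return {encoding: decoded text, or None if the bytes are invalid}."""
    results = {}
    for name in CANDIDATES:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


def matching_encodings(raw: bytes, expected: str = SAMPLE) -> list:
    """List the candidate encodings under which raw decodes to expected."""
    decoded = decode_with_candidates(raw)
    return [name for name, text in decoded.items() if text == expected]


if __name__ == "__main__":
    # Simulate a page stored as windows-1250 and show every decoding,
    # so wrong guesses are visible as garbled text.
    raw = SAMPLE.encode("cp1250")
    for name, text in decode_with_candidates(raw).items():
        print(name, "->", repr(text))
    print("matches:", matching_encodings(raw))
```

Only the encoding the bytes were really written in reproduces the sample exactly, because windows-1250 and ISO-8859-2 place some letters (ą and ś, for example) at different byte values.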

Dawid

On Wed, Sep 16, 2009 at 4:47 PM, MilleBii <mille...@gmail.com> wrote:
> same thing when there is
> charset=ISO-8859-2
>
> 2009/9/16 MilleBii <mille...@gmail.com>
>
>> Not sure where to look for explanations:
>>
>> I have a problem with some Polish pages that I cannot index properly
>> because of specific Polish characters such as:
>> &#321;
>>
>> They declare the following: charset=windows-1252
>>
>> Does the HTML parser convert them into their Unicode equivalent?
>>
>> --
>> -MilleBii-
>>
>
>
>
> --
> -MilleBii-
>
