Re: invalid utf-8 sequence

Jarrett Billingsley Tue, 06 Jan 2009 19:23:57 -0800

On Tue, Jan 6, 2009 at 9:20 PM, james <[email protected]> wrote:
> Jarrett Billingsley Wrote:
>
>> On Tue, Jan 6, 2009 at 8:04 PM, james <[email protected]> wrote:
>> > I'm writing an indexer, but I'm having a problem: on some files,
>> > reading gives this error
>> >
>> > Error 4: invalid UTF-8 sequence
>> >
>> > Is there a way to fix it?
>> >
>>
>> You're probably reading a file that's encoded in some non-Unicode
>> encoding, like Latin-1.  You could read in the file data as byte[]
>> instead of as char[], but that still doesn't deal with the problem
>> that you have characters in your file that are outside the ASCII
>> range.  If you know what encoding your file uses, you could do some
>> transformations on it to turn it into valid Unicode, or you could just
>> ignore characters outside the ASCII range :P
>
> Is there any library or function that can automatically convert these
> unknown HTML charsets into UTF-8?


Not that I know of, for D anyway.
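
If the files do turn out to be Latin-1, though, the "transformation" I
mentioned is small enough to write by hand. Here is a rough sketch, not a
general solution: it assumes every input byte is a Latin-1 (ISO-8859-1)
character, so it will produce garbage for any other charset. The name
latin1ToUtf8 is made up for illustration; it is not a library function.

```d
// Sketch only: assumes the input really is Latin-1 (ISO-8859-1).
// latin1ToUtf8 is a hypothetical helper, not part of any library.
char[] latin1ToUtf8(ubyte[] input)
{
    char[] result;
    foreach (ubyte b; input)
    {
        if (b < 0x80)
        {
            // ASCII bytes are identical in Latin-1 and UTF-8.
            result ~= cast(char) b;
        }
        else
        {
            // Latin-1 bytes 0x80..0xFF are code points U+0080..U+00FF,
            // which UTF-8 encodes as two bytes: 110000xx 10xxxxxx.
            result ~= cast(char) (0xC0 | (b >> 6));
            result ~= cast(char) (0x80 | (b & 0x3F));
        }
    }
    return result;
}
```

You would read the file as raw bytes (ubyte[]) rather than as char[], pass
it through this, and index the result. It does nothing to detect the
charset; that part you would still have to know or guess some other way.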
