Thanks for your quick reply!
Looks like your theory about the input data being "in ascii (with entity
references...)" is contradicted by the evidence.
Indeed.
So now you need to determine what character encoding is being used for
the non-ascii codes, which are obviously present in the data. When
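
One rough way to start on that question is to list the bytes outside the ASCII range and see which candidate encodings decode the data at all. The sketch below is only an illustration: the helper names, the sample bytes, and the list of candidate encodings are assumptions, not anything taken from the actual data in this thread.

```python
def non_ascii_bytes(data: bytes):
    """Return (offset, byte value) for every byte outside 7-bit ASCII."""
    return [(i, b) for i, b in enumerate(data) if b > 0x7F]


def candidate_decodings(data: bytes,
                        encodings=("utf-8", "iso-8859-1", "cp1252")):
    """Map each candidate encoding name to the decoded text, or None on failure."""
    results = {}
    for name in encodings:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


# Hypothetical sample: byte 0xE7 is c with cedilla in ISO-8859-1.
sample = b"gar\xe7on"
print(non_ascii_bytes(sample))
print(candidate_decodings(sample))
```

A successful decode does not prove the guess is right (ISO-8859-1 accepts any byte sequence), so the decoded text still has to be checked against its context.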
At 1:12 am +0200 26/10/03, Marco Baroni wrote:
I am new to (explicit) Unicode handling, and right now I am facing
this problem.
I have some data (lots of data) that in theory should be in ascii
(with entity references in place of non-ascii characters). I have
no easy way to get to know exact
> Date: Sun, 26 Oct 2003 18:02:36 +0200
> From: Jarkko Hietaniemi <[EMAIL PROTECTED]>
Beesley:
> > It's curious that the Arabic Presentation Forms got
> > into Unicode at all, and a number of people still think
> > it was a mistake, a sell-out. One of the Fathers of Unicode
> > told me they were
Chris,
I think what you've done is very interesting, and
useful, so what I have to say below is not intended
as criticism of your work in any way.
It's curious that the Arabic Presentation Forms got
into Unicode at all, and a number of people still think
it was a mistake, a sell-out. One of the
[EMAIL PROTECTED] said:
> I see a rhombus with a question mark inside (which is the way my
> shell displays non-ASCII characters). I guess it is a c with cedilla
> from the context.
> So, I would like to ask you or anybody else: is there some kind of
> tool (e.g., a text editor) that I could u
> It's curious that the Arabic Presentation Forms got
> into Unicode at all, and a number of people still think
> it was a mistake, a sell-out. One of the Fathers of Unicode
> told me they were deprecated. Even the Unicode specification
> explains their presence rather apologetically.
Well, one
On Sunday 26 October 2003 01:27 am, Marco Baroni wrote:
> Thanks for your quick reply!
>
> > When you look at the file and you see
> > a c with cedilla, can you tell whether this is actually the
> > appropriate character, based on its context? Is this true
> > of all such characters?
>
> I do not