Character coding

Andrew Brown Wed, 10 Dec 2008 05:49:52 -0800

I am trying to sort out some files that contain a number of entities,
a large number of correctly-coded accented characters (é for e acute)
and a good many incorrectly-coded accented characters (È for e acute).
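
One way an "È for e acute" substitution can arise is sketched below. It assumes (this is a guess, not something confirmed for these files) that the bad characters were written as Latin-1 bytes and later read as Mac OS Roman. The function names are made up for illustration.

```python
# Assumption: the mis-coded text was saved as Latin-1 and then opened as
# Mac OS Roman. The real files may have a different history.

E_ACUTE = "\u00e9"  # é, the intended character
E_GRAVE_CAP = "\u00c8"  # È, what shows up instead


def misread(text: str, written_as: str = "latin-1", read_as: str = "mac_roman") -> str:
    """Encode text in one character set, then decode the same bytes in another."""
    return text.encode(written_as).decode(read_as)


def repair(text: str, written_as: str = "latin-1", read_as: str = "mac_roman") -> str:
    """Undo misread() by reversing the two steps."""
    return text.encode(read_as).decode(written_as)


if __name__ == "__main__":
    print(misread(E_ACUTE))     # byte 0xE9 means é in Latin-1 but È in Mac OS Roman
    print(repair(E_GRAVE_CAP))  # and back again
```

If this guess holds, repair() should only be applied to the mis-coded passages, since running it over the correctly-coded é would damage those instead.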


To identify the problematic cases I used grep to delete everything
matching [a-z] and [A-Z] from the files, but this also deleted the È
and the other problem characters.

But those same characters could be found (before [A-Z] deleted them)
by searching for [À-Ù]...

My conclusion is that [A-Z] finds more than A through Z, but can this
be so?
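
For comparison, here is a small sketch of how the same two classes behave in a regex engine that compares plain code points (Python's re module, without any case-insensitive flag). It says nothing certain about the grep used above; if that engine matches È with [A-Z], something else, such as case-insensitive or locale-aware matching, may be in play. The sample string and the helper name are made up for illustration.

```python
import re

# Hypothetical sample: one correctly coded é and one mis-coded È.
sample = "caf\u00e9 and caf\u00c8"

ascii_upper = re.compile(r"[A-Z]")               # U+0041 to U+005A only
accented_upper = re.compile("[\u00c0-\u00d9]")   # À to Ù, as in the search above
non_ascii = re.compile(r"[^\x00-\x7F]")          # anything outside plain ASCII


def found(pattern: re.Pattern, text: str) -> list:
    """Return every character in text that the pattern matches."""
    return pattern.findall(text)


if __name__ == "__main__":
    print(found(ascii_upper, sample))     # no capitals A-Z in the sample
    print(found(accented_upper, sample))  # picks up the È
    print(found(non_ascii, sample))       # picks up both é and È
```

Searching for the non-ASCII class directly may be a safer way to list the suspect characters than deleting the ASCII letters first.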

I fear that this may have something to do with the wretched Unicode...

AB


--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google
Groups "BBEdit Talk" group.
For more options, visit this group at
http://groups.google.com/group/bbedit?hl=en
-~----------~----~----~----~------~----~------~--~---
