On 11 Dec 2008, at 02:44, Peter N Lewis wrote:

>> I am trying to sort out some files that contain a number of entities,
>> a large number of correctly-coded accented characters (é for e acute)
>> and a good many incorrectly-coded accented characters (È for e acute).
>>
>> To identify the problematic cases I deleted from the files with grep
>> [a-z] and [A-Z] but this also deleted the È and other problems.
>>
>> But those same characters could be found (before [A-Z] deleted them)
>> by searching for [À-Ù]...
>>
>> My conclusion is that [A-Z] finds more than A through Z, but can this
>> be so ?
>
> I tried to duplicate this and could not. I took your above paragraphs
> and pasted into a new text file in BBEdit, and replaces [A-Z] with
> nothing (case insensitive, grep) and it left the five accented
> characters as expected.
>
> Are you using BBEdit 9?

Yes, I'm on 9.0.2. It's a mystery to me, because those accented
characters definitely went west and I was left only with this sort of
stuff:

¸ ü done
œ œ done
… É done
' ’ done
‚ â done
‡ à done
∫ º done
‰ ä done
fl ß done
∆ Æ done
∞ º done
™ ª done
≈ Å done

The files are in FileMaker Pro now, where the odd behaviour continues.
I had 34 occurrences of a symbol that should have been ó (o acute), and
corrected them all. FMP now finds only 8 o acutes, but of course that
could just be FMP playing up.

I'll report back if further tests produce useful results.

AB

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the
Google Groups "BBEdit Talk" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at
http://groups.google.com/group/bbedit?hl=en
If you have a specific feature request or would like to report a
suspected (or confirmed) problem with the software, please email to
"[EMAIL PROTECTED]" rather than posting to the group.
-~----------~----~----~----~------~----~------~--~---
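
For anyone wanting to check pairs like these outside BBEdit, here is a minimal Python sketch. It is not from the original message. It rests on one assumption: that the mis-coded characters came from Latin-1 bytes being read as MacRoman. That reading fits several of the pairs listed (È for é, ¸ for ü, … for É, ≈ for Å) but not all of them (the œ pair, for instance), so treat it as one possible explanation, not a diagnosis. The function names are made up for illustration. The last lines show how Python's own regex engine treats the two character classes from the thread; they say nothing about what BBEdit's grep does.

```python
import re


def misread_as_mac_roman(text: str) -> str:
    """Encode text as Latin-1, then decode those bytes as MacRoman."""
    return text.encode("latin-1").decode("mac_roman")


def repair(text: str) -> str:
    """Undo the misreading: re-encode as MacRoman, decode as Latin-1."""
    return text.encode("mac_roman").decode("latin-1")


# Intended characters taken from the pairs in the message.
intended = "éüÉâàºäÆªÅ"
garbled = misread_as_mac_roman(intended)
print(garbled)
print(repair(garbled) == intended)

# In Python's re module, [A-Za-z] covers ASCII letters only, so È survives.
stripped = re.sub(r"[A-Za-z]", "", "abc È XYZ")
print(repr(stripped))

# [À-Ù] spans U+00C0 to U+00D9, which includes È (U+00C8).
found = re.search(r"[À-Ù]", "È") is not None
print(found)
```

If the assumption holds for a given file, running repair() over only the mis-coded characters would restore them. Running it over text that is already correct would damage it, so the two kinds need to be separated first.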
