Hi Sven,

Is it possible that you ran a case-insensitive search (that is, the "Case sensitive" checkbox was unchecked in the Find window)? If so, it is not a bug but simply Unicode case conversion: your regex finds the "lowercase" versions of these two Unicode characters:
Unicode Character “K” (U+212A) is based on k
https://www.compart.com/en/unicode/U+212a

Unicode Character “ſ” (U+017F) is based on s
https://www.compart.com/en/unicode/U+017f

If you try your regex with "Case sensitive" checked, 's' and 'k' are not found because no case conversion takes place.

Case conversion + Unicode + Locales can be tricky.

Regards,
Jean Jourdain

On Friday, April 30, 2021 at 2:16:20 PM UTC+2 [email protected] wrote:

> Hi!
>
> I was going to run a regular expression on a large document.
> What I wanted to extract was lines matching [\x{007f}-\x{ffff}], also
> known as high or extended ASCII.
>
> When I search for that pattern in the document, however, it also oddly
> matches the characters "s" and "k", which according to the Character
> inspector have Unicode 0073 and 006B respectively.
>
> Am I doing something wrong here? It seems to me this could be a bug.
>
> I'm at BBEdit 13.5.6.
>
> Best regards,
> Sven
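
P.S. The case conversion described above can be reproduced outside BBEdit. Below is a minimal sketch in Python. It uses Python's own re module, not BBEdit's regex engine, so it only illustrates the same mechanism and says nothing about BBEdit's internals. The names KELVIN, LONG_S, PATTERN, and matches are made up for this illustration, and the pattern is written in Python's escape syntax rather than the \x{...} form from the original question.

```python
import re

# The two characters from the explanation above.
KELVIN = "\u212a"  # KELVIN SIGN, lowercases to plain "k"
LONG_S = "\u017f"  # LATIN SMALL LETTER LONG S, uppercases to plain "S"

# Same range as in the question, in Python escape syntax (assumption:
# Python's re stands in for the editor's regex engine here).
PATTERN = r"[\x7f-\uffff]"


def matches(text, ignore_case):
    """Return True if PATTERN is found anywhere in text."""
    flags = re.IGNORECASE if ignore_case else 0
    return re.search(PATTERN, text, flags) is not None


if __name__ == "__main__":
    # Case conversion links the non-ASCII characters to ASCII letters.
    print("KELVIN lowercased is 'k':", KELVIN.lower() == "k")
    print("LONG_S uppercased is 'S':", LONG_S.upper() == "S")

    # With case-insensitive matching, plain "k" and "s" fall inside the
    # range through their case variants; with case-sensitive matching
    # they do not.
    for ch in ("k", "s"):
        print(ch, "insensitive:", matches(ch, True),
              "sensitive:", matches(ch, False))
```

Which additional ASCII letters get pulled in this way depends on the case tables a given engine uses, so other engines may differ from BBEdit in the details.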
