Hi Jim,

As per the PCRE2 documentation, you could use *\h* instead of *\s*:
https://www.pcre.org/current/doc/html/pcre2syntax.html#SEC4

CHARACTER TYPES

  .          any character except newline;
               in dotall mode, any character whatsoever
  \C         one code unit, even in UTF mode (best avoided)
  \d         a decimal digit
  \D         a character that is not a decimal digit
  \h         *a horizontal white space character*
  \H         a character that is not a horizontal white space character
  \N         a character that is not a newline
  \p{xx}     a character with the xx property
  \P{xx}     a character without the xx property
  \R         a newline sequence
  \s         a white space character
  \S         a character that is not a white space character
  \v         *a vertical white space character*
  \V         a character that is not a vertical white space character
  \w         a "word" character
  \W         a "non-word" character
  \X         a Unicode extended grapheme cluster

HTH

Jean Jourdain

On Sunday, September 24, 2023 at 8:56:21 PM UTC+2 Patrick Woolsey wrote:

> Since per the discussion of character classes in Chapter 8, the special
> class \s intrinsically includes linefeeds:
>
> ====
>
> *Other Special Character Classes*
>
> BBEdit uses several other sequences for matching different types or
> categories of characters.
>
> Special Character   Matches
>
> \s                  any whitespace character (space, tab, carriage
>                     return, line feed, form feed)
>
> ====
>
> I suggest you instead define a character class which contains only the
> whitespace characters that you explicitly wish to exclude, e.g. [^\t ]
> since you needn't worry about carriage returns and I expect you aren't
> likely to encounter form feeds. :-)
>
> Regards,
>
> Patrick Woolsey
> ==
> Bare Bones Software, Inc.
> <https://www.barebones.com/>
>
> > On Sep 24, 2023, at 12:48, Jim Witte <[email protected]> wrote:
> >
> > I'm trying to create a pattern that will find two Chinese characters
> > separated by 1 or more spaces and convert the spaces to a single
> > ideographic space (\x{3000}), using the following pattern:
> >
> > Find: ([\x{2f00}-\x{ffff}]){1}[\s^$]+([\x{2f00}-\x{ffff}])
> > Replace: \1\x{3000}\2
> >
> > But this also recognizes newlines as spaces. I figured out how to do it
> > using [[:blank:]] with
> >
> > Find: ([\x{2f00}-\x{ffff}]){1}[[:blank:]]+([\x{2f00}-\x{ffff}])
> >
> > But is there another way? Something like [\s^$] ? "[\s^\n]" doesn't work.
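
For anyone who wants to try the idea from this thread outside BBEdit, here is a minimal Python sketch. It is only an illustration and rests on a few assumptions: Python's standard `re` module does not support `\h` or `[[:blank:]]`, so the sketch falls back to an explicit class containing just space and tab, along the lines Patrick suggested. The function name `join_with_ideographic_space` and the constant names are hypothetical, chosen for this example; the character range is the one from Jim's pattern.

```python
import re

# Same code point range as in Jim's Find pattern.
CJK = "[\u2f00-\uffff]"

# Assumption: Python's re has no \h or [[:blank:]], so horizontal
# whitespace is approximated here by an explicit class of space and tab.
HSPACE = "[ \t]"

# Two captured characters separated by one or more spaces or tabs.
PATTERN = re.compile(f"({CJK}){HSPACE}+({CJK})")


def join_with_ideographic_space(text: str) -> str:
    """Replace the run of spaces/tabs between two matched characters
    with a single ideographic space (U+3000)."""
    return PATTERN.sub("\\1\u3000\\2", text)


if __name__ == "__main__":
    sample = "中 文\n字"
    # The line break is not in HSPACE, so it is left alone.
    print(repr(join_with_ideographic_space(sample)))
```

Because the class lists only the characters to match, a line feed between two characters is never treated as a space, which is the behaviour Jim was after.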
