Re: Preventing a newline from being recognized by [\s]+ without using [[:blank:]]

jj Mon, 25 Sep 2023 00:19:20 -0700

Hi Jim,

Per the PCRE2 documentation, you could use \h instead of \s, since \h matches only horizontal white space and so never matches a newline:


https://www.pcre.org/current/doc/html/pcre2syntax.html#SEC4

CHARACTER TYPES

  .       any character except newline; in dotall mode, any character whatsoever
  \C      one code unit, even in UTF mode (best avoided)
  \d      a decimal digit
  \D      a character that is not a decimal digit
  \h      *a horizontal white space character*
  \H      a character that is not a horizontal white space character
  \N      a character that is not a newline
  \p{xx}  a character with the xx property
  \P{xx}  a character without the xx property
  \R      a newline sequence
  \s      a white space character
  \S      a character that is not a white space character
  \v      *a vertical white space character*
  \V      a character that is not a vertical white space character
  \w      a "word" character
  \W      a "non-word" character
  \X      a Unicode extended grapheme cluster
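
For anyone who wants to try the idea outside BBEdit, here is a minimal Python sketch of the same replacement. It rests on a few assumptions: Python's built-in re module does not support \h, so the sketch swaps in the double-negated class [^\S\r\n] ("whitespace that is not CR or LF"); the names CJK, PATTERN and join_with_ideographic_space are hypothetical; and the character range is simply copied from the pattern quoted below. In BBEdit itself, \h should be the more direct route.

```python
import re

# Character range copied from the Find pattern in this thread (an assumption
# about what counts as a "Chinese character", not a vetted CJK definition).
CJK = r"[\u2f00-\uffff]"

# [^\S\r\n] is a double negation: "not (non-whitespace, CR, or LF)",
# i.e. any whitespace character except carriage return and line feed.
# It stands in for \h, which Python's re module does not provide.
PATTERN = re.compile(rf"({CJK})[^\S\r\n]+({CJK})")


def join_with_ideographic_space(text: str) -> str:
    """Replace a run of non-newline whitespace between two characters
    in the CJK range with a single ideographic space (U+3000)."""
    return PATTERN.sub("\\1\u3000\\2", text)


if __name__ == "__main__":
    sample = "中 \t 文\n字"
    print(repr(join_with_ideographic_space(sample)))
```

Note that [^\S\r\n] still matches other vertical whitespace such as form feeds, so it is slightly broader than \h.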

HTH

Jean Jourdain

On Sunday, September 24, 2023 at 8:56:21 PM UTC+2 Patrick Woolsey wrote:

> Since per the discussion of character classes in Chapter 8, the special 
> class \s intrinsically includes linefeeds:
>
> ====
>
> * Other Special Character Classes *
>
> BBEdit uses several other sequences for matching different types or 
> categories of characters.
>
> Special Character Matches
>
> \s any whitespace character (space, tab, carriage return, line feed, form 
> feed)
>
> ====
>
> I suggest you instead define a character class which contains only the 
> whitespace characters that you explicitly wish to match, e.g. [\t ] 
> since you needn't worry about carriage returns and I expect you aren't 
> likely to encounter form feeds. :-)
>
> Regards,
>
> Patrick Woolsey
> ==
> Bare Bones Software, Inc. <https://www.barebones.com/>
>
>
>
> > On Sep 24, 2023, at 12:48, Jim Witte <[email protected]> wrote:
> > 
> > I'm trying to create a pattern that will find two Chinese characters 
> > separated by 1 or more spaces and convert them to a single ideographic space 
> > (\x{3000}), using the following pattern:
> > 
> > Find: ([\x{2f00}-\x{ffff}]){1}[\s^$]+([\x{2f00}-\x{ffff}])
> > Replace: \1\x{3000}\2
> > 
> > But this also recognizes newlines as spaces. I figured out how to do it 
> > using [[:blank:]] with
> > 
> > Find: ([\x{2f00}-\x{ffff}]){1}[[:blank:]]+([\x{2f00}-\x{ffff}])
> > 
> > But is there another way? Something like [\s^$] ? "[\s^\n]" doesn't work.
> > 
>
>

-- 
This is the BBEdit Talk public discussion group. If you have a feature request 
or need technical support, please email "[email protected]" rather than 
posting here. Follow @bbedit on Twitter: <https://twitter.com/bbedit>
--- 
You received this message because you are subscribed to the Google Groups 
"BBEdit Talk" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/bbedit/a23d224b-11f1-40b5-a490-5a177007da91n%40googlegroups.com.
