regular_expressions.xml

fschumacher Sun, 22 Mar 2015 11:37:24 -0700

Author: fschumacher
Date: Sun Mar 22 18:37:12 2015
New Revision: 1668433

URL: http://svn.apache.org/r1668433
Log:
Markup code fragments


Modified:
    jmeter/trunk/xdocs/usermanual/regular_expressions.xml

Modified: jmeter/trunk/xdocs/usermanual/regular_expressions.xml
URL: http://svn.apache.org/viewvc/jmeter/trunk/xdocs/usermanual/regular_expressions.xml?rev=1668433&r1=1668432&r2=1668433&view=diff
==============================================================================
--- jmeter/trunk/xdocs/usermanual/regular_expressions.xml (original)
+++ jmeter/trunk/xdocs/usermanual/regular_expressions.xml Sun Mar 22 18:37:12 2015
@@ -42,34 +42,36 @@ There is also documentation on an older
 </p>
 <p>
 The pattern matching is very similar to the pattern matching in Perl. 
-A full installation of Perl will include plenty of documentation on regular expressions - look for perlrequick, perlretut, perlre, perlreref.
+A full installation of Perl will include plenty of documentation on regular expressions - look for <code>perlrequick</code>,
+<code>perlretut</code>, <code>perlre</code> and <code>perlreref</code>.
 </p>
 <p>
-It is worth stressing the difference between "contains" and "matches", as used on the Response Assertion test element:
+It is worth stressing the difference between "<em>contains</em>" and "<em>matches</em>", as used on the Response Assertion test element:
 </p>
-<ul>
-<li>
-"contains" means that the regular expression matched at least some part of the target, 
-so 'alphabet' "contains" 'ph.b.' because the regular expression matches the substring 'phabe'.
-</li>
-<li>
-"matches" means that the regular expression matched the whole target. 
-So 'alphabet' is "matched" by 'al.*t'. 
-</li>
-</ul>
-<p>In this case, it is equivalent to wrapping the regular expression in ^ and $, viz '^al.*t$'. 
+<dl>
+<dt>"<em>contains</em>"</dt><dd> means that the regular expression matched at least some part of the target, 
+so '<code>alphabet</code>' "<em>contains</em>" '<code>ph.b.</code>' because the regular expression matches the substring '<code>phabe</code>'.
+</dd>
+<dt>
+"<em>matches</em>"</dt><dd> means that the regular expression matched the whole target. 
+So '<code>alphabet</code>' is "<em>matched</em>" by '<code>al.*t</code>'. 
+</dd>
+</dl>
+<p>In this case, it is equivalent to wrapping the regular expression in <code>^</code> and <code>$</code>, viz '<code>^al.*t$</code>'. 
 </p>
 <p>However, this is not always the case. 
-For example, the regular expression 'alp|.lp.*' is "contained" in 'alphabet', but does not match 'alphabet'.
-</p>
-<p>Why? Because when the pattern matcher finds the sequence 'alp' in 'alphabet', it stops trying any other combinations - and 'alp' is not the same as 'alphabet', as it does not include 'habet'.
+For example, the regular expression '<code>alp|.lp.*</code>' is "<em>contained</em>" in '<code>alphabet</code>',
+but does not "<em>match</em>" '<code>alphabet</code>'.
 </p>
-<p>
-Note: unlike Perl, there is no need to (i.e. do not) enclose the regular expression in //. 
+<p>Why? Because when the pattern matcher finds the sequence '<code>alp</code>' in '<code>alphabet</code>', it stops trying any other
+combinations - and '<code>alp</code>' is not the same as '<code>alphabet</code>', as it does not include '<code>habet</code>'.
 </p>
+<note>
+Unlike Perl, there is no need to (i.e. do not) enclose the regular expression in <code>//</code>.
+</note>
 <p>
-So how does one use the modifiers ismx etc if there is no trailing /? 
-The solution is to use <i>extended regular expressions</i>, i.e. /abc/i becomes (?i)abc.
+So how does one use the modifiers <code>ismx</code> etc if there is no trailing <code>/</code>? 
+The solution is to use <i>extended regular expressions</i>, i.e. <code>/abc/i</code> becomes <code>(?i)abc</code>.
 See also <a href="#placement">Placement of modifiers</a> below.
 </p>
 </subsection>
@@ -88,14 +90,14 @@ A suitable regular expression would be:
 <p>
 The special characters above are:
 </p>
-<ul>
-<li>( and ) - these enclose the portion of the match string to be returned</li>
-<li>. - match any character</li>
-<li>+ - one or more times</li> 
-<li>? - don't be greedy, i.e. stop when first match succeeds</li>
-</ul>
+<dl>
+<dt><code>(</code> and <code>)</code></dt><dd>these enclose the portion of the match string to be returned</dd>
+<dt><code>.</code></dt><dd>match any character</dd>
+<dt><code>+</code></dt><dd>one or more times</dd> 
+<dt><code>?</code></dt><dd>don't be greedy, i.e. stop when first match succeeds</dd>
+</dl>
 <p>
-Note: without the ?, the .+ would continue past the first <code>"></code>
+Note: without the <code>?</code>, the <code>.+</code> would continue past the first <code>"></code>
 until it found the last possible <code>"></code> - which is probably not what was intended.
 </p>
 <p>
@@ -103,7 +105,7 @@ Note: although the above expression work
 <br/>
 <code>name="file" value="([^"]+)"></code>
 where<br></br>
-[^"] - means match anything except "<br></br>
+<code>[^"]</code> - means match anything except <code>"</code><br></br>
 In this case, the matching engine can stop looking as soon as it sees the first <code>"</code>, 
 whereas in the previous case the engine has to check that it has found <code>"></code> rather than say <code>" ></code>.
 </p>
@@ -117,7 +119,7 @@ A suitable reqular expression would be:
 <br/>
 <code>name="([^"]+)" value="([^"]+)"</code>
 <br/>
-This would create 2 groups, which could be used in the JMeter Regular Expression Extractor template as $1$ and $2$.
+This would create 2 groups, which could be used in the JMeter Regular Expression Extractor template as <code>$1$</code> and <code>$2$</code>.
 </p>
 <p>
 The JMeter Regex Extractor saves the values of the groups in additional variables.
@@ -126,21 +128,21 @@ The JMeter Regex Extractor saves the val
 For example, assume:
 </p>
 <ul>
-<li>Reference Name: MYREF</li>
-<li>Regex: name="(.+?)" value="(.+?)"</li>
-<li>Template: $1$$2$</li>
+<li>Reference Name: <code>MYREF</code></li>
+<li>Regex: <code>name="(.+?)" value="(.+?)"</code></li>
+<li>Template: <code>$1$$2$</code></li>
 </ul>
-<note>Do not enclose the regular expression in / /</note>
+<note>Do not enclose the regular expression in <code>/ /</code></note>
 <p>
 The following variables would be set:
 </p>
-<ul>
-<li>MYREF: file.namereadme.txt</li>
-<li>MYREF_g0: name="file.name" value="readme.txt"</li>
-<li>MYREF_g1: file.name</li>
-<li>MYREF_g2: readme.txt</li>
-</ul>
-These variables can be referred to later on in the JMeter test plan, as ${MYREF}, ${MYREF_g1} etc 
+<dl>
+<dt><code>MYREF</code></dt><dd><code>file.namereadme.txt</code></dd>
+<dt><code>MYREF_g0</code></dt><dd><code>name="file.name" value="readme.txt"</code></dd>
+<dt><code>MYREF_g1</code></dt><dd><code>file.name</code></dd>
+<dt><code>MYREF_g2</code></dt><dd><code>readme.txt</code></dd>
+</dl>
+These variables can be referred to later on in the JMeter test plan, as <code>${MYREF}</code>, <code>${MYREF_g1}</code> etc. 
 </p>
 </subsection>
 <subsection name="&sect-num;.3 Line mode" anchor="line_mode">
@@ -151,60 +153,63 @@ they can be specified independently.
 </p>
 <h3>Single-line mode</h3>
 <p>
-Single-line mode only affects how the '.' meta-character is interpreted.
+Single-line mode only affects how the '<code>.</code>' meta-character is interpreted.
 </p>
 <p>
-Default behaviour is that '.' matches any character except newline. 
-In single-line mode, '.' also matches newline.
+Default behaviour is that '<code>.</code>' matches any character except newline. 
+In single-line mode, '<code>.</code>' also matches newline.
 </p>
 
 <h3>Multi-line mode</h3>
 <p>
-Multi-line mode only affects how the meta-characters '^' and '$' are interpreted.
+Multi-line mode only affects how the meta-characters '<code>^</code>' and '<code>$</code>' are interpreted.
 </p>
 <p>
-Default behaviour is that '^' and '$' only match at the very beginning and end of the string. 
-When Multi-line mode is used, the '^' metacharacter matches at the beginning of every line,
-and the '$' metacharacter matches at the end of every line.</p>
+Default behaviour is that '<code>^</code>' and '<code>$</code>' only match at the very beginning and end of the string. 
+When Multi-line mode is used, the '<code>^</code>' metacharacter matches at the beginning of every line,
+and the '<code>$</code>' metacharacter matches at the end of every line.</p>
 
 </subsection>
 
 <subsection name="&sect-num;.4 Meta characters" anchor="meta_chars">
 <p>
 Regular expressions use certain characters as meta characters - these characters have a special meaning to the RE engine.
-Such characters must be escaped by preceeding them with \ (backslash) in order to treat them as ordinary characters.
+Such characters must be escaped by preceeding them with <code>\</code> (backslash) in order to treat them as ordinary characters.
 Here is a list of the meta characters and their meaning (please check the ORO documentation if in doubt).
 </p>
-<ul>
-<li>( ) - grouping</li>
-<li>[ ] - character classes</li>
-<li>{ } - repetition</li>
-<li>* + ? - repetition</li>
-<li>. - wild-card character</li>
-<li>\ - escape character</li>
-<li>| - alternatives</li>
-<li>^ $ - start and end of string or line</li>
-</ul>
+<dl>
+<dt><code>(</code> and <code>)</code></dt><dd>grouping</dd>
+<dt><code>[</code> and <code>]</code></dt><dd>character classes</dd>
+<dt><code>{</code> and <code>}</code></dt><dd>repetition</dd>
+<dt><code>*</code>, <code>+</code> and <code>?</code></dt><dd>repetition</dd>
+<dt><code>.</code></dt><dd>wild-card character</dd>
+<dt><code>\</code></dt><dd>escape character</dd>
+<dt><code>|</code></dt><dd>alternatives</dd>
+<dt><code>^</code> and <code>$</code></dt><dd>start and end of string or line</dd>
+</dl>
 <note>
-<p>Please note that ORO does not support the \Q and \E meta-characters.
+Please note that ORO does not support the <code>\Q</code> and <code>\E</code> meta-characters.
 [In other RE engines, these can be used to quote a portion of an RE so that the meta-characters stand for themselves.]
 You can use function  to do the equivalent, see <a href="functions.html#__escapeOroRegexpChars">${__escapeOroRegexpChars(valueToEscape)}</a>.
-</p>
 </note>
 <p>
 The following Perl5 extended regular expressions are supported by ORO.
 
 <dl>
-<dt>(?#text)</dt>
+<dt><code>(?#text)</code></dt>
 <dd>An embedded comment causing text to be ignored.</dd>
-<dt>(?:regexp)</dt>
-<dd>Groups things like "()" but doesn't cause the group match to be saved.</dd>
-<dt>(?=regexp)</dt>
-<dd>A zero-width positive lookahead assertion. For example, \w+(?=\s) matches a word followed by whitespace, without including whitespace in the MatchResult.</dd>
-<dt>(?!regexp)</dt>
-<dd>A zero-width negative lookahead assertion. For example foo(?!bar) matches any occurrence of "foo" that isn't followed by "bar". Remember that this is a zero-width assertion, which means that a(?!b)d will match ad because a is followed by a character that is not b (the d) and a d follows the zero-width assertion.</dd>
-<dt>(?imsx)</dt>
-<dd>One or more embedded pattern-match modifiers. i enables case insensitivity, m enables multiline treatment of the input, s enables single line treatment of the input, and x enables extended whitespace comments.</dd>
+<dt><code>(?:regexp)</code></dt>
+<dd>Groups things like "<code>()</code>" but doesn't cause the group match to be saved.</dd>
+<dt><code>(?=regexp)</code></dt>
+<dd>A zero-width positive lookahead assertion. For example, <code>\w+(?=\s)</code> matches a word followed by whitespace, without including whitespace in the MatchResult.</dd>
+<dt><code>(?!regexp)</code></dt>
+<dd>A zero-width negative lookahead assertion. For example <code>foo(?!bar)</code> matches any occurrence of "<code>foo</code>" that
+isn't followed by "<code>bar</code>". Remember that this is a zero-width assertion, which means that <code>a(?!b)d</code> will
+match <code>ad</code> because <code>a</code> is followed by a character that is not <code>b</code> (the <code>d</code>) and a <code>d</code>
+follows the zero-width assertion.</dd>
+<dt><code>(?imsx)</code></dt>
+<dd>One or more embedded pattern-match modifiers. <code>i</code> enables case insensitivity, <code>m</code> enables multiline treatment
+of the input, <code>s</code> enables single line treatment of the input, and <code>x</code> enables extended whitespace comments.</dd>
 </dl>
 <b>Note that <code>(?&lt;=regexp)</code> - lookbehind - is not supported.</b>
 </p>
@@ -218,15 +223,15 @@ Modifiers can be placed anywhere in the
 However they would have no effect there anyway.]
 </p>
 <p>
-The single-line (?s) and multi-line (?m) modifiers are normally placed at the start of the regex.
+The single-line <code>(?s)</code> and multi-line <code>(?m)</code> modifiers are normally placed at the start of the regex.
 </p>
 <p>
-The ignore-case modifier (?i) may be usefully applied to just part of a regex,
-for example:
-<pre>
+The ignore-case modifier <code>(?i)</code> may be usefully applied to just part of a regex,
+for example:</p>
+<source>
 Match ExAct case or (?i)ArBiTrARY(?-i) case
-</pre>
-</p>
+</source>
+would match <code>Match ExAct case or arbitrary case</code> as well as <code>Match ExAct case or ARBitrary case</code>, but not <code>Match exact case or ArBiTrARY case</code>.
 </subsection>
 </section>
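
For readers who want to try the examples from the marked-up page outside JMeter, here is a minimal sketch in Python. It is an assumption-laden illustration, not JMeter's implementation: it uses Python's re module rather than the ORO engine, so details differ. In particular, re.fullmatch backtracks into the second alternative of 'alp|.lp.*' and therefore succeeds on 'alphabet', whereas the page describes the matcher stopping at 'alp'; the helper first_match_covers_target below imitates the described behaviour instead. Python also does not accept a mid-pattern (?i)...(?-i), so the scoped form (?i:...) is substituted. The helper names and the HTML sample string are hypothetical.

```python
import re

# Hypothetical sample input, made up for this sketch.
HTML = '<input name="file" value="readme.txt"><input name="x" value="y">'


def contains(pattern, target):
    """'contains': the pattern matches at least some part of the target."""
    return re.search(pattern, target) is not None


def first_match_covers_target(pattern, target):
    """'matches' as described on the page: the first match found at the
    start of the target must span the whole target."""
    m = re.match(pattern, target)
    return m is not None and m.end() == len(target)


def extractor_variables(ref_name, pattern, text):
    """Hypothetical helper that imitates the variables listed for the
    template $1$$2$ (it is not the JMeter Regex Extractor itself)."""
    m = re.search(pattern, text)
    if m is None:
        return {}
    variables = {ref_name: "".join(m.groups())}  # template $1$$2$
    for i in range(m.re.groups + 1):  # group 0 is the whole match
        variables[f"{ref_name}_g{i}"] = m.group(i)
    return variables


if __name__ == "__main__":
    # contains vs matches
    print(contains("ph.b.", "alphabet"))                       # substring 'phabe'
    print(first_match_covers_target("al.*t", "alphabet"))
    print(first_match_covers_target("alp|.lp.*", "alphabet"))  # stops at 'alp'

    # greedy vs non-greedy vs negated character class
    print(re.search(r'name="file" value="(.+)">', HTML).group(1))
    print(re.search(r'name="file" value="(.+?)">', HTML).group(1))
    print(re.search(r'name="file" value="([^"]+)">', HTML).group(1))

    # groups saved as MYREF, MYREF_g0, MYREF_g1, MYREF_g2
    print(extractor_variables("MYREF", r'name="(.+?)" value="(.+?)"',
                              'name="file.name" value="readme.txt"'))

    # single-line (?s) and multi-line (?m) modes
    print(re.search(r"(?s)a.b", "a\nb") is not None)
    print(re.findall(r"(?m)^\w+$", "one\ntwo"))

    # ignore-case applied to just part of the regex (Python's scoped form)
    partial = r"Match ExAct case or (?i:ArBiTrARY) case"
    print(re.fullmatch(partial, "Match ExAct case or arbitrary case") is not None)
```

When the behaviour inside a test plan matters, the ORO documentation mentioned on the page remains the reference; this sketch only shows the general ideas.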
