Author: fschumacher
Date: Sun Mar 22 18:37:12 2015
New Revision: 1668433
URL: http://svn.apache.org/r1668433
Log:
Markup code fragments
Modified:
jmeter/trunk/xdocs/usermanual/regular_expressions.xml
Modified: jmeter/trunk/xdocs/usermanual/regular_expressions.xml
URL:
http://svn.apache.org/viewvc/jmeter/trunk/xdocs/usermanual/regular_expressions.xml?rev=1668433&r1=1668432&r2=1668433&view=diff
==============================================================================
--- jmeter/trunk/xdocs/usermanual/regular_expressions.xml (original)
+++ jmeter/trunk/xdocs/usermanual/regular_expressions.xml Sun Mar 22 18:37:12 2015
@@ -42,34 +42,36 @@ There is also documentation on an older
</p>
<p>
The pattern matching is very similar to the pattern matching in Perl.
-A full installation of Perl will include plenty of documentation on regular expressions - look for perlrequick, perlretut, perlre, perlreref.
+A full installation of Perl will include plenty of documentation on regular expressions - look for <code>perlrequick</code>,
+<code>perlretut</code>, <code>perlre</code> and <code>perlreref</code>.
</p>
<p>
-It is worth stressing the difference between "contains" and "matches", as used on the Response Assertion test element:
+It is worth stressing the difference between "<em>contains</em>" and "<em>matches</em>", as used on the Response Assertion test element:
</p>
-<ul>
-<li>
-"contains" means that the regular expression matched at least some part of the target,
-so 'alphabet' "contains" 'ph.b.' because the regular expression matches the substring 'phabe'.
-</li>
-<li>
-"matches" means that the regular expression matched the whole target.
-So 'alphabet' is "matched" by 'al.*t'.
-</li>
-</ul>
-<p>In this case, it is equivalent to wrapping the regular expression in ^ and $, viz '^al.*t$'.
+<dl>
+<dt>"<em>contains</em>"</dt><dd> means that the regular expression matched at least some part of the target,
+so '<code>alphabet</code>' "<em>contains</em>" '<code>ph.b.</code>' because the regular expression matches the substring '<code>phabe</code>'.
+</dd>
+<dt>
+"<em>matches</em>"</dt><dd> means that the regular expression matched the whole target.
+So '<code>alphabet</code>' is "<em>matched</em>" by '<code>al.*t</code>'.
+</dd>
+</dl>
+<p>In this case, it is equivalent to wrapping the regular expression in <code>^</code> and <code>$</code>, viz '<code>^al.*t$</code>'.
</p>
<p>However, this is not always the case.
-For example, the regular expression 'alp|.lp.*' is "contained" in 'alphabet', but does not match 'alphabet'.
-</p>
-<p>Why? Because when the pattern matcher finds the sequence 'alp' in 'alphabet', it stops trying any other combinations - and 'alp' is not the same as 'alphabet', as it does not include 'habet'.
+For example, the regular expression '<code>alp|.lp.*</code>' is "<em>contained</em>" in '<code>alphabet</code>',
+but does not "<em>match</em>" '<code>alphabet</code>'.
</p>
-<p>
-Note: unlike Perl, there is no need to (i.e. do not) enclose the regular expression in //.
+<p>Why? Because when the pattern matcher finds the sequence '<code>alp</code>' in '<code>alphabet</code>', it stops trying any other
+combinations - and '<code>alp</code>' is not the same as '<code>alphabet</code>', as it does not include '<code>habet</code>'.
</p>
+<note>
+Unlike Perl, there is no need to (i.e. do not) enclose the regular expression in <code>//</code>.
+</note>
<p>
-So how does one use the modifiers ismx etc if there is no trailing /?
-The solution is to use <i>extended regular expressions</i>, i.e. /abc/i becomes (?i)abc.
+So how does one use the modifiers <code>ismx</code> etc if there is no trailing <code>/</code>?
+The solution is to use <i>extended regular expressions</i>, i.e. <code>/abc/i</code> becomes <code>(?i)abc</code>.
See also <a href="#placement">Placement of modifiers</a> below.
</p>
</subsection>
@@ -88,14 +90,14 @@ A suitable regular expression would be:
<p>
The special characters above are:
</p>
-<ul>
-<li>( and ) - these enclose the portion of the match string to be returned</li>
-<li>. - match any character</li>
-<li>+ - one or more times</li>
-<li>? - don't be greedy, i.e. stop when first match succeeds</li>
-</ul>
+<dl>
+<dt><code>(</code> and <code>)</code></dt><dd>these enclose the portion of the match string to be returned</dd>
+<dt><code>.</code></dt><dd>match any character</dd>
+<dt><code>+</code></dt><dd>one or more times</dd>
+<dt><code>?</code></dt><dd>don't be greedy, i.e. stop when first match succeeds</dd>
+</dl>
<p>
-Note: without the ?, the .+ would continue past the first <code>"></code>
+Note: without the <code>?</code>, the <code>.+</code> would continue past the first <code>"></code>
until it found the last possible <code>"></code> - which is probably not what was intended.
</p>
<p>
@@ -103,7 +105,7 @@ Note: although the above expression work
<br/>
<code>name="file" value="([^"]+)"></code>
where<br></br>
-[^"] - means match anything except "<br></br>
+<code>[^"]</code> - means match anything except <code>"</code><br></br>
In this case, the matching engine can stop looking as soon as it sees the first <code>"</code>,
whereas in the previous case the engine has to check that it has found <code>"></code> rather than say <code>" ></code>.
</p>
@@ -117,7 +119,7 @@ A suitable regular expression would be:
<br/>
<code>name="([^"]+)" value="([^"]+)"</code>
<br/>
-This would create 2 groups, which could be used in the JMeter Regular Expression Extractor template as $1$ and $2$.
+This would create 2 groups, which could be used in the JMeter Regular Expression Extractor template as <code>$1$</code> and <code>$2$</code>.
</p>
<p>
The JMeter Regex Extractor saves the values of the groups in additional variables.
@@ -126,21 +128,21 @@ The JMeter Regex Extractor saves the val
For example, assume:
</p>
<ul>
-<li>Reference Name: MYREF</li>
-<li>Regex: name="(.+?)" value="(.+?)"</li>
-<li>Template: $1$$2$</li>
+<li>Reference Name: <code>MYREF</code></li>
+<li>Regex: <code>name="(.+?)" value="(.+?)"</code></li>
+<li>Template: <code>$1$$2$</code></li>
</ul>
-<note>Do not enclose the regular expression in / /</note>
+<note>Do not enclose the regular expression in <code>/ /</code></note>
<p>
The following variables would be set:
</p>
-<ul>
-<li>MYREF: file.namereadme.txt</li>
-<li>MYREF_g0: name="file.name" value="readme.txt"</li>
-<li>MYREF_g1: file.name</li>
-<li>MYREF_g2: readme.txt</li>
-</ul>
-These variables can be referred to later on in the JMeter test plan, as ${MYREF}, ${MYREF_g1} etc
+<dl>
+<dt><code>MYREF</code></dt><dd><code>file.namereadme.txt</code></dd>
+<dt><code>MYREF_g0</code></dt><dd><code>name="file.name" value="readme.txt"</code></dd>
+<dt><code>MYREF_g1</code></dt><dd><code>file.name</code></dd>
+<dt><code>MYREF_g2</code></dt><dd><code>readme.txt</code></dd>
+</dl>
+These variables can be referred to later on in the JMeter test plan, as <code>${MYREF}</code>, <code>${MYREF_g1}</code> etc.
</p>
</subsection>
<subsection name="§-num;.3 Line mode" anchor="line_mode">
@@ -151,60 +153,63 @@ they can be specified independently.
</p>
<h3>Single-line mode</h3>
<p>
-Single-line mode only affects how the '.' meta-character is interpreted.
+Single-line mode only affects how the '<code>.</code>' meta-character is interpreted.
</p>
<p>
-Default behaviour is that '.' matches any character except newline.
-In single-line mode, '.' also matches newline.
+Default behaviour is that '<code>.</code>' matches any character except newline.
+In single-line mode, '<code>.</code>' also matches newline.
</p>
<h3>Multi-line mode</h3>
<p>
-Multi-line mode only affects how the meta-characters '^' and '$' are interpreted.
+Multi-line mode only affects how the meta-characters '<code>^</code>' and '<code>$</code>' are interpreted.
</p>
<p>
-Default behaviour is that '^' and '$' only match at the very beginning and end of the string.
-When Multi-line mode is used, the '^' metacharacter matches at the beginning of every line,
-and the '$' metacharacter matches at the end of every line.</p>
+Default behaviour is that '<code>^</code>' and '<code>$</code>' only match at the very beginning and end of the string.
+When Multi-line mode is used, the '<code>^</code>' metacharacter matches at the beginning of every line,
+and the '<code>$</code>' metacharacter matches at the end of every line.</p>
</subsection>
<subsection name="§-num;.4 Meta characters" anchor="meta_chars">
<p>
Regular expressions use certain characters as meta characters - these characters have a special meaning to the RE engine.
-Such characters must be escaped by preceeding them with \ (backslash) in order to treat them as ordinary characters.
+Such characters must be escaped by preceding them with <code>\</code> (backslash) in order to treat them as ordinary characters.
Here is a list of the meta characters and their meaning (please check the ORO documentation if in doubt).
</p>
-<ul>
-<li>( ) - grouping</li>
-<li>[ ] - character classes</li>
-<li>{ } - repetition</li>
-<li>* + ? - repetition</li>
-<li>. - wild-card character</li>
-<li>\ - escape character</li>
-<li>| - alternatives</li>
-<li>^ $ - start and end of string or line</li>
-</ul>
+<dl>
+<dt><code>(</code> and <code>)</code></dt><dd>grouping</dd>
+<dt><code>[</code> and <code>]</code></dt><dd>character classes</dd>
+<dt><code>{</code> and <code>}</code></dt><dd>repetition</dd>
+<dt><code>*</code>, <code>+</code> and <code>?</code></dt><dd>repetition</dd>
+<dt><code>.</code></dt><dd>wild-card character</dd>
+<dt><code>\</code></dt><dd>escape character</dd>
+<dt><code>|</code></dt><dd>alternatives</dd>
+<dt><code>^</code> and <code>$</code></dt><dd>start and end of string or line</dd>
+</dl>
<note>
-<p>Please note that ORO does not support the \Q and \E meta-characters.
+Please note that ORO does not support the <code>\Q</code> and <code>\E</code> meta-characters.
[In other RE engines, these can be used to quote a portion of an RE so that the meta-characters stand for themselves.]
You can use function to do the equivalent, see <a href="functions.html#__escapeOroRegexpChars">${__escapeOroRegexpChars(valueToEscape)}</a>.
-</p>
</note>
<p>
The following Perl5 extended regular expressions are supported by ORO.
<dl>
-<dt>(?#text)</dt>
+<dt><code>(?#text)</code></dt>
<dd>An embedded comment causing text to be ignored.</dd>
-<dt>(?:regexp)</dt>
-<dd>Groups things like "()" but doesn't cause the group match to be saved.</dd>
-<dt>(?=regexp)</dt>
-<dd>A zero-width positive lookahead assertion. For example, \w+(?=\s) matches a word followed by whitespace, without including whitespace in the MatchResult.</dd>
-<dt>(?!regexp)</dt>
-<dd>A zero-width negative lookahead assertion. For example foo(?!bar) matches any occurrence of "foo" that isn't followed by "bar". Remember that this is a zero-width assertion, which means that a(?!b)d will match ad because a is followed by a character that is not b (the d) and a d follows the zero-width assertion.</dd>
-<dt>(?imsx)</dt>
-<dd>One or more embedded pattern-match modifiers. i enables case insensitivity, m enables multiline treatment of the input, s enables single line treatment of the input, and x enables extended whitespace comments.</dd>
+<dt><code>(?:regexp)</code></dt>
+<dd>Groups things like "<code>()</code>" but doesn't cause the group match to be saved.</dd>
+<dt><code>(?=regexp)</code></dt>
+<dd>A zero-width positive lookahead assertion. For example, <code>\w+(?=\s)</code> matches a word followed by whitespace, without including whitespace in the MatchResult.</dd>
+<dt><code>(?!regexp)</code></dt>
+<dd>A zero-width negative lookahead assertion. For example <code>foo(?!bar)</code> matches any occurrence of "<code>foo</code>" that
+isn't followed by "<code>bar</code>". Remember that this is a zero-width assertion, which means that <code>a(?!b)d</code> will
+match <code>ad</code> because <code>a</code> is followed by a character that is not <code>b</code> (the <code>d</code>) and a <code>d</code>
+follows the zero-width assertion.</dd>
+<dt><code>(?imsx)</code></dt>
+<dd>One or more embedded pattern-match modifiers. <code>i</code> enables case insensitivity, <code>m</code> enables multiline treatment
+of the input, <code>s</code> enables single line treatment of the input, and <code>x</code> enables extended whitespace comments.</dd>
</dl>
<b>Note that <code>(?<=regexp)</code> - lookbehind - is not supported.</b>
</p>
@@ -218,15 +223,15 @@ Modifiers can be placed anywhere in the
However they would have no effect there anyway.]
</p>
<p>
-The single-line (?s) and multi-line (?m) modifiers are normally placed at the start of the regex.
+The single-line <code>(?s)</code> and multi-line <code>(?m)</code> modifiers are normally placed at the start of the regex.
</p>
<p>
-The ignore-case modifier (?i) may be usefully applied to just part of a regex,
-for example:
-<pre>
+The ignore-case modifier <code>(?i)</code> may be usefully applied to just part of a regex,
+for example:</p>
+<source>
Match ExAct case or (?i)ArBiTrARY(?-i) case
-</pre>
-</p>
+</source>
+would match <code>Match ExAct case or arbitrary case</code> as well as <code>Match ExAct case or ARBitrary case</code>, but not <code>Match exact case or ArBiTrARY case</code>.
</subsection>
</section>
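
Illustrative note, not part of the commit: the "contains" / "matches" distinction and the MYREF variables described in the page above can be tried out with a small sketch. It uses Python's re module rather than ORO, so it only approximates JMeter's behaviour. The helper names (contains, matches, extract) and the sample input line are assumptions made for this illustration. In particular, matches() emulates the documented behaviour by accepting the first match the engine finds at the start of the target and then checking whether it covers the whole target.

```python
import re


def contains(pattern, target):
    """True if the pattern matches at least some part of the target."""
    return re.search(pattern, target) is not None


def matches(pattern, target):
    """Rough emulation of "matches" as described in the manual page:
    take the first match found at the start of the target and require
    it to span the whole target (no retry with other alternatives)."""
    m = re.match(pattern, target)
    return m is not None and m.end() == len(target)


def extract(reference, regex, template, text):
    """Hypothetical helper: build the variables the manual lists for a
    Regular Expression Extractor (REF, REF_g0, REF_g1, ...)."""
    m = re.search(regex, text)
    if m is None:
        return {}
    # One REF_gN entry per group; group 0 is the whole match.
    variables = {
        "%s_g%d" % (reference, i): m.group(i)
        for i in range(len(m.groups()) + 1)
    }
    # Expand $n$ placeholders in the template with the matching group.
    variables[reference] = re.sub(
        r"\$(\d+)\$", lambda p: m.group(int(p.group(1))), template
    )
    return variables


if __name__ == "__main__":
    print(contains("ph.b.", "alphabet"))      # substring 'phabe' is found
    print(matches("al.*t", "alphabet"))       # whole target is covered
    print(contains("alp|.lp.*", "alphabet"))  # 'alp' is found
    print(matches("alp|.lp.*", "alphabet"))   # first hit 'alp' is too short

    # Assumed sample input; only the name/value part comes from the manual.
    sample = '<input type="hidden" name="file.name" value="readme.txt">'
    for key, value in sorted(
        extract("MYREF", 'name="(.+?)" value="(.+?)"', "$1$$2$", sample).items()
    ):
        print(key, "=", value)
```

The match-then-compare-length approach in matches() is a design choice for the illustration only: it mirrors the explanation that the matcher stops once it has found 'alp'. Whole-string matching functions in other engines may backtrack into the second alternative and behave differently.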