This is an automated email from the ASF dual-hosted git repository.
hansva pushed a commit to branch release/2.11.0
in repository https://gitbox.apache.org/repos/asf/hop.git
The following commit(s) were added to refs/heads/release/2.11.0 by this push:
new ade3b5e38c [RELEASE] doc version and regexeval fix
ade3b5e38c is described below
commit ade3b5e38cb3c0aee00373a926b161d1ab98b2d9
Author: Hans Van Akelyen <[email protected]>
AuthorDate: Thu Dec 5 09:11:22 2024 +0100
[RELEASE] doc version and regexeval fix
---
docs/hop-dev-manual/antora.yml | 1 -
docs/hop-tech-manual/antora.yml | 1 -
docs/hop-user-manual/antora.yml | 2 --
.../ROOT/pages/pipeline/transforms/regexeval.adoc | 37 ++++++++++++++--------
4 files changed, 24 insertions(+), 17 deletions(-)
diff --git a/docs/hop-dev-manual/antora.yml b/docs/hop-dev-manual/antora.yml
index 4fd540ab67..3bad0debf3 100644
--- a/docs/hop-dev-manual/antora.yml
+++ b/docs/hop-dev-manual/antora.yml
@@ -18,6 +18,5 @@
name: dev-manual
title: Development Documentation
version: 2.11.0
-prerelease: true
nav:
- modules/ROOT/nav.adoc
diff --git a/docs/hop-tech-manual/antora.yml b/docs/hop-tech-manual/antora.yml
index 9c5c971b43..0aa7fd40cd 100644
--- a/docs/hop-tech-manual/antora.yml
+++ b/docs/hop-tech-manual/antora.yml
@@ -18,6 +18,5 @@
name: tech-manual
title: Technical Documentation
version: 2.11.0
-prerelease: true
nav:
- modules/ROOT/nav.adoc
diff --git a/docs/hop-user-manual/antora.yml b/docs/hop-user-manual/antora.yml
index e36f6878ad..7abfab6ff2 100644
--- a/docs/hop-user-manual/antora.yml
+++ b/docs/hop-user-manual/antora.yml
@@ -18,7 +18,5 @@
name: manual
title: User manual
version: 2.11.0
-prerelease: true
-display_version: 2.11.0 (pre-release)
nav:
- modules/ROOT/nav.adoc
diff --git a/docs/hop-user-manual/modules/ROOT/pages/pipeline/transforms/regexeval.adoc b/docs/hop-user-manual/modules/ROOT/pages/pipeline/transforms/regexeval.adoc
index aa87cac211..3090d50ae2 100644
--- a/docs/hop-user-manual/modules/ROOT/pages/pipeline/transforms/regexeval.adoc
+++ b/docs/hop-user-manual/modules/ROOT/pages/pipeline/transforms/regexeval.adoc
@@ -52,11 +52,16 @@ The primary usage for this transform is to check if an input field matches the g
The pattern is intended to match the entire input field, not just a part of it. For example, given the input:
-+++<pre>"Author, Ann" - 53 posts</pre>+++
+----
+"Author, Ann" - 53 posts
+----
a regular expression like `\d* posts` would give no match, even if a part of the input (`53 posts`) indeed matches with the pattern. To get an actual match, you need to add `.*` in the pattern:
-+++<pre>.*\d* posts</pre>+++
+[source,regexp]
+----
+.*\d* posts
+----
=== Capturing text
@@ -64,7 +69,10 @@ This transform can also capture parts of the input and store them in new fields
With the same input text as above, create a regular expression with two capture groups:
-+++<pre>^"([^"]*)" - (\d*) posts$</pre>+++
+[source,regexp]
+----
+^"([^"]*)" - (\d*) posts$
+----
The transform will capture the values `Author, Ann` and `53`, so you can create two new fields in your pipeline (i.e. one for the name, and one for the number of posts).
@@ -173,15 +181,15 @@ By default, these expressions only match at the beginning and the end of the ent
=== Sub-text matching
-As mentioned earlier, the pattern is intended to match the entire input field, i.e. when the supplied input _is_ the pattern.
+As mentioned earlier, the pattern is intended to match the entire input field, i.e. when the supplied input _is_ the pattern.
If you just need to test if your input _contains_ the pattern, you need to tweak your regular expression so that it matches the entire input field. You should also include the grouping operators (parentheses) to get the sub-text you intended to match, for example:
* Input data: `THIS IS A TITLE <PROCESSING_TAG>`
-* RegEx 1: `+++<.*>+++` -> returns no match, because the pattern doesn't match the entire input
-* RegEx 2: `+++.*(<.*>)+++` -> returns a match and you can capture the value `<PROCESSING_TAG>` with the grouping operators
+* RegEx 1: `<.*>` -> returns no match, because the pattern doesn't match the entire input
+* RegEx 2: `.*(<.*>)` -> returns a match and you can capture the value `<PROCESSING_TAG>` with the grouping operators
-As a consequence, you can consider the line delimiting operators `^` and `$` as implied in your regular expression: the examples above are equivalent to `+++^<.*>$+++` and `+++^.*(<.*>)$+++` respectively.
+As a consequence, you can consider the line delimiting operators `^` and `$` as implied in your regular expression: the examples above are equivalent to `^<.*>$` and `^.*(<.*>)$` respectively.
=== Nested capture groups
@@ -189,19 +197,22 @@ Suppose your input field contains a text value like `"Author, Ann" - 53 posts.`
The following regular expression creates four capturing groups and can be used to parse out the different parts:
-+++<pre>^"(([^"]+), ([^"])+)" - (\d+) posts\.$</pre>+++
+[source,regexp]
+----
+^"(([^"]+), ([^"])+)" - (\d+) posts\.$
+----
This expression creates the following four capturing groups, which become output fields:
[options="header"]
|===
|Field name|RegEx segment|Value
-|Fullname|`+++(([^"]+), ([^"]+))+++`|`Author, Ann`
-|Lastname|`+++([^"]+)+++` (first occurrence)|`Author`
-|Firstname|`+++([^"]+)+++` (second occurrence)|`Ann`
-|Number of posts|`+++(\d+)+++`|`53`
+|Fullname|`(([^"]+), ([^"]+))`|`Author, Ann`
+|Lastname|`([^"]+)` (first occurrence)|`Author`
+|Firstname|`([^"]+)` (second occurrence)|`Ann`
+|Number of posts|`(\d+)`|`53`
|===
In this example, a field definition must be present for each of these capturing groups.
-If the number of capture groups in the regular expression does not match the number of fields specified, the transform will fail and an error is written to the log.
+If the number of capture groups in the regular expression does not match the number of fields specified, the transform will fail and an error is written to the log.
\ No newline at end of file
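
The whole-field matching rules described in the regexeval.adoc hunks above can be checked outside Hop with a short sketch. This is not the transform's implementation: Python's `re.fullmatch` is used here as a stand-in for "the pattern must match the entire input field", and the helper names (`whole_field_match`, `capture_groups`, `check_field_count`) are hypothetical. The nested example uses the `([^"]+)` segment form listed in the table of capture groups.

```python
import re

SAMPLE = '"Author, Ann" - 53 posts'
TAGGED = 'THIS IS A TITLE <PROCESSING_TAG>'


def whole_field_match(pattern: str, text: str) -> bool:
    """Return True only if the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None


def capture_groups(pattern: str, text: str):
    """Return the captured groups as a list, or None if there is no whole-field match."""
    match = re.fullmatch(pattern, text)
    return list(match.groups()) if match else None


def check_field_count(pattern: str, field_names: list) -> None:
    """Hypothetical check mirroring the documented rule: one field per capture group."""
    group_count = re.compile(pattern).groups
    if group_count != len(field_names):
        raise ValueError(
            f"{group_count} capture groups but {len(field_names)} fields defined"
        )


# A partial pattern does not match the whole field; prefixing `.*` does.
print(whole_field_match(r'\d* posts', SAMPLE))
print(whole_field_match(r'.*\d* posts', SAMPLE))

# Two capture groups: the name and the number of posts.
print(capture_groups(r'^"([^"]*)" - (\d*) posts$', SAMPLE))

# Sub-text matching: wrap the wanted part in a group and cover the rest with `.*`.
print(capture_groups(r'<.*>', TAGGED))
print(capture_groups(r'.*(<.*>)', TAGGED))

# Nested capture groups on the input with a trailing period.
NESTED = r'^"(([^"]+), ([^"]+))" - (\d+) posts\.$'
fields = ["Fullname", "Lastname", "Firstname", "Number of posts"]
check_field_count(NESTED, fields)
print(dict(zip(fields, capture_groups(NESTED, SAMPLE + '.'))))
```

Note that the position of the quantifier matters for the third group: written as `([^"])+`, the group is repeated once per character and keeps only the last one (`n`), whereas `([^"]+)` captures `Ann` as shown in the table.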