[ https://issues.apache.org/jira/browse/TIKA-4133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17765829#comment-17765829 ]
Hudson commented on TIKA-4133:
------------------------------
SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk11 #1249 (See
[https://ci-builds.apache.org/job/Tika/job/tika-main-jdk11/1249/])
TIKA-4133 -- add a capture group metadatafilter (#1346) (github:
[https://github.com/apache/tika/commit/aeb637b5761f65514bc3c0ede3f1f893ba7f14ff])
* (add) tika-core/src/test/resources/org/apache/tika/config/TIKA-4133-capture-group-overwrite.xml
* (edit) tika-core/src/test/java/org/apache/tika/metadata/filter/TestMetadataFilter.java
* (add) tika-core/src/test/resources/org/apache/tika/config/TIKA-4133-capture-group.xml
* (add) tika-core/src/main/java/org/apache/tika/metadata/filter/CaptureGroupMetadataFilter.java
> Add capture group metadataFilter
> --------------------------------
>
> Key: TIKA-4133
> URL: https://issues.apache.org/jira/browse/TIKA-4133
> Project: Tika
> Issue Type: Task
> Reporter: Tim Allison
> Priority: Trivial
>
> There are some cases where it would be useful to run a regex to capture
> specific values in a metadata object.
> For example, some users might not want the mime attributes (e.g. charset) as
> in "text/html; charset=UTF-8".
> Let's start with a simple regex capture group filter. If we need to capture
> multiple matches, etc., we can add that in a later ticket.
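
For illustration only, the idea in the description above (keep just the
captured part of a metadata value, such as the mime type without its
attributes) could be sketched with plain java.util.regex. This is not the
code from the commit and does not use Tika's API: the class name
CaptureGroupSketch, the helper captureFirstGroup, and the pattern itself are
hypothetical, chosen only to show how a single capture group would behave on
the "text/html; charset=UTF-8" example.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; not the CaptureGroupMetadataFilter from the commit.
public class CaptureGroupSketch {

    // Assumed pattern: group 1 is everything before the first ';'.
    static final Pattern MIME_WITHOUT_ATTRIBUTES = Pattern.compile("^([^;]+)");

    // Returns the trimmed text of capture group 1 from the first match,
    // or an empty Optional when the value is null or nothing matches.
    static Optional<String> captureFirstGroup(Pattern pattern, String value) {
        if (value == null) {
            return Optional.empty();
        }
        Matcher matcher = pattern.matcher(value);
        if (matcher.find() && matcher.groupCount() >= 1 && matcher.group(1) != null) {
            return Optional.of(matcher.group(1).trim());
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String contentType = "text/html; charset=UTF-8";
        // Fall back to the original value if the regex does not match.
        String captured = captureFirstGroup(MIME_WITHOUT_ATTRIBUTES, contentType)
                .orElse(contentType);
        System.out.println(captured); // prints text/html
    }
}
```

Only the first match is considered here, in line with the "simple" scope in
the description; handling multiple matches is left out, as the ticket defers
it.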
--
This message was sent by Atlassian Jira
(v8.20.10#820010)