[ 
https://issues.apache.org/jira/browse/TIKA-4133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17765829#comment-17765829
 ] 

Hudson commented on TIKA-4133:
------------------------------

SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk11 #1249 (See 
[https://ci-builds.apache.org/job/Tika/job/tika-main-jdk11/1249/])
TIKA-4133 -- add a capture group metadatafilter (#1346) (github: 
[https://github.com/apache/tika/commit/aeb637b5761f65514bc3c0ede3f1f893ba7f14ff])
* (add) tika-core/src/test/resources/org/apache/tika/config/TIKA-4133-capture-group-overwrite.xml
* (edit) tika-core/src/test/java/org/apache/tika/metadata/filter/TestMetadataFilter.java
* (add) tika-core/src/test/resources/org/apache/tika/config/TIKA-4133-capture-group.xml
* (add) tika-core/src/main/java/org/apache/tika/metadata/filter/CaptureGroupMetadataFilter.java


> Add capture group metadataFilter
> --------------------------------
>
>                 Key: TIKA-4133
>                 URL: https://issues.apache.org/jira/browse/TIKA-4133
>             Project: Tika
>          Issue Type: Task
>            Reporter: Tim Allison
>            Priority: Trivial
>
> There are some cases where it would be useful to run a regex to capture 
> specific values in a metadata object.
> For example, some users might not want the mime attributes (e.g. charset) as 
> in "text/html; charset=UTF-8".
> Let's start with a simple regex capture group filter.  If we need to capture 
> multiple matches, etc., we can add that in a later ticket.
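
To make the charset example above concrete, here is a minimal sketch of the idea using only java.util.regex. It is not the CaptureGroupMetadataFilter added in the commit; the class name, the MIME_BASE pattern, and the choice to return the original value when nothing matches are all assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Tika: shows the capture-group idea only.
final class CaptureGroupSketch {

    // Assumed pattern: capture everything before the first ';',
    // i.e. the base MIME type without attributes such as charset.
    static final Pattern MIME_BASE = Pattern.compile("^([^;]+)");

    // Returns the trimmed contents of capture group 1 if the pattern
    // matches; otherwise returns the value unchanged (an assumption of
    // this sketch, not documented Tika behavior).
    static String capture(Pattern pattern, String value) {
        Matcher m = pattern.matcher(value);
        if (m.find() && m.groupCount() >= 1 && m.group(1) != null) {
            return m.group(1).trim();
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(capture(MIME_BASE, "text/html; charset=UTF-8"));
    }
}
```

Only the first match is used here, in line with the ticket's scope of a simple single-capture filter; handling multiple matches would need a loop over m.find().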



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
