[
https://issues.apache.org/jira/browse/NUTCH-2919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17466691#comment-17466691
]
ASF GitHub Bot commented on NUTCH-2919:
---------------------------------------
lewismc commented on pull request #717:
URL: https://github.com/apache/nutch/pull/717#issuecomment-1002883095
I was getting a local failure in [parse-tika's
TestRTFParser](https://github.com/apache/nutch/blob/master/src/plugin/parse-tika/src/test/org/apache/nutch/parse/tika/TestRTFParser.java#L63),
which was still using the deprecated
[org.apache.tika.metadata.OfficeOpenXMLCore#SUBJECT](https://github.com/apache/tika/blob/main/tika-core/src/main/java/org/apache/tika/metadata/OfficeOpenXMLCore.java#L71-L77).
That property has been replaced by
[DublinCore#SUBJECT](https://github.com/apache/tika/blob/main/tika-core/src/main/java/org/apache/tika/metadata/DublinCore.java#L162-L170).
With that change, tests passed locally. Let's see how CI does.
Additionally, when inspecting the metadata extracted from the sample `.rtf`
file, I see the following. Should we augment the unit test to assert these
values?
```
Content-Length: 2235
Content-Type: application/rtf
X-TIKA:Parsed-By: org.apache.tika.parser.DefaultParser
X-TIKA:Parsed-By: org.apache.tika.parser.microsoft.rtf.RTFParser
X-TIKA:digest:MD5: 61d9f6cd7ebacf61737936f9341c2289
X-TIKA:digest:SHA256: 1aae10f9ae8fdfdfddae338dec7f4a40cf9fc7d0c254af32e742dbd227f9399b
dc:subject: tests
dc:title: test rft document
dcterms:created: 2004-09-21T02:36:00Z
resourceName: test.rtf
w:Comments: StarWriter
```
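As a rough illustration of what such assertions might cover, here is a minimal, stdlib-only sketch. It is not the actual `TestRTFParser` code: the class `RtfMetadataCheck` and its helpers `parseDump` and `mismatches` are hypothetical names, and a plain map stands in for the parse metadata. A real test would read the values from the parser output rather than from a text dump.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Nutch or Tika.
class RtfMetadataCheck {

  /** Parses "name: value" lines into a multi-valued map, splitting on the first ": ". */
  static Map<String, List<String>> parseDump(String dump) {
    Map<String, List<String>> meta = new LinkedHashMap<>();
    for (String line : dump.split("\n")) {
      int sep = line.indexOf(": ");
      if (sep < 0) {
        continue; // not a "name: value" line
      }
      String name = line.substring(0, sep).trim();
      String value = line.substring(sep + 2).trim();
      meta.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
    }
    return meta;
  }

  /** Returns one message per expected entry that is missing or has a different value. */
  static List<String> mismatches(Map<String, List<String>> meta,
                                 Map<String, String> expected) {
    List<String> problems = new ArrayList<>();
    for (Map.Entry<String, String> e : expected.entrySet()) {
      List<String> actual = meta.get(e.getKey());
      if (actual == null || !actual.contains(e.getValue())) {
        problems.add(e.getKey() + ": expected \"" + e.getValue()
            + "\" but found " + actual);
      }
    }
    return problems;
  }

  public static void main(String[] args) {
    // A subset of the metadata listed above.
    String dump = String.join("\n",
        "Content-Type: application/rtf",
        "dc:subject: tests",
        "dc:title: test rft document",
        "dcterms:created: 2004-09-21T02:36:00Z");

    // Values a strengthened test could assert on.
    Map<String, String> expected = new LinkedHashMap<>();
    expected.put("Content-Type", "application/rtf");
    expected.put("dc:subject", "tests");
    expected.put("dc:title", "test rft document");
    expected.put("dcterms:created", "2004-09-21T02:36:00Z");

    List<String> problems = mismatches(parseDump(dump), expected);
    if (problems.isEmpty()) {
      System.out.println("all expected metadata present");
    } else {
      System.out.println(problems);
    }
  }
}
```

Checking by metadata name (`dc:subject`, `dc:title`, `dcterms:created`) rather than through a specific Tika constant would probably have flagged the `OfficeOpenXMLCore#SUBJECT` deprecation as a plain value mismatch.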
> Upgrade to Tika 2.2.0
> ---------------------
>
> Key: NUTCH-2919
> URL: https://issues.apache.org/jira/browse/NUTCH-2919
> Project: Nutch
> Issue Type: Improvement
> Components: build
> Reporter: Lewis John McGibbney
> Assignee: Lewis John McGibbney
> Priority: Major
> Fix For: 1.19
>
>
> Tika 2.2.0 just released
> https://lists.apache.org/thread/rbnn1m02o38jkyfh14vjtslh11km26bb