[
https://issues.apache.org/jira/browse/TIKA-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17830079#comment-17830079
]
Tilman Hausherr commented on TIKA-4218:
---------------------------------------
The word "party" appears 36 times in the JSON file, 18 times in my text
extraction, but 62 times in the TOP_N_TOKENS_A row of the CSV file. The count
is doubled in the JSON file because of "xfa_content", but I don't understand
the "62".
Thanks for mentioning the new list (I probably missed it); I'll adjust my
scripts and use them next time.
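
For anyone who wants to cross-check counts like these, here is a minimal
sketch, not the script actually used for the numbers above. It assumes the JSON
file parses into nested lists and objects with string values, and it counts
case-insensitive whole-word matches per key, so a duplicated field such as
"xfa_content" shows up as its own tally. The helper names and the sample data
are made up for illustration, and the sketch does not attempt to reproduce how
the TOP_N_TOKENS_A value is computed.

```python
import json
import re
from collections import Counter


def count_word(text, word):
    """Count case-insensitive whole-word occurrences of `word` in `text`."""
    pattern = rf"\b{re.escape(word)}\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def count_word_per_key(node, word, counts=None):
    """Walk parsed JSON and tally occurrences of `word` per object key.

    Only string values stored directly under a key are counted; nested
    objects and lists are walked recursively.
    """
    if counts is None:
        counts = Counter()
    if isinstance(node, dict):
        for key, value in node.items():
            if isinstance(value, str):
                counts[key] += count_word(value, word)
            else:
                count_word_per_key(value, word, counts)
    elif isinstance(node, list):
        for item in node:
            count_word_per_key(item, word, counts)
    return counts


# Made-up sample: the same text stored under two keys doubles the total.
sample = json.loads(
    '[{"X-TIKA:content": "The party of the first part. Party on.",'
    ' "xfa_content": "The party of the first part. Party on."}]'
)
per_key = count_word_per_key(sample, "party")
for key, n in per_key.items():
    print(f"{key}: {n}")
print("total:", sum(per_key.values()))
```

Comparing the per-key tallies against the plain-text count makes it easier to
see which fields contribute duplicates before looking at the CSV figure.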
> Run regression tests to support 2.9.2 release
> ---------------------------------------------
>
> Key: TIKA-4218
> URL: https://issues.apache.org/jira/browse/TIKA-4218
> Project: Tika
> Issue Type: Task
> Reporter: Tim Allison
> Priority: Major
>