The "TikaEval" page has been changed by TimothyAllison:
https://wiki.apache.org/tika/TikaEval?action=diff&rev1=14&rev2=15

  2.#2 If the other tool does not extract embedded content, you'd only want to look at the first metadata object (representing the container file) in the .json file: `java -jar tika-eval.X.Y.jar Compare -extractsA tika_1_14 -extractsB tika_1_15 -db comparisondb -alterExtract first_only`
+ === Min/Max Extract Size ===
+ You may find that some extracts are too big to fit in memory, in which case use `-maxExtractSize <maxBytes>`, or you may want to focus only on extracts that are greater than a minimum length: `-minExtractSize <minBytes>`.
+
+ == Reports ==
- The module tika-eval comes with a list of reports. However, you might want to generate your own. Each report is specified by sql and a few other configurations in an xml file. See `comparison-reports.xml` and `profile-reports.xml` to get a sense of the syntax.
+ The module tika-eval comes with a list of reports. However, you might want to generate your own. Each report is specified by SQL and a few other configurations in an xml file. See `comparison-reports.xml` and `profile-reports.xml` to get a sense of the syntax.
  To specify your own reports on the commandline, use `-rf` (report file): `java -jar tika-eval.X.Y.jar Report -db comparisondb -rf myreports.xml`
