Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Tika Wiki" for change 
notification.

The "TikaEval" page has been changed by TimothyAllison:
https://wiki.apache.org/tika/TikaEval?action=diff&rev1=14&rev2=15

   2.#2 If the other tool does not extract embedded content, you'd only want to 
look at the first metadata object (representing the container file) in the 
.json file:
      `java -jar tika-eval.X.Y.jar Compare -extractsA tika_1_14 -extractsB 
tika_1_15 -db comparisondb -alterExtract first_only`
  
+ === Min/Max Extract Size ===
+ If some extracts are too big to fit in memory, use 
`-maxExtractSize <maxBytes>`. To focus only on extracts that exceed a minimum 
length, use `-minExtractSize <minBytes>`.
+ 
+ 
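As a rough sketch of the two size options, the snippet below builds one command per option from the `Compare` example earlier on the page. The byte thresholds are hypothetical values chosen for illustration, and it assumes the size options are passed to `Compare` in the same way as `-alterExtract`; adjust the jar name and extract directories to your setup.

```shell
#!/bin/sh
# Hypothetical thresholds in bytes; choose values that suit your corpus and heap size.
MAX_BYTES=10000000
MIN_BYTES=1000

# Base command copied from the Compare example (X.Y is the version placeholder).
BASE="java -jar tika-eval.X.Y.jar Compare -extractsA tika_1_14 -extractsB tika_1_15 -db comparisondb"

# Variant 1: set an upper bound on extract size.
MAX_CMD="$BASE -maxExtractSize $MAX_BYTES"

# Variant 2: set a lower bound on extract size.
MIN_CMD="$BASE -minExtractSize $MIN_BYTES"

# Print the commands for review rather than running them directly.
echo "$MAX_CMD"
echo "$MIN_CMD"
```

Printing the commands first makes it easy to check the thresholds before starting a long comparison run.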
  == Reports ==
- The module tika-eval comes with a list of reports.  However, you might want 
to generate your own.  Each report is specified by sql and a few other 
configurations in an xml file.  See `comparison-reports.xml` and 
`profile-reports.xml` to get a sense of the syntax.
+ The module tika-eval comes with a list of reports.  However, you might want 
to generate your own.  Each report is specified by SQL and a few other 
configurations in an xml file.  See `comparison-reports.xml` and 
`profile-reports.xml` to get a sense of the syntax.
  
  To specify your own reports on the command line, use `-rf` (report file):
      `java -jar tika-eval.X.Y.jar Report -db comparisondb -rf myreports.xml`
