[ https://issues.apache.org/jira/browse/TIKA-1334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312155#comment-14312155 ]
Tim Allison commented on TIKA-1334:
-----------------------------------
No, not pretty. This is a first step. The code for tika-eval is still very
rough around the edges, but it is on my GitHub site under branch TIKA-1302.
In my current design, tika-eval will have three primary chunks of code:
* A profiler that will run through a directory of output, populate a
database, and generate reports similar to the attached, but for a single run.
* A directory comparison tool that will run through a pair of directories and
compare them file by file. This will generate static reports
similar to the attached, and it will also populate a database that we can use
in an interactive UI.
* Some kind of interactive UI that will allow users to drill down and view
reports, summary statistics, output diffs, and source files.
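To make the file-by-file comparison in the second item a bit more concrete,
here is a rough Java sketch. It is not the tika-eval code: the class name
DirPairCompare, the PairResult fields, and the choice to compare raw bytes are
all assumptions for illustration. It pairs files by relative path and emits one
row per pair, which is roughly the shape a report or database table could be
fed from.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch: pair up the files of two output directories by
// relative path and record one comparison row per pair.
class DirPairCompare {

    // One comparison row. The fields are an assumed schema, not tika-eval's.
    static final class PairResult {
        final String relativePath;
        final long lengthA;      // -1 if the file is missing from directory A
        final long lengthB;      // -1 if the file is missing from directory B
        final boolean identical; // true only if both exist with equal bytes

        PairResult(String relativePath, long lengthA, long lengthB, boolean identical) {
            this.relativePath = relativePath;
            this.lengthA = lengthA;
            this.lengthB = lengthB;
            this.identical = identical;
        }
    }

    // Collect the relative paths of all regular files under root, sorted.
    static Set<String> relativeFiles(Path root) throws IOException {
        try (Stream<Path> walk = Files.walk(root)) {
            return walk.filter(Files::isRegularFile)
                       .map(p -> root.relativize(p).toString())
                       .collect(Collectors.toCollection(TreeSet::new));
        }
    }

    // Compare every relative path that appears in either directory.
    static List<PairResult> compare(Path dirA, Path dirB) throws IOException {
        Set<String> all = new TreeSet<>(relativeFiles(dirA));
        all.addAll(relativeFiles(dirB));

        List<PairResult> results = new ArrayList<>();
        for (String rel : all) {
            Path a = dirA.resolve(rel);
            Path b = dirB.resolve(rel);
            long lenA = Files.isRegularFile(a) ? Files.size(a) : -1L;
            long lenB = Files.isRegularFile(b) ? Files.size(b) : -1L;
            // Reading whole files is fine for a sketch; a real tool would stream.
            boolean identical = lenA >= 0 && lenA == lenB
                    && Arrays.equals(Files.readAllBytes(a), Files.readAllBytes(b));
            results.add(new PairResult(rel, lenA, lenB, identical));
        }
        return results;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java DirPairCompare.java <dirA> <dirB>
        for (PairResult r : compare(Paths.get(args[0]), Paths.get(args[1]))) {
            System.out.println(r.relativePath + "\t" + r.lengthA + "\t"
                    + r.lengthB + "\t" + r.identical);
        }
    }
}
```

A real comparison would presumably look at extracted text and metadata rather
than raw bytes, and would write the rows to the database instead of stdout.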
I just transitioned to H2 for the db, and I was quite impressed with the fairly
flat memory consumption even at 1M files.
> Add presentation layer for results of each run
> ----------------------------------------------
>
> Key: TIKA-1334
> URL: https://issues.apache.org/jira/browse/TIKA-1334
> Project: Tika
> Issue Type: Sub-task
> Components: cli, general, server
> Reporter: Tim Allison
> Attachments: static_stats.zip
>
>
> If I'm doing this, it'll probably be vintage mid-90s html. If someone with
> some .js kung-fu wants to take this, please do.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)