Tim Allison created TIKA-2906:
---------------------------------

             Summary: Modularize tika-eval's language stats from the application
                 Key: TIKA-2906
                 URL: https://issues.apache.org/jira/browse/TIKA-2906
             Project: Tika
          Issue Type: Task
            Reporter: Tim Allison
            Assignee: Tim Allison


Tika-eval's language stats are tightly coupled to the application and to its
initial workflow: running against a directory of extracts and writing the
results to an H2 database.

It would be helpful for large-scale data processing pipelines to modularize
some of tika-eval's stats so that they can be applied to, e.g., a full Solr/ES
cluster.  We won't build the actual connectors to Solr/ES/other in this ticket,
but we will make it easier for integrators to build their own.
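
As a rough illustration of the kind of decoupling meant here, the sketch below separates "compute stats from a string" from "store the stats somewhere". Every name in it (TextStatsSketch, TextStatsCalculator, StatsSink, TokenCountCalculator) is hypothetical and is not an existing tika-eval class. The token counting is a toy stand-in for the real language stats.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch only: none of these names come from tika-eval.
final class TextStatsSketch {

    /** Storage-agnostic calculator: text in, named numeric stats out. */
    public interface TextStatsCalculator {
        Map<String, Double> calculate(String text);
    }

    /** Connector side: an integrator could back this with Solr, ES, H2, etc. */
    public interface StatsSink {
        void accept(String docId, Map<String, Double> stats);
    }

    /** Toy calculator: counts whitespace-delimited tokens and distinct tokens. */
    public static class TokenCountCalculator implements TextStatsCalculator {
        @Override
        public Map<String, Double> calculate(String text) {
            Map<String, Integer> counts = new HashMap<>();
            for (String token : text.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
                if (!token.isEmpty()) {   // split() on an empty string yields [""]
                    counts.merge(token, 1, Integer::sum);
                }
            }
            double total = 0;
            for (int c : counts.values()) {
                total += c;
            }
            Map<String, Double> stats = new HashMap<>();
            stats.put("tokens", total);
            stats.put("uniqueTokens", (double) counts.size());
            return stats;
        }
    }

    public static void main(String[] args) {
        TextStatsCalculator calc = new TokenCountCalculator();
        // Stand-in sink that prints; a real connector would write to its own store.
        StatsSink sink = (id, stats) -> System.out.println(id + " -> " + stats);
        sink.accept("doc-1", calc.calculate("the quick brown fox jumps over the lazy dog"));
    }
}
```

With a split like this, the calculator has no knowledge of directories of extracts or of H2, so the same calculator could in principle be called from a job that iterates over documents in a search cluster.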

This is slated for 1.23/2.0, not 1.22.




--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
