Kevin Risden created SOLR-16010:
-----------------------------------

             Summary: langid should include all required Tika dependencies
                 Key: SOLR-16010
                 URL: https://issues.apache.org/jira/browse/SOLR-16010
             Project: Solr
          Issue Type: Task
      Security Level: Public (Default Security Level. Issues are Public)
          Components: contrib - LangId
            Reporter: Kevin Risden


Currently, the langid module requires the extraction module to be loaded for 
langid to work. It isn't clear whether what is included in the extraction module 
even meets langid's needs (i.e., tika-langdetect isn't included in the 
extraction module).

{code:java}
➜  solr git:(SOLR-15989) find solr/packaging/build/solr-10.0.0-SNAPSHOT/ -name '*tika*.jar'
solr/packaging/build/solr-10.0.0-SNAPSHOT/modules/langid/lib/tika-core-1.27.jar
solr/packaging/build/solr-10.0.0-SNAPSHOT/modules/extraction/lib/tika-parsers-1.27.jar
solr/packaging/build/solr-10.0.0-SNAPSHOT/modules/extraction/lib/tika-java7-1.27.jar
solr/packaging/build/solr-10.0.0-SNAPSHOT/modules/extraction/lib/tika-xmp-1.27.jar
solr/packaging/build/solr-10.0.0-SNAPSHOT/modules/extraction/lib/vorbis-java-tika-0.8.jar
solr/packaging/build/solr-10.0.0-SNAPSHOT/modules/extraction/lib/tika-core-1.27.jar
{code}

This came out of a discussion in SOLR-15989 - 
https://github.com/apache/solr/pull/621#discussion_r806083202



