Hi Chip,
Your analysis is correct - you're experiencing these dependency issues because Solr 10 has made significant changes to the Tika integration architecture.

In Solr 9.10, the LocalTikaExtractionBackend was deprecated, and in Solr 10.0.0 it has been completely removed (SOLR-17961). This is indeed what's causing your ClassNotFoundException and the need to manually add all those dependencies.

Key changes in Solr 10:
• LocalTikaExtractionBackend (in-process Tika) is removed
• Only the external TikaServer backend is now supported
• The ExtractingRequestHandler now requires configuring an external Tika Server URL via the 'tikaserver.url' parameter
• Parse-context-based configuration is no longer supported
(...)
2. **Alternative extraction pipeline**: Extract content in your application before sending to Solr, which is often more robust for production anyway.

The dependency explosion you're seeing is because the local Tika backend used to handle these dependencies transparently, but now they're exposed due to the architectural changes.
Yes, I know. When I saw that there was a new release, I read the changelog first. But IMHO this isn't the problem I'm seeing:
Before I upload documents to Solr, I use a local Tika parser to extract the content (I don't use Solr for this) and send preprocessed SolrDocuments containing all the information.
OTOH, when you compare the pom.xml files of solr-core for versions 9.10.1 and 10.0.0, they don't look much different. As far as I can see, both contain more or less the same dependencies with the same scopes; some of them just have newer versions. What I don't understand is why none of those dependencies show up in the dependency tree of my test.
With solr-core-9.10.1 "mvn dependency:tree" gives me

(...)
[INFO] +- org.apache.solr:solr-core:jar:9.10.1:compile
[INFO] |  +- org.apache.lucene:lucene-core:jar:9.12.2:compile
[INFO] |  +- org.apache.lucene:lucene-analysis-common:jar:9.12.2:compile
[INFO] |  +- org.apache.lucene:lucene-queries:jar:9.12.3:compile
(...)
[INFO] |  \- org.xerial.snappy:snappy-java:jar:1.1.10.8:runtime
[INFO] +- org.eclipse.jdt:org.eclipse.jdt.annotation:jar:2.4.100:provided
[INFO] +- org.junit.jupiter:junit-jupiter-engine:jar:6.0.3:test
[INFO] |  +- org.junit.platform:junit-platform-engine:jar:6.0.3:test
[INFO] |  |  +- org.opentest4j:opentest4j:jar:1.3.0:test
[INFO] |  |  \- org.junit.platform:junit-platform-commons:jar:6.0.3:test
[INFO] |  +- org.junit.jupiter:junit-jupiter-api:jar:6.0.3:test
[INFO] |  +- org.apiguardian:apiguardian-api:jar:1.1.2:test
[INFO] |  \- org.jspecify:jspecify:jar:1.0.0:runtime
[INFO] \- org.junit.jupiter:junit-jupiter-params:jar:6.0.3:test
(...)

whereas the same command with solr-core-10.0.0 only shows

(...)
[INFO] +- org.apache.solr:solr-core:jar:10.0.0:compile
[INFO] +- org.eclipse.jdt:org.eclipse.jdt.annotation:jar:2.4.100:provided
[INFO] +- org.junit.jupiter:junit-jupiter-engine:jar:6.0.3:test
[INFO] |  +- org.junit.platform:junit-platform-engine:jar:6.0.3:test
[INFO] |  |  +- org.opentest4j:opentest4j:jar:1.3.0:test
[INFO] |  |  \- org.junit.platform:junit-platform-commons:jar:6.0.3:test
[INFO] |  +- org.junit.jupiter:junit-jupiter-api:jar:6.0.3:test
[INFO] |  +- org.apiguardian:apiguardian-api:jar:1.1.2:test
[INFO] |  \- org.jspecify:jspecify:jar:1.0.0:runtime
[INFO] \- org.junit.jupiter:junit-jupiter-params:jar:6.0.3:test
(...)

All dependencies of solr-core-10.0.0 are missing, although they are still there in the pom.xml...
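For reference, the effect can also be checked from inside the test itself. The following is only a minimal sketch using nothing but the JDK; the class name ClasspathProbe is made up for illustration, and the two probed class names are merely examples of classes I would expect to come from solr-core and lucene-core:

```java
// Hypothetical helper (not part of Solr or Maven): reports whether a class
// can be resolved on the current classpath without initializing it.
final class ClasspathProbe {

    private ClasspathProbe() {
    }

    /** Returns true if the named class can be loaded by this class's loader. */
    static boolean isPresent(String className) {
        try {
            // initialize=false: only resolve the class, do not run static initializers
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Example class names only; adjust to whatever the failing test reports.
        String[] candidates = {
            "org.apache.solr.core.SolrCore",
            "org.apache.lucene.index.IndexWriter"
        };
        for (String name : candidates) {
            System.out.println(name + " present: " + isPresent(name));
        }
    }
}
```

If the transitive dependencies are really not resolved, I would expect the solr-core class to be reported as present and the Lucene class as missing, which would match the two trees above.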
Do you have any idea why?

Regards
Thorsten
