epugh commented on code in PR #3784:
URL: https://github.com/apache/solr/pull/3784#discussion_r2439313692


##########
solr/solr-ref-guide/modules/indexing-guide/pages/indexing-with-tika.adoc:
##########
@@ -20,7 +20,7 @@ If the documents you need to index are in a binary format, 
such as Word, Excel,
 
 Apache Tika incorporates many different file-format parsers such as 
http://pdfbox.apache.org/[Apache PDFBox] and 
http://poi.apache.org/index.html[Apache POI] to extract the text content and 
metadata from files.
 
-Solr's `ExtractingRequestHandler` uses Tika, either in-process or a remote 
Tika server, to support extracting text and metadata from binary files.
+Solr's `ExtractingRequestHandler` uses Apache Tika via an external Tika Server 
to extract text and metadata from binary files.

Review Comment:
   Worth mentioning its pluggability? Though everything in Solr is...



##########
solr/modules/extraction/src/java/org/apache/solr/handler/extraction/ExtractingParams.java:
##########
@@ -137,7 +137,7 @@ public interface ExtractingParams {
    */
   public static final String PASSWORD_MAP_FILE = "passwordsFile";
 
-  /** Backend selection, either `local` or `tikaserver`. */
+  /** Backend selection */

Review Comment:
   Worth saying "pluggable backend selection"?



##########
gradle/libs.versions.toml:
##########
@@ -39,7 +39,6 @@ apache-kafka = "3.9.1"
 apache-log4j = "2.21.0"
 apache-lucene = "10.3.1"
 apache-opennlp = "2.5.6"
-apache-poi = "5.2.2"
 apache-rat = "0.15"
 apache-tika = "1.28.5"

Review Comment:
   I guess this is here because of Tika lang id? I was expecting Tika 3...



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

