gerlowskija commented on a change in pull request #1332: SOLR-14254: Docs for text tagger: FST50 trade-off
URL: https://github.com/apache/lucene-solr/pull/1332#discussion_r390556194
 
 

 ##########
 File path: solr/solr-ref-guide/src/the-tagger-handler.adoc
 ##########
 @@ -271,11 +271,12 @@ The response should be this (the QTime may vary):
   }}
 ----
 
-== Tagger Tips
+== Tagger Performance Tips
 
-Performance Tips:
-
-* Follow the recommended configuration field settings, especially `postingsFormat=FST50`.
+* Follow the recommended configuration field settings above.
+Additionally, for the best tagger performance, set `postingsFormat=FST50`.
+However, non-default postings formats have no backwards-compatibility guarantees, and so if you upgrade Solr then you may find a nasty exception on startup as it fails to read the older index.
+If the input text to be tagged is small (e.g. you are tagging queries or tweets) then the postings format choice isn't as important.
 
 Review comment:
   [Q] Interesting. I didn't realize that the performance gap between FST50 and the default postings format shrinks as the individual documents get smaller. Did you run a particular performance test to bear this out, or are you intuiting that behavior from knowing how postings formats work?
   
   Is the performance still comparable when the number of tweets (or whatever the small documents are) gets large and the postings lists grow from the sheer number of tiny docs?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

