I wanted to share that I have been polishing a plugin that lets you control the global term statistics of your Solr instance. It is primarily meant for relevance evaluation, where you want to mock production document frequencies and similar stats: the plugin lets you control the global stats side of BM25 scoring when working with smaller test samples of documents.
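To make the motivation concrete, here is a minimal sketch of why document frequency matters so much on a small sample. It uses the IDF formula from Lucene's BM25Similarity, ln(1 + (N - n + 0.5) / (n + 0.5)). The function name and all of the counts are made up for illustration; this is not code from the plugin.

```python
import math


def bm25_idf(doc_freq: int, doc_count: int) -> float:
    """IDF term of BM25 as Lucene's BM25Similarity computes it.

    doc_freq  -- number of documents containing the term (n)
    doc_count -- number of documents with the field (N)
    """
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


# Hypothetical numbers: the same term in a small test sample
# versus the stats you might see in production.
sample_idf = bm25_idf(doc_freq=1, doc_count=1_000)
prod_idf = bm25_idf(doc_freq=200_000, doc_count=1_000_000)

print(f"IDF with test-sample stats: {sample_idf:.3f}")
print(f"IDF with production stats:  {prod_idf:.3f}")
```

With the sample's stats the term looks rare and gets a much higher IDF than it would in production, so relevance experiments on the sample can be misleading unless the global stats are overridden.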
However, there could be other use cases where you want to manually override the "natural" document frequency in prod to match the true specificity of a term. If "book" only comes up once in the titles of your book search index, you might arguably want it treated as far more common, to match the user's true sense of that term's specificity. That would make "book" matches less important than, say, a match for a truly specific term like "woodworking".

https://github.com/softwaredoug/managed-stats

I'd love to get feedback, and PRs are always welcome.

Thanks
-Doug