I've received a request from our business area to look at emphasising ~0 phrase matches over ~1 (and greater) matches more than they already are. I can't see any doco on the subject, so I'd like to ask whether anyone else has played in this area, or is at least willing to sanity check my reasoning before I rush in and code a solution that may be reinventing the wheel.
Looking through the codebase, I can only find hardcoded weightings in a couple of places, using the formula "return 1.0f / (distance + 1);", which gives ~0 a weight of 1 and ~1 a weight of 0.5.

I've considered a number of approaches, but the most flexible seems to be to expose those two numbers (the numerator and the constant added to the distance) via configuration. We are considering adjusting them in sync with each other (using 1/3 instead of 1 in both places), which changes the shape of the weighting curve while keeping the scale between 1 and 0. Additionally, we are considering increasing the numerator to raise the upper end of the scale above 1. Not sure if this is a dumb idea though.

Our hope was to use something like "return 2.0f / (distance + 0.33f);" to give ~0 matches a real (^2) boost relative to other weighting factors, while keeping ~1 (and greater) matches at around their current weight. This remains a completely untested theory, since I may be misunderstanding how the output gets combined outside this method. The real technical change, though, would simply be to get those two numbers from config.

Any advice, or suggestions about other ideas we haven't even considered? The larger picture here is that we are using edismax and the pf fields are all covered by ps=5.

Ta,
Greg
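
P.S. To make the "two numbers from config" idea concrete, here is a rough sketch in plain Java. The class name and the property keys ("sloppy.numerator", "sloppy.offset") are made up for illustration, and nothing here is actual Lucene or Solr API; how the result would be wired into the real scoring code is exactly the part I'm unsure about, so treat this purely as arithmetic on the formula.

```java
import java.util.Properties;

/** Hypothetical stand-in for the hardcoded "1.0f / (distance + 1)" weighting. */
final class ConfigurableSloppyWeight {
    private final float numerator; // the "1.0f" in the current formula
    private final float offset;    // the "+ 1" in the current formula

    ConfigurableSloppyWeight(float numerator, float offset) {
        if (offset <= 0f) {
            // distance 0 would otherwise divide by zero or go negative
            throw new IllegalArgumentException("offset must be > 0");
        }
        this.numerator = numerator;
        this.offset = offset;
    }

    /** Reads both numbers from config; the keys are invented for this sketch. */
    static ConfigurableSloppyWeight fromProperties(Properties props) {
        float n = Float.parseFloat(props.getProperty("sloppy.numerator", "1.0"));
        float o = Float.parseFloat(props.getProperty("sloppy.offset", "1.0"));
        return new ConfigurableSloppyWeight(n, o);
    }

    /** Same shape as the existing formula, with both constants configurable. */
    float weight(int distance) {
        return numerator / (distance + offset);
    }

    public static void main(String[] args) {
        // {numerator, offset} pairs: current default, the 1/3 variant,
        // the 2.0f/0.33f variant, and a 0.66f/0.33f variant for comparison.
        float[][] variants = {
            {1.0f, 1.0f}, {1f / 3f, 1f / 3f}, {2.0f, 0.33f}, {0.66f, 0.33f}
        };
        for (float[] v : variants) {
            ConfigurableSloppyWeight w = new ConfigurableSloppyWeight(v[0], v[1]);
            System.out.printf("%.2f / (d + %.2f): ~0=%.3f ~1=%.3f ~2=%.3f%n",
                    v[0], v[1], w.weight(0), w.weight(1), w.weight(2));
        }
    }
}
```

Running the numbers this way suggests the 2.0f / 0.33f variant gives roughly 6.06 for ~0 and 1.50 for ~1, so it lifts both rather than leaving ~1 where it is. A numerator of about 0.66f with the 0.33f offset comes out near 2 for ~0 and just under 0.5 for ~1, which looks closer to the "double ~0, leave ~1 alone" behaviour described above.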