We have a number of queries that produce good results based on the textual data but are contextually wrong. For example, an "SSD hard drive" search matches the music album "SSD hip hop drives us crazy".
Textually that is a fair match, but SSD is a term that relates strongly to technical documents. We'd like to steer this query more firmly toward the technical documents based on the term "SSD". I am considering whether it would be worth clustering all documents, which should tend to group the music with the music and the tech items with the tech items. I would then pull out the term vectors that define each cluster, have a human review that data, and plug the reviewed terms back into the documents of each cluster as a separate search field that gets boosted. In my head it seems like a plausible way to weight terms like SSD toward the cluster of items they are most closely associated with. Should I spend the effort to find out? Yea or nay?
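
For concreteness, here is a minimal sketch of the pipeline described above, in plain Python with no search engine involved. Everything in it is an assumption made for illustration: the six toy documents, the helper names (`tfidf_vectors`, `kmeans`, `top_terms`), the smoothed TF-IDF weighting, and the choice of cosine-based k-means with farthest-first seeding. A real corpus would likely call for a proper clustering library and far more data.

```python
import math
from collections import Counter

# Hypothetical toy corpus: three tech items, three music items.
docs = [
    "ssd hard drive sata storage",
    "nvme ssd storage benchmark",
    "sata hard drive storage enclosure",
    "hip hop album ssd drives us crazy",
    "hip hop album vinyl remix",
    "jazz album vinyl remix",
]

def tfidf_vectors(texts):
    """Return one L2-normalised sparse TF-IDF vector (dict) per document."""
    tokenized = [t.lower().split() for t in texts]
    n = len(texts)
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        # Smoothed IDF; an assumed weighting, not the only reasonable one.
        vec = {t: c * (math.log((1 + n) / (1 + df[t])) + 1.0)
               for t, c in Counter(toks).items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors

def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())

def centroid(members):
    """Normalised sum of the member vectors."""
    total = Counter()
    for v in members:
        for t, w in v.items():
            total[t] += w
    norm = math.sqrt(sum(w * w for w in total.values())) or 1.0
    return {t: w / norm for t, w in total.items()}

def kmeans(vectors, k, iterations=20):
    """Cosine k-means with deterministic farthest-first seeding."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        # Next seed: the document least similar to every seed chosen so far.
        centroids.append(
            min(vectors, key=lambda v: max(cosine(v, c) for c in centroids)))
    labels = None
    for _ in range(iterations):
        new_labels = [max(range(k), key=lambda j: cosine(v, centroids[j]))
                      for v in vectors]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = centroid(members)
    return labels, centroids

def top_terms(cluster_centroid, n=5):
    """Highest-weighted terms of a centroid: the list a human would review."""
    ranked = sorted(cluster_centroid.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:n]]

vectors = tfidf_vectors(docs)
labels, centroids = kmeans(vectors, k=2)
for j, c in enumerate(centroids):
    print(j, top_terms(c))
```

After review, the surviving terms for each cluster would be written into the extra boosted field on every document in that cluster. On this toy data "ssd" ends up among the defining terms of the tech cluster only, which is the effect I am hoping for. Whether that holds on a real, noisy corpus is exactly what I am unsure about.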