> OK, can we do it this way then?
>
> - how can I boost (say) one data supplier over the others (based on a
> properties switch, or database value)?
In Sesat-Kernel a single search request relates to a "mode". A mode delegates out to an unlimited number of "search-commands", and each search command can be manipulated in various ways. After they have all been run, the mode itself can be post-processed, for example by federating the search command results together, removing duplicates, etc.

The search commands are of different types (fast4, fast5, yahoo, solr, etc.), each with its own specific attributes. There is also general configuration, common to all command types, for how to translate the query for the index (query-builder), how to further customise the query (query-transformers), and how to post-process results (result-handlers). This is all defined declaratively in the modes.xml file.

As mentioned, there is also post-processing on a mode. This is how search federation occurs: results from various commands can be blended together into one pseudo-command, making it look like all results actually came from one index (FederatorRunHandler.java).

To answer your question: boosting one supplier over another is a matter of the algorithm you use to blend results in the federation. The attributes for basic blending can be found in FederatorRunHandlerConfig.java. For now there are three basic blending approaches: random, round robin, and sequel. It would be easy to add a blend algorithm that blended the results together based on a "score" field from the index or database. To boost one index's score over another you would use the NumberOperationHandler (a result handler) on each result coming out of the index to manipulate the value in the score field, for example to multiply the score field by some number. Any mathematical expression supported by the JEP library can be used here, so this can become as complex as your needs require.

This presumes that the different suppliers' results are in different indexes.
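To make the score-based blend idea concrete, here is a minimal sketch in plain Java. It is not Sesat code: ScoredResult, ScoreBlender, boost and blend are hypothetical names invented for this illustration, and the real FederatorRunHandler and NumberOperationHandler may be structured quite differently. It only shows the two steps described above: multiply one supplier's scores by a factor, then merge all command results ordered by score.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical result holder for illustration; not a Sesat class. */
final class ScoredResult {
    final String supplier;
    final String title;
    final double score;

    ScoredResult(String supplier, String title, double score) {
        this.supplier = supplier;
        this.title = title;
        this.score = score;
    }
}

/** Hypothetical score-based blend; not part of Sesat. */
final class ScoreBlender {

    /** Returns copies of the results with each score multiplied by factor. */
    static List<ScoredResult> boost(List<ScoredResult> results, double factor) {
        List<ScoredResult> boosted = new ArrayList<>();
        for (ScoredResult r : results) {
            boosted.add(new ScoredResult(r.supplier, r.title, r.score * factor));
        }
        return boosted;
    }

    /** Merges all per-command result lists into one list, highest score first. */
    static List<ScoredResult> blend(List<List<ScoredResult>> perCommand) {
        List<ScoredResult> merged = new ArrayList<>();
        for (List<ScoredResult> results : perCommand) {
            merged.addAll(results);
        }
        // The sort is stable, so equal scores keep their original command order.
        merged.sort(Comparator.comparingDouble((ScoredResult r) -> r.score).reversed());
        return merged;
    }
}
```

The factor could be read from a properties file or a database value, which would make the boost soft-switchable; a factor of 1.0 leaves that supplier's ordering unchanged.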
If everything is in the one index then you could solve this by using the index's rank profile (fast) or DispatchRequest's query fields (solr). For Solr you can read up on http://wiki.apache.org/solr/DisMaxRequestHandler which also has a boost query and boost functions.

> - how could I do absolute boosting always displaying a certain supplier,
> also soft-switchable?

Using the NumberOperationHandler you would multiply the value in the score field of the results from that supplier by a very high value, to ensure they come first in the blending process.

> - what is the relevancy algorithm and can I create and apply my own
> relevancy algorithms without changing the base code? I might need a
> different one for games suppliers than I need for music suppliers for
> example.

The relevancy algorithm is usually implemented within the index. For example, in Solr you can use different DisMaxRequestHandlers that will give you different ranking algorithms, or you can use the sort parameter to ask for a particular type of sorting (e.g. by date, by size, etc.). Sorting works when there is one field in the index to sort on; ranking algorithms are useful when relevance depends on several fields in each result. Sesat-Kernel provides a nice navigation model around this to make it easy to design the navigators on the webpage.

> - Do you support the Yahoo-style concepts, aka query direction, aka
> query steering, which identifies a "zone" of suppliers (named group)
> based on a query term? If so how?

Not that I know of. Is this part of Yahoo's Index Data Protocol (IDP)?

> - My suppliers will return their own interpretation of relevancy scores.
> How can I rebase them so I can compare one supplier's relevancy with
> another and prevent suppliers boosting their own stuff unfairly.

In this situation I would put each supplier into its own index, and use the result handling and then federation approach described in my answer to the first question above.
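As one possible way to rebase scores in the result-handling step, here is a small sketch of min-max scaling applied to each supplier's scores before blending. This is an assumption on my part rather than anything Sesat ships: ScoreRebaser and minMax are hypothetical names, and other rebasing formulas may suit your data better.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of Sesat. */
final class ScoreRebaser {

    /** Rescales scores linearly so the lowest maps to 0.0 and the highest to 1.0. */
    static List<Double> minMax(List<Double> scores) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double s : scores) {
            min = Math.min(min, s);
            max = Math.max(max, s);
        }
        List<Double> rebased = new ArrayList<>();
        for (double s : scores) {
            // If all scores are equal there is no spread to scale; map them to 1.0.
            rebased.add(max == min ? 1.0 : (s - min) / (max - min));
        }
        return rebased;
    }
}
```

With this, a supplier that reports scores of 10, 55 and 100 ends up with 0.0, 0.5 and 1.0, the same range as a supplier that reports scores between 0 and 1, so inflating every score by a constant factor gains nothing in the blend.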
Separate indexes also make sense because each one can be re-fed and optimised to its supplier's needs. There is also isolation, for security reasons.

> - Corporate animals like me don't have the time, inclination or
> equipment to build their own version. It looks like I would need to
> install Maven, JDK etc. Could you not offer prospective users a
> pre-built version with release notes, one for LINUX one for Windows? I
> think this would increase takeup, particularly in larger companies. FAST
> do this with Unity.

Sesat is not a commercial product, nor are there any companies offering support for it, although you could enquire with T-Rank AS. It is purely a code framework and suite of libraries. Even the declarative side of things means building a sesat skin with a JDK and Maven and deploying it alongside the sesat-kernel in a Java web container like Tomcat or JBoss. So at some point you need a developer to evaluate Sesat as a solution to your needs.

FAST Unity is not so different: fewer features overall and less flexibility in the long run, but a greater "wow" factor off the starting block. There you'll instead be paying to have the FAST consultants do the equivalent development behind the scenes. This generally leads to a greater and greater dependence on the consultants and ends up being a far more expensive route than just having your own developers in charge.

~mck

--
"When prosperity comes, do not use all of it." Confucius
| semb.wever.org | sesat.no | sesam.no |
_______________________________________________ Kernel-development mailing list [email protected] http://sesat.no/mailman/listinfo/kernel-development
