Hello,

I'm starting to work on federated search algorithms for my PhD research. I'll use Solr to implement them, since I have two years of experience with Solr at work.
I thought that at least part of my work could be useful for the Solr project, and that I could contribute some code: specifically, the components/modifications needed to add federated search support to Solr.

By "Federated Search" I mean searching across heterogeneous data sources, which is different from the Distributed Search already implemented in Solr. The goal is to let Solr merge results not only from SolrServer instances, but also from external sources (e.g. search engines that use a different API).

The use case would look like this:
- the user sends a request to Solr (e.g. SearchRequest)
- Solr handles the request internally and/or sends it to other Solr instances (the current Distributed Search) AND sends it to the specified external data sources through dedicated adapters
- Solr merges the results from the Solr instances with the results from the external collections and returns the combined results to the user

To support this scenario, the four common parts of federated search would be needed:
- collection representation (external collections probably won't provide the same information as Solr, such as tf-idf)
- collection selection (predict which collections are likely to return relevant results and forward the search request only to them)
- result merging (merge results based on more limited information than Solr provides)
- external source connection (a common API for writing custom collection adapters)

My idea is to write a set of federated search components, i.e. a framework that lets developers implement custom algorithms/plugins for each part of the federated search scenario.

What do you think about that?

Sorry for my English :)

Jacek Plebanek
Interdisciplinary Centre for Mathematical and Computational Modelling
University of Warsaw, Poland
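
A rough sketch of how the four parts listed above could fit together is given here, only to make the discussion more concrete. Everything in it is an assumption for illustration: CollectionAdapter, FederatedSearcher, Hit and FixedAdapter are hypothetical names, not existing Solr classes, and the merge step uses a simple weighted reciprocal-rank formula picked because it needs no tf-idf or raw scores from the external source. It is not a proposal for the final algorithm.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** One merged result: document id, originating collection, merged score. */
final class Hit {
    final String id;
    final String source;
    final double score;

    Hit(String id, String source, double score) {
        this.id = id;
        this.source = source;
        this.score = score;
    }
}

/** Hypothetical adapter API; a Solr instance or an external engine would each implement it. */
interface CollectionAdapter {
    String name();

    /** Collection representation: estimate in [0, 1] of how relevant this collection is. */
    double estimateRelevance(String query);

    /** Returns document ids, best first. No comparable scores are assumed. */
    List<String> search(String query, int rows);
}

/** Stub adapter with canned results, for illustration only. */
final class FixedAdapter implements CollectionAdapter {
    private final String name;
    private final double relevance;
    private final List<String> ids;

    FixedAdapter(String name, double relevance, List<String> ids) {
        this.name = name;
        this.relevance = relevance;
        this.ids = ids;
    }

    public String name() { return name; }

    public double estimateRelevance(String query) { return relevance; }

    public List<String> search(String query, int rows) {
        return ids.subList(0, Math.min(rows, ids.size()));
    }
}

final class FederatedSearcher {
    private final List<CollectionAdapter> adapters;
    private final double threshold;

    FederatedSearcher(List<CollectionAdapter> adapters, double threshold) {
        this.adapters = adapters;
        this.threshold = threshold;
    }

    List<Hit> search(String query, int rows) {
        List<Hit> merged = new ArrayList<>();
        for (CollectionAdapter adapter : adapters) {
            double weight = adapter.estimateRelevance(query);
            if (weight < threshold) {
                continue; // collection selection: skip unlikely collections
            }
            List<String> ids = adapter.search(query, rows);
            for (int rank = 0; rank < ids.size(); rank++) {
                // result merging: collection weight divided by 1-based rank
                merged.add(new Hit(ids.get(rank), adapter.name(), weight / (rank + 1)));
            }
        }
        merged.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return new ArrayList<>(merged.subList(0, Math.min(rows, merged.size())));
    }
}
```

In this sketch the relevance estimate plays two roles: it decides whether a collection is queried at all, and it weights that collection's results during merging. A real implementation would probably want these as separate plugin points.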
