: the list is not ordered (I do not know the details of the search
: engine, I only have its result for a query)
:
: then I have this list of documents, which represents a subset of the corpus
:
: I have to rank the documents of the list, using your scoring algorithm
In other words, out of a large corpus C, this webservice has told you that the documents comprising subset S are the top N matching documents for your query Q (where N << sizeof(C)), and your goal is to sort S as best as possible.

You could try indexing all the docs in S in a Lucene RAMDirectory and then searching on them, but my original point about the score being fairly meaningless in an index of only 1 document still applies somewhat ... if all of the documents you get back already have a lot in common (they must have something in common or the webservice wouldn't have returned them in response to your query) it may be hard to get a meaningful document frequency for the words in your query.

You also may run into confusion about what exactly your query "is" and whether or not your interpretation matches that of the webservice ... at a very simplistic level, if the query is "Java Lucene" and your webservice interprets that as an "OR" query and you interpret that as an "AND" query, you might find that the scores you compute for all the docs are 0.

If I were in your shoes, I'd try to work with whoever runs this webservice to make it return more useful information -- at the very least to return results in sorted order.

-Hoss
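
A rough sketch of the "OR" vs "AND" mismatch described above, in plain Python rather than Lucene. The `idf` and `score` helpers and the three toy documents are made up for illustration; the formula is only loosely shaped like a TF-IDF score and is not Lucene's actual scoring code.

```python
import math

def idf(term, docs):
    """Toy inverse document frequency of `term` over `docs` (lists of tokens)."""
    df = sum(1 for d in docs if term in d)
    return 1.0 + math.log(len(docs) / (df + 1))

def score(query_terms, doc, docs, require_all=False):
    """Toy TF-IDF score; require_all=True mimics an "AND" query."""
    if require_all and not all(t in doc for t in query_terms):
        return 0.0
    return sum(math.sqrt(doc.count(t)) * idf(t, docs) ** 2 for t in query_terms)

# Hypothetical subset S, as if the webservice had matched "Java Lucene" as an OR query:
# every document contains one of the terms, but none contains both.
S = [
    ["lucene", "index", "search"],
    ["java", "virtual", "machine"],
    ["lucene", "scoring", "lucene"],
]
query = ["java", "lucene"]

or_scores = [score(query, d, S) for d in S]
and_scores = [score(query, d, S, require_all=True) for d in S]

print("OR :", or_scores)   # every document gets some positive score
print("AND:", and_scores)  # every document scores 0.0, so there is nothing to sort on
```

Note that `idf` here only ever sees the documents in S, so the document frequencies it computes say nothing about how rare a term is in the full corpus C.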