Hi,

here's another question involving MultiTermQuery. My aim is to get a total frequency count for a MultiTermQuery without having to execute the query. The naive approach would be to create the query, rewrite it, extract the terms, and look up each term's frequency, approximately as follows:

    IndexSearcher searcher = ...;
    PrefixQuery query = new PrefixQuery(new Term("field", "abc"));
    Query rewritten = searcher.rewrite(query);
    // extractTerms() fills the set passed to it rather than returning one
    Set<Term> terms = new HashSet<Term>();
    rewritten.extractTerms(terms);
    ...

and eventually read the term frequency for each extracted term.

However, this seems rather costly for a large number of terms. I am actually only interested in the total frequency, so a term-by-term analysis should not be necessary.

My use case is that I have an index containing part-of-speech tags in the form <tag>:<token>, and I may be searching for <tag> frequencies.

My alternative solution would be to create a dedicated index in which the original tokens are completely replaced by their tags, so that the documents would have the form "DET NN ..." and the index would contain the corresponding tag tokens. Would you recommend this instead?

Thanks,
Carsten

-- 
Institut für Deutsche Sprache | http://www.ids-mannheim.de
Projekt KorAP | http://korap.ids-mannheim.de
Tel. +49-(0)621-43740789 | schno...@ids-mannheim.de
Korpusanalyseplattform der nächsten Generation
Next Generation Corpus Analysis Platform
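
P.S. To make the "total frequency per tag" idea concrete, here is a minimal sketch in plain Java. It is only a model and not Lucene code: the class TagFrequencySketch, the method totalFrequency, and the sample terms and counts are all made up for illustration, and a TreeMap stands in for a sorted term dictionary. The point is that all terms sharing the prefix <tag>: sit next to each other in sorted order, so the total can be accumulated in one pass over that range without collecting the terms first.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical plain-Java model of a sorted term dictionary; this is NOT the Lucene API.
final class TagFrequencySketch {

    // Sums the frequencies of all terms that start with the given prefix.
    // In a sorted dictionary these terms form one contiguous range, so we
    // seek to the first candidate and stop at the first non-matching term.
    static long totalFrequency(TreeMap<String, Long> termFreqs, String prefix) {
        long total = 0;
        for (Map.Entry<String, Long> entry : termFreqs.tailMap(prefix, true).entrySet()) {
            if (!entry.getKey().startsWith(prefix)) {
                break; // left the prefix range, nothing further can match
            }
            total += entry.getValue();
        }
        return total;
    }

    public static void main(String[] args) {
        // Made-up terms in the form <tag>:<token> with made-up frequencies.
        TreeMap<String, Long> termFreqs = new TreeMap<>();
        termFreqs.put("DET:the", 7L);
        termFreqs.put("DET:a", 5L);
        termFreqs.put("NN:dog", 3L);
        termFreqs.put("NN:cat", 2L);
        termFreqs.put("VB:run", 4L);

        System.out.println("DET total: " + totalFrequency(termFreqs, "DET:"));
        System.out.println("NN total:  " + totalFrequency(termFreqs, "NN:"));
    }
}
```

Whether something equivalent can be done cheaply against a real index is exactly what I am asking about. The sketch just shows the kind of aggregation I have in mind.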