But somehow from "michael" you'll generate the N other terms to search for? And then it seems like you could just make a new query with those expanded terms?
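A minimal sketch of that expansion step, done outside Lucene in plain Java. It assumes the indexed terms have already been collected into a list (for example by iterating a TermsEnum) and uses plain Levenshtein distance as a stand-in for the custom error model described in the thread. The class name `QueryExpansionSketch`, the `expand` helper, the sample terms other than michael and mihel, and the distance threshold are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class QueryExpansionSketch {

    // Two-row dynamic-programming Levenshtein distance (insert, delete, substitute).
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Hypothetical helper: keep every indexed term within maxDistance of the
    // clean query term. Each returned term would become one SHOULD clause.
    static List<String> expand(String cleanTerm, List<String> indexedTerms, int maxDistance) {
        List<String> expanded = new ArrayList<>();
        for (String term : indexedTerms) {
            if (levenshtein(cleanTerm, term) <= maxDistance) {
                expanded.add(term);
            }
        }
        return expanded;
    }

    public static void main(String[] args) {
        // Assumed sample of indexed (possibly erroneous) terms.
        List<String> indexed = Arrays.asList("mihel", "michel", "lucene", "michael");
        List<String> expanded = expand("michael", indexed, 2);
        // Join as an OR query string; in Lucene each term would instead be
        // added to a BooleanQuery as a SHOULD TermQuery clause.
        System.out.println(String.join(" OR ", expanded));
    }
}
```

Scanning every indexed term per query term is linear in the size of the term dictionary, so this only illustrates the idea; a real error model would replace `levenshtein` in `expand`.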
If you need to know all terms in the index, you can use a TermsEnum to iterate through them. I'm only pushing this because doing this outside of Lucene is going to be far, far easier than modifying Lucene's sources to do what seems to be essentially query expansion.

Each query has its own Weight impl, and those all implement Weight.scorer. E.g. see TermWeight.scorer(), which returns TermScorer.

Mike McCandless

http://blog.mikemccandless.com

On Wed, Jul 24, 2013 at 9:12 AM, Abhishek Gupta <[email protected]> wrote:
> Michael, thanks for replying so fast.
>
> No, there is no mapping. What I have is some code based on LCS (Longest
> Common Subsequence), Levenshtein distance, suffix trees and a probabilistic
> error model, which maps the original word (e.g. michael) to the erroneous
> word (mihel). So I think I have to change the way matching is done.
>
> I am not sure whether the matching I am talking about is clear to you, so
> let me explain it a little. I am taking the case of the Vector Space Model,
> and for indexing the case of an inverted list (I am actually not sure what
> Lucene uses). I am talking about matching each query term against the
> labels of the inverted index.
>
> Also, just out of curiosity: the problem I mentioned in the SO question
> is that I cannot find the definition of weight.scorer(). Can you help me
> understand how things work there, and which model Lucene selects by
> default?
>
> Cheers,
> Abhishek Gupta
>
> On Wed, Jul 24, 2013 at 5:33 PM, Michael McCandless
> <[email protected]> wrote:
>>
>> Is there some mapping from clean term X to dirty indexed terms A, B,
>> C? If so, can't you just take a TermQuery(X) and replace with
>> BooleanQuery SHOULD A, B, C?
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>>
>> On Wed, Jul 24, 2013 at 7:41 AM, Abhishek Gupta <[email protected]>
>> wrote:
>> > I had some discussion on IRC, and they wanted to know my objective in
>> > doing so. I have already posted the objective there and am posting it
>> > here again:
>> >
>> > Sorry for the late response. I am making a search system in which the
>> > data I have indexed is erroneous. So I have made some schemes to match
>> > a query term (which is error free) to the indexed term (which might be
>> > erroneous). So I have to change the Lucene code where it matches a
>> > query term to the indexed data, so that I can code my matching schemes
>> > there.
>> >
>> > On Tue, Jul 23, 2013 at 11:21 PM, Abhishek Gupta
>> > <[email protected]>
>> > wrote:
>> >>
>> >> Hi,
>> >> I have a problem which is explained completely here. Please help!! Or
>> >> just give me some suggestion about where to get help.
>> >>
>> >> --
>> >> Abhishek Gupta,
>> >> 897876422, 9416106204, 9624799165
