Dear Parnab,
Thanks a lot for your guidance. I prefer to follow the second method, as I
have already indexed the bigrams using ShingleFilterWrapper. However, I have no
idea how to use NGramTokenizer here. Could you please write
one or two lines of code showing how to use NGramTok
TF is straightforward: you can simply count the number of occurrences in the
doc by simple string matching. For IDF you need to know the total number of docs in
the collection and the number of docs containing the bigram. reader.maxDoc() will
give you the total number of docs in the collection. To calculate the number o
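
For illustration, here is a rough sketch of the TF and IDF arithmetic described above, in plain Java with no Lucene dependency. The class and method names (BigramTfIdf, termFrequency, inverseDocumentFrequency) are made up for this example, and the formula idf = ln(N / df) is only one common variant that I am assuming here; it is not necessarily the one Lucene's scoring uses. In a real index the total would come from reader.maxDoc() as mentioned above (note that maxDoc() also counts deleted documents that have not been merged away yet), and the document frequency is simply passed in as a number.

```java
public class BigramTfIdf {

    // TF by simple string matching: counts how many times the bigram text
    // occurs in the document text. This is plain substring matching, so it
    // is not token-aware ("renew york" would also match "new york").
    static int termFrequency(String docText, String bigram) {
        if (bigram.isEmpty()) {
            return 0;
        }
        int count = 0;
        int from = 0;
        while ((from = docText.indexOf(bigram, from)) != -1) {
            count++;
            from += 1; // step one char forward and keep searching
        }
        return count;
    }

    // IDF using the assumed variant ln(totalDocs / docsWithBigram).
    // Returns 0 when the bigram occurs in no document, to avoid dividing by zero.
    static double inverseDocumentFrequency(int totalDocs, int docsWithBigram) {
        if (docsWithBigram <= 0) {
            return 0.0;
        }
        return Math.log((double) totalDocs / docsWithBigram);
    }

    public static void main(String[] args) {
        String doc = "new york is big and new york is busy";
        int tf = termFrequency(doc, "new york");
        // Hypothetical collection statistics, for illustration only.
        double idf = inverseDocumentFrequency(1000, 10);
        System.out.println("tf=" + tf + " idf=" + idf + " tf*idf=" + (tf * idf));
    }
}
```

Plain tf * idf is used in main only to keep the sketch short; any normalisation or smoothing would have to be added on top.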