Hi,

In my program, I can index and search documents based on unigrams. I modified the code as follows to obtain results based on bigrams, but it does not give me the desired output.
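To make clear what I mean by bigram terms, here is a minimal plain-Java sketch, independent of Lucene. The class name `BigramSketch`, the helper `bigrams`, the token order, and the choice of a space as the joiner are all assumptions made for illustration only. It pairs each token with the one that follows it, which is the kind of term I would expect in the index instead of single words.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BigramSketch {

    // Hypothetical helper (not a Lucene API): joins each token with its
    // successor using a single space, so n tokens yield n - 1 bigrams.
    public static List<String> bigrams(List<String> tokens) {
        List<String> result = new ArrayList<String>();
        for (int i = 0; i + 1 < tokens.size(); i++) {
            result.add(tokens.get(i) + " " + tokens.get(i + 1));
        }
        return result;
    }

    public static void main(String[] args) {
        // Illustrative stemmed tokens; the order here is assumed, not taken
        // from the real document.
        List<String> tokens = Arrays.asList("name", "manjula", "assist", "librarian");
        // Prints [name manjula, manjula assist, assist librarian]
        System.out.println(bigrams(tokens));
    }
}
```

With unigram output switched off, I would expect only paired terms like these, and no single-word terms.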
*****************
public static void createIndex() throws CorruptIndexException, LockObtainFailedException, IOException {
    final String[] NEW_STOP_WORDS = {"a", "able", "about", "actually", "after", "allow", "almost", "already", "also", "although", "always", "am", "an", "and", "any", "anybody"}; // only a portion
    SnowballAnalyzer analyzer = new SnowballAnalyzer("English", NEW_STOP_WORDS);
    Directory directory = FSDirectory.getDirectory(INDEX_DIRECTORY);
    ShingleAnalyzerWrapper sw = new ShingleAnalyzerWrapper(analyzer, 2);
    sw.setOutputUnigrams(false);
    IndexWriter w = new IndexWriter(INDEX_DIRECTORY, analyzer, true, IndexWriter.MaxFieldLength.UNLIMITED);
    File dir = new File(FILES_TO_INDEX_DIRECTORY);
    File[] files = dir.listFiles();
    for (File file : files) {
        Document doc = new Document();
        String text = "";
        doc.add(new Field("contents", text, Field.Store.YES, Field.Index.UN_TOKENIZED, Field.TermVector.YES));
        Reader reader = new FileReader(file);
        doc.add(new Field(FIELD_CONTENTS, reader));
        w.addDocument(doc);
    }
    w.optimize();
    w.close();
}
****************
Still the output is:
{contents: /1, assist/1, fine/1, librari/1, librarian/1, main/1, manjula/3, name/1, sabaragamuwa/1, univers/1}
*******************
If anybody can, please help me to obtain the correct output.

Thanks,
Manjula.