On 11/29/06, Java Programmer <[EMAIL PROTECTED]> wrote:
Hello,

I am having trouble writing to and searching a Lucene index at the same time. All I have done so far is write a class with two methods:

    private String indexLocation;

    public void addDocument(int id, String title, String body) throws IOException {
        IndexWriter indexWriter = new IndexWriter(indexLocation, new SimpleAnalyzer(), false);
        Document doc = new Document();
        doc.add(new Field("id", Integer.toString(id), Store.YES, Index.NO));
        doc.add(new Field("title", title, Store.NO, Index.TOKENIZED));
        doc.add(new Field("body", body, Store.NO, Index.TOKENIZED));
        indexWriter.addDocument(doc);
        indexWriter.close();
    }

    public List<Integer> search(String query) throws IOException, ParseException {
        IndexSearcher indexSearcher = new IndexSearcher(indexLocation);
        MultiFieldQueryParser queryparser =
            new MultiFieldQueryParser(new String[]{"title", "body"}, new SimpleAnalyzer());
        Query q = queryparser.parse(query);
        Hits hits = indexSearcher.search(q);
        Iterator it = hits.iterator();
        List<Integer> output = new ArrayList<Integer>();
        while (it.hasNext()) {
            output.add(Integer.parseInt(((Hit) it.next()).getDocument().get("id")));
        }
        indexSearcher.close();
        return output;
    }

What I don't like is that each method opens its own IndexWriter or IndexSearcher. I tried to open them once and keep them open throughout the whole lifecycle of the application (which will be very long, because it will be a news search running as a web service), but when I didn't close the IndexWriter, the IndexSearcher did not see any new documents in the index. The next step was keeping the IndexWriter open and reopening only the IndexSearcher, but in this case too the IndexSearcher saw the old index without the new documents.

So my final version is the one above, but could it be better, without closing the IndexWriter after each addition and opening an IndexSearcher before each search query? What is the best pattern for building such systems?
Hi there,

it's quite a good idea to keep the IndexWriter and IndexReader (IndexSearcher) open as long as possible. A quite nice pattern is used by Solr and gdata-server. The IndexWriter is closed after a certain number of documents have been added to the index, or when a timeout expires without any updates or inserts. As soon as the writer's close method has been called, a new IndexSearcher is opened and released to the application. If you work in a multithreaded environment such as a webapp, you should track the references to your searcher so that you can close it once nobody uses it anymore. Keeping searchers and writers open will result in much better performance, provided it is acceptable for updates and inserts to be invisible to the searcher for a certain amount of time.

Have a look at this:
http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/gdata-server/src/java/org/apache/lucene/gdata/search/index/GDataIndexer.java
http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/gdata-server/src/java/org/apache/lucene/gdata/utils/ReferenceCounter.java
http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/gdata-server/src/java/org/apache/lucene/gdata/search/index/IndexController.java
--> private ReferenceCounter<IndexSearcher> getNewServiceSearcher(final Directory dir)
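To make the reference-tracking idea concrete, here is a rough sketch in plain Java. It is not the gdata-server ReferenceCounter linked above; the names SharedResource and ResourceSwapper are made up for illustration, and any Closeable stands in for the IndexSearcher so that the sketch runs without Lucene on the classpath.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical holder: closes the wrapped resource when the last reference is released. */
final class SharedResource<T extends Closeable> {
    private final T resource;
    // Starts at 1: the swapper that publishes this resource holds one reference itself.
    private final AtomicInteger refs = new AtomicInteger(1);

    SharedResource(T resource) {
        this.resource = resource;
    }

    /** Takes one more reference; returns false if the resource is already closed. */
    boolean tryAcquire() {
        while (true) {
            int n = refs.get();
            if (n == 0) {
                return false;
            }
            if (refs.compareAndSet(n, n + 1)) {
                return true;
            }
        }
    }

    T get() {
        return resource;
    }

    /** Gives one reference back; the last release closes the resource. */
    void release() {
        if (refs.decrementAndGet() == 0) {
            try {
                resource.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }
}

/** Hypothetical publisher: hands out the current resource and swaps in a fresh one. */
final class ResourceSwapper<T extends Closeable> {
    private final AtomicReference<SharedResource<T>> current;

    ResourceSwapper(T initial) {
        current = new AtomicReference<>(new SharedResource<>(initial));
    }

    /** Callers must call release() on the returned holder when they are done. */
    SharedResource<T> acquire() {
        while (true) {
            SharedResource<T> shared = current.get();
            if (shared.tryAcquire()) {
                return shared;
            }
            // The holder was swapped out and closed in the meantime; retry with the new one.
        }
    }

    /** Publishes a fresh resource; the old one is closed once its last user releases it. */
    void swap(T fresh) {
        SharedResource<T> old = current.getAndSet(new SharedResource<>(fresh));
        old.release(); // drop the swapper's own reference
    }
}
```

In a search method you would call acquire(), run the query against get(), and call release() in a finally block. swap() would be called right after the writer has been closed and a new searcher has been opened.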
Another question: do I need to provide any synchronization around the indexWriter.addDocument(doc) method? I see that it isn't synchronized, so maybe the programmer needs to do it himself?

Best regards,
Adr
You could queue the documents to be added to the index to keep your IndexWriter busy. Might be a good idea anyway.

best regards
simon
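A minimal sketch of such a queue, in plain Java so it runs without Lucene: request threads only put documents on a BlockingQueue, and a single worker thread takes them off and hands them to a sink. The class name IndexingQueue and the Consumer sink are assumptions for illustration; in the real application the sink would be something like a call to indexWriter.addDocument(doc). Because only the worker thread touches the sink, the callers need no extra synchronization of their own.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

/** Hypothetical single-consumer queue: many producers, one thread feeding the sink. */
final class IndexingQueue<D> implements AutoCloseable {
    // Sentinel object that tells the worker to stop after draining earlier items.
    private static final Object STOP = new Object();

    private final BlockingQueue<Object> queue = new LinkedBlockingQueue<>();
    private final Thread worker;

    IndexingQueue(Consumer<D> sink) {
        worker = new Thread(() -> {
            try {
                while (true) {
                    Object item = queue.take(); // blocks until something is queued
                    if (item == STOP) {
                        return;
                    }
                    @SuppressWarnings("unchecked")
                    D doc = (D) item;
                    sink.accept(doc); // the only place the sink is ever called
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "index-writer");
        worker.start();
    }

    /** Safe to call from any thread; returns immediately. */
    void submit(D doc) {
        queue.add(doc);
    }

    /** Processes everything queued before this call, then stops the worker. */
    @Override
    public void close() {
        queue.add(STOP);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

This sketch does not bound the queue and ignores documents submitted after close(); a long-running service would probably want a capacity limit and some error handling around the sink.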
--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
