dumping lucene index to text file

2005-08-21 Thread Michael Ji
provide? thanks, Michael Ji

Re: newbie questions

2005-08-25 Thread Michael Ji
I guess you need to delete the doc from the Lucene index so that it is no longer returned as a hit; you can use LukeAll (an admin tool for Lucene indexes) to manipulate the indexed content. Michael Ji --- haipeng du <[EMAIL PROTECTED]> wrote: > could lucene have a way to delete a document fro

Re: newbie questions

2005-08-25 Thread Michael Ji
I guess so, but I haven't tried that before. Michael Ji --- haipeng du <[EMAIL PROTECTED]> wrote: > yes, that is right. Could I do that from Lucene API? > Thanks a lot. > > On 8/25/05, Michael Ji <[EMAIL PROTECTED]> wrote: > > I guess, you need to delete a doc

next score usage

2005-10-14 Thread Michael Ji
Lucene indexing structure. thanks, Michael Ji

Document Duplication for Multiple Segment Merge

2005-10-14 Thread Michael Ji
Hi, when Nutch's IndexMerger.java is called, the indexes from multiple segment directories are merged into one target directory. I wonder how Lucene deals with the case where identical documents exist in two segments. Is the older document (lower timestamp) deleted? thanks, Micha

Re: Document Duplication for Multiple Segment Merge

2005-10-14 Thread Michael Ji
should be discarded totally. And a strategy must be designed so that each segment relates to a fetchlist with the same interval time. Is this how Nutch handles the refetching case? Michael Ji --- Yonik Seeley <[EMAIL PROTECTED]> wrote: > There is no concept in Lucene of document

Re: Document Duplication for Multiple Segment Merge

2005-10-14 Thread Michael Ji
m yet. thanks, Michael Ji --- Yonik Seeley <[EMAIL PROTECTED]> wrote: > Sorry, I've only briefly looked at Nutch, so you should ask on that mailing list. > Lucene doesn't do deduping. > > -Yonik
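
Since the reply above says Lucene does not dedupe, one way to picture the "keep the newer copy" behaviour asked about in this thread is a pass over the documents before they are merged. The following is a minimal sketch in plain Java, not Nutch's or Lucene's actual code: the Doc class, its url/timestamp/segment fields, and the keepNewest helper are all hypothetical names chosen for illustration, and it assumes a document is identified by its URL.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DedupSketch {
    // Hypothetical stand-in for an indexed page; not a Lucene or Nutch type.
    static final class Doc {
        final String url;       // assumed identity key
        final long timestamp;   // assumed fetch time; larger means newer
        final String segment;   // which segment the copy came from

        Doc(String url, long timestamp, String segment) {
            this.url = url;
            this.timestamp = timestamp;
            this.segment = segment;
        }
    }

    // Keeps one Doc per URL, preferring the copy with the highest timestamp.
    static List<Doc> keepNewest(List<Doc> docs) {
        Map<String, Doc> newest = new LinkedHashMap<>();
        for (Doc d : docs) {
            Doc seen = newest.get(d.url);
            if (seen == null || d.timestamp > seen.timestamp) {
                newest.put(d.url, d);
            }
        }
        return new ArrayList<>(newest.values());
    }

    public static void main(String[] args) {
        List<Doc> merged = keepNewest(Arrays.asList(
                new Doc("http://example.com/a", 100L, "seg1"),
                new Doc("http://example.com/b", 150L, "seg1"),
                new Doc("http://example.com/a", 200L, "seg2")));
        for (Doc d : merged) {
            System.out.println(d.url + " from " + d.segment);
        }
    }
}
```

In a real index the surviving set would then decide which copies to delete or skip; how Nutch does this is a question for the Nutch list, as the reply suggests.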

compile search.jsp

2006-03-04 Thread Michael Ji
Hi, I made a change in search.jsp under /nutch/src/web/jsp and hoped it would be reflected in the skin of the Nutch search page. I ran "ant war" and replaced ROOT.war in tomcat/webapp; I also shut down and restarted Tomcat. But the Nutch search page seems to stay the same, also the bean.

refetching interval

2006-04-21 Thread Michael Ji
Hi, I am using Nutch 0.7 and found the following code in FetchListTool.java: private static final long FETCH_GENERATION_DELAY_MS = 7 * 24 * 60 * 60 * 1000; That means the next refetch time is always 7 days later, no matter what fetch interval is set in nutch-site.xml. I feel puzzled. Could any on
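
As a quick check of the arithmetic in the quoted constant, the expression can be evaluated on its own. This is only a sketch: the class name FetchDelayDemo and the delayInDays helper are made up for illustration, and it says nothing about how Nutch actually uses the constant or whether nutch-site.xml overrides it.

```java
import java.util.concurrent.TimeUnit;

class FetchDelayDemo {
    // Same expression as quoted from FetchListTool.java in the message above.
    // The multiplication is done in int arithmetic and only then widened to long;
    // 604,800,000 fits in an int, so nothing overflows here.
    static final long FETCH_GENERATION_DELAY_MS = 7 * 24 * 60 * 60 * 1000;

    // Hypothetical helper: converts the delay from milliseconds to whole days.
    static long delayInDays() {
        return TimeUnit.MILLISECONDS.toDays(FETCH_GENERATION_DELAY_MS);
    }

    public static void main(String[] args) {
        // Prints: 604800000 ms = 7 days
        System.out.println(FETCH_GENERATION_DELAY_MS + " ms = " + delayInDays() + " days");
    }
}
```

So the constant does work out to exactly seven days, which matches the poster's reading of the value.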