How to add a database to an existing Nutch index?

2006-07-27 Thread Patrick Kratzenstein
Hi there, I've read a lot about how easily a database can be added to a Lucene index. Well, it works. My goal is to have an engine that crawls some pages, but also several databases. So, at first my program crawls all the pages and finally it goes through each database and all available datasets. Each

Total time of a search

2006-07-27 Thread Lourival Júnior
Hi, Does anybody know how to calculate the total time of a search? Currently I use this, but I'm not sure about it: Date d = new Date(); int iniTime = (int) d.getTime(); // get the start time of the search over the indexes // The search over the indexes is executed here. try{ hits =

RE: Total time of a search

2006-07-27 Thread NG-Marketing, M.Schneider
Hi, I use the following code to measure it in milliseconds: // Stopwatch class Stoppuhr { long millis; void starte() { millis = System.currentTimeMillis(); } void stoppe() { millis = System.currentTimeMillis() - millis; }
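For readers following this thread, here is a minimal sketch of the same stopwatch idea. It is an illustration only, not code from either poster: the class name SearchTimer, the Timed holder, and the time() helper are hypothetical, and the lambda in main() is a stand-in for the real index search call. It keeps the timestamps as long values instead of casting them to int, and it uses System.nanoTime(), which is intended for measuring elapsed intervals.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

public class SearchTimer {

    /** Holds a task's result together with its elapsed wall-clock time. */
    public static final class Timed<T> {
        public final T result;
        public final long elapsedMillis;

        Timed(T result, long elapsedMillis) {
            this.result = result;
            this.elapsedMillis = elapsedMillis;
        }
    }

    /** Runs the task once and reports how long it took, in milliseconds. */
    public static <T> Timed<T> time(Supplier<T> task) {
        long start = System.nanoTime();               // keep the timestamp as a long
        T result = task.get();                        // the work being measured
        long elapsedNanos = System.nanoTime() - start;
        return new Timed<>(result, TimeUnit.NANOSECONDS.toMillis(elapsedNanos));
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the real search; replace with the actual call.
        Timed<Integer> timed = time(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return 3; // pretend the search returned 3 hits
        });
        System.out.println("hits=" + timed.result + " time=" + timed.elapsedMillis + " ms");
    }
}
```

Wrapping the search in a helper like this keeps the timing code in one place, so the measured span covers only the search call itself and not the rendering of the results.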

nutch analyze

2006-07-27 Thread NG-Marketing, M.Schneider
Hello friends, after running nutch analyze I got the following message: ... Pages consumed: 26785000 (at index 26785000). Links fetched: 35411420. Finished at Thu Jul 27 14:21:42 CEST 2006 Exception in thread "main" java.lang.OutOfMemoryError Now, the job finished, but I got

mergesegs tool hangs up

2006-07-27 Thread Dima Mazmanov
Hi all! I have 15 segments with 18 urls. When I try to run the mergesegs tool, the process hangs at "Processing 8 pages" and then nothing happens. HEAP_SIZE=512 Mb. Please help! -- Regards, Dima

Embedded Docs

2006-07-27 Thread Oleg Galkin
It seems there's a bug in the Nutch fetcher. It doesn't recognize docs embedded into a page via <object> or <embed> tags. I've tested it with an embedded Flash file. The fetcher ignores it but, on the other hand, parses the same file when it is linked directly. Oleg

Re: stemming

2006-07-27 Thread Matthew Holt
In my nutch-site.xml I overrode the plugin.includes property as below: <property> <name>plugin.includes</name> <value>protocol-httpclient|urlfilter-regex|parse-(text|html|js|oo|pdf|msword|mspowerpoint|rtf|zip)|index-(basic|more)|query-(more|site|stemmer|url)|summary-basic|scoring-opic</value>

Re: stemming

2006-07-27 Thread Howie Wang
Hi, The settings look reasonable. But for testing purposes, I would get rid of the other query filters and put in some print statements in the query-stemmer to see what's happening. Howie In my nutch-site.xml I overrode the plugin.includes property as below: property

Re[2]: stemming

2006-07-27 Thread bb300
Hi, I think we should wait until Eugen can share his code. In his version of stemming everything works, and pagination is implemented too. In my opinion, the best way forward is to build on Eugen's code. I think Jerome Charron is also interested in that code, because of the highlighting of results.

Re: stemming

2006-07-27 Thread Matthew Holt
Actually, ignore my earlier posts. Thanks for your help Howie, I found a dumb mistake on my end. I had the parse-stemmer plugin activated in my local directory but not in my servlet directory. Thanks!! Matt [EMAIL PROTECTED] wrote: Hi, I think we should wait until Eugen can share his code. In

Plugin Documentation

2006-07-27 Thread Matthew Holt
Hey All, I was looking through the wiki plugin page and noticed that a number of the plugins didn't have much documentation. I was trying to find help on how to query using the query-basic plugin. If anyone can reply with the list of queries that this plugin supports, I'll update the wiki.