Re: Full fledged Lucene Query Syntax support in Nutch
Performance might be a reason, but only the queries that include wildcard or fuzzy characters would be slowed down, not all queries, right? The performance of regular plain-text searches shouldn't be affected. Any thoughts? Thanks, Ravi Chintakunta On 5/3/06, Ravish Bhagdev [EMAIL PROTECTED] wrote: The reason is performance. Allowing the above means more complex queries, which cause more delay in getting results. If you need these features, you know how to get them, but it's a tradeoff with performance. Maybe not if the number of pages is small, but it will be at large scale. -- Ravish. On 5/2/06, Ravi Chintakunta [EMAIL PROTECTED] wrote: Lucene supports fuzzy, wildcard, range, and proximity searches as listed here: http://lucene.apache.org/java/docs/queryparsersyntax.html But Nutch does not use all these capabilities. It is limited by query parsing in org.apache.nutch.analysis.NutchAnalysis and the query filters hosted in plugins. We have to modify the analyzer and add more plugins to Nutch to use Lucene's query syntax, or we have to use Lucene's Query Parser directly. I tried the second approach by modifying org.apache.nutch.searcher.IndexSearcher and that seems to work. Is there a reason that Nutch does not support the entire Lucene query syntax by default? Thanks in advance, Ravi Chintakunta
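For what it's worth, the point above (only wildcard, fuzzy, range, or proximity queries should pay the extra cost) could be sketched roughly as a pre-check that decides which parsing path a query takes. This is only an illustration under assumptions: the class QuerySyntaxCheck and the method usesAdvancedSyntax are hypothetical names, not part of Nutch or Lucene, and the check is deliberately naive (it ignores escaped characters, so a plain question ending in "?" would also be flagged).

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Nutch or Lucene.
final class QuerySyntaxCheck {

    // Matches a wildcard (* or ?), a fuzzy/proximity marker (~),
    // or a range clause such as [a TO b] or {a TO b}.
    private static final Pattern ADVANCED =
        Pattern.compile("[*?~]|[\\[{][^\\]}]*\\sTO\\s[^\\]}]*[\\]}]");

    // Returns true when the query appears to use the extended Lucene syntax.
    static boolean usesAdvancedSyntax(String query) {
        return query != null && ADVANCED.matcher(query).find();
    }

    public static void main(String[] args) {
        String[] samples = {
            "apache nutch", "nutc*", "roam~", "date:[20060101 TO 20060503]"
        };
        for (String q : samples) {
            String path = usesAdvancedSyntax(q)
                ? "extended (Lucene syntax) path"
                : "default Nutch path";
            System.out.println(q + " -> " + path);
        }
    }
}
```

With something like this in front of the searcher, plain-text queries would keep going through the existing Nutch parsing, and only the flagged ones would be handed to the more expensive path.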
Nutch ADMIN -GUI Mirror
I have hosted the bundle at the following URL. http://68.178.249.66/nutch-admin/nutch-0.8-dev_guiBundle_05_02_06.tar.gz I hope it helps. Thanks Sudhi Sudhi Seshachala http://sudhilogs.blogspot.com/
Re: GUI
It was completed just a few days ago. You can beta test it by downloading it from http://68.178.249.66/nutch-admin/nutch-0.8-dev_guiBundle_05_02_06.tar.gz It is still in the early stages, so I would not rank it as stable. Thanks Markus Franz [EMAIL PROTECTED] wrote: Hello! Are there any powerful and stable (or almost stable) administration GUIs for Nutch? Did you test them? Regards, Markus -- Danziger Weg 2 97350 Mainbernheim Germany -- +491626077635 [EMAIL PROTECTED] -- Sudhi Seshachala http://sudhilogs.blogspot.com/
Re: Nutch ADMIN -GUI Mirror
I have hosted the bundle at the following URL. http://68.178.249.66/nutch-admin/nutch-0.8-dev_guiBundle_05_02_06.tar.gz Added to the Wiki : http://wiki.apache.org/nutch/NutchAdministrationUserInterface Thanks Jérôme -- http://motrech.free.fr/ http://www.frutch.org/
Re: GUI
Hi, is there any url to see the gui without installing the Bundle? Matthias
Re: GUI
is there any url to see the gui without installing the Bundle? This static prototype gives a good overview of what is in the bundle: http://www.media-style.com/gfx/nutchadmin/index.html Jérôme -- http://motrech.free.fr/ http://www.frutch.org/
Re: GUI
I tried to upload some screenshots to JIRA but wasn't able to do so. :( But installing it just means downloading it, decompressing it, and starting bin/nutch gui /aFolder. Stefan On 04.05.2006 at 10:07, Jérôme Charron wrote: is there any url to see the gui without installing the Bundle? This static prototype gives a good overview of what is in the bundle: http://www.media-style.com/gfx/nutchadmin/index.html Jérôme -- http://motrech.free.fr/ http://www.frutch.org/
Nutch as a large scale RSS aggregator?
Email full of questions: Do some of you use Nutch's distribution capabilities to periodically aggregate a very large number of RSS feeds? I am basically still wondering whether I should build an aggregator from scratch or use the Nutch crawler/extractor on multiple nodes/machines. If I go the Nutch route, I would prefer to make most of the calls to Nutch from the Java API rather than the command line, in order to schedule and control the jobs better. (Some feeds have to be updated every 5 minutes, others every hour or once a day, and the RSS protocol also gives a date/time that should be respected.) What do you guys think? Is Nutch the right tool for this, as I think it could be? (I haven't found any existing open source large-scale RSS aggregators.) Jeremy.
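To make the scheduling requirement concrete (some feeds every 5 minutes, others hourly or daily), here is a minimal sketch of a per-feed refresh queue in plain Java. It makes no claim about how Nutch schedules fetches: FeedScheduler, Feed, add and pollDue are hypothetical names, the example URLs are placeholders, and the sketch does not handle the date/time hints that a feed itself may publish.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch for illustration; none of these names come from Nutch.
final class FeedScheduler {

    // One feed with its own refresh interval and next due time.
    static final class Feed {
        final String url;
        final Duration interval;
        Instant nextFetch;

        Feed(String url, Duration interval, Instant firstFetch) {
            if (interval.isZero() || interval.isNegative()) {
                throw new IllegalArgumentException("interval must be positive");
            }
            this.url = url;
            this.interval = interval;
            this.nextFetch = firstFetch;
        }
    }

    // Feeds ordered by next due time, earliest first.
    private final PriorityQueue<Feed> queue =
        new PriorityQueue<>((a, b) -> a.nextFetch.compareTo(b.nextFetch));

    void add(String url, Duration interval, Instant firstFetch) {
        queue.add(new Feed(url, interval, firstFetch));
    }

    // Returns the URLs that are due at 'now' and reschedules each of them
    // one interval later. The caller would hand the URLs to a fetcher.
    List<String> pollDue(Instant now) {
        List<String> due = new ArrayList<>();
        while (!queue.isEmpty() && !queue.peek().nextFetch.isAfter(now)) {
            Feed feed = queue.poll();
            due.add(feed.url);
            feed.nextFetch = now.plus(feed.interval);
            queue.add(feed);
        }
        return due;
    }

    public static void main(String[] args) {
        Instant start = Instant.now();
        FeedScheduler scheduler = new FeedScheduler();
        scheduler.add("http://example.org/fast.xml", Duration.ofMinutes(5), start);
        scheduler.add("http://example.org/daily.xml", Duration.ofDays(1), start);
        System.out.println("Due now: " + scheduler.pollDue(start));
    }
}
```

A controlling process could call pollDue on a timer and pass each batch of URLs to whatever does the fetching, whether that is a Nutch job or a home-grown fetcher.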