Re: Full fledged Lucene Query Syntax support in Nutch

2006-05-04 Thread Ravi Chintakunta

Performance might be a reason, but only the queries that include
wildcard or fuzzy operators would be slowed down, not all queries,
right? The performance of regular plain-text searches shouldn't
be affected.
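
To illustrate the point, here is a minimal sketch of routing only the "expensive" queries to a full-syntax parser. It is an assumption-laden example, not Nutch or Lucene code: the class QuerySyntaxRouter, its method names, and the two route labels are hypothetical, and the regular expression is only a rough heuristic for the wildcard, fuzzy/proximity, and range operators described in the Lucene query syntax page. It ignores backslash-escaped operators and would also flag a plain question mark.

```java
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Nutch or Lucene). */
final class QuerySyntaxRouter {
    // Heuristic: wildcard (* ?), fuzzy/proximity (~), or a range such as [a TO b] / {a TO b}.
    private static final Pattern ADVANCED =
        Pattern.compile("[*?~]|[\\[{][^\\]}]*\\sTO\\s[^\\]}]*[\\]}]");

    /** True if the query appears to use wildcard, fuzzy, proximity, or range syntax. */
    static boolean needsFullSyntax(String query) {
        return query != null && ADVANCED.matcher(query).find();
    }

    /** Returns an illustrative label for the parser that would handle the query. */
    static String route(String query) {
        return needsFullSyntax(query) ? "lucene-query-parser" : "nutch-default";
    }

    public static void main(String[] args) {
        System.out.println(route("nutc*"));
        System.out.println(route("open source search"));
    }
}
```

With a check like this, plain-text queries would keep going through the existing path, and only queries that ask for the extra syntax would pay for it.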

Any thoughts?

Thanks,
Ravi Chintakunta

On 5/3/06, Ravish Bhagdev [EMAIL PROTECTED] wrote:

The reason is performance.  Allowing the above means more complex queries, which
cause more delay in getting results.  If you need these features, you know how to
get them, but it's a tradeoff with performance.  Maybe not if the number of pages
is small, but it will be at large scale.

-- Ravish.


On 5/2/06, Ravi Chintakunta [EMAIL PROTECTED] wrote:

 Lucene supports fuzzy, wildcard, range, proximity searches as listed
 here: http://lucene.apache.org/java/docs/queryparsersyntax.html

 But Nutch does not use all these capabilities. It is limited by query
 parsing in org.apache.nutch.analysis.NutchAnalysis and the query
 filters hosted in plugins.

 We have to modify the analyzer and add more plugins to Nutch to use
 Lucene's query syntax. Or we have to directly use Lucene's Query
 Parser. I tried the second approach by modifying
 org.apache.nutch.searcher.IndexSearcher and that seems to work.

 Is there a reason that Nutch does not support the entire Lucene query
 syntax by default?

 Thanks in advance,
 Ravi Chintakunta





Nutch ADMIN -GUI Mirror

2006-05-04 Thread sudhendra seshachala
I have hosted the bundle at the following URL.
   
  http://68.178.249.66/nutch-admin/nutch-0.8-dev_guiBundle_05_02_06.tar.gz
   
  I hope it helps.
   
  Thanks
  Sudhi

   


  Sudhi Seshachala
  http://sudhilogs.blogspot.com/
   




Re: GUI

2006-05-04 Thread sudhendra seshachala
It was just completed a few days ago.
  You could beta test it by downloading it from
  http://68.178.249.66/nutch-admin/nutch-0.8-dev_guiBundle_05_02_06.tar.gz
   
  It is still in the early stages, so I would not rank it as stable.
   
  Thanks
   
   
  Markus Franz [EMAIL PROTECTED] wrote:
  Hello!

Are there any powerful and stable (or almost stable) administration GUIs
for Nutch? Did you test them?

Regards,
Markus

-- 
Danziger Weg 2
97350 Mainbernheim
Germany
--
+491626077635
[EMAIL PROTECTED]
--




  Sudhi Seshachala
  http://sudhilogs.blogspot.com/
   




Re: Nutch ADMIN -GUI Mirror

2006-05-04 Thread Jérôme Charron

I have hosted the bundle at the following URL.
  http://68.178.249.66/nutch-admin/nutch-0.8-dev_guiBundle_05_02_06.tar.gz


Added to the Wiki:
http://wiki.apache.org/nutch/NutchAdministrationUserInterface
Thanks

Jérôme

--
http://motrech.free.fr/
http://www.frutch.org/


Re: GUI

2006-05-04 Thread Matthias Jaekle

Hi,

is there any url to see the gui without installing the Bundle?

Matthias


Re: GUI

2006-05-04 Thread Jérôme Charron

is there any url to see the gui without installing the Bundle?


This static prototype gives a good overview of what is in the bundle:
http://www.media-style.com/gfx/nutchadmin/index.html

Jérôme

--
http://motrech.free.fr/
http://www.frutch.org/


Re: GUI

2006-05-04 Thread Stefan Groschupf
I tried to upload some screenshots to JIRA but wasn't able to
do so. :(
But installing it just means downloading it, decompressing it, and
starting bin/nutch gui /aFolder.


Stefan

On 04.05.2006 at 10:07, Jérôme Charron wrote:


is there any url to see the gui without installing the Bundle?


This static prototype gives a good overview of what is in the bundle:
http://www.media-style.com/gfx/nutchadmin/index.html

Jérôme

--
http://motrech.free.fr/
http://www.frutch.org/




Nutch as a large scale RSS aggregator?

2006-05-04 Thread HUYLEBROECK Jeremy RD-ILAB-SSF

Email full of questions:


Do any of you use Nutch's distributed capabilities to periodically aggregate a very large number of RSS feeds?

I am basically still wondering whether I should build an aggregator from scratch or use the Nutch crawler/extractor on multiple nodes/machines.

If I go the Nutch route, I would prefer to make most of the calls to Nutch through the Java API rather than the command line, in order to better schedule and control the jobs. (Some feeds have to be updated every 5 minutes, others every hour or once a day, and the RSS protocol also gives a date/time that should be respected.)
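
As a rough illustration of the per-feed scheduling described here, the following sketch keeps feeds in a priority queue ordered by their next due time. It is plain Java with no Nutch calls; the class FeedSchedule, its methods, and the example.org URLs are hypothetical, and the date/time hints supplied by the feeds themselves are not modeled. A real setup would hand the due URLs to whatever fetch job is used.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical per-feed refresh schedule (illustrative names, no Nutch API). */
final class FeedSchedule {
    /** One feed, its refresh interval, and the time of its next fetch. */
    static final class Entry {
        final String url;
        final Duration interval;
        Instant nextDue;

        Entry(String url, Duration interval, Instant firstDue) {
            this.url = url;
            this.interval = interval;
            this.nextDue = firstDue;
        }
    }

    // Feed with the earliest due time sits at the head of the queue.
    private final PriorityQueue<Entry> queue =
        new PriorityQueue<>(Comparator.comparing((Entry e) -> e.nextDue));

    void add(String url, Duration interval, Instant firstDue) {
        queue.add(new Entry(url, interval, firstDue));
    }

    /** Returns the URLs due at or before 'now' and reschedules each one interval later. */
    List<String> pollDue(Instant now) {
        List<String> due = new ArrayList<>();
        List<Entry> rescheduled = new ArrayList<>();
        while (!queue.isEmpty() && !queue.peek().nextDue.isAfter(now)) {
            Entry e = queue.poll();          // removed before its due time is changed
            due.add(e.url);
            e.nextDue = now.plus(e.interval);
            rescheduled.add(e);
        }
        queue.addAll(rescheduled);           // re-inserted only after the loop ends
        return due;
    }

    public static void main(String[] args) {
        Instant start = Instant.now();
        FeedSchedule schedule = new FeedSchedule();
        schedule.add("http://example.org/fast.rss", Duration.ofMinutes(5), start);
        schedule.add("http://example.org/daily.rss", Duration.ofDays(1), start);
        System.out.println(schedule.pollDue(start));
    }
}
```

A controlling process could call pollDue on a timer and submit one fetch job per batch, which keeps the 5-minute feeds and the daily feeds in the same structure.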

What do you guys think? Is Nutch the right tool for this, as I think it could be?

(I haven't found any existing open-source, large-scale RSS aggregators.)



Jeremy.




