I'm trying to get Nutch working for my large web site, but I can't find
answers to some basic questions after looking all over the Nutch site and
searching Google.

1) Why doesn't 0.7.2 let me search by "title:"? Luke shows 15 different
fields in the index, but I can only search two of them, "url:" and "site:".
Is that it?

2) How would I add an additional searchable field, such as "author:"?

3) Is there a way to search within "anchor:"?

4) Can't you do wildcard searches, like "d?g" or "t*est"?

5) Why does it seem that Nutch doesn't support Lucene's full set of query
types?

6) I'm using this mostly for site search, and I have access to the database.
Would it be better to use Lucene directly and index my database instead of
using Nutch? Is there an application built on Lucene that is better suited
to indexing a database, and that preferably outputs OpenSearch XML?
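
To make the patterns in question 4 concrete: in Lucene's query syntax, "?"
stands for exactly one character and "*" for zero or more characters. The
sketch below only emulates that matching rule with Python's standard fnmatch
module; it does not call Nutch or Lucene, and the term list and the
wildcard_match helper are made up for illustration.

```python
from fnmatch import fnmatchcase


def wildcard_match(pattern: str, term: str) -> bool:
    """Return True if term matches pattern, where '?' is exactly one
    character and '*' is any run of characters (possibly empty)."""
    # fnmatchcase is case-sensitive and anchors the match at both ends.
    return fnmatchcase(term, pattern)


# Hypothetical index terms, purely for illustration.
terms = ["dog", "dig", "drug", "test", "tallest", "toast"]

# "d?g" keeps three-letter terms that start with d and end with g.
print([t for t in terms if wildcard_match("d?g", t)])

# "t*est" keeps terms that start with t and end with est.
print([t for t in terms if wildcard_match("t*est", t)])
```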


Also, what IRC channel do you guys use?




-----Original Message-----
From: Abdelhakim Diab [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, June 27, 2006 2:53 AM
To: [email protected]; Dima Mazmanov
Subject: Re: urls list crawling

Thanks for your reply.
I solved the problem.
I am using Nutch 0.7.2.
The problem was in the filter.
Thanks very much.
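
For anyone who hits the same symptom, here is a rough sketch of how a
first-match-wins URL filter can let one seed through and drop another. It
assumes the filter is a list of "+regex" / "-regex" rules of the kind the
Nutch crawl tool reads from its URL filter file; the thread does not say
which filter or which rule was at fault, so the rules and the accepts
helper below are hypothetical.

```python
import re

# Hypothetical rules, NOT taken from the thread: accept hosts under
# apache.org, then reject everything else.
RULES = [
    r"+^http://([a-z0-9]*\.)*apache\.org/",
    r"-.",
]


def accepts(url: str, rules=RULES) -> bool:
    """Apply rules in order; the first rule whose regex matches decides.
    A '+' prefix accepts the URL, a '-' prefix rejects it."""
    for rule in rules:
        sign, pattern = rule[0], rule[1:]
        if re.search(pattern, url):
            return sign == "+"
    # No rule matched: treat the URL as rejected.
    return False


for seed in ["http://lucene.apache.org/nutch/", "http://www.spacetoon.com"]:
    print(seed, "accepted" if accepts(seed) else "rejected")
```

With rules like these, only the first seed survives, so each site in the
list needs an accept rule that matches it.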

----- Original Message -----
From: "Dima Mazmanov" <[EMAIL PROTECTED]>
To: "Abdelhakim Diab" <[email protected]>
Sent: Monday, June 26, 2006 4:24 PM
Subject: Re: urls list crawling


Hi, Abdelhakim.

Which version of Nutch are you using?
You wrote on June 26, 2006, 17:04:12:

> I want to crawl a list of sites, but when I put the URLs in the urls.txt
> file, the crawler fetches only the first URL
> and does not fetch the others.
> How can I solve this problem?
> The URLs:
> http://lucene.apache.org/nutch/
> http://www.spacetoon.com




-- 
Regards,
 Dima                          mailto:[EMAIL PROTECTED]





______________________________________
Tonal web design and hosting
http://tonalweb.com
eCommerce development & marketing





_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general