Bogdan,

In my case, I am editing
/opt/tomcat/webapps/ROOT/WEB-INF/classes/nutch-site.xml

url: works fine, as before.  Strangely, site: does not work.  I only have
one site that I'm indexing, and it's an IP address (for now, testing), so
maybe that's why?

Still can't get title: to work, or anything else...

Is there some other file I have to edit?

How can I verify what plugins are loaded?

Ben

On 6/14/06, Bogdan Kecman <[EMAIL PROTECTED]> wrote:

query-site plugin will allow search for "site" field
query-url plugin will allow search for "url" field
...

As for the search, note that the site field is
NOT tokenized, meaning that if www.aaa.com is the value
of the field you can search only for site:"www.aaa.com"
and not site:"aaa", as the whole value is one token.

The url field is tokenized, so you can search for part of
the url field. The title field is also tokenized,
so that search should work without problems.
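
To illustrate the difference, here is a minimal sketch in Python. It is not Nutch or Lucene code: the tokenize() helper is a hypothetical stand-in for a real analyzer (it just lowercases and splits on non-alphanumeric characters), and the url value is made up for the example.

```python
import re


def tokenize(value):
    # Hypothetical stand-in for an analyzer: lowercase the value and
    # split it on anything that is not a letter or a digit.
    return [t for t in re.split(r"[^a-z0-9]+", value.lower()) if t]


def matches_untokenized(field_value, query):
    # An untokenized field is indexed as a single term,
    # so only the complete value can match.
    return field_value == query


def matches_tokenized(field_value, query):
    # A tokenized field matches when every token of the query
    # is among the tokens of the field value.
    field_tokens = set(tokenize(field_value))
    query_tokens = tokenize(query)
    return bool(query_tokens) and all(t in field_tokens for t in query_tokens)


site = "www.aaa.com"
url = "http://www.aaa.com/docs/index.html"  # made-up example value

print(matches_untokenized(site, "www.aaa.com"))  # True
print(matches_untokenized(site, "aaa"))          # False
print(matches_tokenized(url, "aaa"))             # True
```

The real analyzers behave differently in the details, but the point is the same: a partial value can only hit a field whose value was split into tokens at index time.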

As for the XML you added, make sure that you changed the right
file (e.g.
/usr/local/apache-tomcat-5.5.12/webapps/ROOT/WEB-INF/classes/nutch-site.xml).
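
One rough way to sanity-check the plugin.includes value in that file is to read it back and test plugin ids against it. The sketch below is Python, not part of Nutch. It assumes the value is a regular expression that must match a whole plugin id, the `<configuration>` root element in the sample is an assumption, and the candidate ids are only examples. It shows what the pattern would admit, not what the running webapp actually loaded.

```python
import re
import xml.etree.ElementTree as ET


def read_plugin_includes(xml_text):
    """Return the plugin.includes value from a Nutch-style config, or None."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == "plugin.includes":
            return prop.findtext("value", "").strip()
    return None


def included(plugin_ids, pattern):
    # Assumption: a plugin is included only when its whole id matches.
    regex = re.compile(pattern)
    return [pid for pid in plugin_ids if regex.fullmatch(pid)]


# Sample config text; the root element name is an assumption.
SAMPLE = """<configuration>
<property>
  <name>plugin.includes</name>
  <value>nutch-extensionpoints|protocol-http|urlfilter-regex|parse-(text|html)|index-(basic|more)|query-(basic|more|site|url)</value>
</property>
</configuration>"""

pattern = read_plugin_includes(SAMPLE)
# Example ids only; "parse-pdf" is there to show a non-matching id.
candidates = ["query-basic", "query-site", "query-url",
              "query-more", "index-basic", "parse-pdf"]
print(included(candidates, pattern))
```

To check a real file, pass its contents instead of SAMPLE, for example read_plugin_includes(open(path).read()), where path is wherever your nutch-site.xml lives.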

bogdan

> -----Original Message-----
> From: Benjamin Higgins [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, June 14, 2006 3:40 AM
> To: [email protected]
> Subject: Re: Confused about searchable fields
>
> I added this:
>
> <property>
>   <name>plugin.includes</name>
>   <value>nutch-extensionpoints|protocol-http|urlfilter-regex|parse-(text|html)|index-(basic|more)|query-(basic|more|site|url)</value>
> </property>
>
> But still it seems that I can't get other fields to work!
> What the heck?
>
> Ben
>
> On 6/7/06, Bogdan Kecman <[EMAIL PROTECTED]> wrote:
> >
> > Have you turned on the query-* plugins in the container (Tomcat or
> > whatever you use)?
> > I usually unpack the .war file and install the files directly into the
> > container, then edit
> > ..../whereyouinstalledthemincontainer/WEB-INF/classes/nutch-site.xml
> > By default (at least in my version of nutch.war) the query-* plugins
> > were not included in the list of plugins, and adding them solved the
> > problem.
> >
> > Bogdan
> >
> > P.S. Note that Luke will search through ALL fields in the Lucene DB.
> > Nutch will search only those fields that have an appropriate plugin (in
> > every plugin.xml you will find which fields the plugin will search
> > through).
> >
> > > -----Original Message-----
> > > From: Benjamin Higgins [mailto:[EMAIL PROTECTED]
> > > Sent: Tuesday, June 06, 2006 8:49 PM
> > > To: [email protected]
> > > Subject: Confused about searchable fields
> > >
> > > Hi all,
> > >
> > > I bet this is easy to answer, but I can't seem to figure it out.
> > >
> > > Going over the source, it looks like there are several fields that
> > > are available in the index by default, like "site", "title", "url".
> > >
> > > I see that index-basic and index-more plugins are used to populate
> > > these fields.  I see that query-basic, query-more, query-site, and
> > > query-url are used to query these fields.
> > >
> > > But when I do a query (with search.jsp), the only field that appears
> > > to work is "url".
> > >
> > > For example, searching for "url:xyz" successfully returns documents
> > > that have "xyz" in the url.
> > >
> > > But searching for "title:xyz" returns nothing, even if I have plenty
> > > of documents with "xyz" in the title.
> > >
> > > If I run this query with Luke on a segment of my index, it runs
> > > fine!  So apparently I'm not understanding how to wire up what is
> > > already indexed and searchable.
> > >
> > > Any tips appreciated.
> > >
> >
> >
>


_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
