OK, so the query will be something like:

type:movie exorcist

The default Nutch search is AND, so it will look only at pages of that
type and retrieve the ones that match "exorcist".
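
To make the AND behaviour concrete, here is a toy sketch in plain Java. It
does not use the Nutch or Lucene APIs: the class AndQuerySketch, the doc()
helper and the two-field documents are made up for illustration, and the
matching rule is only my assumption about how a type: clause combines with
a bare term.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Toy model only: NOT the Nutch query engine, just the AND idea.
class AndQuerySketch {

    // A stand-in for an indexed page: field name -> text.
    static Map<String, String> doc(String type, String content) {
        Map<String, String> d = new HashMap<>();
        d.put("type", type);
        d.put("content", content);
        return d;
    }

    // Returns true only if EVERY clause of the query matches (AND semantics).
    // A "field:value" clause must equal that field's value;
    // a bare term must occur as a word in the "content" field.
    static boolean matches(Map<String, String> d, String query) {
        for (String clause : query.trim().split("\\s+")) {
            int colon = clause.indexOf(':');
            if (colon > 0) {
                String field = clause.substring(0, colon);
                String value = clause.substring(colon + 1);
                if (!value.equalsIgnoreCase(d.get(field))) {
                    return false;
                }
            } else {
                String content = d.getOrDefault("content", "").toLowerCase();
                if (!Arrays.asList(content.split("\\W+")).contains(clause.toLowerCase())) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String query = "type:movie exorcist";
        System.out.println(matches(doc("movie", "The Exorcist review"), query));
        System.out.println(matches(doc("music", "Exorcist soundtrack"), query));
        System.out.println(matches(doc("movie", "Jaws review"), query));
    }
}
```

Only the first page satisfies both clauses. The value after type: has to be
the same string that was stored in the field at index time.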

Will try it out.
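
On the indexing side, the check on the URL could be as simple as the
following. Again a sketch only: SiteTypeSketch, typeFor() and the
example.com hosts are hypothetical names, and in a real indexing plugin the
returned string would be the value added to the document as the type field.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: maps a page URL to the value of the "type" field.
class SiteTypeSketch {

    // Placeholder host lists; replace with the sites actually crawled.
    static final Set<String> MOVIE_HOSTS =
            new HashSet<>(Arrays.asList("movies.example.com"));
    static final Set<String> MUSIC_HOSTS =
            new HashSet<>(Arrays.asList("songs.example.com"));

    static String typeFor(String url) {
        String host = null;
        if (url != null) {
            try {
                host = URI.create(url).getHost();
            } catch (IllegalArgumentException e) {
                host = null; // malformed URL: fall through to the default
            }
        }
        if (host != null) {
            host = host.toLowerCase();
            if (MOVIE_HOSTS.contains(host)) {
                return "movie";
            }
            if (MUSIC_HOSTS.contains(host)) {
                return "music";
            }
        }
        // Every document gets a value, so the field is never missing.
        return "miscellaneous";
    }

    public static void main(String[] args) {
        System.out.println(typeFor("http://movies.example.com/exorcist.html"));
        System.out.println(typeFor("http://songs.example.com/track1.html"));
        System.out.println(typeFor("http://other.example.org/"));
    }
}
```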


Rgds.
Prabhu


On 1/9/06, Saravanaraj Duraisamy <[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> It is like adding another field on which the index can be queried.
> As you said, you could have two different index folders, one for 'movie' and
> another for 'songs'.
>
> I think there is no need for two different indexes. Just index those
> two sites in a single index, but add another field to the index that
> differentiates the data from the two sites even though they are in the same
> index.
>
> You can do this by adding a few lines to the plugin you are going to
> write or modify. Check the URL and set the field value accordingly (the
> field name can be 'siteType') to 'movie' or 'songs'.
>
> I think that is what Mr. Wang was suggesting.
>
> Rgds,
> D Saravanaraj
>
> On 1/9/06, Raghavendra Prabhu <[EMAIL PROTECTED]> wrote:
> >
> > Hi Wang
> >
> > But I thought that when you include a query plugin for a field called
> >
> > type:
> >
> > it will search for content only in that field.
> >
> > So are you asking me to make all the content a subset of this one?
> >
> > For example, query-url will basically search only in the url field of the
> > documents.
> >
> > So how can this be a solution?
> >
> >
> >
> > Rgds
> > Prabhu
> >
> >
> > On 1/9/06, Howie Wang <[EMAIL PROTECTED]> wrote:
> > >
> > > To do what I mentioned, you basically have to write two plugins,
> > > an IndexFilter plugin and a QueryFilter plugin. I think this page has
> > > some info on writing plugins:
> > >
> > > http://wiki.apache.org/nutch/WritingPlugins
> > >
> > > It will probably be easiest if you copy the src/plugins/index-basic
> > > directory, and just change all the build files and filenames as needed.
> > > If you look at the BasicIndexingFilter.java file, you'll see that the
> > > modifications needed aren't bad at all. There are a whole bunch of
> > > lines that do something like:
> > >
> > >    doc.add(Field.Text("myfield", "somevalue"));
> > >
> > > You should figure out if the url is from a movie page and then
> > > add your field:
> > >
> > >    if (isFromMovieSite(url)) {
> > >        doc.add(Field.Text("type", "movies"));
> > >    } else if (isFromMusicSite(url)) {
> > >        doc.add(Field.Text("type", "music"));
> > >    } else {
> > >        // Need to make sure all docs have the field,
> > >        // otherwise it will crash when you search
> > >        doc.add(Field.Text("type", "miscellaneous"));
> > >    }
> > >
> > > Doing the query filter is even easier: just copy the
> > > src/plugins/query-site directory and change filenames and build files
> > > as needed. Then change the line that says:
> > >
> > >    super("site");
> > >
> > > to:
> > >
> > >    super("type");
> > >
> > > That's pretty much it. You'll have to edit your conf/nutch-*.xml files
> > > to include your new plugins.
> > >
> > >
> > > >Can you explain what exactly you have in mind?
> > > >
> > > >Say that I have fetched sites in the movie category (from a list of
> > > >websites that I have) and have also fetched sites for songs.
> > > >How do I specifically add a field to the first set of pages (i.e. the
> > > >movies) and a separate field to the second (i.e. the songs)?
> > > >
> > > >And for field search, how can I search by this field?
> > > >
> > > >How will Nutch understand this query?
> > > >newfield:uniquename
> > > >
> > > >I thought you needed to create a query plugin for each field you create
> > > >(like query-url).
> > > >
> > > >I still did not get what you meant. If you can explain it clearly, it
> > > >will be helpful.
> > > >
> > > >Thanks.
> > > >Raghavendra Prabhu R
> > >
> > >
> > >
> >
> >
>
>
