Okay, so the query will be something like:

type:movie exorcist

The default Nutch search is AND, so it will look only at those pages and
retrieve the ones matching "exorcist".

I will try it out.

Rgds,
Prabhu

On 1/9/06, Saravanaraj Duraisamy <[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> It is like you are adding another field on which the index can be queried.
> As you said, you may have two different index folders, one for 'movie' and
> another for 'songs'.
>
> I think there is no need to have two different indexes. Just index those
> two sites in a single index, but add another field to the index which
> differentiates the data from the 2 sites even though they are in the same
> index.
>
> You can do this by adding certain lines in the plugin you are going to
> write or modify. Check the url and accordingly add the field value (the
> field name can be 'siteType') as 'songs' or 'music'.
>
> I hope Mr. Wang was mentioning that.
>
> Rgds,
> D Saravanaraj
>
> On 1/9/06, Raghavendra Prabhu <[EMAIL PROTECTED]> wrote:
> >
> > Hi Wang
> >
> > But I thought when you include a query plugin and you have a field
> > called
> >
> > type:
> >
> > it will search content only in that field.
> >
> > So are you asking me to make all the content a subset of this one?
> >
> > For example, query-url will basically search in the url field in the
> > documents.
> >
> > So how can this be a solution?
> >
> > Rgds
> > Prabhu
> >
> > On 1/9/06, Howie Wang <[EMAIL PROTECTED]> wrote:
> > >
> > > To do what I mentioned, you basically have to write two plugins,
> > > an IndexFilter plugin and a QueryFilter plugin. I think this page has
> > > some info on writing plugins:
> > >
> > > http://wiki.apache.org/nutch/WritingPlugins
> > >
> > > It will probably be easiest if you copy the src/plugins/index-basic
> > > directory, and just change all the build files and filenames as
> > > needed. If you look at the BasicIndexingFilter.java file, you'll see
> > > that the modifications needed aren't bad at all. There are a whole
> > > bunch of lines that do something like:
> > >
> > >     doc.add(Field.Text("myfield", "somevalue"));
> > >
> > > You should figure out if the url is from a movie page and then
> > > add your field:
> > >
> > >     if (isFromMovieSite(url)) {
> > >         doc.add(Field.Text("type", "movies"));
> > >     } else if (isFromMusicSite(url)) {
> > >         doc.add(Field.Text("type", "music"));
> > >     } else {
> > >         // Need to make sure all docs have the field,
> > >         // otherwise it will crash when you search
> > >         doc.add(Field.Text("type", "miscellaneous"));
> > >     }
> > >
> > > Doing the query filter is even easier: just copy the
> > > src/plugins/query-site directory and change filenames and build
> > > files as needed. Then change the line that says:
> > >
> > >     super("site");
> > >
> > > to:
> > >
> > >     super("type");
> > >
> > > That's pretty much it. You'll have to edit your conf/nutch-*.xml files
> > > to include your new plugins.
> > >
> > > >Can you explain what exactly you have in mind?
> > > >
> > > >Say that I have fetched sites under the movie category (a list of
> > > >websites which I have). How do I add a field to it, given that I
> > > >have also fetched sites for songs?
> > > >How do I specifically add a field to the first set of pages (i.e.
> > > >that of movies) and a separate field to the second (i.e. that of
> > > >songs)?
> > > >
> > > >And field search: how can I search by this field?
> > > >
> > > >How will Nutch understand this query?
> > > >newfield:uniquename
> > > >
> > > >I thought you needed to create a query plugin for each field you
> > > >create (like query-url).
> > > >
> > > >I still did not get what you meant. If you can clearly mention it,
> > > >it will be helpful.
> > > >
> > > >Thanks.
> > > >Raghavendra Prabhu R
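
For anyone following the thread, the url check that Howie leaves open (isFromMovieSite / isFromMusicSite) could be sketched roughly as below. This is only an illustration, not code from Nutch: the class name SiteTypeClassifier, the siteType helper, and the two example.com hosts are all made up, and the real host lists would come from your own seed lists. It uses only the JDK, so it can be tried outside of a plugin; inside the indexing filter you would pass the returned value to the doc.add(...) call for the "type" field.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Nutch: maps a page url to the value
// that would be stored in the "type" field.
class SiteTypeClassifier {

    // Assumption: placeholder hosts; replace with the sites you actually crawl.
    private static final Set<String> MOVIE_HOSTS = Set.of("movies.example.com");
    private static final Set<String> MUSIC_HOSTS = Set.of("music.example.com");

    // Returns the lower-cased host of the url, or "" if it cannot be parsed.
    static String hostOf(String url) {
        try {
            String host = new URI(url).getHost();
            return host == null ? "" : host.toLowerCase(Locale.ROOT);
        } catch (URISyntaxException e) {
            return "";
        }
    }

    static boolean isFromMovieSite(String url) {
        return MOVIE_HOSTS.contains(hostOf(url));
    }

    static boolean isFromMusicSite(String url) {
        return MUSIC_HOSTS.contains(hostOf(url));
    }

    // Every url gets some value, so that all documents carry the field.
    static String siteType(String url) {
        if (isFromMovieSite(url)) {
            return "movies";
        } else if (isFromMusicSite(url)) {
            return "music";
        }
        return "miscellaneous";
    }

    public static void main(String[] args) {
        System.out.println(siteType("http://movies.example.com/exorcist.html"));
        System.out.println(siteType("http://music.example.com/songs/1"));
        System.out.println(siteType("http://other.example.org/"));
    }
}
```

Note that the value searched for has to match the value indexed. With the values above, the query would presumably be type:movies rather than type:movie.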
