Hi,

It is like adding another field on which the index can be queried. As you said, you could have two different index folders, one for 'movie' and another for 'songs'.
I think there is no need to have two different indexes. Just index the two sites in a single index, but add another field to the index that differentiates the data from the two sites even though they are in the same index. You can do this by adding a few lines to the plugin you are going to write or modify: check the URL and set the field value accordingly (the field name can be 'siteType') to 'movies' or 'songs'. I hope that is what Mr. Wang was suggesting.

Rgds,
D Saravanaraj

On 1/9/06, Raghavendra Prabhu <[EMAIL PROTECTED]> wrote:
>
> Hi Wang
>
> But i thought when you include a query-plugin and you have a field called
> type:
> It will search content only in that field
>
> So You are asking me to make all the content a subset of this one .Is it ?
>
> For example -query-url will basically search in url field in the documents
>
> So how can this be a solution.
>
> Rgds
> Prabhu
>
>
> On 1/9/06, Howie Wang <[EMAIL PROTECTED]> wrote:
> >
> > To do what I mentioned, you basically have to write two plugins,
> > an IndexFilter plugin and a QueryFilter plugin. I think this page has
> > some info on writing plugins:
> >
> > http://wiki.apache.org/nutch/WritingPlugins
> >
> > It will probably be easiest if you copy the src/plugins/index-basic
> > directory, and just change all the build files and filenames as needed.
> > If you look at BasicIndexingFilter.java file, you'll see that the
> > modifications needed aren't bad at all. There are a whole bunch of
> > lines that do something like:
> >
> >     doc.add(Field.Text("myfield", "somevalue"));
> >
> > You should figure out if the url is from a movie page and then
> > add your field:
> >
> >     if (isFromMovieSite(url)) {
> >         doc.add(Field.Text("type", "movies"));
> >     } else if (isFromMusicSite(url)) {
> >         doc.add(Field.Text("type", "music"));
> >     } else {
> >         // Need to make sure all docs have the field,
> >         // Otherwise it will crash when you search
> >         doc.add(Field.Text("type", "miscellaneous"));
> >     }
> >
> > Doing the query filter is even easier, just copy the
> > src/plugins/query-site directory, change filenames and build files
> > as needed. And change the line that says:
> >
> >     super("site");
> >
> > to:
> >
> >     super("type");
> >
> > That's pretty much it. You'll have to edit your conf/nutch-*.xml files
> > to include your new plugins.
> >
> >
> > > Can you explain what exactly you have in mind
> > >
> > > Say that i have fetched sites under movie category (a list of websites
> > > which i have ),how do i add
> > > a field to it and have fetched sites for songs.
> > > How do i specifically add a field to first set of pages (ie that of
> > > movies)
> > > and a separate field to the second (ie that of songs)
> > >
> > > And field search ,How can i search by this field
> > >
> > > How will nutch understand this query
> > > newfield:uniquename
> > >
> > > I thought you needed to create a query-plugin for each field u create .
> > > (like query-url)
> > >
> > > I still did not get what u meant .If you can clearly mention ,it will
> > > be helpful
> > >
> > > Thanks .
> > > Raghavendra Prabhu R
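
P.S. In case it helps, here is a rough sketch of the "check the URL and pick a field value" step in plain Java. It is not Nutch code and does not touch the index. The class name SiteTypeClassifier, the method siteTypeFor, and the example.com hosts are all made up for illustration, so substitute the hosts you actually crawl. The fallback value "miscellaneous" follows the suggestion in the quoted mail that every document should carry the field.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: maps a page URL to the value stored in the extra index field.
final class SiteTypeClassifier {

    // Value used when the URL matches none of the known sites, so that
    // every document still gets a value for the field.
    static final String DEFAULT_TYPE = "miscellaneous";

    // Assumed host-to-type table; these hosts are placeholders.
    private static final Map<String, String> TYPE_BY_HOST = new LinkedHashMap<>();
    static {
        TYPE_BY_HOST.put("movies.example.com", "movies");
        TYPE_BY_HOST.put("songs.example.com", "songs");
    }

    private SiteTypeClassifier() {
    }

    // Returns "movies", "songs", or DEFAULT_TYPE for the given URL.
    static String siteTypeFor(String url) {
        if (url == null) {
            return DEFAULT_TYPE;
        }
        String host;
        try {
            host = new URI(url.trim()).getHost();
        } catch (URISyntaxException e) {
            // Unparseable URLs fall back to the default type.
            return DEFAULT_TYPE;
        }
        if (host == null) {
            return DEFAULT_TYPE;
        }
        host = host.toLowerCase(Locale.ROOT);
        for (Map.Entry<String, String> entry : TYPE_BY_HOST.entrySet()) {
            String known = entry.getKey();
            // Match the host itself or any subdomain of it (e.g. www.).
            if (host.equals(known) || host.endsWith("." + known)) {
                return entry.getValue();
            }
        }
        return DEFAULT_TYPE;
    }
}
```

Inside the indexing filter you would pass the returned string as the value of the new field, in place of the hard-coded "somevalue" in the existing doc.add(...) lines.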
