I am facing the same problem. I don't know how to delete the index entry for a particular URL that has been crawled. I need to eliminate the porn URLs from my search engine.
I already have the crawled data, and now I need to find the index entries for the porn URLs. Please help me with this.

With thanks,
Franklin.S

Ratnesh,V2Solutions India wrote:
>
> No,
> I don't think we have to do something like that, because if I remove
> them I won't be able to index my own file, which is what I am crawling for.
>
> But I will surely check, as at the moment I am not very sure.
> Can you tell me about your whereabouts?
>
> Thanks
> Ratnesh, V2Solutions, India
>
> Siddharth Jonathan wrote:
>>
>> Hmmm... I haven't had to do this, but my guess would be to remove the
>> corresponding plugin entries from the nutch-default.xml file.
>> There is a plugin include property in that file which includes the
>> default indexing filters (index-basic, index-more, etc.)
>> and the query filter plugins (query-basic, query-more, etc.). Try removing
>> those. That might keep them from getting used.
>>
>> Jonathan
>>
>> On 4/2/07, Ratnesh,V2Solutions India wrote:
>>>
>>> Exactly, of course.
>>>
>>> That is just what I want. Do you have any solution for this?
>>>
>>> Looking forward to your reply.
>>>
>>> Thanks
>>>
>>> Siddharth Jonathan wrote:
>>> >
>>> > Do you mean how do you get rid of some of the fields that are indexed
>>> > by default? E.g. content, anchor text, etc.
>>> >
>>> > Jonathan
>>> >
>>> > On 4/2/07, Ratnesh,V2Solutions India wrote:
>>> >>
>>> >> Hi,
>>> >> I have written a plugin which finds the number of Object tags in an
>>> >> HTML page and the corresponding URLs.
>>> >> I am storing "objects" as fields and the page URL as values.
>>> >>
>>> >> In the end I am interested in searching only on the "objects"
>>> >> indexed fields, not on those which are already stored as indexed fields.
>>> >>
>>> >> So how shall I delete those index fields which are already stored?
>>> >>
>>> >> Looking forward to your reply (valuable inputs).
>>> >>
>>> >> Thanks to the Nutch community

--
View this message in context: http://www.nabble.com/How-to-delete-already-stored-indexed-fields----tf3504164.html#a10099074
Sent from the Nutch - User mailing list archive at Nabble.com.
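
Jonathan's suggestion in the thread comes down to editing the value of the plugin include property so that the unwanted plugin ids no longer appear in it. Below is a minimal Java sketch of that edit, assuming the value is a regular expression whose alternatives are separated by `|`. The sample value, the class name `PluginIncludes` and the helper `removePlugins` are illustrative assumptions. They are not taken from this thread or from the Nutch code base, and nobody in the thread confirmed that this solves the problem.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class PluginIncludes {

    // Split a regex alternation on '|' characters that are not nested
    // inside parentheses, so "parse-(text|html)" stays in one piece.
    static List<String> splitTopLevel(String value) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0;
        for (char c : value.toCharArray()) {
            if (c == '(') depth++;
            if (c == ')') depth--;
            if (c == '|' && depth == 0) {
                parts.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        parts.add(current.toString());
        return parts;
    }

    // Hypothetical helper: drop every top-level alternative that exactly
    // equals one of the given plugin ids. Grouped entries such as
    // "index-(basic|more)" are only removed when passed in whole.
    static String removePlugins(String value, String... ids) {
        Set<String> drop = new HashSet<>(Arrays.asList(ids));
        List<String> kept = new ArrayList<>();
        for (String part : splitTopLevel(value)) {
            if (!drop.contains(part)) {
                kept.add(part);
            }
        }
        return String.join("|", kept);
    }

    public static void main(String[] args) {
        // Illustrative value only; check the real one in your own configuration.
        String value =
            "protocol-http|parse-(text|html)|index-basic|index-more|query-basic|query-more";
        // Prints: protocol-http|parse-(text|html)|index-basic|query-basic
        System.out.println(removePlugins(value, "index-more", "query-more"));
    }
}
```

The resulting string would then be put back as the value of the property. As Ratnesh points out, the id of your own indexing plugin has to stay in the list, otherwise your custom fields would not be indexed either. This only affects which plugins run from then on, so an index that was already built would presumably need to be rebuilt.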
