Hi Alex,

Yes, I have read that one. It led me to return zero content for the page (so the URL looks like an empty HTML page), but I couldn't get that URL treated as "never-downloaded".

Dinçer

2011/9/5 alex <[email protected]>

> On 09/04/2011 10:22 AM, Dinçer Kavraal wrote:
>
>> Hi,
>>
>> Is it possible to reject a page to be indexed in parse operation? I even
>> don't want it to be indexed as a no-content page without any text
>> information inside.
>>
>> There is a situation that I cannot understand whether I should inject or
>> not from the URL itself. I need check the content. When I match a, say,
>> keyword in the page, I want to avoid the page in render phase.
>>
>> Do you have any ideas?
>>
>> Thanks
>> Dincer
>>
>
> have you read this:
> http://wiki.apache.org/nutch/WritingPluginExample-0.9 ?
>
> I guess, you need parsefilter and/or indexingfilter...

