Hi Alessio,

You need to determine which field contains the unwanted content. Once you have done that, you could write an indexing filter to remove it from your document prior to indexing.
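
As a rough illustration of the removal step only, here is a minimal Java sketch. It assumes the menu text is identical on every page and that you already have the field value as a plain string. `BoilerplateStripper`, `MENU`, and `strip` are hypothetical names, not part of Nutch, and the wiring into an actual Nutch indexing filter plugin is not shown.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Nutch: removes a fixed navigation string
// from extracted page text before that text is handed to the indexer.
final class BoilerplateStripper {

    // Assumption: the unwanted menu text is the same on every page.
    static final String MENU =
        "Home NEWSLOT/VLT SCOMMESSE ONLINE LOTTERIE Politica Video Live Score";

    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    // Returns the text with every occurrence of the boilerplate removed and
    // runs of whitespace collapsed to a single space.
    static String strip(String text, String boilerplate) {
        if (text == null || text.isEmpty()) {
            return "";
        }
        // Normalize whitespace first so line breaks inside the menu
        // do not prevent a match.
        String normalized = WHITESPACE.matcher(text).replaceAll(" ").trim();
        if (boilerplate == null || boilerplate.isEmpty()) {
            return normalized;
        }
        String cleaned = normalized.replace(boilerplate, " ");
        return WHITESPACE.matcher(cleaned).replaceAll(" ").trim();
    }
}
```

Inside a filter you would call something like `BoilerplateStripper.strip(value, BoilerplateStripper.MENU)` on whichever field turns out to hold the menu text, and index the returned string in its place.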
Lewis

On Thu, Apr 5, 2012 at 9:41 PM, alessio crisantemi <[email protected]> wrote:

> ---------- Forwarded message ----------
> From: alessio crisantemi <[email protected]>
> Date: 5 April 2012 22:32
> Subject: request about snippets
> To: [email protected]
>
> Dear all,
> I configured my Nutch (1.4) to work with Solr (1.4.1), and I crawl and
> index my website successfully.
>
> My only problem is with my search results.
> In every result, the snippet contains a line with a string listing
> all the categories of my website. I attached a screenshot to explain:
> there, the unwanted line is "Mercoledì Apr 04 parent"> Home NEWSLOT/VLT
> SCOMMESSE ONLINE LOTTERIE Politica Video Live Score ")
>
> This is a problem because Solr reads the same line for every page, so when
> my query matches a word in that line (e.g. 'ONLINE'), I get my whole Solr
> index as the result.
>
> How can I skip this line during my crawl? Is it possible to exclude it?
> Thank you in advance,
> alessio

--
*Lewis*

