I can't see any of your attachments, as they're not permitted on the list.

Can you provide a URL?
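
In the meantime, the cleanup step that an indexing filter would perform can be sketched outside Nutch. This is only an illustration in Python, not a Nutch plugin (a real indexing filter would be written in Java against Nutch's plugin API). The names `NAV_ROW` and `strip_nav_row` are hypothetical, and the pattern assumes the unwanted row always looks like the example quoted in this thread: a day name, a three-letter month, a two-digit day, the word "parent", and then the fixed category list.

```python
import re

# Hypothetical pattern for the repeated navigation row described in the thread.
# Assumption: the date looks like "Mercoledì Apr 04" and the category list is fixed.
NAV_ROW = re.compile(
    r'\S+ \w{3} \d{2} parent"?>?\s*'
    r'Home NEWSLOT/VLT SCOMMESSE ONLINE LOTTERIE Politica Video Live Score'
)


def strip_nav_row(content: str) -> str:
    """Remove the navigation row from extracted page text, then tidy whitespace."""
    cleaned = NAV_ROW.sub(" ", content)
    # Collapse runs of whitespace left behind by the removal.
    return re.sub(r"\s+", " ", cleaned).strip()


if __name__ == "__main__":
    sample = (
        'Mercoledì Apr 04 parent"> Home NEWSLOT/VLT SCOMMESSE ONLINE '
        "LOTTERIE Politica Video Live Score Breaking news about lotteries"
    )
    print(strip_nav_row(sample))
```

If the row turns out to vary between pages, the pattern would need to be loosened, which is another reason to first confirm exactly what ends up in the content field.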

On Thu, Apr 5, 2012 at 9:56 PM, alessio crisantemi <
[email protected]> wrote:

> Dear Lewis, thank you for your fast reply.
> But that is exactly my problem! I don't understand which field creates
> this row.
>
> I see a date (e.g. "Mercoledì Apr 04") followed by the word "parent",
> and then, after ">", the names of the categories ("Home NEWSLOT/VLT SCOMMESSE
> ONLINE LOTTERIE Politica Video Live Score").
>
> Do you know which field of the default Nutch configuration generates the
> 'parent' row?
>
> As you can see in the attachment, this row is in the content field,
> between 'str' tags.
> ..
> Any suggestions?
> Thanks,
> a.
>
> On 5 April 2012 at 22:45, Lewis John Mcgibbney <
> [email protected]> wrote:
>
> > Hi Alessio,
> >
> > You need to determine in which field the unwanted content exists. Once
> > you've done this you could write an indexing filter to remove this from
> > your document prior to indexing.
> >
> > Lewis
> >
> > On Thu, Apr 5, 2012 at 9:41 PM, alessio crisantemi <
> > [email protected]> wrote:
> >
> > >
> > >
> > > ---------- Forwarded message ----------
> > > From: alessio crisantemi <[email protected]>
> > > Date: 5 April 2012 22:32
> > > Subject: request about snippets
> > > To: [email protected]
> > >
> > >
> > > Dear all,
> > > I configured Nutch (1.4) to work with Solr (1.4.1), and I can crawl and
> > > index my website successfully.
> > >
> > > I have only one problem, with my search results.
> > > In every result, the snippet contains a row with a string listing
> > > all the categories of my website. I attached a screenshot to illustrate:
> > > there, the unwanted row is "Mercoledì Apr 04 parent"> Home NEWSLOT/VLT
> > > SCOMMESSE ONLINE LOTTERIE Politica Video Live Score".
> > >
> > > This is a problem because Solr reads the same row on every page, so when
> > > my query matches a word in this row (e.g. 'ONLINE'), every document in
> > > my Solr index is returned as a result.
> > >
> > > How can I skip this row during crawling? Is it possible to exclude this
> > > row?
> > > Thank you in advance,
> > > alessio
> > >
> > >
> >
> >
> > --
> > *Lewis*
> >
>



-- 
*Lewis*
