Hi Sebastian,



> (d) anchor is filled by index-anchor. Is the index-anchor plugin active?

Could you please tell me how I can check whether the index-anchor plugin is
active? I do see it listed under plugin.includes as index - basic|anchor|more.
Is there any other way to check if it is active?
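
A rough way to sanity-check a plugin.includes value offline is sketched
below. It assumes the value is treated as a regular expression that must match
the whole plugin id (my understanding of how Nutch filters plugins, not
something confirmed in this thread). The helper name and the sample values are
hypothetical and not copied from a real nutch-site.xml.

```python
import re


def plugin_is_included(plugin_id: str, includes_regex: str) -> bool:
    """Return True if plugin_id matches the entire includes pattern.

    Assumption: the pattern is applied as a whole-string match against
    the plugin id, so a partial match does not count.
    """
    return re.fullmatch(includes_regex, plugin_id) is not None


# Hypothetical plugin.includes value with the index plugins grouped.
grouped = r"protocol-http|urlfilter-regex|parse-(html|tika)|index-(basic|anchor|more)"

# The same index entry written without the grouping parentheses.
ungrouped = r"index-basic|anchor|more"

print(plugin_is_included("index-anchor", grouped))
print(plugin_is_included("index-anchor", ungrouped))
```

With the grouped form the id "index-anchor" matches. Without the parentheses
the alternatives are "index-basic", "anchor" and "more", so "index-anchor"
would not match. It may be worth confirming that the parentheses are present
in the actual configuration file.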

> There will also be no anchors if there are no inlinks at all, or no
> inlinks from different hosts if the property "db.ignore.internal.links" is
> true (default).

I have "db.ignore.internal.links" set to false.

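
To make the rule quoted above concrete, here is a small conceptual sketch of
how ignoring internal links removes anchors that come from the same host. It
is only an illustration of the rule, not Nutch's actual code. The function
name, the flag name and the example URLs are all hypothetical.

```python
from urllib.parse import urlsplit


def anchors_for(target_url, inlinks, ignore_internal_links=True):
    """Collect anchor texts for target_url from (source_url, anchor) pairs.

    Assumption: a link is "internal" when source and target share the
    same host name. Internal links are skipped when the flag is True.
    """
    target_host = urlsplit(target_url).hostname
    kept = []
    for source_url, anchor in inlinks:
        same_host = urlsplit(source_url).hostname == target_host
        if ignore_internal_links and same_host:
            continue  # dropped: inlink comes from the same host
        kept.append(anchor)
    return kept


# Hypothetical inlinks: one from the same host, one from another host.
inlinks = [
    ("http://example.com/index.html", "home page link"),
    ("http://other.example.org/a.html", "external link"),
]

print(anchors_for("http://example.com/page.html", inlinks, True))
print(anchors_for("http://example.com/page.html", inlinks, False))
```

Under this reading, a page that is linked only from its own host has no
anchors while the flag is true, and keeps them once it is set to false.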
Thanks for your help!



On Mon, Apr 21, 2014 at 4:09 PM, Sebastian Nagel <[email protected]> wrote:

> > From what you mentioned, if I understand it correctly - I am going to take
> > "anchor" as an example as I don't see that being captured in Solr index
> > data -
> >
> > (a) "anchor" *is* defined in Solr - schema.xml
> > (b) "anchor" *is NOT* defined in gora-hbase-mapping.xml
> > (c) "anchor" *is NOT* defined in solrindex-mapping.xml
> >
> > So, my question would be - why is "anchor" not seen in Solr index data
> > ...
>
> (d) anchor is filled by index-anchor.
>
> Is the index-anchor plugin active?
> There will also be no anchors if there are no inlinks at all, or
> no inlinks from different hosts if the property "db.ignore.internal.links"
> is true (default).
>
> Sebastian
>
>
>
>
> On 04/21/2014 04:38 PM, A Laxmi wrote:
> > Hi Sebastian,
> >
> >> 2. From gora-hbase-mapping.xml - with ref to Link[0] above, '*url*' will be
> >> '*baseUrl*' in hbase (data Nutch stores). What will be '*site*'?
> >
> > *Which fields are indexed does not depend on the storage mapping. Fields
> > are filled by indexing filters, some "basic" fields also by the indexer
> > itself. Which fields are sent to the indexing back-ends is defined in
> > schema and mappings of the back-end.*
> >
> > From what you mentioned, if I understand it correctly - I am going to take
> > "anchor" as an example as I don't see that being captured in Solr index
> > data -
> >
> > (a) "anchor" *is* defined in Solr - schema.xml
> > (b) "anchor" *is NOT* defined in gora-hbase-mapping.xml
> > (c) "anchor" *is NOT* defined in solrindex-mapping.xml
> >
> > So, my question would be - why is "anchor" not seen in Solr index data
> > though it is defined in Solr - schema.xml [(a) from above]? When you said
> > mappings of the back-end - are you referring to gora-hbase-mapping.xml
> > [(b) from above]?
> >
> > Thanks for your help..
> >
> >
> >
> > On Sat, Apr 19, 2014 at 4:28 AM, Sebastian Nagel <[email protected]> wrote:
> >
> >> Hi,
> >>
> >>> 1. With ref to link below, what is the difference between '*site*' and
> >>> '*url*'?
> >>> Link [0]: http://wiki.apache.org/nutch/IndexStructure
> >>
> >> The field "site" (from the indexing filter plugin index-basic)
> >> was removed some time ago (since Nutch 1.5?) because
> >> it's an alias for "host". We need to update the mentioned
> >> wiki page accordingly (done right now). Thanks for the hint!
> >>
> >>> 2. From gora-hbase-mapping.xml - with ref to Link[0] above, '*url*' will be
> >>> '*baseUrl*' in hbase (data Nutch stores). What will be '*site*'?
> >>
> >> Which fields are indexed does not depend on the storage mapping.
> >> Fields are filled by indexing filters, some "basic" fields also
> >> by the indexer itself. Which fields are sent to the indexing
> >> back-ends is defined in schema and mappings of the back-end.
> >>
> >> Sebastian
> >>
> >>
> >> On 04/19/2014 03:18 AM, A Laxmi wrote:
> >>> Hi,
> >>>
> >>> I am using Nutch 2.2.1 with HBase. I have a couple of questions about the
> >>> index fields in Nutch:
> >>>
> >>> 1. With ref to link below, what is the difference between '*site*' and
> >>> '*url*'?
> >>> Link [0]: http://wiki.apache.org/nutch/IndexStructure
> >>>
> >>> 2. From gora-hbase-mapping.xml - with ref to Link[0] above, '*url*' will be
> >>> '*baseUrl*' in hbase (data Nutch stores). What will be '*site*'?
> >>>
> >>>
> >>> Thanks for any help..
> >>>
> >>
> >>
> >
>
>
