On 11.07.2011 16:15, Markus Jelsma wrote:


On Monday 11 July 2011 16:11:47 Marek Bachmann wrote:
Thank you very much

On 11.07.2011 15:48, Markus Jelsma wrote:
Hi,

Using the brand-new IndexingFiltersChecker in 1.4-dev you can see exactly
what Nutch is going to send. It comes down to the plugins you have
defined. See the schema config for a list of fields per plug-in:

http://svn.apache.org/viewvc/nutch/branches/branch-1.4/conf/schema.xml?view=markup

Cheers

So, as there is no "score" field in the schema.xml, I guess the score for
a webpage in the crawl db has no effect in Solr by default, am I right? :)

There is no score field indeed but there is a boost field. This contains the
score. Nutch will also set the Lucene document boost and field boost weights
with this value.
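
To make the boost point concrete, here is a rough sketch of what such a document could look like as a Solr XML update message. It is not Nutch's actual code (Nutch builds its documents in Java via the indexing filter plugins); the helper name build_solr_add, the example URL, the field subset and the score value are all made up for illustration. Per-field boosts are left out to keep it short.

```python
import xml.etree.ElementTree as ET


def build_solr_add(fields, score):
    """Build a Solr XML <add> message for a single document.

    Hypothetical helper: the score is applied as the document boost
    and also stored in a regular field named "boost".
    """
    add = ET.Element("add")
    # Document-level boost, carried as an attribute on <doc>.
    doc = ET.SubElement(add, "doc", boost=str(score))
    for name, value in fields.items():
        field = ET.SubElement(doc, "field", name=name)
        field.text = str(value)
    # The same score again as an ordinary field called "boost".
    boost_field = ET.SubElement(doc, "field", name="boost")
    boost_field.text = str(score)
    return ET.tostring(add, encoding="unicode")


# Assumed example values, not taken from a real crawl.
example = build_solr_add(
    {
        "id": "http://example.org/",
        "url": "http://example.org/",
        "title": "Example page",
        "content": "Parsed text of the page",
    },
    1.25,
)
print(example)
```

Which fields really end up in the document depends on the plugins you have enabled, so check the output of the IndexingFiltersChecker rather than relying on this sketch.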


Ahh! This is really important information for me! :-) Thanks!


On Monday 11 July 2011 15:46:37 Marek Bachmann wrote:
Hello there,

where can I find information about the Solr document structure which
the solrindex command sends to Solr for indexing?

As far as I know, you add data to the Solr index by sending a document
with specific fields to the engine.

I would like to know how Nutch creates these documents and which fields
these documents contain.

In other words, what kind of information about a website is transferred
to Solr?

Thank you very much.

