On Monday 11 July 2011 16:11:47 Marek Bachmann wrote:
> Thank you very much
>
> On 11.07.2011 15:48, Markus Jelsma wrote:
> > Hi,
> >
> > Using the brand-new IndexingFiltersChecker in 1.4-dev you can see exactly
> > what Nutch is going to send. It comes down to the plugins you have
> > defined. See the schema config for a list of fields per plug-in:
> >
> > http://svn.apache.org/viewvc/nutch/branches/branch-1.4/conf/schema.xml?view=markup
> >
> > Cheers
>
> So, as there is no "score" field in the schema.xml I guess the score for
> a webpage in the crawl db has no effect in solr by default, am I right? :)

There is no score field indeed but there is a boost field. This contains the
score. Nutch will also set the Lucene document boost and field boost weights
with this value.

> >
> > On Monday 11 July 2011 15:46:37 Marek Bachmann wrote:
> >> Hello there,
> >>
> >> where can I find informations about the solr document structure which
> >> the solrindex command sends to solr for indexing?
> >>
> >> As far as I know, you add data to the solr index by sending a document
> >> with specific fields to the engine.
> >>
> >> I would like to know how nutch creates these documents and which fields
> >> these documents contain.
> >>
> >> In other words, what kind of information about a website is transferred
> >> to solr?
> >>
> >> Thank you very much.

-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350
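
To make the boost handling described in the reply concrete, here is a rough sketch of what a single document in Solr's XML update format could look like when the page score is carried both as the document boost and as a "boost" field. It is not Nutch's actual code: the helper name build_add_message, the sample URL, and the sample field values are made up for illustration, and the real set of fields depends on which indexing plugins are enabled. The boost attribute on the doc element is the index-time boost as accepted by Solr releases of that period.

```python
import xml.etree.ElementTree as ET


def build_add_message(fields, boost):
    """Build a Solr XML <add> message for one document.

    Hypothetical helper, not part of Nutch or Solr. The score is written
    twice: as the document-level boost attribute and as a plain "boost"
    field, mirroring what the reply describes.
    """
    add = ET.Element("add")
    # Document-level index-time boost.
    doc = ET.SubElement(add, "doc", {"boost": str(boost)})
    for name, value in fields.items():
        field = ET.SubElement(doc, "field", {"name": name})
        field.text = str(value)
    # The same value stored as an ordinary field.
    boost_field = ET.SubElement(doc, "field", {"name": "boost"})
    boost_field.text = str(boost)
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    # Sample values only; field names are assumed from a typical schema.
    page = {
        "id": "http://example.org/",
        "url": "http://example.org/",
        "title": "Example page",
        "content": "Text extracted by the parser.",
    }
    print(build_add_message(page, 1.5))
```

Running the script prints one add message; comparing it with the output of IndexingFiltersChecker for a real URL shows which fields your plugin configuration actually produces.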

