1) Lucene's scoring is documented here:

  http://lucene.apache.org/java/2_4_0/scoring.html#Scoring

The book Lucene in Action also describes the scoring formula.

2) It's up to you to build a Lucene document from your content, so you decide which parts of your content (body, link, meta) become which fields in Lucene. At that point Lucene's scoring formula kicks in.
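
To make the two points above a bit more concrete, here is a rough sketch of the per-term factors in the classic formula described on the scoring page, as I understand DefaultSimilarity in Lucene 2.4: tf = sqrt(freq), idf = 1 + ln(numDocs / (docFreq + 1)), and lengthNorm = 1 / sqrt(number of terms in the field). It is plain Java with no Lucene dependency. The class name ScoringSketch, the method termWeight, and all the numbers are hypothetical. It leaves out coord, queryNorm, and the lossy one-byte encoding Lucene applies to norms, so the values will not match real scores exactly.

```java
// Sketch of the per-term factors in Lucene's classic TF-IDF scoring.
// Plain Java, no Lucene dependency; all names here are hypothetical.
final class ScoringSketch {

    // Term-frequency factor: square root of the raw count in the field.
    static double tf(int freq) {
        return Math.sqrt(freq);
    }

    // Inverse document frequency: rarer terms get a larger weight.
    static double idf(int docFreq, int numDocs) {
        return Math.log(numDocs / (double) (docFreq + 1)) + 1.0;
    }

    // Length normalization: shorter fields get a larger weight.
    static double lengthNorm(int numTermsInField) {
        return 1.0 / Math.sqrt(numTermsInField);
    }

    // Contribution of one query term in one field of one document,
    // ignoring coord, queryNorm, and norm encoding (assumptions of this sketch).
    static double termWeight(int freq, int docFreq, int numDocs,
                             int numTermsInField, double boost) {
        double idf = idf(docFreq, numDocs);
        return tf(freq) * idf * idf * boost * lengthNorm(numTermsInField);
    }

    public static void main(String[] args) {
        // Hypothetical page: the term occurs 4 times in a 400-term "body"
        // field and once in a 16-term "meta" field; 9 of 1000 docs contain it.
        System.out.println("body: " + termWeight(4, 9, 1000, 400, 1.0));
        System.out.println("meta: " + termWeight(1, 9, 1000, 16, 1.0));
    }
}
```

Because each field (body, link, meta) is scored separately, the field layout you choose in point 2 directly changes freq and the field length that feed these factors. For the exact breakdown of a real hit, Searcher.explain(Query, int) returns an Explanation listing the tf, idf, and norm values Lucene actually used.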

Mike

Marco Palumbo - In4Tech wrote:

Good morning,

A few days ago I sent the following e-mail, but I have received no feedback on it. Could you please tell us whether anyone is able to work with us on this project?

Thank you in advance,

Marco Palumbo

          dott. Marco Palumbo
          Chief Financial Officer
          In4Tech s.r.l.
          c.so Canalgrande, n. 88
          41100 Modena - Italy
          tel.: 0039 059 230651
          fax : 0039 059 244672
          www.in4tech.net

From: Marco Palumbo - In4Tech
Sent: Thursday, 13 November 2008 16:03
To: java-user@lucene.apache.org
Subject: Lucene

Good morning,

Our company works in the field of industrial biotechnology. We are interested in software capable of classifying websites (and thus organizations) working in our field, so one of our IT consultants built a system based on Heritrix (http://crawler.archive.org/) and Lucene.

As you know, Lucene calculates scores based on term frequency. We would like to obtain:
1) the formula Lucene uses to calculate the scores;
2) for each page, the raw data Lucene uses to calculate the scores (atomic data: term frequency in meta, link, and body; size of the page; ...).

How can we obtain this information?

Thanks.

Marco Palumbo




