Re: synonym payload boosting

Grant Ingersoll Mon, 09 Nov 2009 04:10:40 -0800


On Nov 9, 2009, at 4:41 AM, David Ginzburg wrote:

I have found this patch:
https://issues.apache.org/jira/browse/SOLR-1485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
But I don't want to use any function, just the normal scoring and the
similarity class I have written.
Can you point me to the modifications I need (if any)?

Ahmet's point is that you need some query that will actually invoke the payload in scoring. PayloadTermQuery and PayloadNearQuery are the two that do this in Lucene. You can certainly write your own, as well.
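
To make the point above concrete, here is a toy model of the difference between a plain term query and a payload-aware one. It is only a sketch: none of these types are Lucene classes, and PayloadScoringSketch, PayloadHook, FLOAT_HOOK, payloadOf, plainTermScore and payloadTermScore are hypothetical names chosen for illustration. Averaging the payload values is likewise just one assumed way to combine them. The only thing it is meant to show is that the payload hook runs solely when the query type asks for it.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.List;

/** Toy model only: none of these types are Lucene classes. */
final class PayloadScoringSketch {

    /** Hypothetical stand-in for the Similarity.scorePayload hook. */
    interface PayloadHook {
        float scorePayload(byte[] payload, int offset, int length);
    }

    /** Hook that reads the payload as one big-endian float. */
    static final PayloadHook FLOAT_HOOK =
            (payload, offset, length) -> ByteBuffer.wrap(payload, offset, length).getFloat();

    /** Encodes a weight as a 4-byte payload (big-endian IEEE 754). */
    static byte[] payloadOf(float weight) {
        return ByteBuffer.allocate(4).putFloat(weight).array();
    }

    /** Models a plain term query: a match scores 1.0 and payloads are never read. */
    static float plainTermScore(List<byte[]> payloadsAtMatches) {
        return payloadsAtMatches.isEmpty() ? 0.0f : 1.0f;
    }

    /** Models a payload-aware query: the base score times the average hook value. */
    static float payloadTermScore(List<byte[]> payloadsAtMatches, PayloadHook hook) {
        if (payloadsAtMatches.isEmpty()) {
            return 0.0f;
        }
        float sum = 0.0f;
        for (byte[] payload : payloadsAtMatches) {
            sum += hook.scorePayload(payload, 0, payload.length);
        }
        return plainTermScore(payloadsAtMatches) * (sum / payloadsAtMatches.size());
    }

    public static void main(String[] args) {
        // Two matching positions of the same term, carrying different weights.
        List<byte[]> matches = Arrays.asList(payloadOf(0.5f), payloadOf(0.25f));
        System.out.println("plain term query:    " + plainTermScore(matches));
        System.out.println("payload-aware query: " + payloadTermScore(matches, FLOAT_HOOK));
    }
}
```

In this model the plain query returns the same score whatever the payloads contain, which matches the symptom described in the original question: the hook is present but nothing calls it.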


-Grant


On Sun, Nov 8, 2009 at 16:33, AHMET ARSLAN <iori...@yahoo.com> wrote:

Additionally, you need to modify your query parser to return
BoostingTermQuery, PayloadTermQuery, PayloadNearQuery, etc.

With these types of queries, the scorePayload method is invoked.

Hope this helps.

--- On Sun, 11/8/09, David Ginzburg <da...@digitaltrowel.com> wrote:

From: David Ginzburg <da...@digitaltrowel.com>
Subject: synonym payload boosting
To: solr-user@lucene.apache.org
Date: Sunday, November 8, 2009, 4:06 PM
Hi,
I have a field and a weighted synonym map.
I have indexed the synonyms with the weight as payload.
Here is the code snippet from my filter:

public Token next(final Token reusableToken) throws IOException
        .
        .
        .
        Payload boostPayload;

        for (Synonym synonym : syns) {
            Token newTok = new Token(nToken.startOffset(), nToken.endOffset(), "SYNONYM");
            newTok.setTermBuffer(synonym.getToken().toCharArray(), 0, synonym.getToken().length());
            // set the position increment to zero
            // this tells lucene the synonym is
            // in the exact same location as the originating word
            newTok.setPositionIncrement(0);
            boostPayload = new Payload(PayloadHelper.encodeFloat(synonym.getWieght()));
            newTok.setPayload(boostPayload);

I have put it in the index-time analyzer. This is my field definition:

<fieldType name="PersonName" class="solr.TextField" positionIncrementGap="100">
     <analyzer type="index">
       <tokenizer class="solr.WhitespaceTokenizerFactory"/>
       <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
       <filter class="solr.LowerCaseFilterFactory"/>
       <filter class="com.digitaltrowel.solr.DTSynonymFactory"
               FreskoFunction="names_with_scoresPipe23Columns.txt"
               ignoreCase="true"
               expand="false"/>

       <!--<filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>-->
       <!--<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>-->
     </analyzer>
     <analyzer type="query">
       <tokenizer class="solr.WhitespaceTokenizerFactory"/>
       <filter class="solr.LowerCaseFilterFactory"/>
       <!--<filter class="com.digitaltrowel.solr.DTSynonymFactory"
               synonyms="synonyms.txt" ignoreCase="true"
               expand="false"/>-->
       <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
       <!--<filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>-->
       <!--<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>-->
     </analyzer>
   </fieldType>


My similarity class is:
public class BoostingSymilarity extends DefaultSimilarity {

    public BoostingSymilarity() {
        super();
    }

    @Override
    public float scorePayload(String field, byte[] payload, int offset, int length) {
        double weight = PayloadHelper.decodeFloat(payload, 0);
        return (float) weight;
    }

    @Override
    public float coord(int overlap, int maxoverlap) {
        return 1.0f;
    }

    @Override
    public float idf(int docFreq, int numDocs) {
        return 1.0f;
    }

    @Override
    public float lengthNorm(String fieldName, int numTerms) {
        return 1.0f;
    }

    @Override
    public float tf(float freq) {
        return 1.0f;
    }
}
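
A side note on the scorePayload override above: it decodes from index 0 rather than from the offset argument, which would give wrong values if the payload bytes do not start at the beginning of the array. The stdlib-only sketch below shows the float round trip and an offset-aware decode. FloatPayloadSketch and its methods are hypothetical names, not Lucene classes, and the big-endian IEEE 754 layout is an assumption about what PayloadHelper.encodeFloat produces. Returning a neutral 1.0 for a missing or short payload is also just one possible choice.

```java
/** Stdlib-only sketch; FloatPayloadSketch and its methods are hypothetical names. */
final class FloatPayloadSketch {

    /** Writes the float's IEEE 754 bits into 4 bytes, most significant byte first. */
    static byte[] encodeFloat(float value) {
        int bits = Float.floatToIntBits(value);
        return new byte[] {
            (byte) (bits >>> 24), (byte) (bits >>> 16), (byte) (bits >>> 8), (byte) bits
        };
    }

    /** Reads 4 bytes starting at offset and reassembles the float. */
    static float decodeFloat(byte[] bytes, int offset) {
        int bits = ((bytes[offset] & 0xFF) << 24)
                 | ((bytes[offset + 1] & 0xFF) << 16)
                 | ((bytes[offset + 2] & 0xFF) << 8)
                 |  (bytes[offset + 3] & 0xFF);
        return Float.intBitsToFloat(bits);
    }

    /** Shape of a scorePayload body that honors offset; assumed neutral 1.0 when no float fits. */
    static float scorePayload(byte[] payload, int offset, int length) {
        if (payload == null || length < 4) {
            return 1.0f;
        }
        return decodeFloat(payload, offset);
    }

    public static void main(String[] args) {
        // A shared buffer where the 4 payload bytes start at index 3, not 0.
        byte[] buffer = new byte[8];
        System.arraycopy(encodeFloat(0.75f), 0, buffer, 3, 4);
        System.out.println("decoded at offset 3: " + scorePayload(buffer, 3, 4));
        System.out.println("decoded at index 0:  " + decodeFloat(buffer, 0));
    }
}
```

This does not explain why scorePayload is never called, which depends on the query type, but it is worth fixing once the method does start being invoked.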

My problem is that the scorePayload method does not get called
at search time like the other methods in my similarity class.
I tested and verified it with breakpoints.
What am I doing wrong?
I am using Solr 1.3 and am thinking of the payload boost support in
Solr 1.4.






--
Regards

_____________________
David Ginzburg
Developer, Digital Trowel
1 Hayarden St., Airport City
[POB 169, NATBAG]
Lod, 70151, Israel
http://www.digitaltrowel.com/
Office: +972 73 240 522
Mobile: +972 50 496 0595

CHECK OUT OUR NEW TEXT MINING BLOG:
http://mineyourbusiness.wordpress.com/


--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene:

http://www.lucidimagination.com/search
