Hi List,
Apologies for such a long message. I have tried to include everything you
might need to know to answer my question.
I am having difficulty understanding what AveragePayloadFunction is doing.
Here is my example:
Title:Human|9 pineal|5 luteinizing hormone receptors.
Text:The presence of luteinizing hormone receptors in human|9 pineal|5
glands from five females and three males, ranging in age from 61-89 yr, was
examined by in situ hybridization and immunocytochemistry. The results
demonstrated the presence of these receptors at the mRNA|7 and protein
levels in all the pineal|5 glands examined. Pineal|5 gland luteinizing
hormone receptors could potentially be involved in the regulation of
melatonin|7 synthesis.
3 is for class A
5 is for class B
7 is for class C
9 is for class D
These are the payloads stored in the index. At search time I use these
values to identify each term's class, and scorePayload returns 3 for terms
in the selected class and 1 otherwise.
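To make the encoding concrete, here is a minimal sketch of how a token such as pineal|5 splits into a lowercased term and a float payload. It is plain Java and does not use Lucene: the class and method names are hypothetical, and the 4-byte big-endian layout is only my assumption about what PayloadHelper.encodeFloat and decodeFloat do.

```java
import java.nio.ByteBuffer;
import java.util.Locale;

// Hypothetical helper, not part of Lucene: splits "term|payload" tokens.
final class PayloadTokenSketch {

    // Returns the lowercased term text, without the "|payload" suffix.
    static String term(String token) {
        int bar = token.lastIndexOf('|');
        String text = bar < 0 ? token : token.substring(0, bar);
        return text.toLowerCase(Locale.ROOT);
    }

    // Returns the payload as 4 big-endian bytes, or null if the token has none.
    static byte[] payload(String token) {
        int bar = token.lastIndexOf('|');
        if (bar < 0) {
            return null;
        }
        float value = Float.parseFloat(token.substring(bar + 1));
        return ByteBuffer.allocate(4).putFloat(value).array();
    }

    // Reads a float back from 4 bytes starting at the given offset.
    static float decode(byte[] bytes, int offset) {
        return ByteBuffer.wrap(bytes, offset, 4).getFloat();
    }

    public static void main(String[] args) {
        String title = "Human|9 pineal|5 luteinizing hormone receptors.";
        for (String token : title.split("\\s+")) {
            byte[] p = payload(token);
            String shown = (p == null) ? "no payload" : String.valueOf(decode(p, 0));
            System.out.println(term(token) + " -> " + shown);
        }
    }
}
```

Note that in this title only "human" and "pineal" carry a payload; the query terms "luteinizing" and "hormone" carry none.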
I am using WhitespaceTokenizer and LowerCaseFilter. In my similarity class
(PayloadSearchSimilarity, below), scorePayload works as follows: if I am
interested in class A, it returns 3 only for terms in class A and 1 for
everything else. I decide a term's class by checking its payload value.
Now, I query for "luteinizing hormone" using PayloadNearQuery with a slop of
5. First I try with interest in class B and next with interest in class A.
*Result of Class A interest:*
Explain: 10.97332 = (MATCH) sum of:
  2.5589073 = (MATCH) weight(payloadNear([AbstractText:luteinizing, AbstractText:hormone], 5, true) in 5362133), product of:
    0.68000716 = queryWeight(payloadNear([AbstractText:luteinizing, AbstractText:hormone], 5, true)), product of:
      14.045828 = idf(AbstractText: luteinizing=15481 hormone=164637)
      0.048413463 = queryNorm
    3.7630591 = (MATCH) fieldWeight(AbstractText:payloadNear([luteinizing, hormone], 5, true) in 5362133), product of:
      2.4494898 = PayloadNearQuery, product of:
        0.8164966 = tf(phraseFreq=0.6666667)
        *3.0 = AveragePayloadFunction(...)*
      14.045828 = idf(AbstractText: luteinizing=15481 hormone=164637)
      0.109375 = fieldNorm(field=AbstractText, doc=5362133)
  8.4144125 = (MATCH) weight(payloadNear([ArticleTitle:luteinizing, ArticleTitle:hormone], 5, true) in 5362133), product of:
    0.7332054 = queryWeight(payloadNear([ArticleTitle:luteinizing, ArticleTitle:hormone], 5, true)), product of:
      15.144659 = idf(ArticleTitle: hormone=86980 luteinizing=9765)
      0.048413463 = queryNorm
    11.476201 = (MATCH) fieldWeight(ArticleTitle:payloadNear([luteinizing, hormone], 5, true) in 5362133), product of:
      1.7320508 = PayloadNearQuery, product of:
        0.57735026 = tf(phraseFreq=0.33333334)
        *3.0 = AveragePayloadFunction(...)*
      15.144659 = idf(ArticleTitle: hormone=86980 luteinizing=9765)
      0.4375 = fieldNorm(field=ArticleTitle, doc=5362133)
---------------------------------------------------------------------
*Result of Class B Interest:*
Explain: 3.657773 = (MATCH) sum of:
  0.85296905 = (MATCH) weight(payloadNear([AbstractText:luteinizing, AbstractText:hormone], 5, true) in 5362133), product of:
    0.68000716 = queryWeight(payloadNear([AbstractText:luteinizing, AbstractText:hormone], 5, true)), product of:
      14.045828 = idf(AbstractText: luteinizing=15481 hormone=164637)
      0.048413463 = queryNorm
    1.254353 = (MATCH) fieldWeight(AbstractText:payloadNear([luteinizing, hormone], 5, true) in 5362133), product of:
      0.8164966 = PayloadNearQuery, product of:
        0.8164966 = tf(phraseFreq=0.6666667)
        *1.0 = AveragePayloadFunction(...)*
      14.045828 = idf(AbstractText: luteinizing=15481 hormone=164637)
      0.109375 = fieldNorm(field=AbstractText, doc=5362133)
  2.804804 = (MATCH) weight(payloadNear([ArticleTitle:luteinizing, ArticleTitle:hormone], 5, true) in 5362133), product of:
    0.7332054 = queryWeight(payloadNear([ArticleTitle:luteinizing, ArticleTitle:hormone], 5, true)), product of:
      15.144659 = idf(ArticleTitle: hormone=86980 luteinizing=9765)
      0.048413463 = queryNorm
    3.8254004 = (MATCH) fieldWeight(ArticleTitle:payloadNear([luteinizing, hormone], 5, true) in 5362133), product of:
      0.57735026 = PayloadNearQuery, product of:
        0.57735026 = tf(phraseFreq=0.33333334)
        *1.0 = AveragePayloadFunction(...)*
      15.144659 = idf(ArticleTitle: hormone=86980 luteinizing=9765)
      0.4375 = fieldNorm(field=ArticleTitle, doc=5362133)
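To check that the AveragePayloadFunction value is the only thing that differs between the two runs, the explain arithmetic can be recomputed by hand. The sketch below is plain Java with hypothetical names; it assumes tf is the square root of phraseFreq (which matches the tf values printed above) and copies idf, queryNorm and fieldNorm straight from the explain output.

```java
// Hypothetical helper that recomputes the explain tree for doc 5362133.
final class ExplainArithmeticSketch {

    // fieldWeight = tf * averagePayload * idf * fieldNorm, with tf = sqrt(phraseFreq).
    static float fieldWeight(float phraseFreq, float avgPayload, float idf, float fieldNorm) {
        float tf = (float) Math.sqrt(phraseFreq);
        return tf * avgPayload * idf * fieldNorm;
    }

    // weight = queryWeight * fieldWeight, with queryWeight = idf * queryNorm.
    static float weight(float idf, float queryNorm, float fieldWeight) {
        return idf * queryNorm * fieldWeight;
    }

    // Total score for a given AveragePayloadFunction value (same value in both fields).
    static float total(float avgPayload) {
        float queryNorm = 0.048413463f;
        float abstractText = weight(14.045828f, queryNorm,
                fieldWeight(0.6666667f, avgPayload, 14.045828f, 0.109375f));
        float articleTitle = weight(15.144659f, queryNorm,
                fieldWeight(0.33333334f, avgPayload, 15.144659f, 0.4375f));
        return abstractText + articleTitle;
    }

    public static void main(String[] args) {
        System.out.println("avgPayload=3.0 -> " + total(3.0f)); // compare with 10.97332
        System.out.println("avgPayload=1.0 -> " + total(1.0f)); // compare with 3.657773
    }
}
```

The two totals differ by exactly the factor of 3 coming from AveragePayloadFunction, so the rest of the scoring is identical between the class A and class B runs.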
As I understand it, the results should be the other way round. When I am
interested in class A, I should get 1 from AveragePayloadFunction, because
there is no class A term in the text, so every payload scores 1. When I am
interested in class B, there is a class B term (pineal|5) in the Title
field, so I would expect AveragePayloadFunction to return 3 there.
I do not understand what is going on. Maybe I have misunderstood what
AveragePayloadFunction actually does.
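For reference, this is my working assumption about the averaging, written as plain Java without Lucene: the values returned by scorePayload for the payloads seen in a match are summed and divided by the number of payloads seen, and 1 is used when no payload was seen at all. The class and method names are hypothetical, and whether this matches what PayloadNearQuery really does is part of what I am asking.

```java
// Hypothetical model of the scoring rule plus averaging; not Lucene code.
final class AveragePayloadSketch {

    // Mirrors my similarity class: 3 if the payload marks the class of interest, else 1.
    static float scorePayload(String interest, Float payload) {
        if (payload == null) {
            return 1f;
        }
        float wanted;
        switch (interest) {
            case "A": wanted = 3f; break;
            case "B": wanted = 5f; break;
            case "C": wanted = 7f; break;
            case "D": wanted = 9f; break;
            default: return 1f;
        }
        return payload == wanted ? 3f : 1f;
    }

    // Assumed averaging: sum of per-payload scores / payloads seen, or 1 if none seen.
    static float average(String interest, Float... payloads) {
        float sum = 0f;
        int seen = 0;
        for (Float p : payloads) {
            if (p != null) {
                sum += scorePayload(interest, p);
                seen++;
            }
        }
        return seen > 0 ? sum / seen : 1f;
    }

    public static void main(String[] args) {
        // Payloads 9 and 5 are the ones on "human" and "pineal" in the title.
        System.out.println("B over {9, 5}: " + average("B", 9f, 5f));
        System.out.println("A over {9, 5}: " + average("A", 9f, 5f));
        System.out.println("A over no payloads: " + average("A"));
    }
}
```

Under this model a value of exactly 3.0 is only possible if every payload seen in the match belongs to the class of interest, which is why the 3.0 for class A surprises me.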
My similarity class is as follows:
// Package names below assume Lucene 3.x.
import org.apache.lucene.analysis.payloads.PayloadHelper;
import org.apache.lucene.search.DefaultSimilarity;

public class PayloadSearchSimilarity extends DefaultSimilarity {
    private static final long serialVersionUID = 1L;

    // Class of interest: "A", "B", "C" or "D".
    public static String semantic;

    @Override
    public float scorePayload(int docId, String fieldName, int start, int end,
            byte[] bytes, int offset, int length) {
        if (bytes == null) {
            // No payload at this position.
            return 1;
        }
        float payload = PayloadHelper.decodeFloat(bytes, offset);
        // I am now returning the same value (3) for every semantic type so
        // that we can compare the scores. It was changed after we showed it
        // to Dietrich.
        if (semantic.equals("A") && payload == 3) {
            return 3;
        } else if (semantic.equals("B") && payload == 5) {
            return 3;
        } else if (semantic.equals("C") && payload == 7) {
            System.out.println("Semantic:" + semantic);
            return 3;
        } else if (semantic.equals("D") && payload == 9) {
            System.out.println("Semantic:" + semantic);
            return 3;
        }
        // The term's class does not match the class of interest.
        return 1;
    }
}
I am really puzzled and would be grateful for any help.
I look forward to hearing from you.
Many Thanks
Shyama
--
View this message in context:
http://lucene.472066.n3.nabble.com/PayloadNearQuery-and-AveragePayloadFunction-tp3710454p3710454.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.