These statistics are used for determining document relevance, or a score for
the query itself. As such, they are one of two things: 1) per document (per
field), or 2) for the universe of documents in the collection. That's it, one
of the two.
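
To make the two scopes concrete, here is a toy sketch in plain Python, not Solr code. It counts terms over the example "product" values quoted further down in this thread. The lowercase-and-split tokenizer and the doc ids "1" and "2" are assumptions for illustration; a real Solr analyzer chain may tokenize differently.

```python
from collections import Counter

# Toy collection built from the "product" values quoted in this thread.
# The ids "1" and "2" are made up; "27" is the id from the thread.
DOCS = {
    "1": "The Hunger Games",
    "2": "The Hunger Book",
    "27": "The amazing spider man is amazing spider the spider",
}


def tokenize(text):
    # Assumption: lowercase + whitespace split stands in for real analysis.
    return text.lower().split()


def term_freq(doc_id, term):
    """Per-document count, analogous to termfreq(field,term)."""
    return Counter(tokenize(DOCS[doc_id]))[term]


def total_term_freq(term):
    """Collection-wide count, analogous to totaltermfreq(field,term)."""
    return sum(term_freq(doc_id, term) for doc_id in DOCS)


print(term_freq("27", "spider"))   # 3, matches spider_freq in the thread
print(total_term_freq("hunger"))   # 2, over the whole toy collection
```

Note that neither function knows anything about a query: one looks at a single document, the other at everything in the index.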
You keep referring to "ResultSet", but there is no such concept in relevancy
or scoring, at least in the Lucene model for relevancy and scoring.
If you want more details on Lucene/Solr scoring, see:
http://lucene.apache.org/core/4_3_0/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html
Feel free to propose an alternative model to relevancy and scoring, but
don't expect an implementation of such a model in the near-term.
You might also be able to implement your alternative model for relevance and
scoring using a custom "Similarity" (scoring) plug-in, coupled with custom
"Value Sources" to expose whatever alternative metrics you wish.
But, before you embark on such a venture, be aware that the performance of
such an alternative relevance model might not be as appealing as you might
want. You'll have to do a proof of concept to see how well things actually
work out.
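
Short of a custom Similarity, one possible workaround that needs no source changes is to request termfreq() as a pseudo-field and add up the values on the client. The sketch below is only that, a sketch: the helper names, the "tf" alias, and the sample response are hypothetical, and it assumes the JSON response writer (wt=json) with the usual response/docs layout. It only sums the rows actually returned, so "rows" must cover the whole result set, or you have to page through it, which may be slow for large result sets.

```python
from urllib.parse import urlencode


def build_params(query, field, term, rows=1000):
    """Build select parameters that return termfreq as a pseudo-field.

    The alias "tf" is an arbitrary name chosen for this sketch.
    """
    return {
        "q": query,
        "df": field,
        "fl": "id,tf:termfreq({},'{}')".format(field, term),
        "rows": rows,
        "wt": "json",
    }


def sum_term_freq(response, alias="tf"):
    """Sum the per-document term frequencies over the returned docs."""
    docs = response["response"]["docs"]
    return sum(int(doc.get(alias, 0)) for doc in docs)


# Query string you would append to .../select? (no request is sent here).
print(urlencode(build_params("hunger", "product", "hunger")))

# Hand-written sample shaped like a wt=json response, mirroring the two
# "hunger" documents from the original question (ids are made up).
sample = {
    "response": {
        "numFound": 2,
        "docs": [
            {"id": "1", "tf": 1},  # The Hunger Games
            {"id": "2", "tf": 1},  # The Hunger Book
        ],
    }
}
print(sum_term_freq(sample))  # 2
```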
-- Jack Krupansky
-----Original Message-----
From: Tony Mullins
Sent: Thursday, July 04, 2013 12:24 PM
To: solr-user@lucene.apache.org
Subject: Re: Total Term Frequency per ResultSet in Solr 4.3 ?
So what is the workaround for this problem ?
Can it be done without changing any source code ?
Thanks,
Tony
On Thu, Jul 4, 2013 at 8:01 PM, Yonik Seeley <yo...@lucidworks.com> wrote:
Ah, sorry - I thought you were after docfreq, not termfreq.
-Yonik
http://lucidworks.com
On Thu, Jul 4, 2013 at 10:57 AM, Tony Mullins <tonymullins...@gmail.com>
wrote:
> Hi Yonik,
>
> With facet it didn't work.
>
> Please see the result set doc below
>
>
> http://localhost:8080/solr/collection2/select?fl=*,amazing_freq:termfreq%28product,%27amazing%27%29,spider_freq:termfreq%28product,%27spider%27%29&fq=id%3A27&q=spider&fl=*&df=product&wt=xml&indent=true&facet=true&facet.query=product:spider&facet.query=product:amazing&rows=20
>
> <doc>
> <str name="id">27</str>
> <str name="type">Movies</str>
> <str name="format">dvd</str>
> <str name="product">The amazing spider man is amazing spider the
> spider</str>
> <int name="popularity">1</int>
> <long name="_version_">1439641369145507840</long>
>
> <int name="amazing_freq">2</int>
> <int name="spider_freq">3</int>
> </doc>
> </result><lst name="facet_counts"><lst name="facet_queries">
> <int name="product:spider">1</int>
> <int name="product:amazing">1</int>
> </lst>
>
> As you can see, the facet is actually just returning the number of docs
> found against those keywords, not the actual frequency.
> The actual frequency is returned by the fields 'amazing_freq' and
> 'spider_freq'!
>
> So is there any workaround to get the total term frequency in the
> result set without any modification to Solr source code?
>
>
> Thanks,
> Tony
>
>
> On Thu, Jul 4, 2013 at 7:05 PM, Yonik Seeley <yo...@lucidworks.com> wrote:
>
>> If you just want to retrieve those counts, this seems like simple
>> faceting.
>>
>> q=something
>> facet=true
>> facet.query=product:hunger
>> facet.query=product:games
>>
>> -Yonik
>> http://lucidworks.com
>>
>> On Thu, Jul 4, 2013 at 9:45 AM, Tony Mullins <tonymullins...@gmail.com>
>> wrote:
>> > Hi ,
>> >
>> > I have lots of crawled data indexed in my Solr (4.3.0). Let's say a
>> > user creates a search criteria 'X1' and he/she wants to know the
>> > occurrence of a specific term in the result set of that 'X1' search
>> > criteria. And then again he/she creates another search criteria 'X2'
>> > and he/she wants to know the occurrence of that same term in the
>> > result set of that 'X2' search criteria.
>> >
>> > At the moment, if I use termfreq(field,term) it gives me the term
>> > frequency per document, and if I use totaltermfreq(field,term) it
>> > gives me the total term frequency in the entire index, not in the
>> > result set of my search criteria.
>> >
>> > So what I need is your help to find out how to get the total
>> > occurrence of a term in a query's result set.
>> >
>> > If this is my result set
>> >
>> > <doc>
>> > <str name="type">Movies</str>
>> > <str name="format">dvd</str>
>> > <str name="product">The Hunger Games</str></doc>
>> >
>> > <doc>
>> > <str name="type">Books</str>
>> > <str name="format">paperback</str>
>> > <str name="product">The Hunger Book</str></doc>
>> >
>> > And if I am looking for the term 'hunger' in the product field, then I
>> > want to get value = '2', and if I am searching for the term 'games' in
>> > the product field I want to get value = '1'.
>> >
>> > Thanks,
>> > Tony
>>