subject:"Boosting Documents"

Re: Boosting Documents using the field Value

2017-06-27 Thread govind nitk

Hi Erick,

Finally made it work.

bf=if(exists(query($qqone)),one_score,0)&qqone=one_query:\"google cloud\"

Thanks a lot for the guidance, and for the reminder that it's not URL escaping.

No analyzers used.
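
A minimal Python sketch of how such a request could be put together. Only the bf expression and the qqone parameter come from this message; the core name my_index, the choice of edismax and the helper name are assumptions for illustration.

```python
from urllib.parse import urlencode

def exact_match_boost_params(user_query: str) -> dict:
    """Build params that add one_score when one_query equals user_query."""
    # Escape backslashes and double quotes so the phrase stays one quoted term.
    escaped = user_query.replace("\\", "\\\\").replace('"', '\\"')
    return {
        "defType": "edismax",  # assumption: bf is a dismax/edismax parameter
        "q": user_query,
        # Add one_score only for documents matched by the $qqone sub-query.
        "bf": "if(exists(query($qqone)),one_score,0)",
        "qqone": f'one_query:"{escaped}"',
    }

params = exact_match_boost_params("google cloud")
# urlencode handles the HTTP layer only; Solr receives the quoted phrase intact.
print("http://localhost:8983/solr/my_index/select?" + urlencode(params))
```

Keeping the phrase in its own parameter and dereferencing it with $qqone avoids putting a quoted, space-containing string directly inside the function expression.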


Regards,
Govind




Re: Boosting Documents using the field Value

2017-06-26 Thread govind nitk

Hi Erick,
Agreed, I should have mentioned what I was doing first.

Field types:
one_query is "string",
one_score is float.

So there are no explicit analyzers.

I set sow=false and escaped the space as you suggested, but the error
still persists: undefined field "cloud"

Will get back.

Regards,
Govind


Re: Boosting Documents using the field Value

2017-06-26 Thread Erick Erickson

bq: So, ultimate goal is when the exact query matches in field
one_query, apply boost of one_score

It would have been helpful to have made that statement in the first
place, would have saved some false paths.

What is your analysis chain here? If it's anything like "text_general"
or the like then you're going to have some trouble. I'd think about an
analysis chain like KeywordTokenizerFactory and
LowercaseFilterFactory. That'll index the entire field as a single
token. The admin/analysis page is your friend.

To search against it, you need to _escape_ the space (not "url
escape"). As in google\ cloud so that makes it through the query
parser as a single token.

As of Solr 6.5 you can also specify sow=false (Split On Whitespace),
which may be a better option, see:
https://issues.apache.org/jira/browse/SOLR-9185
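
The two layers of escaping are easy to mix up. A small Python sketch of the difference, assuming a plain field query on one_query (the helper name is made up for illustration, and this is not tested against function-query arguments such as termfreq):

```python
from urllib.parse import quote

def escape_whitespace(term: str) -> str:
    """Backslash-escape spaces so the query parser keeps the term whole."""
    return term.replace(" ", "\\ ")

raw = "google cloud"
parser_level = escape_whitespace(raw)  # google\ cloud
# Percent-encoding is applied afterwards and only concerns the HTTP transport.
url_level = quote(f"one_query:{parser_level}")

print(parser_level)
print(url_level)
```

The backslash is what the query parser sees; %20 or + is decoded by the servlet container before the parser runs, which is why URL encoding alone does not help.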

Best,
Erick


Re: Boosting Documents using the field Value

2017-06-26 Thread govind nitk

Hi Developers, Erick,

I am able to add a boost through a function, as below:
bf=if(termfreq(one_query,"google"),one_score,0)

The problem is that when I use "google cloud" as the query, it gives the error:
undefined field: "cloud"

I tried encoding the space in the query (%20 and +), but could not get it
working.

So, ultimate goal is when the exact query matches in field one_query, apply
boost of one_score.

Is there any way to do this, or is a PR needed?


Regards,
Govind



Re: Boosting Documents using the field Value

2017-06-25 Thread govind nitk

Hi Erick,

This is exactly what I was looking for.
Thanks a lot.


Regards,
Govind


Re: Boosting Documents using the field Value

2017-06-25 Thread Erick Erickson

Take a look at function queries. You're probably looking for "field",
"termfreq" and "if" functions or some other combination like that.

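A rough sketch of how those functions might be combined for the four category/count pairs from the question, written in Python only to generate the parameter strings. The helper name and the search term are illustrative; termfreq compares against the indexed term, so with a text_general field the term would need to be in its analyzed (typically lowercased) form.

```python
def conditional_boost(text_field: str, term: str, weight_field: str) -> str:
    """Function query: weight_field if term occurs in text_field, else 0."""
    return f"if(termfreq({text_field},'{term}'),{weight_field},0)"

# category_1ct..category_4ct paired with domain_1ct..domain_4ct
pairs = [(f"category_{i}ct", f"domain_{i}ct") for i in range(1, 5)]
bf_values = [conditional_boost(cat, "fruits", weight) for cat, weight in pairs]

for bf in bf_values:
    print("bf=" + bf)
```

Each generated value can be sent as its own bf parameter, so a document only picks up the count of the category field that actually contains the term.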

Re: Boosting Documents using the field Value

2017-06-25 Thread govind nitk

Hi Erik, thanks for the reply.

My intention in using domain_ct in qf was to apply the weight stored in
that document.

e.g.
qf=category^domain_ct

If the current query matches in category, the boost applied would be
domain_ct, taken from the currently matched document.


So if I have category_1ct, category_2ct, category_3ct, category_4ct as 4
indexed categories (text_general fields) and the same document has
domain_1ct, domain_2ct, domain_3ct, domain_4ct as 4 different count
fields (int), is there any way to achieve:

qf=category_1ct^domain_1ct&qf=category_2ct^domain_2ct&qf=category_3ct^domain_3ct&qf=category_4ct^domain_4ct
  ?




Regards





Re: Boosting Documents using the field Value

2017-06-24 Thread Erik Hatcher

With dismax, use bf=domain_ct. You can also use boost=domain_ct with edismax.
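
A small Python sketch of the two variants, assuming the my_index core and the category/name fields from the original question; the function name is hypothetical. bf adds the function value to the score, while the edismax boost parameter multiplies the score by it.

```python
from urllib.parse import urlencode

BASE = "http://localhost:8983/solr/my_index/select"

def boosted_url(parser: str) -> str:
    """Build a select URL that boosts by the domain_ct field."""
    params = [
        ("defType", parser),
        ("q", "fruits"),
        ("qf", "category^0.9 name^0.7"),
        ("wt", "json"),
    ]
    if parser == "edismax":
        params.append(("boost", "domain_ct"))  # multiplicative boost
    else:
        params.append(("bf", "domain_ct"))     # additive boost
    return BASE + "?" + urlencode(params)

print(boosted_url("dismax"))
print(boosted_url("edismax"))
```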


Boosting Documents using the field Value

2017-06-23 Thread govind nitk

Hi Solr,

My index data:

id  name     category  domain                          domain_ct
1   Banana   Fruits    Home > Fruits > Banana          2
2   Orange   Fruits    Home > Fruits > Orange          4
3   Samsung  Mobile    Electronics > Mobile > Samsung  3


I am able to retrieve the documents with the dismax parser, with the weights
specified as below.

http://localhost:8983/solr/my_index/select?defType=dismax&indent=on&q=fruits&qf=category^0.9&qf=name^0.7&wt=json


Is it possible to retrieve the documents with the weight taken from an indexed
field, like:

http://localhost:8983/solr/my_index/select?defType=dismax&indent=on&q=fruits&qf=category^domain_ct&qf=name^domain_ct&wt=json

Is it possible to take the weight from an indexed field? Am I doing
something wrong?
Is there any other way of doing this?


Regards

Re: Negative Boosting documents with a certain word

2015-05-07 Thread Chris Hostetter


: Right now, I specify the boost for my request handler as:
: <requestHandler name="/select" class="solr.SearchHandler">
:   .
:   <str name="boost">ln(qty)</str>
:   
:  </requestHandler>
: 
: Is there a way to specify this boost in the Solrconfig.xml?
: 
: I tried: <str name="boost">(*:* -Refurbished)^10</str>   and I get the
: following exception: 
: 
: ERROR - 2015-05-01 15:13:41.609; org.apache.solr.common.SolrException;
: org.apache.solr.common.SolrException: org.apache.solr.search.SyntaxError:
: Expected identifier at pos 0 str='(*:* -Refurbished)^10'


that's because the boost option on the edismax parser expects a 
function, not a query...

https://cwiki.apache.org/confluence/display/solr/The+Extended+DisMax+Query+Parser

try adding a bq param...

  <str name="bq">(*:* -Refurbished -foo -bar -baz)^10</str>



-Hoss
http://www.lucidworks.com/

Re: Negative Boosting documents with a certain word

2015-05-02 Thread O. Olson

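> https://wiki.apache.org/solr/SolrRelevancyFAQ#How_do_I_give_a_negative_.28or_very_low.29_boost_to_documents_that_match_a_query.3F
>
> The general principle you need to follow is to boost documents that do
> *not* match your keyword...
>
>    (*:* -Refurbished)^10
>
> -Hoss
> http://www.lucidworks.com/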






Negative Boosting documents with a certain word

2015-04-30 Thread O. Olson

Hi,

My Solr documents contain descriptions of products, similar to a BestBuy or
a NewEgg catalog. I'm wondering if it is possible to push a product down
the ranking if it contains a certain word. By this I mean it would still
appear in the search results. However, instead of appearing near the top of
the results, it would appear further towards the bottom. (I'm assuming this
is called a negative boost.)

For example, consider the word:  'Refurbished' or the word: 'Case'

If the product description contains the word 'Refurbished' (or the word
'Case') I would like to reduce the ranking of these products. My business
logic is that I would rather sell a new Laptop vs a refurbished laptop, or I
would rather sell a laptop vs selling a laptop case. So, I would like to see
if I can assign products a negative boost if they contain certain words in
their description.

Thank you in advance for all your help,
O. O.




Re: Negative Boosting documents with a certain word

2015-04-30 Thread Chris Hostetter


:   My Solr documents contain descriptions of products, similar to a BestBuy or
: a NewEgg catalog. I'm wondering if it were possible to push a product down
: the ranking if it contains a certain word. By this I mean it would still

https://wiki.apache.org/solr/SolrRelevancyFAQ#How_do_I_give_a_negative_.28or_very_low.29_boost_to_documents_that_match_a_query.3F

The general principle you need to follow is to boost documents that do 
*not* match your keyword...

(*:* -Refurbished)^10
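
A short Python sketch of building that kind of clause for one or more words. The helper name, the default boost of 10 and the optional field argument are illustrative assumptions, not part of Solr itself.

```python
def penalty_bq(words, boost=10, field=None):
    """Boost every document that does NOT contain any of the given words."""
    prefix = f"{field}:" if field else ""
    excluded = " ".join(f"-{prefix}{w}" for w in words)
    return f"(*:* {excluded})^{boost}"

print(penalty_bq(["Refurbished"]))          # (*:* -Refurbished)^10
print(penalty_bq(["Refurbished", "Case"]))  # (*:* -Refurbished -Case)^10
```

With several words in one clause, a document containing any one of them misses the boost; separate clauses per word would allow different penalty strengths.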


: appear in the search results. However, instead of appearing near the top of
: the results, it would appear further towards the bottom. (I'm assuming this
: is a called a negative boost.)
: 
:   For example, consider the word:  'Refurbished' or the word: 'Case'
: 
:   If the product description contains the word 'Refurbished' (or the word
: 'Case') I would like to reduce the ranking of these products. My business
: logic is that I would rather sell a new Laptop vs a refurbished laptop, or I
: would rather sell a laptop vs selling a laptop case. So, I would like to see
: if I can assign products a negative boost if they contain certain words in
: their description.
: 
: Thank you in advance for all your help,
: O. O.

-Hoss
http://www.lucidworks.com/

Re: Boosting documents by categorical preferences

2014-01-30 Thread Amit Nithian

Chris,

Sounds good! Thanks for the tips.. I'll be glad to submit my talk to this
as I have a writeup pretty much ready to go.

Cheers
Amit



Re: Boosting documents by categorical preferences

2014-01-28 Thread Chris Hostetter


: The initial results seem to be kinda promising... of course there are many
: more optimizations I could do like decay user ratings over time to indicate
: that preferences decay over time so a 5 rating a year ago doesn't count as
: much as a 5 rating today.
: 
: Hope this helps others. I'll open source what I have soon and post back. If
: there is feedback or other thoughts let me know!

Hey Amit,

Glad to hear your user based boosting experiments are paying off.  I would 
definitely love to see a more detailed writeup down the road showing off 
how it affects your final user metrics -- or perhaps even give a session 
on your technique at ApacheCon?

http://events.linuxfoundation.org/events/apachecon-north-america/program/cfp


-Hoss
http://www.lucidworks.com/

Re: Boosting documents by categorical preferences

2014-01-27 Thread Amit Nithian

Hi Chris (and others interested in this),

Sorry for dropping off. I got sidetracked with other work, but came back to
this and finally got a V1 implemented.

The final process is as follows:
1) Pre-compute the global per-category num_ratings/average/std-dev (so for
Action the average rating may be 3.49 with a std-dev of 0.99).
2) For a given user, retrieve the last X ratings (X for me is 10) and
compute the user's categorical affinities: take the user's average rating
for all movies in a particular category (e.g. Action), subtract the global
category average, and divide by the category std-dev. Then multiply this
z-score by the fraction of the user's ratings that fall in that category.
   - For example, if a user's last 10 ratings consisted of 9/10 Drama and
1/10 Thriller, the z-score of Thriller should be discounted relative to
that of Drama, so that the user's preference (either positive or negative)
for Drama is more prominent.
3) Sort by the absolute value of the z-score (thanks Hossman, great
thought).
4) Return the top 3 (arbitrary number).
5) Modify the query to look like the following:

qq=tom hanks
&q={!boost b=$b defType=edismax v=$qq}
&cat1=category:Children
&cat2=category:Fantasy
&cat3=category:Animation
&b=sum(1,sum(product(query($cat1),0.22267872),product(query($cat2),0.21630952),product(query($cat3),0.21120241)))

basically b = 1+(pref1*query(category:something1) +
pref2*query(category:something2) + pref3*query(category:something3))
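The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual implementation: the function names, the shape of the input data, and every statistic other than the Action average and std-dev quoted in step 1 are made up for the example.

```python
from statistics import mean

# Global per-category (average rating, std-dev). Only the Action numbers come
# from the message above; the other entries are hypothetical.
GLOBAL_STATS = {
    "Action": (3.49, 0.99),
    "Drama": (3.60, 1.00),
    "Thriller": (3.50, 1.00),
}


def category_affinities(recent_ratings, global_stats, top_n=3):
    """recent_ratings: list of (category, rating) for the user's last X ratings."""
    by_cat = {}
    for cat, rating in recent_ratings:
        by_cat.setdefault(cat, []).append(rating)

    total = len(recent_ratings)
    weighted = {}
    for cat, ratings in by_cat.items():
        avg, std = global_stats[cat]
        z = (mean(ratings) - avg) / std           # step 2: categorical z-score
        weighted[cat] = z * len(ratings) / total  # discount by share of ratings

    # Steps 3 and 4: sort by absolute value and keep the strongest few.
    ranked = sorted(weighted.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return ranked[:top_n]


def boost_params(user_query, affinities):
    """Step 5: build request parameters in the style of the query shown above."""
    params = {"qq": user_query, "q": "{!boost b=$b defType=edismax v=$qq}"}
    terms = []
    for i, (cat, weight) in enumerate(affinities, start=1):
        params[f"cat{i}"] = f"category:{cat}"
        terms.append(f"product(query($cat{i}),{weight:.4f})")
    # With no affinities, fall back to a neutral multiplier of 1.
    params["b"] = "sum(1," + ",".join(terms) + ")" if terms else "1"
    return params


ratings = [("Drama", 5)] * 9 + [("Thriller", 1)]
affinities = category_affinities(ratings, GLOBAL_STATS)
print(boost_params("tom hanks", affinities))
```

The parameters would still need to be URL-encoded and joined with & before being sent to Solr.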

The initial results seem to be kinda promising... of course there are many
more optimizations I could do like decay user ratings over time to indicate
that preferences decay over time so a 5 rating a year ago doesn't count as
much as a 5 rating today.

Hope this helps others. I'll open source what I have soon and post back. If
there is feedback or other thoughts let me know!

Cheers
Amit



Boosting documents at index time, based on payloads

2014-01-10 Thread michael.boom

Hi,

I'm not really sure how/if payloads work (I tried out Rafal Kuc's payload
example in Apache Solr 4 Cookbook and it did not do what I was expecting - see
below for what I was expecting, and please correct me if I was looking for
the wrong droid).

What I am trying to achieve is similar to the payload principle: give
a certain term a boost value at index time.
At query time, if searched by that term, that boost value should influence
the scoring, docs with bigger boost values being preferred to the ones with
smaller boost values.

Can this be achieved using payloads? I expect so, but then how should this
behaviour be implemented - the basic recipe failed to work, so I'm a little
confused.

Thanks!



-
Thanks,
Michael
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Boosting-documents-at-index-time-based-on-payloads-tp4110661.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Boosting documents by categorical preferences

2013-11-22 Thread Chris Hostetter


: I thought about that but my concern/question was how. If I used the pow
: function then I'm still boosting the bad categories by a small
: amount..alternatively I could multiply by a negative number but does that
: work as expected?

I'm not sure I understand your concern: negative powers would give you
values less than 1, positive powers would give you values greater than 1,
and then you'd use those values as multiplicative boosts -- so the values
less than 1 would penalize the scores of existing matching docs in the
categories the user dislikes.

Oh wait ... I see, in your original email (and in my subsequent suggested
tweak to use pow()) you were talking about sum()ing up these 3 category
boosts (and I cut/pasted sum() in my example as well) ... yeah,
using multiplication there would make more sense if you wanted to do the
negative preferences as well, because then the score of any matching doc
will be reduced if it matches on an undesired category -- and the
amount it will be reduced will be determined by how strongly it
matches on that category (ie: the base score returned by the nested
query() func) and how negative the undesired preference value (ie:
the pow() exponent) is.


qq=...
q={!boost b=$b v=$qq}
b=product(pow(query($cat1),$cat1z),pow(query($cat2),$cat2z),pow(query($cat3),$cat3z))
cat1=...action...
cat1z=1.48
cat2=...comedy...
cat2z=1.33
cat3=...kids...
cat3z=-1.7
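To see why multiplying the pow() terms behaves differently from summing them, here is a small numeric sketch. The scores are invented, and it assumes matching scores greater than 1 and a neutral value of 1.0 for categories a document does not match (in Solr, the two-argument form query($cat1,1) is one way to supply such a default); real query() scores should be checked before relying on this.

```python
import math


def combined_boost(category_scores, exponents, combine="product"):
    """Raise each category score to its preference exponent, then combine."""
    powered = [score ** z for score, z in zip(category_scores, exponents)]
    return math.prod(powered) if combine == "product" else sum(powered)


# Hypothetical document that matches only the disliked third category with a
# score of 2.0; the two non-matching categories are treated as 1.0 (assumption).
scores = [1.0, 1.0, 2.0]
exponents = [1.48, 1.33, -1.7]  # cat1z, cat2z, cat3z from the example above

as_product = combined_boost(scores, exponents, "product")
as_sum = combined_boost(scores, exponents, "sum")

print(as_product)  # below 1, so it acts as a penalty on the main score
print(as_sum)      # above 1, so the document is still boosted
```

With sum(), the disliked category can only add a smaller positive amount, so the overall boost never drops below the contribution of the other terms.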


-Hoss

Re: Boosting documents by categorical preferences

2013-11-20 Thread Amit Nithian

I thought about that but my concern/question was how. If I used the pow
function then I'm still boosting the bad categories by a small
amount..alternatively I could multiply by a negative number but does that
work as expected?

I haven't done much with negative boosting except for the sledgehammer
approach of category exclusion through filters.

Thanks
Amit

Re: Boosting documents by categorical preferences

2013-11-19 Thread Chris Hostetter

: My approach was something like:
: 1) Look at the categories that the user has preferred and compute the
: z-score
: 2) Pick the top 3 among those
: 3) Use those to boost search results.

I think that totally makes sense ... the additional bit I was suggesting
that you consider is that instead of picking the highest 3 z-scores,
pick the z-scores with the greatest absolute value ... that way if someone
is a very boring person and their positive interests are all basically
exactly the same as the mean for everyone else, but they have some very
strong dis-interests, you don't bother boosting on those minuscule
interests and instead you negatively boost on the things they are
antagonistic towards.


-Hoss

Re: Boosting documents by categorical preferences

2013-11-18 Thread Amit Nithian

Hey Chris,

Sorry for the delay and thanks for your response. This was inspired by your
talk on boosting and biasing that you presented way back when at a meetup.
I'm glad that my general approach seems to make sense.

My approach was something like:
1) Look at the categories that the user has preferred and compute the
z-score
2) Pick the top 3 among those
3) Use those to boost search results.

I'll look at using the boosts as an exponent instead of a multiplier as I
think that would make sense.. also as it handles the 0 case.

This is for a prototype I am doing but I'll share the results one day in a
meetup as I think it'll be kinda interesting.

Thanks again
Amit



Re: Boosting documents by categorical preferences

2013-11-14 Thread Chris Hostetter


: I have a question around boosting. I wanted to use the boost= to write a
: nested query that will boost a document based on categorical preferences.

You have no idea how stoked I am to see you working on this in a real 
world application.

: Currently I have the weights set to the z-score equivalent of a user's
: preference for that category which is simply how many standard deviations
: above the global average is this user's preference for that movie category.
: 
: My question though is basically whether or not semantically the equation
: query(category:Drama)*some weight + query(category:Comedy)*some weight
: + query(category:Action)*some weight makes sense?

My gut says that your approach makes sense -- but if I'm
understanding you correctly, I think that you need to add 1 to
all your weights: the boost is a multiplier, so if someone's rating for
every category is 0 std devs above the average rating (ie: the most
average person imaginable), you don't want to give every movie in every
category a score of 0.

Are you picking the top 3 categories the user prefers as a cut off, or 
are you arbitrarily using N category boosts for however many N categories 
the user is above the global average in their pref for that category?

Are your preferences coming from explicit user feedback on the categories
(ie: rate how much you like comedies on a scale of 1-5) or are you
inferring it from user ratings of the movies themselves? (ie: rate this
movie, which happens to be a scifi,action,comedy, on a scale of 1-5) ...
because if it's the latter you probably want to be careful to also
normalize based on how many categories the movie is in.

The other thing to consider is whether you want to include negative
preferences (ie: weights less than 1) based on how many std devs the user's
average is *below* the global average for a category .. in this case I
*think* you'd want to divide -1 by the raw value to get a useful
multiplier.

Alternatively: you could experiment with using the weights as exponents
instead of multipliers...

b=sum(pow(query($cat1),1.482),pow(query($cat2),0.1199),pow(query($cat3),1.448))

...that would simplify the math you'd have to worry about both for the
totally boring average user (x**0 = 1) and for the categories users hate
(x**-5 = some positive fraction that will act as a penalty) ... but you'd
definitely need to run some tests to see if it over-boosts as the std
dev variations get really high (might want to take a root first before
using them as the exponent).
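A quick arithmetic sketch of the three options discussed in this message: the raw z-score as a multiplier, the z-score plus 1 as a multiplier, and the z-score as an exponent. The score of 2.0 and the function names are hypothetical; only the behaviour at z = 0 and at negative z is the point.

```python
def raw_multiplier(score, z):
    """z-score used directly as the weight: wipes out the score when z is 0."""
    return score * z


def shifted_multiplier(score, z):
    """Weight with 1 added, so an average user (z = 0) leaves the score alone."""
    return score * (1 + z)


def exponent_boost(score, z):
    """z-score used as an exponent: z = 0 gives 1, negative z gives a fraction
    when the score is greater than 1."""
    return score ** z


score = 2.0  # hypothetical value returned by query($cat1) for a matching doc
for z in (0.0, 1.482, -5.0):
    print(z, raw_multiplier(score, z), shifted_multiplier(score, z),
          exponent_boost(score, z))
```

Note that the shifted multiplier still goes negative once z drops below -1, which is one reason the exponent form is attractive for strong dislikes.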



-Hoss

Boosting documents by categorical preferences

2013-11-12 Thread Amit Nithian

Hi all,

I have a question around boosting. I wanted to use the boost= to write a
nested query that will boost a document based on categorical preferences.

For a movie search for example, say that a user likes drama, comedy, and
action. I could use things like

qq=&q={!boost%20b=$b%20defType=edismax%20v=$qq}&b=sum(product(query($cat1),1.482),product(query($cat2),0.1199),product(query($cat3),1.448))&cat1=category:Drama&cat2=category:Comedy&cat3=category:Action

where cat1=Drama cat2=Comedy cat3=Action

Currently I have the weights set to the z-score equivalent of a user's
preference for that category which is simply how many standard deviations
above the global average is this user's preference for that movie category.

My question though is basically whether or not semantically the equation
query(category:Drama)*some weight + query(category:Comedy)*some weight
+ query(category:Action)*some weight makes sense?

What are some techniques people use to boost documents based on discrete
things like category, manufacturer, genre etc?

Thanks!
Amit

Re: Boosting Documents

2013-05-23 Thread Oussama Jilal

Oh thank you Chris, this is much clearer, and thank you for updating the 
Wiki too.


On 05/22/2013 08:29 PM, Chris Hostetter wrote:

: NOTE: make sure norms are enabled (omitNorms=false in the schema.xml) for
: any fields where the index-time boost should be stored.
:
: In my case where I only need to boost the whole document (not a specific
: field), do I have to activate the  omitNorms=false  for all the fields
: in the schema ?

docBoost is really just syntactic sugar for a field boost on each field in
the document -- it's factored into the norm value for each field in the
document.  (I'll update the wiki to make this more clear)

If you do a query that doesn't utilize any field which has norms, then the
docBoost you specified when indexing the document never comes into play.


In general, doc boosts and field boosts, and the way they come into play
as part of the field norm, are fairly inflexible and (in my opinion)
antiquated.  A much better way of dealing with this type of problem is
also discussed in the section of the wiki you linked to.  Immediately
below...

http://wiki.apache.org/solr/SolrRelevancyFAQ#index-time_boosts

...you'll find...

http://wiki.apache.org/solr/SolrRelevancyFAQ#Field_Based_Boosting


-Hoss
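As a rough illustration of the field-based alternative mentioned above, the sketch below builds request parameters that multiply the relevance score by a numeric field at query time. The field name doc_boost is hypothetical and would have to exist in the schema as a numeric field; def(doc_boost,1) is used so that documents without a value get a neutral multiplier of 1.

```python
from urllib.parse import urlencode


def field_boost_params(user_query, boost_field="doc_boost"):
    """Query-time boost by a per-document numeric field (hypothetical name)."""
    return {
        "qq": user_query,
        "q": "{!boost b=$b v=$qq}",
        # def(field, default): fall back to 1 when the field has no value.
        "b": f"def({boost_field},1)",
    }


params = field_boost_params("Olive Oil")
print(urlencode(params))  # URL-encoded query string, joined with &
```

Because the boost value lives in an ordinary field, it does not depend on norms being enabled and can be changed by reindexing only that field's value.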

Re: Boosting Documents

2013-05-22 Thread Oussama Jilal


Thank you for your reply bbarani,

I can't do that because I want to boost some documents over others,
independently of the query.


On 05/21/2013 05:41 PM, bbarani wrote:

Why don't you boost during query time?

Something like q=superman&qf=title^2 subject

You can refer: http://wiki.apache.org/solr/SolrRelevancyFAQ



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Boosting-Documents-tp4064955p4064966.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Boosting Documents

2013-05-22 Thread Sandeep Mestry

Hi Oussama,

This is explained very nicely on Solr Wiki..
http://wiki.apache.org/solr/SolrRelevancyFAQ#index-time_boosts
http://wiki.apache.org/solr/UpdateXmlMessages#Optional_attributes_for_.22add.22

All you need to do is something similar to below..

<add>
  <doc boost="2.5">
    <field name="employeeId">05991</field>
    <field name="office" boost="2.0">Bridgewater</field>
  </doc>
</add>
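For anyone generating such update messages from code, a minimal sketch using Python's standard library follows. The helper name and its arguments are made up for illustration, and it assumes a Solr version that still accepts index-time boost attributes in XML update messages.

```python
import xml.etree.ElementTree as ET


def build_add_message(fields, doc_boost=None, field_boosts=None):
    """Build an <add><doc>...</doc></add> message with optional boosts."""
    field_boosts = field_boosts or {}
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    if doc_boost is not None:
        doc.set("boost", str(doc_boost))       # document-level boost
    for name, value in fields.items():
        field = ET.SubElement(doc, "field", name=name)
        if name in field_boosts:
            field.set("boost", str(field_boosts[name]))  # field-level boost
        field.text = str(value)
    return ET.tostring(add, encoding="unicode")


xml_body = build_add_message(
    {"employeeId": "05991", "office": "Bridgewater"},
    doc_boost=2.5,
    field_boosts={"office": 2.0},
)
print(xml_body)
```

Passing only doc_boost and leaving field_boosts empty gives the document-level form, with no boost attribute on any field.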


What is not clear from your message is whether you need better scoring or
better sorting. so, additionally, you can consider adding a secondary sort
parameter for the docs having the same score.
http://wiki.apache.org/solr/CommonQueryParameters#sort


HTH,
Sandeep



Re: Boosting Documents

2013-05-22 Thread Oussama Jilal


Thank you Sandeep,

I did post the document like that (a minor difference is that I did not
add the boost to the field since I don't want to boost on a specific
field, I boosted the whole document '<doc boost="2.0"> </doc>'),
but the issue is that every document in the query results has the same
score even if they had been indexed with different boosts, and I can't
sort on another field since this is independent of any field value.


Any ideas ?


Re: Boosting Documents

2013-05-22 Thread Oussama Jilal

I don't know if this is the issue or not but, considering this note from
the wiki:


NOTE: make sure norms are enabled (omitNorms="false" in the schema.xml)
for any fields where the index-time boost should be stored.


In my case, where I only need to boost the whole document (not a specific
field), do I have to set omitNorms="false" for all the
fields in the schema?





Re: Boosting Documents

2013-05-22 Thread Sandeep Mestry

I think that is applicable only to field-level boosting and not to
document-level boosting.

Can you post your query, field definition and the results you're expecting?

I am using index and query time boosting without any issues so far. Also,
which version of Solr are you using?



Re: Boosting Documents

2013-05-22 Thread Oussama Jilal

I don't know if this can help (since the document boost should be 
independent of any schema) but here is my schema :


<?xml version="1.0" encoding="UTF-8"?>
<schema name="" version="1.5">
  <types>
    <fieldType name="string" class="solr.StrField" sortMissingLast="true" />
    <fieldType name="long" class="solr.TrieLongField" sortMissingLast="true"
               precisionStep="0" positionIncrementGap="0" />
    <fieldType name="text" class="solr.TextField" sortMissingLast="true"
               omitNorms="true">
      <analyzer type="index">
        <tokenizer class="solr.KeywordTokenizerFactory" />
        <filter class="solr.LowerCaseFilterFactory" />
        <filter class="solr.EdgeNGramFilterFactory" maxGramSize="255" />
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.KeywordTokenizerFactory" />
        <filter class="solr.LowerCaseFilterFactory" />
      </analyzer>
    </fieldType>
  </types>
  <fields>
    <field name="Id" type="string" indexed="true" stored="true"
           multiValued="false" required="true" />
    <field name="Suggestion" type="text" indexed="true" stored="true"
           multiValued="false" required="false" />
    <field name="Type" type="string" indexed="true" stored="true"
           multiValued="false" required="true" />
    <field name="Sections" type="string" indexed="true" stored="true"
           multiValued="true" required="false" />
    <field name="_version_" type="long" indexed="true" stored="true" />
  </fields>
  <copyField source="Id" dest="Suggestion" />
  <uniqueKey>Id</uniqueKey>
  <defaultSearchField>Suggestion</defaultSearchField>
</schema>

My query is something like: Suggestion:Olive Oil.

The result is 9 documents, which all have the same score 11.287682, even
if they had been indexed with different boosts (I am sure of this).
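One thing worth checking in a schema like this is which fields keep norms, since the wiki note quoted in this thread says index-time boosts are only stored for fields with norms enabled, and the "text" type above sets omitNorms="true". The sketch below is a hypothetical helper that lists fields whose norms are explicitly omitted. It only reads explicit omitNorms attributes from a trimmed-down copy of the schema; defaults that a field class may apply on its own are not modelled.

```python
import xml.etree.ElementTree as ET

# Trimmed-down, illustrative copy of the schema (not the full original).
SCHEMA = """
<schema name="example" version="1.5">
  <types>
    <fieldType name="string" class="solr.StrField" />
    <fieldType name="text" class="solr.TextField" omitNorms="true" />
  </types>
  <fields>
    <field name="Id" type="string" indexed="true" stored="true" />
    <field name="Suggestion" type="text" indexed="true" stored="true" />
  </fields>
</schema>
"""


def fields_without_norms(schema_xml):
    """Return names of fields whose norms are explicitly omitted."""
    root = ET.fromstring(schema_xml)
    type_omits = {
        ft.get("name"): ft.get("omitNorms") == "true"
        for ft in root.iter("fieldType")
    }
    names = []
    for field in root.iter("field"):
        own = field.get("omitNorms")
        # A field-level attribute overrides the field type's setting.
        if own is not None:
            omits = own == "true"
        else:
            omits = type_omits.get(field.get("type"), False)
        if omits:
            names.append(field.get("name"))
    return names


print(fields_without_norms(SCHEMA))
```

A field reported here cannot carry an index-time boost into scoring, whatever boost was sent with the document.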





Re: Boosting Documents

2013-05-22 Thread Sandeep Mestry


Re: Boosting Documents

2013-05-22 Thread Oussama Jilal


Re: Boosting Documents

2013-05-22 Thread Sandeep Mestry

 http://wiki.apache.org/solr/SolrRelevancyFAQ#index-time_boosts
 http://wiki.apache.org/solr/UpdateXmlMessages#Optional_attributes_for_.22add.22
 


 All you need to do is something similar to the below:

 <add>
   <doc boost="2.5">
     <field name="employeeId">05991</field>
     <field name="office" boost="2.0">Bridgewater</field>
   </doc>
 </add>
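
 For anyone generating such update messages from a script, here is a rough
 sketch using only the Python standard library. The helper name
 build_add_message is made up for illustration; the element and attribute
 names (add, doc, field, boost) are the ones from the example above. Treat it
 as a sketch, not as the canonical client code.

```python
import xml.etree.ElementTree as ET


def build_add_message(doc_boost, fields):
    """Build a Solr XML <add> message with a document-level boost.

    fields is a list of (name, value, boost_or_None) tuples; a field-level
    boost attribute is only written when a boost is given.
    """
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc", boost=str(doc_boost))
    for name, value, boost in fields:
        attrs = {"name": name}
        if boost is not None:
            attrs["boost"] = str(boost)
        field = ET.SubElement(doc, "field", attrs)
        field.text = value
    return ET.tostring(add, encoding="unicode")


xml_message = build_add_message(
    2.5,
    [("employeeId", "05991", None), ("office", "Bridgewater", 2.0)],
)
print(xml_message)
```

 The resulting string can be posted to the update handler in the usual way.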



 What is not clear from your message is whether you need better scoring or
 better sorting. Additionally, you can consider adding a secondary sort
 parameter for the docs having the same score.
 http://wiki.apache.org/solr/CommonQueryParameters#sort
 



 HTH,
 Sandeep


 On 22 May 2013 09:21, Oussama Jilal jilal.ouss...@gmail.com wrote:

Thank you for your reply bbarani,

  I can't do that because I want to boost some documents over others,
 independently of the query.


 On 05/21/2013 05:41 PM, bbarani wrote:

  Why don't you boost during query time?

  Something like q=superman&qf=title^2 subject

 You can refer: http://wiki.apache.org/solr/SolrRelevancyFAQ
 



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Boosting-Documents-tp4064955p4064966.html
 Sent from the Solr - User mailing list archive at Nabble.com.

Re: Boosting Documents

2013-05-22 Thread Oussama Jilal

http://wiki.apache.org/solr/SolrRelevancyFAQ#index-time_boosts
http://wiki.apache.org/solr/UpdateXmlMessages#Optional_attributes_for_.22add.22

All you need to do is something similar to the below:

<add>
  <doc boost="2.5">
    <field name="employeeId">05991</field>
    <field name="office" boost="2.0">Bridgewater</field>
  </doc>
</add>



What is not clear from your message is whether you need better scoring or
better sorting. Additionally, you can consider adding a secondary sort
parameter for the docs having the same score.
http://wiki.apache.org/solr/CommonQueryParameters#sort


HTH,
Sandeep


On 22 May 2013 09:21, Oussama Jilal jilal.ouss...@gmail.com wrote:

Thank you for your reply bbarani,

  I can't do that because I want to boost some documents over others,
independently of the query.


On 05/21/2013 05:41 PM, bbarani wrote:

  Why don't you boost during query time?

  Something like q=superman&qf=title^2 subject

You can refer: http://wiki.apache.org/solr/SolrRelevancyFAQ


--
View this message in context:
http://lucene.472066.n3.nabble.com/Boosting-Documents-tp4064955p4064966.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Boosting Documents

2013-05-22 Thread Chris Hostetter


: NOTE: make sure norms are enabled (omitNorms=false in the schema.xml) for
: any fields where the index-time boost should be stored.
: 
: In my case where I only need to boost the whole document (not a specific
: field), do I have to activate the  omitNorms=false  for all the fields
: in the schema ?

docBoost is really just syntactic sugar for a field boost on each field in 
the document -- it's factored into the norm value for each field in the 
document.  (I'll update the wiki to make this more clear)

If you do a query that doesn't utilize any field which has norms, then the 
docBoost you specified when indexing the document never comes into play.


In general, doc boosts and field boosts, and the way they come into play 
as part of the field norm is fairly inflexible, and (in my opinion) 
antiquated.  A much better way of dealing with this type of problem is 
also discussed in the section of the wiki you linked to.  Immediately 
below...

http://wiki.apache.org/solr/SolrRelevancyFAQ#index-time_boosts

...you'll find...

http://wiki.apache.org/solr/SolrRelevancyFAQ#Field_Based_Boosting
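
One possible query-time shape of field based boosting, sketched in Python:
the per-document weight lives in an ordinary numeric field and the query
multiplies the relevancy score by it. The field name doc_boost is purely
hypothetical, and this assumes the edismax parser and its boost parameter
are available in your Solr version. It is not necessarily the exact recipe
from the wiki section.

```python
from urllib.parse import urlencode


def field_boost_params(user_query, boost_field):
    """Request parameters that multiply each score by a numeric field value."""
    return {
        "defType": "edismax",   # assumes the edismax parser is available
        "q": user_query,
        "boost": boost_field,   # hypothetical numeric field holding the weight
    }


query_string = urlencode(field_boost_params("superman", "doc_boost"))
print(query_string)
```

Because the weight is a normal field, it does not depend on norms being
enabled and can be changed by reindexing just that document.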


-Hoss

Boosting Documents

2013-05-21 Thread Oussama Jilal


Hi everyone,

I have a small (I hope) issue, and I wish someone could point me in the 
right direction.


I have been indexing some documents using Solr 4.1 and specifying 
different boosts for different types of documents (boost for the whole 
document). But when searching, I noticed that the scores are the same 
for all of them and that affected the order (not what I wanted).


Does anyone know if I have to configure something else? I have 
been using Solr for quite some time (more than a year) but I have never used 
the boosting feature.


Thanks.

Re: Boosting Documents

2013-05-21 Thread bbarani

Why don't you boost during query time?

Something like q=superman&qf=title^2 subject

You can refer: http://wiki.apache.org/solr/SolrRelevancyFAQ
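
A small sketch of how that request could be assembled from a client, with
each parameter passed separately so the encoding of the space and the caret
is handled for you. The defType=dismax line is an assumption here (qf is a
dismax/edismax parameter); title and subject are the field names from the
example above.

```python
from urllib.parse import urlencode

# Each parameter is its own key/value pair; urlencode joins them with "&".
params = [
    ("defType", "dismax"),          # assumption: qf needs a dismax-style parser
    ("q", "superman"),
    ("qf", "title^2 subject"),      # title matches count twice as much as subject
]

query_string = urlencode(params)
print(query_string)
```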



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Boosting-Documents-tp4064955p4064966.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Boosting documents with terms derived from clustering - good idea?

2013-05-19 Thread Otis Gospodnetic

Hi,

I would take a different approach.  Track users' queries and their
clicks.  Aggregate queries and start thinking of them as tags/labels.
Aggregate them and use top N to tag your docs.
Alternatively/additionally, extract significant terms and phrases from
clicked-to docs and use that to tag your docs.
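
A rough sketch of the first idea in Python, assuming you already have a log
of (query, clicked document id) pairs. The log format, the ids and the
helper name top_query_tags are all invented for illustration.

```python
from collections import Counter, defaultdict


def top_query_tags(click_log, n):
    """Return the n most frequent queries that led to a click, per document.

    click_log is an iterable of (query, clicked_doc_id) pairs.
    """
    per_doc = defaultdict(Counter)
    for query, doc_id in click_log:
        # Normalise case and whitespace so trivially different queries aggregate.
        per_doc[doc_id][query.strip().lower()] += 1
    return {
        doc_id: [query for query, _ in counts.most_common(n)]
        for doc_id, counts in per_doc.items()
    }


log = [
    ("ssd hard drive", "doc-tech-1"),
    ("SSD hard drive", "doc-tech-1"),
    ("solid state drive", "doc-tech-1"),
    ("ssd hip hop", "doc-music-7"),
]
tags = top_query_tags(log, 1)
print(tags)
```

The resulting tags would then go into a separate field that you boost at
query time.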

Otis
--
Search Analytics - http://sematext.com/search-analytics/index.html
Performance Monitoring - http://sematext.com/spm/index.html




On Tue, May 14, 2013 at 7:04 AM, David Parks davidpark...@yahoo.com wrote:
 We have a number of queries that produce good results based on the textual
 data, but are contextually wrong (for example, an "SSD hard drive" search
 matches the music album "SSD hip hop drives us crazy").



 Textually a fair match, but SSD is a term that strongly relates to technical
 documents.



 We'd like to be able to direct this query more strictly in the direction of
 the technical documents based on the term SSD.  I am considering whether
 it would be worth trying to cluster all documents, thus tending to group the
 music with the music and tech items with the tech items. Then pulling out
 the term vectors that define each group; do a human review of that data; and
 plug it back into the documents of each cluster as a separate search field
 that gets boosted.



 In my head it seems like a plausible way to weigh terms like SSD to the
 cluster of items that it most closely associates.



 Should I spend the effort to find out?

 Yeh or neh?

Boosting documents with terms derived from clustering - good idea?

2013-05-14 Thread David Parks

We have a number of queries that produce good results based on the textual
data, but are contextually wrong (for example, an "SSD hard drive" search
matches the music album "SSD hip hop drives us crazy").

 

Textually a fair match, but SSD is a term that strongly relates to technical
documents.

 

We'd like to be able to direct this query more strictly in the direction of
the technical documents based on the term SSD.  I am considering whether
it would be worth trying to cluster all documents, thus tending to group the
music with the music and tech items with the tech items. Then pulling out
the term vectors that define each group; do a human review of that data; and
plug it back into the documents of each cluster as a separate search field
that gets boosted.

 

In my head it seems like a plausible way to weigh terms like SSD to the
cluster of items that it most closely associates.

 

Should I spend the effort to find out?

Yeh or neh?

Re: Boosting documents matching in a specific shard

2012-08-24 Thread Erick Erickson

Well, the simplest would be to include the shard ID in the document
when you index it, then just boost on that field...
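
For example, something along these lines (a sketch only: the field name
shard_id, the value shard1, the sample query and the weight are hypothetical,
and bq assumes a dismax/edismax parser):

```python
from urllib.parse import parse_qs, urlencode


def shard_boost_params(user_query, shard_field, shard_id, weight):
    """Request parameters that add a boost query for one shard's documents."""
    return [
        ("defType", "edismax"),                          # assumption
        ("q", user_query),
        ("bq", f"{shard_field}:{shard_id}^{weight}"),    # hypothetical field/value
    ]


query_string = urlencode(shard_boost_params("home insurance", "shard_id", "shard1", 5))
decoded = parse_qs(query_string)
print(query_string)
```

Documents from other shards still match; they just don't get the extra boost.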

Best
Erick

On Thu, Aug 23, 2012 at 8:33 AM, Husain, Yavar yhus...@firstam.com wrote:
 I am aware that IDF is not distributed. Suppose I have to boost or give 
 higher rank to documents which are matching in a specific/particular shard, 
 how can I accomplish that?

Boosting documents matching in a specific shard

2012-08-23 Thread Husain, Yavar

I am aware that IDF is not distributed. Suppose I have to boost or give higher 
rank to documents which are matching in a specific/particular shard, how can I 
accomplish that?
**
This message may contain confidential or proprietary information intended
only for the use of the addressee(s) named above or may contain information
that is legally privileged. If you are not the intended addressee, or the
person responsible for delivering it to the intended addressee, you are
hereby notified that reading, disseminating, distributing or copying this
message is strictly prohibited. If you have received this message by
mistake, please immediately notify us by replying to the message and delete
the original message and any copies immediately thereafter.

Thank you.
**
FAFLD

Boosting documents based on search term/phrase

2012-05-01 Thread Donald Organ

Is there a way to boost documents based on the search term/phrase?

Re: Boosting documents based on search term/phrase

2012-05-01 Thread Jack Krupansky


Do you mean besides query elevation?

http://wiki.apache.org/solr/QueryElevationComponent

And besides explicit boosting by the user (the ^ suffix operator after a 
term/phrase)?


-- Jack Krupansky

-Original Message- 
From: Donald Organ

Sent: Tuesday, May 01, 2012 3:59 PM
To: solr-user
Subject: Boosting documents based on search term/phrase

Is there a way to boost documents based on the search term/phrase?

Re: Boosting documents based on search term/phrase

2012-05-01 Thread Donald Organ

query elevation was exactly what I was talking about.

Now is there a way to add this to the default query handler?

On Tue, May 1, 2012 at 4:26 PM, Jack Krupansky j...@basetechnology.com wrote:

 Do you mean besides query elevation?

 http://wiki.apache.org/solr/QueryElevationComponent

 And besides explicit boosting by the user (the ^ suffix operator after a
 term/phrase)?

 -- Jack Krupansky

 -Original Message- From: Donald Organ
 Sent: Tuesday, May 01, 2012 3:59 PM
 To: solr-user
 Subject: Boosting documents based on search term/phrase

 Is there a way to boost documents based on the search term/phrase?

Re: Boosting documents based on search term/phrase

2012-05-01 Thread Jeevanandam


Yes, you can add it in the last-components section of the default query handler.

<arr name="last-components">
 <str>elevator</str>
</arr>

- Jeevanandam


On 02-05-2012 3:53 am, Donald Organ wrote:

query elevation was exactly what I was talking about.

Now is there a way to add this to the default query handler?

On Tue, May 1, 2012 at 4:26 PM, Jack Krupansky
j...@basetechnology.com wrote:


Do you mean besides query elevation?


http://wiki.apache.org/solr/QueryElevationComponent

And besides explicit boosting by the user (the ^ suffix operator after a
term/phrase)?

-- Jack Krupansky

-Original Message- From: Donald Organ
Sent: Tuesday, May 01, 2012 3:59 PM
To: solr-user
Subject: Boosting documents based on search term/phrase

Is there a way to boost documents based on the search term/phrase?

Re: Boosting documents based on search term/phrase

2012-05-01 Thread Jack Krupansky


Here's some doc from Lucid:
http://lucidworks.lucidimagination.com/display/solr/The+Query+Elevation+Component

-- Jack Krupansky

-Original Message- 
From: Donald Organ

Sent: Tuesday, May 01, 2012 5:23 PM
To: solr-user@lucene.apache.org
Subject: Re: Boosting documents based on search term/phrase

query elevation was exactly what I was talking about.

Now is there a way to add this to the default query handler?

On Tue, May 1, 2012 at 4:26 PM, Jack Krupansky 
j...@basetechnology.com wrote:



Do you mean besides query elevation?

http://wiki.apache.org/solr/QueryElevationComponent

And besides explicit boosting by the user (the ^ suffix operator after a
term/phrase)?

-- Jack Krupansky

-Original Message- From: Donald Organ
Sent: Tuesday, May 01, 2012 3:59 PM
To: solr-user
Subject: Boosting documents based on search term/phrase

Is there a way to boost documents based on the search term/phrase?

Re: Boosting documents based on search term/phrase

2012-05-01 Thread Otis Gospodnetic

Hi,

Can you please give an example of what you mean?

Otis 

Performance Monitoring for Solr / ElasticSearch / HBase - 
http://sematext.com/spm 




 From: Donald Organ dor...@donaldorgan.com
To: solr-user solr-user@lucene.apache.org 
Sent: Tuesday, May 1, 2012 3:59 PM
Subject: Boosting documents based on search term/phrase
 
Is there a way to boost documents based on the search term/phrase?

Re: Boosting documents based on search term/phrase

2012-05-01 Thread Donald Organ

Perfect, this is working well.

On Tue, May 1, 2012 at 5:33 PM, Jeevanandam je...@myjeeva.com wrote:

 Yes, you can add it in the last-components section of the default query handler.

 <arr name="last-components">
  <str>elevator</str>
 </arr>

 - Jeevanandam


 On 02-05-2012 3:53 am, Donald Organ wrote:

 query elevation was exactly what I was talking about.

 Now is there a way to add this to the default query handler?

 On Tue, May 1, 2012 at 4:26 PM, Jack Krupansky
 j...@basetechnology.com wrote:

  Do you mean besides query elevation?


 http://wiki.apache.org/solr/QueryElevationComponent

 And besides explicit boosting by the user (the ^ suffix operator after a
 term/phrase)?

 -- Jack Krupansky

 -Original Message- From: Donald Organ
 Sent: Tuesday, May 01, 2012 3:59 PM
 To: solr-user
 Subject: Boosting documents based on search term/phrase

 Is there a way to boost documents based on the search term/phrase?

Re: Boosting documents based on the vote count

2010-10-20 Thread Alexandru Badiu

Thanks, will look into those.

Andu

On Mon, Oct 18, 2010 at 4:14 PM, Ahmet Arslan iori...@yahoo.com wrote:
 I know but I can't figure out what
 functions to use. :)

 Oh, I see. Why not just use {!boost b=log(vote)}?

 Maybe scale(vote,0.5,10)?

Boosting documents based on the vote count

2010-10-18 Thread Alexandru Badiu

Hello all,

I have a field in my schema which holds the number of votes a document
has. How can I boost documents based on that number?

Something like the one which has the maximum number has a boost of 10,
the one with the smallest number has 0.5 and in between the values get
calculated automatically.

Thanks,
Alexandru Badiu

Re: Boosting documents based on the vote count

2010-10-18 Thread Ahmet Arslan

 I have a field in my schema which holds the number of votes
 a document
 has. How can I boost documents based on that number?

you can do it with http://wiki.apache.org/solr/FunctionQuery

Re: Boosting documents based on the vote count

2010-10-18 Thread Alexandru Badiu

I know but I can't figure out what functions to use. :)

On Mon, Oct 18, 2010 at 1:38 PM, Ahmet Arslan iori...@yahoo.com wrote:
 I have a field in my schema which holds the number of votes
 a document
 has. How can I boost documents based on that number?

 you can do it with http://wiki.apache.org/solr/FunctionQuery

Re: Boosting documents based on the vote count

2010-10-18 Thread Ahmet Arslan

 I know but I can't figure out what
 functions to use. :)

Oh, I see. Why not just use {!boost b=log(vote)}?

Maybe scale(vote,0.5,10)?
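
To illustrate what a scale() style mapping does to the numbers, here is a
plain Python sketch of a linear min/max rescale: the smallest vote count maps
to 0.5 and the largest to 10, with everything else in between. This is my own
reimplementation for illustration, not Solr's code; the sample vote counts
are made up, and the handling of the "all values equal" case is an assumption.

```python
def scale(value, source_min, source_max, target_min, target_max):
    """Linearly map value from [source_min, source_max] to [target_min, target_max]."""
    if source_max == source_min:
        # Assumption: with no spread in the source values, use the lower bound.
        return target_min
    fraction = (value - source_min) / (source_max - source_min)
    return target_min + fraction * (target_max - target_min)


votes = [0, 25, 50, 100]
boosts = [scale(v, min(votes), max(votes), 0.5, 10) for v in votes]
print(boosts)
```

One thing to watch with the log(vote) variant: documents with zero votes need
special handling, since the logarithm of 0 is undefined.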

Re: FunctionQuery and boosting documents using date arithmetic

2007-08-13 Thread Yonik Seeley

On 8/12/07, Chris Hostetter [EMAIL PROTECTED] wrote:
 : I'm having the same date boosting problem as well. I'm using this function:
 : F = recip(rord(creationDate),1,1000,1000)^10. However, since I have around
 : 10,000 documents added in one day, rord(createDate) returns very
 : different values for the same createDate. For example, the last document

 you may want to consider rounding dates down to the nearest day when
 indexing, that way everything published on the same day would have the
 same value and thus the same ordinal value.

Yeah, and that will save index space and a lot of memory (smaller
FieldCache entry) too.
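
A minimal sketch of that rounding on the indexing side, in Python. It
truncates a timestamp to midnight UTC and formats it the way Solr date fields
expect; the function name is made up, and using UTC as the day boundary is an
assumption you may want to change.

```python
from datetime import datetime, timezone


def round_down_to_day(timestamp):
    """Truncate an aware datetime to midnight UTC, formatted for a Solr date field."""
    day = timestamp.astimezone(timezone.utc).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return day.strftime("%Y-%m-%dT%H:%M:%SZ")


morning = round_down_to_day(datetime(2007, 8, 12, 9, 15, 30, tzinfo=timezone.utc))
evening = round_down_to_day(datetime(2007, 8, 12, 23, 59, 59, tzinfo=timezone.utc))
print(morning, evening)
```

Both documents end up with the same indexed value, so they share one ordinal.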

-Yonik

Re: FunctionQuery and boosting documents using date arithmetic

2007-08-12 Thread Pieter Berkel

On 11/08/07, Chris Hostetter [EMAIL PROTECTED] wrote:

 i would agree with you there, this is where a more robust (ie:
 less efficient) DateField-ish class that supports configuration options
 to specify:
   1) the output format
   2) the input format(s)
   3) the indexed format
 ...as SimpleDateFormatter pattern strings would be handy.  The
 ValueSource it uses could return seconds (or some other unit based on
 another config option) since epoch as the intValue.


That definitely sounds like a sensible and flexible approach, I'll have to
take a closer look at the ValueSource and FunctionQuery classes and see what
I can come up with.

it's been discussed before, but there are a lot of tricky issues involved
 which is probably why no one has really tackled it.


It does seem somehow related to the issue of making the value of NOW
constant during the entire execution of a query, hopefully not in the
too-hard basket.

be careful what you wish for.  you are 100% correct that functions using
 the (r)ord value of a DateField aren't a function of true age, but
 depending on how you look at it that may be better than using the real age
 (i think so anyway).


I understand the problems you describe with using true age values, although
I wonder how much recip() (or perhaps some other logarithmic function) would
be able to dampen any unpleasant side-effects created by unusual publishing
patterns, not publishing on weekends, etc.  Using min age sounds like a
much better idea than using NOW to avoid any of the described weirdness too,
but that might increase the complexity of the function.

I'm still keen to get something working, at least to compare the results it
generates with the current ordinal method.

Piete

Re: FunctionQuery and boosting documents using date arithmetic

2007-08-12 Thread Pieter Berkel

Do you consistently add 10,000 documents to your index every day or does the
number of new documents added per day vary?


On 11/08/07, climbingrose [EMAIL PROTECTED] wrote:

 I'm having the same date boosting problem as well. I'm using this function:
 F = recip(rord(creationDate),1,1000,1000)^10. However, since I have around
 10,000 documents added in one day, rord(createDate) returns very
 different values for the same createDate. For example, the last document
 added will have rord(createdDate) = 1 while the first document added that day
 will have rord(createdDate) = 10,000. When rord(createDate) > 10,000, the value
 of F is approaching 0. Therefore, the boost query doesn't make any difference
 between the last document added today and the document added 10 days
 ago. Now if I replace 1000 in F with a large number, say 10,  the boost
 function suddenly gives the last few documents an enormous boost and makes the
 other query scores irrelevant.

 So in my case (and many others', I believe), the true date value would be
 more appropriate. I'm thinking along the same lines of adding a timestamp. It
 wouldn't add much overhead this way, would it?

Re: FunctionQuery and boosting documents using date arithmetic

2007-08-11 Thread climbingrose

I'm having the same date boosting problem as well. I'm using this function:
F = recip(rord(creationDate),1,1000,1000)^10. However, since I have around
10,000 documents added in one day, rord(createDate) returns very
different values for the same createDate. For example, the last document
added will have rord(createdDate) = 1 while the first document added that day
will have rord(createdDate) = 10,000. When rord(createDate) > 10,000, the value
of F is approaching 0. Therefore, the boost query doesn't make any difference
between the last document added today and the document added 10 days
ago. Now if I replace 1000 in F with a large number, say 10,  the boost
function suddenly gives the last few documents an enormous boost and makes the
other query scores irrelevant.

So in my case (and many others', I believe), the true date value would be
more appropriate. I'm thinking along the same lines of adding a timestamp. It
wouldn't add much overhead this way, would it?
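
To put numbers on this, here is a small Python sketch of recip(x,m,a,b),
which computes a / (m*x + b), evaluated at a few reverse ordinals. The figure
of 100,000 for a document from ten days ago assumes a steady 10,000 new
documents per day with distinct timestamps; that is my assumption for
illustration.

```python
def recip(x, m, a, b):
    """Reciprocal function as used in function queries: a / (m*x + b)."""
    return a / (m * x + b)


# rord = 1 for the newest document in the index.
newest = recip(1, 1, 1000, 1000)
# rord = 10,000 for the first document added earlier the same day.
first_today = recip(10_000, 1, 1000, 1000)
# rord = 100,000 for a document added about ten days ago (assumed volume).
ten_days_ago = recip(100_000, 1, 1000, 1000)

print(newest, first_today, ten_days_ago)
```

So two documents from the same day differ by roughly a factor of eleven,
while everything older than a day sits in a narrow band close to zero.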

Regards,



On 8/11/07, Chris Hostetter [EMAIL PROTECTED] wrote:


 : Actually, just thinking about this a bit more, perhaps adding a function
 : call such as parseDate() might add too much overhead to the actual query,
 : perhaps it would be better to first convert the date to a timestamp at index
 : time and store it in a field type slong?  This might be more efficient but

 i would agree with you there, this is where a more robust (ie:
 less efficient) DateField-ish class that supports configuration options
 to specify:
   1) the output format
   2) the input format(s)
   3) the indexed format
 ...as SimpleDateFormatter pattern strings would be handy.  The
 ValueSource it uses could return seconds (or some other unit based on
 another config option) since epoch as the intValue.

 it's been discussed before, but there are a lot of tricky issues involved
 which is probably why no one has really tackled it.

 : that still leaves the problem of obtaining the current timestamp to use in
 : the boost function.

 it would be pretty easy to write a ValueSource that just knew about now
 as seconds since epoch.

 :  While it seems to work pretty well, I've realised that this may not be
 :  quite as effective as i had hoped given that the calculation is based on the
 :  ordinal of the field value rather than the value of the field itself.  In
 :  cases where the field type is 'date' and the actual field values are not
 :  distributed evenly across all documents in the index, the value returned by
 :  rord() is not going to give a true reflection of document age.  For example,

 be careful what you wish for.  you are 100% correct that functions using
 the (r)ord value of a DateField aren't a function of true age, but
 depending on how you look at it that may be better than using the real age
 (i think so anyway).  While it sounds appealing to say that docA should
 score half as high as docB if it is twice as old, that typically isn't all
 that important when dealing with recent dates; and when dealing with older
 dates the ordinal value tends to approximate it decently well ... where a
 true measure of age might screw you up is when you have situations where
 few/no new articles get published on weekends (or late at night).  it's
 also very confusing to people when the ordering of documents changes even
 though no new documents have been published -- that can easily happen if
 you are heavily boosting on a true age calculation but will never happen
 when dealing with an ordinal ranking of documents by age.

 (although, this could be compensated for by doing all of your true age
 calculations relative to the min age of all articles in your index -- but
 you would still get really weird 'big' shifts in scores as soon as that
 first article gets published on monday morning.)


 -Hoss




-- 
Regards,

Cuong Hoang

Re: FunctionQuery and boosting documents using date arithmetic

2007-08-10 Thread Chris Hostetter


: Actually, just thinking about this a bit more, perhaps adding a function
: call such as parseDate() might add too much overhead to the actual query,
: perhaps it would be better to first convert the date to a timestamp at index
: time and store it in a field type slong?  This might be more efficient but

i would agree with you there, this is where a more robust (ie:
less efficient) DateField-ish class that supports configuration options
to specify:
  1) the output format
  2) the input format(s)
  3) the indexed format
...as SimpleDateFormatter pattern strings would be handy.  The
ValueSource it uses could return seconds (or some other unit based on
another config option) since epoch as the intValue.

it's been discussed before, but there are a lot of tricky issues involved
which is probably why no one has really tackled it.

: that still leaves the problem of obtaining the current timestamp to use in
: the boost function.

it would be pretty easy to write a ValueSource that just knew about now
as seconds since epoch.

:  While it seems to work pretty well, I've realised that this may not be
:  quite as effective as i had hoped given that the calculation is based on the
:  ordinal of the field value rather than the value of the field itself.  In
:  cases where the field type is 'date' and the actual field values are not
:  distributed evenly across all documents in the index, the value returned by
:  rord() is not going to give a true reflection of document age.  For example,

be careful what you wish for.  you are 100% correct that functions using
hte (r)ord value of a DateField aren't a function of true age, but
dependong on how you look at it that may be better then using the real age
(i think so anyway).  Why it sounds appealing to say that docA should
score half as high as docB if it is twice as old, that typically isn't all
that important when dealing with recent dates; and when dealing with older
dates the ordinal value tends to approximate it decently well ... where a
true measure of age might screw you up is when you have situations where
few/no new articles get published on weekends (or late at night).  it's
also very confusing to people when the ordering of documents changes even
though no new documents have been published -- that can easily happen if
you are heavily boosting on a true age calculation but will never happen
when dealing with an ordinal ranking of documents by age.

(although, this could be compensated by doing all of your true age
calculations relative to the min age of all articles in your index -- but
you would still get really weird 'big' shifts in scores as soon as that
first article gets published on monday morning.)
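
A toy illustration of the reordering effect, in Python. The two documents, their relevance scores, the additive combination of relevance and boost, and the constants passed to recip are all made up for the example; only the formula recip(x,m,a,b) = a/(m*x+b) is taken from Solr.

```python
def recip(x, m, a, b):
    """Solr's recip(x, m, a, b) = a / (m*x + b)."""
    return a / (m * x + b)


# Hypothetical documents: (name, relevance score, day of publication).
DOCS = [("docA", 1.0, 9), ("docB", 1.3, 0)]


def rank_by_true_age(today):
    """Order by relevance plus a boost on true age in days."""
    scored = [(rel + recip(today - day, 1, 10, 10), name) for name, rel, day in DOCS]
    return [name for _, name in sorted(scored, reverse=True)]


def rank_by_rord():
    """Order by relevance plus a boost on reverse ordinal (newest doc has rord 1)."""
    newest_first = sorted(DOCS, key=lambda d: d[2], reverse=True)
    rord = {name: i + 1 for i, (name, _, _) in enumerate(newest_first)}
    scored = [(rel + recip(rord[name], 1, 10, 10), name) for name, rel, _ in DOCS]
    return [name for _, name in sorted(scored, reverse=True)]
```

Here rank_by_true_age(10) puts docA first, but rank_by_true_age(40) puts docB first even though nothing new was indexed in between. rank_by_rord() takes no date at all, so its ordering only changes when a new document shifts the ordinals.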


-Hoss

FunctionQuery and boosting documents using date arithmetic

2007-08-06 Thread Pieter Berkel

I've been using a simple variation of the boost function given in the
examples used to boost more recent documents:

recip(rord(creationDate),1,1000,1000)^1.3

While it seems to work pretty well, I've realised that this may not be quite
as effective as I had hoped given that the calculation is based on the
ordinal of the field value rather than the value of the field itself.  In
cases where the field type is 'date' and the actual field values are not
distributed evenly across all documents in the index, the value returned by
rord() is not going to give a true reflection of document age.  For example,
using Hoss' new date faceting feature, I can see that the rate at which
documents have been added to the index I'm maintaining has been slowly but
steadily increasing over the past few months, and I fear this fact will skew
the boost value calculated by the function listed above.
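
A small Python sketch of the skew, using made-up document ages (dense over the last week, sparse before that). rord_by_age is a hypothetical helper that mimics rord() over distinct values, with the newest date getting ordinal 1.

```python
def rord_by_age(ages_in_days):
    """Map each distinct age to its reverse ordinal: the newest value gets 1."""
    distinct = sorted(set(ages_in_days))  # smallest age = most recent date
    return {age: rank for rank, age in enumerate(distinct, start=1)}


# Hypothetical index: one document per day for the last week, few older ones.
AGES = [0, 1, 2, 3, 4, 5, 6, 7, 30, 60, 90, 365]
RORD = rord_by_age(AGES)

# Ordinal distance between a week-old and a year-old document ...
gap_old = RORD[365] - RORD[7]
# ... versus between today's document and a four-day-old one.
gap_recent = RORD[4] - RORD[0]
```

Both gaps come out as 4, so a function of rord() treats the jump from one week to one year the same as the jump from today to four days ago.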

There doesn't currently seem to be any way of performing date arithmetic or
converting a date field into an integer (seconds since epoch?).  Ideally I'd
like to be able to do something like:

recip(intval(parseDate('NOW')-parseDate(creationDate)),1,1000,1000)^1.3

so that the function calculates the boost based on the actual document age,
rather than the relative age.  Does anybody have any thoughts or comments on
this approach?
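
For what it's worth, here is what that expression would compute, sketched in Python. parseDate() and intval() do not exist in Solr, so this just takes epoch seconds directly, and it reads ^1.3 as a boost multiplier rather than an exponent (my assumption). The function name age_boost and its parameters are illustrative.

```python
def recip(x, m, a, b):
    """Solr's recip(x, m, a, b) = a / (m*x + b)."""
    return a / (m * x + b)


def age_boost(now_seconds, created_seconds, m=1.0, weight=1.3):
    """Boost from actual document age in seconds, with the constants from above."""
    age = now_seconds - created_seconds
    return weight * recip(age, m, 1000, 1000)


NOW = 1186358400  # arbitrary fixed 'now' in epoch seconds, for illustration
```

One thing this makes visible: with the age in seconds, constants chosen for ordinals decay very quickly. age_boost(NOW, NOW - 1000) is already half of the maximum 1.3, and a one-day-old document gets roughly 0.015. Passing something like m=1/86400 (so that x is effectively in days) keeps a one-day-old document near 1.3.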

cheers,
Piete

Re: FunctionQuery and boosting documents using date arithmetic

2007-08-06 Thread Pieter Berkel

Actually, just thinking about this a bit more, perhaps adding a function
call such as parseDate() might add too much overhead to the actual query.
Perhaps it would be better to first convert the date to a timestamp at index
time and store it in a field of type slong?  This might be more efficient, but
that still leaves the problem of obtaining the current timestamp to use in
the boost function.



