Re: SOLR ranking

2016-02-19 Thread Alessandro Benedetti



-- 
--

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England


Re: SOLR ranking

2016-02-19 Thread Ere Maijala
If he needs faceting or something (I didn't see that specified), doing
two queries won't do, of course.


--Ere

On 19.2.2016 at 2.22, Binoy Dalal wrote:

Hi Alessandro,
Don't get me wrong. Using mm, ps and pf can and absolutely will solve his
problem.

Like I said above, my solution is meant to be a quick and dirty fix. It's
really not that complex and shouldn't take more than an hour to set up at
the app level. Moreover, I suggested it because he said it was urgent for
him, and setting up a proper config with mm, pf and ps might take him much
longer.

Hope this clears things up :)


Re: SOLR ranking

2016-02-18 Thread Binoy Dalal
-- 
Regards,
Binoy Dalal


Re: SOLR ranking

2016-02-18 Thread Alessandro Benedetti
Hey Binoy,
I can't understand why such complexity is needed, to be honest :/
Can you explain to me why playing with:

edismax
mm (the percentage of query terms you want to be in the results)
pf (the fields you want to be boosted if the phrase matches)
ps (the slop to allow)

would not solve the problem instead of the two-phase query?

Cheers
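
For reference, a minimal sketch of what such a request could look like when assembled in Python. The host, core name and field boosts are taken from the query posted earlier in the thread; the helper name build_edismax_url and the slop value of 2 are only assumptions for illustration, not a recommended setting.

```python
from urllib.parse import urlencode


def build_edismax_url(base_url, user_query, phrase_slop=2):
    """Build a select URL where all terms are required (mm=100%)
    and documents matching the whole phrase get an extra boost (pf/ps)."""
    params = {
        "q": user_query,
        "defType": "edismax",
        # mm=100% makes every query term mandatory.
        "mm": "100%",
        # Fields searched for the individual terms.
        "qf": "topic_title^100 subtopic_title^40 index_term^20 drug^15 content^3",
        # Fields boosted when the terms occur together as a phrase.
        "pf": "topTitle^200 subtopTitle^80 indTerm^40 drugString^30 tglData^6",
        # Slop allowed between the phrase terms (assumed value).
        "ps": str(phrase_slop),
        "wt": "json",
    }
    return base_url + "?" + urlencode(params)


url = build_edismax_url("http://localhost:8983/solr/tgl/select",
                        "rheumatoid arthritis")
print(url)
```

Using urlencode keeps every parameter name and value correctly escaped, which also avoids hand-editing long URLs.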


Re: SOLR ranking

2016-02-18 Thread Binoy Dalal
Here's an alternative solution that may be of some help.
Here I'm assuming that you are not directly outputting the search results
to the user and have some sort of layer between the results from Solr and
presentation to the user where some additional processing can be performed.

1) You already know that you want phrase matches to show up higher than
single matches. In this case, why not do an explicit phrase match first,
either as is or with some slop, depending on how close you want the phrase
terms to be to each other?
2) Once you have the results from the first query, fire an OR query with
your terms and get those results.
3) Put the results from (2) after those from (1) and present them to the
user. This happens in the app layer.

This is essentially the same as running a query such as: "Rheumatoid
Arthritis"~slop OR (Rheumatoid AND Arthritis) but you don't need to worry
about the ordering because you're ordering the results yourself.
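
The three steps above can be sketched roughly as follows. This is only an illustration of the app-layer part: the function names are made up, the documents are assumed to be dicts carrying a unique "id" field, and actually sending the two queries to Solr is left out.

```python
def build_queries(terms, slop=0, operator="OR"):
    """Return (phrase_query, term_query) strings for the two requests."""
    phrase = '"' + " ".join(terms) + '"'
    if slop > 0:
        # "a b"~N allows the terms to be up to N positions apart.
        phrase += "~" + str(slop)
    term_query = "(" + (" " + operator + " ").join(terms) + ")"
    return phrase, term_query


def merge_phrase_first(phrase_docs, term_docs, id_field="id"):
    """Put phrase matches first, then the remaining term matches,
    dropping documents that were already returned by the phrase query."""
    seen = set()
    merged = []
    for doc in list(phrase_docs) + list(term_docs):
        doc_id = doc[id_field]
        if doc_id not in seen:
            seen.add(doc_id)
            merged.append(doc)
    return merged


phrase_q, term_q = build_queries(["rheumatoid", "arthritis"], slop=2)
print(phrase_q)
print(term_q)
```

The de-duplication matters because every document that matches the phrase will normally match the term query as well.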

Now, this will obviously take more time since you're querying twice and
then doing the additional processing in the app layer, but provided your
architecture is balanced enough and can cope with a little extra load, I do
not think that your performance will take that bad a hit. Moreover, since
you're in a hurry, you could implement this as a quick and dirty solution
to meet the project goals, provided it fits the acceptance parameters, and
then later play around with the scoring/sorting and figure out the best
possible setup to suit your needs.

Regards,
Binoy Dalal


Re: SOLR ranking

2016-02-18 Thread Emir Arnautovic

Hi Nitin,
Can you send us what your parsed query looks like (from the debug output)?

Thanks,
Emir
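
In case it helps, a small sketch of how the parsed query could be pulled out of the debug output. It assumes the request is sent with debugQuery=true and wt=json, in which case the response carries a "debug" section with "parsedquery" and "parsedquery_toString" entries; the helper names are hypothetical.

```python
from urllib.parse import urlencode


def debug_query_url(base_url, user_query):
    """Build a select URL that asks Solr to include debug information."""
    params = {
        "q": user_query,
        "defType": "edismax",
        "debugQuery": "true",
        "wt": "json",
    }
    return base_url + "?" + urlencode(params)


def parsed_query(response):
    """Extract the parsed query strings from a decoded JSON response (a dict).

    Returns a (parsedquery, parsedquery_toString) tuple; either value is
    None when the response has no debug section.
    """
    debug = response.get("debug", {})
    return debug.get("parsedquery"), debug.get("parsedquery_toString")


print(debug_query_url("http://localhost:8983/solr/tgl/select",
                      "rheumatoid arthritis"))
```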



--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/



Re: SOLR ranking

2016-02-17 Thread Nitin.K
Hi Binoy,

We are searching for both phrases and individual words,
but we want the documents that contain the phrase to come
first in the order, followed by those matching the individual words.

termPositions = true is also not working in my case.

I have also removed the string type from the copy fields. Kindly look into the
changed configuration below:

Hi Emir,

I have changed the configuration as per your suggestion and added pf2 / pf3.
Yes, I saw the difference, but the ranking is still not followed
correctly in the case of phrases.

Changed configuration:















Copy fields again for reference:







Added the following field type:









Removed the string type from the copy fields.

Changed query:

http://localhost:8983/solr/tgl/select?q=rheumatoid%20arthritis=xml=1.0=200=AND=true=edismax=true=true=true;
pf=topTitle^200 subtopTitle^80 indTerm^40 drugString^30 tglData^6&
pf2=topTitle^200 subtopTitle^80 indTerm^40 drugString^30 tglData^6&
pf3=topTitle^200 subtopTitle^80 indTerm^40 drugString^30 tglData^6&
qf=topic_title^100 subtopic_title^40 index_term^20 drug^15 content^3
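
As a rough illustration of what pf2 and pf3 add on top of pf: they boost on word pairs and word triples taken from the query, so a two-word query such as the one above yields one pair for pf2 and nothing for pf3. The sketch below approximates this with a plain whitespace split; the real parser works on the analyzed query, and the function name is made up.

```python
def word_shingles(query, size):
    """Return the consecutive word groups of the given size.

    size=2 approximates the phrases pf2 would boost on,
    size=3 those of pf3. A query shorter than `size` yields nothing.
    """
    words = query.split()
    return [" ".join(words[i:i + size])
            for i in range(len(words) - size + 1)]


print(word_shingles("rheumatoid arthritis", 2))
print(word_shingles("rheumatoid arthritis", 3))
print(word_shingles("juvenile rheumatoid arthritis treatment", 3))
```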

After making these changes, I am able to get my search results in the correct
order for a single term, but in the case of a phrase search I am still not
able to get the results in the correct order.

Hi Modassar,

I tried using mm=100, but the order is still the same.

Hi Alessandro,

I have not yet tried the slop parameter. By default it is taken as 1.0,
as I saw in debug mode. I will definitely get back to you, so let me try
this option too.

All,

Please let me know if anyone has any other suggestions on this. I have to
implement it on an urgent basis and I think I am very close to it. Thanks to
all of you. I have reached this level just because of you guys.

Thanks and Regards,
Nitin



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SOLR-ranking-tp4257367p4257782.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: SOLR ranking

2016-02-16 Thread david.w.smi...@gmail.com
I just want to interject to say one thing:
You *can* sort on multi-valued fields as of recent Solr 5 releases. It's
done using the "field" function query with either "min" or "max" as the 2nd
argument:
https://cwiki.apache.org/confluence/display/solr/Function+Queries
Of course it'd be nicer to simply sort asc/desc on the field as usual
and not use this special syntax, but AFAIK that convenience hasn't been
added yet.
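
A small sketch of how such a sort parameter could be put together. The field name "ratings" is purely hypothetical, and as far as I know the field needs to be a multi-valued field with docValues for this to work, so please check the Function Queries page for the field types your Solr version supports.

```python
from urllib.parse import urlencode


def multivalued_sort_param(field_name, selector="min", direction="asc"):
    """Build a sort value such as 'field(ratings,min) asc'."""
    if selector not in ("min", "max"):
        raise ValueError("selector must be 'min' or 'max'")
    if direction not in ("asc", "desc"):
        raise ValueError("direction must be 'asc' or 'desc'")
    return "field({},{}) {}".format(field_name, selector, direction)


sort_value = multivalued_sort_param("ratings", "max", "desc")
print(sort_value)
# URL-encoded form, ready to append to a select request.
print(urlencode({"sort": sort_value}))
```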

~ David

On Mon, Feb 15, 2016 at 10:26 AM Binoy Dalal <binoydala...@gmail.com> wrote:

> I'm sorry, missed that part. It's true, you cannot sort on multivalued
> fields. The workaround will be pretty complex; you'll either have to find
> the max or min value of the fields at index time and store those in
> separate fields and use those to sort, or somehow come up with some
> function that can convert the values from your multivalued field into a
> single value (something like sum(field)) but it surely won't be trivial.
>
> Instead you should do what Emir's saying.
> Boost your fields at index or query time based on how you want to sort your
> documents.
> So in your case, give the highest boost to topic_title then a little lower
> to subtopic_title and so on. This should return your documents in the
> correct order.
> You will have to play around with the boost values a little to get them
> right, though.
>
> Alternatively, you could boost on the multivalued fields and then sort
> based on your single valued fields.
>
> Either way, you'll have to experiment and see what works best for you.
>
> On Mon, Feb 15, 2016 at 8:21 PM Nitin.K <nitin.kanu...@adi-mps.com> wrote:
>
> > Thanks Binoy..
> >
> > Actually it is throwing following error:
> >
> > can not sort on multivalued field: index_term
> >
> >
> >
> > --
> > View this message in context:
> > http://lucene.472066.n3.nabble.com/SOLR-ranking-tp4257367p4257378.html
> > Sent from the Solr - User mailing list archive at Nabble.com.
> >
> --
> Regards,
> Binoy Dalal
>
-- 
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com


Re: SOLR ranking

2016-02-16 Thread Alessandro Benedetti
> Actually you can get it with edismax.
> Just set mm to 100% and then configure a pf field (or more).
> All the search terms will then be mandatory, and phrase matches will be
> boosted.
>
> Cheers
>
> On 16 February 2016 at 07:57, Emir Arnautovic
> <emir.arnauto...@sematext.com> wrote:
>
> > Hi Nitin,
> > You can use the pf parameter to boost results with the exact phrase. You
> > can also use pf2 and pf3 to boost results with bigrams (phrase matches
> > with 2 or 3 words in case the input has more than 3 words)
> >
> > Regards,
> > Emir
> >
> > On 16.02.2016 06:18, Nitin.K wrote:
> >
> > > I am using the edismax parser with the following query:
> > >
> > > localhost:8983/solr/tgl/select?q=eating%20disorders=xml=1.0=200=AND=true=edismax=true=true=true=topic_title%5E100+subtopic_title%5E40+index_term%5E20+drug%5E15+content%5E3=topTitle%5E200+subTopTitle%5E80+indTerm%5E40+drugString%5E30+content%5E6
> > >
> > > Configuration of schema.xml
> > >
> > > [schema.xml excerpt: the XML element names were lost in the archive.
> > > Surviving fragments: field definitions with stored="true" or
> > > stored="false", several with multiValued="true"; a field type with
> > > positionIncrementGap="100" omitNorms="true" whose analyzers use
> > > solr.StandardTokenizerFactory, a filter with ignoreCase="true"
> > > words="stopwords.txt", solr.LowerCaseFilterFactory and, in one
> > > analyzer, solr.SynonymFilterFactory (synonyms="synonyms.txt"
> > > ignoreCase="true" expand="true"); and a field type with
> > > positionIncrementGap="100" omitTermFreqAndPositions="true"
> > > omitNorms="true" using solr.WhitespaceTokenizerFactory, a filter with
> > > ignoreCase="true" words="stopwords.txt" and
> > > solr.LowerCaseFilterFactory.]
> > >
> > > I want a phrase to always take priority over the individual words
> > > when the user searches for a phrase.
> > >
> > > Example: "Eating Disorders"
> > >
> > > First it will search for "Eating Disorders" together and then the
> > > individual words "Eating" and "Disorders",
> > > but while searching for individual words, it will always return those
> > > documents where both the words exist, for which I am already using
> > > q.op="AND" in my query.
> > >
> > > Thanks,
> > > Nitin
> > >
> > > --
> > > View this message in context:
> > > http://lucene.472066.n3.nabble.com/SOLR-ranking-tp4257367p4257510.html
> > > Sent from the Solr - User mailing list archive at Nabble.com.



-- 
--

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England


Re: SOLR ranking

2016-02-16 Thread Alessandro Benedetti


Re: SOLR ranking

2016-02-16 Thread Binoy Dalal
-- 
Regards,
Binoy Dalal


Re: SOLR ranking

2016-02-16 Thread Modassar Ather


Re: SOLR ranking

2016-02-16 Thread Alessandro Benedetti
If I remember correctly, pf is applied as a phrase query (as when you wrap
the terms in "quotes").
So "close proximity" means a match of the phrase with 0 tolerance: the terms
must appear at the same relative positions as in the query.
If I remember correctly, I debugged this recently.
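
To make the "0 tolerance" point concrete, here is a minimal Python sketch of a slop-0 phrase match over token positions. It only illustrates the idea and is not Solr or Lucene code: the helper names `tokenize` and `phrase_match` are made up for this example, and analysis is reduced to lowercasing and splitting on whitespace.

```python
def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for a real analyzer)."""
    return text.lower().split()


def phrase_match(field_value, phrase):
    """Return True if the phrase tokens occur at consecutive positions (slop 0)."""
    tokens = tokenize(field_value)
    wanted = tokenize(phrase)
    if not wanted:
        return False
    last_start = len(tokens) - len(wanted)
    return any(tokens[i:i + len(wanted)] == wanted for i in range(last_start + 1))


print(phrase_match("Treating eating disorders in teens", "eating disorders"))  # True
print(phrase_match("Disorders linked to eating habits", "eating disorders"))   # False
```

If some distance between the terms should still count as a phrase match, edismax has a ps (phrase slop) parameter that applies to the pf query.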

Cheers


Re: SOLR ranking

2016-02-16 Thread Modassar Ather
> Actually you can get it with the edismax.
> Just set mm to 100% and then configure a pf field ( or more) .
> You are going to search all the search terms mandatory and boost phrases
> match .

@Alessandro, thanks for your insight.
I thought that setting pf boosts a document when all of the terms appear in
close proximity, but I am not sure how close "close proximity" has to be.
I checked the DisMax query parser wiki too.

Best,
Modassar



Re: SOLR ranking

2016-02-16 Thread Emir Arnautovic

Hi Nitin,
I am not sure whether you changed the fields you use for phrase boosting, but
in the example you sent, all pf fields except content are "string" fields, and
content is boosted by 6 while topic_title in qf is boosted by 100.
Try using in pf2 the same fields you use in qf and you should see the
difference. After that you can experiment with field analysis and with which
fields to use just for boosting.
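
One way to try this is to derive the phrase-boost parameters from the same fields that are in qf. The Python sketch below only builds parameter strings. The helper `phrase_fields_from_qf` and the doubling factor are assumptions made for illustration (the factor mirrors the ratio between the pf and qf boosts in the original query).

```python
def phrase_fields_from_qf(qf, factor=2):
    """Reuse the qf fields for phrase boosting, scaling each boost by `factor`."""
    parts = []
    for entry in qf.split():
        field, _, boost = entry.partition("^")
        scaled = int(boost) * factor if boost else factor
        parts.append(f"{field}^{scaled}")
    return " ".join(parts)


qf = "topic_title^100 subtopic_title^40 index_term^20 drug^15 content^3"
pf = phrase_fields_from_qf(qf)
params = {"defType": "edismax", "qf": qf, "pf": pf, "pf2": pf}
print(params["pf2"])
# topic_title^200 subtopic_title^80 index_term^40 drug^30 content^6
```

Note that a field can only contribute to a phrase boost if it is indexed with positions, so fields whose type still has omitTermFreqAndPositions="true" would need to be changed and reindexed first.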


Regards,
Emir



--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/



Re: SOLR ranking

2016-02-16 Thread Alessandro Benedetti
Nitin, have you read my reply?

> kindly let me know, how can i first search the phrase and then go to the
> individual words (i.e word-1 AND word-2)




Re: SOLR ranking

2016-02-16 Thread Binoy Dalal
Based on a quick look at the documentation, I think that you should use
termPositions=true to achieve what you want.

-- 
Regards,
Binoy Dalal


Re: SOLR ranking

2016-02-16 Thread Nitin.K
Hi Emir,

I tried the phrase boost parameters after removing omitTermFreqAndPositions
from the multivalued field type, but when I search for a phrase, the documents
with an exact match still do not come first. Instead, for the content field,
Solr seems to use the combined count of both terms and orders the results by
that.

Kindly let me know how I can match the phrase first and then fall back to the
individual words (i.e. word-1 AND word-2).



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SOLR-ranking-tp4257367p4257556.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: SOLR ranking

2016-02-16 Thread Nitin.K
You are absolutely right, Binoy!

But my problem is that we don't want term frequency to be taken into account
for the index term and drug fields (i.e. the number of occurrences of the
search term in these two fields should not affect the score).
Is it possible to omit term frequency for these two fields and still index
them with term positions for phrase search?

I tried using omitTermFreqAndPositions="true" together with
omitPositions="false", but that's not working for me.

Thanks,
Nitin




--
View this message in context: 
http://lucene.472066.n3.nabble.com/SOLR-ranking-tp4257367p4257551.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: SOLR ranking

2016-02-16 Thread Binoy Dalal
@Nitin
Why are you phrase boosting on string fields?
More often than not, it won't do anything because the phrases simply won't
match the entire string.
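
A rough way to see why: a string field is indexed as one token holding the whole value, so a two-word phrase only matches when it equals the entire value. The Python below is a toy model with made-up helper names, not Solr code, and the text analysis is reduced to lowercasing and whitespace splitting.

```python
def string_field_tokens(value):
    """A string field keeps the whole value as a single, unmodified token."""
    return [value]


def text_field_tokens(value):
    """A tokenized text field, reduced here to lowercase + whitespace split."""
    return value.lower().split()


def contains_phrase(tokens, phrase_tokens):
    """True if phrase_tokens appear as a consecutive run inside tokens."""
    n = len(phrase_tokens)
    return any(tokens[i:i + n] == phrase_tokens for i in range(len(tokens) - n + 1))


title = "Eating Disorders in Adolescents"
phrase = "eating disorders"

# String field: the phrase is compared against the one whole-value token.
print(contains_phrase(string_field_tokens(title), [phrase]))  # False
# Text field: the phrase is compared token by token.
print(contains_phrase(text_field_tokens(title), text_field_tokens(phrase)))  # True
```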

-- 
Regards,
Binoy Dalal


Re: SOLR ranking

2016-02-16 Thread Alessandro Benedetti
Binoy, omitTermFreqAndPositions is set only for text_ws, which is used
only on the "indexed_terms" field.
The text_general fields seem fine to me.

Are you omitting norms on purpose? Norms could be relevant in title or
short-topic searches, because they boost short field values that contain
many of the terms from the query.

To respond to Modassar:

> I don't think the phrase will be searched as individual ANDed terms until
> the query has it like below.
> "Eating Disorders" OR (Eating AND Disorders).

Actually, you can get this with edismax.
Just set mm to 100% and then configure one or more pf fields.
All the search terms then become mandatory, and phrase matches are
boosted.
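
As a minimal sketch of such a request, the Python below builds the query string with the standard library. The host, core, and qf value are taken from the query in this thread. The choice of pf fields and boosts is an assumption for illustration only.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

base = "http://localhost:8983/solr/tgl/select"
params = {
    "q": "eating disorders",
    "defType": "edismax",
    "mm": "100%",  # every query term must match
    "qf": "topic_title^100 subtopic_title^40 index_term^20 drug^15 content^3",
    "pf": "topic_title^200 subtopic_title^80 content^6",  # illustrative phrase boosts
}
url = base + "?" + urlencode(params)
print(url)

# Round trip: decoding the query string gives back the original values.
decoded = parse_qs(urlsplit(url).query)
print(decoded["mm"], decoded["pf"])
```

Letting `urlencode` do the escaping avoids hand-encoding characters such as `^`, `%`, and spaces in the boost lists.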

Cheers



Re: SOLR ranking

2016-02-15 Thread Emir Arnautovic

Hi Nitin,
You can use the pf parameter to boost results that contain the exact phrase.
You can also use pf2 and pf3 to boost results that match word bigrams and
trigrams of the query (phrase matches on 2 or 3 adjacent words, useful when
the input has more than 3 words).
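
To show which word groups pf2 and pf3 work on, here is a small Python sketch that lists the adjacent word pairs and triples of a query. It is a simplified model: `word_shingles` is a made-up helper, and real query analysis (stopword removal, for instance) is ignored.

```python
def word_shingles(query, n):
    """Return every run of n adjacent words in the query."""
    words = query.split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


query = "eating disorders in teenagers"
print(word_shingles(query, 2))
# ['eating disorders', 'disorders in', 'in teenagers']
print(word_shingles(query, 3))
# ['eating disorders in', 'disorders in teenagers']
```

A document that contains any of these pairs or triples as a phrase in a pf2 or pf3 field gets the corresponding boost, even when the full query does not appear as one phrase.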


Regards,
Emir

On 16.02.2016 06:18, Nitin.K wrote:

I am using edismax parser with the following query:

localhost:8983/solr/tgl/select?q=eating%20disorders=xml=1.0=200=AND=true=edismax=true=true=true=topic_title%5E100+subtopic_title%5E40+index_term%5E20+drug%5E15+content%5E3=topTitle%5E200+subTopTitle%5E80+indTerm%5E40+drugString%5E30+content%5E6

Configuration of schema.xml

[field, copyField and fieldType definitions were stripped from the archived message]

What I want: when a user searches for a phrase, documents matching the whole
phrase should always take priority over documents matching only the
individual words.

Example: "Eating Disorders"

It should first match "Eating Disorders" together and then the individual
words "Eating" and "Disorders".
When matching the individual words, it should return only documents that
contain both words, which is why I am already using q.op="AND" in my query.

Thanks,
Nitin




--
View this message in context: 
http://lucene.472066.n3.nabble.com/SOLR-ranking-tp4257367p4257510.html
Sent from the Solr - User mailing list archive at Nabble.com.





Re: SOLR ranking

2016-02-15 Thread Binoy Dalal
Firstly, to do phrase searching you need to set
omitTermFreqAndPositions=false.
You've set this to true.
Changing it will require a reindex.
Secondly, it will help to check the debugQuery output and see how the
query is parsed and searched.
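
A minimal Python sketch of that second step: it adds debugQuery=true to the request and pulls the parsed query and the per-document explanations out of the JSON response. The host and core come from this thread. The helper `debug_summary` is made up for this example, and the sample response is a hand-written placeholder, not real Solr output.

```python
import json
from urllib.parse import urlencode

params = {
    "q": "eating disorders",
    "defType": "edismax",
    "debugQuery": "true",
    "wt": "json",
}
url = "http://localhost:8983/solr/tgl/select?" + urlencode(params)


def debug_summary(response_text):
    """Pull the parsed query and per-document explanations out of a debug response."""
    debug = json.loads(response_text).get("debug", {})
    return debug.get("parsedquery"), debug.get("explain", {})


# Placeholder response, hand-written for this example (not real Solr output).
sample = json.dumps(
    {"debug": {"parsedquery": "<parsed query>", "explain": {"doc1": "<score breakdown>"}}}
)
parsed, explain = debug_summary(sample)
print(url)
print(parsed)
print(sorted(explain))
```

The parsed query shows whether the pf clauses were generated at all, and the explain section shows how much each clause contributed to a document's score.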

-- 
Regards,
Binoy Dalal


Re: SOLR ranking

2016-02-15 Thread Modassar Ather
First it will search for "Eating Disorders" together and then the individual
words "Eating" and "Disorders"

I don't think the phrase will be searched as individual ANDed terms unless
the query is written like the one below.
"Eating Disorders" OR (Eating AND Disorders).

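A small Python sketch of how a client could build a query of that shape from user input. The function name is hypothetical, and it assumes plain whitespace-separated words with no Solr special characters to escape.

```python
def phrase_or_all_terms(user_input: str) -> str:
    """Build '"a b" OR (a AND b)' from whitespace-separated input."""
    terms = user_input.split()
    if len(terms) < 2:
        # A single word has no phrase form; return it unchanged.
        return user_input.strip()
    phrase = '"' + " ".join(terms) + '"'
    all_terms = "(" + " AND ".join(terms) + ")"
    return f"{phrase} OR {all_terms}"


if __name__ == "__main__":
    print(phrase_or_all_terms("Eating Disorders"))
```
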
Best,
Modassar

On Tue, Feb 16, 2016 at 10:48 AM, Nitin.K <nitin.kanu...@adi-mps.com> wrote:

> I am using edismax parser with the following query:
>
>
> localhost:8983/solr/tgl/select?q=eating%20disorders=xml=1.0=200=AND=true=edismax=true=true=true=topic_title%5E100+subtopic_title%5E40+index_term%5E20+drug%5E15+content%5E3=topTitle%5E200+subTopTitle%5E80+indTerm%5E40+drugString%5E30+content%5E6
>
> Configuration of schema.xml
>
>  />
> 
>
>  stored="true"/>
> 
>
>  multiValued="true"/>
>  multiValued="true"/>
>
>  multiValued="true"/>
>  multiValued="true"/>
>
> 
>
> 
> 
> 
> 
>
>  positionIncrementGap="100" omitNorms="true">
> 
> 
>  ignoreCase="true"
> words="stopwords.txt" />
> 
> 
> 
> 
>  ignoreCase="true"
> words="stopwords.txt" />
>  synonyms="synonyms.txt"
> ignoreCase="true" expand="true"/>
> 
> 
> 
>  omitTermFreqAndPositions="true" omitNorms="true">
> 
>  class="solr.WhitespaceTokenizerFactory"/>
>  ignoreCase="true"
> words="stopwords.txt" />
> 
> 
> 
>
>
> I want , if user will search for a phrase then that pharse should always
> takes the priority in comaprison to the individual words;
>
> Example: "Eating Disorders"
>
> First it will search for "Eating Disorders" together and then the
> individual
> words "Eating" and "Disorders"
> but while searching for individual words, it will always return those
> documents where both the words should exist for which i am already using
> q.op="AND" in my query.
>
> Thanks,
> Nitin
>
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/SOLR-ranking-tp4257367p4257510.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: SOLR ranking

2016-02-15 Thread Nitin.K
I am using edismax parser with the following query:

localhost:8983/solr/tgl/select?q=eating%20disorders=xml=1.0=200=AND=true=edismax=true=true=true=topic_title%5E100+subtopic_title%5E40+index_term%5E20+drug%5E15+content%5E3=topTitle%5E200+subTopTitle%5E80+indTerm%5E40+drugString%5E30+content%5E6

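Configuration of schema.xml

[The XML markup of the schema was stripped by the list archive; only attribute fragments survive, in the replies that quote this message.]
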
When a user searches for a phrase, I want that phrase to always take
priority over the individual words.

Example: "Eating Disorders"

First it should search for "Eating Disorders" together and then for the
individual words "Eating" and "Disorders".
When searching for the individual words, it should only return documents
in which both words exist, which is why I am already using
q.op="AND" in my query.

Thanks,
Nitin 




--
View this message in context: 
http://lucene.472066.n3.nabble.com/SOLR-ranking-tp4257367p4257510.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: SOLR ranking

2016-02-15 Thread Binoy Dalal
You'll have to provide more information.
How exactly do you want phrase search to work and how is it not working
properly?

On Tue, 16 Feb 2016, 00:08 Nitin.K <nitin.kanu...@adi-mps.com> wrote:

> Thanks Binoy..
>
> I have used the boost parameters and its working as expected.
> I also need to give the priority to the phrase search. Kindly suggest on
> this.
> I am using edismax parser right now.
> Using pf, pf2 and pf3 parameters but that too are not working properly.
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/SOLR-ranking-tp4257367p4257420.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
-- 
Regards,
Binoy Dalal


Re: SOLR ranking

2016-02-15 Thread Nitin.K
Thanks Binoy..

I have used the boost parameters and it's working as expected.
I also need to give priority to the phrase search. Kindly advise on
this.
I am using the edismax parser right now.
I am using the pf, pf2 and pf3 parameters, but those are not working properly either.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SOLR-ranking-tp4257367p4257420.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: SOLR ranking

2016-02-15 Thread Binoy Dalal
I'm sorry, missed that part. It's true, you cannot sort on multivalued
fields. The workaround will be pretty complex; you'll either have to find
the max or min value of the fields at index time and store those in
separate fields and use those to sort, or somehow come up with some
function that can convert the values from your multivalued field into a
single value (something like sum(field)) but it surely won't be trivial.

Instead you should do what Emir's saying.
Boost your fields at index or query time based on how you want to sort your
documents.
So in your case, give the highest boost to topic_title then a little lower
to subtopic_title and so on. This should return your documents in the
correct order.
You will have to play around with the boost values a little to get them
right, though.

Alternatively, you could boost on the multivalued fields and then sort
based on your single valued fields.

Either way, you'll have to experiment and see what works best for you.
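
As a rough sketch of the query-time boosting described here, the following Python builds an edismax qf value from a table of per-field boosts. The field names and boost values are the ones that appear later in this thread; the helper name is hypothetical, and the weights are a starting point to tune, not a recommendation.

```python
from typing import Dict

# Boosts taken from the query quoted later in this thread; tune as needed.
FIELD_BOOSTS: Dict[str, int] = {
    "topic_title": 100,
    "subtopic_title": 40,
    "index_term": 20,
    "drug": 15,
    "content": 3,
}


def build_qf(boosts: Dict[str, int]) -> str:
    """Return a qf string such as 'a^100 b^40', highest boost first."""
    ordered = sorted(boosts.items(), key=lambda item: item[1], reverse=True)
    return " ".join(f"{field}^{boost}" for field, boost in ordered)


if __name__ == "__main__":
    print(build_qf(FIELD_BOOSTS))
```

Boosting only biases the score; it does not guarantee a strict field-by-field ordering, which is why the values need experimenting.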

On Mon, Feb 15, 2016 at 8:21 PM Nitin.K <nitin.kanu...@adi-mps.com> wrote:

> Thanks Binoy..
>
> Actually it is throwing following error:
>
> can not sort on multivalued field: index_term
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/SOLR-ranking-tp4257367p4257378.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
-- 
Regards,
Binoy Dalal


Re: SOLR ranking

2016-02-15 Thread Nitin.K
Thanks Binoy..

Actually it is throwing following error:

can not sort on multivalued field: index_term



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SOLR-ranking-tp4257367p4257378.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: SOLR ranking

2016-02-15 Thread Emir Arnautovic

Hi,
Not sure how ordering will help (maybe I am missing the question), but it
seems to me that simple boosting would help in your case. See
https://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_make_.22superman.22_in_the_title_field_score_higher_than_in_the_subject_field 



Regards,
Emir

On 15.02.2016 14:14, Binoy Dalal wrote:

Use the sort parameter with your query and pass the fields in the order in
which you want to sort them.
So if you want topic > subtopic > index > drug > content all ascending,
your sort parameter will look like
sort=topic asc,subtopic asc,index asc,drug asc,content asc

On Mon, 15 Feb 2016, 18:17 Nitin.K <nitin.kanu...@adi-mps.com> wrote:


I have five fields in SOLR
topic_title
subtopic_title
index_terms - Multivalued
drug - Multivalued
content

- Now, I want to rank the documents with all these fields; I want all those
documents that are haivng the search term in topic_title will come first in
the order
then documents having search term in subtopic title and then so on.

Example : If two documents are having search term in topic_title then the
solr should look for subtopic_ title similarly
if the search term is present in both topic_title and subtopic_title fields
then it should look for index term and so on; to decide the ranking order

- I dont want to consider the no. of occurrences in multivalued fields but
if the two documents are having search term in topic_title, subtopic_title,
index_term and drug then the documents
should be ranked in the order of no. of occurrences inside the content
field.


Kindly help in this. I will be really thankful



--
View this message in context:
http://lucene.472066.n3.nabble.com/SOLR-ranking-tp4257367.html
Sent from the Solr - User mailing list archive at Nabble.com.



--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/



Re: SOLR ranking

2016-02-15 Thread Binoy Dalal
Use the sort parameter with your query and pass the fields in the order in
which you want to sort them.
So if you want topic > subtopic > index > drug > content all ascending,
your sort parameter will look like
sort=topic asc,subtopic asc,index asc,drug asc,content asc

On Mon, 15 Feb 2016, 18:17 Nitin.K <nitin.kanu...@adi-mps.com> wrote:

> I have five fields in SOLR
> topic_title
> subtopic_title
> index_terms - Multivalued
> drug - Multivalued
> content
>
> - Now, I want to rank the documents with all these fields; I want all those
> documents that are haivng the search term in topic_title will come first in
> the order
> then documents having search term in subtopic title and then so on.
>
> Example : If two documents are having search term in topic_title then the
> solr should look for subtopic_ title similarly
> if the search term is present in both topic_title and subtopic_title fields
> then it should look for index term and so on; to decide the ranking order
>
> - I dont want to consider the no. of occurrences in multivalued fields but
> if the two documents are having search term in topic_title, subtopic_title,
> index_term and drug then the documents
> should be ranked in the order of no. of occurrences inside the content
> field.
>
>
> Kindly help in this. I will be really thankful
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/SOLR-ranking-tp4257367.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
-- 
Regards,
Binoy Dalal


SOLR ranking

2016-02-15 Thread Nitin.K
I have five fields in SOLR
topic_title
subtopic_title
index_terms - Multivalued
drug - Multivalued
content

- Now, I want to rank the documents using all these fields. Documents that
have the search term in topic_title should come first in the order, then
documents that have the search term in subtopic_title, and so on.

Example: If two documents both have the search term in topic_title, then
Solr should look at subtopic_title. Similarly, if the search term is
present in both the topic_title and subtopic_title fields, then it should
look at index_term, and so on, to decide the ranking order.

- I don't want to consider the number of occurrences in the multivalued
fields, but if two documents both have the search term in topic_title,
subtopic_title, index_term and drug, then the documents should be ranked
by the number of occurrences inside the content field.


Kindly help with this. I will be really thankful.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SOLR-ranking-tp4257367.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr ranking query..

2014-02-04 Thread Varun Thacker
Hi Chris,

I think what you are looking for could be solved using the eDismax query
parser.
https://cwiki.apache.org/confluence/display/solr/The+Extended+DisMax+Query+Parser

1. Your Query Fields ( qf ) would be -  urlKeywords^60 title^40 fulltxt^1
2. To check on adultFlag:N you could use  fq=adultFlag:N
3. For Lowest Domain Rank within the same group to rank higher you could
use the boost parameter and use a recip (
http://wiki.apache.org/solr/FunctionQuery#recip ) function query to achieve
this.
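
The three points above can be sketched as one set of request parameters. This is only an illustration in Python: the helper name is hypothetical, and the boost expression is the example given in a follow-up reply in this thread, whose constants would need tuning.

```python
from urllib.parse import urlencode


def build_params(keywords: str) -> dict:
    """Collect the edismax parameters from points 1 to 3 for one search."""
    return {
        "q": keywords,
        "defType": "edismax",
        # Point 1: per-field weights.
        "qf": "urlKeywords^60 title^40 fulltxt^1",
        # Point 2: filter query, which does not affect scoring.
        "fq": "adultFlag:N",
        # Point 3: multiplicative boost that favours low domainRank values.
        "boost": "recip(field(domainRank),0.1,1,1)",
    }


if __name__ == "__main__":
    print(urlencode(build_params("north carolina hearings")))
```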

Hope this works for you


On Tue, Feb 4, 2014 at 12:19 PM, Chris christu...@gmail.com wrote:

 Hi,

 I have a document structure that looks like the below. I would like to
 implement something like -

 (urlKeywords:+keyword+ AND domainRank:[3 TO 1] AND adultFlag:N)^60 
 +
  OR (title:+keyword+ AND domainRank:[3 TO 1] AND adultFlag:N)^20  +
   OR (title:+keyword+ AND domainRank:[10001 TO *] AND adultFlag:N)^2  +
   OR (fulltxt:+keyword+) );


 In case we have multiple words in keywords - A B C D then for the
 documents that have all the words should rank highest (Group1), then 3
 words(Group2), then 2 words(Group 3) etc
 AND - Within each group (Group1, 2, 3) I would want the ones with the
 lowest domain rank value to rank higher (but within the group)

 How can i do this in a single query? and please advice on the fastest way
 possible,
 (open to implementing fq  other techniques to speed it up)

 Please advice.


 Document Structure in XML -

  doc
 str name=subDomainwww/str
 str name=domainncoah.com/str
 str name=path/links.html/str
 str name=urlFullhttp://www.ncoah.com/links.html/str
 str name=titleNorth Carolina Office of Administrative Hearings
 - Links/str
 arr name=text
   strNorth Carolina Office of Administrative Hearings - Links/str
 /arr
 str name=relatedLinks - a
 href=http://www.ncoah.com/links.html;  title=HearingsHearings/a
 - a href=http://www.ncoah.com/links.html;  title=RulesRules/a -
 a href=http://www.ncoah.com/links.html;  title=Civil RightsCivil
 Rights/a - a href=http://www.ncoah.com/links.html;
 title=WelcomeWelcome/a - a
 href=http://www.ncoah.com/links.html;  title=General
 InformationGeneral Information/a - a
 href=http://www.ncoah.com/links.html;  title=Directions to
 OAHDirections to OAH/a - a href=http://www.ncoah.com/links.html;
  title=Establishment of OAHEstablishment of OAH/a - a
 href=http://www.ncoah.com/links.html;  title=G.S. 150BG.S.
 150B/a - a href=http://www.ncoah.com/links.html;
 title=FormsForms/a - a href=http://www.ncoah.com/links.html;
 title=LinksLinks/a - a href=http://www.nc.gov/;  title=Visit
 the North Carolina State web portalVisit the North Carolina State
 web portal/a - a
 href=http://ncinfo.iog.unc.edu/library/counties.html;  title=North
 Carolina CountiesNorth Carolina Counties/a - a
 href=http://ncinfo.iog.unc.edu/library/cities.html;  title=North
 Carolina Cities  TownsNorth Carolina Cities  Towns/a - a
 href=http://www.nccourts.org/;  title=Administrative Office of the
 CourtsAdministrative Office of the Courts/a - a
 href=http://www.ncleg.net/;  title=North Carolina General
 AssemblyNorth Carolina General Assembly/a - a
 href=http://www.doa.state.nc.us/;  title=Department of
 AdministrationDepartment of Administration/a - a
 href=http://www.ncagr.com/;  title=Department of
 AgricultureDepartment of Agriculture/a - a
 href=http://www.nccommerce.com;  title=Department of
 CommerceDepartment of Commerce/a - a
 href=http://www.doc.state.nc.us/;  title=Department of
 CorrectionDepartment of Correction/a - a
 href=http://www.nccrimecontrol.org/;  title=Department of Crime
 Control  Public SafetyDepartment of Crime Control  Public
 Safety/a - a href=http://www.ncdcr.gov/;  title=Department of
 Cultural ResourcesDepartment of Cultural Resources/a - a
 href=http://www.ncdenr.gov/;  title=Department of Environment and
 Natural ResourcesDepartment of Environment and Natural Resources/a
 - a href=http://www.dhhs.state.nc.us;  title=Department of Health
 and Human ServicesDepartment of Health and Human Services/a - a
 href=http://www.ncdoi.com/;  title=Department of
 InsuranceDepartment of Insurance/a - a
 href=http://www.ncdoj.com/;  title=Department of JusticeDepartment
 of Justice/a - a href=http://www.juvjus.state.nc.us/;
 title=Department of Juvenile Justice and Delinquency
 PreventionDepartment of Juvenile Justice and Delinquency
 Prevention/a - a href=http://www.nclabor.com/;  title=Department
 of LaborDepartment of Labor/a - a
 href=http://www.dpi.state.nc.us/;  title=Department of Public
 InstructionDepartment of Public Instruction/a - a
 href=http://www.dor.state.nc.us/;  title=Department of
 RevenueDepartment of Revenue/a - a
 href=http://www.treasurer.state.nc.us/;  title=Department of State
 TreasurerDepartment of State Treasurer/a - a
 href=http://www.ncdot.org/;  title=Department of
 TransportationDepartment of Transportation/a - a
 href=http://www.secstate.state.nc.us/;  title=Department of the
 

Re: Solr ranking query..

2014-02-04 Thread Chris
Dear Varun,

Thank you for your replies. I managed to get points 1 & 2 done, but I am
unable to figure out the boost query. Could you be kind enough to
point me to an example or maybe advise a bit more on that one?

Thanks for your help,
Chris


On Tue, Feb 4, 2014 at 3:14 PM, Varun Thacker varunthacker1...@gmail.com wrote:

 Hi Chris,

 I think what you are looking for could be solved using the eDismax query
 parser.

 https://cwiki.apache.org/confluence/display/solr/The+Extended+DisMax+Query+Parser

 1. Your Query Fields ( qf ) would be -  urlKeywords^60 title^40 fulltxt^1
 2. To check on adultFlag:N you could use  fq=adultFlag:N
 3. For Lowest Domain Rank within the same group to rank higher you could
 use the boost parameter and use a recip (
 http://wiki.apache.org/solr/FunctionQuery#recip ) function query to
 achieve
 this.

 Hope this works for you


 On Tue, Feb 4, 2014 at 12:19 PM, Chris christu...@gmail.com wrote:

  Hi,
 
  I have a document structure that looks like the below. I would like to
  implement something like -
 
  (urlKeywords:+keyword+ AND domainRank:[3 TO 1] AND adultFlag:N)^60
 
  +
   OR (title:+keyword+ AND domainRank:[3 TO 1] AND adultFlag:N)^20
  +
OR (title:+keyword+ AND domainRank:[10001 TO *] AND adultFlag:N)^2
  +
OR (fulltxt:+keyword+) );
 
 
  In case we have multiple words in keywords - A B C D then for the
  documents that have all the words should rank highest (Group1), then 3
  words(Group2), then 2 words(Group 3) etc
  AND - Within each group (Group1, 2, 3) I would want the ones with the
  lowest domain rank value to rank higher (but within the group)
 
  How can i do this in a single query? and please advice on the fastest way
  possible,
  (open to implementing fq  other techniques to speed it up)
 
  Please advice.
 
 
  Document Structure in XML -
 
   doc
  str name=subDomainwww/str
  str name=domainncoah.com/str
  str name=path/links.html/str
  str name=urlFullhttp://www.ncoah.com/links.html/str
  str name=titleNorth Carolina Office of Administrative Hearings
  - Links/str
  arr name=text
strNorth Carolina Office of Administrative Hearings - Links/str
  /arr
  str name=relatedLinks - a
  href=http://www.ncoah.com/links.html;  title=HearingsHearings/a
  - a href=http://www.ncoah.com/links.html;  title=RulesRules/a -
  a href=http://www.ncoah.com/links.html;  title=Civil RightsCivil
  Rights/a - a href=http://www.ncoah.com/links.html;
  title=WelcomeWelcome/a - a
  href=http://www.ncoah.com/links.html;  title=General
  InformationGeneral Information/a - a
  href=http://www.ncoah.com/links.html;  title=Directions to
  OAHDirections to OAH/a - a href=http://www.ncoah.com/links.html;
   title=Establishment of OAHEstablishment of OAH/a - a
  href=http://www.ncoah.com/links.html;  title=G.S. 150BG.S.
  150B/a - a href=http://www.ncoah.com/links.html;
  title=FormsForms/a - a href=http://www.ncoah.com/links.html;
  title=LinksLinks/a - a href=http://www.nc.gov/;  title=Visit
  the North Carolina State web portalVisit the North Carolina State
  web portal/a - a
  href=http://ncinfo.iog.unc.edu/library/counties.html;  title=North
  Carolina CountiesNorth Carolina Counties/a - a
  href=http://ncinfo.iog.unc.edu/library/cities.html;  title=North
  Carolina Cities  TownsNorth Carolina Cities  Towns/a - a
  href=http://www.nccourts.org/;  title=Administrative Office of the
  CourtsAdministrative Office of the Courts/a - a
  href=http://www.ncleg.net/;  title=North Carolina General
  AssemblyNorth Carolina General Assembly/a - a
  href=http://www.doa.state.nc.us/;  title=Department of
  AdministrationDepartment of Administration/a - a
  href=http://www.ncagr.com/;  title=Department of
  AgricultureDepartment of Agriculture/a - a
  href=http://www.nccommerce.com;  title=Department of
  CommerceDepartment of Commerce/a - a
  href=http://www.doc.state.nc.us/;  title=Department of
  CorrectionDepartment of Correction/a - a
  href=http://www.nccrimecontrol.org/;  title=Department of Crime
  Control  Public SafetyDepartment of Crime Control  Public
  Safety/a - a href=http://www.ncdcr.gov/;  title=Department of
  Cultural ResourcesDepartment of Cultural Resources/a - a
  href=http://www.ncdenr.gov/;  title=Department of Environment and
  Natural ResourcesDepartment of Environment and Natural Resources/a
  - a href=http://www.dhhs.state.nc.us;  title=Department of Health
  and Human ServicesDepartment of Health and Human Services/a - a
  href=http://www.ncdoi.com/;  title=Department of
  InsuranceDepartment of Insurance/a - a
  href=http://www.ncdoj.com/;  title=Department of JusticeDepartment
  of Justice/a - a href=http://www.juvjus.state.nc.us/;
  title=Department of Juvenile Justice and Delinquency
  PreventionDepartment of Juvenile Justice and Delinquency
  Prevention/a - a href=http://www.nclabor.com/;  title=Department
  of LaborDepartment of Labor/a - a
  href=http://www.dpi.state.nc.us/;  title=Department of 

Re: Solr ranking query..

2014-02-04 Thread Varun Thacker
Hi Chris,

An example for point 3 could be -
boost=recip(field(domainRank),0.1,1,1)

http://wiki.apache.org/solr/FunctionQuery#recip
recip(x,m,a,b) implementing a/(m*x+b). m,a,b are constants, x is any
numeric field or arbitrarily complex function.

So with these values, when domainRank is 1 the score is multiplied by about
0.91 (1/1.1), when domainRank is 10 it is multiplied by 0.5 (1/2), and so
on: the lower the domainRank, the larger the multiplier.

You could choose better values of a,b and m to suit your data
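
To try out candidate constants before putting them in a query, the formula a/(m*x+b) can be evaluated directly. The Python below is a plain re-implementation of that formula for experimentation, not Solr code, and the sample ranks are made up.

```python
def recip(x: float, m: float, a: float, b: float) -> float:
    """Evaluate a / (m * x + b), the formula behind Solr's recip()."""
    return a / (m * x + b)


if __name__ == "__main__":
    # Multipliers for boost=recip(field(domainRank),0.1,1,1).
    for rank in (1, 10, 100, 10000):
        print(rank, recip(rank, 0.1, 1, 1))
    # With m=1, a=10, b=0 the multiplier is simply 10 / domainRank.
    for rank in (1, 10):
        print(rank, recip(rank, 1, 10, 0))
```

Note that b=0 only works if domainRank is never 0, since the denominator would then be zero.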


On Tue, Feb 4, 2014 at 9:04 PM, Chris christu...@gmail.com wrote:

 Dear Varun,

 Thank you for your replies, I managed to get point 1  2 done, but for the
 boost query, I am unable to figure it out. Could you be kind enough to
 point me to an example or maybe advice a bit more on that one?

 Thanks for your help,
 Chris


 On Tue, Feb 4, 2014 at 3:14 PM, Varun Thacker varunthacker1...@gmail.com
 wrote:

  Hi Chris,
 
  I think what you are looking for could be solved using the eDismax query
  parser.
 
 
 https://cwiki.apache.org/confluence/display/solr/The+Extended+DisMax+Query+Parser
 
  1. Your Query Fields ( qf ) would be -  urlKeywords^60 title^40
 fulltxt^1
  2. To check on adultFlag:N you could use  fq=adultFlag:N
  3. For Lowest Domain Rank within the same group to rank higher you could
  use the boost parameter and use a recip (
  http://wiki.apache.org/solr/FunctionQuery#recip ) function query to
  achieve
  this.
 
  Hope this works for you
 
 
  On Tue, Feb 4, 2014 at 12:19 PM, Chris christu...@gmail.com wrote:
 
   Hi,
  
   I have a document structure that looks like the below. I would like to
   implement something like -
  
   (urlKeywords:+keyword+ AND domainRank:[3 TO 1] AND
 adultFlag:N)^60
  
   +
OR (title:+keyword+ AND domainRank:[3 TO 1] AND adultFlag:N)^20
   +
 OR (title:+keyword+ AND domainRank:[10001 TO *] AND adultFlag:N)^2
   +
 OR (fulltxt:+keyword+) );
  
  
   In case we have multiple words in keywords - A B C D then for the
   documents that have all the words should rank highest (Group1), then 3
   words(Group2), then 2 words(Group 3) etc
   AND - Within each group (Group1, 2, 3) I would want the ones with the
   lowest domain rank value to rank higher (but within the group)
  
   How can i do this in a single query? and please advice on the fastest
 way
   possible,
   (open to implementing fq  other techniques to speed it up)
  
   Please advice.
  
  
   Document Structure in XML -
  
doc
   str name=subDomainwww/str
   str name=domainncoah.com/str
   str name=path/links.html/str
   str name=urlFullhttp://www.ncoah.com/links.html/str
   str name=titleNorth Carolina Office of Administrative Hearings
   - Links/str
   arr name=text
 strNorth Carolina Office of Administrative Hearings -
 Links/str
   /arr
   str name=relatedLinks - a
   href=http://www.ncoah.com/links.html;  title=HearingsHearings/a
   - a href=http://www.ncoah.com/links.html;  title=RulesRules/a -
   a href=http://www.ncoah.com/links.html;  title=Civil RightsCivil
   Rights/a - a href=http://www.ncoah.com/links.html;
   title=WelcomeWelcome/a - a
   href=http://www.ncoah.com/links.html;  title=General
   InformationGeneral Information/a - a
   href=http://www.ncoah.com/links.html;  title=Directions to
   OAHDirections to OAH/a - a href=http://www.ncoah.com/links.html;
title=Establishment of OAHEstablishment of OAH/a - a
   href=http://www.ncoah.com/links.html;  title=G.S. 150BG.S.
   150B/a - a href=http://www.ncoah.com/links.html;
   title=FormsForms/a - a href=http://www.ncoah.com/links.html;
   title=LinksLinks/a - a href=http://www.nc.gov/;  title=Visit
   the North Carolina State web portalVisit the North Carolina State
   web portal/a - a
   href=http://ncinfo.iog.unc.edu/library/counties.html;  title=North
   Carolina CountiesNorth Carolina Counties/a - a
   href=http://ncinfo.iog.unc.edu/library/cities.html;  title=North
   Carolina Cities  TownsNorth Carolina Cities  Towns/a - a
   href=http://www.nccourts.org/;  title=Administrative Office of the
   CourtsAdministrative Office of the Courts/a - a
   href=http://www.ncleg.net/;  title=North Carolina General
   AssemblyNorth Carolina General Assembly/a - a
   href=http://www.doa.state.nc.us/;  title=Department of
   AdministrationDepartment of Administration/a - a
   href=http://www.ncagr.com/;  title=Department of
   AgricultureDepartment of Agriculture/a - a
   href=http://www.nccommerce.com;  title=Department of
   CommerceDepartment of Commerce/a - a
   href=http://www.doc.state.nc.us/;  title=Department of
   CorrectionDepartment of Correction/a - a
   href=http://www.nccrimecontrol.org/;  title=Department of Crime
   Control  Public SafetyDepartment of Crime Control  Public
   Safety/a - a href=http://www.ncdcr.gov/;  title=Department of
   Cultural ResourcesDepartment of Cultural Resources/a - a
   href=http://www.ncdenr.gov/;  title=Department of Environment and
   Natural ResourcesDepartment of Environment and Natural Resources/a
   - a 

Solr ranking query..

2014-02-03 Thread Chris
Hi,

I have a document structure that looks like the below. I would like to
implement something like -

(urlKeywords:+keyword+ AND domainRank:[3 TO 1] AND adultFlag:N)^60  +
 OR (title:+keyword+ AND domainRank:[3 TO 1] AND adultFlag:N)^20  +
  OR (title:+keyword+ AND domainRank:[10001 TO *] AND adultFlag:N)^2  +
  OR (fulltxt:+keyword+) );


In case we have multiple words in keywords (A B C D), the documents that
have all the words should rank highest (Group 1), then those with 3
words (Group 2), then 2 words (Group 3), etc.
AND - Within each group (Group 1, 2, 3) I would want the ones with the
lowest domain rank value to rank higher (but within the group).

How can I do this in a single query? Please advise on the fastest way
possible
(open to implementing fq & other techniques to speed it up).

Please advise.


Document Structure in XML -

 doc
<str name="subDomain">www</str>
<str name="domain">ncoah.com</str>
<str name="path">/links.html</str>
<str name="urlFull">http://www.ncoah.com/links.html</str>
<str name="title">North Carolina Office of Administrative Hearings - Links</str>
<arr name="text">
  <str>North Carolina Office of Administrative Hearings - Links</str>
</arr>
str name=relatedLinks - a
href=http://www.ncoah.com/links.html;  title=HearingsHearings/a
- a href=http://www.ncoah.com/links.html;  title=RulesRules/a -
a href=http://www.ncoah.com/links.html;  title=Civil RightsCivil
Rights/a - a href=http://www.ncoah.com/links.html;
title=WelcomeWelcome/a - a
href=http://www.ncoah.com/links.html;  title=General
InformationGeneral Information/a - a
href=http://www.ncoah.com/links.html;  title=Directions to
OAHDirections to OAH/a - a href=http://www.ncoah.com/links.html;
 title=Establishment of OAHEstablishment of OAH/a - a
href=http://www.ncoah.com/links.html;  title=G.S. 150BG.S.
150B/a - a href=http://www.ncoah.com/links.html;
title=FormsForms/a - a href=http://www.ncoah.com/links.html;
title=LinksLinks/a - a href=http://www.nc.gov/;  title=Visit
the North Carolina State web portalVisit the North Carolina State
web portal/a - a
href=http://ncinfo.iog.unc.edu/library/counties.html;  title=North
Carolina CountiesNorth Carolina Counties/a - a
href=http://ncinfo.iog.unc.edu/library/cities.html;  title=North
Carolina Cities  TownsNorth Carolina Cities  Towns/a - a
href=http://www.nccourts.org/;  title=Administrative Office of the
CourtsAdministrative Office of the Courts/a - a
href=http://www.ncleg.net/;  title=North Carolina General
AssemblyNorth Carolina General Assembly/a - a
href=http://www.doa.state.nc.us/;  title=Department of
AdministrationDepartment of Administration/a - a
href=http://www.ncagr.com/;  title=Department of
AgricultureDepartment of Agriculture/a - a
href=http://www.nccommerce.com;  title=Department of
CommerceDepartment of Commerce/a - a
href=http://www.doc.state.nc.us/;  title=Department of
CorrectionDepartment of Correction/a - a
href=http://www.nccrimecontrol.org/;  title=Department of Crime
Control  Public SafetyDepartment of Crime Control  Public
Safety/a - a href=http://www.ncdcr.gov/;  title=Department of
Cultural ResourcesDepartment of Cultural Resources/a - a
href=http://www.ncdenr.gov/;  title=Department of Environment and
Natural ResourcesDepartment of Environment and Natural Resources/a
- a href=http://www.dhhs.state.nc.us;  title=Department of Health
and Human ServicesDepartment of Health and Human Services/a - a
href=http://www.ncdoi.com/;  title=Department of
InsuranceDepartment of Insurance/a - a
href=http://www.ncdoj.com/;  title=Department of JusticeDepartment
of Justice/a - a href=http://www.juvjus.state.nc.us/;
title=Department of Juvenile Justice and Delinquency
PreventionDepartment of Juvenile Justice and Delinquency
Prevention/a - a href=http://www.nclabor.com/;  title=Department
of LaborDepartment of Labor/a - a
href=http://www.dpi.state.nc.us/;  title=Department of Public
InstructionDepartment of Public Instruction/a - a
href=http://www.dor.state.nc.us/;  title=Department of
RevenueDepartment of Revenue/a - a
href=http://www.treasurer.state.nc.us/;  title=Department of State
TreasurerDepartment of State Treasurer/a - a
href=http://www.ncdot.org/;  title=Department of
TransportationDepartment of Transportation/a - a
href=http://www.secstate.state.nc.us/;  title=Department of the
Secretary of StateDepartment of the Secretary of State/a - a
href=http://www.osp.state.nc.us/;  title=Office of State
PersonnelOffice of State Personnel/a - a
href=http://www.governor.state.nc.us/;  title=Office of the
GovernorOffice of the Governor/a - a
href=http://www.ltgov.state.nc.us/;  title=Office of the Lt.
GovernorOffice of the Lt. Governor/a - a
href=http://www.ncauditor.net/;  title=Office of the State
AuditorOffice of the State Auditor/a - a
href=http://www.osc.nc.gov/;  title=Office of the State
ControllerOffice of the State Controller/a - a
href=http://www.ncbar.org/;  title=North Carolina Bar
AssociationNorth Carolina Bar Association/a - a

Re: Confused by Solr Ranking

2010-03-09 Thread Avi Rosenschein


  I kind of suspected stemming to be the reason behind this.
  But I consider stemming to be a good feature.

 This is the side effect of stemming. Stemming increases recall while
 harming precision.


This is a side effect of stemming, the way it is currently implemented in
Lucene. Stemming could theoretically increase recall without hurting
precision or relevancy. One way to do this would be to always store the
original token, along with the stemmed token. Then, at scoring time, give a
boost to matches which are closer to the original form.
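
As a rough illustration of that idea, here is a toy sketch (not Lucene code). The suffix stripper is a hypothetical stand-in for a real stemmer, the second document is made up for the example, and the `exact_boost` value is arbitrary. Every stem match scores 1.0, and a match on the original token gets the extra boost.

```python
def toy_stem(word):
    """Hypothetical stand-in for a real stemmer: strips a few suffixes."""
    for suffix in ("ion", "ing", "ed"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def score(query, text, exact_boost=2.0):
    """Add 1.0 per stem match, plus exact_boost when the original token matches."""
    query = query.lower()
    query_stem = toy_stem(query)
    total = 0.0
    for token in text.lower().split():
        if toy_stem(token) == query_stem:
            total += 1.0
            if token == query:
                total += exact_boost
    return total


docs = [
    "bare impression villainy pronounce beasts",  # shortened from the thread
    "they impress the court",                     # invented sample text
]
# Both documents match the stem, but the exact form ranks first.
ranked = sorted(docs, key=lambda d: score("impress", d), reverse=True)
```

With this scoring, recall is unchanged (both documents still match), while the document containing the literal query word sorts ahead of the one that only matches after stemming.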

-- Avi


Re: Confused by Solr Ranking

2010-03-09 Thread Michael Lackhoff
On 09.03.2010 16:01 Ahmet Arslan wrote:

 
 I kind of suspected stemming to be the reason behind this.
 But I consider stemming to be a good feature.
 
 This is the side effect of stemming. Stemming increases recall while harming 
 precision.

But most people want the best possible combination of both, something like:
(raw_field:word OR stemmed_field:word^0.5)
and it is nice that Solr allows such arrangements, but it would be even
nicer to have some sort of automatic "take this field, transform the
contents in a couple of ways and do some boosting in the order given".
At least this would be my answer to the recent question about the one
feature I would like to see.
Or even better, allow not only a hierarchy of transformations but also a
hierarchy of fields (like in dismax, but with the full power of the
standard request handler)
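
A minimal sketch of building such a query on the client side. The field names `text_exact` and `text` are assumptions (an unstemmed copy and the stemmed field), and the helper does no escaping of query syntax characters, so it is only suitable for plain single words.

```python
def exact_first_query(word, raw_field="text_exact", stemmed_field="text",
                      stemmed_boost=0.5):
    """Build a query that matches both fields but favours the unstemmed one.

    raw_field and stemmed_field are hypothetical names; use whatever the
    schema actually defines.
    """
    return f"({raw_field}:{word} OR {stemmed_field}:{word}^{stemmed_boost})"


# Request parameters for a standard select call; fl=*,score returns the score
# alongside the stored fields so the ordering can be checked.
params = {"q": exact_first_query("impress"), "fl": "*,score"}
```

The same shape extends to more than two fields: one clause per transformation, with boosts decreasing in the order given.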

-Michael



Re: Confused by Solr Ranking

2010-03-09 Thread Erick Erickson
Well, that's a matter of opinion, isn't it? If *your* application
requires this, you could always copy the field to a non-stemmed
field and apply boosts...

Erick

On Tue, Mar 9, 2010 at 9:21 AM, abhishes abhis...@gmail.com wrote:


 I kind of suspected stemming to be the reason behind this. But I consider
 stemming to be a good feature.

 The point is that if an exact match exists, Solr should report it first,
 followed by the stemmed results.

 Disabling stemming altogether would be a step in the wrong direction.



 Shalin Shekhar Mangar wrote:
 
  On Tue, Mar 9, 2010 at 4:38 PM, abhishes abhis...@gmail.com wrote:
 
 
  I am indexing a column in a database. I have chosen field type of text
  for
  this column (this type was defined in the sample schema file which comes
  in
  the Solr Example).
 
  When I search for the word impress and look at the top 3 results, I get
  these 3 documents:
 
  <str name="TEXT">bare desire pronounce villainy draught beasts blockish
  impression acquit</str>
  <str name="TEXT">bare impression villainy pronounce beasts desire blockish
  draught acquit</str>
  <str name="TEXT">beasts desire villainy pronounce bare acquit impression
  draught blockish</str>
 
  But here the TEXT doesn't really contain the word impress; it contains the
  word impression.
 
  Now the database does contain a few rows where the word impress is there,
  but those rows do not come in the top 3 results.
 
  So my question is: why did the rows containing the word impression get
  ranked higher than the rows containing the word impress when I searched
  for impress?
 
 
  The text type is configured to do stemming on the input. So I'm
 guessing
  that impression and impress both stem to the same form. You can
 remove
  the EnglishPorterFilterFactory from the text type if you don't need
  stemming.
 
  --
  Regards,
  Shalin Shekhar Mangar.
 
 

 --
 View this message in context:
 http://old.nabble.com/Confused-by-Solr-Ranking-tp27834227p27836299.html
 Sent from the Solr - User mailing list archive at Nabble.com.




Re: Confused by Solr Ranking

2010-03-09 Thread abhishes

I kind of suspected stemming to be the reason behind this. But I consider
stemming to be a good feature.

The point is that if an exact match exists, Solr should report it first,
followed by the stemmed results.

Disabling stemming altogether would be a step in the wrong direction.



Shalin Shekhar Mangar wrote:
 
 On Tue, Mar 9, 2010 at 4:38 PM, abhishes abhis...@gmail.com wrote:
 

 I am indexing a column in a database. I have chosen field type of text
 for
 this column (this type was defined in the sample schema file which comes
 in
 the Solr Example).

 When I search for the word impress and look at the top 3 results, I get
 these 3 documents:

 <str name="TEXT">bare desire pronounce villainy draught beasts blockish
 impression acquit</str>
 <str name="TEXT">bare impression villainy pronounce beasts desire blockish
 draught acquit</str>
 <str name="TEXT">beasts desire villainy pronounce bare acquit impression
 draught blockish</str>

 But here the TEXT doesn't really contain the word impress; it contains the
 word impression.

 Now the database does contain a few rows where the word impress is there,
 but those rows do not come in the top 3 results.

 So my question is: why did the rows containing the word impression get
 ranked higher than the rows containing the word impress when I searched
 for impress?


 The text type is configured to do stemming on the input. So I'm guessing
 that impression and impress both stem to the same form. You can remove
 the EnglishPorterFilterFactory from the text type if you don't need
 stemming.
 
 -- 
 Regards,
 Shalin Shekhar Mangar.
 
 

-- 
View this message in context: 
http://old.nabble.com/Confused-by-Solr-Ranking-tp27834227p27836299.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Confused by Solr Ranking

2010-03-09 Thread Ahmet Arslan


 I kind of suspected stemming to be the reason behind this.
 But I consider stemming to be a good feature.

This is the side effect of stemming. Stemming increases recall while harming 
precision.


  


Re: Confused by Solr Ranking

2010-03-09 Thread Shalin Shekhar Mangar
On Tue, Mar 9, 2010 at 4:38 PM, abhishes abhis...@gmail.com wrote:


 I am indexing a column in a database. I have chosen field type of text for
 this column (this type was defined in the sample schema file which comes in
 the Solr Example).

 When I search for the word impress and look at the top 3 results, I get
 these 3 documents:

 <str name="TEXT">bare desire pronounce villainy draught beasts blockish
 impression acquit</str>
 <str name="TEXT">bare impression villainy pronounce beasts desire blockish
 draught acquit</str>
 <str name="TEXT">beasts desire villainy pronounce bare acquit impression
 draught blockish</str>

 But here the TEXT doesn't really contain the word impress; it contains the
 word impression.

 Now the database does contain a few rows where the word impress is there,
 but those rows do not come in the top 3 results.

 So my question is: why did the rows containing the word impression get
 ranked higher than the rows containing the word impress when I searched
 for impress?


The text type is configured to do stemming on the input. So I'm guessing
that impression and impress both stem to the same form. You can remove
the EnglishPorterFilterFactory from the text type if you don't need
stemming.

-- 
Regards,
Shalin Shekhar Mangar.


Re: Question regarding Solr ranking

2008-03-02 Thread Chris Hostetter
: I am not really clear to what the analysis mode is supposed to give me. It
: requires me to specify a field when I specify a query. What does that do?
: Also, I don't see anything in the analyzer to explain the weighting of a
: particular document.

I think what Otis meant is that the analysis tool would help you verify 
that your Analyzers are doing what you expect them to be doing.

If you try that with your locRvwText and the text you are asking about you 
would see that RemoveDuplicatesTokenFilterFactory does not make it the 
same as a single instance of Pizza ... per the docs...

Filters out any tokens which are at the same logical position 
in the tokenstream as a previous token with the same text. ...

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#head-b05ef0377d71df53b47b9dd9cc28c26d95097a0b

so it isn't removing any tokens in your situation because they do not 
exist at the same logical position.
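
A simplified simulation of that behaviour (plain Python, not the actual Lucene implementation). It assumes the analysis chain turns each "Pizza..." in the review into a token `pizza` at its own position, 21 tokens in all, and that a position increment of 0 marks a token stacked on the previous one, as a synonym or word-delimiter variant would be.

```python
def remove_duplicates(tokens):
    """Drop a token only if the same text was already seen at the same position.

    tokens is a list of (text, position_increment) pairs.
    """
    kept, seen_here = [], set()
    for text, pos_inc in tokens:
        if pos_inc > 0:
            seen_here = set()      # moved on to a new position
        if text in seen_here:
            continue               # same text, same position: duplicate
        seen_here.add(text)
        kept.append((text, pos_inc))
    return kept


def phrase_freq(tokens, phrase):
    """Count exact (slop 0) occurrences of phrase, given as a list of terms."""
    positions, pos = {}, -1
    for text, pos_inc in tokens:
        pos += pos_inc
        positions.setdefault(pos, set()).add(text)
    return sum(
        all(term in positions.get(start + i, ()) for i, term in enumerate(phrase))
        for start in positions
    )


review = [("pizza", 1)] * 21            # 21 tokens, each at its own position
stacked = [("pizza", 1), ("pizza", 0)]  # two tokens at the same position

after_filter = remove_duplicates(review)
pairs = phrase_freq(after_filter, ["pizza", "pizza"])
```

Under these assumptions the filter keeps all 21 tokens, and 21 consecutive `pizza` tokens contain 20 adjacent pairs, which is consistent with the `phraseFreq=20.0` in the explain output. Only the `stacked` case is collapsed to a single token.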



-Hoss



Re: Question regarding Solr ranking

2008-02-29 Thread oleg_gnatovskiy


Otis Gospodnetic wrote:
 
 It's a little hard to read that message, but if I were you I'd go to the
 Solr admin page, analysis section, enter your query, and see what index
 and query time analyzers spit out.  I think that should at least give you
 some hints.
 
 Otis 
 
 --
 Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
 
I am not really clear on what the analysis mode is supposed to give me. It
requires me to specify a field when I specify a query. What does that do?
Also, I don't see anything in the analyzer to explain the weighting of a
particular document.

Regardless, what I have narrowed it down to is my locRvwText field (defined
as a multivalued text field), which has a value that looks like this:
Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza...
Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza...
Pizza... Pizza... Pizza... Pizza... Pizza...
Solr is counting this as 20 hits, but I was under the impression that the
RemoveDuplicatesTokenFilterFactory should filter this so that it counts as
just 1 hit. Am I misunderstanding what RemoveDuplicatesTokenFilterFactory
does?
-- 
View this message in context: 
http://www.nabble.com/Question-regarding-Solr-ranking-tp15719752p15768743.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Question regarding Solr ranking

2008-02-28 Thread Otis Gospodnetic
It's a little hard to read that message, but if I were you I'd go to the Solr 
admin page, analysis section, enter your query, and see what index and query 
time analyzers spit out.  I think that should at least give you some hints.

Otis 

--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message -----
 From: oleg_gnatovskiy [EMAIL PROTECTED]
 To: solr-user@lucene.apache.org
 Sent: Wednesday, February 27, 2008 1:33:44 PM
 Subject: Re: Question regarding Solr ranking
 
 
 Sorry about the previous message, I had some formatting issues. Below is the
 actual message!
 
 oleg_gnatovskiy wrote:
  
  Hello everyone.
  
  I've run into a weird problem with Solr's ranking engine. In a nutshell,
  the problem involves certain results getting EXTREMELY high rank scores.
  Here is an example:
  
  locRvwText:"Pizza Pizza"^10 OR locName:"Pizza Pizza"^30
  
  The way I understand it is that the locName part of the query should be
  boosted 3x more than the locRvwText.
  However, when running this query the first result is:
  
  <float name="score">10.8226</float>
  <str name="locName">Johnnie's New York Pizzeria</str>
  <arr name="locRvwText">
  <str>
  Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza...
  Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza...
  Pizza... Pizza... Pizza... Pizza... Pizza...
  </str>
  </arr>
  <lst name="explain">
  <str name="id=157789,internal_docid=3792465">
  10.8226 = (MATCH) product of:
    21.6452 = (MATCH) sum of:
      21.6452 = weight(locRvwText:pizza pizza^10.0 in 3792465), product of:
        0.3354544 = queryWeight(locRvwText:pizza pizza^10.0), product of:
          10.0 = boost
          14.428232 = idf(locRvwText: pizza=8156 pizza=8156)
          0.0023249863 = queryNorm
        64.52502 = fieldWeight(locRvwText:pizza pizza in 3792465), product of:
          4.472136 = tf(phraseFreq=20.0)
          14.428232 = idf(locRvwText: pizza=8156 pizza=8156)
          1.0 = fieldNorm(field=locRvwText, doc=3792465)
    0.5 = coord(1/2)
  </str>
  </lst>
  
  
  
  
  How come the phrase frequency for rvwText comes back as 20? The field
  rvwText is defined in the following way:
  
  
  <field name="locRvwText" type="text" index="false" stored="true"
  required="false" multiValued="true" omitNorms="true"/>
  
  And my text fields are defined in the following way:
  
  

  
  
  
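  <fieldtype name="text" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <!-- in this example, we will only use synonyms at query time -->
      <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
              ignoreCase="true" expand="true"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true"
              words="stopwords.txt"/>
      <filter class="solr.WordDelimiterFilterFactory"
              generateWordParts="1" generateNumberParts="1" catenateWords="1"
              catenateNumbers="1" catenateAll="0"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.EnglishPorterFilterFactory"
              protected="protwords.txt"/>
      <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
              ignoreCase="true" expand="true"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true"
              words="stopwords.txt"/>
      <filter class="solr.WordDelimiterFilterFactory"
              generateWordParts="1" generateNumberParts="1" catenateWords="0"
              catenateNumbers="0" catenateAll="0"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.EnglishPorterFilterFactory"
              protected="protwords.txt"/>
      <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    </analyzer>
  </fieldtype>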
  

  
  
  Forgive me if I am wrong, but shouldn't the
  RemoveDuplicatesTokenFilterFactory have the string Pizza... Pizza...
  Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza...
  Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza...
  Pizza... Pizza... Pizza... count as simply one Pizza?

  I'd appreciate any help I can get! 
  
  Thanks!
  
 
 -- 
 View this message in context: 
 http://www.nabble.com/Question-regarding-Solr-ranking-tp15719752p15719834.html
 Sent from the Solr - User mailing list archive at Nabble.com.
 
 




Question regarding Solr ranking

2008-02-27 Thread oleg_gnatovskiy

Hello everyone.


I've run into a weird problem with Solr's ranking engine. In a nutshell, the
problem involves certain results getting EXTREMELY high rank scores. Here is
an example:


locRvwText:"Pizza Pizza"^10 OR locName:"Pizza Pizza"^30


The way I understand it is that the locName part of the query should be
boosted 3x more than the locRvwText.

However, when running this query the first result is:



10.8226
Johnnie's New York Pizzeria


Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza...
Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza...
Pizza... Pizza... Pizza... Pizza... Pizza...





10.8226 = (MATCH) product of:
  21.6452 = (MATCH) sum of:
    21.6452 = weight(locRvwText:pizza pizza^10.0 in 3792465), product of:
      0.3354544 = queryWeight(locRvwText:pizza pizza^10.0), product of:
        10.0 = boost
        14.428232 = idf(locRvwText: pizza=8156 pizza=8156)
        0.0023249863 = queryNorm
      64.52502 = fieldWeight(locRvwText:pizza pizza in 3792465), product of:
        4.472136 = tf(phraseFreq=20.0)
        14.428232 = idf(locRvwText: pizza=8156 pizza=8156)
        1.0 = fieldNorm(field=locRvwText, doc=3792465)
  0.5 = coord(1/2)
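
The numbers in this explain output can be reproduced by hand. The sketch below simply multiplies the reported factors together; the only thing not taken from the output is the assumption that tf is the square root of the phrase frequency, which agrees with the reported 4.472136 for a frequency of 20.

```python
import math

# Factors copied from the explain output above.
boost = 10.0
idf = 14.428232
query_norm = 0.0023249863
phrase_freq = 20.0
field_norm = 1.0
coord = 1 / 2          # only one of the two OR clauses matched

tf = math.sqrt(phrase_freq)                 # assumed tf formula
query_weight = boost * idf * query_norm     # reported as 0.3354544
field_weight = tf * idf * field_norm        # reported as 64.52502
score = query_weight * field_weight * coord # reported as 10.8226

print(tf, query_weight, field_weight, score)
```

Two things stand out: the whole score comes from the locRvwText clause, so the ^30 boost on locName plays no part for this document, and the idf factor enters twice (once in queryWeight, once in fieldWeight), which is part of why the phrase match scores so high.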




How come the phrase frequency for rvwText comes back as 20? The field
rvwText is defined in the following way:




And my text fields are defined in the following way:




  








  
  







  



Forgive me if I am wrong, but shouldn't the
RemoveDuplicatesTokenFilterFactory have the string Pizza... Pizza...
Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza...
Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza...
Pizza... Pizza... Pizza... count as simply one Pizza?

I'd appreciate any help I can get! 

Thanks!






-- 
View this message in context: 
http://www.nabble.com/Question-regarding-Solr-ranking-tp15719752p15719752.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Question regarding Solr ranking

2008-02-27 Thread oleg_gnatovskiy

Sorry about the previous message, I had some formatting issues. Below is the
actual message!

oleg_gnatovskiy wrote:
 
 Hello everyone.
 
 I've run into a weird problem with Solr's ranking engine. In a nutshell,
 the problem involves certain results getting EXTREMELY high rank scores.
 Here is an example:
 
 locRvwText:"Pizza Pizza"^10 OR locName:"Pizza Pizza"^30
 
 The way I understand it is that the locName part of the query should be
 boosted 3x more than the locRvwText.
 However, when running this query the first result is:
 
 <float name="score">10.8226</float>
 <str name="locName">Johnnie's New York Pizzeria</str>
 <arr name="locRvwText">
 <str>
 Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza...
 Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza...
 Pizza... Pizza... Pizza... Pizza... Pizza...
 </str>
 </arr>
 <lst name="explain">
 <str name="id=157789,internal_docid=3792465">
 10.8226 = (MATCH) product of:
   21.6452 = (MATCH) sum of:
     21.6452 = weight(locRvwText:pizza pizza^10.0 in 3792465), product of:
       0.3354544 = queryWeight(locRvwText:pizza pizza^10.0), product of:
         10.0 = boost
         14.428232 = idf(locRvwText: pizza=8156 pizza=8156)
         0.0023249863 = queryNorm
       64.52502 = fieldWeight(locRvwText:pizza pizza in 3792465), product of:
         4.472136 = tf(phraseFreq=20.0)
         14.428232 = idf(locRvwText: pizza=8156 pizza=8156)
         1.0 = fieldNorm(field=locRvwText, doc=3792465)
   0.5 = coord(1/2)
 </str>
 </lst>
 
 
 How come the phrase frequency for rvwText comes back as 20? The field
 rvwText is defined in the following way:
 
 <field name="locRvwText" type="text" index="false" stored="true"
 required="false" multiValued="true" omitNorms="true"/>
 
 And my text fields are defined in the following way:
 
 <fieldtype name="text" class="solr.TextField" positionIncrementGap="100">
   <analyzer type="index">
     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
     <!-- in this example, we will only use synonyms at query time -->
     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
             ignoreCase="true" expand="true"/>
     <filter class="solr.StopFilterFactory" ignoreCase="true"
             words="stopwords.txt"/>
     <filter class="solr.WordDelimiterFilterFactory"
             generateWordParts="1" generateNumberParts="1" catenateWords="1"
             catenateNumbers="1" catenateAll="0"/>
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.EnglishPorterFilterFactory"
             protected="protwords.txt"/>
     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
   </analyzer>
   <analyzer type="query">
     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
             ignoreCase="true" expand="true"/>
     <filter class="solr.StopFilterFactory" ignoreCase="true"
             words="stopwords.txt"/>
     <filter class="solr.WordDelimiterFilterFactory"
             generateWordParts="1" generateNumberParts="1" catenateWords="0"
             catenateNumbers="0" catenateAll="0"/>
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.EnglishPorterFilterFactory"
             protected="protwords.txt"/>
     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
   </analyzer>
 </fieldtype>
 
 Forgive me if I am wrong, but shouldn't the
 RemoveDuplicatesTokenFilterFactory have the string Pizza... Pizza...
 Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza...
 Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza...
 Pizza... Pizza... Pizza... count as simply one Pizza?
 I'd appreciate any help I can get! 
 
 Thanks!
 

-- 
View this message in context: 
http://www.nabble.com/Question-regarding-Solr-ranking-tp15719752p15719834.html
Sent from the Solr - User mailing list archive at Nabble.com.