Re: Function Query Optimization

2020-12-14 Thread Jae Joo
Should SubQuery be faster than FunctionQuery?

On Sat, Dec 12, 2020 at 10:24 AM Vincenzo D'Amore wrote:

> Hi, looking at this sample it seems you have just one document for '12345',
> one for '23456' and so on so forth. If this is true, why don't just try
> with a subquery
>
> https://lucene.apache.org/solr/guide/6_6/transforming-result-documents.html#TransformingResultDocuments-_subquery_
>
> On Fri, Dec 11, 2020 at 3:31 PM Jae Joo  wrote:
>
> > I have the requirement to create field  - xyz to be returned based on the
> > matched result.
> > Here Is the code .
> >
> > XYZ:concat(
> >
> > if(exists(query({!v='field1:12345'})), '12345', ''),
> >
> > if(exists(query({!v='field1:23456'})), '23456', ''),
> >
> > if(exists(query({!v='field1:34567'})), '34567', ''),
> >
> > if(exists(query({!v='field:45678'})), '45678','')
> > ),
> >
> > I am feeling this is very complex, so I am looking for some smart and
> > faster ideas.
> >
> > Thanks,
> >
> > Jae
> >
>
>
> --
> Vincenzo D'Amore
>


Re: Function Query Optimization

2020-12-12 Thread Vincenzo D'Amore
Hi, looking at this sample it seems you have just one document for '12345',
one for '23456', and so on and so forth. If this is true, why not just try
with a subquery:
https://lucene.apache.org/solr/guide/6_6/transforming-result-documents.html#TransformingResultDocuments-_subquery_
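
A minimal sketch of the [subquery] document transformer described on that
page (syntax per the Solr 6.6 reference guide; the xyz result field and the
exact subquery are illustrative, not taken from the original mail).
Parameters are shown one per line and would be joined with & in the request
URL:

  fl=id,field1,xyz:[subquery]
  xyz.q={!terms f=field1 v=$row.field1}
  xyz.fl=field1
  xyz.rows=1

Each returned parent document then carries an embedded xyz result built from
its own field1 value, instead of concatenating function-query outputs.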

On Fri, Dec 11, 2020 at 3:31 PM Jae Joo  wrote:

> I have the requirement to create field  - xyz to be returned based on the
> matched result.
> Here Is the code .
>
> XYZ:concat(
>
> if(exists(query({!v='field1:12345'})), '12345', ''),
>
> if(exists(query({!v='field1:23456'})), '23456', ''),
>
> if(exists(query({!v='field1:34567'})), '34567', ''),
>
> if(exists(query({!v='field:45678'})), '45678','')
> ),
>
> I am feeling this is very complex, so I am looking for some smart and
> faster ideas.
>
> Thanks,
>
> Jae
>


-- 
Vincenzo D'Amore


Function Query Optimization

2020-12-11 Thread Jae Joo
I have a requirement to create a field - xyz - to be returned based on the
matched result.
Here is the code:

XYZ:concat(

if(exists(query({!v='field1:12345'})), '12345', ''),

if(exists(query({!v='field1:23456'})), '23456', ''),

if(exists(query({!v='field1:34567'})), '34567', ''),

if(exists(query({!v='field:45678'})), '45678','')
),

This feels very complex, so I am looking for smarter and faster ideas.

Thanks,

Jae


Re: query optimization

2019-07-06 Thread Mikhail Khludnev
https://lucene.apache.org/solr/guide/6_6/common-query-parameters.html#CommonQueryParameters-ThedebugParameter
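
As a quick illustration (a sketch only; the core name and query are
placeholders), appending the debug parameter from that page to the slow
request shows how the time is split across the search components:

  http://localhost:8983/solr/<core>/select?q=<your query>&debug=timing

debug=all (or the older debugQuery=true) additionally returns the parsed
query and per-document score explanations.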


On Wed, Jul 3, 2019 at 10:10 AM Midas A  wrote:

> Please suggest here
>
> On Wed, Jul 3, 2019 at 10:23 AM Midas A  wrote:
>
> > Hi,
> >
> > How can i optimize following query it is taking time
> >
> >  webapp=/solr path=/search params={
> >
> df=ttl=0=true=1=true=true=0=0=contents^0.05+currdesig^1.5+predesig^1.5+lng^2+ttl+kw_skl+kw_it=false=ttl,kw_skl,kw_it,contents==1=ttl^0.1+currdesig^0.1+predesig^0.1=0=/resumesearch="mbbss"+OR+"medicine"=2=true=mbbs,+"medical+officer",+doctor,+physician+("medical+officer")+"medical+officer"+"physician""+""general+physician""+""physicians""+""consultant+physician""+""house+physician"+"physician"+"doctor"+"mbbs"+"general+physician"+"physicians"+"consultant+physician"+"house+physician"=(293)=false==none=id,upt=1=OR=NOT+contents:("liaise+with+medical+officer"+"worked+with+medical+officer"+"working+with+medical+officer"+"reported+to+medical+officer"+"references+are+medical+officer"+"coordinated+with+medical+officer"+"closely+with+medical+officer"+"signature+of+medical+officer"+"seal+of++medical+officer"+"liaise+with+physician"+"worked+with+physician"+"working+with+physician"+"reported+to+physician"+"references+are+physician"+"coordinated+with+physician"+"closely+with+physician"+"signature+of+physician"+"seal+of++physician"+"liaise+with+doctor"+"worked+with+doctor"+"working+with+doctor"+"reported+to+doctor"+"references+are+doctor"+"coordinated+with+doctor"+"closely+with+doctor"+"signature+of+doctor"+"seal+of++doctor")=NOT+hemp:("xmwxagency"+"xmwxlimited"+"xmwxplacement"+"xmwxplus"+"xmwxprivate"+"xmwxsecurity"+"xmwxz2"+"xmwxand"+"xswxz2+plus+placement+and+security+agency+private+limited"+"xswxz2+plus+placement+and+security+agency+private"+"xswxz2+plus+placement+and+security+agency"+"xswxz2+plus+placement+and+security"+"xswxz2+plus+placement+and"+"xswxz2+plus+placement"+"xswxz2+plus"+"xswxz2")=ctc:[100.0+TO+107.2]+OR+ctc:[-1.0+TO+-1.0]=(dlh:(22))=ind:(24++42++24++8)=(rol:(292+293+294+322))=(cat:(9))=cat:(1000+OR+907+OR+1+OR+2+OR+3+OR+786+OR+4+OR+5+OR+6+OR+7+OR+8+OR+9+OR+10+OR+11+OR+12+OR+13+OR+14+OR+785+OR+15+OR+16+OR+17+OR+18+OR+908+OR+19+OR+20+OR+21+OR+23+OR+24)=NOT+is_udis:2=is_resume:0^-1000=upt_date:[*+TO+NOW/DAY-36MONTHS]^2=upt_date:[NOW/DAY-36MONTHS+TO+NOW/DAY-24MONTHS]^3=upt_date:[NOW/DAY-24MONTHS+TO+NOW/DAY-12MONTHS]^4=upt_date:[NOW/DAY-12MONTHS+TO+NOW/DAY-9MONTHS]^5=upt_date:[NOW/DAY-9MONTHS+TO+NOW/DAY-6MONTHS]^10=upt_date:[NOW/DAY-6MONTHS+TO+NOW/DAY-3MONTHS]^15=upt_date:[NOW/DAY-3MONTHS+TO+*]^20=_query_:"{!edismax+qf%3Drol^2+pf%3Did+ps%3D1+pf2%3Did+ps2%3D1+pf3%3Did+ps3%3D1+v%3D$typeId+q.op%3DOR+bq%3D\$bq1+bf%3D}"=_query_:"{!edismax+qf%3Drol^2+pf%3Did+ps%3D1+pf2%3Did+ps2%3D1+pf3%3Did+ps3%3D1+v%3D$typeId+q.op%3DOR+bq%3D\$bq1+bf%3D}"=_query_:"{!edismax+qf%3Drol^2+pf%3Did+ps%3D1+pf2%3Did+ps2%3D1+pf3%3Did+ps3%3D1+v%3D$typeId+q.op%3DOR+bq%3D\$bq1+bf%3D}"=dlh:(22)^8={!boost+b%3D4}+_query_:{!edismax+qf%3D"currdesig^8+predesig^6+ttl^3+kw_skl^2+contents"+v%3D"\"doctor\"+\"medical+officer\"+\"physician\""+q.op%3DAND+bq%3D}=_query_:{!edismax+qf%3D"currdesig+predesig+ttl+kw_skl+contents^0.01"+v%3D"\"doctor\"+\"medical+officer\"+\"physician\""+q.op%3DOR+bq%3D}=NOT+country:isoin^-10=exp:[+10+TO+11+]=exp:[+11+TO+13+]=exp:[+13+TO+15+]=exp:[+15+TO+17+]=exp:[+17+TO+20+]=exp:[+20+TO+25+]=exp:[+25+TO+109+]=ctc:[+100+TO+101+]=ctc:[+101+TO+101.5+]=ctc:[+101.5+TO+102+]=ctc:[+102+TO+103+]=ctc:[+103+TO+104+]=ctc:[+104+TO+105+]=ctc:[+105+TO+107.5+]=ctc:[+107.5+TO+110+]=ctc:[+110+TO+115+]=ctc:[+115+TO+10100+]=1=(22)=javabin=(293)=(294)=(322)=ind=cat=rol=cl=pref=false=1=0=40=((mbbs+OR+_query_:"{!edismax+qf%3Ddlh+pf%3Did+ps%3D1+pf2%3Did+ps2%3D1+pf3%3Did+ps3%3D1+v%3D$queryany3+q.op%3DOR+bq%3D$bq1+bf%3D}")+OR+((("medical+officer")+OR+"medical+officer"~0)+OR+_query_:"{!edismax+qf%3Drol+pf%3Did+ps%3D1+pf2%3Did+ps2%3D1+pf
3%3Did+ps3%3D1+v%3D$queryany0+q.op%3DOR+bq%3D$bq1+bf%3D}")+OR+(("doctor"+OR+doctor)+OR+_query_:"{!edismax+qf%3Drol+pf%3Did+ps%3D1+pf2%3Did+ps2%3D1+pf3%3Did+ps3%3D1+v%3D$queryany2+q.op%3DOR+bq%3D$bq1+bf%3D}")+OR+(("physician"+OR+"physicians"+OR+"general+physician"+OR+"house+physician"+OR+"consultant+physician"+OR+physician)+OR+_query_:"{!edismax+qf%3Drol+pf%3Did+ps%3D1+pf2%3Did+ps2%3D1+pf3%3Did+ps3%3D1+v%3D$queryany1+q.op%3DOR+bq%3D$bq1+bf%3D}")+OR+_query_:"{!edismax+qf%3D\$semanticfieldskl+pf%3Did+ps%3D1+pf2%3Did+ps2%3D1+pf3%3Did+ps3%3D1+v%3D\$semantictermsskl+q.op%3DOR+bq%3D\$bq1+bf%3D}"+OR+_query_:"{!edismax+qf%3D\$semanticfieldttl+pf%3Did+ps%3D1+pf2%3Did+ps2%3D1+pf3%3Did+ps3%3D1+v%3D\$semantictermsttl+q.op%3DAND+bq%3D\$bq1+bf%3D}")=10=id=kw_skl^0.05+kw_it^0.05+ttl^0.05+currdesig^0.05+predesig^0.05=1=id=id=true}
> > hits=20268 status=0 QTime=10659
> >
>


-- 
Sincerely yours
Mikhail Khludnev


Re: query optimization

2019-07-03 Thread Midas A
Please suggest here

On Wed, Jul 3, 2019 at 10:23 AM Midas A  wrote:

> Hi,
>
> How can i optimize following query it is taking time
>
>  webapp=/solr path=/search params={
> df=ttl=0=true=1=true=true=0=0=contents^0.05+currdesig^1.5+predesig^1.5+lng^2+ttl+kw_skl+kw_it=false=ttl,kw_skl,kw_it,contents==1=ttl^0.1+currdesig^0.1+predesig^0.1=0=/resumesearch="mbbss"+OR+"medicine"=2=true=mbbs,+"medical+officer",+doctor,+physician+("medical+officer")+"medical+officer"+"physician""+""general+physician""+""physicians""+""consultant+physician""+""house+physician"+"physician"+"doctor"+"mbbs"+"general+physician"+"physicians"+"consultant+physician"+"house+physician"=(293)=false==none=id,upt=1=OR=NOT+contents:("liaise+with+medical+officer"+"worked+with+medical+officer"+"working+with+medical+officer"+"reported+to+medical+officer"+"references+are+medical+officer"+"coordinated+with+medical+officer"+"closely+with+medical+officer"+"signature+of+medical+officer"+"seal+of++medical+officer"+"liaise+with+physician"+"worked+with+physician"+"working+with+physician"+"reported+to+physician"+"references+are+physician"+"coordinated+with+physician"+"closely+with+physician"+"signature+of+physician"+"seal+of++physician"+"liaise+with+doctor"+"worked+with+doctor"+"working+with+doctor"+"reported+to+doctor"+"references+are+doctor"+"coordinated+with+doctor"+"closely+with+doctor"+"signature+of+doctor"+"seal+of++doctor")=NOT+hemp:("xmwxagency"+"xmwxlimited"+"xmwxplacement"+"xmwxplus"+"xmwxprivate"+"xmwxsecurity"+"xmwxz2"+"xmwxand"+"xswxz2+plus+placement+and+security+agency+private+limited"+"xswxz2+plus+placement+and+security+agency+private"+"xswxz2+plus+placement+and+security+agency"+"xswxz2+plus+placement+and+security"+"xswxz2+plus+placement+and"+"xswxz2+plus+placement"+"xswxz2+plus"+"xswxz2")=ctc:[100.0+TO+107.2]+OR+ctc:[-1.0+TO+-1.0]=(dlh:(22))=ind:(24++42++24++8)=(rol:(292+293+294+322))=(cat:(9))=cat:(1000+OR+907+OR+1+OR+2+OR+3+OR+786+OR+4+OR+5+OR+6+OR+7+OR+8+OR+9+OR+10+OR+11+OR+12+OR+13+OR+14+OR+785+OR+15+OR+16+OR+17+OR+18+OR+908+OR+19+OR+20+OR+21+OR+23+OR+24)=NOT+is_udis:2=is_resume:0^-1000=upt_date:[*+TO+NOW/DAY-36MONTHS]^2=upt_date:[NOW/DAY-36MONTHS+TO+NOW/DAY-24MONTHS]^3=upt_date:[NOW/DAY-24MONTHS+TO+NOW/DAY-12MONTHS]^4=upt_date:[NOW/DAY-12MONTHS+TO+NOW/DAY-9MONTHS]^5=upt_date:[NOW/DAY-9MONTHS+TO+NOW/DAY-6MONTHS]^10=upt_date:[NOW/DAY-6MONTHS+TO+NOW/DAY-3MONTHS]^15=upt_date:[NOW/DAY-3MONTHS+TO+*]^20=_query_:"{!edismax+qf%3Drol^2+pf%3Did+ps%3D1+pf2%3Did+ps2%3D1+pf3%3Did+ps3%3D1+v%3D$typeId+q.op%3DOR+bq%3D\$bq1+bf%3D}"=_query_:"{!edismax+qf%3Drol^2+pf%3Did+ps%3D1+pf2%3Did+ps2%3D1+pf3%3Did+ps3%3D1+v%3D$typeId+q.op%3DOR+bq%3D\$bq1+bf%3D}"=_query_:"{!edismax+qf%3Drol^2+pf%3Did+ps%3D1+pf2%3Did+ps2%3D1+pf3%3Did+ps3%3D1+v%3D$typeId+q.op%3DOR+bq%3D\$bq1+bf%3D}"=dlh:(22)^8={!boost+b%3D4}+_query_:{!edismax+qf%3D"currdesig^8+predesig^6+ttl^3+kw_skl^2+contents"+v%3D"\"doctor\"+\"medical+officer\"+\"physician\""+q.op%3DAND+bq%3D}=_query_:{!edismax+qf%3D"currdesig+predesig+ttl+kw_skl+contents^0.01"+v%3D"\"doctor\"+\"medical+officer\"+\"physician\""+q.op%3DOR+bq%3D}=NOT+country:isoin^-10=exp:[+10+TO+11+]=exp:[+11+TO+13+]=exp:[+13+TO+15+]=exp:[+15+TO+17+]=exp:[+17+TO+20+]=exp:[+20+TO+25+]=exp:[+25+TO+109+]=ctc:[+100+TO+101+]=ctc:[+101+TO+101.5+]=ctc:[+101.5+TO+102+]=ctc:[+102+TO+103+]=ctc:[+103+TO+104+]=ctc:[+104+TO+105+]=ctc:[+105+TO+107.5+]=ctc:[+107.5+TO+110+]=ctc:[+110+TO+115+]=ctc:[+115+TO+10100+]=1=(22)=javabin=(293)=(294)=(322)=ind=cat=rol=cl=pref=false=1=0=40=((mbbs+OR+_query_:"{!edismax+qf%3Ddlh+pf%3Did+ps%3D1+pf2%3Did+ps2%3D1+pf3%3Did+ps3%3D1+v%3D$queryany3+q.op%3DOR+bq%3D$bq1+bf%3D}")+OR+((("medical+officer")+OR+"medical+officer"~0)+OR+_query_:"{!edismax+qf%3Drol+pf%3Did+ps%3D1+pf2%3Did+ps2%3D1+pf
3%3Did+ps3%3D1+v%3D$queryany0+q.op%3DOR+bq%3D$bq1+bf%3D}")+OR+(("doctor"+OR+doctor)+OR+_query_:"{!edismax+qf%3Drol+pf%3Did+ps%3D1+pf2%3Did+ps2%3D1+pf3%3Did+ps3%3D1+v%3D$queryany2+q.op%3DOR+bq%3D$bq1+bf%3D}")+OR+(("physician"+OR+"physicians"+OR+"general+physician"+OR+"house+physician"+OR+"consultant+physician"+OR+physician)+OR+_query_:"{!edismax+qf%3Drol+pf%3Did+ps%3D1+pf2%3Did+ps2%3D1+pf3%3Did+ps3%3D1+v%3D$queryany1+q.op%3DOR+bq%3D$bq1+bf%3D}")+OR+_query_:"{!edismax+qf%3D\$semanticfieldskl+pf%3Did+ps%3D1+pf2%3Did+ps2%3D1+pf3%3Did+ps3%3D1+v%3D\$semantictermsskl+q.op%3DOR+bq%3D\$bq1+bf%3D}"+OR+_query_:"{!edismax+qf%3D\$semanticfieldttl+pf%3Did+ps%3D1+pf2%3Did+ps2%3D1+pf3%3Did+ps3%3D1+v%3D\$semantictermsttl+q.op%3DAND+bq%3D\$bq1+bf%3D}")=10=id=kw_skl^0.05+kw_it^0.05+ttl^0.05+currdesig^0.05+predesig^0.05=1=id=id=true}
> hits=20268 status=0 QTime=10659
>


query optimization

2019-07-02 Thread Midas A
Hi,

How can I optimize the following query? It is taking a long time.

 webapp=/solr path=/search params={
df=ttl=0=true=1=true=true=0=0=contents^0.05+currdesig^1.5+predesig^1.5+lng^2+ttl+kw_skl+kw_it=false=ttl,kw_skl,kw_it,contents==1=ttl^0.1+currdesig^0.1+predesig^0.1=0=/resumesearch="mbbss"+OR+"medicine"=2=true=mbbs,+"medical+officer",+doctor,+physician+("medical+officer")+"medical+officer"+"physician""+""general+physician""+""physicians""+""consultant+physician""+""house+physician"+"physician"+"doctor"+"mbbs"+"general+physician"+"physicians"+"consultant+physician"+"house+physician"=(293)=false==none=id,upt=1=OR=NOT+contents:("liaise+with+medical+officer"+"worked+with+medical+officer"+"working+with+medical+officer"+"reported+to+medical+officer"+"references+are+medical+officer"+"coordinated+with+medical+officer"+"closely+with+medical+officer"+"signature+of+medical+officer"+"seal+of++medical+officer"+"liaise+with+physician"+"worked+with+physician"+"working+with+physician"+"reported+to+physician"+"references+are+physician"+"coordinated+with+physician"+"closely+with+physician"+"signature+of+physician"+"seal+of++physician"+"liaise+with+doctor"+"worked+with+doctor"+"working+with+doctor"+"reported+to+doctor"+"references+are+doctor"+"coordinated+with+doctor"+"closely+with+doctor"+"signature+of+doctor"+"seal+of++doctor")=NOT+hemp:("xmwxagency"+"xmwxlimited"+"xmwxplacement"+"xmwxplus"+"xmwxprivate"+"xmwxsecurity"+"xmwxz2"+"xmwxand"+"xswxz2+plus+placement+and+security+agency+private+limited"+"xswxz2+plus+placement+and+security+agency+private"+"xswxz2+plus+placement+and+security+agency"+"xswxz2+plus+placement+and+security"+"xswxz2+plus+placement+and"+"xswxz2+plus+placement"+"xswxz2+plus"+"xswxz2")=ctc:[100.0+TO+107.2]+OR+ctc:[-1.0+TO+-1.0]=(dlh:(22))=ind:(24++42++24++8)=(rol:(292+293+294+322))=(cat:(9))=cat:(1000+OR+907+OR+1+OR+2+OR+3+OR+786+OR+4+OR+5+OR+6+OR+7+OR+8+OR+9+OR+10+OR+11+OR+12+OR+13+OR+14+OR+785+OR+15+OR+16+OR+17+OR+18+OR+908+OR+19+OR+20+OR+21+OR+23+OR+24)=NOT+is_udis:2=is_resume:0^-1000=upt_date:[*+TO+NOW/DAY-36MONTHS]^2=upt_date:[NOW/DAY-36MONTHS+TO+NOW/DAY-24MONTHS]^3=upt_date:[NOW/DAY-24MONTHS+TO+NOW/DAY-12MONTHS]^4=upt_date:[NOW/DAY-12MONTHS+TO+NOW/DAY-9MONTHS]^5=upt_date:[NOW/DAY-9MONTHS+TO+NOW/DAY-6MONTHS]^10=upt_date:[NOW/DAY-6MONTHS+TO+NOW/DAY-3MONTHS]^15=upt_date:[NOW/DAY-3MONTHS+TO+*]^20=_query_:"{!edismax+qf%3Drol^2+pf%3Did+ps%3D1+pf2%3Did+ps2%3D1+pf3%3Did+ps3%3D1+v%3D$typeId+q.op%3DOR+bq%3D\$bq1+bf%3D}"=_query_:"{!edismax+qf%3Drol^2+pf%3Did+ps%3D1+pf2%3Did+ps2%3D1+pf3%3Did+ps3%3D1+v%3D$typeId+q.op%3DOR+bq%3D\$bq1+bf%3D}"=_query_:"{!edismax+qf%3Drol^2+pf%3Did+ps%3D1+pf2%3Did+ps2%3D1+pf3%3Did+ps3%3D1+v%3D$typeId+q.op%3DOR+bq%3D\$bq1+bf%3D}"=dlh:(22)^8={!boost+b%3D4}+_query_:{!edismax+qf%3D"currdesig^8+predesig^6+ttl^3+kw_skl^2+contents"+v%3D"\"doctor\"+\"medical+officer\"+\"physician\""+q.op%3DAND+bq%3D}=_query_:{!edismax+qf%3D"currdesig+predesig+ttl+kw_skl+contents^0.01"+v%3D"\"doctor\"+\"medical+officer\"+\"physician\""+q.op%3DOR+bq%3D}=NOT+country:isoin^-10=exp:[+10+TO+11+]=exp:[+11+TO+13+]=exp:[+13+TO+15+]=exp:[+15+TO+17+]=exp:[+17+TO+20+]=exp:[+20+TO+25+]=exp:[+25+TO+109+]=ctc:[+100+TO+101+]=ctc:[+101+TO+101.5+]=ctc:[+101.5+TO+102+]=ctc:[+102+TO+103+]=ctc:[+103+TO+104+]=ctc:[+104+TO+105+]=ctc:[+105+TO+107.5+]=ctc:[+107.5+TO+110+]=ctc:[+110+TO+115+]=ctc:[+115+TO+10100+]=1=(22)=javabin=(293)=(294)=(322)=ind=cat=rol=cl=pref=false=1=0=40=((mbbs+OR+_query_:"{!edismax+qf%3Ddlh+pf%3Did+ps%3D1+pf2%3Did+ps2%3D1+pf3%3Did+ps3%3D1+v%3D$queryany3+q.op%3DOR+bq%3D$bq1+bf%3D}")+OR+((("medical+officer")+OR+"medical+officer"~0)+OR+_query_:"{!edismax+qf%3Drol+pf%3Did+ps%3D1+pf2%3Did+ps2%3D1+pf3%
3Did+ps3%3D1+v%3D$queryany0+q.op%3DOR+bq%3D$bq1+bf%3D}")+OR+(("doctor"+OR+doctor)+OR+_query_:"{!edismax+qf%3Drol+pf%3Did+ps%3D1+pf2%3Did+ps2%3D1+pf3%3Did+ps3%3D1+v%3D$queryany2+q.op%3DOR+bq%3D$bq1+bf%3D}")+OR+(("physician"+OR+"physicians"+OR+"general+physician"+OR+"house+physician"+OR+"consultant+physician"+OR+physician)+OR+_query_:"{!edismax+qf%3Drol+pf%3Did+ps%3D1+pf2%3Did+ps2%3D1+pf3%3Did+ps3%3D1+v%3D$queryany1+q.op%3DOR+bq%3D$bq1+bf%3D}")+OR+_query_:"{!edismax+qf%3D\$semanticfieldskl+pf%3Did+ps%3D1+pf2%3Did+ps2%3D1+pf3%3Did+ps3%3D1+v%3D\$semantictermsskl+q.op%3DOR+bq%3D\$bq1+bf%3D}"+OR+_query_:"{!edismax+qf%3D\$semanticfieldttl+pf%3Did+ps%3D1+pf2%3Did+ps2%3D1+pf3%3Did+ps3%3D1+v%3D\$semantictermsttl+q.op%3DAND+bq%3D\$bq1+bf%3D}")=10=id=kw_skl^0.05+kw_it^0.05+ttl^0.05+currdesig^0.05+predesig^0.05=1=id=id=true}
hits=20268 status=0 QTime=10659


Re: Query optimization

2016-07-29 Thread Ahmet Arslan
Oops, I forgot the link:
http://yonik.com/solr/paging-and-deep-paging/
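
The short version, as a sketch (the usual cursorMark rules apply; field names
are placeholders): instead of paging with a large start offset, keep start=0,
use a sort that ends on the uniqueKey field, and pass a cursor:

  q=*:*&sort=score+desc,id+asc&rows=100&cursorMark=*

then re-issue the same query with cursorMark set to the nextCursorMark value
returned by the previous response. Each page then costs roughly the same,
instead of growing with the start value.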




On Friday, July 29, 2016 9:51 AM, Ahmet Arslan  wrote:
Hi Midas,

Please search for 'deep paging' in the documentation, mailing list, etc.
Solr Deep Paging and Sorting


Ahmet

On Friday, July 29, 2016 9:21 AM, Midas A  wrote:



please reply .


On Fri, Jul 29, 2016 at 10:26 AM, Midas A  wrote:

> a) my index size is 10 gb   for higher start is query response got slow .
> what should i do to optimize this query for higher start value in query
>


Re: Query optimization

2016-07-29 Thread Ahmet Arslan
Hi Midas,

Please search for 'deep paging' in the documentation, mailing list, etc.
Solr Deep Paging and Sorting


Ahmet
On Friday, July 29, 2016 9:21 AM, Midas A  wrote:



please reply .


On Fri, Jul 29, 2016 at 10:26 AM, Midas A  wrote:

> a) my index size is 10 gb   for higher start is query response got slow .
> what should i do to optimize this query for higher start value in query
>


Re: Query optimization

2016-07-29 Thread Midas A
please reply .

On Fri, Jul 29, 2016 at 10:26 AM, Midas A  wrote:

> a) my index size is 10 gb   for higher start is query response got slow .
> what should i do to optimize this query for higher start value in query
>


Query optimization

2016-07-28 Thread Midas A
a) My index size is 10 GB. For higher start values the query response gets
slow. What should I do to optimize this query for a higher start value?


Re: Query optimization

2016-07-13 Thread Midas A
Hi ,

One more thing I would like to add here: we build facet queries over
dynamic fields, so my questions are
a) Is there any harm in using docValues=true with dynamic fields?
b) Any other suggestions we could implement to optimize this query? My index
size is 8 GB and the query is taking more than 3 seconds.

Regards,
Abhishek Tiwari

On Thu, Jul 14, 2016 at 6:42 AM, Erick Erickson wrote:

> DocValues are now the preferred mechanism
> whenever you need to sort, facet or group. It'll
> make your on-disk index bigger, but the on-disk
> structure would have been built in Java's memory
> if you didn't use DocValues whereas if you do
> it's MMap'd.
>
> So overall, use DocValues by preference.
>
> Best,
> Erick
>
> On Wed, Jul 13, 2016 at 5:36 AM, sara hajili 
> wrote:
> > as i know when you use docValue=true
> > solr when indexing doc,
> > solr although store doc and docValue=true field in memory.to use that in
> > facet query and sort query result.
> > so maybe use a lot docvalue=true use a lot  memory of ur system.
> > but use it in logical way.can make better query response time
> >
> > On Wed, Jul 13, 2016 at 5:11 AM, Midas A  wrote:
> >
> >> Is there any draw back of using docValues=true ?
> >>
> >> On Wed, Jul 13, 2016 at 2:28 PM, sara hajili 
> >> wrote:
> >>
> >> > Hi.
> >> > Facet query take a long time.you vcan use group query.
> >> > Or in fileds in schema that you run facet query on that filed.
> >> > Set doc value=true.
> >> > To get better answer.in quick time.
> >> > On Jul 13, 2016 11:54 AM, "Midas A"  wrote:
> >> >
> >> > > http://
> >> > >
> >> > >
> >> >
> >>
> #:8983/solr/prod/select?q=id_path_ids:166=sort_price:[0%20TO%20*]=status:A=company_status:A=true=1=show_meta_id=show_brand=product_amount_available=by_processor=by_system_memory=by_screen_size=by_operating_system=by_laptop_type=by_processor_brand=by_hard_drive_capacity=by_touchscreen=by_warranty=by_graphic_memory=is_trm=show_merchant=is_cod=show_market={!ex=p_r%20key=product_rating:[4-5]}product_rating:[4%20TO%205]={!ex=p_r%20key=product_rating:[3-5]}product_rating:[3%20TO%205]={!ex=p_r%20key=product_rating:[2-5]}product_rating:[2%20TO%205]={!ex=p_r%20key=product_rating:[1-5]}product_rating:[1%20TO%205]={!ex=m_r%20key=merchant_rating:[4-5]}merchant_rating:[4%20TO%205]={!ex=m_r%20key=merchant_rating:[3-5]}merchant_rating:[3%20TO%205]={!ex=m_r%20key=merchant_rating:[2-5]}merchant_rating:[2%20TO%205]={!ex=m_r%20key=merchant_rating:[1-5]}merchant_rating:[1%20TO%205]=500=true=sort_price=0=10=product_amount_available%20desc,boost_index%20asc,popularity%20desc,is_cod%20desc
> >> > >
> >> > >
> >> > > What kind of optimization we can do in above query . it is taking
> 2400
> >> > ms .
> >> > >
> >> >
> >>
>


Re: Query optimization

2016-07-13 Thread Erick Erickson
DocValues are now the preferred mechanism
whenever you need to sort, facet or group. It'll
make your on-disk index bigger, but the on-disk
structure would have been built in Java's memory
if you didn't use DocValues whereas if you do
it's MMap'd.

So overall, use DocValues by preference.
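
A minimal sketch of what that looks like in schema.xml (field and type names
below are placeholders, and changing docValues requires a full reindex):

  <field name="my_facet_field" type="string" indexed="true" stored="false" docValues="true"/>
  <dynamicField name="*_facet" type="string" indexed="true" stored="false" docValues="true"/>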

Best,
Erick

On Wed, Jul 13, 2016 at 5:36 AM, sara hajili  wrote:
> as i know when you use docValue=true
> solr when indexing doc,
> solr although store doc and docValue=true field in memory.to use that in
> facet query and sort query result.
> so maybe use a lot docvalue=true use a lot  memory of ur system.
> but use it in logical way.can make better query response time
>
> On Wed, Jul 13, 2016 at 5:11 AM, Midas A  wrote:
>
>> Is there any draw back of using docValues=true ?
>>
>> On Wed, Jul 13, 2016 at 2:28 PM, sara hajili 
>> wrote:
>>
>> > Hi.
>> > Facet query take a long time.you vcan use group query.
>> > Or in fileds in schema that you run facet query on that filed.
>> > Set doc value=true.
>> > To get better answer.in quick time.
>> > On Jul 13, 2016 11:54 AM, "Midas A"  wrote:
>> >
>> > > http://
>> > >
>> > >
>> >
>> #:8983/solr/prod/select?q=id_path_ids:166=sort_price:[0%20TO%20*]=status:A=company_status:A=true=1=show_meta_id=show_brand=product_amount_available=by_processor=by_system_memory=by_screen_size=by_operating_system=by_laptop_type=by_processor_brand=by_hard_drive_capacity=by_touchscreen=by_warranty=by_graphic_memory=is_trm=show_merchant=is_cod=show_market={!ex=p_r%20key=product_rating:[4-5]}product_rating:[4%20TO%205]={!ex=p_r%20key=product_rating:[3-5]}product_rating:[3%20TO%205]={!ex=p_r%20key=product_rating:[2-5]}product_rating:[2%20TO%205]={!ex=p_r%20key=product_rating:[1-5]}product_rating:[1%20TO%205]={!ex=m_r%20key=merchant_rating:[4-5]}merchant_rating:[4%20TO%205]={!ex=m_r%20key=merchant_rating:[3-5]}merchant_rating:[3%20TO%205]={!ex=m_r%20key=merchant_rating:[2-5]}merchant_rating:[2%20TO%205]={!ex=m_r%20key=merchant_rating:[1-5]}merchant_rating:[1%20TO%205]=500=true=sort_price=0=10=product_amount_available%20desc,boost_index%20asc,popularity%20desc,is_cod%20desc
>> > >
>> > >
>> > > What kind of optimization we can do in above query . it is taking 2400
>> > ms .
>> > >
>> >
>>


Re: Query optimization

2016-07-13 Thread sara hajili
As far as I know, when you use docValues=true, Solr, while indexing a doc,
also stores the docValues=true field in memory, to use it in facet queries
and when sorting query results. So using docValues=true on a lot of fields
may use a lot of your system's memory, but used in a sensible way it can give
better query response times.

On Wed, Jul 13, 2016 at 5:11 AM, Midas A  wrote:

> Is there any draw back of using docValues=true ?
>
> On Wed, Jul 13, 2016 at 2:28 PM, sara hajili 
> wrote:
>
> > Hi.
> > Facet query take a long time.you vcan use group query.
> > Or in fileds in schema that you run facet query on that filed.
> > Set doc value=true.
> > To get better answer.in quick time.
> > On Jul 13, 2016 11:54 AM, "Midas A"  wrote:
> >
> > > http://
> > >
> > >
> >
> #:8983/solr/prod/select?q=id_path_ids:166=sort_price:[0%20TO%20*]=status:A=company_status:A=true=1=show_meta_id=show_brand=product_amount_available=by_processor=by_system_memory=by_screen_size=by_operating_system=by_laptop_type=by_processor_brand=by_hard_drive_capacity=by_touchscreen=by_warranty=by_graphic_memory=is_trm=show_merchant=is_cod=show_market={!ex=p_r%20key=product_rating:[4-5]}product_rating:[4%20TO%205]={!ex=p_r%20key=product_rating:[3-5]}product_rating:[3%20TO%205]={!ex=p_r%20key=product_rating:[2-5]}product_rating:[2%20TO%205]={!ex=p_r%20key=product_rating:[1-5]}product_rating:[1%20TO%205]={!ex=m_r%20key=merchant_rating:[4-5]}merchant_rating:[4%20TO%205]={!ex=m_r%20key=merchant_rating:[3-5]}merchant_rating:[3%20TO%205]={!ex=m_r%20key=merchant_rating:[2-5]}merchant_rating:[2%20TO%205]={!ex=m_r%20key=merchant_rating:[1-5]}merchant_rating:[1%20TO%205]=500=true=sort_price=0=10=product_amount_available%20desc,boost_index%20asc,popularity%20desc,is_cod%20desc
> > >
> > >
> > > What kind of optimization we can do in above query . it is taking 2400
> > ms .
> > >
> >
>


Re: Query optimization

2016-07-13 Thread Midas A
Is there any drawback to using docValues=true?

On Wed, Jul 13, 2016 at 2:28 PM, sara hajili  wrote:

> Hi.
> Facet query take a long time.you vcan use group query.
> Or in fileds in schema that you run facet query on that filed.
> Set doc value=true.
> To get better answer.in quick time.
> On Jul 13, 2016 11:54 AM, "Midas A"  wrote:
>
> > http://
> >
> >
> #:8983/solr/prod/select?q=id_path_ids:166=sort_price:[0%20TO%20*]=status:A=company_status:A=true=1=show_meta_id=show_brand=product_amount_available=by_processor=by_system_memory=by_screen_size=by_operating_system=by_laptop_type=by_processor_brand=by_hard_drive_capacity=by_touchscreen=by_warranty=by_graphic_memory=is_trm=show_merchant=is_cod=show_market={!ex=p_r%20key=product_rating:[4-5]}product_rating:[4%20TO%205]={!ex=p_r%20key=product_rating:[3-5]}product_rating:[3%20TO%205]={!ex=p_r%20key=product_rating:[2-5]}product_rating:[2%20TO%205]={!ex=p_r%20key=product_rating:[1-5]}product_rating:[1%20TO%205]={!ex=m_r%20key=merchant_rating:[4-5]}merchant_rating:[4%20TO%205]={!ex=m_r%20key=merchant_rating:[3-5]}merchant_rating:[3%20TO%205]={!ex=m_r%20key=merchant_rating:[2-5]}merchant_rating:[2%20TO%205]={!ex=m_r%20key=merchant_rating:[1-5]}merchant_rating:[1%20TO%205]=500=true=sort_price=0=10=product_amount_available%20desc,boost_index%20asc,popularity%20desc,is_cod%20desc
> >
> >
> > What kind of optimization we can do in above query . it is taking 2400
> ms .
> >
>


Re: Query optimization

2016-07-13 Thread sara hajili
Hi.
Facet queries take a long time. You can use a group query instead.
Or, for the fields in the schema that you run facet queries on, set
docValues=true, to get answers more quickly.
On Jul 13, 2016 11:54 AM, "Midas A"  wrote:

> http://
>
> #:8983/solr/prod/select?q=id_path_ids:166=sort_price:[0%20TO%20*]=status:A=company_status:A=true=1=show_meta_id=show_brand=product_amount_available=by_processor=by_system_memory=by_screen_size=by_operating_system=by_laptop_type=by_processor_brand=by_hard_drive_capacity=by_touchscreen=by_warranty=by_graphic_memory=is_trm=show_merchant=is_cod=show_market={!ex=p_r%20key=product_rating:[4-5]}product_rating:[4%20TO%205]={!ex=p_r%20key=product_rating:[3-5]}product_rating:[3%20TO%205]={!ex=p_r%20key=product_rating:[2-5]}product_rating:[2%20TO%205]={!ex=p_r%20key=product_rating:[1-5]}product_rating:[1%20TO%205]={!ex=m_r%20key=merchant_rating:[4-5]}merchant_rating:[4%20TO%205]={!ex=m_r%20key=merchant_rating:[3-5]}merchant_rating:[3%20TO%205]={!ex=m_r%20key=merchant_rating:[2-5]}merchant_rating:[2%20TO%205]={!ex=m_r%20key=merchant_rating:[1-5]}merchant_rating:[1%20TO%205]=500=true=sort_price=0=10=product_amount_available%20desc,boost_index%20asc,popularity%20desc,is_cod%20desc
>
>
> What kind of optimization we can do in above query . it is taking 2400 ms .
>


Query optimization

2016-07-13 Thread Midas A
http://
#:8983/solr/prod/select?q=id_path_ids:166=sort_price:[0%20TO%20*]=status:A=company_status:A=true=1=show_meta_id=show_brand=product_amount_available=by_processor=by_system_memory=by_screen_size=by_operating_system=by_laptop_type=by_processor_brand=by_hard_drive_capacity=by_touchscreen=by_warranty=by_graphic_memory=is_trm=show_merchant=is_cod=show_market={!ex=p_r%20key=product_rating:[4-5]}product_rating:[4%20TO%205]={!ex=p_r%20key=product_rating:[3-5]}product_rating:[3%20TO%205]={!ex=p_r%20key=product_rating:[2-5]}product_rating:[2%20TO%205]={!ex=p_r%20key=product_rating:[1-5]}product_rating:[1%20TO%205]={!ex=m_r%20key=merchant_rating:[4-5]}merchant_rating:[4%20TO%205]={!ex=m_r%20key=merchant_rating:[3-5]}merchant_rating:[3%20TO%205]={!ex=m_r%20key=merchant_rating:[2-5]}merchant_rating:[2%20TO%205]={!ex=m_r%20key=merchant_rating:[1-5]}merchant_rating:[1%20TO%205]=500=true=sort_price=0=10=product_amount_available%20desc,boost_index%20asc,popularity%20desc,is_cod%20desc


What kind of optimization can we do on the above query? It is taking 2400 ms.


Filter query optimization

2009-10-19 Thread Jason Rutherglen
If a filter query matches nothing, then no additional query should be
performed and no results returned?  I don't think we have this today?


Re: Filter query optimization

2009-10-19 Thread Jason Rutherglen
Yonik,

 this is a fast operation anyway

Can you elaborate on why this is a fast operation?

Basically there's a distributed query with a filter, where on a
number of the servers, the filter query isn't matching anything,
however I'm seeing load on those servers (where nothing
matches), so I'm assuming the filter is generated (and cached)
which is fine, then the user query is being performed on a
filter where no documents match. I could be misinterpreting the
data, however, I want to find out about this use case regardless
as it likely will crop up again for us.

-J

On Mon, Oct 19, 2009 at 12:07 PM, Yonik Seeley
yo...@lucidimagination.com wrote:
 On Mon, Oct 19, 2009 at 2:55 PM, Jason Rutherglen
 jason.rutherg...@gmail.com wrote:
 If a filter query matches nothing, then no additional query should be
 performed and no results returned?  I don't think we have this today?

 No, but this is a fast operation anyway (In Solr 1.4 at least).

 Another thing to watch out for is to not try this with filters that
 you don't know the size of (or else you may force a popcount on a
 BitDocSet that would not otherwise have been needed).

 It could also potentially complicate warming queries - need to be
 careful that the combination of filters you are warming with matches
 something, or it would cause the fieldCache entries to not be
 populated.

 -Yonik
 http://www.lucidimagination.com



Re: Filter query optimization

2009-10-19 Thread Yonik Seeley
On Mon, Oct 19, 2009 at 4:45 PM, Jason Rutherglen
jason.rutherg...@gmail.com wrote:
 Yonik,

 this is a fast operation anyway

 Can you elaborate on why this is a fast operation?

The scorers will never really be used.
The query will be weighted and scorers will be created, but the filter
will be checked first and return NO_MORE_DOCS.

-Yonik
http://www.lucidimagination.com

 Basically there's a distributed query with a filter, where on a
 number of the servers, the filter query isn't matching anything,
 however I'm seeing load on those servers (where nothing
 matches), so I'm assuming the filter is generated (and cached)
 which is fine, then the user query is being performed on a
 filter where no documents match. I could misinterpreting the
 data, however, I want to find out about this use case regardless
 as it likely will crop up again for us.

 -J

 On Mon, Oct 19, 2009 at 12:07 PM, Yonik Seeley
 yo...@lucidimagination.com wrote:
 On Mon, Oct 19, 2009 at 2:55 PM, Jason Rutherglen
 jason.rutherg...@gmail.com wrote:
 If a filter query matches nothing, then no additional query should be
 performed and no results returned?  I don't think we have this today?

 No, but this is a fast operation anyway (In Solr 1.4 at least).

 Another thing to watch out for is to not try this with filters that
 you don't know the size of (or else you may force a popcount on a
 BitDocSet that would not otherwise have been needed).

 It could also potentially complicate warming queries - need to be
 careful that the combination of filters you are warming with matches
 something, or it would cause the fieldCache entries to not be
 populated.

 -Yonik
 http://www.lucidimagination.com




Re: Filter query optimization

2009-10-19 Thread Jason Rutherglen
Ok, thanks, new Lucene 2.9 features.

On Mon, Oct 19, 2009 at 2:33 PM, Yonik Seeley
yo...@lucidimagination.com wrote:
 On Mon, Oct 19, 2009 at 4:45 PM, Jason Rutherglen
 jason.rutherg...@gmail.com wrote:
 Yonik,

 this is a fast operation anyway

 Can you elaborate on why this is a fast operation?

 The scorers will never really be used.
 The query will be weighted and scorers will be created, but the filter
 will be checked first and return NO_MORE_DOCS.

 -Yonik
 http://www.lucidimagination.com

 Basically there's a distributed query with a filter, where on a
 number of the servers, the filter query isn't matching anything,
 however I'm seeing load on those servers (where nothing
 matches), so I'm assuming the filter is generated (and cached)
 which is fine, then the user query is being performed on a
 filter where no documents match. I could misinterpreting the
 data, however, I want to find out about this use case regardless
 as it likely will crop up again for us.

 -J

 On Mon, Oct 19, 2009 at 12:07 PM, Yonik Seeley
 yo...@lucidimagination.com wrote:
 On Mon, Oct 19, 2009 at 2:55 PM, Jason Rutherglen
 jason.rutherg...@gmail.com wrote:
 If a filter query matches nothing, then no additional query should be
 performed and no results returned?  I don't think we have this today?

 No, but this is a fast operation anyway (In Solr 1.4 at least).

 Another thing to watch out for is to not try this with filters that
 you don't know the size of (or else you may force a popcount on a
 BitDocSet that would not otherwise have been needed).

 It could also potentially complicate warming queries - need to be
 careful that the combination of filters you are warming with matches
 something, or it would cause the fieldCache entries to not be
 populated.

 -Yonik
 http://www.lucidimagination.com





Re: Search query optimization

2008-06-30 Thread wojtekpia

If I know that condition C will eliminate more results than either A or B,
does specifying the query as: C AND A AND B make it any faster (than the
original A AND B AND C)?
-- 
View this message in context: 
http://www.nabble.com/Search-query-optimization-tp17544667p18205504.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Search query optimization

2008-06-30 Thread Chris Hostetter

: If I know that condition C will eliminate more results than either A or B,
: does specifying the query as: C AND A AND B make it any faster (than the
: original A AND B AND C)?

Nope.  Lucene takes care of that for you.



-Hoss



RE: Search query optimization

2008-06-17 Thread Yongjun Rong
Hi,
  Thanks for your reply. I did some tests on my test machine.
http://stage.boomi.com:8080/solr/select/?q=account:1&rows=1000 returns a
result set of 384 in 3 ms. If I add a new AND condition, as below:
http://stage.boomi.com:8080/solr/select/?q=account:1+AND+recordeddate_dt:[NOW/DAYS-7DAYS+TO+NOW]&rows=1000
it takes 18236 ms to return a result set of 21. If I only use the
recordeddate_dt condition, like
http://stage.boomi.com:8080/solr/select/?q=recordeddate_dt:[NOW/DAYS-7DAYS+TO+NOW]&rows=1000
it takes 20271 ms to get 412800 results. All the above URLs are live; you can
test them.

Can anyone give me some explanation of why this happens, given the query
optimization discussed above? Thank you very much.
Yongjun Rong
 

-Original Message-
From: Walter Underwood [mailto:[EMAIL PROTECTED] 
Sent: Thursday, May 29, 2008 4:57 PM
To: solr-user@lucene.apache.org
Subject: Re: Search query optimization

The people working on Lucene are pretty smart, and this sort of query
optimization is a well-known trick, so I would not worry about it.

A dozen years ago at Infoseek, we checked the count of matches for each
term in an AND, and evaluated the smallest one first.
If any of them had zero matches, we didn't evaluate any of them.

I expect that Doug Cutting and the other Lucene folk know those same
tricks.

wunder

On 5/29/08 1:50 PM, Yongjun Rong [EMAIL PROTECTED] wrote:

 Hi Yonik,
   Thanks for your quick reply. I'm very new to the lucene source code.
 Can you give me a little more detail explaination about this.
 Do you think it will save some memory if docnum = find_match(A)  
 docnum = find_match(B) and put B in the front of the AND query like 
 B AND A AND C? How about sorting (sort=A,B,Cq=A AND B AND C)? Do 
 you think the order of conditions (A,B,C) in a query will affect the 
 performance of the query?
   Thank you very much.
   Yongjun
 
 
 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Yonik 
 Seeley
 Sent: Thursday, May 29, 2008 4:12 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Search query optimization
 
 On Thu, May 29, 2008 at 4:05 PM, Yongjun Rong [EMAIL PROTECTED]
 wrote:
  I have a question about how the lucene query parser. For example, I 
 have query A AND B AND C. Will lucene extract all documents satisfy

 condition A in memory and then filter it with condition B and C?
 
 No, Lucene will try and optimize this the best it can.
 
 It roughly goes like this..
 docnum = find_match(A)
 docnum = find_first_match_after(docnum, B) docnum =
 find_first_match_after(docnum,C)
 etc...
 until the same docnum is returned for A,B, and C.
 
 See ConjunctionScorer for the gritty details.
 
 -Yonik
 
 
 
 or only
 the documents satisfying A AND B AND C will be put into memory? Is 
 there any articles discuss about how to build a optimization query to

 save memory and improve performance?
  Thank you very much.
  Yongjun Rong
 



RE: Search query optimization

2008-06-17 Thread Yongjun Rong
Thanks for reply. Here is the debugQuery output:
<lst name="debug">
  <str name="rawquerystring">account:1 AND recordeddate_dt:[NOW/DAYS-1DAYS TO NOW]</str>
  <str name="querystring">account:1 AND recordeddate_dt:[NOW/DAYS-1DAYS TO NOW]</str>
  <str name="parsedquery">+account:1 +recordeddate_dt:[2008-06-16T00:00:00.000Z TO 2008-06-17T17:07:57.420Z]</str>
  <str name="parsedquery_toString">+account:1 +recordeddate_dt:[2008-06-16T00:00:00.000 TO 2008-06-17T17:07:57.420]</str>
  <lst name="explain">
    <str name="id=e03dbd92-3d41-4693-8b69-ac9a0d332446-atom-d52484f5-7aa8-40b3-ad6f-ba3a9071999e,internal_docid=6515410">
10.88071 = (MATCH) sum of:
  10.788804 = (MATCH) weight(account:1 in 6515410), product of:
    0.9957678 = queryWeight(account:1), product of:
      10.834659 = idf(docFreq=348, numDocs=6515640)
      0.09190578 = queryNorm
    10.834659 = (MATCH) fieldWeight(account:1 in 6515410), product of:
      1.0 = tf(termFreq(account:1)=1)
      10.834659 = idf(docFreq=348, numDocs=6515640)
      1.0 = fieldNorm(field=account, doc=6515410)
  0.09190578 = (MATCH) ConstantScoreQuery(recordeddate_dt:[2008-06-16T00:00:00.000-2008-06-17T17:07:57.420]), product of:
    1.0 = boost
    0.09190578 = queryNorm
    </str>
  </lst>
</lst>

-Original Message-
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, June 17, 2008 12:43 PM
To: solr-user@lucene.apache.org
Subject: Re: Search query optimization

Hi,

Probably because the [NOW/DAYS-7DAYS+TO+NOW] part gets rewritten as lots of OR 
clauses.  I think that you'll see that if you add debugQuery=true to the URL.  
Make sure your recorded_date_dt is not too granular (e.g. if you don't need 
minutes, round the values to hours. If you don't need hours, round the values 
to days).


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


- Original Message 
 From: Yongjun Rong [EMAIL PROTECTED]
 To: solr-user@lucene.apache.org
 Sent: Tuesday, June 17, 2008 11:56:06 AM
 Subject: RE: Search query optimization
 
 Hi,
   Thanks for your reply. I did some test on my test machine. 
 http://stage.boomi.com:8080/solr/select/?q=account:1rows=1000. It 
 will return resultset 384 in 3ms. If I add a new AND condition as below:
 http://stage.boomi.com:8080/solr/select/?q=account:1+AND+recordeddate_
 dt :[NOW/DAYS-7DAYS+TO+NOW]rows=1000. It will take 18236 to return 21 
 resultset. If I only use the recordedate_dt condition like 
 http://stage.boomi.com:8080/solr/select/?q=recordeddate_dt:[NOW/DAYS-7
 DA
 YS+TO+NOW]rows=1000. It takes 20271 ms to get 412800 results. All the
 above URL are live, you test it.
 
 Can anyone give me some explaination why this happens if we have the 
 query optimization? Thank you very much.
 Yongjun Rong
 
 
 -Original Message-
 From: Walter Underwood [mailto:[EMAIL PROTECTED]
 Sent: Thursday, May 29, 2008 4:57 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Search query optimization
 
 The people working on Lucene are pretty smart, and this sort of query 
 optimization is a well-known trick, so I would not worry about it.
 
 A dozen years ago at Infoseek, we checked the count of matches for 
 each term in an AND, and evaluated the smallest one first.
 If any of them had zero matches, we didn't evaluate any of them.
 
 I expect that Doug Cutting and the other Lucene folk know those same 
 tricks.
 
 wunder
 
 On 5/29/08 1:50 PM, Yongjun Rong wrote:
 
  Hi Yonik,
Thanks for your quick reply. I'm very new to the lucene source code.
  Can you give me a little more detail explaination about this.
  Do you think it will save some memory if docnum = find_match(A)  
  docnum = find_match(B) and put B in the front of the AND query 
  like B AND A AND C? How about sorting (sort=A,B,Cq=A AND B AND 
  C)? Do you think the order of conditions (A,B,C) in a query will 
  affect the performance of the query?
Thank you very much.
Yongjun
 
  
  -Original Message-
  From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of 
  Yonik Seeley
  Sent: Thursday, May 29, 2008 4:12 PM
  To: solr-user@lucene.apache.org
  Subject: Re: Search query optimization
  
  On Thu, May 29, 2008 at 4:05 PM, Yongjun Rong
  wrote:
   I have a question about how the lucene query parser. For example, 
  I have query A AND B AND C. Will lucene extract all documents 
  satisfy
 
  condition A in memory and then filter it with condition B and C?
  
  No, Lucene will try and optimize this the best it can.
  
  It roughly goes like this..
  docnum = find_match(A)
  docnum = find_first_match_after(docnum, B) docnum =
  find_first_match_after(docnum,C)
  etc...
  until the same docnum is returned for A,B, and C.
  
  See ConjunctionScorer for the gritty details.
  
  -Yonik
  
  
  
  or only
  the documents satisfying A AND B AND C will be put into memory? 
  Is there any articles discuss about how to build a optimization 
  query to
 
  save memory and improve performance?
   Thank you very much

Re: Search query optimization

2008-06-17 Thread Otis Gospodnetic
Hi,

This is what I was talking about:

recordeddate_dt:[2008-06-16T00:00:00.000Z TO 2008-06-17T17:07:57.420Z]

Note that the granularity of this date field is down to milliseconds.  You 
should change that to be more coarse if you don't need such precision (e.g. no 
milliseconds, no seconds, no minutes, no hours...)


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


- Original Message 
 From: Yongjun Rong [EMAIL PROTECTED]
 To: solr-user@lucene.apache.org
 Sent: Tuesday, June 17, 2008 1:09:19 PM
 Subject: RE: Search query optimization
 
 Thanks for reply. Here is the debugQuery output:
 
 −
 
 account:1 AND recordeddate_dt:[NOW/DAYS-1DAYS TO NOW]
 
 −
 
 account:1 AND recordeddate_dt:[NOW/DAYS-1DAYS TO NOW]
 
 −
 
 +account:1 +recordeddate_dt:[2008-06-16T00:00:00.000Z TO 
 2008-06-17T17:07:57.420Z]
 
 −
 
 +account:1 +recordeddate_dt:[2008-06-16T00:00:00.000 TO 
 2008-06-17T17:07:57.420]
 
 −
 
 −
 
 name=id=e03dbd92-3d41-4693-8b69-ac9a0d332446-atom-d52484f5-7aa8-40b3-ad6f-ba3a9071999e,internal_docid=6515410
 
 10.88071 = (MATCH) sum of:
   10.788804 = (MATCH) weight(account:1 in 6515410), product of:
 0.9957678 = queryWeight(account:1), product of:
   10.834659 = idf(docFreq=348, numDocs=6515640)
   0.09190578 = queryNorm
 10.834659 = (MATCH) fieldWeight(account:1 in 6515410), product of:
   1.0 = tf(termFreq(account:1)=1)
   10.834659 = idf(docFreq=348, numDocs=6515640)
   1.0 = fieldNorm(field=account, doc=6515410)
   0.09190578 = (MATCH) 
 ConstantScoreQuery(recordeddate_dt:[2008-06-16T00:00:00.000-2008-06-17T17:07:57.420]),
  
 product of:
 1.0 = boost
 0.09190578 = queryNorm
 
 
  
 
 -Original Message-
 From: Otis Gospodnetic [mailto:[EMAIL PROTECTED] 
 Sent: Tuesday, June 17, 2008 12:43 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Search query optimization
 
 Hi,
 
 Probably because the [NOW/DAYS-7DAYS+TO+NOW] part gets rewritten as lots of 
 OR 
 clauses.  I think that you'll see that if you add debugQuery=true to the 
 URL.  
 Make sure your recorded_date_dt is not too granular (e.g. if you don't need 
 minutes, round the values to hours. If you don't need hours, round the values 
 to 
 days).
 
 
 Otis
 --
 Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
 
 
 - Original Message 
  From: Yongjun Rong 
  To: solr-user@lucene.apache.org
  Sent: Tuesday, June 17, 2008 11:56:06 AM
  Subject: RE: Search query optimization
  
  Hi,
Thanks for your reply. I did some test on my test machine. 
  http://stage.boomi.com:8080/solr/select/?q=account:1rows=1000. It 
  will return resultset 384 in 3ms. If I add a new AND condition as below:
  http://stage.boomi.com:8080/solr/select/?q=account:1+AND+recordeddate_ 
  dt :[NOW/DAYS-7DAYS+TO+NOW]rows=1000. It will take 18236 to return 21 
  resultset. If I only use the recordedate_dt condition like 
  http://stage.boomi.com:8080/solr/select/?q=recordeddate_dt:[NOW/DAYS-7
  DA
  YS+TO+NOW]rows=1000. It takes 20271 ms to get 412800 results. All the
  above URL are live, you test it.
  
  Can anyone give me some explaination why this happens if we have the 
  query optimization? Thank you very much.
  Yongjun Rong
  
  
  -Original Message-
  From: Walter Underwood [mailto:[EMAIL PROTECTED]
  Sent: Thursday, May 29, 2008 4:57 PM
  To: solr-user@lucene.apache.org
  Subject: Re: Search query optimization
  
  The people working on Lucene are pretty smart, and this sort of query 
  optimization is a well-known trick, so I would not worry about it.
  
  A dozen years ago at Infoseek, we checked the count of matches for 
  each term in an AND, and evaluated the smallest one first.
  If any of them had zero matches, we didn't evaluate any of them.
  
  I expect that Doug Cutting and the other Lucene folk know those same 
  tricks.
  
  wunder
  
  On 5/29/08 1:50 PM, Yongjun Rong wrote:
  
   Hi Yonik,
 Thanks for your quick reply. I'm very new to the lucene source code.
   Can you give me a little more detail explaination about this.
   Do you think it will save some memory if docnum = find_match(A)  
   docnum = find_match(B) and put B in the front of the AND query 
   like B AND A AND C? How about sorting (sort=A,B,Cq=A AND B AND 
   C)? Do you think the order of conditions (A,B,C) in a query will 
   affect the performance of the query?
 Thank you very much.
 Yongjun
  
   
   -Original Message-
   From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of 
   Yonik Seeley
   Sent: Thursday, May 29, 2008 4:12 PM
   To: solr-user@lucene.apache.org
   Subject: Re: Search query optimization
   
   On Thu, May 29, 2008 at 4:05 PM, Yongjun Rong
   wrote:
I have a question about how the lucene query parser. For example, 
   I have query A AND B AND C. Will lucene extract all documents 
   satisfy
  
   condition A in memory and then filter it with condition B and C?
   
   No, Lucene will try and optimize

Re: Search query optimization

2008-06-17 Thread Chris Hostetter

: Probably because the [NOW/DAYS-7DAYS+TO+NOW] part gets rewritten as lots 
: of OR clauses.  I think that you'll see that if you add debugQuery=true 
: to the URL.  Make sure your recorded_date_dt is not too granular (e.g. 
: if you don't need minutes, round the values to hours. If you don't need 
: hours, round the values to days).

for the record: it doesn't get rewritten to a lot of OR clauses, it's 
using ConstantScoreRangeQuery.

granularity is definitely important however, both when indexing and when
querying.

NOW is milliseconds, so every time you execute that query it's different 
and there is almost no caching possible.

if you use [NOW/DAY-7DAYS TO NOW/DAY] or even 
[NOW/DAY-7DAYS TO NOW/HOUR] you'll get a lot better caching behavior.  it 
looks like you are trying to find anything in the past week, so you may 
want [NOW/DAY-7DAYS TO NOW/DAY+1DAY] (to go to the end of the current day)

once you have a less granular date restriction, it can frequently make 
sense to put this in a separate fq clause, so it will get cached
independently of your main query. 
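
A sketch of what that looks like as request parameters (query and field names
as in this thread; shown unencoded, one parameter per line):

  q=account:1
  fq=recordeddate_dt:[NOW/DAY-7DAYS TO NOW/DAY+1DAY]
  rows=1000

In the actual URL the spaces must be encoded (%20 or +) and the literal +
must be encoded as %2B, as noted further down this thread.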

But Otis's point about reducing granularity can also help when indexing
... the fewer unique dates that appear in your index, the faster range
queries will be ... if you've got 1000 documents that all have a
recordeddate of June 11 2008, but at different times, and you're never
going to care about the times (just the date), then strip those times off
when indexing so they all have the same field value of
2008-06-11T00:00:00Z

BTW: the solr port you sent out a URL to ... all of its caching is
turned off (the filterCache and queryResultCache configs are commented out
of your solrconfig.xml) ... you're going to want to turn on some caching
or you'll never see really *great* request times.
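
For reference, a sketch of re-enabling those caches in solrconfig.xml (the
sizes are illustrative, not recommendations):

  <filterCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="128"/>
  <queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="32"/>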


-Hoss



RE: Search query optimization

2008-06-17 Thread Yongjun Rong
Hi Otis,
  Thanks for your advice. Do you mean that when we add the date data we need
to carefully select the granularity of the date field to make sure it is more
coarse? How can we do this? We just access Solr via HTTP URLs, not the API.
If you are talking about the query syntax, we do use NOW/DAY to round to the
day.
   

-Original Message-
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, June 17, 2008 1:32 PM
To: solr-user@lucene.apache.org
Subject: Re: Search query optimization

Hi,

This is what I was talking about:

recordeddate_dt:[2008-06-16T00:00:00.000Z TO 2008-06-17T17:07:57.420Z]

Note that the granularity of this date field is down to milliseconds.  You 
should change that to be more coarse if you don't need such precision (e.g. no 
milliseconds, no seconds, no minutes, no hours...)


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


- Original Message 
 From: Yongjun Rong [EMAIL PROTECTED]
 To: solr-user@lucene.apache.org
 Sent: Tuesday, June 17, 2008 1:09:19 PM
 Subject: RE: Search query optimization
 
 Thanks for reply. Here is the debugQuery output:
 
 −
 
 account:1 AND recordeddate_dt:[NOW/DAYS-1DAYS TO NOW]
 
 −
 
 account:1 AND recordeddate_dt:[NOW/DAYS-1DAYS TO NOW]
 
 −
 
 +account:1 +recordeddate_dt:[2008-06-16T00:00:00.000Z TO
 2008-06-17T17:07:57.420Z]
 
 −
 
 +account:1 +recordeddate_dt:[2008-06-16T00:00:00.000 TO 
 +2008-06-17T17:07:57.420]
 
 −
 
 −
 
 name=id=e03dbd92-3d41-4693-8b69-ac9a0d332446-atom-d52484f5-7aa8-40b3-
 ad6f-ba3a9071999e,internal_docid=6515410
 
 10.88071 = (MATCH) sum of:
   10.788804 = (MATCH) weight(account:1 in 6515410), product of:
 0.9957678 = queryWeight(account:1), product of:
   10.834659 = idf(docFreq=348, numDocs=6515640)
   0.09190578 = queryNorm
 10.834659 = (MATCH) fieldWeight(account:1 in 6515410), product of:
   1.0 = tf(termFreq(account:1)=1)
   10.834659 = idf(docFreq=348, numDocs=6515640)
   1.0 = fieldNorm(field=account, doc=6515410)
   0.09190578 = (MATCH)
 ConstantScoreQuery(recordeddate_dt:[2008-06-16T00:00:00.000-2008-06-17
 T17:07:57.420]),
 product of:
 1.0 = boost
 0.09190578 = queryNorm
 
 
  
 
 -Original Message-
 From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]
 Sent: Tuesday, June 17, 2008 12:43 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Search query optimization
 
 Hi,
 
 Probably because the [NOW/DAYS-7DAYS+TO+NOW] part gets rewritten as 
 lots of OR clauses.  I think that you'll see that if you add debugQuery=true 
 to the URL.
 Make sure your recorded_date_dt is not too granular (e.g. if you don't 
 need minutes, round the values to hours. If you don't need hours, 
 round the values to days).
 
 
 Otis
 --
 Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
 
 
 - Original Message 
  From: Yongjun Rong
  To: solr-user@lucene.apache.org
  Sent: Tuesday, June 17, 2008 11:56:06 AM
  Subject: RE: Search query optimization
  
  Hi,
Thanks for your reply. I did some test on my test machine. 
  http://stage.boomi.com:8080/solr/select/?q=account:1rows=1000. It 
  will return resultset 384 in 3ms. If I add a new AND condition as below:
  http://stage.boomi.com:8080/solr/select/?q=account:1+AND+recordeddat
  e_ dt :[NOW/DAYS-7DAYS+TO+NOW]rows=1000. It will take 18236 to 
  return 21 resultset. If I only use the recordedate_dt condition like
  http://stage.boomi.com:8080/solr/select/?q=recordeddate_dt:[NOW/DAYS
  -7
  DA
  YS+TO+NOW]rows=1000. It takes 20271 ms to get 412800 results. All 
  YS+TO+the
  above URL are live, you test it.
  
  Can anyone give me some explaination why this happens if we have the 
  query optimization? Thank you very much.
  Yongjun Rong
  
  
  -Original Message-
  From: Walter Underwood [mailto:[EMAIL PROTECTED]
  Sent: Thursday, May 29, 2008 4:57 PM
  To: solr-user@lucene.apache.org
  Subject: Re: Search query optimization
  
  The people working on Lucene are pretty smart, and this sort of 
  query optimization is a well-known trick, so I would not worry about it.
  
  A dozen years ago at Infoseek, we checked the count of matches for 
  each term in an AND, and evaluated the smallest one first.
  If any of them had zero matches, we didn't evaluate any of them.
  
  I expect that Doug Cutting and the other Lucene folk know those same 
  tricks.
  
  wunder
  
  On 5/29/08 1:50 PM, Yongjun Rong wrote:
  
   Hi Yonik,
 Thanks for your quick reply. I'm very new to the lucene source code.
   Can you give me a little more detail explaination about this.
   Do you think it will save some memory if docnum = find_match(A) 
docnum = find_match(B) and put B in the front of the AND query 
   like B AND A AND C? How about sorting (sort=A,B,Cq=A AND B AND 
   C)? Do you think the order of conditions (A,B,C) in a query will 
   affect the performance of the query?
 Thank you very much.
 Yongjun
  
   
   -Original Message-
   From: [EMAIL

RE: Search query optimization

2008-06-17 Thread Yongjun Rong
Hi Chris,
   Thanks for your suggestions. I did try [NOW/DAY-7DAYS TO
NOW/DAY], but it is not better. And when I tried [NOW/DAY-7DAYS TO
NOW/DAY+1DAY], I got an exception, as below:
org.apache.solr.core.SolrException: Query parsing error: Cannot parse
'account:1 AND recordeddate_dt:[NOW/DAYS-7DAYS TO NOW/DAY 1DAY]':
Encountered 1DAY at line 1, column 57.
Was expecting:
] ...

    at org.apache.solr.search.QueryParsing.parseQuery(QueryParsing.java:104)
    at org.apache.solr.request.StandardRequestHandler.handleRequestBody(StandardRequestHandler.java:109)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:77)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:658)
    at org.apache.solr.servlet.SolrServlet.doGet(SolrServlet.java:66)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:487)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1093)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:185)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1084)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:360)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:726)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:206)
    at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:324)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:505)
    at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:828)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:514)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:211)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:380)
    at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:395)
    at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:450)
Caused by: org.apache.lucene.queryParser.ParseException: Cannot parse 'account:1 AND recordeddate_dt:[NOW/DAYS-7DAYS TO NOW/DAY 1DAY]': Encountered 1DAY at line 1, column 57.
Was expecting:
] ...

    at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:152)
    at org.apache.solr.search.QueryParsing.parseQuery(QueryParsing.java:94)
    ... 26 more

And I will try enabling the cache to see if I can get better query times.
I will let you know.
Thank you very much.
Yongjun Rong

-Original Message-
From: Chris Hostetter [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, June 17, 2008 1:55 PM
To: solr-user@lucene.apache.org
Subject: Re: Search query optimization


: Probably because the [NOW/DAYS-7DAYS+TO+NOW] part gets rewritten as
lots
: of OR clauses.  I think that you'll see that if you add
debugQuery=true
: to the URL.  Make sure your recorded_date_dt is not too granular (e.g.

: if you don't need minutes, round the values to hours. If you don't
need
: hours, round the values to days).

for the record: it doesn't get rewritten to a lot of OR clauses, it's
using ConstantScoreRangeQuery.

granularity is definitely important however, both when indexing and when
querying.

NOW is milliseconds, so every time you execute that query it's
different and there is almost no caching possible.

if you use [NOW/DAY-7DAYS TO NOW/DAY] or even [NOW/DAY-7DAYS TO
NOW/HOUR] you'll get a lot better caching behavior.  it looks like you
are trying to find anything in the past week, so you may want
[NOW/DAY-7DAYS TO NOW/DAY+1DAY] (to go to the end of the current day)

once you have a less granular date restriction, it can frequently make
sense to put this in a separate fq clause, so it will get cached
independently of your main query.

But Otis's point about reducing granularity can also help when indexing
... the fewer unique dates that appear in your index, the faster range
queries will be ... if you've got 1000 documents that all have a
recordeddate of June 11 2008, but at different times, and you're never
going to care about the times (just the date) then strip those times off
when indexing so they all have the same field value of
2008-06-11T00:00:00Z
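For illustration, that rounding can be done on the client before the document
is sent to Solr. A minimal sketch (the class name and format handling here are
assumptions for the example, not anything from your schema):

    import java.text.SimpleDateFormat;
    import java.util.Calendar;
    import java.util.Date;
    import java.util.TimeZone;

    public class DayRounder {
        /** Render a timestamp as a day-granularity Solr date string. */
        public static String toSolrDay(Date recorded) {
            Calendar cal = Calendar.getInstance(TimeZone.getTimeZone("UTC"));
            cal.setTime(recorded);
            cal.set(Calendar.HOUR_OF_DAY, 0);    // drop the time-of-day parts
            cal.set(Calendar.MINUTE, 0);
            cal.set(Calendar.SECOND, 0);
            cal.set(Calendar.MILLISECOND, 0);
            SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'");
            fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
            return fmt.format(cal.getTime());    // e.g. 2008-06-11T00:00:00Z
        }
    }

With one term per day instead of one per millisecond, the range query has far
fewer terms to enumerate.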

BTW: the solr port you sent out a URL to ... all of its caching is
turned off (the filterCache

RE: Search query optimization

2008-06-17 Thread Chris Hostetter
:Thanks for your suggestions. I did try the [NOW/DAY-7DAYS TO
: NOW/DAY], but it is not better. And I tried [NOW/DAY-7DAYS TO
: NOW/DAY+1DAY], I got some exception as below:
: org.apache.solr.core.SolrException: Query parsing error: Cannot parse
: 'account:1 AND recordeddate_dt:[NOW/DAYS-7DAYS TO NOW/DAY 1DAY]':
: Encountered 1DAY at line 1, column 57.

you need to properly URL escape the + character as %2B in your URLs.
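For example, the date-math part of that query has to go over the wire as
something like

    q=account:1+AND+recordeddate_dt:[NOW/DAY-7DAYS+TO+NOW/DAY%2B1DAY]

because a bare + in a URL decodes to a space -- which is exactly what produced
the "NOW/DAY 1DAY" in the parse error above.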

: And I will try to open the cache and see if I can get better query time.

the first request won't be any faster.  but the second request will be.  
and if filtering by week is something you expect people to do a lot of, 
you can put it in a newSearcher so it's always warmed up and fast 
for everyone.
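in solrconfig.xml that warming hook looks something like the following sketch
(the query params are only an example, and it assumes the stock
QuerySenderListener):

    <listener event="newSearcher" class="solr.QuerySenderListener">
      <arr name="queries">
        <lst>
          <!-- warm the filterCache entry for the last-7-days restriction -->
          <str name="q">*:*</str>
          <str name="fq">recordeddate_dt:[NOW/DAY-7DAYS TO NOW/DAY]</str>
        </lst>
      </arr>
    </listener>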


-Hoss



RE: Search query optimization

2008-06-17 Thread Yongjun Rong
Hi Chris,
  Thank you very much for the detailed suggestions. I just did the cache
test. If most requests return the same set of data, the cache will
improve query performance. But in our usage, almost all requests
return different sets of data, so the cache hit ratio is very low.
That's why we disabled the cache to save memory.  Another
question is:
q=account:1+AND+recordeddate_dt:[NOW/DAY-7DAYS+TO+NOW/DAY] will combine
the result sets of account:1 and
recordeddate_dt:[NOW/DAY-7DAYS+TO+NOW/DAY]. How does Lucene handle it? From
my previous test examples, it seems Lucene does not check the size of
the subconditions (like account:1 or
recordeddate_dt:[NOW/DAY-7DAYS+TO+NOW/DAY]). q=account:1 will return a
small set of data, but q=recordeddate_dt:[NOW/DAY-7DAYS+TO+NOW/DAY] will
return a large set of data. If we combine them with AND, like
q=account:1+AND+recordeddate_dt:[NOW/DAY-7DAYS+TO+NOW/DAY], it should
return the small set of data first and then apply the subcondition
recordeddate_dt:[NOW/DAY-7DAYS+TO+NOW/DAY]. But from the response
time, that does not seem to be the case.
Can anyone give me a more detailed explanation of this?
Thank you very much.
Yongjun Rong

-Original Message-
From: Chris Hostetter [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, June 17, 2008 2:32 PM
To: solr-user@lucene.apache.org
Subject: RE: Search query optimization

:Thanks for your suggestions. I did try the [NOW/DAY-7DAYS TO
: NOW/DAY], but it is not better. And I tried [NOW/DAY-7DAYS TO
: NOW/DAY+1DAY], I got some exception as below:
: org.apache.solr.core.SolrException: Query parsing error: Cannot parse
: 'account:1 AND recordeddate_dt:[NOW/DAYS-7DAYS TO NOW/DAY 1DAY]':
: Encountered 1DAY at line 1, column 57.

you need to propertly URL escape the + character as %2B in your URLs.

: And I will try to open the cache and see if I can get better query
time.

the first request won't be any faster.  but the second request will be.

and if filtering by week is something you expect peopel to do a lot of,
you can put it in a newSearcher so it's always warmed up and fast for
everyone.


-Hoss



RE: Search query optimization

2008-06-17 Thread Chris Hostetter
: test. If most of requests return the same set of data, cache will
: improve the query performance. But in our usage, almost all requests
: have different data set to return. The cache hit ratio is very low.

that's why i suggested moving clauses that are likely to be common (ie: 
your "within the last week" clause) into a separate fq param where it can 
be cached independently from the main query.  if you do that *and* you 
have the filterCache turned on then after this query...
  q=account:1&fq=recordeddate_dt:[NOW/DAY-7DAYS+TO+NOW/DAY]
...these other queries will all be fairly fast because of the cache hit...
  q=account:<some other value>&fq=recordeddate_dt:[NOW/DAY-7DAYS+TO+NOW/DAY]
  q=account:<another value>&fq=recordeddate_dt:[NOW/DAY-7DAYS+TO+NOW/DAY]
  q=anything+you+want&fq=recordeddate_dt:[NOW/DAY-7DAYS+TO+NOW/DAY]

: my previous test examples, it seems lucene will not check the size of
: the subconditions (like account:1 or
: recordeddate_dt:[NOW/DAY-7DAYS+TO+NOW/DAY]). Q=account:1 will return a
: small set of data. But q=recordeddate_dt:[NOW/DAY-7DAYS+TO+NOW/DAY] will
: return a large set of data. If we combine them with AND like:
: q=account+AND+recordeddate_dt:[NOW/DAY-7DAYS+TO+NOW/DAY]. It should
: return the small set of data and then apply the subcondition
: recordeddate_dt:[NOW/DAY-7DAYS+TO+NOW/DAY]. But from the response

the ConjunctionScorer will do that (as mentioned earlier in this thread), 
but even if the account:1 clause indicates that it can skip ahead to 
*document* #1234567, the ConstantScoreRangeQuery still 
needs to iterate over all of the *terms* in the specified range before it 
knows what the lowest matching doc id above #1234567 is.

that's why putting range queries into separate fq params can be a lot 
better ... that term iteration only needs to be done once and can then be 
cached and reused.
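in raw Lucene terms the cached-filter idea looks roughly like the sketch
below (illustrative only: the field name and bounds are made up, it assumes
the 2.x-era RangeFilter/CachingWrapperFilter classes, and Solr's filterCache
does the equivalent of this for each distinct fq string):

    import java.io.IOException;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.CachingWrapperFilter;
    import org.apache.lucene.search.Filter;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.RangeFilter;
    import org.apache.lucene.search.TermQuery;
    import org.apache.lucene.search.TopDocs;

    public class CachedRangeFilterSketch {
        // built once: the terms in the range get enumerated on first use,
        // and the resulting bit set is cached inside the wrapper after that
        private static final Filter LAST_WEEK = new CachingWrapperFilter(
                new RangeFilter("recordeddate_dt",
                                "2008-06-10T00:00:00Z",   // lower bound, inclusive
                                "2008-06-17T00:00:00Z",   // upper bound, inclusive
                                true, true));

        public static TopDocs search(IndexSearcher searcher, String account)
                throws IOException {
            // different q values can all reuse the same cached filter
            return searcher.search(new TermQuery(new Term("account", account)),
                                   LAST_WEEK, 10);
        }
    }

the important part is reusing the same wrapped Filter instance across
requests -- that's what lets the term enumeration happen only once.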



-Hoss



Search query optimization

2008-05-29 Thread Yongjun Rong
Hi,
  I have a question about how the lucene query parser works. For example, I
have the query A AND B AND C. Will lucene pull all documents satisfying
condition A into memory and then filter them with conditions B and C? Or will
only the documents satisfying A AND B AND C be put into memory? Are
there any articles discussing how to build an optimized query to
save memory and improve performance?
  Thank you very much.
  Yongjun Rong


Re: Search query optimization

2008-05-29 Thread Yonik Seeley
On Thu, May 29, 2008 at 4:05 PM, Yongjun Rong [EMAIL PROTECTED] wrote:
  I have a question about how the lucene query parser. For example, I
 have query A AND B AND C. Will lucene extract all documents satisfy
 condition A in memory and then filter it with condition B and C?

No, Lucene will try and optimize this the best it can.

It roughly goes like this..
docnum = find_match(A)
docnum = find_first_match_after(docnum, B)
docnum = find_first_match_after(docnum,C)
etc...
until the same docnum is returned for A,B, and C.

See ConjunctionScorer for the gritty details.
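A rough sketch of that skip-ahead loop, just to show the shape of it
(illustrative Java, not the actual ConjunctionScorer source; the Clause
interface here is invented for the example):

    // each clause advances to the first doc id >= target that it matches,
    // mirroring Lucene's skipTo()/advance() contract
    interface Clause {
        int NO_MORE_DOCS = Integer.MAX_VALUE;
        int skipTo(int target);
    }

    class Leapfrog {
        /** First doc id >= start that every clause matches. */
        static int nextMatch(Clause[] clauses, int start) {
            int candidate = start;
            int agreed = 0;                      // clauses agreeing on candidate
            int i = 0;
            while (agreed < clauses.length) {
                int doc = clauses[i].skipTo(candidate);
                if (doc == Clause.NO_MORE_DOCS) {
                    return Clause.NO_MORE_DOCS;  // some clause is exhausted
                }
                if (doc == candidate) {
                    agreed++;                    // this clause matches too
                } else {
                    candidate = doc;             // leap ahead, start counting again
                    agreed = 1;
                }
                i = (i + 1) % clauses.length;    // round-robin over A, B, C ...
            }
            return candidate;
        }
    }

So no clause ever materializes its full result set; each one just leapfrogs
to the next doc id the others propose.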

-Yonik



 or only
 the documents satisfying A AND B AND C will be put into memory? Is
 there any articles discuss about how to build a optimization query to
 save memory and improve performance?
  Thank you very much.
  Yongjun Rong



RE: Search query optimization

2008-05-29 Thread Yongjun Rong
Hi Yonik,
  Thanks for your quick reply. I'm very new to the lucene source code.
Can you give me a little more detailed explanation of this?
Do you think it will save some memory if docnum = find_match(A) 
docnum = find_match(B) and we put B in the front of the AND query, like B
AND A AND C? How about sorting (sort=A,B,C&q=A AND B AND C)? Do you
think the order of the conditions (A, B, C) in a query will affect the
performance of the query?
  Thank you very much.
  Yongjun


-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Yonik
Seeley
Sent: Thursday, May 29, 2008 4:12 PM
To: solr-user@lucene.apache.org
Subject: Re: Search query optimization

On Thu, May 29, 2008 at 4:05 PM, Yongjun Rong [EMAIL PROTECTED]
wrote:
  I have a question about how the lucene query parser. For example, I 
 have query A AND B AND C. Will lucene extract all documents satisfy 
 condition A in memory and then filter it with condition B and C?

No, Lucene will try and optimize this the best it can.

It roughly goes like this..
docnum = find_match(A)
docnum = find_first_match_after(docnum, B) docnum =
find_first_match_after(docnum,C)
etc...
until the same docnum is returned for A,B, and C.

See ConjunctionScorer for the gritty details.

-Yonik



 or only
 the documents satisfying A AND B AND C will be put into memory? Is 
 there any articles discuss about how to build a optimization query to 
 save memory and improve performance?
  Thank you very much.
  Yongjun Rong



Re: Search query optimization

2008-05-29 Thread Walter Underwood
The people working on Lucene are pretty smart, and this sort of
query optimization is a well-known trick, so I would not worry
about it.

A dozen years ago at Infoseek, we checked the count of matches
for each term in an AND, and evaluated the smallest one first.
If any of them had zero matches, we didn't evaluate any of them.

I expect that Doug Cutting and the other Lucene folk know those
same tricks.
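In Lucene terms, that trick would look roughly like the sketch below
(illustrative only -- Lucene's scorers already skip efficiently on their own,
and docFreq() counts deleted documents too, so treat it as an estimate):

    import java.io.IOException;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.BooleanClause;
    import org.apache.lucene.search.BooleanQuery;
    import org.apache.lucene.search.TermQuery;

    public class RarestFirst {
        /** Returns null if any term matches nothing (the AND can't match). */
        public static BooleanQuery rarestFirstAnd(IndexReader reader, Term[] terms)
                throws IOException {
            Term[] sorted = terms.clone();
            int[] freqs = new int[sorted.length];
            for (int i = 0; i < sorted.length; i++) {
                freqs[i] = reader.docFreq(sorted[i]);
                if (freqs[i] == 0) {
                    return null;                 // zero matches: skip evaluation entirely
                }
            }
            // selection sort: rarest (cheapest) terms first
            for (int i = 0; i < sorted.length; i++) {
                int min = i;
                for (int j = i + 1; j < sorted.length; j++) {
                    if (freqs[j] < freqs[min]) min = j;
                }
                int f = freqs[i]; freqs[i] = freqs[min]; freqs[min] = f;
                Term t = sorted[i]; sorted[i] = sorted[min]; sorted[min] = t;
            }
            BooleanQuery q = new BooleanQuery();
            for (Term t : sorted) {
                q.add(new TermQuery(t), BooleanClause.Occur.MUST);
            }
            return q;
        }
    }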

wunder

On 5/29/08 1:50 PM, Yongjun Rong [EMAIL PROTECTED] wrote:

 Hi Yonik,
   Thanks for your quick reply. I'm very new to the lucene source code.
 Can you give me a little more detail explaination about this.
 Do you think it will save some memory if docnum = find_match(A) 
 docnum = find_match(B) and put B in the front of the AND query like B
 AND A AND C? How about sorting (sort=A,B,Cq=A AND B AND C)? Do you
 think the order of conditions (A,B,C) in a query will affect the
 performance of the query?
   Thank you very much.
   Yongjun
 
 
 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Yonik
 Seeley
 Sent: Thursday, May 29, 2008 4:12 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Search query optimization
 
 On Thu, May 29, 2008 at 4:05 PM, Yongjun Rong [EMAIL PROTECTED]
 wrote:
  I have a question about how the lucene query parser. For example, I
 have query A AND B AND C. Will lucene extract all documents satisfy
 condition A in memory and then filter it with condition B and C?
 
 No, Lucene will try and optimize this the best it can.
 
 It roughly goes like this..
 docnum = find_match(A)
 docnum = find_first_match_after(docnum, B) docnum =
 find_first_match_after(docnum,C)
 etc...
 until the same docnum is returned for A,B, and C.
 
 See ConjunctionScorer for the gritty details.
 
 -Yonik
 
 
 
 or only
 the documents satisfying A AND B AND C will be put into memory? Is
 there any articles discuss about how to build a optimization query to
 save memory and improve performance?
  Thank you very much.
  Yongjun Rong