Is there any limit how many documents can be indexed by apache solr
Dear All,

I am using Apache Solr 3.6.2 with Drupal 7. Users keep adding their profiles (resumes), and a Drupal cron task indexes the documents.

Recently I observed that after about 11,000 documents had been indexed, further documents are no longer getting indexed. Is there a configuration setting for the maximum number of documents that can be indexed?

Kindly help.

Thanks,
kamal
Re: Is there any limit how many documents can be indexed by apache solr
Thanks Alejandro and Luis.

If I need to see the logs, where can I find them? Are they stored in any default log files? I start Apache Solr with the command below.

    java -Xms64m -Xmx6g -jar start.jar

I am using it along with Drupal 7.1.5, and I am trying to find out whether this is a Drupal issue or an Apache Solr issue. I have no clue where to start.

In the terminal where I started Apache Solr, I get the log below when I attempt to index the remaining documents.

    Nov 26, 2013 5:28:56 AM org.apache.solr.core.SolrCore execute
    INFO: [] webapp=/solr path=/admin/ping params={} hits=0 status=0 QTime=1
    Nov 26, 2013 5:28:56 AM org.apache.solr.core.SolrCore execute
    INFO: [] webapp=/solr path=/admin/ping params={} status=0 QTime=1
    Nov 26, 2013 5:28:56 AM org.apache.solr.core.SolrCore execute
    INFO: [] webapp=/solr path=/admin/ping params={} hits=0 status=0 QTime=1
    Nov 26, 2013 5:28:56 AM org.apache.solr.core.SolrCore execute
    INFO: [] webapp=/solr path=/admin/ping params={} status=0 QTime=1

Besides this, is there any other log folder? Kindly bear with me, as I am very much a novice with Apache Solr.

Regards,
kamal

On Tue, Nov 26, 2013 at 5:19 PM, Luis Cappa Banda luisca...@gmail.com wrote:

Hello! Also check your application server logs. Maybe you are trying to index documents that contain a syntax error, and they are being skipped.

Regards,
- Luis Cappa

2013/11/26 Alejandro Marqués Rodríguez amarq...@paradigmatecnologico.com

Hi,

In Lucene you are supposed to be able to index up to about 2.1 billion documents per index, because document numbers are stored as a Java int; the 274 billion figure on that page is the limit on indexed terms per segment ( http://lucene.apache.org/core/3_0_3/fileformats.html#Limitations ). Solr should be similar. Either way, the maximum is far bigger than those 11,000 ;)

Could it be that you are reusing IDs, so that the new documents overwrite the old ones?
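One way to follow up on Alejandro's question about reused IDs is to check a batch for repeated IDs and then compare the expected total with the count Solr reports. The sketch below is only an illustration under assumptions: the sample IDs and the address http://localhost:8983/solr are made up, and the helper names are hypothetical rather than part of Solr or Drupal.

```python
from collections import Counter
from urllib.parse import urlencode


def find_reused_ids(doc_ids):
    """Return the IDs that appear more than once in a batch.

    Solr replaces an existing document when a new one arrives with the
    same unique key, so reused IDs do not increase the document count.
    """
    counts = Counter(doc_ids)
    return sorted(doc_id for doc_id, n in counts.items() if n > 1)


def count_query_url(base_url):
    """Build a select URL that matches everything but returns no rows.

    The numFound value in the response is then the total document count.
    """
    params = {"q": "*:*", "rows": 0, "wt": "json"}
    return base_url.rstrip("/") + "/select?" + urlencode(params)


if __name__ == "__main__":
    sent = ["node/1", "node/2", "node/2", "node/3", "node/1"]  # hypothetical IDs
    print(find_reused_ids(sent))
    print(count_query_url("http://localhost:8983/solr"))  # assumed address
```

If the number of distinct IDs sent matches numFound, the "missing" documents were most likely overwrites rather than failed inserts.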
Re: Is there any limit how many documents can be indexed by apache solr
    =0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
    Nov 26, 2013 5:46:53 AM org.apache.solr.core.QuerySenderListener newSearcher
    INFO: QuerySenderListener sending requests to Searcher@1872c950 main
    Nov 26, 2013 5:46:53 AM org.apache.solr.core.SolrCore execute
    INFO: [] webapp=null path=null params={start=0&event=newSearcher&q=solr&rows=10} hits=11 status=0 QTime=25
    Nov 26, 2013 5:46:53 AM org.apache.solr.core.SolrCore execute
    INFO: [] webapp=null path=null params={start=0&event=newSearcher&q=rocks&rows=10} hits=38 status=0 QTime=15

Thanks,
kamal
Re: LIMIT on number of OR in fq
Thanks Aloke.

One needs to add headerBufferSize to the connector section of the jetty.xml file, as shown below.

    <!-- This connector is currently being used for Solr because it
         showed better performance than nio.SelectChannelConnector
         for typical Solr requests. -->
    <Call name="addConnector">
      <Arg>
        <New class="org.mortbay.jetty.bio.SocketConnector">
          <Set name="host"><SystemProperty name="jetty.host" /></Set>
          <Set name="port"><SystemProperty name="jetty.port" default="8983"/></Set>
          <Set name="maxIdleTime">5</Set>
          <Set name="lowResourceMaxIdleTime">1500</Set>
          <Set name="statsOn">false</Set>
          <Set name="headerBufferSize">65536</Set>
        </New>
      </Arg>
    </Call>

Regards,
Kamal

On Mon, Jun 10, 2013 at 12:20 PM, Aloke Ghoshal alghos...@gmail.com wrote:

True, the container's request header size limit must be the reason then. Try: http://serverfault.com/questions/136249/how-do-we-increase-the-maximum-allowed-http-get-query-length-in-jetty

On Sun, Jun 9, 2013 at 11:04 PM, Jack Krupansky j...@basetechnology.com wrote:

Maybe it is hitting some kind of container limit on URL length, like more than 2048? Add debugQuery=true to your query and see what query is received, parsed, and generated.

Also, if the default query operator is set to OR, fq={!q.op=OR}..., then you can drop the OR operators for a shorter query string.

That said, as with most features of Lucene and Solr, the #1 rule is: use them in moderation. A few dozen IDs are fine. A hundred immediately raises suspicion: what are you really trying to do? 200?! 250??!! Over 300?!! 1,000?!?! 5,000?!?! I mean, do you really need to do all of this in a single query? If you find yourself saying "Yes", go back to the drawing board and think a lot more carefully about what your data model is. The application data model is supposed to simplify queries, and your case does not seem simple at all.

Tell us what you are really trying to do with this extreme filter query. The fact that you stumbled into an apparent problem should be a wake-up call that you need to reconsider your basic design assumptions.

-- Jack Krupansky
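Two of the suggestions in this thread can be combined in a small sketch: use the {!q.op=OR} local params so the explicit OR operators can be dropped, and send the parameters in a POST body so they do not count against the container's URL or header size limit. This is a minimal sketch under assumptions: the Solr address, the helper names, and the sample IDs are hypothetical, and the request is only built, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def or_filter(field, values):
    """Explicit form: field:(a OR b OR c)."""
    return "%s:(%s)" % (field, " OR ".join(str(v) for v in values))


def compact_or_filter(field, values):
    """Same filter using local params, so OR is the default operator."""
    return "{!q.op=OR}%s:(%s)" % (field, " ".join(str(v) for v in values))


def build_post_request(select_url, query, filters):
    """Put the parameters in a form-encoded POST body instead of the URL."""
    params = [("q", query)] + [("fq", fq) for fq in filters]
    body = urlencode(params).encode("utf-8")
    return Request(
        select_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


if __name__ == "__main__":
    ids = [5000, 15000, 75100, 125300]  # sample values only
    long_fq = or_filter("pref_work_locations", ids)
    short_fq = compact_or_filter("pref_work_locations", ids)
    print(long_fq)
    print(short_fq)
    # Assumed local Solr address; the request is built but not sent.
    req = build_post_request("http://localhost:8983/solr/select", "*:*", [short_fq])
    print(req.get_method(), len(req.data), "bytes in body")
```

Neither trick removes the need to rethink a filter with hundreds of values, as Jack points out; they only avoid the header limit while that redesign happens.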
Re: Help required with fq syntax
Hi Otis,

Your suggestion worked fine.

Thanks,
kamal

On Sun, Jun 9, 2013 at 7:21 AM, Otis Gospodnetic otis.gospodne...@gmail.com wrote:

Try: ...q=*:*&fq=-blocked_company_ids:5

Otis
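To see why fq=-blocked_company_ids:5 also keeps documents that have no blocked_company_ids at all, it can help to model the filter in plain Python. The sketch below is only a model of the intended matching behaviour, not Solr code; the function name and the fourth document are made up for illustration.

```python
def not_blocked(docs, company_id):
    """Keep documents whose blocked_company_ids does not contain company_id.

    Documents without the field are kept too, which mirrors what a pure
    negative filter such as -blocked_company_ids:5 is expected to do.
    """
    return [
        name
        for name, fields in docs.items()
        if company_id not in fields.get("blocked_company_ids", [])
    ]


docs = {
    "document1": {"blocked_company_ids": [1, 5, 7]},
    "document2": {"blocked_company_ids": [2, 6, 7]},
    "document3": {"blocked_company_ids": [4, 5, 6]},
    "document4": {},  # hypothetical document with no blocked companies
}

print(not_blocked(docs, 5))  # ['document2', 'document4']
```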
LIMIT on number of OR in fq
Dear All,

I am using the syntax below to filter on a particular field.

    fq=locations:(5000 OR 1 OR 15000 OR 2 OR 75100)

With this I get the expected result. In some situations the number of ORs is much larger (around 280), something like the filter below.

    fq=pref_work_locations:(5000 OR 1 OR 15000 OR 2 OR 75100 OR 125300 OR 25300 OR 141100 OR 100700 OR 50300 OR 132100 OR
    25000 OR 25100 OR 25200 OR 25400 OR 25500 OR 25600 OR 25700 OR 25800 OR 25900 OR 26000 OR 26100 OR 26200 OR 26300 OR 26400 OR 26500 OR
    3 OR 30100 OR 35000 OR 35100 OR 35200 OR 35300 OR 35400 OR 35500 OR 35600 OR 35700 OR 35800 OR
    4 OR 45000 OR 45100 OR 45200 OR 45300 OR 45400 OR 45500 OR
    5 OR 50100 OR 50200 OR 55000 OR 55100 OR 55200 OR 55300 OR 55400 OR 55500 OR 55600 OR 55700 OR
    6 OR 60100 OR 60200 OR 60300 OR 60400 OR 60500 OR 65000 OR 65100 OR 65200 OR
    7 OR 70100 OR 70200 OR 70300 OR 70400 OR 75000 OR 75200 OR 75300 OR 75400 OR 75500 OR 75600 OR 75700 OR 75800 OR 75900 OR 76000 OR 76100 OR 76200 OR 76300 OR 76400 OR
    8 OR 80100 OR 80200 OR 80300 OR 80400 OR 80500 OR 85000 OR 85100 OR 85200 OR 85300 OR 85400 OR 85500 OR 85600 OR 85700 OR 85800 OR 85900 OR 86000 OR 86100 OR 86200 OR
    9 OR 90100 OR 90200 OR 90300 OR 90400 OR 90500 OR 90600 OR 90700 OR 90800 OR 90900 OR 91000 OR 91100 OR 91200 OR 91300 OR 91400 OR 91500 OR 91600 OR 91700 OR 91800 OR 91900 OR 92000 OR 92100 OR 92200 OR 92300 OR 92400 OR 92500 OR 92600 OR 92700 OR 92800 OR 92900 OR 95000 OR 95100 OR
    10 OR 100100 OR 105000 OR 105100 OR 105200 OR 105300 OR 105400 OR 105500 OR 105600 OR 105700 OR 105800 OR 105900 OR 106000 OR 106100 OR 106200 OR
    11 OR 110100 OR 115000 OR 115100 OR 115200 OR 115300 OR 115400 OR 115500 OR
    12 OR 120100 OR 120200 OR 120300 OR 120400 OR 120500 OR 120600 OR 120700 OR 120800 OR 120900 OR 121000 OR 121100 OR 125000 OR 125100 OR 125200 OR 125400 OR 125500 OR 125600 OR 125700 OR 125800 OR 125900 OR 126000 OR 126100 OR
    13 OR 130100 OR 130200 OR 130300 OR 130400 OR 130500 OR 130600 OR 130700 OR 130800 OR 130900 OR 131000 OR 131100 OR 131200 OR 131300 OR 131400 OR 131500 OR 131600 OR 131700 OR 131800 OR 131900 OR 132000 OR 132200 OR 132300 OR 132400 OR 132500 OR 135000 OR 135100 OR
    14 OR 140100 OR 140200 OR 140300 OR 140400 OR 140500 OR 140600 OR 140700 OR 140800 OR 140900 OR 141000 OR 141200 OR 141300 OR 141400 OR 141500 OR 141600 OR 141700 OR 141800 OR 141900 OR 142000 OR 142100 OR 145000 OR
    15 OR 155000 OR 16 OR 165000 OR 17 OR 175000 OR 18 OR 185000 OR 19 OR 195000 OR 20 OR 205000 OR 21 OR 215000 OR 22 OR 225000 OR 23 OR 235000 OR 24 OR 245000 OR 25 OR 255000 OR 26 OR 265000 OR 27 OR 275000 OR 28 OR 285000 OR 29 OR 295000 OR 30 OR 305000 OR 31 OR 315000 OR 32 OR 325000 OR 33 OR 335000 OR 34 OR 345000 OR 35 OR 355000 OR 36 OR 365000 OR 37 OR 375000 OR 38 OR 385000 OR 39)

With such a high number of ORs I get 0 records, whereas I expected all matching records. So I am wondering: is there any limit on the number of ORs in one fq filter?

I know I should move to something like the fq=locations:[min TO max] format, but that may not always be possible, or it would probably require modifying a bigger piece of code. So, as a temporary solution, is there any other way I can follow?

Best Regards,
Kamal
Help required with fq syntax
Dear All,

I have a multi-valued field blocked_company_ids in the index. You can think of it like this:

1. document1, blocked_company_ids: 1, 5, 7
2. document2, blocked_company_ids: 2, 6, 7
3. document3, blocked_company_ids: 4, 5, 6

and so on.

I want to retrieve all the documents where blocked_company_ids does not contain one particular company id, say 5. So my search result should give me only document2, as document1 and document3 both contain 5.

What should the fq syntax look like to achieve this? Is it something like the one below?

    fq=blocked_company_ids:-5

I tried the syntax above, but it gives me 0 records. Can somebody please help me with the syntax, and point me to where the syntax details are documented?

Thanks,
Kamal
Net Cloud Systems
Re: Help required with fq syntax
Also, please note that for some documents blocked_company_ids may not be present at all. Such documents should appear in the search result as well.

BR,
Kamal
Re: Help required with fq syntax
Though the syntax looks fine, I get all the records. With the example given above I get all the documents, meaning the filtering did not work. I am curious to know whether my indexing went fine or not. I will check and get back to you.

On Sun, Jun 9, 2013 at 7:21 AM, Otis Gospodnetic otis.gospodne...@gmail.com wrote:

Try: ...q=*:*&fq=-blocked_company_ids:5

Otis
Re: Sorting results by last update date
Jack,

Thank you so much for the detailed answer.

-BR,
Kamal

On Thu, May 30, 2013 at 6:18 PM, Jack Krupansky j...@basetechnology.com wrote:

I wrote "Otherwise, it would miss dates after the start of today", but that should be "Otherwise, it would miss documents with times after the start of today if the current time is before noon." But use * and you will be better off anyway.

-- Jack Krupansky

On Thursday, May 30, 2013 8:27 AM, Jack Krupansky wrote:

You can just use NOW/DAY for a filter that would only change once a day:

    [NOW/DAY-60DAY TO NOW/DAY]

Oops... make that:

    [NOW/DAY-60DAY TO NOW/DAY+1DAY]

Otherwise, it would miss dates after the start of today.

Even better, make it:

    [NOW/DAY-60DAY TO *]

-- Jack Krupansky
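Jack's point about rounding can be illustrated with ordinary date arithmetic. The sketch below assumes that /DAY rounds down to the start of the current UTC day, which is how Solr's date math is documented; the function names are made up for the illustration and nothing here talks to Solr.

```python
from datetime import datetime, timedelta, timezone


def start_of_day(now):
    """Rough equivalent of NOW/DAY: truncate to midnight."""
    return now.replace(hour=0, minute=0, second=0, microsecond=0)


def rounded_lower_bound(now, days=60):
    """Rough equivalent of NOW/DAY-60DAY."""
    return start_of_day(now) - timedelta(days=days)


def rounded_filter(field, days=60):
    """Filter string whose text is identical for every request in a day."""
    return "%s:[NOW/DAY-%dDAY TO *]" % (field, days)


if __name__ == "__main__":
    morning = datetime(2013, 5, 30, 8, 27, tzinfo=timezone.utc)
    evening = datetime(2013, 5, 30, 18, 18, tzinfo=timezone.utc)
    # Both times on the same day give the same lower bound ...
    print(rounded_lower_bound(morning) == rounded_lower_bound(evening))
    # ... whereas an unrounded NOW-60DAY would differ on every request.
    print(rounded_filter("last_updated_date"))
```

Because the rounded bound stays the same all day, the filter text stays the same too, which is what lets the filter cache be reused.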
Re: Sorting results by last update date
Thanks Shalin.

It is Solr 3.6.2. Instead of NOW, I can use today's date (I did not know about this cache issue, thanks).

Later I realized that the misleading asc and desc ordering was my own mistake. After I get the data from Solr, I run a MySQL query, and that changes the order again.

Regards,
Kamal

On Wed, May 29, 2013 at 2:54 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote:

What is the field type of last_updated_date? Which version of Solr?

A side note: using NOW in a filter query is inefficient because it doesn't use your filter cache effectively. Round it to the nearest time interval instead. See http://java.dzone.com/articles/solr-date-math-now-and-filter

--
Regards,
Shalin Shekhar Mangar.
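The cause Kamal describes, a follow-up MySQL query that returns rows in its own order, is commonly handled by re-ordering the database rows to match the ID order Solr returned. A minimal sketch, assuming each row is a dict with an id key; the row shape, the helper name, and the sample data are hypothetical.

```python
def reorder_like_solr(solr_ids, db_rows, key="id"):
    """Return db_rows in the order given by solr_ids.

    IDs that the database did not return are skipped.
    """
    rows_by_id = {row[key]: row for row in db_rows}
    return [rows_by_id[doc_id] for doc_id in solr_ids if doc_id in rows_by_id]


solr_ids = [42, 7, 19]  # order produced by sort=last_updated_date desc
db_rows = [  # e.g. rows from SELECT ... WHERE id IN (42, 7, 19)
    {"id": 7, "name": "b"},
    {"id": 19, "name": "c"},
    {"id": 42, "name": "a"},
]

print([row["id"] for row in reorder_like_solr(solr_ids, db_rows)])  # [42, 7, 19]
```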
Sorting results by last update date
Hi All,

I am trying to sort the results by last updated date. My URL looks as below.

    fq=last_updated_date:[NOW-60DAY TO NOW]&fq=experience:[0 TO 588]&fq=salary:[0 TO 500] OR salary:0&fq=-bundle:job&fq=-bundle:panel&fq=-bundle:page&fq=-bundle:article&spellcheck=true&q=+java +sip&fl=id,entity_id,entity_type,bundle,bundle_name,label,is_comment_count,ds_created,ds_changed,score,path,url,is_uid,tos_name,zm_parent_entity,ss_filemime,ss_file_entity_title,ss_file_entity_url,ss_field_uid&spellcheck.q=+java +sip&qf=content^40&qf=label^5.0&qf=tos_content_extra^0.1&qf=tos_name^3.0&hl.fl=content&mm=1&q.op=AND&wt=json&json.nl=map&sort=last_updated_date asc

With this I get the data in ascending order of last updated date. To sort the data in descending order, I use the same URL with only the sort parameter changed:

    sort=last_updated_date desc

Here the data set is not ordered properly; it mostly looks to me as if the data is ordered by score, not by last updated date. Can somebody tell me what I am missing here, and why desc is not working properly for me?

Thanks,
kamal
How apache solr stores indexes
Dear All,

I have a basic question about how data is stored in Apache Solr indexes.

Say I have a thousand registered users on my site, and I want to store the skills of each user in a multivalued string field. Say:

user 1 has skill set - Java, MySql, PHP
user 2 has skill set - C++, MySql, PHP
user 3 has skill set - Java, Android, iOS
... and so on

You can see that users 1 and 2 have two skills in common, MySql and PHP. In a real case there might be millions of repeated words.

Now the question is: does Apache Solr store them as plain words, or does it convert each word to a unique number and store only the number?

Best Regards,
Kamal
Net Cloud Systems
Bangalore, India
Re: How apache solr stores indexes
Thanks Alex.

I am in a dilemma about how to store the skill sets in the Solr index: as string tokens or as integers.

To give a little background: as of today, I assign each skill a unique id (an auto-increment field in a MySQL table) and then store the ids against the user id in a separate table. That is how I search for users having a particular skill, or retrieve the complete skill set of a particular user.

Now I want to move everything to Solr and keep MySQL usage as low as possible. This will help me scale to a higher load. I am weighing two options:

1. Store each skill as a string token (in a new multivalued string field)
2. Store each skill as an integer (in a new multivalued integer field)

Kindly suggest which is the better option.

Best Regards,
kamal

On Wed, May 29, 2013 at 8:11 AM, Alexandre Rafalovitch arafa...@gmail.com wrote:

And you need to know this why? If you are really trying to understand how this all works under the covers, you need to look at Lucene's inverted index as a start. Start here: http://lucene.apache.org/core/4_3_0/core/org/apache/lucene/codecs/lucene42/package-summary.html#package_description

Might take you a couple of weeks to put it all together. Or you could try asking the actual business-level question that you need an answer to. :-)

Regards,
Alex.
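As a rough picture of the inverted index Alex mentions: each distinct term is kept once in a term dictionary, and next to it is a list of the documents that contain it, so a skill repeated across millions of users is not stored as millions of separate strings. The toy model below illustrates that idea only; it is not how Lucene lays out its files, and the function name is made up.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each distinct term to the sorted list of document IDs containing it."""
    postings = defaultdict(set)
    for doc_id, terms in docs.items():
        for term in terms:
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


users = {
    1: ["Java", "MySql", "PHP"],
    2: ["C++", "MySql", "PHP"],
    3: ["Java", "Android", "iOS"],
}

index = build_inverted_index(users)
print(index["MySql"])  # [1, 2]
print(index["Java"])   # [1, 3]
print(len(index))      # 6 distinct terms for 9 skill entries
```

Seen this way, the choice between string and integer skills matters less for storage than it might seem, since either kind of value is stored once per distinct term.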
Re: How apache solr stores indexes
Thanks a lot for all your input. I will go ahead and store the skills as strings.

Best Regards,
Kamal

On Wed, May 29, 2013 at 9:00 AM, Jack Krupansky j...@basetechnology.com wrote:

As a general rule with Solr, do a proof-of-concept implementation with the simplest sensible approach, and only start piling on complexity if performance or capacity becomes problematic.

If the data is naturally a string, use a string. If it is naturally a number, use a number. Use whatever the query clients will be most comfortable with.

-- Jack Krupansky
Re: search filter
It looks like I am getting the exception below.

    May 22, 2013 10:52:11 PM org.apache.solr.common.SolrException log
    SEVERE: java.lang.NumberFormatException: For input string: "[3 TO 9] OR salary:0"
        at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
        at java.lang.Long.parseLong(Long.java:438)
        at java.lang.Long.parseLong(Long.java:478)

Regards,
kamal
search filter
Dear All, can I write a search filter for a field that matches either a range of values or a specific value? For example: 1. Select profiles with salary 5 to 10, or salary 0. So I expect profiles with a salary of 0, 5, 6, 7, 8, 9, or 10. It should be possible; can somebody help me with the syntax of the 'fq' filter, please?
Best Regards, kamal
Re: search filter
Hi Rafał Kuć,
I tried both fq=Salary:[5+TO+10]+OR+Salary:0 and fq=Salary:[5 TO 10] OR Salary:0, and in both cases I got 0 results. I use Drupal along with Solr; my code looks like this:
if ($include_0_salary == 1) {
  $conditions['fq'][0] = 'salary:[' . $min_ctc . '+TO+' . $max_ctc . ']+OR+salary:0';
}
else {
  $conditions['fq'][0] = 'salary:[' . $min_ctc . ' TO ' . $max_ctc . ']';
}
$conditions['fq'][1] = 'experience:[' . $min_exp . ' TO ' . $max_exp . ']';
$results = apachesolr_search_search_execute($keys, $conditions);
When $include_0_salary is false, I get the expected results. When it is true, I get 0 results, which means that $conditions['fq'][0] = salary:[5 TO 10] OR salary:0 did not work for me. Can somebody tell me what I am doing wrong here?
Best regards, kamal
On Wed, May 22, 2013 at 7:00 PM, Rafał Kuć r@solr.pl wrote:
Hello! You can try sending a filter like this: fq=Salary:[5+TO+10]+OR+Salary:0 It should work.
-- Regards, Rafał Kuć
Dear All, can I write a search filter for a field that matches either a range of values or a specific value? For example: 1. Select profiles with salary 5 to 10, or salary 0. So I expect profiles with a salary of 0, 5, 6, 7, 8, 9, or 10. It should be possible; can somebody help me with the syntax of the 'fq' filter, please?
Best Regards, kamal
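One point that may help when reading the two variants above: in a URL, '+' is simply the encoded form of a space, so fq=Salary:[5+TO+10]+OR+Salary:0 on the wire means "Salary:[5 TO 10] OR Salary:0" once decoded. The Python sketch below only illustrates that relationship. The helper name salary_filter is hypothetical, and whether a given client library (the Drupal module included) encodes parameters itself is something to check rather than assume.

```python
from urllib.parse import urlencode, parse_qs


def salary_filter(min_salary, max_salary, include_zero=False):
    """Build the raw (un-encoded) filter query string with plain spaces."""
    clause = f"salary:[{min_salary} TO {max_salary}]"
    if include_zero:
        clause = f"{clause} OR salary:0"
    return clause


raw = salary_filter(5, 10, include_zero=True)

# urlencode turns spaces into '+' and escapes ':', '[' and ']'.
encoded = urlencode({"fq": raw})

# Decoding the query string gives back the raw filter with spaces.
decoded = parse_qs(encoded)["fq"][0]

print(raw)
print(encoded)
print(decoded)
```

If the raw string already contains literal '+' characters and is then encoded again, the server receives real plus signs instead of spaces, which is one possible reason for a filter that matches nothing.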
Re: Adding field in Schema.xml
Hi Gora, thanks for your response.
*What do you mean by not taking effect? You do not seem to have made this clear anywhere in the thread.*
Basically I use Solr in a Drupal environment. In Drupal's configuration page there is a link that shows all available index fields. I added two long fields and one date field. I can see the date field, but I cannot see the two long fields, salary and experience.
*Besides adding the fields to Solr's schema.xml, you have to make sure that field values are picked up, and indexed properly into Solr. How are you indexing? Have you reindexed after adding the fields? Are you getting any errors in the logs after the indexing.*
I have added code to put these fields into the document object and index it. I have not deleted the whole index and reindexed it, but I expect that for any newly added documents the two fields salary and experience should be indexed. Eventually I will have to delete the index and reindex, but I will do that once all of this works. Now the question is: what do I need to do so that these fields are shown as index fields?
Best Regards, Kamal
On Sun, May 19, 2013 at 9:12 AM, Gora Mohanty g...@mimirtech.com wrote:
On 19 May 2013 08:36, Kamal Palei palei.ka...@gmail.com wrote:
Hi Alex, I just saw that in the types area, long is already defined as
<fieldType name="long" class="solr.TrieLongField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
Hence I hope I should be able to declare long fields in the fields area as shown below.
<field name="salary" type="long" indexed="true" stored="true"/>
<field name="experience" type="long" indexed="true" stored="true"/>
Yes, this should be fine.
Not sure why it is not taking effect.
What do you mean by not taking effect? You do not seem to have made this clear anywhere in the thread. Besides adding the fields to Solr's schema.xml, you have to make sure that field values are picked up, and indexed properly into Solr. How are you indexing? Have you reindexed after adding the fields? Are you getting any errors in the logs after the indexing.
Regards, Gora
Re: Adding field in Schema.xml
Hi Alex, where do I need to mention the types? Kindly tell me in detail. I use the Drupal framework, which provides a schema file. That file already contains some long fields, and Solr shows them as part of the index. Whatever long field I add does not show up as part of the index.
Best Regards, kamal
On Fri, May 17, 2013 at 7:47 PM, Alexandre Rafalovitch arafa...@gmail.com wrote:
Do you have the types corresponding to those fields present? Specifically, long. You don't get any special type names out of the box, they all need to be present in types area.
Regards, Alex.
On Fri, May 17, 2013 at 8:49 AM, Kamal Palei palei.ka...@gmail.com wrote:
Hi All, I am trying to add a few fields to the schema.xml file as below.
<field name="salary" type="long" indexed="true" stored="true"/>
<field name="experience" type="long" indexed="true" stored="true"/>
*<field name="last_updated_date" type="tdate" indexed="true" stored="true" default="NOW" multiValued="false"/>*
<dynamicField name="rs_*" type="long" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="rd_*" type="tdate" indexed="true" stored="true" multiValued="false"/>
Only last_updated_date (the one in bold) is getting added. Is there any syntax issue with the other 4 entries? Kindly let me know.
Thanks, kamal
Re: Adding field in Schema.xml
Hi Alex, I just saw that in the types area, long is already defined as
<fieldType name="long" class="solr.TrieLongField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
Hence I hope I should be able to declare long fields in the fields area as shown below.
<field name="salary" type="long" indexed="true" stored="true"/>
<field name="experience" type="long" indexed="true" stored="true"/>
Not sure why it is not taking effect.
Best Regards, Kamal
On Sat, May 18, 2013 at 6:23 PM, Kamal Palei palei.ka...@gmail.com wrote:
Hi Alex, where do I need to mention the types? Kindly tell me in detail. I use the Drupal framework, which provides a schema file. That file already contains some long fields, and Solr shows them as part of the index. Whatever long field I add does not show up as part of the index.
Best Regards, kamal
On Fri, May 17, 2013 at 7:47 PM, Alexandre Rafalovitch arafa...@gmail.com wrote:
Do you have the types corresponding to those fields present? Specifically, long. You don't get any special type names out of the box, they all need to be present in types area.
Regards, Alex.
On Fri, May 17, 2013 at 8:49 AM, Kamal Palei palei.ka...@gmail.com wrote:
Hi All, I am trying to add a few fields to the schema.xml file as below.
<field name="salary" type="long" indexed="true" stored="true"/>
<field name="experience" type="long" indexed="true" stored="true"/>
*<field name="last_updated_date" type="tdate" indexed="true" stored="true" default="NOW" multiValued="false"/>*
<dynamicField name="rs_*" type="long" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="rd_*" type="tdate" indexed="true" stored="true" multiValued="false"/>
Only last_updated_date (the one in bold) is getting added. Is there any syntax issue with the other 4 entries? Kindly let me know.
Thanks, kamal
Adding field in Schema.xml
Hi All, I am trying to add a few fields to the schema.xml file as below.
<field name="salary" type="long" indexed="true" stored="true"/>
<field name="experience" type="long" indexed="true" stored="true"/>
*<field name="last_updated_date" type="tdate" indexed="true" stored="true" default="NOW" multiValued="false"/>*
<dynamicField name="rs_*" type="long" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="rd_*" type="tdate" indexed="true" stored="true" multiValued="false"/>
Only last_updated_date (the one in bold) is getting added. Is there any syntax issue with the other 4 entries? Kindly let me know.
Thanks, kamal
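The advice in this thread that every type name used by a field must be present in the types area can be checked mechanically. The following Python sketch parses a schema fragment and lists fields whose type has no matching fieldType definition. The SCHEMA string is a made-up fragment for illustration only (tdate is left undefined on purpose), not the real Drupal schema, and the function name is hypothetical. It only covers the type-definition check; it says nothing about reindexing or how Drupal lists index fields.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema fragment: "long" is defined, "tdate" deliberately is not.
SCHEMA = """<schema name="example">
  <types>
    <fieldType name="long" class="solr.TrieLongField" precisionStep="0"
               omitNorms="true" positionIncrementGap="0"/>
  </types>
  <fields>
    <field name="salary" type="long" indexed="true" stored="true"/>
    <field name="experience" type="long" indexed="true" stored="true"/>
    <dynamicField name="rd_*" type="tdate" indexed="true" stored="true"
                  multiValued="false"/>
  </fields>
</schema>"""


def fields_with_undefined_types(schema_xml):
    """Return (field name, type) pairs whose type has no <fieldType> definition."""
    root = ET.fromstring(schema_xml)
    defined = {ft.get("name") for ft in root.iter("fieldType")}
    missing = []
    for tag in ("field", "dynamicField"):
        for field in root.iter(tag):
            if field.get("type") not in defined:
                missing.append((field.get("name"), field.get("type")))
    return missing


print(fields_with_undefined_types(SCHEMA))
```

An empty list would mean every declared field refers to a defined type, which matches what was eventually found here for long.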
Re: Can we search some mandatory words and some optional words in SOLR
Thanks Hoss, I modified it accordingly. One more thing I observed: if I give any one of the search keys below,
1. +Java +mysql +php +(TCL Perl Selenium) -ethernet -switching -routing
2. +(TCL Perl Selenium) -ethernet -switching -routing
3. +(TCL Perl Selenium)
it works as expected. For example, if the key is +(TCL Perl Selenium), it finds documents containing at least one of the keywords TCL, Perl, Selenium.
Best Regards, Kamal
On Wed, May 15, 2013 at 10:58 PM, Chris Hostetter hossman_luc...@fucit.org wrote:
: +Java +mysql +php TCL Perl Selenium -ethernet -switching -routing
that's missing one of the stated requirements...
: 2. Atleast one keyword out of TCL Perl Selenium should be present
...should be...
+Java +mysql +php +(TCL Perl Selenium) -ethernet -switching -routing
-Hoss
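The query shape suggested by Hoss (required terms with '+', an "at least one of" group as +( ... ), excluded terms with '-') can be assembled from three lists. This is a minimal Python sketch with a hypothetical helper name. It does no escaping, so terms containing query-syntax characters (for example c++) or multi-word phrases would need extra handling that is not shown.

```python
def build_query(required=(), any_of=(), excluded=()):
    """Assemble a Lucene-style query string from three groups of terms."""
    parts = [f"+{term}" for term in required]
    if any_of:
        # The whole group is mandatory, but any single member satisfies it.
        parts.append("+(" + " ".join(any_of) + ")")
    parts.extend(f"-{term}" for term in excluded)
    return " ".join(parts)


query = build_query(
    required=["Java", "mysql", "php"],
    any_of=["TCL", "Perl", "Selenium"],
    excluded=["ethernet", "switching", "routing"],
)
print(query)
```

Leaving any of the three lists empty produces the shorter variants listed in the message, such as +(TCL Perl Selenium) on its own.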
Adding a field in schema, storing it and using it to search
Hi All, I need help adding a new field and making use of it during search. As of today I only search for keywords in Solr. Whatever documents (these are resumes of individuals) Solr returns, I take as input, then search MySQL for experience, salary, etc., and show the selected resumes as the search result. While searching in Solr, I want to achieve something like this:
1. Search for keywords only in the resumes of users whose experience is greater than 5 years.
My understanding is that to achieve this:
1. I need to define a new field in the schema.
2. During indexing, I add this parameter.
3. During search, I add a condition like experience >= 5 years.
When adding the field, should I add it as a normal field, as shown below
<field name="experience" type="integer" indexed="true" stored="true"/>
or as a dynamic field, as shown below?
<dynamicField name="exp_*" type="double" indexed="true" stored="true" multiValued="false"/>
And during search, what should the condition look like?
Best regards, Kamal
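For step 3 above, one common way to express such a condition in Solr is an open-ended range in a filter query (fq), kept separate from the keyword query (q). The Python sketch below shows only the parameter shape. The field name experience comes from the message, the helper name is hypothetical, and [N TO *] is inclusive, so it means "N years or more".

```python
from urllib.parse import urlencode


def experience_search_params(keywords, min_years):
    """Keyword query plus a filter restricting results by minimum experience."""
    return {
        "q": keywords,
        # Inclusive lower bound, no upper bound.
        "fq": f"experience:[{min_years} TO *]",
    }


params = experience_search_params("java mysql", 5)
print(params["fq"])
print(urlencode(params))
```

The filter only works once the field is defined with a numeric type and the documents have been indexed with a value for it.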
Re: Can we search some mandatory words and some optional words in SOLR
Hi Hoss, I am puzzled by these two keys. They look similar, but the result sets differ.
In the 1st case I give the key as +c +c++ +sip +( *tcl* perl shell script) -manual testing -ss7
In the 2nd case I give the key as +c +c++ +sip +(*tcl* perl shell script) -manual testing -ss7
Please note that in the 2nd case there is no space before *tcl*. In the 1st case I get more results, and in the 2nd case I get only 3 results. In the 1st case I saw at least one result that does not contain a single optional keyword (that is, a document containing none of tcl, perl, or shell script). Is this a known issue? Or am I doing something wrong when preparing the key? Please let me know.
Thanks, Kamal
On Wed, May 15, 2013 at 10:58 PM, Chris Hostetter hossman_luc...@fucit.org wrote:
: +Java +mysql +php TCL Perl Selenium -ethernet -switching -routing
that's missing one of the stated requirements...
: 2. Atleast one keyword out of TCL Perl Selenium should be present
...should be...
+Java +mysql +php +(TCL Perl Selenium) -ethernet -switching -routing
-Hoss
Re: Can we search some mandatory words and some optional words in SOLR
Thanks Jack Krupansky. Your solution with the key +Java +mysql +php TCL Perl Selenium worked nicely. Now I need to extend it to search for all documents where:
1. The mandatory keywords Java, MySql are present.
2. At least one keyword out of TCL, Perl, Selenium is present.
3. The keywords ethernet, switching, routing are not present in the document.
In that case, what should the search key look like?
Best Regards, Kamal
On Mon, May 13, 2013 at 8:10 PM, Jack Krupansky j...@basetechnology.com wrote:
That's simply a standard, old-fashioned Lucene query: +Java +mysql +php TCL Perl Selenium And you can decide if min should match (mm) is 0, 1, 2, 3, etc. for the optional terms (TCL, Perl, Selenium)
-- Jack Krupansky
-Original Message- From: Kamal Palei Sent: Monday, May 13, 2013 9:56 AM To: solr-user@lucene.apache.org Subject: Can we search some mandatory words and some optional words in SOLR
Dear SOLR Experts, let's say I want to search for some mandatory words and some optional words. Say I want to find all documents that contain all of the keywords Java, mysql, php along with at least one keyword out of TCL, Perl, Selenium. Basically I am looking at a few mandatory keywords and a few optional keywords. Is it possible to search this way? If so, kindly guide me on what the query should look like.
Best Regards, Kamal
Mandatory words search in SOLR
Hi SOLR Experts, when I search documents with the keywords *java, mysql*, I get documents containing either *java* or *mysql* or both. Is it possible to get only the documents that contain both *java* and *mysql*? In that case, what would the query look like?
Thanks a lot, Kamal
Re: Mandatory words search in SOLR
Hi Rafał Kuć, I added q.op=AND as you suggested. The first few documents in the results contain both keywords (*java* and *mysql*), but towards the end there are still a number of documents that contain only one keyword, either *java* or *mysql*. Is this the normal Solr behaviour, or can I ask for a strict search that fetches a record only if all my keywords are present?
BR, Kamal
On Mon, May 13, 2013 at 4:02 PM, Rafał Kuć r@solr.pl wrote:
Hello! Change the default query operator. For example add the q.op=AND to your query.
-- Regards, Rafał Kuć
Hi SOLR Experts, when I search documents with the keywords *java, mysql*, I get documents containing either *java* or *mysql* or both. Is it possible to get only the documents that contain both *java* and *mysql*? In that case, what would the query look like?
Thanks a lot, Kamal
Re: Mandatory words search in SOLR
Hi François, thanks for the input. The major problem I face is that I use Drupal (as a framework) and the apachesolr module provided for Drupal, and I am not sure how to modify the query directly. However, this is not the right forum for Drupal-related questions. If somebody here knows both Drupal 7 and Solr well, kindly let me know.
One more doubt: let's say I want to search for some mandatory words and some optional words. Say I want to find all documents that contain all of the keywords Java, mysql, php along with at least one keyword out of TCL, Perl, Selenium. Basically I am looking at a few mandatory keywords and a few optional keywords. Is it possible to search this way? If so, kindly guide me on what the query should look like.
Best Regards, Kamal
On Mon, May 13, 2013 at 5:31 PM, François Schiettecatte fschietteca...@gmail.com wrote:
Kamal, You could also use the 'mm' parameter to require a minimum match, or you could prepend '+' to each required term.
Cheers, François
On May 13, 2013, at 7:57 AM, Kamal Palei palei.ka...@gmail.com wrote:
Hi Rafał Kuć, I added q.op=AND as you suggested. The first few documents in the results contain both keywords (*java* and *mysql*), but towards the end there are still a number of documents that contain only one keyword, either *java* or *mysql*. Is this the normal Solr behaviour, or can I ask for a strict search that fetches a record only if all my keywords are present?
BR, Kamal
On Mon, May 13, 2013 at 4:02 PM, Rafał Kuć r@solr.pl wrote:
Hello! Change the default query operator. For example add the q.op=AND to your query.
-- Regards, Rafał Kuć
Hi SOLR Experts, when I search documents with the keywords *java, mysql*, I get documents containing either *java* or *mysql* or both. Is it possible to get only the documents that contain both *java* and *mysql*? In that case, what would the query look like?
Thanks a lot, Kamal
Re: Mandatory words search in SOLR
Hi François, as per your suggestion I used the 'mm' param and was able to search with all keywords mandatory. In Drupal, one needs to call $query->addParam('mm', '100%'); in the query alter hook. Thanks a lot for guiding me.
Best Regards, Kamal
On Mon, May 13, 2013 at 5:56 PM, Kamal Palei palei.ka...@gmail.com wrote:
Hi François, thanks for the input. The major problem I face is that I use Drupal (as a framework) and the apachesolr module provided for Drupal, and I am not sure how to modify the query directly. However, this is not the right forum for Drupal-related questions. If somebody here knows both Drupal 7 and Solr well, kindly let me know.
One more doubt: let's say I want to search for some mandatory words and some optional words. Say I want to find all documents that contain all of the keywords Java, mysql, php along with at least one keyword out of TCL, Perl, Selenium. Basically I am looking at a few mandatory keywords and a few optional keywords. Is it possible to search this way? If so, kindly guide me on what the query should look like.
Best Regards, Kamal
On Mon, May 13, 2013 at 5:31 PM, François Schiettecatte fschietteca...@gmail.com wrote:
Kamal, You could also use the 'mm' parameter to require a minimum match, or you could prepend '+' to each required term.
Cheers, François
On May 13, 2013, at 7:57 AM, Kamal Palei palei.ka...@gmail.com wrote:
Hi Rafał Kuć, I added q.op=AND as you suggested. The first few documents in the results contain both keywords (*java* and *mysql*), but towards the end there are still a number of documents that contain only one keyword, either *java* or *mysql*. Is this the normal Solr behaviour, or can I ask for a strict search that fetches a record only if all my keywords are present?
BR, Kamal
On Mon, May 13, 2013 at 4:02 PM, Rafał Kuć r@solr.pl wrote:
Hello! Change the default query operator. For example add the q.op=AND to your query.
-- Regards, Rafał Kuć
Hi SOLR Experts, when I search documents with the keywords *java, mysql*, I get documents containing either *java* or *mysql* or both. Is it possible to get only the documents that contain both *java* and *mysql*? In that case, what would the query look like?
Thanks a lot, Kamal
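The thread mentions three ways to require every keyword: q.op=AND, a '+' prefix on each term, and mm=100%. The Python sketch below lays the three parameter sets side by side. The helper name and the strategy labels are hypothetical. The defType=dismax entry is an assumption: mm is a parameter of the DisMax family of query parsers, and which parser a given setup actually uses (the Drupal module included) is not stated in the thread.

```python
def require_all_terms(terms, strategy):
    """Return request parameters that make every term mandatory."""
    plain = " ".join(terms)
    if strategy == "q.op":
        # Standard query parser with AND as the default operator.
        return {"q": plain, "q.op": "AND"}
    if strategy == "plus":
        # Mark each term as required inside the query itself.
        return {"q": " ".join(f"+{term}" for term in terms)}
    if strategy == "mm":
        # Assumption: a DisMax-style parser is in use, so 'mm' applies.
        return {"q": plain, "defType": "dismax", "mm": "100%"}
    raise ValueError(f"unknown strategy: {strategy}")


for name in ("q.op", "plus", "mm"):
    print(name, require_all_terms(["java", "mysql"], name))
```

Which of the three has any effect depends on the query parser that handles the request, which may explain why q.op=AND appeared to change nothing while mm did.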
Can we search some mandatory words and some optional words in SOLR
Dear SOLR Experts, let's say I want to search for some mandatory words and some optional words. Say I want to find all documents that contain all of the keywords *Java, mysql, php* along with at least one keyword out of *TCL, Perl, Selenium*. Basically I am looking at a few mandatory keywords and a few optional keywords. Is it possible to search this way? If so, kindly guide me on what the query should look like.
Best Regards, Kamal
SOLR query performance
Dear All, I am using Apache Solr 3.6.2 as the search engine for a job site. I am seeing a Solr query take around 15 seconds to complete. I am sure there is something wrong in my approach, or I am indexing incorrectly, and I need some pointers to resolve this. Here is the background of what I have done.
I am using the Drupal 7.15 framework for the job site, with Apache Solr 3.6.2 as the search engine. When a user registers a profile, I create a node (page) and attach the document to that node. Every hour I run the cron task, which indexes new or updated nodes. When an employer searches for keywords such as java, mysql, php, I use the APIs provided by Drupal to interact with Solr and get the documents that contain those keywords.
There is a 'rows' parameter. If I specify rows as 100 or 200, the query returns fast (around half a second). If I specify rows as 3000, it takes around 15 seconds to return.
Now my question is: is there any mechanism by which I can tell Solr that my start row is X and rows is Y, so that it returns Y rows of the search result starting from the Xth row? (Please note that this is similar to LIMIT in MySQL.) Kindly let me know. This will help us to a great extent.
Best Regards, Kamal
Re: SOLR query performance
Thanks a lot Alex. I will try using the start parameter and report back. Meanwhile, I also need to know how many search results there are in total. Example: let's say I am searching for the keyword java. There might be 1000 documents containing java, but I need to show only 100 records at a time. As the query result I need the total number of records, and the data for only 100 records. At the bottom of the web page I show something like
Prev 1 2 3 4 5 6 7 8 9 10 Next
When the user clicks 4, I will set start to 300 and rows to 100 and run the query. As the query result, I expect a row count of 1000 and the data for 100 records (row numbers 301 to 400). Is this possible? Alex, kindly guide me.
Thanks, kamal
On Tue, May 7, 2013 at 7:55 PM, Alexandre Rafalovitch arafa...@gmail.com wrote:
Yes, that's what the 'start' and 'rows' parameters do in the query string. I would check the queries Solr sees when you do that long request. There is usually a delay in retrieving items further down the sorted list, but 15 seconds does feel excessive. http://wiki.apache.org/solr/CommonQueryParameters#start
Regards, Alex.
On Tue, May 7, 2013 at 10:10 AM, Kamal Palei palei.ka...@gmail.com wrote:
Now my question is: is there any mechanism by which I can tell Solr that my start row is X and rows is Y, so that it returns Y rows of the search result starting from the Xth row? (Please note that this is similar to LIMIT in MySQL.)
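The paging arithmetic described above can be sketched as follows. start is zero-based, so page 4 with 100 rows per page gives start=300, which covers the 301st to 400th results. Solr reports the total number of matches as numFound in the response regardless of how many rows are requested, so the page links can be derived from it. The helper names are hypothetical and the sample response dictionary is made-up data shaped like Solr's JSON output, not a real result.

```python
import math


def page_params(page, rows=100):
    """Translate a 1-based page number into Solr 'start' and 'rows' values."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    return {"start": (page - 1) * rows, "rows": rows}


def total_pages(num_found, rows=100):
    """Number of page links needed to cover all matches."""
    return math.ceil(num_found / rows)


# Made-up response with the same shape as Solr's JSON output.
sample_response = {"response": {"numFound": 1000, "start": 300, "docs": []}}

params = page_params(4)
pages = total_pages(sample_response["response"]["numFound"])
print(params, pages)
```

With this approach each request asks for a single page of rows, so there is no need to fetch 3000 rows at once.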