Re: Issue with using createNodeSet in Solr Cloud
Ah, nice tip, thanks! This could also make scripts more portable. Cheers, Savvas

On 21 July 2015 at 08:40, Upayavira u...@odoko.co.uk wrote:
Note, when you start up the instances, you can pass in a hostname to use instead of the IP address. If you are using bin/solr (which you should be!) then you can use "bin/solr -h my-host-name" and that'll be used in place of the IP. Upayavira

On Tue, Jul 21, 2015, at 05:45 AM, Erick Erickson wrote:
Glad you found a solution. Best, Erick
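The resolution in this thread was to take the createNodeSet entries straight from the node names ZooKeeper records under live_nodes (e.g. 192.168.1.201:8983_solr) instead of assuming 127.0.0.1. A small sketch of building such a CREATE request, with illustrative node names and base URL:

```python
# Sketch: build a Collections API CREATE URL whose createNodeSet entries
# come straight from the names ZK records under /live_nodes, rather than
# assuming 127.0.0.1. Node names and base URL here are illustrative.
from urllib.parse import urlencode

def create_collection_url(base_url, name, config_name, live_nodes,
                          num_shards=1, replication_factor=2):
    """Build a CREATE request restricted to the given live nodes."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        "maxShardsPerNode": 1,
        # Use the node names exactly as they appear in live_nodes,
        # e.g. "192.168.1.201:8983_solr", not "127.0.0.1:8983_solr".
        "createNodeSet": ",".join(live_nodes),
        "collection.configName": config_name,
    }
    return base_url + "/admin/collections?" + urlencode(params)

live_nodes = ["192.168.1.201:8983_solr", "192.168.1.201:8984_solr"]
url = create_collection_url("http://localhost:8983/solr",
                            "collection_A", "collection_A", live_nodes)
```

Note that urlencode escapes the colons and commas, which Solr accepts; the important part is that the names match live_nodes character for character.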
Re: Issue with using createNodeSet in Solr Cloud
Erick, spot on! The nodes had been registered in ZooKeeper under my network interface's IP address; after specifying those, the command worked just fine. It was indeed the thing I thought was true that wasn't... :) Many thanks, Savvas

On 18 July 2015 at 20:47, Erick Erickson erickerick...@gmail.com wrote:
P.S. It ain't the things ya don't know that'll kill ya, it's the things ya _do_ know that ain't so...

On Sat, Jul 18, 2015 at 12:46 PM, Erick Erickson erickerick...@gmail.com wrote:
Could you post your clusterstate.json? Or at least the live_nodes section of your ZK config? (Admin UI > Cloud > Tree > live_nodes.) The addresses of my nodes are things like 192.168.1.201:8983_solr. I'm wondering if you're taking your node names from the information ZK records, or assuming it's 127.0.0.1.
Re: Issue with using createNodeSet in Solr Cloud
Thanks Erick, The strange thing is that although I have set the log level to ALL, I see no error messages in the logs (apart from the line saying that the response is a 400). I'm quite confident the configset does exist, as the collection gets created fine if I don't specify the createNodeSet param. Complete mystery! I'll keep on troubleshooting and report back with my findings. Cheers, Savvas

On 17 July 2015 at 02:14, Erick Erickson erickerick...@gmail.com wrote:
There were a couple of cases where the "no live servers" error was being returned when the actual error was something completely different. Does the Solr log show something more useful? And are you sure you have a configset named collection_A? 'Cause this works (admittedly on 5.x) fine for me, and I'm quite sure there are bunches of automated tests that would be failing, so I suspect it's just a misleading error being returned. Best, Erick
Issue with using createNodeSet in Solr Cloud
Hello there, I am trying to use the createNodeSet parameter when creating a new collection, but I'm getting an error when doing so.

More specifically, I have four Solr instances running locally in separate JVMs (127.0.0.1:8983, 127.0.0.1:8984, 127.0.0.1:8985, 127.0.0.1:8986) and a standalone ZooKeeper instance which all Solr instances point to. The four Solr instances have no collections added to them and are all up and running (I can access the admin page in all of them).

Now, I want to create a collection in only two of these four instances (127.0.0.1:8983, 127.0.0.1:8984), but when I hit one instance with the following URL:

http://localhost:8983/solr/admin/collections?action=CREATE&name=collection_A&numShards=1&replicationFactor=2&maxShardsPerNode=1&createNodeSet=127.0.0.1:8983_solr,127.0.0.1:8984_solr&collection.configName=collection_A

I get the following response:

<response>
  <lst name="responseHeader">
    <int name="status">400</int>
    <int name="QTime">3503</int>
  </lst>
  <str name="Operation createcollection caused exception:">org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Cannot create collection collection_A. No live Solr-instances among Solr-instances specified in createNodeSet:127.0.0.1:8983_solr,127.0.0.1:8984_solr</str>
  <lst name="exception">
    <str name="msg">Cannot create collection collection_A. No live Solr-instances among Solr-instances specified in createNodeSet:127.0.0.1:8983_solr,127.0.0.1:8984_solr</str>
    <int name="rspCode">400</int>
  </lst>
  <lst name="error">
    <str name="msg">Cannot create collection collection_A. No live Solr-instances among Solr-instances specified in createNodeSet:127.0.0.1:8983_solr,127.0.0.1:8984_solr</str>
    <int name="code">400</int>
  </lst>
</response>

The instances are definitely up and running (at least the admin console can be accessed, as mentioned), and if I remove the createNodeSet parameter the collection is created as expected. Am I missing something obvious, or is this a bug? The exact Solr version I'm using is 4.9.1.
Any pointers would be much appreciated. Thanks, Savvas
Re: Highlighting without URL condition
Hello, You can add this request parameter in the "defaults" section of your request handler named /select in solrconfig.xml, like this:

<lst name="defaults">
  <str name="hl">true</str>
</lst>

As long as you use this request handler, you won't need to explicitly specify the parameter in your request.

On 19 September 2012 14:27, Spadez james_will...@hotmail.com wrote:
Hi, I was wondering if it is possible to set up highlighting so it is on by default and doesn't need to be added to the URL. For example: http://localhost:8080/solr/select?q=book&hl=true I would like highlighting to be on even if the URL is this: http://localhost:8080/solr/select?q=book Is this possible, and if so, how can it be achieved? -- View this message in context: http://lucene.472066.n3.nabble.com/Highlighting-without-URL-condition-tp4008899.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr - Proximity search using exact number of words apart
Hi, If you are using the dismax/edismax query parser, you could maybe give query slop a try: http://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_search_for_one_term_near_another_term_.28say.2C_.22batman.22_and_.22movie.22.29

On 16 September 2012 10:23, Omnia Zayed omnia.za...@gmail.com wrote:
Hi, I am working with apache-solr-3.6.0 on a Windows machine. I would like to search for two words a certain number of words apart (no more than this number). For example, consider the following phrases; I would like to search for "Daisy exam" with no more than 2 words apart:

Daisy has exam
Daisy has an exam
Daisy has a math exam
Daisy has a difficult math exam

I searched for such a thing and tried term proximity: http://localhost:8983/solr/select/?q="Daisy exam"~2&version=2.2&start=0&rows=10&indent=on&debugQuery=true The result I need is the phrase "Daisy has an exam", but using the above criteria the result was the last 3 phrases. So, any ideas on how to use an exact number of words apart? --- Omnia H. Zayed
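For intuition, a phrase query with slop N matches when the terms can be lined up with at most N position moves, which for two in-order terms means at most N intervening words; slop alone cannot express "exactly N apart". A toy illustration of that counting (not Lucene's actual matching code):

```python
# Toy illustration (not Lucene internals): "Daisy exam"~2 admits any
# phrase where at most 2 words sit between the two terms, so shorter
# gaps match as well.
def intervening_words(phrase, first, second):
    tokens = phrase.lower().split()
    return tokens.index(second) - tokens.index(first) - 1

phrases = [
    "Daisy has exam",
    "Daisy has an exam",
    "Daisy has a math exam",
    "Daisy has a difficult math exam",
]
# Phrases matched by a slop of 2 (at most 2 words between the terms):
matches = [p for p in phrases if intervening_words(p, "daisy", "exam") <= 2]
```

This also shows why the original query matched more than one phrase: slop is an upper bound, not an exact distance.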
Re: Use a boolean field as a flag for another, just stored, field
Hi, In your field declaration you can specify a default value which your field will be populated with in the absence of any value, and later, at search time, run filter queries against that value. Bear in mind that if you want to filter your results based on any value, you *have* to index that value; simply storing it won't work. Hope that helps, Savvas

On 9 September 2012 22:18, simple350 aurel...@yahoo.com wrote:
Hi, I want to be able to select from the index the documents which have a certain field that is not null. The problem is that the field is not indexed, just stored. I'm not interested in indexing that field as it is just an internal URL. The idea was to add another field to the document - a boolean field - based on the initial field: 'true' for an existing field, 'false' for null - I could copy the initial field and use some analyzer having a boolean result as output. Before trying to build a custom analyzer, I wanted to ask if anything like this makes sense, or if it is already available in Solr, or if I completely missed some point. Regards, Alex -- View this message in context: http://lucene.472066.n3.nabble.com/Use-a-boolean-field-as-a-flag-for-another-just-stored-field-tp4006484.html Sent from the Solr - User mailing list archive at Nabble.com.
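A minimal schema.xml sketch of the suggestion above; the field names (hasInternalURL, internalURL) are illustrative, not from the original thread:

```xml
<!-- Indexed boolean flag; defaults to false when no value is supplied -->
<field name="hasInternalURL" type="boolean" indexed="true" stored="false" default="false"/>

<!-- The URL itself stays stored-only -->
<field name="internalURL" type="string" indexed="false" stored="true"/>
```

At index time, set hasInternalURL to true whenever internalURL is present; at query time, filter with fq=hasInternalURL:true. No custom analyzer is needed.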
Re: Use a boolean field as a flag for another, just stored, field
So, as you say, you only need to have a hasInternalURL field (or something similar) which will be of type boolean and will be populated at index time? Unless I'm missing something, I don't see why you would need a custom analyzer for this.

On 9 September 2012 22:56, simple350 aurel...@yahoo.com wrote:
Well - this was the idea: not to index the useless data from the initial field, but to add and index another field, a boolean one, based on the content of the first one. -- View this message in context: http://lucene.472066.n3.nabble.com/Use-a-boolean-field-as-a-flag-for-another-just-stored-field-tp4006484p4006492.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: The way to customize ranking?
Could you not apply this logic in your Solr client prior to displaying the results?

On 23 August 2012 20:56, François Schiettecatte fschietteca...@gmail.com wrote:
I would create two indices, one with your content and one with your ads. This approach would allow you to precisely control how many ads you pull back and how you merge them into the results, and you would be able to control schemas, boosting, default fields, etc. for each index independently. Best regards, François

On Aug 23, 2012, at 11:45 AM, Nicholas Ding nicholas...@gmail.com wrote:
Thank you, but I don't want to filter those ads. For example, when a user makes a search like q=Car, the result list is:

1. Ford Automobile (score 10)
2. Honda Civic (score 9)
...
99. Paid Ad (score 1; an ad has its own field identifying it as an ad)

What I want to find is a way to make the score of paid ads higher than "Ford Automobile". Basically, the result structure will look like:

[Paid Ads Section]
[Most valuable Ad 1]
[Most valuable Ad 2]
[Less valuable Ad 1]
[Less valuable Ad 2]
[Relevant Results Section]

On Thu, Aug 23, 2012 at 11:33 AM, Karthick Duraisamy Soundararaj karthick.soundara...@gmail.com wrote:
Hi, You might add an int field "Search Rule" that identifies the type of search, for example:

Search Rule   Description
0             Unpaid Search
1             Paid Search - Rule 1
2             Paid Search - Rule 2

You can use filter queries (http://wiki.apache.org/solr/CommonQueryParameters) like fq=SearchRule:[1 TO *]. Alternatively, you can even use a boolean field to identify whether or not a search is paid, and then an additional field that identifies the type of paid search. -- Karthick

On Thu, Aug 23, 2012 at 11:16 AM, Nicholas Ding nicholas...@gmail.com wrote:
Hi, I'm working on Solr to build a local business search in China. We have a special requirement from advertisers. When a user makes a search, if the results contain paid advertisements, those ads need to be moved to the top of the results. For different ads, they have detailed rules about which comes first. Could anyone offer me some suggestions on how to customize the ranking based on my requirement? Thanks, Nicholas
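The client-side approach suggested at the top of the thread can be sketched as follows; the isAd and adValue field names are hypothetical stand-ins for whatever field marks a document as an ad:

```python
# Sketch of reordering in the Solr client, after the results come back:
# pull the ad documents to the top, ordered by a hypothetical "adValue"
# field, and keep the organic results in their original score order.
def reorder_with_ads_first(docs):
    ads = [d for d in docs if d.get("isAd")]
    organic = [d for d in docs if not d.get("isAd")]
    ads.sort(key=lambda d: d.get("adValue", 0), reverse=True)
    return ads + organic

results = [
    {"id": "ford", "score": 10, "isAd": False},
    {"id": "honda", "score": 9, "isAd": False},
    {"id": "ad-2", "score": 1, "isAd": True, "adValue": 5},
    {"id": "ad-1", "score": 1, "isAd": True, "adValue": 9},
]
ordered = reorder_with_ads_first(results)
```

Because the "detailed rules about which comes first" live in the advertiser's domain, keeping them in the client (or a service layer) avoids encoding business logic into relevance scoring.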
Re: Frequency of Unique Id displayed more than 1
Hello, Make sure your unique id has a type which always yields one token after tokenisation is applied (e.g. either "string", or a type which only defines the KeywordTokenizer in its analysis chain). Regards, Savvas

On 5 July 2012 11:02, Sohail Aboobaker sabooba...@gmail.com wrote:
Hi, We have defined a unique key as "schemaid". We add documents using the server.addBean(obj) method. We are using the same method for updates as well. When browsing the schema, we see that some of the schemaid values have a frequency of more than 1. Since the schemaid column is defined as the unique key, we are expecting that addBean will automatically replace the existing entry in the index. Are we supposed to use a different method for updates as opposed to adds? Regards, Sohail
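A schema.xml sketch of the two options mentioned above; the keyword_id type name is illustrative:

```xml
<!-- Either use the built-in string type, which is never tokenised... -->
<field name="schemaid" type="string" indexed="true" stored="true" required="true"/>
<uniqueKey>schemaid</uniqueKey>

<!-- ...or a text type whose chain contains only the KeywordTokenizer -->
<fieldType name="keyword_id" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
  </analyzer>
</fieldType>
```

If the unique key type splits a value into several tokens, overwrite-on-add cannot reliably match the existing document, which is one way duplicate frequencies appear.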
Re: Frequency of Unique Id displayed more than 1
Can you post the schema you are applying, please?

On 5 July 2012 11:28, Sohail Aboobaker sabooba...@gmail.com wrote:
Another observation is that when we query an individual schemaid, it returns only one row using the search interface. Why would the frequency be more than 1?
Re: possible status codes from solr during a (DIH) data import process
Hello, Driven by the same requirements, we also implemented the same polling mechanism (in Java) and found it a bit awkward and error-prone, having to search through the returned response for occurrences of the terms "failure" or "Rollback" etc. It would be *really* handy if the status command returned numeric values to reflect the current state of the DIH process (similar to the HTTP status codes a server sends to a web browser). Our 2 cents.. :)

On 1 June 2012 15:29, geeky2 gee...@hotmail.com wrote:
thank you ALL for the great feedback - very much appreciated! -- View this message in context: http://lucene.472066.n3.nabble.com/possible-status-codes-from-solr-during-a-DIH-data-import-process-tp3987110p3987263.html Sent from the Solr - User mailing list archive at Nabble.com.
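The string-matching the posters describe can be sketched as below; the sample XML is a hand-written approximation of a DIH status response, not captured output:

```python
# Sketch of parsing a DIH ?command=status response: read the "status"
# value and scan the body for rollback/failure markers, as the thread
# above describes doing by hand.
import xml.etree.ElementTree as ET

def dih_state(response_xml):
    """Return (status, rolled_back) parsed from a DIH status response."""
    root = ET.fromstring(response_xml)
    status = None
    for el in root.iter("str"):
        if el.get("name") == "status":
            status = el.text
    rolled_back = ("Rolledback" in response_xml
                   or "failure" in response_xml.lower())
    return status, rolled_back

sample = """<response>
  <str name="status">idle</str>
  <lst name="statusMessages">
    <str name="">Indexing failed. Rolledback all changes.</str>
  </lst>
</response>"""
status, rolled_back = dih_state(sample)
```

The fragility the posters complain about is visible here: the outcome is inferred from free-text messages rather than a stable machine-readable code.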
Function Query not getting picked up by Standard Query Parser
Hello, I'm trying to find out why my function query isn't getting picked up by the standard parser. More specifically, I send the following set of HTTP params (I'm using the _val_ syntax):

<lst name="params">
  <str name="_val_">creationDate^0.01</str>
  <str name="debugQuery">on</str>
  <str name="start">225</str>
  <str name="q">allFields:(born to be wild)</str>
  <str name="rows">5</str>
</lst>

and turning on debugQuery yields the following calculation for the first result:

0.29684606 = (MATCH) product of:
  0.5936921 = (MATCH) sum of:
    0.5936921 = (MATCH) weight(allFields:wild in 13093), product of:
      0.64602524 = queryWeight(allFields:wild), product of:
        5.88155 = idf(docFreq=223, maxDocs=29531)
        0.10983928 = queryNorm
      0.91899216 = (MATCH) fieldWeight(allFields:wild in 13093), product of:
        1.0 = tf(termFreq(allFields:wild)=1)
        5.88155 = idf(docFreq=223, maxDocs=29531)
        0.15625 = fieldNorm(field=allFields, doc=13093)
  0.5 = coord(1/2)

but I don't see my function query affecting the score anywhere. Is there something else I should be setting? What am I missing? Cheers, Savvas
Re: Function Query not getting picked up by Standard Query Parser
Great, that did it! I can now see the function query part in the calculation. Thanks very much Erik, Savvas

On 2 June 2011 13:28, Erik Hatcher erik.hatc...@gmail.com wrote:
For this to work, _val_: goes *in* the q parameter, not as a separate parameter. See here for more details: http://wiki.apache.org/solr/SolrQuerySyntax#Differences_From_Lucene_Query_Parser

Erik
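The corrected request can be sketched as follows; the exact _val_ clause is an approximation of the thread's creationDate boost, embedded in q as Erik describes rather than sent as its own parameter:

```python
# Sketch of the fix: the function-query clause lives inside q itself
# (the Lucene-parser _val_ hook), not as a separate "_val_" parameter.
from urllib.parse import urlencode

params = {
    # Normal query clauses plus the function query, together in q:
    "q": 'allFields:(born to be wild) _val_:"creationDate"^0.01',
    "debugQuery": "on",
    "start": 225,
    "rows": 5,
}
query_string = urlencode(params)
```

With the clause inside q, the debugQuery explain output then includes a FunctionQuery component alongside the term weights.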
Solr book
Hello, Does anyone know if there is a v 3.1 book coming any time soon? Regards, Savvas
Re: Solr book
Great, thanks! So, I guess Solr in Action and the Solr Cookbook will be based on 3.1. :)

2011/5/19 Rafał Kuć ra...@alud.com.pl:
Hello! Take a look at the Solr resources page on the wiki (http://wiki.apache.org/solr/SolrResources). -- Regards, Rafał Kuć http://solr.pl
DIH Response
Hello, We have configured Solr for delta processing through DIH, and we kick off the index request from within a batch process. However, we somehow need to know whether our indexing request succeeded or not, because we want to be able to roll back a DB transaction if that step fails. Looking at the SolrServer API, we weren't able to find a method that could help us with that, so the only solution we see is constantly polling the server and parsing the response for the "idle" or "Rolledback" words. What we noticed, though, is that the response also contains a message saying "This response format is experimental. It is likely to change in the future." Does this mean that we can't rely on this response to build our module? Is there a better way? Thank you, Savvas
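The polling approach described above can be sketched as a control-flow skeleton; fetch_status is a placeholder for a real HTTP call to the DIH handler's status command, stubbed out here:

```python
# Sketch of polling DIH until it is idle, then deciding whether the
# import succeeded. fetch_status stands in for a real status request.
import time

def wait_for_import(fetch_status, poll_seconds=0, max_polls=10):
    """Poll until DIH reports "idle"; True means no rollback was seen."""
    for _ in range(max_polls):
        body = fetch_status()
        if "idle" in body:
            # Import finished; a "Rolledback" marker means it failed,
            # so the caller should roll back its own DB transaction too.
            return "Rolledback" not in body
        time.sleep(poll_seconds)
    raise TimeoutError("import still busy after max_polls")

# Stubbed responses: one "busy" poll, then an idle response after a rollback.
responses = iter(["status: busy", "status: idle ... Rolledback all changes"])
succeeded = wait_for_import(lambda: next(responses))
```

This keeps the fragile string-matching in one place, so if the experimental response format changes, only wait_for_import needs updating.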
Re: catch_all field versus multiple OR Boolean query
Hi Erick, Yes, we are using the dismax parser. It was more the "all search fields selected" use case that we were wondering about. We specify omitNorms=true for the catch_all field, an option which we have found to yield better results in our case, but we don't do that for all the other fields, so, as you say, that might be the reason. Thanks very much, - Savvas

On 30 March 2011 00:37, Erick Erickson erickerick...@gmail.com wrote:
It's not so much the Boolean as it is different field characteristics. The length of a field factors into the score, and a Boolean query that goes against the individual fields will certainly score differently than putting all the fields in a catch-all, which is, obviously, longer. Have you looked at the dismax query parser? It allows you to distribute queries over fields automatically, even with varying boosts. Finally, consider adding debugQuery=on to your query to see what each field contributes to the score; that'll help with understanding the scoring, although it's a little hard to read.

Best, Erick
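Erick's point about field length can be made concrete with classic (pre-BM25) Lucene length normalisation, where fieldNorm is roughly 1/sqrt(number of terms in the field). A sketch, with illustrative term counts:

```python
# Illustration of why a catch-all copyField scores differently from an
# OR over the individual fields: the catch-all concatenates all values,
# so its length norm (roughly 1/sqrt(num_terms)) is smaller.
import math

def field_norm(num_terms):
    return 1.0 / math.sqrt(num_terms)

title_terms, body_terms = 5, 95
catch_all_terms = title_terms + body_terms  # copyField of both

# The same term match keeps a much larger norm in the short title field
# than inside the combined catch-all field:
assert field_norm(title_terms) > field_norm(catch_all_terms)
```

This also explains the omitNorms=true observation in the reply above: omitting norms on the catch-all removes exactly this length penalty.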
Re: Matching on a multi valued field
I assume you are using the standard handler? In that case, wouldn't something like q=common_names:(man's friend)&q.op=AND work?

On 29 March 2011 21:57, Brian Lamb brian.l...@journalexperts.com wrote:
Hi all, I have a field set up like this:

<field name="common_names" multiValued="true" type="text" indexed="true" stored="true" required="false"/>

And I have some records:

RECORD1
<arr name="common_names">
  <str>man's best friend</str>
  <str>pooch</str>
</arr>

RECORD2
<arr name="common_names">
  <str>man's worst enemy</str>
  <str>friend to no one</str>
</arr>

Now if I do a search such as http://localhost:8983/solr/search/?q=*:*&fq={!q.op=AND df=common_names}man's friend both records are returned. However, I only want RECORD1 returned. I understand why RECORD2 is returned, but how can I structure my query so that only RECORD1 is returned? Thanks, Brian Lamb
Re: Matching on a multi valued field
My bad, just realised your problem.. :D
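The problem both posters circle around is that with q.op=AND each term only has to appear somewhere in the multi-valued field, and in RECORD2 "man's" and "friend" come from different values. A toy illustration of why that happens (not Solr internals):

```python
# Toy illustration: indexing a multi-valued field effectively pools the
# values' tokens, so an AND query can be satisfied by terms drawn from
# *different* values of the same document.
def flatten(values):
    tokens = []
    for value in values:
        # crude possessive stripping + whitespace tokenising for the demo
        tokens.extend(value.lower().replace("'s", "").split())
    return tokens

record2 = ["man's worst enemy", "friend to no one"]
tokens = flatten(record2)

# Both query terms are present, even though no single value contains both:
assert "man" in tokens and "friend" in tokens
assert not any(("man" in v) and ("friend" in v) for v in record2)
```

Constraining matches to a single value needs positional information (e.g. a phrase-style query whose slop stays below the gap between values), which a plain AND over terms does not use.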
catch_all field versus multiple OR Boolean query
Hello, Currently in our index we have multiple fields and a copyField catch_all field. When users select all search options, we specify the catch_all field as the field to search on. This has worked very well for our needs, but a question was recently raised within our team regarding the difference between using a catch_all field and specifying a Boolean query OR-ing all fields together. From our own experimentation, we have observed that using these two different strategies we get back different results lists. By looking at the Similarity class, we can understand how the score is calculated for the catch_all field, but is there any input on how the score gets calculated for the Boolean query? Regards, - Savvas
Re: Logic operator with dismax
Hello, The dismax search handler doesn't have the concept of a logical operator in terms of OR/AND, but rather uses a feature called Minimum Should Match (mm). This parameter specifies the absolute number, or percentage, of the entered terms that need to match. To get an OR-like effect you can specify mm=0%, and for an AND-like effect mm=100% should work. More information can be found here: http://wiki.apache.org/solr/DisMaxQParserPlugin#mm_.28Minimum_.27Should.27_Match.29

On 21 March 2011 11:46, Gastone Penzo gastone.pe...@gmail.com wrote:
Hi, I have a problem with the logical operator OR in a dismax query search. Some days ago the query worked well; now it returns nothing (0 documents). To explain, the query is: http://localhost:8983/solr/select/?q=1324 OR 4322 OR 2324 OR hello+world&defType=dismax&qf=code%20title The schema has the fields "code" and "title". I want to search for the docs with "hello world" in the title, plus the docs with the codes 1324, 4322, 2324 (even if they don't have "hello world" in the title). The result is that the query returns the docs with these codes AND "hello world" in the title (logical AND, not OR). The default operator in the schema is OR. What happened? Thank you -- Gastone Penzo www.solr-italia.it The first Italian blog dedicated to Apache Solr
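The mm-based fix described above can be sketched as request parameters; the q and qf values mirror the thread, while mm replaces the OR operators:

```python
# Sketch: with dismax, OR-like behaviour comes from mm=0%, not from OR
# operators in q. Terms and fields follow the thread above.
from urllib.parse import urlencode

or_like = {
    "q": "1324 4322 2324 hello world",
    "defType": "dismax",
    "qf": "code title",
    "mm": "0%",   # any single matching term is enough (OR-like)
}
and_like = dict(or_like, mm="100%")  # every term must match (AND-like)

or_query = urlencode(or_like)
```

This is likely why the original query stopped behaving as expected: dismax treats the literal tokens "OR" as search terms, so the effective behaviour is governed entirely by mm.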
Re: frequent index updates
Hello, This thread might help: http://search-lucene.com/m/09PHV1E0ZxQ1/Possibilities+of+near+real+time+search+with+solr/v=threaded

On 21 March 2011 09:33, Prav Buz buz.p...@gmail.com wrote:
Hi, I'm wondering what the best way is to handle this scenario: the index will have about 250-400 million items, and it needs to be updated every 10-20 minutes, with up to 5-6 million records updated in each pass. Could you please guide me on how indexing is done when there are over 500 million records, and on the possible ways to do such frequent updates as mentioned above? Thanks, Prav
Re: Solr under Tomcat
Hi Sai, You can find your index files at: {%TOMCAT_HOME}\solr\data\index If you want to clear the index just delete the whole index directory. Regards, - Savvas On 2 March 2011 14:09, Thumuluri, Sai sai.thumul...@verizonwireless.comwrote: Good Morning, We have deployed Solr 1.4.1 under Tomcat and it works great, however I cannot find where the index (directory) is created. I set solr home in web.xml under /webapps/solr/WEB-INF/, but not sure where the data directory is. I have a need where I need to completely index the site and it would help for me to stop solr, delete index directory and restart solr prior to re-indexing the content. Thanks, Sai Thumuluri
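The "stop Solr, delete the index directory, restart" step described above can be sketched as follows. The data-directory layout ({solr.home}/data/index) is an assumption, and the servlet container must be stopped first since a running Solr holds the index files open:

```python
import shutil
import tempfile
from pathlib import Path

# Sketch only: the data-directory layout is an assumption; stop Tomcat/Solr
# before deleting, since a running instance holds the files open.
def wipe_index(data_dir):
    index_dir = Path(data_dir) / "index"
    if index_dir.exists():
        shutil.rmtree(index_dir)  # Solr recreates an empty index on startup
    return index_dir

# Demonstration on a throwaway directory standing in for the data dir:
data_dir = Path(tempfile.mkdtemp())
(data_dir / "index").mkdir()
(data_dir / "index" / "segments_1").touch()
removed = wipe_index(data_dir)
print(removed.exists())
```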
Re: Create a tomcat service.
Hi Rajini, We use the following script, run from within the {TOMCAT_HOME}\bin directory, to create service instances (assuming you are targeting Windows Server environments..):

cd C:\Program Files\Apache Software Foundation\{TOMCAT_HOME}\bin
set CATALINA_BASE=C:\Program Files\Apache Software Foundation\{TOMCAT_HOME}

rem Delete service if it already exists
rem tomcat6 //DS//your_instance_name

tomcat6 //IS//your_instance_name ^
 --DisplayName={TOMCAT_HOME} ^
 --Description=This is your instance description ^
 --Install=C:\Program Files\Apache Software Foundation\{TOMCAT_HOME}\bin\tomcat6.exe ^
 --Classpath=C:\Program Files\Apache Software Foundation\{TOMCAT_HOME}\bin\bootstrap.jar ^
 --Jvm=auto ^
 --Startup=auto ^
 --StartMode=jvm ^
 --StartPath=C:\Program Files\Apache Software Foundation\{TOMCAT_HOME} ^
 --StopMode=jvm ^
 --StartClass=org.apache.catalina.startup.Bootstrap ^
 --StartParams=start ^
 --StopClass=org.apache.catalina.startup.Bootstrap ^
 --StopParams=stop ^
 --StopPath=C:\Program Files\Apache Software Foundation\{TOMCAT_HOME} ^
 --LogPath=%CATALINA_BASE%\logs ^
 --StdOutput=auto ^
 --StdError=auto ^
 --JvmOptions=-Dcatalina.home='C:\Program Files\Apache Software Foundation\{TOMCAT_HOME}';-Dcatalina.base=%CATALINA_BASE%;-Djava.io.tmpdir=%CATALINA_BASE%\temp;-Djava.endorsed.dirs='C:\Program Files\Apache Software Foundation\{TOMCAT_HOME}\endorsed'

rem Copy the service applet tomcat6w.exe to the instance name
copy tomcat6w.exe your_instance_name.exe /Y

Regards, - Savvas On 28 February 2011 12:15, Jan Høydahl jan@cominvent.com wrote: You may have downloaded the wrong Tomcat package? http://lmgtfy.com/?q=tomcat+windows+service On 28. feb. 2011, at 12.25, rajini maski wrote: Does anybody have a script to create a tomcat service?
I'm trying to set my system up to run multiple instances of tomcat at the same time (on different ports, obviously), and can't get the service to create properly. I tried to follow the steps mentioned in this link http://doc.ittrium.com/ittrium/visit/A1x66x1y1x10ddx1x68y1x1209x1x68y1x1214x1x7d .. but was not successful in getting this done. The service.bat file refers to an exe that is not available in the zip. Any help or suggestions? Thanks, Rajani.
Re: Create a tomcat service.
..--DisplayName doesn't *have* to be {TOMCAT_HOME} of course..just a copy paste artifact.. :D On 28 February 2011 12:21, Savvas-Andreas Moysidis savvas.andreas.moysi...@googlemail.com wrote: Hi Rajini, We use the following script ran from within {TOMCAT_HOME}\bin directory to create service instances (assuming you are targeting Windows Server environments..): cd C:\Program Files\Apache Software Foundation\{TOMCAT_HOME}\bin set CATALINA_BASE=C:\Program Files\Apache Software Foundation\{TOMCAT_HOME} rem Delete service if it already exists rem tomcat6 //DS//your_instance_name tomcat6 //IS//your_instance_name --DisplayName={TOMCAT_HOME} --Description=This is your instance description --Install=C:\Program Files\Apache Software Foundation\{TOMCAT_HOME}\bin\tomcat6.exe --Classpath=C:\Program Files\Apache Software Foundation\{TOMCAT_HOME}\bin\bootstrap.jar --Jvm=auto --Startup=auto --StartMode=jvm --StartPath=C:\Program Files\Apache Software Foundation\{TOMCAT_HOME} --StopMode=jvm --StartClass=org.apache.catalina.startup.Bootstrap --StartParams=start --StopClass=org.apache.catalina.startup.Bootstrap --StopParams=stop --StopPath=C:\Program Files\Apache Software Foundation\{TOMCAT_HOME} --Startup=auto --LogPath=%CATALINA_BASE%\logs --StdOutput=auto --StdError=auto --JvmOptions=-Dcatalina.home='C:\Program Files\Apache Software Foundation\{TOMCAT_HOME}';-Dcatalina.base=%CATALINA_BASE%;-Djava.io.tmpdir=%CATALINA_BASE%\temp;-Djava.endorsed.dirs='C:\Program Files\Apache Software Foundation\{TOMCAT_HOME}\endorsed' rem Copy service applet tomcat6w.exe to instance name copy tomcat6w.exe your_instance_name.exe /Y Regards, - Savvas On 28 February 2011 12:15, Jan Høydahl jan@cominvent.com wrote: You may have downloaded the wrong Tomcat package? http://lmgtfy.com/?q=tomcat+windows+service On 28. feb. 2011, at 12.25, rajini maski wrote: Does anybody have a script to create a tomcat service? 
I'm trying to set my system up to run multiple instances of tomcat at the same time (on different ports, obviously), and can't get the service to create properly. I tried to follow the steps mentioned in this link http://doc.ittrium.com/ittrium/visit/A1x66x1y1x10ddx1x68y1x1209x1x68y1x1214x1x7d .. but was not successful in getting this done. The service.bat file refers to an exe that is not available in the zip. Any help or suggestions? Thanks, Rajani.
Re: How to handle special character in filter query
Hello, Regarding HTTP-specific characters (like spaces and '&'), you'll need to URL-encode those if you are firing queries directly at Solr, but you don't need to do so if you are using a Solr client such as SolrJ. Regards, - Savvas On 26 February 2011 03:11, cyang2010 ysxsu...@hotmail.com wrote: How to handle special characters when constructing a filter query? for example, i want to do something like: http://...&fq=genre:ACTION & ADVENTURE How do i handle the space and '&' in the filter query part? Thanks. -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-handle-special-character-in-filter-query-tp2579978p2579978.html Sent from the Solr - User mailing list archive at Nabble.com.
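The URL-encoding advice above can be sketched like this (the genre value "ACTION & ADVENTURE" is assumed from the quoted mail):

```python
from urllib.parse import quote

# Sketch: percent-encode a filter-query value containing a space and an
# ampersand, so the '&' is not mistaken for a parameter separator.
raw_value = 'genre:"ACTION & ADVENTURE"'
encoded = quote(raw_value, safe="")   # encode spaces, '&', quotes, ':'
fq = "fq=" + encoded
print(fq)
```

A Solr client such as SolrJ performs this encoding for you, which is why it is only needed when hitting the HTTP API directly.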
Re: fq field with facets
Hello, you could try wrapping your fq terms in double quotes, as in: ?q=home&fq=category:"Appartement Sale" On 23 February 2011 13:25, Rosa (Anuncios) rosaemailanunc...@gmail.com wrote: Hi, I'm trying to reduce results from facets (by category, with my schema). My category field is of String type in my schema.xml. The problem I've got is that when the category value has a space or special character it doesn't work: Example: ?q=home&fq=category:Appartement --- works fine ?q=home&fq=category:Appartement for rent -- doesn't work? ?q=home&fq=category:Appartement Sale -- doesn't work? I guess there is a workaround for this? Sorry if it's obvious... i'm a newbie with Solr thanks for your help rosa
Re: fq field with facets
Hi Erik, could you please let us know where we can find more info about this notation (fq={!field f=category})? What is it called, how do we use it, etc.? Is there a wiki page? Thanks, - Savvas On 23 February 2011 14:17, Erik Hatcher erik.hatc...@gmail.com wrote: Try - fq={!field f=category}insert value, URL encoded of course, here You can also try surrounding with quotes, but that gets tricky and you'll possibly need to escape things. Or you could simply backslash-escape the whitespace (and colon, etc.) characters. Erik On Feb 23, 2011, at 08:25, Rosa (Anuncios) wrote: Hi, I'm trying to reduce results from facets (by category, with my schema). My category field is of String type in my schema.xml. The problem I've got is that when the category value has a space or special character it doesn't work: Example: ?q=home&fq=category:Appartement --- works fine ?q=home&fq=category:Appartement for rent -- doesn't work? ?q=home&fq=category:Appartement Sale -- doesn't work? I guess there is a workaround for this? Sorry if it's obvious... i'm a newbie with Solr thanks for your help rosa
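The two escaping options Erik mentions can be sketched side by side. The category value below is illustrative; {!field f=category} tells Solr to parse the whole value as a single field query, so no manual escaping of spaces is needed:

```python
from urllib.parse import urlencode

# Illustrative category value; not taken from any real schema.
value = "Appartement for rent"

# Option 1: the field query parser via local params -- no manual escaping.
fq_local = urlencode({"fq": "{!field f=category}" + value})

# Option 2: backslash-escape the whitespace yourself in a plain fq.
fq_escaped = urlencode({"fq": "category:" + value.replace(" ", "\\ ")})

print(fq_local)
print(fq_escaped)
```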
Re: [ANN] new SolrMeter release
Nice! will definitely give it a try! :) On 23 February 2011 22:55, Lance Norskog goks...@gmail.com wrote: Cool! On 2/23/11, Tomás Fernández Löbbe tomasflo...@gmail.com wrote: Hi All, I'm happy to announce a new release of SolrMeter, an open source stress test tool for Solr. You can obtain the code or executable jar from the google code page at: http://code.google.com/p/solrmeter There have been a lot of improvements since the last release, you can see what's new by checking the issues tool or entering here: http://code.google.com/p/solrmeter/issues/list?can=1q=Milestone%3DRelease-0.2.0+colspec=ID+Type+Status+Priority+Milestone+Owner+Summarycells=tiles Best Regards, Tomás -- Lance Norskog goks...@gmail.com
Re: General question about Solr Caches
Hi Hoss, Ok, that makes much more sense now. I was under the impression that the values were copied as well, which seemed a bit odd.. unless you have to deal with a use case similar to yours. :) Cheers, - Savvas On 9 February 2011 02:25, Chris Hostetter hossman_luc...@fucit.org wrote: : In my understanding, the Current Index Searcher uses a cache instance and : when a New Index Searcher is registered a new cache instance is used which : is also auto-warmed. However, what happens when the New Index Searcher is a : view of an index which has been modified? If the entries contained in the : old cache are copied during auto warming to the new cache wouldn’t that new : cache contain invalid entries? a) i'm not sure what you mean by view of an index which has been modified ... except for the first time an index is created, an Index Searcher always contains a view of an index which has been modified -- the view that the IndexSearcher represents is entirely consistent and doesn't change as documents are added/removed - that's why a new Searcher needs to be opened. b) entries are not copied during autowarming. the *keys* of the entries in the old cache are used to warm the new cache -- using the new searcher to generate new values. (caveat: if you have a custom cache, you could write a custom cache regenerator that did copy the values from the old cache verbatim -- i have done that in special cases where the type of object i was caching didn't vary based on the IndexSearcher -- or did vary, but in such a way that i could use the new Searcher to determine a cheap piece of information and based on the result either reuse an old value that was expensive to compute, or recompute it using the new Searcher. ... but none of the default cache regenerators for the stock solr caches work this way) : : : : Thanks, : - Savvas : -Hoss
Re: Concurrent updates/commits
Hello, Thanks very much for your quick replies. So, according to Pierre, all updates will be immediately posted to Solr, but all commits will be serialised. But doesn't that contradict Jonathan's example, where you can end up with FIVE 'new indexes' being warmed? If commits are serialised, then there can only ever be one Index Searcher being auto-warmed at a time, or have I got this wrong? The reason we are investigating commit serialisation is that we want to know whether commit requests will be blocked until the previous ones finish. Cheers, - Savvas On 9 February 2011 15:44, Pierre GOSSE pierre.go...@arisem.com wrote: However, the Solr book, in the Commit, Optimise, Rollback section, reads: "if more than one Solr client were to submit modifications and commit them at similar times, it is possible for part of one client's set of changes to be committed before that client told Solr to commit", which suggests that requests are *not* serialised. I read this as: if two clients submit modifications and commits every couple of minutes, it could happen that the modifications of client1 got committed by client2's commit before client1 asked for a commit. As far as I understand Solr commits, they are serialized by design. And committing too often could lead you to trouble if you have many warm-up queries (?). Hope this helps, Pierre -Message d'origine- De : Savvas-Andreas Moysidis [mailto: savvas.andreas.moysi...@googlemail.com] Envoyé : mercredi 9 février 2011 16:34 À : solr-user@lucene.apache.org Objet : Concurrent updates/commits Hello, This topic has probably been covered before here, but we're still not very clear about how multiple commits work in Solr. We currently have a requirement to make our domain objects searchable immediately after they get updated in the database by some user action. This could potentially cause multiple updates/commits to be fired at Solr, and we are trying to investigate how Solr handles those multiple requests.
This thread: http://search-lucene.com/m/0cab31f10Mh/concurrent+commitssubj=commit+concurrency+full+text+search suggests that Solr will handle all of the lower-level details and that "Before a *COMMIT* is done, a lock is obtained and it's released after the operation", which in my understanding means that Solr will serialise all update/commit requests? However, the Solr book, in the Commit, Optimise, Rollback section, reads: "if more than one Solr client were to submit modifications and commit them at similar times, it is possible for part of one client's set of changes to be committed before that client told Solr to commit", which suggests that requests are *not* serialised. Our questions are: - Does Solr handle concurrent requests, or do we need to add synchronisation logic around our code? - If Solr *does* handle concurrent requests, does it serialise each request or does it have some other strategy for processing them? Thanks, - Savvas
Re: Concurrent updates/commits
Yes, we'll probably go towards that path as our index files are relatively small, so auto warming might not be extremely useful in our case.. Yep, we do realise the difference between a db and a Solr commit. :) Thanks. On 9 February 2011 16:15, Walter Underwood wun...@wunderwood.org wrote: Don't think commit, that is confusing. Solr is not a database. In particular, it does not have the isolation property from ACID. Solr indexes new documents as a batch, then installs a new version of the entire index. Installing a new index isn't instant, especially with warming queries. Solr creates the index, then warms it, then makes it available for regular queries. If you are creating indexes frequently, don't bother warming. wunder == Walter Underwood Lead Engineer, MarkLogic On Feb 9, 2011, at 8:03 AM, Savvas-Andreas Moysidis wrote: Hello, Thanks very much for your quick replies. So, according to Pierre, all updates will be immediately posted to Solr, but all commits will be serialised. But doesn't that contradict Jonathan's example where you can end up with FIVE 'new indexes' being warmed? If commits are serialised, then there can only ever be one Index Searcher being auto-warmed at a time or have I got this wrong? The reason we are investigating commit serialisation, is because we want to know whether the commit requests will be blocked until the previous ones finish. Cheers, - Savvas On 9 February 2011 15:44, Pierre GOSSE pierre.go...@arisem.com wrote: However, the Solr book, in the Commit, Optimise, Rollback section reads: if more than one Solr client were to submit modifications and commit them at similar times, it is possible for part of one client's set of changes to be committed before that client told Solr to commit which suggests that requests are *not* serialised. I read this as If two client submit modifications and commits every couple of minutes, it could happen that modifications of client1 got committed by client2's commit before client1 asks for a commit. 
As far as I understand Solr commit, they are serialized by design. And committing too often could lead you to trouble if you have many warm-up queries (?). Hope this helps, Pierre -Message d'origine- De : Savvas-Andreas Moysidis [mailto: savvas.andreas.moysi...@googlemail.com] Envoyé : mercredi 9 février 2011 16:34 À : solr-user@lucene.apache.org Objet : Concurrent updates/commits Hello, This topic has probably been covered before here, but we're still not very clear about how multiple commits work in Solr. We currently have a requirement to make our domain objects searchable immediately after the get updated in the database by some user action. This could potentially cause multiple updates/commits to be fired to Solr and we are trying to investigate how Solr handles those multiple requests. This thread: http://search-lucene.com/m/0cab31f10Mh/concurrent+commitssubj=commit+concurrency+full+text+search suggests that Solr will handle all of the lower level details and that Before a *COMMIT* is done , lock is obtained and its released after the operation which in my understanding means that Solr will serialise all update/commit requests? However, the Solr book, in the Commit, Optimise, Rollback section reads: if more than one Solr client were to submit modifications and commit them at similar times, it is possible for part of one client's set of changes to be committed before that client told Solr to commit which suggests that requests are *not* serialised. Our questions are: - Does Solr handle concurrent requests or do we need to add synchronisation logic around our code? - If Solr *does* handle concurrent requests, does it serialise each request or has some other strategy for processing those? Thanks, - Savvas
Re: Concurrent updates/commits
Thanks very much Em. - Savvas On 9 February 2011 16:22, Savvas-Andreas Moysidis savvas.andreas.moysi...@googlemail.com wrote: Yes, we'll probably go towards that path as our index files are relatively small, so auto warming might not be extremely useful in our case.. Yep, we do realise the difference between a db and a Solr commit. :) Thanks. On 9 February 2011 16:15, Walter Underwood wun...@wunderwood.org wrote: Don't think commit, that is confusing. Solr is not a database. In particular, it does not have the isolation property from ACID. Solr indexes new documents as a batch, then installs a new version of the entire index. Installing a new index isn't instant, especially with warming queries. Solr creates the index, then warms it, then makes it available for regular queries. If you are creating indexes frequently, don't bother warming. wunder == Walter Underwood Lead Engineer, MarkLogic On Feb 9, 2011, at 8:03 AM, Savvas-Andreas Moysidis wrote: Hello, Thanks very much for your quick replies. So, according to Pierre, all updates will be immediately posted to Solr, but all commits will be serialised. But doesn't that contradict Jonathan's example where you can end up with FIVE 'new indexes' being warmed? If commits are serialised, then there can only ever be one Index Searcher being auto-warmed at a time or have I got this wrong? The reason we are investigating commit serialisation, is because we want to know whether the commit requests will be blocked until the previous ones finish. Cheers, - Savvas On 9 February 2011 15:44, Pierre GOSSE pierre.go...@arisem.com wrote: However, the Solr book, in the Commit, Optimise, Rollback section reads: if more than one Solr client were to submit modifications and commit them at similar times, it is possible for part of one client's set of changes to be committed before that client told Solr to commit which suggests that requests are *not* serialised. 
I read this as If two client submit modifications and commits every couple of minutes, it could happen that modifications of client1 got committed by client2's commit before client1 asks for a commit. As far as I understand Solr commit, they are serialized by design. And committing too often could lead you to trouble if you have many warm-up queries (?). Hope this helps, Pierre -Message d'origine- De : Savvas-Andreas Moysidis [mailto: savvas.andreas.moysi...@googlemail.com] Envoyé : mercredi 9 février 2011 16:34 À : solr-user@lucene.apache.org Objet : Concurrent updates/commits Hello, This topic has probably been covered before here, but we're still not very clear about how multiple commits work in Solr. We currently have a requirement to make our domain objects searchable immediately after the get updated in the database by some user action. This could potentially cause multiple updates/commits to be fired to Solr and we are trying to investigate how Solr handles those multiple requests. This thread: http://search-lucene.com/m/0cab31f10Mh/concurrent+commitssubj=commit+concurrency+full+text+search suggests that Solr will handle all of the lower level details and that Before a *COMMIT* is done , lock is obtained and its released after the operation which in my understanding means that Solr will serialise all update/commit requests? However, the Solr book, in the Commit, Optimise, Rollback section reads: if more than one Solr client were to submit modifications and commit them at similar times, it is possible for part of one client's set of changes to be committed before that client told Solr to commit which suggests that requests are *not* serialised. Our questions are: - Does Solr handle concurrent requests or do we need to add synchronisation logic around our code? - If Solr *does* handle concurrent requests, does it serialise each request or has some other strategy for processing those? Thanks, - Savvas
General question about Solr Caches
Hello, I am going through the wiki page related to cache configuration http://wiki.apache.org/solr/SolrCaching and I have a question regarding the general cache architecture and implementation: In my understanding, the Current Index Searcher uses a cache instance and when a New Index Searcher is registered a new cache instance is used which is also auto-warmed. However, what happens when the New Index Searcher is a view of an index which has been modified? If the entries contained in the old cache are copied during auto warming to the new cache wouldn’t that new cache contain invalid entries? Thanks, - Savvas
Re: Scoring: Precedent for a Rules-/Priority-based Approach?
Hi Tavi, In my understanding, the scoring formula Lucene (and therefore Solr) uses is based on a mathematical model which is proven to work for general-purpose full-text searching. The real challenge, as you mention, comes when you need to achieve high-quality scoring based on the domain you are working in. For example, a general search portal for Songs might need to score Songs based on search relevance, but a search application for a Music Publisher might need to score Songs first by relevance, with matched documents boosted according to the revenue they have generated.. and the ranking from that second scoring strategy could be wildly different from the first one.. Personally, I can't think of a generic scoring strategy that could come out of the box with Solr and allow for all the widely different use cases. I don't really agree that tuning Solr, and in general experimenting for better scoring quality, is something fragile or awkward. As the name suggests, it is a tuning process which targets your specific environment. :) Technically, in our case we were able to significantly improve scoring quality (as judged by our domain experts) by using the Dismax Search Handler and by experimenting with different boost values, function queries, the mm parameter, and by setting omitNorms to true for the fields we were having problems with. Regards, - Savvas On 8 February 2011 16:23, Tavi Nathanson tavi.nathan...@gmail.com wrote: Hey everyone, I have a question about Lucene/Solr scoring in general. There are many factors at play in the final score for each document, and very often one factor will completely dominate everything else when that may not be the intention. ** The question: might there be a way to enforce strict requirements that certain factors are higher priority than other factors, and/or certain factors shouldn't overtake other factors? Perhaps a set of rules where one factor is considered before even examining another factor?
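The kinds of tuning knobs mentioned above can be sketched as request parameters. Everything below is illustrative: the field names, boost factors, and the revenue field are hypothetical, not taken from any schema in this thread:

```python
from urllib.parse import urlencode

# Hypothetical dismax tuning: qf boosts, min-should-match, and a
# function-query boost. Field names and values are assumptions.
params = {
    "defType": "dismax",
    "q": "apple",
    "qf": "title^4.0 body^1.0",  # title matches weighted above body matches
    "mm": "100%",                # all query terms must match somewhere
    "bf": "log(revenue)",        # function query nudging high-revenue docs up
}
query_string = urlencode(params)
print(query_string)
```

Adjusting the qf boosts changes relative field importance without code changes, which is why this experimentation loop tends to be quick in practice.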
Tuning boost numbers around and hoping for the best seems imprecise and very fragile. ** To make this more concrete, an example: We previously added the scores of multi-field matches together via an OR, so: score(query apple) = score(field1:apple) + score(field2:apple). I changed that to be more in-line with DisMaxParser, namely a max: score(query apple) = max(score(field1:apple), score(field2:apple)). I also modified coord such that coord would only consider actual unique terms (apple vs. orange), rather than terms across multiple fields (field1:apple vs. field2:apple). This seemed like a good idea, but it actually introduced a bug that was previously hidden. Suddenly, documents matching apple in the title and *nothing* in the body were being boosted over documents matching apple in the title and apple in the body! I investigated, and it was due to lengthNorm: previously, documents matching apple in both title and body were getting very high scores and completely overwhelming lengthNorm. Now that they were no longer getting *such* high scores, which was beneficial in many respects, they were also no longer overwhelming lengthNorm. This allowed lengthNorm to dominate everything else. I'd love to hear your thoughts :) Tavi
Re: facet.mincount
could you post the query you are submitting to Solr? On 3 February 2011 09:33, Isan Fulia isan.fu...@germinait.com wrote: Hi all, Even after setting facet.mincount=1, it is showing results with count = 0. Does anyone know why this is happening? -- Thanks & Regards, Isan Fulia.
Re: facet.mincount
Hi Dan, I'm probably just not able to spot this, but where does the wiki page mention that facet.mincount is not applicable to date fields? On 3 February 2011 10:55, Isan Fulia isan.fu...@germinait.com wrote: I am using the solr1.4.1 release version. I got the following error while using facet.mincount:

java.lang.IllegalStateException: STREAM
 at org.mortbay.jetty.Response.getWriter(Response.java:571)
 at org.apache.jasper.runtime.JspWriterImpl.initOut(JspWriterImpl.java:158)
 at org.apache.jasper.runtime.JspWriterImpl.flushBuffer(JspWriterImpl.java:151)
 at org.apache.jasper.runtime.PageContextImpl.release(PageContextImpl.java:208)
 at org.apache.jasper.runtime.JspFactoryImpl.internalReleasePageContext(JspFactoryImpl.java:144)
 at org.apache.jasper.runtime.JspFactoryImpl.releasePageContext(JspFactoryImpl.java:95)
 at org.apache.jsp.admin.index_jsp._jspService(org.apache.jsp.admin.index_jsp:397)
 at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:80)
 at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
 at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:373)
 at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:464)
 at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:358)
 at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
 at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:487)
 at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:367)
 at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
 at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
 at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
 at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
 at org.mortbay.jetty.servlet.Dispatcher.forward(Dispatcher.java:268)
 at org.mortbay.jetty.servlet.Dispatcher.forward(Dispatcher.java:126)
 at org.mortbay.jetty.servlet.DefaultServlet.doGet(DefaultServlet.java:431)
 at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
 at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
 at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:487)
 at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1098)
 at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:286)
 at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
 at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
 at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
 at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
 at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
 at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
 at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
 at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
 at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
 at org.mortbay.jetty.Server.handle(Server.java:285)
 at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
 at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:821)
 at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513)
 at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
 at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
 at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
 at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)

On 3 February 2011 16:17, dan sutton danbsut...@gmail.com wrote: I don't think facet.mincount works with date faceting, see here: http://wiki.apache.org/solr/SimpleFacetParameters Dan On Thu, Feb 3, 2011 at 10:11 AM, Isan Fulia isan.fu...@germinait.com wrote: Any query followed by
facet=on&facet.date=aUpdDt&facet.date.start=2011-01-02T08:00:00.000Z&facet.date.end=2011-02-03T08:00:00.000Z&facet.date.gap=%2B1HOUR&facet.mincount=1 On 3 February 2011 15:14, Savvas-Andreas Moysidis savvas.andreas.moysi...@googlemail.com wrote: could you post the query you are submitting to Solr? On 3 February 2011 09:33, Isan Fulia isan.fu...@germinait.com wrote: Hi all, Even after setting facet.mincount=1, it is showing results with count = 0. Does anyone know why this is happening? -- Thanks & Regards, Isan Fulia
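The date-facet request quoted above can be assembled programmatically, letting urlencode supply the '&' separators and encode the '+' in the gap value as %2B:

```python
from urllib.parse import urlencode

# Parameter names and values are the ones quoted in the thread above.
params = [
    ("facet", "on"),
    ("facet.date", "aUpdDt"),
    ("facet.date.start", "2011-01-02T08:00:00.000Z"),
    ("facet.date.end", "2011-02-03T08:00:00.000Z"),
    ("facet.date.gap", "+1HOUR"),
    ("facet.mincount", "1"),
]
query_string = urlencode(params)
print(query_string)
```

Building the query string this way avoids hand-encoding mistakes such as a literal '+' being read as a space by Solr.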
Re: facet.mincount
ahh..I see your point..well if that's true, then facet.missing/facet.method are also not supported? I'm not sure if this is the case, or the Date Faceting Parameters = Field Value Faceting Parameters + the extra ones. Maybe the page author(s) can clarify. On 3 February 2011 11:32, dan sutton danbsut...@gmail.com wrote: facet.mincount is grouped only under field faceting parameters not date faceting parameters On Thu, Feb 3, 2011 at 11:08 AM, Savvas-Andreas Moysidis savvas.andreas.moysi...@googlemail.com wrote: Hi Dan, I'm probably just not able to spot this, but where does the wiki page mention that the facet.mincount is not applicable on date fields? On 3 February 2011 10:55, Isan Fulia isan.fu...@germinait.com wrote: I am using solr1.4.1 release version I got the following error while using facet.mincount java.lang.IllegalStateException: STREAM at org.mortbay.jetty.Response.getWriter(Response.java:571) at org.apache.jasper.runtime.JspWriterImpl.initOut(JspWriterImpl.java:158) at org.apache.jasper.runtime.JspWriterImpl.flushBuffer(JspWriterImpl.java:151) at org.apache.jasper.runtime.PageContextImpl.release(PageContextImpl.java:208) at org.apache.jasper.runtime.JspFactoryImpl.internalReleasePageContext(JspFactoryImpl.java:144) at org.apache.jasper.runtime.JspFactoryImpl.releasePageContext(JspFactoryImpl.java:95) at org.apache.jsp.admin.index_jsp._jspService(org.apache.jsp.admin.index_jsp:397) at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:80) at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:373) at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:464) at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:358) at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:487) at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:367) at 
org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216) at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181) at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712) at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405) at org.mortbay.jetty.servlet.Dispatcher.forward(Dispatcher.java:268) at org.mortbay.jetty.servlet.Dispatcher.forward(Dispatcher.java:126) at org.mortbay.jetty.servlet.DefaultServlet.doGet(DefaultServlet.java:431) at javax.servlet.http.HttpServlet.service(HttpServlet.java:707) at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:487) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1098) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:286) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089) at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365) at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216) at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181) at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712) at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405) at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211) at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114) at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139) at org.mortbay.jetty.Server.handle(Server.java:285) at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502) at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:821) at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513) at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208) at 
org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378) at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226) at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442) On 3 February 2011 16:17, dan sutton danbsut...@gmail.com wrote: I don't think facet.mincount works with date faceting, see here: http://wiki.apache.org/solr/SimpleFacetParameters Dan On Thu, Feb 3, 2011 at 10:11 AM, Isan Fulia isan.fu
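For context on the wiki discussion above: in Solr 1.4-era SimpleFacetParameters, date faceting is driven by its own facet.date.* parameter family, separate from the field-value faceting parameters. A hedged sketch of the two request styles (field names and ranges are illustrative, not from the thread):

```text
# Field-value faceting, where facet.mincount applies:
...&facet=true&facet.field=category&facet.mincount=1

# Date faceting, configured through its own parameter family:
...&facet=true&facet.date=timestamp&facet.date.start=NOW/DAY-30DAYS
   &facet.date.end=NOW/DAY&facet.date.gap=%2B1DAY
```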
Re: Index Not Matching
Hello, Are you definitely positive your database isn't updated after you index your data? Are you querying against the same field(s), specifying the same criteria, both in Solr and in the database? Any chance you might be pointing to a dev/test instance of Solr? Regards, - Savvas On 3 February 2011 20:17, Esclusa, Will william.escl...@bonton.com wrote: Greetings! My organization is new to SOLR, so please bear with me. At times, we experience an out-of-sync condition between SOLR index files and our Database. We resolved that by clearing the index file and performing a full crawl of the database. Last time we noticed an out-of-sync condition, we went through our procedure of deleting and crawling, but this time it did not fix it. For example, search for swim on the DB and we get 440 products, yet SOLR states we have 214 products. Has anyone experienced anything like this? Does anyone have any suggestions on a trace we can turn on? Again, we are new to SOLR so any help you can provide is greatly appreciated. Thanks! Will
Re: Index Not Matching
that's odd..are you viewing the results through your application or the admin console? if you aren't, I'd suggest you use the admin console just to eliminate the possibility of an application bug. We had a similar problem in the past and it turned out to be a mix-up of our dev/test instances.. On 3 February 2011 21:41, Esclusa, Will william.escl...@bonton.com wrote: Hello Savvas, I am 100% sure we are not updating the DB after we index the data. We are specifying the same fields on both queries. Our prod boxes do not have access to QA or DEV, so I would expect a connection error when indexing if this were the case. No connection errors in the logs.
Re: Index Not Matching
which field type are you specifying in your schema.xml for the fields that you search upon? if you are using text then this causes your input text to be stemmed to a common root, making your searches more flexible. For instance: if you have the term dreaming in one row/document and the term dream in another, then both could be stemmed to dreami or something similar during indexing. This effectively causes both your documents to match when you search for dream in Solr, but you would only return 1 result if you searched directly in your database. On 3 February 2011 22:37, Geert-Jan Brits gbr...@gmail.com wrote: Make sure your index is completely committed. curl 'http://localhost:8983/solr/update?commit=true' http://wiki.apache.org/solr/UpdateXmlMessages#A.22commit.22_and_.22optimize.22 for an overview: http://lucene.apache.org/solr/tutorial.html hth, Geert-Jan http://techgurulive.com/2010/11/22/apache-solr-commit-and-optimize/ 2011/2/3 Esclusa, Will william.escl...@bonton.com Both the application and the SOLR gui match (with the incorrect number of course :-) ) At first I thought it could be a schema problem, but we went through it with a fine comb and compared it to the one in our stage environment. What is really weird is that I grabbed one of the product IDs that are not showing up in SOLR from the DB, searched for it through the SOLR GUI, and it found it.
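The stemming behaviour described earlier in this thread comes from the analysis chain of the text field type; a minimal sketch of such a chain (the filter choices are illustrative, not taken from Will's actual schema):

```xml
<fieldType name="text" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Porter stemming reduces "dreaming" and "dream" to a common root,
         so both documents match a search for "dream" -->
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>
```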
Re: Search for FirstName with first Char uppercase followed by * not giving result; getting result with all lowercase and *
Hi Mark, regarding "When I indexed *George* it was also finally analyzed and stored as *george*. Then why is it that I don't get a match as per the analysis report I had attached?": your indexed term is george but you search for George*, which does not go through the same analysis process as it did when it was indexed. So, since the terms you are searching for are not lowercased, you are trying to find something that starts with George (capital G), which doesn't exist in your index. If you are not hitting Solr directly, maybe you can lowercase your input text before feeding it to Solr? On 30 January 2011 16:38, Mark Fletcher mark.fletcher2...@gmail.com wrote: Hi Ahmet, Thanks for the reply. I had attached the Analysis report of the query George*. It is found to be split into terms *George** and *George* by the WordDelimiterFilterFactory, and the LowerCaseFilterFactory converts it to *george** and *george*. When I indexed *George* it was also finally analyzed and stored as *george*. Then why is it that I don't get a match as per the analysis report I had attached in my previous mail? Or am I missing something basic here? Many Thanks. M On Sun, Jan 30, 2011 at 4:34 AM, Ahmet Arslan iori...@yahoo.com wrote: :When i try george* I get results. Whereas George* fetches no results. Wildcard queries are not analyzed by QueryParser.
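A minimal sketch of the client-side lowercasing suggested above; the class and field names are hypothetical, but the idea is simply to mirror the index-time LowerCaseFilterFactory before appending the wildcard:

```java
import java.util.Locale;

public class WildcardHelper {
    // Wildcard terms bypass query-time analysis, so lowercase them
    // ourselves to match what LowerCaseFilterFactory stored in the index.
    static String prefixQuery(String field, String userInput) {
        return field + ":" + userInput.toLowerCase(Locale.ROOT) + "*";
    }

    public static void main(String[] args) {
        System.out.println(prefixQuery("firstName", "George")); // firstName:george*
    }
}
```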
Re: Searchers and Warmups
Hi David, maybe the wiki page on caching could be helpful: http://wiki.apache.org/solr/SolrCaching#newSearcher_and_firstSearcher_Event_Listeners Regards, - Savvas On 14 January 2011 00:08, David Cramer dcra...@gmail.com wrote: I'm trying to understand the mechanics behind warming up, when new searchers are registered, and their costs. A quick Google didn't point me in the right direction, so hoping for some of that here. -- David Cramer
Re: SolrJ Question about Bad Request Root cause error
good point! that's an enhancement we would definitely welcome as well. currently, we too have to remote desktop to the Solr machine and search through the logs.. Any thoughts? Cheers, -- Savvas On 11 January 2011 19:59, roz dev rozde...@gmail.com wrote: Hi All, We are using the SolrJ client (v 1.4.1) to integrate with our solr search server. We notice that whenever a SolrJ request does not match the Solr schema, we get a Bad Request exception, which makes sense. org.apache.solr.common.SolrException: Bad Request But the SolrJ client does not provide any clue about the reason the request is bad. Is there any way to get the root cause on the client side? Of course, the solr server logs have enough info to know that the data is bad, but it would be great to have the same info in the exception generated by SolrJ. Any thoughts? Is there any plan to add this in future releases? Thanks, Saroj
Re: [DIH] Example for SQL Server
Hi Adam, we are using DIH to index off an SQL Server database (the freebie SQLExpress one.. ;) ). We have defined the following in our %TOMCAT_HOME%\solr\conf\data-config.xml:

<dataConfig>
  <dataSource type="JdbcDataSource" name="mssqlDatasource"
              driver="net.sourceforge.jtds.jdbc.Driver"
              url="jdbc:jtds:sqlserver://{server.name}:{server.port}/{dbInstanceName};instance=SQLEXPRESS"
              convertType="true" user="{user.name}" password="{user.password}"/>
  <document>
    <entity name="id" dataSource="mssqlDatasource" query="your query here"/>
  </document>
</dataConfig>

We downloaded a JDBC driver from here http://jtds.sourceforge.net/faq.html and found it to be a quite stable driver. And the only thing we really had to do was drop that library in the %TOMCAT_HOME%\lib directory (for Tomcat 6+). Hope that helps. -- Savvas. On 14 December 2010 22:46, Erick Erickson erickerick...@gmail.com wrote: The config isn't really any different for various sql instances, about the only difference is the driver. Have you seen the example in the distribution somewhere like solr_home/example/example-DIH/solr/db/conf/db-data-config.xml? Also, there's a magic URL for debugging DIH at: .../solr/admin/dataimport.jsp If none of that is useful, could you post your attempt and maybe someone can offer some hints? Best Erick On Tue, Dec 14, 2010 at 5:32 PM, Adam Estrada estrada.adam.gro...@gmail.com wrote: Does anyone have an example config.xml file I can take a look at for SQL Server? I need to index a lot of data from a DB and can't seem to figure out the right syntax so any help would be greatly appreciated. What is the correct /jar file to use and where do I put it in order for it to work? Thanks, Adam
Re: Lower level filtering
It might not be practical in your case, but is it possible to get from that other system a list of ids the user is *not* allowed to see, and somehow invert the logic in the filter? Regards, -- Savvas. On 15 December 2010 14:49, Michael Owen michaelowe...@hotmail.com wrote: Hi all, I'm currently using Solr and I've got a question about filtering on a lower level than filter queries. We want to be able to restrict the documents that can possibly be returned for a user's query. From another system we'll get a list of document unique ids for the user, which is all the documents that they can possibly see (i.e. a base index list as such). The criteria for what document ids get returned is going to be quite flexible. As the number of ids can be up to index size - 1 (i.e. thousands), using a filter query doesn't seem right for entering a filter query which is so large. Can something be done at a lower level - perhaps at a Lucene level - as I understand Lucene starts from a bitset of possible documents it can return - could we AND this with a filter bitset returned from the other system? Would this be a good way forward? And then how would you do this in Solr while still keeping the extra functionality Solr brings over Lucene? A new SearchHandler? Thanks Mike
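The AND-the-bitsets idea from the question above, sketched with java.util.BitSet as a stand-in for Lucene's doc-id sets (the actual hook would be a custom Lucene Filter; this just illustrates the set arithmetic):

```java
import java.util.BitSet;

public class AclBitsetSketch {
    // Restrict the query's candidate docs to the externally supplied
    // allow-list by ANDing the two bitsets, as the thread suggests
    // doing at the Lucene level.
    static BitSet restrict(BitSet queryHits, BitSet allowed) {
        BitSet result = (BitSet) queryHits.clone();
        result.and(allowed);
        return result;
    }

    public static void main(String[] args) {
        BitSet queryHits = new BitSet();          // docs matched by the query
        queryHits.set(1); queryHits.set(3); queryHits.set(5);
        BitSet allowed = new BitSet();            // docs this user may see
        allowed.set(3); allowed.set(4); allowed.set(5);
        System.out.println(restrict(queryHits, allowed)); // {3, 5}
    }
}
```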
Re: How to get all the search results?
Hello, shouldn't that query syntax be *:* ? Regards, -- Savvas. On 6 December 2010 16:10, Solr User solr...@gmail.com wrote: Hi, First off thanks to the group for guiding me to move from default search handler to dismax. I have a question related to getting all the search results. In the past with the default search handler I was getting all the search results (8000) if I pass q=* as search string but with dismax I was getting only 16 results instead of 8000 results. How to get all the search results using dismax? Do I need to configure anything to make * (asterisk) work? Thanks, Solr User
Re: How to get all the search results?
ahhh, right..in dismax, you pre-define the fields that will be searched upon is that right? is it also true that the query is parsed and all special characters escaped? On 6 December 2010 16:25, Peter Karich peat...@yahoo.de wrote: for dismax just pass an empty query all q= or none at all Hello, shouldn't that query syntax be *:* ? Regards, -- Savvas. On 6 December 2010 16:10, Solr Usersolr...@gmail.com wrote: Hi, First off thanks to the group for guiding me to move from default search handler to dismax. I have a question related to getting all the search results. In the past with the default search handler I was getting all the search results (8000) if I pass q=* as search string but with dismax I was getting only 16 results instead of 8000 results. How to get all the search results using dismax? Do I need to configure anything to make * (asterisk) work? Thanks, Solr User -- http://jetwick.com twitter search prototype
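To make the advice above concrete: with dismax, an empty q typically falls back to the q.alt parameter, so a match-all needs q.alt=*:* supplied in the request or in the handler defaults. Illustrative URLs (host and handler are assumptions):

```text
# Standard query parser: explicit match-all query
http://localhost:8983/solr/select?q=*:*

# Dismax: omit q (or pass q=) and let q.alt provide the match-all fallback
http://localhost:8983/solr/select?defType=dismax&q=&q.alt=*:*
```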
Re: Troubles with forming query for solr.
Hello, would something along these lines: (field1:term AND field2:term AND field3:term)^2 OR (field1:term AND field2:term)^0.8 OR (field2:term AND field3:term)^0.5 work? You'll probably need to experiment with the boost values to get the desired result. Another option could be investigating the Dismax handler. On 1 December 2010 02:38, kolesman alekkolesni...@gmail.com wrote: Hi, I have some troubles with forming a query for solr. Here is my task: I'm indexing objects with 3 fields, for example {field1, field2, field3}. In solr's response I want to get objects in a specific order: 1. Firstly I want to get objects where all 3 fields are matched 2. Then I want to get objects where ONLY field1 and field2 are matched 3. And finally I want to get objects where ONLY field2 and field3 are matched. Could you explain how to form the query for my task? -- View this message in context: http://lucene.472066.n3.nabble.com/Troubles-with-forming-query-for-solr-tp1996630p1996630.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Joining Fields in and Index
Hi, If you are able to do a full re-index then you could index the full names and not the codes. When you later facet on the Country field you'll get the actual name rather than the code. If you are not able to re-index then probably this conversion could be added at your application layer prior to displaying your results.(e.g. in your DAO object) On 2 December 2010 22:05, Adam Estrada estrada.adam.gro...@gmail.comwrote: All, I have an index that has a field with country codes in it. I have 7 million or so documents in the index and when displaying facets the country codes don't mean a whole lot to me. Is there any way to add a field with the full country names then join the codes in there accordingly? I suppose I can do this before updating the records in the index but before I do that I would like to know if there is a way to do this sort of join. Example: US - United States Thanks, Adam
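A minimal sketch of the application-layer conversion mentioned above; the map contents and class name are illustrative:

```java
import java.util.HashMap;
import java.util.Map;

public class CountryCodes {
    private static final Map<String, String> CODE_TO_NAME = new HashMap<>();
    static {
        CODE_TO_NAME.put("US", "United States");
        CODE_TO_NAME.put("GB", "United Kingdom");
    }

    // Translate a faceted country code to a display name,
    // falling back to the raw code if it is unknown.
    static String displayName(String code) {
        return CODE_TO_NAME.getOrDefault(code, code);
    }

    public static void main(String[] args) {
        System.out.println(displayName("US")); // United States
    }
}
```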
Re: Boost on newer documents
hi, I might not understand your case right but can you not add an extra publishedDate field and then specify a secondary (after relevance) sort by that? On 30 November 2010 08:05, jan.kure...@nokia.com wrote: You could also put a short representation of the data (I suggest days since 01.01.2010) as payload and calculate boost with payload function of the similarity. -Original Message- From: ext Jason Brown [mailto:jason.br...@sjp.co.uk] Sent: Montag, 29. November 2010 17:28 To: solr-user@lucene.apache.org Subject: Boost on newer documents Hi, I use the dismax query to search across several fields. I find I have a lot of documents with the same document name (one of the fields that the dismax queries) so I wanted to adjust the relevance so that titles with a newer published date have a higher relevance than documents with the same title but are older. Does anyone know how I can achieve this? Thank You Jason. If you wish to view the St. James's Place email disclaimer, please use the link below http://www.sjp.co.uk/portal/internet/SJPemaildisclaimer
Re: Boost on newer documents
ahhh I see..good point..yes, for a high number of unique scores the secondary sort won't have any effect.. On 30 November 2010 09:32, Jason Brown jason.br...@sjp.co.uk wrote: Hi - you do understand my case - we tried what you suggested but as the relevancy is very precise we couldn't get it to do a dual-sort. I like the idea of using one of the dismax parameters (bf) to in effect increase the boost on a newer document. Thanks for all replies, most useful.
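The bf approach mentioned above is typically expressed as a reciprocal-of-age function query. A commonly cited recipe, assuming a trie-based date field named publishedDate (the field name is illustrative):

```text
# recip(x, m, a, b) = a / (m*x + b); here x = ms(NOW, publishedDate),
# the document's age in milliseconds, so newer docs get a larger boost.
defType=dismax&bf=recip(ms(NOW,publishedDate),3.16e-11,1,1)
```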
Bug in solrJ when retrieving results?
Hello, We think we may have come across a bug in solrj. The following is a toString() dump of a QueryResponse object we are getting: {responseHeader={status=0,QTime=0,params={sort=creationDate asc,start=30,q=songTitle:(mad dog) AND creationDate:[123750720 TO 123802559],wt=javabin,rows=10,version=1}},response={numFound=2,start=30,docs=[]}} Isn't there something wrong with the *numFound = 2 - docs=[]* part? We are using solr-solrj 1.4.1 and the related fields have been defined as:

<field name="songTitle" type="text" indexed="true" stored="false" required="false"/>
<field name="creationDate" type="tlong" indexed="true" stored="false" required="false"/>

Any thoughts or a workaround for that much appreciated. Cheers, -- Savvas
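One thing worth ruling out when reading the dump above: the request asks for start=30 while numFound=2, and the docs array only holds the requested window, so an empty list past the end of the matches is Solr's expected paging behaviour rather than necessarily a bug. A sketch of the windowing:

```java
import java.util.Collections;
import java.util.List;

public class PagingSketch {
    // Return the [start, start+rows) window of the full hit list, the way
    // Solr's start/rows parameters slice a result set of numFound matches.
    static List<String> page(List<String> hits, int start, int rows) {
        if (start >= hits.size()) {
            return Collections.emptyList(); // window begins past the last hit
        }
        return hits.subList(start, Math.min(start + rows, hits.size()));
    }

    public static void main(String[] args) {
        List<String> hits = List.of("doc1", "doc2"); // numFound = 2
        System.out.println(page(hits, 30, 10)); // []
    }
}
```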
Re: SOLR and secure content
Sounds like a good plan. I'd probably also set up multiple cores, one for each website. This could give you more accurate result scoring. Good question about the required configuration option.. any input? Although, on the other hand, this is a rule which seems to better fit in your application's Validation layer rather than Solr. On 23 November 2010 12:35, Jos Janssen j...@websdesign.nl wrote: Hi everyone, This is how we think we should set it up.

Situation:
- Multiple websites indexed on 1 solr server
- Results should be separated for each website
- Search results should be filtered on group access

Solution i think is possible with solr:
- Solr server should only be accessed through an API which we will write in PHP.
- Solr server authentication will be defined through IP address on the server side, and a username and password will be sent through the API for each different website.
- Extra document fields in Solr will contain:
  1. A website hash to identify and filter results for each different website (website authentication)
  2. A list of groups who can access the document (group authentication)

When making a query these fields should be required. Is it possible to configure handlers on the solr server so that these fields are required with each type of query? So for adding documents, deleting and querying? Am i correct? Any further advice is welcome. regard, Jos -- View this message in context: http://lucene.472066.n3.nabble.com/SOLR-and-secure-content-tp1945028p1953071.html Sent from the Solr - User mailing list archive at Nabble.com.
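At query time, the two extra fields described above would typically be applied as filter queries appended by the PHP layer; an illustrative request (the hash value and group names are made up):

```text
http://localhost:8983/solr/select?q=user+query
    &fq=websiteHash:3fa4b2
    &fq=accessGroups:(public OR staff)
```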
Re: SOLR and secure content
Hi, Could you elaborate a bit more on how you access Solr? are you making direct Solr calls or is the communication directed through an application layer? On 22 November 2010 11:05, Jos Janssen j...@websdesign.nl wrote: Hi, We are currently investigating how to set up a correct solr server for our goals. The problem i'm running into is how to design the solr setup so that we can check if a user is authenticated for viewing the document. Let me explain the situation. We have a website with some pages and documents which are accessible by everyone (public). We also have some sort of extranet; these pages and documents are not accessible for everyone. In this extranet we have different user groups. Access is defined by the user group. What i'm looking for is some sort of best practices to design/configure the solr setup for this situation. I searched the internet but couldn't find any examples or documentation for this situation. Maybe i'm not looking for the right documentation; that's why i post this message. Can someone give me some information on this? Regards, Jos -- View this message in context: http://lucene.472066.n3.nabble.com/SOLR-and-secure-content-tp1945028p1945028.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: SOLR and secure content
maybe this older thread on Modeling Access Control might help: http://lucene.472066.n3.nabble.com/Modelling-Access-Control-td1756817.html#a1761482 Regards, -- Savvas On 22 November 2010 18:53, Jos Janssen j...@websdesign.nl wrote: Hi, We plan to make an application layer in PHP which will communicate to the solr server. Direct calls will only be made for administration purposes only. regards, jos -- View this message in context: http://lucene.472066.n3.nabble.com/SOLR-and-secure-content-tp1945028p1947970.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Searching with acronyms
yes, a synonyms filter should allow you to achieve what you want. On 15 November 2010 03:14, sivaprasad sivaprasa...@echidnainc.com wrote: Hi, I have a requirement where a user enters acronym of a word, then the search results should come for the expandable word.Let us say. If the user enters 'TV', the search results should come for 'Television'. Is the synonyms filter is the way to achieve this? Any inputs. Regards, Siva -- View this message in context: http://lucene.472066.n3.nabble.com/Searching-with-acronyms-tp1902583p1902583.html Sent from the Solr - User mailing list archive at Nabble.com.
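A minimal sketch of the synonym setup, assuming a synonyms.txt entry and a SynonymFilterFactory in the analysis chain (the field type name is illustrative):

```xml
<!-- synonyms.txt contains the line:  TV, Television -->
<fieldType name="text_syn" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- expand="true" maps each entry to all its synonyms, so a query
         for TV also matches documents containing Television -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```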
Re: Search with accent
have you tried using a TokenFilter which removes accents both at indexing and searching time? If you index terms without accents and search the same way you should be able to find all documents as you require. On 10 November 2010 20:08, Claudio Devecchi cdevec...@gmail.com wrote: Tomas, Let me try to explain better. For example. - I have 10 documents, where 7 have the word pereque (without accent) and 3 have the word perequê (with accent) When I do a search pereque, solr is returning just 7, and when I do a search perequê solr is returning 3. But for me, these words are the same, and when I do some search for perequê or pereque, it should show me 10 results. About the ISOLatin you told, do you know how can I enable it? tks, Claudio On Wed, Nov 10, 2010 at 5:00 PM, Tomas Fernandez Lobbe tomasflo...@yahoo.com.ar wrote: I don't understand, when the user search for perequê you want the results for perequê and pereque? If thats the case, any field type with ISOLatin1AccentFilterFactory should work. The accent should be removed at index time and at query time (Make sure the filter is being applied on both cases). Tomás De: Claudio Devecchi cdevec...@gmail.com Para: Lista Solr solr-user@lucene.apache.org Enviado: miércoles, 10 de noviembre, 2010 15:16:24 Asunto: Search with accent Hi all, Somebody knows how can I config my solr to make searches with and without accents? for example: pereque and perequê When I do it I need the same result, but its not working. tks -- -- Claudio Devecchi flickr.com/cdevecchi
Re: Search with accent
have you tried using a TokenFilter which removes accents both at indexing and searching time? If you index terms without accents and search the same way you should be able to find all documents as you require. On 10 November 2010 20:25, Tomas Fernandez Lobbe tomasflo...@yahoo.com.ar wrote: It looks like ISOLatin1AccentFilter is deprecated in Solr 1.4.1. If you are on that version, you should use the ASCIIFoldingFilter instead. Like with any other filter, to use it, you have to add the filter factory to the analysis chain of the field type you are using: filter class=solr.ASCIIFoldingFilterFactory/ Make sure you add it to both the query and index analysis chains, otherwise you'll have strange results. You'll have to perform a full reindex. Tomás
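The ASCIIFoldingFilterFactory suggestion above, sketched as a complete field type with the folding filter on both the index and query chains (the type name is illustrative):

```xml
<fieldType name="text_folded" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/> <!-- perequê -> pereque -->
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
  </analyzer>
</fieldType>
```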
Re: Wildcard weirdness
One place to start would be the Analysis page: http://{your machine}:{port}/solr/admin/analysis.jsp?highlight=on There you can see exactly what happens to your query as it is moved down the Analysis chain. To my knowledge, no analysis is performed on wildcarded terms, so my guess would be that the analysis chain modifies (e.g. lowercases/stems) your terms before indexing them, and that's why you can't get a match. If, for instance, your indexed term is lowercased to o'connor and you are searching for O'Conno*, then Solr will look for any terms starting with O'Conno and *not* o'conno. But as mentioned above, the Analysis page is usually very helpful in situations like that. :) hope that helps On 5 November 2010 16:35, C0re blue-...@hotmail.co.uk wrote: Hi, I'm trying to understand what Solr is doing when a search for O'Connor and O'Conn* is done. The first search returns 4 results, which is fine. I would expect the second search to return at least 4 (the same) results, however it fails to return any. I've debugged the query and this is the output:

Debug for O'Connor:
<str name="rawquerystring">surname:O'Connor</str>
<str name="querystring">surname:O'Connor</str>
<str name="parsedquery">PhraseQuery(surname:"o connor")</str>
<str name="parsedquery_toString">surname:"o connor"</str>

Debug for O'Conn*:
<str name="rawquerystring">surname:O'Conno*</str>
<str name="querystring">surname:O'Conno*</str>
<str name="parsedquery">surname:O'Conno*</str>
<str name="parsedquery_toString">surname:O'Conno*</str>

So as you can see the queries are different, but I don't understand why Solr changes them the way it does? Also, searching for Conno* does work. Thanks, C. -- View this message in context: http://lucene.472066.n3.nabble.com/Wildcard-weirdness-tp1849362p1849362.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Wildcard weirdness
strange..my second guess would be that stemming could be the reason but if your analyser(s) emit the same values you use for searching that's odd.. could you post your schema definition for the surname field? On 5 November 2010 17:33, C0re blue-...@hotmail.co.uk wrote: Hi Savvas, Thanks for the reply. Yep I've been trying out the Analysis tool. As you say the index does lowercase the terms. Field Name: surname Index Value: O'Connor Query Value: connor The Index Analyzer creates: o connor Which the query value above will match on. However, if the query value is conno* then there is no match. -- View this message in context: http://lucene.472066.n3.nabble.com/Wildcard-weirdness-tp1849362p1849680.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: indexing '-
One way to view how your Tokenizers/Filters chain transforms your input terms is to use the analysis page of the Solr admin web application. This is very handy when troubleshooting issues related to how terms are indexed. On 31 October 2010 17:13, PeterKerk vettepa...@hotmail.com wrote: I already tried the normal string type, but that doesn't work either. I now use this:

<fieldType name="mytype" class="solr.TextField" sortMissingLast="true" omitNorms="true">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
  </analyzer>
</fieldType>

But that doesn't do it either...what else can I try? Thanks! -- View this message in context: http://lucene.472066.n3.nabble.com/indexing-tp1816969p1817298.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Commit/Optimise question
Thanks Erick. For the record, we are using 1.4.1 and SolrJ. On 31 October 2010 01:54, Erick Erickson erickerick...@gmail.com wrote: What version of Solr are you using? About committing: I'd just let the solr defaults handle that. You configure this in the autocommit section of solrconfig.xml. I'm pretty sure this gets triggered even if you're using SolrJ. That said, it's probably wise to issue a commit after all your data is indexed too, just to flush any remaining documents since the last autocommit. Optimize should not be issued until you're all done, if at all. If you're not deleting (or updating) documents, don't bother to optimize unless the number of files in your index directory gets really large. Recent Solr code almost removes the need to optimize unless you delete documents, but I confess I don't know the revision number recent refers to, perhaps only trunk... HTH Erick On Thu, Oct 28, 2010 at 9:56 AM, Savvas-Andreas Moysidis savvas.andreas.moysi...@googlemail.com wrote: Hello, We currently index our data through a SQL-DIH setup, but due to our model (and therefore sql query) becoming complex we need to index our data programmatically. As we didn't have to deal with commit/optimise before, we are now wondering whether there is an optimal approach to that. Is there a batch size after which we should fire a commit, or should we execute a commit after indexing all of our data? What about optimise? Our document corpus is 4m documents, and through DIH the resulting index is around 1.5G. We have searched previous posts but couldn't find a definite answer. Any input much appreciated! Regards, -- Savvas
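The autocommit section Erick refers to lives in solrconfig.xml; a sketch with illustrative thresholds (the numbers are assumptions, tune them to your indexing rate):

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxDocs>10000</maxDocs>  <!-- commit after this many pending docs -->
    <maxTime>60000</maxTime>  <!-- ...or after this many milliseconds -->
  </autoCommit>
</updateHandler>
```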
Re: Natural string sorting
I think string10 comes before string2 in lexicographic order, which is what Solr uses for string fields. On 29 October 2010 09:18, RL rl.subscri...@gmail.com wrote: Just a quick question about natural sorting of strings. I've a simple dynamic field in my schema:

<fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
<field name="nameSort_en" type="string" indexed="true" stored="false" omitNorms="true"/>

There are 3 indexed strings, for example string1, string2, string10. Executing a query and sorting by this field leads to unnatural sorting: string1, string10, string2. (Some time ago I used Lucene and I was pretty sure that Lucene used a natural sort, thus I expected the same from Solr.) Is there a way to sort in a natural order? Config option? Plugin? Expected output would be: string1, string2, string10. Thanks in advance. -- View this message in context: http://lucene.472066.n3.nabble.com/Natural-string-sorting-tp1791227p1791227.html Sent from the Solr - User mailing list archive at Nabble.com.
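For reference, two common workarounds are zero-padding the numeric part at index time (string002, string010, ...) so lexicographic order coincides with natural order, or sorting naturally on the client. A sketch of a client-side natural-sort key (an illustration, not a Solr feature):

```python
import re

def natural_key(s):
    """Split a string into text and integer runs so that numeric parts
    compare by value: 'string2' then sorts before 'string10'."""
    return [int(part) if part.isdigit() else part
            for part in re.split(r'(\d+)', s)]

names = ['string1', 'string10', 'string2']
sorted(names)                   # lexicographic: ['string1', 'string10', 'string2']
sorted(names, key=natural_key)  # natural: ['string1', 'string2', 'string10']
```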
Re: Stored or indexed?
In our case, we just store a database id and do a secondary db query when displaying the results. This is handy and leads to a more centralised architecture when you need to display properties of a domain object which you don't index/search. On 28 October 2010 05:02, kenf_nc ken.fos...@realestate.com wrote: Interesting wiki link, I hadn't seen that table before. And to answer your specific question about indexed=true, stored=false: this is most often done when you are using analyzers/tokenizers on your field. This field is for search only; you would never retrieve its contents for display. It may in fact be an amalgam of several fields into one 'content' field. You have your display copy stored in another field marked indexed=false, stored=true and optionally compressed. I also have simple string fields set to lowercase so searching is case-insensitive, and a duplicate field where the string is normal case. The first one is indexed/not stored, the second is stored/not indexed. -- View this message in context: http://lucene.472066.n3.nabble.com/Stored-or-indexed-tp1782805p1784315.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: How do I this in Solr?
If I get your question right, you probably want to use the AND binary operator, as in samsung AND android AND GPS or +samsung +android +GPS. On 26 October 2010 14:07, Varun Gupta varun.vgu...@gmail.com wrote: Hi, I have lots of small documents (each containing 1 to 15 words) indexed in Solr. For the search query, I want the search results to contain only those documents that satisfy this criterion: all of the words of the search result document are present in the search query. For example, if I have the following documents indexed: "nokia n95", "GPS", "android", "samsung", "samsung android", "nokia android", "mobile with GPS". If I search with the text "samsung android GPS", the search results should only contain "samsung", "GPS", "android" and "samsung android". Is there a way to do this in Solr? -- Thanks Varun Gupta
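Note the constraint being asked for is the reverse of what AND gives: AND requires every query term to appear in the document, whereas here every document term must appear in the query. One client-side workaround (an assumption on my part, not a built-in Solr feature) is to over-fetch candidates with an OR query and post-filter with a subset check:

```python
def matches(doc_text, query_text):
    """True when every word of the document appears in the query,
    i.e. the document's word set is a subset of the query's."""
    return set(doc_text.lower().split()) <= set(query_text.lower().split())

query = "samsung android GPS"
docs = ["nokia n95", "GPS", "android", "samsung",
        "samsung android", "nokia android", "mobile with GPS"]
[d for d in docs if matches(d, query)]
# ['GPS', 'android', 'samsung', 'samsung android']
```

This only works for short documents like the 1-to-15-word ones described; for large corpora an index-time approach (e.g. storing the document's term count and counting query-term hits) would be needed instead.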
Re: Modelling Access Control
Pushing ACL logic outside Solr sounds like a prudent choice indeed, as, in my opinion, all of the business rules/conceptual logic should reside only within the code boundaries. This way your domain will be easier to model and your code easier to read, understand and maintain. More information on Filter Queries, when they should be used and how they affect performance, can be found here: http://wiki.apache.org/solr/FilterQueryGuidance On 23 October 2010 20:00, Dennis Gearon gear...@sbcglobal.net wrote: Forgot to add: 3/ The external application code selects the GROUPS that the user has permission to read (Solr will only serve up what is to be read?) then search on those groups. Dennis Gearon Signature Warning It is always a good idea to learn from your own mistakes. It is usually a better idea to learn from others' mistakes, so you do not have to make them yourself. from ' http://blogs.techrepublic.com.com/security/?p=4501tag=nl.e036' EARTH has a Right To Life, otherwise we all die. --- On Sat, 10/23/10, Dennis Gearon gear...@sbcglobal.net wrote: From: Dennis Gearon gear...@sbcglobal.net Subject: Re: Modelling Access Control To: solr-user@lucene.apache.org Date: Saturday, October 23, 2010, 11:49 AM Two things will lessen the Solr administrative load: 1/ Follow the example of databases and *nix OSs. Give each user their own group, or set up groups that don't have regular users as OWNERS, but can have users assigned to the group to give them particular permissions. I.e. roles, like publishers, reviewers, friends, etc. 2/ Put your ACL outside of Solr, using your server-side/command-line language's object-oriented properties. Force all searches to come from a single location in code (not sure how to do that), and make that piece of code check authentication and authorization. This is what my research shows how others do it, and how I plan to do it. ANY insight others have on this, I really want to hear.
Dennis Gearon Signature Warning It is always a good idea to learn from your own mistakes. It is usually a better idea to learn from others’ mistakes, so you do not have to make them yourself. from ' http://blogs.techrepublic.com.com/security/?p=4501tag=nl.e036' EARTH has a Right To Life, otherwise we all die. --- On Sat, 10/23/10, Paul Carey paul.p.ca...@gmail.com wrote: From: Paul Carey paul.p.ca...@gmail.com Subject: Modelling Access Control To: solr-user@lucene.apache.org Date: Saturday, October 23, 2010, 1:03 AM Hi My domain model is made of users that have access to projects which are composed of items. I'm hoping to use Solr and would like to make sure that searches only return results for items that users have access to. I've looked over some of the older posts on this mailing list about access control and saw a suggestion along the lines of acl:user_id AND (actual query). While this obviously works, there are a couple of niggles. Every item must have a list of valid user ids (typically less than 100 in my case). Every time a collaborator is added to or removed from a project, I need to update every item in that project. This will typically be fewer than 1000 items, so I guess is no big deal. I wondered if the following might be a reasonable alternative, assuming the number of projects to which a user has access is lower than a certain bound. (acl:project_id OR acl:project_id OR ... ) AND (actual query) When the numbers are small - e.g. each user has access to ~20 projects and each project has ~20 collaborators - is one approach preferable over another? And when outliers exist - e.g. a project with 2000 collaborators, or a user with access to 2000 projects - is one approach more liable to fail than the other? Many thanks Paul
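For the project-based variant discussed above, the ACL clause is usually sent as an fq (filter query) parameter so it is cached independently of the main query. A hedged sketch of building that clause (the function name is hypothetical); note that Lucene's BooleanQuery rejects queries above its maxClauseCount (1024 by default), which is exactly where the user-with-2000-projects outlier would bite:

```python
def acl_filter(project_ids, max_clauses=1024):
    """Build a filter-query string like 'acl:(p1 OR p2 OR ...)' from the
    projects a user may access. Raises if the clause count would exceed
    Lucene's default BooleanQuery limit."""
    if len(project_ids) > max_clauses:
        raise ValueError("too many projects for a single fq clause")
    return "acl:(" + " OR ".join(str(p) for p in project_ids) + ")"

acl_filter(["p12", "p34"])  # 'acl:(p12 OR p34)'
```

The resulting string would be passed as fq alongside the user's actual query in q, keeping the access-control filter and the relevance query separate.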
Re: different results depending on result format
Strange... are you absolutely sure the two queries are directed to the same Solr instance? I'm running the same query from the admin page (which specifies the xml format) and I get the exact same results as SolrJ. On 21 October 2010 22:25, Mike Sokolov soko...@ifactory.com wrote: quick follow-up: I also notice that the query from SolrJ gets version=1, whereas the admin webapp puts version=2.2 on the query string, although this param doesn't seem to change the xml results at all. Does this indicate an older version of SolrJ perhaps? -Mike On 10/21/2010 04:47 PM, Mike Sokolov wrote: I'm experiencing something really weird: I get different results depending on whether I specify wt=javabin, and retrieve using SolrJ, or wt=xml. I spent quite a while staring at query params to make sure everything else is the same, and they do seem to be. At first I thought the problem related to the javabin format change that has been talked about recently, but I am using Solr 1.4.0 and SolrJ 1.4.0. Notice in the two entries that the wt param is different and the hits result count is different.

Oct 21, 2010 4:22:19 PM org.apache.solr.core.SolrCore execute
INFO: [bopp.ba] webapp=/solr path=/select/ params={wt=xml&rows=20&start=0&facet=true&facet.field=ref_taxid_ms&q=*:*&fl=uri,meta_ss&version=1} hits=261 status=0 QTime=1
Oct 21, 2010 4:22:28 PM org.apache.solr.core.SolrCore execute
INFO: [bopp.ba] webapp=/solr path=/select params={wt=javabin&rows=20&start=0&facet=true&facet.field=ref_taxid_ms&q=*:*&fl=uri,meta_ss&version=1} hits=57 status=0 QTime=0

The xml format results seem to be the correct ones. So one thought I had is that I could somehow fall back to using xml format in SolrJ, but I tried SolrQuery.set('wt','xml') and that didn't have the desired effect (I get 'wt=javabin&wt=javabin' in the log - i.e. the param is repeated, but still javabin). Am I crazy? Is this a known issue? Thanks for any suggestions
Re: query between two date
You'll have to supply your dates in a format Solr expects (e.g. 2010-10-19T08:29:43Z and not 2010-10-19). If you don't need millisecond granularity you can use the DateMath syntax to specify that. Please also check http://wiki.apache.org/solr/SolrQuerySyntax. On 17 October 2010 10:54, nedaha neda...@gmail.com wrote: Hi there, At first I have to explain the situation. I have 2 fields indexed, named tdm_avail1 and tdm_avail2, that are arrays of some different dates. This is a sample doc:

<arr name="tdm_avail1">
  <date>2010-10-21T08:29:43Z</date>
  <date>2010-10-22T08:29:43Z</date>
  <date>2010-10-25T08:29:43Z</date>
  <date>2010-10-26T08:29:43Z</date>
  <date>2010-10-27T08:29:43Z</date>
</arr>
<arr name="tdm_avail2">
  <date>2010-10-19T08:29:43Z</date>
  <date>2010-10-20T08:29:43Z</date>
  <date>2010-10-21T08:29:43Z</date>
  <date>2010-10-22T08:29:43Z</date>
</arr>

And in my search form I have 2 fields named check-in date and check-out date. I want Solr to compare the range that the user enters in the search form with the values of tdm_avail1 and tdm_avail2 and return the doc if all dates between the check-in and check-out dates match tdm_avail1 or tdm_avail2 values. For example, if the user enters check-in date: 2010-10-19 and check-out date: 2010-10-21, that matches tdm_avail2, so the doc must be returned. But if the user enters check-in date: 2010-10-25 and check-out date: 2010-10-29, the doc should not be returned. So I want the query that gives me the mentioned result. Could you help me please? Thanks in advance -- View this message in context: http://lucene.472066.n3.nabble.com/query-between-two-date-tp1718566p1718566.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: query between two date
OK, maybe I don't get this right... are you trying to match something like check-in date > 2010-10-19T00:00:00Z AND check-out date < 2010-10-21T00:00:00Z, *or* check-in date >= 2010-10-19T00:00:00Z AND check-out date <= 2010-10-21T00:00:00Z? On 18 October 2010 10:05, nedaha neda...@gmail.com wrote: Thanks for your reply. I know about the Solr date format!! Check-in and check-out dates are in the user-friendly format that we use in our search form for the system's users, and I change the format via code and then send them to Solr. I want to know how I can make a query to compare a range between the check-in and check-out date with the separate individual days that I have in the Solr index. For example: the check-in date is 2010-10-19T00:00:00Z and the check-out date is 2010-10-21T00:00:00Z. When I want to build a query from my application I have a date range, but in the Solr index I have separate dates. So how can I compare them to get the appropriate result? -- View this message in context: http://lucene.472066.n3.nabble.com/query-between-two-date-tp1718566p1723752.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: query between two date
OK, I see now... well, the only query that comes to mind is something like: check-in date:[2010-10-19T00:00:00Z TO *] AND check-out date:[* TO 2010-10-21T00:00:00Z]. Would something like that work? On 18 October 2010 11:04, nedaha neda...@gmail.com wrote: The exact query that I want is: check-in date >= 2010-10-19T00:00:00Z AND check-out date <= 2010-10-21T00:00:00Z, but because of the structure that I have to index, I don't have a specific start date and end date in my Solr index to compare with the check-in and check-out date range. I have some dates that are available to reserve! Could you please help me? :) -- View this message in context: http://lucene.472066.n3.nabble.com/query-between-two-date-tp1718566p1724062.html Sent from the Solr - User mailing list archive at Nabble.com.
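For reference, the semantics nedaha needs (every night of the stay present in the multivalued availability field) cannot be expressed with a single range query against separate date values, because a range query on a multivalued field matches as soon as any one value falls in the range. The required check, sketched client-side in Python (an illustration of the logic, not a Solr query):

```python
from datetime import date, timedelta

def fully_available(avail_dates, check_in, check_out):
    """True if every night from check_in up to (but not including)
    check_out appears in the availability list."""
    avail = set(avail_dates)
    d = check_in
    while d < check_out:
        if d not in avail:
            return False
        d += timedelta(days=1)
    return True

# tdm_avail2 from the sample doc in the thread:
avail2 = [date(2010, 10, 19), date(2010, 10, 20),
          date(2010, 10, 21), date(2010, 10, 22)]
fully_available(avail2, date(2010, 10, 19), date(2010, 10, 21))  # True
fully_available(avail2, date(2010, 10, 25), date(2010, 10, 29))  # False
```

A common index-time alternative is to collapse contiguous availability into explicit start/end range fields, which the range-query approach above can then test directly.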
Re: check if field CONTAINS a value, as opposed to IS of a value
Looks like you are not tokenizing your field properly. What does your schema.xml look like? On 14 October 2010 13:01, Allistair Crossley a...@roxxor.co.uk wrote: actually no you don't... if you want hi in a sentence of hi there this is me, this is just normal tokenizing and should work... check your field type/analysers On Oct 14, 2010, at 7:59 AM, Allistair Crossley wrote: i think you need to look at ngram tokenizing On Oct 14, 2010, at 7:55 AM, PeterKerk wrote: I try to determine if a certain word occurs within a field. http://localhost:8983/solr/db/select/?indent=on&facet=true&fl=id,title&q=introtext:hi This works if an EXACT match was found on the field introtext, i.e. the field value is just hi. But if the field value were hi there, this is just some text, the above URL no longer finds this record. What is the query parameter to ask Solr to look inside the introtext field for a value (and even better, also for synonyms)? -- View this message in context: http://lucene.472066.n3.nabble.com/check-if-field-CONTAINS-a-value-as-opposed-to-IS-of-a-value-tp1700495p1700495.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: check if field CONTAINS a value, as opposed to IS of a value
Verbatim from schema.xml:

<!-- The StrField type is not analyzed, but indexed/stored verbatim.
  - StrField and TextField support an optional compressThreshold which
    limits compression (if enabled in the derived fields) to values which
    exceed a certain size (in characters). -->

So basically what this means is that when you index Hello there mate, the only text that is indexed and therefore searchable is the exact phrase Hello there mate and *not* the terms Hello - there - mate. What you need is a solr.TextField based type which splits (tokenizes) your text. On 14 October 2010 14:07, PeterKerk vettepa...@hotmail.com wrote: This is the definition:

<fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
<field name="introtext" type="string" indexed="true" stored="true"/>

-- View this message in context: http://lucene.472066.n3.nabble.com/check-if-field-CONTAINS-a-value-as-opposed-to-IS-of-a-value-tp1700495p1700893.html Sent from the Solr - User mailing list archive at Nabble.com.
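The difference can be modelled crudely: a StrField (or a KeywordTokenizer) produces a single term per value, while a whitespace-tokenized TextField produces one term per word, and a term query only matches documents containing that exact term. A rough Python model of the two behaviours (deliberately ignoring lowercasing and other filters):

```python
def keyword_terms(text):
    # solr.StrField / KeywordTokenizerFactory: the whole value is one term
    return [text]

def whitespace_terms(text):
    # solr.TextField with a whitespace tokenizer: one term per word
    return text.split()

doc = "Hello there mate"
"Hello" in keyword_terms(doc)     # False: only 'Hello there mate' is a term
"Hello" in whitespace_terms(doc)  # True: 'Hello' is indexed on its own
```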
Re: check if field CONTAINS a value, as opposed to IS of a value
I think this should work. It might also be a good idea to investigate how exactly each filter in the chain modifies your original text... this way you will be able to better understand why certain queries match certain documents. On 14 October 2010 14:18, PeterKerk vettepa...@hotmail.com wrote: Correct, thanks! I have used the following:

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_dutch.txt"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_dutch.txt"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

-- View this message in context: http://lucene.472066.n3.nabble.com/check-if-field-CONTAINS-a-value-as-opposed-to-IS-of-a-value-tp1700495p1700945.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: check if field CONTAINS a value, as opposed to IS of a value
Yep, the Solr admin web app provides functionality that does exactly that... it can be reached at http://{serverName}:{serverPort}/solr/admin/analysis.jsp On 14 October 2010 14:28, PeterKerk vettepa...@hotmail.com wrote: It DOES work :) Oh, and on the filters... is there some sort of debug/overview tool to see what each filter does and what an input string looks like after going through a filter? -- View this message in context: http://lucene.472066.n3.nabble.com/check-if-field-CONTAINS-a-value-as-opposed-to-IS-of-a-value-tp1700495p1700997.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: check if field CONTAINS a value, as opposed to IS of a value
Correct, it shows the transformations that happen to your indexed term (or query term, if you use the *Field value (query)* box) after each Tokenizer/Filter is executed. On 14 October 2010 14:40, PeterKerk vettepa...@hotmail.com wrote: Awesome again! And for my understanding: I type a single word Boston and then I see 7 lines of output: Boston Boston Boston Boston boston boston boston. So each line represents what is done to the query value after it has passed through the filter? -- View this message in context: http://lucene.472066.n3.nabble.com/check-if-field-CONTAINS-a-value-as-opposed-to-IS-of-a-value-tp1700495p1701070.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Quoting special characters?
If I understand your problem right, what you probably need is to escape those characters: http://lucene.apache.org/java/2_9_1/queryparsersyntax.html#Escaping Special Characters On 14 October 2010 14:36, Igor Chudov ichu...@gmail.com wrote: Let's say that I submit a query for a MoreLikeThis search. The query contains special characters that Solr/Lucene interprets specially, such as the colon :. An example textual query is Solve a proportion X:2 = 4/5 and find X. (the context is the website algebra.com). My queries never intend those characters to be interpreted as anything other than their literal value. As a first shot, I simply replace them with a space, but I wonder if I would be better off, matching-wise, quoting those characters instead of removing them? If so, how do I quote such characters? Thanks! i
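The Lucene query parser docs linked above list the metacharacters as + - && || ! ( ) { } [ ] ^ " ~ * ? : \ and recommend backslash-escaping them. A small sketch of such an escaper (the function name is my own; SolrJ users may instead have a ready-made helper like ClientUtils.escapeQueryChars):

```python
import re

# Special characters from the Lucene 2.9 query parser syntax docs:
#   + - && || ! ( ) { } [ ] ^ " ~ * ? : \
_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\]|&&|\|\|)')

def escape_query(text):
    """Backslash-escape query-parser metacharacters so they match literally."""
    return _SPECIAL.sub(r'\\\1', text)

escape_query("Solve a proportion X:2 = 4/5 and find X")
# -> Solve a proportion X\:2 = 4/5 and find X
```

Escaping rather than stripping preserves the characters for matching, which is usually preferable for MoreLikeThis-style queries where the literal text matters.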
Re: Solr Fuzzy
Hi, yes, Solr does support fuzzy queries, using the Levenshtein distance algorithm: http://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance You can specify a fuzzy query by adding a tilde (~) symbol at the end of your query, as in title:Solr~ You can even specify a proximity threshold in order to achieve a more or less strict fuzzy match, as in title:Solr~0.8, with the threshold being a number between 0 and 1, 1 being the most strict. HTH On 14 October 2010 19:26, Claudio Devecchi cdevec...@gmail.com wrote: Hi people, Does somebody know if Solr has fuzzy functionality? Tks -- Claudio Devecchi
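For intuition, the edit distance underlying fuzzy matching is the classic dynamic-programming Levenshtein algorithm, sketched below. In Lucene of this era the ~0.8 threshold is compared against a similarity derived from this distance (roughly 1 - distance / length of the shorter term, though the exact formula is an implementation detail and worth checking against your version):

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance
    (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))          # distances from "" to prefixes of b
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # delete ca
                            curr[j - 1] + 1,             # insert cb
                            prev[j - 1] + (ca != cb)))   # substitute
        prev = curr
    return prev[-1]

levenshtein("solr", "sole")      # 1 (one substitution)
levenshtein("kitten", "sitting") # 3
```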
Re: SOLRJ - Searching text in all fields of a Bean
Hello, What does your schema look like? Have you defined a catch-all field and copied every value from all your other fields into it with a <copyField/> directive? Cheers, -- Savvas On 8 October 2010 08:30, Subhash Bhushan subhash.bhus...@stratalabs.in wrote: Hi, I have two fields in the bean class, id and title. After adding the bean to Solr, I want to search for, say, kitten in all defined fields in the bean, like this: query.setQuery("kitten"); But I get results only when I prefix the bean field name to the search text, like this: query.setQuery("title:kitten"); Same case even when I use SolrInputDocument and add these fields. Can we search text in all fields of a bean, without having to specify a field? If we can, what am I missing in my code?

*Code:*

Bean:

public class SOLRTitle {
    @Field
    public String id = "";
    @Field
    public String title = "";
}

Indexing function:

private static void uploadData() {
    try {
        ... // Get Titles
        List<SOLRTitle> solrTitles = new ArrayList<SOLRTitle>();
        Iterator<Title> it = titles.iterator();
        while (it.hasNext()) {
            Title title = (Title) it.next();
            SOLRTitle solrTitle = new SOLRTitle();
            solrTitle.id = title.getID().toString();
            solrTitle.title = title.getTitle();
            solrTitles.add(solrTitle);
        }
        server.addBeans(solrTitles);
        server.commit();
    } catch (SolrServerException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
}

Querying function:

private static void queryData() {
    try {
        SolrQuery query = new SolrQuery();
        query.setQuery("kitten");
        QueryResponse rsp = server.query(query);
        List<SOLRTitle> beans = rsp.getBeans(SOLRTitle.class);
        System.out.println(beans.size());
        Iterator<SOLRTitle> it = beans.iterator();
        while (it.hasNext()) {
            SOLRTitle solrTitle = (SOLRTitle) it.next();
            System.out.println(solrTitle.id);
            System.out.println(solrTitle.title);
        }
    } catch (SolrServerException e) {
        e.printStackTrace();
    }
}

-- Subhash Bhushan.
Re: Strange search result (or lack of)
Hello, Try searching for name_de:(das urteil). A search for name_de:das urteil will search for das in *name_de* and for urteil in the default field (e.g. the catch-all field). Hope that helps, -- Savvas On 8 October 2010 09:00, Thomas Kellerer spam_ea...@gmx.net wrote: Hi, I have the following field defined in my schema:

<fieldType name="name_field" class="solr.StrField" positionIncrementGap="100" omitNorms="false">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.ISOLatin1AccentFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
  </analyzer>
</fieldType>
<field name="name_de" type="name_field" indexed="true" stored="true"/>

The field contains the value Das Urteil, which is thus stored as das urteil. The following query (using Solr 1.4) returns nothing: name_de:das urteil But when I run the query name_de:"das urteil" the expected document is found. When I check this through the Analysis page of the Solr admin it does show me a match for the first query. I'm sure I'm missing something obvious. But what? Regards Thomas
Re: Umlaut in facet name attribute
Hello, It seems that your analysis process removes the diacritics and therefore indexes terms without them. What you see in the faceted result is the text exactly as it has been indexed. If you select a Tokenizer/Token Filter which preserves the diacritics you should be able to see what you want. Cheers, -- Savvas On 5 October 2010 20:25, alexander sulz a.s...@digiconcept.net wrote: Good evening and morning. I noticed that if I do a facet search on a field whose value contains umlauts (öäü), the facet list returned has converted the value of the field into normal characters (oau). How do I prevent this from happening? I can't seem to find the configuration for faceting in the schema or config xml files. thx alex
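Faceting returns raw indexed terms, so if an accent-folding filter (e.g. ISOLatin1AccentFilterFactory) is in the analysis chain, the folded form is what the facet list shows. A rough Python approximation of what such folding does, using Unicode decomposition (this mimics the general technique, not the filter's exact character table):

```python
import unicodedata

def fold_diacritics(s):
    """Decompose characters (NFD), then drop the combining marks,
    so 'ö' becomes 'o', 'ä' becomes 'a', and so on."""
    decomposed = unicodedata.normalize('NFD', s)
    return ''.join(c for c in decomposed if not unicodedata.combining(c))

fold_diacritics('öäü')  # 'oau'
```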
Re: Re: Umlaut in facet name attribute
Good point, so you could have an unanalyzed counterpart field populated with a <copyField/> directive and facet on that. On 5 October 2010 23:49, Markus Jelsma markus.jel...@buyways.nl wrote: It is good practice (for many cases, as seen on the list) to search (usually with fq) on analyzed fields but return the facet list based on the unanalyzed counterparts. -Original message- From: Savvas-Andreas Moysidis savvas.andreas.moysi...@googlemail.com Sent: Wed 06-10-2010 00:46 To: solr-user@lucene.apache.org Subject: Re: Umlaut in facet name attribute