Default Query Operator
Hi,

Presently OR is the default operator for search in Solr. E.g. if I search for two words separated by a space, abc xyz, it returns all the records that contain either abc or xyz or both, i.e. it executes the query as abc OR xyz. But my requirement is that it should return only those records which contain both abc and xyz, i.e. it should use AND as the default operator, as in abc AND xyz. Please suggest how I can do this.

Thanks,
Amit Garg

--
View this message in context: http://www.nabble.com/Default-Query-Operator-tp23477955p23477955.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Default Query Operator
Having <solrQueryParser defaultOperator="AND"/> in your schema.xml should address your requirements.

Cheers
Avlesh

On Mon, May 11, 2009 at 12:18 PM, dabboo ag...@sapient.com wrote: Hi, Presently OR is the default operator for search in Solr. [...]
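For reference, a minimal sketch of where that element sits in schema.xml (the surrounding elements shown here are from the stock example schema, so treat placement as an assumption):

```xml
<!-- schema.xml: make AND the default operator for the standard query parser -->
<defaultSearchField>text</defaultSearchField>
<solrQueryParser defaultOperator="AND"/>
```

Note this setting affects the standard (Lucene) query parser only; as the rest of this thread shows, dismax is controlled by the mm parameter instead.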
Re: Default Query Operator
Sorry, I forgot to mention in the problem that I am trying to do this with a dismax request. Without the dismax request it works fine, but not with the dismax request.

Avlesh Singh wrote: Having <solrQueryParser defaultOperator="AND"/> in your schema.xml should address your requirements. [...]

--
View this message in context: http://www.nabble.com/Default-Query-Operator-tp23477955p23478820.html
Sent from the Solr - User mailing list archive at Nabble.com.
I have bad performance when using elevate.xml
I have 10M documents (2.9GB). Without elevate.xml there is no problem, but after adding elevate.xml to SOLR_HOME/data and searching for a configured keyword, the system becomes very slow, all of the memory (2GB JVM heap) is used up soon, and the web container hangs.
Re: French and SpellingQueryConverter
Shalin Shekhar Mangar schrieb:

On Fri, May 8, 2009 at 2:14 AM, Jonathan Mamou ma...@il.ibm.com wrote: SpellingQueryConverter always splits words with special characters. I think that the issue is in the SpellingQueryConverter class: Pattern.compile("(?:(?!(\\w+:|\\d+)))\\w+"). According to http://java.sun.com/javase/6/docs/api/java/util/regex/Pattern.html, \w matches a word character: [a-zA-Z_0-9]. I think that special characters should also be added to the regex. Same issue for the GermanAnalyzer as for the FrenchAnalyzer.

http://wiki.apache.org/solr/SpellCheckComponent says: "The SpellingQueryConverter class does not deal properly with non-ASCII characters. In this case, you have either to use spellcheck.q, or to implement your own QueryConverter." If you use the spellcheck.q parameter for specifying the spelling query, then the field's analyzer will be used (in this case, FrenchAnalyzer). If you use the q parameter, then the SpellingQueryConverter is used.

Could you give an example of how the spellcheck.q parameter can be brought into play to take non-ASCII characters into account, so that Käse isn't mishandled, given the following example:

package org.apache.solr.spelling;

import org.apache.lucene.analysis.de.GermanAnalyzer;

public class GermanTest {
    public static void main(String[] args) {
        SpellingQueryConverter sqc = new SpellingQueryConverter();
        sqc.analyzer = new GermanAnalyzer();
        System.out.println(sqc.convert("Käse"));
    }
}

Note the result of the above, which is plain wrong, reads: [(k,0,1,type=ALPHANUM), (se,2,4,type=ALPHANUM)]

Thanks.

Michael Ludwig
Restarting tomcat deletes all Solr indexes
Hi, I'm facing a silly problem. Every time I restart Tomcat, all the indexes are lost. I used all the default configurations. I'm pretty sure there must be some basic change to fix this. I'd highly appreciate it if someone could direct me in fixing this. Thanks, KK.
Stop dataimport full-import
Hi Is it possible to stop a full-import from a dataimport handler and if so, how? If I stop the import or stop Jetty and restart it whilst the full-import is taking place, will it delete the indexed data? Thanks in Advance Andrew
Re: Stop dataimport full-import
You can abort a running import with command=abort. If you kill the Jetty in between, Lucene would commit the uncommitted docs.

On Mon, May 11, 2009 at 3:13 PM, Andrew McCombe eupe...@gmail.com wrote: Hi, Is it possible to stop a full-import from a dataimport handler and if so, how? [...]

--
-
Noble Paul | Principal Engineer | AOL | http://aol.com
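As a concrete sketch, assuming the DataImportHandler is registered at the default /dataimport path on a local instance (host, port and path are assumptions):

```text
# abort a running import
http://localhost:8983/solr/dataimport?command=abort

# check the importer state afterwards
http://localhost:8983/solr/dataimport?command=status
```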
Re: Stop dataimport full-import
Hi

Thanks. Found out the hard way that abort also removes the index :)

Regards
Andrew

2009/5/11 Noble Paul നോബിള് नोब्ळ् noble.p...@corp.aol.com: you can abort a running import with command=abort [...]
about boosting queries...
Hey there,

I would like to give a very low boost to the docs that match field_a = 54. I have tried <str name="bq">field_a:54^0.1</str> but it's not working. In the opposite case, I mean giving a high boost with <str name="bq">field_a:54^1</str>, it works perfectly. I suppose it is because I do the search across 6 fields and a summation is happening, so even if I am setting the boost to 0.1, the sum with the other fields' boosts makes the bq almost not take effect (and negative boosts are not allowed). Is that the reason? Any clue how I could reach my goal?

Thanks in advance

--
View this message in context: http://www.nabble.com/about-boosting-queries...-tp23484208p23484208.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: How to combine facets count from multiple query into one query
Hi,

Not sure if this is what you want, but would this do what you need?

fq={!tag=p1}publisher_name:publisher1&fq={!tag=p2}publisher_name:publisher2&q=abstract:philosophy&facet=true&facet.mincount=1&facet.field={!ex=p1 key=p2_book_title}book_title&facet.field={!ex=p2 key=p1_book_title}book_title

or separated by newlines instead of &s for readability:

fq={!tag=p1}publisher_name:publisher1
fq={!tag=p2}publisher_name:publisher2
q=abstract:philosophy
facet=true
facet.mincount=1
facet.field={!ex=p1 key=p2_book_title}book_title
facet.field={!ex=p2 key=p1_book_title}book_title

Of course, this uses a 1.4 feature (tagging and excluding).

Regards,
gwk

Jeffrey Tiong wrote:

Hi,

I have a schema that has the following fields: publisher_name, book_title, year, abstract. Currently if I do a facet count with the query q=abstract:philosophy AND publisher_name:publisher1, it can give me results like below:

<str name="q">abstract:philosophy AND publisher_name:publisher1</str>
<lst name="book_title">
  <int name="book1">70</int>
  <int name="book2">60</int>
  <int name="book3">20</int>
</lst>
<lst name="year">
  <int name="1990">78</int>
  <int name="1991">62</int>
  <int name="1992">19</int>
</lst>

Likewise for q=abstract:philosophy AND publisher_name:publisher2:

<str name="q">abstract:philosophy AND publisher_name:publisher2</str>
<lst name="book_title">
  <int name="book1">3</int>
  <int name="book2">1</int>
  <int name="book3">1</int>
</lst>
<lst name="year">
  <int name="1989">3</int>
  <int name="1990">1</int>
  <int name="1992">1</int>
</lst>

However, I have to run each query separately and get the facet counts for each of them separately. Is there a way for me to combine all these into one query and get the facet counts for each of them in one query? Sometimes it may take up to 20 queries to get all the separate counts.

Thanks!
Jef
Re: Default Query Operator
With dismax, to get all terms required, set mm (minimum match) to 100%.

Erik

On May 11, 2009, at 4:08 AM, dabboo wrote: Sorry to mention in the problem that I am trying to do this with dismax request. Without dismax request, it is working fine but not with dismax request. [...]
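A minimal sketch of where mm would go in solrconfig.xml; the handler name and the qf fields below are illustrative assumptions, not taken from this thread:

```xml
<!-- solrconfig.xml: a dismax handler that requires all query terms to match -->
<requestHandler name="dismax" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="qf">title^2 body</str>
    <str name="mm">100%</str> <!-- minimum match: every term required -->
  </lst>
</requestHandler>
```

The same parameter can also be supplied per request on the query string, e.g. &mm=100%25 (URL-encoded).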
Re: Default Query Operator
Hi,

I have already done this but still I am not getting any records. But if I remove the qt=dismaxrequest, then it works fine.

Erik Hatcher wrote: With dismax, to get all terms required, set mm (minimum match) to 100% [...]

--
View this message in context: http://www.nabble.com/Default-Query-Operator-tp23477955p23485608.html
Sent from the Solr - User mailing list archive at Nabble.com.
Selective Searches Based on User Identity
Can anybody point me in the direction of resources and/or projects regarding the following scenario?

I have a community of users contributing content to a Solr index. By default, the user (A) who contributes a document owns it, and can see the document in their search results. The owner can then grant selective access to that document to other users. If another user (B) is granted access by A, then the document shows up in B's search results, along with whatever B has contributed and any other documents to which B has been granted access. Conversely, if B is not granted access to the document, it does not show up in their search results.

I'm comfortable building this logic myself, so long as I'm not repeating the work of others in this area. Thanks, in advance, for any advice or information.

Terence
Concurrent run of snapshot scripts.
Hi Everyone,

I'm running Solr 1.3 and I was wondering if there's a problem with running the snapshot scripts concurrently. For instance, I have a cron job which performs a snappuller/snapinstaller every minute on my slave servers. Sometimes (for instance after an optimize), the snappuller can take more than one minute. Is it a problem if another snappuller is spawned whilst one older than one minute is still running?

Cheers!!

Jerome Eteve.

--
Jerome Eteve.
Chat with me live at http://www.eteve.net
jer...@eteve.net
Disable unique-key for Solr index
I have a case where I would like a Solr index created which disables the unique-key option. I've tried commenting out the uniqueKey option and that just spits out an error:

SEVERE: org.apache.solr.common.SolrException: QueryElevationComponent requires the schema to have a uniqueKeyField

I've tried something like this: <uniqueKey required="false"></uniqueKey>

Nothing seems to do the trick. The problem with a unique key is that the uniqueness of my results is actually based on all the fields in my document. There isn't one specific field which is unique. All the fields combined are unique though (they are taken directly from a view inside an RDBMS whose primary key is all of the columns).

Any help would be greatly appreciated!

Thanks,
Jeff

--
View this message in context: http://www.nabble.com/Disable-unique-key-for-Solr-index-tp23487249p23487249.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Disable unique-key for Solr index
Hi!

Is there any primary table in your view with a unique single key you could use?

J.

2009/5/11 jcott28 jcot...@yahoo.com: I have a case where I would like a solr index created which disables the unique-key option. [...]

--
Jerome Eteve.
Chat with me live at http://www.eteve.net
jer...@eteve.net
Re: Disable unique-key for Solr index
Man, I hadn't even thought of that! Now I feel like an idiot! Thanks!

Erik Hatcher wrote: If you're not using it, remove the QueryElevationComponent from solrconfig.xml [...]

--
View this message in context: http://www.nabble.com/Disable-unique-key-for-Solr-index-tp23487249p23488459.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Disable unique-key for Solr index
If you're not using it, remove the QueryElevationComponent from solrconfig.xml.

Erik

On May 11, 2009, at 1:15 PM, jcott28 wrote: I have a case where I would like a solr index created which disables the unique-key option. [...]
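Concretely, the pieces to comment out or delete in solrconfig.xml look roughly like this; the attribute values are taken from the stock example config, so treat them as assumptions about your setup:

```xml
<!-- solrconfig.xml: the elevation component... -->
<searchComponent name="elevator"
    class="org.apache.solr.handler.component.QueryElevationComponent">
  <str name="queryFieldType">string</str>
  <str name="config-file">elevate.xml</str>
</searchComponent>

<!-- ...and any request handler that references it -->
<requestHandler name="/elevate" class="solr.SearchHandler">
  <arr name="last-components">
    <str>elevator</str>
  </arr>
</requestHandler>
```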
Re: BoostedQuery Performance
After spending more time on this, it seems more likely a problem with FunctionQuery. If using boost = log(100) takes 100ms, then log(log(100)) adds another 100ms, log(log(log(100))) adds another 100ms, and so on. The time goes up almost linearly instead of staying constant. Any ideas?

Thanks,
Guangwei

On Sat, May 9, 2009 at 12:31 PM, Guangwei Yuan guy...@gmail.com wrote: Hi, I'm trying the BoostQParserPlugin and FunctionQuery to enable query-time boosting. It works better than bf (boost function) because it multiplies the relevancy score by the boosts. However I noticed significant performance issues with it. The more functions I use as boosts, the slower it gets. For example, if a query without boosts takes 300ms, adding a boost function like log(popularity) becomes 600ms, using pow(log(popularity),2) takes 800ms, using product(pow(log(score_total),2),recip(rord(days),1,90,90)) takes 1100ms. Ideally I'd like to pass in a reasonable number of functions to adjust the search ranking dynamically without sacrificing much performance. Thanks, Guangwei
how to manually add data to indexes generated by nutch-1.0 using solr
Hello,

I had Nutch 1.0 crawl, fetch and index a lot of files. Then I needed to index a few more files as well. I know the keywords for those files and their locations, and I need to add them manually. I took a look at two tutorials on the wiki, but did not find any info about this issue. Is there a tutorial on the step-by-step procedure of manually adding data to a Nutch index using Solr?

Thanks in advance.

Alex.
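Independent of Nutch, documents can be added to a Solr index by posting an XML update message to the update handler; a minimal sketch (the field names and the document values here are made up for illustration and must match your schema.xml):

```xml
<!-- POST this to http://localhost:8983/solr/update, then send <commit/> -->
<add>
  <doc>
    <field name="id">manual-doc-1</field>
    <field name="url">file:///data/report.pdf</field>
    <field name="content">keyword1 keyword2 keyword3</field>
  </doc>
</add>
```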
Re: French and SpellingQueryConverter
On Mon, May 11, 2009 at 2:46 PM, Michael Ludwig m...@as-guides.com wrote: Could you give an example of how the spellcheck.q parameter can be brought into play to take non-ASCII characters into account, so that Käse isn't mishandled [...]

You will need to set the correct tokenizer and filters for your field which can handle your language correctly. Look at the GermanAnalyzer in Lucene contrib-analysis. It uses StandardTokenizer, StandardFilter, LowerCaseFilter, StopFilter and GermanStemFilter with a custom stopword list. Use analysis.jsp on the admin page to see how queries on that field type are tokenized. Tweak until it works as desired. Once that is set up, you need to send all the spellcheck queries through the spellcheck.q parameter. The query-time analyzer for that field will be used by the spellchecker to analyze the query.

--
Regards,
Shalin Shekhar Mangar.
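A rough sketch of such a field type in schema.xml; the factory names are from stock Solr, while the stopword file name and the choice of the Snowball German stemmer (in place of GermanStemFilter) are assumptions:

```xml
<fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" words="stopwords_de.txt" ignoreCase="true"/>
    <filter class="solr.SnowballPorterFilterFactory" language="German"/>
  </analyzer>
</fieldType>
```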
Re: Stop dataimport full-import
On Mon, May 11, 2009 at 3:58 PM, Andrew McCombe eupe...@gmail.com wrote: Thanks. Found out the hard way that abort also removes the index :) I guess you were using 1.3? In the 1.3 release, abort stops the full-import and does not commit the data. However, due to Lucene's limitation, the data is committed when the servlet container is shutdown. Because a full-import starts off by issuing a delete-all query, all the data is lost. With Solr 1.4 (trunk), abort will roll back to the last commit. So it won't remove the documents automatically. -- Regards, Shalin Shekhar Mangar.
Re: BoostedQuery Performance
Please ignore my posts. Log is quite an expensive operation...

On Mon, May 11, 2009 at 11:45 AM, Guangwei Yuan guy...@gmail.com wrote: After spending more time on this, it seems more likely a problem from FunctionQuery. [...]
Re: Control segment size
Shalin, Here is what I've read on maxMergeDocs, While merging segments, Lucene will ensure that no segment with more than maxMergeDocs is created. Wouldn't that mean that no index file should contain more than max docs? I guess the index files could also just contain the index information which is not limited by any property - is that true? Is there any work around to limit the index size, beside limiting the index itself? Thanks, -vivek On Fri, May 8, 2009 at 10:02 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: On Fri, May 8, 2009 at 1:30 AM, vivek sar vivex...@gmail.com wrote: I did set the maxMergeDocs to 10M, but I still see couple of index files over 30G which do not match with max number of documents. Here are some numbers, 1) My total index size = 66GB 2) Number of total documents = 200M 3) 1M doc = 300MB 4) 10M doc should be roughly around 3-4GB. As you can see couple of files are huge. Are those documents or index files? How can I control the file size so no single file grows more than 10GB. No, there is no way to limit an individual file to a specific size. -- Regards, Shalin Shekhar Mangar.
Providing fault tolerance with Solr
Hi,

I want to make my system fault tolerant. My system has two shards, each with one master and two slaves. So if any of the slaves or the master fails, I want my system to continue working. Are there any known solutions to this? Does Solr provide any such functionality as yet?

Thanx.

--
View this message in context: http://www.nabble.com/Providing-fault-tolerance-with-Solr-tp23492418p23492418.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Selective Searches Based on User Identity
Why can't you simply index a field authorized-to with value user-B and enrich any query you receive from a user with a mandatory query for that authorization?

paul

Le 11-mai-09 à 17:50, Terence Gannon a écrit : Can anybody point me in the direction of resources and/or projects regarding the following scenario [...]
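As a sketch of that approach (the field name and user ids below are made up for illustration): each document stores the users allowed to see it in a multi-valued field, and the application, never the end user, appends a filter query for the requesting user:

```text
# at index time, each document carries its grants:
#   authorized_users: userA, userB

# at query time, the application appends a mandatory filter:
q=some+search+terms&fq=authorized_users:userB
```

Using fq rather than a clause in q keeps the authorization constraint out of relevance scoring and lets Solr cache the per-user filter.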
Re: I have bad performance when using elevate.xml
On Mon, May 11, 2009 at 4:55 AM, ant dormant.m...@gmail.com wrote: I have 10M document, 2.9GB [...]

How many keywords and documents do you list in elevate.xml?

-Yonik
http://www.lucidimagination.com
Re: Control segment size
On Tue, May 12, 2009 at 2:30 AM, vivek sar vivex...@gmail.com wrote: Here is what I've read on maxMergeDocs, While merging segments, Lucene will ensure that no segment with more than maxMergeDocs is created. Wouldn't that mean that no index file should contain more than max docs? I guess the index files could also just contain the index information which is not limited by any property - is that true? Yes, an individual segment will not contain more than maxMergeDocs number of documents. But the size of the segment may still vary because some documents may have more unique tokens than others. What you saw originally must have been a segment merge which is normal and happens in the course of indexing. I don't think there's a way to avoid that other than to have a ridiculously high mergeFactor (which will affect search performance). -- Regards, Shalin Shekhar Mangar.
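For reference, both knobs live in the mainIndex (or indexDefaults) section of solrconfig.xml; the values below are illustrative, not recommendations:

```xml
<mainIndex>
  <!-- how many segments accumulate before a merge is triggered -->
  <mergeFactor>10</mergeFactor>
  <!-- cap on the number of documents in any merged segment;
       note this bounds doc count, not the segment's size on disk -->
  <maxMergeDocs>10000000</maxMergeDocs>
</mainIndex>
```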
Re: Providing fault tolerance with Solr
Fault tolerance is achieved using external load balancing. You can use an external hardware load balancer, or a simple one like http://wiki.apache.org/solr/LBHttpSolrServer for Java or http://code.google.com/p/solr-php-client/ for PHP.

On Tue, May 12, 2009 at 3:38 AM, mirage1987 mirage1...@gmail.com wrote: Hi, I want to make my system fault tolerant. My system has two shards each with one master and two slaves. [...]

--
-
Noble Paul | Principal Engineer | AOL | http://aol.com