Re: Issue with getting highlight with hl.maxAnalyzedChars = -1
You didn't say what exactly is going wrong. On Fri, May 10, 2013 at 2:19 PM, meghana meghana.rav...@amultek.com wrote: I am facing one weird issue when setting hl.maxAnalyzedChars to -1: fetching highlights fails for some random records, while for other records it works fine. Below is my Solr query: http://localhost:8080/solr/core0/select?q=(text:new year) AND (id:2343287)&hl=on&hl.fl=text&hl.fragsize=500&hl.maxAnalyzedChars=-1 If I remove hl.maxAnalyzedChars=-1 from the above query, or set it to some positive value (higher than the text field length), then it returns the record with a proper highlight. But my text field is very long and I don't want to limit the analyzed length, so I need to set hl.maxAnalyzedChars to -1. Please help me to solve this. -- View this message in context: http://lucene.472066.n3.nabble.com/Issue-with-getting-highlight-with-hl-maxAnalyzedChars-1-tp4062269.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Negative Boosting at Recent Versions of Solr?
On Fri, 2013-05-10 at 16:49 +0200, Jason Hellman wrote: -23.0 = product(float(price)=11.5,const(-2)) I wonder how fantastically this can be abused now? Mmm... products of negative scores. I foresee "The products matching an uneven number of search terms get much higher scores. Why?" questions in the future. - Toke Eskildsen, State and University Library, Denmark
Re: Unable to load environment info from /solr/collection1/admin/system?wt=json
I have tried to open that URL: ip:8983/solr/ I get that error: INFO: [collection1] CLOSING SolrCore org.apache.solr.core.SolrCore@62ad1b5c May 13, 2013 10:38:40 AM org.apache.solr.update.DirectUpdateHandler2 close INFO: closing DirectUpdateHandler2{commits=0,autocommit maxTime=15000ms,autocommits=0,soft autocommits=0,optimizes=0,rollbacks=0,expungeDeletes=0,docsPending=0,adds=0,deletesById=0,deletesByQuery=0,errors=0,cumulative_adds=0,cumulative_deletesById=0,cumulative_deletesByQuery=0,cumulative_errors=0} May 13, 2013 10:38:40 AM org.apache.solr.core.SolrCore closeSearcher INFO: [collection1] Closing main searcher on request. May 13, 2013 10:38:40 AM org.apache.solr.common.SolrException log SEVERE: java.lang.RuntimeException: Already closed at org.apache.solr.core.CachingDirectoryFactory.get(CachingDirectoryFactory.java:336) at org.apache.solr.core.CachingDirectoryFactory.get(CachingDirectoryFactory.java:321) at org.apache.solr.core.SolrCore.getNewIndexDir(SolrCore.java:244) at org.apache.solr.core.SolrCore.getIndexDir(SolrCore.java:223) at org.apache.solr.handler.admin.SystemInfoHandler.getCoreInfo(SystemInfoHandler.java:112) at org.apache.solr.handler.admin.SystemInfoHandler.handleRequestBody(SystemInfoHandler.java:78) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1812) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:639) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1307) at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:453) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137) at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:560) at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1072) at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:382) at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1006) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135) at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255) at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116) at org.eclipse.jetty.server.Server.handle(Server.java:365) at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:485) at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53) at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:926) at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:988) at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:635) at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235) at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72) at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264) at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608) at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543) at java.lang.Thread.run(Thread.java:722) May 13, 2013 10:38:40 AM org.apache.solr.core.SolrCore execute INFO: [collection1] webapp=/solr path=/admin/system params={_=136843072wt=json} status=500 QTime=1 May 13, 2013 10:38:40 AM org.apache.solr.common.SolrException log SEVERE: null:java.lang.RuntimeException: Already closed at org.apache.solr.core.CachingDirectoryFactory.get(CachingDirectoryFactory.java:336) at org.apache.solr.core.CachingDirectoryFactory.get(CachingDirectoryFactory.java:321) at org.apache.solr.core.SolrCore.getNewIndexDir(SolrCore.java:244) at org.apache.solr.core.SolrCore.getIndexDir(SolrCore.java:223) at org.apache.solr.handler.admin.SystemInfoHandler.getCoreInfo(SystemInfoHandler.java:112) at org.apache.solr.handler.admin.SystemInfoHandler.handleRequestBody(SystemInfoHandler.java:78) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1812) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:639) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141) at
Re: MultiValue
Hi All, I managed to *solve* the issue I had posted earlier with respect to multiValued. Here is how the query is supposed to be configured in *data-config.xml*. Description: in the configuration below, the first query has an associated table, images. Each person can have many images, and the JSON/XML response returns all the images associated with the person in one block.

<document name="manju">
  <entity name="list" dataSource="manjudb" query="select member_id,MemberName FROM member">
    <entity name="imagelist" dataSource="manjudb" query="SELECT imagepath FROM images WHERE member_id='${list.member_id}'">
    </entity>
  </entity>
</document>

and *schema.xml* for the above query fields looks like:

<field name="member_id" type="int" indexed="true" stored="true" required="true"/>
<field name="MemberName" type="string" indexed="true" stored="true"/>
<field name="imagepath" type="string" indexed="true" stored="true" multiValued="true"/>

Output: response: { numFound: 3, start: 0, docs: [ { MemberName: Vettel, member_id: 1, _version_: 1434904021528739800 }, { MemberName: Schumacher, member_id: 2, imagepath: [ // as you can see, the three rows from the *images* table are returned as one JSON array c:images\\etc\\test.jpg, c:images\\etc\\test211.jpg, c:images\\etc\\test2343434.jpg, C:manju ], _version_: 1434904021541322800 }, { MemberName: J.Button, member_id: 3, _version_: 143490402154342 } ] } } Thanks -- View this message in context: http://lucene.472066.n3.nabble.com/MultiValue-tp4034305p4062863.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: multiValued schema example (SOLVED)
Hi All, I managed to *solve* the issue I had posted earlier with respect to multiValued. Here is how the query is supposed to be configured in *data-config.xml*. Description: in the configuration below, the first query has an associated table, images. Each person can have many images, and the JSON/XML response returns all the images associated with the person in one block.

<document name="manju">
  <entity name="list" dataSource="manjudb" query="select member_id,MemberName FROM member">
    <entity name="imagelist" dataSource="manjudb" query="SELECT imagepath FROM images WHERE member_id='${list.member_id}'">
    </entity>
  </entity>
</document>

and *schema.xml* for the above query fields looks like:

<field name="member_id" type="int" indexed="true" stored="true" required="true"/>
<field name="MemberName" type="string" indexed="true" stored="true"/>
<field name="imagepath" type="string" indexed="true" stored="true" multiValued="true"/>

Output: response: { numFound: 3, start: 0, docs: [ { MemberName: Vettel, member_id: 1, _version_: 1434904021528739800 }, { MemberName: Schumacher, member_id: 2, imagepath: [ // as you can see, the three rows from the *images* table are returned as one JSON array c:images\\etc\\test.jpg, c:images\\etc\\test211.jpg, c:images\\etc\\test2343434.jpg, C:manju ], _version_: 1434904021541322800 }, { MemberName: J.Button, member_id: 3, _version_: 143490402154342 } ] } } Thanks -- View this message in context: http://lucene.472066.n3.nabble.com/multiValued-schema-example-SOLVED-tp4062209p4062864.html Sent from the Solr - User mailing list archive at Nabble.com.
Best way to design a story and comments schema.
Hi, I wish to know how to best design a schema to store comments on stories / articles posted. I have a set of fields:

<field name="subject" type="text_general" indexed="true" stored="true"/>
<field name="keywords" type="text_general" indexed="true" stored="true"/>
<field name="category" type="text_general" indexed="true" stored="true"/>
<field name="content" type="text_general" indexed="false" stored="true"/>

Users can post their comments on a post and I should be able to retrieve these comments and show them alongside the original post. I only need to show the last 3 comments and show a facet of the remaining comments which the user can click to see the rest of the comments (something like Facebook does). One alternative I could think of was adding a dynamic field for all comments:

<dynamicField name="comment_*" type="string" indexed="false" stored="true"/>

So, to store each comment, I would send text to Solr of the form - Field Name: comment_n, Value: [Commenter Name]:[Commenter ID]:[Actual Comment Text]. And to keep the count of those comments, I could use another field like so:

<field name="comment_count" type="int" indexed="true" stored="true"/>

With this approach, I will have to do some recalculation when a comment is deleted by the user, but I can still manage to show the comments correctly. My idea is to find the best solution for this scenario which will be fast and also simple. Kindly suggest. -- View this message in context: http://lucene.472066.n3.nabble.com/Best-way-to-design-a-story-and-comments-schema-tp4062867.html Sent from the Solr - User mailing list archive at Nabble.com.
Solr Sorting Algorithm
Good morning all, alphabetical sorting is causing a slight issue, as below. I have 3 documents with these title values: 1) Acer Palmatum (Tree) 2) Aceraceae (Tree Family) 3) Acer Pseudoplatanus (Tree). I have created a title_sort field defined with the field type alphaNumericalSort (that comes with the Solr example schema). When I apply the sort order (sort=title_sort asc), I get the results as:
Aceraceae (Tree Family)
Acer Palmatum (Tree)
Acer Pseudoplatanus (Tree)
But the expected order is (spaces first):
Acer Palmatum (Tree)
Acer Pseudoplatanus (Tree)
Aceraceae (Tree Family)
My unit test uses the Collections.sort method and I get the expected results, but I'm not sure why Solr is doing it a different way. From the Collections.sort API I can see that it uses a modified merge sort; could you tell me which algorithm Solr follows for its sorting logic, and also whether there is any other approach I can take? Many thanks, Sandeep
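For comparison, the plain-string ordering the unit test produces can be reproduced outside Solr with a few lines (a minimal sketch using only the three titles above, no Solr involved); a space (0x20) compares lower than any letter, which is why "Acer " sorts ahead of "Acera...":

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class TitleSortCheck {
    public static void main(String[] args) {
        List<String> titles = new ArrayList<String>(Arrays.asList(
                "Aceraceae (Tree Family)",
                "Acer Palmatum (Tree)",
                "Acer Pseudoplatanus (Tree)"));
        // String.compareTo is a plain char-by-char comparison, so the space
        // after "Acer" (0x20) sorts before the 'a' in "Aceraceae".
        Collections.sort(titles);
        for (String t : titles) {
            System.out.println(t);
        }
    }
}

Whether Solr reproduces that order depends on what the sort field's analysis chain does with whitespace and punctuation; a sort field type that strips non-alphanumeric characters before sorting will push "Acer Palmatum" after "Aceraceae".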
Mandatory words search in SOLR
Hi SOLR experts, when I search documents with the keywords *java, mysql*, I get the documents containing either *java* or *mysql* or both. Is it possible to get only the documents that contain both *java* and *mysql*? In that case, what would the query look like? Thanks a lot, Kamal
Re: Mandatory words search in SOLR
Hello! Change the default query operator. For example add the q.op=AND to your query. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Hi SOLR Experts When I search documents with keyword as *java, mysql* then I get the documents containing either *java* or *mysql* or both. Is it possible to get the documents those contains both *java* and *mysql*. In that case, how the query would look like. Thanks a lot Kamal
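For example, something like http://localhost:8983/solr/select?q=java mysql&q.op=AND (or, equivalently, q=+java +mysql) should return only documents containing both terms; adjust the host, core name and default search field to your setup.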
CJK question
A question about CJK, how will U+3000 be handled? U+3000 belongs to CJK Symbols and Punctuation and is named IDEOGRAPHIC SPACE. Is it wrong if I just map it to U+0020 (SPACE)? What is CJK Analyzer doing with U+3000? If two CJK words have U+3000 inside, does it mean these two CJK words belong together and changing U+3000 to U+0020 will break the meaning of the whole CJK word? Actually I have no idea about CJK. Any help welcome. Bernd
RE: CJK question
Hi, It uses the StandardAnalyzer which does split on IDEOGRAPHIC SPACE. Cheers, Markus -Original message- From:Bernd Fehling bernd.fehl...@uni-bielefeld.de Sent: Mon 13-May-2013 13:36 To: solr-user@lucene.apache.org Subject: CJK question A question about CJK, how will U+3000 be handled? U+3000 belongs to CJK Symbols and Punctuation and is named IDEOGRAPHIC SPACE. Is it wrong if I just map it to U+0020 (SPACE)? What is CJK Analyzer doing with U+3000? If two CJK words have U+3000 inside, does it mean these two CJK words belong together and changing U+3000 to U+0020 will break the meaning of the whole CJK word? Actually I have no idea about CJK. Any help welcome. Bernd
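If you want to see for yourself what the analyzer does with U+3000, a small throwaway program is enough (a rough sketch, assuming the Lucene 4.x core and analyzers-common jars are on the classpath; the sample string and field name are arbitrary):

import java.io.StringReader;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.cjk.CJKAnalyzer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.util.Version;

public class IdeographicSpaceCheck {
    public static void main(String[] args) throws Exception {
        CJKAnalyzer analyzer = new CJKAnalyzer(Version.LUCENE_43);
        // Two CJK fragments separated by U+3000 (IDEOGRAPHIC SPACE).
        String text = "\u65E5\u672C\u3000\u8A9E";
        TokenStream ts = analyzer.tokenStream("text", new StringReader(text));
        CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
        ts.reset();
        while (ts.incrementToken()) {
            System.out.println(term.toString());
        }
        ts.end();
        ts.close();
        analyzer.close();
    }
}

If U+3000 already acts as a token break there, mapping it to U+0020 with a char filter before tokenization should not change the resulting tokens.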
Re: Mandatory words search in SOLR
Hi Rafał Kuć, I added q.op=AND as you suggested. Though some of the initial documents contain both keywords (*java* and *mysql*), towards the end I still see a number of documents that have only one keyword, either *java* or *mysql*. Is this the SOLR behaviour, or can I ask for a *strict search that fetches a record only if all my keywords are present*? BR, Kamal On Mon, May 13, 2013 at 4:02 PM, Rafał Kuć r@solr.pl wrote: Hello! Change the default query operator. For example add the q.op=AND to your query. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Hi SOLR Experts When I search documents with keyword as *java, mysql* then I get the documents containing either *java* or *mysql* or both. Is it possible to get the documents those contains both *java* and *mysql*. In that case, how the query would look like. Thanks a lot Kamal
Re: Mandatory words search in SOLR
Kamal You could also use the 'mm' parameter to require a minimum match, or you could prepend '+' to each required term. Cheers François On May 13, 2013, at 7:57 AM, Kamal Palei palei.ka...@gmail.com wrote: Hi Rafał Kuć I added q.op=AND as per you suggested. I see though some initial record document contains both keywords (*java* and *mysql*), towards end I see still there are number of documents, they have only one key word either *java* or *mysql*. Is it the SOLR behaviour or can I ask for a *strict search only if all my keywords are present, then only* *fetch record* else not. BR, Kamal On Mon, May 13, 2013 at 4:02 PM, Rafał Kuć r@solr.pl wrote: Hello! Change the default query operator. For example add the q.op=AND to your query. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Hi SOLR Experts When I search documents with keyword as *java, mysql* then I get the documents containing either *java* or *mysql* or both. Is it possible to get the documents those contains both *java* and *mysql*. In that case, how the query would look like. Thanks a lot Kamal
Re: Log Monitor System for SolrCloud and Logging to log4j at SolrCloud?
Sorry but do you mean that I can use log4j with Solr 4.2.1? 2013/5/6 Steve Rowe sar...@gmail.com Done - see http://markmail.org/message/66vpwk42ih6uxps7 On May 6, 2013, at 5:29 AM, Furkan KAMACI furkankam...@gmail.com wrote: Is there any road map for Solr when will Solr 4.3 be tagged at svn? 2013/4/26 Mark Miller markrmil...@gmail.com Slf4j is meant to work with existing frameworks - you can set it up to work with log4j, and Solr will use log4j by default in the about to be released 4.3. http://wiki.apache.org/solr/SolrLogging - Mark On Apr 26, 2013, at 7:19 AM, Furkan KAMACI furkankam...@gmail.com wrote: I want to use GrayLog2 to monitor my logging files for SolrCloud. However I think that GrayLog2 works with log4j and logback. Solr uses slf4j. How can I solve this problem and what logging monitoring system does folks use?
Solr Licensing (Sizzle)
In the source code of Apache Solr 4.2.0 there is an unclear license reference in * \solr-4.2.0\solr\webapp\web\js\lib\jquery-1.7.2.min.js and * \solr-4.2.0\solr\webapp\web\js\require.jstxt Can you please tell me what kind of license this refers to exactly: "* Sizzle CSS Selector Engine * Copyright 2011, The Dojo Foundation * Released under the MIT, BSD, and GPL Licenses. * More information: http://sizzlejs.com/" "* Includes Sizzle.js * http://sizzlejs.com/ * Copyright 2010, The Dojo Foundation * Released under the MIT, BSD, and GPL Licenses." The Dojo Foundation says it's not their business anymore. 1. Which version of GPL, which clause, and which copyright of BSD and MIT? 2. Is there a choice here? MIT or BSD or GPL? Or do all apply at the same time, hence the word "and"? 3. I cannot find Sizzle in the Solr distribution at all. Is it really included? Thank you in advance Peter Polhodzik Üdvözlettel / Best Regards, Péter POLHODZIK License Clearing Specialist Phone: +36 (46) 5-17894 Fax: +36 (46) 5-17801 peter.polhodzik@evosoft.com evosoft Hungary Kft. Arany János tér 1. H-3508 Miskolc www.evosoft.com
maximum number of simultaneous threads
I am seeing the following in solrconfig.xml. Is it possible to specify a max number of threads for query time too? -- View this message in context: http://lucene.472066.n3.nabble.com/maximum-number-of-simultaneous-threads-tp4062903.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Mandatory words search in SOLR
Hi François Thanks for input. The major problem I face is , I make use of Drupal (as a framework) and apachesolr_module provided by Drupal. Where I am not sure, how do I directly modify the query. However this is not a right forum to ask Drupal related questions. If somebody here knows both Drupal 7 and SOLR well, kindly let me know. One more doubt, lets say I want to search some mandatory words and some optional words. Say I want to search all documents those contains all Java, mysql, php keywords along with atleast one keyword out of TCL, Perl, Selenium. *Basically I am looking at few mandatory keywords and few optional keywords. * Is it possible to search this way. If so, kindly guide me how the query should look like. Best Regards Kamal On Mon, May 13, 2013 at 5:31 PM, François Schiettecatte fschietteca...@gmail.com wrote: Kamal You could also use the 'mm' parameter to require a minimum match, or you could prepend '+' to each required term. Cheers François On May 13, 2013, at 7:57 AM, Kamal Palei palei.ka...@gmail.com wrote: Hi Rafał Kuć I added q.op=AND as per you suggested. I see though some initial record document contains both keywords (*java* and *mysql*), towards end I see still there are number of documents, they have only one key word either *java* or *mysql*. Is it the SOLR behaviour or can I ask for a *strict search only if all my keywords are present, then only* *fetch record* else not. BR, Kamal On Mon, May 13, 2013 at 4:02 PM, Rafał Kuć r@solr.pl wrote: Hello! Change the default query operator. For example add the q.op=AND to your query. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Hi SOLR Experts When I search documents with keyword as *java, mysql* then I get the documents containing either *java* or *mysql* or both. Is it possible to get the documents those contains both *java* and *mysql*. In that case, how the query would look like. Thanks a lot Kamal
Re: Best way to design a story and comments schema.
Try the simplest, cleanest design first (at least on paper), before you start resorting to either dynamic fields or multi-valued fields or other messy approaches. Like, one collection for stories, which would have a story id and a second collection for comments, each with a comment id and a field that is the associated story id and user id. And a third collection for users and their profiles. Identify the user and get their user id. Identify the story (maybe by keyword search) to get story id. Then identify and facet user comments by story id and user id and whatever other search criteria, and then facet on that. -- Jack Krupansky -Original Message- From: samabhiK Sent: Monday, May 13, 2013 5:24 AM To: solr-user@lucene.apache.org Subject: Best way to design a story and comments schema. Hi, I wish to know how to best design a schema to store comments in stories / articles posted. I have a set of fields: / lt;field name=quot;subjectquot; type=quot;text_generalquot; indexed=quot;truequot; stored=quot;truequot;/gt; lt;field name=quot;keywordsquot; type=quot;text_generalquot; indexed=quot;truequot; stored=quot;truequot;/gt; lt;field name=quot;categoryquot; type=quot;text_generalquot; indexed=quot;truequot; stored=quot;truequot;/gt; lt;field name=quot;contentquot; type=quot;text_generalquot; indexed=quot;falsequot; stored=quot;truequot; /gt; / Users can post their comments on a post and I should be able to retrieve these comments and show it along side the original post. I only need to show the last 3 comments and show a facet of the remaining comments which user can click and see the rest of the comments ( something like facebook does ). One alternative, I could think of, was adding a dynamic field for all comments : /lt;dynamicField name=quot;comment_*quot; type=quot;stringquot; indexed=quot;falsequot; stored=quot;truequot;/gt;/ So, to store each comments, I would send a text to solr of the form - For Field Name: /comment_n/ Value:/[Commenter Name]:[Commenter ID]:[Actual Comment Text]/ And to keep the count of those comments, I could use another field like so :/lt;field name=quot;comment_countquot; type=quot;intquot; indexed=quot;truequot; stored=quot;truequot;/gt;/ With this approach, I will have to do some calculation when a comment is deleted by the user but I still can manage to show the comments right. My idea is to find the best solution for this scenario which will be fast and also be simple. Kindly suggest. -- View this message in context: http://lucene.472066.n3.nabble.com/Best-way-to-design-a-story-and-comments-schema-tp4062867.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: maximum number of simultaneous threads
venkata, only blank lines arrived between "..in solrconfig.xml" and "Is it possible.." - the snippet you meant to include did not come through. On Mon, May 13, 2013 at 3:25 PM, venkata vmarr...@yahoo.com wrote: I am seeing the following in solrconfig.xml It is possible to specific max number of threads for query time too? -- View this message in context: http://lucene.472066.n3.nabble.com/maximum-number-of-simultaneous-threads-tp4062903.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Best way to design a story and comments schema.
Thanks for your reply. I generally get confused between a collection and a core. But just FYI, I do have two cores at the moment - one for the users and another for the stories. Initially I thought of adding an extra core for the comments too, but realized that it would mean multiple HTTP calls to fetch both the story and the comments. Also, when a story is deleted, so should be its comments. Might having that spread across two cores cause issues with transactions when I delete the story and try to delete the respective comments? Or when I delete the user and all his stories and comments? I really wish to understand how that works. Sam -- View this message in context: http://lucene.472066.n3.nabble.com/Best-way-to-design-a-story-and-comments-schema-tp4062867p4062913.html Sent from the Solr - User mailing list archive at Nabble.com.
Getting explain information of more like this search in a more usable format
Hi, I'm executing a more like this search using the MoreLikeThisHandler. I can add score to the fields to be returned, but that's all I could find about getting information on how/why documents match. I would like to give my users more hints about why documents are similar, so I would like to display the important overlapping terms. If I specify debugQuery=true the result contains an explain section which is quite detailed, but in a text format I would have to parse. Is there a way to get this kind of information in a more usable way which does not force me to use a debug flag? I'm mainly interested in showing the terms which each result document has in common with the reference document. regards, Achim
Solr fullname search
Hi, I'm trying to set up a full-name search in Solr. I thought my work was fine until I found something strange, and I can't figure out how to correct it. I want to be able to do searches on full names. My index is built from a database where I get the first name and last name and put them in one multivalued field with a keyword tokenizer. Here's my fieldType:

<fieldType name="text_auto" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

Everything works fine: I can search only a first name OR last name and it gives me the full names that exist, and it also works for full names in any order if there's no misspelling. I just noticed something wrong! For example, if I ask for Dupont dupont, it'll give me every Dupont that exists, even the ones for which the first name doesn't match dupont. It seems that for each word in the query, it searches in the full name once. The problem is that if someone is looking for dupont d, they'll find every Dupont that exists because d is contained in Dupont! That's not what I want; in that case, I want to find every Dupont with a d in their first name (the other string). So I need to find a way to make it work. I tried many different tokenizers and filters but I'm afraid it's not possible... FYI, I'm using SolrJ, q.op=AND and wildcards (*) in front of and behind every word queried. Thank you for any help you could provide me! -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-fullname-search-tp4062919.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Best way to design a story and comments schema.
There are no transactions in Solr. Delete the story and then the comments. Core is just the old Solr terminology. A collection is the data itself, like the data on the disk. And with SolrCloud, the collection terminology is required. How much data will you have? I mean, a news article could have thousands of comments. Do you want to be able to search through them? Solr has no provision for searching across an arbitrary number of dynamic fields. I mean, if you want a query to search in a field, you need to name the field in either the query, or in qf even for dismax, which makes querying across arbitrary columns unworkable. Multiple HTTP requests should not be a problem, especially if each of them is shorter. Are you running into some problem? Technically, you could also do a custom search component that did a lot of the multi-query processing inside Solr, but once again, it is best to start with a simple design first. -- Jack Krupansky -Original Message- From: samabhiK Sent: Monday, May 13, 2013 8:55 AM To: solr-user@lucene.apache.org Subject: Re: Best way to design a story and comments schema. Thanks for your reply. I generally get confused by a collection and a core. But just FYI, I do have two cores at the moment - one for the users and another for the Stories. Initially I thought of adding an extra core for the Comments too but realized that it would mean multiple HTTP calls to fetch both the story and the comments. Also, when a story is deleted, so should be its comments. Having that spread across two cores might cause issues with transaction when I delete the story and try to delete the respective comments? Or when I delete the User and all hos stories and comments? I really wish to understand how that works. Sam -- View this message in context: http://lucene.472066.n3.nabble.com/Best-way-to-design-a-story-and-comments-schema-tp4062867p4062913.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Getting explain information of more like this search in a more usable format
Try debug.explain.structured=true, which will give you an XML response that can be traversed. Don't worry about the fact that these features are labeled debug - they are there simply to explain what is happening. Is there some particular concern you have about them being labeled debug? Although, you are not the first person to complain! What if Solr simply renamed these features with the term detail instead of debug - would that cure your concern?! -- Jack Krupansky -Original Message- From: Achim Domma Sent: Monday, May 13, 2013 9:12 AM To: solr-user@lucene.apache.org Subject: Getting explain information of more like this search in a more usable format Hi, I'm executing a more like this search using the MoreLikeThisHandler. I can add score to the fields to be returned, but that's all I could find about getting information about how/why documents match. I would like to give my users more hints why documents are similar, so I would like to display important overlapping terms. If I specify debugQuery=true the result contains a explain section which is quite detailed, but in a text format I would have to parse. Is there a way to get this kind of information in a more usable way which does not force me to use a debug-flag? I'm mainly interested in showing the terms which each result document has in common with the reference document. regards, Achim=
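For example, adding debugQuery=true&debug.explain.structured=true to the MoreLikeThisHandler request should return the explain data as nested named lists (in XML or JSON, depending on wt) rather than the flat text dump, which is much easier to traverse programmatically.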
Re: Best way to design a story and comments schema.
I think I got your point. So, what I will create are three cores (or collections) - one for the users, one for the stories and the last one for comments. When I need to find all the stories posted by a single user, I first need to search the stories core with a unique userid in the filter and then run another query to fetch the collection of comments. Correct? Also, I have no requirement to search through the comments; it's mostly a storage field for me. So, do you think I should shift that into a DB from where I may query the comments? Or will it be too costly for Solr to just plainly store that data in a core? Which would be the best option here? Also, the idea of a custom search component sounds great. But as you said, I will first try this out with the simplest possible setup and then go from there. -- View this message in context: http://lucene.472066.n3.nabble.com/Best-way-to-design-a-story-and-comments-schema-tp4062867p4062929.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Best way to design a story and comments schema.
Jack, Why are multi-valued fields considered messy? I think I am about to learn something.. Thanks Another Jack On Mon, May 13, 2013 at 5:29 AM, Jack Krupansky j...@basetechnology.com wrote: Try the simplest, cleanest design first (at least on paper), before you start resorting to either dynamic fields or multi-valued fields or other messy approaches. Like, one collection for stories, which would have a story id and a second collection for comments, each with a comment id and a field that is the associated story id and user id. And a third collection for users and their profiles. Identify the user and get their user id. Identify the story (maybe by keyword search) to get story id. Then identify and facet user comments by story id and user id and whatever other search criteria, and then facet on that. -- Jack Krupansky -Original Message- From: samabhiK Sent: Monday, May 13, 2013 5:24 AM To: solr-user@lucene.apache.org Subject: Best way to design a story and comments schema. Hi, I wish to know how to best design a schema to store comments in stories / articles posted. I have a set of fields: / lt;field name=quot;subjectquot; type=quot;text_generalquot; indexed=quot;truequot; stored=quot;truequot;/gt; lt;field name=quot;keywordsquot; type=quot;text_generalquot; indexed=quot;truequot; stored=quot;truequot;/gt; lt;field name=quot;categoryquot; type=quot;text_generalquot; indexed=quot;truequot; stored=quot;truequot;/gt; lt;field name=quot;contentquot; type=quot;text_generalquot; indexed=quot;falsequot; stored=quot;truequot; /gt; / Users can post their comments on a post and I should be able to retrieve these comments and show it along side the original post. I only need to show the last 3 comments and show a facet of the remaining comments which user can click and see the rest of the comments ( something like facebook does ). One alternative, I could think of, was adding a dynamic field for all comments : /lt;dynamicField name=quot;comment_*quot; type=quot;stringquot; indexed=quot;falsequot; stored=quot;truequot;/gt;/ So, to store each comments, I would send a text to solr of the form - For Field Name: /comment_n/ Value:/[Commenter Name]:[Commenter ID]:[Actual Comment Text]/ And to keep the count of those comments, I could use another field like so :/lt;field name=quot;comment_countquot; type=quot;intquot; indexed=quot;truequot; stored=quot;truequot;/gt;/ With this approach, I will have to do some calculation when a comment is deleted by the user but I still can manage to show the comments right. My idea is to find the best solution for this scenario which will be fast and also be simple. Kindly suggest. -- View this message in context: http://lucene.472066.n3.nabble.com/Best-way-to-design-a-story-and-comments-schema-tp4062867.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Mandatory words search in SOLR
Hi François, as per your suggestion, I used the 'mm' param and was able to do the search with mandatory fields. In Drupal, one needs to call $query->addParam('mm', '100%'); in a query alter hook. Thanks a lot for guiding me. Best Regards Kamal On Mon, May 13, 2013 at 5:56 PM, Kamal Palei palei.ka...@gmail.com wrote: Hi François Thanks for input. The major problem I face is , I make use of Drupal (as a framework) and apachesolr_module provided by Drupal. Where I am not sure, how do I directly modify the query. However this is not a right forum to ask Drupal related questions. If somebody here knows both Drupal 7 and SOLR well, kindly let me know. One more doubt, lets say I want to search some mandatory words and some optional words. Say I want to search all documents those contains all Java, mysql, php keywords along with atleast one keyword out of TCL, Perl, Selenium. *Basically I am looking at few mandatory keywords and few optional keywords.* Is it possible to search this way. If so, kindly guide me how the query should look like. Best Regards Kamal On Mon, May 13, 2013 at 5:31 PM, François Schiettecatte fschietteca...@gmail.com wrote: Kamal You could also use the 'mm' parameter to require a minimum match, or you could prepend '+' to each required term. Cheers François On May 13, 2013, at 7:57 AM, Kamal Palei palei.ka...@gmail.com wrote: Hi Rafał Kuć I added q.op=AND as per you suggested. I see though some initial record document contains both keywords (*java* and *mysql*), towards end I see still there are number of documents, they have only one key word either *java* or *mysql*. Is it the SOLR behaviour or can I ask for a *strict search only if all my keywords are present, then only* *fetch record* else not. BR, Kamal On Mon, May 13, 2013 at 4:02 PM, Rafał Kuć r@solr.pl wrote: Hello! Change the default query operator. For example add the q.op=AND to your query. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Hi SOLR Experts When I search documents with keyword as *java, mysql* then I get the documents containing either *java* or *mysql* or both. Is it possible to get the documents those contains both *java* and *mysql*. In that case, how the query would look like. Thanks a lot Kamal
Can we search some mandatory words and some optional words in SOLR
Dear SOLR experts, let's say I want to search some mandatory words and some optional words. Say I want to search all documents that contain all of the *Java, mysql, php* keywords along with at least one keyword out of *TCL, Perl, Selenium*. *Basically I am looking at a few mandatory keywords and a few optional keywords.* Is it possible to search this way? If so, kindly guide me on how the query should look. Best Regards Kamal
Re: Log Monitor System for SolrCloud and Logging to log4j at SolrCloud?
On 5/13/2013 6:09 AM, Furkan KAMACI wrote: Sorry but do you mean that I can use log4j with Solr 4.2.1? You can. You need to obtain a war without any slf4j jars, which you can do by unpacking the original war, deleting the jars, and repackaging it. You can also build from source with the dist-excl-slf4j or dist-war-excl-slf4j build target. After you have the war without slf4j, then you need to put the proper slf4j and log4j jars into the classpath - for jetty, this is typically lib/ext. See the 4.3.0 download - it has the proper jars in its example/lib/ext directory. Thanks, Shawn
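For reference, the example/lib/ext directory in the 4.3.0 binary download holds log4j itself plus the slf4j-api, slf4j-log4j12, jcl-over-slf4j and jul-to-slf4j bridge jars, and example/resources holds a ready-made log4j.properties - those are the jars and configuration Shawn is referring to.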
Re: Can we search some mandatory words and some optional words in SOLR
That's simply a standard, old-fashioned Lucene query: +Java +mysql +php TCL Perl Selenium And you can decide if min should match (mm) is 0, 1, 2, 3, etc. for the optional terms (TCL, Perl, Selenium) -- Jack Krupansky -Original Message- From: Kamal Palei Sent: Monday, May 13, 2013 9:56 AM To: solr-user@lucene.apache.org Subject: Can we search some mandatory words and some optional words in SOLR Dear SOLR Experts Llets say I want to search some mandatory words and some optional words. Say I want to search all documents those contains all *Java, mysql, php*keywords along with atleast one keyword out of * TCL, Perl, Selenium*. *Basically I am looking at few mandatory keywords and few optional keywords. * Is it possible to search this way. If so, kindly guide me how the query should look like. Best Regards Kamal
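As a rough sketch of how that could be sent from SolrJ (the qf field name "text" and the Solr URL are assumptions; with the edismax parser, mm applies only to the optional, non-required clauses):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class MandatoryOptionalQuery {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr");
        // Java, mysql and php are required; mm controls how many of the
        // remaining optional terms (TCL, Perl, Selenium) must match.
        SolrQuery q = new SolrQuery("+Java +mysql +php TCL Perl Selenium");
        q.set("defType", "edismax");
        q.set("qf", "text");   // assumed field holding the document body
        q.set("mm", "1");      // at least one optional term must match
        QueryResponse rsp = server.query(q);
        System.out.println("Matches: " + rsp.getResults().getNumFound());
    }
}

The same parameters work as plain URL parameters (defType=edismax&mm=1) if you are not using SolrJ.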
Re: Best way to design a story and comments schema.
Multi-valued fields don't have the same full support as simple fields and documents (since they are effectively a sub-document). Although we do now have the ability to add to a multi-valued field with atomic update, we can't directly edit them, like delete/replace the kth item or insert before/after an item, sort them by various criteria, etc. And a query won't tell you which entry matched. And you can't narrow your query to search a subset of a multi-valued field. They do work well for short lists, but not Big Data. Listing a few authors for a book is fine. But trying to do hundreds, thousands, and more, is quite problematic. There was a recent issue on the list about how multi-valued field values are sometimes handled inefficiently in Solr. -- Jack Krupansky -Original Message- From: Jack Park Sent: Monday, May 13, 2013 9:44 AM To: solr-user@lucene.apache.org Subject: Re: Best way to design a story and comments schema. Jack, Why are multi-valued fields considered messy? I think I am about to learn something.. Thanks Another Jack On Mon, May 13, 2013 at 5:29 AM, Jack Krupansky j...@basetechnology.com wrote: Try the simplest, cleanest design first (at least on paper), before you start resorting to either dynamic fields or multi-valued fields or other messy approaches. Like, one collection for stories, which would have a story id and a second collection for comments, each with a comment id and a field that is the associated story id and user id. And a third collection for users and their profiles. Identify the user and get their user id. Identify the story (maybe by keyword search) to get story id. Then identify and facet user comments by story id and user id and whatever other search criteria, and then facet on that. -- Jack Krupansky -Original Message- From: samabhiK Sent: Monday, May 13, 2013 5:24 AM To: solr-user@lucene.apache.org Subject: Best way to design a story and comments schema. Hi, I wish to know how to best design a schema to store comments in stories / articles posted. I have a set of fields: / lt;field name=quot;subjectquot; type=quot;text_generalquot; indexed=quot;truequot; stored=quot;truequot;/gt; lt;field name=quot;keywordsquot; type=quot;text_generalquot; indexed=quot;truequot; stored=quot;truequot;/gt; lt;field name=quot;categoryquot; type=quot;text_generalquot; indexed=quot;truequot; stored=quot;truequot;/gt; lt;field name=quot;contentquot; type=quot;text_generalquot; indexed=quot;falsequot; stored=quot;truequot; /gt; / Users can post their comments on a post and I should be able to retrieve these comments and show it along side the original post. I only need to show the last 3 comments and show a facet of the remaining comments which user can click and see the rest of the comments ( something like facebook does ). One alternative, I could think of, was adding a dynamic field for all comments : /lt;dynamicField name=quot;comment_*quot; type=quot;stringquot; indexed=quot;falsequot; stored=quot;truequot;/gt;/ So, to store each comments, I would send a text to solr of the form - For Field Name: /comment_n/ Value:/[Commenter Name]:[Commenter ID]:[Actual Comment Text]/ And to keep the count of those comments, I could use another field like so :/lt;field name=quot;comment_countquot; type=quot;intquot; indexed=quot;truequot; stored=quot;truequot;/gt;/ With this approach, I will have to do some calculation when a comment is deleted by the user but I still can manage to show the comments right. 
My idea is to find the best solution for this scenario which will be fast and also be simple. Kindly suggest. -- View this message in context: http://lucene.472066.n3.nabble.com/Best-way-to-design-a-story-and-comments-schema-tp4062867.html Sent from the Solr - User mailing list archive at Nabble.com.
Quick question about indexing with SolrJ.
Is it possible to index plain String JSON documents using SolrJ? I already know annotating POJOs works fine, but I need a more flexible way to index data without any intermediate POJO. That's because when changing, adding or removing fields I don't want to keep changing that POJO again and again. -- - Luis Cappa
Re: Solr Licensing (Sizzle)
On May 13, 2013, at 14:15 , Polhodzik Peter (ext) peter.polhodzik@evosoft.com wrote: In the source code of Apache Solr 4.2.0 there is an unclear license reference in · \solr-4.2.0\solr\webapp\web\js\lib\jquery-1.7.2.min.js and · \solr-4.2.0\solr\webapp\web\js\require.jstxt Can you please tell me what kind of license does this refer to exactly: “* Sizzle CSS Selector Engine * Copyright 2011, The Dojo Foundation * Released under the MIT, BSD, and GPL Licenses. * More information: http://sizzlejs.com/” “* Includes Sizzle.js * http://sizzlejs.com/ * Copyright 2010, The Dojo Foundation * Released under the MIT, BSD, and GPL Licenses.” Dojo Foundation says its not their business anymore. 1. Which version of GPL, which clause and copyright of BSD and MIT? 2. Is there a choice here? MIT or BSD or GPL? or all apply at the same time, hence the article and? 3. I cannot find Sizzle in the Solr distribution at all. Is it really included? In my experience, the presence of several licenses normally indicate that you get to choose one of them, and not that they all apply. If you go to sizzlejs.com and follow the link to documentation, you'll find the file MIT-LICENSE.txt. This file indicates that the current license for sizzle.js is a variation of the MIT license. It also indicates that the current licensor of sizzle.js is the jQuery project and other contributors, so you should be able to get the definitive answer on licensing terms from the jQuery project.
Re: Quick question about indexing with SolrJ.
Do your POJOs follow a simple flat data model that is 100% compatible with Solr? If so, maybe you can simply ingest them by setting the Content-type to application/json and maybe having to put some minimal wrapper around the raw JSON. But... if they DON'T follow a simple, flat data model, then YOU are going to have to transform their data into a format that does have a simple, flat data model. -- Jack Krupansky -Original Message- From: Luis Cappa Banda Sent: Monday, May 13, 2013 10:52 AM To: solr-user@lucene.apache.org Subject: Quick question about indexing with SolrJ. Is it possible to index plain String JSON documents using SolrJ? I already know annotating POJOs works fine, but I need a more flexible way to index data without any intermediate POJO. That's because when changing, adding or removing new fields I don't want to change continously that POJO again and again. -- - Luis Cappa
Re: Quick question about indexing with SolrJ.
Hello, Jack. I don't want to use POJOs, that's the main problem. I know that you can send AJAX POST HTTP Requests with JSON data to index new documents and I would like to do that with SolrJ, that's all, but I don't find the way to do that, :-/ . What I would like to do is simple retrieve an String with an embedded JSON and add() it via an HttpSolrServer object instance. If the JSON matches the Solr server schema.xml or not it would be a server-side problem, not a client-side one. I mean, I want to use a best effort and more flexible way to index data, and using POJOs is not the way to do that: you have to change the Java class, compile it again and relaunch whatever the process that uses that Java class. Regards, - Luis Cappa 2013/5/13 Jack Krupansky j...@basetechnology.com Do your POJOs follow a simple flat data model that is 100% compatible with Solr? If so, maybe you can simply ingest them by setting the Content-type to application/json and maybe having to put some minimal wrapper around the raw JSON. But... if they DON'T follow a simple, flat data model, then YOU are going to have to transform their data into a format that does have a simple, flat data model. -- Jack Krupansky -Original Message- From: Luis Cappa Banda Sent: Monday, May 13, 2013 10:52 AM To: solr-user@lucene.apache.org Subject: Quick question about indexing with SolrJ. Is it possible to index plain String JSON documents using SolrJ? I already know annotating POJOs works fine, but I need a more flexible way to index data without any intermediate POJO. That's because when changing, adding or removing new fields I don't want to change continously that POJO again and again. -- - Luis Cappa -- - Luis Cappa
Re: Quick question about indexing with SolrJ.
You can send JSON to Solr as update documents: http://wiki.apache.org/solr/UpdateJSON. Not sure if SolrJ supports it, but it is just an HTTP post, so you may not even need SolrJ. But the issue is that your own JSON probably does not match JSON expected by Solr. So, you need to map it somehow, right? Unless you figured that part already. Regards, Alex. Personal blog: http://blog.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book) On Mon, May 13, 2013 at 11:51 AM, Luis Cappa Banda luisca...@gmail.com wrote: Hello, Jack. I don't want to use POJOs, that's the main problem. I know that you can send AJAX POST HTTP Requests with JSON data to index new documents and I would like to do that with SolrJ, that's all, but I don't find the way to do that, :-/ . What I would like to do is simple retrieve an String with an embedded JSON and add() it via an HttpSolrServer object instance. If the JSON matches the Solr server schema.xml or not it would be a server-side problem, not a client-side one. I mean, I want to use a best effort and more flexible way to index data, and using POJOs is not the way to do that: you have to change the Java class, compile it again and relaunch whatever the process that uses that Java class. Regards, - Luis Cappa 2013/5/13 Jack Krupansky j...@basetechnology.com Do your POJOs follow a simple flat data model that is 100% compatible with Solr? If so, maybe you can simply ingest them by setting the Content-type to application/json and maybe having to put some minimal wrapper around the raw JSON. But... if they DON'T follow a simple, flat data model, then YOU are going to have to transform their data into a format that does have a simple, flat data model. -- Jack Krupansky -Original Message- From: Luis Cappa Banda Sent: Monday, May 13, 2013 10:52 AM To: solr-user@lucene.apache.org Subject: Quick question about indexing with SolrJ. Is it possible to index plain String JSON documents using SolrJ? I already know annotating POJOs works fine, but I need a more flexible way to index data without any intermediate POJO. That's because when changing, adding or removing new fields I don't want to change continously that POJO again and again. -- - Luis Cappa -- - Luis Cappa
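A bare-bones sketch of that HTTP post from Java, with no SolrJ and no POJO in between (the URL, core and the sample document fields are assumptions, and commit=true is only for the example; the fields still have to exist in the target schema):

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class RawJsonIndexer {
    public static void main(String[] args) throws Exception {
        // Any JSON string whose fields match the target schema can be posted as-is.
        String json = "[{\"id\":\"doc-1\",\"title\":\"Example document\"}]";
        URL url = new URL("http://localhost:8983/solr/update/json?commit=true");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "application/json");
        OutputStream out = conn.getOutputStream();
        out.write(json.getBytes("UTF-8"));
        out.close();
        // Solr answers with a status document; anything other than 200 means
        // the update was rejected (for example, unknown fields for the schema).
        System.out.println("HTTP status: " + conn.getResponseCode());
        conn.disconnect();
    }
}

If you do want to stay inside SolrJ, a ContentStreamUpdateRequest pointed at /update/json can carry the same string, but the plain HTTP approach above keeps the client completely schema-agnostic.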
Re: maximum number of simultaneous threads
I am seeing a configuration point for indexing threads. However, I am not finding anything for search. How many simultaneous threads can SOLR spin up during search time? -- View this message in context: http://lucene.472066.n3.nabble.com/maximum-number-of-simultaneous-threads-tp4062903p4062982.html Sent from the Solr - User mailing list archive at Nabble.com.
Making protwords.txt changes effective
Hi, I added some words to protwords.txt, but there doesn't seem to be any effect on the resulting search. Do I need to restart Apache or Solr, or rebuild the index?
Re: Need solr query help
Hi Abhishek, I've had a look into this problem and have come up with a solution. The following instructions assume you have downloaded the 4.3.0 release of Solr from:- http://www.apache.org/dyn/closer.cgi/lucene/solr/4.3.0

First add to solr-4.3.0/solr/example/solr/collection1/conf/schema.xml the following:-

<field name="shopLocation" type="location" indexed="true" stored="true"/>
<field name="shopMaxDeliveryDistance" type="float" indexed="true" stored="true"/>

after the id field:-

<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />

Then start Solr by going to solr-4.3.0/solr/example and running:-

java -jar start.jar

Then change into your solr-4.3.0/solr/example/exampledocs directory and write the following text to a new file called shops.xml:-

<add>
  <doc>
    <field name="id">2468</field>
    <field name="name">Shop A</field>
    <field name="shopLocation">0.1,0.1</field>
    <field name="shopMaxDeliveryDistance">10</field>
  </doc>
  <doc>
    <field name="id">2469</field>
    <field name="name">Shop B</field>
    <field name="shopLocation">0.2,0.2</field>
    <field name="shopMaxDeliveryDistance">35</field>
  </doc>
  <doc>
    <field name="id">2470</field>
    <field name="name">Shop C</field>
    <field name="shopLocation">0.9,0.1</field>
    <field name="shopMaxDeliveryDistance">25</field>
  </doc>
  <doc>
    <field name="id">2480</field>
    <field name="name">Shop D</field>
    <field name="shopLocation">0.3,0.2</field>
    <field name="shopMaxDeliveryDistance">50</field>
  </doc>
</add>

Now run:-

./post.sh shops.xml

You should get back something like:-

Posting file shops.xml to http://localhost:8983/solr/update
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">120</int></lst>
</response>
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">46</int></lst>
</response>

Then do the following queries in your browser:-

All 4 shops:-
http://localhost:8983/solr/select?q=name:shop&fl=name,shopLocation,shopMaxDeliveryDistance

All shops with distance from point 0.0,0.0 and ordered by distance from point 0.0,0.0 (gives order A, B, D, C):-
http://localhost:8983/solr/select?q=name:shop&fl=name,shopLocation,shopMaxDeliveryDistance,geodist%28shopLocation,0.0,0.0%29&sort=geodist%28shopLocation,0.0,0.0%29%20asc

All shops with distance from point 0.0,0.0, ordered by distance from point 0.0,0.0 and filtered to eliminate all shops with distance from point 0.0,0.0 greater than shopMaxDeliveryDistance (gives shops B and D):-
http://localhost:8983/solr/select?q=name:shop&fl=name,shopLocation,shopMaxDeliveryDistance,geodist%28shopLocation,0.0,0.0%29&sort=geodist%28shopLocation,0.0,0.0%29%20asc&fq={!frange%20u=0}sub%28geodist%28shopLocation,0.0,0.0%29,shopMaxDeliveryDistance%29

To delete all shops so you can edit the file to play with it and repost the shops:-
http://localhost:8983/solr/update?stream.body=<delete><query>name:shop</query></delete>&commit=true

smsolr -- View this message in context: http://lucene.472066.n3.nabble.com/Need-solr-query-help-tp4061800p4062591.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Disabling tf (term frequency) during indexing and/or scoring
This is an old post; there is now a solution in SOLR: omitTermFreqAndPositions="true" - see http://wiki.apache.org/solr/SchemaXml#Data_Types -- View this message in context: http://lucene.472066.n3.nabble.com/Disabling-tf-term-frequency-during-indexing-and-or-scoring-tp502956p4062595.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Need solr query help
Hi Abhishek, I forgot to explain why it works. It uses the frange filter which is mentioned here:- http://wiki.apache.org/solr/CommonQueryParameters and it works because it filters in results where the geodist minus the shopMaxDeliveryDistance is less than zero (that's what the u=0 means, upper limit=0), i.e.:- geodist - shopMaxDeliveryDistance < 0 => geodist < shopMaxDeliveryDistance i.e. the geodist is less than the shopMaxDeliveryDistance and so the shop is within delivery range of the location specified. smsolr -- View this message in context: http://lucene.472066.n3.nabble.com/Need-solr-query-help-tp4061800p4062603.html Sent from the Solr - User mailing list archive at Nabble.com.
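To make that concrete with the example data above (rough numbers, assuming geodist returns kilometres and roughly 111 km per degree near the equator): Shop B at 0.2,0.2 is about 31 km from 0.0,0.0, under its 35 km shopMaxDeliveryDistance, and Shop D at 0.3,0.2 is about 40 km away with a 50 km limit, so both pass the filter; Shop A (about 16 km away, 10 km limit) and Shop C (about 100 km away, 25 km limit) are filtered out, which matches the "shops B and D" result above.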
Re: Making protwords.txt changes effective
Yes, restart Solr. Not to reindex, but simply to reload the file. Well... depending on where you use the protected words, you may need to reindex as well. For a query-time filter you don't need to reindex, but for index-time filters, you must reindex. -- Jack Krupansky -Original Message- From: Shane Magee Sent: Saturday, May 11, 2013 7:42 AM To: solr-user@lucene.apache.org Subject: Making protwords.txt changes effective Hi I added some words to protwords.txt, but there doesnt seem to be any effect in the resulting search. Do I need to restart Apache or Solr or rebuild the index?
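If restarting is inconvenient, reloading the core should also pick up the edited protwords.txt, e.g. http://localhost:8983/solr/admin/cores?action=RELOAD&core=collection1 (adjust the host, port and core name to your setup); the caveat about reindexing for index-time filters still applies.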
Re: Looking for Best Practice of Spellchecker
Thank you for you help, guys. I agreed, wall mart should be a synonyms, it's not a good example. I did an experiment by using KeywordTokenizer + DirectSolrSpellChecker, I can get suggestion even for wall mart to walmart. But I don't know whether it's a good practice or not. It's much like a workaround to me. And for WordBreakSpellChecker, I haven't tried it yet. Does this spellchecker break the word and concatenate them then give me collations? Thanks On Fri, May 10, 2013 at 11:34 AM, Dyer, James james.d...@ingramcontent.comwrote: Good point, Jason. In fact, even if you use WorkBreakSpellChecker wall mart will not correct to walmart. The reason is the spellchecker cannot both correct a token's spelling *and* fix the wordbreak issue involving that same token. So in this case a synonym is the way to go. James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: Jason Hellman [mailto:jhell...@innoventsolutions.com] Sent: Friday, May 10, 2013 9:55 AM To: solr-user@lucene.apache.org Subject: Re: Looking for Best Practice of Spellchecker Nicholas, Also consider that some misspellings are better handled through Synonyms (or injected metadata). You can garner a great deal of value out of the spell checker by following the great advice James is giving here...but you'll find a well-placed helper synonym or metavalue can often save a lot of headache and time. Jason On May 10, 2013, at 7:32 AM, Dyer, James james.d...@ingramcontent.com wrote: Nicholas, It sounds like you might want to use WordBreakSolrSpellChecker, which gets obscure mention in the wiki. Read through this section: http://wiki.apache.org/solr/SpellCheckComponent#Configuration and you will see some information. Also, the Solr Example shows how to configure this. See http://svn.apache.org/repos/asf/lucene/dev/branches/branch_4x/solr/example/solr/collection1/conf/solrconfig.xml Look for... lst name=spellchecker str name=namewordbreak/str ... /lst ...and... requestHandler name=/spell ... ... /requestHandler Also, I'd recommend you take a look at each parameter in the /spell request handler and read its section on the spellcheckcomponent wiki page. You probably will want to set many of these parameters as well. You can get a query to return only spell results simply by specifying rows=0. However, its one less query to just have it return the results also. If there are no results, your application can check for collations and re-issue a collation query. If there are both results and collations returned, you can give the user results with did-you-mean suggestions. James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: Nicholas Ding [mailto:nicholas...@gmail.com] Sent: Friday, May 10, 2013 8:47 AM To: solr-user@lucene.apache.org Subject: Looking for Best Practice of Spellchecker Hi guys, I'm working on a local search project, I wanna integrate spellchecker for the search. So basically, my search engines is used to search local businesses. For example, user could search for wall mart, here is a typo, I wanna spellchecker to give me Collation for walmart. My problems are: 1. I use DirectSolrSpellChecker on my BusinessNameField and pass wall mart as phrase search, but I can't get collation from the spellchecker. 2. I tried not to pass phrase search, but pass q=Wall AND Mart to force a 100% match, but spellchecker can't give me collation also. I read the documents about spellchecker on Solr wiki, but it's very brief. 
I'm wondering whether there is any best practice for the spellchecker; I believe it's widely used in search, right? And I have another idea, but I don't know whether it's valid or not. I want to apply the spellchecker to every query before doing the search, so that I could rely on the spellchecker to tell me whether my search would get results or not. Thanks Nicholas
How to improve performance of geodist()
Hi guys, I'm using geodist() in a recip boost function and noticed a performance impact on response time. In a profiling session, the geodist() calculation took 30% of CPU time. I'm wondering whether there is any alternative to the Haversine function that can reduce the CPU cost? I don't need very accurate numbers when I use geodist() in the boost function. Thanks Nicholas
Re: rename a core to same name of existing core
Has anyone verified that the following is true? The description on http://wiki.apache.org/solr/CoreAdmin#CREATE is: *quote* If a core with the same name exists, while the newly created core is initializing, the old one will continue to accept requests. Once it has finished, all new requests will go to the new core, and the old core will be unloaded. */quote* step1 - I have a core 'abc' with 30 documents in it: http://myhost.com:8080/solr/abc/select/?q=type%3Amessage&version=2.2&start=0&rows=10&indent=on <str name="rows">10</str></lst></lst><result name="response" numFound="30" start="0"><doc> step2 - then I create a new core with the same name 'abc': http://myhost.com:8080/solr/admin/cores?action=create&name=abc&instanceDir=./ <response><lst name="responseHeader"><int name="status">0</int><int name="QTime">303</int></lst><str name="core">abc</str><str name="saved">/mxl/var/solr/solr.xml</str></response> step3 - I cleared my browser cache step4 - I ran the same query as in step1 and got the same results (30 documents): http://myhost.com:8080/solr/abc/select/?q=type%3Amessage&version=2.2&start=0&rows=10&indent=on <str name="rows">10</str></lst></lst><result name="response" numFound="30" start="0"><doc> I thought the old core should be unloaded? Did I misunderstand anything here? thanks Jie -- View this message in context: http://lucene.472066.n3.nabble.com/rename-a-core-to-same-name-of-existing-core-tp3090960p4063008.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: SOLR guidance required
If this is for the US, remove the age range feature before you get sued. On 05/09/2013 08:41 PM, Kamal Palei wrote: Dear SOLR experts I might be asking a very silly question. As I am new to SOLR kindly guide me. I have a job site. Using SOLR to search resumes. When an HR user enters some keywords, say JAVA, MySQL etc, I search resume documents using SOLR, retrieve 100 records and show them to the user. The problem I face is, say I retrieved 100 records, then we do filtering for experience range, age range, salary range (using a mysql query). Sometimes it so happens that among the 100 records I fetch, I do not get a single record to show to the user. When the user clicks the next link there might be a few records; it looks odd really. I hope there must be some mechanism by which I can associate salary, experience, age etc with the resume document during indexing. And when I search for resumes I can give all filters accordingly, retrieve 100 records, and straightaway show those 100 records to the user without doing any mysql query. Please let me know if this is feasible. If so, kindly give me some pointers on how to do it. Best Regards Kamal
Re: SOLR guidance required
Jason can you explain what you mean at here: Where OR operators apply, this does not matter. But your Solr cache will be much more savvy with the first construct. 2013/5/13 Lance Norskog goks...@gmail.com If this is for the US, remove the age range feature before you get sued. On 05/09/2013 08:41 PM, Kamal Palei wrote: Dear SOLR experts I might be asking a very silly question. As I am new to SOLR kindly guide me. I have a job site. Using SOLR to search resumes. When a HR user enters some keywords say JAVA, MySQL etc, I search resume documents using SOLR, retrieve 100 records and show to user. The problem I face is say, I retrieved 100 records, then we do filtering for experience range, age range, salary range (using mysql query). Sometimes it so happens that the 100 records I fetch , I do not get a single record to show to user. When user clicks next link there might be few records, it looks odd really. I hope there must be some mechanism, by which I can associate salary, experience, age etc with resume document during indexing. And when I search for resumes I can give all filters accordingly and can retrieve 100 records and strait way I can show 100 records to user without doing any mysql query. Please let me know if this is feasible. If so, kindly give me some pointer how do I do it. Best Regards Kamal
Re: How to improve performance of geodist()
On Mon, May 13, 2013 at 1:12 PM, Nicholas Ding nicholas...@gmail.com wrote: I'm using geodist() in a recip boost function. I noticed a performance impact to the response time. I did a profiling session, the geodist() calculation took 30% of CPU time. Are you also using an fq with geofilt to narrow down the number of documents that must be scored? -Yonik http://lucidworks.com
How to force a document to be indexed in a given shard at SolrCloud?
I want to run some test cases on SolrCloud in my pre-prototype system. How can I force a document to be indexed in a given shard in SolrCloud (I use Solr 4.2.1)? Would something like shard.keys work for me?
Re: SOLR guidance required
Multiple fq params are ANDed. So if you have fq=clause1 AND clause2, you should implement that as fq=clause1&fq=clause2. However, if you want fq=clause1 OR clause2, you have no choice but to keep it as a single filter query. Upayavira On Mon, May 13, 2013, at 06:55 PM, Furkan KAMACI wrote: Jason can you explain what you mean at here: Where OR operators apply, this does not matter. But your Solr cache will be much more savvy with the first construct. 2013/5/13 Lance Norskog goks...@gmail.com If this is for the US, remove the age range feature before you get sued. On 05/09/2013 08:41 PM, Kamal Palei wrote: Dear SOLR experts I might be asking a very silly question. As I am new to SOLR kindly guide me. I have a job site. Using SOLR to search resumes. When a HR user enters some keywords say JAVA, MySQL etc, I search resume documents using SOLR, retrieve 100 records and show to user. The problem I face is say, I retrieved 100 records, then we do filtering for experience range, age range, salary range (using mysql query). Sometimes it so happens that the 100 records I fetch , I do not get a single record to show to user. When user clicks next link there might be few records, it looks odd really. I hope there must be some mechanism, by which I can associate salary, experience, age etc with resume document during indexing. And when I search for resumes I can give all filters accordingly and can retrieve 100 records and strait way I can show 100 records to user without doing any mysql query. Please let me know if this is feasible. If so, kindly give me some pointer how do I do it. Best Regards Kamal
Re: SOLR guidance required
On 5/13/2013 11:55 AM, Furkan KAMACI wrote: Jason can you explain what you mean at here: Where OR operators apply, this does not matter. But your Solr cache will be much more savvy with the first construct. If you need to OR different filters together, you have to have all those in the same filter query. Multiple filter queries are ANDed together, that can't be changed. If you need your filter clauses ANDed together, you can split them into multiple small filter queries. Those filters will be cached individually. If you put all of them in the same filter query, then you can't re-use pieces of the filter without a new cache entry, so caching isn't as efficient. Thanks, Shawn
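For reference, a minimal SolrJ 4.x sketch of the approach Shawn and Upayavira describe: keep independently reusable filters in separate fq parameters, and only combine clauses into one fq when they must be ORed. The core URL and the salary/experience/resume_text field names are made up for illustration and are not from the original thread.

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class FilterCacheExample {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");

        SolrQuery q = new SolrQuery("resume_text:(java AND mysql)");
        // Separate fq params are ANDed together and cached individually,
        // so each range filter can be re-used across queries.
        q.addFilterQuery("salary:[30000 TO 60000]");
        q.addFilterQuery("experience:[2 TO 5]");
        // An OR between filters must stay inside a single fq:
        // q.addFilterQuery("salary:[30000 TO 60000] OR experience:[2 TO 5]");
        q.setRows(100);

        QueryResponse rsp = server.query(q);
        System.out.println("Matches: " + rsp.getResults().getNumFound());
    }
}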
Faceting json response - odd format
Hello, Relatively new to SOLR, I am quite happy with the API. I am a bit challenged by the faceting response in JSON though. This is what I am getting, which mirrors what is in the documentation: "facet_counts":{"facet_queries":{}, "facet_fields":{"metadata_meta_last_author":["Nick",330,"standarduser",153,"Mohan",52,"wwd",49,"gerald",45,"Riggins",36,"fallon",31,"blister",28," ",26,"morfitelli",24,"Administrator",22,"morrow",22,"richard",22,"egilhoi",18,"USer Group",16], This is not trivial to parse - I've read the docs but can't seem to figure out how one might get a more structured response to this. Assuming I am not missing anything, I guess I have to write a custom parser to build a separate data structure that can be more easily presented in a UI. Thank you Cord
Re: rename a core to same name of existing core
On 5/13/2013 11:46 AM, Jie Sun wrote: did any one verified the following is ture? the Description on http://wiki.apache.org/solr/CoreAdmin#CREATE is: *quote* If a core with the same name exists, while the new created core is initalizing, the old one will continue to accept requests. Once it has finished, all new request will go to the new core, and the old core will be unloaded. */quote* step1 - I have a core 'abc' with 30 documents in it: http://myhost.com:8080/solr/abc/select/?q=type%3Amessageversion=2.2start=0rows=10indent=on str name=rows10/str/lst/lstresult name=response numFound=30 start=0doc step2 - then I create a new core with same name 'abc': http://myhost.com:8080/solr/admin/cores?action=createname=abcinstanceDir=./ responselst name=responseHeaderint name=status0/intint name=QTime303/int/lststr name=coreabc/strstr name=saved/mxl/var/solr/solr.xml/str/response step3 - I cleared out my browser cache step4 - I did same query as in step1, got same results (30 documents): http://myhost.com:8080/solr/abc/select/?q=type%3Amessageversion=2.2start=0rows=10indent=on str name=rows10/str/lst/lstresult name=response numFound=30 start=0doc I thought the old core should be unloaded? did I misunderstand any thing here? If the instanceDir value that you are using is different than the existing core, then this might be a bug, either in the documentation or Solr. If the instanceDir is the same as the existing core, then it's working as designed -- you've created a core with an index that already exists. I personally would like to see core creation fail if the core already exists, but others with more authority may disagree. A workaround would be to CREATE a new core with a different name and instanceDir, SWAP them, and then UNLOAD the one you don't need any more, optionally deleting it. Thanks, Shawn
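For anyone scripting Shawn's CREATE -> SWAP -> UNLOAD workaround, here is a rough SolrJ 4.x sketch. It is only an illustration: the host, core names, and instanceDir are invented, and the CoreAdminRequest calls should be double-checked against your SolrJ version.

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.request.CoreAdminRequest;
import org.apache.solr.common.params.CoreAdminParams.CoreAdminAction;

public class CoreSwapExample {
    public static void main(String[] args) throws Exception {
        // CoreAdmin requests go to the container-level URL, not to a specific core.
        HttpSolrServer admin = new HttpSolrServer("http://myhost.com:8080/solr");

        // 1. Create the replacement core under a new name and a new instanceDir.
        CoreAdminRequest.createCore("abc_new", "/var/solr/abc_new", admin);

        // 2. Swap it with the live core; "abc" now points at the new index.
        CoreAdminRequest swap = new CoreAdminRequest();
        swap.setAction(CoreAdminAction.SWAP);
        swap.setCoreName("abc_new");
        swap.setOtherCoreName("abc");
        swap.process(admin);

        // 3. Unload the core you no longer need (optionally deleting its files).
        CoreAdminRequest.unloadCore("abc_new", admin);
    }
}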
RE: Looking for Best Practice of Spellchecker
The Word Break spellchecker will incorporate the broken combined words in the collations. Its designed to work seamlessly in conjunction with a regular spellchecker (IndexBased- or Direct-). James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: Nicholas Ding [mailto:nicholas...@gmail.com] Sent: Monday, May 13, 2013 12:07 PM To: solr-user@lucene.apache.org Subject: Re: Looking for Best Practice of Spellchecker Thank you for you help, guys. I agreed, wall mart should be a synonyms, it's not a good example. I did an experiment by using KeywordTokenizer + DirectSolrSpellChecker, I can get suggestion even for wall mart to walmart. But I don't know whether it's a good practice or not. It's much like a workaround to me. And for WordBreakSpellChecker, I haven't tried it yet. Does this spellchecker break the word and concatenate them then give me collations? Thanks On Fri, May 10, 2013 at 11:34 AM, Dyer, James james.d...@ingramcontent.comwrote: Good point, Jason. In fact, even if you use WorkBreakSpellChecker wall mart will not correct to walmart. The reason is the spellchecker cannot both correct a token's spelling *and* fix the wordbreak issue involving that same token. So in this case a synonym is the way to go. James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: Jason Hellman [mailto:jhell...@innoventsolutions.com] Sent: Friday, May 10, 2013 9:55 AM To: solr-user@lucene.apache.org Subject: Re: Looking for Best Practice of Spellchecker Nicholas, Also consider that some misspellings are better handled through Synonyms (or injected metadata). You can garner a great deal of value out of the spell checker by following the great advice James is giving here...but you'll find a well-placed helper synonym or metavalue can often save a lot of headache and time. Jason On May 10, 2013, at 7:32 AM, Dyer, James james.d...@ingramcontent.com wrote: Nicholas, It sounds like you might want to use WordBreakSolrSpellChecker, which gets obscure mention in the wiki. Read through this section: http://wiki.apache.org/solr/SpellCheckComponent#Configuration and you will see some information. Also, the Solr Example shows how to configure this. See http://svn.apache.org/repos/asf/lucene/dev/branches/branch_4x/solr/example/solr/collection1/conf/solrconfig.xml Look for... lst name=spellchecker str name=namewordbreak/str ... /lst ...and... requestHandler name=/spell ... ... /requestHandler Also, I'd recommend you take a look at each parameter in the /spell request handler and read its section on the spellcheckcomponent wiki page. You probably will want to set many of these parameters as well. You can get a query to return only spell results simply by specifying rows=0. However, its one less query to just have it return the results also. If there are no results, your application can check for collations and re-issue a collation query. If there are both results and collations returned, you can give the user results with did-you-mean suggestions. James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: Nicholas Ding [mailto:nicholas...@gmail.com] Sent: Friday, May 10, 2013 8:47 AM To: solr-user@lucene.apache.org Subject: Looking for Best Practice of Spellchecker Hi guys, I'm working on a local search project, I wanna integrate spellchecker for the search. So basically, my search engines is used to search local businesses. For example, user could search for wall mart, here is a typo, I wanna spellchecker to give me Collation for walmart. My problems are: 1. 
I use DirectSolrSpellChecker on my BusinessNameField and pass wall mart as phrase search, but I can't get collation from the spellchecker. 2. I tried not to pass phrase search, but pass q=Wall AND Mart to force a 100% match, but spellchecker can't give me collation also. I read the documents about spellchecker on Solr wiki, but it's very brief. I'm wondering is there any best practice of spellchecker, I believe it's widely used in the search, right? And I have another idea, I don't know whether it's valid or not. I want to apply spellchecker everything before doing the search, so that I could rely on the spellchecker to tell me whether my search could get result or not. Thanks Nicholas
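To make the discussion concrete, a hedged SolrJ 4.x sketch of a query against a /spell handler (configured with both a word-based and the wordbreak spellchecker, as in the example solrconfig.xml) that asks for collations. The core URL and the parameter values are illustrative only.

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.client.solrj.response.SpellCheckResponse;

public class SpellcheckExample {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");

        SolrQuery q = new SolrQuery("wall mart");
        q.setRequestHandler("/spell");                      // handler with the spellcheck component
        q.set("spellcheck", "true");
        q.set("spellcheck.collate", "true");                // ask for whole corrected queries
        q.set("spellcheck.maxCollations", "5");
        q.set("spellcheck.collateExtendedResults", "true");

        QueryResponse rsp = server.query(q);
        SpellCheckResponse spell = rsp.getSpellCheckResponse();
        if (spell != null && spell.getCollatedResults() != null) {
            for (SpellCheckResponse.Collation c : spell.getCollatedResults()) {
                System.out.println("Did you mean: " + c.getCollationQueryString());
            }
        }
    }
}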
Re: SOLR guidance required
Best advice in this thread. :) Michael Della Bitta Appinions 18 East 41st Street, 2nd Floor New York, NY 10017-6271 www.appinions.com The science of influence marketing. On Mon, May 13, 2013 at 1:29 PM, Lance Norskog goks...@gmail.com wrote: If this is for the US, remove the age range feature before you get sued. On 05/09/2013 08:41 PM, Kamal Palei wrote: Dear SOLR experts I might be asking a very silly question. As I am new to SOLR kindly guide me. I have a job site. Using SOLR to search resumes. When a HR user enters some keywords say JAVA, MySQL etc, I search resume documents using SOLR, retrieve 100 records and show to user. The problem I face is say, I retrieved 100 records, then we do filtering for experience range, age range, salary range (using mysql query). Sometimes it so happens that the 100 records I fetch , I do not get a single record to show to user. When user clicks next link there might be few records, it looks odd really. I hope there must be some mechanism, by which I can associate salary, experience, age etc with resume document during indexing. And when I search for resumes I can give all filters accordingly and can retrieve 100 records and strait way I can show 100 records to user without doing any mysql query. Please let me know if this is feasible. If so, kindly give me some pointer how do I do it. Best Regards Kamal
Re: How to improve performance of geodist()
Yes, I did. But instead of sorting by geodist(), I use a function query to boost by distance. That's why the heavy calculation shows up during query processing. Example: bf=recip(geodist(), 50, 5) Basically, I think the boost function will iterate over all the results and calculate the distance. On Mon, May 13, 2013 at 1:27 PM, Yonik Seeley yo...@lucidworks.com wrote: On Mon, May 13, 2013 at 1:12 PM, Nicholas Ding nicholas...@gmail.com wrote: I'm using geodist() in a recip boost function. I noticed a performance impact to the response time. I did a profiling session, the geodist() calculation took 30% of CPU time. Are you also using an fq with geofilt to narrow down the number of documents that must be scored? -Yonik http://lucidworks.com
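A minimal SolrJ 4.x sketch of Yonik's suggestion: add a {!geofilt} filter so the recip(geodist(), ...) boost only has to be evaluated for documents within a fixed radius. The field name "store", the point, the radius, and the boost arguments are all made up for illustration.

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class GeoBoostExample {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");

        SolrQuery q = new SolrQuery("pizza");
        q.set("defType", "edismax");
        q.set("sfield", "store");               // location field (LatLonType), hypothetical
        q.set("pt", "45.15,-93.85");            // user's location
        q.set("d", "50");                       // only keep docs within 50 km
        q.addFilterQuery("{!geofilt}");         // cuts down how many docs reach geodist()
        q.set("bf", "recip(geodist(),1,50,50)"); // distance boost over the reduced set; values illustrative

        System.out.println(server.query(q).getResults().getNumFound());
    }
}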
Anybody knows what IBM FileNet search looks like?
And how does it compare to Solr. I am not buying (or selling), just trying to get some technical details and my GoogleFoo is failing me. I thought they were one of the purchased companies, but Autonomy/Verity seems to be referred to as 'old' search engine with FileNet's as new. Regards, Alex. Personal blog: http://blog.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book)
Re: Making protwords.txt changes effective
I think you can put it in your data dir and it'll get reloaded on commit. Try it and report back. Upayavira On Mon, May 13, 2013, at 06:01 PM, Jack Krupansky wrote: Yes, restart Solr. Not to reindex, but simply to reload the file. Well... depending on where you use the protected words, you may need to reindex as well. For a query-time filter you don't need to reindex, but for index-time filters, you must reindex. -- Jack Krupansky -Original Message- From: Shane Magee Sent: Saturday, May 11, 2013 7:42 AM To: solr-user@lucene.apache.org Subject: Making protwords.txt changes effective Hi I added some words to protwords.txt, but there doesnt seem to be any effect in the resulting search. Do I need to restart Apache or Solr or rebuild the index?
Re: rename a core to same name of existing core
Thanks for the information, you are right, I was using the same instance dir. I agree with you; I would like to see an error if I am creating a core with the name of an existing core. Right now I have to do a ping first and check whether the returned code is 404 or not. Jie -- View this message in context: http://lucene.472066.n3.nabble.com/rename-a-core-to-same-name-of-existing-core-tp3090960p4063047.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Anybody knows what IBM FileNet search looks like?
:-) Alex, it seems to be a copyright ... Think about Lucene + ManifoldCF. FileNet is a file repository stored in DB2. ManifoldCF has a connector that helps retrieve files/directories from the DB, and using Lucene it can index the content of the files. I am not sure whether Solr has such a handler, like Tika, but you can write one yourself. Hope it helps, Oleg On Mon, May 13, 2013 at 9:39 PM, Alexandre Rafalovitch arafa...@gmail.comwrote: And how does it compare to Solr. I am not buying (or selling), just trying to get some technical details and my GoogleFoo is failing me. I thought they were one of the purchased companies, but Autonomy/Verity seems to be referred to as 'old' search engine with FileNet's as new. Regards, Alex. Personal blog: http://blog.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book)
Re: How to force a document to be indexed in a given shard at SolrCloud?
Hi, Yes, shard.keys should work for this case. Please check this link: http://docs.lucidworks.com/display/solr/Shards+and+Indexing+Data+in+SolrCloud Thanks. -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-force-a-document-to-be-indexed-in-a-given-shard-at-SolrCloud-tp4063017p4063052.html Sent from the Solr - User mailing list archive at Nabble.com.
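A hedged sketch of how this usually looks on Solr 4.2.x with the default compositeId router: prefixing the uniqueKey with "shardKey!" controls which shard the document hashes to, and shard.keys restricts a query to those shards. The ZooKeeper address, collection name, and the "tenantA" key are invented for illustration.

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class ShardRoutingExample {
    public static void main(String[] args) throws Exception {
        CloudSolrServer server = new CloudSolrServer("zkhost:2181");
        server.setDefaultCollection("collection1");

        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "tenantA!doc1");   // everything with the "tenantA!" prefix lands on the same shard
        doc.addField("title", "routing test");
        server.add(doc);
        server.commit();

        SolrQuery q = new SolrQuery("*:*");
        q.set("shard.keys", "tenantA!");      // only query the shard(s) holding tenantA documents
        System.out.println(server.query(q).getResults().getNumFound());
    }
}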
Re: Faceting json response - odd format
: This is what i am getting which mirrors what is in the documentation: : : facet_counts:{facet_queries:{}, : facet_fields:{metadata_meta_last_author:[Nick,330,standarduser,153,Mohan,52,wwd,49,gerald,45,Riggins,36,fallon,31,blister,28, ,26,morfitelli,24,Administrator,22,morrow,22,richard,22,egilhoi,18,USer Group,16], : : : This is not trivial to parse - I've read the docs but can't seem to : figure out who one might get a more structured response to this. You didn't provide any specifics about what you felt was problematic, but i'm guessing what you want to do is pick the value you think is best for the json.nl param... http://wiki.apache.org/solr/SolJSON#JSON_specific_parameters -Hoss
Re: Faceting json response - odd format
Thank you Hoss, What I would prefer to see, as we do with all other parameters, is a normal key/value pairing. This might look like: {"metadata_meta_last_author":[{"value": "Nick", "count": 330},{"value": "standard user","count": 153},{"value": "Mohan","count": 52},{"value":"wwd","count": 49}… Cord On May 13, 2013, at 12:34 PM, Chris Hostetter hossman_luc...@fucit.org wrote: : This is what i am getting which mirrors what is in the documentation: : : facet_counts:{facet_queries:{}, : facet_fields:{metadata_meta_last_author:[Nick,330,standarduser,153,Mohan,52,wwd,49,gerald,45,Riggins,36,fallon,31,blister,28, ,26,morfitelli,24,Administrator,22,morrow,22,richard,22,egilhoi,18,USer Group,16], : : : This is not trivial to parse - I've read the docs but can't seem to : figure out who one might get a more structured response to this. You didn't provide any specifics about what you felt was problematic, but i'm guessing what you want to do is pick the value you think is best for the json.nl param... http://wiki.apache.org/solr/SolJSON#JSON_specific_parameters -Hoss
RE: spellcheker and exact match
I tried those parameters and it does suggest keywords but not the ones I'm interested in -- View this message in context: http://lucene.472066.n3.nabble.com/spellcheker-and-exact-match-tp4061672p4063060.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Faceting json response - odd format
: What i would prefer to see as we do with all other parameters is a : normal key/value pairing. this might look like: a true key value pairing via a map type structure is what you get with json.nl=map -- but in most client languages that would lose whatever sorting you might have specified with facet.sort. : {metadata_meta_last_author:[{value: Nick, count: 330},{value: : standard user,count: 153},{value: Mohan,count: : 52},{value:wwd,count: 49}… that structure is essentially what you get with json.nl=arrarr -- ie: the values are still in the specified facet.sort order; but instead of an array of maps, it's an array of array pairs. This is the closest equivalent to how the facet counts are internally modeled -- you should be able to inject those keys you choose (value and count) in your client layer fairly easily. -Hoss
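If the client happens to be Java, SolrJ already unpacks the facet NamedList into name/count pairs, so no hand-written JSON parsing is needed. A small sketch; the core URL is assumed and the field name comes from the thread above.

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.FacetField;
import org.apache.solr.client.solrj.response.QueryResponse;

public class FacetParsingExample {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");

        SolrQuery q = new SolrQuery("*:*");
        q.setFacet(true);
        q.addFacetField("metadata_meta_last_author");
        q.setRows(0);   // only the facet counts are needed

        QueryResponse rsp = server.query(q);
        FacetField authors = rsp.getFacetField("metadata_meta_last_author");
        for (FacetField.Count c : authors.getValues()) {
            System.out.println(c.getName() + " -> " + c.getCount());
        }
    }
}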
Solr 4.3 core swap
Since upgrading to Solr 4.3 we get the following errors on our slaves when we swap cores on our master: Solr index directory '/usr/local/solr_aggregate/solr_aggregate/data/index.20130513152644966' is locked. Throwing exception SEVERE: Unable to reload core: production org.apache.solr.common.SolrException: Index locked for write for core production SEVERE: Could not reload core org.apache.solr.common.SolrException: Unable to reload core: production On older Solr versions it would create a new index.* directory and use it, but that hasn't been the case with 4.3. The new core seems to replicate fine and the new index files are in the original index.* directory, so I'm not sure what is happening. Thanks -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-4-3-core-swap-tp4063065.html Sent from the Solr - User mailing list archive at Nabble.com.
Solr Boolean query help
I am trying to form a Solr query. Our documents have a multi-valued field named tag_id. I want to get documents that either do not have tag_id 1 or have both tag_id 1 and 2 i.e. q=(tag_id:(1 AND 2) OR tag_id:(NOT 1)) This is not giving the desired results. The result is the same as that of q=tag_id:(1 AND 2) and the OR condition is ignored. How would one do this query?
Re: Solr Boolean query help
Inner purely negative clauses aren't allowed by Lucene. (Solr supports top-level negative clauses, though, so q=NOT foo works as expected) To get a nested negative clause to work, try this: q=tag_id:(1 AND 2) OR (*:* AND -tag_id:1) On May 13, 2013, at 16:11 , Arun Rangarajan wrote: I am trying to form a Solr query. Our documents have a multi-valued field named tag_id. I want to get documents that either do not have tag_id 1 or have both tag_id 1 and 2 i.e. q=(tag_id:(1 AND 2) OR tag_id:(NOT 1)) This is not giving the desired results. The result is the same as that of q=tag_id:(1 AND 2) and the OR condition is ignored. How would one do this query?
Re: Solr Boolean query help
Pure negative queries only work at the top level. So, try: q=(tag_id:(1 AND 2) OR tag_id:(*:* NOT 1)) -- Jack Krupansky -Original Message- From: Arun Rangarajan Sent: Monday, May 13, 2013 4:11 PM To: solr-user@lucene.apache.org Subject: Solr Boolean query help I am trying to form a Solr query. Our documents have a multi-valued field named tag_id. I want to get documents that either do not have tag_id 1 or have both tag_id 1 and 2 i.e. q=(tag_id:(1 AND 2) OR tag_id:(NOT 1)) This is not giving the desired results. The result is the same as that of q=tag_id:(1 AND 2) and the OR condition is ignored. How would one do this query?
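For completeness, the rewritten query as it might be issued from SolrJ; the core URL is assumed, and the tag_id field comes from the question above.

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class NegativeClauseExample {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");

        // Pure negative clauses only work at the top level, so the nested
        // "NOT 1" is anchored to the full document set with *:*.
        SolrQuery q = new SolrQuery("tag_id:(1 AND 2) OR (*:* AND -tag_id:1)");
        System.out.println(server.query(q).getResults().getNumFound());
    }
}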
Re: Solr Boolean query help
Erik, Jack, Thanks for your quick replies! That works. On Mon, May 13, 2013 at 1:18 PM, Jack Krupansky j...@basetechnology.comwrote: Pure negative queries only work at the top level. So, try: q=(tag_id:(1 AND 2) OR tag_id:(*:* NOT 1)) -- Jack Krupansky -Original Message- From: Arun Rangarajan Sent: Monday, May 13, 2013 4:11 PM To: solr-user@lucene.apache.org Subject: Solr Boolean query help I am trying to form a Solr query. Our documents have a multi-valued field named tag_id. I want to get documents that either do not have tag_id 1 or have both tag_id 1 and 2 i.e. q=(tag_id:(1 AND 2) OR tag_id:(NOT 1)) This is not giving the desired results. The result is the same as that of q=tag_id:(1 AND 2) and the OR condition is ignored. How would one do this query?
writing a custom Filter plugin?
Does anyone know of any tutorials, basic examples, and/or documentation on writing your own Filter plugin for Solr? For Solr 4.x/4.3? I would like a Solr 4.3 version of the normalization filters found here for Solr 1.4: https://github.com/billdueber/lib.umich.edu-solr-stuff But those are old, for Solr 1.4. Does anyone have any hints for writing a simple substitution Filter for Solr 4.x? Or, does a simple sourcecode example exist anywhere?
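In the absence of an up-to-date tutorial, here is a bare-bones, hypothetical substitution filter in the spirit of those normalization filters, written against the Lucene/Solr 4.3 analysis API. All class, package, and file names are invented; note that later 4.x releases changed analysis factories to take a Map<String,String> args constructor, so check the API of your exact version. Compile it into a jar, put the jar in a lib directory referenced by solrconfig.xml, and reference the factory by its full class name in a fieldType's analyzer chain.

package com.example.analysis;

import java.io.IOException;
import java.util.Locale;

import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.util.TokenFilterFactory;

public final class ApostropheNormalizeFilter extends TokenFilter {
    private final CharTermAttribute termAtt = addAttribute(CharTermAttribute.class);

    public ApostropheNormalizeFilter(TokenStream input) {
        super(input);
    }

    @Override
    public boolean incrementToken() throws IOException {
        if (!input.incrementToken()) {
            return false;
        }
        // Replace curly apostrophes with straight ones and lower-case the term text in place.
        String term = termAtt.toString().replace('\u2019', '\'').toLowerCase(Locale.ROOT);
        termAtt.setEmpty().append(term);
        return true;
    }
}

// Factory class, referenced from schema.xml as
// <filter class="com.example.analysis.ApostropheNormalizeFilterFactory"/>
class ApostropheNormalizeFilterFactory extends TokenFilterFactory {
    @Override
    public TokenStream create(TokenStream input) {
        return new ApostropheNormalizeFilter(input);
    }
}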
.skip.autorecovery=Y + restart solr after crash + losing many documents
Hi all, We write to two same-named cores in the same collection for redundancy, and are not taking advantage of the full benefits of solr cloud replication. We use solrcloud.skip.autorecovery=true so that Solr doesn't try to sync the indexes when it starts up. However, we find that if the core is not optimized prior to shutting it down (in a crash situation), we can lose all of the data after starting up. The files are written to disk, but we can lose a full 24 hours worth of data as they are all removed when we start SOLR. (I don't think it is a commit issue) If we optimize before shutting down, we never lose any data. Sadly, sometimes SOLR is in a state where optimizing is not an option. Can anyone think of why that might be? Is there any special configuration you need if you want to write directly to two cores rather than use replication? Version 4.0, this used to work in our 4.0 nightly build, but broke when we migrated to 4.0 production.(until we test and migrate to the replication setup - it won't be too long and I'm a bit embarrassed to be asking this question!) Regards, Gilles
Re: How to get/set customized Solr data source properties?
: learned it should work. And this is my actual code. I create this : DataSource for testing my ideas. I am blocked at the very beginning...sucks : :( but you only showed us one line of code w/o any context. nothing in your email was reproducible for other people to try to compile/run themselves to see if they can figure out why your code isn't working. : : I am working on a DataSource implementation. I want to get some : customized : : properties when the *DataSource.init* method is called. I tried to add : the : ... : : <dataConfig> : : <dataSource type="com.my.company.datasource" : : my="value" /> : : My understanding from looking at other DataSources is that should work. : : : But initProps.getProperty("my") == null. : : can you show us some actual code that fails with that dataConfig you mentioned? -Hoss
Solritas truncates content
Hi, I'm playing around with the example that comes with SOLR 4. I've indexed some documents using the Tika extractor. I'm looking at the velocity templates and trying to figure out how the /browse (solritas) functionality works because I would like to add functionality to view the complete document content. Presently, the content field is truncated in the results to around 730 characters. How is this done? How can I access the full text? I've poked around quite a bit but have not found anything. The content field is added to the result set in richtext-doc.vm: <div class="result-body">#field('content')</div> Any help is greatly appreciated! Peace. Michael
Re: How to deal with cache for facet search when index is always increment?
: For real-time search, docs may be imported into the index at any time. In this : case, the cache nearly always needs to be created again, which makes facet : search very slow. : Do you have any idea how to deal with such a problem? : We're in a similar situation and have had better performance using : facet.method=fcs. : : http://wiki.apache.org/solr/SimpleFacetParameters#facet.method DocValues is another very new option that may help improve the performance of faceting in NRT situations, because it eliminates the need to build the Field Cache... http://wiki.apache.org/solr/DocValues ...but there are caveats to using it (see wiki). -Hoss
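A tiny SolrJ sketch of requesting per-segment faceting; the core URL and the "category" field are hypothetical.

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class FacetMethodExample {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");

        SolrQuery q = new SolrQuery("*:*");
        q.setFacet(true);
        q.addFacetField("category");      // hypothetical facet field
        q.set("facet.method", "fcs");     // per-segment faceting; after a commit only new segments are re-read
        q.setRows(0);

        System.out.println(server.query(q).getFacetField("category").getValues());
    }
}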
Request to be added to ContributorsGroup
Hello Wiki Admins, Request you to please add me to the ContributorsGroup. I have been using Solr for a few years now and I would like to contribute back by adding more information to the wiki Pages. Wiki User Name : Shreejay --Shreejay
Re: Quick question about indexing with SolrJ.
: I don't want to use POJOs, that's the main problem. I know that you can : send AJAX POST HTTP Requests with JSON data to index new documents and I : would like to do that with SolrJ, that's all, but I don't find the way to : do that, :-/ . What I would like to do is simple retrieve an String with an : embedded JSON and add() it via an HttpSolrServer object instance. If the Use ContentStreamUpdateRequest -- provide your pre-generated JSON as the ContentStream (you can back it by a String using ContentStreamBase.StringStream or whatever you have to work with) then process it against your HttpSolrServer object. -Hoss
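A minimal sketch of that suggestion with SolrJ 4.x; the core URL and the document contents are made up, and /update/json is the JSON handler name in the stock Solr 4 example config.

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.request.ContentStreamUpdateRequest;
import org.apache.solr.common.util.ContentStreamBase;

public class JsonUpdateExample {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");

        // Pre-built JSON, e.g. received from elsewhere as a plain String.
        String json = "[{\"id\":\"doc1\",\"title\":\"hello json\"}]";

        ContentStreamUpdateRequest req = new ContentStreamUpdateRequest("/update/json");
        ContentStreamBase.StringStream stream = new ContentStreamBase.StringStream(json);
        stream.setContentType("application/json");
        req.addContentStream(stream);
        req.setParam("commit", "true");

        server.request(req);
    }
}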
Re: Request to be added to ContributorsGroup
On May 13, 2013, at 6:54 PM, Shreejay Nair shreej...@gmail.com wrote: Hello Wiki Admins, Request you to please add me to the ContributorsGroup. I have been using Solr for a few years now and I would like to contribute back by adding more information to the wiki Pages. Wiki User Name : Shreejay --Shreejay Added to the solr wiki ContributorsGroup. - Steve
Re: How to get/set customized Solr data source properties?
If the property has a full stop, it is probably going through the scoped resolver which may be causing issues. I would start with very basic property name format and see what happens. Otherwise, it is probably a breakpoint and debug time. Regards, Alex. Personal blog: http://blog.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book) On Mon, May 13, 2013 at 6:08 PM, Chris Hostetter hossman_luc...@fucit.org wrote: : learned it should work. And this is my actual code. I create this : DataSource for testing my ideas. I am blocked at the very beginning...sucks : :( but you only showed us one line of code w/o any context. nothing in your email was reproducible for other people to try to compile/run themselves to see if htey can figure out why your code isn't working. : : I am working on a DataSource implementation. I want to get some : customized : : properties when the *DataSource.init* method is called. I tried to add : the : ... : : dataConfig : : dataSource type=com.my.company.datasource : : my=value / : : My understanding from looking at other DataSources is that should work. : : : But initProps.getProperty(my) == null. : : can you show us some actual that fails with that dataConfig you mentioned? -Hoss
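For reference, a bare-bones, hypothetical DataSource showing where initProps comes from; the package and class names are invented, and the "my" attribute matches the dataConfig snippet in the thread.

package com.my.company;

import java.util.Properties;

import org.apache.solr.handler.dataimport.Context;
import org.apache.solr.handler.dataimport.DataSource;

public class MyDataSource extends DataSource<String> {
    private String myProp;

    @Override
    public void init(Context context, Properties initProps) {
        // Attributes of the <dataSource .../> element arrive here as plain properties.
        myProp = initProps.getProperty("my");
    }

    @Override
    public String getData(String query) {
        return "my=" + myProp;   // placeholder payload for testing
    }

    @Override
    public void close() {
        // nothing to release in this sketch
    }
}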
Re: Solritas truncates content
#field is defined in conf/velocity/VM_global_library.vm as: #macro(field $f) #if($response.response.highlighting.get($docId).get($f).get(0)) #set($pad = "") #foreach($v in $response.response.highlighting.get($docId).get($f)) $pad$v## #set($pad = " ... ") #end #else #foreach($v in $doc.getFieldValues($f)) $v## #end #end #end It's a little ugly because it supports highlighting if a field has any values for that document in the highlighting section of the response. But if there is no highlighting, then it outputs each value of a field as-is from the response. Are you sure you're getting it truncated? Try adding wt=xml to the /browse requests you're making and see if perhaps the actual value coming back from Solr is the same as what you're seeing rendered. Unless it's from highlighting, it should be the same. Erik On May 13, 2013, at 18:14 , Michael Schmitz wrote: Hi, I'm playing around with the example that comes with SOLR 4. I've indexed some documents using the Tika extractor. I'm looking at the velocity templates and trying to figure out how the /browse (solritas) functionality works because I would like to add functionality to view the complete document content. Presently, the content field is truncated in the results to around 730 characters. How is this done? How can I access the full text? I've poked around quite a bit but have not found anything. The content field is added to the result set in richtext-doc.vm: <div class="result-body">#field('content')</div> Any help is greatly appreciated! Peace. Michael
Re: How to improve performance of geodist()
Hi Nicholas, Given that boosting is generally an inherently fuzzy / inexact thing, you can likely get away with using simpler calculations. dist() can do the Euclidean distance (i.e. the Pythagorean theorem). If your data is in just one region of the world, you can project your data into a 2-D plane (a so-called projection) and use the Euclidean distance. If your data is everywhere, you may need to use multiple projections, putting them in separate fields for each projection, and then choose the best projected set of coordinates based on your starting point. ~ David Nicholas Ding wrote Yes, I did. But instead of sorting by geodist(), I use function query to boost by distance. That's why I noticed the heavy calculation happened in the processing. Example: bf=recip(geodist(), 50, 5) Basically, I think the boost function will iterate all the results, and calculate the distance. On Mon, May 13, 2013 at 1:27 PM, Yonik Seeley yonik@ wrote: On Mon, May 13, 2013 at 1:12 PM, Nicholas Ding nicholasdsj@ wrote: I'm using geodist() in a recip boost function. I noticed a performance impact to the response time. I did a profiling session, the geodist() calculation took 30% of CPU time. Are you also using an fq with geofilt to narrow down the number of documents that must be scored? -Yonik http://lucidworks.com - Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-improve-performance-of-geodist-tp4063004p4063136.html Sent from the Solr - User mailing list archive at Nabble.com.
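A hedged sketch of David's suggestion: boost by a plain Euclidean dist() over projected coordinate fields instead of geodist(). The x_coord/y_coord field names, the query point, and the recip arguments are all invented and assume the data has been projected onto a plane.

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class EuclideanBoostExample {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");

        SolrQuery q = new SolrQuery("pizza");
        q.set("defType", "edismax");
        // dist(2, ...) is the 2-norm (Pythagorean) distance between the document's
        // projected coordinates and the query point; cheaper than Haversine.
        q.set("bf", "recip(dist(2,x_coord,y_coord,1250.0,3420.0),1,1000,1000)");

        System.out.println(server.query(q).getResults().getNumFound());
    }
}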
Request to be added to Contributor Group
Hi Admins, My name is Eric. I got an account at http://wiki.apache.org/solr/ with the user name Eric D. Please add me to the Contributor Group. We currently have JobSearcher.com.au up and running, which is using Solr. I am sure we can add comments and share some experience with Solr up there. Thank you very much. Best regards,