Typecast non stored string field for sorting
Hi friends, I have a field that I created as a string by mistake; it should have been an int. It is indexed but not stored. I want to sort on it numerically, so I am looking for a function that can convert the value to an integer or double at query time, after which I can apply the sort. Is that possible? If not, can I create a new field from the values of the non-stored field? Please advise. Thanks, Abhishek
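To make the problem concrete, here is a minimal Python sketch of why sorting on a string field gives the wrong order for numbers. It does not call Solr at all; the sample values and the function names are made up for illustration. The zero-padding helper is only one possible workaround and is an assumption on my part: it would have to be applied when the documents are indexed, so it does not avoid reindexing.

```python
# Hypothetical values, as they would sit in a string-typed field.
SAMPLE_VALUES = ["9", "10", "100", "25"]


def lexical_sort(values):
    """Sort the way a plain string field sorts: character by character."""
    return sorted(values)


def numeric_sort(values):
    """Sort the way an int field would sort: by numeric value."""
    return sorted(values, key=int)


def zero_pad(values, width=10):
    """Left-pad with zeros so that string order agrees with numeric order."""
    return [v.zfill(width) for v in values]


if __name__ == "__main__":
    print("string order: ", lexical_sort(SAMPLE_VALUES))
    print("numeric order:", numeric_sort(SAMPLE_VALUES))
    print("padded order: ", lexical_sort(zero_pad(SAMPLE_VALUES, width=4)))
```

With padded values the string sort and the numeric sort agree, which is why fixed-width padding is sometimes used when a field has to stay a string.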
Error handling in Solr.
hi friends, While browsing through the logs of Solr, I noticed a few null pointer exceptions. I am concerned; what could be the reason?

ERROR org.apache.solr.core.SolrCore – java.lang.NullPointerException
    at org.apache.solr.handler.admin.ShowFileRequestHandler.showFromFileSystem(ShowFileRequestHandler.java:212)
    at org.apache.solr.handler.admin.ShowFileRequestHandler.handleRequestBody(ShowFileRequestHandler.java:122)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1820)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:656)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:359)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:155)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1307)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:453)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:560)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1072)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:382)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1006)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
    at org.eclipse.jetty.server.Server.handle(Server.java:365)
    at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:485)
    at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
    at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:926)
    at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:988)
    at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:635)
    at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
    at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
    at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
    at java.lang.Thread.run(Unknown Source)

Please help, -- Thanks and kind Regards, Abhishek jain
Stopping Solr instance
Hi friends, What is the best way to stop Solr from the command line? The command with the stop port and secret key, as given in most online help links, does not work for me every time; most times I have to kill the process. I have, though, noticed excessive swap usage whenever I have to kill it. Is there a link between swap usage and Solr not stopping? Please let me know the best way to stop a Solr instance. Thanks, Abhi
Re: AND not as a boolean operator in Phrase
Hi,

Ok thanks. I want to search for the phrase A and B, with the word *and* sandwiched between A and B. I don't want and to act as a boolean operator when it is within quotes. I have and as a stop word and I don't want to reindex the data. What is my best bet?

thanks
abhishek jain

On Sun, Mar 30, 2014 at 2:33 AM, Bob Laferriere spongeb...@icloud.com wrote:

If you are using edismax you need to use AND. So A AND B will ignore the stop word and apply the Boolean operator. You can configure edismax to ignore Boolean stop words that are lowercase.

Regards, Bob

On Mar 26, 2014, at 2:39 AM, abhishek jain abhishek.netj...@gmail.com wrote:

Hi Jack, You are right, I am using 'and' as a stop word in both indexing and query. Should I use it only during indexing? thanks

On Tue, Mar 25, 2014 at 11:09 PM, Jack Krupansky j...@basetechnology.com wrote:

What does your field type analyzer look like? I suspect that you have a stop filter which causes and to be removed.

-- Jack Krupansky

-----Original Message-----
From: abhishek jain
Sent: Tuesday, March 25, 2014 1:29 PM
To: solr-user@lucene.apache.org
Subject: AND not as a boolean operator in Phrase

hi friends, when I search for A and B it gives me results for A , B; I am not sure why. Please guide me on how I can get an exact match when it is within a phrase/quotes.

-- Thanks and kind Regards, Abhishek jain
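Jack's point about the stop filter can be sketched in a few lines of Python. This is a deliberately simplified model and not Solr code: the tokenizer is a plain regex, the stop word list holds only and, and token positions (position increments) are ignored, which a real Lucene phrase query may take into account. All names are hypothetical.

```python
import re

# Assumption: "and" is in the stop word list for both index and query analysis.
STOP_WORDS = {"and"}


def analyze(text):
    """Lowercase, split on non-word characters, then drop stop words."""
    tokens = re.findall(r"\w+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def phrase_matches(query_phrase, document):
    """True if the analyzed phrase occurs as a contiguous run in the analyzed document."""
    q = analyze(query_phrase)
    d = analyze(document)
    n = len(q)
    return n > 0 and any(d[i:i + n] == q for i in range(len(d) - n + 1))


if __name__ == "__main__":
    print(analyze("A and B"))                  # stop word is gone
    print(analyze("A , B"))                    # punctuation is gone
    print(phrase_matches("A and B", "A , B"))
```

Under this model the phrase A and B and the text A , B both reduce to the same two tokens, which is consistent with the behavior reported above. Keeping and searchable would mean it must survive analysis at index time as well, which is why the stop filter matters here.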
Strange behavior while deleting
hi friends, I have observed a strange behavior. I have two indexes with the same ids and the same number of docs, and I am using a JSON file to delete records from both indexes. After deleting the ids, the resulting indexes show different doc counts, and I am not sure why. I used curl with the same JSON file to delete from both indexes. Please advise asap, thanks -- Thanks and kind Regards, Abhishek
Re: AND not as a boolean operator in Phrase
Hi Jack, You are right, I am using 'and' as a stop word in both indexing and query. Should I use it only during indexing? thanks

On Tue, Mar 25, 2014 at 11:09 PM, Jack Krupansky j...@basetechnology.com wrote:

What does your field type analyzer look like? I suspect that you have a stop filter which causes and to be removed.

-- Jack Krupansky

-----Original Message-----
From: abhishek jain
Sent: Tuesday, March 25, 2014 1:29 PM
To: solr-user@lucene.apache.org
Subject: AND not as a boolean operator in Phrase

hi friends, when I search for A and B it gives me results for A , B; I am not sure why. Please guide me on how I can get an exact match when it is within a phrase/quotes.

-- Thanks and kind Regards, Abhishek jain
AND not as a boolean operator in Phrase
hi friends, when I search for A and B it gives me results for A , B; I am not sure why. Please guide me on how I can get an exact match when it is within a phrase/quotes. -- Thanks and kind Regards, Abhishek jain
Re: Optimizing RAM
hi Shawn,

Thanks for the reply. Is there a way to optimize RAM, or does Solr do it automatically? I have multiple shards and I know I will be querying only 30% of the shards most of the time, and I have 6 slaves, so I am thinking of dedicating more slaves to the 30% most-used shards.

Another question: is it advisable to serve queries from the master or only from the slaves, or does it not matter?

thanks
Abhishek

On Tue, Mar 11, 2014 at 9:12 PM, Shawn Heisey s...@elyograg.org wrote:

On 3/11/2014 6:14 AM, abhishek.netj...@gmail.com wrote:

Hi all, What should be the ideal RAM to index size ratio? Please reply. I expect the index to be 60 GB in size and I don't store contents.

Ideally, your total system RAM will be equal to the size of all your program's heap requirements, plus the size of all the data for all the programs. If Solr is the only thing on the box, then the ideal memory size is roughly the Solr heap plus the size of all the Solr indexes that live on that machine. So if your heap is 8GB and your index is 60GB, you'll want at least 68GB of RAM for an ideal setup. I don't know how big your heap is, so I am guessing here.

You said your index does not store much content. That means you will need a higher percentage of your total index size to be in RAM for good performance. I would estimate that you want a minimum of two thirds of your index in RAM, which indicates a minimum RAM size of 48GB if we assume your heap is 8GB. 64GB would be better.

http://wiki.apache.org/solr/SolrPerformanceProblems#General_information

Thanks, Shawn

-- Thanks and kind Regards, Abhishek jain
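Shawn's arithmetic can be written out as a small Python sketch. The function names are hypothetical, the 8 GB heap is his guess rather than a measured value, and the two-thirds fraction is his rough estimate for an index with little stored content, so treat the output as a rule of thumb only.

```python
def ideal_ram_gb(heap_gb, index_gb):
    """Ideal case: the whole index fits in the OS disk cache next to the heap."""
    return heap_gb + index_gb


def minimum_ram_gb(heap_gb, index_gb, cached_num=2, cached_den=3):
    """Minimum case: only a fraction (default two thirds) of the index is cached."""
    return heap_gb + index_gb * cached_num / cached_den


if __name__ == "__main__":
    heap_gb = 8    # assumed heap size, as guessed in the reply above
    index_gb = 60  # expected index size from the original question
    print("ideal RAM (GB):  ", ideal_ram_gb(heap_gb, index_gb))
    print("minimum RAM (GB):", minimum_ram_gb(heap_gb, index_gb))
```

With these inputs the ideal figure is 68 GB and the minimum is 48 GB, which matches the numbers in the reply. The 32 GB server mentioned in the original Optimizing RAM post falls below both.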
Re: Which Tokenizer to use at searching
Hi,

As a solution, I have tried a combination of PatternTokenizerFactory and PatternReplaceFilterFactory. In both the query and index analyzers I have written:

<tokenizer class="solr.PatternTokenizerFactory" pattern="\s+" />
<filter class="solr.PatternReplaceFilterFactory" pattern="([^-\w]+)" replacement=" punct " replace="all"/>

What I am trying to do is tokenize on whitespace and then rewrite every special character as punct. So A,B becomes A punct B. The problem is that A punct B is still one word and is not tokenized further after the filter is applied. Is there a way I can tokenize after the filter is applied? Please suggest; I know I am missing something basic.

thanks
abhishek

On Mon, Mar 10, 2014 at 2:06 AM, abhishek.netj...@gmail.com wrote:

Hi, Oops, my bad. I actually meant: while indexing A,B, a search for A and B should give a result but A B should not give a result. Also, I will look at the analyser. Thanks Abhishek

Original Message
From: Erick Erickson
Sent: Monday, 10 March 2014 01:38
To: abhishek jain
Subject: Re: Which Tokenizer to use at searching

Then I don't see the problem. StandardTokenizer (see the text_general fieldType) should do all this for you automatically. Did you look at the analysis page? I really recommend it.

Best, Erick

On Sun, Mar 9, 2014 at 3:04 PM, abhishek jain abhishek.netj...@gmail.com wrote:

Hi Erick, Thanks for replying. I want to index A,B (with or without a space after the comma) as separate words, and I also want to return results when A and B are searched individually and also when A,B is searched. Please let me know your views. Let me know if I still haven't explained it correctly; I will try again. Thanks abhishek

On Sun, Mar 9, 2014 at 11:49 PM, Erick Erickson erickerick...@gmail.com wrote:

You've contradicted yourself, so it's hard to say. Or I'm mis-reading your messages.

bq: During indexing i want to token on all punctuations, so i can use StandardTokenizer, but at search time i want to consider punctuations as part of text,

and in your second message:

bq: when i search for A,B it should return result. [for input A,B]

If, indeed, you "... at search time i want to consider punctuations as part of text" then A,B should NOT match the document.

The admin/analysis page is your friend, I strongly suggest you spend some time looking at the various transformations performed by the various analyzers and tokenizers.

Best, Erick

On Sun, Mar 9, 2014 at 1:54 PM, abhishek jain abhishek.netj...@gmail.com wrote:

hi, Thanks for replying promptly. An example: I want to index A,B, and when I search for A AND B it should return a result; when I search for A,B it should return a result. Ideally, when I search for A , B (with a space) it should also return a result. Please advise. thanks abhishek

On Sun, Mar 9, 2014 at 9:52 PM, Furkan KAMACI furkankam...@gmail.com wrote:

Hi; Firstly you have to keep in mind that if you don't index punctuation they will not be visible for search. On the other hand you can have different analyzer for index and search. You have to give more detail about your situation. What will be your tokenizer at search time, WhiteSpaceTokenizer? You can have a look at here: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters If you can give some examples what you want for indexing and searching I can help you to combine index and search analyzer/tokenizer/token filters.

Thanks; Furkan KAMACI

2014-03-09 18:06 GMT+02:00 abhishek jain abhishek.netj...@gmail.com:

Hi Friends, I am concerned about the Tokenizer. My scenario is: during indexing I want to tokenize on all punctuation, so I can use StandardTokenizer, but at search time I want to treat punctuation as part of the text. I don't store contents, only indexes. What should I use? Any advice?

-- Thanks and kind Regards, Abhishek jain
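The behavior described at the top of this message (A punct B staying a single token) follows from the order of the analysis chain: a token filter rewrites each token after the tokenizer has already split the text, so it cannot create new token boundaries. The Python sketch below only imitates that ordering with regular expressions; it is not Solr code and the function names are made up. The second pipeline does the replacement on the raw text before tokenizing, which is roughly the role a char filter such as solr.PatternReplaceCharFilterFactory plays in Solr. Whether that fits this schema is an assumption that should be checked on the admin analysis page.

```python
import re


def whitespace_tokenize(text):
    """Rough stand-in for a tokenizer that splits on \\s+."""
    return [t for t in re.split(r"\s+", text) if t]


def replace_in_tokens(tokens):
    """Rough stand-in for a token filter: rewrites each token, never splits it."""
    return [re.sub(r"[^-\w]+", " punct ", t) for t in tokens]


def replace_in_text(text):
    """Rough stand-in for a char filter: rewrites the raw text before tokenizing."""
    return re.sub(r"[^-\w\s]+", " punct ", text)


def tokenize_then_filter(text):
    """Order used in the message above: tokenizer first, replacement second."""
    return replace_in_tokens(whitespace_tokenize(text))


def filter_then_tokenize(text):
    """Alternative order: replacement first, tokenizer second."""
    return whitespace_tokenize(replace_in_text(text))


if __name__ == "__main__":
    print(tokenize_then_filter("A,B"))    # one token containing spaces
    print(filter_then_tokenize("A,B"))    # three separate tokens
    print(filter_then_tokenize("A , B"))  # same three tokens
```

In the second ordering, A,B and A , B both come out as the tokens A, punct, B, so the comma is kept as a searchable marker while A and B remain separate words.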
Which Tokenizer to use at searching
Hi Friends, I am concerned about the Tokenizer. My scenario is: during indexing I want to tokenize on all punctuation, so I can use StandardTokenizer, but at search time I want to treat punctuation as part of the text. I don't store contents, only indexes. What should I use? Any advice? -- Thanks and kind Regards, Abhishek jain
Re: Which Tokenizer to use at searching
hi, Thanks for replying promptly. An example: I want to index A,B, and when I search for A AND B it should return a result; when I search for A,B it should return a result. Ideally, when I search for A , B (with a space) it should also return a result. Please advise. thanks abhishek

On Sun, Mar 9, 2014 at 9:52 PM, Furkan KAMACI furkankam...@gmail.com wrote:

Hi; Firstly you have to keep in mind that if you don't index punctuation they will not be visible for search. On the other hand you can have different analyzer for index and search. You have to give more detail about your situation. What will be your tokenizer at search time, WhiteSpaceTokenizer? You can have a look at here: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters If you can give some examples what you want for indexing and searching I can help you to combine index and search analyzer/tokenizer/token filters.

Thanks; Furkan KAMACI

2014-03-09 18:06 GMT+02:00 abhishek jain abhishek.netj...@gmail.com:

Hi Friends, I am concerned about the Tokenizer. My scenario is: during indexing I want to tokenize on all punctuation, so I can use StandardTokenizer, but at search time I want to treat punctuation as part of the text. I don't store contents, only indexes. What should I use? Any advice?

-- Thanks and kind Regards, Abhishek jain
Optimizing RAM
hi friends, I want to index a fairly large amount of data, and I want to keep both stemmed and unstemmed versions. I am confused about whether I should keep two separate indexes, or one index with two versions of the column, I mean col1_stemmed and col2_unstemmed. I have a multicore, multi-shard configuration. My server has 32 GB of RAM, and I calculated the stemmed index size (without content) as 60 GB. I do not want to put too much load, including I/O load, on a decent server with some 5 other replicated servers, and I want to use the servers for other purposes as well. Also, is it advisable to serve queries from the master server, or only from the slaves? -- Thanks, Abhishek
Re: Which Tokenizer to use at searching
Hi Erick, Thanks for replying. I want to index A,B (with or without a space after the comma) as separate words, and I also want to return results when A and B are searched individually and also when A,B is searched. Please let me know your views. Let me know if I still haven't explained it correctly; I will try again. Thanks abhishek

On Sun, Mar 9, 2014 at 11:49 PM, Erick Erickson erickerick...@gmail.com wrote:

You've contradicted yourself, so it's hard to say. Or I'm mis-reading your messages.

bq: During indexing i want to token on all punctuations, so i can use StandardTokenizer, but at search time i want to consider punctuations as part of text,

and in your second message:

bq: when i search for A,B it should return result. [for input A,B]

If, indeed, you "... at search time i want to consider punctuations as part of text" then A,B should NOT match the document.

The admin/analysis page is your friend, I strongly suggest you spend some time looking at the various transformations performed by the various analyzers and tokenizers.

Best, Erick

On Sun, Mar 9, 2014 at 1:54 PM, abhishek jain abhishek.netj...@gmail.com wrote:

hi, Thanks for replying promptly. An example: I want to index A,B, and when I search for A AND B it should return a result; when I search for A,B it should return a result. Ideally, when I search for A , B (with a space) it should also return a result. Please advise. thanks abhishek

On Sun, Mar 9, 2014 at 9:52 PM, Furkan KAMACI furkankam...@gmail.com wrote:

Hi; Firstly you have to keep in mind that if you don't index punctuation they will not be visible for search. On the other hand you can have different analyzer for index and search. You have to give more detail about your situation. What will be your tokenizer at search time, WhiteSpaceTokenizer? You can have a look at here: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters If you can give some examples what you want for indexing and searching I can help you to combine index and search analyzer/tokenizer/token filters.

Thanks; Furkan KAMACI

2014-03-09 18:06 GMT+02:00 abhishek jain abhishek.netj...@gmail.com:

Hi Friends, I am concerned about the Tokenizer. My scenario is: during indexing I want to tokenize on all punctuation, so I can use StandardTokenizer, but at search time I want to treat punctuation as part of the text. I don't store contents, only indexes. What should I use? Any advice?

-- Thanks and kind Regards, Abhishek jain
RE: Special character search in Solr and boosting without altering the resultset
Hi, Ok thanks, I will look into it more. Any info on boosting without altering the resultset? Thanks Abhishek

-----Original Message-----

Hi Abhishek, dot is not a special character. Your field type / analyzer is stripping that character. Please see similar discussions and alternative solutions.

http://search-lucene.com/m/6dbI9zMSob1
http://search-lucene.com/m/Ac71G0KlGz
http://search-lucene.com/m/RRD2D1p1mi

Ahmet

On Friday, January 31, 2014 8:23 PM, abhishek jain abhishek.netj...@gmail.com wrote:

Hi friends, I am facing a strange problem. When I search for a term, e.g. .Net, Solr searches for Net and does not include the '.'. Is dot a special character in Solr? I tried escaping it with a backslash in the URL call to Solr, but to no avail; I get the same resultset.

Also, is there a way to boost some terms within a resultset? I mean I want to boost a term within a result and I don't want to fire a separate query. I couldn't use the OR operator as it would modify the resultset. I want to use a single query and boost. I don't want to use a dismax query either. Please advise.

Thanks, Abhishek
RE: Special character search in Solr and boosting without altering the resultset
Hi, Thanks for replying, but if I understand correctly, q=term1 term2^0.6 means it will search for term1 and term2 and give somewhat less boost to term2. I want to search only for term1, and if term2 exists, boost by a positive factor. I am not able to write such a query. Thanks Abhishek

-----Original Message-----
From: Ahmet Arslan [mailto:iori...@yahoo.com]
Sent: Saturday, February 1, 2014 8:51 PM
To: solr-user@lucene.apache.org
Subject: Re: Special character search in Solr and boosting without altering the resultset

Hi, Can you elaborate on your boosting requirement? There is a caret operator to boost query terms, for example: q=term1 term2^0.6

On Saturday, February 1, 2014 1:51 PM, abhishek jain abhishek.netj...@gmail.com wrote:

Hi, Ok thanks, I will look into it more. Any info on boosting without altering the resultset? Thanks Abhishek

-----Original Message-----

Hi Abhishek, dot is not a special character. Your field type / analyzer is stripping that character. Please see similar discussions and alternative solutions.

http://search-lucene.com/m/6dbI9zMSob1
http://search-lucene.com/m/Ac71G0KlGz
http://search-lucene.com/m/RRD2D1p1mi

Ahmet

On Friday, January 31, 2014 8:23 PM, abhishek jain abhishek.netj...@gmail.com wrote:

Hi friends, I am facing a strange problem. When I search for a term, e.g. .Net, Solr searches for Net and does not include the '.'. Is dot a special character in Solr? I tried escaping it with a backslash in the URL call to Solr, but to no avail; I get the same resultset.

Also, is there a way to boost some terms within a resultset? I mean I want to boost a term within a result and I don't want to fire a separate query. I couldn't use the OR operator as it would modify the resultset. I want to use a single query and boost. I don't want to use a dismax query either. Please advise.

Thanks, Abhishek
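One way to express "match only on term1, but score higher when term2 is also present" in the standard Lucene query syntax is to mark term1 as required with + and leave term2 optional with a caret boost. The Python sketch below only builds the query string and a request URL; it does not contact Solr. The base URL, the core name and the boost value are placeholders, and the approach assumes the default operator is OR (with q.op=AND the second term would become required as well).

```python
from urllib.parse import urlencode


def boosted_query(required_term, optional_term, boost=2.0):
    """Build '+required optional^boost': the first term filters, the second only scores."""
    return f"+{required_term} {optional_term}^{boost}"


def select_url(base_url, query):
    """URL-encode the query for a /select request (the base URL is a placeholder)."""
    return f"{base_url}/select?{urlencode({'q': query, 'wt': 'json'})}"


if __name__ == "__main__":
    q = boosted_query("term1", "term2", boost=2.0)
    print(q)
    # Hypothetical host and core name, for illustration only.
    print(select_url("http://localhost:8983/solr/collection1", q))
```

The + sign has to be URL-encoded as %2B, otherwise it is read as a space and term1 stops being required; urlencode takes care of that here.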
Remove stemming without reindexing - currently using KStem
Hi Friends, Is it possible to remove stemming without having to reindex all of the data? I am using KStem. Can we do so through the query itself? I am not sure how. I am not using dismax. Thanks Abhishek
Special character search in Solr and boosting without altering the resultset
Hi friends, I am facing a strange problem. When I search for a term, e.g. .Net, Solr searches for Net and does not include the '.'. Is dot a special character in Solr? I tried escaping it with a backslash in the URL call to Solr, but to no avail; I get the same resultset. Also, is there a way to boost some terms within a resultset? I mean I want to boost a term within a result and I don't want to fire a separate query. I couldn't use the OR operator as it would modify the resultset. I want to use a single query and boost. I don't want to use a dismax query either. Please advise. Thanks, Abhishek