Typecast non stored string field for sorting

2014-04-23 Thread abhishek jain
Hi friends,
I have a field of type string that I created by mistake; it should have
been int.
It is indexed but not stored.

I want to sort on it numerically, so I need a function that can convert it
to an integer or double at query time before the sort is applied. Is that
possible?
If not, can I create a new field from the values of the non-stored field?

Please advise.
Thanks
Abhishek

-- 
Thanks and kind Regards,
Abhishek jain
+91 9971376767
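
For illustration, here is why sorting on a string field gives an unexpected order: strings compare character by character, so "10" sorts before "9". This is a plain Python sketch, not Solr code, and the sample values are made up. Since the field is not stored, Solr does not return its original values, so filling a new numeric field would typically mean reindexing from the source data (an assumption worth checking against your setup).

```python
# Illustration only: made-up values standing in for the contents of a string field.
values = ["10", "9", "100", "25"]


def lexicographic_order(items):
    """Order values the way a plain string comparison does."""
    return sorted(items)


def numeric_order(items):
    """Order the same values by their integer interpretation."""
    return sorted(items, key=int)


print(lexicographic_order(values))
print(numeric_order(values))
```

The first call shows the order a string sort produces; the second shows the order an int field would give.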


Error handling in Solr.

2014-04-08 Thread abhishek jain
Hi friends,
While browsing through the Solr logs, I noticed a few null pointer
exceptions. I am concerned; what could be the cause?


ERROR org.apache.solr.core.SolrCore - java.lang.NullPointerException
    at org.apache.solr.handler.admin.ShowFileRequestHandler.showFromFileSystem(ShowFileRequestHandler.java:212)
    at org.apache.solr.handler.admin.ShowFileRequestHandler.handleRequestBody(ShowFileRequestHandler.java:122)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1820)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:656)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:359)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:155)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1307)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:453)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:560)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1072)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:382)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1006)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
    at org.eclipse.jetty.server.Server.handle(Server.java:365)
    at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:485)
    at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
    at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:926)
    at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:988)
    at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:635)
    at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
    at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
    at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
    at java.lang.Thread.run(Unknown Source)

Please help,

-- 
Thanks and kind Regards,
Abhishek jain
+91 9971376767


Stopping Solr instance

2014-04-08 Thread abhishek jain
Hi friends,

What is the best way to stop Solr from the command line? The command with the
stop port and secret key, as given in most online help links, doesn't always
work for me.

Most of the time I have to kill the process. I have also noticed excessive swap
usage whenever I have to kill it. Is there a link between swap usage and Solr
not stopping?

Please let me know the best way to stop a Solr instance.

Thanks

Abhi 


Re: AND not as a boolean operator in Phrase

2014-04-02 Thread abhishek jain
Hi,
OK, thanks.
I want to search for the phrase A and B, with the word *and* sandwiched
between A and B. I don't want "and" to be treated as a boolean operator when
it is within quotes.

I have "and" as a stop word, and I don't want to reindex the data.

What is my best bet?

thanks
abhishek jain
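
A rough model of what a stop filter does to the phrase may help here. This is only a Python sketch: it assumes "and" is the only stop word, it uses a simple word-character split rather than Solr's real tokenizers, and it ignores token positions, which a real phrase query also takes into account.

```python
import re

STOP_WORDS = {"and"}  # assumption: "and" is the only stop word configured


def analyze(text, stop_words=STOP_WORDS):
    """Lowercase the text, split it into word tokens, and drop stop words."""
    tokens = re.findall(r"\w+", text.lower())
    return [t for t in tokens if t not in stop_words]


print(analyze("A and B"))
print(analyze("A , B"))
```

Under these assumptions both inputs reduce to the same token list, which is consistent with a quoted search for A and B also matching documents that contain A , B.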


On Sun, Mar 30, 2014 at 2:33 AM, Bob Laferriere spongeb...@icloud.com wrote:

 If you are using edismax you need to use AND. So A AND B will ignore the
 stop word and apply the Boolean operator. You can configure edismax to
 ignore Boolean stop words that are lowercase.

 Regards,

 Bob

  On Mar 26, 2014, at 2:39 AM, abhishek jain abhishek.netj...@gmail.com
 wrote:
 
  Hi Jack,
  You are right, i am using 'and' as a stop word in both indexing and
 query,
 
  Should i use it only during  indexing?
 
  thanks
 
 
 
  On Tue, Mar 25, 2014 at 11:09 PM, Jack Krupansky 
 j...@basetechnology.comwrote:
 
  What does your field type analyzer look like?
 
  I suspect that you have a stop filter which cause and to be removed.
 
  -- Jack Krupansky
 
  -Original Message- From: abhishek jain Sent: Tuesday, March 25,
  2014 1:29 PM To: solr-user@lucene.apache.org Subject: AND not as a
  boolean operator in Phrase
  hi friends,
 
  when i search for A and B it gives me result for A , B , i am not sure
  why?
 
  Please guide how can i exact match when it is within phrase/quotes.
 
  --
  Thanks and kind Regards,
  Abhishek jain
 
 
 
  --
  Thanks and kind Regards,
  Abhishek jain
  +91 9971376767




-- 
Thanks and kind Regards,
Abhishek jain
+91 9971376767


Strange behavior while deleting

2014-03-31 Thread abhishek jain
Hi friends,
I have observed some strange behavior.

I have two indexes with the same ids and the same number of docs, and I am
using a JSON file to delete records from both indexes.
After deleting the ids, the two indexes show different doc counts.

I am not sure why; I used curl with the same JSON file to delete from both
indexes.

Please advise ASAP.
Thanks

-- 
Thanks and kind Regards,
Abhishek


Re: AND not as a boolean operator in Phrase

2014-03-26 Thread abhishek jain
Hi Jack,
You are right, I am using 'and' as a stop word at both index and query time.

Should I use it only during indexing?

Thanks



On Tue, Mar 25, 2014 at 11:09 PM, Jack Krupansky j...@basetechnology.com wrote:

 What does your field type analyzer look like?

 I suspect that you have a stop filter which cause and to be removed.

 -- Jack Krupansky

 -Original Message- From: abhishek jain Sent: Tuesday, March 25,
 2014 1:29 PM To: solr-user@lucene.apache.org Subject: AND not as a
 boolean operator in Phrase
 hi friends,

 when i search for A and B it gives me result for A , B , i am not sure
 why?

 Please guide how can i exact match when it is within phrase/quotes.

 --
 Thanks and kind Regards,
 Abhishek jain




-- 
Thanks and kind Regards,
Abhishek jain
+91 9971376767


AND not as a boolean operator in Phrase

2014-03-25 Thread abhishek jain
Hi friends,

When I search for A and B, it also gives me results for A , B; I am not sure
why.

Please guide me on how to get an exact match when the words are within a phrase/quotes.

-- 
Thanks and kind Regards,
Abhishek jain


Re: Optimizing RAM

2014-03-11 Thread abhishek jain
Hi Shawn,
Thanks for the reply.

Is there a way to optimize RAM usage, or does Solr do that automatically? I have
multiple shards, and I know I will be querying only 30% of the shards most of
the time. I also have 6 slaves, so I am thinking of dedicating more slaves to
the 30% most-used shards.

Another question:
Is it advisable to serve queries from the master, or only from the slaves? Or
does it not matter?

Thanks
Abhishek




On Tue, Mar 11, 2014 at 9:12 PM, Shawn Heisey s...@elyograg.org wrote:

 On 3/11/2014 6:14 AM, abhishek.netj...@gmail.com wrote:
  Hi all,
  What should be the ideal RAM index size ratio.
 
  please reply I expect index to be of size of 60 gb and I dont store
 contents.

 Ideally, your total system RAM will be equal to the size of all your
 program's heap requirements, plus the size of all the data for all the
 programs.

 If Solr is the only thing on the box, then the ideal memory size is
 roughly the Solr heap plus the size of all the Solr indexes that live on
 that machine.  So if your heap is 8GB and your index is 60GB, you'll
 want at least 68GB of RAM for an ideal setup.  I don't know how big your
 heap is, so I am guessing here.

 You said your index does not store much content.  That means you will
 need a higher percentage of your total index size to be in RAM for good
 performance.  I would estimate that you want a minimum of two thirds of
 your index in RAM, which indicates a minimum RAM size of 48GB if we
 assume your heap is 8GB.  64GB would be better.

 http://wiki.apache.org/solr/SolrPerformanceProblems#General_information

 Thanks,
 Shawn
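
The sizing arithmetic in Shawn's reply can be written out as below. This is just a sketch of the numbers quoted above (8 GB heap, 60 GB index, two thirds of the index in RAM); the function names are made up for illustration, and the 8 GB heap is Shawn's guess, not a measured value.

```python
def ideal_ram_gb(heap_gb, index_gb):
    """Ideal case: the heap plus the whole index fits in RAM."""
    return heap_gb + index_gb


def minimum_ram_gb(heap_gb, index_gb, cached_fraction=2 / 3):
    """Minimum case: the heap plus a fraction of the index fits in RAM."""
    return heap_gb + index_gb * cached_fraction


print(ideal_ram_gb(8, 60))
print(round(minimum_ram_gb(8, 60)))
```

With a different heap size, only the first argument changes; the index share stays the same.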




-- 
Thanks and kind Regards,
Abhishek jain
+91 9971376767


Re: Which Tokenizer to use at searching

2014-03-10 Thread abhishek jain
Hi,
As a solution, I have tried a combination of PatternTokenizerFactory and
PatternReplaceFilterFactory.

In both the query and index analyzers I have written:

<tokenizer class="solr.PatternTokenizerFactory" pattern="\s+" />
<filter class="solr.PatternReplaceFilterFactory" pattern="([^-\w]+)"
replacement=" punct " replace="all"/>

What I am trying to do is tokenize on whitespace and then rewrite every
special character as " punct ".

So A,B becomes A punct B.

The problem is that A punct B is still a single token; it is not tokenized
further after the filter is applied.

Is there a way to tokenize again after the filter has been applied? Please
suggest; I know I am missing something basic.

thanks
abhishek
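
To make the ordering issue concrete, here is a small Python sketch (not Solr code) that mimics the two steps with regular expressions. Token filters see tokens the tokenizer has already produced, so the replacement cannot create new token boundaries. The second function shows the alternative order, replacing in the raw text before splitting. I believe Solr has a char filter for that purpose (solr.PatternReplaceCharFilterFactory, which runs before the tokenizer), but treat that as an assumption to verify against your Solr version. The function names and the slightly adjusted pattern in the second function are mine.

```python
import re


def tokenize_then_replace(text):
    """Split on whitespace first, then rewrite punctuation inside each token."""
    tokens = re.split(r"\s+", text.strip())
    return [re.sub(r"[^-\w]+", " punct ", tok) for tok in tokens]


def replace_then_tokenize(text):
    """Rewrite punctuation in the raw text first, then split on whitespace."""
    # Whitespace is excluded from the pattern so existing spaces are not rewritten.
    rewritten = re.sub(r"[^-\w\s]+", " punct ", text)
    return re.split(r"\s+", rewritten.strip())


print(tokenize_then_replace("A,B"))
print(replace_then_tokenize("A,B"))
```

The first call keeps A punct B as one token; the second yields three separate tokens.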


On Mon, Mar 10, 2014 at 2:06 AM, abhishek.netj...@gmail.com wrote:

 Hi
 Oops my bad. I actually meant
 While indexing A,B
 A and B should give result but
 A B should not give result.

 Also I will look at analyser.

 Thanks
 Abhishek

   Original Message
 From: Erick Erickson
 Sent: Monday, 10 March 2014 01:38
 To: abhishek jain
 Subject: Re: Which Tokenizer to use at searching

 Then I don't see the problem. StandardTokenizer
 (see the text_general fieldType) should do all this
 for you automatically.

 Did you look at the analysis page? I really recommend it.

 Best,
 Erick

 On Sun, Mar 9, 2014 at 3:04 PM, abhishek jain
 abhishek.netj...@gmail.com wrote:
  Hi Erick,
  Thanks for replying,
 
  I want to index A,B (with or without space with comma) as separate words
 and
  also want to return results when A and B searched individually and also
  A,B .
 
  Please let me know your views.
  Let me know if i still havent explained correctly. I will try again.
 
  Thanks
  abhishek
 
 
  On Sun, Mar 9, 2014 at 11:49 PM, Erick Erickson erickerick...@gmail.com
 
  wrote:
 
  You've contradicted yourself, so it's hard to say. Or
  I'm mis-reading your messages.
 
  bq: During indexing i want to token on all punctuations, so i can use
  StandardTokenizer, but at search time i want to consider punctuations as
  part of text,
 
  and in your second message:
 
  bq: when i search for A,B it should return result. [for input A,B]
 
  If, indeed, you ... at search time i want to consider punctuations as
  part of text then A,B should NOT match the document.
 
  The admin/analysis page is your friend, I strongly suggest you spend
  some time looking at the various transformations performed by
  the various analyzers and tokenizers.
 
  Best,
  Erick
 
  On Sun, Mar 9, 2014 at 1:54 PM, abhishek jain
  abhishek.netj...@gmail.com wrote:
   hi,
  
   Thanks for replying promptly,
   an example:
  
   I want to index for A,B
   but when i search A AND B, it should return result,
   when i search for A,B it should return result.
  
   Also Ideally when i search for A , B (with space) it should return
   result.
  
  
   please advice
   thanks
   abhishek
  
  
   On Sun, Mar 9, 2014 at 9:52 PM, Furkan KAMACI
   furkankam...@gmail.comwrote:
  
   Hi;
  
   Firstly you have to keep in mind that if you don't index punctuation
   they
   will not be visible for search. On the other hand you can have
   different
   analyzer for index and search. You have to give more detail about
 your
   situation. What will be your tokenizer at search time,
   WhiteSpaceTokenizer?
   You can have a look at here:
   http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
  
   If you can give some examples what you want for indexing and
 searching
   I
   can help you to combine index and search analyzer/tokenizer/token
   filters.
  
   Thanks;
   Furkan KAMACI
  
  
   2014-03-09 18:06 GMT+02:00 abhishek jain abhishek.netj...@gmail.com
 :
  
Hi Friends,
   
I am concerned on Tokenizer, my scenario is:
   
During indexing i want to token on all punctuations, so i can use
StandardTokenizer, but at search time i want to consider
 punctuations
as
part of text,
   
I dont store contents but only indexes.
   
What should i use.
   
Any advices ?
   
   
--
Thanks and kind Regards,
Abhishek jain
   
  
  
  
  
   --
   Thanks and kind Regards,
   Abhishek jain
   +91 9971376767
 
 
 
 
  --
  Thanks and kind Regards,
  Abhishek jain
  +91 9971376767




-- 
Thanks and kind Regards,
Abhishek jain
+91 9971376767


Which Tokenizer to use at searching

2014-03-09 Thread abhishek jain
Hi Friends,

I have a question about tokenizers. My scenario is:

During indexing I want to tokenize on all punctuation, so I can use
StandardTokenizer, but at search time I want to treat punctuation as
part of the text.

I don't store the contents, only the index.

What should I use?

Any advice?


-- 
Thanks and kind Regards,
Abhishek jain


Re: Which Tokenizer to use at searching

2014-03-09 Thread abhishek jain
Hi,

Thanks for replying promptly.
An example:

I want to index A,B
and when I search for A AND B, it should return a result;
when I search for A,B, it should also return a result.

Ideally, when I search for A , B (with a space), it should return a result too.


Please advise.
Thanks
abhishek


On Sun, Mar 9, 2014 at 9:52 PM, Furkan KAMACI furkankam...@gmail.com wrote:

 Hi;

 Firstly you have to keep in mind that if you don't index punctuation they
 will not be visible for search. On the other hand you can have different
 analyzer for index and search. You have to give more detail about your
 situation. What will be your tokenizer at search time, WhiteSpaceTokenizer?
 You can have a look at here:
 http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters

 If you can give some examples what you want for indexing and searching I
 can help you to combine index and search analyzer/tokenizer/token filters.

 Thanks;
 Furkan KAMACI


 2014-03-09 18:06 GMT+02:00 abhishek jain abhishek.netj...@gmail.com:

  Hi Friends,
 
  I am concerned on Tokenizer, my scenario is:
 
  During indexing i want to token on all punctuations, so i can use
  StandardTokenizer, but at search time i want to consider punctuations as
  part of text,
 
  I dont store contents but only indexes.
 
  What should i use.
 
  Any advices ?
 
 
  --
  Thanks and kind Regards,
  Abhishek jain
 




-- 
Thanks and kind Regards,
Abhishek jain
+91 9971376767


Optimizing RAM

2014-03-09 Thread abhishek jain
Hi friends,
I want to index a fairly large amount of data, and I want to keep both stemmed
and unstemmed versions.
I am unsure whether to keep two separate indexes or one index with two
fields, i.e. col1_stemmed and col2_unstemmed.

I have a multicore, multi-shard configuration.
My server has 32 GB of RAM, and I calculated the stemmed index size (without
content) as 60 GB.
I don't want to put too much load, including I/O load, on a decent server with
some 5 other replicated servers, and I want to use the servers for other
purposes as well.


Also, is it advisable to serve queries from the master server, or only from the slaves?
-- 
Thanks,
Abhishek


Re: Which Tokenizer to use at searching

2014-03-09 Thread abhishek jain
Hi Erick,
Thanks for replying.

I want to index A,B (comma, with or without a space) as separate words,
and I also want results returned when A and B are searched individually, as
well as when A,B is searched.

Please let me know your views.
Let me know if I still haven't explained it correctly; I will try again.

Thanks
abhishek


On Sun, Mar 9, 2014 at 11:49 PM, Erick Erickson erickerick...@gmail.com wrote:

 You've contradicted yourself, so it's hard to say. Or
 I'm  mis-reading your messages.

 bq: During indexing i want to token on all punctuations, so i can use
 StandardTokenizer, but at search time i want to consider punctuations as
 part of text,

 and in your second message:

 bq: when i search for A,B it should return result. [for input A,B]

 If, indeed, you ... at search time i want to consider punctuations as
 part of text then A,B should NOT match the document.

 The admin/analysis page is your friend, I strongly suggest you spend
 some time looking at the various transformations performed by
 the various analyzers and tokenizers.

 Best,
 Erick

 On Sun, Mar 9, 2014 at 1:54 PM, abhishek jain
 abhishek.netj...@gmail.com wrote:
  hi,
 
  Thanks for replying promptly,
  an example:
 
  I want to index for A,B
  but when i search A AND B, it should return result,
  when i search for A,B it should return result.
 
  Also Ideally when i search for A , B (with space) it should return
 result.
 
 
  please advice
  thanks
  abhishek
 
 
  On Sun, Mar 9, 2014 at 9:52 PM, Furkan KAMACI furkankam...@gmail.com
 wrote:
 
  Hi;
 
  Firstly you have to keep in mind that if you don't index punctuation
 they
  will not be visible for search. On the other hand you can have different
  analyzer for index and search. You have to give more detail about your
  situation. What will be your tokenizer at search time,
 WhiteSpaceTokenizer?
  You can have a look at here:
  http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
 
  If you can give some examples what you want for indexing and searching I
  can help you to combine index and search analyzer/tokenizer/token
 filters.
 
  Thanks;
  Furkan KAMACI
 
 
  2014-03-09 18:06 GMT+02:00 abhishek jain abhishek.netj...@gmail.com:
 
   Hi Friends,
  
   I am concerned on Tokenizer, my scenario is:
  
   During indexing i want to token on all punctuations, so i can use
   StandardTokenizer, but at search time i want to consider punctuations
 as
   part of text,
  
   I dont store contents but only indexes.
  
   What should i use.
  
   Any advices ?
  
  
   --
   Thanks and kind Regards,
   Abhishek jain
  
 
 
 
 
  --
  Thanks and kind Regards,
  Abhishek jain
  +91 9971376767




-- 
Thanks and kind Regards,
Abhishek jain
+91 9971376767


RE: Special character search in Solr and boosting without altering the resultset

2014-02-01 Thread abhishek jain
Hi,
OK thanks, I will look into it further.

Any info on boosting without altering the result set?

Thanks
Abhishek 

 -Original Message-
 
 Hi Abhishek,
 
 dot is not a special character. Your field type / analyzer is stripping
 that character. Please see similar discussions and alternative
 solutions.
 
 http://search-lucene.com/m/6dbI9zMSob1
 http://search-lucene.com/m/Ac71G0KlGz
 http://search-lucene.com/m/RRD2D1p1mi
 
 Ahmet
 
 
 
 On Friday, January 31, 2014 8:23 PM, abhishek jain
 abhishek.netj...@gmail.com wrote:
 Hi friends,
 
 I am facing a strange problem, When I search a term eg     .Net   , the
 solr searches for Net and not includes '.'
 
 Is dot a special character in Solr? I tried escaping it with backslash
 in the url call to solr, but no use same resultset,
 
 
 
 Also , is there a way to boost some terms within a resultset.
 
 I mean I want to boost a term within a result and I don't want to fire
 a separate query. I couldn't use OR operator as it will modify the
 resultset.
 I want to use a single query and boost. I don't want to use dismax
 query as well,
 
 
 
 Please advice.
 
 
 
 Thanks,
 
 Abhishek
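
As a rough illustration of Ahmet's point that the analyzer, not the query syntax, drops the dot: the Python sketch below compares a tokenizer that keeps only word characters with one that splits on whitespace only. Neither function is Solr's actual StandardTokenizer or WhitespaceTokenizer; they are simplified stand-ins, and the sample text is made up.

```python
import re


def word_tokens(text):
    """Keep only runs of word characters, discarding punctuation."""
    return re.findall(r"\w+", text)


def whitespace_tokens(text):
    """Split on whitespace only, leaving punctuation attached to tokens."""
    return text.split()


print(word_tokens(".Net framework"))
print(whitespace_tokens(".Net framework"))
```

The first call loses the leading dot; the second keeps .Net intact, which is the kind of difference the admin analysis page makes visible for a real field type.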



RE: Special character search in Solr and boosting without altering the resultset

2014-02-01 Thread abhishek jain
Hi,
Thanks for replying, but if I understand correctly,
q=term1 term2^0.6 means it will search for term1 and term2, giving somewhat less
boost to term2.

I want to search only for term1 and, if term2 exists, boost the document by a
positive factor. I have not been able to construct such a query.

Thanks
Abhishek 
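
One way this is commonly expressed in Lucene query syntax is to mark term1 as required with '+' and leave term2 as an optional, boosted clause. The sketch below only builds and URL-encodes such a query string in Python. It assumes the standard lucene query parser with the default operator set to OR (with AND as the default, term2 would become required too), and the boost value 2 is arbitrary. The function names are hypothetical.

```python
from urllib.parse import urlencode


def boosted_query(required_term, optional_term, boost):
    """Build a Lucene-syntax query: '+' marks a required clause, '^' sets a boost."""
    return f"+{required_term} {optional_term}^{boost}"


def select_query_string(required_term, optional_term, boost=2):
    """URL-encode the query for use as the q parameter of a select request."""
    return urlencode({"q": boosted_query(required_term, optional_term, boost)})


print(boosted_query("term1", "term2", 2))
print(select_query_string("term1", "term2"))
```

Under those assumptions the result set is determined by term1 alone, while documents that also contain term2 score higher.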

 -Original Message-
 From: Ahmet Arslan [mailto:iori...@yahoo.com]
 Sent: Saturday, February 1, 2014 8:51 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Special character search in Solr and boosting without
 altering the resultset
 
 Hi,
 
 Can you elaborate on your boosting requirement? There is a caret operator
 to boost query terms.
 
 for example : q=term1 term2^0.6
 
 
 
 
 On Saturday, February 1, 2014 1:51 PM, abhishek jain
 abhishek.netj...@gmail.com wrote:
 Hi,
 Ok thanks, will look more into it,
 
 Any info on boosting without altering the resultset?
 
 Thanks
 Abhishek
 
 
  -Original Message-
 
  Hi Abhishek,
 
  dot is not a special character. Your field type / analyzer is
  stripping that character. Please see similar discussions and
  alternative solutions.
 
  http://search-lucene.com/m/6dbI9zMSob1
  http://search-lucene.com/m/Ac71G0KlGz
  http://search-lucene.com/m/RRD2D1p1mi
 
  Ahmet
 
 
 
  On Friday, January 31, 2014 8:23 PM, abhishek jain
  abhishek.netj...@gmail.com wrote:
  Hi friends,
 
  I am facing a strange problem, When I search a term eg     .Net   ,
  the solr searches for Net and not includes '.'
 
  Is dot a special character in Solr? I tried escaping it with
 backslash
  in the url call to solr, but no use same resultset,
 
 
 
  Also , is there a way to boost some terms within a resultset.
 
  I mean I want to boost a term within a result and I don't want to
 fire
  a separate query. I couldn't use OR operator as it will modify the
  resultset.
  I want to use a single query and boost. I don't want to use dismax
  query as well,
 
 
 
  Please advice.
 
 
 
  Thanks,
 
  Abhishek



Remove stemming without reindexing - currently using KStem

2014-02-01 Thread abhishek jain
Hi Friends,

Is it possible to remove stemming without having to reindex all the data?
I am using KStem.

Can this be done at query time? I am not sure how.

I am not using dismax.

 

Thanks

Abhishek 

 



Special character search in Solr and boosting without altering the resultset

2014-01-31 Thread abhishek jain
Hi friends,

I am facing a strange problem. When I search for a term such as .Net, Solr
searches for Net and does not include the '.'.

Is dot a special character in Solr? I tried escaping it with a backslash in
the URL call to Solr, but it made no difference; I get the same result set.

Also, is there a way to boost some terms within a result set?

I mean I want to boost a term within a result set without firing a
separate query. I can't use the OR operator, as it would modify the result set.
I want to use a single query and boost. I don't want to use a dismax query
either.

Please advise.

 

Thanks,

Abhishek