Build Java package for the required schema and solrconfig field and configuration files.

2016-04-26 Thread Nitin Solanki
Hello Everyone,
 I have created an autosuggest feature using the Solr suggester.
I have added a field and field type in schema.xml and made some changes to
the /suggest request handler in solrconfig.xml.
Now I need to build a Java package using that configuration, which I need
to plug into my current Java project. I don't want to use curl; I need my
configuration as a jar or Java package. How can I do this? I don't have much
experience with jar packaging. Any help please...

Thanks,
Nitin
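
(The usual answer here: the schema.xml and solrconfig.xml changes stay on
the Solr server, and the Java project talks to the /suggest handler through
SolrJ instead of curl. Below is a minimal SolrJ 5.x-style sketch; the
collection name "mycollection" and the prefix value are assumptions, not
from the thread.)

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

public class SuggestClient {
    public static void main(String[] args) throws Exception {
        // Hypothetical collection name; the suggester config lives server-side.
        HttpSolrClient client =
            new HttpSolrClient("http://localhost:8983/solr/mycollection");
        SolrQuery query = new SolrQuery();
        query.setRequestHandler("/suggest");  // the handler edited in solrconfig.xml
        query.set("suggest", "true");
        query.set("suggest.q", "nit");        // user-typed prefix (example value)
        QueryResponse rsp = client.query(query);
        // Suggestions come back under the "suggest" section of the response.
        System.out.println(rsp.getResponse().get("suggest"));
        client.close();
    }
}

Packaging this class (plus the SolrJ jars) gives a Java artifact that can be
plugged into an existing project; the Solr-side configuration itself is not
packaged into a jar.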


Re: How to get around Solr's spellcheck maxEdit limit of 2?

2016-01-22 Thread Nitin Solanki
OK, but IndexBasedSpellChecker needs a directory where its index is
stored in order to do spell checking. I don't have any experience with
IndexBasedSpellChecker. If you could send me a sample configuration for it,
that would help.. Thanks

On Fri, Jan 22, 2016 at 1:45 AM Dyer, James <james.d...@ingramcontent.com>
wrote:

> But if you really need more than 2 edits, I think IndexBasedSpellChecker
> supports it.
>
> James Dyer
> Ingram Content Group
>
> -Original Message-
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent: Thursday, January 21, 2016 11:29 AM
> To: solr-user
> Subject: Re: How to get around Solr's spellcheck maxEdit limit of 2?
>
> bq: ...is anyway to increase that maxEdit
>
> IIUC, increasing maxEdit beyond 2 increases the space/time required
> unacceptably; that limit is there on purpose, put there by people who
> know their stuff.
>
> Best,
> Erick
>
> On Thu, Jan 21, 2016 at 12:39 AM, Nitin Solanki <nitinml...@gmail.com>
> wrote:
> > I am using Solr for spell correction. Solr is limited to a maxEdit of 2.
> > Is there any way to increase that maxEdit without using phonetic
> > mapping? Any suggestions please.
>
>


How to get around Solr's spellcheck maxEdit limit of 2?

2016-01-21 Thread Nitin Solanki
I am using Solr for spell correction. Solr is limited to a maxEdit of 2. Is
there any way to increase that maxEdit without using phonetic mapping?
Any suggestions please.
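
(For reference, a minimal SolrJ sketch of issuing a spellcheck request; the
handler name /spell and the collection name are assumptions. Note that
maxEdits itself is server-side spellchecker configuration, not a client
parameter.)

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.client.solrj.response.SpellCheckResponse;

public class SpellCheckDemo {
    public static void main(String[] args) throws Exception {
        HttpSolrClient client =
            new HttpSolrClient("http://localhost:8983/solr/mycollection");
        SolrQuery query = new SolrQuery("*:*");
        query.setRequestHandler("/spell");   // a spellcheck-enabled handler (assumed)
        query.set("spellcheck", "true");
        query.set("spellcheck.q", "spelll"); // the misspelled input (example)
        QueryResponse rsp = client.query(query);
        SpellCheckResponse spell = rsp.getSpellCheckResponse();
        if (spell != null) {
            // Print each misspelled token with its suggested corrections.
            spell.getSuggestions().forEach(s ->
                System.out.println(s.getToken() + " -> " + s.getAlternatives()));
        }
        client.close();
    }
}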


IOException, ConnectionTimeout Error while searching

2015-08-26 Thread Nitin Solanki
Hello,
I indexed 2 million documents, and after indexing completed I
tried searching. It throws an IOException and a connection timeout error.


 "error": {
   "msg": "org.apache.solr.client.solrj.SolrServerException:
IOException occured when talking to server at:
http://192.168.1.25:8983/solr/col_ner_shard1_replica1",
   "trace": "org.apache.solr.common.SolrException:
org.apache.solr.client.solrj.SolrServerException: IOException occured
when talking to server at:
http://192.168.1.25:8983/solr/col_ner_shard1_replica1\n\tat
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:337)\n\tat
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:144)\n\tat
org.apache.solr.core.SolrCore.execute(SolrCore.java:2006)\n\tat
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:777)\n\tat
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:413)\n\tat
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:204)\n\tat
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)\n\tat
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)\n\tat
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)\n\tat
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)\n\tat
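
(One thing worth checking on the client side: the SolrJ connection and
socket timeouts. A sketch under the assumption that the collection is named
col_ner, taken from the URL in the trace; raising timeouts only masks an
overloaded server, it does not fix it.)

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;

public class TimeoutDemo {
    public static void main(String[] args) throws Exception {
        HttpSolrClient client =
            new HttpSolrClient("http://192.168.1.25:8983/solr/col_ner");
        client.setConnectionTimeout(15000); // ms to establish the TCP connection
        client.setSoTimeout(120000);        // ms to wait for a response (socket read)
        System.out.println(
            client.query(new SolrQuery("*:*")).getResults().getNumFound());
        client.close();
    }
}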


Re: Make search faster in Solr

2015-08-11 Thread Nitin Solanki
Okay davidphilip.

On Mon, Aug 10, 2015 at 8:24 PM davidphilip cherian 
davidphilipcher...@gmail.com wrote:

 Hi Nitin,

 32 shards for 16 million documents is too many; 2 shards should suffice,
 considering your document sizes are moderate. Caches need to be monitored
 and tuned accordingly. You can read up on caches here:

 https://cwiki.apache.org/confluence/display/solr/Query+Settings+in+SolrConfig



 On Mon, Aug 10, 2015 at 4:34 PM, Nitin Solanki nitinml...@gmail.com
 wrote:

  Hi,
  I have 32 shards with a single replica each, spread across 4 nodes
  on SolrCloud.
  I have indexed 16 million documents. Without cache, the total time taken to
  search a document is 0.2 seconds; with cache it is 0.04 seconds.
  I haven't configured the caches; they are set to the defaults in
 solrconfig.xml.
 
  How can I make search faster without cache? Or even faster with
  cache while searching? Which cache is used for searching?
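
(On "which cache is used for searching": the filterCache serves repeated fq
clauses, the queryResultCache serves repeated q/sort/paging combinations,
and the documentCache serves stored fields. A minimal SolrJ sketch; the
collection and field names below are assumptions.)

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

public class FilterQueryDemo {
    public static void main(String[] args) throws Exception {
        HttpSolrClient client =
            new HttpSolrClient("http://localhost:8983/solr/mycollection");
        SolrQuery query = new SolrQuery("text:solr"); // scored part -> queryResultCache
        query.addFilterQuery("lang:en");              // fq -> reusable via filterCache
        QueryResponse rsp = client.query(query);
        System.out.println(rsp.getResults().getNumFound()
            + " hits in " + rsp.getQTime() + " ms");
        client.close();
    }
}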
 



Re: Concurrent Indexing and Searching in Solr.

2015-08-11 Thread Nitin Solanki
Hi Erick,
 Thanks a lot for your help. I will go through MongoDB.

On Mon, Aug 10, 2015 at 9:14 PM Erick Erickson erickerick...@gmail.com
wrote:

 bq: I changed <maxWarmingSearchers>2</maxWarmingSearchers>
 to <maxWarmingSearchers>100</maxWarmingSearchers> and applied simultaneous
 searching using 100 workers.

 Do not do this. This has nothing to do with the number of searcher
 threads. And with
 your update rate, especially if you continue to insist on adding
 commit=true to every
 update request, this will explode your memory requirements. To no good
 purpose
 whatsoever.

 bq: But MongoDB can handle concurrent searching and indexing faster.

 Because MongoDB is optimized for different kinds of operations. Solr
 is a ranking,
 free-text search engine. It's an apples-and-oranges comparison. If MongoDB
 meets your search needs, you should use it.

 Best,
 Erick

 On Sun, Aug 9, 2015 at 11:04 PM, Nitin Solanki nitinml...@gmail.com
 wrote:
  Hi,
   I am using Solr version 5.2.1. It is fast, I think. But again, I am
  stuck on concurrent searching and threading. I changed
  <maxWarmingSearchers>2</maxWarmingSearchers>
  to <maxWarmingSearchers>100</maxWarmingSearchers> and applied simultaneous
  searching using 100 workers. It works fast, but not up to the mark.
 
  It improves search time from 1.5 to 0.5 seconds. But if I run only a single
  worker, the search time is 0.03 seconds; that is very fast, but not
  possible with 100 workers running simultaneously.
 
  As Shawn said, "Making 100 concurrent indexing requests at the same time
  as 100 concurrent queries will overwhelm *any* single Solr server." I got
  your point.
 
  But MongoDB can handle concurrent searching and indexing faster. Then why
  not Solr? Sorry for this..
 
 
 
  On Mon, Aug 10, 2015 at 2:39 AM Shawn Heisey apa...@elyograg.org
 wrote:
 
  On 8/7/2015 1:15 PM, Nitin Solanki wrote:
   I wrote a Python script for indexing, using
   urllib and urllib2 to send the data via HTTP..
 
  There are a number of Solr python clients.  Using a client makes your
  code much easier to write and understand.
 
  https://wiki.apache.org/solr/SolPython
 
  I have no experience with any of these clients, but I can say that the
  one encountered most often when Python developers come into the #solr
  IRC channel is pysolr.  Our wiki page says the last update for pysolr
  happened in December of 2013, but I can see that the last version on
  their web page is dated 2015-05-26.
 
  Making 100 concurrent indexing requests at the same time as 100
  concurrent queries will overwhelm *any* single Solr server.  In a
  previous message you said that you have 4 CPU cores.  The load you're
  trying to put on Solr will require at *LEAST* 200 threads.  It may be
  more than that.  Any single system is going to have trouble with that.
  A system with 4 cores will be *very* overloaded.
 
  Thanks,
  Shawn
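
(A sketch of throttled, batched indexing along the lines suggested above: a
bounded number of sender threads instead of 100, and batches instead of
per-document requests. ConcurrentUpdateSolrClient queues documents and sends
them with a fixed thread pool; the collection name, field names, and sizes
here are assumptions.)

import java.util.ArrayList;
import java.util.List;
import org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class ThrottledIndexer {
    public static void main(String[] args) throws Exception {
        // Queue of 10000 docs, 4 sender threads (roughly one per CPU core).
        ConcurrentUpdateSolrClient client = new ConcurrentUpdateSolrClient(
            "http://localhost:8983/solr/mycollection", 10000, 4);
        List<SolrInputDocument> batch = new ArrayList<>();
        for (int i = 0; i < 100000; i++) {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", Integer.toString(i));
            doc.addField("text", "document " + i);
            batch.add(doc);
            if (batch.size() == 1000) { // ~1,000 docs per request
                client.add(batch);
                batch.clear();
            }
        }
        if (!batch.isEmpty()) client.add(batch);
        client.blockUntilFinished(); // drain the queue; autoCommit handles commits
        client.close();
    }
}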
 
 



Re: Make search faster in Solr

2015-08-11 Thread Nitin Solanki
Hi davidphilip,
Without caching, can we still make searching fast?

On Tue, Aug 11, 2015 at 11:43 AM Nitin Solanki nitinml...@gmail.com wrote:

 Okay davidphilip.

 On Mon, Aug 10, 2015 at 8:24 PM davidphilip cherian 
 davidphilipcher...@gmail.com wrote:

 Hi Nitin,

 32 shards for 16 million documents is too many; 2 shards should suffice,
 considering your document sizes are moderate. Caches need to be monitored
 and tuned accordingly. You can read up on caches here:

 https://cwiki.apache.org/confluence/display/solr/Query+Settings+in+SolrConfig



 On Mon, Aug 10, 2015 at 4:34 PM, Nitin Solanki nitinml...@gmail.com
 wrote:

  Hi,
  I have 32 shards with a single replica each, spread across 4
 nodes
  on SolrCloud.
  I have indexed 16 million documents. Without cache, the total time taken to
  search a document is 0.2 seconds; with cache it is 0.04 seconds.
  I haven't configured the caches; they are set to the defaults in
 solrconfig.xml.
 
  How can I make search faster without cache? Or even faster with
  cache while searching? Which cache is used for searching?
 




Re: Concurrent Indexing and Searching in Solr.

2015-08-10 Thread Nitin Solanki
Hi,
 I am using Solr version 5.2.1. It is fast, I think. But again, I am stuck
on concurrent searching and threading. I changed
<maxWarmingSearchers>2</maxWarmingSearchers>
to <maxWarmingSearchers>100</maxWarmingSearchers> and applied simultaneous
searching using 100 workers. It works fast, but not up to the mark.

It improves search time from 1.5 to 0.5 seconds. But if I run only a single
worker, the search time is 0.03 seconds; that is very fast, but not
possible with 100 workers running simultaneously.

As Shawn said, "Making 100 concurrent indexing requests at the same time
as 100 concurrent queries will overwhelm *any* single Solr server." I got
your point.

But MongoDB can handle concurrent searching and indexing faster. Then why
not Solr? Sorry for this..



On Mon, Aug 10, 2015 at 2:39 AM Shawn Heisey apa...@elyograg.org wrote:

 On 8/7/2015 1:15 PM, Nitin Solanki wrote:
  I wrote a Python script for indexing, using
  urllib and urllib2 to send the data via HTTP..

 There are a number of Solr python clients.  Using a client makes your
 code much easier to write and understand.

 https://wiki.apache.org/solr/SolPython

 I have no experience with any of these clients, but I can say that the
 one encountered most often when Python developers come into the #solr
 IRC channel is pysolr.  Our wiki page says the last update for pysolr
 happened in December of 2013, but I can see that the last version on
 their web page is dated 2015-05-26.

 Making 100 concurrent indexing requests at the same time as 100
 concurrent queries will overwhelm *any* single Solr server.  In a
 previous message you said that you have 4 CPU cores.  The load you're
 trying to put on Solr will require at *LEAST* 200 threads.  It may be
 more than that.  Any single system is going to have trouble with that.
 A system with 4 cores will be *very* overloaded.

 Thanks,
 Shawn




Make search faster in Solr

2015-08-10 Thread Nitin Solanki
Hi,
I have 32 shards with a single replica each, spread across 4 nodes
on SolrCloud.
I have indexed 16 million documents. Without cache, the total time taken to
search a document is 0.2 seconds; with cache it is 0.04 seconds.
I haven't configured the caches; they are set to the defaults in solrconfig.xml.

How can I make search faster without cache? Or even faster with
cache while searching? Which cache is used for searching?


Is cache enabled by default?

2015-08-10 Thread Nitin Solanki
Hi,
 I have commented out the queryResultCache, filterCache, and document
cache. Searching still uses a cache. Why so?

2) The first time a query is searched it takes time, and afterwards it is
fast thanks to the cache, I know that. But how can I make search always
fast, even the first time a query is searched?


Re: Concurrent Indexing and Searching in Solr.

2015-08-08 Thread Nitin Solanki
Thanks Erick for your suggestion. I will remove commit=true, use Solr
5.2, and then get back to you again for further help. Thanks.

On Sat, Aug 8, 2015 at 4:07 AM Erick Erickson erickerick...@gmail.com
wrote:

 bq: So, How much minimum concurrent threads should I run?

 I really can't answer that in the abstract, you'll simply have to
 test.

 I'd prefer SolrJ to post.jar. If you're not going to SolrJ, I'd imagine
 that
 moving from Python to post.jar isn't all that useful.

 But before you do anything, see what really happens when you remove the
 commit=true. That's likely way more important than the rest.

 Best,
 Erick

 On Fri, Aug 7, 2015 at 3:15 PM, Nitin Solanki nitinml...@gmail.com
 wrote:
  Hi Erick,
  "posting files to Solr via curl" =>
  Rather than posting files via curl, which is better, SolrJ or post.jar? I
  don't use either. I wrote a Python script for indexing, using
  urllib and urllib2 to send the data via HTTP. I don't have any option to
  use SolrJ right now. How can I do the same thing via post.jar from
  Python? Any help please.
 
  "indexing with 100 threads is going to eat up a lot of CPU cycles"
  => So how many concurrent threads, at minimum, should I run? And I also
  need concurrent searching, so how many for that?
 
  And thanks for the Solr 5.2 pointer; I will go through it. Thanks for the
  reply. Please help me..
 
  On Fri, Aug 7, 2015 at 11:51 PM Erick Erickson erickerick...@gmail.com
  wrote:
 
  bq: What limits does Solr have on simultaneous indexing and searching?
  That is, how many simultaneous calls can I make for searching and
  indexing at once?
 
  None a-priori. It all depends on the hardware you're throwing at it.
  Obviously
  indexing with 100 threads is going to eat up a lot of CPU cycles that
  can't then
  be devoted to satisfying queries. You need to strike a balance. Do
  seriously
  consider using some other method than posting files to Solr via curl
  or the like,
  that's rarely a robust solution for production.
 
  As for adding the commit=true, this shouldn't be affecting the index
 size,
  I
  suspect you were misled by something else happening.
 
  Really, remove it or you'll beat up your system hugely. As for the soft
  commit
  interval, that's totally irrelevant when you're committing every
  document. But do
  lengthen it as much as you can. Most of the time when people say real
  time,
  it turns out that 10 seconds is OK. Or 60 seconds is OK.  You have to
 check
  what the _real_ requirement is, it's often not what's stated.
 
  bq: I am using Solr 5.0 version. Is 5.0 almost similar to 5.2 regarding
  indexing and searching data.
 
  Did you read the link I provided? With replicas, 5.2 will index almost
  twice as
  fast. That means (roughly) half the work on the followers is being done,
  freeing up cycles for performing queries.
 
  Best,
  Erick
 
 
  On Fri, Aug 7, 2015 at 2:06 PM, Nitin Solanki nitinml...@gmail.com
  wrote:
   Hi Erick,
You said that the soft commit interval should be more than 3000 ms.
   Actually, I need real-time searching, and that's why I need the soft
   commit to be fast.
  
   commit=true => I set commit=true because it reduces my indexed data size
   from 1.5GB to 500MB on *each shard*. When I used commit=false, my
   indexed data size was 1.5GB. After changing it to commit=true, the size
   reduced to 500MB. I don't understand how that can be.
  
   I am using Solr 5.0. Is 5.0 roughly similar to 5.2 regarding
   indexing and searching?
  
   What limits does Solr have on simultaneous indexing and searching?
   That is, how many simultaneous calls can I make for searching and
   indexing at once?
  
  
   On Fri, Aug 7, 2015 at 9:18 PM Erick Erickson 
 erickerick...@gmail.com
   wrote:
  
   Your soft commit time of 3 seconds is quite aggressive,
   I'd lengthen it to as long as possible.
  
   Ugh, looked at your query more closely. Adding commit=true to every
  update
   request is horrible performance wise. Let your autocommit process
   handle the commits is the first thing I'd do. Second, I'd try going
 to
   SolrJ
   and batching up documents (I usually start with 1,000) or using the
   post.jar
   tool rather than sending them via a raw URL.
  
   I agree with Upayavira, 100 concurrent threads is a _lot_. Also, what
   version of Solr?
   There was a 2x speedup in Solr 5.2, see:
  
 
 http://lucidworks.com/blog/indexing-performance-solr-5-2-now-twice-fast/
  
   One symptom was that the followers were doing way more work than
 the
   leader
   (BTW, using master/slave when talking SolrCloud is a bit
 confusing...)
   which will
   affect query response rates.
  
   Basically, if query response is paramount, you really need to
 throttle
   your indexing,
   there's just a whole lot of work going on here..
  
   Best,
   Erick
  
   On Fri, Aug 7, 2015 at 11:23 AM, Upayavira u...@odoko.co.uk wrote:
How many CPUs do you have? 100 concurrent

Concurrent Indexing and Searching in Solr.

2015-08-07 Thread Nitin Solanki
Hello Everyone,
  I have indexed 16 million documents in Solr
Cloud, with 4 nodes and 8 shards with a single replica each.
I am trying to do concurrent indexing and searching on those indexed
documents: 100 concurrent indexing calls along with 100
concurrent searching calls.
It *degrades both searching and indexing* performance.

Configuration :

  "commitWithin": {"softCommit": true},
  "autoCommit": {
    "maxDocs": -1,
    "maxTime": 60000,
    "openSearcher": false},
  "autoSoftCommit": {
    "maxDocs": -1,
    "maxTime": 3000}},

  "indexConfig": {
    "maxBufferedDocs": -1,
    "maxMergeDocs": -1,
    "maxIndexingThreads": 8,
    "mergeFactor": -1,
    "ramBufferSizeMB": 100.0,
    "writeLockTimeout": -1,
    "lockType": "native"}}}

AND  <maxWarmingSearchers>2</maxWarmingSearchers>

I don't know how master and slave work. Normally, I created 8
shards and indexed documents using:




curl 'http://localhost:8983/solr/test_commit_fast/update/json?commit=true' -H
'Content-type:application/json' -d '[ JSON_Document ]'

And searching using:
http://localhost:8983/solr/test_commit_fast/select?q=field_name:search_string

Any help on making searching and indexing fast concurrently would be
appreciated. Thanks.


Regards,
Nitin
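
(A minimal SolrJ sketch of the same update without commit=true, letting the
configured autoCommit/autoSoftCommit do the committing; the id and field
values are placeholders taken from the example URL above.)

import java.util.Arrays;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class IndexWithoutExplicitCommit {
    public static void main(String[] args) throws Exception {
        HttpSolrClient client =
            new HttpSolrClient("http://localhost:8983/solr/test_commit_fast");
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "1");
        doc.addField("field_name", "search_string");
        client.add(Arrays.asList(doc)); // note: no commit parameter
        // Deliberately no client.commit(): visibility comes from autoSoftCommit
        // (3000 ms) and durability from autoCommit (60000 ms) in solrconfig.xml.
        client.close();
    }
}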


Re: Concurrent Indexing and Searching in Solr.

2015-08-07 Thread Nitin Solanki
Hi Erick,
"posting files to Solr via curl" =>
Rather than posting files via curl, which is better, SolrJ or post.jar? I
don't use either. I wrote a Python script for indexing, using
urllib and urllib2 to send the data via HTTP. I don't have any option to
use SolrJ right now. How can I do the same thing via post.jar from Python?
Any help please.

"indexing with 100 threads is going to eat up a lot of CPU cycles"
=> So how many concurrent threads, at minimum, should I run? And I also
need concurrent searching, so how many for that?

And thanks for the Solr 5.2 pointer; I will go through it. Thanks for the
reply. Please help me..

On Fri, Aug 7, 2015 at 11:51 PM Erick Erickson erickerick...@gmail.com
wrote:

 bq: What limits does Solr have on simultaneous indexing and searching?
 That is, how many simultaneous calls can I make for searching and
 indexing at once?

 None a-priori. It all depends on the hardware you're throwing at it.
 Obviously
 indexing with 100 threads is going to eat up a lot of CPU cycles that
 can't then
 be devoted to satisfying queries. You need to strike a balance. Do
 seriously
 consider using some other method than posting files to Solr via curl
 or the like,
 that's rarely a robust solution for production.

 As for adding the commit=true, this shouldn't be affecting the index size,
 I
 suspect you were misled by something else happening.

 Really, remove it or you'll beat up your system hugely. As for the soft
 commit
 interval, that's totally irrelevant when you're committing every
 document. But do
 lengthen it as much as you can. Most of the time when people say real
 time,
 it turns out that 10 seconds is OK. Or 60 seconds is OK.  You have to check
 what the _real_ requirement is, it's often not what's stated.

 bq: I am using Solr 5.0 version. Is 5.0 almost similar to 5.2 regarding
 indexing and searching data.

 Did you read the link I provided? With replicas, 5.2 will index almost
 twice as
 fast. That means (roughly) half the work on the followers is being done,
 freeing up cycles for performing queries.

 Best,
 Erick


 On Fri, Aug 7, 2015 at 2:06 PM, Nitin Solanki nitinml...@gmail.com
 wrote:
  Hi Erick,
You said that the soft commit interval should be more than 3000 ms.
  Actually, I need real-time searching, and that's why I need the soft
  commit to be fast.
 
  commit=true => I set commit=true because it reduces my indexed data size
  from 1.5GB to 500MB on *each shard*. When I used commit=false, my
  indexed data size was 1.5GB. After changing it to commit=true, the size
  reduced to 500MB. I don't understand how that can be.
 
  I am using Solr 5.0. Is 5.0 roughly similar to 5.2 regarding
  indexing and searching?
 
  What limits does Solr have on simultaneous indexing and searching?
  That is, how many simultaneous calls can I make for searching and
  indexing at once?
 
 
  On Fri, Aug 7, 2015 at 9:18 PM Erick Erickson erickerick...@gmail.com
  wrote:
 
  Your soft commit time of 3 seconds is quite aggressive,
  I'd lengthen it to as long as possible.
 
  Ugh, looked at your query more closely. Adding commit=true to every
 update
  request is horrible performance wise. Let your autocommit process
  handle the commits is the first thing I'd do. Second, I'd try going to
  SolrJ
  and batching up documents (I usually start with 1,000) or using the
  post.jar
  tool rather than sending them via a raw URL.
 
  I agree with Upayavira, 100 concurrent threads is a _lot_. Also, what
  version of Solr?
  There was a 2x speedup in Solr 5.2, see:
 
 http://lucidworks.com/blog/indexing-performance-solr-5-2-now-twice-fast/
 
  One symptom was that the followers were doing way more work than the
  leader
  (BTW, using master/slave when talking SolrCloud is a bit confusing...)
  which will
  affect query response rates.
 
  Basically, if query response is paramount, you really need to throttle
  your indexing,
  there's just a whole lot of work going on here..
 
  Best,
  Erick
 
  On Fri, Aug 7, 2015 at 11:23 AM, Upayavira u...@odoko.co.uk wrote:
   How many CPUs do you have? 100 concurrent indexing calls seems like
   rather a lot. You're gonna end up doing a lot of context switching,
   hence degraded performance. Dunno what others would say, but I'd aim
 for
   approx one indexing thread per CPU.
  
   Upayavira
  
   On Fri, Aug 7, 2015, at 02:58 PM, Nitin Solanki wrote:
   Hello Everyone,
  I have indexed 16 million documents in Solr
   Cloud, with 4 nodes and 8 shards with a single replica each.
   I am trying to do concurrent indexing and searching on those indexed
   documents: 100 concurrent indexing calls along with 100
   concurrent searching calls.
   It *degrades both searching and indexing* performance.
  
   Configuration :
  
  "commitWithin": {"softCommit": true},
  "autoCommit": {
    "maxDocs": -1,
    "maxTime": 60000,
    "openSearcher": false

Re: Concurrent Indexing and Searching in Solr.

2015-08-07 Thread Nitin Solanki
Hi Erick,
  You said that the soft commit interval should be more than 3000 ms.
Actually, I need real-time searching, and that's why I need the soft commit
to be fast.

commit=true => I set commit=true because it reduces my indexed data size
from 1.5GB to 500MB on *each shard*. When I used commit=false, my
indexed data size was 1.5GB. After changing it to commit=true, the size
reduced to 500MB. I don't understand how that can be.

I am using Solr 5.0. Is 5.0 roughly similar to 5.2 regarding
indexing and searching?

What limits does Solr have on simultaneous indexing and searching? That is,
how many simultaneous calls can I make for searching and indexing at once?


On Fri, Aug 7, 2015 at 9:18 PM Erick Erickson erickerick...@gmail.com
wrote:

 Your soft commit time of 3 seconds is quite aggressive,
 I'd lengthen it to as long as possible.

 Ugh, looked at your query more closely. Adding commit=true to every update
 request is horrible performance wise. Let your autocommit process
 handle the commits is the first thing I'd do. Second, I'd try going to
 SolrJ
 and batching up documents (I usually start with 1,000) or using the
 post.jar
 tool rather than sending them via a raw URL.

 I agree with Upayavira, 100 concurrent threads is a _lot_. Also, what
 version of Solr?
 There was a 2x speedup in Solr 5.2, see:
 http://lucidworks.com/blog/indexing-performance-solr-5-2-now-twice-fast/

 One symptom was that the followers were doing way more work than the
 leader
 (BTW, using master/slave when talking SolrCloud is a bit confusing...)
 which will
 affect query response rates.

 Basically, if query response is paramount, you really need to throttle
 your indexing,
 there's just a whole lot of work going on here..

 Best,
 Erick

 On Fri, Aug 7, 2015 at 11:23 AM, Upayavira u...@odoko.co.uk wrote:
  How many CPUs do you have? 100 concurrent indexing calls seems like
  rather a lot. You're gonna end up doing a lot of context switching,
  hence degraded performance. Dunno what others would say, but I'd aim for
  approx one indexing thread per CPU.
 
  Upayavira
 
  On Fri, Aug 7, 2015, at 02:58 PM, Nitin Solanki wrote:
  Hello Everyone,
    I have indexed 16 million documents in Solr
  Cloud, with 4 nodes and 8 shards with a single replica each.
  I am trying to do concurrent indexing and searching on those indexed
  documents: 100 concurrent indexing calls along with 100
  concurrent searching calls.
  It *degrades both searching and indexing* performance.
 
  Configuration :
 
"commitWithin": {"softCommit": true},
"autoCommit": {
  "maxDocs": -1,
  "maxTime": 60000,
  "openSearcher": false},
"autoSoftCommit": {
  "maxDocs": -1,
  "maxTime": 3000}},

"indexConfig": {
  "maxBufferedDocs": -1,
  "maxMergeDocs": -1,
  "maxIndexingThreads": 8,
  "mergeFactor": -1,
  "ramBufferSizeMB": 100.0,
  "writeLockTimeout": -1,
  "lockType": "native"}}}

  AND  <maxWarmingSearchers>2</maxWarmingSearchers>
 
  I don't know how master and slave work. Normally, I created 8
  shards and indexed documents using:
 
 
 
 
  curl 'http://localhost:8983/solr/test_commit_fast/update/json?commit=true' -H
  'Content-type:application/json' -d '[ JSON_Document ]'

  And searching using:
  http://localhost:8983/solr/test_commit_fast/select?q=field_name:search_string
 
  Any help on making searching and indexing fast concurrently would be
  appreciated. Thanks.
 
 
  Regards,
  Nitin



Re: Concurrent Indexing and Searching in Solr.

2015-08-07 Thread Nitin Solanki
Hi Upayavira,

RAM = 28GB
CPU = 4 cores.


On Fri, Aug 7, 2015 at 8:53 PM Upayavira u...@odoko.co.uk wrote:

 How many CPUs do you have? 100 concurrent indexing calls seems like
 rather a lot. You're gonna end up doing a lot of context switching,
 hence degraded performance. Dunno what others would say, but I'd aim for
 approx one indexing thread per CPU.

 Upayavira

 On Fri, Aug 7, 2015, at 02:58 PM, Nitin Solanki wrote:
  Hello Everyone,
    I have indexed 16 million documents in Solr
  Cloud, with 4 nodes and 8 shards with a single replica each.
  I am trying to do concurrent indexing and searching on those indexed
  documents: 100 concurrent indexing calls along with 100
  concurrent searching calls.
  It *degrades both searching and indexing* performance.
 
  Configuration :
 
"commitWithin": {"softCommit": true},
"autoCommit": {
  "maxDocs": -1,
  "maxTime": 60000,
  "openSearcher": false},
"autoSoftCommit": {
  "maxDocs": -1,
  "maxTime": 3000}},

"indexConfig": {
  "maxBufferedDocs": -1,
  "maxMergeDocs": -1,
  "maxIndexingThreads": 8,
  "mergeFactor": -1,
  "ramBufferSizeMB": 100.0,
  "writeLockTimeout": -1,
  "lockType": "native"}}}

  AND  <maxWarmingSearchers>2</maxWarmingSearchers>
 
  I don't know how master and slave work. Normally, I created 8
  shards and indexed documents using:
 
 
 
 
  curl 'http://localhost:8983/solr/test_commit_fast/update/json?commit=true' -H
  'Content-type:application/json' -d '[ JSON_Document ]'

  And searching using:
  http://localhost:8983/solr/test_commit_fast/select?q=field_name:search_string
 
  Any help on making searching and indexing fast concurrently would be
  appreciated. Thanks.
 
 
  Regards,
  Nitin



Hard Commit not working

2015-07-30 Thread Nitin Solanki
Hi,
   I am trying to index documents using SolrCloud. After setting
maxTime to 60000 ms for the hard commit, documents are visible instantly
while adding them, not committed after 60000 ms.
I have added the Solr log below. Please check it; I am not sure exactly what
is happening.

*CURL to commit documents:*

curl http://localhost:8983/solr/test/update/json -H
'Content-type:application/json' -d 'json-here'

*Solrconfig.xml:*
<autoCommit>
   <maxDocs>1</maxDocs>
   <maxTime>60000</maxTime>
   <openSearcher>false</openSearcher>
 </autoCommit>
<!--autoSoftCommit -->
 <!--  <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime> -->
 <!--/autoSoftCommit-->


*Solr Log: *
INFO  - 2015-07-30 14:14:12.636; [test shard6 core_node2
test_shard6_replica1] org.apache.solr.update.processor.LogUpdateProcessor;
[test_shard6_replica1] webapp=/solr path=/update
params={update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=
http://100.77.202.145:8983/solr/test_shard2_replica1/&commit_end_point=true&wt=javabin&version=2&expungeDeletes=false}
{commit=} 0 26


Re: Hard Commit not working

2015-07-30 Thread Nitin Solanki
Hi Edward,
  I am only sending 1 document for indexing, so why is it
committing instantly? I set maxTime to 60000.

On Thu, Jul 30, 2015 at 8:26 PM Edward Ribeiro edward.ribe...@gmail.com
wrote:

 Your maxDocs is set to 1. This is the number of pending docs before
 autocommit is triggered too. You should set it to a higher value like
 10000, for example.

 Edward
 Em 30/07/2015 11:43, Nitin Solanki nitinml...@gmail.com escreveu:

  Hi,
 I am trying to index documents using SolrCloud. After setting
  maxTime to 60000 ms for the hard commit, documents are visible instantly
  while adding them, not committed after 60000 ms.
  I have added the Solr log below. Please check it; I am not sure exactly
  what is happening.
 
  *CURL to commit documents:*
 
  curl http://localhost:8983/solr/test/update/json -H
  'Content-type:application/json' -d 'json-here'
 
  *Solrconfig.xml:*
  <autoCommit>
     <maxDocs>1</maxDocs>
     <maxTime>60000</maxTime>
     <openSearcher>false</openSearcher>
   </autoCommit>
  <!--autoSoftCommit -->
   <!--  <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime> -->
   <!--/autoSoftCommit-->
 
 
  *Solr Log: *
  INFO  - 2015-07-30 14:14:12.636; [test shard6 core_node2
  test_shard6_replica1]
 org.apache.solr.update.processor.LogUpdateProcessor;
  [test_shard6_replica1] webapp=/solr path=/update
 
 
 params={update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=
 http://100.77.202.145:8983/solr/test_shard2_replica1/&commit_end_point=true&wt=javabin&version=2&expungeDeletes=false
  }
  {commit=} 0 26
 



Solr port went down on remote server

2015-05-06 Thread Nitin Solanki
Hi,
   I have installed Solr on a remote server and started it on port 8983.
I have bound my local machine's port 8983 to port 8983 of Solr on the remote
server using *ssh* (Ubuntu OS). When I request suggestions from Solr on the
remote server through calls from the local machine, sometimes it
responds and sometimes it doesn't.

I am not able to figure out why this is so.
Is it a remote-server binding issue, or did Solr go down?
I am not seeing the problem.

To investigate, I ran a crontab job using the telnet command to check that
Solr's port (8983) is up. It works fine without throwing any connection
refused error, so I am still not able to pinpoint the problem. Any help
please..


Re: Data indexing is going too slow on a single shard. Why?

2015-03-27 Thread Nitin Solanki
Okay. Thanks Shawn..

On Thu, Mar 26, 2015 at 12:25 PM, Shawn Heisey apa...@elyograg.org wrote:

 On 3/26/2015 12:03 AM, Nitin Solanki wrote:
  Great thanks Shawn...
  As you said -  **For 204GB of data per server, I recommend at least 128GB
  of total RAM,
  preferably 256GB**. Therefore, if I have 204GB of data on a single
  server/shard, then I should prefer 256GB, so that searching will be fast
  and never slow down. Is that right?

 Obviously I cannot guarantee it, but I think it's extremely likely that
 with that much memory, performance will be very good.

 One other possibility, which is discussed on that wiki page I linked, is
 that your java heap is being almost exhausted and large amounts of time
 are spent in garbage collection.  If you increase the heap from 4GB to
 5GB and see performance get better, then that would be confirmed.  There
 would be less memory available for caching, but constant garbage
 collection would be a much greater problem than the disk cache being too
 small.

 Thanks,
 Shawn




Re: Data indexing is going too slow on a single shard. Why?

2015-03-26 Thread Nitin Solanki
Great thanks Shawn...
As you said -  **For 204GB of data per server, I recommend at least 128GB
of total RAM,
preferably 256GB**. Therefore, if I have 204GB of data on a single
server/shard, then I should prefer 256GB, so that searching will be fast and
never slow down. Is that right?

On Wed, Mar 25, 2015 at 9:50 PM, Shawn Heisey apa...@elyograg.org wrote:

 On 3/25/2015 8:42 AM, Nitin Solanki wrote:
  Server configuration:
  8 CPUs.
  32 GB RAM
  O.S. - Linux

 snip

  are running.  Java heap set to 4096 MB in Solr.  While indexing,

 snip

  *Currently*, I have 1 shard  with 2 replicas using SOLR CLOUD.
  Data Size:
  102Gsolr/node1/solr/wikingram_shard1_replica2
  102Gsolr/node2/solr/wikingram_shard1_replica1

 If both of those are on the same machine, I'm guessing that you're
 running two Solr instances on that machine, so there's 8GB of RAM used
 for Java.  That means you have about 24 GB of RAM left for caching ...
 and 200GB of index data to cache.

 24GB is not enough to cache 200GB of index.  If there is only one Solr
 instance (leaving 28GB for caching) with 102GB of data on the machine,
 it still might not be enough.  See that SolrPerformanceProblems wiki
 page I linked in my earlier email.

 For 102GB of data per server, I recommend at least 64GB of total RAM,
 preferably 128GB.

 For 204GB of data per server, I recommend at least 128GB of total RAM,
 preferably 256GB.

 Thanks,
 Shawn




Re: Data indexing is going too slow on a single shard. Why?

2015-03-25 Thread Nitin Solanki
Hello,
* Updating my question again.*
Can anyone assist me, please? I am indexing on a single shard and it
is taking too much time to index the data. I am indexing around 49GB of
data on the single shard. What's wrong? Why is Solr taking so much time to
index the data?
Earlier I indexed the same data on 8 shards. At that time, it was fast
compared to the single shard. Why so? Any help please..


*HardCommit - 15 sec*
*SoftCommit - 10 min.*

ii) Searching a query/term is also taking too much time. Any help on this
also.



On Wed, Mar 25, 2015 at 4:33 PM, Nitin Solanki nitinml...@gmail.com wrote:

 Hello,
 Can anyone assist me, please? I am indexing on a single shard and it
 is taking too much time to index the data. I am indexing around 49GB of
 data on the single shard. What's wrong? Why is Solr taking so much time to
 index the data?
 Earlier I indexed the same data on 8 shards. At that time, it was fast
 compared to the single shard. Why so? Any help please..


 *HardCommit - 15 sec*
 *SoftCommit - 10 min.*



 Best,
 Nitin



Re: Data indexing is going too slow on a single shard. Why?

2015-03-25 Thread Nitin Solanki
Hi Shawn,
  Sorry for all the things.

Server configuration:
8 CPUs.
32 GB RAM
O.S. - Linux
*Earlier*, I was using 8 shards without replicas (default is 1) on
SolrCloud. Only Solr is running on the server; there are no other
applications running. The Java heap is set to 4096 MB in Solr. While
indexing, Solr (sometimes) eats up the whole RAM. I don't know how much RAM
each Solr server takes. Each server holds around 50 GB of indexed data.
Actually, I had deleted the previous Solr setup, so I have no idea how many
documents were on each shard, nor the total number of documents.

*Currently*, I have 1 shard with 2 replicas on SolrCloud.
Data size:
102G  solr/node1/solr/wikingram_shard1_replica2
102G  solr/node2/solr/wikingram_shard1_replica1

I am running a Python script to index data using the Solr REST API,
committing 20000 documents each time.
If I missed anything related to Solr, please let me know.
Thanks Shawn. Waiting for your reply.




On Wed, Mar 25, 2015 at 7:33 PM, Shawn Heisey apa...@elyograg.org wrote:

 On 3/25/2015 5:03 AM, Nitin Solanki wrote:
  Can anyone assist me, please? I am indexing on a single shard and it
  is taking too much time to index the data. I am indexing around 49GB of
  data on the single shard. What's wrong? Why is Solr taking so much time
  to index the data?
  Earlier I indexed the same data on 8 shards. At that time, it was fast
  compared to the single shard. Why so? Any help please..

 There's practically no information to go on here, so about all I can
 offer is general information in return:

 http://wiki.apache.org/solr/SolrPerformanceProblems

 I looked over the previous messages that you have sent the list, and I
 can find very little of the required information about your index.  I
 see a lot of questions from you, but they did not include the kind of
 details needed here:

 How much total RAM is in each Solr server?  Are there any other programs
 on the server with significant RAM requirements?  An example of such a
 program would be a database server.  On each server, how much memory is
 dedicated to the java heap(s) for Solr?  I gather from other questions
 that you are running SolrCloud, can you confirm?

 On a per-server basis, how much disk space do all the index replicas
 take?  How many documents are on each server?  Note that for disk space
 and number of documents, I am asking you to count every replica, not
 take the total in the collection and divide it by the number of servers.

 How are you doing your indexing?  For this question, I am asking what
 program or Solr API is actually sending the data to Solr.  Possible
 answers include the dataimport handler, a SolrJ program, one of the
 other Solr APIs such as a PHP client, and hand-crafted URLs with an HTTP
 client.

 Thanks,
 Shawn




Data indexing is going too slow on a single shard. Why?

2015-03-25 Thread Nitin Solanki
Hello,
Can anyone assist me, please? I am indexing on a single shard and it
is taking too much time to index the data. I am indexing around 49GB of
data on the single shard. What's wrong? Why is Solr taking so much time to
index the data?
Earlier I indexed the same data on 8 shards. At that time, it was fast
compared to the single shard. Why so? Any help please..


*HardCommit - 15 sec*
*SoftCommit - 10 min.*



Best,
Nitin


Read or Capture Solr Logs

2015-03-24 Thread Nitin Solanki
Hello,
I want to read or capture all the queries which are searched by
users. Any help on this?


Set search query logs into Solr

2015-03-24 Thread Nitin Solanki
Hello,
 I want to log searched queries in the Solr log to track user
input. I googled a lot but didn't find anything. Please help.
Your help will be appreciated...


Re: Read or Capture Solr Logs

2015-03-24 Thread Nitin Solanki
Hi Markus,
  Can you please help me with how to do that,
using either approach: process the logs, or make a simple SearchComponent
implementation that reads SolrQueryRequest?

On Tue, Mar 24, 2015 at 4:25 PM, Nitin Solanki nitinml...@gmail.com wrote:

 Hi Markus,
   Can you please help me with how to do that,
 using either approach: process the logs, or make a simple SearchComponent
 implementation that reads
 SolrQueryRequest?

 On Tue, Mar 24, 2015 at 4:17 PM, Markus Jelsma markus.jel...@openindex.io
  wrote:

 Hello, you can either process the logs, or make a simple SearchComponent
 implementation that reads SolrQueryRequest.

 Markus



 -Original message-
  From:Nitin Solanki nitinml...@gmail.com
  Sent: Tuesday 24th March 2015 11:38
  To: solr-user@lucene.apache.org
  Subject: Read or Capture Solr Logs
 
  Hello,
  I want to read or capture all the queries which are
 searched by
  users. Any help on this?
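
(A minimal sketch of the SearchComponent approach Markus mentions: a custom
component that reads the incoming SolrQueryRequest and logs the q parameter.
The class and logger names are assumptions; register it as a searchComponent
in solrconfig.xml and add it to a request handler's components list.)

import java.io.IOException;
import org.apache.solr.common.params.CommonParams;
import org.apache.solr.handler.component.ResponseBuilder;
import org.apache.solr.handler.component.SearchComponent;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class QueryCaptureComponent extends SearchComponent {
    private static final Logger log =
        LoggerFactory.getLogger(QueryCaptureComponent.class);

    @Override
    public void prepare(ResponseBuilder rb) throws IOException {
        // Capture the user query from the SolrQueryRequest before the search runs.
        log.info("user query: {}", rb.req.getParams().get(CommonParams.Q));
    }

    @Override
    public void process(ResponseBuilder rb) throws IOException {
        // Nothing to add to the response; this component only observes.
    }

    @Override
    public String getDescription() {
        return "Logs incoming user queries";
    }

    // Declared abstract in some 4.x/5.x versions, so provide it just in case.
    public String getSource() {
        return null;
    }
}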
 





Re: How to deal with different configurations on different collection?

2015-03-23 Thread Nitin Solanki
Thanks Shawn. It is working now, as you said.. No need to switch to an
external ZooKeeper; it also works with the embedded ZooKeeper.

On Mon, Mar 23, 2015 at 5:42 PM, Shawn Heisey apa...@elyograg.org wrote:

 On 3/23/2015 4:51 AM, Nitin Solanki wrote:
 Few days before, I have created a collection (wikingram) in
 solr
  4.10.4(Solr cloud) by applying default configuration from collection1.
 
  *sudo /mnt/nitin/Solr/solr_lm/example/scripts/cloud-scripts/zkcli.sh
  -zkhost localhost:9983 -cmd upconfig -confdir
  /mnt/nitin/Solr/solr_lm/example/solr/collection1/conf -confname default*
 
  Now, I want to create another collection (wikingram2) which needs
 different
  configuration. How can I do that?
  How to deal with different configuration on different collections.
 
  *Scenario: *
  {
  wikingram : myconf1,
  wikingram2 : myconf2
  }
 
  How do I set up the configurations just like the above?

 The upconfig command that you executed has uploaded a config named
 default (because of the -confname default parameters).

 To do what you want, simply repeat the upconfig command with another
 configuration directory and -confname myconf2, then use that
 configName when you call the Collections API to create the second
 collection.

 I notice you're using the embedded zookeeper.  You're going to want to
 switch to an external zookeeper ensemble with at least three hosts
 before you go into production.

 Thanks,
 Shawn




How to deal with different configurations on different collection?

2015-03-23 Thread Nitin Solanki
Hello,
   A few days ago, I created a collection (wikingram) in Solr
4.10.4 (SolrCloud) by applying the default configuration from collection1.

sudo /mnt/nitin/Solr/solr_lm/example/scripts/cloud-scripts/zkcli.sh
-zkhost localhost:9983 -cmd upconfig -confdir
/mnt/nitin/Solr/solr_lm/example/solr/collection1/conf -confname default

Now I want to create another collection (wikingram2) which needs a different
configuration. How can I do that?
How do I deal with different configurations for different collections?

*Scenario: *
{
"wikingram": "myconf1",
"wikingram2": "myconf2"
}

How do I set up the configurations just like the above?

I don't have much idea about that..
Any help please?
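
(A minimal SolrJ 5.x-style sketch of Shawn's suggestion: upload a second
config under its own name with zkcli, then create the collection pointing
at that config name. The shard count is an assumption.)

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.request.CollectionAdminRequest;

public class CreateCollectionWithConfig {
    public static void main(String[] args) throws Exception {
        // zkHost as used in the thread (embedded ZooKeeper on 9983).
        CloudSolrClient client = new CloudSolrClient("localhost:9983");
        CollectionAdminRequest.Create create = new CollectionAdminRequest.Create();
        create.setCollectionName("wikingram2");
        create.setConfigName("myconf2"); // uploaded earlier with -confname myconf2
        create.setNumShards(1);
        create.process(client);
        client.close();
    }
}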


Re: Whole RAM consumed while Indexing.

2015-03-20 Thread Nitin Solanki
Hi Erick,
   I read about the mergeFactor merge policy for indexing. By default,
mergeFactor is 10. As the documentation says,

High value merge factor (e.g., 25):

   - Pro: Generally improves indexing speed
   - Con: Less frequent merges, resulting in a collection with more index
   files which may slow searching

Low value merge factor (e.g., 2):

   - Pro: Smaller number of index files, which speeds up searching.
   - Con: More segment merges slow down indexing.

So, my main purpose is **searching**. Searching must be fast. Therefore, if
I set **mergeFactor = 2**, then indexing will be slow but searching may be
fast, right?

Once again: I am indexing (total data size 28GB) 20000
documents at a time, with commits after 15 seconds (hard commit) and
10 minutes (soft commit).

Will searching be fast if I set **mergeFactor = 2**, and what should the
values be for ramBufferSizeMB, maxBufferedDocs, and maxIndexingThreads?

Right now, all values are set to the defaults..

On Fri, Mar 20, 2015 at 11:42 AM, Nitin Solanki nitinml...@gmail.com
wrote:



 On Fri, Mar 20, 2015 at 1:35 AM, Erick Erickson erickerick...@gmail.com
 wrote:

 That or even hard commit to 60 seconds. It's strictly a matter of how
 often
 you want to close old segments and open new ones.

 On Thu, Mar 19, 2015 at 3:12 AM, Nitin Solanki nitinml...@gmail.com
 wrote:
  Hi Erick..
I read your Article. Really nice...
  In it, you said to set soft commit = 10 mins
 and
  hard commit = 15 sec for bulk indexing. Is that also okay for my scenario?
 
  On Thu, Mar 19, 2015 at 1:53 AM, Erick Erickson 
 erickerick...@gmail.com
  wrote:
 
  bq: As you said, do commits after 60000 seconds
 
  No, No, No. I'm NOT saying 60000 seconds! That time is in
 _milliseconds_
  as Shawn said. So setting it to 60000 is every minute.
 
  From solrconfig.xml, conveniently located immediately above the
  autoCommit tag:
 
  maxTime - Maximum amount of time in ms that is allowed to pass since a
  document was added before automatically triggering a new commit.
 
  Also, a lot of answers to soft and hard commits is here as I pointed
  out before, did you read it?
 
 
 
 https://lucidworks.com/blog/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
 
  Best
  Erick
 
  On Wed, Mar 18, 2015 at 9:44 AM, Alexandre Rafalovitch
  arafa...@gmail.com wrote:
   Probably merged somewhat differently with some terms indexes
 repeating
   between segments. Check the number of segments in data directory.And
   do search for *:* and make sure both do have the same document
 counts.
  
   Also, In all these discussions, you still haven't answered about how
   fast after indexing you want to _search_? Because, if you are not
   actually searching while committing, you could even index on a
   completely separate server (e.g. a faster one) and swap (or alias)
   index in afterwards. Unless, of course, I missed it, it's a lot of
   emails in a very short window of time.
  
   Regards,
  Alex.
  
   
   Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
   http://www.solr-start.com/
  
  
   On 18 March 2015 at 12:09, Nitin Solanki nitinml...@gmail.com
 wrote:
   When I kept my configuration to 300 for soft commit and 3000 for
 hard
   commit and indexed some amount of data, I got the data size of the
 whole
   index to be 6GB after completing the indexing.
  
   When I changed the configuration to 60000 for soft commit and 60000
 for
   hard commit and indexed same data then I got the data size of the
 whole
   index to be 5GB after completing the indexing.
  
   But the number of documents in the both scenario were same. I am
  wondering
   how that can be possible?
  
   On Wed, Mar 18, 2015 at 9:14 PM, Nitin Solanki 
 nitinml...@gmail.com
  wrote:
  
   Hi Erick,
I am just saying. I want to be sure on commits
  difference..
   What if I do frequent commits or not? And why I am saying that I
 need
  to
   commit things so very quickly because I have to index 28GB of data
  which
   takes 7-8 hours(frequent commits).
   As you said, do commits after 60000 seconds then it will be more
  expensive.
   If I don't encounter with **overlapping searchers warning
 messages**
   then I feel it seems to be okay. Is it?
  
  
  
  
   On Wed, Mar 18, 2015 at 8:54 PM, Erick Erickson 
  erickerick...@gmail.com
   wrote:
  
   Don't do it. Really, why do you want to do this? This seems like
   an XY problem, you haven't explained why you need to commit
   things so very quickly.
  
   I suspect you haven't tried _searching_ while committing at such
   a rate, and you might as well turn all your top-level caches off
   in solrconfig.xml since they won't be useful at all.
  
   Best,
   Erick
  
   On Wed, Mar 18, 2015 at 6:24 AM, Nitin Solanki 
 nitinml...@gmail.com
   wrote:
Hi,
   If I do very very fast indexing(softcommit = 300 and
  hardcommit =
3000) v/s slow indexing (softcommit = 60000 and hardcommit

Re: Whole RAM consumed while Indexing.

2015-03-20 Thread Nitin Solanki
On Fri, Mar 20, 2015 at 1:35 AM, Erick Erickson erickerick...@gmail.com
wrote:

 That or even hard commit to 60 seconds. It's strictly a matter of how often
 you want to close old segments and open new ones.

 On Thu, Mar 19, 2015 at 3:12 AM, Nitin Solanki nitinml...@gmail.com
 wrote:
  Hi Erick..
I read your Article. Really nice...
  In it, you said to set soft commit = 10 mins
 and
  hard commit = 15 sec for bulk indexing. Is that also okay for my scenario?
 
  On Thu, Mar 19, 2015 at 1:53 AM, Erick Erickson erickerick...@gmail.com
 
  wrote:
 
  bq: As you said, do commits after 60000 seconds
 
  No, No, No. I'm NOT saying 60000 seconds! That time is in _milliseconds_
  as Shawn said. So setting it to 60000 is every minute.
 
  From solrconfig.xml, conveniently located immediately above the
  autoCommit tag:
 
  maxTime - Maximum amount of time in ms that is allowed to pass since a
  document was added before automatically triggering a new commit.
 
  Also, a lot of answers to soft and hard commits is here as I pointed
  out before, did you read it?
 
 
 
 https://lucidworks.com/blog/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
 
  Best
  Erick
 
  On Wed, Mar 18, 2015 at 9:44 AM, Alexandre Rafalovitch
  arafa...@gmail.com wrote:
   Probably merged somewhat differently with some terms indexes repeating
   between segments. Check the number of segments in data directory.And
   do search for *:* and make sure both do have the same document counts.
  
   Also, In all these discussions, you still haven't answered about how
   fast after indexing you want to _search_? Because, if you are not
   actually searching while committing, you could even index on a
   completely separate server (e.g. a faster one) and swap (or alias)
   index in afterwards. Unless, of course, I missed it, it's a lot of
   emails in a very short window of time.
  
   Regards,
  Alex.
  
   
   Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
   http://www.solr-start.com/
  
  
   On 18 March 2015 at 12:09, Nitin Solanki nitinml...@gmail.com
 wrote:
   When I kept my configuration to 300 for soft commit and 3000 for hard
   commit and indexed some amount of data, I got the data size of the
 whole
   index to be 6GB after completing the indexing.
  
   When I changed the configuration to 60000 for soft commit and 60000
 for
   hard commit and indexed same data then I got the data size of the
 whole
   index to be 5GB after completing the indexing.
  
   But the number of documents in the both scenario were same. I am
  wondering
   how that can be possible?
  
   On Wed, Mar 18, 2015 at 9:14 PM, Nitin Solanki nitinml...@gmail.com
 
  wrote:
  
   Hi Erick,
I am just saying. I want to be sure on commits
  difference..
   What if I do frequent commits or not? And why I am saying that I
 need
  to
   commit things so very quickly because I have to index 28GB of data
  which
   takes 7-8 hours(frequent commits).
   As you said, do commits after 60000 seconds then it will be more
  expensive.
   If I don't encounter with **overlapping searchers warning
 messages**
   then I feel it seems to be okay. Is it?
  
  
  
  
   On Wed, Mar 18, 2015 at 8:54 PM, Erick Erickson 
  erickerick...@gmail.com
   wrote:
  
   Don't do it. Really, why do you want to do this? This seems like
   an XY problem, you haven't explained why you need to commit
   things so very quickly.
  
   I suspect you haven't tried _searching_ while committing at such
   a rate, and you might as well turn all your top-level caches off
   in solrconfig.xml since they won't be useful at all.
  
   Best,
   Erick
  
   On Wed, Mar 18, 2015 at 6:24 AM, Nitin Solanki 
 nitinml...@gmail.com
   wrote:
Hi,
   If I do very very fast indexing(softcommit = 300 and
  hardcommit =
3000) v/s slow indexing (softcommit = 60000 and hardcommit =
 60000)
  as
   you
both said. Will fast indexing fail to index some data?
Any suggestion on this ?
   
On Tue, Mar 17, 2015 at 2:29 AM, Ramkumar R. Aiyengar 
andyetitmo...@gmail.com wrote:
   
Yes, and doing so is painful and takes lots of people and
 hardware
resources to get there for large amounts of data and queries :)
   
As Erick says, work backwards from 60s and first establish how
  high the
commit interval can be to satisfy your use case..
On 16 Mar 2015 16:04, Erick Erickson erickerick...@gmail.com
 
   wrote:
   
 First start by lengthening your soft and hard commit intervals
 substantially. Start with 60000 and work backwards I'd say.

 Ramkumar has tuned the heck out of his installation to get the
  commit
 intervals to be that short ;).

 I'm betting that you'll see your RAM usage go way down, but
  that' s a
 guess until you test.

 Best,
 Erick

 On Sun, Mar 15, 2015 at 10:56 PM, Nitin Solanki 
   nitinml...@gmail.com
 wrote:
  Hi Erick

Re: Whole RAM consumed while Indexing.

2015-03-19 Thread Nitin Solanki
Hi Alexandre,
The segment counts are different, but the document
counts are the same.
With (soft commit - 300 and hard commit - 6000) = number of segments - 43
AND
With (soft commit - 60000 and hard commit - 60000) = number of segments - 31

I don't have any idea about segment counts. What are they? How do I deal
with them? Any ideas?
Or is it fine not to worry about segments?
I just want to ask: if there are more segments, will searching be slower?

On Wed, Mar 18, 2015 at 10:14 PM, Alexandre Rafalovitch arafa...@gmail.com
wrote:

 Probably merged somewhat differently with some terms indexes repeating
 between segments. Check the number of segments in data directory.And
 do search for *:* and make sure both do have the same document counts.

 Also, In all these discussions, you still haven't answered about how
 fast after indexing you want to _search_? Because, if you are not
 actually searching while committing, you could even index on a
 completely separate server (e.g. a faster one) and swap (or alias)
 index in afterwards. Unless, of course, I missed it, it's a lot of
 emails in a very short window of time.

 Regards,
Alex.

 
 Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
 http://www.solr-start.com/


 On 18 March 2015 at 12:09, Nitin Solanki nitinml...@gmail.com wrote:
  When I kept my configuration to 300 for soft commit and 3000 for hard
  commit and indexed some amount of data, I got the data size of the whole
  index to be 6GB after completing the indexing.
 
  When I changed the configuration to 60000 for soft commit and 60000 for
  hard commit and indexed same data then I got the data size of the whole
  index to be 5GB after completing the indexing.
 
  But the number of documents in the both scenario were same. I am
 wondering
  how that can be possible?
 
  On Wed, Mar 18, 2015 at 9:14 PM, Nitin Solanki nitinml...@gmail.com
 wrote:
 
  Hi Erick,
   I am just saying. I want to be sure on commits difference..
  What if I do frequent commits or not? And why I am saying that I need to
  commit things so very quickly because I have to index 28GB of data which
  takes 7-8 hours(frequent commits).
  As you said, do commits after 60000 seconds then it will be more
 expensive.
  If I don't encounter with **overlapping searchers warning messages**
  then I feel it seems to be okay. Is it?
 
 
 
 
  On Wed, Mar 18, 2015 at 8:54 PM, Erick Erickson 
 erickerick...@gmail.com
  wrote:
 
  Don't do it. Really, why do you want to do this? This seems like
  an XY problem, you haven't explained why you need to commit
  things so very quickly.
 
  I suspect you haven't tried _searching_ while committing at such
  a rate, and you might as well turn all your top-level caches off
  in solrconfig.xml since they won't be useful at all.
 
  Best,
  Erick
 
  On Wed, Mar 18, 2015 at 6:24 AM, Nitin Solanki nitinml...@gmail.com
  wrote:
   Hi,
  If I do very very fast indexing(softcommit = 300 and
 hardcommit =
   3000) v/s slow indexing (softcommit = 60000 and hardcommit = 60000)
 as
  you
   both said. Will fast indexing fail to index some data?
   Any suggestion on this ?
  
   On Tue, Mar 17, 2015 at 2:29 AM, Ramkumar R. Aiyengar 
   andyetitmo...@gmail.com wrote:
  
   Yes, and doing so is painful and takes lots of people and hardware
   resources to get there for large amounts of data and queries :)
  
   As Erick says, work backwards from 60s and first establish how high
 the
   commit interval can be to satisfy your use case..
   On 16 Mar 2015 16:04, Erick Erickson erickerick...@gmail.com
  wrote:
  
First start by lengthening your soft and hard commit intervals
substantially. Start with 60000 and work backwards I'd say.
   
Ramkumar has tuned the heck out of his installation to get the
 commit
intervals to be that short ;).
   
I'm betting that you'll see your RAM usage go way down, but that'
 s a
guess until you test.
   
Best,
Erick
   
On Sun, Mar 15, 2015 at 10:56 PM, Nitin Solanki 
  nitinml...@gmail.com
wrote:
 Hi Erick,
 What you are saying is correct. Some **overlapping
 searchers warning messages** are appearing in the logs.
 The **numDocs numbers** change while documents are being added at the
 time of
 indexing.
 Any help?

 On Sat, Mar 14, 2015 at 11:24 PM, Erick Erickson 
erickerick...@gmail.com
 wrote:

 First, the soft commit interval is very short. Very, very,
 very,
  very
 short. 300ms is
 just short of insane unless it's a typo ;).

 Here's a long background:


   
  
 
 https://lucidworks.com/blog/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

 But the short form is that you're opening searchers every 300
 ms.
  The
 hard commit is better,
 but every 3 seconds is still far too short IMO. I'd start with
  soft
  commits of 60000 and hard
  commits of 60000 (60 seconds)

Re: Whole RAM consumed while Indexing.

2015-03-19 Thread Nitin Solanki
Hi Erick..
  I read your article. Really nice...
In it you said that for bulk indexing, set soft commit = 10 mins and
hard commit = 15 sec. Is that also okay for my scenario?
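
For context, my indexing script essentially does the equivalent of the
following (a minimal sketch, assuming a local collection named wikingram
and a batch.json file holding one batch of documents; no commit=true is
sent, so committing is left entirely to the autoCommit settings in
solrconfig.xml):

curl 'http://localhost:8983/solr/wikingram/update' \
     -H 'Content-Type: application/json' \
     --data-binary @batch.json

# Only once, at the very end of the whole run, force a single hard commit:
curl 'http://localhost:8983/solr/wikingram/update?commit=true'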

On Thu, Mar 19, 2015 at 1:53 AM, Erick Erickson erickerick...@gmail.com
wrote:

 bq: As you said, do commits after 60000 seconds

 No, No, No. I'm NOT saying 60000 seconds! That time is in _milliseconds_
 as Shawn said. So setting it to 60000 is every minute.

 From solrconfig.xml, conveniently located immediately above the
 autoCommit tag:

 maxTime - Maximum amount of time in ms that is allowed to pass since a
 document was added before automatically triggering a new commit.

 Also, a lot of answers to soft and hard commits is here as I pointed
 out before, did you read it?


 https://lucidworks.com/blog/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

 Best
 Erick

 On Wed, Mar 18, 2015 at 9:44 AM, Alexandre Rafalovitch
 arafa...@gmail.com wrote:
  Probably merged somewhat differently with some terms indexes repeating
  between segments. Check the number of segments in the data directory. And
  do search for *:* and make sure both do have the same document counts.
 
  Also, In all these discussions, you still haven't answered about how
  fast after indexing you want to _search_? Because, if you are not
  actually searching while committing, you could even index on a
  completely separate server (e.g. a faster one) and swap (or alias)
  index in afterwards. Unless, of course, I missed it, it's a lot of
  emails in a very short window of time.
 
  Regards,
 Alex.
 
  
  Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
  http://www.solr-start.com/
 
 
  On 18 March 2015 at 12:09, Nitin Solanki nitinml...@gmail.com wrote:
  When I kept my configuration to 300 for soft commit and 3000 for hard
  commit and indexed some amount of data, I got the data size of the whole
  index to be 6GB after completing the indexing.
 
  When I changed the configuration to 60000 for soft commit and 60000 for
  hard commit and indexed same data then I got the data size of the whole
  index to be 5GB after completing the indexing.
 
  But the number of documents in both scenarios was the same. I am
 wondering
  how that can be possible?
 
  On Wed, Mar 18, 2015 at 9:14 PM, Nitin Solanki nitinml...@gmail.com
 wrote:
 
  Hi Erick,
   I am just saying. I want to be sure on commits
 difference..
  What if I do frequent commits or not? And why I am saying that I need
 to
  commit things so very quickly because I have to index 28GB of data
 which
  takes 7-8 hours(frequent commits).
  As you said, do commits after 6 seconds then it will be more
 expensive.
  If I don't encounter with **overlapping searchers warning messages**
  then I feel it seems to be okay. Is it?
 
 
 
 
  On Wed, Mar 18, 2015 at 8:54 PM, Erick Erickson 
 erickerick...@gmail.com
  wrote:
 
  Don't do it. Really, why do you want to do this? This seems like
  an XY problem, you haven't explained why you need to commit
  things so very quickly.
 
  I suspect you haven't tried _searching_ while committing at such
  a rate, and you might as well turn all your top-level caches off
  in solrconfig.xml since they won't be useful at all.
 
  Best,
  Erick
 
  On Wed, Mar 18, 2015 at 6:24 AM, Nitin Solanki nitinml...@gmail.com
  wrote:
   Hi,
  If I do very very fast indexing(softcommit = 300 and
 hardcommit =
    3000) v/s slow indexing (softcommit = 60000 and hardcommit = 60000)
 as
  you
   both said. Will fast indexing fail to index some data?
   Any suggestion on this ?
  
   On Tue, Mar 17, 2015 at 2:29 AM, Ramkumar R. Aiyengar 
   andyetitmo...@gmail.com wrote:
  
   Yes, and doing so is painful and takes lots of people and hardware
   resources to get there for large amounts of data and queries :)
  
   As Erick says, work backwards from 60s and first establish how
 high the
   commit interval can be to satisfy your use case..
   On 16 Mar 2015 16:04, Erick Erickson erickerick...@gmail.com
  wrote:
  
First start by lengthening your soft and hard commit intervals
 substantially. Start with 60000 and work backwards I'd say.
   
Ramkumar has tuned the heck out of his installation to get the
 commit
intervals to be that short ;).
   
 I'm betting that you'll see your RAM usage go way down, but that's a
 guess until you test.
   
Best,
Erick
   
On Sun, Mar 15, 2015 at 10:56 PM, Nitin Solanki 
  nitinml...@gmail.com
wrote:
 Hi Erick,
  You are right. Some **overlapping searchers**
  warning messages are coming in the logs.
  **numDocs numbers** are changing when documents are being added
  at the time of
  indexing.
 Any help?

 On Sat, Mar 14, 2015 at 11:24 PM, Erick Erickson 
erickerick...@gmail.com
 wrote:

 First, the soft commit interval is very short. Very, very,
 very,
  very
 short. 300ms is
 just

Re: Whole RAM consumed while Indexing.

2015-03-18 Thread Nitin Solanki
Hi,
   If I do very very fast indexing (softcommit = 300 and hardcommit =
3000) v/s slow indexing (softcommit = 60000 and hardcommit = 60000), as you
both said, will fast indexing fail to index some data?
Any suggestion on this ?

On Tue, Mar 17, 2015 at 2:29 AM, Ramkumar R. Aiyengar 
andyetitmo...@gmail.com wrote:

 Yes, and doing so is painful and takes lots of people and hardware
 resources to get there for large amounts of data and queries :)

 As Erick says, work backwards from 60s and first establish how high the
 commit interval can be to satisfy your use case..
 On 16 Mar 2015 16:04, Erick Erickson erickerick...@gmail.com wrote:

  First start by lengthening your soft and hard commit intervals
  substantially. Start with 60000 and work backwards I'd say.
 
  Ramkumar has tuned the heck out of his installation to get the commit
  intervals to be that short ;).
 
  I'm betting that you'll see your RAM usage go way down, but that's a
  guess until you test.
 
  Best,
  Erick
 
  On Sun, Mar 15, 2015 at 10:56 PM, Nitin Solanki nitinml...@gmail.com
  wrote:
   Hi Erick,
   You are right. Some **overlapping searchers**
   warning messages are coming in the logs.
   **numDocs numbers** are changing when documents are being added at the time
   of
   indexing.
   Any help?
  
   On Sat, Mar 14, 2015 at 11:24 PM, Erick Erickson 
  erickerick...@gmail.com
   wrote:
  
   First, the soft commit interval is very short. Very, very, very, very
   short. 300ms is
   just short of insane unless it's a typo ;).
  
   Here's a long background:
  
  
 
 https://lucidworks.com/blog/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
  
   But the short form is that you're opening searchers every 300 ms. The
   hard commit is better,
   but every 3 seconds is still far too short IMO. I'd start with soft
    commits of 60000 and hard
    commits of 60000 (60 seconds), meaning that you're going to have to
   wait 1 minute for
   docs to show up unless you explicitly commit.
  
   You're throwing away all the caches configured in solrconfig.xml more
   than 3 times a second,
   executing autowarming, etc, etc, etc
  
   Changing these to longer intervals might cure the problem, but if not
   then, as Hoss would
   say, details matter. I suspect you're also seeing overlapping
   searchers warning messages
    in your log, and it's _possible_ that what's happening is that you're
   just exceeding the
   max warming searchers and never opening a new searcher with the
   newly-indexed documents.
   But that's a total shot in the dark.
  
   How are you looking for docs (and not finding them)? Does the numDocs
   number in
   the solr admin screen change?
  
  
   Best,
   Erick
  
   On Thu, Mar 12, 2015 at 10:27 PM, Nitin Solanki nitinml...@gmail.com
 
   wrote:
Hi Alexandre,
   
   
*Hard Commit* is :
   
  <autoCommit>
    <maxTime>${solr.autoCommit.maxTime:3000}</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>

 *Soft Commit* is :

 <autoSoftCommit>
 <maxTime>${solr.autoSoftCommit.maxTime:300}</maxTime>
 </autoSoftCommit>

 And I am committing 20,000 documents each time.
 Is it a good config for committing?
 Or am I doing something wrong?
   
   
On Fri, Mar 13, 2015 at 8:52 AM, Alexandre Rafalovitch 
   arafa...@gmail.com
wrote:
   
What's your commit strategy? Explicit commits? Soft commits/hard
commits (in solrconfig.xml)?
   
Regards,
   Alex.

Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
http://www.solr-start.com/
   
   
On 12 March 2015 at 23:19, Nitin Solanki nitinml...@gmail.com
  wrote:
 Hello,
  I have written a python script to index 20,000 documents
  at a time on Solr. I have 28 GB RAM with 8 CPUs.
  When I started indexing, 15 GB of RAM was free. While indexing,
  all RAM is consumed but **not** a single document is indexed. Why so?
  And it throws *HTTPError: HTTP Error 503: Service Unavailable* in the
  python script.
 I think it is due to heavy load on Zookeeper by which all nodes
  went
down.
 I am not sure about that. Any help please..
 Or anything else is happening..
 And how to overcome this issue.
 Please assist me towards right path.
 Thanks..

 Warm Regards,
 Nitin Solanki
   
  
 



Re: Add replica on shards

2015-03-18 Thread Nitin Solanki
Thanks Norgorn.
I did the same thing but in a different manner..
like -

localhost:8983/solr/admin/cores?action=CREATE&name=wikingram_shard4_replica3&collection=wikingram&property.shard=shard4

On Wed, Mar 18, 2015 at 7:20 PM, Norgorn lsunnyd...@mail.ru wrote:


  You can do the same simply with something like this


  http://localhost:8983/solr/admin/cores?action=CREATE&collection=wikingram&name=ANY_NAME_HERE&shard=shard1

  The main part is shard=shard1: when you create a core with an existing shard
  (the core name doesn't matter, we use collection_shard1_replica2, but you can
  do whatever you want), this core becomes a replica and copies data from the
  leading shard.



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Add-replica-on-shards-tp4193659p4193732.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: Whole RAM consumed while Indexing.

2015-03-18 Thread Nitin Solanki
Hi Erick,
 I am just saying. I want to be sure about the difference commits make.
What if I do frequent commits or not? And the reason I am saying that I need to
commit things so very quickly is because I have to index 28GB of data, which
takes 7-8 hours (frequent commits).
As you said, do commits after 60000 seconds, then it will be more expensive.
If I don't encounter the **overlapping searchers warning messages** then
I feel it is okay. Is it?




On Wed, Mar 18, 2015 at 8:54 PM, Erick Erickson erickerick...@gmail.com
wrote:

 Don't do it. Really, why do you want to do this? This seems like
 an XY problem, you haven't explained why you need to commit
 things so very quickly.

 I suspect you haven't tried _searching_ while committing at such
 a rate, and you might as well turn all your top-level caches off
 in solrconfig.xml since they won't be useful at all.

 Best,
 Erick

 On Wed, Mar 18, 2015 at 6:24 AM, Nitin Solanki nitinml...@gmail.com
 wrote:
  Hi,
 If I do very very fast indexing(softcommit = 300 and hardcommit =
   3000) v/s slow indexing (softcommit = 60000 and hardcommit = 60000) as
 you
  both said. Will fast indexing fail to index some data?
  Any suggestion on this ?
 
  On Tue, Mar 17, 2015 at 2:29 AM, Ramkumar R. Aiyengar 
  andyetitmo...@gmail.com wrote:
 
  Yes, and doing so is painful and takes lots of people and hardware
  resources to get there for large amounts of data and queries :)
 
  As Erick says, work backwards from 60s and first establish how high the
  commit interval can be to satisfy your use case..
  On 16 Mar 2015 16:04, Erick Erickson erickerick...@gmail.com wrote:
 
   First start by lengthening your soft and hard commit intervals
    substantially. Start with 60000 and work backwards I'd say.
  
   Ramkumar has tuned the heck out of his installation to get the commit
   intervals to be that short ;).
  
    I'm betting that you'll see your RAM usage go way down, but that's a
   guess until you test.
  
   Best,
   Erick
  
   On Sun, Mar 15, 2015 at 10:56 PM, Nitin Solanki nitinml...@gmail.com
 
   wrote:
Hi Erick,
 You are right. Some **overlapping searchers**
 warning messages are coming in the logs.
 **numDocs numbers** are changing when documents are being added at the
 time of
 indexing.
Any help?
   
On Sat, Mar 14, 2015 at 11:24 PM, Erick Erickson 
   erickerick...@gmail.com
wrote:
   
First, the soft commit interval is very short. Very, very, very,
 very
short. 300ms is
just short of insane unless it's a typo ;).
   
Here's a long background:
   
   
  
 
 https://lucidworks.com/blog/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
   
But the short form is that you're opening searchers every 300 ms.
 The
hard commit is better,
but every 3 seconds is still far too short IMO. I'd start with soft
 commits of 60000 and hard
 commits of 60000 (60 seconds), meaning that you're going to have to
wait 1 minute for
docs to show up unless you explicitly commit.
   
You're throwing away all the caches configured in solrconfig.xml
 more
than 3 times a second,
executing autowarming, etc, etc, etc
   
Changing these to longer intervals might cure the problem, but if
 not
then, as Hoss would
say, details matter. I suspect you're also seeing overlapping
searchers warning messages
 in your log, and it's _possible_ that what's happening is that
 you're
just exceeding the
max warming searchers and never opening a new searcher with the
newly-indexed documents.
But that's a total shot in the dark.
   
How are you looking for docs (and not finding them)? Does the
 numDocs
number in
the solr admin screen change?
   
   
Best,
Erick
   
On Thu, Mar 12, 2015 at 10:27 PM, Nitin Solanki 
 nitinml...@gmail.com
  
wrote:
 Hi Alexandre,


 *Hard Commit* is :

   <autoCommit>
     <maxTime>${solr.autoCommit.maxTime:3000}</maxTime>
     <openSearcher>false</openSearcher>
   </autoCommit>

  *Soft Commit* is :

  <autoSoftCommit>
  <maxTime>${solr.autoSoftCommit.maxTime:300}</maxTime>
  </autoSoftCommit>

  And I am committing 20,000 documents each time.
  Is it a good config for committing?
  Or am I doing something wrong?


 On Fri, Mar 13, 2015 at 8:52 AM, Alexandre Rafalovitch 
arafa...@gmail.com
 wrote:

 What's your commit strategy? Explicit commits? Soft commits/hard
 commits (in solrconfig.xml)?

 Regards,
Alex.
 
 Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
 http://www.solr-start.com/


 On 12 March 2015 at 23:19, Nitin Solanki nitinml...@gmail.com
   wrote:
  Hello,
 I have written a python script to index 20,000 documents
 at a time on Solr. I have 28 GB RAM with 8 CPUs.
  When I started indexing

Re: Whole RAM consumed while Indexing.

2015-03-18 Thread Nitin Solanki
When I kept my configuration to 300 for soft commit and 3000 for hard
commit and indexed some amount of data, I got the data size of the whole
index to be 6GB after completing the indexing.

When I changed the configuration to 60000 for soft commit and 60000 for
hard commit and indexed the same data, I got the data size of the whole
index to be 5GB after completing the indexing.

But the number of documents in both scenarios was the same. I am wondering
how that can be possible?

On Wed, Mar 18, 2015 at 9:14 PM, Nitin Solanki nitinml...@gmail.com wrote:

 Hi Erick,
   I am just saying. I want to be sure about the difference commits make.
 What if I do frequent commits or not? And the reason I am saying that I need to
 commit things so very quickly is because I have to index 28GB of data, which
 takes 7-8 hours (frequent commits).
 As you said, do commits after 60000 seconds, then it will be more expensive.
 If I don't encounter the **overlapping searchers warning messages**
 then I feel it is okay. Is it?




 On Wed, Mar 18, 2015 at 8:54 PM, Erick Erickson erickerick...@gmail.com
 wrote:

 Don't do it. Really, why do you want to do this? This seems like
 an XY problem, you haven't explained why you need to commit
 things so very quickly.

 I suspect you haven't tried _searching_ while committing at such
 a rate, and you might as well turn all your top-level caches off
 in solrconfig.xml since they won't be useful at all.

 Best,
 Erick

 On Wed, Mar 18, 2015 at 6:24 AM, Nitin Solanki nitinml...@gmail.com
 wrote:
  Hi,
 If I do very very fast indexing(softcommit = 300 and hardcommit =
   3000) v/s slow indexing (softcommit = 60000 and hardcommit = 60000) as
 you
  both said. Will fast indexing fail to index some data?
  Any suggestion on this ?
 
  On Tue, Mar 17, 2015 at 2:29 AM, Ramkumar R. Aiyengar 
  andyetitmo...@gmail.com wrote:
 
  Yes, and doing so is painful and takes lots of people and hardware
  resources to get there for large amounts of data and queries :)
 
  As Erick says, work backwards from 60s and first establish how high the
  commit interval can be to satisfy your use case..
  On 16 Mar 2015 16:04, Erick Erickson erickerick...@gmail.com
 wrote:
 
   First start by lengthening your soft and hard commit intervals
    substantially. Start with 60000 and work backwards I'd say.
  
   Ramkumar has tuned the heck out of his installation to get the commit
   intervals to be that short ;).
  
    I'm betting that you'll see your RAM usage go way down, but that's a
   guess until you test.
  
   Best,
   Erick
  
   On Sun, Mar 15, 2015 at 10:56 PM, Nitin Solanki 
 nitinml...@gmail.com
   wrote:
Hi Erick,
 You are right. Some **overlapping searchers**
 warning messages are coming in the logs.
 **numDocs numbers** are changing when documents are being added at the
 time of
 indexing.
Any help?
   
On Sat, Mar 14, 2015 at 11:24 PM, Erick Erickson 
   erickerick...@gmail.com
wrote:
   
First, the soft commit interval is very short. Very, very, very,
 very
short. 300ms is
just short of insane unless it's a typo ;).
   
Here's a long background:
   
   
  
 
 https://lucidworks.com/blog/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
   
But the short form is that you're opening searchers every 300 ms.
 The
hard commit is better,
but every 3 seconds is still far too short IMO. I'd start with
 soft
 commits of 60000 and hard
 commits of 60000 (60 seconds), meaning that you're going to have
 to
wait 1 minute for
docs to show up unless you explicitly commit.
   
You're throwing away all the caches configured in solrconfig.xml
 more
than 3 times a second,
executing autowarming, etc, etc, etc
   
Changing these to longer intervals might cure the problem, but if
 not
then, as Hoss would
say, details matter. I suspect you're also seeing overlapping
searchers warning messages
 in your log, and it's _possible_ that what's happening is that
 you're
just exceeding the
max warming searchers and never opening a new searcher with the
newly-indexed documents.
But that's a total shot in the dark.
   
How are you looking for docs (and not finding them)? Does the
 numDocs
number in
the solr admin screen change?
   
   
Best,
Erick
   
On Thu, Mar 12, 2015 at 10:27 PM, Nitin Solanki 
 nitinml...@gmail.com
  
wrote:
 Hi Alexandre,


 *Hard Commit* is :

   <autoCommit>
 <maxTime>${solr.autoCommit.maxTime:3000}</maxTime>
 <openSearcher>false</openSearcher>
   </autoCommit>

  *Soft Commit* is :

  <autoSoftCommit>
  <maxTime>${solr.autoSoftCommit.maxTime:300}</maxTime>
  </autoSoftCommit>

  And I am committing 20,000 documents each time.
  Is it a good config for committing?
  Or am I doing something wrong?


 On Fri, Mar 13, 2015 at 8:52 AM

Re: Add replica on shards

2015-03-18 Thread Nitin Solanki
Any help please...

On Wed, Mar 18, 2015 at 12:02 PM, Nitin Solanki nitinml...@gmail.com
wrote:

 Hi,
  I have created 8 shards on a collection named **wikingram**.
  At that time, I did not create any replicas. Now, I want to add a
  replica on each shard. How can I do that?
  I tried this - **sudo curl
  http://localhost:8983/solr/admin/collections?action=ADDREPLICA&collection=wikingram&shard=shard1&node=localhost:8983_solr**
  but it is not working.

 It throws this error -


  <response>
  <lst name="responseHeader">
  <int name="status">400</int>
  <int name="QTime">86</int>
  </lst>
  <str name="Operation ADDREPLICA caused
  exception:">org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
  Could not find collection : null</str>
  <lst name="exception">
  <str name="msg">Could not find collection : null</str>
  <int name="rspCode">400</int>
  </lst>
  <lst name="error">
  <str name="msg">Could not find collection : null</str>
  <int name="code">400</int>
  </lst>
  </response>

 Any help on this?



Add replica on shards

2015-03-18 Thread Nitin Solanki
Hi,
 I have created 8 shards on a collection named **wikingram**.
At that time, I did not create any replicas. Now, I want to add a
replica on each shard. How can I do that?
I tried this - **sudo curl
http://localhost:8983/solr/admin/collections?action=ADDREPLICA&collection=wikingram&shard=shard1&node=localhost:8983_solr**
but it is not working.

It throws this error -


<response>
<lst name="responseHeader">
<int name="status">400</int>
<int name="QTime">86</int>
</lst>
<str name="Operation ADDREPLICA caused
exception:">org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
Could not find collection : null</str>
<lst name="exception">
<str name="msg">Could not find collection : null</str>
<int name="rspCode">400</int>
</lst>
<lst name="error">
<str name="msg">Could not find collection : null</str>
<int name="code">400</int>
</lst>
</response>

Any help on this?
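
(A note for anyone hitting the same error: when the URL is not quoted, the
shell treats each & as a background operator, so Solr likely only receives
action=ADDREPLICA and sees the collection parameter as null - which would
match the Could not find collection : null message above. Quoting the whole
URL, as in this sketch, avoids that:

curl 'http://localhost:8983/solr/admin/collections?action=ADDREPLICA&collection=wikingram&shard=shard1&node=localhost:8983_solr'
)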


Re: Want to modify Solr Source Code

2015-03-17 Thread Nitin Solanki
Hi Anshum,
  The reason behind editing the source code is that I am using
the spell check component in Solr. I have implemented it and it is working
fine..
But sometimes the suggestion frequency varies. I have explained that in
this post -
http://stackoverflow.com/questions/28857915/original-frequency-is-not-matching-with-suggestion-frequency-in-solr.
Please check it..
And now, I am thinking that if I am able to add hitcount instead of freq
in the suggestion, it will serve my purpose.

On Tue, Mar 17, 2015 at 1:23 PM, Anshum Gupta ans...@anshumgupta.net
wrote:

 Hi Nitin,

 Do you intend to browse the code? If you really want to modify the code,
 can you tell us about what exactly is it that you're trying to achieve?
 Can you clarify on how you want to test Solr? If so, do you plan on running
 the tests that Solr ships with or do you have your own tests?


 All said and done, if you don't want to use svn but still want to download
 the Solr source, you can download the same (for Solr 5.0.0) from any of the
 mirrors listed here:

 http://www.apache.org/dyn/closer.cgi/lucene/solr/5.0.0



 On Tue, Mar 17, 2015 at 12:42 AM, Nitin Solanki nitinml...@gmail.com
 wrote:

  Hi Gora,
  Hi, I want to make changes only on my machine, without
  svn.
   I want to test the source code. How? Any steps to do so? Please
  help..
 
  On Tue, Mar 17, 2015 at 1:01 PM, Gora Mohanty g...@mimirtech.com
 wrote:
 
   On 17 March 2015 at 12:22, Nitin Solanki nitinml...@gmail.com wrote:
   
Hi,
 I want to modify the solr source code. I don't have any idea
   where
source code is available. I want to edit source code. How can I do ?
Any help please...
  
   Please start with:
  
  
 
 http://wiki.apache.org/solr/HowToContribute#Contributing_Code_.28Features.2C_Bug_Fixes.2C_Tests.2C_etc29
  
   Regards,
   Gora
  
 



 --
 Anshum Gupta



Want to modify Solr Source Code

2015-03-17 Thread Nitin Solanki
Hi,
 I want to modify the solr source code. I don't have any idea where
the source code is available. I want to edit the source code. How can I do that?
Any help please...


Re: Want to modify Solr Source Code

2015-03-17 Thread Nitin Solanki
Hi Ramkumar,
 Sorry, but svn will be cumbersome for me and I
don't want to use it right now. I want to do everything on my local machine
without using svn.
As you said, I downloaded the -src.tgz; I have downloaded solr-4.10.2-src.tar.gz.
Now I am able to see the source code. How do I configure it, and how do I
compile a file if I change it?
Any help please..

On Tue, Mar 17, 2015 at 1:21 PM, Ramkumar R. Aiyengar 
andyetitmo...@gmail.com wrote:

 Is your concern that you want to be able to modify source code just on your
 machine or that  you can't for some reason install svn?

 If it's the former, even if you checkout using svn, you can't modify
 anything outside the machine as changes can be checked in only by the
 committers of the project. You will need to raise a JIRA for the changes to
 go back in as described by the wiki page.

 If the latter, try downloading the source code using the downloads section
 in https://lucene.apache.org/solr and choose the download which ends as
 -src.tgz, that has the source bundled as a single file.
 On 17 Mar 2015 07:42, Nitin Solanki nitinml...@gmail.com wrote:

  Hi Gora,
  Hi, I want to make changes only on my machine, without
  svn.
   I want to test the source code. How? Any steps to do so? Please
  help..
 
  On Tue, Mar 17, 2015 at 1:01 PM, Gora Mohanty g...@mimirtech.com
 wrote:
 
   On 17 March 2015 at 12:22, Nitin Solanki nitinml...@gmail.com wrote:
   
Hi,
 I want to modify the solr source code. I don't have any idea
   where
source code is available. I want to edit source code. How can I do ?
Any help please...
  
   Please start with:
  
  
 
 http://wiki.apache.org/solr/HowToContribute#Contributing_Code_.28Features.2C_Bug_Fixes.2C_Tests.2C_etc29
  
   Regards,
   Gora
  
 



Re: Want to modify Solr Source Code

2015-03-17 Thread Nitin Solanki
Hi Gora,
   Hi, I want to make changes only on my machine, without svn.
I want to test the source code. How? Any steps to do so? Please help..

On Tue, Mar 17, 2015 at 1:01 PM, Gora Mohanty g...@mimirtech.com wrote:

 On 17 March 2015 at 12:22, Nitin Solanki nitinml...@gmail.com wrote:
 
  Hi,
   I want to modify the solr source code. I don't have any idea
 where
  source code is available. I want to edit source code. How can I do ?
  Any help please...

 Please start with:

 http://wiki.apache.org/solr/HowToContribute#Contributing_Code_.28Features.2C_Bug_Fixes.2C_Tests.2C_etc29

 Regards,
 Gora



Re: Want to modify Solr Source Code

2015-03-17 Thread Nitin Solanki
I have already downloaded
http://archive.apache.org/dist/lucene/solr/4.10.2/solr-4.10.2.tgz. Now, how
do I view or edit the source code of any file? I don't have any idea about
it.. Your help is appreciated..
Please guide me step by step..
Thanks again..


On Tue, Mar 17, 2015 at 1:16 PM, Gora Mohanty g...@mimirtech.com wrote:

 On 17 March 2015 at 13:12, Nitin Solanki nitinml...@gmail.com wrote:
  Hi Gora,
  Hi, I want to make changes only on my machine, without
  svn.
   I want to test the source code. How? Any steps to do so? Please
  help..

 You could still use SVN for a local repository. Else, you can download
 a tar.gz of a Solr distribution from under the Download link at the
 top right of http://lucene.apache.org/solr/

 Regards,
 Gora



Re: Want to modify Solr Source Code

2015-03-17 Thread Nitin Solanki
Hi Gora,
 Thanks again. Do you have a link to the Wiki article?
Please send it to me.

On Tue, Mar 17, 2015 at 1:30 PM, Gora Mohanty g...@mimirtech.com wrote:

 On 17 March 2015 at 13:21, Nitin Solanki nitinml...@gmail.com wrote:

  I have already downloaded
  http://archive.apache.org/dist/lucene/solr/4.10.2/solr-4.10.2.tgz. Now,
  How
  to view or edit the source code of any file? I don't have any idea about
  it.. Your help is appreciated..
   Please guide me step by step..
  Thanks again..
 

 You need to learn the basics of putting together a development setup
 yourself, or from a local mentor. A .tgz is a gzip-compressed tar file that
 can be unarchived with tar, or most unarchivers. You are probably best off
 to use a Java IDE, such as Eclipse, to edit the source code. The Wiki
 article covers how to compile the code, and run the built-in tests.

 Regards,
 Gora



Shards doesn't seems to give same suggestion of a term/ misspell term.

2015-03-17 Thread Nitin Solanki
Hi everyone,
I am stuck on a big issue. First I will explain what I am
doing. I am building spell correction using Solr, where I have indexed
21GB of data and use sharding/distributed search. I have created 4 nodes
with 8 shards and no replicas.

When I search a term, I get suggestions for it. But the problem is that
each shard/server is not able to give me the same suggestion terms for each
searched term.
*Example* - If I search the term **chare** then I get **care** as a suggestion,
but I get the **care** suggestion on 6 shards and it is missing on 2 shards, so
the total frequency of **care** over those 6 shards is 50. But the actual/original
frequency of **care** is 96, and the remaining frequency (46) is missing in the
2 shards.
Using debugQuery=true, I was able to find the frequency that was
left out.
Now, how do I get the suggestions inside each shard to be the same? Or is there
any other solution?
Please help..
Thanks again.
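
For debugging, I am querying each shard core directly with distrib=false
and comparing its local frequencies against the distributed result (a
sketch; the core name is from my setup, and /spell is assumed to be the
spellcheck-enabled request handler):

# Local view of one shard, bypassing distributed search:
curl 'http://localhost:8983/solr/wikingram_shard1_replica1/spell?q=chare&spellcheck=true&spellcheck.extendedResults=true&distrib=false'

# Distributed view across all shards:
curl 'http://localhost:8983/solr/wikingram/spell?q=chare&spellcheck=true&spellcheck.extendedResults=true'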


Warm Regards,
Nitin Solanki


Re: Want to modify Solr Source Code

2015-03-17 Thread Nitin Solanki
Hi all,
   I have configured the solr source code with eclipse. Now, I have
written a print statement in SolrSpellChecker.java. Now, I want
to compile this file. How do I do that?
Any help please...
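
What I have tried so far from the root of the source checkout is below (a
sketch, assuming Apache Ant is installed; I am not sure these are the right
targets):

cd solr
ant compile   # compile the Solr core and contrib modules
ant example   # rebuild the runnable example server (4.x)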

On Tue, Mar 17, 2015 at 2:27 PM, Gora Mohanty g...@mimirtech.com wrote:

 On 17 March 2015 at 13:38, Nitin Solanki nitinml...@gmail.com wrote:
  Hi Gora,
   Thanks again. Do you have any link/ article of Wiki article?
  Please send me.

 Sent the link in my very first follow-up:

 http://wiki.apache.org/solr/HowToContribute#Contributing_Code_.28Features.2C_Bug_Fixes.2C_Tests.2C_etc29

 Regards,
 Gora



Re: Want to modify Solr Source Code

2015-03-17 Thread Nitin Solanki
Hi all,
   How do I set breakpoints throughout the Solr code and step through
the code?
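
(I gather the usual route is remote debugging: start Solr's JVM with the
JPDA flags below, then attach Eclipse via Run > Debug Configurations >
Remote Java Application on the same port. A sketch for the 4.x start.jar:

java -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8000 -jar start.jar
)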

On Tue, Mar 17, 2015 at 6:22 PM, Nitin Solanki nitinml...@gmail.com wrote:

 Hi all,
I have configured solr source code with eclipse. Now, I have
 written a print statement in between the SolrSpellChecker.java. Now, I want
 to compile this file. How to do that ?
 Any help please...

 On Tue, Mar 17, 2015 at 2:27 PM, Gora Mohanty g...@mimirtech.com wrote:

 On 17 March 2015 at 13:38, Nitin Solanki nitinml...@gmail.com wrote:
  Hi Gora,
   Thanks again. Do you have any link/ article of Wiki
 article?
  Please send me.

 Sent the link in my very first follow-up:

 http://wiki.apache.org/solr/HowToContribute#Contributing_Code_.28Features.2C_Bug_Fixes.2C_Tests.2C_etc29

 Regards,
 Gora





thresholdTokenFrequency changes suggestion frequency..

2015-03-16 Thread Nitin Solanki
Hi,
  I am not getting why the suggestion frequency varies from the
original frequency.
Example - I have a word = *who* and its original frequency is *100*, but
when I find a suggestion for it, the suggestion frequency changes to *50*.

I think it is happening because of *thresholdTokenFrequency*.
When I set the value of thresholdTokenFrequency to *0.1* then it gives one
frequency for the 'who' suggestion, while if I set the value of
thresholdTokenFrequency to *0.0001* then it gives a different
frequency. Why so? I am not getting the logic behind this..

As we know, the suggestion frequency should be the same as the original frequency in the index -

*The spellcheck.extendedResults=true parameter provides frequency of each
original term in the index (origFreq) as well as the frequency of each
suggestion in the index (frequency).*
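
For reference, the request I use to read those numbers looks like this (a
sketch; /spell is assumed to be my spellcheck-enabled handler):

curl 'http://localhost:8983/solr/wikingram/spell?q=who&spellcheck=true&spellcheck.extendedResults=true'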


maxQueryFrequency v/s thresholdTokenFrequency

2015-03-16 Thread Nitin Solanki
Hello Everyone,
 Can anybody please explain to me what the
difference is between maxQueryFrequency and thresholdTokenFrequency?
I found this link -
http://wiki.apache.org/solr/SpellCheckComponent#thresholdTokenFrequency but
am unable to understand it..
I am very confused between the two of them.
Your help is appreciated.


Warm Regards,
Nitin


Re: Whole RAM consumed while Indexing.

2015-03-15 Thread Nitin Solanki
Hi Erick,
You are right. Some **overlapping searchers**
warning messages are coming in the logs.
**numDocs numbers** are changing when documents are being added at the time of
indexing.
Any help?

On Sat, Mar 14, 2015 at 11:24 PM, Erick Erickson erickerick...@gmail.com
wrote:

 First, the soft commit interval is very short. Very, very, very, very
 short. 300ms is
 just short of insane unless it's a typo ;).

 Here's a long background:

 https://lucidworks.com/blog/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

 But the short form is that you're opening searchers every 300 ms. The
 hard commit is better,
 but every 3 seconds is still far too short IMO. I'd start with soft
  commits of 60000 and hard
  commits of 60000 (60 seconds), meaning that you're going to have to
 wait 1 minute for
 docs to show up unless you explicitly commit.

 You're throwing away all the caches configured in solrconfig.xml more
 than 3 times a second,
 executing autowarming, etc, etc, etc

 Changing these to longer intervals might cure the problem, but if not
 then, as Hoss would
 say, details matter. I suspect you're also seeing overlapping
 searchers warning messages
  in your log, and it's _possible_ that what's happening is that you're
 just exceeding the
 max warming searchers and never opening a new searcher with the
 newly-indexed documents.
 But that's a total shot in the dark.

 How are you looking for docs (and not finding them)? Does the numDocs
 number in
 the solr admin screen change?


 Best,
 Erick

 On Thu, Mar 12, 2015 at 10:27 PM, Nitin Solanki nitinml...@gmail.com
 wrote:
  Hi Alexandre,
 
 
  *Hard Commit* is :
 
    <autoCommit>
  <maxTime>${solr.autoCommit.maxTime:3000}</maxTime>
  <openSearcher>false</openSearcher>
    </autoCommit>

   *Soft Commit* is :

   <autoSoftCommit>
   <maxTime>${solr.autoSoftCommit.maxTime:300}</maxTime>
   </autoSoftCommit>

   And I am committing 20,000 documents each time.
   Is it a good config for committing?
   Or am I doing something wrong?
 
 
  On Fri, Mar 13, 2015 at 8:52 AM, Alexandre Rafalovitch 
 arafa...@gmail.com
  wrote:
 
  What's your commit strategy? Explicit commits? Soft commits/hard
  commits (in solrconfig.xml)?
 
  Regards,
 Alex.
  
  Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
  http://www.solr-start.com/
 
 
  On 12 March 2015 at 23:19, Nitin Solanki nitinml...@gmail.com wrote:
   Hello,
  I have written a python script to index 20,000 documents
  at a time on Solr. I have 28 GB RAM with 8 CPUs.
   When I started indexing, at that time 15 GB RAM was freed. While
  indexing,
   all RAM is consumed but **not** a single document is indexed. Why so?
    And it throws *HTTPError: HTTP Error 503: Service Unavailable* in
 python
   script.
   I think it is due to heavy load on Zookeeper by which all nodes went
  down.
   I am not sure about that. Any help please..
   Or anything else is happening..
   And how to overcome this issue.
   Please assist me towards right path.
   Thanks..
  
   Warm Regards,
   Nitin Solanki
 



Re: Update solr schema.xml in real time for Solr 4.10.1

2015-03-14 Thread Nitin Solanki
Hi Zheng,
  You said **there's no physical schema.xml**, but I
have one. I am using the sample_techproducts_configs configuration, where I
found schema.xml. I manage my schema.xml there, and then I
upload it into zookeeper and reload the collection.



On 3/14/15, Zheng Lin Edwin Yeo edwinye...@gmail.com wrote:
 Hi Erick,

 The real time update of the schema means we can just do an update using
 REST-API curl instead of manually editing the schema.xml and restart the
 Solr server. In Solr 5.0, if Solr is loading the schema from the resource
 named in 'managedSchemaResourceName', instead of schema.xml, I can just
 update it from the REST-API curl.

 For earlier version of Solr, the default setting is
 ClassicIndexSchemaFactory, which is read from schema.xml. So besides
 getting Solr to load the schema from the resource named in
 'managedSchemaResourceName', rather than from schema.xml, is there other
 settings required?

 Zheng Lin

 On 12 March 2015 at 23:26, Erick Erickson erickerick...@gmail.com wrote:

 Actually I ran across a neat IntelliJ plugin that you could install
 and directly edit ZK files. And I'm pretty sure there are stand-alone
 programs that do this, but they are all outside Solr.

 I'm not sure what real time update of the schema is for, would you
 (Zheng) explain further? Collections _must_ be reloaded for schema
 changes to take effect so I'm not quite sure what you're referring to.

 Nitin:
 The usual process is to have the master config be local, change the
 local version then upload it to ZK with the upconfig option in zkCli,
 then reload your collection.

 Best,
 Erick

 On Thu, Mar 12, 2015 at 6:04 AM, Shawn Heisey apa...@elyograg.org
 wrote:
  On 3/12/2015 2:00 AM, Zheng Lin Edwin Yeo wrote:
  I understand that in Solr 5.0, they provide a REST API to do real-time
  update of the schema using Curl. However, I could not do that for my
  eariler version of Solr 4.10.1.
 
  Would like to check, is this function available for the earlier
  version
 of
  Solr, and is the curl syntax the same as Solr 5.0?
 
  Providing a way to simply edit the config files directly is a potential
  security issue.  We briefly had a way to edit those configs right in
  the
  admin UI, but Redhat reported this capability as a security problem, so
  we removed it.  I don't remember whether there is a way to re-enable
  this functionality.
 
  The Schema REST API is available in 4.10.  It was also present in 4.9.
  Currently you can only *add* to the schema, you cannot edit what's
  already there.
 
  Thanks,
  Shawn
 




Re: Update solr schema.xml in real time for Solr 4.10.1

2015-03-14 Thread Nitin Solanki
Ok.. Got it, Zheng...
Thanks a lot..

On Sat, Mar 14, 2015 at 1:02 PM, Zheng Lin Edwin Yeo edwinye...@gmail.com
wrote:

 Hi Nitin,

 What I experienced is when I create a new collection, there's no physical
 schema in that collection. But there is schema.xml in some of the example
 folder. You can create your own schema.xml in your own collection, but in
 order to use it, you have to change the schemaFactory class
 to ClassicIndexSchemaFactory in solrconfig.xml. As by default, the
 schemaFactory class is set to ManagedIndexSchemaFactory in Solr 5.0.

 Zheng Lin


 On 14 March 2015 at 15:22, Nitin Solanki nitinml...@gmail.com wrote:

  Hi Zheng,
 You said **there's no physical schema.xml**, but I
   have one. I am using the sample_techproducts_configs configuration, where I
   found schema.xml. I manage my schema.xml there, and then I
   upload it into zookeeper and reload the collection.
 
 
 
  On 3/14/15, Zheng Lin Edwin Yeo edwinye...@gmail.com wrote:
   Hi Erick,
  
   The real time update of the schema means we can just do an update
 using
   REST-API curl instead of manually editing the schema.xml and restart
 the
   Solr server. In Solr 5.0, if Solr is loading the schema from the
 resource
   named in 'managedSchemaResourceName', instead of schema.xml, I can just
   update it from the REST-API curl.
  
   For earlier version of Solr, the default setting is
   ClassicIndexSchemaFactory, which is read from schema.xml. So besides
   getting Solr to load the schema from the resource named in
   'managedSchemaResourceName', rather than from schema.xml, is there
 other
   settings required?
  
   Zheng Lin
  
   On 12 March 2015 at 23:26, Erick Erickson erickerick...@gmail.com
  wrote:
  
   Actually I ran across a neat IntelliJ plugin that you could install
   and directly edit ZK files. And I'm pretty sure there are stand-alone
   programs that do this, but they are all outside Solr.
  
   I'm not sure what real time update of the schema is for, would you
   (Zheng) explain further? Collections _must_ be reloaded for schema
   changes to take effect so I'm not quite sure what you're referring to.
  
   Nitin:
   The usual process is to have the master config be local, change the
   local version then upload it to ZK with the upconfig option in zkCli,
   then reload your collection.
  
   Best,
   Erick
  
   On Thu, Mar 12, 2015 at 6:04 AM, Shawn Heisey apa...@elyograg.org
   wrote:
On 3/12/2015 2:00 AM, Zheng Lin Edwin Yeo wrote:
I understand that in Solr 5.0, they provide a REST API to do
  real-time
update of the schema using Curl. However, I could not do that for
 my
eariler version of Solr 4.10.1.
   
Would like to check, is this function available for the earlier
version
   of
Solr, and is the curl syntax the same as Solr 5.0?
   
Providing a way to simply edit the config files directly is a
  potential
security issue.  We briefly had a way to edit those configs right in
the
admin UI, but Redhat reported this capability as a security problem,
  so
we removed it.  I don't remember whether there is a way to re-enable
this functionality.
   
The Schema REST API is available in 4.10.  It was also present in
 4.9.
Currently you can only *add* to the schema, you cannot edit what's
already there.
   
Thanks,
Shawn
   
  
  
 



Re: Update solr schema.xml in real time for Solr 4.10.1

2015-03-12 Thread Nitin Solanki
Hi Zheng,


** I understand that in Solr 5.0, they provide a REST API to do real-time
update of the schema using Curl **. Would you please help me with how to do this?
I need to update both schema.xml and solrconfig.xml in Solr 5.0 in
SolrCloud.
Your help is appreciated..
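
The call I have been experimenting with looks like this (a sketch only; it
assumes the collection uses the managed schema factory, the payload shape
may be off, and note the Schema REST API can only *add* fields, not edit
existing ones):

curl -X PUT 'http://localhost:8983/solr/wikingram/schema/fields/newfield' \
     -H 'Content-type: application/json' \
     -d '{"type":"text_general", "indexed":true, "stored":false}'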

*Thanks Again..*


On Thu, Mar 12, 2015 at 1:30 PM, Zheng Lin Edwin Yeo edwinye...@gmail.com
wrote:

 Hi,

 I understand that in Solr 5.0, they provide a REST API to do real-time
 update of the schema using Curl. However, I could not do that for my
 eariler version of Solr 4.10.1.

 Would like to check, is this function available for the earlier version of
 Solr, and is the curl syntax the same as Solr 5.0?

 Regards,
 Edwin



Re: Where is schema.xml and solrconfig.xml in solr 5.0.0

2015-03-12 Thread Nitin Solanki
Hi. Erick..
   Would you please help me distinguish between
Uploading a Configuration Directory and Linking a Collection to a
Configuration Set?

On Thu, Mar 12, 2015 at 2:01 AM, Nitin Solanki nitinml...@gmail.com wrote:

 Thanks a lot Erick.. It will be helpful.

 On Wed, Mar 11, 2015 at 9:27 PM, Erick Erickson erickerick...@gmail.com
 wrote:

 The configs are in Zookeeper. So you have to switch your thinking,
 it's rather confusing at first.

 When you create a collection, you specify a config set, these are
 usually in

 ./server/solr/configsets/data_driven_schema,
 ./server/solr/configsets/techproducts and the like.

 The entire conf directory under one of these is copied to Zookeeper
 (which you can see
  from the admin screen Cloud->Tree, then on the right-hand side you'll
 be able to find the config sets
 you uploaded.

 But, you cannot edit them there directly. You edit them on disk, then
 push them to Zookeeper,
 then reload the collection (or restart everything). See the reference
 guide here:
 https://cwiki.apache.org/confluence/display/solr/Command+Line+Utilities

 Best,
 Erick

 On Wed, Mar 11, 2015 at 6:01 AM, Nitin Solanki nitinml...@gmail.com
 wrote:
  Hi, alexandre..
 
  Thanks for responding...
  When I created new collection(wikingram) using solrCloud. It gets create
  into example/cloud/node*(node1, node2) like that.
  I have used *schema.xml and solrconfig.xml of
 sample_techproducts_configs*
  configuration.
 
  Now, The problem is that.
  If I change the configuration of *solrconfig.xml of *
  *sample_techproducts_configs*. Its configuration doesn't reflect on
  *wikingram* collection.
  How to reflect the changes of configuration in the collection?
 
  On Wed, Mar 11, 2015 at 5:42 PM, Alexandre Rafalovitch 
 arafa...@gmail.com
  wrote:
 
  Which example are you using? Or how are you creating your collection?
 
  If you are using your example, it creates a new directory under
  example. If you are creating a new collection with -c, it creates
  a new directory under the server/solr. The actual files are a bit
  deeper than usual to allow for a log folder next to the collection
  folder. So, for example:
  example/schemaless/solr/gettingstarted/conf/solrconfig.xml
 
  If it's a dynamic schema configuration, you don't actually have
  schema.xml, but managed-schema, as you should be mostly using REST
  calls to configure it.
 
  If you want to see the configuration files before the collection
  actually created, they are under server/solr/configsets, though they
  are not configsets in Solr sense, as they do get copied when you
  create your collections (sharing them causes issues).
 
  Regards,
 Alex.
  
  Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
  http://www.solr-start.com/
 
 
  On 11 March 2015 at 07:50, Nitin Solanki nitinml...@gmail.com wrote:
   Hello,
  I have switched from solr 4.10.2 to solr 5.0.0. In
 solr
   4-10.2, schema.xml and solrconfig.xml were in example/solr/conf/
 folder.
   Where is schema.xml and solrconfig.xml in solr 5.0.0 ? and also want
 to
   know how to configure in solrcloud ?
 





Re: Whole RAM consumed while Indexing.

2015-03-12 Thread Nitin Solanki
Hi Alexandre,


*Hard Commit* is :

 <autoCommit>
   <maxTime>${solr.autoCommit.maxTime:3000}</maxTime>
   <openSearcher>false</openSearcher>
 </autoCommit>

*Soft Commit* is :

<autoSoftCommit>
<maxTime>${solr.autoSoftCommit.maxTime:300}</maxTime>
</autoSoftCommit>

And I am committing 20,000 documents each time.
Is it a good config for committing?
Or am I doing something wrong?


On Fri, Mar 13, 2015 at 8:52 AM, Alexandre Rafalovitch arafa...@gmail.com
wrote:

 What's your commit strategy? Explicit commits? Soft commits/hard
 commits (in solrconfig.xml)?

 Regards,
Alex.
 
 Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
 http://www.solr-start.com/


 On 12 March 2015 at 23:19, Nitin Solanki nitinml...@gmail.com wrote:
  Hello,
 I have written a python script to index 20,000 documents
   at a time on Solr. I have 28 GB RAM with 8 CPUs.
   When I started indexing, 15 GB of RAM was free. While indexing,
   all RAM is consumed but **not** a single document is indexed. Why so?
   And it throws *HTTPError: HTTP Error 503: Service Unavailable* in python
  script.
  I think it is due to heavy load on Zookeeper by which all nodes went
 down.
  I am not sure about that. Any help please..
  Or anything else is happening..
  And how to overcome this issue.
  Please assist me towards right path.
  Thanks..
 
  Warm Regards,
  Nitin Solanki



Re: Where is schema.xml and solrconfig.xml in solr 5.0.0

2015-03-12 Thread Nitin Solanki
Thanks Shawn and Erick for explanation...

On Thu, Mar 12, 2015 at 9:02 PM, Shawn Heisey apa...@elyograg.org wrote:

 On 3/12/2015 9:18 AM, Erick Erickson wrote:
  By and large, I really never use linking. But it's about associating a
  config set
  you've _already_ uploaded with a collection.
 
  So uploading is pushing the configset from your local machine up to
 Zookeeper,
  and linking is using that uploaded, named configuration with an
  arbitrary collection.
 
  But usually you just make this association when creating the collection.

 The primary use case that I see for linkconfig is in testing upgrades to
 configurations.  So let's say you have a production collection that uses
 a config that you name fooV1 for foo version 1.  You can build a test
 collection that uses a config named fooV2, work out all the bugs, and
 then when you're ready to deploy it, you can use linkconfig to link your
 production collection to fooV2, reload the collection, and you're using
 the new config.  I haven't discussed here how to handle the situation
 where a reindex is required.
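
 (Concretely, that linkconfig step looks something like this with the zkcli
 script - a sketch; the zkhost, paths, and names are placeholders:

 zkcli.sh -zkhost localhost:9983 -cmd upconfig -confdir /path/to/fooV2/conf -confname fooV2
 zkcli.sh -zkhost localhost:9983 -cmd linkconfig -collection mycollection -confname fooV2
 curl 'http://localhost:8983/solr/admin/collections?action=RELOAD&name=mycollection'
 )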

 One thing you CAN do is run linkconfig for a collection that doesn't
 exist yet, and then you don't need to include collection.configName when
 you create the collection, because the link is already present in
 zookeeper.  I personally don't like doing things this way, but I'm
 pretty sure it works.

 Thanks,
 Shawn




Whole RAM consumed while Indexing.

2015-03-12 Thread Nitin Solanki
Hello,
  I have written a python script to index 20,000 documents
at a time on Solr. I have 28 GB RAM with 8 CPUs.
When I started indexing, 15 GB of RAM was free. While indexing,
all RAM is consumed but **not** a single document is indexed. Why so?
And it throws *HTTPError: HTTP Error 503: Service Unavailable* in the python
script.
I think it is due to heavy load on Zookeeper, by which all nodes went down,
but I am not sure about that. Any help please..
Or maybe something else is happening..
And how do I overcome this issue?
Please assist me towards the right path.
Thanks..

Warm Regards,
Nitin Solanki


java.nio.channels.CancelledKeyException

2015-03-12 Thread Nitin Solanki
Hi,
I am indexing documents on Solr 4.10.2. While indexing, I am
getting this error in log -

java.nio.channels.CancelledKeyException
at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:77)
at 
org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:151)
at 
org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1081)
at 
org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:404)
at 
org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:169)

What does it mean? Will it skip the documents currently being indexed? Or is it something else?

Please Help...


Re: Where is schema.xml and solrconfig.xml in solr 5.0.0

2015-03-11 Thread Nitin Solanki
Thanks a lot Erick.. It will be helpful.

On Wed, Mar 11, 2015 at 9:27 PM, Erick Erickson erickerick...@gmail.com
wrote:

 The configs are in Zookeeper. So you have to switch your thinking,
 it's rather confusing at first.

 When you create a collection, you specify a config set, these are
 usually in

 ./server/solr/configsets/data_driven_schema,
 ./server/solr/configsets/techproducts and the like.

 The entire conf directory under one of these is copied to Zookeeper
 (which you can see
  from the admin screen Cloud->Tree, then on the right-hand side you'll
 be able to find the config sets
 you uploaded.

 But, you cannot edit them there directly. You edit them on disk, then
 push them to Zookeeper,
 then reload the collection (or restart everything). See the reference
 guide here:
 https://cwiki.apache.org/confluence/display/solr/Command+Line+Utilities

 Best,
 Erick

 On Wed, Mar 11, 2015 at 6:01 AM, Nitin Solanki nitinml...@gmail.com
 wrote:
  Hi, alexandre..
 
  Thanks for responding...
  When I created new collection(wikingram) using solrCloud. It gets create
  into example/cloud/node*(node1, node2) like that.
  I have used *schema.xml and solrconfig.xml of
 sample_techproducts_configs*
  configuration.
 
  Now, The problem is that.
  If I change the configuration of *solrconfig.xml of *
  *sample_techproducts_configs*. Its configuration doesn't reflect on
  *wikingram* collection.
  How to reflect the changes of configuration in the collection?
 
  On Wed, Mar 11, 2015 at 5:42 PM, Alexandre Rafalovitch 
 arafa...@gmail.com
  wrote:
 
  Which example are you using? Or how are you creating your collection?
 
  If you are using your example, it creates a new directory under
  example. If you are creating a new collection with -c, it creates
  a new directory under the server/solr. The actual files are a bit
  deeper than usual to allow for a log folder next to the collection
  folder. So, for example:
  example/schemaless/solr/gettingstarted/conf/solrconfig.xml
 
  If it's a dynamic schema configuration, you don't actually have
  schema.xml, but managed-schema, as you should be mostly using REST
  calls to configure it.
 
  If you want to see the configuration files before the collection
  actually created, they are under server/solr/configsets, though they
  are not configsets in Solr sense, as they do get copied when you
  create your collections (sharing them causes issues).
 
  Regards,
 Alex.
  
  Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
  http://www.solr-start.com/
 
 
  On 11 March 2015 at 07:50, Nitin Solanki nitinml...@gmail.com wrote:
   Hello,
  I have switched from solr 4.10.2 to solr 5.0.0. In solr
   4-10.2, schema.xml and solrconfig.xml were in example/solr/conf/
 folder.
   Where is schema.xml and solrconfig.xml in solr 5.0.0 ? and also want
 to
   know how to configure in solrcloud ?
 



Re: Where is schema.xml and solrconfig.xml in solr 5.0.0

2015-03-11 Thread Nitin Solanki
Hi, alexandre..

Thanks for responding...
When I created a new collection (wikingram) using solrCloud, it got created
under example/cloud/node* (node1, node2) and so on.
I have used the *schema.xml and solrconfig.xml of sample_techproducts_configs*
configuration.

Now, the problem is this:
if I change the configuration of the *solrconfig.xml of*
*sample_techproducts_configs*, the change doesn't reflect on the
*wikingram* collection.
How do I reflect the configuration changes in the collection?

On Wed, Mar 11, 2015 at 5:42 PM, Alexandre Rafalovitch arafa...@gmail.com
wrote:

 Which example are you using? Or how are you creating your collection?

 If you are using your example, it creates a new directory under
 example. If you are creating a new collection with -c, it creates
 a new directory under the server/solr. The actual files are a bit
 deeper than usual to allow for a log folder next to the collection
 folder. So, for example:
 example/schemaless/solr/gettingstarted/conf/solrconfig.xml

 If it's a dynamic schema configuration, you don't actually have
 schema.xml, but managed-schema, as you should be mostly using REST
 calls to configure it.

 If you want to see the configuration files before the collection
 actually created, they are under server/solr/configsets, though they
 are not configsets in Solr sense, as they do get copied when you
 create your collections (sharing them causes issues).

 Regards,
Alex.
 
 Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
 http://www.solr-start.com/


 On 11 March 2015 at 07:50, Nitin Solanki nitinml...@gmail.com wrote:
  Hello,
 I have switched from solr 4.10.2 to solr 5.0.0. In solr
  4-10.2, schema.xml and solrconfig.xml were in example/solr/conf/ folder.
  Where is schema.xml and solrconfig.xml in solr 5.0.0 ? and also want to
  know how to configure in solrcloud ?



Where is schema.xml and solrconfig.xml in solr 5.0.0

2015-03-11 Thread Nitin Solanki
Hello,
   I have switched from solr 4.10.2 to solr 5.0.0. In solr
4.10.2, schema.xml and solrconfig.xml were in the example/solr/conf/ folder.
Where are schema.xml and solrconfig.xml in solr 5.0.0? And I also want to
know how to configure them in solrcloud.


Re: how to change configurations in solrcloud setup

2015-03-10 Thread Nitin Solanki
Hi Aman,
 You can apply configuration changes on solr cloud by using this
command -

sudo
path_of_solr/solr_folder_name/example/scripts/cloud-scripts/zkcli.sh
-zkhost localhost:9983 -cmd upconfig -confdir
path_of_solr/solr_folder_name/example/solr/collection1/conf -confname
default

and then restart all nodes of solrcloud.
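
Alternatively, instead of restarting every node, you can reload just the
collection so it picks up the uploaded config (assuming your collection is
named collection1, matching the confdir used above):

curl 'http://localhost:8983/solr/admin/collections?action=RELOAD&name=collection1'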

On Mon, Mar 9, 2015 at 11:43 AM, Aman Tandon amantandon...@gmail.com
wrote:

 Please help.

 With Regards
 Aman Tandon

 On Sat, Mar 7, 2015 at 9:58 PM, Aman Tandon amantandon...@gmail.com
 wrote:

  Hi,
 
  Please tell me what is best way to apply configuration changes in solr
  cloud and how to do that.
 
  Thanks in advance.
 
  With Regards
  Aman Tandon
 



Re: Frequency of Suggestion are varying from original Frequency in index

2015-03-09 Thread Nitin Solanki
Hi ale42,
  Yes. I am using the same field (gram_ci) to make the query and
also using the same field (gram_ci) to build suggestions on.

Here is the explanation:
I have 2 fields - gram and gram_ci,
where the gram field is set to stored=true and indexed=true, while the gram_ci
field is set to stored=false but indexed=true,
and gram is copied into gram_ci via a copyField.

Both the gram and gram_ci fields use the same fieldType -
StandardTokenizerFactory and ShingleFilterFactory for both index and query.
The only difference is that gram_ci uses a LowerCaseFilter and gram
doesn't. And I am querying on gram_ci, not on gram.



On Mon, Mar 9, 2015 at 3:24 PM, ale42 
alexandre.faye...@etu.esisar.grenoble-inp.fr wrote:

 When you make a query, does it use the same field type as the field that
 you
 are using to build suggestions?



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Frequency-of-Suggestion-are-varying-from-original-Frequency-in-index-tp4190927p4191813.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: Frequency of Suggestion are varying from original Frequency in index

2015-03-09 Thread Nitin Solanki
I am using a field with StandardTokenizerFactory and
ShingleFilterFactory. Is that what is causing it?

On 3/9/15, ale42 alexandre.faye...@etu.esisar.grenoble-inp.fr wrote:
 So, I think it depends on the field that you are working on?!



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Frequency-of-Suggestion-are-varying-from-original-Frequency-in-index-tp4190927p4191800.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: Frequency of Suggestion are varying from original Frequency in index

2015-03-08 Thread Nitin Solanki
Hi ale42,
I am not using /solr.IndexBasedSpellChecker/. I used
solr.DirectSolrSpellChecker. Is there any way to solve my issue?

On Fri, Mar 6, 2015 at 6:27 PM, ale42 
alexandre.faye...@etu.esisar.grenoble-inp.fr wrote:

 I think these frequencies are not the frequency of the term in the same
 index:

 - the original frequency represents the number of results that you have in
 the Lucene index when you query who.

 - the suggestion frequency is the number of results for this term in the
 spellcheck dictionary.

 I guess you're using /solr.IndexBasedSpellChecker/!



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Frequency-of-Suggestion-are-varying-from-original-Frequency-in-index-tp4190927p4191397.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: Frequency of Suggestion are varying from original Frequency in index

2015-03-08 Thread Nitin Solanki
Hi Wang,
 I am using SolrCloud. Does the suggester not work properly in
that?

On Fri, Mar 6, 2015 at 2:36 PM, gaohang wang gaohangw...@gmail.com wrote:

 Do you use SolrCloud? Maybe your suggester does not support distributed mode.

 2015-03-04 22:39 GMT+08:00 Nitin Solanki nitinml...@gmail.com:

  Hi..
 I have a term (who) whose original frequency is 191, but
  when I get suggestions for who it gives me 90. Why?
 
  Example :
 
  *Original Frequency* comes like:
 
  "spellcheck":{
    "suggestions":[
      "who",{
        "numFound":1,
        "startOffset":1,
        "endOffset":4,
        "origFreq":*191*},
      "correctlySpelled",false]}}
 
  While in *Suggestion*, it gives:
 
  "spellcheck":{
    "suggestions":[
      "whs",{
        "numFound":1,
        "startOffset":1,
        "endOffset":4,
        "origFreq":0,
        "suggestion":[{
          "word":"who",
          "freq":*90*}]},
      "correctlySpelled",false]}}
 
 
 
  Why is that so?
 
  I am using StandardTokenizerFactory with ShingleFilterFactory in
  schema.xml.
 



Original frequency is not matching with suggestion frequency in SOLR

2015-03-04 Thread Nitin Solanki
Hello,
  Sometimes the suggestion frequency varies from the original
frequency.
The output for *whs is* - *(73)* - which is a suggestion for *who is*, varies
from its actual original frequency *(94)*.
*Please* check this link for more explanation -

http://stackoverflow.com/questions/28857915/original-frequency-is-not-matching-with-suggestion-frequency-in-solr


Frequency of Suggestion are varying from original Frequency in index

2015-03-04 Thread Nitin Solanki
Hi..
   I have a term (who) whose original frequency is 191, but
when I get suggestions for who it gives me 90. Why?

Example :

*Original Frequency* comes like:

"spellcheck":{
  "suggestions":[
    "who",{
      "numFound":1,
      "startOffset":1,
      "endOffset":4,
      "origFreq":*191*},
    "correctlySpelled",false]}}

While in *Suggestion*, it gives:

"spellcheck":{
  "suggestions":[
    "whs",{
      "numFound":1,
      "startOffset":1,
      "endOffset":4,
      "origFreq":0,
      "suggestion":[{
        "word":"who",
        "freq":*90*}]},
    "correctlySpelled",false]}}



Why is that so?

I am using StandardTokenizerFactory with ShingleFilterFactory in
schema.xml.


Why frequency of suggestion is different from indexed frequency in Solr?

2015-03-04 Thread Nitin Solanki
Hi,
The suggestion frequency is different from the original frequency
in the index. Why so?
I have applied StandardTokenizer with ShingleFilterFactory on the field.


Get suggestion for each term in the query

2015-02-26 Thread Nitin Solanki
Hi,
  I want to get suggestions for each term/word in the query.
Conditions:
i) the word/term may be correct or incorrect;
ii) the word/term may have high or low frequency.

Whatever the condition of the term/word, I need suggestions every time.


Confusion in making true or false in spellcheck.onlymorepopular

2015-02-26 Thread Nitin Solanki
Hi,
Only return suggestions that result in more hits for the query
than the existing query

What does the existing query mean in the sentence above for
spellcheck.onlymorepopular?

What happens when I set spellcheck.onlymorepopular to true or to
false? Is there any difference?


Do Multiprocessing on Solr to search?

2015-02-25 Thread Nitin Solanki
Hello,
I want to search lakhs of queries/terms concurrently.

Is there any technique to do multiprocessing with Solr?
Is Solr capable of handling this situation?
I wrote code in Python that does multiprocessing and searches lakhs of
queries, hitting Solr simultaneously/in parallel, but it seems
that Solr isn't able to handle the queries all at once.
Any help please?


Solr takes time to start

2015-02-25 Thread Nitin Solanki
Hello,
 Why is Solr taking so much time to start all nodes/ports?


Re: Collations are not working fine.

2015-02-25 Thread Nitin Solanki
Hi Rajesh,
What configuration did you set in your schema.xml?

On Sat, Feb 14, 2015 at 2:18 AM, Rajesh Hazari rajeshhaz...@gmail.com
wrote:

 Hi Nitin,

 Can you try the config below? It seems to be working
 for us.

 <searchComponent name="spellcheck" class="solr.SpellCheckComponent">

  <str name="queryAnalyzerFieldType">text_general</str>


   <lst name="spellchecker">
     <str name="name">wordbreak</str>
     <str name="classname">solr.WordBreakSolrSpellChecker</str>
     <str name="field">textSpell</str>
     <str name="combineWords">true</str>
     <str name="breakWords">false</str>
     <int name="maxChanges">5</int>
   </lst>

   <lst name="spellchecker">
     <str name="name">default</str>
     <str name="field">textSpell</str>
     <str name="classname">solr.IndexBasedSpellChecker</str>
     <str name="spellcheckIndexDir">./spellchecker</str>
     <str name="accuracy">0.75</str>
     <float name="thresholdTokenFrequency">0.01</float>
     <str name="buildOnCommit">true</str>
     <str name="spellcheck.maxResultsForSuggest">5</str>
   </lst>


 </searchComponent>



 <str name="spellcheck">true</str>
 <str name="spellcheck.dictionary">default</str>
 <str name="spellcheck.dictionary">wordbreak</str>
 <int name="spellcheck.count">5</int>
 <str name="spellcheck.alternativeTermCount">15</str>
 <str name="spellcheck.collate">true</str>
 <str name="spellcheck.onlyMorePopular">false</str>
 <str name="spellcheck.extendedResults">true</str>
 <str name="spellcheck.maxCollations">100</str>
 <str name="spellcheck.collateParam.mm">100%</str>
 <str name="spellcheck.collateParam.q.op">AND</str>
 <str name="spellcheck.maxCollationTries">1000</str>
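
 To sanity-check a setup like this, a spellcheck request along these lines
 should return suggestions from both dictionaries (a sketch; the core name
 and handler path are placeholders):

   curl "http://localhost:8983/solr/collection1/select?q=textSpell:lfie&spellcheck=true&wt=json&indent=true"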


 *Rajesh.*

 On Fri, Feb 13, 2015 at 1:01 PM, Dyer, James james.d...@ingramcontent.com
 
 wrote:

  Nitin,
 
  Can you post the full spellcheck response when you query:
 
  q=gram_ci:"gone wthh thes wint"&wt=json&indent=true&shards.qt=/spell
 
  James Dyer
  Ingram Content Group
 
 
  -Original Message-
  From: Nitin Solanki [mailto:nitinml...@gmail.com]
  Sent: Friday, February 13, 2015 1:05 AM
  To: solr-user@lucene.apache.org
  Subject: Re: Collations are not working fine.
 
  Hi James Dyer,
I did the same as you told me: used
  WordBreakSolrSpellChecker instead of shingles. But collations are still
  not coming or working.
  For instance, I tried to get the collation gone with the wind by
  searching gone wthh thes wint on field=gram_ci but didn't succeed.
  Yet I am getting the suggestions wtth as *with*, thes as *the*, wint as *wind*.
  Also, I have documents which contain gone with the wind 167 times.
  I don't know whether I am missing something or not.
  Please check my Solr configuration below:
 
  *URL: *localhost:8983/solr/wikingram/spell?q=gram_ci:"gone wthh thes
  wint"&wt=json&indent=true&shards.qt=/spell
 
  *solrconfig.xml:*
 
  <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
    <str name="queryAnalyzerFieldType">textSpellCi</str>
    <lst name="spellchecker">
      <str name="name">default</str>
      <str name="field">gram_ci</str>
      <str name="classname">solr.DirectSolrSpellChecker</str>
      <str name="distanceMeasure">internal</str>
      <float name="accuracy">0.5</float>
      <int name="maxEdits">2</int>
      <int name="minPrefix">0</int>
      <int name="maxInspections">5</int>
      <int name="minQueryLength">2</int>
      <float name="maxQueryFrequency">0.9</float>
      <str name="comparatorClass">freq</str>
    </lst>
    <lst name="spellchecker">
      <str name="name">wordbreak</str>
      <str name="classname">solr.WordBreakSolrSpellChecker</str>
      <str name="field">gram</str>
      <str name="combineWords">true</str>
      <str name="breakWords">true</str>
      <int name="maxChanges">5</int>
    </lst>
  </searchComponent>
 
  <requestHandler name="/spell" class="solr.SearchHandler" startup="lazy">
    <lst name="defaults">
      <str name="df">gram_ci</str>
      <str name="spellcheck.dictionary">default</str>
      <str name="spellcheck">on</str>
      <str name="spellcheck.extendedResults">true</str>
      <str name="spellcheck.count">25</str>
      <str name="spellcheck.onlyMorePopular">true</str>
      <str name="spellcheck.maxResultsForSuggest">1</str>
      <str name="spellcheck.alternativeTermCount">25</str>
      <str name="spellcheck.collate">true</str>
      <str name="spellcheck.maxCollations">50</str>
      <str name="spellcheck.maxCollationTries">50</str>
      <str name="spellcheck.collateExtendedResults">true</str>
    </lst>
    <arr name="last-components">
      <str>spellcheck</str>
    </arr>
  </requestHandler>
 
  *Schema.xml: *
 
  <field name="gram_ci" type="textSpellCi" indexed="true" stored="true"
   multiValued="false"/>
 
  <fieldType name="textSpellCi" class="solr.TextField"
   positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>
 



Override freq field from custom field in Suggestions

2015-02-24 Thread Nitin Solanki
Hello,
 I have a scenario where I want to use my own custom field instead
of freq in the suggestions for each term. The custom field will be an integer
value, different from freq in the suggestion.
Is it possible in Solr to use a custom field instead of freq in suggestions?
Your help is appreciated.


Thanks and Regards,
 Nitin Solanki.


Re: Sorting on multi-valued field

2015-02-24 Thread Nitin Solanki
Hi Peri,
  You cannot sort on a multi-valued field. multiValued should be set to
false.
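
For example, sorting then works against a single-valued field like this (a
sketch; the core and field names are placeholders):

  curl "http://localhost:8983/solr/collection1/select?q=*:*&sort=price+asc&wt=json"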

On Tue, Feb 24, 2015 at 8:07 PM, Peri Subrahmanya 
peri.subrahma...@htcinc.com wrote:

 All,

 Is there a way sorting can work on a multi-valued field or does it always
 have to be “false” for it to work.

 Thanks
 -Peri

 *** DISCLAIMER *** This is a PRIVATE message. If you are not the intended
 recipient, please delete without copying and kindly advise us by e-mail of
 the mistake in delivery.
 NOTE: Regardless of content, this e-mail shall not operate to bind HTC
 Global Services to any order or other contract unless pursuant to explicit
 written agreement or government initiative expressly permitting the use of
 e-mail for such purpose.





How make Searching fast in spell checking

2015-02-24 Thread Nitin Solanki
Hello all,
 I have 49 GB of indexed data and I am doing spell checking.
I have applied ShingleFilter on both the index and query side, and I am
taking 25 suggestions for each word in the query, without using collations.
When I search a phrase (5-6 words, e.g. barack obama is president of
America) it takes 2 to 3 seconds to process, while searching a single
term (e.g. barack) takes only 0.23 seconds, which is good.
Why does phrase checking take so long? Am I doing something wrong? Any help
on this?


CollationKeyFilterFactory stops suggestions and collations

2015-02-23 Thread Nitin Solanki
Hello all,
  I am working on collations. Somewhere in the Solr docs, I found that
Unicode collation makes searching fast. But after applying
CollationKeyFilterFactory in schema.xml, it stops both the suggestions and
the collations. Please check the configuration below and help me.

*Schema.xml:*

<fieldType name="textSpell" class="solr.TextField"
 positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.CollationKeyFilterFactory" language=""
     strength="primary"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.CollationKeyFilterFactory" language=""
     strength="primary"/>
  </analyzer>
</fieldType>


Solrconfig.xml:

<requestHandler name="/spell" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="df">gram_ci</str>
    <!-- Solr will use suggestions from both the 'default' spellchecker
         and from the 'wordbreak' spellchecker and combine them.
         collations (re-written queries) can include a combination of
         corrections from both spellcheckers -->
    <str name="spellcheck.dictionary">default</str>
    <str name="spellcheck">on</str>
    <str name="spellcheck.extendedResults">true</str>
    <str name="spellcheck.count">25</str>
    <str name="spellcheck.onlyMorePopular">true</str>
    <str name="spellcheck.maxResultsForSuggest">10</str>
    <str name="spellcheck.alternativeTermCount">25</str>
    <str name="spellcheck.collate">true</str>
    <str name="spellcheck.maxCollations">100</str>
    <str name="spellcheck.maxCollationTries">1000</str>
    <str name="spellcheck.collateExtendedResults">true</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
    <!--<str>suggest</str>-->
    <!--<str>query</str>-->
  </arr>
</requestHandler>


Re: CollationKeyFilterFactory stops suggestions and collations

2015-02-23 Thread Nitin Solanki
Hi all,
I have found that I should use Unicode collation. I need
*lucene-collation-2.9.1.jar*.
I am using Solr 4.10.2. I have downloaded lucene-collation-2.9.1.jar; where
do I have to put it, or is it already built into Solr?
If it is already in Solr, then why are suggestions and collations not coming?
Any help, please?


On Mon, Feb 23, 2015 at 4:43 PM, Nitin Solanki nitinml...@gmail.com wrote:

 Hello all,
   I am working on collations. Somewhere in the Solr docs, I found that
 Unicode collation makes searching fast. But after applying
 CollationKeyFilterFactory in schema.xml, it stops both the suggestions and
 the collations. Please check the configuration below and help me.

 *Schema.xml:*

 <fieldType name="textSpell" class="solr.TextField"
  positionIncrementGap="100">
   <analyzer type="index">
     <tokenizer class="solr.StandardTokenizerFactory"/>
     <filter class="solr.CollationKeyFilterFactory" language=""
      strength="primary"/>
   </analyzer>
   <analyzer type="query">
     <tokenizer class="solr.StandardTokenizerFactory"/>
     <filter class="solr.CollationKeyFilterFactory" language=""
      strength="primary"/>
   </analyzer>
 </fieldType>


 Solrconfig.xml:

 <requestHandler name="/spell" class="solr.SearchHandler" startup="lazy">
   <lst name="defaults">
     <str name="df">gram_ci</str>
     <!-- Solr will use suggestions from both the 'default' spellchecker
          and from the 'wordbreak' spellchecker and combine them.
          collations (re-written queries) can include a combination of
          corrections from both spellcheckers -->
     <str name="spellcheck.dictionary">default</str>
     <str name="spellcheck">on</str>
     <str name="spellcheck.extendedResults">true</str>
     <str name="spellcheck.count">25</str>
     <str name="spellcheck.onlyMorePopular">true</str>
     <str name="spellcheck.maxResultsForSuggest">10</str>
     <str name="spellcheck.alternativeTermCount">25</str>
     <str name="spellcheck.collate">true</str>
     <str name="spellcheck.maxCollations">100</str>
     <str name="spellcheck.maxCollationTries">1000</str>
     <str name="spellcheck.collateExtendedResults">true</str>
   </lst>
   <arr name="last-components">
     <str>spellcheck</str>
     <!--<str>suggest</str>-->
     <!--<str>query</str>-->
   </arr>
 </requestHandler>



Used CollationKeyFilterFactory, Seems not to be working

2015-02-23 Thread Nitin Solanki
Hi,
  I have integrated CollationKeyFilterFactory in schema.xml and re-indexed
the data:

*<filter class="solr.CollationKeyFilterFactory" language=""
strength="primary"/>*

I need to use this because I want to build collations fast.
Referred link: http://wiki.apache.org/solr/UnicodeCollation

But it stops both suggestions and collations. *Why?*

I have also tested *CollationKeyFilterFactory* in the Solr admin
analysis page. There, CKF shows some Chinese-looking output.

*Please, any help?*


Is Solr best for did you mean functionality just like Google?

2015-02-23 Thread Nitin Solanki
Hello,
  I have a tough requirement. I want to do spell/query
correction functionality. I have 49 GB of indexed data where I have applied
the spellchecker. I want to do the same as Google - *did you mean*.
*Example* - If a user types a question/query which might be misspelled or
mistyped, I need to give them a suggestion like Did you mean.
Is Solr the best fit for it?


Warm Regards,
Nitin Solanki


Re: Collations are not working fine.

2015-02-23 Thread Nitin Solanki
Hi Charles,
 How did you patch the suggester to get frequency information in
the spellcheck response?
That sounds very useful; I also want to do that.


On Mon, Feb 16, 2015 at 7:59 PM, Reitzel, Charles 
charles.reit...@tiaa-cref.org wrote:

 I have been working with collations the last couple days and I kept adding
 the collation-related parameters until it started working for me.   It
 seems I needed <str name="spellcheck.collateMaxCollectDocs">50</str>.

 But, I am using the Suggester with the WFSTLookupFactory.

 Also, I needed to patch the suggester to get frequency information in the
 spellcheck response.

 -Original Message-
 From: Rajesh Hazari [mailto:rajeshhaz...@gmail.com]
 Sent: Friday, February 13, 2015 3:48 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Collations are not working fine.

 Hi Nitin,

 Can you try the config below? It seems to be working
 for us.

 <searchComponent name="spellcheck" class="solr.SpellCheckComponent">

  <str name="queryAnalyzerFieldType">text_general</str>


   <lst name="spellchecker">
     <str name="name">wordbreak</str>
     <str name="classname">solr.WordBreakSolrSpellChecker</str>
     <str name="field">textSpell</str>
     <str name="combineWords">true</str>
     <str name="breakWords">false</str>
     <int name="maxChanges">5</int>
   </lst>

   <lst name="spellchecker">
     <str name="name">default</str>
     <str name="field">textSpell</str>
     <str name="classname">solr.IndexBasedSpellChecker</str>
     <str name="spellcheckIndexDir">./spellchecker</str>
     <str name="accuracy">0.75</str>
     <float name="thresholdTokenFrequency">0.01</float>
     <str name="buildOnCommit">true</str>
     <str name="spellcheck.maxResultsForSuggest">5</str>
   </lst>


 </searchComponent>



 <str name="spellcheck">true</str>
 <str name="spellcheck.dictionary">default</str>
 <str name="spellcheck.dictionary">wordbreak</str>
 <int name="spellcheck.count">5</int>
 <str name="spellcheck.alternativeTermCount">15</str>
 <str name="spellcheck.collate">true</str>
 <str name="spellcheck.onlyMorePopular">false</str>
 <str name="spellcheck.extendedResults">true</str>
 <str name="spellcheck.maxCollations">100</str>
 <str name="spellcheck.collateParam.mm">100%</str>
 <str name="spellcheck.collateParam.q.op">AND</str>
 <str name="spellcheck.maxCollationTries">1000</str>


 *Rajesh.*

 On Fri, Feb 13, 2015 at 1:01 PM, Dyer, James james.d...@ingramcontent.com
 
 wrote:

  Nitin,
 
  Can you post the full spellcheck response when you query:
 
  q=gram_ci:"gone wthh thes wint"&wt=json&indent=true&shards.qt=/spell
 
  James Dyer
  Ingram Content Group
 
 
  -Original Message-
  From: Nitin Solanki [mailto:nitinml...@gmail.com]
  Sent: Friday, February 13, 2015 1:05 AM
  To: solr-user@lucene.apache.org
  Subject: Re: Collations are not working fine.
 
  Hi James Dyer,
I did the same as you told me: used
  WordBreakSolrSpellChecker instead of shingles. But collations
  are still not coming or working.
  For instance, I tried to get the collation gone with the wind by
  searching gone wthh thes wint on field=gram_ci but didn't succeed.
  Yet I am getting the suggestions wtth as *with*, thes as *the*,
  wint as *wind*.
  Also, I have documents which contain gone with the wind 167
  times. I don't know whether I am missing something or not.
  Please check my Solr configuration below:
 
  *URL: *localhost:8983/solr/wikingram/spell?q=gram_ci:"gone wthh thes
  wint"&wt=json&indent=true&shards.qt=/spell
 
  *solrconfig.xml:*
 
  <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
    <str name="queryAnalyzerFieldType">textSpellCi</str>
    <lst name="spellchecker">
      <str name="name">default</str>
      <str name="field">gram_ci</str>
      <str name="classname">solr.DirectSolrSpellChecker</str>
      <str name="distanceMeasure">internal</str>
      <float name="accuracy">0.5</float>
      <int name="maxEdits">2</int>
      <int name="minPrefix">0</int>
      <int name="maxInspections">5</int>
      <int name="minQueryLength">2</int>
      <float name="maxQueryFrequency">0.9</float>
      <str name="comparatorClass">freq</str>
    </lst>
    <lst name="spellchecker">
      <str name="name">wordbreak</str>
      <str name="classname">solr.WordBreakSolrSpellChecker</str>
      <str name="field">gram</str>
      <str name="combineWords">true</str>
      <str name="breakWords">true</str>
      <int name="maxChanges">5</int>
    </lst>
  </searchComponent>
 
  <requestHandler name="/spell" class="solr.SearchHandler" startup="lazy">
    <lst name="defaults">
      <str name="df">gram_ci</str>
      <str name="spellcheck.dictionary">default</str>
      <str name="spellcheck">on</str>
      <str name="spellcheck.extendedResults">true</str>
      <str name="spellcheck.count">25</str>
      <str name="spellcheck.onlyMorePopular">true</str>
      <str name="spellcheck.maxResultsForSuggest">1</str>
      <str name="spellcheck.alternativeTermCount">25</str>
      <str name="spellcheck.collate">true</str>
      <str name="spellcheck.maxCollations">50</str>
      <str name="spellcheck.maxCollationTries">50</str>
      <str name="spellcheck.collateExtendedResults">true</str>
    </lst>
    <arr name="last-components">
      <str>spellcheck</str>
    </arr>

Re: Used CollationKeyFilterFactory, Seems not to be working

2015-02-23 Thread Nitin Solanki
Hi Ahmet,
 language="" means that it is used for any language -
simply define the language as the empty string for most languages.

*Intention:* I am working on spell/query correction. Just like Google, I
want to do the same did you mean.

Using the spellchecker, I get both suggestions and collations. But the
collations are not coming as I expected. The reason is
spellcheck.maxCollationTries: if I set
spellcheck.maxCollationTries=10 then it tries about 10 candidates, and
sometimes the expected collation doesn't appear within those 10 collations.
So I increased the value to 16000 and the results come, but it takes around
15 sec on 49 GB of indexed data, which is the worst case. Then, somewhere in
the Solr docs, I found *Unicode collation*, and it says it builds collations fast.
Is it fast? Or am I doing something wrong with collations?

On Mon, Feb 23, 2015 at 9:12 PM, Ahmet Arslan iori...@yahoo.com.invalid
wrote:

 Hi Nitin,


 How can you pass an empty value to the language attribute?
 Is this intentional?

 What is your intention in using that filter with the suggestion functionality?

 Ahmet

 On Monday, February 23, 2015 5:03 PM, Nitin Solanki nitinml...@gmail.com
 wrote:



 Hi,
   I have integrated CollationKeyFilterFactory in schema.xml and re-indexed
 the data:

 *<filter class="solr.CollationKeyFilterFactory" language=""
 strength="primary"/>*

 I need to use this because I want to build collations fast.
 Referred link: http://wiki.apache.org/solr/UnicodeCollation

 But it stops both suggestions and collations. *Why?*

 I have also tested *CollationKeyFilterFactory* in the Solr admin
 analysis page. There, CKF shows some Chinese-looking output.

 *Please, any help?*



Error instantiating class: 'org.apache.lucene.collation.CollationKeyFilterFactory'

2015-02-23 Thread Nitin Solanki
Hi,
   I am using the Collation Key Filter, added to schema.xml as follows.

*Schema.xml*
<field name="gram" type="textSpell" indexed="true" stored="true"
 required="true" multiValued="false"/>

<fieldType name="textSpell" class="solr.TextField"
 positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.CollationKeyFilterFactory" language=""
     strength="primary"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.CollationKeyFilterFactory" language=""
     strength="primary"/>
  </analyzer>
</fieldType>


*It throws an error...*

Problem accessing /solr/. Reason:

{msg=SolrCore 'collection1' is not available due to init failure:
Could not load conf for core collection1: Plugin init failure for
[schema.xml] fieldType textSpell: Plugin init failure for
[schema.xml] analyzer/filter: Error instantiating class:
'org.apache.lucene.collation.CollationKeyFilterFactory'. Schema file
is /configs/myconf/schema.xml,trace=org.apache.solr.common.SolrException:
SolrCore 'collection1' is not available due to init failure: Could not
load conf for core collection1: Plugin init failure for [schema.xml]
fieldType textSpell: Plugin init failure for [schema.xml]
analyzer/filter: Error instantiating class:
'org.apache.lucene.collation.CollationKeyFilterFactory'. Schema file
is /configs/myconf/schema.xml
at org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:745)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:347)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.eclipse.jetty.server.Server.handle(Server.java:368)
at 
org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
at 
org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
at 
org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942)
at 
org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
at 
org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
at 
org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
at java.lang.Thread.run(Thread.java:745)


Use multiple collections having different configuration

2015-02-20 Thread Nitin Solanki
Hello,
I have a scenario where I want to create/use 2 collections in the
same Solr, named collection1 and collection2. I want to use distributed
servers. Each collection has multiple shards. Each collection has a
different configuration (solrconfig.xml and schema.xml). How can I do this?
Also, if I want to re-configure any collection later, how do I do that?

As I know, if we use a single collection with multiple shards then we
need to use this upconfig command -

* example/scripts/cloud-scripts/zkcli.sh -zkhost localhost:9983 -cmd
upconfig -confdir example/solr/collection1/conf -confname default *
and restart all the nodes.
For 2 collections in the same Solr, how can I re-configure?


Advantage of using Java programming with Solr over Solr API

2015-02-20 Thread Nitin Solanki
Hi,
  What are the advantages of Java programming with Solr over the Solr
API?


Re: Advantage of using Java programming with Solr over Solr API

2015-02-20 Thread Nitin Solanki
I mean embedded Solr.

On Fri, Feb 20, 2015 at 7:05 PM, Alexandre Rafalovitch arafa...@gmail.com
wrote:

 This question makes no sense. Do you mean embedded Solr vs Standalone?

 Regards,
 Alex
 On 20 Feb 2015 3:30 am, Nitin Solanki nitinml...@gmail.com wrote:

  Hi,
What are the advantages of Java programming with Solr over the Solr
  API?
 



Re: Collations are not working fine.

2015-02-20 Thread Nitin Solanki
How can I get only the best collations, those whose hit counts are highest,
and sort them by hits?

On Wed, Feb 18, 2015 at 3:53 AM, Reitzel, Charles 
charles.reit...@tiaa-cref.org wrote:

 Hi Nitin,

 I was trying many different options for a couple different queries.   In
 fact, I have collations working ok now with the Suggester and WFSTLookup.
  The problem may have been due to a different dictionary and/or lookup
 implementation and the specific options I was sending.

 In general, we're using spellcheck for search suggestions.   The Suggester
 component (vs. Suggester spellcheck implementation), doesn't handle all of
 our cases.  But we can get things working using the spellcheck interface.
 What gives us particular troubles are the cases where a term may be valid
 by itself, but also be the start of longer words.

 The specific terms are acronyms specific to our business.   But I'll
 attempt to show generic examples.

 E.g. a partial term like fo can expand to fox, fog, etc. and a full term
 like brown can also expand to something like brownstone.   And, yes, the
 collation brownstone fox is nonsense.  But assume, for the sake of
 argument, it appears in our documents somewhere.

 For multiple term query with a spelling error (or partially typed term):
 brown fo

 We get collations in order of hits, descending like ...
 brown fox,
 brown fog,
 brownstone fox.

 So far, so good.

 For a single term query, brown, we get a single suggestion, brownstone and
 no collations.

 So, we don't know to keep the term brown!

 At this point, we need spellcheck.extendedResults=true and look at the
 origFreq value in the suggested corrections.  Unfortunately, the Suggester
 (spellcheck dictionary) does not populate the original frequency
 information.  And, without this information, the SpellCheckComponent cannot
 format the extended results.

 However, with a simple change to Suggester.java, it was easy to get the
 needed frequency information use it to make a sound decision to keep or
 drop the input term.   But I'd be much obliged if there is a better way to
 go about it.

 Configs below.

 Thanks,
 Charlie

 <!-- SpellCheck component -->
   <searchComponent class="solr.SpellCheckComponent" name="suggestSC">
     <lst name="spellchecker">
       <str name="name">suggestDictionary</str>
       <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
       <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.WFSTLookupFactory</str>
       <str name="field">text_all</str>
       <float name="threshold">0.0001</float>
       <str name="exactMatchFirst">true</str>
       <str name="buildOnCommit">true</str>
     </lst>
   </searchComponent>

 <!-- Request Handler -->
 <requestHandler name="/tcSuggest" class="solr.SearchHandler">
   <lst name="defaults">
     <str name="title">Search Suggestions (spellcheck)</str>
     <str name="echoParams">explicit</str>
     <str name="wt">json</str>
     <str name="rows">0</str>
     <str name="defType">edismax</str>
     <str name="df">text_all</str>
     <str name="fl">id,name,ticker,entityType,transactionType,accountType</str>
     <str name="spellcheck">true</str>
     <str name="spellcheck.count">5</str>
     <str name="spellcheck.dictionary">suggestDictionary</str>
     <str name="spellcheck.alternativeTermCount">5</str>
     <str name="spellcheck.collate">true</str>
     <str name="spellcheck.extendedResults">true</str>
     <str name="spellcheck.maxCollationTries">10</str>
     <str name="spellcheck.maxCollations">5</str>
   </lst>
   <arr name="last-components">
     <str>suggestSC</str>
   </arr>
 </requestHandler>
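
 With a handler registered like that, a request such as the following
 exercises it (a sketch; the host and core name are placeholders):

   curl "http://localhost:8983/solr/collection1/tcSuggest?q=brown+fo"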

 -Original Message-
 From: Nitin Solanki [mailto:nitinml...@gmail.com]
 Sent: Tuesday, February 17, 2015 3:17 AM
 To: solr-user@lucene.apache.org
 Subject: Re: Collations are not working fine.

 Hi Charles,
  Would you please send the configuration which you tried?
 It will help to solve my problem. Have you sorted the collations on hits or
 frequencies of suggestions? If you did, then please assist me.

 On Mon, Feb 16, 2015 at 7:59 PM, Reitzel, Charles 
 charles.reit...@tiaa-cref.org wrote:

  I have been working with collations the last couple days and I kept
 adding
  the collation-related parameters until it started working for me.   It
  seems I needed <str name="spellcheck.collateMaxCollectDocs">50</str>.
 
  But, I am using the Suggester with the WFSTLookupFactory.
 
  Also, I needed to patch the suggester to get frequency information in
  the spellcheck response.
 
  -Original Message-
  From: Rajesh Hazari [mailto:rajeshhaz...@gmail.com]
  Sent: Friday, February 13, 2015 3:48 PM
  To: solr-user@lucene.apache.org
  Subject: Re: Collations are not working fine.
 
  Hi Nitin,
 
  Can u try with the below config, we have these config seems to be
  working for us.
 
  <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
 
   <str name="queryAnalyzerFieldType">text_general</str>
 
 
    <lst name="spellchecker">
      <str name="name">wordbreak</str>
      <str name="classname">solr.WordBreakSolrSpellChecker</str>
      <str name="field">textSpell</str>
      <str name="combineWords">true</str>
      <str name="breakWords">false</str>

Re: Use multiple collections having different configuration

2015-02-20 Thread Nitin Solanki
Thanks Shawn..

On Fri, Feb 20, 2015 at 7:53 PM, Shawn Heisey apa...@elyograg.org wrote:

 On 2/20/2015 4:06 AM, Nitin Solanki wrote:
  I have a scenario where I want to create/use 2 collections in the
  same Solr, named collection1 and collection2. I want to use distributed
  servers. Each collection has multiple shards. Each collection has a
  different configuration (solrconfig.xml and schema.xml). How can I do this?
  Also, if I want to re-configure any collection later, how do I do that?
 
  As I know, if we use a single collection with multiple shards then
 we
  need to use this upconfig command -
 
  * example/scripts/cloud-scripts/zkcli.sh -zkhost localhost:9983 -cmd
  upconfig -confdir example/solr/collection1/conf -confname default *
  and restart all the nodes.
  For 2 collections in the same Solr, how can I re-configure?

 First, upload your two different configurations with zkcli upconfig
 using two different names.

 Create your collections with the Collections API, and tell each one to
 use a different collection.configName.  If the collection already
 exists, use the zkcli linkconfig command, and reload the collection.

 If you need to change a config, edit the config on disk and re-do the
 zkcli upconfig.  Then reload the collection with the Collections API.
 Alternately you could upload a whole new config and then link it to the
 existing collection.
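
 Roughly, those steps map to commands like these (a sketch; paths, config
 names, and collection names are placeholders):

   # upload the two configs under different names
   example/scripts/cloud-scripts/zkcli.sh -zkhost localhost:9983 \
     -cmd upconfig -confdir /path/to/conf1 -confname conf1
   example/scripts/cloud-scripts/zkcli.sh -zkhost localhost:9983 \
     -cmd upconfig -confdir /path/to/conf2 -confname conf2
   # create each collection against its own config
   curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=collection1&numShards=2&collection.configName=conf1"
   curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=collection2&numShards=2&collection.configName=conf2"
   # after editing a config on disk: upconfig again, then reload
   curl "http://localhost:8983/solr/admin/collections?action=RELOAD&name=collection1"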

 The Collections API is not yet exposed in the admin interface, you will
 need to do those calls yourself.  If you're doing this with SolrJ, there
 are some objects inside CollectionAdminRequest that let you do all the
 API actions.

 Thanks,
 Shawn




Auto-correct the phrase/query

2015-02-19 Thread Nitin Solanki
Hello,
  I want to do same like google phrase/spell correction. If anyone
type a query the dark night then I need a suggestion like the dark
knight in Solr. Is there anyway to do this?


Re: spellcheck.count v/s spellcheck.alternativeTermCount

2015-02-19 Thread Nitin Solanki
I have 48 GB of indexed data.
I have set spellcheck.count=1 and spellcheck.alternativeTermCount=10, but I
am getting only 1 suggestion in the suggestion block, while suggestions for
collations are coming.

*PFA* for details.

On Thu, Feb 19, 2015 at 1:50 AM, Dyer, James james.d...@ingramcontent.com
wrote:

 It will try to give you suggestions up to the number you specify, but if
 fewer are available it will not give you any more.

 James Dyer
 Ingram Content Group

 -Original Message-
 From: Nitin Solanki [mailto:nitinml...@gmail.com]
 Sent: Tuesday, February 17, 2015 11:40 PM
 To: solr-user@lucene.apache.org
 Subject: Re: spellcheck.count v/s spellcheck.alternativeTermCount

Thanks James,
   I tried the same thing:
spellcheck.count=10&spellcheck.alternativeTermCount=5. And I got 5
suggestions for both life and hope, not like this: * The spellchecker
will try to return you up to 10 suggestions for hope, but only up to 5
suggestions for life. *


 On Wed, Feb 18, 2015 at 1:10 AM, Dyer, James james.d...@ingramcontent.com
 
 wrote:

  Here is an example to illustrate what I mean...
 
  - query q=text:(life AND
  hope)&spellcheck.count=10&spellcheck.alternativeTermCount=5
  - suppose at least one document in your dictionary field has life in it
  - also suppose zero documents in your dictionary field have hope in
 them
  - The spellchecker will try to return you up to 10 suggestions for
 hope,
  but only up to 5 suggestions for life
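 
  Expressed as a request, that example looks roughly like this (a sketch;
  the core name and handler path are placeholders):
 
    curl "http://localhost:8983/solr/collection1/spell?q=text:(life%20AND%20hope)&spellcheck=true&spellcheck.count=10&spellcheck.alternativeTermCount=5&wt=json"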
 
  James Dyer
  Ingram Content Group
 
 
  -Original Message-
  From: Nitin Solanki [mailto:nitinml...@gmail.com]
  Sent: Tuesday, February 17, 2015 11:35 AM
  To: solr-user@lucene.apache.org
  Subject: Re: spellcheck.count v/s spellcheck.alternativeTermCount
 
Hi James,
If count doesn't use the
index/dictionary, then where do the suggestions come from?
 
  On Tue, Feb 17, 2015 at 10:29 PM, Dyer, James 
  james.d...@ingramcontent.com
  wrote:
 
   See http://wiki.apache.org/solr/SpellCheckComponent#spellcheck.count
 and
   the following section, for details.
  
   Briefly, count is the # of suggestions it will return for terms that
  are
   *not* in your index/dictionary.  alternativeTermCount are the # of
   alternatives you want returned for terms that *are* in your dictionary.
   You can set them to the same value, unless you want fewer suggestions
  when
   the term is in the dictionary.
  
   James Dyer
   Ingram Content Group
  
   -Original Message-
   From: Nitin Solanki [mailto:nitinml...@gmail.com]
   Sent: Tuesday, February 17, 2015 5:27 AM
   To: solr-user@lucene.apache.org
   Subject: spellcheck.count v/s spellcheck.alternativeTermCount
  
   Hello Everyone,
 I am confused about the difference between spellcheck.count and
   spellcheck.alternativeTermCount in Solr. Any help with details?
  
 



Re: How to place whole indexed data on cache

2015-02-19 Thread Nitin Solanki
Thanks Dominique. Got your view..

On Wed, Feb 18, 2015 at 11:55 PM, Dominique Bejean 
dominique.bej...@eolya.fr wrote:

 Hi,

 As Shawn said, install enough memory so that all memory that is free
 (outside the heap) can be used as disk cache.
 Use at most 40% of the available memory for the heap (the Xmx JVM
 parameter), but never more than 32 GB.

 And avoid your server to swap.
 For most Linux systems, this is configured using the /etc/sysctl.conf
 value:
 vm.swappiness = 1
 This prevents swapping under normal circumstances, but still allows the OS
 to swap under emergency memory situations.
 A swappiness of 1 is better than 0, since on some kernel versions a
 swappiness of 0 can invoke the OOM-killer

 http://askubuntu.com/questions/103915/how-do-i-configure-swappiness

 http://unix.stackexchange.com/questions/88693/why-is-swappiness-set-to-60-by-default
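
 For reference, a sketch of checking and setting it on a sysctl-based Linux:

   # check the current value
   cat /proc/sys/vm/swappiness
   # set it for the running system
   sudo sysctl vm.swappiness=1
   # persist it across reboots
   echo "vm.swappiness = 1" | sudo tee -a /etc/sysctl.conf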

 Dominique
 http://www.eolya.fr/


 2015-02-18 14:39 GMT+01:00 Shawn Heisey apa...@elyograg.org:

  On 2/18/2015 4:20 AM, Nitin Solanki wrote:
How can I place whole indexed data on cache by which if I will
   search any query then I will get response, suggestions, collations
  rapidly.
   And also how to view that which documents are on cache and how to
 verify
  it?
 
  Simply install enough extra memory in your machine for the entire index
  to fit in RAM that is not being used by programs ... and then do NOT
  allocate that extra memory to any program.
 
  The operating system will automatically do the caching for you as part
  of normal operation, no config required.
 
  https://wiki.apache.org/solr/SolrPerformanceProblems#RAM
 
  Relevant articles referenced by that wiki page:
 
  http://en.wikipedia.org/wiki/Page_cache
  http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
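 
   On Linux you can watch the page cache doing this work (the "cached"
   figure is file data, including index files, that the OS holds in RAM):
 
     free -m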
 
  Thanks,
  Shawn
 
 



Re: Divide 4 Nodes into 100 nodes in Solr Cloud

2015-02-19 Thread Nitin Solanki
Hi Yago  Shawn,
   Sorry, I think, you both are taking about
shard splitting but I want node splitting. I have 4 nodes. Each node has 2
shards, So, Now, I want 100 Nodes from that 4 nodes and each having 2
shards. Any Ideas?


On Wed, Feb 18, 2015 at 9:25 PM, Shawn Heisey apa...@elyograg.org wrote:

 On 2/18/2015 8:17 AM, Nitin Solanki wrote:
  I have created 4 nodes with 8 shards. Now, I want to divide
 those
  4 nodes into 100 nodes without any failure or re-indexing the data. Any
  help please?

 I think your only real option within a strict interpretation of your
 requirements is shard splitting.  You will probably have to do it
 several times, and the resulting core names could get very ugly.


 https://cwiki.apache.org/confluence/display/solr/Shards+and+Indexing+Data+in+SolrCloud#ShardsandIndexingDatainSolrCloud-ShardSplitting

 Reindexing is a LOT cleaner and is likely to work better.  If you build
 a new collection sharded the way you want across all the new nodes, you
 can delete the old collection and set up an alias pointing the old name
 at the new collection, no need to change any applications, as long as
 they use the collection name rather than the actual core names.  The
 delete and alias might take long enough that there would be a few
 seconds of downtime, but that's probably all you'd see.  Both indexing
 and queries would work with the alias.


 https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-DeleteaCollection


 https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-CreateormodifyanAliasforaCollection
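 
  In API terms the swap is roughly (a sketch; collection names are
  placeholders):
 
    curl "http://localhost:8983/solr/admin/collections?action=DELETE&name=oldcollection"
    curl "http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=oldcollection&collections=newcollection"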

 Thanks,
 Shawn




Re: Divide 4 Nodes into 100 nodes in Solr Cloud

2015-02-19 Thread Nitin Solanki
Okay, thanks Shawn..

On Thu, Feb 19, 2015 at 7:59 PM, Shawn Heisey apa...@elyograg.org wrote:

 On 2/19/2015 4:18 AM, Nitin Solanki wrote:
  Sorry, I think you both are talking about
   shard splitting, but I want node splitting. I have 4 nodes. Each node has
  2
   shards. So now I want 100 nodes from those 4 nodes, each having 2
   shards. Any ideas?

 Node splitting does not exist as a discrete command, but shard splitting
 is the first step in node splitting.  The full procedure would be:

 *) Split one or more shards.  Wait for that to complete.
 *) Do the ADDREPLICA action for some of the new shards to other hosts.
 *) Wait for the replication to the new core(s) to complete
 *) Do the DELETEREPLICA action for those shards on the original hosts.
 *) Delete the originally-split shard(s) at your leisure.
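 
  In Collections API terms, that is roughly (a sketch; the collection,
  shard, node, and replica names are placeholders):
 
    curl "http://localhost:8983/solr/admin/collections?action=SPLITSHARD&collection=mycoll&shard=shard1"
    curl "http://localhost:8983/solr/admin/collections?action=ADDREPLICA&collection=mycoll&shard=shard1_0&node=192.168.1.5:8983_solr"
    curl "http://localhost:8983/solr/admin/collections?action=DELETEREPLICA&collection=mycoll&shard=shard1_0&replica=core_node3"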

 The overall procedure will be labor intensive and might be prone to
 error, plus as already mentioned, the core names might become very
 convoluted.  It is MUCH cleaner to reindex into a new collection.

 Thanks,
 Shawn




Re: spellcheck.count v/s spellcheck.alternativeTermCount

2015-02-18 Thread Nitin Solanki
Hi James,
 How do I see the suggestions from
spellcheck.alternativeTermCount?

On Wed, Feb 18, 2015 at 11:09 AM, Nitin Solanki nitinml...@gmail.com
wrote:

 Thanks James,
   I tried the same thing:
 spellcheck.count=10&spellcheck.alternativeTermCount=5. And I got 5
 suggestions for both life and hope, not like this: * The spellchecker
 will try to return you up to 10 suggestions for hope, but only up to 5
 suggestions for life. *


 On Wed, Feb 18, 2015 at 1:10 AM, Dyer, James james.d...@ingramcontent.com
  wrote:

 Here is an example to illustrate what I mean...

 - query q=text:(life AND
 hope)&spellcheck.count=10&spellcheck.alternativeTermCount=5
 - suppose at least one document in your dictionary field has life in it
 - also suppose zero documents in your dictionary field have hope in them
 - The spellchecker will try to return you up to 10 suggestions for
 hope, but only up to 5 suggestions for life

 James Dyer
 Ingram Content Group


 -Original Message-
 From: Nitin Solanki [mailto:nitinml...@gmail.com]
 Sent: Tuesday, February 17, 2015 11:35 AM
 To: solr-user@lucene.apache.org
 Subject: Re: spellcheck.count v/s spellcheck.alternativeTermCount

 Hi James,
 How can you say that count doesn't use
 index/dictionary then from where suggestions come.

 On Tue, Feb 17, 2015 at 10:29 PM, Dyer, James 
 james.d...@ingramcontent.com
 wrote:

  See http://wiki.apache.org/solr/SpellCheckComponent#spellcheck.count
 and
  the following section, for details.
 
  Briefly, count is the # of suggestions it will return for terms that
 are
  *not* in your index/dictionary.  alternativeTermCount are the # of
  alternatives you want returned for terms that *are* in your dictionary.
  You can set them to the same value, unless you want fewer suggestions
 when
  the term is in the dictionary.
 
  James Dyer
  Ingram Content Group
 
  -Original Message-
  From: Nitin Solanki [mailto:nitinml...@gmail.com]
  Sent: Tuesday, February 17, 2015 5:27 AM
  To: solr-user@lucene.apache.org
  Subject: spellcheck.count v/s spellcheck.alternativeTermCount
 
  Hello Everyone,
 I am confused about the difference between spellcheck.count and
   spellcheck.alternativeTermCount in Solr. Any help with details?
 





Why collations are coming even I set the value of spellcheck.count to zero(0)

2015-02-18 Thread Nitin Solanki
Hi Everyone,
I have set spellcheck.count = 0 and
spellcheck.alternativeTermCount = 0. Even so, collations are coming when
I search any query which is misspelled. Why so?
I have also set spellcheck.maxCollations = 100 and
spellcheck.maxCollationTries = 100. As I understand it, collations are built
from suggestions. So do I have a misunderstanding about collations, or is
there some other configuration issue? Any help please?


How to place whole indexed data on cache

2015-02-18 Thread Nitin Solanki
Hi,
 How can I place the whole indexed data in cache so that when I
search any query I get the response, suggestions, and collations rapidly?
And also, how can I view which documents are in cache, and how do I verify it?


Re: Divide 4 Nodes into 100 nodes in Solr Cloud

2015-02-18 Thread Nitin Solanki
Okay. Will it destroy/harm my indexed data?

On Wed, Feb 18, 2015 at 9:01 PM, Yago Riveiro yago.rive...@gmail.com
wrote:

  You can try the SPLITSHARD command


 —
 /Yago Riveiro

 On Wed, Feb 18, 2015 at 3:19 PM, Nitin Solanki nitinml...@gmail.com
 wrote:

  Hi,
  I have created 4 nodes having 8 shards. Now, I want to divide
 those
  4 Nodes into 100 Nodes without any failure/ or re-indexing the data. Any
  help please?



Get nearby suggestions for any phrase searching

2015-02-18 Thread Nitin Solanki
Hello,
I want to retrieve only the top five suggestions for any
phrase/query search. How do I do that?
Assume I search like ?q=the bark night; then I need a suggestion/
collation like the dark knight.
How do I get nearby suggestions/terms for the phrase?


Divide 4 Nodes into 100 nodes in Solr Cloud

2015-02-18 Thread Nitin Solanki
Hi,
I have created 4 nodes with 8 shards. Now I want to divide those
4 nodes into 100 nodes without any failure or re-indexing the data. Any
help please?

