Re: Configuration recommendation for SolrCloud

2019-07-01 Thread Jörn Franke
As someone else wrote, there are a lot of uncertainties and I recommend testing yourself to find the optimal configuration. Some food for thought: How many clients do you have and what is their concurrency? What operations will they do? Do they access Solr directly? You can use JMeter to simulate
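A quick concurrency smoke test can also be scripted directly with SolrJ; a minimal sketch, assuming a hypothetical collection URL, thread count, and query (none of these come from the thread):

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    public class QueryLoadTest {
        public static void main(String[] args) throws Exception {
            int clients = 20;                       // simulated concurrent clients (placeholder)
            ExecutorService pool = Executors.newFixedThreadPool(clients);
            for (int i = 0; i < clients; i++) {
                pool.submit(() -> {
                    // each "client" gets its own connection, like a real caller would
                    try (HttpSolrClient solr = new HttpSolrClient.Builder(
                            "http://localhost:8983/solr/mycollection").build()) {
                        for (int q = 0; q < 1000; q++) {
                            solr.query(new SolrQuery("*:*").setRows(10));
                        }
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                    return null;
                });
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.HOURS);
        }
    }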

Re: Configuration recommendation for SolrCloud

2019-07-01 Thread Rahul Goswami
Hi Toke, Thank you for following up. Reading back, I surely could have explained better. Thanks for asking again. >> What is a cluster? Is it a fully separate SolrCloud? Yes, by cluster I mean a fully separate SolrCloud. >> If so, does that mean you can divide your collection into (at least) 4

Re: Configuration recommendation for SolrCloud

2019-06-29 Thread Toke Eskildsen
Rahul Goswami wrote: > We are running Solr 7.2.1 and planning for a deployment which will grow to > 4 billion documents over time. We have 16 nodes at our disposal. I am thinking > between 3 configurations: > > 1 cluster - 16 nodes > vs > 2 clusters - 8 nodes each > vs > 4 clusters - 4 nodes each You
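For reference, a single 16-node cluster would be laid out with one Collections API call roughly like the example below; the collection name, shard count, and replication factor are illustrative, not the poster's actual plan.

    # 16 shards, 2 replicas each, spread over 16 nodes (2 cores per node)
    curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=bigcollection&numShards=16&replicationFactor=2&maxShardsPerNode=2"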

Re: Configuration of SOLR Cluster

2018-02-28 Thread Shawn Heisey
On 2/28/2018 6:54 AM, James Keeney wrote: I did notice one thing in the logs: 2018-02-28 13:21:58,932 [myid:1] - INFO [/172.31.86.130:3888:QuorumCnxManager$Listener@743] - Received connection request /172.31.73.122:34804 When the restarted node attempts to re

Re: Configuration of SOLR Cluster

2018-02-27 Thread Shawn Heisey
On 2/27/2018 6:42 PM, James Keeney wrote: -DzkHost=:2181,:2181,:2181 This looks correct, except that with AWS, I have no idea whether you need the internal IP addressing or the external IP addressing.  If all of the machines involved (both servers and clients) are able to communicate on the
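For reference, the ZooKeeper connection string is normally set once, either in solr.in.sh or on the command line; the host names below are placeholders for whichever addresses are reachable from both the Solr nodes and their clients:

    # solr.in.sh -- placeholder hostnames
    ZK_HOST="zk1.internal:2181,zk2.internal:2181,zk3.internal:2181"

    # or equivalently at startup
    bin/solr start -c -z "zk1.internal:2181,zk2.internal:2181,zk3.internal:2181"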

Re: Configuration of SOLR Cluster

2018-02-27 Thread James Keeney
Shawn - First, it's good to know that this is unusual behavior. That actually helps as it lets me know that I should keep digging. Here are a couple of things that might help. In the configuration I am calling out all three ZK nodes. Here is the configuration of Solr: -DSTOP.KEY=solrrocks -DSTO

Re: Configuration of SOLR Cluster

2018-02-27 Thread Shawn Heisey
On 2/27/2018 10:57 AM, James Keeney wrote: > *1 - ZK ensemble not accepting return of node* > Currently, when a ZK node in the ensemble goes down the ensemble is able to > do what it should do and keeps working. However when I bring the 3rd node > back online the other two nodes reject connection r
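For context, a three-node ensemble is described in each node's zoo.cfg roughly as below (the third address is a placeholder); all three nodes must carry the same server list plus a myid file matching their own server.N entry, and a mismatch between those files is a common source of rejected connections when a node rejoins:

    # zoo.cfg -- identical on all three ensemble members
    tickTime=2000
    initLimit=10
    syncLimit=5
    dataDir=/var/zookeeper
    clientPort=2181
    server.1=172.31.86.130:2888:3888
    server.2=172.31.73.122:2888:3888
    server.3=172.31.x.x:2888:3888     # placeholder for the third node

    # /var/zookeeper/myid on node 1 contains a single line: 1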

Re: Configuration of parallel indexing threads

2017-06-09 Thread gigo314
Thanks a lot! -- View this message in context: http://lucene.472066.n3.nabble.com/Configuration-of-parallel-indexing-threads-tp4338466p4339792.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Configuration of parallel indexing threads

2017-06-02 Thread Erick Erickson
That's pretty much my strategy. I'll add parenthetically that I often see the bottleneck for indexing to be acquiring the data from the system of record in the first place rather than Solr. Assuming you're using SolrJ, an easy test is to comment out the line that sends to Solr. There's usually som
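A sketch of that test, assuming a plain SolrJ indexing loop (client URL, batch size, and the source-system fetch are all hypothetical): with the add() call commented out, the measured rate is purely the cost of acquiring documents from the system of record.

    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.common.SolrInputDocument;
    import java.util.ArrayList;
    import java.util.List;

    public class IndexBottleneckTest {
        public static void main(String[] args) throws Exception {
            try (HttpSolrClient solr = new HttpSolrClient.Builder(
                    "http://localhost:8983/solr/mycollection").build()) {
                long start = System.currentTimeMillis();
                List<SolrInputDocument> batch = new ArrayList<>();
                long count = 0;
                SolrInputDocument doc;
                while ((doc = fetchNextFromSourceSystem()) != null) {   // often the real bottleneck
                    batch.add(doc);
                    count++;
                    if (batch.size() == 1000) {
                        // solr.add(batch);      // comment out to measure acquisition speed only
                        batch.clear();
                    }
                }
                // solr.commit();
                System.out.println(count + " docs in " + (System.currentTimeMillis() - start) + " ms");
            }
        }

        // stand-in for whatever the system of record provides
        static SolrInputDocument fetchNextFromSourceSystem() { return null; }
    }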

Re: Configuration of parallel indexing threads

2017-06-02 Thread gigo314
Thanks for the replies. Just to confirm that I got it right: 1. Since there is no setting to control index writers, is it fair to assume that Solr always indexes at maximum possible speed? 2. The way to control write speed is to control number of clients that are simultaneously posting data, right?

Re: Configuration of parallel indexing threads

2017-06-01 Thread Susheel Kumar
How are you indexing currently? Are you using DIH or using SolrJ/Java? And are you indexing with multiple threads/machines simultaneously etc or just one thread/machine etc. Thnx Susheel On Thu, Jun 1, 2017 at 11:45 AM, Erick Erickson wrote: > That's been removed in LUCENE-6659. I regularly max

Re: Configuration of parallel indexing threads

2017-06-01 Thread Erick Erickson
That's been removed in LUCENE-6659. I regularly max out my CPUs by having multiple _clients_ send updates simultaneously rather than trying to up the number of threads the indexing process takes. But Mike McCandless can answer authoritatively... Best, Erick On Thu, Jun 1, 2017 at 4:16 AM, gigo314
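The multi-client shape Erick describes looks roughly like the sketch below, where each thread acts as an independent client indexing its own slice of the source data; the URL, thread count, and batching helper are placeholders.

    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.common.SolrInputDocument;
    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    public class ParallelIndexers {
        public static void main(String[] args) throws Exception {
            int indexerThreads = 8;                 // raise until CPU on the Solr side maxes out
            ExecutorService pool = Executors.newFixedThreadPool(indexerThreads);
            for (int i = 0; i < indexerThreads; i++) {
                final int slice = i;                // each client indexes its own slice of the source data
                pool.submit(() -> {
                    try (HttpSolrClient solr = new HttpSolrClient.Builder(
                            "http://localhost:8983/solr/mycollection").build()) {
                        for (List<SolrInputDocument> batch : batchesForSlice(slice)) {
                            solr.add(batch);        // each thread is an independent "client"
                        }
                    }
                    return null;
                });
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.DAYS);
        }

        // stand-in for reading one partition of the system of record
        static Iterable<List<SolrInputDocument>> batchesForSlice(int slice) { return List.of(); }
    }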

Re: Configuration folder for the collection generated with instead of just

2016-12-07 Thread Erik Hatcher
Nicole - Since this is probably off-topic for the solr-user list, let’s take this offline and over to your Lucidworks support. But while we’re here, here’s an example of using the Fusion API to create a collection and then the Solr API to configure the schema. In this example, it’s not using
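The Solr half of that example, adding a field through the Schema API, looks roughly like this (collection and field names are made up for illustration; the Fusion call itself is omitted here):

    curl -X POST -H 'Content-type:application/json' \
      --data-binary '{ "add-field": { "name":"title", "type":"text_general", "stored":true } }' \
      http://localhost:8983/solr/mycollection/schema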

Re: Configuration folder for the collection generated with instead of just

2016-12-07 Thread Nicole Bilić
Good suggestion, but unfortunately it does not address this issue as we are not using the time-based partitioning in this project. It would be useful to know in which case the configuration is created with in Solr, what scenario leads to that, so we can investigate further. Any other suggestio

Re: Configuration folder for the collection generated with instead of just

2016-12-07 Thread Erik Hatcher
Looks best to file that as a Lucidworks support ticket. But are you using the time-based sharding feature of Fusion? If that's the case, that might explain it, as that creates collections for each time partition. Erik > On Dec 7, 2016, at 00:31, Nicole Bilić wrote: > > Hi all, > >

Re: Configuration

2015-10-19 Thread Alexandre Rafalovitch
Sounds like a mission impossible given the number of inner joins. However, what are you _actually_ trying to do? Are you trying to reindex the data? Do you actually have the data to reindex? Regards, Alex. Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter: http://www.solr-sta

Re: Configuration problem

2014-03-03 Thread Thomas Fischer
On 03.03.2014 at 22:43, Shawn Heisey wrote: > On 3/3/2014 9:02 AM, Thomas Fischer wrote: >> The setting is >> solr directories (I use different solr versions at the same time): >> /srv/solr/solr4.6.1 is the solr home, in solr home is a file solr.xml of the >> new "discovery type" (no cores), and

Re: Configuration problem

2014-03-03 Thread Shawn Heisey
On 3/3/2014 9:02 AM, Thomas Fischer wrote: The setting is solr directories (I use different solr versions at the same time): /srv/solr/solr4.6.1 is the solr home, in solr home is a file solr.xml of the new "discovery type" (no cores), and inside the core directories are empty files core.propert
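For reference, a discovery-mode solr home looks roughly like the sketch below. An empty core.properties is legal, in which case the core name defaults to the directory name, but each core directory still needs its conf/ with solrconfig.xml and schema.xml. The layout is illustrative, based on the path mentioned above.

    /srv/solr/solr4.6.1/            <- solr home (contains solr.xml)
        solr.xml
        core1/
            core.properties         <- may be empty, or e.g. "name=core1"
            conf/
                solrconfig.xml
                schema.xml
            data/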

Re: configuration for heavy system

2014-02-23 Thread Erick Erickson
You haven't told us anything about _how_ you're trying to index this document nor what its format is. Nor what "100 indexes and around 10 million records" means. 1B total records? 10M total records? Solr easily handles 10s of M records on a single decent-size node, I've seen between 50M and 300M.

Re: Configuration and specs to index a 1 terabyte (TB) repository

2013-10-30 Thread Walter Underwood
A flat distribution of queries is a poor test. Real queries have a Zipf distribution. The flat distribution will get almost no benefit from caching, so it will give too low a number and stress disk IO too much. The 99th percentile is probably the same for both distributions, because that is domi
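As an illustration of what a Zipf-weighted query mix means in practice: sample queries from a popularity-ranked list with probability proportional to 1/rank instead of uniformly. A minimal sketch, with a placeholder query list standing in for real log data:

    import java.util.List;
    import java.util.Random;

    public class ZipfQuerySampler {
        private final List<String> rankedQueries;   // most popular query first
        private final double[] cumulative;          // cumulative 1/rank weights
        private final Random rnd = new Random();

        ZipfQuerySampler(List<String> rankedQueries) {
            this.rankedQueries = rankedQueries;
            this.cumulative = new double[rankedQueries.size()];
            double sum = 0;
            for (int i = 0; i < rankedQueries.size(); i++) {
                sum += 1.0 / (i + 1);               // weight of rank i+1 is 1/(i+1)
                cumulative[i] = sum;
            }
        }

        String next() {
            double r = rnd.nextDouble() * cumulative[cumulative.length - 1];
            for (int i = 0; i < cumulative.length; i++) {
                if (r <= cumulative[i]) return rankedQueries.get(i);
            }
            return rankedQueries.get(rankedQueries.size() - 1);
        }

        public static void main(String[] args) {
            ZipfQuerySampler s = new ZipfQuerySampler(List.of("solr", "lucene", "conway", "zipf"));
            for (int i = 0; i < 10; i++) System.out.println(s.next());
        }
    }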

Re: Configuration and specs to index a 1 terabyte (TB) repository

2013-10-30 Thread Toke Eskildsen
On Wed, 2013-10-30 at 14:24 +0100, Shawn Heisey wrote: > On 10/30/2013 4:00 AM, Toke Eskildsen wrote: > > Why would TRIM have any influence on whether or not a drive failure > > also means server failure? > > I left out a step in my description. > > Lack of TRIM support in RAID means that I woul

Re: Configuration and specs to index a 1 terabyte (TB) repository

2013-10-30 Thread Shawn Heisey
On 10/30/2013 4:00 AM, Toke Eskildsen wrote: > On Tue, 2013-10-29 at 16:41 +0100, Shawn Heisey wrote: >> If you put the index on SSD, you could get by with less RAM, but a RAID >> solution that works properly with SSD (TRIM support) is hard to find, so >> SSD failure in most situations effectively

Re: Configuration and specs to index a 1 terabyte (TB) repository

2013-10-30 Thread eShard
Wow again! Thank you all very much for your insights. We will certainly take all of this under consideration. Erik: I want to upgrade but unfortunately, it's not up to me. You're right, we definitely need to do it. And SolrJ sounds interesting, thanks for the suggestions. By the way, is ther

Re: Configuration and specs to index a 1 terabyte (TB) repository

2013-10-30 Thread Toke Eskildsen
On Tue, 2013-10-29 at 16:41 +0100, Shawn Heisey wrote: > If you put the index on SSD, you could get by with less RAM, but a RAID > solution that works properly with SSD (TRIM support) is hard to find, so > SSD failure in most situations effectively means a server failure. Solr > and Lucene have a

Re: Configuration and specs to index a 1 terabyte (TB) repository

2013-10-30 Thread Toke Eskildsen
On Tue, 2013-10-29 at 14:24 +0100, eShard wrote: > I have a 1 TB repository with approximately 500,000 documents (that will > probably grow from there) that needs to be indexed. As Shawn points out, that isn't telling us much. If you describe the documents, how and how often you index and how you

Re: Configuration and specs to index a 1 terabyte (TB) repository

2013-10-29 Thread Erick Erickson
In addition to Shawn's comments... bq: we're close to beta release, so I can't upgrade right now WHOA! You say you're close to release but you haven't successfully crawled the data even once? Upgrading to 4.5.1 is a trivial risk compared to that statement! This is setting itself up for a real

Re: Configuration and specs to index a 1 terabyte (TB) repository

2013-10-29 Thread Shawn Heisey
On 10/29/2013 10:44 AM, eShard wrote: Offhand, how do I control how much of the index is held in RAM? Can you point me in the right direction? This is automatically handled by the operating system. For quite some time, Solr (Lucene) has by default used the MMap functionality provided by all
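In other words, there is no Solr knob for "index RAM"; the stock solrconfig.xml line below merely selects an MMap-capable directory implementation, and the operating system's page cache decides how much of the index actually resides in memory. The exact default class varies by version, so treat the snippet as illustrative.

    <!-- default in the example solrconfig.xml; exact class may vary by Solr version -->
    <directoryFactory name="DirectoryFactory"
                      class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}"/>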

Re: Configuration and specs to index a 1 terabyte (TB) repository

2013-10-29 Thread eShard
P.S. Offhand, how do I control how much of the index is held in RAM? Can you point me in the right direction? Thanks, -- View this message in context: http://lucene.472066.n3.nabble.com/Configuration-and-specs-to-index-a-1-terabyte-TB-repository-tp4098227p4098260.html Sent from the Solr - User

Re: Configuration and specs to index a 1 terabyte (TB) repository

2013-10-29 Thread eShard
Wow, thanks for your response. You raise a lot of great questions; I wish I had the answers! We're still trying to get enough resources to finish crawling the repository, so I don't even know what the final size of the index will be. I've thought about excluding the videos and other large files and

Re: Configuration and specs to index a 1 terabyte (TB) repository

2013-10-29 Thread Shawn Heisey
On 10/29/2013 7:24 AM, eShard wrote: > Good morning, > I have a 1 TB repository with approximately 500,000 documents (that will > probably grow from there) that needs to be indexed. > I'm limited to Solr 4.0 final (we're close to beta release, so I can't > upgrade right now) and I can't use SolrC

Re: Configuration for distributed search

2012-08-08 Thread Chris Hostetter
: This command to each shard returns one document from each shard. : curl 'http://localhost:8983/solr/select?debugQuery=true&indent=true&q=conway : curl 'http://localhost:7574/solr/select?debugQuery=true&indent=true&q=conway : : This distributed search command returns 0 documents: What do those
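For comparison, an old-style (non-SolrCloud) distributed request names every shard explicitly in the shards parameter, e.g.:

    curl 'http://localhost:8983/solr/select?q=conway&indent=true&debugQuery=true&shards=localhost:8983/solr,localhost:7574/solr'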

Re: Configuration for distributed search

2012-08-03 Thread Erick Erickson
Hmmm, the zero results could be that you're searching against the default text field and you don't have "conway" in that field. The default search field has recently been deprecated, so try specifying a field in your search. The debugQuery=on worked fine for me, so I'm not sure what's happening

Re: Configuration steps to create dynamic core

2012-05-09 Thread pprabhcisco123
Hi, I tried to create cores dynamically using the below code: CoreAdminResponse statusResponse = CoreAdminRequest.getStatus(indexName, solr); coreExists = statusResponse.getCoreStatus(indexName).size() > 0; System.out.println("got the cor

Re: Configuration steps to create dynamic core

2012-05-09 Thread pprabhcisco123
Hi Dave, I tried to create the core programmatically as below, but am getting the following error. CoreAdminResponse statusResponse = CoreAdminRequest.getStatus(indexName, solr); coreExists = statusResponse.getCoreStatus(indexName).size() > 0;

Re: Configuration steps to create dynamic core

2012-05-09 Thread Dave Stuart
This page gives you everything you need: http://wiki.apache.org/solr/CoreAdmin#CREATE Regards, Dave On 9 May 2012, at 08:32, pprabhcisco123 wrote: > Hi, > > > I am trying to create a core dynamically. What are the configuration > steps that need to be followed to do the same? Please l
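From that wiki page, the CREATE call has roughly this shape (core name and instanceDir are placeholders); on pre-configset Solr the instanceDir must already exist on disk with a conf/ directory holding solrconfig.xml and schema.xml:

    curl "http://localhost:8983/solr/admin/cores?action=CREATE&name=core1&instanceDir=core1&config=solrconfig.xml&schema=schema.xml&dataDir=data"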

Re: Configuration option for disableReplication

2010-12-23 Thread Upayavira
Having played with it, I can see that it would be extremely useful to be able to disable replication in the solrconfig.xml, and then enable it with a URL. So, as to your patch, I'd say yes, submit it. But do try to make it backwards compatible. It'll make it much more likely to get accepted. Upay
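The URL commands referred to are the ReplicationHandler's existing enable/disable switches, roughly as below (host and core names are placeholders); what the patch would add is the ability to start out disabled from solrconfig.xml:

    # on the master: stop serving index versions to slaves, then re-enable later
    curl 'http://master-host:8983/solr/core0/replication?command=disablereplication'
    curl 'http://master-host:8983/solr/core0/replication?command=enablereplication'

    # on a slave: pause/resume polling
    curl 'http://slave-host:8983/solr/core0/replication?command=disablepoll'
    curl 'http://slave-host:8983/solr/core0/replication?command=enablepoll'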

Re: Configuration option for disableReplication

2010-12-23 Thread Francis Rhys-Jones
Hi, We're running a cloud-based cluster of servers and it's not that easy to get a list of the current slaves. Since my problem is only around the restart/redeployment of the master, it seems an unnecessary complication to have to start interacting with slaves as part of the scripts that do this. As

Re: Configuration option for disableReplication

2010-12-22 Thread Upayavira
I've just done a bit of playing here, because I've spent a lot of time reading the SolrReplication wiki page[1], and have often wondered how some features interact. Unfortunately, if you specify false in your replication request handler for your master, you cannot re-enable it with a call to /solr

Re: Configuration of format and type index with solr

2009-04-27 Thread Shalin Shekhar Mangar
On Mon, Apr 27, 2009 at 10:40 PM, hpn1975 nasc wrote: > > 1- Guarantee that my searcher (solr) ALWAYS searches my index in *memory* (use RAMDirectory), not using the cache. It is possible to disable all caches. But it is not possible to use RAMDirectory right now. This is in progress. https
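Disabling the caches amounts to removing, or commenting out, their entries in solrconfig.xml. A sketch of the entries in question, with the stock example sizes (values are illustrative):

    <!-- removing these from solrconfig.xml disables the corresponding caches -->
    <filterCache      class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="0"/>
    <queryResultCache class="solr.LRUCache"     size="512" initialSize="512" autowarmCount="0"/>
    <documentCache    class="solr.LRUCache"     size="512" initialSize="512" autowarmCount="0"/>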