Re: Slow import from MsSQL and down cluster during process

2018-10-23 Thread Daniel Carrasco
Thanks for everything, I'll try later ;) Greetings! On Wed, Oct 24, 2018 at 7:13, Walter Underwood wrote: > We handle request rates at a few thousand requests/minute with an 8 GB > heap. 95th percentile response time is 200 ms. Median (cached) is 4 ms. > > An oversized heap will hurt your

Re: Slow import from MsSQL and down cluster during process

2018-10-23 Thread Walter Underwood
We handle request rates at a few thousand requests/minute with an 8 GB heap. 95th percentile response time is 200 ms. Median (cached) is 4 ms. An oversized heap will hurt your query performance because everything stops for the huge GC. RAM is still a thousand times faster than SSD, so you want

Re: Join across shards?

2018-10-23 Thread Erick Erickson
In addition to Vadim's comment, Solr Streaming _can_ work across shards and even across collections. Depending on your use-case this may work for you. Best, Erick On Tue, Oct 23, 2018 at 6:41 AM Vadim Ivanov wrote: > > Hi, > You CAN join across collections with runtime "join". > The only

Re: Slow import from MsSQL and down cluster during process

2018-10-23 Thread Daniel Carrasco
Hello, I set that heap size because Solr receives a lot of queries every second and I want to cache as much as possible. Also, I'm not sure about the number of documents in the collection, but the website has a lot of products. As for storing the index data in RAM, that was just an expression. The

Re: Setting up MiniSolrCloudCluster to use pre-built index

2018-10-23 Thread Ken Krugler
Hi Mark, I’ll have a completely new, rebuilt index that’s (a) large, and (b) already sharded appropriately. In that case, using the merge API isn’t great, in that it would take significant time and temporarily use double (or more) disk space. E.g. I’ve got an index with 250M+ records, and

Re: Regarding multi keyword search

2018-10-23 Thread Shawn Heisey
On 10/23/2018 8:20 AM, Gauri Dhawan wrote: I have been facing an issue for quite some time and haven't been able to come to a solution as of yet. We are trying to implement search on our platform and all our data is stored in Solr. I have a field `description` which is the field where I have to

Re: Slow import from MsSQL and down cluster during process

2018-10-23 Thread Shawn Heisey
On 10/23/2018 7:15 AM, Daniel Carrasco wrote: Hello, Thanks for your response. We've already thought about that and doubled the instances. Right now every Solr instance has 60 GB of RAM (40 GB configured for Solr) and a 16-core CPU. The entire data set can be stored in RAM and will not fill

Re: AW: AW: AW: 6.6 -> 7.5 SolrJ, seeing many "Connection evictor"-Threads

2018-10-23 Thread Shawn Heisey
On 10/22/2018 9:44 PM, Clemens Wyss DEV wrote: > On 10/22/2018 6:15 AM, Shawn Heisey wrote: > > autoSoftCommit is pretty aggressive. If your commits are taking 1-2 seconds or les > well, some take minutes (re-index)! Are you absolutely sure that you have commits taking that much time? I'm not

Re: Internal Solr communication question

2018-10-23 Thread Shawn Heisey
On 10/23/2018 9:31 AM, Fernando Otero wrote: Hey all, I'm running some tests on SolrCloud (10 nodes, 3 shards, 3 replicas). When I run the queries I end up seeing 7x the traffic (requests/minute) in New Relic. Could it be that the internal communication between nodes is done through HTTP

Re: Regarding multi keyword search

2018-10-23 Thread Walter Underwood
100% on mm is dangerous. If there is one misspelled or wrong word, there are zero matches. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On Oct 23, 2018, at 8:25 AM, ANNAMANENI RAVEENDRA > wrote: > > You should use mm parameter and it should be

Internal Solr communication question

2018-10-23 Thread Fernando Otero
Hey all, I'm running some tests on SolrCloud (10 nodes, 3 shards, 3 replicas). When I run the queries I end up seeing 7x the traffic (requests/minute) in New Relic. Could it be that the internal communication between nodes is done through HTTP and New Relic counts those calls? Thanks!
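One possible reading of the 7x figure, offered as an assumption and not as a diagnosis: if the node that receives a query forwards it over HTTP to one replica of each shard, and a distributed query makes two passes per shard (one to collect the top document IDs, one to fetch the stored fields), then a 3-shard collection would show the original request plus six internal ones. The function name and the two-pass default are illustrative choices, not something stated in the thread.

```python
def requests_per_query(num_shards, passes_per_shard=2):
    """Estimate HTTP requests seen per client query.

    Assumes one incoming client request plus `passes_per_shard`
    internal requests to a single replica of every shard.
    """
    return 1 + num_shards * passes_per_shard


# 3 shards, two passes each, plus the original client request
print(requests_per_query(3))
```

If a monitoring agent on every node counts both incoming client requests and internal shard requests, the multiplier would track the shard count in this way; the actual figure depends on the query and on which shards contribute documents.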

Re: Regarding multi keyword search

2018-10-23 Thread ANNAMANENI RAVEENDRA
You should use the mm parameter, and it should be set to 100 if you use dismax or edismax. On Tue, Oct 23, 2018 at 11:18 AM Gauri Dhawan wrote: > Hi! > I have been facing an issue for quite some time and haven't been able to > come to a solution as of yet. We are trying to implement search on our >
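A minimal sketch of what such a request could look like, assuming the search runs against the `description` field from the original question. The helper name `build_edismax_params` and the sample query text are made up for illustration, and the default of `100%` (every term must match) is an assumption about the intended value. Note that with `100%`, a single misspelled word yields zero matches, so a lower percentage may be the safer setting.

```python
from urllib.parse import urlencode


def build_edismax_params(user_query, mm="100%"):
    """Return URL-encoded Solr parameters for an edismax query.

    `qf` names the field to search; `mm` is the minimum-should-match
    value passed through unchanged.
    """
    params = {
        "defType": "edismax",
        "q": user_query,
        "qf": "description",  # field name taken from the original question
        "mm": mm,
    }
    return urlencode(params)


# Query string to append to a collection's /select handler
print(build_edismax_params("red running shoes"))
```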

Re: ZookeeperServer not running/Client Session timed out

2018-10-23 Thread Susheel Kumar
Hi Shawn, Thanks for pointing out that it may be due to a network/VM issue. I looked at the ZK logs in detail and I see the socket timeout issue below, after which ZK shutdown is called. Is that good enough to confirm a VM/network issue rather than a ZK/Solr issue? I am also including dmesg output during the

Regarding multi keyword search

2018-10-23 Thread Gauri Dhawan
Hi! I have been facing an issue for quite some time and haven't been able to come to a solution as of yet. We are trying to implement search on our platform and all our data is stored in Solr. I have a field `description` which is the field where I have to search. It is of the field type

RE: Join across shards?

2018-10-23 Thread Vadim Ivanov
Hi, You CAN join across collections with runtime "join". The only limitation is that the FROM collection should not be sharded and the joined data should reside on one node. Solr cannot join across nodes (distributed search is not supported). Though using streaming expressions it's possible to do
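As a hedged illustration of the runtime join mentioned above, the sketch below builds a query string using the join query parser's `from`, `to`, and `fromIndex` local parameters. The collection and field names (`reviews`, `product_id`, `id`, `rating`) are hypothetical, and the limitation stated above still applies: the FROM collection is assumed to be unsharded and present on the node handling the query.

```python
def build_join_query(from_field, to_field, from_collection, inner_query):
    """Build a Solr join query that pulls matches from another collection.

    Documents matching `inner_query` in `from_collection` are joined on
    `from_field` to `to_field` in the collection being searched.
    """
    local_params = (
        f"{{!join from={from_field} to={to_field} fromIndex={from_collection}}}"
    )
    return local_params + inner_query


# Hypothetical example: products that have a 5-star review
print(build_join_query("product_id", "id", "reviews", "rating:5"))
```

The resulting string would be sent as the `q` or `fq` parameter of a normal search request against the TO collection.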

Re: Slow import from MsSQL and down cluster during process

2018-10-23 Thread Daniel Carrasco
Hello, Thanks for your response. We've already thought about that and doubled the instances. Right now every Solr instance has 60 GB of RAM (40 GB configured for Solr) and a 16-core CPU. The entire data set can be stored in RAM and will not fill the RAM (of course talking about raw data, not

Re: Integrate nutch with solr

2018-10-23 Thread Elizabeth Haubert
Hi Dinesh, This article is quite old (Nutch 1.x, Solr 4.x), but the high-level steps are still pretty much the same: get your java set up, kick off a Solr , and

Re: Slow import from MsSQL and down cluster during process

2018-10-23 Thread Chris Ulicny
Dan, Do you have any idea of the resource usage on the hosts when Solr starts to become unresponsive? It could be that you need more resources or better AWS instances for the hosts. We had what sounds like a similar scenario when attempting to move one of our SolrCloud instances to a cloud

Join across shards?

2018-10-23 Thread e_briere
Hi all, Sorry if the question was already covered. We are using joins across documents with the limitation of having the documents to be joined sitting on the same shard. Is there a way around this limitation and even join across collections? Are there plans to support this out of the box?

Re: Slow import from MsSQL and down cluster during process

2018-10-23 Thread Daniel Carrasco
Hi, On Tue, Oct 23, 2018 at 10:18, Charlie Hull wrote: > On 23/10/2018 02:57, Daniel Carrasco wrote: > > Hello, > > > > I've a Solr Cluster that is created with 7 machines on AWS instances. The > > Solr version is 7.2.1 (b2b6438b37073bee1fca40374e85bf91aa457c0b) and all > >

Re: Inconsistent leader between ZK and Solr and a lot of downtime

2018-10-23 Thread Ben Knüttgen
Daniel Carrasco wrote > Hello, > > I'm investigating an 8-node Solr 7.2.1 cluster because we have a lot of > problems: for example, when a node fails to import from a DB (maybe it freezes), > the entire cluster goes down, and the leader won't change even when > it is down (all nodes detect that

Re: AW: AW: 6.6 -> 7.5 SolrJ, seeing many "Connection evictor"-Threads

2018-10-23 Thread Shalin Shekhar Mangar
You can expect as many connection evictor threads as the number of http client instances. This is true for both Solr 6.6 and 7.x. I was intrigued as to why you were not seeing the same threads in both versions. It turns out that I made a mistake in the patch I committed in SOLR-9290 where instead

Re: Slow import from MsSQL and down cluster during process

2018-10-23 Thread Charlie Hull
On 23/10/2018 02:57, Daniel Carrasco wrote: Hello, I've a Solr Cluster that is created with 7 machines on AWS instances. The Solr version is 7.2.1 (b2b6438b37073bee1fca40374e85bf91aa457c0b) and all nodes are running in NRT mode and I've a replica per node (7 replicas). One node is used