Re: Performance problems with extremely common terms in collection (Solr 7.4)

2019-04-08 Thread Ash Ramesh
' to appear commonly in a document's keyword list. It seems unlikely we can use the common grams solution with our business case. Regards, Ash On Mon, Apr 8, 2019 at 5:01 PM Toke Eskildsen wrote: > On Mon, 2019-04-08 at 09:58 +1000, Ash Ramesh wrote: > > We have a corpus of 50+ million

Performance problems with extremely common terms in collection (Solr 7.4)

2019-04-07 Thread Ash Ramesh
Hi everybody, We have a corpus of 50+ million documents in our collection. I've noticed that some queries with specific keywords tend to be extremely slow, e.g. q='photography' or q='background'. After digging into the raw documents, I could see that these two terms appear in greater than 90%

Problem understanding why QPS is so low

2019-03-19 Thread Ash Ramesh
Hi everybody, My team runs a Solr cluster which has very low QPS throughput. I have been going through the different configurations in our setup, and think that it's probably the way we have defined our request handlers that is causing the slowness. Details of our cluster are below the fold.

Re: Questions about stored fields and updates.

2018-11-04 Thread Ash Ramesh
Also thanks for the information Shawn! :) On Mon, Nov 5, 2018 at 12:09 PM Ash Ramesh wrote: > Sorry Shawn, > > I seem to have gotten my wording wrong. I meant that we wanted to move > away from atomic-updates to replacing/reindexing the document entirely > again when changes ar

Re: Questions about stored fields and updates.

2018-11-04 Thread Ash Ramesh
On Mon, Nov 5, 2018 at 11:29 AM Shawn Heisey wrote: > On 11/3/2018 9:45 PM, Ash Ramesh wrote: > > My company currently uses SOLR to completely hydrate client objects by > > storing all fields (stored=true). Therefore we have 2 types of fields: > > > > 1.

Questions about stored fields and updates.

2018-11-03 Thread Ash Ramesh
Hi everyone, My company currently uses SOLR to completely hydrate client objects by storing all fields (stored=true). Therefore we have 2 types of fields: 1. indexed=true | stored=true : For fields that will be used for searching, sorting, etc. 2. indexed=false | stored=true: For fields

Potential bug? maxConnectionsPerHost on requestHandler configuration

2018-09-10 Thread Ash Ramesh
Hi, I tried setting up a bespoke ShardHandlerFactory configuration for each request handler in solrconfig.xml. However when I stepped through the code in debug mode (via IntelliJ) I could see that the ShardHandler created and used in the searcher still didn't reflect the values in solrconfig

Solr Cloud not routing to PULL replicas

2018-08-28 Thread Ash Ramesh
Hi again, We are currently using Solr 7.3.1 and have an 8-shard collection. All our TLOGs are on separate machines and our PULLs on others. Since not all shards are on the same machine, the request will be distributed. However, we are seeing that most of the 'distributed' parts of the requests are being

Re: Understanding how timeAllowed works in a distributed cluster

2018-08-22 Thread Ash Ramesh
8 at 4:09 PM Ash Ramesh wrote: > > > Hi again, > > > > Specs: 7.3.1 | 8 Shards | Solr Cloud > > > > I was wondering how the timeAllowed parameter works when you architect > your > > cluster in a sharded and distributed manner. This is the curl command and

Understanding how timeAllowed works in a distributed cluster

2018-08-22 Thread Ash Ramesh
Hi again, Specs: 7.3.1 | 8 Shards | Solr Cloud I was wondering how the timeAllowed parameter works when you architect your cluster in a sharded and distributed manner. This is the curl command and the timing Query:

Re: 7.3.1: Query of death - all nodes ran out of memory and had to be shut down

2018-08-22 Thread Ash Ramesh
Thank you all :) We have made the necessary changes to mitigate this issue On Wed, Aug 22, 2018 at 6:01 AM Shawn Heisey wrote: > On 8/20/2018 9:55 PM, Ash Ramesh wrote: > > We ran a bunch of deep paginated queries (offset of 1,000,000) with a > > filter query. We set the timeo

Re: 7.3.1: Query of death - all nodes ran out of memory and had to be shut down

2018-08-20 Thread Ash Ramesh
s very likely (but not guaranteed) that using cursors will fix > this problem. > > Best, > Erick > > > > On Mon, Aug 20, 2018 at 8:55 PM, Ash Ramesh wrote: > > Hi everyone, > > > > We ran into an issue yesterday where all our ec2 machines, running solr, >

7.3.1: Query of death - all nodes ran out of memory and had to be shut down

2018-08-20 Thread Ash Ramesh
Hi everyone, We ran into an issue yesterday where all our EC2 machines, running Solr, ran out of memory and could not heal themselves. I'll try to break down what happened here. *System Architecture:* - Solr Version: 7.3.1 - Replica Types: TLOG/PULL - Num Shards: 8 (default hashing mechanism) -

Re: Memory requirements for TLOGs (7.3.1)

2018-07-18 Thread Ash Ramesh
l 18, 2018 at 7:46 AM, Shawn Heisey wrote: > > On 7/18/2018 12:04 AM, Ash Ramesh wrote: > >> > >> I have a quick question about what the memory requirements for TLOG > >> machines are on 7.3.1. We currently run replication where there are 3 > >> TL

Memory requirements for TLOGs (7.3.1)

2018-07-18 Thread Ash Ramesh
Hi everybody, I have a quick question about what the memory requirements for TLOG machines are on 7.3.1. We currently run replication where there are 3 TLOGs with 8gb ram (2gb heap) and N PULL replicas with 32gb ram (4gb heap). We have > 10M documents (1 collection) with the index size being ~