---------- Forwarded message ----------
From: *Moenieb Davids* <moenieb.dav...@gmail.com>
Date: Thursday, June 7, 2018
Subject: Sole for Content Management
To: "general@lucene.apache.org" <general@lucene.apache.org>, "u...@lucene.apache.org" <u...@lucene.apache.org>

Hi All,

Background: I am currently testing a deployment of a content management framework in which I am trying to promote Solr as the tool of choice for ingestion and searching.

Current status: I have deployed SolrCloud across multiple servers with multiple shards and a replication factor of 2.

In terms of collections, I have a person collection that contains details of individuals, including address and high-level portfolio info. Structurally, this collection contains nested documents down to the great-grandchild level.

Then I have a few collections that deal with content. For now, content is just emails and documents with a maximum size of 2MB, with certain user exceptions that can go higher than 2MB. The actual content is indexed twice: first as binary/stream and then as readable text. Metadata is negligible.

Challenges: When I perform full-text searches without concurrently executing updates, Solr seems to do well. Running updates on their own also does okay, given the nature of the transaction. However, when I run searches and updates simultaneously, performance drops quite significantly.

I have played with field properties, analyzers, tokenizers, shard sizes, etc.

Any advice? I would like to know if anyone has done something similar.

Please excuse the long-winded message.

-- Sent from Gmail Mobile