Otis,

Good points. I take it you are saying that it depends on the resources. The documents are 100k each, and the preprocessing server is a 2 cpu VM running with 4G RAM. So, could that be a relatively "small" machine for processing this amount of data?
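
To put rough numbers on it, here is a minimal back-of-envelope sketch. It assumes "100k" means roughly 100 KB per document, and it uses 3 hours as a purely hypothetical stand-in for "hours"; neither figure has been measured, and the function name is just for illustration.

```python
# Back-of-envelope sizing for the re-index described in this thread.
# Assumptions (not measured): "100k" means ~100 KB per document,
# and a full run takes 3 hours (hypothetical stand-in for "hours").

DOC_COUNT = 6_000_000          # from the thread: about 6 million entries
DOC_SIZE_BYTES = 100 * 1000    # assumption: 100 KB per document
RUN_HOURS = 3                  # hypothetical run time


def estimate(doc_count, doc_size_bytes, run_hours):
    """Return (total_gb, docs_per_sec, mb_per_sec) for one full indexing run."""
    seconds = run_hours * 3600
    total_bytes = doc_count * doc_size_bytes
    total_gb = total_bytes / 1e9
    docs_per_sec = doc_count / seconds
    mb_per_sec = total_bytes / 1e6 / seconds
    return total_gb, docs_per_sec, mb_per_sec


total_gb, docs_per_sec, mb_per_sec = estimate(DOC_COUNT, DOC_SIZE_BYTES, RUN_HOURS)
print(f"total data:  {total_gb:.0f} GB")
print(f"throughput:  {docs_per_sec:.0f} docs/s, {mb_per_sec:.1f} MB/s")
```

Swapping in the real document size and run time would show what sustained rate the 2 cpu VM actually has to keep up with.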


On 3/5/14, 12:27 PM, Otis Gospodnetic wrote:
Hi,

It depends.  Are docs huge or small? Server single core or 32 core?  Heap
big or small?  etc. etc.

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


On Wed, Mar 5, 2014 at 3:02 PM, Rallavagu <rallav...@gmail.com> wrote:

It seems the latency is introduced by collecting the data from different
sources and putting it together, and then by the actual Solr indexing. I
would say all these activities are contributing about equally. So, is it
normal to expect indexing to run for a long time? Wondering what to expect
in such cases. Thanks.

On 3/5/14, 11:47 AM, Otis Gospodnetic wrote:

Hi,

6M is really not huge these days.  6B is big, though also still not huge
any more.  What seems to be the bottleneck?  Solr or DB or network or
something else?

Otis


On Wed, Mar 5, 2014 at 2:37 PM, Rallavagu <rallav...@gmail.com> wrote:

All,

Wondering about best practices/common practices to index/re-index a huge
amount of data in Solr. The data is about 6 million entries in the db and
another source (the data is not located in one resource). Trying a
solrj-based solution to collect data from the different resources and index
it into Solr. It takes hours to index into Solr.

Thanks in advance



