Hi,
Thanks for your reply.
I will try your suggestion of using a single Solr instance.
I merged the 15 indexes again and found that the new merged index
(without optimization) was about 351 GB, but after I optimized it the
size went back up to 411 GB. Why?
I thought that optimization would shrink the index, or at least leave it
the same size as before optimization.
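
To see where the extra space goes, I could compare the per-file-type sizes
of the index directory before and after the optimize. Below is a minimal
Python sketch of that check, not anything shipped with Solr; the function
names are my own and the path in the usage note is hypothetical. It groups
files by extension (in the Lucene file format, .fdt/.fdx hold stored fields,
.tvx/.tvd/.tvf hold term vectors, .frq and .prx hold postings; a
compound-file index shows up mostly as .cfs).

```python
import os
from collections import defaultdict


def size_by_extension(index_dir):
    """Return {extension: total_bytes} for files directly inside index_dir.

    Files without an extension (e.g. segments_N) are keyed by their full name.
    """
    totals = defaultdict(int)
    for entry in os.scandir(index_dir):
        if entry.is_file():
            ext = os.path.splitext(entry.name)[1] or entry.name
            totals[ext] += entry.stat().st_size
    return dict(totals)


def report(index_dir):
    """Print per-extension sizes in GB, largest first, followed by the total."""
    totals = size_by_extension(index_dir)
    for ext, size in sorted(totals.items(), key=lambda kv: -kv[1]):
        print(f"{ext:>12} {size / 1024**3:10.2f} GB")
    print(f"{'total':>12} {sum(totals.values()) / 1024**3:10.2f} GB")
```

Running report("/path/to/solr/data/index") once on the merged index and once
on the optimized one should show which file types account for the 60 GB
difference.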
Funtick wrote:
>
> Hi,
>
> Can you try to use single SOLR instance with heavy RAM (so that
> ramBufferSizeMB=8192 for instance) and mergeFactor=10? Single SOLR
> instance is fast enough (> 100 client threads of Tomcat; configurable) - I
> usually prefer single instance for single "writable" box with heavy RAM
> allocation and good I/O.
>
> Merging 15 indexes and 4-times larger size could happen, for instance,
> because of differences in SOLR Schema and Lucene; ensure that schema is
> the same (using Luke for instance). SOLR 1.4 has some new powerful
> features such as document->term cache stored somewhere (uninverted index)
> (Yonik), term vectors, stored=true, copyField, etc.
>
> Do not do commit per 100; do it once at the end...
>
>
>
> -Original Message-
> From: engy.ali [mailto:omeshm...@hotmail.com]
> Sent: August-25-09 3:31 PM
> To: solr-user@lucene.apache.org
> Subject: Solr index - Size and indexing speed
>
>
> Summary
> ===
>
> I had about 120,000 objects with a total size of 71.2 GB; those objects are
> already indexed using Lucene. The index size is about 111 GB.
>
> I tried to use the Solr 1.4 nightly build to index the same collection. I
> divided the collection across three servers; each server had 5 Solr
> instances (not Solr cores) up and running.
>
> After the collection had been indexed, I merged the 15 indexes.
>
> Problems
> ==
>
> 1. The new merged index size is about 411 GB (i.e., 4 times larger than the
> old index built with Lucene).
>
> I tried indexing only one object using Lucene and the same object using
> Solr to verify the size, and the result was that the new index is about
> twice the size of the old index.
>
> Do you have any idea what might be the reason?
>
>
> 2. The indexing speed is slow. 100 objects on a single Solr instance were
> indexed in 1 hour, so I estimated that 1000 on a single instance could be
> done in 10 hours, but that was not the case: the indexing time exceeded the
> estimated time by about 12 hours.
>
> Might that be related to the growth of the index? If not, what might be the
> reason?
>
> Note: I do a commit every 100 objects and an optimize at the end of the
> whole operation. I also changed the mergeFactor from 10 to 15.
>
>
> 3. I googled and found out that Solr uses an inverted index, but I want to
> know what the internal structure of the Solr index is. For example, if I
> have a word and its stems, how will they be stored in the index?
>
> Thanks,
> Engy
> --
> View this message in context:
> http://www.nabble.com/Solr-index---Size-and-indexing-speed-tp25140702p25140702.html
> Sent from the Solr - User mailing list archive at Nabble.com.
--
View this message in context:
http://www.nabble.com/Solr-index---Size-and-indexing-speed-tp25140702p25201981.html
Sent from the Solr - User mailing list archive at Nabble.com.