Why are most of your fields stored but not indexed?  That suggests to me
that you are using Solr as your primary data store rather than as an index,
which is not Solr's ideal use case.

Secondly, I think there is confusion around the term "segments".  You have
a field called segment in your schema, but in Lucene terms segments are the
parts that make up the index.  So to clarify, when you say your "segments"
size is 8.4GB, I assume you mean the input data you are putting in the
segment field?

If you look at the files in your index, you can see the different elements
that make up the index.
https://lucene.apache.org/core/4_7_2/core/org/apache/lucene/codecs/lucene46/package-summary.html#package_description
gives the full description of all the different elements for your version.
As Alessandro says, based on your schema the field data (.fdt) files are
probably the largest part of your index.

You should be able to see how the index breaks down in terms of data.  From
there you can work out how to tweak your schema.
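
One way to get that breakdown is to total the file sizes in the index
directory by extension.  The following is only a minimal sketch: the
function name, the default path and the short description table are my own
assumptions for illustration (the descriptions follow the codec page linked
above), and it is not a Solr or Lucene tool.

```python
import os
import sys
from collections import defaultdict

# Short descriptions for a few extensions of the default Lucene 4.x codec.
# This table is illustrative and deliberately incomplete.
DESCRIPTIONS = {
    ".fdt": "stored field data",
    ".fdx": "stored field index",
    ".tim": "term dictionary",
    ".tip": "term index",
    ".doc": "postings (frequencies)",
    ".pos": "postings (positions)",
}


def index_size_by_extension(index_dir):
    """Return {extension: total bytes} for the files directly in index_dir."""
    totals = defaultdict(int)
    for name in os.listdir(index_dir):
        path = os.path.join(index_dir, name)
        if not os.path.isfile(path):
            continue  # skip sub-directories
        # Files without an extension (e.g. segments_N) are keyed by full name.
        ext = os.path.splitext(name)[1] or name
        totals[ext] += os.path.getsize(path)
    return dict(totals)


def main():
    # Hypothetical usage: pass the path of your core's data/index directory.
    index_dir = sys.argv[1] if len(sys.argv) > 1 else "."
    totals = index_size_by_extension(index_dir)
    grand_total = sum(totals.values()) or 1
    # Largest contributors first.
    for ext, size in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
        share = 100.0 * size / grand_total
        label = DESCRIPTIONS.get(ext, "")
        print("%-14s %14d bytes %5.1f%%  %s" % (ext, size, share, label))


if __name__ == "__main__":
    main()
```

If the .fdt line dominates the output, the stored fields are what is taking
the space, and that is where schema changes would have the most effect.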

Remember that nearly all your fields are stored (only host is not), so the
index size will always include all the stored data, plus all the index
structures needed.  Solr's efficiency is around the indexed data, and it
does sometimes trade off more disk space for greater speed in reading, so
you will have to bear that in mind.


On 22 July 2015 at 12:29, Emir Arnautovic <emir.arnauto...@sematext.com>
wrote:

> Is this a test index? Do you rewrite documents with the same ids? Did you
> try to optimize the index?
>
> Emir
>
> --
> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> Solr & Elasticsearch Support * http://sematext.com/
>
>
>
> On 22.07.2015 13:10, Daniel Holmes wrote:
>
>> Upayavira, the number of docs in that case is 140275. The Solr memory is
>> 30GB.
>>
>> Yes Emir, I need most of them to be saved.
>>
>> I don't know, Alessandro. Is it usual for indexing to use more than 3x
>> the document size on disk? Presumably it will grow exponentially as the
>> crawl continues... It's so suboptimal, I think.
>>
>>
>> On Wed, Jul 22, 2015 at 3:16 PM, Alessandro Benedetti <
>> benedetti.ale...@gmail.com> wrote:
>>
>>> "In one case for instance my segments size is 8.4G while index size is
>>> 28G!!! It seems unusual…"
>>>
>>> The index is a collection of index segments plus a little overhead.
>>> So, do you simply mean you have 4 segments?
>>> Where is the problem anyway?
>>> You are also storing content, which usually is a big part of the index.
>>> As Upaya said, I am curious to know why you are so surprised!
>>>
>>> Cheers
>>>
>>> 2015-07-22 11:27 GMT+01:00 Daniel Holmes <noora.sa...@gmail.com>:
>>>
>>>> Hi All
>>>> I have a problem with index size in Solr 4.7.2. My OS is Ubuntu 14.10
>>>> 64-bit.
>>>>
>>>> My fields are:
>>>>
>>>> <field name="id" type="string" stored="true" indexed="true"/>
>>>> <field name="segment" type="string" stored="true" indexed="false"/>
>>>> <field name="url" type="url_text" stored="true" indexed="true"
>>>> required="true"/>
>>>> <field name="outlink" type="url_text" stored="true" indexed="true"
>>>> required="true"/>
>>>> <field name="content" type="text_general" stored="true" indexed="true"/>
>>>> <field name="title" type="text_general" stored="true" indexed="true"/>
>>>> <field name="host" type="url" stored="false" indexed="true"/>
>>>> <field name="segment" type="string" stored="true" indexed="false"/>
>>>> <field name="boost" type="float" stored="true" indexed="false"/>
>>>> <field name="digest" type="string" stored="true" indexed="false"/>
>>>> <field name="tstamp" type="date" stored="true" indexed="false"/>
>>>>
>>>> In one case for instance my segments size is 8.4G while index size is
>>>> 28G!!! It seems unusual...
>>>>
>>>> What suggestions do you have to reduce index size?
>>>> Is there any way to check disk usage details in cores? e.g. stop words,
>>>> stored docs, etc.
>>>>
>>>>
>>>
>>> --
>>> --------------------------
>>>
>>> Benedetti Alessandro
>>> Visiting card - http://about.me/alessandro_benedetti
>>> Blog - http://alexbenedetti.blogspot.co.uk
>>>
>>> "Tyger, tyger burning bright
>>> In the forests of the night,
>>> What immortal hand or eye
>>> Could frame thy fearful symmetry?"
>>>
>>> William Blake - Songs of Experience -1794 England
>>>
>>>
