Hi Erick,

It's due to some past issues we observed with joins on Solr 4, which ran out
of memory (OOM) when joining against large indexes after
optimization/compaction. If the indexes are stored as smaller files, they fit
into memory and the operations complete normally. We have also observed slow
writes/commits/updates with large files. So, to minimize this risk while
upgrading to Solr 6, we want to store the indexes in smaller files.

Thanks,
Manan Sheth
________________________________________
From: Erick Erickson <erickerick...@gmail.com>
Sent: Tuesday, January 10, 2017 5:24 AM
To: solr-user
Subject: Re: Help needed in breaking large index file into smaller ones

Why do you have a requirement that the indexes be < 4G? If it's
arbitrarily imposed, why bother?

Or is it a non-negotiable requirement imposed by the platform you're on?

Because just splitting the files into a smaller set won't help you if
you then start to index into it: the merge process will just recreate
them.

You might be able to do something with the settings in
TieredMergePolicy in the first place to stop generating files > 4G.
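
To see whether a merge-policy change actually keeps files under the limit,
a small check like the following may help. It is only a sketch in plain
Python, not part of Solr: the function name and the default threshold are
assumptions made up for illustration. The TieredMergePolicy setting most
likely relevant here is maxMergedSegmentMB, but it is an approximate cap on
merged segment size, and a segment consists of several files, so checking
the files on disk is still worthwhile.

```python
import os

FOUR_GB = 4 * 1024 ** 3  # the "< 4G" limit from this thread, in bytes


def oversized_index_files(index_dir, limit_bytes=FOUR_GB):
    """Return (file name, size in bytes) for every regular file directly in
    index_dir whose size is at or above limit_bytes, largest first.

    Hypothetical helper: it only inspects file sizes and knows nothing
    about Lucene segments or Solr cores.
    """
    hits = []
    for entry in os.scandir(index_dir):
        if entry.is_file():
            size = entry.stat().st_size
            if size >= limit_bytes:
                hits.append((entry.name, size))
    # Largest offenders first, so the worst files are easy to spot.
    return sorted(hits, key=lambda pair: pair[1], reverse=True)
```

Point it at a core's index directory (the exact path depends on your
install) and an empty result means no file is at or over the limit.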

Best,
Erick

On Mon, Jan 9, 2017 at 3:27 PM, Anshum Gupta <ans...@anshumgupta.net> wrote:
> Can you provide more information about:
> - Are you using Solr in standalone or SolrCloud mode? What version of Solr?
> - Why do you want this? Lack of disk space? Uneven distribution of data on
> shards?
> - Do you want this data together i.e. as part of a single collection?
>
> You can check out the following APIs:
> SPLITSHARD:
> https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api3
> MIGRATE:
> https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api12
>
> Among other things, make sure you have enough spare disk-space before
> trying out the SPLITSHARD API in particular.
>
> -Anshum
>
>
>
> On Mon, Jan 9, 2017 at 12:08 PM Mikhail Khludnev <m...@apache.org> wrote:
>
>> Perhaps you can copy this index into a separate location. Remove odd and
>> even docs from the former and latter indexes respectively, and then force
>> merge to a single segment in both locations separately.
>> Perhaps shard splitting in SolrCloud does something like that.
>>
>> On Mon, Jan 9, 2017 at 1:12 PM, Narsimha Reddy CHALLA <
>> chnredd...@gmail.com>
>> wrote:
>>
>> > Hi All,
>> >
>> >       My Solr server has a few large index files (say ~10G). I am looking
>> > for some help on breaking them into smaller ones (each < 4G) to satisfy
>> > my application requirements. Are there any such tools available?
>> >
>> > Appreciate your help.
>> >
>> > Thanks
>> > NRC
>> >
>>
>>
>>
>> --
>> Sincerely yours
>> Mikhail Khludnev
>>

