Hi Erick,

It's due to some past issues we observed with joins on Solr 4, which ran out of memory (OOM) when joining to large indexes after optimization/compaction. If the indexes are stored as smaller files, they fit into memory and the operations complete properly. We have also observed slow writes/commits/updates with large files. So, to minimize this risk while upgrading to Solr 6, we wanted to store the indexes in smaller files.

Thanks,
Manan Sheth
________________________________________
From: Erick Erickson <erickerick...@gmail.com>
Sent: Tuesday, January 10, 2017 5:24 AM
To: solr-user
Subject: Re: Help needed in breaking large index file into smaller ones

Why do you have a requirement that the indexes be < 4G? If it's
arbitrarily imposed why bother? Or is it a non-negotiable requirement
imposed by the platform you're on?

Because just splitting the files into a smaller set won't help you if
you then start to index into it, the merge process will just recreate
them.

You might be able to do something with the settings in
TieredMergePolicy in the first place to stop generating files > 4g..

Best,
Erick

On Mon, Jan 9, 2017 at 3:27 PM, Anshum Gupta <ans...@anshumgupta.net> wrote:
> Can you provide more information about:
> - Are you using Solr in standalone or SolrCloud mode? What version of Solr?
> - Why do you want this? Lack of disk space? Uneven distribution of data on
>   shards?
> - Do you want this data together i.e. as part of a single collection?
>
> You can check out the following APIs:
> SPLITSHARD:
> https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api3
> MIGRATE:
> https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api12
>
> Among other things, make sure you have enough spare disk-space before
> trying out the SPLITSHARD API in particular.
>
> -Anshum
>
> On Mon, Jan 9, 2017 at 12:08 PM Mikhail Khludnev <m...@apache.org> wrote:
>
>> Perhaps you can copy this index into a separate location. Remove odd and
>> even docs into former and later indexes consequently, and then force merge
>> to single segment in both locations separately.
>> Perhaps shard splitting in SolrCloud does something like that.
>>
>> On Mon, Jan 9, 2017 at 1:12 PM, Narsimha Reddy CHALLA <chnredd...@gmail.com>
>> wrote:
>>
>> > Hi All,
>> >
>> > My solr server has a few large index files (say ~10G). I am looking
>> > for some help on breaking them it into smaller ones (each < 4G) to satisfy
>> > my application requirements. Are there any such tools available?
>> >
>> > Appreciate your help.
>> >
>> > Thanks
>> > NRC
>>
>> --
>> Sincerely yours
>> Mikhail Khludnev
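
For reference, the SPLITSHARD call Anshum points to could be issued roughly as below. This is only a minimal sketch, assuming a SolrCloud node reachable at http://localhost:8983/solr and a collection called mycollection with a shard called shard1; the host, collection, and shard names are placeholders and do not come from this thread. The helper function names are also made up for illustration.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def splitshard_url(base_url, collection, shard, async_id=None):
    """Build a Collections API SPLITSHARD request URL.

    base_url, collection and shard are placeholders supplied by the caller.
    """
    params = {
        "action": "SPLITSHARD",
        "collection": collection,
        "shard": shard,
        "wt": "json",
    }
    if async_id is not None:
        # Optional: submit the split asynchronously and check on it later
        # with the REQUESTSTATUS action.
        params["async"] = async_id
    return base_url.rstrip("/") + "/admin/collections?" + urlencode(params)


def split_shard(base_url, collection, shard, async_id=None, timeout=600):
    """Send the SPLITSHARD request and return the raw response body as text."""
    url = splitshard_url(base_url, collection, shard, async_id)
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Hypothetical host, collection and shard names; only prints the URL.
    print(splitshard_url("http://localhost:8983/solr", "mycollection", "shard1"))
```

As Anshum notes, make sure there is enough spare disk space before running the split. Also keep Erick's point in mind: splitting produces more, smaller shards, but it does not cap the size of the segment files that later merges create inside each shard.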