You need free disk space equal to at least half the minimum sizes of the
collections. You might need more. We have a 23 GB collection in SolrCloud.
When we reload all the content and wait until the end to do a commit, it
gets up to 51 GB.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)
> On Aug 29, 2018, at 1:41 PM, Kudrettin Güleryüz <kudret...@gmail.com> wrote:
>
> Given the set of preferences above, I would expect the difference between
> the largest freedisk (test-43 currently) and the smallest freedisk (test-45
> currently) to be smaller than what is below. Below is the output from
> reading the diagnostics endpoint of the autoscaling API. According to this
> output, the variation between freedisk values is currently as large as
> 220 GiB. I am concerned because I cannot tell whether the variation is
> expected or due to a configuration error. Also, it would be great to keep
> track of a single disk space, rather than keeping track of 6 disk spaces,
> if possible.
>
> What policy/preferences options would you suggest exploring specifically
> for evening out freedisk across Solr nodes?
>
> {
>   "responseHeader":{
>     "status":0,
>     "QTime":284},
>   "diagnostics":{
>     "sortedNodes":[{
>         "node":"test-43:8983_solr",
>         "cores":137,
>         "freedisk":447.0913887023926,
>         "sysLoadAvg":117.0},
>       {
>         "node":"test-42:8983_solr",
>         "cores":137,
>         "freedisk":369.33697509765625,
>         "sysLoadAvg":93.0},
>       {
>         "node":"test-46:8983_solr",
>         "cores":137,
>         "freedisk":361.7615737915039,
>         "sysLoadAvg":93.0},
>       {
>         "node":"test-41:8983_solr",
>         "cores":137,
>         "freedisk":347.91234970092773,
>         "sysLoadAvg":86.0},
>       {
>         "node":"test-44:8983_solr",
>         "cores":137,
>         "freedisk":341.1301383972168,
>         "sysLoadAvg":160.0},
>       {
>         "node":"test-45:8983_solr",
>         "cores":137,
>         "freedisk":227.17399215698242,
>         "sysLoadAvg":118.0}],
>     "violations":[]},
>   "WARNING":"This response format is experimental. It is likely to change
> in the future."}
>
> On Mon, Aug 27, 2018 at 5:17 PM Kudrettin Güleryüz <kudret...@gmail.com>
> wrote:
>
>> Hi,
>>
>> We have six Solr nodes with ~1 TiB disk space on each, mounted as ext4.
>> The indexers sometimes update the collections, and create new ones if an
>> update wouldn't be faster than indexing from scratch.
>> (Up to around 5 million documents are indexed for each collection.) On
>> average there are around 130 collections on this SolrCloud. Collection
>> sizes vary from 1 GiB to 150 GiB.
>>
>> Preferences set:
>>
>> "cluster-preferences":[{
>>     "maximize":"freedisk",
>>     "precision":10},
>>   {
>>     "minimize":"cores",
>>     "precision":1},
>>   {
>>     "minimize":"sysLoadAvg",
>>     "precision":3}],
>>
>> * Is it possible to run out of disk space on one of the nodes while
>> others have plenty? I observe some are getting close to ~80% disk
>> utilization while others stay at ~60%.
>> * Would this difference be due to differences in collection index size,
>> or due to an error on my side in coming up with a useful
>> policy/preferences setup?
>>
>> Thank you
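[Editor's note: as a quick sanity check on the numbers in the thread, here is a small Python sketch that computes the freedisk spread from the diagnostics response quoted above. The `freedisk` values are copied verbatim from that response (reported in GiB); the rest of the script is illustrative, not part of any Solr API.]

```python
# Compute the gap between the largest and smallest freedisk values
# reported by Solr's autoscaling diagnostics endpoint.
# Node names and freedisk values are taken from the response quoted above.
diagnostics = {
    "sortedNodes": [
        {"node": "test-43:8983_solr", "freedisk": 447.0913887023926},
        {"node": "test-42:8983_solr", "freedisk": 369.33697509765625},
        {"node": "test-46:8983_solr", "freedisk": 361.7615737915039},
        {"node": "test-41:8983_solr", "freedisk": 347.91234970092773},
        {"node": "test-44:8983_solr", "freedisk": 341.1301383972168},
        {"node": "test-45:8983_solr", "freedisk": 227.17399215698242},
    ]
}

free = [n["freedisk"] for n in diagnostics["sortedNodes"]]
spread = max(free) - min(free)
print(f"freedisk spread: {spread:.1f} GiB")  # prints "freedisk spread: 219.9 GiB"
```

This confirms the ~220 GiB variation mentioned in the thread: test-43 has about 219.9 GiB more free disk than test-45.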