It is just as I described it. Lucene45 is not an in-memory format. It only holds 2 data structures in RAM (and these only apply to certain kinds of docvalues: sorted, sortedset, and variable-length binary). You can see this if you look at the producer's source code (the javadoc of *Format also describes in great detail what these data structures are and how they are used):

// memory-resident structures
private final Map<Integer,MonotonicBlockPackedReader> addressInstances = new HashMap<Integer,MonotonicBlockPackedReader>();
private final Map<Integer,MonotonicBlockPackedReader> ordIndexInstances = new HashMap<Integer,MonotonicBlockPackedReader>();

http://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/codecs/lucene45/Lucene45DocValuesProducer.java

The "disk" codec simply overrides these 2 data structures so that they are not in RAM either. This is extreme, as they should be very small. You really should not use this unless you have a 486 or something like that.

http://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/codecs/src/java/org/apache/lucene/codecs/diskdv/DiskDocValuesProducer.java

All-in-memory is in the memory/ package:

http://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/codecs/src/java/org/apache/lucene/codecs/memory/MemoryDocValuesFormat.java

Direct is like memory, except that it applies no compression at all:

http://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/codecs/src/java/org/apache/lucene/codecs/memory/DirectDocValuesFormat.java

On Sat, Feb 1, 2014 at 5:10 PM, Joel Bernstein <[email protected]> wrote:

> Robert,
>
> Unless I'm missing something the default docvalues format appears to
> be Lucene45 in (Solr 4.6). Is this the "Memory" format you mention, or is
> there another "Memory" docvalues format? I'm confused because I thought the
> Disk format kept certain things on disk and certain things in memory, but
> this does not appear to be the default format.
>
> Thanks,
> Joel
>
> Joel Bernstein
> Search Engineer at Heliosearch
>
>
> On Sat, Feb 1, 2014 at 12:19 PM, Tom Burton-West <[email protected]> wrote:
>
>> Thanks Shawn, Joel, and Robert,
>>
>> Shawn, thanks for mentioning the caveat of having to re-index when
>> upgrading Solr. We almost always re-index when we upgrade Solr.
>>
>> >>There is a ton of misinformation in this thread.
>>
>> I think this might be because the DocValues implementation is a moving
>> target, and that the documentation has not kept up.
>>
>> >>As of lucene 4.5, the default docvalues are disk-based
>> >>(mostly, some small stuff in ram).
>> >>You probably don't need to change anything from the defaults, unless:
>>
>> >>if you want everything in RAM, use Memory.
>> >>If you want to waste RAM, use Direct.
>> >>If you have no RAM, use Disk.
>>
>> Should I try to edit the Solr wiki (which talks about 4.2 and says the
>> default is to put everything in memory) or is the idea that the cwiki is
>> where people should look for current documentation?
>> One of the things that confused me was that the cwiki pointed to the
>> outdated Solr wiki entry on DocValues.
>>
>> I think I understand the use cases where someone would want everything in
>> RAM or everything on Disk. I'm assuming that the default (4.5) makes some
>> trade-off by putting some important data structures in RAM.
>>
>> Where should I look (maybe a JIRA issue?) to understand the use case for
>> Direct? Maybe adding a sentence to the JavaDoc for Direct explaining why
>> someone would want to use it would be useful.
>>
>> p.s. Robert, I saw your edits on the cwiki and I really appreciate that
>> with all the time you spend working on code, that you take the time to help
>> with the docs.
>>
>>
>> Tom
>>
