Robert,

OK, I didn't realize that Lucene45 was mostly a disk-based format. Thanks for the clear explanation and links.
Joel

Joel Bernstein
Search Engineer at Heliosearch

On Sat, Feb 1, 2014 at 8:25 PM, Robert Muir <[email protected]> wrote:

> It is just as I described it. Lucene45 is not an in-memory format. it only
> holds 2 datastructures in ram (which only apply to certain cases of
> docvalues instances: sorted, sortedset, and variable-length binary). You
> can see this, if you look at the producer's source code (the javadoc of
> *Format also describes in great detail what these datastructures are and
> how they are used):
>
>   // memory-resident structures
>   private final Map<Integer,MonotonicBlockPackedReader> addressInstances =
>       new HashMap<Integer,MonotonicBlockPackedReader>();
>   private final Map<Integer,MonotonicBlockPackedReader> ordIndexInstances =
>       new HashMap<Integer,MonotonicBlockPackedReader>();
>
> http://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/codecs/lucene45/Lucene45DocValuesProducer.java
>
> the "disk" codec simply overrides these 2 datastructures to not be in RAM
> either. This is extreme, as they should be very small. You really should
> not use this unless you have a 486 or something like that.
>
> http://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/codecs/src/java/org/apache/lucene/codecs/diskdv/DiskDocValuesProducer.java
>
> all-in-memory is in the memory/ package:
>
> http://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/codecs/src/java/org/apache/lucene/codecs/memory/MemoryDocValuesFormat.java
>
> direct is like memory, except applies no compression at all:
>
> http://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/codecs/src/java/org/apache/lucene/codecs/memory/DirectDocValuesFormat.java
>
> On Sat, Feb 1, 2014 at 5:10 PM, Joel Bernstein <[email protected]> wrote:
>
>> Robert,
>>
>> Unless I'm missing something the default docvalues format appears to
>> be Lucene45 in (Solr 4.6). Is this the "Memory" format you mention, or is
>> there another "Memory" docvalues format? I'm confused because I thought the
>> Disk format kept certain things on disk and certain things in memory, but
>> this does not appear to be the default format.
>>
>> Thanks,
>> Joel
>>
>> Joel Bernstein
>> Search Engineer at Heliosearch
>>
>> On Sat, Feb 1, 2014 at 12:19 PM, Tom Burton-West <[email protected]> wrote:
>>
>>> Thanks Shawn, Joel, and Robert,
>>>
>>> Shawn, thanks for mentioning the caveat of having to re-index when
>>> upgrading Solr. We almost always re-index when we upgrade Solr.
>>>
>>> >>There is a ton of misinformation in this thread.
>>> I think this might be because the DocValues implementation is a moving
>>> target, and that the documentation has not kept up.
>>>
>>> >>As of lucene 4.5, the default docvalues are disk-based (mostly, some
>>> >>small stuff in ram).
>>> >>You probably don't need to change anything from the defaults, unless:
>>> >>
>>> >>if you want everything in RAM, use Memory.
>>> >>If you want to waste RAM, use Direct.
>>> >>If you have no RAM, use Disk.
>>>
>>> Should I try to edit the Solr wiki (which talks about 4.2 and says the
>>> default is to put everything in memory) or is the idea that the cwiki is
>>> where people should look for current documentation?
>>> One of the things that confused me was that the cwiki pointed to the
>>> outdated Solr wiki entry on DocValues.
>>>
>>> I think I understand the use cases where someone would want everything
>>> in RAM or everything on Disk. I'm assuming that the default (4.5) makes
>>> some trade-off by putting some important data structures in RAM.
>>>
>>> Where should I look (maybe a JIRA issue?) to understand the use case for
>>> Direct? Maybe adding a sentence to the JavaDoc for Direct explaining why
>>> someone would want to use it would be useful.
>>>
>>> p.s. Robert, I saw your edits on the cwiki and I really appreciate that
>>> with all the time you spend working on code, that you take the time to help
>>> with the docs.
>>>
>>> Tom
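
The split Robert describes for variable-length binary values (a small address index held in RAM, the values themselves left on disk) can be sketched in plain Java. This is only a toy model under stated assumptions, not Lucene code: the class MostlyOnDiskValues and its methods are hypothetical, an uncompressed long[] stands in for the MonotonicBlockPackedReader that Lucene45 keeps per field, and the file layout is invented for illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.List;

// Hypothetical toy model: values live in a file, only their start offsets live in RAM.
final class MostlyOnDiskValues implements AutoCloseable {
    // Memory-resident: addresses[doc] is the byte offset where doc's value starts;
    // the final entry marks the end of the last value.
    private final long[] addresses;
    // Disk-resident: the concatenated value bytes, read on demand.
    private final FileChannel data;

    private MostlyOnDiskValues(long[] addresses, FileChannel data) {
        this.addresses = addresses;
        this.data = data;
    }

    // Writes all values back to back into the file and records where each one starts.
    static MostlyOnDiskValues write(Path file, List<String> values) throws IOException {
        long[] addresses = new long[values.size() + 1];
        try (FileChannel out = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            long offset = 0;
            for (int doc = 0; doc < values.size(); doc++) {
                byte[] bytes = values.get(doc).getBytes(StandardCharsets.UTF_8);
                addresses[doc] = offset;
                ByteBuffer buf = ByteBuffer.wrap(bytes);
                while (buf.hasRemaining()) {
                    out.write(buf);
                }
                offset += bytes.length;
            }
            addresses[values.size()] = offset;
        }
        return new MostlyOnDiskValues(addresses, FileChannel.open(file, StandardOpenOption.READ));
    }

    // Looks up the offsets in RAM, then reads only that value's bytes from disk.
    String get(int docId) throws IOException {
        long start = addresses[docId];
        int length = (int) (addresses[docId + 1] - start);
        ByteBuffer buf = ByteBuffer.allocate(length);
        long pos = start;
        while (buf.hasRemaining()) {
            int n = data.read(buf, pos);
            if (n < 0) {
                throw new IOException("unexpected end of file");
            }
            pos += n;
        }
        return new String(buf.array(), StandardCharsets.UTF_8);
    }

    @Override
    public void close() throws IOException {
        data.close();
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("values", ".dat");
        try (MostlyOnDiskValues dv = write(file, Arrays.asList("apple", "", "kiwi"))) {
            System.out.println(dv.get(2)); // prints "kiwi"
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

In this model the RAM cost is one long per document regardless of value size, which is the kind of "small stuff in ram" trade-off the thread refers to. Moving the addresses array into the file as well would correspond loosely to the "Disk" variant, and loading the whole file into a byte[] to the all-in-memory variants.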
