It is just as I described it. Lucene45 is not an in-memory format. It only
holds two data structures in RAM (and those apply only to certain kinds of
docvalues: sorted, sortedset, and variable-length binary). You can see this
if you look at the producer's source code (the javadoc of *Format also
describes in great detail what these data structures are and how they are
used):

  // memory-resident structures
  private final Map<Integer,MonotonicBlockPackedReader> addressInstances =
      new HashMap<Integer,MonotonicBlockPackedReader>();
  private final Map<Integer,MonotonicBlockPackedReader> ordIndexInstances =
      new HashMap<Integer,MonotonicBlockPackedReader>();


http://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/codecs/lucene45/Lucene45DocValuesProducer.java
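
To make the split between RAM and disk concrete, here is a minimal sketch of
the idea behind the address structure for variable-length binary values: the
values themselves stay in a file, and only the per-document end offsets are
kept on the heap. This is a toy, not Lucene code. The class name
AddressIndexSketch, the temp file, and the plain long[] are all assumptions
made for illustration; the real producer keeps its offsets in a
MonotonicBlockPackedReader rather than a plain array.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Toy illustration only (hypothetical class, NOT Lucene code): values live in
// a file, and only the end offsets ("addresses") are held on the heap.
final class AddressIndexSketch implements AutoCloseable {
    private final long[] endOffsets; // memory-resident, one entry per document
    private final FileChannel data;  // disk-resident, all values concatenated

    private AddressIndexSketch(long[] endOffsets, FileChannel data) {
        this.endOffsets = endOffsets;
        this.data = data;
    }

    // Writes the values back to back into a temp file and records where each ends.
    static AddressIndexSketch write(String... values) {
        try {
            Path file = Files.createTempFile("binary-dv-sketch", ".dat");
            file.toFile().deleteOnExit();
            FileChannel ch = FileChannel.open(
                file, StandardOpenOption.READ, StandardOpenOption.WRITE);
            long[] ends = new long[values.length];
            long pos = 0;
            for (int doc = 0; doc < values.length; doc++) {
                ByteBuffer buf = ByteBuffer.wrap(values[doc].getBytes(StandardCharsets.UTF_8));
                while (buf.hasRemaining()) {
                    pos += ch.write(buf, pos);
                }
                ends[doc] = pos;
            }
            return new AddressIndexSketch(ends, ch);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Finds the byte range from the in-RAM offsets, then reads only that range.
    String get(int docId) {
        long start = docId == 0 ? 0 : endOffsets[docId - 1];
        ByteBuffer buf = ByteBuffer.allocate((int) (endOffsets[docId] - start));
        try {
            long pos = start;
            while (buf.hasRemaining()) {
                int n = data.read(buf, pos);
                if (n < 0) {
                    throw new IOException("unexpected end of file");
                }
                pos += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(buf.array(), StandardCharsets.UTF_8);
    }

    @Override
    public void close() {
        try {
            data.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With AddressIndexSketch.write("lucene", "", "docvalues"), a call to get(2)
touches the heap only for two offsets and reads just nine bytes from the file.
The heap cost grows with the number of documents, not with the size of the
values.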

The "disk" codec simply overrides these two data structures so that they are
not held in RAM either. This is extreme, as they should be very small. You
really should not use it unless you have a 486 or something like that.

http://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/codecs/src/java/org/apache/lucene/codecs/diskdv/DiskDocValuesProducer.java

The all-in-memory format is in the memory/ package:

http://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/codecs/src/java/org/apache/lucene/codecs/memory/MemoryDocValuesFormat.java

Direct is like Memory, except that it applies no compression at all:

http://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/codecs/src/java/org/apache/lucene/codecs/memory/DirectDocValuesFormat.java
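
As a rough picture of what "no compression" costs, the sketch below stores
the same numbers twice: once as a plain long[] (one 64-bit slot per value,
nothing to decode on read) and once bit-packed to the width of the largest
value. Bit packing is just one simple form of compression chosen for
illustration. The class PackedVsDirectSketch and its methods are hypothetical
and are not the encodings those two formats actually use, which are described
in the linked sources.

```java
// Toy illustration only (hypothetical class, NOT Lucene code).
final class PackedVsDirectSketch {
    // Uncompressed: one 64-bit slot per value, read with a plain array access.
    static long[] direct(long[] values) {
        return values.clone();
    }

    // Bits needed for the largest value; values are assumed to be non-negative.
    static int bitsRequired(long[] values) {
        long max = 0;
        for (long v : values) {
            max = Math.max(max, v);
        }
        return Math.max(1, 64 - Long.numberOfLeadingZeros(max));
    }

    // Compressed: each value occupies exactly bitsPerValue bits.
    static long[] pack(long[] values, int bitsPerValue) {
        long totalBits = (long) values.length * bitsPerValue;
        long[] blocks = new long[(int) ((totalBits + 63) / 64)];
        for (int i = 0; i < values.length; i++) {
            long bitPos = (long) i * bitsPerValue;
            int block = (int) (bitPos >>> 6);
            int shift = (int) (bitPos & 63);
            blocks[block] |= values[i] << shift;
            if (shift + bitsPerValue > 64) {
                // The value straddles two blocks: spill the high bits over.
                blocks[block + 1] |= values[i] >>> (64 - shift);
            }
        }
        return blocks;
    }

    // Reading a packed value costs a few shifts and a mask.
    static long unpack(long[] blocks, int bitsPerValue, int index) {
        long bitPos = (long) index * bitsPerValue;
        int block = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        long value = blocks[block] >>> shift;
        if (shift + bitsPerValue > 64) {
            value |= blocks[block + 1] << (64 - shift);
        }
        long mask = bitsPerValue == 64 ? -1L : (1L << bitsPerValue) - 1;
        return value & mask;
    }
}
```

For seven values no larger than 1023, pack needs two longs where direct needs
seven. The uncompressed layout buys slightly cheaper reads at the price of
several times the heap.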

On Sat, Feb 1, 2014 at 5:10 PM, Joel Bernstein <[email protected]> wrote:

> Robert,
>
> Unless I'm missing something the default docvalues format appears to
> be Lucene45 in (Solr 4.6). Is this the "Memory" format you mention, or is
> there another "Memory" docvalues format? I'm confused because I thought the
> Disk format kept certain things on disk and certain things in memory, but
> this does not appear to be the default format.
>
> Thanks,
> Joel
>
>  Joel Bernstein
> Search Engineer at Heliosearch
>
>
> On Sat, Feb 1, 2014 at 12:19 PM, Tom Burton-West <[email protected]> wrote:
>
>> Thanks Shawn, Joel, and Robert,
>>
>> Shawn, thanks for mentioning the caveat of having to re-index when
>> upgrading Solr.  We almost always re-index when we upgrade Solr.
>>
>>
>> >>There is a ton of misinformation in this thread.
>> I think this might be because the DocValues implementation is a moving
>> target, and that the documentation has not kept up.
>>
>> >>As of lucene 4.5, the default docvalues are disk-based (mostly, some
>> >>small stuff in ram).
>> >>You probably don't need to change anything from the defaults, unless:
>>
>> >>if you want everything in RAM, use Memory.
>> >>If you want to waste RAM, use Direct.
>> >>If you have no RAM, use Disk.
>>
>> Should I try to edit the Solr wiki (which talks about 4.2 and says the
>> default is to put everything in memory), or is the idea that the cwiki is
>> where people should look for current documentation?
>> One of the things that confused me was that the cwiki pointed to the
>> outdated Solr wiki entry on DocValues.
>>
>> I think I understand the use cases where someone would want everything in
>> RAM or everything on Disk.  I'm assuming that the default (4.5) makes some
>> trade-off by putting some important data structures in RAM.
>>
>> Where should I look (maybe a JIRA issue?) to understand the use case for
>> Direct?   Maybe adding a sentence to the JavaDoc for Direct explaining why
>> someone would want to use it would be useful.
>>
>> p.s. Robert, I saw your edits on the cwiki and I really appreciate that,
>> with all the time you spend working on code, you take the time to help
>> with the docs.
>>
>>
>> Tom
>>
>
