Hi Adrien,

Thanks for the clarifications.
I have a few questions about sorting:
> 
> While you can do that, I don't recommend it. For example, if you have
> 5 fields, loading all fields from stored fields requires at most 1
> disk seek while loading all fields from doc values requires at least 5
> disk seeks for disk-based doc values.


1> I am assuming the 5 fields mentioned are sortable fields on which 
sorting is done.
In my understanding, loading stored fields takes 1 disk seek to find the file 
pointer and 1 disk seek to read all the fields.
Since a separate file is maintained for each doc values field, we get 5 
disk seeks + 1 disk seek for the file pointer.
If we have only one sortable field, which would be better? I guess there is 
no difference.
Also, I vaguely remember that there is some performance loss when sorting on 
strings in Lucene 4.0.
So, does the decision change for a String field, or otherwise depend on the 
type of the field?
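
To make the counting above concrete, here is a toy model in plain Java. It is 
not Lucene code: the class and method names are made up for illustration, and 
the costs (one seek for the file pointer, one seek per doc values field, one 
seek for all stored fields of a document) are only my assumptions from this 
thread. Real numbers will depend on the codec and on the OS cache.

```java
public class SeekModel {
    // Stored fields: all fields of a document sit together, so a single
    // seek (after the pointer lookup) is assumed to cover every field.
    public static int storedFieldSeeks(int numFields, int pointerSeeks) {
        return numFields == 0 ? 0 : pointerSeeks + 1;
    }

    // Disk-based doc values: assumed column-oriented, so each field
    // costs its own seek on top of the pointer lookup.
    public static int docValuesSeeks(int numFields, int pointerSeeks) {
        return numFields == 0 ? 0 : pointerSeeks + numFields;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 5}) {
            System.out.println(n + " field(s): stored="
                    + storedFieldSeeks(n, 1)
                    + ", docValues=" + docValuesSeeks(n, 1));
        }
    }
}
```

Under these assumptions the two approaches tie at 2 seeks for a single field, 
and differ at 5 fields (2 seeks for stored fields versus 6 for doc values).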

2> Also, in my understanding, if we need to use parser-based queries on 
doc values, we need a stored field in the doc with the same name and value as 
the doc's doc values field.
Even term queries won't work. Am I right here?

Thanks,
Arun


On 28-May-2013, at 8:31 PM, Adrien Grand <jpou...@gmail.com> wrote:

> On Tue, May 28, 2013 at 4:48 PM, Arun Kumar K <arunk...@gmail.com> wrote:
>> Hi Guys,
> 
> Hi,
> 
>> I have been trying to understand DocValues and get some hands on and have
>> observed few things.
>> 
>> I have added LongDocValuesField to the documents like:
>> doc.add(new LongDocValuesField("id",1));
>> 
>> 1> In 4.0 I saw that there are two versions of doc values:
>>     RAM-resident (using Sources.getSources()) and on-disk
>> (Sources.getDirectSources()).
>> 
>>     But in 4.2 I get LongDocValues using
>> "context.reader().getNumericDocValues(field)". Which type is this?
>>     If this is RAM-based, is there a disk-based equivalent?
> 
> Indeed, doc values have changed a lot between 4.1 and 4.2. The way doc
> values are stored now depends on the DocValuesFormat. For example, the
> default format (Lucene42DocValuesFormat) today stores data in memory
> while we also have DiskDocValuesFormat (in lucene/codecs) which stores
> data on disk.
> 
>> 2> Can DocValuesField be used for search? I couldn't. Did I miss something?
>>     "searcher.search(parser.parse("docvaluedfield:value"),100)"
> 
> Yes and no. The query parser can't deal with it, but for example, you
> could use FieldCacheRangeFilter to build a range query (potentially
> matching a single value) on top of doc values. (When a field has doc
> values, FieldCache will automatically use them instead of uninverting
> the field). While this will likely be slower for thin ranges, this
> should be very fast (probably even faster than a range query based on
> the terms dictionary) for large ranges that match many documents.
> 
>>     I am able to use it for sorting.
>>     If possible, I want to avoid having a stored field in the index with the
>> same "name" and "value" as the DocValuesField of the same
>>     document, and still perform search.
> 
> While you can do that, I don't recommend it. For example, if you have
> 5 fields, loading all fields from stored fields requires at most 1
> disk seek while loading all fields from doc values requires at least 5
> disk seeks for disk-based doc values.
> 
>> 3> I have a reader opened on a DirectoryReader with the docBaseInParent value
>> as 0 (the first document's internal ID).
>>     Even when I delete the first added document (with internal docID = 0)
>> using some query, the docBaseInParent is not
>>     updated to 1 (the next document's internal ID). I have committed the
>> writer and called forceMergeDeletes, but it's the same.
>>     I have also looked at getLiveDocs().
>> 
>>    Just curious to know the reasons for not updating the docBase?
> 
> Everything in Lucene is based on the fact that segments are immutable
> up to deletes. Starting to mutate internal data such as
> docBaseInParent would make the design much more complicated (hence
> harder to reason about, to optimize, etc.).
> 
> --
> Adrien
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
> 
