[jira] [Commented] (LUCENE-3108) Land DocValues on trunk

Simon Willnauer (JIRA) Fri, 20 May 2011 05:33:37 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-3108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036798#comment-13036798
 ]


Simon Willnauer commented on LUCENE-3108:
-----------------------------------------

bq. It is tricky... but, eg, when someone does SortField("title", 
SortField.STRING), which cache (DV or FC) should we populate?

I think we should eventually have specialized sort fields, so the caller 
chooses the cache explicitly. FCSortField / DVSortField?
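
To make the naming idea concrete, here is a minimal sketch, assuming two 
sort field types that each declare which cache backs them. None of these 
classes exist in Lucene; SketchSortField, Backing and backing() are 
hypothetical names, and FCSortField / DVSortField are only the names floated 
above.

```java
/** Hypothetical illustration only; these are not existing Lucene classes. */
abstract class SketchSortField {
    /** Which per-document value store a sort would read from. */
    enum Backing { FIELD_CACHE, DOC_VALUES }

    final String field;

    SketchSortField(String field) {
        this.field = field;
    }

    /** The store this sort field populates and reads (assumed contract). */
    abstract Backing backing();
}

/** Sort field that explicitly asks for the FieldCache (FC). */
final class FCSortField extends SketchSortField {
    FCSortField(String field) {
        super(field);
    }

    @Override
    Backing backing() {
        return Backing.FIELD_CACHE;
    }
}

/** Sort field that explicitly asks for DocValues (DV). */
final class DVSortField extends SketchSortField {
    DVSortField(String field) {
        super(field);
    }

    @Override
    Backing backing() {
        return Backing.DOC_VALUES;
    }
}

final class SortFieldSketchDemo {
    public static void main(String[] args) {
        SketchSortField[] sorts = { new FCSortField("title"), new DVSortField("title") };
        for (SketchSortField sort : sorts) {
            // The type of the sort field, not a guess by the searcher, decides the cache.
            System.out.println(sort.field + " -> " + sort.backing());
        }
    }
}
```

With this shape the ambiguity in the quoted question goes away: the same 
field name can be sorted through either store, depending on which type the 
caller constructs.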

bq. Both ValueSource and DocValues have long been used by function queries.

Suggestions welcome - nothing is fixed yet, so we should find non-conflicting 
names. Maybe we can call it o.a.l.index.columns.Columns and 
o.a.l.index.columns.ColumnsEnum / ColumnsArray (instead of Source).


bq. OK, but I think if we make a "straight longs" impl (ie no packed ints at 
all) then we can handle all long values? But in that case we'd require the app 
to pick a sentinel to mean "unset"?

Yes, I will open an issue.
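
A minimal sketch of the "straight longs" idea from the quoted question, 
assuming one plain long per document and an application-chosen sentinel for 
"unset". StraightLongColumn and its methods are hypothetical names for 
illustration, not the DocValues API on the branch.

```java
import java.util.Arrays;

/** Hypothetical "straight longs" column: one long per document, no packed ints. */
final class StraightLongColumn {
    private final long[] values;
    private final long unsetSentinel; // picked by the application

    StraightLongColumn(int maxDoc, long unsetSentinel) {
        this.values = new long[maxDoc];
        this.unsetSentinel = unsetSentinel;
        Arrays.fill(values, unsetSentinel); // every document starts out "unset"
    }

    void set(int docId, long value) {
        if (value == unsetSentinel) {
            // The sentinel is the one long value that can no longer be stored.
            throw new IllegalArgumentException("value collides with the unset sentinel: " + value);
        }
        values[docId] = value;
    }

    boolean isSet(int docId) {
        return values[docId] != unsetSentinel;
    }

    long get(int docId) {
        return values[docId];
    }
}

final class StraightLongColumnDemo {
    public static void main(String[] args) {
        StraightLongColumn column = new StraightLongColumn(3, Long.MIN_VALUE);
        column.set(0, Long.MAX_VALUE);
        column.set(2, -1L);
        for (int doc = 0; doc < 3; doc++) {
            String shown = column.isSet(doc) ? String.valueOf(column.get(doc)) : "unset";
            System.out.println(doc + " -> " + shown);
        }
    }
}
```

The trade-off is the one raised in the question: the column costs a fixed 8 
bytes per document, and the application gives up exactly one long value to 
mark missing entries.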

> Land DocValues on trunk
> -----------------------
>
>                 Key: LUCENE-3108
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3108
>             Project: Lucene - Java
>          Issue Type: Task
>          Components: core/index, core/search, core/store
>    Affects Versions: CSF branch, 4.0
>            Reporter: Simon Willnauer
>            Assignee: Simon Willnauer
>             Fix For: 4.0
>
>         Attachments: LUCENE-3108.patch
>
>
> It's time to move another feature from branch to trunk. I want to start this 
> process now while still a couple of issues remain on the branch. Currently I 
> am down to a single nocommit (javadocs on DocValues.java) and a couple of 
> testing TODOs (explicit multithreaded tests and unoptimized with deletions) 
> but I think those are not worth separate issues so we can resolve them as we 
> go. 
> The already created issues (LUCENE-3075 and LUCENE-3074) should not block 
> this process IMO; we can fix them once we are on trunk. 
> Here is a quick feature overview of what has been implemented:
>  * DocValues implementations for Ints (based on PackedInts), Float 32 / 64, 
> Bytes (fixed / variable size each in sorted, straight and deref variations)
>  * Integration into Flex-API, Codec provides a 
> PerDocConsumer->DocValuesConsumer (write) / PerDocValues->DocValues (read) 
>  * Enabled by default in all codecs except PreFlex
>  * Follows other flex-API patterns, e.g. non-segment readers throw UOE, 
> forcing MultiPerDocValues on a DirReader etc.
>  * Integration into IndexWriter, FieldInfos etc.
>  * Random-testing enabled via RandomIW - injecting random DocValues into 
> documents
>  * Basic checks in CheckIndex (which runs after each test)
>  * FieldComparator for int and float variants (Sorting, currently directly 
> integrated into SortField, this might go into a separate DocValuesSortField 
> eventually)
>  * Extended TestSort for DocValues
>  * RAM-Resident random access API plus on-disk DocValuesEnum (currently only 
> sequential access) -> Source.java / DocValuesEnum.java
>  * Extensible Cache implementation for RAM-Resident DocValues (by default 
> loaded into RAM only once and freed once the IR is closed) -> SourceCache.java
>  
> PS: Currently the RAM-resident API is named Source (Source.java), which seems 
> too generic. I think we should rename it to RamDocValues or something like 
> that; suggestions welcome!   
> Any comments, questions (rants :)) are very much appreciated.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
