[jira] Commented: (LUCENE-1673) Move TrieRange to core

Michael McCandless (JIRA) Tue, 16 Jun 2009 08:39:34 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720191#action_12720191
 ]


Michael McCandless commented on LUCENE-1673:
--------------------------------------------

{quote}
bq. We could easily add "numeric"; then FieldsReader would return a 
NumericField.

This is that baking in a specific implementation into the index format that I 
don't like.
{quote}

But we are already "baking in" the trie indexing format?  That's what
"moving trie to core" implies.  Lucene can now index numbers well,
and has committed to a certain approach (trie).

The term dict of a numeric field is trie encoded, each doc field is
indexed under a series of trie encoded tokens (w/ different
precisions), etc.
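
To make the "series of trie encoded tokens (w/ different precisions)" concrete, here is a minimal sketch in Java. It is not Lucene's actual TrieUtils encoding; the class name TrieTokenSketch, the method prefixTokens and the "shift:hex" token layout are assumptions made up for illustration. It only shows the idea that one value yields one token per precision level, and that nearby values share their lower-precision tokens.

```java
import java.util.ArrayList;
import java.util.List;

public class TrieTokenSketch {

    /**
     * Hypothetical helper (not a Lucene API): returns one token per
     * precision level for the given value. Each token records how many
     * low-order bits were dropped (the shift) and the remaining prefix.
     */
    public static List<String> prefixTokens(long value, int precisionStep) {
        List<String> tokens = new ArrayList<>();
        // Flip the sign bit so that unsigned bit order matches signed numeric order.
        long sortable = value ^ Long.MIN_VALUE;
        for (int shift = 0; shift < 64; shift += precisionStep) {
            // Drop the lowest `shift` bits; a larger shift means lower precision.
            long prefix = sortable >>> shift;
            tokens.add(shift + ":" + Long.toHexString(prefix));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Print the tokens for two neighbouring values to compare them.
        for (String token : prefixTokens(1234L, 16)) {
            System.out.println("1234 -> " + token);
        }
        for (String token : prefixTokens(1235L, 16)) {
            System.out.println("1235 -> " + token);
        }
    }
}
```

In this sketch 1234 and 1235 differ only in their full-precision (shift 0) token and share every lower-precision token, which is what lets a range query match a few coarse tokens instead of enumerating every full-precision term.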

Sure, in the future we may find improvements to how Lucene indexes
numbers, but why choose to be buggy today ("hey, how come I didn't get a
NumericField back on my doc?") for this possible future that may or
may not come?  If/when that future arrives, we can improve the index
format at that point rather than intentionally creating buggy code
today.

I do agree that retrieving a doc is already "buggy", in that various
things are lost from your index time doc (a well known issue at this
point!), but I don't think we should intentionally make that behavior
even more buggy, if we can help it...


> Move TrieRange to core
> ----------------------
>
>                 Key: LUCENE-1673
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1673
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Search
>    Affects Versions: 2.9
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: 2.9
>
>         Attachments: LUCENE-1673.patch, LUCENE-1673.patch, LUCENE-1673.patch
>
>
> TrieRange was iterated many times and seems stable now (LUCENE-1470, 
> LUCENE-1582, LUCENE-1602). There is lots of user interest, and Solr added it 
> to its default FieldTypes (SOLR-940). If possible, I want to move it to core 
> before the release of 2.9.
> Before this can be done, there are some things to think about:
> # There are now classes called LongTrieRangeQuery, IntTrieRangeQuery, how 
> should they be called in core? I would suggest leaving them as they are. On 
> the other hand, if this remains our only numeric query implementation, we 
> could call it LongRangeQuery, IntRangeQuery or NumericRangeQuery (see below; 
> there are problems with that). Same for the TokenStreams and Filters.
> # Maybe the pairs of classes for indexing and searching should be moved into 
> one class: NumericTokenStream, NumericRangeQuery, NumericRangeFilter. The 
> problem here: ctors must be able to accept int, long, double, float as range 
> parameters. For the end user, mixing these 4 types in one class is hard to 
> handle. If somebody forgets to add an L to a long, it suddenly instantiates 
> an int version of the range query, returning no results, and so on. Same 
> with other types. Maybe accept java.lang.Number as parameter (because it is 
> nullable for half-open bounds) and one enum for the type.
> # TrieUtils move into o.a.l.util? or document or?
> # Move TokenStreams into o.a.l.analysis, ShiftAttribute into 
> o.a.l.analysis.tokenattributes? Somewhere else?
> # If we rename the classes, should Solr stay with Trie (because there are 
> different impls)?
> # Maybe add a subclass of AbstractField that automatically creates these 
> TokenStreams and omits norms/tf by default, for easier addition to Document 
> instances?
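
The overload pitfall raised in the quoted description (forgetting the L on a long literal silently selects the int variant) and the proposed java.lang.Number-plus-enum alternative can be sketched as follows. All names here (RangeOverloadSketch, newRange, NumericType) are hypothetical and are not the API that was committed to Lucene; the methods return a descriptive string rather than a real query so the overload resolution is easy to see.

```java
public class RangeOverloadSketch {

    /** Hypothetical type tag, standing in for "one enum for the type". */
    public enum NumericType { INT, LONG, FLOAT, DOUBLE }

    // Overloaded variants: the literal's type alone decides which one runs.
    public static String newRange(int min, int max) {
        return "int range";
    }

    public static String newRange(long min, long max) {
        return "long range";
    }

    // Alternative: explicit type tag plus nullable Number bounds,
    // where null marks an open end of a half-open range.
    public static String newRange(NumericType type, Number min, Number max) {
        String lower = (min == null) ? "*" : min.toString();
        String upper = (max == null) ? "*" : max.toString();
        return type + " range [" + lower + " TO " + upper + "]";
    }

    public static void main(String[] args) {
        // Field was indexed as long, but the caller forgot the L suffix:
        System.out.println(newRange(10, 20));    // selects the int overload
        System.out.println(newRange(10L, 20L));  // selects the long overload
        // Explicit type, open upper bound:
        System.out.println(newRange(NumericType.LONG, 10L, null));
    }
}
```

With the explicit type tag, the encoding no longer depends on the literal's suffix, at the cost of boxing the bounds and of a possible mismatch between the tag and the Number subclass actually passed, which a real implementation would need to check.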

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


