[jira] Commented: (LUCENE-1673) Move TrieRange to core

Michael McCandless (JIRA) Wed, 17 Jun 2009 02:52:40 -0700

    [ https://issues.apache.org/jira/browse/LUCENE-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720593#action_12720593 ]


Michael McCandless commented on LUCENE-1673:
--------------------------------------------

bq. Want a convenience method for the user? TrieUtils.createDocumentField(...),
the same way sortField currently works.

I don't think this is "convenient" enough.

bq. If you'd like to have an end-to-end experience for numeric fields, build
something schema-like and put it in contrib

+1

Long (medium?) term I'd love to get to this point; I think it'd make
Lucene quite a bit more consumable.  But we shouldn't sacrifice
consumability today on the hope for that future nirvana.

You already have a nice starting point here... is that something you
could donate?

{quote}
bq. I do agree that retrieving a doc is already "buggy", in that various things 
are lost from your index time doc (a well known issue at this point!)

How on earth is it buggy?  You're working with an inverted index, you aren't 
supposed to get the original document from it in the first place. It's like 
saying a hash function is buggy because it is not reversible.
{quote}

I completely agree: you're not supposed to get the original doc back.
And the fact that Lucene's API now "pretends" you can is wrong.  We all
agree on that, and that we need to fix Lucene.

But, as things now stand, it's not yet fixed, so until it's fixed, I
don't like intentionally making it worse.

It'd be great to simply stop returning Document from IndexReader.
Wanna make a patch?  I don't think the new sheriff'd hold 2.9 for this
though ;)

{quote}
bq. "hey how come I didn't get a NumericField back on my doc?"

Perhaps a good reason to not add a NumericField.
{quote}

I think NumericField (when building your doc) is still valuable, even
if we can't return NumericField when you retrieve the doc.

OK... since adding the bit to the stored fields is controversial, I
think for 2.9, we should only add NumericField at indexing (document
creation) time.  So, we don't store a new bit in the stored fields file
and the index format is unchanged.
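
To make the trade-off concrete, here is a rough toy model of the
index-time-only proposal. Every class name below is hypothetical and is
not Lucene API; the sketch only assumes that the stored value is kept as
text with no type bit, so a retrieved doc cannot hand back a typed field.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model only: these classes are hypothetical and are not Lucene API. */
class IndexTimeNumericSketch {

    /** Index-time field that knows its value is a long. */
    static final class NumericFieldSketch {
        final String name;
        final long value;

        NumericFieldSketch(String name, long value) {
            this.name = name;
            this.value = value;
        }
    }

    /** Stand-in for the stored fields file: name -> text, with no type bit. */
    static final class StoredFieldsSketch {
        private final Map<String, String> stored = new LinkedHashMap<>();

        /** The numeric type is used here, at document creation time, then dropped. */
        void add(NumericFieldSketch field) {
            stored.put(field.name, Long.toString(field.value));
        }

        /** Retrieval can only hand back text; nothing records that it was a long. */
        Object get(String name) {
            return stored.get(name);
        }
    }

    public static void main(String[] args) {
        StoredFieldsSketch doc = new StoredFieldsSketch();
        doc.add(new NumericFieldSketch("price", 1999L));

        Object loaded = doc.get("price");
        System.out.println(loaded instanceof String);        // true
        System.out.println(Long.parseLong((String) loaded)); // 1999
    }
}
```

In this model the caller has to know out of band that "price" was a long
and parse it back, which is the "how come I didn't get a NumericField
back" surprise discussed above.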


> Move TrieRange to core
> ----------------------
>
>                 Key: LUCENE-1673
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1673
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Search
>    Affects Versions: 2.9
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: 2.9
>
>         Attachments: LUCENE-1673.patch, LUCENE-1673.patch, LUCENE-1673.patch
>
>
> TrieRange has been iterated on many times and seems stable now (LUCENE-1470, 
> LUCENE-1582, LUCENE-1602). There is a lot of user interest, Solr added it to 
> its default FieldTypes (SOLR-940), and if possible I want to move it to core 
> before the release of 2.9.
> Before this can be done, there are some things to think about:
> # There are now classes called LongTrieRangeQuery and IntTrieRangeQuery; 
> what should they be called in core? I would suggest leaving them as they 
> are. On the other hand, if this remains our only numeric query 
> implementation, we could call them LongRangeQuery and IntRangeQuery, or 
> NumericRangeQuery (see below; there are problems with that). The same goes 
> for the TokenStreams and Filters.
> # Maybe the pairs of classes for indexing and searching should be moved into 
> one class each: NumericTokenStream, NumericRangeQuery, NumericRangeFilter. 
> The problem here: ctors must be able to take int, long, double, or float 
> range parameters. For the end user, mixing these 4 types in one class is 
> hard to handle. If somebody forgets to add an L to a long, it suddenly 
> instantiates an int version of the range query, which matches no results, 
> and so on. The same applies to the other types. Maybe accept 
> java.lang.Number as the parameter type (it is nullable, which allows 
> half-open bounds) plus an enum for the type.
> # Should TrieUtils move into o.a.l.util? Or document? Or somewhere else?
> # Move TokenStreams into o.a.l.analysis, ShiftAttribute into 
> o.a.l.analysis.tokenattributes? Somewhere else?
> # If we rename the classes, should Solr stay with Trie (because there are 
> different impls)?
> # Maybe add a subclass of AbstractField that automatically creates these 
> TokenStreams and omits norms/tf by default, for easier addition to Document 
> instances?
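
The overload concern in the second item of the description can be
sketched in plain Java. This is only an illustration under assumed
names: RangeQuerySketch, overloaded, explicit and NumericType are
hypothetical stand-ins and not Lucene classes. It shows how a missing L
silently selects the int overload, and how a java.lang.Number parameter
plus an enum makes the type explicit while still allowing null for a
half-open bound.

```java
import java.util.Objects;

/** Hypothetical stand-ins; none of these names are Lucene API. */
class RangeQuerySketch {

    /** Value types a numeric range could be declared over. */
    enum NumericType { INT, LONG, FLOAT, DOUBLE }

    // Variant 1: one overload per primitive type.
    // The literal's type, not the field's indexed type, decides which one runs.
    static String overloaded(String field, int min, int max)   { return "int"; }
    static String overloaded(String field, long min, long max) { return "long"; }

    // Variant 2: explicit type tag plus nullable Number bounds.
    // A null bound means "open on that side".
    static String explicit(String field, NumericType type, Number min, Number max) {
        Objects.requireNonNull(type, "type");
        String lo = (min == null) ? "*" : describe(type, min);
        String hi = (max == null) ? "*" : describe(type, max);
        return field + ":" + type + "[" + lo + " TO " + hi + "]";
    }

    /** Converts the bound according to the declared type, not the literal's type. */
    private static String describe(NumericType type, Number n) {
        switch (type) {
            case INT:   return Integer.toString(n.intValue());
            case LONG:  return Long.toString(n.longValue());
            case FLOAT: return Float.toString(n.floatValue());
            default:    return Double.toString(n.doubleValue());
        }
    }

    public static void main(String[] args) {
        System.out.println(overloaded("size", 10, 20));   // int  (the L was forgotten)
        System.out.println(overloaded("size", 10L, 20L)); // long
        System.out.println(explicit("size", NumericType.LONG, 10, null));
        // size:LONG[10 TO *]
    }
}
```

With the second variant, a forgotten L no longer changes which kind of
query is built, because the enum carries the type rather than the
literal.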

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


