[ https://issues.apache.org/jira/browse/LUCENE-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720191#action_12720191 ]
Michael McCandless commented on LUCENE-1673:
--------------------------------------------

{quote}
bq. We could easily add "numeric"; then FieldsReader would return a NumericField.

This is that baking in a specific implementation into the index format that I don't like.
{quote}

But we are already "baking in" the trie indexing format? That's what "moving trie to core" implies. Lucene can now index numbers well, and has committed to a certain approach (trie). The term dict of a numeric field is trie encoded, each doc field is indexed under a series of trie encoded tokens (w/ different precisions), etc.

Sure, in the future we may find improvements to how Lucene indexes numbers, but why choose to be buggy today ("hey how come I didn't get a NumericField back on my doc?") for this possible future that may or may not come? If/when that future arrives, we can improve the index format at that point rather than intentionally create buggy code today?

I do agree that retrieving a doc is already "buggy", in that various things are lost from your index time doc (a well known issue at this point!), but I don't think we should intentionally make that behavior even more buggy, if we can help it...

> Move TrieRange to core
> ----------------------
>
>                 Key: LUCENE-1673
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1673
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Search
>    Affects Versions: 2.9
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: 2.9
>
>         Attachments: LUCENE-1673.patch, LUCENE-1673.patch, LUCENE-1673.patch
>
>
> TrieRange was iterated many times and seems stable now (LUCENE-1470,
> LUCENE-1582, LUCENE-1602). There is lots of user interest, Solr added it to
> its default FieldTypes (SOLR-940) and if possible I want to move it to core
> before release of 2.9.
> Before this can be done, there are some things to think about:
> # There are now classes called LongTrieRangeQuery, IntTrieRangeQuery, how
> should they be called in core? I would suggest leaving them as they are. On
> the other hand, if this remains our only numeric query implementation, we
> could call them LongRangeQuery, IntRangeQuery or NumericRangeQuery (see
> below; there are problems with this). Same for the TokenStreams and Filters.
> # Maybe the pairs of classes for indexing and searching should be moved into
> one class: NumericTokenStream, NumericRangeQuery, NumericRangeFilter. The
> problem here: ctors must be able to pass int, long, double, float as range
> parameters. For the end user, mixing these 4 types in one class is hard to
> handle. If somebody forgets to add an L to a long, it suddenly instantiates
> an int version of the range query, hitting no results and so on. Same with
> other types. Maybe accept java.lang.Number as parameter (because nullable
> for half-open bounds) and one enum for the type.
> # TrieUtils move into o.a.l.util? or document or?
> # Move TokenStreams into o.a.l.analysis, ShiftAttribute into
> o.a.l.analysis.tokenattributes? Somewhere else?
> # If we rename the classes, should Solr stay with Trie (because there are
> different impls)?
> # Maybe add a subclass of AbstractField, that automatically creates these
> TokenStreams and omits norms/tf per default for easier addition to Document
> instances?
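The second point in the quoted issue description (one class whose constructors accept int, long, float and double) can be illustrated with a small sketch. Everything in it is hypothetical and is not Lucene API: the class NumericRangeSketch, the enum NumericType, the method overloaded and the class Range are invented for illustration, and the inclusive bounds are an assumption. It only shows how Java picks an overload from the literal's type, and how a java.lang.Number parameter plus an enum would make the type explicit and allow a null bound for half-open ranges.

```java
import java.util.Objects;

/** Hypothetical sketch; none of these names are Lucene API. */
final class NumericRangeSketch {

    /** Explicit value type, so intent does not depend on a literal's suffix. */
    enum NumericType { INT, LONG, FLOAT, DOUBLE }

    // Overload pair reproducing the pitfall: the argument types pick the variant.
    static String overloaded(int min, int max) { return "int"; }
    static String overloaded(long min, long max) { return "long"; }

    /** Range with nullable bounds; a null bound means that side is open. */
    static final class Range {
        final NumericType type;
        final Number min; // null = no lower bound
        final Number max; // null = no upper bound

        Range(NumericType type, Number min, Number max) {
            this.type = Objects.requireNonNull(type, "type");
            this.min = min;
            this.max = max;
        }

        /** Inclusive containment check (inclusiveness is an assumption of this sketch). */
        boolean contains(Number value) {
            if (type == NumericType.INT || type == NumericType.LONG) {
                long v = value.longValue();
                return (min == null || min.longValue() <= v)
                    && (max == null || v <= max.longValue());
            }
            double d = value.doubleValue();
            return (min == null || min.doubleValue() <= d)
                && (max == null || d <= max.doubleValue());
        }
    }

    public static void main(String[] args) {
        // Without the L suffix the compiler selects the int overload.
        System.out.println(overloaded(1, 100));
        System.out.println(overloaded(1L, 100L));

        // With Number + enum the type is stated once, and null leaves a bound open.
        Range upTo100 = new Range(NumericType.LONG, null, 100L);
        System.out.println(upTo100.contains(42L));
    }
}
```

With the enum carrying the type, a missing L suffix no longer changes which variant is built; the trade-off is that a mismatch between the declared type and the boxed value is only detectable at runtime.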