[jira] Commented: (LUCENE-1701) Add NumericField and NumericSortField, make plain text numeric parsers public in FieldCache, move trie parsers to FieldCache

Earwin Burrfoot (JIRA) Mon, 22 Jun 2009 12:36:31 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-1701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12722775#action_12722775
 ]


Earwin Burrfoot commented on LUCENE-1701:
-----------------------------------------

>>> Design for today.
>> And spend two years deprecating and supporting today's designs after you get 
>> a better thing tomorrow. Back-compat Lucene-style and agile design aren't 
>> something that marries well.
>> donating something to Lucene means casting it in concrete.
> We can't let fear of back-compat prevent us from making progress.
My point was that strict back-compat prevents people from donating work that 
is not yet finalized. They either lose the comfortable volatility of private 
code, or have to maintain two versions of it: a private one and a Lucene one.

>> NRT seems to tread the same path, and I'm not sure it's going to win that 
>> much turnaround time after newly-introduced per-segment collection.
> I agree, per-segment collection was the bulk of the gains needed for
> NRT. This was a big change and a huge step forward in simple reopen
> turnaround.
I vote it the most frustrating (in terms of adapting your custom code) and 
most useful change of 2.9 :)

> But, not having to write & read deletes to disk, not commit (fsync)
> from writer in order to see those changes in reader should also give
> us decent gains. fsync is surprisingly and intermittently costly.
I'm not sure this can't be achieved without messing with IR/W guts so much. 
The guys from LinkedIn who drive this feature (if I'm not mistaken) had a 
prior solution with separate indexes, one on disk and one in RAM. Per-segment 
collection adds superfast reopens and a MultiReader that is far better than 
MultiSearcher, so you can finally do adequately fast searches across separate 
indexes. Do we still need to add complexity for minor performance gains?

> And this integration lets us take it a step further with LUCENE-1313,
> where recently created segments can remain in RAM and be shared with
> the reader.
RAMDirectory?

>> Some time ago I finished a first version of IR plugins, and enjoy pretty low 
>> reopen times (field/facet/filter cache warmups included). (Yes, I'm going to 
>> open an issue for plugins once they stabilize enough)
> I'm confused: I thought that effort was to make SegmentReader's
> components fully pluggable? (Not to actually change what components
> SegmentReader is creating). EG does this modularization alter the
> approach to NRT? I thought they were orthogonal.
Yes, they are orthogonal. This was yet more praise for per-segment collection 
and an example of how this approach can be extended to your custom stuff (like 
a filter cache).
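
To illustrate why per-segment caches keep reopens cheap, here is a minimal 
sketch. It is a toy model under stated assumptions, not Lucene code: the class 
name, the segment names and the warm() stand-in are all hypothetical. The only 
idea it shows is that a cache keyed by segment needs warming just for segments 
that are new since the last reopen.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model, not Lucene API: a "segment" is just an immutable named unit.
final class PerSegmentCacheSketch {
    // Entries are keyed by segment name, so they survive a reopen
    // for as long as the segment itself is still part of the index.
    private final Map<String, int[]> cache = new HashMap<>();
    private int warmups = 0;

    // Stand-in for an expensive warmup (field/facet/filter cache fill).
    private int[] warm(String segment) {
        warmups++;
        return new int[] { segment.length() };
    }

    // "Reopen": evict entries for segments that are gone,
    // then warm only the segments not seen before.
    List<int[]> reopen(List<String> segments) {
        cache.keySet().retainAll(segments);
        List<int[]> views = new ArrayList<>();
        for (String s : segments) {
            views.add(cache.computeIfAbsent(s, this::warm));
        }
        return views;
    }

    // Number of warmups performed so far.
    int warmups() {
        return warmups;
    }
}
```

Reopening with one extra segment costs one warmup instead of a warmup of the 
whole index, which is the property a custom filter cache can piggyback on.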


> Add NumericField and NumericSortField, make plain text numeric parsers public 
> in FieldCache, move trie parsers to FieldCache
> ----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1701
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1701
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Index, Search
>    Affects Versions: 2.9
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: 2.9
>
>         Attachments: LUCENE-1701-test-tag-special.patch, LUCENE-1701.patch, 
> LUCENE-1701.patch, LUCENE-1701.patch, LUCENE-1701.patch, LUCENE-1701.patch, 
> LUCENE-1701.patch, NumericField.java
>
>
> In discussions about LUCENE-1673, Mike and I wanted to add a new NumericField 
> to o.a.l.document specifically for easy indexing. An alternative would be to 
> add a NumericUtils.newXxxField() factory that creates a preconfigured Field 
> instance with norms and tf off, optionally a stored text (LUCENE-1699) and 
> the TokenStream already initialized. On the other hand, 
> NumericUtils.newXxxSortField could be moved to NumericSortField.
> Yonik and I lean toward using the factory for both; Mike leans toward 
> creating the new classes.
> Also, the parsers for string-formatted numerics are not public in FieldCache. 
> As the new SortField API (LUCENE-1478) makes it possible to supply a parser 
> at SortField instantiation, it would be good to have the static parsers in 
> FieldCache publicly available. SortField would init its member variable to 
> them (instead of NULL), making the code a lot simpler (FieldComparator has 
> these ugly null checks when retrieving values from the cache).
> Moving the Trie parsers into FieldCache as static instances as well would 
> make the code cleaner, and we would be able to hide the "hack" 
> StopFillCacheException by making it private to FieldCache (currently it's 
> public because NumericUtils is in o.a.l.util).
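
The "default parser instead of NULL" idea in the description above can be 
sketched roughly as follows. This is only an illustration under assumptions: 
IntParser, SortSpecSketch and DEFAULT_INT_PARSER are hypothetical names, not 
the actual FieldCache or SortField API.

```java
import java.util.Objects;

// Hypothetical names throughout; this is not the FieldCache or SortField API.
interface IntParser {
    int parseInt(String value);
}

final class SortSpecSketch {
    // Shared public default, analogous to exposing the plain-text parser.
    static final IntParser DEFAULT_INT_PARSER = Integer::parseInt;

    private final String field;
    private final IntParser parser;

    // No parser given: fall back to the shared default rather than null.
    SortSpecSketch(String field) {
        this(field, DEFAULT_INT_PARSER);
    }

    SortSpecSketch(String field, IntParser parser) {
        this.field = field;
        this.parser = Objects.requireNonNull(parser, "parser");
    }

    // Consumers can call the parser directly, with no null-check branch.
    int valueOf(String raw) {
        return parser.parseInt(raw);
    }

    String field() {
        return field;
    }
}
```

Because the member is never null, code that reads values through it (the 
comparator, in the description's terms) does not need a null-check branch.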

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


