[ https://issues.apache.org/jira/browse/LUCENE-2308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051512#comment-13051512 ]
Michael McCandless commented on LUCENE-2308:
--------------------------------------------

Patch looks good, thanks Nikola!

When you make the patch, can you run "svn diff" from the top-level dir? Ie, so that file paths look like lucene/src/java/org/apache/lucene/document/Field.java

A couple minor code-formatting things:

  * Please add { } around one-line ifs, eg in FieldType.toString

  * import lines go after the copyright (FieldType.java)

  * If possible please try to avoid adding "noise" to the patch, for example re-formatting javadocs (eg NumericField.java). It's fine to clean things up (add missing {}'s to existing code) as you go, but if it's simply a reformat, that just adds noise, which makes it harder to see real changes.

Other stuff:

  * The DEFAULT_TYPE for each field can be final right?

  * For FieldType, can we use direct members of the class, instead of the EnumSet? (Ie, boolean indexed, boolean stored, etc.).

The patch causes compilation errors when I run "ant compile-core", but that's expected right?

I think our immediate goal here should be to get a compilable patch with tests passing, ie the "dirt path". Then we can go back and iterate.

But, because so many tests rely on the current Document/Field API... I think in order to stage this we should make a totally new package, call it document2 for now, and create all these new classes inside there. Then, one by one we can cutover tests to use document2/*, starting with TestDemo. Eventually, once everything is cutover, we can remove document and rename document2 to document.
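
To make the review points above concrete (braces around one-line ifs, a final DEFAULT_TYPE, plain boolean members instead of an EnumSet), here is a minimal sketch. It is not the code from the patch: the class name SketchFieldType, its three members, and the toString format are all assumptions chosen for illustration.

```java
// Hypothetical sketch only; names and members are assumptions, not the patch's API.
final class SketchFieldType {
    // Direct boolean members instead of an EnumSet of flags.
    private final boolean indexed;
    private final boolean stored;
    private final boolean tokenized;

    // The shared default is declared final so it cannot be reassigned.
    static final SketchFieldType DEFAULT_TYPE = new SketchFieldType(true, false, true);

    SketchFieldType(boolean indexed, boolean stored, boolean tokenized) {
        this.indexed = indexed;
        this.stored = stored;
        this.tokenized = tokenized;
    }

    boolean indexed() {
        return indexed;
    }

    boolean stored() {
        return stored;
    }

    boolean tokenized() {
        return tokenized;
    }

    // Appends a flag name, comma-separated from any earlier ones.
    private static void append(StringBuilder sb, String flag) {
        if (sb.length() > 0) {
            sb.append(',');
        }
        sb.append(flag);
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder();
        // Every if, even a one-liner, gets braces.
        if (indexed) {
            append(sb, "indexed");
        }
        if (stored) {
            append(sb, "stored");
        }
        if (tokenized) {
            append(sb, "tokenized");
        }
        return sb.toString();
    }
}
```

With these assumed defaults, SketchFieldType.DEFAULT_TYPE.toString() yields "indexed,tokenized".
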
> Separately specify a field's type
> ---------------------------------
>
>                 Key: LUCENE-2308
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2308
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/index
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>              Labels: gsoc2011, lucene-gsoc-11, mentor
>             Fix For: 4.0
>
>         Attachments: LUCENE-2308-2.patch, LUCENE-2308.patch, LUCENE-2308.patch
>
>
> This came up from discussions on IRC. I'm summarizing here...
> Today when you make a Field to add to a document you can set things
> like index or not, stored or not, analyzed or not, details like omitTfAP,
> omitNorms, index term vectors (separately controlling
> offsets/positions), etc.
> I think we should factor these out into a new class (FieldType?).
> Then you could re-use this FieldType instance across multiple fields.
> The Field instance would still hold the actual value.
> We could then do per-field analyzers by adding a setAnalyzer on the
> FieldType, instead of the separate PerFieldAnalyzerWrapper (likewise
> for per-field codecs (with flex), where we now have
> PerFieldCodecWrapper).
> This would NOT be a schema! It's just refactoring what we already
> specify today. EG it's not serialized into the index.
> This has been discussed before, and I know Michael Busch opened a more
> ambitious (I think?) issue. I think this is a good first baby step. We could
> consider a hierarchy of FieldType (NumericFieldType, etc.) but maybe hold
> off on that for starters...
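
The split proposed in the issue description (settings factored into a reusable type object, with each Field still holding its own value) might look roughly like the sketch below. The names SharedTypeSketch and FieldSketch, and the choice of three flags, are hypothetical; nothing here is taken from the attached patches.

```java
// Hypothetical sketch of the proposed split; not the API from the patch.
final class SharedTypeSketch {
    // Per-field settings that today are passed when each Field is created.
    final boolean indexed;
    final boolean stored;
    final boolean omitNorms;

    SharedTypeSketch(boolean indexed, boolean stored, boolean omitNorms) {
        this.indexed = indexed;
        this.stored = stored;
        this.omitNorms = omitNorms;
    }
}

final class FieldSketch {
    final String name;
    final String value;           // the Field still holds the actual value
    final SharedTypeSketch type;  // the settings live in the shared type

    FieldSketch(String name, String value, SharedTypeSketch type) {
        this.name = name;
        this.value = value;
        this.type = type;
    }
}
```

Two fields can then share one type instance, for example new FieldSketch("title", "some title", t) and new FieldSketch("body", "some text", t) for a single SharedTypeSketch t. As the description stresses, this is only a refactoring of what is already specified per field; it is not a schema and nothing about the type is written to the index.
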