[ https://issues.apache.org/jira/browse/SOLR-4892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13693188#comment-13693188 ]
Hoss Man commented on SOLR-4892: -------------------------------- Steve: for the most part the patch looks good, but a couple of things jumped out at me though as seeming kind of fishy... 1) AllValuesOrNoneFieldMutatingUpdateProcessor semantics The "if (destVal == srcVal) ... do nothing" logic seems like a trap for future devs who want to subclass -- not to mention i'm not 100% certain i agree it makes sense for the subclasses you've already written. (If I send the list [new String("6.0"), new Double(7.3D), new String("4.3")] to ParseDOubleUpdateProcessor should it really reject the entire list because one of the original values is already a double?) I think it would be more straight forward to give subclasses another singleton marker object (similar to the DELETE singleton) they can return from mutateValue() to say "abort processing this list i have encountered an object i do not want". If mutateValue() returns the same object it was passed in, that should be considered a successful mutation. 2) ParsingFieldUpdateProcessor overriding processAdd Some code duplication here that worries me. If we want to make the change I suggested above for parsing strings in lists where we can and allowing type specific objects if they are already the correct type, then (if i'm thinking about this correctly) we don't need this class at all, classes like ParseFooFieldUpdateProcessor could easily just handle the CharSequence check as part of their basic logic... {code} if (val instanceof CharSequence) { val = ... attempt parse as a Foo...; } if (val instanceof Foo) return val; return SKIP_FIELD_VALUE_LIST_SINGLETON {code} But even if we want to keep the current semantics of AllValuesOrNoneFieldMutatingUpdateProcessor, then i think it would be cleaner if ParsingFieldUpdateProcessor was changed to be really simple, ala... {code} ParsingFieldUpdateProcessor extends AllValuesOrNoneFieldMutatingUpdateProcessor { public ParsingFieldUpdateProcessor(FieldNameSelector selector, UpdateRequestProcessor next) { super(selector, next); } protected final Object mutateValue(final Object srcVal) { return (srcVal instanceof CharSequence) ? parse(srcVal.toString()) : SKIP_FIELD_VALUE_LIST_SINGLETON; } protected abstract Object parse(final Object srcVal); } {code} ...and then the concrete subclasses just implement "parse" 3) either way we go with #2, that means we don't need the access modifieier change on 'selector' in FieldMutatingUpdateProcessor. 4) double check the handling of field boosts in AllValuesOrNoneFieldMutatingUpdateProcessor ... pretty sure it isn't being preserved (see FieldValueMutatingUpdateProcessor for example) 5) The SortableFooField FieldType's should also be decorated with your new interfaces. 6) Can we refactor the locale init param parsing into a helper utility class or a common factory bsae class so it's not duplicated in both ParseNumericFieldUpdateProcessorFactory and ParseDateFieldUpdateProcessorFactory ? ---- And some misc questions... 7) Why no ParseIntFieldUpdateProcessor and ParseFloatFieldUpdateProcessor ? 8) would it be simpler for users to have a single "locale" init param that accepted things like "ru" and "ru_RU" and "ru_RU_xxx" using LocaleUtils.toLocale in commons-lang? 9) we should be able to make the ParseFooFieldUpdateProcessor inner classses into static inner classes just by passing hte locale into the constructor and right? > Add field update processors to parse/convert String-typed fields to Date, > Number, and Boolean > --------------------------------------------------------------------------------------------- > > Key: SOLR-4892 > URL: https://issues.apache.org/jira/browse/SOLR-4892 > Project: Solr > Issue Type: New Feature > Components: update > Reporter: Steve Rowe > Assignee: Steve Rowe > Priority: Minor > Fix For: 4.4 > > Attachments: SOLR-4892.patch > > > Add {{FieldMutatingUpdateProcessorFactory}} subclasses > {{ParseFooUpdateProcessorFactory}}, where {{Foo}} includes {{Date}}, > {{Double}}, {{Long}}, and {{Boolean}}. > These factories will have a default selector that matches all fields that > either don’t match any schema field, or are in the schema with the > corresponding {{typeClass}}. They can optionally support a list of multiple > format specifiers. If they see a value that is not a CharSequence, or can't > parse the value using a configured format, they ignore it and leave it as is. > For multi-valued fields, these processors will not convert any values unless > all are first successfully parsed. Ordering the processors [Boolean, Long, > Double, Date] will allow e.g. values [2, 5, 8.6] to be left alone by the > Boolean and Long processors, but then converted by the Double processor. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org