I heartily agree with you, Grant, that these objects should be user-extensible. 
But removing the exception test entirely would probably be a great disservice 
to Solr users, who could spend untold hours debugging problems in schema.xml 
(e.g. misspelled or contextually inappropriate properties) without the valuable 
feedback it provides.  So to do this right, there should be a way to define 
additional properties (defined as booleans in Solr) and attributes (which can 
be string-valued).

Thinking aloud here...

For properties, something like this added to FieldProperties would allow 
user-defined global properties:

final static int USER_DEFINED = 0x00010000;  // first mask open to users (bit 16)

static int nextIndex = 16;  // bit index of USER_DEFINED

static int addPropertyType(String prop) {
    if( propertyMap.containsKey(prop) ) throw ...
    if( nextIndex > 31 ) throw ...
    int mask = 1 << nextIndex++;
    propertyMap.put(prop, mask);
    return mask;
}

This could be enabled by parsing a new <fieldProperty name="..."/> tag from 
schema.xml before any of the fieldType or field declarations.
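
As a rough, self-contained sketch of that registration step (not actual Solr 
code): the class name UserFieldProperties, the method registerFromSchema and 
the exception types are all made up for illustration, and I am assuming that 
bits 0 to 15 are reserved for the built-in properties and that the map stores 
bit masks.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical stand-in for the user-extensible part of FieldProperties.
class UserFieldProperties {
    // Assumption: bits 0..15 belong to the built-in properties.
    static final int FIRST_USER_BIT = 16;

    static final Map<String, Integer> propertyMap = new HashMap<>();
    static int nextBit = FIRST_USER_BIT;

    // Registers a new boolean property and returns its bit mask.
    static int addPropertyType(String prop) {
        if (propertyMap.containsKey(prop)) {
            throw new IllegalArgumentException("Property already defined: " + prop);
        }
        if (nextBit > 31) {
            throw new IllegalStateException("No property bits left for: " + prop);
        }
        int mask = 1 << nextBit++;
        propertyMap.put(prop, mask);
        return mask;
    }

    // Registers every <fieldProperty name="..."/> element found in the schema text.
    static void registerFromSchema(String schemaXml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(schemaXml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Cannot parse schema", e);
        }
        NodeList tags = doc.getElementsByTagName("fieldProperty");
        for (int i = 0; i < tags.getLength(); i++) {
            addPropertyType(((Element) tags.item(i)).getAttribute("name"));
        }
    }
}
```

Calling registerFromSchema on the schema text before the fieldType and field 
declarations are processed would make the new names known by the time the 
existing property parsing runs.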

For string-valued attributes, FieldType could be extended with a Set of 
user-defined names (or name/type mappings?) which would be removed from 
initArgs before the exception test.  The values could be returned by a trivial 
method

  public String getAttribute(String name) {
    return args.get(name);
  }
 
so other code could repeatedly get access to them (entries are progressively 
removed from initArgs until it is empty or an error is thrown, but args 
persists) without having to parse and store the values somewhere.
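
To make the idea concrete, here is a minimal sketch of how the exception test 
could skip declared attributes.  SketchFieldType is a hypothetical, heavily 
simplified stand-in for FieldType; the only built-in arguments it knows are 
"name" and "class", and the error message is invented.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, much-simplified stand-in for Solr's FieldType.
class SketchFieldType {
    // Global set of user-defined attribute names (assumed to be filled from schema.xml).
    static final Set<String> userAttributes = new HashSet<>();

    static void addAttributeType(String name) {
        userAttributes.add(name);
    }

    // Full copy of the configuration; kept for the lifetime of the type.
    private Map<String, String> args = new HashMap<>();

    void setArgs(Map<String, String> config) {
        args = new HashMap<>(config);
        // Working copy: entries are removed as they are consumed.
        Map<String, String> initArgs = new HashMap<>(config);
        initArgs.remove("name");   // built-in arguments would be consumed here
        initArgs.remove("class");
        // Drop the declared user attributes so they do not trip the check below.
        initArgs.keySet().removeAll(userAttributes);
        if (!initArgs.isEmpty()) {
            throw new RuntimeException("schema fieldtype: unknown args " + initArgs);
        }
    }

    public String getAttribute(String name) {
        return args.get(name);
    }
}
```

A misspelled name such as "lnag" still fails the check, so the feedback for 
schema.xml typos is preserved, while a declared "lang" stays readable through 
getAttribute.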

Simplest would be for the attribute name set to be global across all 
field-types, with a static addAttributeType method and a freestanding tag in 
schema.xml similar to the above for properties.  But one could argue for the 
set of user-defined attributes to be local to a particular fieldType and all 
fields defined from it, perhaps set from an XML attribute:

    <!-- text fields have an attribute lang defaulting to 'american' -->
    <fieldType name="text" extra="lang" lang="american" ... />

    <field name="Prenom" type="text" lang="french" ... />
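
The per-type variant could behave roughly like the sketch below.  Everything 
here is hypothetical: the class LocalAttributeType, its resolve method, and 
the assumption that "extra" may carry several comma-separated names.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of a fieldType that declares its own extra attributes.
class LocalAttributeType {
    final Set<String> extraNames = new HashSet<>();
    final Map<String, String> defaults = new HashMap<>();

    // typeArgs are the XML attributes of the <fieldType> element.
    LocalAttributeType(Map<String, String> typeArgs) {
        String extra = typeArgs.get("extra");
        if (extra != null) {
            // Assumption: several names may be listed, separated by commas.
            for (String n : extra.split(",")) {
                if (!n.trim().isEmpty()) {
                    extraNames.add(n.trim());
                }
            }
        }
        // Values given on the type itself act as defaults for its fields.
        for (String n : extraNames) {
            if (typeArgs.containsKey(n)) {
                defaults.put(n, typeArgs.get(n));
            }
        }
    }

    // Resolves an attribute for a field of this type: field value first, then type default.
    String resolve(Map<String, String> fieldArgs, String name) {
        if (!extraNames.contains(name)) {
            throw new IllegalArgumentException("Attribute not declared by this type: " + name);
        }
        String value = fieldArgs.get(name);
        return value != null ? value : defaults.get(name);
    }
}
```

With the declarations above, the "Prenom" field would resolve lang to 
"french", and any other field of type "text" would fall back to "american".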

Anyway, does this make sense and fit with what you were thinking of?

- J.J.

At 10:22 AM -0400 6/30/08, Grant Ingersoll wrote:
>Currently, FieldType throws a RuntimeException if there are any "extra" 
>properties in the configuration.  I think SchemaField does something similar.
>
>I'd like to consider not doing this.  My main case is I want to be able to 
>store semantic information about the FieldType with the FieldType.  Doing this 
>now, requires creating a whole separate object model that overlays the 
>FieldType and stores the information elsewhere (i.e. DB).  For example, say 
>you want to denote what language a given field type supports, one has to store 
>this information elsewhere, when it could easily be seen as a property of the 
>FieldType.  I think right now, people often rely on naming conventions to 
>convey this, such as text_zh or text_it or something like that and that 
>doesn't extend very well, IMO.  These new attributes would allow applications 
>to make use of richer semantics for FieldType w/o harming Solr in any way (I 
>think.)
>
>From the looks of it, FieldType has all the functionality already built in, 
>minus a few lines where the exception is thrown if there are "extra" 
>attributes.
>
>I think a similar argument can be made for SchemaField as well (and probably 
>other things like RequestHandler, etc. but "baby steps" first)
>
>Any thoughts/objections?
>
>
>-Grant
