You make a good point about the countless hours spent debugging. On the flip side, one could ask whether the Solr schema is stable enough that we should publish an XML Schema for it, which would help alleviate some of that pain.

More below...

On Jun 30, 2008, at 3:28 PM, J.J. Larrea wrote:

I heartily agree with you, Grant, that these objects should be user-extensible. But removing the exception test entirely would probably be a great disservice to Solr users, who could spend untold hours debugging problems in schema.xml (e.g. misspelled or contextually inappropriate properties) without the valuable feedback it provides. So to do this right there should be a way to define additional properties (defined as booleans in Solr) and attributes (which can be string-valued).

Thinking aloud here...

For properties, something like this added to FieldProperties would allow user-defined global properties:

final static int USER_DEFINED = 0x00010000;  // first bit reserved for user-defined properties

static int nextIndex = 16;  // bit index of USER_DEFINED

static int addPropertyType(String prop) {
   if( propertyMap.containsKey(prop) ) throw ...
   if( nextIndex > 31 ) throw ...
   int i = 1 << nextIndex++;  // bit mask for the new property
   propertyMap.put(prop, i);
   return i;
}

Which could be enabled by parsing a new <fieldProperty name="..."/> tag from schema.xml before any of the fieldType or field declarations.
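To make the idea concrete, here is a rough, self-contained sketch of such a registry. It is only an illustration under stated assumptions: UserFieldProperties, maskOf and isSet are hypothetical names, not Solr's actual FieldProperties API, and it assumes bits 16 to 31 of the int are free for user-defined properties. Unknown names are still rejected, so the feedback on misspelled properties is kept.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for Solr's FieldProperties; all names here are illustrative.
final class UserFieldProperties {
    // Assumption: bit 16 is the first bit not used by built-in properties.
    static final int USER_DEFINED = 0x00010000;

    private static final Map<String, Integer> propertyMap = new HashMap<>();
    // Next free bit index, starting at the bit position of USER_DEFINED.
    private static int nextIndex = Integer.numberOfTrailingZeros(USER_DEFINED);

    private UserFieldProperties() {}

    /** Registers a new boolean property and returns its bit mask. */
    static synchronized int addPropertyType(String prop) {
        if (propertyMap.containsKey(prop)) {
            throw new IllegalArgumentException("Property already defined: " + prop);
        }
        if (nextIndex > 31) {
            throw new IllegalStateException("No property bits left for: " + prop);
        }
        int mask = 1 << nextIndex++;
        propertyMap.put(prop, mask);
        return mask;
    }

    /** Returns the mask of a registered property; unknown names are an error. */
    static synchronized int maskOf(String prop) {
        Integer mask = propertyMap.get(prop);
        if (mask == null) {
            throw new IllegalArgumentException("Unknown property: " + prop);
        }
        return mask;
    }

    /** Tests whether the given property bit is set in a packed properties int. */
    static boolean isSet(int properties, int mask) {
        return (properties & mask) != 0;
    }
}
```

A schema loader would call addPropertyType once per <fieldProperty> tag and then use maskOf while parsing each fieldType and field, so a typo in a property name still fails fast.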

For string-valued attributes, FieldType could be extended with a Set of user-defined names (or name/type mappings?) which would be removed from initArgs before the exception test. The values could be returned by a trivial method

 public String getAttribute(String name) {
        return args.get(name);
 }

so other code could get access to them repeatedly (initArgs entries are progressively removed until none remain or an error is thrown, but args persist) without having to parse and store the value somewhere.
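A minimal sketch of how that could fit together, assuming a global set of declared attribute names. SketchFieldType is a hypothetical stand-in and not Solr's FieldType; the handling of "name" and "class" merely stands in for whatever built-in arguments the real init code consumes.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for FieldType; not Solr's actual class.
class SketchFieldType {
    // Assumption: user-defined attribute names are registered globally
    // before any field type is initialized.
    private static final Set<String> userAttributes = new HashSet<>();

    // Full argument map; kept around after init so attributes stay readable.
    private Map<String, String> args = new HashMap<>();

    static void addAttributeType(String name) {
        userAttributes.add(name);
    }

    void init(Map<String, String> config) {
        args = new HashMap<>(config);
        // Working copy: every recognized argument is removed from it.
        Map<String, String> initArgs = new HashMap<>(config);
        initArgs.remove("name");   // stand-ins for the built-in arguments
        initArgs.remove("class");
        // Declared user attributes are removed before the exception test.
        initArgs.keySet().removeAll(userAttributes);
        if (!initArgs.isEmpty()) {
            throw new RuntimeException("Unexpected init parameters: " + initArgs.keySet());
        }
    }

    public String getAttribute(String name) {
        return args.get(name);
    }
}
```

With "lang" registered via addAttributeType, a fieldType carrying lang="american" initializes cleanly and getAttribute("lang") returns the value, while a misspelled "lnag" still triggers the exception.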

Simplest would be for the attribute name set to be global across all field-types, with a static addAttributeType method and a freestanding tag in schema.xml similar to the above for properties. But one could argue for the set of user-defined attributes to be local to a particular fieldType and all fields defined from it, perhaps set from an XML attribute:

<!-- text fields have an attribute lang defaulting to 'american' -->
   <fieldType name="text" extra="lang" lang="american" ... />

This seems a bit clunky to me, syntax-wise, but the idea seems right. I suppose another option is that I could just extend the FieldType and have it look for my own attributes.

I'll have to think some more about it...



   <field name="Prenom" type="text" lang="french" ... />
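For the fieldType-local variant, the lookup might be sketched as follows. Everything here is an assumption for illustration: ExtraAttributeSketch and its methods are hypothetical, the "extra" attribute is treated as a comma-separated list of names, and the fieldType and field declarations are modeled as plain maps of their XML attributes.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper; not part of Solr.
final class ExtraAttributeSketch {
    private ExtraAttributeSketch() {}

    /** Parses extra="a,b" on a fieldType into the set of declared names (assumed syntax). */
    static Set<String> declaredExtras(Map<String, String> fieldTypeArgs) {
        Set<String> names = new HashSet<>();
        String extra = fieldTypeArgs.get("extra");
        if (extra != null) {
            for (String n : extra.split(",")) {
                String trimmed = n.trim();
                if (!trimmed.isEmpty()) {
                    names.add(trimmed);
                }
            }
        }
        return names;
    }

    /** The field's own value wins; otherwise the fieldType's value is the default. */
    static String resolve(String attr,
                          Map<String, String> fieldTypeArgs,
                          Map<String, String> fieldArgs) {
        if (!declaredExtras(fieldTypeArgs).contains(attr)) {
            throw new IllegalArgumentException("Attribute not declared in extra: " + attr);
        }
        String value = fieldArgs.get(attr);
        return value != null ? value : fieldTypeArgs.get(attr);
    }
}
```

With the declarations above, resolving "lang" for the Prenom field would give "french", and a field of type text without its own lang would fall back to "american".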

Anyway, does this make sense and fit with what you were thinking of?

- J.J.

At 10:22 AM -0400 6/30/08, Grant Ingersoll wrote:
Currently, FieldType throws a RuntimeException if there are any "extra" properties in the configuration. I think SchemaField does something similar.

I'd like to consider not doing this. My main case is I want to be able to store semantic information about the FieldType with the FieldType. Doing this now requires creating a whole separate object model that overlays the FieldType and stores the information elsewhere (e.g. in a DB). For example, if you want to denote what language a given field type supports, you have to store this information elsewhere, when it could easily be seen as a property of the FieldType. I think right now people often rely on naming conventions to convey this, such as text_zh or text_it, and that doesn't extend very well, IMO. These new attributes would allow applications to make use of richer semantics for FieldType w/o harming Solr in any way (I think).

From the looks of it, FieldType has all the functionality already built in, minus a few lines where the exception is thrown if there are "extra" attributes.

I think a similar argument can be made for SchemaField as well (and probably other things like RequestHandler, etc., but "baby steps" first).

Any thoughts/objections?


-Grant


--------------------------
Grant Ingersoll
http://www.lucidimagination.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ






