Hi Hoss,

> : <fieldType name="latlon" type="LatLonFieldType" pattern="location__*"/>
> : <fieldType name="latlon_home" type="LatLonFieldType" pattern="location_home_*"/>
> : <fieldType name="latlon_work" type="LatLonFieldType" pattern="location_work_*"/>
> :
> : <field name="location" type="latlon"/>
> : <field name="location_home" type="latlon_home"/>
> : <field name="location_work" type="latlon_work"/>
>
> I'm not really understanding the value of an approach like that. For
> starters, what Lucene field names would ultimately be created in those
> examples?

The first field would be named location__location. The second field would be
named location_home_location_home. The third field would be named
location_work_location_work.

> And if i also added...
>
>   <field name="other_location" type="latlon"/>
>   <dynamicField name="*_dynamic_location" type="latlon"/>
>
> ...then what field names would be created under the covers?

In general, it would be
FieldType#getPattern().stripOffEndRegexStarStuff() + Field#getName().

> : I think it makes more sense to define the heterogeneity at the fieldType
> : level because:
> :
> : (a) it's a bit more consistent with the existing solr schema examples,
> : where many of the field types differ only in configuration (e.g., ints
> : and tints, which are both solr.TrieIntField's, date and tdate, both
> : instances of solr.TrieDateField, with different configuration, etc.)
> :
> : (b) isolation of change: <fieldType> defs will change less often than
> : <field> defs, where names and indexed/stored/etc. debugging are likely
> : to occur more frequently
>
> ...this just feels wrong to me ... i can't really explain why. It seems
> like you are suggesting that every <field/> declaration would need a
> one-to-one correspondence with a unique <fieldType/> declaration in order
> to prevent field name collisions, which sounds sketchy enough ... but i'm
> also not fond of the idea that a person editing the schema can't just look
> at the <field/> and <dynamicField/> names to ensure that they understand
> what underlying fields are being created (so they don't inadvertently add
> a new one that collides) ... now they also have to look at the "pattern"
> attribute of every <fieldType/> that is a poly field.

Well, if this feels wrong to you, then I think the schema.xml file that ships
with Solr should feel wrong as well, because it uses the exact same pattern
for defining field type variations.
That is, the differences between the FieldType representations for ints and
tints are not stored as variations on the SchemaField definition itself; they
are stored as variations on the FieldTypes (e.g., a different precisionStep:
0 for int versus 8 for tint). Based on what you are proposing, why isn't
precisionStep an attribute on <field>, rather than on <fieldType>, in those
examples?

> letting <dynamicField/> drive everything just seems a *lot* simpler ...
> both as far as implementation, and as far as maintaining the schema.

Possibly. It's also a lot less traceable. It's implicit versus explicit, and
I'm not sure that leads to simplicity in the end.

> : I don't think the above hybrid approach will lead to anything other than
> : confusion, as you indicated above. Let's stick to the pattern defs at
> : the <fieldType> level, and then let the fieldType handle the internal
> : "dynamicity" with e.g., a dynamicField, and then notify the schema user
>
> From the standpoint of reading a schema.xml file, the approach you're
> describing of a pattern attribute on <fieldType/> declarations actually
> seems more confusing than the strawman suggestion i made of a pattern
> attribute on <field> ... even without understanding what concrete fields
> you are suggesting would be created with a configuration like that, it
> still increases the number of places you have to look to see what field
> names are getting created.

How so? In actuality, it reduces it. Instead of having pattern definitions on
fields (of which there are likely to be more), you have them on field types.

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.mattm...@jpl.nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
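
For reference, the naming rule described in this message (the fieldType
pattern with its trailing wildcard stripped, followed by the field name) can
be sketched in a few lines of Java. This is only an illustration under stated
assumptions: the class and method names below are hypothetical and are not
part of Solr, and "stripOffEndRegexStarStuff()" in the message is shorthand
rather than a real API. The sketch assumes the pattern ends in a single "*".

```java
// Hypothetical helper, not a Solr class: illustrates the proposed rule
// "pattern minus trailing wildcard" + "declared field name".
class PolyFieldNames {

    // Drops one trailing '*' from a fieldType pattern, if there is one.
    static String stripTrailingWildcard(String pattern) {
        return pattern.endsWith("*")
                ? pattern.substring(0, pattern.length() - 1)
                : pattern;
    }

    // Builds the underlying field name from the type's pattern and the
    // name given in the <field/> declaration.
    static String subFieldName(String typePattern, String fieldName) {
        return stripTrailingWildcard(typePattern) + fieldName;
    }

    public static void main(String[] args) {
        System.out.println(subFieldName("location__*", "location"));
        System.out.println(subFieldName("location_home_*", "location_home"));
        System.out.println(subFieldName("location_work_*", "location_work"));
        // A further field of type latlon, such as other_location:
        System.out.println(subFieldName("location__*", "other_location"));
    }
}
```

Under this rule, a field named other_location of type latlon would map to
location__other_location, so two fields sharing one poly fieldType get
distinct underlying names as long as their declared names differ.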