Re: Feature: Solr implicitly defined field types?

Jörn Franke Sun, 30 Dec 2018 01:54:17 -0800

Hi David,

I now get the idea, and yes, this makes sense. It would, though, require some 
tutorial or best practices. E.g. overriding a platform data type may not make 
much sense: it may confuse developers new to an existing project who know 
Solr but then get a platform type that does not have the default behavior.


Could platform types deal with different languages? E.g. dates do not seem to 
be a problem, because Solr expects only one specific date format, so values 
need to be converted beforehand anyway (maybe that conversion could also be 
part of a platform type), but decimals and Boolean values are written 
differently in some languages.
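
As a rough illustration of the conversion issue, here is a minimal Python 
sketch that normalizes language-specific decimals and Boolean words into 
plain canonical strings before indexing. The lookup tables and function names 
are hypothetical and not part of Solr; real data would need more cases.

```python
from decimal import Decimal

# Hypothetical per-language conventions (assumption, not a Solr feature).
DECIMAL_FORMATS = {
    "en": {"group": ",", "decimal": "."},
    "de": {"group": ".", "decimal": ","},
}
BOOLEAN_WORDS = {
    "en": {"true": True, "yes": True, "false": False, "no": False},
    "de": {"wahr": True, "ja": True, "falsch": False, "nein": False},
}


def normalize_decimal(text, lang):
    """Turn a language-specific decimal string into a '.'-separated one."""
    fmt = DECIMAL_FORMATS[lang]
    # Drop grouping separators first, then switch the decimal mark to '.'.
    cleaned = text.strip().replace(fmt["group"], "").replace(fmt["decimal"], ".")
    # Decimal() raises InvalidOperation if the result is still not a number.
    return str(Decimal(cleaned))


def normalize_boolean(text, lang):
    """Map a language-specific Boolean word to 'true' or 'false'."""
    # Unknown words raise KeyError instead of being guessed.
    value = BOOLEAN_WORDS[lang][text.strip().lower()]
    return "true" if value else "false"
```

For example, normalize_decimal("1.234,56", "de") and 
normalize_decimal("1,234.56", "en") both yield "1234.56".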

> On 30.12.2018 at 07:01, David Smiley <david.w.smi...@gmail.com> wrote:
> 
> Thanks for your thoughtful response Jörn!
> ...
>> On Sat, Dec 29, 2018 at 4:14 AM Jörn Franke <jornfra...@gmail.com> wrote:
>> I think it is a good idea, but I see some potential complexity for 
>> “deployment” of collections. For instance, in environments where Solr is 
>> used as a shared platform amongst several stakeholders, every time you 
>> deploy/modify a collection you need to take care that the platform types 
>> exist. If one exists in the test environment, then I need to make sure that it 
>> exists in acceptance/production as well. The problem is that the platform 
>> type could have been defined by somebody else who has not yet (e.g. due to 
>> project/sprint delays) updated the other environments. Another issue is 
>> if I move to another Solr cluster in the same environment. Then, I have to 
>> make sure that all platform types move with me. 
> 
> RE "the platform type could have been defined by somebody else":  I'm not 
> imagining it'd be configurable, thus the "somebody else" is the Solr 
> project/committers.
> 
> Otherwise, I think I get your point, but perhaps I don't.  It's the same 
> point for any use of some new feature of Solr.  If you use some new feature, 
> you have to take care that all Solr instances you deploy your configuration 
> to can handle that new feature.  That's a fairly generic point that would 
> apply to just about anything in Solr.
>  
>> A (minor) issue is that platform types may change (for whatever reason), and 
>> then potentially all collections have to be reindexed, or we end up with 
>> different versions of the same platform type, which does not make things easier.
> 
> Yes it's possible.  Though I think that point is apart from the feature I 
> propose.  You're saying that you might want to use an "int" field and then 
> one day realize you want some newer/better definition of what an "int" is 
> (e.g. trie -> points).  Sure.  That's true whether the field type is explicit 
> or implicit.  There's nothing stopping you from explicitly defining the field 
> type if you want to; the names would not be reserved. If you want to stick 
> with your current index running the new Solr version, then you would keep 
> luceneMatchVersion what it was, which would effectively retain the 
> interpretation of the implicit field types.
>  
>> Currently we have all our schema definitions in a version management system 
>> (we use the Schema API but the JSON requests are out there) so that projects 
>> can take inspiration from each other. Needless to say, careful type 
>> engineering also requires some documentation of the technical design and may 
>> indeed be very collection-specific.
>> 
>> Another issue could be that a platform type may also imply a certain 
>> platform solrconfig.xml (eg lib directive etc). 
> 
> I'm imagining platform types would be basic primitive types (int, boolean, 
> etc. and some special situations like in the issue I referenced).  They would 
> not depend on contrib libs... though I could imagine one day an evolution of 
> this in which a contrib could somehow auto-add implicit field types.
>  
>> I am not yet sure what the exact benefits are of referring to types of other 
>> collections in the Solr runtime itself instead of having a version system 
>> and letting projects decide if they want to adapt types of other 
>> collections, but maybe I am overlooking something here.
> 
> The notion of implicit field types is not a cross-config (cross-collection) 
> thing.  Implicit field types are nothing more than built-in shortcuts.
>  
> I recall one of my very early observations of Solr's schema was of surprise 
> to see primitive types defined in the schema.  Consider in SQL DDL statements 
> that refer to varchar and such.  Your DDL doesn't need to define what a 
> varchar is!
> 
> Happy New Year,
> ~ David
> 
>>> On 28.12.2018 at 17:36, David Smiley <david.w.smi...@gmail.com> wrote:
>>> 
>>> While working on https://issues.apache.org/jira/browse/SOLR-12768 it 
>>> occurred to me that it would be nice if Solr had implicitly defined field 
>>> types.  This would allow you to define a field in your schema that refers 
>>> to a type that is not also in your schema -- at least not explicitly (need 
>>> not explicitly be put in your schema.xml if classic, or need not be passed 
>>> to schema manipulation API if you use that).  The idea would be that these 
>>> types would be Solr platform provided field types that need not be defined 
>>> by you.  
>>> 
>>> There are multiple ways this loose idea might be conceived / imagined into 
>>> a concrete proposal.  
>>> 
>>> (A) The main idea I'm kicking around right now is that Solr would _not_ 
>>> throw an error at the moment of reading your field definition that it 
>>> doesn't see your type... instead it would see it's a platform type (via 
>>> some built-in hard-coded registry) and then register that type on the fly.  
>>> So if you were to read the schema then you'd see it.  In this way, it's 
>>> kind of a shortcut.  Platform field types that you don't actually refer to 
>>> will never end up being put into your schema.
>>> 
>>> (B) A schema could pre-initialize with the platform/implicit types.  This 
>>> is the simplest idea but I don't like it because you may not even need some 
>>> of these types.  I'm not going to go down this path now but wanted to 
>>> mention it.
>>> 
>>> I'm exploring (A) right now... I'm hoping to do this for at least a 
>>> "_nest_path_"  field in support of nested documents in 8.0, but conceivably 
>>> the idea would be expanded to lots of things in our base schema right now 
>>> (int, str, etc.)
>>> -- 
>>> Lucene/Solr Search Committer (PMC), Developer, Author, Speaker
>>> LinkedIn: http://linkedin.com/in/davidwsmiley | Book: 
>>> http://www.solrenterprisesearchserver.com
> -- 
> Lucene/Solr Search Committer (PMC), Developer, Author, Speaker
> LinkedIn: http://linkedin.com/in/davidwsmiley | Book: 
> http://www.solrenterprisesearchserver.com
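
Idea (A) from the original proposal, registering a platform type on the fly 
the first time a field refers to it, can be sketched roughly as follows. This 
is a toy Python model, not Solr code: the Schema class, the PLATFORM_TYPES 
registry, and its contents are assumptions made purely for illustration.

```python
# Hypothetical built-in registry of platform types (illustrative only).
PLATFORM_TYPES = {
    "int": {"class": "solr.IntPointField"},
    "boolean": {"class": "solr.BoolField"},
}


class Schema:
    """Toy schema model: explicit field types plus lazily added platform types."""

    def __init__(self, field_types=None):
        self.field_types = dict(field_types or {})
        self.fields = {}

    def add_field(self, name, type_name):
        if type_name not in self.field_types:
            if type_name not in PLATFORM_TYPES:
                # Neither explicitly defined nor a known platform type.
                raise ValueError(f"unknown field type: {type_name}")
            # Register the platform type on the fly so that reading the
            # schema back shows it like any explicitly defined type.
            self.field_types[type_name] = dict(PLATFORM_TYPES[type_name])
        self.fields[name] = type_name
```

In this sketch an explicitly defined type with the same name takes precedence 
(the names are not reserved), and platform types that no field refers to 
never end up in the schema.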
