What if a system schema were loaded implicitly at startup? Then, if a new schema is loaded and a type definition is missing, it is copied - at that time - into the specific schema. So, on the first rewrite, those types - and only the ones actually used - will be written out.
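
To make the copy-on-load idea concrete, here is a rough sketch in Python. It is purely illustrative and is not Solr code: `SYSTEM_TYPES`, `load_schema`, the dictionary layout, and the class strings are all made up for the example.

```python
import copy

# Hypothetical "system schema": field types the platform ships with.
# Names and class strings are placeholders for illustration only.
SYSTEM_TYPES = {
    "int": {"name": "int", "class": "example.IntType"},
    "boolean": {"name": "boolean", "class": "example.BoolType"},
    "_nest_path_": {"name": "_nest_path_", "class": "example.NestPathType"},
}


def load_schema(schema, system_types=SYSTEM_TYPES):
    """Return a copy of `schema` in which every referenced-but-missing
    field type has been copied in from the system schema.

    Types that are neither explicit nor in the system schema still fail.
    """
    resolved = copy.deepcopy(schema)
    types = resolved.setdefault("fieldTypes", {})
    for field in resolved.get("fields", []):
        type_name = field["type"]
        if type_name in types:
            continue  # an explicit definition always wins
        if type_name not in system_types:
            raise ValueError(f"unknown field type: {type_name!r}")
        # Copy at load time, so later changes to the system schema
        # do not silently alter this collection's schema.
        types[type_name] = copy.deepcopy(system_types[type_name])
    return resolved


schema = {
    "fields": [
        {"name": "id", "type": "string"},
        {"name": "count", "type": "int"},
    ],
    "fieldTypes": {"string": {"name": "string", "class": "example.StrType"}},
}
resolved = load_schema(schema)
print(sorted(resolved["fieldTypes"]))  # ['int', 'string']
```

Only `int` is copied in, because it is the only system type a field refers to; `boolean` and `_nest_path_` stay out of the written schema until something uses them.
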
This allows us to version the system types the same way we version a normal schema.

I agree with Gus that hidden configuration causes all sorts of challenges. And - for tooling purposes - there definitely needs to be a way to get all definitions: explicit and implicit, used and merely available. That also points towards something that already has a self-describing mechanism available (like the Schema API).

Regards,
   Alex.

On Fri, 4 Jan 2019 at 10:45, David Smiley <david.w.smi...@gmail.com> wrote:
>
> I'm thinking this feature would be used conservatively -- and thus just
> primitive types that wouldn't have an interesting configuration to them, or
> for something you are really not expected to change (the nest path of nested
> docs). So you wouldn't feel you had to go read the docs. The schema might
> even have a comment to mention a list of implicit field types (a one-liner
> comma delimited list).
>
> On Fri, Jan 4, 2019 at 10:34 AM Gus Heck <gus.h...@gmail.com> wrote:
>>
>> I'm perhaps slightly conservative with respect to configuration, but I'm not
>> fond of hidden configuration that I can't see. What I don't like is looking
>> at a config file and not seeing the full story. That means I have to read
>> the config and ALSO go read some part of the documentation that I've failed
>> to memorize, and probably need to google to find to be fully aware of what's
>> going on.... (and no I didn't like it when some standard stuff disappeared
>> from solrconfig.xml a while back either). Small changes of course seem
>> reasonable, but the further we drift into implicit things, especially if we
>> get a collection of several implicit things described in various disparate
>> parts of the manual the more cryptic the system becomes. That's my opinion,
>> YMMV.
>>
>> -Gus
>>
>> On Thu, Jan 3, 2019 at 2:57 PM David Smiley <david.w.smi...@gmail.com> wrote:
>>>
>>> Broadly, you refer to "locale" issues.
>>> Solr's way of dealing with this
>>> today is with optional & configurable use of URPs. The schema-less /
>>> data-driven mode has some of these enabled; you can see it in the
>>> solrconfig.xml including many date formats. You can look into that for
>>> further info if you like. The primitive field types are not locale
>>> sensitive.
>>>
>>> Update: It's looking like 8.0 will only employ this implicit field type
>>> mechanism for _nest_path_ which probably won't be in the default schema.
>>> Assuming it isn't, then it'll only be documented in the context of this
>>> particular feature. It'd be nice to see the scope of fields expanded and
>>> at that juncture it could/should be more broadly documented. That can wait
>>> until people have energy to do it.
>>>
>>> On Sun, Dec 30, 2018 at 4:54 AM Jörn Franke <jornfra...@gmail.com> wrote:
>>>>
>>>> Hi David,
>>>>
>>>> I now get the idea and yes this makes sense. It would require though some
>>>> tutorial or best practices, eg overriding a platform data type may make
>>>> not so much sense - it may confuse new developers in an existing project
>>>> that know Solr, but then get a platform type that has not the default
>>>> behavior.
>>>>
>>>> Could you deal with different languages in platform types? Eg for dates it
>>>> does not seem a problem, because Solr expects only one specific type of
>>>> date that needs to be somehow converted beforehand (maybe that conversion
>>>> could be also part of a platform type), but decimals are different in some
>>>> languages or Boolean values.
>>>>
>>>> On 30.12.2018 at 07:01, David Smiley <david.w.smi...@gmail.com> wrote:
>>>>
>>>> Thanks for your thoughtful response Jörn!
>>>> ...
>>>> On Sat, Dec 29, 2018 at 4:14 AM Jörn Franke <jornfra...@gmail.com> wrote:
>>>>>
>>>>> I think it is a good idea, but I see some potential complexity for
>>>>> “deployment” of collections.
>>>>> For instance, in environments where Solr is
>>>>> used as a shared platform amongst several stakeholders, every time you
>>>>> deploy/modify a collection you need to take care that the platform types
>>>>> exist. If it exists in the Test environment then I need to make sure that
>>>>> it exists as well in acceptance/production. The problem is that the
>>>>> platform type could have been defined by somebody else who has not yet
>>>>> (eg due to project/sprint delays) updated the other environments.
>>>>> Another issue is if I move to another Solr cluster in the same
>>>>> environment. Then, I have to make sure that all platform types move with
>>>>> me.
>>>>
>>>>
>>>> RE "the platform type could have been defined by somebody else": I'm not
>>>> imagining it'd be configurable, thus the "somebody else" is the Solr
>>>> project/committers.
>>>>
>>>> Otherwise, I think I get your point, but perhaps I don't. It's the same
>>>> point for any use of some new feature of Solr. If you use some new
>>>> feature, you have to take care that all Solr instances you deploy your
>>>> configuration to can handle that new feature. That's a fairly generic
>>>> point that would apply to just about anything in Solr.
>>>>
>>>>>
>>>>> A (minor) issue is that platform types may change (for whatever reasons)
>>>>> and that then potentially all collections have to be reindexed or we have
>>>>> different versions of the same platform type making things not easier.
>>>>
>>>>
>>>> Yes it's possible. Though I think that point is apart from the feature I
>>>> propose. You're saying that you might want to use an "int" field and then
>>>> one day realize you want some newer/better definition of what an "int" is
>>>> (e.g. trie -> points). Sure. That's true whether the field type is
>>>> explicit or implicit. There's nothing stopping you from explicitly
>>>> defining the field type if you want to; the names would not be reserved.
>>>> If you want to stick with your current index running the new Solr version,
>>>> then you would keep luceneMatchVersion what it was, which would
>>>> effectively retain the interpretation of the implicit field types.
>>>>
>>>>>
>>>>> Currently we have all our Schema definitions in a version management
>>>>> system (we use the Schema API but the JSON requests are out there) so
>>>>> that projects can inspire from each other. Needless to say, that careful
>>>>> type engineering requires also some documentation on technical design and
>>>>> may be indeed very Collection specific.
>>>>>
>>>>> Another issue could be that a platform type may also imply a certain
>>>>> platform solrconfig.xml (eg lib directive etc).
>>>>
>>>>
>>>> I'm imagining platform types would be basic primitive types (int, boolean,
>>>> etc. and some special situations like in the issue I referenced). They
>>>> would not depend on contrib libs... though I could imagine one day an
>>>> evolution of this in which a contrib could somehow auto-add implicit field
>>>> types.
>>>>
>>>>>
>>>>> I am not sure yet what are the exact benefits of referring to types of
>>>>> other collections in the Solr runtime itself instead of having a version
>>>>> system and letting projects decide if they want to adapt types of other
>>>>> collections, but maybe I am overlooking something here.
>>>>
>>>>
>>>> The notion of implicit field types is not a cross-config
>>>> (cross-collection) thing. Implicit field types are nothing more than
>>>> built-in shortcuts.
>>>>
>>>> I recall one of my very early observations of Solr's schema was of
>>>> surprise to see primitive types defined in the schema. Consider in SQL
>>>> DDL statements that refer to varchar and such. Your DDL doesn't need to
>>>> define what a varchar is!
>>>>
>>>> Happy New Year,
>>>> ~ David
>>>>
>>>>> On 28.12.2018 at 17:36, David Smiley <david.w.smi...@gmail.com> wrote:
>>>>>
>>>>> While working on https://issues.apache.org/jira/browse/SOLR-12768 it
>>>>> occurred to me that it would be nice if Solr had implicitly defined field
>>>>> types. This would allow you to define a field in your schema that refers
>>>>> to a type that is not also in your schema -- at least not explicitly
>>>>> (need not explicitly be put in your schema.xml if classic, or need not be
>>>>> passed to schema manipulation API if you use that). The idea would be
>>>>> that these types would be Solr platform provided field types that need
>>>>> not be defined by you.
>>>>>
>>>>> There are multiple ways this loose idea might be conceived / imagined
>>>>> into a concrete proposal.
>>>>>
>>>>> (A) The main idea I'm kicking around right now is that Solr would _not_
>>>>> throw an error at the moment of reading your field definition that it
>>>>> doesn't see your type... instead it would see it's a platform type (via
>>>>> some built-in hard-coded registry) and then register that type on the
>>>>> fly. So if you were to read the schema then you'd see it. In this way,
>>>>> it's kind of a shortcut. Platform field types that you don't actually
>>>>> refer to will never end up being put into your schema.
>>>>>
>>>>> (B) A schema could pre-initialize with the platform/implicit types. This
>>>>> is the simplest idea but I don't like it because you may not even need
>>>>> some of these types. I'm not going to go down this path now but wanted
>>>>> to mention it.
>>>>>
>>>>> I'm exploring (A) right now... I'm hoping to do this for at least a
>>>>> "_nest_path_" field in support of nested documents in 8.0, but
>>>>> conceivably the idea would be expanded to lots of things in our base
>>>>> schema right now (int, str, etc.)
>>>>> --
>>>>> Lucene/Solr Search Committer (PMC), Developer, Author, Speaker
>>>>> LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
>>>>> http://www.solrenterprisesearchserver.com
>>
>> --
>> http://www.the111shift.com

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org