On Sun, Nov 11, 2012 at 3:33 AM, Robert Muir <rcm...@gmail.com> wrote:
> I am guessing at times people are lazy about schema definition. But, I think
> with lucene 4 stats we can detect if a field is actually single valued...
> Something like terms.size == terms.doccount == terms.sumdocfreq. I have to
> think about it a bit, maybe it's even simpler than this? Anyway, this could
> be used instead of actual schema def to just build a fieldcache instead of
> uninverted field I think... Should be a simple opto but maybe potent...
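
The check suggested in the quote can be tried out on a toy in-memory index. This is only a sketch: the three statistics are computed by hand and named after Lucene 4's Terms.size(), Terms.getDocCount() and Terms.getSumDocFreq(), and the helper names (field_stats, looks_single_valued) are hypothetical. It assumes an untokenized field, where each value becomes exactly one term. Under that assumption sumDocFreq == docCount alone seems to be enough, which may be the "even simpler" form; also requiring size == docCount would only match fields whose values are all unique.

```python
from collections import defaultdict


def field_stats(docs):
    """Return (size, doc_count, sum_doc_freq) for one field.

    docs is a list with one entry per document; each entry is the list of
    values that document has for the field (empty list = field absent).
    """
    postings = defaultdict(set)  # term -> set of doc ids containing it
    for doc_id, values in enumerate(docs):
        for value in values:
            postings[value].add(doc_id)

    size = len(postings)                                  # unique terms
    docs_with_field = set().union(*postings.values()) if postings else set()
    doc_count = len(docs_with_field)                      # docs with >= 1 term
    sum_doc_freq = sum(len(p) for p in postings.values()) # sum of docFreq
    return size, doc_count, sum_doc_freq


def looks_single_valued(docs):
    """True if every document that has the field has exactly one distinct term."""
    _, doc_count, sum_doc_freq = field_stats(docs)
    return sum_doc_freq == doc_count


# Each doc has at most one value, but values repeat across docs.
single = [["red"], ["blue"], ["red"], []]
# The first doc carries two values.
multi = [["red", "blue"], ["red"]]
```

Here looks_single_valued(single) is true even though size (2) differs from docCount (3), while multi is rejected because its sumDocFreq (3) exceeds its docCount (2). A document that repeats the same value twice still counts as one term, so the check would treat it as single valued.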

Funny you should mention this now. I was thinking exactly the same thing on
the flight home from ApacheCon!

This "detect single-valued" idea also has implications for things other than
faceting. As you say, people can be lazy about the schema definition, and
having things "just work" is a good thing.

I've thought about a more flexible field that acts like a single-valued field
when you use it like that, and like a multi-valued field otherwise. Responses
won't quite be back compatible though, since multiValued fields holding a
single value currently look like "foo":["single_value"] instead of
"foo":"single_value".

Perhaps we could add something like multiValued=flexible (and switch to that
by default), while retaining back compat for multiValued=true/false. Either
that, or bump the "version" of the schema or response.

This is actually pretty important if we ever want to do more "schema-less"
(i.e. type guessing based on input), since it allows us to guess only the type
and not have to figure out multiValued as well. It could lower the number of
dynamic field definitions necessary and make choosing the correct one simpler.

-Yonik
http://lucidworks.com
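
To make the response-format concern concrete, here is a rough sketch of how the three settings could render a field. multiValued=flexible is only a proposal in this message, not an existing option, and render_field is a hypothetical helper written for illustration, not Solr code.

```python
import json


def render_field(values, multi_valued="flexible"):
    """Pick the response shape for one field's stored values.

    multi_valued=True   -> always a list (existing multiValued=true shape)
    multi_valued=False  -> always a scalar; more than one value is an error
    multi_valued="flexible" (hypothetical) -> scalar for one value, else list
    """
    values = list(values)
    if multi_valued is True:
        return values
    if multi_valued is False:
        if len(values) != 1:
            raise ValueError("single-valued field needs exactly one value")
        return values[0]
    # Proposed flexible mode: the shape follows the data.
    return values[0] if len(values) == 1 else values


as_multi = json.dumps({"foo": render_field(["single_value"], True)})
as_flexible = json.dumps({"foo": render_field(["single_value"])})
two_values = json.dumps({"foo": render_field(["a", "b"])})
```

With this sketch, as_multi keeps the list form and as_flexible switches to the bare string, which is exactly the change existing clients would notice and why a schema or response version bump might be needed.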