Thanks for your reply, Alex. I have replies inline.

bq. but you are also putting meta information in the same document

Correct. My ElasticSearch implementation is part of a larger framework. Similar to Pig, Hive, Avro, and other data-model-agnostic frameworks, I pass along a small piece of metadata with each key/value that gets stored on an object. This lets the models change without breaking the analytics processing or view layers.

bq. you will not be able to execute number range queries on the age of people

The model I gave in my previous post is actually a little dumbed down. The framework has a value normalization system that knows how to turn native datatypes into lexicographically sortable strings (fixed-length byte arrays or strings for longs/ints, etc.). What I showed in my previous post is simply a hand-typed version of the actual data model.

bq. it might be more useful to have a dedicated index for your field configuration, which you can query for the usecase you outlined in your post. And you have a dedicated index for the data

This solution sounds wonderful! Is there a way I can do this automatically in ElasticSearch? One of the things I did in my mappings was to bifurcate the indexes for each of the tuples, so that I can do exact matches against one index and fuzzy matches against the other (I believe I just used one analyzed and one not_analyzed). Is this where I'd tell it to index all unique tuple key names for me?

I agree with you on the facets. I'd rather not have to perform an aggregated query on ALL the entity types if it's not necessary.

Thanks much!

On Thursday, March 6, 2014 3:59:43 AM UTC-5, Alexander Reelsen wrote:
>
> Hey,
>
> before answering your question, I think the approach of handling your data
> might be problematic. You are actually mixing two things in your data and
> your metadata (which is in every document).
> First the data itself (John Doe, 38 years old), but you are also putting
> meta information in the same document - maybe it makes more sense to put
> this data somewhere else (as it hopefully applies for all documents of that
> type). Also your above approach has another problem, you will not be able
> to execute number range queries on the age of people, because the value
> field is configured to be a string - same goes for sorting.
> With that said, it might be more useful to have a dedicated index for your
> field configuration, which you can query for the usecase you outlined in
> your post. And you have a dedicated index for the data - splitting those
> IMO makes a lot of sense. On the other hand I dont know your data well
> enough, maybe I am completely wrong.
>
> Back to your original question. If you store a document like the above,
> and you execute searches on it, the full document always gets returned, not
> just parts of it. You may want to read into parent-child/nested
> functionality though (I still do not like that approach).
>
> Facetting can only be done on single fields, so you will not get back the
> tuple you actually need (you could join them via a script facet, but that
> seems like another work around) - or again read about parent-child/nested
> documents (again disliking this, but I guess you know this by now).
>
> One last thing: Its nice to have everything in one query, but dont
> consider this a must. If two queries solve your problem, it might make more
> sense.
>
> --Alex
>
> On Wed, Mar 5, 2014 at 5:15 AM, Corey Nolet <cjn...@gmail.com> wrote:
>
>> I forgot to mention, I need the ability for the user to specify they only
>> care about keys for the entity.type === 'person' (or any type for that
>> matter).
>>
>> On Tuesday, March 4, 2014 11:13:27 PM UTC-5, Corey Nolet wrote:
>>>
>>> Hello,
>>>
>>> I've got an "entity" document which looks like this:
>>>
>>> {
>>>   id: 'id',
>>>   type: 'person',
>>>   tuples: [
>>>     {
>>>       key: 'nameFirst',
>>>       value: 'john',
>>>       type: 'string'
>>>     },
>>>     {
>>>       key: 'age',
>>>       value: '38',
>>>       type: 'int'
>>>     },
>>>     {
>>>       key: 'nameLast',
>>>       value: 'doe',
>>>       type: 'string'
>>>     }
>>>   ]
>>> }
>>>
>>> The tuples field has been mapped in ElasticSearch as a nested type where
>>> I provide both analyzed and not_analyzed indices for each of the nested
>>> fields (for exact and fuzzy match). What I'm trying to do is find, for each
>>> entity's type field, the unique tuple key values along with their
>>> associated types.
>>>
>>> In other words, I want to write a web service where someone can start
>>> typing "n" and I'll return "[{ key: 'nameFirst', type: 'string' }, { key:
>>> 'nameLast', type: 'string' }]" or they could start typing "a" and I'll
>>> return "[{ key: 'age', type: 'int' }]". If they don't type anything, I'd
>>> like to return the union of the two sets (so it includes nameLast,
>>> nameFirst, and age).
>>>
>>> As I'm reading, I'm seeing that this may be done with facets, but I know
>>> they have some limitations. Is this something that would be possible to do
>>> directly? I'm trying to do this all with one fast query if I can.
>>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to elasticsearc...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/5c62e98d-3ad9-4a4f-b7c7-5620221c2380%40googlegroups.com
>>
>> For more options, visit https://groups.google.com/groups/opt_out.
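
To illustrate the value normalization mentioned in the reply at the top (native datatypes turned into lexicographically sortable strings), here is a minimal sketch in Python. It is not the framework's actual code, which is not shown in this thread; the names `encode_long`, `decode_long`, and `WIDTH` are hypothetical, and the offset-plus-zero-padding scheme is just one common way to get the property.

```python
# Hypothetical helpers: the real framework's encoding is not shown in the thread.
WIDTH = 20          # 2**64 - 1 has 20 decimal digits
OFFSET = 2 ** 63    # shifts signed 64-bit values into the unsigned range


def encode_long(value: int) -> str:
    """Encode a signed 64-bit integer as a fixed-width string whose
    lexicographic order matches the numeric order of the inputs."""
    if not -OFFSET <= value < OFFSET:
        raise ValueError("value out of signed 64-bit range")
    # Adding the offset makes negatives sort before positives;
    # zero-padding gives every encoded value the same length.
    return str(value + OFFSET).zfill(WIDTH)


def decode_long(encoded: str) -> int:
    """Invert encode_long."""
    return int(encoded) - OFFSET
```

With values encoded this way, a string range query or string sort over the `value` field behaves like a numeric one, whereas plain string values such as '38' and '5' would sort in the wrong order.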
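
The lookup described in the original question (unique tuple keys and their types per entity type, filtered by a typed prefix) can be sketched in plain Python as follows. This is an in-memory illustration of the intended behaviour of a separate field-configuration index, not an ElasticSearch query; `build_key_index` and `suggest_keys` are hypothetical names, and the sketch assumes every tuple carries a `type`.

```python
from collections import defaultdict


def build_key_index(entities):
    """Map each entity type to the {tuple key: tuple type} pairs seen for it."""
    index = defaultdict(dict)
    for entity in entities:
        for t in entity["tuples"]:
            index[entity["type"]][t["key"]] = t["type"]
    return index


def suggest_keys(index, entity_type, prefix=""):
    """Return the unique keys of one entity type that start with prefix,
    sorted by key. An empty prefix returns every key for that type."""
    keys = index.get(entity_type, {})
    return [
        {"key": key, "type": value_type}
        for key, value_type in sorted(keys.items())
        if key.startswith(prefix)
    ]
```

Because the key/type pairs are kept apart from the entity data, the lookup only touches one small structure per entity type, which is the point of splitting the field configuration from the data.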