Re: Facet to find possible keys for querying

Corey Nolet Thu, 06 Mar 2014 06:57:06 -0800

Thanks for your reply Alex. I have replies inline.


bq. but you are also putting meta information in the same document -

Correct. My elasticsearch implementation is part of a larger framework. 
Similar to Pig, Hive, Avro and other data model-agnostic frameworks, I pass 
along a small piece of metadata with each key/value that gets stored on an 
object. This promotes change of models without breaking the analytics 
processing or view layers.

bq. you will not be able to execute number range queries on the age of 
people

The model I've given in my previous post is actually a little dumbed down. 
The framework has a value normalization system that knows how to turn 
native datatypes into lexicographically sortable strings (fixed length byte 
arrays or strings for longs/ints, etc.). What I'm showing in my previous 
post is simply a hand-typed version of the actual data model.
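
To illustrate the idea, here is a minimal Python sketch of one way a signed 
64-bit integer could be turned into a fixed-length, lexicographically 
sortable string. It is not the framework's actual normalizer; the function 
names and the 20-character width are assumptions made for the example.

```python
# Hypothetical helpers, for illustration only (not the framework's real code).
OFFSET = 2 ** 63   # shifts the signed 64-bit range into [0, 2**64)
WIDTH = 20         # 2**64 - 1 has 20 decimal digits


def encode_sortable_int(value: int) -> str:
    """Encode a signed 64-bit int so that string order equals numeric order."""
    if not -OFFSET <= value < OFFSET:
        raise ValueError("value is outside the signed 64-bit range")
    # Shifting makes every value non-negative; zero-padding makes every
    # encoded value the same length, so comparison is digit by digit.
    return str(value + OFFSET).zfill(WIDTH)


def decode_sortable_int(encoded: str) -> int:
    """Invert encode_sortable_int."""
    return int(encoded) - OFFSET


if __name__ == "__main__":
    ages = [38, 100, -5, 0, 7]
    # Plain strings sort "100" before "38"; the encoded form does not.
    print(sorted(ages, key=str))
    print(sorted(ages, key=encode_sortable_int))
```

With values stored this way, a string range query over the encoded field 
behaves like a numeric range query, which is the point of the normalization.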

bq. it might be more useful to have a dedicated index for your field 
configuration, which you can query for the usecase you outlined in your 
post. And you have a dedicated index for the data

This solution sounds wonderful! Is there a way I can do this automatically 
in ElasticSearch? I know one of the things I did in my mappings was to 
bifurcate the indexes for each of the tuples so that on one index I can do 
exact matches and on the other I can do fuzzy matches (I believe I just 
used one with analyzed and one with not_analyzed). Is this where I'd tell 
it to index all unique tuple key names for me? I agree with you on the 
facets, I'd rather not have to perform an aggregated query on ALL the 
entity types if it's not necessary.
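
For reference, this is roughly how I picture the dedicated field-configuration 
index, sketched in plain Python with no Elasticsearch client involved. The 
names build_key_catalog, suggest_keys and key_lookup_query, and the catalog 
field names entityType and key, are hypothetical. The request body only 
assumes the standard bool, term and prefix queries, run against not_analyzed 
fields so that a prefix such as "nameF" is not lowercased away.

```python
from typing import Dict, Iterable, List, Set, Tuple

CatalogEntry = Tuple[str, str, str]  # (entity type, tuple key, tuple type)


def build_key_catalog(entities: Iterable[dict]) -> Set[CatalogEntry]:
    """Collect the unique (entity type, key, type) triples seen at index time."""
    catalog: Set[CatalogEntry] = set()
    for entity in entities:
        for tup in entity.get("tuples", []):
            catalog.add((entity["type"], tup["key"], tup["type"]))
    return catalog


def suggest_keys(catalog: Set[CatalogEntry], entity_type: str,
                 prefix: str = "") -> List[Dict[str, str]]:
    """Return the keys of one entity type that start with the typed prefix."""
    matches = sorted(
        (key, value_type)
        for etype, key, value_type in catalog
        if etype == entity_type and key.startswith(prefix)
    )
    return [{"key": key, "type": value_type} for key, value_type in matches]


def key_lookup_query(entity_type: str, prefix: str) -> dict:
    """Request body for the same lookup against a catalog index.

    Field names are assumptions; both fields are assumed to be not_analyzed.
    """
    return {
        "query": {
            "bool": {
                "must": [
                    {"term": {"entityType": entity_type}},
                    {"prefix": {"key": prefix}},
                ]
            }
        }
    }


if __name__ == "__main__":
    person = {
        "id": "id",
        "type": "person",
        "tuples": [
            {"key": "nameFirst", "value": "john", "type": "string"},
            {"key": "age", "value": "38", "type": "int"},
            {"key": "nameLast", "value": "doe", "type": "string"},
        ],
    }
    catalog = build_key_catalog([person])
    print(suggest_keys(catalog, "person", "n"))
    print(key_lookup_query("person", "n"))
```

The catalog stays tiny compared to the data index, so the key lookup never 
has to aggregate over all the entities.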


Thanks much!


On Thursday, March 6, 2014 3:59:43 AM UTC-5, Alexander Reelsen wrote:
>
> Hey,
>
> before answering your question, I think the approach of handling your data 
> might be problematic. You are actually mixing two things in your data and 
> your metadata (which is in every document). First the data itself (John 
> Doe, 38 years old), but you are also putting meta information in the same 
> document - maybe it makes more sense to put this data somewhere else (as it 
> hopefully applies for all documents of that type). Also your above approach 
> has another problem, you will not be able to execute number range queries 
> on the age of people, because the value field is configured to be a string 
> - same goes for sorting.
> With that said, it might be more useful to have a dedicated index for your 
> field configuration, which you can query for the usecase you outlined in 
> your post. And you have a dedicated index for the data - splitting those 
> IMO makes a lot of sense. On the other hand I don't know your data well 
> enough, maybe I am completely wrong.
>
> Back to your original question. If you store a document like the above, 
> and you execute searches on it, the full document always gets returned, not 
> just parts of it. You may want to read into parent-child/nested 
> functionality though (I still do not like that approach).
>
> Faceting can only be done on single fields, so you will not get back the 
> tuple you actually need (you could join them via a script facet, but that 
> seems like another workaround) - or again read about parent-child/nested 
> documents (again disliking this, but I guess you know this by now).
>
> One last thing: It's nice to have everything in one query, but don't 
> consider this a must. If two queries solve your problem, it might make more 
> sense.
>
>
> --Alex
>
>
>
>
> On Wed, Mar 5, 2014 at 5:15 AM, Corey Nolet <cjn...@gmail.com> wrote:
>
>> I forgot to mention, I need the ability for the user to specify they only 
>> care about keys for the entity.type === 'person' (or any type for that 
>> matter).
>>
>>
>> On Tuesday, March 4, 2014 11:13:27 PM UTC-5, Corey Nolet wrote:
>>>
>>> Hello,
>>>
>>> I've got an "entity" document which looks like this:
>>>
>>> {
>>>     id: 'id',
>>>     type: 'person',
>>>     tuples: [
>>>         {
>>>             key: 'nameFirst',
>>>             value: 'john',
>>>             type: 'string'
>>>         },
>>>         {
>>>             key: 'age',
>>>             value: '38',
>>>             type: 'int'
>>>         },
>>>         {
>>>             key: 'nameLast',
>>>             value: 'doe',
>>>             type: 'string'
>>>         }
>>>     ]
>>> }
>>>
>>> The tuples field has been mapped in ElasticSearch as a nested type where 
>>> I provide both analyzed and not_analyzed indices for each of the nested 
>>> fields (for exact and fuzzy match). What I'm trying to do is find, for each 
>>> entity's type field, the unique tuple key values along with their 
>>> associated types.
>>>
>>> In other words, I want to write a web service where someone can start 
>>> typing "n" and I'll return "[{ key:'nameFirst', type:'string'}, { key: 
>>> 'nameLast', type: 'string' }]" or they could start typing "a" and I'll 
>>> return "[{ key: 'age', type: 'int' }]". If they don't type anything, I'd 
>>> like to return the union of the two sets (so that it includes nameLast, 
>>> nameFirst, and age). 
>>>
>>> As I'm reading, I'm seeing that this may be done with facets, but I know 
>>> they have some limitations. Is this something that would be possible to do 
>>> directly? I'm trying to do this all with one fast query if I can.
>>>
>>  -- 
>> You received this message because you are subscribed to the Google Groups 
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to elasticsearc...@googlegroups.com.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/elasticsearch/5c62e98d-3ad9-4a4f-b7c7-5620221c2380%40googlegroups.com.
>>
>> For more options, visit https://groups.google.com/groups/opt_out.
>>
>
>

