What about a nested or parent/child query? How could this be achieved with one of those?
On Thursday, May 8, 2014 4:45:36 PM UTC-7, Yao Li wrote:
>
> I have a collection of products which belong to few users, like
>
> [
> { id: 1, user_id: 1, description: "blabla...", ... },
> { id: 2, user_id: 2, description: "blabla...", ... },
> { id: 3, user_id: 2, description: "blabla...", ... },
> { id: 4, user_id: 3, description: "blabla...", ... },
> { id: 5, user_id: 4, description: "blabla...", ... },
> { id: 6, user_id: 2, description: "blabla...", ... },
> { id: 7, user_id: 3, description: "blabla...", ... },
> { id: 8, user_id: 4, description: "blabla...", ... },
> { id: 9, user_id: 2, description: "blabla...", ... },
> { id: 10, user_id: 3, description: "blabla...", ... },
> { id: 11, user_id: 4, description: "blabla...", ... },
> ...
> ]
>
> (The real data has more fields, but the most important ones are the first
> three: product id, user id, and product description.)
>
> I'd like to retrieve 2 products for each of the top 3 users whose products
> have the highest matching scores (the matching condition is that the
> description includes "fashion" and some other keywords; here I just use
> "fashion" as the example):
>
> [
> { id: 2, user_id: '2', description: "blabla...", ..., _score: 100},
> { id: 3, user_id: '2', description: "blabla...", ..., _score: 95},
> { id: 4, user_id: '3', description: "blabla...", ..., _score: 90},
> { id: 5, user_id: '4', description: "blabla...", ..., _score: 80},
> { id: 7, user_id: '3', description: "blabla...", ..., _score: 70},
> { id: 8, user_id: '4', description: "blabla...", ..., _score: 65},
> ...
> ]
>
> I have 3 possible approaches to try:
>
> 1. Use a terms facet to get the unique user_ids in a nested query, then use
> them as the user id range for the outer query, which matches the description
> against keywords like "fashion".
>
> I don't know how to implement this in ES (I'm stuck on iterating over the
> facet terms and building the user_id range from a subquery with the facet).
> In SQL I would try something like:
>
> select id, user_id, description
> from product
> where user_id in (
> select distinct user_id
> from product
> limit 3)
> order by _score
> limit 6
> /* 6 = 2 * 3 */
>
> But it cannot guarantee that the top 6 products come from 3 different users.
>
> Also, according to the following two links, it seems that iterating over the
> specific information of facet terms has not been implemented in ES so far.
>
> http://elasticsearch-users.115913.n3.nabble.com/Terms-stats-facet-Additional-information-td4035199.html
>
> https://github.com/elasticsearch/elasticsearch/issues/256
>
> 2. Query the description field for keywords like "fashion", at the same time
> compute statistics for each user_id with an aggregation and limit the count
> to 2, then pick the top 6 products with the highest matching scores.
>
> I still don't know how to implement this in ES.
>
> 3. Use brute force with multiple queries until I find the top 3 users, each
> with their 2 highest-scoring products.
>
> I mean using a hash map where the key is user_id and the value is how many
> times it has appeared. Query with the matching keywords first, then iterate
> over the intermediate results and check the hash map: if the value is less
> than 2, add the product to the final result list; otherwise skip it.
>
> Please let me know if you can figure out how to do it the 1st or 2nd way.
>
> Thanks in advance.
> Yao
>
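
The second approach in the quoted message (match on description, group per user_id, keep 2 products per user) might be expressed as a single request body. This is only a sketch under assumptions: it relies on the `top_hits` aggregation, which appeared in releases after this thread (1.3, as far as I know), the aggregation names `by_user`, `best_score` and `top_products` and the helper `build_top_users_query` are made up for illustration, and the script syntax for `_score` differs between versions.

```python
def build_top_users_query(keyword, users=3, per_user=2):
    """Build a search body: the `users` user_ids with the best-scoring
    product, and the `per_user` highest-scoring products for each."""
    return {
        "size": 0,  # only the aggregation buckets are needed
        "query": {"match": {"description": keyword}},
        "aggs": {
            "by_user": {  # hypothetical name: one bucket per user_id
                "terms": {
                    "field": "user_id",
                    "size": users,
                    # order buckets by each user's best product score
                    "order": {"best_score": "desc"},
                },
                "aggs": {
                    # assumption: older releases accept a plain script string;
                    # newer ones expect {"script": {"source": "_score"}}
                    "best_score": {"max": {"script": "_score"}},
                    "top_products": {"top_hits": {"size": per_user}},
                },
            }
        },
    }


body = build_top_users_query("fashion")
```

The dict can be sent as the JSON body of a search request against the product index. The products then come back under each `by_user` bucket rather than in the top-level hits.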
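
The third approach in the quoted message (a hash map from user_id to the number of products already taken) can be sketched on the client side as below. It assumes the hits arrive sorted by `_score` in descending order, for example by paging through an ordinary match query. The function name `pick_top_products` and the sample scores for ids 6 and 1 are hypothetical; the other scores are the ones from the example in the message.

```python
def pick_top_products(hits, max_users=3, per_user=2):
    """Keep at most `per_user` hits for each of the first `max_users`
    distinct users seen. `hits` must be sorted by _score descending."""
    counts = {}    # user_id -> number of products already selected
    selected = []
    for hit in hits:
        uid = hit["user_id"]
        if uid not in counts and len(counts) >= max_users:
            continue  # a new user, but the top users are already fixed
        if counts.get(uid, 0) >= per_user:
            continue  # this user already has enough products
        counts[uid] = counts.get(uid, 0) + 1
        selected.append(hit)
        if len(selected) == max_users * per_user:
            break  # every top user has per_user products
    return selected


sample_hits = [
    {"id": 2, "user_id": 2, "_score": 100},
    {"id": 3, "user_id": 2, "_score": 95},
    {"id": 6, "user_id": 2, "_score": 92},  # hypothetical score
    {"id": 4, "user_id": 3, "_score": 90},
    {"id": 5, "user_id": 4, "_score": 80},
    {"id": 1, "user_id": 1, "_score": 75},  # hypothetical score
    {"id": 7, "user_id": 3, "_score": 70},
    {"id": 8, "user_id": 4, "_score": 65},
]
result_ids = [hit["id"] for hit in pick_top_products(sample_hits)]
# result_ids == [2, 3, 4, 5, 7, 8]
```

Here the "top 3 users" are the first 3 distinct users encountered in score order, that is, the users whose best product scores highest. If a page of results runs out before 6 products are collected, the caller would fetch the next page and continue with the same counts.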