OK, so for the first question, I think I'm getting closer: adding facet: {top_terms_by_doc: "unique(_root_)"}, as described in http://blog.griddynamics.com/search/label/~Mikhail%20Khludnev , returns correct counts. However, the buckets are still sorted by the "count" of the enclosing terms facet, not by unique(_root_):

curl http://localhost:8985/solr/my_collection/query -d 'q={!parent%20which="type_s:doc"}type_s:doc.userData%20%2BSubject_t:california&rows=0&
json.facet={
  filter_by_child_type :{
    type:query,
    q:"type_s:doc.enriched.text.keywords",
    domain: { blockChildren : "type_s:doc" },
    facet:{
      top_keywords_text : {
        type: terms,
        field: text_t,
        limit: 10,
        facet: {
          top_terms_by_doc: "unique(_root_)"
        }
      }
    }
  }
}'

RETURNS

{
  "responseHeader":{
    "status":0,
    "QTime":25,
    "params":{
      "q":"{!parent which=\"type_s:doc\"}type_s:doc.userData +Subject_t:california",
      "json.facet":"{\n filter_by_child_type :{\n type:query,\n q:\"type_s:doc.enriched.text.keywords\",\n domain: { blockChildren : \"type_s:doc\" },\n facet:{\n top_keywords_text : {\n type: terms,\n field: text_t,\n limit: 10,\n facet: {\n top_terms_by_doc: \"unique(_root_)\"\n }\n }\n }\n }\n}",
      "rows":"0"}},
  "response":{"numFound":19,"start":0,"docs":[]
  },
  "facets":{
    "count":19,
    "filter_by_child_type":{
      "count":686,
      "top_keywords_text":{
        "buckets":[{
            "val":"enron",
            "count":57,
            "top_terms_by_doc":9},
          {
            "val":"california",
            "count":22,
            "top_terms_by_doc":13},
          {
            "val":"power",
            "count":21,
            "top_terms_by_doc":7},
          {
            "val":"rate",
            "count":15,
            "top_terms_by_doc":5},
          {
            "val":"plan",
            "count":13,
            "top_terms_by_doc":3},
          {
            "val":"hou",
            "count":12,
            "top_terms_by_doc":5},
          {
            "val":"energy",
            "count":11,
            "top_terms_by_doc":5},
          {
            "val":"na",
            "count":11,
            "top_terms_by_doc":5},
          {
            "val":"mckinsey",
            "count":10,
            "top_terms_by_doc":1},
          {
            "val":"socal",
            "count":10,
            "top_terms_by_doc":4}]}}}}

Nice, but I want the buckets to be ordered by the "top_terms_by_doc" values, not by the "count" values.

Any suggestions?

Thanks,
Alisa

>Monday, 28 March 2016, 15:39 -04:00 from Alisa Z. <prol...@mail.ru>:
>
>Hi all,
>
>I am trying to perform faceting of parent docs by nested document fields. I've
>tried 2 approaches as in subject, yet in the first the results are not quite
>correct and in the 2nd I cannot get the query right.
>So I need help on either of them, and any explanation, documentation, or
>blog posts on the behavior would be much appreciated.
>
>Verbally the query is as follows: "Find top 10 keywords for all documents with
>"california" in email subject line"
>
>Here is the query with responses:
>
>==== Json Facet API ====
>
>curl http://localhost:8985/solr/my_collection/query -d 'q={!parent%20which="type_s:doc"}type_s:doc.userData%20%2BSubject_t:california&rows=0&
>json.facet={
>  filter_by_child_type :{
>    type:query,
>    q:"type_s:doc.enriched.text.keywords",
>    domain: { blockChildren : "type_s:doc" },
>    facet:{
>      top_keywords_text : {
>        type: terms,
>        field: text_t,
>        limit: 10
>      }
>    }
>  }
>}'
>
>RETURNS:
>
>{
>  "responseHeader":{
>    "status":0,
>    "QTime":134,
>    "params":{
>      "q":"{!parent which=\"type_s:doc\"}type_s:doc.userData +Subject_t:california",
>      "json.facet":"{\n filter_by_child_type :{\n type:query,\n q:\"type_s:doc.enriched.text.keywords\",\n domain: { blockChildren : \"type_s:doc\" },\n facet:{\n top_keywords_text : {\n type: terms,\n field: text_t,\n limit: 10\n }\n }\n }\n}",
>      "rows":"0"}},
>  "response":{"numFound":19,"start":0,"docs":[]
>  },
>  "facets":{
>    "count":19,
>    "filter_by_child_type":{
>      "count":686,
>      "top_keywords_text":{
>        "buckets":[{
>            "val":"enron",
>            "count":57},
>          {
>            "val":"california",
>            "count":22},
>          {
>            "val":"power",
>            "count":21},
>          {
>            "val":"rate",
>            "count":15},
>          {
>            "val":"plan",
>            "count":13},
>          {
>            "val":"hou",
>            "count":12},
>          {
>            "val":"energy",
>            "count":11},
>          {
>            "val":"na",
>            "count":11},
>          {
>            "val":"mckinsey",
>            "count":10},
>          {
>            "val":"socal",
>            "count":10}]}}}}
>
>
>QUESTION: where do the counts greater than 19 (the total number of the
>top-level documents returned by the query) come from? How do I adjust the
>query to facet only on the top-level documents (so that no count
>is greater than 19)?
>
>
>===== BlockJoin Faceting ======
>
>Following the example on
>https://cwiki.apache.org/confluence/display/solr/BlockJoin+Faceting , I've
>tried this:
>
>/bjqfacet?q={!parent%20which=type_s:doc}type_s:doc.enriched.text.keywords&child.facet.field=text_t&child.facet.limit=10&child.facet.mincount=5&rows=0&fq={!parent%20which=type_s:doc}type_s:doc.userData%20%2BSubject_t:california&wt=json&indent=true
>
>RETURNS:
>
>{
>  "responseHeader":{
>    "status":0,
>    "QTime":1},
>  "response":{"numFound":19,"start":0,"docs":[]
>  },
>  "facet_counts":[
>    "facet_fields",[
>      "text_t",[
>        "128x",1,
>        "18xx",1,
>        "1x",1,
>        "2",2,
>        "30",1,
>        "60",1,
>        "78xx",1,
>        "82xx",1,
>        "ab",2,
>        "access",5,
>        "account",1,
>        "accounts",1,
>...
>"california",13,
>...
>"enron",9,
>...
>]]]}
>
>QUESTION: This looks very close to what I want, yet why are
>child.facet.limit=10&child.facet.mincount=5 ignored? How do I get the top 10
>most frequent?
>
>
>Thank you for your help in advance!
>
>--
>Alisa Zhila
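
P.S. As a stopgap for the sorting question at the top of this message, the ten buckets that do come back can be re-ordered on the client. Below is a minimal Python sketch, assuming the response shape shown in the first reply (facets.filter_by_child_type.top_keywords_text.buckets). The helper name sort_buckets_by_stat is hypothetical, and the sample values are copied from that response with the unrelated parts trimmed. This only re-sorts the top 10 by "count"; a term outside that list with a higher unique(_root_) value would still be missed, so it is a workaround rather than a fix.

```python
# Trimmed copy of the "facets" part of the response from the first reply.
SAMPLE_RESPONSE = {
    "facets": {
        "count": 19,
        "filter_by_child_type": {
            "count": 686,
            "top_keywords_text": {
                "buckets": [
                    {"val": "enron", "count": 57, "top_terms_by_doc": 9},
                    {"val": "california", "count": 22, "top_terms_by_doc": 13},
                    {"val": "power", "count": 21, "top_terms_by_doc": 7},
                    {"val": "rate", "count": 15, "top_terms_by_doc": 5},
                    {"val": "plan", "count": 13, "top_terms_by_doc": 3},
                    {"val": "hou", "count": 12, "top_terms_by_doc": 5},
                    {"val": "energy", "count": 11, "top_terms_by_doc": 5},
                    {"val": "na", "count": 11, "top_terms_by_doc": 5},
                    {"val": "mckinsey", "count": 10, "top_terms_by_doc": 1},
                    {"val": "socal", "count": 10, "top_terms_by_doc": 4},
                ]
            },
        },
    }
}


def sort_buckets_by_stat(response, stat="top_terms_by_doc"):
    """Return the terms buckets ordered by a per-bucket statistic, highest first.

    Hypothetical helper: the facet names in the path below are the ones
    used in the query from this thread, not anything fixed by Solr.
    """
    buckets = response["facets"]["filter_by_child_type"]["top_keywords_text"]["buckets"]
    # sorted() is stable, so buckets with equal values keep Solr's order.
    return sorted(buckets, key=lambda bucket: bucket[stat], reverse=True)


if __name__ == "__main__":
    for bucket in sort_buckets_by_stat(SAMPLE_RESPONSE):
        print(bucket["val"], bucket["top_terms_by_doc"], bucket["count"])
```

With the sample data, "california" (13 parent docs) moves ahead of "enron" (9), which is the ordering I am after.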