Re: Nested JSON Facets (Subfacets)

2016-12-15 Thread Chantal Ackermann
> 
> Interesting, I don't recall a bug like that being fixed.
> Anyway, glad it works for you now!
> -Yonik


Then it’s probably because it’s Christmas time! :-)

Re: Nested JSON Facets (Subfacets)

2016-12-15 Thread Yonik Seeley
Interesting, I don't recall a bug like that being fixed.
Anyway, glad it works for you now!
-Yonik


On Thu, Dec 15, 2016 at 11:01 AM, Chantal Ackermann wrote:
> Hi Yonik,
>
> after upgrading to Solr 6.3.0, the nested function works as expected! (Both 
> with and without docValues.)
>
> "facets":{
> "count":3179500,
> "all_pop":1.5901646171168616E8,
> "shop_cat":{
>   "buckets":[{
>   "val":"Kontaktlinsen > Torische Linsen",
>   "count":75168,
>   "cat_sum":3752665.0497611803},
>
>
> Thanks,
> Chantal
>
>
>> On 15.12.2016 at 16:00, Chantal Ackermann wrote:
>>
>> Hi Yonik,
>>
>> are you certain that nesting a function works as documented on 
>> http://yonik.com/solr-subfacets/?
>>
>> top_authors:{
>>    type: terms,
>>    field: author,
>>    limit: 7,
>>    sort: "revenue desc",
>>    facet:{
>>      revenue: "sum(sales)"
>>    }
>> }
>>
>>
>> I’m getting the feeling that the function is never really executed because, 
>> for my index, calling sum() with a non-number field (e.g. a multi-valued 
>> string field) throws an error when *not nested* but does *not throw an 
>> error* when nested:
>>
>>    json.facet={all_pop: "sum(gtin)"}
>>
>>    "error":{
>>    "trace":"java.lang.UnsupportedOperationException
>>       at org.apache.lucene.queries.function.FunctionValues.doubleVal(FunctionValues.java:47)
>>
>> And the following does not throw an error but definitely should if the 
>> function were executed:
>>
>>    json.facet={all_pop:"sum(popularity)",shop_cat: {type:terms, field:shop_cat, facet: {cat_pop:"sum(gtin)"}}}
>>
>> returns:
>>
>> "facets":{
>>"count":2815500,
>>"all_pop":1.4065865823321116E8,
>>"shop_cat":{
>>  "buckets":[{
>>  "val":"Kontaktlinsen > Torische Linsen",
>>  "count":75168,
>>  "cat_pop":0.0},
>>{
>>  "val":"Damen-Mode/Inspirationen",
>>  "count":47053,
>>  "cat_pop":0.0},
>>
>> For completeness: here is the field directive for „gtin“ with 
>> „text_noleadzero“ based on „solr.TextField“:
>>
>>  required="false" multiValued="true"/>
>>
>>
>> Is this a bug or is the documentation a glimpse of the future? I will try 
>> version 6.3.0, now. I was still on 6.1.0 for the above tests.
>> (I have also tried with the „avg“ function, just to make sure, but same 
>> there.)
>>
>> Cheers,
>> Chantal
>


Re: Nested JSON Facets (Subfacets)

2016-12-15 Thread Chantal Ackermann
Hi Yonik,

after upgrading to Solr 6.3.0, the nested function works as expected! (Both 
with and without docValues.)

"facets":{
  "count":3179500,
  "all_pop":1.5901646171168616E8,
  "shop_cat":{
    "buckets":[{
        "val":"Kontaktlinsen > Torische Linsen",
        "count":75168,
        "cat_sum":3752665.0497611803},


Thanks,
Chantal


> On 15.12.2016 at 16:00, Chantal Ackermann wrote:
> 
> Hi Yonik,
> 
> are you certain that nesting a function works as documented on 
> http://yonik.com/solr-subfacets/?
> 
> top_authors:{
>    type: terms,
>    field: author,
>    limit: 7,
>    sort: "revenue desc",
>    facet:{
>      revenue: "sum(sales)"
>    }
> }
> 
> 
> I’m getting the feeling that the function is never really executed because, 
> for my index, calling sum() with a non-number field (e.g. a multi-valued 
> string field) throws an error when *not nested* but does *not throw an error* 
> when nested:
> 
>    json.facet={all_pop: "sum(gtin)"}
> 
>    "error":{
>    "trace":"java.lang.UnsupportedOperationException
>       at org.apache.lucene.queries.function.FunctionValues.doubleVal(FunctionValues.java:47)
> 
> And the following does not throw an error but definitely should if the 
> function were executed:
> 
>    json.facet={all_pop:"sum(popularity)",shop_cat: {type:terms, field:shop_cat, facet: {cat_pop:"sum(gtin)"}}}
> 
> returns:
> 
> "facets":{
>"count":2815500,
>"all_pop":1.4065865823321116E8,
>"shop_cat":{
>  "buckets":[{
>  "val":"Kontaktlinsen > Torische Linsen",
>  "count":75168,
>  "cat_pop":0.0},
>{
>  "val":"Damen-Mode/Inspirationen",
>  "count":47053,
>  "cat_pop":0.0},
> 
> For completeness: here is the field directive for „gtin“ with 
> „text_noleadzero“ based on „solr.TextField“:
> 
>  required="false" multiValued="true"/>
> 
> 
> Is this a bug or is the documentation a glimpse of the future? I will try 
> version 6.3.0, now. I was still on 6.1.0 for the above tests.
> (I have also tried with the „avg“ function, just to make sure, but same 
> there.)
> 
> Cheers,
> Chantal



Re: Nested JSON Facets (Subfacets)

2016-12-15 Thread Chantal Ackermann
Hi Yonik,

are you certain that nesting a function works as documented on 
http://yonik.com/solr-subfacets/?

top_authors:{
   type: terms,
   field: author,
   limit: 7,
   sort: "revenue desc",
   facet:{
     revenue: "sum(sales)"
   }
}


I’m getting the feeling that the function is never really executed because, for 
my index, calling sum() with a non-number field (e.g. a multi-valued string 
field) throws an error when *not nested* but does *not throw an error* when 
nested:

json.facet={all_pop: "sum(gtin)"}

"error":{
"trace":"java.lang.UnsupportedOperationException
   at org.apache.lucene.queries.function.FunctionValues.doubleVal(FunctionValues.java:47)

And the following does not throw an error but definitely should if the function 
were executed:

json.facet={all_pop:"sum(popularity)",shop_cat: {type:terms, field:shop_cat, facet: {cat_pop:"sum(gtin)"}}}

returns:

"facets":{
"count":2815500,
"all_pop":1.4065865823321116E8,
"shop_cat":{
  "buckets":[{
  "val":"Kontaktlinsen > Torische Linsen",
  "count":75168,
  "cat_pop":0.0},
{
  "val":"Damen-Mode/Inspirationen",
  "count":47053,
  "cat_pop":0.0},

For completeness: here is the field directive for „gtin“ with „text_noleadzero“ 
based on „solr.TextField“:

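 required="false" multiValued="true"/>


Is this a bug or is the documentation a glimpse of the future? I will try 
version 6.3.0, now. I was still on 6.1.0 for the above tests.
(I have also tried with the „avg“ function, just to make sure, but same there.)

Cheers,
Chantal


> Hi Yonik,
> 
> here is an update on what I’ve tried so far, unfortunately without any more 
> luck.
> 
> The field directive is (should have included this when asking the question):
> 
>  required="false" multiValued="false" docValues="true"/>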
> 
> I have also re-indexed (removed data/ and indexed from scratch). The 
> popularity field is populated with random values (as I don’t have the real 
> values from production) meaning that all documents have values > 0.
> 
> Here the statistics output:
> 
> "stats":{
>"stats_fields":{
>  "popularity":{
>"min":7.952374289743602E-5,
>"max":99.3896484375,
>"count":1687500,
>"missing":0,
>"sum":8.436878611434968E7,
>"sumOfSquares":5.626142812197906E9,
>"mean":49.9963176973924,
>"stddev":28.885623872869992},
> 
> And this is the relevant facet output from calling
> 
> /solr//query?
> json.facet={
> num_pop:{query: "popularity[* TO  *]"},
> all_pop: "sum(popularity)",
> shop_cat: {type:terms, field:shop_cat, facet: {cat_pop: 
> "sum(popularity)"}}}=*:*=1=popularity=json
> 
> "facets":{
>"count":1687500,
>"all_pop":1.5893775613258794E8,
>"num_pop":{
>  "count":1687500},
>"shop_cat":{
>  "buckets":[{
>  "val":"Kontaktlinsen > Torische Linsen",
>  "count":75168,
>  "cat_pop":0.0},
>{
>  "val":"Neu",
>  "count":31772,
>  "cat_pop":0.0},
>{
>  "val":"Gesundheit & Schönheit > Gesundheitspflege",
>  "count":20281,
>  "cat_pop":0.0},
> [… more facets omitted]
> 
> 
> The /query handler is an edismax configuration, though I don’t think this 
> matters as long as the results include documents with popularity > 0 which is 
> the case as seen in the facet output (and sum() works in general for all of 
> the documents just not for the buckets as seen in „all_pop").
> 
> I will try to explicitly turn off the docValues and add stored="true" just to 
> try out more. If someone has any other suggestions that I should try - I 
> would be glad to hear them. If it is not possible to retrieve the sum in this 
> way I would have to fetch each sum separately which would have a huge 
> performance impact.
> 
> Thanks!
> Chantal
> 
> 
> 
> 
> 
>> On 15.12.2016 at 10:16, CA wrote:
>> 
>>> num_pop:{query:"popularity:[* TO *]"}
> 



Re: Nested JSON Facets (Subfacets)

2016-12-15 Thread Chantal Ackermann
Hi Yonik,


here is an update on what I’ve tried so far, unfortunately without any more 
luck.

The field directive is (should have included this when asking the question):

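 required="false" multiValued="false" docValues="true"/>

I have also re-indexed (removed data/ and indexed from scratch). The 
popularity field is populated with random values (as I don’t have the real 
values from production) meaning that all documents have values > 0.

Here is the statistics output:

"stats":{
  "stats_fields":{
    "popularity":{
      "min":7.952374289743602E-5,
      "max":99.3896484375,
      "count":1687500,
      "missing":0,
      "sum":8.436878611434968E7,
      "sumOfSquares":5.626142812197906E9,
      "mean":49.9963176973924,
      "stddev":28.885623872869992},

And this is the relevant facet output from calling

/solr//query?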
json.facet={
num_pop:{query: "popularity[* TO  *]"},
all_pop: "sum(popularity)",
shop_cat: {type:terms, field:shop_cat, facet: {cat_pop: 
"sum(popularity)"}}}=*:*=1=popularity=json

"facets":{
"count":1687500,
"all_pop":1.5893775613258794E8,
"num_pop":{
  "count":1687500},
"shop_cat":{
  "buckets":[{
  "val":"Kontaktlinsen > Torische Linsen",
  "count":75168,
  "cat_pop":0.0},
{
  "val":"Neu",
  "count":31772,
  "cat_pop":0.0},
{
  "val":"Gesundheit & Schönheit > Gesundheitspflege",
  "count":20281,
  "cat_pop":0.0},
[… more facets omitted]


The /query handler is an edismax configuration, though I don’t think this 
matters as long as the results include documents with popularity > 0 which is 
the case as seen in the facet output (and sum() works in general for all of the 
documents just not for the buckets as seen in „all_pop").

I will try to explicitly turn off the docValues and add stored="true" just to 
try out more. If someone has any other suggestions that I should try - I would 
be glad to hear them. If it is not possible to retrieve the sum in this way I 
would have to fetch each sum separately which would have a huge performance 
impact.

Thanks!
Chantal
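
Editor's sketch: the symptom described in this message (buckets that contain documents with the field, yet report a sum of 0.0) can be checked mechanically on a parsed response. The helper name suspicious_buckets and its parameters are hypothetical and not part of Solr. A sum of 0.0 can also be legitimate when all values really are zero, so this only flags candidates for a closer look.

```python
def suspicious_buckets(facets, facet_name, stat, presence="num_pop"):
    """List bucket values whose stat is 0.0 although matching documents exist.

    If no presence sub-facet was requested for the buckets, the plain
    bucket count is used as a rough stand-in.
    """
    flagged = []
    for bucket in facets.get(facet_name, {}).get("buckets", []):
        present = bucket.get(presence, {}).get("count", bucket["count"])
        if present > 0 and bucket.get(stat, 0.0) == 0.0:
            flagged.append(bucket["val"])
    return flagged


# Excerpt of the "facets" section quoted in the message above.
response_facets = {
    "count": 1687500,
    "all_pop": 1.5893775613258794e8,
    "num_pop": {"count": 1687500},
    "shop_cat": {"buckets": [
        {"val": "Kontaktlinsen > Torische Linsen", "count": 75168, "cat_pop": 0.0},
        {"val": "Neu", "count": 31772, "cat_pop": 0.0},
    ]},
}

print(suspicious_buckets(response_facets, "shop_cat", "cat_pop"))
```

On the excerpt above, both listed buckets are flagged.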





> On 15.12.2016 at 10:16, CA wrote:
> 
>> num_pop:{query:"popularity:[* TO *]"}



Re: Nested JSON Facets (Subfacets)

2016-12-15 Thread CA
Hi Yonik,

thank you for your quick reply.

(I just sent my original e-mail a second time. I had not confirmed the 
subscription, so I thought it might not have been sent the first time. I’m 
sorry.)

We are using Solr 6.1.0. Sorry, I should have mentioned that.

The low number is because of the test data. It’s not what it would look like in 
production. That’s also why I did not wonder about the 0 values in the 
beginning. But now that I have tweaked the data I can see that it’s not 
returning the values as it should. And in production there are values > 0 as 
expected but the sum() returns 0 nevertheless, that’s why we are aware that 
something is wrong.

In production the data is re-indexed constantly. Though, we might have changed 
the field type from int to float. I’m not sure whether we have really 
re-indexed from scratch after that, in production, but I think in my local env 
I did re-create the index. I will check this out.

I’ll also play around with the range query, thanks for the tip!

Cheers,
Chantal



> That should work... what version of Solr are you using?  Did you 
> change the type of the popularity field w/o completely reindexing? 
> 
> You can try to verify the number of documents in each bucket that have 
> the popularity field by adding another sub-facet next to cat_pop: 
> num_pop:{query:"popularity:[* TO *]"} 
> 
> > A quick check with this json.facet parameter: 
> > 
> > json.facet: {cat_pop:"sum(popularity)"}
> >
> > returns:
> >
> > "facets": {
> > "count":2508,
> > "cat_pop":21.0},
> 
> That looks like a pretty low sum for all those documents; perhaps
> most of them are missing "popularity" (or have a 0 popularity). 
> To test one of the buckets at the top-level this way, you could add 
> fq=shop_cat:"Men > Clothing > Jumpers & Cardigans" 
> and see if you get anything. 
> 
> -Yonik 



Nested JSON Facets (Subfacets)

2016-12-15 Thread CA
Hi all,

this is about using a function in nested facets, specifically the „sum()“ 
function inside a „terms“ facet using the json.facet api.

My json.facet parameter looks like this:

   json.facet={shop_cat: {type:terms, field:shop_cat, facet: {cat_pop:"sum(popularity)"}}}

A snippet of the result:

   "facets": {
     "count":2508,
     "shop_cat": {
       "buckets": [{
           "val": "Men > Clothing > Jumpers & Cardigans",
           "count":252,
           "cat_pop":0.0
         }, {
           "val":"Men > Clothing > Jackets & Coats",
           "count":157,
           "cat_pop":0.0
         }, // and more

This looks fine all over but it turns out that „cat_pop“, the result of 
„sum(popularity)“ is always 0.0 even if the documents for this facet value have 
popularities > 0.

A quick check with this json.facet parameter:

   json.facet: {cat_pop:"sum(popularity)"}

returns:

   "facets": {
     "count":2508,
     "cat_pop":21.0},

To me, it seems it works fine on the base level but not when nested. Still, 
Yonik’s documentation and the Jira issues indicate that it is possible to use 
functions in nested facets so I might just be using the wrong structure? I have 
a hard time finding any other examples on the internet and I had no luck changing 
the structure around.
Could someone shed some light on this for me? It would also help to know if it 
is not possible to sum the values up this way.

Thanks a lot!
Chantal




Re: Nested JSON Facets (Subfacets)

2016-12-14 Thread Yonik Seeley
That should work... what version of Solr are you using?  Did you
change the type of the popularity field w/o completely reindexing?

You can try to verify the number of documents in each bucket that have
the popularity field by adding another sub-facet next to cat_pop:
num_pop:{query:"popularity:[* TO *]"}

> A quick check with this json.facet parameter:
>
> json.facet: {cat_pop:"sum(popularity)"}
>
> returns:
>
> "facets": {
> "count":2508,
> "cat_pop":21.0},

That looks like a pretty low sum for all those documents; perhaps
most of them are missing "popularity" (or have a 0 popularity).
To test one of the buckets at the top-level this way, you could add
fq=shop_cat:"Men > Clothing > Jumpers & Cardigans"
and see if you get anything.

-Yonik
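
Editor's sketch of the two checks suggested above, written as request-building code in Python. The helper name with_presence_count and the label argument are hypothetical. Only the facet structure and the fq value are taken from this thread, and nothing here contacts a Solr server.

```python
import copy
import json


def with_presence_count(facet, bucket_name, field, label="num_pop"):
    """Return a copy of facet with a query sub-facet next to the existing stats.

    The sub-facet counts, per bucket, the documents that have any value in field.
    """
    out = copy.deepcopy(facet)
    out[bucket_name].setdefault("facet", {})[label] = {"query": f"{field}:[* TO *]"}
    return out


base = {
    "shop_cat": {
        "type": "terms",
        "field": "shop_cat",
        "facet": {"cat_pop": "sum(popularity)"},
    }
}

# Check 1: per bucket, how many documents actually carry a popularity value?
checked = with_presence_count(base, "shop_cat", "popularity")

# Check 2: test a single bucket at the top level by filtering on its value.
single_bucket_params = {
    "q": "*:*",
    "fq": 'shop_cat:"Men > Clothing > Jumpers & Cardigans"',
    "json.facet": json.dumps({"cat_pop": "sum(popularity)"}),
}

print(json.dumps(checked, indent=2))
```

If num_pop is positive for a bucket while cat_pop stays at 0.0, the data is present and the nested function result is the suspect.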


On Wed, Dec 14, 2016 at 12:46 PM, CA wrote:
> Hi all,
>
> this is about using a function in nested facets, specifically the „sum()“ 
> function inside a „terms“ facet using the json.facet api.
>
> My json.facet parameter looks like this:
>
> json.facet={shop_cat: {type:terms, field:shop_cat, facet: {cat_pop:"sum(popularity)"}}}
>
> A snippet of the result:
>
> "facets": {
>   "count":2508,
>   "shop_cat": {
>     "buckets": [{
>         "val": "Men > Clothing > Jumpers & Cardigans",
>         "count":252,
>         "cat_pop":0.0
>       }, {
>         "val":"Men > Clothing > Jackets & Coats",
>         "count":157,
>         "cat_pop":0.0
>       }, // and more
>
> This looks fine all over but it turns out that „cat_pop“, the result of 
> „sum(popularity)“ is always 0.0 even if the documents for this facet value 
> have popularities > 0.
>
> A quick check with this json.facet parameter:
>
> json.facet: {cat_pop:"sum(popularity)"}
>
> returns:
>
> "facets": {
> "count":2508,
> "cat_pop":21.0},
>
> To me, it seems it works fine on the base level but not when nested. Still, 
> Yonik’s documentation and the Jira issues indicate that it is possible to use 
> functions in nested facets so I might just be using the wrong structure? I 
> have a hard time finding any other examples on the internet and I had no luck 
> changing the structure around.
> Could someone shed some light on this for me? It would also help to know if 
> it is not possible to sum the values up this way.
>
> Thanks a lot!
> Chantal
>
>


Nested JSON Facets (Subfacets)

2016-12-14 Thread CA
Hi all,

this is about using a function in nested facets, specifically the „sum()“ 
function inside a „terms“ facet using the json.facet api.

My json.facet parameter looks like this:

json.facet={shop_cat: {type:terms, field:shop_cat, facet: {cat_pop:"sum(popularity)"}}}

A snippet of the result:

"facets": {
  "count":2508,
  "shop_cat": {
    "buckets": [{
        "val": "Men > Clothing > Jumpers & Cardigans",
        "count":252,
        "cat_pop":0.0
      }, {
        "val":"Men > Clothing > Jackets & Coats",
        "count":157,
        "cat_pop":0.0
      }, // and more

This looks fine all over but it turns out that „cat_pop“, the result of 
„sum(popularity)“ is always 0.0 even if the documents for this facet value have 
popularities > 0.

A quick check with this json.facet parameter:

json.facet: {cat_pop:"sum(popularity)"}

returns:

"facets": {
  "count":2508,
  "cat_pop":21.0},

To me, it seems it works fine on the base level but not when nested. Still, 
Yonik’s documentation and the Jira issues indicate that it is possible to use 
functions in nested facets so I might just be using the wrong structure? I have 
a hard time finding any other examples on the internet and I had no luck changing 
the structure around.
Could someone shed some light on this for me? It would also help to know if it 
is not possible to sum the values up this way.

Thanks a lot!
Chantal
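
Editor's sketch: building the nested facet from this message as a Python dictionary and serializing it with json.dumps avoids hand-typed quoting slips in the json.facet parameter. The helper name nested_sum_facet is hypothetical, and rows=0 and wt=json are assumptions made for a facet-only check; they are not taken from the message.

```python
import json
from urllib.parse import urlencode


def nested_sum_facet(bucket_field, value_field, stat_name):
    """Build a terms facet over bucket_field with a per-bucket sum(value_field)."""
    return {
        bucket_field: {
            "type": "terms",
            "field": bucket_field,
            "facet": {stat_name: f"sum({value_field})"},
        }
    }


facet = nested_sum_facet("shop_cat", "popularity", "cat_pop")

# Assumption: rows=0 because only the facet output matters for this check.
params = urlencode({
    "q": "*:*",
    "rows": 0,
    "wt": "json",
    "json.facet": json.dumps(facet),
})

print(json.dumps(facet))
print(params)
```

The resulting string can be appended to the query handler URL of the collection under test.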