[
https://issues.apache.org/jira/browse/SOLR-7452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16048265#comment-16048265
]
Yonik Seeley commented on SOLR-7452:
------------------------------------
bq. Basically Calcite is expecting a long in this scenario because we map all
integer types to longs in the SolrSchema.java class.
Ah, that's where that mapping is!
bq. I think the best way to deal with this is to have the FacetStream convert
all Integer buckets to Longs.
Here's a quick patch that seems to get things working:
{code}
diff --git a/solr/solrj/src/java/org/apache/solr/client/solrj/io/stream/FacetStream.java b/solr/solrj/src/java/org/apache/solr/client/solrj/io/stream/FacetStream.java
index c5bd56bcb9..fb53e8464b 100644
--- a/solr/solrj/src/java/org/apache/solr/client/solrj/io/stream/FacetStream.java
+++ b/solr/solrj/src/java/org/apache/solr/client/solrj/io/stream/FacetStream.java
@@ -477,6 +477,9 @@ public class FacetStream extends TupleStream implements Expressible {
     for(int b=0; b<allBuckets.size(); b++) {
       NamedList bucket = (NamedList)allBuckets.get(b);
       Object val = bucket.get("val");
+      if (val instanceof Integer) {
+        val=((Integer)val).longValue();  // calcite currently expects Long values here
+      }
       Tuple t = currentTuple.clone();
       t.put(bucketName, val);
       int nextLevel = level+1;
{code}
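For illustration only, the same conversion can be isolated as a small standalone helper. This is a sketch, not code from the Solr tree: the class and method names (BucketValueNormalizer, normalize, normalizeAll) are hypothetical, and it works on plain Objects rather than on NamedList or Tuple, so it only demonstrates the Integer-to-Long widening that the patch applies to each bucket value.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Solr: mirrors the instanceof check in the patch above.
final class BucketValueNormalizer {
  private BucketValueNormalizer() {}

  // Widens Integer values to Long; every other value (including null) is returned unchanged.
  static Object normalize(Object val) {
    if (val instanceof Integer) {
      return ((Integer) val).longValue(); // boxed as Long on return
    }
    return val;
  }

  // Applies normalize() to each element and returns a new list in the same order.
  static List<Object> normalizeAll(List<?> vals) {
    List<Object> out = new ArrayList<>(vals.size());
    for (Object v : vals) {
      out.add(normalize(v));
    }
    return out;
  }
}
```

With this sketch, normalize(Integer.valueOf(42)) yields a Long, while Strings, Doubles, and existing Longs pass through untouched, which matches the narrow scope of the patch.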
> json facet api returning inconsistent counts in cloud set up
> ------------------------------------------------------------
>
> Key: SOLR-7452
> URL: https://issues.apache.org/jira/browse/SOLR-7452
> Project: Solr
> Issue Type: Bug
> Components: Facet Module
> Affects Versions: 5.1
> Reporter: Vamsi Krishna D
> Labels: count, facet, sort
> Attachments: SOLR-7452.patch, SOLR-7452.patch, SOLR-7452.patch
>
> Original Estimate: 96h
> Remaining Estimate: 96h
>
> While using the newly added JSON terms facet API
> (http://yonik.com/json-facet-api/#TermsFacet), I am seeing inconsistent
> counts for faceted values (note that I am running Solr in cloud mode).
> For example, suppose each record has txns_id (the unique key),
> consumer_number, and amount. With about 10 million such records, say I
> query for
> q=*:*&rows=0&
> json.facet={
> biskatoo:{
> type : terms,
> field : consumer_number,
> limit : 20,
> sort : {y:desc},
> numBuckets : true,
> facet:{
> y : "sum(amount)"
> }
> }
> }
> The results are as follows (some are omitted):
> "facets":{
> "count":6641277,
> "biskatoo":{
> "numBuckets":3112708,
> "buckets":[{
> "val":"surya",
> "count":4,
> "y":2.264506},
> {
> "val":"raghu",
> "COUNT":3, // capitalised for recognition
> "y":1.8},
> {
> "val":"malli",
> "count":4,
> "y":1.78}]}}}
> But if I restrict the query to
> q=consumer_number:raghu&rows=0&
> json.facet={
> biskatoo:{
> type : terms,
> field : consumer_number,
> limit : 20,
> sort : {y:desc},
> numBuckets : true,
> facet:{
> y : "sum(amount)"
> }
> }
> }
> I get:
> "facets":{
> "count":4,
> "biskatoo":{
> "numBuckets":1,
> "buckets":[{
> "val":"raghu",
> "COUNT":4,
> "y":2429708.24}]}}}
> One can see that the count results are inconsistent (and I have found many
> other such inconsistencies).
> I have tried the patch from https://issues.apache.org/jira/browse/SOLR-7412,
> but the issue still does not seem to be resolved.