[jira] [Commented] (SOLR-7452) json facet api returning inconsistent counts in cloud set up
[ https://issues.apache.org/jira/browse/SOLR-7452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15322458#comment-15322458 ]

Vishnu Mishra commented on SOLR-7452:
-------------------------------------

[~ysee...@gmail.com] Any progress on the patch of this issue?

> json facet api returning inconsistent counts in cloud set up
> -------------------------------------------------------------
>
>                 Key: SOLR-7452
>                 URL: https://issues.apache.org/jira/browse/SOLR-7452
>             Project: Solr
>          Issue Type: Bug
>          Components: Facet Module
>    Affects Versions: 5.1
>            Reporter: Vamsi Krishna D
>              Labels: count, facet, sort
>             Fix For: 5.2
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> While using the newly added feature of json term facet api
> (http://yonik.com/json-facet-api/#TermsFacet) I am encountering inconsistent
> returns of counts of faceted value ( Note I am running on a cloud mode of
> solr). For example consider that i have txns_id(unique field or key),
> consumer_number and amount. Now for a 10 million such records , lets say i
> query for
> q=*:*=0&
> json.facet={
>    biskatoo:{
>        type : terms,
>        field : consumer_number,
>        limit : 20,
>        sort : {y:desc},
>        numBuckets : true,
>        facet:{
>            y : "sum(amount)"
>        }
>    }
> }
> the results are as follows ( some are omitted ):
> "facets":{
>     "count":6641277,
>     "biskatoo":{
>       "numBuckets":3112708,
>       "buckets":[{
>           "val":"surya",
>           "count":4,
>           "y":2.264506},
>         {
>           "val":"raghu",
>           "COUNT":3,   // capitalised for recognition
>           "y":1.8},
>         {
>           "val":"malli",
>           "count":4,
>           "y":1.78}]}}}
> but if i restrict the query to
> q=consumer_number:raghu=0&
> json.facet={
>    biskatoo:{
>        type : terms,
>        field : consumer_number,
>        limit : 20,
>        sort : {y:desc},
>        numBuckets : true,
>        facet:{
>            y : "sum(amount)"
>        }
>    }
> }
> i get :
> "facets":{
>     "count":4,
>     "biskatoo":{
>       "numBuckets":1,
>       "buckets":[{
>           "val":"raghu",
>           "COUNT":4,
>           "y":2429708.24}]}}}
> One can see the count results are inconsistent ( and I found many occasions
> of inconsistencies).
> I have tried the patch https://issues.apache.org/jira/browse/SOLR-7412 but
> still the issue seems not resolved

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
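
The comparison described in the report can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names (`terms_facet`, `bucket_count`, `count_mismatch`) are hypothetical, the two response dictionaries are transcribed from the excerpts quoted above (with the capitalised `COUNT` key written as the regular `count`), and no Solr server is contacted.

```python
import json


def terms_facet(field, metric, limit=20):
    """Build a json.facet body shaped like the one in the report.

    The facet name "biskatoo" is taken from the report; the function
    itself is a hypothetical helper, not part of any Solr client.
    """
    return {
        "biskatoo": {
            "type": "terms",
            "field": field,
            "limit": limit,
            "sort": {"y": "desc"},
            "numBuckets": True,
            "facet": {"y": metric},
        }
    }


def bucket_count(response, facet_name, value):
    """Return the count of the bucket whose val equals value, or None."""
    for bucket in response["facets"][facet_name]["buckets"]:
        if bucket["val"] == value:
            return bucket["count"]
    return None


def count_mismatch(first, second, facet_name, value):
    """True when the two responses disagree on the count for value."""
    return bucket_count(first, facet_name, value) != bucket_count(second, facet_name, value)


# Responses transcribed from the report (omitted buckets stay omitted).
unrestricted = {"facets": {"count": 6641277, "biskatoo": {
    "numBuckets": 3112708,
    "buckets": [
        {"val": "surya", "count": 4, "y": 2.264506},
        {"val": "raghu", "count": 3, "y": 1.8},
        {"val": "malli", "count": 4, "y": 1.78},
    ]}}}
restricted = {"facets": {"count": 4, "biskatoo": {
    "numBuckets": 1,
    "buckets": [{"val": "raghu", "count": 4, "y": 2429708.24}]}}}

# The body that would be sent as the json.facet request parameter.
facet_param = json.dumps(terms_facet("consumer_number", "sum(amount)"))
```

With the reported data, `count_mismatch(unrestricted, restricted, "biskatoo", "raghu")` flags the bucket for `raghu`, whose count is 3 in the unrestricted query and 4 in the restricted one.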
[jira] [Commented] (SOLR-7452) json facet api returning inconsistent counts in cloud set up
[ https://issues.apache.org/jira/browse/SOLR-7452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15063678#comment-15063678 ]

Vishnu Mishra commented on SOLR-7452:
-------------------------------------

Any progress on this issue...

> json facet api returning inconsistent counts in cloud set up
> -------------------------------------------------------------
>
>                 Key: SOLR-7452
>                 URL: https://issues.apache.org/jira/browse/SOLR-7452
>             Project: Solr
>          Issue Type: Bug
>          Components: faceting
>    Affects Versions: 5.1
>            Reporter: Vamsi Krishna D
>              Labels: count, facet, sort
>             Fix For: 5.2
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
[jira] [Commented] (SOLR-7867) implicit sharded, facet grouping problem with multivalued string field starting with digits
[ https://issues.apache.org/jira/browse/SOLR-7867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15062403#comment-15062403 ]

Vishnu Mishra commented on SOLR-7867:
-------------------------------------

We are using Solr 5.3.1 and facing the same issue with group.facet. Any progress?

> implicit sharded, facet grouping problem with multivalued string field
> starting with digits
> ----------------------------------------------------------------------
>
>                 Key: SOLR-7867
>                 URL: https://issues.apache.org/jira/browse/SOLR-7867
>             Project: Solr
>          Issue Type: Bug
>          Components: faceting, SolrCloud
>    Affects Versions: 5.2
>         Environment: 3.13.0-48-generic #80-Ubuntu SMP x86_64 GNU/Linux
> java version "1.7.0_80"
> Java(TM) SE Runtime Environment (build 1.7.0_80-b15)
> Java HotSpot(TM) 64-Bit Server VM (build 24.80-b11, mixed mode)
>            Reporter: Umut Erogul
>              Labels: docValues, facet, group, sharding
>         Attachments: DocValuesException.PNG, ErrorReadingDocValues.PNG
>
>
> related parts @ schema.xml:
> {code}
> docValues="true" multiValued="true"/>
> docValues="true"/>
> {code}
> every document has valid author_s and keyword_ss fields;
> we can make successful facet group queries on single node, single collection,
> solr-4.9.0 server
> {code}
> q: *:* fq: keyword_ss:3m
> facet=true=keyword_ss=true=author_s=true
> {code}
> when querying on solr-5.2.0 server with implicit sharded environment with:
> {code}
> required="true"/>
> {code}
> with example shard names; affinity1 affinity2 affinity3 affinity4
> the same query with same documents gets:
> {code}
> ERROR - 2015-08-04 08:15:15.222; [document affinity3 core_node32
> document_affinity3_replica2] org.apache.solr.common.SolrException;
> org.apache.solr.common.SolrException: Exception during facet.field: keyword_ss
>     at org.apache.solr.request.SimpleFacets$3.call(SimpleFacets.java:632)
>     at org.apache.solr.request.SimpleFacets$3.call(SimpleFacets.java:617)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>     at org.apache.solr.request.SimpleFacets$2.execute(SimpleFacets.java:571)
>     at org.apache.solr.request.SimpleFacets.getFacetFieldCounts(SimpleFacets.java:642)
>     ...
>     at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.ArrayIndexOutOfBoundsException
>     at org.apache.lucene.codecs.lucene50.Lucene50DocValuesProducer$CompressedBinaryDocValues$CompressedBinaryTermsEnum.readTerm(Lucene50DocValuesProducer.java:1008)
>     at org.apache.lucene.codecs.lucene50.Lucene50DocValuesProducer$CompressedBinaryDocValues$CompressedBinaryTermsEnum.next(Lucene50DocValuesProducer.java:1026)
>     at org.apache.lucene.search.grouping.term.TermGroupFacetCollector$MV$SegmentResult.nextTerm(TermGroupFacetCollector.java:373)
>     at org.apache.lucene.search.grouping.AbstractGroupFacetCollector.mergeSegmentResults(AbstractGroupFacetCollector.java:91)
>     at org.apache.solr.request.SimpleFacets.getGroupedCounts(SimpleFacets.java:541)
>     at org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:463)
>     at org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:386)
>     at org.apache.solr.request.SimpleFacets$3.call(SimpleFacets.java:626)
>     ... 33 more
> {code}
> all the problematic queries are caused by strings starting with digits;
> ("3m", "8 saniye", "2 broke girls", "1v1y")
> there are some strings for which the query works, like ("24", "90+", "45 dakika")
> we do not observe the problem when querying with
> -keyword_ss:(0-9)*
> updating the problematic documents (a small subset of keyword_ss:(0-9)*)
> fixes the query,
> but we cannot find an easy way to identify the problematic documents
> there are around 400m docs, separated across 28 shards;
> -keyword_ss:(0-9)* matches 97% of documents
[jira] [Commented] (SOLR-7452) json facet api returning inconsistent counts in cloud set up
[ https://issues.apache.org/jira/browse/SOLR-7452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030437#comment-15030437 ]

Vishnu Mishra commented on SOLR-7452:
-------------------------------------

I have the same count mismatch issue between json facet API and Simple facet in distributed search. I am using Solr 5.3.1.

> json facet api returning inconsistent counts in cloud set up
> -------------------------------------------------------------
>
>                 Key: SOLR-7452
>                 URL: https://issues.apache.org/jira/browse/SOLR-7452
>             Project: Solr
>          Issue Type: Bug
>          Components: faceting
>    Affects Versions: 5.1
>            Reporter: Vamsi Krishna D
>              Labels: count, facet, sort
>             Fix For: 5.2
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
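
A mismatch between the two faceting styles could be checked along these lines. This is a hedged sketch, not anything from the issue: it assumes the classic `facet.field` response uses the flat `[term, count, term, count, ...]` list layout, the function names are hypothetical, and the sample counts are made up purely for illustration.

```python
def classic_counts(response, field):
    """Convert a flat [term, count, term, count, ...] list into a dict.

    Assumes the flat list layout for facet_fields in the JSON response.
    """
    flat = response["facet_counts"]["facet_fields"][field]
    return dict(zip(flat[0::2], flat[1::2]))


def json_facet_counts(response, facet_name):
    """Map each bucket value of a JSON Facet API terms facet to its count."""
    buckets = response["facets"][facet_name]["buckets"]
    return {bucket["val"]: bucket["count"] for bucket in buckets}


def mismatches(classic, json_facet):
    """Return {term: (classic_count, json_count)} for terms that disagree."""
    shared = classic.keys() & json_facet.keys()
    return {t: (classic[t], json_facet[t]) for t in shared if classic[t] != json_facet[t]}


# Made-up responses for illustration only; not data from the issue.
classic_response = {"facet_counts": {"facet_fields": {
    "consumer_number": ["surya", 4, "raghu", 4, "malli", 4]}}}
json_response = {"facets": {"count": 12, "by_consumer": {"buckets": [
    {"val": "surya", "count": 4},
    {"val": "raghu", "count": 3},
    {"val": "malli", "count": 4},
]}}}

diff = mismatches(
    classic_counts(classic_response, "consumer_number"),
    json_facet_counts(json_response, "by_consumer"),
)
```

Only terms present in both responses are compared, so a term that falls outside one response's `limit` is not reported as a mismatch.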
[jira] [Comment Edited] (SOLR-7452) json facet api returning inconsistent counts in cloud set up
[ https://issues.apache.org/jira/browse/SOLR-7452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030437#comment-15030437 ]

Vishnu Mishra edited comment on SOLR-7452 at 11/28/15 9:05 AM:
---------------------------------------------------------------

I have the same count mismatch issue between the JSON Facet API and simple faceting in distributed search. I am using Solr 5.3.1, and accurate facet counts are very important to us.

was (Author: vdil...@gmail.com):
I have the same count mismatch issue between json facet API and Simple facet in distributed search. I am using Solr 5.3.1.

> json facet api returning inconsistent counts in cloud set up
> -------------------------------------------------------------
>
>                 Key: SOLR-7452
>                 URL: https://issues.apache.org/jira/browse/SOLR-7452
>             Project: Solr
>          Issue Type: Bug
>          Components: faceting
>    Affects Versions: 5.1
>            Reporter: Vamsi Krishna D
>              Labels: count, facet, sort
>             Fix For: 5.2
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
[jira] [Commented] (SOLR-7495) Unexpected docvalues type NUMERIC when grouping by a int facet
[ https://issues.apache.org/jira/browse/SOLR-7495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14736462#comment-14736462 ]

Vishnu Mishra commented on SOLR-7495:
-------------------------------------

I also have the same problem in Solr 5.3.

> Unexpected docvalues type NUMERIC when grouping by a int facet
> --------------------------------------------------------------
>
>                 Key: SOLR-7495
>                 URL: https://issues.apache.org/jira/browse/SOLR-7495
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 5.0, 5.1, 5.2, 5.3
>            Reporter: Fabio Batista da Silva
>         Attachments: SOLR-7495.patch
>
>
> Hey All,
> After upgrading from solr 4.10 to 5.1 with solr cloud
> I'm getting a IllegalStateException when i try to facet a int field.
> IllegalStateException: unexpected docvalues type NUMERIC for field 'year'
> (expected=SORTED). Use UninvertingReader or index with docvalues.
> schema.xml
> {code}
> multiValued="false" required="true"/>
> multiValued="false" required="true"/>
> stored="true"/>
> />
> sortMissingLast="true"/>
> positionIncrementGap="0"/>
> positionIncrementGap="0"/>
> positionIncrementGap="0"/>
> precisionStep="0" positionIncrementGap="0"/>
> positionIncrementGap="0"/>
> positionIncrementGap="100">
> words="stopwords.txt" />
> maxGramSize="15"/>
> words="stopwords.txt" />
> synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
> positionIncrementGap="100">
> words="stopwords.txt" />
> maxGramSize="15"/>
> words="stopwords.txt" />
> synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
> class="solr.SpatialRecursivePrefixTreeFieldType" geo="true"
> distErrPct="0.025" maxDistErr="0.09" units="degrees" />
> id
> name
> {code}
> query :
> {code}
> http://solr.dev:8983/solr/my_collection/select?wt=json=id=index_type:foobar=true=year_make_model=true=true=year
> {code}
> Exception :
> {code}
> ull:org.apache.solr.common.SolrException: Exception during facet.field: year
>     at org.apache.solr.request.SimpleFacets$3.call(SimpleFacets.java:627)
>     at org.apache.solr.request.SimpleFacets$3.call(SimpleFacets.java:612)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>     at org.apache.solr.request.SimpleFacets$2.execute(SimpleFacets.java:566)
>     at org.apache.solr.request.SimpleFacets.getFacetFieldCounts(SimpleFacets.java:637)
>     at org.apache.solr.request.SimpleFacets.getFacetCounts(SimpleFacets.java:280)
>     at org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:106)
>     at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:222)
>     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
>     at org.apache.solr.core.SolrCore.execute(SolrCore.java:1984)
>     at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:829)
>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:446)
>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:220)
>     at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
>     at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
>     at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
>     at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
>     at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
>     at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
>     at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
>     at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
>     at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
>     at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
>     at
[jira] [Created] (SOLR-8023) Add Support for group.facet in json facet API
Vishnu Mishra created SOLR-8023:
-----------------------------------

             Summary: Add Support for group.facet in json facet API
                 Key: SOLR-8023
                 URL: https://issues.apache.org/jira/browse/SOLR-8023
             Project: Solr
          Issue Type: New Feature
          Components: Facet Module
    Affects Versions: 5.3.1
         Environment: Solr5.3, Java 8, Apache Tomcat 8, Windows 7
            Reporter: Vishnu Mishra
             Fix For: 5.3.1


With the normal facet component we can specify group.facet=true to facet on the grouped result. Can we add similar functionality to the Facet Module (JSON Facet API)?
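
The difference between plain facet counts and grouped facet counts can be illustrated with a small in-memory sketch. It assumes that grouped faceting counts the number of distinct groups per facet value rather than the number of documents; the documents, field values, and function names below are hypothetical and do not come from Solr or from the issue.

```python
from collections import defaultdict


def facet_counts(docs, facet_field):
    """Plain faceting: count documents per facet value."""
    counts = defaultdict(int)
    for doc in docs:
        counts[doc[facet_field]] += 1
    return dict(counts)


def grouped_facet_counts(docs, facet_field, group_field):
    """Grouped faceting: count distinct groups per facet value."""
    groups = defaultdict(set)
    for doc in docs:
        groups[doc[facet_field]].add(doc[group_field])
    return {value: len(members) for value, members in groups.items()}


# Hypothetical documents: three by one author, one by another.
docs = [
    {"id": 1, "author_s": "alice", "category": "solr"},
    {"id": 2, "author_s": "alice", "category": "solr"},
    {"id": 3, "author_s": "alice", "category": "lucene"},
    {"id": 4, "author_s": "bob", "category": "solr"},
]

plain = facet_counts(docs, "category")
grouped = grouped_facet_counts(docs, "category", "author_s")
```

Here `plain` reports three documents for `solr`, while `grouped` reports two, because only two distinct authors have a document in that category.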
[jira] [Commented] (SOLR-4285) Solr mangles distributed query parameters
[ https://issues.apache.org/jira/browse/SOLR-4285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613485#comment-14613485 ]

Vishnu Mishra commented on SOLR-4285:
-------------------------------------

I am using Solr 4.10.3 and facing the same issue with distributed search.

Solr mangles distributed query parameters
-----------------------------------------

                Key: SOLR-4285
                URL: https://issues.apache.org/jira/browse/SOLR-4285
            Project: Solr
         Issue Type: Bug
         Components: SolrCloud
   Affects Versions: Trunk
        Environment: trunk check outs between august 2012 and january 2013 using Tomcat6 and Java6.
           Reporter: Markus Jelsma
           Priority: Critical
            Fix For: Trunk

Using Siege to load test a cluster via a load balancer we sometimes see the forwarded query strings being mangled and causing an error. The problem mainly manifests in a function query parameter and the host__terms parameter. It doesn't seem to be an issue of concurrency because it also happens when load testing with a single thread.

function query parameter causing an error:
{code}
2012-12-12 11:11:45,527 ERROR [solr.core.SolrCore] - [http-8080-exec-16] - : org.apache.solr.common.SolrException: org.apache.solr.search.SyntaxError: Expected ',' at position 55 in 'if(exists(date),max(recip(ms(NOW/DAY,date),3.17e-8,143 .9),.8),.7)'
    at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:154)
    ...
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.solr.search.SyntaxError: Expected ',' at position 55 in 'if(exists(date),max(recip(ms(NOW/DAY,date),3.17e-8,143 .9),.8),.7)'
{code}
The above error is somewhat older but for some reason the comma in the edismax boost parameter is replaced by a newline.

host__terms parameter causing an error:
{code}
2013-01-08 12:09:08,902 ERROR [handler.component.FacetComponent] - [http-8080-exec-13] - : Unexpected term returned for facet refining. key=host term='dafenwout.domain.ext^daisybel.domain.ext' request params=spellcheck=falsefacet=truesort=score+desc.facet.field=%7B%21terms%3D%24host__terms+ex%3Dhost%7Dhosthost__terms=daanjobbe.domain.ext%2Cdaank.domain.ext%2Cdafenwout.domain.ext%2Cdaisybel.domain.ext%2Cdaniellehooijmans.domain.ext...koenleurs.domain.ext toRefine=[Ljava.util.List;@5b48447f response={daanjobbe.domain.ext=0,daank.domain.ext=0,dafenwout.domain.ext^daisybel.domain.ext=0,daniellehooijmans.domain.ext=0...koenleurs.domain.ext=0}
{code}
I've shortened the above error significantly; it was about 20kB. It's the caret symbol causing the issue. For some reason the logged request does not contain the caret symbol.

Both issues are very elusive and hard to reproduce, but in our case they will appear if we send queries long enough, 50k, 100k queries.

Original thread: http://lucene.472066.n3.nabble.com/SolrCloud-breaks-distributed-query-strings-td4026314.html
[jira] [Created] (SOLR-7002) Solr Cursor Support for Faceting
Vishnu Mishra created SOLR-7002:
-----------------------------------

             Summary: Solr Cursor Support for Faceting
                 Key: SOLR-7002
                 URL: https://issues.apache.org/jira/browse/SOLR-7002
             Project: Solr
          Issue Type: Improvement
          Components: faceting
    Affects Versions: 4.10.4
            Reporter: Vishnu Mishra
             Fix For: 4.10.4


Solr cursor mark support has been added for deep paging, and it works like a charm. Can this support also be added for faceting, as an alternative to setting facet.offset? Is this possible?
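
For context, offset-based paging through facet values, which is what this request would like a cursor to replace, can be sketched as below. The sketch only builds request parameter dictionaries using the standard `facet.limit` and `facet.offset` parameters; the function name, the field name `category`, and the page sizes are hypothetical, and nothing is sent to a server.

```python
def facet_pages(field, page_size, max_pages):
    """Yield Solr request parameters for successive pages of facet values.

    Each page advances facet.offset by page_size, so later pages force
    Solr to skip an ever larger number of facet values.
    """
    for page in range(max_pages):
        yield {
            "q": "*:*",
            "rows": 0,
            "facet": "true",
            "facet.field": field,
            "facet.limit": page_size,
            "facet.offset": page * page_size,
        }


# Parameters for the first three pages of a hypothetical "category" field.
pages = list(facet_pages("category", 100, 3))
```

The growing `facet.offset` on later pages mirrors the deep-paging cost that cursor marks avoid for regular result sets.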