Re: solr 4.10 I change slop in pf2 pf3 and query norm changes

2015-12-21 Thread elisabeth benoit
Hello, That's what I did, as I wrote in my mail yesterday. In the first case, Solr computes the max. In the second case, it sums both results. That's why I don't get the same relative scoring between docs with the same query. 2015-12-22 8:30 GMT+01:00 Binoy Dalal : > Unless the content for both the docs i
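
The max-versus-sum effect described in this message can be sketched with a few lines of Python. This is only an illustration: the document ids and per-field scores below are invented, and the snippet does not reproduce Solr's actual scoring formula. It only shows that switching the combining function from max to sum can reorder two documents.

```python
# Hypothetical per-field phrase scores for two documents. The numbers are
# invented for illustration and do not come from the thread or from Solr.
field_scores = {
    "doc_a": {"catchall": 0.9, "name": 0.1},
    "doc_b": {"catchall": 0.6, "name": 0.6},
}


def rank(combine):
    """Return document ids ordered by their combined score, best first."""
    return sorted(
        field_scores,
        key=lambda doc_id: combine(field_scores[doc_id].values()),
        reverse=True,
    )


# Combining with max favours the document with the single strongest field.
ranking_with_max = rank(max)
# Combining with sum favours the document that matches well in both fields.
ranking_with_sum = rank(sum)

print(ranking_with_max)
print(ranking_with_sum)
```

With these assumed numbers the two rankings differ, which is the kind of change in relative scoring reported above.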

Re: solr 4.10 I change slop in pf2 pf3 and query norm changes

2015-12-21 Thread Binoy Dalal
Unless the content for both the docs is exactly the same it is highly unlikely that you will get the same score for the docs under different querying conditions. What you saw in the first case may have been a happy coincidence. Other than that it is very difficult to say why the scoring is differen

Re: Slow query response.

2015-12-21 Thread Modassar Ather
Thanks Jack for your response. The users of our application can enter a list of ids which the UI caps at 50k. All the ids are valid and match documents. We do faceting, grouping etc. on the result set of up to 50k documents. I checked and found that the query is not very resource intensive. It is

Re: solr 4.10 I change slop in pf2 pf3 and query norm changes

2015-12-21 Thread elisabeth benoit
Hello, yes, in the second case I get one document with a higher score. The relative scoring between documents is not the same anymore. Best regards, Elisabeth 2015-12-22 4:39 GMT+01:00 Binoy Dalal : > I have one query. > In the second case do you get two records with the same lower scores or > j

Re: solrcloud used a lot of memory and memory keep increasing during long time run

2015-12-21 Thread Erick Erickson
bq: What can we benefit from set maxWarmingSearchers to a larger value You really don't get _any_ value. That's in there as a safety valve to prevent run-away resource consumption. Getting this warning in your logs means you're mis-configuring your system. Increasing the value is almost totally us

documentCache - max concurrent queries

2015-12-21 Thread Vincenzo D'Amore
Hi all, looking at solr wiki https://cwiki.apache.org/confluence/display/solr/Query+Settings+in+SolrConfig I found this: "The size for the documentCache should always be greater than max_results times the max_concurrent_queries, to ensure that Solr does not need to refetch a document during a r
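
The sizing rule quoted from the wiki is simple arithmetic, sketched below in Python. The function name and the example numbers are assumptions made for illustration; only the rule itself (size greater than max_results times max_concurrent_queries) comes from the quoted text.

```python
def min_document_cache_size(max_results: int, max_concurrent_queries: int) -> int:
    """Smallest size that is strictly greater than max_results * max_concurrent_queries."""
    if max_results < 0 or max_concurrent_queries < 0:
        raise ValueError("both inputs must be non-negative")
    return max_results * max_concurrent_queries + 1


# Hypothetical workload: 50 rows per request and 20 concurrent queries.
suggested_size = min_document_cache_size(max_results=50, max_concurrent_queries=20)
print(suggested_size)
```

The result is a lower bound only; whether a larger cache is worthwhile depends on memory and access patterns that this sketch does not model.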

Re: solr 4.10 I change slop in pf2 pf3 and query norm changes

2015-12-21 Thread Binoy Dalal
I have one query. In the second case do you get two records with the same lower scores or just one record with a lower score and the other with a higher one? On Mon, 21 Dec 2015, 18:45 elisabeth benoit wrote: > Hello, > > I don't think the query is important in this case. > > After checking out

Re: Json facet api method stream

2015-12-21 Thread Yonik Seeley
On Mon, Dec 21, 2015 at 6:56 PM, Yago Riveiro wrote: > Does the JSON facet API method "stream" use docValues internally to do the > aggregation on the fly? > > I want to know if using this method justifies having docValues configured > in the schema. It won't use docValues for the actual field be

Re: Re: Re: Some problems when upload data to index in cloud environment

2015-12-21 Thread 周建二
Erick: Thank you so much for your advice. We do not index a large number of files now, but in the future we may. I will pay more attention to ExtractingRequestHandler. Thanks again. Best regards, Jianer > -Original Message- > From: "Erick Erickson" > Sent: Tuesday, December 22, 2015 > To: solr-user > Cc: >

Json facet api method stream

2015-12-21 Thread Yago Riveiro
Hi, Does the JSON facet API method "stream" use docValues internally to do the aggregation on the fly? I want to know if using this method justifies having docValues configured in the schema. - Best regards -- View this message in context: http://lucene.472066.n3.nabble.com/Json-facet-api

RE: Is Pivoted Grouping possible?

2015-12-21 Thread Lewin Joy (TMS)
If there is even a way to have a string concatenate function, we could bring out similar result sets. Is that possible? -Lewin -Original Message- From: Lewin Joy (TMS) [mailto:lewin@toyota.com] Sent: Monday, December 21, 2015 12:16 PM To: solr-user@lucene.apache.org Subject: Is Pivo

Is Pivoted Grouping possible?

2015-12-21 Thread Lewin Joy (TMS)
Hi, I am working with Solr 4.10.3, and we are trying to retrieve some documents under categories and sub-categories. With grouping we are able to bring back n records under each group. Could we have a pivoted grouping where I could bring back the results from sub-categories? Example: App

Re: TPS with Solr Cloud

2015-12-21 Thread Walter Underwood
How many documents do you have? How big is the index? You can increase total throughput with replicas. Shards will make it slower, but allow more documents. At 8000 queries/s, I assume you are using the same query over and over. If so, that is a terrible benchmark. Everything is served out of c

Re: TPS with Solr Cloud

2015-12-21 Thread Upayavira
You add shards to reduce response times. If your responses are too slow for 1 shard, try it with three. Skip two for reasons stated above. Upayavira On Mon, Dec 21, 2015, at 04:27 PM, Erick Erickson wrote: > 8,000 TPS almost certainly means you're firing the same (or > same few) requests over an

Re: Re: Some problems when upload data to index in cloud environment

2015-12-21 Thread Erick Erickson
Jianer: Getting your head around the configs is, indeed, "exciting" at times. I just wanted to caution you that using ExtractingRequestHandler puts the Tika parsing load on the Solr server, which doesn't scale as the same machine that's serving queries and indexing is _also_ parsing potentially v

Re: Numerous problems with SolrCloud

2015-12-21 Thread Erick Erickson
right, do note that when you _do_ hit an OOM, you really should restart the JVM as nothing is _really_ certain after that. You're right, just bumping the memory is a band-aid, but whatever gets you by. Lucene makes heavy use of MMapDirectory which uses OS memory rather than JVM memory, so you're r

Re: solrcloud used a lot of memory and memory keep increasing during long time run

2015-12-21 Thread Erick Erickson
Do you have any custom components? Indeed, you shouldn't have that many searchers open. But could we see a screenshot? That's the best way to ensure that we're talking about the same thing. Your autocommit settings are really hurting you. Your commit interval should be as long as you can tolerate.

Re: Numerous problems with SolrCloud

2015-12-21 Thread John Smith
OK, great. I've eliminated OOM errors after increasing the memory allocated to Solr: 12Gb out of 20Gb. It's probably not an optimal setting but this is all I can have right now on the Solr machines. I'll look into GC logging too. Turning to the Solr logs, a quick sweep revealed a lot of "Caused by

Re: TPS with Solr Cloud

2015-12-21 Thread Erick Erickson
8,000 TPS almost certainly means you're firing the same (or same few) requests over and over and hitting the queryResultCache, look in the adminUI>>core>>plugins/stats>>cache>>queryResultCache. I bet you're seeing a hit ratio near 100%. This is what Toke means when he says your tests are too lightw

Re: Numerous problems with SolrCloud

2015-12-21 Thread Erick Erickson
ZK isn't pushed all that heavily, although all things are possible. Still, for maintenance putting Zk on separate machines is a good idea. They don't have to be very beefy machines. Look in your logs for LeaderInitiatedRecovery messages. If you find them then _probably_ you have some issues with t

Re: Numerous problems with SolrCloud

2015-12-21 Thread John Smith
Thanks, I'll have a try. Can the load on the Solr servers impair the zk response time in the current situation, which would cause the desync? Is this the reason for the change? John. On 21/12/15 16:45, Erik Hatcher wrote: > John - the first recommendation that pops out is to run (only) 3 zookeep

Re: Numerous problems with SolrCloud

2015-12-21 Thread Erik Hatcher
John - the first recommendation that pops out is to run (only) 3 zookeepers, entirely separate from Solr servers, and then as many Solr servers from there that you need to scale indexing and querying to your needs. Sounds like 3 ZKs + 2 Solr’s is a good start, given you have 5 servers at your d

Numerous problems with SolrCloud

2015-12-21 Thread John Smith
This is my first experience with SolrCloud, so please bear with me. I've inherited a setup with 5 servers, 2 of which are Zookeeper only and the 3 others SolrCloud + Zookeeper. Versions are respectively 5.4.0 & 3.4.7. There's around 80 Gb of index, some collections are rather big (20Gb) and some v

Re: facet component and uninverted field

2015-12-21 Thread Jamie Johnson
Thanks, the issue I'm having is that there is no equivalent to method uif for the standard facet component. We'll see how SOLR-8096 shakes out. On Sun, Dec 20, 2015 at 11:29 PM, Upayavira wrote: > > > On Sun, Dec 20, 2015, at 01:32 PM, Jamie Johnson wrote: > > For those interested I've attached

Solr 5.4, NGramFilterFactory highlighting

2015-12-21 Thread Bjørn Hjelle
Hi, I have problems getting hit highlighting to work in NGram-fields, with search terms longer than 8 characters. Without the luceneMatchVersion="4.3" parameter in the field type definition, the whole word is highlighted, not just the search term. Here are the exact steps to reproduce the issue:

Re: solr 4.10 I change slop in pf2 pf3 and query norm changes

2015-12-21 Thread elisabeth benoit
Hello, I don't think the query is important in this case. After checking Solr's debug output, I don't think the query norm is relevant either. I think the scoring changes because 1) in the first case, I have the same slop for the catchall and name fields. Both match pf2 pf3. In this case, Solr uses max o

Re: new data structure for some fields

2015-12-21 Thread Binoy Dalal
I wasn't clear enough. What I meant was that basically your integer field should not be multivalued. That's it. If on the other hand your integer field is multivalued, sort will not work. You will have to figure out some sort of a conditional boosting approach wherein you check the integer value a

Re: new data structure for some fields

2015-12-21 Thread Emir Arnautovic
Maybe I'm missing something, but if c and b are one-to-one and you are filtering by c, how can you sort on b since all values will be the same? On 21.12.2015 13:10, Abhishek Mishra wrote: Hi binoy it will not work as category and integer is one to one mapping so if category_id is multivalued same go

Re: new data structure for some fields

2015-12-21 Thread Abhishek Mishra
Hi Binoy, it will not work, as category and integer are a one-to-one mapping, so if category_id is multivalued, the same goes for the integer. You need some kind of mechanism that identifies which integer to pick for the category_id given in the search; after that you can implement the sort accordingly. On Mon

Re: new data structure for some fields

2015-12-21 Thread Binoy Dalal
Small edit: The sort parameter in the solrconfig goes in the request handler declaration that you're using. So if it's select, put in the list. On Mon, 21 Dec 2015, 17:21 Binoy Dalal wrote: > OK. You will only be able to sort based on the integers if the integer > field is single valued, I.e. o

Re: new data structure for some fields

2015-12-21 Thread Binoy Dalal
OK. You will only be able to sort based on the integers if the integer field is single valued, i.e. only one integer is associated with one category ID. To do this you have to use the sort parameter. You can either specify it in your solrconfig.xml like so: integer asc Field name followed by the or
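
As a rough illustration of the sort parameter mentioned above, the Python sketch below builds a select URL that passes `sort` on the request instead of in solrconfig.xml. The field name `integer` follows the message; the host, the collection name `mycollection`, the query `category_id:c1`, and the helper name are assumptions made for the example.

```python
from urllib.parse import urlencode


def build_select_url(base_url: str, query: str, sort_field: str, direction: str = "asc") -> str:
    """Build a Solr /select URL with a single-field sort clause."""
    if direction not in ("asc", "desc"):
        raise ValueError("direction must be 'asc' or 'desc'")
    # The sort value is the field name followed by the order, as in "integer asc".
    params = {"q": query, "sort": f"{sort_field} {direction}"}
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical host, collection and query, used only to show the URL shape.
url = build_select_url("http://localhost:8983/solr/mycollection", "category_id:c1", "integer")
print(url)
```

This only covers the single-valued case discussed in the message; it does not address picking one integer per category from multivalued fields.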

Re: new data structure for some fields

2015-12-21 Thread Abhishek Mishra
Hi Binoy, thanks for the reply. By sort I mean sorting the data sets on the basis of the integer values given for that category. For any document, let's say for an id P1, the categories associated are c1,c2,c3,c4 (using a multivalued field). For the new implementation, similarly, a number is associated with each category.

Re: Solr 6 Distributed Join

2015-12-21 Thread Akiel Ahmed
Thank you for the help. I am working through what I want to do with the join - will let you know if I hit any issues. From: Joel Bernstein To: solr-user@lucene.apache.org Date: 17/12/2015 15:40 Subject: Re: Solr 6 Distributed Join One thing to note about the hashJoin is tha

Re: new data structure for some fields

2015-12-21 Thread Binoy Dalal
When you say sort, do you mean search on the basis of category and integers? Or score the docs based on their category and integer values? Also, for any given document, how many categories or integers are associated with it? On Mon, 21 Dec 2015, 14:43 Abhishek Mishra wrote: > Hello all > > i am

Re: solr 4.10 I change slop in pf2 pf3 and query norm changes

2015-12-21 Thread Binoy Dalal
What is your query? On Mon, 21 Dec 2015, 14:37 elisabeth benoit wrote: > Hello all, > > I am using solr 4.10.1 and I have configured my pf2 pf3 like this > > catchall~0^0.2 name~0^0.21 synonyms^0.2 > catchall~0^0.2 name~0^0.21 synonyms^0.2 > > my search field (qf) is my catchall field > > I'v be

Re: TPS with Solr Cloud

2015-12-21 Thread Emir Arnautovic
Hi Anshul, TPS depends on the number of concurrent requests you can run and the request processing time. With sharding you reduce processing time by reducing the amount of data a single node processes, but you have the overhead of inter-shard communication and merging results from different shards. If that overh
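
The relationship described here (throughput as a function of concurrency and processing time) can be approximated with Little's law. The Python sketch below is a back-of-the-envelope model, not a measurement: the function name and all timings are invented, and the fixed sharding overhead is an assumption standing in for inter-shard communication and merge cost.

```python
def estimated_tps(concurrent_requests: int, mean_processing_seconds: float) -> float:
    """Little's law: throughput is roughly concurrency divided by mean time per request."""
    if mean_processing_seconds <= 0:
        raise ValueError("mean_processing_seconds must be positive")
    return concurrent_requests / mean_processing_seconds


# Hypothetical numbers: one node answers a request in 10 ms.
single_node_tps = estimated_tps(concurrent_requests=16, mean_processing_seconds=0.010)

# With two shards each node searches less data (assumed 5 ms), but every
# request also pays an assumed 8 ms for inter-shard calls and result merging.
sharded_tps = estimated_tps(concurrent_requests=16, mean_processing_seconds=0.005 + 0.008)

print(single_node_tps, sharded_tps)
```

Under these assumed timings the sharded setup is slower per request, which matches the point that sharding only pays off once the per-node saving outweighs the coordination overhead.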

new data structure for some fields

2015-12-21 Thread Abhishek Mishra
Hello all, I am facing a requirement where an id p1 is associated with some category_ids c1,c2,c3,c4 and some integers b1,b2,b3,b4. We need to sort the Solr query results on the basis of b1/b2/b3/b4 depending on the given category_id. Right now we mapped the category_ids into multi-va

solr 4.10 I change slop in pf2 pf3 and query norm changes

2015-12-21 Thread elisabeth benoit
Hello all, I am using Solr 4.10.1 and I have configured my pf2 pf3 like this catchall~0^0.2 name~0^0.21 synonyms^0.2 catchall~0^0.2 name~0^0.21 synonyms^0.2 my search field (qf) is my catchall field I've been trying to change slop in pf2, pf3 for catchall and synonyms (going from 0, or default v

Re: Permutations of entries in a multivalued field

2015-12-21 Thread Johannes Riedl
Thanks a lot for these useful hints. Best, Johannes On 18.12.2015 20:59, Allison, Timothy B. wrote: Duh, didn't realize you could set inOrder in Solr. Y, that's the better solution. -Original Message- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: Friday, December 18, 2

Re: TPS with Solr Cloud

2015-12-21 Thread Toke Eskildsen
Anshul Sharma wrote: > I have configured Solr on 1 AWS server as a standalone application which is > giving me a tps of ~8000 for my query. [...] > In order to test the scalability, I have sharded the same data > across two AWS servers with 2.5 million records each. When I try to query > t