Hello,
That's what I did, as I wrote in my mail yesterday. In the first case, Solr
computes the max. In the second case, it sums both results.
That's why I don't get the same relative scoring between docs with the same
query.
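The two behaviours can be sketched with Lucene's DisjunctionMaxQuery combination rule: the winning field's score plus `tie` times the scores of the remaining fields. With `tie=0` you get a pure max; with `tie=1` you get a plain sum. A minimal sketch (the field scores are made-up numbers, not real Solr output):

```python
def dismax_score(field_scores, tie=0.0):
    # DisjunctionMaxQuery-style combination: the best field score,
    # plus `tie` times the scores of the remaining fields.
    best = max(field_scores)
    return best + tie * (sum(field_scores) - best)

# tie=0.0 behaves like "max", tie=1.0 like "sum"
print(dismax_score([0.4, 0.3]))            # 0.4 (max)
print(dismax_score([0.4, 0.3], tie=1.0))   # 0.7 (sum)
```

With the same two field scores, the relative ordering of documents can change between the two modes, which matches the behaviour described above.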
2015-12-22 8:30 GMT+01:00 Binoy Dalal :
> Unless the content for both the docs i
Unless the content for both the docs is exactly the same it is highly
unlikely that you will get the same score for the docs under different
querying conditions. What you saw in the first case may have been a happy
coincidence.
Other than that it is very difficult to say why the scoring is different.
Thanks Jack for your response.
The users of our application can enter a list of ids which the UI caps at
50k. All the ids are valid and match documents. We do faceting, grouping
etc. on the result set of up to 50k documents.
I checked and found that the query is not very resource intensive. It is
Hello,
Yes, in the second case I get one document with a higher score. The relative
scoring between documents is not the same anymore.
best regards,
elisabeth
2015-12-22 4:39 GMT+01:00 Binoy Dalal :
> I have one query.
> In the second case do you get two records with the same lower scores or
> j
bq: What can we benefit from setting maxWarmingSearchers to a larger value?
You really don't get _any_ value. That's in there as a safety valve to
prevent run-away resource consumption. Getting this warning in your logs
means you're mis-configuring your system. Increasing the value is almost
totally useless.
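For reference, the setting lives in solrconfig.xml; a sketch of the usual conservative value (2 is the common default — the fix for the warning is to commit less often, not to raise this number):

```xml
<!-- solrconfig.xml: cap on searchers warming concurrently.
     Lengthen commit intervals instead of raising this number. -->
<maxWarmingSearchers>2</maxWarmingSearchers>
```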
Hi all,
Looking at the Solr wiki
https://cwiki.apache.org/confluence/display/solr/Query+Settings+in+SolrConfig
I found this:
"The size for the documentCache should always be greater than max_results
times the max_concurrent_queries, to ensure that Solr does not need to
refetch a document during a request."
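Applied to the rule quoted above, a sketch of the corresponding solrconfig.xml entry (the 5120 is an illustrative value for, say, max_results=512 and 10 concurrent queries, not a recommendation):

```xml
<!-- solrconfig.xml: documentCache sized per the wiki rule,
     size >= max_results * max_concurrent_queries.
     autowarmCount is 0 because internal doc ids change between searchers. -->
<documentCache class="solr.LRUCache"
               size="5120"
               initialSize="5120"
               autowarmCount="0"/>
```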
I have one query.
In the second case do you get two records with the same lower scores or
just one record with a lower score and the other with a higher one?
On Mon, 21 Dec 2015, 18:45 elisabeth benoit
wrote:
> Hello,
>
> I don't think the query is important in this case.
>
> After checking out
On Mon, Dec 21, 2015 at 6:56 PM, Yago Riveiro wrote:
> Does the json facet API method "stream" use docValues internally to do the
> aggregation on the fly?
>
> I want to know whether using this method justifies having docValues
> configured in the schema.
It won't use docValues for the actual field be
Erick:
Thank you so much for your advice. For now we do not index a large number of
files, but in the future we may. I will pay more attention to
ExtractingRequestHandler. Thanks again.
Best regards,
Jianer
> -Original Message-
> From: "Erick Erickson"
> Sent: Tuesday, December 22, 2015
> To: solr-user
> Cc:
>
Hi,
Does the json facet API method "stream" use docValues internally to do the
aggregation on the fly?
I want to know whether using this method justifies having docValues
configured in the schema.
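For context, the method in question is selected per-facet in the JSON Facet API; a sketch of such a request body (the field and query values are invented for illustration, and "stream" typically goes together with index-order sorting):

```json
{
  "query": "*:*",
  "facet": {
    "categories": {
      "type": "terms",
      "field": "category_id",
      "method": "stream",
      "sort": "index asc",
      "limit": -1
    }
  }
}
```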
-
Best regards
If there were even a way to use a string concatenation function, we could bring
out similar result sets. Is that possible?
-Lewin
-Original Message-
From: Lewin Joy (TMS) [mailto:lewin@toyota.com]
Sent: Monday, December 21, 2015 12:16 PM
To: solr-user@lucene.apache.org
Subject: Is Pivo
Hi,
I am working with Solr 4.10.3, and we are trying to retrieve some documents
for categories and sub-categories.
With grouping we are able to bring back n records under each group.
Could we have a pivoted grouping where I could bring in the results from
sub-categories?
Example:
App
How many documents do you have? How big is the index?
You can increase total throughput with replicas. Shards will make it slower,
but allow more documents.
At 8000 queries/s, I assume you are using the same query over and over. If so,
that is a terrible benchmark. Everything is served out of cache.
You add shards to reduce response times. If your responses are too slow
for 1 shard, try it with three. Skip two for reasons stated above.
Upayavira
On Mon, Dec 21, 2015, at 04:27 PM, Erick Erickson wrote:
> 8,000 TPS almost certainly means you're firing the same (or
> same few) requests over an
Jianer:
Getting your head around the configs is, indeed, "exciting" at times.
I just wanted to caution you that using ExtractingRequestHandler
puts the Tika parsing load on the Solr server, which doesn't
scale as the same machine that's serving queries and indexing
is _also_ parsing potentially v
right, do note that when you _do_ hit an OOM, you really
should restart the JVM as nothing is _really_ certain after
that.
You're right, just bumping the memory is a band-aid, but
whatever gets you by. Lucene makes heavy use of
MMapDirectory which uses OS memory rather than JVM
memory, so you're r
Do you have any custom components? Indeed, you shouldn't have
that many searchers open. But could we see a screenshot? That's
the best way to ensure that we're talking about the same thing.
Your autocommit settings are really hurting you. Your commit interval
should be as long as you can tolerate.
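A sketch of what "as long as you can tolerate" might look like in solrconfig.xml (the intervals are illustrative, not a recommendation; tune them to how stale your searchers may be):

```xml
<!-- solrconfig.xml: hard commit every 60s without opening a new searcher;
     soft commit (document visibility) only every 10 minutes -->
<autoCommit>
  <maxTime>60000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
  <maxTime>600000</maxTime>
</autoSoftCommit>
```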
OK, great. I've eliminated OOM errors after increasing the memory
allocated to Solr: 12Gb out of 20Gb. It's probably not an optimal
setting but this is all I can have right now on the Solr machines. I'll
look into GC logging too.
Turning to the Solr logs, a quick sweep revealed a lot of "Caused by
8,000 TPS almost certainly means you're firing the same (or
same few) requests over and over and hitting the queryResultCache,
look in the adminUI>>core>>plugins/stats>>cache>>queryResultCache.
I bet you're seeing a hit ratio near 100%. This is what Toke means
when he says your tests are too lightweight.
ZK isn't pushed all that heavily, although all things are possible. Still,
for maintenance putting Zk on separate machines is a good idea. They
don't have to be very beefy machines.
Look in your logs for LeaderInitiatedRecovery messages. If you find them
then _probably_ you have some issues with t
Thanks, I'll have a try. Can the load on the Solr servers impair the zk
response time in the current situation, which would cause the desync? Is
this the reason for the change?
John.
On 21/12/15 16:45, Erik Hatcher wrote:
> John - the first recommendation that pops out is to run (only) 3 zookeep
John - the first recommendation that pops out is to run (only) 3 ZooKeepers,
entirely separate from the Solr servers, and then as many Solr servers from there
as you need to scale indexing and querying. Sounds like 3 ZKs
+ 2 Solrs is a good start, given you have 5 servers at your d
This is my first experience with SolrCloud, so please bear with me.
I've inherited a setup with 5 servers, 2 of which are Zookeeper only and
the 3 others SolrCloud + Zookeeper. Versions are respectively 5.4.0 &
3.4.7. There's around 80 Gb of index, some collections are rather big
(20Gb) and some v
Thanks, the issue I'm having is that there is no equivalent to method uif
for the standard facet component. We'll see how SOLR-8096 shakes out.
On Sun, Dec 20, 2015 at 11:29 PM, Upayavira wrote:
>
>
> On Sun, Dec 20, 2015, at 01:32 PM, Jamie Johnson wrote:
> > For those interested I've attached
Hi,
I have problems getting hit highlighting to work in NGram fields with
search terms longer than 8 characters.
Without the luceneMatchVersion="4.3" parameter in the field type
definition, the whole word is highlighted, not just the search term.
Here are the exact steps to reproduce the issue:
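For concreteness, a sketch of the kind of field type involved (the names and gram sizes are assumptions, not the poster's actual schema; maxGramSize of 8 would match the 8-character threshold described, and the luceneMatchVersion argument on the filter is the parameter the poster toggles):

```xml
<!-- schema.xml: NGram-analyzed text field (hypothetical example) -->
<fieldType name="text_ngram" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.NGramFilterFactory" luceneMatchVersion="4.3"
            minGramSize="3" maxGramSize="8"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```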
Hello,
I don't think the query is important in this case.
After checking out Solr's debug output, I don't think the query norm is
relevant either.
I think the scoring changes because
1) in the first case, I have the same slop for the catchall and name fields. Both match
pf2 pf3. In this case, Solr uses max o
I wasn't clear enough. What I meant was that basically your integer field
should not be multivalued. That's it.
If on the other hand your integer field is multivalued, sort will not work.
You will have to figure out some sort of a conditional boosting approach
wherein you check the integer value a
Maybe missing something but if c and b are one-to-one and you are
filtering by c, how can you sort on b since all values will be the same?
On 21.12.2015 13:10, Abhishek Mishra wrote:
Hi binoy it will not work as category and integer is one to one mapping so
if category_id is multivalued same go
Hi Binoy, it will not work, as category and integer are a one-to-one mapping, so
if category_id is multivalued the same goes for the integer too. And you need some
kind of mechanism which will identify which integer to pick for a given
category_id for the search; after that you can implement the sort accordingly.
On Mon
Small edit:
The sort parameter in the solrconfig goes in the request handler
declaration that you're using. So if it's /select, put it in the defaults list.
On Mon, 21 Dec 2015, 17:21 Binoy Dalal wrote:
> OK. You will only be able to sort based on the integers if the integer
> field is single valued, I.e. o
OK. You will only be able to sort based on the integers if the integer
field is single-valued, i.e. only one integer is associated with one
category id.
To do this you have to use the sort parameter.
You can either specify it in your solrconfig.xml like so:
<str name="sort">integer asc</str>
Field name followed by the order
Hi Binoy,
Thanks for the reply. What I mean by sort is to sort the result sets on the basis
of the integer values given for that category.
For any document, let's say with an id P1,
the categories associated are c1,c2,c3,c4 (using a multivalued field).
For the new implementation,
similarly a number is associated with each category.
Thank you for the help.
I am working through what I want to do with the join - will let you know
if I hit any issues.
From: Joel Bernstein
To: solr-user@lucene.apache.org
Date: 17/12/2015 15:40
Subject:Re: Solr 6 Distributed Join
One thing to note about the hashJoin is tha
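For readers following along, the hashJoin under discussion is a Solr 6 streaming expression; a sketch of its shape (the collection, field, and query values here are invented for illustration):

```
hashJoin(
  search(people, q="*:*", fl="personId,name", sort="personId asc", qt="/export"),
  hashed=search(pets, q="type:cat", fl="personId,petName", sort="personId asc", qt="/export"),
  on="personId"
)
```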
When you say sort, do you mean search on the basis of category and
integers? Or score the docs based on their category and integer values?
Also, for any given document, how many categories or integers are
associated with it?
On Mon, 21 Dec 2015, 14:43 Abhishek Mishra wrote:
> Hello all
>
> i am
What is your query?
On Mon, 21 Dec 2015, 14:37 elisabeth benoit
wrote:
> Hello all,
>
> I am using solr 4.10.1 and I have configured my pf2 pf3 like this
>
> catchall~0^0.2 name~0^0.21 synonyms^0.2
> catchall~0^0.2 name~0^0.21 synonyms^0.2
>
> my search field (qf) is my catchall field
>
> I'v be
Hi Anshul,
TPS depends on the number of concurrent requests you can run and the request
processing time. With sharding you reduce processing time by reducing the
amount of data a single node processes, but you have the overhead of inter-shard
communication and merging results from different shards. If that
overh
Hello all,
I am facing a requirement where an id p1 is associated
with some category_ids c1,c2,c3,c4 with some integers b1,b2,b3,b4. We need
to sort the Solr query on the basis of b1/b2/b3/b4 depending on the given
category_id. Right now we have mapped the category_ids into multi-va
Hello all,
I am using solr 4.10.1 and I have configured my pf2 pf3 like this
catchall~0^0.2 name~0^0.21 synonyms^0.2
catchall~0^0.2 name~0^0.21 synonyms^0.2
my search field (qf) is my catchall field
I've been trying to change the slop in pf2, pf3 for catchall and synonyms (going
from 0, or default v
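As a sketch of where such settings typically live, an edismax request-handler fragment using the values quoted above (the handler name and the defType/qf entries are assumptions about the poster's setup):

```xml
<!-- solrconfig.xml: edismax handler; field~slop^boost syntax in pf2/pf3 -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="qf">catchall</str>
    <str name="pf2">catchall~0^0.2 name~0^0.21 synonyms^0.2</str>
    <str name="pf3">catchall~0^0.2 name~0^0.21 synonyms^0.2</str>
  </lst>
</requestHandler>
```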
Thanks a lot for these useful hints.
Best,
Johannes
On 18.12.2015 20:59, Allison, Timothy B. wrote:
Duh, didn't realize you could set inOrder in Solr. Y, that's the better
solution.
-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Friday, December 18, 2
Anshul Sharma wrote:
> I have configured Solr on 1 AWS server as a standalone application, which is
> giving me a tps of ~8000 for my query.
[...]
> In order to test the scalability, I have done sharding of the same data
> across two AWS servers with 2.5 million records each. When I try to query
> t