Hi,
We have a 10-node SolrCloud cluster (5 shards, 2 replicas) with a 30 GB JVM on
60 GB machines and 40 GB of index.
We're constantly noticing that Solr queries take longer while an update (with
the commit=false setting) is in progress. A query that usually takes 0.5 seconds
takes up to 2 minutes while
What do you have for your _softcommit_ settings in solrconfig.xml? I'm
guessing you're using SolrJ or similar, but the solrconfig settings
will trip a commit as well.
For that matter, what are all your commit settings in solrconfig.xml,
both hard and soft?
Best,
Erick
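To answer the question above it helps to pull the hard and soft commit settings out of solrconfig.xml in one go. The following is a minimal sketch, assuming the settings live in the usual autoCommit and autoSoftCommit elements under updateHandler. The sample config text and its values are hypothetical, not taken from this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical solrconfig.xml fragment; the values are illustrative only.
SAMPLE_SOLRCONFIG = """
<config>
  <updateHandler class="solr.DirectUpdateHandler2">
    <autoCommit>
      <maxTime>15000</maxTime>
      <openSearcher>false</openSearcher>
    </autoCommit>
    <autoSoftCommit>
      <maxTime>1000</maxTime>
    </autoSoftCommit>
  </updateHandler>
</config>
"""


def commit_settings(solrconfig_xml):
    """Return the hard and soft auto-commit settings found in the config text."""
    root = ET.fromstring(solrconfig_xml)
    settings = {}
    for name in ("autoCommit", "autoSoftCommit"):
        element = root.find("./updateHandler/" + name)
        if element is None:
            settings[name] = None  # this kind of auto-commit is not configured
        else:
            # Collect each child element (maxTime, maxDocs, openSearcher, ...) as text.
            settings[name] = {child.tag: (child.text or "").strip() for child in element}
    return settings


if __name__ == "__main__":
    for name, values in commit_settings(SAMPLE_SOLRCONFIG).items():
        print(name, values)
```

Even when the client sends commit=false, either of these elements can still trigger commits on the server side, which is why both are worth reporting.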
On Tue, Apr 8, 2014 at
Hi Joshi;
Go to the Plugins/Stats section under your collection in the Solr Admin UI.
You will see the cache statistics for the different types of caches. hitratio
and evictions are good statistics to look at first. On the other hand, you
should read here:
-binary @- -H 'Content-type:text/plain;
charset=utf-8'
EnD)
-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Tuesday, April 08, 2014 2:21 PM
To: solr-user@lucene.apache.org
Subject: Re: solr4 performance question
What do you have for your _softcommit_
I'm debating whether or not to set the 'facets.missing' parameter to true by
default when faceting. What is the performance impact of setting
'facets.missing' to true?
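One way to measure the impact rather than guess is to issue the same facet request with and without the flag and compare the QTime values. Below is a minimal sketch that only builds the two request URLs. Note that Solr spells the parameter facet.missing. The base URL and the field name "category" are hypothetical placeholders.

```python
from urllib.parse import urlencode


def facet_query_url(base_url, field, include_missing):
    """Build a select URL that facets on `field`, with facet.missing on or off."""
    params = [
        ("q", "*:*"),
        ("rows", "0"),  # only the facet counts are of interest here
        ("facet", "true"),
        ("facet.field", field),
        ("facet.missing", "true" if include_missing else "false"),
    ]
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical core URL and field name, for illustration only.
    base = "http://localhost:8983/solr/collection1/select"
    print(facet_query_url(base, "category", True))
    print(facet_query_url(base, "category", False))
```

Running both URLs several times against a warmed index, and ignoring the first few responses, gives a fairer comparison than a single request.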
--
View this message in context:
http://lucene.472066.n3.nabble.com/Performance-Question-facets-missing-tp4099602.html
Sent
On Wed, Nov 6, 2013 at 12:07 PM, andres and...@octopart.com wrote:
I'm debating whether or not to set the 'facets.missing' parameter to true by
default when faceting. What is the performance impact of setting
'facets.missing' to true?
It really depends on the faceting method. For some
if it
gives us desired performance.
Thanks for looking into this. Appreciate your help.
-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Tuesday, August 13, 2013 8:12 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr4 update and query performance
1. That's hard-coded at present. There's anecdotal evidence that there
are throughput improvements with larger batch sizes, but no action
yet.
2. Yep, all searchers are also re-opened, caches re-warmed, etc.
3. Odd. I'm assuming your Solr3 was a master/slave setup? Seeing the
queries
Hi,
We have a SolrCloud (4.4.0) cluster (5 shards and 2 replicas) on 10 boxes with
about 450 million documents (~90 million per shard). We're loading 1000 or fewer
documents in CSV format every few minutes. In Solr3, with 300 million documents, it
used to take 30 seconds to load 1000 documents while in
So after re-feeding our data with a new boolean field that is true when
data exists and false when it doesn't, our search times have gone from an average
of about 20s to around 150ms... pretty amazing change in perf... It seems
like https://issues.apache.org/jira/browse/SOLR-5093 might alleviate many
On 8/5/2013 7:13 AM, Steven Bower wrote:
So after re-feeding our data with a new boolean field that is true when
data exists and false when it doesn't our search times have gone from avg
of about 20s to around 150ms... pretty amazing change in perf... It seems
like
From: Steven Bower-2 [via Lucene]
ml-node+s472066n4082569...@n3.nabble.com
Date: Monday, August 5, 2013 9:14 AM
To: Smiley, David W. dsmi...@mitre.org
Subject: Re: Performance question on Spatial Search
So after re-feeding
On Wed, Jul 31, 2013 at 1:10 AM, Steven Bower sbo...@alcyon.net wrote:
not sure what you mean by good hit ratio?
I mean such queries are really expensive (even on a cache hit), so if the
list of ids changes every time, it never hits the cache and hence executes these
heavy queries every time. It's
the list of IDs does change relatively frequently, but this doesn't seem to
have very much impact on the performance of the query as far as I can tell.
attached are the stacks
thanks,
steve
On Wed, Jul 31, 2013 at 6:33 AM, Mikhail Khludnev
mkhlud...@griddynamics.com wrote:
On Wed, Jul 31,
of where to go with this I'd love to
know
thanks,
steve
-
Author:
http://www.packtpub.com/apache-solr-3-enterprise-search-server/book
--
View this message in context:
http://lucene.472066.n3.nabble.com/Performance-question-on-Spatial-Search
-tp4081150p4081309.html
Sent from the Solr - User mailing list archive at Nabble.com.
--
- Luis Cappa
Very good read... Already using MMap... verified using pmap and vsz from
top..
not sure what you mean by good hit ratio?
Here are the stacks...
Name                                                                                         Time (ms)  Own Time (ms)
org.apache.lucene.search.MultiTermQueryWrapperFilter.getDocIdSet(AtomicReaderContext, Bits)  300879     203478
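Elsewhere in this thread the fix that was reported to work was re-feeding the data with a boolean field that is true when the data exists, and filtering on that field. The sketch below shows that idea at feed time. It assumes the expensive clause was an existence-style check on a field, which the thread does not state outright. The field names "geo" and "has_geo" are hypothetical.

```python
def add_presence_flag(doc, field, flag_field):
    """Return a copy of `doc` with `flag_field` set to True when `field` has a value."""
    flagged = dict(doc)
    value = doc.get(field)
    # Treat a missing key, None, an empty string and an empty list as "no data".
    flagged[flag_field] = value is not None and value != "" and value != []
    return flagged


def presence_filter(flag_field):
    """Filter query that selects documents whose flag is true."""
    return flag_field + ":true"


if __name__ == "__main__":
    docs = [
        {"id": "1", "geo": "45.0,-93.0"},
        {"id": "2"},
    ]
    for doc in docs:
        print(add_presence_flag(doc, "geo", "has_geo"))
    print("fq=" + presence_filter("has_geo"))
```

Filtering on a single boolean term is a plain term lookup, whereas an existence check has to enumerate every term in the field, which would be consistent with the time spent in MultiTermQueryWrapperFilter above.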
@David I will certainly update when we get the data re-fed... and if you
have things you'd like to investigate or try out please let me know.. I'm
happy to eval things at scale here... we will be taking this index from its
current 45m records to 600-700m over the next few months as well..
steve
On
Can you compare with the old geo handler as a baseline?
Bill Bell
Sent from mobile
On Jul 29, 2013, at 4:25 PM, Erick Erickson erickerick...@gmail.com wrote:
This is very strange. I'd expect slow queries on
the first few queries while these caches were
warmed, but after that I'd expect
i shift machine to m1.large for 250 data or for 500??
or it will work for now ??
--
View this message in context:
http://lucene.472066.n3.nabble.com/SOLR-Performance-question-tp4041245.html
Sent from the Solr - User mailing list archive at Nabble.com.
...@gmail.com]
Sent: Tuesday, February 19, 2013 1:46 PM
To: solr-user@lucene.apache.org
Subject: SOLR Performance question
Hi everybody.
I store 42 fields in Solr and index 34 of them.
I am going to store 4-6 more columns and index 3-5 of them.
The total number of docs I have stored is 250
and may
I am having some difficulty migrating our Solr indexing scripts from Solr 3.5
to Solr 4.0. Notably, I am trying to track down why our performance in Solr 4.0
is about 5-10 times slower when indexing documents. Querying is still quite
fast.
The code adds documents in groups of 1000, and adds
It's hard to guess, but I might start by looking at what the new UpdateLog is
costing you. Take its definition out of solrconfig.xml and try your test
again. Then let's take it from there.
- Mark
On Jan 23, 2013, at 11:00 AM, Kevin Stone kevin.st...@jax.org wrote:
I am having some
Do you mean commenting out the <updateLog>...</updateLog> tag? Because
that I already commented out. Or do I also need to remove the entire
<updateHandler> tag? Sorry, I am not too familiar with everything in the
solrconfig file. I have a tag that essentially looks like this:
updateHandler
I'm still poking around trying to find the differences. I found a couple
things that may or may not be relevant.
First, when I start up my 3.5 solr, I get all sorts of warnings that my
solrconfig is old and will run using 2.4 emulation.
Of course I had to upgrade the solrconfig for the 4.0 instance
Another revelation...
I can see that there is a time difference in the Solr output for adding
these documents when I watch it realtime.
Here are some rows from the 3.5 solr server:
Jan 23, 2013 11:57:23 AM org.apache.solr.core.SolrCore execute
INFO: [gxdResult] webapp=/solr path=/update/javabin
Mikhail,
Thanks for the response. Just to be clear you're saying that the size
of the index does not matter, it's more the size of the results?
On Fri, Mar 16, 2012 at 2:43 PM, Mikhail Khludnev
mkhlud...@griddynamics.com wrote:
Hello,
Frankly speaking the computational complexity of Lucene
Exactly. That's what I mean.
On Mon, Mar 19, 2012 at 6:15 PM, Jamie Johnson jej2...@gmail.com wrote:
Mikhail,
Thanks for the response. Just to be clear you're saying that the size
of the index does not matter, it's more the size of the results?
On Fri, Mar 16, 2012 at 2:43 PM, Mikhail
The size of the index does matter practically speaking.
Bill Bell
Sent from mobile
On Mar 19, 2012, at 11:41 AM, Mikhail Khludnev mkhlud...@griddynamics.com
wrote:
Exactly. That's what I mean.
On Mon, Mar 19, 2012 at 6:15 PM, Jamie Johnson jej2...@gmail.com wrote:
Mikhail,
Thanks
I'm curious if anyone can tell me how Solr/Lucene performs in a situation
where you have 100,000 documents each with 100 tokens vs having
1,000,000 documents each with 10 tokens. Should I expect the
performance to be the same? Any information would be greatly
appreciated.
Hello,
Frankly speaking, the computational complexity of a Lucene search depends on the
size of the search result, numFound*log(start+rows), but not on the size of the index.
Regards
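The estimate numFound*log(start+rows) matches what a bounded priority queue costs: every matching document is offered to a heap that holds start+rows entries. Here is a minimal sketch of that arithmetic and of the heap-based paging. The function names are made up for illustration and this is not Lucene's actual collector code.

```python
import heapq
import math


def estimated_cost(num_found, start, rows):
    """Rough work estimate: one heap operation of size (start + rows) per match."""
    # Clamp the heap size to 2 so the logarithm never collapses to zero.
    return num_found * math.log2(max(start + rows, 2))


def top_hits(scores, start, rows):
    """Return the requested page of scores, best first, using a bounded heap."""
    best = heapq.nlargest(start + rows, scores)
    return best[start:start + rows]


if __name__ == "__main__":
    # Same number of matches, deeper paging: the estimate grows with start + rows.
    print(estimated_cost(100000, 0, 10))
    print(estimated_cost(100000, 10000, 10))
    print(top_hits([0.1, 0.9, 0.5, 0.7], 1, 2))
```

Index size does not appear in the estimate at all; it only matters indirectly, through how many documents a given query ends up matching.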
On Fri, Mar 16, 2012 at 9:34 PM, Jamie Johnson jej2...@gmail.com wrote:
I'm curious if anyone tell me how Solr/Lucene performs in
Strictly speaking, there are some insignificant distinctions in performance
related to how a field name is resolved -- Grant alluded to this
earlier in this thread -- but it only comes into play when you actually
refer to that field by name and Solr has to look it up in the
metadata. So for
You don't lose copyField capability with dynamic fields. You can copy
dynamic fields into a fixed field name like *_s => text or dynamic
fields into another dynamic field like *_s => *_t
Erik
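The two mappings described above can be pictured as a small name-rewriting rule. The sketch below only illustrates the naming behaviour for patterns with a leading wildcard. It is not Solr's implementation, and the function name is made up.

```python
def copy_field_target(field_name, source, dest):
    """Return the destination field for `field_name` under a copyField-style rule.

    Only patterns with a leading '*' are handled in this sketch.
    Returns None when the rule does not apply to the field.
    """
    if not source.startswith("*"):
        # Plain source name: the rule applies only on an exact match.
        return dest if field_name == source else None
    suffix = source[1:]
    if not field_name.endswith(suffix):
        return None
    if dest.startswith("*"):
        # Dynamic destination: keep the part the wildcard matched, swap the suffix.
        stem = field_name[: len(field_name) - len(suffix)]
        return stem + dest[1:]
    return dest  # fixed destination field


if __name__ == "__main__":
    print(copy_field_target("title_s", "*_s", "text"))
    print(copy_field_target("title_s", "*_s", "*_t"))
    print(copy_field_target("price_i", "*_s", "text"))
```

In the schema itself these rules are written as copyField declarations with source and dest attributes.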
On Jan 6, 2010, at 9:35 AM, A. Steven Anderson wrote:
Strictly speaking there is some
You don't lose copyField capability with dynamic fields. You can copy
dynamic fields into a fixed field name like *_s => text or dynamic fields
into another dynamic field like *_s => *_t
Ahhh...I missed that little detail. Nice!
Ok, so there are no negatives to using dynamic fields then.
: So, in general, there is no *significant* performance difference with using
: dynamic fields. Correct?
:
: Correct. There's not even really an insignificant performance difference.
: A dynamic field is the same as a regular field in practically every way on the
: search side of things.
On Jan 4, 2010, at 12:04 AM, A. Steven Anderson wrote:
dynamic fields don't make it worse ... the number of actual field names
you sort on makes it worse.
If you sort on 100 fields, the cost is the same regardless of whether all
100 of those fields exist because of a single <dynamicField/>
Sorting and index norms have space penalties.
Sorting on a field creates an array of Java ints, one for every
document in the index. Index norms (used for boosting documents and
other things) create an array of bytes in the Lucene index files, one
for every document in the index.
If you sort
: If you sort on many of your dynamic fields your memory use will
: explode, and the same with index norms and disk space.
: Thanks for the info. In general, I knew sorting was expensive, but I didn't
: realize that dynamic fields made it worse.
dynamic fields don't make it worse ... the
dynamic fields don't make it worse ... the number of actual field names
you sort on makes it worse.
If you sort on 100 fields, the cost is the same regardless of whether all
100 of those fields exist because of a single <dynamicField/> declaration,
or 100 distinct <field/> declarations.
Sorting and index norms have space penalties.
Sorting on a field creates an array of Java ints, one for every
document in the index. Index norms (used for boosting documents and
other things) create an array of bytes in the Lucene index files, one
for every document in the index.
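The space penalties above are easy to put numbers on. This is a back-of-the-envelope sketch that counts only the per-document arrays mentioned here (4 bytes per Java int, 1 byte per norm) and ignores anything else a sort may need, such as term data for string fields. The document and field counts in the example are hypothetical.

```python
BYTES_PER_JAVA_INT = 4
BYTES_PER_NORM = 1


def sort_array_bytes(num_docs, num_sort_fields):
    """Memory for the int arrays: one int per document per field sorted on."""
    return BYTES_PER_JAVA_INT * num_docs * num_sort_fields


def norms_bytes(num_docs, num_fields_with_norms):
    """Index space for norms: one byte per document per field that keeps norms."""
    return BYTES_PER_NORM * num_docs * num_fields_with_norms


if __name__ == "__main__":
    docs = 10_000_000  # hypothetical index size
    for fields in (1, 10, 100):
        mib = sort_array_bytes(docs, fields) / (1024 * 1024)
        print(f"sorting on {fields} field(s): about {mib:.0f} MiB of int arrays")
    print(f"norms on 100 fields: about {norms_bytes(docs, 100) / (1024 * 1024):.0f} MiB")
```

The cost scales with the number of distinct field names actually sorted on, not with how those fields were declared in the schema.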
If you sort on
On Dec 29, 2009, at 2:19 PM, A. Steven Anderson wrote:
Greetings!
Is there any significant negative performance impact of using a
dynamicField?
There can be an impact if you are searching against a lot of fields or if you
are indexing a lot of fields on every document, but for the most
There can be an impact if you are searching against a lot of fields or if
you are indexing a lot of fields on every document, but for the most part in
most applications it is negligible.
We index a lot of fields at one time, but we can tolerate the performance
impact at index time.
It
Greetings!
Is there any significant negative performance impact of using a
dynamicField?
Likewise for multivalued fields?
The reason why I ask is that our system basically aggregates data from many
disparate data sources (structured, unstructured, and semi-structured), and
the management of the
,
could be the JVM being busy sweeping the garbage out, etc.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
From: Robert Purdy [EMAIL PROTECTED]
To: solr-user@lucene.apache.org
Sent: Thursday, November 15, 2007 4:05:00 PM
Subject: Performance question
the time next to the query in the log greater on the production server? If
so, what is the best way to configure Tomcat to deal with that issue?
Thanks Robert.
--
View this message in context:
http://www.nabble.com/Performance-question%3A-Solr-64-bit-java-vs-32-bit-mode.-tf4817186.html#a13781791
On 11/5/07, Haishan Chen [EMAIL PROTECTED] wrote:
As for the first issue: the number of different phrase queries with
performance issues that I have found so far is about 10.
If these are normal phrase queries (no slop), a good solution might be
to simply index and query these phrases as a single
He means extremely frequent and I agree. --wunder
On 11/2/07 1:51 AM, Haishan Chen [EMAIL PROTECTED] wrote:
Thanks for the advice. You certainly have a point. I believe you mean a query
term that appears in 5-10% of an index in a natural language corpus is
extremely INFREQUENT?
From: [EMAIL PROTECTED]
Subject: Re: Phrase Query Performance Question
Date: Thu, 1 Nov 2007 11:25:26 -0700
To: solr-user@lucene.apache.org
On 31-Oct-07, at 11:54 PM, Haishan Chen wrote:
Date: Wed, 31 Oct 2007 17:54:53 -0700
Subject: Re: Phrase Query Performance Question
From
On 2-Nov-07, at 10:03 AM, Haishan Chen wrote:
Date: Fri, 2 Nov 2007 07:32:30 -0700
Subject: Re: Phrase Query Performance Question
From: [EMAIL PROTECTED]
To: solr- [EMAIL PROTECTED]
He means extremely frequent and I agree. --wunder
Then it means a PHRASE (combination of terms
: It still feels to me that you are trying to do something unique with your
: phrase queries. Unfortunately, you still haven't said what you are trying to
: do in general terms, which makes it very difficult for people to help you.
Agreed. This seems like a very special case, but we don't know what
Date: Fri, 2 Nov 2007 12:31:29 -0700
From: [EMAIL PROTECTED]
To: solr-user@lucene.apache.org
Subject: Re: Phrase Query Performance Question
: It still feels to me that you are trying to do something unique with your
: phrase queries. Unfortunately, you still haven't said what you
On 31-Oct-07, at 11:54 PM, Haishan Chen wrote:
Date: Wed, 31 Oct 2007 17:54:53 -0700
Subject: Re: Phrase Query Performance Question
From: [EMAIL PROTECTED]
To: solr- [EMAIL PROTECTED]
hurricane katrina is a very expensive query against a collection focused on
Hurricane Katrina
From: [EMAIL PROTECTED]
Subject: Re: Phrase Query Performance Question
Date: Tue, 30 Oct 2007 11:22:17 -0700
To: solr-user@lucene.apache.org
On 30-Oct-07, at 6:09 AM, Yonik Seeley wrote:
On 10/30/07, Haishan Chen [EMAIL PROTECTED] wrote:
Thanks a lot for replying Yonik! I am
On 31-Oct-07, at 2:40 PM, Haishan Chen wrote:
http://mail-archives.apache.org/mod_mbox/lucene-java-user/
200512.mbox/[EMAIL PROTECTED]
It mentioned that http://websearch.archive.org/katrina/ (in nutch)
had 10M documents and a search of hurricane katrina was able to
return in 1.35 seconds
From: [EMAIL PROTECTED]
Subject: Re: Phrase Query Performance Question
Date: Wed, 31 Oct 2007 15:25:42 -0700
To: solr-user@lucene.apache.org
On 31-Oct-07, at 2:40 PM, Haishan Chen wrote:
http://mail-archives.apache.org/mod_mbox/lucene-java-user/
200512.mbox/[EMAIL PROTECTED
hurricane katrina is a very expensive query against a collection
focused on Hurricane Katrina. There will be many matches in many
documents. If you want to measure worst-case, this is fine.
I'd try other things, like:
* ninth ward
* Ray Nagin
* Audubon Park
* Canal Street
* French Quarter
* FEMA
: (auto repair) 100384 hits 946 ms
: (auto repair) 100384 hits 31 ms
: (car repair~100) 112183 hits 766 ms
: (car repair) 112183 hits 63 ms
: (business service~100) 1209751 hits 1500 ms
: (business service) 1209751 hits 234 ms
: (shopping center~100) 119481 hits 359 ms
: (shopping
Date: Wed, 31 Oct 2007 19:19:07 -0700
From: [EMAIL PROTECTED]
To: solr-user@lucene.apache.org
Subject: RE: Phrase Query Performance Question
: (auto repair) 100384 hits 946 ms
: (auto repair) 100384 hits 31 ms
: (car repair~100) 112183 hits 766 ms
: (car repair) 112183 hits 63 ms
Date: Wed, 31 Oct 2007 17:54:53 -0700
Subject: Re: Phrase Query Performance Question
From: [EMAIL PROTECTED]
To: solr-user@lucene.apache.org
hurricane katrina is a very expensive query against a collection focused
on Hurricane Katrina. There will be many matches in many documents
Thanks a lot for replying Yonik!
I am running Solr on a Windows 2003 server (standard version), Intel Xeon CPU
3.00GHz, with 4.00 GB RAM.
The index is located on RAID 5 with 2 million documents. Is there any way to
improve query performance without moving to a more powerful computer?
I
On 10/30/07, Haishan Chen [EMAIL PROTECTED] wrote:
Thanks a lot for replying Yonik!
I am running Solr on a Windows 2003 server (standard version), Intel Xeon CPU
3.00GHz, with 4.00 GB RAM.
The index is located on RAID 5 with 2 million documents. Is there any way to
improve query performance
On 30-Oct-07, at 6:09 AM, Yonik Seeley wrote:
On 10/30/07, Haishan Chen [EMAIL PROTECTED] wrote:
Thanks a lot for replying Yonik!
I am running Solr on a Windows 2003 server (standard version),
Intel Xeon CPU 3.00GHz, with 4.00 GB RAM.
The index is located on RAID 5 with 2 million documents.
I am a new Solr user and wonder if anyone can help me with these questions. I used
Solr to index about two million documents and query them using the standard
request handler. I disabled all caches. I found that phrase queries were substantially
slower than the usual queries. The statistics I collected are as
On 3/26/07, climbingrose [EMAIL PROTECTED] wrote:
I'm developing an application that potentially creates thousands of dynamic
fields. Does anyone know if a large number of dynamic fields will degrade
Solr performance?
Thousands of fields won't be a problem if
- you don't sort on most of them
Thanks Yonik. I think both of the conditions hold true for our application
;).
On 3/27/07, Yonik Seeley [EMAIL PROTECTED] wrote:
On 3/26/07, climbingrose [EMAIL PROTECTED] wrote:
I'm developing an application that potentially creates thousands of
dynamic
fields. Does anyone know if a large
Hi all,
I'm developing an application that potentially creates thousands of dynamic
fields. Does anyone know if a large number of dynamic fields will degrade
Solr performance?
Thanks.
--
Regards,
Cuong Hoang