Thanks all.
I have the same index with a slightly different schema and 200M documents,
installed on 3 r3.xlarge instances (30GB RAM, and 600 General Purpose SSD). The size
of the index is about 1.5TB, with many updates every 5 minutes, and complex queries
and faceting with a response time of 100ms, which is acceptable for
Thanks Shawn.
What do you mean by "important parts of the index", and how do I calculate their
size?
Thanks,
Mahmoud
Sent from my iPhone
On 12/29/2014 12:07 PM, Mahmoud Almokadem wrote:
What do you mean by "important parts of the index", and how do I calculate their
size?
I have no formal education in what's important when it comes to doing a
query, but I can make some educated guesses.
Starting with this as a reference:
Mahmoud Almokadem [prog.mahm...@gmail.com] wrote:
I have the same index with a slightly different schema and 200M documents,
installed on 3 r3.xlarge instances (30GB RAM, and 600 General Purpose SSD). The size
of the index is about 1.5TB, with many updates every 5 minutes, and complex queries
and faceting with
On 12/26/2014 7:17 AM, Mahmoud Almokadem wrote:
Dears,
We've installed a cluster of one collection of 350M documents on 3
r3.2xlarge (60GB RAM) Amazon servers. The size of the index on each shard is
about 1.1TB, and the maximum volume size on Amazon is 1TB, so we added 2 General
Purpose SSD EBS volumes (1x1TB + 1x500GB) on each instance. Then we create logical
Likely lots of disk + network IO, yes. Put SPM for Solr on your nodes to double
check.
Otis
We have a Solr core with about 115 million documents. We are trying to
migrate data by running a simple *:* query with start
and rows params.
The performance is becoming very slow in Solr; it's taking almost 2 minutes
to get 4000 rows, and the migration is just too slow. Logs snippet
Hi,
How many shards do you have? This is a known issue with deep paging in multi-shard
setups, see https://issues.apache.org/jira/browse/SOLR-1726
You may be more successful in going to each shard, one at a time (with
distrib=false) to avoid this issue.
--
Jan Høydahl, search solution architect
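A rough sketch of the per-shard workaround described above: query each shard directly with distrib=false so the coordinator does not have to merge deep result pages across shards. The shard URLs and page size below are hypothetical placeholders, not from the thread.

```python
# Sketch: build non-distributed select URLs, one per shard, so each
# shard can be paged through independently (distrib=false).
from urllib.parse import urlencode

# Hypothetical shard base URLs; replace with your own.
SHARDS = [
    "http://shard1:8983/solr/collection1",
    "http://shard2:8983/solr/collection1",
]

def shard_query_url(shard_base, start, rows=4000):
    """Select URL for one page of one shard, kept local to that shard."""
    params = urlencode({
        "q": "*:*",
        "start": start,
        "rows": rows,
        "distrib": "false",  # do not fan out to other shards
    })
    return f"{shard_base}/select?{params}"

urls = [shard_query_url(s, start=0) for s in SHARDS]
```

Each shard still pays the deep-paging cost locally, but the expensive cross-shard merge of start+rows candidates is avoided.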
Jan,
Would the same distrib=false help for distributed faceting? We are running
into a similar issue with facet paging.
Dmitry
We have a single shard, and all the data is in a single box only.
Definitely looks like deep-paging is having problems.
Just to understand: is the searcher looping over the result set
every time and skipping the first 'start' documents? This will definitely
take a toll when we reach higher start
Abhishek,
There is a wiki regarding this:
http://wiki.apache.org/solr/CommonQueryParameters
search for pageDoc and pageScore.
On Mon, Apr 29, 2013 at 1:17 PM, Abhishek Sanoujam abhi.sanou...@gmail.com wrote:
We've found that you can do a lot for yourself by using a filter query
to page through your data if it has a natural range to do so instead
of start and rows.
Michael Della Bitta
Appinions
18 East 41st Street, 2nd Floor
New York, NY 10017-6271
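A minimal sketch of that range-paging idea, assuming a sortable unique field named id (the field name and page size are assumptions, not from the thread):

```python
# Sketch of range paging: instead of start/rows (which makes Solr collect
# and skip every preceding hit), walk the index with a filter query on a
# naturally ordered field. The field name "id" is an assumption.

def range_page_fq(last_id=None):
    """Filter query selecting the documents after the last id seen."""
    if last_id is None:
        return "id:[* TO *]"            # first page: everything
    return f"id:{{{last_id} TO *]"      # later pages: exclusive lower bound

def page_params(last_id=None, rows=4000):
    """Query parameters for one page; the sort keeps the walk deterministic."""
    return {"q": "*:*",
            "fq": range_page_fq(last_id),
            "sort": "id asc",
            "rows": rows}

first = page_params()
nxt = page_params(last_id="doc123")
```

After each page, the caller records the last id returned and feeds it back in, so no page ever has to skip over earlier results.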
I guess so, you'd have to use a filter query to page through the set
of documents you were faceting against and sum them all at the end.
It's not quite the same operation as paging through results, because
facets are aggregate statistics, but if you're willing to go through
the trouble, I bet it
Thanks.
Only question is how to smoothly transition to this model. Our facet
(string) fields contain timestamp prefixes that are reverse ordered
starting from the freshest value. In theory, we could try computing the
filter queries for those. But before doing so, we would need the matched
ids
On Mon, Oct 29, 2012 at 7:04 AM, Shawn Heisey s...@elyograg.org wrote:
They are indeed Java options. The first two control the maximum and
starting heap sizes. NewRatio controls the relative size of the young and
old generations, making the young generation considerably larger than it is
by
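As a config sketch only, JVM arguments of the kind Shawn describes would be passed on the Java command line like this; the heap sizes and the exact NewRatio value are illustrative assumptions, not recommendations from this thread:

```shell
# -Xms / -Xmx: starting and maximum heap size (values illustrative)
# -XX:NewRatio=1: old:young generation ratio of 1, making the young
#                 generation larger than the JVM default
java -Xms4g -Xmx4g -XX:NewRatio=1 -jar start.jar
```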
On Fri, Oct 26, 2012 at 11:04 PM, Shawn Heisey s...@elyograg.org wrote:
Warming doesn't seem to be a problem here -- all your warm times are zero,
so I am going to take a guess that it may be a heap/GC issue. I would
recommend starting with the following additional arguments to your JVM.
On Wed, Oct 24, 2012 at 4:33 PM, Walter Underwood wun...@wunderwood.org wrote:
Please consider never running optimize. That should be called force merge.
Thanks. I have been letting the system run for about two days already
without an optimize. I will let it run a week, then merge to see the
I spoke too soon! Whereas three days ago when the index was new, 500
records could be written to it in 3 seconds, now that operation is
taking a minute and a half, sometimes longer. I ran optimize() but
that did not help the writes. What can I do to improve the write
performance?
Even opening the
On Fri, Oct 26, 2012 at 4:02 PM, Shawn Heisey s...@elyograg.org wrote:
Taking all the information I've seen so far, my bet is on either cache
warming or heap/GC trouble as the source of your problem. It's now specific
information gathering time. Can you gather all the following information
On 10/26/2012 9:41 AM, Dotan Cohen wrote:
On the dashboard of the GUI, it lists all the jvm arguments. Include those.
Click Java Properties and gather the java.runtime.version and
java.specification.vendor information.
After one of the long update times, pause/stop your indexing application.
Please consider never running optimize. That should be called force merge.
wunder
Maybe you've been looking at it but one thing that I didn't see on a fast
scan was that maybe the commit bit is the problem. When you commit,
eventually the segments will be merged and a new searcher will be opened
(this is true even if you're NOT optimizing). So you're effectively committing
When Solr is slow, I'm seeing these in the logs:
[collection1] Error opening new searcher. exceeded limit of
maxWarmingSearchers=2, try again later.
[collection1] PERFORMANCE WARNING: Overlapping onDeckSearchers=2
Googling, I found this in the FAQ:
Typically the way to avoid this error is to
Hello!
You can check if the long warming is causing the overlapping
searchers. Check Solr admin panel and look at cache statistics, there
should be warmupTime property.
Lowering the autowarmCount should lower the time needed to warm up,
however you can also look at your warming queries (if you
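For reference, the knobs Rafał mentions live in solrconfig.xml; a sketch with illustrative values only, not tuned recommendations:

```xml
<!-- Illustrative values: a smaller autowarmCount shortens warming. -->
<filterCache class="solr.FastLRUCache"
             size="512"
             initialSize="512"
             autowarmCount="32"/>

<!-- The limit behind the "exceeded limit of maxWarmingSearchers" error. -->
<maxWarmingSearchers>2</maxWarmingSearchers>
```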
Are you using Solr 3X? The occasional long commit should no longer
show up in Solr 4.
- Mark
On Mon, Oct 22, 2012 at 10:44 AM, Dotan Cohen dotanco...@gmail.com wrote:
I've got a script writing ~50 documents to Solr at a time, then
committing. Each of these documents is no longer than 1 KiB of
On Mon, Oct 22, 2012 at 5:02 PM, Rafał Kuć r@solr.pl wrote:
Thank you, I have gone over the Solr admin panel twice and
On Mon, Oct 22, 2012 at 5:27 PM, Mark Miller markrmil...@gmail.com wrote:
Thank you Mark. In fact, this is the production release of Solr 4.
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
On 10/22/2012 9:58 AM, Dotan Cohen wrote:
Thank you, I have gone over the Solr admin panel twice and I cannot
find the cache statistics. Where are they?
If you are running Solr4, you can see individual cache autowarming times
here, assuming your core is named collection1:
Perhaps you can grab a snapshot of the stack traces when the 60 second
delay is occurring?
You can get the stack traces right in the admin ui, or you can use
another tool (jconsole, visualvm, jstack cmd line, etc)
- Mark
First, stop optimizing. You do not need to manually force merges. The system
does a great job. Forcing merges (optimize) uses a lot of CPU and disk IO and
might be the cause of your problem.
Second, the OS will use the extra memory for file buffers, which really helps
performance, so you might
Has the Solr team considered renaming the optimize function to avoid
leading people down the path of this antipattern?
Michael Della Bitta
Lucene already did that:
https://issues.apache.org/jira/browse/LUCENE-3454
Here is the Solr issue:
https://issues.apache.org/jira/browse/SOLR-3141
People over-use this regardless of the name. In Ultraseek Server, it was called
force merge and we had to tell people to stop doing that nearly
On Mon, Oct 22, 2012 at 4:39 PM, Michael Della Bitta
michael.della.bi...@appinions.com wrote:
If it were never the right thing to do, it could simply be removed.
The problem is
On Mon, Oct 22, 2012 at 10:01 PM, Walter Underwood
wun...@wunderwood.org wrote:
Thanks. Looking at
On Tue, Oct 23, 2012 at 3:52 AM, Shawn Heisey s...@elyograg.org wrote:
As soon as you make any change at all to an index, it's no longer
optimized. Delete one document, add one document, anything. Most of the
time you will not see a performance increase from optimizing an index that
consists
queries per second with the hardware mentioned above.
Thanks,
Regards,
--
- Siddhant
--
View this message in context:
http://old.nabble.com/Solr-Performance-Issues-tp27864278p27868456.html
Sent from the Solr - User mailing list archive at Nabble.com.
Hi everyone,
I have an index corresponding to ~2.5 million documents. The index size is
43GB. The configuration of the machine running Solr is a dual-processor
quad-core Xeon 5430 - 2.66GHz (Harpertown) - 2 x 12MB cache, 8GB
RAM, and 250 GB HDD.
I'm observing a strange trend in the
How many outstanding queries do you have at a time? Is it possible
that when you start, you have only a few queries executing concurrently
but as your test runs you have hundreds?
This really is a question of how your load test is structured. You might
get a better sense of how it works if your
Hi Erick,
The way the load test works is that it picks up 5000 queries, splits them
according to the number of threads (so if we have 10 threads, it schedules
10 threads - each one sending 500 queries). So it might be possible that the
number of queries at a point later in time is greater than
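The split described above can be sketched as follows (a toy sketch of the scheme, not the poster's actual test harness):

```python
# Sketch of the load-test split: 5000 queries divided evenly across
# N worker threads, each thread getting one contiguous slice.
def split_queries(queries, num_threads):
    """Divide the query list into num_threads contiguous slices."""
    per_thread = len(queries) // num_threads
    return [queries[i * per_thread:(i + 1) * per_thread]
            for i in range(num_threads)]

slices = split_queries(list(range(5000)), 10)  # 10 threads, 500 queries each
```

Note that with this scheme all 10 threads fire immediately, so concurrency is constant at 10 from the start; it does not ramp up over the run.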
--
View this message in context:
http://old.nabble.com/Solr-Performance-Issues-tp27864278p27872139.html
Sent from the Solr - User mailing list archive at Nabble.com.
On Jun 19, 2008, at 6:28 PM, Yonik Seeley wrote:
2. I use acts_as_solr and by default they only make post
requests, even
for /select. With that setup the response time for most queries,
simple or
complex ones, were ranging from 150ms to 600ms, with an average of
250ms. I
changed the
Hi,
I've been using Solr for a little while without worrying too much about how it
works, but now it's becoming a bottleneck in my application. I have a couple of
issues with it:
1. My index always gets slower and slower when committing/optimizing for some
obscure reason. It goes from 1 second with a new