Great updates. Thanks for keeping us all in the loop!
On Thu, Oct 22, 2020 at 7:43 PM Wei wrote:
Hi Shawn,
I'm circling back with some new findings on our two-NUMA issue. After a
few iterations, we do see improvement with the UseNUMA flag and other JVM
setting changes. Here are the current settings, with Java 11:
-XX:+UseNUMA
-XX:+UseG1GC
-XX:+AlwaysPreTouch
-XX:+UseTLAB
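For reference, one way flags like these might be wired in is the GC_TUNE variable in solr.in.sh (the variable name is standard in Solr's startup scripts; the values below are just the ones listed above, not a recommendation):

```shell
# solr.in.sh -- GC_TUNE is appended to the JVM options by bin/solr
GC_TUNE="-XX:+UseNUMA \
-XX:+UseG1GC \
-XX:+AlwaysPreTouch \
-XX:+UseTLAB"
```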
On 9/28/2020 12:17 PM, Wei wrote:
Thanks Shawn. Looks like Java 11 is the way to go with -XX:+UseNUMA. Do you
see any backward compatibility issue for Solr 8 with Java 11? Can we run
Solr 8 built with JDK 8 in Java 11 JRE, or need to rebuild solr with Java
11 JDK?
I do not know of any problems.
On Sat, Sep 26, 2020 at 6:44 PM Shawn Heisey wrote:
On 9/26/2020 1:39 PM, Wei wrote:
Thanks Shawn! Currently we are still using the CMS collector for Solr with
Java 8. When last evaluated with Solr 7, CMS performed better than G1 in
our case. When using G1, is it better to upgrade from Java 8 to Java 11?
From https://lucene.apache.org/solr/guide/8_4/solr-system-requirements.html,
Thanks Dominique. I'll start with the -XX:+UseNUMA option.
Best,
Wei
On Fri, Sep 25, 2020 at 7:04 AM Dominique Bejean wrote:
Hi,
This would be a Java VM option, not something Solr itself can know about.
Take a look at this article and its comments. Maybe it will help.
https://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html?showComment=1347033706559#c229885263664926125
Regards
Dominique
On Thu, Sep 24, 2020, Wei wrote:
Hi,
Recently we deployed Solr 8.4.1 on a batch of new servers with two NUMA
nodes. I noticed that query latency almost doubled compared to deployment on
single-NUMA machines. Not sure what's causing the huge difference. Is there
any tuning to boost the performance on multiple-NUMA machines? Any pointers?
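As later replies in the thread confirm, the usual first step is the JVM's NUMA-aware allocator. A sketch of checking the topology and enabling the flag (numactl is a standard Linux tool; using SOLR_OPTS in solr.in.sh is an assumption about how the JVM is launched here):

```shell
# Show how many NUMA nodes the box has and how memory is split across them
numactl --hardware

# Enable NUMA-aware allocation in the Solr JVM (solr.in.sh);
# -XX:+UseNUMA is most effective with a parallel or G1 collector
SOLR_OPTS="$SOLR_OPTS -XX:+UseNUMA"
```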
On 6/1/2020 9:29 AM, Odysci wrote:
Hi,
I'm looking for some advice on improving performance of our solr setup.
Does anyone have any insights on what would be better for maximizing
throughput on multiple searches being done at the same time?
thanks!
In almost all cases, adding memory will help.
Hi,
I'm looking for some advice on improving the performance of our Solr setup,
in particular about the trade-offs between fewer larger machines vs. more
smaller machines. Our full index has just over 100 million docs, and we do
almost all searches using fq's (with q=*:*) and facets. We are using
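For concreteness, a request of the shape described (match-all q, filter queries, facets) might look like this; the collection, field names, and values are placeholders:

```shell
curl 'http://localhost:8983/solr/products/select' \
  --data-urlencode 'q=*:*' \
  --data-urlencode 'fq=category:books' \
  --data-urlencode 'facet=true' \
  --data-urlencode 'facet.field=author'
```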
On 4/18/2020 12:20 PM, Odysci wrote:
We don't use this field for general queries (q=*:*), only for fq and
faceting.
Do you think making it indexed="true" would make a difference in fq
performance?
fq means "filter query". It's still a query. So yes, the field should
be indexed. The query
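For illustration, a schema declaration of the kind Shawn describes might look like this; the field name comes from the thread, but the string type and the docValues attribute (helpful for faceting) are assumptions:

```xml
<!-- indexed="true" so fq can use the index; docValues="true" for faceting;
     stored can stay "false" if the value never needs to be returned -->
<field name="field1_name" type="string" indexed="true" stored="false" docValues="true"/>
```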
On Sat, Apr 18, 2020 at 3:06 PM Sylvain James wrote:
Hi Reinaldo,
Involved fields should be indexed for better performance.
Sylvain
On Sat, Apr 18, 2020 at 18:46, Odysci wrote:
Hi,
We are seeing significant performance degradation on single queries that
use fq with multiple values as in:
fq=field1_name:(V1 V2 V3 ...)
If we use only one value in the fq (say only V1) we get QTime = T ms.
As we increase the number of values, say to 5 values, QTime more than
triples, even
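One way to see the two query shapes side by side; the second form uses Solr's {!terms} query parser, which is often suggested for long value lists because it does a single set lookup instead of scoring N boolean clauses (whether it helps in this particular case is untested):

```python
def boolean_fq(field, values):
    # fq=field:(V1 V2 V3) -- each value becomes a scored boolean clause
    return field + ":(" + " ".join(values) + ")"

def terms_fq(field, values):
    # fq={!terms f=field}V1,V2,V3 -- a single constant-score set lookup
    return "{!terms f=" + field + "}" + ",".join(values)

print(boolean_fq("field1_name", ["V1", "V2", "V3"]))
print(terms_fq("field1_name", ["V1", "V2", "V3"]))
```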
Hi,
It means that you are either committing too frequently or your warming up takes
too long. If you are committing on every bulk, stop doing that and use
autocommit.
Regards,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training -
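The autocommit Emir suggests above is configured in solrconfig.xml; a sketch, with illustrative intervals only:

```xml
<autoCommit>
  <!-- hard commit: flush to disk, but don't open a new searcher -->
  <maxTime>60000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
  <!-- soft commit: make new documents visible to searchers -->
  <maxTime>300000</maxTime>
</autoSoftCommit>
```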
Hi All,
I am using Solr 7.5 with a master-slave architecture.
I am getting :
"PERFORMANCE WARNING: Overlapping onDeckSearchers=2"
continuously on my master logs for all cores. Please help me to resolve this.
Thanks & Regards,
Akreeti Agarwal
Hi,
Which Solr version are you using?
Also, how many collections do you have, and how many records have you
indexed in those collections?
Regards,
Edwin
On Mon, 4 Feb 2019 at 23:33, Anchal Sharma2 wrote:
Hi All,
We had recently enabled SSL on Solr. But afterwards, our application
performance degraded significantly, i.e. the time for the source
application to fetch a record from Solr increased from approx 4 ms to
200 ms (this is for a single record). This amounts to a lot of time, when
On 11/21/2018 8:59 AM, Marc Schöchlin wrote:
Is it possible to modify the log4j appender to also log other query attributes
like response/request size in bytes and number of resulted documents?
Changing the log4j config might not do anything useful at all. In order
for such a change to be
Hello list,
I am using the pretty old Solr 4.7 release *sigh* and am currently
investigating performance problems.
The Solr instance currently runs very expensive queries with huge results, and
I want to find the most promising queries for optimization.
I am currently using the solr
Sharding can be one of the options.
But what is the size of your documents? And which Solr version are you
using?
Regards,
Edwin
On Tue, 20 Nov 2018 at 01:40, Balanathagiri Ayyasamypalanivel <bala.cit...@gmail.com> wrote:
Hi,
We are in the process of live-publishing documents in Solr, and at the same
time we have to maintain search performance.
Total existing docs: 120 million
Expected data for live publishing: 1 million
Every hour, we will get 1M docs to publish live to the hot Solr
collection, can
On 2/15/2018 2:00 AM, Srinivas Kashyap wrote:
> I have implemented 'SortedMapBackedCache' in my SqlEntityProcessor for the
> child entities in data-config.xml. And i'm using the same for full-import
> only. And in the beginning of my implementation, i had written delta-import
> query to index
Srinivas:
Not an answer to your question, but when DIH starts getting this
complicated, I start to seriously think about SolrJ, see:
https://lucidworks.com/2012/02/14/indexing-with-solrj/
In particular, it moves the heavy lifting of acquiring the data from a
Solr node (which I'm assuming also
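For context, a minimal SolrJ indexing sketch along the lines of the linked article; the URL, core name, and field names are placeholders, and it needs a running Solr plus the SolrJ jars to actually execute:

```java
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class IndexExample {
    public static void main(String[] args) throws Exception {
        // Acquire data from the source system here (JDBC, files, ...)
        // and push it to Solr from the client side, instead of via DIH.
        try (HttpSolrClient client =
                 new HttpSolrClient.Builder("http://localhost:8983/solr/mycore").build()) {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "1");
            doc.addField("title_t", "hello");
            client.add(doc);
            client.commit();
        }
    }
}
```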
Hi Erick,
As suggested, I did try a non-HDFS Solr Cloud instance, and its response
looks really better. On the configuration side too, I am mostly using default
configurations, with block.cache.direct.memory.allocation set to false. On
analysis of the HDFS cache, evictions seem to be on the higher
Hi Arun,
It is hard to measure something without affecting it, but we could use debug
results and combine with QTime without debug: if we ignore merging results, it
seems that the majority of time is spent retrieving docs (~500 ms). You should
consider reducing the number of rows if you want better
Hi Emir,
Please find the response without bq parameter and debugQuery set to true.
Also it was noted that Qtime comes down drastically without the debug
parameter to about 700-800.
status: 0
QTime: 3446
("hybrid electric powerplant" "hybrid electric powerplants" "Electric"
"Electrical" "Electricity"
Hi Erick,
Qtime comes down with rows set as 1. Also it was noted that qtime comes down
when debug parameter is not added with the query. It comes to about 900.
Thanks,
Arun
On Tue, 2017-09-26 at 07:43 -0700, sasarun wrote:
> Allocated heap size for the young generation is about 8 GB and the old
> generation is about 24 GB. And GC analysis showed peak
> size utilisation is really low compared to these values.
That does not come as a surprise. Your collections would normally
Hi Arun,
This is not the simplest query either: a dozen phrase queries on several
fields, plus the same query as bq. Can you provide debugQuery info?
I did not look much into debug times and what includes what, but one thing that
is strange to me is that QTime is 4s while the query in debug is
Well, 15 second responses are not what I'd expect either. But two
things (just looked again)
1> note that the time to assemble the debug information is a large
majority of your total time (14 of 15.3 seconds).
2> you're specifying 600 rows which is quite a lot as each one
requires that a 16K
Hi Erick,
Thank you for the quick response. Query time was relatively faster once it
is read from memory. But personally I always felt response time could be far
better. As suggested, We will try and set up in a non HDFS environment and
update on the results.
Thanks,
Arun
Does the query time _stay_ low? Once the data is read from HDFS it
should pretty much stay in memory. So my question is whether, once
Solr warms up you see this kind of query response time.
Have you tried this on a non HDFS system? That would be useful to help
figure out where to look.
And given
Hi All,
I have been using Solr for some time now but mostly in standalone mode. Now
my current project is using Solr 6.5.1 hosted on hadoop. My solrconfig.xml
has the following configuration. In the prod environment the performance of
querying seems really slow. Can anyone help me with a few
Impossible to answer as Shawn says. Or even recommend. For instance,
you say "but once we launch our application all across the world it
may give performance issues."
You haven't defined at all what changes when you "launch our
application all across the world". Increasing your query traffic 10
Thanks, Shawn.
As of now, we don't have any performance issues, We are just working for
the future purpose.
So I was looking for any general architecture which is agreed by many of
Solr experts.
Thanks,
Venkat.
On Thu, May 11, 2017 at 8:19 PM, Shawn Heisey wrote:
Hello Guys,
In the current design we have the below configuration:
*One collection, with one shard, replication factor 4, on 4 nodes.*
As of now it is working fine, but once we launch our application all across
the world it may give performance issues.
To improve the performance, below is our
Already have a Jira issue for next week. I have a script to run prod logs
against a cluster. I’ll be testing a four shard by two replica cluster with 17
million docs and very long queries. We are working on getting the 95th
percentile under one second, so we should exercise the timeAllowed
+Walter test it
Jeff,
How much CPU does the EC2 hypervisor use? I have heard 5%, but that is for a
normal workload, and is mostly consumed during system calls or context switches.
So it is quite understandable that frequent time calls would take a bigger bite
in the AWS cloud compared to bare metal.
It’s presumably not a small degradation - this guy very recently suggested it’s
77% slower:
https://blog.packagecloud.io/eng/2017/03/08/system-calls-are-much-slower-on-ec2/
The other reason that blog post is interesting to me is that his benchmark
utility showed the work of entering the kernel
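One of the tweaks that post discusses is the kernel clocksource: on Xen-based EC2 instances the default `xen` clocksource forces time calls through a slow path. A sketch of checking and changing it (the sysfs paths are standard Linux; whether `tsc` is available and safe depends on the instance, so verify before touching a production host):

```shell
# Which clocksource is the kernel using? (xen is the slow path on EC2)
cat /sys/devices/system/clocksource/clocksource0/current_clocksource

# Check that tsc is listed as available before switching
cat /sys/devices/system/clocksource/clocksource0/available_clocksource
echo tsc | sudo tee /sys/devices/system/clocksource/clocksource0/current_clocksource
```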
I remember seeing some performance impact (even when not using it) and it
was attributed to the calls to System.nanoTime. See SOLR-7875 and SOLR-7876
(fixed for 5.3 and 5.4). Those two Jiras fix the impact when timeAllowed is
not used, but I don't know if there were more changes to improve the
Hmm, has anyone measured the overhead of timeAllowed? We use it all the time.
If nobody has, I’ll run a benchmark with and without it.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
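For anyone following along, timeAllowed is a standard per-request parameter, in milliseconds; a sketch (host and collection are placeholders):

```shell
# If the time limit expires, Solr returns partial results and sets
# partialResults=true in the response header
curl 'http://localhost:8983/solr/mycoll/select?q=*:*&timeAllowed=500'
```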
> On May 2, 2017, at 9:52 AM, Chris Hostetter
: I specify a timeout on all queries,
Ah -- ok, yeah -- you mean using "timeAllowed" correct?
If the root issue you were seeing is in fact clocksource related,
then using timeAllowed would probably be a significant compounding
factor there since it would involve a lot of time checks in a
Yes, that’s the Xenial I tried. Ubuntu 16.04.2 LTS.
On 5/1/17, 7:22 PM, "Will Martin" wrote:
Ubuntu 16.04 LTS - Xenial (HVM)
Is this your Xenial version?
On 5/1/2017 6:37 PM, Jeff Wartes wrote:
> I tried a few variations of
I started with the same three-node 15-shard configuration I’d been used to, in
an RF1 cluster. (the index is almost 700G so this takes three r4.8xlarge’s if I
want to be entirely memory-resident) I eventually dropped down to a 1/3rd size
index on a single node (so 5 shards, 100M docs each) so I
Might want to measure the single CPU performance of your EC2 instance. The last
time I checked, my MacBook was twice as fast as the EC2 instance I was using.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On May 1, 2017, at 6:24 PM, Chris Hostetter
: tldr: Recently, I tried moving an existing solrcloud configuration from
: a local datacenter to EC2. Performance was roughly 1/10th what I’d
: expected, until I applied a bunch of linux tweaks.
How many total nodes in your cluster? How many of them running ZooKeeper?
Did you observe the
I tried a few variations of various things before we found and tried that
linux/EC2 tuning page, including:
- EC2 instance type: r4, c4, and i3
- Ubuntu version: Xenial and Trusty
- EBS vs local storage
- Stock openjdk vs Zulu openjdk (Recent java8 in both cases - I’m aware of
the issues
It's also very important to consider the type of EC2 instance you are
using...
We settled on the R4.2XL... The R series is labeled "High-Memory"
Which instance type did you end up using?
On Mon, May 1, 2017 at 8:22 AM, Shawn Heisey wrote:
On 4/28/2017 10:09 AM, Jeff Wartes wrote:
> tldr: Recently, I tried moving an existing solrcloud configuration from a
> local datacenter to EC2. Performance was roughly 1/10th what I’d expected,
> until I applied a bunch of linux tweaks.
How very strange. I knew virtualization would have
I’d like to think I helped a little with the metrics upgrade that got released
in 6.4, so I was already watching that and I’m aware of the resulting
performance issue.
This was 5.4 though, patched with https://github.com/whitepages/SOLR-4449 - an
index we’ve been running for some time now.
We use Solr 6.2 in an EC2 instance with CentOS 6.2 and we don't see any
difference in performance between EC2 and the local environment.
Well, 6.4.0 had a pretty severe performance issue so if you were using
that release you might see this, 6.4.2 is the most recent 6.4 release.
But I have no clue how changing linux settings would alter that and I
sure can't square that issue with you having such different
performance between local
tldr: Recently, I tried moving an existing solrcloud configuration from a local
datacenter to EC2. Performance was roughly 1/10th what I’d expected, until I
applied a bunch of linux tweaks.
This should’ve been a straight port: one datacenter server -> one EC2 node.
Solr 5.4, Solrcloud, Ubuntu
> Also we will try to decouple Tika from Solr.
+1
-Original Message-
From: tstusr [mailto:ulfrhe...@gmail.com]
Sent: Friday, March 31, 2017 4:31 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr performance issue on indexing
Hi, thanks for the feedback.
Yes, it is about OOM, ind
>
> By the way, will making it available with SolrCloud improve performance? Or
> will there be no perceptible improvement?
> In a docker container with 4 GB of JVM memory and ~50 GB of physical memory
> (reported through the dashboard) we are using a single instance.
>
> I don't think it is normal behaviour that the handler crashes. So, what are
> some general tips about improving performance for this scenario?
Thanks EricK
Regards,
Prateek Jain
-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: 21 November 2016 04:32 PM
To: solr-user <solr-user@lucene.apache.org>
Subject: Re: solr | performance warning
_when_ are you seeing this? I see this on startup upon occasion, and I _think_
there's a JIRA about startup opening more than one searcher on startup.
If it _is_ on startup, you can simply ignore it.
If it's after the system is up and running, then you're probably committing too
frequently. "Too
Hi All,
I am observing following error in logs, any clues about this:
2016-11-06T23:15:53.066069+00:00@solr@@ org.apache.solr.core.SolrCore:1650 -
[my_custom_core] PERFORMANCE WARNING: Overlapping onDeckSearchers=2
A quick web search suggests that it could be a case of too-frequent commits. I
will perform better.
Thanks,
Avner
What is the solr version and shard config? Standalone? Multiple cores?
Spread over RAID ?
On Mar 9, 2016 9:00 AM, "Avner Levy" wrote:
I have a machine with 16 real cores (32 with HT enabled).
I'm running on it a Solr server and trying to reach maximum performance for
indexing and queries (indexing 20k documents/sec by a number of threads).
I've read in multiple places that in some scenarios / products disabling the
> Use SPM (sematext.com/spm) to collect Solr and host metrics and see where
> the issue is.
>
> Thanks,
> Emir
>
> --
> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> Solr & Elasticsearch Support * http://sematext.com/
Hi all,
I have a problem with my Solr performance and hardware usage (RAM, CPU, ...).
I have a lot of documents indexed, about 1000 docs in Solr, and every doc has
about 8 fields on average.
Each field has about 60 chars.
I set my fields as stored="false" except o
On Wed, 2015-08-26 at 15:47 +0800, Zheng Lin Edwin Yeo wrote:
Now I've tried to increase the carrot.fragSize to 75 and
carrot.summarySnippets to 2, and set the carrot.produceSummary to
true. With this setting, I'm mostly able to get the cluster results
back within 2 to 3 seconds when I set
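Put together, a clustering request with the parameters mentioned in this thread might look like the following; the host, collection, and handler path are placeholders for however clustering is wired up locally:

```shell
curl 'http://localhost:8983/solr/mycoll/clustering' \
  --data-urlencode 'q=some query' \
  --data-urlencode 'rows=100' \
  --data-urlencode 'carrot.produceSummary=true' \
  --data-urlencode 'carrot.fragSize=75' \
  --data-urlencode 'carrot.summarySnippets=2'
```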
Hi Toke,
Thank you for the link.
I'm using Solr 5.2.1, but I think the bundled carrot2 will be a slightly older
version, as I'm using the latest carrot2-workbench-3.10.3, which was only
released recently. I've changed all the settings like fragSize and
desiredClusterCountBase to be the same on both
Thanks for your recommendation Toke.
Will try to ask in the carrot forum.
Regards,
Edwin
On 26 August 2015 at 18:45, Toke Eskildsen t...@statsbiblioteket.dk wrote:
Hi Toke,
Thank you for your reply.
I'm currently trying out the Carrot2 Workbench, getting it to call Solr
to see how they did the clustering. Although it still takes some time to do
the clustering, the results of the cluster are much better than mine. I
think it's probably due to the
I honestly suspect your performance issue is down to the number of terms
you are passing into the clustering algorithm, not to memory usage as
such. If you have *huge* documents and cluster across them, performance
will be slower, by definition.
Clustering is usually done offline, for example on
Thank you Upayavira for your reply.
Would like to confirm: when I set rows=100, does it mean that it only builds
the cluster based on the first 100 records that are returned by the search,
and if I have 1000 records that matches the search, all the remaining 900
records will not be considered for
And be aware that I'm sure the more terms in your documents, the slower
clustering will be. So it isn't just the number of docs, the size of
them counts in this instance.
A simple test would be to build an index with just the first 1000 terms
of your clustering fields, and see if that makes a
Zheng Lin Edwin Yeo edwinye...@gmail.com wrote:
However, I find that clustering is exceedingly slow after I index this 1GB of
data. It took almost 30 seconds to return the cluster results when I set it
to cluster the top 1000 records, and still take more than 3 seconds when I
set it to cluster
Are you by any chance doing store=true on the fields you want to search?
If so, you may want to switch to just index=true. Of course, they will
then not come back in the results, but do you really want to sling
huge content fields around?
The other option is to do lazyLoading=true and not
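The lazy-loading option mentioned here lives in the <query> section of solrconfig.xml; a sketch:

```xml
<query>
  <!-- Load stored fields only when they are actually requested,
       instead of eagerly with every document -->
  <enableLazyFieldLoading>true</enableLazyFieldLoading>
</query>
```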
You're confusing clustering with searching. Sure, Solr can index
and search lots of data, but clustering is essentially finding ad-hoc
similarities between arbitrary documents. It must take each of
the documents in the result size you specify from your result
set and try to find commonalities.
For perf
On Sat, Aug 22, 2015 at 9:31 PM, Zheng Lin Edwin Yeo edwinye...@gmail.com wrote:
Hi,
I'm using Solr 5.2.1, and I've indexed about 1GB of data into Solr.
However, I find that clustering is exceedingly slow after I index this 1GB of
data. It took almost 30 seconds to return the
Hi Shawn and Toke,
I only have 520 docs in my data, but each of the documents is quite big in
size; in Solr it is using 221MB. So when I set it to read from the top
1000 rows, it should just be reading all the 520 docs that are indexed?
Regards,
Edwin
On 23 August 2015 at 22:52, Shawn Heisey
On 8/22/2015 10:28 PM, Zheng Lin Edwin Yeo wrote:
Hi Shawn,
Yes, I've increased the heap size to 4GB already, and I'm using a machine
with 32GB RAM.
Is it recommended to further increase the heap size to like 8GB or 16GB?
Probably not, but I know nothing about your data. How many Solr
We use 8GB to 10GB heaps for indexes of that size all the time.
Bill Bell
Sent from mobile
On Aug 23, 2015, at 8:52 AM, Shawn Heisey apa...@elyograg.org wrote:
Hi Alexandre,
I've tried to use just index=true, and the speed is still the same, not
any faster. If I set store=false, no results come back with
the clustering. Is this because the index is not stored, and the clustering
requires indexes that are stored?
I've also increased my
Yes, I'm using store=true.
<field name="content" type="text_general" indexed="true" stored="true"
omitNorms="true" termVectors="true"/>
However, this field needs to be stored as my program requires this field to
be returned during normal searching. I tried the lazyLoading=true, but it's
not working.
Will you
Hi Shawn,
Yes, I've increased the heap size to 4GB already, and I'm using a machine
with 32GB RAM.
Is it recommended to further increase the heap size to like 8GB or 16GB?
Regards,
Edwin
On 23 Aug 2015 10:23, Shawn Heisey apa...@elyograg.org wrote:
On 8/22/2015 7:31 PM, Zheng Lin Edwin Yeo
Hi,
I'm using Solr 5.2.1, and I've indexed about 1GB of data into Solr.
However, I find that clustering is exceedingly slow after I index this 1GB of
data. It took almost 30 seconds to return the cluster results when I set it
to cluster the top 1000 records, and still take more than 3 seconds when
1 - 100 of 464 matches