Re: SolrCloud Performance Issue

2013-10-21 Thread Erick Erickson
Shamik:

You're right, the use of NOW shouldn't be making that much of a difference
between versions. FYI, though, here's a way to use NOW and re-use fq
clauses:

http://searchhub.org/2012/02/23/date-math-now-and-filter-queries/
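
As a rough sketch of the idea in that article: rounding NOW with date math makes the fq resolve to the same range for a whole day, so its filterCache entry can be re-used. The filter below is hypothetical and not taken from this thread; only the PublishDate field name comes from the posted handler.

```xml
<!-- Hypothetical filter, e.g. inside a request handler's <lst name="appends"> -->

<!-- Not re-usable: NOW resolves to the current millisecond on every request:
     <str name="fq">PublishDate:[NOW-7DAYS TO NOW]</str> -->

<!-- Re-usable: NOW/DAY rounds down to midnight, so the range is stable all day -->
<str name="fq">PublishDate:[NOW/DAY-7DAYS TO NOW/DAY+1DAY]</str>
```

The trade-off is granularity: the window moves once a day instead of continuously.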

It may well be this setting:

<autoSoftCommit>
  <maxTime>1000</maxTime>
</autoSoftCommit>

Every second (assuming you're indexing), you're throwing away all your
top-level caches and executing any autowarm queries etc. And if you _don't_
have any autowarming queries, you may not be filling the caches at all, and
filling them is an expensive process. Try lengthening that out to, say, a
minute (60000) or even longer and see if that makes a difference. If that's
the culprit, you at least have a place to start.
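
A minimal sketch of that change, assuming the setting in question is the soft commit interval in the <updateHandler> section of solrconfig.xml and that a one-minute delay before new documents become searchable is acceptable:

```xml
<!-- solrconfig.xml, inside <updateHandler> (assumed location) -->
<autoSoftCommit>
  <!-- milliseconds: open a new searcher at most once a minute
       instead of once a second, so caches live long enough to be useful -->
  <maxTime>60000</maxTime>
</autoSoftCommit>
```

If query times drop after this change, the one-second interval was at least part of the problem.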

If that's not it, it's also possible you're seeing decompression overhead.

How many documents are you returning and how big are they? There are some
anecdotal reports that the default stored field decompression, for either a
large number of docs or very large docs, may be playing a role here. Try
setting fl=id (don't return any stored fields). If that is faster, this
might be your problem.

The queryResultCache hit ratio is often not very high. It's mostly used
for paging, so if your users aren't hitting the "next" page you may not get
many hits.
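
For reference, autowarming is configured per cache in solrconfig.xml. The fragment below is only a sketch: the cache classes are the stock Solr ones, but every size and autowarmCount value is an illustrative assumption, not a recommendation for this index.

```xml
<!-- solrconfig.xml, inside <query>; all numbers are illustrative assumptions -->

<!-- autowarmCount: how many entries from the old searcher's cache are
     re-executed against the new searcher after each commit -->
<filterCache class="solr.FastLRUCache"
             size="512" initialSize="512" autowarmCount="128"/>

<queryResultCache class="solr.LRUCache"
                  size="512" initialSize="512" autowarmCount="32"/>
```

Autowarming only pays off if the searcher lives long enough to use the warmed entries, so it goes hand in hand with a longer commit interval.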

Best,
Erick




Re: SolrCloud Performance Issue

2013-10-18 Thread Otis Gospodnetic
Hi,

What happens if you have just 1 shard - no distributed search, like
before? SPM for Solr or any other monitoring tool that captures OS and
Solr metrics should help you find the source of the problem faster.
Is disk IO the same? utilization of caches? JVM version, heap, etc.?
CPU usage? network?  I'd look at each of these things side by side and
look for big differences.

Otis
--
Solr & ElasticSearch Support -- http://sematext.com/
SOLR Performance Monitoring -- http://sematext.com/spm





Re: SolrCloud Performance Issue

2013-10-17 Thread shamik
I tried commenting out NOW in bq, but it didn't make any difference in
performance. I do see a minor entry in the queryfiltercache rate, which is a
meager 0.02.

I'm really struggling to figure out the bottleneck. Are there any known pain
points I should be checking?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrCloud-Performance-Issue-tp4095971p4096277.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: SolrCloud Performance Issue

2013-10-17 Thread shamik
Thanks Primoz, I was suspecting that too. But then, it's hard to imagine that
the query cache alone is responsible for the big performance hit. The same
setting applies to the old configuration, and it works pretty well even with
the low query cache hit rate.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrCloud-Performance-Issue-tp4095971p4096123.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: SolrCloud Performance Issue

2013-10-16 Thread primoz . skale
The query result cache hit rate might be low due to using NOW in bf. NOW is
always translated to the current time, and that of course changes from ms to
ms... :)
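
One common way around this, sketched here under the assumption that the recip(...) boost from the posted handler is a bf parameter in the handler defaults, is to round NOW so that it resolves to the same value for a longer period:

```xml
<!-- inside the request handler's <lst name="defaults"> (assumed placement) -->
<!-- NOW/HOUR rounds down to the start of the current hour, so the parsed
     function is identical for every request within that hour -->
<str name="bf">recip(ms(NOW/HOUR,PublishDate),3.16e-11,1,1)^2.0</str>
```

With a scale of 3.16e-11 per millisecond, one hour of rounding changes the boost only negligibly.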

Primoz






SolrCloud Performance Issue

2013-10-16 Thread Shamik Bandopadhyay
Hi,

  I'm in the process of transitioning to SolrCloud from a conventional
Master-Slave model. I'm using Solr 4.4 and have set up 2 shards with 1
replica each. I have a 3-node ZooKeeper ensemble. All the nodes are running
on AWS EC2 instances. Shards are on m1.xlarge and share a ZooKeeper instance
(mounted on a separate volume). 6 GB of memory is allocated to each Solr
instance.

I have around 10 million documents in the index. With the previous standalone
model, queries averaged around 100 ms. The SolrCloud query response times have
been abysmal so far: over 1000 ms, often reaching 2000 ms. I expected some
increase due to additional servers, network latency, etc., but this difference
is really baffling. The hardware is similar in both cases, except that a
couple of the SolrCloud nodes are also hosting ZooKeeper. m1.xlarge I/O
performance is high, so that shouldn't be a bottleneck either.

The other difference from the old setup is that I'm using the new
CloudSolrServer class, which holds references to the 3 ZooKeeper nodes for
load balancing. But I don't think it has any major impact, as queries
executed from the Solr admin query panel confirm the slowness.

Here is some of my configuration:


3
false



1000



1024










true

200

400





line
xref
draw




line
draw
linelanguage:english
lineSource2:documentation
lineSource2:CloudHelp
drawlanguage:english
drawSource2:documentation
drawSource2:CloudHelp



2


The custom request handler:



explicit
0.01
velocity
browse
text/html;charset=UTF-8
layout
cloudhelp

edismax
*:*
15
id,url,Description,Source2,text,filetype,title,LastUpdateDate,PublishDate,ViewCount,TotalMessageCount,Solution,LastPostAuthor,Author,Duration,AuthorUrl,ThumbnailUrl,TopicId,score
text^1.5 title^2 IndexTerm^.9
keywords^1.2 ADSKCommandSrch^2 ADSKContextId^1
Source2:CloudHelp^3
Source2:youtube^0.85
recip(ms(NOW,PublishDate),3.16e-11,1,1)^2.0
text


on
1
100
language
Source2
DocumentationBook
ADSKProductDisplay
audience


true
text title
250
ShortDesc


true
default
true
false
false
1


spellcheck



One thing I've noticed is that the queryresultcache hit rate is really low,
and I'm not sure our queries are always that unique. I'm using edismax and
there's a recip(ms(NOW,PublishDate),3.16e-11,1,1)^2.0 boost in the handler.
Could this contribute?
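
For reference, a worked reading of that function, assuming the standard Solr definition of recip(x,m,a,b) as a/(m*x+b). The scale 3.16e-11 is roughly one over the number of milliseconds in a year, so the value falls from 1 for a document published right now to about 0.5 for one published a year ago:

```latex
\[
\mathrm{recip}(x, m, a, b) = \frac{a}{m\,x + b},
\qquad x = \mathrm{ms}(\mathrm{NOW}, \mathrm{PublishDate})
\]
% m = 3.16e-11, a = b = 1; one year is about 3.16e10 ms
\[
x = 0:\ \frac{1}{0 + 1} = 1,
\qquad
x = 3.16 \times 10^{10}:\ \frac{1}{0.99856 + 1} \approx 0.50
\]
```

The trailing ^2.0 scales how much this value contributes to the score. Because x depends on NOW, the function's input differs on every request unless NOW is rounded.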

Sorry about the long post, but I'm struggling to nail down the issue here,
especially when queries are running fine in a master-slave environment with
similar hardware and network.

Any pointers will be highly appreciated.

Regards,
Shamik

