Re: Solr performance issue on querying --> Solr 6.5.1

2017-09-30 Thread sasarun
Hi Erick, 

As suggested, I tried a non-HDFS SolrCloud instance and its response looks
much better. On the configuration side too, I am mostly using the default
configuration, with solr.hdfs.blockcache.direct.memory.allocation set to false.
On analysis of the HDFS cache, evictions seem to be on the higher side.
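
For context on the eviction numbers, here is a rough sizing sketch in Python. It assumes an 8 KiB cache block, which is the figure given in the Solr Reference Guide for the HDFS block cache and is not stated anywhere in this thread. The slab count, blocks per bank, and collection sizes are the ones posted in this thread.

```python
# Rough HDFS block cache sizing, using values posted in this thread.
# ASSUMPTION: each cache block is 8 KiB (per the Solr Reference Guide);
# this figure does not appear in the thread itself.
BLOCK_SIZE_BYTES = 8 * 1024


def block_cache_bytes(slab_count: int, blocks_per_bank: int) -> int:
    """Total block cache capacity in bytes."""
    return slab_count * blocks_per_bank * BLOCK_SIZE_BYTES


# solr.hdfs.blockcache.slab.count=1, solr.hdfs.blockcache.blocksperbank=16384
cache_mb = block_cache_bytes(1, 16384) / (1024 * 1024)

# Index sizes reported for the two slow collections, in MB.
collections_mb = {"Collection 4": 1020.56, "Collection 5": 607.26}

for name, size_mb in collections_mb.items():
    ratio = size_mb / cache_mb
    print(f"{name}: {size_mb:.2f} MB index vs {cache_mb:.0f} MB cache ({ratio:.1f}x)")
```

Under that assumption a single slab is much smaller than either of the two larger indexes, which would be consistent with a high eviction count. Raising solr.hdfs.blockcache.slab.count is one thing that could be tried.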

Thanks, 
Arun



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Solr performance issue on querying --> Solr 6.5.1

2017-09-27 Thread Emir Arnautović
Hi Arun,
It is hard to measure something without affecting it, but we can combine the
debug results with the QTime measured without debug. If we ignore the merging
of results, it seems that the majority of the time is spent retrieving docs
(~500ms). You should consider reducing the number of rows if you want a better
response time (you can ask for rows=0 to see the time without document
retrieval). Also, as Erick suggested, reducing the number of shards (to 1 if
you do not plan to add many more docs) will trim some of the overhead of
merging results.

Thanks,
Emir

I noticed that you removed bq - is time with bq acceptable as well?
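
The rows=0 comparison suggested above could be set up along these lines. This is a minimal sketch: the host, port, and collection name are hypothetical placeholders, and the query is shortened from the one in this thread. It only builds the request URLs; fetching them and reading QTime from each response is left out.

```python
from urllib.parse import urlencode

# HYPOTHETICAL endpoint: host, port and collection name are placeholders.
BASE_URL = "http://localhost:8983/solr/my_collection/select"


def select_url(query: str, rows: int, debug: bool = False) -> str:
    """Build a /select URL for an edismax query with the given row count."""
    params = {"q": query, "defType": "edismax", "rows": rows, "wt": "json"}
    if debug:
        params["debugQuery"] = "true"
    return f"{BASE_URL}?{urlencode(params)}"


query = '"hybrid electric powerplant" "fuel economy"'

# rows=0 returns no documents; rows=600 adds document retrieval.
for rows in (0, 600):
    print(select_url(query, rows))
```

Comparing the QTime of the two requests, both run without debug, gives a rough split between search time and document retrieval time.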

Re: Solr performance issue on querying --> Solr 6.5.1

2017-09-27 Thread sasarun
Hi Emir, 

Please find below the response without the bq parameter and with debugQuery
set to true. Note also that QTime comes down drastically, to about 700-800 ms,
without the debug parameter.


true
0
3446


("hybrid electric powerplant" "hybrid electric powerplants" "Electric"
"Electrical" "Electricity" "Engine" "fuel economy" "fuel efficiency" "Hybrid
Electric Propulsion" "Power Systems" "Powerplant" "Propulsion" "hybrid"
"hybrid electric" "electric powerplant")

edismax
on

host
title
url
customContent
contentSpecificSearch


id
contentOntologyTagsCount

0
OR
3985d7e2-3e54-48d8-8336-229e85f5d9de
600
true


...



Shard request tracking (rid: solr-prd-cluster-m-GooglePatent_shard4_replica2-1506504238282-20)

QTime  ElapsedTime  RequestPurpose        NumFound  Response
35     159          GET_TOP_IDS           41294     ...
29     165          GET_TOP_IDS           40980     ...
31     200          GET_TOP_IDS           41006     ...
43     208          GET_TOP_IDS           41040     ...
181    466          GET_TOP_IDS           41138     ...

1518   1523         GET_FIELDS,GET_DEBUG  110       ...
1562   1573         GET_FIELDS,GET_DEBUG  115       ...
1793   1800         GET_FIELDS,GET_DEBUG  120       ...
2153   2161         GET_FIELDS,GET_DEBUG  125       ...
2957   2970         GET_FIELDS,GET_DEBUG  130       ...




Timing (ms):
time:     10302.0
prepare:  2.0      (query: 2.0; all other components: 0.0)
process:  10288.0  (query: 661.0; debug: 9627.0; all other components: 0.0)




("hybrid electric powerplant" "hybrid electric powerplants" "Electric"
"Electrical" "Electricity" "Engine" "fuel economy" "fuel efficiency" "Hybrid
Electric Propulsion" "Power Systems" "Powerplant" "Propulsion" "hybrid"
"hybrid electric" "electric powerplant")


("hybrid electric powerplant" "hybrid electric powerplants" "Electric"
"Electrical" "Electricity" "Engine" "fuel economy" "fuel efficiency" "Hybrid
Electric Propulsion" "Power Systems" "Powerplant" "Propulsion" "hybrid"
"hybrid electric" "electric powerplant")


(+(DisjunctionMaxQuery((host:hybrid electric powerplant |
contentSpecificSearch:"hybrid electric powerplant" | customContent:"hybrid
electric powerplant" | title:hybrid electric powerplant | url:hybrid
electric powerplant)) DisjunctionMaxQuery((host:hybrid electric powerplants
| contentSpecificSearch:"hybrid electric powerplants" |
customContent:"hybrid electric powerplants" | title:hybrid electric
powerplants | url:hybrid electric powerplants))
DisjunctionMaxQuery((host:Electric | contentSpecificSearch:electric |
customContent:electric | title:Electric | url:Electric))
DisjunctionMaxQuery((host:Electrical | contentSpecificSearch:electrical |
customContent:electrical | title:Electrical | url:Electrical))
DisjunctionMaxQuery((host:Electricity | contentSpecificSearch:electricity |
customContent:electricity | title:Electricity | url:Electricity))
DisjunctionMaxQuery((host:Engine | contentSpecificSearch:engine |
customContent:engine | title:Engine | url:Engine))
DisjunctionMaxQuery((host:fuel economy | contentSpecificSearch:"fuel
economy" | customContent:"fuel economy" | title:fuel economy | url:fuel
economy)) DisjunctionMaxQuery((host:fuel efficiency |
contentSpecificSearch:"fuel efficiency" | customContent:"fuel efficiency" |
title:fuel efficiency | url:fuel efficiency))
DisjunctionMaxQuery((host:Hybrid Electric Propulsion |
contentSpecificSearch:"hybrid electric propulsion" | customContent:"hybrid
electric propulsion" | title:Hybrid Electric Propulsion | url:Hybrid
Electric Propulsion)) DisjunctionMaxQuery((host:Power Systems |
contentSpecificSearch:"power systems" | customContent:"power systems" |
title:Power Systems | url:Power Systems))
DisjunctionMaxQuery((host:Powerplant | contentSpecificSearch:powerplant |
customContent:powerplant | title:Powerplant | url:Powerplant))
DisjunctionMaxQuery((host:Propulsion | contentSpecificSearch:propulsion |
customContent:propulsion | title:Propulsion | url:Propulsion))
DisjunctionMaxQuery((host:hybrid | contentSpecificSearch:hybrid |
customContent:hybrid | title:hybrid | url:hybrid))
DisjunctionMaxQuery((host:hybrid electric | contentSpecificSearch:"hybrid
electric" | customContent:"hybrid electric" | title:hybrid electric |
url:hybrid electric)) DisjunctionMaxQuery((host:electric powerplant |
contentSpecificSearch:"electric powerplant" | customContent:"electric
powerplant" | title:electric powerplant | url:electric
powerplant/no_coord


+((host:hybrid electric powerplant | contentSpecificSearch:"hybrid electric
powerplant" | customContent:"hybrid electric powerplant" | title:hybrid
electric powerplant | url:hybrid electric powerplant) (host:hybrid electric
powerplants | contentSpecificSearch:"hybrid electric powerplants" |
customContent:"hybrid electric powerplants" | title:hybrid electric
powerplants | url:hybrid electric powerplants) (host:Electric |
contentSpecificSearch:electric | customContent:electric | title:Electric |
url:Electric) (host:Electrical | contentSpecificSearch:electrical |
customContent:electrical | title:Electrical | url:Electrical)
(host:Electricity | contentSpecificSearch:electricity |
customContent:electricity | title:Electricity | url:Electricity)
(host:Engine | contentSpecificSearch:engine | customContent:engine |
title:Engine | url:Engine) (host:fuel 

Re: Solr performance issue on querying --> Solr 6.5.1

2017-09-27 Thread sasarun
Hi Erick, 

QTime comes down with rows set to 1. It also comes down, to about 900 ms, when
the debug parameter is not added to the query.

Thanks, 
Arun 





Re: Solr performance issue on querying --> Solr 6.5.1

2017-09-27 Thread Toke Eskildsen
On Tue, 2017-09-26 at 07:43 -0700, sasarun wrote:
> Allocated heap size for young generation is about 8 gb and old 
> generation is about 24 gb. And gc analysis showed peak
> size utilisation is really low compared to these values.

That does not come as a surprise. Your collections would normally be
considered small, if not tiny, looking only at their size measured in
bytes. Again, if you expect them to grow significantly (more than 10x),
your allocation might make sense. If you do not expect such a growth in
the near future, you will be better off with a much smaller heap: The
peak heap utilization that you have logged (or twice that to err on the
cautious side) seems a good starting point.

And whatever you do, don't set Xmx to 32GB. Use <31GB or significantly
more than 32GB:
https://blog.codecentric.de/en/2014/02/35gb-heap-less-32gb-java-jvm-memory-oddities/
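
The heap advice above can be turned into a small check. This is a sketch only: the 3 GB peak in the example is a made-up number, and the 40 GB cut-off used for "significantly more than 32GB" is an assumption, since the post gives no exact figure.

```python
def starting_heap_gb(peak_used_gb: float, safety_factor: float = 2.0) -> float:
    """Starting -Xmx suggestion: observed peak heap use times a safety factor."""
    return peak_used_gb * safety_factor


def in_awkward_zone(xmx_gb: float) -> bool:
    """True if xmx_gb falls in the range the post above advises against.

    ASSUMPTION: "significantly more than 32GB" is taken here as >= 40 GB;
    the thread does not give an exact figure.
    """
    return 31.0 <= xmx_gb < 40.0


# Current allocation from the thread: 8 GB young + 24 GB old generation.
current_xmx_gb = 8 + 24
print("current heap in awkward zone:", in_awkward_zone(current_xmx_gb))

# HYPOTHETICAL peak usage of 3 GB, for illustration only.
suggested = starting_heap_gb(3.0)
print("suggested starting heap (GB):", suggested)
print("suggested heap in awkward zone:", in_awkward_zone(suggested))
```

The factor of two follows the "twice that to err on the cautious side" suggestion; the real peak should come from the GC logs.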


Are you indexing while you search? If so, you need to set auto-warm or
state a few explicit warmup-queries. If not, your measuring will not be
representative as it will be on first-searches, which are always slower
than warmed-searches.


- Toke Eskildsen, Royal Danish Library



Re: Solr performance issue on querying --> Solr 6.5.1

2017-09-27 Thread Emir Arnautović
Hi Arun,
This is not the simplest query either: a dozen phrase queries on several
fields, plus the same query as bq. Can you provide the debugQuery info?
I did not look much into the debug times and what includes what, but one thing
that seems strange to me is that QTime is 4s while the query time in debug is
1.3s. Can you try running without bq? Can you include the boost factors in the
main query?

Thanks,
Emir




Re: Solr performance issue on querying --> Solr 6.5.1

2017-09-26 Thread Erick Erickson
Well, 15 second responses are not what I'd expect either. But two
things (just looked again)

1> note that the time to assemble the debug information is a large
majority of your total time (14 of 15.3 seconds).
2> you're specifying 600 rows which is quite a lot as each one
requires that a 16K block of data be read from disk and decompressed
to assemble the "fl" list.

so one quick test would be to set rows=1 or something. All that said,
the QTime value returned does _not_ include <1> or <2> above and even
4 seconds seems excessive.
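
As a worked check of the two points above, using the timing values from the original post (a sketch; the 16K-per-document figure is taken from this message as given):

```python
# Timing values (ms) from the debug=timing output in the original post.
total_ms = 15374.0
process_query_ms = 1313.0
process_debug_ms = 14048.0

debug_share = process_debug_ms / total_ms
query_share = process_query_ms / total_ms
print(f"debug: {debug_share:.1%} of total, query: {query_share:.1%} of total")

# Stored-field reads for rows=600, taking the 16K-per-document figure
# from the message above as given.
rows = 600
block_kb = 16
print(f"approx. data read for fl: {rows * block_kb} KB")
```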

Best,
Erick



Re: Solr performance issue on querying --> Solr 6.5.1

2017-09-26 Thread sasarun
Hi Erick, 

Thank you for the quick response. Query time was relatively faster once the
data was read from memory, but I have always felt the response time could be
far better. As suggested, we will try setting up a non-HDFS environment and
update you with the results.

Thanks, 
Arun





Re: Solr performance issue on querying --> Solr 6.5.1

2017-09-26 Thread Erick Erickson
Does the query time _stay_ low? Once the data is read from HDFS it
should pretty much stay in memory. So my question is whether, once
Solr warms up you see this kind of query response time.

Have you tried this on a non HDFS system? That would be useful to help
figure out where to look.

And given the sizes of your collections, unless you expect them to get
much larger, there's no reason to shard any of them. Sharding should
only really be used when the collections are too big for a single
shard as distributed searches inevitably have increased overhead. I
expect _at least_ 20M documents/shard, and have seen 200M docs/shard.
YMMV of course.
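
The rule of thumb above can be written as a quick calculation. This is a sketch: the 20M default is the "at least 20M documents/shard" figure from this message, and the document counts in the example are hypothetical.

```python
import math


def shards_needed(total_docs: int, docs_per_shard: int = 20_000_000) -> int:
    """Number of shards needed if each shard holds up to docs_per_shard documents."""
    return max(1, math.ceil(total_docs / docs_per_shard))


# HYPOTHETICAL collection sizes, for illustration only.
for total_docs in (1_000_000, 20_000_000, 50_000_000):
    print(total_docs, "docs ->", shards_needed(total_docs), "shard(s)")
```

As the message says, the real per-shard limit varies with document size, query load, and hardware, so the default here is only a starting point.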

Best,
Erick



Solr performance issue on querying --> Solr 6.5.1

2017-09-26 Thread sasarun
Hi All,
I have been using Solr for some time now, but mostly in standalone mode. My
current project uses Solr 6.5.1 hosted on Hadoop. My solrconfig.xml has the
following configuration. In the prod environment, query performance seems to
be really slow. Can anyone give me a few pointers on how to improve it?


${solr.hdfs.home:}
${solr.hdfs.blockcache.enabled:true}
${solr.hdfs.blockcache.slab.count:1}
${solr.hdfs.blockcache.direct.memory.allocation:false}
${solr.hdfs.blockcache.blocksperbank:16384}
${solr.hdfs.blockcache.read.enabled:true}
${solr.hdfs.blockcache.write.enabled:false}
${solr.hdfs.nrtcachingdirectory.enable:true}
${solr.hdfs.nrtcachingdirectory.maxmergesizemb:16}
${solr.hdfs.nrtcachingdirectory.maxcachedmb:192}

hdfs
It has 6 collections of the following sizes:
Collection 1 --> 6.41 MB
Collection 2 --> 634.51 KB
Collection 3 --> 4.59 MB
Collection 4 --> 1,020.56 MB
Collection 5 --> 607.26 MB
Collection 6 --> 102.4 KB
Each collection has 5 shards. The allocated heap size is about 8 GB for the
young generation and about 24 GB for the old generation, and GC analysis
showed that peak utilisation is really low compared to these values.
But querying Collection 4 and Collection 5 gives really slow responses even
though we are not using any complex queries. The output of debug queries run
with debug=timing is given below for reference. Can anyone suggest a way to
improve the performance?

Response to query


true
0
3962


("hybrid electric powerplant" "hybrid electric powerplants" "Electric"
"Electrical" "Electricity" "Engine" "fuel economy" "fuel efficiency" "Hybrid
Electric Propulsion" "Power Systems" "Powerplant" "Propulsion" "hybrid"
"hybrid electric" "electric powerplant")

edismax
true
on

host
title
url
customContent
contentSpecificSearch


id
contentTagsCount

0
OR
OR
3985d7e2-3e54-48d8-8336-229e85f5d9de
600

("hybrid electric powerplant"^100.0 "hybrid electric powerplants"^100.0
"Electric"^50.0 "Electrical"^50.0 "Electricity"^50.0 "Engine"^50.0 "fuel
economy"^50.0 "fuel efficiency"^50.0 "Hybrid Electric Propulsion"^50.0
"Power Systems"^50.0 "Powerplant"^50.0 "Propulsion"^50.0 "hybrid"^15.0
"hybrid electric"^15.0 "electric powerplant"^15.0)





Timing (ms):
time:     15374.0
prepare:  2.0      (query: 2.0; all other components: 0.0)
process:  15363.0  (query: 1313.0; debug: 14048.0; all other components: 0.0)





Thanks,
Arun





Issue with Querying Solr

2015-12-08 Thread Salman Ansari
Hi,

I have created a cluster of Solr and Zookeepers on 3 machines connected
together. Currently, I am facing a weird problem. My collection has only
261 documents and when I try to query the documents using the browser such
as

http://[ASolrServerInTheCluster]:8983/solr/sabrLocationsStore/select?q=(*:*)

it returns the documents properly. However, when I try to do the same using
Solr.NET, it throws java.lang.OutOfMemoryError: Java heap space exception
(although I have very few documents there). Any ideas why I am getting this
error?

Regards,
Salman


Re: Issue with Querying Solr

2015-12-08 Thread Andrea Gazzarini
I would set up logging in the admin console (queries should be logged out
by default), and then check the difference between the two invocations.
I don't believe the two clients are doing the *same* thing, there should be
some difference.

Another possibility is that the OOM was just around the corner and the
Solr.NET invocation, being the last caller, is not the real culprit.

Andrea




Re: Issue with Querying Solr

2015-12-08 Thread Alexandre Rafalovitch
Solr by default only returns 10 rows. SolrNet by default returns many
rows. I don't know why that would cause OOM, but that's definitely
your difference unless you dealt with it:
https://github.com/mausch/SolrNet/blob/master/Documentation/Querying.md#pagination
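
To illustrate the pagination idea, here is a language-neutral sketch in Python rather than SolrNet code. It only relies on Solr's standard start and rows query parameters; the page size of 100 is an arbitrary choice.

```python
from typing import Iterator, Tuple


def pages(num_found: int, page_size: int) -> Iterator[Tuple[int, int]]:
    """Yield (start, rows) pairs covering num_found documents."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    for start in range(0, num_found, page_size):
        yield start, min(page_size, num_found - start)


# 261 documents, as in the question; the page size of 100 is arbitrary.
for start, rows in pages(261, 100):
    print(f"start={start}&rows={rows}")
```

Each pair maps onto one request, so no single response has to carry more than one page of documents.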

Regards,
   Alex.

Newsletter and resources for Solr beginners and intermediates:
http://www.solr-start.com/




Re: Issue with Querying Solr

2015-12-08 Thread Salman Ansari
Thanks Andrea and Alexandre for your responses. The problem was indeed that
Solr.NET was requesting many rows (I captured this with Fiddler).
Currently, my setup has only 500MB of JVM heap (which I will definitely
increase), but at least I found the culprit by reducing the number of rows
returned.

Regards,
Salman



Re: Issue with Querying Solr

2015-12-08 Thread Don Bosco Durai
You only have 261 documents. That shouldn't be a problem, unless your document 
size is huge.
I feel, the problem still exists somewhere. You have just deferred it...
Bosco





