> My query string is always simple like "design", "principle of design", "tom"
> EG:
> URL: http://localhost:7550/solr/select/?q=design&version=2.2&start=0&rows=10&indent=on

IMO, with these types of simple searches, caching (and thus RAM usage) indeed cannot be fully exploited, i.e. there isn't really anything to cache: no sort ordering or faceting (Lucene FieldCache), no document sets or faceting (Solr filterCache).

The only thing that would help you here is a big Solr query cache (queryResultCache), depending on how often queries are repeated. Just execute the same query twice; the second time you should see a fast response (say < 20 ms). That's the query cache (and thus RAM) working for you.

> Now the issue I found is search with "fq" argument looks slow down the search.

This doesn't align with your previous statement that you only search with a q param (e.g. http://localhost:7550/solr/select/?q=design&version=2.2&start=0&rows=10&indent=on ). For your own sake, explain what you're trying to do; otherwise we really are guessing in the dark.

Anyway, the fq param lets you cache (using the Solr filterCache) individual document sets that can then be used to efficiently intersect your result set. Also, the caches have to be warmed first (i.e. the fq query has to be executed and its results saved to the cache, since there isn't anything there yet). Only from the second time on would you start seeing improvements.

For instance:

http://localhost:7550/solr/select/?q=design&fq=doctype:pdf&version=2.2&start=0&rows=10&indent=on

would only show documents containing "design" where doctype=pdf. (Again, this is just an example; I'm assuming that you have defined a field 'doctype'.) Since the number of distinct values of doctype would be pretty low, and the filter would be used independently of other queries, this would be an excellent candidate for the fq param.

http://wiki.apache.org/solr/CommonQueryParameters#fq

This was a longer reply than I intended.
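The cold-versus-warm behaviour described above can be checked with a small script. What follows is only a minimal sketch in Python, not anything from Solr itself: the helper names (build_select_url, parse_qtime, time_query, compare_cold_and_warm) are made up for illustration, the base URL is the one from the example requests in this thread, and the 'doctype' field is the same assumed field as in the example above.

```python
import time
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Assumption: base URL copied from the example requests in this thread.
SOLR_SELECT = "http://localhost:7550/solr/select/"


def build_select_url(base, q, fq=None, rows=10):
    """Build a select URL; fq is optional and may be a string or a list."""
    params = [("q", q), ("version", "2.2"), ("start", "0"),
              ("rows", str(rows)), ("indent", "on")]
    if fq:
        for clause in ([fq] if isinstance(fq, str) else fq):
            params.append(("fq", clause))
    return base + "?" + urllib.parse.urlencode(params)


def parse_qtime(xml_text):
    """Return the QTime (ms) from the responseHeader of an XML response."""
    root = ET.fromstring(xml_text)
    node = root.find("./lst[@name='responseHeader']/int[@name='QTime']")
    return int(node.text)


def time_query(url):
    """Fetch url once; return (QTime in ms, wall-clock time in ms)."""
    started = time.perf_counter()
    with urllib.request.urlopen(url) as resp:
        body = resp.read().decode("utf-8")
    wall_ms = (time.perf_counter() - started) * 1000.0
    return parse_qtime(body), wall_ms


def compare_cold_and_warm(base=SOLR_SELECT):
    """Run each query twice so the first (cold) and second (warm) can be compared."""
    queries = (
        ("q only", build_select_url(base, "design")),
        # 'doctype' is a hypothetical field, as in the example above.
        ("q + fq", build_select_url(base, "design", fq="doctype:pdf")),
    )
    for label, url in queries:
        for attempt in (1, 2):
            qtime, wall = time_query(url)
            print(f"{label}, run {attempt}: QTime={qtime}ms wall={wall:.0f}ms")
```

Calling compare_cold_and_warm() against a running instance prints two timings per query. If the caches are doing their job, the second run of each should report a noticeably lower QTime than the first; if it does not, the caches are probably too small or are being discarded between runs.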
Really think about your use cases first, then present some real examples of what you want to achieve, and then we can help you in a more useful manner.

Cheers,
Geert-Jan

2010/7/17 marship <mars...@126.com>

> Hi. Peter and All.
> I merged my indexes today. Now each index stores 10M document. Now I only
> have 10 solr cores.
> And I used
>
> java -Xmx1g -jar -server start.jar
> to start the jetty server.
>
> At first I deployed them all on one search. The search speed is about 3s.
> Then I noticed from cmd output when search start, 4 of 10's QTime only cost
> about 10ms-500ms. The left 5 cost more, up to 2-3s. Then I put 6 on web
> server, 4 on another(DB, high load most time). Then the search speed goes
> down to about 1s most time.
> Now most search takes about 1s. That's great.
>
> I watched the jetty output on cmd windows on web server, now when each
> search start, I saw 2 of 6 costs 60ms-80ms. The another 4 cost 170ms -
> 700ms. I do believe the bottleneck is still the hard disk. But at least,
> the search speed at the moment is acceptable. Maybe i should try memdisk to
> see if that help.
>
> And for -Xmx1g, actually I only see jetty consume about 150M memory,
> consider now the index is 10x bigger. I don't think that works. I googled
> -Xmx is go enlarge the heap size. Not sure can that help search. I still
> have 3.5G memory free on server.
>
> Now the issue I found is search with "fq" argument looks slow down the
> search.
>
> Thanks All for your help and suggestions.
> Thanks.
> Regards.
> Scott
>
> At 2010-07-17 03:36:19, "Peter Karich" <peat...@yahoo.de> wrote:
> >> > Each solr(jetty) instance on consume 40M-60M memory.
> >
> >> java -Xmx1024M -jar start.jar
> >
> >That's a good suggestion!
> >Please, double check that you are using the -server version of the jvm
> >and the latest 1.6.0_20 or so.
> >
> >Additionally you can start jvisualvm (shipped with the jdk) and hook
> >into jetty/tomcat easily to see the current CPU and memory load.
> >
> >> But I have 70 solr cores
> >
> >if you ask me: I would reduce them to 10-15 or even less and increase
> >the RAM.
> >try out tomcat too
> >
> >> solr distributed search's speed is decided by the slowest one.
> >
> >so, try to reduce the cores
> >
> >Regards,
> >Peter.
> >
> >> you mentioned that you have a lot of mem free, but your jetty containers
> >> only using between 40-60 mem.
> >>
> >> probably stating the obvious, but have you increased the -Xmx param like for
> >> instance:
> >> java -Xmx1024M -jar start.jar
> >>
> >> that way you're configuring the container to use a maximum of 1024 MB ram
> >> instead of the standard which is much lower (I'm not sure what exactly but
> >> it could well be 64MB for non -server, aligning with what you're seeing)
> >>
> >> Geert-Jan
> >>
> >> 2010/7/16 marship <mars...@126.com>
> >>
> >>> Hi Tom Burton-West.
> >>>
> >>> Sorry looks my email ISP filtered out your replies. I checked web version
> >>> of mailing list and saw your reply.
> >>>
> >>> My query string is always simple like "design", "principle of design",
> >>> "tom"
> >>>
> >>> EG:
> >>>
> >>> URL:
> >>> http://localhost:7550/solr/select/?q=design&version=2.2&start=0&rows=10&indent=on
> >>>
> >>> Response:
> >>>
> >>> <response>
> >>>   <lst name="responseHeader">
> >>>     <int name="status">0</int>
> >>>     <int name="QTime">16</int>
> >>>     <lst name="params">
> >>>       <str name="indent">on</str>
> >>>       <str name="start">0</str>
> >>>       <str name="q">design</str>
> >>>       <str name="version">2.2</str>
> >>>       <str name="rows">10</str>
> >>>     </lst>
> >>>   </lst>
> >>>   <result name="response" numFound="5981" start="0">
> >>>     <doc>
> >>>       <str name="id">product_208619</str>
> >>>     </doc>
> >>>
> >>> EG:
> >>> http://localhost:7550/solr/select/?q=Principle&version=2.2&start=0&rows=10&indent=on
> >>>
> >>> <response>
> >>>   <lst name="responseHeader">
> >>>     <int name="status">0</int>
> >>>     <int name="QTime">94</int>
> >>>     <lst name="params">
> >>>       <str name="indent">on</str>
> >>>       <str name="start">0</str>
> >>>       <str name="q">Principle</str>
> >>>       <str name="version">2.2</str>
> >>>       <str name="rows">10</str>
> >>>     </lst>
> >>>   </lst>
> >>>   <result name="response" numFound="104" start="0">
> >>>     <doc>
> >>>       <str name="id">product_56926</str>
> >>>     </doc>
> >>>
> >>> As I am querying over single core and other cores are not querying at same
> >>> time. The QTime looks good.
> >>>
> >>> But when I query the distributed node: (For this case, 6422ms is still a
> >>> not bad one. Many cost ~20s)
> >>>
> >>> URL:
> >>> http://localhost:7499/solr/select/?q=the+first+world+war&version=2.2&start=0&rows=10&indent=on&debugQuery=true
> >>>
> >>> Response:
> >>>
> >>> <response>
> >>>   <lst name="responseHeader">
> >>>     <int name="status">0</int>
> >>>     <int name="QTime">6422</int>
> >>>     <lst name="params">
> >>>       <str name="debugQuery">true</str>
> >>>       <str name="indent">on</str>
> >>>       <str name="start">0</str>
> >>>       <str name="q">the first world war</str>
> >>>       <str name="version">2.2</str>
> >>>       <str name="rows">10</str>
> >>>     </lst>
> >>>   </lst>
> >>>   <result name="response" numFound="4231" start="0">
> >>>
> >>> Actually I am thinking and testing a solution: As I believe the bottleneck
> >>> is in harddisk and all our indexes add up is about 10-15G. What about I just
> >>> add another 16G memory to my server then use "MemDisk" to map a memory disk
> >>> and put all my indexes into it. Then each time, solr/jetty need to load
> >>> index from harddisk, it is loading from memory. This should give solr the
> >>> most throughout and avoid the harddisk access delay. I am testing ....
> >>>
> >>> But if there are way to make solr use better use our limited resource to
> >>> avoid adding new ones. that would be great.
> >>>
> >
> >--
> >http://karussell.wordpress.com/