On Thu, Nov 24, 2011 at 12:09 PM, Artem Lokotosh <arco...@gmail.com> wrote:
> > How big are the documents you return (how many fields, avg KB per doc,
> > etc.)?
>
> I have the following schema in my Solr configuration:
>
> <fields>
>   <field name="field1" type="text" indexed="true" stored="false"/>
>   <field name="field2" type="text" indexed="true" stored="true"/>
>   <field name="field3" type="text" indexed="true" stored="true"/>
>   <field name="field4" type="tlong" indexed="true" stored="true"/>
>   <field name="field5" type="tdate" indexed="true" stored="true"/>
>   <field name="field6" type="text" indexed="true" stored="true"/>
>   <field name="field7" type="text" indexed="true" stored="true"/>
>   <field name="field8" type="tlong" indexed="true" stored="true"/>
>   <field name="field9" type="text" indexed="true" stored="true"/>
>   <field name="field10" type="tdate" indexed="true" stored="true"/>
>   <field name="field11" type="text" indexed="true" stored="true"/>
>   <field name="id" type="string" indexed="true" stored="true" required="true"/>
> </fields>
>
> 27M–30M docs and 12-15 GB for each shard, 0.5KB per doc
>
> > Does performance get much better if you only request top 100, or top
> > 10 documents instead of top 1000?
>
> TOP N DOCS   | 10    | 100    | 1000   | 2000
> -------------|-------|--------|--------|--------
> MIN          | 124   | 146    | 237    | 747
> AVG          | 832   | 4666   | 16130  | 72542
> MAX          | 3602  | 30197  | 57339  | 159482
> QUERIES/5MIN | 75    | 73     | 49     | 51
>
> > What if you only request a couple fields, instead of fl=*?
> > What if you only search 10 shards instead of 30?
>
> Results are similar to the table above. By the way, I need to receive
> all fields from the shards.
>
> One more problem: I use SolrMeter or a simple bash script to check the
> search speed. I get QTime from 16K to 24K for the first ~20 queries,
> then from 50K to 100K for the next ~20 queries, until the servlet goes
> down.
>
> On Wed, Nov 23, 2011 at 5:55 PM, Robert Stewart <bstewart...@gmail.com>
> wrote:
> > If you request 1000 docs from each shard, then the aggregator is really
> > fetching 30,000 total documents, which it then must merge (re-sort
> > results, and take the top 1000 to return to the client). It's possible
> > that the SOLR merging implementation needs to be optimized, but it does
> > not seem like it could be that slow. How big are the documents you
> > return (how many fields, avg KB per doc, etc.)? I would take a look at
> > the network to make sure that is not some bottleneck, and also to make
> > sure there is not some underlying issue making 30 concurrent HTTP
> > requests from the aggregator. I am not an expert in Java, but under
> > .NET there is a setting that limits concurrent outgoing HTTP requests
> > from a process that must be overridden via configuration, otherwise
> > the default is very limiting.
> >
> > Does performance get much better if you only request top 100, or top
> > 10 documents instead of top 1000?
> >
> > What if you only request a couple fields, instead of fl=*?
> >
> > What if you only search 10 shards instead of 30?
> >
> > I would collect those numbers and try to determine if time increases
> > linearly or not as you increase shards and/or # of docs.
> >
> > On Wed, Nov 23, 2011 at 9:55 AM, Artem Lokotosh <arco...@gmail.com>
> > wrote:
> >>> If the response time from each shard shows decent figures, then
> >>> aggregator seems to be a bottleneck. Do you btw have a lot of
> >>> concurrent users?
> >> For now it is not a problem, but we expect from 1K to 10K concurrent
> >> users, and maybe more.
> >>
> >> On Wed, Nov 23, 2011 at 4:43 PM, Dmitry Kan <dmitry....@gmail.com>
> >> wrote:
> >>> If the response time from each shard shows decent figures, then
> >>> aggregator seems to be a bottleneck. Do you btw have a lot of
> >>> concurrent users?
> >>>
> >>> On Wed, Nov 23, 2011 at 4:38 PM, Artem Lokotosh <arco...@gmail.com>
> >>> wrote:
> >>>
> >>>> > Is this log from the frontend SOLR (aggregator) or from a shard?
> >>>> From the aggregator.
> >>>>
> >>>> > Can you merge, e.g. 3 shards together or is it much effort for
> >>>> > your team?
> >>>> Yes, we can merge. We'll try to do this and review how it works.
> >>>> Thanks, Dmitry
> >>>>
> >>>> Any other ideas?
> >>>>
> >>
> >> --
> >> Best regards,
> >> Artem Lokotosh mailto:arco...@gmail.com
> >>
> >
> --
> Best regards,
> Artem Lokotosh mailto:arco...@gmail.com
>

When you search each shard, are you positive that you are using all of
the same parameters? Are you sure you are hitting request handlers that
are configured exactly the same and sending exactly the same queries?

In my experience, the overhead for distrib search is usually very low.

What types of queries are you trying?

--
- Mark

http://www.lucidimagination.com
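
Robert's description of the aggregator's work earlier in the thread (pull the top N from each of 30 shards, re-sort, keep the top N) can be sketched as a k-way merge. This is only a minimal illustration in Python, not Solr's actual merge code: the `merge_top_k` function, the `(score, doc_id)` tuples, and the synthetic scores are all assumptions made up for the example.

```python
import heapq
from itertools import islice


def merge_top_k(shard_results, k):
    """Merge per-shard hit lists and keep the global top k.

    Each shard list is assumed to be sorted by descending score already,
    as a shard would return it. Hits are hypothetical (score, doc_id) tuples.
    """
    merged = heapq.merge(*shard_results, key=lambda hit: hit[0], reverse=True)
    return list(islice(merged, k))


# Synthetic data (an assumption): 30 shards, each returning its own top 1000.
shards = [
    [(1.0 / (rank + 1) + shard * 1e-6, f"s{shard}-d{rank}") for rank in range(1000)]
    for shard in range(30)
]

candidates = sum(len(hits) for hits in shards)  # documents the aggregator receives
top = merge_top_k(shards, 1000)                 # documents returned to the client

print(candidates, len(top))  # prints "30000 1000"
```

In this toy form the merge only has to walk roughly k entries once the per-shard lists are sorted, which is in line with Robert's remark that merging alone does not seem like it could be that slow. It says nothing about the cost of fetching stored fields for those documents over HTTP, which is what his network and concurrent-request questions are aimed at.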