Thanks, Erick, for your response. If pagination is not the main culprit,
what other factors do you suggest I tweak to narrow down the cause? As I
mentioned, navigating to result 20,000 using start and rows makes Solr.NET
time out, and I need a way to fix that.
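
To make the comparison concrete, here is a minimal Python sketch of the two
request styles I am comparing. It is only an illustration under assumptions:
the function names are hypothetical, the uniqueKey field is assumed to be
called "id", and fetch stands in for whatever actually sends the parameters
to the /select handler and returns the parsed JSON response. It is not what
Solr.NET does internally.

```python
from typing import Callable, Dict, Iterator, List


def offset_params(query: str, page: int, rows: int = 20) -> Dict[str, str]:
    """Classic paging: the start offset grows with the page number."""
    return {"q": query, "start": str(page * rows), "rows": str(rows), "wt": "json"}


def cursor_params(query: str, cursor_mark: str = "*", rows: int = 20,
                  unique_key: str = "id") -> Dict[str, str]:
    """Cursor paging: no start offset; the sort must end on the uniqueKey field.

    unique_key="id" is an assumption about the schema.
    """
    return {
        "q": query,
        "rows": str(rows),
        "sort": f"score desc, {unique_key} asc",
        "cursorMark": cursor_mark,  # "*" asks for the first page
        "wt": "json",
    }


def iterate_pages(fetch: Callable[[Dict[str, str]], dict], query: str,
                  rows: int = 20) -> Iterator[List[dict]]:
    """Walk the whole result set, one page at a time, using cursorMark.

    fetch is a hypothetical callable that performs the request and returns
    the decoded JSON response as a dict.
    """
    cursor = "*"
    while True:
        response = fetch(cursor_params(query, cursor, rows))
        docs = response["response"]["docs"]
        if docs:
            yield docs
        next_cursor = response["nextCursorMark"]
        if next_cursor == cursor:  # an unchanged cursor means no more results
            break
        cursor = next_cursor
```

With 20 rows per page, offset_params("*:*", page=1000) asks for start=20000,
which is the request that times out for me. With cursors, the front end has
to keep the nextCursorMark returned by each response to request the next
page.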

You suggested that a 4GB JVM is not enough. I have seen MapQuest going with
a 10GB JVM, as mentioned here
http://www.slideshare.net/lucidworks/high-performance-solr-and-jvm-tuning-strategies-used-for-map-quests-search-ahead-darren-spehr
and they were getting 140 ms response time for 10 Billion documents. I am
not sure how many shards they had, though. With around 70M documents, how
many shards do you suggest I use, and how much memory should I dedicate to
RAM and to the JVM?

Regards,
Salman

On Fri, Oct 9, 2015 at 6:37 PM, Erick Erickson <erickerick...@gmail.com>
wrote:

> I think paging is something of a red herring. You say:
>
> bq: but still I get delays of around 16 seconds and sometimes even more.
>
> Even for a start of 1,000, this is ridiculously long for Solr. All
> you're really saving here is keeping a record of the id and score for a
> list 1,000 cells long (or even 20,000 assuming 1,000 pages and 20
> docs/page). That's somewhat wasteful, but it's still hard to believe
> it's responsible for what you're seeing.
>
> Having 4G of RAM for 70M docs is very little memory, assuming this is on
> a single shard.
>
> So my suspicion is that you have something fundamentally slow about
> your system; the additional overhead from paging shouldn't be as large
> as you're reporting.
>
> And I'll second Toke's comment. It's very rare that users see anything
> _useful_ by navigating that deep. Make them hit next next next and they'll
> tire out way before that.
>
> Cursor mark's sweet spot is handling some kind of automated process that
> goes through the whole result set. It'll work for what you're trying
> to do though.
>
> Best,
> Erick
>
> On Fri, Oct 9, 2015 at 8:27 AM, Salman Ansari <salman.rah...@gmail.com>
> wrote:
> > Is this a real problem or a worry? Do you have users that page really
> > deep and if so, have you considered other mechanisms for delivering
> > what they need?
> >
> > The issue is that currently I have around 70M documents and some generic
> > queries are resulting in lots of pages. Now if I try deep navigation (to
> > page 1,000, for example), a lot of the time the query takes so long that
> > Solr.NET throws an operation timeout exception. The first page is
> > relatively faster to load, but it still takes a few seconds as well.
> > After reading some documentation I realized that cursors could help, and
> > they do. I have tried the following to get better performance:
> >
> > 1) Used cursors instead of start and rows
> > 2) Increased the RAM on my Solr machine to 14GB
> > 3) Increased the JVM on that machine to 4GB
> > 4) Increased the filterCache
> > 5) Increased the documentCache
> > 6) Ran Optimize from the Solr Admin
> >
> > but still I get delays of around 16 seconds and sometimes even more.
> > What other mechanisms do you suggest I should use to handle this issue?
> >
> > While cursor-based pagination is faster than increasing the start
> > parameter, the difference is small as long as you stay below a start of
> > 1000. 10K might also work for you. Do your users page beyond that?
> > I can limit users so they do not go beyond 10K, but I still think that
> > at that level cursors will be much faster than increasing the start
> > parameter, as explained here
> > (https://cwiki.apache.org/confluence/display/solr/Pagination+of+Results).
> > Have you tried both ways on your collection, and did they give you
> > similar results?
> >
> > On Fri, Oct 9, 2015 at 5:20 PM, Toke Eskildsen <t...@statsbiblioteket.dk>
> > wrote:
> >
> >> Salman Ansari <salman.rah...@gmail.com> wrote:
> >>
> >> [Pagination with cursors]
> >>
> >> > For example, what happens if the user navigates from page 1 to page 2,
> >> > does the front end  need to store the next cursor at each query?
> >>
> >> Yes.
> >>
> >> > What about going to a previous page, do we need to store all cursors
> >> > that have been navigated up to now at the client side?
> >>
> >> Yes, if you want to provide that functionality.
> >>
> >> Is this a real problem or a worry? Do you have users that page really
> >> deep and if so, have you considered other mechanisms for delivering
> >> what they need?
> >>
> >> While cursor-based pagination is faster than increasing the start
> >> parameter, the difference is small as long as you stay below a start of
> >> 1000. 10K might also work for you. Do your users page beyond that?
> >>
> >> - Toke Eskildsen
> >>
>
