Re: Searcher Performance

Chitra R Fri, 17 Feb 2017 08:45:06 -0800

Thanks a lot, Adrien.

On Fri, Feb 17, 2017 at 10:07 PM, Adrien Grand <jpou...@gmail.com> wrote:


> Some minimal information about the fields is loaded into memory when you
> open the index reader: things like the list of fields and how they are
> indexed.
>
> However, the vast majority of the data is read from disk lazily; we do not
> warm the filesystem cache or anything like that by default. We do not use
> direct I/O either. So if you run a term query, only the pages that contain
> information about that particular field and value will be loaded into the
> cache.
>
> If you want to warm the filesystem cache explicitly, which could be a
> good idea if you have plenty of filesystem cache for your index (i.e. the
> unused memory of the system is larger than the index), you can look into
> using MMapDirectory.setPreload.
>
> On Fri, Feb 17, 2017 at 15:13, Chitra R <chithu.r...@gmail.com> wrote:
>
> > Hey, thank you so much. I got it.
> >
> > I have
> >
> >    - 10 lakh docs and 30 fields in my index,
> >    - a new searcher opened at the initial search, and
> >    - no filesystem cache yet for my current index.
> >
> > At initial search, I search across only one field out of 30 fields in my
> > index.
> >
> > My question is,
> >
> > *At the initial search, will only the required pages (OS pages of the
> > Lucene index files) for that single field be loaded into the filesystem
> > cache, or will the info for all fields be loaded from disk?*
> >
> >
> > Regards,
> > Chitra
> >
> > On Fri, Feb 17, 2017 at 7:05 PM, Adrien Grand <jpou...@gmail.com> wrote:
> >
> > > Regarding whether the filesystem cache helps, you could look at whether
> > > there is some disk activity while your queries are running.
> > >
> > > When everything is in the filesystem cache, the latency of search
> > > requests for simple queries (term queries and combinations through
> > > boolean queries) usually depends mostly on the total number of
> > > matches, since Lucene needs to call the collector on every match.
> > >
> > > On Fri, Feb 17, 2017 at 10:09, Chitra R <chithu.r...@gmail.com> wrote:
> > >
> > > > Hi,
> > > >      While working with Searcher.Search, I have noticed a difference
> > > > in performance. I have 10 lakh documents and 30 fields in my index.
> > > > I performed three searches using different queries, one after
> > > > another. At search time, the index is opened with MMapDirectory.
> > > >
> > > > *case1: *
> > > >
> > > >    - For the first search, I ran the query new TermQuery(new
> > > >    Term("name","Chitra")), which yields 1 lakh documents as result.
> > > >    Time taken for the first search = 50 - 60 ms nearly.
> > > >    - For the second search, I ran the query new TermQuery(new
> > > >    Term("animal","lion")), which also yields 1 lakh documents as
> > > >    result. Time taken for the second search = 50 - 60 ms nearly.
> > > >    - For the third search, I ran the query new TermQuery(new
> > > >    Term("bird","peacock")), which also yields 1 lakh documents as
> > > >    result. Time taken for the third search = 50 - 60 ms nearly.
> > > >
> > > > In this case, why does searcher.search take the same search time for
> > > > different queries?
> > > >
> > > > *case2:*
> > > >
> > > > If I run the same query twice, Searcher.search takes less time than
> > > > the previous search because of the OS cache.
> > > >
> > > > *Based on the above observation, *
> > > >
> > > > During the initial search, only the required portion of the index
> > > > files will be loaded into the I/O cache. For the next search, if the
> > > > required portion is not present in the OS cache, will it take time to
> > > > read those files from disk? If so, is this the reason why
> > > > searcher.search takes nearly the same search time for different
> > > > queries?
> > > >
> > > >
> > > > Regards,
> > > > Chitra
> > > >
> > >
> >
>
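
As an illustration of the lazy loading versus explicit warming discussed in the thread, here is a minimal JDK-only sketch; it does not use Lucene. It assumes, to the best of my understanding, that preloading in MMapDirectory amounts to calling MappedByteBuffer.load() on each mapped index file. The class name MmapPreloadSketch, the helper methods and the temporary file are hypothetical stand-ins, not part of any Lucene API.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

final class MmapPreloadSketch {

    /** Maps a file read-only. With preload=true, asks the OS to page the whole file in now. */
    public static MappedByteBuffer map(Path file, boolean preload) {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            if (preload) {
                buf.load(); // best-effort hint: touch every page up front
            }
            return buf; // the mapping stays valid after the channel is closed
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Sums every byte, which forces each page of the mapping to be read. */
    public static long checksum(MappedByteBuffer buf) {
        long sum = 0;
        for (int i = 0; i < buf.limit(); i++) {
            sum += buf.get(i) & 0xFF;
        }
        return sum;
    }

    /** Writes a hypothetical stand-in for an index file: sizeBytes bytes, each with value 1. */
    public static Path writeSampleFile(int sizeBytes) {
        try {
            Path file = Files.createTempFile("mmap-sketch", ".bin");
            byte[] data = new byte[sizeBytes];
            Arrays.fill(data, (byte) 1);
            Files.write(file, data);
            file.toFile().deleteOnExit();
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path file = writeSampleFile(4 * 1024 * 1024);
        MappedByteBuffer lazy = map(file, false);  // pages are faulted in on first access
        MappedByteBuffer eager = map(file, true);  // pages are requested at map time
        System.out.println("checksums match: " + (checksum(lazy) == checksum(eager)));
    }
}
```

Both mappings return the same data; the only difference is when the pages are read from disk. A file written moments earlier is usually still in the page cache, so this sketch shows the mechanics rather than a measurable cold-versus-warm difference. In Lucene itself the corresponding switch is the MMapDirectory.setPreload option named above, set before the index reader is opened.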
