Re: What is thread safe in Lucene.net?

Jokin Cuadrado Mon, 11 Jan 2010 16:18:32 -0800

Seems correct to me. I don't know the size of your index or the
frequency of updates and searches. Common optimizations are to queue
new documents and to warm new searchers before switching to them.
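
A rough sketch of the "warm before switching" idea, using only the
.NET base class library. The WarmSwap class and its members are
hypothetical names made up for this example; they are not part of
Lucene.Net. In a real application T would presumably be the
IndexSearcher, and the warm-up delegate would run a few typical
queries against it.

```csharp
using System;
using System.Threading;

// Hypothetical helper, not part of Lucene.Net: keeps one shared instance
// and replaces it only after the replacement has been warmed up.
public sealed class WarmSwap<T> where T : class
{
    private T _current;                // instance currently served to readers
    private readonly Func<T> _open;    // opens a fresh instance (e.g. a new searcher)
    private readonly Action<T> _warm;  // warm-up work run before publishing

    public WarmSwap(Func<T> open, Action<T> warm)
    {
        if (open == null) throw new ArgumentNullException("open");
        if (warm == null) throw new ArgumentNullException("warm");
        _open = open;
        _warm = warm;
        _current = Create();
    }

    // Readers call this on every request.
    public T Current
    {
        get { return Volatile.Read(ref _current); }
    }

    // Opens and warms a replacement, publishes it atomically, and returns
    // the previous instance so the caller can close it later.
    public T Refresh()
    {
        T fresh = Create();
        return Interlocked.Exchange(ref _current, fresh);
    }

    private T Create()
    {
        T instance = _open();
        _warm(instance);
        return instance;
    }
}
```

Refresh() hands back the old instance instead of closing it, because
other threads may still be in the middle of a search on it. Deciding
when it is safe to close is left to the caller in this sketch.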



By the way, the error that you encountered in the QueryParser is very
common. People new to Lucene usually make the same mistake: you get
used to sharing the different objects in Lucene (which is recommended
for IndexWriters and IndexReaders), so sharing the QueryParser seems
the most logical thing to do. But QueryParser is a class generated by
an automatic tool (JavaCC), and it has instance variables that
maintain the state of the parsing steps. If you have only one thread
you can reuse it (when making several searches in the same request,
for example), but as I said before, it's easier to create a new one
whenever you need it.
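
A minimal sketch of the per-request pattern. It assumes a reference to
a Lucene.Net 2.x assembly, where the QueryParser(string, Analyzer)
constructor is available (later releases take a Version argument
first). The QueryFactory class and its member names are hypothetical,
made up for illustration.

```csharp
using System;
using Lucene.Net.Analysis;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;

// Hypothetical wrapper: shares the analyzer, never shares the parser.
public sealed class QueryFactory
{
    // Shared by all threads; the analyzer is thread safe.
    private readonly Analyzer _analyzer;

    // Default field searched when the query string names none (assumed name).
    private readonly string _defaultField;

    public QueryFactory(Analyzer analyzer, string defaultField)
    {
        if (analyzer == null) throw new ArgumentNullException("analyzer");
        if (defaultField == null) throw new ArgumentNullException("defaultField");
        _analyzer = analyzer;
        _defaultField = defaultField;
    }

    public Query Build(string queryString)
    {
        // A new parser for every call: the generated parser keeps its
        // parsing state in instance fields, so it must not be shared
        // between concurrent requests.
        QueryParser parser = new QueryParser(_defaultField, _analyzer);
        return parser.Parse(queryString);
    }
}
```

The factory itself can live in the singleton that holds the search
engine, since the only state it keeps is the shared analyzer and the
field name.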

On 1/12/10, Simone Chiaretta <[email protected]> wrote:
> And just to make sure I did everything correctly:
>
> IndexWriter: I create one at the app startup and always use the same
> instance: many users can use the same instance to add new documents to index
> IndexSearcher: I create one at the first search, and use it to do all the
> searches (concurrent users can use the same one). And I do recreate it when
> there is a new document in the index
> Analyzer: I create one at the beginning of the app (needed to create the
> writer) and reuse it
> Directory: create one at the beginning of time to create the writer and keep
> this instance to create new indexsearchers when needed
> QueryParser: one per search
>
> Is that the correct approach?
> Thx
> Simone
>
> On Tue, Jan 12, 2010 at 12:45 AM, Jokin Cuadrado <[email protected]> wrote:
>
>> Correct, that's the way we use it.
>>
>> On 1/12/10, Simone Chiaretta <[email protected]> wrote:
>> > OK, so I can create the query parser each time, using the analyzer I
>> > created at the search engine startup? Correct?
>> > Simone
>> >
>> > On Tue, Jan 12, 2010 at 12:28 AM, Jokin Cuadrado <[email protected]> wrote:
>> >
>> >> The QueryParser is not thread safe, so you must use a new one for
>> >> every request. However, it is very lightweight: most of the
>> >> complexity comes from the underlying analyzer, and that one is
>> >> thread safe.
>> >>
>> >> On 1/12/10, Simone Chiaretta <[email protected]> wrote:
>> >> > I'm trying to go live with our search engine implementation based on
>> >> > Lucene.net.
>> >> > Unfortunately we have to keep it inside our appdomain in the web
>> >> > application to make it work in a shared hosting scenario.
>> >> >
>> >> > But we are getting quite a few problems, so I was wondering if there
>> >> > are some issues with concurrent access:
>> >> > 1 - Is the QueryParser thread safe? Can I create one at startup and
>> >> > reuse it in all my queries, or do I have to create one each time?
>> >> > I'm asking because I'm getting strange errors like:
>> >> >
>> >> > System.InvalidOperationException: Collection was modified; enumeration
>> >> > operation may not execute.
>> >> >    at System.Collections.ArrayList.ArrayListEnumeratorSimple.MoveNext()
>> >> >    at Lucene.Net.QueryParsers.QueryParser.Jj_add_error_token(Int32 kind, Int32 pos)
>> >> >    at Lucene.Net.QueryParsers.QueryParser.Jj_scan_token(Int32 kind)
>> >> >    at Lucene.Net.QueryParsers.QueryParser.Jj_3_1()
>> >> >    at Lucene.Net.QueryParsers.QueryParser.Jj_rescan_token()
>> >> >    at Lucene.Net.QueryParsers.QueryParser.GenerateParseException()
>> >> >    at Lucene.Net.QueryParsers.QueryParser.Jj_consume_token(Int32 kind)
>> >> >    at Lucene.Net.QueryParsers.QueryParser.Clause(String field)
>> >> >    at Lucene.Net.QueryParsers.QueryParser.Query(String field)
>> >> >    at Lucene.Net.QueryParsers.QueryParser.Parse(String query)
>> >> >    at Subtext.Framework.Services.SearchEngine.SearchEngineService.Search(String queryString, Int32 max, Int32 blogId, Int32 entryId)
>> >> >
>> >> > Which looks to me like a threading issue.
>> >> >
>> >> > I also got this one:
>> >> >
>> >> > Lucene.Net.QueryParsers.QueryParser+LookaheadSuccess: Error in the
>> >> > application.
>> >> >    at Lucene.Net.QueryParsers.QueryParser.Jj_scan_token(Int32 kind)
>> >> >    at Lucene.Net.QueryParsers.QueryParser.Jj_3R_2()
>> >> >    at Lucene.Net.QueryParsers.QueryParser.Jj_3R_2()
>> >> >    at Lucene.Net.QueryParsers.QueryParser.Jj_rescan_token()
>> >> >    at Lucene.Net.QueryParsers.QueryParser.Jj_3_1()
>> >> >    at Lucene.Net.QueryParsers.QueryParser.GenerateParseException()
>> >> >    at Lucene.Net.QueryParsers.QueryParser.Jj_consume_token(Int32 kind)
>> >> >    at Lucene.Net.QueryParsers.QueryParser.Jj_consume_token(Int32 kind)
>> >> >    at Lucene.Net.QueryParsers.QueryParser.Term(String field)
>> >> >    at Lucene.Net.QueryParsers.QueryParser.Clause(String field)
>> >> >    at Lucene.Net.QueryParsers.QueryParser.Clause(String field)
>> >> >    at Lucene.Net.QueryParsers.QueryParser.Query(String field)
>> >> >    at Lucene.Net.QueryParsers.QueryParser.Query(String field)
>> >> >    at Lucene.Net.QueryParsers.QueryParser.Parse(String query)
>> >> >    at Lucene.Net.QueryParsers.QueryParser.Parse(String query)
>> >> >    at Subtext.Framework.Services.SearchEngine.SearchEngineService.Search(String queryString, Int32 max, Int32 blogId, Int32 entryId)
>> >> >
>> >> > And this one:
>> >> >
>> >> > Lucene.Net.QueryParsers.ParseException: Cannot parse 'what is css url':
>> >> > Encountered "what is css url" at line 1, column 0.
>> >> > Was expecting one of:
>> >> >     <NOT> ...
>> >> >     "+" ...
>> >> >     "-" ...
>> >> >     "(" ...
>> >> >     "*" ...
>> >> >     <QUOTED> ...
>> >> >     <TERM> ...
>> >> >     <PREFIXTERM> ...
>> >> >     <WILDTERM> ...
>> >> >     "[" ...
>> >> >     "{" ...
>> >> >     <NUMBER> ...
>> >> >    at Lucene.Net.QueryParsers.QueryParser.Parse(String query)
>> >> >
>> >> > Which is fine if I really added an invalid character in the query,
>> >> > but "what is css url" looks to me like a valid query.
>> >> >
>> >> > To avoid creating a new query parser for each query, I "cache" one
>> >> > in a variable inside the singleton class that holds the search
>> >> > engine.
>> >> > Is this a good approach or a bad one? (I guess bad, since these all
>> >> > seem to be threading issues.)
>> >> > Is creating a new query parser for each query a performance problem?
>> >> >
>> >> > Thank you
>> >> > Simone
>> >> >
>> >> > --
>> >> > Simone Chiaretta
>> >> > Microsoft MVP ASP.NET - ASPInsider
>> >> > Blog: http://codeclimber.net.nz
>> >> > RSS: http://feeds2.feedburner.com/codeclimber
>> >> > twitter: @simonech
>> >> >
>> >> > Any sufficiently advanced technology is indistinguishable from magic
>> >> > "Life is short, play hard"
>> >> >
>> >>
>> >>
>> >> --
>> >> Jokin
>> >>
>> >
>> >
>> >
>> > --
>> > Simone Chiaretta
>> > Microsoft MVP ASP.NET - ASPInsider
>> > Blog: http://codeclimber.net.nz
>> > RSS: http://feeds2.feedburner.com/codeclimber
>> > twitter: @simonech
>> >
>> > Any sufficiently advanced technology is indistinguishable from magic
>> > "Life is short, play hard"
>> >
>>
>>
>> --
>> Jokin
>>
>
>
>
> --
> Simone Chiaretta
> Microsoft MVP ASP.NET - ASPInsider
> Blog: http://codeclimber.net.nz
> RSS: http://feeds2.feedburner.com/codeclimber
> twitter: @simonech
>
> Any sufficiently advanced technology is indistinguishable from magic
> "Life is short, play hard"
>


-- 
Jokin
