Why don't you use Thread.interrupt(), .isInterrupted() ? On Sat, Jun 27, 2009 at 16:16, Shai Erera<ser...@gmail.com> wrote: >> A downside of breaking it out into static methods like this is that a >> thread cannot run >1 time-limited activity simultaneously but I guess that >> might be a reasonable restriction. > > I'm not sure I understand that - how can a thread run >1 activity > simultaneously anyway, and how's your impl in TimeLimitingIndexReader allows > it to do so? You use the thread as a key to the map. Am I missing something? > > Anyway, I think we can let go of the static methods and make them instance > methods. I think .. if I want to use time limited activities, I should > create a TimeLimitedThreadActivity instance and pass it around, to > TimeLimitingIndexReader (and maybe in the future to a similar **IndexWriter) > and any other custom code I have which I want to put a time limit on. > > A static class has the advantage of not needing to pass it around > everywhere, and is accessible from everywhere, so that if we discover that > limiting on IndexReader is not enough, and we want some of the scorers to > check more frequently if they should stop, we won't need to pass that > instance all the way down to them. > > I don't mind keeping it static, but I also don't mind if it will be an > instance passed around, since currently it's only passed to IndexReader. > > Are you going to open an issue for that? Seems like a nice addition to me. > Do you think it should belong in core or contrib? If 'core' then if that's > possible I'd like to see all timeout classes under one package, including > TimeLimitingCollector (which until 2.9 we can safely move around as we > want). > I don't mind working on that w/ you, if you want. > > Shai > > On Sat, Jun 27, 2009 at 2:24 PM, Mark Harwood <markharw...@yahoo.co.uk> > wrote: >> >> Thanks for the feedback, Shai. >> So I guess you're suggesting breaking this out into a general utility >> class e.g. something like: >> class TimeLimitedThreadActivity >> { >> //called by client >> public static void startTimeLimitedActivity(long >> maxTimePermitted). >> public static void endTimeLimitedActivity() >> //called by resources (reader/writers) that need to be shared >> fairly by threads >> public static void checkActivityNotElapsed(); //throws some form of >> runtime exception >> } >> A downside of breaking it out into static methods like this is that a >> thread cannot run >1 time-limited activity simultaneously but I guess that >> might be a reasonable restriction. >> >> >>Aside, how about using a PQ for the threads' times, or a TreeMap. That >> >> will save looping over the collection to find the next candidate. Just an >> >> implementation detail though. >> Yep, that was one of the rough edges - I just wanted to get raw timings >> first for the all the "is timed out?" checks we're injecting into reader >> calls. >> Cheers >> Mark >> >> On 27 Jun 2009, at 11:37, Shai Erera wrote: >> >> I like the overall approach. However it's very local to an IndexReader. >> I.e., if someone wanted to limit other operations (say indexing), or does >> not use an IndexReader (for a Scorer impl maybe), one cannot reuse it. >> >> What if we factor out the timeout logic to a Timeout class (I think it can >> be a static class, with the way you implemented it) and use it in >> TimeLimitingIndexReader? That class can offer a method check() which will do >> the internal logic (the 'if' check and throw exception). It will be similar >> to the current ensureOpen() followed by an operation. >> >> It might be considered more expensive since it won't check a boolean, but >> instead call a check() method, but it will be more reusable. Also, >> ensureOpen today is also a method call, so I don't think Timeout.check() is >> that bad. We can even later create a TimeLimitingIndexWriter and document >> Timeout class for other usage by external code. >> >> Aside, how about using a PQ for the threads' times, or a TreeMap. That >> will save looping over the collection to find the next candidate. Just an >> implementation detail though. >> >> Shai >> >> On Sat, Jun 27, 2009 at 3:31 AM, Mark Harwood <markharw...@yahoo.co.uk> >> wrote: >>> >>> Going back to my post re TimeLimitedIndexReaders - here's an incomplete >>> but functional prototype: >>> http://www.inperspective.com/lucene/TimeLimitedIndexReader.java >>> http://www.inperspective.com/lucene/TestTimeLimitedIndexReader.java >>> >>> The principle is that all reader accesses check a volatile variable >>> indicating something may have timed out (no need to check thread locals >>> etc.) If and only if a time out has been noted threadlocals are checked to >>> see which thread should throw a timeout exception. >>> All time-limited use of reader must be wrapped in try...finally calls to >>> indicate the start and stop of a timed set of activities. A background >>> thread maintains the next anticipated timeout deadline and simply waits >>> until this is reached or the list of planned activities changes with new >>> deadlines. >>> >>> Performance seems reasonable on my Wikipedia index: >>> //some tests for heavy use of termenum/term docs >>> Read term docs for 200000 terms in 4755 ms using no timeout limit (warm >>> up) >>> Read term docs for 200000 terms in 4320 ms using no timeout limit (warm >>> up) >>> Read term docs for 200000 terms in 4320 ms using no timeout limit >>> Read term docs for 200000 terms in 4388 ms using reader with >>> time-limited access >>> //Example query with heavy use of termEnum/termDocs >>> +text:f* +text:a* +text:b* no time limit matched 1090041 docs in 2000 ms >>> +text:f* +text:a* +text:b* time limited collector matched 1090041 docs in >>> 1963 ms >>> +text:f* +text:a* +text:b* time limited reader matched 1090041 docs in >>> 2121 ms >>> //Example fuzzy match burning CPU reading TermEnum >>> text:accomodation~0.5 no time limit matched 192084 docs in 6428 ms >>> text:accomodation~0.5 time limited collector matched 192084 docs in 5923 >>> ms >>> text:accomodation~0.5 time limited reader matched 192084 docs in 5945 ms >>> >>> The reader approach to limiting time is slower but has these advantages : >>> 1) Multiple reader activities can be time-limited rather than just single >>> searches >>> 2) No code changes required to scorers/queries/filters etc >>> 3) Tasks that spend plenty of time burning CPU before collection happens >>> can be killed earlier >>> I'm sure there's some thread safety issues to work through in my code and >>> not all reader classes are wrapped (e.g. TermPositions) but the basics are >>> there and seem to be functioning >>> Thoughts? >> > >
-- Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com) Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423 ICQ: 104465785 --------------------------------------------------------------------- To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org