Re: [jira] Commented: (LUCENE-1781) Large distances in Spatial go beyond Prime MEridian

2009-09-22 Thread Mark Miller
We would likely back patch it to the branch anyway - and/or put it in 2.9.1 - it won't go to 2.9. Though if we had to rebuild for some reason it will be in. - Mark http://www.lucidimagination.com (mobile) On Sep 22, 2009, at 1:44 AM, "Uwe Schindler" wrote: I thought, we are already in th

Re: 2.9 NRT w.r.t. sorting and field cache

2009-09-22 Thread Mark Miller
1483 - IndexSearcher pulls out a reader's subreaders (SegmentReaders) and sends a collector over them one by one, rather than using the MultiReader. So only the FieldCache entries for segment readers that change need to be reloaded. - Mark http://www.lucidimagination.com (mobile) On Sep 22, 2009, at 1:27 AM, John W

Re: 2.9 NRT w.r.t. sorting and field cache

2009-09-22 Thread John Wang
Thanks Mark for the pointer! I guess my point is with NRT, and when segment files change often, this would be an issue, no? Anyway, I can run some tests. Thanks -John On Tue, Sep 22, 2009 at 3:21 PM, Mark Miller wrote: > 1483 - indexsearcher pulls out a readers subreaders (segmentreaders) an

RE: 2.9 NRT w.r.t. sorting and field cache

2009-09-22 Thread Uwe Schindler
The NRT reader coming from IndexWriter.getReader() has changes only in the currently processed segments; the other segments stay stable (as do their IndexReader keys used for the FieldCache). For the consumer it looks like a normal reader (it is in fact
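The per-segment caching described in this thread can be sketched roughly as below. This is a minimal illustration, not Lucene code: the class `PerSegmentCacheSketch`, its methods, and the use of segment names as cache keys are all assumptions made for the example (the thread says Lucene keys the FieldCache on the segment's IndexReader). It only shows why a reopened reader reloads entries for new or merged segments and reuses the rest.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a per-segment field cache; NOT Lucene's FieldCache API.
class PerSegmentCacheSketch {
    // Cache keyed by segment identity (here simply the segment name).
    private final Map<String, int[]> cache = new HashMap<>();
    // Number of segment loads performed so far, for illustration only.
    private int loads = 0;

    // "Opens" a reader view over the given segments and returns the names
    // of the segments whose cache entries had to be loaded for it.
    List<String> open(List<String> segments) {
        List<String> loaded = new ArrayList<>();
        for (String seg : segments) {
            if (!cache.containsKey(seg)) {
                cache.put(seg, loadValues(seg));
                loaded.add(seg);
            }
        }
        // Drop entries for segments that are gone (for example, merged away).
        cache.keySet().retainAll(segments);
        return loaded;
    }

    // Pretend to read field values from disk for one segment.
    private int[] loadValues(String seg) {
        loads++;
        return new int[] { seg.length() };
    }

    int loadCount() {
        return loads;
    }

    public static void main(String[] args) {
        PerSegmentCacheSketch fc = new PerSegmentCacheSketch();
        System.out.println(fc.open(Arrays.asList("A", "B", "C")));      // all three are new
        System.out.println(fc.open(Arrays.asList("A", "B", "C", "D"))); // only the flushed segment D
        System.out.println(fc.open(Arrays.asList("A", "BC", "D")));     // only the merged segment BC
    }
}
```

The last call also illustrates John's concern: when B and C are merged, the entry for BC has to be loaded from scratch even though its contents were cached under the old segments.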

Re: Lucene Spatial

2009-09-22 Thread Alex
Hi, I am also very interested in that matter and I would likely contribute any help (testing etc ...) if possible. If you guys plan on rewriting something clean, I would suggest you consider using the Java Topology Suite (JTS) http://www.vividsolutions.com/jts/jtshome.htm as a base for spatial ob

Re: Build failed in Hudson: Lucene-trunk #955

2009-09-22 Thread Michael McCandless
Hmm and this was the test that failed: [junit] Testsuite: org.apache.lucene.index.TestIndexWriter [junit] Test org.apache.lucene.index.TestIndexWriter FAILED (crashed) The java sub-process running TestIndexWriter apparently crashed. Here's some more detail, from the junitResult.xml in this

Re: Lucene Spatial

2009-09-22 Thread Michael McCandless
On Tue, Sep 22, 2009 at 4:08 AM, Alex wrote: > I would suggest you consider > using the Java Topology Suite (JTS) > http://www.vividsolutions.com/jts/jtshome.htm as a base for spatial objects, +1, this looks compelling! Mike -

Hudson build is back to normal: Lucene-trunk #956

2009-09-22 Thread Apache Hudson Server
See - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org

Re: How to leverage the LogMergePolicy "calibrateSizeByDeletes" patch in Solr ?

2009-09-22 Thread Grant Ingersoll
On Sep 21, 2009, at 9:35 PM, John Wang wrote: Jason: You are missing the point. The idea is to avoid merging of large segments. The point of this MergePolicy is to balance segment merges across the index. The aim is not to have 1 large segment, it is to have n segments with bala

RE: Lucene Spatial

2009-09-22 Thread Steven A Rowe
The link Alex provided looks like it's home to an older snapshot of the project's history; I think this is the project's current home: http://tsusiatsoftware.net/jts/main.html Steve > -Original Message- > From: Michael McCandless [mailto:luc...@mikemccandless.com] > Sent: Tuesda

Re: How to leverage the LogMergePolicy "calibrateSizeByDeletes" patch in Solr ?

2009-09-22 Thread Michael McCandless
John are you using IndexWriter.setMergedSegmentWarmer, so that a newly merged segment is warmed before it's "put into production" (returned by getReader)? Mike On Mon, Sep 21, 2009 at 9:35 PM, John Wang wrote: > Jason: > >     You are missing the point. > >     The idea is to avoid merging of la

Re: How to leverage the LogMergePolicy "calibrateSizeByDeletes" patch in Solr ?

2009-09-22 Thread Yonik Seeley
On Tue, Sep 22, 2009 at 10:48 AM, Michael McCandless wrote: > John are you using IndexWriter.setMergedSegmentWarmer, so that a newly > merged segment is warmed before it's "put into production" (returned > by getReader)? I'm still not sure I see the reason for complicating the IndexWriter with wa

Re: How to leverage the LogMergePolicy "calibrateSizeByDeletes" patch in Solr ?

2009-09-22 Thread Earwin Burrfoot
On Tue, Sep 22, 2009 at 19:08, Yonik Seeley wrote: > On Tue, Sep 22, 2009 at 10:48 AM, Michael McCandless > wrote: >> John are you using IndexWriter.setMergedSegmentWarmer, so that a newly >> merged segment is warmed before it's "put into production" (returned >> by getReader)? > > I'm still not

Re: Build failed in Hudson: Lucene-trunk #955

2009-09-22 Thread Robert Muir
but it says the tests only ran for 12 minutes, so it took a day to compile? On Tue, Sep 22, 2009 at 5:32 AM, Michael McCandless < luc...@mikemccandless.com> wrote: > Hmm and this was the test that failed: > > [junit] Testsuite: org.apache.lucene.index.TestIndexWriter > [junit] Test org.apache

Re: 2.9 NRT w.r.t. sorting and field cache

2009-09-22 Thread Jason Rutherglen
> When heavy indexing introduces or modifies segments, would it cause reloading of FieldCache at query time and thus would impact search performance? How is this different than previous versions of Lucene? In 2.9 the field caches are only loaded for new segments incrementally, instead of over th

Re: How to leverage the LogMergePolicy "calibrateSizeByDeletes" patch in Solr ?

2009-09-22 Thread Michael McCandless
On Tue, Sep 22, 2009 at 11:08 AM, Yonik Seeley wrote: > I'm still not sure I see the reason for complicating the IndexWriter > with warming... can't this be done just as efficiently (if not more > efficiently) in user/application space? It will be less efficient when you warm outside of IndexWri

Re: Build failed in Hudson: Lucene-trunk #955

2009-09-22 Thread Michael McCandless
On Tue, Sep 22, 2009 at 11:15 AM, Robert Muir wrote: > but it says the tests only ran for 12 minutes, so it took a day to compile? Yeah I don't get what exactly took so long... Mike

Re: How to leverage the LogMergePolicy "calibrateSizeByDeletes" patch in Solr ?

2009-09-22 Thread Jason Rutherglen
Adding segment warming to IW is the only way to ensure newly merged segments are quickly searchable without the impact brought up by John W regarding queries on new segments being slow when they load field caches. On Tue, Sep 22, 2009 at 8:37 AM, Michael McCandless wrote: > On Tue, Sep 22, 2009 a

Re: How to leverage the LogMergePolicy "calibrateSizeByDeletes" patch in Solr ?

2009-09-22 Thread Michael McCandless
It's not only that the newly merged segments are quickly searchable (you could do that with warming outside of IW). It's more importantly so that you can continue to add/delete docs, flush the segment, open a new NRT reader, and search those changes, without waiting for the warming to complete. Y

Re: How to leverage the LogMergePolicy "calibrateSizeByDeletes" patch in Solr ?

2009-09-22 Thread Jason Rutherglen
Right, it allows warming without interrupting obtaining new readers. I'll update the realtime wiki with this. Thanks Mike. On Tue, Sep 22, 2009 at 8:53 AM, Michael McCandless wrote: > It's not only that the newly merged segments are quickly searchable > (you could do that with warming outside of

Re: How to leverage the LogMergePolicy "calibrateSizeByDeletes" patch in Solr ?

2009-09-22 Thread Yonik Seeley
On Tue, Sep 22, 2009 at 11:37 AM, Michael McCandless wrote: > The whole point of putting optional warming into IndexWriter was so > the segment could be warmed *before* the merge commits the change to > the writer's SegmentInfos. But... doesn't this add the same amount of latency in a different p

Re: How to leverage the LogMergePolicy "calibrateSizeByDeletes" patch in Solr ?

2009-09-22 Thread Ted Dunning
Actually, I strongly disagree. If you optimize for this case, you are pessimizing for the real world. It would be much better to fit a realistic life cycle or just record a trace of profile updates (no need for content, just an abstract id for each profile that got updated). On Mon, Sep 21, 2009

Re: How to leverage the LogMergePolicy "calibrateSizeByDeletes" patch in Solr ?

2009-09-22 Thread Yonik Seeley
OK Mike, thanks for your patience - I understand now :-) Here's an example that helped me understand - hopefully it will add to others' understanding more than it confuses ;-) IW.getReader() => segments={A, B} // something causes a merge of A,B into AB to start addDoc(doc1) // doc1 goes into s

Re: How to leverage the LogMergePolicy "calibrateSizeByDeletes" patch in Solr ?

2009-09-22 Thread Michael McCandless
Well described, that's exactly it! I like the concrete example :) Thanks Yonik. Mike On Tue, Sep 22, 2009 at 1:38 PM, Yonik Seeley wrote: > OK Mike, thanks for your patience - I understand now :-) > > Here's an example that helped me understand - hopefully it will add to > others understanding

Re: How to leverage the LogMergePolicy "calibrateSizeByDeletes" patch in Solr ?

2009-09-22 Thread Jason Rutherglen
John, I have a few questions in order to better understand, as the wiki does not reflect the entirety of what you're trying to describe. > But it is required to set up several parameters carefully to get desired behavior. Which parameters are you referring to? What were the ZMP parameters used

Re: How to leverage the LogMergePolicy "calibrateSizeByDeletes" patch in Solr ?

2009-09-22 Thread Tim Smith
This sounds pretty interesting. Is there a proposed API for doing this warming yet? Is there a ticket tracking this? For my use cases, it would be really nice for applications to be able to associate a custom "IndexCache" object with an index reader, then this pluggable "AutoWarmer" would be in ch

Re: How to leverage the LogMergePolicy "calibrateSizeByDeletes" patch in Solr ?

2009-09-22 Thread Michael McCandless
On Tue, Sep 22, 2009 at 2:01 PM, Tim Smith wrote: > is there a proposed API for doing this warming yet? It's already committed and available in 2.9 (see IndexWriter.setMergedSegmentWarmer). > for my use cases, it would be really nice for applications to be able to > associate a custom "IndexCac

Re: How to leverage the LogMergePolicy "calibrateSizeByDeletes" patch in Solr ?

2009-09-22 Thread Mark Miller
1. see IndexWriter and the method/class that Mike pointed out earlier for the warming. 2. See Lucene-831 - I think we will get some form of that in someday. Tim Smith wrote: > This sounds pretty interesting > > is there a proposed API for doing this warming yet? > Is there a ticket tracking this?

Re: [VOTE] Release Lucene 2.9.0

2009-09-22 Thread Grant Ingersoll
I'll get to an official vote tonight or tomorrow AM, but thanks in advance for all the hard work. On Sep 21, 2009, at 1:06 PM, Mark Miller wrote: Okay, lets give this a shot: The (proposed) release artifacts have been built and are up at: http://people.apache.org/~markrmiller/staging-area/l

Re: How to leverage the LogMergePolicy "calibrateSizeByDeletes" patch in Solr ?

2009-09-22 Thread Jason Rutherglen
I have a working version of Simple FieldCache Merging LUCENE-1785 that should go in real soon. On Tue, Sep 22, 2009 at 11:14 AM, Mark Miller wrote: > 1. see IndexWriter and the method/class that Mike pointed out earlier > for the warming. > > 2. See Lucene-831 - I think we will get some form of t

Re: How to leverage the LogMergePolicy "calibrateSizeByDeletes" patch in Solr ?

2009-09-22 Thread Tim Smith
Jason Rutherglen wrote: > I have a working version of Simple FieldCache Merging LUCENE-1785 that > should go in real soon. > > Will this contain a callback mechanism i can register with to know what segments are being merged? that way i can merge my own caches as well at the application layer, p

IndexWriter.getReader() was Re: How to leverage the LogMergePolicy "calibrateSizeByDeletes" patch in Solr ?

2009-09-22 Thread Grant Ingersoll
Slight divergence from the topic... On Sep 22, 2009, at 10:48 AM, Michael McCandless wrote: John are you using IndexWriter.setMergedSegmentWarmer, so that a newly merged segment is warmed before it's "put into production" (returned by getReader)? One of the pieces I still am missing from all

Re: How to leverage the LogMergePolicy "calibrateSizeByDeletes" patch in Solr ?

2009-09-22 Thread Jason Rutherglen
For that you can subclass IW.mergeSuccess. On Tue, Sep 22, 2009 at 11:43 AM, Tim Smith wrote: > Jason Rutherglen wrote: > > I have a working version of Simple FieldCache Merging LUCENE-1785 that > should go in real soon. > > > > Will this contain a callback mechanism i can register with to know w

Re: IndexWriter.getReader() was Re: How to leverage the LogMergePolicy "calibrateSizeByDeletes" patch in Solr ?

2009-09-22 Thread Grant Ingersoll
On Sep 22, 2009, at 2:47 PM, Grant Ingersoll wrote: And yet, at the first SF Meetup, I recall having a discussion with Michael B. about this approach versus IR.reopen() that left me wondering which one is better, since, Lucene has, in fact, always been about incremental updates (since th

Re: IndexWriter.getReader() was Re: How to leverage the LogMergePolicy "calibrateSizeByDeletes" patch in Solr ?

2009-09-22 Thread Jason Rutherglen
> which one is better Better for what? What use case are you thinking of? The merge reasons were covered well in the previous thread. Another gain is the carry over of deletes in RAM. I'm getting the feeling the Realtime wiki needs a lot of work. http://wiki.apache.org/lucene-java/NearRealtimeSe

Re: How to leverage the LogMergePolicy "calibrateSizeByDeletes" patch in Solr ?

2009-09-22 Thread Tim Smith
Jason Rutherglen wrote: > For that you can subclass IW.mergeSuccess. > > looks like thats package private :( also doesn't look like it has the merged output SegmentReader which could be used for cache loading/cache key (since it may not have been opened yet, but with NRT it should be available?)

Re: IndexWriter.getReader() was Re: How to leverage the LogMergePolicy "calibrateSizeByDeletes" patch in Solr ?

2009-09-22 Thread Michael McCandless
On Tue, Sep 22, 2009 at 2:53 PM, Grant Ingersoll wrote: > One of the pieces I still am missing from all of this is why isn't > IW.getReader() now just the preferred way of getting a IndexReader > for all applications other than those that are completely batch > oriented? > > Why bother with IndexR

Re: IndexWriter.getReader() was Re: How to leverage the LogMergePolicy "calibrateSizeByDeletes" patch in Solr ?

2009-09-22 Thread Grant Ingersoll
On Sep 22, 2009, at 3:44 PM, Michael McCandless wrote: On Tue, Sep 22, 2009 at 2:53 PM, Grant Ingersoll wrote: One of the pieces I still am missing from all of this is why isn't IW.getReader() now just the preferred way of getting a IndexReader for all applications other than those that are

Re: How to leverage the LogMergePolicy "calibrateSizeByDeletes" patch in Solr ?

2009-09-22 Thread Jason Rutherglen
Yeah it's all package private, I think it should be protected. One would use OneMerge.info to then obtain the newly merged SR via IW.getReader(). There's no reason not to include the newly merged SR in OneMerge, there wasn't a need when 1516 was written. On Tue, Sep 22, 2009 at 12:00 PM, Tim Smit

Re: Build failed in Hudson: Lucene-trunk #955

2009-09-22 Thread Chris Hostetter
: but it says the tests only ran for 12 minutes, so it took a day to compile? The JUnit report on total testing time is just the sum of the timing reported for each test, and as the TestIndexWriter report notes... : > 0.0030 ... : > Forked Java VM exited abnormally. Please

Re: 2.9 NRT w.r.t. sorting and field cache

2009-09-22 Thread John Wang
I understand what you are saying. Let me detail what I am trying to say: When "currently processed segments" are flushed down, merge may happen. When merges happen, some of those "stable segments" will be invalidated, and so will the fieldcache data keyed by them. In a high update environment, su

Re: IndexWriter.getReader() was Re: How to leverage the LogMergePolicy "calibrateSizeByDeletes" patch in Solr ?

2009-09-22 Thread Michael McCandless
On Tue, Sep 22, 2009 at 3:53 PM, Grant Ingersoll wrote: >> But, the returned reader is read-only, so you can't use it to change >> norms, do deletes, etc. > > Yeah, but an IW can do deletes, and if this IR is coupled to it > anyway... True, but IW's deletes are still buffered, and you can't

RE: svn commit: r817279 - /lucene/java/dist/KEYS

2009-09-22 Thread Uwe Schindler
Hi all, should we use this version as the tag for "Lucene 2.9.0 release" and also as base for the 2.9 branch? Since we already added some unreleased changes to trunk, we should think, which revision is the release of 2.9.0 and when do we start modifying svn? If Mark's artifacts are the final ones

Re: 2.9 NRT w.r.t. sorting and field cache

2009-09-22 Thread Mark Miller
Right - when a large segment is invalidated, you will have a bigger fieldcache piece to reload - pre 2.9, you'd be reloading the *whole* field cache every time though. Sounds like you are trying to deal with those large segments changing anyway :) They are always an issue when doing RT it seems. I

Re: How to leverage the LogMergePolicy "calibrateSizeByDeletes" patch in Solr ?

2009-09-22 Thread John Wang
Jason: I am not sure what "parameters" you are referring to either. Are you responding to the right email? Anyhoot, I used the defaults for everything in both MergePolicies. LogMergePolicy.setCalibrateSizeByDeletes was a contribution by us from ZMP to normalize segment size using deleted doc co

Re: 2.9 NRT w.r.t. sorting and field cache

2009-09-22 Thread Michael McCandless
This is exactly why we added IndexWriter.setMergedSegmentWarmer -- you can warm the reader w/o blocking ongoing updates. Mike On Tue, Sep 22, 2009 at 7:15 PM, Mark Miller wrote: > Right - when a large segment is invalidated, you will have a bigger > fieldcache piece to reload - pre 2.9, you'd be

NumericRangeQuery "static" constructors

2009-09-22 Thread Grant Ingersoll
I was wondering what was the reason for the "static" constructors on the NumericRangeQuery? I don't get the point of a static method call that simply passes through to a normal constructor. Are people somehow magically more capable of discerning the meaning of a static method than simply

RE: NumericRangeQuery "static" constructors

2009-09-22 Thread Uwe Schindler
It is just for some type safety (with generics it would be better to handle). The problem is that overloading the ctor with long, int, float, double makes it really hard to choose the right one (you miss an L when passing a number and get an int range query and so on, especially when Java 5's autob
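The overload problem Uwe describes can be shown with a small sketch. `RangeSketch` is a hypothetical stand-in, not NumericRangeQuery; its factory names and two-argument signatures are assumptions made for the example. The point is only that with overloads the type of the literal silently picks the variant, while a named factory makes the caller state the intended type.

```java
// Hypothetical stand-in class; NOT Lucene's NumericRangeQuery.
class RangeSketch {
    final String type;
    final Number min;
    final Number max;

    private RangeSketch(String type, Number min, Number max) {
        this.type = type;
        this.min = min;
        this.max = max;
    }

    // Overloaded entry points: the argument types decide which one runs,
    // so forgetting the L suffix quietly yields an int range.
    static RangeSketch of(int min, int max) {
        return new RangeSketch("int", Integer.valueOf(min), Integer.valueOf(max));
    }

    static RangeSketch of(long min, long max) {
        return new RangeSketch("long", Long.valueOf(min), Long.valueOf(max));
    }

    // Named factories: the method name carries the intended type, and int
    // literals are widened to long where a long range is requested.
    static RangeSketch newIntRange(int min, int max) {
        return new RangeSketch("int", Integer.valueOf(min), Integer.valueOf(max));
    }

    static RangeSketch newLongRange(long min, long max) {
        return new RangeSketch("long", Long.valueOf(min), Long.valueOf(max));
    }

    public static void main(String[] args) {
        System.out.println(RangeSketch.of(1, 100).type);           // int (no L suffix)
        System.out.println(RangeSketch.of(1L, 100L).type);         // long
        System.out.println(RangeSketch.newLongRange(1, 100).type); // long, despite int literals
    }
}
```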

Re: 2.9 NRT w.r.t. sorting and field cache

2009-09-22 Thread John Wang
Hi Michael: Thanks for the pointer! Pardon my ignorance, but I am still not seeing the connection between this API and per-segment loading of FieldCache. (the API takes in an IndexReader instead of maybe SegmentReader[]) Can you point me to maybe the default impl of IndexReaderWar

Re: 2.9 NRT w.r.t. sorting and field cache

2009-09-22 Thread Mark Miller
Come on dude :) Spend a half ounce of effort first. Mike's time is too valuable ! Luckily mine is not. There is no default impl - the class is dead simple (and the class has been pointed out like 3 times in this thread - I'm not even fully following and I know where to find it): public static

Re: 2.9 NRT w.r.t. sorting and field cache

2009-09-22 Thread Mark Miller
Oh - yeah - also - you'll be passed a segment reader if that's what makes sense. And since it does, you will be passed one every time. You can warm a multireader the same way though, so no reason to pin it down. Mark Miller wrote: > Come on dude :) Spend a half ounce of effort first. Mike's time is

Re: 2.9 NRT w.r.t. sorting and field cache

2009-09-22 Thread John Wang
Mark: I did spend at least a quarter of an ounce. :) And I am sure Mike's time is more valuable than mine, but it was meant to be a "double-check" I was under the impression there is a default impl from previous email threads on how to handle field cache warming, perhaps I misunderstood. The rea

Re: 2.9 NRT w.r.t. sorting and field cache

2009-09-22 Thread Mark Miller
Don't take me too seriously John - I doubt anyone does :) And I wasn't implying Mike's time was more valuable than yours. I was being ... uh ... me :) And I don't claim that all of your many questions could have been found in 5 seconds ;) Just the ones you were asking - it's very quick (at least

Re: 2.9 NRT w.r.t. sorting and field cache

2009-09-22 Thread John Wang
Thanks Mike for your valuable time! Sorry to be a pest, I am trying to write a fair perf test and to understand the feature. If there are other experts on the subject of index reader warming, please chime in. I am not seeing the connection between given an IndexReader and the FieldCacheImpl API,

Re: 2.9 NRT w.r.t. sorting and field cache

2009-09-22 Thread John Wang
No worries. Just trying to understand things. I wanted to double check but didn't want to write "My IDE told me that was the case" to sound pissy. I did look at the code, sometimes too much actually, but I never want to claim I understand the code 100%, hence going to the source is probably the b

Re: 2.9 NRT w.r.t. sorting and field cache

2009-09-22 Thread Mark Miller
What I would do is: In the warm method, load a FieldCache for every field I was going to end up using a FieldCache for. If it's just for sorting, I might do a search with a sort on every field I was going to sort on. That will get the segment FieldCaches into RAM before the SegmentReader is put int
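The warm-before-publish idea discussed in this thread can be sketched as follows. Everything here is a hypothetical stand-in (`Segment`, `Warmer`, `Writer`, `warmerFor`), not the IndexWriter or FieldCache API; a real warmer would load the FieldCache or run a sorted search against the reader it is handed. The sketch only shows the ordering: the warmer runs on the merged segment before that segment becomes visible to readers.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-ins only; NOT Lucene's IndexWriter or FieldCache API.
class WarmBeforePublishSketch {
    // Stand-in for a per-segment reader that remembers which fields are warm.
    static final class Segment {
        final String name;
        final Set<String> warmedFields = new HashSet<>();

        Segment(String name) {
            this.name = name;
        }
    }

    // Stand-in for a merged-segment warmer callback.
    interface Warmer {
        void warm(Segment merged);
    }

    // Stand-in for the writer side that owns the live segment list.
    static final class Writer {
        private final List<Segment> live = new ArrayList<>();
        private Warmer warmer;

        void setWarmer(Warmer warmer) {
            this.warmer = warmer;
        }

        void add(String name) {
            live.add(new Segment(name));
        }

        // Merges all live segments into one and warms the result
        // BEFORE swapping it into the list that readers see.
        void mergeAll(String mergedName) {
            Segment merged = new Segment(mergedName);
            if (warmer != null) {
                warmer.warm(merged);
            }
            live.clear();
            live.add(merged);
        }

        // Returns a snapshot of the currently visible segments.
        List<Segment> getReader() {
            return new ArrayList<>(live);
        }
    }

    // Builds a warmer that "loads" every field the application sorts on.
    static Warmer warmerFor(String... sortFields) {
        return merged -> {
            for (String field : sortFields) {
                merged.warmedFields.add(field);
            }
        };
    }

    public static void main(String[] args) {
        Writer writer = new Writer();
        writer.setWarmer(warmerFor("price", "date"));
        writer.add("A");
        writer.add("B");
        writer.mergeAll("AB");
        Segment merged = writer.getReader().get(0);
        System.out.println(merged.name + " warmed: " + merged.warmedFields.size() + " fields");
    }
}
```

Because the warming happens inside `mergeAll`, any reader obtained afterwards sees a merged segment whose sort fields are already loaded, which is the behavior the thread attributes to IndexWriter.setMergedSegmentWarmer.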

SegmentReader

2009-09-22 Thread John Wang
Just realized it, thanks for making SegmentReader public!!! -John

[jira] Updated: (LUCENE-1313) Near Realtime Search (using a built in RAMDirectory)

2009-09-22 Thread Jason Rutherglen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Rutherglen updated LUCENE-1313: - Attachment: LUCENE-1313.patch Patch is updated to trunk and compiles. Unit tests fail be

[jira] Updated: (LUCENE-1313) Near Realtime Search (using a built in RAMDirectory)

2009-09-22 Thread Jason Rutherglen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Rutherglen updated LUCENE-1313: - Attachment: LUCENE-1313.patch Ah, missing file now included. > Near Realtime Search (us

Re: How to leverage the LogMergePolicy "calibrateSizeByDeletes" patch in Solr ?

2009-09-22 Thread Yonik Seeley
Ya know, it turned out to be embarrassingly simple - I think I just had a mental block from thinking about how Solr's warming worked for so long. Actually, it was so simple, yet I still got it wrong at first glance, that it reminded me of this: http://www.marilynvossavant.com/forum/viewtopic.p