Re: TestCodecs running time

2010-04-14 Thread Shai Erera
I see you already did that, Mike :). Thanks! Now the tests run for 2s. Shai On Fri, Apr 9, 2010 at 12:49 PM, Michael McCandless < luc...@mikemccandless.com> wrote: > It's also slow because it repeats all the tests for each of the core > codecs (standard, sep, pulsing, intblock). > > I think it's f

Re: Proposal about Version API "relaxation"

2010-04-14 Thread Shai Erera
So then I don't understand this: {quote} * A major release always bumps the major release number (2.x -> 3.0), and, starts a new branch for all minor (3.1, 3.2, 3.3) releases along that branch * There is no back compat across major releases (index nor APIs), but full back compat within b

Re: Proposal about Version API "relaxation"

2010-04-14 Thread Mark Miller
I don't read what you wrote and what Mike wrote as even close to the same. - Mark http://www.lucidimagination.com (mobile) On Apr 15, 2010, at 12:05 AM, Shai Erera wrote: Ahh ... a dream finally comes true ... what a great way to start a day :). +1 !!! I have some questions/comments tho

Re: Proposal about Version API "relaxation"

2010-04-14 Thread Shai Erera
Also, we will still need to maintain the Backwards section in CHANGES (or move it to API Changes), to help people upgrade from release to release. Just pointing that out as well. Shai On Thu, Apr 15, 2010 at 7:05 AM, Shai Erera wrote: > Ahh ... a dream finally comes true ... what a great way to

Re: Proposal about Version API "relaxation"

2010-04-14 Thread Shai Erera
Ahh ... a dream finally comes true ... what a great way to start a day :). +1 !!! I have some questions/comments though: * Index back compat should be maintained between major releases, like it is today, STRUCTURE-wise. So apps get a chance to incrementally upgrade their segments when they move f

[jira] Commented: (LUCENE-2324) Per thread DocumentsWriters that write their own private segments

2010-04-14 Thread Michael Busch (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857164#action_12857164 ] Michael Busch commented on LUCENE-2324: --- {quote} It's for performance. I expect ther

[jira] Commented: (LUCENE-2359) CartesianPolyFilterBuilder doesn't handle edge case around the 180 meridian

2010-04-14 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857158#action_12857158 ] Grant Ingersoll commented on LUCENE-2359: - Reverted the last patch and the other r

[jira] Commented: (LUCENE-2393) Utility to output total term frequency and df from a lucene index

2010-04-14 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857155#action_12857155 ] Mark Miller commented on LUCENE-2393: - Perhaps this should be combined with high freq

Re: Proposal about Version API "relaxation"

2010-04-14 Thread Andi Vajda
On Thu, 15 Apr 2010, Earwin Burrfoot wrote: Can't believe my eyes. +1 Likewise. +1 ! Andi.. On Thu, Apr 15, 2010 at 01:22, Michael McCandless wrote: On Wed, Apr 14, 2010 at 12:06 AM, Marvin Humphrey wrote: Essentially, we're free to break back compat within "Lucy" at any time, but w

[jira] Commented: (LUCENE-2324) Per thread DocumentsWriters that write their own private segments

2010-04-14 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857124#action_12857124 ] Michael McCandless commented on LUCENE-2324: This is awesome Michael! Much si

[jira] Commented: (LUCENE-2393) Utility to output total term frequency and df from a lucene index

2010-04-14 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857121#action_12857121 ] Michael McCandless commented on LUCENE-2393: Programmatically indexing those d

Re: Proposal about Version API "relaxation"

2010-04-14 Thread Earwin Burrfoot
Can't believe my eyes. +1 On Thu, Apr 15, 2010 at 01:22, Michael McCandless wrote: > On Wed, Apr 14, 2010 at 12:06 AM, Marvin Humphrey > wrote: > >> Essentially, we're free to break back compat within "Lucy" at any time, but >> we're not able to break back compat within a stable fork like "Lucy

Re: Proposal about Version API "relaxation"

2010-04-14 Thread Robert Muir
+1 On Wed, Apr 14, 2010 at 5:22 PM, Michael McCandless < luc...@mikemccandless.com> wrote: > On Wed, Apr 14, 2010 at 12:06 AM, Marvin Humphrey > wrote: > > > Essentially, we're free to break back compat within "Lucy" at any time, > but > > we're not able to break back compat within a stable fork

[jira] Commented: (LUCENE-2324) Per thread DocumentsWriters that write their own private segments

2010-04-14 Thread Jason Rutherglen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857112#action_12857112 ] Jason Rutherglen commented on LUCENE-2324: -- Michael, nice! I guess I should've s

[jira] Commented: (LUCENE-2393) Utility to output total term frequency and df from a lucene index

2010-04-14 Thread Otis Gospodnetic (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857107#action_12857107 ] Otis Gospodnetic commented on LUCENE-2393: -- I think creating a small index with a

[jira] Assigned: (LUCENE-1698) Change backwards-compatibility policy

2010-04-14 Thread Michael Busch (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Busch reassigned LUCENE-1698: - Assignee: (was: Michael Busch) :) > Change backwards-compatibility policy > ---

Re: Proposal about Version API "relaxation"

2010-04-14 Thread Chris Male
On Wed, Apr 14, 2010 at 11:22 PM, Michael McCandless < luc...@mikemccandless.com> wrote: > On Wed, Apr 14, 2010 at 12:06 AM, Marvin Humphrey > wrote: > > > Essentially, we're free to break back compat within "Lucy" at any time, > but > > we're not able to break back compat within a stable fork li

Re: Proposal about Version API "relaxation"

2010-04-14 Thread Michael McCandless
On Wed, Apr 14, 2010 at 12:06 AM, Marvin Humphrey wrote: > Essentially, we're free to break back compat within "Lucy" at any time, but > we're not able to break back compat within a stable fork like "Lucy1", > "Lucy2", etc. So what we'll probably do during normal development with > Analyzers is

[jira] Updated: (LUCENE-2324) Per thread DocumentsWriters that write their own private segments

2010-04-14 Thread Michael Busch (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Busch updated LUCENE-2324: -- Attachment: lucene-2324.patch The patch removes all *PerThread classes downstream of Documents

[jira] Commented: (LUCENE-2359) CartesianPolyFilterBuilder doesn't handle edge case around the 180 meridian

2010-04-14 Thread Nicolas Helleringer (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857081#action_12857081 ] Nicolas Helleringer commented on LUCENE-2359: - Edit done. I am currently brow

[jira] Issue Comment Edited: (LUCENE-2359) CartesianPolyFilterBuilder doesn't handle edge case around the 180 meridian

2010-04-14 Thread Nicolas Helleringer (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856954#action_12856954 ] Nicolas Helleringer edited comment on LUCENE-2359 at 4/14/10 4:40 PM: --

[jira] Updated: (LUCENE-2394) Factories for cache creation

2010-04-14 Thread Oswaldo Dantas (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oswaldo Dantas updated LUCENE-2394: --- Comment: was deleted (was: By the way, in http://wiki.apache.org/lucene-java/HowToContribute

[jira] Commented: (LUCENE-2394) Factories for cache creation

2010-04-14 Thread Oswaldo Dantas (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857069#action_12857069 ] Oswaldo Dantas commented on LUCENE-2394: By the way, in http://wiki.apache.org/luc

[jira] Commented: (LUCENE-2359) CartesianPolyFilterBuilder doesn't handle edge case around the 180 meridian

2010-04-14 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857054#action_12857054 ] Grant Ingersoll commented on LUCENE-2359: - So, you're saying then that your approa

[jira] Updated: (LUCENE-2394) Factories for cache creation

2010-04-14 Thread Oswaldo Dantas (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oswaldo Dantas updated LUCENE-2394: --- Attachment: factoriesPatch.patch Attaching factory suggestion (patch for changes to https:/

Re: Bug in contrib/misc/HighFreqTerms.java?

2010-04-14 Thread Michael McCandless
OK I committed the fix. I ran it on a flex wikipedia index I had... it produces output like this: body:[3c 21 2d 2d] 509050 body:[73 68 6f 75 6c 64] 515495 body:[74 68 65 6e] 525176 body:[74 69 74 6c 65] 525361 body:[5b 5b 55 6e 69 74 65 64] 532586 body:[6b 6e 6f 77 6e] 533558 body:[75 6e 64 65 7
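The output lines in the message above show each term as space-separated hex bytes. As a minimal sketch for reading them, the helper below turns one such line back into text. It assumes the line format is `field:[hex bytes] count` exactly as it appears in the snippet, and that the bytes are UTF-8; `decode_term` is a hypothetical name, not part of HighFreqTerms.

```python
def decode_term(line):
    """Split a 'field:[hex bytes] count' line into (field, term text, count).

    Assumes the bracketed bytes are the UTF-8 encoding of the term.
    """
    field, rest = line.split(":[", 1)
    hex_bytes, count = rest.split("]", 1)
    # bytes.fromhex ignores the spaces between byte pairs.
    text = bytes.fromhex(hex_bytes).decode("utf-8")
    return field, text, int(count)


if __name__ == "__main__":
    for sample in ("body:[3c 21 2d 2d] 509050", "body:[74 69 74 6c 65] 525361"):
        print(decode_term(sample))
```

Under these assumptions, the first line of the quoted output decodes to the term `<!--` in field `body`.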

[jira] Created: (LUCENE-2394) Factories for cache creation

2010-04-14 Thread Oswaldo Dantas (JIRA)
Factories for cache creation Key: LUCENE-2394 URL: https://issues.apache.org/jira/browse/LUCENE-2394 Project: Lucene - Java Issue Type: Improvement Components: Search Reporter: Oswaldo Danta

Re: Bug in contrib/misc/HighFreqTerms.java?

2010-04-14 Thread Michael McCandless
Ugh, I'll fix this. With the new flex API, you can't ask a composite (Multi/DirReader) for its postings -- you have to go through the static methods on MultiFields. I'm trying to put some distance b/w IndexReader and composite readers... because I'd like to eventually deprecate them. Ie, the comp

Re: Proposal about Version API "relaxation"

2010-04-14 Thread Robert Muir
On Wed, Apr 14, 2010 at 2:49 PM, Uwe Schindler wrote: > > And 2.9's backwards compatibility layer in > > TokenStream > > was significantly slower. > > I protest! No, it was not slower, only at the beginning because of missing > reflection caching! But this also affected the *new* API. With 2.9.x

RE: Proposal about Version API "relaxation"

2010-04-14 Thread Uwe Schindler
> And 2.9's backwards compatibility layer in > TokenStream > was significantly slower. I protest! No, it was not slower, only at the beginning because of missing reflection caching! But this also affected the *new* API. With 2.9.x and old TokenStreams there is no speed difference, really. Uwe

RE: Proposal about Version API "relaxation"

2010-04-14 Thread Uwe Schindler
+1, thanks for this detailed explanation! In my apps I have no problem defining a static default myself. And passing this to every ctor is easy, so where is the problem? Look at Solr: since we introduced the version param to solrconfig, you have exactly that behavior, but it's limited to this so

Re: Proposal about Version API "relaxation"

2010-04-14 Thread Marvin Humphrey
On Wed, Apr 14, 2010 at 12:49:52AM -0400, Robert Muir wrote: > It's very unnatural for release 3.0 to be almost a no-op and for release 3.1 > to provide a new default index format and support for customizing how the > index is stored. And now we are looking at providing flexibility in scoring > tha

[jira] Commented: (LUCENE-2359) CartesianPolyFilterBuilder doesn't handle edge case around the 180 meridian

2010-04-14 Thread Nicolas Helleringer (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857010#action_12857010 ] Nicolas Helleringer commented on LUCENE-2359: - Yonik, It is the case, but th

RE: issues.apache.org compromised: please update your passwords

2010-04-14 Thread Chris Hostetter
: I disabled the account by assigning a dummy eMail and gave it a random password. : : I was not able to unassign the issues, as most issues were "Closed", : where no modifications can be done anymore. Reopening and changing Uwe: it may be too late (depending on whether you remember the dummy

[jira] Commented: (LUCENE-2359) CartesianPolyFilterBuilder doesn't handle edge case around the 180 meridian

2010-04-14 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857006#action_12857006 ] Yonik Seeley commented on LUCENE-2359: -- Perhaps I'm misreading the table? I had assu

Bug in contrib/misc/HighFreqTerms.java?

2010-04-14 Thread Burton-West, Tom
When I try to run HighFreqTerms.java in Lucene Revision: 933722 I get the exception appended below. I believe the line of code involved is a result of the flex indexing merge. Should I post this as a comment to LUCENE-2370 (Reintegrate flex branch into trunk)? Or is there simply somethin

[jira] Commented: (LUCENE-2359) CartesianPolyFilterBuilder doesn't handle edge case around the 180 meridian

2010-04-14 Thread Nicolas Helleringer (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857001#action_12857001 ] Nicolas Helleringer commented on LUCENE-2359: - Hi Yonik, I do not agree: as

[jira] Commented: (LUCENE-2359) CartesianPolyFilterBuilder doesn't handle edge case around the 180 meridian

2010-04-14 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856984#action_12856984 ] Yonik Seeley commented on LUCENE-2359: -- Hi Nicolas, I like the idea of reducing the n

[jira] Commented: (LUCENE-2359) CartesianPolyFilterBuilder doesn't handle edge case around the 180 meridian

2010-04-14 Thread Nicolas Helleringer (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856980#action_12856980 ] Nicolas Helleringer commented on LUCENE-2359: - I do agree, it is odd. I shall

[jira] Issue Comment Edited: (LUCENE-2393) Utility to output total term frequency and df from a lucene index

2010-04-14 Thread Tom Burton-West (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856965#action_12856965 ] Tom Burton-West edited comment on LUCENE-2393 at 4/14/10 1:26 PM: --

[jira] Commented: (LUCENE-2393) Utility to output total term frequency and df from a lucene index

2010-04-14 Thread Tom Burton-West (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856967#action_12856967 ] Tom Burton-West commented on LUCENE-2393: - For an example of how this utility can

[jira] Updated: (LUCENE-2393) Utility to output total term frequency and df from a lucene index

2010-04-14 Thread Tom Burton-West (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom Burton-West updated LUCENE-2393: Attachment: LUCENE-2393.patch Patch against recent trunk > Utility to output total term f

[jira] Created: (LUCENE-2393) Utility to output total term frequency and df from a lucene index

2010-04-14 Thread Tom Burton-West (JIRA)
Utility to output total term frequency and df from a lucene index - Key: LUCENE-2393 URL: https://issues.apache.org/jira/browse/LUCENE-2393 Project: Lucene - Java Issue Type: Ne

Re: [SPATIAL] Best Fit Calculation

2010-04-14 Thread Grant Ingersoll
Thanks. I added my comment on the issue. I think we should revert and then someone can put up a patch to make this pluggable. As it stands, this Best Fit calculation has nothing to do with the CartesianTierPlotter anyway, so we could refactor it pretty easily. -Grant On Apr 14, 2010, at 12:

[jira] Commented: (LUCENE-2359) CartesianPolyFilterBuilder doesn't handle edge case around the 180 meridian

2010-04-14 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856963#action_12856963 ] Grant Ingersoll commented on LUCENE-2359: - Thanks, Nicolas. To me, based on these

Re: [SPATIAL] Best Fit Calculation

2010-04-14 Thread Grant Ingersoll
On Apr 14, 2010, at 12:12 PM, Chris Male wrote: > Hi, > > On Wed, Apr 14, 2010 at 6:07 PM, Grant Ingersoll wrote: > > On Apr 14, 2010, at 11:06 AM, Chris Male wrote: > > > Hi, > > > > My understanding of the benefits of the new algorithm is that it means a > > lower tier level resulting in f

Re: Proposal about Version API "relaxation"

2010-04-14 Thread Andi Vajda
On Apr 14, 2010, at 7:45, Yonik Seeley wrote: On Wed, Apr 14, 2010 at 10:39 AM, DM Smith wrote: Maybe have the index store the version(s) and use that when constructing a reader or writer? That would cause a reindex to change behavior (among other problems). If the index contained

Re: Proposal about Version API "relaxation"

2010-04-14 Thread Mark Miller
On 04/14/2010 12:29 PM, Marvin Humphrey wrote: On Wed, Apr 14, 2010 at 08:30:14AM -0400, Grant Ingersoll wrote: The thing I keep going back to is that somehow Lucene has managed for years (and I mean lots of years) w/o stuff like Version and all this massive back compatibility checking.

Re: Proposal about Version API "relaxation"

2010-04-14 Thread Robert Muir
On Wed, Apr 14, 2010 at 12:29 PM, Marvin Humphrey wrote: > > > I also am not sure whether in the past we just missed/ignored more > back > > compatibility issues or whether now we are creating more back compat. > issues > > due to more rapid change. > > It would be hard to search the archives t

[jira] Commented: (LUCENE-2359) CartesianPolyFilterBuilder doesn't handle edge case around the 180 meridian

2010-04-14 Thread Nicolas Helleringer (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856954#action_12856954 ] Nicolas Helleringer commented on LUCENE-2359: - Summary tables : ||Tile Level|

Re: [SPATIAL] Best Fit Calculation

2010-04-14 Thread Helleringer, Nicolas
The tables render properly on JIRA: https://issues.apache.org/jira/browse/LUCENE-2359 Nicolas 2010/4/14 Helleringer, Nicolas > Here are the summary tables: > > First, a table as a reminder of the tier metrics: > Tile Level TierLength TierBoxes TileLength (

Re: [SPATIAL] Best Fit Calculation

2010-04-14 Thread Helleringer, Nicolas
Here are the summary tables:

First, a table as a reminder of the tier metrics:

Tile Level  TierLength  TierBoxes  TileLength (miles)
0           1           1          24902
1           2           4          12451
2           4           16         6225.5
3           8           64         3112.75
4           16          256        1556.375
5           32          1024       778.1875
6           64          4096       389.09375
7           128         16384      194.546875
8           256         65536      97.2734375
9           512         262144     4
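The complete rows of the tier table above are consistent with a simple doubling rule: TierLength is 2 to the power of the level, TierBoxes is its square, and TileLength is 24902 miles divided by TierLength. The sketch below reproduces the rows under that assumption. The rule is inferred from the numbers shown, not taken from the spatial contrib code, and `tier_metrics` is a hypothetical name.

```python
# Tile length at tier 0, taken from the first row of the table (miles).
TIER_ZERO_TILE_LENGTH_MILES = 24902


def tier_metrics(level):
    """Return (tier_length, tier_boxes, tile_length_miles) for a tier level.

    Assumes each tier doubles the number of tiles per side.
    """
    tier_length = 2 ** level
    tier_boxes = tier_length ** 2
    tile_length = TIER_ZERO_TILE_LENGTH_MILES / tier_length
    return tier_length, tier_boxes, tile_length


if __name__ == "__main__":
    for level in range(9):
        print(level, *tier_metrics(level))
```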

Re: [SPATIAL] Best Fit Calculation

2010-04-14 Thread Chris Male
On Wed, Apr 14, 2010 at 6:24 PM, Yonik Seeley wrote: > On Wed, Apr 14, 2010 at 12:12 PM, Chris Male wrote: > > On Wed, Apr 14, 2010 at 6:07 PM, Grant Ingersoll > >> On Apr 14, 2010, at 11:06 AM, Chris Male wrote: > >> > For those doing just Cartesian Tier filtering it seems like the new > >> > a

Re: Proposal about Version API "relaxation"

2010-04-14 Thread Marvin Humphrey
On Wed, Apr 14, 2010 at 08:30:14AM -0400, Grant Ingersoll wrote: > The thing I keep going back to is that somehow Lucene has managed for years > (and I mean lots of years) w/o stuff like Version and all this massive back > compatibility checking. Non-constant global variables are an anti-pattern.

Re: [SPATIAL] Best Fit Calculation

2010-04-14 Thread Yonik Seeley
On Wed, Apr 14, 2010 at 12:12 PM, Chris Male wrote: > On Wed, Apr 14, 2010 at 6:07 PM, Grant Ingersoll >> On Apr 14, 2010, at 11:06 AM, Chris Male wrote: >> > For those doing just Cartesian Tier filtering it seems like the new >> > approach is a win, but for those doing distance calculations on t

Re: [SPATIAL] Best Fit Calculation

2010-04-14 Thread Chris Male
Hi, On Wed, Apr 14, 2010 at 6:07 PM, Grant Ingersoll wrote: > > On Apr 14, 2010, at 11:06 AM, Chris Male wrote: > > > Hi, > > > > My understanding of the benefits of the new algorithm is that it means a > lower tier level resulting in fewer boxes, but more documents inside those > boxes that are

Re: [SPATIAL] Best Fit Calculation

2010-04-14 Thread Grant Ingersoll
On Apr 14, 2010, at 11:28 AM, Helleringer, Nicolas wrote: > That minTile param allows you to trade off between filtering accuracy > and faster tile filtering. Without the param (or until it can be > implemented) the correct approach seems like the above, without a > minTile. This sounds to me l

Re: [SPATIAL] Best Fit Calculation

2010-04-14 Thread Grant Ingersoll
On Apr 14, 2010, at 11:06 AM, Chris Male wrote: > Hi, > > My understanding of the benefits of the new algorithm is that it means a > lower tier level resulting in fewer boxes, but more documents inside those > boxes that are outside of the search radius. > > While having fewer boxes means few

Re: [SPATIAL] Best Fit Calculation

2010-04-14 Thread Helleringer, Nicolas
> > That minTile param allows you to trade off between filtering accuracy > and faster tile filtering. Without the param (or until it can be > implemented) the correct approach seems like the above, without a > minTile. This sounds to me like the old approach is correct. > minTier and maxTier at

[jira] Commented: (LUCENE-2359) CartesianPolyFilterBuilder doesn't handle edge case around the 180 meridian

2010-04-14 Thread Nicolas Helleringer (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856929#action_12856929 ] Nicolas Helleringer commented on LUCENE-2359: - What my code does: it looks how

Re: [SPATIAL] Best Fit Calculation

2010-04-14 Thread Helleringer, Nicolas
I'll try to find a little time tonight to run a sample data set through the two calculations to see the differences, and I'll make a summary table. I'll comment on the issue with some notes on 'my' version of the algorithm right now. Nicolas 2010/4/14 Chris Male > Hi, > > My understandi

Re: [SPATIAL] Best Fit Calculation

2010-04-14 Thread Yonik Seeley
On Wed, Apr 14, 2010 at 11:06 AM, Chris Male wrote: > While having fewer boxes means fewer term queries to make against the index, > more documents means more costly calculations to filter out those extraneous > documents. Filtering out documents (greater selectivity) seems like it should be the

Re: [SPATIAL] Best Fit Calculation

2010-04-14 Thread Chris Male
Hi, My understanding of the benefits of the new algorithm is that it means a lower tier level resulting in fewer boxes, but more documents inside those boxes that are outside of the search radius. While having fewer boxes means fewer term queries to make against the index, more documents means mo

[SPATIAL] Best Fit Calculation

2010-04-14 Thread Grant Ingersoll
LUCENE-2359 changed the best fit calculation. I admit, I'm not entirely certain which one is right, so I thought we should step back and talk about what we are trying to achieve. Please correct me if/where I am wrong. Looking at the problem of tiers/tiles/grids in general, we are taking a sphe

pseudo document ids in my own indexreader/writer

2010-04-14 Thread Thomas Koch
Hi, there are currently two projects, porting lucandra to HBase: http://github.com/akkumar/hbasene http://github.com/thkoch2001/lucehbase hbasene stores a unique integer with each stored document, while lucehbase directly stores the user's primary key in the termVector table. Every lucehbase in

[jira] Commented: (LUCENE-2159) Tool to expand the index for perf/stress testing.

2010-04-14 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856917#action_12856917 ] Shai Erera commented on LUCENE-2159: bq. There is an excellent section on it in LIA2

Re: Proposal about Version API "relaxation"

2010-04-14 Thread Yonik Seeley
On Wed, Apr 14, 2010 at 10:39 AM, DM Smith wrote: > Maybe have the index store the version(s) and use that when constructing a > reader or writer? That would cause a reindex to change behavior (among other problems). -Yonik

[jira] Commented: (LUCENE-2159) Tool to expand the index for perf/stress testing.

2010-04-14 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856916#action_12856916 ] Mark Miller commented on LUCENE-2159: - There is an excellent section on it in LIA2 :)

Re: Proposal about Version API "relaxation"

2010-04-14 Thread DM Smith
On 04/14/2010 09:13 AM, Robert Muir wrote: It's not sidetracked at all. There seem to be more compelling alternatives to achieve the same thing, so we should consider alternative solutions, too. Maybe have the index store the version(s) and use that when constructing a reader or writer? Given en

[jira] Commented: (LUCENE-2159) Tool to expand the index for perf/stress testing.

2010-04-14 Thread John Wang (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856913#action_12856913 ] John Wang commented on LUCENE-2159: --- Yeah, that sounds great! I will need to learn how t

[jira] Commented: (LUCENE-2159) Tool to expand the index for perf/stress testing.

2010-04-14 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856911#action_12856911 ] Shai Erera commented on LUCENE-2159: Which is fine - I think this would be a neat task

[jira] Commented: (LUCENE-2159) Tool to expand the index for perf/stress testing.

2010-04-14 Thread John Wang (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856908#action_12856908 ] John Wang commented on LUCENE-2159: --- Shai: I am just stating our experiences. I am not

[jira] Commented: (LUCENE-2359) CartesianPolyFilterBuilder doesn't handle edge case around the 180 meridian

2010-04-14 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856902#action_12856902 ] Grant Ingersoll commented on LUCENE-2359: - Nicolas, Why the change in the best fi

LatLng rework

2010-04-14 Thread Helleringer, Nicolas
Hi, I will be working on the LatLng point implementation now that LUCENE-2359, LUCENE-2366, LUCENE-2367, LUCENE-1777 and LUCENE-1921 are OK. I will propose small patches under LUCENE-1934, but they will also resolve part or all of LUCENE-2149 and LUCENE-2148. The next step after that will be to addres

[jira] Commented: (LUCENE-1777) Error on distance query where miles < 1.0

2010-04-14 Thread Nicolas Helleringer (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856887#action_12856887 ] Nicolas Helleringer commented on LUCENE-1777: - Can someone confirm my analysis

[jira] Commented: (LUCENE-1921) Absurdly large radius (miles) search fails to include entire earth

2010-04-14 Thread Nicolas Helleringer (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856885#action_12856885 ] Nicolas Helleringer commented on LUCENE-1921: - I rechecked after Grant's co

Re: Proposal about Version API "relaxation"

2010-04-14 Thread Robert Muir
It's not sidetracked at all. There seem to be more compelling alternatives to achieve the same thing, so we should consider alternative solutions, too. On Wed, Apr 14, 2010 at 8:54 AM, Earwin Burrfoot wrote: > The thread somehow got sidetracked. So, let's get this carriage back > on its rails? >

[jira] Commented: (LUCENE-2159) Tool to expand the index for perf/stress testing.

2010-04-14 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856877#action_12856877 ] Shai Erera commented on LUCENE-2159: bq. I understand having a general performance sui

Re: Proposal about Version API "relaxation"

2010-04-14 Thread Earwin Burrfoot
The thread somehow got sidetracked. So, let's get this carriage back on its rails? Let me remind you: we have an API on our hands that is mandatory and tends to be cumbersome. The proposed solution does indeed have the ultrascary word "static" in it. But if you brace yourself and look closer, the use of said st

[jira] Commented: (LUCENE-2159) Tool to expand the index for perf/stress testing.

2010-04-14 Thread John Wang (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856869#action_12856869 ] John Wang commented on LUCENE-2159: --- Shai: You are right, we found this tool usef

Re: Proposal about Version API "relaxation"

2010-04-14 Thread Grant Ingersoll
On Apr 14, 2010, at 12:49 AM, Robert Muir wrote: > > On Wed, Apr 14, 2010 at 12:06 AM, Marvin Humphrey > wrote: > New class names would work, too. > > I only mention that for the sake of completeness, though -- it's not a > suggestion. > > Right, to me this is just as bad. > In my eyes, the

[jira] Commented: (LUCENE-2184) CartesianPolyFilterBuilder doesn't properly account for which tiers actually exist in the index

2010-04-14 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856857#action_12856857 ] Grant Ingersoll commented on LUCENE-2184: - Thanks, Nicolas. Applied. > Cartesian

[jira] Commented: (LUCENE-2159) Tool to expand the index for perf/stress testing.

2010-04-14 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856845#action_12856845 ] Shai Erera commented on LUCENE-2159: This looks like a nice tool. But all it does is c

[jira] Updated: (LUCENE-2387) IndexWriter retains references to Readers used in Fields (memory leak)

2010-04-14 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-2387: --- Attachment: LUCENE-2387-29x.patch 29x version of this patch. > IndexWriter retains

Re: Google-developed posting list encoding

2010-04-14 Thread Michael McCandless
Flex has already landed (in trunk, for 3.1), so this is "just" a matter of someone creating a codec using Group VarInt. Mike On Wed, Apr 14, 2010 at 4:58 AM, John Wang wrote: > This would be something that's excellent for contribution after the > Flex-Indexing support is added. > -John > > On We
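For readers unfamiliar with the encoding named above, here is a minimal sketch of Group VarInt as it is commonly described: four unsigned 32-bit integers share one tag byte, which holds four 2-bit fields giving each value's byte length minus one, followed by the values themselves. The bit order inside the tag and the little-endian payload are assumptions made for this illustration. This is not a Lucene codec, and the function names are hypothetical.

```python
def encode_group(values):
    """Encode exactly four unsigned 32-bit ints as one tag byte plus payload."""
    if len(values) != 4:
        raise ValueError("expected exactly four values")
    tag = 0
    payload = bytearray()
    for i, value in enumerate(values):
        if not 0 <= value < 2 ** 32:
            raise ValueError("value out of unsigned 32-bit range")
        # Each value takes 1 to 4 bytes; zero still takes one byte.
        nbytes = max(1, (value.bit_length() + 7) // 8)
        # Store (length - 1) in the i-th 2-bit field of the tag (assumed order).
        tag |= (nbytes - 1) << (2 * i)
        payload += value.to_bytes(nbytes, "little")
    return bytes([tag]) + bytes(payload)


def decode_group(data):
    """Decode one group produced by encode_group back into four ints."""
    tag = data[0]
    pos = 1
    values = []
    for i in range(4):
        nbytes = ((tag >> (2 * i)) & 0b11) + 1
        values.append(int.from_bytes(data[pos:pos + nbytes], "little"))
        pos += nbytes
    return values


if __name__ == "__main__":
    group = [1, 300, 70000, 2 ** 31]
    encoded = encode_group(group)
    print(len(encoded), decode_group(encoded))
```

The usual argument for the speed advantage is that a decoder reads every length from the single tag byte up front, instead of testing a continuation bit on each byte as a vInt decoder does.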

Re: Google-developed posting list encoding

2010-04-14 Thread John Wang
This would be something that's excellent for contribution after the Flex-Indexing support is added. -John On Wed, Apr 14, 2010 at 12:22 AM, Mike Klaas wrote: > Can be quite a bit faster than vInt in some cases: > http://www.ir.uwaterloo.ca/book/addenda-06-index-compression.html > > -Mike > > --

[jira] Resolved: (LUCENE-2316) Define clear semantics for Directory.fileLength

2010-04-14 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera resolved LUCENE-2316. Lucene Fields: [New, Patch Available] (was: [New]) Assignee: Shai Erera Resolution

[jira] Updated: (LUCENE-1921) Absurdly large radius (miles) search fails to include entire earth

2010-04-14 Thread Nicolas Helleringer (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Helleringer updated LUCENE-1921: Attachment: TEST-1921.patch After LUCENE-2184 has been re-resolved this test (TEST

[jira] Updated: (LUCENE-2184) CartesianPolyFilterBuilder doesn't properly account for which tiers actually exist in the index

2010-04-14 Thread Nicolas Helleringer (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Helleringer updated LUCENE-2184: Attachment: LUCENE-2184.2.patch My work on LUCENE-2359 broke Grant's work here :s

Google-developed posting list encoding

2010-04-14 Thread Mike Klaas
Can be quite a bit faster than vInt in some cases: http://www.ir.uwaterloo.ca/book/addenda-06-index-compression.html -Mike