[
https://issues.apache.org/jira/browse/LUCENE-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710646#action_12710646
]
Shai Erera commented on LUCENE-1614:
bq. A bigger question though, is if we should sup
>When you create IndexReader, IndexWriter and others, you must pass in a
>Settings
> instance.
I think this would also help solve the steady growth of constructor variations
(18 in 2.4's IndexWriter vs 3 in Lucene 1.9).
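To make the constructor-growth point concrete, here is a minimal sketch of the idea under discussion. The names Settings and SketchWriter are hypothetical and are not Lucene classes; the option names and defaults are assumptions chosen only for illustration.

```java
// Illustrative sketch only: Settings and SketchWriter are hypothetical names,
// not classes from Lucene.
final class SettingsSketch {

    /** Gathers tunable options in one object, each with a default. */
    static final class Settings {
        private boolean useCompoundFile = true; // assumed default
        private int maxFieldLength = 10000;     // assumed default

        // Fluent setters: each returns this so calls can be chained.
        Settings useCompoundFile(boolean value) { this.useCompoundFile = value; return this; }
        Settings maxFieldLength(int value) { this.maxFieldLength = value; return this; }

        boolean useCompoundFile() { return useCompoundFile; }
        int maxFieldLength() { return maxFieldLength; }
    }

    /** Stand-in for a writer: a single constructor replaces many overloads. */
    static final class SketchWriter {
        private final String path;
        private final Settings settings;

        SketchWriter(String path, Settings settings) {
            this.path = path;
            this.settings = settings;
        }

        String describe() {
            return path + "[compound=" + settings.useCompoundFile()
                    + ",maxFieldLength=" + settings.maxFieldLength() + "]";
        }
    }

    public static void main(String[] args) {
        Settings settings = new Settings().useCompoundFile(false).maxFieldLength(500);
        System.out.println(new SketchWriter("index-dir", settings).describe());
    }
}
```

With this shape, adding an option means adding one field and one setter rather than another constructor overload.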
- Original Message
From: Otis Gospodnetic
To: java-dev@luce
On Mon, May 18, 2009 at 7:48 PM, Michael McCandless <
luc...@mikemccandless.com> wrote:
> I'm happy to announce that the Lucene PMC has accepted Uwe Schindler
> as a Lucene core committer (Uwe was previously a contrib committer).
>
> Welcome aboard Uwe,
>
+1
Congratulations Uwe!
--
Regards,
Sh
[
https://issues.apache.org/jira/browse/LUCENE-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710680#action_12710680
]
Michael McCandless commented on LUCENE-1642:
Good catch! That's in the resolv
[
https://issues.apache.org/jira/browse/LUCENE-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless updated LUCENE-1642:
---
Fix Version/s: 2.9
> IndexWriter.addIndexesNoOptimize ignores the compound file sett
[
https://issues.apache.org/jira/browse/LUCENE-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless reassigned LUCENE-1642:
--
Assignee: Michael McCandless
> IndexWriter.addIndexesNoOptimize ignores the co
[
https://issues.apache.org/jira/browse/LUCENE-1643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless reassigned LUCENE-1643:
--
Assignee: Michael McCandless
> use reusable collation keys in ICUCollationFilt
[
https://issues.apache.org/jira/browse/LUCENE-1643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710681#action_12710681
]
Michael McCandless commented on LUCENE-1643:
Looks good, I'll commit shortly.
[
https://issues.apache.org/jira/browse/LUCENE-1643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless resolved LUCENE-1643.
Resolution: Fixed
Fix Version/s: 2.9
> use reusable collation keys in ICUCo
On Mon, May 18, 2009 at 8:51 PM, Yonik Seeley
wrote:
> On Mon, May 18, 2009 at 5:06 PM, Michael McCandless
> wrote:
>> * StopFilter should enable position increments by default
>
> Is this one an actual improvement in the general case?
> A query of "foo bar" then wouldn't match a document with "
On May 18, 2009, at 11:31 PM, Robert Muir wrote:
I am curious about this, do you think it's a better default because
it avoids the max boolean clauses problem? or because for a lot of
these scoring doesn't make much sense anyway?
I ran tests on a pretty big index, you pay a price for the co
On Mon, May 18, 2009 at 11:31 PM, Robert Muir wrote:
> I am curious about this, do you think it's a better default because it avoids
> the max boolean clauses problem? or because for a lot of these scoring
> doesn't make much sense anyway?
I think you're referring to constant score mode default, f
On May 19, 2009, at 6:39 AM, Michael McCandless wrote:
On Mon, May 18, 2009 at 8:51 PM, Yonik Seeley
wrote:
On Mon, May 18, 2009 at 5:06 PM, Michael McCandless
wrote:
* StopFilter should enable position increments by default
Is this one an actual improvement in the general case?
A query o
I like the idea, some thoughts below.
On May 18, 2009, at 5:06 PM, Michael McCandless wrote:
As we all know, Lucene's back-compat policy necessarily hurts the
out-of-the-box experience for new users: because we are only allowed to
make substantial improvements to Lucene's default settings at a maj
On Tue, May 19, 2009 at 6:47 AM, DM Smith wrote:
> It is common in my application, a Bible program, that indexes each verse
> (think of a verse as a numbered sentence) as a separate document. We index
> everything, including words that are typically stop words as those might be
> important to our
On Tue, May 19, 2009 at 6:56 AM, DM Smith wrote:
> I really like the idea of a settings class. Another benefit, *especially if
> it is documented well*, users would be led to tuning parameters.
>
> In this settings class, would there be setters/getters so that one could
> take particular default
[
https://issues.apache.org/jira/browse/LUCENE-1629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710712#action_12710712
]
Xiaoping Gao commented on LUCENE-1629:
--
The dictionary is loaded into 2 classes:
Big
On Tue, May 19, 2009 at 4:34 AM, mark harwood wrote:
>
>>When you create IndexReader, IndexWriter and others, you must pass in a
>>Settings
>> instance.
>
> I think this would also help solve the steady growth of constructor
> variations (18 in 2.4's IndexWriter vs 3 in Lucene 1.9).
Right. So
On Tue, May 19, 2009 at 7:26 AM, Grant Ingersoll wrote:
> I don't think we have said that bug fixes are required to be back
> compatible, even if it is in analysis. I think it is a really bad idea for
> TokenStreams to have if clauses in them checking boolean values for old
> versus new behavior
Michael McCandless wrote:
Or, the removal of StopFilter as "Standard" altogether. This coupled with
a QP that created phrases around stop words is a better solution.
Interesting... that'd be a pretty big change to StandardAnalyzer,
though.
I can see we are spinning off lots of neat ide
On May 19, 2009, at 7:45 AM, Michael McCandless wrote:
On Tue, May 19, 2009 at 6:47 AM, DM Smith
wrote:
It is common in my application, a Bible program, that indexes each
verse
(think of a verse as a numbered sentence) as a separate document.
We index
everything, including words that ar
Michael McCandless wrote:
On Mon, May 18, 2009 at 11:31 PM, Robert Muir wrote:
I am curious about this, do you think it's a better default because it avoids
the max boolean clauses problem? or because for a lot of these scoring
doesn't make much sense anyway?
I think you're referring t
On Tue, May 19, 2009 at 8:28 AM, Mark Miller wrote:
>> Thinking more on this... I'd love to have a constant-score mode, but
>> implemented as a BooleanQuery, meaning the scores would be the same
>> (constant) regardless of whether under-the-hood the query was
>> rewritten to BooleanQuery vs pre-c
in my tests the problem seemed to boil down to iteration of a sparse
openbitset... so maybe the filter approach is still an option but when #
docs is small some other doc id set impl is used?
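As a rough illustration of the suggestion above, the sketch below converts a sparse bit set into a sorted int array when few documents match. It uses java.util.BitSet rather than Lucene's OpenBitSet, and the density threshold in preferArray is an arbitrary assumption, not a measured value.

```java
import java.util.BitSet;

// Illustrative sketch only: uses java.util.BitSet, not Lucene's OpenBitSet,
// and the class and method names are hypothetical.
final class DocIdSetSketch {

    /** Copies the set bits into a sorted array of doc ids. */
    static int[] toSortedArray(BitSet bits) {
        int[] ids = new int[bits.cardinality()];
        int n = 0;
        // nextSetBit returns -1 once no further bit is set.
        for (int doc = bits.nextSetBit(0); doc >= 0; doc = bits.nextSetBit(doc + 1)) {
            ids[n++] = doc;
        }
        return ids;
    }

    /** Assumed heuristic: prefer the array when fewer than 1 in 64 docs match. */
    static boolean preferArray(int matchCount, int maxDoc) {
        return (long) matchCount * 64 < maxDoc;
    }

    public static void main(String[] args) {
        BitSet bits = new BitSet();
        bits.set(3);
        bits.set(70);
        bits.set(1000);
        int maxDoc = 1000000;
        if (preferArray(bits.cardinality(), maxDoc)) {
            System.out.println(toSortedArray(bits).length + " ids stored in an array");
        }
    }
}
```

Iterating the array touches only the matching ids, whereas scanning a sparse bit set also walks the empty words between them.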
On Tue, May 19, 2009 at 8:28 AM, Mark Miller wrote:
> Michael McCandless wrote:
>
>> On Mon, May 18, 200
On May 19, 2009, at 8:19 AM, Michael McCandless wrote:
On Tue, May 19, 2009 at 7:26 AM, Grant Ingersoll
wrote:
I don't think we have said that bug fixes are required to be back
compatible, even if it is in analysis. I think it is a really bad
idea for
TokenStreams to have if clauses in
On Tue, May 19, 2009 at 8:50 AM, Robert Muir wrote:
> in my tests the problem seemed to boil down to iteration of a sparse
> openbitset... so maybe the filter approach is still an option but when #
> docs is small some other doc id set impl is used?
Directly using the BooleanQuery skips any inter
On Tue, May 19, 2009 at 8:50 AM, Robert Muir wrote:
> in my tests the problem seemed to boil down to iteration of a sparse
> openbitset... so maybe the filter approach is still an option but when #
> docs is small some other doc id set impl is used?
Interesting... was your test a case where wicke
Enable MultiTermQuery's constant score mode to also use BooleanQuery under the
hood
---
Key: LUCENE-1644
URL: https://issues.apache.org/jira/browse/LUCENE-1644
Project: L
On Tue, May 19, 2009 at 16:56, Grant Ingersoll wrote:
> There's a difference between std. coding practices and purposefully putting
> in lots of if checks to solve back compatibility issues that are created in
> order to satisfy some naming convention. Given the length of time between
> releases,
none of my queries are "wicked fast" on 100M doc index!
for narrow queries, we are talking about ~100ms queries becoming ~400ms or
so with the constant score rewrite...
for wide queries, we are talking about maybe 3 or 4s queries becoming 2s or
so with the constant score rewrite..., it depends on
Selecting backward compatibility vs latest and greatest could be done
w/o Settings (a simple static int containing the version number to act
like). It seems like the Settings debate should be based on its own
merits.
-Yonik
-
T
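The "simple static int" alternative mentioned above might look roughly like the following. This is only a sketch: the class, the constants, and the choice of 2.9 as the switch-over version are assumptions, and the position-increment default is borrowed from the StopFilter item raised earlier in the thread.

```java
// Illustrative sketch only: hypothetical names, not a Lucene API.
final class VersionSwitchSketch {
    static final int V_2_4 = 24;
    static final int V_2_9 = 29;

    /** JVM-wide "act like" version; starts at the older value for back-compat. */
    static volatile int actLikeVersion = V_2_4;

    /** Assumed rule: position increments become the default from 2.9 onward. */
    static boolean defaultEnablePositionIncrements() {
        return actLikeVersion >= V_2_9;
    }

    public static void main(String[] args) {
        System.out.println(defaultEnablePositionIncrements()); // old behavior
        actLikeVersion = V_2_9;
        System.out.println(defaultEnablePositionIncrements()); // new behavior
    }
}
```

Note that a static applies to everything loaded in the same JVM, while a per-instance settings object would scope the choice to each component.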
[
https://issues.apache.org/jira/browse/LUCENE-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Shai Erera updated LUCENE-1614:
---
Attachment: LUCENE-1614.patch
Patch introduces the two added methods, as well as changes to our test
On Tue, May 19, 2009 at 8:56 AM, Grant Ingersoll wrote:
>> Why not? The settings object could have say a property
>> "analysis.standard.enableStopFilter"?
>
> And what if it is something that has to be called in the next() chain and
> not during construction? Are you going to want to call that
On Tue, May 19, 2009 at 9:15 AM, Robert Muir wrote:
> none of my queries are "wicked fast" on 100M doc index!
OK.
> for narrow queries, we are talking about ~100ms queries becoming ~400ms or
> so with the constant score rewrite...
> for wide queries, we are talking about maybe 3 or 4s queries be
On Tue, May 19, 2009 at 9:34 AM, Yonik Seeley
wrote:
> Selecting backward compatibility vs latest and greatest could be done
> w/o Settings (a simple static int containing the version number to act
> like). It seems like the Settings debate should be based on its own
> merits.
But isn't a stat
Deleted documents are visible across reopened MSRs
--
Key: LUCENE-1645
URL: https://issues.apache.org/jira/browse/LUCENE-1645
Project: Lucene - Java
Issue Type: Bug
Affects Versions: 2.9
On Tue, May 19, 2009 at 2:04 PM, Michael McCandless
wrote:
> On Tue, May 19, 2009 at 9:34 AM, Yonik Seeley
> wrote:
>
>> Selecting backward compatibility vs latest and greatest could be done
>> w/o Settings (a simple static int containing the version number to act
>> like). It seems like the Set
Okay, I've got some more time on my hands.
While fixing the tests, I found a reopen() bug on trunk, which was
previously hidden from tests by SR-as-toplevel-reader optimization.
(LUCENE-1645)
Can post the first experimental patch tomorrow, with this test failing.
On Mon, May 18, 2009 at 16:06, Mic
[
https://issues.apache.org/jira/browse/LUCENE-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710826#action_12710826
]
Shai Erera commented on LUCENE-1614:
BTW, as I prepared that patch, I noticed the same
Is this the time and place to re-raise a previous discussion about moving
SweetSpotSimilarity to core and move to use it?
On Tue, May 19, 2009 at 8:54 PM, Michael McCandless <
luc...@mikemccandless.com> wrote:
> On Tue, May 19, 2009 at 9:15 AM, Robert Muir wrote:
> > none of my queries are "wick
[
https://issues.apache.org/jira/browse/LUCENE-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710827#action_12710827
]
Michael McCandless commented on LUCENE-1614:
I wonder if instead of returning
[
https://issues.apache.org/jira/browse/LUCENE-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Earwin Burrfoot updated LUCENE-1645:
Attachment: LUCENE-1645.patch
If you reopen() MSR with unchanged segments, the resulting M
[
https://issues.apache.org/jira/browse/LUCENE-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710838#action_12710838
]
Yonik Seeley commented on LUCENE-1614:
--
> > A bigger question though, is if we should
[
https://issues.apache.org/jira/browse/LUCENE-1594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless updated LUCENE-1594:
---
Attachment: LUCENE-1594.patch
Another iteration. Many changes, eg:
* All Boolean
[
https://issues.apache.org/jira/browse/LUCENE-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710841#action_12710841
]
Shai Erera commented on LUCENE-1614:
bq. I wonder if instead of returning -1 when the
[
https://issues.apache.org/jira/browse/LUCENE-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710844#action_12710844
]
Yonik Seeley commented on LUCENE-1614:
--
bq. This would save CPU for scorers that merg
[
https://issues.apache.org/jira/browse/LUCENE-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710846#action_12710846
]
Michael McCandless commented on LUCENE-1614:
bq. BTW, none of the existing ite
[
https://issues.apache.org/jira/browse/LUCENE-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710848#action_12710848
]
Yonik Seeley commented on LUCENE-1614:
--
bq. Not sure - didn't we agree for >= current
[
https://issues.apache.org/jira/browse/LUCENE-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710851#action_12710851
]
Michael McCandless commented on LUCENE-1614:
{quote}
> This would save CPU for
On Mon, May 18, 2009 at 8:06 AM, Michael McCandless
wrote:
> Yonik is there anything in Solr that might not like this change?
Yep, there is :-) Should be very easy to work around though.
-Yonik
-
To unsubscribe, e-mail: java-d
On Tue, May 19, 2009 at 2:29 PM, Shai Erera wrote:
> Is this the time and place to re-raise a previous discussion about moving
> SweetSpotSimilarity to core and move to use it?
SweetSpotSimilarity wouldn't make a good default. It's a flat topped
hill that falls suddenly off on either side. Shor
[
https://issues.apache.org/jira/browse/LUCENE-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710855#action_12710855
]
Michael McCandless commented on LUCENE-1614:
{quote}
> I wonder if instead of
[
https://issues.apache.org/jira/browse/LUCENE-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710860#action_12710860
]
Shai Erera commented on LUCENE-1614:
Ok I'll look into it tomorrow morning when I'll b
[
https://issues.apache.org/jira/browse/LUCENE-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710863#action_12710863
]
Michael McCandless commented on LUCENE-1614:
bq. I still think it's more logic
On Tue, May 19, 2009 at 2:27 PM, Yonik Seeley
wrote:
> On Tue, May 19, 2009 at 2:04 PM, Michael McCandless
> wrote:
>> On Tue, May 19, 2009 at 9:34 AM, Yonik Seeley
>> wrote:
>>
>>> Selecting backward compatibility vs latest and greatest could be done
>>> w/o Settings (a simple static int contai
[
https://issues.apache.org/jira/browse/LUCENE-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710868#action_12710868
]
Yonik Seeley commented on LUCENE-1614:
--
Scorers previously only had to worry about sk
[
https://issues.apache.org/jira/browse/LUCENE-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710869#action_12710869
]
Paul Elschot commented on LUCENE-1614:
--
About using Integer.MAX_VALUE as sentinel, di
[
https://issues.apache.org/jira/browse/LUCENE-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710872#action_12710872
]
Michael McCandless commented on LUCENE-1645:
Good catch!
So we are missing a
[
https://issues.apache.org/jira/browse/LUCENE-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless updated LUCENE-1645:
---
Fix Version/s: 2.9
> Deleted documents are visible across reopened MSRs
> --
On Tue, May 19, 2009 at 4:33 PM, Michael McCandless
wrote:
> On Tue, May 19, 2009 at 2:27 PM, Yonik Seeley
> wrote:
>> On Tue, May 19, 2009 at 2:04 PM, Michael McCandless
>> wrote:
>>> On Tue, May 19, 2009 at 9:34 AM, Yonik Seeley
>>> wrote:
>>>
Selecting backward compatibility vs latest a
[
https://issues.apache.org/jira/browse/LUCENE-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710880#action_12710880
]
Michael McCandless commented on LUCENE-1614:
bq. About using Integer.MAX_VALUE
[
https://issues.apache.org/jira/browse/LUCENE-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710887#action_12710887
]
Marvin Humphrey commented on LUCENE-1614:
-
> Marvin, what's your plan for Lucy's s
[
https://issues.apache.org/jira/browse/LUCENE-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710894#action_12710894
]
Earwin Burrfoot commented on LUCENE-1645:
-
Either that. Or having boolean readerSh
[
https://issues.apache.org/jira/browse/LUCENE-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jason Rutherglen updated LUCENE-1313:
-
Attachment: LUCENE-1313.patch
* All tests pass, added more tests
* Added DocumentsWrit
QueryParser throws new exceptions even if custom parsing logic threw a better
one
-
Key: LUCENE-1646
URL: https://issues.apache.org/jira/browse/LUCENE-1646
Project: Lucen
[
https://issues.apache.org/jira/browse/LUCENE-1474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710988#action_12710988
]
Erik van Zijst commented on LUCENE-1474:
For some time now we've been getting simi
[
https://issues.apache.org/jira/browse/LUCENE-1474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710988#action_12710988
]
Erik van Zijst edited comment on LUCENE-1474 at 5/19/09 8:40 PM:
---
[
https://issues.apache.org/jira/browse/LUCENE-1474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710988#action_12710988
]
Erik van Zijst edited comment on LUCENE-1474 at 5/19/09 8:39 PM:
---
[
https://issues.apache.org/jira/browse/LUCENE-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jason Rutherglen updated LUCENE-1313:
-
Description:
Enable near realtime search in Lucene without external
dependencies. When R
[
https://issues.apache.org/jira/browse/LUCENE-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711006#action_12711006
]
Earwin Burrfoot commented on LUCENE-1645:
-
Lazy clone() is a bad idea, since it ha
[
https://issues.apache.org/jira/browse/LUCENE-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711027#action_12711027
]
Shai Erera commented on LUCENE-1614:
So Mike - I've checked BS and BS2, and I don't se