[jira] Updated: (LUCENE-2303) Remove code duplication from Token class, just extend TermAttributeImpl

2010-03-08 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-2303: -- Attachment: LUCENE-2303.patch Small improvements to the patch, removed an inconsistency in typ

[jira] Resolved: (LUCENE-2303) Remove code duplication from Token class, just extend TermAttributeImpl

2010-03-08 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler resolved LUCENE-2303. --- Resolution: Fixed Committed trunk revision: 920237 > Remove code duplication from Token cla

Re: svn commit: r920240 - in /lucene/java/branches/flex_1458: ./ contrib/ contrib/analyzers/common/src/java/org/tartarus/snowball/ contrib/highlighter/src/test/ contrib/instantiated/src/test/org/apa

2010-03-08 Thread Michael McCandless
On Mon, Mar 8, 2010 at 4:17 AM, wrote: > Author: uschindler > Date: Mon Mar  8 09:17:03 2010 > New Revision: 920240 > > URL: http://svn.apache.org/viewvc?rev=920240&view=rev > Log: > Merge flex up to trunk rev 920237. > > This revision was left out, because it conflicted "heavy": 919060 > Message

[jira] Resolved: (LUCENE-2300) IndexWriter should never pool readers for external segments

2010-03-08 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-2300. Resolution: Fixed > IndexWriter should never pool readers for external segments >

Re: IndexWriter.applyDeletes performance

2010-03-08 Thread Bogdan Ghidireac
Mike, > > But... how long does step 2 take?  Is it an option to not commit on > every update?  How many docs do you typically update? I do not commit on every update, I call commit once every 10k documents. Indexing 10k docs takes around 10 secs. > > If you are committing only so that an outsid

[jira] Updated: (LUCENE-2111) Wrapup flexible indexing

2010-03-08 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-2111: --- Attachment: LUCENE-2111.patch New rev, just a few changes: * Rename BytesRef.toBy

[jira] Commented: (LUCENE-2280) IndexWriter.optimize() throws NullPointerException

2010-03-08 Thread Ritesh Nigam (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12842657#action_12842657 ] Ritesh Nigam commented on LUCENE-2280: -- I checked the documentation of IndexWriter in

[jira] Updated: (LUCENE-2089) explore using automaton for fuzzyquery

2010-03-08 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-2089: Attachment: LUCENE-2089.patch Updated patch, regenerated with Mike's update to the Python script.

Re: [jira] Commented: (LUCENE-2280) IndexWriter.optimize() throws NullPointerException

2010-03-08 Thread Erick Erickson
Quick side note: The recommended upgrade path is to upgrade to 2.9.latest, fix all of the deprecation warnings, *then* upgrade to 3.0. The 2.9.X -> 3.0 upgrade just removed all the deprecated stuff. FWIW Erick On Mon, Mar 8, 2010 at 8:51 AM, Ritesh Nigam (JIRA) wrote: > >[ > https://iss

[jira] Updated: (LUCENE-2302) Replacement for TermAttribute+Impl with extended capabilities (byte[] support, CharSequence, Appendable)

2010-03-08 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-2302: -- Attachment: LUCENE-2302.patch Here is a first patch. To discuss: - Names of classes/interfaces -

[jira] Commented: (LUCENE-2280) IndexWriter.optimize() throws NullPointerException

2010-03-08 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12842710#action_12842710 ] Michael McCandless commented on LUCENE-2280: bq. I checked the documentation o

[jira] Updated: (LUCENE-2302) Replacement for TermAttribute+Impl with extended capabilities (byte[] support, CharSequence, Appendable)

2010-03-08 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-2302: -- Attachment: LUCENE-2302.patch New patch, updated for trunk: - CharTermAttributeImpl.subSequenc

[jira] Updated: (LUCENE-2302) Replacement for TermAttribute+Impl with extended capabilities (byte[] support, CharSequence, Appendable)

2010-03-08 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-2302: -- Attachment: LUCENE-2302.patch Again an update, removed the whole lazy init stuff; so initTermB

[jira] Created: (LUCENE-2304) FuzzyLikeThisQuery should set MaxNonCompetitiveBoost for faster speed

2010-03-08 Thread Robert Muir (JIRA)
FuzzyLikeThisQuery should set MaxNonCompetitiveBoost for faster speed - Key: LUCENE-2304 URL: https://issues.apache.org/jira/browse/LUCENE-2304 Project: Lucene - Java Issue

[jira] Updated: (LUCENE-2302) Replacement for TermAttribute+Impl with extended capabilities (byte[] support, CharSequence, Appendable)

2010-03-08 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-2302: -- Attachment: LUCENE-2302.patch Removed usage of deprecated API in Token, also javadocs updated.

[jira] Commented: (LUCENE-2302) Replacement for TermAttribute+Impl with extended capabilities (byte[] support, CharSequence, Appendable)

2010-03-08 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12842753#action_12842753 ] Michael McCandless commented on LUCENE-2302: Patch looks great Uwe! Great sim

Re: Baby steps towards making Lucene's scoring more flexible...

2010-03-08 Thread Michael McCandless
On Sun, Mar 7, 2010 at 1:21 PM, Marvin Humphrey wrote: > On Sat, Mar 06, 2010 at 05:07:18AM -0500, Michael McCandless wrote: >> It won't encounter an unknown posting format. It's the codec. It >> knows all posting formats by the time it sees it. > > OK, so you're not going to handle this the way

[jira] Updated: (LUCENE-124) Fuzzy Searches do not get a boost of 0.2 as stated in "Query Syntax" doc

2010-03-08 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-124: --- Attachment: LUCENE-124.patch Attached is an updated patch: * Synced to trunk as these PQ rewrite meth

RE: Baby steps towards making Lucene's scoring more flexible...

2010-03-08 Thread Steven A Rowe
On 03/08/2010 at 1:13 PM, Michael McCandless wrote: > On Sun, Mar 7, 2010 at 1:21 PM, Marvin Humphrey > wrote: > > On Sat, Mar 06, 2010 at 05:07:18AM -0500, Michael McCandless wrote: > > > > What's the flex API for specifying a custom posting format? > > > > > > You implement a Codecs class, whic

RE: Baby steps towards making Lucene's scoring more flexible...

2010-03-08 Thread Steven A Rowe
On 03/08/2010 at 1:57 PM, Steven A Rowe wrote: > On 03/08/2010 at 1:13 PM, Michael McCandless wrote: > > On Sun, Mar 7, 2010 at 1:21 PM, Marvin Humphrey > > wrote: > > > On Sat, Mar 06, 2010 at 05:07:18AM -0500, Michael McCandless wrote: > > > > > What's the flex API for specifying a custom postin

[jira] Commented: (LUCENE-124) Fuzzy Searches do not get a boost of 0.2 as stated in "Query Syntax" doc

2010-03-08 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12842772#action_12842772 ] Michael McCandless commented on LUCENE-124: --- Patch looks good Robert! > Fuzzy Se

Re: Baby steps towards making Lucene's scoring more flexible...

2010-03-08 Thread Michael McCandless
On Mon, Mar 8, 2010 at 2:07 PM, Steven A Rowe wrote: > On 03/08/2010 at 1:57 PM, Steven A Rowe wrote: >> On 03/08/2010 at 1:13 PM, Michael McCandless wrote: >> > On Sun, Mar 7, 2010 at 1:21 PM, Marvin Humphrey >> > wrote: >> > > On Sat, Mar 06, 2010 at 05:07:18AM -0500, Michael McCandless wrote:

[jira] Commented: (LUCENE-124) Fuzzy Searches do not get a boost of 0.2 as stated in "Query Syntax" doc

2010-03-08 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12842776#action_12842776 ] Robert Muir commented on LUCENE-124: Thanks Mike. I will commit later today if no one o

RE: Baby steps towards making Lucene's scoring more flexible...

2010-03-08 Thread Steven A Rowe
On 03/08/2010 at 2:10 PM, Michael McCandless wrote: > On Mon, Mar 8, 2010 at 2:07 PM, Steven A Rowe wrote: > > On 03/08/2010 at 1:57 PM, Steven A Rowe wrote: > > > On 03/08/2010 at 1:13 PM, Michael McCandless wrote: > > > > On Sun, Mar 7, 2010 at 1:21 PM, Marvin Humphrey > > > > wrote: > > > > >

Re: Multi-node stats within individual nodes (was "Baby steps...")

2010-03-08 Thread Michael McCandless
On Sun, Mar 7, 2010 at 11:43 AM, Marvin Humphrey wrote: > On Sat, Mar 06, 2010 at 05:07:18AM -0500, Michael McCandless wrote: >> > Fortunately, beaming field length data around is an easier problem than >> > distributed IDF, because with rare exceptions, the number of fields in a >> > typical inde

[jira] Updated: (LUCENE-2294) Create IndexWriterConfiguration and store all of IW configuration there

2010-03-08 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-2294: --- Attachment: LUCENE-2294.patch Patch with all tests and code converted to not use the deprecated API,

[jira] Updated: (LUCENE-124) Fuzzy Searches do not get a boost of 0.2 as stated in "Query Syntax" doc

2010-03-08 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-124: --- Lucene Fields: [Patch Available] Fix Version/s: 3.1 > Fuzzy Searches do not get a boost of 0.2 as

[jira] Resolved: (LUCENE-124) Fuzzy Searches do not get a boost of 0.2 as stated in "Query Syntax" doc

2010-03-08 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-124. Resolution: Fixed Committed revision 920499. > Fuzzy Searches do not get a boost of 0.2 as stated

[jira] Commented: (LUCENE-2294) Create IndexWriterConfiguration and store all of IW configuration there

2010-03-08 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12842824#action_12842824 ] Michael McCandless commented on LUCENE-2294: Patch looks good -- thanks Shai!

[jira] Created: (LUCENE-2305) Introduce Version in more places long before 4.0

2010-03-08 Thread Shai Erera (JIRA)
Introduce Version in more places long before 4.0 Key: LUCENE-2305 URL: https://issues.apache.org/jira/browse/LUCENE-2305 Project: Lucene - Java Issue Type: Improvement Reporter: Sh

Re: Baby steps towards making Lucene's scoring more flexible...

2010-03-08 Thread Marvin Humphrey
On Mon, Mar 08, 2010 at 01:13:53PM -0500, Michael McCandless wrote: > I think we can actually do so w/o losing Lucene's loose typing if we > simply peeled out [say] a FieldType class that holds the settings you > now set on each field (omitTFAP, omitNorms, TermVector, Store, > Index), and Field ins

Re: Multi-node stats within individual nodes (was "Baby steps...")

2010-03-08 Thread Marvin Humphrey
On Mon, Mar 08, 2010 at 02:23:47PM -0500, Michael McCandless wrote: > For a large index the stats will be stable after re-indexing only a > few more docs. Well, not if there's been huge churn on other nodes in the interim. > No... the stat is avg tf within the doc. Don't you need the *total* f