Re: Proposal about Version API relaxation

2010-04-15 Thread DM Smith
not let me use Version (or some other mechanism) to maintain compatibility with an older index, the user will have to re-index. Or I can forgo any future upgrades with Lucene. Neither is very palatable. -- DM Smith

[jira] Commented: (LUCENE-2396) remove version from contrib/analyzers.

2010-04-15 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12857427#action_12857427 ] DM Smith commented on LUCENE-2396: -- Robert, I think this is a red-herring. There has been

Re: Proposal about Version API relaxation

2010-04-15 Thread DM Smith
On 04/15/2010 01:50 PM, Earwin Burrfoot wrote: First, the index format. IMHO, it is a good thing for a major release to be able to read the prior major release's index. And the ability to convert it to the current format via optimize is also good. Whatever is decided on this thread should take

[jira] Commented: (LUCENE-2396) remove version from contrib/analyzers.

2010-04-15 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12857456#action_12857456 ] DM Smith commented on LUCENE-2396: -- {quote} So I think we should instead use real

[jira] Issue Comment Edited: (LUCENE-2396) remove version from contrib/analyzers.

2010-04-15 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12857456#action_12857456 ] DM Smith edited comment on LUCENE-2396 at 4/15/10 2:16 PM

[jira] Commented: (LUCENE-2396) remove version from contrib/analyzers.

2010-04-15 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12857487#action_12857487 ] DM Smith commented on LUCENE-2396: -- bq. Well, I think asking for a well-defined backwards

Re: Proposal about Version API relaxation

2010-04-15 Thread DM Smith
On 04/15/2010 03:04 PM, Earwin Burrfoot wrote: BTW Earwin, we can come up w/ a migrate() method on IW to accomplish manual migration on the segments that are still on old versions. That's not the point about whether optimize() is good or not. It is the difference between telling the customer to

[jira] Commented: (LUCENE-2396) remove version from contrib/analyzers.

2010-04-15 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12857498#action_12857498 ] DM Smith commented on LUCENE-2396: -- {quote} bq. One mechanism that would work

Re: Proposal about Version API relaxation

2010-04-15 Thread DM Smith
On 04/15/2010 03:12 PM, Earwin Burrfoot wrote: On Thu, Apr 15, 2010 at 23:07, DM Smith dmsmith...@gmail.com wrote: On 04/15/2010 03:04 PM, Earwin Burrfoot wrote: BTW Earwin, we can come up w/ a migrate() method on IW to accomplish manual migration on the segments that are still on

Re: Proposal about Version API relaxation

2010-04-15 Thread DM Smith
On 04/15/2010 03:25 PM, Shai Erera wrote: We should create a migrate() API on IW which will touch just those segments and not incur a full optimize. That API can also be used for an offline migration tool, if we decide that's what we want. What about an index that has already called

Re: Proposal about Version API relaxation

2010-04-15 Thread DM Smith
On Apr 15, 2010, at 4:50 PM, Shai Erera wrote: Robert ... I'm sorry but changes to Analyzers don't *force* people to reindex. They can simply choose not to use the latest version. They can choose not to upgrade a Unicode version. They can copy the entire Analyzer code to match their

[jira] Commented: (LUCENE-2396) remove version from core and contrib analyzers.

2010-04-15 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12857543#action_12857543 ] DM Smith commented on LUCENE-2396: -- Hmmm. If we are moving stuff out of core

Re: Proposal about Version API relaxation

2010-04-15 Thread DM Smith
tokenizer? Alternatively, we can think of writing an ICU analyzer/tokenizer, but we're still using JFlex, so I don't know how much control we have on that ... Robert has already started one. (1488 I think). Shai On Fri, Apr 16, 2010 at 12:21 AM, DM Smith dmsmith...@gmail.com wrote: On Apr

Re: Proposal about Version API relaxation

2010-04-14 Thread DM Smith
On 04/14/2010 09:13 AM, Robert Muir wrote: It's not sidetracked at all. There seem to be more compelling alternatives to achieve the same thing, so we should consider alternative solutions, too. Maybe have the index store the version(s) and use that when constructing a reader or writer? Given

Re: Proposal about Version API relaxation

2010-04-13 Thread DM Smith
I like the concept of version, but I'm concerned about it too. The current Version mechanism allows one to use more than one Version in their code. Imagine that we are at 3.2 and one was unable to upgrade to the most recent version for a particular feature. Let's also suppose that at 3.2 a new feature

Re: [DISCUSS] Do away with Contrib Committers and make core committers

2010-03-15 Thread DM Smith
My 2 cents as one who has no aspirations of ever being a committer. I think with the pending re-org of contrib and the value of contrib, it doesn't make much sense to have the distinction between core and contrib let alone for contributors. Regarding the former low bar, either prune the list

Lucene Query Parser Syntax document

2010-02-28 Thread DM Smith
Earlier I had linked to http://lucene.apache.org/java/docs/queryparsersyntax.html in my product manual. That no longer works. Searching I found that the document is per release. Not sure when that changed, but having found it at http://lucene.apache.org/java/2_3_2/queryparsersyntax.html I

Re: SegmentInfos extends Vector

2010-02-28 Thread DM Smith
IIRC: The early implementation of Vector did not extend AbstractList and thus did not have remove. On Feb 28, 2010, at 8:04 AM, Shai Erera wrote: Why do you say remove was unsupported before? I don't see it in the class's impl. It just inherits from Vector and so remove is supported by

Re: Having a default constructor in Analyzers

2010-02-07 Thread DM Smith
On Feb 7, 2010, at 5:32 PM, Sanne Grinovero wrote: Does it make sense to use different values across the same application? Obviously in the unlikely case you want to treat different indexes in a different way, but does it make sense when working all on the same index? I think it entirely

[jira] Commented: (LUCENE-2055) Fix buggy stemmers and Remove duplicate analysis functionality

2010-01-18 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12801909#action_12801909 ] DM Smith commented on LUCENE-2055: -- I think it is right to fix bad behavior

[jira] Commented: (LUCENE-2226) move contrib/snowball to contrib/analyzers

2010-01-18 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12801912#action_12801912 ] DM Smith commented on LUCENE-2226: -- +1 However this is a very minor break in bw compat

[jira] Commented: (LUCENE-2226) move contrib/snowball to contrib/analyzers

2010-01-18 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12802004#action_12802004 ] DM Smith commented on LUCENE-2226: -- Robert, I'm suggesting that you move

[jira] Commented: (LUCENE-2226) move contrib/snowball to contrib/analyzers

2010-01-18 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12802040#action_12802040 ] DM Smith commented on LUCENE-2226: -- bq. But i think this concept doesn't even make sense

Re: Dynamic array reallocation algorithms

2010-01-13 Thread DM Smith
On Jan 13, 2010, at 1:00 AM, Marvin Humphrey wrote: On Tue, Jan 12, 2010 at 10:46:29PM -0500, DM Smith wrote: So starting at 0, the size is 0. 0 = 0 0 + 1 = 4 4 + 1 = 8 8 + 1 = 16 16 + 1 = 25 25 + 1 = 35 ... So I think the copied python comment is correct but not obviously correct
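The arithmetic in the snippet above can be checked with a short sketch. It assumes the growth rule is the CPython list over-allocation formula that the copied comment came from; the class and method names are hypothetical, and this is not confirmed to be the actual body of ArrayUtil.getNextSize().

```java
public final class GrowthSketch {

    // Hypothetical helper: CPython-style over-allocation for a requested
    // minimum size. Assumed formula, not taken from the Lucene source.
    static int nextSize(int minSize) {
        return (minSize >> 3) + (minSize < 9 ? 3 : 6) + minSize;
    }

    public static void main(String[] args) {
        // Grow one element past the current capacity each time,
        // as in the "0 + 1 = 4, 4 + 1 = 8, ..." walk-through.
        int size = 0;
        StringBuilder pattern = new StringBuilder("0");
        for (int i = 0; i < 9; i++) {
            size = nextSize(size + 1);
            pattern.append(", ").append(size);
        }
        // prints "0, 4, 8, 16, 25, 35, 46, 58, 72, 88"
        System.out.println(pattern);
    }
}
```

Under that assumption the printed sequence matches the pattern quoted in the comment, which suggests the sequence only appears when each step requests exactly one more element than the previous capacity.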

Re: Compound File Default

2010-01-12 Thread DM Smith
I'm not sure that it's safe to assume that production use of Lucene is not on a laptop or that it is always on big iron. It makes sense that Lucene is embedded in all sorts of desktop applications that might run on small machines. That certainly describes the application that I work on. I'm

Re: Dynamic array reallocation algorithms

2010-01-12 Thread DM Smith
On Jan 12, 2010, at 6:27 PM, Marvin Humphrey wrote: Greets, I've been trying to understand this comment regarding ArrayUtil.getNextSize(): * The growth pattern is: 0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ... Maybe I'm missing something, but I can't see how the formula yields such a

Re: LUCENE-1515

2010-01-02 Thread DM Smith
Just my 2 cents from a user perspective to the whole thread: I want the best and an easy way to identify the best. Preferably, it will be the default by current version. The best should also have the best name. Because of the backward compatibility policy, we're painted into a box, into name

Re: LUCENE-1515

2010-01-02 Thread DM Smith
On Jan 2, 2010, at 7:46 AM, Robert Muir wrote: I also want backward compatibility. Or at least control over it. That is, I need for indexes to work fully but want an easy path to upgrade/replace an index with better analyzer/filter combos. This stemmer is not backward compatible. But

[jira] Commented: (LUCENE-1343) A replacement for ISOLatin1AccentFilter that does a more thorough job of removing diacritical marks or non-spacing modifiers.

2009-12-07 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786941#action_12786941 ] DM Smith commented on LUCENE-1343: -- I also am dubious about a general purpose folding

[jira] Commented: (LUCENE-1343) A replacement for ISOLatin1AccentFilter that does a more thorough job of removing diacritical marks or non-spacing modifiers.

2009-12-07 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786968#action_12786968 ] DM Smith commented on LUCENE-1343: -- {quote} bq. Robert Muir, Would it make sense

Release artifacts

2009-12-05 Thread DM Smith
I'm wondering about the size of the builds, which are surprisingly big to me. The src is 12M/13M and the bin is 17M/26M (tar.gz/zip) for 2.9.1, similar for 3.0.0. In looking at the binary artifact I see the following: * Every contrib jar has a corresponding javadoc jar, but there is no

Re: Lots of results

2009-12-05 Thread DM Smith
On Dec 5, 2009, at 5:22 PM, Grant Ingersoll wrote: At ScaleCamp yesterday in the UK, I was listening to a talk on Xapian and the speaker said one of the optimizations they do when retrieving a large result set is that instead of managing a Priority Queue, they just allocate a large array

[jira] Commented: (LUCENE-2105) Lucene does not support Unicode Normalization Forms

2009-12-03 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12785302#action_12785302 ] DM Smith commented on LUCENE-2105: -- Is this a duplicate or solved by LUCENE-1488

[jira] Commented: (LUCENE-2034) Massive Code Duplication in Contrib Analyzers - unifly the analyzer ctors

2009-12-02 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12784812#action_12784812 ] DM Smith commented on LUCENE-2034: -- bq. But I do not see the benefit compared

[jira] Commented: (LUCENE-1488) multilingual analyzer based on icu

2009-12-02 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12785036#action_12785036 ] DM Smith commented on LUCENE-1488: -- Robert, just finished reviewing the code. Looks great

[jira] Commented: (LUCENE-2094) Prepare CharArraySet for Unicode 4.0

2009-12-01 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12784175#action_12784175 ] DM Smith commented on LUCENE-2094: -- bq. I would like to open another issue for roberts

[jira] Commented: (LUCENE-2034) Massive Code Duplication in Contrib Analyzers - unifly the analyzer ctors

2009-12-01 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12784338#action_12784338 ] DM Smith commented on LUCENE-2034: -- Robert: bq. DM, I think we can have both? A method

[jira] Commented: (LUCENE-1581) LowerCaseFilter should be able to be configured to use a specific locale.

2009-12-01 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12784358#action_12784358 ] DM Smith commented on LUCENE-1581: -- bq. ultimately I still think case folding

[jira] Commented: (LUCENE-2102) LowerCaseFilter for Turkish language

2009-12-01 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12784421#action_12784421 ] DM Smith commented on LUCENE-2102: -- bq. but non-NFC text doesn't work correctly

[jira] Commented: (LUCENE-2102) LowerCaseFilter for Turkish language

2009-12-01 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12784423#action_12784423 ] DM Smith commented on LUCENE-2102: -- For new classes, would it be helpful to add @since

[jira] Commented: (LUCENE-2034) Massive Code Duplication in Contrib Analyzers - unifly the analyzer ctors

2009-12-01 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12784506#action_12784506 ] DM Smith commented on LUCENE-2034: -- {quote} bq.How about splitting out the stop words

[jira] Commented: (LUCENE-2034) Massive Code Duplication in Contrib Analyzers - unifly the analyzer ctors

2009-11-30 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12783737#action_12783737 ] DM Smith commented on LUCENE-2034: -- I was trying to lurk, but I'm not able to apply

[jira] Commented: (LUCENE-1458) Further steps towards flexible indexing

2009-11-24 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12781947#action_12781947 ] DM Smith commented on LUCENE-1458: -- bq. Yes, this (customizing comparator for termrefs

Re: [jira] Commented: (LUCENE-2092) BooleanQuery.hashCode and equals ignore isCoordDisabled

2009-11-23 Thread DM Smith
Since this is a bug fix, please mark it for 2.9.2 if there ever is one. On Nov 23, 2009, at 7:08 PM, Michael McCandless (JIRA) wrote: [

Re: Hiding JIRA issues

2009-11-21 Thread DM Smith
A couple of thoughts: JIRA allows for administrative export of the database to XML. If these don't export then something is really bad. Contact Atlassian with the problem after searching their forums for the problem. -- DM On Nov 21, 2009, at 9:57 AM, Simon Willnauer wrote: On Sat, Nov 21,

[jira] Commented: (LUCENE-1799) Unicode compression

2009-11-19 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12780129#action_12780129 ] DM Smith commented on LUCENE-1799: -- The sample code is probably what is on this page

Re: Why release 3.0?

2009-11-16 Thread DM Smith
On Nov 16, 2009, at 6:43 PM, Robert Muir wrote: DM, in this case I'm not referring to surrogates, etc, but instead the idea that properties for an existing character can change (the soft hyphen and arabic ayah were two examples), also new characters are introduced. these will affect what

Re: Why release 3.0?

2009-11-16 Thread DM Smith
unicode 3's UCD to unicode 4's UCD, in case you want to see the changes: http://people.apache.org/~rmuir/unicodeDiff.txt That's an amazing number of changes, even when you ignore name changes. On Mon, Nov 16, 2009 at 7:42 PM, DM Smith dmsmith...@gmail.com wrote: On Nov 16, 2009, at 6:43 PM

[jira] Commented: (LUCENE-2023) Improve performance of SmartChineseAnalyzer

2009-11-01 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12772350#action_12772350 ] DM Smith commented on LUCENE-2023: -- Internals are internals. Anyone digging

[jira] Commented: (LUCENE-2023) Improve performance of SmartChineseAnalyzer

2009-10-30 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12772003#action_12772003 ] DM Smith commented on LUCENE-2023: -- If we have a 2.9.2 release, can this be there too

[jira] Commented: (LUCENE-2023) Improve performance of SmartChineseAnalyzer

2009-10-30 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12772022#action_12772022 ] DM Smith commented on LUCENE-2023: -- I fully understand that at some point, just say

[jira] Commented: (LUCENE-2023) Improve performance of SmartChineseAnalyzer

2009-10-30 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12772050#action_12772050 ] DM Smith commented on LUCENE-2023: -- Robert, You have in BigramDictionary: {code} public

Re: contrib and lucene 3.0

2009-10-30 Thread DM Smith
I don't see any reason to freeze new contributions from any release. On 10/30/2009 03:19 PM, Robert Muir wrote: thanks Michael. does anyone else have any opinion on this issue? fyi we already have several new features committed to 3.0 contrib already (see contrib/CHANGES), but I don't too

Lucene as projects in Eclipse

2009-10-28 Thread DM Smith
Is there any guidance on how to set up Lucene for development within Eclipse? Perhaps a wiki page or an old email thread? I looked but didn't find one. I've done it manually twice now and it was time-consuming and ultimately I did it differently each time, not liking any way I have done it.

Re: Lucene as projects in Eclipse

2009-10-28 Thread DM Smith
On 10/28/2009 01:03 PM, Mark Miller wrote: DM Smith wrote: Is there any guidance on how to set up Lucene for development within Eclipse? Perhaps a wiki page or an old email thread? I looked but didn't find one. I've done it manually twice now and it was time-consuming and ultimately I did

Re: Lucene as projects in Eclipse

2009-10-28 Thread DM Smith
for anything in particular though it does make the dependencies between contribs obvious. It was more a pattern from habit on another project. -- DM DM Smith wrote: On 10/28/2009 01:03 PM, Mark Miller wrote: DM Smith wrote: Is there any guidance on how to set up Lucene

Re: Lucene as projects in Eclipse

2009-10-28 Thread DM Smith
On Oct 28, 2009, at 1:45 PM, Robert Muir wrote: DM, I create one project (new project, checkout projects from SVN, and let it set it as a java project). I then set the source folders like you mentioned below. I add lib/junit*whatever.jar to library classpath, and set UTF-8 default

[jira] Commented: (LUCENE-2012) Add @Override annotations

2009-10-27 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12770705#action_12770705 ] DM Smith commented on LUCENE-2012: -- Uwe, what did you use to generate the @override

[jira] Commented: (LUCENE-1998) Use Java 5 enums

2009-10-21 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12768215#action_12768215 ] DM Smith commented on LUCENE-1998: -- .bq I only added the license header back

[jira] Issue Comment Edited: (LUCENE-1998) Use Java 5 enums

2009-10-21 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12768215#action_12768215 ] DM Smith edited comment on LUCENE-1998 at 10/21/09 2:22 PM: bq

[jira] Commented: (LUCENE-1998) Use Java 5 enums

2009-10-21 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12768270#action_12768270 ] DM Smith commented on LUCENE-1998: -- I just noticed that enums are comparable

[jira] Commented: (LUCENE-1998) Use Java 5 enums

2009-10-21 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12768304#action_12768304 ] DM Smith commented on LUCENE-1998: -- bq. changing the order of enum constants is bad, you
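The point about enum ordering in the LUCENE-1998 comments can be illustrated with a small sketch. The enum below is hypothetical and is not a Lucene type; it only shows that Java's built-in compareTo() on enums follows declaration order, so reordering constants changes comparison results.

```java
public final class EnumOrderSketch {

    // Hypothetical enum for illustration only.
    enum Store { YES, NO }

    public static void main(String[] args) {
        // Enum.compareTo() compares ordinals, i.e. declaration positions.
        System.out.println(Store.YES.compareTo(Store.NO) < 0); // prints "true"
        System.out.println(Store.NO.ordinal());                // prints "1"
        // If the declaration were changed to { NO, YES }, both results
        // would flip, and any code relying on ordinal() would see
        // different numbers.
    }
}
```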

[jira] Updated: (LUCENE-1257) Port to Java5

2009-10-20 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DM Smith updated LUCENE-1257: - Attachment: LUCENE-1257_enum.patch Migrates to Java 5 enums in core and contrib. All tests pass

[jira] Created: (LUCENE-1998) Use Java 5 enums

2009-10-20 Thread DM Smith (JIRA)
Use Java 5 enums Key: LUCENE-1998 URL: https://issues.apache.org/jira/browse/LUCENE-1998 Project: Lucene - Java Issue Type: Improvement Affects Versions: 3.0 Reporter: DM Smith Priority: Minor

[jira] Updated: (LUCENE-1998) Use Java 5 enums

2009-10-20 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DM Smith updated LUCENE-1998: - Attachment: LUCENE-1998_enum.patch This issue and patch were part of LUCENE-1257, but may have backward

[jira] Updated: (LUCENE-1257) Port to Java5

2009-10-20 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DM Smith updated LUCENE-1257: - Attachment: (was: LUCENE-1257_enum.patch) Port to Java5 - Key

Parameter class and Java 5 Enums

2009-10-19 Thread DM Smith
Should the Parameter class be replaced with Java 5 enums? My only concern is backward compatibility. I noticed that Parameter is serializable. Is this used by Lucene? I wasn't able to see any place that depended on it. The only public method, Parameter.toString() results in the same value as a

Arabic Analyzer: possible bug

2009-10-08 Thread DM Smith
I'm wondering if there is a bug in ArabicAnalyzer in 2.9. (I don't know Arabic or Farsi, but have some texts to index in those languages.) The tokenizer/filter chain for ArabicAnalyzer is: TokenStream result = new ArabicLetterTokenizer( reader ); result = new StopFilter(

Re: Arabic Analyzer: possible bug

2009-10-08 Thread DM Smith
, 2009 at 7:24 AM, DM Smith dmsmith...@gmail.com wrote: I'm wondering if there is a bug in ArabicAnalyzer in 2.9. (I don't know Arabic or Farsi, but have some texts to index in those languages.) The tokenizer/filter chain for ArabicAnalyzer is: TokenStream result = new

Re: Arabic Analyzer: possible bug

2009-10-08 Thread DM Smith
On 10/08/2009 09:23 AM, Uwe Schindler wrote: Just an addition: The lowercase filter is only for the case of embedded non-arabic words. And these will not appear in the stop words. I learned something new! Hmm. If one has a mixed Arabic / English text, shouldn't one be able to augment the
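Why filter order matters for mixed-script text can be sketched without any Lucene classes. The example assumes a stop list that has been augmented with a lowercase English word and compares the two possible orderings; the helper names are hypothetical stand-ins, not the StopFilter or LowerCaseFilter APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public final class FilterOrderSketch {

    // Hypothetical stand-in for a lowercasing filter.
    static List<String> lower(List<String> tokens) {
        List<String> out = new ArrayList<String>();
        for (String t : tokens) {
            out.add(t.toLowerCase(Locale.ROOT));
        }
        return out;
    }

    // Hypothetical stand-in for a case-sensitive stop-word filter.
    static List<String> stop(List<String> tokens, Set<String> stopWords) {
        List<String> out = new ArrayList<String>();
        for (String t : tokens) {
            if (!stopWords.contains(t)) {
                out.add(t);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> stopWords = new HashSet<String>(Arrays.asList("the"));
        List<String> tokens = Arrays.asList("The", "book");

        // Stop first, then lowercase: "The" slips past the stop list.
        System.out.println(lower(stop(tokens, stopWords))); // prints "[the, book]"

        // Lowercase first, then stop: "the" is removed as intended.
        System.out.println(stop(lower(tokens), stopWords)); // prints "[book]"
    }
}
```

This is one reading of the ordering concern tracked in LUCENE-1963 (lowercase before the stop filter); for purely Arabic text the two orderings give the same result.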

Re: Arabic Analyzer: possible bug

2009-10-08 Thread DM Smith
non-Arabic (also cyrillic, etc), is pretty safe across the board though. On Thu, Oct 8, 2009 at 9:29 AM, DM Smith dmsmith...@gmail.com wrote: On 10/08/2009 09:23 AM, Uwe Schindler wrote: Just an addition: The lowercase filter is only for the case

Re: Arabic Analyzer: possible bug

2009-10-08 Thread DM Smith
or something like that? for some other languages written in the Latin script, English stopwords could be bad :) I think that Lowercasing non-Arabic (also cyrillic, etc), is pretty safe across the board though. On Thu, Oct 8, 2009 at 9:29 AM, DM Smith

[jira] Commented: (LUCENE-1963) ArabicAnalyzer: Lowercase before Stopfilter

2009-10-08 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12763554#action_12763554 ] DM Smith commented on LUCENE-1963: -- can you commit it to 2.9.1 too? (For those stuck

Re: [jira] Created: (LUCENE-1956) Fix javadoc comments in search package

2009-10-07 Thread DM Smith
On Oct 7, 2009, at 2:59 PM, Michael Busch (JIRA) wrote: Fix javadoc comments in search package -- Key: LUCENE-1956 URL: https://issues.apache.org/jira/browse/LUCENE-1956 Project: Lucene - Java Issue Type:

Re: [jira] Created: (LUCENE-1948) Deprecating InstantiatedIndexWriter

2009-10-05 Thread DM Smith
On 10/05/2009 12:22 PM, Karl Wettin (JIRA) wrote: Deprecating InstantiatedIndexWriter --- Key: LUCENE-1948 URL: https://issues.apache.org/jira/browse/LUCENE-1948 Project: Lucene - Java Issue Type: Task

Searcher javadoc problem

2009-10-03 Thread DM Smith
I'm working on migrating my code to 2.9. And I'm trying to figure out what to do. Along the way I found a circular argument in the JavaDoc for Searcher. BTW, this is not a user question. My current code calls: Hits hits = searcher.search(query); The JavaDoc for it says:

Re: Searcher javadoc problem

2009-10-03 Thread DM Smith
- even though the JavaDoc warns you that's a major speed trap, everyone still did it ... use a Collector. You're right though - it shouldn't point to IndexSearcher.search(Query) after that - it should point to IndexSearcher.search(Query, int). Got to fix that. DM Smith wrote: I'm working on migrating

Re: svn commit: r821434 - /lucene/java/trunk/src/java/org/apache/lucene/search/Searcher.java

2009-10-03 Thread DM Smith
On Oct 3, 2009, at 6:51 PM, Michael Busch busch...@gmail.com wrote: On 10/4/09 12:42 AM, Mark Miller wrote: Why will 3.0 be work to upgrade? 2.9 was supposed to be the work, 3.0 no work ... With 2.9 you can be lazy and live with deprecation warnings. With 3.0 you *have* to switch to

Re: svn commit: r821440 - /lucene/java/branches/lucene_2_9/src/java/org/apache/lucene/search/Searcher.java

2009-10-03 Thread DM Smith
Please apply all bug fixes to 2.9.0 as some of us have it as our last Java 1.4.2 release. On Oct 3, 2009, at 6:55 PM, Uwe Schindler u...@thetaphi.de wrote: Should we now commit all fixes also to 2.9, which should go into 2.9.1, if it will be released as a bugfix release together with 3.0

Re: svn commit: r821434 - /lucene/java/trunk/src/java/org/apache/lucene/search/Searcher.java

2009-10-03 Thread DM Smith
On Oct 3, 2009, at 6:56 PM, Mark Miller markrmil...@gmail.com wrote: No bug fixes for the lazy! Not having 1.5 on Mac OS X Tiger is the issue. Do you recommend that 2.9.0 is really not for 1.4 users? Therefore there was no point in waiting on Java 1.5. :) Yes I see that tongue in your cheek.

Re: Searcher javadoc problem

2009-10-03 Thread DM Smith
like working up something with a Collector on your own would be better though - why compute the score if you don't need it. Hits caching was rarely that useful either. DM Smith wrote: It makes sense if you understand the context. We make each verse of a Bible a document. There are about 36000 docc

Re: Deprecated class in spatial contrib

2009-08-30 Thread DM Smith
+1 How obvious!! On Aug 30, 2009, at 3:04 PM, Mark Miller markrmil...@gmail.com wrote: The spatial contrib has not been in a release before, so just wondering why there are deprecated classes in it - should we remove those, or was there a good reason to keep them? In general, it seems we

Lucene 3.0 and Java 5 (was Re: Finishing Lucene 2.9)

2009-08-23 Thread DM Smith
it should have been for a 3.0 release. I'd also suggest that repackaging, suggested in a prior thread, be tackled also. This could follow a 3.0 release quickly. -- DM Smith

[jira] Commented: (LUCENE-1813) Add option to ReverseStringFilter to mark reversed tokens

2009-08-17 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12744092#action_12744092 ] DM Smith commented on LUCENE-1813: -- I like the idea of a constant and it presented

Re: Java 5 creeped again to Benchmark?

2009-08-13 Thread DM Smith
On 08/13/2009 01:56 PM, Mark Miller wrote: Shai Erera wrote: So far Mike has resolved the issue again, so it sounds like we go w/ it ? Lazy consensus - so its lookin good so far - but someone could still derail us I suppose. I've been a stick-in-the-mud wrt migrating to Java 5 in the

Beta (was Re: who clears attributes?)

2009-08-11 Thread DM Smith
On 08/11/2009 08:22 AM, Michael McCandless wrote: I do still think a longish 2.9 beta is warranted, if we can succeed in getting users outside the dev group to kick the tires and uncover stuff. I think a beta would be a great idea. Not sure it needs to be longish. Having not looked at it,

Re: who clears attributes?

2009-08-11 Thread DM Smith
Uwe, Is this example available? I think that an example like this would help the user community see the current value in the change. At least, I'd love to see the code for it. -- DM On 08/10/2009 06:49 PM, Uwe Schindler wrote: UIMA The new API looks like UIMA, you have streams that

[jira] Commented: (LUCENE-1793) remove custom encoding support in Greek/Russian Analyzers

2009-08-09 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12741109#action_12741109 ] DM Smith commented on LUCENE-1793: -- bq.If this is the concern, then I think a better

[jira] Commented: (LUCENE-1793) remove custom encoding support in Greek/Russian Analyzers

2009-08-09 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12741173#action_12741173 ] DM Smith commented on LUCENE-1793: -- I wasn't thinking about any encoding in particular

Re: IndexWriter.getReader usage

2009-08-01 Thread DM Smith
On Aug 1, 2009, at 7:52 AM, Grant Ingersoll gsing...@apache.org wrote: In many NRT cases, it seems the traditional approach has been to have two RAM directories and a write-through FS Directory (for example Zoie does this, and it has also been discussed a fair number of times on the

Re: test-tag does not really test against 2.4, it tests against a branch from trunk on 2008-11-29

2009-07-02 Thread DM Smith
FYI, you can always create a branch from a specific revision. Don't know if this would help. On Jul 2, 2009, at 4:44 PM, Uwe Schindler wrote: When doing LUCENE-1723, I restored the old state of RangeQuery & Co from Lucene 2.4.1 and added all new things from 2.9 to the new renamed

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread DM Smith
Michael Busch wrote: Probably everyone is thinking right now Oh no! Not again!. I admit I didn't fully read the incredibly long recent thread about backwards-compatibility, so maybe what I'm about to propose has been proposed already. In that case my apologies in advance. Perhaps you should go

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread DM Smith
drive a 4.0 release? -- DM DM Smith wrote: Michael Busch wrote: Probably everyone is thinking right now Oh no! Not again!. I admit I didn't fully read the incredibly long recent thread about backwards-compatibility, so maybe what I'm about to propose has been proposed already. In that case

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread DM Smith
Michael McCandless wrote: On Tue, Jun 16, 2009 at 12:41 PM, DM Smith dmsmith...@gmail.com wrote: The Debian policy is to bump the major revision number every time there is an incompatible API change. Does this include adding methods to interfaces? Ie, is there some automated check

Re: bulk fixing svn eol-style?

2009-06-09 Thread DM Smith
Michael McCandless wrote: We have a number of sources that don't have eol-style set to native... This causes problems, eg, patches to such files become degenerate (remove all lines, add all lines), which of course hides what really changed. So... are there any objections if I go through all

Re: Lucene's default settings back compatibility

2009-05-30 Thread DM Smith
I think one conclusion that did come of this discussion was that bugs should be fixed even if it breaks backward compatibility. -- DM On May 30, 2009, at 7:21 AM, Michael McCandless wrote: Actually, I think this is a common, and in fact natural/expected occurrence in open-source. When a

Re: Lucene's default settings back compatibility

2009-05-22 Thread DM Smith
Yonik Seeley wrote: On Fri, May 22, 2009 at 1:22 PM, Michael McCandless luc...@mikemccandless.com wrote: (That said, unrelated to this discussion, I would actually like to record per-segment which version of Lucene wrote the segment; this would be very helpful when debugging issues like

Re: Lucene's default settings back compatibility

2009-05-22 Thread DM Smith
Michael McCandless wrote: On Fri, May 22, 2009 at 12:52 PM, Marvin Humphrey mar...@rectangular.com wrote: when working on 3.1 if we make some great improvement, I'd like new users in 3.1 to see the improvement by default. Sounds like an argument for more frequent major releases.

Re: Lucene's default settings back compatibility

2009-05-22 Thread DM Smith
Marvin Humphrey wrote: I feel the opposite: I'd like new users to see improvements by default, and users that require strict back-compat to ask for that. By strict back-compat, do you mean people who would like their search app to not fail silently? ;) A new user who follows your

Re: Lucene's default settings back compatibility

2009-05-22 Thread DM Smith
Michael McCandless wrote: On Fri, May 22, 2009 at 2:27 PM, DM Smith dmsmith...@gmail.com wrote: Marvin Humphrey wrote: I feel the opposite: I'd like new users to see improvements by default, and users that require strict back-compat to ask for that. By strict back-compat
