Re: Lucene's default settings & back compatibility

2009-05-30 Thread DM Smith
I think one conclusion that did come of this discussion was that bugs should be fixed even if it breaks backward compatibility. -- DM On May 30, 2009, at 7:21 AM, Michael McCandless wrote: Actually, I think this is a common, and in fact natural/expected occurrence in open-source. When a tri

Re: bulk fixing svn eol-style?

2009-06-09 Thread DM Smith
Michael McCandless wrote: We have a number of sources that don't have eol-style set to "native"... This causes problems, eg, patches to such files become degenerate (remove all lines, add all lines), which of course hides what really changed. So... are there any objections if I go through al

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread DM Smith
Michael Busch wrote: Probably everyone is thinking right now "Oh no! Not again!". I admit I didn't fully read the incredibly long recent thread about backwards-compatibility, so maybe what I'm about to propose has been proposed already. In that case my apologies in advance. Perhaps you should go

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread DM Smith
would ever drive a 4.0 release? -- DM DM Smith wrote: Michael Busch wrote: Probably everyone is thinking right now "Oh no! Not again!". I admit I didn't fully read the incredibly long recent thread about backwards-compatibility, so maybe what I'm about to propose has been

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread DM Smith
Michael McCandless wrote: On Tue, Jun 16, 2009 at 12:41 PM, DM Smith wrote: The Debian policy is to bump the major revision number every time there is an incompatible API change. Does this include adding methods to interfaces? Ie, is there some automated check done by Debian that

Re: test-tag does not really test against 2.4, it tests against a branch from trunk on 2008-11-29

2009-07-02 Thread DM Smith
FYI, You can always create a branch from a specific revision. Don't know if this would help. On Jul 2, 2009, at 4:44 PM, Uwe Schindler wrote: When doing LUCENE-1723, I restored the old state of RangeQuery & Co from Lucene 2.4.1 and added all new things from 2.9 to the new renamed TermRangeQ

Re: IndexWriter.getReader usage

2009-08-01 Thread DM Smith
On Aug 1, 2009, at 7:52 AM, Grant Ingersoll wrote: In many NRT cases, it seems the traditional approach has been to have two RAM directories and a write-through FS Directory (for example Zoie does this, and it has also been discussed a fair number of times on the various lists). I'm wonde

Re: IndexWriter.getReader usage

2009-08-03 Thread DM Smith
On 08/03/2009 08:21 AM, Earwin Burrfoot wrote: The biggest win for NRT was switching to per-segment Collector because that meant we could re-use FieldCache entries for all segments that hadn't changed. In my opinion, this switch was enough to get as NRT-ey, as you want. Fusing IR/IW togeth

Beta (was Re: who clears attributes?)

2009-08-11 Thread DM Smith
On 08/11/2009 08:22 AM, Michael McCandless wrote: I do still think a longish 2.9 beta is warranted, if we can succeed in getting users outside the dev group to kick the tires and uncover stuff. I think a beta would be a great idea. Not sure it needs to be "longish." Having not looked at it

Re: who clears attributes?

2009-08-11 Thread DM Smith
Uwe, Is this example available? I think that an example like this would help the user community see the current value in the change. At least, I'd love to see the code for it. -- DM On 08/10/2009 06:49 PM, Uwe Schindler wrote: > UIMA The new API looks like UIMA, you have streams that

Re: Java 5 creeped again to Benchmark?

2009-08-13 Thread DM Smith
On 08/13/2009 01:56 PM, Mark Miller wrote: Shai Erera wrote: So far Mike has resolved the issue again, so it sounds like we go w/ it ? Lazy consensus - so its lookin good so far - but someone could still derail us I suppose. I've been a stick-in-the-mud wrt migrating to Java 5 in the past.

Lucene 3.0 and Java 5 (was Re: Finishing Lucene 2.9)

2009-08-23 Thread DM Smith
the name/api problems and making the API of Lucene be what it should have been for a 3.0 release. I'd also suggest that repackaging, suggested in a prior thread, be tackled also. This could follow a 3.0 release quickly. -- DM Smith

Re: Deprecated class in spatial contrib

2009-08-30 Thread DM Smith
+1 How obvious!! On Aug 30, 2009, at 3:04 PM, Mark Miller wrote: The spatial contrib has not been in a release before, so just wondering why there are deprecated classes in it - should we remove those, or was there a good reason to keep them? In general, it seems we should just deprecate

Searcher javadoc problem

2009-10-03 Thread DM Smith
I'm working on migrating my code to 2.9. And I'm trying to figure out what to do. Along the way I found a circular argument in the JavaDoc for Searcher. BTW, this is not a user question. My current code calls: Hits hits = searcher.search(query); The JavaDoc for it says: /**

Re: Searcher javadoc problem

2009-10-03 Thread DM Smith
h the JavaDoc warns you that's a major speed trap, everyone still did it ... use a Collector. You're right though - it shouldn't point to IndexSearcher.search(Query) after that - it should point to IndexSearcher.search(Query, int) Goto fix that. DM Smith wrote: I'm working on migrating my c

Re: svn commit: r821434 - /lucene/java/trunk/src/java/org/apache/lucene/search/Searcher.java

2009-10-03 Thread DM Smith
On Oct 3, 2009, at 6:51 PM, Michael Busch wrote: On 10/4/09 12:42 AM, Mark Miller wrote: Why will 3.0 be work to upgrade? 2.9 was supposed to be the work, 3.0 no work ... With 2.9 you can be lazy and live with deprecation warnings. With 3.0 you *have* to switch to undeprecated APIs. M

Re: svn commit: r821440 - /lucene/java/branches/lucene_2_9/src/java/org/apache/lucene/search/Searcher.java

2009-10-03 Thread DM Smith
Please apply all bug fixes to 2.9.0 as some of us have it as our last Java 1.4.2 release. On Oct 3, 2009, at 6:55 PM, "Uwe Schindler" wrote: Should we now commit all fixes also to 2.9, which should go into 2.9.1, if it will be released as a bugfix release together with 3.0 (e.g. the highl

Re: svn commit: r821434 - /lucene/java/trunk/src/java/org/apache/lucene/search/Searcher.java

2009-10-03 Thread DM Smith
On Oct 3, 2009, at 6:56 PM, Mark Miller wrote: No bug fixes for the lazy! Not having 1.5 on mac osx tiger is the issue. Do you recommend that 2.9.0 is really not for 1.4 users. Therefore was no point in waiting on Java1.5. :) Yes I see that tongue in your cheek. We should also fix

Re: Searcher javadoc problem

2009-10-03 Thread DM Smith
working up something with a Collector on your own would be better though - why compute the score if you don't need it. Hits caching was rarely that useful either. DM Smith wrote: It makes sense if you understand the context. We make each verse of a Bible a document. There are about 36000 docs in a B

Re: [jira] Created: (LUCENE-1948) Deprecating InstantiatedIndexWriter

2009-10-05 Thread DM Smith
On 10/05/2009 12:22 PM, Karl Wettin (JIRA) wrote: Deprecating InstantiatedIndexWriter --- Key: LUCENE-1948 URL: https://issues.apache.org/jira/browse/LUCENE-1948 Project: Lucene - Java Issue Type: Task

Re: [jira] Created: (LUCENE-1956) Fix javadoc comments in search package

2009-10-07 Thread DM Smith
On Oct 7, 2009, at 2:59 PM, Michael Busch (JIRA) wrote: Fix javadoc comments in search package -- Key: LUCENE-1956 URL: https://issues.apache.org/jira/browse/LUCENE-1956 Project: Lucene - Java Issue Type: T

Arabic Analyzer: possible bug

2009-10-08 Thread DM Smith
I'm wondering if there is a bug in ArabicAnalyzer in 2.9. (I don't know Arabic or Farsi, but have some texts to index in those languages.) The tokenizer/filter chain for ArabicAnalyzer is: TokenStream result = new ArabicLetterTokenizer( reader ); result = new StopFilter( result

Re: Arabic Analyzer: possible bug

2009-10-08 Thread DM Smith
Oct 8, 2009 at 7:24 AM, DM Smith wrote: I'm wondering if there is a bug in ArabicAnalyzer in 2.9. (I don't know Arabic or Farsi, but have some texts to index in those languages.) The tokenizer/filter chain for ArabicAnalyzer is: TokenStream result = new ArabicLetterTo

Re: Arabic Analyzer: possible bug

2009-10-08 Thread DM Smith
c Analyzer: possible bug DM, there is no upper/lower cases in Arabic, so don't worry, but the stop word list needs some corrections and may miss some common/stop Arabic words. Best, On Thu, Oct 8, 2009 at 4:14 PM, DM Smith wrote: Robert, Thanks for the info. As I said, I am illiterate

Re: Arabic Analyzer: possible bug

2009-10-08 Thread DM Smith
owercasing non-Arabic (also cyrillic, etc), is pretty safe across the board though. On Thu, Oct 8, 2009 at 9:29 AM, DM Smith wrote: On 10/08/2009 09:23 AM, Uwe Schindler wrote: Just an addition: The lowercase filter is only for the case

Re: Arabic Analyzer: possible bug

2009-10-08 Thread DM Smith
suppose. but this is a tricky subject, what if you have mixed Arabic / German or something like that? for some other languages written in the Latin script, English stopwords could be bad :) I think that Lowercasing non-Arabic (also cyrillic, etc)

Re: Going to Java 5. Was: Re: A bit of planning

2008-03-10 Thread DM Smith
d not change. On Mar 10, 2008, at 12:02 PM, Doron Cohen wrote: On Thu, Jan 17, 2008 at 4:01 PM, DM Smith <[EMAIL PROTECTED]> wrote: On Jan 17, 2008, at 1:38 AM, Chris Hostetter wrote: : I'd like to recommend that 3.0 contain the new Java 5 API changes and what it : replaces be m

Re: Going to Java 5. Was: Re: A bit of planning

2008-03-10 Thread DM Smith
Grant Ingersoll wrote: All it takes is one line in the announcement saying "Version 3.0 uses Java 1.5" I don't think the significance will be lost on anyone. Everyone knows what Java 1.5 is. I'm -1 on calling it 4.0. People will then ask where is 3.0. I am +1 for sticking w/ the plan we vo

Read-Only core

2008-03-30 Thread DM Smith
index. I'd like to explore what it would take to make Lucene work on a JavaME device to search pre-built, read-only indexes. -- DM Smith - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]

Re: Lingustically-enhanced indexing for Lucene

2008-05-09 Thread DM Smith
On May 9, 2008, at 8:06 AM, [EMAIL PROTECTED] wrote: On the other hand, it is java 1.5 compatible. Even developments in contrib, that are rather independent, must be java 1.4 compatible? Java 5 is fine for contrib. -

Re: Lingustically-enhanced indexing for Lucene

2008-05-09 Thread DM Smith
n for 3.0 which will be Java 5. I would think that anything new would wait for 3.0, since that is "any day now" :) Former advocate for Java 1.4, DM Smith

Token implementation

2008-05-18 Thread DM Smith
, next() should be deprecated, IMHO. The performance advantage is in reusing Tokens and their buffer. I have also used a better algorithm than doubling for resizing an array. I'd have to hunt for it. -- DM Smith, infrequent contributor, grateful user!

Re: Token implementation

2008-05-19 Thread DM Smith
; tbLength) { int size = tbLength; while (size < newSize) { size *= 2; } newSize = size; } // Check to see if the buffer needs to be resized if (newSize > tbLength) { termBuffer = new char[newSize]; } } More below More responses below: DM Smith <[EMAIL PROTECT
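The truncated fragment above grows the term buffer by doubling its length until the requested size fits. A minimal sketch of that idea follows, written only as an illustration: the class name `TermBufferSketch`, the helper `doubledCapacity`, and the initial capacity of 16 are hypothetical and are not taken from Lucene's `Token` class. The sketch also guards against a zero starting length, because doubling zero never grows.

```java
// Illustrative sketch only; names and the initial capacity are assumptions,
// not Lucene's actual Token implementation.
final class TermBufferSketch {

    // Hypothetical initial capacity, chosen for the example.
    private char[] termBuffer = new char[16];

    // Smallest capacity >= newSize that can be reached by repeatedly
    // doubling the current capacity. Int overflow for very large
    // requests is not handled in this sketch.
    static int doubledCapacity(int current, int newSize) {
        int size = Math.max(current, 1); // guard: doubling 0 would loop forever
        while (size < newSize) {
            size *= 2;
        }
        return size;
    }

    // Replaces the buffer only when the requested size does not fit.
    // As in the quoted fragment, the old contents are not copied over.
    void resizeTermBuffer(int newSize) {
        int capacity = doubledCapacity(termBuffer.length, newSize);
        if (capacity > termBuffer.length) {
            termBuffer = new char[capacity];
        }
    }

    int capacity() {
        return termBuffer.length;
    }

    public static void main(String[] args) {
        TermBufferSketch sketch = new TermBufferSketch();
        sketch.resizeTermBuffer(100);
        System.out.println(sketch.capacity());
    }
}
```

Doubling keeps the number of reallocations logarithmic in the final size, at the cost of up to half the buffer being unused; the threads on array growth later in this listing discuss a gentler growth rule.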

Java 1.4 support

2008-05-19 Thread DM Smith
I found an interesting piece of code, called Retro Weaver, that allows one to develop Java 5 code and make it byte code compatible with Java 1.4. See: http://retroweaver.sourceforge.net/ I still have a significant user base that is stuck with Java 1.4 (i.e. MacOSX 10.3) and was planning to mai

Re: Token implementation

2008-05-19 Thread DM Smith
On May 19, 2008, at 4:33 PM, Michael McCandless wrote: DM Smith <[EMAIL PROTECTED]> wrote: Michael McCandless wrote: I agree the situation is not ideal, and it's confusing. My problem as a user is that I have to read the code to figure out how to optimally use the class. The

Re: Token implementation

2008-05-20 Thread DM Smith
On May 20, 2008, at 12:50 AM, Hiroaki Kawai wrote: "Michael McCandless" <[EMAIL PROTECTED]> wrote: More responses below: DM Smith <[EMAIL PROTECTED]> wrote: But, in TokenFilter, next() should be deprecated, IMHO. I think this is a good idea. After al

Re: Token implementation

2008-05-20 Thread DM Smith
On May 20, 2008, at 5:01 AM, Michael McCandless wrote: DM Smith wrote: On May 19, 2008, at 4:33 PM, Michael McCandless wrote: DM Smith <[EMAIL PROTECTED]> wrote: Michael McCandless wrote: I agree the situation is not ideal, and it's confusing. My problem as a user is tha

Re: Token implementation

2008-07-11 Thread DM Smith
own field. -- DM On May 20, 2008, at 5:21 AM, DM Smith wrote: On May 20, 2008, at 5:01 AM, Michael McCandless wrote: DM Smith wrote: On May 19, 2008, at 4:33 PM, Michael McCandless wrote: DM Smith <[EMAIL PROTECTED]> wrote: Michael McCandless wrote: I agree the situation is not

Re: Token implementation

2008-07-11 Thread DM Smith
Michael McCandless wrote: DM Smith wrote: Shouldn't Term have constructors that take a Token? I think that makes sense, though normally Token appears during analysis and Term during searching (I think?) -- how often would you need to make a Term from a Token? The problem I'm

Re: Token implementation

2008-07-11 Thread DM Smith
(I use Eclipse and have it set to flag all deprecated uses. This helps me look for places to change.) I think that this will make migration to 3.0 be much easier. With this changing Term to add Term(String, Token) won't be necessary. -- DM Mike DM Smith wrote: Michael McCandless w

Re: Token implementation

2008-07-11 Thread DM Smith
should use the char[] reuse methods instead? Mike DM Smith wrote: Michael McCandless wrote: DM Smith wrote: Shouldn't Term have constructors that take a Token? I think that makes sense, though normally Token appears during analysis and Term during searching (I think?) -- how often

Re: Token implementation

2008-07-12 Thread DM Smith
new String from the byte[]), and in the javadocs for termText() state that you can migrate either term() (if you really want a String and you understand the performance cost of doing so) or to the re-use APIs? Mike DM Smith wrote: Michael McCandless wrote: Maybe we should un-deprecate the

Re: Token implementation

2008-07-12 Thread DM Smith
Michael McCandless wrote: But, in TokenFilter, next() should be deprecated, IMHO. I think this is a good idea. After all if people don't want to bother using the passed in Token, they are still allowed to return a new one. I'm looking into the deprecation of TokenStream.next(). I ha

Re: Token implementation

2008-07-14 Thread DM Smith
Hiroaki Kawai wrote: DM Smith <[EMAIL PROTECTED]> wrote: On Jul 11, 2008, at 9:42 PM, Hiroaki Kawai wrote: Another suggestion from me: How about making token object as a singleton? Would that work for a multi-threaded application? Of course. We should make that

TokenStream problem?

2008-07-16 Thread DM Smith
According to the documentation for TokenStream, derived classes are to override either next() or next(Token). Currently, if next(Token) is overridden, but next() is called, payload is cloned if it exists in the new token. However, if next(Token) is called, it is up to the implementation to p

Re: [VOTE] Break Back Compatibility "Contract" on Fieldable

2008-07-30 Thread DM Smith
ch as to Lucene user's mailing list and package maintainers). -- DM Smith Grant Ingersoll wrote: As they say, rules are meant to be broken... For a variety of reasons, some outlined below, I (and others) would like us to break our back compatibility requirements and allow for modifying the

Vector and Hashtable usage

2008-08-20 Thread DM Smith
s is the time to deprecate its usage in the API. -- DM Smith

Re: RMI, Searchable and RemoteSearchable

2008-09-26 Thread DM Smith
Grant Ingersoll wrote: Came across: http://www.google.com/url?sa=t&source=web&ct=res&cd=1&url=http%3A%2F%2Fgroups.google.com%2Fgroup%2Fandroid-developers%2Fbrowse_thread%2Fthread%2F601329551a87e601%2Fcd0919ce891b4a26%3Flnk%3Dgst%26q%3Dlucene&ei=zNzcSPHCF4yI1ga61YiTBA&usg=AFQjCNECrBnNPBkxI4I0EbIzI

Re: [jira] Commented: (LUCENE-1406) new Arabic Analyzer (Apache license)

2008-09-30 Thread DM Smith
On Sep 30, 2008, at 8:19 AM, Robert Muir wrote: cool. is there interest in similar basic functionality for Hebrew? I'm interested as I use lucene for biblical research. same rules apply: without using GPL data (i.e. Hspell data) you can't do it right, but you can do a lot of the common

Re: [jira] Commented: (LUCENE-1406) new Arabic Analyzer (Apache license)

2008-09-30 Thread DM Smith
/ points and cantillation. All are NFC. IMHO, I think it is important to document whether an analyzer works with NFC, NFD or whatever. And leave it to the program to normalize to that form. On Tue, Sep 30, 2008 at 8:54 AM, DM Smith <[EMAIL PROTECTED] <mailto:[EMAIL PROTECTED]&g

Re: TokenStream and Token APIs

2008-10-13 Thread DM Smith
On Oct 13, 2008, at 3:34 PM, Doug Cutting wrote: Michael Busch wrote: public abstract boolean nextToken() throws IOException; What's the point of a separate Token and TokenStream if there's only a single Token per TokenStream? If that's really the direction we'll go, then all of the

Re: [VOTE] Relax backwards-compatibility policy for package-protected APIs

2008-10-22 Thread DM Smith
On Oct 22, 2008, at 5:27 AM, Michael McCandless wrote: I think *not* having to maintain back compat of the package private APIs is very important to keeping our freedom (and sanity!) to continue to improve Lucene. This is similar to marking a new API as experimental and subject to sudde

Re: Code Formatting

2008-11-10 Thread DM Smith
It seems to me that the following would work: 1) All new code (contrib too) has to be formatted according to published standards. This can be done either by the person doing the submission, by a committer, or (It doesn't matter to me) 2) All existing code for which there is no patch can b

Parameter class and Java 5 Enums

2009-10-19 Thread DM Smith
Should the Parameter class be replaced with Java 5 enums? My only concern is backward compatibility. I noticed that Parameter is serializable. Is this used by Lucene? I wasn't able to see any place that depended on it. The only public method, Parameter.toString() results in the same value as a

Lucene as projects in Eclipse

2009-10-28 Thread DM Smith
Is there any guidance on how to set up Lucene for development within Eclipse. Perhaps a wiki page or an old email thread? I looked but didn't find one. I've done it manually twice now and it was time-consuming and ultimately I did it differently each time, not liking any way I have done it. Or

Re: Lucene as projects in Eclipse

2009-10-28 Thread DM Smith
On 10/28/2009 01:03 PM, Mark Miller wrote: DM Smith wrote: Is there any guidance on how to set up Lucene for development within Eclipse. Perhaps a wiki page or an old email thread? I looked but didn't find one. I've done it manually twice now and it was time-consuming and ultima

Re: Lucene as projects in Eclipse

2009-10-28 Thread DM Smith
ork ... I'm not looking for anything in particular though it does make the dependencies between contribs obvious. It was more a pattern from habit on another project. -- DM DM Smith wrote: On 10/28/2009 01:03 PM, Mark Miller wrote: DM Smith wrote: Is there any guid

Re: Lucene as projects in Eclipse

2009-10-28 Thread DM Smith
On Oct 28, 2009, at 1:45 PM, Robert Muir wrote: DM, I create one project (new project, checkout projects from SVN, and let it set it as a java project). I then set the source folders like you mentioned below. I add lib/junit*whatever.jar to library classpath, and set UTF-8 default encodin

Re: contrib and lucene 3.0

2009-10-30 Thread DM Smith
I don't see any reason to freeze new contributions from any release. On 10/30/2009 03:19 PM, Robert Muir wrote: thanks Michael. does anyone else have any opinion on this issue? fyi we already have several new features committed to 3.0 contrib already (see contrib/CHANGES), but I don't too much

Re: Why release 3.0?

2009-11-16 Thread DM Smith
On Nov 16, 2009, at 6:43 PM, Robert Muir wrote: > DM, in this case I'm not referring to surrogates, etc, but instead the idea > that properties for an existing character can change (the soft hyphen and > arabic ayah were two examples), also new characters are introduced. > > these will affect

Re: Why release 3.0?

2009-11-16 Thread DM Smith
perhaps the best advice is to skip 3.0 and take the pain once. > > btw, i created a diff from unicode 3's UCD to unicode 4's UCD, in case you > want to see the changes: http://people.apache.org/~rmuir/unicodeDiff.txt That's an amazing number of changes, even when you i

Re: Hiding JIRA issues

2009-11-21 Thread DM Smith
A couple of thoughts: JIRA allows for administrative export of the database to XML. If these don't export then something is really bad. Contact atlassian with the problem after searching their forums for the problem. -- DM On Nov 21, 2009, at 9:57 AM, Simon Willnauer wrote: > On Sat, Nov 21, 2

Re: [jira] Commented: (LUCENE-2092) BooleanQuery.hashCode and equals ignore isCoordDisabled

2009-11-23 Thread DM Smith
Since this is a bug fix, please mark it for 2.9.2 if there ever is one. On Nov 23, 2009, at 7:08 PM, Michael McCandless (JIRA) wrote: > >[ > https://issues.apache.org/jira/browse/LUCENE-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12781706#acti

Release artifacts

2009-12-05 Thread DM Smith
I'm wondering about the size of the builds, which are surprisingly big to me. The src is 12M/13M and the bin is 17M/26M (tar.gz/zip) for 2.9.1, similar for 3.0.0. In looking at the binary artifact I see the following: * Every contrib jar has a corresponding javadoc jar, but there is no core-jav

Re: Lots of results

2009-12-05 Thread DM Smith
On Dec 5, 2009, at 5:22 PM, Grant Ingersoll wrote: > At ScaleCamp yesterday in the UK, I was listening to a talk on Xapian and the > speaker said one of the optimizations they do when retrieving a large result > set is that instead of managing a Priority Queue, they just allocate a large > arr

Re: LUCENE-1515

2010-01-02 Thread DM Smith
Just my 2 cents from a user perspective to the whole thread: I want the best and an easy way to identify the best. Preferably, it will be the default by current version. The best should also have the best name. Because of the backward compatibility policy, we're painted into a box, into name hel

Re: LUCENE-1515

2010-01-02 Thread DM Smith
On Jan 2, 2010, at 7:46 AM, Robert Muir wrote: >> I also want backward compatibility. Or at least control over it. That is, I >> need for indexes to work fully but want an easy path to upgrade/replace an >> index with better analyzer/filter combos. This stemmer is not backward >> compatible. >

Re: Compound File Default

2010-01-12 Thread DM Smith
I'm not sure that it's safe to assume that production use of Lucene is not on a laptop or that it is always on big iron. It makes sense that Lucene is embedded in all sorts of desktop applications that might run on small machines. That certainly describes the application that I work on. I'm

Re: Dynamic array reallocation algorithms

2010-01-12 Thread DM Smith
On Jan 12, 2010, at 6:27 PM, Marvin Humphrey wrote: > Greets, > > I've been trying to understand this comment regarding ArrayUtil.getNextSize(): > > * The growth pattern is: 0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ... > > Maybe I'm missing something, but I can't see how the formula yields su

Re: Dynamic array reallocation algorithms

2010-01-13 Thread DM Smith
On Jan 13, 2010, at 1:00 AM, Marvin Humphrey wrote: > On Tue, Jan 12, 2010 at 10:46:29PM -0500, DM Smith wrote: > >> So starting at 0, the size is 0. >> 0 => 0 >> 0 + 1 => 4 >> 4 + 1 => 8 >> 8 + 1 => 16 >> 16 + 1 => 25 >> 25
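The growth pattern quoted in this thread (0, 4, 8, 16, 25, 35, ...) can be reproduced by asking for one more slot than the current capacity at each step. The sketch below assumes the over-allocation rule is the one used by CPython's list implementation, `(n >> 3) + (n < 9 ? 3 : 6) + n`; that formula is an assumption made for illustration and is not copied from Lucene's `ArrayUtil.getNextSize()`. The class and method names are hypothetical.

```java
// Illustrative sketch only: the formula is an assumption modeled on
// CPython's list over-allocation, not a copy of Lucene's ArrayUtil.
final class GrowthPatternSketch {

    // Hypothetical stand-in for ArrayUtil.getNextSize(int):
    // over-allocate by roughly one eighth plus a small constant.
    static int nextSize(int targetSize) {
        return (targetSize >> 3) + (targetSize < 9 ? 3 : 6) + targetSize;
    }

    // Starts from an empty array and repeatedly requests one more slot
    // than the current capacity, recording each capacity along the way.
    static int[] growthPattern(int steps) {
        int[] sizes = new int[steps];
        int size = 0;
        for (int i = 0; i < steps; i++) {
            sizes[i] = size;
            size = nextSize(size + 1);
        }
        return sizes;
    }

    public static void main(String[] args) {
        // Prints [0, 4, 8, 16, 25, 35, 46, 58, 72, 88]
        System.out.println(java.util.Arrays.toString(growthPattern(10)));
    }
}
```

Under this assumed formula the steps match the arithmetic in the quoted reply: 0 + 1 gives 4, 4 + 1 gives 8, 8 + 1 gives 16, and 16 + 1 gives 25.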

Re: whitespace

2007-03-25 Thread DM Smith
oling to put the code in the whitespace format I like, but before making a patch, return it to what it was, again with tooling. -- DM Smith On Mar 25, 2007, at 12:24 PM, Yonik Seeley wrote: On 3/25/07, Michael McCandless (JIRA) <[EMAIL PROTECTED]> wrote: My first comment, which I fear

Re: enabling java assertions in the tests

2007-05-31 Thread DM Smith
http://java.sun.com/j2se/1.4.2/docs/guide/lang/assert.html -- DM Smith On May 31, 2007, at 6:30 PM, Doron Cohen wrote: While testing LUCENE-866 I realized that Java assertions are disabled when *I* run 'ant test'. Others did have the assertion executed and causing that NPE. So I am not

Re: Lucene 2.2.0 release available

2007-06-19 Thread DM Smith
FYI, The announcement has not made it to the http://lucene.apache.org/ page. On Jun 19, 2007, at 6:13 PM, Michael Busch wrote: Release 2.2.0 of Lucene is now available! Many new features, optimizations, and bug fixes have been added since 2.1, including "point-in-time" searching, payloads,

Re: Author Tags

2007-07-06 Thread DM Smith
the @author is removed from the file, should we make sure that there is a CREDITS.txt for the contrib with the info in it. The reason I ask this is that some times I see posts here asking what the code intentions of the author were. -- DM Smith On Jul 5, 2007, at 6:21 PM, Grant Ingersoll wr

Re: binary at the front of CHANGES.txt

2007-07-17 Thread DM Smith
According to the UTF-8 spec \uFEFF is not a BOM. In UTF-8 the byte order is always the same. UTF-16 defines a BOM. If it is a BOM, I bet it was put in there by an MS editor. On Jul 17, 2007, at 1:58 PM, Yonik Seeley wrote: On 7/17/07, Steven Parkes <[EMAIL PROTECTED]> wrote: Can we get rid

Re: binary at the front of CHANGES.txt

2007-07-18 Thread DM Smith
On Jul 17, 2007, at 8:40 PM, Yonik Seeley wrote: On 7/17/07, DM Smith <[EMAIL PROTECTED]> wrote: According to the UTF-8 spec \uFEFF is not a BOM. In UTF-8 the byte order is always the same. But there is a BOM for UTF-8 (even though there is no endian component, it does serve as a
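For readers following the BOM discussion: U+FEFF encoded in UTF-8 is the three bytes EF BB BF, and a file that starts with them shows up as "binary" at the front of the text. A minimal sketch of detecting and dropping that prefix is shown below. The class and method names are hypothetical, and this is not presented as how the CHANGES.txt file was actually cleaned up.

```java
// Illustrative sketch only; names are hypothetical.
final class Utf8BomSketch {

    // True when the data begins with EF BB BF, the UTF-8 encoding of U+FEFF.
    static boolean startsWithUtf8Bom(byte[] data) {
        return data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF;
    }

    // Returns a copy without the leading BOM, or the input unchanged
    // when no BOM is present.
    static byte[] stripUtf8Bom(byte[] data) {
        return startsWithUtf8Bom(data)
                ? java.util.Arrays.copyOfRange(data, 3, data.length)
                : data;
    }

    public static void main(String[] args) {
        byte[] withBom = "\uFEFFLucene Change Log"
                .getBytes(java.nio.charset.StandardCharsets.UTF_8);
        byte[] cleaned = stripUtf8Bom(withBom);
        System.out.println(new String(cleaned, java.nio.charset.StandardCharsets.UTF_8));
    }
}
```

The check works on raw bytes rather than decoded characters so that it does not depend on how a particular decoder treats a leading U+FEFF.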

Re: The JDK 1.5 Can o' Worms

2007-07-24 Thread DM Smith
ccept 1.5 patches, but don't apply them until back ported. As to what led to this conversation, I bet we can find/invent an acceptable substitute for StringBuilder. -- DM Smith

Re: The JDK 1.5 Can o' Worms

2007-07-25 Thread DM Smith
f classes. GCJ's runtime support is not there yet. http://retroweaver.sourceforge.net/ is probably as valid an option this year as last A lot of contrib code is already 1.5, and it seems about time that core made the move as well. - Mark Grant Ingersoll wrote: On Jul 24, 2007

Re: [VOTE] Migrate Lucene to JDK 1.5 for 3.0 release

2007-07-30 Thread DM Smith
ures won't appear in the 3.x series API. I think it is very important to preserve the Lucene API where possible and reasonable, not changing it without gain. Given that this has been the practice, I don't think it is an issue. -- DM Smith On Jul 26, 2007, at 8:36 PM, Grant

Re: Lucene 2.3 RC 1 available for testing

2008-01-10 Thread DM Smith
Michael Busch wrote: Hi all, I just created the release artifacts (incl. maven artifacts) from the 2.3 branch and uploaded the files to http://people.apache.org/~buschmi/staging_area/lucene_2_3/rc1/. Let's try to use the next days for testing to ensure that we find serious bugs or build proble

Re: A bit of planning

2008-01-13 Thread DM Smith
with how well the 2.x series performs, I am much less inclined to do so. Lucene 2.x is most excellent!! -- DM Smith

Going to Java 5. Was: Re: A bit of planning

2008-01-17 Thread DM Smith
On Jan 17, 2008, at 1:38 AM, Chris Hostetter wrote: : If I remember right, the file format changed in 2.1, such that 2.0 could not : read a 2.1 index. that is totally within the bounds of the compatibility statement... http://wiki.apache.org/lucene-java/BackwardsCompatibility Note that ol

Re: Back Compatibility

2008-01-17 Thread DM Smith
Grant Ingersoll wrote: My reasoning for this solution: Our minor release cycles are currently in the 3-6 months range and our major release cycles are in the 1-1.5 year range. I think giving someone 4-8 (or whatever) months is more than enough time to prepare for API changes. I am not sur

Re: Back Compatibility

2008-01-17 Thread DM Smith
any thanks to you all for such a stable product. -- DM Smith

Re: Back Compatibility

2008-01-23 Thread DM Smith
ts" as well. Again many thanks for all your hard work, DM Smith, a thankful "parasite" :) On Jan 23, 2008, at 5:16 PM, Michael McCandless wrote: chris Hostetter wrote: : I do like the idea of a static/system property to match legacy : behavior. For example, the bugs

Re: Back Compatibility

2008-01-24 Thread DM Smith
This is now a hijacked thread. It is very interesting, but it may be hard to find again. Wouldn't it be better to record this thread differently, perhaps opening a Jira issue to add XA to Lucene? -- DM Doron Cohen wrote: On Jan 24, 2008 6:55 PM, robert engels <[EMAIL PROTECTED]> wrote: T

Re: formatable changes log

2008-01-25 Thread DM Smith
Doron Cohen wrote: As it is becoming hard to browse/navigate CHANGES.txt, how about maintaining it in a simple HTML file? Requirements are: - fancier formatting where adequate. - collapse/expand by release/subject - easy to maintain... Here is an example, containing the current (new) trunk and

Re: detected corrupted index / performance improvement

2008-02-06 Thread DM Smith
On Feb 6, 2008, at 5:42 PM, Michael McCandless wrote: robert engels wrote: Do we have any way of determining if a segment is definitely OK/ VALID ? The only way I know is the CheckIndex tool, and it's rather slow (and it's not clear that it always catches all corruption). Just a thought.

Re: detected corrupted index / performance improvement

2008-02-06 Thread DM Smith
avoid). On Feb 6, 2008, at 5:15 PM, DM Smith wrote: On Feb 6, 2008, at 5:42 PM, Michael McCandless wrote: robert engels wrote: Do we have any way of determining if a segment is definitely OK/ VALID ? The only way I know is the CheckIndex tool, and it's rather slow (and it's n

Re: Lingustically-enhanced indexing for Lucene

2008-02-13 Thread DM Smith
I am very interested in Apertium, especially if it is possible to grow it for biblical Greek and Hebrew. Licensing threads seem to generate more heat than light. I hope that my question won't. I develop code under many different licenses including GPL, and feel that following licenses prope

[jira] Commented: (LUCENE-2396) remove version from contrib/analyzers.

2010-04-15 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857427#action_12857427 ] DM Smith commented on LUCENE-2396: -- Robert, I think this is a red-herring. There

[jira] Commented: (LUCENE-2396) remove version from contrib/analyzers.

2010-04-15 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857456#action_12857456 ] DM Smith commented on LUCENE-2396: -- {quote} So I think we should instead use

[jira] Issue Comment Edited: (LUCENE-2396) remove version from contrib/analyzers.

2010-04-15 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857456#action_12857456 ] DM Smith edited comment on LUCENE-2396 at 4/15/10 2:1

[jira] Commented: (LUCENE-2396) remove version from contrib/analyzers.

2010-04-15 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857487#action_12857487 ] DM Smith commented on LUCENE-2396: -- bq. Well, I think asking for a well-def

[jira] Commented: (LUCENE-2396) remove version from contrib/analyzers.

2010-04-15 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857498#action_12857498 ] DM Smith commented on LUCENE-2396: -- {quote} bq. One mechanism that would wor

[jira] Commented: (LUCENE-2396) remove version from core and contrib analyzers.

2010-04-15 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857543#action_12857543 ] DM Smith commented on LUCENE-2396: -- Hmmm. If we are moving stuff out of core and

[jira] Commented: (LUCENE-2396) remove version from core and contrib analyzers.

2010-04-15 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857595#action_12857595 ] DM Smith commented on LUCENE-2396: -- Humor me. I think I'm not seeing the fores

[jira] Created: (LUCENE-610) BooleanScorer2 does not compile with ecj

2006-06-21 Thread DM Smith (JIRA)
, Fedora Core 5 Reporter: DM Smith BooleanScorer2, derived from scorer, has two inner classes both derived, ultimately, from Scorer. As such they all define doc() or inherit it. ecj produces an error when doc() is called from score in the inner classes in the methods

[jira] Commented: (LUCENE-610) BooleanScorer2 does not compile with ecj

2006-06-21 Thread DM Smith (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-610?page=comments#action_12417186 ] DM Smith commented on LUCENE-610: - I did a bit further testing. The current behavior under Java 5 is to call this.doc(); > BooleanScorer2 does not compile with

[jira] Created: (LUCENE-611) TestConstantScoreRangeQuery does not compile with ecj

2006-06-22 Thread DM Smith (JIRA)
Environment: Eclipse 3.1.2, FC5 Reporter: DM Smith TestConstantScoreRangeQuery has an assertEquals(String, Float, Float) but most of the calls to assertEquals are (String, int, int). ecj complains with the following error: The method assertEquals(String, float, float) is ambiguous for the type
