And what happens when someone regenerates it with 1.6 without knowing?

Uwe Schindler wrote:
> I check this by generating the file with 1.4 and 1.5. The 1.4 version will
> not change anymore, so we just leave the java file no jflex anymore. The old
> one is used for Lucene until 2.9, if you use matchVersion=LUCENE_30, the new
> one is used, which can also be regenerated.
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>> -----Original Message-----
>> From: Mark Miller [mailto:markrmil...@gmail.com]
>> Sent: Monday, November 16, 2009 9:21 PM
>> To: java-dev@lucene.apache.org
>> Subject: Re: Why release 3.0?
>>
>> Good point - and that likely means the current warning is not working -
>> what can we do to improve it?
>>
>> Perhaps a new text file called jflexregen or something, and it
>> specifically says you must use java 1.5?
>>
>> Uwe Schindler wrote:
>>
>>> I think the regenerated code in Standard is since years no longer
>>> generated with 1.4 :) Most developers use 1.5 or even 1.6. So it
>>> already changed incompatible.
>>>
>>> -----
>>> Uwe Schindler
>>> H.-H.-Meier-Allee 63, D-28213 Bremen
>>> http://www.thetaphi.de
>>> eMail: u...@thetaphi.de
>>>
>>> ------------------------------------------------------------------------
>>>
>>> *From:* Robert Muir [mailto:rcm...@gmail.com]
>>> *Sent:* Monday, November 16, 2009 8:52 PM
>>> *To:* java-dev@lucene.apache.org
>>> *Subject:* Re: Why release 3.0?
>>>
>>> Uwe, thats probably a good solution I think. just as long as we
>>> document somewhere,
>>> I think there is some warning verbiage in StandardTokenizer already
>>> about this.
>>>
>>> NOTE: if you change StandardTokenizerImpl.jflex and need to regenerate
>>> the tokenizer, remember to use JRE 1.4 to run jflex (before
>>> Lucene 3.0). This grammar now uses constructs (eg :digit:,
>>> :letter:) whose meaning can vary according to the JRE used to
>>> run jflex.
>>> See https://issues.apache.org/jira/browse/LUCENE-1126 for details.
>>>
>>> On Mon, Nov 16, 2009 at 2:50 PM, Uwe Schindler <u...@thetaphi.de> wrote:
>>>
>>> But it is a general warning that should be placed in the Wiki: If you
>>> upgrade from Java 1.4 to Java 5, think about reindexing.
>>>
>>> It has definitely nothing to do with 3.0, because users could have
>>> changed (and most of them have) before.
>>>
>>> -----
>>> Uwe Schindler
>>> H.-H.-Meier-Allee 63, D-28213 Bremen
>>> http://www.thetaphi.de
>>> eMail: u...@thetaphi.de
>>>
>>> ------------------------------------------------------------------------
>>>
>>> *From:* Robert Muir [mailto:rcm...@gmail.com]
>>> *Sent:* Monday, November 16, 2009 8:45 PM
>>> *To:* java-dev@lucene.apache.org
>>> *Subject:* Re: Why release 3.0?
>>>
>>> right, my point is its true its nothing to do with Lucene at all, really.
>>> but the reality is we should clarify this to users I think.
>>>
>>> Its especially complex in the current StandardTokenizer, which uses a
>>> mix of hardcoded ranges and properties, can you tell me if you should
>>> reindex for given language X?
>>> I wouldn't want to answer that question right now.
>>>
>>> On Mon, Nov 16, 2009 at 2:42 PM, Uwe Schindler <u...@thetaphi.de> wrote:
>>>
>>> We tried out: Character.getType() for these two chars:
>>>
>>> Java 5:
>>> '\u00AD' = 16
>>> '\u06DD' = 16
>>>
>>> Java 1.4:
>>> '\u00AD' = 20
>>> '\u06DD' = 7
>>>
>>> The first is the soft hyphen.
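
For anyone who wants to repeat that check on their own JVM, here is a
minimal sketch. It is not part of Lucene; the class name UnicodeTypeCheck
and the helper categoryOf are made up for illustration. It only calls
Character.getType(), which returns the general-category constant that the
running JVM assigns to a character. Going by the numbers quoted here, 16 is
Character.FORMAT, 20 is Character.DASH_PUNCTUATION and 7 is
Character.ENCLOSING_MARK, so a result other than 16 would suggest the JVM
is still on the older Unicode tables.

```java
// Hypothetical helper, not Lucene code: reports how the running JVM
// classifies the two characters discussed in this thread.
class UnicodeTypeCheck {

    /** Returns the general-category code the running JVM assigns to c. */
    static int categoryOf(char c) {
        return Character.getType(c);
    }

    public static void main(String[] args) {
        // U+00AD is the soft hyphen, U+06DD is the Arabic end of ayah.
        char[] samples = { '\u00AD', '\u06DD' };

        System.out.println("java.version = " + System.getProperty("java.version"));
        for (char c : samples) {
            // Print the code point in hex next to its category code.
            System.out.printf("U+%04X -> %d%n", (int) c, categoryOf(c));
        }
    }
}
```

Running this under the same JRE that will run jflex would show which
character properties the regenerated tokenizer is going to pick up.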
>>>
>>> -----
>>> Uwe Schindler
>>> H.-H.-Meier-Allee 63, D-28213 Bremen
>>> http://www.thetaphi.de
>>> eMail: u...@thetaphi.de
>>>
>>> ------------------------------------------------------------------------
>>>
>>> *From:* Robert Muir [mailto:rcm...@gmail.com]
>>> *Sent:* Monday, November 16, 2009 8:37 PM
>>> *To:* java-dev@lucene.apache.org
>>> *Subject:* Re: Why release 3.0?
>>>
>>> right, its nothing to do with lucene, instead due to property changes,
>>> etc.
>>>
>>> i just think we should inform users on java 1.4/2.9 that if they
>>> upgrade to java 1.5/3.0, they should reindex.
>>>
>>> the reason i say this about properties, is there are some that change
>>> that will affect tokenizers, i give two examples, a hyphen that
>>> changes from punctuation to format (might affect SolrWordDelimiterFilter),
>>> and arabic ayah which changes from NSM to format, which surely affects
>>> ArabicLetterTokenizer.
>>>
>>> On Mon, Nov 16, 2009 at 2:33 PM, Steven A Rowe <sar...@syr.edu> wrote:
>>>
>>> Hi Robert,
>>>
>>> I agree that the Unicode version supported by the JVM, as you say,
>>> really has nothing to do with Lucene.
>>>
>>> The disruption here is users' upgrading from Java 1.4 to 1.5+, not
>>> when they upgrade Lucene. I'd guess with few exceptions that most
>>> people have been using Lucene with 1.5+ for a couple of years now, though.
>>>
>>> But even the upgrade from Java 1.4 to 1.5+ will have (had) zero impact
>>> on most Lucene users, assuming that most use Latin-1 exclusively;
>>> although I haven't looked, I'd be surprised if Latin-1 characters
>>> changed much, if at all, from Unicode 3.0 to 4.0.
>>>
>>> It would be useful, I think, to include (a pointer to?)
>>> a description of the details of the Unicode 3.0->4.0 differences in the
>>> Lucene 3.0 release notes, since the minimum required Java version, and
>>> so also the supported Unicode version, changes then.
>>>
>>> Steve
>>>
>>> On 11/16/2009 at 2:15 PM, Robert Muir wrote:
>>>
>>>> the problem is that the properties have changed for various characters,
>>>> and new characters were added.
>>>>
>>>> it really has nothing to do with lucene, but the idea you can go from
>>>> jdk 1.4/lucene 2.9 to jdk 1.5/lucene3.0 without reindexing is not true.
>>>>
>>>> On Mon, Nov 16, 2009 at 2:12 PM, Uwe Schindler <u...@thetaphi.de> wrote:
>>>>
>>>> But an UTF-8 stream from Java 4 can still be read with Java 5,
>>>> what is the problem? Java 5 extended Unicode support, but an index
>>>> created with older versions can still be read. UTF-8 is standardized.
>>>>
>>>> -----
>>>> Uwe Schindler
>>>> H.-H.-Meier-Allee 63, D-28213 Bremen
>>>> http://www.thetaphi.de
>>>> eMail: u...@thetaphi.de
>>>>
>>>> ________________________________
>>>>
>>>> From: Robert Muir [mailto:rcm...@gmail.com]
>>>> Sent: Monday, November 16, 2009 8:09 PM
>>>> To: java-dev@lucene.apache.org
>>>> Subject: Re: Why release 3.0?
>>>>
>>>> uwe, on topic please read my comment on LUCENE-1689, because
>>>> unicode version was bumped in jdk 1.5, i believe this index backwards
>>>> compatibility is only theoretical
>>>>
>>>> On Mon, Nov 16, 2009 at 2:05 PM, Uwe Schindler <u...@thetaphi.de> wrote:
>>>>
>>>> 2.9 has *not* the same format as 3.0, an index created with 3.0
>>>> cannot be read with 2.9. This is because compressed field support was
>>>> removed and therefore the version number of the stored fields file was
>>>> upgraded.
>>>> But indexes from 2.9 can be read with 3.0 and support may get
>>>> removed in 4.0. 3.0 Indexes can be read until version 4.9.
>>>>
>>>> Uwe
>>>>
>>>> -----
>>>> Uwe Schindler
>>>> H.-H.-Meier-Allee 63, D-28213 Bremen
>>>> http://www.thetaphi.de
>>>> eMail: u...@thetaphi.de
>>>>
>>>> ________________________________
>>>>
>>>> From: Jake Mannix [mailto:jake.man...@gmail.com]
>>>> Sent: Monday, November 16, 2009 7:15 PM
>>>> To: java-dev@lucene.apache.org
>>>> Subject: Re: Why release 3.0?
>>>>
>>>> Don't users need to upgrade to 3.0 because 3.1 won't be necessarily
>>>> able to read your 2.4 index file formats? I suppose if you've already
>>>> upgraded to 2.9, then all is well because 2.9 is the same format as
>>>> 3.0, but we can't assume all users upgraded from 2.4 to 2.9.
>>>>
>>>> If you've done that already, then 3.0 might not be necessary, but if
>>>> you're on 2.4 right now, you will be in for a bad surprise if you try
>>>> to upgrade to 3.1.
>>>>
>>>> -jake
>>>>
>>>> On Mon, Nov 16, 2009 at 10:10 AM, Erick Erickson
>>>> <erickerick...@gmail.com> wrote:
>>>>
>>>> One of my "specialties" is asking obvious questions just to see
>>>> if everyone's assumptions are aligned. So with the discussion about
>>>> branching 3.0 I have to ask "Is there going to be any 3.0 release
>>>> intended for *production*?". And if not, would we save a lot of
>>>> work by just not worrying about retrofitting fixes to a 3.0 branch
>>>> and carrying on with 3.1 as the first *supported* 3.x release?
>>>>
>>>> Since 3.0 is "upgrade-to-java5 and remove deprecations", I'm not
>>>> sure *as a user* I see a good reason to upgrade to 3.0. Getting a
>>>> "beta/snapshot" release to get a head start on cleaning up my code
>>>> does seem worthwhile, if I have the spare time.
>>>> And having a base 3.0 version that's not changing all over the place
>>>> would be useful for that.
>>>>
>>>> That said, I'm also not terribly comfortable with a "release"
>>>> that's out there and unsupported.
>>>>
>>>> Apologies if this has already been discussed, but I don't
>>>> remember it. Although my memory isn't what it used to be (but
>>>> some would claim it never was<G>)...
>>>>
>>>> Erick
>>>
>>> --
>>> Robert Muir
>>> rcm...@gmail.com
>>
>> --
>> - Mark
>>
>> http://www.lucidimagination.com
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: java-dev-h...@lucene.apache.org

--
- Mark

http://www.lucidimagination.com