[
https://issues.apache.org/jira/browse/LUCENE-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jason Rutherglen updated LUCENE-1526:
-
Attachment: LUCENE-1526.patch
Inlined into SegmentTermDocs. If there's an issue with the
>
> That's an amazing number of changes, even when you ignore name changes.
>
DM, for your reference, I created another diff from 4.0->5.1, showing what
will happen with JDK7 here: http://people.apache.org/~rmuir/unicodeDiff2.txt
the problem is that as a search engine library, lucene cares about
actually i thought about this. i change my story.
deprecating anything is stupid, because its still not back compatible, i.e.
Character.isLetter(char) even returns different results now, even if we
invoke it.
hard break is the only solution.
we should have done this deprecation in 2.9, but its c
[
https://issues.apache.org/jira/browse/LUCENE-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jason Rutherglen updated LUCENE-1526:
-
Attachment: LUCENE-1526.patch
Here's a working version of this. The page size is statica
completely ignoring the difficulty, I would propose to fix everything to
correspond with the java 1.5 unicode version, for consistency.
I would exempt StandardTokenizer, because its completely inside our control.
we can fix it at our leisure.
for the rest of this stuff, its already a 'change in ru
So whats your best recommendation? Ignoring the difficulty and just
considering whats best for users?
Robert Muir wrote:
> well, in all honesty there is a bit of complexity.
> i leave the StandardTokenizer out of this, it gives the same results
> regardless of JVM version.
> it may not be correct,
well, in all honesty there is a bit of complexity.
i leave the StandardTokenizer out of this, it gives the same results
regardless of JVM version.
it may not be correct, but its consistent, we could wait till 5.0 or 10.0 to
make it correct :)
Also, because it gives the same results regardless of JV
Robert Muir wrote:
>
>
>> and I think it sucks they might have to reindex twice with the
>> current status of things (we did not complete unicode 4 support
>> in lucene 3.0)
>> which is why i mentioned this problem on the unicode 4 issues im
>> trying to work.
>
> Whether 3.
On Mon, Nov 16, 2009 at 8:17 PM, DM Smith wrote:
>
>
thanks DM, I hope to work on it more soon...
>
> I've been reading the thread and at first my response was. No big deal, it
> won't affect me (i.e. awareness of the problem). And now my thought is "I'm
> hosed" (i.e. understanding)
>
I guess
Add org.apache.lucene.store.FSDirectory.getDirectory()
--
Key: LUCENE-2076
URL: https://issues.apache.org/jira/browse/LUCENE-2076
Project: Lucene - Java
Issue Type: Wish
Component
[
https://issues.apache.org/jira/browse/LUCENE-2039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778685#action_12778685
]
Luis Alves edited comment on LUCENE-2039 at 11/17/09 1:46 AM:
--
[
https://issues.apache.org/jira/browse/LUCENE-2039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778685#action_12778685
]
Luis Alves commented on LUCENE-2039:
+1
I'll work on changing the queryparser on Con
On Nov 16, 2009, at 7:53 PM, Robert Muir wrote:
> right, the only way you could really contain it would be to do something like
> that.
I'm looking forward to your ICU analyzer! IMHO, it would be great to have it be a
pluggable replacement for its counterparts in core. That is, using reflection,
i
[
https://issues.apache.org/jira/browse/LUCENE-2075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778675#action_12778675
]
Earwin Burrfoot commented on LUCENE-2075:
-
There's no such thing in Google Collect
right, the only way you could really contain it would be to do something
like that.
I just think we should make users aware of this, thats all.
and I think it sucks they might have to reindex twice with the current
status of things (we did not complete unicode 4 support in lucene 3.0)
which is why
> Is core lucene really affected by the change? Or is it only contrib? I
> mean, if we couldn't create an index using core with surrogate pairs and
> other Unicode 4.0 stuff (though I'm not clear on the changes), how can it
> change reading/searching the index?
>
>
Sure, especially core analyzers l
On Nov 16, 2009, at 6:43 PM, Robert Muir wrote:
> DM, in this case I'm not referring to surrogates, etc, but instead the idea
> that properties for an existing character can change (the soft hyphen and
> arabic ayah were two examples), also new characters are introduced.
>
> these will affect
[
https://issues.apache.org/jira/browse/LUCENE-2075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778645#action_12778645
]
Jason Rutherglen commented on LUCENE-2075:
--
Solr used CHM as an LRU, however it t
[
https://issues.apache.org/jira/browse/LUCENE-2071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778641#action_12778641
]
Jason Rutherglen commented on LUCENE-2071:
--
I suspect there's apps out in the wil
[
https://issues.apache.org/jira/browse/LUCENE-2071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778624#action_12778624
]
Michael McCandless commented on LUCENE-2071:
I would rather not open up such a
[
https://issues.apache.org/jira/browse/LUCENE-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Uwe Schindler updated LUCENE-2074:
--
Description:
The current trunk version of StandardTokenizerImpl was generated by Java 1.4
(ac
[
https://issues.apache.org/jira/browse/LUCENE-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Uwe Schindler updated LUCENE-2074:
--
Fix Version/s: 3.1
> Use a separate JFlex generated Unicode 4 by Java 5 compatible
> Standard
[
https://issues.apache.org/jira/browse/LUCENE-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Uwe Schindler updated LUCENE-2074:
--
Attachment: LUCENE-2074.patch
Patch for trunk using Version.LUCENE_31
> Use a separate JFlex
[
https://issues.apache.org/jira/browse/LUCENE-2047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778620#action_12778620
]
Michael McCandless commented on LUCENE-2047:
bq. Reopening after every doc cou
[
https://issues.apache.org/jira/browse/LUCENE-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Uwe Schindler updated LUCENE-2074:
--
Attachment: LUCENE-2074-lucene30.patch
This is the patch for version 3.0, that keeps the old j
[
https://issues.apache.org/jira/browse/LUCENE-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778615#action_12778615
]
Robert Muir commented on LUCENE-2074:
-
Uwe, we could fix in 3.1 (but we should commit
[
https://issues.apache.org/jira/browse/LUCENE-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778614#action_12778614
]
Uwe Schindler commented on LUCENE-2074:
---
Should we fix this for 3.0 or not?
The curr
[
https://issues.apache.org/jira/browse/LUCENE-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778598#action_12778598
]
Uwe Schindler edited comment on LUCENE-2074 at 11/16/09 10:18 PM:
--
[
https://issues.apache.org/jira/browse/LUCENE-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778598#action_12778598
]
Uwe Schindler commented on LUCENE-2074:
---
It uses hardcoded char ranges, the parser is
[
https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778586#action_12778586
]
Michael McCandless commented on LUCENE-1458:
Thanks Mark! Hopefully, once 3.0
[
https://issues.apache.org/jira/browse/LUCENE-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778585#action_12778585
]
Robert Muir commented on LUCENE-2074:
-
well, the wikipediatokenizer at least is simila
[
https://issues.apache.org/jira/browse/LUCENE-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778582#action_12778582
]
Uwe Schindler edited comment on LUCENE-2074 at 11/16/09 10:01 PM:
--
[
https://issues.apache.org/jira/browse/LUCENE-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778583#action_12778583
]
Michael McCandless commented on LUCENE-2074:
bq. I feel bad about this whole V
[
https://issues.apache.org/jira/browse/LUCENE-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778582#action_12778582
]
Uwe Schindler commented on LUCENE-2074:
---
bq. Uwe, also, just checking, i don't know
[
https://issues.apache.org/jira/browse/LUCENE-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778580#action_12778580
]
Simon Willnauer commented on LUCENE-2074:
-
bq. The problem is, these are the hard
[
https://issues.apache.org/jira/browse/LUCENE-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778579#action_12778579
]
Robert Muir commented on LUCENE-2074:
-
Uwe, also, just checking, i don't know javacc a
I removed the artifacts from p.a.o, they were not made really public to
java-user. Let's build new ones after the jflex thing is committed.
Uwe
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
> -Original Message-
> From: Uwe Schindl
[
https://issues.apache.org/jira/browse/LUCENE-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778575#action_12778575
]
Uwe Schindler commented on LUCENE-2074:
---
I add the warning to my patch! Thanks. What
[
https://issues.apache.org/jira/browse/LUCENE-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778574#action_12778574
]
Simon Willnauer commented on LUCENE-2074:
-
nothing against the patch! I just used
[
https://issues.apache.org/jira/browse/LUCENE-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Uwe Schindler updated LUCENE-2074:
--
Attachment: LUCENE-2074.patch
Updated patch with comment fixed and dead Token-related code rem
[
https://issues.apache.org/jira/browse/LUCENE-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778571#action_12778571
]
Mark Miller commented on LUCENE-2074:
-
{quote} We should really try hard to find diffe
[
https://issues.apache.org/jira/browse/LUCENE-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778570#action_12778570
]
Simon Willnauer commented on LUCENE-2074:
-
bq. For this one it's not new, it was t
[
https://issues.apache.org/jira/browse/LUCENE-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mark Miller updated LUCENE-2074:
Attachment: jflexwarning.patch
I still think we also still need a more prominent warning system.
[
https://issues.apache.org/jira/browse/LUCENE-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778569#action_12778569
]
Robert Muir commented on LUCENE-2074:
-
I am anti-Version too in a lot of ways. I worry
[
https://issues.apache.org/jira/browse/LUCENE-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778562#action_12778562
]
Uwe Schindler commented on LUCENE-2074:
---
For this one it's not new, it was there bef
Op maandag 16 november 2009 19:09:52 schreef J. Delgado:
> On Mon, Nov 16, 2009 at 9:44 AM, Earwin Burrfoot wrote:
> > This algo is strictly tied to sort-by-score, if I understand it correctly.
> > Lucene has queries and sorting decoupled (except for allowOutOfOrder
> > mess), so implementing it w
[
https://issues.apache.org/jira/browse/LUCENE-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778559#action_12778559
]
Simon Willnauer commented on LUCENE-2074:
-
This might be the wrong place to mentio
Share the Term -> TermInfo cache across threads
---
Key: LUCENE-2075
URL: https://issues.apache.org/jira/browse/LUCENE-2075
Project: Lucene - Java
Issue Type: Improvement
Components: Inde
I opened https://issues.apache.org/jira/browse/LUCENE-2074
It fixes the problem, the patch uses a different impl depending on
matchVersion.
If I commit it now, I would regenerate the rc1 artifacts and release them
tomorrow to java-user. Currently the ones on people.apache.org are only
"known" to j
[
https://issues.apache.org/jira/browse/LUCENE-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Uwe Schindler updated LUCENE-2074:
--
Attachment: LUCENE-2074.patch
Here the patch. It uses an interface containing the needed metho
[
https://issues.apache.org/jira/browse/LUCENE-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Uwe Schindler reassigned LUCENE-2074:
-
Assignee: Uwe Schindler
> Use a separate JFlex generated Unicode 4 by Java 5 compatible
Use a separate JFlex generated Unicode 4 by Java 5 compatible StandardTokenizer
---
Key: LUCENE-2074
URL: https://issues.apache.org/jira/browse/LUCENE-2074
Project: Lucene - J
[
https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778548#action_12778548
]
Mark Miller commented on LUCENE-1458:
-
Merged up - I've got to say - that was a nasty o
[
https://issues.apache.org/jira/browse/LUCENE-2073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778547#action_12778547
]
Robert Muir commented on LUCENE-2073:
-
Mark, I agree, there are two issues I know of:
Document issues involved in building your index with one jdk version and then
searching/updating with another
-
Key: LUCENE-2073
URL: https://issues.apache
[
https://issues.apache.org/jira/browse/LUCENE-1689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778536#action_12778536
]
Robert Muir commented on LUCENE-1689:
-
bq. Then thats what I am saying we should be do
[
https://issues.apache.org/jira/browse/LUCENE-1689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778532#action_12778532
]
Robert Muir commented on LUCENE-1689:
-
Steven, no its definitely the right place to po
[
https://issues.apache.org/jira/browse/LUCENE-1689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778531#action_12778531
]
Mark Miller commented on LUCENE-1689:
-
bq. Mark honestly, I do not yet know how this o
[
https://issues.apache.org/jira/browse/LUCENE-1689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778528#action_12778528
]
Steven Rowe commented on LUCENE-1689:
-
I don't know if this is the right place to poin
OK, I checked. The JFLEX file in trunk was 1.4 generated. I regenerated with
1.5 and it was different (completely!). I saved the old version and renamed
to StandardTokenizerImplJava14 extends StandardTokenizerImpl
By this the impl is exchanged depending on version. The 1.4 version can no
longer be
[
https://issues.apache.org/jira/browse/LUCENE-1689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778527#action_12778527
]
Robert Muir commented on LUCENE-1689:
-
bq. We can fix that too? If so, I think we shou
[
https://issues.apache.org/jira/browse/LUCENE-1689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778526#action_12778526
]
Mark Miller commented on LUCENE-1689:
-
I'm speaking in regards to:
{quote}
btw, its w
[
https://issues.apache.org/jira/browse/LUCENE-1689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778524#action_12778524
]
Robert Muir commented on LUCENE-1689:
-
bq. If there is nothing we can do here, then we
I still recommend we add a file then, HowToRegenJflex.txt or something -
that specifically says to use 1.5 or 1.6. I don't think changing the current
notice/warning is visible enough to ensure someone doesn't break this.
Robert Muir wrote:
> no. its still 4.0, but i hear 1.7 will be 5.1 or 5.2
>
> the on
[
https://issues.apache.org/jira/browse/LUCENE-1689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778516#action_12778516
]
Mark Miller commented on LUCENE-1689:
-
If there is nothing we can do here, then we jus
[
https://issues.apache.org/jira/browse/LUCENE-2072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Simon Willnauer reassigned LUCENE-2072:
---
Assignee: Simon Willnauer
> Upgrade contrib/regex to jakarta-regex 1.5
> --
[
https://issues.apache.org/jira/browse/LUCENE-2069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778515#action_12778515
]
Robert Muir commented on LUCENE-2069:
-
Uwe, we can use matchVersion for all of this, t
[
https://issues.apache.org/jira/browse/LUCENE-2072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Simon Willnauer updated LUCENE-2072:
Attachment: jakarta-regexp-1.5.jar
LUCENE-2072.patch
> Upgrade contrib/reg
Upgrade contrib/regex to jakarta-regex 1.5
---
Key: LUCENE-2072
URL: https://issues.apache.org/jira/browse/LUCENE-2072
Project: Lucene - Java
Issue Type: Improvement
Components: contrib/*
[
https://issues.apache.org/jira/browse/LUCENE-2069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778514#action_12778514
]
Uwe Schindler commented on LUCENE-2069:
---
we can change it whenever we want, we must
no. its still 4.0, but i hear 1.7 will be 5.1 or 5.2
the only way to truly control this, would be to use something like ICU to
control the unicode version being used (and actually be faster, and support
higher version).
see http://site.icu-project.org/home/why-use-icu4j
the issue is that lucene d
[
https://issues.apache.org/jira/browse/LUCENE-2069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778510#action_12778510
]
Robert Muir commented on LUCENE-2069:
-
Simon, yes see LUCENE-1689.
this is my questio
Did 1.6 change the unicode version? Robert?
-
UWE SCHINDLER
Webserver/Middleware Development
PANGAEA - Publishing Network for Geoscientific and Environmental Data
MARUM - University of Bremen
Room 2500, Leobener Str., D-28359 Bremen
Tel.: +49 421 218 65595
Fax: +49 421 218 65505
http://www.pa
[
https://issues.apache.org/jira/browse/LUCENE-2069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778509#action_12778509
]
Simon Willnauer commented on LUCENE-2069:
-
we might need a changes.txt entry here
And what happens when someone regenerates it with 1.6 without knowing?
Uwe Schindler wrote:
> I check this by generating the file with 1.4 and 1.5. The 1.4 version will
> not change anymore, so we just leave the java file no jflex anymore. The old
> one is used for Lucene until 2.9, if you use mat
[
https://issues.apache.org/jira/browse/LUCENE-2069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778508#action_12778508
]
Robert Muir commented on LUCENE-2069:
-
Simon, those "weird" chars are indeed real code
mark these are similar to my concerns with us doing unicode 4.0 (suppl.
characters, etc) support in 3.1.
this is why i left a comment on LUCENE-1689, I'm pretty confused about what
approach we should take, because technically, fixing this will break things.
and again, I do believe we should have f
[
https://issues.apache.org/jira/browse/LUCENE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778507#action_12778507
]
Simon Willnauer commented on LUCENE-2068:
-
We will get this in once 3.0 is out. I
I checked this by generating the file with 1.4 and 1.5. The 1.4 version will
not change anymore, so we just keep the Java file, without a jflex file. The old
one is used for Lucene up to 2.9; if you use matchVersion=LUCENE_30, the new
one is used, which can also be regenerated.
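The selection described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the LUCENE-2074 patch: the enum MatchVersion, the interface Scanner, and both implementation classes are hypothetical stand-ins, and the cut-over at LUCENE_30 is taken from this message.

```java
public class TokenizerImplSelector {

    // Hypothetical stand-in for Lucene's Version constants; declaration order is release order.
    public enum MatchVersion { LUCENE_24, LUCENE_29, LUCENE_30 }

    // Hypothetical minimal interface shared by both generated scanners.
    public interface Scanner {
        String describe();
    }

    // Frozen copy of the scanner generated under Java 1.4; kept as a plain Java file, never regenerated.
    public static final class FrozenJava14Scanner implements Scanner {
        public String describe() {
            return "generated with Java 1.4, frozen";
        }
    }

    // Scanner regenerated from the jflex file under Java 5; this one may be regenerated again.
    public static final class Java5Scanner implements Scanner {
        public String describe() {
            return "generated with Java 5, regenerable";
        }
    }

    // Picks the implementation from the requested match version:
    // anything before LUCENE_30 keeps the old behaviour, LUCENE_30 and later get the new one.
    public static Scanner select(MatchVersion matchVersion) {
        if (matchVersion.compareTo(MatchVersion.LUCENE_30) >= 0) {
            return new Java5Scanner();
        }
        return new FrozenJava14Scanner();
    }

    public static void main(String[] args) {
        for (MatchVersion v : MatchVersion.values()) {
            System.out.println(v + " -> " + select(v).describe());
        }
    }
}
```

The point of keeping the old class frozen is that an index built with an older match version keeps being tokenized the same way, whichever JVM later regenerates the new scanner.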
-
Uwe Schindler
H.-H.
[
https://issues.apache.org/jira/browse/LUCENE-2069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778504#action_12778504
]
Simon Willnauer commented on LUCENE-2069:
-
Robert, I assume you did use those weir
Good point - and that likely means the current warning is not working -
what can we do to improve it?
Perhaps a new text file called jflexregen or something, and it
specifically says you must use java 1.5?
Uwe Schindler wrote:
>
> I think the regenerated code in Standard is since years no longer
I would rename the java file/class and write a big warning on it: for
version < 3.0. Do not recreate (which cannot be done, because the jflex file is
missing). The current jflex file is recreated and is now the officially
supported 1.5 version. The 1.4 version will never change!
-
Uwe Schindler
H.-
This is a big deal, whether it's JDK or Lucene related. We are forcing
those on 1.4 to move to 1.5 - any problems you face with that with the
JDK are Lucene problems if they affect Lucene. We need big clear
warnings about this - we should have had them before we pushed users
to 1.5 as well if I a
We support 3.0, why do you tend to say something other? I will always fix
the bug first in 3.0 and then merge (perhaps) back to 2.9.
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
_
From: Erick Erickson [mailto:erickerick...@gmail.
Steven, I think we can be almost sure of no latin-1 changes.
what do you think about this jflex situation though?
it seems like a mess, is there anything we can do before the jflex 1.5 stuff
that is going on now (where we could actually link Version to the unicode
version jflex uses explicitly?)
I have to regenerate the JFlex files to be sure that they are Java 5. Should
I do that and recreate the artifacts? They are not yet released.
Correct would be to copy the current generated Java file and use it if
matchVersion < Version.LUCENE_30. For 3.0++ we have a new one. If the old
one is really
Oops, stupid mouse made me send a blank message.
Ok, I withdraw the question since there *are* good reasons to put
3.0 in a prod environment. It's also an easier thing to say "new Lucene
users should start with 3.0" rather than "new Lucene users should
start with 3.1. Use 3.0 until we release 3.1
On Mon, Nov 16, 2009 at 2:03 PM, Uwe Schindler wrote:
> Hi Erick,
>
>
>
> 3.0 is **not** unsupported or beta release, it is the cleaned up 2.9.1
> release. You are right, it is not needed for 2.9.1 users to upgrade (but
> they can), but for new users starting with Lucene, the recommendation is t
i suppose we are ok then, except for the fact that now StandardTokenizer is
working with a unicode 3.0 definition, instead of the unicode version (4.0)
that corresponds to our required minimum jre (1.5)...
sorry if i raised a stink about nothing, but you see my concerns maybe?
On Mon, Nov 16, 200
JFlex was not regenerated as far as I know, but if somebody did, its already
broken.
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
_
From: Robert Muir [mailto:rcm...@gmail.com]
Sent: Monday, November 16, 2009 8:53 PM
To: java-
I think the generated code in Standard has not been generated with 1.4
for years :-) Most developers use 1.5 or even 1.6. So it has already changed
incompatibly.
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
_
From: Robert Muir
[
https://issues.apache.org/jira/browse/LUCENE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Simon Willnauer reassigned LUCENE-2068:
---
Assignee: Simon Willnauer
> fix reverseStringFilter for unicode 4.0
> --
[
https://issues.apache.org/jira/browse/LUCENE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Simon Willnauer updated LUCENE-2068:
Attachment: LUCENE_2068.patch
removed static import
> fix reverseStringFilter for unicode
[
https://issues.apache.org/jira/browse/LUCENE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778487#action_12778487
]
Simon Willnauer commented on LUCENE-2068:
-
bq. I just think we should use a consis
btw, so heres a great example. you are backwards broken regardless of JVM
for StandardTokenizer, because we used 1.4 JRE to run jflex in 2.9, but 1.5
in 3.0, right?
On Mon, Nov 16, 2009 at 2:51 PM, Robert Muir wrote:
> Uwe, thats probably a good solution I think. just as long as we document
> so
Uwe, thats probably a good solution I think. just as long as we document
somewhere,
I think there is some warning verbiage in StandardTokenizer already about
this.
NOTE: if you change StandardTokenizerImpl.jflex and need to regenerate
the tokenizer, remember to use JRE 1.4 to run jflex (befor
But it is a general warning that should be placed in the Wiki: If you
upgrade from Java 1.4 to Java 5, think about reindexing.
It has definitely nothing to do with 3.0, because users could have changed
(and most of them have) before.
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http
right, my point is its true its nothing to do with Lucene at all, really.
but the reality is we should clarify this to users I think.
Its especially complex in the current StandardTokenizer, which uses a mix of
hardcoded ranges and properties, can you tell me if you should reindex for
given langu
We tried out: Character.getType() for these two chars:
Java 5:
'\u00AD' = 16
'\u06DD' = 16
Java 1.4:
'\u00AD' = 20
'\u06DD' = 7
The first is the soft hyphen.
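To repeat this check on whatever JVM is at hand, something like the following minimal sketch should do; the class name CharTypeProbe is made up for illustration. For reference, 16 is Character.FORMAT, 20 is Character.DASH_PUNCTUATION and 7 is Character.ENCLOSING_MARK.

```java
public class CharTypeProbe {

    // Returns the general-category constant the running JVM assigns to the given char.
    public static int typeOf(char c) {
        return Character.getType(c);
    }

    public static void main(String[] args) {
        // The two characters discussed in this thread: soft hyphen and Arabic end of ayah.
        char[] probes = { '\u00AD', '\u06DD' };
        System.out.println("java.version = " + System.getProperty("java.version"));
        for (char c : probes) {
            System.out.printf("U+%04X = %d%n", (int) c, typeOf(c));
        }
    }
}
```

Running it on both the JVM that built an index and the JVM that searches it shows whether the two disagree about these characters.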
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
_
From: Robert Mu
But most people already use 1.5 or 1.6 even with 2.9. They could also switch
before. The problem is the used JVM not the used Lucene Version. And you can
also run Lucene 1.4.3 with Java 5 -> same problem. If people change their
Java Version, they have to take care what changed.
The only thing: