Thanks for the replies. After further investigation I have been able to
track this down to an extra sort that was occurring further along the code
path after the results have been retrieved from Lucene.
I have deleted this extra sort and set Lucene to first sort by LastName
and FirstName (as 2
simon DOT willnauer AT googlemail DOT com
thx
On Sun, Jul 25, 2010 at 11:05 PM, Peter Karman pe...@peknet.com wrote:
Marvin Humphrey wrote on 7/25/10 3:07 PM:
peter AT peknet DOT com
yes, please.
--
Peter Karman . http://peknet.com/ . pe...@peknet.com
[
https://issues.apache.org/jira/browse/LUCY-121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892552#action_12892552
]
Marvin Humphrey commented on LUCY-121:
--
The device of searching TESS and CTM-ONLINE
Hi
I was running tests on trunk (after merging the changes from LUCENE-2537)
and received this error message:
expected:<true> but was:<false>
junit.framework.AssertionFailedError: expected:<true> but was:<false>
at
org.apache.lucene.util.automaton.TestUTF32ToUTF8.assertAutomaton(TestUTF32ToUTF8.java:197)
at
The config was corrupt (missing modules checkout). Fixed and restarted.
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
-Original Message-
From: Apache Hudson Server [mailto:hud...@hudson.zones.apache.org]
Sent: Monday, July 26,
[
https://issues.apache.org/jira/browse/SOLR-1240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1289#action_1289
]
Gijs Kunze commented on SOLR-1240:
--
{quote}
bq. p.s. I noticed the start parameter was
[
https://issues.apache.org/jira/browse/SOLR-1553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892233#action_12892233
]
Graham P commented on SOLR-1553:
Please ensure that the edismax does not have old DisMax
[
https://issues.apache.org/jira/browse/SOLR-1553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892233#action_12892233
]
Graham P edited comment on SOLR-1553 at 7/26/10 5:37 AM:
-
Please
[
https://issues.apache.org/jira/browse/SOLR-1553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892233#action_12892233
]
Graham P edited comment on SOLR-1553 at 7/26/10 5:39 AM:
-
Please
See http://hudson.zones.apache.org/hudson/job/Lucene-trunk/1249/
Hmmm this means a bug is lurking. This is the power of random testing
(that every time we all run tests, we're testing different paths
through the code)
It seems exceptionally unlikely that LUCENE-2537's changes would cause this!
But, unfortunately, when I plug that seed in I don't see it
I agree, Shai can you open a bug? I cannot reproduce, did you use an IBM JVM
or another environment that might help us figure it out?
On Mon, Jul 26, 2010 at 6:29 AM, Michael McCandless
luc...@mikemccandless.com wrote:
Hmmm this means a bug is lurking. This is the power of random testing
[
https://issues.apache.org/jira/browse/LUCENE-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892285#action_12892285
]
Mark Harwood commented on LUCENE-2557:
--
I think we're agreed that the effects of IDF
On a more general note...
Any time any of you out there hit an odd test failure, please please
please do just what Shai did: take it to the dev list!
Think of Lucene's unit tests like SETI :) We are desperately seeking
bugs, and you and your machine may just be lucky enough to find one...
go
[
https://issues.apache.org/jira/browse/LUCENE-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892290#action_12892290
]
Robert Muir commented on LUCENE-2557:
-
bq. I think we're agreed that the effects of
[
https://issues.apache.org/jira/browse/LUCENE-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892297#action_12892297
]
Robert Muir commented on LUCENE-2557:
-
so here is an option for this issue. we could
[
https://issues.apache.org/jira/browse/SOLR-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Robert Muir updated SOLR-1860:
--
Attachment: SOLR-1860.patch
here is a first step, 2 of the analyzers (Brazilian, Czech) use embedded
wordlistloader is inefficient
-
Key: LUCENE-2564
URL: https://issues.apache.org/jira/browse/LUCENE-2564
Project: Lucene - Java
Issue Type: Bug
Components: contrib/analyzers
Reporter: Robert
[
https://issues.apache.org/jira/browse/SOLR-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892304#action_12892304
]
Robert Muir commented on SOLR-2015:
---
Even for the euro-languages where people think this
[
https://issues.apache.org/jira/browse/LUCENE-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892311#action_12892311
]
Mark Harwood commented on LUCENE-2557:
--
bq. I dont understand why we need to average
TestUTF32ToUTF8 can run forever
---
Key: LUCENE-2565
URL: https://issues.apache.org/jira/browse/LUCENE-2565
Project: Lucene - Java
Issue Type: Bug
Reporter: Michael McCandless
Assignee:
[
https://issues.apache.org/jira/browse/SOLR-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892314#action_12892314
]
Yonik Seeley commented on SOLR-2015:
bq. is wi fi, then this will not turn into a
[
https://issues.apache.org/jira/browse/LUCENE-2564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892316#action_12892316
]
Robert Muir commented on LUCENE-2564:
-
There are more problems with this loader... it
[
https://issues.apache.org/jira/browse/SOLR-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892318#action_12892318
]
Robert Muir commented on SOLR-2015:
---
bq. Many people have asked how to do this sort of
[
https://issues.apache.org/jira/browse/SOLR-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892319#action_12892319
]
Yonik Seeley commented on SOLR-2015:
What would the fieldType for a generic
[
https://issues.apache.org/jira/browse/SOLR-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892319#action_12892319
]
Yonik Seeley edited comment on SOLR-2015 at 7/26/10 10:23 AM:
--
[
https://issues.apache.org/jira/browse/SOLR-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892321#action_12892321
]
Robert Muir commented on SOLR-2015:
---
bq. What would the fieldType for a generic
[
https://issues.apache.org/jira/browse/SOLR-1553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892323#action_12892323
]
Yonik Seeley commented on SOLR-1553:
Interesting... edismax does currently treat (bar -
thanks Koji!
On Mon, Jul 26, 2010 at 10:21 AM, k...@apache.org wrote:
Author: koji
Date: Mon Jul 26 14:21:47 2010
New Revision: 979302
URL: http://svn.apache.org/viewvc?rev=979302&view=rev
Log:
SOLR-1984: typo in javadoc
Modified:
Sorry for the delayed response.
I ran it a couple more times, from Eclipse and Ant, and each time it fails
(amazing !), w/ different seeds. More seeds that fail:
NOTE: random seed of testcase 'testRandomRegexes' was: -4244174191361080127
NOTE: random seed of testcase 'testRandomRegexes' was:
Check project name issues
-
Key: LUCY-121
URL: https://issues.apache.org/jira/browse/LUCY-121
Project: Lucy
Issue Type: Task
Components: Other
Reporter: Marvin Humphrey
Assignee:
+ - operators allow any amount of whitespace
Key: LUCENE-2566
URL: https://issues.apache.org/jira/browse/LUCENE-2566
Project: Lucene - Java
Issue Type: Bug
Reporter: Yonik Seeley
Tried to run it w/ SUN JRE6 and it succeeds ! I've tried several times and
it succeeds every time. However, when I revert back to IBM's, it fails
immediately.
I can help w/ the debug, if you give me a hint where to look :).
Shai
On Mon, Jul 26, 2010 at 5:57 PM, Shai Erera ser...@gmail.com wrote:
sounds nasty... it's good you are running the tests with this different
JVM...
On Mon, Jul 26, 2010 at 11:21 AM, Shai Erera ser...@gmail.com wrote:
Tried to run it w/ SUN JRE6 and it succeeds ! I've tried several times and
it succeeds every time. However, when I revert back to IBM's, it fails
It occurs to me that the proper static initializer code might well be able to
generate distances of 3, 4 or whatever, without bloating the jar.
Nevertheless, the real question of import to me right now is: what
“minimumDistance” value corresponds to a Levenshtein distance of 1? 2? The
[
https://issues.apache.org/jira/browse/SOLR-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892330#action_12892330
]
Michael McCandless commented on SOLR-2015:
--
bq. The problem is not that we have an
On Mon, Jul 26, 2010 at 10:57 AM, Shai Erera ser...@gmail.com wrote:
Sorry for the delayed response.
I ran it a couple more times, from Eclipse and Ant, and each time it fails
(amazing !), w/ different seeds. More seeds that fail:
NOTE: random seed of testcase 'testRandomRegexes' was:
On Mon, Jul 26, 2010 at 11:35 AM, Robert Muir rcm...@gmail.com wrote:
On Mon, Jul 26, 2010 at 11:30 AM, karl.wri...@nokia.com wrote:
It occurs to me that the proper static initializer code might well be able
to generate distances of 3, 4 or whatever, without bloating the jar.
I disagree:
Ok I've dug deeper into the test. I set the random seed to
-9029631602016965389L in setUp(), and discovered that on the 4th iteration
it breaks. For some reason though, AutomatonTestUtil.randomRegex generates
different strings every time I run the test, even though it uses the same
Random object
sorry, i screwed up the name of the test, i meant TestRegexpRandom2
On Mon, Jul 26, 2010 at 11:46 AM, Robert Muir rcm...@gmail.com wrote:
hmm maybe the bug is in AutomatonTestUtil.randomRegex?
can you do me a favor and run -Dtestcase=TestRandomRegex2
This testcase also uses this same
I wanted to show you what i mean, just so you know:
here is what 'ant createLevAutomata' does now:
createLevAutomata:
[exec] Wrote Lev1ParametricDescription.java [102 lines; 3.7 KB]
[exec] Wrote Lev2ParametricDescription.java [147 lines; 9.6 KB]
BUILD SUCCESSFUL
Total time: 4 seconds
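
For anyone following the edit-distance discussion: judging by their names, the two generated files cover Levenshtein distances 1 and 2 (that reading is an assumption, not something the build output states). As a reference point, here is a minimal sketch of the classic dynamic-programming Levenshtein distance in plain Java. It is not the automaton code from the build; the class and method names are made up for illustration, and it compares UTF-16 chars rather than code points.

```java
public final class EditDistance {
    private EditDistance() {}

    /**
     * Classic two-row dynamic-programming Levenshtein distance.
     * Hypothetical helper for illustration only; compares UTF-16 chars.
     */
    public static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];

        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }

        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // deleting i chars from a
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insert = curr[j - 1] + 1;
                int delete = prev[j] + 1;
                int substitute = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insert, delete), substitute);
            }
            // Reuse the two rows instead of allocating a full matrix.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("lucene", "lucine"));  // one substitution
        System.out.println(levenshtein("kitten", "sitting")); // textbook example
    }
}
```

A term within distance 1 or 2 of the query term under this definition is what a fuzzy match with that maximum edit distance would be expected to accept.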
[
https://issues.apache.org/jira/browse/SOLR-2014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ivan Small updated SOLR-2014:
-
Summary: Allow BF parameter to accept complex nested expressions with
whitespace sprinkled throughout
[
https://issues.apache.org/jira/browse/LUCENE-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892341#action_12892341
]
Eks Dev commented on LUCENE-2557:
-
It looks like we have one invariant:
IDF(QueryTerm) =
[
https://issues.apache.org/jira/browse/LUCENE-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892344#action_12892344
]
Mark Harwood commented on LUCENE-2557:
--
bq. Fixing all expansions to IDF(QT) would
Is the automaton creation code checked in anywhere?
Also, if it’s not possible to generate the automaton table on the fly, perhaps
it could be included in an optional jar (whose existence would be queried by
reflection). The reason I’m interested in LD’s of three, especially, is
because I can
On Mon, Jul 26, 2010 at 12:51 PM, karl.wri...@nokia.com wrote:
Is the automaton creation code checked in anywhere?
yes, it's generated from lucene/build.xml (createLevAutomata task)
Also, if it’s not possible to generate the automaton table on the fly,
perhaps it could be included in an
I really wouldn't recommend this (though I am not sure what exactly you are
trying to do); instead I would recommend analyzers/phonetic to deal with the
phonetic stuff.
What I want to capture is situations where people misspell things in roughly a
phonetic way. For example, “Tchaikovsky
My random stress testing hit an IllegalArgExc because the random
regexp was malformed.
Does this patch look OK to fix?
Index: src/test/org/apache/lucene/search/TestRegexpRandom2.java
===
---
maybe you can try this on your beast computer for a while? I think it's
better.
Index: lucene/src/test/org/apache/lucene/search/TestRegexpRandom2.java
===
--- lucene/src/test/org/apache/lucene/search/TestRegexpRandom2.java (revision
OK lemme try...
Mike
On Mon, Jul 26, 2010 at 1:47 PM, Robert Muir rcm...@gmail.com wrote:
maybe you can try this on your beast computer for a while? I think it's
better.
Index: lucene/src/test/org/apache/lucene/search/TestRegexpRandom2.java
Nah, it's an analyzer, so you can just use TermQuery (fast: exact match).
At query and index time it just maps stuff to a key... typically you would
just put this in a separate field.
You can combine this with your edit distance query with a BooleanQuery, for
example the edit distance can handle
[
https://issues.apache.org/jira/browse/LUCENE-2567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892396#action_12892396
]
Jason Rutherglen commented on LUCENE-2567:
--
Further thinking about the RT terms
Just a thought: edit distance is meant for overcoming spelling errors in
the form of assimilations or mistypes. In your case there is a limited number
of cases that need special care, and you can actually define most of
them pretty well - hence edit distance is by definition much more than
you
maybe there is a bug in ibm's random generator :)
On Mon, Jul 26, 2010 at 11:50 AM, Michael McCandless
luc...@mikemccandless.com wrote:
That's VERY spooky that w/ a fixed seed you see different random
regexps being made.
Mike
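
On the fixed-seed point: java.util.Random is documented to return the same sequence for the same seed and the same sequence of calls, so a difference between runs would normally point at something else consuming the generator. Below is a minimal sketch of that check in plain Java. The seed is the one quoted earlier in the thread; the class name and the randomString helper are made up for illustration and are not AutomatonTestUtil.randomRegex.

```java
import java.util.Random;

public final class SeededRandomDemo {
    private SeededRandomDemo() {}

    /** Hypothetical helper: builds a lowercase ASCII string from the given generator. */
    static String randomString(Random random, int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append((char) ('a' + random.nextInt(26)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        long seed = -9029631602016965389L; // seed reported in the thread

        // Two generators with the same seed, driven by the same calls.
        String first = randomString(new Random(seed), 8);
        String second = randomString(new Random(seed), 8);

        System.out.println(first.equals(second));
    }
}
```

If a check like this holds on a given JVM, the generator itself is probably fine and the variation is more likely coming from extra calls made on a shared Random before the test body runs.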
On Mon, Jul 26, 2010 at 11:40 AM, Shai Erera ser...@gmail.com
[
https://issues.apache.org/jira/browse/SOLR-2016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892407#action_12892407
]
Robert Muir commented on SOLR-2016:
---
I don't want to mess with the english example, this
I don't know what was the thing w/ the strings generated before, but now I
ran the test again w/ the same seed and it generates the same strings. So at
least it seems there are no problems w/ the Random class :).
However, the string l.E fails w/ the IBM JVM and succeeds w/ SUN's. Any
ideas why?
From here: http://www.fileformat.info/info/unicode/char/d9ff/index.htm
Looks like that character is not a valid Unicode character, and perhaps the
IBM's JVM behaves correctly? Robert - you're the Unicode expert :).
Shai
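
For reference, U+D9FF falls in the range D800 to DBFF, which UTF-16 reserves for high (leading) surrogates, so on its own it is not a complete character. Here is a minimal sketch in plain Java for spotting such strings. The class name, the helper, and the sample strings are made up for illustration; the exact failing string from the test is not reproduced here.

```java
public final class SurrogateDemo {
    private SurrogateDemo() {}

    /** Returns true if s contains a high or low surrogate that is not part of a valid pair. */
    static boolean hasUnpairedSurrogate(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isHighSurrogate(c)) {
                boolean paired = i + 1 < s.length()
                        && Character.isLowSurrogate(s.charAt(i + 1));
                if (!paired) {
                    return true; // lone high surrogate
                }
                i++; // skip the low half of a valid pair
            } else if (Character.isLowSurrogate(c)) {
                return true; // low surrogate with no preceding high surrogate
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(Character.isHighSurrogate('\uD9FF'));
        // Hypothetical sample: U+D9FF followed by an ordinary letter.
        System.out.println(hasUnpairedSurrogate("l.\uD9FFE"));
        // Same prefix, but the high surrogate is followed by a low surrogate.
        System.out.println(hasUnpairedSurrogate("l.\uD9FF\uDC00E"));
    }
}
```

How a JVM encodes or compares a string containing a lone surrogate is where implementations can plausibly differ, so filtering such strings out of randomly generated test input is one way to make results comparable across JREs.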
On Mon, Jul 26, 2010 at 10:40 PM, Shai Erera ser...@gmail.com wrote:
I
first of all, thanks for taking the time to do all of this debugging!
my guess is this might be related to
https://issues.apache.org/jira/browse/LUCENE-2565
Does it fail if you apply
Mike's patch?
On Mon, Jul 26, 2010 at 3:40 PM, Shai Erera
OK I think likely this is a bug in RAS. And we are just seeing the
difference in how Oracle's and IBM's JREs handle an unpaired
surrogate...
Lemme work out a patch...
Mike
On Mon, Jul 26, 2010 at 4:13 PM, Michael McCandless
luc...@mikemccandless.com wrote:
Yeah that char is a high surrogate
[
https://issues.apache.org/jira/browse/LUCENE-2554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Robert Muir updated LUCENE-2554:
Attachment: LUCENE-2554_merge.patch
here is a patch of the merge to trunk.
All tests pass.
[
https://issues.apache.org/jira/browse/SOLR-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yonik Seeley updated SOLR-1925:
---
Attachment: SOLR-1925.patch
Here's the final patch w/ tests.
I think we're all ready to go... I'll
[
https://issues.apache.org/jira/browse/LUCENE-2565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless resolved LUCENE-2565.
Resolution: Fixed
TestUTF32ToUTF8 can run forever
[
https://issues.apache.org/jira/browse/LUCENE-2554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Robert Muir resolved LUCENE-2554.
-
Resolution: Fixed
Committed revision 979453.
preflex codec doesn't order terms correctly
With LUCENE-2554 we now run all lucene/solr tests with random codecs by
default (except for a very short few).
You can use the option -Dtests.codec=Standard to force all tests to run with
a specific codec. The default is -Dtests.codec=random.
So if a failure occurs under a random codec, the
Shai can you try the patch on LUCENE-2568? Thanks.
Mike
On Mon, Jul 26, 2010 at 4:25 PM, Michael McCandless
luc...@mikemccandless.com wrote:
OK I think likely this is a bug in RAS. And we are just seeing the
difference in how Oracle's and IBM's JREs handle an unpaired
surrogate...
Lemme work
[
https://issues.apache.org/jira/browse/LUCENE-2568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless updated LUCENE-2568:
---
Attachment: LUCENE-2568.patch
Attached patch, avoids the surrogates when computing
Some improvements to _TestUtil and its usage
Key: LUCENE-2570
URL: https://issues.apache.org/jira/browse/LUCENE-2570
Project: Lucene - Java
Issue Type: Test
Reporter: Shai Erera