[ https://issues.apache.org/jira/browse/LUCENE-1756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Muir resolved LUCENE-1756.
---------------------------------

    Resolution: Fixed

Committed revision 825112.

> contrib/memory: PatternAnalyzerTest is a very, very, VERY, bad unit test
> ------------------------------------------------------------------------
>
>                 Key: LUCENE-1756
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1756
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: contrib/*
>            Reporter: Hoss Man
>            Assignee: Robert Muir
>            Priority: Minor
>             Fix For: 3.0
>
>         Attachments: LUCENE-1756.patch
>
>
> While working on something else I started getting consistent
> IllegalStateExceptions from PatternAnalyzerTest -- but only when running the
> test from the top level.
> Digging into the test, I've found numerous things that are very scary...
> * Instead of using assertions to test that token streams match, it throws an
> IllegalStateException when they don't, and then logs a bunch of info about
> the token streams to System.out -- having assertion messages that tell you
> *exactly* what doesn't match would make a lot more sense.
> * It builds up a list of files to analyze using paths that it evaluates
> relative to the current working directory -- which means you get different
> files depending on whether you run the tests from the contrib level or from
> the top level build file.
> * The list of files it looks for includes: "../../*.txt", "../../*.html",
> "../../*.xml" ... so not only do you get different results when you run the
> tests in the contrib vs at the top level, but different people running the
> tests via the top level build file will get different results depending on
> what types of text, html, and xml files they happen to have two directories
> above where they checked out lucene.
> * The test comments indicate that its purpose is to show that
> PatternAnalyzer produces the same tokens as other analyzers, but point out
> that this will fail for WhitespaceAnalyzer because of the 255 character token
> limit WhitespaceTokenizer imposes -- the test then proceeds to compare
> PatternAnalyzer to WhitespaceTokenizer, guaranteeing a test failure for anyone
> who happens to have a text file containing more than 255 characters of
> non-whitespace in a row somewhere in "../../" (in my case: my bookmarks.html
> file, and the hex encoded favicon.gif images).
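
For illustration only, the first bullet's suggestion (assertion messages that say exactly what doesn't match) might look roughly like the sketch below. This is not the fix committed in revision 825112. The class TokenListAssert and its methods are hypothetical, and plain string lists stand in for Lucene token streams so that the example runs without any Lucene dependency.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not part of Lucene: compares two lists of token texts. */
public class TokenListAssert {

    /** Returns the index of the first differing token, or -1 if the lists are equal. */
    public static int firstMismatch(List<String> expected, List<String> actual) {
        int common = Math.min(expected.size(), actual.size());
        for (int i = 0; i < common; i++) {
            if (!expected.get(i).equals(actual.get(i))) {
                return i;
            }
        }
        // One list is a prefix of the other: they diverge where the shorter one ends.
        return expected.size() == actual.size() ? -1 : common;
    }

    /** Throws an AssertionError naming the exact position and tokens that differ. */
    public static void assertSameTokens(String source, List<String> expected, List<String> actual) {
        int i = firstMismatch(expected, actual);
        if (i < 0) {
            return; // streams match, nothing to report
        }
        String exp = i < expected.size() ? "'" + expected.get(i) + "'" : "<end of stream>";
        String act = i < actual.size() ? "'" + actual.get(i) + "'" : "<end of stream>";
        throw new AssertionError(
            source + ": token " + i + " differs: expected " + exp + " but was " + act);
    }

    public static void main(String[] args) {
        List<String> expected = Arrays.asList("quick", "brown", "fox");
        List<String> actual = Arrays.asList("quick", "brwn", "fox");
        try {
            assertSameTokens("sample input", expected, actual);
        } catch (AssertionError e) {
            // The message pinpoints the mismatch instead of dumping both streams.
            System.out.println(e.getMessage());
        }
    }
}
```

A failure raised this way carries the source name, the token position, and both token values, so nothing needs to be logged to System.out to diagnose it. In a real test the inputs would presumably be fixed strings or files shipped with the test rather than whatever happens to match "../../*.txt".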