contrib/memory: PatternAnalyzerTest is a very, very, VERY, bad unit test
------------------------------------------------------------------------
Key: LUCENE-1756
URL: https://issues.apache.org/jira/browse/LUCENE-1756
Project: Lucene - Java
Issue Type: Bug
Components: contrib/*
Reporter: Hoss Man
Priority: Minor
while working on something else i was started getting consistent
IllegalStateExceptions from PatternAnalyzerTest -- but only when running the
test from the top level.
Digging into the test, i've found numerous things that are very scary...
* instead of using assertions to test that tokens streams match, it throws an
IllegalStateExceptions when they don't, and then logs a bunch of info about the
token streams to System.out -- having assertion messages that tell you
*exactly* what doens't match would make a lot more sense.
* it builds up a list of files to analyze using patsh thta it evaluates
relative to the current working directory -- which means you get different
files depending on wether you run the tests fro mthe contrib level, or from the
top level build file
* the list of files it looks for include: "../../*.txt", "../../*.html",
"../../*.xml" ... so not only do you get different results when you run the
tests in the contrib vs at the top level, but different people runing the tests
via the top level build file will get different results depending on what types
of text, html, and xml files they happen to have two directories above where
they checked out lucene.
* the test comments indicates that it's purpose is to show that PatternAnalyzer
produces the same tokens as other analyzers - but points out this will fail for
WhitespaceAnalyzer because of the 255 character token limit WhitespaceTokenizer
imposes -- the test then proceeds to compare PaternAnalyzer to
WhitespaceTokenizer, garunteeing a test failure for anyone who happens to have
a text file containing more then 255 characters of non-whitespace in a row
somewhere in "../../" (in my case: my bookmarks.html file, and the hex encoded
favicon.gif images)
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]