Thanks, Steve. I'm sure your response does in fact explain precisely why the Lucene Javadoc is "failing" to live up to my expectation (i.e., that all the actual tokenizers and filters are packaged as a separate "module".)

But... I would note the package for LowerCaseFilter is org.apache.lucene.analysis.core (yes, "core"), which is a subpackage of the package for TokenFilter, org.apache.lucene.analysis, so how does your explanation of "optional" packages square with classes that are clearly packaged as... "core"?

Another way of putting it is that the Tokenizer, TokenFilter, CharFilter, and Analyzer classes are all "abstract", so how can the concrete classes in any way be considered completely "optional"??

I will go ahead and file this as a "bug", even if it doesn't get fixed due to... "constraints".

But thanks anyway for your explanation - I knew there had to be such a reason for this "feature" to be "broken".

-- Jack Krupansky

-----Original Message-----
From: Steve Rowe
Sent: Tuesday, March 12, 2013 6:09 PM
To: [email protected]
Subject: Re: Javadoc "bug" for Lucene TokenFilter

Hi Jack,

TokenFilter subclasses are not just in another folder, they're in another artifact (core vs. various analysis modules), so IMHO this is not a bug, in TokenFilter's javadocs anyway. This is due to the constraint that lucene-core javadocs shouldn't refer to other modules' javadocs, since all other modules are - in theory anyway - optional. This philosophy is evident in the way javadocs are currently built: separately per-module, lucene-core first.
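
To illustrate the point above: the javadoc tool can only list "Direct Known Subclasses" among the classes documented in the same run, so a core-only run cannot see subclasses that live in other modules. The following is a minimal sketch of that effect, not how javadoc is actually implemented. The class JavadocScopeSketch, the DocClass record, and the entry ExampleCoreFilter are hypothetical names made up for this illustration; only LowerCaseFilter and its module (analyzers-common) come from the URLs quoted in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class JavadocScopeSketch {
    // One documented class: its name, the module (artifact) it ships in,
    // and the name of its direct superclass.
    record DocClass(String name, String module, String parent) {}

    // Abbreviated, partly hypothetical inventory. "ExampleCoreFilter" is a
    // placeholder for whatever TokenFilter subclass ships inside lucene-core.
    static final List<DocClass> ALL = List.of(
        new DocClass("TokenFilter", "core", "TokenStream"),
        new DocClass("ExampleCoreFilter", "core", "TokenFilter"),
        new DocClass("LowerCaseFilter", "analyzers-common", "TokenFilter")
    );

    // Returns the subclasses of `parent` that are visible when only the
    // given modules are documented together in a single run.
    static List<String> directKnownSubclasses(String parent, Set<String> modulesInRun) {
        List<String> result = new ArrayList<>();
        for (DocClass c : ALL) {
            if (c.parent().equals(parent) && modulesInRun.contains(c.module())) {
                result.add(c.name());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Per-module build (4.x style): core is documented on its own.
        System.out.println(directKnownSubclasses("TokenFilter", Set.of("core")));
        // Aggregated build (3.6 "api/all" style): all modules in one run.
        System.out.println(directKnownSubclasses("TokenFilter",
            Set.of("core", "analyzers-common")));
    }
}
```

The first call models the per-module 4.x build and returns only the core subclass; the second models a combined build like the 3.6 "api/all" docs and also picks up LowerCaseFilter.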

Maybe a listing from across the code base could be linked from top-level docs (e.g. for 4.2.0: <http://lucene.apache.org/core/4_2_0/>), for all analysis components? In case you're interested in working on it, that page is generated using 'ant documentation' from lucene/, in dependent target process-webpages, via lucene/site/xsl/index.xsl.

Steve

On Mar 12, 2013, at 5:35 PM, Jack Krupansky <[email protected]> wrote:

The Lucene Javadoc for TokenFilter shows only a single Direct Known Subclass when in fact there are dozens of them. The Lucene Javadoc for LowerCaseFilter does in fact show TokenFilter as its direct parent class even though the Javadoc for TokenFilter does not report LowerCaseFilter as a Direct Known Subclass.

Is there any good reason for this discrepancy, or is this simply a bug in either Lucene’s packaging or the javadoc generation?

4.0, 4.1, and 4.2 all have consistent behavior, but 3.6 reports a long list of the expected subclasses.

I suspect it may have to do with the fact that the subclasses are off in a separate folder from the parent class.

See:
http://lucene.apache.org/core/4_1_0/core/org/apache/lucene/analysis/TokenFilter.html
http://lucene.apache.org/core/4_1_0/analyzers-common/org/apache/lucene/analysis/core/LowerCaseFilter.html
http://lucene.apache.org/core/3_6_0/api/all/org/apache/lucene/analysis/TokenFilter.html

Note: The Lucene Javadoc for TokenFilterFactory does in fact show dozens of Direct Known Subclasses, as expected.

-- Jack Krupansky


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

