So, this is an open call for ideas on how we can improve our docs. Here are some areas I think need improving:
Before I start suggesting improvements, let me qualify them all by saying I'm only taking the time to do this because I love Lucene and use it all the time.

Web Site Redesign
------------------

I'd like to add a request for a top-level site redesign. I find it very difficult to find anything on the site. This isn't just a Lucene problem, it's partly an Apache problem.

I believe what most people want is a top-level intro to the projects and then a pointer to where to download and/or read hello-world getting-started docs. (This is, for instance, how Tomcat and MySQL set up their home pages and sites.)

I just went to the Lucene site and still can't figure out where to download the latest Lucene. I start at http://lucene.apache.org/ and get a nav choice of "who we are" and "buy stuff" and "subprojects". So I click on subprojects, which opens up a menu, and then I click on "java" (because I know that there are more versions of Lucene than the Java version and there's nothing else labeled just Lucene). I then get a choice of Features, Who We Are, Powered by Lucene, Documentation, Resources, Site Versions, and Related Projects. I guess the right answer is "Resources" then "releases". Then I leave the nav for the page itself and click "downloads and releases", but hey, I'm already there, so I have to go into the text and click on "Apache Mirrors". I then select a mirror and it gives me a huge list to select from. The README gives me no hint as to what's the latest stable version, and each version has (old) written next to its description.

Ask a coworker who doesn't use Lucene to try to find the javadocs, a hello world tutorial, and the download on the Lucene site. (Yes, I'm suggesting a usability test.)

Altogether, the design should waste less whitespace. Compare an Apache page to something like a MySQL page to see the difference.
Class, Method, Constructor, Member Doc
---------------------------------------

The biggest issue in the doc for me is that most methods, packages, classes, etc. are hardly documented at all. For instance, the very first class in the 2.1 alphabetical list:

http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/javadoc/org/apache/lucene/gdata/servlet/handler/AbstractAccountHandler.html

has 7 methods, 6 of which are undocumented and 1 of which has inherited redundant doc. There's an uncommented field, an uncommented constructor, and there's no class doc.

The doc is also out of date. Someone finally fixed the infinite-loop design of Analyzer, but the class doc still has a big warning that you must implement one of the methods. Now there's only the abstract tokenStream() method, which must be implemented, and a getPositionIncrementGap() method (which is a useful addition, by the way).

It also doesn't help that there are classes with non-descriptive names like Among, which have no doc at all.

I'd rather see each jar get its own javadoc, or at the very least, indicate which jar each class is defined in for the ones that aren't part of the core.

Reader Schmeader
----------------

This is actually an API issue, not a doc issue, though the doc around it needs work, too.

I don't understand why Readers are used in analyzers. Using them presents several problems. First, since Analyzer.tokenStream() doesn't throw an IOException, all exceptions must be caught somewhere inside. Second, it's not clear who closes the reader or how long the analyzer will hold it open. Third, every time I've used Lucene, I wind up having strings or char sequences or char array slices that I need to embed in a Reader. That's because I invariably have to parse out the bits of documents I want to index in various fields. Finally, wrapping a char sequence or char array slice in a reader is a rather inefficient way to implement a sequence of chars.
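To make the wrapping concrete, here is a minimal sketch of what the caller ends up writing today, using only java.io. The class ReaderWrapping and its methods (fromString, fromSlice, fromCharSequence, drain) are hypothetical names made up for illustration; none of them are part of Lucene. The drain method only stands in for whatever consumes the Reader, and it shows that the IOException and the close() have to be handled somewhere.

```java
import java.io.CharArrayReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

/** Hypothetical helper for illustration only; not part of Lucene. */
final class ReaderWrapping {

    /** Wraps a String; StringReader reads from the String it is given. */
    public static Reader fromString(String text) {
        return new StringReader(text);
    }

    /** Wraps a slice of a char array; CharArrayReader does not copy the buffer. */
    public static Reader fromSlice(char[] buf, int offset, int length) {
        return new CharArrayReader(buf, offset, length);
    }

    /** Wraps any CharSequence; toString() may copy (e.g. for a StringBuilder). */
    public static Reader fromCharSequence(CharSequence text) {
        return new StringReader(text.toString());
    }

    /**
     * Stand-in for a consumer of the Reader: reads everything, then closes it.
     * The checked IOException is converted to an unchecked one, since the
     * caller in this sketch has no throws clause to propagate it through.
     */
    public static String drain(Reader reader) {
        StringBuilder sb = new StringBuilder();
        char[] chunk = new char[1024];
        try {
            try {
                int n;
                while ((n = reader.read(chunk)) != -1) {
                    sb.append(chunk, 0, n);
                }
            } finally {
                reader.close();
            }
        } catch (IOException e) {
            throw new IllegalStateException("read from in-memory text failed", e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A field value parsed out of a larger document buffer.
        char[] doc = "xxfield valueyy".toCharArray();
        System.out.println(drain(fromSlice(doc, 2, 11))); // prints "field value"
    }
}
```

Every one of these wrappers exists only to satisfy a Reader parameter; the text is already in memory before the call.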
Can we at least introduce a method that takes a CharSequence, or even just a String, and deprecate the one with Reader? Or at least provide an alternative for the usual case of not having a reader. Maybe I'm just missing something here, but I don't think the point of using a Reader is scaling to streaming input that would overflow memory.

- Bob Carpenter
  Alias-i