I was off-line for some time, so my response is a bit out of sync. My
apologies for that.
On Thu, 25 Jan 1996, Bruce Perens wrote:

> From: Evert-Jan Couperus <[EMAIL PROTECTED]>
>
> > I think the complexity in producing a good consistent set of documents is
> > highly underestimated. It is of a higher order than maintaining a
> > software distribution like Debian.
>
> I don't think we're on the same wavelength :-) .

Possibly, but I think that is the result of me failing to make my
intentions clear. I was deliberately a bit vague about the actual
implementation details, to escape the danger of narrowing down the
design too soon.

> This is a recipe for failure. We are not mounting some tremendous
> documentation project here. We simply want to present all of the available
> documentation using one interface, with one top-level index. This should
> be do-able in a month of evenings by one programmer. Once we have that,
> we will have the leisure to start out on more ambitious projects if we feel
> they are necessary.

Very true. However, my experience with documentation projects is:

1. "It's easy. We only want ..., nothing fancy ..."
2. "It works!"
3. "Well, just don't touch ... We'll fix that later."
4. After fixing a lot of minor points, it really works.
5. "Wouldn't it be great if we added ...?"
6. A couple of iterations later the system barely runs and is buggy.

I think it would be a pity if Debian solved this problem for the
software part, but repeated this life-cycle for the other parts. By the
time we feel the need for better documentation management, the Debian
bashing may already have started (I am extrapolating from SLS and
Slackware here).

[ omitted a lecture of mine about document inconsistencies ]

> > navigational elements (i.e. hyperlinks, browsing sequences, indexes,
[ ... ]
> Guaranteed consistency is not one of our goals at this time. We'd like
> the reader to have some chance of finding the documentation and reading
> it without having to be a Make/TeX/Roff/Lout/Latex/etc. guru.

I think you are saying that navigation through the documentation should
be more consistent, and thereby easier. I suppose we do not disagree on
this one.

> > Besides that you will need to agree on a set of keywords to
> > be used by the authors.
>
> If you really wanted keyword search, why not simply use an inverted index
> a la "refer" and then every word in every document is a keyword and the whole
> procedure is automatic. I'm not convinced that we want keyword search, though.

Your top-level index is a special case of keyword indexing (not keyword
searching; that is just a related form of navigation). If you feel that
your packages form a logical grouping of information, you can use the
package names and base/devel/net/text/... as keywords and use them for
building the index. When implementing that special case, you should try
to structure your code along the lines of general rules plus your
particular choice of keywords.

About the inverted index: if you want keyword-based indexing you have
two options: 1) assign keywords by hand (the author or the debianiser),
or 2) assign them automagically. To do it right, you need a set of rules
to keep the result of 1 consistent, and AI techniques to keep the output
of 2 semantically consistent. The point I was trying to make is: *if*
you want a *solid* keyword index, IMO you should opt for 1, without
forgetting to agree upon a set of rules for the authors. Option 2
without advanced techniques adds essentially nothing to the good old
find&grep method.
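To illustrate what I mean by option 2, here is a minimal sketch (in
Python; the document list and stop list are inventions of mine for the
example) of the "every word is a keyword" inverted index a la "refer":

    #!/usr/bin/env python
    # Naive inverted index: every word in every document becomes a
    # "keyword" pointing back at the documents it occurs in.
    # The document list and stop list are invented for this example.

    import re

    DOCUMENTS  = ["/usr/doc/foo/README", "/usr/doc/bar/FAQ"]  # hypothetical
    STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is"}

    def build_index(paths):
        index = {}                            # word -> set of documents
        for path in paths:
            text = open(path).read().lower()
            for word in re.findall(r"[a-z][a-z0-9-]*", text):
                if word not in STOP_WORDS:
                    index.setdefault(word, set()).add(path)
        return index

    if __name__ == "__main__":
        index = build_index(DOCUMENTS)
        # "Searching" is now a dictionary lookup: faster than grep,
        # because the work was done once, but no more semantic.
        print(sorted(index.get("debian", set())))

Nothing in there knows that "package" and "packaging" are related, let
alone whether two documents contradict each other; that is exactly the
consistency work that option 1 pushes onto the authors' rules.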
> > I think we should concentrate on:
> > 1) what kind of information do we need,
> > 2) how do we keep the maintenance distributed without sacrificing
> > coherence,
> > 3) how can we use the existing documents as much as possible and yet
> > integrate them in one meta-document,
> > 4) how do we keep the use of resources, both human and electronic,
> > low?
>
> I don't think we can afford your standards.

I know, it's hard :-)

I said that because I got the feeling that a lot of implementation
details were being exchanged without a shared idea of what was being
implemented and what the constraints should be. I tried to broaden the
view, so that we can discuss design issues and their consequences for
the implementation, instead of discussing implementations with a lot of
hidden design decisions.

> > We should not do major rewrites, just add the necessary primitives
> > needed for a better navigation.
>
> We should not add primitives. We should not alter the documentation at all
> except to run it through an automatic program to translate its format
> when we present it. We should construct a top-level index.

By primitives I do not mean changing the document contents or anything
like that. By a primitive I mean something that facilitates the indexing
you are talking about, but does not dictate the actual implementation.
For example, you can add a short (optional?) record like the ones in the
Packages file or the *.deb files, perhaps styled after the Linux
Software Map records. These can be kept short, are extensible, and can
be used by another application to extract the information needed for
indexing.

If you want an HTML document as the top-level index for all installed
packages, you can have a script build that index from the names of the
packages as a post-install action (like install-info), or on the fly via
a CGI script. As long as the number of packages is small and there is no
need for a hierarchy deeper than, or different from, base, system, net,
etc., you can just hardwire it into the implementation. But as soon as
your needs change, you have to rewrite your code to accommodate that.
Adding those document records, on the other hand, gives the debianisers
the freedom to:

1. choose the indexing separately from the packaging hierarchy,
2. add more items per package by adding more document records.

Another primitive is adding a "Keywords:" line to the Packages file
records. A sketch of how these two primitives could look follows below.
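To make that concrete: the record format below is an invention of mine
for the example (the field names and file locations are mine, only
styled after the LSM/Packages records). A record could look like:

    Document: dpkg-programmer
    Title: dpkg programmers' manual
    Path: /usr/doc/dpkg/programmer.html
    Keywords: devel, packaging

and a post-install action (like install-info) could rebuild the
top-level HTML index from a file of such records:

    #!/usr/bin/env python
    # Sketch of a post-install index builder. It reads blank-line
    # separated "Field: value" records and writes one HTML list per
    # keyword. Field names and paths are invented for this example.

    RECORD_FILE = "/var/lib/dpkg/doc-records"   # hypothetical
    INDEX_FILE  = "/usr/doc/index.html"         # hypothetical

    def parse_records(path):
        records, current = [], {}
        for line in open(path):
            line = line.rstrip("\n")
            if not line:                        # blank line ends a record
                if current:
                    records.append(current)
                current = {}
            elif ":" in line:
                field, _, value = line.partition(":")
                current[field.strip()] = value.strip()
        if current:
            records.append(current)
        return records

    def write_index(records, path):
        by_keyword = {}                         # keyword -> list of records
        for rec in records:
            for kw in rec.get("Keywords", "misc").split(","):
                by_keyword.setdefault(kw.strip(), []).append(rec)
        out = open(path, "w")
        out.write("<html><head><title>Documentation</title></head><body>\n")
        for kw in sorted(by_keyword):           # one section per keyword
            out.write("<h2>%s</h2>\n<ul>\n" % kw)
            for rec in by_keyword[kw]:
                title = rec.get("Title") or rec.get("Document", "untitled")
                out.write('<li><a href="file:%s">%s</a></li>\n'
                          % (rec.get("Path", ""), title))
            out.write("</ul>\n")
        out.write("</body></html>\n")
        out.close()

    if __name__ == "__main__":
        write_index(parse_records(RECORD_FILE), INDEX_FILE)

The script never looks at the packaging hierarchy, so the debianiser can
hang one package's documents under several keywords, or regroup them
later, by editing records alone.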
> > For package maintainers that should mean
> > defining the place in the hierarchy of the meta-index as well as giving
> > keywords for the keyword network.
>
> This sounds very complicated :-) .

Yes, it is! It is almost as demanding as making the Package record :-)

> > Maybe new keywords are permitted as long as they are inherited from a more
> > abstract one.
>
> The concept of "keyword administration" is probably outside of the scope of
> our project.

At the moment I agree. But I also think you should take a more advanced
indexing scheme into account when designing and implementing the
automatic index generation. Doing it now should take no time at all;
doing it later could take a lot of time.

> > I think we should look at other solutions before using yet another daemon
> > like httpd.
>
> An HTTPd is only necessary if you want translation at run-time. If you
> sacrifice disk space by having pre-translated files in place, you only
> need an HTML browser using the "file:" URL, you don't need a server. In
> any case, an HTML server is cheap compared to the alternatives.

I disagree. At the moment the cheapest solution is viewer dispatching.
Furthermore, I would like to do dispatching for info, ?roff and
PostScript, and HTML conversion during installation for the others. That
is easy to add if the right implementation is chosen, otherwise very
hard.
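By viewer dispatching I mean nothing more than the sketch below; the
extension table and the viewer commands are examples of mine, not a
proposal for particular tools:

    #!/usr/bin/env python
    # Sketch of viewer dispatching: hand each native format to a
    # native viewer, and fall back to the HTML browser for everything
    # that was converted to HTML at installation time. The commands
    # and the extension table are examples, not fixed choices.

    import os
    import sys

    VIEWERS = {                      # extension -> viewer command line
        ".info": "info -f %s",
        ".1":    "man -l %s",        # ?roff sources, via man
        ".ps":   "ghostview %s",
        ".html": "lynx %s",
    }

    def view(path):
        ext = os.path.splitext(path)[1]
        if ext in VIEWERS:
            return os.system(VIEWERS[ext] % path)
        # Anything else was, in this scheme, converted to HTML when
        # the package was installed, so dispatch the converted file.
        return os.system(VIEWERS[".html"] % (path + ".html"))

    if __name__ == "__main__":
        view(sys.argv[1])

The point is that adding a new format, or moving a format from
dispatching to conversion, only touches the table, not the code
around it.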
> > Working at a firm specialising in "information disclosure" I see too many
> > documentation projects fail because of a lack of analysis and design.
> > Even small ones can suffer from it.
>
> I'd rather have it Tuesday than have it perfect. I think after we've
> satisfied the basic goal of having some way to read documentation using
> one tool and one overall index we can spend as much time as we wish on
> doing it right.

I can have it ready by Monday morning, and perfect :-)

A good design does not imply using more time; on the contrary. It is
about choosing a scalable and flexible solution, without mixing too many
implementation details into the design.

Evert-Jan.
