On Feb 27, 2007, at 6:24 PM, David Balmain wrote:
> How about just using doxygen. I don't have much experience with it but
> I'm pretty sure there would be a way to tag particular functions that
> are public so that when you generate the documentation you can
> generate only the public methods.

I don't know it well either, but I'm sure you're right and it will
allow us to put in a public/non-public tag.

It would be even better if we could export at least some of the
documentation -- particularly method descriptions. I'd really like
to be able to synch up the Perl binding docs by running a script
rather than via copy-and-paste.

> Of course you could also have public and private include files.

Hmm, can you elaborate? I'd basically given up hope that we'd be
able to maintain tight control over symbol export, and was expecting
to define the API via documentation only.

>> I'm thinking we need shared
>> documentation. XML, maybe? Then each binding would require an
>> appropriate XML-to-whatever translation utility.
>
> I'm not entirely sure I'm on the same wavelength as you today. By
> 'whatever' do you mean the specific language's documentation format?

Yes, that was what I was thinking. But perhaps not quite so
ambitious as may have come across.

> If that is the case then I don't see this working as the Ruby API for
> Lucy will probably be quite different to the PHP API.

If we're reasonably careful about how we word things, many method
descriptions could be reused across all bindings. And one of the
things about the naming convention we've settled on for method
invocations is that you can derive either lowerCamelCase or
separated_by_underscores method names with a simple transform:
Sim_Length_Norm => lengthNorm
Sim_Length_Norm => length_norm
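
To make the transform concrete, here is a minimal sketch in Python. The function names are made up for illustration, and it assumes every method symbol has the form Nick_Word_Word, with the class nick first and each following word capitalized:

```python
def split_method_symbol(symbol):
    """Split 'Sim_Length_Norm' into the class nick and the method words."""
    nick, *words = symbol.split("_")
    if not words:
        raise ValueError(f"not a Nick_Method_Name symbol: {symbol!r}")
    return nick, words


def to_lower_camel(symbol):
    """Sim_Length_Norm => lengthNorm"""
    _, words = split_method_symbol(symbol)
    # Lowercase the first word; keep the rest as written, capitalized.
    rest = "".join(w[0].upper() + w[1:] for w in words[1:])
    return words[0].lower() + rest


def to_underscores(symbol):
    """Sim_Length_Norm => length_norm"""
    _, words = split_method_symbol(symbol)
    return "_".join(w.lower() for w in words)


if __name__ == "__main__":
    for sym in ("Sim_Length_Norm", "IxWriter_Add_Document"):
        print(sym, "=>", to_lower_camel(sym), "/", to_underscores(sym))
```

Both directions drop the class nick, since the binding would supply the class through its own object syntax.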
If we tag every last thing, enough so that we could actually
generate, say, both POD and javadoc without intervention, then sure,
XML is wayyyy too verbose. Anything would be, really, because
language syntaxes are too distinct. But if we set our sights a
little lower, and just try to share method names, method
descriptions, and public/non-public access control, that's doable --
and it's a whole lot of savings. (Maybe parameter lists and return
values, too, but that's a little harder.)
<method>
  <name>Sim_Length_Norm</name>
  <acl>public</acl>
  <description>
    Computes the normalization value for a field given the total number of
    terms contained in a field. These values, together with field boosts,
    are stored in an index and multiplied into scores for hits on each field
    by the search code.

    Matches in longer fields are less precise, so implementations of this
    method usually return smaller values when numTokens is large, and larger
    values when numTokens is small.

    Note that these values are computed under IxWriter_Add_Document and
    then stored using Sim_Encode_Norm. Thus they have limited precision, and
    documents must be re-indexed if this method is altered.
  </description>
</method>
Note the use of "IxWriter_Add_Document" and "Sim_Encode_Norm" within
the description. Those method names are identifiable patterns,
matchable with this regex:
# $1 is class nick, $2 is short method name
/([A-Z][A-Za-z]+)_([A-Z]\w+)/
It's easy to sub out IxWriter_Add_Document for this, which will
generate a nicely formatted link...
L<IndexWriter::add_document|Lucy::Index::IndexWriter/"add_document">
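
A rough sketch of that substitution in Python, using the same regex. The nick-to-class table is an assumption: only the IxWriter entry can be read off the link above, and a real tool would need a complete table from somewhere. Symbols with an unknown nick are left untouched rather than guessed at:

```python
import re

# Same pattern as above: group 1 is class nick, group 2 is short method name.
METHOD_SYMBOL = re.compile(r"([A-Z][A-Za-z]+)_([A-Z]\w+)")

# Hypothetical lookup table: nick -> (short class name, full package name).
# Only the IxWriter entry is taken from the example link.
CLASS_BY_NICK = {
    "IxWriter": ("IndexWriter", "Lucy::Index::IndexWriter"),
}


def pod_link(match):
    """Turn one matched method symbol into a POD L<> link."""
    nick, short_name = match.group(1), match.group(2)
    entry = CLASS_BY_NICK.get(nick)
    if entry is None:
        return match.group(0)  # unknown nick: leave the text as it was
    short_class, package = entry
    method = short_name.lower()  # Add_Document => add_document
    return f'L<{short_class}::{method}|{package}/"{method}">'


def link_method_symbols(text):
    """Replace every recognized method symbol in text with a POD link."""
    return METHOD_SYMBOL.sub(pod_link, text)


if __name__ == "__main__":
    print(link_method_symbols("computed under IxWriter_Add_Document and"))
```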
Now, returning to your point about Doxygen... With XML, we'd have to
maintain separate files for the documentation, which would suck. So
I'm all for using Doxygen, especially if we can rig things up so that
the description can be isolated and parsed out reliably.
I might go write an extractor tool which parses our header files and
generates intermediate XML. Then bindings authors could write their
own final translation utilities in their language of choice, and use
as much or as little as they wish.
Hopefully they'd use more rather than less. It's to the user's
benefit for various bindings to present reasonably consistent APIs
while still being idiomatic, because it makes it easier to apply what
you learned about one of them to another.
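
For a sense of how small a binding's final translation utility could be, here is a sketch in Python that reads the intermediate XML, keeps only the public methods, and emits POD headings. The <methods> wrapper element, the "private" acl value, and the Sim_Hypothetical_Helper entry are all invented for the example; only the <method>, <name>, <acl>, and <description> elements come from the sample earlier in this message:

```python
import xml.etree.ElementTree as ET

# Invented sample input; the second method exists only to show filtering.
SAMPLE = """\
<methods>
  <method>
    <name>Sim_Length_Norm</name>
    <acl>public</acl>
    <description>Computes the normalization value for a field.</description>
  </method>
  <method>
    <name>Sim_Hypothetical_Helper</name>
    <acl>private</acl>
    <description>Not part of the public API.</description>
  </method>
</methods>
"""


def public_methods(xml_text):
    """Return (symbol, description) pairs for methods tagged public."""
    root = ET.fromstring(xml_text)
    result = []
    for method in root.iter("method"):
        if method.findtext("acl", default="").strip() != "public":
            continue
        symbol = method.findtext("name", default="").strip()
        # Collapse the line wrapping inside <description> to single spaces.
        description = " ".join(method.findtext("description", default="").split())
        result.append((symbol, description))
    return result


def to_pod(xml_text):
    """Emit one POD =head2 entry per public method."""
    chunks = []
    for symbol, description in public_methods(xml_text):
        # Drop the class nick and lowercase: Sim_Length_Norm => length_norm
        short_name = symbol.split("_", 1)[1].lower()
        chunks.append(f"=head2 {short_name}\n\n{description}\n")
    return "\n".join(chunks)


if __name__ == "__main__":
    print(to_pod(SAMPLE))
```

A binding that wanted less could stop at public_methods() and use only the names and access control.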
Marvin Humphrey
Rectangular Research
http://www.rectangular.com/