First up, thanks to Wayne for starting this thread and for being a Good
Citizen and providing code to match. Also thanks to Bruno for expanding the
context for talking about automatic documentation. As it turns out, this is
all related to one of my more recent proof-of-concept pieces. (For the past
couple of years, I've been doing a ton of experimental projects to get into
the details of various tools and design patterns.)
Anyway, to make an auto-doc feature, you need a code scanner. Grab the
comments, read the structured comments, extract the data and then do
something with it. For example:
* Produce HTML or PDF documentation.
* Set tooltips.
* Write the descriptions to a JSON file, XML file, text dump, or records.
* Generate test cases based on the declarations and comments.
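To make the scanner idea concrete, here's a minimal sketch in Python (rather than 4D), using an invented structured-comment convention — the `@method`/`@param`/`@returns` tags and the sample method name are assumptions for illustration, not the convention Wayne or Bruno uses:

```python
import json
import re

# Hypothetical structured-comment convention:
#   // @method MethodName
#   // @param $1 Longint  Description
#   // @returns Text      Description
TAG_RE = re.compile(r"//\s*@(method|param|returns)\s+(.*)")

def scan_structured_comments(source):
    """Extract structured comments from source text into a list of
    method-description dicts, ready for HTML/PDF docs, tooltips,
    or a JSON/XML dump."""
    methods = []
    current = None
    for line in source.splitlines():
        m = TAG_RE.search(line)
        if not m:
            continue
        tag, rest = m.group(1), m.group(2).strip()
        if tag == "method":
            current = {"name": rest, "params": [], "returns": ""}
            methods.append(current)
        elif current and tag == "param":
            current["params"].append(rest)
        elif current and tag == "returns":
            current["returns"] = rest
    return methods

# Hypothetical method header, just to exercise the scanner.
sample = """\
// @method Widget_SetMessages
// @param $1 Longint  Window reference
// @param $2 Text     Form object name
// @returns Text      Error name, or "" on success
"""
docs = scan_structured_comments(sample)
print(json.dumps(docs, indent=2))
```

From the same extracted data you could just as easily emit HTML, set tooltips, or generate test stubs — the extraction step is the common core.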
As Bruno mentioned, there is a ton that can go wrong with documentation.
It's a pain to write, a pain to maintain, and a pain to validate. Moving
the comments into structured comments improves all of this (particularly
with macros), but it doesn't solve the problem completely. To be clear: it's
a very good solution, but it's not a complete one. So all respect to anyone,
such as Bruno, who has gone to the effort to implement and use such a
solution. We should all be so conscientious.
So what's the problem? The problem is that comments aren't code. Structured
comments are a kind of *promise* about what the code does and how it
behaves, but they are *unenforced* promises. So, there's another level to
what you can do with automated comments - one that reads the code itself to
determine the rules. For this, you need a somewhat more complex code
scanner, but that's not the hard part. The hard part is retrofitting the
declarations. Here, let me show a snippet of code that gets the idea across
better than I have so far:
C_TEXT($error_name)
$error_name:=MethodCheck_ParameterCount (Count parameters;3;3)
If ($error_name="")
    $winref:=$1
    $widget_name:=$2
    $messages_array_pointer:=$3
    C_LONGINT($previous_error_count)
    $previous_error_count:=ErrorStack_Count
    MethodCheck_WindowReference ($winref;Value must be supplied)
    MethodCheck_FormObjectNameInput ($widget_name;Value must be valid)
    MethodCheck_PointerTypeSeries ($messages_array_pointer;Object array)
    If (ErrorStack_Count>$previous_error_count)
        $error_name:=ErrorStack_GetReportErrorName
    End if
End if
If ($error_name="")
    // Main body of routine.
End if
Hopefully, that's not too cryptic or too dense. The idea is that a bunch of
pre-conditions are validated before the routine gets to work.
Basically, it's *ASSERT* with extra control. Translated, the checks are:
*MethodCheck_ParameterCount*
A minimum of 3 parameters and a maximum of 3 parameters. I actually use
this code all the time. It's super handy for catching when the parameter
list is wrong. (The compiler should do this, but it doesn't.) This also
provides a very tidy way of declaring optional parameters. In the example
above, there aren't any optionals.
*MethodCheck_WindowReference*
A value must be supplied, but it need not be a valid winref. I don't
remember exactly; I think it defaults to the current window if the value
isn't valid.
*MethodCheck_FormObjectNameInput*
A form object name must be valid and must exist.
*MethodCheck_PointerTypeSeries*
The pointer needs to be to a variable of one of the specified types. In
this case, the list is just "object array". If you had a numeric array
pointer, you might specify longint array and real array.
The exact details may not matter much for this conversation. The general
idea is that inputs are validated in any of a few ways. Here are the names
of the validators:
*MethodCheck_BLOBInput*
*MethodCheck_DateInput*
*MethodCheck_DateRange*
*MethodCheck_DateSeries*
*MethodCheck_FilePathInput*
*MethodCheck_FolderPathInput*
*MethodCheck_FormEventInput*
*MethodCheck_FormNameInput*
*MethodCheck_FormObjectNameInput*
*MethodCheck_LongintInput*
*MethodCheck_LongintRange*
*MethodCheck_LongintSeries*
*MethodCheck_MethodNameInput*
*MethodCheck_ObjectInput*
*MethodCheck_ObjectTypeInput*
*MethodCheck_ObjectTypeSeries*
*MethodCheck_ParameterCount*
*MethodCheck_PictureInput*
*MethodCheck_PointerInput*
*MethodCheck_PointerIsAnAlpha*
*MethodCheck_PointerIsANumeric*
*MethodCheck_PointerSeries*
*MethodCheck_PointerTypeSeries*
*MethodCheck_RealInput*
*MethodCheck_RealRange*
*MethodCheck_RealSeries*
*MethodCheck_TextInput*
*MethodCheck_TextRange*
*MethodCheck_TextSeries*
*MethodCheck_TimeInput*
*MethodCheck_TimeRange*
*MethodCheck_TimeSeries*
*MethodCheck_WindowReference*
So, basically you can check by type, a range (1-7), or a series (list) like
("Mon";"Tue";"Wed").
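As a rough illustration of the range and series checks (in Python rather than 4D, with invented function names and error-name strings — the real routines are 4D methods that push errors onto a stack):

```python
# Rough analogues of the range/series validators described above.
# The check returns an error name on failure and "" on success,
# mirroring the $error_name convention in the 4D snippet.

def check_longint_range(value, low, high):
    """Return an error name if value is not an int in [low, high]."""
    if not isinstance(value, int):
        return "Error_WrongType"
    if not (low <= value <= high):
        return "Error_OutOfRange"
    return ""

def check_text_series(value, allowed):
    """Return an error name unless value is one of the allowed strings."""
    if value not in allowed:
        return "Error_NotInSeries"
    return ""

# A 1-7 range and a day-of-week series, per the examples above.
print(check_longint_range(5, 1, 7))                      # passes
print(check_text_series("Sun", ("Mon", "Tue", "Wed")))   # fails
```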
And now, after my typical pithy introduction, I get to the point. With a
custom code scanner, you can read in these declarations and *generate
documentation based on the live code*. There is no disconnect possible
between the code and the docs because it's based on working code. Likewise,
you can automatically generate test cases to exercise the real code.
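A crude sketch of that "docs from live code" step, again in Python: scan a method's source for *MethodCheck_* calls and turn them into a parameter contract. (The regex-based parse is a simplification; real 4D source needs a proper parser, and the sample source is just the snippet from above.)

```python
import re

# Find calls like: MethodCheck_ParameterCount (Count parameters;3;3)
CHECK_RE = re.compile(r"MethodCheck_(\w+)\s*\(([^)]*)\)")

def extract_contract(source):
    """Return a list of (check_name, argument_list) pairs found in
    the source of a method - the raw material for generated docs
    or generated test cases."""
    contract = []
    for name, args in CHECK_RE.findall(source):
        arglist = [a.strip() for a in args.split(";")]
        contract.append((name, arglist))
    return contract

method_source = """\
$error_name:=MethodCheck_ParameterCount (Count parameters;3;3)
MethodCheck_WindowReference ($winref;Value must be supplied)
MethodCheck_PointerTypeSeries ($messages_array_pointer;Object array)
"""
for name, args in extract_contract(method_source):
    print(f"{name}: {', '.join(args)}")
```

Because the contract is read from the checks the method actually runs, the generated documentation can't drift away from the code the way hand-written comments can.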
How far did I get with all of this? Pretty far. I started in V13 and then
moved to V15 as *C_OBJECT* was a handy thing to have for this situation.
Between the V13 and V15 versions, I had much of the code parser done (but
not all of it - I didn't get to plug-ins or constants), dumping to external
docs looked easy, generating Explorer Comments was easy (and as helpful
as everyone says), and all kinds of code checks also looked pretty safe
to implement.
Ultimately, with a comprehensive code scanner/parser, and the declarations,
you could build the code analysis and replacement features of SanityCheck
and 4D Insider. And more.
But I stopped. Why?
* The cost of retrofitting something like this is high. It's a solid idea,
as it forces you to think through what your routines will and won't
tolerate and to make those expectations explicit. Then again, arguing
against the axiom "if it ain't broke, don't fix it" is hard to do...
* Some other shiny thing caught my eye. MVC/MVVM and then
Publish-Subscribe. Right now I'm kind of into some fun JS+Bootstrap stuff.
* It's a big job to do completely. Experiments are fun and can deliver
reusable code (I use those input checking routines all of the time now and
love them), but finish something? Finishing something big? That's a lot
more work. I thought about doing a product but figure there just isn't a
large enough market to justify the effort. I can see big 4D shops
justifying this level of work, but some/most (?) of them have already been
down this road and don't need a generic tool. Plus, I discovered that there
are a couple of nice-looking code checking components out there already.
* I'm most excited about getting the call paths and method relationships in
the system and then graphing them with D3. That would be a heck of a lot of
setup to finally get to the part I want to do. I'll find something else to
graph. Pity though, it would be super interesting. I think a lot about how
code fits together so, at least for me, it would be pretty fun.
So there it is. The short version is "use specific methods consistently to
make it possible for a code scanner to extract the actual behavior of the
method." Well, at least the parameter lists. Oh, for $0, I used something
like *MethodDefine_Result* with a declaration past $0, where I put in a
custom type like "window reference" or "error name." It's more useful when
you use custom types.
P.S. I have no objection to sharing some/all of the code I did, but it's
not really in an ideal state for sharing. If there's anyone who wants to
pursue these ideas somehow, shoot me a note off-line and we'll see if
there's a way to collaborate, etc. Like I say, I'm more interested in
uses for the data out of a parser/scanner/xref tool than I am in writing
the tool itself.
**********************************************************************
4D Internet Users Group (4D iNUG)
FAQ: http://lists.4d.com/faqnug.html
Archive: http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub: mailto:[email protected]
**********************************************************************