I got this message from Stephen Davies, the Synopsis guy, tonight. It
describes what needs to be done to get Synopsis started generating
BoostBook. It mostly looks pretty straightforward and I wouldn't mind
trying to do it, but I'd like to work together with someone else on
this who knows BoostBook and XML processing well. I can supply the
Python/Synopsis expertise ;-)
-Dave
--- Begin Message ---
On Sun, 2003-02-02 at 01:33, David Abrahams wrote:
> Stephen Davies <[EMAIL PROTECTED]> writes:
> > I may get time to write a BoostBook formatter before the release.. it
> > shouldnt be too hard to write a Visitor for the AST that just outputs
> > XML in the required format. I have a quick look at the boost-docs ML and
> > the compaints about doxygen's XML output, and it seems the preferred
> > output (classes structurally nested inside their namespaces) will be the
> > easiest anyway, since the easiest way is to walk the structured AST!
>
> I'd be willing to help with this, or even possibly to do it myself.
> But if the latter, I would like you to write down some pointers, a
> basic plan of attack, a code sketch or anything else that you think
> could help. I am an XML neophyte and I don't know anything about the
> Synopsis AST yet, so I need some help connecting the dots.
Since I'll be away for the next week and a bit, I'll write down this
stuff in case you get time for it. If you don't, that's fine; I will
have a go at it myself when I get back.
The easiest way to do it is to copy the DocBook formatter to, e.g.,
BoostBook.py in the Synopsis/Formatters directory.
The AST itself has two class hierarchies: types and declarations. A type
is used by declarations for things like return types, parameters, base
classes, etc. A type can be a Declared (which holds a pointer to a
declaration, possibly a Typedef declaration), a Modifier (pointers,
references, const, volatile, etc., plus a pointer to the base type), a
Parameterized (template stuff), or a Base type, which you might think of
as a built-in type: ints, floats, etc. There is also a Function type.
All the types are in the Synopsis/Core/Type.py file, and you can look at
the RefManual for a nice graph.
The declarations are also arranged in a hierarchy. All declarations have
a scoped name, comments, language, type string, source file reference
and line number. The regular declarations derive directly from that
class, but the classes and namespaces are where it gets interesting. The
class Group has a list of declarations, but Group itself is used for
grouping declarations in the documentation and is not supposed to
correspond to an actual declaration. The Scope class is derived from
Group and has a different visitor method, but is otherwise syntactically
identical - the difference being semantic, since a Scope instance refers
to a real declaration (namespace, class, etc). Class and Module are
derived from Scope to represent classes and namespaces/modules/packages
respectively.
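To make the Group/Scope distinction concrete, here is a rough sketch
using stand-in classes. None of these are the real Synopsis classes: the
constructor arguments, the snake_case visitor method names and the
NameCollector visitor are all assumptions made for illustration. It only
shows that a visitor can walk through a Group without treating it as a
real declaration, while a Scope (and so a Class or Module) counts as one.

```python
# Stand-in classes mirroring the hierarchy described above.
# NOT the real Synopsis AST classes; all signatures are assumptions.

class Declaration:
    def __init__(self, name, type_string):
        self.name = tuple(name)          # scoped name, e.g. ('boost', 'any')
        self.type_string = type_string   # e.g. 'class' or 'namespace'

    def accept(self, visitor):
        visitor.visit_declaration(self)


class Group(Declaration):
    """A documentation grouping; does not stand for a real declaration."""
    def __init__(self, name, type_string):
        super().__init__(name, type_string)
        self.declarations = []

    def accept(self, visitor):
        visitor.visit_group(self)


class Scope(Group):
    """Same layout as Group, but stands for a real declaration."""
    def accept(self, visitor):
        visitor.visit_scope(self)


class Module(Scope):
    """Namespace, module or package."""


class Class(Scope):
    """Class or struct."""


class NameCollector:
    """Hypothetical visitor: records real scopes, walks through Groups."""
    def __init__(self):
        self.names = []

    def visit_declaration(self, decl):
        pass  # plain declarations are ignored in this sketch

    def visit_group(self, group):
        for child in group.declarations:
            child.accept(self)

    def visit_scope(self, scope):
        self.names.append('::'.join(scope.name))
        for child in scope.declarations:
            child.accept(self)


boost = Module(('boost',), 'namespace')
helpers = Group(('boost', 'Helpers'), 'group')
helpers.declarations.append(Class(('boost', 'any'), 'class'))
boost.declarations.append(helpers)

collector = NameCollector()
boost.accept(collector)
print(collector.names)
```

The 'Helpers' group never shows up in the collected names, but the class
inside it does.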
You will also notice the class MetaModule. It represents the
combination of separate namespace (or IDL module) instances and keeps
a reference to the original module declarations. This means that you can
see which files a namespace was opened in. The original module
declarations are empty (their declarations() list is empty) and are just
used for the file() and line() attributes.
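As a rough illustration of the idea (not the real Synopsis code), the
merging could look something like this. The class layouts, the
module_declarations attribute and the merge_modules helper are all
hypothetical names invented for the sketch.

```python
# Stand-ins only; the real Module/MetaModule classes live in Synopsis.

class Module:
    """One 'namespace X { ... }' block as it appeared in one file."""
    def __init__(self, name, file, line, declarations):
        self.name = name
        self.file = file
        self.line = line
        self.declarations = list(declarations)


class MetaModule:
    """All blocks of one namespace combined."""
    def __init__(self, name):
        self.name = name
        self.declarations = []          # merged contents
        self.module_declarations = []   # the original (now empty) blocks


def merge_modules(modules):
    """Hypothetical helper: combine same-named modules into MetaModules."""
    merged = {}
    for module in modules:
        if module.name not in merged:
            merged[module.name] = MetaModule(module.name)
        meta = merged[module.name]
        meta.declarations.extend(module.declarations)
        module.declarations = []        # original is kept for file/line only
        meta.module_declarations.append(module)
    return merged


parts = [Module('boost', 'any.hpp', 12, ['any']),
         Module('boost', 'ref.hpp', 8, ['ref', 'cref'])]
boost = merge_modules(parts)['boost']
opened_in = [(m.file, m.line) for m in boost.module_declarations]
print(boost.declarations, opened_in)
```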
The inheritance representation is a bit icky. Class has a parents()
attribute, which is a list of Inheritance objects, which basically wrap
a Type with the public/protected/private type of inheritance. A Type is
used, since the base class can be a Declared type referring to a
declaration, or a Declared referring to a Typedef, or a Parameterized
type referring to a template instance, or even an Unknown, referring to
a declaration which is not in the AST (hence only name and language are
given - most types don't have a language attribute since they refer,
eventually, to a real declaration). Basically, you need to be careful if
you want to actually refer to the base classes! If you're just printing
the name, throwing the type into your type formatter will be enough.
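A small sketch of why care is needed, again with stand-in classes rather
than the real ones. The Inheritance, Declared and Unknown layouts here,
and the two helper functions, are assumptions for illustration. Printing
a base class name works for any kind of type, but getting at the base
class declaration only works when the type actually points at one.

```python
# Stand-ins; attribute names are assumptions, not the Synopsis API.

class Declared:
    """A type that refers to a declaration that is in the AST."""
    def __init__(self, name, declaration):
        self.name = name
        self.declaration = declaration


class Unknown:
    """A type whose declaration is not in the AST: name and language only."""
    def __init__(self, name, language):
        self.name = name
        self.language = language


class Inheritance:
    """Wraps a base-class type with public/protected/private access."""
    def __init__(self, parent, access):
        self.parent = parent
        self.access = access


def base_label(inheritance):
    """Printing only needs the name, so any kind of type will do."""
    return '%s %s' % (inheritance.access, '::'.join(inheritance.parent.name))


def base_declaration(inheritance):
    """Return the declaration behind a base class, or None if there is none.

    A Declared that points at a typedef would need one more step to
    reach the class itself; that step is left out of this sketch.
    """
    parent = inheritance.parent
    if isinstance(parent, Declared):
        return parent.declaration
    return None


shape = object()  # placeholder for a real class declaration
parents = [Inheritance(Declared(('gfx', 'Shape'), shape), 'public'),
           Inheritance(Unknown(('std', 'exception'), 'C++'), 'private')]
labels = [base_label(p) for p in parents]
resolved = [base_declaration(p) for p in parents]
print(labels)
```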
I assume you are aware of the Visitor pattern. All the formatters
(except DUMP) implement two visitors - one for the Types and one for the
AST. The Types visitor is used to visit a type object, and maybe its
subobjects if it has any, and record an output string for that type.
Some formatters keep both a "reference" which can be used to refer to a
real declaration, and a label which is used for printing. I notice that
the DocBook formatter stores a reference (__type_ref), but it isn't
used anywhere: copy-paste crud :(
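For anyone new to the pattern, a types visitor of this kind might look
roughly like the following. This is a sketch under assumptions: the four
type classes are simplified stand-ins, and the method names (visit_base
and so on), the premod/postmod attributes and the label/ref attributes
are made up for the example rather than copied from Synopsis.

```python
# Simplified stand-ins for the type classes; not the real Type.py.

class Base:
    def __init__(self, name):
        self.name = name

    def accept(self, visitor):
        visitor.visit_base(self)


class Declared:
    def __init__(self, name):
        self.name = name  # scoped name tuple

    def accept(self, visitor):
        visitor.visit_declared(self)


class Modifier:
    def __init__(self, alias, premod=(), postmod=()):
        self.alias = alias        # the type being modified
        self.premod = premod      # e.g. ['const']
        self.postmod = postmod    # e.g. ['*', '&']

    def accept(self, visitor):
        visitor.visit_modifier(self)


class Parameterized:
    def __init__(self, template, parameters):
        self.template = template
        self.parameters = parameters

    def accept(self, visitor):
        visitor.visit_parameterized(self)


class TypeFormatter:
    """Records a printable label and, where possible, a reference."""
    def __init__(self):
        self.label = ''
        self.ref = None

    def format(self, type_obj):
        type_obj.accept(self)
        return self.label

    def visit_base(self, t):
        self.label, self.ref = t.name, None

    def visit_declared(self, t):
        self.label, self.ref = '::'.join(t.name), t.name

    def visit_modifier(self, t):
        inner = self.format(t.alias)  # keeps the ref of the inner type
        self.label = ' '.join(list(t.premod) + [inner]) + ''.join(t.postmod)

    def visit_parameterized(self, t):
        template = self.format(t.template)
        ref = self.ref                # remember the template's ref
        args = ', '.join(self.format(p) for p in t.parameters)
        self.label = '%s<%s>' % (template, args)
        self.ref = ref


vec = Parameterized(Declared(('std', 'vector')), [Base('int')])
const_ref = Modifier(vec, premod=['const'], postmod=['&'])
formatter = TypeFormatter()
label = formatter.format(const_ref)
print(label, formatter.ref)
```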
The AST formatter is where the action is. You visit each AST node in
turn, printing out some XML, formatting types, visiting child nodes if
appropriate, and closing the XML element. It seems I found that DocBook
4.2 couldn't nest classes, so you'll notice the visitClass method
storing nested classes and visiting them after closing the class tag. I presume
you don't need to do this for BoostBook, and can follow the pattern from
visitModule.
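The nested approach can be sketched as below. This is only an
illustration with stand-in node classes: the element names ("namespace",
"class") and the "name" attribute are assumptions about what the output
should look like, and the method names are not the ones Synopsis uses.
Each scope opens its element, visits its children one level deeper, and
then closes the element.

```python
import io
from xml.sax.saxutils import quoteattr

# Stand-in AST nodes; not the real Synopsis classes.

class Module:
    def __init__(self, name, declarations=()):
        self.name = name
        self.declarations = list(declarations)

    def accept(self, visitor):
        visitor.visit_module(self)


class Class:
    def __init__(self, name, declarations=()):
        self.name = name
        self.declarations = list(declarations)

    def accept(self, visitor):
        visitor.visit_class(self)


class NestedXMLFormatter:
    """Writes one element per scope, with children nested inside it."""
    def __init__(self, out):
        self.out = out
        self.depth = 0

    def write_line(self, text):
        self.out.write('  ' * self.depth + text + '\n')

    def write_scope(self, tag, node):
        # quoteattr escapes the value and adds the surrounding quotes
        self.write_line('<%s name=%s>' % (tag, quoteattr(node.name)))
        self.depth += 1
        for child in node.declarations:
            child.accept(self)
        self.depth -= 1
        self.write_line('</%s>' % tag)

    def visit_module(self, node):
        self.write_scope('namespace', node)

    def visit_class(self, node):
        # Nested classes are visited before the closing tag is written.
        self.write_scope('class', node)


tree = Module('boost', [Class('any', [Class('holder')])])
out = io.StringIO()
tree.accept(NestedXMLFormatter(out))
print(out.getvalue(), end='')
```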
The Formatter class includes some utility methods for dealing with XML
entities and indentation. They should be pretty obvious.
The DocFormatter class is a bit of a hack to get the right output that I
needed for (the old version of) the DocBook manual. To see it in action,
go to docs/Manual and run "make config.xml". It basically uses sections
instead of the more semantic DocBook elements, with some extra magic to
deal with the particulars of the structure and comments in Config.py.
You should just have one class - get rid of DocFormatter - but you might
find some of its methods instructive.
The rest of the module is simple. You must have a line "#
THIS-IS-A-FORMATTER" near the top, or Synopsis won't recognise it as a
formatter. You must also have a format() function with the same
parameters as the one in the DocBook formatter, and a usage() function.
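Put together, the outline of such a module might look like this. Only
the marker comment and the names format() and usage() come from the
description above. The parameter list of format(), the -o option and the
parse_args helper are placeholders; the real signature should be copied
from the DocBook formatter.

```python
# THIS-IS-A-FORMATTER
"""Outline of a BoostBook formatter module (illustrative only)."""

import getopt
import sys


def usage():
    """Print the options this formatter accepts (the option is made up)."""
    print("  -o <filename>    Output filename")


def parse_args(args):
    """Hypothetical helper: return the -o value, or None if absent."""
    opts, _remaining = getopt.getopt(args, 'o:')
    output_name = None
    for opt, value in opts:
        if opt == '-o':
            output_name = value
    return output_name


def format(args, ast, config):
    """Entry point; the parameter names here are assumptions.

    The name shadows the builtin format() on purpose, because the
    function has to be called format().
    """
    output_name = parse_args(args)
    out = open(output_name, 'w') if output_name else sys.stdout
    try:
        # A real formatter would walk 'ast' with its visitors here.
        out.write('<!-- BoostBook output for the AST goes here -->\n')
    finally:
        if out is not sys.stdout:
            out.close()
```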
As an alternative, you can write a stand-alone formatter. Just import
Synopsis.Core.AST and call AST.load('filename.syn') which will give you
the AST object which you can traverse. You might as well create a
formatter though.
Another thing, you might want to try out the DUMP formatter. It is
basically a powerful pretty-printer, but is very very useful for
figuring out how your code is represented in the AST. I've lost count of
the number of times I've had to resort to using the DUMP formatter to
find out where that comment went, just what the returnType() is pointing
to, etc. The output is in three sections: declarations, types and
sourcefiles. It never prints the same object twice, to prevent loops. It
tries to bold the declaration names so if you're using 'less' you'll
need the -r flag, but then less doesn't handle long lines very well.
Perhaps I should add a flag to make it use <b> instead...
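The "never prints the same object twice" part is the usual trick of
remembering what has already been visited. A toy version, which is not
the DUMP formatter's actual code and only handles simple values, lists
and plain objects, could be:

```python
def dump(obj, lines=None, seen=None, depth=0):
    """Toy pretty-printer that prints each object at most once."""
    lines = [] if lines is None else lines
    seen = set() if seen is None else seen
    pad = '  ' * depth
    if isinstance(obj, (str, int, float, type(None))):
        lines.append(pad + repr(obj))
    elif id(obj) in seen:
        # Already printed: emit a short marker instead of recursing.
        lines.append('%s<%s seen above>' % (pad, type(obj).__name__))
    elif isinstance(obj, (list, tuple)):
        seen.add(id(obj))
        for item in obj:
            dump(item, lines, seen, depth)
    else:
        # Assumes a plain object whose attributes live in __dict__.
        seen.add(id(obj))
        lines.append('%s%s:' % (pad, type(obj).__name__))
        for key, value in sorted(vars(obj).items()):
            lines.append('%s  %s =' % (pad, key))
            dump(value, lines, seen, depth + 2)
    return lines


class Node:
    """Made-up node type with a parent/child cycle."""
    def __init__(self, name):
        self.name = name
        self.children = []
        self.parent = None


root = Node('boost')
child = Node('any')
child.parent = root
root.children.append(child)

lines = dump(root)
print('\n'.join(lines))
```

Without the "seen" set, the parent/child cycle would recurse forever.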
I hope that's enough for you. I should keep this and put it into a
"formatter howto" or some such :)
> > Could you paste some example outputs that show what it's meant to
> > do? It's not urgent.. I don't think I'll have time to mess around
> > with it before the release.
>
> Some examples (the last state of the program I have is buggy, so I'm
> hand-tweaking a couple of these):
<snip>
Okay, I see what you mean now. Very nice.
HTH,
Stephen
--
Stephen Davies <[EMAIL PROTECTED]>
--- End Message ---
--
David Abrahams
[EMAIL PROTECTED] * http://www.boost-consulting.com
Boost support, enhancements, training, and commercial distribution